**5th INTERNATIONAL CONFERENCE ON**

# **Optical Characterization of Materials**

**MARCH 17th –18th, 2021 KARLSRUHE | GERMANY**

Beyerer | Längle (Eds.) CONFERENCE PROCEEDINGS

OCM 2021

J. BEYERER | T. LÄNGLE (Eds.)

Jürgen Beyerer | Thomas Längle (Eds.)

### OCM 2021

5th International Conference on Optical Characterization of Materials

March 17th – 18th, 2021 Karlsruhe | Germany

# OCM 2021

# 5th International Conference on Optical Characterization of Materials

March 17th – 18th, 2021 Karlsruhe | Germany

Edited by Jürgen Beyerer | Thomas Längle

#### Veranstalter

Fraunhofer Institut of Optronics, System Technologies and Image Exploitation IOSB c/o Karlsruhe Center for Material Signatures KCM Fraunhoferstraße 1, 76131 Karlsruhe

Dieser Tagungsband ist auch als Onlineversion abrufbar unter http://dx.doi.org/10.5445/KSP/1000128686

#### **Impressum**

Karlsruher Institut für Technologie (KIT) KIT Scientific Publishing Straße am Forum 2 D-76131 Karlsruhe

KIT Scientific Publishing is a registered trademark of Karlsruhe Institute of Technology. Reprint using the book cover is not allowed.

www.ksp.kit.edu

*This document – excluding the cover, pictures and graphs – is licensed under a Creative Commons Attribution-Share Alike 4.0 International License (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/deed.en*

*The cover page is licensed under a Creative Commons Attribution-No Derivatives 4.0 International License (CC BY-ND 4.0): https://creativecommons.org/licenses/by-nd/4.0/deed.en*

Print on Demand 2021 – Gedruckt auf FSC-zertifiziertem Papier

ISSN 2510-7240 ISBN 978-3-7315-1081-9 DOI 10.5445/KSP/1000128686

### **Preface**

The state of the art in optical characterization of materials is advancing rapidly. New insights into the theoretical foundations of this research field have been gained and exciting practical developments have taken place, both driven by novel applications and innovative sensor technologies that are constantly emerging. The big success of the international conferences on Optical Characterization of Materials in 2013, 2015, 2017 and 2019 proves the necessity of a platform to present, discuss and evaluate the latest research results in this interdisciplinary domain. Due to that fact, the international conference on Optical Characterization of Materials (OCM) took place the fifth time in March 2021.

The OCM 2021 was organized by the Karlsruhe Center for Spectral Signatures of Materials (KCM) in cooperation with the German Chapter of the Instrumentation & Measurement Society of IEEE. The Karlsruhe Center for Spectral Signatures of Materials is an association of institutes of Karlsruhe Institute of Technology (KIT) and the business unit Automated Visual Inspection of the Fraunhofer Institute of Optronics, System Technologies and Image Exploitation IOSB.

Despite the conference's young age, the organizing committee has had the pleasure to evaluate a large amount of abstracts. Based on the submissions, we selected 22 papers as talks, a keynote lecture and several practical demonstrations.

The present book is based on the conference held in Karlsruhe, Germany from March 17–18, 2021. The aim of this conference was to bring together leading researchers in the domain of Characterization of Materials by spectral characteristics from UV (240 nm) to IR (14 μm), multispectral image analysis, X-ray methods, polarimetry, and microscopy. Typical application areas for these techniques cover the fields of, e.g., food industry, recycling of waste materials, detection of contaminated materials, mining, process industry, and raw materials.

The editors would like to thank all of the authors that have contributed to these proceedings as well as the reviewers, who have invested a generous amount of their time to suggest possible improvements of the papers. The help of Jurgen Hock and Lukas Dippon in ¨ the preparation of this book is greatly appreciated. Last but not least, we thank the organizing committee of the conference, led by Britta Ost, for their effort in organizing this event. The excellent technical facilities and the friendly staff of the Fraunhofer IOSB greatly contributed to the success of the meeting.

March 2021 Jurgen Beyerer ¨ Thomas Langle ¨

#### **General Chairs**

**Program Chair**


# Thomas Langle ¨ Karlsruhe

#### **Program Committee**

Jochen Aderhold Braunschweig Oliver Albrecht Dresden Johannes Anastasiadis Karlsruhe Sebastian Bauer Madison (Wisconsin) Mark Bucking ¨ Schmallenberg Andrea Buttner ¨ Freising Robin Gruna Karlsruhe Michael Heizmann Karlsruhe Thomas Hofmann Wurzburg ¨ Olfa Kanoun Chemnitz Anna Kicherer Siebeldingen Andrea Krahmer ¨ Berlin Felix Salazar ´ Madrid Maximilian Schambach Karlsruhe Heinar Schmidt Kulmbach Henning Schulte Karlsruhe

# **Contents**




### **Recycling**


### **Other Applications**



### **Algorithms**


### **Sensors**


# **Are low-cost, hand-held NIR sensors suitable to detect adulterations of halal meat?**

Judith Muller-Maatsch ¨ 1, Yannick Weesepoel1, Emma Roetgerink1, Michiel Wijtten1, and Martin Alewijn1

> <sup>1</sup> Wageningen Food Safety Research Wageningen, The Netherlands

**Abstract** The demand of halal meat products is growing globally. Therefore, it is important to detect adulterations and food fraud attempts in a fast, non-invasive manner for example by using hand-held near-infrared (NIR) devices. In this study, samples of pork, lamb, beef and chicken were measured pure and in mixtures of 2, 5, 10, 25 and 50% pork in the non-pork meat samples, respectively. Five sensors were tested with varying wavelength range: Scio (740-1070 nm), Linksquare (400-1000 nm), Tellspec (900-1700 nm), MicroNIR (900-1650 nm), ASD Labspec 4 High-Res (350-1700 nm). A one-class-classification approach was used for data analysis, applying pork as the target group. For comparison, thresholds of the models were chosen to correctly identify 100% of the pork samples and 75% of all mixtures. Comparing the sensors upon the correct detection of all halal meat samples, i.e., no-pork containing ones, the Scio and the ASD Labspec performed best with an outcome of 34% and 32%, respectively. The Linksquare, MicroNIR and Tellspec were able to correctly identify 27%, 27%, and 10%, respectively, of the halal products. Concluding, the application of these five NIR devices are challenging when it comes to the detection of meat products from different species. Nonetheless, the usage of this application in combination with suitable chemometric approaches may contribute to the detection of food fraud in halal products.

**Keywords** Near-infrared sensors, pork, lamb, beef, chicken meat, one-class-classification

### **1 Introduction**

The market for halal-certified products increases within Western societies. While halal products have been intended for Muslim consumers, Jewish consumers as well as vegetarians/vegans, and people with various types of allergies or dietary restrictions purchase halal-certified products [1, 2]. When it comes to halal meat, several differences are found to the commercial meat that is available in Western countries. Halal meat may only contain meat from ruminant species like cows or birds like chicken. Horse and pig meat are not considered halal. Besides the species also the feed that is fed to the animals plays an important role. Animals fed with additions of biosolids or animal protein concentrates must undergo a quarantine period with other feed before slaughtering. Moreover, halal meat may only be retrieved from a slaughtering process that renders animals immobile or unconscious, without killing it, prior to the blood drainage [2]. These differences in animal species, feed and slaughtering process have been hard to detect and trace. Hence, several cases of fraud occurred as listed by [3] or illustrated in detail by [4]. Both authors conclude that the main enabling factor of halal meat food fraud is the challenging detection of halal meat authenticity. One possible solution to overcome the issue of halal meat authenticity detection is applying spectroscopy. Spectral data may be collected for example by near-infrared (NIR) sensors. [5], [6], and [7] showed that the discrimination of animal species is possible when using a portable FT-IR or benchtop NIR sensor with a wavelength range reaching between 1000 and 2000 nm. although spectral data acquisition is fast and easy to conduct, the machines trialled in the past were very costly, heavy and bulky [8]. In this paper, we show the application of a one-class-classification (OCC) chemometric approach on data obtained by using several hand-held NIR sensors. OCC describes one specific class as the target class and returns predictions of samples being out or in the respective target class. In the case of halal meat detection, in particular the speciation issue, the target class was set as pork meat, i.e., non-halal meat. That means in that all samples that are "in", do contain pork and are therefore not halal. On the other hand, all samples that are "out" do not contain pork and may be labelled as halal.

# **2 Materials and methods**

### **2.1 Materials and sample preparation**

Pork, beef, lamb and chicken meat was purchased at local butchers in Wageningen, the Netherlands. 40 samples of each species were purchased in 20 days, being two different meat parts per day per species, i.e., shoulder and leg of lamb, pig and cow or breast and drumstick from chicken. For reference purposes, all purchased, pure samples underwent a real-time PCR assay to validate the species identity. The method used has been described previously by [9]. All samples were purchased as intact meat and minced with a meat mincer Tristar VM-4310 (Smartwares Europe, Tilburg, Nederland). Mixtures of pork with beef, lamb or chicken, respectively, were prepared in the concentrations 2, 5, 10, 25, and 50% pork/ other meat (w/w). A randomised approach was used to make almost every day six mixtures using the two parts of lamb, beef and chicken mixed with pork, leading to 117 sample mixtures. In total 277 samples were measured, being 117 sample mixtures and 160 pure samples. To ensure that all samples have a similar storage period, all freshly prepared samples and mixtures were frozen, stored at -18 ºC and thawed for 12h prior to the measurements.

### **2.2 NIR spetroscopy and data acquisition**

Five different hand-held sensors (Fig. 2.1) were studied as follows:


The NIR hardware was calibrated according to the manufacturer instruction with a 99% diffuse reflectance standard. Measurements were done in diffuse reflectance mode by slightly pressing the optical part of the sensor to the sample or by slightly hoovering the optical part above the sample in a distance of about 1 cm. For the Scio, Tellspec, Linksquare and ASD Labspec sensors the integration time was set automatically by the manufacturer, for MicroNIR the integration time was set at 8 ms with 200 scans. Spectral measurements were conducted at room temperature when the meat had a temperature between 19 and 21 ºC. The measurements were repeated four times.

**Figure 2.1:** Scio, Linksquare, Tellspec, MicroNIR, and ASDLabspec (pictures courtesy of the respective hardware manufacturers).

#### **2.3 Data analysis**

Outliers were excluded manually using principal component analysis after standard normal variate (SNV) pre-processing in Unscrambler X 10.5 (Camo Analytics AS, Oslo, Norway). Out of the 1108 acquired spectra, 12 of Scio, 10 of Linksquare, 144 of Tellspec, 16 of MicroNIR, and 13 of ASD Labspec were excluded, resulting in 1096, 1098, 964, 1092, and 1095 spectra for Scio, Linksquare, Tellspec, MicroNIR and ASD Labspec, respectively. It is worth mentioning that the outliers are probably measurement errors, as no sample was found to be an outlier for all sensors. The OCC chemometric approach was conducted in R 3.6.1 (R Core Team, 2018) as described in detail by [10] and [11]. The same R-packages were used as previously reported. Among commonly used pre-processing methods and algorithms the most suited combination was picked according to (area under the receiver operator characteristic) AUROCs of the target class (pork) against the individual other classes, namely beef, lamb and chicken. For each sensor, three models were picked manually. Averaged class distances of four repetitive measurements (in a row on one occasion) were used for further calculations. Classification results of each model were fused into a final classification using a decision tree, i.e., if more than one out of the three models classified a sample as 'out-of-class' it was classified as 'not containing pork'.

### **3 Results and discussion**

#### **3.1 Raw NIR data**

The five sensors used in this study resulted in five very different groups of NIR spectra (Fig. 3.1). Next to the different measurement range of the sensors, also the hardware technology used differs for each sensor, resulting in different responses at each wavelength.

**Figure 3.1:** Raw data of all pure samples averaged according to meat species (Wavelength ranges were Scio 740-1070 nm, Linksquare 400-1000 nm, Tellspec 900-1700nm, MicroNIR 900-1650 nm and ASD Labspec 4 HighRes 350-1700nm).

#### **3.2 Models picked for the different sensors**


• The first ASD Labspec model used the fourth section of the spectra, SNV-DT and SIMCA (3PCs). The second SNV, the second section of the spectra and SIMCE (3PCs), and the third the 2nd derivative (Savitzky-Golay, 11-point filter length), the fourth section of the spectra and OCSVM.

**Figure 3.2:** Correct classification in % for pork, mixtures (2 to 50% pork in lamb, beef and chicken (w/w)), lamb, beef and chicken samples deriving from the fused classification results of three models per sensor, respectively. Thresholds were set to achieve a correct identification of 100% pork and 75% mixture samples.

#### **3.3 Determination of thresholds after fusion of classification results**

The thresholds of the fused classification results were set in the way that 100% of all samples containing 100% pork were correctly identified as "pork" (Fig. 3.2). In that case, with no false negatives, the correct classification of pure beef, chicken, and mutton samples, as well as mixtures, was lower than 100%. The correct classification of mixtures was set by adjusting the thresholds at 75%, whereby the ones containing less than 10% pork where wrongly identified in most cases. In summary, the thresholds were chosen in a way that some halal products, i.e., pure lamb, beef and chicken, were wrongly identified as containing pork but no pork was undetected. In an industrial setting, this approach should be beneficial as it reduces the number of samples that must be identified by more complex, costly and time-intensive experiments such as polymerase chain reactions (PCR).

#### **3.4 Comparing sensors**

Overall the Scio and the ASD Labspec performed best as still 34% or 32%, respectively, of the halal products were classified correctly (Fig. 3.2). Nevertheless, the Scio sensor was only able to discriminate beef samples as 'non-pork' but none of the other meat species such as chicken and lamb. Here, the ASD labspec sensor performed better identifying correctly, 15% of lamb, 51% of beef and 29% of chicken samples. In contrast to the two mentioned sensors, The Linksquare and MicroNIR identified 27% of the halal-products correctly and the Tellspec only 10%. Both the Linksquare and the ASD Labspec performed well on the identification of chicken with 53% and 29% correctly identified. This may be caused by the wider wavelength range that includes part of the visible spectrum 400-1000 nm and 350-1700 nm for Linksquare and ASD Labspec, respectively. Chicken meat has a lighter colour than the other meat species and may be discriminated visually. Also, both sensors included models that used the second section of the spectrum where the visible wavelengths may be found. In literature, the discrimination of meat according to its protein, moisture and fat content was successful using the Scio and another miniaturized NIR device. [12] showed that samples containing fat from 5% to 43%, protein from 12% to 23% and a moisture content of 35% to 69% may be correctly classified. In contrast to the study of [12], the samples used in this study are probably too close to each other concerning their composition. According to the USDA database, raw pork contains 72.6-72.9 g/100 g water, 19.6-20.5 g/100 g protein, 5.4-7.1 g/100 g fat, whereas chicken may contain 73.9-76.8 g /100 g water, 19.4-22.5 g/100 g protein, 2.6- 3.7 g/100 g protein fat, lamb 72.5-74.4 g/100 g water, 19.7-21.1 g/100 g protein, 6.5-8.3 g/100 g fat and beef 73.5-73.6 g/100 g water, 20.4- 20.5 g/100 g protein, 5.4-6.7 g/100 g fat. These might be the reason why, so few samples of the pure, halal-meat products were correctly identified when the threshold was chosen to correctly detect mixtures in a 75% rate.

### **4 Conclusion and outlook**

The application of fast, cheap and hand-held devices is challenging when it comes to the detection of meat products from different species. Nevertheless, the presented OCC approach enables the application of these devices to contribute to species admixture detection for halalcertified products. Five different devices were tested. The Scio and the ASD Labspec performed best followed by the Linksquare, MicroNIR and Tellspec sensor. Further research is conducted using hyperspectral imaging cameras that cover a wavelength range in the VIS-NIR from 400-1700 nm and include spatial information for classification attempts. Moreover, different slaughtering techniques and feeding will be included.

### **5 Funding**

This work was funded by the Dutch Ministry of Agriculture, Nature and Food Quality (Knowledge base grant).

### **References**


*Food Control*, vol. 57, pp. 258 – 267, 2015. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0956713515002364


# **Exotic Fruit Ripening Based on Optical Characterisation**

Anton Scheibelmasser1, Matthias Jeindl1, and Gunar Nakladal2

<sup>1</sup> Insort GmbH, Berndorf 166, 8324 Kirchberg an der Raab, Austria, <sup>2</sup> Fruturat GmbH, Fruturastraße 1, 8224 Hartl, Austria

**Abstract** Re-sellers and customers in super markets expect perfect quality fruit in terms of ripeness, sweetness, taste and a lack of inner and outer defects. Especially exotic fruits provide many challenges due to long transport routes and the logistics of ripening processes. Optical characterisation of the fruits could help ensure the required quality. A combination of visual and infrared light evaluation techniques allows the measurement of quality parameters that support the control system of the reseller's store and the delivery logistics to the super market. For the inspection of chemical fruit characteristics (e.g. dry matter, sugar content), a hyperspectral near-infrared sensor is used. Additionally, an RGB camera is responsible for the visual defect analysis. Based on this measurement principle, a ripening control machine was developed and tested in daily business, successfully.

**Keywords** Exotic fruits ripening, hyperspectral imaging, Sherlock Food Analyzer

### **1 Introduction**

During the last years the demand of ripened mangos and avocados rose significantly because shops offered more and more " Ready to eat" products. In former times the consumers found mostly unripened fruit and tried to achieve ripeness by simple methods at home. In most cases the result was unsatisfactory, keeping the consumption demand on lower levels. In order to improve the situation, some fruit traders started their own ripening processes. The result was an explosion of sale quantities. The company Frutura GmbH [1] has long-time experience in ripening of bananas, leading to the idea of following the ripening trend. After these ripening processes, the result was often an uneven product with several defects and sorting had to be done manually. The market did not yet provide proper sorting equipment for these types of fruit. The Austrian company Insort [2] was selected as project partner, developer and producer of a new kind of sorting machine.

**Characterisation of Exotic Fruits** In order to deliver perfect exotic fruits, in the appropriate ripening stage, a few requirements have to be met. For mangos, size and weight have to meet a defined range and should be sorted in respective quality classes. The right water content (moisture, dry matter value) and sugar concentration (Brix value) have to be met to ensure a tasty and sweet fruit. The visual appearance of the fruit should have a ripe color and lack damages caused by rot or diseases. The peel should be free of cosmetic defects or bruises. In case of avocados, the oil content given by the dry matter value is the essential quality parameter. These parameters are typically measured on single fruit samples offline in a lab using e.g. destructive sample preparations and high measurement times in the range of a few minutes per fruit. In this project, a nondestructive inline measurement method was required reaching measurement speeds of 5000 mangos per hour on a conveyor belt with a velocity of 1 m/s.

### **2 Ripeness Sorting Using the Sherlock Food Analyzer**

The basic instrumentation idea for the development of a ripeness analyser/sorter was the use of an optical system as an inline measurement device, the Sherlock Food Analyzer (see left image in 2.1). Additionally, in order to bring the fruits to the line of inspection, mechanical infeed machinery is required. Due to the necessity of double sided analysis for proper fruit quality inspection, a turning station was introduced in combination with a second Sherlock Food Analyzer. After analyzing the fruits on the conveyor belt, the Sherlock Food Analyzer acts as a control device capable of piloting appropriate eject mechanisms for quality class sorting.

**Figure 2.1:** Ripening sorter using Sherlock Food Analyzer and eight ejection units.

#### **2.1 Components of the Sherlock Food Analyzer**

In order to understand the measurement concept and the idea of the instrumentation, a short introduction to the capabilities and key elements of the Sherlock Food Analyzer will be given in the following paragraphs.

**Measurement Principles for Analyzing** The device is based on the optical characterisation of food products by means of light spectroscopy over a broad range of wavelengths. On one hand the hyperspectral imaging camera measures light information in the near infrared range, therefore gathering data results on the chemical structure of the product. On the other hand, an RGB camera uses the range of visible light and evaluates visual appearance of the products. In both wavelength ranges line scan cameras are implemented as push broom applications. Regarding the data acquisition, a false color coding method (Insort branding: *CIT*, Chemical Imaging Technology) is used. This means that after a manual modelling step, by means of software, the spectra of the NIR camera are classified and color coded according to class characteristics. In order to have a common software interface, false color coding is also used for the RGB camera images, highlighting visual characteristics of the fruit.In case of the RGB camera, a qualitative approach is used. Product defects are coded in desired colors based on categories: Good product (e.g. green) or product defects (e.g. red, violet). This coding is then used within the operating software in order to evaluate type, count and area of defects on each fruit. The hyperspectral camera has two modes available for analyzing spectral information. The already mentioned qualitative approach distinguishes between the spectra of the good product in contrast to product defects, which typically differ in chemical composition. Based on this model a false color picture is generated and evaluated by software inline and in real-time. An additional and very important method provided by the hyperspectral camera is the so-called quantitative approach. Based on the reflected light intensity in the selected wave length range (region of interest, ROI), a regression model is established which brings the reflected light intensity in correlation with the laboratory analysis of the chemical components' concentration (e.g. moisture, dry matter). Hence both camera types transfer false color pictures inline and in real-time to the evaluation software.

**Optical and Mechanical Sensors** The key element in this optical characterisation project is the hyperspectral camera. The near infrared camera *EVK Helios G2* [3] provides 320 pixel spatial resolution and 240 wavelengths per pixel in the range from 900-1700 nm. The line scan camera works in a push broom application and transfers the classified spectra, coded as false color RGB pictures, by means of Ethernet: GigE protocol. The camera needs a model upload upon initialization, together with basic parametrizations (e.g. frame rate). The modelling software *SQALAR* [4] enables the user to generate the model for the fruits and to update the camera.

The used RGB camera [5] provides 2048 pixel and also works in line scan mode. Utilizing a qualitative model, this camera determines fruit size and color defects. The qualitative classification is additionally optimized for black or dark spot area measurement and visible bruise detection. By means of the software *Falco* [6], a false color model is generated and executed on the analysis computer during runtime .

A weight measurement cell [7] is implemented as part of the mechanical infeed system. The load cell is placed after the optical inspection section and before the ejection system. The fruits, placed in cups, are transported over the load cell and the derived online measurement value is transferred to the analyzing software. The load cell provides an analogue current interface evaluated by a PLC [8]. After calculation and filtering, the fruit weight reading is transferred to the PC by means of an Ethernet connection. The challenge of the weight measurement is the fact, that the fruits in the cup are moving over the weight cell without stopping. In order to get a correct reading (+/- 2 gram), an adaptive filtering algorithm is necessary to flatten the measurement peak occurring in the moment the cup touches the weight cell with a velocity of 1 m/s.

Another important sensor in the system is the rotary encoder placed on the conveyor chain. Based on this sensor and the appropriate parametrization, the distance between the line of inspection, the weight measurement and the ejectors is determined. The interface from this central position encoder is connected to the PLC. Based on the PLC-PC Ethernet communication the detection of objects (fruits) in the line of inspection and the control of weight measurement and ejector actuating is synchronized.

**Illumination** Regarding the two types of optical cameras used, two different kinds of illumination are necessary.

On one hand, there is the need of broad band illumination for the hyperspectral camera implemented by means of halogen lamps and a reflector encased into an air cooled steel housing. This illumination is focused onto the line of inspection in order to provide sufficient light for optical evaluation. The fruits reflect this light to the hyperspectral camera enabling spectroscopic evaluations.

On the other hand, additional light in the visible range is needed for the high resolution RGB cameras. A white LED is therefore used to enhance the amount of light in the line of inspection. In order to support the applied machine vision algorithm for the detection of fruits (background segmentation), the conveyor rollers are kept in a unique color (blue). By means of blue light LED illumination, the background of the rollers is filled up homogeneously (see 2.2).

**Figure 2.2:** Blue conveyor rollers and the active blue LED background.

#### **2.2 Mechanical Infeed for Optical Inspection and Ejector Units**

The way of a fruit from the store to one of the eight quality category outputs is controlled by a complex conveyor system. At the entrance the fruit boxes are unloaded and the fruits are transported to an elevating conveyor in order to reach the height of the inspection line. Next, a short conveyor is used to separate the fruits in order to only carry two fruits at once onto the following rollers of the conveyor. The rollers section is used to transport the fruits to the line of inspection of the first Sherlock Food Analyzer. The fruits are detected by the camera system and their path is tracked and calculated in order to synchronize the following steps of the infeed process. The next step is the rotating part of the conveyor where the rollers are used to turn the fruits for the inspection of the rear side at the position of the second Sherlock Food Analyzer. After this inspection step, the fruits are transported to cups. By means of the rotary encoder, the position of the cups is calculated and the weight is measured on the fly in the moment the cup touches the integrated load cell.

Now the analysis PC has to evaluate the sensors and derive the necessary criteria for ejecting the fruit in one of the eight quality outputs based on the implemented rule engine. Again, the position of the electro-mechanical ejectors is derived from the rotary encoder and the rule engine's results. Boxes at all outputs are prepared for receiving the analyzed fruits.

**Figure 2.3:** Infeed to the Sherlock Food Analyzer with Conveyor System and Ejection Unit.

#### **2.3 Object Analysis and Measurement Quantities**

The analysis of fruit is conducted in the context of a so-called sorting program. A sorting program stores all parameters necessary for optical characterisation of the dedicated fruit. Not only different fruits are selectable but, in case of different optical characteristics, selections for different varieties of the fruit are available. In addition to that, the sorting criteria of the implemented rule engine are part of this program.

In terms of optical characterisation there is a distinction between the evaluation on the hyperspectral camera and the RGB image analysis. Nevertheless the results of both will be combined by software and treated as an abstract data structure called: object. The attributes of this object are the physical quantities measured. In the moment an object is generated, the position of the fruit is determined by the position of the rotary encoder. The offset due to calculation time is corrected and the values are then used to determine the position of the fruit on the conveyor. Regarding the measurement of damages and diseases, the amount of pixels and the size of areas in a certain false color coding combined with mechanical parameters (distance, conveyor speed) lead to classification into a certain category.

The false color coding for the RGB pictures is related to the visual optical appearance. Characteristics to be analyzed are: Several diseases, rot, scars, outer bruises, dark spots, color class, size. In a manual parametrization step, classes are generated from the RGB recordings for these categories and coded in different colors using the offline software (Falco). In contrast to this, classification of chemical characteristics is conducted with the hyperspectral camera images. Quantities to be measured are: Sugar content mango (BRIX, precision <1° Brix), dry matter (%, precision +/- 1%) an inner bruises (mm2).

In order to reach the demanded quantitative measurement quality, a calibration process followed by a validation step is necessary to generate a linear regression curve. In a first step, a selection of fruits is analyzed in the line of inspection to derive their spectral information. Then, samples are taken from these fruits at distinct surface positions. After packaging and identifications, these samples are sent to the food lab to be analyzed in terms of sugar concentration and moisture. By means of the parametrization software, the concentration values of the fruits are correlated to the magnitude of the reflected spectra in the region of interest.

Hence, a linear regression model is established, ready to be sent to the hyperspectral camera for the following validation step. New fruits are analyzed with the generated model in order to check the quality of the model in terms of accuracy.

For geometrical measurement calibration (e.g. length, area) an object with known dimensions is analyzed at the line of inspection to adjust the size parameters. Typically, the width of the infeed is given by the conveyor dimensions, but the adjustable product speed has an big influence on the image of a line scan camera. Hence, geometric corrections have to be prepared for the particular sorting program.

**Figure 2.4:** Software Tools: Hyperspectral modelling - Sqalar (EVK) and RGB color classification with Falco (KestrelEye).

### **2.4 Analysis Software**

The used software has to fulfill several tasks to enable ripeness sorting. It starts with the data acquisition from two hyperspectral and two RGB cameras located in the Sherlock Food Analyzer. This is done via Ethernet connection and the GigE protocol. The false color pictures are transferred from the network card to the computer's memory. With the availability of the accessed pictures, the next data processing step takes place: background segmentation. Upon successful segmentation, the objects (fruits) are extracted from the conveyor background. As soon as an object is generated, the synchronisation with the conveyor for positioning is started. Now the geometrical and color-based attributes are calculated in order to obtain false color images. This process is repeated on the second Sherlock Food Analyzer after rotating the fruit. Additionally, special algorithms called classifiers are used for the detection of further object attributes (e.g. black spots). Last but not least, the weight of each fruit is derived from the load cell and added to the fruit attributes.

Now all the needed measurement values are available to make the quality decision. The decision making parameters and thresholds are configured by means of the graphical interface of the implemented rule engine. The operator is in charge of selecting the appropriate ejection position based on a defined set of rules. The applied rule can logically combine sugar concentration, moisture, bruises, weight and black spot areas. Based on the selected sorting program, the rule engine is configured to the corresponding parameters and rule settings. The measurements and the rule decisions of every fruit are stored on the machine in a file that is available to be sent to host computers in the plant.

### **3 Conclusions**

The company Insort mastered the challenge and built a unique new sorting machine using their Sherlock Food Analyzer as the key element. Hyperspectral imaging technology in combination with the RGB cameras is able to sort out chemical parameters as well as color differences. The sorting is conducted with a higher precision and velocity than manually before. After the realization of this project, the customer is able to successfully sort the fruits into different quality categories based on important parameters and supply high quality fresh fruit to the fast growing market.Implementation of the optical characterisation system was done for mango and avocado sorting. Ideas for the sorting of other fruits are being considered.

### **4 Acknowledgment**

The development of the described ripening machine was only possible due to the confidence of the CEOs of Frutura (Mr. Manfred Hohensinner, Mrs. Katrin Hohensinner) and Insort (Mr. Matthias Jeindl) in their team and in the project. Great thanks to all the members of Frutura for the support during commissioning work and to the engineers at Insort who contributed with their knowledge in software, hardware and optical application to the success of this project. Many thanks to Mr. Philipp Staubmann who redacted this paper and transferred the text to LaTeX.

It should be mentioned that the work effort for the generation of this paper was co-funded by the European Union in the framework of Horizon 2020, Grant Agreement: 872770.

#### **References**


# **A new approach for evaluation of meat freshness**

Nensi Kasvand1, Aleksandr Frorip1, Artur Kuznetsov1, Tonu P ˜ ussa ¨ 2, Linda Rusalepp2, Alar Sunter ¨ 1, and Dea Anton2

<sup>1</sup> Ldiamon AS, Tartu Science Park, Teaduspargi 7, 51014, Tartu <sup>2</sup> Estonian University of Life Sciences,Institute of Veterinary Medicine and Animal Sciences, Department of Food Hygiene Kreutzwaldi 56/3, 51006 Tartu

**Abstract** Adenosine triphosphate (ATP) and inosine monophosphate (IMP) dominate in fresh meat at the stage of autolysis, whereas ATP exists only during very first hours after slaughter. Hence, ATP is a highly suitable marker of absolute freshness of rapidly frozen meat/fish. Fast Protein Metabolites Liquid Chromatography (FPMLC) method was used to evaluate freshness of minced pork during storage and compared to results of standard methods -Volatile Fat Acids (VFA) and total count of bacteria. Good agreement between new FPMLC method and standard methods was achieved.

**Keywords** Meat freshness index, inosine monophosphate, hypoxanthine, fast liquid chromatography

#### **1 Intoduction**

Many meat and fish freshness/quality assessment indices are based on combination of concentrations of ATP breakdown products:

$$\text{ATP} \rightarrow \text{ADP} \rightarrow \text{AMP} \rightarrow \text{IMP} \rightarrow \text{Iro} \rightarrow \text{Hx}\_{\prime} \tag{1.1}$$

where ATP, ADP, AMP are adenosine tri-, di- and monophosphate respectively, IMP - inosine monophosphate or inosinic acid, Ino – inosine and Hx – hypoxanthine [1]. ATP and IMP dominate in fresh meat at the stage of autolysis, whereas ATP exists only during very first hours after slaughter. Hence, ATP is a highly suitable marker of absolute freshness of rapidly frozen meat/fish. IMP is well-known endogenous taste enhancer (E630) and synergic component of umami taste [2]. Hx, on the contrary, lead to the bitter off-taste [3], which makes it a good sign of the onset of meat or fish spoilage. Increased concentration of Hx may also be a matter of health since hypoxanthine is a precursor of hyperuricemia and gout [4] or associated Lesh-Nyhan syndrome [5].

The first historical ATP-linked freshness index K was proposed by Saito et al. [6, 7]:

$$K = \frac{[Ino] + [Hx]}{[ATP] + [ADP] + [AMP] + [IMP] + [Ino] + [Hx]} \tag{1.2}$$

Since ATP, ADP and AMP exist in remarkable concentration only during short period of first hours, Karube et al. proposed simplified index *Ki* for operation in the ordinary time scale of days [8]:

$$K\_i = \frac{[Ino] + [Hx]}{[IMP] + [Ino] + [Hx]} \tag{1.3}$$

Both indices, K, and *Ki*, have been widely used in fish science [1,7], but considerably less for red meat [9], regardless of the fact that these ATP metabolites have been exploited for a long time as freshness markers [10–16]. One of the reasons can be that the indices change only in a limited range of values from 0 to 1 (or 0% - 100% ) with no great or dramatic dynamics.

Furthermore, linking data between freshness indices and bacterial contamination of meat is deficient. Mostly researches on ATP metabolites and bacterial contamination have been done separately without any attempt to combining both into one more universal approach. Recently, a two-band freshness index (TBFI) was proposed to monitor bacterial contamination on chicken meat surface using hyperspectral imagery technique [17]. It is crucial to look for correlation of freshness indices and bacterial contamination to expand methodological and technical capabilities in establishment of freshness and early spoilage indications of foods.

The aim of this work was to study correlations of *Ki* indices and standardized meat evaluation methods (e.g., Volatile fatty acids VFA and bacterial contamination), validating the FPMLC method in process.

### **2 Materials and methods**

**Sampling and materials** The study was performed in two sequential experiments. Pork sirloin, one-day-after-slaughter, was minced and samples were packed according to sampling plan in two different conditions: air and vacuum and were stored at 2 – 4 °C for 14 days. Samples were analyzed every other day according to the schedules, 8 times in total.

Volatile fatty acids (VFA), total count of aerobic microorganisms and Pseudomonas spp. were determined by the Estonian Veterinary and Food Laboratory (EVFL). The FPMLC device designed by Ldiamon AS has been used for determination of ATP metabolites [18, 19]. Determination of ATP metabolites concentration was carried out by LC-DAD (MS) and determination of lipid oxidation level by the TBARS method in Estonian University of life Sciences.

**pH determination** Measured in homogenized mixtures of 1 g of sample and 9 mL of distilled water with Consort C833 digital pH-meter (Consort, Turnhout, Belgium) at room temperature. Calibration of pH meter was regularly checked.

**Determination of ATP metabolites by FPMLC** Mixture of 2 g of minced pork with 6 ml TRIS buffer (pH 8.0) in 15 ml test tube was shaken for 11 minutes in a rotator Biosan Multi RS60 (BioSan, Riga, Latvia) and filterd with syringe filter (Whatman, 0.45 μm.). 6 drops of extract was dropped into the PD-10 column of the FPMLC device.

**Determination of ATP metabolites concentration by LC-DAD (MS)** The liquid chromatographic analysis of meat extracts was carried out by a 1290 Infinity system (Agilent Technologies, Waldbronn, Germany) coupled to an Agilent 6450 Q-TOF mass spectrometer equipped with a Jetstream ESI source. Samples were subjected to a Zorbax 300SB-C18 column 2.1×150 mm; 5 μm, (Agilent Technologies) kept at 40 °C. For the separation of compounds, a gradient of 0.1% of formic acid in water (A) and 5% of water in acetonitrile (B) was used as follows: 0.0 min 1% B, 3.0 min 1% B, 3.01 min 99% B, 11.1 min 99% B, 11.01 min 1% B, regeneration time 8 min. The eluent flow rate was set to 0.3 mL/min and the injection volume size was 2.5 μL. Mass-spectrometer was working in the negative ionization mode in the mass-to-charge ratio (m/z) range of 100–3200 Da. UV absorbance was measured at 250 nm. Data acquisition and initial data processing were carried out by MassHunter software (Agilent Technologies).

The identification of IMP, Ino and Hx in samples was confirmed by comparing MS/MS and UV spectra with those of analytical standards. IMP, Ino and Hx were quantified by UV absorption at 250 nm using external calibration curve method. Methanolic standard solutions with concentrations of 3.125, 6.25, 12.5, 25, 50 and 100 μM were prepared for all three analytical standards. Calibration curves were characterized by a high correlation coefficient (*R*<sup>2</sup> = 1).

**Validation** The chain of parallel measurements and comparison of VFA and FPMLC results were considered as validation procedure of FPMLC method. For this purpose, a "clone" samples were tested at the EVFL with the standard method VFA [20]. Official protocols from the EVFL have been compared with results of synchronized measurements carried out by University of Life Sciences and Ldiamon AS.

#### **3 Results and discussion**

FPMLC chromatograms of meat consist of two main parts: the sharp protein peak and broad post-protein band (Figure 3.1) that is formed by merging of individual peaks of the main nucleotide actors. During the storage of meat, the ATP is decomposed by the enzymes to the metabolites with smaller molecular weight, as a result, the retention time of the metabolites increases. The major operand of FPMLC analysis - Time, measured in seconds, is the time interval between retention time of the flat band of metabolites and sharp protein peak in the FPMLC chromatograms (Figure 3.1).

The changes in concentrations of ATP degradation products - IMP, Ino and Hx, that occurred in minced pork during air and vacuum storage for 14 days at 2 – 4 °C with correspondently calculated *Ki* indices are shown in Figure 3.2. In air storage decreasing in concentration of IMP during the meat storage took place more rapidly than in vacuum, degression 3,14 and 1,42 times, respectively. Changes in *Ki* indices were steeper in air stored samples for first 7 days of meat storage.

Correlation of new FPMLC method and LC-DAD/MS was satisfying (Figure 3.3). Correlations for both FPMLC indices – Time and K-value with *Ki* values determined by LC-DAD/MS were linear, with values of *R*<sup>2</sup> respectively 0,83 and 0,86.

Compatibility of FPMC method and bacterial contamination was good for both storage conditions air and vacuum (Figure 3.4). Correlations of indices Time and K-value with total count of bacteria *Pseudomonas spp.* was linear with values of *R*<sup>2</sup> 0.93-0.94 and 0.79-0.81 respectively for air and vacuum stored samples.

**Figure 3.1:** Comparison of two FPMLC chromatograms produced at days 2 and 9 during storage of a pork meat.

**Validation** Validation process of FPMLC method took place in strictly parallel mode of progression of the indices VFA and Time in cooperation with EVFL (Figure 3.5).

The simultaneous steep rises after the 9th (pH=5.6) or 11th (pH=5.3) days mark the end of changes in condition of meat caused by autolysis and onset of changes caused by bacterial activity. For more acidic pork (Figure 3.5b) the level of VFA in the flat interval of predominantly autolytic transformations is somewhat higher than for the meat of pH 5.6.

**Figure 3.2:** Changes in concentrations of ATP metabolites – IMP, Ino and Hx, and corresponding *Ki* values during the storage of minced pork in vacuum (a) and air (b) at 2 – 4 °C.

**Figure 3.3:** Correlations between *Ki* values determined by LC-DAD (MS) and main indices of FPMLC – Time and K-value for minced pork stored in vacuum and air at 2 – 4 °C.

Therefore, the maximal level of VFA indicated for the pork pH 5.6 on the day 14 is remarkably higher than for acidic one, i.e., 19.2 and 11.78 mg KOH/100g, respectively. In both cases, at the onset of the steep rise the meat had already specific heavy odor and slime.

Correlation coefficients for the paired curves are high: r = 0.955 (pH=5.6) and r = 0.93 (pH=5.3) sustaining credibility and reliability of the new FPMLC method.

### **4 Conclusions**

Results of FPMLC method were in a good agreement with standardized methods of VFA and total bacterial count. Both indices of FPMLC

**Figure 3.4:** Correlations of FPMLC operands – K-value and Time with total count of bacteria *Pseudomonas spp.* (Log (CFU/g)) for minced pork stored in vacuum and air at 2 – 4 °C.

**Figure 3.5:** Changes in VFA and FPMLC indices Time for air stored minced pork at2–4 °C with different initial pH levels - pH=5.6 (a) and pH=5.3 (b).

– Time and K-value were in a good correlation with bacterial growth of *Pseudomonas spp.*.

FPMLC method has promising validity in relation to minced pork freshness control and needs to be studied further with other subjects of meat or fish. The FPNLC method can be applicable for the selection of the raw meat or fish of the highest quality or cold chain management for special control of frozen meat and fish in suspicious cases.

#### **References**


# **Towards the universal assessment of dietary intake using spectral imaging solutions**

Yannick Weesepoel1, Freek Daniels2, Martin Alewijn1, Mireille Baart3, Judith Muller-Maatsch ¨ 1, Gorkem Simsek-Senel ¨ 2, Hajo Rijgersberg2, Jan Top2,4, and Edith Feskens3

<sup>1</sup> Wageningen Food Safety Research, Wageningen, The Netherlands <sup>2</sup> Wageningen Food and Biobased Research, Wageningen, The Netherlands <sup>3</sup> Wageningen University and Research, Department of Human Nutrition and Health, Wageningen, The Netherlands <sup>4</sup> Vrije Universiteit Amsterdam - Faculty of Sciences, Computer Sciences, Business, Web and Media,

Amsterdam, The Netherlands

**Abstract** Personal(ised) sensors or "wearables" could, in the future, be applied to provide personalized nutritional advice. A challenge here is to assess dietary intake. To date, this has mainly been done with questionnaires or interviews, for example, food frequency questionnaires or 24-hour recalls. However, these methods are prone to bias due to conscious or unconscious misreports, and more objective measurement methods are desirable. By applying and combining spectral imaging techniques like hyperspectral imaging (HSI) and RGB-depth (RGBD) imaging, information on the macro-composition, identity and quantity of food consumed can be obtained. In this work, we demonstrate that HSI was effective for estimation of the fat content and layer thickness of butter on slices of bread with root mean squared errors of predictions of 4.6 (fat w/w %) and 0.056 mm respectively. Identification and volume estimation of vegetables and preparation methods were successful with RGBD imaging. Using Convolutional Neural Networks, all samples were correctly identified. For volume estimations of vegetables, R-square scores between 0.80 – 0.96 were achieved.

**Keywords** Non-destructive sensing, measuring and detecting, healthy behavior, consumer science

### **1 Introduction**

Dietary intake data is important in epidemiology, for dietary interventions and for providing dietary advice. Advances in nutritional science have facilitated the shift from general population-based dietary guidelines to more personalized dietary recommendations. To provide personalized advice, an appropriate method for measuring and assessing dietary intake is required. Dietary intake is traditionally assessed with questionnaires or interviews, for example, food frequency questionnaires or 24-hour recalls. However, these methods are prone to bias due to conscious or unconscious misreports. Therefore, more objective measurement methods are desirable. Moreover, the traditional methods are rather time-consuming for both researchers and consumers.

For this reason, we aim to develop an easy-to-use measurement setup that enables consumers to measure their dietary intake themselves, using state-of-the-art vibrational spectroscopy and imaging techniques. The analytical and machine learning challenges for this work comprise several issues:


In the current study, a first demonstrator is developed at technology readiness level 4. This means that the demonstrator can be operated in a controlled environment by trained persons. We demonstrate in this study the individual imaging techniques applied to the intake of butter (HSI), and vegetables (RGBD), and their respective data analysis procedures.

### **2 Materials and Methods**

#### **2.1 Materials**

Fresh (1 – 2 days after baking) sliced white and wholegrain bread, crispbread, 44 commercially available types of butter and margarine, fresh endive (Cichorium endivia), fresh carrots (Daucus carota subsp. sativus), fresh spinach (Spinacia oleracea), frozen spinach and spinach with added cream of a single brand were purchased from a local supermarket in Wageningen, The Netherlands. All chemicals used for macro composition reference analysis were in purity grades following the specifications in the ISO standards used (Section 2.3).

#### **2.2 Sample preparation**

To test the ability of dietary intake assessment, the sensing principles were firstly assessed separately in two food cases: (i) (low-fat) margarine and butter applied to a sandwich using HSI, and (ii) raw and prepared vegetables using RGBD imaging. For the first case, standardized amounts (3, 6, 10, 15 and 30 g) of 28 butter types with varying fat concentrations (29.8% - 84.3% w/w) were applied on three different types of sandwiches with standardized sizes (wholegrain and white 10 x 10 cm and crispbread 6.0 x 8.1 cm). A randomized experimental design was made in combining the type of bread, margarine/butter type and layer thickness of the butter/margarine, resulting in 84 samples. As an alternative experiment, butter was applied (16 butter types, fat concentration 28.0% - 82.5% w/w, applied on white and wholegrain bread) using a custom-made triangular applier, resulting in a continuous layer thickness of 10 mm – <0.5 mm (Figure 2.1 A-B).

For the second food case, fresh endive, fresh carrots and fresh spinach were investigated in raw form and after cooking. In addition to the fresh spinach samples, frozen spinach and frozen spinach with added cream were added to the sample set resulting in 7 different sample types. Fresh samples were cleaned and washed with cold tap water before spectral acquisition. Cooking was performed using a steam oven and specific cooking times for each product. Frozen and frozen creamed spinach were defrosted and heated according to the producers' instruction. All samples were brought to room temperature before spectral acquisition. Sample plates were created by adding a random amount of mass from a product-specific range (carrots 20 - 80 g, endive raw 5 - 15 g, endive cooked 15 - 50 g and spinach 20 - 60 g) to the plate. After every five samples, the plate was emptied and cleaned. Product samples were put on the plate off centre and placed under the RGBD camera such that the centre of mass was roughly at a randomly selected rotation. In total, 585 samples plates were prepared, by incrementally adding vegetables to the same plate (i.e. 5 times for each plate). A single RGB and depth image was captured for each sample (Figure 2.1 C-D).

#### **2.3 Determination of reference values of butter and vegetables**

The total moisture content of the butter samples was determined by NEN-EN-ISO 3727-1:2001 / IDF 80-1:2001, the total fat content by NEN-EN-ISO 17189 : 2003 / IDF 194 : 2003. For the vegetables, the sample mass was determined using a standard-issue analytical balance.

#### **2.4 Spectral image setup and acquisition**

The HSI and RGBD image equipment was mounted in a customized build setup with an approximate distance to the sample of 30 and 40 cm respectively (Figure 2.1 E-F). A standard-issue white glazed ceramic plate (diameter 20 cm) was used for sample presentation. The HSI camera was mounted dead-centre above the sample, whereas the RGBD camera was mounted on a 45-degree angle. For the HSI system, four standard issue halogen lights were installed at each corner around the camera. Lights were directed dead-centre downwards to avoid direct illumination of the reflecting ceramic plate and surfaces of the samples presented. An Effilux bar light (ELB-400SW, Effilux, Les Ulis, France) with a diffuse window was installed underneath the RGBD camera.

An IMEC SWIR Snapscan hyperspectral imaging system (Interuniversity Microelectronics Centre, Leuven, Belgium) with a spectral range of 1117 – 1670 nm equipped with an Optec 16mm F1.7 SWIR lens (Optec S.p.A., Parabiago, Italy). The system was controlled by the IMEC Snapscan software (version 1.3.0.8, IMEC). Before data acquisition, the camera was calibrated using a 95% reflection calibration white standard (WS) tile sized 200 x 200 mm, using an integration time of 2.5 ms. Raw sample spectra were corrected according to **Equation 2.1** to generate corrected diffuse reflection spectra used in subsequent calculations for the 640 x 512 x 108 (x *x* y *x λ*) sized hypercubes.

$$Corr\_{(\mathbf{x},\mathbf{y},\lambda)} = (Raw\_{(\mathbf{x},\mathbf{y},\lambda)} - Dark\_{(\mathbf{x},\mathbf{y},\lambda)}) / (WS\_{(\mathbf{x},\mathbf{y},\lambda)} - DarkWS\_{(\mathbf{x},\mathbf{y},\lambda)}) \tag{2.1}$$

RGBD images were acquired using an Intel RealSense D415 depth camera (Intel Corporation, Santa Clara, CA, USA) controlled by a custom build GUI. Images were captured in a semi-controlled environment, with light blocking on the left, right and backside of the scene using mat dark plastic plates. Furthermore, a dark A2-sized paper was used as a background on which plates were placed. The exposure time was set to 70 ms to prevent saturation of the image. Intrinsic calibration parameters available on the camera were stored alongside the color and depth image to be able to construct a point cloud of the depth image.

#### **2.5 Data analysis**

#### **Multivariate statistics of HSI data**

Processing of the HSI images was performed using the hyperSpec package [2] in R 3.6.1 [3]. The corrected images contained hyperresponsive (dead) pixels and non-sample regions, both of which needed to be excluded from further processing (Figure 2.1 B). Dead pixels contained reflection values > 1*e*6 and were easily recognized and removed. To discriminate between the sample (region of interest, ROI) and the background (empty plate, table and showy edges were present), a simple spectral based principal component analysis (PCA) approach was performed, in which scores on Principal Component (PC) 2 and a threshold of 0 determined if any pixel belonged to the

**Figure 2.1:** (A) Triangular mold for accurate layer thickness estimation of butter; (B) Spectral Image Heat plot of the butter applicator on a wholegrain sandwich; (C) RGB image of raw endive; (D) Depth image of (C) visualized using a gray color map; (E) Frame design of food intake setup; (F) Experimental setup with Hyperspectral SWIR camera (center) and RGBD camera (right bottom).

sample or not. As PCA's signs are non-determined, the portion near the centre of the image – presumed to be sample and not background – determined whether positive or negative PC2 scores contained the ROI. No data preprocessing or further thresholding or area filling was needed nor applied.

To estimate the butter and bread types, and mainly to characterize the butter (fat content, type, fat (un)saturation) and quantity (average layer thickness) of the butter on the bread a straightforward approach was chosen. a more straightforward approach was chosen. First, ROIspectra were corrected for scattering effects (standard normal variate, SNV). Then the intensity of the signal at 1221 nm was found to correlate well with the amount of butter (fat) [4]. Individual spectra from the ROI pixels were sorted based on the value in this number, and the top and bottom 5% of the spectra were averaged to represent the butter and bread-spectra, respectively. These averaged spectra were input for further (multivariate) classification and regression. Although machine learning approaches per pixel (or pixel cluster) of spectral cube data are a normal approach, this method was not necessary for these relatively simple samples.Bread type (3 types) were classified using the "bottom 5%" spectra for each of the samples using Soft independent modelling by class analogy (SIMCA), employing cross-validation (25x bootstrapped training sets). Butterfat content (Section 2.3) and butter layer thickness on the bread (5 discrete values, Section 2.2) were fitted using multivariate regression (PLS) using leave-10-out cross-validation. To assess the effect of the interaction of the butter layer thickness on bread on the observed spectra, a custom plexiglass shape was built to help apply butter on bread in known thicknesses. This device was 10 cm long and approximately 2 cm wide, and had a height of 10 mm on one end, linearly decreasing to 0 mm on the other end. When placed on a sandwich, this shape can be filled with butter and smoothed with a knife, after which an HSI will yield spectra with known butter thicknesses on bread -– only marginally affected locally by the foamy structure of the bread, slightly increasing the average thickness of the butter layer.

#### **Analysis of RGBD images**

Colour and depth information was used to identify and estimate the mass of vegetables on the plate. Two separate Convolutional Neural Networks (CNNs) were trained on colour images; ResNet50 [5] to classify the vegetable present in the scene and U-Net [6] to semantically segment the scene into pixels belonging to vegetables and the background. The segmentation was used to extract the relevant information from the depth images which were subsequently converted into a point cloud using available intrinsic calibration parameters. From the point cloud the volume was estimated. Finally, linear regression models were constructed for each identified product on the volume to obtain the mass. ResNet was the first network to introduce so called skip connections. These connections allow a layer to learn the identity operator, such that layers can be skipped if they are considered superfluous. U-net is a fully CNN using up sampling layers to obtain the same resolution in the output as layers as the input image. Both networks were trained using PyTorch (version 1.2, www.pytorch.org). The initial learning rate for U-net was upto 0.01, and for ResNet50 to 0.001 as pretrained weights were used for the latter. The learning rate was reduced by factor 10 when no improvement was obtained on the validation set after 200 epoch. Early stopping was applied after 400 epoch of no improvements. Samples were partitioned over a training, validation and test set with each respectively 355, 114 and 116 samples.

### **3 Results and Discussion**

#### **3.1 Bread and butter case using HSI**

ROIs were set prior to classification of bread types, fat content estimation and layer thickness determination (Figure 3.1). Bread type classification was performed using SIMCA on the averaged spectra assigned to bread (Table 4.1). While it is clear that the spectra contain sufficient information to reliably distinguish between bread and crispbread underneath the butter layer (regardless of the amount and type of butter applied), the discrimination between white and wholegrain bread was not so clear. Apart from the visual color, their composition (moisture, protein, carbohydrates, fat) is not very different, and the difference in fibers is not strong enough in this part of the near-infrared (NIR) spectrum to make this discrimination reliably in this setup [7].


**Table 4.1:** Confusion matrix for the classification of bread type.

The spectra were also used to estimate the fat contents and the (average) thickness of the butter layer applied on the bread. Figure 3.1 gives the predictive capability of the multivariate method applied for both provisions. Note that the fat content of the butter was estimated from all layer thicknesses present in the dataset, and the layer thicknesses were estimated regardless of the butter type and fat content, and these parameters were estimated regardless of the type of bread under the layer. As root mean squared error of predictions (RMSEPs) values of 4.6 (fat w/w %) and 0.056 (mm) were found, which can be considered quite acceptable for the intended application. Of course, models perform slightly better when separate models are made by determining the bread type and layer thickness first before estimating fat content, or bread type and fat content before estimating layer thickness (data not shown), but these improvements require more data to be calibrated and tested more reliably.

For further development of this model, both in terms of robustness and possibly to reduce the calculation cost (currently approximately 5 seconds per sample on a 2.7 GHz 6-core laptop), it is desirable to understand the interaction that two foodstuffs have on the NIR spectrum as obtained by the hyperspectral camera. Figure 3.1 shows the observed spectra as a function of butter layer thickness as obtained using the butter applicator device. Each line is the average of 20 spectra (±0.4 mm) around the thickness indicated. Spectra are corrected for scatter-effects (SNV) and transformed using (simplified) Kubelka-Munk (K/S) absorption-over-scattering coefficient-values [8]. It is clear that there are trends in the spectra according to the layer thickness, but there does not seem to be a straightforward solution which explains the trend from 0 to 10 mm of butter on the bread for all wavelengths. The same holds for unprocessed and only scaled data. This implies that the desired information is present in the spectra, but the described multivariate statistics and due calibration are required to extract this information.

#### **3.2 Vegetable case using RGBD imaging**

To calculate the volume of vegetables, firstly, PCA was performed on point clouds centered around the origin (Figure 3.2). The vectors found using PCA represent the length, width and height of the threedimensional objects. Next, points were rotated such that the third principal component was aligned with the vector (0,0,-1). After rotation, the surfaces of the scenes were parallel to the plane z = 0. To compensate for noise in the depth image, z-values of the points were subtracted such that the 98th percentile was at z = 0. Finally, point clouds were subsampled in a uniform voxel grid of fixed dimensions and estimates of volumes were obtained by summing the z-values.

The CNNs were trained on the training set, early stopping was applied using the validation set and the results are reported on the test

**Figure 3.1:** (A) A result of ROI detection of the 10 x 10 cm of bread-and-butter detection of HSI data in a false-colour image of first PC scores; (B) Average NIR spectrum as a function of the butter layer thickness (0 – 10 mm); (C) determination of the total fat content of butter independent of the bread type and layer thickness; (D) determination of the layer thickness of the butter independent of the bread and butter type.

set. The classification of the selected vegetables and preparations was successful with zero misclassification as shown in the test set. Indeed, there seems to be sufficient variation present in the RGB data in either color or texture for successful identification. When endive is cooked the product darkens whereas the carrots lose the shiny smooth structure after having been prepared. As for the three different types of spinach, these products themselves are visually different. Estimates of the mass were obtained using product-specific regression models on the calculated volumes. Reasonable results were obtained ranging from an R-squared score of 0.80 to 0.96 (Table 4.2). Although the volume estimation method is in principle generic, as it does not assume any shape or form of the product, to get an accurate mass estimation individual models are needed. A single regression model could be constructed based on the estimated volumes when the (average product) densities are known.

**Figure 3.2:** (A) Colour image of a scene; (B) Reconstructed point cloud of the segmented vegetables with the first three principal components, corresponding the length (red), width (green), and height or depth (red) axis of the vegetables.

### **4 Conclusions and outlook**

Both sensing systems performed well in the identification and quantification of food products, and the estimation of nutrient composition. To expand the versatility of the setup, a data-fusion experiment will be performed. In this experiment, sandwiches containing multiple elements will be used, such as tomatoes, avocadoes, nuts/seeds, sprouting vegetables and butter. Also, the application of a functional meta-data layer for improving decision making and reducing the number of reference samples will be worked out. In 2021, a human study will be performed to test and validate the setup. Finally, the ultimate goal is to integrate the used analysis techniques into smaller devices such as wearables or smartphones that can be used by consumers themselves. Data obtained by using these wearables can then be combined with data from other sources such as consumer profiles to assess dietary intake as accurately as possible. Missing data can be supplemented based on user profiles, for example, if a particular food is not optimally oriented concerning the measuring equipment or if the food cannot be adequately determined through detection techniques at all. This will be of great value for providing personalized dietary advice that is optimally tailored at the individual consumer, eventually contributing to better health.


**Table 4.2:** R-squared scores for mass estimation of individual vegetables by RGBD imaging.

### **5 Acknowledgements**

This research is part of the Dutch Research Council (NWO) program "Dutch National Research Agenda: Start Impulse Measuring and Detecting of Healthy Behavior" under grant number 400.17.604. www.Metenendetecteren.nl

### **References**


# **Detection of pyrrolizidine alkaloid containing herbs using hyperspectral imaging in the short-wave infrared**

Julius Krause1, Nanina Tron2, Georg Maier1, Andrea Krahmer ¨ 2, Robin Gruna1, Thomas Langle ¨ 1, and Jurgen Beyerer ¨ 1,3


**Abstract** Plants containing pyrrolizidine alkaloids (PA) are unwanted contaminants in consumer products such as herbal tea due to their toxicity to humans. The detection of these plants or their components using hyperspectral imaging was investigated, with focus on application in sensor-based sorting. For this, 431 hyperspectral images of leafs from three common herbs (peppermint, lemon balm, stinging nettle) and the poisonous common groundsel were acquired. By using a convolutional neural network, a mean *F*<sup>1</sup> score of 0.89 was obtained for the classification of all four plant products based on the individual spectra. To validate the neural network, significant wavelengths were determined and visualized in an attribution map.

**Keywords** Hyperspectral Imaging, convolutional neural network, Pyrrolizidine alkaloids, sensor-based sorting

### **1 Introduction**

Pyrrolizidine alkaloids (PA) are secondary plant substances that are toxic to the genome and the liver. They can pose a health hazard, as they are often found as a contaminant in food or medicinal plant products. Just a few plants per hectare e.i. of *Senecio* are sufficient to contaminate the product non marketable. Hence, regular field monitoring and removal of appropriate weeds is necessary. This is a considerable personnel effort that can hardly be afforded economically by the growers. Additionally, for correct identification of toxic herbs and to prevent contamination of the products, the growers need well trained personal. Therefore, methods to identify and remove toxic contaminants after harvest are highly demanded.

Sensor-based sorting is a machine vision application that has found industrial application in various fields. An accept-or-reject task is executed by deflecting single particles from a material stream. The main fields of application of sensor-based sorting are recycling, e.g., removing materials from glass shard streams harmful to the melting process such as stones and ceramic glass [1]. Another field of application is the processing of industrial minerals, mainly to remove unwanted gangue from ore, e.g., copper-gold ore [2]. For ensuring product safety for foodstuff and agricultural products, these methods are used for the detection and removal of fungus-infected wheat kernels [3]. Hence, sensor-based sorting of crops also represents an opportunity to safe the harvest potentially contaminated with PA.

In this paper, three of the economically most important herb cultures in Germany (peppermint, lemon balm, stinging nettle) and the most common contaminant of these cultures (common groundsel) were characterized by near infrared spectroscopy. Optical spectroscopy in near and short-wave infrared is particularly suitable for the detection and differentiation of organic products. Using hyperspectral camera systems, material streams can be monitored and foreign substances can be separated after detection by implementing the techniques in a sensorbased sorting system. Targeting this application, hyperspectral data was acquired for the mentioned herbs and their frequent contamination in an experimental sensor-based sorting system.

### **2 Material and methods**

This section describes the experimental sensor-based sorting system that was used to acquire the data. In addition, insights into the resulting data set are given. Finally, the novel data analysis method for the task at hand is described.

#### **2.1 Experimental sensor-based sorting platform with hyperspectral camera system**

In the course of this study, data for different plants are acquired using an experimental sorting platform that is equipped with an hyperspectral camera system, see Fig. 2.1. A thorough description of the experimental platform is provided in [4]. The spectral sensitivity of the InGaAs sensor of the line-scanning hyperspectral camera lies in the range of approximately 1200 to 2200 nm that is sampled into 256 spectral bands and provides 320 pixels locally at a maximal temporal resolution of approximately 300 hz. The implemented illumination consists of two arrays of halogen spotlights, which are locally targeted at the scan-line of the camera. Transportation of the material is realized by means of a conveyor belt that runs at 1.1 ms−1. The material is observed directly after being discharged from the belt, i. e., during a free flight phase. An array of pneumatic nozzles is located behind this scan-line and serves the purpose of material separation. It consists of 16 nozzles that can be triggered individually, each of which covering a width of 10 mm.

### **2.2 Dataset**

The plant material for the measurements was cultivated in beds at the JKI Berlin. The medicinal cultures were harvested twice, each time at the beginning of flowering, as the concentration of essential oils is highest then, using a cutting height of approx. 15 cm. For lemon balm (*Melissa officinalis*) and peppermint (*Mentha x piperita* 'Multimentha') 24 each and for stinging nettle (*Urtica dioica*) 12 and for groundsel (Senecio vulgaris) 10 single plants were harvested.

For training, the data set was split by random sample selection. A ratio of 75% was assigned as training and of 25% as validation data.

**Figure 2.1:** Photo of the experimental sensor-based sorting platform, equipped with the hyperspectral camera system.

**Table 5.1:** Description of the resulting dataset.


The entire data set can be described with a matrix *X* := {*xi*}*i*=1..*<sup>N</sup>* of *N* spectra and a matrix *Y* := {*yi*}*i*=1..*<sup>N</sup>* of *N* labels. Each spectrum *xi* ∈ **<sup>R</sup>***<sup>Q</sup>* contains the reflectance of *<sup>Q</sup>* <sup>=</sup> 256 spectral bands. The labels *yi* <sup>∈</sup> **R***<sup>C</sup>* are coded as one-hot vectors with *C* = 4 classes, which correspond to the plant types. An exemplary visualization of the data is shown in Fig. 2.2.

**Figure 2.2:** Shown are the mean spectra of all four plant species. Through the intensity differences at certain wavelengths, a classification can be done.

#### **2.3 Data analysis using the convolutional neural network** *AnniNet*

A new neural network architecture, *AnniNet*, has been developed at Fraunhofer IOSB. This architecture is particularly suitable for the analysis of near-infrared spectra. *AnniNet* consists of three components, namely an encoder network for feature extraction, a decoder network to improve feature extraction and a classification network that estimates the class membership of a spectrum. All three networks are trained in parallel through multi-task learning. The feature extraction structure does not require any type of spectral data pre-processing. The individual parts of *AnniNet* and multi-task learning are presented in the following.

#### **Encoder network**

The encoder layer is used for feature extraction. For this purpose, a layer with different one-dimensional convolution kernels is trained and applied to the spectrum. This layer can be compared with waveletbased methods that are already successfully used in chemometrics [5]. It has been shown that wavelets are successful in reducing noise [6] and suppressing background effects [7]. This enabled improved results [8], even the transfer of chemometric models was improved [9]. The features determined by means of the convolutional layer are sent to a pooling layer. This layer has a very practical effect besides reducing the parameters for the following layers. The feature extraction becomes more robust against shifts in the wavelength. This is especially a disturbing effect known as smiley keystone distortion in hyperspectral cameras.

#### **Decoder network**

End-to-end learning in artificial neural networks with high dimensional inputs such as hyperspectral measurements requires a large amount of labeled training samples due to a high count of weights which are iteratively adjusted. Autoencoders are used to learn the representation of a dataset in a unsupervised manner. Adding a decoder network creates such an autoencoder within *AnniNet* and improves training results without the need for additional labels.

### **Classification network**

To evaluate the overlapping information from the individual absorption bands, fully connected (dense) layers were used. In correspondence with the exponential behavior of absorption processes (Beer-Lambert law), the Scaled Exponential Linear Units (SELU) activation function

$$f(\mathbf{x}) = \begin{cases} \lambda a(e^{\mathbf{x}} - 1), & \text{if } \mathbf{x} < 0 \\ \lambda \mathbf{x}, & \text{otherwise} \end{cases} \tag{2.1}$$

was chosen for the first dense layer. Finally, a fully connected layer with the softmax activation function is applied. The output of the classification network is a normalized probability density function of the estimated class memberships.

**Figure 2.3:** *AnniNet* consists of three components. First, an encoder network extracts spectral features using a convolutional layer. During training, a decoder network is used as an autoencoder to improve feature extraction. The classification network analyses the non-linear and overlapping spectral features.

#### **Multi-task-learning**

Using multi-task-learning, all four classes and the reconstruction of the autoencoder were trained in parallel. The categorical cross entropy

$$\text{CCE} = -\frac{1}{N} \sum\_{i=1}^{N} \sum\_{j=1}^{C} y\_{i,j} \log \mathcal{Y}\_{i,j} \tag{2.2}$$

was used as a loss function to optimize the classification results for all classes *C*. The reconstruction of the spectral data using the autoencoder was optimized by minimizing the Root Mean Square Error

$$\text{RMSE} = \frac{1}{N} \sum\_{i=1}^{N} \sqrt{\frac{\sum\_{j=1}^{Q} (\mathbf{x}\_{i,j} - \mathbf{x}\_{i,j})^2}{n}} \tag{2.3}$$

for the autoencoder, which improves the feature extraction.

### **3 Results and discussion**

The classification was carried out using *AnniNet* without any spectral pre-processing. To validate the results in the following, only spectra from samples that were not included in the training were used. To check the plausibility of the classification results, a sensitivity analysis was performed and compared with the spectral data.

#### **3.1 Classification results**

The results of the classification are shown in the confusion matrix in Fig. 3.1. For all classes, the classification achieves a rather high accuracy. The application of a majority vote for the respective samples has led to a completely correct mapping in the validation data set. The quality of the classification results can be quantified by the so-called *F*<sup>1</sup> score

$$F\_1 = \frac{\text{tp}}{\text{tp} + 0.5(\text{fp} + \text{tp})} \,\text{}\tag{3.1}$$

which is the harmonic mean of precision and recall. The evaluation calculation is based on the false (fp) and true (tp) positive classification results. For the present four-class classification, the *F*<sup>1</sup> score was determined for the individual classes and averaged. The score determined in this way is *F*<sup>1</sup> = 0.89.

#### **3.2 Spectral attribution map**

Neural networks cannot be interpreted by humans due to their high number of parameters and high-dimensional transformations. However, when evaluating hyperspectral data, it is interesting to see which

**Figure 3.1:** The normalised confusions matrix shows the probabilities of the estimated class memberships in the individual fields. The estimation of the class membership by single spectra is mostly correct. It is important here that by majority vote of all spectra of a sample, all test samples could be correctly classified.

absorption bands have an influence on the classification result. One possibility for investigating neural networks is the creation of a socalled attribution map. For this purpose, individual areas of the spectrum are successively masked and then the classification quality is evaluated. For the classification task at hand, such attribution maps were determined and are visualized in Fig. 4.1. As can be seen, the classification result is mainly depend on data in the range between 1400 nm and 1900 nm. This range is between the absorption's by water and can therefore be considered robust. In addition, areas are selected or weighted differently for the different class memberships, which is another indicator of robust classification.

### **4 Summary**

Leaves from four different plants were recorded with a hyperspectral camera built in a senor-based sorting system. The data were used without spectral pre-processing to train a neural network designed for this classification task. The trained network was able to predict individual spectra of the validation data of all four classes at a high accuracy. Using a majority vote per sample, all validation samples were correctly classified. An attribution map was used to investigate which spectral ranges dominate the classification decision. The result shows different spectral ranges for the individual classes, apart from absorption by water.

**Figure 4.1:** The attribution maps show the classification error caused by masking out individual spectral channels, highlighting the importance of information in the range of 1400 nm and 1900 nm.

### **Acknowledgement**

The authors of this work were supportet by the joint project "Detektion und Entfernung von Pyrrolizidinalkaloid-haltigen Unkrautern aus ¨ Kulturpflanzen nach der Ernte – PA-NIRSort" which is funded by the German Federal Ministry of Food and Agriculture and by the Fachagentur fur Nachwachsende Rohstoffe e. V. (FNR) (FKZ 220132165) on ¨ the basis of a resolution of the German Bundestag and the Fraunhofer Center for Machine Learning within the Fraunhofer Cluster of Excellence Cognitive Internet Technologies CCIT.

### **References**

1. G. Bonifazi and S. Serranti, "Imaging spectroscopy based strategies for ceramic glass contaminants removal in glass recycling," *Waste Management*, vol. 26, no. 6, pp. 627–639, Jan. 2006. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0956053X05001649


# **Phenoliner 2.0: RGB and near-infrared (NIR) image acquisition for an efficient phenotyping in grapevine research**

Xiaorong Zheng1, Julius Krause2, Benedikt Fischer2, Robin Gruna2, Reinhard Topfer ¨ 1, and Anna Kicherer1

<sup>1</sup> Julius Kuhn-Institut, Federal Research Centre of Cultivated Plants, Institute ¨ for Grapevine Breeding Geilweilerhof, 76833 Siebeldingen, Germany <sup>2</sup> Fraunhofer IOSB Institute of Optronics, System Technologies and Image Exploitation, Visual Inspection Systems Fraunhoferstr. 1, 76131 Karlsruhe

**Abstract** In grapevine research, phenotyping needs to be done for different traits such as abiotic and biotic stress. This phenotypic data acquisition is very time-consuming and subjective due to the limitation of manual visual estimation. Sensor-based approaches showed an improvement in objectivity and throughput in the past. For example, the 'Phenoliner' a phenotyping platform, based on a modified grape harvester, is equipped with two different sensor systems to acquire images in the field. It has so far been used in grapevine research for different research questions to test and apply different sensor systems. However, the driving speed for data acquisition has been limited to 0.5 - 1 km/h due to capacity of image acquisition frequency and storage. Therefore, a faster automatic data acquisition with high objectivity and precision is desirable to increase the phenotyping efficiency. To this aim, in the present study a prism-based simultaneous multispectral camera system was installed in the tunnel of the 'Phenoliner' with an artificial broadband light source for image acquisition. It consists of a visible color channel from 400 to 670 nm, a near infrared (NIR) channel from 700 to 800 nm, and a second NIR channel from 820 to 1,000 nm. Compared to the existing camera setup, image recording could be improved to at least 10 images per second and a driving speed of up to 6 km/h. Each image is geo-referenced using a real-time-kinematic (RTK)- GPS system. The setup of the sensor system was tested on seven varieties (Riesling, Pinot Noir, Chardonnay, Dornfelder, Dapako, Pinot Gris, and Phoenix) with and without symptoms of biotic stress in the vineyards of Geilweilerhof, Germany. Image analysis aims to segment images into four categories: trunk, cane, leaf, and fruit cluster to further detect the biotic stress status in these categories. Therefore, images have been annotated accordingly and first results will be shown.

**Keywords** Multispectral camera, image acquisition, geoinformation, *Vitis vinifera*, field phenotyping

### **1 Introduction**

*Vitis vinifera* (Grapevine) is considered to be one of the most economically important fruit crops worldwide with 7.4 million ha and an annual production of 78 million tonnes in 2018. In France and Germany, 99% of the grapes are grown for wine production [1]. The yield and vine health, including wine quality are regarded as the most important economic indicators for viticulturists [2]. Grapevine is highly susceptible to several diseases, such as powdery mildew and downy mildew, of which the infection can result in a significant reduction of total soluble solids in berries and further cause a negative effect on total glucose and fructose content, total red pigments, and thus alcohol content in wine [3, 4]. Therefore, the grapevine breeding activities mainly focus on selecting new varieties with high abiotic and biotic stress resistance and high quality characteristics [5]. As a perennial woody plant, observation of phenology, analysis of growth habits (grapevine architecture), and evaluation of yield are very time-consuming and can only be conducted in the field. Most of these characteristics need to be evaluated within a narrow time window resulting in limitation of phenotyping workload and somewhat limited efficiency of grapevine breeding. Therefore, faster sensor assisted phenotyping methods with high objectivity and precision have been intensively investigated in recent years to screen breeding material, such as number and size of beeries or clusters [6,7], berry skin characteristics [8], leaf area [9], cane mass [10], diseases recognition [11–13] etc.

Due to the improvement of pattern recognition algorithms, computation capability, and image quality in recent years, computer vision is being applied in agriculture [14] for disease detection in rice, maize, coffee, grapevine etc. [15–18]. To develop a more precise and faster tool for identification of plant disease by computer vision, the convolutional neural network (CNNs) techniques are numerously used. So far, most of the published methods applied for symptom detection are based on digital images acquired using wavelength of visible and nearinfrared range of the spectrum [19]. With increasing optimization of algorithms, sensor-based approaches showed an improvement in objectivity and throughput in disease sensing in the past. In grapevine research, several sensor techniques have already been tested for various applications, including pre-symptomatic and symptomatic disease sensing [13, 20, 21]. Both ground-based hyper- and multispectral imaging are proven to be suitable for detection of foliar symptoms of esca trunk disease and powdery mildew infection in vineyard achieving overall accuracies of 82-83% and 87%, respectively [13, 22]. By combining the biophysical- (e.g. content of chlorophyll, carotenoid and dry matter) and texture parameters during classification, the accuracy to predict esca disease was increased from 81% to 99% [23]. However a hyperspectral sensor, which covers a broader wavelength range to acquire more information, is usually expensive and bulky. Therefore, using multispectral or Red Green Blue (RGB)-cameras for image capture would be more feasible in agricultural applications.

Meanwhile, several platforms, such as *PHENObot* [24], Phenoliner [25], grape-picking robot [26] or unmanned aerial vehicle [27], have been developed to carry the sensor systems for sensing in vineyard or machinery picking of fruit. However, the operating speeds of these platforms is either too slow to adapt to the usual operating speed of field working machines or provides images with low resolution. Therefore, to improve the efficiency of machinery phenotyping, an update of the former screening platform 'Phenoliner' was conducted in the present study to optimize the automated image acquisition in the field, data management, and data analysis.

### **2 Material and methods**

As described by [25], the ERO-Grapeliner SF200 (ERO Geratebau, Sim- ¨ mern, Germany), of which all units for harvesting, including the hydraulic system were removed, served as a carrier for the sensor system. A generator driven by the vehicle was conducted to provide the energy to operate the sensors, light units, and computers.

#### **2.1 Plant Materials**

Field trials were conducted in September 2020 in the experimental vineyards at the Julius Kuhn-Institut Geilweilerhof, Siebeldingen, Germany ¨ (49°13'06.0"N, 8°02'48.2"E). Seven varieties including red and white grapevine cultivars , i.e., Riesling, Pinot Noir, Chardonnay, Dornfelder, Dapako, Pinot Gris, Calardis Musque and Rieslaner with and without ´ symptoms of biotic stress were used for testing.

#### **2.2** *SmartVision* **Camera System**

In the present study, the embedded image processing system *SmartVision* from Fraunhofer IOSB was used. The *SmartVision* system includes the user-interface, camera control, image acquisition and real-time image processing unit with artificial intelligence. Furthermore, *SmartVision* provides a machine-to-machine interface based on OPC-UA (Open Platform Communications Unified Architecture) using the open source implementation *open62541*. This allows the system to be controlled remotely and additional sensors can be added.

The system is equipped with a prism-based simultaneous multispectral camera system (Fusion Series FS-3200T-10GE-NNC, JAI A/S, Germany) with a 8 mm lens (VS-0818H/3CMOS, Pyramid Imaging, USA) and was installed in the right part of the tunnel of the 'Phenoliner' with an artificial light source for image acquisition as in figure 2.1. This camera system consists of a visible color channel from 400 to 670 nm, a near infrared (NIR) channel from 700 to 800 nm, and a second NIR channel from 820 to 1,000 nm. The camera system has 2048 x 1536 active pixels and single/multi-readout mode for each channel, which provides a high resolution and speed with lower processing loads. To minimize the direct sunlight interference, curtains were used on the opening of the tunnel, and therefore, five LED light lamps (Lumimax LB500-44-W, IIM AG, Germany) were installed to achieve efficient light conditions during image acquisition. In order to achieve a better image of NIR and further increase the driving speed, two broadband NIR LED bars (EFFI-Flex, EFFILUX GmbH) were installed in addition at the side of the camera. The distance of the camera to the grapevine plants and the height of the captured region were illustrated in figure 6.1(e). The camera system was ported to a separated mini computer (MAGNUS EN072070S, Zotac, Hongkong, China) extended with WLAN access point, which serves as local host and was placed on the top platform of the harvester to process the image on-board. A 1 T of fast solid-state disc drives (EMTEC X200 Portable SSD, 450MB/s transfer speed) was used for storage of data.

To reference the geographic information to each grapevine, a realtime-kinematic GPS system (SPS852, Trimble, Sunnyvale, USA) was installed. The GPS antenna is located on top of the vehicle (Fig. 6.1(d)) and the receiver provides standardized National Marine Electronics Association (NMEA) strings for the camera system as described by Kicherer *et al.* (2017).

### **3 Image acquisition**

The SmartVision system provides a WLAN access point and the user interface is accessible using an internet browser. Both signal intensity and visible images from both RGB- and NIR-channels are shown (Fig. 3.1), which gave the operators more information to evaluate the quality of image acquisition in the field. The developed program also provides an automatic evaluation of exposure of the images. To gain a better quality of the image several camera parameters, such as cycle time, integration time, and exposure balance were adjusted accordingly. In the end, a cycle time of 100 ms and an integration time of 9 ms were set to achieve a good image acquisition with 10 images per second at a driving speed of up to 6 km/h. This operating speed is much faster compared to the former Phenoliner setup, which is only able to move at a speed of 0.5 - 1 km/h when capturing the images with a frequency of 5 images per second [25]. Meanwhile, GPS information was taken and stored.

During the image acquisition, the program displayed the status of the application, i.e., number of image files, remaining disk space, camera frame rate etc. as well. The acquired images were saved in a hierarchical data format (hdf5) and further archived in folders with metadata

**Figure 2.1:** Construction of sensor system on Phenoliner. a) container for the camera, b) computer including WLAN access point for image processing, c) sensor and light units in the tunnel of Phenoliner, d) image acquisition by Phenoliner in the vineyard, e) scheme of distance.

correspondingly. The acquired images will be further processed to be segmented into four classes (trunk, cane, leaf, and fruit cluster) manually using software 'Labeltool' (Fraunhofer IOSB). Under classes of cane, leaf, and fruit cluster, at least two sub-classes, healthy and diseased (including different disease severity levels) will be given.

### **4 Prospective phenotyping pipeline of Phenoliner 2.0 in viticultural and grapevine breeding**

As the acquisition of images with high quality and at normal working driving speed has been successfully tested in vineyards in the present study, an opportunity to apply this sensor system in two different areas

**Figure 3.1:** RGB Image (a) and normalized difference vegetation index (NDVI) image using RGB and NIR data (b).

is expected. On the one hand the use in grapevine breeding to describe and screen breeding material and on the other hand to improve mechanical harvest in viticulture.

Both applications have a slightly different workflow and in some steps different parameters of interest. The pipeline for both applications is shown in Figure 4.1. For mechanical harvesting a trained model based on automatic detection is planned to help the grape harvester make a yes or no decision for harvesting. This could be achieved by controlling the on and off function of the shaking units, and therefore the diseased fruit clusters could be excluded to increase the harvest quality. The application of this sensor system in grapevine breeding aims at screening selection criteria like yield and plant health fast and objective in the field. Besides the estimation of yield, the distinction of different diseases, as well as the disease incidence and severity, are relevant for the description and selection of breeding material. For digital documentation of phenotyping data, the processed results referenced to geographic information will be recorded and saved in a database. Based on the recorded data, a map with information of yield, the health status of the individual grapevine can be generated. In case of the breeding application it opens the opportunity to evaluate different breeding material over several years comparably, also retrospective if new phenotyping tools may be available.

In further experiments, this constructed sensor system could be installed in front of a commercial grape harvester and used under natural sunlight that does not provide a stable light condition compared to the Phenoliner, to investigate whether it enables the sensing of the grapevine status in real-time during harvesting. This will open up the opportunity to use such a sensor system in viticulture in the future to help the wine grower to improve his plot performance, adjust vineyard management and increase his wine quality.

**Figure 4.1:** Phenotyping pipeline for grape growers and breeders during harvesting. Dotted frames and lines are considered by grape breeders only.

### **Acknowledgement**

The study is supported by funds of the Federal Ministry of Food and Agriculture (BMEL) based on a decision of the Parliament of the Federal Republic of Germany via the Federal Office for Agriculture and Food (BLE), within the frame work DigiVine (FKZ: 28DE113A18).

### **References**


# **Developing a handheld NIR sensor for the detection of ripening in grapevine**

Lucie Gebauer1, Julius Krause2, Xiaorong Zheng1, Felix Kronenwett2, Robin Gruna2, Reinhard Topfer ¨ 1, and Anna Kicherer1

<sup>1</sup> Julius Kuhn-Institut, Federal Research Centre of Cultivated Plants, Institute ¨ for Grapevine Breeding, Geilweilerhof, 76833 Siebeldingen <sup>2</sup> Fraunhofer IOSB Institute of Optronics, System Technologies and Image Exploitation, Visual Inspection Systems, Fraunhoferstr. 1, 76131 Karlsruhe

**Abstract** It has already been proven that near infrared (NIR) reflectance spectroscopy can be used to measure the maturity of grapes by the determination of the sugar and acid content. Until now, winegrowers frequently collect a random one hundred berries sample per plot, to measure these parameters destructively for the estimation of the ideal harvest time of the gained product. Meanwhile, inexpensive sensors are available, to build convenient instruments for the non-destructive, low-priced and fast control of ripening parameters in the vineyard. For this, a small handheld device including a NIR sensor (1350 nm – 2600 nm) was built from a Raspberry Pi 3 single-board computer and a NIR sensor. Spectra of individual berries, sampled from Riesling (*Vitis vinifera*, L.) were collected. Corresponding reference data were determined with high performance liquid chromatography (HPLC). Samples were taken from different fruit as well as cluster zones and from the beginning of veraison until af- ´ ter harvest, to ensure a broad range of ingredients and the ripening properties of different berries from the vine. This study is the first that systematically investigates the ripening parameters of a whole vineyard with a handheld sensor. The sensor can be used in viticulture practice to detect the ripening progress and determining the ideal harvest time effective, simple, and nondestructively.

**Keywords** *Vitis vinifera*, Near Infrared Spectroscopy (NIRS), reflectance, ripening parameters, soluble solids, sugar, acid, handheld sensor, individual berry measurement

### **1 Introduction**

Viticulture is of great economic importance worldwide with an estimated average productin of 258 million of hectoliters (Mio hl) of wine in 2020. In relation to global production, 60% were produced in Europe (156 Mio hl), while the USA (24.3 Mio hl), China (8.3 Mio hl) and Russia (4.6 Mio hl) are the leading countries outside of the European Union. Within Europe, Italy (47.5 Mio hl) is the largest wine producers followed by France (42.1 Mio hl), Spain (33.5 Mio hl), and Germany (9.0 Mio hl). Riesling is one of the widely grown white grapevine varieties worldwide, and in Germany the most important cultivar of about 23% (2019) of the wine grapes area. [1, 2]

One important factor to hold a large market share is the quality of the produced wine. While the proportion of table wine in Germany is relatively stable, the production of quality wine increases rapidly [1]. In order to obtain high quality grapes, the harvest time is of great importance, since after reaching the peak of ripening, over-ripening results in a decrease in quality. The growth of grape berry takes place in three phases consisting of two growth cycles and one lag-phase [3, 4]. After the lag-phase, from verasion onwards, acids (mainly malic acid) ´ slowly start to decrease while sugars increases rapidly. The presence and amount of mainly sugars and acids determines the quality and later characteristics of the wine, whereby the sugars are mainly represented by glucose and fructose and 70 % to 90 %, respectively, and the acid being mainly tartaric acid and malic acid [5–8]. Sugars determine the alcohol content of the later wine and acidity contributes to its fruity character. The balance of sugar and acidity are determining factors of typicity and wine style which is increasingly affected by climate change. [9] Due to uneven ripening on the cluster and on the grapevine, a winegrower in general collects one hundred berries, to estimate the average maturation of his vineyard. This process is destructive, laborious and to some extent error prone.

NIR sensors are already widespread in a wide variety of disciplines. They are used in medicine (blood oxygen, diabetes, intracranial bleeding), pharmacy or agriculture (particle measurements), and to ensure food quality. Till now, several vibrational spectroscopic methods were discovered to analyse different materials, based on the interaction of radiation with matter. The most prominent method is the FTIR, which is based on the lowering of the radiation intensity due to infrared-active (IR-active) bonds. In contrast to the passing of light through the material, near infrared spectroscopy relies on the principle of reflection and absorption of radiation from the visible and near infrared spectrum of light (400 nm – 2500 nm). The IR-active bonds lower the reflection, resulting in an increase of the extinction coefficient (the degree of radiation attenuation), depicted as a peak in the reflection spectrum.

In order to investigate a feasible technique to detect quality attributes in crops, several sensors using IR radiation have been build for several fruits and vegetables. [10–12] Whilst some sensors measure the transmission spectrum of destroyed berries/bunches [13], other measure the diffuse reflection but with complex and expensive built ups with lamps and sensors. [14,15] Goisser *et al.* 2019 [16] developed a handheld NIRsensor which measures diffuse reflection, by simply placing the fruit on the sensor. This technique could be used to classify the firmness and to determine sugar content of tomatoes.

Here, we built and investigated a handheld sensor to determine the most important quality attributes of an entire vineyard without having to repeatedly destroy grapes. With a few further developments, such as an app that can process the spectra immediately, we can give winemakers a simple and quick-to-use tool.

### **2 Material and methods**

#### **2.1 SmartSpectrometer system**

The embedded spectrometer system *SmartSpectrometer* from Fraunhofer IOSB includes the spectrometer control and a real-time spectral processing unit with artificial intelligence. Furthermore, *SmartSpectrometer* provides a machine-to-machine interface based on OPC-UA (Open Platform Communications Unified Architecture) using the open source implementation open62541. This allows the system to be controlled remotely and additional sensors can be added. The built system consists of a FT-NIR sensor (NeoSpectra-Micro SWS 62231, Neospectra, SI-Ware, La Canada, California) mounted on a Raspberry Pi 3 (Raspberry Pi Foundation, Cambridge, United Kingdom) single-board computer. The sensors measures near infrared radiation in the range of 1350 nm - 2600 nm with a resolution of 16 nm.

The spectra of the berries were measured and recorded in the laboratory. A white and a black calibration was done with a Spectralon ® reflection standard (Sphereoptics, Labsphere, Inc., North Sutton, NH), to minimize influence of the ambient light and background noise. Individual berries were dried with a paper towel, placed on the sensor, and five spectra from different sides of each berry were taken using a graphical user interface.

**Figure 2.1:** Handheld *SmartSpectrometer* device based on the NeoSpectra Micro development Kit. To determine the degree of maturity, the reflection spectrum is recorded non-destructively and evaluated directly in the embedded device.

#### **2.2 Dataset**

Grapevines of the variety Riesling (*Vitis vinifera*, L.) from four different locations in the individual vineyard site "Mutterle" in Wollmesheim ¨ (WH, Landau, Rhineland Palatinate, Germany) were used (see Fig. 2.2, Table 7.1). All vines were healthy, leaves and berries were free of damages. In the plots, vine and row spacing were comparable, the greening in the tracks was well-tended and rows were oriented north - south. All vines were trained in semi-arched canes and there were either one or two fruit zones, depending on height of the canopy, with a height of 30 cm to 50 cm per zone. Per vine and fruit zone two or four berries respectively were collected.

In total 512 individual berries were harvested. The sampling took place each week, beginning with veraison (08-17-2020) until harvest ´ (09-28-2020) from defined vines, and one week after harvest (10-05- 2020) from two to four vines in the field. Berries were taken from defined but random selected vines, according their exposure to the sun (sun exposed, shaded), position in the cluster (near or far from the peduncle) and position at the vine (Table 7.1). Additionally, from each plot one hundred berries were randomly picked once a week from 08-10-2020 till harvest 09-28-2020, in order to determine the average maturity of the vineyard. Berries were transported to laboratory with a cooling box equipped with ice packs.



#### **2.3 Reference data**

The juice for the measurement of the one hundred berries sample was obtained by mixing them with a commercial available mixer (BL 6280, Grundig, Germany). Individual berries were destroyed in a Falcon tube by shaking them with four metal bullets in a paint shaker (SK450 Fast and Fluid Management, Sassenheim, Netherlands). All Falcons were centrifuged at 25,419 · g in a cooled centrifuge (Sigma 6 – 16ks, Sigma, Kawasaki, Japan) to discard the cell debris. The juice was transferred, centrifuged with 12,100 · g (Minispin Eppendorf, Hamburg, Germany) again and 1:3 diluted with double distilled and filtered (pore size 0.2 nm) water. Amount of sugars and acids (glucose, fructose, malicand tartaric acid) was determined using high performance liquid chromatography (Agilent 12900 Infinity 2, Agilent Technologies, Inc., Santa Clara, California) with ion-exchange ligand-exchange HiPlex H column (Agilent Technologies, Inc., Santa Clara, California), a refractive index detector for detection of carbohydrates and a diode array detector for acids. A standard series ranged from 1.5 to 90 g/L and 0.25 to 15 g/L for sugars and acids, respectively, was used. Recorded data were processed with Agilent OpenLab CDS Chemstation Software (Agilent Technologies, Inc., Santa Clara, California).

#### **2.4 Least Squares Support Vector Regression**

Due to the Beer-Lambert law, there is a non-linear relationship between the optical signal and a concentration of compounds, therefore Least Squares Support Vector Regression (LSSVR) with an RBF kernel was chosen as a non-linear regression method. The so-called kernel trick enables the estimation of non-linear correlations using LSSVR.

For the basic LSSVR the linear relation *y* = **wx** + *b* between the regressors *x* and the dependent variable *y* is calculated by solving the following optimization problem

$$Q\_{\rm LSSVM} = f(\mathbf{w}, \mathbf{e}) = \frac{1}{2}\mathbf{w}^T\mathbf{w} + \gamma \sum\_{i=1}^n e\_i^2 \tag{2.1}$$

subject to the equality constraints

$$y\_i - \mathbf{w}^T \mathbf{x}\_i - b = e\_{i\prime} \qquad i = 1, 2, \ldots, l \tag{2.2}$$

where *γ* is the regularization parameter and *ei* is the regression error. The advantage of the LSSVR over the regular SVM is provided via converting a quadratic programming problem into a set of linear equations. By constructing the Lagrange function with the Lagrange multipliers *α*, a set of conditions for optimality can be derived. This leads to a set of linear equations that needs to be solved to obtain the parameters *α* and *b*. The resulting LSSVM model used in function estimation is

$$y\_i = \sum\_{i=1}^{n} \alpha\_i \mathcal{K}(\mathbf{x}, \mathbf{x}\_i) + b \tag{2.3}$$

with the RBF kernel function *K*. For the model calculation the regularization parameter *γ* and the bandwidth of the RBF kernel *σ* needs to be chosen [17].

### **3 Results and discussion**

Reference data were evaluated and depicted using R (Version 3.6.1) [18] and R-Studio [19], as well as the packages ggplot2 [20] and Rmisc [21]. Spectral data analysis and modelling was performed using the spectraltoolbox framework from Fraunhofer IOSB based on Python 3.8.

#### **3.1 Reference data for spectral data processing**

The sugar (glucose, fructose) and acid (tartaric -, malic acid) contents of berries were measured with high performance liquid chromatography (HPLC) and results of hundred berries samples are depicted in Figure 3.1.

In individual berries, sugar contents ranged from 6.59 g/l to 115.58 g/l and from 11.52 g/l to 108.83 g/l for glucose and fructose, respectively, over the entire ripening process. For acids, ranges were smaller and high values were gained at the begin of the ripening with 15.45 g/l for tartaric and 21.79 g/l for malic acid, respectively. While the malic acid was partially almost completely metabolized, minimum tartaric acid content was 3.93 g/l.

**Figure 3.1:** Sugar and acid contents in the hundred berries samples, measured with HPLC, error bars correspond to standard errors, each data point represents the mean of the four parcels.

#### **3.2 NIR data analysis**

The data set contains spectral data from four plots at different locations. The validation is performed like the later use-case of the ripeness estimation of a plot at a specified date. Therefore, the plot used for validation was not included in the training and the median of all predictions is used.

The evaluation of the spectral data was done in several steps. First, spectra with a low signal strength (average intensity below 0.04) were discarded. Subsequently, the intensities of the spectral data were normalised using Standard Normal Variate (SNV) [22]:

$$\chi\_{\rm snv} = \frac{\mathbf{x} - \bar{\mathbf{x}}}{\sigma\_{\mathbf{x}}} \tag{3.1}$$

As a metric to evaluate the regression model, the RMSE and *R*<sup>2</sup> score is used. The RMSE score

$$\text{RMSE} = \sqrt{\frac{\sum\_{i=1}^{n} (\mathcal{g}\_i - y\_i)^2}{n}} \tag{3.2}$$

estimates the standard deviation of the prediction of a regression model. A distinction can be made between the RMSE of prediction and the RMSE of calibration, depending on *yi* used during calculation. In addition, the R2 score

$$R^2 = 1 - \frac{\sum\_{i=1}^{n} (e\_i^2)}{\sum\_{i=1}^{n} (y\_i - \bar{y})^2} \tag{3.3}$$

indicates how well the independent variables are suited to explain the variance of the dependent variables. In both formulas *n* is the number of observations.

Due to the lack of sample preparation, the measurement in reflection with low signal strength and the use of miniaturised low-cost sensors, there is an enhanced measurement uncertainty. Because the prediction of the entire plot at a point in time are relevant for the winemaker, stochastic errors and outliers can be reduced very well by determining the median of the single berry predictions. The quality of the prediction of individual grape berries in the training and the median of the prediction of the validation plot at each time point is shown in Table 7.2.

It can be clearly seen that the median of the single berry prediction leads to very good results. More precisely, Figure 3.2 shows the median of acid and sugar of the measured individual berries at different time points. With this model we were able to determine glucose and fructose content with 87 % accuracy (±7.59 g/l and ±6.57 g/l, respectively) and tartaric, as well as malic acid with 89 % and 78 % accuracy (±0.52 g/l and ±1.89 g/l, respectively). Compared to previous studies that measured the total soluble solids, we were able to predict two individual sugars with a high degree of accuracy [23, 24]. The two selected acids could also be predicted separatly, with a better forcast for tartaric acid, which is the prevalent acid at harvest [23, 25]. Higher precision could eventually be achieved with other models, for which more data are to be collected.

**Table 7.2:** Results of the spectral evaluations using LSSVR. Shown are root mean square errors of calibration (RMSE*C*) from the training set, the root mean square error of the median prediction (RMSE*P*) from validation set, the degree of determination (R2).


### **4 Summary**

Sugar (alcohol) and acid content of grapes have a huge impact on the sensory perception of wine. Measuring of these ingredients is a proxy for ripening of a vineyard.

Proof of concept was provided on applying a miniature sensor to measure the maturity of Riesling in vineyards with high accuracy. In view of the application in the vineyard, the prototype offers an option for monitoring the ripening in a time efficient way. It is expected that the sensor will be manageable with one hand. If an App for mobile phones is developed, the results of the measurement will become immediately apparent. The alternative to invasive measurements being laborious e.g. a handheld sensor to quantify sugars and acids noninvasively on hundreds of berries is a step forward towards digitalization and precision viticulture.

**Figure 3.2:** Median of determined (true, solid line) and estimated (predicted, dashed line) sugar and acid contents in single berries of the plot WH 04.

### **Acknowledgement**

The study is supported by funds of the Federal Ministry of Food and Agriculture (BMEL) based on a decision of the Parliament of the Federal Republic of Germany via the Federal Office for Agriculture and Food (BLE), within the frame work DigiVine (FKZ: 28DE113A18). We also thank Ann-Kathrin Ickert and Maximilian Mathes for technical support and Michael Straube from the cooperative Deutsches Weintor e.G. for arranging experimental areas from winegrowers and corresponding rating data.

#### **References**


sugars, auxins, and gibberilins in fruit of seeded and seedless varieties of vitis vinifera," *Plant Physiology*, vol. 35, no. 2, pp. 241–250, 1960.


bunches using nir spectroscopy," *Frontiers in plant science*, vol. 10, p. 1517, 2019.


# **Fine metal-rich waste stream characterization based on RGB data: Comparison between feature-based and deep learning classification methods**

Nils Kroell, Kay Johnen, Xiaozheng Chen, and Alexander Feil

Department of Anthropogenic Material Cycles, RWTH Aachen University, Wullnerstraße 2, 52062 Aachen ¨

**Abstract** *Background:* Material compositions in the recycling industry are currently determined by manual sorting, which is time intensive and shows subjective influences. For an automated, sensor-based material flow characterization a particlebased material classification is necessary. *Aim:* The classification of metal-containing fine-fractions based on RGB images with different machine learning (ML) techniques is investigated on two created datasets A (12,480 images) and B (19,498 images). *Method:* Two approaches are compared: In approach I, images are firstly pixel- and then object-based classified with six different ML models on three color spaces. In approach II, images are classified by six different convolutional neural networks (CNNs). *Results:* The classification of dataset A was possible with high accuracy (> 99.8 %) for both approaches and chosen ML algorithms were of minor importance. For dataset B, approach I achieved an accuracy of 78.2 % ± 2.0 %, and chosen ML algorithms were of higher importance for object-based classification. In approach II, the best-performing CNN achieved an accuracy of 80.4 % ± 4.2 % and a top-3 score of 94.2 % ± 2.6 %. *Conclusion:* Results from existing studies for coarser particle sizes can be transferred to fine fractions. Further research is needed to improve the classification of dataset B, e. g. by adding instances to less frequent classes and applying deeper CNNs.

**Keywords** Sensor-based material flow characterization, fine fractions, recycling, metals, RGB, classification, machine learning, deep learning, convolutional neural networks.

### **1 Introduction**

In the European Union, around 15.7 million Mg of non-ferrous (NF) metal-containing waste was generated in 2016 [1]. Recycling of valuable materials inside substitutes primary resources and therefore, achieves environmental benefits [2]. For example, recycling of aluminium saves 95 % of the used energy and reduces greenhouse gas emissions by 92 % compared to primary production [3].

Currently, about one-third of all NF metals produced originate from secondary raw material sources [4]. With rising climate change [5], decreasing valuable material contents of primary resources [6, p. 47] and an increasing demand for raw materials from emerging and developing countries [7], a transition towards a circular economy and thus an increased exploitation of secondary raw materials is necessary [8].

During the processing of metal scrap, metal-containing fine fractions (≤ 20 mm) are generated as a by-product [9, p. 263f], which can contain up to 30 wt% metals [10] but are insufficiently recovered to date [10]. For an optimal recovery of the contained metals as well as a quality assurance of generated product streams, the material composition needs to be known. Currently, samples are analysed by manual sorting [11, p. 51], which is time and cost intensive and often shows subjective influences from the manual sorter [12]. As a result, material flows are often characterized on an irregular basis and sample amounts are statistically limited.

A sensor-based material characterization (SBMC) offers the possibility to overcome these limitations, as it minimizes subjective influences, and material flows can be monitored in nearly real-time. The first step of such an SBMC is the classification of individual particles to given material classes. In a second step, the predicted material classes can be combined with individual particle masses in order to derive massbased material compositions for e. g. process control or automated quality assurance.

NF metals can be classified by different sensors, e. g. X-ray transmission or fluorescence, laser induced breakdown spectroscopy, induction (IND) or color sensors (RGB as well as hyperspectral imaging (HSI)) [13]. In comparison to other available sensors, RGB sensors are available in high resolution, technically matured and often by one order of magnitude or more cost-effective and thus focus of this study.

# **2 Related Work**

While sensor-based sorting of metals is state of the art [13], the SBMC of metal-containing waste streams is an active area of research. For coarse particle sizes (> 20 mm), the classification based on RGB images has been researched in several studies:


Studies on the classification of metal-containing fine fractions (≤ 20 mm) until now have focused on the application of hyperspectral imaging or a combination of multiple sensors, see Table 8.1. The classification of these fine fractions only based on RGB images has not been investigated yet, which is the research gap addressed here.


**Table 8.1:** Existing studies on optical classification of metal-containing fine fractions; 3DLT: 3D laser triangulation, \* sorting rate, \*\* product purity.

### **3 Material and Methods**

Two machine learning (ML) approaches (I: "feature-based classification" and II: "deep learning") are compared. For training and evaluation of the models, two datasets A and B were created.

#### **3.1 Data Acquisition**

Particle images were taken with a custom made test rig on a white paper background by a high-resolution RGB-camera (IDS U3-3800CP-C-HQ Rev.2) with a color depth of 12-bit and a resulting spatial resolution of 0.11 mm/Pixel. The recording area of 100 mm × 100 mm was illuminated from top by white LED-stripes (Samsung 2835 120 LED 12V IP20 4000K). For each particle, a front and back image was recorded.

#### **3.2 Samples and Datasets**

Dataset A contains standard components made of copper (Cu), grey metals (GM), brass (Msg) and black plastics (KS). Characteristic particle shapes and sizes are generated by comminuting the particles in a hammer mill and sieving them into the investigated particle size range of 3.15 mm to 10 mm (Δ*t*sieving = 90 s). In total, there are *n* = 12,480 images in the dataset (*n*Cu = 4,016; *n*GM = 2,120; *n*Msg = 3,056; *n*KS = 3,288), see Figure 3.1. After acquisition, the dataset was split in a train and test dataset with a train-test-split-ratio of 80 % : 20 %.

**Figure 3.1:** Exemplary images (cropped) for the four material classes of dataset A.

For dataset B, samples in the particle range 3 mm to 9 mm were collected from a metal-recovery plant in Germany according to LAGA PN 98 [24]. The samples were manually sorted in 22 material classes with a two stage process (see [12] for details). The final dataset contains 19,498 images, see Table 8.2 and Figure 3.2, and was split in a train and test dataset with a train-test-split-ratio of 75 % : 25 % to guarantee enough instances for less frequent classes in the test set.


**Table 8.2:** Number of images *n* in dataset B. +: Composite of the mentioned materials.

**Figure 3.2:** Exemplary images (cropped) for the 22 material classes of dataset B.

#### **3.3 Preprocessing**

The captured images were then preprocessed to improve the color representation. Preprocessing algorithms were implemented in *Python 3.7* in a *scikit-image* v0.17.1 [25] framework. Firstly, a Gaussian blur filter with a standard deviation of *σ* = 0.35 was applied to reduce small artefacts from demosaicing during image acquisition. Secondly, a white balance based on a white reference image was applied to guarantee a neutral color representation. Thirdly, an spatial illumination correction based on the reference image ensured an uniform illumination.

To extract individual pixels for color classification, each captured image was segmented into background, shadows and particle trough thresholding the gray scale image and improving the segmentation result by a combination of morphological operations. For the color classification, 1,000 pixels per particle as well as 1,000 background pixels (850 pixels of white background and 150 shadows pixels) were randomly selected from each segmented image.

For the image classification with CNNs, quadratic image sections of the particle were extracted (cf. Figure 3.1 and 3.2). A minimum padding of 25 % of the respective particle bounding box size was ensured between particle and the border of the image section. All extracted image sections were rescaled to a size of 64 Pixel ×64 Pixel.

#### **3.4 Approach I: Feature-Based Classification**

The color classification was tested for three color spaces (RGB, HSV and L\*a\*b\*). For dataset A, each color class represents a material class. For dataset B, the color classes BG, Cu, GM, Msg, KS, Glas, KS and Rest ('pure fractions') were defined, as the color of composite material classes (e. g. Cu + Msg) overlapped strongly with the pure fractions. Besides, characteristic colors of PCBA and KaAD (cables and wires) were extracted with a tomek link technique [26, p. 57]. All color classifiers were trained on 4,000 pixels per material class and 8,000 background pixels to avoid overfitting.

For object-based classification, shares of each classified color class as well as 50 dimensionless shape factors were considered, which were generated by normalizing shape measurements by the particle area *A* or its square root (√*A*). Shape measurements were extracted from the false-color image by a self-developed Python module named *eidos*1.

For both pixel- and object-based classification, six ML classification algorithms, implemented in *scikit-learn* [27], were compared: *k* nearest

<sup>1</sup>https://github.com/nilskroell/eidos

neighbor (kNN), decision trees (DT), random forests (RF), AdaBoost (AB), SVM, neural networks (NN). For each model and classification task, a systematic grid search with a *k*-fold-cross-validation (*k* = 5) on the training dataset was conducted to find the optimal hyperparameter settings. All hyperoptimized models were evaluated on the test dataset.

#### **3.5 Approach II: Deep Learning**

For image classification, six different CNN architectures (Figure 3.3) were compared, implemented in *Keras* [28]. All hidden layers were equipped with a ReLU activation function and the final network output was normalized by a softmax function. During training, an Adam [29] optimizer with a learning rate of *η* = 10−<sup>3</sup> was used. Dataset A and B were trained for 15 and 20 epochs respectively with a mini batch size of 256 instances per iteration. L2-norm regularization (*λ* = 10−3) as well as dropout layers with an dropout rate of 0.5 were applied to avoid overfitting, and the classification performance was evaluated by a categorical cross entropy cost function. To reduce the imbalance, downsampling was applied to material classes with more than 2,000 instances. All trained CNNs were evaluated on the test dataset.

**Figure 3.3:** Investigated CNN architectures A-F consisting of input (64 × 64), fully connected (FC-�*n*�), 3 × 3 convolutional (conv3-�*n*�), 2 × 2 max pooling, dropout (0.5) and softmax activation layers; *n*: number of neurons/feature maps.

### **4 Results and Discussion**

#### **4.1 Approach I: Feature-Based Classification**

The color classification of dataset A was possible with similar accuracies of ≥ 79.8 % for all models and colorspaces, see Table 8.3. On average, the L\*a\*b\* color space with 86.1 % and a SVM model with 87.0 % achieved the best classification results. Because of the superior color classification, an object-based classification with 100 % accuracy could be achieved, e. g. by a plain kNN model, for dataset A.

**Table 8.3:** Accuracy for color classification of dataset A after hyperparameter optimization; 95 % confidence interval for all accuracies: ± 0.08 % to ± 0.18 %.


Based on the results above, the L\*a\*b\* color space was chosen for the color classification of dataset B (Table 8.4). Here, the classification was more challenging with achieved test scores between 47.2 % (DT) and 51.1 % (kNN, SVM). As for dataset A, the chosen classification algorithm was of minor importance. Significant differences were found in the prediction time: The fasted model (DT) was with 13 ms/MPixel 714 times faster than the slowest one (AB; 9,278 ms/MPixel). Overall, a NN model showed the best combination of prediction time and accuracy and was thus the basis for subsequent object-based classification.

Test scores of the object-based classification of dataset B, based on extracted false-color shares and ten shape factors with highest feature importance (determined by *χ*2-test on training data) are shown in Table 8.5. Here, the chosen algorithm was of higher importance with achieved accuracies ranging from 66.5 % (kNN) and 78.2 % (RF).

A closer look on the best-performing model (RF) in Figure 4.1a shows that the predictions differ significantly between different material classes. As expected, material classes with a higher number of instances can be significantly better classified than less frequent classes.

**Table 8.4:** Test scores and prediction time (Δ*t*pred) for color classification of dataset B after hyperparameter optimization in L\*a\*b\* color space; CI: confidence interval.


**Table 8.5:** Test scores for object-based classification of dataset B; CI: confidence interval.


On average, pure fractions with an accuracy of 62.5 % could be far better classified than composite fractions (16.7 %). Providing different numbers of shape factors to the model (Figure 4.1b) revealed that the shape factors contributed little to the classification performance, which is primarily achieved by the extracted false color components.

#### **4.2 Approach II: Deep Learning**

As for the feature-based approach, the classification of dataset A with CNNs was possible with high accuracy (> 98.8 %), see Table 8.6. Even simple CNN architectures showed a high accuracy.

**Table 8.6:** Test scores of the CNN architectures A-F for dataset A; CI: confidence interval.


**Figure 4.1:** Object-based classification of the trained RF model on dataset B. **(a)** Accuracy by material class (circle area ∼ #instances), **(b)** influence of shape factors.

In contrast, training CNNs on dataset B turned out to be more challenging. For all six CNN architectures, the error rate did not converge after the initial planned 20 training epochs (see Figure 4.2), resulting in accuracies of ≤ 33.5 % when training all 22 material classes.

To optimize the classification result the architecture and training of CNN F was further adapted. Firstly, the number of neurons in the fully connected layer was increased to 1,024, which boosted the resulting accuracy to 48.9 % ± 3.6 %. Secondly, the number of training epochs was increased from 20 to 400. As shown in Figure 4.2a, the optimized CNN F\* converged after about 200 training epochs and achieved an overall classification accuracy of 80.4 % ± 4.2 %. Furthermore, a strong correlation between the number of images in the training set and the classification performance (F1-score) per material class could be confirmed, see Figure 4.2b.

The activations in the six convolutional layers of CNN F\* (Figure 4.4) show how CNN F\* is able to abstract the particle characteristics with each layer. When considering the given probabilities for each prediction, CNN F\* achieved a top-3 score of 94.2 % ± 2.6 %, i. e. for 94.2 % of the predictions the right material class was in the three material classes with the highest probability.

**Figure 4.2:** Training of CNN A-F for the classification of dataset B **(a)** with four selected material classes (Cu, GM, Msg, KS) and **(b)** all 22 material classes.

**Figure 4.3: (a)** Training of CNN F\* with dataset B over 400 epochs, **(b)** correlation between number of instances per material class and F1-score for dataset B.

#### **5 Conclusions and Outlook**

In this research, the classification of two datasets A (12,480 images in four material classes) and B (19,498 images in 22 material classes) with a feature-based and deep learning approach was investigated. Our results show that a nearly perfect (> 99.8 %) classification for all four material classes of dataset A could be achieved with both approaches, as particles of the different material classes can be clearly differentiated by their color in the RGB images. This indicates that classification results from coarse fraction can in general be transferred to fine fractions. The achieved classification results are better than existing studies on coarse fractions, most likely due to missing influences of different metal alloys or post-consumer waste characteristics, as comminuted standard components were used to create dataset A.

**Figure 4.4:** Selected convolutional feature maps of CNN F\* when classifying dataset B.

In contrast, the classification of dataset B turned out to be quite challenging. Our best feature-based approach consisting of a pixel-based color classification with a neural network in the L\*a\*b\* color space and a subsequent object-based classification with a random forest, achieved a classification accuracy of 78.2 % ± 2.0 %. A slightly better classification result was achieved by the deep learning approach with a 14 layer CNN with an accuracy of 80.4 % ± 4.2 % and an top-3 score of 94.2 % ± 2.6 %. To improve the classification of dataset B in the future we see potential for future research in the following points:


3. **Application of multi sensors:** While this study has focused on RGB images, new distinctive features for classification might be obtained by utilizing multiple sensors, e. g. RGB and induction or X-ray fluorescence/transmission.

### **References**


Hyperspectral Data," *IEEE Transactions on Industrial Informatics*, vol. 5, no. 4, pp. 483–494, 2009.


# **Improvement of Thermal Fringe Projection for Fast and Accurate 3D Shape Measurement of Transparent Objects**

Martin Landmann1,2, Henri Speck1, Jan Till Schmieder1,2, Stefan Heist1, and Gunther Notni1,3

<sup>1</sup> Fraunhofer Institute for Applied Optics and Precision Engineering IOF, Albert-Einstein-Str. 7, 07745 Jena, German <sup>2</sup> Friedrich Schiller University, Institute of Applied Physics, Abbe Center of Photonics, Albert-Einstein-Str. 6, 07745 Jena, Germany <sup>3</sup> Ilmenau University of Technology, Department of Mechanical Engineering, Gustav-Kirchhoff-Platz 2, 98693 Ilmenau, Germany

**Abstract** Structured light sensors for three-dimensional (3D) shape measurements working in the visible up to the shortwave infrared range have been intensively investigated in the last decades. Reliable measurements require diffuse reflection of the projected patterns. However, this condition is strongly limited or not fulfilled for transparent, translucent, shiny, or absorbing materials. Based on the scanning from heating approach, we have developed a two-step optical method for the measurement of such uncooperative objects. In this contribution, we present a simplified and robust projection technique of a focused single thermal fringe. Due to the irradiation of only locally strongly restricted areas, we obtain considerably higher intensities which enables us to reduce the total measurment time to or even below the second range while increasing the measurement accuracy. We show measurement results of non-opaque surfaces of objects made of a single material as well as of metal objects. Our customizable setup is of interest for quality assurance, bin picking, digitization, and many other areas of applications.

**Keywords** 3D shape measurement, thermal fringe projection, surface, infrared, transparent

### **1 Introduction**

Structured light sensors for 3D shape measurements working in the visible up to the short-wave infrared range have become more and more important in recent years. Diffuse reflection of the projected patterns is required to obtain reliable results. However, this condition is strongly limited or not fulfilled for transparent, translucent, shiny, or absorbing materials. There have been several approaches [1–3] to measure such uncooperative objects. Based on the scanning from heating approach proposed by Pelletier et al. [4], Eren et al. [5], and Meriaudeau et al. [6], we have developed a two-step method [7–11].

First, multi-fringe thermal patterns are projected onto the measurement object. The object absorbs the energy resulting in a temperature on the surface. In a second step, the stereo recording of the thermal radiation, which is re-emitted by the object surface, takes place. To generate a thermal pattern onto the (object) surface, a metal mask with vertical slits and strips working is used. The mask works like a slide. Basically, this setup can be used to measure the shape of, e.g, transparent glasses. However, both the measurement time, ranging from a few tens of seconds to minutes, and the measurement accuracy are improvable. A big disadvantage is that at least 50 % of the incoming intensity is blocked. The envelope irradiance is Gaussian distributed with the consequence that at the horizontal edges of the measurement field, the irradiance drops down to only 28 % of the value in the center.

In this contribution, we present a simplified and robust projection approach of a focused single thermal fringe which is quickly scanned over the object surface. The temperature contrast between adjacent dark (cold) and bright (warm) infrared image areas is the key quantity for the thermal 3D measurement quality. The contrast increase during the irradiation period is counteracted by the thermal diffusion within the object material. Due to the irradiation of only locally strongly restricted areas, we obtain considerably higher irrandiances than with multi-fringe projection. This enables us to reduce the irradiation period to only a few milliseconds. Within such short irradiation periods, the effect of thermal diffusion is drastically reduced. Simultaneously to the projection, we record the stereo thermal image. In order to measure the entire surface of the object, a galvanometer scanner sequentially shifts the laser sheet to specified positions at which the irradiation and recording process is repeated. Although we have a higher number of sequential projections, the total measurement time is substantially reduced to or even below the seconds range while increasing the measurement accuracy. We show measurement results of a transparent glass as well as a metal objects. Our customizable design is attractive for quality inspection, bin picking, digitization and many other applications.

### **2 Setup of MWIR 3D system based on sequential thermal fringe projection**

Instead of projecting several two-dimensional patterns as in the multifringe projection technique ( [10, 11]), we now scan the object surface with a single thermal fringe. The irradiance of this fringe is by factors higher than in our previous setup. To obtain a sufficiently high contrast, the irradiation period can be drastically reduced. Within such short irradiation periods, the effect of thermal diffusion is significantly reduced. A schematic setup of our MWIR 3D system based on the projection of sequential thermal fringes is shown in Fig. 2.1.

**Figure 2.1:** Schematic setup of MWIR 3D system based on projection of sequential thermal fringes consisting of projection unit (CO2 laser, gold mirror, galvanometer scanner, and horizontal ZnSe cylinder lens 1 and vertical ZnSe cylinder lens 2), measurement object, and two MWIR cameras. The projection unit irradiates the measurement object with a single fringe which is sequentially deflected by the galvanometer scanner.

The setup shows two cylindrical ZnSe lenses to transform the CO2 laser beam into a thin laser line sheet. The convex lens 1 is focusing the beam in horizontal direction while the concave lens 2 is used to irradiate the object in vertical direction. We can adjust the fringe width on the object surface by shifting lens 1 between the gold mirror and the galvanometer scanner or by replacing it with another lens of another focal length. Equivalently, we can move or replace lens 2 to adjust the height of the illuminated measurement field. By changing the beam width or height, the irradiance increases or decreases by the same factor.

The galvanometer scanner enables the sequential horizontal scan of the entire measurement field with the single fringe. The scanning range is flexible and can be adjusted to match the horizontal width of the object. The projected laser line is absorbed by the object and re-emits thermal radiation according the temperature distribution which we record with two MWIR cameras.

The measurement procedure shown schematically in Fig. 2.2 is similar to the one for the multi-fringe projection setup [12]. The main difference is the significant reduction of the irradiation period. The measurement starts with the first irradiation of a single fringe which lasts for the irradiation period *t*irr in the milliseconds range. At the same time, both cameras synchronously start to capture the diffused and re-emitted thermal pattern from the object surface. The recording period *t*rec is the sum of the camera integration time *t*int and the camera read out time *t*ro. Immediately after the end of the irradiation period, the galvanometer scanner moves the fringe to the next position (*t*trans). We set the irradiation period *t*irr to the maximum (camera recording period *t*rec minus translation time *t*trans). For the total measurement time *t*meas = *Nt*rec, the camera frame rate is the limiting factor.

**Figure 2.2:** Representation of the measurement procedure with sequence length *N*, irradiation period *t*irr, translation time *t*trans, and recording period *t*rec as the sum of camera integration time *t*int and the camera read out time *t*ro.

### **3 Experimental investigations**

After extensive simulation tests with promising results, we designed a laboratory setup according to Fig. 2.1 for the new approach of sequential fringe projection. The distance between the sensor system and the measurement field of 160 <sup>×</sup> 128 mm2 is about 450 mm. The desired vertical width is *w*ver = 120 mm.

With our MWIR 3D system based on sequential fringe projection, we analyzed the 3D result quality dependent on the sequence length *N* and fringe width *w*hor. The following results were measured by a recording time of *t*rec = 8 ms. The translation per step took *t*trans = 0.5 ms and the irradiation period was *t*irr = 7.5 ms. The measurements were conducted on a parallel-sided borosilicate glass plate with a thickness of 3 mm. The determined 3D standard deviation (standard deviation of the 3D points to a fitted plane) refers to an area of 40 <sup>×</sup> 40 mm2.

**Figure 3.1:** 3D point standard deviation *σ*3D for sequential fringe projection as a function of the horizontal fringe width *w*hor (1.3 mm. . . 6.2 m) for different sequence lengths *N* (120. . . 480) and an integration time of *t*int = 7.8 ms.

Figure 3.1 illustrates the 3D standard deviation as a function of the horizontal fringe width *w*hor and the sequence length *N*. The measurements were taken for five different fringe widths (from 1.3 mm to 6.2 mm) for ten different sequence lengths (120. . . 480). These different horizontal fringe widths were created by changing the position of the horizontal focus lens or replacing them by another one. With a recording time *t*rec = 8 ms, 120 sequentially projected sequences correspond to a measuring time of 0.96 s. The lowest standard deviation was achieved with the smallest fringe width of 1.3 mm. By increasing the number of fringes per measurement, the measured object is scanned more precisely and more images are used for the 3D result, which results in a smaller standard deviation but in an increased measurement time.

**Figure 3.2:** 3D measurement example: (a) photograph of a honey jar, (b) left MWIR camera image and temperature profile curve, (c) 3D result for N = 125 and *t*meas = 1 s, (d) 3D result for N = 500 and *t*meas = 4 s.

### **4 3D measurement example**

We measured a honey jar with a plastic cap with *t*rec = 8 ms (*t*irr = 7.5 ms, *t*int = 7.8 ms), *w*hor = 1.3 mm, and *N* = 10 . . . 500. Figure 3.2 shows the measurement example with a photograph, a typical MWIR camera image as well as an exemplary 3D results for N = 125 and N = 500. It is already possible to obtain reasonable 3D results with measurement times of just one second (Fig. 3.2(c)). By increasing the sequence length and the measuring time from 1 s to 4 s, the 3D result can be improved considerably (Fig. 3.2(d)). Depending on the application, the setup can optionally measure very accurately (measurement accuracy about 10 μm) or very quickly (3D acquisition in less than 1 s). In the case of bin picking, short measurement times are required, the contour of the objects must be determined, and details are mostly unimportant. In contrast, in quality assurance, a high measurement accuracy is required. Moreover, objects with metal surfaces (e.g., metal cup, see Fig. 4.1) could also be measured. The previous measurement system based on multi-fringe projection was not able to provide reliable measurement results (Fig.4.1(c)) due to three disadvantages that occur with metallic materials: (1) high reflectivity of projected irradiance, (2) high thermal conductivity, and (3) low emissivity in MWIR.

**Figure 4.1:** 3D measurement example metal cup: (a) left MWIR camera image and temperature profile curve for multi-fringe projection, (b) left camera MWIR image and temperature profile curve for sequential fringe projection, (c) 3D result for *t*meas = 42 s for multi-fringe projection, (d) 3D result for *t*meas = 4 s for sequential fringe projection.

### **5 Improvement of thermal fringe projection**

In additions to the improvements due to higher irradiances, the sequential fringe projection with a galvanometer scanner has several advantages over the multi-fringe technique:


### **6 Conclusion**

In this contribution, we presented the new sequential fringe projection system based on the scanning from heating approach. Instead of projecting several two-dimensional patterns, we propose to scan with a single infrared line across the object. After the general system design, we described the measurement procedure. For this new projection system, we only need few components, such as a radiation source for the generation of the heat fringe, some gold mirrors and lenses for shaping and redirecting the laser beam, and a fast turning mirror for scanning over the object. With the resulting thin fringes, we can drastically reduce the irradiation period to a few milliseconds. The previous system based on multi-fringe technique has measuring times ranging from a few tens of seconds to minutes. We presented different parameters and their influences to the measurement quality. An increase in measurement quality can be achieved both by increasing the number of projected fringes and by using very thin fringes.

We also demonstrated the advantages of the sequential fringe technique by objects with high thermal diffusion. The previous measurement system based on multiple fringes was not able to provide reliable measurement results of a metal cup. With the sequential fringe technique, we are able to measure such objects in significantly shorter time. The experimental studies showed promising results, as not only the measurement accuracy (about 10 μm) is increased, but also the measurement time (less than 1 s) is significantly reduced. With these improvements to our system, this measurement technology based on scanning from heating can be applied, e.g., in industrial applications such as quality assurance or bin picking.

For faster measurements, the camera recording time is the limiting factor. In further investigations, the use of high-speed infrared cameras can exceed this limit. Integrating a rotary table into the setup will enable us to perform 360◦ measurements.

### **Funding**

This research project (2018 FGR 0098, project acronym "3DWARME") ¨ is supported by the Free State of Thuringia, the European Social Fund (ESF) of the European Union, and the Thuringer Aufbaubank (TAB). ¨

### **References**


*Express*, vol. 17, no. 14, pp. 11 457–11 468, 2009. [Online]. Available: https://doi.org/10.1364/OE.17.011457


# **Measurement of the coefficient of linear thermal expansion based on subjective laser speckle patterns**

Alexander Spaett<sup>1</sup> and Bernhard G. Zagar<sup>1</sup>

Johannes Kepler University Linz, Institute for Measurement Technology, Altenberger Straße 69, A-4040 Linz

**Abstract** We present an improved processing of laser speckle intensity signals aimed at attaining spatial displacement resolutions well in the sub-pixel domain, which is necessary for measuring the minute displacements caused by thermal expansion of metallic specimens. The signal processing revolves around the cross power spectral density and the coherence function of two discrete signals, which allows for a continuously varying displacement result as opposed to a discretized, if determined in the spatial domain via the application of the cross-correlation function. The presented technique is then applied to estimate the coefficient of linear thermal expansion (CLTE) of a brass sample.

**Keywords** Coefficient of linear thermal expansion, subjective laser speckles, strain measurement, displacement estimator, unbiased minimum variance estimator

### **1 Introduction**

Laser speckle patterns are caused by the interference of coherent light reflected off an optically rough surface. When it comes to speckle imaging, one differentiates between subjective and objective laser speckle patterns. In the former an optical setup is used and the speckle pattern in the image plane is evaluated. The latter describes the speckle pattern in the object plane that is captured directly by a camera [1]. Successively captured one-dimensional speckle pattern can be seen in Fig. 4.1.

Although laser speckles are often an unwanted source of noise they can also be utilized in different applications, since they represent something similar to a fingerprint of the illuminated surface. For instance, the measurement of the shift of the imaged laser speckle pattern may be used to determine the surface strain of a sample. Knowledge about the surface strain, in turn, allows to deduce the coefficient of linear thermal expansion (CLTE).

The presented contactless method provides the possibility to measure mechanical properties of samples for which common approaches, like strain gauges, are not applicable. As an example one could name the measurement of strain of materials which are small in at least one dimension, e.g. natural fibres.

#### **2 Speckle shift in the image and its relation to the CLTE**

In order to give a better understanding of the digital signal processing applied, a theoretical model linking the speckle shift in the image with the CLTE of the sample will be reviewed. However, the following considerations will be limited to telecentric imaging systems. A broader treatment can be found in [2].

When a telecentric imaging system is used to observe the speckles, the speckle shift captured by a camera is given by the following equation first derived by Yamaguchi in 1981 [3]

$$\begin{split} A\_{\text{Y}} &= \overbrace{\frac{a\_{\text{x}}}{M} \frac{\Delta L}{L\_{\text{S}}} l\_{\text{Sx}} l\_{\text{Sy}}}^{A\_{\text{I}}} \cdot \overbrace{\frac{a\_{\text{y}}}{M} \left[1 + \frac{\Delta L}{L\_{\text{S}}} (1 - l\_{\text{Sy}}^{2})\right]}^{A\_{\text{II}}} \cdot \overbrace{\frac{a\_{\text{z}}}{M} \frac{\Delta L}{L\_{\text{S}}} l\_{\text{Sx}} l\_{\text{Sz}}}^{A\_{\text{III}}} \\ &+ \underbrace{\frac{\Delta L}{M} \cdot \left(\varepsilon\_{\text{xy}} l\_{\text{Sx}} + \varepsilon\_{\text{yy}} l\_{\text{Sy}} - \mathfrak{g}(\Omega)\right)}\_{A\_{\text{IV}}} + E. \end{split} \tag{2.1}$$

*A*<sup>y</sup> describes the resulting shift of the laser speckle pattern in the image plane in *y*I-direction and *ε* is the desired strain tensor of the sample. As can be seen, both the components *ε*xy and *ε*yy of the strain tensor can - in theory - be observed directly. Instead, in order to avoid multiple sources contributing to *A*y, the shift *a*<sup>y</sup> of the diffracting surface will be estimated. The strain in *y*-direction *ε*yy can then be deduced from this quantity. The general setup leading to Eqn. 2.1 is depicted in Fig. 2.1.

**Figure 2.1:** General geometrical setup for modelling the pattern shift in the image plane.

The sample lying in the *x*-*y*-plane on the left is illuminated by a coherent light source. The distance *L*<sup>S</sup> then describes the distance between the ideal point source and the illuminated spot on the surface and therefore the radius of the curvature of the laser beam. The vector pointing from the source to the spot on the surface can be described by *L*<sup>S</sup> = *L*<sup>S</sup> · *l*<sup>S</sup> with*l*<sup>S</sup> = *<sup>l</sup>*Sx *<sup>l</sup>*Sy *<sup>l</sup>*Sz*<sup>T</sup>* denoting the unit vector, representing the direction of the laser beam. The demagnification factor *M*, given by the optics used, as well as the distance between the focal plane of the telecentric optics and the sample Δ*L* influence the speckle shift in the image plane.

Tilting the sample also translates to a shift in the image. This is stated by the function *g* of the rotation vector Ω. The additional error term *E* summarizes non ideal or not sufficiently well known dimensional parameters, like temperature dependant expansion of the sensor and overall build-up, causing speckle pattern shifts not distinguishable from a true strain contribution [4].

When the strain estimation is solely based on *a*y, it becomes evident that any term in Eqn. 2.1 but *A*II is negatively influencing the accuracy of the estimate and hence should be minimized. By placing the sample near or ideally in the focus, both Δ*L* and <sup>Δ</sup>*<sup>L</sup> <sup>L</sup>*<sup>s</sup> become small, resulting in the contributions of *A*I, *A*III as well as *A*IV to be negligible. The impact of *A*<sup>I</sup> and partially *A*IV can be reduced further by choosing the position of the coherent light source, which in our case is a laser diode (LD) with collimating optics, carefully. When the LD is placed at the same height, meaning having the same *y*-component, as the spot on the surface, then for the parameter *l*Sy it follows *l*Sy = 0.

Having taken all measures mentioned above, Eqn. 2.1 reduces to

$$A\_\mathbf{y} = -\frac{a\_\mathbf{y}}{M} + E.\tag{2.2}$$

For the calculation of the strain and the CLTE two vertically aligned laser spots are utilized. When *A*y,Upper is the shift that the upper spot experiences and *A*y,Lower the one of the lower spot, then the engineering strain can be described as

$$\varepsilon = \frac{-M \cdot (A\_{\text{y,Upper}} - A\_{\text{y,Lower}})}{d\_{\text{ul}}} \, \tag{2.3}$$

where *d*ul is the initial distance between the two spots. Here, the error term *E* has been neglected.

The instantaneous CLTE is defined as

$$\alpha\_{\rm L} = \frac{1}{l\_{\rm sample}} \frac{\rm dl\_{\rm sample}}{\rm dT} \,, \tag{2.4}$$

where *T* is the temperature in K and *l*sample denoting the initial length of the sample in m [5]. Since dealing with homogenous materials which are unconstrained and have zero stress components, the thermal strain is the only strain to be observed. Therefore, we can simply rewrite the equation above in order to get the following relationship between the estimated strain and the CLTE

$$
\mu\_{\rm L} = \frac{\rm d\varepsilon}{\rm d\overline{T}}.\tag{2.5}
$$

Applying the results from equation 2.3 it follows that the CLTE is given by

$$\alpha\_{\rm L} = \frac{\mathbf{d} \left( \frac{-M \cdot (A\_{\rm y, LPper} - A\_{\rm y,Lower})}{d\_{\rm ul}} \right)}{\mathbf{d}T} \,. \tag{2.6}$$

#### **3 Measurement setup**

The measurement setup, depicted in Fig. 3.1, consists of four components which correspond to the ones presented in the general setup in Fig. 2.1. Two LDs with maximum power outputs of 5 mW serve as coherent light sources. Each diode illuminates a small circular region on the sample. One of these regions is located in the upper and one in the lower half of the sample. The upper LD has a center wavelength of 657 nm whereas the center wavelength of the lower one was determined to be at 655 nm. The measurement for the center wavelength has been carried out with the help of a Hamamatsu TM-CCD series mini-spectrometer which has a wavelength resolution of 1 nm.

The two illuminated spots are imaged through a telecentric optical system with demagnification *M* = 1.000. Therefore, according to equation 2.3, every shift of the illuminated surface directly translates to the same shift in the image plane although in opposing direction. The telecentric imaging system also ensures that the minimum speckle size in the image plane is large enough to not cause spatial aliasing effects. The minimum speckle size is defined by the smallest optical aperture and the laser wavelength.

The camera is a 3000 pixel line scan CCD camera with a pixel pitch, the distance between the center of two of adjacent pixel, of 7 μm. The CCD is driven directly by a Linux system running the preempt rt patch, enabling real-time functionalities, for Ubuntu 20.04 in order to minimize variations of the exposure time which can also lead to falsely detected shifts of the patterns. Variations of Δ*t*exposure < 1 μs have been achieved. This limit is only exceeded in very few occasions.

The sample, depicted on the right side of Fig. 3.1, is mounted on a 2-dimensional axis so that positioning of the surface in the focus of the optics can be achieved. In order to measure the samples temperature a Pt-1000, placed in a bore in the sample, is utilized.

**Figure 3.1:** Measurement setup using telecentric imaging.

### **4 Signal processing and results**

In this section the signal processing applied in order to estimate the CLTE will be discussed.

In Section 2 it becomes evident, that for the CLTE estimation high resolution, in the range of 100 nm, of the speckle pattern shift is crucial. Hence, the main goal of the applied signal processing techniques is to determine the pattern shift as accurately as possible. The routine described below is executed for both spots independently. Therefore, a region of interest (ROI) for each spot is chosen. The ROI for the upper spot, which is a subset of one line captured by the CCD, is depicted in Fig. 4.1.

In order to accurately estimate the pattern shift, the cross power spectral density (CPSD) *S*xy(*f*) as well as the coherence function *ρ*(*f*) are utilized. According to the Fourier shift theorem, the argument of the complex valued CPSD encodes the space-lag of the image. The CPSD can be calculated by the following equation

$$S\_{\rm xy}(f) = E\left\{ \mathcal{F}\left\{ \mathbf{x}(m) \right\} \mathcal{F}\left\{ \mathbf{x}(m+i) \right\}^\* \right\},\tag{4.1}$$

with the expected value *E* {··· } and F {··· } denoting the (fast) Fourier transform (FFT) of two different lines *x*(*m*) and *x*(*m* + *i*), captured by the CCD camera at two different points in time *m* and *m* +*i* [6]. In order to minimize the variance of the estimate of the CPSD, Welch's method is applied. This method includes averaging over multiple periodograms of potentially overlapping sub-signals. In Fig. 4.1 this concept of splitting the signals into *i* sub-signals of length *N*FFT and *N*OL overlapping pixels is shown. However, there is a trade-off to be made. The more sub-signals *i* are used, the lower the variance of the resulting coefficients is. In turn, since the signal only consists of a finite amount of sample points, the number of frequencies that are calculated are reduced, too [7].

It follows, that the spatial shift *A*<sup>y</sup> of either of the two spots based on the phase *θ* = arg *S*xy of the CPSD is given by

$$A\_{\rm Y} = d\_{\rm pixel} \cdot \frac{\theta N\_{\rm FFT}}{2\pi fk}.\tag{4.2}$$

This can also be interpreted as a spatial group-delay, which can be assumed to be constant for the application to shifts in speckle patterns. Therefore, higher spatial frequencies also experience a larger phasedelay. For the evaluation of the space-lag based on the *k*−th spectral line, the pixel pitch *d*pixel and the number of points used for the calculation of the FFT *N*FFT are needed.

In order to improve the estimation, weighting of the spectral lines of the CPSD is introduced. Each spectral line of the CPSD is weighted with its corresponding squared coherence value 0 ≤ |*ρ*(*fk*)| <sup>2</sup> <sup>≤</sup> 1. The coherence function between two signals can be calculated by the relationship

$$\rho(f) = \frac{\mathbb{S}\_{\text{xy}}(f)}{\sqrt{\mathbb{S}\_{\text{xx}}(f) \cdot \mathbb{S}\_{\text{yy}}(f)}} \, \tag{4.3}$$

where *S*xx(*f*) and *S*yy(*f*) are the power spectral densities for line *x*(*m*) and *x*(*m* + *i*), respectively. The coherence function is a measure of the degree to which both lines in question are related by a linear timeinvariant transform [6]. Especially in the case at hand, one could alternatively view a spectral line with low coherence as having less cause and effect but more noise to determine the actual phase value. Hence, additionally to weighting, a spectral line with a coherence value undercutting a predefined minimum, here *ρ*min = 0.99, is not being considered further for the calculation of the shift.

Further improvements can be made, if for the calculation of the shift *A*y(*m*, *m* + 1) between line *m* and line *m* + 1 not only these two lines but all lines *m* − *j*, *m* − *j* + 1, ··· *m*, ··· *m* + *j* − 1, *m* + *j* are considered. In Fig. 4.1 lines *m* − 2, ··· , *m* + 3 are depicted. Since the time between capturing each of these lines is comparably small it is assumed that the associated shift between the lines is (i) linear as a function of time and (ii) sufficiently small. The pattern shifts *A*y(*m*, *m* − *j*)··· *A*y(*m*, *m* + *j*)

**Figure 4.1:** Speckle pattern in the ROI of the upper spot, taken at different points in time *m* + *j*.

can then be used to estimate a polynomial of first order describing the shift over time. This in turn, evaluated at *t*m+<sup>1</sup> results in the final estimate of the shift between line *m* and *m* + 1. This concept is visualized in Fig. 4.2. The empty circles (left) represent the pattern shifts *A*y(*m*, *m* − *j*)··· *A*y(*m*, *m* + *j*) whereas the dashed line is the polynomial of first order. The resulting shift then transfers to the total shift over time depicted on the right side.

Applying the previously described signal processing routine, the results in Fig. 4.3 have been obtained. A brass sample was heated up to about 110 ◦C using a heat-gun. Then, as can be seen in the left graph, the sample slowly cooled down to about room temperature. As expected, the specimen shrinks at approximately the theoretical rate, depicted in grey. However, as can be seen, the estimated shift is always lower than the theoretical shift. The reason for this lies within an observable bias of the estimator, which still needs further research and potentially the introduction of a more precise model of the influence of the CCD camera. Despite the bias, the estimated averaged CLTE *α*L,25−<sup>110</sup>

**Figure 4.2:** Local pattern shifts *A*y(*m*, *m* + *j*) translate to a point in the total pattern shift when evaluation of the linear polynomial at *tm*,*m*+<sup>1</sup> is carried out.

over the whole temperature range, based on the linearised total shift, seen on the right of Fig. 4.3, is in good agreement with the CLTE found in literature. For a smaller temperature range of 25 ◦C–50 ◦C a result closer to the theoretical one has been obtained. The values are as follows: *α*L,Theo = 18.7 ppm/◦C [8], *α*L,25−<sup>110</sup> = 17.8 ppm/◦C, *α*L,25−<sup>50</sup> = 18.2 ppm/◦C.

**Figure 4.3:** Comparison of the estimated difference Δ*A*<sup>y</sup> = −*A*y,Upper + *A*y,Lower based on measurements vs. theoretical values for a bulk brass specimen. Δ*A*<sup>y</sup> is essential for the calculation of the CLTE.

### **5 Conclusions**

We have shown, that it's possible to get good estimates for the CLTE of metals evaluating the CPSD weighted by the cause-and-effect measuring coherency function. The described signal processing routine leads to acceptable results presented for the case of bulk brass and can be extended to thin specimen as well. It has been observed that with the presented estimator resulting shifts are always smaller than the ones obtained by literature based calculations. Further research is needed in order to describe this phenomenon accurately and also improve the estimate.

### **References**


# **Vessel extraction from breast MRI**

Marco Gierlinger, Dinah Brandner, and Bernhard G. Zagar

Johannes Kepler University Linz, Institute for Measurement Technology, Altenberger Straße 69, AT-4040 Linz

**Abstract** We present an extension of the previous work, where a multi-seed region growing (MSRG) algorithm was shown, that extracts segments from breast MRI. The algorithm of our extended work filters elongated segments from the segments derived by the MSRG algorithm to obtain vessel-like structures. This filter is a skeletonization-like algorithm that collects useful information about the segments' thickness, length, etc. A model is shown that scans through the solution space of the MSRG algorithm by adjusting its parameters and by providing shape information for the filter. We further elaborate on the usefulness of the algorithm to assist medical experts in their diagnosis of diseases relevant to angiography.

**Keywords** Feature extraction, breast MRI, region-based, image segmentation

### **1 Introduction**

Magnet resonance (MR) angiography is a diagnostic tool to depict vessels, e.g. blood vessels. Also, potential malignant masses like cancer can be analyzed, since the spread of cancer is supported by excreting angiogenesis factor to prompt vessel growth into the mass to provide nutrients and oxygen [1]. Information about location, size and morphology of vessels may provide clinicians with useful information before surgery during e.g. a neoadjuvant therapy. Blood vessels in breast MRI are determined subjectively by radiologists, which is considered as the gold standard as research shows [2,3]. Rarely accessible data-sets and their expensive annotation by experts hinder the progress of vessel extraction in medical fields with deep-learning [4]. In recent years, many researchers were motivated to provide clinicians with useful algorithms for vessel segmentation for medical diagnostics [5].

In previous work, we developed a fast multi-seed region growing (MSRG) algorithm [6], that is capable of extracting homogeneous regions even in images exhibiting noise artefacts. While the algorithm is still in active development, it has shown promising results on which we rely on in this current contribution. The algorithm generates a finite two-dimensional solution space of extracted homogeneous regions from a given input image. We extend the previous work with a skeletonization-like algorithm that is capable of filtering elongated segments from this solution space.

In Section 2, the algorithm of the previous work is explained in detail. The subsequent sections will describe the skeletonization-like algorithm and show the results of allegedly extracted vessels.

### **2 Previous Work**

The multi-seed region growing (MSRG) algorithm is described for *n*-bit RGB images in **N***w*×*h*×<sup>3</sup> <sup>0</sup> as follows: Multiple seeds are employed and aligned as grid with a user-defined width *u* and height *v* with *u*, *v* ∈ **N**, *u* ≤ *w* and *v* ≤ *h* as schematically depicted in red (*u* = *v* = 3) on the blue pane of Fig. 2.1. The algorithm walks through pixel space in an 8-adjacency flood-fill manner independently for each of the *k* seeds as follows: Let �*<sup>p</sup>* <sup>∈</sup> **<sup>N</sup>**<sup>5</sup> <sup>0</sup> = (*p*0, *p*1, *pR*, *pG*, *pB*) be a pixel, where (*p*0, *p*1) is its position in pixel space and let *IR*(�*p*) = *pR*, *IG*(�*p*) = *pG*, and *IB*(�*p*) = *pB* be its intensity values. A bucket queue is used to hold data objects (�*p*, *<sup>q</sup>*), where *<sup>q</sup>* <sup>∈</sup> **<sup>N</sup>**<sup>0</sup> with *<sup>q</sup>* <sup>&</sup>lt; <sup>2</sup>*<sup>n</sup>* is the bucket's index. Initially, the queue is filled with the seed only, which is a single pixel only. An iteration *<sup>i</sup>* is initiated by polling a bucket *Bi*. <sup>∀</sup>�*<sup>a</sup>* <sup>∈</sup> *Bi* <sup>∀</sup>�*<sup>b</sup>* <sup>∈</sup> *<sup>N</sup>*� (�*a*), a cost function *δ<sup>f</sup>* : (�*a*, �*b*) �→ {*<sup>r</sup>* <sup>∈</sup> **<sup>N</sup>**<sup>0</sup> <sup>|</sup> *<sup>r</sup>* <sup>&</sup>lt; <sup>2</sup>*n*} is applied, where *<sup>N</sup>*� (�*a*) denotes all non-visited neighbors of �*a*. Cost function *δ<sup>f</sup>* is defined as:

$$\delta\_f(\vec{a}, \vec{b}) = \left\lfloor \sqrt{\frac{(I\_R(\vec{a}) - I\_R(\vec{b}))^2 + (I\_G(\vec{a}) - I\_G(\vec{b}))^2 + (I\_B(\vec{a}) - I\_B(\vec{b}))^2}{3}} \right\rfloor. \tag{2.1}$$

At the end of an iteration, each result is added to the queue as ( �*b*, *δf*(�*a*, �*b*)).

**Figure 2.1:** MSRG applied to a single slice of breast MRI dataset [7].

**Figure 2.2:** Left: Cumulative frequency of number of assigned pixels *mi* on 'pliers' image with (*w*, *h*)=(640, 480) and largest step *m*max is found at highlighted iteration *i*. Right: Image with arbitrary seed positions indicated.

#### **2.1 Heuristic**

Additionally, the number of newly assigned pixels *mi* is tracked for each iteration *<sup>i</sup>*. A map *Ms* <sup>∈</sup> **<sup>N</sup>***w*×*<sup>h</sup>* <sup>0</sup> , drawn as an orange pane in Fig. 2.1, is used for the *<sup>s</sup>th* of *<sup>k</sup>* seeds and every *ra*0,*a*<sup>1</sup> <sup>∈</sup> *Ms*, where (*a*0, *a*1) is the position of �*a*, is set to *m*max(*i*) = max(*m*0, *m*1, ..., *mi*) at iteration *i*. Finally, when the queue is empty, i.e. every pixel was visited, *Ms* is converted into a binary map by

$$r \in M\_{\rm s} = \begin{cases} 0 & r < m\_{\rm max}(i) \\ 1 & \text{otherwise} \end{cases} \tag{2.2}$$

Then, all *k* binary maps are added up elementwise to ∑*<sup>k</sup> <sup>s</sup>*=<sup>1</sup> *Ms*, and normalized to the range from zero to 2*<sup>n</sup>* <sup>−</sup> 1 to suppress regions that

occur less often than others. An example is highlighted in Fig. 2.1 (right) with random colors assigned for regions containing the same numerical value. Growth is depicted in Fig. 2.2 and Fig. 2.3 for two arbitrary seed positions.

Furthermore, an image quantization factor *g* ∈ **R** with 0 < *g* ≤ 1 is used to decrease the image's bit depth as a preprocessing step. A user can tune the grid size (*u*, *v*), i.e. the density of dispersed seeds, and image quantization factor *g* until a suitable solution is found as shown in Fig. 2.4.

**Figure 2.3:** Growth of blue seed at (60, 60) and orange seed at (320, 240) on 'pliers'.

**Figure 2.4:** Solution space of MSRG [6].

### **3 Proposed Method**

In this section we initially detail an algorithm that filters elongated segments from the result set of the multi-seed region growing (MSRG) algorithm. Furthermore, in the second subsection, a model is depicted to show how these two algorithms interact to obtain vessel-like structures from an input breast MRI stack. The stack is defined as follows: Let *Si* be the *i* th image on the breast MRI stack, that is an 8-bit grayscale image, we define *Ci* as an RGB image, that is composed of *Si* as red channel, *Si*<sup>+</sup><sup>1</sup> as green channel and *Si*<sup>+</sup><sup>2</sup> as blue channel. For example, the top of the second stack from left (*C*24) in Fig. 3.1 shows such an 8-bit 256 × 256 RGB image as one of 34 possible slices.

**Figure 3.1:** From left to right: Five of a set of 36 Breast MRI slices of a single session → RGB image composited by three grayscale images from the left stack → Filtered MSRG algorithm fi(*C*, *g*off + *n* · *g*inc) → Result stack showing a single result slice.

#### **3.1 Algorithm**

Elongated vessel-like segments are filtered out from the set of segments derived by the MSRG. We apply a skeletonization-like algorithm sk(msrg(*C*, *g*, *u*, *v*)) = *M*sk, where *C* is an arbitrary RGB image and *g*, *u*, *v* the other MSRG input parameters as described in the previous section. Result *M*sk is a binary map containing the filtered segments. The algorithm sk(msrg(*C*, *g*, *u*, *v*)) is described for a single segment of the MSRG result set as follows:

Initially, we select a random pixel *p* from given segment *A* and 8-adjacently flood-fill until every pixel has been 'filled'. Let *N*(*p*) be the set of adjacent filled pixels of *p* in pixel space of *A*. Each iteration, we prioritize from the set of potential new pixels *N*(*A*) = {*r* ∈ *A* | ¬isfilled(*r*) ∧ *N*(*r*) �= ∅} the subset *N*max(*A*) ⊆ *N*(*A*), defined as the largest group of adjacent pixels as highlighted in Fig. 3.2 left.

This rule leads to a fast method such that the segment is flood-filled nearly homogeneously along an elongated segment as shown in Fig. 3.2 right. Each iteration and until every pixel has been examined, we derive an updated set of *N*max(*A*). We use this set for statistical analysis: E.g., |*N*max(*A*)| gives a rough estimation of the thickness of an elongated shape. This statistical information is useful to filter out segments that are too short, too thick, or have a too large variance in thickness or intensity along the potential vessel or along its cross section. With this filter, the number of parameters increases but the parameters could be adjusted by medical experts to fit the realistic requirements of the vessels in question.

**Figure 3.2:** A Flood-fill algorithm that promotes nearly homogeneous flooding by prioritizing the largest adjacent group *N*max(*A*) ⊆ *N*(*A*).

#### **3.2 Model**

We generate with the MSRG a one-dimensional solution space: The seed grid size (*u*, *v*) is set to (4, 4), i.e. we select a column as shown in Fig. 2.4. These values are selected based on user experience with the MSRG. We define within this subsection the filtered MSRG algorithm fi(*C*, *g*) = sk(msrg(*C*, *g*, 4, 4)), where sk(msrg(*C*, *g*, 4, 4)) denotes the filter algorithm from the previous subsection. The algorithm fi(*C*, *g*) is applied to RGB (3 channels) image *<sup>C</sup>* <sup>∈</sup> **<sup>N</sup>***w*×*h*×<sup>3</sup> <sup>0</sup> with image width *w* and image height *<sup>h</sup>* and results in a binary map *<sup>M</sup>*fi <sup>∈</sup> **<sup>N</sup>***w*×*<sup>h</sup>* <sup>0</sup> . The image quantization factor *g* ∈ **R** with 0 < *g* ≤ 1 is now the only parameter that influences the result.

To keep the computation time low, we do not iterate over all possible values of the image quantization factor *g*. We limit the range from *g*off to *g*off + *g*inc · *g*res, where *g*off denotes the lower bound, *g*res the resolution, and *g*off + *g*inc · *g*res the upper bound of the dimension. Again, the values are selected based on user experience of the MSRG and should be selected such that the desired features are found within these bounds.

The model is schematically depicted in Fig. 3.1. The results are binary images with elongated structures that potentially reveal vessels. Formally, we generate a result stack (see Fig. 3.1 right) for all *j* ∈ **N** with 1 < *j* ≤ (MRI stack size − 3), defined as:

$$Y\_j = \sum\_{n=0}^{\text{gres}} \cos(\mathbb{C}\_{j-1} \, \mathbb{C}\_{j\prime} \mathbb{C}\_{j+1\prime} \, \mathbb{g}\_{\text{off}} + n \cdot \mathbb{g}\_{\text{inc}}) \tag{3.1}$$

where co(*R*1, *R*2, *R*3, *g*) returns a single binary map that is composed of the union of the two adjacent slice pairs. Formally,

$$\operatorname{co}(R\_1, R\_2, R\_3, \operatorname{g}) = \operatorname{fi}(R\_1, \operatorname{g}) \land \operatorname{fi}(R\_2, \operatorname{g}) \lor \operatorname{fi}(R\_2, \operatorname{g}) \land \operatorname{fi}(R\_3, \operatorname{g}) \tag{3.2}$$

where ∧ and ∨ denote element-wise binary operators. Finally, as a post processing step, we set

$$r \in \mathbb{Y}\_{\bar{\jmath}} = \begin{cases} 0 & r < t \\ 1 & \text{otherwise} \end{cases} \tag{3.3}$$

where *t* is a user-defined threshold.

With this technique, it is possible to differ the contours of the skin from elongated segments that are within the region of interest. We assume that the color gradient more likely varies along the cross section as shown in Fig. 3.3. This can be detected by the algorithm of the previous subsection by statistical analysis of *N*max(*A*).

#### **4 Results and Evaluation**

We apply the proposed method of the previous section to two different breast MRI stacks that were acquired from the same patient (and same

**Figure 3.3:** Left: A potential vessel with color gradient varying lengthwise. Right: Breast contour with color gradient varying along the cross section.

breast) during two sessions. Several weeks lay between these two sessions. Algorithm parameters are set for both stacks equally. The result is shown in Fig. 4.1 for four adjacent slices of each result stack. Slices of these two stacks are shifted such that they roughly match each other from column to column in Fig. 4.1. We derive similar results for both stacks, although the region of interest is shaped differently as seen in the two RGB images of Fig. 4.1. However, these binary results do not cover all elongated structures as visible in the RGB images. A better strategy is required to efficiently look for elongated segments in the solution space of the MSRG algorithm to derive more results. Also, a slice represents 3 mm thickness and obviously, a much higher resolution would improve the results for thinner vessel-like structures.

Due to the breast varying in shapes in different postures, it is very challenging to detect common features, however, this method might be used for automatic image registration.

### **5 Conclusions**

With our proposed filter and combined with the MSRG algorithm from the previous work, it seems possible to assist clinicians in detecting vessel-like segments. The employed algorithm filtered elongated structures from the solution space of the MSRG algorithm. The results seem promising, however, they could not be evaluated for the correct detection of vessels without medical expertise. Instead, the proposed method was applied to two input MRI stacks from the same patient (and same breast), that were acquired during two sessions. The evalua-

**Figure 4.1:** Each row contains four adjacent slices of the result stack as shown in Fig. 3.1 right. Each row's input stack (Fig. 3.1 on the left) was acquired from the same patient but several weeks lay between these sessions.

tion showed how barely the results deviate from each other. Although vessels will be positioned slightly differently between two sessions due to shaping the breast dependent on the actual posture in the MRI scanner, this method seemingly extracted similar elongated segments for both input MRI stacks. This suggests the assumption that the extracted segments are not artefacts.

Based on user experience, the solution space of the MSRG algorithm was searched through. Due to the complexity and computation time, it is impractical to search through the whole solution space. It is required to adjust the MSRG parameters accordingly, however, the MSRG algorithm is not guaranteed to be complete. Further investigations are required for more efficient search strategies.

### **Acknowledgement**

This work has been supported by the LCM K2 Center within the framework of the Austrian COMET-K2 Programme.

### **References**


# **Generation of artificial training data for spectral unmixing by modelling spectral variability using Gaussian random variables**

Johannes Anastasiadis and Michael Heizmann

Institute of Industrial Information Technology (IIIT), Karlsruhe Institute of Technology (KIT), Hertzstr. 16, 76187 Karlsruhe, Germany

**Abstract** A stochastic method how artificial training data for spectral unmixing can be generated from real pure spectra is presented. Since the pure spectra are modelled as Gaussian random vectors, spectral variability is also considered. These training data can in turn be used to train an artificial neural network for spectral unmixing. Non-negativity and sum-to-one constraints are enforced by the network architecture. The approach is evaluated using real mixed spectra and achieves promising results with the used datasets.

**Keywords** Spectral unmixing, spectral variability, Gaussian random variables, convolutional neural network

#### **1 Introduction**

Optical measurement techniques are often used for process monitoring in industry because they are non-contact and non-destructive. An important task is to determine the relative proportions of the components in substance mixtures. There are applications in many fields, such as food industry, medical technology, as well as in the processing of bulks. This task cannot be solved sufficiently with conventional colour images, because these only contain three colour channels (red, green, blue) per pixel, whereas hyperspectral images have a finely gained spectrum in each pixel that characterizes the materials much better [1]. If information of different materials is contained in one measured pixel, spectral unmixing (SU) is needed to get the relative proportions, the abundances, of the pure materials in the pixel [2]. This is often done model-based [3]. However, these models usually assume a single spectrum for each pure substance involved. Actually, the spectra of the pure substances and also of the mixtures vary quite a lot [4, 5].

Artificial neural networks (ANNs) achieve excellent results in many domains and have the benefit of not requiring model knowledge. Particularly for SU, the use of ANNs has further advantages [6]. First, the non-negativity and the sum-to-one constraints can be enforced by a normalising output layer. Second, spectral variability can be considered if it is contained in the training data. However, ANNs need a lot of significant training data to perform well, which are often not available in the domain of hyperspectral imaging.

In this approach we model the mixing characteristics including spectral variability of substances and use these models to generate training data for an ANN used for SU. Several spectra of each pure substance are needed for this approach. Even without the availability of real mixed spectra, most of the advantages of SU using an ANN can be exploited. Furthermore, performance can be improved if additional mixed spectra are available. This approach is therefore suitable for use in an industrial environment where the pure substances involved are known and hyperspectral images can be acquired in advance. We have already trained ANNs with artificially generated mixed spectra in a preceding work [7]. There, pure spectra randomly drawn from a set were mixed using models to obtain mixed spectra. In contrast, the approach in this paper models the spectra of the pure substances as Gaussian distributed random vectors. There is another approach, where SU is accomplished by directly applying Gaussian process regression [8], but not for training data generation. In this contribution, to generate the training data, the random vectors are combined for many different sets of abundances using the linear mixing model (LMM), the Fan Model (FM), the generalized bilinear model (GBM), and the linear quadratic model (LQM) [2, 9–11].

The rest of the paper is structured as follows: In Section 2 the basics of SU are summarized. Section 3 then provides the probabilistic equivalents of the mixing models used. In Section 4 the experimental results are shown. Finally the paper is summarized and conclusions are drawn in Section 5.

#### **2 Spectral unmixing**

The estimation of the abundances in a mixture of substances based on its spectrum is called SU [2]. If the spectra of the pure substances involved are known, the term supervised SU is used. This is the case here. The estimation **a**ˆ = [*a*ˆ1, ..., *a*ˆ*P*] <sup>T</sup> <sup>∈</sup> **<sup>R</sup>***<sup>P</sup>* of the true abundances **a** = [*a*1, ..., *aP*] <sup>T</sup> <sup>∈</sup> **<sup>R</sup>***<sup>P</sup>* of the up to *<sup>P</sup>* pure substances involved has to fulfil constraints in order to ensure physical validity. These are the non-negativity constraint and the sum-to-one constraint:

$$\mathfrak{a}\_p \ge 0 \quad \forall p \,, \qquad \sum\_{p=1}^{P} \mathfrak{a}\_p = 1 \,. \tag{2.1}$$

Many of the methods used for SU are model-based. The models used here approximate the mixed spectra using a parametric function. The most commonly used mixing model is the LMM, which works well for many applications [2, 5, 12, 13]:

$$\mathbf{y} = \sum\_{p=1}^{P} \mathbf{m}\_p \, a\_p + \boldsymbol{\omega} = \mathbf{M} \, \mathbf{a} + \boldsymbol{\omega} \,. \tag{2.2}$$

Here **<sup>y</sup>** <sup>∈</sup> **<sup>R</sup>***<sup>Λ</sup>* is a discrete spectrum sampled at *<sup>Λ</sup>* wavelength channels and **<sup>M</sup>** = [**m**1, ..., **<sup>m</sup>***P*] <sup>∈</sup> **<sup>R</sup>***Λ*×*<sup>P</sup>* are the spectra of the up to *<sup>P</sup>* involved pure substances. Differences between **y** and the weighted sum of the pure material spectra are considered by *<sup>ω</sup>* <sup>∈</sup> **<sup>R</sup>***Λ*. The LMM is based on the assumptions that mixing takes place on a macroscopic scale and that photons interact with only one material before they hit the sensor [14]. There are also non-linear mixing models that are summarized in [3]. In this paper the GBM [10]

$$\mathbf{y} = \sum\_{p=1}^{P} \mathbf{m}\_p \, a\_p + \sum\_{p=1}^{P-1} \sum\_{q=p+1}^{P} \gamma\_{pq} \, a\_p \, a\_q \, \mathbf{m}\_p \odot \mathbf{m}\_q + \omega \tag{2.3}$$

and the LQM [11]

$$\mathbf{y} = \sum\_{p=1}^{P} \mathbf{m}\_p \, a\_p + \sum\_{p=1}^{P} \sum\_{q=1}^{P} b\_{pq} \, \mathbf{m}\_p \odot \mathbf{m}\_q + \omega \tag{2.4}$$

are used, where *γpq* and *bpq* are the non-linearity coefficients and � is the element-wise product. For *γpq* = 1 ∀*p*, *q* the GBM is equivalent to the FM [9]. These mixing models also consider light that has interacted with up to two substances before hitting the sensor.

A commonly used approach considering (2.1) is the Fully Constrained Least Squares (FCLS) algorithm [15]. Here, the Lagrangian *<sup>L</sup>* : **<sup>R</sup>***P*+<sup>1</sup> <sup>→</sup> **<sup>R</sup>** with Lagrange multiplier *<sup>l</sup>* <sup>∈</sup> **<sup>R</sup>** is optimized:

$$L(\mathbf{a}, l) = \|\mathbf{y} - \mathbf{M}\mathbf{a}\|\_2^2 - l \left(\sum\_{p=1}^P a\_p - 1\right) \,. \tag{2.5}$$

It is an iterative process that removes negative *a*ˆ*<sup>p</sup>* and the corresponding pure spectra. For estimations based on non-linear mixing models, gradient based line search methods can be used [10].

The presented mixing models assume that each pure substance is represented by a single spectrum. In reality, however, pure substances may have varying spectra [4]. These variations are referred to as spectral variability, which is primarily caused by surface structure. There are also unmixing algorithms considering the spectral variability such as the extended linear mixing model (ELMM) [16]. The ELMM extends the LMM by a set of parameters that allow scaling of the pure material spectra.

The presented approach considers spectral variability by including it in the generated training data. The process of generating these data is described in the following section.

#### **3 Stochastic mixing models**

For the approach used in this paper the spectra **m***<sup>p</sup>* of the pure substances are modelled as Gaussian distributed random variables *<sup>M</sup><sup>p</sup>* <sup>∈</sup> **<sup>R</sup>***Λ*. These can be entirely described by their mean vector *<sup>μ</sup>M<sup>p</sup>* <sup>=</sup> *<sup>μ</sup><sup>p</sup>* <sup>∈</sup> **<sup>R</sup>***<sup>Λ</sup>* and their covariance matrix **<sup>Σ</sup>***Mp*,*M<sup>p</sup>* <sup>=</sup> **<sup>Σ</sup>***p*,*<sup>p</sup>* <sup>∈</sup> **<sup>R</sup>***Λ*×*Λ*. Mean vectors and covariance matrices of pure spectra are estimated using real data, which is why a set of measured spectra is required for each pure substance. Since the pure spectra of different pure materials do not depend on each other, they are assumed to be stochastically independent and therefore the cross-covariance matrix is **Σ***p*,*<sup>q</sup>* = **0** if

*<sup>p</sup>* � *<sup>q</sup>*. A mixed spectrum *<sup>Y</sup>* <sup>∈</sup> **<sup>R</sup>***<sup>Λ</sup>* is also modelled as a Gaussian distributed random vector with mean vector *<sup>μ</sup><sup>Y</sup>* <sup>∈</sup> **<sup>R</sup>***<sup>Λ</sup>* and covariance matrix **<sup>Σ</sup>***Y*,*<sup>Y</sup>* <sup>∈</sup> **<sup>R</sup>***Λ*×*Λ*. The following moments result for the LMM:

$$
\mu\_Y = \sum\_{p=1}^P a\_p \,\mu\_{p\,\,\,\,'} \qquad \qquad \Sigma\_{Y,Y} = \sum\_{p=1}^P a\_p^2 \,\Sigma\_{p,p\,\,'} \,. \tag{3.1}
$$

Appropriate non-linearity coefficients *γpq* and *bpq* must be determined for the GBM and the LQM. In this paper a constant value is used for this purpose, which means for all *p* and *q*: *γpq* = *γ* and *bpq* = *b* . This is a limitation, but in practice it allows better results compared to the LMM and the FM. To determine suitable values, a small validation dataset with some real mixed spectra is required. The mean vector *μ<sup>Y</sup>* for the GBM can be obtained by replacing **m***<sup>p</sup>* by *μ<sup>p</sup>* in (2.3). The following covariance matrix results for the GBM:

$$\begin{split} \boldsymbol{\Sigma}\_{\mathbf{Y},\mathbf{Y}} &= \sum\_{p=1}^{P} a\_{p}^{2} \, \boldsymbol{\Sigma}\_{p,p} + \sum\_{p=1}^{P-1} \sum\_{q=p+1}^{P} \gamma^{2} \, a\_{p}^{2} \, a\_{q}^{2} \, \boldsymbol{a}\_{p\odot q, p\odot q} \\ &+ \sum\_{p=1}^{P} \sum\_{q=1}^{P} \gamma \, a\_{p}^{2} \, a\_{q} \left( \boldsymbol{\Sigma}\_{p,p\odot q} + \boldsymbol{\Sigma}\_{p\odot q, p} \right) \\ &+ \sum\_{p=1}^{P} \sum\_{q=1}^{P-1} \sum\_{l=q+1 \atop q \neq p}^{P} \gamma^{2} \, a\_{p}^{2} \, a\_{q} \, a\_{l} \left( \boldsymbol{\Sigma}\_{p\odot q, p\odot l} + \boldsymbol{\Sigma}\_{p\odot l, p\odot q} \right) . \end{split} \tag{3.2}$$

The (cross-)covariance matrices are calculated by:

$$\begin{split} \boldsymbol{\Sigma}\_{p \odot q, p \odot q} &= \mathcal{D}(\boldsymbol{\mu}\_{p}) \, \boldsymbol{\Sigma}\_{q, q} \, \mathcal{D}(\boldsymbol{\mu}\_{p}) + \mathcal{D}(\boldsymbol{\mu}\_{q}) \, \boldsymbol{\Sigma}\_{p, p} \, \mathcal{D}(\boldsymbol{\mu}\_{q}) + \boldsymbol{\Sigma}\_{p, p} \, \boldsymbol{\odot} \, \boldsymbol{\Sigma}\_{q, q} \\ \boldsymbol{\Sigma}\_{p, p \odot q} &+ \boldsymbol{\Sigma}\_{p \odot q, p} = \boldsymbol{\Sigma}\_{p, p} \, \mathcal{D}(\boldsymbol{\mu}\_{q}) + \mathcal{D}(\boldsymbol{\mu}\_{q}) \, \boldsymbol{\Sigma}\_{p, p} \, \boldsymbol{\prime} \\ \boldsymbol{\Sigma}\_{p \odot q, p \odot l} &+ \boldsymbol{\Sigma}\_{p \odot l, p \odot q} = \mathcal{D}(\boldsymbol{\mu}\_{q}) \, \boldsymbol{\Sigma}\_{p, p} \, \mathcal{D}(\boldsymbol{\mu}\_{l}) + \mathcal{D}(\boldsymbol{\mu}\_{l}) \, \boldsymbol{\Sigma}\_{p, p} \, \mathcal{D}(\boldsymbol{\mu}\_{q}) \, \boldsymbol{\prime} \end{split} \tag{3.3}$$

Here, the output of operator D(·) is a diagonal matrix with the elements of the input vector as its diagonal elements. If the input is a matrix, the output is a vector whose elements are the diagonal elements of the input matrix. The moments for the FM can be obtained by setting *γ* = 1.

The LQM also takes element-wise squared spectra into account. The following mean vector formula results for the LQM after transforming:

$$\mu\_Y = \sum\_{p=1}^P a\_p \mu\_p + b \,\mu\_{p \odot p} + \sum\_{p=1}^{P-1} \sum\_{q=p+1}^P 2 \, b \,\mu\_p \odot \mu\_{q \cdot} \tag{3.4}$$

where *<sup>μ</sup>p*�*<sup>p</sup>* <sup>=</sup> *<sup>μ</sup><sup>p</sup>* � *<sup>μ</sup><sup>p</sup>* <sup>+</sup> <sup>D</sup>(**Σ***p*,*p*). The covariance matrix of a mixed spectrum is not shown for the LQM because it is similar to the one of the GBM. As for the other mixing models, all covariance and crosscovariance matrices are taken into account.

Using the Gaussian random variables calculated with the mixing models, training data for an ANN can now be generated. If training is performed over several epochs, new samples of mixed spectra can be drawn in every epoch. After data sampling, the spectra generated with non-linear mixing models are normalized. For this purpose they are multiplied by *d*GBM or *d*LQM, respectively:

$$d\_{\rm GBM} = \left(1 + \sum\_{p=1}^{P-1} \sum\_{q=p+1}^{P} \gamma \, a\_p \, a\_q\right)^{-1}, \quad d\_{\rm LQM} = \left(1 + P^2 b\right)^{-1}.\tag{3.5}$$

This is necessary because, unlike the LMM, the prefactors do not add up to 1. However, since the light that contributes to the linear component cannot contribute additionally to the quadratic component, this normalization is useful [7]. Thus, values for *γ* and *b* greater than 1 are also reasonable. In the next section the described approach is evaluated with real hyperspectral datasets.

### **4 Experimental results**

To evaluate our approach we use real hyperspectral data acquired in our image processing lab. All datasets have 91 wavelength channels with an average width of 4 nm from 450 nm to 810 nm and 400 spectra per mixture. Two of the datasets consist of mixtures of coloured quartz sand. The first of them (quartz-3) contains 45 mixtures of maximum 3 components including pure substances, which vary in abundance steps of 0.125 . The second one (quartz-4) contains 56 mixtures of maximum 4 components including pure substances, which vary in abundance steps of 0.2 . The third dataset consists of 56 mixtures of colour powders (colour-4), which also have a maximum of 4 components each. Here, the components are varied in steps of 0.2, too. Only pure spectra are used for training. For all datasets, 12 mixtures are used for validation and the rest (30 respectively 40) to test. We train our ANN using the artificial mixtures generated with the approach in Section 3. The validation datasets are used to determine good non-linear coefficients for the GBM and the LQM. After training we test the ANN with the real mixtures in our test datasets. The root-mean-square error

$$\Delta\_{\text{RMSE}} = \sqrt{\frac{1}{N} \sum\_{n=1}^{N} \frac{1}{P} \sum\_{p=1}^{P} (\mathfrak{d}\_{pn} - a\_{pn})^2} \tag{4.1}$$

based on all *N* spectra including all mixtures of a test dataset, is used as a measure of performance. The results are compared to the FCLS algorithm and the ELMM based SU.

The ANN we used for the evaluation is the convolutional neural network (CNN) from [17], which is the one dimensional version of [7]. It has three convolutional layers and two fully connected layers. The length of the convolution kernels is 3 and the numbers of feature maps from input layer to output layer are 1, 16, 32, 64, 64, and 1. The number of epochs depends on the dataset. Therefore, the CNN is trained with quartz-3 dataset for 251 epochs, quartz-4 dataset for 61, and colour-4 dataset for 31 epochs.

Different training datasets are generated using all presented mixing models and different step sizes s ∈ **R** for the abundances **a** (see Fig. 4.1). In every epoch there are 400 spectra per mixture drawn from the previously determined Gaussian random vectors of the mixtures. Figure 4.1 shows the results for the different methods. The prefix CNN indicates that a CNN was trained for SU in order to get this result. The training data were generated with the corresponding stochastic mixing model.

It is evident that in almost all cases the presented approach leads to an improvement of the results compared to the FCLS or the ELMM based approach. This is due to spectral variability being modelled based on data and taken into account by the CNN. Which mixing model yields the best results depends on the unmixing task. In the datasets used here the non-linear mixing models perform better. The FM delivers quite good results, although like LMM, it does not need any additional parameters. The colour-4 dataset shows, however, that the choice of the right mixing model can achieve a significant improvement. Smaller step sizes and thus more large training datasets lead in most cases to an improvement.

**Figure 4.1:** Root-mean-square error of the abundances for the compared methods applied on the quartz-3 (top), the quartz-4 (middle), and the colour-4 (bottom) dataset, respectively.

### **5 Summary**

In this paper an approach is presented where artificial mixed spectra are generated using stochastic mixing models. These mixed spectra can then be used to train a CNN for SU. In this way, spectral variability is considered. Compared to methods from literature, which in part also include spectral variability, better results can be achieved with regard to the error of the estimated abundances. The choice of the mixing model in dependence of the problem significantly influences the quality of the estimation. Finer step sizes in the specified abundances for the training data can lead to an additional improvement.

In the future, the approach could be extended in such a way that the mean vectors and covariance matrices of the random vectors of mixed spectra are determined based on data instead of models. However, larger training datasets would be needed to train these networks.

### **References**


E. Santi, Eds., vol. 11533, International Society for Optics and Photonics. SPIE, 2020, pp. 188–199.

### **Line Spectra Analysis: A Cumulative Approach**

Achim Kehrein<sup>1</sup> and Oliver Lischtschenko2

<sup>1</sup> Rhine-Waal University of Applied Sciences, Faculty of Technology and Bionics, Marie-Curie Str. 1, 47533 Kleve, Germany

<sup>2</sup> Ocean Insight - A Brand of Ocean Optics B.V., Maybachstr. 11, 73760 Ostfildern, Germany

**Abstract** An optical spectrometer uses detector pixels that measure the integrated intensity over a certain interval of wavelengths. These integrated pixel values are divided by the interval width and then interpreted as estimates of function values of the wanted spectral irradiance. Hence each pixel measurement constitutes an averaging process. But, averaging biases at maxima: pixel data feature lower maxima. This paper proposes the conceptual use of a cumulated spectrum to estimate spectral data. The integrated quantities are placed in their natural habitat. The motivation originates from the fact that pixel data as integrated quantities are exact values of the cumulated spectrum. Averaging becomes obsolete. There is no information loss. We start with a single spectral line. This "true" spectrum is blurred mimicking the instrument function of the spectrometer optics. For simplicity we consider the instrument function to have Gaussian shape. We integrate the blurred spectrum over subintervals to simulate the pixel measurements. We introduce a cumulated spectrum approach. We compare the cumulated approach with the approach that interpolates the averaged function value estimates of the non-cumulated spectrum. The cumulated approach requires only basic mathematical concepts and allows fast computations.

**Keywords** Spectrum, spectral fitting, cumulative spectrum, data processing, plasma, line emission, spectroscopy

### **1 Introduction**

The analysis of emitted line spectra provides insight into the qualitative and quantitative composition of materials. Virtually all analysis is carried out by using non-linear fitting of (multi) Gaussians to the recorded data. This paper replaces the all-purpose non-linear fitting by direct computation of the peak parameters. This reduces dramatically the computational effort and eliminates the dependence on intelligent initial parameter guesses.

Probability theory or statistics knows two perspectives on Gaussians, i.e. on normal distributions. On the one hand there is famous bellshaped curve, the probability density function (PDF). This is how we usually visualize the normal distribution. On the other hand there is the tabulated cumulative distribution function (CDF) with which we perform almost all calculations. The PDF possesses non-negative values and finite area. The CDF is a non-decreasing function.

A spectrum is also a function with non-negative spectral irradiance values and finite area which corresponds to the finite irradiance. This is the usual visualization of a spectrum. In this paper we add the equivalent perspective of a cumulated spectrum to perform our calculations. After all the irradiance of the spectrum is nothing but the cumulated spectral irradiance (integrated with respect to wavelength). This paper introduces how the cumulated approach can be used to fit Gaussian peaks to discrete spectral data in a simple fashion.

In 2020 the global market volume of applications relying on spectral line emission analyses is estimated at a trillion US-Dollar. Examples of these applications are semi-conductor production, quality control of physical coating processes [1], minimally invasive cancer surgery with live tissue diagnostics [2], natural resource exploration [3], and magnetic plasma fusion research [4].

#### **2 Mathematical Model**

We use a simple model. An object sampled by a spectrometer possesses a "true spectrum" which is a function that assigns to each wavelength a certain non-negative value, the spectral irradiance. The wavelength can have any positive value, so mathematically, the true spectrum is a function *S* (signal) defined on the interval (0, ∞) with non-negative values. We restrict our attention to a true spectrum with a single spectral line which we approximate by a Dirac delta function,

$$S(\lambda) = S\_{\mu} \cdot \delta(\lambda - \mu). \tag{2.1}$$

#### **2.1 Blurring - Spectral Lines Become Gaussians**

A spectrometer blurs the true spectrum *S* by convoluting it with the so-called instrument function *I*. We call the result of this convolution *I* ∗ *S* the blurred spectrum,

$$B(\lambda) = \int\_0^\infty I(\lambda - \ell) \cdot \mathcal{S}(\ell) \, d\ell \,\,. \tag{2.2}$$

This blurring depends on the used spectrometer. In this paper we assume that the instrument function is sufficiently well approximated by the Gaussian shape

$$I(\lambda) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{\lambda^2}{2\sigma^2}} \tag{2.3}$$

with the width parameter3 *σ* > 0. Selecting a Gaussian is purely illustrative. Any shape can be cumulated and treated analogously.

The blurring convolutes the line spectrum, equation (2.1), into the Gaussian peak

$$B(\lambda) = S\_{\mu} \cdot I(\lambda - \mu) \,. \tag{2.4}$$

#### **2.2 Pixel Detectors - Spectral Irradiances Are Locally Integrated**

A spectrometer measures the blurred spectrum on a finite interval of wavelengths [*λ*0, *λR*]. Its detector consists of finitely many pixels that detect adjacent parts of the blurred spectrum. Each pixel measures the cumulated spectral irradiance on a subinterval of wavelengths. Let the *r*th pixel cover the subinterval [*λr*−1, *λr*], 1 ≤ *r* ≤ *R*. Then the spectral irradiance of the blurred spectrum measured by the *r*th pixel is

$$B\_r = \int\_{\lambda\_{r-1}}^{\lambda\_r} B(\lambda) \, d\lambda \,. \tag{2.5}$$

<sup>3</sup>We do not want to call *σ* standard deviation, since this is a non-statistical use.

We divide these cumulated values by their subinterval widths to convert *Br* into the same unit as the blurred spectrum and obtain the averaged values

$$B\_r^\* = \frac{B\_r}{\lambda\_r - \lambda\_{r-1}} = \frac{1}{\lambda\_r - \lambda\_{r-1}} \cdot \int\_{\lambda\_{r-1}}^{\lambda\_r} B(\lambda) \, d\lambda \,. \tag{2.6}$$

A spectrometer with *R* pixels provides the finitely many values *B*∗ 1 , *B*∗ <sup>2</sup> ,. . . ,*B*<sup>∗</sup> *<sup>R</sup>* that approximate the blurred spectrum over the interval [*λ*0, *λ*1] ∪···∪ [*λR*−1, *λR*]=[*λ*0, *λR*].

Each pixel value *B*∗ *<sup>r</sup>* – though obtained by averaging – can be interpreted as a function value estimate at the pixel's mean wavelength

$$
\lambda\_r^\* = \frac{\lambda\_{r-1} + \lambda\_r}{2} \,. \tag{2.7}
$$

Then the measured spectrum is a function whose graph consists of the finitely many points (*λ*∗ <sup>1</sup>, *B*<sup>∗</sup> <sup>1</sup> ), (*λ*<sup>∗</sup> <sup>2</sup>, *B*<sup>∗</sup> <sup>2</sup> ),..., (*λ*<sup>∗</sup> *<sup>R</sup>*, *B*<sup>∗</sup> *R*).

### **3 Proof of Concept - Matching a Gaussian Peak**

As an example we consider a single blurred spectral line shown in Figure 3.1. This is a Gaussian peak centered at wavelength *μ* = 500 nm with width parameter *σ* = 1 nm and irradiance 100 *μ*W/cm<sup>2</sup> (integrated spectral irradiance). We assume that the wavelength interval from *λ*<sup>0</sup> = 497 nm till *λ*<sup>11</sup> = 503 nm is detected by a row of eleven equidistant pixels. Their averaged values *B*∗ <sup>1</sup> , ..., *B*<sup>∗</sup> <sup>11</sup> are shown as function value estimates. Figure 3.1 illustrates in particular the effect of averaging the concave function segment close to the maximum: averaging underestimates a maximum value.

#### **3.1 Non-linear All-purpose Fitting of the Function Value Estimates**

As control case for our example we use the all-purpose Matlab command fit to fit a Gaussian peak to the pixel values (*λ*∗ <sup>1</sup>, *B*<sup>∗</sup> <sup>1</sup> ),. . . , (*λ*∗ <sup>11</sup>, *B*<sup>∗</sup> <sup>11</sup>). The model function to be fitted is

$$
\hat{B}(\lambda) = a \cdot e^{-\frac{(\lambda - b)^2}{\varepsilon}} \tag{3.1}
$$

**Figure 3.1:** A Gaussian peak models a blurred spectral line. The wavelength interval [497 nm, 503 nm] is divided into 11 equidistant subintervals (pixels). The pixel values are placed at the midpoints of the subintervals. The dashed curve is the fitted result by Matlab. The maximum of the fit is lower than the maximum of the blurred spectrum.

with model parameters *a*, *b*, and *c*. The underlying algorithm for nonlinear least squares requires intelligent initial parameters to produce convergence. We provided the initial parameters *a*<sup>0</sup> = 39.4 (maximal pixel value), *b*<sup>0</sup> = 500, and *c*<sup>0</sup> = 2.

Matlab finds *a*ˆ = 39.41, ˆ *b* = 500, and *c*ˆ = 1.432 with 95% confidence intervals (39.40, 39.41), (500, 500), and (1.432, 1.432) respectively. The estimated model parameters translate into: the line is located at wavelength 500 nm with width parameter *σ* = *c*ˆ/ <sup>√</sup><sup>2</sup> <sup>=</sup> 1.0126 nm and irradiance *S*ˆ *<sup>μ</sup>* = *a*ˆ · *σ* <sup>√</sup>2*<sup>π</sup>* <sup>=</sup> 100.0287 *<sup>μ</sup>*W/cm2. Figure 3.1 also shows the fitted curve. It fits the blurred spectrum well, but features the biased smaller maximum. The estimated irradiance corresponds to the area under the curve. The fit overestimates the irradiance.

#### **3.2 Cumulative Spectrum - Natural Use of Pixel Values**

This section presents the alternative way to match a Gaussian peak to the pixel data. Instead of interpreting the pixel data as function value estimates, we use them exactly as the integrated quantities that they are. To do so, we consider the blurred spectrum cumulated over the measured interval

$$\mathcal{C}(\lambda) = \int\_{\lambda\_0}^{\lambda} B(\ell) \, d\ell \quad \text{,} \lambda\_0 \le \lambda \le \lambda\_{R\\_\epsilon} \tag{3.2}$$

and the cumulated pixel values

$$\mathbf{C}\_{\mathbf{r}} = \sum\_{i=1}^{r} B\_{\mathbf{i}} \quad \text{ } \qquad r = 0, 1, \dots, R. \tag{3.3}$$

The cumulated pixel values are exact samples of the cumulated blurred spectrum at the subinterval endpoints, *Cr* = *C*(*λr*), *r* = 0, 1 . . . , *R*. For our example, these quantities are shown in Figure 3.2 and in Table 13.1.4

**Table 13.1:** Cumulated pixel values of example.


#### **3.3 Estimating Parameters from Cumulative Values**

The cumulative normal distribution function

$$F(\mathbf{x}) = \frac{1}{\sqrt{2\pi\sigma^2}} \int\_{-\infty}^{\infty} e^{-\frac{(\xi - \mu)^2}{2\sigma^2}} d\xi \tag{3.4}$$

<sup>4</sup>The blurred spectrum *B* (spectral irradiance) is the derivative of the cumulative spectrum *C* (irradiance). Consistently, the earlier introduced averaged values *B*∗ *<sup>r</sup>* = (*Cr* − *Cr*−1)/(*λ<sup>r</sup>* − *λr*−1) located at the midpoint *λ*<sup>∗</sup> *<sup>r</sup>* turn out to be the centered differences known from numerical differentiation.

**Figure 3.2:** The cumulative Gaussian peak function (solid curve) and the cumulated pixel values at the respective endpoints of the subintervals. The dashed curve is the piecewise linear interpolation. The cumulated pixel values are exact values of the cumulative Gaussian peak function, not just estimates.

satisfies *F*(*μ*) = 0.5 and can be normalized, *z* = (*x* − *μ*)/*σ*, into the standard normal distribution Φ(*z*). In particular, we obtain the tabulated value

$$F(\mu - \sigma) = \Phi(-1) = 0.1587\,\text{.}\tag{3.5}$$

We use these properties to estimate the parameters of the blurred spectrum.

The irradiance of the spectral line is estimated as the area under the measured blurred spectrum, which equals a difference in the cumulated spectrum,

$$\mathcal{S}\_{\mu} = \mathsf{C}\_{R} - \mathsf{C}\_{0} = \int\_{\lambda\_{0}}^{\lambda\_{R}} \mathsf{B}(\lambda) \, d\lambda \approx \int\_{0}^{\infty} \mathsf{B}(\lambda) \, d\lambda \approx \mathsf{S}\_{\mu} \,. \tag{3.6}$$

In Figure 3.2 this is *C*<sup>11</sup> − *C*<sup>0</sup> = 99.73 − 0 = 99.73. After we will have obtained a first estimate for all parameters *Sμ*, *μ*, and *σ*, we will check for consistency and update them.

Now we estimate the wavelength *μ* of the spectral line from the cumulated pixel values. First, we find the subinterval [*λj*−1, *λj*], in which the spectral line is located,

$$\mathbb{C}\_0 < \dots < \mathbb{C}\_{j-1} < 0.5 \cdot \mathbb{S}\_{\mu} \le \mathbb{C}\_j < \dots < \mathbb{C}\_{\mathcal{R}} \, . \tag{3.7}$$

In Figure 3.2 this is [*λ*5, *λ*6]. We estimate *μ*ˆ by linear interpolation,

$$
\hat{\mu} = \lambda\_{j-1} + \frac{0.5 \cdot \hat{S}\_{\mu} - \mathbb{C}\_{j-1}}{\mathbb{C}\_{j} - \mathbb{C}\_{j-1}} \left(\lambda\_{j} - \lambda\_{j-1}\right) \,. \tag{3.8}
$$

Since the cumulative normal distribution function is almost linear at the half-height value, linear interpolation is sufficient. However, the interpolation accuracy depends on the pixel width [*λj*−1, *λj*]. A narrower pixel width produces a better approximation. Figure 3.2 shows a good match between the cumulative blurred spectrum and the linear interpolation on [*λ*5, *λ*6]. We compute *μ*ˆ = 500 nm.

Now we estimate *σ* from the cumulated pixel values. We find the subinterval [*λk*−1, *<sup>λ</sup>k*], in which *<sup>μ</sup>* − *<sup>σ</sup>* is located (see equation 3.5),

$$\mathbb{C}\_0 < \cdots < \mathbb{C}\_{k-1} < 0.1587 \cdot \mathbb{S}\_{\mu} \le \mathbb{C}\_k < \cdots < \mathbb{C}\_R \, . \tag{3.9}$$

For the following interpolation we need *k* ≥ 2 which is a reasonable assumption on the number and widths of the pixels covering the spectral line. In Figure 3.2 this is the interval [*λ*3, *λ*4]. The curvature over this interval is not negligible. Linear interpolation would systematically overestimate *σ* due to the convexity. Thus we compute *μ*ˆ by quadratic interpolation through (*λk*−2, *Ck*−2), (*λk*−1, *Ck*−1), and (*λk*, *Ck*). <sup>5</sup> We use Newton's form of the interpolating parabola. The required divided differences are

$$\left[\mathbb{C}\_{k\prime}\mathbb{C}\_{k-1}\right] = \frac{\mathbb{C}\_{k} - \mathbb{C}\_{k-1}}{\lambda\_{k} - \lambda\_{k-1}} \quad , \quad \left[\mathbb{C}\_{k-1\prime}\mathbb{C}\_{k-2}\right] = \frac{\mathbb{C}\_{k-1} - \mathbb{C}\_{k-2}}{\lambda\_{k-1} - \lambda\_{k-2}} \; , \quad \text{(3.10)}$$

and

$$\begin{bmatrix} \mathbb{C}\_{k\prime} \mathbb{C}\_{k-1\prime} \mathbb{C}\_{k-2} \end{bmatrix} = \frac{\begin{bmatrix} \mathbb{C}\_{k\prime} \mathbb{C}\_{k-1} \end{bmatrix} - \begin{bmatrix} \mathbb{C}\_{k-1\prime} \mathbb{C}\_{k-2} \end{bmatrix}}{\lambda\_k - \lambda\_{k-2}} \quad . \tag{3.11}$$

5We use the indices *<sup>k</sup>* <sup>−</sup> 2, *<sup>k</sup>* <sup>−</sup> 1, *<sup>k</sup>* instead of *<sup>k</sup>* <sup>−</sup> 1, *<sup>k</sup>*, *<sup>k</sup>* <sup>+</sup> 1 for numerical reasons.

The interpolating parabola is

$$P(\lambda) = \mathbb{C}\_k + [\mathbb{C}\_{k'}\mathbb{C}\_{k-1}](\lambda - \lambda\_k) + [\mathbb{C}\_{k'}\mathbb{C}\_{k-1}\mathbb{C}\_{k-2}](\lambda - \lambda\_k)(\lambda - \lambda\_{k-1})\tag{3.12}$$

and we compute *<sup>μ</sup>*<sup>ˆ</sup> <sup>−</sup> *<sup>σ</sup>*<sup>ˆ</sup> from *<sup>P</sup>*(*μ*<sup>ˆ</sup> <sup>−</sup> *<sup>σ</sup>*ˆ) = 0.1587 · *<sup>S</sup>*<sup>ˆ</sup> *<sup>μ</sup>*. We plug in

$$\begin{aligned} \mathbf{C}\_{k} &+ [\mathbf{C}\_{k'}\mathbf{C}\_{k-1}](\boldsymbol{\hat{\mu}} - \boldsymbol{\mathcal{O}} - \boldsymbol{\lambda}\_{k}) \\ &+ [\mathbf{C}\_{k'}\mathbf{C}\_{k-1}\mathbf{C}\_{k-2}](\boldsymbol{\hat{\mu}} - \boldsymbol{\mathcal{O}} - \boldsymbol{\lambda}\_{k})(\boldsymbol{\hat{\mu}} - \boldsymbol{\mathcal{O}} - \boldsymbol{\lambda}\_{k-1}) = 0.1587 \cdot \boldsymbol{\hat{S}}\_{\mu} \end{aligned} \tag{3.13}$$

and sort for the powers of the unknown *σ*ˆ,

$$\begin{split} 0 &= \mathbb{C}\_{k} + [\mathbb{C}\_{k}\mathbb{C}\_{k-1}](\hat{\mu} - \lambda\_{k}) - 0.1587 \cdot \hat{S}\_{\mu} \\ &+ [\mathbb{C}\_{k}\mathbb{C}\_{k-1}\mathbb{C}\_{k-2}](\hat{\mu} - \lambda\_{k})(\hat{\mu} - \lambda\_{k-1}) \\ &+ \left( [\mathbb{C}\_{k}\mathbb{C}\_{k-1}] + [\mathbb{C}\_{k}\mathbb{C}\_{k-1}\mathbb{C}\_{k-2}](\hat{\mu} - \lambda\_{k} + \hat{\mu} - \lambda\_{k-1}) \right) \cdot \hat{\sigma} \\ &+ [\mathbb{C}\_{k}\mathbb{C}\_{k-1}\mathbb{C}\_{k-2}] \cdot \boldsymbol{\vartheta}^{2} \end{split} \tag{3.14}$$

In the template 0 = *aσ*ˆ <sup>2</sup> + *bσ*ˆ + *c* we obtain the following coefficients for our example.

$$\begin{aligned} a &= [\mathbb{C}\_k, \mathbb{C}\_{k-1}, \mathbb{C}\_{k-2}] = 10.730 \\ b &= 22.278 + 10.730(0.82 + 1.36) = 45.669 \\ c &= 20.53 + 22.278 \cdot 0.82 - 15.827 + 10.730 \cdot 0.82 \cdot 1.36 = 34.937 \end{aligned} \tag{3.15}$$

From the geometry, the solution of the quadratic formula with the difference is the relevant one. The result rounded to four significant digits is

$$\vartheta = \frac{b - \sqrt{b^2 - 4ac}}{2a} = 1.000\,\text{.}\tag{3.16}$$

#### **3.4 Updating the Estimates**

Now that we have a complete set of estimates, *S*ˆ *<sup>μ</sup>*, *μ*ˆ, and *σ*ˆ we use them to update the estimate of the irradiance *Sμ*. By design our first estimate of *S<sup>μ</sup>* is an underestimate. We do not know how much of its area we actually considered. Now *μ*ˆ and *σ*ˆ allow us to assess this underestimate. We can compute the area covered by the interval [*λ*0, *λR*] using the tabulated normal distribution. This becomes particularly important when we have covered a wavelength range significantly smaller than [*μ* − 3*σ*, *μ* + 3*σ*] (which corresponds to 99.7% of the area). This is also useful when we have an asymmetric interval about the location of the spectral line.

Such an update becomes important when spectral lines are located close to one another and their areas partially overlap. A similar problem arises when a spectral line is close to an endpoint *λ*<sup>0</sup> or *λ<sup>R</sup>* so that a significant part of the blurred spectrum lies outside our observed interval.

For our simple example the numbers will not change much, but let us describe the updating process. Our interval was [*λ*0, *λ*11] = [497 nm, 503 nm]. With respect to our estimates *μ*ˆ = 500 nm and *σ*ˆ = 1 nm the interval endpoints correspond to the standardized *z*-values −3 and 3 respectively (computed from *λ<sup>i</sup>* = *μ*ˆ + *zσ*ˆ). From the tabulated normal distribution we read off Φ(−3) = 0.0013 and Φ(3) = 0.9987. Therefore, the area covers 0.9987 − 0.0013 = 0.9974 = 99.74% of the irradiance. Hence we update the irradiance estimate

$$
\hat{S}\_{\mu} = \frac{\text{original estimate}}{\text{covered area}} = \frac{99.73}{0.9974} = 99.99 \,\text{.}\tag{3.17}
$$

With this updated estimate we recalculate *μ*ˆ and *σ*ˆ. If the updated values deviate significantly from the original ones, we iterate this updating process until the estimates become stationary.

### **4 Outlook**

We have only considered the simple example of an isolated spectral line in a sufficiently large wavelength interval. The results of the cumulated view point are promising. Only linear and quadratic equations were required in combination with only two tabulated values of the normal distribution. However, many aspects are still in need of further investigation:

• The whole process starts with the determination of an interval [*λ*0, *λR*] which covers most of the irradiance of a blurred spectral line. We have to automate the partitioning of a spectrum with several spectral lines into such intervals.


The presented approach significantly reduces the calculation complexity of spectral matching by avoiding all-purpose non-linear fit routines. Due to the simplicity of the computation steps, the presented cumulative approach can be implemented in FPGAs. This allows immediate on-camera real-time data processing.

This affects many applications. Control circuits for online plasma analysis will benefit from shorter latencies. Processes like laser induced breakdown spectroscopy (LIBS) that so far require a complex post mortem analysis have the potential to be treated live. Adapting the approach to complex fusion spectra will significantly reduce the amount of currently needed a-priori knowledge. The whole analysis becomes more deterministic and less dependent on intelligent initial estimates. The only information to be determined in advance is the cumulative distribution of the instrument function.

### **References**


4. M. G. von Hellermann, G. Bertschinger, W. Biel, C. Giroud, R. Jaspers, C. Jupen, O. Marchuk, M. O'Mullane, H. Summers, A. Whiteford ´ *et al.*, "Complex spectra in fusion plasmas," *Physica scripta*, vol. 2005, no. T120, p. 19, 2005.

# **Quality analysis for the production of pelletised materials using Fourier Descriptors**

Sebastian Michlmayr1, Dominik Neumann2, and Bernhard Zagar1

<sup>1</sup> Johannes Kepler University Linz Altenberger Straße 69 A-4040 Linz <sup>2</sup> ECON GmbH, Biergasse 9, A-4616 Bergern

**Abstract** We evaluate the effectiveness of Fourier Descriptors for an in-line characterization of the quality of pelletised materials. For the quality analysis we evaluate the significance of information conveyed by a limited number of low order Fourier Descriptors. Further, we investigate the influence of image resolution and shape on the outcome.

**Keywords** Fourier analysis, fourier descriptors, surface analysis, digital image processing

### **1 Introduction**

Plastic pellets may be produced by pushing molten prime material through die plates into a water stream and pelletizing by cutting the extrusion with a rotating cutter to the preselected length [1]. Examples of resulting pellets are shown in Figure 1.1. The water stream simultaneously cools the pellets and transports them to the next stage of production – the dryer.

The quality of these pellets is influenced, among other things, by the temperature of the molten material, the sharpness of the cutter, and the cleanliness and temperature of the water. Some polymers, for instance, may crystallize with higher temperatures and, subsequently, change their optical appearance and become opaque [2]. Furthermore, the shape of the pellets is highly dependent on the sharpness of the

**Figure 1.1:** Images of two different sets of pellets. On the left-hand side the pellets are comparatively smooth. The observable difference in Gray-scale values is due to temperature differences during production which causes some pellets to crystallize. On the right-hand side the shape of the pellets is less homogeneous which might indicate a blunt cutting tool.

cutter's blade. Given the plasticity of the molten filaments the separation process of the pellets may lean more on the side of the preferred cutting instead of tearing them off in the cutting tool, which indicates wear on the tool or some other quality diminishing conditions. A blunt blade may create pellets as shown on the right-hand side of Figure 1.1 where the main body of the pellet has thin appendices or strands.

To devise an in-line system able to observe the extrusion process and characterize the resultant quality with low reaction time is not a trivial task. Since the quality is dependent on multiple parameters and only partly controlled by the process itself but always dependent on the processed materials, to attain an optimized result requires quick reaction times of the quality assertion process. In our proposed scheme pellets were sampled at random right after the dryer and shape information, as one of the early indicators of wear on the cutter, was gathered with a camera to draw conclusions regarding some of the process parameters.

### **2 Measurement Setup**

A B&R Vision System [3] camera was used as it allows an easy integration into the PLC system of the production line. The gray-scale-camera is complemented by LEDs with different colours which by successive illumination of an object allows to gain also colour information about the analysed specimen. That information allows to detect a beginning cystallisation of the specimens, which can be accompanied by a change in opaqueness [2]. In Figure 1.1 the crystallised pellets are recognisable due to their different transparency.

The overall image acquisition and processing system is complemented by an on-board image processing system. The on-board system is capable to evaluate – amongst other things – the area in pixels, the mean gray value, the rectangularity, the circularity, and the anisometry of a detected object. In combination with a B&R PLC, additional image processing algorithms can be implemented.

The measurement setup used for acquiring the images is shown in Figure 2.1. A funnel is used to concentrate the falling pellets into the focal plane of the camera.

**Figure 2.1:** Picture of setup for testing.

### **3 Fourier Descriptors augmenting the feature space**

The camera's on-board image parameters showed strong limitations when trying to analyse the produced pellets for production quality. Figure 3.1 shows two pellets which objectively have a very different shape, yet a distinction based on the parameters rectangularity, circularity and anisometry with values of 79/69/13 for the first and 80/69/13 for the second blob is almost impossible.

To overcome those limits the usefulness of Fourier Descriptors to assess the quality of pellets was investigated.

**Figure 3.1:** Images taken of two different pellets. The red line represents the detected contour of the blobs.

**Image Processing** In a first step the gray-scale image acquired by the camera is transformed into a binary image by thresholding the image. With this the regions, also called blobs (binary large objects), that represent pellets can easily be separated from the dark background. After labelling the regions, the boundary of each region is identified. [4] proposes a simple tracing algorithm, that returns the region's boundary as a list of points with their *x*- and *y*-components.

Following [5] the contour is transformed into a complex valued signal

$$s[n] = x[n] + iy[n],\tag{3.1}$$

with *n* = 0, 1, 2, ...*M* − 1 where *M* is the number of contour pixels. The discrete Fourier transform of *s*[*n*] returns the complex valued spectrum

$$\mathcal{S}[n] = a[n] + ib[n],\tag{3.2}$$

the Fourier Descriptors, with *n* = 0, 1, 2, ··· *M* − 1 where *M* is the number of contour pixels.

The component *S*[0] of the nonsymmetric spectrum represents the centre of the contour. As it is irrelevant for the evaluation of the contour's shape we won't consider it for the following calculations.

The remaining spectral components can be interpreted as rotating pointers with different rotational speeds, amplitudes and phases. The components *S*[*k*] and *S*[*M* − *k*], with *k* = 1, 2, 3 ···(*M* − 1)/2 for an uneven *M* and *k* = 1, 2, 3 ··· *M*/2 − 1 for an even *M*, have the same rotational speed but opposite rotational directions and are in the following identified as *k*-th order Fourier Descriptors. For an even *M* there is only a single component of order *M*/2.

For the following analysis we consider only the Fourier Descriptors' absolute value

$$\mathbb{C}[n] = |\mathbb{S}[n]|\,\prime\,\tag{3.3}$$

as it describes the magnitude of each spectral component.

**Exemplification** Figures 3.2 and 3.3 show two different pellets. The red line represents the original contour and the green one the result of inversely transforming only the lowest 4, 20, and 32 (left to right) Fourier Descriptors.

Figure 3.2 shows a pellet with a smooth surface and no appendices. The simple contour implies that higher order Fourier Descriptors have small magnitudes. This can also be observed in the Figure, as the inverse transformation of only the first four descriptors already gives a very close approximation of the original contour. As the number of Fourier Descriptors for the inverse transformation is increased the green contour changes only slightly.

Figure 3.3 shows a pellet with a complex shape due to appendices resulting from a low quality production process.

The complexity implies larger magnitudes in higher order Fourier Descriptors, therefore the inverse transform using higher order Fourier Descriptors provide a better approximation of the contour. This is clearly visible in the figure as an inverse transform of the first four Fourier Descriptors results in an inadequate approximation of the contour. The use of 20 and 32 Fourier Descriptors results in an increasingly better approximation.

**Figure 3.2:** Images of a smooth pellet with its original contour in red and in green the result of inversely transforming the lowest 4, 20, and 32 Fourier Descriptors.

**Figure 3.3:** Images of complex blob with its original contour in red and in green the result of inversely transforming the lowest 4, 20, and 32 Fourier Descriptors.

**Application as Shape-Descriptor** In [6] the absolute values of the single descriptors are used in order to obtain information on the size of the object to be measured. Our goal, however, is not to estimate the size of the object, but rather the smoothness of the surface.

As mentioned above the smoother the surface the lower are the values in the higher order Fourier Descriptors in comparison to lower order ones. Hence, the ratio of lower order Fourier Descriptors to all descriptors would be comparatively large and, for example, for a perfect circle would be 1. Therefore we propose

$$v = \frac{\sum\_{n=0}^{N} \left( \mathbb{C}\_{low} \left[ n \right] \right)}{\sum\_{n=1}^{M-1} \left( \mathbb{C} \left[ n \right] \right)} \tag{3.4}$$

as a measure of the smoothness with *Clow* ∈ *C*. *Clow* is a selection of *N* lowest order Fourier Descriptors.

**Influence of Resolution** The above considerations are made under the assumption of shapes with an infinitely high resolution. In a real world application the resolution is always limited, due to the quantisation carried out by a camera. This quantisation leads to a stepped contour of objects in images, which lowers the their smoothness and leads to larger magnitudes in higher order Fourier Descriptors.

For increasingly lower resolutions – measured in pixels per blob – the influence of the quantisation on the smoothness increases as the steps in the contour become increasingly bigger relative to the contour.

This dependency is depicted in Figure 3.4 which shows the calculated smoothness values for a quantised ideal circle (left) and a quantised ideal square (right) for increasing pixels per blob. The edges of the square were rotated by 45◦ to the edges of the pixels to observe the influence of the quantisation. The four different lines in each plot represent each a different combination of the lowest three order Fourier Descriptors, which will be discussed in the next part.

**Figure 3.4:** Smoothness values for circles (left) and squares (right) with different areas. The different lines represent different combinations of the three lowest order Fourier Descriptors for calculating the smoothness.

It can be observed that in the graph representing the circle, the smoothness value is comparatively low for small resolutions and increases for higher ones. The smoothness of the square on the other hand stays comparatively constant for the range of resolution, when the second order Fourier Descriptors are included. When the first and the first and third order Fourier Descriptors are used the graph shows similar properties to the circle's.

The use of different cameras, the positioning of the camera to the falling pellets and the size of the pellets have an influence on the resolution of the detected pellets images. Consequently, the influence of the resolution on the smoothness have to be taken into account when making assumptions based on the result of Equation 3.4.

**Required order of Fourier Descriptors to calculate smoothness** Depending on the aimed form and the needed accuracy of the detection of faulty pellets different orders of Fourier Descriptors for the nominator of Equation 3.4 will be necessary.

Considering again the graphs in Figure 3.4, it is observable for both objects, that the second order Fourier descriptors have a relatively limited influence on the smoothness value. In both cases the addition of the third order Fourier Descriptors yields a higher change of the smoothness value, as they contain information on the rectangularity of the object. For the circle this is probably due to the quantisation with square pixels.

The above implies that generally the use of first and third order Fourier Descriptors for the calculation of the smoothness will return a high value for smooth pellets, as they will most likely result in a rectangular, circular or oval projection in the images.

If high precision for detection deviations from spherical and ellipsoidal shapes is the goal, the use of only first order Fourier Descriptors is suggested by Figure 3.4, as they contain only a low quadratic share in comparison to e.g. cylindic ones. Any information in the third order Fourier Descriptors will be considered erroneous and will lead to lower smoothness values.

### **4 Results**

In order to evaluate the application of the smoothness value on real images of pellets, we took 30 images of 11 different samples of pellets. The images were taken with the setup in Figure 2.1. Of every detected pellet the smoothness value was calculated. Figure 4.1 shows the result of every sample for different combination of orders of Fourier Descriptors as boxplots. Each boxplot shows the median as red bar, the 16% and 84% percentile as the edges of the blue box and the minimum and maximum value as black bars. Thus, the blue box contains 68% of the detected pellets.

Samples one to nine are roughly the same size with about 4000 pixels per blob. Samples ten and eleven have a size of around 900 pixels per blob. Samples one and two are also depicted in Figure 1.1. Samples two, six and nine are considered bad quality, as they have appendices

**Figure 4.1:** Boxplots of the smoothness values of different samples with a different combination of lower order Fourier Descriptors. The red bars show the median, the upper and lower edges of the blue boxes the 16%- and 84% percentiles and the black bars the maximum and minimum value of the smoothness values for every sample.

or show other kinds of deformations. Sample 11 contains pellets of high and low quality.

As implied by the results of the idealized forms above, the smoothness value shows the best results for the evaluation of production quality with the use of first and first and third order Fourier Descriptors. This is observable in Figure 3.3, as the 84% percentile of the low quality pellets is lower than or close to the 16% percentile of the high quality pellets in the upper two graphs. In the lower two graphs a clear distinction of samples 6 and 9 from the others is not possible.

For samples two and five the best results are obtained using first and third order Fourier Descriptors. This is due to the fact that those samples contain pellets with cylindric shapes.

The above confirms that the best results for the evaluation of a pellets

quality is obtained using first or first and third order Fourier Descriptors for the calculation of the smoothness value.

### **5 Conclusion**

Fourier Descriptors are a very useful tool to analyse the complexity of the contour of a blob. By analysing the information in the lower order Descriptors a single value can be calculated for a rough classification of the production quality. Flaws from the production process, as appendices, can easily be detected, with the proposed value for smoothness.

#### **Acknowledgement**

This work has been supported by the Pro2Future-K1 Center within the framework of the Austrian COMET-K1 Programme.

### **References**


# **Fiber-Coupled MEMS-based NIR Spectrometers for Material Characterization in Industrial Environments**

Robert Zimmerleiter1, Paul Gattinger1, Kristina Duswald1, Thomas Reischer2, and Markus Brandstetter2

<sup>1</sup> RECENDT – Research Center for Non-Destructive Testing GmbH, Altenbergerstraße 69, 4040 Linz <sup>2</sup> Metadynea Austria GmbH, Hafenstraße 77, 3500 Krems an der Donau

**Abstract** Recently emerged near-infrared (NIR) spectrometers based on micro-electromechanical systems (MEMS) are a highly compact, rugged and cost-efficient alternative to infrared spectrometers conventionally used in industrial environments. The majority of the devices currently available are designed for measurements in diffuse reflection geometry in close contact with the sample, with built-in low-power halogen light sources – usually intended for consumer applications. However, for most material characterization applications in the industrial environment such a measurement configuration is neither feasible, nor practical. Using light transmitting optical fibers in combination with a measurement probe and a high power fiber-coupled light source is often preferable. In this contribution we compare various fiber-coupled NIR-spectrometers based on different MEMS technologies and demonstrate the applicability of MEMS spectrometer technology in industrial environments.

**Keywords** NIR Spectroscopy, MEMS Technology, inline monitoring, fiber-coupled, partial least squares regression

### **1 Introduction**

Near infrared (NIR) spectroscopy is one of the most frequently used tools in industry to gain real-time information about the chemical or physical state during production processes. It allows for accurate monitoring and tight process control to achieve maximum process efficiency and product quality. Besides its inline-capability, non-destructive nature and low running costs, it allows to measure multiple process parameters simultaneously through the use of multivariate data analysis techniques [1]. One major obstacle for the implementation of NIR spectroscopy remains the relatively high hardware costs of conventionally used process spectrometers. Modern MEMS-based NIR spectrometers offer a rugged and compact alternative to conventional process spectrometers at a fraction of the price (about 1/10th) without any moving parts, which is an additional advantage in industrial environments. This relatively new technology has already proven capable of replacing conventional spectrometers in certain demanding applications [2–4]. However, most of the MEMS-based NIR spectrometers available today are designed for measurements in reflection geometry using built-in low power halogen light sources. Performance comparisons for the different types of these spectrometers can be found in the literature [5]. While this measurement geometry is suitable for solid samples that can be directly accessed e.g. in handheld measurements, it is often not suitable or practical for industrial applications. Often high power light sources in combination with inline measurement probes connected via light transmitting fibers are necessary or much more convenient.

In this contribution different fiber-coupled MEMS-based spectrometers are compared in terms of covered wavelength range, metrological properties and overall performance. Furthermore, the applicability of MEMS-based spectrometers in an industrial setting is demonstrated. This should shed some light on the upsides and downsides of the different available technologies and measurement principles when used in a fiber-coupled configuration.

### **2 Experimental**

Seven individual ultra-compact fiber-based NIR-spectrometers that use three different MEMS-based technologies for spectral light acquisition were tested. Two spectrometers rely on digital light processing (DLP), one on Fourier transform infrared (FTIR) spectroscopy and four on tunable Fabry-Perot (FP) filter technology. All spectrometers were simply ´ connected via USB to a PC and require no additional power source. For the comparison of these different MEMS-spectrometers, a simple setup was built consisting of a fiber-coupled halogen light source, a sample holder and two low-OH SMA-soupled fibers for connecting the light source and MEMS-spectrometer, respectively. A schematic drawing of the experimental setup is shown in Fig.2.1.

**Figure 2.1:** Schematic drawing of the measurement setup used to compare the different MEMS-based spectrometers. For details see text.

The fiber on the left was kept the same for all measurements (low-OH, 600μm core diameter). The same fiber was used on the right for the MEMS-FTIR and the FP spectrometers, but was exchanged for a 7-to-1 round to linear fiber for the DLP spectrometers (low-OH 200μm core diameter for each fiber). This was done to ensure maximum light coupling into the grating-based DLP-spectrometers which have an entrance slit that be illuminated much more effectively using a linear arrangement of fibers on the spectrometer end. For the measurement of the 100%-lines (Fig.3.1), no sample was inserted and the light of the fiber-coupled light source was measured directly, while for the sample spectrum shown in Fig.3.2, ethanol was put into a Infrasil quartz cuvette with a light path inside the sample of 1mm. Measurement settings for each spectrometer were individually set to get an acquisition time of 2 seconds for each spectrum.

### **3 Results and Discussion**

The setup shown in Fig. 2.1 without any sample cuvette was used to acquire 101 individual spectra for each spectrometer. These spectra were used to calculate 100%-lines *A*100%, where the first measured spectrum *S*<sup>0</sup> is used as background which results in 100 lines *A*100% *<sup>i</sup>* given by:

$$A\_i^{100\%} = \log\_{10}(S\_0) - \log\_{10}(S\_i) \tag{3.1}$$

The resulting 100%-lines for all tested spectrometers are plotted in Fig.3.1. Therein also the root mean square (RMS) of the 100%-lines for each of the spectrometers is indicated to allow for a better performance comparison. The RMS is obtained by averaging all the squared entries for all 100 obtained lines *A*100%:

$$\text{RMS} = \frac{1}{100} \sum\_{i=1}^{100} \left[ \frac{1}{N\_{\lambda}} \sum\_{j=1}^{N\_{\lambda}} \left[ A\_{i}^{100\%} (\lambda\_{j}) \right]^{2} \right]^{1/2} \tag{3.2}$$

where *N<sup>λ</sup>* is the total number of spectral points acquired with the respective spectrometer.

**Figure 3.1:** 100%-lines recorded with the different MEMS NIR spectrometers. The data for some of the spectrometers have been vertically shifted (DLP 1, FP1, FP 3, FP 4) for better visibility. RMS values for the spectrometers are given to allow for easier performance comparison.

The uppermost graph in Fig.3.1 shows the 100%-lines for the two investigated DLP-spectrometers. The first one (DLP 1) covers the wavelength region from 900nm-1650nm which is a very common region for NIR-spectrometers since it is the spectral response range of typical InGaAs detectors. The second DLP device (DLP 2) covers the spectral region from 1350nm-2150nm which requires an extended In-GaAs detector. Also the sizes of both devices differ, with DLP 2 having a size of only 59.5x47.5x24.5mm³ while DLP 1 has the dimensions 75x58x26.5mm³ (≈66% bigger). Due to the measurement principle used in both of these devices, different measurement modes can be used, namely the *Column* mode and *Hadamard* mode. The latter has a multiplex advantage resulting from collecting light from 50% of all available wavelengths at a time using Hadamard patterns on the digital mirror device (DMD) instead of single columns that project only a small wavelength band onto the photodetector. Both measurement modes were tested using the same measurement parameters (resolution, exposure time etc.) and are shown in Fig.3.1 in bright green (Column) and dark green (Hadamard), respectively. It can be seen that the shift to longer wavelengths and the more compact form factor does not come without trade-offs when it comes to noise performance. The RMS of the 100%-line is one or two orders of magnitude smaller for DLP 1 for Hadamard and Column mode, respectively. Interestingly, the multiplex advantage does not affect the performance of DLP 1, in fact the noise performance when measuring in Column mode is even better than in Hadamard mode, suggesting other noise sources that are not influenced by the multiplex advantage e.g. influence of temperature drifts. A clear demonstration of the benefits of multiplexing can be seen in the data for DLP 2, where the detector noise is the main noise source. Here the RMS is about four times better when using Hadamard mode instead of Column mode.

The middle graph visualizes the data obtained with the MEMS-FTIR device. This device covers the broadest spectral range of all tested spectrometers, namely 1100nm-2500nm using an extended InGaAs detector and has the largest size of 76x57x49mm³. Also the FTIR measurement principle used in this device has multiplex advantage, which allows for good noise performance similar to the performance using the multiplexing Hadamard mode with the DLP 2 spectrometer. The larger noise level at wavelengths above 2300nm can be partially explained by the low light intensity in this wavelength range. Not only does the halogen light source (2850K color temperature) emit most of its intensity at shorter wavelengths, but also the used low-OH fibers show significant light absorption for wavelengths longer than 2300nm.

The third kind of tested MEMS-spectrometers relies on tunable FPfilters. This measurement principle results in the smallest wavelength coverage, which is why four individual spectrometer modules are required to cover the wavelength range from 1350nm-2450nm. An advantage of this technology is that it can be realized in an extremely compact housing (25x25x25mm³) and also has the lowest hardware costs. As visible in the bottom graph in Fig.3.1, another upside of this technology is the superior noise performance over the whole wavelength region. All tested FP-spectrometers have RMS values in on the order of 10−<sup>4</sup> or lower similar to the performance of DLP 1, but also at longer wavelengths where commonly lower light intensities are encountered.

To get a better idea about how the measurement technology shapes acquired absorption spectra ethanol was used as a sample since it shows multiple absorption bands across the whole NIR spectral range. As background the measured spectrum without any sample was used. As a reference, the absorption spectra of ethanol measured with a Bruker Vertex 70 benchtop FTIR spectrometer using the same cuvette (without any fibers and using its internal light source) are shown in gray. The acquired data can be seen in Fig.3.2.

Spectra acquired with the individual MEMS-spectrometers overlap very nicely and show the same features at the same spectral positions. Furthermore good agreement with the reference spectrum measured using the benchtop FTIR was achieved. The spectra recorded with the FP-filter spectromters (blue) tend to show flattened spectral features due to the lower spectral resolution of approximately 15nm-30nm, compared to the MEMS-FTIR (5nm-13nm) and the DLP spectrometers (10nm-15nm).

The biggest differences between the recorded spectra can be seen in the wavelength region above 2200nm. Here the FP-based spectrometer strongly underestimates the strength of the absorption of ethanol. However, the main spectral features can still be seen albeit with significantly lower resolution, such as the shoulder at 2350nm resulting from the sharp absorption band at this wavelength visible in both the MEMS-FTIR and benchtop FTIR spectrum. The lack of multiplex ad-

**Figure 3.2:** Absortption spectrum of ethanol recorded with the different MEMS NIR spectrometers (green, red, blue) and a benchtop FTIR (gray). The inset shows a zoom of the wavelength region 1000nm-2150nm.

vantage that comes with the FP-based measurement technique can also cause problems in this wavelength region. Since the light intensity is very low due to low emission from the light source, high absorption from the used fibers and additionally strong absorption from the ethanol sample, the measured signal is rather weak and non-linearities of the used detector or electronics can cause deviations from the real absorption spectrum.

Slight differences can also be seen between the MEMS-FTIR and benchtop FTIR caused by the lower resolution of the MEMS-FTIR which leads to the flattening of the sharp absorption bands around 2300nm. Furthermore, the much lower light intensity at long wavelengths due to the absorption in the used fibers (not present for the benchtop device) does not allow for a good measurement of the ethanol absorption bands above 2400nm. This problem could be solved e.g. by using more expensive zirconium fluoride (ZrF4) fibers to reduce light attenuation.

### **4 Inline Process Monitoring using Fiber-Based MEMS-Spectrometer Technology**

In this section measurement data from an industrial application of a FP filter based MEMS-spectrometer will be presented to demonstrate the potential of MEMS technology to replace commonly used FTIRprocess spectrometers in the industrial environment. As a use-case, NIR-monitoring of a batch-wise melamine formaldehyde resin production plant was chosen. At Metadynea Austria GmbH, the resin production is routinely monitored and controlled in real time using NIR spectroscopy in combination with partial least squares (PLS) regression [6]. The combination of NIR spectroscopy and PLS regression is e.g. used to determine the batch end point. This leads to a significant reduction of necessary offline measurements and helps to avoid out of spec batches.

To demonstrate the usability of FP-filter based MEMS spectrometer technology for the determination of the batch endpoint, the optical fiber was simply switched from the routinely used FTIR spectrometer to a suitable MEMS device and a PLS regression model was calibrated and validated using the data acquired with the FP spectrometer. The data is summarized in Fig.4.1. As can be seen from the data, the measurement quality is very similar for the two different spectrometers. The root mean squared error of prediction (RMSEP), which is an important parameter to judge the measurement performance, is higher for the MEMS spectrometer (0.044) when compared to the FTIR (0.024), but still clearly below the threshold value (0.1) set by the company. This means the low-cost MEMS-based spectrometer can be used to replace the conventional, much larger FTIR spectrometer while still delivering satisfying results.

### **5 Conclusion**

In this contribution, seven individual fiber-coupled MEMS-based NIR spectrometers were compared in terms of their covered wavelength range, resolution and noise performance. The tested MEMSspectrometers based on different measurement technologies, namely DLP, FTIR and tunable FP-filters. Each technology has its advantages

**Figure 4.1:** Data comparison between a conventional FTIR process spectrometer (left) and a low-cost FP spectrometer (right). Normalized values calculated from the PLS regression models are plotted versus offline measurements. Calibration data is shown in gray (FTIR) and blue (FP), respectively. Validation data is shown in red.

and disadvantages. While MEMS-spectromters based on tunable FPfilters tend to have the best noise performance, they also have lower spectral resolution and spectral coverage (300nm-500nm) and lack the multiplex advantage that can be exploited with DLP and FTIR technology. The broadest spectral coverage can be achieved using the FTIR measurement technique, while still maintaining very reasonable noise performance, but at higher hardware costs (about 3x). MEMSspectrometers based on DLP-technology offer a good alternative by offering good spectral coverage of about 800nm while having comparable noise performance to the MEMS-FTIR when the multiplex advantage is exploited by measuring in Hadamard mode. When the right instrument is chosen for the specific application, fiber-based NIR MEMSspectrometers show huge potential to allow for much more cost effective spectroscopic measurement solutions in the industrial environment without significant sacrifices in measurement performance. For this the presented application for inline batch monitoring at a resin production facility is a good example. There it was successfully shown that the narrow band FP devices were a suitable substitute for the conventional broadband FTIR spectrometer.

#### **Acknowledgements**

Financial support was provided by the Austrian research funding association (FFG) under the scope of the COMET programme within the research project "PhotonicSensing for Smarter Processes (PSSP)" (contract no. 871974). This programme is promoted by BMK, BMDW, the federal state of Upper Austria and the federal state of Styria, represented by SFG.

### **References**


# **Improvement of Roughness Measurement in Sub-micron Ranges Using Contrast-based Depolarization Field Components**

Franziska Poller ¨ 1, Felix Salazar Bloise ´ 2, Martin Jakobi1, Jie Dong1, and Alexander W. Koch<sup>1</sup>

<sup>1</sup> Technical University of Munich, Institute for Measurement Systems and Sensor Technology, Theresienstr. 90, 80333 Munich, Germany <sup>2</sup> Universidad Politecnica de Madrid, ETSI Minas y Energ ´ ´ıa, Calle de R´ıos Rosas 21, 28003 Madrid, Spain

**Abstract** The characterization of a surface, especially the roughness, is of great importance for industrial applications. There are already various optical methods for roughness measurement, which usually do not consider the depolarization effects of the scattered beam caused by the sample. This work aims out improving the roughness measurement method described in [1] by taking into account depolarization effects, due to the rough sample. For validation, other instruments are employed to compare the results in the measurement interval for which this technique is applied.

**Keywords** Roughness measurement, optical method, speckle, depolarization, contrast

### **1 Introduction**

Surface roughness characterization is essential in many industrial applications: for monitoring manufacturing processes in the optical industry [2], for determining the roughness of pharmaceutical pills to ensure their effectiveness [3], and for analyzing fatigue in materials [4]. Due to the production of novel structured materials and the need to ensure their functionality without damaging them, the demand for purely optical characterization of roughness in industry is also increasing. Several methods have been provided for determining the surface roughness. Often profile-based methods, such as the stylus method, are used because they provide reliable results [5]. The diamond tip of the stylus instrument, however, may not be useful for soft surfaces like rubbers and some plastics because it destroys the surface under test. Moreover, these techniques are very time-consuming.

There are also optical techniques, such as the white light interferometer [6] or the confocal microscope [7]. One advantage of these optical methods is their accuracy, but they are associated with high acquisition and maintenance costs. This motivates research of alternative optical measuring techniques for roughness determination, which are fast, accurate, and cost-effective. At the same time, non-destructive properties must not be neglected. Most of the different optical procedures for roughness measurements do not take into account depolarization effects in the light-matter interaction process [1], which limits the applicability range of these methods. Other techniques based on light scattered by the rough surface, however, take into consideration changes in polarization, extending their possibilities in application [8, 9]. Another way to determine the roughness is to consider the depolarization processes in interferometric fringe speckle patterns by analyzing two different contrasts in two interferograms corresponding to the respective polarization field components. For this purpose, the interferogram is spatially evaluated in regions. This has already been the subject of previous investigations [10]. The method promises a higher accuracy compared with [1] of the estimated surface roughness and is valid for small roughness values (15 nm < *Rq* < 40 nm), where different industrial applications may be found. To verify the presented method, measurements are carried out for different rough samples and compared with measurement results from a stylus instrument, a white light interferometer, and a laser confocal microscope to verify the roughness values measured with the interferometer. The results are also evaluated concerning the objectives used in the confocal microscope and the white light interferometer. To investigate the current applicability of the method in industry, the suitability of the commercial measuring instruments for the presented technique is also evaluated. The roughness of the samples is measured in each case, and the accuracy and applicability of those methods are analyzed.

### **2 Methodology**

The theory of the method is explained below, before further detailing the operation principle of the roughness measurement by contrastbased depolarization field components.

#### **2.1 Theory**

The presented method is based on two-beam interference of two waves. This superposition results in the typical speckle pattern with the interference fringes. The two averaged intensities in the interferogram, �*I*A� and �*I*B�, visible on the camera can be expressed as follows [10]:

$$\begin{split} \langle I\_{\rm A} \rangle &= I\_{\rm ys} + I\_{\rm ym} + I\_{\rm xs} \\ &+ 2\sqrt{I\_{\rm ys}I\_{\rm ym}} \cdot \exp\left(-\frac{\left(2ku\_{\rm geom}\sigma\right)^{2}}{2}\right) \cdot \cos\left(u\_{\rm geom}kf\left(\mathbf{x},\mathbf{y}\right)\right), \quad \text{(2.1)} \end{split}$$

$$\begin{split} \langle I\_{\rm B} \rangle &= I\_{\rm xs} + I\_{\rm xm} + I\_{\rm ys} \\ &+ 2\sqrt{I\_{\rm xs}I\_{\rm xm}} \cdot \exp\left(-\frac{\left(2ku\_{\rm geom}\sigma\right)^{2}}{2}\right) \cdot \cos\left(u\_{\rm geom}kf\left(x,y\right)\right), \end{split} \tag{2.2}$$

where �*I*A� is the intensity on camera part A and �*I*B� is assigned to camera part B, where the quarter-wave plate is inserted in front of the reference mirror. *I*ys and *I*xs are the *y* component and the *x* component of the total intensity *I*<sup>S</sup> of the rough surface, where the same is assumed for the total intensity on the reference mirror *I*M. *u*geom is the geometrical factor that describes the scattering at the surface, defined here as *u*geom = 2, since the rough surface and the reference mirror are reflective. *k* = <sup>2</sup>*<sup>π</sup> <sup>λ</sup>* corresponds to the wave number with the laser wavelength *λ*, *σ* is the standard deviation of the heights on the surface of the target and *f* (*x*, *y*) defines the shape of the reference, which here corresponds to an inclination around the *x*- or *y*-axis, resulting in parallel interference fringes. The Michelson contrast *C*M, which describes the fringe visibility in the interferogram [1], is composed of the maximum intensity *I*max and the minimum intensity *I*min as follows:

$$\mathcal{C}\_{\text{M}} = \frac{I\_{\text{max}} - I\_{\text{min}}}{I\_{\text{max}} + I\_{\text{min}}}.\tag{2.3}$$

The two different contrasts *C*<sup>A</sup> and *C*<sup>B</sup> in the interferogram of the two camera parts can be determined under the assumption for the total intensity on the reference mirror formula *I*<sup>M</sup> = *Ix*<sup>m</sup> ≈ *Iy*<sup>m</sup> [10]:

$$\mathcal{C}\_{\rm A} \approx \frac{2\sqrt{I\_{\rm ys} I\_{\rm sym}} \exp\left(-\frac{\left(2\mu\_{\rm gecon} k r\right)^{2}}{2}\right)}{I\_{\rm xs} + I\_{\rm ys} + I\_{\rm M}}\tag{2.4}$$

$$\mathcal{C}\_{\mathsf{B}} \approx \frac{2\sqrt{I\_{\mathsf{xs}}I\_{\mathsf{x}\mathsf{m}}} \exp\left(-\frac{\left(2u\_{\text{geom}}kv\right)^{2}}{2}\right)}{I\_{\mathsf{xs}} + I\_{\mathsf{ys}} + I\_{\mathsf{M}}},\tag{2.5}$$

which result from only one experiment due to the quarter-wave plate being with one half inserted into the reference beam and provide the separated information on the camera. This divided information and the independent knowledge of the intensity components *I*xs and *I*ys, where *I*<sup>S</sup> = *I*xs + *I*ys applies, is required for the roughness determination of the sample, since the sample depolarizes the light with *I*xs �= 0. By rearranging the previous formulae the following expression for the standard deviation of the height the object surface *σ* results in [10]:

$$\sigma = \frac{\lambda}{4\pi} \sqrt{\ln \left( \frac{4I\_{\rm M}I\_{\rm S}}{\left(I\_{\rm xs} + I\_{\rm fs} + I\_{\rm M}\right)^2 \left(\mathcal{C}\_{\rm A}^2 + \mathcal{C}\_{\rm B}^2\right)} \right)}\tag{2.6}$$

where *σ* can be equated with the root mean square (RMS) height *Sq* = *σ* [11], which allows a comparison of the measurement results with the interferometer setup and the results of the measurements with the commercial measuring instruments. For the validity of the presented method, the standard deviation *σ* of the height must be Gaussian-distributed on the object surface, the surface itself must be a weak scatterer, i.e., *Rq* < *λ*/4 [12], so that we still find the necessary interference fringes and the speckles are not fully developed, and the rough object surface must be flat.

#### **2.2 Roughness Measurement by Contrast-based Depolarization Field Components**

For the measurements, we use a modified Michelson interferometer with a quarter-wave plate (QWP) set with one half in the reference beam (figure 16.1(a)). For the Ar+-laser, we choose the wavelength *λ* = 488 nm to verify the roughness measurement by contrast-based depolarization field components because it is in the middle of the spectrum of the laser and has a high intensity. To guarantee the initial polarization conditions, i.e., linear polarization of the laser light, a polarizer (PO) is placed in front of the Ar<sup>+</sup> laser. A 50:50 beam splitter (BS) divides the beam into an object and reference path. The laser beam, linearly polarized in the *y*-direction, is directed to the tilted plane mirror (M) in the reference path, where a QWP is placed in one of the two halves, called here part B, whose optical axis is set to 45◦, which enables the required two different polarization states of the reference beam, providing the different information about part A and part B. The rough surface (RS) scatters the laser light, creating different interferometric fringe patterns on both camera parts A and B. The camera (Photonfocus) has a resolution of 1312×1082 pixels (12 bit) with a pixel size of 8 μm × 8 μm. The achromatic lens (AL) and the adjustable aperture (AA) produce a focused image of the object (RS) and the reference (M) on the camera. The magnification of the measuring system is 1.5. To determine the RMS height *Sq* of the rough surface, the roughness measurement by contrast-based depolarization field components consists of three measurements. In the first measurement, we determine the total intensity of the rough surface *I*S, for which only the object path in figure 16.1(a) is considered. The light backscattered from the surface (RS) is recorded by the camera (see figure 16.1(b)) and we can calculate the intensity *I*<sup>S</sup> reflected from the rough surface. In a second measurement, we measure the total intensity of the reference mirror (M) *I*M, assuming approximately equal intensity on both parts of the plane mirror *I*<sup>M</sup> = *Ix*<sup>m</sup> ≈ *Iy*m. To ensure that the intensity of the reference mirror is the same, we place a glass in front of the mirror (in part A) for additional radiation absorption, since the QWP absorbs about 8 % of the radiation, in all directions of propagation. The glass must have the same absorption properties as the QWP. For the measurement, only the reference path in the Michelson interferometer is considered in figure 16.1(a) and the intensity reflected from the mirror (M) is recorded and shown in figure 16.1(b). In the third measurement, both paths of the Michelson setup in figure 16.1(a) are now considered, producing two interference patterns with different contrasts *C*<sup>A</sup> and *C*<sup>B</sup> generated on the camera image (see figure 16.1(b)). These two contrasts are calculated according to equation 2.3 by the Michelson contrast for part A and part B. If now the measured values for *I*S, *I*M, *C*A, and *C*<sup>B</sup> are inserted into equation 2.6, the RMS height *Sq* of the rough surface results.

**Figure 2.1:** Interferometer setup and measuring procedure of the roughness measurement by contrast-based depolarization field components for the surface under test with RMS roughness *Rq*stylus = 28 nm at a wavelength of *λ* = 488 nm.

### **3 Experimental Results**

To verify the suitability of the measurement instruments, the roughness of five different samples is evaluated using the theory explained above. The measurements with the interferometer setup in figure 16.1(a) were performed with the wavelength *λ* = 488.0 nm. Since the Michelson interferometer results (according to equation 2.6) may vary slightly from one part of the surface to another, the *Sq*michelson values from nine different areas for each sample were calculated and averaged. Besides, in order to compare this new technique with the basic methodology shown in [1], a calculation of the roughness without considering the depolarization of the scattered light was performed (*Sq*MWODSL). To validate the proposed procedure, measurements using other instruments were carried out. In addition, these measurements also serve to analyze the suitability of each technique for industrial applications. To this objective, a stylus instrument, a white light interferometer, and a laser confocal microscope were used.

We measured the samples twenty times in different directions using a stylus instrument (SURFCOM FLEX 50 A with a measuring force of 0.75 mN and a stylus tip radius of 2 μm) and then averaged them. The measurements with the white light interferometer (Wyko NT3300) have been performed in the VSI mode [13], i.e., by vertical scanning interferometry. To generate the magnification of a 40X objective, a 20X objective and an additional adjusted filter with a field of view (FOV) of 2 were used, and for the 10X objective, a 20X objective and an additional adjusted filter with a FOV of 0.5 were placed. With the confocal microscope (SENSOFAR PLμ 2300), two different microscope objectives were employed (50X and 100X).

By analyzing the experimental results shown in table 16.1 we find that the new method gives different values (*Sq*michelson) than those obtained by the technique without considering depolarization effects *Sq*MWODSL [1]. These series of values better confirm with some of the other methods employed. The stylus instrument and the confocal microscope (100X) seam to be adequate to compare the results. The Michelson interferometer working without depolarized scattered light (MWODSL) [1] provides values that do almost not change in the interval of the roughness values of samples 1 to 5. On the contrary, the *Sq*michelson varies approximately for S1 to S3 according to the stylus and the confocal microscope (100X). It shows that the method proposed may be a good estimate for small roughnesses up to 40 nm, which is an indicator for considering the depolarization effects of rough surfaces. It is interesting to note the discrepancies of the results among the techniques used. The results vary with the method and also with respect to the magnification of the objectives, then it is necessary to be careful when comparing experimental values. It is clear from table 16.1 that the confocal microscope with a 100X objective, and the stylus (radius of 2 μm), are suited to compare results within the investigated interval. The white light interferometer, even though has a very good resolution, it is not useful, in the context to validate the new technique. Exemplary captured images of sample 2 of the confocal microscope with a 50X objective and of the white light interferometer with a 40X objective are shown in figure 3.1.

**Table 16.1:** Experimental results for five samples (in nm) measured by the stylus profiler, the white light interferometer, the confocal microscope, the Michelson interferometer without considering depolarization of the scattered light (MWODSL), and the interferometer setup considering depolarization effects (*Sq*michelson). The uncertainties of the results and the measuring instruments are also given.


**(a) S 2:** 50X **(b) S 2:** 40X

**Figure 3.1:** Captured images for sample 2 of the confocal microscope with the objective 50X ((a)) and of the white light interferometer with the objective 40X ((b)).

To determine the uncertainties of the experimental results, the expanded uncertainty is calculated [14]:

$$\mathcal{U}\left(Sq\_{\text{michelson}}\right) = k\_{\text{cov}}\mu\_{\mathbb{C}}\left(Sq\_{\text{michelson}}\right) \tag{3.1}$$

with the coverage factor *k*cov = 2 [10] and the combined standard un-

certainty *u*<sup>2</sup> *<sup>c</sup>* (*Sq*michelson) of the method [14]:

$$
\mu\_c^2 \left( S q\_{\text{micheless}} \right) = \sum\_{i=1}^N \left( \frac{\partial f}{\partial \mathbf{x}\_i} \right)^2 \mu^2 \left( \mathbf{x}\_i \right), \tag{3.2}
$$

where *f* corresponds to the RMS height *Sq*michelson of the surface and *u* (*xi*) is the standard uncertainty for each input *λ*, *I*M, *I*S, *C*A, and *C*<sup>B</sup> (see equation 2.6). The uncertainties of the commercial measuring instruments, which are included in table 16.1, are given by the instrument and are determined according to the manufacturer's information.

From the investigation shown, we can conclude that the new technique, which includes depolarization effects, improves the results obtained with the usual Michelson interferometer. The confocal microscope may also be suitable for industrial applications due to its resolution in the same interval (15 nm < *Rq* < 40 nm) but is associated with high acquisition costs of tens of thousands of euros. In the same way, the white light interferometer has great capabilities in a wide interval of roughnesses, but at significant costs. Thus, if only low roughness values of < 40 nm are to be examined on materials, the modified Michelson interferometer may be preferable after a cost estimate.

#### **4 Summary and Outlook**

By the roughness measurement considering depolarization processes in interferometric fringe speckle patterns by analyzing two distinct contrasts in two patterns corresponding to the respective polarization field components, higher accuracy of the estimated surface roughness is possible for small roughnesses in the range of *Rq* = 15 nm to *Rq* = 40 nm. Commercially available measuring instruments, such as the confocal microscope, can be used for this method, and thus the technique can be currently applied in industry. Due to the short measurement time of a few seconds and the increased accuracy by considering the depolarization of the sample, the presented method is suitable for all reflective materials as a cost-effective method for optical characterization of surfaces in industry and as a supplement to (optical) measurement systems for roughness measurement.

### **References**


# **Multimodal OCT Imaging**

Bettina Heise<sup>1</sup> and Ivan Zorin1 and Guenther Hannesschlaeger, and Markus Brandstaetter1

RECENDT- Research Center for Non-Destructive Testing GmbH Altenberger Str 69, 4020 Linz, Austria

**Abstract** We demonstrate OCT as an inspection tool for monitoring and validating 3d printed AM specimens. We will discuss the potential and challenges in this applications. Furthermore, we illustrate extensions like spectroscopy and speckle imaging.

**Keywords** Optical Coherence Tomography, imaging, multimodal, speckles, quality inspection, reliability, additive manufacturing

#### **1 Introduction**

Multimodal imaging has emerged in recent years. Its advantage is the agglomeration of information through image and data fusion, which allows deeper insights into structural, functional or chemical information of samples. In addition, different spatial or temporal scales can be combined. Here, we mainly limit ourselves to imaging by optical coherence tomography (OCT) to extract structural information from samples and to enable optical characterization in a non-destructive and not hazardous way. The combination of OCT with spectroscopic information is a valuable tool, as it provides insights into the chemical and structural composition. Furthermore, with the advent of novel light sources, such as supercontinuum sources; covering spectral ranges from the visible to the near and mid infrared, this range can be facilitated in a very comprehensive way [1]. Additionally, also combinations of spectral probing range of OCT in near infrared and THz imaging are exploitable for advanced material characterization. OCT imaging itself also allows the use of different contrast mechanisms partly based on statistical optics approaches [2]. Therefore, the almost ubiquitous speckles in OCT imaging are not only noise but can also serve as a source of information. Their variance, higher order moments or local autocorrelation can provide information about the dynamics inherent in the sample under investigation [3]. We will present our latest results with a multimodal OCT system combining structural and spectroscopic material characterization. In addition, statistical optics/speckles based characterization for scattering samples with dynamics on different time scales will be presented.

### **2 Methods**

OCT imaging is currently mainly realized in Fourier Domain OCT (FD-OCT) configurations due to its increased sensitivity. Both versions, Spectral Domain OCT (SD-OCT) and Swept Source OCT (SS-OCT) are the most predominant versions nowadays. However, to extend these FD-OCT setting to other spectral ranges is challenging. However such extension offer promising combinations and multi-modal imaging approaches, especially such as spectroscopy or speckle (variance) imaging. We will demonstrate comparatively particular features of FD-OCT in NIR and MIR spectral range. Thereby we can cover a spectral range covering from about 2 to 5 μm currently, beyond the conventional NIR range of 0.8 to 1.5 μm.

### **3 Results**

In particular the inspecting and monitoring additive manufacturing (AM) products and processes already during manufacturing or in intermediate states by OCT seems very promising currently to introduce it with ab industrial background. The 3d printing of layers in micron scales represents a feasible approach for OCT imaging. The 3d printed materials hereby comprehends on one hand polymeric substances, on the other hand also the wide range of ceramics 3D printed specimens is of interest. OCT can enable to monitor in future such printing, or allows to inspect the green parts. That give valuable insights on microdefects already before e.g. sintering processes start. We will demonstrate different OCT settings and results for visualization and inspection of different 3d printed materials exhibiting micro-defects. However, one has also to know the specific features of OCT imaging, where pigments and fillers can modify appearance of cross-sectional images. This can result that defects are mitigated or suppressed in their visibility by scattering sites. It is essential therefore to consider the scattering features of specimens under test to define and select suitable wavelength and probing ranges for OCT imaging and sensing.

**Figure 3.1:** OCT cross-sectional images (B-scans), captured from 3d printed specimens. They represent examples both for (a) 3d filament printed polymer samples and (b) lithographically AM ceramics.

### **4 Acknowledgement**

We acknowledge the funding of this work by the Austrian FFG program within FFG-project DIQACAM Additionally, this work was supported by the strategic economic- and research program Innovative Upper Austria 2020 of the province of Upper Austria and the European Union's Horizon 2020 research and innovation programme under grant agreement No 820661 FLOIM. Furthermore, we would like to thank Lithoz GmbH in Vienna for providing the ceramics test samples.

### **References**

1. I. Zorin, R. Su, B. Heise, B. Lendl, and M. Brandstetter, "Correlative infrared optical coherence tomography and hyperspectral chemical imaging," *J.* *Opt. Soc. Am. A*, vol. 37, no. 9, pp. B19–B26, Sep 2020. [Online]. Available: http://josaa.osa.org/abstract.cfm?URI=josaa-37-9-B19


# **A High Quality Image Stitching Process for Industrial Image Processing and Quality Assurance**

Rolf Hoffmann

Technische Universitat Ilmenau, Group for Quality Assurance and Industrial ¨ Image Processing, Gustav Kirchhoff Platz 2, 98693 Ilmenau, GERMANY

**Abstract** The size of the recording area of a camera is limited. The resolution of a camera image is also limited. To capture larger areas, a wide angle lens can be used, for example. However, the image resolution per unit area decreases. The decreased image resolution can be compensated by image sensors with a higher number of pixels. However, the use of a high pixel number of image sensors is limited to the current state of the art and availability of real image sensors. Furthermore the use of a wide angle lens has the disadvantage of a stronger distortion of the image scene. Also the viewing direction from a central location is usually problematic in the outer areas of a wide angle lens. Instead of using a wide angle lens, there is still the possibility to capture the large image scene with several images. This can be done either by moving the camera or by using several cameras that are positioned accordingly. In case of multiple image captures, the single use of the required image is a simple way to evaluat e a limited area of a large image scene with image processing. For example, it can be determined whether a feature limited by the size is present in the image scene. The use of this simple variant of a moving camera system or the use of single images makes it difficult or even impossible to use some image processing options. For example, determining the positions and dimensions of features that exceed a single image is difficult. With moving camera systems, the required mechanics add to the effort, which is subject to wear and tear and introduces a time factor. Image stitching techniques can reduce many of these problems in large image scenes. Here, single images are captured (by one or more cameras) and stitched together to fit. The original smaller single images are merged into a larger coherent image scene. Difficulties that arise here and are problematic for the use in industrial image processing are, among others: the exact positioning of the single images to each other and the actual joining of the imag es, if possible without creating disturbing artifacts. This publication is intended to make a contribution to this.

**Keywords** Image processing, lens, image stitching, large image scene

### **1 Introduction**

With a single camera system, capturing a scene is limited in the size of the scene and the resolution of the image. Furthermore, the image may contain distortion and perspective errors that become larger toward the edge of the shot and are especially prevalent with wide angle lenses. Multi camera array systems (Fig. 1.1) can reduce such problems. The difficulty, however, is to combine the individual images from the multi camera system (Fig. 1.2) into a complete image in high quality and without errors. Typical stitching algorithms have problems if the individual images show major distortions, depict different perspectives or do not have exactly the same brightness [1]. A typical stitching result can be seen in Fig 1.3. The idea presented here is to enhance the individual images before the actual stitching process. Th is creates better preconditions to enable a high quality stitching process.

### **2 Homogeneity Correction**

A problem that can occur with the single images is that within an image a non-uniform illumination can occur (Fig. 2.1). This happens, for example, due to uneven illumination or vignetting.

Proposed solution: Determine the inhomogeneity of the brightness distribution and correct it in LAB color space. Since the brightness distribution is different in different brightness scenarios, it must be

**Figure 1.2:** Single images from multi camera array system.

**Figure 1.3:** Not optimal stitching result.

**Figure 2.1:** Uneven brightness distribution due to uneven illumination and vignetting.

**Figure 2.2:** Determined brightness distributions of a neutral white body in bright, medium and dark scenarios.

determined for bright, medium and dark scenarios (Fig. 2.2). This is done by means of a neutral white body.

To correct a single image (Fig. 2.3), the brightness scenario is first determined by determining the average brightness. This determines the corresponding correction data set to be used from the neutral white body (Fig. 2.4). By addition/subtraction, the uneven brightness distribution of the image can be improved in the brightness channel L of the LAB color space (Fig. 2.5). In addition, the overall brightness of the image can be brought to a uniform brightness level for all indi vidual images of the multi camera system at the same time in this step.

#### **3 Distortion and Perspective Correction**

Another major problem with stitching is the distortion of the individual images (Fig. 3.1). Likewise, different perspectives can already occur within an image, especially at the edges compared to the center of the image. In industrial image processing, systematic distortion correction


**Figure 2.3:** Image acquisition with uneven brightness distribution.

**Figure 2.4:** Selected correction data set from the neutral white body according to the determined brightness scenario.


**Figure 2.5:** Brightness distribution corrected with the correction data set.

can be determined and compensated. This also performs a perspective correction at the same time.

Proposed solution [2]: An areal correction is calculated by means of polynomial interpolation. Deviations from the ideal position can

**Figure 3.1:** Distorted image that may also contain perspective errors.

**Figure 3.2:** Decomposition of the total area into smaller areas to be corrected.

**Figure 3.3:** Distortion and perspective errors minimized by correction.

be determined and subsequently corrected by grid shap ed support points. To enable a highly accurate correction, many grid points are necessary. In order to handle such a system better, it is advisable to divide the total area into individual smaller areas (Fig. 3.2).

Support points are arranged in the grid to form *n* columns and *m*

rows. Distorted actual values (*xk*, *yk*)*k* ∈ (0, *l*), for *l* + 1 interpolation points are determined by image processing. Associated ideal coordinates (*xSoll*,*k*; *ySoll*,*k*) are known. Place polynomials of the (*n* − 1)th or (*m* − 1)th degree over interpolation points.

$$\chi\_{Ideal,k}(\chi\_k, \chi\_k) = \sum\_{i=0}^{n+1} \sum\_{j=0}^{m+1} a\_{ij} \ast \chi\_k^{(\ell)} \ast \chi\_k^{(j)} \qquad k \in (0;l) \tag{3.1}$$

$$\varphi\_{\text{ideal},k}(\mathbf{x}\_k, \mathbf{y}\_k) = \sum\_{\ell=0}^{n-1} \sum\_{j=0}^{m-1} b\_{ij} \* x\_k^{(\ell)} \* y\_k^{(j)}, \qquad k \in (0; \bar{\ell}) \tag{3.2}$$

System of equations in matrix notation:

$$
\begin{pmatrix}
\dot{X}\_{t:t:t-1} \\
\dot{x}\_{t:t-1} \\
\vdots \\
\dot{x}\_{t:t-1}
\end{pmatrix} = 
\begin{pmatrix}
1 & \dot{y}\_0 \dots \dot{y}\_a^{m-1} & \dot{x}\_0 \dots \dot{x}\_0^{m-1} & \dot{x}\_0 \gamma\_a & \mathbf{x}\_0^{-2} \mathbf{y}\_a^{m-1} \dots \dot{x}\_0^{m-1} \mathbf{y}\_a^{m-1} \\
1 & \dot{y}\_1 \dots \dot{y}\_a^{m-1} & \dot{x}\_1 \dots \dot{x}\_1^{m-1} & \dot{x}\_1 \gamma\_1 & \mathbf{x}\_1^{-2} \mathbf{y}\_a^{m-2} \dots \mathbf{y}\_a^{m-1} \mathbf{y}\_a^{m-1} \\
\vdots & \vdots & \vdots & \vdots \\
1 & \dot{y}\_1 \dots \dot{y}\_a^{m-1} & \mathbf{x}\_1 \dots \mathbf{x}\_a^{m-1} & \mathbf{x}\_1 \mathbf{y}\_a & \mathbf{x}\_1^{-2} \mathbf{y}\_a^{m-2} \dots \mathbf{y}\_a^{m-1} \mathbf{y}\_a^{m-1}
\end{pmatrix} \cdot \begin{pmatrix}
a\_{00} \\
a\_{10} \\
\vdots \\
a\_{m-1,m-1}
\end{pmatrix} \\
\times \begin{pmatrix}
a\_{00} \\
a\_{10} \\
\vdots \\
a\_{m-1,m-1}
\end{pmatrix} \\
\times \begin{pmatrix}
a\_{00} \\
\vdots \\
a\_{0,m-1}
\end{pmatrix}
\end{pmatrix} \tag{3.3}
$$

$$
\begin{pmatrix}
\mathcal{Y}\_{\text{ideal},n} \\
\vdots \\
\mathcal{Y}\_{\text{ideal},1} \\
\vdots
\end{pmatrix} = \begin{pmatrix}
1 & \mathcal{Y}\_0 \dots \mathcal{Y}\_0 \mathcal{Y}\_0^{m-1} & \langle x\_0 \dots \mathcal{x}\_0^{m-1} \mid x\_0 \mathbf{y}\_0 \dots \mathbf{x}\_0^{m} \mathbf{y}\_0^{m-1} \dots \mathbf{x}\_0^{m-1} \mathbf{y}\_0^{m-1} \\
1 & \mathcal{Y}\_1 \dots \mathcal{Y}\_1 \mathcal{Y}\_0^{m-1} & \langle x\_1 \dots \mathcal{x}\_1^{m-1} \mid x\_0 \mathbf{y}\_1 \dots \mathbf{y}\_0^{m} \mathbf{y}\_1^{m-1} \dots \mathbf{x}\_0^{m-1} \mathbf{y}\_1^{m-1} \\
1 & \vdots & \vdots & \vdots & \vdots \\
1 & \mathcal{Y}\_1 \dots \mathcal{Y}\_1 \mathbf{y}\_1^{m-1} & \langle x\_1 \dots \mathbf{x}\_1^{m-1} \mid x\_1 \mathbf{y}\_1 \dots \mathbf{x}\_1^{m} \mathbf{y}\_1^{m-1} \dots \mathbf{x}\_1^{m-1} \mathbf{y}\_1^{m-1} \end{pmatrix} + \begin{pmatrix}
b\_{00} \\
b\_{10} \\
b\_{11} \\
b\_{n-1\mathbf{m}-1} \\
b\_{n-1\mathbf{m}-1}
\end{pmatrix} \\ \tag{3.4}$$

### **4 Image Stitching**

The stitching process requires multiple source images that have overlapping areas. Due to the previous corrections of the individual images (Fig. 4.1), the conditions are very good for obtaining a very good stitching result [3]. Fig 4.2 shows the high quality result of the stitching process with OpenCV.

**Figure 4.1:** Single images optimized by previous corrections serve as the basis for the subsequent stitching process.

**Figure 4.2:** High quality result of the stitching process.

### **5 Results and Conclusions**

Camera arrays can be used to inspect large areas. The individual images from all cameras must be combined to form one large overall image. This stitching process can be significantly improved if each individual ima ge is enhanced before stitching. In this paper, homogeneity correction and distortion and perspective correction are demonstrated before the stitching process.

### **6 Acknowledgement**

The research work presented in this paper is part of the research support program "European Regional Development Fund" (ERDF), Thuringia, Germany

### **References**


# International Conference on Optical Characterization of Materials

Each material has its own specific spectral signature independent if it is food, plastics, or minerals. New trends and developments in material characterization have been discussed as well as latest highlights to identify spectral footprints and their realizations in industry.

# Conference topics


Beyerer | Längle (Eds.) CONFERENCE PROCEEDINGS

OCM 2021

**Spectral Sensors**

The International Conference on Optical Characterization of Materials (OCM-2021) was organized by the Karlsruhe Center for Spectral Signatures of Materials (KCM) in cooperation with the German Chapter of the Instrumentation & Measurement Society of IEEE.

KCM is an association of institutes of the Karlsruhe Institute of Technology (KIT) and the business unit Automated Visual Inspection of the Fraunhofer Institute of Optronics, System Technologies and Image Exploitation (Fraunhofer IOSB).

ISSN 2510-7240 ISBN 978-3-7315-1081-9

Gedruckt auf FSC-zertifiziertem Papier