## Abstract

To achieve optimal results, femtosecond laser machining requires precise control of system variables such as Regenerative Amplifier Divider, Frequency, and Laser Power. To this end, two regression models, multi-layer perceptron (MLP) regression and Gaussian process regression (GPR), were used to define the complex relationships between these parameters of the laser system and the resulting diameter of a dimple fabricated on a 304 stainless-steel substrate by a 0.2-second laser pulse. To quantify dimple diameter accurately and quickly, machine vision was implemented as a processing step that introduced minimal error. Both regression models were trained on datasets containing 300, 600, 900, and 1210 data points to assess the effect of dataset size on training time and accuracy. Results showed that the GPR was approximately six times faster than the MLP model for all of the datasets evaluated. The GPR model accuracy stabilized at approximately 20% error when using more than 300 data points and training times of less than 5 s. In contrast, the MLP model accuracy stabilized at roughly 33% error when using more than 900 data points and training times ranging from 30 to 40 s. It was concluded that GPR performed much faster and more accurately than MLP regression and is more suitable for work with femtosecond laser machining.

## 1 Introduction

Micro-dimples play a significant role in surface properties including friction [1], wettability [2], lubricant retention [3], heat transfer [4], and antibacterial properties [5]. However, traditional fabrication processes for micro textures, including chemical etching and injection molding [6], are time-consuming and have material limitations, making investigation of micro dimple arrays difficult. Femtosecond laser machining has emerged as a promising one-step process that is more compatible with a variety of materials. Commercial femtosecond lasers can make dimples and textures in the range of tens of microns [7–10].

Despite the advantages of femtosecond laser machining, many of the effects of laser parameters, such as pulse duration, laser wavelength, and frequency, on the fabricated surface topographies are not sufficiently characterized [11]. Furthermore, femtosecond laser machining is a complex process with dynamic, interacting relationships between the parameters and the resulting fabricated surface topography [12]. Although general trends describing the effects of the laser parameters have been found (high pulse energy leads to melting [9], increased repetition rates affect ablation depth [7], and ablation volume increases with more laser pulses [13]), the complexity of these interactions means that tailoring the laser parameters can require days of active work to obtain the ideal settings for a single application.

Machine-learning (ML) methods have been investigated as a potential solution for automatic process control and predictive visualization of femtosecond laser behavior. Various techniques including genetic algorithms, clustering methods, reservoir computing, and neural networks have been used to predict the shape of light pulses generated for femtosecond laser optimization [14–16]. However, few studies focus on the effects of laser parameters on material removal. Using images of fabricated dimples, one study trained a convolutional neural network (CNN) to predict and optimize dimple depth and crown height on grey cast iron material using the repetition rate, pulse energy, and the number of pulses of a femtosecond laser [17].

CNNs are computationally complex because they store images as tensors to account for the spatial relations between points in the images. The computational complexity required for regression can be drastically reduced by using machine vision to analyze key features of interest in the input images and store them in vectors for inputting into ML models. This paper presents such a novel approach. Machine vision was used to extract dimple diameters from images of laser-textured dimples, and these diameters, rather than the images themselves, were supplied to the ML models, effectively reducing the computational complexity.

The multi-layer perceptron (MLP) regression model is similar to a CNN in function but only needs to represent the images as vectors for the input instead of tensors. MLP is a widely adopted ML model in scientific and engineering research due to its simplicity and ease of implementation. While MLP is known to be robust to small data noise or fluctuations, it cannot provide a quantifiable estimate of the uncertainty of its predictions. In contrast, Gaussian process regression (GPR) provides posterior distributions over its predictions and is thus more advantageous for applications where uncertainty quantification is important. Given this, we hypothesize that nonparametric regressors like GPR models would handle experimental uncertainties and noise in laser machining applications much better than MLP. This study aims to compare the effectiveness of nonparametric GPR models with that of parametric MLP models in order to verify that the nonparametric GPR model is ideal for analyzing dimple fabrication with a femtosecond laser. The focus is on the investigation of the effects of laser system variables (Regenerative Amplifier (RA) Divider, Frequency, and Laser Power) on the diameter and circularity of dimples fabricated on polished 304 stainless steel using a single 0.2-second pulse. We demonstrate the integration of machine vision with machine learning in laser manufacturing applications and show the advantages of GPR over MLP in predicting experimental data, thus highlighting the importance of utilizing nonparametric methods for experimental work. The trained model not only leads to faster and more efficient data processing but can also be used inversely to design manufacturing parameters.

## 2 Methods

### 2.1 Integrated Experimentation, Machine Vision, and Machine-Learning Workflow.

Figure 1 shows the workflow to integrate experimentation, machine vision, and machine learning for femtosecond laser machining. First, experimentation is conducted as depicted in Fig. 1(a), where dimples are fabricated using a femtosecond laser machining setup followed by taking microscopic images of the dimples. Following this, a machine vision script is used to analyze the dimple images to extract the dimple diameters, as shown in Fig. 1(b). The diameter values extracted through machine vision are then used as the training data for developing the machine-learning models to predict the dimple diameters from the input laser parameters, as shown in Fig. 1(c).

### 2.2 Sample Fabrication.

Sample fabrication begins with polishing 1-in. by 1-in. 304 stainless-steel samples to a mirror finish with an average roughness value of 0.05 *µ*m. Micro-dimples are then fabricated using a femtosecond laser micromachining system (Oxford Laser A5 Femtosecond, Oxford Lasers Ltd., Didcot, UK). This system uses a Ytterbium-doped Potassium Gadolinium Tungstate (Yb:KGW)-based femtosecond laser, which operates at a wavelength of 515 nm, as schematically represented in Fig. 1(a). The laser pulse length is below 290 fs, the base repetition rate ranges from 60 to 1000 kHz, and the maximum power is 2.6 W.

To evaluate experimental uncertainty, multiple arrays (as depicted in Fig. 2) were fabricated at each laser power. All dimples were fabricated with a 0.2-second laser pulse. The array frequency values range from 600 to 1000 kHz in 50 kHz increments, while the RA divider values span from 400 to 6000, increasing in steps of 400. The laser powers used for fabricating the arrays vary from 5% to 25% of the overall laser power (2.6 W), in 5% increments. These parameters are summarized in Table 1. The values were chosen to minimize the occurrences of under- or over-exposure.
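The parameter ranges above can be enumerated in a few lines. The following is a minimal Python sketch, assuming a full-factorial combination of the three variables (the text does not state whether every combination was fabricated); the names `parameter_grid` and `power_in_watts` are illustrative and do not come from the original work.

```python
from itertools import product

MAX_POWER_W = 2.6  # maximum laser power stated in Sec. 2.2

# Parameter ranges as stated in the text
frequencies_khz = list(range(600, 1001, 50))  # 600 to 1000 kHz, 50 kHz steps
ra_dividers = list(range(400, 6001, 400))     # 400 to 6000, steps of 400
power_percents = list(range(5, 26, 5))        # 5% to 25%, 5% steps


def power_in_watts(percent):
    """Convert a percentage of the maximum laser power to watts."""
    return MAX_POWER_W * percent / 100.0


# Assumption: one (power %, frequency, RA divider) tuple per unique setting
parameter_grid = list(product(power_percents, frequencies_khz, ra_dividers))

print(len(frequencies_khz), len(ra_dividers), len(power_percents))
print(len(parameter_grid))
```

Under this assumption the grid holds 9 × 15 × 5 = 675 unique settings, and the lowest power level, 5%, corresponds to 0.13 W.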

Over-exposure to energy during femtosecond laser machining can lead to effects like melting and the formation of heat-affected zones (HAZs), as illustrated in Fig. 3. Under-exposure of the material results in low-energy regimes where areas either do not ablate or ablate unevenly. These effects are undesirable because the control afforded by direct ablation is the primary goal of the fabrication method and consistency is critical. Dimples affected by under- or over-exposure are removed from the dataset during the machine vision step based on calculations determining the ideality of the dimple geometry.

After ablation, the substrates were cleaned of debris by sonication in acetone, as well as by isopropanol and water rinses. High-resolution images of all dimples were taken using a 3D laser scanning confocal microscope (VK-X260K, Keyence Corporation of America, Itasca, IL) with a 100× magnification lens. Figure 4 depicts the locations on a 20× Keyence microscope image where the 100× images were taken on a 20% Laser Power parameter array. The 100× magnification improves the accuracy of the dimple images for the study while capturing six dimples in one image, reducing the time required for image capture and script execution.

### 2.3 Data Processing—Imaging and Machine Vision.

Images from the laser scanning confocal microscope were processed using a machine vision script designed in MATLAB to extract the diameter values of the dimples. The visualization of this process is shown in Figs. 1(a) and 1(b).

The dimple diameter, *d*, shown in Fig. 1(b), was calculated using the area of a circle obtained from MATLAB according to Eq. (1)

$$d = 2Q\sqrt{\frac{a}{\pi}} \tag{1}$$

where *a* is the area of the dimple, in pixel^{2}, and *Q* is the ratio of microns to pixels obtained from the scale bars of the Keyence images using the ImageJ processing program. This ratio, 215 pixels to 20 *µ*m, facilitates the conversion of values from pixels to microns. Simultaneously, the circularity, *c*, was calculated using Eq. (2)

$$c = \frac{4\pi a}{p^{2}} \tag{2}$$

where *a* is the area of the dimple, in pixel^{2}, and *p* is the perimeter, in pixels. Circularity defines how circular an object is and ranges from 0 to perfect circularity at 1.0. Areas where the laser failed to form a dimple or where dimples with melting or HAZs occurred should exhibit extremely low circularity. Dimples with circularity values below 0.9 were excluded from the dataset, as these outliers were beyond the scope of our work, which primarily concentrates on dimple size, as quantified by diameter in this case, and not on dimple ideality. Dimple size is highly relevant to the understanding of the effects of surface texturing as it relates directly to key features like contact area. After exclusions, a total of 1210 data points remained for analysis.
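The diameter and circularity calculations can be illustrated with a short Python sketch. It assumes the equivalent-circle diameter, d = 2Q·sqrt(a/π), and the common circularity definition, c = 4πa/p², together with the stated scale of 215 pixels per 20 µm and the 0.9 exclusion threshold. The function names are hypothetical; the original script was written in MATLAB and is not reproduced here.

```python
import math

MICRONS_PER_PIXEL = 20.0 / 215.0  # Q: 215 pixels correspond to 20 microns
CIRCULARITY_THRESHOLD = 0.9       # dimples below this value were excluded


def dimple_diameter_um(area_px2, q=MICRONS_PER_PIXEL):
    """Equivalent-circle diameter in microns from a region area in pixel^2."""
    return 2.0 * math.sqrt(area_px2 / math.pi) * q


def circularity(area_px2, perimeter_px):
    """4*pi*a / p^2: equals 1.0 for a perfect circle, lower for irregular shapes."""
    return 4.0 * math.pi * area_px2 / perimeter_px ** 2


def keep_dimple(area_px2, perimeter_px, threshold=CIRCULARITY_THRESHOLD):
    """Apply the circularity filter used to build the dataset."""
    return circularity(area_px2, perimeter_px) >= threshold


# Ideal circle with a 20-pixel radius
circle_area = math.pi * 20.0 ** 2
circle_perimeter = 2.0 * math.pi * 20.0
print(dimple_diameter_um(circle_area), circularity(circle_area, circle_perimeter))

# A 40-pixel square has circularity pi/4 and would be excluded
print(circularity(1600.0, 160.0), keep_dimple(1600.0, 160.0))
```

The square case shows why the threshold removes non-round features: its circularity of about 0.785 falls well below 0.9 even though its area is comparable to that of the circle.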

Standard practice for measuring dimples when designing parameters for surface textures is to take measurements by hand using the profile tools in imaging software. This process can take a minute per dimple. Although these measurements are susceptible to human error and each dimple can vary by tenths of microns, it is the most accurate measurement method available. The ultimate goal of incorporating machine vision is to reduce the computational complexity of the input to the regression model and to minimize the amount of memory needed. Traditionally, ML models are trained on images using convolutional methods and the images are formatted as arrays of data points. However, this approach necessitates storing all the images and adds complexity to the training process due to the array of values involved. In contrast, training on data simplified through machine vision processing streamlines the process and eliminates the need for long-term storage of images. However, for the implementation of machine vision methods to be practical, it needs to be faster than hand measurements and accurate to the same degree.

### 2.4 Machine-Learning Model Architectures.

Both ML methods used take identical inputs: frequency, RA divider, and laser power. The output values of the models were diameter values of dimples. Data inputs were randomized to ensure variability each time the regression models were executed, maintaining consistent data splits of 20% for testing, 16% for validation, and 64% for training, regardless of the overall number of data points used. All training times discussed in this paper are based on the models run on an i5-7200U CPU, a pragmatic decision based on the computational limitations of most computers used with femtosecond lasers.
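The 64/16/20 split described above can be sketched as follows. This is a minimal illustration rather than the script used in the study; the helper name `split_dataset` and the optional `seed` argument are assumptions added for reproducibility, whereas the study reshuffled the data on every run.

```python
import random


def split_dataset(points, seed=None):
    """Shuffle and split into 64% training, 16% validation, 20% testing."""
    rng = random.Random(seed)
    shuffled = list(points)
    rng.shuffle(shuffled)

    n = len(shuffled)
    n_test = round(0.20 * n)
    n_val = round(0.16 * n)

    test_set = shuffled[:n_test]
    val_set = shuffled[n_test:n_test + n_val]
    train_set = shuffled[n_test + n_val:]  # remainder, about 64%
    return train_set, val_set, test_set


# Indices stand in for the 1210 (inputs, diameter) records
train_set, val_set, test_set = split_dataset(range(1210), seed=0)
print(len(train_set), len(val_set), len(test_set))
```

For the full 1210-point dataset this yields 774 training, 194 validation, and 242 testing points; passing `seed=None` gives a different shuffle on each call.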

In the MLP model, the value of each neuron is calculated according to Eq. (3)

$$N_{i,j} = \sum_{k} w_{i,j,k}\, x_{i-1,k} + B_{i,j} \tag{3}$$

where *N*_{i,j} is the individual neuron, *w* is the weight value applied to a neuron, *x* is the output from a previous neuron, and *B* is an applied bias; the index *i* is representative of the layer, *j* designates which neuron is being called from that layer, and *k* runs over the neurons of the previous layer. The MLP model used for this study has one input normalization layer followed by six dense layers, each with 32 neurons and Rectified Linear Unit (ReLU) activation functions, and a final linear output layer. The ReLU functions return the result according to Eq. (4)

$$\mathrm{ReLU}(N_{i,j}) = \max\left(0, N_{i,j}\right) \tag{4}$$

which returns 0 if *N*_{i,j} is negative and returns *N*_{i,j} if it is positive. This ensures only the positive neuron values are retained. The specific activation functions and weights are determined by iterations with a training dataset. The benefit of this method for modeling is that many hidden layers can be included to model complex relationships. The MLP model uses “adam” as the optimization method, mean absolute error as the loss value, and the standard built-in learning rate. The model was executed with a batch size of 5 and without early stopping for 100 epochs. Training this simple MLP model on the full dataset of 1210 data points takes approximately 40 s.
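To make the layer structure concrete, the sketch below implements a forward pass through a network of the stated shape (three inputs, six ReLU layers of 32 neurons, one linear output) in plain Python. It is an illustration only: the weights are random rather than trained, the input normalization layer is omitted, and the helper names are hypothetical. The study's own implementation and framework are not specified beyond the settings listed above.

```python
import random

# 3 inputs, six hidden layers of 32 neurons, 1 linear output
LAYER_SIZES = [3, 32, 32, 32, 32, 32, 32, 1]


def relu(value):
    """Return 0 for negative inputs and the input itself otherwise."""
    return max(0.0, value)


def init_layers(sizes, seed=0):
    """Create (weights, biases) per layer with small random weights (untrained)."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def forward(layers, inputs):
    """Propagate one (already normalized) input vector through the network."""
    activations = list(inputs)
    for index, (weights, biases) in enumerate(layers):
        # Weighted sum of the previous layer's outputs plus a bias, per neuron
        sums = [sum(w * x for w, x in zip(row, activations)) + b
                for row, b in zip(weights, biases)]
        is_output_layer = index == len(layers) - 1
        activations = sums if is_output_layer else [relu(s) for s in sums]
    return activations[0]


def count_parameters(layers):
    """Total number of weights and biases across all layers."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in layers)


layers = init_layers(LAYER_SIZES)
print(count_parameters(layers))            # trainable weights and biases
print(forward(layers, [0.0, 0.5, -0.5]))   # untrained output, not meaningful
```

A network of this shape has 5441 trainable weights and biases, which gives a sense of how many values the optimizer must fit from at most 1210 data points.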

The GPR model describes the predicted diameter, *f*(*x*), as a Gaussian process defined by a mean function and a kernel function, as given in Eq. (5)

$$f(x) \sim \mathcal{GP}\left(\mu(x),\, K(x)\right) \tag{5}$$

where the mean function, *μ*(*x*), represents the expected value of the diameter based on the parameter inputs; this is found using the training data. The kernel function, *K*(*x*), is the covariance of this system, which characterizes the relationships between different input combinations in the GPR model. Specifically, it quantifies the similarity or dependency between data points based on their parameter inputs. This function, given in Eq. (6), is a combination of a Radial Basis Function (RBF) kernel and a White Kernel. The RBF kernel, also known as the squared exponential kernel, is a function of the Euclidean distance of the points from the mean of the dataset

$$K(x) = C_{1}\exp\left(-\frac{d(x_{i}, x_{j})^{2}}{2l^{2}}\right) + C_{2}W \tag{6}$$

where *C*_{1} and *C*_{2} are constants used for the weight values, *l* is the length scale of the function, which was set to be 50.0, *d*(*x*_{i}, *x*_{j}) is the Euclidean distance between points, and the *x* values are vectors of the input variables used to calculate the Euclidean distance. The length scale of the RBF Kernel controls the smoothness of the model to match the input data. *W* is the noise component of the covariance function, which is a normal distribution centered around the mean, as shown in Eq. (7)

$$W \sim \mathcal{N}\left(\mu, \sigma^{2}\right) \tag{7}$$

where *µ* is the mean value of the normal distribution and *σ* is the standard deviation of the normal distribution.

The variance for the model, *σ*^{2}, was set to be 1.5 to accommodate experimental data noise. GPR models consider multiple possible functions to explain the relationship between input variables and output values. By doing this, the GPR model considers different patterns and variations that may exist in the data. This probabilistic approach not only makes GPR more effective on smaller datasets but also makes it better at approximating experimental data than traditional parametric approaches, such as the MLP method. The noise level used in the final model was 5.0, in order to better account for noise introduced from inconsistencies in the laser-material interactions. The GPR kernel is trained iteratively, similar to MLP, by altering weight values within each of the kernels in the overall model kernel. Training time for this model on the 1210 data-point set was 7 s.
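The covariance structure described above, an RBF term plus a white-noise term, can be sketched in plain Python. The length scale (50.0) and noise level (5.0) are taken from the text; the weights `C1` and `C2` are not reported, so both are set to 1.0 purely as an assumption. The noise is added on the diagonal only, which is the usual white-kernel convention and is likewise assumed here. Whether the inputs were scaled before the distance calculation is not stated, so raw values are used.

```python
import math

LENGTH_SCALE = 50.0  # RBF length scale stated in the text
NOISE_LEVEL = 5.0    # white-noise level stated in the text
C1 = 1.0             # assumed weight of the RBF term (not given in the text)
C2 = 1.0             # assumed weight of the noise term (not given in the text)


def euclidean_distance(x_i, x_j):
    """Euclidean distance between two input vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x_i, x_j)))


def rbf(x_i, x_j, length_scale=LENGTH_SCALE):
    """Squared exponential kernel: close to 1 for nearby points, 0 for distant ones."""
    d = euclidean_distance(x_i, x_j)
    return math.exp(-d ** 2 / (2.0 * length_scale ** 2))


def covariance_matrix(points):
    """C1 * RBF for every pair, plus C2 * noise on the diagonal only."""
    n = len(points)
    matrix = [[C1 * rbf(points[i], points[j]) for j in range(n)] for i in range(n)]
    for i in range(n):
        matrix[i][i] += C2 * NOISE_LEVEL
    return matrix


# Inputs: (frequency in kHz, RA divider, laser power in %), unscaled
points = [(600.0, 400.0, 5.0), (650.0, 400.0, 5.0), (600.0, 800.0, 5.0)]
for row in covariance_matrix(points):
    print([round(value, 4) for value in row])
```

With a length scale of 50.0, two settings that differ by one 50 kHz frequency step remain noticeably correlated, while settings one RA divider step (400) apart are treated as essentially independent. This illustrates how strongly the length scale choice interacts with the units of the inputs.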

All models were evaluated using the complete dataset of 1210 points, as well as smaller datasets of 300, 600, and 900 points randomly selected from the larger set, in order to evaluate the tool’s accuracy with fewer data points. Model accuracy was assessed using the total Root Mean Square Percentage Error (RMSPE) and the coefficient of determination, *R*^{2}. These values were computed by comparing the dimple diameter values predicted by the ML models against the machine vision measurements from the testing dataset. Each model was executed 10 times per dataset, allowing the calculation of the average and standard deviation of RMSPE and *R*^{2}.
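The two accuracy metrics can be written out explicitly. The sketch below assumes the standard definitions, RMSPE = 100·sqrt(mean(((ŷ − y)/y)²)) and R² = 1 − SS_res/SS_tot, and uses the sample standard deviation when summarizing repeated runs; the text does not state which variants were used, so these are assumptions.

```python
import math
from statistics import mean, stdev


def rmspe(actual, predicted):
    """Root mean square percentage error, in percent."""
    squared = [((p - a) / a) ** 2 for a, p in zip(actual, predicted)]
    return 100.0 * math.sqrt(sum(squared) / len(squared))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    average = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - average) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def summarize(scores):
    """Average and sample standard deviation over repeated runs."""
    return mean(scores), stdev(scores)


# Illustrative diameters in microns, not data from the study
actual = [2.0, 4.0]
predicted = [1.0, 5.0]
print(rmspe(actual, predicted))
print(r_squared(actual, predicted))
```

Because RMSPE divides each error by the measured diameter, a fixed absolute error weighs more heavily on small dimples than on large ones. This is worth keeping in mind when reading the error values reported for micron-scale features.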

## 3 Results and Discussion

### 3.1 Accuracy and Precision of the Machine Vision Approach.

Table 2 presents an analysis of the accuracy and precision achieved by a machine vision approach by studying dimples fabricated using 15% laser power on 304 stainless-steel samples at varied frequencies and RA Divider values. Figure 5 provides visual representations of the dimples and their transitions.

In Fig. 5(a), dimples fabricated with a 400 RA Divider exhibit lips of material from melting during the ablation process. In Fig. 5(b), asymmetrical dimples were created using an 800 kHz frequency and 800 or 1200 RA Divider settings. The 800 RA divider dimple measures 2.9 by 3.61 microns (averaging 3.03 microns), while the 1200 RA divider dimple measures 2.4 by 2.82 microns (averaging 2.61 microns). No dimples were formed at an 850 kHz frequency with the same RA divider settings.

Table 2 provides circularity and diameter values for the dimples shown in Fig. 5. These values were obtained through the machine vision method and manual measurements using the Keyence Multifile Analyzer software's profile tool. Dimples in Fig. 5(a) with melting exhibit circularity values of 0.85 and 0.88 and were excluded from the dataset. The remaining dimples in Fig. 5(a) and Table 2 have circularity values greater than or equal to 0.90, indicating their suitability for the study.

For the asymmetrical dimples in Fig. 5(b), the sample with an 800 RA divider has a circularity of 0.89 and is excluded, while the 1200 RA divider sample with a circularity of 0.92 is included. Despite their similar appearance, there is a 0.71-*µ*m difference in the hand-measured diameters between the longer and shorter axes of the 800 RA divider dimple, compared to a 0.42-*µ*m difference for the 1200 RA divider dimple. The more lopsided dimple is excluded, but the less distorted dimple satisfies the circularity metric. In transitional zones like this one, occasional discrepancies from the machine vision process are included in the overall machine-learning systems. At the 850 kHz, 800 RA divider pulse in Fig. 5(b), there is a pulse effect but no discernible dimple structure. The machine vision algorithm detected an object with an estimated diameter of 1.58 *µ*m and a circularity of 0.85. No object was found in the 1200 RA divider area at the same frequency. Both data points were duly omitted from the dataset.

For the remaining dimples, the percentage discrepancy between machine vision and manual measurements is 4% or less, indicating high precision. Additionally, the machine vision approach processes the dimples in seconds, significantly faster than hand measurements. This highlights the efficiency and reliability of the machine vision approach as an alternative to manual measurements for analyzing dimple characteristics, particularly in high-throughput or large-scale studies.

### 3.2 Evaluation and Comparison of Machine-Learning Models.

Figure 6 presents a three-dimensional scatter plot of the complete set of dimple diameter values used for model training. Clear trends emerge when viewing the data in this way. First, as laser power increases, the dimple sizes become larger, with some overlap between different power levels. Furthermore, frequency and RA divider appear to have an inverse relationship with dimple diameter. These findings align with existing knowledge about femtosecond laser machining. It can also be observed that dimples fabricated with identical parameters can vary in diameter. This disparity was analyzed across 237 dimple sets (a set comprising 3–5 dimples all made using identical laser parameters). For instance, the five dimples fabricated at 25% laser power, 600 kHz, and 1200 RA Divider constitute one set. The average percent uncertainty derived from these sets is 6.3%. This inherent uncertainty stemming from dimple diameter variance must be factored in when assessing the relatively large error of the ML models observed in this study.
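The set-based uncertainty analysis can be sketched as follows. The text does not define "percent uncertainty", so this example assumes it is the sample standard deviation of each set divided by the set mean, expressed in percent. The records shown are hypothetical values chosen for illustration, not measurements from the study.

```python
from collections import defaultdict
from statistics import mean, stdev


def percent_uncertainty_by_set(records):
    """Group diameters by identical (power, frequency, RA divider) settings."""
    sets = defaultdict(list)
    for power, frequency, ra_divider, diameter in records:
        sets[(power, frequency, ra_divider)].append(diameter)

    result = {}
    for key, diameters in sets.items():
        if len(diameters) >= 2:  # spread is undefined for a single dimple
            result[key] = 100.0 * stdev(diameters) / mean(diameters)
    return result


# Hypothetical records: (power %, frequency kHz, RA divider, diameter in microns)
records = [
    (15, 800, 1200, 4.0), (15, 800, 1200, 4.2), (15, 800, 1200, 3.8),
    (20, 650, 1600, 3.0), (20, 650, 1600, 3.0), (20, 650, 1600, 3.0),
]
per_set = percent_uncertainty_by_set(records)
print(per_set)
print(mean(list(per_set.values())))  # average percent uncertainty across sets
```

Averaging the per-set values in this way gives a single figure comparable in form to the 6.3% reported above, although the exact definition used in the study may differ.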

Figure 7(a) depicts the RMSPE for each model when trained on datasets of multiple sizes. The MLP model delivers its best performance (RMSPE average of 32% with a standard deviation of 9% over ten trials) when trained on the complete 1210 data-point set. The GPR model, when utilizing the full dataset, yields an RMSPE average of 20% with a standard deviation of 4%. The GPR model therefore outperforms the MLP model by 12 percentage points at top performance.

Training each model with varying data-point quantities is crucial, as large datasets can improve the quality of ML models but drastically increase the training time. This trade-off limits the responsiveness of a control loop for automatic process control intended for femtosecond laser machining. Figure 7(a) demonstrates that the GPR model consistently exhibits lower overall error across all dataset sizes. Both models stabilize at relatively high values, around 20% for GPR and 30% for MLP, but the GPR model stabilizes between 300 and 600 data points, while the MLP model accuracy does not begin to stabilize until the dataset contains between 900 and 1210 data points. The inherent uncertainty stemming from the dimple diameter variance of the experiments contributes to these high values.

Figure 7(b) presents the *R*^{2} values for each model when trained on 300, 600, 900, and 1210 data points. The GPR model achieves stable correlations at or above 0.8 for datasets of 600 points or more (with standard deviations of 0.04 or below). In contrast, the MLP model starts with a level of correlation at 0.51 with a substantial standard deviation of 0.18. The model gradually improves in accuracy with smaller standard deviations as the size of the dataset is increased. The GPR model attains a maximum of 0.83 (with a standard deviation of 0.03), which significantly outperforms the MLP model’s highest level of correlation at 0.62 (with a standard deviation of 0.05).

Figure 8 compares the actual versus predicted diameter plots of the test datasets for both models trained with different-sized datasets. The MLP model exhibits high levels of inaccuracy when trained using 300 data points, but accuracy improves as the model is trained on larger datasets. In comparison, the GPR model starts with reasonable accuracy and maintains consistency with increased data points. Notably, the 1210 data-point GPR model and most of the MLP models show higher error for diameter values larger than 5 *µ*m, which is due to the underrepresentation of such dimples in the dataset.

The GPR model, being more accurate and stabilizing more rapidly than the MLP model, should be the focus of future work on integrating ML into femtosecond laser machining automation. Some error originates from the inherent variance of the dataset itself. However, further refinement of the machine vision process, especially of the imaging technology, could improve precision and offset some errors. Implementing ML methods as in-line processing could alleviate inconsistency in the data caused by sample focus and movement, because removing samples from the laser stage and subsequently reinserting them changes their position and orientation. Considering factors like focus height, which can significantly influence how the laser affects a surface, could further enhance the ML system. Further exploration of kernel structures for the GPR model and refinement of its parameters could address experimental noise more effectively.

In addition to accuracy, training time for each model is crucial because rapid feedback is essential for integrated laser and ML systems. Figure 9 provides estimates of training time for each model across different training data sizes. The GPR model is approximately six times faster than the MLP model for all dataset sizes used. Also, reducing the number of data points decreases the training time for both models.
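A dataset-size sweep of this kind can be organized with a small timing harness. The sketch below is a hedged illustration. The function `sweep` and the stand-in `placeholder_fit_and_score` are hypothetical, the data are synthetic, and it is assumed that a fresh random subset is drawn for each of the ten repeats, which the text does not specify. In practice the placeholder would be replaced by a routine that trains a model and returns its test error.

```python
import random
import time
from statistics import mean, stdev

DATASET_SIZES = [300, 600, 900, 1210]
REPEATS = 10


def sweep(data, fit_and_score, sizes=DATASET_SIZES, repeats=REPEATS, seed=0):
    """Time fit_and_score on random subsets and aggregate score and duration."""
    rng = random.Random(seed)
    summary = {}
    for size in sizes:
        scores, durations = [], []
        for _ in range(repeats):
            subset = rng.sample(data, size)  # assumed: new subset per repeat
            start = time.perf_counter()
            scores.append(fit_and_score(subset))
            durations.append(time.perf_counter() - start)
        summary[size] = {
            "score_mean": mean(scores),
            "score_std": stdev(scores),
            "time_mean_s": mean(durations),
        }
    return summary


def placeholder_fit_and_score(subset):
    """Stand-in for training a real model: returns the subset's mean diameter."""
    return mean(diameter for _, diameter in subset)


# Synthetic (inputs, diameter) pairs, for illustration only
data_rng = random.Random(1)
data = [((600, 400, 5), data_rng.uniform(2.0, 6.0)) for _ in range(1210)]

results = sweep(data, placeholder_fit_and_score)
for size, stats in results.items():
    print(size, stats)
```

Measuring wall-clock time around the training call alone, as done here with `time.perf_counter`, keeps data loading and subsampling out of the reported training time.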

## 4 Conclusion

Dimple arrays with varying laser parameters were fabricated using a Yb:KGW-based femtosecond laser on 304 stainless-steel samples. These arrays were imaged using a laser scanning confocal microscope to facilitate the investigation of MLP and GPR machine-learning methods for use in femtosecond laser applications. Machine vision was implemented as a processing step to quantify dimple diameters for the regression methods. A comparative analysis with manually gathered data substantiated the effectiveness of the machine vision method, demonstrating that it did not introduce substantial errors into the overall machine-learning systems. Performance evaluations of the MLP and GPR models were conducted using various quantities of training data points. RMSPE, *R*^{2}, and the training time of each model were the performance metrics studied.

Throughout the evaluation, the GPR model consistently demonstrated lower RMSPE than the MLP model, with an error reduction reaching up to 26% when only 300 data points were used. Even at the MLP model’s peak performance with the full 1210 data-point set, the difference in RMSPE remained as high as 12%. The GPR model's *R*^{2} is 0.78 with a low standard deviation when trained on a 300 data-point set. This value climbs to 0.83 with a standard deviation of only 0.03 as the size of the training dataset increases. In contrast, the MLP model begins at 0.51 with a standard deviation of 0.18 and only climbs to 0.63 with a standard deviation of 0.07. The GPR model also exhibited superior speed, being approximately six times faster than the MLP model for all training data quantities examined.

While neither model has reached an ideal level of refinement for integration into femtosecond laser machining process automation, there is substantial evidence from this study to suggest that future research focusing on incorporating ML into laser processing technologies should prioritize the GPR Model.

## Acknowledgment

The research was supported by the US National Science Foundation (NSF) under Grant No. OIA-1457888 through the Center for Advanced Surface Engineering and the Arkansas EPSCoR Program, ASSET III.

## Conflict of Interest

There are no conflicts of interest. This article does not include research in which human participants were involved. Informed consent is not applicable. This article does not include any research in which animal participants were involved.

## Data Availability Statement

The datasets generated and supporting the findings of this article are obtainable from the corresponding author upon reasonable request.