## Abstract

As Industry 4.0 and digitization continue to advance, the reliance on information technology increases, making the world more vulnerable to cyberattacks, especially cyber-physical attacks that can manipulate physical systems and compromise sensor data integrity. Detecting cyberattacks in multistage manufacturing systems (MMS) is crucial due to the growing sophistication of attacks and the complexity of MMS. Attacks can propagate throughout the system, affecting subsequent stages and making detection more challenging than in single-stage systems. Localization is also critical due to the complex interactions in MMS. To address these challenges, a group lasso regression-based framework is proposed to detect and localize attacks in MMS. The proposed algorithm outperforms traditional hypothesis testing-based methods in expected detection delay and localization accuracy, as demonstrated in a simple linear multistage manufacturing system.

## 1 Introduction

Due to the advancement in automation and the industrial internet-of-things, concerns around the cybersecurity of manufacturers have grown considerably in recent years [1]. After the well-known Stuxnet [2] targeting Iran’s nuclear program, cyberattacks aiming at disrupting manufacturing operations have surged. In 2022, the manufacturing industry became the top-target of cyberattacks in all operational technology (OT)-related industries. Cyberattacks affecting manufacturing companies are not new. For example, the WannaCry [3] ransomware affected several manufacturing companies, including Taiwan Semiconductor Manufacturing Company (TSMC). However, what is more alarming than attacks that aimed to simply shut down the system or bleach the data are the ones that aims to destroy the manufacturing assets or cause life-threatening events. In 2014, an attack against a German steel mill company [4] targeted the industrial control system and caused components of the plant controls to fail and caused the blast furnace to shut down improperly. In 2017, a malware was launched against a Saudi Arabian petrochemical plant and reprogrammed its safety systems which directly monitors and controls the equipment and processes. The above attacks are classified as cyber-physical attacks, where the attack intrudes into the system from the cyber network but aims to disrupt the physical process. Most cyber-physical attackers exploit the vulnerabilities in the design of supervisory control and data acquisition (SCADA) systems to design sophisticated sensor spoofing attacks based on techniques such as network traffic eavesdropping, traffic rerouting, IP spoofing, packet crafting, or session hijacking. Although secure communication protocols have been introduced for sensor data transmission, they do not prevent attacks injected to the sensors [5]. Further, due to the high volume of sensor data and limited computing and storage resources to encrypt or carry the encrypted data, current manufacturing operational (sensing and control) data are commonly being transmitted in an unencrypted manner, especially in legacy systems [6]. Further, smart attackers can leverage advanced techniques (such as machine learning) to expose the data or modify the data without decryption [7]. The above issues allow the attacker to manipulate the sensor data (compromising sensor data integrity) and lead to malicious control actions that will damage the manufacturing systems and the final products. In this paper, we focus on detecting and localizing sensor spoofing attacks at the level of manufacturing SCADA.

Multistage manufacturing systems play a crucial role in the manufacturing industry [8]. These systems, comprised of multiple components, stations, or stages, can be modeled as a series of interconnected elements working together to produce the final product [8,9]. The digital connectivity between components and devices in MMSs, as well as the standard data communication protocols used in manufacturing execution systems, make modern manufacturing systems vulnerable to cyberattacks. Additionally, due to the interconnectivity between stages in an MMS, a single point of failure can rapidly propagate throughout the system and result in quality issues with the final product, emphasizing the importance of robust cyberattack detection and localization measures. For instance, in a car assembly process, an attack on the machining stage could alter the dimensions of the part being assembled, leading to the final product failing to serve its intended purpose and introducing quality issues. Therefore, it is crucial to detect cyberattacks at an early stage and accurately localize the stage under attack.

Most existing studies on the cybersecurity of multistage manufacturing systems primarily focus on additive manufacturing (AM) contexts [10–13]. However, these methods are specific to AM and cannot be readily applied to generic MMS, such as assembly processes. For instance, in Ref. [10], each layer’s printing is treated as a stage, and an alteration detection method based on image analysis after each layer is printed is developed. While such methods are applicable to layer-by-layer manufacturing processes, there is a lack of fundamental research that analyzes the propagation mechanism of cyberattacks to other stages and how this propagation can be leveraged for attack localization. Hence, our goal is to analyze attack propagation in a generic multistage process and develop a cyberattack detection algorithm that can be applied across multiple MMS applications.

This paper considers a generalized multistage manufacturing system, where each stage sequentially processes the product. Each stage consists of a set of sensors, which take measurements of the product, and a controller that calculates the control output based on the sensor measurements from the previous stage to ensure the output sensor measurements of the stage are at the desired level. Because of this control mechanism, the impact of false data injection (FDI) attacks in a specific stage can propagate to later stages. In other words, a false data injection in the previous stage may not cause a physical impact on the attacked stage but on later stages. Moreover, when the attacker has some knowledge about the system, the attack can be designed to be undetectable at that stage [14]. The above factors pose significant challenges in detecting and localizing the attack in MMS, which we aim to address in this paper. The contributions of the paper can be summarized as follows:

We characterize a generic multistage manufacturing process model using Kalman filter (KF) and stochastic state-space model and perform theoretical analysis to extract the features uniquely related to the location of the attack.

We design a hybrid detection framework that integrates the benefits of signature- and anomaly-based detection techniques that fulfill the localization function without relying on attack data for training. This is achieved based on the theoretical analysis, which extracts the feature based on the domain knowledge of the system dynamics.

We designed a group regularization-based framework for simultaneous attack detection and localization. Unlike most existing methods that detect and localize in a two-phase manner, the group lasso-based framework enables us to identify the occurrence and location of potential false data injection attacks in real-time. With such information, further investigation and treatments can be triggered to minimize the attack’s impact on the system and its users.

## 2 Literature Review

Manufacturing systems are commonly considered as examples of cyber-physical systems (CPSs) in the context of cybersecurity. Security approaches that are typically applicable to CPSs, such as vulnerability analysis [15–17], secure IoT network architecture design [18,19], intrusion detection [20,21], and cyberattack-resilient state estimation and control [22–24], also apply to manufacturing systems. In this section, we specifically focus on reviewing the literature on cyberattack detection in manufacturing systems, which can be categorized into model-based and data-driven techniques.

Model-based methods rely on the physics rules that govern manufacturing processes to identify anomalies in events that do not adhere to these rules. The emergence of digital twins has enabled the development of many model-based methods for cyberattack detection, thanks to their real-time, high-fidelity representation of the actual system.

The literature on data-driven cyberattack detection in manufacturing systems can be divided into two categories: (i) signature-based methods and (ii) anomaly-based methods. Signature-based methods operate by matching patterns with known attacks and triggering an alarm when a match is found [3,25–27,40]. In essence, they function similarly to supervised machine learning algorithms, where they are trained using normal and under-attack data, and then deployed on real-world data for attack detection [28–33]. For example, Song et al. propose a real-time attack detection system using a convolutional neural network in cyber-manufacturing systems for detecting defects [34]. Wu et al. utilized machine learning algorithms to detect cyber-physical attacks in cyber-manufacturing systems, simulating attacks on 3D printing and CNC machining [35]. Another work by Wu et al. focused on detecting malicious infill defects in 3D printing using image classification, where features are extracted from images and classification algorithms such as the naive Bayes classifier and J48 Decision Tree are applied [31].

Anomaly detection methods aim to identify patterns representing the expected behavior of the system and raise alarms upon recognizing significant deviations from the normal pattern. These methods function similarly to unsupervised learning algorithms [29,36–38]. For instance, Qian et al. proposed a scheme for detecting cyber attacks in the cyber and physical stages of SCADA systems using a nonparallel hyperplane-based fuzzy classifier [37]. Kwon et al. introduced a hybrid approach that combines anomaly and signature detection algorithms for detecting cyber attacks in physical systems. They use normal data in training datasets to establish a threshold and then apply the trained model to test datasets to evaluate its performance [29].

The proposed method in this paper combines signature and anomaly detection techniques. The signatures associated with attacks on each stage in the multistage system capture the correlation between sensor measurements from all stages and the injected data. However, unlike the supervised learning-based algorithms mentioned earlier, these signatures are not extracted purely from data-driven approaches. Instead, they are derived from the system dynamics based on domain knowledge. Therefore, the proposed method can also be considered as an anomaly-based detection algorithm that relies solely on normal data rather than attack data. This characteristic makes the proposed method more realistic and applicable in MMS applications where there is typically a lack of sufficient attack data.

## 3 System Representation

As discussed in Sec. 1, MMSs consist of multiple stages in which the output of stage *i* is the input to stage *i* + 1. The representation of the system can be seen in Fig. 1. At each stage, the control action is taken based on the measurements of the input state to control the states to the reference values (setpoints). In practice, measurements before and after a processing stage maybe taken at the same station. However, we generalize the measurements to be taken after each stage, which means the sensor measurements is obtained from the output state at each stage. The attack we are considering in this paper is false data injection attack, where fake sensor measurements are sent to the controllers.

### 3.1 Multistage State-Space Model.

*K*-stage MMS, where the state transition and measurement functions vary between stages. Let

**x**

_{k}denotes the state variable of the product at stage

*k*such that $xk\u2208Rmk$, where

*m*

_{k}is the number of state variables at stage

*k*. Let

**y**

_{k}denote the sensor measurements at stage

*k*such that $yk\u2208Rnk$, where

*n*

_{k}is the number of sensors at stage

*k*. In the case that the original sensor measurement is in the form of a profile or a video stream, the extracted features can be used to formulate the state-space model, which can be represented by multiple elements in the vector

**y**

_{k}, and

*n*

_{k}is the total number of features, which can be viewed as virtual sensors at stage

*k*. For continuous manufacturing processes, sensor data snapshots or sliding windows at a constant frequency can be used to extract discrete-time features to formulate the model. Let

**u**

_{k}denotes the control actions at time

*k*such that $uk\u2208Rpk$, where

*p*

_{k}is the dimension of control action at stage

*k*.

**w**

_{k}and

**v**

_{k}are the process noise and measurement noise at stage

*k*, respectively, which are independent of all the other variables and are assumed to be not affected by any system anomaly, including attacks. Both

**w**

_{k}and

**v**

_{k}follow multivariate normal distributions with zero mean, i.e.,

**w**

_{k}∼

*N*(0,

*W*

_{k}) and

**v**

_{k}∼

*N*(0,

*V*

_{k}), where $Wk\u2208Rmk\xd7mk$ and $Vk\u2208Rnk\xd7nk$ are the covariance matrices of the noise terms at stage

*k*. The state-transition function and the measurement function are

*k*, respectively. For stage

*k*= 1, the input state

**x**

_{0}represents the initial state of the product. For example, for a multistage machining process,

**x**

_{0}can be the dimensions of the part before processing. We assume the matrices

*A*

_{k},

*B*

_{k},

*C*

_{k},

*V*

_{k},

*W*

_{k}for each stage

*k*are known. In Eq. (1),

*A*

_{k}

**x**

_{k−1}is the action taken on the input product (state variable) at stage

*k*. Also,

*B*

_{k}

**u**

_{k}is the action taken at stage

*k*by the controller to make sure that the state variable is the desired (reference) state variable at stage

*k*.

*K*

_{k}) is derived based on the KF formulations of discrete time, and the Kalman filter formulas are given in the following:

*k*− 1, $x^k|k$ denotes the updated state estimation given the measurement at time

*k*,

**y**

_{k}, and $y^k$ denotes the residual at time

*k*, which is the difference between predicted and actual measurements, as shown in Eq. (5).

### 3.2 Controller Model.

*k*is calculated as a linear function of the state estimate and the control parameters as follows:

*k*− 1. (the type of state estimator can vary, but the state estimate is calculated based on the instant or historical sensor measurements. The commonly used Kalman filter state estimator is introduced below) $xkr$ is the control setting parameters for controller

*k*(in this paper, we assume $xkr$ has the same dimensionality as

**x**

_{k}). The linear controller calculates the control action as a linear combination of the estimated state and the control parameter. The matrices

*L*

_{k}and $LRk$ are the linear coefficients of the estimated state after the previous stage and the reference values, respectively.

*Z*and

*U*are positive-semi definite matrices defining the cost. Under the LQG setting, Eq. (6) is defined by

*S*

_{k}is calculated by the following matrix Riccati difference equation that runs backward in time: $Sk=AkT(Sk+1\u2212Sk+1Bk(BkTSk+1Bk+Z)\u22121BkSk+1)$$Ak+U,SK=F$.

### 3.3 False Data Injection.

*k*can be modeled as

*δ*

_{k}represents the bias introduced to the sensor measurements by FDI at stage

*k*. $yka$ is the vector of the under-attack sensor measurements at stage

*k*. Notice that

*δ*

_{k}is introduced to facilitate theoretical analysis. The attacker may implement the FDI attack by replacing the original sensor measurement with $yka$. Hence,

*δ*

_{k}does not represent the data being injected, but only the bias caused by the false data injection.

## 4 Methodology

Given the system model described in Sec. 3, we propose a group Lasso-based hybrid attack detection and localization (GLHAD) framework against false data injection attacks in an MMS. The framework is depicted in Fig. 1. It takes the sensor data and the system dynamics of the MMS as input and integrates them into a group Lasso formulation. The GLHAD framework is based on theoretical analysis of the FDI attack propagation and characterization of the attack propagation as data features extracted from the system analysis. The major advantage of GLHAD is that it provides a generalizable framework to not only detect the occurrence of the attack, but also localize the source of the attack. To achieve this, we first perform theoretical analysis in Sec. 4.1 to build the mathematical model, based on which we extract the data features characterizing the impact of false data injection attacks on the sensor data. We then use the characterization to extract signatures from state estimation residuals that indicate the occurrence of the attack in the system and location in terms of the under-attack stage in the system. The formulation of the group Lasso model and the proposed attack detection and localization algorithm is described in Sec. 4.2.

### 4.1 Theoretical Analysis.

*augmented state variable*$x~$ comprised of the input state

**x**

_{0}, representing the state of the input material, and the reference values, $xkr$ at each stage

*k*.

**x**

_{0}represents the input of the system, also $xkr$ represents the reference state variable at stage

*k*. Notice the state variables $x=[x0T,x1T\cdots xNT]T$ tracks the product state at each stage, and with the control actions at each stage calculated based on the measurements after the previous stage, the

**x**

_{i}’s are cross-correlated. On the other hand, the augmented state variables are the independent external inputs to the MMS at each stage under our system setting. In this context, the sensor measurements, $y=[y0T,y1T\cdots yKT]T$, with

**y**

_{k}representing the sensor measurements at stage

*k*, for

*k*∈ {0, …,

*K*}, can be represented as a linear function of $x~$ as follows:

*H*is the augmented measurement matrix. Specifically,

*H*characterizes the relationship between the augmented state variables and the sensor outputs.

*H*is comprised of (

*K*+ 1) × (

*K*+ 1) submatrices

*h*

_{ij}, characterizing the relationship between the sensor measurements in stage

*i*− 1 and the augmented state in stage

*j*− 1. We develop Proposition 1 to define matrix

*H*.

*For an MMS described in Sec. 3, the matrix H in Eq. (9) can be written as*

*j*≥ 2, we have

*j*= 1, we have

*i*≥ 3, we have

The proof is provided in Appendix A.

Proposition 1 defines sensor measurements **y** as a function of the augmented state variable $x~$, representing the initial product state and the reference values of the distributed stage-level control in an MMS. This new system model distinguishes the deterministic external inputs, assumed to be immune to false data injection, from the intermediate product states between stages that are affected by the false data injection. Using this representation, we can estimate the augmented state variables based on sensor data and obtain the state estimation residuals, which helps us in detecting and localizing the attack.

When the system is under attack, based on the linearity assumptions in the system dynamics and controller model, the attack on stage *k* can propagate to stage *k* + 1 through the control mechanism in Eq. (6). This can be captured by the sensor data in stage *k* + 1, which then further affects the control in later stages. To characterize this linear propagation, we construct a new model for system under attack, characterized by the Proposition below.

*For an MMS described in Sec. 3, under a false data injection attack characterized by vector $\delta =[\delta 0T,\delta 1T\cdots \delta KT]T$, where δ*

_{k}represents the false data injected at stage k, the sensor measurement**y**can be expressed as*i*≥

*j*+ 2

The proof is provided in Appendix B.

In Proposition 2, *H*_{1} characterizes the relationship between the sensor measurements and the injected false data, with consideration of the attack propagation resulted from the multistage process. Equation (12) will facilitate extracting the important features from **y** to accurately detect FDI attacks and localize the source stage of the attack. Based on the linear relationship, we will develop the GLHAD framework based on the features extracted from the state estimation residuals.

To detect anomalous patterns in state estimation residuals, we must analyze the variance of **y**. In Eq. (9), since $x~$ contains only input variable **x**_{0}, and all reference values are deterministic, process noise, and measurement noise **w**_{k} and **v**_{k} contribute to the noise term $\epsilon $. Thus, we derive the variance of $\epsilon $ in Proposition 3, which will be used later to analyze $\epsilon $ and identify abnormal patterns caused by false data injection.

*For an MMS described in Sec. 3, denote the block diagonal matrices of process and measurement noise covariance as Σ*

_{x}= diag(W_{1}, …, W_{K}) and Σ_{y}= diag(V_{1}, …, V_{K}), respectively. The covariance of $\epsilon $ in Eq. (9) follows:*H*

_{w}can be represented as:

*H*

_{w}= [

*h*″

_{ij}]

_{(K+1)×(K+1)}. In the above expression, for any

*j*, we have

*i*≥

*j*+ 2

Proposition 3 derives the expression of $\Sigma \epsilon $, which will be used for formulating hypothesis testing on $\epsilon $. By understanding the normal covariance matrix, the anomalous pattern in the data caused by the attack will lead to a rejection of null hypothesis that can be used to identify the attack. The stage-level hypothesis tests formulated based on $\Sigma \epsilon $ will be used to localize the attack.

### 4.2 The GLHAD Framework.

Based on the theoretical analysis in Sec. 4.1, we propose a GLHAD framework that fuses the sensor data and domain knowledge of the system dynamics into the group Lasso formulation. The GLHAD framework combines the advantage of both signature-based method and anomaly detection methods, where we can identify the location of the attack using a signature-based mechanism without relying on a comprehensive dataset that contains labeled attack data to learn the features of attacks at different locations.

The GLHAD framework consists of two steps. The first step is the group lasso formulation, and the second step is the online detection and localization. In the first step, we use the augmented state estimation to distinguish the variance caused by external input and the variance caused by attack propagation to extract the state estimation residuals that serves as the input to the group lasso formulation. In the second step, we solve the group lasso problem and run the sequential hypothesis test based on the solution and raise alarms of attacks and with the attack location information.

**y**onto the column space spanned by

*H*

**r**, is calculated as

**r**

_{k}is the state estimation residual at stage

*k*. Under normal condition (when there is no attack), the augmented state estimate should be close to the true augmented state, and the residuals should be close to zero. On the other hand, based on Proposition 2, the residuals will have a component spanned by the matrix

*H*

_{1}, whose expectation is not 0. Therefore, the magnitude of the residual

**r**is an indication of whether the system is under attack. To further infer the location of attack, we need to analyze the patterns in

**r**that are consistent with attack propagation mechanism. We acknowledge that the

**r**will not fully reconstruct the term $H1\delta +\epsilon $, as it is obtained by projecting

*y*onto the column space of

*H*, but rather the subspace of

*H*

_{1}that is orthogonal to the column space of

*H*. Intuitively speaking,

**r**represents the part in

**y**that is “unexplainable” by

*H*. Therefore, it is necessary to identify the basis of such subspace of

*H*

_{1}that is orthogonal to

*H*as well as separating the subspaces corresponding to attacks on different stages. Denote

*R*

_{k}as the subspace basis for attack on stage

*k*, it can be obtained by projecting the columns of

*H*

_{1}corresponding to stage

*k*on to the columns space of

*H*and taking the residue

*R*

_{k}may not be full rank. Therefore, we apply principal component analysis to obtain the principal components of

*R*

_{k}, which formulates the signature basis of stage

*k*, $R~k$. In other words, $R~k=Rk[e1,e2,\u2026,egk]$, where

**e**

_{i}is the

*i*th eigenvector of (

*R*

_{k})

^{T}

*R*

_{k}, and the number of principal components

*g*

_{k}(

*g*

_{k}≤

*m*

_{k}−

*n*

_{k}) is selected based on the eigenvectors. After obtaining the signature basis $R~=[R~1,\u2026,R~K]$, Eq. (12) can be rewritten as

**r**can be written as

*δ*such that $R~k\delta ~k\u2248H1k\delta k$, the signature basis $R~k$ is orthogonal to

*H*, and $\epsilon ~$ is the new noise term that is “unexplainable” by either

*H*or $R~$. Based on the above model, the FDI vector

*δ*= [

*δ*

_{1}, …,

*δ*

_{K}] can be calculated by projecting the residuals

**r**onto the column space of

*R*. Based on our assumption that there is only one stage under attack, the vector

*δ*should be group-sparse, where each group is a stage and corresponds to a

*δ*

_{i}, and only one of the

*δ*

_{i}is non-zero. As each $\delta ~k$ is a linear transformation of

*δ*

_{k}, the vector $\delta ~$ should preserve the sparsity of

*δ*. Therefore, the detection and localization problem can be formulated as a group Lasso problem, where we estimate the transformed FDI vector $\delta ~$ by regressing the state estimation residual

**r**against the signature basis $R~k$. The group sparsity of the obtained coefficients will indicate the location of the attack, and the magnitude of the residuals spanned by the signature basis of the group can be used to detect the attack.

*l*

_{2}norm.

*λ*is a positive tuning parameter that decides how much sparsity is enforced, where a smaller

*λ*will result in a more sensitive detection algorithm which can potentially generate more false alarms, while a larger

*λ*leads to a more robust detection algorithm that may miss some of the slight attacks. Our

*λ*is tuned based on the training data containing only the normal data, to achieve a desired type-I error rate. In practice,

*λ*can also be selected based on the smallest magnitude of

*δ*to be detected.

*χ*

^{2}test based on the covariance matrix of the noise term $\epsilon $ given in Proposition 3. As the signatures were processed by principal component analysis, the covariance matrix used in the stage-level

*χ*

^{2}test is calculated as

*I*−

*H*(

*H*

^{T}

*H*)

^{−1}

*H*

^{T}and $\Psi k=I\u2212R~kR~kT$. The stage-level

*χ*

^{2}statistic is calculated as

*α*quantile of the

*χ*

^{2}distribution with

*g*

_{k}degrees-of-freedom, $\chi \alpha ,k2$. An alarm is triggered when $\chi k2>\chi \alpha ,k2$. This mechanism also potentially allows us to detect attacks on multiple stages simultaneously, but we do not consider such case both because of practicality and the complexity of the analysis involving multiple correlated hypothesis tests. In this paper, under the single stage-under-attack (SUA) assumption, the stage with the lowest

*p*-value is considered as the under-attack stage when multiple alarms are triggered. The pseudo-code of the detection algorithm is given in Algorithm 1.

#### GL-based attack detection and identification for multistage linear system

**Input:**$y$, $H$, $R~k$, $\Sigma ~k$ ($k=1,\u2026,K$), $\lambda $, $\alpha $

1: $x^\u2190(HTH)\u22121HTy$; $r\u2190y\u2212Hx^$

2: Solve the group Lasso problem in Eq. (18)

3: $k\u21901$, $l\u21900$, $pmin\u21901$

4: **while**$k\u2264K$**do**

5: $r^k\u2190R~k\delta ^k$

6: $\chi k2=r^kT\Sigma ~k\u22121r^k$

7: **if**$\chi k2>\chi \alpha ,k2$**then**

8: $p\u2190F\chi gk2\u22121(\chi k2)$

9: **if**$p<pmin$**then**

10: $l\u2190k$

11: **end if**

12: **end if**

13: $k\u2190k+1$

14: **end while**

15: **if**$l=0$**then**

16: **return** no attack

17: **else**

18: **return** stage-*l* is under attack

19: **end if**

Algorithm 1 shows that, after solving the group Lasso problem, the detection and localization process are completed simultaneously by running the stage-level *χ*^{2} tests. The algorithm demonstrates a combination of the theoretical analysis which involves the derivation of the signature basis and statistical learning which involves the group lasso algorithm based on the sensor data. The performance of the above algorithm is tested based on a numerical study on randomly generated system parameters as well as a case study based on an assembly process.

## 5 Experimental Results

*χ*

^{2}test based on the group-wise hypothesis testing scheme. In the baseline method, the residuals are obtained based on the measurement function (2)

*γ*

_{k}can be calculated as

_{k}is based on the independent measurement function of each stage without involving the stage-stage interaction. Then, we apply the stage-level

*χ*

^{2}test on each stage, where the test statistic is calculated as $\chi b,k2=\gamma kT\Sigma k\u22121\gamma k$. For stage

*k*, an alarm is triggered when $\chi b,k2>\chi \alpha ,nk2$.

*α*is the type-I error and

*n*

_{k}is the dimension of

**x**

_{k}.

### 5.1 Numerical Study.

*A*

_{k},

*B*

_{k},

*C*

_{k}with

*n*

_{k}= 3,

*m*

_{k}= 5 for

*k*= 0, 1, 2, 3) from a standard normal distribution. The process noise and measurement noise at each stage come from the multivariate normal distribution with mean zero and covariances:

*W*

_{k}= 0.1

*I*,

*V*

_{k}= 0.1

*I*for

*k*= 0, …, 3. Three hundred replications were implemented under the normal scenario for us to tune the parameter. Then, we simulate 100 replications for each of the sensor attacks on each stage with different attack severities and compare the performance of the two methods in terms of detection delay and localization accuracy. The severity of attack is quantified by the signal-to-noise ratio (SNR) calculated as follows:

The expected detection delay is shown in terms of the run length (RL) indicating the number of observations taken to raise the first alarm since the onset of the attack. The RLs of 100 replications are shown with box plots in Fig. 2. The type-I error rate is chosen at 0.005, so the nominal average run length (ARL) is set to be 200. It is shown that both methods are sensitive to attacks, and the detection delay decreases as the attack severity increases. However, the proposed GLHAD framework shows a lower ARL and a smaller variance in the run length distribution, which indicates a more timely and reliable attack detection.

Figure 3 shows the overall localization accuracy, which is calculated as the probability of correctly identifying the true SUA. The shaded area indicates the $90%$ confidence interval if the attack accuracy. The results show that the proposed GLHAD outperforms the baseline method, especially when the attack occurs at an earlier stage. This can be explained by the fact that the GLHAD framework takes the attack propagation into consideration and is hence utilizing more information from later stages than the independent stage-level hypothesis tests in the baseline method. It is also not surprising as the attack severity increases, the localization accuracy also increases. Tables 1 and 2 show the confusion matrix of the localization results. A key observation is that the baseline method misclassifies attacks at early stages (e.g., stage 0) as attacks at later stages (first column in Table 2). In contrast, the GLHAD framework has a lower probability of misclassifying an attack as attacks in later stages, but has a slightly higher chance of misclassifying them as early-stage attacks.

### 5.2 Case Study.

In the case study, we implement the GLHAD framework and benchmark method on a real-world three-stage SUV side frame assembly process introduced by Ref. [47]. The assembly process creates the final product called “inner-panel-complete.” This product consists of four main components: the A-pillar, B-pillar, rail roof side panel, and rear quarter panel. These components are assembled across three stations, namely Stations I, II, and III. The state variable **x**_{k} represents the accumulated part deviation up to station *k*, the input **u**_{k} is the fixture deviation where we assume to be controlled by an LQG controller, and **y**_{k} is the measurement deviation observed at station *k*. The configuration of the system is described through matrices A, B, and C, all of which are determined by the process and product design. Matrix A, referred to as the dynamic matrix, outlines how the orientation of the assembly changes as the individual parts are transferred between the different stations. Essentially, A captures the alterations in part positioning that occur as the production process moves from one station to another. Matrix B serves as the input matrix, and it governs how deviations in the fixture (the tooling used for alignment) impact the overall deviation of the assembled part. This is contingent on the specific layout of the fixture and how it interacts with the part’s geometry. More details can be found in Ref. [47].

We use the system parameters provided in Refs. [9,47] and consider only the first six state variables in our study. A 0.01 factor is used to re-scale the system to ensure computation efficiency considering the iterative substitutions in the matrix derivations, and the SNR values are selected to be 10, 15, 20, 30, 40, 50 in this study. The run length distributions, localization accuracy are shown in Figs. 4 and 5, respectively. The results demonstrate a more significant advantage of the proposed GLHAD framework than the baseline method both in detection delay and localization accuracy, especially when the attack is on stage 2. The baseline method fails to detect the attack with almost $80%$ probability, while the proposed GLHAD framework not only detects the attack, but also accurately localize the attack. This can also be observed in the confusion matrices in Tables 3 and 4. The GLHAD framework accurately identifies the SUA, while the baseline method has a lower than $50%$ localization accuracy, with similar mis-classification rates to regardless of stages.

## 6 Conclusion

In this paper, we proposed a novel group lasso-based hybrid cyberattack detection and localization framework for multistage manufacturing processes. Under a linear system dynamics assumption, we proposed a new system representation for the MMS by introducing an augmented state variable which helps distinguish the deterministic external control input to the MMS and the internal variations caused by the attack and attack propagation. We consider the most common sensor spoofing attack—false data injection in this paper. We propose the GLHAD framework based on a group lasso formulation that combines the advantages of signature-based and data-driven learning-based methods for attack detection. The formulation integrates the domain knowledge of the system dynamics into the sensor data-based group lasso algorithm, which allows us to fully utilize the real-time information from the data and the prior knowledge from the system. The group lasso formulation also allows us to detect the attack and localize it to the correct stage simultaneously with high accuracy. A numerical study and a case study validates the proposed framework and shows the advantage compared with traditional stage-level hypothesis testing methods. For future research, we aim to extend this work to consider nonlinear system representations, other types of attacks in the MMS, and relaxing the assumption of attack on a single stage.

## Conflict of Interest

There are no conflicts of interest.

## Data Availability Statement

The authors attest that all data for this study are included in the paper.

## Appendix A: Proof of Proposition 1

*j*∈ {2, 3, …,

*K*}. We present a proof for

*j*= 2. This proof can be easily generalized for any other

*j*∈ {3, …,

*K*}.

*h*

_{i2}represents the coefficient of $x1r$ in

**y**

_{i−1}. It can be easily seen that :

*h*

_{12}= 0, $h22=C1B1LR1$. For

*n*≥ 2, we use induction to complete the proof. For

*i*= 3,

*h*

_{32}represents the coefficient of $x1T$ in

**y**

_{2}and we have: $h32=C2[(A2+B2L2]B1LR1$. We assume that for an arbitrary

*i*≥ 3

*ρ*

_{i,j}is the coefficient of $xir$ in

*hat*

**x**

_{j | j}. Hence, we can conclude that

*β*

_{i,j}represents the coefficient of $xir$ in

*x*

_{j}. Hence, we can conclude that

*j*= 1,

*h*

_{i1}represents the coefficient of

**x**

_{0}in

**y**

_{i−1}. Hence,

*h*

_{11}and

*h*

_{12}represent the coefficient of

**x**

_{0}in

*y*

_{0}and

*y*

_{1}, respectively. We use induction for the proof for

*i*≥ 3.

*h*

_{31}represents the coefficient of

**x**

_{0}in

**y**

_{2}, so we have

*h*

_{31}=

*C*

_{2}[(

*A*

_{2}+

*B*

_{2}

*L*

_{2})(

*A*

_{1}+

*B*

_{1}

*L*

_{1}

*K*

_{0}

*C*

_{0}) −

*B*

_{2}

*L*

_{2}(

*I*−

*K*

_{1}

*C*

_{1})

*A*

_{1}(

*I*−

*K*

_{0}

*C*

_{0})]. We assume that for an arbitrary

*i*≥ 3, we have

*θ*

_{i,j}is the coefficient of

**x**

_{0}of $x^i|i$ in

*y*

_{j}. Also, it can be seen that

*π*

_{i}is the coefficient of

**x**

_{0}in

*x*

_{i}. The coefficient of

**x**

_{0}in

*y*

_{i}is

**x**

_{0}in

*y*

_{i}is

## Appendix B: Proof of Propositions 2 and 3

*i*= 1, …,

*j*+ 1, the proof is obvious. For

*i*≥

*j*+ 2, we can assume that for

*k*=

*a*,

*a*≥

*j*+ 2, the following statement is true:

*k*=

*a*+ 1, we can show that