## Abstract

We present a data-driven workflow for biological tissue modeling, which aims to predict the displacement field based on digital image correlation (DIC) measurements under unseen loading scenarios, without postulating a specific constitutive model form or requiring knowledge of the material microstructure. To this end, a material database is constructed from the DIC displacement tracking measurements of multiple biaxial stretching protocols on a porcine tricuspid valve anterior leaflet, with which we build a neural operator learning model. The material response is modeled as a solution operator from the loading to the resultant displacement field, with the material microstructure properties learned implicitly from the data and naturally embedded in the network parameters. Using various combinations of loading protocols, we compare the predictivity of this framework with finite element analysis based on three conventional constitutive models. In in-distribution tests, our approach generalizes well to different loading conditions and outperforms the conventional constitutive modeling by approximately one order of magnitude. When tested on out-of-distribution loading ratios, the neural operator learning approach becomes less effective. To improve the generalizability of our framework, we propose a physics-guided neural operator learning model that imposes partial physics knowledge. This method is shown to improve the model's extrapolative performance in the small-deformation regime. Our results demonstrate that, with sufficient data coverage and/or guidance from partial physics constraints, the data-driven approach can be a more effective method for modeling biological materials than traditional constitutive modeling.

## 1 Introduction

For many decades, constitutive models based on continuum mechanics have been commonly employed to model the mechanical responses of soft biological tissues. In Ref. [1], seminal phenomenological constitutive models were developed and later employed for the modeling of soft tissues, including the iris [2], heart valves [3–5], arterial vessels [6], and the skin [7]. In constitutive modeling approaches, a strain energy density function with a specific functional form is predefined, and the material parameters are then calibrated through an inverse method or analytical stress–strain fitting. The descriptive power of these models is often restricted to certain deformation modes/strain ranges, which may lead to limited predictivity and generalizability [8–10].

To circumvent this limitation, data-driven computing has been considered in recent years as an alternative for modeling the mechanical response of biological tissues [9,11–13]. Unlike the traditional material identification techniques in constitutive modeling, data-driven approaches directly integrate material identification with the modeling procedure, and hence do not require a predefined constitutive model form. In Ref. [11], a fully convolutional neural network was trained on synthetic datasets to estimate the displacement field of material points in a simulated liver organ. In Ref. [14], Miñano et al. constructed the constitutive law for soft tissue damage by solving a system of linear equations for the coefficients of shape functions, rather than by nonlinear fitting to a predefined model. In Ref. [9], a local convexity data-driven computational framework was developed that couples manifold learning with nonlinear elasticity for modeling the stress–strain data of a representative porcine mitral (heart) valve posterior leaflet. This framework was further extended to an auto-embedding data-driven approach [12] to infer the underlying low-dimensional embedding representation of the material database. In Ref. [13], a neural network was developed to learn the mechanical behavior of porcine and murine skin from biaxial testing data by inferring the relationship between the isochoric strain invariants and the value of the strain energy, as well as the strain energy derivatives. Despite these advances, data-driven methods for soft tissue modeling have mostly focused on identifying stress–strain and/or energy–strain relationships for a homogenized material model, and are thus incapable of capturing the effects of material spatial heterogeneity. For example, neglecting soft tissue heterogeneity can induce large errors in the predictions of tissue displacements and stresses [15].

Alternatively, there has been significant progress in the development of deep neural networks (NNs) for heterogeneous material modeling [16–27]. Among these works, we focus on the neural operator learning approach [22–27], which learns the maps between the inputs of a dynamical system and its state, so that the network serves as a surrogate for a solution operator. Compared with the classical NNs, the most notable advantage of neural operators is their generalizability to different input instances, rendering a computational advantage in prediction efficiency—once the neural operator is trained, solving for a new instance of the input parameter requires only a forward pass of the network. In Refs. [28–30], neural operators have been successfully applied to modeling the unknown physics law of homogeneous materials. In Refs. [25–27,31], neural operators were used as a solution surrogate for Darcy's flow in a heterogeneous porous medium with a known microstructure field. In our previous work [22], an implicit neural operator architecture, namely, the implicit Fourier neural operator (IFNO), was proposed to model heterogeneous material responses without using any predefined constitutive models or microstructure measurements. In particular, we investigated the applicability of learning a material model for a latex material directly from digital image correlation (DIC) measurements and showed that the learned solution operators substantially outperform conventional constitutive models such as the generalized Mooney–Rivlin model.

To the best of our knowledge, the neural operator learning approaches have not been applied to soft tissue biomechanics. Moreover, the effectiveness of neural operator learning methods in extrapolating to small and large deformation regimes has yet to be systematically examined. To achieve these goals, in this work we propose to advance the current data-driven methods of soft tissue modeling by extending the neural operator learning approach. In particular, we employ the IFNO to learn the material model from DIC measurements on a representative tricuspid valve anterior leaflet (TVAL) specimen from a porcine heart and assess its predictivity on unseen and out-of-distribution loading scenarios. To further improve the generalizability of the proposed framework, we also infuse partial physics knowledge via a soft penalty constraint to obtain a novel physics-guided neural operator learning framework. This method is shown to improve the extrapolative performance of our model in the small deformation regime.

The remainder of this paper is organized as follows. In Sec. 2, we introduce our data-driven computing paradigm based on the neural operator learning method, which integrates material identification, modeling procedures, and material response prediction into one unified learning framework. In particular, a stable deep layer architecture, i.e., the IFNO, is introduced in Sec. 2.2 and incorporated with partial physics knowledge in Sec. 2.3. In Sec. 3, we introduce our experimental setting on a representative TVAL specimen. Four study scenarios, considering different sets of experimental data for model training and prediction, are used to examine the in-distribution and out-of-distribution predictivity of the proposed IFNO method. The effectiveness of the IFNO approach is also compared with finite element simulation results based on three constitutive models. Then, we illustrate the prediction results of the IFNOs and physics-guided IFNOs and compare them with the modeling results based on fitted constitutive models in Sec. 4. Finally, we provide a summary of our achieved goals and concluding remarks in Sec. 5.

## 2 An Integrated Learning Framework

In this section, we first formulate the proposed workflow of data-driven material modeling using the operator learning framework and then introduce the deep neural operator model—the IFNO [22]. Next, we propose to further infuse partial physics knowledge via a soft penalty constraint to guide the training and prediction of the neural operators.

### 2.1 Neural Operator Learning Methods.

The main objective of this work is to model the mechanical response of a representative soft biological tissue directly from DIC-tracked displacement measurements, without any predefined constitutive model or knowledge of the tissue microstructure. As depicted in Fig. 1 and Table 1, let us consider a soft biological tissue specimen that is mounted to a biaxial testing system and deforms under external loading. Denoting the region of interest on this specimen as a two-dimensional domain $\Omega$, our aim is to identify the *best surrogate operator* that can accurately predict the displacement field $u(x)$, $x\in\Omega$, given new and unseen loading scenarios. In this work, we model the tissue mechanical response as a quasi-static and hyperelastic problem for simplicity, so the resultant displacement field is fully determined by a displacement-type loading applied on the domain boundary $\partial\Omega$. Thus, given the Dirichlet-type boundary condition $u_D(x)$ for $x\in\partial\Omega$, our ultimate goal is to predict the corresponding displacement field $u(x)$, $x\in\Omega$.

Set ID | Experiment protocol | $\max(\lambda_1)$ | $\max(\lambda_2)$ | $\max(P_{11})$ | $\max(P_{22})$ | # of samples
---|---|---|---|---|---|---
1 | Biaxial tension, $P_{11}:P_{22}=1:1$ | 1.46 | 1.68 | 184.1 kPa | 165.1 kPa | 3921
2 | Biaxial tension, $P_{11}:P_{22}=1:0.66$ | 1.48 | 1.63 | 187.1 kPa | 127.8 kPa | 3797
3 | Biaxial tension, $P_{11}:P_{22}=1:0.33$ | 1.52 | 1.52 | 186.9 kPa | 74.1 kPa | 3539
4 | Biaxial tension, $P_{11}:P_{22}=0.66:1$ | 1.42 | 1.72 | 145.9 kPa | 188.2 kPa | 4013
5 | Biaxial tension, $P_{11}:P_{22}=0.33:1$ | 1.32 | 1.79 | 77.9 kPa | 189.8 kPa | 4175
6 | Constrained uniaxial in x, $P_{11}:P_{22}=0.05:1$ | 1.56 | 1.0 | 197.9 kPa | 10.6 kPa | 3539
7 | Constrained uniaxial in y, $P_{11}:P_{22}=1:0.1$ | 1.0 | 1.89 | 17.2 kPa | 176.1 kPa | 3539

The resultant displacement fields, based on digital image correlation, were used in the data-driven computations ($P_{11}$ and $P_{22}$ denote the first Piola–Kirchhoff stresses in the *x*- and *y*-directions, respectively, and $\lambda_1$, $\lambda_2$ are the stretch ratios in these two directions).

To this end, we propose to embrace the descriptive power of NNs and develop a data-driven neural operator whose input is $u_D(x)$ and whose output is the displacement field $u(x)$, for any $x\in\Omega$. Formally, the unknown governing law together with the loading can be written abstractly as a boundary value problem,

$$\mathcal{L}[u](x)=0 \text{ for } x\in\Omega,\qquad u(x)=u_D(x) \text{ for } x\in\partial\Omega\qquad (2.1)$$

where $\mathcal{L}$ denotes the (unknown) governing operator of the tissue response and $G^\dagger:u_D\mapsto u$ denotes the corresponding solution operator. We are given a collection of observed function pairs $\{((u_D)_j(x),u_j(x))\}_{j=1}^{N}$ from DIC measurements, where each input $(u_D)_j$, $j=1,\ldots,N$, is a boundary displacement loading and $G^\dagger[(u_D)_j](x)=u_j(x)$ is the corresponding (potentially noisy) displacement field. With neural operator learning, we aim to build an approximation of $G^\dagger$ by constructing a nonlinear parametric map $G[\cdot\,;\theta]$ in the form of a NN, for some finite-dimensional parameter space $\Theta$. Here, $\theta\in\Theta$ is the set of network architecture parameters to be inferred by solving the minimization problem

$$\min_{\theta\in\Theta}\frac{1}{N}\sum_{j=1}^{N}\left\|G[(u_D)_j;\theta]-u_j\right\|^2_{L^2(\Omega)}\qquad (2.2)$$

In this context, we have formulated the soft tissue response modeling problem as learning the solution operator $G$ of an unknown partial differential equation (PDE) system from the DIC data. Thus, our goal is to provide a *neural operator*, i.e., an approximated solution operator $G[\cdot;\theta]:u_D\rightarrow u$, that delivers solutions of Eq. (2.1) for any input $u_D$. Compared with the classical PDE solvers and NN approaches, this is a more challenging task for several reasons. First, in contrast to the classical NN approaches, where the solution operator is parameterized between finite-dimensional Euclidean spaces [32–36], the neural operators are built as mappings between infinite-dimensional function spaces [25,27,37]. Second, for every new instance of material microstructure and/or loading scenario, the neural operators require only a forward pass of the network, which implies that the optimization problem (2.2) *only needs to be solved once, and the resulting NN can then be utilized to solve for new and unseen loading scenarios*. This property is in contrast to the classical numerical PDE methods [38–40] and some machine learning approaches [41–45], where the optimization problem needs to be solved for every new instance of the input parameter of a known governing law. Finally, of fundamental importance is the fact that the neural operators can find solution maps regardless of the presence of an underlying PDE and only require the observed data pairs $\{((u_D)_j,u_j)\}_{j=1}^{N}$. Therefore, the neural operator learning approach is particularly promising when the mechanical responses are provided by experimental measurements, such as the DIC displacement tracking data considered in this paper.
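To make the workflow concrete, the discrete analog of the fitting objective above can be sketched with a toy linear parametric map standing in for the neural operator (illustrative Python; the names, shapes, and synthetic data are our hypothetical stand-ins, not the paper's code):

```python
import numpy as np

# Functions are discretized on an m-point grid; the surrogate operator
# G[.; theta] is, for illustration only, a single learnable linear map.
rng = np.random.default_rng(0)
m, N = 16, 32                                # grid size, number of loading samples
theta = np.zeros((m, m))                     # parameters of the toy operator
uD = rng.normal(size=(N, m))                 # boundary-loading samples
u_true = 0.1 * (uD @ rng.normal(size=(m, m)))  # synthetic "DIC" targets

def G(uD_batch, theta):
    """Toy parametric operator mapping loading -> displacement."""
    return uD_batch @ theta

def loss(theta):
    """Empirical L2 fitting loss, the discrete analog of the objective."""
    r = G(uD, theta) - u_true
    return np.mean(np.sum(r**2, axis=1))

# One gradient-descent step on theta (the analog of "training").
grad = (2.0 / N) * uD.T @ (G(uD, theta) - u_true)
theta_new = theta - 0.01 * grad
```

Once such a map is trained, predicting the response for a new loading is a single forward pass `G(uD_new, theta_new)`, which is the efficiency property emphasized above.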

### 2.2 Implicit Fourier Neural Operators.

To provide an efficient, deep, and stable integral neural operator for the solution operator learning problem described above, we employ the IFNOs [22]. IFNOs stem from the idea of modeling the solution operator as a fixed-point equation, which naturally mimics the solution procedure for the displacement/damage fields in materials modeling. The increment between hidden layers is modeled as an integral operator, which is parameterized directly in the Fourier space to take advantage of the fast Fourier transform and of accelerated learning techniques for deep networks. As shown in Ref. [22], by learning the material responses directly from data, the material microstructure and properties are learned implicitly and embedded naturally in the network parameters, enabling the prediction of the material displacement under unseen loading conditions.

Since the displacement field has two components, we build two IFNOs to predict the *x*- and *y*-components of the displacement field, respectively. For each IFNO, the input is a vector function $f(x)$ on $\Omega$ that contains the information from $x$ and $u_D(x)$. Here, we notice that the displacement boundary loading $u_D(x)$ is only defined on $\partial\Omega$. To make it well-defined on the whole domain, we employ the zero-padding strategy proposed in Ref. [31], namely, defining $f(x):=[x,\tilde{u}_D(x)]$, where

$$\tilde{u}_D(x):=\begin{cases}u_D(x), & x\in\partial\Omega\\ 0, & x\in\Omega\setminus\partial\Omega\end{cases}$$
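The zero-padding step can be sketched as follows (illustrative Python; the grid layout, function name, and boundary ordering are our assumptions, not the released implementation):

```python
import numpy as np

def zero_pad_boundary_loading(uD_boundary, n):
    """Extend a boundary displacement loading to the full n-by-n grid by
    zero-padding the interior: u_tilde equals uD on the boundary nodes
    and 0 elsewhere."""
    u_tilde = np.zeros((n, n))
    mask = np.zeros((n, n), dtype=bool)
    mask[0, :] = mask[-1, :] = mask[:, 0] = mask[:, -1] = True
    u_tilde[mask] = uD_boundary
    return u_tilde

n = 5
num_boundary = 4 * n - 4            # boundary nodes of an n-by-n grid
uD = np.ones(num_boundary)          # a constant boundary loading
u_tilde = zero_pad_boundary_loading(uD, n)
```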

The input $f(x)$ is then lifted to a higher-dimensional representation of dimension *d*, which corresponds to the first network layer. For the consistency of notation, we label the first argument of $h$ as the space (the set of nodes) and the second argument as the time (the set of layers), and define the first network layer as

$$h(x,0)=Pf(x)+p\qquad (2.3)$$

where $P\in\mathbb{R}^{d\times 4}$ and $p\in\mathbb{R}^{d}$ are trainable parameters.

We denote the *l*th network representation by $h(x,l\Delta t)$ and formulate the NN architecture in an iterative manner: $h(\cdot,0)\rightarrow h(\cdot,\Delta t)\rightarrow h(\cdot,2\Delta t)\rightarrow\cdots\rightarrow h(\cdot,T)$, where $h(\cdot,j\Delta t)$, $j=0,\ldots,L:=T/\Delta t$, is a sequence of functions representing the features at each hidden layer, taking values in $\mathbb{R}^d$. Here, $l=0$ (or, equivalently, $t=0$) denotes the first hidden layer, whereas $t=L\Delta t=T$ corresponds to the last hidden layer. In particular, the layer update rule of the IFNOs reads

$$h(x,(l+1)\Delta t)=h(x,l\Delta t)+\Delta t\,\sigma\left(Wh(x,l\Delta t)+F^{-1}\left[R\cdot F[h(\cdot,l\Delta t)]\right](x)+c\right)\qquad (2.4)$$

Here, $F$ and $F^{-1}$ denote the Fourier transform and its inverse, respectively. In practice, $F$ and $F^{-1}$ are computed by applying the fast Fourier transform and its inverse to each component of $h$ separately, with the highest modes truncated so that only the first *k* modes are kept. Also, $c\in\mathbb{R}^d$ defines a constant bias, $W\in\mathbb{R}^{d\times d}$ is a weight matrix, and $F[\kappa(\cdot;v)]:=R\in\mathbb{C}^{d\times d\times k}$ is a circulant matrix that depends on the convolution kernel $\kappa$. We further define $\sigma$ as the activation function, which is chosen to be the popular rectified linear unit (ReLU) function [46]. We note that the definition of *t* stems from the relationship between the network update and a time-stepping scheme, which enables the employment of an accelerated training strategy for the NN in the deep-layer limit.
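A single layer update of this form can be sketched on a one-dimensional grid for readability (illustrative Python with hypothetical shapes and names; the released IFNO implementation operates on 2D grids and differs in detail):

```python
import numpy as np

def ifno_layer(h, W, R, c, dt, k):
    """One IFNO hidden-layer update (sketch),
    h_{l+1} = h_l + dt * relu(W h_l + IFFT[R . FFT[h_l]] + c).
    h: (n, d) nodal features; W: (d, d); R: (k, d, d) weights on the
    first k (truncated) Fourier modes; c: (d,) bias."""
    n, d = h.shape
    h_hat = np.fft.rfft(h, axis=0)            # per-channel FFT, (n//2+1, d)
    out_hat = np.zeros_like(h_hat)
    for m in range(k):                        # keep only the first k modes
        out_hat[m] = R[m] @ h_hat[m]
    conv = np.fft.irfft(out_hat, n=n, axis=0)
    z = h @ W.T + conv + c
    return h + dt * np.maximum(z, 0.0)        # ReLU activation

rng = np.random.default_rng(1)
n, d, k, dt = 32, 4, 8, 0.5
h0 = rng.normal(size=(n, d))
W = 0.1 * rng.normal(size=(d, d))
R = 0.1 * (rng.normal(size=(k, d, d)) + 1j * rng.normal(size=(k, d, d)))
c = np.zeros(d)
h1 = ifno_layer(h0, W, R, c, dt, k)
```

Note that as `dt` goes to zero the update reduces to the identity, which reflects the time-stepping interpretation discussed above.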

After the last hidden layer, the feature $h(x,T)$ is projected back to the target displacement component through a two-layer network,

$$u(x)=Q_2\,\sigma\left(Q_1h(x,T)+q_1\right)+q_2$$

Here, $Q_1\in\mathbb{R}^{d_Q\times d}$, $Q_2\in\mathbb{R}^{1\times d_Q}$, $q_1\in\mathbb{R}^{d_Q}$, and $q_2\in\mathbb{R}$ are the trainable parameters. Denoting the *x*- and *y*-components of the displacement field as $u_x$ and $u_y$, with the subscripts *x* and *y*, respectively, the vanilla version of our neural operator learning architecture without physics constraints (denoted as IFNO in the following, with a slight abuse of notation) is written as

$$u_x(x)=G_x[u_D;\theta_x](x),\qquad u_y(x)=G_y[u_D;\theta_y](x)\qquad (2.5)$$

Further, as the layer becomes deep ($\Delta t\u21920$), the iterative architecture of the IFNOs can be seen as an analog of discretized ordinary differential equations (ODEs). This allows us to exploit the shallow-to-deep learning technique [22,37,47,48]. Specifically, using the optimal network parameters $\theta *$ obtained by training an IFNO of depth *L*, we initialize the (deeper) $L\u0303$-layer network. As such, the optimal parameters learned on shallow networks are considered as (quasi-optimal) initial guesses for the deeper networks—accelerating the training for deeper NNs.
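Because the IFNO update acts as a fixed-point iteration with shared parameters, the shallow-to-deep step amounts to reusing the parameters trained at a shallower depth while only the step size $\Delta t = T/L$ changes. A toy sketch (with a simplified shared-parameter update standing in for the full IFNO layer; names are ours):

```python
import numpy as np

def run_ifno(h0, params, L, T=1.0):
    """Apply a shared-parameter layer update L times with dt = T / L
    (simplified stand-in for the IFNO iteration)."""
    W, c = params
    dt = T / L
    h = h0
    for _ in range(L):
        h = h + dt * np.maximum(h @ W.T + c, 0.0)
    return h

rng = np.random.default_rng(2)
d = 4
params = (0.1 * rng.normal(size=(d, d)), np.zeros(d))  # "trained" shallow params
h0 = rng.normal(size=(8, d))

# Shallow-to-deep: the parameters obtained at depth L = 4 directly
# initialize the deeper L = 8 network; no re-training is needed to
# produce the initial guess.
out_shallow = run_ifno(h0, params, L=4)
out_deep = run_ifno(h0, params, L=8)
```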

### 2.3 Physics-Guided Neural Operators.

So far, the neural operator model introduced above fully relies on the data, and hence its predictions may not be consistent with the underlying physical principles. For instance, with the quasi-static and hyperelastic assumption of our model, the specimen has no permanent deformation. In other words, if there is no loading applied to the tissue (i.e., the specimen is at rest), we should observe a zero displacement field in the specimen. However, this is generally not guaranteed in a fully data-driven neural operator model.

To alleviate this issue, we propose a *physics-guided neural operator model* that minimizes the residual of the above physical law together with the fitting loss from the data. This is achieved by solving the following minimization problem with a hybrid loss function:

$$\min_{\theta\in\Theta}\frac{1}{N}\sum_{j=1}^{N}\left\|G[(u_D)_j;\theta]-u_j\right\|^2_{L^2(\Omega)}+\gamma\left\|G[\mathbf{0};\theta]\right\|^2_{L^2(\Omega)}\qquad (2.6)$$

Here, $\gamma >0$ is a penalty parameter to enforce the zero deformation state for material subject to zero loading. Thus, the physics-guided neural operator is anticipated to improve the prediction performance in the small deformation regime. In the following, we will denote this model as the physics-guided IFNO, or the PG-IFNO, in short.
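The hybrid objective can be sketched with a toy affine surrogate (illustrative Python; the operator `G`, its parameters, and the synthetic data are hypothetical stand-ins for the actual IFNO and DIC data):

```python
import numpy as np

def G(uD_batch, A, b):
    """Toy affine operator standing in for the trained neural operator."""
    return uD_batch @ A + b

def pg_loss(A, b, uD, u_true, gamma=1.0):
    """Data-fitting term plus a penalty on the response to zero loading,
    mirroring the hybrid physics-guided loss."""
    r = G(uD, A, b) - u_true
    data_term = np.mean(np.sum(r**2, axis=1))
    zero_load = np.zeros((1, uD.shape[1]))
    physics_term = np.sum(G(zero_load, A, b)**2)   # ||G[0; theta]||^2
    return data_term + gamma * physics_term

rng = np.random.default_rng(3)
m, N = 8, 16
uD = rng.normal(size=(N, m))
u_true = 0.1 * uD                 # synthetic data with zero rest offset
A = 0.1 * np.eye(m)               # toy "trained" weights
```

With this loss, a spurious rest-state offset (nonzero `b`) is penalized even when it barely affects the data term, which is exactly the behavior the penalty is designed to enforce.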

## 3 Application to Tissue Biomechanics of the Heart Valve Leaflet

We now consider the problem of learning the material response of a TVAL specimen from displacement measurements based on DIC tracking. In this problem, the constitutive equations and the material microstructure are both unknown, and the dataset contains unavoidable measurement noise. To demonstrate the efficacy of the proposed IFNOs in conjunction with the physics-based enrichment, we further compared our method against three conventional approaches that use constitutive modeling with parameter fitting. The code and dataset have been publicly released at the following link.^{2}

### 3.1 Tissue Preparation and Mechanical Testing.

In this section, we first introduce the experimental specimen and data acquisition procedure. In brief, we followed our previously established biaxial testing procedure, including acquisition of a healthy porcine heart and retrieval of the TVAL [49,50]. We then sectioned the leaflet tissue into a square specimen and measured the thickness using an optical measuring system (Keyence, Itasca, IL). Afterward, we applied a speckling pattern to the tissue surface using an airbrush and black paint [51–53]. The painted specimen was then mounted to a biaxial testing device (Bio-Tester, CellScale, Waterloo, ON, Canada) with an effective testing area of 9 × 9 mm for the following tissue characterizations (Fig. 1(a)).

First, we performed a preconditioning protocol in which the specimen was subjected to ten cycles of biaxial loading and unloading that targeted a first Piola–Kirchhoff stress of 150 kPa to emulate the valve's in vivo functioning conditions [54]. Then, we performed seven protocols of displacement-controlled testing to target various biaxial stresses: $P11:P22=$ 1:1, 1:0.66, 1:0.33, 0.66:1, 0.33:1, 0.05:1, 1:0.1, with the last two protocols for constrained uniaxial stretching in *x* and *y* (Fig. 1(c) and Table 1). Here, *P*_{11} and *P*_{22} denote the first Piola–Kirchhoff stresses in the *x*- and *y*-directions, respectively. Each stress ratio was performed for three loading/unloading cycles. Throughout the test, images of the specimen were captured by a CCD camera, and the load cell readings and actuator displacements were recorded at 5 Hz. Due to the use of displacement-controlled testing, we observed mild deviations from the target stresses (see Table 1).

After testing, the acquired images were analyzed using the DIC module of the Bio-Tester's software. A $5.5\xd75.5$ mm domain in the central region of the TVAL specimen was selected since the speckling pattern was more uniform and could yield more reliable node tracking (see Figs. 1(a) and 1(b)). The pixel coordinate locations of the DIC-tracked grid were then exported for use in the subsequent study scenarios. Based on the tracked coordinates, we constructed two numerical testing datasets: (i) an original dataset obtained directly from the experimental measurements, and (ii) a smoothed dataset where moving least-squares (MLS) smoothing was performed for the nodal displacements.

To generate the displacement fields $u^{\text{original}}(x)$ for the original samples, we subtracted each material point's initial location (on the first sample of each protocol) from its current location, and the boundary displacement was obtained by restricting $u^{\text{original}}(x)$ to the boundary nodes. Next, to construct the smoothed sample at the $i$th material point, $x_i=(x_i,y_i)$, we employed a two-dimensional MLS shape function $\Psi$ to reconstruct the smoothed displacement field, $u^{\text{smooth}}(x_i,y_i)=\sum_{k=1}^{N_P}\Psi_k(x_i,y_i)\,u^{\text{original}}(x_k,y_k)$, based on the unsmoothed displacement vectors of the $N_P$ points in the neighborhood of $x_i$. For further details regarding the MLS shape function and the smoothing procedure, we refer interested readers to Refs. [22], [55], and [56]. Both the smoothed and the original datasets contain 26,523 total time instants (samples), denoted as $D^{\text{smooth}}=\{(u_D)_j^{\text{smooth}},u_j^{\text{smooth}}\}_{j=1}^{26,523}$ and $D^{\text{original}}=\{(u_D)_j^{\text{original}},u_j^{\text{original}}\}_{j=1}^{26,523}$, respectively. Finally, to create a structured grid for the proposed IFNOs, we applied a cubic spline interpolation to map the displacement field onto a structured 21 × 21 node grid.
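A minimal sketch of the MLS smoothing step, assuming a Gaussian weight function and a linear polynomial basis (our illustrative choices; the actual shape functions follow Refs. [55,56]):

```python
import numpy as np

def mls_smooth(points, values, query, radius=0.3):
    """Moving least-squares smoothing sketch: at each query point, fit a
    local linear polynomial to nearby nodal values with Gaussian weights
    and evaluate the fit there."""
    smoothed = np.empty(len(query))
    P = np.column_stack([np.ones(len(points)), points])  # basis [1, x, y]
    for i, q in enumerate(query):
        d2 = np.sum((points - q)**2, axis=1)
        w = np.exp(-d2 / radius**2)
        A = P.T @ (w[:, None] * P)          # weighted normal equations
        rhs = P.T @ (w * values)
        coef = np.linalg.solve(A, rhs)
        smoothed[i] = coef @ np.array([1.0, q[0], q[1]])
    return smoothed

rng = np.random.default_rng(4)
pts = rng.uniform(size=(200, 2))                      # scattered DIC nodes
u = 2.0 * pts[:, 0] + 3.0 * pts[:, 1]                 # a linear field
u_noisy = u + 0.01 * rng.normal(size=200)             # measurement noise
grid = np.column_stack([g.ravel() for g in np.meshgrid(
    np.linspace(0.2, 0.8, 5), np.linspace(0.2, 0.8, 5))])
u_s = mls_smooth(pts, u_noisy, grid)
```

With a linear basis, the MLS fit reproduces linear displacement fields up to the noise level, which is the consistency property that makes it suitable for denoising DIC data.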

### 3.2 Baseline: Constitutive Modeling.

As the first baseline, we considered a Fung-type model with the strain energy density function

$$W=\frac{c}{2}\left(e^{Q}-1\right),\qquad Q=a_1E_{11}^2+a_2E_{22}^2+2a_3E_{11}E_{22}$$

where *c*, *a*_{1}, *a*_{2}, and *a*_{3} are the model parameters to be determined, and *E*_{11}, *E*_{22} are the principal Green–Lagrange strains in the *x*- and *y*-directions, respectively.

As the second baseline, we considered an invariant-based constitutive model. Herein, $c_i$ ($i=0,1,2,3$) and $w$ are the model parameters to be determined, where $w\in[0,1]$ denotes the degree of material anisotropy, and $I_1=\mathrm{tr}(\mathbf{C})$ and $I_4=\mathbf{m}\cdot\mathbf{C}\mathbf{m}$ are the invariant and pseudo-invariant of the right Cauchy–Green deformation tensor $\mathbf{C}=\mathbf{F}^{\mathrm{T}}\mathbf{F}$, respectively. In this study, we consider the direction of the collagen fibers in the reference configuration, $\mathbf{m}$, to be the circumferential direction (i.e., $\mathbf{m}=[1,0,0]^{\mathrm{T}}$).

As the third baseline, we considered a structure-informed model. Here, $p$ is the penalty term used to enforce tissue incompressibility, $J=\det(\mathbf{F})=1$, which can be *analytically* determined by further applying the plane-stress condition [59]; $E_f=\mathbf{N}^{\mathrm{T}}(\theta)\mathbf{E}\mathbf{N}(\theta)$ is the fiber strain along the fiber direction $\mathbf{N}(\theta)=[\cos(\theta),\sin(\theta),0]^{\mathrm{T}}$; $\mathbf{E}=(\mathbf{C}-\mathbf{I})/2$ is the Green–Lagrange strain; and $\mathbf{I}$ is the identity tensor. For the fiber stress–strain behavior, we used an exponential model with a terminal stiffness for numerical stability,

$$S_f(E_f)=\begin{cases}c_0\left(e^{c_1E_f}-1\right), & E_f\le E_{ub}\\ c_0\left(e^{c_1E_{ub}}-1\right)+c_0c_1e^{c_1E_{ub}}\left(E_f-E_{ub}\right), & E_f>E_{ub}\end{cases}$$

where *c*_{0} and *c*_{1} are the material parameters and $E_{ub}$ is the threshold fiber strain for the transition to a linear fiber tangent modulus. Finally, we used a Gaussian distribution function, with zero mean and fiber dispersion *σ*, for $\Gamma_\theta$. Thus, for the structure-informed model, there are four parameters to be determined from optimization: $\mu_m$, *c*_{0}, *c*_{1}, and *σ*, whereas $E_{ub}$ is precalculated as the strain corresponding to a predetermined stress threshold value of 10 MPa for a given pair of *c*_{0} and *c*_{1}.
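Assuming the terminal stiffness is taken as the tangent modulus at $E_{ub}$ (our illustrative reading of the transition; the paper's exact form may differ), the piecewise fiber stress and the precalculation of $E_{ub}$ can be sketched as:

```python
import numpy as np

def fiber_stress(Ef, c0, c1, Eub):
    """Exponential fiber stress with a linear continuation beyond Eub,
    using the exponential branch's tangent modulus at Eub (assumed C1
    transition; illustrative, not the authors' exact implementation)."""
    Ef = np.asarray(Ef, dtype=float)
    S_exp = c0 * (np.exp(c1 * Ef) - 1.0)
    S_ub = c0 * (np.exp(c1 * Eub) - 1.0)       # stress at the threshold
    k_ub = c0 * c1 * np.exp(c1 * Eub)          # tangent modulus at Eub
    return np.where(Ef <= Eub, S_exp, S_ub + k_ub * (Ef - Eub))

c0, c1 = 10.0, 20.0   # hypothetical material parameters
# Eub precalculated as the strain where the exponential branch reaches a
# predetermined stress threshold (10 MPa = 1.0e4 kPa in these toy units):
Eub = np.log(1.0e4 / c0 + 1.0) / c1
```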

In this work, the constitutive model parameters were obtained by nonlinear least-squares fitting to the biaxial stress–stretch data of the training samples. In brief, the first Piola–Kirchhoff stresses in the *x*- and *y*-directions were determined from the specimen thickness $L_z$, the undeformed edge lengths $L_x$ and $L_y$, and the measured forces $F_x$ and $F_y$: $P_{11}=F_x/(L_yL_z)$ and $P_{22}=F_y/(L_xL_z)$. The two stretches were calculated as the ratios of the deformed to the undeformed edge lengths. To obtain the optimal parameters for the different models, differential evolution optimization was employed to minimize the residual mean squared error in stress between the experimental data and the model prediction [61]. Finally, using the determined model parameters, finite element simulations were performed in Abaqus [62] with the DIC-tracked nodal displacements prescribed as boundary displacement conditions. The relative errors of the displacement fields were then evaluated by comparing the finite element solutions with the DIC-based measurements. In the following, we will refer to these baseline approaches as the "Fung model" method, the "invariant-based" method, and the "structure-informed" method.
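The stress computation amounts to simple arithmetic; for example, with hypothetical dimensions and forces (not the experimental data):

```python
# First Piola-Kirchhoff stresses from load-cell forces and specimen
# dimensions, as described above. Numbers are illustrative only.
Lx, Ly, Lz = 9.0e-3, 9.0e-3, 0.5e-3   # undeformed edge lengths, thickness [m]
Fx, Fy = 0.9, 0.75                     # measured forces [N]
P11 = Fx / (Ly * Lz)                   # stress in the x-direction [Pa]
P22 = Fy / (Lx * Lz)                   # stress in the y-direction [Pa]
```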

### 3.3 Numerical Study Scenarios.

Based on the seven mechanical testing protocols listed in Table 1, four study scenarios are considered to evaluate the *interpolative* and *extrapolative* performances of the proposed neural operator learning methods. In each study scenario, a subset of the samples was selected to form the training set and to obtain the optimal neural operator by solving (2.2) and (2.6). Then, displacement field predictions were made for the remaining samples and compared with the ground-truth displacement fields from the DIC measurements, to evaluate the predictivity and generalizability of our proposed methods. Due to the relatively large number of samples, it is generally intractable to perform finite element analysis for all 26,523 samples in the constitutive modeling approaches. To reduce the computational cost, although we train all models on samples from all cycles, we evaluate the training and testing errors of the three constitutive modeling approaches only on samples from the first loading/unloading cycle of each protocol. We consider the averaged relative displacement error, $\|u_{pred,j}-u_j\|_{L^2(\Omega)}/\|u_j\|_{L^2(\Omega)}$, as the error metric, so as to provide a fair comparison between the constitutive modeling and our neural operator learning approaches. Here, $u_j$ denotes the *j*th sample from the DIC measurement, and $u_{pred,j}$ is the prediction from either the neural operator or the corresponding baseline constitutive model for this sample.
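The error metric can be computed in a few lines (sketch; the grid shape and sample values are illustrative):

```python
import numpy as np

def relative_l2_error(u_pred, u_true):
    """Samplewise relative displacement error,
    ||u_pred - u_true||_L2 / ||u_true||_L2, discretized over the grid."""
    return np.linalg.norm(u_pred - u_true) / np.linalg.norm(u_true)

u_true = np.ones((21, 21, 2))   # a toy displacement sample on a 21x21 grid
u_pred = 1.1 * u_true           # a prediction with a uniform 10% error
err = relative_l2_error(u_pred, u_true)
```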

#### 3.3.1 Study 1.

We mixed the samples from all seven protocols, randomly selected 83% of them for training, and used the remainder for validation (10% of samples) and testing (7% of samples). In this scenario, we ensured that the boundary conditions of the samples in the testing set lie inside the training region. With this study, we not only investigate the *in-distribution* predictivity of the proposed method but also examine the required amount of training data by reporting the errors obtained with different numbers of (randomly) selected training samples.

#### 3.3.2 Study 2.

For this study, we employed protocols 1, 2, and 4 for training and protocols 3, 5, 6, and 7 for testing. In this study and the two studies below, the samples from the second loading/unloading cycle of the testing protocols are employed as the validation dataset for hyperparameter tuning, while the first cycle is reserved as the test dataset. We notice that the testing protocols are not covered in any of the training sets and have smaller maximum tensions than the training sets. Hence, with this study, we aimed to investigate the performance of the proposed IFNO methods in predicting the *out-of-distribution material responses in the small deformation regime*.

#### 3.3.3 Study 3.

We used protocols 1, 6, and 7 for training and protocols 2–5 for testing. The protocols considered in testing were not covered in any of the training protocols, although the deformation range of the testing protocols may fall inside that of the training ones. Hence, we attempted to illustrate the *out-of-distribution prediction in the intermediate deformation regime*.

#### 3.3.4 Study 4.

We used protocols 2–7 for training, and protocol 1 for prediction. We notice that the equibiaxial tension protocol ($P_{11}:P_{22}=1:1$) is not covered in any of the other sets, and protocol 1 exhibits the largest maximum tensions among all the sets. Hence, with this study, we aimed to investigate the *out-of-distribution predictivity in the large deformation regime* of the proposed method.

## 4 Results and Discussions

In this section, we illustrate the performance of the proposed neural operator learning approaches. All numerical experiments were performed on a machine with a 2.8 GHz 8-core CPU and a single Nvidia (Santa Clara, CA) RTX 3060 graphics processing unit, using a PyTorch implementation modified from the package provided in Ref. [27]. The optimization was performed with the Adam optimizer. For all IFNOs, we set the dimension of $h$ to $d=16$ and the number of truncated Fourier modes to $k=8\times 8$, with $L=12$ hidden layers. The network was trained with the shallow-to-deep training procedure: we initialized the $L$-layer network parameters from the trained $(L/2)$-layer IFNO model. For each depth $L$, we trained the network for 1000 epochs with a learning rate of $3\times10^{-3}$, which was decreased by a ratio of 0.5 every 100 epochs. For all PG-IFNOs, we took the penalty parameter $\gamma=1.0$, although we note that this parameter could be further hand-tuned or optimized to potentially achieve better performance.
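The stagewise learning-rate schedule described above can be expressed compactly as follows (a sketch of the stated settings, not the released training code):

```python
def lr_schedule(epoch, lr0=3e-3, decay=0.5, every=100):
    """Learning rate at a given epoch within one training stage:
    start at lr0 and multiply by `decay` every `every` epochs."""
    return lr0 * decay ** (epoch // every)
```

In a PyTorch training loop, the same schedule corresponds to a step decay applied to the Adam optimizer every 100 epochs.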

### 4.1 Study 1: In-Distribution Prediction.

To verify the model's predictivity for in-distribution learning tasks, in this study we randomly selected 83% of the samples from all protocols to form the training set and then built the vanilla IFNO model and the three baseline constitutive models on this common training set. Figure 3 (*top*) shows the relative displacement errors when using different amounts of training samples, and the samplewise errors for each model are provided in Fig. 3 (*bottom*). When comparing the results between the original dataset and the smoothed dataset, one can observe that their samplewise errors present a similar trend, while the smoothing procedure slightly improves the prediction accuracy for both the IFNO and the three baseline models. Perhaps unsurprisingly, from the left of Fig. 3 (*top*), one can see that the accuracy of the IFNO improves when using more training samples. In particular, the test error decreases at a convergence rate of approximately $O(N^{-0.34})$ as the training dataset size *N* increases. With only 45 samples, the IFNO achieves an accuracy comparable to that of the three constitutive models trained on all 22,000 samples. When using all 22,000 measurements, the IFNO outperforms the conventional constitutive modeling approaches by around one order of magnitude, on both the original and smoothed datasets. To provide further insights into this comparison, in Fig. 4 we visualize both the *x*- and *y*-displacement solutions and the prediction errors obtained with the IFNO and the structure-informed model (the best baseline model) on two test samples, corresponding to a small-deformation representative (sample #1) and a large-deformation representative (sample #2), respectively.
The structure-informed model, which considers the homogenized stress–strain response at a single material point (i.e., the center of the specimen) due to the limited information about the spatial variation in the stress measurement, fails to capture material heterogeneity and hence exhibits large prediction errors in the interior region of the TVAL specimen domain. This observation confirms the importance of capturing material heterogeneity and verifies the capability of the IFNOs in heterogeneous material modeling.

### 4.2 Study 2: Out-of-Distribution Prediction on the Small Deformation Regime.

In this study, the three protocols with the largest tensions (i.e., sets 1, 2, and 4) were used for training, and the other four protocols (sets 3, 5, 6, and 7, as listed in Table 1) were used for prediction validation. Since the prediction sets involve biaxial loading ratios unseen in the training samples, this is an *extrapolative* learning task in the small deformation regime. Figure 5 (*left*) provides the relative displacement errors from all models. One can see that, compared with the interpolative prediction task of study 1, the extrapolative predictions are less effective for the neural operator. In particular, for the vanilla IFNO model, while the training error remained relatively low (i.e., the model still possessed good expressivity on sets 1, 2, and 4), the testing error deteriorated by a factor of ten, reaching a level similar to that of the Fung-type model but slightly higher than those of the invariant-based and structure-informed models. Perhaps unsurprisingly, this observation again verifies the sensitivity of machine learning models in extrapolative tasks (see, e.g., Ref. [9]). Figure 5 (*right*) shows the samplewise errors from the original dataset for each model; the results on the smoothed dataset exhibit a similar trend. One can observe that for the three baseline models the prediction errors are relatively similar across all four testing sets, while large errors are observed in sets 6 and 7 (the sets with the smallest maximum tensions) for the vanilla IFNO model. These sets are the furthest from the training sets, and hence their sample distributions differ substantially from those of the training sets.

In this study, we also investigated the performance of the proposed PG-IFNOs. By infusing the no-permanent-deformation constraint, an improvement in the testing error was observed on both the original and smoothed datasets. From the samplewise errors, we can see that the invariant-based and structure-informed models outperform the PG-IFNO mostly on sets 6 and 7, which are the protocols in the small deformation regime. As mentioned above, these sets differ greatly from the training set, so the neural operator is expected to be less accurate on them. On the sets whose deformation is closer to the training set, i.e., sets 3 and 5, the PG-IFNO model shows similar or even better performance than the baseline models. This is verified by the solutions and prediction errors on a representative sample in set 5, as depicted in Fig. 6. Compared with the structure-informed method, the PG-IFNO model has a lower solution error and captures the material heterogeneity. Hence, these results suggest that sufficient coverage of the sample distribution in the training protocols is critical for neural operator learning methods. Even though the constitutive modeling approach can achieve a lower solution error when the predefined model form exhibits good generalizability, as with the invariant-based and structure-informed models, the neural operator approach is superior at capturing heterogeneous features. On these challenging extrapolative learning tasks, incorporating proper physics constraints appears to make the neural operator learning more versatile.
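The no-permanent-deformation constraint penalizes any nonzero displacement the network predicts under zero boundary loading. The exact hybrid loss is defined in Eq. (2.5) of the paper; the following is only a schematic sketch of how such a penalty term could enter the loss, with hypothetical names:

```python
import numpy as np

def hybrid_loss(u_pred, u_true, u_zero_pred, gamma=1.0):
    """Schematic physics-guided hybrid loss: data misfit plus a
    penalty that drives the predicted displacement under zero
    loading toward zero (no permanent deformation)."""
    data_term = np.mean((u_pred - u_true) ** 2)
    # u_zero_pred: network output when the input boundary loading is zero
    physics_term = np.mean(u_zero_pred ** 2)
    return data_term + gamma * physics_term
```

During training, the physics term is evaluated by feeding a zero-loading input through the network alongside each data batch; the penalty weight corresponds to the parameter $\gamma$ discussed above.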

### 4.3 Study 3: Out-of-Distribution Prediction on the Intermediate Deformation Regime.

In this study, protocol sets 1, 6, and 7 were used for model training, while the remaining sets (protocols 2–5) were used for prediction validation. As such, the prediction sets again have tension ratios unseen in the training sets, but the deformation range of the testing protocols lies *within* that of the training ones. In Fig. 7 (*left*), the relative displacement errors are provided. We can see that the testing errors from the IFNOs are still much larger than their respective training errors, due to the out-of-distribution nature of this study. However, the IFNOs outperform all three constitutive models in prediction error by at least 1.15% and 0.67% on the original and smoothed datasets, respectively. Therefore, as long as the dataset has sufficient coverage of the deformation range, neural operator learning methods can be more effective than traditional constitutive models, even in the more challenging out-of-distribution tasks. To provide further insight into this comparison, Fig. 7 (*right*) depicts the samplewise errors on the original (unsmoothed) dataset. One can see that the prediction errors are almost uniform across all protocol sets. In this study, the PG-IFNO does not substantially improve the prediction accuracy compared with the original IFNO, possibly because the physics constraint stems from the zero-loading case and is more helpful in guiding predictions in the *small* deformation regime.

### 4.4 Study 4: Out-of-Distribution Prediction on the Large Deformation Regime.

In the last comparative study, the experimental protocol selected for prediction was the one with the largest maximum tension, with the aim of evaluating the models' extrapolative prediction abilities in the large deformation regime. From Fig. 8, one can see that the IFNO becomes less effective on both datasets, while the Fung-type model exhibits a better fit to both the original and smoothed testing datasets. Moreover, the physics constraint does not help much in this study. These results suggest that when the training protocols provide insufficient coverage of the deformation range, a judiciously designed physics constraint tailored to the target data range becomes important for ensuring reliable predictions from neural operator learning methods.

## 5 Conclusion

In this work, we have applied the neural operator learning method to modeling the mechanical responses of a biological tissue specimen under different loading conditions. Specifically, a data-driven computing workflow has been proposed, which learns the material model directly from DIC displacement tracking measurements and integrates material identification, modeling, and material response prediction into one unified learning framework. With the proposed neural operator learning, the mechanical response of the tissue specimen is modeled as a data-driven solution operator from the boundary loading to the resultant displacement field, and the learned model is applicable to unseen loading conditions. To verify its efficacy on real-world soft tissue response learning tasks that feature spatial heterogeneity, measurement noise, and anisotropic and nonlinear behaviors, we have used the proposed workflow to model a porcine TVAL specimen with DIC measurement data collected from biaxial and constrained uniaxial tension tests. In the in-distribution validation case, the proposed neural operator learning method significantly outperformed the conventional constitutive modeling approach, although its predictions on out-of-distribution learning tasks were less effective. To improve the model's generalizability on out-of-distribution tasks, we have further guided the neural operator learning method toward physically consistent predictions for tissue at rest and proposed a physics-guided neural operator learning approach. Numerical studies have demonstrated substantial improvements in generalization performance in the small deformation regime. Moreover, once the network is fully trained, predicting the displacement field under a new and unseen loading only requires a forward pass through the neural operator model.
When tested on a single CPU core, on average it takes only 0.012 s for the neural operator to predict one sample of the displacement field, while the same task takes 0.537 s in the baseline finite element solver. Hence, these results suggest that, with sufficient coverage of the training sample distribution and/or properly designed physics constraints, the neural operator learning approach could offer an alternative that outperforms the conventional phenomenological models in complex and heterogeneous material modeling tasks in both accuracy and efficiency.
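Per-sample timings of this kind can be obtained by averaging wall-clock time over repeated forward passes. A minimal harness, with hypothetical names and not the paper's benchmarking code, could look like:

```python
import time

def average_runtime(fn, n_repeats=100):
    """Average wall-clock time (seconds) of a callable over
    `n_repeats` calls, as one would do to compare the per-sample
    cost of a surrogate model against a reference solver."""
    start = time.perf_counter()
    for _ in range(n_repeats):
        fn()
    return (time.perf_counter() - start) / n_repeats

# Usage sketch: time a stand-in for a model's forward pass.
t = average_runtime(lambda: sum(range(1000)))
print(f"average runtime: {t:.2e} s")
```

In practice one would warm up the model first (to exclude one-time setup cost) and pin the benchmark to a single CPU core, matching the measurement setup described above.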

Despite the encouraging results presented herein, numerous questions and potential extensions require further investigation. For example, although the proposed no-permanent-deformation constraint is effective in the small deformation regime, it has little impact on the prediction accuracy in the large deformation regime. As a natural extension, we will consider the enforcement of other physics constraints, which could further enhance the method's *extrapolative* predictivity. Another natural extension is the generalizability to other specimens with different computational domains and/or microstructures. We point out that in this work, the neural operator model was trained by assuming that the tissue has an *unchanged* microstructure, geometry, and biomechanical properties, leading to fixed learned network parameters *θ*. To achieve generalizability across different specimens, one possible approach is to allow *θ* to vary across geometries and microstructures and to adapt the model using transfer-learning techniques such as the metalearning methods proposed in Ref. [63]. Similarly, translating the currently trained model to whole-organ simulations would be another interesting generalization problem. Another important next step is to consider other boundary loading scenarios in our learning framework, such as the traction loading problems demonstrated in the synthetic dataset example of Ref. [22]. Moreover, in this study, we have considered three homogeneous phenomenological models, the Fung-type, invariant-based, and structure-informed models, as the baseline constitutive models, because of their popularity and proven efficacy in soft tissue modeling [1,64,65].
However, we also point out that by considering other constitutive models, such as the Holzapfel model [66], and/or incorporating the material microstructure into the conventional model, one might further improve the accuracy of the conventional constitutive modeling approach. Finally, another question concerns the possibility of achieving improved performance by optimizing the penalty parameter *γ* in the physics-guided hybrid loss function (2.5). It has been shown that an optimized penalty parameter can further enhance the accuracy and trainability of constrained neural networks (see, e.g., Refs. [67–69]). Hence, the performance of the PG-IFNO might be further enhanced by designing effective algorithms that select appropriate weights in the hybrid loss function.

## Acknowledgment

H. You and Y. Yu would like to acknowledge support from the National Science Foundation under award DMS 1753031 and the AFOSR grant FA9550-22-1-0197. Portions of this research were conducted on Lehigh University's Research Computing infrastructure partially supported by NSF Award 2019035. C. Ross, C.-H. Lee, and M.-C. Hsu would like to acknowledge support from the Presbyterian Health Foundation Team Science Grant. C. Ross was in part supported by the National Science Foundation Graduate Research Fellowship Program (GRF2020307284). M.-C. Hsu was in part supported by the National Institutes of Health under award number R01HL142504.

## Funding Data

Air Force Office of Scientific Research (Grant No. FA9550-22-1-0197; Funder ID: 10.13039/100000181).

National Institutes of Health (Grant No. R01HL142504; Funder ID: 10.13039/100000002).

National Science Foundation (Grant Nos. DMS 1753031, 2019035, GRF2020307284; Funder ID: 10.13039/100000001).

Presbyterian Health Foundation (Team Science Grant; Funder ID: 10.13039/100001298).
