Convolutional neural network denoising autoencoders for intelligent aircraft engine gas path health signal noise filtering

Removing noise from health signals is critical in gas path diagnostics of aircraft engines. An efficient noise filtering/denoising method should remove noise without using future data points, preserve important changes, and promote accurate diagnostics without time delay. Machine Learning (ML)-based methods are promising for high fidelity, accuracy, and computational efficiency under the motivation of Intelligent Engines. However, previous ML-based denoising methods are rarely applied in actual engineering practice because they cannot accommodate time series, cannot effectively capture important changes, or are limited by the time delay problem. This paper proposes a Convolutional Neural Network Denoising Autoencoder (CNN-DAE) method to build a denoising autoencoder structure. In this structure, a convolutional operation is used to accommodate time series, and causal convolution is introduced to solve the problem of using future data points. The proposed denoising method is evaluated using NASA's Propulsion Diagnostic Method Evaluation Strategy (ProDiMES) software. The results show that the proposed method can accommodate time series, remove noise for improved denoising accuracy, and preserve the important changes for enhanced diagnostic information. NASA's blind test case results show that the Kappa Coefficient of a common diagnostic method using the processed data is 0.731, at least 0.046 higher than the other diagnostic methods in the open literature. Processing health signals using the proposed method would significantly promote accurate diagnostics without time delay. The proposed method could support intelligent condition monitoring systems by exploiting historical information for improved denoising and diagnostic performance.


INTRODUCTION
Gas turbine performance deterioration may be due to gradual degradation and/or abrupt or rapid faults of an engine [1]. Gradual performance degradation is normally due to fouling, erosion, corrosion, etc. [2], and it evolves on a slow timescale. Faults typically refer to sudden events and failures of the engine, such as Foreign Object Damage (FOD), Domestic Object Damage (DOD), bleed leaks and failures, variable geometry anomalies, actuator and instrumentation faults, and the like [3]. In general, gas path faults can be classified as component (for example, fan, compressor, and turbine) faults and sub-system (for example, sensor and actuator) faults [4]. Based on the evolution rate, faults may also be classified as abrupt faults that appear instantaneously but do not grow in magnitude over time and rapid faults that initiate and grow in magnitude over a short period [3].
Deviations in gas path measurements from an undamaged baseline engine, known as measurement deltas, are usually used for gas path diagnostics. Unfortunately, noise contaminates the measurement deltas, thereby reducing the signal-to-noise ratio [5]. This can hide key features in the signal. In addition, the key features always change with time and are contained in windows of time-series data. A key objective of gas turbine diagnostics is to determine faults' existence and location from the noisy data. An efficient denoising method should remove noise without using future data points, preserve important changes in time-series data, and promote accurate diagnostics without time delay.
In signal processing, filtering methods are used to process the data. Traditionally, filtering methods used by the gas turbine industry are moving average (MA) [6,7] and exponential moving average (EMA) [8,9]. MA is a special case of the finite impulse response (FIR) filter, and EMA is a special case of the infinite impulse response (IIR) filter. Linear filters such as the FIR filters and the IIR filters can distort the sharp changes in the signals, and they are also weak at outlier removal [10]. Details about both the FIR and IIR filters and their limitations for gas turbine health signal denoising are discussed by Ganguli [5]. Substantial research efforts have been devoted to finding suitable alternatives to linear filters that are robust or resistant to the presence of impulsive noise. Among these works, nonlinear filters such as median filters have been proposed for noise removal from gas turbine signals. Median filters, such as FIR median hybrid (FMH) filters [11], center weighted idempotent median (CWIM) filters [12], and recursive median (RM) filters [10,13,14], can preserve edges while simultaneously reducing noise. A disadvantage is the diagnostic time delay, as median filters must use future data points. Details about median filters and their limitations for gas turbine health signal denoising are discussed by Uday et al. [10].
ML-based methods are promising under the motivation of Intelligent Engines. ML-based denoising methods can likewise be classified as nonlinear filters. Auto-Associative Neural Network (AANN), also called autoassociator, autoencoder, or Diabolo Network [15], is a special neural network (NN) that can be used for denoising. The name Auto-Associative Neural Network is commonly used in gas turbine gas path diagnostics. The concept of using a NN with a bottleneck to concentrate information was first discussed in the context of "encoder/decoder" problems [16]. Such networks have primarily been used to extract sparse internal representations of any input and reduce its dimensionality. Vincent et al. [17] introduced the denoising autoencoder as an extension to classic autoencoders that is robust to noise. Noise filtering using autoencoders was introduced much earlier. The concept of AANN was introduced by Kramer [18] and was used for noise filtering, sensor replacement, and gross error detection and identification. AANN was introduced to gas turbine diagnostics by Guo et al. [19] for sensor validation. Then, Lu et al. [20] used AANN for filtering gas path measurement noise. Other studies have also been conducted on AANN in sensor fault diagnostics and noise filtering [21,22]. The name autoencoder is always used in the context of a deep network framework [17]. Hence, in this research, the name denoising autoencoder is used to describe the autoencoder for noise filtering, and the name conventional denoising autoencoder is used instead of AANN.
In ML-based gas turbine noise filtering, conventional denoising autoencoders are commonly used [5]. The number of input and output nodes for a conventional denoising autoencoder equals the number of measurements. Tensors are the basic data structure for all current ML systems. As containers for data, tensors are defined by the number of axes (rank), shape, and data type. One input sample for a conventional denoising autoencoder is a one-dimensional (1D) tensor, which consists of the measurements from selected sensor observations at a discrete-time instant. However, by using conventional denoising autoencoders, changes in the signals that evolve with time may not be adequately captured, resulting in poor noise filtering performance.
To accommodate time-series data, denoising autoencoders in a deep network setting that use windows of multivariate measurements (2D tensors) are investigated in this paper. The two fundamental deep-learning algorithms for sequence or time series modelling are Recurrent Neural Networks (RNNs) [23] and Convolutional Neural Networks (CNNs) [24]. RNNs are dedicated sequence models that maintain a vector of hidden activations propagated through time. This family of algorithms has gained tremendous popularity due to prominent applications in language modelling and machine translation [25]. Basic RNN algorithms are notoriously difficult to train, and more elaborate algorithms are commonly used instead, such as the Long Short-Term Memory Networks (LSTMs) [26] and the Gated Recurrent Units (GRUs) [27]. In this research, sequence modelling algorithms are integrated into the encoder and decoder of the denoising autoencoder to build a deep denoising autoencoder used for gas turbine gas path measurement noise filtering. By comparison, the denoising autoencoder structure based on CNN is selected to develop the efficient denoising method for gas turbine diagnostics.
The study's contribution is the novel Convolutional Neural Network Denoising Autoencoder (CNN-DAE) method that can remove noise without using future data points, preserve important changes in time-series data, and promote accurate diagnostics without time delay. Specifically,
• The denoising autoencoder structure can effectively remove the noise in health signals by reconstructing the denoised data from the noisy data.
• The CNN-DAE method can accommodate historical time-series data. The convolution operation can adequately capture changes in the signals that evolve with time.
• The causal convolution is introduced, where no future data points are needed. Accordingly, the diagnostic time delay problem can be solved.
• The proposed CNN-DAE denoising method can preserve the important changes in health signals for enhanced diagnostic information, which significantly improves diagnostic accuracy.

It is noted that the scope of the paper is denoising. However, a complete diagnostic solution shall include processing and diagnostic algorithms. To validate the effect of the proposed CNN-DAE denoising method on diagnostic performance, in addition to the necessary case studies on denoising performance, case studies on diagnostic performance are also included. A common ML diagnostic method is introduced in the application section to compare the diagnostic performance with and without CNN-DAE denoising.
The remainder of this paper is organized as follows. Section 2 introduces the methodology to develop the CNN-DAE method. Section 3 describes the case studies using the ProDiMES software from NASA. The results and discussions are provided accordingly in Section 4. Section 5 draws the conclusions.

METHODOLOGY

Set Up Denoising Autoencoders
A denoising autoencoder is a NN that receives a corrupted data point as input and is trained to predict the original, uncorrupted data point as its output. A denoising autoencoder includes three hidden layers: the mapping layer, the bottleneck layer and the demapping layer, as shown in Fig. 1. The mapping phase, where the input $x$ is transformed into the hidden representation $h$, is termed an encoder. A decoder is the part of an autoencoder where the input is reconstructed back to $\hat{x}$ from its hidden representation $h$. In Fig. 1, the encoding and decoding functions are denoted by $f_{\theta}(\cdot)$ and $g_{\theta'}(\cdot)$, respectively, and these mappings are correspondingly parametrized by vectors $\theta$ and $\theta'$. Typically, the mapping functions comprise an affine mapping followed by a certain nonlinearity. $\theta = \{W, b\}$ and $\theta' = \{W', b'\}$ are parameter sets, with $W$ and $W'$ denoting the weight matrices and $b$ and $b'$ representing the bias vectors. The goal of the autoencoder presented in Fig. 1 is to predict an uncorrupted output with the minimum reconstruction loss. The autoencoder presented in Fig. 1 has three hidden layers. However, more hidden layers can be used for autoencoders of higher complexity. Accordingly, the encoder and decoder will each comprise a series of mappings.
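As a minimal sketch of the encoder/decoder mappings described above (not the architecture or code used in this study), the following pure-Python example applies one affine mapping with a nonlinearity as the encoder and one affine mapping as the decoder. The layer sizes, the tanh activation, the linear output, and the random initialisation are all illustrative assumptions; a real denoising autoencoder would learn the weights by minimising the reconstruction loss.

```python
import math
import random


def affine(weights, bias, vector):
    """Return W @ vector + b for a weight matrix stored as a list of rows."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def random_layer(n_out, n_in, rng):
    """Hypothetical initialisation: small random weights and zero biases."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def encode(x, W, b):
    """Encoder: affine mapping followed by a tanh nonlinearity (assumed)."""
    return [math.tanh(a) for a in affine(W, b, x)]


def decode(h, W_prime, b_prime):
    """Decoder: affine mapping back to the input size (linear output assumed)."""
    return affine(W_prime, b_prime, h)


rng = random.Random(0)
W, b = random_layer(3, 7, rng)              # 7 measurements -> 3-unit bottleneck
W_prime, b_prime = random_layer(7, 3, rng)  # bottleneck -> 7 measurements

noisy = [0.2, -0.1, 0.4, 0.0, 0.3, -0.2, 0.1]  # one made-up noisy sample
hidden = encode(noisy, W, b)
reconstruction = decode(hidden, W_prime, b_prime)
print(len(hidden), len(reconstruction))
```

The weights here are untrained, so the reconstruction is meaningless as a denoised signal; the example only shows how the two parameter sets are applied in sequence.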

Set Up CNN-DAEs
CNN-DAEs are denoising autoencoders that use convolution in place of general matrix multiplication in at least one of the hidden layers. The name "Convolutional Neural Network (CNN)" indicates that the network employs a mathematical operation called convolution. In this research, to accommodate 2D data of shape (timesteps, measurements), which can be viewed as a 1D grid taking samples of shape (measurements,) at a regular time interval, one-dimensional Convolutional Neural Networks (1DCNNs) [28,29] are used.
A CNN hidden layer combines some of the following functions: convolution, pooling, and nonlinear activation [30]. The pooling function provides an approach to downsample, which produces invariance to local translation. Pooling is not used in the proposed CNN-DAE because it is not useful when the priority is on temporal order and feature location. Moreover, pooling can complicate autoencoder architectures that use top-down information.

Convolution Operation for Signal Processing
Convolution, in its most general form, is an operation on two functions of a real-valued argument. In the research context, suppose a sensor provides a single output $x(t)$ at time $t$ and the sensor is somewhat noisy. From the point of view of signal processing, a weighting operation similar to the weighting coefficients in an FIR filter [5] is used to filter the measurements. If a weighting function $w(a)$ is applied at every moment, a new function $s$ providing a smoothed estimate is obtained:

$$ s(t) = \int x(a)\, w(t - a)\, da $$

This operation is termed convolution. The convolution operation is typically denoted with an asterisk:

$$ s(t) = (x * w)(t) $$

The first argument (in this case, the function $x$) is generally referred to as the input. The second argument (in this case, the function $w$) is commonly referred to as the kernel. The output is usually referred to as the feature map.

Fig. 2. An example of convolution operation [28].
In reality, a sensor cannot produce measurements at every instant in time. Generally, when dealing with data on a computer, time is discretized, and one sensor will produce data at regular intervals. In this case, a more realistic assumption could be that one sensor produces a measurement, for example, once per second. Then, the time index $t$ will only be able to take on integer values. If it is assumed that $x$ and $w$ are defined only on integer $t$, the discrete convolution can be described as:

$$ s(t) = (x * w)(t) = \sum_{a=-\infty}^{\infty} x(a)\, w(t - a) $$

Since the convolution operation is commutative, the discrete convolution can be equivalently written as:

$$ s(t) = (w * x)(t) = \sum_{a=-\infty}^{\infty} x(t - a)\, w(a) $$

Generally, many NN libraries implement the cross-correlation function but call it convolution. In this research, the convention of calling both operations convolution is followed. An example of convolution (without kernel flipping) is shown in Fig. 2. The input is a 1D tensor of shape (10,). The kernel size is four. The other factor that can influence convolution is the notion of strides. The description of convolution so far has assumed that the center tiles of the convolution windows are all contiguous. But the distance between two successive windows is a parameter of the convolution, called its stride. The length of the 1D convolution window is four, and the stride length of the convolution is one. Discrete convolution can be viewed as multiplication by a matrix.
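The setting described above (input of shape (10,), kernel size four, stride one, no kernel flipping) can be sketched in a few lines of pure Python. This is an illustrative example only; the function name `conv1d_valid` and the kernel weights are assumptions, not values taken from Fig. 2.

```python
def conv1d_valid(x, w, stride=1):
    """Slide the kernel w over x without flipping it (cross-correlation).

    Only positions where the kernel fits entirely inside x are evaluated,
    so the output has (len(x) - len(w)) // stride + 1 elements.
    """
    k = len(w)
    return [
        sum(w[a] * x[t + a] for a in range(k))
        for t in range(0, len(x) - k + 1, stride)
    ]


x = [float(v) for v in range(10)]   # input of shape (10,)
w = [0.25, 0.25, 0.25, 0.25]        # kernel of size four (illustrative weights)
s = conv1d_valid(x, w, stride=1)    # feature map
print(len(s), s)
```

With these inputs the feature map has seven elements, one for each position at which the four-point window fits inside the ten-point input.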

Causal Convolution
In a CNN-DAE, some constraints and modifications may be required. An important constraint is that the model cannot violate the ordering in which the data is modelled: the prediction $p(x_{t+1} \mid x_1, \ldots, x_t)$ emitted by the model at timestep $t$ cannot depend on any of the future timesteps $x_{t+1}, x_{t+2}, \ldots, x_T$. Fig. 3 shows an example of modelling a 1D tensor with two conventional convolutional hidden layers. The input and output shapes are both (5,1). For the convolution operation, the kernel size is three, and the stride is one. To ensure the input and output are of the same shape, zero-padding is applied evenly to the left and right of the input. Zero-padding means adding zeros to the edge of the input matrix [30]. The output from conventional convolution depends on some of the future timesteps.

Fig. 4. An example of causal convolutional hidden layers.
To address the time delay issues, the 1D causal convolution adapted from WaveNet [31] is used in this research. In the causal convolution, the output at timestep $t$ is convolved only with elements from time $t$ and earlier, without depending on the input at the next timestep $t + 1$. This is implemented by shifting the output of a normal convolution by a few timesteps for the 1D tensor. Fig. 4 shows an example of modelling the same input using a two-hidden-layer causal CNN. The kernel size for the convolution operation is three and the stride is one. Zero-padding is applied to the left of the input.
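The causal constraint can be checked with a small pure-Python sketch. It assumes a single layer with kernel size three, stride one, and left zero-padding of two samples, as in the example above; the function name `causal_conv1d` and the kernel weights are illustrative, not taken from the trained model.

```python
def causal_conv1d(x, w):
    """Causal 1D convolution with stride 1; output length equals input length.

    The input is zero-padded on the left with len(w) - 1 zeros, so that
    out[t] depends only on x[t], x[t-1], ..., x[t-len(w)+1].
    """
    k = len(w)
    padded = [0.0] * (k - 1) + list(x)
    return [sum(w[a] * padded[t + a] for a in range(k)) for t in range(len(x))]


x = [1.0, 2.0, 3.0, 4.0, 5.0]        # input with five timesteps
w = [0.5, 0.25, 0.25]                # kernel size three (illustrative weights)
y = causal_conv1d(x, w)

# Change only the last two (future) timesteps and filter again.
x_changed = x[:3] + [100.0, -100.0]
y_changed = causal_conv1d(x_changed, w)

# The first three outputs are unaffected by the change in later inputs.
print(y[:3] == y_changed[:3])
```

Because each output is computed from the current and earlier samples only, it is available as soon as the current measurement arrives.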

Training of CNN-DAEs
A single input of a training sample is a 2D tensor with shape (timesteps, features), which is a window of multivariate health signals with noise. A single output of a training sample has the same shape as the input, and it is a window of multivariate health signals without noise.
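The forward propagation through each 1D convolutional layer can be expressed as:

$$ x_k^l = b_k^l + \sum_{i} \mathrm{conv1D}\left(w_{ik}^{l-1},\, y_i^{l-1}\right) $$

where $\mathrm{conv1D}(\cdot,\cdot)$ is a 1D causal convolution with zero-padding on the boundaries, $x_k^l$ is the input, $b_k^l$ is the bias of the $k$th neuron at layer $l$, and $y_i^{l-1}$ is the output of the $i$th neuron at layer $l-1$. $w_{ik}^{l-1}$ is the kernel (weight) from the $i$th neuron at layer $l-1$ to the $k$th neuron at layer $l$.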
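Let $l = 1$ and $l = L$ be the input and output layers, respectively. For an input vector $p$ and its corresponding output vector $[y_1^L, \ldots, y_{N_L}^L]$, let $[t_1, \ldots, t_{N_L}]$ be the target class vector. The mean absolute error (MAE) in the output layer can then be expressed as

$$ E_p = \frac{1}{N_L} \sum_{i=1}^{N_L} \left| y_i^L - t_i \right| $$

The network is trained by back-propagation. In each iteration, the forward and backward passes are followed by two further steps:
iii. PP: Post-process to compute the weight and bias sensitivities.
iv. Update the weights and biases with the (accumulation of) sensitivities.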
When the training process is complete, the well-trained CNN-DAE is ready for noise filtering.

Case Study Description
The investigated case study is based on the ProDiMES software, which provides a standard benchmark problem enabling users to develop, evaluate, and compare diagnostic methods. An introduction to ProDiMES is given in reference [4], and detailed instructions on its application can be found in reference [8].
Two types of case studies are conducted in this section to evaluate the performance of the proposed CNN-DAE denoising method on noise filtering and diagnostics. To evaluate the effect of the proposed CNN-DAE denoising method on diagnostic performance, a single flat MLP classifier is developed for fault diagnostics; details are described in Section 4.2.

Data Generation
An Engine Fleet Simulator (EFS) in ProDiMES, based on a steady-state version of the NASA Commercial Modular Aero-Propulsion System Simulation (C-MAPSS) high-bypass two-spool turbofan engine simulation, is used to generate the data for this study. The EFS produces simulated "snapshot" engine measurements, with relevant measurement noise, as if collected from a fleet of engines over multiple flights. Within ProDiMES, engine operating conditions, deterioration profiles, fault magnitudes, and sensor noise are randomly generated to emulate realistic behaviour.
Four sets of data were used in this study. The first data set is the training data to develop denoising autoencoders for noise filtering. It includes noisy and noise-free measurements from a fleet of 18963 engines conducting 50 flights each. The second data set includes a fleet of 1896 engines, conducting 50 flights each, to train and validate the diagnostic algorithms. The description of the second data set is shown in Table 1. It is worth noting that the training data size could be modified during the training optimization. The third data set is the test data that includes a fleet of 9993 engines conducting 50 flights each. The fourth data set includes the blind test case data (i.e., a data set where the true fault state of the engines contained in the data set is unknown to the end-users) from NASA. It is noted that only takeoff data is used in the case studies. The parameter correction and deterioration elimination methods in this case study are identical to the methods given in the ProDiMES User's Guide [4]. As an initial step of data processing, all engine measurement data are corrected to the standard ISA condition at sea level to eliminate the impact of ambient conditions on the measurement data variations. Then, a gradual deterioration trend monitoring approach is applied to capture the gradual performance changes in the form of residuals, or measurement deltas, relative to a fleet average engine model or 50 percent deteriorated engine. For each individual engine, corrected data collected during each flight are then referenced against the fleet average engine model to calculate measurement deltas, $\Delta y_{i,j}(k)$, as the deviation of $y_{i,j}(k)$ from $y_{i,\mathrm{fleet\_avg}}(k)$, where $y_{i,j}(k)$ is the corrected value of the $i$th measurement collected on the $j$th engine during the $k$th flight, and $y_{i,\mathrm{fleet\_avg}}(k)$ is the fleet average engine value for the $i$th measurement at the corresponding pressure altitude, Mach number, and corrected fan speed values of the $k$th flight.
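Standardization is the process of transforming the data to center it by removing the mean value of each feature, then scaling it by dividing non-constant features by their standard deviation:

$$ z_{i,j}(k) = \frac{\Delta y_{i,j}(k) - \mu_i}{\sigma_i} $$

where $z_{i,j}(k)$ is the standard-scaled value of the $i$th measurement collected on the $j$th engine during the $k$th flight, and

$$ \mu_i = \frac{1}{M} \sum_{m=1}^{M} \Delta y_i^{(m)}, \qquad \sigma_i = \sqrt{\frac{1}{M} \sum_{m=1}^{M} \left( \Delta y_i^{(m)} - \mu_i \right)^2} $$

$\mu_i$ is the mean of the $i$th measurement in the training samples and $\sigma_i$ is the standard deviation of the $i$th measurement in the training samples. $M$ is the number of values of the $i$th measurement in the training samples, indexed here by $m$. Approximately standard normally distributed data are obtained with data standardization.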
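The standardization step can be illustrated with a short pure-Python sketch for a single measurement channel. The function names and the sample values are hypothetical; the statistics are fitted on the training samples only and then reused for any other data, and the population form of the standard deviation (division by the number of training values) is an assumption.

```python
import math


def fit_standardizer(train_values):
    """Return (mean, std) of one measurement over the training samples."""
    m = len(train_values)
    mean = sum(train_values) / m
    std = math.sqrt(sum((v - mean) ** 2 for v in train_values) / m)
    return mean, std


def standardize(values, mean, std):
    """Centre by the training mean and scale by the training std.

    Constant features (std == 0) are only centred, not scaled.
    """
    scale = std if std > 0 else 1.0
    return [(v - mean) / scale for v in values]


train_deltas = [1.0, 2.0, 3.0, 4.0]          # made-up training deltas
mean, std = fit_standardizer(train_deltas)
scaled = standardize(train_deltas, mean, std)
print(mean, std, scaled)
```

Fitting the mean and standard deviation on the training samples alone keeps the test and blind test data from influencing the scaling.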

Noise Filtering with Denoising Autoencoders
The training data set for the denoising autoencoders (the first set of data) has 18963 training samples (inputs), and each sample is a 2D tensor with shape (50, 7). The training data set for the denoising autoencoders also has 18963 training outputs, and each output is a 2D tensor with shape (50, 7). The shape of the 2D tensor means each engine conducts 50 flights and seven sensors are used for each flight. Fig. 6 shows an example of a takeoff training sample with an abrupt T24 sensor fault emerging at flight cycle 27 and the corresponding noise-free output. The data shown has been processed with parameter correction, deterioration elimination, and data standardization.

K-fold cross-validation
An initial CNN-DAE with three 1D convolution layers is developed. Then, an optimization process is conducted to decide the model size and hyperparameters. Bayesian Optimization [32] is used to optimize the denoising autoencoders. In the Bayesian Optimization method, firstly, a domain of hyperparameters is decided. In this case, the searching ranges for hyperparameters are defined in the second column of Table 2 to form a searching space. Secondly, an objective function takes in hyperparameters and outputs a score that indicates how well a set of hyperparameters performs on the validation set. In this case, MAE is chosen as the objective function. Thirdly, the next set of hyperparameters is selected based on a model of the objective function called a surrogate. Finally, each time the algorithm proposes a new set of candidate hyperparameters, it evaluates them with the actual objective function and records the result in a pair (score, hyperparameters). The best hyperparameters can then be selected from this history. The optimization results are shown in Table 2. Once training and optimization processes have been completed, the well-trained denoising autoencoder is ready to be used for testing.

RESULTS AND DISCUSSION
This section consists of two subsections. Section 4.1 states the test case results on the denoising performance of the proposed CNN-DAE denoising method. Section 4.2 states the blind test case results that evaluate the effect of the proposed CNN-DAE denoising method on diagnostic performance.

Test Case Results
The second data set described in Section 3.2 includes a fleet of 1896 engines, conducting 50 flights each, to test the denoising methods. In this case study, noisy and noise-free takeoff data are used as the input and corresponding output.
Two types of error criteria are used to obtain a quantitative idea of noise reduction. The MAE measures the difference between the filtered and the noise-free data. In the MAE criterion, the error is defined as:

$$ \mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| \hat{y}_i - y_i \right| $$

where $N$ is the number of measurements in the data set, $\hat{y}_i$ is the filtered measurement, and $y_i$ is the noise-free measurement.
A parameter called the noise reduction rate, which measures the efficiency of these filters in terms of noise reduction, is also used. The noise reduction rate is defined as:

$$ \eta = \frac{\mathrm{MAE}(\text{noisy}) - \mathrm{MAE}(\text{filtered})}{\mathrm{MAE}(\text{noisy})} \times 100\,(\%) \qquad (10) $$
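Both criteria are simple to compute; the following pure-Python sketch shows one way to do so, assuming the noise reduction rate compares the MAE of the noisy data with the MAE of the filtered data, both measured against the noise-free data. The function names and the four-point signals are made up for illustration.

```python
def mae(estimate, reference):
    """Mean absolute error between an estimate and the noise-free reference."""
    return sum(abs(e - r) for e, r in zip(estimate, reference)) / len(reference)


def noise_reduction_rate(noisy, filtered, clean):
    """Relative reduction in MAE achieved by filtering, in percent."""
    mae_noisy = mae(noisy, clean)
    mae_filtered = mae(filtered, clean)
    return (mae_noisy - mae_filtered) / mae_noisy * 100.0


clean = [0.0, 0.0, 1.0, 1.0]        # made-up noise-free signal
noisy = [0.4, -0.4, 1.4, 0.6]       # same signal with noise
filtered = [0.2, -0.2, 1.2, 0.8]    # output of a hypothetical filter

print(mae(filtered, clean), noise_reduction_rate(noisy, filtered, clean))
```

Under this reading of the definition, the figures reported later in this section are mutually consistent: an MAE of 0.187 at a noise reduction of 41.92% and an MAE of 0.155 at 51.86% both imply a noisy-data MAE of about 0.322.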

Comparison with Conventional Denoising Autoencoders
The conventional denoising autoencoder is the most widely used ML method in gas turbine gas path measurement noise filtering. Multilayer Perceptrons (MLPs), also often called feedforward neural networks or deep feedforward networks [30], are the quintessential learning models for conventional denoising autoencoders. An MLP-based denoising autoencoder (MLP-DAE) is developed and compared with the CNN-DAE. A typical MLP hidden layer is shown in Fig. 7. Feedforward propagation through such a layer is an affine mapping of the previous layer's outputs followed by a nonlinear activation. The training samples of an MLP can only be 2D tensors (matrices) of shape (samples, features). A single input sample is a 1D tensor (vector), which means there can only be one 0D tensor (scalar) for an input layer neuron. The number of input layer neurons of a typical MLP-DAE equals the number of measurements. There should be seven neurons for the typical MLP-DAE input layer in the case study, corresponding to the seven sensor measurement features. After training and optimization, a well-trained MLP-DAE with the shape of 7-32-6-32-7 is obtained and is named MLP1-DAE.
MLP1-DAE cannot consider temporal order; hence, the 3D tensors of shape (samples, timesteps, features) must be "flattened" into 2D tensors of shape (samples, timesteps*features). Details of the flattening process can be found in reference [28]. There should be 350 neurons for the flattened MLP-DAE input layer in the case study. After training and optimization, a well-trained MLP-DAE with the shape of 350-100-10-100-350 was obtained and was named MLP2-DAE. However, MLP2-DAE would induce a time delay because it uses future data points. It is noted that the data was processed with the parameter correction and data standardization processes before the noise filtering process. Table 3 shows a comparison of the denoising performance of the CNN-DAE against the other methods considered in this study. As shown in Table 3, with the flattened data, MLP2-DAE offers better denoising performance than MLP1-DAE. However, the denoising performance of the flattened methods will become worse as the sample length increases because the model and computation complexity increase for the MLPs.
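The flattening step can be sketched in pure Python with nested lists standing in for tensors. The function name `flatten_windows` and the placeholder values are illustrative; only the shapes (50 timesteps, seven features, 350 flattened inputs) come from the case study.

```python
def flatten_windows(samples):
    """Reshape (samples, timesteps, features) to (samples, timesteps * features).

    Values are laid out timestep by timestep, so the temporal order survives
    only as a position in the flat vector.
    """
    return [[value for timestep in window for value in timestep]
            for window in samples]


# Two made-up samples of shape (50, 7): 50 flights, seven sensors each.
windows = [[[float(t * 7 + f) for f in range(7)] for t in range(50)]
           for _ in range(2)]
flat = flatten_windows(windows)
print(len(flat), len(flat[0]))
```

Each flattened sample contains all 50 flights of the window at once, which is why a network fed with it has access to later flights when estimating earlier ones.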

Comparison with other Optional Signal Processing Methods
The EMA filter is the state-of-the-art linear filter, and the median filter is the state-of-the-art nonlinear filter. An EMA filter and a weighted recursive median (WRM) filter are developed for comparison purposes in this case study.
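The EMA filter is given as:

$$ \Delta y_{i,j}^{\mathrm{EMA}}(k) = \alpha\, \Delta y_{i,j}^{\mathrm{EMA}}(k-1) + (1 - \alpha)\, \Delta y_{i,j}(k) $$

where $\Delta y_{i,j}^{\mathrm{EMA}}(k)$ is the EMA of the $i$th measurement collected on the $j$th engine during the $k$th flight. The weighting between previous and current data is established by the constant $\alpha$ ($0 < \alpha < 1$). In the example solution, $\alpha$ was chosen to be 0.65.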
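A minimal pure-Python sketch of an EMA filter follows. It assumes that the constant weights the previous filtered value and that its complement weights the current measurement delta, and that the filter is initialised with the first sample; the initialisation and the step-change test signal are illustrative assumptions.

```python
def ema_filter(deltas, alpha=0.65):
    """Exponential moving average using only current and past samples.

    alpha weights the previous filtered value and (1 - alpha) weights the
    current sample (assumed form); the filter starts at the first sample.
    """
    smoothed = []
    previous = deltas[0]
    for value in deltas:
        previous = alpha * previous + (1.0 - alpha) * value
        smoothed.append(previous)
    return smoothed


# Made-up noise-free signal with an abrupt shift at the sixth flight.
signal = [0.0] * 5 + [1.0] * 5
out = ema_filter(signal, alpha=0.65)
print([round(v, 3) for v in out])
```

The filtered value moves only 35% of the way towards the new level at the flight where the shift occurs and approaches it gradually afterwards, which illustrates how a linear filter smooths out a sharp change.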
An $O$-point WRM filter [10] produces its output $y_{i,j}(k)$ as a median over a window of length $O = 2N + 1$, where $\mathrm{median}(\cdot)$ is a function that takes the $O$ points surrounding the central point and gives their median as the output. A disadvantage of the median filters is that they induce a time delay of $N$ time steps. In the present case, a parameter value of 11 is selected as it offers the best noise reduction performance. The fourth and fifth rows of Table 3 show the denoising performance of the EMA filter and WRM filter for the whole test data set. Fig. 10 and Fig. 11 visually represent the EMA filter and WRM filter effects on two samples of takeoff data in the test data set, respectively. The EMA filter provides a noise reduction of 41.92%, and the MAE of the method is 0.187. EMA filters are often used in gas turbine fault diagnostics to smooth data. But, as is typical in linear filters, they can also smooth out important signal features. The WRM filter provides a noise reduction of 51.86%, and the MAE of the method is 0.155. The WRM filter performs better than the EMA filter on noise filtering because it removes the noise while preserving important features such as the changes in the measurements caused by abrupt or rapid faults. Nevertheless, the CNN-DAE provides much better denoising performance than these two signal processing methods.
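To illustrate why a median filter needs future data points, the sketch below implements a plain centred median filter in pure Python. It is deliberately simpler than the WRM filter of reference [10]: it is neither weighted nor recursive, the window is truncated at the ends of the signal, and the half-width of two and the test signal are made up.

```python
from statistics import median


def centered_median_filter(x, half_width):
    """Median over a window of up to 2 * half_width + 1 points centred on k.

    The output at index k uses samples up to index k + half_width, so in an
    online setting it is available only half_width steps later.
    """
    out = []
    for k in range(len(x)):
        lo = max(0, k - half_width)
        hi = min(len(x), k + half_width + 1)
        out.append(median(x[lo:hi]))
    return out


# Made-up signal: an outlier at index 2 and an abrupt shift at index 6.
x = [0.0, 0.0, 9.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0]
y = centered_median_filter(x, half_width=2)
print(y)
```

In this example the outlier is removed and the abrupt shift stays sharp at its original position, but every output depends on the two samples that follow it.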

Comparison with Other Deep Denoising Autoencoders
In this study, sequence modelling algorithms, including LSTM [30] and a special deep NN called WaveNet [31], are used to construct deep denoising autoencoders for comparison. After setting up and optimizing the models, trained LSTM-based denoising autoencoders (LSTM-DAEs) and WaveNet-based denoising autoencoders (WaveNet-DAEs) were obtained and used in the test case for comparison purposes.
The sixth and seventh rows of Table 3 show the denoising performance of LSTM-DAE and WaveNet-DAE for the whole test data set. Fig. 12 and Fig. 13 visually represent the effects of LSTM-DAE and WaveNet-DAE on two samples of takeoff data in the test data set, respectively. It is observed that LSTM-DAE provides a noise reduction of 16.77%, and the MAE of the method is 0.268. The WaveNet-DAE provides a noise reduction of 31.37%, and the MAE of the method is 0.221. While LSTM and WaveNet are good sequence modelling algorithms, denoising methods based on them show much worse denoising performance than the CNN-DAE.

Discussion
An overall comparison for all the denoising methods described above is conducted, and the results are shown in Fig. 14.
• The EMA filter can reduce noise but also distorts the edges in the signal due to the MA process, which results in the loss of important features for diagnostics.
• The WRM filter can remove the noise while preserving important features in health signals. A primary disadvantage of the WRM filter is that it induces a time delay, which will result in high Detection Latency.
• While all the other noise filtering methods studied here use 2D tensors, MLP1-DAE is the only method that uses 1D tensors. It tends to smooth out the changes, especially when the change magnitudes are comparable with the noise magnitudes in time-series signals.
• MLP2-DAE can remove noise in the data and performs better at preserving important changes in health signals than the other methods except CNN-DAE. However, like the WRM filter, a disadvantage of MLP2-DAE is the time delay due to using future data points, which results in increased Detection Latency.
• Noise filtering methods based on sequence modelling algorithms, i.e., LSTM-DAE and WaveNet-DAE, are not suitable for this research's noise filtering problem.
• The proposed CNN-DAE method performs best among all these methods in removing noise and preserving important features for diagnostics. It can address the time delay problem by introducing causal convolution.

Blind Test Case Results
A case study using the blind test case data from NASA was conducted to evaluate the influence of the noise filtering process on the overall diagnostic performance. A single flat MLP classifier is developed for fault classification that includes detection and isolation. The initial training data set described in Section 3.2 is used for training and validating the MLP classifier. The initial training data set and the blind test case data from NASA are processed with the proposed data processing method. Blind test case diagnostic assessments of the MLP classifiers with the proposed CNN-DAE method were submitted to NASA for evaluation. The evaluation results of the proposed CNN-DAE method are shown in Fig. 16 of Appendix A and explained in the following subsections.

Detection Performance Metrics
Table 4 presents the True Positive Rate (TPR), False Positive Rate (FPR), False Alarm Rate (FAR), and Detection Latency metrics concerning all fault types, fault evolution rates, and fault magnitudes. FAR is the inverse of FPR and is presented in the third column of Table 4. It reflects the average number of flights required to generate a false alarm. A target FAR of 1000 flights or greater is specified in reference [8] to maintain uniformity in the diagnostic methods. All these methods satisfy the FAR target. With the same MLP classifier for fault diagnostics, the MLP with the CNN-DAE method provides the highest TPR of 65.7%, which is a 2.9% improvement over the method with the second-highest TPR score (the MLP with MLP2-DAE method). In most cases, some latency is associated with the correct detection of a fault, resulting in missed detections within the first few flights after the fault occurs. As shown in the fifth column of Table 4, the Detection Latency of the proposed diagnostic method is 2.69, which is lower than that of the other methods. In conclusion, the proposed MLP with the CNN-DAE method performs best on fault detection among all the compared methods by detecting fault events more accurately with the shortest Detection Latency.
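The relationship between these detection metrics can be sketched as follows. The sketch assumes the usual per-flight definitions from a two-class confusion matrix, which may differ in detail from the ProDiMES evaluation; the function name and the counts are hypothetical and are not results from this study.

```python
import math


def detection_metrics(tp, fn, fp, tn):
    """Return (TPR, FPR, FAR) from per-flight detection counts.

    FAR is the inverse of FPR, i.e. the average number of no-fault flights
    per false alarm; it is infinite when there are no false alarms.
    """
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    far = (fp + tn) / fp if fp else math.inf
    return tpr, fpr, far


# Made-up counts: 100 faulty flights and 2000 no-fault flights.
tpr, fpr, far = detection_metrics(tp=80, fn=20, fp=1, tn=1999)
print(tpr, fpr, far, far >= 1000)
```

In this example one false alarm in 2000 no-fault flights gives a FAR of 2000 flights, which would satisfy a target of 1000 flights or greater.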

Kappa Coefficient
Table 5 presents the average Kappa Coefficient results concerning all fault types, fault evolution rates, and fault magnitudes. The Kappa Coefficient reflects the overall fault classification performance. With the same MLP classifier for fault diagnostics, the MLP with the CNN-DAE method produces an overall Kappa Coefficient of 0.731. The Kappa Coefficient for abrupt faults is 0.824, and the Kappa Coefficient for rapid faults is 0.704. The MLP with the CNN-DAE method produces the highest Kappa Coefficient among the compared methods.
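For readers unfamiliar with the metric, the following minimal sketch computes a Kappa Coefficient from a classification confusion matrix. It assumes the standard Cohen's kappa definition, (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance; the function name and the example matrix are hypothetical and are not taken from the paper.

```python
def kappa_coefficient(confusion):
    """Cohen's kappa for a square confusion matrix (rows: true, cols: predicted)."""
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of cases on the diagonal.
    p_o = sum(confusion[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Chance agreement: product of marginal frequencies, summed over classes.
    p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1.0 - p_e)


if __name__ == "__main__":
    # Hypothetical two-class example: 85 of 100 cases classified correctly.
    print(kappa_coefficient([[45, 5], [10, 40]]))
```

A value of 1 indicates perfect classification, while a value of 0 indicates agreement no better than chance.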

Discussion
Table 6 summarises the blind test case study results and provides the ranking of the diagnostic methods for each evaluation metric. With the same MLP classifier for fault diagnostics, the MLP with the CNN-DAE method proposed in this paper ranks first in the TPR, Detection Latency, and Kappa Coefficient evaluation metrics. As a reflection of fault classification performance, the average Kappa Coefficient is enhanced to 0.731, and the Kappa Coefficient for abrupt faults is above 0.80. This clearly illustrates the importance of the noise filtering process for gas path diagnostic performance. The Detection Latency values for the MLP with MLP2-DAE method and the MLP with WRM method are 19.50 and 8.39, respectively, which are much higher than those of the other noise filtering methods. The reason for the high Detection Latency is the time delay due to the use of future data points. MLP1-DAE provides a noise reduction rate of 42.24% and ranks fourth among the studied methods; however, the average Kappa Coefficient for the MLP with MLP1-DAE method ranks seventh. The MLP with MLP1-DAE method has the worst diagnostic performance, which indicates that MLP1-DAE preserves critical features in the signal less well than the other studied noise filtering methods.

Table 7 summarises the blind test case metric results from other known diagnostic methods using the ProDiMES software for evaluation. Among them, the Regularized Extreme Learning Machines-Sparse Representation Classification (RELM-SRC) method [33] has the best diagnostic performance, with a Kappa Coefficient of 0.685. The Kappa Coefficient of the method proposed in this paper is 0.731, which is 0.046 higher than that of the RELM-SRC method.

CONCLUSIONS
In the context of Intelligent Engines, this study proposes a novel CNN-DAE method for aircraft engine gas path health signal denoising. The proposed method is evaluated with NASA's ProDiMES software.
The conclusions drawn from this study are as follows:
• The proposed denoising method can effectively remove the noise in health signals by reconstructing the denoised data from the noisy data. The proposed denoising method is superior in denoising performance to other optional denoising methods in the open literature.
• The proposed method can accommodate time-series data. The convolution operation can adequately capture changes in the signals that evolve with time. The causal convolution can solve the problem of using future data points.
• The MLP with CNN-DAE diagnostic method presents the best performance compared to other known diagnostic methods. The Kappa Coefficient of the proposed diagnostic method is 0.731 and is at least 0.046 higher than the other diagnostic methods. It is proved that the proposed CNN-DAE denoising method can preserve the important changes in health signals for enhanced diagnostic information, which significantly improves diagnostic accuracy.
Overall, the proposed method can accommodate time series without using future data points, remove noise for improved denoising accuracy and preserve the important changes in health signals for enhanced diagnostic information. The proposed method can potentially contribute to intelligent condition monitoring systems by effectively exploiting historical information for improved denoising and diagnostic performance, which will enhance the availability, reliability, and efficiency of Intelligent Engines.
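The role of causal convolution in avoiding future data points can be illustrated with a minimal sketch. It assumes a single-channel signal, left zero padding, and a kernel whose first weight applies to the current sample; the function name and the averaging kernel are hypothetical and do not reproduce the trained CNN-DAE layers.

```python
def causal_conv1d(signal, kernel):
    """One-dimensional causal convolution with implicit left zero padding.

    kernel[0] weights the current sample and kernel[j] weights the sample
    j steps in the past, so output[t] never depends on signal[t + 1:].
    """
    output = []
    for t in range(len(signal)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            if t - j >= 0:  # samples before the start are treated as zero
                acc += weight * signal[t - j]
        output.append(acc)
    return output


if __name__ == "__main__":
    kernel = [0.5, 0.5]  # hypothetical two-tap averaging kernel
    original = causal_conv1d([1.0, 2.0, 3.0, 4.0], kernel)
    # Changing the last (future) sample leaves the earlier outputs unchanged.
    perturbed = causal_conv1d([1.0, 2.0, 3.0, 100.0], kernel)
    print(original[:3] == perturbed[:3])
```

Because each output depends only on the current and past samples, such a filter can be applied as each new flight arrives, without waiting for later measurements.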
As an ML method, the proposed method has limitations for applications in aircraft engine diagnostics: it needs a large amount of labelled training data, it is case-dependent, and it relies on offline training. Fortunately, emerging technologies such as Digital Twin, Intelligent Engine, and Incremental Online Learning provide potential opportunities for ML applications. As for future work, further studies addressing these application limitations would be worthwhile.

Fig. 1. The architecture of a denoising autoencoder with three hidden layers.

Fig. 5. A typical hidden layer consists of convolution and activation functions.

As shown in Fig. 5, a hidden layer of a CNN-DAE consists of a convolution operation followed by a nonlinear activation function. The final output of the  ℎ neuron at layer  is    . In each layer, one-dimensional forward propagation (1D-FP) is expressed as follows:

Fig. 6. A takeoff training sample and the output with a -1.47σ abrupt T24 sensor fault at flight 27.

Fig. 8 and Fig. 9 visually represent the effects of MLP1-DAE and CNN-DAE on two samples in the test data set, respectively. They illustrate that MLP1-DAE has poorer denoising performance when faults emerge, especially when the fault magnitudes are comparable with the noise magnitudes. MLP1-DAE performs well in learning the information from discrete snapshots. However, continuous 2D data representing the initiation and growth in magnitude of a fault event is needed. With conventional denoising autoencoders, changes in the signals that evolve with time may not be adequately captured, eventually resulting in poor denoising performance.

Fig. 14. MAE of different filters for the test data set.

Fig. 15. Noise reduction rate (%) of different filters for the test data set.
AANN = Auto-Associative Neural Network
BP = Back Propagation
C-MAPSS = Commercial Modular Aero-Propulsion System Simulation
CNN = Convolutional Neural Network
CNN-DAE = Convolutional Neural Network Denoising Autoencoder
CWIM = Center Weighted Idempotent Median
D = Dimension (Axis or Rank) for Tensor
DOD = Domestic Object Damage
EFS = Engine Fleet Simulator
EMA = Exponential Moving Average
FIR = Finite Impulse Response
FMH = FIR Median Hybrid
FOD = Foreign Object Damage
FP = Forward Propagation
FPR = False Positive Rate
HPC/LPC = High/Low Pressure Compressor
HPT/LPT = High/Low Pressure Turbine
IIR = Infinite Impulse Response
LSTM = Long Short-Term Memory Network
LSTM-DAE = Long Short-Term Memory Network Denoising Autoencoder
MA Method Evaluation Strategy
RELM-SRC = Regularized Extreme Learning Machines-Sparse Representation Classification
ReLU = Rectified Linear Activation Function
RNN = Recurrent Neural Network
RM = Recursive Median
Seq2Seq = Sequence-to-Sequence Learning