Handling imbalanced data for aircraft predictive maintenance using the BACHE algorithm

Date

2022-05-14

Publisher

Elsevier

Type

Article

ISSN

1568-4946

Citation

Dangut MD, Skaf Z, Jennions IK. (2022) Handling imbalanced data for aircraft predictive maintenance using the BACHE algorithm, Applied Soft Computing, Volume 123, July 2022, Article number 108924

Abstract

Developing a prognostic model to predict an asset’s health condition is a maintenance strategy that increases asset availability and reliability through better maintenance scheduling. Developing reliable vehicle health predictive models is therefore vital in the aerospace industry, especially for a safety-critical system such as an aircraft. However, one of the significant challenges in building reliable data-driven prognostic models is dataset imbalance. Training machine-learning models on an imbalanced dataset biases classifiers towards the majority class, resulting in poor predictive accuracy in data-driven models. The problem becomes more challenging when the imbalance ratio is extreme and the classes overlap. In this paper, a novel approach called the Balanced Calibrated Hybrid Ensemble Technique (BACHE) is developed to tackle the severely imbalanced classification problem. The proposed method combines hybrid data sampling with ensemble-based learning. It uses a cascading balanced approach that decomposes the original class imbalance problem into a set of subproblems, each characterized by a reduced imbalance ratio. It then uses calibrated boosting with a cost-sensitive decision tree to enhance recognition of hard-to-learn patterns, which improves the prediction of the extreme minority class. BACHE is evaluated using a real-world aircraft dataset with rare component replacement instances, and a comparative experiment against other similar existing methods is conducted. The performance metrics used are precision, recall, G-mean, and area under the curve (AUC). The final results show that the proposed model outperforms other similar methods and can attain excellent performance on large, extremely imbalanced datasets.
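
Two ideas in the abstract can be illustrated with a short sketch: splitting one extremely imbalanced problem into subproblems with a lower imbalance ratio, and scoring a classifier with the G-mean. The code below is not the BACHE implementation, and the abstract does not specify how the decomposition is carried out. The function names, the `max_ratio` parameter, and the choice to pair every majority chunk with the full minority set are assumptions made for illustration. The G-mean is computed in its common binary form, the square root of recall times specificity.

```python
import math
import random


def split_into_subproblems(majority, minority, max_ratio=1.0, seed=0):
    """Partition the majority class into chunks, each paired with all minority samples.

    Hypothetical helper: every returned subproblem holds at most
    max_ratio majority samples per minority sample.
    """
    if not minority:
        raise ValueError("need at least one minority sample")
    rng = random.Random(seed)
    shuffled = list(majority)
    rng.shuffle(shuffled)
    chunk = max(1, int(max_ratio * len(minority)))
    return [
        (shuffled[i:i + chunk], list(minority))
        for i in range(0, len(shuffled), chunk)
    ]


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of recall (sensitivity) and specificity for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(recall * specificity)


# Toy data: 1000 majority samples and 10 minority samples (ratio 100:1).
majority_ids = list(range(1000))
minority_ids = list(range(1000, 1010))
subproblems = split_into_subproblems(majority_ids, minority_ids, max_ratio=5.0)
```

With these toy inputs, each subproblem contains 50 majority and 10 minority samples, so the per-subproblem imbalance ratio drops from 100:1 to 5:1. A separate learner could then be trained on each subproblem and the results combined in an ensemble. The G-mean is useful here because a classifier that always predicts the majority class scores 0, whereas plain accuracy would still look high.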

Keywords

Prognostic, Imbalanced learning, Ensemble learning, Predictive maintenance, Aerospace

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International
