Browsing by Author "He, Hongmei"
Item Open Access
An academic review: applications of data mining techniques in finance industry (2017-05-31)
Jadhav, Swati; He, Hongmei; Jenkins, Karl W.
With the development of Internet techniques, data volumes are doubling every two years, faster than predicted by Moore’s Law. Big Data Analytics becomes particularly important for enterprise business. Modern computational technologies will provide effective tools to help understand hugely accumulated data and leverage this information to get insights into the finance industry. In order to get actionable insights into the business, data has become the most valuable asset of financial organisations, as there are no physical products in the finance industry to manufacture. This is where data mining techniques come to their rescue by allowing access to the right information at the right time. These techniques are used by the finance industry in various areas such as fraud detection, intelligent forecasting, credit rating, loan management, customer profiling, money laundering, marketing and prediction of price movements, to name a few. This work aims to survey the research on data mining techniques applied to the finance industry from 2010 to 2015. The review finds that stock prediction and credit rating have received the most attention from researchers, compared to loan prediction, money laundering and time series prediction. Due to the dynamics, uncertainty and variety of data, nonlinear mapping techniques have been studied more deeply than linear techniques. Also, it has been proved that hybrid methods are more accurate in prediction, closely followed by the neural network technique. This survey could provide a clue of applications of data mining techniques for the finance industry, and a summary of methodologies for researchers in this area.
Especially, it could provide a good vision of data mining techniques in computational finance for beginners who want to work in the field of computational finance.

Item Open Access
Analytical review of cybersecurity for embedded systems (IEEE, 2020-12-21)
Aloseel, Abdulmohsan; He, Hongmei; Shaw, Carl; Khan, Muhammad Ali
To identify the key factors and create the landscape of cybersecurity for embedded systems (CSES), an analytical review of the existing research on CSES has been conducted. The common properties of embedded systems, such as mobility, small size, low cost, independence, and limited power consumption when compared to traditional computer systems, have caused many challenges in CSES. The conflict between cybersecurity requirements and the computing capabilities of embedded systems makes it critical to implement sophisticated security countermeasures against cyber-attacks in an embedded system with limited resources, without draining those resources. In this study, twelve factors influencing CSES have been identified: (1) the components; (2) the characteristics; (3) the implementation; (4) the technical domain; (5) the security requirements; (6) the security problems; (7) the connectivity protocols; (8) the attack surfaces; (9) the impact of the cyber-attacks; (10) the security challenges of the ESs; (11) the security solutions; and (12) the players (manufacturers, legislators, operators, and users). A Multiple Layers Feedback Framework of Embedded System Cybersecurity (MuLFESC) with nine layers of protection is proposed, with new metrics of risk assessment. This will enable cybersecurity practitioners to conduct an assessment of their systems with regard to the twelve identified cybersecurity aspects. In MuLFESC, the feedback from the system-components layer to the system-operations layer could help implement “Security by Design” in the design stage at the bottom layer.
The study provides a clear landscape of CSES and, therefore, could help to find better comprehensive solutions for CSES.

Item Open Access
A comparison of resource allocation process in grid and cloud technologies (IOP, 2018-06-04)
Hallawi, Huda; Mehnen, Jorn; He, Hongmei
Grid Computing and Cloud Computing are two different technologies that have emerged to validate the long-held dream of computing as a utility, which led to an important revolution in the IT industry. These technologies came with several challenges in terms of middleware, programming model, resource management and business models. These challenges are seriously considered by Distributed System research. Resource allocation is a key challenge in both technologies, as it can cause resource wastage and service degradation. This paper presents a comprehensive study of the resource allocation processes in both technologies. It provides researchers with an in-depth understanding of all resource-allocation-related aspects and associated challenges, including load balancing, performance, energy consumption, scheduling algorithms, resource consolidation and migration. The comparison also contributes an informal definition of the Cloud resource allocation process. Resources in the Cloud are shared by all users in a time- and space-sharing manner, in contrast to the dedicated resources that are governed by a queuing system in Grid resource management. Cloud resource allocation suffers from extra challenges, which can be summarised as achieving good load balancing and making the right consolidation decisions.

Item Open Access
Complexity of combinatorial ordering genetic algorithms COFFGA and CONFGA (AIP Publishing, 2019-08-23)
Hallawi, Huda; He, Hongmei
This paper analyses the complexity of two algorithms called COFFGA (Combinatorial Ordering First Fit Genetic Algorithm) and CONFGA (Combinatorial Ordering Next Fit Genetic Algorithm). It also identifies the parameters that affect the performance of these algorithms.
The complexity of a GA depends on the problem being solved by the GA, as well as on the operators of the GA itself. The complexities of COFFGA and CONFGA are analysed individually. Even if these algorithms are only slightly different, they may have extremely different complexities depending on the differences in their fitness function or termination condition. To provide a provable bound on a problem, there must be a bound on the evaluation function as well as a manner by which the underlying problem is tied to the representation. Given that there is no standard complexity of the GA, and that the complexity of any GA depends on the problem being solved and on its operators, CONFGA and COFFGA are analysed with different complexities: although they are built upon the same algorithm and are used to solve the same problem (the Cloud resource allocation problem), they differ in their operators, fitness function and termination condition.

Item Open Access
A comprehensive obstacle avoidance system of mobile robots using an adaptive threshold clustering and the morphin algorithm (Springer, 2018-12-31)
Yuan Chen, Meng Yuan; Wu, Yong Jian; He, Hongmei
To solve the problem of obstacle avoidance for a mobile robot in an unknown environment, a comprehensive obstacle avoidance system (called the ATCM system) is developed. It integrates obstacle detection, obstacle classification, collision prediction and obstacle avoidance. In particular, an Adaptive-Threshold Clustering algorithm is developed to detect obstacles, and the Morphin algorithm is applied for path planning when the robot predicts a collision ahead. A dynamic circular window is set to continuously scan the surrounding environment of the robot during the task period.
The simulation results show that the obstacle avoidance system enables the robot to avoid any static and dynamic obstacles effectively.

Item Open Access
Data mining in computational finance (2017-12)
Jadhav, Swati; Jenkins, Karl W.; He, Hongmei
Computational finance is a relatively new discipline whose birth can be traced back to the early 1950s. Its major objective is to develop and study practical models focusing on techniques that apply directly to financial analyses. The large number of decisions and computationally intensive problems involved in this discipline make data mining and machine learning models an integral part of improving, automating, and expanding the current processes. One of the objectives of this research is to present the state of the art of the data mining and machine learning techniques applied in the core areas of computational finance. Next, a detailed analysis of public and private finance datasets is performed in an attempt to find interesting facts from data and draw conclusions regarding the usefulness of features within the datasets. Credit risk evaluation is one of the crucial modern concerns in this field. Credit scoring is essentially a classification problem where models are built using the information about past applicants to categorise new applicants as ‘creditworthy’ or ‘non-creditworthy’. We appraise the performance of a few classical machine learning algorithms for the problem of credit scoring. Typically, credit scoring databases are large and characterised by redundant and irrelevant features, making the classification task more computationally demanding. Feature selection is the process of selecting an optimal subset of relevant features. We propose an improved information-gain directed wrapper feature selection method using genetic algorithms and successfully evaluate its effectiveness against baseline and generic wrapper methods using three benchmark datasets. One of the tasks of financial analysts is to estimate a company’s worth.
In the last piece of work, this study predicts the growth rate for earnings of companies using three machine learning techniques. We employed the technique of lagged features, which allowed varying amounts of recent history to be brought into the prediction task, and transformed the time series forecasting problem into a supervised learning problem. This work was applied to a private time series dataset.

Item Open Access
A framework for operational security metrics development for industrial control environment (Taylor and Francis, 2018-12-13)
Daniel Ani, Uchenna P.; He, Hongmei; Tiwari, Ashutosh
Security metrics are crucial for providing insights when measuring security states and susceptibilities in industrial operational environments. Obtaining practical security metrics depends on effective security metrics development approaches. To be effective, a security metrics development framework should be scope-definitive, objective-oriented, reliable, simple, adaptable, and repeatable (SORSAR). A framework for Operational Security Metrics Development (OSMD) for industrial control environments is presented, which combines concepts and characteristics from existing approaches. It also adds the new characteristic of adaptability. The OSMD framework is broken down into three phases: target definition, objective definition, and metrics synthesis. A case study scenario is used to demonstrate how to implement and apply the proposed framework, showing its usability and workability. Expert elicitation has also been used to consolidate the validity of the proposed framework. Both validation approaches have helped to show that the proposed framework can help create an effective and efficient ICS-centric security metrics taxonomy that can be used to evaluate capabilities or vulnerabilities.
The understanding from this can help enhance security assurance within industrial operational environments.

Item Open Access
How good a shallow neural network is for solving non-linear decision-making problems (Springer, 2018-12-31)
He, Hongmei; Zhu, Zhilong; Xu, Gang; Zhu, Zhenhuan
The universal approximation theorem states that a shallow neural network (one hidden layer) can represent any non-linear function. In this paper, we aim at examining how good a shallow neural network is for solving non-linear decision-making problems. We propose a performance-driven incremental approach to searching for the best shallow neural network for decision making, given a data set. The experimental results on the two benchmark data sets, Breast Cancer in Wisconsin and SMS Spams, demonstrate the correctness of the universal approximation theorem, and show that a number of hidden neurons equal to about half the number of inputs is good enough to represent the function from data. It is shown that the performance-driven BP learning is faster than the error-driven BP learning, and that the performance of the SNN obtained by the former is not worse than that of the SNN obtained by the latter. This indicates that when learning a neural network with the BP algorithm, the performance reaches a certain value quickly, but the error may still keep reducing. The performance of the SNNs for the two databases is comparable to or better than that of the optimal linguistic attribute hierarchy, obtained by a genetic algorithm in wrapper or in terms of semantics manually, which is very time-consuming.

Item Open Access
Human capability evaluation approach for cybersecurity in critical industrial infrastructure (Springer, 2016-07-10)
Ani, Uchenna P.; He, Hongmei; Tiwari, Ashutosh
Every organization is as frail as its frailest human link in the cyber security of Industrial Control Systems (ICS), which is without predisposition to conceivable technological solutions for enforcing security.
Noticeably, human-involved systems are becoming more chaotic, and gravely under attack due to irregular actions or inactions of human entities in the constituent chain. Many industrial cyber-attacks have successfully defeated technological security solutions through preying on human weaknesses in knowledge and skills, and manipulating insiders within organizations into unsuspectingly delivering entry and access to sensitive industrial assets. In order to help enterprises assess the level of employees’ cyber security awareness and responsiveness, and enhance ICS cyber security knowledge and skills for ICS protection, a Workforce Cyber Security Capability evaluation model is presented and theoretically validated. A capability evaluation will allow industries to have a better understanding of the potential state of consciousness, readiness and diagnostic abilities of the industries, and thus improve the prevention of, detection of, and response to any cyber-specific incidents.

Item Unknown
Human factor security: evaluating the cybersecurity capacity of the industrial workforce (Emerald, 2019-03-11)
Ani, Uchenna Daniel; He, Hongmei
Purpose: As cyber-attacks continue to grow, organisations adopting the internet-of-things (IoT) have continued to react to security concerns that threaten their businesses within the current highly competitive environment. Many recorded industrial cyber-attacks have successfully beaten technical security solutions by exploiting human-factor vulnerabilities related to security knowledge and skills and manipulating human elements into inadvertently conveying access to critical industrial assets. Knowledge and skill capabilities contribute to human analytical proficiencies for enhanced cybersecurity readiness. Thus, a human-factored security endeavour is required to investigate the capabilities of the human constituents (workforce) to appropriately recognise and respond to cyber intrusion events within the industrial control system (ICS) environment.
Design/methodology/approach: A quantitative approach (statistical analysis) is adopted to quantify the potential cybersecurity capability aptitudes of industrial human actors, identify the least security-capable workforce in the operational domain with the greatest susceptibility likelihood to cyber-attacks (i.e. the weakest link) and guide the enhancement of security assurance. To support these objectives, a Human-factored Cyber Security Capability Evaluation approach is presented using conceptual analysis techniques.
Findings: Using a test scenario, the approach demonstrates the capacity to proffer an efficient evaluation of workforce security knowledge and skills capabilities and the identification of the weakest link in the workforce.
Practical implications: The approach can enable organisations to gain better workforce security perspectives like security-consciousness, alertness and response aptitudes, thus guiding organisations into adopting strategic means of appropriating security remediation outlines, scopes and resources without undue wastes or redundancies.
Originality/value: This paper demonstrates originality by providing a framework and computational approach for characterising and quantifying human-factor security capabilities based on security knowledge and security skills. It also supports the identification of potential security weakest links amongst an evaluated industrial workforce (human agents), some key security susceptibility areas and relevant control interventions. The model and validation results demonstrate the application of action research. This paper demonstrates originality by illustrating how action research can be applied within socio-technical dimensions to solve recurrent and dynamic problems related to industrial environment cyber security improvement.
It provides value by demonstrating how theoretical security knowledge (awareness) and practical security skills can help resolve cyber security response and control uncertainties within industrial organisations.

Item Unknown
Improving cyber security in industrial control system environment. (2018-02)
Ani, Uchenna Daniel; He, Hongmei; Tiwari, Ashutosh
Integrating industrial control systems (ICS) with information technology (IT) and internet technologies has made industrial control system environments (ICSEs) more vulnerable to cyber-attacks. Increased connectivity has brought about increased security threats, vulnerabilities, and risks in both the technology and people (human) constituents of the ICSE. Regardless of existing security solutions, which are chiefly tailored towards technical dimensions, cyber-attacks on ICSEs continue to increase with a proportionate level of consequences and impacts. These consequences include system failures or breakdowns, likewise affecting the operations of dependent systems. Impacts often include marring physical safety, triggering loss of lives, causing huge economic damages, and thwarting the vital missions of productions and businesses. This thesis addresses uncharted solution paths to the above challenges by investigating both technical and human-factor security evaluations to improve cyber security in the ICSE. An ICS testbed, scenario-based, and expert opinion approaches are used to demonstrate and validate cyber-attack feasibility scenarios.
To improve the security of ICSs, the research provides: (i) an adaptive operational security metrics generation (OSMG) framework for generating suitable security metrics for security evaluations in ICSEs, and a list of good security metrics methodology characteristics (scope-definitive, objective-oriented, reliable, simple, adaptable, and repeatable); (ii) a technical multi-attribute vulnerability (and impact) assessment (MAVCA) methodology that considers and combines dynamic metrics (temporal and environmental) attributes of vulnerabilities with the functional dependency relationship attributes of the vulnerability host components, to achieve a better representation of exploitation impacts on ICSE networks; (iii) a quantitative human-factor security (capability and vulnerability) evaluation model based on human-agent security knowledge and skills, used to identify the most vulnerable human elements, identify the least secure aspects of the general workforce, and prioritise security enhancement efforts; and (iv) a security risk reduction through critical impact point assessment (S2R-CIPA) process model that demonstrates the combination of technical and human-factor security evaluations to mitigate risks and achieve ICSE-wide security enhancements. The approaches or models of cyber-attack feasibility testing, adaptive security metrication, multi-attribute impact analysis, and workforce security capability evaluations can support security auditors, analysts, managers, and system owners of ICSs to create security strategies and improve cyber incident response, and thus effectively reduce security risk.

Item Unknown
Information gain directed genetic algorithm wrapper feature selection for credit rating (Elsevier, 2018-04-22)
Jadhav, Swati; He, Hongmei; Jenkins, Karl W.
Financial credit scoring is one of the most crucial processes in the finance industry sector, used to assess the credit-worthiness of individuals and enterprises.
Various statistics-based machine learning techniques have been employed for this task. The “Curse of Dimensionality” is still a significant challenge in machine learning techniques. Some research has been carried out on Feature Selection (FS) using a genetic algorithm as a wrapper to improve the performance of credit scoring models. However, the challenge lies in finding an overall best method in credit scoring problems and improving the time-consuming process of feature selection. In this study, the credit scoring problem is investigated through feature selection to improve classification performance. This work proposes a novel approach to feature selection in credit scoring applications, called the Information Gain Directed Feature Selection algorithm (IGDFS), which ranks features based on information gain and propagates the top m features through the GA wrapper (GAW) algorithm using three classical machine learning algorithms, KNN, Naïve Bayes and Support Vector Machine (SVM), for credit scoring. The first stage of information gain guided feature selection can help reduce the computing complexity of the GA wrapper, and the information gain of features selected with the IGDFS can indicate their importance to decision making.

Item Unknown
Multi-capacity combinatorial ordering GA in application to cloud resources allocation and efficient virtual machines consolidation (Elsevier, 2016-11-02)
Hallawi, Huda; Mehnen, Jorn; He, Hongmei
This paper describes a novel approach making use of genetic algorithms to find optimal solutions for multi-dimensional vector bin packing problems, with the goal of improving cloud resource allocation and Virtual Machine (VM) consolidation. Two algorithms, namely Combinatorial Ordering First-Fit Genetic Algorithm (COFFGA) and Combinatorial Ordering Next Fit Genetic Algorithm (CONFGA), have been developed for this purpose and combined.
The proposed hybrid algorithm aims to minimise the total number of running servers and the resource wastage per server. The solutions obtained by the new algorithms are compared with the latest solutions from the literature. The results show that the proposed algorithm COFFGA outperforms other previous multi-dimension vector bin packing heuristics such as Permutation Pack (PP), First Fit (FF) and First Fit Decreasing (FFD) by 4%, 34%, and 39%, respectively. It also achieved better performance than the existing genetic algorithm for multi-capacity resources virtual machine consolidation (RGGA) in terms of performance and robustness. A thorough explanation for the improved performance of the newly proposed algorithm is given.

Item Unknown
A new semantic attribute deep learning with a linguistic attribute hierarchy for spam detection (2017)
He, Hongmei; Watson, Tim; Maple, Carsten; Mehnen, Jorn; Tiwari, Ashutosh
The massive increase of spam is posing a very serious threat to email and SMS, which have become an important means of communication. Not only does spam annoy users, but it also becomes a security threat. Machine learning techniques have been widely used for spam detection. In this paper, we propose another form of deep learning, a linguistic attribute hierarchy, embedded with linguistic decision trees, for spam detection, and examine the effect of semantic attributes on the spam detection, represented by the linguistic attribute hierarchy. A case study on the SMS message database from the UCI machine learning repository has shown that a linguistic attribute hierarchy embedded with linguistic decision trees provides a transparent approach to in-depth analysis of attribute impact on spam detection.
This approach can not only efficiently tackle the ‘curse of dimensionality’ in spam detection with massive attributes, but also improve the performance of spam detection when the semantic attributes are constructed into a proper hierarchy.

Item Unknown
A novel approach for detecting cyberattacks in embedded systems based on anomalous patterns of resource utilization - Part I (IEEE, 2021-06-11)
Aloseel, Abdulmohsan; Al-Rubaye, Saba; Zolotas, Argyrios; He, Hongmei; Shaw, Carl
This paper presents a novel security approach called Anomalous Resource Consumption Detection (ARCD), which acts as an additional layer of protection to detect cyberattacks in embedded systems (ESs). The ARCD approach is based on the differentiation between the predefined standard resource consumption pattern and the anomalous consumption pattern of system resource utilization. The effectiveness of the proposed approach is tested in a rigorous manner by simulating four types of cyberattacks: a denial-of-service attack, a brute-force attack, a remote code execution attack, and a man-in-the-middle attack, which are executed on a Smart PiCar (used as the testbed). A septenary tuple model consisting of seven parameters, representing the embedded system’s architecture, has been created as the core of the detection mechanism. The approach’s efficiency and effectiveness have been validated in terms of range and pattern by analyzing the collected data statistically in terms of mean, median, mode, standard deviation, range, minimum, and maximum values. The results demonstrated the potential for defining a standard pattern of resource utilization and performance of the embedded system due to a significant similarity of the parameters’ values at normal states. In contrast, the attacked cases showed a definite, observable, and detectable impact on resource consumption and performance of the embedded system, causing an anomalous pattern. Thus, by merging these two findings, the ARCD approach has been developed.
ARCD facilitates building secure operating systems in line with the ES’s capabilities. Furthermore, the ARCD approach can work along with existing countermeasures to augment the security of the operating system layer.

Item Unknown
Prediction of earnings per share for industry (IEEE, 2016-08-01)
Jadhav, Swati; He, Hongmei; Jenkins, Karl W.
Prediction of Earnings Per Share (EPS) is a fundamental problem in the finance industry. Various data mining technologies have been widely used in computational finance. This research work aims to predict the future EPS from previous values through the use of data mining technologies, and thus to provide decision makers with a reference or evidence for their economic strategies and business activity. We created three models, LR, RBF and MLP, for the regression problem. Our experiments with these models were carried out on real datasets provided by a software company. The performance assessment was based on Correlation Coefficient and Root Mean Squared Error. These algorithms were validated with the data of six different companies. Some differences between the models have been observed. In most cases, Linear Regression and Multilayer Perceptron are effectively capable of predicting the future EPS. But for highly nonlinear data, MLP gives better performance.

Item Open Access
Recognition of speed signs in uncertain and dynamic environments (IOP Publishing: Conference Series, 2019-05-08)
Zhu, Zhilong; Xu, Gang; He, Hongmei; Jiang, Juanjuan; Wang, Tao
Speed limit sign recognition directly affects the safety of autonomous vehicles. Vehicles usually run in an uncertain and dynamic environment. The performance of the recognition system is affected by various factors such as the different sizes of pictures, illumination conditions and position circumstances, which can lead to misclassification. This makes speed sign recognition challenging.
To improve the recognition rate of the speed signs in such environments, this work first applies the method of saliency target detection based on the background-absorbing Markov chain to extract the node in an image, then uses SPP-CNN to classify the extracted nodes with ten-fold validation. The recognition rate is up to 9.32%, higher than that obtained directly by SPP-CNN working on raw dataset.

Item Open Access
Simulation of autonomous UAV navigation with collision avoidance and spatial awareness. (Cranfield University, 2019-08)
Li, Jian; He, Hongmei; Tiwari, Ashutosh
The goal of this thesis is to design a collision-free autonomous UAV navigation system with spatial awareness ability within a comprehensive simulation framework. The navigation system is required to find a collision-free trajectory to a randomly assigned 3D target location without any prior map information. The implemented navigation system contains four main components: mapping, localisation, cognition and control system, where the cognition system makes execution commands based on the perceived position information about obstacles and the UAV itself from the mapping and localisation systems respectively. The control system is responsible for executing the input commands made by the cognition system. The implementation of the cognition system is split into three case studies for real-life scenarios, which are restricted area avoidance, static obstacle avoidance and dynamic obstacle avoidance. Experiments have been conducted in the three cases, and the UAV is capable of determining a collision-free trajectory in all three types of environment. All simulated components were designed to be analogous to their real-world counterparts. Ideally, the simulated navigation framework can be transferred to a real UAV without any changes. The simulation framework provides a platform for future robotic research. As it is implemented in a modular way, it is easier to debug.
Hence, the system has good reliability. Moreover, the system has good readability, maintainability and extendability.

Item Open Access
Various heuristic algorithms to minimise the two-page crossing numbers of graphs (De Gruyter, 2015-08-13)
He, Hongmei; Sălăgean, Ana; Mäkinen, Erkki; Vrt'o, Imrich
We propose several new heuristics for the two-page book crossing problem, which are based on recent algorithms for the corresponding one-page problem. In particular, the neural network model for edge allocation is combined for the first time with various one-page algorithms. We investigate the performance of the new heuristics by testing them on various benchmark test suites. It is found that the new heuristics outperform the previously known heuristics and produce good approximations of the planar crossing number for several well-known graph families. We conjecture that the optimal two-page drawing of a graph represents the planar drawing of the graph.

Item Open Access
Web robot detection using supervised learning algorithms (Cranfield University, 2020-06)
Chen, Hanlin; He, Hongmei; Starr, Andrew
Web robots or Web crawlers have become the main source of Web traffic. Although some bots, such as search engines, behave well, other bots can perform DDoS attacks, posing a huge threat to websites. The project aims to develop an offline system that can effectively detect malicious web robots, which is conducive not only to network traffic cleaning, but also to improving the network security of IoT systems and services. A comprehensive literature review for the years 2010-2019 was conducted to identify the research gap.
The key contributions of the research are: 1) it provided a systematic methodology to address the web robot detection problem based on the log file from an industrial company; 2) it provided an approach to feature engineering, thus overcoming the challenge of the curse of dimensionality; 3) it made significant progress in the accuracy of off-line web robot detection through a holistic study of the three types of machine learning techniques based on real data from industry. Three algorithms based on the Keras sequential model, random forest, and SVM were developed with Python to distinguish web robots from human visitors on the TensorFlow 2.0 platform. Experimental results suggested that random forest obtained the best performance in accuracy and training time...[cont.]