Browsing by Author "Filippone, Salvatore"
Now showing 1 - 17 of 17
Item (Open Access): BootCMatch: A software package for bootstrap AMG based on graph weighted matching (Association for Computing Machinery (ACM), 2018-07-14)
Authors: D’Ambra, Pasqua; Filippone, Salvatore; Vassilevski, Panayot S.
This article has two main objectives: the first is to describe some extensions of an adaptive Algebraic Multigrid (AMG) method of the form previously proposed by the first and third authors; the second is to present a new software framework, named BootCMatch, which implements all the components needed to build and apply the described adaptive AMG, both as a stand-alone solver and as a preconditioner in a Krylov method. The adaptive AMG presented is meant to handle general symmetric positive definite (SPD) sparse linear systems without assuming any a priori information about the problem and its origin; the goal of adaptivity is to achieve a method with a prescribed convergence rate. The method exploits a general coarsening process based on aggregation of unknowns, obtained by a maximum weight matching in the adjacency graph of the system matrix. More specifically, a maximum product matching is employed to define an effective smoother subspace (complementary to the coarse space), a process referred to as compatible relaxation, at every level of the recursive two-level hierarchical AMG process. Results on a large variety of test cases, and comparisons with related work, demonstrate the reliability and efficiency of the method and of the software.

Item (Open Access): Coarray-based Load Balancing on Heterogeneous and Many-Core Architectures (Elsevier, 2017-06-03)
Authors: Cardellini, Valeria; Fanfarillo, Alessandro; Filippone, Salvatore
In order to reach challenging performance goals, computer architecture is expected to change significantly in the near future. Heterogeneous chips, equipped with different types of cores and memory, will force application developers to deal with irregular communication patterns, high levels of parallelism, and unexpected behavior.
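As a rough illustration of the aggregation-by-matching idea in the BootCMatch entry above, the sketch below pairs unknowns greedily by edge weight. It is a hedged toy with made-up weights, not the library's maximum product matching, which is driven by a smooth vector and an exact matching algorithm.

```python
# Illustrative sketch only (not BootCMatch itself): form AMG aggregates
# by pairing unknowns via a greedy weighted matching on the matrix graph.

def greedy_matching_aggregates(n, weighted_edges):
    """Pair up unknowns 0..n-1 using edges (i, j, w), heaviest first.

    Returns a list mapping each unknown to its aggregate index;
    unmatched unknowns become singleton aggregates.
    """
    aggregate = [-1] * n
    next_agg = 0
    for i, j, _w in sorted(weighted_edges, key=lambda e: -e[2]):
        if aggregate[i] < 0 and aggregate[j] < 0:
            aggregate[i] = aggregate[j] = next_agg
            next_agg += 1
    for i in range(n):          # leftovers become singleton aggregates
        if aggregate[i] < 0:
            aggregate[i] = next_agg
            next_agg += 1
    return aggregate

# Tiny 4-point chain graph with invented weights:
agg = greedy_matching_aggregates(4, [(0, 1, 0.9), (1, 2, 0.5), (2, 3, 0.8)])
print(agg)  # [0, 0, 1, 1]: two pairwise aggregates, halving the problem size
```

Each matched pair becomes one coarse-level unknown, which is the sense in which matching drives the coarsening.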
Load balancing among the heterogeneous compute units will be a critical task in achieving effective usage of the computational power provided by such new architectures. In this highly dynamic scenario, Partitioned Global Address Space (PGAS) languages, like Coarray Fortran, appear to be a promising alternative to standard MPI programming based on two-sided communications, in particular because of the one-sided semantics and ease of programmability of PGAS. In this paper, we show how Coarray Fortran can be used to implement dynamic load balancing algorithms on an exascale compute node, and how these algorithms can produce performance benefits for an Asian option pricing problem running in symmetric mode on Intel Xeon Phi Knights Corner and Knights Landing architectures.

Item (Open Access): A deep feedforward neural network and shallow architectures effectiveness comparison: Flight delays classification perspective (ACM, 2021-11-22)
Authors: Bisandu, Desmond Bala; Homaid, Mohammed Salih A; Moulitsas, Irene; Filippone, Salvatore
Flight delays have negatively impacted the socio-economic state of passengers, airlines and airports, resulting in huge economic losses. It has therefore become necessary to predict their occurrence correctly, because accurate prediction is important for the effective management of the aviation industry. Developing accurate flight delay classification models depends mostly on the complexity of the air transportation system and the infrastructure available in airports, which may be a region-specific issue. However, no single prediction or classification model can handle the individual characteristics of all airlines and airports at the same time. Hence, the need to further develop and compare predictive models for the aviation decision system of the future cannot be over-emphasised.
In this research, flight on-time data records from the United States Bureau of Transportation Statistics were employed to evaluate the performance of Deep Feedforward Neural Network, Neural Network, and Support Vector Machine models on a binary classification problem. The research revealed that the models achieved different flight delay classification accuracies. In the initial experiment, the Support Vector Machine had a worse average accuracy than the Neural Network and the Deep Feedforward Neural Network. The Deep Feedforward Neural Network outperformed the Support Vector Machine and the Neural Network with the best average percentage accuracies. Further investigation of the Deep Feedforward Neural Network architecture under different parameter settings suggests that, regardless of training data size, the classification accuracy of the algorithm eventually peaks. We examine which number of epochs works best in our flight delay classification setting for the Deep Feedforward Neural Network. Our experimental results demonstrate that using many epochs affects the convergence rate of the model but, unlike increasing the number of hidden layers, does not ensure better or higher accuracy in a binary classification of flight delays. Finally, we recommend further studies on the applicability of the Deep Feedforward Neural Network to flight delay prediction, with specific case studies of either airlines or airports, to check the impact on the model’s performance.

Item (Open Access): Development and performance comparison of MPI and Fortran Coarrays within an atmospheric research model (IEEE, 2018-11-16)
Authors: Rasmussen, Soren; Gutmann, Ethan D.; Friesen, Brian; Rouson, Damian; Filippone, Salvatore; Moulitsas, Irene
A mini-application of the Intermediate Complexity Atmospheric Research (ICAR) model offers an opportunity to compare the costs and performance of the Message Passing Interface (MPI) versus coarray Fortran, two methods of communication across processes.
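The MPI-versus-coarray comparison in this entry centres on exchanging subdomain boundary (halo) data between neighbouring processes. The toy below shows the pattern only, in serial Python with invented values, not the mini-app's Fortran code.

```python
# Toy halo exchange between two 1-D subdomains, illustrating the pattern
# an atmospheric model performs with MPI messages or coarray transfers.
# Each "image" owns interior cells plus one halo cell at each end.

def exchange_halos(left, right):
    """Copy neighbouring interior cells into each subdomain's halo cell.

    left and right are lists laid out as [halo, interior..., halo].
    """
    left[-1] = right[1]    # left's right halo <- right's first interior cell
    right[0] = left[-2]    # right's left halo <- left's last interior cell

left = [0, 1.0, 2.0, 3.0, 0]    # halos initialised to 0
right = [0, 4.0, 5.0, 6.0, 0]
exchange_halos(left, right)
print(left, right)  # halos now hold the neighbour's boundary values
```

In the real model this exchange happens every timestep, which is why its implementation (two-sided non-blocking MPI versus one-sided coarray puts) dominates the comparison.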
The application requires repeated communication of halo regions, which is performed with either MPI or coarrays. The MPI communication is done using non-blocking two-sided communication, while the coarray library is implemented using a one-sided MPI or OpenSHMEM communication backend. We examine the development cost, in addition to strong and weak scalability analyses, to understand the performance costs.

Item (Open Access): Efficient algebraic multigrid preconditioners on clusters of GPUs (World Scientific Publishing, 2019-05-10)
Authors: Abdullahi Hassan, Ambra; Cardellini, Valeria; D'Ambra, Pasqua; Di Serafino, Daniela; Filippone, Salvatore
Many scientific applications require the solution of large and sparse linear systems of equations using Krylov subspace methods; in this case, the choice of an effective preconditioner may be crucial for the convergence of the Krylov solver. Algebraic MultiGrid (AMG) methods are widely used as preconditioners because of their optimal computational cost and their algorithmic scalability. The wide availability of GPUs, now found in many of the fastest supercomputers, poses the problem of implementing these methods efficiently on high-throughput processors. In this work we focus on the application phase of AMG preconditioners, and in particular on the choice and implementation of smoothers and coarsest-level solvers capable of exploiting the computational power of clusters of GPUs. We consider block-Jacobi smoothers that use sparse approximate inverses in the solve phase associated with the local blocks. The choice of approximate inverses instead of sparse matrix factorizations is driven by the large amount of parallelism exposed by the matrix-vector product, as compared to the solution of large triangular systems on GPUs. The selected smoothers and solvers are implemented within the AMG preconditioning framework provided by the MLD2P4 library, using suitable sparse matrix data structures from the PSBLAS library.
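The preference for approximate inverses described in this entry can be illustrated with a small sketch: a Jacobi-type smoothing sweep written as x += M*r, where M is an approximate inverse (here simply the inverse diagonal) applied as a matrix-vector product. This is a toy on invented data, not the MLD2P4/PSBLAS implementation.

```python
# Toy Jacobi-type smoother: the update is x += M*r with M an approximate
# inverse (here 1/diagonal). On GPUs this mat-vec form is preferred to
# triangular solves, which is the trade-off the abstract describes.

def smooth(A, b, x, sweeps=10):
    n = len(b)
    M = [1.0 / A[i][i] for i in range(n)]           # diagonal approx. inverse
    for _ in range(sweeps):
        r = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        x = [x[i] + M[i] * r[i] for i in range(n)]  # apply M as a mat-vec
    return x

# Small 1-D Laplacian-like SPD system (illustrative data only):
A = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
b = [1.0, 0.0, 1.0]
x = smooth(A, b, [0.0, 0.0, 0.0], sweeps=50)
print([round(v, 3) for v in x])  # approaches the exact solution [1, 1, 1]
```

A sparse approximate inverse generalizes M beyond the diagonal while keeping the update a pure mat-vec.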
Their behaviour is illustrated, in terms of execution speed and scalability, on a test case concerning groundwater modelling, provided by the Jülich Supercomputing Center within the Horizon 2020 Project EoCoE.

Item (Open Access): Enhancing bank's profitability by applying machine learning techniques on financial data to intelligently predict customer behaviour towards the use of electronic channels (2021-09)
Authors: Alsanousi, Hessa Abdulaziz; Moulitsas, Irene; Filippone, Salvatore
Technology is evolving rapidly, and this represents a huge prospect for investment in business. In the banking industry, new technologies have led to the creation of new electronic communication channels for customers. Banks therefore need to consider how to make the most of these channels. This will help them create prediction models to implement better strategies and improve their decision making. In the long run, this will help them to decrease costs and increase revenues. The aim of this study was to determine the appropriate electronic channels for specific customers based on their information. It used a big data prediction model to predict the best online channel for banking customers. The point of the model was to help banks understand their customers’ preferences, thereby increasing customer satisfaction, productivity, and profits. I obtained a substantial amount of financial data, covering a period of ten years, from a minimum of 100,000 customers from ten local banks in Kuwait. The independent variables I studied were age, gender, number of current accounts, number of savings accounts, number of deposit accounts, income, number of consumer loans, number of instalment loans, credit card limit, outstanding credit card balance, length of relationship with the bank, continent, and nationality. The dependent variables were call centres, websites, and mobile applications. Given the size and type of data, I used machine learning.
I used the Statistics and Machine Learning Toolbox, Financial Toolbox, and other functions in MATLAB to run a multinomial logistic regression analysis. I examined four different methods for analysing the data and chose the one that was most appropriate: multinomial logistic regression. I also considered whether there was any correlation between the independent variables. I discovered one significant correlation, between credit card limit and outstanding credit card balance. I explored this finding further by considering the time of execution and the key performance measures; consequently, I decided to keep both variables in my study. I also studied the relation between the volume of training data and the accuracy of the model, to see how sensitive the models were to variations in data size. The results confirmed that my method performed stably across the different sample sizes. I addressed the issue of overfitting by running a sub-sample regression and comparing it to the original model; my results indicated that the model was not overfitted. I differentiated between conventional banks and Islamic banks. This distinction had not been made by previous studies, and my outcomes provided more information about the difference between Islamic banks and conventional banks. I found that there were very few differences between the customers of conventional banks and those of Islamic banks. However, I found that male customers of Islamic banks tended to use the internet more often than female customers, who tended to use the mobile applications. I also found that older customers tended to use call centres as their primary way of communicating with their bank. The results showed that clients with more current accounts and savings accounts were more likely to use mobile applications. However, one unexpected finding was that clients with more deposit accounts were more likely to use the internet or call centres rather than mobile applications.
The findings also demonstrated that clients with higher incomes were more likely to use mobile applications than other communication platforms. In a couple of banks, customers with more loans tended to use call centres as their primary means of communication; the type of loan had no significant impact on their choices. I also found that customers who had been with their bank for longer were more likely to use call centres as their primary communication channel. Finally, in a few banks, the results showed that a client’s continent or nationality had no significant impact on their preference for a particular communication channel. These are very important findings that could change how banks operate. They will have a positive influence on decision-making and strategies in the banking industry.

Item (Open Access): Fortran coarray implementation of semi-Lagrangian convected air particles within an atmospheric model (MDPI, 2021-05-06)
Authors: Rasmussen, Soren; Gutmann, Ethan D.; Moulitsas, Irene; Filippone, Salvatore
This work added semi-Lagrangian convected air particles to the Intermediate Complexity Atmospheric Research (ICAR) model. The ICAR model is a simplified atmospheric model that uses quasi-dynamical downscaling to gain performance over more traditional atmospheric models. It uses Fortran coarrays to split the domain amongst images and to handle the halo-region communication of each image’s boundary regions. The newly implemented convected air particles use trilinear interpolation to compute initial properties from the Eulerian domain, and calculate humidity and buoyancy forces as the model runs. This paper investigated the performance cost and scaling attributes of executing unsaturated and saturated air particles versus the original particle-less model.
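The trilinear interpolation used to initialise particle properties from the Eulerian grid, as described in this entry, has a standard textbook form; the sketch below is that generic formulation, not the ICAR code.

```python
# Generic trilinear interpolation on a unit cell: blend the 8 corner
# values of a grid cell according to a particle's fractional position,
# as used to sample Eulerian fields at semi-Lagrangian particle sites.

def trilinear(c, x, y, z):
    """c[i][j][k] are corner values for i,j,k in {0,1}; x,y,z in [0,1]."""
    return sum(c[i][j][k]
               * (x if i else 1 - x)
               * (y if j else 1 - y)
               * (z if k else 1 - z)
               for i in (0, 1) for j in (0, 1) for k in (0, 1))

# For a field that is linear in x, y, z the interpolant is exact:
corners = [[[x + y + z for z in (0, 1)] for y in (0, 1)] for x in (0, 1)]
print(trilinear(corners, 0.25, 0.5, 0.75))  # 0.25 + 0.5 + 0.75 = 1.5
```

Exactness on linear fields is the property that makes this a natural choice for initialising particle humidity and buoyancy from gridded values.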
An in-depth analysis was done of the communication patterns and performance of the semi-Lagrangian air particles, as well as of the performance cost of a variety of initial conditions such as wind speed and saturation mixing ratios. This study found that, given a linear increase in the number of particles communicated, there is an initial decrease in performance which then levels out: over the runtime of the model there is an initial cost to particle communication, but the computational benefits quickly offset it. The study provided insight into the number of processors required to amortize the additional computational cost of the air particles.

Item (Open Access): A framework for unit testing with coarray Fortran (The Society for Modeling and Simulation International, 2017-04-26)
Authors: Abdullahi Hassan, Ambra; Cardellini, Valeria; Filippone, Salvatore
Parallelism is a ubiquitous feature of modern computing architectures; indeed, we might even say that serial code is now automatically legacy code. Writing parallel code poses significant challenges to programmers and is often error-prone. Partitioned Global Address Space (PGAS) languages, such as Coarray Fortran (CAF), represent a promising development direction in the quest for a trade-off between simplicity and performance. CAF is a parallel programming model that allows a smooth migration from serial to parallel code. However, despite CAF's simplicity, refactoring serial code and migrating it to parallel versions is still error-prone, especially in complex software. The combination of unit testing, which drastically reduces defect injection, with CAF is therefore a very appealing prospect; however, it requires appropriate tools to realize its potential.
In this paper, we present the first CAF-compatible framework for unit tests, developed as an extension to the Parallel Fortran Unit Test framework (pFUnit).

Item (Open Access): Modelling the probability of household default for conventional and Islamic banking (2021-09)
Authors: AlBarrak, Nijlah Saleh; Moulitsas, Irene; Filippone, Salvatore
Credit rating risks have become a main indicator of the performance of banks. An effective credit assessment makes it easier to anticipate and manage losses. According to regulators, banks' credit exposures must be given risk weights, assigned as a way of calculating the expected loss for each customer. In a country like Kuwait, whose economy is based on oil, advances in technology and increases in the amount of data available about banking customers have made it possible to develop a built-in credit default probability model that is more effective than the standard fixed risk model currently in use. Having a robust and fair model under the control of a central bank will increase supervision and make the internal risk weight framework more reliable, bringing it into line with regulations. Our aim in this study is to come up with an internal credit rating system that uses machine learning to calculate the risk weights for different households. We aim to create models that will be approved by the Central Bank of Kuwait and be acceptable to other banks. Our objectives are to develop a customized model for each of the banks under study; to account for different types of banking (conventional and Islamic); to produce different models for the different types of loans granted; and to generate a general model for the Central Bank, as well as for each type of loan. We compared different classification models for conventional and Islamic banks. The classification models were as follows: Logistic Regression, Fine Decision Tree, Linear Support Vector Machines, Kernel Naïve Bayes, Bagging, AdaBoostM1, and RUSBoosted.
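Comparisons of classifiers like those described in this entry rest on measures such as the confusion matrix and the AUC curve. A minimal pure-Python sketch of both, on invented labels and scores (not the study's data):

```python
# Two standard performance measures for binary default classifiers,
# computed directly from hypothetical labels (1 = default) and scores.

def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, fp, fn, tn

def auc(y_true, scores):
    """AUC as the probability that a default outscores a non-default."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

y = [1, 1, 0, 0, 1, 0]                 # hypothetical default labels
s = [0.9, 0.7, 0.4, 0.2, 0.6, 0.8]     # hypothetical model scores
print(confusion_matrix(y, [int(v >= 0.5) for v in s]))  # (3, 1, 0, 2)
print(auc(y, s))  # 7/9: one non-default outscores two defaults
```

The rank-based AUC is threshold-free, which is why it complements the threshold-dependent confusion matrix when ranking models.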
Ensemble models were used to classify the customers who took out household loans at the different banks and to determine the likelihood that they would default. This led to the development of further mechanisms for assessing the models at the central bank, such as GentleBoost, LogitBoost and RobustBoost. The following were adopted as key performance measures: the AUC curve, the confusion matrix, running time, and k-fold cross-validation. This ensured that the prediction models in this critical field were well structured. Our findings met our expectations. The key performance measures for comparing the different models show that ensemble models are the most suitable. The parameters selected for this study varied in importance depending on the prediction model. We found that several new parameters were significant and as influential as we expected. One of the key contributions that we made was the use of a substantial amount of data, which can be considered a contribution in itself. This was also the first time that a credit rating model was developed for Islamic banks. We provided an internal model for each of the six banks we studied, as well as a robust system for the central bank. The results of our work show that banks should reduce the credit risk weight for each customer from 75% to 30% or less. The results should also enhance the role of the central bank, given that our work provides a robust system with new, previously undiscovered variables. This system can be used to calculate the credit default probability. There will therefore be efficient supervision of the banks' internal systems, ensuring the reliability of the banks. This will also be supported at the central bank through periodic stress testing that adds new impact factors for predicting default cases. A periodic run of the system will help to prevent customers from defaulting unexpectedly.
In conclusion, our models could be passed to other governmental financial bodies in Kuwait.

Item (Open Access): A parallel generalized relaxation method for high-performance image segmentation on GPUs (Elsevier, 2015-05-01)
Authors: D’Ambra, Pasqua; Filippone, Salvatore
Fast and scalable software modules for image segmentation are needed for modern high-throughput screening platforms in Computational Biology. Indeed, accurate segmentation is one of the main steps in a basic software pipeline aimed at extracting accurate measurements from a large number of images. Image segmentation is often formulated through a variational principle, where the solution is the minimum of a suitable functional, as in the case of the Ambrosio–Tortorelli model. The Euler–Lagrange equations associated with the above model are a system of two coupled elliptic partial differential equations whose finite-difference discretization can be efficiently solved by a generalized relaxation method, such as Jacobi or Gauss–Seidel, corresponding to a first-order alternating minimization scheme. In this work we present a parallel software module for image segmentation based on the Parallel Sparse Basic Linear Algebra Subprograms (PSBLAS), a general-purpose library for parallel sparse matrix computations, using its Graphics Processing Unit (GPU) extensions, which allow us to exploit in a simple and transparent way the performance capabilities of both multi-core CPUs and many-core GPUs.
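The relaxation step named in this entry (Jacobi or Gauss–Seidel) can be sketched on a toy discrete elliptic problem. This is the standard textbook sweep on an invented 1-D problem, not the PSBLAS GPU module, which applies the same idea to the coupled 2-D segmentation equations.

```python
# Textbook Gauss-Seidel sweeps on a 1-D discrete Poisson problem
# -u'' = f with u(0) = u(1) = 0: each sweep updates u[i] in place from
# its latest neighbours, the relaxation step the solver generalizes.

def gauss_seidel(u, f, h, sweeps):
    for _ in range(sweeps):
        for i in range(1, len(u) - 1):
            u[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])
    return u

n, h = 9, 1.0 / 8                   # 9 grid points on [0, 1]
f = [2.0] * n                       # constant source term
u = gauss_seidel([0.0] * n, f, h, sweeps=500)
print(round(u[4], 4))  # midpoint of the exact solution x*(1 - x): 0.25
```

Because the scheme only touches nearest neighbours, it parallelizes naturally once recast with sparse matrix-vector operations, which is what the GPU extensions exploit.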
We discuss performance results, in terms of execution times and speed-up, of the segmentation module running on GPUs as well as on multi-core CPUs, in the analysis of 2D gray-scale images of mouse embryonic stem cell colonies coming from biological experiments.

Item (Open Access): Social ski driver conditional autoregressive-based deep learning classifier for flight delay prediction (Springer, 2022-01-30)
Authors: Bisandu, Desmond Bala; Moulitsas, Irene; Filippone, Salvatore
The importance of robust flight delay prediction has recently increased in the air transportation industry. This industry seeks alternative methods and technologies for more robust flight delay prediction because of its significance for all stakeholders. The most affected are airlines, which suffer monetary and passenger-loyalty losses. Several studies have attempted to analyse and solve flight delay prediction problems using machine learning methods. This research proposes a novel alternative method, namely social ski driver conditional autoregressive-based (SSDCA-based) deep learning. Our proposed method combines the Social Ski Driver algorithm with Conditional Autoregressive Value at Risk by Regression Quantiles. We consider the most relevant instances from the training dataset, which are the delayed flights. We applied the Yeo-Johnson transformation to stabilise the data variance. We then performed the training and testing of our data using deep recurrent neural network (DRNN) and SSDCA-based algorithms. The SSDCA-based optimisation algorithm helped us choose the right network architecture, with better accuracy and less error than the existing literature. The results of our proposed SSDCA-based method and existing benchmark methods were compared, along with their efficiency and computational time.
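The Yeo-Johnson variance-stabilising transformation mentioned in this entry has a published closed form; the sketch below states that formula directly, with lambda picked by hand for illustration rather than fitted by maximum likelihood as a real pipeline would.

```python
import math

# The Yeo-Johnson power transformation, defined piecewise so that it
# handles negative inputs (unlike Box-Cox) and is smooth at x = 0.

def yeo_johnson(x, lam):
    if x >= 0:
        if lam != 0:
            return ((x + 1) ** lam - 1) / lam
        return math.log(x + 1)
    if lam != 2:
        return -(((-x + 1) ** (2 - lam)) - 1) / (2 - lam)
    return -math.log(-x + 1)

# An arbitrary lambda = 0.5 compresses large values on both sides:
print([round(yeo_johnson(v, 0.5), 3) for v in (-2.0, 0.0, 3.0)])
# lambda = 1 is the identity:
print(yeo_johnson(3.0, 1.0))  # 3.0
```

In practice a library routine estimates lambda from the data; this sketch is only the transformation itself.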
The SSDCA-based DRNN provides a more accurate flight delay prediction, with accuracy rates of 0.9361 and 0.9252 on dataset-1 and dataset-2, respectively. To show the reliability of our method, we compared it with other meta-heuristic approaches; the SSDCA-based DRNN outperformed all existing benchmark methods tested in our experiment.

Item (Open Access): Sparse approximate inverse preconditioners on high performance GPU platforms (Elsevier, 2016-01-28)
Authors: Bertaccini, Daniele; Filippone, Salvatore
Simulation with models based on partial differential equations often requires the solution of (sequences of) large and sparse algebraic linear systems. In multidimensional domains, preconditioned Krylov iterative solvers are often appropriate for these duties. Therefore, the search for efficient preconditioners for Krylov subspace methods is a crucial theme. Recent developments, especially in computing hardware, have renewed the interest in approximate inverse preconditioners in factorized form, because their application during the solution process can be more efficient. We present here some experiences focused on the approximate inverse preconditioners proposed by Benzi and Tůma in 1996 and the sparsification and inversion proposed by van Duin in 1999. Computational costs, reorderings and implementation issues are considered on both conventional and innovative computing architectures like Graphics Processing Units (GPUs).

Item (Open Access): Sparse matrix-vector multiplication on GPGPUs (Association for Computing Machinery (ACM), 2017-03-01)
Authors: Filippone, Salvatore
The multiplication of a sparse matrix by a dense vector (SpMV) is a centerpiece of scientific computing applications: it is the essential kernel for the solution of sparse linear systems and sparse eigenvalue problems by iterative methods.
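The SpMV kernel surveyed in this entry is most often stored in Compressed Sparse Row (CSR) form; the sketch below is the textbook serial reference kernel on a made-up matrix, not any particular GPU implementation.

```python
# SpMV y = A*x with A in Compressed Sparse Row (CSR) form: only the
# nonzeros are stored, and row_ptr delimits each row's slice of them.
# GPGPU variants reorganize this loop for coalesced memory access.

def spmv_csr(row_ptr, col_idx, vals, x):
    y = []
    for r in range(len(row_ptr) - 1):
        y.append(sum(vals[k] * x[col_idx[k]]
                     for k in range(row_ptr[r], row_ptr[r + 1])))
    return y

# A = [[4, 0, 1],
#      [0, 2, 0],
#      [3, 0, 5]] stored in CSR:
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
vals = [4.0, 1.0, 2.0, 3.0, 5.0]
print(spmv_csr(row_ptr, col_idx, vals, [1.0, 1.0, 1.0]))  # [5.0, 2.0, 8.0]
```

The irregular, data-dependent indexing via col_idx is precisely what makes efficient GPU implementations of this kernel non-trivial.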
The efficient implementation of the sparse matrix-vector multiplication is therefore crucial, and has been the subject of an immense amount of research, with interest renewed with every major new trend in high performance computing architectures. The introduction of General Purpose Graphics Processing Units (GPGPUs) is no exception, and many articles have been devoted to this problem. With this paper we provide a review of the techniques for implementing the SpMV kernel on GPGPUs that have appeared in the literature over the last few years. We discuss the issues and trade-offs encountered by the various researchers, and present a list of solutions organized into categories according to common features. We also provide a performance comparison across different GPGPU models on a set of test matrices coming from various application domains.

Item (Open Access): Understanding customer behaviours toward the use of electronic banking given customer characteristics and financial portfolios (Center for Promoting Ideas, 2021-01-31)
Authors: Alsanousi, Hessa; Al Barrak, Najla; Moulitsas, Irene; Filippone, Salvatore
The evolution of electronic banking demonstrates the need for research regarding demographics and banking preferences. In this study, financial data were collected from three Kuwaiti banks. The data included customer characteristics, product portfolios, and usage information for all electronic banking channels. These data were used to predict the use of electronic channels, treated as the dependent variables in our study, based on individual customer information, treated as the independent variables. Machine learning (ML) techniques, specifically multinomial logistic regression, were used to handle the data, bearing in mind that these techniques would bring the most benefit to financial analysts and bankers.
The results showed that one can determine the preferred electronic banking channel for each customer by knowing some of their characteristics and their financial portfolio.

Item (Open Access): Using Big Data to compare classification models for household credit rating in Kuwait (Springer, 2021-09-10)
Authors: Albarrak, Najla; Alsanousi, Hessa; Moulitsas, Irene; Filippone, Salvatore
Credit rating risks have become the backbone of bank performance. They are a reflection of the current status of the bank and a milestone for future planning. A good credit assessment can better anticipate expected losses and will minimize the accumulation of unexpected losses. Given advancements in technology, as well as the big data available within banks about customers in an oil country such as Kuwait, a built-in model to support in-house household credit scoring is at management's disposal. In contrast with the current ‘black box’ rating models, we carried out a comparison between different classification models for two types of banking: conventional and Islamic. The classification models are as follows: Logistic Regression, Fine Decision Tree, Linear Support Vector Machines, Kernel Naïve Bayes, and RUSBoosted. The last of these proved sufficient to classify banks' household customers and determine their default cases.

Item (Open Access): Using Big Data to compare classification models for household credit rating in Kuwait (Inderscience, 2021-03-13)
Authors: Albarrak, Najla; Alsanousi, Hessa; Moulitsas, Irene; Filippone, Salvatore
Credit rating risks have become the main indicator of bank performance. They are a reflection of the current status of the bank and an important milestone for future planning. An effective credit assessment can better anticipate expected losses and will minimize the accumulation of unexpected losses. In an oil country such as Kuwait, advancements in technology, as well as the big data available within banks about customers, can lead to a built-in credit assessment model.
This built-in model can help with in-house household credit scoring, at the discretion of a financial institution's management. In contrast with the current ‘black box’ rating models, we carried out a comparison between different classification models for two types of banking: conventional and Islamic. The classification models are as follows: Logistic Regression, Fine Decision Tree, Linear Support Vector Machines, Kernel Naïve Bayes, and RUSBoosted. The last of these proved sufficient to classify banks' household customers and determine their default cases. Keywords: Classification Models, Conventional Banking, Credit Rating, Household Customers, Islamic Banking.

Item (Open Access): Using client’s characteristics and their financial products to predict their usage of banking electronic channels (Springer, 2021-10-27)
Authors: Alsanousi, Hessa; Albarrak, Najla; Moulitsas, Irene; Filippone, Salvatore
Technological innovation and its impact on the progress of electronic banking establish the requirement for this research regarding customer demographics, financial portfolios and banking preferences. In this research, banking financial data were collected from three Kuwaiti banks. The data included usage information for all electronic banking channels for each customer, together with their characteristics and financial portfolio. The aim of this study is to predict customer use of electronic channels, treated as the dependent variables, considering individual customer information, which is treated as the independent variables. To bring the most benefit to bankers and financial analysts, machine learning (ML) techniques, specifically multinomial logistic regression, were used to deal with the data, from cleaning to analysis. The results disclosed that banks can determine the preferred electronic banking channel for each of their customers by knowing some information about their characteristics and financial product portfolio.
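Across the banking studies in this listing, the recurring tool is multinomial logistic regression, run in MATLAB in the original work. As a hedged, self-contained illustration of the technique itself, the sketch below trains a softmax classifier by gradient descent on entirely invented "customer" data with three channel classes; none of the data, feature names or class labels come from the studies.

```python
import math

# Minimal multinomial logistic (softmax) regression by gradient descent,
# a generic illustration of the technique the studies applied in MATLAB.
# Invented data: one scaled feature and three hypothetical channel
# classes (0 = mobile app, 1 = website, 2 = call centre).

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]    # shift by max for stability
    s = sum(e)
    return [v / s for v in e]

def train(xs, ys, classes=3, lr=0.5, steps=3000):
    w = [0.0] * classes                 # per-class weight
    b = [0.0] * classes                 # per-class bias
    for _ in range(steps):
        for x, y in zip(xs, ys):
            p = softmax([w[c] * x + b[c] for c in range(classes)])
            for c in range(classes):    # cross-entropy gradient step
                g = p[c] - (1.0 if c == y else 0.0)
                w[c] -= lr * g * x
                b[c] -= lr * g
    return w, b

def predict(w, b, x):
    p = softmax([w[c] * x + b[c] for c in range(len(w))])
    return p.index(max(p))

xs = [0.1, 0.2, 0.3, 0.5, 0.6, 0.7, 0.9, 1.0, 1.1]   # hypothetical feature
ys = [0, 0, 0, 1, 1, 1, 2, 2, 2]                     # preferred channel
w, b = train(xs, ys)
print([predict(w, b, x) for x in (0.15, 0.55, 1.05)])
```

On points well inside each interval the trained model should recover the corresponding class; a production pipeline would instead fit many features per customer and validate out of sample.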