Solvers on Advanced Parallel Architectures with Application to Partial Differential Equations and Discrete Optimisation

dc.contributor.advisor: Thompson, Chris
dc.contributor.author: Czapinski, Michal
dc.date.accessioned: 2015-07-10T14:25:16Z
dc.date.available: 2015-07-10T14:25:16Z
dc.date.issued: 2014-05
dc.description.abstract: This thesis investigates techniques for the solution of partial differential equations (PDEs) on advanced parallel architectures comprising central processing units (CPUs) and graphics processing units (GPUs). Many physical phenomena studied by scientists and engineers are modelled with PDEs, and these are often computationally expensive to solve. This is one of the main drivers of large-scale computing development. There are many well-established PDE solvers; however, they are often inherently sequential. Consequently, there is a need to redesign the existing algorithms and to develop new methods optimised for advanced parallel architectures. This task is challenging due to the need to identify and exploit opportunities for parallelism, and to deal with communication overheads. Moreover, a wide range of parallel platforms is available, and interoperability issues arise if these are employed to work together. This thesis offers several contributions. First, performance characteristics of hybrid CPU-GPU platforms are analysed in detail in three case studies. Secondly, an optimised GPU implementation of the Preconditioned Conjugate Gradients (PCG) solver is presented. Thirdly, a multi-GPU iterative solver, the Distributed Block Direct Solver (DBDS), is developed. Finally, and perhaps most significantly, an innovative streaming processing technique for FFT-based Poisson solvers is introduced. Each of these contributions offers significant insight into the application of advanced parallel systems in scientific computing. The techniques introduced in the case studies allow us to hide most of the communication overhead on hybrid CPU-GPU platforms. The proposed PCG implementation achieves 50–68% of the theoretical GPU peak performance, and it is more than 50% faster than the state-of-the-art solution (CUSP library). DBDS follows the Block Relaxation scheme to find the solution of linear systems on hybrid CPU-GPU platforms. The convergence of DBDS has been analysed and a procedure to compute a high-quality upper bound is derived. Thanks to the novel streaming processing technique, our FFT-based Poisson solvers are the first to handle problems larger than the GPU memory, and to enable multi-GPU processing with a linear speed-up. This is a significant improvement over the existing methods, which are designed to run on a single GPU and are limited by the device memory size. Our algorithm needs only 6.9 seconds to solve a 2D Poisson problem with 2.4 billion variables (9 GB) on two Tesla C2050 GPUs (3 GB memory each).
dc.identifier.uri: http://dspace.lib.cranfield.ac.uk/handle/1826/9315
dc.language.iso: en
dc.publisher: Cranfield University
dc.rights: © Cranfield University, 2014. All rights reserved. No part of this publication may be reproduced without the written permission of the copyright holder.
dc.title: Solvers on Advanced Parallel Architectures with Application to Partial Differential Equations and Discrete Optimisation
dc.type: Thesis or dissertation
dc.type.qualificationlevel: Doctoral
dc.type.qualificationname: PhD
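
The abstract states that DBDS follows the Block Relaxation scheme for linear systems. As a rough illustration of that general scheme only, the sketch below shows a plain block Jacobi iteration in NumPy on a single CPU. It is not the thesis's DBDS implementation: the function name `block_jacobi`, its parameters, and the dense-matrix setting are assumptions made for this example.

```python
import numpy as np


def block_jacobi(A, b, block_size, tol=1e-10, max_iter=500):
    """Generic block Jacobi relaxation (illustrative sketch, not DBDS).

    The unknowns are split into contiguous blocks. In every sweep each
    diagonal block is solved directly, while the coupling to all other
    blocks uses the previous iterate, so the block solves are independent
    of one another and could run in parallel.
    """
    n = len(b)
    blocks = [slice(s, min(s + block_size, n)) for s in range(0, n, block_size)]
    x = np.zeros(n)
    for iteration in range(1, max_iter + 1):
        x_new = np.empty(n)
        for blk in blocks:
            # Right-hand side for this block: b minus the off-block coupling,
            # obtained by removing the diagonal-block term from the full row product.
            rhs = b[blk] - A[blk, :] @ x + A[blk, blk] @ x[blk]
            x_new[blk] = np.linalg.solve(A[blk, blk], rhs)
        x = x_new
        # Stop once the relative residual is small enough.
        if np.linalg.norm(b - A @ x) <= tol * np.linalg.norm(b):
            return x, iteration
    return x, max_iter
```

For a strictly diagonally dominant matrix, for example a tridiagonal matrix with 4 on the diagonal and -1 on the off-diagonals, calling `block_jacobi(A, b, block_size=4)` returns the approximate solution together with the number of sweeps used. Convergence is not guaranteed for arbitrary matrices.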

Files

Original bundle
Name: Czapinski_Michal_Thesis_2014.pdf
Size: 7.41 MB
Format: Adobe Portable Document Format