Understanding the relevance of parallelising machine learning algorithms using CUDA for sentiment analysis
Abstract
Sentiment classification is a core task in natural language processing, using machine learning algorithms to identify the sentiment expressed in textual data. Over the years, advances in machine learning, particularly with Naive Bayes (NB) and Support Vector Machines (SVM), have substantially improved sentiment classification. These models benefit from word embedding techniques such as Word2Vec and GloVe, which provide dense vector representations of words that capture their semantic and syntactic relationships. This paper explores the parallelisation of NB and SVM models using CUDA on GPUs to improve computational efficiency and performance. Despite the computational power offered by GPUs, the literature on parallelising machine learning methods, especially for sentiment classification, remains limited. Our work aims to fill this gap by comparing the performance of NB and SVM on CPU and GPU platforms, focusing on execution time and model accuracy. Our experiments demonstrate that NB outperforms SVM in execution time and overall efficiency, particularly when GPU acceleration is used. The NB model consistently achieves higher accuracy, precision, and F1 scores with both Word2Vec and GloVe embeddings. The results demonstrate the importance of GPU acceleration, tuned through the number of threads per block, for large-scale sentiment analysis, and lay the foundation for parallelising sentiment classification tasks.
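As a rough illustration of the kind of kernel launch whose threads-per-block setting is being varied (a minimal sketch, not code from the paper; the kernel name, embedding dimensionality, document count, and the linear scoring function are all assumptions), the following CUDA snippet assigns one thread per document and scores its embedding against a weight vector, as a linear classifier such as an SVM decision function would:

    #include <cstdio>
    #include <cuda_runtime.h>

    #define DIM 100  // assumed embedding dimensionality (e.g. 100-d GloVe vectors)

    // Each thread scores one document embedding with a linear decision
    // function w.x + b; sign(score) would give the predicted sentiment.
    __global__ void score_documents(const float *embeddings, const float *weights,
                                    float bias, float *scores, int n_docs)
    {
        int doc = blockIdx.x * blockDim.x + threadIdx.x;
        if (doc >= n_docs) return;

        float s = bias;
        for (int j = 0; j < DIM; ++j)
            s += embeddings[doc * DIM + j] * weights[j];
        scores[doc] = s;
    }

    int main()
    {
        const int n_docs = 1 << 16;             // illustrative corpus size
        const int threads_per_block = 256;      // the launch parameter being tuned
        const int blocks = (n_docs + threads_per_block - 1) / threads_per_block;

        float *d_emb, *d_w, *d_scores;
        cudaMalloc((void **)&d_emb, n_docs * DIM * sizeof(float));
        cudaMalloc((void **)&d_w, DIM * sizeof(float));
        cudaMalloc((void **)&d_scores, n_docs * sizeof(float));
        // (real embeddings and trained weights would be copied in with cudaMemcpy here)

        score_documents<<<blocks, threads_per_block>>>(d_emb, d_w, 0.0f, d_scores, n_docs);
        cudaDeviceSynchronize();

        cudaFree(d_emb); cudaFree(d_w); cudaFree(d_scores);
        return 0;
    }

Because occupancy and memory-coalescing behaviour depend on the block size, changing threads_per_block in a launch like this is what makes execution time vary with the threads-per-block configuration discussed above.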