Creation of a software tool for browsing genome variation

Date published

2011-10

Publisher

Cranfield University

Department

Cranfield Health

Type

Thesis or dissertation

Abstract

The advent of next-generation sequencing has led to an explosion in the volume of DNA sequence data in public databases. The challenge now is to find tools that make it easier for researchers to browse and make sense of this data. One organism that has recently been subject to extensive sequencing is Plasmodium falciparum, a devastating pathogen that infects hundreds of millions of people annually. The first goal of this project was to create a new desktop genome variation browser that can quickly handle large amounts of data from sequencing projects involving numerous isolates. The second aim was to use the new tool to analyse recently sequenced strains of P. falciparum in order to identify polymorphisms that may be involved in antibiotic resistance. The variation browser described here was written in C++ using the Qt graphical framework, with the aim of producing a fast, easy-to-use tool that can visualise data from variant call format (VCF) files, now a de facto standard for storing polymorphism data. The user can browse a VCF file to obtain a graphical representation of the variation among multiple samples. For rapid identification of relevant polymorphisms, the user can filter variant positions using several criteria, including mapping quality, sample group membership, and whether the mutations alter the amino acid sequence of a gene. Some basic statistical analysis was incorporated to help identify selective pressures acting on polymorphic sites. The usefulness of the program was assessed by analysing 75 isolates of P. falciparum from Africa and Asia. Mutations were identified in the chloroquine resistance marker protein, PI4-K, and a putative ubiquitin carboxyl hydrolase, all of which are potentially involved in antibiotic resistance.
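As an illustration of the kind of filtering mentioned in the abstract, the following is a minimal sketch of a mapping-quality filter over VCF data lines. It is not taken from the thesis: the function names (split, mappingQuality, passesMappingQuality) are hypothetical, it uses only the C++17 standard library rather than Qt, and it assumes that mapping quality is stored under the MQ key of the INFO column, which is a common convention but may differ from what the described tool actually reads.

```cpp
#include <optional>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Split a string on a single-character delimiter.
std::vector<std::string> split(const std::string& text, char delim) {
    std::vector<std::string> parts;
    std::stringstream stream(text);
    std::string item;
    while (std::getline(stream, item, delim)) {
        parts.push_back(item);
    }
    return parts;
}

// Return the numeric value of the MQ key in a VCF INFO column, if present.
// Assumption: mapping quality is recorded as "MQ=<number>" in INFO.
std::optional<double> mappingQuality(const std::string& info) {
    for (const std::string& entry : split(info, ';')) {
        if (entry.rfind("MQ=", 0) == 0) {  // entry starts with "MQ="
            try {
                return std::stod(entry.substr(3));
            } catch (const std::exception&) {
                return std::nullopt;  // value was not a valid number
            }
        }
    }
    return std::nullopt;
}

// True if the line is a VCF data line (not a header), has the eight fixed
// columns, and carries a mapping quality of at least minMq.
bool passesMappingQuality(const std::string& line, double minMq) {
    if (line.empty() || line[0] == '#') {
        return false;  // skip blank lines and header/meta lines
    }
    const std::vector<std::string> cols = split(line, '\t');
    if (cols.size() < 8) {
        return false;  // INFO is the eighth fixed column
    }
    const std::optional<double> mq = mappingQuality(cols[7]);
    return mq.has_value() && *mq >= minMq;
}
```

In this sketch a caller would read a VCF file line by line and keep only the lines for which passesMappingQuality(line, threshold) returns true. Variants that lack an MQ entry are rejected here; a real browser might instead keep them and flag the missing value.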

Keywords

sequencing, DNA, Plasmodium falciparum, variant call format, VCF, polymorphisms, Mutations

Funder/s

Biotechnology and Biological Sciences Research Council (BBSRC)