Developing novel bioinformatics tools and pipelines for working with reference genomes and large sets of resequenced genomes.

dc.contributor.advisor: Mohareb, Fady R.
dc.contributor.author: Kurowski, Tomasz Janusz
dc.date.accessioned: 2024-04-30T11:23:57Z
dc.date.available: 2024-04-30T11:23:57Z
dc.date.issued: 2022-01
dc.description.abstract: Both reference genomes assembled for individual species and large, publicly maintained sets of resequenced genomes are of immense value to researchers. The former represent important milestones for research involving the species of interest and serve as ostensibly static points of reference for other data, while the latter serve as catalogues of genetic variation, enabling researchers to place their own data in a wider context. However, maintaining sets of resequenced genomes and ensuring their integrity as they undergo updates to match any new releases of their reference genome poses certain computational challenges, as does manipulating and comparing those large sets of genomes in general. This work reports on the detection and correction of significant errors which were introduced into resequenced tomato data in the course of updating them to a new version. It also introduces Tersect, a low-level utility optimized for manipulating and comparing large sets of resequenced genomic data, as well as Tersect Browser, a Web application which uses the high performance of Tersect, coupled with a higher-level indexing and precomputation scheme to allow for interactive comparison of large sets of resequenced genomes, giving biologists a tool capable of generating visualisations of genetic distance and phylogenetic relationships based on whole-genome sequence data from hundreds of genomes in seconds rather than hours. [en_UK]
dc.description.coursename: PhD in Environment and Agrifood [en_UK]
dc.identifier.uri: https://dspace.lib.cranfield.ac.uk/handle/1826/21286
dc.language.iso: en_UK [en_UK]
dc.publisher: Cranfield University [en_UK]
dc.publisher.department: SWEE [en_UK]
dc.rights: © Cranfield University, 2022. All rights reserved. No part of this publication may be reproduced without the written permission of the copyright holder. [en_UK]
dc.subject: Comparative genomics [en_UK]
dc.subject: genotyping [en_UK]
dc.subject: SNP [en_UK]
dc.subject: SNV [en_UK]
dc.subject: Variant Call Format [en_UK]
dc.subject: Introgression [en_UK]
dc.subject: Tomato [en_UK]
dc.title: Developing novel bioinformatics tools and pipelines for working with reference genomes and large sets of resequenced genomes. [en_UK]
dc.type: Thesis or dissertation [en_UK]
dc.type.qualificationlevel: Doctoral [en_UK]
dc.type.qualificationname: PhD [en_UK]

Files

Original bundle
Name: Kurowski_T_2023.pdf
Size: 8.47 MB
Format: Adobe Portable Document Format

License bundle
Name: license.txt
Size: 1.63 KB
Format: Item-specific license agreed upon to submission