Please use this identifier to cite or link to this item: http://dspace.lib.cranfield.ac.uk/handle/1826/7441

Document Type: Thesis or dissertation
Title: Identification, organisation and visualisation of complete proteomes in UniProt throughout all taxonomic ranks: archaea, bacteria, eukaryote and virus
Authors: Stanley, Eleanor Juliet
Supervisors: Mohareb, Fady
Jesus Martin, Maria
Issue Date: Apr-2012
Abstract: Users of uniprot.org want to be able to query, retrieve and download proteome sets for an organism of their choice. They expect the data to be easily accessed, complete and up to date based on currently available knowledge. UniProt release 2012_01 (25th Jan 2012) contains the proteomes of 2,923 organisms, of which 50% are bacteria, 38% viruses, 8% eukaryota and 4% archaea. Note that the term 'organism' is used in a broad sense to include subspecies, strains and isolates. Each completely sequenced organism is processed as an independent organism, hence the availability of 38 strain-specific proteomes of Escherichia coli for download. There is a project within UniProt dedicated to the mammoth task of maintaining the “Proteomes database”. This active resource is essential for UniProt to continually provide high-quality proteome sets to the users. Accurate identification and incorporation of new, publicly available proteomes, as well as the maintenance of existing proteomes, permits sustained growth of the proteomes project. This is a huge, complicated and vital task accomplished by the activities of both curators and programmers. This thesis explains the data input and output of the proteomes database: the flow of genome project data from the nucleotide database into the proteomes database, and then how, from each genome, a proteome is identified, augmented and made visible to uniprot.org users. Along this journey of discovery many issues arose: puzzles concerning data gathering, data integrity and data visualisation. All were resolved and the outcome is a well-documented, actively maintained database that strives to provide optimal proteome information to its users.
URI: http://dspace.lib.cranfield.ac.uk/handle/1826/7441
Appears in Collections: PhD, EngD and MSc by research theses (Cranfield Health)

Files in This Item:

File: Eleanor_Juliet_Stanley_Thesis_2012.pdf
Size: 2.95 MB
Format: Adobe PDF

Items in CERES are protected by copyright, with all rights reserved, unless otherwise indicated.