Browsing by Author "Mohareb, Fady R."
Item Open Access An automated ranking platform for machine learning regression models for meat spoilage prediction using multi-spectral imaging and metabolic profiling (Elsevier, 2017-05-20) Estelles-Lopez, Lucia; Ropodi, Athina; Pavlidis, Dimitris; Fotopoulou, Jenny; Gkousari, Christina; Peyrodie, Audrey; Panagou, Efstathios; Nychas, George-John; Mohareb, Fady R.
Over the past decade, analytical approaches based on vibrational spectroscopy, hyperspectral/multispectral imaging and biomimetic sensors have gained popularity as rapid and efficient methods for assessing food quality, safety and authentication, and as a sensible alternative to the expensive and time-consuming conventional microbiological techniques. Due to the multi-dimensional nature of the data generated from such analyses, the output needs to be coupled with a suitable statistical approach or machine-learning algorithm before the results can be interpreted. Choosing the optimum pattern recognition or machine learning approach for a given analytical platform is often challenging and involves a comparative analysis between various algorithms in order to achieve the best possible prediction accuracy. In this work, “MeatReg”, a web-based application, is presented. It automates the procedure of identifying the best machine learning method for comparing data from several analytical techniques, to predict the counts of microorganisms responsible for meat spoilage regardless of the packaging system applied. In particular, up to seven regression methods were applied: ordinary least squares regression, stepwise linear regression, partial least squares regression, principal component regression, support vector regression, random forest and k-nearest neighbours. “MeatReg” was tested with minced beef samples stored under aerobic and modified atmosphere packaging and analysed with electronic nose, HPLC, FT-IR, GC–MS and multispectral imaging instruments.
Populations of total viable count, lactic acid bacteria, pseudomonads, Enterobacteriaceae and B. thermosphacta were predicted. As a result, recommendations of which analytical platforms are suitable to predict each type of bacteria and which machine learning methods to use in each case were obtained. The developed system is accessible via the link: http://elvis.misc.cranfield.ac.uk/SORF/.
Item Open Access Biochemical profile of heritage and modern apple cultivars and application of machine learning methods to predict usage, age, and harvest season (American Chemical Society, 2017-06-02) Anastasiadi, Maria; Mohareb, Fady R.; Redfern, Sally P.; Berry, Mark; Simmonds, Monique; Terry, Leon A.
The present study represents the first major attempt to characterise the biochemical profile in different tissues of a large selection of apple cultivars sourced from the UK’s National Fruit Collection, comprising dessert, ornamental, cider and culinary apples. Furthermore, advanced machine learning methods were applied with the objective of identifying whether the phenolic and sugar composition of an apple cultivar could be used as a biomarker fingerprint to differentiate between heritage and mainstream commercial cultivars, as well as to govern the separation among primary usage groups and harvest season. Prediction accuracy > 90% was achieved with Random Forest for all three models. The results highlighted the extraordinary phytochemical potency and unique profile of some heritage, cider and ornamental apple cultivars, especially in comparison to more mainstream apple cultivars.
Therefore, these findings could guide future cultivar selection on the basis of health-promoting phytochemical content.
Item Open Access A bioinformatics and genotyping approach exploring personalised nutrition. (Cranfield University, 2021-11) Molitor, Corentin; Mohareb, Fady R.; Thompson, Andrew J.
Personalised nutrition is at its early stages but shows the potential of improving the health of the general population, at a time when diabetes and obesity are becoming worldwide epidemics. However, it will need to be based on rigorous scientific research, as well as being accompanied by public policies and ethical considerations. Research is making great progress towards the understanding of the impact of genetics on complex diseases, which involve hundreds, or thousands, of variants, each having a varying effect on the disease. Personalised medicine aims at harnessing this genetic information to tailor prevention and treatment according to each individual. Unfortunately, the links between the genotype and the phenotype are not yet fully understood. Moreover, while the content of publicly available genetic databases is growing exponentially, these databases often use different formats and means of access, making it difficult to get complete information. In addition, evaluating the genetic predisposition of an individual to a disease is not straightforward, and while Polygenic Risk Score models can help in this regard, they are often only based on common variants, which might lead to misevaluation of the risk for rare-variant carriers. This thesis presents: (i) VarGen, an R package to merge information from different genetic databases, which has the potential to infer new variant-disease relationships; (ii) a new method to improve Polygenic Risk Score models, which includes variants obtained from VarGen on top of the common variants from standard polygenic analyses;
(iii) the results of a microRNA differential expression analysis, aiming at identifying the impact of microRNAs on the development of severe Hypoxic-Ischemic Encephalopathy in newborns.
Item Open Access A chromosome-level genome assembly of Solanum chilense, a tomato wild relative associated with resistance to salinity and drought (Frontiers, 2024-03-08) Molitor, Corentin; Kurowski, Tomasz J.; Fidalgo de Almeida, Pedro M.; Kevei, Zoltan; Spindlow, Daniel J.; Chacko Kaitholil, Steffimol R.; Iheanyichi, Justice U.; Prasanna, H. C.; Thompson, Andrew J.; Mohareb, Fady R.
Introduction: Solanum chilense is a wild relative of tomato reported to exhibit resistance to biotic and abiotic stresses. There is potential to improve tomato cultivars via breeding with wild relatives, a process greatly accelerated by suitable genomic and genetic resources. Methods: In this study we generated a high-quality, chromosome-level, de novo assembly for the S. chilense accession LA1972 using a hybrid assembly strategy with ~180 Gbp of Illumina short reads and ~50 Gbp of long PacBio reads. Further scaffolding was performed using Bionano optical maps and 10x Chromium reads. Results: The resulting sequences were arranged into 12 pseudomolecules using Hi-C sequencing. This resulted in a 901 Mbp assembly, with a completeness of 95%, as determined by Benchmarking with Universal Single-Copy Orthologs (BUSCO). Sequencing of RNA from multiple tissues, resulting in ~219 Gbp of reads, was used to annotate the genome assembly with an RNA-Seq guided gene prediction, and for a de novo transcriptome assembly. This chromosome-level, high-quality reference genome for S. chilense accession LA1972 will support future breeding efforts for more sustainable tomato production. Discussion: Gene sequences related to drought and salt resistance were compared between S. chilense and S. lycopersicum to identify amino acid variations with high potential for functional impact.
These variants were subsequently analysed in 84 resequenced tomato lines across 12 different related species to explore the variant distributions. We identified a set of 7 putative impactful amino acid variants, some of which may also impact on fruit development, for example the ethylene-responsive transcription factor WIN1 and ethylene-insensitive protein 2. These variants could be tested for their ability to confer functional phenotypes to cultivars that have lost these variants.
Item Open Access A comparison of artificial neural networks and partial least squares modelling for the rapid detection of the microbial spoilage of beef fillets based on Fourier transform infrared spectral fingerprints (Elsevier Science B.V., Amsterdam, 2011-06-30) Panagou, Efstathios Z.; Mohareb, Fady R.; Argyri, Anthoula A.; Bessant, Conrad M.; Nychas, George-John E.
A series of partial least squares (PLS) models were employed to correlate spectral data from FTIR analysis with beef fillet spoilage during aerobic storage at different temperatures (0, 5, 10, 15, and 20°C) using the dataset presented by Argyri et al. (2010). The performance of the PLS models was compared with a three-layer feed-forward artificial neural network (ANN) developed using the same dataset. FTIR spectra were collected from the surface of meat samples in parallel with microbiological analyses to enumerate total viable counts. Sensory evaluation was based on a three-point hedonic scale classifying meat samples as fresh, semi-fresh, and spoiled. The purpose of the modelling approach employed in this work was to classify beef samples in the respective quality class as well as to predict their total viable counts directly from FTIR spectra. The results obtained demonstrated that both approaches showed good performance in discriminating meat samples in one of the three predefined sensory classes.
The PLS classification models showed performances ranging from 72.0 to 98.2% using the training dataset, and from 63.1 to 94.7% using independent testing dataset. The ANN classification model performed equally well in discriminating meat samples, with correct classification rates from 98.2 to 100% and 63.1 to 73.7% in the train and test sessions, respectively. PLS and ANN approaches were also applied to create models for the prediction of microbial counts. The performance of these was based on graphical plots and statistical indices (bias factor, accuracy factor, root mean square error). Furthermore, results demonstrated reasonably good correlation of total viable counts on meat surface with FTIR spectral data with PLS models presenting better performance indices compared to ANN.Item Open Access CRAMER: A lightweight, highly customisable web-based genome browser supporting multiple visualisation instances(Oxford University Press, 2020-02-28) Anastasiadi, Maria; Bragin, E.; Biojoux, P.; Ahamed, A.; Burgin, Josephine; de Castro Cogle, K.; Llaneza-Lago, S.; Muvunyi, R.; Scislak, M.; Aktan, I.; Molitor, Corentin; Kurowski, Tomasz J.; Mohareb, Fady R.In recent years the ability to generate genomic data has increased dramatically along with the demand for easily personalised and customisable genome browsers for effective visualisation of diverse types of data. Despite the large number of web-based genome browsers available nowadays, none of the existing tools provide means for creating multiple visualisation instances without manual set up on the deployment server side. The Cranfield Genome Browser (CRAMER) is an open-source, lightweight and highly customisable web application for interactive visualisation of genomic data. Once deployed, CRAMER supports seamless creation of multiple visualisation instances in parallel while allowing users to control and customise multiple tracks. 
The application is deployed on a Node.js server and is supported by a MongoDB database which stores all customisations made by the users, allowing quick navigation between instances. Currently, the browser supports visualising a large number of file formats for genome annotation, variant calling, read coverage and gene expression. Additionally, the browser supports direct JavaScript coding for personalised tracks, providing a whole new level of customisation both functionally and visually. Tracks can be added via direct file upload or processed in real-time via links to files stored remotely on an FTP repository. Furthermore, additional tracks can be added by users via simple drag and drop to an existing visualisation instance.
Item Open Access Data analysis methods for copy number discovery and interpretation (Cranfield University, 2014-10) Fitzgerald, Tomas W.; Hurles, Matthew E.; Mohareb, Fady R.
Copy number variation (CNV) is an important type of genetic variation that can give rise to a wide variety of phenotypic traits. Differences in copy number are thought to play major roles in processes that involve dosage sensitive genes, providing beneficial, deleterious or neutral modifications to individual phenotypes. Copy number analysis has long been a standard in clinical cytogenetic laboratories. Gene deletions and duplications can often be linked with genetic syndromes such as the 7q11.23 deletion of Williams-Beuren syndrome, the 22q11 deletion of DiGeorge syndrome and the 17q11.2 duplication of Potocki-Lupski syndrome. Interestingly, copy number based genomic disorders often display reciprocal deletion / duplication syndromes, with the latter frequently exhibiting milder symptoms. Moreover, the study of chromosomal imbalances plays a key role in cancer research. The datasets used for the development of analysis methods during this project are generated as part of the cutting-edge translational project, Deciphering Developmental Disorders (DDD).
This project, the DDD, is the first of its kind and will directly apply state-of-the-art technologies, in the form of ultra-high resolution microarray and next generation sequencing (NGS), to real-time genetic clinical practice. It is a collaboration between the Wellcome Trust Sanger Institute (WTSI) and the National Health Service (NHS) involving the 24 regional genetic services across the UK and Ireland. Although the application of DNA microarrays for the detection of CNVs is well established, individual change point detection algorithms often display variable performances. The definition of an optimal set of parameters for achieving a certain level of performance is rarely straightforward, especially where data qualities vary ... [cont.].
Item Open Access Developing novel bioinformatics tools and pipelines for working with reference genomes and large sets of resequenced genomes. (Cranfield University, 2022-01) Kurowski, Tomasz Janusz; Mohareb, Fady R.
Both reference genomes assembled for individual species and large, publicly maintained sets of resequenced genomes are of immense value to researchers. The former represent important milestones for research involving the species of interest and serve as ostensibly static points of reference for other data, while the latter serve as catalogues of genetic variation, enabling researchers to place their own data in a wider context. However, maintaining sets of resequenced genomes and ensuring their integrity as they undergo updates to match any new releases of their reference genome poses certain computational challenges, as does manipulating and comparing those large sets of genomes in general. This work reports on the detection and correction of significant errors which were introduced into resequenced tomato data in the course of updating them to a new version.
It also introduces Tersect, a low-level utility optimized for manipulating and comparing large sets of resequenced genomic data, as well as Tersect Browser, a Web application which uses the high performance of Tersect, coupled with a higher-level indexing and precomputation scheme to allow for interactive comparison of large sets of resequenced genomes, giving biologists a tool capable of generating visualisations of genetic distance and phylogenetic relationships based on whole-genome sequence data from hundreds of genomes in seconds rather than hours.Item Open Access Ensemble-based support vector machine classifiers as an efficient tool for quality assessment of beef fillets from electronic nose data(Royal Society of Chemistry, 2016-04-06) Mohareb, Fady R.; Papadopoulou, Olga; Panagou, Efstathios; Nychas, George-John; Bessant, Conrad M.Over the past years, the application of electronic nose devices has been investigated as a potential tool for assessing food freshness. This relies on the application of various pattern recognition methods to provide accurate classification and regression models. The models' accuracy depends on the number of samples used during the training process. This often leads to unstable and unreliable classifiers in the case of food quality assessment, where the number of samples is typically less than 200 for a given experiment. The aim of this work is to tackle this problem through the development of a series of ensemble-based classifiers and regression models using support vector machines and electronic nose datasets based on the previously published work of this group. It was found that the developed ensemble provides a higher prediction accuracy compared to the single model approach when estimating the freshness score assigned by the sensory panel; achieving an overall accuracy of 84.1% compared to 72.7% in the case of the single classifier model. 
Another set of calibration ensembles was developed based on SVM regression, in order to predict bacterial species counts, achieving an average overall performance of 85.0%, compared to 76.5% when a single classifier was applied. This increase in the predictive power therefore suggests that combining an electronic nose with ensemble-based systems can be used as an innovative method to assess the freshness of beef fillets.
Item Open Access Generating beneficial predator genomes to provide comparative insights into insecticide resistance-related gene families. (Cranfield University, 2022-01) Bailey, Emma; Mohareb, Fady R.; Hassani-Pak, Keywan; King, Robert
With a rapidly growing population to feed, finding ways to increase crop yields has become more important than ever. Insect pests contribute hugely to yield losses every year and finding methods to effectively control pest levels is therefore crucial to reduce these losses. Insecticide use alone is no longer a viable solution, due to ever increasing levels of resistance developing amongst crop pests, as well as the environmental concerns associated with their overuse. Biological control (the use of natural predators to keep pest populations under control) has proven to be a highly effective method of pest control and generally has less severe environmental impacts than pesticides (although introducing non-native species can result in undesirable changes to local biodiversity). Biological control agents are therefore a key component of Integrated Pest Management (IPM) strategies, which aim to manage pest populations in a sustainable and economical manner. IPM programs prioritise selective insecticides which target the pest species and are harmless to the beneficial predators. However, the numbers of reported insecticide resistance cases are far lower in beneficial predators as opposed to crop pests.
As a result, insecticide application often harms beneficial predator populations and reduces their biological control capabilities, which may allow resistant pest populations to surge after application. Genomic information is readily available for a multitude of crop pest species, however, when this project began, there was minimal genomic data available for beneficial predators. By increasing the availability of genomic data for beneficial predators, we can perform comparative analyses between crop pests and their predators of insecticide target-sites and genes encoding metabolic enzymes potentially responsible for insecticide resistance. These analyses may help uncover whether there is any genomic basis for the reduced number of insecticide resistance cases in beneficial predators compared to crop pests. The aim of this project was to firstly sequence and assemble the genomes of key beneficial predators for which no genomic information is currently available. This included Orius laevigatus (minute pirate bug), Sphaerophoria rueppellii (European hoverfly) and Microctonus brassicae (parasitoid wasp). Next, these genomes were annotated and manual curation of resistance-associated detoxification genes was performed. The resultant detoxification gene sets were then used to perform comparative analyses between beneficial predators and crop pests. The results from the comparative analysis suggest a greater degree of detoxification family gene expansion within crop pests compared to beneficial predators. This difference was particularly apparent in the families associated with detoxification of plant xenobiotics and suggests that the plant-based diet of crop pests provided increased selection pressure for resistance mechanisms prior to the introduction of insecticides. Once insecticides were introduced, crop pests may therefore have had an advantage over beneficial predators in terms of developing insecticide resistance. 
In addition, variation in the levels of resistance between different beneficial predators correlated to some extent with gene expansion, with several factors having likely had some influence on this, including diet, migration and length of commercial use. The knowledge gained from this project could contribute to our understanding of insecticide resistance from a genomic perspective and aid in the development of successful IPM strategies.Item Open Access Genomic variation in Plasmodium vivax malaria reveals regions under selective pressure(PLoS One, 2017-05-11) Diez Benavente, Ernest; Ward, Zoe; Chan, Wilson; Mohareb, Fady R.; Sutherland, Colin J.; Roper, Cally; Campino, Susana; Clark, TaaneBackground Although Plasmodium vivax contributes to almost half of all malaria cases outside Africa, it has been relatively neglected compared to the more deadly P. falciparum. It is known that P. vivax populations possess high genetic diversity, differing geographically potentially due to different vector species, host genetics and environmental factors. Results We analysed the high-quality genomic data for 46 P. vivax isolates spanning 10 countries across 4 continents. Using population genetic methods we identified hotspots of selection pressure, including the previously reported MRP1 and DHPS genes, both putative drug resistance loci. Extra copies and deletions in the promoter region of another drug resistance candidate, MDR1 gene, and duplications in the Duffy binding protein gene (PvDBP) potentially involved in erythrocyte invasion, were also identified. For surveillance applications, continental-informative markers were found in putative drug resistance loci, and we show that organellar polymorphisms could classify P. vivax populations across continents and differentiate between Plasmodia spp. Conclusions This study has shown that genomic diversity that lies within and between P. 
vivax populations can be used to elucidate potential drug resistance and invasion mechanisms, as well as facilitate the molecular barcoding of the parasite for surveillance applications.Item Open Access Identification of meat spoilage gene biomarkers in Pseudomonas putida using gene profiling(Elsevier, 2015-04-20) Mohareb, Fady R.; Iriondo, Maite; Doulgeraki, Agapi I.; van Hoek, Angela; Aarts, Henk; Cauchi, Michael; Nychas, George-John E.While current food science research mainly focuses on microbial changes in food products that lead to foodborne illnesses, meat spoilage remains as an unsolved problem for the meat industry. This can result in important economic losses, food waste and loss of consumer confidence in the meat market. Gram-negative bacteria involved in meat spoilage are aerobes or facultative anaerobes. These represent the group with the greatest meat spoilage potential, where Pseudomonas tend to dominate the microbial consortium under refrigeration and aerobic conditions. Identifying stress response genes under different environmental conditions can help researchers gain an understanding of how Pseudomonas adapts to current packaging and storage conditions. We examined the gene expression profile of Pseudomonas putida KT2440, which plays an important role in the spoilage of meat products. Gene expression profiles were evaluated to select the most differentially expressed genes at different temperatures (30 °C and 10 °C) and decreasing glucose concentrations, in order to identify key genes actively involved with the spoilage process. 
A total of 739 and 1269 genes were found to be differentially expressed at 30 °C and 10 °C, respectively, of which 430 and 568 genes were overexpressed and 309 and 701 genes were repressed.
Item Open Access A loss-of-function allele of a TAC1-like gene (SlTAC1) located on tomato chromosome 10 is a candidate for the Erectoid leaf (Erl) mutation (Springer Verlag, 2019-04-16) González-Arcos, Matías; de Noronha Fonseca, Maria Esther; Basílio Zandonadi, Daniel; Peres, Lázaro E. P.; Arruabarrena, Ana; Ferreira, Demetryus S.; Kevei, Zoltan; Mohareb, Fady R.; Thompson, Andrew J.; Boiteux, Leonardo S.
The genetic basis of an erectoid leaf phenotype was investigated in distinct tomato breeding populations, including one derived from Solanum lycopersicum ‘LT05’ (with the erectoid leaf phenotype and uniform ripening, genotype uu) × S. pimpinellifolium ‘TO-937’ (with the wild-type leaf phenotype and green fruit shoulder, genotype UU). The erectoid leaf phenotype was inherited as a semi-dominant trait and it co-segregated with the u allele of gene SlGLK2 (Solyc10g008160). This genomic location coincides with a previously described semi-dominant mutation named Erectoid leaf (Erl). The genomes of ‘LT05’, ‘TO-937’, and three other unrelated accessions (with the wild-type Erl+ allele) were resequenced with the aim of identifying candidate genes. Comparative genomic analyses, including the reference genome ‘Heinz 1706’ (Erl+ allele), identified an Erectoid leaf-specific single nucleotide polymorphism (SNP) in the gene Solyc10g009320. This SNP caused a change of a glutamine codon (present in all the wild-type genomes) to a TAA (= ochre stop-codon) in the Erl allele, resulting in a smaller version of the predicted mutant protein (221 vs. 279 amino acids). Solyc10g009320, previously annotated as an ‘unknown protein’, was identified as a TILLER ANGLE CONTROL1-like gene.
Linkage between the Erl and Solyc10g009320 was confirmed via Sanger sequencing of the PCR amplicons of the two variant alleles. No recombinants were detected in 265 F2 individuals. Contrasting S7 near-isogenic lines were also homozygous for each of the alternate alleles, reinforcing Solyc10g009320 as a strong Erl candidate gene and opening the possibility for fine-tuning manipulation of tomato architecture in breeding programs.Item Open Access MapOptics: A light-weight, cross-platform visualisation tool for optical mapping alignment(Oxford University Press, 2018-12-07) Burgin, Josephine; Molitor, Corentin; Mohareb, Fady R.Bionano optical mapping is a technology that can assist in the final stages of genome assembly by lengthening and ordering scaffolds in a draft assembly by aligning the assembly to a genomic map. However, currently, tools for visualisation are limited to use on a Windows operating system or are developed initially for visualising large-scale structural variation. MapOptics is a lightweight cross-platform tool that enables the user to visualise and interact with the alignment of Bionano optical mapping data and can be used for in depth exploration of hybrid scaffolding alignments. It provides a fast, simple alternative to the large optical mapping analysis programs currently available for this area of research.Item Open Access MRMaid: the SRM assay design tool for Arabidopsis and other species(2012-07-20T00:00:00Z) Fan, Jun; Mohareb, Fady R.; Jones, Alexandra M. E.; Bessant, Conrad M.Selected reaction monitoring (SRM), sometimes called multiple reaction monitoring (MRM), is becoming the tool of choice for targeted quantitative proteomics in the plant science community. Key to a successful SRM experiment is prior identification of the distinct peptides for the proteins of interest and the determination of the so-called transitions that can be programmed into an LC-MS/MS to monitor those peptides. 
The transition for a given peptide comprises the intact peptide m/z and a high intensity product ion that can be monitored at a characteristic retention time (RT). To aid the design of SRM experiments, several online tools and databases have been produced to help researchers select transitions for their proteins of interest, but many of these tools are limited to the most popular model organisms such as human, yeast, and mouse or require the experimental acquisition of local spectral libraries. In this paper we present MRMaid, a web-based SRM assay design tool whose transitions are generated by mining the millions of identified peptide spectra held in the EBI’s PRIDE database. By using data from this large public repository, MRMaid is able to cover a wide range of species that can increase as the coverage of PRIDE grows. In this paper MRMaid transitions for 25 Arabidopsis thaliana proteins are evaluated experimentally, and found capable of quantifying 23 of these proteins. This performance was found to be comparable with the more time-consuming approach of designing transitions using locally acquired Orbitrap data, indicating that MRMaid is a valuable tool for targeted Arabidopsis proteomics.
Item Open Access Multispectral image analysis approach to detect adulteration of beef and pork in raw meats (Elsevier, 2019-08-08) Ropodi, Athina; Pavlidis, Dimitrios; Mohareb, Fady R.; Panagou, Efstathios; Nychas, George
The aim of this study was to investigate the potential of multispectral imaging supported by multivariate data analysis for the detection of minced beef fraudulently substituted with pork and vice versa. Multispectral images in 18 different wavelengths of 220 meat samples in total from four independent experiments (55 samples per experiment) were acquired for this work. The appropriate amounts of minced beef and pork were mixed in order to achieve nine different proportions of adulteration and two categories of pure pork and beef.
After an image processing step, data from the first three experiments were used for partial least squares-discriminant analysis (PLS-DA) and linear discriminant analysis (LDA) so as to discriminate among all adulteration classes, as well as among adulterated, pure beef and pure pork samples. Results showed very good discrimination between pure and adulterated samples, for PLS-DA and LDA, yielding 98.48% overall correct classification. Additionally, 98.48% and 96.97% of the samples were classified within a ± 10% category of adulteration for LDA and PLS-DA respectively. Lastly, the models were further validated using the data of the fourth experiment for independent testing, where all pure and adulterated samples were classified correctly in the case of PLS-DA, while LDA was proved to be less accurate.Item Open Access Novel approaches for food safety management and communication(Elsevier, 2016-06-25) Nychas, George-John E.; Panagou, Efstathios Z.; Mohareb, Fady R.The current safety and quality controls in the food chain are lacking or inadequately applied and fail to prevent microbial and/or chemical contamination of food products, which leads to reduced confidence among consumers. On the other hand to meet market demands food business operators (producers, retailers, resellers) and regulators need to develop and apply structured quality and safety assurance systems based on thorough risk analysis and prevention, through monitoring, recording and controlling of critical parameters covering the entire product's life cycle. However the production, supply, and processing sectors of the food chain are fragmented and this lack of cohesion results in a failure to adopt new and innovative technologies, products and processes. 
The potential of using information technologies, for example, data storage, communication, cloud, in tandem with data science, for example, data mining, pattern recognition, uncertainty modelling, artificial intelligence, etc., through the whole food chain including processing within the food industry, retailers and even consumers, will provide stakeholders with novel tools regarding the implementation of a more efficient food safety management system.Item Open Access Obesity in pregnancy: risk of gestational diabetes(2018) Balani, Jyoti; Cellek, Selim; Mohareb, Fady R.; Hyer, SteveBackground: Maternal obesity is a risk factor for gestational diabetes and other adverse pregnancy outcomes, but the body fat distribution may be a more important risk factor than body mass index. Pregnancy is an insulin resistant state and more so, in obese women. Metformin could be beneficial in obese pregnant women due to its insulin sensitizing action. The aims of this study are to investigate visceral fat mass as a risk factor for gestational diabetes (VFM study), to develop a mathematical model for the prediction of gestational diabetes in obese women (VFM study) and to examine the effect of metformin on pregnancy outcomes in obese non-diabetic women (MOP Trial). Methods and Results: VFM study: The body composition of 302 obese pregnant women was assessed using bioelectrical impedance. A mathematical model to predict gestational diabetes using machine learning was developed using visceral fat mass which is a novel risk factor in addition to conventional risk factors. 72 of the women developed gestational diabetes (GDM). These women had higher visceral fat mass. Women with a baseline visceral fat mass ≥ 75th percentile, had a 3-fold risk of subsequent gestational diabetes. The mathematical model predicted gestational diabetes with an average overall accuracy of 77.5% and predicted birth centile classes with an average accuracy of 68%. 
According to the decision tree developed, VFM emerged as the most important variable in determining the risk of GDM, and a VFM < 210 was used as the first split in the decision tree. MOP Trial: 133 obese pregnant women were randomised to either metformin or placebo. The pregnancy outcomes were compared between the two groups. Insulin resistance was measured in all women. 118 women completed the trial. Metformin did not reduce the neonatal birth weight z-score, which was the primary outcome of the trial, or the incidence of large for gestational age babies. However, metformin therapy significantly reduced gestational weight gain, reduced the pregnancy rise in visceral fat mass, and attenuated the expected physiological rise in insulin resistance at 28 weeks' gestation. This did not, however, result in an overall significant reduction in the incidence of gestational diabetes. There was a trend towards a reduced incidence of gestational diabetes in women with high baseline insulin resistance randomised to metformin. Conclusions: Visceral fat mass is a novel risk factor for gestational diabetes. The mathematical model successfully predicted gestational diabetes. Metformin reduced gestational weight gain and insulin resistance but did not lower the median neonatal birth weight or reduce the incidence of GDM.
Item Open Access Resequencing at ≥ 40-fold depth of the parental genomes of a Solanum lycopersicum × S. pimpinellifolium recombinant inbred line population and characterisation of frame-shift InDels that are highly likely to perturb protein function (Genetics Society of America, 2015-03-24) Kevei, Zoltan; King, Robert C.; Mohareb, Fady R.; Sergeant, Martin J.; Awan, Sajjad Z.; Thompson, Andrew J.
A recombinant in-bred line population derived from a cross between Solanum lycopersicum var. cerasiforme (E9) and S.
pimpinellifolium (L5) has been used extensively to discover quantitative trait loci (QTL), including those that act via rootstock genotype; however, high-resolution single-nucleotide polymorphism genotyping data for this population are not yet publicly available. Next-generation resequencing of parental lines allows the vast majority of polymorphisms to be characterized and used to progress from QTL to causative gene. We sequenced E9 and L5 genomes to 40- and 44-fold depth, respectively, and reads were mapped to the reference Heinz 1706 genome. In L5 there were three clear regions on chromosome 1, chromosome 4, and chromosome 8 with increased rates of polymorphism. Two other regions, on chromosome 1 and chromosome 10, were highly polymorphic when we compared Heinz 1706 with both E9 and L5, suggesting that the reference sequence contains a divergent introgression in these locations. We also identified a region on chromosome 4 consistent with an introgression from S. pimpinellifolium into Heinz 1706. A large dataset of polymorphisms for use in fine-mapping QTL in a specific tomato recombinant in-bred line population was created, including a high density of InDels validated as simple size-based polymerase chain reaction markers. By carefully filtering and interpreting the output of the SnpEff prediction tool, we have created a list of genes that are predicted to have highly perturbed protein functions in the E9 and L5 parental lines.
Item Open Access Study of microRNAs-21/221 as potential breast cancer biomarkers in Egyptian women (Elsevier, 2016-01-29) Motawi, Tarek Mohamed Kamal; Sadik, Nermin Abdel Hamid; Shaker, Olfat Gamil; El Masry, Maha Rafik; Mohareb, Fady R.
microRNAs (miRNAs) play an important role in cancer prognosis. They are small molecules, approximately 17–25 nucleotides in length, and their high stability in human serum supports their use as novel diagnostic biomarkers of cancer and other pathological conditions.
In this study, we analyzed the expression patterns of miR-21 and miR-221 in serum from a total of 100 Egyptian female subjects, comprising patients with breast cancer, patients with fibroadenoma, and healthy controls. Using microarray-based expression profiling followed by real-time polymerase chain reaction validation, we compared the levels of the two circulating miRNAs in the serum of patients with breast cancer (n = 50), fibroadenoma (n = 25), and healthy controls (n = 25). The small nucleolar RNA SNORD68 was chosen as the housekeeping endogenous control. We found that miR-21 and miR-221 were significantly overexpressed in the serum of breast cancer patients compared to normal controls and fibroadenoma patients. Receiver Operating Characteristic (ROC) curve analysis revealed that miR-21 has greater potential in discriminating between breast cancer patients and the control group, while miR-221 has greater potential in discriminating between breast cancer and fibroadenoma patients. Classification models using k-Nearest Neighbor (kNN), Naïve Bayes (NB), and Random Forests (RF) were developed using expression levels of both miR-21 and miR-221. The best classification performance was achieved by the NB classification models, reaching 91% correct classification. Furthermore, relative miR-221 expression was associated with histological tumor grades. Therefore, it may be concluded that both miR-21 and miR-221 can be used to differentiate between breast cancer patients and healthy controls, but that the diagnostic accuracy of serum miR-21 is superior to that of miR-221 for breast cancer prediction. miR-221 has more diagnostic power in discriminating between breast cancer and fibroadenoma patients. The overexpression of miR-221 has been associated with the breast cancer grade. We also demonstrated that the combined expression of miR-21 and miR-221 can be successfully applied as a breast cancer biomarker.