Predicting gene co-expression from cis-regulatory regions
© Allen et al; licensee BioMed Central Ltd. 2007
Published: 8 May 2007
Single nucleotide polymorphisms (SNPs) affecting phenotype are frequently found in the cis-regulatory regions of a gene. However, current prediction tools ignore positional data and thus overestimate the co-expression of genes. To reduce this error, this work intends to produce a method of correlating quantitative trait loci (QTLs) simultaneously to their cis element and the expressed phenotype. To decipher the cryptic code in this non-coding DNA, autoassociative neural network tools and multidimensional self-organising maps (Kohonen maps) are being used.
Preliminary work using a two-dimensional Kohonen map has produced some promising results. This work aims to further improve the prediction rate by increasing the number of dimensions used in the self-organising map so that more positional data is incorporated. Once an effective prediction tool has been developed, it will be trained using existing genomic data from Arabidopsis pollen and its effectiveness assessed. Once trained, the prediction tool can be used on novel data, and the predictions, if not available in the published literature, can be tested in vitro using transcript and locus profiling techniques within the university.
The ultimate aim is to use the prediction tool on regions of the rice genome mapped for high-density heterosis QTLs to predict co-regulated genes. Then, using in-house genomics techniques, meaningful biological information may be assigned to the co-expression.
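To make the idea of raising the map's dimensionality concrete, the following is a minimal sketch of a self-organising map whose grid can have any number of dimensions. It is not the tool described here: the function names (`train_som`, `best_matching_unit`), the learning-rate and neighbourhood schedules, and the encoding of each gene as a numeric feature vector (for example, motif counts in positional bins of its promoter) are all assumptions made for illustration.

```python
import itertools
import math
import random


def best_matching_unit(weights, vec):
    """Return the grid coordinate whose weight vector is closest to vec."""
    return min(
        weights,
        key=lambda node: sum((w - x) ** 2 for w, x in zip(weights[node], vec)),
    )


def train_som(data, grid_shape, epochs=50, lr0=0.5, sigma0=None, seed=0):
    """Train a self-organising map on an N-dimensional grid.

    data       : list of equal-length feature vectors (lists of floats)
    grid_shape : e.g. (6, 6) for a 2-D map or (4, 4, 4) for a 3-D map
    Returns a dict mapping each grid coordinate to its weight vector.
    All hyperparameters are illustrative defaults, not tuned values.
    """
    rng = random.Random(seed)
    n_features = len(data[0])
    if sigma0 is None:
        sigma0 = max(grid_shape) / 2.0

    # One node per grid coordinate, initialised with random weights in [0, 1).
    nodes = itertools.product(*(range(n) for n in grid_shape))
    weights = {node: [rng.random() for _ in range(n_features)] for node in nodes}

    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for vec in rng.sample(data, len(data)):  # shuffled pass over the data
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # linearly decaying rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood
            bmu = best_matching_unit(weights, vec)
            for node, w in weights.items():
                # Gaussian neighbourhood measured on the grid, not in feature space.
                grid_d2 = sum((a - b) ** 2 for a, b in zip(node, bmu))
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                for i in range(n_features):
                    w[i] += lr * h * (vec[i] - w[i])
            step += 1
    return weights
```

Under these assumptions, moving from a two-dimensional to a three-dimensional map only requires changing `grid_shape`, for example `train_som(vectors, (4, 4, 4))`; genes whose feature vectors map to the same or neighbouring nodes would then be candidates for co-expression.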
Motifs in bacteria were scanned using the Prokaryotic Database of Gene Regulation's Virtual Footprint tool http://prodoric.tu-bs.de/. Output was filtered using customised Perl scripts. For rice and Arabidopsis, motif lists were obtained from the Plant cis-acting regulatory DNA elements (PLACE) database and scanned using a custom-written Perl script.
Bacterial data have been initially analysed using multivariate techniques and indicate greater variation between isogenes than between genes, or even than between isogenes of the same gene in different species. Future work will develop the neural network tools to refine prediction.
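The scanning step can be illustrated with a short sketch. The scripts used in this work were written in Perl and are not reproduced here; the Python below is an independent illustration under stated assumptions: motifs are given as IUPAC consensus strings, only the forward strand is searched, and the motif names, sequences and the `positional_profile` binning are hypothetical rather than taken from PLACE or from the authors' pipeline.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Compile an IUPAC consensus into a regex that also finds overlapping hits."""
    body = "".join(IUPAC[base] for base in motif.upper())
    # The lookahead consumes no characters, so overlapping matches are reported.
    return re.compile("(?=(" + body + "))")


def scan_sequence(sequence, motifs):
    """Return sorted (position, name, matched_text) tuples for every hit.

    Positions are 0-based offsets on the forward strand only.
    """
    sequence = sequence.upper()
    hits = []
    for name, motif in motifs.items():
        for match in motif_to_regex(motif).finditer(sequence):
            hits.append((match.start(), name, match.group(1)))
    return sorted(hits)


def positional_profile(hits, seq_length, n_bins):
    """Count hits per motif in equal-width positional bins along the sequence."""
    profile = {}
    for pos, name, _ in hits:
        bin_index = min(pos * n_bins // seq_length, n_bins - 1)
        profile.setdefault(name, [0] * n_bins)[bin_index] += 1
    return profile


# Hypothetical example: two illustrative motifs in a made-up 16 bp sequence.
example_motifs = {"motif_1": "ACGTG", "motif_2": "TATAWA"}
example_hits = scan_sequence("TTACGTGGTATAAACC", example_motifs)
example_profile = positional_profile(example_hits, seq_length=16, n_bins=4)
```

Keeping the hit positions, rather than only presence or absence, is what would allow positional data to be passed on to a downstream predictor; the binned profile is one possible encoding among many.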