There are at least three compelling reasons
for obtaining finished-quality sequence for the complete rice genome:
I. The ability to determine gene function is highly
dependent on having accurate sequences.
II. Because rice is a model plant for the cereal grasses, a complete rice
sequence will directly affect what can be accomplished with the other
cereal grasses.
III. Agronomic traits of economic importance require
precise map-based genomic sequence.
I. Functional genomics is dependent on high-quality
sequence.
A. Some genes cannot be identified in draft sequence because of sequence
gaps. In one example involving a region of 2.8 million bases, taking
high-quality draft to finished-quality sequence increased the number of
genes identified by more than 6%.
B. Some sequences that are overlooked or under-represented
in draft sequences are likely to be valuable in future work on gene function.
1. Gene control regions, which are very important for functional variation,
typically are located some distance from the gene coding regions and are
likely to be missing in draft sequences.
2. Repetitive sequences may influence gene action.
3. One poorly understood class of genes is the non-coding RNAs. These RNAs
are not recognized by conventional annotation software and, as they are
found in intergenic regions, they may be missing from draft sequence.
C. Draft sequence is more difficult to analyze than finished sequence. The
difficulty grows as a researcher's informatics resources shrink, so a draft
makes data mining hardest for the least well-equipped researchers.
II. Rice shares a co-linear gene arrangement with the other cereal grasses.
Therefore, money spent on completing the rice sequence will leverage
otherwise more costly endeavors in the other grasses with larger genomes.
A. As a model plant, rice is the most likely species in which homologous
genes will be tested for functionality. Identifying the function of these
genes, in addition to their location in the rice genome, will facilitate
comparable functional assignments in other cereal species.
B. It is important to know what levels of redundancy are present and what
genetic pathways are missing from a genome. These determinations can only
be made with a complete, finished-quality sequence. As rice is the only
cereal likely to be completely sequenced, the importance of these
conclusions is magnified when one considers the other, larger cereal
genomes.
III. There are significant economic reasons
for finishing the rice genome.
A. The association of genes with important agronomic traits demands the
availability of a complete, accurate, map-based rice genome sequence. Using
high-quality sequence produced by the RGP, Dr. Masahiro Yano and his
colleagues at the NIAS identified genes controlling flowering time in rice.
B. Mapping genes associated with traits found by plant breeders and plant
geneticists demands correct identification of genes and the ability to
detect polymorphisms. The confidence in distinguishing polymorphisms (as
opposed to sequence errors) comes from having a high-quality reference
sequence from a single cultivar.
C. All genes of interest to scientists and breeders, if left unfinished,
will eventually be sequenced to high quality in individual laboratories
that do not normally perform high-throughput sequencing. By a conservative
estimate, these distributed efforts will collectively cost 20 times more to
finish. Sequencing that is completed piecemeal may not use the same
cultivar, diminishing its utility.
D. By the end of 2002 the IRGSP will have completed the rice genome to at
least phase 2 level (high-quality draft). About 150 out of 400 Mb of the
genome will be in finished-quality sequence at that time. We estimate that
it will take an additional $12.5 million to bring the remainder of the
genome to finished quality. This is a small additional increment compared
with the $100 million already spent by the public effort and the $80 million
spent by two private companies.
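To put the figures in item D in proportion, the arithmetic can be sketched as follows. The variable names are illustrative, and the per-megabase figure assumes, purely for illustration, that the $12.5 million is spread evenly over the unfinished portion of the genome; the text does not say how the cost is distributed.

```python
# Figures taken from item D; variable names are illustrative only.
genome_mb = 400             # approximate genome size (Mb)
finished_mb = 150           # finished-quality sequence expected by end of 2002 (Mb)
finishing_cost_musd = 12.5  # estimated additional cost to finish (million USD)
public_spent_musd = 100     # already spent by the public effort (million USD)
private_spent_musd = 80     # already spent by two private companies (million USD)

remaining_mb = genome_mb - finished_mb
prior_spend_musd = public_spent_musd + private_spent_musd

# Additional cost as a fraction of what has already been spent.
increment_fraction = finishing_cost_musd / prior_spend_musd

# Assumption (not from the text): cost is spread evenly over the remainder.
cost_per_mb_usd = finishing_cost_musd * 1_000_000 / remaining_mb

print(f"Remaining to finish: {remaining_mb} Mb")
print(f"Increment over prior spending: {increment_fraction:.1%}")
print(f"Implied cost per Mb: ${cost_per_mb_usd:,.0f}")
```

Under these assumptions, the remaining 250 Mb would be finished for roughly 7% of the combined public and private spending to date.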
Useful scientific discoveries and applications come from basic research.
The availability of an accurate, complete, map-based sequence of a
cereal genome will promote groundbreaking research in a number of areas:
A. One of the discoveries from plant genome sequencing is the abundance of
gene families that are composed of active and inactive members. Finished
sequence will facilitate a better understanding of how gene families evolve
and how new functionalities are generated, as well as how some gene family
members are shut down.
B. The intergenic repetitive fraction of the genome is not well understood
and is frequently described as "junk". However, we know that functional
genes are found in repetitive sequences and that transposable elements
embedded in the repetitive sequences can restructure genomes and control
gene action. These elements are likely to be directly involved in generating
some of the economically valuable allelic variation that has been selected
in plant breeding and crop production. Learning how to direct this mutagenic
force from within a genome is an emerging area of interest for plant
breeders. In fact, cell culture manipulation of transposable element
activity constitutes the core of the rice functional genomics research
conducted in the laboratory of Dr. Hirohiko Hirochika at the NIAS in
Tsukuba.
C. High-quality finished sequence provides the only real opportunity to
study gene regulation, as most of the critical regulatory sequences fall
outside of the transcribed regions. This aspect of gene function is of
critical importance to understanding how genomes function.