Highlights:
Progress is being monitored on a clone-by-clone basis and the IRGSP is on track to complete the phase 2 sequence by the end of 2002. No chromosomal reassignments will be made at this time.
The IRGSP is prepared to exchange trace files with BGI.
December 1, 2002 is the cutoff date for the submission of all phase 2 or phase 3 sequences to complete the first pass of the genome.
A joint IRGSP paper will be prepared based on the sequence submitted as of December 1. It will be the best paper that can possibly be written. The IRGSP will attempt to publish the paper as part of a package that includes individual chromosome papers.
Sequencing Progress
| Chromosome | Sequencing Group | Chrom. Length (Mb) | Tile Length (Mb) | No. Clones in Tile | Finished | Phase 2 | Phase 1 | Clones Not Sequenced | % Coverage | Gaps |
| 1 | RGP | 45 | 42.9 | 390 | 352 | 34 | 7 | 0 | 95.3 | 8 |
| KRGRP | 12 | 1 | 9 | 2 | 0 | |||||
| 2 | RGP | 41 | 36.0 | 369 | 3 | 253 | 44 | 69 | 88.0 | 16 |
| UK | 1 | |||||||||
| 3 | ACWW | 41 | 14.7 | 116 | 31 | 28 | 2 | 55 | 85.0 | 9 |
| TIGR | 27.2 | 173 | 41 | 61 | 19 | 52 |
| 4 | China | 36 | 35.6 | 289 | 260 | 43 | 0 | 97.5 | 7 | |
| 5 | Taiwan | 30 | 26.4 | 292 | 10 | 127 | 3 | 152 | 14 | |
| 6 | RGP | 33 | 30.7 | 279 | 24 | 185 | 25 | 45 | 93.0 | 13 |
| 7 | RGP | 32 | 28.4 | 276 | 18 | 172 | 66 | 20 | 88.8 | 8 |
| 8 | RGP | 31 | 29.1 | 282 | 1 | 187 | 47 | 47 | 93.9 | 11 |
| 9 | RGP | 21 | 15.4 | 150 | 0 | 8 | 2 | 140 | 73.0 | 44 |
| KRGRP | 19 | 8 | 5 | 6 | ||||||
| Thailand | 2 | 6 | ||||||||
| 10 | ACWW | 23 | 15.5 | 102 | 82 | 5 | 1 | 14 | 90.0 | 7 |
| PGIR | 2.9 | 23 | 19 | 4 | 0 | |||||
| TIGR | 9.3 | 83 | 99 | 22 | 10 | 0 | 1 | |||
| 11 | Genoscope | 30 | 0.7 | 5 | 5 | 78.0 | ||||
| PGIR | 2.7 | 20 | 3 | 17 | 0 | |||||
| IIRGS | 14.2 | 102 | 30 | 14 | 58 | 5 | ||||
| TIGR | 11.3 | 65 | 9 | 12 | 44 | 8 | ||||
| GCOW | 3 | |||||||||
| ACWW | 2 | |||||||||
| 12 | Genoscope | 31 | 22 | 197 | 19 | 90 | 88 | 71.0 | 13 | |
| Totals | 394 | 365.0 | 3244 | 971 | 1281 | 259 | 733 | 86.7 | 175 |
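The clone counts in the table can be cross-checked with simple arithmetic: for a complete row, Finished + Phase 2 + Phase 1 + Clones Not Sequenced should equal the number of clones in the tile. The sketch below is only an illustration. It transcribes the rows where all of those cells are present, and the function names are hypothetical. A mismatch is a prompt to re-check the row, not evidence of an error in the sequencing data.

```python
# Rows transcribed from the table above, limited to rows where the clone count
# and all four status columns are present.
# (chromosome, group, clones_in_tile, finished, phase2, phase1, not_sequenced)
ROWS = [
    ("1", "RGP", 390, 352, 34, 7, 0),
    ("2", "RGP", 369, 3, 253, 44, 69),
    ("3", "ACWW", 116, 31, 28, 2, 55),
    ("3", "TIGR", 173, 41, 61, 19, 52),
    ("5", "Taiwan", 292, 10, 127, 3, 152),
    ("6", "RGP", 279, 24, 185, 25, 45),
    ("7", "RGP", 276, 18, 172, 66, 20),
    ("8", "RGP", 282, 1, 187, 47, 47),
    ("9", "RGP", 150, 0, 8, 2, 140),
    ("10", "ACWW", 102, 82, 5, 1, 14),
    ("Totals", "", 3244, 971, 1281, 259, 733),
]


def status_mismatches(rows):
    """Return (chromosome, group, clones_in_tile, status_sum) for rows that do not add up."""
    mismatches = []
    for chrom, group, in_tile, *statuses in rows:
        status_sum = sum(statuses)
        if status_sum != in_tile:
            mismatches.append((chrom, group, in_tile, status_sum))
    return mismatches


def fraction_phase2_or_better(row):
    """Share of tiled clones that are finished or at phase 2."""
    _, _, in_tile, finished, phase2, _, _ = row
    return (finished + phase2) / in_tile


mismatches = status_mismatches(ROWS)
totals_fraction = fraction_phase2_or_better(ROWS[-1])
print(mismatches)
print(f"{totals_fraction:.1%} of tiled clones are at phase 2 or better")
```

On the Totals row the four status columns sum to 3244, matching the clone count, and 2252 of those 3244 clones are finished or at phase 2.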
Notes on Sequencing Progress:
Bin Han reported that his group will now concentrate on sequencing chromosome 4 from the indica strain Guangluai4. Rod Wing has made two BAC libraries for this purpose that together have 20X coverage of the genome.
Takashi Matsumoto reported that the 80 kb centromere on chromosome 8 is completely covered by PACs.
Rod Wing reported that for 12 BACs on chromosome 3, he sent sequence to TMRI for flanking sequence to fill gaps and then performed in silico assembly. In all cases he found flanking BACs on one or both sides and was able to reduce the number of contigs in a region.
Some reapportionment of clones will occur among members working on chromosome 11.
Syngenta Data
TIGR has so far received only trace files, which Robin has not yet seen. No assignment information has been received.
The RGP has not yet received any data.
Dick McCombie proposed that software his group has developed for searching mouse shotgun data could be installed on the servers at TIGR and the RGP. This would permit queries of the entire data set with partial sequence data, so that relevant TMRI data could be retrieved with a minimum of error due to incorrect assignment, and would presumably speed delivery.
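To make the idea of querying a whole shotgun data set with partial sequence concrete, here is a minimal sketch of a k-mer index. It is not the Cold Spring Harbor software, whose design is not described in these minutes. The function names, the value of k, and the toy reads are all assumptions for illustration, and the sketch ignores reverse-complement matches and sequencing errors.

```python
from collections import defaultdict


def build_kmer_index(reads, k=12):
    """Map every k-mer to the set of read IDs that contain it."""
    index = defaultdict(set)
    for read_id, seq in reads.items():
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(read_id)
    return index


def query(index, partial, k=12, min_shared=2):
    """Return (read_id, shared_kmer_count) pairs, best match first."""
    partial = partial.upper()
    hits = defaultdict(int)
    for i in range(len(partial) - k + 1):
        # .get avoids inserting empty entries into the defaultdict
        for read_id in index.get(partial[i:i + k], ()):
            hits[read_id] += 1
    ranked = [(rid, n) for rid, n in hits.items() if n >= min_shared]
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


# Toy data with a very small k so the example is easy to follow by hand.
reads = {
    "r1": "ACGTACGGTCA",
    "r2": "TTTTGGGGCCCC",
    "r3": "GGTCATTTTGG",
}
index = build_kmer_index(reads, k=4)
result = query(index, "CGGTCATT", k=4)
print(result)
```

Because the query runs against every read regardless of any prior chromosome assignment, a lookup of this kind would not inherit assignment errors, which is the benefit the proposal describes.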
Getting ready to complete the sequence:
It was agreed that all data should be submitted by December 1.
The goals are:
Ordered BAC/PAC clones anchored to the genetic map.
All available BACs/PACs sequenced to at least phase 2 quality.
Gaps defined - see Rod Wing's plan.
Walking should be complete by end of October.
Pseudo-molecules synthesized.
Segmental duplications defined.
Enhance phase 2 sequence by editing.
Integration of Monsanto and Syngenta data.
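As an illustration of the pseudo-molecule goal in the list above, the sketch below joins clone sequences along an ordered tiling path, trimming each known overlap and inserting a run of Ns where a gap remains. It is a simplified sketch, not the script used by any IRGSP group. The function name, the input layout, and the 100-N gap placeholder are assumptions.

```python
GAP = "N" * 100  # placeholder for a gap of unknown size (an arbitrary choice)


def build_pseudomolecule(tiling_path):
    """Join clones given as (name, sequence, overlap_with_previous).

    overlap_with_previous is the number of bases shared with the sequence
    assembled so far, or None when a gap separates the clone from the
    previous one. The value given for the first clone is ignored.
    """
    assembled = ""
    first = True
    for name, seq, overlap in tiling_path:
        if first:
            assembled = seq
            first = False
        elif overlap is None:
            assembled += GAP + seq
        else:
            # Refuse to join if the claimed overlap is not actually identical.
            if overlap <= 0 or assembled[-overlap:] != seq[:overlap]:
                raise ValueError(f"overlap of {overlap} bases not confirmed for {name}")
            assembled += seq[overlap:]
    return assembled


path = [
    ("cloneA", "AAACCCGGG", None),
    ("cloneB", "GGGTTT", 3),    # shares its first 3 bases with cloneA
    ("cloneC", "ACGT", None),   # separated from cloneB by a gap
]
pseudo = build_pseudomolecule(path)
print(len(pseudo))
```

Checking the overlap before trimming means that a mis-ordered or mis-assigned clone raises an error instead of being silently merged.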
Rod Wing will develop
a plan with Jiming Jiang to define all of the gaps in the genome. All the
members that participate in this exercise will probably be asked for a
financial contribution to cover the costs.
Cooperation:
The 26 April meeting at the Rockefeller Foundation with representatives of the IRGSP, BGI, TMRI, and Monsanto was reviewed. Exchange of the trace files with BGI was outlined and approved by the group. We also discussed possible BGI contribution to whole genome annotation.
As Jun Yu was unable to make the meeting, the goals of the BGI, as we understood them, were reviewed. It is not clear who is in charge of making decisions at the BGI, because Dick has received conflicting responses to his request for data exchange. Bin Han believes the person to contact is actually Jun Wang. Dick will try to contact him to find out if they agree to data exchange.
Physical Map:
Jianzhong Wu talked about the challenges facing the RGP:
The difficulty of finishing centromeres.
FISH analysis is needed for all chromosomes.
The rDNA cluster on chromosome 9, which encompasses about 5 Mb.
How the TMRI and BGI data can be used.
Jianzhong proposed genetic mapping of the unmapped TMRI contigs. He also described the features found in 300 kb of centromere sequence (4 ESTs and 1 genetic marker) and the measurement of gap sizes on chromosome 1.
Rod Wing reported that the CUGI physical map is continually updated with fingerprints from simulated digests of newly sequenced and submitted BACs. Cary Sonderland has written a script to build pseudo-molecules from overlapping BACs/PACs.
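A simulated digest of a sequenced clone can be sketched as follows: find every occurrence of a restriction site in the sequence, cut there, and record the fragment lengths as the fingerprint. This is an illustration only, not the CUGI pipeline. The choice of HindIII (site AAGCTT, cutting after the first A) is an assumption, as are the function names, and the clone is treated as a linear molecule with no vector sequence.

```python
def simulated_digest(sequence, site="AAGCTT", cut_offset=1):
    """Return the sorted fragment lengths from cutting a linear sequence.

    site and cut_offset default to HindIII (A^AGCTT); the enzyme is an
    assumption made for this example.
    """
    seq = sequence.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return sorted(b - a for a, b in zip(bounds, bounds[1:]) if b > a)


def shared_bands(fp_a, fp_b, tolerance=0):
    """Count fragment sizes common to two fingerprints, within a size tolerance."""
    a, b = sorted(fp_a), sorted(fp_b)
    i = j = shared = 0
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= tolerance:
            shared += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return shared


clone = "GGG" + "AAGCTT" + "CCCC" + "AAGCTT" + "TT"
fingerprint = simulated_digest(clone)
print(fingerprint)
print(shared_bands(fingerprint, [7, 10, 12]))
```

Comparing a simulated fingerprint with the experimentally measured fingerprints already in the map is what allows a newly sequenced clone to be placed; real band sizes carry measurement error, which is why the comparison takes a tolerance.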
When the group sets up in Arizona, Rod will appoint a curator who will be able to resolve questions and discrepancies pointed out by users.
Annotation:
Heiko Schoof from MIPS talked about their activities with a view to cooperating with the IRGSP on annotating the rice genome. He gave examples from their work with Arabidopsis, which can be found at MAtDB (http://mips.gsf.de/proj/thal/db/):
Whole genome analysis
Comparative analysis of genomes
Tandem and segmental duplications
EST clusters
MAR and regulatory elements
Gene prediction
Functional classification of genes
MIPS would like to coordinate annotation activities with the IRGSP without duplicating efforts. For this they will need access to the data. This doesn't mean they need traces, but they do need contact information in order to know who to ask about apparent discrepancies. They have been working on, but have not posted, MosDB.
In response to questions, Heiko said that they could annotate 24 genes per day per person and that they might be able to put two people on the rice project. It was agreed that automated annotation was the state-of-the-art method. Robin Buell said that names are the major problem with automated annotation.
Doreen Ware described the Gramene (http://www.gramene.org/) database, which has absorbed the Rice Genes database. The database, which prominently features IRGSP progress, has the following tools:
Genome Browser based on Ensembl
Map Viewer
Blast search
Protein search
Ontology browser
Mutant search
Robin Buell described
automated annotation at TIGR fueled by a pipeline of newly submitted rice
clones. About 1800 clones have been annotated (http://www.tigr.org/tigr-scripts/e2k1/irgsp.spl).
Takashi Matsumoto described annotation at the RGP. After automated annotation by RiceGAAS, all complete BACs from the RGP are manually annotated. So far more than 2400 clones submitted by the IRGSP have been automatically annotated and the data submitted to INE (http://rgp.dna.affrc.go.jp/giot/INE.html). The annotated BACs are updated every 20 days and FgeneSH has been added to the suite of gene prediction programs used by RiceGAAS. An annotation database is under construction.
Takashi talked about improving
prediction of protein function using BLASTP and domain searching. He showed
how merged contigs of phase 2 sequence were submitted to RiceGAAS for predicting
proteins and function.
Finishing:
Dick McCombie reviewed the work at Cold Spring Harbor that takes phase 2 sequence, drafted in other labs, to finished quality. He described savings in time and cost that have increased per-person capacity 7-fold, and he estimated a cost of $.05 a base. The work, presented in detail at our February meeting, balances transposon-mediated sequencing with BAC primer walking. The heart of the technology is automating as many steps as possible.
Takashi Matsumoto described finishing at the RGP after 2002. They estimate that it will take two years to finish 1700 clones at a rate of 70 per month. The current rate is 20 per month. Takashi outlined the steps taken to increase the rate of finishing:
1) For gap filling, obtaining the full sequence of bridge clones.
    Transposon-mediated sequencing
    Sequencing PCR products
    Direct sequencing of sonicated PAC DNA
    Resequencing with different chemistries
    Possible use of TMRI data
2) Autofinish software to select clones.
3) Shift human resources to finishing step.
Takashi also arrived at a finishing cost of $.05 a base.
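The finishing figures quoted above can be checked with a short calculation. The sketch below uses only the numbers reported in this section (1700 clones, 70 and 20 clones per month, $.05 a base). The 150,000-base clone size in the cost example is a hypothetical round number chosen for illustration, not a figure from the meeting.

```python
import math


def months_to_finish(clones, clones_per_month):
    """Whole months needed to finish a backlog at a constant monthly rate."""
    if clones_per_month <= 0:
        raise ValueError("clones_per_month must be positive")
    return math.ceil(clones / clones_per_month)


def finishing_cost_dollars(bases, cents_per_base=5):
    """Cost of finishing a sequence at the quoted rate of 5 cents a base."""
    return bases * cents_per_base / 100


target_months = months_to_finish(1700, 70)    # planned rate
current_months = months_to_finish(1700, 20)   # current rate

# Hypothetical clone size, used only to show the scale of the per-base cost.
example_clone_cost = finishing_cost_dollars(150_000)

print(target_months, current_months, example_clone_cost)
```

At 70 clones per month the backlog takes 25 months, consistent with the two-year estimate, while at the current 20 per month it would take 85 months, which is why the steps listed above focus on raising the rate.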
Action Items:
Rod Wing and Dick McCombie will develop language for a request to use software developed at Cold Spring Harbor that would permit searches of the entire TMRI data sets at the RGP and TIGR. The request will be made by Robin and Takuji.
Rod Wing will develop a plan with Jiming Jiang to define all of the gaps in the genome.
Dick McCombie will contact Jun Wang to find out if BGI will participate in data exchange.
Genoscope has been unable to obtain raw fingerprint data from Syngenta. Rod Wing will check to see what the problem is. Rod and Cary will also pursue the possibility of resolving CUGI and Myriad physical maps.
A committee made up of Takuji Sasaki, Francis Quetier, and Dick McCombie will shop a package of individual chromosome papers and a joint IRGSP paper with the editors of five journals.
Takuji Sasaki will pick a firm date in December for the announcement of the completed phase 2 sequence.