Interim Meeting May 23-24, 2002 at Genoscope

Highlights:

Progress is being monitored on a clone-by-clone basis and the IRGSP is on track to complete the phase 2 sequence by the end of 2002. No chromosomal reassignments will be made at this time.

The IRGSP is prepared to exchange trace files with BGI.

December 1, 2002 is the cutoff date for the submission of all phase 2 or phase 3 sequences to complete the first pass of the genome.

A joint IRGSP paper will be prepared based on the sequence submitted as of December 1. It will be the best paper that can possibly be written. The IRGSP will attempt to publish the paper as part of a package that includes individual chromosome papers.
 

Sequencing Progress
 
Chromosome | Sequencing Group | Chrom. Length (Mb) | Tile Length (Mb) | No. Clones in Tile | Finished | Phase 2 | Phase 1 | Clones Not Sequenced | % Coverage | Gaps
1 RGP 45 42.9 390 352 34 7 0 95.3 8
KRGRP 12 1 9 2 0
2 RGP 41 36.0 369 3 253 44 69 88.0 16
UK 1
3 ACWW 41 14.7 116 31 28 2 55 85.0 9
TIGR 27.2 173 41 61 19 52
11
4 China 36 35.6 289 260 43 0 97.5 7
5 Taiwan 30 26.4 292 10 127 3 152 14
6 RGP 33 30.7 279 24 185 25 45 93.0 13
7 RGP 32 28.4 276 18 172 66 20 88.8 8
8 RGP 31 29.1 282 1 187 47 47 93.9 11
9 RGP 21 15.4 150 0 8 2 140 73.0 44
KRGRP 19 8 5 6
Thailand 2 6
10 ACWW 23 15.5 102 82 5 1 14 90.0 7
PGIR 2.9 23 19 4 0
TIGR 9.3 83 99 22 10 0 1
11 Genoscope 30 0.7 5 5 78.0
PGIR 2.7 20 3 17 0
IIRGS 14.2 102 30 14 58 5
TIGR 11.3 65 9 12 44 8
GCOW 3
ACWW 2
12 Genoscope 31 22 197 19 90 88 71.0 13
Totals 394 365.0 3244 971 1281 259 733 86.7 175
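
As a quick consistency check on the table above, the four status columns (Finished, Phase 2, Phase 1, Clones Not Sequenced) should add up to the number of clones in the tile. The sketch below is not part of the minutes. It uses only rows in which all four status counts are present, and the names ROWS and find_mismatches are illustrative.

```python
# Rows copied from the table above:
# (label, clones in tile, finished, phase 2, phase 1, clones not sequenced).
# Rows with missing status counts are left out because their columns are ambiguous.
ROWS = [
    ("1 RGP", 390, 352, 34, 7, 0),
    ("2 RGP", 369, 3, 253, 44, 69),
    ("3 ACWW", 116, 31, 28, 2, 55),
    ("3 TIGR", 173, 41, 61, 19, 52),
    ("5 Taiwan", 292, 10, 127, 3, 152),
    ("6 RGP", 279, 24, 185, 25, 45),
    ("7 RGP", 276, 18, 172, 66, 20),
    ("8 RGP", 282, 1, 187, 47, 47),
    ("9 RGP", 150, 0, 8, 2, 140),
    ("10 ACWW", 102, 82, 5, 1, 14),
    ("Totals", 3244, 971, 1281, 259, 733),
]


def find_mismatches(rows):
    """Return (label, stated_total, summed_total) for rows whose status counts do not add up."""
    mismatches = []
    for label, stated, *status_counts in rows:
        summed = sum(status_counts)
        if summed != stated:
            mismatches.append((label, stated, summed))
    return mismatches


print(find_mismatches(ROWS))
```

With the figures as given, every listed row balances, including the Totals row, except chromosome 1 (RGP), where the status counts sum to 393 against 390 clones in the tile.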

Notes on Sequencing Progress:

Bin Han reported that his group will now concentrate on sequencing chromosome 4 from the indica strain Guangluai4. Rod Wing has made two BAC libraries for this purpose that together have 20X coverage of the genome.

Takashi Matsumoto reported that the 80 kb centromere on chromosome 8 is completely covered by PACs.

Rod Wing reported that for 12 BACs on chromosome 3, he sent sequence to TMRI for flanking sequence to fill gaps and then performed in silico assembly. In all cases he found flanking BACs on one or both sides and was able to reduce the number of contigs in a region.

Some reapportionment of clones will occur among members working on chromosome 11.
 

Syngenta Data

TIGR has so far only received trace files. Robin has not seen those yet. No assignment information has been received as yet.

The RGP has not yet received any data.

Dick McCombie proposed that software they have developed for searching mouse shotgun data could be installed on the servers at TIGR and the RGP. This would permit queries of the entire data set with partial sequence data so that relevant TMRI data could be retrieved with a minimum of error due to incorrect assignment and would presumably speed delivery.
 

Getting ready to complete the sequence:

It was agreed that all data should be submitted by December 1.

The goals are:
    Ordered BAC/PAC clones anchored to the genetic map.
    All available BACs/PACs sequenced to at least phase 2 quality.
    Gaps defined - see Rod Wing's plan.
    Walking should be complete by end of October.
    Pseudo-molecules synthesized.
    Segmental duplications defined.
    Enhance phase 2 sequence by editing.
    Integration of Monsanto and Syngenta data.

Rod Wing will develop a plan with Jiming Jiang to define all of the gaps in the genome. All the members that participate in this exercise will probably be asked for a financial contribution to cover the costs.
 

Cooperation:

The 26 April meeting at the Rockefeller Foundation with representatives of the IRGSP, BGI, TMRI, and Monsanto was reviewed. Exchange of the trace files with BGI was outlined and approved by the group. We also discussed a possible BGI contribution to whole genome annotation.

As Jun Yu was unable to make the meeting, the goals of the BGI, as we understood them, were reviewed. It is not clear who is in charge of making decisions at the BGI, as Dick has had conflicting responses to his request for data exchange. Bin Han believes the person to contact is actually Jun Wang. Dick will try to contact him to find out if they agree to data exchange.
 

Physical Map:

Jianzhong Wu talked about the challenges facing the RGP:
    The difficulty of finishing centromeres.
    FISH analysis is needed for all chromosomes.
    The rDNA cluster on chromosome 9 which encompasses about 5 Mb.
    How the TMRI and BGI data can be used.

Jianzhong proposed genetic mapping of the unmapped TMRI contigs. He described the sequences found in 300 kb of centromere sequence (4 ESTs and 1 genetic marker) and the measurement of gap sizes on chromosome 1.

Rod Wing reported that the CUGI physical map is continually updated with fingerprints from simulated digests of newly sequenced and submitted BACs. Cary Sonderland has written a script to build pseudo-molecules from overlapping BACs/PACs.
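
A simulated digest of the kind mentioned above can be sketched as follows. This is only an illustration, not the CUGI pipeline: the choice of HindIII (recognition site AAGCTT, cut after the first A), the function name simulated_digest, and the toy sequence are all assumptions.

```python
def simulated_digest(sequence, site="AAGCTT", cut_offset=1):
    """Return fragment lengths from cutting a linear sequence at every occurrence of `site`.

    `cut_offset` is the position of the cut within the recognition site
    (1 for HindIII, which cuts A^AGCTT). The site is palindromic, so
    searching one strand is sufficient.
    """
    sequence = sequence.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    # Fragment boundaries are the sequence ends plus every cut position.
    boundaries = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(boundaries, boundaries[1:])]


# Toy sequence (assumption) containing two HindIII sites.
toy = "GGGAAGCTTCCCCCAAGCTTTT"
# A fingerprint is compared by fragment size, so sort the lengths like bands on a gel.
fragments = sorted(simulated_digest(toy), reverse=True)
print(fragments)
```

The sorted fragment lengths play the role of the band pattern that would be compared against experimentally fingerprinted clones.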

When the group sets up in Arizona, Rod will appoint a curator who will be able to resolve questions and discrepancies pointed out by users.
 

Annotation:

Heiko Schoof from MIPS talked about their activities with a view to cooperating with the IRGSP on annotating the rice genome. He gave examples from their work with Arabidopsis, which can be found at MAtDB (http://mips.gsf.de/proj/thal/db/):
    Whole genome analysis
    Comparative analysis of genomes
    Tandem and segmental duplications
    EST clusters
    MAR and regulatory elements
    Gene prediction
    Functional classification of genes.

MIPS would like to coordinate annotation activities with the IRGSP without duplicating efforts. For this they will need access to the data. This doesn't mean they need traces, but they do need contact information in order to know who to ask about apparent discrepancies. They have been working on, but have not posted, MosDB.

In response to questions, Heiko said that they could annotate 24 genes per day per person and that they might be able to put two people on the rice project. It was agreed that automated annotation was the state-of-the-art method. Robin Buell said that names are the major problem with automated annotation.
 

Doreen Ware described the Gramene (http://www.gramene.org/) database which has absorbed the Rice Genes database. The database, which prominently features IRGSP progress, has the following tools:
    Genome Browser based on Ensembl
    Map Viewer
    Blast search
    Protein search
    Ontology browser
    Mutant search

Robin Buell described automated annotation at TIGR fueled by a pipeline of newly submitted rice clones. About 1800 clones have been annotated (http://www.tigr.org/tigr-scripts/e2k1/irgsp.spl).
 

Takashi Matsumoto described annotation at the RGP. After automated annotation by RiceGAAS, all complete BACs from the RGP are manually annotated. So far more than 2400 clones submitted by the IRGSP have been automatically annotated and the data submitted to INE (http://rgp.dna.affrc.go.jp/giot/INE.html). The annotated BACs are updated every 20 days, and FgeneSH has been added to the suite of gene prediction programs used by RiceGAAS. An annotation database is under construction.

Takashi talked about improving prediction of protein function using BLASTP and domain searching. He showed how merged contigs of phase 2 sequence were submitted to RiceGAAS for predicting proteins and function.
 

Finishing:

Dick McCombie reviewed the work at Cold Spring Harbor that takes phase 2 sequence, drafted in other labs, to finished quality. He described savings in time and cost that increased per-person capacity 7-fold, and he estimated a cost of $0.05 a base. The work, presented in detail at our February meeting, balances transposon-mediated sequencing against BAC primer walking. The heart of the technology is automating as many steps as possible.

Takashi Matsumoto described finishing at the RGP after 2002. They estimate that it will take two years to finish 1700 clones at a rate of 70 per month. The current rate is 20 per month. Takashi outlined the steps taken to increase the rate of finishing:

1) For gap filling, obtaining the full sequence of bridge clones.
    Transposon-mediated sequencing
    Sequencing PCR products
    Direct sequencing of sonicated PAC DNA
    Resequencing with different chemistries
    Possible use of TMRI data

2) Autofinish software to select clones.

3) Shift human resources to finishing step.
    Takashi also arrived at a finishing cost of $0.05 a base.
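
The two-year estimate follows directly from the figures quoted above (1700 clones, a target of 70 clones per month, a current rate of 20 per month). The short calculation below is an illustration added for clarity; the function name months_to_finish is hypothetical.

```python
def months_to_finish(clones, clones_per_month):
    """Months needed to finish `clones` at a constant monthly rate."""
    return clones / clones_per_month


target = months_to_finish(1700, 70)   # about 24.3 months, roughly two years
current = months_to_finish(1700, 20)  # 85 months, roughly seven years
speedup = 70 / 20                     # 3.5-fold increase in the finishing rate

print(round(target, 1), current, speedup)
```

At the current rate the same backlog would take about seven years, so the steps listed above have to deliver a 3.5-fold increase in throughput to meet the two-year target.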
 

Action Items:

Rod Wing and Dick McCombie will develop language for a request to use software developed at Cold Spring Harbor that would permit searches of the entire TMRI data sets at the RGP and TIGR. The request will be made by Robin and Takuji.

Rod Wing will develop a plan with Jiming Jiang to define all of the gaps in the genome.

Dick McCombie will contact Jun Wang to find out if BGI will participate in data exchange.

Genoscope has been unable to obtain raw fingerprint data from Syngenta. Rod Wing will check to see what the problem is. Rod and Cary will also pursue the possibility of resolving CUGI and Myriad physical maps.

A committee made up of Takuji Sasaki, Francis Quetier, and Dick McCombie will shop a package of individual chromosome papers and a joint IRGSP paper with the editors of five journals.

Takuji Sasaki will pick a firm date in December for the announcement of the completed phase 2 sequence.


Posted June 12, 2002 by B. Burr and T. Sasaki



