At the 2017 Maize Genetics Conference, Dr. Doreen Ware will give an overview of the Gramene and MaizeCode projects during the pre-conference workshops on Thursday, March 9th, 2017, from 5:40 to 5:50 PM EST in Midway Suite 4 of the St. Louis Union Station Hotel.
Here is the full program for the Maize Tools & Resources Pre-MaizeMeeting Workshops:
"Maize Tools & Resources" - Chair: Lisa Harper (MaizeGDB) @ 4:30 PM - 6:00 PM (Midway Suite 4)
Time         Topic                  Presenter
4:29 - 4:30  Intro                  Lisa Harper
4:30 - 4:40  Maize Transformation   William Gordon-Kamm
4:40 - 4:50  Functional Annotation  Gokul Wimalanathan
4:50 - 5:00  GenoMaize              Daniel Vera
5:00 - 5:10  AGB Software           Randy Wisser
5:10 - 5:20  ZeaBigData             Jinliang Yang
5:20 - 5:30  qTeller                James Schnable
5:30 - 5:40  NCBI                   Brian Smith-White
5:40 - 5:50  Gramene & MaizeCode    Doreen Ware
5:50 - 6:00  Grassius               Wilber Ouma
Other Gramene-related presentations include a poster entitled "Mining Maize with Gramene" by Dr. Joshua Stein, and presentations by Ware Lab members Dr. Michael Campbell, Dr. Yinping Jiao, and Dr. Bo Wang. Their abstracts are provided below.
Mining Maize with Gramene
Joshua Stein (1), Sharon Wei (1), Yinping Jiao (1), Bo Wang (1), Michael Campbell (1), Marcela Karey Tello-Ruiz (1), Andrew Olson (1), Jim Thomason (1), Maria Keays (3), Paul J. Kersey (3), Pankaj Jaiswal (2), Doreen Ware (1,4)
(1) Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA; (2) Oregon State University, Corvallis, OR, USA; (3) EMBL-EBI, Hinxton, UK; (4) USDA ARS NAE Robert W. Holley Center for Agriculture and Health, Ithaca, NY, USA
The Gramene database (http://www.gramene.org) is an integrated resource for comparative genome and functional analysis in plants. The database provides agricultural researchers and plant breeders with valuable biological information on the genomes and pathways of numerous crops and model species, including maize, thus enabling powerful comparisons across species. In addition to maize B73 (RefGen_V4), the database includes reference genomes of sorghum, rice, wheat, Brachypodium, Setaria, and dozens of other plant species. New data associated with the RefGen_V4 release include: i) subgenome designation and ohnologs, ii) full-length transposable element annotations, and iii) a gene ID history table (v3 ⇔ v4). Annotation tracks include methylome signatures, genome-wide long non-coding RNAs, and nascent transcriptomes. We also added two synteny comparisons: Z. mays vs. Setaria italica and Z. mays vs. Brachypodium distachyon. Gramene is also a resource for variation data. The current release includes the maize HapMap2 (~55 million SNPs in 104 lines) and Panzea’s 2.7 GBS (~720K SNPs in 16,718 lines) variation data sets. Gramene has also developed an integrated search database and a modern user interface that leverage these diverse annotations, allowing scientists to find genes by selecting auto-suggested filters. The interface offers interactive views of the search results, both in aggregate and in the context of an individual gene in the result set. Gramene’s pathway portal, Plant Reactome (http://plantreactome.gramene.org/), hosts ~240 metabolic, signaling, regulatory, and genetic reference pathways, -omics data and pathway comparison analysis tools, and orthology-based projections to over 78,000 gene products in 66 species across the plant kingdom. The pathway and genome browsers, as well as the search result panel, display EBI-ATLAS baseline gene expression. Gramene is supported by NSF grant IOS-1127112 and partially by USDA-ARS (1907-21000-030-00D).
Resources for maize genome annotation: lessons learned from B73
Michael S. Campbell, Yinping Jiao, Bo Wang, Sharon Wei, Kapeel Chougule, Joshua Stein, Doreen Ware
Genome projects began as large international efforts focused on generating reference genomes for organisms important to humans, including agriculturally important plants. As sequencing costs decreased, efforts turned to population-level re-sequencing to explore genomic diversity. Though informative and adequate for most genetic applications, short-read re-sequencing is limited in its ability to identify large structural variation, a hallmark of maize diversity. To explore this important area of maize diversity and evolution, multiple maize lines are being sequenced and assembled de novo. Annotating these genomes is the next step. To annotate protein-coding genes in the new B73 genome assembly, we used the MAKER-P annotation pipeline with transcript data from Sanger, Illumina, and PacBio platforms, protein homology evidence from multiple monocot genomes and Arabidopsis, and a maize-specific repeat library. Comparative phylogenomic analysis reveals that maize contains a deficit of genes not explainable by missing annotation, including many defense-related genes. MAKER-P is available through CyVerse (Discovery Environment and Jetstream) and through XSEDE at TACC (Stampede cluster). We have generated a Jetstream image with MAKER-P installed and all of the above evidence preloaded. The B73 annotation is complete and has been accepted by GenBank. Though improved over the version 3 annotations, some inaccuracies remain, such as merged genes, including a classical maize gene and a hard-to-annotate single-exon zinc-finger transcription factor. Accurate annotation of alternative transcripts also remains challenging. The current version of the annotations errs on the side of sensitivity: alternate transcription start sites were allowed, and transcripts not expected to generate functional protein products because of alternative splicing events such as intron retention are included, as are those containing non-canonical splicing events. Finally, tissue-specific transcripts are also included. These annotations will be available through GenBank’s GRC tools, allowing community curation and correction.