Release Notes 60 (February 2019)

The Gramene Team is pleased to announce Release #60. The Genomes section provides access to information on 2,162,056 genes from 58 reference plant genomes. It includes a new reference genome for Arabidopsis halleri and updates to several existing genomes. A total of 1,891,391 protein-coding genes are organized into 93,194 gene families. The Plant Reactome, Gramene's pathway portal, hosts pathway projections for 79 species ranging from unicellular autotrophs to higher plants. In this release, we have extended orthology-based pathway projections to 1 new species, Arabidopsis halleri; revised pathways for Solanum lycopersicum (tomato) and Glycine max (soybean) based on their recent genome updates; and added 10 new curated pathways for the reference species Oryza sativa, bringing the total number of curated reference pathways to 293.

Gramene is a curated, open-source, integrated data resource for comparative functional genomics in crops and model plant species. The genome databases were built in direct collaboration with Ensembl Plants and the Plant Reactome database was produced in collaboration with the Reactome project.

Genomes Release Notes 

  • New genomes:
  1. Arabidopsis halleri: assembly GCA_900078215, annotation doi:10.5061/dryad.gn4hh

  • Updated genomes:
  1. Aegilops tauschii: updated assembly and annotation to version Aet v4.0 (GCA_002575655)
  2. Solanum lycopersicum: updated assembly (GCA_000188115.3) and annotation (ITAG3.0, except unplaced genes)
  3. Glycine max: updated assembly (GCA_000004515.4) and annotation (Rel. 137, Version 4, Glycine_max_v2.1)
  4. Vigna radiata: updated annotation (vigra.VC1973A.gnm6.ann1.M1Qs)
  5. Physcomitrella patens: updated annotation (V3.3)
  • New & Updated data: 
  1. Triticum aestivum: Inter-Homeologous Variants (IHVs) were added
  2. Lupinus angustifolius: genes from previous release were added
  3. Arabidopsis thaliana: TAIR10 is now the unique source of ncRNA annotations
  4. Nicotiana attenuata: mapped microarray probes were updated
  5. Zea mays: mapped microarray probes were updated
  6. BioMarts for all gene and variation data
  7. Updated split gene predictions from peptide comparative genomics.

List of Genomes

Name Assembly Accession/ Annotation
Aegilops tauschii Aet_v4.0 GCA_002575655.1
Amborella trichopoda AMTR1.0 GCA_000471905.1
Arabidopsis halleri Ahal2.2 GCA_900078215.1
Arabidopsis lyrata v.1.0 GCA_000004255.1
Arabidopsis thaliana TAIR10 GCA_000001735.1
Beta vulgaris RefBeet-1.2.2 GCA_000511025.2
Brachypodium distachyon Brachypodium_distachyon_v3.0 GCA_000005505.4
Brassica napus AST_PRJEB5043_v1 GCA_000751015.1
Brassica oleracea BOL GCA_000695525.1
Brassica rapa Brapa_1.0 GCA_000309985.1
Chlamydomonas reinhardtii Chlamydomonas_reinhardtii_v5.5 GCA_000002595.3
Chondrus crispus ASM35022v2 GCA_000350225.2
Corchorus capsularis CCACVL1_1.0 GCA_001974805.1
Cucumis sativus ASM407v2 GCA_000004075.2
Cyanidioschyzon merolae ASM9120v1 GCA_000091205.1
Daucus carota ASM162521v1 GCA_001625215.1
Dioscorea rotundata TDr96_F1_Pseudo_Chromosome_v1.0 GCA_002240015.2
Galdieria sulphuraria ASM34128v1 GCA_000341285.1
Glycine max Glycine_max_v2.1 GCA_000004515.4
Gossypium raimondii Graimondii2_0 GCA_000327365.1
Helianthus annuus HanXRQr1.0 GCA_002127325.1
Hordeum vulgare IBSC_v2 -
Leersia perrieri Lperr_V1.4 GCA_000325765.3
Lupinus angustifolius LupAngTanjil_v1.0 GCA_001865875.1
Manihot esculenta Manihot_esculenta_v6 GCA_001659605.1
Medicago truncatula MedtrA17_4.0 GCA_000219495.2
Musa acuminata ASM31385v1 GCA_000313855.1
Nicotiana attenuata NIATTr2 GCA_001879085.1
Oryza barthii O.barthii_v1 GCA_000182155.2
Oryza brachyantha Oryza_brachyantha.v1.4b GCA_000231095.2
Oryza glaberrima Oryza_glaberrima_V1 GCA_000147395.1
Oryza glumipatula Oryza_glumaepatula_v1.5 GCA_000576495.1
Oryza sativa Indica Group ASM465v1 GCA_000004655.2
Oryza longistaminata O_longistaminata_v1.0 GCA_000789195.1
Oryza meridionalis Oryza_meridionalis_v1.3 GCA_000338895.2
Oryza nivara Oryza_nivara_v1.0 GCA_000576065.1
Oryza punctata Oryza_punctata_v1.2 GCA_000573905.1
Oryza rufipogon OR_W1943 GCA_000817225.1
Oryza sativa Japonica Group IRGSP-1.0 GCA_001433935.1
Ostreococcus lucimarinus ASM9206v1 GCA_000092065.1
Phaseolus vulgaris PhaVulg1_0 GCA_000499845.1
Physcomitrella patens Phypa_V3 GCA_000002425.2
Populus trichocarpa Pop_tri_v3 GCA_000002775.3
Prunus persica Prunus_persica_NCBIv2 GCA_000346465.2
Selaginella moellendorffii v1.0 GCA_000143415.1
Setaria italica Setaria_italica_v2.0 GCA_000263155.2
Solanum lycopersicum SL3.0 GCA_000188115.3
Solanum tuberosum SolTub_3.0 GCA_000226075.1
Sorghum bicolor Sorghum_bicolor_NCBIv3 GCA_000003195.3
Theobroma cacao Theobroma_cacao_20110822 GCA_000403535.1
Trifolium pratense Trpr GCA_900079335.1
Triticum aestivum IWGSC GCA_900519105.1
Triticum dicoccoides WEWSeq_v.1.0 GCA_002162155.1
Triticum urartu ASM34745v1 GCA_000347455.1
Vigna angularis Vigan1.1 GCA_001190045.1
Vigna radiata Vradiata_ver6 GCA_000741045.2
Vitis vinifera 12X GCA_000003745.2
Zea mays B73_RefGen_v4 GCA_000005005.6

Genetic and Structural Variation

No new or updated genetic/structural variation data have been added since release 54.

Compara

Gene Trees

The EnsemblCompara GeneTree database was updated. A total of 93,194 GeneTree families were constructed, comprising 1,891,391 individual genes from 58 plant genomes (and 5 non-plant outgroups).

Putative Split Genes

  • Updated Split Genes: Updated split genes for the current release are available on the FTP site. These are putative gene annotation artifacts (also known as contiguous gene split models) and are based on the latest Ensembl Compara Gene Tree database. Brief statistics for each species are given below.

Species Split Counts
Aegilops_tauschii 186
Arabidopsis_lyrata_subsp._lyrata 95
Arabidopsis_thaliana 37
Beta_vulgaris_subsp._vulgaris 45
Brachypodium_distachyon 53
Brassica_oleracea_var._oleracea 621
Brassica_rapa 96
Chlamydomonas_reinhardtii 41
Cucumis_sativus 416
Daucus_carota_subsp._sativus 220
Glycine_max 240
Gossypium_raimondii 362
Helianthus_annuus 528
Hordeum_vulgare_subsp._vulgare 138
Leersia_perrieri 90
Lupinus_angustifolius 107
Manihot_esculenta 141
Medicago_truncatula 183
Musa_acuminata_subsp._malaccensis 611
Nicotiana_attenuata 55
Oryza_barthii 111
Oryza_brachyantha 187
Oryza_glaberrima 303
Oryza_glumipatula 159
Oryza_meridionalis 95
Oryza_nivara 97
Oryza_punctata 84
Oryza_rufipogon 96
Oryza_sativa_Indica_Group 402
Oryza_sativa_Japonica_Group 371
Ostreococcus_lucimarinus_CCE9901 10
Phaseolus_vulgaris 574
Physcomitrella_patens 42
Populus_trichocarpa 168
Prunus_persica 75
Setaria_italica 386
Solanum_lycopersicum 451
Solanum_tuberosum 297
Sorghum_bicolor 34
Theobroma_cacao 53
Trifolium_pratense 618
Triticum_aestivum 458
Triticum_dicoccoides 670
Vigna_angularis 194
Vigna_radiata_var._radiata 53
Vitis_vinifera 379
Zea_mays 177
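
The per-species counts above lend themselves to quick command-line summaries. As a minimal sketch, assuming the table has been saved as whitespace-separated text with the species in column 1 and the split count in column 2, the following ranks species by split count. Only a handful of rows from the table are embedded here, and the function name rank_by_splits is hypothetical.

```shell
# A few rows copied from the table above (species, split count).
split_counts='Triticum_dicoccoides 670
Arabidopsis_thaliana 37
Brassica_oleracea_var._oleracea 621
Ostreococcus_lucimarinus_CCE9901 10
Sorghum_bicolor 34'

# Hypothetical helper: sort rows read from stdin by the count column,
# largest first (numeric, reverse order on field 2).
rank_by_splits() {
  sort -k2,2nr
}

# Show the two species with the most putative split genes in the sample.
printf '%s\n' "$split_counts" | rank_by_splits | head -n 2
```

With these sample rows, the two lines printed are Triticum_dicoccoides (670) followed by Brassica_oleracea_var._oleracea (621). The same function could be fed the full table from a file instead of the embedded sample.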

Genomic alignments

A summary of all genomic alignments and syntenies is available. New LastZ alignments were conducted between Arabidopsis halleri and each of Oryza sativa Japonica, Arabidopsis thaliana, Theobroma cacao and Vitis vinifera. A statistics page is available for each pairwise LastZ alignment, for example the one for A. halleri vs V. vinifera. A complete list of comparative analyses is also available.

Protein Annotation, GO, Xref, Protein Domain Information

These were generated for the new and updated genomes.

Mart

_______________________________________________________________________________________________________________________________________________________

Infrastructure



  • Web Services: Gramene's web services page documents many ways to directly connect to and analyze our databases.
  • Public MySQL Server

Our partner Ensembl Genomes offers a public, read-only MySQL server with copies of the species-specific and comparative genomic databases that we use. To use this with the mysql command-line client:

$ mysql -hmysql-eg-publicsql.ebi.ac.uk -P4157 -uanonymous

Please note that the versioning scheme used by this public server differs from ours: Gramene release 60_95 corresponds to version 42_95 at Ensembl Genomes.
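
As a rough sketch of how that version label can be used, the snippet below builds a SHOW DATABASES statement that narrows the public server's database list to version 42_95. The helper name eg_show_databases_sql and the naming pattern <species>_<type>_<version>_... are illustrative assumptions, not something stated in these notes, so check the server's actual database list before relying on them.

```shell
# Ensembl Genomes version corresponding to Gramene release 60_95 (from the note above).
EG_VERSION="42_95"

# Hypothetical helper: print a SHOW DATABASES statement for one database type.
# Assumes database names look like <species>_<type>_<version>_...; this
# naming pattern is an assumption and is not stated in these notes.
eg_show_databases_sql() {
  db_type="$1"   # for example: core
  version="$2"   # for example: 42_95
  printf "SHOW DATABASES LIKE '%%_%s_%s_%%';\n" "$db_type" "$version"
}

# Print the statement only; nothing is sent to the server here.
eg_show_databases_sql core "$EG_VERSION"
```

The printed statement could then be passed to the client through its -e option, for example: mysql -hmysql-eg-publicsql.ebi.ac.uk -P4157 -uanonymous -e "SHOW DATABASES LIKE '%_core_42_95_%';". In SQL LIKE patterns the underscore also matches any single character, so the filter is slightly looser than it looks.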

  • Website and coding updates: The latest Plant Reactome pathway data has been re-indexed and made available via Gramene search.


Pathways (Plant Reactome): Plant Reactome Release Summary - Version 17 (Gramene r60)

  • Analytical Tools: Currently, Plant Reactome supports researchers with the following analytical tools: 

    • Search for gene/protein, metabolites, pathways
    • Upload and analyze gene-expression data on plant pathways
    • Upload and analyze gene-gene interaction data on plant pathways
    • Compare reference rice pathways with pathways from any of 79 projected species currently hosted by Plant Reactome.
  • Curation of reference rice pathways: 
    • We have added 10 newly curated pathways and 3 "container" pathways, and updated 2 existing pathways, resulting in a total of 293 reference rice pathways.
    • New Pathways:
               - Regulation of embryo development
               - Maternal tissue PCD
               - Cell cycle regulation
               - Aleurone layer formation
               - Regulatory network of nutrient accumulation
               - Regulation of seed size
               - Regulation of leaf development
               - HSFA7/ HSFA6B-regulatory network-induced by drought and ABA
               - SNAC1 transcription network involved in drought and salinity tolerance
               - Arsenic uptake and detoxification

    Example of a recently curated rice pathway (figure): Regulatory Network of Nutrient Accumulation

  • Updated/renamed pathways:

         - Flower development
         - Thiosulfate disproportionation III (rhodanese)

  • Pathway Projection Statistics: We have extended orthology-based pathway projections for 1 new species: Arabidopsis halleri. Plant Reactome now hosts pathway projections for 79 species ranging from unicellular autotrophs to higher plants. In addition, we have revised pathways for Solanum lycopersicum (tomato) and Glycine max (soybean), based on their recent genome updates (source: Ensembl Plants).

*data from sequenced transcriptomes
^ projections currently exclude cell-cycle pathways and annotations
Planteome Inparanoid data was kindly provided by the Planteome project.
When available, outgoing links from gene product IDs mapped to reactions are hyperlinked to the respective entries in collaborator databases and online resources.

Species Pathways Reactions Genes Sequence source Homology method
Oryza sativa 293 1273 1727 UniProt Curated Reference
Aegilops tauschii 230 628 1118 Ensembl Gramene Compara
Amborella trichopoda 234 585 677 Ensembl Gramene Compara
Arabidopsis halleri 233 592 1031 Ensembl Gramene Compara
Arabidopsis lyrata 233 597 1052 Ensembl Gramene Compara
Arabidopsis thaliana 233 599 1045 Ensembl Gramene Compara
Arachis duranensis 241 619 1035 PeanutBase Inparanoid
Arachis ipaensis 239 633 1040 PeanutBase Inparanoid
Beta vulgaris 230 575 745 Ensembl Gramene Compara
Brachypodium distachyon 228 621 1023 Ensembl Gramene Compara
Brassica napus 234 590 3146 Ensembl Gramene Compara
Brassica oleracea 230 582 1587 Ensembl Gramene Compara
Brassica rapa 231 590 1598 Ensembl Gramene Compara
Cajanus cajan 236 598 1205 LegumeInfo Inparanoid
Capsicum annuum 238 579 1182 PMID: 24441736 Inparanoid
Chlamydomonas reinhardtii 166 338 273 Ensembl Gramene Compara
Chondrus crispus 141 216 182 Ensembl Gramene Compara
Cicer arietinum 237 591 1437 NCBI Inparanoid
Citrus sinensis 236 590 2333 Phytozome Inparanoid
Coffea canephora 236 586 1026 PMID: 25190796 Inparanoid
Corchorus capsularis 227 554 803 Ensembl Gramene Compara
Cucumis sativus 232 589 840 Ensembl Gramene Compara
Cyanidioschyzon merolae 136 218 174 Ensembl Gramene Compara
Daucus carota 230 569 1067 Ensembl Gramene Compara
Dioscorea rotundata 221 480 582 Ensembl Gramene Compara
Erythranthe guttata 210 508 670 Phytozome Inparanoid
Eucalyptus grandis 212 507 707 Phytozome Inparanoid
Fragaria vesca 236 563 998 Phytozome Inparanoid
Galdieria sulphuraria 148 255 204 Ensembl Gramene Compara
Glycine max 233 603 1943 Ensembl Gramene Compara
Gossypium raimondii 232 604 1435 Ensembl Gramene Compara
Helianthus annuus 231 585 1567 Ensembl Gramene Compara
Hordeum vulgare 229 588 1018 Ensembl Gramene Compara
Jatropha curcas 210 499 534 KDRI (Kazusa) Inparanoid
Leersia perrieri 231 609 978 Ensembl Gramene Compara
Lupinus angustifolius 231 593 1422 Ensembl Gramene Compara
Malus domestica 234 572 1970 PMID: 20802477 Inparanoid
Manihot esculenta 233 598 1182 Ensembl Gramene Compara
Medicago truncatula 232 594 1173 Ensembl Gramene Compara
Musa acuminata 222 570 1287 Ensembl Gramene Compara
Nicotiana attenuata 224 504 716 Ensembl Gramene Compara
Oryza australiensis * 229 542 1650 OMAP/OGE Inparanoid
Oryza barthii 233 636 1038 Ensembl Gramene Compara
Oryza brachyantha 231 620 1001 Ensembl Gramene Compara
Oryza glaberrima 235 628 1031 Ensembl Gramene Compara
Oryza glumaepatula 234 636 1043 Ensembl Gramene Compara
Oryza granulata 230 597 892 OMAP/OGE Inparanoid
Oryza indica 225 563 890 Ensembl Gramene Compara
Oryza longistaminata * 235 565 3407 Ensembl Gramene Compara
Oryza meridionalis 236 584 2026 Ensembl Gramene Compara
Oryza minuta * 231 646 1046 OMAP/OGE Inparanoid
Oryza nivara 233 575 1823 Ensembl Gramene Compara
Oryza officinalis * 230 623 1046 OMAP/OGE Inparanoid
Oryza punctata 233 639 1054 Ensembl Gramene Compara
Oryza rufipogon 186 282 352 Ensembl Gramene Compara
Oryza sativa aus subgroup 236 655 1131 PMID: 24578372 Inparanoid
Ostreococcus lucimarinus 155 275 225 Ensembl Gramene Compara
Phaseolus vulgaris 234 600 1063 Ensembl Gramene Compara
Phoenix dactylifera 230 553 1067 PMID: 23917264 Inparanoid
Physcomitrella patens 209 511 1042 Ensembl Gramene Compara
Picea abies 232 526 1282 Congenie Inparanoid
Pinus taeda 221 473 1237 TreeBase Inparanoid
Populus trichocarpa 232 596 1321 Ensembl Gramene Compara
Prunus persica 234 605 917 Ensembl Gramene Compara
Selaginella moellendorffii 217 524 1197 Ensembl Gramene Compara
Setaria italica 233 629 1085 Ensembl Gramene Compara
Solanum lycopersicum 231 589 972 Ensembl Gramene Compara
Solanum tuberosum 226 563 980 Ensembl Gramene Compara
Sorghum bicolor 233 635 1057 Ensembl Gramene Compara
Synechocystis sp. PCC 6803 156 313 225 GenBank Inparanoid
Theobroma cacao 232 598 893 Ensembl Gramene Compara
Trifolium pratense 230 591 998 Ensembl Gramene Compara
Triticum aestivum 232 644 3575 Ensembl Gramene Compara
Triticum dicoccoides 232 637 2100 Ensembl Gramene Compara
Triticum turgidum * 234 592 2677 PMID: 23800085 Inparanoid
Triticum urartu 222 543 858 Ensembl Gramene Compara
Vigna angularis 232 580 987 Ensembl Gramene Compara
Vigna radiata 220 534 874 Ensembl Gramene Compara
Vitis vinifera 232 597 908 Ensembl Gramene Compara
Zea mays 232 621 1330 Ensembl Gramene Compara

 

NOTE: The pathway counts for both reference and projected species include a few organizational “container” names, such as “Hormone biosyntheses” and “Metabolism”. Additionally, the bulk of the projected pathways occur in the areas of metabolic and regulatory function, whereas the rice reference data set has additional pathways related to cell cycle functions. We are not currently using these additional pathways as a source for orthology projection.

The Plant Reactome increasingly includes curated regulatory and developmental pathways, which require more reference DNA and RNA sequence elements, in addition to the traditional protein-coding elements. These sequence elements are not included in Reactome orthoinference at this time, although we are actively working to enhance the projection process to include these elements on projected pathways in future releases.

Please let us know if you have questions or suggestions.

The Gramene Team www.gramene.org