Release Notes 63 (October 2020)


The Gramene Knowledgebase Team is pleased to announce its Release #63 with the Genome section providing access to information on 93 reference genomes and 122,947 gene family trees. We are adding 26 new reference genomes, including pineapple, cantaloupe, watermelon, Golden apple, clementine, sweet cherry, almond, pistachio, olive tree, cannabis or marijuana, tobacco, a relative of sugar cane, five varieties of European wheat, and new varieties of barley and cocoa. With these new genomes, our comparative genomics collection has reached a total of 382 pairwise DNA alignments and 84 synteny maps. We have also added SNP variation datasets for sunflower, Golden apple, and durum wheat.

The Plant Reactome is the pathway knowledgebase of Gramene. We use the Reactome pathway data model to represent plant metabolic, transport and signaling pathways, developmental processes, organ differentiation, and transcriptional regulatory networks. Manual biocuration is conducted in the reference species rice (O. sativa), and pathways are then projected via gene orthology to other species including crops and model plants, lower plants and single-cell photoautotrophs. In this release, we have extended orthology-based pathway projections to 9 new species, added 13 newly curated pathways, and updated 8 pathways in the reference species Oryza sativa. In total, Plant Reactome currently hosts 320 curated rice pathways and their gene-orthology-based projections for 106 species ranging from unicellular autotrophs to higher plants.

The Gramene Knowledgebase is a curated, open-source, integrated data resource for comparative functional genomics in crops and model plant species. The genome databases were built in direct collaboration with Ensembl Plants and the Plant Reactome database was produced in collaboration with the Reactome project. Core funding for the project is provided by the Agricultural Research Service of the U.S. Department of Agriculture (USDA ARS 1907-21000-030-00D), and the National Science Foundation (NSF IOS-1127112).

 

GENOMES Release Notes 

  • New genomes:
  1. Arabis alpina (alpine rock-cress)
  2. Camelina sativa (camelina, gold-of-pleasure or false flax)
  3. Cannabis sativa (cannabis or marijuana)
  4. Citrullus lanatus (watermelon)
  5. Cucumis melo (cantaloupe or muskmelon, smooth-skinned)
  6. Nymphaea colorata (a tropical water lily)
  7. Prunus dulcis (almond)
  8. Rosa chinensis (China or Chinese rose; Bengal rose, crimson or beauty)
  9. Setaria viridis (wild or green foxtail millet, green bristlegrass)
  10. Citrus clementina (clementine; citrus fruit hybrid between mandarin and sweet orange)
  11. Ipomoea triloba (morning glory, a relative of sweet potato)
  12. Malus domestica (Golden apple)
  13. Olea europaea sylvestris (wild olive tree)
  14. Pistacia vera (pistachio)
  15. Prunus avium (sweet cherry)
  16. Ananas comosus (pineapple)
  17. Eragrostis curvula (weeping lovegrass)
  18. Saccharum spontaneum (a sugar-poor relative of sugarcane)
  19. Chara braunii (Braun's stonewort, an ecorticated streptophyte alga)
  20. Hordeum vulgare subsp. GoldenPromise (barley var. GoldenPromise)
  21. Theobroma cacao subsp. Matina 1-6 (Matina cocoa, cacao or chocolate tree)
  22. Triticum aestivum subsp. Cadenza
  23. Triticum aestivum subsp. Claire
  24. Triticum aestivum subsp. Paragon
  25. Triticum aestivum subsp. Robigus
  26. Triticum aestivum subsp. Weebill
  • Updated genomes:
  1. Updated almond gene names.
  • New & updated data: 
  1. Added new variation data for sunflower, apple and durum wheat.
  2. New whole-genome alignments, see summary here.
  3. New synteny maps, see summary here.
  4. Updated split gene predictions from peptide comparative genomics.
  5. A total of 122,947 GeneTree families were constructed comprising 2,875,432 individual genes from 93 plant genomes with 3,169,866 input proteins.
  • Updated software: 

Updated genome browser, database schema, and API to Ensembl 101 software.

 

NOTE: We are aware that the filter for selecting a chromosome for a given region in sorghum is missing in the current release (63). This will be fixed in the next release (64). In the meantime, users are encouraged to use the "Multiple regions (Chr:Start:End:Strand) [Max 250 advised]" filter for one or many regions. Examples of formatted entries are:

4:10000:1000000:-1

7:150000:250000:1
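
As a rough illustration of the entry format shown above, the following Python sketch builds Chr:Start:End:Strand strings from individual values. The function names, the validation rules (strand limited to 1 or -1, start not greater than end), and the treatment of the advised maximum of 250 as a hard limit are assumptions made for this example; they are not part of the Gramene Mart interface.

```python
# Sketch only: helper names and validation rules are assumptions,
# not part of Gramene Mart itself.

MAX_ADVISED_REGIONS = 250  # taken from the filter label "[Max 250 advised]"


def format_region(chrom, start, end, strand):
    """Return one 'Chr:Start:End:Strand' entry for the multiple-regions filter."""
    if strand not in (1, -1):
        raise ValueError("strand is assumed to be 1 or -1")
    if start > end:
        raise ValueError("start is assumed not to exceed end")
    return f"{chrom}:{start}:{end}:{strand}"


def format_regions(regions):
    """Format several (chrom, start, end, strand) tuples, one entry per region."""
    entries = [format_region(*region) for region in regions]
    if len(entries) > MAX_ADVISED_REGIONS:
        raise ValueError(f"more than {MAX_ADVISED_REGIONS} regions requested")
    return entries


if __name__ == "__main__":
    # Reproduces the two example entries given in the note above.
    for entry in format_regions([(4, 10000, 1000000, -1), (7, 150000, 250000, 1)]):
        print(entry)
```

The resulting entries can then be pasted into the filter box; how multiple entries should be separated there is not specified in these notes.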

 

Here is a complete list of genomes, with every data category available for each in Gramene's genome browser.

 

Compara

Gene Trees

The EnsemblCompara GeneTree database was updated. A total of 122,947 GeneTree families were constructed comprising 2,875,432 individual genes from 93 plant genomes with 3,169,866 input proteins.

Putative Split Genes

  • Updated Split Genes: Updated split genes for the current release are available on the FTP site. These are putative gene annotation artifacts (also known as contiguous gene split models) and are based on the latest Ensembl Compara Gene Tree database.
  • Brief statistics for each species are given below.

 

Species_id Split_genes
Actinidia_chinensis_var._chinensis 160
Aegilops_tauschii_subsp._strangulata 200
Arabidopsis_lyrata_subsp._lyrata 96
Arabidopsis_thaliana 30
Arabis_alpina 28
Beta_vulgaris_subsp._vulgaris 46
Brachypodium_distachyon 78
Brassica_oleracea_var._oleracea 668
Brassica_rapa 100
Camelina_sativa 814
Cannabis_sativa 88
Capsicum_annuum 704
Chlamydomonas_reinhardtii 38
Citrullus_lanatus 222
Coffea_canephora 326
Cucumis_sativus 480
Daucus_carota_subsp._sativus 218
Dioscorea_rotundata 206
Eragrostis_curvula 374
Glycine_max 206
Gossypium_raimondii 426
Helianthus_annuus 606
Hordeum_vulgare_subsp._vulgare 160
Ipomoea_triloba 430
Leersia_perrieri 88
Lupinus_angustifolius 114
Malus_domestica 872
Manihot_esculenta 130
Medicago_truncatula 218
Musa_acuminata_subsp._malaccensis 690
Nicotiana_attenuata 60
Nymphaea_colorata 400
Olea_europaea_var_sylvestris 176
Oryza_barthii 122
Oryza_brachyantha 192
Oryza_glaberrima 422
Oryza_glumipatula 166
Oryza_meridionalis 110
Oryza_nivara 118
Oryza_punctata 90
Oryza_rufipogon 122
Oryza_sativa_Indica_Group 530
Oryza_sativa_Japonica_Group 482
Ostreococcus_lucimarinus_CCE9901 10
Panicum_hallii_var._filipes 114
Panicum_hallii_var._hallii 122
Phaseolus_vulgaris 662
Physcomitrella_patens 46
Populus_trichocarpa 210
Prunus_persica 70
Rosa_chinensis 1476
Saccharum_spontaneum 352
Setaria_italica 422
Setaria_viridis 158
Solanum_lycopersicum 482
Solanum_tuberosum 316
Sorghum_bicolor 34
Theobroma_cacao 102
Trifolium_pratense 448
Triticum_aestivum 448
Triticum_dicoccoides 908
Triticum_turgidum_subsp_durum 552
Vigna_angularis 206
Vigna_radiata_var._radiata 60
Vitis_vinifera 470
Zea_mays 184

Genomic alignments

There are 382 pairwise genomic alignments.

 

Synteny data

There are 84 synteny maps.

 

Protein Annotation, GO, Xref Protein domain information 

These were generated for the new and updated genomes.

Gramene Mart


PATHWAYS Release Notes
(Plant Reactome Version 20; Gramene r63) Summary:

Here, we provide a summary of Gramene Release 63 including website and coding updates, a list of new & updated pathways, information on new species, and projection statistics. Plant Reactome now hosts pathway projections for 106 plant species. 

    Website and coding updates

    This release uses an updated ortho-inference process, converted from Perl to Java and revised for efficiency. The script follows the rules previously established for ortho-inference, with minor adjustments to accommodate changes in the underlying Reactome data schema and to take advantage of new features and functionality available at our partner Reactome site.

    Analytical tools

    Currently, Plant Reactome supports researchers with the following analytical tools:

    - Search for gene/protein, metabolites, pathways
    - Upload and analyze gene-expression data on plant pathways
    - Upload and analyze gene-gene interaction data on plant pathways
    - Compare reference rice pathways with pathways from any of the 106 projected species hosted by Plant Reactome.

    Curation of reference rice pathways

    We have added 13 newly curated pathways, 8 updated pathways, and 1 "container" pathway, resulting in a total of 320 reference rice pathways. In this release, we focused on updates to rice metabolic pathways and curation of cell-cycle events.

    New pathways

    (under Cell Cycle)
    S phase
        Synthesis of DNA
            DNA replication initiation
            DNA strand Elongation
                Lagging strand synthesis
                Leading strand synthesis
        Maturation

    (under Root Structure Development)
    Root elongation
    Crown root development
        Crown root emergence
        Crown root initiation
    Lateral root development
        Lateral root emergence
        Lateral root initiation

    Updated pathways

    Photorespiration
    Cardiolipin biosynthesis
    Flower development
    Floral bracts development
    Primary root development
    UDP-L-arabinose biosynthesis and transport
    Vitamin E biosynthesis
    Ascorbate biosynthesis
     

    Pathway Projection Statistics

    We have extended orthology-based pathway projections for 9 new species (in bold below). Plant Reactome now hosts pathway projections for 106 species ranging from unicellular autotrophs to higher plants.
    *data from sequenced transcriptomes
    ^ projections currently exclude cell-cycle pathways and annotations
    Planteome Inparanoid data was kindly provided by the Planteome project
    When available, gene product IDs mapped to reactions are hyperlinked to their respective entries in collaborator databases and online resources

    Species  Pathways  Reactions  Genes  Sequence Source  Homology Method
    Oryza sativa 320 1887 2170 UniProt Curated Reference
    Actinidia chinensis 264 669 1626 Ensembl Gramene Compara
    Aegilops tauschii 270 719 1257 Ensembl Gramene Compara
    Amborella trichopoda 265 659 799 Ensembl Gramene Compara
    Ananas comosus 255 634 852 Ensembl Gramene Compara
    Arabidopsis halleri 266 670 1197 Ensembl Gramene Compara
    Arabidopsis lyrata 265 669 1234 Ensembl Gramene Compara
    Arabidopsis thaliana 266 677 1215 Ensembl Gramene Compara
    Arachis duranensis 277 743 1581 PeanutBase Inparanoid
    Arachis ipaensis 274 702 1532 PeanutBase Inparanoid
    Asparagus officinalis 275 664 1075 Phytozome Inparanoid
    Beta vulgaris 266 663 889 Ensembl Gramene Compara
    Brachypodium distachyon 264 710 1163 Ensembl Gramene Compara
    Brassica napus 268 676 3708 Ensembl Gramene Compara
    Brassica oleracea 265 664 1885 Ensembl Gramene Compara
    Brassica rapa 264 665 1862 Ensembl Gramene Compara
    Cajanus cajan 273 702 1573 LegumeInfo Inparanoid
    Cannabis sativa 267 651 1603 JCVI Inparanoid
    Cannabis sativa subsp. indica 274 690 1100 CCBR-UToronto Inparanoid
    Capsella rubella 274 700 1772 Phytozome Inparanoid
    Capsicum annuum 261 638 1159 Ensembl Gramene Compara
    Chara braunii 207 377 439 Ensembl Gramene Compara
    Chlamydomonas reinhardtii 192 365 316 Ensembl Gramene Compara
    Chondrus crispus 159 229 213 Ensembl Gramene Compara
    Cicer arietinum 275 687 1298 NCBI Inparanoid
    Citrullus lanatus 272 692 1172 CuGenDB Inparanoid
    Citrus clementina 269 681 1058 Ensembl Gramene Compara
    Citrus sinensis 270 700 2860 Phytozome Inparanoid
    Coffea canephora 262 665 1005 Ensembl Gramene Compara
    Corchorus capsularis 261 628 908 Ensembl Gramene Compara
    Corchorus olitorius 274 665 1311 NCBI Inparanoid
    Cucumis sativus 265 672 980 Ensembl Gramene Compara
    Cyanidioschyzon merolae 159 235 204 Ensembl Gramene Compara
    Cynara cardunculus var. scolymus 263 645 1163 Ensembl Gramene Compara
    Daucus carota 264 644 1268 Ensembl Gramene Compara
    Dioscorea rotundata 255 547 689 Ensembl Gramene Compara
    Eragrostis curvula 267 697 1505 Ensembl Gramene Compara
    Eragrostis tef 266 694 1724 Ensembl Gramene Compara
    Erythranthe guttata 270 700 1515 Phytozome Inparanoid
    Eucalyptus grandis 273 707 1716 Phytozome Inparanoid
    Fragaria vesca 272 667 1338 Phytozome Inparanoid
    Galdieria sulphuraria 173 290 246 Galdieria sulphuraria Compara
    Glycine max 267 682 2250 Glycine max Compara
    Gossypium raimondii 267 690 1643 Gossypium raimondii Compara
    Helianthus annuus 261 657 1706 Helianthus annuus Compara
    Hordeum vulgare 265 677 1169 Hordeum vulgare Compara
    Humulus lupulus haplotig 249 478 720 Hendrix Inparanoid
    Humulus lupulus primary 267 680 2236 Hendrix Inparanoid
    Ipomoea triloba 263 661 1333 Ensembl Gramene Compara
    Jatropha curcas 269 663 1233 KDRI (Kazusa) Inparanoid
    Leersia perrieri 267 697 1133 Leersia perrieri Compara
    Lupinus angustifolius 265 670 1631 Lupinus angustifolius Compara
    Malus domestica 262 660 1610 PMID: 20802477 Inparanoid
    Manihot esculenta 266 678 1381 Ensembl Gramene Compara
    Marchantia polymorpha 249 575 695 Ensembl Gramene Compara
    Medicago truncatula 264 669 1330 Ensembl Gramene Compara
    Musa acuminata 256 630 1433 Ensembl Gramene Compara
    Nelumbo nucifera 273 689 1397 iPlant Collaborative Inparanoid
    Nicotiana attenuata 253 536 815 Ensembl Gramene Compara
    Olea europaea var. sylvestris 256 627 1435 Ensembl Gramene Compara
    Oryza australiensis * 268 651 2247 OMAP/OGE Inparanoid
    Oryza barthii 269 737 1198 Ensembl Gramene Compara
    Oryza brachyantha 266 715 1151 Ensembl Gramene Compara
    Oryza glaberrima 271 721 1185 Ensembl Gramene Compara
    Oryza glumaepatula 270 740 1201 Ensembl Gramene Compara
    Oryza granulata 267 657 4182 OMAP/OGE Inparanoid
    Oryza indica 271 756 1298 Ensembl Gramene Compara
    Oryza longistaminata * 267 684 1033 Ensembl Gramene Compara
    Oryza meridionalis 261 651 1026 Ensembl Gramene Compara
    Oryza minuta * 271 680 2712 OMAP/OGE Inparanoid
    Oryza nivara 267 743 1198 Ensembl Gramene Compara
    Oryza officinalis * 274 686 2357 OMAP/OGE Inparanoid
    Oryza punctata 265 723 1202 Ensembl Gramene Compara
    Oryza rufipogon 267 739 1212 Ensembl Gramene Compara
    Oryza sativa aus kasalath 206 310 377 PMID: 24578372 Inparanoid
    Ostreococcus lucimarinus 178 304 267 Ensembl Gramene Compara
    Panicum hallii FIL2 267 734 1221 Ensembl Gramene Compara
    Panicum hallii var. hallii HAL2 269 739 1256 Ensembl Gramene Compara
    Phaseolus vulgaris 266 678 1223 Ensembl Gramene Compara
    Phoenix dactylifera 269 662 1455 PMID: 23917264 Inparanoid
    Phyllostachys edulis 273 661 1908 NCGR Inparanoid
    Physcomitrella patens 247 576 1208 Ensembl Gramene Compara
    Picea abies 271 651 1889 Congenie Inparanoid
    Pinus taeda 267 643 2630 TreeBase Inparanoid
    Pistacia vera 263 675 1269 Ensembl Gramene Compara
    Populus trichocarpa 266 682 1519 Ensembl Gramene Compara
    Prunus avium 256 612 848 Ensembl Gramene Compara
    Prunus persica 267 690 1052 Ensembl Gramene Compara
    Saccharum spontaneum 265 671 2121 Ensembl Gramene Compara
    Salvia hispanica 273 672 2190 Jaiswal Inparanoid
    Selaginella moellendorffii 249 575 1352 Ensembl Gramene Compara
    Setaria italica 269 723 1245 Ensembl Gramene Compara
    Solanum lycopersicum 260 668 1128 Ensembl Gramene Compara
    Solanum tuberosum 254 626 1120 Ensembl Gramene Compara
    Sorghum bicolor 269 737 1248 Ensembl Gramene Compara
    Synechocystis sp. PCC 6803 150 268 216 Jaiswal Inparanoid
    Theobroma cacao 267 687 1008 Ensembl Gramene Compara
    Trifolium pratense 264 667 1147 Ensembl Gramene Compara
    Triticum aestivum 271 734 3990 Ensembl Gramene Compara
    Triticum dicoccoides 271 727 2380 Ensembl Gramene Compara
    Triticum turgidum * 269 723 2403 Ensembl Gramene Compara
    Triticum urartu 259 613 965 Ensembl Gramene Compara
    Vigna angularis 261 650 1157 Ensembl Gramene Compara
    Vigna radiata 255 613 1005 Ensembl Gramene Compara
    Vitis vinifera 266 680 1074 Ensembl Gramene Compara
    Zea mays 269 708 1540 Ensembl Gramene Compara
    Zoysia japonica 271 669 1920 KDRI (Kazusa) Inparanoid

    The Plant Reactome increasingly includes curated regulatory and developmental pathways, which require more reference DNA and RNA sequence elements, in addition to the traditional protein-coding elements. These sequence elements are not included in Reactome ortho-inference at this time, although we are actively working to enhance the projection process to include these elements on projected pathways in future releases.

    Plant Reactome mirror at Powered-by-CyVerse

    We continue to leverage the resources made available in the Powered-by-CyVerse virtual server environment by providing the Plant Reactome database mirror (https://plantreactome.cyverse.org) to facilitate training, education, and integration with the CyVerse platform and user community.


    Infrastructure

    Web Services: Gramene's web services page documents many ways to directly connect to and analyze our databases.

    • Public MySQL Server: 

    Our partner Ensembl Genomes offers a public, read-only MySQL server with copies of the species-specific and comparative genomic databases that we use. To use this with the mysql command-line client:

    $ mysql -hmysql-eg-publicsql.ebi.ac.uk -P4157 -uanonymous

    Please note that the versioning scheme used at this public database differs from ours; for example, Gramene's release 62_98 corresponds to version 45_98 at Ensembl.
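
For scripted access, the command line above can be assembled programmatically. The Python sketch below only builds the argument list for the mysql client using the host, port, and user quoted above; the helper names are hypothetical, and the idea that database names contain the Ensembl-style version string (such as 45_98) is an assumption that should be checked against the server's actual database list.

```python
import subprocess

# Connection details copied from the mysql command shown above.
HOST = "mysql-eg-publicsql.ebi.ac.uk"
PORT = 4157
USER = "anonymous"


def mysql_command(sql):
    """Build the argv that runs a single SQL statement with the mysql client."""
    return ["mysql", f"-h{HOST}", f"-P{PORT}", f"-u{USER}", "-e", sql]


def list_databases_sql(version):
    """Return SQL listing databases whose names contain the given version string.

    Assumption: database names embed the Ensembl-style version (e.g. '45_98').
    """
    return f"SHOW DATABASES LIKE '%{version}%'"


def run_query(sql):
    """Run the statement; requires the mysql client and network access."""
    result = subprocess.run(mysql_command(sql), capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works offline.
    print(" ".join(mysql_command(list_databases_sql("45_98"))))
```

Calling run_query(list_databases_sql("45_98")) would execute the query against the public server; it is left out of the main block so that the example does not depend on connectivity.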

    • Website and coding updates

    The latest Plant Reactome pathway data has been re-indexed and made available via Gramene search.


    Recent Publications:

    1. Tello-Ruiz MK, Naithani S, Gupta P, Olson A, Wei S, Preece S, Jiao Y, Wang B, Chougule K, Garg P, Elser J, Kumari S, Kumar V, Contreras-Moreira B, Naamati G, George N, Cook J, Bolser D, D’Eustachio P, Stein LD, Gupta A, Xu W, Regala J, Papatheodorou I, Kersey PJ, Flicek P, Taylor C, Jaiswal P, and Ware D. Gramene 2021: Harnessing the power of comparative genomics and pathways for plant research. Manuscript accepted by NAR Database (2021).
    2. Naithani S., P. Gupta, J. Preece, P. D'Eustachio, J. Elser, J. Kiff, P. Garg, D.A. Dikeman$, A.J. Olson, S. Wei, M.K. Tello-Ruiz, J. Cook, A. Fabregat, T. Cheng, E. Bolton, A.F. Muñoz-Pomer, S. Mohammed, I. Papatheodorou, L. Stein, D. Ware, and P. Jaiswal (2020). Plant Reactome: A knowledgebase and resource for comparative pathway analysis. Nucleic Acids Res. https://doi.org/10.1093/nar/gkz996.
    3. Howe K.L., B. Contreras-Moreira, N. De Silva, G. Maslen, W. Akanni, J. Allen, J. Alvarez-Jarreta, M. Barba, D.M. Bolser, L. Cambell, M. Carbajo, M. Chakiachvili, M. Christensen, C. Cummins, A. Cuzick, P. Davis, S. Fexova, A. Gall, N. George, L. Gil, P. Gupta, K. E. Hammond-Kosack, E. Haskell, S. E. Hunt, P. Jaiswal, S. H. Janacek, P. J. Kersey, N. Langridge, U. Maheswari, T. Maurel, M. D. McDowall, B. Moore, M. Muffato, G. Naamati, S. Naithani, A. Olson, I. Papatheodorou, M. Patricio, M. Paulini, H. Pedro, E. Perry, J. Preece, M. Rosello, M. Russell, V. Sitnik, D. M. Staines, J. Stein, M. K. Tello-Ruiz, S. J. Trevanion, M. Urban, S. Wei, D. Ware, G. Williams, A. D. Yates, P. Flicek (2020). Ensembl Genomes 2020—enabling non-vertebrate genomic research. Nucleic Acids Res., gkz890, https://doi.org/10.1093/nar/gkz890
    4. Tello-Ruiz MK, Marco CF, Hsu FM, Khangura RS, Qiao P, et al. (2019) Double triage to identify poorly annotated genes in maize: The missing link in community curation. PLOS ONE 14(10): e0224086. https://doi.org/10.1371/journal.pone.0224086
    5. Naithani S, P. Gupta, J. Preece, P. Garg, V. Fraser, L.K. Padgitt-Cobb, M. Martin, K. Vining and P. Jaiswal (2019). Involving community in genes and pathway curation. Database, 2019:1-8,  https://doi.org/10.1093/database/bay146

     

    Please let us know if you have questions or suggestions.

    The Gramene Team
    www.gramene.org.