Release Notes 63 (October 2020)


The Gramene Knowledgebase Team is pleased to announce its Release #63 with the Genome section providing access to information on 93 reference genomes and 122,947 gene family trees. We are adding 26 new reference genomes, including pineapple, cantaloupe, watermelon, Golden apple, clementine, sweet cherry, almond, pistachio, olive tree, cannabis or marijuana, tobacco, a relative of sugar cane, five varieties of European wheat, and new varieties of barley and cocoa. With these new genomes, our comparative genomics collection has reached a total of 382 pairwise DNA alignments and 84 synteny maps. We have also added SNP variation datasets for sunflower, Golden apple, and durum wheat.

The Plant Reactome is the pathway knowledgebase of Gramene. We use the Reactome pathway data model to represent plant metabolic, transport, and signaling pathways, developmental processes, organ differentiation, and transcriptional regulatory networks. Manual biocuration is conducted in the reference species rice (Oryza sativa), and pathways are then projected via gene orthology to other species, including crops, model plants, lower plants, and single-cell photoautotrophs. In this release, we have extended orthology-based pathway projections to 9 new species, added 13 newly curated pathways, and updated 8 pathways in the reference species. In total, Plant Reactome currently hosts 320 curated rice pathways and their gene-orthology-based projections to 106 species ranging from unicellular autotrophs to higher plants.

The Gramene Knowledgebase is a curated, open-source, integrated data resource for comparative functional genomics in crops and model plant species. The genome databases were built in direct collaboration with Ensembl Plants and the Plant Reactome database was produced in collaboration with the Reactome project. Core funding for the project is provided by the Agricultural Research Service of the U.S. Department of Agriculture (USDA ARS 1907-21000-030-00D), and the National Science Foundation (NSF IOS-1127112).

 

GENOMES Release Notes 

  • New genomes:
  1. Arabis alpina (alpine rock-cress)
  2. Camelina sativa (camelina, gold-of-pleasure or false flax)
  3. Cannabis sativa (cannabis or marijuana)
  4. Citrullus lanatus (watermelon)
  5. Cucumis melo (cantaloupe or muskmelon, smooth-skinned)
  6. Nymphaea colorata (a tropical water lily)
  7. Prunus dulcis (almond)
  8. Rosa chinensis (China or Chinese rose; Bengal rose, crimson or beauty)
  9. Setaria viridis (wild or green foxtail millet, green bristlegrass)
  10. Citrus clementina (clementine; citrus fruit hybrid between mandarin and sweet orange)
  11. Ipomoea triloba (morning glory, a relative of sweet potato)
  12. Malus domestica (Golden apple)
  13. Olea europaea sylvestris (wild olive tree)
  14. Pistacia vera (pistachio)
  15. Prunus avium (sweet cherry)
  16. Ananas comosus (pineapple)
  17. Eragrostis curvula (weeping lovegrass)
  18. Saccharum spontaneum (a sugar-poor relative of sugarcane)
  19. Chara braunii (Braun's stonewort, an ecorticated streptophyte alga)
  20. Hordeum vulgare subsp. GoldenPromise (barley var. GoldenPromise)
  21. Theobroma cacao subsp. Matina 1-6 (Matina cocoa, cacao or chocolate tree)
  22. Triticum aestivum subsp. Cadenza
  23. Triticum aestivum subsp. Claire
  24. Triticum aestivum subsp. Paragon
  25. Triticum aestivum subsp. Robigus
  26. Triticum aestivum subsp. Weebill
  • Updated genomes:
  1. Renamed almond gene names.
  • New & updated data: 
  1. Added new variation data for sunflower, apple and durum wheat.
  2. New whole-genome alignments, see summary here.
  3. New synteny maps, see summary here.
  4. Updated split gene predictions from peptide comparative genomics.
  5. A total of 122,947 GeneTree families were constructed comprising 2,875,432 individual genes from 93 plant genomes with 3,169,866 input proteins.
  • Updated software: 

Updated genome browser, database schema, and API to Ensembl 101 software.

 

NOTE: We are aware that the filter for selecting a chromosome for a given region in sorghum is missing from the current release 63. This will be fixed in release 64. In the meantime, users are encouraged to use the "Multiple regions (Chr:Start:End:Strand) [Max 250 advised]" filter for one or more regions. Examples of formatted entries:

4:10000:1000000:-1

7:150000:250000:1
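
The entry format above can be checked before pasting regions into the filter. The following is a minimal Python sketch, not part of Gramene's tooling: the names `Region`, `parse_region`, and `format_region` are hypothetical, and the checks (strand limited to 1 or -1, start not after end) are assumptions inferred from the two examples.

```python
from typing import NamedTuple


class Region(NamedTuple):
    """One 'Chr:Start:End:Strand' entry (hypothetical helper type)."""
    chrom: str
    start: int
    end: int
    strand: int


def parse_region(text: str) -> Region:
    """Parse an entry such as '4:10000:1000000:-1' into a Region."""
    parts = text.strip().split(":")
    if len(parts) != 4:
        raise ValueError(f"expected 4 colon-separated fields: {text!r}")
    chrom = parts[0]
    # int() raises ValueError on non-numeric fields.
    start, end, strand = int(parts[1]), int(parts[2]), int(parts[3])
    # Assumption: strand is 1 or -1, as in the examples above.
    if strand not in (1, -1):
        raise ValueError(f"strand must be 1 or -1: {text!r}")
    # Assumption: coordinates are 1-based and start <= end.
    if start < 1 or end < start:
        raise ValueError(f"invalid coordinates: {text!r}")
    return Region(chrom, start, end, strand)


def format_region(region: Region) -> str:
    """Render a Region back into the 'Chr:Start:End:Strand' form."""
    return f"{region.chrom}:{region.start}:{region.end}:{region.strand}"


if __name__ == "__main__":
    for entry in ["4:10000:1000000:-1", "7:150000:250000:1"]:
        print(format_region(parse_region(entry)))
```

Parsing and re-formatting each entry this way catches swapped coordinates or a missing strand field before the query is submitted.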

 

Here is a complete list of genomes, with every data category available for each in Gramene's genome browser.

 

Compara

Gene Trees

The EnsemblCompara GeneTree database was updated. A total of 122,947 GeneTree families were constructed, comprising 2,875,432 individual genes from 93 plant genomes with 3,169,866 input proteins.

Putative Split Genes

  • Updated Split Genes: Updated split genes for the current release are available on the FTP site. These are putative gene annotation artifacts (also known as contiguous gene split models) and are based on the latest Ensembl Compara Gene Tree database.
  • Brief statistics for each species are given below.

 

Species_id | Split_genes
Actinidia_chinensis_var._chinensis | 160
Aegilops_tauschii_subsp._strangulata | 200
Arabidopsis_lyrata_subsp._lyrata | 96
Arabidopsis_thaliana | 30
Arabis_alpina | 28
Beta_vulgaris_subsp._vulgaris | 46
Brachypodium_distachyon | 78
Brassica_oleracea_var._oleracea | 668
Brassica_rapa | 100
Camelina_sativa | 814
Cannabis_sativa | 88
Capsicum_annuum | 704
Chlamydomonas_reinhardtii | 38
Citrullus_lanatus | 222
Coffea_canephora | 326
Cucumis_sativus | 480
Daucus_carota_subsp._sativus | 218
Dioscorea_rotundata | 206
Eragrostis_curvula | 374
Glycine_max | 206
Gossypium_raimondii | 426
Helianthus_annuus | 606
Hordeum_vulgare_subsp._vulgare | 160
Ipomoea_triloba | 430
Leersia_perrieri | 88
Lupinus_angustifolius | 114
Malus_domestica | 872
Manihot_esculenta | 130
Medicago_truncatula | 218
Musa_acuminata_subsp._malaccensis | 690
Nicotiana_attenuata | 60
Nymphaea_colorata | 400
Olea_europaea_var_sylvestris | 176
Oryza_barthii | 122
Oryza_brachyantha | 192
Oryza_glaberrima | 422
Oryza_glumipatula | 166
Oryza_meridionalis | 110
Oryza_nivara | 118
Oryza_punctata | 90
Oryza_rufipogon | 122
Oryza_sativa_Indica_Group | 530
Oryza_sativa_Japonica_Group | 482
Ostreococcus_lucimarinus_CCE9901 | 10
Panicum_hallii_var._filipes | 114
Panicum_hallii_var._hallii | 122
Phaseolus_vulgaris | 662
Physcomitrella_patens | 46
Populus_trichocarpa | 210
Prunus_persica | 70
Rosa_chinensis | 1476
Saccharum_spontaneum | 352
Setaria_italica | 422
Setaria_viridis | 158
Solanum_lycopersicum | 482
Solanum_tuberosum | 316
Sorghum_bicolor | 34
Theobroma_cacao | 102
Trifolium_pratense | 448
Triticum_aestivum | 448
Triticum_dicoccoides | 908
Triticum_turgidum_subsp_durum | 552
Vigna_angularis | 206
Vigna_radiata_var._radiata | 60
Vitis_vinifera | 470
Zea_mays | 184

Genomic alignments

There are 382 pairwise genomic alignments.

 

Synteny data

There are 84 synteny maps.

 

Protein Annotation, GO, Xref Protein domain information 

These were generated for the new and updated genomes.

Gramene Mart


PATHWAYS Release Notes
(Plant Reactome Version 20; Gramene r63) Summary:

Here, we provide a summary of Gramene Release 63 including website and coding updates, a list of new & updated pathways, information on new species, and projection statistics. Plant Reactome now hosts pathway projections for 106 plant species. 

Website and coding updates

This release utilizes an updated ortho-inference process, converted to Java from Perl and revised for efficiency. The script uses the rules previously set up for ortho-inference while containing minor adjustments to facilitate changes in the underlying Reactome data schema and take advantage of new features and functionalities present in our partner Reactome site.
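
The general idea of projecting reference reactions through gene orthology can be illustrated with a toy Python sketch. This is not the Java ortho-inference code described above, and it does not reproduce its rules. The function name, the identifiers, and the projection rule (a reaction is carried over when at least one of its reference genes has an ortholog in the target species) are all assumptions made for illustration.

```python
from typing import Dict, Set


def project_reactions(
    reference_reactions: Dict[str, Set[str]],
    orthologs: Dict[str, Set[str]],
) -> Dict[str, Set[str]]:
    """Map each projectable reaction to the target-species genes supporting it.

    Assumed rule (illustrative only): a reaction is carried over when at
    least one of its reference genes has an ortholog in the target species.
    """
    projected: Dict[str, Set[str]] = {}
    for reaction, ref_genes in reference_reactions.items():
        target_genes: Set[str] = set()
        for gene in ref_genes:
            # Genes without a known ortholog contribute nothing.
            target_genes |= orthologs.get(gene, set())
        if target_genes:
            projected[reaction] = target_genes
    return projected


# Hypothetical identifiers, for illustration only.
rice_reactions = {
    "reaction_A": {"rice_gene_1", "rice_gene_2"},
    "reaction_B": {"rice_gene_3"},
}
orthologs_in_target = {
    "rice_gene_1": {"target_gene_x"},
    "rice_gene_2": {"target_gene_y", "target_gene_z"},
}

projected = project_reactions(rice_reactions, orthologs_in_target)
for reaction, genes in projected.items():
    print(reaction, sorted(genes))
```

In this toy input, `reaction_B` is not projected because its only reference gene has no ortholog in the target species.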

Analytical tools

Currently, Plant Reactome supports researchers with the following analytical tools:

- Search for gene/protein, metabolites, pathways
- Upload and analyze gene-expression data on plant pathways
- Upload and analyze gene-gene interaction data on plant pathways
- Compare reference rice pathways with pathways from any of the 106 projected species hosted by Plant Reactome.

Curation of reference rice pathways

We have added 13 newly curated pathways, 8 updated pathways, and 1 "container" pathway, resulting in a total of 320 reference rice pathways. In this release, we focused on updates to rice metabolic pathways and curation of cell-cycle events.

New pathways

(under Cell Cycle)
S phase
    Synthesis of DNA
        DNA replication initiation
        DNA strand Elongation
            Lagging strand synthesis
            Leading strand synthesis
    Maturation

(under Root Structure Development)
Root elongation
Crown root development
    Crown root emergence
    Crown root initiation
Lateral root development
    Lateral root emergence
    Lateral root initiation

Updated pathways

Photorespiration
Cardiolipin biosynthesis
Flower development
Floral bracts development
Primary root development
UDP-L-arabinose biosynthesis and transport
Vitamin E biosynthesis
Ascorbate biosynthesis
 

Pathway Projection Statistics

We have extended orthology-based pathway projections for 9 new species (in bold below). Plant Reactome now hosts pathway projections for 106 species ranging from unicellular autotrophs to higher plants.
*data from sequenced transcriptomes
^ projections currently exclude cell-cycle pathways and annotations
Planteome Inparanoid data was kindly provided by the Planteome project
When available, outgoing links from gene product IDs mapped to reactions are hyperlinked to the respective entries in collaborator databases and online resources.

Species | Pathways | Reactions | Genes | Sequence Source | Homology Method
Oryza sativa | 320 | 1887 | 2170 | UniProt | Curated Reference
Actinidia chinensis | 264 | 669 | 1626 | Ensembl Gramene | Compara
Aegilops tauschii | 270 | 719 | 1257 | Ensembl Gramene | Compara
Amborella trichopoda | 265 | 659 | 799 | Ensembl Gramene | Compara
Ananas comosus | 255 | 634 | 852 | Ensembl Gramene | Compara
Arabidopsis halleri | 266 | 670 | 1197 | Ensembl Gramene | Compara
Arabidopsis lyrata | 265 | 669 | 1234 | Ensembl Gramene | Compara
Arabidopsis thaliana | 266 | 677 | 1215 | Ensembl Gramene | Compara
Arachis duranensis | 277 | 743 | 1581 | PeanutBase | Inparanoid
Arachis ipaensis | 274 | 702 | 1532 | PeanutBase | Inparanoid
Asparagus officinalis | 275 | 664 | 1075 | Phytozome | Inparanoid
Beta vulgaris | 266 | 663 | 889 | Ensembl Gramene | Compara
Brachypodium distachyon | 264 | 710 | 1163 | Ensembl Gramene | Compara
Brassica napus | 268 | 676 | 3708 | Ensembl Gramene | Compara
Brassica oleracea | 265 | 664 | 1885 | Ensembl Gramene | Compara
Brassica rapa | 264 | 665 | 1862 | Ensembl Gramene | Compara
Cajanus cajan | 273 | 702 | 1573 | LegumeInfo | Inparanoid
Cannabis sativa | 267 | 651 | 1603 | JCVI | Inparanoid
Cannabis sativa subsp. indica | 274 | 690 | 1100 | CCBR-UToronto | Inparanoid
Capsella rubella | 274 | 700 | 1772 | Phytozome | Inparanoid
Capsicum annuum | 261 | 638 | 1159 | Ensembl Gramene | Compara
Chara braunii | 207 | 377 | 439 | Ensembl Gramene | Compara
Chlamydomonas reinhardtii | 192 | 365 | 316 | Ensembl Gramene | Compara
Chondrus crispus | 159 | 229 | 213 | Ensembl Gramene | Compara
Cicer arietinum | 275 | 687 | 1298 | NCBI | Inparanoid
Citrullus lanatus | 272 | 692 | 1172 | CuGenDB | Inparanoid
Citrus clementina | 269 | 681 | 1058 | Ensembl Gramene | Compara
Citrus sinensis | 270 | 700 | 2860 | Phytozome | Inparanoid
Coffea canephora | 262 | 665 | 1005 | Ensembl Gramene | Compara
Corchorus capsularis | 261 | 628 | 908 | Ensembl Gramene | Compara
Corchorus olitorius | 274 | 665 | 1311 | NCBI | Inparanoid
Cucumis sativus | 265 | 672 | 980 | Ensembl Gramene | Compara
Cyanidioschyzon merolae | 159 | 235 | 204 | Ensembl Gramene | Compara
Cynara cardunculus var. scolymus | 263 | 645 | 1163 | Ensembl Gramene | Compara
Daucus carota | 264 | 644 | 1268 | Ensembl Gramene | Compara
Dioscorea rotundata | 255 | 547 | 689 | Ensembl Gramene | Compara
Eragrostis curvula | 267 | 697 | 1505 | Ensembl Gramene | Compara
Eragrostis tef | 266 | 694 | 1724 | Ensembl Gramene | Compara
Erythranthe guttata | 270 | 700 | 1515 | Phytozome | Inparanoid
Eucalyptus grandis | 273 | 707 | 1716 | Phytozome | Inparanoid
Fragaria vesca | 272 | 667 | 1338 | Phytozome | Inparanoid
Galdieria sulphuraria | 173 | 290 | 246 | Galdieria sulphuraria | Compara
Glycine max | 267 | 682 | 2250 | Glycine max | Compara
Gossypium raimondii | 267 | 690 | 1643 | Gossypium raimondii | Compara
Helianthus annuus | 261 | 657 | 1706 | Helianthus annuus | Compara
Hordeum vulgare | 265 | 677 | 1169 | Hordeum vulgare | Compara
Humulus lupulus haplotig | 249 | 478 | 720 | Hendrix | Inparanoid
Humulus lupulus primary | 267 | 680 | 2236 | Hendrix | Inparanoid
Ipomoea triloba | 263 | 661 | 1333 | Ensembl Gramene | Compara
Jatropha curcas | 269 | 663 | 1233 | KDRI (Kazusa) | Inparanoid
Leersia perrieri | 267 | 697 | 1133 | Leersia perrieri | Compara
Lupinus angustifolius | 265 | 670 | 1631 | Lupinus angustifolius | Compara
Malus domestica | 262 | 660 | 1610 | PMID: 20802477 | Inparanoid
Manihot esculenta | 266 | 678 | 1381 | Ensembl Gramene | Compara
Marchantia polymorpha | 249 | 575 | 695 | Ensembl Gramene | Compara
Medicago truncatula | 264 | 669 | 1330 | Ensembl Gramene | Compara
Musa acuminata | 256 | 630 | 1433 | Ensembl Gramene | Compara
Nelumbo nucifera | 273 | 689 | 1397 | iPlant Collaborative | Inparanoid
Nicotiana attenuata | 253 | 536 | 815 | Ensembl Gramene | Compara
Olea europaea var. sylvestris | 256 | 627 | 1435 | Ensembl Gramene | Compara
Oryza australiensis * | 268 | 651 | 2247 | OMAP/OGE | Inparanoid
Oryza barthii | 269 | 737 | 1198 | Ensembl Gramene | Compara
Oryza brachyantha | 266 | 715 | 1151 | Ensembl Gramene | Compara
Oryza glaberrima | 271 | 721 | 1185 | Ensembl Gramene | Compara
Oryza glumaepatula | 270 | 740 | 1201 | Ensembl Gramene | Compara
Oryza granulata | 267 | 657 | 4182 | OMAP/OGE | Inparanoid
Oryza indica | 271 | 756 | 1298 | Ensembl Gramene | Compara
Oryza longistaminata * | 267 | 684 | 1033 | Ensembl Gramene | Compara
Oryza meridionalis | 261 | 651 | 1026 | Ensembl Gramene | Compara
Oryza minuta * | 271 | 680 | 2712 | OMAP/OGE | Inparanoid
Oryza nivara | 267 | 743 | 1198 | Ensembl Gramene | Compara
Oryza officinalis * | 274 | 686 | 2357 | OMAP/OGE | Inparanoid
Oryza punctata | 265 | 723 | 1202 | Ensembl Gramene | Compara
Oryza rufipogon | 267 | 739 | 1212 | Ensembl Gramene | Compara
Oryza sativa aus kasalath | 206 | 310 | 377 | PMID: 24578372 | Inparanoid
Ostreococcus lucimarinus | 178 | 304 | 267 | Ensembl Gramene | Compara
Panicum hallii FIL2 | 267 | 734 | 1221 | Ensembl Gramene | Compara
Panicum hallii var. hallii HAL2 | 269 | 739 | 1256 | Ensembl Gramene | Compara
Phaseolus vulgaris | 266 | 678 | 1223 | Ensembl Gramene | Compara
Phoenix dactylifera | 269 | 662 | 1455 | PMID: 23917264 | Inparanoid
Phyllostachys edulis | 273 | 661 | 1908 | NCGR | Inparanoid
Physcomitrella patens | 247 | 576 | 1208 | Ensembl Gramene | Compara
Picea abies | 271 | 651 | 1889 | Congenie | Inparanoid
Pinus taeda | 267 | 643 | 2630 | TreeBase | Inparanoid
Pistacia vera | 263 | 675 | 1269 | Ensembl Gramene | Compara
Populus trichocarpa | 266 | 682 | 1519 | Ensembl Gramene | Compara
Prunus avium | 256 | 612 | 848 | Ensembl Gramene | Compara
Prunus persica | 267 | 690 | 1052 | Ensembl Gramene | Compara
Saccharum spontaneum | 265 | 671 | 2121 | Ensembl Gramene | Compara
Salvia hispanica | 273 | 672 | 2190 | Jaiswal | Inparanoid
Selaginella moellendorffii | 249 | 575 | 1352 | Ensembl Gramene | Compara
Setaria italica | 269 | 723 | 1245 | Ensembl Gramene | Compara
Solanum lycopersicum | 260 | 668 | 1128 | Ensembl Gramene | Compara
Solanum tuberosum | 254 | 626 | 1120 | Ensembl Gramene | Compara
Sorghum bicolor | 269 | 737 | 1248 | Ensembl Gramene | Compara
Synechocystis sp. PCC 6803 | 150 | 268 | 216 | Jaiswal | Inparanoid
Theobroma cacao | 267 | 687 | 1008 | Ensembl Gramene | Compara
Trifolium pratense | 264 | 667 | 1147 | Ensembl Gramene | Compara
Triticum aestivum | 271 | 734 | 3990 | Ensembl Gramene | Compara
Triticum dicoccoides | 271 | 727 | 2380 | Ensembl Gramene | Compara
Triticum turgidum * | 269 | 723 | 2403 | Ensembl Gramene | Compara
Triticum urartu | 259 | 613 | 965 | Ensembl Gramene | Compara
Vigna angularis | 261 | 650 | 1157 | Ensembl Gramene | Compara
Vigna radiata | 255 | 613 | 1005 | Ensembl Gramene | Compara
Vitis vinifera | 266 | 680 | 1074 | Ensembl Gramene | Compara
Zea mays | 269 | 708 | 1540 | Ensembl Gramene | Compara
Zoysia japonica | 271 | 669 | 1920 | KDRI (Kazusa) | Inparanoid
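
One way to read the table above is as the share of the 320 reference rice pathways that were projected to each species. The Python sketch below computes that share for four rows copied from the table. The `coverage` helper and the choice of rows are illustrative assumptions, not a statistic reported by Plant Reactome.

```python
REFERENCE_PATHWAYS = 320  # curated Oryza sativa pathways in this release

# (species, projected pathways), copied from four rows of the table above.
projections = [
    ("Arabidopsis thaliana", 266),
    ("Zea mays", 269),
    ("Chlamydomonas reinhardtii", 192),
    ("Synechocystis sp. PCC 6803", 150),
]


def coverage(projected: int, reference: int = REFERENCE_PATHWAYS) -> float:
    """Fraction of reference pathways that have a projection."""
    return projected / reference


if __name__ == "__main__":
    # Print species from highest to lowest pathway coverage.
    for species, count in sorted(projections, key=lambda row: row[1], reverse=True):
        print(f"{species}: {count}/{REFERENCE_PATHWAYS} = {coverage(count):.1%}")
```

A lower share for algae and cyanobacteria than for flowering plants is consistent with the table, where these species have the fewest projected pathways.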

The Plant Reactome increasingly includes curated regulatory and developmental pathways, which require more reference DNA and RNA sequence elements, in addition to the traditional protein-coding elements. These sequence elements are not included in Reactome ortho-inference at this time, although we are actively working to enhance the projection process to include these elements on projected pathways in future releases.

Plant Reactome mirror at Powered-by-CyVerse

We continue to leverage the resources made available in the Powered-by-CyVerse virtual server environment by providing the Plant Reactome database mirror (https://plantreactome.cyverse.org) to facilitate training, education, and integration with the CyVerse platform and user community.


Infrastructure

Web Services: Gramene's web services page documents many ways to directly connect to and analyze our databases.

  • Public MySQL Server: 

Our partner Ensembl Genomes offers a public, read-only MySQL server with copies of the species-specific and comparative genomic databases that we use. To use this with the mysql command-line client:

$ mysql -hmysql-eg-publicsql.ebi.ac.uk -P4157 -uanonymous

Please note that the versioning scheme used at this public database differs from ours; for example, Gramene's release 62_98 corresponds to version 45_98 at Ensembl.
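
Because the version pair is embedded in database names on the public server, it can help to split a name into its parts when looking for the right release. The Python sketch below assumes the usual Ensembl Genomes naming pattern `<species>_<type>_<EG release>_<Ensembl release>_<assembly>`. The pattern, the list of database types, and the example name `oryza_sativa_core_45_98_7` are assumptions for illustration and should be checked against the output of `SHOW DATABASES` on the server.

```python
import re
from typing import Dict, Optional

# Assumed naming pattern: <species>_<type>_<EG release>_<Ensembl release>_<assembly>
DB_NAME = re.compile(
    r"^(?P<species>[a-z0-9_]+?)"
    r"_(?P<db_type>core|variation|funcgen|otherfeatures)"
    r"_(?P<eg_release>\d+)_(?P<ensembl_release>\d+)_(?P<assembly>\w+)$"
)


def parse_db_name(name: str) -> Optional[Dict[str, str]]:
    """Split a database name into its parts, or return None if it does not match."""
    match = DB_NAME.match(name)
    return match.groupdict() if match else None


if __name__ == "__main__":
    # Illustrative name only; list the real names with SHOW DATABASES.
    print(parse_db_name("oryza_sativa_core_45_98_7"))
```

With the parts separated, databases can be filtered on the Ensembl release number, which is the half of the version pair that Gramene and Ensembl share.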

  • Website and coding updates

The latest Plant Reactome pathway data has been re-indexed and made available via Gramene search.


Recent Publications:

  1. Tello-Ruiz MK, Naithani S, Gupta P, Olson A, Wei S, Preece S, Jiao Y, Wang B, Chougule K, Garg P, Elser J, Kumari S, Kumar V, Contreras-Moreira B, Naamati G, George N, Cook J, Bolser D, D’Eustachio P, Stein LD, Gupta A, Xu W, Regala J, Papatheodorou I, Kersey PJ, Flicek P, Taylor C, Jaiswal P, and Ware D. (2021). Gramene 2021: Harnessing the power of comparative genomics and pathways for plant research. Nucleic Acids Research 49(D1): D1452–D1463. 10.1093/nar/gkaa979
  2. Naithani S., P. Gupta, J. Preece, P. D'Eustachio, J. Elser, J. Kiff, P. Garg, D.A. Dikeman$, A.J. Olson, S. Wei, M.K. Tello-Ruiz, J. Cook, A. Fabregat, T. Cheng, E. Bolton, A.F. Muñoz-Pomer, S. Mohammed, I. Papatheodorou, L. Stein, D. Ware, and P. Jaiswal (2020). Plant Reactome: A knowledgebase and resource for comparative pathway analysis. Nucleic Acids Res. 10.1093/nar/gkz996.
  3. Howe K.L., B. Contreras-Moreira, N. De Silva, G. Maslen, W. Akanni, J. Allen, J. Alvarez-Jarreta, M. Barba, D.M. Bolser, L. Cambell, M. Carbajo, M. Chakiachvili, M. Christensen, C. Cummins, A. Cuzick, P. Davis, S. Fexova, A. Gall, N. George, L. Gil, P. Gupta, K. E. Hammond-Kosack, E. Haskell, S. E. Hunt, P. Jaiswal, S. H. Janacek, P. J. Kersey, N. Langridge, U. Maheswari, T. Maurel, M. D. McDowall, B. Moore, M. Muffato, G. Naamati, S. Naithani, A. Olson, I. Papatheodorou, M. Patricio, M. Paulini, H. Pedro, E. Perry, J. Preece, M. Rosello, M. Russell, V. Sitnik, D. M. Staines, J. Stein, M. K. Tello-Ruiz, S. J. Trevanion, M. Urban, S. Wei, D. Ware, G. Williams, A. D. Yates, P. Flicek (2020). Ensembl Genomes 2020—enabling non-vertebrate genomic research. Nucleic Acids Res., gkz890, 10.1093/nar/gkz890
  4. Tello-Ruiz MK, Marco CF, Hsu FM, Khangura RS, Qiao P, et al. (2019) Double triage to identify poorly annotated genes in maize: The missing link in community curation. PLOS ONE 14(10): e0224086. 10.1371/journal.pone.0224086
  5. Naithani S, P. Gupta, J. Preece, P. Garg, V. Fraser, L.K. Padgitt-Cobb, M. Martin, K. Vining and P. Jaiswal (2019). Involving community in genes and pathway curation. Database, 2019:1-8, 10.1093/database/bay146

 

Please let us know if you have questions or suggestions.

The Gramene Team
www.gramene.org.