Release Notes 61 (April 2019)


The Gramene Knowledgebase Team is pleased to announce Release #61. The Genome section provides access to information on 2,264,698 genes across 61 reference plant genomes. It includes three new reference genomes: kiwifruit (Actinidia chinensis) and two ecotypes of Hall's panicgrass (Panicum hallii FIL2 and Panicum hallii HAL2), along with updates to several existing genomes. This release also includes 109 manually curated maize gene models, provided as a GFF file that can be attached as a custom track in Gramene's genome browser (they will soon be available as a permanent track as well). The Plant Reactome, Gramene's pathway portal, hosts pathway projections for 82 species, including those in the Genome section, ranging from unicellular autotrophs to higher plants. There are 298 curated reference pathways in total. See more details on the release #61 page.

Gramene Knowledgebase is a curated, open-source, integrated data resource for comparative functional genomics in crops and model plant species. The genome databases were built in direct collaboration with Ensembl Plants and the Plant Reactome database was produced in collaboration with the Reactome project. Core funding for the project is provided by the National Science Foundation, USA.

Genomes Release Notes 

  • New genomes:
  1. Kiwifruit (Actinidia chinensis): assembly and annotation GCA_003024255.1
  2. Panicum hallii FIL2: PHallii_v3.1 assembly and annotation GCA_002211085.2
  3. Panicum hallii HAL2: PhalliiHAL_v2.1 assembly and annotation GCA_003061485.1
  • Updated genomes:
  1. Oryza sativa Japonica: updated non-organelle gene annotation from RAP-DB version 2018-11-26 and added stable_id mappings to the previous annotation.
  2. Solanum lycopersicum: added ITAG3.0 unplaced gene models (Chr0)
  • New & Updated data: 
  1. Triticum aestivum (bread wheat): added 31,779 markers from the 35K Axiom SNP array (provided by CerealsDB).
  2. Triticum dicoccoides (wild emmer wheat, Zavitan): added polyploid view.
  3. Vigna radiata: added stable_id mappings to previous annotation.
  4. Physcomitrella patens: added stable_id mappings to previous annotation.
  5. New manually curated maize gene models (custom genome track may be set up using this GFF file).
  6. New whole-genome alignments for A. chinensis, P. hallii FIL2 and P. hallii HAL2, see WGA section.
  7. New synteny data for A. chinensis, P. hallii FIL2, P. hallii HAL2 and Sorghum bicolor, see Synteny section.
  8. BioMarts for all gene and variation data.
  9. Updated split gene predictions from peptide comparative genomics.
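
Before attaching a GFF file such as the curated maize gene models as a custom track, it can help to count its features by type. The sketch below assumes the file follows the standard nine-column, tab-separated GFF3 layout and that each gene model is recorded as a `gene` feature; the sample lines and IDs are hypothetical and are not taken from the release file.

```python
from collections import Counter
from typing import Iterable

# Hypothetical GFF3 lines for illustration only (not from the Gramene file).
SAMPLE_GFF = """\
##gff-version 3
chr1\tcurated\tgene\t1000\t5000\t.\t+\t.\tID=gene0001
chr1\tcurated\tmRNA\t1000\t5000\t.\t+\t.\tID=mrna0001;Parent=gene0001
chr1\tcurated\texon\t1000\t1500\t.\t+\t.\tParent=mrna0001
chr2\tcurated\tgene\t200\t900\t.\t-\t.\tID=gene0002
"""


def count_feature_types(lines: Iterable[str]) -> Counter:
    """Count GFF3 records by the value in column 3 (feature type)."""
    counts: Counter = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        fields = line.split("\t")
        if len(fields) != 9:
            continue  # skip anything that is not a nine-column record
        counts[fields[2]] += 1
    return counts


feature_counts = count_feature_types(SAMPLE_GFF.splitlines())
print(feature_counts["gene"])
```

For a real file, pass an open file handle instead of `SAMPLE_GFF.splitlines()`. If the assumption about `gene` features holds, the gene count should agree with the number of models announced for the release.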

List of Genomes (Genome browser link)

 

Name                          Assembly                          Accession
Actinidia chinensis Red5      Red5_PS1_1.69.0                   GCA_003024255.1
Aegilops tauschii             Aet_v4.0                          GCA_002575655.1
Amborella trichopoda          AMTR1.0                           GCA_000471905.1
Arabidopsis halleri           Ahal2.2                           GCA_900078215.1
Arabidopsis lyrata            v.1.0                             GCA_000004255.1
Arabidopsis thaliana          TAIR10                            GCA_000001735.1
Beta vulgaris                 RefBeet-1.2.2                     GCA_000511025.2
Brachypodium distachyon       Brachypodium_distachyon_v3.0      GCA_000005505.4
Brassica napus                AST_PRJEB5043_v1                  GCA_000751015.1
Brassica oleracea             BOL                               GCA_000695525.1
Brassica rapa                 Brapa_1.0                         GCA_000309985.1
Chlamydomonas reinhardtii     Chlamydomonas_reinhardtii_v5.5    GCA_000002595.3
Chondrus crispus              ASM35022v2                        GCA_000350225.2
Corchorus capsularis          CCACVL1_1.0                       GCA_001974805.1
Cucumis sativus               ASM407v2                          GCA_000004075.2
Cyanidioschyzon merolae       ASM9120v1                         GCA_000091205.1
Daucus carota                 ASM162521v1                       GCA_001625215.1
Dioscorea rotundata           TDr96_F1_Pseudo_Chromosome_v1.0   GCA_002240015.2
Galdieria sulphuraria         ASM34128v1                        GCA_000341285.1
Glycine max                   Glycine_max_v2.1                  GCA_000004515.4
Gossypium raimondii           Graimondii2_0                     GCA_000327365.1
Helianthus annuus             HanXRQr1.0                        GCA_002127325.1
Hordeum vulgare               IBSC_v2                           tbd
Leersia perrieri              Lperr_V1.4                        GCA_000325765.3
Lupinus angustifolius         LupAngTanjil_v1.0                 GCA_001865875.1
Manihot esculenta             Manihot_esculenta_v6              GCA_001659605.1
Medicago truncatula           MedtrA17_4.0                      GCA_000219495.2
Musa acuminata                ASM31385v1                        GCA_000313855.1
Nicotiana attenuata           NIATTr2                           GCA_001879085.1
Oryza barthii                 O.barthii_v1                      GCA_000182155.2
Oryza brachyantha             Oryza_brachyantha.v1.4b           GCA_000231095.2
Oryza glaberrima              Oryza_glaberrima_V1               GCA_000147395.1
Oryza glumipatula             Oryza_glumaepatula_v1.5           GCA_000576495.1
Oryza longistaminata          O_longistaminata_v1.0             GCA_000789195.1
Oryza meridionalis            Oryza_meridionalis_v1.3           GCA_000338895.2
Oryza nivara                  Oryza_nivara_v1.0                 GCA_000576065.1
Oryza punctata                Oryza_punctata_v1.2               GCA_000573905.1
Oryza rufipogon               OR_W1943                          GCA_000817225.1
Oryza sativa Indica Group     ASM465v1                          GCA_000004655.2
Oryza sativa Japonica Group   IRGSP-1.0                         GCA_001433935.1
Ostreococcus lucimarinus      ASM9206v1                         GCA_000092065.1
Panicum hallii FIL2           PHallii_v3.1                      GCA_002211085.2
Panicum hallii HAL2           PhalliiHAL_v2.1                   GCA_003061485.1
Phaseolus vulgaris            PhaVulg1_0                        GCA_000499845.1
Physcomitrella patens         Phypa_V3                          GCA_000002425.2
Populus trichocarpa           Pop_tri_v3                        GCA_000002775.3
Prunus persica                Prunus_persica_NCBIv2             GCA_000346465.2
Selaginella moellendorffii    v1.0                              GCA_000143415.1
Setaria italica               Setaria_italica_v2.0              GCA_000263155.2
Solanum lycopersicum          SL3.0                             GCA_000188115.3
Solanum tuberosum             SolTub_3.0                        GCA_000226075.1
Sorghum bicolor               Sorghum_bicolor_NCBIv3            GCA_000003195.3
Theobroma cacao               Theobroma_cacao_20110822          GCA_000403535.1
Trifolium pratense            Trpr                              GCA_900079335.1
Triticum aestivum             IWGSC                             GCA_900519105.1
Triticum dicoccoides          WEWSeq_v.1.0                      GCA_002162155.1
Triticum urartu               ASM34745v1                        GCA_000347455.1
Vigna angularis               Vigan1.1                          GCA_001190045.1
Vigna radiata                 Vradiata_ver6                     GCA_000741045.2
Vitis vinifera                12X                               GCA_000003745.2
Zea mays                      B73_RefGen_v4                     GCA_000005005.6

 

Genetic and Structural Variation 

For the Triticum aestivum (wheat) genome, 31,779 markers from the 35K Axiom SNP array were added. The markers were provided by CerealsDB and designed by Affymetrix. This Affymetrix product (ID 550524) contains 35,143 SNPs selected to be informative across a wide range of hexaploid wheat accessions. The 384-sample format array is a cost-effective and efficient system for screening large numbers of lines, and it has been used to screen a large global collection of elite and landrace varieties, including hexaploid and tetraploid accessions.

Compara

Gene Trees

The EnsemblCompara GeneTree database was updated. A total of 92,708 GeneTree families were constructed, comprising 1,948,793 individual genes from 61 plant genomes, with 2,158,277 input proteins.

Putative Split Genes

  • Updated Split Genes: Updated split genes for the current release are available on the FTP site. These are putative gene annotation artifacts (also known as contiguous gene split models) and are based on the latest Ensembl Compara Gene Tree database.
  • Brief statistics for each species are given below.
Species_id Splitgene_counts
Actinidia_chinensis_var._chinensis 142
Aegilops_tauschii_subsp._strangulata 178
Arabidopsis_lyrata_subsp._lyrata 89
Arabidopsis_thaliana 31
Beta_vulgaris_subsp._vulgaris 47
Brachypodium_distachyon 62
Brassica_oleracea_var._oleracea 604
Brassica_rapa 96
Chlamydomonas_reinhardtii 41
Cucumis_sativus 418
Daucus_carota_subsp._sativus 229
Dioscorea_rotundata 182
Glycine_max 215
Gossypium_raimondii 336
Helianthus_annuus 527
Hordeum_vulgare_subsp._vulgare 145
Leersia_perrieri 94
Lupinus_angustifolius 114
Manihot_esculenta 122
Medicago_truncatula 189
Musa_acuminata_subsp._malaccensis 623
Nicotiana_attenuata 51
Oryza_barthii 107
Oryza_brachyantha 189
Oryza_glaberrima 283
Oryza_glumipatula 168
Oryza_meridionalis 111
Oryza_nivara 96
Oryza_punctata 89
Oryza_rufipogon 81
Oryza_sativa_Indica_Group 392
Oryza_sativa_Japonica_Group 376
Ostreococcus_lucimarinus_CCE9901 10
Panicum_hallii_var._filipes 114
Panicum_hallii_var._hallii 102
Phaseolus_vulgaris 571
Physcomitrella_patens 38
Populus_trichocarpa 157
Prunus_persica 71
Setaria_italica 384
Solanum_lycopersicum 427
Solanum_tuberosum 296
Sorghum_bicolor 37
Theobroma_cacao 50
Trifolium_pratense 437
Triticum_aestivum 438
Triticum_dicoccoides 642
Vigna_angularis 189
Vigna_radiata_var._radiata 59
Vitis_vinifera 380
Zea_mays 176
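
The per-species counts above are easy to load for ranking or filtering. The following is a minimal sketch that assumes each data row is a species identifier followed by one integer, as in the table; the function names are illustrative, and only a few rows are copied in as sample input.

```python
from typing import Dict, List, Tuple

# A few rows copied from the table above; the full table has the same shape.
SAMPLE = """\
Species_id Splitgene_counts
Arabidopsis_thaliana 31
Musa_acuminata_subsp._malaccensis 623
Triticum_dicoccoides 642
Ostreococcus_lucimarinus_CCE9901 10
"""


def parse_split_gene_counts(text: str) -> Dict[str, int]:
    """Map species identifier to split-gene count, skipping the header."""
    counts: Dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2 or not parts[1].isdigit():
            continue  # header row, blank line, or malformed row
        counts[parts[0]] = int(parts[1])
    return counts


def top_species(counts: Dict[str, int], n: int = 3) -> List[Tuple[str, int]]:
    """Return the n species with the most putative split genes."""
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:n]


split_counts = parse_split_gene_counts(SAMPLE)
for species, count in top_species(split_counts, 2):
    print(species, count)
```

Raw counts are not normalized by the number of genes per genome, so a high count does not by itself indicate a poorer annotation.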

 

Genomic alignments

Here is a summary of all genomic alignments. New whole-genome alignments were added for A. chinensis, P. hallii FIL2 and P. hallii HAL2.

Synteny data

Protein Annotation, GO, Xref Protein domain information 

These were generated for the new and updated genomes.

Mart

_______________________________________________________________________________________________________________________________________________________

Infrastructure



  • Web Services: Gramene's web services page documents many ways to directly connect to and analyze our databases.
  • Public MySQL Server

Our partner Ensembl Genomes offers a public, read-only MySQL server with copies of the species-specific and comparative genomic databases that we use. To use this with the mysql command-line client:

$ mysql -hmysql-eg-publicsql.ebi.ac.uk -P4157 -uanonymous

Please note that the versioning scheme used at this public database differs from ours; Gramene's release 61_96 is set as version 43_96 at Ensembl.
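
The version offset is easy to get wrong when scripting against the public server. The sketch below builds a `mysql` client command that lists databases for the Ensembl version matching a Gramene release. The host, port, user and the 61_96 to 43_96 mapping come from the text above. The `core_<version>` naming pattern for databases is an assumption based on the usual Ensembl convention, and the helper names are illustrative.

```python
import shlex
from typing import Dict, List

HOST = "mysql-eg-publicsql.ebi.ac.uk"
PORT = 4157
USER = "anonymous"

# Only the mapping stated in these release notes; extend as needed.
GRAMENE_TO_ENSEMBL: Dict[str, str] = {"61_96": "43_96"}


def ensembl_version(gramene_release: str) -> str:
    """Translate a Gramene release label to the Ensembl version label."""
    return GRAMENE_TO_ENSEMBL[gramene_release]


def list_databases_command(gramene_release: str) -> List[str]:
    """Build argv for the mysql client; nothing is executed here."""
    # Assumption: core databases contain 'core_<ensembl version>' in their name.
    pattern = f"%core_{ensembl_version(gramene_release)}%"
    return [
        "mysql",
        f"-h{HOST}",
        f"-P{PORT}",
        f"-u{USER}",
        "-e",
        f"SHOW DATABASES LIKE '{pattern}'",
    ]


if __name__ == "__main__":
    # Print a shell-quoted command line that can be pasted into a terminal.
    print(shlex.join(list_databases_command("61_96")))
```

Passing the returned list to `subprocess.run` would execute the query, provided the `mysql` client is installed and the server is reachable.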

  • Website and coding updates: The latest Plant Reactome pathway data has been re-indexed and made available via Gramene search.


Pathways (Plant Reactome): Plant Reactome Release Summary - Version 18 (Gramene r61)

  • Website updates:
    For user and data security, SSL (Secure Sockets Layer) has now been implemented on Plant Reactome. The entire site and all services are now available over https. The latest Plant Reactome pathway data has also been re-indexed and made available via Gramene search.

  • Analytical Tools: Currently, Plant Reactome supports researchers with the following analytical tools: 

    • Search for gene/protein, metabolites, pathways
    • Upload and analyze gene-expression data on plant pathways
    • Upload and analyze gene-gene interaction data on plant pathways
    • Compare reference rice pathways with pathways from any of 82 projected species currently hosted by Plant Reactome.
  • Curation of reference rice pathways: 
    • We have added 4 newly curated pathways, 2 updated pathways and 1 "container" pathway, resulting in a total of 298 reference rice pathways.
    • New Pathways:
      • Root hair development
      • Phytic Acid biosynthesis (lipid independent)
      • Response to aluminum stress
      • Root-specific gene-network of NAC10_TF induced by drought, salinity, and ABA
    • Updated pathways:
      • HSFA7-regulatory network induced by drought and ABA
      • SNAC1 transcription network involved in drought and salinity tolerance

Example of recently curated rice pathway: Response to aluminum stress

*data from sequenced transcriptomes
^ projections currently exclude cell-cycle pathways and annotations
Planteome Inparanoid data was kindly provided by the Planteome project
When available, outgoing links from gene product IDs mapped to reactions are hyperlinked to the respective entries in collaborator databases and online resources.

Species Pathways Reactions Genes Sequence Source Homology Method
Oryza sativa 298 1723 1824 UniProt Curated Reference
Actinidia chinensis 238 594 1410 Ensembl Gramene Compara
Aegilops tauschii 236 627 1123 Ensembl Gramene Compara
Amborella trichopoda 236 573 682 Ensembl Gramene Compara
Arabidopsis halleri 235 582 1021 Ensembl Gramene Compara
Arabidopsis lyrata 235 584 1053 Ensembl Gramene Compara
Arabidopsis thaliana 236 590 1059 Ensembl Gramene Compara
Arachis duranensis 245 624 1057 PeanutBase Inparanoid
Arachis ipaensis 240 594 949 PeanutBase Inparanoid
Beta vulgaris 235 576 747 Ensembl Gramene Compara
Brachypodium distachyon 231 619 1031 Ensembl Gramene Compara
Brassica napus 238 590 3209 Ensembl Gramene Compara
Brassica oleracea 236 581 1619 Ensembl Gramene Compara
Brassica rapa 235 578 1598 Ensembl Gramene Compara
Cajanus cajan 240 602 1235 LegumeInfo Inparanoid
Capsicum annuum 242 583 1214 PMID: 24441736 Inparanoid
Chlamydomonas reinhardtii 164 332 273 Ensembl Gramene Compara
Chondrus crispus 140 210 183 Ensembl Gramene Compara
Cicer arietinum 241 595 1461 NCBI Inparanoid
Citrus sinensis 240 594 2366 Phytozome Inparanoid
Coffea canephora 240 590 1050 PMID:25190796 Inparanoid
Corchorus capsularis 233 554 781 Ensembl Gramene Compara
Cucumis sativus 237 590 834 Ensembl Gramene Compara
Cyanidioschyzon merolae 142 216 169 Ensembl Gramene Compara
Daucus carota 235 574 1091 Ensembl Gramene Compara
Dioscorea rotundata 222 480 605 Ensembl Gramene Compara
Erythranthe guttata 213 502 670 Phytozome Inparanoid
Eucalyptus grandis 215 501 707 Phytozome Inparanoid
Fragaria vesca 240 566 1014 Phytozome Inparanoid
Galdieria sulphuraria 152 253 206 Ensembl Gramene Compara
Glycine max 236 599 1947 Ensembl Gramene Compara
Gossypium raimondii 238 602 1420 Ensembl Gramene Compara
Helianthus annuus 237 590 1576 Ensembl Gramene Compara
Hordeum vulgare 234 596 1043 Ensembl Gramene Compara
Jatropha curcas 213 493 534 KDRI (Kazusa) Inparanoid
Leersia perrieri 236 618 1001 Ensembl Gramene Compara
Lupinus angustifolius 237 597 1430 Ensembl Gramene Compara
Malus domestica 238 576 2009 PMID: 20802477 Inparanoid
Manihot esculenta 236 597 1180 Ensembl Gramene Compara
Medicago truncatula 236 587 1182 Ensembl Gramene Compara
Musa acuminata 227 566 1280 Ensembl Gramene Compara
Nicotiana attenuata 232 507 703 Ensembl Gramene Compara
Oryza australiensis * 234 545 1704 OMAP/OGE Inparanoid
Oryza barthii 238 642 1057 Ensembl Gramene Compara
Oryza brachyantha 236 625 1013 Ensembl Gramene Compara
Oryza glaberrima 242 636 1051 Ensembl Gramene Compara
Oryza glumaepatula 239 642 1066 Ensembl Gramene Compara
Oryza granulata 239 568 3509 OMAP/OGE Inparanoid
Oryza indica 242 664 1157 Ensembl Gramene Compara
Oryza longistaminata * 236 604 917 Ensembl Gramene Compara
Oryza meridionalis 233 581 914 Ensembl Gramene Compara
Oryza minuta * 240 588 2092 OMAP/OGE Inparanoid
Oryza nivara 238 652 1066 Ensembl Gramene Compara
Oryza officinalis * 237 579 1882 OMAP/OGE Inparanoid
Oryza punctata 235 628 1067 Ensembl Gramene Compara
Oryza rufipogon 238 641 1076 Ensembl Gramene Compara
Oryza sativa aus subgroup 191 287 365 PMID: 24578372 Inparanoid
Ostreococcus lucimarinus 156 274 226 Ensembl Gramene Compara
Panicum hallii FIL2 235 639 1076 Ensembl Gramene Compara
Panicum hallii var. hallii HAL2 237 642 1109 Ensembl Gramene Compara
Phaseolus vulgaris 239 599 1065 Ensembl Gramene Compara
Phoenix dactylifera 234 557 1097 PMID: 23917264 Inparanoid
Physcomitrella patens 214 513 1060 Ensembl Gramene Compara
Picea abies 236 530 1315 Congenie Inparanoid
Pinus taeda 225 475 1308 TreeBase Inparanoid
Populus trichocarpa 236 596 1329 Ensembl Gramene Compara
Prunus persica 239 606 944 Ensembl Gramene Compara
Selaginella moellendorffii 220 522 1226 Ensembl Gramene Compara
Setaria italica 237 631 1114 Ensembl Gramene Compara
Solanum lycopersicum 235 583 968 Ensembl Gramene Compara
Solanum tuberosum 229 561 967 Ensembl Gramene Compara
Sorghum bicolor 237 639 1095 Ensembl Gramene Compara
Synechocystis sp. PCC 6803 157 311 263 Jaiswal Inparanoid
Theobroma cacao 238 598 877 Ensembl Gramene Compara
Trifolium pratense 237 588 1006 Ensembl Gramene Compara
Triticum aestivum 238 642 3574 Ensembl Gramene Compara
Triticum dicoccoides 238 634 2123 Ensembl Gramene Compara
Triticum turgidum * 238 597 2753 PMID: 23800085 Inparanoid
Triticum urartu 229 544 864 Ensembl Gramene Compara
Vigna angularis 237 582 1003 Ensembl Gramene Compara
Vigna radiata 226 543 891 Ensembl Gramene Compara
Vitis vinifera 238 602 930 Ensembl Gramene Compara
Zea mays 235 621 1380 Ensembl Gramene Compara
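
Because species names and sources in this table contain spaces (and sometimes digits), splitting rows on whitespace is unreliable. The sketch below anchors on the three consecutive integer columns instead. It keeps the trailing sequence-source and homology-method text as one field, since the boundary between those two columns is not recoverable from the flattened rows. The class and function names are illustrative.

```python
import re
from typing import NamedTuple, Optional

# Greedy name, then three integers, then trailing text that starts with a non-digit.
ROW = re.compile(
    r"^(?P<name>.+)\s+(?P<pathways>\d+)\s+(?P<reactions>\d+)"
    r"\s+(?P<genes>\d+)\s+(?P<rest>\D.*)$"
)


class Projection(NamedTuple):
    species: str
    from_transcriptome: bool  # True when the name carries the '*' footnote mark
    pathways: int
    reactions: int
    genes: int
    source_and_method: str


def parse_row(line: str) -> Optional[Projection]:
    """Parse one table row; return None for headers and other lines."""
    match = ROW.match(line.strip())
    if match is None:
        return None
    name = match.group("name").strip()
    starred = name.endswith("*")
    if starred:
        name = name[:-1].strip()
    return Projection(
        name,
        starred,
        int(match.group("pathways")),
        int(match.group("reactions")),
        int(match.group("genes")),
        match.group("rest").strip(),
    )


rows = [
    "Oryza sativa 298 1723 1824 UniProt Curated Reference",
    "Capsicum annuum 242 583 1214 PMID: 24441736 Inparanoid",
    "Synechocystis sp. PCC 6803 157 311 263 Jaiswal Inparanoid",
    "Triticum turgidum * 238 597 2753 PMID: 23800085 Inparanoid",
]
parsed = [p for p in map(parse_row, rows) if p is not None]
fewest = min(parsed, key=lambda p: p.pathways)
print(fewest.species, fewest.pathways)
```

The greedy name group matters here: a lazy match would read the strain number in "Synechocystis sp. PCC 6803" as the pathway count.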

NOTE: The pathway counts for both reference and projected species include a few organizational “container” names, such as “Hormone biosyntheses” and “Metabolism”. Additionally, the bulk of the projected pathways occur in the areas of metabolic and regulatory function, whereas the rice reference data set has additional pathways related to cell cycle functions. We are not currently using these additional pathways as a source for orthology projection.

The Plant Reactome increasingly includes curated regulatory and developmental pathways, which require more reference DNA and RNA sequence elements, in addition to the traditional protein-coding elements. These sequence elements are not included in Reactome orthoinference at this time, although we are actively working to enhance the projection process to include these elements on projected pathways in future releases.

Please let us know if you have questions or suggestions.

The Gramene Team
www.gramene.org