Oryza PanGenome R4 - Declaration of intentions

What’s coming in Oryza PanGenome Release 4?

The Oryza PanGenome R4 is due to be released in May 2022.

 

New Genomes

  • Oryza sativa indica XI-adm var. MH63 

  • Oryza sativa indica XI-1A var. ZS97 

  • Oryza sativa japonica var. KitaakeX

 

Asian cultivated rice is a staple food for half of the world's population. With a genome size of ~390 Mb (n=12), rice has the smallest genome among the domesticated cereals, making it particularly amenable to genomic studies. A reanalysis of population structure in rice by Rod Wing and colleagues showed that it can be subdivided into 15 subpopulations (XI-1A, XI-1B1, XI-1B2, XI-2B, XI-2A, XI-3B1, XI-3A, XI-3B2, GJ-trop1, GJ-trop2, GJ-subtrp, GJ-tmp, cA2, cA1 and cB, where XI = Xian-indica, GJ = Geng-japonica, cA = circum-Aus, cB = circum-Basmati, trop = tropical and subtrp = subtropical) and 4 admixed populations (GJ-adm, XI-adm, admixed and cA-adm) (Zhou et al, 2020). Sixteen rice varieties were then selected based on genetic diversity and origin to represent the 15 subpopulations and the largest admixed population, XI-adm. This collection of 16 Platinum Standard RefSeqs (PSRefSeq, also known as the 16 MAGIC rice cultivars) can be used as a template to detect virtually all standing natural variation that exists in the pan-genome of cultivated Asian rice. In this fourth release, we complete the 16 accessions of the PSRefSeq collection with Oryza sativa indica cv. Minghui 63 (MH63), a representative of the XI-adm population, and O. sativa indica cv. Zhenshan 97 (also written Zhengshan 97; ZS97), representing the XI-1A subpopulation.
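The naming scheme above can be written down as a small lookup, which may help when grouping accessions by subpopulation. This is only an illustrative sketch: the labels and prefix meanings are taken from the text, while the variable names and the major_group helper are hypothetical and not part of any Oryza PanGenome tool or API.

```python
# Subpopulation labels as listed in the text (Zhou et al, 2020).
SUBPOPULATIONS = [
    "XI-1A", "XI-1B1", "XI-1B2", "XI-2B", "XI-2A", "XI-3B1", "XI-3A", "XI-3B2",
    "GJ-trop1", "GJ-trop2", "GJ-subtrp", "GJ-tmp",
    "cA2", "cA1", "cB",
]

# Admixed populations as listed in the text.
ADMIXED = ["GJ-adm", "XI-adm", "admixed", "cA-adm"]

# Prefix meanings given in the text.
GROUP_NAMES = {
    "XI": "Xian-indica",
    "GJ": "Geng-japonica",
    "cA": "circum-Aus",
    "cB": "circum-Basmati",
}


def major_group(label):
    """Hypothetical helper: map a label to its major group by prefix.

    Returns None for labels without a known prefix, such as the
    generic "admixed" population.
    """
    for prefix, name in GROUP_NAMES.items():
        if label.startswith(prefix):
            return name
    return None


# The PSRefSeq collection targets the 15 subpopulations plus XI-adm.
PSREFSEQ_TARGETS = SUBPOPULATIONS + ["XI-adm"]

if __name__ == "__main__":
    for label in PSREFSEQ_TARGETS:
        print(label, major_group(label))
```

Counting the entries confirms the arithmetic in the paragraph: 15 subpopulations plus the XI-adm population give the 16 targets of the PSRefSeq collection.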

 

The Kitaake accession is a fast-cycling rice variety (seed to seed in nine weeks), making it an excellent model for functional genomics in rice. The accession was used to generate mutagenized lines, of which 1,504 have been sequenced and indexed (Li et al, 2017). The accession carries the rice XA21 immune receptor, and its genome was sequenced, assembled, annotated and analyzed for variation by Pamela Ronald and collaborators (Jain et al, 2019) in the Department of Plant Pathology and the Genome Center of the University of California, Davis, and the Feedstocks Division of the Joint BioEnergy Institute, Lawrence Berkeley National Laboratory.

 

The protein-based gene trees will be updated to include the new genomes, improving the robustness of the protein family trees, and the new genomes will be made available for BLAST alignments.