A Gold-Standard Reference Genome for Grapevine

Cultivated grapevine (Vitis vinifera ssp. vinifera) is highly heterozygous, so the nearly homozygous PN40024 clone was selected when the grapevine genome was first sequenced, and it is now used as a reference in many grapevine studies. The third assembled version of this genotype (12X.v2) relied on six dense parental genetic maps (Canaguier et al. 2017). Although it is of high quality, this haploid assembly has some deficiencies that recent advances in sequencing technologies make it possible to address.

Researchers from the University of Strasbourg, Bielefeld University, the USDA ARS, Cold Spring Harbor Laboratory and other collaborating institutions sought to develop an improved version of the PN40024 genome sequence assembly by combining the top-quality Sanger contigs of the 12X version with long reads generated by Single-Molecule Real-Time (SMRT) sequencing (PacBio). The improved version, PN40024.v4, is composed of 640 scaffolds with a cumulative size of 474.5 Mb (N50 size of 6.5 Mb). Compared with 12X.v2, the new version has three times fewer scaffolds, a doubled N50, and a much lower proportion of unknown bases, down from 3.1% of the former assembly to 0.4% of PN40024.v4. The new version thus increases both scaffold continuity and the number of informative sequences.

Rather than annotating the genome with a completely new set of gene models, the researchers used liftover to retain the high-quality gene models that the community is familiar with and has manually curated over the years. This approach was complemented with an automated annotation workflow, resulting in a total set of 35,230 genes. These advances in the genome assembly and gene models establish the PN40024 genome as a gold-standard reference and will contribute a robust backbone to an upcoming grapevine pangenome.
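For readers unfamiliar with the assembly statistics quoted above, the following minimal Python sketch shows one common way to compute an N50 and a fraction of unknown (N) bases. It is an illustration only, not the authors' pipeline: the function names n50 and unknown_fraction are hypothetical, and the toy lengths and sequences are invented for the example rather than taken from PN40024.v4 or 12X.v2.

```python
def n50(lengths):
    """Return the N50 of a list of scaffold lengths.

    The N50 is the length L such that scaffolds of length >= L
    together cover at least half of the total assembly size.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no scaffold lengths given")


def unknown_fraction(sequences):
    """Return the fraction of bases that are 'N' across all sequences."""
    total = sum(len(seq) for seq in sequences)
    unknown = sum(seq.upper().count("N") for seq in sequences)
    return unknown / total


# Toy data (invented for illustration, not real grapevine scaffolds).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
toy_sequences = ["ACGTNNACGT", "ACGTACGTAC"]

toy_n50 = n50(toy_lengths)
toy_unknown = unknown_fraction(toy_sequences)
print(f"N50 = {toy_n50}, unknown bases = {toy_unknown:.1%}")
```

With real data, the lengths and sequences would come from the assembly FASTA file. A larger N50 with fewer scaffolds indicates a more contiguous assembly, and a lower unknown fraction means more of the sequence is informative.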

Vitis.gramene.org examples:

The paper mentions that “The ‘Flowering locus T’ (FT) and the ‘Adenine phosphoribosyltransferase 3’ (APRT3) genes are absent and truncated in PN12X.v2, respectively.”

 

Figure 1: APT3 (Vitvi02g04155) in PanVitis

 

Figure 2: APT3 (Vitvi02g04155) in Curation Tree Tool

 

Figure 3: APT3 (Vitvi02g04155) with genetic variants from Myles et al. (2010) in Gramene

 

Figure 4: Vitvi07g04487 is the ortholog of Arabidopsis FT in V4 but this model did not exist in V3.

 

Figure 5: Vitvi07g04487 (FT) in Gramene. This V4 model did not exist in V3. Two transcripts can be seen for the gene, and an Affymetrix primer maps to the last exon of both of them.

References:

Canaguier A, Grimplet J, Di Gaspero G, Scalabrin S, Duchêne E, Choisne N, Mohellibi N, Guichard C, Rombauts S, Le Clainche I, Bérard A, Chauveau A, Bounon R, Rustenholz C, Morgante M, Le Paslier MC, Brunel D, Adam-Blondon AF. A new version of the grapevine reference genome assembly (12X.v2) and of its annotation (VCost.v3). Genom Data. 2017 Sep 18;14:56-62. PMID: 28971018. DOI: 10.1016/j.gdata.2017.09.002. 

Velt A, Frommer B, Blanc S, Holtgräwe D, Duchêne É, Dumas V, Grimplet J, Hugueney P, Kim C, Lahaye M, Matus JT, Navarro-Payá D, Orduña L, Tello-Ruiz MK, Vitulo N, Ware D, Rustenholz C. An improved reference of the grapevine genome reasserts the origin of the PN40024 highly-homozygous genotype. G3 (Bethesda). 2023 Mar 26:jkad067. PMID: 36966465. DOI: 10.1093/g3journal/jkad067.

Image 1: PN40024 white berry cluster. Photo credit: Camille Rustenholz.