Putative gene split models are now available for 23 plant reference genomes, based on the latest Gramene release, 36b (Gramene36bEnsembl70). Split gene models are commonly an annotation artifact in which a single gene is annotated as two or more genes because of incomplete evidence, but they can also result from legitimate evolutionary processes. The Compara Gene Tree method predicts a special class of within-species paralogs called "contiguous_gene_split". A contiguous_gene_split is called when two apparently paralogous genes lie on the same strand and in close proximity (<1 Mb) but have little or no overlapping sequence. The putative gene split models for each species can be downloaded from the Gramene FTP site.
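The calling criteria described above can be illustrated with a minimal Python sketch. This is not the Compara implementation. The `Gene` record, the function names, and the `MAX_OVERLAP_FRACTION` cutoff are assumptions made for illustration, as is the requirement that both genes sit on the same sequence region; only the same-strand rule and the 1 Mb proximity limit come from the description. The overlap between the two genes is taken as a precomputed input, since how it is measured is not specified here.

```python
from dataclasses import dataclass

MAX_DISTANCE_BP = 1_000_000   # "close proximity (<1 Mb)" from the description
MAX_OVERLAP_FRACTION = 0.1    # hypothetical cutoff for "little or no" overlap


@dataclass(frozen=True)
class Gene:
    """Hypothetical minimal gene record (not an Ensembl API object)."""
    gene_id: str
    seq_region: str  # chromosome or scaffold name
    start: int       # 1-based, inclusive
    end: int         # 1-based, inclusive
    strand: int      # +1 or -1


def genomic_gap(a: Gene, b: Gene) -> int:
    """Bases lying between two genes; 0 if they abut or overlap."""
    return max(0, max(a.start, b.start) - min(a.end, b.end) - 1)


def is_contiguous_gene_split(a: Gene, b: Gene, overlap_fraction: float) -> bool:
    """Apply the three stated criteria to a pair of apparent paralogs.

    overlap_fraction is the share of sequence the two genes have in
    common, computed elsewhere (an assumption of this sketch).
    """
    return (
        a.seq_region == b.seq_region          # assumed: same chromosome/scaffold
        and a.strand == b.strand              # same strand
        and genomic_gap(a, b) < MAX_DISTANCE_BP
        and overlap_fraction <= MAX_OVERLAP_FRACTION
    )


# Two made-up neighbouring genes on the forward strand with no shared sequence.
left = Gene("geneA", "1", 1_000, 2_000, 1)
right = Gene("geneB", "1", 2_500, 4_000, 1)
print(is_contiguous_gene_split(left, right, overlap_fraction=0.0))
```

With the made-up coordinates above the pair satisfies all three conditions; flipping the strand of either gene, moving them more than 1 Mb apart, or raising the overlap above the assumed cutoff would each cause the pair to be rejected.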
Brief statistics of the gene split models for each species are given below.
No.  Species                              contiguous_gene_split
1    Arabidopsis lyrata subsp. lyrata     172
2    Arabidopsis thaliana                 34
3    Brachypodium distachyon              150
4    Brassica rapa subsp. pekinensis      150
5    Chlamydomonas reinhardtii            96
6    Cyanidioschyzon merolae              8
7    Glycine max                          404
8    Hordeum vulgare subsp. vulgare       56
9    Musa acuminata                       1558
10   Oryza barthii                        1
11   Oryza brachyantha                    334
12   Oryza glaberrima                     436
13   Oryza sativa Indica Group            730
14   Oryza sativa Japonica Group          628
15   Physcomitrella patens subsp. patens  248
16   Populus trichocarpa                  2076
17   Selaginella moellendorffii           172
18   Setaria italica                      362
19   Solanum lycopersicum                 1138
20   Solanum tuberosum                    648
21   Sorghum bicolor                      1036
22   Vitis vinifera                       876
23   Zea mays                             1160