Exploring the utility and limits of target enrichment methods to study polyploidy and reticulate evolution

Quatela, Anne-Sophie [1], Cangren, Patrik [2], De Boer, Hugo [3], Bacon, Christine [1], Oxelman, Bengt [4].

Targeted long reads from herbarium specimens enable the identification of allopolyploidization events in arctic/subarctic Silene sect. Physolychnis.

Reticulation events are a ubiquitous feature of plant genomes, occurring in all major clades of angiosperms. Phylogenies based on low-copy markers enable the identification of homoeologs (i.e. homologous genes that diverged after a speciation event and were brought back together by hybridization). Together with the development of next-generation sequencing (NGS), the emergence of probe-based target enrichment of low-copy markers has made the inference of allopolyploidization events feasible. Target enrichment methods are commonly used with the short-read Illumina sequencing platform, providing good results from both fresh material and herbarium specimens (i.e. fragmented DNA). However, short reads make the assignment of homoeologs and alleles difficult in polyploid genome assembly. Our study aims to solve this issue by combining probe-based target enrichment with the long-read PacBio Sequel SMRT sequencing technology. Considering the significant value of herbaria as biorepositories, we also investigate how long the reads obtained from herbarium specimen DNA can be. Our study model is the Arctic and subarctic representatives of Silene sect. Physolychnis, a complex in which the most recent taxonomic treatment recognizes 13 taxonomic species and subspecies. We enrich 50 loci from herbarium specimens with a Silene-specific probe set (104 kb total length, 1-2 kb individual reads). We demonstrate that (i) 1-3 kb of enriched loci can be obtained from herbarium specimens, and (ii) allele phasing enables the identification of homoeologs and thus the phylogenetic reconstruction of allopolyploidization events.

