基于基因组survey数据的疣吻沙蚕微卫星特征分析及多态标记开发

Characterization of microsatellites and polymorphic marker development in ragworm (Tylorrhynchus heterochaetus) based on genome survey data

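  • Summary: To characterize the genome of Tylorrhynchus heterochaetus and develop microsatellite markers efficiently, in support of germplasm conservation and genetic improvement of new varieties, a whole-genome survey was carried out by low-depth high-throughput sequencing. K-mer analysis estimated the genome size at 759.53 Mb, with a heterozygosity rate of 1.41% and a repetitive sequence proportion of 45.92%. Preliminary assembly yielded 2 181 621 scaffolds with a total length of 840 375 821 bp. A total of 130 216 microsatellite loci were detected in the genome sequence, at an abundance of 154.9 loci per Mb. Repeat numbers were concentrated between 4 and 18 copies. Mononucleotide repeats were the most frequent (35.00%), followed by dinucleotide (32.48%) and trinucleotide (14.42%) repeats. The dominant dinucleotide and trinucleotide motifs were AT/AT and AAT/ATT, respectively, showing an A/T bias. Fifteen polymorphic markers were screened from 50 randomly selected primer pairs, and 87 alleles were detected in 30 individuals. The number of alleles (Na) was 2.000 to 12.000 (mean 5.800), the effective number of alleles (Ne) was 1.164 to 6.713 (mean 3.328), the expected heterozygosity (He) was 0.141 to 0.789 (mean 0.561), and the polymorphic information content (PIC) was 0.136 to 0.776 (mean 0.511). Thirteen of the loci were highly or moderately polymorphic and are of high practical value in genetic analysis. The results indicate that the T. heterochaetus genome is complex and that its microsatellite loci are diverse in type and have good polymorphic potential, providing effective marker resources for germplasm evaluation, population genetics and molecular breeding research.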

     

    Abstract: To characterize the genome of Tylorrhynchus heterochaetus and efficiently develop microsatellite markers, in support of germplasm conservation and genetic improvement of new varieties, we conducted a whole-genome survey using low-depth high-throughput sequencing. A total of 57.48 Gb of clean data were generated after quality control of the raw data. K-mer analysis estimated the genome size of T. heterochaetus at 759.53 Mb, with a heterozygosity rate of 1.41% and a repetitive sequence proportion of 45.92%. Preliminary assembly yielded 2 181 621 scaffolds with a total length of 840 375 821 bp. A total of 130 216 microsatellite loci were detected, at a density of 154.9 loci per Mb. The number of repeat units per locus mostly ranged from 4 to 18. Mononucleotide loci were the most frequent (35.00%), followed by dinucleotide (32.48%) and trinucleotide (14.42%) loci. AT/AT and AAT/ATT were the dominant dinucleotide and trinucleotide motifs, respectively, indicating an A/T bias. Fifteen polymorphic loci were identified from 50 randomly selected primer pairs, and 87 alleles were amplified in a T. heterochaetus population of 30 individuals. The number of alleles per locus (Na) ranged from 2.000 to 12.000, with a mean of 5.800. The effective number of alleles (Ne) and expected heterozygosity (He) ranged from 1.164 to 6.713 and from 0.141 to 0.789, with means of 3.328 and 0.561, respectively. The polymorphic information content (PIC) ranged from 0.136 to 0.776, with a mean of 0.511. Thirteen loci were highly or moderately polymorphic and are of high practical value in genetic analysis. In conclusion, the T. heterochaetus genome is complex, and its microsatellite loci are diverse in type and have good polymorphic potential. The results provide effective marker resources for germplasm evaluation, population genetics and molecular breeding research.
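
The abstract reports Na, Ne, He and PIC per locus but does not say which software or estimators were used. The sketch below is therefore only an illustration under stated assumptions: it uses the textbook definitions (Ne = 1/Σp², He = 1 − Σp², and the Botstein et al. form of PIC) and the conventional PIC classes (above 0.5 high, 0.25 to 0.5 moderate, below 0.25 low). The function names and the toy genotype input are hypothetical and are not taken from the study.

```python
from collections import Counter
from itertools import combinations


def allele_frequencies(genotypes):
    """Return {allele: frequency} from diploid genotypes given as (a, b) tuples."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def diversity_stats(genotypes):
    """Compute Na, Ne, He and PIC for one locus.

    Assumed (textbook) definitions, not confirmed by the abstract:
      Ne  = 1 / sum(p_i^2)
      He  = 1 - sum(p_i^2)
      PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    """
    freqs = list(allele_frequencies(genotypes).values())
    homozygosity = sum(p * p for p in freqs)
    he = 1.0 - homozygosity
    pic = he - sum(2 * p * p * q * q for p, q in combinations(freqs, 2))
    return {"Na": len(freqs), "Ne": 1.0 / homozygosity, "He": he, "PIC": pic}


def pic_class(pic):
    """Classify a locus by PIC using the conventional (assumed) thresholds."""
    if pic > 0.5:
        return "high"
    if pic >= 0.25:
        return "moderate"
    return "low"


def ssr_density(n_loci, assembly_bp):
    """Microsatellite loci per megabase of assembled sequence."""
    return n_loci / (assembly_bp / 1e6)


if __name__ == "__main__":
    # Toy locus: four heterozygotes carrying two alleles at equal frequency.
    toy_genotypes = [(152, 156)] * 4
    stats = diversity_stats(toy_genotypes)
    print(stats, pic_class(stats["PIC"]))
    # Density recomputed from the loci count and assembly length in the abstract.
    print(round(ssr_density(130_216, 840_375_821), 2))
```

For the toy locus with two equally frequent alleles, these definitions give Ne = 2, He = 0.5 and PIC = 0.375, which falls in the moderate class. Recomputing the density from the reported 130 216 loci and 840 375 821 bp gives roughly 154.9 loci per Mb, consistent with the abstract.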

     
