Biocell DOI:10.32604/biocell.2021.015053 | www.techscience.com/journal/biocell
Article
Development and characterization of Simple Sequence Repeat (SSR) markers from the genomic sequence of sweet potato [Ipomoea batatas L. (Lam)]
Department of Biotechnology, College of Coastal Agricultural Sciences, Guangdong Ocean University, Zhanjiang, 524088, China
*Address correspondence to: Hongbo Zhu, tdzhu@126.com
Received: 18 November 2020; Accepted: 11 January 2021
Abstract: Sweet potato is a multifunctional root crop with many essential nutrients and bioactive compounds. Due to its genetic complexity and lack of genomic resources, efficient genetic studies and cultivar development lag far behind other major crops. Simple sequence repeats (SSRs) offer an effective molecular marker technology for molecular-based breeding and for locating important loci in crop plants, but only a few have previously been developed in sweet potato. To further explore new SSR markers and accelerate their use in sweet potato genetic studies, genome-wide characterization and development of SSR markers were performed using the recently published genome of sweet potato cultivar, Taizhong6. In this study, a set of 2,431 primer pairs were developed from 133,727 SSRs identified in the sweet potato genome using the Perl script MISA software. The average frequency was one SSR per 6.26 kb, with dinucleotides (38.5%) being the most dominant repeat motif. The main motif types in all repeats were AT/AT, AAT/ATT, A/T, AAAT/ATTT, AAAAT/ATTTT and AAAAAT/ATTTTT accounting for 78.29% of the total SSRs. 50% of the 100 randomly selected primer pairs amplified 251 alleles, and the average number of alleles was 5.02 per locus for values ranging between 1 and 13. The UPGMA cluster analysis grouped the 24 sweet potato genotypes into four clusters at a similarity coefficient of 0.68. The SSR markers currently developed will provide valuable genetic resources for germplasm identification, genetic diversity analysis, and functional genomics studies in sweet potato and related species.
Keywords: SSR markers; Sweet potato; Genetic diversity; Polymorphism
Sweet potato [Ipomoea batatas L. (Lam)] is an essential food crop belonging to the Convolvulaceae family. Cultivated sweet potato is now grown in over 100 countries worldwide but is native to Central America (Rozen and Skaletsky, 2000), in a region spanning the Orinoco River of Venezuela and the Yucatán Peninsula of Mexico. Hence, this region is considered the center of diversity and rich morphological variation of sweet potato (Roullier et al., 2013). Sweet potato is of great economic importance because it is an excellent source of dietary fiber, vitamins, minerals, and phenolic compounds (Lebot et al., 2016). Furthermore, it is greatly appreciated for its health-promoting functions, such as anti-carcinogenic and cardiovascular disease-preventing properties, in addition to its radical-scavenging activity (Chandrasekara and Josheph Kumar, 2016). Sweet potato is a highly heterogeneous auto-hexaploid (2n = 6x = 90) plant with complex genetics and a large genome size (~3 Gb). These complexities, coupled with the lack of genomic resources (Yan et al., 2015), hinder breeding progress. Thus, genetic and functional analyses, as well as cytological studies in sweet potato, lag far behind those of other major food crops such as wheat, rice, and maize.
The advancement of next-generation sequencing has generated scores of datasets for many plant species that provide useful genomic materials for developing efficient molecular markers for genetic analyses (Yang et al., 2015a). Molecular marker technology, benefiting from advances in high-throughput DNA sequencing, has been reported to play an essential role in genetic diversity and relationship assessment in plants (Kumar et al., 2009). Currently, this technology is widely used, through marker-assisted selection, as an effective tool to improve genomic selection and accelerate breeding progress in many plant species. In recent years, molecular markers such as random amplified polymorphic DNA (RAPD), amplified fragment length polymorphism (AFLP), single nucleotide polymorphism (SNP), and microsatellite or simple sequence repeat (SSR) have been employed for origin and dispersal studies (Roullier et al., 2011), genetic diversity analysis (Tumwegamire et al., 2011), and construction of genetic linkage maps (Zhao et al., 2013) in sweet potato. Interestingly, despite the rapid development of a new generation of molecular markers such as InDels and SNPs (a more stable and abundant type of genetic marker), SSR markers remain indispensable for many genetic-based studies in sweet potato (Edwards and Batley, 2010; Liu et al., 2012). A simple sequence repeat (SSR) is a short repeat sequence composed of 1–6 bases, widely distributed in the genomes of both prokaryotic and eukaryotic organisms (Venter et al., 2001). SSR markers are extensively used as molecular markers for crop improvement and other genetic-based studies due to their allele specificity, high polymorphism, and co-dominant and multi-allelic nature (Silva et al., 2013; Yue et al., 2014). In sweet potato, SSR markers are utilized extensively for the construction of genetic maps, diversity analysis, and variety identification (Ngailo et al., 2016; Yada et al., 2015; Yang et al., 2015b).
SSRs are widely distributed in both transcribed and non-coding sequences, and the corresponding markers are generally described as EST- and genomic-SSRs, respectively (Morgante et al., 2002). EST-SSRs are mostly restricted to highly conserved transcribed regions and are less polymorphic. In contrast, genomic-SSRs are widely distributed throughout the genome and are highly polymorphic (La Rota et al., 2005). Recent advances in molecular marker technology have accelerated the large-scale development of genomic-SSR markers in many plants (Iniguez-Luy et al., 2008; Nunome et al., 2009; Wang et al., 2011a; Wen et al., 2010). To a large extent, SSR markers in sweet potato are derived from expressed sequences mined from public databases (Wang et al., 2011b), genomic libraries (Hu et al., 2004), and pyrosequencing (Tao et al., 2012; Xie et al., 2012).
Although some SSR markers have been developed and used in sweet potato, their number, quality, and availability are restricted: only a few can amplify or show polymorphisms among the diverse sweet potato varieties (Schafleitner et al., 2010; Wang et al., 2011b), and most of them are EST-SSR markers. The development and application of genomic-SSR markers, in contrast, remain limited in sweet potato because of the challenges in dissecting its genome sequence. Thus, there is a great need to develop novel genomic-SSR markers to accelerate genomic and genetic studies in sweet potato. In this study, we evaluated the frequency and relative number of SSRs in the sequenced genome of sweet potato (cv. Taizhong6), developed a set of 2,431 SSR markers from the assembled genomic sequences, and assessed the genetic diversity in 24 sweet potato cultivars. These SSR markers will offer new genetic resources for marker-assisted selection in sweet potato breeding, adding to the available resources for analyzing the molecular phylogeny and genetic diversity of sweet potato and related species.
Plant materials and DNA extraction
A total of 24 sweet potato genotypes were sampled (Tab. 1) for this study. The test materials were obtained from the Research Field of Guangdong Ocean University, Guangdong, China. Genomic DNA was extracted from fresh young leaves using the CTAB (cetyltrimethyl ammonium bromide) protocol with a few modifications (Porebski et al., 1997). DNA concentration and quality were determined using a NanoDrop ND-2000 (Thermo Scientific, Wilmington, DE, USA) and 1% agarose gel electrophoresis. Finally, the DNA was dissolved in 1× TE buffer (10 mM Tris-HCl and 1 mM EDTA, pH 8.0), and the working concentration was adjusted to 50 ng/μL with ddH2O. The genomic sequence of sweet potato cv. Taizhong6 was downloaded from the Ipomoea Genome Hub (https://ipomoea-genome.org/download_genome.htm).
SSRs in the sweet potato genome were identified using the Perl script MISA (MIcroSAtellite identification tool; http://pgrc.ipk-gatersleben.de/misa, Thiel et al., 2003). The minimum numbers of consecutive repeat units for mono- to hexa-nucleotide motifs were 20, 10, 7, 5, 4, and 4, respectively. A total of 2,431 SSR primer pairs were designed based on the selected SSR motifs using Primer3 ver. 4.0.0 (http://bioinfo.ut.ee/primer3/, Untergasser et al., 2012). The amplified product size was 100–280 bp, primer length ranged from 18 to 27 bp (optimum: 20 bp), and the minimum interval between two SSR sequences was 100 bp. The GC content ranged between 20% and 80%, the optimum melting temperature (Tm) was 60°C (range: 57°C to 63°C), and all other parameters were left at their default values.
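The repeat-number thresholds above can be illustrated with a small regular-expression search. The following is a minimal Python sketch, not MISA itself: the function name `find_ssrs` and the example sequence are hypothetical, only perfect (uninterrupted) repeats are reported, and compound SSRs and MISA's output format are not reproduced.

```python
import re

# Minimum repeat counts per motif length, taken from the text
# (mono- to hexa-nucleotide: 20, 10, 7, 5, 4, 4).
MIN_REPEATS = {1: 20, 2: 10, 3: 7, 4: 5, 5: 4, 6: 4}


def find_ssrs(sequence):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    Start positions are 0-based. Illustrative only; this is not the
    MISA implementation.
    """
    seq = sequence.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # ([ACGT]{k}) captures one motif copy; \1{n,} requires at least
        # n further copies, so n = min_rep - 1.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit
            # (e.g. "ATAT" is a dinucleotide repeat, not a tetranucleotide).
            if any(motif == motif[:d] * (unit_len // d)
                   for d in range(1, unit_len) if unit_len % d == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


# Hypothetical test sequence: 12 AT repeats, 7 AAT repeats, and a run of
# 19 A's that falls just below the mononucleotide threshold of 20.
example = "GGC" + "AT" * 12 + "CCG" + "AAT" * 7 + "G" + "A" * 19
ssrs = find_ssrs(example)
```

Here `ssrs` contains the AT and AAT repeats but not the 19-base A run, which shows how the thresholds decide what is counted as an SSR.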
SSR primer validation and marker amplification
From the developed primers, 50 primer pairs with good amplification effects and clear, stable bands were selected after the initial screening to amplify the genomic DNA of 24 sweet potato genotypes. The frequency and distribution of all SSRs were analyzed and expressed as the average length of sequence (kb) per SSR. SSR markers were grouped according to the location of the SSR motifs in the gene. To optimize PCR amplification conditions, a reaction volume of 20 μL was used, containing 1.5 μL genomic DNA (20 ng/μL), 0.4 μL Taq enzyme (3 U/μL), 2 μL 10× PCR buffer, 0.2 μL dNTP (10 mmol/L), 1 μL each of the forward and reverse primers (2 μmol/L), and sterile double-distilled water added to make up the final volume (Xie et al., 2017). The PCR procedure was as follows: pre-denaturation at 94°C for 5 min, followed by 33 cycles of denaturation at 94°C for 30 s, annealing at 55°C for 30 s, and extension at 72°C for 30 s, and a final extension at 72°C for 5 min, using the Takara PCR Thermal Cycler Dice.
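The water volume and the final concentrations implied by this reaction mix follow from simple dilution arithmetic. The short Python sketch below works them out from the volumes and stock concentrations listed in the text; the variable and function names are illustrative, and the computed values are derived here rather than stated in the original.

```python
# Component volumes (uL) and stock concentrations as listed in the text.
TOTAL_UL = 20.0
components_ul = {
    "genomic DNA (20 ng/uL)": 1.5,
    "Taq enzyme (3 U/uL)": 0.4,
    "10x PCR buffer": 2.0,
    "dNTP (10 mmol/L)": 0.2,
    "forward primer (2 umol/L)": 1.0,
    "reverse primer (2 umol/L)": 1.0,
}

# Water makes up the remainder of the 20 uL reaction.
water_ul = TOTAL_UL - sum(components_ul.values())


def final_conc(stock, volume_ul, total_ul=TOTAL_UL):
    """Dilution formula: C_final = C_stock * V_added / V_total."""
    return stock * volume_ul / total_ul


dna_ng_per_ul = final_conc(20.0, 1.5)     # final DNA concentration, ng/uL
primer_umol_per_l = final_conc(2.0, 1.0)  # each primer, umol/L
dntp_mmol_per_l = final_conc(10.0, 0.2)   # dNTP, mmol/L
buffer_x = final_conc(10.0, 2.0)          # buffer strength, in "x"
taq_units = 3.0 * 0.4                     # enzyme units per reaction
```

Under these figures the reaction needs about 13.9 μL of water and contains roughly 30 ng of template DNA (1.5 ng/μL) in 1× buffer.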
Allele identification and genetic diversity analysis
Amplified products were separated and identified by 6% non-denaturing polyacrylamide gel electrophoresis (PAGE). The electrophoresis buffer was 1× TBE (100 mM Tris–HCl, 83 mM boric acid, 1 mM Na2EDTA, pH 8.0) (Han et al., 2008). After the samples were loaded, electrophoresis was run at 200 V and 100 mA for 120 min. The silver staining method was employed for allele visualization and selection of suitable SSR primers (Chevallet et al., 2006). Markers were scored manually, and the polymorphic ones were selected for genetic analysis.
Each SSR allele for a given primer pair was scored as 1 or 0 for samples with or without the band, respectively, and a “0, 1” binary matrix was established. PowerMarker ver. 3.25 was used to calculate the number of alleles per locus for the genotypes (Liu and Muse, 2005). Cluster analysis based on the genetic similarity coefficient was performed using the unweighted pair group method with arithmetic average (UPGMA) (Nei and Li, 1979), and a dendrogram was generated using the Numerical Taxonomy and Multivariate Analysis System (NTSYS) software ver. 2.02 (Rohlf, 2000).
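The scoring and clustering steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the NTSYS or PowerMarker procedure: the similarity measure is assumed to be the Nei-Li (Dice) coefficient because the text cites Nei and Li (1979), and the genotype names and band scores below are made up.

```python
from itertools import combinations


def dice_similarity(a, b):
    """Nei-Li (Dice) coefficient: 2 * shared bands / (bands in a + bands in b)."""
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    total = sum(a) + sum(b)
    return 2.0 * shared / total if total else 0.0


def upgma(profiles):
    """Cluster 0/1 band profiles and return the merge history.

    profiles: dict mapping genotype name -> list of 0/1 band scores.
    Returns (cluster_a, cluster_b, similarity) tuples in merge order,
    where each cluster is a tuple of genotype names.
    """
    names = sorted(profiles)
    clusters = [(name,) for name in names]
    # Pairwise similarity between single genotypes, looked up symmetrically.
    sim = {frozenset(pair): dice_similarity(profiles[pair[0]], profiles[pair[1]])
           for pair in combinations(names, 2)}

    def average_similarity(c1, c2):
        # UPGMA: unweighted mean over all between-cluster genotype pairs.
        values = [sim[frozenset((m, n))] for m in c1 for n in c2]
        return sum(values) / len(values)

    merges = []
    while len(clusters) > 1:
        # Join the two clusters with the highest average similarity.
        c1, c2 = max(combinations(clusters, 2),
                     key=lambda pair: average_similarity(*pair))
        merges.append((c1, c2, average_similarity(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical band scores for four genotypes at six allele positions.
example_profiles = {
    "G1": [1, 1, 0, 1, 0, 1],
    "G2": [1, 1, 0, 1, 1, 0],
    "G3": [0, 0, 1, 1, 1, 0],
    "G4": [0, 1, 1, 0, 1, 0],
}
merges = upgma(example_profiles)
```

Each entry in `merges` corresponds to one node of the dendrogram; cutting the tree at a chosen similarity value, as done in this study at 0.68, amounts to keeping only the merges at or above that value.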
Distribution of SSRs in the sweet potato genome
Based on the sweet potato (cv. Taizhong6) genome sequence, the SSRs identified were characterized as mono- to hexa-nucleotide repeats. The 133,727 SSRs identified had a density of 159.77 SSRs/Mb, or one SSR per 6.26 kb of sequence on average. Dinucleotides (38.50%), trinucleotides (31.45%), and mononucleotides (12.77%) were the most abundant repeat types, together representing 82.71% of the total SSRs, while the remaining repeat types represented 17.29% (Tab. 2; Suppl. Fig. A1). The most dominant motifs accounted for 78.29% of all SSRs; among them, AT/AT (81.24%), AC/GT (9.98%), and AG/CT (8.77%) were the most abundant dinucleotide repeats, while A/T (98.75%), AAT/ATT (87.17%), and AAAT/ATTT (56.01%) were the most dominant motifs among mononucleotides, trinucleotides, and tetranucleotides, respectively (Tab. 2; Suppl. Fig. A1). Of all the repeat sequences, AT/AT was the most common, representing 31.27%, followed by AAT/ATT (27.42%), A/T (12.61%), AAAT/ATTT (4.67%), AC/GT (3.84%), AG/CT (3.38%), and AAAAT/ATTTT (1.7%; Tab. 2).
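The density figures and motif shares reported above are linked by simple arithmetic, which the following Python sketch makes explicit. The input numbers are those given in the text; the implied sequence length is an inference from them, not a value stated in the original, and the variable names are illustrative.

```python
total_ssrs = 133_727      # SSRs reported in the text
density_per_mb = 159.77   # SSRs per Mb reported in the text

# Mean spacing in kb is the reciprocal of the density (1 Mb = 1,000 kb).
kb_per_ssr = 1000.0 / density_per_mb

# Sequence length implied by the two reported figures (an inference).
implied_length_mb = total_ssrs / density_per_mb

# Share of all SSRs = share of the repeat class * share of the motif
# within that class (both in percent, as reported in the text).
class_share = {"mono": 12.77, "di": 38.50, "tri": 31.45}
within_class = {
    "A/T": ("mono", 98.75),
    "AT/AT": ("di", 81.24),
    "AAT/ATT": ("tri", 87.17),
}
overall_share = {motif: class_share[cls] * pct / 100.0
                 for motif, (cls, pct) in within_class.items()}
```

The spacing comes out at about 6.26 kb, the implied sequence length at about 837 Mb, and the derived overall shares agree with the reported 12.61%, 31.27%, and 27.42% to within rounding.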
SSR marker development and characterization
To ascertain the novelty of the designed SSR primer pairs, the SSR-containing sequences from the genomic sequence of sweet potato cv. Taizhong6 were analyzed. A total of 133,727 SSRs were identified for mono- to hexa-nucleotide repeats. Of these repeat sequences, dinucleotides (38.50%), trinucleotides (31.45%), and mononucleotides (12.77%) represented 82.71% of the total SSRs (Tab. 2; Suppl. Fig. A1). Out of the 2,431 SSR primer pairs designed, 100 primer pairs were randomly selected for validation.
The efficiency of the SSR primer pairs was evaluated and validated in 24 sweet potato genotypes using the 100 randomly selected primer pairs. Out of these primers, 50 primer pairs (50%) effectively amplified clear bands, with 27 primer pairs being polymorphic. The amplification results also showed that the 50 primer pairs amplified 251 alleles in the 24 sweet potato genotypes with a mean allele number of 5.02 per locus for the values ranging between 1 and 13 (Tab. 3). Several bands are shown in the profile (Fig. 1) to depict the PCR amplification in the 24 cultivars used for this study, while the characteristics of the 50 randomly selected primer pairs are outlined in Tab. 3.
The genetic diversity of the 24 sweet potato varieties, as analyzed with the 50 primer pairs, revealed a genetic distance range of 0.605 to 1.00, with an average distance of 0.740 (Suppl. Tab. A1). According to the UPGMA clustering results, the genetic similarity coefficient among the 24 sweet potato germplasms was relatively high, ranging between 0.66 and 0.87 with an average of 0.765 (Fig. 2). The dendrogram grouped the 24 resources into 4 clusters (Cluster I–IV; Fig. 2) at a similarity coefficient of 0.68. Among them, Dayehong and Fushu18 showed the closest genetic association, with a similarity coefficient of 0.745. The clustering results revealed no direct relationship with the geographical sources of germplasm, indicating a more frequent exchange of germplasm in sweet potato cultivation and breeding.
Recent advances in high throughput DNA sequencing technology offer new information to accelerate the development of molecular markers. Molecular markers are widely used in many plant genetic and genomic-based studies. SSR markers are distributed in both transcribed and non-coding sequences referred to as EST- and genomic-SSRs, respectively. With the advantages of being co-dominant, PCR-based, highly polymorphic, chromosome-specific, reproducible, and consistent (Karihaloo, 2015) compared to other molecular markers, SSR markers have been widely used in many genetic and molecular-based studies in sweet potato including variety identification, genetic diversity analysis, and construction of linkage maps (Ngailo et al., 2016; Yada et al., 2015; Yang et al., 2015b). Although several studies have reported the use of SSR markers in sweet potato, most of them evolved from investigating transcriptome libraries and expressed sequence tags (ESTs). Again, their number and availability are limited, with only a few being polymorphic compared to other crops. Therefore, there is a great need to develop novel SSR markers that are highly polymorphic and distributed throughout the genome.
To identify valuable genomic-SSR markers for sweet potato genetic improvement, the sweet potato genome was searched, and a total of 2,431 SSR markers were successfully developed based on the SSR-containing sequences. The distribution density was 159.77 SSRs per Mb, or one SSR per 6.26 kb on average. This average interval was shorter than those previously recorded for sweet potato (7.1 kb), pigeon pea (8.4 kb), cotton (20.0 kb), and soybean (23.80 kb), almost the same as that of sesame (6.55 kb), and longer than those of rice (3.4 kb) and radish (4.93 kb) (Cardle et al., 2000; Wang et al., 2011b; Zhai et al., 2014). However, the differences in frequency and abundance could be attributed to the size of the database, the tools used for SSR data-mining, the length of repeat motifs, and the application of different repeat unit thresholds; hence, it is practically difficult to directly compare the frequency and abundance estimates of different studies (Dutta et al., 2011). In our current study, mono-, di-, and trinucleotides were the most common SSRs, with dinucleotides showing the highest frequency (38.50%), followed by trinucleotides (31.45%) and mononucleotides (12.77%; Suppl. Fig. A1). Feng et al. (2020) identified dinucleotides (9439, 51.52%) as the most abundant repeats, followed by trinucleotides (7636, 41.68%), in sweet potato, which is consistent with the results of this study. Our findings contrast with previous reports showing trinucleotides as the most dominant repeat motifs in sweet potato, followed by dinucleotides (Tao et al., 2012; Wang et al., 2011b). Other studies also suggested trinucleotides as the second most predominant repeat motifs in sweet potato, which is in agreement with our current findings (Xie et al., 2012).
The main repeat types among the identified SSRs were A/T (12.61%), AT/AT (61.51%), AAT/ATT (27.42%), and AAAT/ATTT (13.32%, Suppl. Fig. A1). In agreement with our current study, Wang et al. (2011b) identified AAT/ATT as the most dominant SSR motif in the sweet potato. Similarly, Yang et al. (2015a) identified AAAT/ATTT as the most frequent repeat motif among tetranucleotides in Welsh onion. However, previous studies identified AG/CT, AAG/CTT, and AT/TA motifs as the most dominant motif types in sweet potato (Zhang et al., 2016), conflicting with our findings.
In this study, 100 primer pairs were randomly selected to validate the SSR markers and assess their usefulness in sweet potato. Of these, 50 primer pairs (50%) produced clear, stable bands. The PCR amplification rate (50%) in this study was much lower than the reported 75–90% EST-SSR amplification rate in sweet potato (Hu et al., 2004; Schafleitner et al., 2010; Wang et al., 2011b). However, the amplification efficiency of genomic-SSRs has always been lower than that of EST-SSRs in sweet potato, which is in line with our results (Buteler et al., 1999; Hu et al., 2004). The reason is that genomic-SSR primers are designed randomly from genomic libraries, whereas EST-SSRs are from relatively highly conserved transcribed regions. For this reason, EST-SSRs are reported to be highly applicable and transferable to related species but less polymorphic than genomic-SSRs (Aggarwal et al., 2007). The 50 working primer pairs amplified 251 alleles in the 24 sweet potato genotypes (Tab. 3). The number of alleles recorded was 5.02 per locus on average, with values ranging between 1 and 13 alleles. Previous studies reported a higher number of alleles per locus using SSR markers to analyze the genetic diversity of sweet potato germplasm. This indicates a high polymorphism among the sweet potato accessions studied. Several studies have also reported a high number of alleles, ranging from 2 to 23 per locus, similar to that reported in our study (Buteler et al., 1999; Roullier et al., 2011; Tumwegamire et al., 2011; Veasey et al., 2008; Yada et al., 2010). Conversely, Hwang et al. (2002) found low polymorphism and recorded 1 to 4 alleles per SSR using varied annealing temperatures and SSR primers. The result of our current study confirms the exceptional discriminatory ability of SSR markers (Gichuru et al., 2005).
Because sweet potato is a hexaploid plant, distinguishing between homozygous and heterozygous sites is difficult; hence, dominant markers are preferred over co-dominant markers (Silva et al., 2013; Yue et al., 2014). Previous studies reported the high polymorphism of sweet potato, which is attributed to its large genome size and high heterozygosity (Hwang et al., 2002), influenced by its mating systems (self-incompatibility and outcrossing). Again, the polyploidy (autohexaploidy) of sweet potato, combined with its large chromosome number (2n = 6x = 90), makes sweet potato SSR primers highly polymorphic (Li et al., 2015; Ngailo et al., 2016). Hence, sweet potato genotypes are likely to have large genetic distances among them, even in smaller populations (Gruneberg et al., 2015). In our study, 27 (54%) of the 50 primer pairs exhibited polymorphism. This value was higher than the 41.9% polymorphism recorded by Wang et al. (2011b) in the eight cultivated sweet potato varieties tested but lower than the 67.2% and 62.5% polymorphism reported in different sweet potato test materials (Hu et al., 2004; Schafleitner et al., 2010). Differences in polymorphism are attributed to the different geographic origins of samples and the number of DNA samples used. For instance, Chavarriaga-Aguirre et al. (1998) observed a relatively high polymorphism when the number of accessions increased from 38 to over 500 in cassava. Generally, genetic linkage mapping, comparative genomics, diversity analysis, gene-based association, and evolutionary studies require polymorphic SSR markers. Thus, the SSR markers in this study could be used for such studies in sweet potato. However, the polymorphic information content (PIC) of the SSRs was not determined because sweet potato exhibits a high overlap of homoeoallelic SSR variations with allelic ones, in that a clear single amplified SSR band may well be from two or three loci.
The SSR-based genetic distance among the 24 sweet potato varieties averaged 0.740, with values ranging between 0.605 and 1.00 (Suppl. Tab. A1). The genetic similarity coefficient range of 0.66 to 0.87, with a mean value of 0.765, recorded in this study is high, indicating a low diversity in the sweet potato genotypes studied (Fig. 2). The result is consistent with the high similarity coefficient of 0.64 on average recorded by Hwang et al. (2002), who accordingly concluded that diversity was low among the accessions studied. On the contrary, Yada et al. (2010) reported an average similarity coefficient of 0.57 when evaluating the genetic diversity of cultivars from Uganda. Zhang et al. (2000) observed a low similarity coefficient (0.588) among sweet potato varieties from South America. In another study, Tumwegamire et al. (2011) also recorded a similarity coefficient of 0.54 on average when the genetic diversity of farmer varieties of both white- and orange-fleshed sweet potato from East Africa was assessed. Similarly, David et al. (2018) reported a low genetic similarity coefficient of 0.54 on average and concluded that diversity was high among the studied accessions. Thus, the differences could be attributed to the number and type of markers used and the genotypic variances. The clustering results revealed no direct relationship between the national and regional sources of germplasm, indicating a more frequent exchange of germplasm in sweet potato cultivation and breeding. The findings of this study provide background information for the development of genomic-SSR markers in sweet potato.
To facilitate marker-assisted selection (MAS) and explore new molecular markers in sweet potato, we developed a set of SSR markers from the reference genome of cultivated sweet potato (cv. Taizhong6) using MISA software. A total of 133,727 SSRs were identified, from which 2,431 new SSR markers were developed. About 50% of the randomly selected SSR markers showed good amplification effects and produced clear, stable bands. The findings of this study will help update the sweet potato genomic-SSR marker database and aid future genetic and genomic studies. Also, the ability of the markers to compare and distinguish cultivars qualifies them for use as background data and resources for germplasm identification, genetic relationship studies, and diversity analysis in sweet potato and related species.
Acknowledgement: We are thankful to Mr. Eric Amenyogbe, Mr. Gyan Ray Watson and Miss Linda Adzigbli for the critical review and useful suggestions during the manuscript preparation.
Availability of Data and Materials: All data and materials associated with this research work have been provided in this article and its additional files.
Author Contribution: The authors confirm contribution to the paper as follows: study conception and design: Hongbo Zhu, Hanna Amoanimaa-Dede; data collection: Hanna Amoanimaa-Dede, Jiacheng Zhang, Chuntao Su; analysis and interpretation of results: Hanna Amoanimaa-Dede, Jiacheng Zhang; draft manuscript preparation: Hanna Amoanimaa-Dede. All authors reviewed the results and approved the final version of the manuscript.
Ethics Approval: Not applicable.
Supplementary Material: The supplementary material is available online at DOI 10.32604/biocell.2021.015053.
Funding Statement: This research was funded by the National Natural Science Foundation of China-Guangdong Joint Fund, China and Studies on Resistance Resources and Molecular Mechanisms of Sweet potato Weevil in South China (Grant No. U1701234).
Conflicts of Interest: The authors declare that there is no conflict of interest to report regarding this current study.
Figure A1: Characteristics of SSRs in the sweetpotato genome.
a. Distribution of different repeat types
b. Frequency distribution of major repeat motifs
c. Frequency distribution of main motif sequence in I. Dinucleotide, II. Trinucleotide and III. Tetranucleotide repeats.
Table A1: Genetic distances generated from the SSR marker analysis of sweetpotato germplasm (1-24) in this study.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |