In silico analysis of simple sequence repeats (SSRs) in chloroplast genomes of Glycine species


ÖZYİĞİT İ. İ., Dogan I., Filiz E.

Plant OMICS, vol. 8, no. 1, pp. 24-29, 2015 (Scopus)

  • Publication Type: Article / Full Article
  • Volume: 8, Issue: 1
  • Publication Date: 2015
  • Journal Name: Plant OMICS
  • Journal Indexes: Scopus
  • Pages: pp. 24-29
  • Keywords: Bioinformatics analysis, Chloroplast genome, cpSSRs, Glycine, In silico analysis
  • Marmara University Affiliated: Yes

Abstract

Microsatellites, also known as simple sequence repeats (SSRs), are short repetitive DNA sequences with motifs 1-6 bp long; those present in chloroplast genomes are termed cpSSRs. In this work, the chloroplast genomes (cpDNA) of eight species of the genus Glycine (G. canescens, G. cyrtoloba, G. dolichocarpa, G. falcata, G. max, G. soja, G. stenophita, and G. tomentella) were screened for cpSSRs with the MISA Perl script, using minimum repeat numbers of 10 for mono-, 5 for di-, and 3 for tri-, tetra-, penta-, and hexanucleotide motifs. The frequency, distribution, and putative codon repeats of the cpSSRs were examined. A total of 1273 cpSSRs were identified, of which 413 (32.4%) were located in genic regions and the remainder (67.6%) in intergenic regions, with an average of 1.04 cpSSRs per kb. Trinucleotide repeats (45%) were the most abundant motifs in the plastomes of the Glycine species, followed by mononucleotides (36%) and dinucleotides (11.8%). In genic regions, trimeric repeats were the most frequent class, accounting for 70.7% of cpSSRs, while mono- and tetrameric repeats accounted for 25.7% and 3.6%, respectively. Interestingly, no di-, penta-, or hexameric repeats were found in coding sequences. The most common motifs across all plastomes were A/T (97.8%) for mono-, AT/AT (98%) for di-, and AAT/ATT (41.5%) for trinucleotides. Among the chloroplast genes, ycf1 had the highest number of cpSSRs, and G. cyrtoloba and G. falcata had the largest number of genes containing cpSSRs. The most frequent putative codon repeats in coding sequences encoded glutamic acid (21.2%), followed by serine (15.5%), arginine (8.3%), and phenylalanine (7.8%) in all species. Codon repeats for tryptophan, proline, and aspartic acid were absent from all plastomes.