Abstract

SIRE1 is unusual among Ty1-copia retrotransposons in that it has an additional open reading frame, located between pol and the 3′ LTR, with structural features similar to those of retroviral envelope proteins. Here we report the characterization and comparison of eight different SIRE1 elements derived from a soybean genomic library, as well as SIRE1 reverse transcriptases from Glycine soja. The DNA sequences of the eight SIRE1 elements are highly homogeneous and share greater than 95% nucleotide identity. Partial sequences obtained from BAC ends are similarly conserved. Phylogenetic analyses resolve two closely related SIRE1 lineages, and the nucleotide changes within and between these lineages have occurred in a manner that preserves function. Both the gag and the env-like genes are evolving under similar levels of functional constraint. Considerable sequence heterogeneity in the form of short duplications was found within the LTRs and in the region between the envelope-like ORF and the 3′ LTR. These duplications are suggestive of slippage by reverse transcriptase during replication. Sequence identity between the LTRs of individual insertions suggests that they transposed within the last 70,000 years. Three of ten SIRE1 insertions examined abut Ty3-gypsy retroelements. Since the soybean genome harbors more than 1000 SIRE1 insertions, the collective data suggest that SIRE1 has undergone a very recent and robust amplification in soybean.
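
The dating argument relies on the two LTRs of an element being identical at the moment of insertion, so that any divergence between them reflects time since transposition. The following is a minimal sketch of that kind of calculation, not the procedure used in this study: it assumes a Jukes-Cantor correction and a caller-supplied substitution rate, neither of which is stated in the abstract, and the function names are hypothetical.

```python
import math


def p_distance(ltr5: str, ltr3: str) -> float:
    """Proportion of differing sites between two aligned LTRs.

    Columns containing a gap ('-') in either sequence are skipped.
    """
    if len(ltr5) != len(ltr3):
        raise ValueError("sequences must be aligned to equal length")
    pairs = [
        (a, b)
        for a, b in zip(ltr5.upper(), ltr3.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance K (an assumed model choice)."""
    if p >= 0.75:
        raise ValueError("p-distance too large for Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def insertion_age_years(ltr5: str, ltr3: str, rate: float) -> float:
    """Estimate time since insertion as T = K / (2 * rate).

    `rate` is substitutions per site per year and must be supplied by the
    caller; no value is given in the abstract. The factor of 2 reflects
    that both LTRs accumulate changes independently.
    """
    k = jukes_cantor(p_distance(ltr5, ltr3))
    return k / (2.0 * rate)
```

With an illustrative rate of 1e-8 substitutions per site per year, a single mismatch across 1,000 aligned sites corresponds to roughly 50,000 years; a rate appropriate for soybean should be substituted before drawing any conclusion.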