To identify mechanisms that influence the evolution of bacterial transposons, DNA sequence variation was evaluated among homologs of insertion sequences IS1, IS3 and IS30 from natural strains of Escherichia coli and related enteric bacteria. The nucleotide sequences within each class of IS were highly conserved among E. coli strains, over 99.7% similar to a consensus sequence. When compared to the range of nucleotide divergence among chromosomal genes, these data indicate high turnover and rapid movement of the transposons among clonal lineages of E. coli. In addition, length polymorphism among IS appears to be far less frequent than in eukaryotic transposons, indicating that nonfunctional elements comprise a smaller fraction of bacterial transposon populations than found in eukaryotes. IS present in other species of enteric bacteria are substantially divergent from E. coli elements, indicating that IS are mobilized among bacterial species at a reduced rate. However, homologs of IS1 and IS3 from diverse species provide evidence that recombination events and horizontal transfer of IS among species have both played major roles in the evolution of these elements. IS3 elements from E. coli and Shigella show multiple, nested, intragenic recombinations with a distantly related transposon, and IS1 homologs from diverse taxa reveal a mosaic structure indicative of multiple recombination and horizontal transfer events.
The population biology and molecular evolution of the transposable element mariner have been studied in the eight species of the melanogaster subgroup of the Drosophila subgenus Sophophora. The element occurs in D. simulans, D. mauritiana, D. sechellia, D. teissieri, and D. yakuba, but is not found in D. melanogaster, D. erecta, or D. orena. Sequence comparisons suggest that the mariner element was present in the ancestor of the species subgroup and was lost in some of the lineages. Most species contain both active and inactive mariner elements. A deletion of most of the 3' end characterizes many elements in D. teissieri, but in other species the inactive elements differ from active ones only by simple nucleotide substitutions or small additions/deletions. Active mariner elements from all species are quite similar in nucleotide sequence, although there are some species-specific differences. Many, but not all, of the inactive elements are also quite closely related. The genome of D. mauritiana contains 20-30 copies of mariner, that of D. simulans 0-10, and that of D. sechellia only two copies (at fixed positions in the genome). The mariner situation in D. sechellia may reflect a reduced effective population size owing to the restricted geographical range of this species and its ecological specialization to the fruit of Morinda citrifolia.
Highly polymorphic segments of the human genome containing variable numbers of tandem repeats (VNTRs) have been widely used to establish DNA profiles of individuals for use in forensics. Methods of estimating the probability of occurrence of matching DNA profiles between two randomly selected individuals have been subject to extensive debate regarding the possibility of significant substructure occurring within the major races. We have sampled two Caucasian subpopulations, Finns and Italians, at four commonly used VNTR loci to determine the extent to which the subgroups differ from each other and from a mixed Caucasian database. The data were also analyzed for the occurrence of linkage disequilibrium among the loci. The allele frequency distributions of some loci were found to differ significantly among the subpopulations in a manner consistent with population substructure. Major differences were also found in the probability of occurrence of matching DNA profiles between two individuals chosen at random from the same subpopulation. With respect to the Finnish and Italian subpopulations, the conventional product rule for estimating the probability of a multilocus VNTR match using a mixed Caucasian database consistently yields estimates that are artificially small. Systematic errors of this type were not found using the interim ceiling principle recently advocated in the National Research Council's report [National Research Council (1992) DNA Technology in Forensic Science (Natl. Acad. Sci., Washington)]. The interim ceiling principle is based on currently available racial or ethnic databases and sets an arbitrary lower limit on each VNTR allele frequency. In the future the ceiling frequencies are expected to be established from more adequate data acquired for relevant VNTR loci from multiple subpopulations.
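The contrast between the product rule and a frequency floor can be illustrated with a minimal sketch. The allele frequencies are invented for illustration, the function names are hypothetical, and the 10% floor is an assumed value; the report's full procedure involves more than a fixed floor, so this shows only the direction of the effect, not the published calculation.

```python
from math import prod

def locus_frequency(alleles):
    """Hardy-Weinberg genotype frequency at one locus.

    `alleles` holds one frequency for a homozygote (p*p)
    or two frequencies for a heterozygote (2*p*q).
    """
    if len(alleles) == 1:
        return alleles[0] ** 2
    p, q = alleles
    return 2 * p * q

def match_probability(profile, floor=0.0):
    """Multiply per-locus genotype frequencies (the product rule).

    With floor > 0, every allele frequency is first raised to at
    least `floor`, which can only increase the final estimate.
    """
    return prod(
        locus_frequency(tuple(max(f, floor) for f in alleles))
        for alleles in profile
    )

# Hypothetical four-locus profile (frequencies are made up).
profile = [(0.05, 0.08), (0.03, 0.12), (0.10,), (0.02, 0.06)]

plain = match_probability(profile)               # product rule only
floored = match_probability(profile, floor=0.10)  # assumed 10% floor
print(f"product rule: {plain:.3e}")
print(f"with floor:   {floored:.3e}")
```

Because the floor replaces rare-allele frequencies with larger values, the floored estimate is never smaller than the plain product, which is the conservative behavior the abstract attributes to the ceiling approach.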
Inconsistencies in taxonomic relationships implicit in different sets of nucleic acid sequences potentially result from horizontal transfer of genetic material between genomes. A nonparametric method is proposed to determine whether such inconsistencies are statistically significant. A similarity coefficient is calculated from ranked pairwise identities and evaluated against a distribution of similarity coefficients generated from resampled data. Subsequent analyses of partial data sets, obtained by the elimination of individual taxa, identify particular taxa to which the significance may be attributed, and can sometimes help in distinguishing horizontal genetic transfer from inconsistencies due to convergent evolution or variation in evolutionary rate. The method was successfully applied to data sets that were not found to be significantly different with existing methods that use comparisons of phylogenetic trees. The new statistical framework is also applicable to the inference of horizontal transfer from restriction fragment length polymorphism distributions and protein sequences.
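The general idea of comparing ranked pairwise identities against a resampled distribution can be sketched as follows. The abstract does not name the coefficient or the resampling scheme, so this sketch substitutes a Spearman-style rank correlation and a Mantel-style permutation of taxon labels; both are assumptions, the identity matrices are invented, and the resulting p-value tests for association between the two data sets rather than reproducing the published test.

```python
import random
from itertools import combinations

def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def upper(matrix, order):
    """Pairwise values above the diagonal, with taxa taken in `order`."""
    return [matrix[order[i]][order[j]]
            for i, j in combinations(range(len(order)), 2)]

def rank_similarity(a, b, order=None):
    """Rank correlation between two pairwise-identity matrices."""
    identity = list(range(len(a)))
    return pearson(ranks(upper(a, identity)),
                   ranks(upper(b, order if order is not None else identity)))

def permutation_test(a, b, n_perm=999, seed=0):
    """Observed coefficient and the share of label shuffles that match or exceed it."""
    rng = random.Random(seed)
    observed = rank_similarity(a, b)
    hits = 0
    for _ in range(n_perm):
        order = list(range(len(a)))
        rng.shuffle(order)
        if rank_similarity(a, b, order) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Invented percent identities among four taxa for two sequence sets.
gene_a = [[100, 95, 80, 78], [95, 100, 82, 79], [80, 82, 100, 90], [78, 79, 90, 100]]
gene_b = [[100, 97, 85, 84], [97, 100, 86, 83], [85, 86, 100, 92], [84, 83, 92, 100]]
coefficient, p_value = permutation_test(gene_a, gene_b)
```

Repeating the call on matrices with one taxon removed at a time would mirror the partial-data-set step described in the abstract.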
Defective (nonautonomous) copies of transposable elements are relatively common in the genomes of eukaryotes but less common in the genomes of prokaryotes. With regard to transposable elements that exist exclusively in the form of DNA (nonretroviral transposable elements), nonautonomous elements may play a role in the regulation of transposition. In prokaryotes, plasmid-mediated horizontal transmission probably imposes a selection against nonautonomous elements, since nonautonomous elements are incapable of mobilizing themselves. The lower relative frequency of nonautonomous elements in prokaryotes may also reflect the coupling of transcription and translation, which may bias toward the cis activation of transposition. The cis bias we suggest need not be absolute in order to militate against the long-term maintenance of prokaryotic elements unable to transpose on their own. Furthermore, any cis bias in transposition would also decrease the opportunity for trans repression of transposition by nonautonomous elements.
Population data suggest that many parasitic protozoa (e.g. Trypanosoma, Leishmania, Entamoeba and Giardia) reproduce clonally, but this hypothesis has been highly controversial for Plasmodium falciparum. Although reproduction is predominantly clonal in the enteric bacteria Escherichia coli and Salmonella, the level of recombination affecting short (< 1 kb) regions of the chromosome is sufficient such that many genes are obviously mosaics of different ancestries. Transposable insertion sequences in E. coli are examples of selfish DNA whose short-term population dynamics are determined mainly by transposition and horizontal transmission among strains balanced against the regulation of transposition as a function of copy number, and negative effects on fitness. Occasional advantageous effects of transposable elements have also been documented.
Frequencies of mutant sites are modeled as a Poisson random field in two species that share a sufficiently recent common ancestor. The selective effect of the new alleles can be favorable, neutral, or detrimental. The model is applied to the sample configurations of nucleotides in the alcohol dehydrogenase gene (Adh) in Drosophila simulans and Drosophila yakuba. Assuming a synonymous mutation rate of $1.5 \times 10^{-8}$ per site per year and 10 generations per year, we obtain estimates for the effective population size ($N_e = 6.5 \times 10^{6}$), the species divergence time ($t_{\mathrm{div}} = 3.74$ million years), and an average selection coefficient ($\sigma = 1.53 \times 10^{-6}$ per generation for advantageous or mildly detrimental replacements), although it is conceivable that only two of the amino acid replacements were selected and the rest were neutral. The analysis, which includes a sampling theory for the independent infinite sites model with selection, also suggests that the number of amino acids in the enzyme that are susceptible to favorable mutation is in the range 2-23 at any one time. The approach provides a theoretical basis for the use of a 2 × 2 contingency table to compare fixed differences and polymorphic sites with silent sites and amino acid replacements.
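A 2 × 2 table of this kind crosses fixed versus polymorphic sites with silent versus replacement changes. The sketch below evaluates such a table with a two-sided Fisher's exact test written from the hypergeometric distribution; the choice of Fisher's test, the function name, and the counts are all assumptions for illustration and are not taken from the Adh data.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the
    same margins that is no more probable than the observed one.
    """
    row1, row2, col1, total = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))

# Hypothetical counts (not from the paper):
#                 fixed  polymorphic
# replacement       6        2
# silent           15       40
p_value = fisher_exact_two_sided(6, 2, 15, 40)
print(f"two-sided p = {p_value:.4f}")
```

An excess of fixed replacement differences relative to the silent-site ratio is the pattern such a table is meant to expose; the test only reports whether the two ratios differ more than chance would allow.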
He-T sequences are a complex repetitive family of DNA sequences in Drosophila that are associated with telomeric regions, pericentromeric heterochromatin, and the Y chromosome. A component of the He-T family containing open reading frames (ORFs) is described. These ORF-containing elements within the He-T family are designated T-elements, since hybridization in situ with the polytene salivary gland chromosomes results in detectable signal exclusively at the chromosome tips. One T-element that has been sequenced includes ORFs of 1,428 and 1,614 bp. The ORFs are overlapping but one nucleotide out of frame with respect to each other. The longer ORF contains cysteine-histidine motifs strongly resembling nucleic acid binding domains of gag-like proteins, and the overall organization of the T-element ORFs is reminiscent of LINE elements. The T-elements are transcribed and appear to be conserved in Drosophila species related to D. melanogaster. The results suggest that T-elements may play a role in the structure and/or function of telomeres.
Lewontin, RC, and DL Hartl. 1992. “Response.” Science 255: 1053-4.
Active and inactive mariner elements from natural and laboratory populations of Drosophila simulans were isolated and sequenced in order to assess their nucleotide variability and to compare them with previously isolated mariner elements from the sibling species Drosophila mauritiana and Drosophila sechellia. The active elements of D. simulans are very similar among themselves (average 99.7% nucleotide identity), suggesting that the level of mariner expression in different natural populations is largely determined by position effects, dosage effects and perhaps other factors. Furthermore, the D. simulans elements exhibit nucleotide identities of 98% or greater when compared with mariner elements from the sibling species. Parsimony analysis of mariner elements places active elements from the three species into separate groups and suggests that D. simulans is the species from which mariner elements in D. mauritiana and D. sechellia are most likely derived. This result strongly suggests that the ancestral form of mariner among these species was an active element. The two inactive mariner elements sequenced from D. simulans are very similar to the inactive peach element from D. mauritiana. The similarity may result from introgression between D. simulans and D. mauritiana or from selective constraints imposed by regulatory effects of inactive elements.
A physical map of the genome of Drosophila melanogaster has been created using 965 yeast artificial chromosome (YAC) clones assigned to locations in the cytogenetic map by in situ hybridization with the polytene salivary gland chromosomes. Clones with insert sizes averaging about 200 kb, totaling 1.7 genome equivalents, have been mapped. More than 80% of the euchromatic genome is included in the mapped clones, and 75% of the euchromatic genome is included in 161 cytological contigs ranging in size up to 2.5 Mb (average size 510 kb). On the other hand, YAC coverage of the one-third of the genome constituting the heterochromatin is incomplete, and clones containing long tracts of highly repetitive simple satellite DNA sequences have not been recovered.
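The relation between library depth and expected coverage can be checked with a short calculation. This sketch assumes clones are placed uniformly at random and ignores end effects; the target size is derived here from the abstract's own figures (965 clones, about 200 kb each, 1.7 genome equivalents) rather than stated in the source, and the function names are hypothetical.

```python
from math import exp

N_CLONES = 965            # mapped YAC clones (from the abstract)
INSERT_KB = 200.0         # approximate average insert size (from the abstract)
GENOME_EQUIVALENTS = 1.7  # total depth of the mapped clones (from the abstract)

# Assumption: target size implied by the three figures above, in kb.
target_kb = N_CLONES * INSERT_KB / GENOME_EQUIVALENTS

def poisson_covered_fraction(depth):
    """Expected fraction of the target hit by at least one clone at a given depth."""
    return 1.0 - exp(-depth)

def clarke_carbon_probability(n_clones, insert_kb, genome_kb):
    """Probability that a given point lies in at least one of n random clones."""
    return 1.0 - (1.0 - insert_kb / genome_kb) ** n_clones

poisson = poisson_covered_fraction(GENOME_EQUIVALENTS)
clarke_carbon = clarke_carbon_probability(N_CLONES, INSERT_KB, target_kb)
print(f"implied target size: {target_kb / 1000:.1f} Mb")
print(f"Poisson estimate:       {poisson:.3f}")
print(f"Clarke-Carbon estimate: {clarke_carbon:.3f}")
```

Both estimates come out slightly above 0.8, in line with the reported coverage of more than 80% of the euchromatic genome under the random-placement assumption.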
DNA sequences and chromosomal locations of four Drosophila pseudoobscura opsin genes were compared with those from Drosophila melanogaster, to determine factors that influence the evolution of multigene families. Although the opsin proteins perform the same primary functions, the comparisons reveal a wide range of evolutionary rates. Amino acid identities for the opsins range from 90% for Rh2 to more than 95% for Rh1 and Rh4. Variation in the rate of synonymous site substitution is especially striking: the major opsin, encoded by the Rh1 locus, differs at only 26.1% of synonymous sites between D. pseudoobscura and D. melanogaster, while the other opsin loci differ by as much as 39.2% at synonymous sites. Rh3 and Rh4 have similar levels of synonymous nucleotide substitution but significantly different amounts of amino acid replacement. This decoupling of nucleotide substitution and amino acid replacement suggests that different selective pressures are acting on these similar genes. There is significant heterogeneity in base composition and codon usage bias among the opsin genes in both species, but there are no consistent relationships between these factors and the rate of evolution of the opsins. In addition to exhibiting variation in evolutionary rates, the opsin loci in these species reveal rearrangements of chromosome elements.
A multiple-hit bacteriophage P1 library containing DNA fragments from Drosophila melanogaster in the size range 75-100 kb was created and subjected to a preliminary evaluation for completeness, randomness, fidelity, and clone stability. This P1 library presently contains 3840 individual clones, or approximately two genome equivalents. The library was screened with a small set of unique-sequence test probes, and clones containing the sequences have been recovered. In situ hybridization with salivary gland chromosomes indicates that the clones originate from the site of the probe sequences in the genome, and filter hybridization of restriction digests suggests that the clones are not rearranged in comparison with the genomic sequences. Approximately 1.7% of the clones contain sequences that hybridize with ribosomal DNA. A small subset of these clones was tested for stability by examination of restriction fragments produced after repeated subculturing, and no evidence for instability was found. The P1 cloning system has general utility in molecular genetics and may provide an important intermediate level of resolution in physical mapping of the Drosophila genome.
We present a strategy for assembling a physical map of the genome of Drosophila melanogaster based on yeast artificial chromosomes (YACs). In this paper we report 500 YACs containing inserts of Drosophila DNA averaging 200 kb that have been assigned positions on the physical map by means of in situ hybridization with salivary gland chromosomes. The cloned DNA fragments have randomly sheared ends (DY clones) or ends generated by partial digestion with either NotI (N clones) or EcoRI (E clones). Relative to the euchromatic portion of the genome, the size distribution and genomic positions of the clones reveal no significant bias in the completeness or randomness of genome coverage. The 500 mapped euchromatic clones contain an aggregate of approximately 100 million base pairs of DNA, which is approximately one genome equivalent of Drosophila euchromatin.
The transposable element mariner occurs widely in the melanogaster species group of Drosophila. However, in drosophilids outside of the melanogaster species group, sequences showing strong DNA hybridization with mariner are found only in the genus Zaprionus. The mariner sequence obtained from Zaprionus tuberculatus is 97% identical with that from Drosophila mauritiana, a member of the melanogaster species subgroup, whereas a mariner sequence isolated from Drosophila tsacasi is only 92% identical with that from D. mauritiana. Because D. tsacasi is much more closely related to D. mauritiana than is Zaprionus, the presence of mariner in Zaprionus may result from horizontal transfer. In order to confirm lack of a close phylogenetic relationship between the genus Zaprionus and the melanogaster species group, we compared the alcohol dehydrogenase (Adh) sequences among these species. The results show that the coding region of Adh is only 82% identical between Z. tuberculatus and D. mauritiana, as compared with 90% identical between D. tsacasi and D. mauritiana. Furthermore, the mariner gene phylogeny obtained by maximum likelihood and maximum parsimony analyses is discordant with the species phylogeny estimated by using the Adh genes. The only inconsistency in the mariner gene phylogeny is in the placement of the Zaprionus mariner sequence, which clusters with mariner from Drosophila teissieri and Drosophila yakuba in the melanogaster species subgroup. These results strongly suggest horizontal transfer.
The distribution of the transposable element mariner was examined in the genus Drosophila. Among the eight species comprising the melanogaster species subgroup, the element is present in D. mauritiana, D. simulans, D. sechellia, D. yakuba and D. teissieri, but it is absent in D. melanogaster, D. erecta and D. orena. Multiple copies of mariner were sequenced from each species in which the element occurs. The inferred phylogeny of the elements and the pattern of divergence were examined in order to evaluate whether horizontal transfer among species or stochastic loss could better account for the discontinuous distribution of the element among the species. The data suggest that the element was present in the ancestral species before the melanogaster subgroup diverged and was lost in the lineage leading to D. melanogaster and the lineage leading to D. erecta and D. orena. This inference is consistent with the finding that mariner also occurs in members of several other species subgroups within the overall melanogaster species group. Within the melanogaster species subgroup, the average divergence of mariner copies between species was lower than that of the coding region of the alcohol dehydrogenase (Adh) gene. However, the divergence of mariner elements within species was as great as that observed for Adh. We conclude that the relative sequence homogeneity of mariner elements within species is more likely a result of rapid amplification of a few ancestral elements than of concerted evolution. The mariner element may also have had unequal mutation rates in different lineages.
The genome of Drosophila melanogaster contains a class of repetitive DNA sequences called the He-T family, which is unusual in being confined to telomeric and heterochromatic regions. The specific He-T fragment designated Dm665 was cloned in yeast by selection for an autonomously replicating sequence (ARS). Dm665 contains a restriction fragment length polymorphism (RFLP) that is specific to males and thus derives from the Y chromosome. Deletion mapping using X-Y translocations indicates that sequences homologous to Dm665 occur in at least one major cluster in each arm of the Y chromosome. Among 20 yeast artificial chromosome (YAC) clones containing Drosophila sequences homologous with Dm665, four clones derive from defined regions of the long arm of the Y and two from the short arm. The sequence of Dm665 is 2443 bp long, consists of 59% A + T, and contains no significant open reading frames or direct or inverted repeats. However, Dm665 contains a region of 650 bp that shares homology with portions of the X-linked locus Stellate.
Six copies of the mariner element from the genomes of Drosophila mauritiana and Drosophila simulans were chosen at random for DNA sequencing and functional analysis and compared with the highly active element Mos1 and the inactive element peach. All elements were 1286 base pairs in length, but among them there were 18 nucleotide differences. As assayed in Drosophila melanogaster, three of the elements were apparently nonfunctional, two were marginally functional, and one had moderate activity that could be greatly increased depending on the position of the element in the genome. Both molecular (site-directed mutagenesis) and evolutionary (cladistic analysis) techniques were used to analyze the functional effects of nucleotide substitutions. The nucleotide sequence of the element is the primary determinant of function, though the activity level of elements is profoundly influenced by position effects. Cladistic analysis of the sequences has identified a T→A transversion at position 1203 (resulting in a Phe→Leu amino acid replacement in the putative transposase) as being primarily responsible for the low activity of the barely functional elements. Use of the sequences from the more distantly related species, Drosophila yakuba and Drosophila teissieri, as outside reference species, indicates that functional mariner elements are ancestral and argues against their origination by a novel mutation or by recombination among nonfunctional elements.