Copy-number variants (CNVs) reshape gene structure, modulate gene expression, and contribute to significant phenotypic variation. Previous studies have revealed CNV patterns in natural populations of Drosophila melanogaster and suggested that selection and mutational bias shape genomic patterns of CNV. Although previous CNV studies focused on heterogeneous strains, here, we established a number of second-chromosome substitution lines to uncover CNV characteristics when homozygous. The percentage of genes harboring CNVs is higher than found in previous studies. More CNVs are detected in homozygous than heterozygous substitution strains, suggesting the comparative genomic hybridization arrays underestimate CNV owing to heterozygous masking. We incorporated previous gene expression data collected from some of the same substitution lines to investigate relationships between CNV gene dosage and expression. Most genes present in CNVs show no evidence of increased or diminished transcription, and the fraction of such dosage-insensitive CNVs is greater in heterozygotes. More than 70% of the dosage-sensitive CNVs are recessive with undetectable effects on transcription in heterozygotes. A deficiency of singletons in recessive dosage-sensitive CNVs supports the hypothesis that most CNVs are subject to negative selection. On the other hand, relaxed purifying selection might account for the higher number of protein-protein interactions in dosage-insensitive CNVs than in dosage-sensitive CNVs. Dosage-sensitive CNVs that are upregulated and downregulated coincide with copy-number increases and decreases. Our results help clarify the relation between CNV dosage and gene expression in the D. melanogaster genome.
Resistance to antifolates in Plasmodium falciparum is well described and has been observed in clinical settings for decades. At the molecular level, point mutations in the dhfr gene that lead to resistance have been identified, and the crystal structures of the wild-type and mutant dihydrofolate reductase enzymes have been solved in complex with native substrate and drugs. However, we are only beginning to understand the complexities of the evolutionary pressures that lead to the evolution of drug resistance in this system. Microbial systems that allow heterologous expression of malarial proteins provide a tractable way to investigate patterns of evolution that can inform our eventual understanding of the more complex factors that influence the evolution of drug resistance in clinical settings. In this paper we review work in Escherichia coli and Saccharomyces cerevisiae expression systems that explores the fitness landscape of mutations implicated in drug resistance and shows that (i) a limited number of evolutionary pathways to resistance are followed with high probability; (ii) fitness costs associated with the maintenance of high levels of resistance are modest; and (iii) different antifolates may exert opposing selective forces.
BACKGROUND: Patterns of emerging drug resistance reflect the underlying adaptive landscapes for specific drugs. In Plasmodium falciparum, the parasite that causes the most serious form of malaria, antifolate drugs inhibit the function of essential enzymes in the folate pathway. However, a handful of mutations in the gene coding for one such enzyme, dihydrofolate reductase, confer drug resistance. Understanding how evolution proceeds from drug susceptibility to drug resistance is critical if new antifolate treatments are to have sustained usefulness. METHODOLOGY/PRINCIPAL FINDINGS: We use a transgenic yeast expression system to build on previous studies that described the adaptive landscape for the antifolate drug pyrimethamine, and we describe the most likely evolutionary trajectories for the evolution of drug resistance to the antifolate chlorcycloguanil. We find that the adaptive landscape for chlorcycloguanil is multi-peaked, not all highly resistant alleles are equally accessible by evolution, and there are both commonalities and differences in adaptive landscapes for chlorcycloguanil and pyrimethamine. CONCLUSIONS/SIGNIFICANCE: Our findings suggest that cross-resistance between drugs targeting the same enzyme reflects the fitness landscapes associated with each particular drug and the position of the genotype on both landscapes. The possible public health implications of these findings are discussed.
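The ideas of a multi-peaked landscape and of alleles that are not equally accessible can be made concrete with a small sketch. The code below is only an illustration under simplifying assumptions: genotypes are tuples of 0/1 (each site either wild type or mutant), a step changes one site, and a step is taken only if it strictly increases the fitness proxy. The function names and the toy fitness values are hypothetical and are not taken from the study.

```python
def neighbors(genotype):
    """Yield every genotype that differs from `genotype` at exactly one site."""
    for i in range(len(genotype)):
        yield genotype[:i] + (1 - genotype[i],) + genotype[i + 1:]


def local_peaks(fitness):
    """Return genotypes whose fitness exceeds that of all one-step neighbors."""
    return [g for g in fitness
            if all(fitness[g] > fitness[n] for n in neighbors(g))]


def count_accessible_paths(fitness, start, end):
    """Count paths from start to end in which every step strictly raises fitness.

    Because fitness must increase at each step, no genotype can be revisited,
    so the recursion always terminates.
    """
    if start == end:
        return 1
    return sum(count_accessible_paths(fitness, n, end)
               for n in neighbors(start)
               if fitness[n] > fitness[start])


# Toy two-site landscapes (made-up values, for illustration only).
single_peak = {(0, 0): 1.0, (1, 0): 2.0, (0, 1): 0.5, (1, 1): 3.0}
two_peaks = {(0, 0): 2.0, (1, 0): 1.0, (0, 1): 1.0, (1, 1): 3.0}
```

On the second toy landscape the double mutant is the fittest genotype, yet no strictly increasing path reaches it from the wild type, which is the sense in which a highly resistant allele can be inaccessible.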
The Plasmodium falciparum parasite's ability to adapt to environmental pressures, such as the human immune system and antimalarial drugs, makes malaria an enduring burden to public health. Understanding the genetic basis of these adaptations is critical to intervening successfully against malaria. To that end, we created a high-density genotyping array that assays over 17,000 single nucleotide polymorphisms (approximately 1 SNP/kb), and applied it to 57 culture-adapted parasites from three continents. We characterized genome-wide genetic diversity within and between populations and identified numerous loci with signals of natural selection, suggesting their role in recent adaptation. In addition, we performed a genome-wide association study (GWAS), searching for loci correlated with resistance to thirteen antimalarials; we detected both known and novel resistance loci, including a new halofantrine resistance locus, PF10_0355. Through functional testing we demonstrated that PF10_0355 overexpression decreases sensitivity to halofantrine, mefloquine, and lumefantrine, but not to structurally unrelated antimalarials, and that increased gene copy number mediates resistance. Our GWAS and follow-on functional validation demonstrate the potential of genome-wide studies to elucidate functionally important loci in the malaria parasite genome.
The Drosophila Y chromosome is a degenerated, heterochromatic chromosome with few functional genes. Nonetheless, natural variation on the Y chromosome in Drosophila melanogaster has substantial trans-acting effects on the regulation of X-linked and autosomal genes. However, the contribution of Y chromosome divergence to gene expression divergence between species is unknown. In this study, we constructed a series of Y chromosome introgression lines, in which Y chromosomes from either Drosophila sechellia or Drosophila simulans are introgressed into a common D. simulans genetic background. Using these lines, we compared genome-wide gene expression and male reproductive phenotypes between heterospecific and conspecific Y chromosomes. We find significant differences in expression for 122 genes, or 2.84% of all genes analyzed. Genes down-regulated in males with heterospecific Y chromosomes are significantly biased toward testis-specific expression patterns. These same lines show reduced fecundity and sperm competitive ability. Taken together, these results imply a significant role for Y/X and Y/autosome interactions in maintaining proper expression of male-specific genes, either directly or via indirect effects on male reproductive tissue development or function.
To honor James F. Crow on the occasion of his 95th birthday, GENETICS has commissioned a series of Perspectives and Reviews. For GENETICS to publish the honorifics is fitting, as from their birth Crow and GENETICS have been paired. Crow was scheduled to be born in January 1916, the same month that the first issue of GENETICS was scheduled to appear, and in the many years that Crow has made major contributions to the conceptual foundations of modern genetics, GENETICS has chronicled his and other major advances in the field. The commissioned Perspectives and Reviews summarize and celebrate Professor Crow's contributions as a research scientist, administrator, colleague, community supporter, international leader, teacher, and mentor. In science, Professor Crow was the international leader of his generation in the application of genetics to populations of organisms and in uncovering the role of genetics in health and disease. In education, he was a superb undergraduate teacher whose inspiration changed the career paths of many students. His teaching skills are legendary, his lectures urbane and witty, rigorous and clear. He was also an extraordinary mentor to numerous graduate students and postdoctoral fellows, many of whom went on to establish successful careers of their own. In public service, Professor Crow served in key administrative positions at the University of Wisconsin, participated as a member of numerous national and international committees, and served as president of both the Genetics Society of America and the American Society for Human Genetics. This Perspective examines Professor Crow as teacher and mentor through the eyes and experiences of one student who was enrolled in his genetics course as an undergraduate and who later studied with him as a graduate student.
Evolving lineages face a constant intracellular threat: most new coding sequence mutations destabilize the folding of the encoded protein. Misfolded proteins form insoluble aggregates and are hypothesized to be intrinsically cytotoxic. Here, we experimentally isolate a fitness cost caused by toxicity of misfolded proteins. We exclude other costs of protein misfolding, such as loss of functional protein or attenuation of growth-limiting protein synthesis resources, by comparing growth rates of budding yeast expressing folded or misfolded variants of a gratuitous protein, YFP, at equal levels. We quantify a fitness cost that increases with misfolded protein abundance, up to as much as a 3.2% growth rate reduction when misfolded YFP represents less than 0.1% of total cellular protein. Comparable experiments on variants of the yeast gene orotidine-5'-phosphate decarboxylase (URA3) produce similar results. Quantitative proteomic measurements reveal that, within the cell, misfolded YFP induces coordinated synthesis of interacting cytosolic chaperone proteins in the absence of a wider stress response, providing evidence for an evolved modular response to misfolded proteins in the cytosol. These results underscore the distinct and evolutionarily relevant molecular threat of protein misfolding, independent of protein function. Assuming that most misfolded proteins impose similar costs, yeast cells express almost all proteins at steady-state levels sufficient to expose their encoding genes to selection against misfolding, lending credibility to the recent suggestion that such selection imposes a global constraint on molecular evolution.
BACKGROUND: At a time when genomes are being sequenced by the hundreds, much attention has shifted from identifying genes and phenotypes to understanding the networks of interactions among genes. We developed a gene network developmental model expanding on previous models of transcription regulatory networks. In our model, each network is described by a matrix representing the interactions between transcription factors, and a vector of continuous values representing the transcription factor expression in an individual. RESULTS: In this work we used the gene network model to look at the impact of mating as well as insertions and deletions of genes in the evolution of complexity of these networks. We found that the natural process of diploid mating increases the likelihood of maintaining complexity, especially in higher order networks (more than 10 genes). We also show that gene insertion is a very efficient way to add more genes to a network as it provides a much higher chance of developmental stability. CONCLUSIONS: The continuous model affords a more complete view of the evolution of interacting genes. The notion of a continuous output vector also incorporates the reality of gene networks and graded concentrations of gene products.
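The model described above (an interaction matrix plus a continuous expression vector) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the update rule s(t+1) = tanh(a * W s(t)), the convergence tolerance, the definition of developmental stability as reaching a fixed point, and the names `develop` and `insert_gene` are all assumptions made for the example.

```python
import math
import random


def develop(W, s0, steps=100, tol=1e-4, a=1.0):
    """Iterate s(t+1) = tanh(a * W s(t)) and report whether it settles.

    W is a list of rows (interaction matrix); s0 is the initial expression
    vector. Returns (final_state, stable), where stable is True if successive
    states differ by less than `tol` within `steps` iterations.
    """
    s = list(s0)
    for _ in range(steps):
        nxt = [math.tanh(a * sum(w * x for w, x in zip(row, s))) for row in W]
        if max(abs(p - q) for p, q in zip(nxt, s)) < tol:
            return nxt, True
        s = nxt
    return s, False


def insert_gene(W, rng, scale=0.1):
    """Return a copy of W with one extra gene.

    The new gene gets a row and a column of weak random interactions
    (Gaussian with standard deviation `scale`, an arbitrary choice here).
    """
    n = len(W)
    grown = [row + [rng.gauss(0, scale)] for row in W]
    grown.append([rng.gauss(0, scale) for _ in range(n + 1)])
    return grown


# Usage: a weakly self-activating two-gene network settles to a fixed point,
# whereas a single strongly self-repressing gene keeps oscillating.
state, stable = develop([[0.5, 0.0], [0.0, 0.5]], [1.0, -1.0])
_, oscillator_stable = develop([[-2.0]], [1.0])
```

In this sketch a network counts as developmentally stable only if `develop` returns `True`, so the effect of an insertion can be explored by comparing stability before and after calling `insert_gene`.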
The ribosomal DNA (rDNA) gene array is an epigenetically regulated repeated gene locus. While rDNA copy number varies widely between and within species, the functional consequences of subtle copy number polymorphisms have been largely unknown. Deletions in the Drosophila Y-linked rDNA modify heterochromatin-induced position effect variegation (PEV), but it has been unknown if the euchromatic component of the genome is affected by rDNA copy number. Polymorphisms of naturally occurring Y chromosomes affect both euchromatin and heterochromatin, although the elements responsible for these effects are unknown. Here we show that copy number of the Y-linked rDNA array is a source of genome-wide variation in gene expression. Induced deletions in the rDNA affect the expression of hundreds to thousands of euchromatic genes throughout the genome of males and females. Although the affected genes are not physically clustered, we observed functional enrichments for genes whose protein products are located in the mitochondria and are involved in electron transport. The affected genes significantly overlap with genes affected by natural polymorphisms on Y chromosomes, suggesting that polymorphic rDNA copy number is an important determinant of gene expression diversity in natural populations. Altogether, our results indicate that subtle changes to rDNA copy number between individuals may contribute to biologically relevant phenotypic variation.
Chimeric genes, which form through the genomic fusion of two protein-coding genes, are a significant source of evolutionary novelty in Drosophila melanogaster. However, the propensity of chimeric genes to produce adaptive phenotypic changes is not fully understood. Here, we describe the chimeric gene Quetzalcoatl (Qtzl; CG31864), which formed in the recent past and swept to fixation in D. melanogaster. Qtzl arose through a duplication on chromosome 2L that united a portion of the mitochondrially targeted peptide CG12264 with a segment of the polycomb gene escl. The 3' segment of the gene, which is derived from escl, is inherited out of frame, producing a unique peptide sequence. Nucleotide diversity is drastically reduced and site frequency spectra are significantly skewed surrounding the duplicated region, a finding consistent with a selective sweep on the duplicate region containing Qtzl. Qtzl has an expression profile that largely resembles that of escl, with expression in early pupae, adult females, and male testes. However, expression patterns appear to have been decoupled from both parental genes during later embryonic development and in head tissues of adult males, indicating that Qtzl has developed a distinct regulatory profile through the rearrangement of different 5' and 3' regulatory domains. Furthermore, misexpression of Qtzl suppresses defects in the formation of the neuromuscular junction in larvae, demonstrating that Qtzl can produce phenotypic effects in cells. Together, these results show that chimeric genes can produce structural and regulatory changes in a single mutational step and may be a major factor in adaptive evolution.
The principles governing protein evolution under strong selection are important because of the recent history of evolved resistance to insecticides, antibiotics, and vaccines. One experimental approach focuses on studies of mutant proteins and all combinations of mutant sites that could possibly be intermediates in the evolutionary pathway to resistance. In organisms carrying each of the engineered proteins, a measure of protein function or a proxy for fitness is estimated. The correspondence between protein sequence and fitness is widely known as a fitness landscape or adaptive landscape. Here, we examine some empirical fitness landscapes and compare them with simulated landscapes in which the fitnesses are randomly assigned. We find that mutant sites in real proteins show significantly more additivity than those obtained from random simulations. The high degree of additivity is reflected in a summary statistic for adaptive landscapes known as the "roughness," which for the actual proteins so far examined lies in the smallest 0.5% tail of random landscapes.
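A roughness statistic of the kind mentioned above can be sketched in a few lines. The version below is one common formulation and is assumed here rather than taken from the paper: fit a purely additive model to a complete landscape over binary sites, then take the root-mean-square residual. Random landscapes are generated by shuffling the observed fitness values among genotypes, which is also an assumption about how "randomly assigned" is implemented. Function names are hypothetical.

```python
import itertools
import random


def roughness(fitness, n_sites):
    """Root-mean-square deviation of a landscape from its best additive fit.

    `fitness` maps every 0/1 tuple of length n_sites to a value. For a complete
    (balanced) landscape, the least-squares additive effect of a site is the
    mean fitness with the mutation minus the mean fitness without it.
    """
    genotypes = list(itertools.product((0, 1), repeat=n_sites))
    mean = sum(fitness[g] for g in genotypes) / len(genotypes)
    half = len(genotypes) / 2
    effects = []
    for i in range(n_sites):
        with_mut = sum(fitness[g] for g in genotypes if g[i] == 1) / half
        without = sum(fitness[g] for g in genotypes if g[i] == 0) / half
        effects.append(with_mut - without)
    squared = 0.0
    for g in genotypes:
        predicted = mean + sum((x - 0.5) * e for x, e in zip(g, effects))
        squared += (fitness[g] - predicted) ** 2
    return (squared / len(genotypes)) ** 0.5


def roughness_percentile(fitness, n_sites, n_random=1000, seed=0):
    """Fraction of shuffled landscapes that are at least as smooth as the real one."""
    rng = random.Random(seed)
    genotypes = list(fitness)
    values = [fitness[g] for g in genotypes]
    observed = roughness(fitness, n_sites)
    as_smooth = 0
    for _ in range(n_random):
        rng.shuffle(values)
        if roughness(dict(zip(genotypes, values)), n_sites) <= observed:
            as_smooth += 1
    return as_smooth / n_random
```

A landscape that falls in the lowest 0.5% tail, as reported for the empirical proteins, would correspond to `roughness_percentile` returning a value of about 0.005 or less.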
Whether a trade-off exists between robustness and evolvability is an important issue for protein evolution. Although traditional viewpoints have assumed that existing functions must be compromised by the evolution of novel activities, recent research has suggested that existing phenotypes can be robust to the evolution of novel protein functions. Enzymes that are targets of antibiotics that are competitive inhibitors must evolve decreased drug affinity while maintaining their function and sustaining growth. Utilizing a transgenic Saccharomyces cerevisiae model expressing the dihydrofolate reductase (DHFR) enzyme from the malarial parasite Plasmodium falciparum, we examine the robustness of growth rate to drug-resistance mutations. We assay the growth rate and resistance of all 48 combinations of 6 DHFR point mutations associated with increased drug resistance in field isolates of the parasite. We observe no consistent relationship between growth rate and resistance phenotypes among the DHFR alleles. The three evolutionary pathways that dominate DHFR evolution show that mutations with increased resistance can compensate for initial declines in growth rate from previously acquired mutations. In other words, resistance mutations that occur later in evolutionary trajectories can compensate for the fitness consequences of earlier mutations. Our results suggest that high levels of resistance may be selected for without necessarily jeopardizing overall fitness.
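The count of 48 alleles, rather than the 2^6 = 64 that six independent binary sites would give, is what one obtains if two of the six mutations are alternative substitutions at the same codon, so that one site has three states and four sites have two (3 x 2^4 = 48). That reading is an assumption made for this sketch, and the site and mutation labels below are placeholders, not the actual DHFR substitutions.

```python
import itertools


def enumerate_alleles(sites):
    """Return every combination of states, taking one state per site.

    `sites` is a list of tuples; each tuple lists the alternative states at
    one codon, with the wild-type state first.
    """
    return list(itertools.product(*sites))


# Placeholder labels: four two-state sites and one three-state site.
sites = [
    ("wt", "m1"),
    ("wt", "m2"),
    ("wt", "m3"),
    ("wt", "m4a", "m4b"),  # two alternative mutations at one codon (assumed)
    ("wt", "m5"),
]
alleles = enumerate_alleles(sites)
```

Under this assumption `alleles` contains one entry per distinct allele, including the all-wild-type combination.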
Genetic conflicts between sexes and generations provide a foundation for understanding the functional evolution of sex chromosomes and sexually dimorphic phenotypes. Y chromosomes of Drosophila contain multi-megabase stretches of satellite DNA repeats and a handful of protein-coding genes that are monomorphic within species. Nevertheless, polymorphic variation in heterochromatic Y chromosomes of Drosophila results in genome-wide gene expression variation. Here we show that such naturally occurring Y-linked regulatory variation (YRV) can be detected in somatic tissues and contributes to the epigenetic balance of heterochromatin/euchromatin at three distinct loci showing position-effect variegation (PEV). Moreover, polymorphic Y chromosomes differentially affect the expression of thousands of genes in XXY female genotypes in which Y-linked protein-coding genes are not transcribed. The data show a disproportionate influence of YRV on the variable expression of genes whose protein products localize to the nucleus, have nucleic-acid binding activity, and are involved in transcription, chromosome organization, and chromatin assembly. These include key components such as HP1, Trithorax-like (GAGA factor), Su(var)3-9, Brahma, MCM2, ORC2, and inner centromere protein. Furthermore, mitochondria-related genes, immune response genes, and transposable elements are also disproportionately affected by Y chromosome polymorphism. These functional clusterings may arise as a consequence of the involvement of Y-linked heterochromatin in the origin and resolution of genetic conflicts between males and females. Taken together, our results indicate that Y chromosome heterochromatin serves as a major source of epigenetic variation in natural populations that interacts with chromatin components to modulate the expression of biologically relevant phenotypic variation.
BACKGROUND: Hybrid male sterility (HMS) is a usual outcome of hybridization between closely related animal species. It arises because interactions between alleles that are functional within one species may be disrupted in hybrids. The identification of genes leading to hybrid sterility is of great interest for understanding the evolutionary process of speciation. In the current work we used marked P-element insertions as dominant markers to efficiently locate one genetic factor causing a severe reduction in fertility in hybrid males of Drosophila simulans and D. mauritiana. RESULTS: Our mapping effort identified a region of 9 kb on chromosome 3, containing three complete coding sequences and one partial coding sequence. Within this region, two annotated genes are suggested as candidates for the HMS factor, based on the comparative molecular characterization and public-source information. Gene Taf1 is only partially contained in the region, yet shows high polymorphism with four fixed non-synonymous substitutions between the two species. Its molecular functions involve sequence-specific DNA binding and transcription factor activity. Gene agt is a small, intronless gene, whose molecular function is annotated as methylated-DNA-protein-cysteine S-methyltransferase activity. High polymorphism and one fixed non-synonymous substitution suggest this is a fast-evolving gene. The gene trees of both genes perfectly separate D. simulans and D. mauritiana into monophyletic groups. Analysis of gene expression using microarrays revealed trends that were similar to those previously found in comparisons between whole-genome hybrids and parental species. CONCLUSIONS: Identification and subsequent confirmation of the HMS candidate gene will add another case study leading to understanding the evolutionary process of hybrid incompatibility.
Genetic diversity and population structure of Plasmodium vivax parasites can predict the origin and spread of novel variants within a population enabling population specific malaria control measures. We analyzed the genetic diversity and population structure of 425 P. vivax isolates from Sri Lanka, Myanmar, and Ethiopia using 12 trinucleotide and tetranucleotide microsatellite markers. All three parasite populations were highly polymorphic with 3-44 alleles per locus. Approximately 65% were multiple-clone infections. Mean genetic diversity (H(E)) was 0.7517 in Ethiopia, 0.8450 in Myanmar, and 0.8610 in Sri Lanka. Significant linkage disequilibrium was maintained. Population structure showed two clusters (Asian and African) according to geography and ancestry. Strong clustering of outbreak isolates from Sri Lanka and Ethiopia was observed. Predictive power of ancestry using two-thirds of the isolates as a model identified 78.2% of isolates accurately as being African or Asian. Microsatellite analysis is a useful tool for mapping short-term outbreaks of malaria and for predicting ancestry.
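Expected heterozygosity at a microsatellite locus is commonly computed from allele frequencies with a small-sample correction. The sketch below assumes that standard formula, H_E = [n / (n - 1)] * (1 - sum of squared allele frequencies); whether the study used exactly this estimator is not stated in the abstract. The function names and the example allele calls are hypothetical.

```python
from collections import Counter


def expected_heterozygosity(alleles):
    """Unbiased expected heterozygosity for one locus.

    `alleles` is a list with one allele call (for example, a repeat length)
    per isolate; at least two calls are required.
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two allele calls")
    counts = Counter(alleles)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return (n / (n - 1)) * (1.0 - sum_sq)


def mean_heterozygosity(loci):
    """Average the per-locus estimates across a list of loci."""
    per_locus = [expected_heterozygosity(calls) for calls in loci]
    return sum(per_locus) / len(per_locus)


# Made-up allele calls for two loci, for illustration only.
example_loci = [[120, 120, 124, 124], [200, 204, 208, 212]]
example_mean = mean_heterozygosity(example_loci)
```

Values close to 1, such as those reported for the three populations, indicate that two isolates drawn at random will usually carry different alleles at a locus.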
Differences in gene expression are thought to be an important source of phenotypic diversity, so dissecting the genetic components of natural variation in gene expression is important for understanding the evolutionary mechanisms that lead to adaptation. Gene expression is a complex trait that, in diploid organisms, results from transcription of both maternal and paternal alleles. Directly measuring allelic expression rather than total gene expression offers greater insight into regulatory variation. The recent emergence of high-throughput sequencing offers an unprecedented opportunity to study allelic transcription at a genomic scale for virtually any species. By sequencing transcript pools derived from heterozygous individuals, estimates of allelic expression can be directly obtained. The statistical power of this approach is influenced by the number of transcripts sequenced and the ability to unambiguously assign individual sequence fragments to specific alleles on the basis of transcribed nucleotide polymorphisms. Here, using mathematical modelling and computer simulations, we determine the minimum sequencing depth required to accurately measure relative allelic expression and detect allelic imbalance via high-throughput sequencing under a variety of conditions. We conclude that, within a species, a minimum of 500-1000 sequencing reads per gene is needed to test for allelic imbalance, and consequently, at least 5 to 10 million reads are required for studying a genome expressing 10 000 genes. Finally, using 454 sequencing, we illustrate an application of allelic expression by testing for cis-regulatory divergence between closely related Drosophila species.
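The link between read depth and the ability to detect allelic imbalance can be illustrated with an exact binomial power calculation. This is a simplified sketch, not the model used in the study: it assumes every read is unambiguously assigned to one allele, tests the null hypothesis of a 50:50 allelic ratio with a symmetric two-sided exact test, and spreads reads evenly across genes when converting per-gene depth to a total. The function names are hypothetical.

```python
import math


def binom_pmf(k, n, p):
    """Binomial probability of k successes in n trials, computed in log space."""
    log_p = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
             + k * math.log(p) + (n - k) * math.log(1 - p))
    return math.exp(log_p)


def imbalance_power(n_reads, allele_fraction, alpha=0.05):
    """Power to reject a 50:50 allelic ratio given the true allele fraction.

    `allele_fraction` must lie strictly between 0 and 1.
    """
    if not 0.0 < allele_fraction < 1.0:
        raise ValueError("allele_fraction must be strictly between 0 and 1")
    null = [binom_pmf(k, n_reads, 0.5) for k in range(n_reads + 1)]
    # Grow a symmetric rejection region inward from both extremes for as long
    # as its total probability under the null stays at or below alpha.
    lo, hi, tail = 0, n_reads, 0.0
    while lo < hi and tail + null[lo] + null[hi] <= alpha:
        tail += null[lo] + null[hi]
        lo += 1
        hi -= 1
    # The test rejects when the allele count is below lo or above hi.
    return sum(binom_pmf(k, n_reads, allele_fraction)
               for k in range(n_reads + 1) if k < lo or k > hi)


def total_reads(reads_per_gene, n_genes):
    """Total reads needed if every gene received the same number of reads."""
    return reads_per_gene * n_genes
```

With these definitions, `total_reads(500, 10_000)` and `total_reads(1000, 10_000)` reproduce the 5 to 10 million figure, and comparing `imbalance_power(100, 0.55)` with `imbalance_power(1000, 0.55)` shows how power to detect a modest imbalance grows with depth. Real expression levels are highly uneven across genes, so the even-coverage total is best read as a lower bound.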
Over the past decade, attempts to explain the unusual size and prevalence of low-complexity regions (LCRs) in the proteins of the human malaria parasite Plasmodium falciparum have used both neutral and adaptive models. This past research has offered conflicting explanations for LCR characteristics and their role in, and influence on, the evolution of genome structure. Here we show that P. falciparum LCRs (PfLCRs) are not a single phenomenon, but rather consist of at least three distinct types of sequence, and this heterogeneity is the source of the conflict in the literature. Using molecular and population genetics, we show that these families of PfLCRs are evolving by different mechanisms. One of these families, named here the HighGC family, is of particular interest because these LCRs act as recombination hotspots, both in genes under positive selection for high levels of diversity which can be created by recombination (antigens) and those likely to be evolving neutrally or under negative selection (metabolic enzymes). We discuss how the discovery of these distinct species of PfLCRs helps to resolve previous contradictory studies on LCRs in malaria and contributes to our understanding of the evolution of the parasite's unusual genome.
Selfish genes, such as meiotic drive elements, propagate themselves through a population without increasing the fitness of host organisms. X-linked (or Y-linked) meiotic drive elements reduce the transmission of the Y (X) chromosome and skew progeny and population sex ratios, leading to intense conflict among genomic compartments. Drosophila simulans is unusual in having at least three distinct systems of X chromosome meiotic drive. Here, we characterize naturally occurring genetic variation at the Winters sex-ratio driver (Distorter on the X or Dox), its progenitor gene (Mother of Dox or MDox), and its suppressor gene (Not Much Yang or Nmy), which have been previously mapped and characterized. We survey three North American populations as well as 13 globally distributed strains and present molecular polymorphism data at the three loci. We find that all three genes show signatures of selection in North America, judging from levels of polymorphism and skews in the site-frequency spectrum. These signatures likely result from the biased transmission of the driver and selection on the suppressor for the maintenance of equal sex ratios. Coalescent modeling indicates that the timing of selection is more recent than the age of the alleles, suggesting that the driver and suppressor are coevolving under an evolutionary "arms race." None of the Winters sex-ratio genes are fixed in D. simulans, and at all loci we find ancestral alleles, which lack the gene insertions and exhibit high levels of nucleotide polymorphism compared to the derived alleles. In addition, we find several "null" alleles that have mutations on the derived Dox background, which result in loss of drive function. We discuss the possible causes of the maintenance of presence-absence polymorphism in the Winters sex-ratio genes.
The Y chromosome, inherited without meiotic recombination from father to son, carries relatively few genes in most species. This is consistent with predictions from evolutionary theory that nonrecombining chromosomes lack variation and degenerate rapidly. However, recent work has suggested a dynamic role for the Y chromosome in gene regulation, a finding with important implications for spermatogenesis and male fitness. We studied Y chromosomes from two populations of Drosophila melanogaster that had previously been shown to have major effects on the thermal tolerance of spermatogenesis. We show that these Y chromosomes differentially modify the expression of hundreds of autosomal and X-linked genes. Genes showing Y-linked regulatory variation (YRV) also show an association with immune response and pheromone detection. Indeed, genes located proximal to the euchromatin-heterochromatin boundary of the X chromosome appear particularly responsive to Y-linked variation, including a substantial number of odorant-binding genes. Furthermore, the data show significant regulatory interactions between the Y chromosome and the genetic background of autosomes and X chromosome. Altogether, our findings support the view that interpopulation, Y-linked regulatory polymorphisms can differentially modulate the expression of many genes important to male fitness, and they also point to complex interactions between the Y chromosome and genetic background affecting global gene expression.