Evolutionary changes in gene expression underlie many aspects of phenotypic diversity within and among species. Understanding the genetic basis for evolved changes in gene expression is therefore an important component of a comprehensive understanding of the genetic basis of phenotypic evolution. Using interspecific introgression hybrids, we examined the genetic basis for divergence in genome-wide patterns of gene expression between Drosophila simulans and Drosophila mauritiana. We find that cis-regulatory and trans-regulatory divergences differ significantly in patterns of genetic architecture and evolution. The effects of cis-regulatory divergence are approximately additive in heterozygotes, quantitatively different between males and females, and well predicted by expression differences between the two parental species. In contrast, the effects of trans-regulatory divergence are associated with largely dominant introgressed alleles, have similar effects in the two sexes, and generate expression levels in hybrids outside the range of expression in both parental species. Although the effects of introgressed trans-regulatory alleles are similar in males and females, expression levels of the genes they regulate are sexually dimorphic between the parental D. simulans and D. mauritiana strains, suggesting that pure-species genotypes carry unlinked modifier alleles that increase sexual dimorphism in expression. Our results suggest that independent effects of cis-regulatory substitutions in males and females may favor their role in the evolution of sexually dimorphic phenotypes, and that trans-regulatory divergence is an important source of regulatory incompatibilities.
A combinatorially complete data set consists of studies of all possible combinations of a set of mutant sites in a gene or mutant alleles in a genome. Among the most robust conclusions from these studies is that epistasis between beneficial mutations often shows a pattern of diminishing returns, in which favorable mutations are less fit when combined than would be expected. Another robust inference is that the number of adaptive evolutionary paths is often limited to a relatively small fraction of the theoretical possibilities, owing largely to sign epistasis requiring evolutionary steps that would entail a decrease in fitness. Here we summarize these and other results while also examining issues that remain unresolved and future directions that seem promising.
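As a minimal sketch of how sign epistasis limits the number of adaptive paths, the Python snippet below counts the mutational orders along which fitness increases at every step in a combinatorially complete landscape. The function name `accessible_paths` and both three-site toy landscapes are invented for illustration and are not taken from any of the studies summarized.

```python
from itertools import permutations, product


def accessible_paths(fitness, n_sites):
    """Count orderings of n_sites mutations, from the all-0 genotype to the
    all-1 genotype, in which every single step strictly increases fitness."""
    count = 0
    for order in permutations(range(n_sites)):
        genotype = [0] * n_sites
        current = fitness[tuple(genotype)]
        for site in order:
            genotype[site] = 1
            nxt = fitness[tuple(genotype)]
            if nxt <= current:
                break  # this step is not uphill, so the path is inaccessible
            current = nxt
        else:
            count += 1  # loop finished without a break: every step was uphill
    return count


# Toy landscape 1 (invented values): each mutation adds 10 fitness units.
additive = {g: 100 + 10 * sum(g) for g in product((0, 1), repeat=3)}

# Toy landscape 2: identical, except that the mutation at site 0 is
# deleterious on the wild-type background although it is beneficial on
# other backgrounds (sign epistasis).
sign_epistatic = dict(additive)
sign_epistatic[(1, 0, 0)] = 90

print(accessible_paths(additive, 3))        # all 3! = 6 orders are uphill
print(accessible_paths(sign_epistatic, 3))  # orders starting at site 0 are blocked
```

Enumerating all orderings scales as n! and is practical only for the small numbers of sites typical of combinatorially complete data sets.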
Antifolate antimalarials, such as pyrimethamine, have experienced a dramatic reduction in therapeutic efficacy as resistance has evolved in multiple malaria species. We present evidence from one such species, Plasmodium vivax, which has experienced sustained selection for pyrimethamine resistance at the dihydrofolate reductase (DHFR) locus since the 1970s. Using a transgenic Saccharomyces cerevisiae model expressing the P. vivax DHFR enzyme, we assayed growth rate and resistance of all 16 combinations of four DHFR amino acid substitutions. These substitutions were selected based on their known association with drug resistance, both in natural isolates and in laboratory settings, in the related malaria species P. falciparum. We observed a strong correlation between the resistance phenotypes for these 16 P. vivax alleles and previously observed resistance data for P. falciparum, which was surprising since nucleotide diversity levels and common polymorphic variants of DHFR differ between the two species. Similar results were observed when we expressed the P. vivax alleles in a transgenic bacterial system. This suggests common constraints on enzyme evolution in the orthologous DHFR proteins. The interplay of negative trade-offs between the evolution of novel resistance and compromised endogenous function varies at different drug dosages, and so too do the major trajectories for DHFR evolution. In simulations, it is only at very high drug dosages that the most resistant quadruple mutant DHFR allele is favored by selection. This is in agreement with common polymorphic DHFR data in P. vivax, from which this quadruple mutant is missing. We propose that clinical dosages of pyrimethamine may have historically been too low to select for the most resistant allele, or that the fitness cost of the most resistant allele was untenable without a compensatory mutation elsewhere in the genome.
Chromatin remodeling is crucial for gene regulation. Remodeling is often mediated through chemical modifications of the DNA template, DNA-associated proteins, and RNA-mediated processes. Y-linked regulatory variation (YRV) refers to the quantitative effects that polymorphic tracts of Y-linked chromatin exert on gene expression of X-linked and autosomal genes. Here we show that naturally occurring polymorphisms in the Drosophila melanogaster Y chromosome contribute disproportionally to gene expression variation in the testis. The variation is dependent on wild-type expression levels of mod(mdg4) as well as Su(var)205; the latter gene codes for heterochromatin protein 1 (HP1) in Drosophila. Testis-specific YRV is abolished in genotypes with heterozygous loss-of-function mutations for mod(mdg4) and Su(var)205 but not in similar experiments with JIL-1. Furthermore, the Y chromosome differentially regulates several ubiquitously expressed genes. The results highlight the requirement for wild-type dosage of Su(var)205 and mod(mdg4) in enabling naturally occurring Y-linked regulatory variation in the testis. The phenotypes that emerge in the context of wild-type levels of the HP1 and Mod(mdg4) proteins might be part of an adaptive response to the environment.
The evolution of transcriptional regulatory networks entails the expansion and diversification of transcription factor (TF) families. The forkhead family of TFs, defined by a highly conserved winged helix DNA-binding domain (DBD), has diverged into dozens of subfamilies in animals, fungi, and related protists. We have used a combination of maximum-likelihood phylogenetic inference and independent, comprehensive functional assays of DNA-binding capacity to explore the evolution of DNA-binding specificity within the forkhead family. We present converging evidence that similar alternative sequence preferences have arisen repeatedly and independently in the course of forkhead evolution. The vast majority of DNA-binding specificity changes we observed are not explained by alterations in the known DNA-contacting amino acid residues conferring specificity for canonical forkhead binding sites. Intriguingly, we have found forkhead DBDs that retain the ability to bind very specifically to two completely distinct DNA sequence motifs. We propose an alternate specificity-determining mechanism whereby conformational rearrangements of the DBD broaden the spectrum of sequence motifs that a TF can recognize. DNA-binding bispecificity suggests a previously undescribed source of modularity and flexibility in gene regulation and may play an important role in the evolution of transcriptional regulatory networks.
The importance of epistasis--non-additive interactions between alleles--in shaping population fitness has long been a controversial topic, hampered in part by lack of empirical evidence. Traditionally, epistasis is inferred on the basis of non-independence of genotypic values between loci for a given trait. However, epistasis for fitness should also have a genomic footprint. To capture this signal, we have developed a simple approach that relies on detecting genotype ratio distortion as a sign of epistasis, and we apply this method to a large panel of Drosophila melanogaster recombinant inbred lines. Here we confirm experimentally that instances of genotype ratio distortion represent loci with epistatic fitness effects; we conservatively estimate that any two haploid genomes in this study are expected to harbour 1.15 pairs of epistatically interacting alleles. This observation has important implications for speciation genetics, as it indicates that the raw material to drive reproductive isolation is segregating contemporaneously within species and does not necessarily require, as proposed by the Dobzhansky-Muller model, the emergence of incompatible mutations independently derived and fixed in allopatry. The relevance of our result extends beyond speciation, as it demonstrates that epistasis is widespread but that it may often go undetected owing to lack of statistical power or lack of genome-wide scope of the experiments.
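One generic way to look for genotype ratio distortion between two unlinked loci in homozygous recombinant inbred lines is a chi-square test of independence on the 2 x 2 table of two-locus genotype counts. The sketch below is that textbook test, not necessarily the procedure used in the study; the function name `grd_chi_square` and the example counts are hypothetical.

```python
import math


def grd_chi_square(n_aa, n_ab, n_ba, n_bb):
    """Chi-square test of independence (1 degree of freedom) for a 2 x 2
    table of two-locus genotype counts. The first letter of each argument
    is the allele at locus 1, the second the allele at locus 2.
    Returns (chi-square statistic, p-value)."""
    observed = ((n_aa, n_ab), (n_ba, n_bb))
    rows = (n_aa + n_ab, n_ba + n_bb)   # allele totals at locus 1
    cols = (n_aa + n_ba, n_ab + n_bb)   # allele totals at locus 2
    total = sum(rows)
    if min(rows) == 0 or min(cols) == 0:
        raise ValueError("each allele must be observed at least once")
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / total  # count expected if independent
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: an excess of the two "matched" genotype classes.
print(grd_chi_square(40, 10, 10, 40))
```

A genome-wide scan runs this test on many locus pairs, so the resulting p-values would need a correction for multiple testing.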
Using parasite genotyping tools, we screened patients with mild uncomplicated malaria seeking treatment at a clinic in Thies, Senegal, from 2006 to 2011. We identified a growing frequency of infections caused by genetically identical parasite strains, coincident with increased deployment of malaria control interventions and decreased malaria deaths. Parasite genotypes in some cases persisted clonally across dry seasons. The increase in frequency of genetically identical parasite strains corresponded with a decrease in the probability of multiple infections. Further, these observations support evidence of both clonal and epidemic population structures. These data provide the first evidence of a temporal correlation between the appearance of identical parasite types and increased malaria control efforts in Africa, which here included distribution of insecticide-treated nets (ITNs), use of rapid diagnostic tests (RDTs) for malaria detection, and deployment of artemisinin combination therapy (ACT). Our results imply that genetic surveillance can be used to evaluate the effectiveness of disease control strategies and assist a rational global malaria eradication campaign.
Analysis of genome sequences of 159 isolates of Plasmodium falciparum from Senegal yields an extraordinarily high proportion (26.85%) of protein-coding genes with the ratio of nonsynonymous to synonymous polymorphism greater than one. This proportion is much greater than observed in other organisms. Also unusual is that the site-frequency spectra of synonymous and nonsynonymous polymorphisms are virtually indistinguishable. We hypothesized that the complicated life cycle of malaria parasites might lead to qualitatively different population genetics from that predicted from the classical Wright-Fisher (WF) model, which assumes a single random-mating population with a finite and constant population size in an organism with nonoverlapping generations. This paper summarizes simulation studies of random genetic drift and selection in malaria parasites that take into account their unusual life history. Our results show that random genetic drift in the malaria life cycle is more pronounced than under the WF model. Paradoxically, the efficiency of purifying selection in the malaria life cycle is also greater than under WF, and the relative efficiency of positive selection varies according to conditions. Additionally, the site-frequency spectrum under neutrality is also more skewed toward low-frequency alleles than expected with WF. These results highlight the importance of considering the malaria life cycle when applying existing population genetic tools based on the WF model. The same caveat applies to other species with similarly complex life cycles.
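For reference, the classical Wright-Fisher baseline against which the malaria life cycle is compared can be sketched in a few lines. The snippet below simulates one haploid population of constant size with nonoverlapping generations and an optional selection coefficient; it does not model the malaria life cycle itself, and the function name and parameter values are illustrative assumptions.

```python
import random


def wright_fisher(n_copies, p0, generations, s=0.0, seed=None):
    """Simulate the frequency of one allele under the haploid Wright-Fisher
    model. n_copies is the constant number of gene copies, p0 the starting
    frequency, and s the selection coefficient (s = 0 is pure drift).
    Returns the list of frequencies, stopping early at loss or fixation."""
    rng = random.Random(seed)
    p = p0
    trajectory = [p]
    for _ in range(generations):
        # Deterministic change due to selection (relative fitness 1 + s).
        p_sel = p * (1.0 + s) / (p * (1.0 + s) + (1.0 - p))
        # Random drift: binomial sampling of n_copies offspring copies.
        count = sum(rng.random() < p_sel for _ in range(n_copies))
        p = count / n_copies
        trajectory.append(p)
        if p == 0.0 or p == 1.0:
            break  # allele lost or fixed
    return trajectory


# Example: neutral drift in a small population, fixed seed for repeatability.
path = wright_fisher(n_copies=50, p0=0.5, generations=200, seed=1)
print(len(path) - 1, "generations simulated; final frequency", path[-1])
```

Repeating the simulation many times and comparing fixation probabilities or times with those from a life-cycle model is one way to quantify departures from WF expectations.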
The Drosophila Y chromosome is a degenerated, heterochromatic chromosome with few functional genes. Despite this, natural variation on the Y chromosome in D. melanogaster has substantial trans-acting effects on the regulation of X-linked and autosomal genes. It is not clear, however, whether these genes simply represent a random subset of the genome or whether specific functional properties are associated with susceptibility to regulation by Y-linked variation. Here, we present a meta-analysis of four previously published microarray studies of Y-linked regulatory variation (YRV) in D. melanogaster. We show that YRV genes are far from a random subset of the genome: They are more likely to be in repressive chromatin contexts, be expressed tissue specifically, and vary in expression within and between species than non-YRV genes. Furthermore, YRV genes are more likely to be associated with the nuclear lamina than non-YRV genes and are generally more likely to be close to each other in the nucleus (although not along chromosomes). Taken together, these results suggest that variation on the Y chromosome plays a role in modifying how the genome is distributed across chromatin compartments, either via changes in the distribution of DNA-binding proteins or via changes in the spatial arrangement of the genome in the nucleus.
X-linked sex-ratio distorters that disrupt spermatogenesis can cause a deficiency in functional Y-bearing sperm and a female-biased sex ratio. Y-linked modifiers that restore a normal sex ratio might be abundant and favored when an X-linked distorter is present. Here we investigated natural variation of Y-linked suppressors of sex-ratio in the Winters systems and the ability of these chromosomes to modulate gene expression in Drosophila simulans. Seventy-eight Y chromosomes of worldwide origin were assayed for their resistance to the X-linked sex-ratio distorter gene Dox. Y chromosome diversity caused males to sire approximately 63% to approximately 98% female progeny. Genome-wide gene expression analysis revealed hundreds of genes differentially expressed between isogenic males with sensitive (high sex ratio) and resistant (low sex ratio) Y chromosomes from the same population. Although the expression of about 75% of all testis-specific genes remained unchanged across Y chromosomes, a subset of post-meiotic genes was upregulated by resistant Y chromosomes. Conversely, a set of accessory gland-specific genes and mitochondrial genes were downregulated in males with resistant Y chromosomes. The D. simulans Y chromosome also modulated gene expression in XXY females in which the Y-linked protein-coding genes are not transcribed. The data suggest that the Y chromosome might exert its regulatory functions through epigenetic mechanisms that do not require the expression of protein-coding genes. The gene network that modulates sex ratio distortion by the Y chromosome is poorly understood, other than that it might include interactions with mitochondria and is enriched for genes expressed in post-meiotic stages of spermatogenesis.
We show that the genomes of maize, sorghum, and brachypodium contain genes that, when transformed into rice, confer resistance to rice blast disease. The genes are resistance genes (R genes) that encode proteins with nucleotide-binding site (NBS) and leucine-rich repeat (LRR) domains (NBS-LRR proteins). By using criteria associated with rapid molecular evolution, we identified three rapidly evolving R-gene families in these species as well as in rice, and transformed a randomly chosen subset of these genes into rice strains known to be sensitive to rice blast disease caused by the fungus Magnaporthe oryzae. The transformed strains were then tested for sensitivity or resistance to 12 diverse strains of M. oryzae. A total of 15 functional blast R genes were identified among 60 NBS-LRR genes cloned from maize, sorghum, and brachypodium; and 13 blast R genes were obtained from 20 NBS-LRR paralogs in rice. These results show that abundant blast R genes occur not only within species but also among species, and that the R genes in the same rapidly evolving gene family can exhibit an effector response that confers resistance to rapidly evolving fungal pathogens. Neither conventional evolutionary conservation nor conventional evolutionary convergence supplies a satisfactory explanation of our findings. We suggest a unique mechanism termed "constrained divergence," in which R genes and pathogen effectors can follow only limited evolutionary pathways to increase fitness. Our results open avenues for R-gene identification that will help to elucidate R-gene vs. effector mechanisms and may yield new sources of durable pathogen resistance.
Success of the global research agenda toward eradication of malaria will depend on development of new tools, including drugs, vaccines, insecticides and diagnostics. Genomic information, now available for the malaria parasites, their mosquito vectors, and human host, can be leveraged to both develop these tools and monitor their effectiveness. Although knowledge of genomic sequences for the malaria parasites, Plasmodium falciparum and Plasmodium vivax, has helped advance our understanding of malaria biology, simply knowing this sequence information has not yielded a plethora of new interventions to reduce the burden of malaria. Here we review and provide specific examples of how genomic information has increased our knowledge of parasite biology, focusing on P. falciparum malaria. We then discuss how population genetics can be applied toward the epidemiological and transmission-related goals outlined by the International Centers of Excellence for Malaria Research (ICEMR) groups recently established by the National Institutes of Health. Finally, we propose that genomics is a research area that can promote coordination and collaboration between various ICEMR groups, and that working together as a community can significantly advance the value of this information toward reduction of the global malaria burden.
Chimeric genes form through the combination of portions of existing coding sequences to create a new open reading frame. These new genes can create novel protein structures that are likely to serve as a strong source of novelty upon which selection can act. We have identified 14 chimeric genes that formed through DNA-level mutations in Drosophila melanogaster, and we investigate expression profiles, domain structures, and population genetics for each of these genes to examine their potential to effect adaptive evolution. We find that chimeric gene formation commonly produces mid-domain breaks and unites portions of wholly unrelated peptides, creating novel protein structures that are entirely distinct from other constructs in the genome. These new genes are often involved in selective sweeps. We further find a disparity between chimeric genes that have recently formed and swept to fixation versus chimeric genes that have been preserved over long periods of time, suggesting that preservation and adaptation are distinct processes. Finally, we demonstrate that chimeric gene formation can produce qualitative expression changes that are difficult to mimic through duplicate gene formation, and that extremely young chimeric genes (d(S) < 0.03) are more likely to be associated with selective sweeps than duplicate genes of the same age. Hence, chimeric genes can serve as an exceptional source of genetic novelty that can have a profound influence on adaptive evolution in D. melanogaster.
Antibiotic resistance can evolve through the sequential accumulation of multiple mutations. To study such gradual evolution, we developed a selection device, the 'morbidostat', that continuously monitors bacterial growth and dynamically regulates drug concentrations, such that the evolving population is constantly challenged. We analyzed the evolution of resistance in Escherichia coli under selection with single drugs, including chloramphenicol, doxycycline and trimethoprim. Over a period of approximately 20 days, resistance levels increased dramatically, with parallel populations showing similar phenotypic trajectories. Whole-genome sequencing of the evolved strains identified mutations both specific to resistance to a particular drug and shared in resistance to multiple drugs. Chloramphenicol and doxycycline resistance evolved smoothly through diverse combinations of mutations in genes involved in translation, transcription and transport. In contrast, trimethoprim resistance evolved in a stepwise manner, through mutations restricted to the gene encoding the enzyme dihydrofolate reductase (DHFR). Sequencing of DHFR over the time course of the experiment showed that parallel populations evolved similar mutations and acquired them in a similar order.
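The feedback idea behind such a device can be illustrated with a simple decision rule: after each growth measurement, dilute the culture with drug-containing medium only when the population is both dense and still growing, and with plain medium otherwise. The sketch below is a hypothetical illustration of that idea; the function name, the optical-density threshold, and the exact rule are assumptions rather than details given in the abstract.

```python
def choose_dilution(od_now, od_prev, od_threshold=0.15):
    """Pick the medium for the next dilution step.

    od_now and od_prev are consecutive optical-density readings of the
    culture; od_threshold is an assumed density above which the population
    is considered to be thriving. Returns "drug" or "plain"."""
    growing = od_now > od_prev
    if od_now > od_threshold and growing:
        return "drug"   # population is thriving: raise the drug concentration
    return "plain"      # population is inhibited: let the drug dilute out


# Hypothetical sequence of optical-density readings.
readings = [0.05, 0.10, 0.18, 0.22, 0.20, 0.17, 0.19]
for prev, now in zip(readings, readings[1:]):
    print(f"OD {prev:.2f} -> {now:.2f}: dilute with {choose_dilution(now, prev)} medium")
```

Because drug is added only while the population keeps growing, the effective concentration tracks the population's current resistance level, which is what keeps selection pressure constant.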
In many species, both morphological and molecular traits related to sex and reproduction evolve faster in males than in females. Ultimately, rapid male evolution relies on the acquisition of genetic variation associated with differential reproductive success. Many newly evolved genes are associated with novel functions that might enhance male fitness. However, functional evidence of the adaptive role of recently originated genes in males is still lacking. The Sperm dynein intermediate chain multigene family, which encodes a Sperm dynein intermediate chain presumably involved in sperm motility, originated from complex genetic rearrangements in the lineage that leads to Drosophila melanogaster within the last 5.4 million years since its split from Drosophila simulans. We deleted all the members of this multigene family resident on the X chromosome of D. melanogaster by chromosome engineering and found that, although the deletion does not result in a reduction of progeny number, it impairs the competence of the sperm in the presence of sperm from wild-type males. Therefore, the Sperm dynein intermediate chain multigene family contributes to the differential reproductive success among males and illustrates precisely how quickly a new gene function can be incorporated into the genetic network of a species.
Malaria is a deadly disease that causes nearly one million deaths each year. To develop methods to control and eradicate malaria, it is important to understand the genetic basis of Plasmodium falciparum adaptations to antimalarial treatments and the human immune system while taking into account its demographic history. To study the demographic history and identify genes under selection more efficiently, we sequenced the complete genomes of 25 culture-adapted P. falciparum isolates from three sites in Senegal. We show that there is no significant population structure among these Senegal sampling sites. By fitting demographic models to the synonymous allele-frequency spectrum, we also estimated a major 60-fold population expansion of this parasite population approximately 20,000-40,000 years ago. Using inferred demographic history as a null model for coalescent simulation, we identified candidate genes under selection, including genes identified before, such as pfcrt and PfAMA1, as well as new candidate genes. Interestingly, we also found selection against G/C to A/T changes that offsets the large mutational bias toward A/T, and two unusual patterns: similar synonymous and nonsynonymous allele-frequency spectra, and 18% of genes having a nonsynonymous-to-synonymous polymorphism ratio >1.
The study of the evolution of novel genes generally focuses on the formation of new coding sequences. However, equally important in the evolution of novel functional genes are the formation of regulatory regions that allow the expression of the genes and the effects of the new genes in the organism as well. Herein, we discuss the current knowledge on the evolution of novel functional genes, and we examine in more detail the youngest genes discovered. We examine the existing data on a very recent and rapidly evolving cluster of duplicated genes, the Sdic gene cluster. This cluster of genes is an excellent model for the evolution of novel genes, as it is very recent and may still be in the process of evolving.
Chromosomal inversions have been an enduring interest of population geneticists since their discovery in Drosophila melanogaster. Numerous lines of evidence suggest powerful selective pressures govern the distributions of polymorphic inversions, and these observations have spurred the development of many explanatory models. However, due to a paucity of nucleotide data, little progress has been made towards investigating selective hypotheses or towards inferring the genealogical histories of inversions, which can inform models of inversion evolution and suggest selective mechanisms. Here, we utilize population genomic data to address persisting gaps in our knowledge of D. melanogaster's inversions. We develop a method, termed Reference-Assisted Reassembly, to assemble unbiased, highly accurate sequences near inversion breakpoints, which we use to estimate the age and the geographic origins of polymorphic inversions. We find that inversions are young, and most are African in origin, which is consistent with the demography of the species. The data suggest that inversions interact with polymorphism not only in breakpoint regions but also chromosome-wide. Inversions remain differentiated at low levels from standard haplotypes even in regions that are distant from breakpoints. Although genetic exchange appears fairly extensive, we identify numerous regions that are qualitatively consistent with selective hypotheses. Finally, we show that In(1)Be, which we estimate to be approximately 60 years old (95% CI 5.9 to 372.8 years), has likely achieved high frequency via sex-ratio segregation distortion in males. With deeper sampling, it will be possible to build on our inferences of inversion histories to rigorously test selective models, particularly those that postulate that inversions achieve a selective advantage through the maintenance of co-adapted allele complexes.
Through rapid genetic adaptation and natural selection, the Plasmodium falciparum parasite--the deadliest of those that cause malaria--is able to develop resistance to antimalarial drugs, thwarting present efforts to control it. Genome-wide association studies (GWAS) provide a critical hypothesis-generating tool for understanding how this occurs. However, in P. falciparum, the limited amount of linkage disequilibrium hinders the power of traditional array-based GWAS. Here, we demonstrate the feasibility and power improvements gained by using whole-genome sequencing for association studies. We analyzed data from 45 Senegalese parasites and identified genetic changes associated with the parasites' in vitro response to 12 different antimalarials. To further increase statistical power, we adapted a common test for natural selection, XP-EHH (cross-population extended haplotype homozygosity), and used it to identify genomic regions associated with resistance to drugs. Using this sequence-based approach and the combination of association and selection-based tests, we detected several loci associated with drug resistance. These loci included the previously known signals at pfcrt, dhfr, and pfmdr1, as well as many genes not previously implicated in drug-resistance roles, including genes in the ubiquitination pathway. Based on the success of the analysis presented in this study, and on the demonstrated shortcomings of array-based approaches, we argue for a complete transition to sequence-based GWAS for small, low linkage-disequilibrium genomes like that of P. falciparum.
Although the Drosophila Y chromosome is degenerated, heterochromatic, and contains few genes, increasing evidence suggests that it plays an important role in regulating the expression of numerous autosomal and X-linked genes. Here we use 15 Y chromosomes originating from a single founder 550 generations ago to study the role of the Y chromosome in regulating rRNA gene transcription, position-effect variegation (PEV), and the link among rDNA copy number, global gene expression, and chromatin regulation. Based on patterns of rRNA gene transcription indicated by transcription of the retrotransposon R2 that specifically inserts into the 28S rRNA gene, we show that X-linked rDNA is silenced in males. The silencing of X-linked rDNA expression by the Y chromosome is consistent across populations and independent of genetic background. These Y chromosomes also vary more than threefold in rDNA locus size and cause dramatically different levels of PEV suppression. The degree of suppression is negatively associated with the number and fraction of rDNA units without transposon insertions, but not with total rDNA locus size. Gene expression profiling revealed hundreds of differentially expressed genes among these Y chromosome introgression lines, as well as a divergent global gene expression pattern between the low-PEV and high-PEV flies. Our findings suggest that the Y chromosome is involved in diverse phenomena related to transcriptional regulation including X-linked rDNA silencing and suppression of PEV phenotype. These results further expand our understanding of the role of the Y chromosome in modulating global gene expression, and suggest a link with modifications of the chromatin state.