University of Cambridge
The structure of archaic admixture revealed by modern human genomes
Genetic evidence has shown that modern humans interbred with at least two groups of archaic humans, the Neanderthals and the Denisovans. The admixture process in relation to the expansion of modern humans out of Africa, however, is not well understood, since only a few major population groups have been extensively sampled. Using over 900 high-coverage genomes in the HGDP panel, we compared the distribution and diversity of archaic segments across 53 populations worldwide. We found that Neanderthal segments from all non-African populations appear largely homogeneous after accounting for the recent demographic history of modern human populations, consistent with a single admixture event that happened before they diverged from each other; in contrast, a distinct separation in genomic location and haplotype structure exists between Denisova segments recovered from Melanesia and those from mainland Eurasia. Furthermore, the Denisova haplotypes in East Asia cannot be explained by a single source of gene flow. Therefore we propose that more than one episode of admixture with genetically distinct Denisova groups occurred in the ancestral population of present-day East Asian, South Asian and American populations, and that another admixture event with the Denisova population occurred in the ancestors of Melanesians after their separation from the other Eurasian populations.
University of Cambridge
Convergent evolution of a cystic fibrosis pathogen within and between patients
Lung infections with Mycobacterium abscessus have increased in frequency worldwide, emerging as an important global threat to individuals with cystic fibrosis (CF), in whom they cause accelerated inflammatory lung damage and death. M. abscessus was previously thought to be independently acquired by susceptible individuals from the environment. However, using whole genome sequencing and epidemiological analysis of a global collection of 1,080 clinical isolates from 517 CF patients, we found strong evidence of frequent patient-to-patient spread.
Within patients we found a high level of genomic diversity, to which we applied haplotype reconstruction to recapitulate the evolutionary history of subclones circulating within patients. The pattern of subclonal evolution revealed convergent evolution of 18 virulence-associated genes both within and between patients. Strikingly, the most highly evolved genes were either global transcriptional regulators that respond to environmental stimuli within the host or antibiotic targets. Moreover, we identified hypermutable strains in several patients, which accelerated this evolutionary process, thus demonstrating the propensity of M. abscessus to rapidly evolve from an environmental organism into a transmissible human pathogen.
University of Cambridge
Variation of colour patterning within and between species of cichlid fishes
The genetic basis of the emergence and maintenance of morphological variation in natural populations remains largely unknown. We address this question in a highly diverse vertebrate model system, cichlid fishes. We specifically focus on variation in a set of brightly pigmented egg-spots on male anal fins that play a key role in the territorial and breeding behaviour of around 1,500 species of cichlids. Using intra- and inter-specific genomic comparisons we identified loci associated with egg-spot number variation. Interestingly, the loci associated with variation within species do not overlap with those identified by interspecific mapping, suggesting that in this system variation within populations does not contribute to variation between species. The identified loci are known to be involved in the physiology and development of pigment cells. Here we show the progress of the genetic and developmental dissection of candidate gene function and trait ontogeny.
University of Sussex
The epigenetic and evolutionary impact of gene capture by TEs on donor genes in maize
Transposable Elements (TEs) are known to regularly capture fragments of host genes. The functional role and evolutionary fate of these captured fragments have been intensively studied to determine whether this is, for example, a main route for the formation of new genes. In contrast, the effect of the capturing process on the donor genes remains poorly understood. To shed light on this side of the story, we identified captured exon fragments within three TE families in maize, i.e. Helitrons, Pack-MULEs and Sirevirus LTR retrotransposons. With 2,632 donor genes discovered, and after finding that the sequences of the TE and captured fragments are highly methylated, we hypothesized that the methylation status of donor genes may be affected in trans by RNA-directed DNA methylation (RdDM). Indeed, donor genes are biased towards having high levels of body (i.e. exonic) methylation compared to “free” (i.e. uncaptured) genes. We also found that donor genes are targeted by more siRNAs than free genes, with a higher than expected proportion of these siRNAs also mapping to the captured fragment within the TE, which is indicative of epigenetic cross-talk between TEs and donor genes. A series of observations suggests that donor genes are not pseudogenes. They contain on average more exons and produce longer transcripts than free genes, and they are equally or more expressed compared to free genes across a range of tissues, while similar proportions of both gene sets have detailed functional annotation in the maize v4 genome and have orthologous genes in Sorghum and other species. One intriguing difference is that a smaller proportion of donor genes (57%) than of free genes (79%) is located in syntenic positions in relation to Sorghum. Overall, our analysis provides evidence that links gene-body methylation with TE capture, with additional implications of this mechanism, and possibly of TE mobility, for gene movement and the breakage of synteny.
University of Cambridge
Emergence of a floral colour polymorphism by pollinator-mediated overdominance
Understanding the emergence of stable phenotypic differences in and between natural populations, eventually leading to novel species, is a central aspect of evolutionary biology. Systems with polymorphisms in fitness-relevant traits such as coloration allow studying the molecular mechanisms underlying speciation. In 1955, Dobzhansky stated that polymorphisms evolve mainly due to overdominance, i.e. a higher fitness of the intermediate, heterozygous morph. However, half a century later, there is still very little unambiguous evidence for the action of overdominance in natural populations.
A southern population of the Alpine orchid Gymnadenia rhellicani contains a striking but as yet unstudied polymorphism with a dark, intermediate, and bright floral colour morph. We have applied an eco-evo-devo approach to fully reconstruct the developmental genetic causes and the ecological consequences of this polymorphism. A combination of phenotypic, metabolomic and transcriptomic analyses showed that the morphs differ solely in the concentration of two cyanidin pigments, which is linked to differential expression of an anthocyanidin synthase (ANS) gene. Transcriptome-wide association mapping further identified a single nucleotide polymorphism (SNP) that is heterozygous in the intermediate morph. This SNP results in a premature stop codon in an ANS-regulating R2R3-MYB transcription factor. Finally, field observations revealed that bee and fly pollinators exert opposite directional selection on flower colour, together maximising seed set in the intermediate, heterozygous morph.
Altogether, these findings (1) provide clear and complete evidence for Dobzhansky’s statement that polymorphisms can evolve by overdominant selection at a single locus, and (2) demonstrate the usefulness of combining analyses at the genomic, metabolomic, phenotypic, and ecological levels to understand biological phenomena in non-model organisms.
University of Cambridge
Wing pheromones in Heliconius butterflies: physiology, behavior, and genetics
Butterflies in the genus Heliconius (Nymphalidae) have been extensively studied as key examples of speciation, mate choice, and Müllerian mimicry via their bright colour patterns. Male chemical signaling via wing pheromones has recently been demonstrated in H. melpomene and shown to have an effect on female choice. Although the composition of these pheromones is known, exactly which components of the signal are key to female preference – and the genetic basis of their production – is still unknown. Our understanding of chemical signaling in reproductive isolation between Heliconius species is still in its infancy. Using electroantennography, I examined responses of virgin female H. melpomene and H. cydno to natural and synthetic male pheromones of both species, as well as to individual dominant compounds. Surprisingly, H. melpomene and H. cydno females both react more strongly to wing pheromones from H. cydno. Of the major components of the wing pheromone in both species, only octadecanal (26% of the H. melpomene male pheromone) provoked a significant response in both species, despite its absence in H. cydno wing pheromones. When female H. melpomene were presented with a choice between a control male and one augmented with additional octadecanal, they showed a slight preference for the control male, and also waited twice as long before mating if they chose the octadecanal-augmented male. We are currently analyzing data from 219 backcross individuals to locate QTL for octadecanal production. Preliminary analysis shows a locus on chromosome 20 with no linkage to the known major wing colour genes. This work is the first to show female physiological responses to male wing pheromones in Heliconius, and argues for their importance in both female mate choice and reproductive isolation in combination with other mate choice signals. It also highlights the potential role of octadecanal, a relatively simple pheromone component, in mate choice and reproductive isolation.
University of Jyväskylä
A wide pleiotropic effect of melanin pathway genes on behavior and life-history traits in Drosophila montana
Pigmentation is one of the most variable traits in insects and is largely associated with the melanin biosynthesis pathway. A wide range of phenotypic traits, such as immunity, body size and behavior, is proposed to correlate with melanism, possibly via the diverse functions of dopamine metabolism. However, many studies lack definitive proof that a change in a specific phenotypic trait is caused by a change in a melanin pathway gene. We have investigated three key genes in the melanin pathway, yellow, ebony and tan, by inducing mutations in them using CRISPR/Cas9 gene editing and by tracing changes in phenotypic traits associated with life history and mating behavior in an ecologically interesting species, Drosophila montana. In addition, we have investigated the cuticular hydrocarbon profile and cuticle structure of the flies to reveal possible structural causes of the detected phenotypic changes. Our results demonstrate how a single key gene in an important pathway can have a major effect on a wide range of adaptive phenotypes, and deepen our understanding of how pleiotropy shapes important evolutionary traits.
Wellcome Sanger Institute
Single-cell profiling of lymphocyte somatic mutations reveals divergence from stem cells, variation in mutation burden, and strong microenvironmental effects
In somatic evolution, human cells accumulate mutations with time and cell division. Lymphocytes, blood cells of the adaptive immune system, are produced from an ever-evolving pool of hematopoietic stem cells (HSCs), and diverge from the HSCs functionally and genetically. Our understanding of the mutations that arise in these cells is dominated by cancer samples, where clonal expansions allow for easy identification of somatic mutations. In order to understand normal somatic evolution of the human immune system, we perform single-cell expansion and whole genome sequencing of T and B lymphocytes. We determine per-cell mutation burdens and profiles, and identify lymphocyte-specific patterns of mutations. We find a striking variance in mutation burden across cells, associated with environment-specific mutational processes. Two examples of this are germinal center (and AID activity) mutations in memory B cells, and UV-like mutations in memory T cells (putatively skin-resident cells). The timing of the divergence of T lymphocytes from the HSC pool can also be estimated from T cell-specific mutational patterns, which accumulate at a rate associated with telomere shortening and, by proxy, cell replication. This work highlights an under-appreciated genetic diversity in normal lymphocytes, with some cells accumulating thousands of mutations in a relatively small number of cell divisions, and with some of the lymphocyte-specific mutational processes elucidated by the observed patterns of mutations. This genetic diversity may have profound effects on the evolution and function of the immune system with age.
University of Cambridge
Altitude shapes local adaptation in Heliconius butterflies
Heliconius butterflies have long been studied for their Müllerian mimicry and are one of the most widely distributed genera of butterflies in the Neotropics. As such, they range across a vast number of habitats and climates, with some species specialised to high-altitude environments in the Andes, and others, such as H. erato, found in continuous populations from 0 to 1600 m above sea level. Their dazzling diversity in colour patterns has perhaps obscured the less conspicuous variation in wing shapes across their range. Here I will present a first insight into the traits associated with altitudinal adaptation, dissect the selection pressures shaping these traits, and discuss the genomics putatively underlying them. With a wild collection of over 3000 individuals, common-garden rearing experiments, and whole genomes from 300 wild H. erato, we show that wings are rounder at high elevations, both within and across species, that wing shape is heritable, and that there are regions of the genome repeatedly diverging between altitudinally structured populations. Our results reveal tractable traits involved in local adaptation to altitude and make this an exciting new avenue for Heliconius research.
Cancer Research UK Cambridge
Developing and validating a predictive platform of evolutionary trajectories and drug responses
Breast cancer is a group of different diseases, displaying both inter- and intra-tumour heterogeneity. Our group has developed one of the largest and most comprehensive molecularly annotated biobanks of breast cancer patient-derived tumour xenograft (PDTX) models, which retain most of the originating cancer’s heterogeneity. PDTXs also capture the diversity of inter-patient drug responses seen in the clinic. We aim to expand on these observations and explore the use of PDTXs as mouse avatars alongside an ongoing clinical trial in the breast cancer neo-adjuvant setting, to support the development of a platform that predicts evolutionary trajectories of drug responses with high clinical predictive power. To develop and validate the predictive trajectory platform, we engraft breast cancer samples from treatment-naïve breast cancer patients (n=6) to generate matched PDTX mice. Through parallel passaging, we expand a given patient’s sample into sister mice to test and evaluate the effects of a number of therapeutic strategies on that patient’s tumour. We initially aim to evaluate drug responses in avatar mice (PDTXs enrolled in the “mirror” arm, i.e. the same treatment arm as the patient) and in the “alternative” mice (PDTXs enrolled in the other arm of the trial). We further test other clinically relevant therapeutic strategies in vivo in PDTXs derived from the same patient’s sample. Drug response data are integrated with state-of-the-art genomic, phenotypic and functional data from multi-region and bulk shallow whole genome and whole exome sequencing, bulk and single-cell RNA sequencing, and ex vivo high-throughput drug response data from short-term cultures of PDTX cells. Here we propose the development and validation of a preclinical/clinical integrative platform that simulates clinical drug responses, enabling the study of patient-specific evolutionary trajectories and paving the way towards a platform to help guide personalised clinical decision making.
University of Cambridge
Defining the evolutionary dynamics of clonal haematopoiesis
Somatic mutations acquired in healthy tissues as we age are major determinants of cancer risk. Our understanding of how evolutionary forces shape the acquisition and expansion of somatic clones, however, remains cursory. Here, by combining blood sequencing data from ~50,000 individuals, we reveal how mutation, selection and genetic drift combine to determine the clonal diversity of healthy blood (‘clonal haematopoiesis’). We find that chance differences in the timing of mutation acquisition combined with strong positive selection are the major determinants of the wide variation in clone size observed across individuals. We infer the spectrum of fitness effects of mutations in key blood cancer-associated genes including DNMT3A, TET2, JAK2 and spliceosome genes, providing quantitative measures of pathogenicity. Analysis of neutral ‘hitchhiker’ mutations provides evidence for thousands of mutations genome-wide driving clonal haematopoiesis which confer fitness advantages >10% per year. These mutations, if acquired early in life, overwhelm the bone marrow by age 75. Contrary to the widely held view that clonal haematopoiesis is driven by ageing-related alterations in the stem cell niche, we show that the majority of the ‘age-dependence’ of clonal haematopoiesis is consistent with clones growing and becoming increasingly detectable over time.
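As a rough illustration of why a fixed yearly fitness advantage makes the timing of mutation acquisition so important, the sketch below assumes a clone that grows deterministically and exponentially from a single mutant stem cell in a stem cell pool of fixed size. This is a simplified toy model and not the authors' inference framework; the function names, the deterministic growth law and the pool size argument are all assumptions introduced here for illustration.

```python
import math


def clone_fraction(s_per_year, years, n_stem_cells):
    """Fraction of the stem cell pool occupied by a clone after `years`.

    Assumption: the clone starts as one cell and grows as exp(s * t),
    capped at the whole pool. Drift and competition are ignored.
    """
    if n_stem_cells <= 0:
        raise ValueError("n_stem_cells must be positive")
    cells = math.exp(s_per_year * years)
    return min(1.0, cells / n_stem_cells)


def years_to_fraction(s_per_year, target_fraction, n_stem_cells):
    """Years needed for a single-cell clone to reach `target_fraction`.

    Solves exp(s * t) = target_fraction * n_stem_cells for t.
    """
    if s_per_year <= 0:
        raise ValueError("s_per_year must be positive")
    if not 0 < target_fraction <= 1:
        raise ValueError("target_fraction must be in (0, 1]")
    target_cells = target_fraction * n_stem_cells
    if target_cells <= 1:
        # A single founding cell already meets the target.
        return 0.0
    return math.log(target_cells) / s_per_year
```

In this toy model the waiting time scales as 1/s, so doubling the fitness advantage halves the time a clone needs to reach a given size, and a mutation acquired a decade earlier has a decade more exponential growth behind it at any given age.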
Wellcome Sanger Institute
The genomic and evolutionary landscape of normal human endometrial epithelium
Human endometrium is a highly dynamic tissue that undergoes numerous cycles of breakdown, repair and remodelling in response to the oscillating levels of oestrogen and progesterone. The marked regenerative capacity of the tissue’s epithelial compartment is maintained by the intra-glandular adult stem cells (ASCs) that reside in the stratum basalis retained during menstruation. Although the epithelial ASCs were first described over a decade ago, they remain relatively poorly characterised in comparison to their counterparts in other tissues. In particular, the size of the stem cell pool, clonal dynamics, rates of division and genomic landscape are largely unknown. Here, we laser-capture micro-dissected and whole genome sequenced 215 normal endometrial glands from pre- and post-menopausal women. Analysis of the WGS data showed that the majority of the glands were clonal cell populations sharing a recent common ancestor. Somatic mutations accumulated at a rate of 28 substitutions per year (P = 1.061e-07). Elevated body mass index (BMI), a known endometrial cancer risk factor, accelerated the rate of mutation acquisition. Total mutation burden in normal endometrial epithelium is higher than in other normal cells and manyfold lower than in endometrial cancers. Despite the heterogeneity in age, reproductive history and BMI, we found relatively homogeneous mutational processes to be operative in this tissue, with only occasional outliers. Remarkably, we not only identify recurrent acquisitions of certain cancer-associated mutations but also show that such events occur early in life, potentially before adolescence. Over time, the mutant ASCs serve as a reservoir for acquisition of further driver mutations to the extent that the entire endometrium becomes neoplastic. In older women, we observe a shift in the spectrum of acquired cancer-associated mutations, which may reflect post-menopausal changes in the levels of sex-steroid hormones and the resultant tissue microenvironment.
Improved population resolution in Baltic harbour porpoise (Phocoena phocoena) based on a new high-quality genome assembly
The harbour porpoise (Phocoena phocoena) is a highly mobile cetacean found in waters across the Northern Hemisphere. It inhabits basins that vary broadly in salinity, temperature, and food availability, which could drive differentiation among populations. Population structure within the North and Baltic Seas is not fully resolved, especially for the potentially isolated Inner Baltic Proper population. An initial study showed higher resolution from single nucleotide polymorphism (SNP) markers than that previously detected with mtDNA haplotype data and microsatellites. We have extended this with ddRAD sequencing data, utilizing a newly assembled high-quality reference genome, to further unravel subtle population structure. Using this new assembly as a reference, we have identified genome-wide SNPs from 297 individuals. Initial results support the classification of a distinct Inner Baltic Proper population that is genetically differentiated from the population in the Belt Sea and, because of its low numbers, critically endangered. These results support conservation measures for the Inner Baltic Proper population, and provide a tool for further management and monitoring of all European populations (e.g. a SNP assay). Together with the draft genome annotation (22,154 predicted genes), these genetic variants can also be linked to the proteins they encode, allowing for further investigation into local adaptation and functional evolution in the different harbour porpoise populations. Our study underscores the value of whole genome resources in conservation genomics, and provides a crucial addition for the study of porpoise evolution and phylogeny.