Genetics Journal Club Louisa C. Pyle November 12th, 2015

Author: Patricia Manning


2 Genetics Journal Club
Louisa C. Pyle, November 12th, 2015
- Pediatric Genetics fellow, first year in Kate Nathanson’s lab
- Presenting a paper out this week, “title”
- One of 2 papers out since the beginning of the month from the Deciphering Developmental Disorders study

3 From Iceland

4 to the British Isles

5 From Iceland to the British Isles

6 From Iceland to the British Isles
- Deciphering Developmental Disorders (DDD) Study
- UK (Scotland, Wales, England, Northern Ireland) and Republic of Ireland
- Why UK and Ireland? Manageable scale and systems: coordinated genetics services across the 4 UK National Health Services, plus Dublin

7 Large compared with single-family studies or with searching for rare large families; not large on a GWAS scale
- Can you use phenotypic similarities across families to determine rare causative mutations in a (large) population?

8 Experimental Approach
- Patient recruitment
- Exome sequencing
- Mendelian filtering (selecting probands with rare, biallelic, putatively damaging variants in the same gene)
- Assessing against likelihood of sampling the observed genotypes in the general population AND comparing phenotypic similarity of probands with recessive variants in the same gene
- Confirm allele segregation AND model variants in organisms for phenotypic similarity
- Some of the modeling was already published, and some they had to do
Figure 1

10 Sequencing and Annotation
Exome sequencing
- Paired-end libraries using Illumina (150 bp fragments)
- Targets enriched with Agilent Human All-Exon SureSelect RNA bait (two different Agilent versions)
- 75-base paired-end sequencing on Illumina HiSeq
- Burrows-Wheeler Aligner mapping to hs37d5
- GATK HaplotypeCaller best practices variant calling
- Excluded 1,053 families with proband de novo mutations altering the coding sequence of known dominant or X-linked developmental disorder genes -> separate pipeline for verification as likely diagnoses

11 Sequencing and Annotation
Functional annotation of variants from remaining 3,072 families
- Rare variants: minor allele frequency (MAF) <1%, average 3.2 per proband
- MAF calculated from multiple sources (1000 Genomes, UK10K cohort, NHLBI GO Exome Sequencing Project, internal DDD cohort of unaffected parents)
- Variant Effect Predictor (VEP) from Ensembl (version 2.6)
- Class 1: Loss of Function (transcript_ablation, splice_donor_variant, splice_acceptor_variant, stop_gained, frameshift_variant)
- Class 2: Functional (stop_lost, initiator_codon_variant, transcript_amplification, inframe_insertion, inframe_deletion, missense_variant, coding_sequence_variant)
-> 74 candidate recessive genes where one or both proband alleles are LoF (LoF/LoF or LoF/Func) in two or more families
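
The two consequence classes and the "two or more families" rule on this slide can be sketched in Python. This is only an illustration under stated assumptions, not the DDD pipeline: the function names (`classify`, `biallelic_label`, `candidate_genes`), the tuple layout of the input, and the gene and family labels are all hypothetical, and `splice_donor_variant` is used as the standard spelling of the consequence term.

```python
from collections import defaultdict

# Consequence terms as listed on the slide (Class 1 = LoF, Class 2 = Functional).
LOF = {"transcript_ablation", "splice_donor_variant", "splice_acceptor_variant",
       "stop_gained", "frameshift_variant"}
FUNC = {"stop_lost", "initiator_codon_variant", "transcript_amplification",
        "inframe_insertion", "inframe_deletion", "missense_variant",
        "coding_sequence_variant"}
MAF_CUTOFF = 0.01  # "MAF <1%" from the slide


def classify(consequence, maf):
    """Return 'LoF', 'Func', or None (variant too common or not in either class)."""
    if maf >= MAF_CUTOFF:
        return None
    if consequence in LOF:
        return "LoF"
    if consequence in FUNC:
        return "Func"
    return None


def biallelic_label(allele1, allele2):
    """Label a proband's two alleles in one gene; keep only LoF/LoF or LoF/Func."""
    classes = sorted(c for c in (classify(*allele1), classify(*allele2)) if c)
    if classes == ["LoF", "LoF"]:
        return "LoF/LoF"
    if classes == ["Func", "LoF"]:  # sorted order puts "Func" before "LoF"
        return "LoF/Func"
    return None


def candidate_genes(observations, min_families=2):
    """observations: iterable of (family_id, gene, allele1, allele2),
    where each allele is a (consequence, maf) pair (hypothetical layout)."""
    families = defaultdict(set)
    for family_id, gene, allele1, allele2 in observations:
        if biallelic_label(allele1, allele2):
            families[gene].add(family_id)
    return sorted(g for g, fams in families.items() if len(fams) >= min_families)
```

A gene is reported only when at least `min_families` distinct families carry a qualifying genotype, which mirrors the "in two or more families" condition; Func/Func genotypes are deliberately excluded because the slide requires at least one LoF allele.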



14 Genotype P-value
- Adjusted for probability of LoF/LoF and LoF/Func by chance among 3,072 families
- Frequency of rare LoF and missense variants in control group: Exome Aggregation Consortium (ExAC; 60,000 exomes without severe pediatric disease)
- Families matched to one of four ancestral populations (non-Finnish Europeans, Africans, East Asians, South Asians) determined by principal component analysis
- 2,799 non-Finnish European (NFE)
- 297 South Asian (SAS)
- 109 African (AFR)
- 15 East Asian (EAS)
- Probability normalized to ExAC findings within ancestral population
- Verified these frequencies (ExAC) were concordant with unaffected study parents in the European group
Supplemental Figure 2

15 Genotype P-value
- Adjusted for autozygosity/runs of homozygosity throughout the matched ancestral population, at each individual gene
Supplemental Figure 3: Rate of autozygosity per gene in probands. Grey lines show autozygosity of the significant genes.

16 Genotype P-value
- Determined binomial probability of sampling two rare alleles of a specific functional category (within Class 1 and 2) n or more times from random draw (n = number of families with that rare allele type)
- Summed those probabilities across all possible combinations of rare allele type and ancestral population, for aggregate probability of sampling n families by chance
- Using that as the null distribution, determined P-value for finding each of the 72 potential genes for the n number of families in which it was found
Supplemental Figure 4: QQ plot; biallelic rare synonymous genotypes closely follow the null distribution from ExAC
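
The binomial step can be illustrated with a short Python sketch. It is a simplification under stated assumptions, not the published method: it handles a single allele class and a single ancestral population rather than summing over all combinations, the autozygosity adjustment in `biallelic_probability` is an assumed textbook-style model, and the input numbers are hypothetical.

```python
from math import exp, lgamma, log, log1p


def biallelic_probability(allele_freq, autozygosity=0.0):
    """Chance that one proband carries two rare alleles of a class in a gene.

    Assumed model (not taken from the slides): outside autozygous segments the
    two alleles are independent draws (f * f); inside an autozygous segment a
    single draw is enough (f).
    """
    f, a = allele_freq, autozygosity
    return (1.0 - a) * f * f + a * f


def binomial_tail(n_observed, n_families, p):
    """P(X >= n_observed) for X ~ Binomial(n_families, p), with 0 < p < 1.

    Terms are computed in log space so that large n_families cannot overflow.
    """
    if n_observed <= 0:
        return 1.0
    total = 0.0
    for k in range(n_observed, n_families + 1):
        log_term = (lgamma(n_families + 1) - lgamma(k + 1)
                    - lgamma(n_families - k + 1)
                    + k * log(p) + (n_families - k) * log1p(-p))
        total += exp(log_term)
    return min(total, 1.0)


# Hypothetical inputs: cumulative rare-LoF allele frequency of 0.001 in a gene,
# autozygosity rate of 0.005 at that gene, 3 families observed among 3,072.
p_family = biallelic_probability(0.001, autozygosity=0.005)
p_value = binomial_tail(3, 3072, p_family)
print(f"per-family probability {p_family:.3e}, genotype P-value {p_value:.3e}")
```

Summing the upper tail directly, rather than computing one minus the lower tail, keeps precision when the P-value is very small, which is the regime of interest here.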

17 Phenotype P-value
- Clinical geneticists systematically recorded phenotypes using the standardized Human Phenotype Ontology (HPO)
- 11,000 standardized descriptive human phenotype terms (an earlier version with ~10,000 terms was used in this paper)
- Hierarchical terms
- Guidelines to attempt to mark the most informative and specific terms
Supplemental Figure 5: QQ plot, similarity of HPO terms among probands

18 Phenotype P-value
- Maximum information content (maxIC) calculated
- Pairwise similarity between probands was calculated
- P-values generated by comparing that similarity to having those HPO terms in common throughout the study population probands (by random sampling)
- Demonstrated that it is informative and calibrated by comparing to a permuted sample set where the phenotype-gene relationship was scrambled
- Demonstrated a significance signal for the patients
- Clinical geneticists free-text recorded a “suspected syndrome”: the one of which the patient most reminded them
Supplemental Figure 5: QQ plot, similarity of HPO terms among probands
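
A toy Python sketch of the maxIC idea may help make these bullets concrete. Everything specific in it is an assumption for illustration: the six-term ontology stands in for HPO, the term names are not HPO identifiers, the group score is taken as the sum of pairwise maxIC values, and the P-value is estimated by drawing random proband sets of the same size from the cohort.

```python
import math
import random
from itertools import combinations

# Toy ontology (child -> parents); term names are illustrative, not HPO IDs.
PARENTS = {
    "root": [], "neuro": ["root"], "cardiac": ["root"],
    "ataxia": ["neuro"], "hypotonia": ["neuro"], "heterotaxy": ["cardiac"],
}


def with_ancestors(terms):
    """Expand a set of terms to include every ancestor up to the root."""
    seen, stack = set(), list(terms)
    while stack:
        term = stack.pop()
        if term not in seen:
            seen.add(term)
            stack.extend(PARENTS[term])
    return seen


def information_content(probands):
    """IC(t) = -log(fraction of probands annotated with t or one of its descendants)."""
    expanded = {pid: with_ancestors(terms) for pid, terms in probands.items()}
    counts = {}
    for terms in expanded.values():
        for term in terms:
            counts[term] = counts.get(term, 0) + 1
    ic = {term: -math.log(c / len(expanded)) for term, c in counts.items()}
    return expanded, ic


def max_ic(a, b, expanded, ic):
    """Similarity of two probands: IC of their most informative shared term."""
    shared = expanded[a] & expanded[b]
    return max((ic[t] for t in shared), default=0.0)


def group_score(ids, expanded, ic):
    """Sum of pairwise maxIC over all pairs in a group of probands."""
    return sum(max_ic(a, b, expanded, ic) for a, b in combinations(ids, 2))


def similarity_p_value(ids, expanded, ic, trials=2000, seed=0):
    """Fraction of random same-size proband sets scoring at least as high."""
    rng = random.Random(seed)
    observed = group_score(ids, expanded, ic)
    pool = sorted(expanded)
    hits = sum(
        group_score(rng.sample(pool, len(ids)), expanded, ic) >= observed
        for _ in range(trials)
    )
    return (hits + 1) / (trials + 1)
```

Rare, specific terms carry high information content, so two probands sharing such a term score higher than two probands who share only a broad ancestor; the root term, present in everyone, contributes zero.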



21 Genotype-Phenotype P-values Combined
Table 1

Gene       n (LOF/LOF, LOF/functional)   P (genotype)    P (phenotype)   Combined
KIAA0586   6, 1                          1.42 × 10^-6    5.60 × 10^-4    1.75 × 10^-8
HACE1      3                             2.39 × 10^-8    1.11 × 10^-1    5.50 × 10^-8
PRMT7      2                             1.45 × 10^-4    1.49 × 10^-3    3.53 × 10^-6
CSTB                                     3.16 × 10^-5    1.76 × 10^-2    8.56 × 10^-6
COL9A3                                   3.89 × 10^-5    4.08 × 10^-2    2.28 × 10^-5
MMP21                                    2.05 × 10^-3    1.03 × 10^-3    2.97 × 10^-5

Evidence and additional evidence
- KIAA0586: genome-wide significant; cosegregation in two affected siblings; mouse and chick mutants with ciliary phenotypes
- HACE1: genome-wide significant; cosegregation in one affected sibling; three affected siblings with homozygous in-frame deletion; mouse mutant with early-lethality phenotype
- PRMT7: suggestive; cosegregation in three affected siblings; concordant mouse mutant with AHO-like phenotype
- CSTB: previously implicated gene
- COL9A3: previously implicated gene
- MMP21: cosegregation with affected sibling; two mouse mutants with heterotaxy

Significance threshold: P < 1.44 × 10^-6, met by 2 of 74 candidate genes
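
The slide does not say how the genotype and phenotype P-values were combined. The tabulated combined values are reproduced, to the precision shown, by Fisher's method for two independent P-values, so the Python sketch below assumes that method; the function name `fisher_combine_two` and the `TABLE_1` dictionary are illustrative.

```python
from math import log


def fisher_combine_two(p_genotype, p_phenotype):
    """Fisher's method for two independent P-values.

    The statistic -2 * ln(p1 * p2) follows a chi-square distribution with
    4 degrees of freedom under the null; its upper tail has the closed form
    x * (1 - ln x), where x = p1 * p2.
    """
    x = p_genotype * p_phenotype
    return x * (1.0 - log(x))


# gene: (P genotype, P phenotype, combined P as shown on the slide)
TABLE_1 = {
    "KIAA0586": (1.42e-6, 5.60e-4, 1.75e-8),
    "HACE1": (2.39e-8, 1.11e-1, 5.50e-8),
    "PRMT7": (1.45e-4, 1.49e-3, 3.53e-6),
    "CSTB": (3.16e-5, 1.76e-2, 8.56e-6),
    "COL9A3": (3.89e-5, 4.08e-2, 2.28e-5),
    "MMP21": (2.05e-3, 1.03e-3, 2.97e-5),
}
THRESHOLD = 1.44e-6  # significance threshold from the slide

for gene, (p_geno, p_pheno, shown) in TABLE_1.items():
    combined = fisher_combine_two(p_geno, p_pheno)
    print(f"{gene}: combined {combined:.2e} (slide: {shown:.2e}), "
          f"genome-wide significant: {combined < THRESHOLD}")
```

Under this assumption only KIAA0586 and HACE1 fall below the threshold, in line with the two genome-wide significant genes reported on the slide.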


24 KIAA0586 – Joubert Syndrome
- 8 individuals across 6 families
- 7 with compound heterozygosity for LoF, 1 with LoF and missense
- 5 of the 6 families had been suspected for Joubert (molar tooth sign)
- Ataxia, hypotonia, Duane anomaly (detailed phenotype information in supplement)
- Encodes TALPID3, a centrosomal protein involved in ciliogenesis and sonic hedgehog signaling
- Several other Joubert syndrome genes are critical for cilia functioning
- Mouse knockouts are embryonic lethal
- All carry one Arg143Lysfs*4 allele, which they hypothesize is a null allele that is lethal when homozygous; here it is combined with hypomorphs


26 HACE1 – global developmental delay
- Exceeded genome-wide significance
- Not previously reported in human disease
- 6 individuals across 4 families
- All LoF alleles
- Only is ambulatory
- Features include hypoplasia of the corpus callosum, reduced white matter
- Encodes a brain HECT-domain-containing E3 ubiquitin ligase
- Homozygous mouse knockout is lethal before weaning, mechanism unknown
Figure 3
- B: protein domain structure, with 6 ankyrin repeats in orange and one HECT domain in blue
- C: protein modeling showing that the highly conserved HECT domain is significantly affected by L832del



29 MMP21 – heterotaxy
- Did not quite meet genome-wide significance
- Not previously reported in human disease
- 2 individuals and one fetus across 2 families
- All with visceral heterotaxy (altered laterality of organs and development within organs) including heart malformations
- Encodes a matrix metalloproteinase that may modulate cell proliferation and migration through extracellular matrix remodeling
Figure 4

31 MMP21 – heterotaxy
- Mmp21 mutations in mice appear not to affect cilia motility


34 PRMT7 – phenocopy of AHO
- Protein modeling suggests missense mutations are damaging
Supplemental Figure 6

35 PRMT7 – phenocopy of AHO
- B: Most of the variants are in the second of two S-adenosylmethionine-dependent methyltransferase domains
- C: Knockout reporter used to generate homozygous Prmt7 null mice
- D: AHO features of knockout mice
Figure 5

36 PRMT7 – phenocopy of AHO
- Small size
- Reduced 5th metatarsal
Supplemental Figure 7

37 Summary
- Performed phenotyping, exome sequencing, and rare variant calling on 4,205 patients with severe, undiagnosed developmental disorders, within 4,125 families
- 1,053 diagnosed from these data alone
- Integrated analysis of the remaining 3,072 families provided:
- Average 3.2 minor allele variants per proband
- 72 total candidate recessive disease genes
- Two (2) with genome-wide significance
- Diagnosis for 26 patients from 19 families
- Four genes newly associated with human disease
- Additional evidence confirming two more previously implicated human disease genes (COL9A3 and CSTB)
- Large-scale ascertainment of small families with diverse clinical presentations, combined with exome sequencing and probabilistic integration of genotype/phenotype data, enriches ability to detect new recessive conditions

38 Can you use phenotypic similarities across families, to determine rare causative mutations in a (large) population? Yes - but

39 Challenges
- Functional data are still required, and can be time/resource consuming
- Scale and coordination of populations required to advance/expand the technique
- Phenotyping is clinician-dependent, and even with HPO terms can be difficult to standardize
- Unclear from published P-values how much phenotyping increased ascertainment
- In PRMT7, if “short metacarpal” had been included in HPO, it would have reached genome-wide significance
- Free-text clinician entry of suspected syndrome or “looks like” gave greater significance than HPO phenotype annotation for syndromes with significant features (Joubert, AHO)

40 DDD
- DDG2P: curated list of genes reported to be associated with developmental disorders.
- Wright CF, Fitzgerald TW, Jones WD, Clayton S, McRae JF, van Kogelenberg M, King DA, Ambridge K, Barrett DM, Bayzetinova T, Bevan AP, Bragin E, Chatzimichali EA, Gribble S, Jones P, Krishnappa N, Mason LE, Miller R, Morley KI, Parthiban V, Prigmore E, Rajan D, Sifrim A, Swaminathan GJ, Tivey AR, Middleton A, Parker M, Carter NP, Barrett JC, Hurles ME, FitzPatrick DR, Firth HV and DDD study. Genetic diagnosis of developmental disorders in the DDD study: a scalable analysis of genome-wide research data. Lancet (London, England) 2015;385;9975;
- Deciphering Developmental Disorders Study. Large-scale discovery of novel genetic causes of developmental disorders. Nature 2015;519;7542;223-8


42 Supplemental Figure 1

43 Sequencing and Annotation
Patient recruitment
- 24 clinical genetics centers within UK National Health Service and the Republic of Ireland
- 4,205 patients with severe, undiagnosed developmental disorders and their parents (4,125 families)
Exome sequencing ***depth?
- Paired-end libraries using Illumina (150 bp fragments)
- Targets enriched with Agilent Human All-Exon SureSelect RNA bait
- Sequencing on Illumina HiSeq
- BWA alignment with GATK HaplotypeCaller best practices variant calling
Functional annotation
- MAF (minor allele frequency) from multiple sources (1000 Genomes, UK10K cohort, NHLBI GO Exome Sequencing Project, internal DDD cohort of unaffected parents)
- VEP (Variant Effect Predictor) from Ensembl (version 2.6)
- Class 1: Loss of Function (transcript_ablation, splice_donor_variant, splice_acceptor_variant, stop_gained, frameshift_variant)
- Class 2: Functional (stop_lost, initiator_codon_variant, transcript_amplification, inframe_insertion, inframe_deletion, missense_variant, coding_sequence_variant)
Mendelian filtering
- In the remaining 3,072 families, rare (MAF <1%) protein-altering variants identified (average 3.2 per proband)
- Categorizes probands with biallelic loss of function, or compound heterozygous loss of function/functional
Statistical genotype assessment
- Tested all known genes (GENCODE release 19) for rare allele enrichment in multiple families, controlled by probability of observing that genotype by chance
Statistical phenotype assessment
- Phenotype similarity between probands calculated by pairwise analysis of standardized HPO input from the approved clinicians
- Suspected syndrome was also provided by clinicians, and similarities also calculated
Multiple ethnicities

44 Patient Recruitment Multiple ethnicities ***Lancet paper