23 - PART 16 Genes, the Environment, and Disease

01 - 479 Principles of Human Genetics
03 - 481 Mitochondrial DNA and Heritable Traits and Diseases
04 - 482 Telomere Disease
05 - 483 Gene and Cell-Based Therapy in Clinical Medicine
06 - 484 The Human Microbiome in Health and Disease


479 Principles of Human Genetics

J. Larry Jameson, Peter Kopp

IMPACT OF GENETICS AND GENOMICS ON MEDICAL PRACTICE

Over the past four decades, novel insights into human genetics and genomics have fundamentally changed the practice of medicine, ushering in a new era with a deeper understanding of the genetic basis of numerous health conditions, novel diagnostic technologies, disease prevention and management, personalized medicine, and targeted therapies. Human genetics refers to the study of individual genes, their role and function in disease, and their mode of inheritance. Genomics refers to the study of an organism’s entire genetic information, the genome, and the function and interaction of DNA within the genome, as well as with environmental or nongenetic factors, such as a person’s lifestyle. With the characterization of the human genome, genomics not only complements traditional genetics in our efforts to elucidate the etiology and pathogenesis of disease, but it also plays a prominent and expanding role in diagnostics, prevention, and therapy (Chap. 480). These transformative developments, originally emerging from the Human Genome Project, have been variably designated genomic medicine, personalized medicine, or precision medicine. Precision medicine aims at customizing medical decisions to an individual patient. For example, a patient’s genetic characteristics (genotype) can be used to optimize drug therapy and predict efficacy, adverse events, and drug dosing of selected medications (pharmacogenomics) (Chap. 72). The characterization of the mutational profile of a malignancy allows the identification of driver mutations or overexpressed signaling molecules, which then facilitates the selection of targeted therapies. Genome-wide polygenic risk scores (PRS) for common complex diseases are beginning to emerge and may impact disease prevention in the future. Genetics has traditionally been viewed through the window of relatively rare single-gene diseases. These disorders account for ~10% of pediatric admissions and childhood mortality.
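The polygenic risk scores mentioned above are commonly computed as a weighted sum: the number of risk alleles a person carries at each variant, multiplied by a per-variant effect size taken from association studies. The sketch below illustrates only that arithmetic. The variant names, effect sizes, and genotype are invented for illustration and do not come from any study; the function name is likewise hypothetical.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of risk-allele counts (0, 1, or 2 copies per variant)."""
    missing = set(weights) - set(dosages)
    if missing:
        raise ValueError(f"no genotype for: {sorted(missing)}")
    return sum(weights[variant] * dosages[variant] for variant in weights)


# Hypothetical per-allele effect sizes (for example, log odds ratios).
example_weights = {"variant_1": 0.12, "variant_2": 0.05, "variant_3": -0.08}
# Hypothetical genotype: number of risk alleles carried at each variant.
example_dosages = {"variant_1": 2, "variant_2": 1, "variant_3": 0}

example_score = polygenic_risk_score(example_dosages, example_weights)
print(round(example_score, 2))
```

In practice such a score is interpreted relative to its distribution in a reference population rather than as an absolute value.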
Historically, genetics has focused predominantly on chromosomal and metabolic disorders, reflecting the long-standing availability of techniques to diagnose these conditions. For example, conditions such as trisomy 21 (Down’s syndrome) or monosomy X (Turner’s syndrome) can be diagnosed using cytogenetics. Likewise, many metabolic disorders (e.g., phenylketonuria, familial hypercholesterolemia) are diagnosed using biochemical analyses. The advances in DNA and RNA diagnostics have extended the field of genetics to include virtually all medical specialties and have led to the elucidation of the pathogenesis of most monogenic disorders.

In addition, virtually every medical condition has a genetic component. As is often evident from a patient’s family history, many common disorders such as hypertension, heart disease, asthma, diabetes mellitus, and mental illnesses are significantly influenced by the genetic background. These polygenic or multifactorial (complex) disorders involve the contributions of many different genes, as well as environmental factors that can modify disease risk. Genome-wide association studies (GWAS) have elucidated numerous disease-associated loci and are providing novel insights into the allelic architecture of complex traits. These studies have been facilitated by the availability of comprehensive catalogues of human single nucleotide polymorphism (SNP) haplotypes (HapMap, International Genome Sample Resource). Next-generation DNA sequencing (NGS) technologies have evolved rapidly, and the cost of sequencing whole exomes (the exons within the genome; whole exome sequencing [WES]) or genomes (whole genome sequencing [WGS]) has plummeted. Comprehensive unbiased sequence analyses are now routinely used to characterize individuals with complex undiagnosed conditions or to determine the mutational profile of advanced malignancies in order to select optimal and targeted therapies. The assembly of diploid genomes, i.e., the characterization of the complete genetic information from both sets of chromosomes in an individual’s genome, will further enhance the complete resolution of genetic variation and should provide further insights into heritability and disease mechanisms.

Cancer has a genetic basis because it results from acquired somatic mutations in genes controlling growth, apoptosis, and cellular differentiation (Chap. 76). In addition, the development of many cancers is associated with a hereditary predisposition. Characterization of the genome (and epigenome) in various malignancies has led to fundamental new insights into cancer biology and reveals that the genomic profile of mutations is in many cases more important in determining the appropriate therapy than the organ in which the tumor originates. The Cancer Genome Atlas (TCGA) initiative of the National Cancer Institute and the National Human Genome Research Institute has already characterized the genomic landscape across >30 malignancies. TCGA consists of comprehensive analyses of genomic and proteomic alterations and has provided fundamental new insights into the molecular pathogenesis of cancer. These data, together with comprehensive catalogues of somatic mutations identified in human cancer, have direct clinical ramifications that impact cancer taxonomy, as well as the development and choice of targeted therapies.

Genetic and genomic approaches have proven invaluable for the detection of infectious pathogens and are used clinically to identify agents that are difficult to culture such as mycobacteria, viruses, and parasites, or to track infectious agents locally or globally.
In many cases, molecular genetics has improved the feasibility and accuracy of diagnostic testing and has opened new avenues for therapy, including gene and cellular therapies (Chap. 483). Molecular genetics has also provided the opportunity to characterize the microbiome, that is, the population dynamics of the bacteria, viruses, and parasites that coexist with humans and other animals (Chap. 484). The microbiome has significant effects on normal physiology as well as various disease states, and the field is now focusing on defining the mechanisms underlying these interactions.

Molecular biology has significantly changed the treatment of human disease. Peptide hormones, growth factors, cytokines, and vaccines can be produced in large amounts using recombinant DNA and RNA technology (e.g., mRNA vaccines against SARS-CoV-2; small interfering RNA [siRNA] to treat hypercholesterolemia). Targeted modifications of recombinant peptides provide improved therapeutic tools, as illustrated by genetically modified insulin analogues with more favorable kinetics or glucagon-like peptide 1 (GLP-1) agonists for treatment of type 2 diabetes and for weight management.

The rate at which new genetic and genomic information is being generated presents many challenges for health care providers and systems. Although many functional aspects of the genome remain unknown, there are many clinical situations where genetic and genomic information can optimize patient care. Much genetic information resides in databases that provide easy access to the expanding information about the human genome, genetic disease, and genetic testing (Table 479-1). For example, several thousand monogenic disorders are summarized in a large, continuously evolving compendium, the Online Mendelian Inheritance in Man (OMIM) catalogue (Table 479-1).
The constant refinement of bioinformatics and big data analytics, together with the widespread adoption of electronic health records (EHRs), is simplifying the access, analysis, and integration of this daunting amount of new information. Importantly, genomic data can be integrated readily into EHRs and thus impact clinical practice.

TABLE 479-1 Selected Databases Relevant for Genomics and Genetic Disorders
SITE | URL | COMMENT
National Center for Biotechnology Information (NCBI) | http://www.ncbi.nlm.nih.gov/ | Broad access to biomedical and genomic information, literature (PubMed), sequence databases, software for analyses of nucleotides and proteins; extensive links to other databases, genome resources, and tutorials
National Human Genome Research Institute | http://www.genome.gov/ | An institute of the National Institutes of Health focused on genomic and genetic research; links providing information about the human genome sequence, genomes of other organisms, genomic research, and legislation
Catalog of Published Genome-Wide Association Studies | https://www.ebi.ac.uk/gwas/ | Published high-resolution genome-wide association studies (GWAS)
Ensembl Genome browser | http://www.ensembl.org | Maps and sequence information of eukaryotic genomes
Online Mendelian Inheritance in Man | http://www.ncbi.nlm.nih.gov/omim | Online compendium of Mendelian disorders and human genes causing genetic disorders
American College of Medical Genetics and Genomics | http://www.acmg.net/ | Extensive links to other databases relevant for the diagnosis, treatment, and prevention of genetic disease
American Society of Human Genetics | http://www.ashg.org | Information about advances in genetic research, professional and public education, and social and scientific policies
The Cancer Genome Atlas | https://cancergenome.nih.gov/ | Comprehensive, multidimensional characterization of the genomic and proteomic landscape of malignancies with high public health impact
COSMIC Catalogue of Somatic Mutations in Cancer | https://cancer.sanger.ac.uk/cosmic | Comprehensive catalogue of somatic mutations in human cancer
Genetic Testing Registry | https://www.ncbi.nlm.nih.gov/gtr/ | International directory of genetic testing laboratories and prenatal diagnosis clinics; reviews and educational materials
Genomes Online Database (GOLD) | http://www.genomesonline.org/ | Information on published and unpublished genomes
HUGO Gene Nomenclature | http://www.genenames.org/ | Gene names and symbols
GENCODE | https://www.gencodegenes.org/ | High-quality reference gene annotation and experimental validation for human and mouse genomes
MITOMAP, a human mitochondrial genome database | http://www.mitomap.org/ | A compendium of polymorphisms and mutations of the human mitochondrial DNA
The International Genome Sample Resource (IGSR) | http://www.internationalgenome.org | Public catalogue of human variation and genotype data from numerous ethnic groups
Human Genome Variation Society | https://www.hgvs.org/ | Collection and documentation of genomic variations including population distribution and phenotypic associations
ENCODE | http://www.genome.gov/10005107 | Encyclopedia of DNA Elements; catalogue of all functional elements in the human genome
Dolan DNA Learning Center, Cold Spring Harbor Laboratories | http://www.dnalc.org/ | Educational material about selected genetic disorders, DNA, eugenics, and genetic origin
The Online Metabolic and Molecular Bases of Inherited Disease (OMMBID) | http://ommbid.mhmedical.com | Online version of the comprehensive text on the metabolic and molecular bases of inherited disease
Online Mendelian Inheritance in Animals (OMIA) | https://www.omia.org/home/ | Online compendium of Mendelian disorders in animals
The Jackson Laboratory | http://www.jax.org/ | Information about murine models and the mouse genome
Mouse genome informatics | http://www.informatics.jax.org | Mouse genome informatics, potential mouse models of human disease, information on phenotypic similarity between mouse models and human patients
Note: Databases are evolving constantly. Pertinent information may be found by using links listed in the few selected databases.

■ THE HUMAN GENOME

Structure of the Human Genome

The Human Genome Project was initiated in the mid-1980s as an ambitious effort to characterize the entire human genome and culminated in the completion of the DNA sequence for the last of the human chromosomes in 2006.
The scope of a WGS analysis can be illustrated by the following analogy. Human DNA consists of ~3 billion base pairs (bp) of DNA per haploid genome, which is nearly 1000-fold greater than that of the Escherichia coli genome. If the human DNA sequence were printed out, it would correspond to about 120 volumes of Harrison’s Principles of Internal Medicine.

In addition to the human genome, the genomes of thousands of organisms have been sequenced completely or partially (Genomes Online Database [GOLD]; Table 479-1). They include, among others, eukaryotes such as the mouse (Mus musculus), Saccharomyces cerevisiae, Caenorhabditis elegans, and Drosophila melanogaster; bacteria (e.g., E. coli); and archaea, viruses, organelles (mitochondria, chloroplasts), and plants (e.g., Arabidopsis thaliana). Genomic information of infectious agents has significant impact for the characterization of infectious outbreaks and epidemics. Other ramifications arising from the availability of genomic data include, among others, (1) the comparison of entire genomes (comparative genomics); (2) the study of large-scale expression of RNAs (functional genomics), proteins (proteomics), or protein families (e.g., the kinome, the complete set of protein kinases) to detect differences between various tissues in health and disease; (3) the characterization of the variation among individuals by establishing catalogues of sequence variations and SNPs; and (4) the identification of genes that play critical roles in the development of polygenic and multifactorial disorders.

CHROMOSOMES

The human genome is divided into 23 different chromosomes, including 22 autosomes (numbered 1–22) and the X and Y sex chromosomes (Fig. 479-1). Adult cells are diploid, meaning they contain two homologous sets of 22 autosomes and a pair of sex chromosomes. Females have two X chromosomes (XX), whereas males have one X and one Y chromosome (XY). As a consequence of meiosis, germ cells (sperm or oocytes) are haploid and contain one set of 22 autosomes and one of the sex chromosomes. At the time of fertilization, the diploid genome is reconstituted by pairing of the homologous chromosomes from the mother and father. With each cell division (mitosis), chromosomes are replicated, paired, segregated, and divided into two daughter cells.

FIGURE 479-1 Structure of chromatin and chromosomes. Chromatin is composed of double-strand DNA that is wrapped around histone and nonhistone proteins forming nucleosomes. The nucleosomes are further organized into solenoid structures. Chromosomes assume their characteristic structure, with short (p) and long (q) arms at the metaphase stage of the cell cycle.

STRUCTURE OF DNA

DNA is a double-stranded helix composed of four different bases: adenine (A), thymine (T), guanine (G), and cytosine (C). Adenine is paired to thymine, and guanine is paired to cytosine, by hydrogen bond interactions that span the double helix (Fig. 479-1). DNA has several remarkable features that make it ideal for the transmission of genetic information. It is relatively stable, and the double-stranded nature of DNA and its feature of strict base-pair complementarity permit faithful replication during cell division. Complementarity also allows the transmission of genetic information from DNA → RNA → protein (Fig. 479-2). mRNA is encoded by the so-called sense or coding strand of the DNA double helix and is translated into proteins by ribosomes.

The presence of four different bases provides surprising genetic diversity. In the protein-coding regions of genes, the DNA bases are arranged into codons, a triplet of bases that specifies a particular amino acid. It is possible to arrange the four bases into 64 different triplet codons (4^3). Each codon specifies 1 of the 20 different amino acids, or a regulatory signal such as initiation and stop of translation. Because there are more codons than amino acids, the genetic code is degenerate; that is, most amino acids can be specified by several different codons. By arranging the codons in different combinations and in various lengths, it is possible to generate the tremendous diversity of primary protein structure.

DNA length is normally measured in units of 1000 bp (kilobases, kb) or 1,000,000 bp (megabases, Mb). In the human genome, only ~1% of DNA accounts for protein-coding sequences. The noncoding DNA has multiple functional and structural roles including (1) sequences that form introns; (2) regulatory elements (promoters, enhancers, silencers, insulators); (3) sequences that generate RNAs that do not code for proteins; (4) centromeres and telomeres; (5) regions defining chromatin structure and histone modifications; (6) various forms of repetitive sequences of variable length; and (7) pseudogenes and regions without currently discernible functional or structural roles (Fig. 479-1).
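The codon arithmetic described in this section (four bases taken three at a time, 20 amino acids, and a degenerate code) can be checked with a few lines of code. The sketch below enumerates all triplets and pairs them with the standard genetic code written as a compact string in T, C, A, G order; the variable names are illustrative choices, and DNA letters are used for the coding strand.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in TCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Every ordered triplet of the four bases: 4 * 4 * 4 codons.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
codon_table = dict(zip(codons, AMINO_ACIDS))

# How many codons specify each amino acid (or a stop signal).
codons_per_residue = Counter(codon_table.values())
stop_codons = sorted(c for c, residue in codon_table.items() if residue == "*")
amino_acid_count = len(codons_per_residue) - 1  # exclude the stop symbol

print(len(codons), amino_acid_count, stop_codons)
print(codons_per_residue.most_common(3))
```

Counting codons per amino acid makes the degeneracy concrete: leucine, serine, and arginine are each specified by six codons, whereas methionine and tryptophan have one each.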

GENES

A gene is a functional unit that is regulated by transcription (see below) and encodes an RNA product, which is most commonly, but not always, translated into a protein that exerts activity within or outside the cell (Fig. 479-3). Historically, genes were identified because they conferred specific traits that are transmitted from one generation to the next. Now, they are frequently characterized based on expression in various tissues (transcriptome). The size of genes is quite broad; some genes are only a few hundred base pairs, whereas others are extraordinarily large (2.3 Mb). The number of genes greatly underestimates the complexity of genetic expression, because single genes can generate multiple spliced messenger RNA (mRNA) products (isoforms), which are translated into proteins that are subject to complex posttranslational modification such as phosphorylation. Exons refer to the portion of genes that are eventually spliced together to form mRNA. Introns refer to the spacing regions between the exons that are spliced out of precursor RNAs during RNA processing. The gene locus also includes regions that are necessary to control its expression (Fig. 479-2). Current estimates predict roughly 20,000 protein-coding genes in the human genome with an average of about four different coding transcripts per gene. Remarkably, the exome only constitutes 1.14% of the genome. Of note, the number of transcripts is close to 200,000 and includes thousands of noncoding transcripts (RNAs of various length such as microRNAs [miRNA] and long noncoding RNAs [lncRNA]). These noncoding RNAs are involved in numerous cellular processes such as transcriptional and posttranscriptional regulation of gene expression, chromatin remodeling, and protein trafficking, among others. Not surprisingly, aberrant expression and/or mutations in these RNAs play a pathogenic role in numerous diseases.
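The relationship between exons, introns, and mature mRNA described above can be sketched as a simple string operation: exon segments are kept and joined in order, and everything between them is discarded. This is a toy illustration only; the sequence, the coordinates, and the function name are made up, and real splicing depends on splice-site recognition rather than on supplied coordinates.

```python
def splice(pre_mrna, exons):
    """Join exon segments given as ordered (start, end) half-open, 0-based intervals."""
    previous_end = 0
    for start, end in exons:
        if not (previous_end <= start < end <= len(pre_mrna)):
            raise ValueError("exons must be ordered, non-overlapping, and in range")
        previous_end = end
    return "".join(pre_mrna[start:end] for start, end in exons)


# Toy precursor RNA: exons in upper case, the intron in lower case (invented sequence).
toy_precursor = "AUGGCC" + "guaaguuuucag" + "GAUUAA"
toy_exons = [(0, 6), (18, 24)]

mature = splice(toy_precursor, toy_exons)
print(mature)
```

Passing a different subset of exon intervals to the same function yields a different product, which is the sense in which one gene can give rise to several isoforms.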
SINGLE-NUCLEOTIDE POLYMORPHISMS

On average, a typical genome differs from the reference human genome at 4 to 5 million sites. Some of these variants have no impact on health, whereas others may increase or lower the risk for developing a specific disease. Remarkably, however, the primary DNA sequence of humans has ~99.9% similarity compared to that of any other human. An SNP is a variation of a single base pair in the DNA. Across human populations from distinct ethnic backgrounds, there are more than 1 billion validated SNPs (Fig. 479-3). SNPs are the most common type of sequence variation and account for 90% of all sequence variation. They occur on average every 100–300 bases and are the major source of genetic heterogeneity. SNPs that are in proximity are inherited together (i.e., they are linked) and are referred to as haplotypes (Fig. 479-4). Haplotype maps describe the nature and location of these SNP haplotypes and how they are distributed among individuals within and among populations, information that has been facilitating GWAS designed to elucidate the complex interactions among multiple genes and lifestyle factors in multifactorial disorders (see below). Moreover, haplotype analyses are useful to assess variations in responses to medications (pharmacogenomics) and environmental factors, as well as the prediction of disease predisposition.

COPY NUMBER VARIATIONS

Copy number variations (CNVs) are relatively large genomic regions (1 kb to several Mb) that have been duplicated or deleted on certain chromosomes and hence alter the diploid status of the DNA (Fig. 479-5). It has been estimated that 5–10% of the genome can display CNVs. When comparing the genomes of two individuals, ~0.4–0.8% of their genomes differ in terms of CNVs scattered throughout the genome. Some CNVs can increase or decrease gene dosage, potentially leading to detrimental effects if essential genes are impacted. Of note, de novo CNVs have been observed between monozygotic twins, who otherwise have identical genomes.

Replication of DNA and Mitosis

Genetic information in DNA is transmitted to daughter cells under two different circumstances: (1) somatic cells divide by mitosis, allowing the diploid (2n) genome to replicate itself completely in conjunction with cell division; and (2) germ cells (sperm and ova) undergo meiosis, a process that enables the reduction of the diploid (2n) set of chromosomes to the haploid state (1n).
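Returning to the copy number variations described above: Figure 479-5 plots a log2 ratio, and the sketch below shows why a duplication raises that value and a deletion lowers it. It assumes an idealized comparison against a diploid reference with two copies; the function names and the 0.3 cutoff are illustrative assumptions, not values taken from the text.

```python
import math


def log2_copy_ratio(copies, reference_copies=2):
    """log2 of the test copy number over the reference (diploid) copy number."""
    if copies <= 0 or reference_copies <= 0:
        raise ValueError("copy numbers must be positive for a log2 ratio")
    return math.log2(copies / reference_copies)


def classify(ratio, threshold=0.3):
    """Label a segment; the default cutoff is an arbitrary illustrative choice."""
    if ratio > threshold:
        return "duplication"
    if ratio < -threshold:
        return "deletion"
    return "normal"


# One copy lost, the normal two copies, and one copy gained.
for copies in (1, 2, 3):
    ratio = log2_copy_ratio(copies)
    print(copies, round(ratio, 2), classify(ratio))
```

Under these assumptions a single-copy deletion gives a ratio of -1, whereas a single-copy gain gives only about +0.58, so losses and gains are not symmetric on this scale. Measured signals are noisier than this idealized calculation suggests.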

FIGURE 479-2 Flow of genetic information. Multiple extracellular signals activate intracellular signal cascades that result in altered regulation of gene expression through the interaction of transcription factors with regulatory regions of genes. RNA polymerase transcribes DNA into RNA that is processed to mRNA by excision of intronic sequences. The mRNA is translated into a polypeptide chain to form the mature protein after undergoing posttranslational processing. CBP, CREB-binding protein; CoA, co-activator; COOH, carboxyterminus; CRE, cyclic AMP responsive element; CREB, cyclic AMP response element–binding protein; GTF, general transcription factors; HAT, histone acetyl transferase; NH2, aminoterminus; RE, response element; TAF, TBP-associated factors; TATA, TATA box; TBP, TATA-binding protein.

Prior to mitosis, cells exit the resting, or G0 state, and enter the cell cycle. After traversing a critical checkpoint in G1, cells undergo DNA synthesis (S phase), during which the DNA in each chromosome is replicated, yielding two pairs of sister chromatids (2n → 4n). The process of DNA synthesis requires stringent fidelity in order to avoid transmitting errors to subsequent generations of cells. Genetic abnormalities of DNA mismatch/repair include xeroderma pigmentosum, Bloom’s syndrome, ataxia telangiectasia, and hereditary nonpolyposis colon cancer (HNPCC), among others. Many of these disorders strongly predispose to neoplasia because of the rapid acquisition of additional mutations (Chap. 76). After completion of DNA synthesis, cells enter G2 and progress through a second checkpoint before entering mitosis. At this stage, the chromosomes condense and are aligned along the equatorial plate at metaphase. The two identical sister chromatids, held together at the centromere, divide and migrate to opposite poles of the cell.
After formation of a nuclear membrane around the two separated sets of chromatids, the cell divides and two daughter cells are formed, thus restoring the diploid (2n) state.

Assortment and Segregation of Genes During Meiosis

Meiosis occurs only in germ cells of the gonads. It shares certain features with mitosis but involves two distinct steps of cell division that reduce the chromosome number to the haploid state. In addition, there is active recombination that generates genetic diversity. During the first cell division, two sister chromatids (2n → 4n) are formed for each chromosome pair and there is an exchange of DNA between homologous paternal and maternal chromosomes. This process involves the formation of chiasmata, structures that correspond to the DNA segments that cross over between the maternal and paternal homologues (Fig. 479-6). Usually there is at least one crossover on each chromosomal arm; recombination occurs more frequently in female meiosis than in male meiosis. Subsequently, the chromosomes segregate randomly. Because there are 23 chromosome pairs, there exist 2^23 (>8 million) possible combinations of chromosomes. Together with the genetic exchanges that occur during recombination, chromosomal segregation generates tremendous diversity, and each gamete is genetically unique. The process of recombination and the independent segregation of chromosomes provide the foundation for performing linkage analyses, whereby one attempts to correlate the inheritance of certain chromosomal regions (or linked genes) with the presence of a disease or genetic trait (see below).

After the first meiotic division, which results in two daughter cells (2n), the two chromatids of each chromosome separate during a second meiotic division to yield four gametes with a haploid state (1n). When the egg is fertilized by sperm, the two haploid sets are combined, thereby restoring the diploid state (2n) in the zygote.

■ REGULATION OF GENE EXPRESSION

Regulation by Transcription Factors

The expression of genes is regulated by DNA-binding proteins that activate or repress transcription. The number of DNA sequences and transcription factors that regulate transcription is much greater than originally anticipated. Most genes contain at least 15–20 discrete regulatory elements within 300 bp of the transcription start site. This densely packed promoter region often contains binding sites for ubiquitous transcription factors. However, factors involved in cell-specific expression may also bind to these sequences. Key regulatory elements may also reside at a large distance from the proximal promoter. The globin and the immunoglobulin genes, for example, contain locus control regions that are several kilobases away from the structural sequences of the gene. Specific groups of transcription factors that bind to these promoter and enhancer sequences provide a combinatorial code for regulating transcription.
In this manner, relatively ubiquitous factors interact with more restricted factors to allow each gene to be expressed and regulated in a unique manner that is dependent on developmental state, cell type, and numerous extracellular stimuli. Regulatory factors also bind within the gene itself, particularly in the intronic regions. The transcription factors that bind to DNA represent only the first level of regulatory control. Other proteins (co-activators and co-repressors) interact with the DNA-binding transcription factors to generate large regulatory complexes. These complexes are subject to control by numerous cell-signaling pathways and enzymes, leading to phosphorylation, acetylation, sumoylation, and ubiquitination. Ultimately, the recruited transcription factors interact with, and stabilize, components of the basal transcription complex that assembles at the site of the TATA box and initiator region. This basal transcription factor complex consists of >30 different proteins. Gene transcription occurs when RNA polymerase begins to synthesize RNA from the DNA template. A large number of identified genetic diseases involve transcription factors (Table 479-2).

The field of functional genomics is based on the concept that understanding alterations of gene expression under various physiologic and pathologic conditions provides insight into the underlying functional role of the gene. The ENCODE (Encyclopedia of DNA Elements) project aims at identifying and annotating all functional sequences in the human genome. By revealing specific gene expression profiles, this knowledge can be of diagnostic and therapeutic relevance. The large-scale study of expression profiles is referred to as transcriptomics because the complement of mRNAs transcribed by the cellular genome is called the transcriptome.

Most studies of gene expression have focused on the regulatory DNA elements of genes that control transcription. However, it must be emphasized that gene expression requires a series of steps, including mRNA processing, protein translation, and posttranslational modifications, all of which are actively regulated (Fig. 479-2).

FIGURE 479-3 Chromosome 7 is shown with the density of single nucleotide polymorphisms (SNPs) and genes above. A 200-kb region in 7q31.2 containing the CFTR gene is shown below. The CFTR gene contains 27 exons. Close to 2000 mutations in this gene have been found in patients with cystic fibrosis. A 20-kb region encompassing exons 4–9 is shown further amplified to illustrate the SNPs in this region. Tracks: SNPs (612,977) and known genes (1260). SNP categories shown: intronic; splice site; coding region, synonymous; coding region, nonsynonymous; coding region, frameshift.

FIGURE 479-4 The origin of haplotypes is due to repeated recombination events occurring in multiple generations. Over time, this leads to distinct haplotypes. These haplotype blocks can often be characterized by genotyping selected Tag single nucleotide polymorphisms (SNPs), an approach that facilitates performing genomewide association studies (GWAS).

FIGURE 479-5 Copy number variations (CNV) encompass relatively large regions of the genome that have been duplicated or deleted. Chromosome 8 is shown with a CNV detected by genomic hybridization. An increase in the signal strength indicates a duplication, whereas a decrease reflects a deletion of the covered chromosomal regions. Vertical axis: log2 (ratio); regions labeled Normal, Duplicated Area, and Deleted Area.

Epigenetic Regulation of Gene Expression (see Chap. 497)

Epigenetics describes mechanisms and phenotypic changes that are not a result of variation in the primary DNA nucleotide sequence but are caused by secondary modifications of DNA or histones. These modifi cations include heritable changes such as X-inactivation and imprint ing, but they can also result from dynamic posttranslational protein modifications in response to environmental influences such as diet, age, or drugs. The epigenetic modifications result in altered expression of individual genes or chromosomal loci encompassing multiple genes. The term epigenome describes the constellation of covalent modifica tions of DNA and histones that impact chromatin structure, as well as noncoding transcripts that modulate the transcriptional activity of DNA. Although the primary DNA sequence is usually identical in all cells of an organism, sex- and tissue-specific changes in the epigenome contribute to determining the transcriptional signature of a cell (tran scriptome) and hence the protein expression profile (proteome). Mechanistically, DNA and histone modifications can result in the activation or silencing of gene expression (Fig. 479-7). DNA methyla tion involves the addition of a methyl group to cytosine residues. This is usually restricted to cytosines of CpG dinucleotides, which are abun dant throughout the genome. Methylation of these dinucleotides is thought to represent a defense mechanism that minimizes the expres sion of sequences that have been incorporated into the genome such as retroviral sequences. CpG dinucleotides also exist in so-called CpG islands, stretches of DNA characterized by a high CG content, which are found in the majority of human gene promoters. CpG islands in promoter regions are typically unmethylated, and the lack of methyla tion facilitates transcription. Histone methylation involves the addition of a methyl group to lysine residues in histone proteins (Fig. 479-7). 
Depending on the specific lysine residue being methylated, this alters chromatin configuration, making it either more open or tightly packed. Acetylation of histone proteins is another well-characterized mechanism that results in an open chromatin configuration, which favors active transcription. Acetylation is generally more dynamic than methylation, and many transcriptional activation complexes have histone acetylase activity, whereas repressor complexes often contain deacetylases and remove acetyl groups from histones. Other histone modifications include phosphorylation and sumoylation, among others. Furthermore, noncoding RNAs and RNA regulatory networks that bind to DNA have a significant impact on transcriptional activity.

Physiologically, epigenetic mechanisms play an important role in several instances. For example, X-inactivation refers to the relative silencing of one of the two X chromosome copies present in females. The inactivation process is a form of dosage compensation such that females (XX) do not generally express twice as many X-chromosomal gene products as males (XY). In a given cell, the choice of which chromosome is inactivated occurs randomly in humans. But once the maternal or paternal X chromosome is inactivated, it will remain inactive, and this information is transmitted with each cell division. The X-inactive specific transcript (Xist) gene encodes a long non-coding RNA (lncRNA) that mediates gene silencing on one of the X chromosomes. The inactive X chromosome is highly methylated and has low levels of histone acetylation. While the majority of X-chromosomal genes are silenced by X-inactivation, ~15% escape inactivation and are expressed.

Epigenetic gene inactivation also occurs on selected chromosomal regions of autosomes, a phenomenon referred to as genomic imprinting. Through this mechanism, a small subset of genes is only expressed in a monoallelic fashion.
FIGURE 479-6 Crossing-over and genetic recombination. [Panels: crossover, double crossover, and no crossover between the chromatids of homologous chromosomes carrying alleles A/a, B/b, C/c, and D/d, with the resulting recombination or no recombination in gametes.] During chiasma formation, either of the two sister chromatids on one chromosome pairs with one of the chromatids of the homologous chromosome. Genetic recombination occurs through crossing-over and results in recombinant and nonrecombinant chromosome segments in the gametes. Together with the random segregation of the maternal and paternal chromosomes, recombination contributes to genetic diversity and forms the basis of the concept of linkage.

TABLE 479-2 Selected Examples of Diseases Caused by Mutations and Rearrangements in Transcription Factors

TRANSCRIPTION FACTOR CLASS | EXAMPLE | ASSOCIATED DISORDER
Nuclear receptors | Androgen receptor | Complete or partial androgen insensitivity (recessive missense mutations); spinobulbar muscular atrophy (CAG repeat expansion)
Zinc finger proteins | WT1 | WAGR syndrome: Wilms’ tumor, aniridia, genitourinary malformations, mental retardation
Basic helix-loop-helix | MITF | Waardenburg’s syndrome type 2A
Homeobox | IPF1 | Maturity onset of diabetes mellitus type 4 (monoallelic mutation/haploinsufficiency); pancreatic agenesis (biallelic mutations)
Leucine zipper | Retina leucine zipper (NRL) | Autosomal dominant retinitis pigmentosa
High mobility group (HMG) proteins | SRY | Sex reversal
Forkhead | HNF4α, HNF1α, HNF1β | Maturity onset of diabetes mellitus types 1, 3, 5
Paired box | PAX3 | Waardenburg’s syndrome types 1 and 3
T-box | TBX5 | Holt-Oram syndrome (thumb anomalies, atrial or ventricular septum defects, phocomelia)
Cell cycle control proteins | P53 | Li-Fraumeni syndrome, other cancers
Co-activators | CREB binding protein (CREBBP) | Rubinstein-Taybi syndrome
General transcription factors | TATA-binding protein (TBP) | Spinocerebellar ataxia 17 (CAG expansion)
Transcription elongation factor | VHL | von Hippel–Lindau syndrome (renal cell carcinoma, pheochromocytoma, pancreatic tumors, hemangioblastomas); autosomal dominant inheritance, somatic inactivation of second allele (Knudson two-hit model)
Runt | RUNX1 | Familial thrombocytopenia with propensity to acute myelogenous leukemia
Chimeric proteins due to translocations | PML-RAR | Acute promyelocytic leukemia; t(15;17)(q22;q11.2-q12) translocation

Abbreviations: CREB, cAMP responsive element–binding protein; HNF, hepatocyte nuclear factor; PML, promyelocytic leukemia; RAR, retinoic acid receptor; SRY, sex-determining region Y; VHL, von Hippel–Lindau.

Imprinting is heritable and leads to the preferential expression of one of the parental alleles, which deviates from the usual biallelic expression seen for the majority of genes. Remarkably, imprinting can be limited to a subset of tissues. Imprinting is mediated through DNA methylation of one of the alleles. The epigenetic marks on imprinted genes are maintained throughout life, but during zygote formation, they are activated or inactivated in a sex-specific manner (imprint reset) (Fig. 479-8), which allows a differential expression pattern in the fertilized egg and the subsequent mitotic divisions. Appropriate expression of imprinted genes is important for normal development and cellular functions. Imprinting defects and uniparental disomy, which is the inheritance of two chromosomes or chromosomal regions from the same parent, are the cause of several developmental disorders such as Beckwith-Wiedemann syndrome, Silver-Russell syndrome, Angelman’s syndrome, and Prader-Willi syndrome (see below). Monoallelic loss-of-function mutations in the GNAS1 gene lead to Albright’s hereditary osteodystrophy (AHO). Paternal transmission of GNAS1 mutations leads to an isolated AHO phenotype (pseudopseudohypoparathyroidism), whereas maternal transmission leads to AHO in combination with hormone resistance to parathyroid hormone, thyrotropin, and gonadotropins (pseudohypoparathyroidism type IA). These phenotypic differences are explained by tissue-specific imprinting of the GNAS1 gene, which is expressed primarily from the maternal allele in the thyroid, gonadotropes, and the proximal renal tubule. In most other tissues, the GNAS1 gene is expressed biallelically. In patients with isolated renal resistance to parathyroid hormone (pseudohypoparathyroidism type IB), defective imprinting of the GNAS1 gene results in decreased Gsα expression in the proximal renal tubules. Rett syndrome is an X-linked dominant disorder resulting in developmental regression and stereotypic hand movements in affected girls. It is caused by mutations in the MECP2 gene, which encodes a methyl-binding protein. The ensuing aberrant methylation results in abnormal gene expression in neurons, which are otherwise normally developed.

Remarkably, epigenetic differences also occur among monozygotic twins. Although twins are epigenetically indistinguishable during the early years of life, older monozygotic twins exhibit differences in the overall content and genomic distribution of DNA methylation and histone acetylation, which would be expected to alter gene expression in various tissues. In cancer, the epigenome is characterized by simultaneous losses and gains of DNA methylation in different genomic regions, as well as repressive histone modifications. Hyper- and hypomethylation are associated with mutations in genes that control DNA methylation. Hypomethylation is thought to remove normal control mechanisms that prevent expression of repressed DNA regions. It is also associated with genomic instability. Hypermethylation, in contrast, results in the silencing of CpG islands in promoter regions of genes, including tumor-suppressor genes. Epigenetic alterations are more easily reversible than genetic changes; modification of the epigenome with demethylating agents and histone deacetylase inhibitors is being used in the treatment of various malignancies.

■ TRANSMISSION OF GENETIC DISEASE

Origins and Types of Mutations

The term mutation or variant is used to designate the process of generating genetic variations as well as the effect of these alterations. A mutation can be defined as any change in the primary nucleotide sequence of DNA regardless of its functional consequences, although it often has a negative connotation. There has been a shift towards using the more neutral term variant to describe sequence changes, and it is now recommended by several professional organizations and guidelines instead of mutation. Some variants may be lethal, others are less deleterious, and some may confer an evolutionary advantage. Variations can occur in the germline (sperm or oocytes); these can be transmitted to progeny.
Alternatively, variants can occur during embryogenesis or in somatic tissues. Variations that occur during development lead to mosaicism, a situation in which tissues are composed of cells with different genetic constitutions. If the germline is mosaic, a mutation can be transmitted to some progeny but not others, which sometimes leads to confusion in assessing the pattern of inheritance. Somatic mutations that do not affect cell survival can sometimes be detected because of variable phenotypic effects in tissues (e.g., pigmented lesions in McCune-Albright syndrome). Other somatic mutations are associated with neoplasia because they confer a growth advantage to cells. Epigenetic events may also influence gene expression or facilitate genetic damage. With the exception of triplet nucleotide repeats, which can expand (see below), variations are usually stable.

Sequence variants are structurally diverse: they can involve the entire genome, as in triploidy (one extra set of chromosomes), or gross numerical or structural alterations in chromosomes or individual genes. Large deletions may affect a portion of a gene or an entire gene, or, if several genes are involved, they may lead to a contiguous gene syndrome. Unequal crossing-over between homologous genes can result in fusion gene mutations, as illustrated by color blindness. Variations involving single nucleotides are referred to as point mutations.

FIGURE 479-7 Epigenetic modifications of DNA and histones. [Panels: cytosine methylation, methylated versus unmethylated DNA; histone modifications (acetylation, phosphorylation, methylation) and transcription.] Methylation of cytosine residues is associated with gene silencing. Methylation of certain genomic regions is inherited (imprinting), and it is involved in the silencing of one of the two X chromosomes in females (X-inactivation). Alterations in methylation can also be acquired, e.g., in cancer cells. Covalent posttranslational modifications of histones play an important role in altering DNA accessibility and chromatin structure and hence in regulating transcription. Histones can be reversibly modified in their aminoterminal tails, which protrude from the nucleosome core particle, by acetylation of lysine, phosphorylation of serine, methylation of lysine and arginine residues, and sumoylation. For example, acetylation of histones by histone acetyltransferases (HATs) leads to unwinding of chromatin and accessibility to transcription factors. Conversely, deacetylation by histone deacetylases (HDACs) results in a compact chromatin structure and silencing of transcription.

Substitutions are called transitions if a purine is replaced by another purine base (A ↔ G) or if a pyrimidine is replaced by another pyrimidine (C ↔ T). Changes from a purine to a pyrimidine, or vice versa, are referred to as transversions. If the DNA sequence change occurs in a coding region and alters an amino acid, it is called a missense mutation. Depending on the functional consequences of such a missense mutation, amino acid substitutions in different regions of the protein can lead to distinct phenotypes. Variants can occur in all domains of a gene (Fig. 479-9). A point mutation occurring within the coding region leads to an amino acid substitution if the codon is altered (Fig. 479-10).
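The classification rules just described can be sketched in a few lines of code. The following is a minimal illustration, assuming the standard genetic code and a coding-strand sequence that starts in frame; the function names (`classify_substitution`, `translate`, `classify_coding_change`) are hypothetical, and the premature-stop and frameshift categories anticipate the cases the text takes up next. It is a simplification: splice-site, regulatory, and stop-loss variants are not modeled.

```python
from itertools import product

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

# Standard genetic code, built in TCAG order; "*" marks a stop codon.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("reference and alternate base are identical")
    # Purine<->purine or pyrimidine<->pyrimidine is a transition.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


def translate(cds: str) -> str:
    """Translate an in-frame coding sequence; a trailing partial codon is ignored."""
    s = cds.upper()
    usable = len(s) - len(s) % 3
    return "".join(CODON_TABLE[s[i:i + 3]] for i in range(0, usable, 3))


def classify_coding_change(ref_cds: str, alt_cds: str) -> str:
    """Label a change within a coding sequence (simplified, illustrative)."""
    ref, alt = ref_cds.upper(), alt_cds.upper()
    if ref == alt:
        return "no change"
    if len(ref) != len(alt):
        # Indels that are not a multiple of three bases shift the reading frame.
        return "frameshift" if (len(ref) - len(alt)) % 3 else "in-frame indel"
    ref_aa, alt_aa = translate(ref), translate(alt)
    if ref_aa == alt_aa:
        return "silent"
    first_changed = next(a for r, a in zip(ref_aa, alt_aa) if r != a)
    return "nonsense" if first_changed == "*" else "missense"


if __name__ == "__main__":
    wild_type = "GCACTCCTATCGCACGCTCGGGAGGGC"  # illustrative sequence
    print(translate(wild_type))
    print(classify_substitution("C", "T"))
    print(classify_coding_change(wild_type, wild_type.replace("CGG", "CCG")))
```

The example sequence is illustrative only; any in-frame coding sequence can be substituted.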
Point mutations that introduce a premature stop codon result in a truncated or missing protein. Large deletions may affect a portion of a gene or an entire gene, whereas small deletions and insertions alter the reading frame if they do not represent a multiple of three bases. These “frameshift” mutations, also designated as amphigoric amino acid changes, lead to an entirely altered carboxy terminus. Mutations in intronic sequences or in exon junctions may destroy or create splice donor or splice acceptor sites. Variants may also be found in the regulatory sequences of genes, resulting in reduced or enhanced gene transcription.

Certain DNA sequences are particularly susceptible to mutagenesis. Successive pyrimidine residues (e.g., T-T or C-C) are subject to the formation of ultraviolet light–induced photoadducts. If these pyrimidine dimers are not repaired by the nucleotide excision repair pathway, mutations will be introduced after DNA synthesis. The dinucleotide C-G, or CpG, is also a hot spot for a specific type of alteration. In this case, methylation of the cytosine is associated with an enhanced rate of deamination, which converts 5-methylcytosine to thymine. This C → T transition (or G → A on the opposite strand) accounts for at least one-third of point mutations associated with polymorphisms and mutations. In addition to the fact that certain types of mutations (C → T or G → A) are relatively common, the nature of the genetic code also results in overrepresentation of certain amino acid substitutions.

Polymorphisms are sequence variations that have a frequency of at least 1%. Usually, they do not result in a perceptible phenotype; the term variant is now preferred for the description of these sequence changes because allele frequency and functional consequences are often not known. Often, they consist of single base-pair substitutions that do not alter the protein coding sequence because of the degenerate nature of the genetic code (synonymous polymorphism), although it is possible that some might alter mRNA stability, translation, or the amino acid sequence (nonsynonymous polymorphism) (Fig. 479-10). The detection of sequence variants poses a practical problem because it is often unclear whether a given variant creates a change with functional consequences or is a benign variation. In this situation, the sequence alteration is also described as a variant of unknown significance (VUS).

MUTATION RATES Mutations represent an important cause of genetic diversity as well as disease. Mutation rates are difficult to determine in humans because many mutations are silent and because testing is often not adequate to detect the phenotypic consequences. Mutation rates vary in different genes but are estimated to occur at a rate of ~10^-10/bp per cell division. Germline mutation rates (as opposed to somatic mutations) are relevant in the transmission of genetic disease. Because the population of oocytes is established very early in development, only ~20 cell divisions are required for completed oogenesis, whereas spermatogenesis involves ~30 divisions by the time of puberty and 20 cell divisions each year thereafter. Consequently, the probability of acquiring new point mutations is much greater in the male germline than the female germline, in which rates of aneuploidy are increased. Thus, the incidence of new point mutations in spermatogonia increases with paternal age (e.g., achondroplasia, Marfan’s syndrome, neurofibromatosis). It is estimated that about 1 in 10 sperm carries a new deleterious mutation. The rates for new mutations are calculated most readily for autosomal dominant and X-linked disorders and are ~10^-5 to 10^-6/locus per generation. Because most monogenic diseases are relatively rare, new mutations account for a significant fraction of cases.
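The arithmetic behind the paternal age effect can be made explicit with a back-of-the-envelope sketch. It uses only the figures quoted above (~20 divisions for oogenesis, ~30 divisions by puberty plus 20 per year for spermatogenesis, ~10^-10 mutations per bp per division). The age at puberty (15 years) and the haploid genome size (3 × 10^9 bp) are assumptions introduced here for illustration, as are the function names; the output is an order-of-magnitude estimate, not a measured rate.

```python
MUTATION_RATE_PER_BP_PER_DIVISION = 1e-10  # figure quoted in the text
DIVISIONS_OOGENESIS = 20                   # figure quoted in the text
DIVISIONS_BY_PUBERTY = 30                  # figure quoted in the text
DIVISIONS_PER_YEAR_AFTER_PUBERTY = 20      # figure quoted in the text


def sperm_cell_divisions(age_years: float, puberty_age: float = 15.0) -> float:
    """Approximate germline divisions behind a sperm cell at a given paternal age.

    puberty_age is an assumption for illustration, not a value from the text.
    """
    years_after_puberty = max(0.0, age_years - puberty_age)
    return DIVISIONS_BY_PUBERTY + DIVISIONS_PER_YEAR_AFTER_PUBERTY * years_after_puberty


def expected_new_mutations(divisions: float, genome_bp: float = 3.0e9) -> float:
    """Expected new point mutations per haploid genome after `divisions` divisions.

    genome_bp (haploid genome size) is an assumption for illustration.
    """
    return MUTATION_RATE_PER_BP_PER_DIVISION * genome_bp * divisions


if __name__ == "__main__":
    for age in (20, 35, 50):
        d = sperm_cell_divisions(age)
        print(age, d, round(expected_new_mutations(d), 1))
    print("oocyte", DIVISIONS_OOGENESIS, round(expected_new_mutations(DIVISIONS_OOGENESIS), 1))
```

Under these assumptions, the number of divisions, and with it the expected number of new point mutations, grows linearly with paternal age, whereas the oocyte figure is fixed.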
This is important in the context of genetic counseling because a new mutation can be transmitted to the affected individual, but this does not necessarily imply that the parents are at risk to transmit the disease to other children. An exception to this is when the new mutation occurs early in germline development, leading to gonadal mosaicism.

UNEQUAL CROSSING-OVER Normally, DNA recombination in germ cells occurs with remarkable fidelity to maintain the precise junction sites for the exchanged DNA sequences (Fig. 479-6). However, mispairing of homologous sequences leads to unequal crossover, with gene duplication on one of the chromosomes and gene deletion on the other chromosome. A significant fraction of growth hormone (GH) gene deletions, for example, involve unequal crossing-over (Chap. 391). The GH gene is a member of a large gene cluster that includes a GH variant gene as well as several structurally related chorionic somatomammotropin genes and pseudogenes (highly homologous but functionally inactive relatives of a normal gene). Because such gene clusters contain multiple homologous DNA sequences arranged in tandem, they are particularly prone to undergo recombination and, consequently, gene duplication or deletion. Duplication of the PMP22 gene because of unequal crossing-over results in increased gene dosage and type IA Charcot-Marie-Tooth disease. In contrast, unequal crossing-over resulting in deletion of PMP22 causes a distinct neuropathy called hereditary neuropathy with liability to pressure palsies (HNPP) (Chap. 457). Glucocorticoid-remediable aldosteronism (GRA) is caused by a gene fusion or rearrangement involving the genes that encode aldosterone synthase (CYP11B2) and steroid 11β-hydroxylase (CYP11B1), normally arranged in tandem on chromosome 8q. These two genes are 95% identical, predisposing to gene duplication and deletion by

unequal crossing-over. The rearranged gene product contains the regulatory regions of 11β-hydroxylase fused to the coding sequence of aldosterone synthase. Consequently, the latter enzyme is expressed in the adrenocorticotropic hormone (ACTH)–dependent zona fasciculata of the adrenal gland, resulting in overproduction of mineralocorticoids and hypertension (Chap. 398).

Gene conversion refers to a nonreciprocal exchange of homologous genetic information. It has been used to explain how an internal portion of a gene is replaced by a homologous segment copied from another allele or locus; these genetic alterations may range from a few nucleotides to a few thousand nucleotides. As a result of gene conversion, it is possible for short DNA segments of two chromosomes to be identical, even though these sequences are distinct in the parents. A practical consequence of this phenomenon is that nucleotide substitutions can occur during gene conversion between related genes, often altering the function of the gene. In disease states, gene conversion often involves intergenic exchange of DNA between a gene and a related pseudogene.

FIGURE 479-8 A few genomic regions are imprinted in a parent-specific fashion. [Panels: maternal and paternal somatic cells; germline development with imprint reset in the maternal and paternal germlines; zygote.] The unmethylated chromosomal regions are actively expressed, whereas the methylated regions are silenced. In the germline, the imprint is reset in a parent-specific fashion: both chromosomes are unmethylated in the maternal (mat) germline and methylated in the paternal (pat) germline. In the zygote, the resulting imprinting pattern is identical with the pattern in the somatic cells of the parents.
For example, the 21-hydroxylase gene (CYP21A2) is adjacent to a nonfunctional pseudogene (CYP21A1P). Many of the nucleotide substitutions that are found in the CYP21A2 gene in patients with congenital adrenal hyperplasia correspond to sequences that are present in the CYP21A1P pseudogene, suggesting gene conversion as one cause of mutagenesis. In addition, mitotic gene conversion has been suggested as a mechanism to explain revertant mosaicism in which an inherited mutation is “corrected” in certain cells. For example, patients with autosomal recessive generalized atrophic benign epidermolysis bullosa have acquired reverse mutations in one of the two mutated COL17A1 alleles, leading to clinically unaffected patches of skin.

INSERTIONS AND DELETIONS Although many instances of insertions and deletions occur as a consequence of unequal crossing-over, there is also evidence for internal duplication, inversion, or deletion of DNA sequences. The fact that certain deletions or insertions appear to occur repeatedly as independent events indicates that specific regions within the DNA sequence predispose to these errors. For example, certain regions of the DMD gene, which encodes dystrophin, appear to be hot spots for deletions and result in muscular dystrophy (Chap. 460). Some regions within the human genome are rearrangement hot spots and lead to CNVs.

ERRORS IN DNA REPAIR Because mutations caused by defects in DNA repair accumulate as somatic cells divide, these types of mutations are particularly important in the context of neoplastic disorders. Several genetic disorders involving DNA repair enzymes underscore their importance. Patients with xeroderma pigmentosum have defects in DNA damage recognition or in the nucleotide excision and repair pathway (Chap. 81). Exposed skin is dry and pigmented and is extraordinarily sensitive to the mutagenic effects of ultraviolet irradiation. Variants in more than 10 different genes have been shown to cause the different forms of xeroderma pigmentosum. Ataxia-telangiectasia is a multisystem disorder that includes progressive neurodegenerative cerebellar ataxia, immunologic defects, telangiectatic lesions, lymphomas and leukemias, and hypersensitivity to ionizing radiation (Chap. 450). The discovery of the ataxia-telangiectasia mutated (ATM) gene revealed that it is homologous to genes involved in DNA repair and control of cell cycle checkpoints. Mutations in the ATM gene give rise to defects in meiosis as well as increasing susceptibility to damage from ionizing radiation.
FIGURE 479-9 Point mutations causing β thalassemia as an example of allelic heterogeneity. The β-globin gene is located in the globin gene cluster (ε, Gγ, Aγ, ψβ, δ, and β genes; scale from −10 kb to 60 kb). Point mutations occur in the promoter, the CAP site, the 5′-untranslated region, the initiation codon, each of the three exons, the introns, or the polyadenylation signal. Many mutations introduce missense or nonsense mutations, whereas others cause defective RNA splicing. Not shown here are deletion mutations of the β-globin gene or larger deletions of the globin locus that can also result in thalassemia. Symbols in the figure mark promoter mutations, the CAP site, the 5′UTR, the initiation codon, defective RNA processing, missense and nonsense mutations, and the poly A signal.

Fanconi’s anemia is also associated with an increased risk of multiple acquired genetic abnormalities. It is characterized by diverse congenital anomalies and a strong predisposition to develop aplastic anemia and acute myelogenous leukemia (Chap. 109). Cells from these patients are susceptible to chromosomal breaks caused by a defect in genetic recombination. It can be caused by mutations in the multiple genes forming the Fanconi’s anemia pathway, which is involved in DNA repair and replication. HNPCC (Lynch syndrome) is characterized by autosomal dominant transmission of colon cancer, young age (<50 years) of presentation, predisposition to lesions in the proximal large bowel, and associated malignancies such as uterine cancer and ovarian cancer. HNPCC is predominantly caused by mutations in one of several different mismatch repair (MMR) genes including MutS homologue 2 (MSH2), MutL homologue 1 and 6 (MLH1, MLH6), MSH6, PMS1, and PMS2 (Chap. 86). These proteins are involved in the detection of nucleotide mismatches and in the recognition of slipped-strand trinucleotide repeats. Germline mutations in these genes lead to microsatellite instability and a high mutation rate in colon cancer. Genetic screening tests for this disorder are now being used for families considered to be at risk. Recognition of HNPCC allows early screening with colonoscopy and the implementation of prevention strategies using nonsteroidal anti-inflammatory drugs.

UNSTABLE DNA SEQUENCES Trinucleotide repeats may be unstable and expand beyond a critical number. Mechanistically, the expansion is thought to be caused by unequal recombination and slipped mispairing. A premutation represents a small increase in trinucleotide copy number. In subsequent generations, the expanded repeat may increase further in length and result in an increasingly severe phenotype, a process called dynamic mutation (see below for discussion of anticipation).
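The idea of a repeat tract expanding "beyond a critical number" can be illustrated with a small sketch that counts the longest uninterrupted run of a trinucleotide unit in a sequence and compares it with cutoffs. The function names are hypothetical, and the cutoffs are deliberately left as caller-supplied parameters because the text gives no numeric thresholds; real cutoffs are disorder-specific and must come from the relevant clinical reference.

```python
import re


def longest_repeat_run(seq: str, unit: str = "CAG") -> int:
    """Return the longest number of consecutive copies of `unit` in `seq`."""
    s, u = seq.upper(), unit.upper()
    if not u:
        raise ValueError("repeat unit must not be empty")
    # A lookahead lets the scan consider a run starting at every position,
    # so runs in any reading frame are found.
    pattern = re.compile(f"(?=((?:{re.escape(u)})+))")
    best = 0
    for match in pattern.finditer(s):
        best = max(best, len(match.group(1)) // len(u))
    return best


def classify_repeat(count: int, premutation_min: int, full_min: int) -> str:
    """Compare a repeat count with caller-supplied (hypothetical) cutoffs."""
    if premutation_min > full_min:
        raise ValueError("premutation_min must not exceed full_min")
    if count >= full_min:
        return "full expansion"
    if count >= premutation_min:
        return "premutation"
    return "normal range"


if __name__ == "__main__":
    allele = "AA" + "CAG" * 12 + "TT"
    n = longest_repeat_run(allele, "CAG")
    # The cutoffs below are placeholders for illustration only.
    print(n, classify_repeat(n, premutation_min=10, full_min=20))
```

Counting the longest uninterrupted run is one simple design choice; interrupted tracts, which can matter clinically, would need a more tolerant matcher.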
FIGURE 479-10 A. Examples of mutations (now commonly referred to as variations). The coding strand is shown with the encoded amino acid sequence. [Panels: wild type; silent mutation; missense mutation; nonsense mutation; 1-bp deletion with frameshift.] B. Chromatograms of sequence analyses after amplification of genomic DNA by polymerase chain reaction.

Trinucleotide expansion was first recognized as a cause of the fragile X syndrome, one of the most common causes of intellectual disability. Other disorders arising from a similar mechanism include

Huntington’s disease, X-linked spinobulbar muscular atrophy, and myotonic dystrophy. Malignant cells are also characterized by genetic instability, indicating a breakdown in mechanisms that regulate DNA repair and the cell cycle.

Functional Consequences of Mutations

Functionally, mutations can be broadly classified as gain-of-function and loss-of-function mutations. Gain-of-function mutations are typically dominant (i.e., they result in phenotypic alterations when a single allele is affected). Inactivating mutations are usually recessive, and an affected individual is homozygous or compound heterozygous (i.e., carrying two different mutant alleles of the same gene) for the disease-causing mutations. Alternatively, mutation in a single allele can result in haploinsufficiency, a situation in which one normal allele is not sufficient to maintain a normal phenotype. Haploinsufficiency is a commonly observed mechanism in diseases associated with mutations in transcription factors (Table 479-2). Remarkably, the clinical features among patients with an identical mutation often vary significantly. One mechanism underlying this variability is the influence of modifying genes. Haploinsufficiency can also affect the expression of rate-limiting enzymes. For example, haploinsufficiency in enzymes involved in heme synthesis can cause porphyrias (Chap. 428).

An increase in dosage of a gene product may also result in disease, as illustrated by the duplication of the DAX1 (NR0B1) gene in dosage-sensitive sex reversal (Chap. 402).

Mutation in a single allele can also result in loss of function due to a dominant-negative effect. In this case, the mutated allele interferes with the function of the normal (wild type) gene product by one of several different mechanisms: (1) a mutant protein may interfere with the function of a multimeric protein complex, as illustrated by mutations in type 1 collagen (COL1A1, COL1A2) genes in osteogenesis imperfecta (Chap. 425); (2) a mutant protein may occupy binding sites on proteins or promoter response elements, as illustrated by thyroid hormone resistance β, a disorder in which inactivated thyroid hormone receptor β binds to target genes and functions as an antagonist of normal receptors (Chap. 394); or (3) a mutant protein can be cytotoxic as in α1 antitrypsin deficiency (Chap. 303) or autosomal dominant neurohypophyseal diabetes insipidus (Chap. 393), in which the abnormally folded proteins are trapped within the endoplasmic reticulum and ultimately cause cellular damage.

[Figure 479-10B chromatogram panels: wild type; heterozygous point mutation; homozygous point mutation.]

Genotype and Phenotype

ALLELES, GENOTYPES, AND HAPLOTYPES An observed trait is referred to as a phenotype; the genetic information defining the phenotype is called the genotype. Alternative forms of a gene or a genetic marker are referred to as alleles. Alleles may be polymorphic variants of nucleic acids that have no apparent effect on gene expression or function. In other instances, these variants may have subtle effects on gene expression, thereby conferring adaptive advantages associated with genetic diversity. On the other hand, allelic variants may reflect mutations that clearly alter the function of a gene product. The common Glu6Val (E6V) sickle cell mutation in the β-globin gene and the ΔF508 deletion of phenylalanine (F) in the CFTR gene are examples of allelic variants of these genes that result in disease. Because each individual has two copies of each chromosome (one inherited from the mother and one inherited from the father), an individual can have only two alleles at a given locus. However, there can be many different alleles in the population. The normal or common allele is usually referred to as wild type. When alleles at a given locus are identical, the individual is homozygous. Inheriting identical copies of a mutant allele occurs in many autosomal recessive disorders, particularly in circumstances of consanguinity or isolated populations. If the alleles are different on the maternal and the paternal copy of the gene, the individual is heterozygous at this locus (Fig. 479-10).
If two different mutant alleles are inherited at a given locus, the individual is said to be a compound heterozygote. Hemizygous is used to describe males with a mutation in an X chromosomal gene or a female with a loss of one X chromosomal locus.

Genotypes describe the specific alleles at a particular locus. For example, there are three common alleles (E2, E3, E4) of the apolipoprotein E (APOE) gene. The genotype of an individual can therefore be described as APOE3/4 or APOE4/4 or any other variant. These designations indicate which alleles are present on the two chromosomes in the APOE gene at locus 19q13.2. In other cases, the genotype might be assigned arbitrary numbers (e.g., 1/2) or letters (e.g., B/b) to distinguish different alleles.

A haplotype refers to a group of alleles that are closely linked together at a genomic locus (Fig. 479-4). Haplotypes are useful for tracking the transmission of genomic segments within families and for detecting evidence of genetic recombination if the crossover event occurs between the alleles (Fig. 479-6). As an example, various alleles of the histocompatibility locus antigens (HLA) at the major histocompatibility complex (MHC) on chromosome 6 are used to establish haplotypes associated with certain disease states. For example, 21-hydroxylase deficiency, complement deficiency, and hemochromatosis are each associated with specific HLA haplotypes. It is now recognized that these genes lie in close proximity to the HLA locus, which explains why HLA associations were identified even before the disease genes were cloned and localized. In other cases, specific HLA associations with diseases such as ankylosing spondylitis (HLA-B27) or type 1 diabetes mellitus (HLA-DR4) reflect the role of specific HLA allelic variants in susceptibility to these autoimmune diseases.
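The zygosity terms defined above (homozygous, heterozygous, compound heterozygous, hemizygous) follow simple rules that can be written out as a short sketch. The function name `describe_genotype`, the use of `None` to represent a missing second allele, and the allele labels in the usage example are illustrative assumptions, not a clinical nomenclature.

```python
from typing import Optional


def describe_genotype(allele1: str, allele2: Optional[str], wild_type: str) -> str:
    """Name the zygosity at one locus from its two alleles.

    allele2 is None when only one copy of the locus is present (for example,
    an X-chromosomal gene in a male); this encoding is an assumption.
    """
    if allele2 is None:
        status = "wild type" if allele1 == wild_type else "mutant"
        return f"hemizygous {status}"
    if allele1 == allele2:
        status = "wild type" if allele1 == wild_type else "mutant"
        return f"homozygous {status}"
    if allele1 != wild_type and allele2 != wild_type:
        # Two different mutant alleles at the same locus.
        return "compound heterozygous"
    return "heterozygous"


if __name__ == "__main__":
    # Allele labels are illustrative placeholders.
    print(describe_genotype("WT", "WT", wild_type="WT"))
    print(describe_genotype("WT", "dF508", wild_type="WT"))
    print(describe_genotype("dF508", "dF508", wild_type="WT"))
    print(describe_genotype("dF508", "G551D", wild_type="WT"))
    print(describe_genotype("mutX", None, wild_type="WT"))
```

For polymorphic loci such as APOE, where no allele is "mutant" in the disease sense, the same comparison applies but the labels would simply read as the allele pair (for example, 3/4).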
The characterization of common SNP haplotypes in numerous populations from different parts of the world has provided the necessary tools for association studies designed to detect genes involved in the pathogenesis of complex disorders (Table 479-1). The presence or absence of certain haplotypes can also be relevant for the customized choice of medical therapies (pharmacogenomics) or may have value for preventive strategies.

Genotype-phenotype correlation describes the association of a specific mutation and the resulting phenotype. The phenotype may differ depending on the location or type of the mutation in some genes. For example, in von Hippel–Lindau disease, an autosomal dominant multisystem disease that can include renal cell carcinoma, hemangioblastomas, and pheochromocytomas, among others, the phenotype varies greatly, and the identification of the specific mutation can be clinically useful in order to predict the spectrum of disease manifestations.
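The zygosity terms defined in this section can be summarized as a small decision rule. The following Python sketch is illustrative only: the function name `classify_genotype`, the `wild_type` label, and the allele names `mut1` and `mut2` are assumptions introduced for this example, and a single-copy locus (such as an X-linked gene in a male) is modeled by omitting the second allele.

```python
def classify_genotype(allele_1, allele_2=None, wild_type="wt"):
    """Classify zygosity at a single locus (illustrative helper, not a standard API).

    allele_2=None models a locus present in only one copy,
    e.g., an X-chromosomal gene in a male.
    """
    if allele_2 is None:
        return "hemizygous"
    if allele_1 == allele_2:
        # Identical alleles on both chromosomes (wild type or mutant).
        return "homozygous"
    if wild_type not in (allele_1, allele_2):
        # Two different alleles, neither of which is the wild type.
        return "compound heterozygous"
    return "heterozygous"


# APOE example from the text, treating E3 as the common allele (an assumption).
print(classify_genotype("E3", "E4", wild_type="E3"))  # heterozygous
print(classify_genotype("E4", "E4", wild_type="E3"))  # homozygous
print(classify_genotype("mut1", "mut2"))              # compound heterozygous
print(classify_genotype("mut1"))                      # hemizygous
```

The rule deliberately ignores clinical significance: it only describes which alleles are present, which is the sense in which the text uses the term genotype.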

ALLELIC HETEROGENEITY Allelic heterogeneity refers to the fact that different mutations in the same genetic locus can cause an identical or similar phenotype. For example, many different mutations of the β-globin locus can cause β thalassemia (Table 479-3) (Fig. 479-9). In essence, allelic heterogeneity reflects the fact that many different muta tions can alter protein structure and function. For this reason, maps of inactivating mutations in genes usually show a near-random distribu tion. Exceptions include (1) a founder effect, in which a particular mutation that does not affect reproductive capacity can be traced to a single individual; (2) “hot spots” for mutations, in which the nature of the DNA sequence predisposes to a recurring mutation; and (3) local ization of mutations to certain domains that are particularly critical for protein function. Allelic heterogeneity creates a practical problem for genetic testing because one must often examine the entire genetic locus for mutations, because these can differ in each patient. For example, ~2000 variants have been identified in the CFTR gene to date, although some of them are very rare and some may not be disease-causing (Fig. 479-3). Mutational analysis may initially focus on a panel of mutations that are particularly frequent (often taking the ethnic background of the patient into account), but a negative result does not exclude the presence of a mutation elsewhere in the gene. Until recently, muta tional analyses tended to focus on the coding region of a gene without considering regulatory and intronic regions. However, disease-causing mutations may be located outside the coding regions, so negative results need to be interpreted with caution. The increasingly wide spread access to comprehensive sequencing technologies, WES and WGS, greatly facilitates unbiased mutational analyses. 
However, comprehensive sequencing can result in significant diagnostic challenges because the detection of a sequence alteration is not always sufficient to establish that it has a causal role; such an alteration is reported as a variant of uncertain significance (VUS).

PHENOTYPIC HETEROGENEITY Phenotypic heterogeneity occurs when more than one phenotype is caused by allelic mutations (e.g., by different mutations in the same gene) (Table 479-3). For example, laminopathies are monogenic multisystem disorders that result from mutations in the LMNA gene, which encodes the nuclear lamins A and C. Multiple autosomal dominant and recessive disorders are caused by mutations in the LMNA gene. They include several forms of lipodystrophies, Emery-Dreifuss muscular dystrophy, progeria syndromes, a form of neuronal Charcot-Marie-Tooth disease (type 2B1), and a group of overlapping syndromes. Remarkably, hierarchical cluster analysis has revealed that the phenotypes vary depending on the position of the mutation (genotype-phenotype correlation). Similarly, identical mutations in the FGFR2 gene can result in very distinct phenotypes: Crouzon’s syndrome (craniofacial synostosis) or Pfeiffer’s syndrome (acrocephalopolysyndactyly).

LOCUS OR NONALLELIC HETEROGENEITY AND PHENOCOPIES Nonallelic or locus heterogeneity refers to the situation in which a similar disease phenotype results from mutations at different genetic loci (Table 479-3). This often occurs when more than one gene product produces different subunits of an interacting complex or when different genes are involved in the same genetic cascade or physiologic pathway. For example, osteogenesis imperfecta can arise from mutations in two different procollagen genes (COL1A1 or COL1A2) that are located on different chromosomes and can involve multiple other genes (Chap. 425). The effects of inactivating mutations in these two genes are similar because the protein products comprise different subunits of the helical collagen fiber.
Similarly, muscular dystrophy syndromes can be caused by mutations in various genes, consistent with the fact that they can be transmitted in an X-linked (Duchenne or Becker), autosomal dominant (limb-girdle muscular dystrophy type 1), or autosomal recessive (limb-girdle muscular dystrophy type 2) manner (Chap. 460). Mutations in the X-linked DMD gene, which encodes dystrophin, are the most common cause of muscular dystrophy. This feature reflects the large size of the gene (2.3 Mb, 79 exons), as well as the fact that the phenotype is expressed in hemizygous males because they have only a single copy of the X chromosome. Dystrophin is associated with a large protein complex linked to the membrane-associated cytoskeleton in muscle. Mutations in several different components of this protein complex can also cause muscular dystrophy syndromes. Although the phenotypic features of some of these disorders are distinct, the phenotypic spectrum caused by mutations in different genes overlaps, thereby leading to nonallelic heterogeneity. It should be noted that mutations in dystrophin are also associated with allelic heterogeneity. For example, mutations in the DMD gene can cause either Duchenne’s or the less severe Becker’s muscular dystrophy, depending on the severity of the protein defect.

TABLE 479-3 Selected Examples of Phenotypic Heterogeneity and Locus Heterogeneity

Phenotypic Heterogeneity

GENE, PROTEIN      PHENOTYPE                                  INHERITANCE   OMIM
LMNA, Lamin A/C    Emery-Dreifuss muscular dystrophy          AD
                   Familial partial lipodystrophy Dunnigan    AD
                   Hutchinson-Gilford progeria                AD
                   Atypical Werner’s syndrome                 AD
                   Dilated cardiomyopathy 1A                  AD
                   Familial atrial fibrillation 3             AD
                   Charcot-Marie-Tooth type 2B1               AR
KRAS               Noonan’s syndrome                          AD
                   Cardio-facio-cutaneous syndrome 1          AD

Locus Heterogeneity

PHENOTYPE / GENE   CHROMOSOMAL LOCATION   PROTEIN
Familial hypertrophic cardiomyopathy
 Genes encoding sarcomeric proteins
  MYH7             14q11.2                Myosin heavy chain beta
  TNNT2            1q32.1                 Troponin-T2
  TPM1             15q22.2                Tropomyosin alpha
  MYBPC3           11p11.2                Myosin-binding protein C
  TNNC1            19q13.4                Troponin 1
  MYL2             12q24.11               Myosin light chain 2
  MYL3             3p21.31                Myosin light chain 3
  TTN              2q31.2                 Cardiac titin
  ACTC             15q14                  Cardiac alpha actin
  MYH6             14q11.2                Myosin heavy chain alpha
  MYLK2            20q11.21               Myosin light-peptide kinase
  CAV3             3p25                   Caveolin 3
 Genes encoding nonsarcomeric proteins
  MT-T1                                   Mitochondrial tRNA isoleucine
  MT-TG                                   Mitochondrial tRNA glycine
  PRKAG2           7q36.1                 AMP-activated protein kinase γ2 subunit
  DMPK             19q13.32               Myotonin protein kinase (myotonic dystrophy)
  FRDA             9q21.11                Frataxin (Friedreich’s ataxia)
Polycystic kidney disease
  PKD1             16p13.3                Polycystin 1 (AD)
  PKD2             4q22.1                 Polycystin 2 (AD)
  PKHD1            6p21.1-p12.2           Fibrocystin/polyductin (AR)
Noonan’s syndrome
  PTPN11           12q24.13               Protein-tyrosine phosphatase 2c
  KRAS             12p12.1                KRAS

Abbreviations: AD, autosomal dominant; AR, autosomal recessive; OMIM, Online Mendelian Inheritance in Man.
Recognition of nonallelic heterogeneity is important for several reasons: (1) the ability to identify disease loci in linkage studies is reduced by including patients with similar phenotypes but different genetic disorders; (2) genetic testing is more complex because several different genes need to be considered along with the possibility of different mutations in each of the candidate genes; and (3) novel information is gained about how genes or proteins interact, providing unique insights into molecular physiology.

Phenocopies refer to circumstances in which nongenetic conditions mimic a genetic disorder. For example, features of toxin- or drug-induced neurologic syndromes can resemble those seen in Huntington’s

disease, and vascular causes of dementia share phenotypic features with familial forms of Alzheimer’s dementia (Chap. 442). As in nonallelic heterogeneity, the presence of phenocopies has the potential to con found linkage studies and genetic testing. Patient history and subtle differences in phenotype can often provide clues that distinguish these disorders from related genetic conditions. VARIABLE EXPRESSIVITY AND INCOMPLETE PENETRANCE The same genetic mutation may be associated with a phenotypic spectrum in different affected individuals, thereby illustrating the phenomenon of variable expressivity. This may include different manifestations of a disorder variably involving different organs (e.g., multiple endocrine neoplasia [MEN]), the severity of the disorder (e.g., cystic fibrosis), or the age of disease onset (e.g., Alzheimer’s dementia). MEN 1 illustrates several of these features. In this autosomal dominant tumor syndrome, affected individuals carry an inactivating germline mutation that is inherited in an autosomal dominant fashion. After somatic inactiva tion of the alternate allele (loss of heterozygosity; Knudson two-hit model), patients can develop tumors of the parathyroid gland, endo crine pancreas, the pituitary gland, and dermatologic lesions (Chap. 400). However, the pattern of tumors in the different glands, the age at which tumors develop, and the types of hormones produced vary among

affected individuals, even within a given family. In this example, the phenotypic variability arises, in part, because of the requirement for a second somatic mutation in the normal copy of the MEN1 gene, as well as the large array of different cell types that are susceptible to the effects of MEN1 gene mutations. In part, variable expression reflects the influence of modifier genes, or genetic background, on the effects of a particular mutation. Even in identical twins, in whom the genetic constitution is essentially the same, one can occasionally see variable expression of a genetic disease. Interactions with the environment can also influence the course of a disease. For example, the manifestations and severity of hemochro matosis can be influenced by iron intake (Chap. 426), and the course of phenylketonuria is affected by exposure to phenylalanine in the diet (Chap. 431). Other metabolic disorders, such as hyperlipidemias and porphyria, also fall into this category. Many mechanisms, including genetic effects and environmental influences, can therefore lead to variable expressivity. In genetic counseling, it is particularly important to recognize this variability, because one cannot always predict the course of disease, even when the mutation is known. Penetrance refers to the proportion of individuals with a mutant genotype that express the phenotype. If all carriers of a mutant express the phenotype, penetrance is complete, whereas it is said to be incomplete or reduced if some individuals do not exhibit features of the phenotype. Dominant conditions with incomplete penetrance are characterized by skipping of generations with unaffected carriers transmitting the mutant gene. For example, hypertrophic obstructive cardiomyopathy (HCM) caused by mutations in the myosin-binding protein C gene is a dominant disorder with clinical features in only a subset of patients who carry the mutation (Chap. 267). 
Patients who have the mutation, but no evidence of the disease, can still transmit the disorder to subsequent generations. In many conditions with postnatal onset, the proportion of gene carriers who are affected varies with age. Thus, when describing penetrance, one must specify age. For example, for disorders such as Huntington’s disease or familial amyotrophic lateral sclerosis, which present later in life, the rate of penetrance is influenced by the age at which the clinical assessment is performed. Imprinting can also modify the penetrance of a disease. For example, in patients with AHO, mutations in the Gsα subunit (GNAS1 gene) are expressed clinically only in individuals who inherit the mutation from their mother (Chap. 422). SEX-INFLUENCED PHENOTYPES Certain mutations affect males and females quite differently. In some instances, this is because the gene resides on the X or Y sex chromosomes (X-linked disorders and Y-linked disorders). As a result, the phenotype of mutated X-linked genes will be expressed fully in males but variably in heterozygous females, depending on the degree of X-inactivation and the function of the gene. For example, most heterozygous female carriers of factor VIII deficiency (hemophilia A) are asymptomatic because sufficient factor VIII is produced to prevent a defect in coagulation (Chap. 121). On the other hand, some females heterozygous for the X-linked lipid storage defect caused by α-galactosidase A deficiency (Fabry’s disease) experience mild manifestations of painful neuropathy, as well as other features of the disease (Chap. 429). Because only males have a Y chro mosome, mutations in genes such as SRY, which causes male-to-female sex reversal, or DAZ (deleted in azoospermia), which causes abnor malities of spermatogenesis, are unique to males (Chap. 402). Other diseases are expressed in a sex-limited manner because of the differential function of the gene product in males and females. 
Activating mutations in the luteinizing hormone receptor cause dominant male-limited precocious puberty in boys (Chap. 403). The phenotype is unique to males because activation of the receptor induces testosterone production in the testis, whereas it is function ally silent in the immature ovary. Biallelic inactivating mutations of the follicle-stimulating hormone (FSH) receptor cause primary ovarian failure in females because the follicles do not develop in the absence of FSH action. In contrast, affected males have a more subtle phenotype, because testosterone production is preserved (allowing sexual maturation) and spermatogenesis is only partially impaired

(Chap. 403). In congenital adrenal hyperplasia, most commonly caused by 21-hydroxylase deficiency, cortisol production is impaired and ACTH stimulation of the adrenal gland leads to increased production of androgenic precursors (Chap. 398). In females, the increased androgen level causes ambiguous genitalia, which can be recognized at the time of birth. In males, the diagnosis may be made on the basis of adrenal insufficiency at birth, because the increased adrenal androgen level does not alter sexual differentiation, or later in childhood, because of the development of precocious puberty. Hemochromatosis is more common in males than in females, pre sumably because of differences in dietary iron intake and losses associated with menstruation and pregnancy in females (Chap. 426).

Chromosomal Disorders Chromosomal disorders and the techniques used for their characterization have been discussed in detail in previous editions of this textbook. Chromosomal or cytogenetic disorders are caused by numerical (aneuploidy) or structural aberrations (deletions, duplications, translocations, inversions, dicentric and ring chromosomes, Robertsonian translocations) in chromosomes. They occur in ~1% of the general population, in 8% of stillbirths, and in close to 50% of spontaneously aborted fetuses. Indications for cytogenetic and cytogenomic chromosome analyses are summarized in Table 479-4. Contiguous gene syndromes (e.g., large deletions affecting several genes) have been useful for identifying the location of new disease-causing genes. Because of the variable size of gene deletions in different patients, a systematic comparison of phenotypes and locations of deletion breakpoints allows positions of particular genes to be mapped within the critical genomic region.

Monogenic Mendelian Disorders Monogenic human diseases are frequently referred to as Mendelian disorders because they obey the principles of genetic transmission originally set forth in Gregor Mendel’s classic work. The continuously updated OMIM catalogue lists several thousand of these disorders and provides information about the clinical phenotype, molecular basis, allelic variants, and pertinent animal models (Table 479-1). The mode of inheritance for a given phenotypic trait or disease is determined by pedigree analysis. All affected and unaffected individuals in the family are recorded in a pedigree using standard symbols (Fig. 479-11). The principles of allelic segregation, and the transmission of alleles from parents to children, are illustrated in Fig. 479-12. One dominant (A) allele and one recessive (a) allele can display three Mendelian modes of inheritance: autosomal dominant, autosomal recessive, and X-linked.
About 65% of human monogenic disorders are autosomal dominant, 25% are autosomal recessive, and 5% are X-linked. Genetic testing is now readily available for the characterization of monogenic disorders and plays an important role in clinical medicine (Chap. 480).

TABLE 479-4 Indications for Cytogenetic and Cytogenomic Analysis across the Life Span

TIMING OF TESTING        INDICATIONS FOR TESTING
Prenatal                 Advanced maternal age
                         Abnormalities on ultrasound
                         Increased risk for genetic disorder on maternal serum screen
Neonatal and childhood   Multiple congenital anomalies
                         Intellectual disability
                         Autism spectrum disorders
                         Developmental delay
                         Failure to thrive
                         Short stature
                         Disorders of sexual development
                         History of familial chromosomal alteration
                         Cancer
Adult                    Infertility
                         Recurrent miscarriage
                         Familial cancer
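The principle of allelic segregation for one dominant (A) and one recessive (a) allele can be worked through by enumerating every combination of one allele from each parent. The Python sketch below is a minimal illustration under simple assumptions: a single autosomal locus, equal probability of transmitting either parental allele, and genotypes written as two-letter strings. The function name `offspring_genotypes` is hypothetical and introduced only for this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def offspring_genotypes(parent_1, parent_2):
    """Return {genotype: probability} for one autosomal locus.

    Each parent is a two-character string such as "Aa"; every allele
    is assumed to be transmitted with equal probability.
    """
    counts = Counter(
        # Sort so that "aA" and "Aa" are counted as the same genotype.
        "".join(sorted(allele_1 + allele_2))
        for allele_1, allele_2 in product(parent_1, parent_2)
    )
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Two heterozygous parents: 1/4 AA, 1/2 Aa, 1/4 aa.
print(offspring_genotypes("Aa", "Aa"))
# Heterozygous x homozygous recessive parent: 1/2 Aa, 1/2 aa.
print(offspring_genotypes("Aa", "aa"))
```

Whether a given genotype is affected depends on the mode of inheritance: in a dominant disorder caused by A, both AA and Aa offspring are affected, whereas in a recessive disorder caused by a, only aa offspring are affected.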

[Figure 479-11 symbol key: female; male; unknown sex; multiple siblings; spontaneous abortion; deceased male; affected male; proband; affected female; heterozygous male; heterozygous female; female carrier of X-linked trait; mating; consanguineous union; monozygotic twins; dizygotic twins.]

FIGURE 479-11 Standard pedigree symbols.

AUTOSOMAL DOMINANT DISORDERS In autosomal dominant disorders, mutations in a single allele are sufficient to cause the disease. In contrast to recessive disorders, in which disease pathogenesis is relatively straightforward because there is a biallelic loss of gene function, dominant disorders can be caused by various disease mechanisms, many of which are unique to the function of the genetic pathway involved. Mechanistically, the mutation may confer constitutive activation (gain of function), exert a dominant negative effect, or result in loss of function and haploinsufficiency.

In autosomal dominant disorders, individuals are affected in successive generations; the disease does not occur in the offspring of unaffected individuals. Males and females are affected with equal frequency because the defective gene resides on one of the 22 autosomes (Fig. 479-13A). Autosomal dominant mutations alter one of the two alleles at a given locus. Because the alleles segregate randomly at meiosis, the probability that an offspring will be affected is 50%. Unless there is a new germline mutation, an affected individual has an affected parent. Children with a normal genotype do not transmit the disorder. Due to differences in penetrance or expressivity (see above), the clinical manifestations of autosomal dominant disorders may be variable. Because of these variations, it is sometimes challenging to determine the pattern of inheritance.

It should be recognized, however, that some individuals acquire a mutated gene from an unaffected parent due to de novo germline mutations. They occur more frequently during later cell divisions in gametogenesis, which explains why siblings are rarely affected. As noted, new germline mutations occur more frequently in fathers of advanced age. For example, the average age of fathers with new germline mutations that cause Marfan’s syndrome is ~37 years, whereas fathers who transmit the disease by inheritance have an average age of ~30 years.

FIGURE 479-12 Segregation of alleles. Segregation of genotypes in the offspring of parents with one dominant (A) and one recessive (a) allele. The distribution of the parental alleles to their offspring depends on the combination present in the parents. Filled symbols = affected individuals. (Offspring ratios shown in the figure: 50:50 and 25:50:25.)

FIGURE 479-13 A. Dominant, B. recessive, C. X-linked, and D. mitochondrial (matrilinear) inheritance. (Panel titles: autosomal dominant [A]; autosomal recessive and autosomal recessive with pseudodominance [B]; X-linked [C]; mitochondrial [D].)

AUTOSOMAL RECESSIVE DISORDERS In recessive disorders, the mutated alleles result in a complete or partial loss of function. They frequently involve enzymes in metabolic pathways, receptors, or proteins in signaling cascades. In an autosomal recessive disease, the affected individual, who can be of either sex, is a homozygote or compound heterozygote for a single-gene defect. With a few important exceptions, autosomal recessive diseases are rare and occur more often in the context of parental consanguinity. The relatively high frequency of certain recessive disorders, such as sickle cell anemia, cystic fibrosis, and thalassemia, is partially explained by a selective biologic advantage for the heterozygous state (see below). Although heterozygous carriers of a defective allele are usually clinically normal, they may display subtle differences in phenotype that only become apparent with more precise testing or in the context of certain environmental influences. In sickle cell anemia, for example, heterozygotes are normally asymptomatic. However, in situations of dehydration or diminished oxygen pressure, sickle cell crises can also occur in heterozygotes (Chap. 103).

In most instances, an affected individual is the offspring of heterozygous parents. In this situation, there is a 25% chance that the offspring will have a normal genotype, a 50% probability of a heterozygous state, and a 25% risk of homozygosity for the recessive alleles (Figs. 479-12 and 479-13B). In the case of one unaffected heterozygous and one affected homozygous parent, the probability of disease increases to 50% for

each child. In this instance, the pedigree analysis mimics an autosomal dominant mode of inheritance (pseudodominance). In contrast to autosomal dominant disorders, new mutations in recessive alleles are rarely manifest because they usually result in an asymptomatic carrier state.

X-LINKED DISORDERS Males have only one X chromosome; consequently, a daughter always inherits her father’s X chromosome in addition to one of her mother’s two X chromosomes. A son inherits the Y chromosome from his father and one maternal X chromosome. Thus, the characteristic features of X-linked inheritance are (1) the absence of father-to-son transmission and (2) the fact that all daughters of an affected male are obligate carriers of the mutant allele (Fig. 479-13C). The risk of developing disease due to a mutant X-chromosomal gene differs in the two sexes. Because males have only one X chromosome, they are hemizygous for the mutant allele; thus, they are more likely to develop the mutant phenotype, regardless of whether the mutation is dominant or recessive. A female may be either heterozygous or homozygous for the mutant allele, which may be dominant or recessive. The terms X-linked dominant and X-linked recessive are therefore only applicable to expression of the mutant phenotype in women. In addition, the expression of X-chromosomal genes is influenced by X chromosome inactivation.

Y-LINKED DISORDERS The Y chromosome has a relatively small number of genes. One such gene, the sex-determining region Y (SRY) gene, which encodes the testis-determining factor (TDF), is crucial for normal male development. Normally, there is infrequent exchange of sequences on the Y chromosome with the X chromosome. The SRY region is adjacent to the pseudoautosomal region, a chromosomal segment on the X and Y chromosomes with a high degree of homology. A crossing-over event occasionally involves the SRY region with the distal tip of the X chromosome during meiosis in the male. Translocations can result in XY females with the Y chromosome lacking the SRY gene or XX males harboring the SRY gene on one of the X chromosomes (Chap. 402). Point mutations in the SRY gene may also result in individuals with an XY genotype and an incomplete female phenotype. Most of these mutations occur de novo. Men with oligospermia/azoospermia frequently have microdeletions on the long arm of the Y chromosome that involve one or more of the azoospermia factor (AZF) genes.

Exceptions to Simple Mendelian Inheritance Patterns •

MITOCHONDRIAL DISORDERS Mendelian inheritance refers to the transmission of genes encoded by DNA contained in the nuclear chromosomes. In addition, each mitochondrion contains several copies of a small circular chromosome (Chap. 481). The mitochondrial DNA (mtDNA) is ~16.5 kb and encodes transfer and ribosomal RNAs and 13 core proteins that are components of the respiratory chain involved in oxidative phosphorylation and ATP generation. The mitochondrial genome does not recombine and is inherited through the maternal line because sperm does not contribute significant cytoplasmic components to the zygote. A noncoding region of the mitochondrial chromosome, referred to as the D-loop, is highly polymorphic. This property, together with the absence of mtDNA recombination, makes it a valuable tool for studies tracing human migration and evolution, and it is also used for specific forensic applications.

Inherited mitochondrial disorders are transmitted in a matrilineal fashion; all children from an affected mother will inherit the disease, but it will not be transmitted from an affected father to his children (Fig. 479-13D). Alterations in the mtDNA that involve enzymes required for oxidative phosphorylation lead to reduction of ATP supply, generation of free radicals, and induction of apoptosis. Several syndromic disorders arising from mutations in the mitochondrial genome are known in humans, and they affect both protein-coding and tRNA genes. The broad clinical spectrum often involves (cardio)myopathies and encephalopathies because of the high dependence of these tissues on oxidative phosphorylation. The age of onset and the clinical course are highly variable because of the unusual mechanisms of transmission of mtDNA, which replicates independently from nuclear DNA. During cell replication, the proportion of wild-type and mutant mitochondria can drift among different cells and tissues. The resulting heterogeneity in the proportion of mitochondria with and without a mutation is referred to as heteroplasmy and underlies the phenotypic variability that is characteristic of mitochondrial diseases.
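The drift in the proportion of mutant mtDNA from one cell division to the next can be illustrated with a simple random-partition simulation. This is a hedged sketch, not a validated biological model: the function name `simulate_heteroplasmy`, the fixed number of mtDNA copies per cell, the doubling-then-random-sampling scheme, and all default parameter values are assumptions chosen only to show how chance alone can shift the mutant fraction along a cell lineage.

```python
import random


def simulate_heteroplasmy(mutant_fraction, copies_per_cell=100,
                          divisions=10, seed=0):
    """Follow the mutant mtDNA fraction along one cell lineage.

    Assumed toy model: before each division every mtDNA copy is
    duplicated, and the daughter cell receives a random half of the pool.
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    mutant = round(mutant_fraction * copies_per_cell)
    history = [mutant / copies_per_cell]
    for _ in range(divisions):
        # 1 = mutant copy, 0 = wild-type copy, after duplication.
        pool = [1] * (2 * mutant) + [0] * (2 * (copies_per_cell - mutant))
        # The daughter cell inherits a random sample of the duplicated pool.
        mutant = sum(rng.sample(pool, copies_per_cell))
        history.append(mutant / copies_per_cell)
    return history


# Three lineages starting from the same 50% mutant load diverge by chance.
for seed in (1, 2, 3):
    print(simulate_heteroplasmy(0.5, seed=seed))
```

In this toy model a cell that starts with only wild-type or only mutant copies stays that way, whereas mixed populations wander between divisions, which mirrors the tissue-to-tissue variability described above.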

Acquired somatic mutations in mitochondria are thought to be involved in several age-dependent degenerative disorders affecting predominantly muscle and the peripheral and central nervous system (e.g., Alzheimer’s and Parkinson’s diseases). Establishing that an mtDNA alteration is causal for a clinical phenotype is challenging because of the high degree of polymorphism in mtDNA and the phenotypic variability characteristic of these disorders. Certain pharmacologic treatments may have an impact on mitochondria and/or their function. For example, treatment with the antiretroviral compound azidothymidine (AZT) causes an acquired mitochondrial myopathy through depletion of muscular mtDNA.

MOSAICISM Mosaicism refers to the presence of two or more genetically distinct cell lines in the tissues of an individual. It results from a mutation that occurs during embryonic, fetal, or extrauterine development. The developmental stage at which the mutation arises will determine whether germ cells and/or somatic cells are involved. Chromosomal mosaicism results from nondisjunction at an early embryonic mitotic division, leading to the persistence of more than one cell line, as exemplified by some patients with Turner’s syndrome (Chap. 402).

Somatic mosaicism is characterized by a patchy distribution of genetically altered somatic cells. The McCune-Albright syndrome, for example, is caused by activating mutations in the stimulatory G protein α (Gsα) that occur postzygotically in early development (Chap. 422). The clinical phenotype varies depending on the tissue distribution of the mutation; manifestations include ovarian cysts that secrete sex steroids and cause precocious puberty, polyostotic fibrous dysplasia, café-au-lait skin pigmentation, GH-secreting pituitary adenomas, and hypersecreting autonomous thyroid nodules. X-INACTIVATION, IMPRINTING, AND UNIPARENTAL DISOMY Accord ing to traditional Mendelian principles, the parental origin of a mutant gene is irrelevant for the expression of the phenotype. There are, however, important exceptions to this rule. X-inactivation prevents the expression of most genes on one of the two X chromosomes in every cell of a female. Gene inactivation through genomic imprinting occurs on selected chromosomal regions of autosomes and leads to inheritable preferential expression of one of the parental alleles. It is of pathophysiologic importance in disorders where the transmission of disease is dependent on the sex of the transmitting parent and, thus, plays an important role in the expression of certain genetic disorders. Two classic examples are the Prader-Willi syndrome and Angelman’s syndrome. Prader-Willi syndrome is characterized by diminished fetal activity, obesity, hypotonia, intellectual disability, short stature, and hypogonadotropic hypogonadism. Deletions of the paternal copy of the Prader-Willi locus located on the short arm of chromosome 15 result in a contiguous gene syndrome involving missing paternal cop ies of the necdin and SNRPN genes, among others. In contrast, patients with Angelman’s syndrome, characterized by intellectual disability, seizures, ataxia, and hypotonia, have deletions involving the maternal copy of this region on chromosome 15. 
These two syndromes may also result from uniparental disomy. In this case, the syndromes are not caused by deletions on chromosome 15 but by the inheritance of either two maternal chromosomes (Prader-Willi syndrome) or two paternal chromosomes (Angelman’s syndrome). Lastly, the two distinct phenotypes can also be caused by an imprinting defect that impairs the resetting of the imprint during zygote development (defect in the father leads to Prader-Willi syndrome; defect in the mother leads to Angelman’s syndrome). Imprinting and the related phenomenon of allelic exclusion may be more common than currently documented because it is difficult to examine levels of mRNA expression from the maternal and paternal alleles in specific tissues or in individual cells. Genomic imprinting, or uniparental disomy, is involved in the pathogenesis of several other disorders and malignancies. For example, hydatidiform moles contain a normal number of diploid chromosomes, but they are exclusively of

paternal origin. The opposite situation occurs in ovarian teratomata, with 46 chromosomes of maternal origin. Expression of the imprinted gene for insulin-like growth factor 2 (IGF-2) is involved in the pathogenesis of the cancer-predisposing Beckwith-Wiedemann syndrome (BWS). These children show somatic overgrowth with organomegalies and hemihypertrophy, and they have an increased risk of embryonal malignancies such as Wilms’ tumor. Normally, only the paternally derived copy of the IGF2 gene is active, and the maternal copy is inactive. BWS can be caused by several genetic defects that result in overactivity of IGF-2 or in a missing active copy of CDKN1C, an inhibitor of cell proliferation. They include paternal uniparental disomy (UPD) of chromosome 11, aberrant methylation of this region, maternal chromosomal rearrangements, or deletions within the locus.

Alterations of the epigenome through gain and loss of DNA methylation and altered histone modifications play an important role in the pathogenesis of malignancies.

SOMATIC MUTATIONS Cancer can be considered a genetic disease at the cellular level (Chap. 76). Cancers are monoclonal in origin, indicating that they have arisen from a single precursor cell with one or several mutations in genes controlling growth (proliferation or apoptosis) and/or differentiation. These acquired somatic mutations are restricted to the tumor and its metastases and are not found in the surrounding normal tissue. The molecular alterations include dominant gain-of-function mutations in oncogenes, recessive loss-of-function mutations in tumor-suppressor genes and DNA repair genes, gene amplification, and chromosome rearrangements. Chromothripsis refers to a mutational process that produces multiple clustered chromosomal rearrangements in close vicinity, for example, after injury by ionizing radiation. Rarely, a single mutation in certain genes may be sufficient to transform a normal cell into a malignant cell. In most cancers, however, the development of a malignant phenotype requires several genetic alterations for the gradual progression from a normal cell to a cancerous cell, a phenomenon termed multistep carcinogenesis. Genome-wide analyses of cancers using deep sequencing often reveal somatic rearrangements resulting in fusion genes and mutations in multiple genes (Table 479-1 and Fig. 479-14). Comprehensive sequence analyses, now also possible through single-cell sequencing (SCS), provide insight into the evolution and genetic heterogeneity within malignancies; these include intratumoral heterogeneity among the cells of the primary tumor, intermetastatic and intrametastatic heterogeneity, and interpatient differences. These analyses further support the notion of cancer as an ongoing process of clonal evolution, in which successive rounds of clonal selection within the primary tumor and metastatic lesions result in diverse genetic and epigenetic alterations that require targeted (personalized) therapies (precision medicine). The heterogeneity of mutations within a tumor can also lead to resistance to targeted therapies because cells with mutations that confer resistance to the therapy, even if they are a minor part of the tumor population, will be selected as the more sensitive cells are eliminated.

[Figure 479-14, image not reproduced. Labels recovered from the figure: panel A, mutations per Mb; panel B, histology, HPV clade, HPV integration, APOBEC mutagenesis, UCEC-like, EMT score, purity, iCluster; panel C, PIK3CA (26%), EP300 (11%), FBXW7 (11%), PTEN (8%), HLA-A (8%), ARID1A (7%), NFE2L2 (7%), HLA-B (6%), KRAS (6%), ERBB3 (6%), MAPK1 (5%), CASP8 (4%), TGFBR2 (3%), SHKBP1 (2%), with mutation types synonymous, in-frame indel, other non-synonymous, missense, splice site, frameshift, and nonsense; panel D, gene-level SCNAs 3q (66%), CD274 (8%), PTEN (8%), YAP1 (16%), BCAR4 (16%), binned by log2[CN] (≤ –0.4; –0.4 to –0.1; 0.1 to 0.4; ≥ 0.4).]

FIGURE 479-14 Somatic alterations in cervical cancer. A. Cervical carcinoma samples ordered by histology and mutation frequency; B. clinical and molecular platform features; C. significantly mutated genes (SMGs); and D. select somatic copy number alterations. SMGs are ordered by the overall mutation frequency and color-coded by mutation type. Adeno, adenocarcinomas; Adenosq, adenosquamous cancers; CN, copy number; SCNAs, somatic copy number alterations; Squamous, squamous cell carcinomas. (Reproduced from The Cancer Genome Atlas Research Network. Integrated genomic and molecular characterization of cervical cancer. Nature 543:378–384, 2017.)

Telomeres, repeats of conserved sequences, protect the ends of the chromosomes from DNA damage or fusion with neighboring chromosomes. Telomeres shorten with age. Most human tumors express telomerase, an enzyme formed of a protein and an RNA component, which adds telomere repeats at the ends of chromosomes during replication. This mechanism impedes shortening of the telomeres and is associated with enhanced replicative capacity in cancer cells. Telomerase inhibitors provide a strategy for treating advanced human cancers.

In many cancer syndromes, there is frequently an inherited predisposition to tumor formation. In these instances, a germline mutation that inactivates one allele of an autosomal tumor-suppressor gene is inherited in an autosomal dominant fashion. If the second allele is inactivated by a somatic mutation or by epigenetic silencing in a given cell, this will lead to neoplastic growth (Knudson two-hit model). Thus, the defective allele in the germline is transmitted in a dominant mode, although tumorigenesis results from a biallelic loss of the tumor-suppressor gene in an affected tissue. The classic example to illustrate this phenomenon is retinoblastoma, which can occur as a sporadic or hereditary tumor. In sporadic retinoblastoma, both copies of the retinoblastoma (RB) gene are inactivated through two somatic events.
In hereditary retinoblastoma, one mutated or deleted RB allele is inherited in an autosomal dominant manner and the second allele is inactivated by a subsequent somatic mutation. This two-hit model applies to other inherited cancer syndromes such as MEN 1 (Chap. 400) and neurofibromatosis types 1 and 2 (Chap. 95). In contrast, in the autosomal dominant MEN 2 syndrome, the predisposition for tumor formation in various organs is caused by a gain-of-function mutation in a single allele of the RET gene (Chap. 400).

NUCLEOTIDE REPEAT EXPANSION DISORDERS Several diseases are associated with an increase in the number of nucleotide repeats above a certain threshold (Table 479-5). The repeats are sometimes located within the coding region of the genes, as in Huntington’s disease or the X-linked form of spinal and bulbar muscular atrophy (SBMA; Kennedy’s syndrome). In other instances, the repeats probably alter gene regulatory sequences. If an expansion is present, the DNA fragment is unstable and tends to expand further during cell division. The length of the nucleotide repeat often correlates with the severity of the disease. When repeat length increases from one generation to the next, disease manifestations may worsen or be observed at an earlier age; this phenomenon is referred to as anticipation. In Huntington’s disease, for example, there is a correlation between age of onset and length of the triplet codon expansion (Chap. 435). Anticipation has also been documented in other diseases caused by dynamic mutations in trinucleotide repeats (Table 479-5). The repeat number may also vary in a tissue-specific manner. In myotonic dystrophy, the CTG repeat may be 10-fold greater in muscle tissue than in lymphocytes (Chap. 460).

TABLE 479-5 Selected Trinucleotide Repeat Disorders

| DISEASE | LOCUS | REPEAT | TRIPLET LENGTH (NORMAL/DISEASE) | INHERITANCE | GENE PRODUCT |
|---|---|---|---|---|---|
| X-chromosomal spinobulbar muscular atrophy (SBMA) | Xq12 | CAG | 11–34/40–62 | XR | Androgen receptor |
| Fragile X syndrome (FRAXA) | Xq27.3 | CGG | 6–50/200–300 | XR | FMR-1 protein |
| Fragile X syndrome (FRAXE) | Xq28 | GCC | 6–25/>200 | XR | FMR-2 protein |
| Dystrophia myotonica (DM) | 19q13.32 | CTG | 5–30/200–1000 | AD, variable penetrance | Myotonin protein kinase |
| Huntington’s disease (HD) | 4p16.3 | CAG | 6–34/37–180 | AD | Huntingtin |
| Spinocerebellar ataxia type 1 (SCA1) | 6p22.3 | CAG | 6–39/40–88 | AD | Ataxin 1 |
| Spinocerebellar ataxia type 2 (SCA2) | 12q24.12 | CAG | 15–31/34–400 | AD | Ataxin 2 |
| Spinocerebellar ataxia type 3 (SCA3); Machado-Joseph disease (MD) | 14q32.12 | CAG | 13–36/55–86 | AD | Ataxin 3 |
| Spinocerebellar ataxia type 6 (SCA6, CACNA1A) | 19p13 | CAG | 4–16/20–33 | AD | Alpha 1A voltage-dependent L-type calcium channel |
| Spinocerebellar ataxia type 7 (SCA7) | 3p14.1 | CAG | 4–19/37 to >300 | AD | Ataxin 7 |
| Spinocerebellar ataxia type 12 (SCA12) | 5q32 | CAG | 6–26/66–78 | AD | Protein phosphatase 2A |
| Dentatorubral pallidoluysian atrophy (DRPLA) | 12p13.31 | CAG | 7–23/49–75 | AD | Atrophin 1 |
| Friedreich’s ataxia (FRDA1) | 9q21.11 | GAA | 7–22/200–900 | AR | Frataxin |

Abbreviations: AD, autosomal dominant; AR, autosomal recessive; XR, X-linked recessive.

Complex Genetic Disorders The expression of many common diseases such as cardiovascular disease, hypertension, diabetes, asthma, psychiatric disorders, and certain cancers is determined by a combination of genetic background, environmental factors, and lifestyle.
A trait is called polygenic if multiple genes contribute to the phenotype or multifactorial if multiple genes are assumed to interact with environmental factors. Genetic models for these complex traits need to account for genetic heterogeneity and interactions with other genes and the environment. Complex genetic traits may be influenced by modifier genes that are not linked to the main gene involved in the pathogenesis of the trait. This type of gene-gene interaction, or epistasis, where the expression of a gene is altered by the expression of one or several independently inherited genes, plays an important role in polygenic traits. In aggregate, variants in multiple genes need to be present simultaneously to result in a pathologic phenotype.

Type 2 diabetes mellitus provides a paradigmatic example of a multifactorial disorder, because genetic, nutritional, and lifestyle factors are intimately interrelated in disease pathogenesis (Table 479-6) (Chap. 415). The identification of genetic variations and environmental factors that either predispose to or protect against disease is essential for predicting disease risk, designing preventive strategies, and developing novel therapeutic approaches. The study of rare monogenic diseases may provide insight into some of the genetic and molecular mechanisms important in the pathogenesis of complex diseases. For example, the identification of the genes causing monogenic forms of permanent neonatal diabetes mellitus or maturity-onset diabetes defined them as candidate genes in the pathogenesis of diabetes mellitus type 2 (Tables 479-2 and 479-6) (Fig. 479-15). Genome scans have identified numerous genes and loci that may be associated with susceptibility to development of diabetes mellitus in certain populations (Fig. 479-16). Efforts to identify susceptibility genes require very large sample sizes, and positive results may depend on ethnicity, ascertainment criteria, and statistical analysis. Association studies analyzing the potential influence of (biologically functional) SNPs and SNP haplotypes on a particular phenotype have revealed new insights into the genes involved in the pathogenesis of these common disorders. Large variants ([micro]deletions, duplications, and inversions) present in the human population also contribute to the pathogenesis of complex disorders, but their contributions remain poorly understood.

Linkage and Association Studies There are two primary strategies for mapping genes that cause or increase susceptibility to human disease: (1) classic linkage can be performed based on a known genetic model or, when the model is unknown, by studying pairs of affected relatives; or (2) disease genes can be mapped using allelic association studies (Table 479-7).

GENETIC LINKAGE Genetic linkage refers to the fact that genes are physically connected, or linked, to one another along the chromosomes. Two fundamental principles are essential for understanding the concept of linkage: (1) when two genes are close together on a chromosome, they are usually transmitted together, unless a recombination event separates them (Fig. 479-6); and (2) the odds of a crossover, or recombination event, between two linked genes are proportional to the distance that separates them. Thus, genes that are farther apart are more likely to undergo a recombination event than genes that are very close together.
The detection of chromosomal loci that segregate with a disease by linkage can be used to identify the gene responsible for the disease and to predict the odds of disease gene transmission in genetic counseling. Polymorphic variants are essential for linkage studies because they provide a means to distinguish the maternal and paternal chromosomes in an individual. On average, 1 out of every 1000 bp varies from one person to the next. Although this degree of variation seems low (99.9% identical), it means that >3 million sequence differences exist between any two unrelated individuals and the probability that the sequence at such loci will differ on the two homologous chromosomes is high (often >70–90%). These sequence variations include variable number of tandem repeats (VNTRs), short tandem repeats (STRs), and SNPs. Most STRs, also called polymorphic microsatellite markers, consist of di-, tri-, or tetranucleotide repeats that can be characterized readily using the polymerase chain reaction (PCR). Characterization of SNPs, using DNA chips or beads, permits comprehensive analyses of genetic variation, linkage, and association studies. Although these sequence variations often have no apparent functional consequences, they provide much of the basis for variation in genetic traits.
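Cosegregation of marker alleles with a disease is usually summarized as a lod (logarithm of odds) score, for which a value of +3 corresponds to odds of 1000:1 in favor of linkage. The following is a minimal sketch of that calculation in Python, not a substitute for dedicated linkage software. It assumes the simplest textbook case of fully informative, phase-known meioses, in which each meiosis can be scored as recombinant or nonrecombinant; the function name `lod_score` and its parameters are hypothetical and chosen only for illustration.

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """Lod score for phase-known meioses (simplifying assumption).

    Compares the likelihood of the observed recombinant count at a
    recombination fraction theta (linked) with the likelihood at
    theta = 0.5 (unlinked), and returns the base-10 logarithm of the ratio.
    """
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")

    nonrecombinants = meioses - recombinants
    # Likelihood of the data if the loci are linked at recombination fraction theta.
    linked = (theta ** recombinants) * ((1.0 - theta) ** nonrecombinants)
    # Likelihood of the same data if the loci assort independently.
    unlinked = 0.5 ** meioses

    if linked == 0.0:
        # The data are impossible at this theta (e.g., a recombinant at theta = 0).
        return float("-inf")
    return math.log10(linked / unlinked)


if __name__ == "__main__":
    # Ten informative meioses without a single recombinant, evaluated at theta = 0:
    # the odds are 2**10 = 1024:1, i.e., a lod score of about 3.01.
    print(round(lod_score(0, 10, 0.0), 2))
```

Under these assumptions, ten informative meioses without recombination are just enough to reach the conventional threshold of +3, which illustrates why large informative pedigrees are needed for linkage studies.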

TABLE 479-6 Examples of Genes and Loci Involved in Mono- and Polygenic Forms of Diabetes

| DISORDER | GENES OR SUSCEPTIBILITY LOCUS | CHROMOSOMAL LOCATION | OTHER FACTORS |
|---|---|---|---|
| Monogenic permanent neonatal diabetes mellitus | KCNJ11 (inwardly rectifying potassium channel Kir6.2) | 11p15.1 | AD |
| | GCK (glucokinase) | 7p13 | AR |
| | INS (insulin) | 11p15.5 | AR, hyperproinsulinemia |
| | ABCC8 (ATP-binding cassette, subfamily c, member 8; sulfonylurea receptor) | 11p15.1 | AD or AR |
| | GLIS3 (GLIS family zinc finger protein 3) | 9p24.2 | AR, diabetes, congenital hypothyroidism |
| Maturity-onset diabetes of the young (MODY): monogenic forms of diabetes mellitus | | | AD inheritance |
| MODY 1 | HNF4α (hepatocyte nuclear factor 4α) | 20q13.12 | |
| MODY 2 | GCK (glucokinase) | 7p13 | |
| MODY 3 | HNF1α (hepatocyte nuclear factor 1α) | 12q24.31 | |
| MODY 4 | IPF1 (insulin promoter factor 1) | 13q12.2 | |
| MODY 5 (renal cysts, diabetes) | HNF1β (hepatocyte nuclear factor 1β) | 17q12 | |
| MODY 6 | NeuroD1 (neurogenic differentiation factor 1) | 2q31.3 | |
| MODY 7 | KLF1 (Kruppel-like factor 1) | 19p13.13 | |
| MODY 8 | CEL (carboxyl ester lipase) | 9q34.13 | |
| MODY 9 | PAX4 (paired box transcription factor 4) | 7q32.1 | |
| MODY 10 | INS (insulin) | 11p15.5 | |
| MODY 11 | BLK (B-lymphocyte-specific tyrosine kinase) | 8p23.1 | |
| MODY 12 | ABCC8 (ATP-binding cassette, subfamily c, member 8; sulfonylurea receptor) | 11p15.1 | |
| MODY 13 | KCNJ11 (inwardly rectifying potassium channel Kir6.2) | 11p15.1 | |
| Diabetes mellitus type 2; loci and genes linked and/or associated with susceptibility for diabetes mellitus type 2 | Genes and loci identified by linkage/association studies: PPARG, KCNJ11/ABCC8, TCF7L2, HNF1B, WFS1, SLC30A8, FTO, HHEX, IGF2BP2, CDKN2A/B, CDKAL1, TSPAN8, ADAMTS9, CDC123/CAMK1D, JAZF1, NOTCH2, THADA, KCNQ1, DUSP8, MTNR1B, IRS1, SPRY2, SRR, ZFAND6, GCK, KLF14, TP53INP1, PROX1, PRC1, BCL11A, ZBED3, RBMS1, HNF1A, DGKB/TMEM195, CCND2, C2CD4A/C2CD4B, PTPRD, ARAP1/CENTD2, HMGA2, TLE4/CHCHD9, ADCY5, UBE2E2, DUSP9, GCKR, COBLL1/GRB14, HMG20A, VPS26A, ST6GAL1, AP3S2, HNF4A, BCL2, LAMA1, GIPR, MC4R, TLE1, KCNK16, ANK1, KLHDC5, ZMIZ1, PSMD6, FITM2/R3HDML/HNF4A, CILP2, ANKRD55, GLIS3, PEPD, GCC1/PAX4, ZFAND3, MAEA, BCAR1, RBM43/RND3, MACF1, RASGRP1, GRK5, TMEM163, SGCG, LPP, FAF1, TMEM154, MPHOSPH9, ARL15, POU5F1/TCF19, SSR1/RREB1, HLA-B, INS-IGF2, GPSM1, LEP, SLC16A13, PAM/PPIP5K2, SLC16A11, CCDC63, C12orf51, CCND2, HNF1A, TBC1D4, CCDC85A, INAFM2, ASB3, FAM60A, ATP8B2, MIR4686, MTMR3, DMRTA1, SLC35D3, GLP2R, GIP, MAP3K11, PLEKHA1, HSD17B12, NRXN3, CMIP, ZZEF1, MNX1, ABO, ACSL1, HLA-DQA1 | | Heavily influenced by diet, energy expenditure, obesity |

Abbreviations: AD, autosomal dominant; AR, autosomal recessive; MODY, maturity-onset diabetes of the young.

In order to identify a chromosomal locus that segregates with a disease, it is necessary to characterize polymorphic DNA markers from affected and unaffected individuals of one or several pedigrees. One can then assess whether certain marker alleles cosegregate with the disease. Markers that are closest to the disease gene are less likely to undergo recombination events and therefore receive a higher linkage score. Linkage is expressed as a lod (logarithm of odds) score, the base-10 logarithm of the ratio of the probability that the disease and marker loci are linked to the probability that they are unlinked. Lod scores of +3 (1000:1) are generally accepted as supporting linkage, whereas a score of –2 is consistent with the absence of linkage.

ALLELIC ASSOCIATION, LINKAGE DISEQUILIBRIUM, AND HAPLOTYPES Allelic association refers to a situation in which the frequency of an allele is significantly increased or decreased in individuals affected by a particular disease in comparison to controls. Linkage and association differ in several aspects. Genetic linkage is demonstrable in families or sibships. Association studies, on the other hand, compare a population of affected individuals with a control population. Association studies can be performed as case-control studies that include unrelated affected individuals and matched controls or as family-based studies that compare the frequencies of alleles transmitted or not transmitted to affected children. Allelic association studies are particularly useful for identifying susceptibility genes in complex diseases. When alleles at two loci occur more frequently in combination than would be predicted (based on known allele frequencies and recombination fractions), they are said to be in linkage disequilibrium. Evidence for linkage disequilibrium can be helpful in mapping disease genes because it suggests that the two loci are tightly linked.

Detecting the genetic factors contributing to the pathogenesis of common complex disorders is challenging. In many instances, these are low-penetrance alleles (i.e., variations that individually have a subtle effect on disease development), and they can only be identified by unbiased GWAS (Catalog of Published Genome-Wide Association Studies; Table 479-1) (Fig. 479-16). Most variants occur in noncoding or regulatory sequences and do not alter protein structure. The analysis of complex disorders is further complicated by ethnic differences in disease prevalence, differences in allele frequencies in known susceptibility genes among different populations, locus and allelic heterogeneity, gene-gene and gene-environment interactions, and the possibility of phenocopies. Catalogues of human variation and genotype data (HapMap, International Genome Sample Resource) have greatly facilitated GWAS for the characterization of complex disorders. Adjacent SNPs are inherited together as blocks, and these blocks can be identified by genotyping selected marker SNPs, so-called Tag SNPs, thereby reducing cost and workload (Fig. 479-4). The availability of this information permits the characterization of a limited number of SNPs to identify the set of haplotypes present in an individual (e.g., in cases and controls). This, in turn, permits performing GWAS by searching for associations of certain haplotypes with a disease phenotype of interest, an essential step for unraveling the genetic factors contributing to complex disorders.

[Figure 479-15, image not reproduced. Effect size (low, modest, intermediate, high; tick marks 1.1, 1.5, 3.0) is plotted against allele frequency (very rare, rare, low frequency, common; tick marks 0.001, 0.005, 0.05). Labeled regions: rare alleles causing Mendelian disease; low-frequency variants with intermediate effect; common variants with low effect on complex disease (typical); common variants with high effect on complex disease (rare); rare variants with small effect (difficult to identify).]

FIGURE 479-15 Relationship between allele frequency and effect size in monogenic and polygenic disorders. In classic Mendelian disorders, the allele frequency is typically low but has a high impact (single-gene disorder). This contrasts with polygenic disorders that require the combination of multiple low-impact alleles that are frequently quite common in the general population.

POPULATION GENETICS In population genetics, the focus changes from alterations in an individual’s genome to the distribution pattern of different genotypes in the population. In a case where there are only two alleles, A and a, with allele frequencies p and q (p + q = 1), the frequencies of the genotypes are given by p² + 2pq + q² = 1, with p² corresponding to the frequency of AA, 2pq to the frequency of Aa, and q² to the frequency of aa. When the frequency of an allele is known, the frequency of the genotype can be calculated. Alternatively, one can determine an allele frequency if the genotype frequency has been determined. Allele frequencies vary among ethnic groups and geographic regions.
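The two-allele relationship p² + 2pq + q² = 1 can be worked through numerically. The sketch below is only illustrative: it assumes a population in Hardy-Weinberg equilibrium (random mating, no selection), the function names are hypothetical, and the incidence of 1 in 2500 used in the example is an assumed round number for an autosomal recessive disorder, not a value taken from this chapter.

```python
import math


def genotype_frequencies(p: float) -> dict:
    """Genotype frequencies for alleles A (frequency p) and a (frequency q = 1 - p)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    q = 1.0 - p
    return {"AA": p * p, "Aa": 2.0 * p * q, "aa": q * q}


def carrier_frequency_from_incidence(incidence: float) -> float:
    """Heterozygote (carrier) frequency 2pq for a recessive disorder.

    The incidence of affected homozygotes is taken as q**2, so q is its
    square root; this holds only under Hardy-Weinberg assumptions.
    """
    if not 0.0 <= incidence <= 1.0:
        raise ValueError("incidence must lie between 0 and 1")
    q = math.sqrt(incidence)
    p = 1.0 - q
    return 2.0 * p * q


if __name__ == "__main__":
    # Allele frequency known: derive the genotype frequencies.
    print(genotype_frequencies(0.7))

    # Genotype frequency known: an assumed incidence of 1 in 2500 gives
    # q = 0.02 and a carrier frequency of about 0.039 (roughly 1 in 25).
    print(round(carrier_frequency_from_incidence(1 / 2500), 4))
```

The second calculation shows why carriers of a recessive allele are far more numerous than affected individuals: with q = 0.02, heterozygotes outnumber affected homozygotes by about 100 to 1.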
For example, heterozygous mutations in the CFTR gene are relatively common in populations of European origin but are rare in the African population. Allele frequencies may vary because certain allelic variants confer a selective advantage. For example, heterozygotes for the sickle cell mutation, which is particularly common in West Africa, are more resistant to malaria infection because the erythrocytes of heterozygotes provide a less favorable environment for Plasmodium parasites. Although homozygosity for the sickle cell mutation is associated with severe anemia and sickle crises, heterozygotes have a higher probability of survival because of the reduced morbidity and mortality from malaria; this phenomenon has led to an increased frequency of the mutant allele. Recessive conditions are more prevalent in geographically isolated populations because of the more restricted gene pool.

APPROACH TO THE PATIENT Inherited Disorders For the practicing clinician, the family history remains an essential step in recognizing the possibility of a hereditary predisposition to disease. When taking the history, it is useful to draw a detailed pedigree of the first-degree relatives (e.g., parents, siblings, and children), because they share 50% of genes with the patient. Standard symbols for pedigrees are depicted in Fig. 479-11. The family history should include information about ethnic background, age, health status, and deaths, including infants. Next, the physician should explore whether there is a family history of the same illness as, or illnesses related to, the current problem. An inquiry focused on commonly occurring disorders such as cancers, heart disease, and diabetes mellitus should follow. Because of the possibility of age-dependent expressivity and penetrance, the family history will need intermittent updating. If the findings suggest a genetic disorder, the clinician should assess whether some of the patient’s relatives may be at risk of carrying or transmitting the disease. In this circumstance, it is useful to confirm and extend the pedigree based on input from several family members. Emerging artificial intelligence tools analyzing facial features can aid the clinician in diagnosing patients with genetic conditions. In aggregate, this information may form the basis for genetic counseling, carrier detection, early intervention, and disease prevention in relatives of the index patient (Chap. 480).

In instances where a diagnosis at the molecular level may be relevant, it is important to identify a laboratory that can perform the appropriate test. Genetic testing is available for a large number of monogenic disorders through commercial laboratories. For uncommon disorders, the test may only be performed in a specialized research laboratory.
Approved laboratories offering testing for inherited disorders can be identified in continuously updated online resources (e.g., Genetic Testing Registry; Table 479-1).

If genetic testing is considered, the patient and the family should be counseled about the potential implications of positive results, including psychological distress and the possibility of discrimination. The patient or caretakers should be informed about the meaning of a negative result, technical limitations, and the possibility of false-negative and inconclusive results. For these reasons, genetic testing should only be performed after obtaining informed consent. Published ethical guidelines address the specific aspects that should be considered when testing children and adolescents.

[Figure 479-16, image not reproduced. Legend entries: ancestry (African and African-American, East Asian, European, Hispanic/Native American, South Asian); study type (linkage or candidate gene, GWAS or Metabochip, exome array, genome or exome sequencing); initial sample size and replication sample size. Axes: year; total sample size (1000s). Labeled significant loci include PPARG, KCNJ11, TCF7L2, SLC30A8, MC4R, and SLC16A11.]

FIGURE 479-16 Genome-wide association studies (GWAS) across ancestries and discovery of loci over time. The pie charts represent type 2 diabetes GWAS, as well as candidate gene or sequencing studies. The x axis shows the year of publication, and the y axis shows discovery sample size. The inner circles are scaled in proportion to discovery sample size, and the outer circles are scaled in proportion to total (discovery + replication) sample size. Significant loci are defined by a p value threshold of 5 × 10⁻⁸. At the end of 2022, 534 distinct type 2 diabetes intervals (520 autosomal, 14 X chromosomal) were defined. (Reproduced with permission from BF Voight.)

IDENTIFYING THE DISEASE-CAUSING GENE Precision medicine aims to enhance the quality of medical care through the use of genotypic analysis (DNA testing) to identify genetic predisposition to disease, to select more specific pharmacotherapy, and to design individualized medical care based on genotype. Genotype can be deduced by analysis of protein (e.g., hemoglobin, apoprotein E), mRNA, or DNA. Many (pathogenic) variants can be readily identified by DNA analyses; technical advances in RNA sequencing now add increasing depth to genetic and genomic investigations (e.g., for the detection of gene fusions or aberrant gene expression patterns).

DNA testing is performed by mutational analysis or linkage studies in individuals at risk for a genetic disorder known to be present in a family. Mass screening programs require tests of high sensitivity and specificity to be cost-effective. The benefits and risks of screening newborns with genomic sequencing, and the potential impact on surveillance, preventative health care, and personalized treatment options are topics of current research (BabySeq Project). Prerequisites for the success of genetic screening programs include the following: that the disorder is potentially serious; that it can be influenced at a presymptomatic stage by changes in behavior, diet, and/or pharmaceutical manipulations; and that the screening does not result in any harm or discrimination.
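The requirement that mass screening tests have high sensitivity and specificity can be made concrete with Bayes' rule: when a disorder is rare, even a small false-positive rate produces more false positives than true positives. The following Python sketch illustrates this point. The function name and the example values (sensitivity 99%, specificity 99.9%, prevalence 1 in 3000) are hypothetical and chosen only for illustration; they do not describe any specific test or disorder.

```python
def positive_predictive_value(sensitivity: float,
                              specificity: float,
                              prevalence: float) -> float:
    """Probability that a person with a positive screening result is truly affected."""
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1")

    # Fraction of the screened population that is affected and tests positive.
    true_positives = sensitivity * prevalence
    # Fraction of the screened population that is unaffected but tests positive.
    false_positives = (1.0 - specificity) * (1.0 - prevalence)

    total_positives = true_positives + false_positives
    if total_positives == 0.0:
        raise ValueError("no positive results are possible with these inputs")
    return true_positives / total_positives


if __name__ == "__main__":
    # Hypothetical screening test applied to a rare disorder.
    ppv = positive_predictive_value(sensitivity=0.99,
                                    specificity=0.999,
                                    prevalence=1 / 3000)
    print(f"Positive predictive value: {ppv:.2f}")  # about 0.25
```

With these assumed values, only about one in four positive screening results reflects a true case, which is why positive screens generally require confirmatory testing and careful counseling.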
Screening in Jewish populations for the autosomal recessive neurodegenerative storage disease Tay-Sachs disease has reduced the number of affected individuals. In contrast, screening for sickle cell trait/disease in African Americans has led to unanticipated problems of discrimination by health insurers and employers. Mass screening programs harbor additional potential problems. For example, screening for the most common genetic alteration in cystic fibrosis, the ΔF508 mutation with a frequency of ~70% in northern Europe, is feasible and seems to be effective. One has to keep in mind, however, that there is pronounced allelic heterogeneity and that the CFTR gene can be affected by >2000 other mutations. While the search for less common mutations has been challenging in the past, next-generation genome sequencing now permits comprehensive and cost-effective mutational analyses. However, the bioinformatic analysis and the classification of the detected variants as pathogenic or benign alterations are still challenging. Occupational screening programs aim to detect individuals with increased risk for certain professional activities (e.g., α1 antitrypsin deficiency and smoke or dust exposure). Integrating genomic data into electronic medical records is evolving and can provide significant decision support at the point of care, for example, by providing the clinician with genomic data and decision algorithms for the prescription of drugs that are subject to pharmacogenetic influences.

Mutational Analyses DNA sequence analysis is widely used as a diagnostic tool and has significantly enhanced diagnostic accuracy. It is used for determining carrier status and for prenatal testing in monogenic disorders. Numerous techniques, discussed in previous versions of this chapter, are available for the detection of mutations. Analyses of large alterations in the genome are possible using classic methods such as karyotype analysis, cytogenetics, fluorescence in situ hybridization (FISH), and array- or bead-based techniques that search for multiple single exon deletions or duplications.

TABLE 479-7 Genetic Approaches for Identifying Disease Genes

| METHOD | INDICATIONS AND ADVANTAGES | LIMITATIONS |
|---|---|---|
| Linkage Studies | | |
| Classical linkage analysis (parametric methods) | Analysis of monogenic traits; suitable for genome scan; control population not required; useful for multifactorial disorders in isolated populations | Difficult to collect large informative pedigrees; difficult to obtain sufficient statistical power for complex traits |
| Allele-sharing methods (nonparametric methods): affected sib and relative pair analyses; sib pair analysis | Suitable for identification of susceptibility genes in polygenic and multifactorial disorders; suitable for genome scan; control population not required if allele frequencies are known; statistical power can be increased by including parents and relatives | Difficult to collect sufficient number of subjects; difficult to obtain sufficient statistical power for complex traits; reduced power compared to classical linkage, but not sensitive to specification of genetic mode |
| Association Studies | | |
| Case-control studies; linkage disequilibrium; transmission disequilibrium test (TDT); whole-genome association studies | Suitable for identification of susceptibility genes in polygenic and multifactorial disorders; suitable for testing specific allelic variants of known candidate loci; facilitated by comprehensive catalogs of genotypes and variants; does not necessarily need relatives | Requires large sample size and matched control population; false-positive results in the absence of suitable control population; candidate gene approach does not permit detection of novel genes and pathways; susceptibility genes can vary among different populations |
| Next-Generation Sequencing Technologies | | |
| Whole exome or genome sequencing | Unbiased approach; analysis can be performed without reference sequences from parents or siblings | Requires appropriate bioinformatics; may have low sensitivity if CNV analysis is not included; detects numerous VUS; can lead to the detection of unrelated deleterious alleles |
| Targeted sequencing of gene panels | Captures multiple candidate genes and loci with hybridization techniques followed by deep sequencing; permits analyses of multiple candidate genes in parallel; facilitates molecular characterization of disorders with locus heterogeneity | |

Abbreviations: CNV, copy number variation; VUS, variants of unknown significance.

The analysis of more discrete sequence alterations often relies on the use of PCR, which allows rapid gene amplification and analysis. Moreover, PCR makes it possible to perform genetic testing and mutational analysis with small amounts of DNA extracted from leukocytes or even from single cells, buccal cells, or hair roots. DNA sequencing can be performed directly on PCR products. The advent of comprehensive sequencing technologies analyzing the whole exome or genome, selected chromosomes, or numerous candidate genes in a single run with NGS platforms is now fundamentally transforming the characterization of patients with rare disorders and advanced malignancies. These techniques have the advantage of an unbiased comprehensive approach, and they are increasingly cost-effective. Analysis of cell-free DNA (cfDNA; also referred to as “liquid biopsy”) present in body fluids is playing a growing role in minimally invasive diagnostics and disease monitoring. Genomic tests are also widely used for the detection of pathogens and for the identification of viral or bacterial sequence variations.

The integration of genomic tests into clinical medicine is associated with a number of ongoing challenges related to variable sensitivities of the tests, bioinformatics analyses, storage and sharing of data, and the difficulty of interpreting all genetic variants identified with comprehensive testing. The discovery of incidental (or secondary) findings that are unrelated to the indication for the sequencing analysis, but that indicate other disorders of potential relevance for patient care, can pose a difficult ethical dilemma. It can lead to the detection of undiagnosed medically actionable genetic conditions but can also reveal deleterious mutations that cannot be influenced; in addition, numerous sequence variants are of unknown significance.

A general algorithm for the approach to mutational analysis in patients with a suspected genetic disorder and (advanced) malignancies is outlined in Fig. 479-17. The importance of a detailed characterization of the clinical phenotype cannot be overemphasized. This is the step where one should also consider the possibility of genetic heterogeneity and phenocopies. If obvious candidate genes are suggested by the phenotype, they can be analyzed directly.
After identification of a mutation, it is essential to demonstrate that it segregates with the phenotype. The functional characterization of novel mutations remains labor intensive and may require analyses in vitro or in transgenic models in order to document the relevance of the genetic alteration.

Prenatal diagnosis of numerous genetic diseases in instances with a high risk for certain disorders is possible by direct DNA analysis. Amniocentesis involves the removal of a small amount of amniotic fluid, usually at 16 weeks of gestation. Cells can be collected and submitted for karyotype analyses, FISH, and mutational analysis of selected genes (Table 479-4). The main indications for amniocentesis include advanced maternal age (>35 years), presence of an abnormality of the fetus on ultrasound examination, an abnormal serum “quad” test (α fetoprotein, β human chorionic gonadotropin, inhibin-A, and unconjugated estriol), a family history of chromosomal abnormalities, or a Mendelian disorder amenable to genetic testing. Prenatal diagnosis can also be performed by chorionic villus sampling (CVS), in which a small amount of the chorion is removed by a transcervical or transabdominal biopsy. Chromosomes and DNA obtained from these cells can be submitted for cytogenetic and mutational analyses. CVS can be performed earlier in gestation (weeks 9–12) than amniocentesis, an aspect that may be of relevance when termination of pregnancy is a consideration. Later in pregnancy, beginning at ~18 weeks of gestation, percutaneous umbilical blood sampling (PUBS; cordocentesis) permits collection of fetal blood for analysis. Prenatal cfDNA testing allows analysis of maternal and fetal DNA from a maternal blood sample to screen for certain chromosomal abnormalities and fetal sex. 
These approaches enable screening for clinically relevant and deleterious alleles inherited from the parents, as well as for de novo germline mutations, and they have the potential to identify genetic disorders in the prenatal setting. In combination with in vitro fertilization (IVF) techniques, it is possible to perform genetic diagnoses in a single cell removed from the four- to eight-cell embryo or to analyze the first polar body from an oocyte. Preconceptual diagnosis thereby avoids therapeutic abortions but is costly and labor intensive. It should be emphasized that excluding a specific disorder by any of these approaches is never equivalent to the assurance of having a normal child.

FIGURE 479-17 Approach to genetic disease. (Flowchart steps: characterization of phenotype; familial or sporadic genetic disorder; pedigree analysis; gene unknown versus gene known or candidate genes; targeted sequencing; deep sequencing of DNA; deep sequencing of RNA (RNAseq); linkage analysis and sequencing of linked region; mutational analysis; determination of functional properties of identified mutations in vitro and in vivo; genetic counseling; testing of other family members; therapy integrating genetic and genomic information; treatment based on pathophysiology.)

Postnatal indications for cytogenetic analyses in infants or children include multiple congenital anomalies, suspicion of a known cytogenetic syndrome, developmental delay, dysmorphic features, autism, short stature, and disorders of sexual development, among others (Table 479-4).

Mutations in certain cancer susceptibility genes such as BRCA1 and BRCA2 may identify individuals with an increased risk for the development of malignancies and result in risk-reducing interventions. The detection of cytogenetic alterations and mutations is an important diagnostic and prognostic tool in leukemias, and it has also transformed the management of solid tumors. In addition to providing diagnostic information, mutational analysis can inform the choice of targeted therapies (“actionable mutations”), characterize the mutational load, identify gene signatures associated with effective immunotherapies, and be used for surveillance.

The demonstration of the presence or absence of mutations and polymorphisms is also relevant for the field of pharmacogenomics, including the identification of differences in drug treatment response or metabolism as a function of genetic background.

Gene therapy through the introduction of a normal gene or the ability to make site-specific modifications to the human genome has, so far, had limited clinical application. 
However, several gene transfer methods have now been approved for clinical use, for example, for the treatment of Leber congenital amaurosis, B-cell acute lymphoblastic leukemia, spinal muscular atrophy, and hereditary transthyretin-mediated amyloidosis. Genome editing (or gene editing) with CRISPR-Cas9 is a promising novel approach for the treatment of various diseases, for example cystic fibrosis, certain cancers, hemophilia, and sickle cell disease. The first therapies using this technology for the treatment of sickle cell disease were approved by the U.S. Food and Drug Administration in 2023 (Chap. 483).

ETHICAL ISSUES

Determination of the association of genetic defects with disease, comprehensive data of an individual’s genome, and studies of genetic variation raise many ethical and legal issues. Genetic information is generally regarded as sensitive information that should not be readily accessible without explicit consent (genetic privacy). The disclosure of genetic information may risk possible discrimination by insurers or employers. The scientific components of the Human Genome Project have been paralleled by efforts to examine ethical, social, and legal implications. An important milestone emerging from these endeavors is the Genetic Information Nondiscrimination Act (GINA), signed into law in 2008, which aims to protect asymptomatic individuals against the misuse of genetic information for health insurance and employment. It does not, however, protect the symptomatic individual. Provisions of the U.S. Patient Protection and Affordable Care Act, effective in 2014, have, in part, filled this gap and prohibit exclusion from, or termination of, health insurance based on personal health status. Potential threats to the maintenance of genetic privacy include the increasing integration of genomic data into electronic medical records, compelled disclosures of health records, and direct-to-consumer genetic testing.

It is widely accepted that identifying disease-causing genes can lead to improvements in diagnosis, treatment, and prevention. However, the information gleaned from genotypic results can have quite different impacts, depending on the availability of strategies to modify the course of disease. For example, the identification of mutations that cause MEN 2 or hemochromatosis allows specific interventions for affected family members. On the other hand, at present, the identification of an Alzheimer’s or Huntington’s disease gene does not alter therapy and outcomes. Most genetic disorders are likely to fall into an intermediate category where the opportunity for prevention or treatment is significant but limited. 
However, the progress in this area is unpredictable, as underscored by the finding that angiotensin II receptor blockers appear to slow disease progression in Marfan’s syndrome. Genetic test results can generate anxiety in affected individuals and family members. Comprehensive sequence analyses are particularly challenging because most individuals can be expected to harbor several serious recessive gene mutations. Moreover, the sensitivity of comprehensive sequence analyses is not always greater, for example, if CNV analysis is not integrated. Genetic manipulation and patient selection for gene therapy approaches have raised ethical controversy and safety concerns that remain unresolved.

03 - 481 Mitochondrial DNA and Heritable Traits and Diseases

481 Mitochondrial DNA and Heritable Traits and Diseases

limited to, variable penetrance, which may be significantly impacted by family history (thus, those individuals with germline pathogenic variants detected via population screening in the absence of a classic family history may have significantly lower risks of disease), consensus on which genes should be included, management of VUS, and practical issues of implementation including insurance coverage, counseling related to the limitations of GINA, the role of primary care providers in ordering, and concerns about the possibility of worsening socioeconomic and racial disparities that already exist in genetic testing.

THERAPEUTIC INTERVENTIONS BASED ON GENETIC RISK FOR DISEASE

Specific treatments are available for a number of genetic disorders. Strategies for the development of therapeutic interventions have a long history in childhood metabolic diseases; however, these principles have been applied in the diagnosis and management of adult-onset diseases as well (Table 480-2). Hereditary hemochromatosis is usually caused by pathogenic variants in HFE (although other genes have been less commonly associated) and manifests as a syndrome of iron overload, which can lead to liver disease, skin pigmentation, diabetes mellitus, arthropathy, impotence in males, and cardiac issues (Chap. 426). When identified early, the disorder can be managed effectively with therapeutic phlebotomy. Therefore, when the diagnosis of hemochromatosis has been made in a proband, it is important to counsel other family members in order to minimize the impact of the disorder.

Preventative measures and therapeutic interventions are not restricted to metabolic disorders. Identification of familial forms of long QT syndrome, associated with ventricular arrhythmias, allows early electrocardiographic testing and the use of prophylactic antiarrhythmic therapy, overdrive pacemakers, or defibrillators. Individuals with familial hypertrophic cardiomyopathy can be screened by ultrasound, treated with beta blockers or other drugs, and counseled about the importance of avoiding strenuous exercise and dehydration. Those with Marfan’s syndrome can be treated with beta blockers or angiotensin II receptor blockers and monitored for the development of aortic aneurysms. The identification of germline abnormalities that increase the risk of specific types of cancer is rapidly changing clinical management. 
Identifying family members with pathogenic variants that predispose to FAP or Lynch syndrome leads to recommendations of early cancer screening and prophylactic surgery, as well as consideration of chemoprevention and attention to healthy lifestyle habits. Similar principles apply to familial forms of melanoma as well as cancers of the breast, ovary, and thyroid.

There has been a significant growth in the number of molecularly directed therapies for genetic diseases, including transthyretin stabilizers for TTR-associated cardiac amyloid; poly (ADP-ribose) polymerase (PARP) inhibitors for treatment of BRCA1/2 and PALB2-associated breast, ovarian, prostate, and pancreatic cancer; and medications for Duchenne’s muscular dystrophy that can either promote exon skipping or allow bypass of nonsense mutations. Gene therapy, either by gene replacement, such as in spinal muscular atrophy, or by increasing production of fetal hemoglobin or hemoglobin A, as in sickle cell disease, is an exciting area with tremendous opportunities (Chap. 483).

The field of pharmacogenetics identifies genes that alter drug metabolism or confer susceptibility to toxic drug reactions. Pharmacogenetics seeks to individualize drug therapy in an attempt to improve treatment outcomes and reduce toxicity. Examples include thiopurine methyltransferase (TPMT) deficiency, dihydropyrimidine dehydrogenase deficiency, malignant hyperthermia, and glucose-6-phosphate dehydrogenase deficiency. Despite successes in this area, it is not always clear how to incorporate pharmacogenetics into clinical care. For example, although there is an association between CYP2C9 and VKORC1 genotypes and warfarin dosing, there is no evidence that incorporating genotyping into clinical practice improves patient outcomes compared with clinical algorithms. 
Although the role of genetic testing in the clinical setting continues to evolve, such testing holds the promise of allowing early and more targeted interventions that can reduce morbidity and mortality. Rapid technologic advances are changing the ways in which genetic testing is performed. As genetic testing has become less expensive and technically easier to perform, there has been significant expansion of its use. This has created both challenges and opportunities. It is critical that physicians and other health care professionals keep current with advances in genetic medicine in order to facilitate appropriate referral for genetic counseling and judicious use of genetic testing, as well as to provide state-of-the-art, evidence-based care for affected or at-risk patients and their relatives.

Karl L. Skorecki, Bruce H. Cohen

Mitochondrial DNA and Heritable Traits and Diseases

Mitochondria are cytoplasmic organelles whose major function is to generate ATP by the process of oxidative phosphorylation under aerobic conditions. This process is mediated by the respiratory electron transport chain (ETC) multiprotein enzyme complexes I–V and the two electron carriers, coenzyme Q10 (CoQ10) and cytochrome c, located in the inner mitochondrial membrane. Other cellular processes to which mitochondria make a major contribution include apoptosis (programmed cell death) and additional cell type–specific functions (Table 481-1). The efficiency of the mitochondrial ETC in ATP production is the major determinant of overall body energy balance and thermogenesis. In addition, mitochondria are the predominant source of reactive oxygen species (ROS), whose rate of production relates to the delicately balanced coupling of ATP production to oxygen consumption in health and disease. Given the centrality of oxidative phosphorylation to the normal activities of almost all cells, it is not surprising that mitochondrial dysfunction can affect almost any organ system (Fig. 481-1). Until recently, it was thought that disruption of energy production was the source of the pathophysiology in those with mitochondrial dysfunction, but recent evidence suggests that free radical production and the redox state of the mitochondria may play a role as well. Thus, physicians in many disciplines might encounter patients with mitochondrial diseases and should be aware of their existence and characteristics.

TABLE 481-1 Functions of Mitochondria
All Cells and Tissues: Oxidative phosphorylation; Free radical production; Calcium homeostasis; Apoptosis (programmed cell death)
Tissue- or Cell-Specific: Cholesterol metabolism; Amino and organic acid metabolism; Fatty acid beta oxidation; Sex steroid synthesis; Heme synthesis; Hepatic ammonia detoxification; Neurotransmitter metabolism

FIGURE 481-1 Dual genetic control and multiple organ system manifestations of mitochondrial disease. Labels include: heart (conduction disorder, Wolff-Parkinson-White syndrome, cardiomyopathy); skeletal muscle (weakness, fatigue, myopathy, neuropathy); brain (seizures, myoclonus, ataxia, stroke, dementia, migraine); inner ear (sensorineural hearing loss); colon (pseudo-obstruction); oxidative phosphorylation subunits encoded by nuclear DNA and mitochondrial DNA. (From DR Johns: Mitochondrial DNA and disease. N Engl J Med 333:638, 1995. Copyright © 1995, Massachusetts Medical Society. Reprinted with permission from Massachusetts Medical Society.)

The integrated activity of an estimated 1500 gene products is required for normal mitochondrial biogenesis, function, maintenance, and integrity. Aside from the 37 genes that comprise the mitochondrial DNA (mtDNA) molecule, the remaining 1400+ gene products are encoded by nuclear genes (referred to as nDNA) and thus follow the rules and patterns of nuclear genomic inheritance (Chap. 479). These nuclear-encoded proteins are synthesized in the cell cytoplasm and imported to their location of activity within the mitochondria through a complex biochemical process. This process includes unfolding of the nuclear-encoded protein, attachment to a chaperone protein that shuttles it through a specific channel to a specific mitochondrial location, and detachment from the chaperone followed by assembly with other mtDNA- and nDNA-encoded proteins. In addition, the mitochondria contain their own small genome consisting of numerous copies (polyploidy) per mitochondrion of a circular, double-strand mtDNA molecule comprising 16,569 nucleotides. This mtDNA sequence (also known as the “mitogenome”) might represent the remnants of endosymbiotic prokaryotes from which mitochondria are thought to have originated. The mtDNA sequence contains a total of 37 genes, of which 13 encode mitochondrial protein components of the ETC (Fig. 481-2). The remaining 22 tRNA- and 2 rRNA-encoding genes are mitochondria-specific and dedicated to the process of translating the 13 mtDNA-encoded proteins. The mtDNA itself replicates constantly, independent of cell division, and requires its own unique polymerase, referred to as polymerase gamma (polγ), which is encoded by the nuclear gene POLG, disorders of which are discussed in Chaps. 460 and 480. Mutations in POLG can disrupt the exonuclease (proofreading) function of polγ, resulting in somatic mutations in the mtDNA that endure with future replication. Unless such a mutation occurs and is propagated in the oocyte, it is not heritable. Mutations in POLG can also affect the polymerase function of polγ, which results in decreased replication of the mtDNA and can lead to mtDNA depletion. 
This dual nuclear and mitochondrial genetic control of mitochondrial function results in unique and diagnostically challenging patterns of inheritance. The current chapter focuses on heritable traits and diseases related to the mtDNA component of the dual genetic control of mitochondrial function. The reader is referred to Chaps. 460 and 479 for consideration of mitochondrial disease originating from mutations in the nuclear genome. These nuclear-origin disorders include (1) disorders due to mutations in nuclear genes directly encoding structural components or assembly factors of the oxidative phosphorylation complexes, (2) disorders due to mutations in nuclear genes encoding proteins indirectly related to oxidative phosphorylation, (3) mtDNA depletion syndromes (MDSs) characterized by a reduction of mtDNA copy number in affected tissues without mutations or rearrangements in the mtDNA, and (4) disorders due to mutations in nuclear genes that disrupt normal mitochondrial dynamics (biosynthesis, mitophagy, fission, and fusion).

Figure 481-1 labels (continued): eye (optic neuropathy, ophthalmoplegia, retinopathy); liver (hepatopathy); kidney (Fanconi’s syndrome, glomerulopathy); pancreas (diabetes mellitus); blood (Pearson’s syndrome).

The classic physical structure of the mitochondria is that of a thread-like organelle, which under fixed conditions, such as observed with immunohistochemical stains or electron microscopy, has a submarine shape and measures about 1 μm in length. However, in the living state, mitochondria comprise a network, with the mitochondrial shape being highly variable based on the cell type, and manifest a complex and ever-changing syncytial form, with continuous appearance and disappearance of budding structures (representing mitochondrial fission) and reorganization of separate mitochondria (representing mitochondrial fusion). Although mitochondrial number per cell type appears in the medical literature, this is no longer considered a reliable expression of actual functional mitochondrial volume or mass.

FIGURE 481-2 Maternal inheritance of mitochondrial DNA (mtDNA) disorders and heritable traits (pedigree generations I–III). Affected women (filled circles) transmit the trait to their children. Affected men (filled squares) do not transmit the trait to any of their offspring.

Although the presence of mitochondria has been known for 150 years, and knowledge of their respiratory function was proposed ~100 years ago, the initial description of an illness linked to mitochondrial dysfunction was only made in 1962. The presence of mtDNA was also only reported in the 1960s, and it was not until 1988 that the first mutations in the mtDNA causing human illness were described. These included the demonstration of a large-scale mtDNA deletion causing Kearns-Sayre syndrome (KSS) and the discovery of a point mutation in ND4, an mtDNA-encoded complex I gene, causing Leber’s hereditary optic neuropathy (LHON). Following these two discoveries, >400 pathogenic mtDNA mutations or deletions have been reported.

MITOCHONDRIAL DNA STRUCTURE AND FUNCTION

As a result of its circular structure and extranuclear location, the replication and transcription mechanisms of mtDNA differ from the corresponding mechanisms in the nuclear genome, whose nucleosomal packaging and structure are more complex. Specifically, mitochondria have their own transcription system, and the mtDNA itself replicates independently of cellular replication. Because each cell contains many copies of mtDNA and because the number of mitochondria can vary during the lifetime of each cell, mtDNA copy number is not directly coordinated with the cell cycle. Thus, vast differences in mtDNA copy number are observed between different cell types and tissues and during the lifetime of a cell. Another important feature of the mtDNA replication process is a reduced stringency of proofreading and replication error correction, leading to a greater degree of sequence variation compared to the nuclear genome. Some of these sequence variants are silent polymorphisms that do not have the potential for a phenotypic or pathogenic effect, whereas others may be considered pathogenic mutations. There are some mutations that may be considered ecogenetic, as they typically remain silent, meaning they do not cause disease, unless an external event occurs. One classic example is seen in a common (1:800) mutation in the mitochondrial 12S rRNA gene, m.A1555G, which is associated with hearing loss that is rapidly exacerbated by exposure to normal dosages of an aminoglycoside antibiotic. Because mtDNA replication is independent of cellular replication, the percentage of mutant mtDNA copies tends to increase with age, especially in cells that are terminally differentiated (nonreplicative) at birth such as neurons and myocytes, which may explain some features of mitochondrial dysfunction with aging. 
With respect to transcription, initiation can occur on both strands and proceeds through the production of an intronless polycistronic precursor RNA, which is then processed to produce the 13 individual mRNA and 24 individual tRNA and rRNA products. The 37 mtDNA genes comprise fully 93% of the 16,569 nucleotides of the mtDNA in what is known as the coding region. The control region, which is contained in the D-loop, consists of ~1.1 kilobases (kb) of noncoding DNA and is thought to have an important role in replication and transcription initiation.
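The gene counts and the coding fraction quoted above can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check under a simplifying assumption that is not stated in the text: that the genome divides cleanly into the ~1.1-kb control region and the coding region. The variable names are illustrative.

```python
# Figures quoted in the text; names are illustrative only.
MTDNA_LENGTH_BP = 16_569
CONTROL_REGION_BP = 1_100  # "~1.1 kb" in the text, so this value is approximate

PROTEIN_GENES = 13
TRNA_GENES = 22
RRNA_GENES = 2

# Total gene count should match the 37 genes cited for the mtDNA molecule.
total_genes = PROTEIN_GENES + TRNA_GENES + RRNA_GENES

# Assumption: everything outside the control region is coding sequence.
coding_bp = MTDNA_LENGTH_BP - CONTROL_REGION_BP
coding_fraction = coding_bp / MTDNA_LENGTH_BP

print(total_genes)               # 37
print(f"{coding_fraction:.1%}")  # 93.4%
```

Under that assumption the result is in line with the "fully 93%" figure given for the coding region.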

■ ■MATERNAL INHERITANCE AND LACK OF RECOMBINATION

In contrast to homologous pair recombination that takes place in the nucleus, mtDNA molecules do not undergo recombination, such that mutational events represent the only source of mtDNA genetic diversification. Moreover, it is only the maternal mtDNA that is transmitted to the offspring. The fertilized oocyte degrades the paternal mitochondria through a process involving the ubiquitin-proteasome system and autophagy that takes place on the inner membrane of the oocyte. However, additional studies suggest that human spermatozoa do not contain intact mtDNA and are missing TFAM, the mitochondrial transcription factor necessary for mtDNA transcription. Thus, although mothers transmit their mtDNA to both their sons and daughters, only the daughters can transmit the inherited mtDNA to future generations. Accordingly, mtDNA sequence variation and associated phenotypic traits and diseases are inherited exclusively along maternal lines, and the sons and daughters of an affected mother have equal chances of having symptomatic disease, with the only significant exception being LHON, as described below.
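The transmission rule described above (mothers pass mtDNA to all children, but only daughters pass it on) can be expressed as a walk up the maternal line of a pedigree. The following is a minimal sketch; the `Person` class, the function names, and the example family are hypothetical and serve only to illustrate the rule.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    name: str
    sex: str  # "F" or "M"
    mother: Optional["Person"] = None
    father: Optional["Person"] = None


def mtdna_founder(person: Person) -> Person:
    """Return the earliest recorded maternal-line ancestor.

    Under strict maternal inheritance, this ancestor is the source of
    the person's mtDNA; fathers are never followed.
    """
    current = person
    while current.mother is not None:
        current = current.mother
    return current


def can_transmit_mtdna(person: Person) -> bool:
    """Only women pass their mtDNA to the next generation."""
    return person.sex == "F"


# Hypothetical three-generation family.
grandmother = Person("grandmother", "F")
daughter = Person("daughter", "F", mother=grandmother)
son = Person("son", "M", mother=grandmother)
son_partner = Person("son_partner", "F")
daughters_child = Person("daughters_child", "M", mother=daughter)
sons_child = Person("sons_child", "F", mother=son_partner, father=son)

print(mtdna_founder(daughters_child).name)  # grandmother
print(mtdna_founder(sons_child).name)       # son_partner
print(can_transmit_mtdna(son))              # False
```

The son carries his mother's mtDNA, but his child traces back to a different maternal line, which is the pedigree signature of an mtDNA trait.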
The phenotypic expression, including age of onset and the specific pattern and severity of organ dysfunction, of a pathogenic mtDNA mutation may vary greatly, even within families. Because of this complex relationship between mtDNA mutations and disease expression, sometimes it is difficult to recognize the maternal pattern of inheritance at the clinical or pedigree level. However, evidence of paternal transmission can almost certainly exclude an mtDNA genetic origin of phenotypic variation or disease; conversely, a disease affecting both sexes without evidence of paternal transmission strongly suggests a heritable mtDNA disorder (Fig. 481-2).

■ ■MULTIPLE COPY NUMBER (POLYPLOIDY), HIGH MUTATION RATE, HETEROPLASMY, AND MITOTIC SEGREGATION

Each aerobic cell in the body has multiple mitochondria, often numbering many hundreds or more in cells with extensive energy production requirements. Furthermore, the number of copies of mtDNA within each mitochondrion varies from several to hundreds; this is true of both somatic and germ cells, including oocytes in females. In the case of somatic cells, this means that the impact of most newly acquired somatic mtDNA mutations is likely to be very small in terms of total cellular or organ system function; however, because of the manyfold higher mutation rate during mtDNA replication, numerous different mutations may accumulate with aging of the organism. It has been proposed that the total cumulative burden of acquired somatic mtDNA mutations with age may result in an overall perturbation of mitochondrial function, contributing to age-related reduction in the efficiency of oxidative phosphorylation and increased production of damaging ROS. Because certain mtDNA (and nDNA) mutations may result in electron leak within the ETC, the ROS damage may rise to a level causing increased susceptibility to somatic mtDNA damage and disease expression. The accumulation of such acquired somatic mtDNA mutations with aging may contribute to age-related diseases, such as metabolic syndrome and diabetes, cancer, and neurodegenerative and cardiovascular disease in any given individual. However, somatic mutations are not carried forward to the next generation, and the hereditary impact of mtDNA mutagenesis requires separate consideration in the female germline.

The multiple mtDNA copy number within each cell, including the maternal germ cells, results in the phenomenon of heteroplasmy, in contrast to the much greater uniformity (homoplasmy) of somatic nuclear DNA sequence. 
Heteroplasmy for a given mtDNA sequence variant or mutation arises as a result of the coexistence within a cell, tissue, or individual of mtDNA molecules bearing more than one version of the sequence variant (Fig. 481-3). The importance of the heteroplasmy phenomenon to the understanding of mtDNA-related mitochondrial diseases is critical.

FIGURE 481-3 mtDNA genetic bottleneck and changes of heteroplasmy level throughout the lifetime. Each oocyte can inherit a different proportion of mutated mtDNA molecules from maternal mitochondria. When cells divide (shown in pink), heteroplasmy levels in each daughter cell can either increase, decrease, or stay approximately the same. Once inherited, mtDNA mutations can continuously “clonally expand” throughout life, even in nondividing cells (shown in green, blue, and yellow). If one genotype is copied more frequently than another, it will change the overall proportion of different genotypes within the cell over time. The direction of this change can be influenced by selection for or against a particular mtDNA variant (shown in blue and yellow). When a mutated mtDNA molecule has a replicative advantage, the level will increase during life and possibly exceed the biochemical threshold, and thus contribute to the age-related pathologies or the aging process (shown in the blue box). Labels include: cell division; mutated mtDNA; wild-type mtDNA; mitochondrial bottleneck; nondividing cell. (Reproduced from W Wei, PF Chinnery: Inheritance of mitochondrial DNA in humans: Implications for rare and common diseases. J Intern Med 2020; 287:634.)

The coexistence of mutant and nonmutant (wild-type) mtDNA and the variation of the mutant load, which can be thought of as the percentage of mutant mtDNA molecules within a specific cell, tissue, organ, or organism, contribute to the expression of a phenotype among individuals from the same maternal sibship. At the level of the oocyte, the percentage of mtDNA molecules bearing each version of the polymorphic sequence variant or mutation depends on stochastic events related to partitioning of mtDNA molecules during the process of oogenesis itself. Thus, oocytes differ from each other in the degree of heteroplasmy for that sequence variant or mutation. 
In turn, the heteroplasmic state is carried forward to the zygote and to the organism as a whole, to varying degrees, depending on mitotic segregation of mtDNA molecules during organ system development and maintenance. For this reason, in vitro fertilization, followed by preimplantation genetic diagnosis (PGD), is not as predictive of the genetic health of the offspring in the case of mtDNA mutations as in the case of mutations and subsequent diseases occurring in the nuclear genome. Similarly, the impact of somatic mtDNA mutations acquired during development and thereafter shows a wide spectrum of variability. In general, a higher mutant load will result in a more severe and earlier phenotypic presentation. However, the heteroplasmy measured in one tissue (lymphocytes from blood or urine sediment containing kidney and bladder epithelial cells, for example) may not represent the percentage of mutant heteroplasmy in the tissue or organs most affected, such as the cardiac atrioventricular node or brain. Furthermore, the threshold of mutant heteroplasmy that results in clinical illness may vary depending on the specific mutation.
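The stochastic partitioning described above can be illustrated with a toy simulation. The sketch below rests on assumptions that are not taken from the text: a fixed mtDNA copy number per cell, exact doubling of every molecule before each division, random allocation of half the molecules to the daughter cell that is followed, and no selection for or against the mutant. The function names and parameter values are hypothetical.

```python
import random
import statistics


def divide(mutant: int, copies: int, rng: random.Random) -> int:
    """Return the mutant copy count in one daughter cell after a division.

    Every molecule is duplicated, then `copies` of the 2 * copies
    molecules are drawn at random without replacement.
    """
    pool = [1] * (2 * mutant) + [0] * (2 * (copies - mutant))
    return sum(rng.sample(pool, copies))


def simulate_lineages(start_fraction, copies, divisions, n_lineages, seed=0):
    """Follow independent cell lineages and return final mutant fractions."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_lineages):
        mutant = round(start_fraction * copies)
        for _ in range(divisions):
            mutant = divide(mutant, copies, rng)
        results.append(mutant / copies)
    return results


# Same starting heteroplasmy (50%), very different copy numbers.
small = simulate_lineages(0.5, copies=10, divisions=20, n_lineages=100)
large = simulate_lineages(0.5, copies=1000, divisions=20, n_lineages=100)

print("spread with 10 copies:  ", statistics.pstdev(small))
print("spread with 1000 copies:", statistics.pstdev(large))
```

In this simplified model, lineages with few mtDNA copies drift much further from the starting heteroplasmy than lineages with many copies, which is one way to picture why a transient reduction in copy number can produce oocytes with widely different mutant loads.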

Mitotic segregation refers to the unequal distribution of wild-type and mutant versions of mtDNA molecules during all cell divisions that occur during prenatal development and subsequently throughout the lifetime of an individual. The phenotypic effect or disease impact will be a function not only of the inherent disruptive effect (pathogenicity) on the mtDNA-encoded gene (coding region mutations) or integrity of the mtDNA molecule (control region mutations) but also of its distribution among the multiple copies of mtDNA in the various mitochondria, cells, and tissues of the affected individual. Thus, one consequence can be the generation of a bottleneck due to the marked decline in given sets of mtDNA variants, pathogenic and nonpathogenic, consequent to such mitotic segregation. It is postulated that the main effects of this bottleneck occur between the primordial germ cell state and the primary oocyte stage of development. Heterogeneity arises from differences in the degree of heteroplasmy among oocytes of the transmitting female, together with subsequent, probably random, mitotic segregation of the pathogenic mutation during tissue and organ development and throughout the lifetime of the individual offspring. The actual expression of disease is believed to primarily depend on a threshold percentage of mitochondria whose function is disrupted by mtDNA mutations. This in turn confounds hereditary transmission patterns and hence genetic diagnosis of pathogenic heteroplasmic mutations. Generally, if the proportion of mutant mtDNA is <60%, the individual is unlikely to be affected, whereas proportions exceeding 90% likely result in clinical disease. One notable exception is LHON, in which these mutations are present either in 100% mutant homoplasmy, which causes the disease expression, or 100% wild-type homoplasmy. It is not understood why this specific phenotype and the several known mtDNA alleles that result in LHON behave in this manner.
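The random character of mitotic segregation and the threshold rule of thumb given above can be sketched as a toy simulation. This is a deliberately simplified model, not an established one: it assumes a fixed mtDNA copy number per cell, purely random (binomial) sampling at each division, and no selection. The function names, copy number, and starting fraction are hypothetical; only the <60% and >90% cutoffs come from the text, which does not characterize the range in between.

```python
import random


def segregate(mutant_fraction: float, copies_per_cell: int, rng: random.Random) -> float:
    """One division: the daughter cell draws each mtDNA copy at random
    from the parental pool (binomial sampling, no selection assumed)."""
    mutant = sum(1 for _ in range(copies_per_cell) if rng.random() < mutant_fraction)
    return mutant / copies_per_cell


def simulate_lineage(start_fraction: float, copies_per_cell: int,
                     divisions: int, seed: int) -> list:
    """Follow the mutant fraction along a single cell lineage."""
    rng = random.Random(seed)
    fraction = start_fraction
    history = [fraction]
    for _ in range(divisions):
        fraction = segregate(fraction, copies_per_cell, rng)
        history.append(fraction)
    return history


def likely_status(mutant_percent: float) -> str:
    """Apply the general rule stated in the text; the 60-90% range is
    labeled indeterminate here because the text does not characterize it."""
    if mutant_percent < 60:
        return "unlikely to be affected"
    if mutant_percent > 90:
        return "likely clinical disease"
    return "indeterminate"


# Assumed example: five lineages from the same 70% heteroplasmic founder cell.
for seed in range(5):
    final = simulate_lineage(0.70, copies_per_cell=100, divisions=50, seed=seed)[-1]
    print(f"lineage {seed}: {100 * final:.0f}% mutant -> {likely_status(100 * final)}")
```

Even without selection, lineages that start from the same heteroplasmy level drift apart, which mirrors the tissue-to-tissue and sibling-to-sibling variability described above. Once a lineage reaches 0% or 100% in this model it stays there, corresponding to homoplasmy.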
■ ■HOMOPLASMIC VARIANTS AND HUMAN mtDNA PHYLOGENY
In contrast to classic mtDNA diseases, most of which have clinical onset during childhood and are the result of heteroplasmic mutations as noted above, during the course of human evolution, certain mtDNA sequence variants have drifted to a state of homoplasmy, wherein all of the mtDNA molecules in the organism contain the new sequence variant. This arises due to a “bottleneck” effect followed by genetic drift during the very process of oogenesis itself (Fig. 481-3). In other words, during certain stages of oogenesis, the mtDNA copy number becomes so substantially reduced that the particular mtDNA species bearing the novel or derived sequence variant may become the increasingly predominant, and eventually exclusive, version of the mtDNA for that particular nucleotide site. All of the offspring of a woman bearing an mtDNA sequence variant or mutation that has become homoplasmic will also be homoplasmic for that variant and will transmit the sequence variant forward in subsequent generations. Considerations of reproductive fitness limit the evolutionary or population emergence of pathogenic homoplasmic mutations that are lethal or cause severe disease in infancy or childhood. Thus, with a number of notable exceptions (e.g., as noted mtDNA mutations causing LHON; and see below), most homoplasmic mutations are considered to be neutral markers of human evolution, which are useful and interesting in the population genetics analysis of shared maternal

ancestry but have little significance in human phenotypic variation or disease predisposition. More important is the understanding that this accumulation of homoplasmic mutations occurs at a genetic locus that is transmitted only through the female germline and that lacks recombination. In turn, this enables reconstruction of the sequential topology and radiating phylogeny of mutations accumulated through the course of human evolution since the time of the most recent common mtDNA ancestor of all contemporary mtDNA sequences, some 200,000 years ago. The term haplogroup is usually used to define major branching points in the human mtDNA phylogeny, nested one within the other, which often demonstrate striking continental geographic ancestral partitioning. At the level of the complete mtDNA sequence, the term haplotype is usually used to describe the sum of mutations observed for a given mtDNA sequence and as compared to a reference sequence, such that all haplotypes falling within a given haplogroup share the total sum of mutations that have accumulated since the most recent common ancestor and the bifurcation point they mark. The remaining observed variants are private to each haplotype. Consequently, the human mtDNA sequence serves with high fidelity as a molecular prototype for a nonrecombining locus, and its variation has been extensively used in phylogenetic studies. Moreover, the mtDNA mutation rate is considerably higher than the rate observed for the nuclear genome, especially in the control region, which contains the displacement loop, or D-loop, in turn comprising two adjacent hypervariable regions (HVR-I and HVR-II). Together with the absence of recombination, this amplifies drift to high frequencies of novel haplotypes that are highly partitioned across geographically defined populations. Despite extensive research, it has not been well established that such haplotype-based partitioning has a significant influence on human health conditions. 
However, mtDNA-based phylogenetic analysis can be used both as a quality assurance tool and as a filter in distinguishing neutral mtDNA variants comprising human mtDNA phylogeny from potentially deleterious mutations.

MITOCHONDRIAL DNA DISEASE
The true prevalence of mtDNA disease is difficult to estimate because of the phenotypic heterogeneity that occurs as a function of heteroplasmy, the challenge of detecting and assessing heteroplasmy in different affected tissues, and the other unique features of mtDNA function and inheritance described above. It is estimated that at least 1 in 200 healthy humans harbors a pathogenic mtDNA mutation with the potential to cause disease but that heteroplasmic germline pathogenic mtDNA mutations actually result in clinical disease in ~1 in 5000 individuals. The true disease burden relating to mtDNA sequence variation will only be known when the following capabilities become available: (1) ability to distinguish a completely neutral sequence variant from a true phenotype-modifying or pathogenic mutation, (2) accurate assessment of heteroplasmy that can be determined with high fidelity, and (3) a systems biology approach (Chap. 499) to determine the network of epistatic interactions of mtDNA sequence variations with mutations in the nuclear genome.

FIGURE 481-4 Mutations in the human mitochondrial genome known to cause disease. Disorders that are frequently or prominently associated with mutations in a particular gene are shown in boldface. Diseases due to mutations that impair mitochondrial protein synthesis are shown in blue. Diseases due to mutations in protein-coding genes are shown in red. ECM, encephalomyopathy; FBSN, familial bilateral striatal necrosis; LHON, Leber’s hereditary optic neuropathy; LS, Leigh’s syndrome; MELAS, mitochondrial encephalomyopathy, lactic acidosis, and stroke-like episodes; MERRF, myoclonic epilepsy with ragged red fibers; MILS, maternally inherited Leigh’s syndrome; NARP, neuropathy, ataxia, and retinitis pigmentosa; PEO, progressive external ophthalmoplegia; PPK, palmoplantar keratoderma; SIDS, sudden infant death syndrome. (From S DiMauro, E Schon: Mitochondrial respiratory-chain diseases. N Engl J Med 348:2656, 2003. Copyright © 2003, Massachusetts Medical Society. Reprinted with permission from Massachusetts Medical Society.)

■ ■OVERVIEW OF CLINICAL AND PATHOLOGIC FEATURES OF HUMAN mtDNA DISEASE
Given the vital roles of mitochondria in all nucleated cells, it is not surprising that mtDNA mutations can affect numerous tissues with pleiotropic effects. More than 200 different disease-causing, mostly heteroplasmic mtDNA mutations have been described affecting ETC function. Figure 481-4 provides a partial mtDNA map of some of the better characterized of these disorders. A number of clinical clues can increase the index of suspicion for a heteroplasmic mtDNA mutation as an etiology of a heritable trait or disease, including (1) familial clustering with absence of paternal transmission; (2) adherence to one of the classic syndromes (see below) or paradigmatic combinations of disease phenotypes involving several organ systems that normally do not fit together within a single nuclear genomic mutation category; (3) a complex of laboratory and pathologic abnormalities that reflect disruption in cellular energetics (e.g., lactic acidosis and neurodegenerative and myodegenerative symptoms with the finding of ragged red fibers, reflecting the accumulation of abnormal mitochondria under the muscle sarcolemmal membrane); or (4) a mosaic pattern reflecting a heteroplasmic state. There are no truly sensitive and specific biomarkers of disease, and the presence of a historically quintessential finding of ragged red fibers can be seen in numerous muscle disorders, so laboratory tests must always be interpreted in the context of their limitations and should not be used to define the disease. Because of the improved availability and decreasing cost for mtDNA sequencing, the presence or absence of a pathogenic mtDNA mutation can be diagnostic when the clinical phenotype and family history are suggestive.

Heteroplasmy can sometimes be elegantly demonstrated at the tissue level using histochemical staining for enzymes in the oxidative phosphorylation pathway, with a mosaic pattern indicating heterogeneity of the genotype for the coding region for the mtDNA-encoded enzyme.

Complex II, CoQ, and cytochrome c are exclusively encoded by nuclear DNA. In contrast, complexes I, III, IV, and V contain at least some subunits encoded by mtDNA. Just 3 of the 13 subunits of the ETC complex IV enzyme, cytochrome c oxidase (COX), are encoded by mtDNA, and therefore, this enzyme has the lowest threshold for dysfunction when a threshold level of mutated mtDNA is reached. Histochemical staining for COX activity in tissues of patients affected with heteroplasmic inherited mtDNA mutations (or with the somatic accumulation of mtDNA mutations, see below) can show a mosaic pattern of reduced histochemical staining in comparison with histochemical staining for the complex II enzyme succinate dehydrogenase (SDH) (Fig. 481-5).

FIGURE 481-5 Cytochrome c oxidase (COX) deficiency in mitochondrial DNA (mtDNA)–associated disease. Transverse tissue sections that have been stained for COX and succinate dehydrogenase (SDH) activities sequentially, with COX-positive cells shown in brown and COX-deficient cells shown in blue. A. Skeletal muscle from a patient with a heteroplasmic mitochondrial tRNA point mutation. The section shows a typical “mosaic” pattern of COX activity, with many muscle fibers harboring levels of mutated mtDNA that are above the crucial threshold to produce a functional enzyme complex. B. Cardiac tissue (left ventricle) from a patient with a homoplasmic tRNA mutation that causes hypertrophic cardiomyopathy, which demonstrates an absence of COX in most cells. C. A section of cerebellum from a patient with mtDNA rearrangement that highlights the presence of COX-deficient neurons. D, E. Tissues that show COX deficiency due to clonal expansion of somatic mtDNA mutations within single cells, a phenomenon that is seen in both postmitotic cells (D; extraocular muscles) and rapidly dividing cells (E; colonic crypt) in aging humans. (Reproduced with permission from R Taylor, D Turnbull: Mitochondrial DNA mutations in human disease. Nat Rev Genetics 6:389, 2005.)

Next-generation sequencing (NGS) has dramatically improved the clinical genetic diagnostic evaluation of mitochondrial diseases at the level of both the nuclear genome and mtDNA. Low sequencing costs, high throughput, and short turnaround time expedite whole exome sequencing (WES) or whole genome sequencing (WGS) to identify genes and mutations with known pathogenicity or with likely pathogenicity based on bioinformatics assessment. In the context of the mtDNA, the deep coverage enabled by NGS compared to Sanger sequencing now provides rapid and reliable detection of heteroplasmy in different affected tissues. NGS yields accurate information about a patient’s predominant mtDNA sequence as well as lower frequency heteroplasmic variants and can reliably reach detection of even single mutant nucleotide heteroplasmy down to levels of <10%. Lower levels are often only clinically relevant in the setting of a striking difference in heteroplasmy in different tissues. Clinically, the most striking overall characteristic of mitochondrial genetic disease is the phenotypic heterogeneity associated with mtDNA mutations. This extends to intrafamilial phenotypic heterogeneity for the same mtDNA pathogenic mutation and, conversely, to the overlap of phenotypic disease manifestations with distinct mutations. Thus, although fairly consistent and well-defined “classic” syndromes have been attributed to specific mutations, “nonclassical” combinations of disease phenotypes ranging from isolated myopathy to extensive multisystem disease are often encountered, rendering genotype-phenotype correlation challenging. In both classical and nonclassical mtDNA disorders, there is often a clustering of some combination

of abnormalities affecting the neurologic system (including optic nerve atrophy, pigment retinopathy, and sensorineural hearing loss), cardiac and skeletal muscle (including extraocular muscles), and endocrine and metabolic systems (including diabetes mellitus). Additional organ systems that may be affected include the hematopoietic, renal, hepatic, and gastrointestinal systems, although these are more frequently involved in infants and children. Disease-causing mtDNA coding region mutations can affect either one of the 13 protein-encoding genes or one of the 24 protein synthetic genes. Clinical manifestations do not readily distinguish these two categories, although lactic acidosis and specific muscle pathologic findings (e.g., ragged red and ragged blue fibers, immunohistochemical staining, paracrystalline inclusions on ultrastructure) tend to be more prominent in the latter. In all cases, either defective ATP production due to disturbances in the ETC or enhanced generation of ROS has been invoked as the mediating biochemical mechanism between mtDNA mutation and disease manifestation.

■ ■mtDNA DISEASE PRESENTATIONS
The clinical presentation of adult patients with mtDNA disease can be divided into three categories: (1) clinical features suggestive of mitochondrial disease (Table 481-2) but not a well-defined classic syndrome; (2) classic mtDNA syndromes; and (3) clinical presentation confined to one organ system (e.g., isolated sensorineural deafness, cardiomyopathy, or diabetes mellitus). It is important to note, especially when young adults come to medical attention, that symptoms of an mtDNA disorder may have begun during childhood. Table 481-3 provides a summary of eight illustrative classic mtDNA syndromes or disorders that affect adult patients and highlights some of the most interesting features of mtDNA disease in terms of molecular pathogenesis, inheritance, and clinical presentation. 
The first five of these syndromes result from heritable point mutations in either protein-encoding or protein synthetic mtDNA genes; the other three result from rearrangements or deletions that usually do not involve the germline.

LHON is a common cause of maternally inherited visual failure. LHON typically presents during young adulthood with subacute painless loss of vision in one eye, with symptoms developing in the other eye 6–12 weeks later. In some instances, cerebellar ataxia, peripheral neuropathy, and cardiac conduction defects are observed. In >95% of cases, LHON is due to one of the three homoplasmic point mutations of mtDNA that affect genes encoding different subunits of complex I of the mitochondrial ETC; however, not all individuals who inherit a primary LHON mtDNA mutation develop optic neuropathy, and the male-to-female ratio is 8.2, indicating that additional environmental (e.g., tobacco exposure) or independent genetic factors are important in the etiology of the disorder. Estrogen may also play a role in the decreased clinical penetrance in women. Both the nuclear and mitochondrial genomic backgrounds modify disease penetrance. Indeed, a region of the X chromosome containing a high-risk haplotype for LHON has been identified, supporting the formulation that nuclear genes act as modifiers and affording an explanation for the male prevalence of LHON. This haplotype can be used in predictive genomic testing and prenatal screening for this disease. In contrast to the other classic mtDNA disorders, it is of interest that patients with this syndrome are often homoplasmic for the disease-causing mutation. The somewhat later onset in young adulthood and modifying effect of protective background nuclear genomic haplotypes may have enabled homoplasmic pathogenic mutations to have escaped evolutionary censoring.

TABLE 481-2 Common Features of Mitochondrial DNA–Associated Diseases in Adults
Neurologic: stroke, epilepsy, migraine headache, peripheral neuropathy, ataxia, dystonia, myoclonus, cranial neuropathy (optic atrophy, sensorineural deafness, dysphagia, dysphasia)
Skeletal myopathy: ophthalmoplegia, exercise intolerance, myalgia, weakness
Cardiac: conduction block, cardiomyopathy
Respiratory: hypoventilation, aspiration pneumonitis
Endocrine: diabetes mellitus, premature ovarian failure, hypothyroidism, hypoparathyroidism
Ophthalmologic: cataracts, pigment retinopathy, neurologic and myopathic (optic atrophy, ophthalmoplegia)

Mitochondrial encephalomyopathy, lactic acidosis, and stroke-like episodes (MELAS) is a multisystem disorder with a typical onset between 2 and 10 years of age, although adult presentations also occur. Following normal early psychomotor development, the most common initial symptoms are seizures, recurrent headaches, anorexia, and recurrent vomiting. Exercise intolerance or proximal limb weakness can be the initial manifestation, followed by generalized tonic-clonic seizures. Short stature is common. Seizures are often associated with stroke-like episodes of transient hemiparesis or cortical blindness that may produce recurrent encephalopathy with impaired consciousness. It is often not possible to determine if the encephalopathy is due to refractory clinical or subclinical seizures or should be attributed to an independent effect. The cumulative residual effects of the stroke-like episodes gradually impair motor abilities, vision, and cognition, often by adolescence or young adulthood. 
Sensorineural hearing loss adds to the progressive decline of these individuals. A plethora of less common symptoms have been described including myoclonus, ataxia, episodic coma, optic atrophy, cardiomyopathy, pigmentary retinopathy, ophthalmoplegia, diabetes mellitus, hirsutism, gastrointestinal dysmotility, and nephropathy. The typical age of death ranges from 10 to 35 years, but some individuals live into their sixth decade. Intercurrent infections or intestinal obstructions are often the terminal events. It is proposed that the clinical diagnosis of MELAS can only be applied if the following three criteria are met: (1) stroke-like episode before age 40 years, (2) encephalopathy due to seizures and/or dementia, and (3) lactic acidosis and/or ragged red fibers. It is not atypical for some family members to have much less severe or later onset illness, presumably because of a lesser mutation load, and “MELAS” is not used as a diagnosis for these restricted phenotypes. This creates somewhat of a disconnect between the genotype for MELAS (most commonly the m.3243A>G mutation) and a diverse phenotype, which includes the syndrome MELAS, a syndrome of high-frequency hearing loss and diabetes with onset later in life, as well as many other phenotypes between these two extreme syndromes. Certain other mtDNA mutations can also cause such patterns of diverse phenotypic expression.

TABLE 481-3 Mitochondrial Diseases Due to Mitochondrial DNA (mtDNA) Point Mutations and Large-Scale Rearrangements
DISEASE PHENOTYPE NARP, Leigh’s syndrome Loss of central vision leading to blindness in young adult life MELAS Mitochondrial encephalomyopathy, lactic acidosis, and stroke-like episodes; may manifest only as diabetes mellitus MERRF Myoclonic epilepsy, ragged red fibers in muscle, ataxia, increased CSF protein, sensorineural deafness, dementia Deafness Progressive sensorineural deafness, often induced by aminoglycoside antibiotics Nonsyndromic sensorineural deafness m.7445A>G mutation in 12S rRNA Chronic progressive external ophthalmoplegia (PEO) Late-onset bilateral ptosis and ophthalmoplegia, proximal muscle weakness, and exercise intolerance Pearson’s syndrome Pancreatic insufficiency, pancytopenia, lactic acidosis Large deletion Heteroplasmic Sporadic, somatic mutations Kearns-Sayre syndrome (KSS) External ophthalmoplegia, heart block, retinal pigmentation, ataxia
Abbreviations: CSF, cerebrospinal fluid; NARP, neuropathy, ataxia, and retinitis pigmentosa.

Laboratory investigation commonly demonstrates elevated blood lactate concentrations at rest with excessive increase after moderate exercise. Magnetic resonance imaging (MRI) of the brain shows areas of involvement on T2- or fluid-attenuated inversion recovery (FLAIR) sequences, with decreased signal on perfusion-weighted sequences, which typically involve the posterior cerebrum and do not conform to the distribution of major arteries. These abnormalities may be temporary or evolve to subsequent atrophy (Fig. 481-6). Electrocardiography (ECG) may show evidence of cardiomyopathy, preexcitation, or incomplete heart block. Electromyography and nerve conduction studies are consistent with a myopathic process, with or without coexisting axonal and sensory neuropathic findings. Muscle biopsy typically shows ragged red fibers with the modified Gomori trichrome stain or “ragged blue fibers” with the SDH histochemical stain, resulting from the hyperintense reaction.

The diagnosis of MELAS is based on a combination of clinical findings and molecular genetic testing. Mutations in the mtDNA gene MT-TL1 encoding tRNAleu are causative. The most common mutation, present in ~80% of individuals with typical clinical findings, is an A-to-G transition at nucleotide 3243 (m.3243A>G). Mutations can usually be detected in mtDNA from leukocytes in individuals with typical MELAS; however, the occurrence of heteroplasmy can result in varying tissue distribution of mutated mtDNA. 
In the absence of specific treatment, various manifestations of MELAS are treated according to standard modalities for prevention, surveillance, and treatment. Recent developments in therapy are described below.
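The three proposed clinical criteria for MELAS described above can be written as a simple conjunction. The sketch below is illustrative only and is not a validated diagnostic tool: the `MelasFindings` class, its field names, and the example patient are hypothetical, and the check deliberately ignores molecular genetic testing, which the text lists as part of the overall diagnosis.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MelasFindings:
    """Hypothetical summary of clinical findings (field names are assumptions)."""
    age_at_first_stroke_like_episode: Optional[int]  # None if no episode has occurred
    encephalopathy_with_seizures: bool
    encephalopathy_with_dementia: bool
    lactic_acidosis: bool
    ragged_red_fibers: bool


def meets_proposed_melas_criteria(f: MelasFindings) -> bool:
    """All three proposed criteria must hold for the clinical label 'MELAS'."""
    stroke_like_before_40 = (
        f.age_at_first_stroke_like_episode is not None
        and f.age_at_first_stroke_like_episode < 40
    )
    encephalopathy = f.encephalopathy_with_seizures or f.encephalopathy_with_dementia
    energetics = f.lactic_acidosis or f.ragged_red_fibers
    return stroke_like_before_40 and encephalopathy and energetics


# Assumed example: a relative with a milder, restricted phenotype who has
# lactic acidosis but has never had a stroke-like episode.
relative = MelasFindings(
    age_at_first_stroke_like_episode=None,
    encephalopathy_with_seizures=False,
    encephalopathy_with_dementia=False,
    lactic_acidosis=True,
    ragged_red_fibers=False,
)
print(meets_proposed_melas_criteria(relative))
```

Written this way, the gap between genotype and syndrome is explicit: a relative carrying the same mutation with a restricted phenotype fails the check even though the genotype is shared.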

Myoclonic epilepsy with ragged red fibers (MERRF) is a multisystem disorder characterized by myoclonus, seizures, ataxia, and myopathy with ragged red fibers. Hearing loss, exercise intolerance, neuropathy, cervical lipomas, and short stature are often present. Ataxia and lipomas can be a feature in adults or adult-onset MERRF. Cerebrospinal fluid (CSF) analysis reveals an elevated protein content. Almost all MERRF patients have a mutation in the mtDNA tRNAlys gene, and the m.8344A>G mutation in this gene is responsible for 80–90% of MERRF cases.

TABLE 481-3 (continued) MOST FREQUENT mtDNA MUTATIONS HETEROPLASMIC/HOMOPLASMIC MATERNAL m.11778G>A, m.14484T>C, m.3460G>A Heteroplasmic Maternal Point mutation in tRNAleu Heteroplasmic Maternal Point mutation in tRNAlys Heteroplasmic Maternal m.1555A>G mutation in 12S rRNA Homoplasmic Maternal Homoplasmic Maternal Single deletions or duplications Heteroplasmic Mostly sporadic, somatic mutations The 5-kb “common deletion” Heteroplasmic Sporadic, somatic mutations

Neuropathy, ataxia, and retinitis pigmentosa (NARP) is characterized by moderate diffuse cerebral and cerebellar atrophy and symmetric lesions of the basal ganglia on MRI (Figs. 481-7 and 481-8). A heteroplasmic m.8993T>G mutation in the ATPase 6 subunit gene

has been identified as causative, which underscores the lack of definitive genotype-phenotype correlation in mtDNA diseases. Ragged red fibers are not observed in muscle biopsy. When >95% of mtDNA molecules are mutant, a more severe clinical, neuroradiologic, and neuropathologic picture (Leigh’s syndrome) emerges. Not uncommonly, an infant is diagnosed with Leigh’s syndrome due to the m.8993T>G mutation and not until several years later will the mother present with symptoms of NARP, a situation that highlights the concept of a higher threshold for lower levels of tissue heteroplasmy.

FIGURE 481-6 A 15-year-old girl with MELAS (mitochondrial encephalomyopathy, lactic acidosis, and stroke-like episodes) due to m.3243A>G (tRNALeu(UUR)), 85% mutant heteroplasmy, presenting at age 5 with focal motor seizures, ataxia, and short stature, with episodes of acute language and motor dysfunction and progressive cognitive impairment. The fluid-attenuated inversion recovery (FLAIR) magnetic resonance image (MRI) shows increased signal intensity (white arrows) in the left temporal-parietal region in addition to global mild volume loss (increased extraaxial cerebrospinal fluid spaces).

FIGURE 481-7 A 9-year-old girl with Leigh’s syndrome due to m.8993T>G (ATPase subunit 6), 99% heteroplasmy, presenting at age 14 months with a motor delay and who underwent magnetic resonance imaging (MRI) at 24 months, at which time she had just begun to walk. She has moderate cognitive impairment, arm chorea, and distal leg dystonia. The fluid-attenuated inversion recovery (FLAIR) MRI shows symmetric bilateral increased signal in the caudate nuclei (thin arrow) and putamen (thick arrow); only left-sided lesions indicated with arrows.

FIGURE 481-8 A 12-year-old boy with Leigh’s syndrome due to m.10191T>C (ND3 gene, complex I), heteroplasmy percentage not determined, presenting with infantile spasms at 8 months of life. He responded well to adrenocorticotropic hormone (ACTH), and his magnetic resonance imaging (MRI) and development were normal until 30 months when he developed dystonia and progressive medically intractable epilepsy. The fluid-attenuated inversion recovery (FLAIR) MRI at 6 years of life shows global atrophy with large extra-axial cerebrospinal fluid spaces, increased signal intensity in the cortex (thin arrows), necrotic bilaterally symmetric lesions in the putamina, and enlarged lateral ventricles due to loss of bilateral caudate nuclei volume (stars).

Point mutations in the mtDNA gene encoding the 12S rRNA (m.1555A>G) result in heritable nonsyndromic hearing loss. One such mutation causes heritable ototoxic susceptibility to standard dosing of aminoglycoside antibiotics, which opens a pathway for a simple pharmacogenetic test in the appropriate clinical settings. This is an example of an ecogenetic disorder in that most people with this mutation do not develop any symptoms until exposed to an external agent.

KSS, sporadic progressive external ophthalmoplegia (PEO), and Pearson’s syndrome are three disease phenotypes caused by large-scale mtDNA rearrangements including partial deletions or partial duplication. The majority of single large-scale rearrangements of mtDNA are thought to result from clonal amplification of a single sporadic mutational event, occurring in the maternal oocyte or during early embryonic development. 
The typical mtDNA deletion specifically involves 4977 nucleotides, lost at identical breakpoints, and accounting for most KSS and PEO of mtDNA deletion origin. Because germline involvement is rare, most cases are sporadic rather than inherited. KSS is characterized by the triad of onset before age 20, chronic PEO, and pigmentary retinopathy. Cerebellar syndrome, heart block, increased CSF protein content, diabetes mellitus, and short stature are also part of the syndrome. Single deletions/duplication can also result in milder phenotypes such as PEO, characterized by late-onset PEO, proximal myopathy, and exercise intolerance. In both KSS and PEO, diabetes mellitus and hearing loss are frequent accompaniments. Pearson’s syndrome is characterized by infantile onset of a sideroblastic anemia accompanied by lactic acidosis and failure to thrive caused in part by exocrine pancreatic insufficiency. If the child survives, the manifestations appear phenotypically similar to those of severe KSS with myopathy, PEO, encephalopathy, and cardiomyopathy. Pearson’s syndrome is generally caused by

large-scale sporadic deletion of several mtDNA genes that differ from the common deletion seen in KSS. Typically, the deletion size is larger in Pearson’s syndrome, and located with different breakpoints, than in KSS or PEO, but this is not always the case.

Two important dilemmas in classic mtDNA disease have benefited from recent important research insights. The first relates to the greater involvement of neuronal, muscular, renal, hepatic, and pancreatic manifestations in mtDNA disease in these syndromes. This observation has appropriately been mostly attributed to the high energy utilization of the involved tissues and organ systems and, hence, greater dependency on mitochondrial ETC integrity and health. However, because mutations are stochastic events, mitochondrial mutations should occur in any organ during embryogenesis and development. Recently, additional explanations have been suggested based on studies of the common m.3243A>G transition. The proportion of this mutation in peripheral blood cells was shown to decrease exponentially with age. A selective process acting at the stem cell level with a strong bias against the mutated form would have its greatest effect to reduce the mutant mtDNA only in highly proliferating cells, such as those derived from the hematopoietic system. Tissues and organs carrying pathogenic mtDNA mutations and having a lower cell turnover, such as brain, nerve, or retina, would not benefit from this effect and, thus, would be expected to accumulate mutational load and be the most affected. However, age-related clonal hematopoiesis might mitigate some of this disparity.

The other dilemma arises from the observation that only a subset of mtDNA mutations accounts for the majority of the familial mtDNA diseases. The random occurrence of mutations in the mtDNA sequence should yield a more uniform distribution of disease-causing mutations. 
However, recent studies that introduced one severe and one mild point mutation into the female germline of experimental animals demonstrated selective elimination of the severe mutation during oogenesis and selective retention of the milder mutation, with the emergence of mitochondrial disease in offspring after multiple generations. Thus, oogenesis itself can act as an “evolutionary” filter against the most harmful mtDNA mutations.

Clinical Investigations – Initial Biochemical Screening
Blood: CK, liver functions, glucose, lactate, carnitine/acylcarnitines, amino acids, GDF-15
Urine: organic acids, amino acids
CSF: glucose, protein, lactate, amino acids
Cardiac: ECG, ECHO
Brain: MRI with MRS (or CT)
Nerve/Muscle: EMG, nerve conduction

Depending on Available Technology, Select
Specific mtDNA point mutations with LR-PCR (mtDNA); or
Whole mtDNA genome (NextGen) with LR-PCR (mtDNA); or
WES or WGS (including mtDNA genome) with LR-PCR (mtDNA)

Muscle Biopsy
Immunohistochemistry
Respiratory Chain Enzymology

Special Testing on Muscle
LR-PCR (mtDNA)
mtDNA whole genome sequencing
mtDNA depletion quantification

FIGURE 481-9 Clinical and laboratory investigation of a suspected mitochondrial DNA (mtDNA) disorder. Following history (including the family history) and examination, a screening biochemical evaluation and other testing are selectively warranted, and if the evaluation suggests a mitochondrial disease, further genetic evaluation is warranted. The specific molecular genetic testing depends on available technology and costs, with clinical acumen essential for determining the extent of testing. Muscle investigation can support the genetic testing. Of note, mtDNA deletion disorders at times require LR-PCR on skeletal muscle tissue to find the deletion. CSF, cerebrospinal fluid; CT, computed tomography; ECG, electrocardiogram; ECHO, echocardiography; EEG, electroencephalogram; EMG, electromyogram; LHON, Leber’s hereditary optic neuropathy; LR-PCR, long-range polymerase chain reaction; MELAS, mitochondrial encephalomyopathy, lactic acidosis, and stroke-like episodes; MERRF, myoclonic epilepsy with ragged red fibers; MRI, magnetic resonance imaging; MRS, magnetic resonance spectroscopy; PCR, polymerase chain reaction; RFLP, restriction fragment length polymorphism; WES, whole exome sequencing; WGS, whole genome sequencing.
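The age-related fall in blood heteroplasmy noted above for m.3243A>G can be illustrated numerically. The following Python sketch assumes a simple exponential model. The function names and the annual decline rate are hypothetical placeholders chosen only for illustration; they are not values from the text, and the sketch is not a validated clinical tool.

```python
import math


def blood_heteroplasmy(initial_fraction, annual_decline_rate, age_years):
    """Mutant mtDNA fraction in blood at a given age under exponential decline.

    initial_fraction: mutant fraction at birth (0 to 1).
    annual_decline_rate: hypothetical rate constant per year (an assumption).
    """
    if not 0.0 <= initial_fraction <= 1.0:
        raise ValueError("initial_fraction must be between 0 and 1")
    if annual_decline_rate < 0 or age_years < 0:
        raise ValueError("rate and age must be non-negative")
    return initial_fraction * math.exp(-annual_decline_rate * age_years)


def back_extrapolated_heteroplasmy(measured_fraction, annual_decline_rate, age_years):
    """Estimate the fraction at birth from a blood measurement, capped at 1.0."""
    estimate = measured_fraction * math.exp(annual_decline_rate * age_years)
    return min(1.0, estimate)


if __name__ == "__main__":
    rate = 0.02  # hypothetical rate constant, not taken from the text
    for age in (0, 20, 40, 60):
        level = blood_heteroplasmy(0.50, rate, age)
        print(f"age {age:2d}: blood mutant fraction {level:.2f}")
```

Under this assumed model, a low blood level in an older adult remains compatible with a substantially higher level in tissues with low cell turnover, which is consistent with the tissue differences described above.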

■ ■THE INVESTIGATION OF SUSPECTED mtDNA DISEASE

The clinical presentations of classic syndromes, groupings of disease manifestations in multiple organ systems, or unexplained isolated presentations of one of the disease features of a classic mtDNA syndrome should prompt a systematic clinical investigation as outlined in Fig. 481-9. However, some tests are not universally available or are costly, and WGS is not only more readily available but also less costly than obtaining tissue such as muscle for biochemical and pathologic evaluation. In many medical centers, a history and exam suggestive of an mtDNA disorder will result in a molecular genetic evaluation before tissue evaluation. Indeed, mitochondrial disease should be considered in the differential diagnosis of any progressive multisystem disorder. Despite the centrality of disrupted oxidative phosphorylation, an elevated blood lactate level is neither specific nor sensitive, because there are many causes of lactic acidosis and many patients with mtDNA defects, presenting at any age, have normal blood lactate levels. An elevated CSF lactate is a more specific test for mitochondrial disease if there is central nervous system involvement but is still not diagnostic. The serum creatine kinase may be elevated but is often normal, even in the presence of a proximal myopathy. Recently, testing for elevated levels of growth differentiation factor 15 (GDF15) has shown a high degree of sensitivity and specificity in those with a mitochondrial myopathy, but the degree of elevation for an individual patient does not reflect the severity of the illness and does not seem to be a sensitive marker of disease activity. Urinary organic acids (specifically TCA cycle intermediates) and amino acids (alanine, proline) may also be abnormal, reflecting metabolic as well as kidney proximal tubule dysfunction. Every patient with seizures, episodes of confusion or atypical behavioral changes, or cognitive decline should have an electroencephalogram. A brain computed tomography (CT) scan may show calcified basal ganglia or bilateral hypodense regions with cortical atrophy. MRI is indicated in patients with brainstem signs or stroke-like episodes.

For an increasing number of mitochondrial diseases, it is possible to obtain an accurate diagnosis with a simple molecular genetic screen. For example, 95% of patients with LHON harbor one of three mtDNA point mutations (m.11778G>A, m.3460G>A, or m.14484T>C). These patients have very high levels of mutated mtDNA in peripheral blood cells, and therefore, it is appropriate to send a blood sample for molecular genetic analysis by polymerase chain reaction (PCR) or restriction fragment length polymorphism (RFLP). The same is true for most MERRF patients, who harbor a point mutation in the lysine tRNA gene at position 8344. In contrast, patients with the m.3243A>G MELAS mutation often have low levels of mutated mtDNA in blood. If clinical suspicion is strong enough to warrant peripheral blood testing, then patients with a negative result should have testing repeated using a saliva sample or be investigated further by performing a skeletal muscle biopsy to obtain mtDNA from a relatively nonreplicative tissue.

FIGURE 481-10 Disorders associated with perturbations in nuclear-mitochondrial genomic crosstalk. Clinical features and genes associated with multiple mitochondrial DNA (mtDNA) deletions, mtDNA depletion, and mitochondrial neurogastrointestinal encephalomyopathy syndromes. adPEO, autosomal dominant progressive external ophthalmoplegia; ANT, adenine nucleotide translocators; arPEO, autosomal recessive progressive external ophthalmoplegia; IOSCA, infantile-onset spinocerebellar ataxia; SCAE, spinocerebellar ataxia and epilepsy. (Reproduced with permission from A Spinazzola, M Zeviani: Disorders from perturbations of nuclear-mitochondrial intergenomic cross-talk. J Intern Med 265:174, 2009.)
Diagram labels: multiple ∆mtDNA; mtDNA depletion; adPEO; arPEO; Alpers-like; IOSCA; SCAE; dNTP pool; pyrimidine salvage; Pol γ; Pol γA; Twinkle; ANT1; deoxyguanosine kinase; MPV17; thymidine kinase (TK2); RRM2B (p53-R2); succinyl-CoA synthase (SUCLA2, SUCLG1); thymidine phosphorylase (TP).
Muscle biopsy with histochemical analysis was historically the cornerstone of the investigation of patients with suspected mitochondrial disease. Histochemical analysis may show subsarcolemmal accumulation of mitochondria with the appearance of ragged red fibers, especially in those with mtDNA mutations affecting the tRNA and rRNA genes. Electron microscopy might show abnormal mitochondria with paracrystalline inclusions. Muscle histochemistry may show COX-deficient fibers, which indicate mitochondrial dysfunction (Fig. 481-5). Respiratory chain complex assays may also show reduced enzyme function. If enzymatic or polarographic data are used to aid in the confirmation of diagnosis, a standard method of analysis should be employed. Either of these two abnormalities, within the exact context of established peer-reviewed criteria, may confirm the presence of a mitochondrial disease, to be followed by an in-depth molecular genetic analysis. In most major centers, genetic testing has become the primary means of obtaining a definitive diagnosis, with muscle pathology and biochemistry used to assist with interpretation of inconclusive genetic results. It is proposed that the term primary mitochondrial disease be used only when a pathogenic mutation is identified that matches the clinical phenotype.

Recent evidence has provided important insights into the importance of nuclear-mtDNA genomic cross-talk and has provided a descriptive framework for classifying and understanding disorders that emanate from perturbations in this cross-talk. Although these are not strictly considered mtDNA genetic disorders, their manifestations do overlap those highlighted above (Fig. 481-10).

IMPACT OF HOMOPLASMIC SEQUENCE VARIATION ON HERITABLE TRAITS AND DISEASE

The relationship among the degree of heteroplasmy, tissue distribution of the mutant mtDNA, and disease phenotype simplifies inference of a clear causative relationship between heteroplasmic mutation and disease.
With the exception of certain mutations (e.g., those causing most cases of LHON), drift to homoplasmy of such mutations would be precluded normally by the severity of impaired oxidative phosphorylation and the consequent reduction in reproductive fitness. Therefore, sequence variants that have reached homoplasmy should be neutral in terms of human evolution and, hence, useful for tracing phylogeny, demography, and migration, as described above. Thus, novel homoplasmic variants are seldom pathogenic. One important exception is in the case of one or more of the homoplasmic population-level variants, which designate the mtDNA haplogroup J, and the interaction with the mtDNA mutations causing LHON. Reduced disease predilection suggests that one or more of the ancient sequence variants designating mtDNA haplogroup J appear to attenuate predisposition to degenerative disease, in the presence of other risk factors. Whether or not additional epistatic interactions between population-level mtDNA haplotypes and common health conditions will be found remains to be determined. If such influences do exist, then they are more likely to be relevant to health conditions in the postreproductive age groups, wherein evolutionary filters would not have had the opportunity to censor deleterious effects and interactions and wherein the effects of oxidative stress during aging or with poor diet or lack of exercise may play a role. Although much has been written about the possible associations between population-level common mtDNA variants and human health and disease phenotypes or adaptation to different environmental influences (e.g., climate), clinical implications have not been forthcoming.

Many studies that purport to show such associations with phenotypes such as longevity, athletic performance, and metabolic and neurodegenerative disease are limited by small sample sizes, possible genotyping inaccuracies, and the possibility of population stratification or ethnic ancestry bias. Because mtDNA haplogroups are so prominently partitioned along phylogeographic lines, it is difficult to exclude the possibility that a haplogroup for which an association has been reported is simply a marker for otherwise unappreciated population heterogeneity, wherein a nongenetic (societal or environmental) difference among the populations marked by the mtDNA haplogroup differences is actually causally related to the disease of interest.
The experimental difficulty in generating cellular or animal models to test the functional influence of homoplasmic sequence variants (as a result of mtDNA polyploidy) further compounds the challenge. The most likely formulation is that the risk conferred by different mtDNA haplogroup-defining homoplasmic mutations for common diseases depends on the concomitant nuclear genomic background, together with environmental influences. Progress in minimizing potentially misleading associations in mtDNA heritable trait and disease studies should include ensuring adequate sample size taken from a large sample recruitment base, using carefully matched controls and population structure determination, and performing analysis that takes into account epistatic interactions with other genomic loci and environmental factors.

IMPACT OF ACQUIRED SOMATIC mtDNA MUTATION ON HUMAN HEALTH AND DISEASE

Studies on aging humans and animals have shown a potentially important correlation of age with the accumulation of heterogeneous mtDNA mutations, especially in organ systems that undergo the most prominent age-related degenerative tissue phenotype. Sequencing of PCR-amplified single mtDNA molecules has demonstrated an average of two to three point mutations per molecule in elderly subjects compared with younger ones. Point mutations observed include those responsible for known heritable heteroplasmic mtDNA disorders, such as the m.8344A>G and m.3243A>G mutations responsible for the MERRF and MELAS syndromes, respectively. However, the cumulative burden of these acquired somatic point mutations with age was observed to remain well below the threshold expected for phenotypic expression (<2%). Point mutations at other sites not normally involved in inherited mtDNA disorders have also been shown to accumulate to much higher levels in some tissues of elderly individuals, with the description of tissue-specific “hot spots” for acquired somatic mtDNA point mutations. Likewise, an age-associated and tissue-specific accumulation of mtDNA deletions has been observed, including deletions involved in known heritable mtDNA disorders, as well as others. The accumulation of functional mtDNA deletions in a given tissue is expected to be associated with mitochondrial dysfunction, as reflected in an age-associated patchy and reduced COX activity on histochemical staining, especially in skeletal and cardiac muscle and brain. A particularly well-studied and potentially important example is the accumulation of mtDNA deletions and COX deficiency observed in neurons of the substantia nigra in Parkinson’s disease patients. The progressive accumulation of ROS has been proposed as the key factor connecting mtDNA mutations with aging and age-related disease pathogenesis (Fig. 481-11).
As noted above, ROS are a by-product of normal oxidative phosphorylation and are removed by detoxifying antioxidants into less harmful moieties; however, environmental factors or mutations that result in exaggerated production of ROS or impaired removal result in ROS accumulation and subsequent cellular injury. One of the main targets for ROS-mediated injury is DNA, and mtDNA is particularly vulnerable because of its proximity to the origin of free radical production, the lack of protective histones, and less efficient injury repair systems compared with nuclear DNA. In turn, accumulation of mtDNA mutations results in inefficient oxidative phosphorylation, with the potential for excessive production of ROS, generating a “vicious cycle” of cumulative mtDNA damage. Indeed, measurement of the oxidative stress biomarker 8-hydroxy-2-deoxyguanosine has been used to measure age-dependent increases in mtDNA oxidative damage at a rate exceeding that of nuclear DNA. It should be noted that mtDNA mutations can potentially occur in postmitotic cells as well, because mtDNA replication is not synchronized with the cell cycle. Two other proposed links between mtDNA mutation and aging, besides ROS-mediated tissue injury, are the perturbations in efficiency of oxidative phosphorylation with disturbed cellular aerobic function and perturbations in apoptotic pathways, whose execution steps involve mitochondrial activity.

FIGURE 481-11 Multiple pathways of mitochondrial DNA (mtDNA) damage and aging. Multiple factors may impinge on the integrity of mitochondria that lead to loss of cell function, apoptosis, and aging. The classic pathway is indicated with blue arrows; the generation of reactive oxygen species (ROS; superoxide anion, hydrogen peroxide, and hydroxyl radicals), as a by-product of mitochondrial oxidative phosphorylation, results in damage to mitochondrial macromolecules, including the mtDNA, with the latter leading to deleterious mutations. When these factors damage the mitochondrial energy-generating apparatus beyond a functional threshold, proteins are released from the mitochondria that activate the caspase pathway, leading to apoptosis, cell death, and aging. (Reproduced with permission from L Loeb et al: The mitochondrial theory of aging and its relationship to reactive oxygen species damage and somatic mt DNA mutations. Proc Natl Acad Sci USA 102 (52):18769-18770, 2005.)
Diagram labels: O2; superoxide; H2O2; OH; H2O; ROS; damaged mitochondrial proteins; error-prone DNA Pol-γ; mutant mitochondrial proteins; decreased DNA repair; DNA mutations; nuclear DNA damage; apoptosis; aging.

Genetic intervention studies in animal models have sought to clarify the potential causative relationship between acquired somatic mtDNA mutation and the aging phenotype and the role of ROS in particular. Replication of the mitochondrial genome is mediated by the activity of the nuclear-encoded POLG. A transgenic homozygous mouse knockin mutation of this gene renders the polymerase enzyme deficient in proofreading and results in a threefold to fivefold increase in mtDNA mutation rate. Such mice develop a premature aging phenotype, which includes subcutaneous lipoatrophy, alopecia, kyphosis, and weight loss with premature death. Although the finding of increased mtDNA mutation and mitochondrial dysfunction with age has been solidly established, the causative role and specific contribution of mitochondrial ROS to aging and age-related disease in humans have yet to be proved. Similarly, although many tumors display higher levels of heterogeneous mtDNA mutations, a causal relationship to tumorigenesis has not been proved. Besides the age-dependent acquired accumulation in somatic cells of heterogeneous point mutations and deletions, a quite different effect of nonheritable and acquired mtDNA mutations has been described affecting tissue stem cells. In particular, disease phenotypes attributed to acquired mtDNA mutation have been observed in sporadic and apparently nonfamilial cases involving a single individual or even a single tissue, usually skeletal muscle. The presentation consists of decreased exercise tolerance and myalgias, sometimes progressing to rhabdomyolysis.
As in the case of the sporadic, heteroplasmic, large-scale deletion, classic syndromes of chronic PEO, Pearson’s syndrome, and KSS, the absence of a maternal inheritance pattern and the finding of limited tissue distribution suggest a molecular pathogenic mechanism emanating from mutations arising de novo in muscle stem cells after germline differentiation (somatic mutations that are not sporadic and occur in tissue-specific stem cells during fetal development or in the postnatal maintenance or postinjury repair stage). Such mutations would be expected to be propagated only within the progeny of that stem cell and affect a singular tissue within a given individual, without evidence of heritability.

PROSPECTS FOR CLINICAL MANAGEMENT OF mtDNA DISEASE

■ ■TREATMENT OF mtDNA DISORDERS

No specific curative treatment for mtDNA disorders is currently available; therefore, the management of mitochondrial disease is largely supportive. Management issues may include early diagnosis and medical management of epilepsy, gastrointestinal dysfunction, weakness, diabetes mellitus, cardiac dysrhythmia, hearing loss, endocrinopathy, ptosis, and cataracts. Rapid identification of subclinical seizures, which may present with focal neurologic signs or even mild mental status changes, is critical in the management of MELAS and other mtDNA disorders associated with epilepsy. The value of aggressive symptom management cannot be overstated. Less specific interventions in the case of other disorders involve combined treatment strategies including dietary intervention and removal of toxic metabolites. Cofactors and vitamin supplements are widely used in the treatment of diseases of mitochondrial oxidative phosphorylation, although there is little evidence, apart from anecdotal reports, to support their use. This includes administration of artificial electron acceptors, including vitamin K3, vitamin C, and ubiquinone (CoQ10); administration of cofactors (coenzymes) including riboflavin, carnitine, and creatine; and use of oxygen radical scavengers, such as vitamin E, copper, selenium, ubiquinone, and idebenone. Drugs that could interfere with mitochondrial function, such as the anesthetic agent propofol, barbiturates, and high doses of valproate, should generally be avoided if possible. The use of valproate in patients with pathogenic mutations in POLG and possibly other mutations affecting mtDNA stability and replication is especially contraindicated. Supplementation with the nitric oxide synthase substrate l-arginine and, more recently, l-citrulline has been advocated as a vasodilator treatment during stroke-like episodes as well as for chronic management in patients with MELAS. Open-label studies demonstrate that l-arginine and l-citrulline may be helpful in reducing the stroke-like symptoms in MELAS but may have serious side effects. As CSF folate deficiency has been reported in some cases of mitochondrial disease, this can be treated with oral folinic acid.

The physician should also be familiar with environmental interactions, such as the strong and consistent association between visual loss in LHON and smoking or ethanol consumption. A clinical penetrance of 93% was found in men who smoked. Asymptomatic carriers of an LHON mtDNA mutation should, therefore, be strongly advised not to smoke and to moderate their alcohol intake. Although not a cure, these interventions might stave off the devastating clinical manifestations of the LHON mutation. Another example is strict avoidance of aminoglycosides in the familial syndrome of ototoxic susceptibility to aminoglycosides in the presence of the mtDNA m.1555A>G mutation of the 12S rRNA encoding gene. Clinical trials using novel agents have been initiated. These agents include elamipretide (Stealth Biotherapeutics) and KL-1333 (Abliva). In an open-label study of α-tocotrienol used to treat 10 children with Leigh’s syndrome, there were improvements in the primary endpoints, including the Newcastle Pediatric Mitochondrial Diseases Scale, the Gross Motor Function Measure, and the PedsQL Neuromuscular Module.

GENETIC COUNSELING, PRENATAL DIAGNOSIS, AND PGD IN mtDNA DISORDERS

The provision of accurate genetic counseling and reproductive options to families with mtDNA mutations is challenging due to the unique genetic features of mtDNA inheritance that distinguish it from Mendelian genetics. mtDNA defects are transmitted by maternal inheritance. mtDNA de novo mutations are often large deletions, affect one family member, and usually represent no significant risk to other members of the family. In contrast, mtDNA point mutations or duplications can be transmitted maternally. Accordingly, the father of an affected individual has no risk of harboring the disease-causing mutation, and a male cannot transmit the mtDNA mutation to his offspring. In contrast, the mother of an affected individual usually harbors the same mutation but may be completely asymptomatic. This wide phenotypic variability is primarily related to the phenomenon of heteroplasmy and the mutation load carried by different members of the same family. Consequently, a symptomatic or asymptomatic female harboring a disease-causing mutation in a heteroplasmic state will transmit to her offspring variable amounts of the mutant mtDNA molecules. The offspring will be symptomatic or asymptomatic primarily according to the mutant load transmitted via the oocyte and, to some extent, subsequent mitotic segregation during development. Interactions with the mtDNA haplotype background or nuclear human genome (as in the case of LHON) serve as an additional important determinant of disease penetrance. Because the severity of the disease phenotype associated with the heteroplasmic mutation load is a function of the stochastic differential segregation and copy number of mutant mtDNA during the oogenesis bottleneck and, subsequently, following tissue and organ development in the offspring, it is rarely predictable with any degree of accuracy. For this reason, prenatal diagnosis (PND) and preimplantation genetic diagnosis (PGD) techniques that have evolved into integral and well-accepted standards of practice are severely hampered in the case of mtDNA-related diseases.

The value of PND and PGD is limited, partly due to the absence of data on the rules that govern the segregation of wild-type and mutant mtDNA species (heteroplasmy) among tissues in the developing embryo. Three factors are required to ensure the reliability of PND and PGD: (1) a close correlation between the mutant load and the disease severity, (2) a uniform distribution of mutant load among tissues, and (3) no major change in mutant load with time. These criteria are suggested to be fulfilled for the NARP m.8993T>G mutation but do not seem to apply to other mtDNA disorders. In fact, the level of mutant mtDNA in a chorionic villus or amniotic fluid sample may be very different from the level in the fetus, and it would be difficult to deduce whether the mutational load in the prenatal samples provides clinically useful information regarding the postnatal and adult state.

■ ■PREVENTION OF MITOCHONDRIAL DISEASE INHERITANCE BY ASSISTED REPRODUCTIVE TECHNOLOGIES

Because the treatment options for patients with mitochondrial disease are rather limited, with no current U.S. Food and Drug Administration (FDA)–approved therapies for established mitochondrial DNA disease, preventive interventions that eliminate the likelihood of transmission of affected mtDNA into offspring are desirable. The poor reliability of prenatal and preimplantation approaches in predicting mitochondrial DNA disease has resulted in the search for alternative preventive approaches. The common purpose underlying various emerging approaches is to reduce mutant heteroplasmy to a level below a pathogenic threshold. This is based on the observed relationship between heteroplasmy and disease inheritance patterns, which indicates that even a small increase in copy number of nonmutant mtDNA molecules in the fertilized egg can exceed the threshold required to ameliorate serious clinical disease. Use of gene editing, for example with clustered regularly interspaced short palindromic repeats (CRISPR) or mitochondrial-targeted TALEN (transcription activator-like effector nucleases) technology, to shift the heteroplasmy load in affected tissues will require future development of corrective gene delivery techniques. Likewise, induced pluripotent stem cell technology has not yet met with widespread success in the preclinical research setting. This has prompted the application of mitochondrial replacement therapy (MRT) approaches (Fig. 481-12). These approaches substitute in vitro the entire oocyte or zygote complement of mitochondria, together with their mtDNA from the carrier mother, with the unaffected complement of mitochondria and their unaffected mtDNA from a donor woman.
This can be accomplished either by removing the carrier mother’s spindle with her nuclear DNA and transferring it into the unfertilized oocyte of the donor or, alternatively, by transferring the pronucleus from the fertilized oocyte of the carrier mother to a fertilized donor oocyte from which the pronucleus has been removed. These approaches provide a “bulk” substitution and hence do not target the specific mtDNA mutation, and they are potentially applicable to a wide variety of mtDNA disorders. This is a form of germline genetic therapy, and therefore, it projects onto future generations in the case of a female offspring. Accordingly, ethical and regulatory bodies have appropriately weighed in on the societal implications of such approaches and have been tentatively supportive of human clinical investigation for situations that would prevent great suffering and when the clinical need is clear and unambiguous, subject to specified conditions and principles and subject to ethical scrutiny. Several such studies have been initiated, and careful examination and follow-up are needed to determine developmental and longer-term health and fertility of children who have undergone genetic manipulation at the earliest stages of human development and whose nuclear and mtDNA genomes have separate maternal origins. It has been recommended that such studies be limited to male offspring, who cannot then transmit the donor mtDNA to future generations, until such time as the health, ethical, and societal issues are well understood and live up to the exciting promise of reducing the burden of clinical mtDNA disease in the future.

FIGURE 481-12 Mitochondrial replacement techniques: maternal spindle transfer and pronuclear transfer. In both procedures, some mutant mtDNA, estimated at 1–2%, might be carried over together with the spindle or pronucleus, but the levels are low enough to avoid disease risk. IVF, in vitro fertilization. (From MJ Falk et al: Mitochondrial replacement techniques—Implications for the clinical community. N Engl J Med 374:1103, 2016. Copyright © 2016, Massachusetts Medical Society. Reprinted with permission from Massachusetts Medical Society.)


04 - 482 Telomere Disease

482 Telomere Disease

Rodrigo T. Calado, Neal S. Young

Telomere Disease

■ ■DEFINITION

In telomere diseases (telomeropathies or telomere biology disorders), organ dysfunction is caused by excessive loss of the ends of chromosomes, a process termed accelerated telomere attrition, which results from germline mutations in genes involved in telomere maintenance. Inadequate repair or insufficient protection of telomeres and their resulting accelerated erosion induces cell death, deficient cell proliferation, and chromosome instability; affected tissues show defective regeneration, fibrosis or replacement by fat, and a proclivity for cancer. Bone marrow, lungs, liver, and skin especially suffer accelerated telomere loss and dysfunction. Telomeres shorten in humans at an average rate of 50 base pairs/year as measured in peripheral blood leukocytes. However, normal aging is not associated with developing manifestations of a telomere disease. In normal aging, sufficient stem cell number and function are maintained to sustain vital processes. Telomere length associates with life span in the general population. While shorter leukocyte telomeres correlate with increased mortality risk, especially from nonmalignant causes, telomere loss is not established as the cause of physiologic aging. Long telomeres due to rare inherited mutations do associate with clonal hematopoiesis and predisposition to cancer.

■ ■DISEASE MECHANISM

Telomeres, the physical termini of linear chromosomes, are repeated hexanucleotide sequences physically associated with specific proteins. Telomeres protect the chromosome ends against recognition as damaged or infectious DNA by the DNA repair machinery (Fig. 482-1). During mitosis, the DNA polymerase employs an RNA oligonucleotide with a 3′ hydroxyl group to prime replication.
The primer dissociates as the DNA polymerase advances along the template strand, and a gap is left at the ends of linear DNA molecules: the newly synthesized DNA strand is necessarily shorter than the original template—the “end-replication problem.” Chromosome erosion is thus inevitable with mitotic cell division, but the repetitive structure buffers the loss of genetic infor mation. In human cells, telomeres consist of hundreds to thousands of TTAGGG tandem repeats in the leading and CCCTAA in the lag ging DNA strand. At birth, telomeres are long, but they inexorably shorten with aging (Fig. 482-1). In an individual cell, critically short telomere length triggers the p53 pathway, leading to proliferative arrest and apoptosis. Telomere loss is the molecular basis for the “Hayflick phenomenon,” the limit to cell division. If a cell overcomes proliferation arrest, extremely short telomeres engage the DNA dam age repair machinery, and chromosome end-to-end fusions, chromo some breaks, aneuploidy, and chromosome instability may occur. In addition to the telomere repeated sequences, a group of specialized proteins, collectively termed shelterins, directly bind to or indirectly associate with telomeres, assisting in the organization of the telomere tertiary structure and inhibiting the activity of DNA damage response proteins (Fig. 482-1).
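As a rough arithmetic check on the figures quoted in this chapter, the sketch below treats leukocyte telomere shortening as strictly linear with age. The function name and the default values (10.5 kb at birth, 50 base pairs lost per year) are illustrative assumptions, chosen as midpoints of the 10–11 kb and 40–60 base pairs/year ranges given in the legend to Fig. 482-1; real attrition varies widely between individuals and need not be linear.

```python
def expected_telomere_length_kb(age_years, birth_length_kb=10.5, loss_bp_per_year=50.0):
    """Linear estimate of average leukocyte telomere length (kb) at a given age.

    Assumption: a constant yearly loss; the defaults are illustrative midpoints,
    not reference values from any laboratory.
    """
    if age_years < 0:
        raise ValueError("age_years must be non-negative")
    lost_kb = loss_bp_per_year * age_years / 1000.0  # base pairs -> kilobases
    return birth_length_kb - lost_kb


if __name__ == "__main__":
    for age in (0, 30, 60, 90):
        print(f"age {age:2d} y: ~{expected_telomere_length_kb(age):.1f} kb")
```

With these defaults the estimate at age 90 years is 6.0 kb, at the lower edge of the 6–7 kb range quoted for that age, so the stated rate and the stated endpoints are mutually consistent.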

To escape telomere attrition, cells with high proliferative demand, including embryonic and adult stem cells, lymphocytes, and most cancer cells, preserve telomere length by the synthesis of telomeric repeats. Telomerase adds GTTAGG hexanucleotides to the 3′ end of the leading DNA strand using a reverse transcriptase enzyme (TERT), and TERC, its RNA template (Fig. 482-1). The telomerase holoenzyme complex comprises two copies of TERT, TERC, and dyskerin and associated proteins. TERC binds to TERT and serves as the RNA template for its function as a reverse transcriptase. Dyskerin, encoded by DKC1, stabilizes the complex, and TCAB1, encoded by the WRAP53 gene, aids telomerase trafficking to the Cajal bodies, nuclear structures for ribonucleoprotein processing where telomerase associates with telomeres for elongation. Telomerase expression is tightly regulated: MYC, sex hormones, and many additional factors stimulate TERT transcription. In mature cells, the TERT gene is highly repressed. In addition, shelterin proteins (such as POT1) regulate telomerase function and processivity (the ability to consecutively synthesize telomeric repeats in a single interaction with the telomere), modulating its catalytic activity on telomeres. Other proteins also are necessary for telomere length maintenance. RTEL1, a DNA helicase, dismantles T loops and resolves G-quadruplexes, ensuring adequate telomere elongation.

Pathologic accelerated telomere attrition has a genetic origin. Germline loss-of-function mutations in genes involved in telomere biology impair telomere repair and increase the rate of telomere erosion in highly proliferative cells, whose telomeres rapidly reach critically short lengths. The consequences are limited cell proliferation and impaired tissue regeneration. Some organs appear to be particularly susceptible to telomere erosion. Billions of blood cells are produced daily (Chap. 101), and telomere attrition curtails cell proliferation, producing a hypocellular marrow and low blood counts. Lymphocytes also have high proliferative capacity and activate telomerase, but when telomeres are very short, B and T cells have an aged, more exhausted, and proinflammatory profile, and immunodeficiency may occur. The liver is also an organ with high proliferative capacity, and telomere dysfunction impairs hepatic regeneration after injury, with a variety of pathologic consequences. The lung alveolar epithelium is in contact with exogenous toxins that stimulate regeneration, and telomere loss may cooperate with other events to hinder these physiologic responses. However, it remains unclear why other regenerative tissues, like the intestine, are less affected by telomere dysfunction, or by what mechanism telomere loss provokes a fibrotic response in the lungs (pulmonary fibrosis), an adipose response in the marrow (aplastic anemia), and both in the liver (hepatic steatosis and cirrhosis). These phenotypes appear to result from a combination of genetic, epigenetic, and environmental determinants.

When telomeres are critically short, the DNA damage response machinery may be recruited, mistaking telomeres for damaged or infectious DNA and forcing inappropriate repair. Activation of this pathway may cause chromosome instability due to end-to-end fusion of chromosomes or translocations; these alterations generate genomic instability and potentially malignant clones of cells. That telomere dysfunction increases the risk of cancer development has been demonstrated in murine models of telomerase deficiency, and patients with telomere diseases are prone to develop malignant neoplasms, particularly acute myeloid leukemia and head and neck squamous cell carcinomas.

■ ■GENETICS
The pattern of inheritance may be X-linked, autosomal recessive, or autosomal dominant, and penetrance is highly variable, even within a pedigree.
The genetic architecture may be complex, and affected patients may inherit pathogenic loss-of-function variants in more than one gene involved in the same telomere biology pathway. At least 17 genes have been implicated in the etiology of telomeropathies (Table 482-1).

■ ■CLINICAL MANIFESTATIONS
Presentation of telomere disease in the clinic is highly variable: in the tissues affected, in the severity of organ dysfunction, and in patterns of disease within a pedigree and between families with similar mutations. Within the same family, one individual may be severely affected, while close relatives carrying the same mutation may be asymptomatic with normal laboratory results. Asymptomatic carriers may have subclinical organ dysfunction, which may be detected by directed or specialized testing (reduced forced vital capacity on pulmonary function test, hypocellular bone marrow at biopsy, hepatic steatosis on ultrasound). Somatic genetic rescue, a rare spontaneous genetic event in a somatic cell that confers a selective advantage and annuls the effect of the original pathogenic germline mutation, may occur in telomere diseases, mitigating the phenotype. Clonal hematopoiesis may be maladaptive, as acquired mutations in a specific set of genes (e.g., TP53) lead to myeloid neoplasms. Environmental regenerative stresses, including factors such as smoking, alcohol consumption, and viral infection, may increase susceptibility to organ damage and contribute to disease heterogeneity. Disease anticipation, in which clinical phenotype manifests at an earlier age in successive generations, is observed in some families with telomeropathies due to the direct inheritance of short telomeres in sperm and oocytes.

FIGURE 482-1 Telomeres and telomerase. A. Telomeres are ribonucleoprotein structures located at the termini of linear chromosomes inside the cell nucleus composed of hundreds of tandem hexameric DNA repeats. A group of proteins bind directly or indirectly to telomere sequences in order to provide protection to the structure and are collectively termed shelterin or telosome (TRF1, TRF2, TIN2, POT1, TPP1, and RAP1). As the 3′ end of the leading strand forms a single-stranded overhang, it folds back and invades the telomeric double helix, forming a lariat termed T loop. The telomerase complex is composed of the enzyme telomerase reverse transcriptase (TERT), its RNA component (TERC), the protein dyskerin, and associated proteins (NHP2, NOP10, and GAR). This enzymatic complex elongates telomeres by adding GTTAGG hexameric repeats to the 3′ end of the telomeric leading strand, using a sequence in TERC as the template. B. Plot of telomere length (kb) against age (years). The average telomere length in human leukocytes varies: it is longer at birth (10–11 kilobases) and progressively shortens with aging (6–7 kilobases at age 90 years) at an average loss of 40–60 base pairs/year. However, there is significant variability in telomere length at any given age.

The diagnosis of a telomere disease is suggested by personal and family history, strengthened by simple measurement of leukocyte telomere length, and usually definitively established by next-generation sequencing of genes encoding the telomere repair enzyme complex and shelterin components. In a machine-learning tool that differentiates acquired immune aplastic anemia from inherited bone marrow failure syndromes, telomere length is a top predictor.

Dyskeratosis Congenita
Dyskeratosis congenita is the classic telomere disease of childhood and is usually diagnosed in the first two decades of life. Affected children often have at least two features of the mucocutaneous triad of ungual dystrophy, reticular skin pigmentation, and oral leukoplakia (Fig. 482-2). In more severe syndromes, affected newborns have cerebellar hypoplasia (Hoyeraal-Hreidarsson syndrome) or exudative retinopathy (Revesz syndrome) (Fig. 482-3). Telomeres are usually extremely short, below the first percentile expected for age (Fig. 482-4). Most patients with dyskeratosis congenita develop bone marrow failure, often requiring transfusions and, ultimately, bone marrow transplant. Pulmonary fibrosis appears in as many as 20% of cases and liver disease in 10%, often after bone marrow transplant for hematopoietic failure. Other tissues and organs also may be affected (Fig. 482-3). The genes most commonly mutated in dyskeratosis congenita patients are DKC1, TINF2, TERT, TERC, and RTEL1, and triallelic inheritance (involving two genes in the same pathway) also may occur (Table 482-1).

Bone Marrow Failure
Aplastic anemia (Chap. 107) is the most common major clinical manifestation of dyskeratosis congenita. However, young or older patients carrying a telomere defect, without typical physical stigmata, can also develop marrow failure.
Genetic variants usually are monoallelic (one mutated allele and one wild-type allele), resulting in haploinsufficiency, and TERT, TERC, and RTEL1 are the genes usually affected. Telomere loss in these cases is often less intense than in classic dyskeratosis congenita. As a result of inadequate telomerase function, the stem cell pool is limited in size and in its ability to regenerate, leading to marrow hypocellularity and insufficient production of erythrocytes, platelets, and granulocytes (Fig. 482-5). The most typical presentation is moderate aplastic anemia after a long history of macrocytic mild to moderate anemia and/or thrombocytopenia, with preservation of leukocyte numbers. A comprehensive personal and family history is important, querying especially for blood count abnormalities and cytopenia as well as lung and liver disease; early hair graying, while not specific to telomeropathies, strongly suggests telomere disease in the appropriate context.

Myeloid Neoplasms
Some patients diagnosed with myelodysplastic syndrome (Chap. 107) or acute myeloid leukemia (Chap. 109) have a family history of bone marrow failure or other myeloid neoplasms. One of the genetic causes for myeloid neoplasia predisposition is a telomere defect, and these disorders are now classified together by the World Health Organization as “myeloid neoplasms associated with telomere biology disorders.” Telomere length measurement may be confounded by circulating immature cells, which may have very short telomeres, precluding accurate test interpretation.

Pulmonary Fibrosis
Pulmonary fibrosis appears in ~20% of children with dyskeratosis congenita. Conversely, ~10–15% of patients with idiopathic pulmonary fibrosis (Chap. 304) or familial pulmonary

TABLE 482-1 Genetic Variants in 13 Genes Involved in Telomere Maintenance, Inheritance Pattern, and Phenotype
GENE; DYSKERATOSIS CONGENITA; APLASTIC ANEMIA; PULMONARY FIBROSIS; CIRRHOSIS; MDS/LEUKEMIA
Telomerase
DKC1  XL
TERT  AD/AR  AD/AR  AD  AD  AD/AR
TERC  AD/AR  AD  AD  AD  AD
NOP10  AR
NHP2  AR
WRAP53  AR
Shelterin
TINF2  AD  AD  AD
TERF2  AD
ACD  AD
Others
RTEL1  AR  AD/AR  AD  AD
CTC1  AR  AR
PARN  AD
USB1  AD
ZCCHC8  AD  AD
NAF1  AD
Abbreviations: AD, autosomal dominant; AR, autosomal recessive; MDS, myelodysplastic syndrome; XL, X-linked.

FIGURE 482-2 Skin manifestations of dyskeratosis congenita. The pediatric syndrome dyskeratosis congenita is characterized by the mucocutaneous triad of (A) reticular skin pigmentation, (B) oral leukoplakia, and (C, D) nail dystrophy.

fibrosis have an etiologic telomerase gene mutation. Regardless of mutation status, most pulmonary fibrosis patients have short telomeres for their age but not as short as in dyskeratosis congenita. How telomere erosion causes pulmonary fibrosis is unclear, but it might prevent adequate proliferation and regeneration of type II pneumocytes. Idiopathic pulmonary fibrosis due to a telomere disease usually appears after the fourth decade of life, with a restrictive pattern on pulmonary function testing associated with decreased diffusion capacity for carbon monoxide (DLCO) and a diffuse “honeycomb” appearance on high-resolution computed tomography (CT) (Fig. 482-5). Histopathology of biopsied lung shows interstitial pneumonia. The pulmonary clinical presentation in telomere disease is indistinguishable from

idiopathic pulmonary fibrosis, except that those with an underlying telomere defect may have cryptic hepatic cirrhosis, macrocytosis, cytopenias, and a family history of lung, liver, or bone marrow disease. Pulmonary arteriovenous malformation leading to right-to-left shunting is observed in patients with pulmonary fibrosis due to telomere disease. Patients with idiopathic pulmonary fibrosis or familial pulmonary fibrosis should have leukocyte telomere length assayed and, if telomeres are short, undergo screening for mutations in telomere-associated genes and telomeres; telomere length may be normal in some cases despite the presence of pathogenic mutations. TERT, TERC, RTEL1, and PARN are the most commonly mutated genes.

Liver Disease
Genetic telomere defects may cause hepatic cirrhosis (Chap. 355), nodular regenerative hyperplasia of the liver, nonalcoholic fatty liver disease (Chap. 354), and hepatocellular carcinoma (Chap. 87). Hepatocytes of most patients with cirrhosis have very short telomeres. Eroded telomeres limit hepatocyte proliferation, especially upon chronic injury. Additionally, hepatocytes with short telomeres display abnormal metabolic patterns and defective mitochondrial function. Abnormal liver pathology may be uncovered incidentally during the evaluation of telomeropathy patients with aplastic anemia or pulmonary fibrosis, but cirrhosis also may be the sole or most prominent clinical presentation of a telomere defect. A minority of individuals with cirrhosis associated with hepatitis B or C virus infection or alcohol-associated liver disease may carry a telomere-associated gene mutation. Liver histopathology is variable, but cirrhosis is usually associated with inflammation (Fig. 482-5), increased iron deposition, positivity for CD34 in sinusoid endothelial cells, and widening of hepatocyte plates. Defective telomere maintenance may increase the susceptibility of the liver to environmental challenges, such as alcohol and viruses, increasing the risk for developing severe hepatic disease in mutation carriers.

■ ■LONG TELOMERE SYNDROME
POT1 is a shelterin component that modulates telomerase access to telomeres, thus regulating telomere elongation. Heterozygous germline loss-of-function mutations in the POT1 gene cause excessively long telomeres that clinically translate into clonal hematopoiesis and a predisposition to benign and malignant tumors. Very long telomeres appear to augment the capacity for cell proliferation, facilitating the acquisition of harmful driver somatic mutations.

■ ■TELOMERE LENGTH MEASUREMENT
Length of telomeres can be accurately measured in peripheral blood leukocytes by commercial laboratories. Of several methods available, flow-fluorescence in situ hybridization (FISH) and quantitative real-time polymerase chain reaction (qPCR) are most widely

FIGURE 482-3 Clinical consequences of telomere diseases. Telomere dysfunction affects a variety of organs: cerebellum, eyes, lungs, liver, skin, gastrointestinal tract, and the bone marrow.

utilized. Both methods have advantages and limitations and require high-quality samples, usually fresh or freshly processed, as cell death and DNA degradation impact the accuracy of testing. Results are usually expressed as leukocyte telomere length in kilobases. However, the interpretation of length must account for patient age, due to physiologic telomere loss. A normal range for telomeres is available for each year of age, longest at birth and shortening at 40–60 base pairs per year (Fig. 482-1). For each age bracket, the percentile curves are calculated, and a given patient’s test result is interpreted in the context of normal age variation: telomeres below the tenth percentile for age are defined as “short” and telomeres below the first percentile are considered “very short” (Fig. 482-4). Telomeres above the 99th percentile are considered “very long.” Short telomeres may also be present in some chronic conditions, such as cardiovascular disease or diabetes. In these settings, telomere

FIGURE 482-4 Telomere length measurement in telomere diseases. Plot of telomere length (kb) against age (years) for patients with dyskeratosis congenita, aplastic anemia, pulmonary fibrosis, and long telomere syndrome. Telomeres shorten with aging, and solid curves represent the percentiles for age in healthy subjects (1%, 10%, 50%, 90%, and 99%). Telomeres are considered “short” when below the 10th percentile, “very short” when below the first percentile, and “very long” when above the 99th percentile. In patients with dyskeratosis congenita, telomeres are usually below the first percentile, regardless of the gene lesion, whereas in patients with aplastic anemia or pulmonary fibrosis, telomeres are usually below the 10th percentile. In patients with long telomere syndrome, telomeres are usually above the 99th percentile.
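The percentile cutoffs in this legend amount to a simple decision rule, sketched below. The function name is hypothetical, the input is assumed to be a percentile already computed against a laboratory's own age-matched reference curves (which are not reproduced here), and the label returned for results between the 10th and 99th percentiles is an assumption, since the chapter does not name that range.

```python
def classify_telomere_length(percentile_for_age):
    """Map an age-adjusted percentile (0-100) to the categories used in the text."""
    if not 0 <= percentile_for_age <= 100:
        raise ValueError("percentile_for_age must be between 0 and 100")
    if percentile_for_age < 1:
        return "very short"   # below the first percentile for age
    if percentile_for_age < 10:
        return "short"        # below the 10th percentile for age
    if percentile_for_age > 99:
        return "very long"    # above the 99th percentile for age
    # Assumed label: the chapter defines no name for this range.
    return "not short or very long"


if __name__ == "__main__":
    for p in (0.5, 5, 50, 99.5):
        print(p, "->", classify_telomere_length(p))
```

A category alone is not a diagnosis: as noted in the text, telomere length may be normal in some mutation carriers, and short telomeres also occur in chronic conditions unrelated to genetic telomere disease.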

erosion is not thought to be etiologic but rather a secondary consequence of chronic inflammation; telomere testing does not have clinical utility and is not recommended. Likewise, telomere length tests, despite commercial hyperbole, have not been shown to be useful in the assessment of aging and longevity or as a basis for therapeutic interventions, absent a diagnosis of genetic telomere disease.

Flow-FISH uses a fluorescent-labeled nucleotide probe specific for telomere repeats to estimate telomere content in an individual cell. It has the advantage of determining telomere length in individual cells and in leukocyte subpopulations; telomere shortening in lymphocytes is more specific for telomere diseases than shortening in other cell types. A limitation of flow-FISH is the requirement for intact cells for analysis, which are not always available, and neutrophils are susceptible to damage during processing, freezing, and thawing. qPCR uses telomere-binding modified primers to measure telomere content in comparison to a housekeeping gene in the whole leukocyte population and thus does not require intact cells. qPCR provides an estimate of the average telomere length of a given sample without determining telomere length in individual cells. Good DNA quality is essential for adequate qPCR testing, and automation or semi-automation is required for clinical purposes, as variability in conditions among batches may result in interassay variation. The standard Southern blot is very accurate but requires larger amounts of DNA and is labor intensive. Other measures have been developed in research laboratories (single telomere length analysis [STELA], telomere shortest length assay [TESLA]) to assess critically short telomeres in particular.
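The chapter does not give the calculation behind a qPCR result. One commonly used approach for expressing telomere signal relative to a reference gene is the comparative-Ct (2^-ΔΔCt) method, sketched here under stated assumptions: the function and parameter names are hypothetical, a calibrator DNA sample run on the same plate is assumed, and the formula assumes roughly 100% amplification efficiency for both primer pairs. It yields a relative ratio, not a length in kilobases.

```python
def relative_telomere_ratio(ct_telomere_sample, ct_reference_sample,
                            ct_telomere_calibrator, ct_reference_calibrator):
    """Relative telomere-to-reference-gene signal by the comparative-Ct method.

    Assumption: each PCR cycle doubles the product, so a difference of one
    cycle corresponds to a twofold difference in starting template.
    """
    delta_sample = ct_telomere_sample - ct_reference_sample
    delta_calibrator = ct_telomere_calibrator - ct_reference_calibrator
    return 2.0 ** -(delta_sample - delta_calibrator)


if __name__ == "__main__":
    # Telomere signal crosses threshold one cycle later than in the calibrator.
    print(relative_telomere_ratio(16.0, 20.0, 15.0, 20.0))
```

In this example the sample's telomere amplicon reaches threshold one cycle later than the calibrator's while the reference gene is unchanged, giving a ratio of 0.5, that is, about half the calibrator's telomere content. Converting such a ratio to an age-adjusted percentile would require laboratory-specific calibration that is outside this sketch.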
■ ■GENETIC TESTING
When a patient with a suspected telomeropathy has short or very short telomeres, genetic screening for mutations in genes involved in telomere maintenance and biology is indicated (Table 482-1). Genetic testing has been restricted to patients with suspected telomere disorders but is increasingly incorporated into genomic screening in the bone marrow failure syndromes in general, and next-generation sequencing has been routinely used. Mutations may be biallelic or triallelic involving two genes (especially in dyskeratosis congenita), but usually only one allele is mutated. Interpretation of genetic screening results is challenging, as rare singleton polymorphisms of unknown significance have been identified in large cohorts of healthy individuals. In silico analysis, mutation location, and functional studies are helpful to interpret the mutation significance.

05 - 483 Gene and Cell-Based Therapy in Clinical Medicine

483 Gene and Cell-Based Therapy in Clinical Medicine

FIGURE 482-5 Pathologic manifestations of telomere diseases. A. In the bone marrow, telomere erosion predisposes to aplastic anemia, characterized by an empty hematopoietic marrow replaced by fat (hematoxylin and eosin). B. In the liver, telomere attrition predisposes to cirrhosis (hematoxylin and eosin). C. Telomere shortening may also result in nodular regenerative hyperplasia of the liver (reticulin stain). D. In the lungs, telomere dysfunction predisposes to pulmonary fibrosis mainly in the subpleural regions, which may be detected by high-resolution computed tomography scan.

Genetic counseling is necessary after screening, as the inheritance pattern may be autosomal dominant, mutation penetrance is highly variable, and phenotypes may be diverse even within a pedigree. Potential family stem cell donors must be screened before transplantation to ensure that they do not have mutations.

TREATMENT Telomere Disease
Patients with severe aplastic anemia due to telomere disease may undergo allogeneic hematopoietic stem cell transplant when a suitable donor is available. Treatment-related mortality may be increased due to pulmonary and hepatic complications, for which reduced-intensity conditioning regimens appear advantageous. Lung transplant for pulmonary fibrosis is feasible but often not performed due to coexisting cytopenias and other comorbidities. Patients with pulmonary fibrosis associated with telomere disease have a poorer outcome after lung transplant and with nontransplant therapies. Similarly, there is no specific treatment for the liver in telomere disease; liver transplant has been performed in several cases with good outcome, without excessive posttransplant mortality, and with improvement in respiratory status. Telomeropathy patients should be advised to avoid toxins (metal dust, busulfan, amiodarone), ionizing radiation, cigarette smoke, and alcohol, as these may be harmful.
Long-term therapy with androgens may mitigate telomere attrition and even elongate leukocyte telomeres in humans. In research trials, danazol and nandrolone improved blood counts in marrow failure patients and reduced transfusion requirements.

■ ■FURTHER READING
Blackburn EH et al: Human telomere biology: A contributory and interactive factor in aging, disease risks, and protection. Science 350:1193, 2015.
Calado RT, Young NS: Telomere diseases. N Engl J Med 361:2353, 2009.
Carvalho VS et al: Recent advances in understanding telomere diseases. Fac Rev 11:31, 2022.
Clé DV et al: Effects of nandrolone decanoate on telomere length and clinical outcome in patients with telomeropathies: A prospective trial. Haematologica 108:1300, 2023.
Collins J, Dokal I: Inherited bone marrow failure syndromes. Hematology 20:433, 2015.
DeBoy EA et al: Familial clonal hematopoiesis in a long telomere syndrome. N Engl J Med 388:2422, 2023.
Devine MS, Garcia CK: Genetic interstitial lung disease. Clin Chest Med 33:95, 2012.
Gutierrez-Rodrigues F et al: Differential diagnosis of bone marrow failure syndromes guided by machine learning. Blood 141:2100, 2023.
Townsley DM et al: Danazol treatment for telomere diseases. N Engl J Med 374:1922, 2016.
Wang M et al: Liver disease and transplantation in telomere biology disorders: An international multicenter cohort. Hepatol Commun 8:e0462, 2024.

Katherine A. High, Marcela V. Maus

Gene and Cell-Based Therapy in Clinical Medicine

Gene therapy is a novel area of therapeutics in which the active agent is a nucleic acid sequence rather than a protein or small molecule. One of the most powerful concepts in modern molecular medicine, gene therapy has the potential to address a host of diseases for which there are currently no available treatments. Because delivery of naked DNA or RNA to a cell is an inefficient process, most gene therapy is carried out using a vector, or gene delivery vehicle, typically engineered from viruses by deleting some or all of the viral genome and replacing it with the therapeutic gene of interest under the control of a suitable promoter. Nonviral delivery vehicles such as lipid nanoparticles are increasingly being used (Table 483-1). Gene therapy strategies can thus be described in terms of three essential elements: (1) a gene delivery vehicle; (2) a gene to be delivered, sometimes called the transgene; and (3) a physiologically relevant target cell to which the DNA or RNA is delivered. The series of steps in which the vector and donated DNA or RNA enter the target cell and express the transgene is referred to as transduction. Gene delivery can take place in vivo, in which the vector is directly injected into the patient, or, in the case of hematopoietic, liver, immune, and some other target cells, ex vivo, with removal of the target cells from the patient, followed by return of the gene-modified autologous cells to the patient after manipulation in the laboratory. The latter approach effectively combines gene transfer techniques with cellular therapies.

In the past few years, gene therapy for genetic disease has moved from addressing ultra-rare inherited diseases to more common ones including sickle cell disease and hemophilia. Similarly, chimeric antigen receptor (CAR) T cells have expanded beyond hematologic malignancies to address solid tumors and autoimmune diseases (currently investigational products). Therapeutic approaches have expanded from gene transfer to gene editing, and RNA-based therapies have gained ground rapidly, partly as a key component of CRISPR-based gene editing (guide RNAs and mRNAs encoding editing enzymes) and as the active agent in multiple SARS-CoV-2 vaccines. This chapter will focus primarily on approved therapies (Table 483-2), with some discussion of investigational therapies in late-phase development and earlier trials critical to the development of the field.

Clinical trials of gene therapy have been under way since 1990; the first gene therapy product to be licensed in the United States or Europe was approved in 2012 (see below). Given that vector-mediated gene therapy is arguably one of the most complex therapeutics yet developed, typically consisting of both a nucleic acid and a protein component, this time course from first clinical trial to licensed product is noteworthy for being similar to those seen with other novel classes of therapeutics, e.g., monoclonal antibodies or bone marrow transplantation. Thousands of people have now received approved products or participated in investigational studies of gene transfer. Potential adverse events, predicted based on first principles (Table 483-3), have occurred but have been rare. Some of the initial trials were characterized by an overabundance of optimism and a failure to be appropriately critical of preclinical studies in animals; in addition, it was sometimes not fully appreciated that animal studies are only a partial guide to safety profiles of products in humans (e.g., in the setting of insertional mutagenesis or human immune responses to the vector). Clinical experience and laboratory research led to a more nuanced understanding of the actual risks (Table 483-4) and dramatic benefits of these new therapies and to more sophisticated selection of disease targets. Currently, gene therapies are being developed for a variety of disease entities. Critical aspects of the history to be assessed when evaluating a patient who has received a gene therapy product (or investigational agent) are outlined in Table 483-5.

TABLE 483-1 Characteristics of Commonly Used Gene Delivery Vehicles
FEATURES; VIRAL BASE: RETROVIRAL/LENTIVIRAL; ADENOVIRAL; AAV; NONVIRAL: LIPID NANOPARTICLES
Genome: RNA; DNA; DNA; RNA
Cell division requirement: G1 phase; No; No; No
Packaging limitation: 8 kb; 8–30 kb; 5 kb; 10 kb or more
Immune responses to vector: Extensive; Few; Few
Genome integration: Yes; Poor; Poor; May be used to package either RNA or DNA
Long-term expression: Yes; No; Yes; Transient for RNA
Main advantages: Persistent gene transfer in transduced tissues; Highly effective in transducing various tissues; Elicits few inflammatory responses, nonpathogenic; RNA expressed transiently
Main disadvantages: Might induce oncogenesis in some cases, only used ex vivo; Viral capsid elicits strong immune responses; Limited packaging capacity; Immunogenic, predominantly targets the liver, challenging manufacturing process

GENE TRANSFER AND GENE EDITING FOR GENETIC DISEASES
Most approved gene therapies for genetic diseases involve gene addition therapy. Recently the first gene editing products, for sickle cell disease and β-thalassemia, have been approved. Gene therapy strategies generally involve transfer of the missing gene to a physiologically relevant target cell. However, other strategies are possible, including supplying a truncated form of the gene with comparable biological activity (e.g., a gene encoding B domain-deleted FVIII for hemophilia A or microdystrophin for Duchenne muscular dystrophy); supplying a gene that achieves a similar biologic effect through an alternative pathway (e.g., utrophin in place of dystrophin for Duchenne muscular dystrophy); or downregulating a harmful effect through a small interfering or short hairpin RNA. From a therapeutic standpoint, gene therapies for genetic diseases fall into two distinct categories: (1) they may provide treatment for diseases that have hitherto lacked any pharmacologic therapies; or (2) they may provide an alternative to complex medical regimens that are frequently characterized by significant nonadherence due to the burden of treatment (e.g., monthly red blood cell transfusions and iron chelation in transfusion-dependent β-thalassemia).

Gene therapy for genetic disease requires long-term expression of the transgene. Two distinct strategies are available to achieve this goal: one is to transduce stem cells with an integrating vector, so that all progeny cells will carry the donated gene; and the other is to transduce long-lived, postmitotic cells, such as skeletal muscle or neurons. In the case of long-lived cells, integration into the target cell genome is unnecessary. Instead, because the cells are nondividing, the donated DNA, even if stabilized predominantly in an episomal form, will give rise to expression for the life of the cell. This latter approach mitigates risks related to integration and insertional mutagenesis.
CRISPR/Cas9-based gene editing differs from gene therapy in that the therapeutic moiety utilizes a bacterial Cas9 enzyme and a guide RNA to introduce a double strand break (DSB) at a specific site in the DNA. When co-delivered with an appropriately designed gene-targeting vector, gene insertion can take place at the site of the DSB, resulting in a corrected sequence at the endogenous locus of a gene. This permanent correction in the genome is thus under the control of endogenous regulatory signals and will be passed to every daughter cell. In practice, the Cas9 enzyme is complexed with the guide RNA to form a ribonucleoprotein (RNP), which is then delivered to the cell, where the guide RNA directs the RNP complex to the genome target, introducing the DSB at a precise locus. In the absence of a gene-targeting vector, the DSB will be repaired using nonhomologous end joining (NHEJ), which typically results in small deletions or insertions, disrupting the expression of the target gene. In the presence of a targeting vector, repair at the DSB may occur by homology-directed repair, resulting in gene replacement at the targeted locus. With currently available systems, the

TABLE 483-2 Currently Approved Gene and Cell Therapy Products in North America and/or Europe

| PRODUCT | INDICATION | AGE GROUP | WHERE APPROVED | VECTOR | TRANSGENE | TARGET TISSUE |
|---|---|---|---|---|---|---|
| Strimvelis®(a) | ADA-SCID | Pediatric | Europe | Retroviral | ADA (adenosine deaminase) | Autologous hematopoietic stem cells (HSCs) |
| Kymriah® (tisagenlecleucel) | Relapsed or refractory (R/R) B-cell acute lymphoblastic leukemia (pediatric); R/R large B-cell lymphoma (adult); third-line follicular lymphoma | Pediatric and adult, different disease indications | United States, Europe, China, Japan | Lentiviral | CAR directed to CD19 with 4-1BB signaling domain | Autologous T cells |
| Yescarta® (axicabtagene ciloleucel) | R/R and second-line large B-cell lymphomas; third-line follicular lymphoma | Adult | United States, Europe, Japan | Retroviral | CAR directed to CD19 with CD28 signaling domain | Autologous T cells |
| Luxturna® (voretigene neparvovec) | Confirmed biallelic RPE65 mutation–associated retinal dystrophy | Pediatric and adult | United States and Europe | AAV2 | RPE65 (retinal pigment epithelial 65 kD protein) | Retinal pigment epithelial cells by single subretinal injection |
| Zolgensma® (onasemnogene abeparvovec) | Spinal muscular atrophy type 1 due to biallelic mutations in the SMN1 gene | Pediatric, <2 years of age | United States and Europe | AAV9 | SMN1 (survival motor neuron 1) | Spinal motor neurons by single IV infusion |
| Zynteglo® (betibeglogene autotemcel) | Transfusion-dependent β thalassemia; sickle cell disease | Adults and pediatric ≥12 years of age | Europe and United States | Lentiviral | βA-T87Q globin gene | Autologous HSCs |
| Libmeldy®(b) (aditarsagene autotemcel) | Metachromatic leukodystrophy due to biallelic mutations in the arylsulfatase A gene | Pediatric | Europe, United States | Lentiviral | ARSA (arylsulfatase A) | Autologous HSCs |
| Tecartus® (brexucabtagene autoleucel) | R/R mantle cell lymphoma; R/R B-cell acute lymphoblastic leukemia | Adults | United States and Europe | Retroviral | Same molecular construct as axicabtagene | Autologous T cells |
| Breyanzi® (lisocabtagene maraleucel) | R/R and second-line large B-cell lymphoma | Adult | United States, Europe, and Japan | Lentiviral | CAR directed at CD19 with 4-1BB signaling domain; CD4 and CD8 T-cell products manufactured and infused separately | Autologous T cells |
| Abecma® (idecabtagene vicleucel) | Fifth-line treatment for multiple myeloma | Adult | United States and Europe | Lentiviral | CAR directed to B-cell maturation antigen (BCMA); 4-1BB signaling domain | Autologous T cells |
| Carvykti® (ciltacabtagene autoleucel) | Fifth-line treatment for multiple myeloma | Adult | United States and Europe | Lentiviral | CAR directed to BCMA with two single-domain antibodies; 4-1BB signaling domain | Autologous T cells |
| Skysona® (elivaldogene autotemcel) | Early active cerebral adrenoleukodystrophy | Boys age 4–17 | Europe and United States | Lentiviral | Adenosine triphosphate–binding cassette, subfamily D, member 1 (ABCD1) | Autologous HSCs |
| Upstaza® (eladocagene exuparvovec) | Confirmed AADC deficiency with severe phenotype | Children 18 months and older | Europe | AAV2 | Human aromatic L-amino acid decarboxylase (AADC) | Cells in putamen via single neurosurgical procedure |
| Roctavian® (valoctocogene roxaparvovec) | Severe hemophilia A and no history of inhibitors | Adults | Europe and United States | AAV5 | cDNA encoding human factor VIII, B domain-deleted, SQ form | Hepatocytes via single IV infusion |
| Hemgenix® (etranacogene dezaparvovec) | Severe or moderately severe hemophilia B | Adults | Europe and United States | AAV5 | cDNA encoding factor IX Padua | Hepatocytes via single IV infusion |
| Vyjuvek® (beremagene geperpavec) | Dystrophic epidermolysis bullosa due to mutations in COL7A1 | Age 6 months and older | United States | Herpes simplex viral vector | Collagen type VII alpha 1 chain (COL7A1) | Keratinocytes and fibroblasts at sites of lesions |
| Elevidys® (delandistrogene moxeparvovec) | Duchenne muscular dystrophy | Ages 4–5 | United States | AAVrh74 | cDNA encoding microdystrophin | Skeletal muscle via single IV infusion |
| Casgevy® (exagamglogene autotemcel) | Sickle cell anemia and β thalassemia | Ages 12 and up | United States and Europe | Gene editing | Inactivates BCL11a in RBCs | Autologous HSCs |
| Beqvez® (fidanacogene elaparvovec) | Severe or moderately severe hemophilia B | Adults | Canada, United States and Europe | AAVrh74 variant | cDNA encoding factor IX Padua | Hepatocytes via single IV infusion |

(a) Autologous CD34+-enriched cell fraction that contains CD34+ cells transduced with retroviral vector that encodes for the human ADA cDNA sequence.
(b) Autologous CD34+ cells encoding arylsulfatase A.
Abbreviations: AAV, adeno-associated virus; ADA-SCID, adenosine deaminase severe combined immunodeficiency; CAR, chimeric antigen receptor; RBC, red blood cell.

first (cleavage) step is much more efficient than the second (targeting) step. The only approved gene editing product requires only the cleavage step (vide infra), and the same is true for most investigational products that have been published as clinical studies.

TABLE 483-3 Potential Complications of Gene Therapy
Gene silencing: repression of promoter
Genotoxicity: complications arising from insertional mutagenesis, or acceleration of malignant transformation in a cell on the path to oncogenesis before transduction (i.e., CAR introduced into a premalignant T cell)
Phenotoxicity: complications arising from overexpression or ectopic expression of the transgene
Immunotoxicity: harmful immune response to either the vector or transgene, or a harmful immune response of the vector (e.g., CAR T cells)
Risks of horizontal transmission: shedding of infectious vector into environment
Risks of vertical transmission: germline transmission of donated DNA
Abbreviation: CAR, chimeric antigen receptor.

■ ■EX VIVO GENE TRANSFER
Early attempts to effect gene replacement into hematopoietic stem cells (HSCs) were stymied by the relatively low transduction efficiency of retroviral vectors, which require dividing target cells for integration. Because HSCs are normally quiescent, they are a formidable transduction target. However, identification of cytokines that induce cell division without promoting differentiation of stem cells, along with technical improvements in the isolation and transduction of HSCs, led to modest but real gains in transduction efficiency.

TABLE 483-4 Adverse Events in Gene Therapy and Gene Editing

| VECTOR OR TREATMENT MODALITY | SYMPTOM OR LABORATORY FINDING | MECHANISM | DOSE DEPENDENCE | MITIGATION STRATEGIES |
|---|---|---|---|---|
| Retroviral or lentiviral vectors | Malignancy(a) | Insertional mutagenesis(a) | Yes for retroviral vectors | Less frequent with lentiviral vectors, likely because of differences in integration patterns |
| AAV | Vector sequences in semen, risk of germline transmission | Based on animal studies, present in prostatic fluid but not in gametes | Yes | Barrier birth control until three sequential semen samples are negative for vector DNA |
| AAV | Immune responses directed to capsid, sometimes accompanied by loss of expression | Memory T cells directed to vector capsid in humans, who are natural hosts for wild-type AAV | Yes | Reduce doses; administer immunomodulatory agents (short-term) to reduce or ablate response |
| AAV | Thrombotic microangiopathy with high-dose systemic infusion | Rapid rise in antibodies to AAV, formation of antigen-antibody complexes, triggering of complement activation | Yes | Has responded to therapy with complement inhibitors including eculizumab |
| Ex vivo and in vivo genome editing | Off-target cleavage resulting in unintended gene silencing | Guide RNA lacks requisite specificity | Likely | Preclinical assessment for off-target effects; long-term follow-up of trial participants and patients |
| In vivo genome editing | Liver-directed in vivo editing has shown mild and transient transaminase elevations but excellent efficacy at doses studied clinically | Possibly immune responses to bacterial proteins in editing machinery, potentially resulting in loss of edited cells | Yes | Careful dose-ranging studies in early-phase testing; consider short-term immunomodulatory agents if needed at higher doses |
| CAR-T therapy | Cytokine release syndrome: fever, hypotension, tachycardia, hypoxia, multiorgan failure | Systemic inflammatory response caused by cytokines released by CAR T cells | Possibly | Tocilizumab/corticosteroids |
| CAR-T therapy | Neurotoxicity: cerebral edema and encephalopathy | Peripheral immune overactivation, endothelial activation-induced blood-brain barrier dysfunction, CNS inflammation | Possibly | Avoid seizure-threshold-lowering medications in early phase of treatment; treat with dexamethasone as early as possible (use specific management guidelines) |
| CAR-T therapy | Immunodeficiency (hypogammaglobulinemia and susceptibility to viral infections) | On-target effect against B cells and/or plasma cells; preparatory lymphodepleting chemotherapy regimen also contributes | No | Prophylaxis for opportunistic infections (PJP, VZV) for at least 1 year; vaccination schedule (specific guidelines) |
| CAR-T therapy | New T-cell malignancy | Insertional mutagenesis or potentially chronic activation due to new transgene | No | Report to FDA and manufacturer; consider activation of suicide gene if present in the transgene expression cassette; treat per standard guidelines |

(a) In target cells, either hematopoietic stem cells or T cells.
Abbreviations: AAV, adeno-associated virus; CAR, chimeric antigen receptor; CAR-T, chimeric antigen receptor T cell; CNS, central nervous system; FDA, U.S. Food and Drug Administration; PJP, Pneumocystis jirovecii pneumonia; VZV, varicella-zoster virus.

TABLE 483-5 Taking History from Patients Who Have Received Gene Therapies or Gene Editing
Elements of History for Patients Who Received Gene Therapy (or Have Participated in Trials)
What vector was administered? Is it predominantly integrating (retroviral, lentiviral, herpesvirus, or gene editing) or nonintegrating (plasmid, adenoviral, adeno-associated viral)?
What were the dose and the route of administration of the vector?
What was the target tissue?
What gene was transferred in? The gene that is defective in the patient’s disease? A truncated version? A gene encoding a different protein with similar properties? A knockdown approach?
Were there any adverse events noted after gene transfer?
Screening Questions for Long-Term Follow-Up in Gene Transfer Subjects(a)
Has a new malignancy been diagnosed? If so, clinicians should contact the manufacturer to report the event and obtain instructions on the collection of patient samples for testing.
Has a new neurologic/ophthalmologic disorder, or exacerbation of a preexisting disorder, been diagnosed?
Has a new autoimmune or rheumatologic disorder been diagnosed?
Has a new hematologic disorder been diagnosed?
(a) Factors influencing long-term risk include integration of the vector into the genome, vector persistence without integration, and transgene-specific effects.

Immunodeficiency Disorders: Proof of Principle
The first convincing therapeutic effect from gene transfer occurred in children with X-linked severe combined immunodeficiency disease (SCID), which results from mutations in the gene (IL2RG) encoding the γc subunit of cytokine receptors required for normal development of T and natural killer (NK) cells (Chap. 362). Affected infants present in the first few months of life with overwhelming infections and/or failure to thrive. In this disorder, it was recognized that successfully transduced cells, even if few in number, would have a proliferative advantage compared to nontransduced cells, which lack receptors for the cytokines required for lymphocyte development and maturation. Isolation of autologous CD34+ cells, followed by transduction with a retroviral vector encoding the γc subunit and transplantation of the gene-modified autologous cells, led to complete reconstitution of the immune system, including documented responses to standard childhood vaccinations, clearing of infections, and remarkable gains in growth in most treated children. However, among 20 children treated in the initial trials, five eventually developed a syndrome similar to T-cell acute lymphocytic leukemia, with splenomegaly, rising white counts, and the emergence of a single clone of T cells. Molecular studies revealed that, in most of these children, the retroviral vector had integrated within a gene, LMO-2 (LIM only-2), which encodes a component of a transcription factor complex involved in hematopoietic development. The retroviral long terminal repeat acted as a promoter to increase the expression of LMO-2, resulting in T-cell leukemia.

The X-linked SCID studies were a watershed event in the evolution of gene therapy. They demonstrated conclusively that gene therapy could cure disease, with dramatic and durable clinical results. However, they also demonstrated that insertional mutagenesis leading to cancer was more than a theoretical possibility (Table 483-3). As a result of the experience in these trials, all protocols using integrating vectors in hematopoietic cells must include a plan for monitoring sites of insertion and clonal proliferation for 15 years after treatment. Initial strategies to overcome the possible complication of insertional mutagenesis included using a “suicide” gene cassette in the vector, so that errant clones can be quickly ablated, or using “insulator” elements in the cassette, which can limit the activation of genes surrounding the insertion site. However, the occurrence of malignancy in the X-linked SCID trials led to a transition to lentiviral vectors. These vectors efficiently transduce nondividing target cells and are characterized by a different pattern of integration into the genome that appears to be safer than retroviral vectors.
However, recent developments in the field of CAR-T therapy, notably reports of development of T-cell lymphoma, have underscored the need for caution (vide infra).

Transfusion-Dependent Thalassemia: Extension of Principle
Inherited immunodeficiencies, though a clear unmet medical need, affect only a very small population. The success of gene therapy in β thalassemia, one of the most common genetic diseases in Asian and Mediterranean populations, and one that provided the foundation for success in sickle cell disease, the most common genetic disease in Africans, demonstrated conclusively the therapeutic impact of gene therapy. The red cell disorders β thalassemia and sickle cell disease are more challenging targets for gene therapy than the immunodeficiencies for several reasons. First, in immunodeficiency disorders, the transduced stem cells have a survival advantage over nontransduced cells, which is not the case in red cell disorders (although the fully differentiated gene-modified red blood cells [RBCs] have a survival advantage compared to thalassemic or sickle RBCs). Second, in order to achieve transfusion independence or freedom from vaso-occlusive crises, one must achieve higher transduction efficiency as well as engraftment of higher numbers of stem cells. There are now two approved products, one a lentiviral-based gene therapy and the other a gene editing approach, for both of these conditions (Table 483-2).

Standard of care for transfusion-dependent β thalassemia (TDT) consists of lifelong regular RBC transfusions, typically monthly, to support hemoglobin (Hgb) levels >9 g/dL, coupled with an intensive regimen of iron chelation to minimize iron overload to the liver, heart, and endocrine system (Chap. 103). Allogeneic stem cell transplantation addresses the underlying cause of the disease but carries risks of myeloablation, graft-versus-host disease (GVHD), and graft rejection and thus is reserved primarily for those with a human leukocyte antigen (HLA)-matched sibling donor (<25% of patients). The first approved gene therapy for β thalassemia consists of a lentiviral vector driving expression of an antisickling variant of β-globin (βT87Q, the same product used for sickle cell disease), introduced into autologous HSCs, which are then transplanted back into the patient after myeloablation. Results of clinical trials for both β⁰/β⁰ genotype (the most severe) and for non-β⁰/β⁰ showed durable transfusion independence, defined as Hgb ≥9 g/dL and no transfusion for ≥12 months, in 20 of 22 evaluable participants in one phase 3 study and 12 of 14 patients in a second study. The remaining subjects all demonstrated reduction in the transfusion requirement, enabling iron removal therapy by either phlebotomy or iron chelation and removing risks related to iron overload.
Gene therapy with lentiviral transduction of autologous cells thus dramatically simplifies the medical regimen for these patients, since it eliminates the need for ongoing transfusion and iron chelation and carries no risk of GVHD or graft rejection because it is generated from the patient’s own cells. Similarly, since the transduced cells are autologous, there is no requirement for an HLA-matched donor, expanding the numbers of patients who can be treated. Safety in the initial trials has been excellent, with most adverse events related to the known risks of the myeloablative conditioning regimen.

The same lentiviral vector is also now approved for sickle cell disease; a single-arm, 24-month, open-label study assessed 36 participants who underwent apheresis followed by myeloablative conditioning and transplantation of gene-modified autologous cells. Of the 32 evaluable patients, 30 achieved complete elimination of vaso-occlusive crises between 6 and 18 months after infusion, a key efficacy endpoint in the study, and 31 of 36 achieved a globin response defined as hemoglobin AT87Q (the transgene product) of at least 30% of total Hgb and an increase in total Hgb of ≥3 g/dL. Using an earlier version of this product that was prepared using a different manufacturing process and a different transplant procedure, two patients died following development of acute myeloid leukemia. Interpretation of these data is not straightforward, since patients with sickle cell disease have an increased risk of hematologic malignancy compared to the general population. The product carries a boxed warning summarizing this risk; twice-yearly monitoring of a complete blood count is recommended.

Neurodegenerative Disease: Broadening of Principle
The SCID trials gave support to the hypothesis that gene transfer into HSCs could be used to treat any disease for which allogeneic bone marrow transplantation was therapeutic. Moreover, the use of genetically modified autologous cells carried the advantages noted above, i.e., no risk of GVHD, guaranteed availability of a “donor” (unless the disease itself damages the stem cell population of the patient), and low likelihood of failure of engraftment. Investigators in Paris capitalized on this realization to conduct the first trial of lentiviral vector transduction of HSCs for a neurodegenerative disorder, X-linked adrenoleukodystrophy (ALD). The key to the mechanism of action is that a subpopulation of the gene-modified cells gives rise to myeloid cells that cross the blood-brain barrier and engraft as central nervous system (CNS)-resident microglia and perivascular CNS macrophages. The transduced cells carry the gene encoding the missing protein, in this case an adenosine triphosphate–binding cassette transporter (Table 483-2). Following lentiviral transduction of autologous HSCs in young boys with the disease, dramatic stabilization of disease occurred, demonstrating that stem cell transduction could work for neurodegenerative as well as immunologic disorders.

Investigators in Milan carried this observation one step further to develop a treatment for another pediatric neurodegenerative disorder that had previously responded poorly to bone marrow transplantation. Metachromatic leukodystrophy is a lysosomal storage disorder caused by mutations in the gene encoding arylsulfatase A (ARSA). The late infantile form of the disease is characterized by progressive motor and cognitive impairment and death within a few years of onset, due to accumulation of the ARSA substrate sulfatide in oligodendrocytes, microglia, and some neurons. Recognizing that endogenous levels of production of ARSA were too low to provide cross-correction by allogeneic transplant, a lentiviral vector was used to create supraphysiologic levels of ARSA expression in transduced cells. Transduction of autologous HSCs from children born with the disease, at a point when they were still presymptomatic, has led to preservation and continued acquisition of motor and cognitive milestones at time periods as long as 8 years after treatment, with observation ongoing. This product is approved in Europe and the United States for those with late infantile or early juvenile forms of the disease (Table 483-2). These results illustrate that the ability to engineer levels of expression can allow gene therapy approaches to succeed where allogeneic bone marrow transplantation cannot. A similar approach may be useful in other neurodegenerative conditions.

■ ■EX VIVO GENOME EDITING
The first approved genome editing product, and those furthest along in clinical development, all use strategies that require only a cleavage event, rather than both a cleavage and a targeting event. For sickle cell disease (Chap. 103), the genome editing strategy is carried out ex vivo in HSCs.
A Cas9/guide RNA ribonucleoprotein complex targeting the erythroid enhancer of the BCL11A gene, which normally represses γ-globin (the fetal β-like globin) during the fetal-to-adult β-globin switch, is introduced into autologous CD34+ cells of patients with TDT or sickle cell disease. Reduced BCL11A expression results in increased γ-globin expression and Hgb F production in erythroid cells, reducing the Hgb S levels and preventing sickling. The product was approved based on a study of 44 children and adults (age range 12–34 years) with sickle cell disease; of these, 31 had been followed for at least 16 months after gene editing, myeloablation, and engraftment. Of the 31 participants with adequate follow-up, 29 achieved the primary efficacy endpoint of at least 12 consecutive months without any protocol-defined vaso-occlusive crisis. The mean total Hgb at month 18 was 13.3 g/dL, compared to a baseline mean of 7.5 g/dL. This product is now approved for children (12 and older) and adults with sickle cell disease and recurrent vaso-occlusive crisis.

Other genome editing strategies, in earlier stages of clinical development, use Cas9/guide RNA to introduce a cleavage within the β-globin locus near the site of the sickle mutation and simultaneously supply a targeting vector (in this case, an AAV6 vector) encoding a short sequence of the wild-type β-globin gene. In a process dependent on the cellular homology-directed repair pathway, this sequence is used as a DNA repair template at the site of the break, resulting in replacement of the mutant βS sequence with the wild-type sequence.

■ ■LONG-TERM EXPRESSION IN GENETIC DISEASE: IN VIVO GENE TRANSFER WITH RECOMBINANT ADENO-ASSOCIATED VIRAL VECTORS
Recombinant adeno-associated viral (AAV) vectors have emerged as attractive gene delivery vehicles for genetic disease. Engineered from a small replication-defective DNA virus, they are devoid of viral coding sequences and trigger very little immune response in experimental animals. They are capable of transducing nondividing target cells, and the donated DNA is stabilized primarily in an episomal form, thus minimizing risks arising from insertional mutagenesis. Because the vector has a tropism for certain long-lived cell types, such as skeletal muscle, neurons, and hepatocytes, long-term expression can be achieved even in the absence of integration. Of note, because the donated DNA is predominantly nonintegrated, long-term expression requires targeting of nondividing or slowly dividing cells; otherwise, expression is lost as cells divide. The other shortcoming of AAV as a gene delivery vehicle is that it cannot package inserts of more than ~5 kb, owing to the fact that the wild-type viral genome is only ~4.7 kb; fortunately, with some notable exceptions, most cDNAs fall below this limit.
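The packaging constraint can be illustrated with a simple size check. The following Python sketch is illustrative only: the ~5 kb capacity and the 2.8 kb size of the F9 cDNA are taken from this chapter, whereas the function name `fits_in_aav`, the 1.5 kb allowance for regulatory elements (promoter, polyadenylation signal, inverted terminal repeats), and the 7.0 kb oversized example are hypothetical values chosen for demonstration, not figures for any approved product.

```python
# Approximate AAV packaging limit stated in the text (~5 kb).
AAV_CAPACITY_KB = 5.0

# Assumed allowance for promoter, poly(A) signal, and ITRs.
# This is a placeholder value for illustration, not a measured figure.
ASSUMED_REGULATORY_KB = 1.5


def fits_in_aav(cdna_kb: float,
                regulatory_kb: float = 0.0,
                capacity_kb: float = AAV_CAPACITY_KB) -> bool:
    """Return True if the cDNA plus regulatory elements fit within capacity."""
    if cdna_kb < 0 or regulatory_kb < 0:
        raise ValueError("sizes must be non-negative")
    return cdna_kb + regulatory_kb <= capacity_kb


if __name__ == "__main__":
    # F9 cDNA size (2.8 kb) as given in the text.
    print("F9 cassette fits:", fits_in_aav(2.8, ASSUMED_REGULATORY_KB))
    # Hypothetical cDNA that exceeds the limit once regulatory elements are added.
    print("7.0 kb cDNA fits:", fits_in_aav(7.0, ASSUMED_REGULATORY_KB))
```

A check of this kind only captures total length; it says nothing about expression, tropism, or manufacturability, which is why truncated transgenes are validated experimentally.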

First Approved Products for Ultra-Rare Diseases
As was the case with ex vivo gene transfer, the first approved products for in vivo gene therapy were for ultra-rare disorders. In the Western world, the first approved gene therapy product for genetic disease was an AAV vector, conditionally approved in Europe in 2012, for treatment of the autosomal recessive disorder lipoprotein lipase deficiency. The sponsor allowed the approval to expire in 2017, without completing the postmarketing requirements, but the initial approval was a crucial catalyst for the current robust activity in gene therapy, and AAV vectors are now the largest category of approved products for genetic disease, including products for spinal muscular atrophy type 1, a rare form of genetic blindness, hemophilia A and B, and Duchenne muscular dystrophy (Table 483-2).

The first approved AAV therapy for genetic disease in the United States was also for an ultra-rare disease, an inherited retinal dystrophy due to mutations in the gene encoding retinal pigment epithelial-associated 65-kDa protein (RPE65). The retina is an attractive target for AAV-mediated gene transfer. It is a relatively immunoprivileged space, obviating problems related to immune responses, and the photoreceptors, retinal ganglion cells, and retinal pigment epithelial cells are all long-lived postmitotic cells. Routes of administration for these cell types, either intravitreal or by subretinal injection, involve standard procedures in ophthalmology. Given the small space, doses required are relatively low, lessening the manufacturing burden. Finally, canine models of a number of inherited retinal dystrophies have been well-characterized and faithfully model the human diseases. Work carried out in the 1990s had demonstrated that the canine disease could be reversed, with durable restoration of visual behavior, by subretinal injection of an AAV vector in dogs with a mutation in the gene encoding RPE65, an enzyme key to the visual cycle. Like the canine disease, the human disease is characterized by early-onset visual impairment, with most patients progressing to blindness over time. Phase 1 clinical trials by multiple groups established the safety of subretinal injections of an AAV vector expressing RPE65. A single phase 3 trial, the first randomized controlled trial in human gene therapy, demonstrated improvement in multiple measures of retinal and visual function. Of note, and likely to be a recurring theme as gene therapies address diseases for which there are no existing treatments, successful clinical development required the development and validation of a novel clinical endpoint that could measure improvements in functional vision. This product, the first licensed AAV gene therapy product in the United States, is now approved worldwide (Table 483-2).
Trials for both inherited and complex acquired retinal disorders such as age-related macular degeneration, affecting millions worldwide, are now underway.

Hemophilia B: Addressing More Common Genetic Disorders and Unraveling the Human Immune Response to Systemically Administered AAV
Hemophilia is the X-linked bleeding diathesis caused by mutations in the genes encoding factor VIII (hemophilia A) or factor IX (hemophilia B) (Chap. 121). Current treatment relies on intravenous infusion of clotting factor concentrates or, as an alternative in hemophilia A, administration of a bispecific antibody that replaces the cofactor factor VIII by binding to the enzyme (factor IXa) and its substrate (factor X), resulting in a biologically active conformation. Gene therapy for hemophilia began with hemophilia B, a smaller patient population compared to hemophilia A, but the F9 cDNA is smaller (2.8 kb) and is more easily accommodated in an AAV vector. The early vectors were successful at demonstrating long-term expression of factor IX at therapeutic levels in hemophilic mice and dogs when vector was delivered to the liver (hepatocytes are the normal site of synthesis of factor IX), but the initial clinical trials uncovered a plethora of problems not predicted by animal models that had to be addressed to allow successful development to proceed. Fortunately, the solutions to these problems were generalizable across multiple therapeutic indications that rely on systemic delivery of AAV vector (vide infra) (Table 483-4).

Most complex among these was working out the human immune responses, both innate and adaptive, to the AAV vector. With systemic administration, the presence of neutralizing antibodies, harbored in a substantial portion of children and even more prevalent in adults, can prevent transduction before the target cells are reached, and the cellular immune response, not predicted by animal models, which are not natural hosts for AAV and thus lack memory T cells directed to the capsid, can result in the loss of the transduced cells in a matter of weeks after successful initial transduction. The preexisting antibodies can be screened for in advance, assuring that only those likely to benefit receive the therapy. The cellular immune response typically presents as asymptomatic transaminase elevation and concomitant loss of factor IX (or the transgene product) in the circulation. This response is dose-dependent and can be mitigated by the use of immunomodulatory agents such as glucocorticoids or by strategies that reduce the vector dose. The use of appropriately timed steroid treatment to dampen the cellular immune response resulted in the first report of durable factor IX expression in men with severe hemophilia B. The eventual widely adopted solution for hemophilia B gene therapy came from human genetics, specifically the use of a high specific activity variant of factor IX, factor IX Padua, that allowed a substantial reduction in vector dose and/or higher levels of circulating factor IX; this strategy has shown durable expression. The two currently approved hemophilia B gene therapies show an increase in mean factor IX levels well into the mild hemophilia range; these levels result in annualized bleeding rates that are noninferior to clotting factor prophylaxis in men with the disease.

PART 16 Genes, the Environment, and Disease Successful extension to hemophilia A required the use of a truncated version of the factor VIII cDNA. The only approved product drives therapeutic levels of expression, but durability has been less than that seen in gene therapy for hemophilia B. Spinal Muscular Atrophy Type 1 Spinal muscular atrophy type 1 is the most common genetic cause of death in infancy and affects about 1 in 11,000 births. The disease is caused by autosomal reces sive mutations in the SMN1 gene, encoding survival motor neuron 1; affected infants undergo degeneration and loss of lower motor neurons, presenting as hypotonia, severe weakness, and failure to sit without support. In a large natural history study of untreated infants with the disease, by age 20 months, only 8% of patients with the disease were alive and free of ventilator support. In a phase 1 gene therapy study, intravenous infusion of an AAV vector (one with tropism for the nervous system [AAV9]) expressing SMN1 showed survival without ventilator support in 100% of participants (n = 15) at 20 months of age. A phase 3 trial was initiated, but the treatment was approved in the United States based on the efficacy data from the first 21 participants enrolled in that study, coupled with the safety data from the ongoing phase 3 and the completed phase 1 study (Table 483-2). The major safety concern was the risk of acute serious liver injury; because the vector is infused intravenously, there is considerable biodistribution to the liver (and to the spinal motor neurons, the therapeutic target). The approved dose of 1.1 × 1014 vector genomes/kg is quite high, and results in marked elevation of liver transaminases if untreated. 
Leveraging the results in the early hemophilia trials, the phase 1 study showed that the liver toxicity could be controlled using a course of corticosteroids begun 1 day before the vector infusion and continued for 30 days, with tapering begun at that point and carried out with monitoring of liver transaminases. Postmarketing studies revealed an additional rare adverse reaction, thrombotic microangiopathy (TMA). Presenting as thrombocytopenia, microangiopathic hemolytic anemia, and acute kidney injury, this constellation of findings in the setting of AAV gene therapy was first described in the AAV gene therapy trials for Duchenne muscular dystrophy, which also use very high doses of vector. When it occurs, TMA appears early after vector infusion, is complement-mediated, and has responded to complement inhibitors, i.e., eculizumab. Patients receiving onasemnogene abeparvovec are currently followed in a registry designed to assess effectiveness, long-term safety, and overall survival of patients with spinal muscular atrophy.
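To give a sense of scale for the weight-based dose quoted above, the following minimal sketch multiplies the stated dose (1.1 × 10¹⁴ vector genomes/kg) by body weight. The function name and the example weights are hypothetical and chosen only for illustration; this is arithmetic, not dosing guidance.

```python
# Illustrative arithmetic only; not dosing guidance.

APPROVED_DOSE_VG_PER_KG = 1.1e14  # vector genomes per kg, as quoted in the text


def total_vector_genomes(weight_kg, dose_vg_per_kg=APPROVED_DOSE_VG_PER_KG):
    """Total vector genomes infused at a weight-based dose."""
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    return weight_kg * dose_vg_per_kg


# Hypothetical infant weights (kg), chosen only to show how the total scales.
for weight_kg in (4.0, 6.0, 8.0):
    total = total_vector_genomes(weight_kg)
    print(f"{weight_kg:.1f} kg -> {total:.2e} vector genomes")
```

Because the total scales linearly with weight, even small infants receive several hundred trillion vector genomes, which is consistent with the text's description of the dose as quite high.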

■ IN VIVO GENOME EDITING
Clinical trials using ex vivo genome editing of HSCs by CRISPR/Cas9 systems have been ongoing for several years and have now led to the first approved genome editing product (vide supra). More recently, investigators have begun to explore the feasibility of in vivo genome editing, with intriguing results. As has been the case with ex vivo editing, these initial trials all center on strategies that require a cleavage step only, i.e., not a cleavage event followed by a targeting event. For in vivo editing of hepatocytes, lipid nanoparticles containing an mRNA encoding the Cas9 protein and a single guide RNA directed to the gene of interest are infused intravenously. The major safety concern involves the risk of off-target editing; an additional concern prior to clinical trials was whether a robust immune response to the bacterial Cas9 enzyme would result in unacceptable toxicity or loss of efficacy. Published experience to date does show mild, transient, dose-dependent rises in liver transaminases, but also robust efficacy in reducing circulating levels of the gene products of interest, including transthyretin in transthyretin amyloidosis and plasma kallikrein B1 in hereditary angioedema. Additional in vivo genome editing trials are underway, notably for cardiovascular conditions, to reduce circulating levels of lipoprotein(a), proprotein convertase subtilisin/kexin type 9 (PCSK9), or angiopoietin-like protein 3, all associated with atherosclerotic cardiovascular disease. An advantage of genome editing approaches is that the edit will be passed to every daughter cell; thus, changes should persist over time, and this treatment can be used even in children, without fear that the edit will be lost as the liver grows in size.

GENE THERAPY FOR CANCER
The majority of clinical gene transfer experience has been in subjects with cancer.
The intent has been to increase the precision of cancer therapies and thereby make them less toxic and more effective. Most approaches have either modified the tumor directly or altered the host’s response to the malignancy to produce immune effector cells that are precisely targeted to the tumor phenotype.

■ MODIFYING THE CANCER
Since cancer is an (acquired) genetic disorder, initial efforts were directed at correcting the genetic deficits of the tumor or introducing lethal genes. Two major and persisting obstacles, however, are the poor biodistribution and transduction efficiency of all currently available vectors, and the heterogeneity and genetic instability of the tumor targets themselves, so that correction of single driver mutations does not preclude the evolution of a resistant population.

Tumor Correction
One widely used direct intratumoral approach was adenoviral-mediated expression of the tumor suppressor p53, which is mutated in many different cancers. Initial studies showed some complete and partial responses in squamous cell carcinoma of the head and neck, esophageal cancer, and non-small cell lung cancer, but as yet, there have been no successful product licensing studies for this approach except in China.

Prodrug Metabolizing Genes
Efforts to overcome the above limitations have included the introduction of a prodrug or a suicide gene that would increase sensitivity of tumor cells to cytotoxic drugs. A strategy used early on was intratumoral injection of an adenoviral vector expressing the thymidine kinase (TK) gene. Cells that take up and express the TK gene can be killed after the administration of ganciclovir, which is phosphorylated to a toxic nucleoside by TK. The advantage of this approach is that the effects of transducing even a limited number of tumor cells are amplified by the spread of active drug to adjacent tumor cells.
Although the approach continues to be examined in aggressive brain tumors and locally recurrent prostate, breast, and colon tumors, progress remains slow, and systemic benefits against metastatic disease have not been established.

■ MODIFYING THE HOST
Recruiting the Immune System
The successful use of monoclonal antibodies that produce antitumor activity by activating the immune response has demonstrated the feasibility of manipulating the immune system to recognize the abnormal pattern of antigen expression on tumor cells. Immune cells are capable of almost unlimited expansion and persistence and can provide long-term tumor control. They can also traffic to tumor sites irrespective of location and, in principle, have the potential to evolve with the changing pattern of tumor cell phenotype and function. Antibodies targeting “checkpoint” molecules, particularly CTLA-4 and the PD-1/PD-L1 axis, which naturally limit T-cell responses and maintain tolerance, have been particularly successful.

Vaccination
This strategy promotes more efficient recognition of tumor cells by the immune system, but the development of therapeutic vaccines, as opposed to the preventive vaccines used to combat infectious diseases, has proved to be a considerable challenge. Approaches have included direct injection of tumor or tumor-antigen–derived RNA or DNA; transduction of tumor cells with immune-enhancing genes encoding cytokines, chemokines, or co-stimulatory molecules; and the ex vivo manipulation of dendritic cells to enhance the presentation of tumor antigens. A dendritic cell vaccine for treatment of recurrent prostate cancer has received approval in the United States, but its limited potency and high cost have constrained commercial success.

Adoptive Cell Transfer
Host immune cells such as T cells, NK cells, and others can be modified to express new transgenic receptors intended to recognize tumor cells and their microenvironment (Fig. 483-1). Retargeting may use a modification of the cells’ own receptor or a molecularly synthesized CAR that is usually composed of the antigen recognition portion of an antibody and the signaling components of the cell’s native antigen receptor along with one or more additional signaling domains that boost T-cell activation.
Both approaches have been successful, with significant responses reported with native receptors targeted to melanoma and synovial cell sarcoma and, most dramatically, with CARs targeted to CD19, an antigen expressed at high levels on normal and many malignant B cells, or B-cell maturation antigen (BCMA), an antigen expressed at high levels in normal and multiple myeloma plasma cells. Infused CAR T cells can expand many thousand-fold in vivo, persist long term, and have produced >80% complete response rates when targeting intractable B-cell acute lymphoblastic leukemia; approximately half of those patients have remained in remission for many years afterward without further cancer therapies (i.e., have been “cured”). This approach has also been successful in adult patients with relapsed or chemotherapy-refractory B-cell–derived large cell lymphoma, mantle cell lymphoma, and multiple myeloma. Many responses are sustained long term, and several of these CAR T-cell products have been approved by the U.S. Food and Drug Administration (FDA), as well as international regulatory agencies, and adopted as standard of care. In 2021, the first cases of T-cell lymphoma were reported with the use of piggyBac-modified (transposon) CAR-T therapy; 2 of 10 patients developed CAR-T lymphoma, but this may have been a result of widespread copy number variations and multiple insertions, and there was no apparent evidence of insertional mutagenesis. In November 2023, the FDA issued a warning for most approved CAR-T therapies, all of which use retroviral or lentiviral vectors for gene delivery of the CAR, suggesting that there was a higher-than-expected rate of T-cell lymphoma and that patients treated with CAR-T therapy should be monitored for secondary malignancies for life. Nevertheless, the overall benefits of these products continued to outweigh their potential risks for their approved uses. Although the FDA did not list the frequency, a flurry of subsequent reports noted that the rate of T-cell malignancies after CAR-T treatment for patients with relapsed or refractory hematologic malignancies was ~20 in 34,000 patients (~0.06%); however, because reporting to the FDA on the incidence of T-cell malignancies is voluntary once products are approved, this frequency may be an underestimate. A boxed warning listing the possibility of secondary malignancies was added to the label of most CAR-T products in January 2024, and the FDA published instructions on reporting secondary malignancies to the community in order to gather more information.

FIGURE 483-1 T-cell receptors. A native T-cell receptor (TCR) recognizes processed peptide antigens bound to major histocompatibility (MHC) molecules through its αβ chains. Signaling then occurs through a multichain intracellular CD3 complex. A chimeric antigen receptor (CAR) usually contains an extracellular receptor component derived from the antigen binding portion (VH and VL) of a monoclonal antibody. This produces a receptor that can recognize either protein or nonprotein antigens independent of the MHC. A transmembrane (TM) domain then connects this receptor to the ζ chain of the CD3 complex derived from the native TCR. A costimulation domain (COSTIM), such as CD28 or 4-1BB, is also present.
It is important to keep in mind that the vast majority of secondary malignancies in patients treated with approved CAR-T therapies are unrelated to the presence of the CAR transgene, including solid tumors and myeloid malignancies; such secondary malignancies may be related to age, the presence of clonal hematopoiesis, and exposure to prior chemotherapy or other antineoplastic treatments.
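The quoted frequency of T-cell malignancies (~20 in 34,000, or ~0.06%) and the caveat about voluntary reporting can be made concrete with a short calculation. The sketch below reproduces the percentage from the approximate counts given in the text and then shows how the implied rate would change under different reporting completeness values. Those completeness values are hypothetical assumptions introduced purely for illustration, not data.

```python
def rate_percent(cases, patients):
    """Crude rate expressed as a percentage of treated patients."""
    if patients <= 0:
        raise ValueError("patients must be positive")
    return 100.0 * cases / patients


REPORTED_CASES = 20        # approximate count quoted in the text
TREATED_PATIENTS = 34_000  # approximate count quoted in the text

observed = rate_percent(REPORTED_CASES, TREATED_PATIENTS)
print(f"observed rate: {observed:.3f}%")

# Hypothetical fractions of true cases that get reported (assumptions, not data).
for completeness in (1.0, 0.5, 0.25):
    implied = observed / completeness
    print(f"if {completeness:.0%} of cases were reported, implied rate: {implied:.3f}%")
```

Even if only a quarter of cases were reported, the implied rate under these assumptions would remain well below 1%, which fits the text's conclusion that the overall benefits continue to outweigh the risks for approved uses.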

The general approach of CAR-T therapy is under rapid development, including trials with CAR T cells targeting different antigens for solid tumors and other hematologic malignancies and CAR T cells with different molecular structures and different gene transfer vectors. Remaining challenges in the field and application of adoptive T-cell approaches include the following: (1) the immune-inhibitory microenvironment associated with most solid tumors, which recent studies have sought to overcome by further modifying the T cells with countermeasures to tumor inhibitory signals; (2) acute and severe (though rarely fatal) systemic inflammatory and neurologic toxicities during the phase of T-cell expansion and tumor killing, which typically require access to intensive care for clinical management; (3) the off-target or on-target but off-tumor effects that may damage normal host tissues (e.g., normal B cells following CD19 CAR therapy); and (4) the cost, time, and complexity of manufacture, which are particular problems when antigens unique to each tumor’s individual mutations are targeted (neoantigens), rather than widely shared tumor-associated antigens.

Nonimmunologic Modifications to Host
Gene transfer can be used to protect normal cells from the toxicities of chemotherapy and thereby increase the therapeutic index of these drugs. The most extensively studied approach has been to transduce hematopoietic cells with genes encoding resistance to chemotherapeutic agents, including the multidrug resistance gene MDR1 or the gene encoding O6-methylguanine DNA methyltransferase (MGMT). Although such approaches reduce hematologic toxicity, cytotoxic dose escalation quickly reveals dose-limiting toxicities to other organ systems. Chemotherapy resistance can also be engineered into immune cells redirected to target cancer, to enable combination treatments with cells and chemotherapy.
Finally, gene transfer can be used to inhibit the host angiogenesis required for tumor support, for example by constitutive expression of inhibitors such as angiostatin and endostatin, or the transfer of T cells genetically modified to recognize antigens specific to newly forming vasculature. These studies are in early phases.

06 - 484 The Human Microbiome in Health and Disease

484 The Human Microbiome in Health and Disease

■ COMBINATION APPROACHES: MODIFICATION OF HOST AND TUMOR BY VIROTHERAPY (IMMUNO-ONCOLYTIC VIRUSES)
These viruses are genetically modified to replicate in malignant but not normal cells. The replicating vectors thus proliferate and spread within the tumor, facilitating eventual tumor clearance. However, physical limitations to viral spread, including fibrosis, intermixed normal cells, basement membranes, and necrotic areas within the tumor, may reduce clinical efficacy, and their activity against metastatic disease has proved limited. Recently, the FDA granted licensing approval to talimogene laherparepvec, an oncolytic herpes virus containing the human granulocyte-macrophage colony-stimulating factor (GM-CSF) gene, for treatment of melanoma. This success has led to resurgent interest in combining the local tumor destruction and tumor antigen release mediated directly by oncolytic viruses with the recruitment of a systemic immune response mediated by immunostimulatory genes contained within the oncolytic virus. In principle, such immuno-oncolytic viruses should produce responses in both local and metastatic disease. Numerous novel viral agents are now entering early-phase clinical testing.

SUMMARY AND FUTURE DIRECTIONS
Cell and gene therapies have progressed from halting beginnings to the current status as the fastest-growing sector of medicine and the health care industry. The speed of technical evolution results in a continuously changing landscape. Key advances likely to assume greater importance in the coming decade include bioengineered AAV capsids selected for tropism and increased expression, leading to lower doses, fewer adverse events, and lower manufacturing burden; extension of therapeutic trials from single-gene disorders to complex acquired disorders, including chronic heart failure, age-related macular degeneration, and Alzheimer’s disease; continued expansion of in vivo genome editing; and application of CAR-T technology to solid tumors and autoimmune disorders (e.g., systemic lupus erythematosus). The power and versatility of gene therapy approaches are such that there are few serious disease entities for which gene therapies are not under development. Approved products and examples of clinical success are now abundant, and cell and gene therapies are likely to become increasingly important as therapeutic modalities in the twenty-first century. Realization of the therapeutic benefits of modern molecular medicine will depend on continued progress in cell and gene therapy technology.

■ FURTHER READING
Al-Zaidy SA et al: AVXS-101 (onasemnogene abeparvovec) for SMA1: Comparative study with a prospective natural history cohort. J Neuromuscular Dis 6:307, 2019.
Frangoul H et al: CRISPR-Cas9 gene editing for sickle cell disease and β-thalassemia. N Engl J Med 384:252, 2021.
Fumagalli F et al: Lentiviral haematopoietic stem-cell gene therapy for early-onset metachromatic leukodystrophy: Long-term results from a non-randomised, open-label, phase 1/2 trial and expanded access. Lancet 399:372, 2022.
High KA, Roncarolo MG: Gene therapy. N Engl J Med 381:455, 2019.
June CH, Sadelain M: Chimeric antigen receptor therapy. N Engl J Med 379:64, 2018.
Kanter J et al: Biologic and clinical efficacy of LentiGlobin for sickle cell disease. N Engl J Med 386:617, 2022.
Larson RC, Maus MV: Recent advances and discoveries in the mechanisms and functions of CAR T cells. Nat Rev Cancer 21:145, 2021.
Longhurst HJ et al: CRISPR-Cas9 in vivo gene editing of KLKB1 for hereditary angioedema. N Engl J Med 390:432, 2024.
Pipe SW et al: Gene therapy with etranacogene dezaparvovec for hemophilia B. N Engl J Med 388:706, 2023.
Ruella M et al: Mechanisms of resistance to chimeric antigen receptor-T cells in haematological malignancies. Nat Rev Drug Discov 22:976, 2023.
Tabebordbar M et al: Directed evolution of a family of AAV capsid variants enabling potent muscle-directed gene delivery across species. Cell 184:4919, 2021.
Verdun N, Marks P: Secondary cancers after chimeric antigen receptor T-cell therapy. N Engl J Med 390:584, 2024.

Neeraj K. Surana, Dennis L. Kasper

The Human Microbiome in Health and Disease

“All disease begins in the gut.” (Hippocrates)

Nearly two and a half millennia after Hippocrates made this statement, we are just coming to truly appreciate its profundity. Since the beginning of humankind, scholars have been investigating the underpinnings of disease with an almost singular focus on the human side of the equation. Microbes were not recognized as an important cause of disease until the inception of the “germ theory” in the late nineteenth century. During the first century of medical microbiology, research largely centered on the role of microbes as pathogens. Only recently has there been a resurgence of interest in understanding how commensal organisms (the bacteria, viruses, fungi, and Archaea that make up the microbiota) impact human physiology. The idea that these microorganisms are vital to the well-being of humans has challenged our traditional notions of “self.” Indeed, a human being can most accurately be described as a holobiont: a complex assemblage of human cells and microorganisms interacting in an elaborate pas de deux that drives normal physiologic processes.

Aimed at a better understanding of this relationship, myriad studies during the past decade have begun to catalogue the microbiota at various body sites and in a multitude of disease conditions. Diseases in virtually every organ system have been associated with changes in the microbiota. Indeed, the microbiota has been linked to intestinal disorders, disturbances in metabolic function, autoimmune diseases, and psychiatric conditions and has been shown to influence susceptibility to infection and the efficacy of pharmaceutical therapies. Knowledge of the specific mechanism(s) underlying most of these microbe–disease associations is lacking; it remains unclear whether the disease-associated alterations in the microbiota represent mere biomarkers of disease, a causal relationship, or a combination of the two.
Although cause-and-effect relationships are still being elucidated for many diseases, it is clear that humans coexist in an intricate relationship with commensal organisms. This chapter explores in detail the nature of these host–commensal interactions, focusing on how this information might be translated into clinically meaningful interventions.

HISTORICAL PERSPECTIVE
Massive undertakings, such as the Human Microbiome Project (HMP) sponsored by the National Institutes of Health and MetaHIT sponsored by the European Commission, have catalogued all the bacteria present at multiple body sites in people with and without disease. Coupled with the confluence of advances in sequencing technologies (Chap. 126), gnotobiotic animal availability, and microbial culture, significant progress has been made toward an understanding of the interplay between the microbiota and human health. However, current findings were foreshadowed by work done centuries ago. The human microbiota was first explored in 1683 when Antony van Leeuwenhoek described in a letter to the Royal Society of London the “very little living animalcules, very prettily a-moving” that he had observed in the plaque between his teeth. Leeuwenhoek went on to perform the first comparative “microbiota” studies by assessing how fecal and oral bacteria differ, how oral microbes change in the setting of disease (e.g., alcoholism and tobacco use), and how microbial composition changes across the age spectrum (e.g., in young children vs old men). He attempted, unsuccessfully, to eliminate these bacteria. Although Leeuwenhoek was not taken seriously when he first reported his findings, his studies laid the groundwork for what is now the field of microbiome research, and investigators are still trying to answer many of the same overarching questions that he raised more than three centuries ago.

Although Leeuwenhoek first reported the existence of bacteria and their association with humans at the end of the seventeenth century, the significance of commensal bacteria was not realized until late in the nineteenth century. In 1885, Pasteur suggested that animals could not survive if they were “artificially and completely deprived of the common microbes.” Although Pasteur’s preconceived ideas were proven incorrect in 1912 by the advent of germ-free (GF) animals (animals raised without exposure to any microorganisms), the underlying concept that commensal organisms are critical to health has held up. Élie Metchnikoff made another conceptual advance in this field by suggesting at the beginning of the twentieth century that clinical outcomes could be altered by the administration of specific beneficial organisms (probiotics). In particular, Metchnikoff believed that aging was caused by toxic bacteria in the gut and that lactic acid–producing bacteria (e.g., Lactobacillus species) present in sour milk and yogurt could mitigate this process. The data behind this specific claim are still lacking, but contemporary discoveries offer continued hope that the microbiome can be effectively harnessed to protect against and treat a variety of diseases. Thus, although the field of microbiome research is sometimes considered to have emerged over the last two decades, the basic tenets (that the microbiota varies according to body site and clinical characteristics, that microbes are critical for human health, and that specific modulation of the microbiota may lead to improved clinical outcomes) are far from new.

A PRIMER ON TAXONOMY
Given that microbiome-based studies have identified and compared microbes at different levels of taxonomic resolution (Fig. 484-1), some understanding of taxonomy is essential for better comprehension of the implications of these studies.
Of the ~100 bacterial phyla that exist in nature, only five (Actinobacteria, Bacteroidetes, Firmicutes, Fusobacteria, and Proteobacteria) are dominant members of the human microbiome. Each of these phyla can be further categorized into multiple classes, orders, families, genera, and species. Early studies on the microbiota focused on changes in the relative abundance at the phylum level between different groups (e.g., obese vs normal-weight patients); however, these comparisons are at such a broad taxonomic level that they often provide little or no biologic insight. As illustrated in Fig. 484-1, drawing comparisons between organisms in two different bacterial phyla is analogous to comparing humans to sea stars: the evolutionary distance between the two is tremendous. Examining microbial profiles at the phylum, family, or even genus level, as is often done at present, ignores the great heterogeneity within different strains of the same bacterial species. The analytical pipelines are beginning to enable strain-level comparisons, and these improvements will likely facilitate our ongoing investigation of host–commensal interactions.

FIGURE 484-1 Juxtaposition of bacterial and human taxonomy highlights the evolutionary distance between different taxonomic levels. The listed species represent exemplars that are members of the taxon to which they are connected but that are not contained within the next-lower-level taxon listed. For example, Clostridium botulinum, Clostridioides difficile, and Erysipelothrix rhusiopathiae are members of the phylum Firmicutes, but are in classes other than Bacilli. Similarly, starfish and humans are both members of the kingdom Animalia, but they are in different phyla.
[Figure labels: bacterial lineage Bacteria, Firmicutes, Bacilli, Bacillales, Staphylococcaceae, Staphylococcus, S. aureus; human lineage Eukarya (domain), Animalia (kingdom), Chordata (phylum), Mammalia (class), Primates (order), Hominidae (family), Homo (genus).]
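The point about taxonomic resolution can be illustrated with a small data structure. The sketch below stores lineages as ordered tuples and reports the most specific rank that two organisms share. The S. aureus lineage follows Fig. 484-1; the other lineages use conventional classical taxonomy and are supplied here as assumptions for illustration, as are the names RANKS, LINEAGES, and deepest_shared_rank. The kingdom rank is omitted because the figure lists none for bacteria.

```python
# Ranks from broadest to most specific (kingdom omitted; see lead-in).
RANKS = ("domain", "phylum", "class", "order", "family", "genus", "species")

# Lineages keyed by abbreviated species name. Only S. aureus is taken directly
# from Fig. 484-1; the rest are illustrative entries from classical taxonomy.
LINEAGES = {
    "S. aureus": ("Bacteria", "Firmicutes", "Bacilli", "Bacillales",
                  "Staphylococcaceae", "Staphylococcus", "aureus"),
    "S. epidermidis": ("Bacteria", "Firmicutes", "Bacilli", "Bacillales",
                       "Staphylococcaceae", "Staphylococcus", "epidermidis"),
    "C. botulinum": ("Bacteria", "Firmicutes", "Clostridia", "Clostridiales",
                     "Clostridiaceae", "Clostridium", "botulinum"),
    "B. fragilis": ("Bacteria", "Bacteroidetes", "Bacteroidia", "Bacteroidales",
                    "Bacteroidaceae", "Bacteroides", "fragilis"),
}


def deepest_shared_rank(name_a, name_b):
    """Return the most specific rank at which two organisms share a taxon."""
    shared = None
    for rank, taxon_a, taxon_b in zip(RANKS, LINEAGES[name_a], LINEAGES[name_b]):
        if taxon_a != taxon_b:
            break
        shared = rank
    return shared


for other in ("S. epidermidis", "C. botulinum", "B. fragilis"):
    print(f"S. aureus vs {other}: shared down to {deepest_shared_rank('S. aureus', other)}")
```

Two organisms that share only a phylum or a domain diverge at every finer rank, which is why phylum-level comparisons lump together organisms that may be biologically very different.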

THE MICROBIOTA AND HUMAN HEALTH

■ OVERVIEW OF THE HUMAN MICROBIOTA
The overwhelming majority of microbiota studies have focused on stool, given that this sample type represents the most ecologically rich anatomic site, is easy to obtain, and can readily be followed longitudinally in the same individual. A landmark study by the HMP sought to define the “normal” microbiota throughout the entire body in healthy Western adults. To this end, the microbial populations at 15–18 body sites were characterized in 242 people. One striking finding was that all samples from a given body region (e.g., skin) were more similar to each other than they were to samples from a different body region (e.g., stool), even in the same individual (Fig. 484-2A). In essence, the effect of the anatomic site on microbial composition is far greater than the effect of heterogeneity between individuals. That said, there was a remarkable amount of interindividual variation at any given body site (Fig. 484-2B). In stool, for example, the abundance of the phylum Bacteroidetes ranged from ~10% in some individuals to >90% in others. Remarkably, even with person-to-person variability and differences among body sites, the functional capacity of the microbiota (assessed using metagenomic data to identify gene pathways) was quite similar across different people and different body sites (Fig. 484-2C). This discrepancy between the substantial differences in microbial composition and the little or no resulting change in the functional properties of the microbiota reflects an important ecologic property of the microbiota: the microbial communities at different body sites and in different people assemble in such a way that all the core metabolic functions are maintained. This finding also hints at the likely possibility of significant functional redundancy within the microbiota, with different species executing the same biologic functions in different people and/or at different anatomic sites.
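The finding that body site outweighs interindividual variation can be illustrated with a dissimilarity calculation on phylum-level relative abundances. The sketch below uses the Bray-Curtis dissimilarity, one commonly used measure in microbial ecology; it is not necessarily the measure used in the HMP analysis. All abundance values are invented toy numbers chosen for illustration, and the names SAMPLES and bray_curtis are hypothetical.

```python
from itertools import combinations


def bray_curtis(profile_a, profile_b):
    """Bray-Curtis dissimilarity: 0 for identical profiles, 1 for no shared taxa."""
    taxa = set(profile_a) | set(profile_b)
    numerator = sum(abs(profile_a.get(t, 0.0) - profile_b.get(t, 0.0)) for t in taxa)
    denominator = sum(profile_a.get(t, 0.0) + profile_b.get(t, 0.0) for t in taxa)
    if denominator == 0:
        raise ValueError("both profiles are empty")
    return numerator / denominator


# Toy relative abundances (invented for illustration; not HMP data).
SAMPLES = {
    ("A", "stool"): {"Bacteroidetes": 0.60, "Firmicutes": 0.35, "Proteobacteria": 0.05},
    ("B", "stool"): {"Bacteroidetes": 0.20, "Firmicutes": 0.70, "Proteobacteria": 0.10},
    ("A", "skin"): {"Actinobacteria": 0.55, "Firmicutes": 0.30, "Proteobacteria": 0.15},
    ("B", "skin"): {"Actinobacteria": 0.45, "Firmicutes": 0.35, "Proteobacteria": 0.20},
}

for (key_a, prof_a), (key_b, prof_b) in combinations(SAMPLES.items(), 2):
    label = "same site" if key_a[1] == key_b[1] else "different site"
    print(f"{key_a} vs {key_b} ({label}): {bray_curtis(prof_a, prof_b):.2f}")
```

In this toy data set, the two stool samples differ substantially from each other yet remain closer to one another than either is to a skin sample from the same person, mirroring the clustering by body area described for Fig. 484-2A.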
While the HMP provided the first large-scale catalogue of the microbiome in multiple people and at many different body sites, the amount of data generated by what, at the time, was by far the largest microbiome study has been dwarfed by subsequent studies. These more recent studies have confirmed the HMP’s major tenets: the composition of the microbiota differs by body site, there is tremendous interindividual variation, and the microbial gene content is relatively conserved irrespective of the body site or individual. No microbial species are ubiquitous in all individuals and at all body sites, but some species are highly prevalent at a given body site: in the HMP study, Staphylococcus epidermidis was present in 93% of nares samples and Escherichia coli in 61% of stool samples. These findings highlight the remarkable personalization of the human microbiome. While the human genome is typically >99.5% identical in different people, the microbiotas of two individuals may not overlap at all. Although the “precision medicine” approach currently focuses on teasing out how differences in the human genome relate to different clinical end points, the human microbiome clearly represents a critical component for consideration.

FIGURE 484-2 The human microbiome exhibits significant taxonomic variability among body sites and between individuals while maintaining core metabolic pathways. A. Principal coordinates (PC) plot showing variation among samples demonstrates that primary clustering is by body area, with the oral, gastrointestinal, skin, and urogenital habitats separate; the nares habitat bridges oral and skin habitats. Each circle represents an individual sample. B, C. Vertical bars represent microbiome samples by body habitat, with each bar within a given body site representing a different individual. Bars indicate relative abundances colored by microbial phyla (B) and metabolic pathways (C). The legend on the right indicates the most abundant phyla/pathways. RC, retroauricular crease. (Reproduced with permission from Human Microbiome Project Consortium: Structure, function and diversity of the healthy human microbiome. Nature 486:207, 2012.)
[Figure labels: panel A axes PC1 (13%) and PC2 (4.4%), with samples grouped as gastrointestinal, urogenital, oral, skin, and nasal; panels B and C show anterior nares, RC, buccal mucosa, supragingival plaque, tongue dorsum, stool, and posterior fornix. Legend, phyla: Firmicutes, Actinobacteria, Bacteroidetes, Proteobacteria, Fusobacteria, Tenericutes, Spirochaetes, Cyanobacteria, Verrucomicrobia, TM7. Legend, metabolic pathways: central carbohydrate metabolism, cofactor and vitamin biosynthesis, oligosaccharide and polyol transport system, purine metabolism, ATP synthesis, phosphate and amino acid transport system, aminoacyl tRNA, pyrimidine metabolism, ribosome, aromatic amino acid metabolism.]

■ THE MICROBIOTA BY THE NUMBERS
It has long been known that the human-associated microbiota is numerically dense. Leeuwenhoek estimated that there were more “animals living in the scum on the teeth in man’s mouth than there are men in a kingdom.” Specific enumeration of the components of the microbiota has been challenging, in part because of its variability across time, space (body region), and clinical conditions. Moreover, the majority of human-associated microbes are not readily cultivable, a situation that raises questions about the best methodology for such quantitation.
Initial back-of-the-envelope calculations performed in the 1970s suggested that there were roughly tenfold more bacteria in the body than there were human cells. This rather astounding estimate suggested that humans are really only ~10% “human” and that by far the greatest part of the holobiont is represented by microbes. This stark numerical discrepancy has prompted some to question “who parasitizes whom.” However, a more recent study suggested that there are “only” ~1.3 times more bacteria in the body than there are human cells and thus that humans are ~56% “bacterial.” Of note, this study does not include the numbers of viruses (known to generally be approximately tenfold more abundant than other microbes), fungi, or Archaea. Given these additional microorganisms, the notion that microbes constitute >90% of the cells present in a human body is likely correct. These ratios are even starker when one considers the genetic potential of human cells versus that of commensal organisms. In contrast to the ~20,000 genes in the human genome, the estimated total of 2,000,000 genes in the microbiota (which together constitute the microbiome) indicates that the human genome contributes <1% to the total genetic potential of the overall holobiont. Most microbiome studies have focused almost exclusively on the bacterial component; much remains to be learned about the functional interplay of bacteria, viruses, fungi, and Archaea and how these other classes of microorganisms impact human health.

In terms of overall diversity, >10,000 different bacterial species are present in the human microbiota; the intestines alone contain >1000 species. At any given time, the body of any given individual harbors 500–1000 bacterial species, with 100–200 bacterial species in the gut alone. If one considers different strains of the same bacterial species, which may be functionally different from one another, the diversity of the microbiota is probably at least an order of magnitude greater. Although marked diversity exists at the strain and species level, only a limited number of bacterial phyla are generally found in the human microbiota at any given body site (Fig. 484-3).

■ INFLUENCES ON THE MICROBIOTA
An individual’s specific microbial configuration is dynamic and is quickly altered in response to subtle changes in the microenvironments in which the bacteria reside. On a day-to-day basis, these changes usually reflect alterations in the relative abundance of the various microbes. However, some exposures have a greater effect on

Nares Buccal mucosa GI/Stool KEY Actinobacteria Bacteroidetes Fusobacteria Proteobacteria Firmicutes Other FIGURE 484-3 Different anatomic sites harbor very different microbiomes. The figure indicates the relative proportion of sequences determined at the taxonomic phylum level at six anatomic sites. (Data for stool, vagina, nares, buccal mucosa, and supragingival plaque are from the Human Microbiome Project; data for the skin are from EA Grice et al: Topographical and temporal diversity of the human skin microbiome. Science 324:1190, 2009.) the microbiota and can shift the microbial population to a new equi librium via the loss of specific species and/or the acquisition of others; this new microbial equilibrium can be associated with either health or a disease state (Fig. 484-4). Identification of the factors that influence the microbiota’s composition is critical to an understanding of what leads to and controls intra- and interindividual variation. Moreover, an understanding of the influences on the microbiota will facilitate the Healthy state 2 Unstable Healthy state 1 Disease state Stable Current microbial state FIGURE 484-4 A stability landscape of the human microbial ecosystem. A stable state, illustrated as a depression in the landscape, can be associated with either a healthy state or a disease state. The topology of an individual’s landscape reflects that person’s genetics, age, diet, medications, medical history, and lifestyle. The position of the green ball represents the current microbial state. Clinical changes (e.g., administration of antibiotics, development of disease) can influence both the current state and the overall topology.

design and proper interpretation of microbiota studies. While it is clear that the microbiota can be altered through these various mecha nisms, it is not yet clear whether these changes are biologically significant.
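The cell-count and gene-count proportions quoted at the start of this discussion follow from simple ratios. The short Python sketch below reproduces them; the three input figures are the ones given in the text, while the helper name `fraction_of_total` is only an illustrative choice.

```python
# Inputs taken from the text; the helper below is illustrative only.
BACTERIA_PER_HUMAN_CELL = 1.3    # revised bacteria-to-human-cell ratio
OLD_BACTERIA_PER_HUMAN_CELL = 10.0  # 1970s "tenfold" estimate
HUMAN_GENES = 20_000             # approximate human gene count
MICROBIAL_GENES = 2_000_000      # estimated genes in the microbiota


def fraction_of_total(part: float, other: float) -> float:
    """Return part / (part + other), i.e. the share of the combined total."""
    return part / (part + other)


# Share of cells that are bacterial under the revised 1.3:1 ratio.
bacterial_cell_share = fraction_of_total(BACTERIA_PER_HUMAN_CELL, 1.0)

# Share of cells that are human under the older 10:1 ratio.
old_human_cell_share = fraction_of_total(1.0, OLD_BACTERIA_PER_HUMAN_CELL)

# Share of the holobiont's genes contributed by the human genome.
human_gene_share = fraction_of_total(HUMAN_GENES, MICROBIAL_GENES)

print(f"bacterial share of cells (1.3:1): {bacterial_cell_share:.1%}")
print(f"human share of cells (10:1):      {old_human_cell_share:.1%}")
print(f"human share of genes:             {human_gene_share:.2%}")
```

The results are consistent with the rounded values in the text: roughly 56% bacterial cells at a 1.3:1 ratio, roughly 10% human cells at a 10:1 ratio, and a human contribution of just under 1% of genes.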

Genetics

Studies of monozygotic and dizygotic twins have revealed that host genetics have a small but statistically significant effect on the microbiota’s composition. Notably, some taxa, such as Christensenella species, are more heritable than others. A cross-sectional study of >1000 healthy individuals who have distinct ancestral origins and a relatively shared common environment confirmed a weak association between host genetics and the microbiome but highlighted that environmental factors are more prominent modulators of the microbiome. That said, the host’s genetic contribution to the microbiota, albeit small, may be meaningful. Studies in mice have demonstrated that genetic variation in the major histocompatibility complex, a specific set of immune-related genes, leads to changes in the microbiota that alter susceptibility to an autoimmune disease. These studies offer a proof of concept for the notion that the genetic predisposition observed for certain diseases may actually be mediated by indirect alterations in the microbiota.

Age

Burgeoning evidence now indicates that microbial exposure may begin in utero: bacterial DNA from bacteria typically associated with the oral microbiota has been identified in otherwise healthy placentas, in amniotic fluid obtained at early stages of gestation, and in meconium of term newborns. Although some controversy persists about whether these results reflect contamination and/or the presence of nonviable bacteria, they raise the possibility that human exposure to the microbial world begins before birth. The delivery mode (vaginal vs cesarean section) and the method of feeding (breast milk vs formula, timing of solid food introduction) are major determinants of an infant’s early microbiota.
After birth, the infant’s microbiota goes through a stereotyped succession process; with increases in bacterial diversity and functional capacity, the child’s microbiota resembles that of an adult by the age of 2–3 years. Cross-sectional studies that have examined the microbiota across the entire age spectrum have revealed a general stability of the fecal microbiota after 2–3 years of age; however, the microbiota of the elderly (persons >80 years of age) demonstrates notable differences from that of their younger counterparts, with increases in Bacteroides and Eubacterium species and decreases in the bacterial family Lachnospiraceae. Although there has been significant interest in defining microbial features that predispose towards longevity, there has been poor concordance of findings between studies, potentially due to very different populations being studied.

Diet

Diet is a strong determinant of human health. The impact of diet is mediated, in part, by its effects on the composition of the gut microbiota. This makes intuitive sense, as the human diet provides nutrients needed not only by our own cells but also by the microbes living in the alimentary tract. In young children, this dietary influence is marked by major shifts (e.g., a decrease in Bifidobacterium species) in the intestinal microbiota that occur at weaning and with the introduction of solid food. In adults, long-term dietary patterns are associated with relatively stable microbial compositions. However, drastic changes in short-term macronutrient availability cause rapid (within 1 day) and reproducible fluctuations in the fecal microbiota that reflect the biologic processes needed to degrade and metabolize the nutrients in the new diet. For example, vegetarian diets are associated with a microbiota that has an increased ability to metabolize plant polysaccharides (e.g., Roseburia species, Eubacterium rectale, Ruminococcus bromii), while animal-based diets result in an increased abundance of bile-tolerant organisms (e.g., Alistipes, Bilophila, and Bacteroides species). After a dietary intervention ends, the microbial communities revert to their previous states, probably because the individual resumes their typical diet. Taken together, dietary studies confirm that the microbiota is highly adaptable and varies in relation to changes in the diet. Of note, virtually all these studies have focused on how the diet influences the fecal microbiota, although emerging evidence suggests that diet may similarly influence the microbiota at some nonintestinal sites.

Drugs

Virtually all drugs have the capacity to change the microbiota by altering the chemical landscape in which the microorganisms live (e.g., statins, bile acid sequestrants), modulating the host’s ability to recognize and react to microbes (e.g., immunosuppressants), and/or directly interfering with the microbiota’s constituents (e.g., antibiotics). These potential effects have made critical interpretation of microbiota studies much more difficult. A prominent study that claimed to identify a fecal microbiota signature associated with type 2 diabetes was later found actually to have identified a signature for patients taking metformin instead; the effects of this drug on the microbiota were far greater than the effects of the disease itself. These results highlight the importance of controlling for clinical variables in microbiota studies.
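The metformin example above is a case of confounding, and a stratified comparison is one simple way to expose it. The Python sketch below uses a small, entirely hypothetical data set (the abundance values, group sizes, and the function name `mean_abundance` are invented for illustration and are not taken from any study) in which a taxon responds only to the drug, yet appears to track the disease when drug use is ignored.

```python
from statistics import mean

# Hypothetical toy records: (has_type2_diabetes, takes_metformin, taxon_abundance).
# By construction, abundance depends only on metformin use, not on disease status.
samples = [
    (True, True, 10.0), (True, True, 10.0), (True, True, 10.0),
    (True, False, 4.0),
    (False, False, 4.0), (False, False, 4.0),
    (False, False, 4.0), (False, False, 4.0),
]


def mean_abundance(records, diabetic, metformin=None):
    """Mean taxon abundance for one disease group, optionally within one drug stratum."""
    values = [
        abundance
        for has_t2d, on_drug, abundance in records
        if has_t2d == diabetic and (metformin is None or on_drug == metformin)
    ]
    return mean(values)


# Naive comparison: patients vs controls, ignoring medication.
naive_difference = mean_abundance(samples, True) - mean_abundance(samples, False)

# Stratified comparison: patients vs controls among people NOT taking metformin.
stratified_difference = (
    mean_abundance(samples, True, metformin=False)
    - mean_abundance(samples, False, metformin=False)
)

print(f"naive difference:      {naive_difference:.1f}")
print(f"stratified difference: {stratified_difference:.1f}")
```

In this constructed example the apparent disease signature (a difference of 4.5) disappears once the comparison is restricted to individuals not taking the drug (a difference of 0.0). Real analyses would typically use regression or matching rather than a single stratum.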

Antibiotics are the most obvious and best-studied class of drugs that modulate the microbiota. Multiple groups have demonstrated that antibiotics exert a considerable effect on the gut microbiota by depleting antibiotic-sensitive strains. What is more surprising is that many strains resistant to the antibiotic tested are also eliminated. For example, treatment with ciprofloxacin, which has little to no activity against clinically relevant anaerobes, leads to a loss of roughly one-third of the bacterial taxa in the gut. This broad effect is likely mediated by the depletion of certain “keystone” species that are required for the persistence of other, unrelated species and highlights the intricate microbe–microbe interactions that are fundamental to maintenance of the overall microbial community. While many of the observed antibiotic effects (e.g., loss of specific taxa) are shared across many different individuals, some effects vary greatly among people. For example, studies found that microbiota recovery following antibiotic treatment differed significantly in terms of timing and degree. The microbiota of most healthy people who received ciprofloxacin for 5 days had completely recovered within 4 weeks, whereas microbiologic changes lasted up to 6 months in other individuals. Moreover, the degree of variation was compounded by repeated antibiotic administration, with fewer individuals reverting to their baseline microbiota after a second course of ciprofloxacin given 6 months after the first. These findings are consistent with those of microbial ecology experiments, which also showed that this type of repeated disturbance leads to less predictable results.

Lifestyle

Many seemingly innocuous lifestyle decisions can impact the human microbiota.
For example, a person’s skin and fecal microbiotas are more similar to those of their household members, regardless of genetic relatedness, than to those of residents of different households. The degree of similarity in skin microbiotas is even greater if a dog also lives in the home; in contrast, the presence of a cat or a young child does not accentuate this microbial relatedness. The presumption is that the dog serves as a more effective “vector” for transmitting microbes during its frequent direct contact with adults in the household. The type of setting in which a person lives also impacts the microbiota. Living in a rural or farm setting leads to a different fecal microbiota than living in an urban environment. Similarly, the individual’s country of residence affects the microbiota. An analysis of daily fecal samples from an individual who temporarily (i.e., for a couple of months) moved from the United States to Thailand demonstrated a large shift in the fecal microbiota that coincided with arrival in Thailand and a reversion in most respects to the “American” microbial configuration upon return to the United States. Similarly, immigration to the United States “westernizes” the microbiome of individuals coming from non-Western countries.

These geography-driven changes probably reflect a combination of environmental and dietary differences between locations.

Circadian Rhythms

Many human biologic processes follow a circadian clock; aspects of physiology are tuned by external cues, including the degree and timing of ambient light, temperature, and availability of nutrients. This endogenous biologic clock enables animals to efficiently adapt to changing environmental conditions. Similarly, the microbiota maintains a circadian rhythm that is linked to, and helps entrain, the host’s circadian clock. If circadian oscillations are disrupted in the host, they are similarly disrupted in the microbiota, and vice versa. These bacterial oscillations occur at the level of spatial localization within the intestine, relative species abundance, and bacterial metabolite secretion. Work in the 1960s showed that mice exhibited daily periodicity in susceptibility to challenge with either Streptococcus pneumoniae or E. coli lipopolysaccharide (LPS). Although the fundamental basis for this periodicity was not known at the time, it is likely to be related, in part, to the microbial circadian clock. Derangements of these microbial oscillations have also been linked to the development of metabolic diseases and may underlie some of the health hazards associated with shift work and jet lag.

THE MICROBIOTA AND DISEASE

■ THE HYGIENE HYPOTHESIS

Over the past few decades, abundant epidemiologic data have revealed an inverse correlation between exposure to microbes and the incidence of autoimmune and/or atopic diseases (Fig. 484-5). This type of epidemiologic correlation led to the proposal of the “hygiene hypothesis” in 1989.
Initially, this hypothesis focused on the development of atopic diseases in young children, with the idea that these epidemiologic observations could “be explained if allergic diseases were prevented by infection in early childhood, transmitted by unhygienic contact with older siblings, or acquired prenatally from a mother infected by contact with her older children.”¹ In fact, this notion that differences in living conditions and environmental exposures contribute to susceptibility to hay fever (summer catarrh) dates back to at least the early nineteenth century. The hygiene hypothesis has continued to evolve over the past three decades and now posits that inadequacies in microbial exposure, in combination with genetic susceptibilities, lead to a collapse of the normally highly coordinated, homeostatic immune response. At its core, the hygiene hypothesis holds that specific early-life microbial exposures are required to prevent subsequent disease and that the “westernization” of society has led to a decrease in such exposures. This concept is being applied beyond atopic diseases to other inflammatory and autoimmune diseases and is thought to reflect processes that occur in later life as well.

¹D. Strachan: BMJ 299:1259, 1989.

■ RELATIONSHIP BETWEEN THE MICROBIOTA AND SPECIFIC DISEASE STATES

The ideas inherent in the hygiene hypothesis (in sum, that microbial exposure can affect long-term health outcomes) laid the theoretical foundation for translational microbiome studies. While most of the studies described earlier sought to describe how the microbiota responds to specific and often transient influences (e.g., a course of antibiotics, dietary interventions, travel), a multitude of studies have characterized the microbiota in patients with various diseases in the hope that a better understanding of the nature of disease-specific microbial communities will provide insight into disease pathogenesis and potentially uncover novel treatment modalities. Remarkably, virtually all these studies have demonstrated differences between the microbiotas of healthy controls and patients, irrespective of the specific disease process examined. Although it is difficult to generalize across all studies, a couple of general themes have emerged. First, disease states are typically associated with microbiotas that are less diverse than those of healthy individuals. This loss of diversity can be measured either as a decrease in the number of species (alpha diversity; often measured as the number of operational taxonomic units or amplicon sequence variants, which are the bioinformatic equivalent of species) or as a reduction in the microbial relatedness of the species present (beta diversity). Often, both alpha and beta diversity decrease in the setting of disease. Second, states of inflammation, regardless of site or underlying disease process, are often associated with an increase in the relative abundance of the bacterial family Enterobacteriaceae and a decrease in the relative abundance of Lachnospiraceae.

FIGURE 484-5 There was an inverse relationship between the incidence of select infectious diseases and the incidence of autoimmune disorders during the latter half of the twentieth century. A. Relative incidence of prototypical infectious diseases from 1950 to 2000 (y-axis: incidence of infectious diseases, %; series: rheumatic fever, hepatitis A, tuberculosis, measles, mumps). B. Relative incidence of select autoimmune disorders from 1950 to 2000 (y-axis: incidence of immune disorders, %; series: Crohn’s disease, multiple sclerosis, type 1 diabetes, asthma). (From JF Bach: The effect of infections on susceptibility to autoimmune and allergic diseases. N Engl J Med 347:911, 2002. Copyright © 2002, Massachusetts Medical Society. Reprinted with permission from Massachusetts Medical Society.)

Dissecting Correlation and Causality

Given that most of these investigations have been designed as case-control studies, it is difficult to determine whether microbiologic findings are the cause or the effect of the disease. Even studies that examine treatment-naïve patients at the time of initial diagnosis are still confounded by this “chicken or egg” issue. Moreover, prospective, longitudinal clinical studies, which are still rare in the microbiome field, may simply yield correlations between the microbiome and subclinical disease rather than necessarily proving causality.
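The diversity measures mentioned above can be made concrete with a brief Python sketch. It is a minimal illustration under stated assumptions: the taxon counts are hypothetical, richness and the Shannon index are used as two common alpha-diversity summaries, and Bray-Curtis dissimilarity is used as one common between-sample (beta-diversity) measure. None of these specific metrics is named in the text, and dedicated packages offer many alternatives.

```python
import math


def richness(counts):
    """Alpha diversity as the number of taxa observed (count > 0)."""
    return sum(1 for c in counts.values() if c > 0)


def shannon(counts):
    """Alpha diversity as the Shannon index (natural log) of relative abundances."""
    total = sum(counts.values())
    return -sum(
        (c / total) * math.log(c / total) for c in counts.values() if c > 0
    )


def bray_curtis(a, b):
    """Between-sample dissimilarity: 0 for identical counts, 1 for no shared taxa."""
    taxa = set(a) | set(b)
    shared = sum(min(a.get(t, 0), b.get(t, 0)) for t in taxa)
    return 1.0 - 2.0 * shared / (sum(a.values()) + sum(b.values()))


# Hypothetical read counts per taxon (OTU or ASV) for two samples.
healthy = {"taxon_A": 25, "taxon_B": 25, "taxon_C": 25, "taxon_D": 25}
disease = {"taxon_A": 90, "taxon_B": 10}

print("richness:", richness(healthy), "vs", richness(disease))
print(f"Shannon:  {shannon(healthy):.3f} vs {shannon(disease):.3f}")
print(f"Bray-Curtis dissimilarity: {bray_curtis(healthy, disease):.2f}")
```

In this toy comparison the second sample has fewer taxa and a more uneven distribution, so both alpha-diversity summaries are lower, and the two samples differ substantially in composition.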
Experiments in animals, specifically studies using gnotobiotic mice (GF mice that have been colonized with specified microbial communities), have been critical in this regard as they allow investigation of specific differences in microbial components while controlling for the host’s genetics, diet, and housing conditions. Moreover, human microbes can be transplanted into gnotobiotic mice to permit in-depth mechanistic studies of how these microbial communities affect disease pathogenesis. This marriage of human samples and animal experiments has facilitated the identification of causal roles played by some microbes in disease pathogenesis; these findings provide a critical proof of concept for the interplay of the microbiota with human health. However, the vast majority of microbiome studies are still at the level of correlation.

The next several sections describe the clinical and animal data for many different disease processes. Given the voluminous and rapidly changing nature of this field, it is impossible to cover all of the disease associations known to date; rather, the following discussion represents a combination of the leading exemplars of microbiome data and nascent areas of significant clinical interest. In all cases, the hope is that further study of the role of the microbiota will provide novel diagnostics, new therapeutic modalities, and/or additional insight into disease pathogenesis.

Gastrointestinal Diseases

Given that the intestines harbor the largest number and greatest diversity of organisms in the body, much work has focused on how the microbiota impacts gastrointestinal diseases. Even though the luminal surface area of the gastrointestinal tract is 30–40 square meters (~90% of which is contained within the small intestine) and features marked anatomic and functional differences that result in many discrete macro- and micro-ecosystems, stool is often used as a surrogate for the intestinal microbiota given the relative ease of collecting samples. A few studies that have compared the microbial profile in stool with the mucosa-adherent organisms present in biopsy samples have demonstrated that stool is, in fact, a reasonable proxy for biopsy samples; however, the relative microbial “noise” present in stool can sometimes overwhelm the “signal,” making biopsy samples more informative for some scientific questions. The key issue is to ensure that the biopsy samples evaluated represent relatively similar intestinal regions, as there are significant differences between the organisms present in the crypt and the tip of the villus and between microbes found in the ascending versus the descending colon. Newer technologies (e.g., smart capsules) are being developed that will allow for noninvasive sampling of microbial communities along the length of the gastrointestinal tract, which will provide new insight into regional differences in host–microbiota interactions.

OBESITY

Obesity is a worsening epidemic throughout the world, and multiple studies have linked the composition of the intestinal microbiota to the development of obesity in animal models and in humans. Indeed, many of the initial translational microbiome studies performed in mice at the beginning of the twenty-first century focused on obesity. Gnotobiotic mouse studies have demonstrated that the gut microbiota impacts host metabolism, resulting in body weight and adiposity changes, through several different mechanisms: the microbiota impacts the amount of energy extracted from the diet, promotes small-intestinal absorption of dietary fatty acids, regulates expression of lipid metabolism genes in the intestines, and induces hepatic lipogenesis and synthesis of triglycerides. Consistent with these findings, GF mice are resistant to diet-induced obesity, which establishes the requirement of the microbiota in the development of obesity. Over the past ~15 years, numerous human studies examining the relationship between the microbiome and obesity have been completed, with mixed results. Although initial studies suggested obesity was associated with a lower ratio of the relative abundance of Bacteroidetes to Firmicutes, this has not held up in subsequent studies. Beyond this ratio of major bacterial phyla, obesity has been linked to a microbiome with a lower alpha diversity. A meta-analysis of 10 studies including nearly 3000 individuals revealed an apparent lack of relationship between the Bacteroidetes/Firmicutes ratio and obesity, although obesity was associated with ~2% lower diversity, a difference that is statistically significant but of unclear biologic significance. This finding highlights a problem common to microbiome studies: there is no sense as to what magnitude of change is biologically meaningful.
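The distinction between statistical and biologic significance drawn above can be illustrated with a small Python sketch. Only the ~2% difference and the approximate total of 3000 individuals come from the text; the mean diversity value, the standard deviation, the equal split into two groups, and the use of a simple two-sample z-test are all assumptions made for illustration and do not reproduce the meta-analysis itself.

```python
import math


def two_sample_z(mean_a, mean_b, sd, n_per_group):
    """Two-sided z-test for a difference in means, assuming equal SD and group size."""
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = (mean_a - mean_b) / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical values: a diversity index of 4.00 in lean individuals, 2% lower in
# obese individuals, with an assumed between-person standard deviation of 0.60.
LEAN_MEAN = 4.00
OBESE_MEAN = LEAN_MEAN * 0.98
ASSUMED_SD = 0.60

# Same 2% difference evaluated at two sample sizes.
z_large, p_large = two_sample_z(LEAN_MEAN, OBESE_MEAN, ASSUMED_SD, 1500)
z_small, p_small = two_sample_z(LEAN_MEAN, OBESE_MEAN, ASSUMED_SD, 50)

# Standardized effect size (Cohen's d) does not depend on sample size.
cohens_d = (LEAN_MEAN - OBESE_MEAN) / ASSUMED_SD

print(f"n=1500 per group: z={z_large:.2f}, p={p_large:.4f}")
print(f"n=50 per group:   z={z_small:.2f}, p={p_small:.2f}")
print(f"Cohen's d: {cohens_d:.2f}")
```

Under these assumptions the same small difference is highly significant with 1500 individuals per group and not significant with 50 per group, while the standardized effect size stays small in both cases. The p value therefore reflects sample size as much as the magnitude of the change, which is why it cannot by itself indicate biologic importance.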
Ultimately, although murine studies have indicated a causal link between the microbiota and obesity, the human data are less convincing, and their significance may be limited because the studies primarily examined only high-level taxonomic information rather than also assessing differences in bacterial products or metabolites. The rise in obesity has elicited a plethora of ideas about the type of diet that might be most successful in leading to sustained weight loss. However, it has become clear that the same dietary ingredient can have highly diverse effects on blood glucose measurements in different people and that this effect is mediated largely by the microbiome. These observations suggest that the “optimal” diet needs to be individualized in the context of the person’s microbiome, which itself may continue to change over the course of the diet. Moreover, the microbiota may also influence dietary preferences, which suggests important feedback loops between the microbiome and diet.

MALNUTRITION

Representing the other end of the metabolic spectrum from obesity, malnutrition is also linked to an altered microbiome. Analysis of Malawian twin pairs (≤3 years of age) who were discordant for kwashiorkor, a severe form of malnutrition, revealed that kwashiorkor is associated with a microbiologically “immature” fecal microbiota that resembles that of a chronologically younger child. Transplantation of the fecal microbiota from these discordant twins into gnotobiotic mice that were fed a diet similar in composition to a typical Malawian diet established that the kwashiorkor-associated microbiome is causally related to poor weight gain. Subsequent studies demonstrated these same general trends in malnourished Bangladeshi children. Investigators were able to identify five bacterial species (Faecalibacterium prausnitzii, Ruminococcus gnavus, Clostridium nexile, Clostridium symbiosum, and Dorea formicigenerans) that, when administered together as a “cocktail” to mice colonized with a kwashiorkor-associated microbiome, were able to prevent growth impairments. Moreover, children with moderate acute malnutrition who were fed a therapeutic food purposefully designed to alter the microbiota in defined ways showed improved growth. These results demonstrate that rationally designed modulation of the microbiota may lead to improved health outcomes.

INFLAMMATORY BOWEL DISEASE

Ulcerative colitis and Crohn’s disease, the two predominant forms of inflammatory bowel disease (IBD), are chronic gastrointestinal inflammatory conditions that differ in their locations and patterns of inflammation (Chap. 337). The following observations have led to the suggestion that IBD is the result of an immune response to a dysbiotic microbiota in a genetically susceptible individual: genes account for only ~20% of susceptibility to IBD (and many of the relevant genes are related to host–microbe interactions), antibiotic treatment reduces the clinical severity of disease, and relapses of Crohn’s disease are prevented by diversion of the fecal stream. While the microbiota clearly is not the only driver of disease, it is considered to be an important element. Accordingly, numerous animal and clinical studies have been designed to tease out the nature of the relationship between the microbiota and IBD.

Most of these studies have focused on comparing the microbiome’s composition in IBD patients with that in healthy controls, concentrating on microbial diversity and specific bacterial taxa that are associated with health or disease. Unfortunately, few, if any, results have been universally obtained, probably because of differences in study design, inclusion criteria, and methodology (e.g., the use of stool, rectal swabs, or biopsy samples; the choice of sequencing primers; the analysis pipeline). Even with these differences among studies, patients with IBD typically have reduced alpha and beta diversity in their fecal microbiotas. Moreover, Clostridium clusters IV and XIVa, which are polyphyletic and encompass several different bacterial families, are generally reduced in patients with IBD. F. prausnitzii is a notable example from Clostridium cluster IV that is often underrepresented in the stool of patients who have Crohn’s disease, with more mixed results in biopsy samples.
The bacterial family Lachnospiraceae, which is largely contained in Clostridium cluster XIVa, and other butyrate-producing organisms are also reduced in the stool of patients with IBD. Some of these species produce butyrate by using acetate generated by other members of the microbiome, and some of these acetate-producing species are similarly reduced (e.g., Ruminococcus albus). These complex interactions and dependencies among bacterial species pose unique challenges to definitive ascertainment of the cause–effect relationships between microbes and disease. Even before researchers were able to assess the entire microbiome at once, they often noted that patients with Crohn’s disease had a higher representation of adherent invasive E. coli in the ileal mucosa, an observation consistent with the increased abundance of Enterobacteriaceae seen in sequencing-based microbiome studies. Beyond bacteria, burgeoning evidence supports a role for Caudovirales bacteriophages in IBD pathogenesis, though these findings may merely reflect the underlying dysbiosis related to the loss of bacterial diversity in IBD. Moreover, dysregulation of the fungal component of the microbiota (the mycobiota) alters the mucosal immune system and is linked to IBD disease severity. It is still unclear whether any of these microbial associations reflect the cause of IBD or merely serve as biomarkers of disease.

Studies of antibiotic-treated mice and gnotobiotic mice colonized with IBD-associated microbiotas have been useful in confirming that the microbiota affects colitis severity. Several bacterial species have been identified as either promoting colitis in mice (e.g., Klebsiella pneumoniae, Prevotella copri) or protecting against it (e.g., Bacteroides fragilis, Clostridium species); however, these organisms do not always correlate with the taxa identified as differentially abundant across multiple clinical studies. In contrast, IgA-coated commensal organisms isolated from patients with IBD promote more severe colitis in mice than either IgA-uncoated bacteria from patients with IBD or IgA-coated bacteria from healthy controls. These data suggest that functional categorization of the microbiota based on immune recognition (e.g., IgA coating) may be a useful approach for identifying pathogenic organisms.

Cardiovascular Disease

Inflammation helps drive the pathogenesis of atherosclerosis, and it has long been postulated that microbes are involved in the atherosclerotic process. Early work demonstrated that patients with cardiovascular disease have higher titers of antibody to Chlamydia pneumoniae than control patients, that C. pneumoniae is present within atherosclerotic lesions, and that C. pneumoniae can both initiate and exacerbate atherosclerotic lesions in animal models. This type of analysis has been extended to other bacteria, such as Porphyromonas gingivalis, with the idea that multiple different bacteria may play some role in the pathogenesis of atherosclerosis.

Studies have demonstrated clinical correlations between serum levels of trimethylamine N-oxide (TMAO) and atherosclerotic heart disease.
Animal studies have confirmed that transfer of the gut microbiota from atherosclerosis-susceptible strains of mice to atherosclerosis-resistant animals leads to increased serum levels of TMAO and a dietary choline-dependent increase in atherosclerotic plaques; this observation confirms the role of the gut microbiota in the generation of TMAO and atherosclerosis. Given that red meat, eggs, and dairy products are important sources of carnitine and choline (both precursors of TMAO), it is not surprising that levels of TMAO are higher in omnivores than in vegans. The gut microbiota converts carnitine into the intermediary metabolite γ-butyrobetaine, which it further metabolizes, in a diet-dependent fashion, into trimethylamine (TMA); hepatic flavin-containing monooxygenases then transform TMA into TMAO. Moreover, treatment of atherosclerosis-susceptible strains of mice with a structural analogue of choline that inhibits the first enzymatic step in TMAO formation leads to decreased circulating TMAO levels and, more importantly, restrains macrophage foam-cell formation and atherosclerotic lesion development. In a study of >4000 patients, plasma TMAO levels were also predictive of incident thrombosis risk (myocardial infarction, stroke). Gnotobiotic animals were used to demonstrate that this risk was dependent on the microbiota; although eight bacterial taxa were identified as being associated with both plasma TMAO levels and thrombotic risk, organisms with choline utilization genes that represent the first step of TMAO production were not more abundant in animals at greater risk for thrombosis. This discrepancy highlights the complexity of the microbiota and suggests that other aspects of the overall dynamics of the microbial community may be in play.

Oncology

Studies exploring the link between the microbiota and cancer have demonstrated that specific members of the microbiota can affect treatment efficacy in both a positive and a negative manner.
For example, therapy with antibody to programmed cell death ligand 1 (anti-PD-L1) has proven highly effective for many different cancers (Chap. 78); however, a significant proportion of patients do not respond even when their tumors have high PD-L1 expression levels. Three groups have independently performed clinical studies, sometimes coupled with gnotobiotic mouse experiments to verify causal relationships, to demonstrate that specific bacteria can potentiate checkpoint blockade in melanoma, non-small-cell lung cancer, and renal cell carcinoma. Intriguingly, these groups identified different bacteria (Bifidobacterium, Faecalibacterium, and Akkermansia species) as being associated with the anticancer effects, even when the same oncologic process was being studied. The biologic factors driving these differences are not yet clear but may relate to differences in adjunctive therapies, geography, and/or other as-yet-unidentified factors. Although these seemingly disparate findings raise concern about the generalizability of microbiome studies, it may be that identifying relevant bacterial species, as opposed to their bioactive molecules, does not offer sufficient granularity for comparison across studies. The clinical relevance of the microbiota in this process was highlighted by proof-of-concept clinical trials of fecal microbiota transplantation (FMT), the “transplantation” of stool from one individual into another. In these trials, a few patients who had not responded to anti-PD-1 therapy derived clinical benefit after they received stool from patients who had responded to anti-PD-1 therapy; these findings still require confirmation in larger clinical trials.

In a separate set of studies, the efficacy of therapy with antibody to cytotoxic T lymphocyte–associated antigen 4 (anti-CTLA-4) was associated with T-cell responses specific for either Bacteroides thetaiotaomicron or B. fragilis. In particular, administration of B. fragilis to GF or antibiotic-treated mice restored the normally absent anticancer response to anti-CTLA-4 therapy. While these examples demonstrate potentiation of anticancer therapies by the microbiota, other therapies can be antagonized. Some cancers, such as pancreatic ductal adenocarcinoma, contain intratumoral bacteria, particularly Gammaproteobacteria, that can metabolize the chemotherapeutic agent gemcitabine and thereby contribute to the drug resistance of these tumors. In addition, the gut microbiota can increase the half-life of irinotecan, a chemotherapeutic agent commonly used in treating rhabdomyosarcoma and colorectal cancer, by converting an inactive metabolite back to the active form, which leads to increased drug toxicities.
Overall, these examples highlight the microbiota’s critical impact, both direct and indirect, on the efficacy and safety profile of drugs. Many other notable examples have been described (e.g., involving cyclophosphamide, digoxin, levodopa, and sulfasalazine), and many more likely remain to be discovered.

Using advances in computational tools for sequence decontamination and batch effect correction, reanalysis of data repositories generated by The Cancer Genome Atlas (TCGA) Research Network has identified microbial signatures within tumor genome sequences that predicted clinical outcomes in cancer, although these findings have been questioned given potential errors in the computational pipeline. This ongoing controversy highlights the complexity of the bioinformatic pipelines, the need for detailed reference databases, and the difficulty of analyzing samples that have an overall low abundance of microbes. Additional work is required to validate these signatures in prospective cohorts and to understand the biology underlying microbe–cancer interactions within the tumor milieu.

The application of microbiome science to hematopoietic stem cell transplantation (HSCT) is an area of expanding interest, particularly given the significant morbidity and mortality related to graft-versus-host disease (GVHD). In light of studies in the 1970s showing that GF mice developed less frequent and less severe gut GVHD than wild-type mice, clinicians began to use antibiotics to decontaminate the gut of patients undergoing HSCT. This decontamination approach yielded mixed results, probably because of differences in the antibiotic regimens used. The natural history of patients undergoing allogeneic HSCT includes a substantial loss of diversity in the fecal microbiota and intestinal domination (≥30% abundance in the fecal microbiota) by Enterococcus species and other pathogens, with a higher bacterial diversity at time of neutrophil engraftment associated with lower mortality.
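The definition of intestinal domination given above (a single taxon at ≥30% abundance in the fecal microbiota) can be illustrated with a short calculation. The following is a minimal sketch, assuming the input is a table of per-taxon read counts for one stool sample; the function names and the example counts are hypothetical and are not taken from any study cited here.

```python
def relative_abundances(counts):
    """Convert per-taxon read counts into fractions of the whole sample."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("sample contains no reads")
    return {taxon: n / total for taxon, n in counts.items()}


def dominant_taxa(counts, threshold=0.30):
    """Return taxa whose relative abundance meets the domination threshold.

    The 0.30 default mirrors the >=30% definition quoted in the text;
    everything else about this helper is an illustrative assumption.
    """
    rel = relative_abundances(counts)
    return sorted(taxon for taxon, frac in rel.items() if frac >= threshold)


# Hypothetical read counts for a single post-transplant stool sample.
sample = {"Enterococcus": 450, "Blautia": 50, "Bacteroides": 200, "Other": 300}
print(dominant_taxa(sample))
```

With these made-up counts, Enterococcus accounts for 45% of reads and "Other" for exactly 30%, so both meet the threshold; in practice an aggregate bin such as "Other" would be excluded before the check.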
Moreover, a retrospective analysis of ~850 patients undergoing allogeneic HSCT revealed that receipt of imipenem-cilastatin or piperacillin-tazobactam for neutropenic fever was associated with increased GVHD-related mortality at 5 years; this observation suggested that specific bacteria may help protect against GVHD-related mortality. More detailed analyses revealed an association between the abundance of Blautia species and protection against GVHD and mortality, though this correlation is still being examined with regard to its causal relationship. Despite significant interest in examining these microbial relationships with HSCT, little has yet been studied in the context of solid organ transplantation, which likely represents the next frontier of transplantation-related microbiome investigation.

Autoimmune Diseases

The dramatic rise in the incidence of many autoimmune diseases over the past few decades has been far more rapid than can be explained simply by genetic factors (Fig. 484-5). It is increasingly thought that environmental triggers, including the microbiome, are partially responsible for the development of these autoimmune diseases.

TYPE 1 DIABETES Type 1 diabetes (T1D) is an autoimmune disorder characterized by T cell–mediated destruction of insulin-producing pancreatic islets (Chap. 415). There is a clear genetic predisposition for the disease: ~70% of patients with T1D have human leukocyte antigen (HLA) risk alleles. However, only 3–7% of children with these risk alleles actually develop disease, an observation that suggests a role for other environmental factors. Studying a prospective, densely sampled, longitudinal cohort of at-risk, HLA-matched children from Finland and Estonia, investigators detailed changes in the microbiota prior to development of disease. Although only 4 of the 33 children studied developed T1D within the time frame of the study, a marked decrease of ~25% in alpha diversity occurred after seroconversion but before disease diagnosis. The low number of cases in this study unfortunately precluded identification of any specific disease-associated taxa. A follow-up study compared the microbiomes of a larger cohort of these high-risk northern European children with those of low-risk Russian children who lived in geographic proximity. Bacteroides species were more abundant in the high-risk group than in the low-risk group, particularly at early ages. This difference was postulated to be associated with an altered structure of the bacterial LPS to which children were exposed at a young age. It was further suggested that Bacteroides-derived LPS was not able to provide the immunogenic stimulus necessary to prevent T1D.
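To make the reported ~25% decrease in alpha diversity concrete, the sketch below computes one common alpha diversity measure and the percent change between two time points. It assumes the Shannon index as the metric (the text does not say which measure the investigators used), and the helper names and example values are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over taxa with nonzero counts.

    Using Shannon here is an assumption; richness or another alpha
    diversity measure could be substituted.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample contains no reads")
    h = 0.0
    for n in counts:
        if n > 0:
            p = n / total
            h -= p * math.log(p)
    return h


def percent_change(before, after):
    """Signed percent change from a baseline value to a later value."""
    return 100.0 * (after - before) / before


# Hypothetical per-taxon counts before and after seroconversion.
before = shannon_index([25, 25, 25, 25])   # even community of four taxa
after = shannon_index([70, 20, 10, 0])     # less even, one taxon lost
print(round(percent_change(before, after), 1))
```

A negative result indicates a loss of diversity relative to baseline; the magnitude depends entirely on the chosen metric and on sequencing depth, which is one reason such figures are hard to compare across studies.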
These two studies offer attractive, though logistically complicated, options for future clinical investigations aimed at exploring the role of the microbiome. The first approach, longitudinally following individuals who are at high risk for a given disease, may provide insight into host–microbe relationships by mapping temporal changes in the microbiome with disease onset. An important caveat with this type of study, though, is that the associations identified may reflect preclinical disease rather than specifically indicating causality for any observed changes. The second approach illustrates how careful selection of study participants may offer an opportunity to uncover more meaningful associations that can subsequently be experimentally verified.

RHEUMATOID ARTHRITIS Similar to many other autoimmune diseases, rheumatoid arthritis (RA) is a multifactorial disease that comes to clinical attention after an environmental factor triggers symptoms in an individual with preexisting autoantibodies. Multiple lines of evidence support the notion that RA pathogenesis is reliant on the microbiota, including the findings that GF mice do not develop symptoms in several RA models and that antibiotic treatment of mice mitigates RA development. Several taxa (e.g., Bacteroides species, Lactobacillus bifidus, and segmented filamentous bacteria) have been implicated in promoting RA in murine models, and analysis of the fecal microbiota of patients with newly diagnosed RA has indicated that Prevotella copri is a biomarker of disease. That this association with P. copri does not exist for chronic, treated RA or for psoriatic arthritis suggests some specificity for new-onset RA. A major limitation of this approach is that the identified association is shown to be a biomarker of disease (and, in this case, potentially of response to treatment), but no added insight is gained into a possible causal relationship between P. copri and RA.
In fact, many of the patients with new-onset RA had no Prevotella detected, and several of the healthy controls had significant levels of Prevotella. The lack of a strict concordance between the presence (or absence) of a specific taxon and a given disease state argues against a possible causal role.

MULTIPLE SCLEROSIS Epidemiologic studies of twin pairs and at-risk individuals moving between high- and low-risk geographic areas indicate that genetics plays a minor role in multiple sclerosis (MS) susceptibility relative to environmental factors. For example, in monozygotic twin pairs in which one sibling has MS, the other sibling also develops MS in only ~30% of cases. Although MS is a disease of the central nervous system (CNS), there is growing evidence of a link between MS and the microbiota, specifically that of the gut. In murine models of MS, GF and antibiotic-treated animals displayed reduced disease incidence and severity, and gnotobiotic mice harboring the fecal microbiota of individuals with MS, but not that of healthy controls, had increased disease activity. Clinical studies have bioinformatically associated numerous microbial changes with the presence of MS, including prior infection with Epstein-Barr virus. Importantly, a causal role has not yet been established for any of these microbes in MS pathogenesis. Although work relating the microbiome to MS is ongoing, it has opened the door to exploring this link with other neurologic diseases. Animal studies have linked the microbiota with Parkinson’s disease, Alzheimer’s disease, and autism, and there are clinical data assessing fecal microbiomes in relation to a variety of neurologic conditions. It is not yet clear how the gut microbiota communicates with the CNS, i.e., whether communication takes place via bacterial metabolites that travel in the bloodstream and cross the blood-brain barrier, via migration of whole organisms into the CNS, or via feedback through the vagus nerve. Emerging data suggest that a subset of enteroendocrine cells in the intestinal epithelium is synaptically connected to the CNS, which may provide another means for the gut microbiota to impact neurologic function. Although our understanding of this brain-gut axis is still in its infancy, research in this area has elicited tremendous excitement as a tractable approach to potential treatments for these challenging diseases.

Atopic Diseases

The incidence and prevalence of allergic diseases continue to steadily increase, as do more severe clinical presentations. Life-threatening food allergies are now such a public health issue that nut-free classrooms are the norm in many cities. The development of allergic diseases often follows a stereotyped progression that begins with atopic dermatitis (AD) and continues, in order, with food allergy, asthma, and allergic rhinitis. The microbiome has been linked to all of these conditions and has the potential to modulate effects anywhere along this spectrum.

ATOPIC DERMATITIS The skin is the largest organ in the body, and its different anatomic sites (e.g., antecubital fossa, volar forearm, alar crease) represent distinct ecologic niches and harbor unique microbial communities. Moreover, given that the skin serves as a critical interface between the body and the external environment (e.g., microbes), it must be able to respond to unwanted microbes with an adequate immune response. AD is an inflammatory skin disorder involving immune dysfunction and a dysbiotic skin microbiota that is typically marked by greater abundances of Staphylococcus aureus and reduced bacterial diversity. Effective treatment of AD does not require complete elimination of S. aureus but is associated with restoration of the normal level of diversity. It is likely that this increase in bacterial diversity reestablishes normal immune homeostasis in the skin; specific members of the skin microbiota have been shown to induce protective skin-restricted immune responses. Coagulase-negative staphylococci (CoNS; primarily S. epidermidis and S. hominis) obtained from lesional and nonlesional skin of patients with AD were functionally screened and compared to CoNS from healthy controls; AD-lesional CoNS were much less often able to produce antimicrobial peptides (lantibiotics) directed against S. aureus.
To demonstrate that these lantibiotic-producing CoNS were biologically relevant, they were incorporated into a lotion and applied to the arms of patients with AD. Surprisingly, a single application of the probiotic-laced lotion led to a decrease in the abundance of S. aureus recovered; no such decrease was observed when lantibiotic-negative strains were used. The authors of this study did not specifically comment on the clinical improvement of the AD lesions. Nevertheless, this is one of a limited number of studies that are beginning to extend microbiome-related findings into clinical trials.

ASTHMA Asthma is characterized by the clinical triad of airflow obstruction, bronchial hyperresponsiveness, and inflammation in the lower respiratory tract. Although the long-standing dogma was that the lungs are sterile, there is now convincing evidence for a constant ebb and flow of bacteria within the lower airways. In healthy states, the mucociliary escalator continually eliminates these bacteria soon after they land in the airways; in disease states (e.g., cystic fibrosis, chronic obstructive pulmonary disease), these bacteria establish long-term colonization of the airways and influence disease pathogenesis. In asthma specifically, both fecal and airway microbes have been linked to clinical outcomes. Early studies of the microbiome’s influence on asthma used culture-based methods to assess the hypopharyngeal microbiota of asymptomatic 1-month-old infants. Intriguingly, in one study, early-life colonization with Streptococcus pneumoniae, Haemophilus influenzae, Moraxella catarrhalis, or a combination of these organisms (but not S. aureus) was significantly associated with persistent wheeze and asthma at 5 years of age. Eosinophilia and total IgE levels at 4 years of age were also increased in children who were neonatally colonized with these organisms. Although this study examined a focused set of bacteria, it laid the experimental groundwork indicating that early-life microbial exposures influence subsequent development of asthma. A later longitudinal investigation of the fecal microbiota in a general-population birth cohort of >300 children demonstrated that lower abundances of the genera Lachnospira, Veillonella, Faecalibacterium, and Rothia at 3 months of age were associated with an increased risk for development of asthma. The fact that these bacterial changes were no longer apparent when the children were 1 year of age is consistent with the notion that microbial exposures early in life are important to disease pathogenesis later in life.
Transplantation of stool samples from 3-month-old children at risk for asthma into gnotobiotic mice resulted in significant airway inflammation in a murine model of asthma; pre- and postnatal exposure of mice to a four-species cocktail (Faecalibacterium prausnitzii, Veillonella parvula, Rothia mucilaginosa, and Lachnospira multipara) inhibited airway inflammation, with a marked reduction in neutrophil numbers in bronchoalveolar lavage fluid. These data suggest that early-life modulation of the microbiome may be an effective strategy to help prevent asthma, though the specific logistics (e.g., strains, dose, timing of exposure, patient selection) remain to be clarified.

Infectious Diseases

The increased susceptibility of antibiotic-treated mice to infection with a wide range of enteric pathogens was initially observed in the 1950s and led soon thereafter to the concept of colonization resistance, which holds that the normal intestinal microbiota plays a critical role in preventing colonization, and therefore disease production, by invading pathogens. Seminal work in the 1970s demonstrated that this protection is largely reliant on anaerobic gram-positive organisms, and the subsequent half-century has been spent trying to identify the specific microbes involved. Although much of the work relating the microbiota to infection has focused on enteric pathogens, the intestinal microbiota has also been clearly linked to bacterial pneumonia in mouse models, and changes in the microbial composition of the gut have been causally related to changes in the severity of disease. Although this gut-lung axis clearly exists in animals, its relevance in humans is still unclear. Several groups are beginning to study the human lung microbiome in the context of pneumonia and tuberculosis. Moreover, the relationships between the microbiota and both systemic infections (e.g., HIV infection, sepsis) and the response to vaccination are starting to be explored.
ENTERIC INFECTIONS Clostridioides difficile infection (CDI) represents a growing worldwide epidemic and is the leading cause of antibiotic-associated diarrhea (Chap. 139). Roughly 15–30% of patients who are successfully treated for CDI end up with recurrent disease. The strong association between antibiotic exposure and CDI initially raised the idea that the microbiota is inextricably linked to acquisition of disease, presumably because of the loss of colonization resistance. Consistent with the epidemiologic data, characterization of the fecal microbiota of patients with CDI revealed that it is a markedly less diverse, dysbiotic community. FMT using stool from a healthy individual was successfully used in the 1950s to treat four patients with severe CDI and has since been demonstrated in numerous studies to be an effective therapy for recurrent CDI, with clinical cure in 85–90% of patients (as detailed below). Thus, FMT for recurrent CDI has become the “poster child” for the idea that microbiome-based therapies can transform the management of many diseases previously considered to be refractory to medical therapy. Although FMT is agnostic as to the underlying mechanism of protection, work is ongoing to identify specific microbes and host pathways that can protect against CDI. Studying mice with differential susceptibilities to CDI due to antibiotic-induced changes in their microbiota, investigators identified a cocktail of four bacteria (Clostridium scindens, Barnesiella intestihominis, Pseudoflavonifractor capillosus, and Blautia hansenii) that conferred protection against CDI in a mouse model. Intriguingly, treatment of mice with just C. scindens offered significant, though not complete, protection in a bile acid–dependent manner. Clinical data from patients who underwent HSCT also associated C. scindens with protection from CDI, an observation that suggests the possibility of translating these findings from mice to humans. This study provides another example of the identification of relevant bacterial factors through examination of microbial differences in populations that differ in disease risk.

Microbiome-related changes associated with Vibrio cholerae infection include a striking loss of diversity (largely due to V. cholerae becoming the dominant member of the microbiota) and an altered composition that rapidly follows the onset of disease. These changes, which occur in a reproducible and stereotypical manner, are reversible with treatment of the disease. This recovery phase involves a microbial succession that is similar to the assembly and maturation of the microbiota of healthy infants. In addition to V. cholerae, streptococcal and fusobacterial species bloom during the early phases of diarrhea, and the relative abundances of Bacteroides, Prevotella, Ruminococcus/Blautia, and Faecalibacterium species increase during the resolution phase and mark the return to a healthy adult microbiota. Analysis of these microbial changes occurring in patients with cholera and in healthy children led to the selection of 14 bacteria that were transplanted into gnotobiotic mice, which were then challenged with V. cholerae. Bioinformatic analysis of specific taxa changing during cholera determined that Ruminococcus obeum restrained V. cholerae growth. Subsequently, this relationship was experimentally confirmed, and the R. obeum quorum-sensing molecule AI-2 (autoinducer 2) was found to be responsible for restricting V. cholerae colonization via an unclear mechanism. These studies highlight the potential for use of microbiome-based therapies to prevent and/or treat infectious diseases. Moreover, they suggest that temporal analysis of longitudinal microbiome data may be an effective strategy for identifying microbes with causal relationships to disease.

VIRAL INFECTIONS One long-standing maxim for management of infectious diseases is that antibiotics are only to be used for treatment of bacterial infections. Studies using mouse models have demonstrated, however, that a variety of viruses require the bacterial component of the microbiota for pathogenesis. Moreover, antibiotic therapy, which has bacteria-independent effects on the host, leads to reduced disease severity in some animal models of viral infection, though the clinical relevance of this is not yet clear. In addition to being required for some viral infections to proceed, commensal bacteria have also been shown to play a critical role in inducing type I interferons, which represent a potent defense mechanism against many viruses, and in modulating cellular physiology in ways that inhibit viral replication.
HIV INFECTION The augmentation of HIV pathogenesis by some viral, bacterial, and parasitic co-infections suggests that a patient’s underlying microbial environment can influence the severity of HIV disease. Moreover, it has been hypothesized that the intestinal immune system plays a significant role in regulating HIV-induced immune activation; this seems particularly likely since the intestines are an early site for viral replication and exhibit immune defects before peripheral CD4+ T-cell counts decrease. Several studies of HIV-infected individuals have identified substantial differences in the HIV-associated fecal microbiota that correlate with systemic markers of inflammation. Curiously, these microbial changes do not necessarily normalize with antiretroviral therapy; this finding suggests that the microbiota may have some “memory” of the previously high HIV loads and/or that HIV infection helps reset the “normal” microbiota. This memory-like capacity of the microbiota has been demonstrated in animal models in the context of other infections and in response to dieting.

Given that the majority of new HIV transmission events follow heterosexual intercourse, there has been significant interest in examining the relationship between the vaginal microbiota and HIV acquisition. A longitudinal study of South African adolescent girls who underwent high-frequency testing for incident HIV infection facilitated the identification of bacteria that were associated with reduced risk of HIV acquisition (Lactobacillus species other than L. iners) or with enhanced risk (Prevotella melaninogenica, Prevotella bivia, Veillonella montpellierensis, Mycoplasma, and Sneathia sanguinegens). In mice inoculated intravaginally with Lactobacillus crispatus or P. bivia, the latter organism induced a greater number of activated CD4+ T cells in the female genital tract, a result suggesting that the increased risk of HIV acquisition associated with P. bivia may be secondary to the increased presence of target cells. In a separate study, the composition of the vaginal microbiota was shown to modulate the antiviral efficacy of a tenofovir gel microbicide. Although tenofovir reduced HIV acquisition by 61% in women who had a Lactobacillus-dominant vaginal microbiota, it reduced HIV acquisition by only 18% in women whose vaginal microbiota comprised primarily Gardnerella vaginalis and other anaerobes. This difference in efficacy was due to the ability of G. vaginalis to metabolize tenofovir faster than the target cells can take up the drug and convert it into its active form, tenofovir diphosphate. These findings illustrate how microbial ecology can be an important consideration in choosing effective treatment regimens.

RESPONSE TO VACCINATION Second only to the provision of clean water, vaccination has been the most effective public health intervention in the prevention of serious infectious diseases. Its effects are mediated by antigen-specific antibodies and, in some cases, effector T-cell responses.
Although vaccines are clearly effective on a population scale, the magnitude of the immune response to vaccines can vary among individuals up to a hundredfold. Although many factors (e.g., genetics, maternal antibody levels, prior antigen exposures) can affect vaccine immunogenicity, the microbiota is now recognized as another important factor. Several cohort studies have associated differences in the fecal microbiota with altered vaccine responses, and the nasal microbiota is thought to contribute to the IgA response to live, attenuated influenza vaccines. These correlations based on clinical data have been partially confirmed in animal studies. The best example is the demonstration that the responses to nonadjuvanted viral subunit vaccines (inactivated influenza and polio vaccines) are reliant on the microbiota, whereas the responses to live or adjuvanted vaccines (live attenuated yellow fever, Tdap/alum, an HIV envelope protein/alum vaccine) are not. A causal role for the microbiome influencing vaccine-induced immunity in humans was demonstrated by comparing microneutralization titers following the inactivated influenza vaccine in individuals treated with or without antibiotics, although an antibiotic-dependent effect was only present in subjects who had low levels of preexisting immunity to influenza. These data suggest that the microbiota may serve as an adjuvant for certain vaccine types and in naïve populations. Incorporation of specific commensal bacteria and/or their products that improve vaccine responses into vaccine formulations may increase overall vaccine efficacy.

MECHANISMS OF MICROBIOME-MEDIATED EFFECTS

As highlighted in the examples above, numerous associations have been made between the microbiome and various disease states. These correlations have often been established at broad taxonomic levels, with little or no insight into causality.
Given that most clinical studies of these relationships have a fairly small sample size (often <100) and are simultaneously comparing numerous variables (i.e., each of the bacterial species in the microbiota is effectively a different feature being compared), many of these studies may not be adequately powered and therefore may yield false-positive results. Testing of these correlations in animal models of disease has been critical in demonstrating a causal relationship between microbes and specific phenotypes. Because microbiome-wide association studies typically result in a long list of bacterial taxa that are correlated with a disease, it has been challenging to know which organism to test further in mechanistic studies. Moreover, even if a specific bacterial species is identified in these analyses, there is potentially enough strain-to-strain variation that the “functional” isolate may need to be recovered from the individuals studied; a publicly available representative of the species may not confer the same phenotype. Despite all these difficulties, a handful of specific microbes have now been linked to disease effects; some examples have been mentioned above. The next layer of challenges relates to identification of the specific mechanisms that underlie these causal relationships. Although the microbiota modulates most facets of host physiology, its impact on the immune system is the best-studied mechanism and helps explain its role in many diseases, particularly those that stem from misdirected immune responses.
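The concern raised above, that testing each bacterial taxon as a separate feature in a small cohort can yield false-positive associations, is the classic multiple-comparisons problem. The sketch below shows one widely used remedy, the Benjamini-Hochberg procedure for controlling the false discovery rate. It is offered as an illustration only: the text does not state which correction, if any, the cited studies applied, and the p-values in the example are made up.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking which hypotheses are rejected.

    Implements the Benjamini-Hochberg step-up rule: find the largest
    rank k (1-based, p-values sorted ascending) with p_(k) <= k/m * alpha,
    then reject the k hypotheses with the smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank

    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


# Hypothetical per-taxon p-values from a small case-control comparison.
p_by_taxon = {"taxon_A": 0.001, "taxon_B": 0.04, "taxon_C": 0.03, "taxon_D": 0.20}
flags = benjamini_hochberg(list(p_by_taxon.values()))
print(dict(zip(p_by_taxon, flags)))
```

In this made-up example, three taxa would pass an uncorrected p < 0.05 cutoff, but only taxon_A survives correction. Real microbiome data sets compare hundreds of taxa, so the gap between uncorrected and corrected results is typically much larger.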

REGULATION OF THE IMMUNE SYSTEM

The microbiota is required for the proper development, education, and maintenance of the immune system, a finding underscored by the fact that GF animals have an immature and underdeveloped immune system. Moreover, given that the microbiota has co-evolved with its host, a host-specific microbiota is critical for normal maturation of the immune system: gnotobiotic mice colonized with the microbiota from healthy humans have a small-intestinal immune system indistinguishable from that of GF mice. This impact on immune ontogeny begins in early life, with maternal transfer of microbially targeted antibodies and microbe-derived metabolites augmenting neonatal immune development. Some of these early-life microbial exposures must occur during a time-sensitive window, after which subsequent exposures fail to redress the initial deficiency. Examples of these “original sins” are limited in number, but they can have long-lasting physiologic consequences that extend into adult life. In contrast, most host–microbe–immune interactions occur on an ongoing basis throughout life, with microbial perturbations (e.g., antibiotic use, changes in diet) disrupting this homeostatic immunity and potentially altering disease susceptibility.

The microbiota impacts virtually all aspects of host immunity, including its different arms (i.e., innate, adaptive), its varied anatomic niches (e.g., intestinal, skin, lung, bone marrow, CNS), and its overall immunologic tone and responsiveness. Not only does the microbiota influence the development and education of immune cells, but it also plays a critical role in modulating epithelial cell responses that contribute to immune defenses and disease pathogenesis. While the microbiota as a whole is known to drive these varied responses, not all microbes have the same immunomodulatory effects.
Indeed, a broad screen of >50 taxonomically diverse commensal bacteria in GF mice demonstrated that most have the capacity to modulate the immune system, with very few bacterial taxa being immunologically quiescent; however, the immunomodulatory effects were often not detected when the same bacterium was administered to a mouse with a normal microbiota, which highlights the functional redundancy within the microbiota. Interestingly, bacterial taxonomy did not correlate with effects on the immune system, a finding that suggests the bioactive molecules may be unique rather than evolutionarily conserved. Given that all the tested bacteria express various canonical ligands (e.g., LPS, peptidoglycan, flagellin) for pattern recognition receptors, such as Tolllike receptors, commensal bacteria either modulate the immune system via a different class of products or their “canonical ligands” have unique structural motifs that trigger distinct signaling pathways—or combina tion of pathways—that result in education of the immune system. Efforts are ongoing to define cognate relationships between spe cific commensal bacteria and their immunomodulatory effects, and approaches are being developed to define specific bacterial factors that are responsible for the phenotypic changes. Complicating factors are that many organisms, particularly those in the phylum Firmicutes, are not readily genetically tractable and that many of the phenotypes are not easy to assess with high-throughput screening. The use of mass spectrometry to detect and profile tens of thousands of different

metabolites present in different bodily fluids has offered the promise of deeper insight into microbially mediated processes that underlie dis ease susceptibility. However, the fact that the overwhelming majority of these metabolites are not annotated, coupled with the sheer volume of data generated, has so far limited the general utility of these untargeted approaches. The few immunomodulatory bacteria and their bioactive molecules that have been identified serve as useful archetypes for how the microbiome influences the immune system and, more generally, host physiology. These commensal-derived products can generally be categorized as endobiotic microbial structures, modified dietary nutri ents, and modified host-derived metabolites. Endobiotic Microbial Immunomodulatory Molecules B. fragilis polysaccharide A (PSA) is perhaps the best-studied commensal-derived molecule that has been demonstrated to influence disease outcomes in mouse models. PSA—one of at least eight capsular polysaccharides expressed by B. fragilis—has a unique zwitterionic structure that incor porates both a positive and a negative charge within each repeating unit. Studies in which mice have been treated either with isogenic strains of

B. fragilis that differ in PSA expression or with purified PSA have shown that PSA confers protection—prophylactically and therapeutically— against experimental colitis and MS. PSA is recognized by Toll-like recep tor 2 on antigen-presenting cells, particularly plasmacytoid dendritic cells, and—in the setting of inflammation—induces interleukin 10 (IL-10)–

producing regulatory T cells (Tregs) that help restrain inflammation. B. fragilis is also the source of an immunomodulatory glycosphin golipid that, if present during neonatal life, decreases the number of colonic invariant natural killer T (iNKT) cells and improves outcomes in a model of colitis in adulthood. It is not clear whether these glyco sphingolipids activate or inhibit iNKT cells; results have been discor dant, probably because different glycosphingolipid species have been tested. A chemical synthesis approach confirmed that B. fragilis glyco sphingolipids have distinct immunomodulatory functions depending on their specific structure. There are an increasing number of commensal-derived polysac charides and other large molecules that have been shown to modulate the immune system and/or disease outcomes. Advances in bacterial genetics have facilitated the identification of structural features that contribute to some of these host–microbiota relationships. It is likely that our general understanding of structure–function relationships of commensal-derived products will continue to grow in the coming years, mirroring what has occurred in microbial pathogenesis studies over the past several decades. Modified Dietary Nutrients As described above, the human diet provides nutrients for the gut microbiota, which can metabolize them into new, bacteria-derived compounds. Perhaps the best example of this is fermentation of undigested dietary fibers into short-chain fatty acids (SCFAs). Several groups have demonstrated that SCFAs, the intestinal levels of which are largely determined by bacterial metabolism, are important for the induction of Tregs, though there is not agreement on which specific SCFA (propionate, acetate, or butyrate) is most relevant. Wild-type mice colonized with bacteria known to induce colonic Tregs have elevated cecal levels of SCFAs. Colonization with any of three Bacteroides species (B. caccae, B. massiliensis, and B. 
thetaiotaomicron) increases levels of acetate and propionate, whereas colonization with Parabacteroides distasonis or a mix of 17 human-derived Clostridium species elevates levels of all three SCFAs. In all of these cases, though, the SCFAs inhibit histone deacetylase, with a consequent increase in Foxp3 expression. Notably, microbe-induced SCFA production has not been shown to be critical for Treg induction by any of these organisms. Moreover, there appears to be no correlation between SCFA levels and Treg numbers in mice monocolonized with various Treg-inducing bacterial species. Taken together, these data suggest important heterogeneity in the mechanisms underlying Treg development and do not rule out the possibility of other, redundant mechanisms for Treg induction. In addition to their effects on Tregs, SCFAs promote epithelial barrier integrity, affect cell proliferation (the direction of the effect depends on the specific cell type and SCFA), regulate host metabolism, and provide an energy source to colonocytes.

Although SCFAs represent the best-studied molecules that the microbiota generate from diet, there are many other physiologically important examples. The microbiota metabolizes tryptophan into various products (e.g., kynurenine, indole, and its derivatives) that influence immune function, metabolic diseases, viral infections, and neuronal function, among other things. Desaminotyrosine produced by Clostridium orbiscindens confers protection from influenza by inducing type I interferon activity. Modification of unsaturated fatty acids (e.g., linoleic acid) into different isomers regulates specific T-cell subsets embedded in the small-intestinal epithelium. These examples represent an important proof of concept that diet plays a key role in the functional output of the microbiota, not just its composition.

Modified Host-Derived Molecules

Bile acids are produced in the liver but then are metabolized by intestinal bacteria to form deconjugated and secondary bile acids. These microbially produced bile acid profiles act through complex signaling pathways to balance the metabolism of lipids and carbohydrates and to affect immune responses. Therefore, bile acids are now being investigated as microbial metabolites that are critical to maintaining human health. As mentioned above, C. scindens helps protect mice against CDI through a bile acid–dependent process. Alterations in bile acid profiles due to underlying microbial dysbiosis have also been associated with hepatic and colonic inflammation, hepatocellular carcinoma, colorectal cancer, and impaired gut motility. Almost all of these relationships have been documented at the level of correlation and, at best, reflect a partial change in phenotype in the setting of bile acid sequestrants (e.g., cholestyramine).
Work is ongoing to determine causal relationships between bacterial metabolism of bile acids and changes in host physiology, though the most definitive evidence is that microbe-produced bile acid metabolites influence colonic Treg homeostasis.

In addition to bile acids, the gut microbiota can metabolize many other host-derived molecules, thereby regulating their levels and downstream effects. Taurine enhances NLRP6 inflammasome–induced colonic IL-18 secretion, while histamine, spermine, and putrescine suppress IL-18 secretion; the levels of all of these host-derived metabolites can be regulated by the microbiota. Inosine (the deamination product of adenosine) produced by Bifidobacterium pseudolongum enhances the efficacy of immune checkpoint inhibitors in mouse models. These examples represent only the tip of the iceberg; many more bacterial metabolites will undoubtedly be linked to health and disease given the thousands of different bacterial metabolites present throughout the body. However, the clinical relevance of these bacterial metabolites remains unknown.

MOVING MICROBIOME SCIENCE FROM BENCH TO BEDSIDE

The numerous microbiome–disease associations identified thus far have generated a great deal of hope that understanding the relevant microbe–host interactions will open the door to unlimited therapeutic applications. Microbiome-based therapies offer several potential benefits. Patients often view such treatment as more “natural” than conventional drug therapy and are therefore more likely to comply with it. Biologically, microbiome-based therapies are more likely to address one of the root causes of disease (microbial dysbiosis) rather than simply affecting the downstream sequelae. Finally, a given microbiome-based therapy may serve as a “polypill” that is effective against several different diseases stemming from similar microbial changes.
Despite tremendous interest in therapeutically exploiting the microbiome, there have thus far been few clinical successes along these lines. The most successful therapeutic application of microbiome science has been the use of FMT, particularly for CDI. As mentioned earlier, FMT involves “transplanting” stool from one individual to a diseased patient, with the idea that the donor microbiota will correct whatever derangement may exist in the ill patient and therefore will alleviate symptoms. Fundamentally, this notion is agnostic as to the specific microbial dysbiosis and holds that any “healthy” microbiota will be curative, though some are now using donor stool from patients with a desired phenotype rather than any healthy individual. The idea of FMT dates to at least the fourth century, when traditional Chinese doctors used a “yellow soup” (fresh human fecal suspension) to successfully treat food poisoning and severe diarrhea. The continued use of FMT through the centuries for the treatment of diarrheal illnesses in both humans and animals, along with the growing appreciation of the importance of the microbiota, laid the groundwork for using FMT to treat CDI. Since the first major prospective trial assessing FMT for recurrent CDI in 2013, most of the numerous studies of FMT for CDI have demonstrated remarkable efficacy, with an average clinical cure rate of ~85%. The donor stool can be fresh or frozen (use of the latter allows biobanking of samples from a limited number of prescreened donors) and can be administered via nasogastric tube, nasoduodenal tube, colonoscopy, enema, or oral capsules; the cure rate is slightly higher with lower-gastrointestinal administration than with upper-gastrointestinal treatment. The optimal screening, preparation, and concentration of infused donor stool have not yet been determined, and there have been cases of antimicrobial-resistant pathogens transmitted by FMT that have led to mortality. The most common adverse effects of FMT include altered gastrointestinal motility (with constipation or diarrhea), abdominal cramps, and bloating, all of which are generally transient and resolve within 48 h. Although controlled studies of the use of FMT in immunosuppressed patients do not yet exist, meta-analyses of case reports and case series have found no serious FMT-related adverse events in >300 immunocompromised patients.

The successful use and the favorable short-term safety profile of FMT for CDI have led to its expanded application for other indications. As of July 2024, >500 trials (listed at ClinicalTrials.gov) were investigating the efficacy of FMT for a range of indications, including CDI, IBD (ulcerative colitis and Crohn’s disease), obesity, eradication of multidrug-resistant organisms, anxiety and depression, cirrhosis, and type 2 diabetes. The few published studies regarding indications other than CDI have generally included small sample sizes and have offered mixed results. In contrast to the successes in CDI, the results have been more varied for patients with IBD, which is perhaps the second best-studied indication. It is not clear whether these discrepancies are due to heterogeneity in recipients (e.g., in terms of underlying disease mechanisms or endogenous microbiotas), the donor material, and/or the logistical details of FMT administration (e.g., route, frequency, dose). However, these results demonstrate that, under the right circumstances, modulation of the microbiota can be an effective therapy for IBD.

Although FMT offers an important proof of concept that microbiome-based therapies can be effective, treatment is difficult to standardize across large populations because of variability among stool donors and among the endogenous microbiotas of recipients. In addition, FMT is fraught with safety concerns, and its mechanisms of action are unclear. That said, there are now two microbiome-derived therapies conceptually analogous to FMT that are approved by the U.S. Food and Drug Administration (FDA) for treatment of recurrent CDI. FMT likely represents the first generation of microbiome-based therapies; subsequent generations will include the use of more refined bacterial cocktails, single strains of bacteria, or bacterial products and/or metabolites as the therapeutic intervention.
The field of probiotics has a complicated history: many different strains have been tested against a multitude of diseases. Several meta-analyses have combined results across bacterial strains and/or disease indications and have generally concluded that the data are not yet convincing enough to support the use of the tested regimens. It should be noted that the tested organisms have generally been chosen based on their presumed safety profile rather than in light of a plausible biologic link to disease. The hope is that more focused, mechanistic microbiome studies will identify specific commensal organisms—and their underlying mechanisms of action—that are involved in disease pathogenesis and that will serve as the basis for the next wave of rationally chosen probiotics, a few of which are currently in clinical trials. The main hurdle in this endeavor has been identifying specific microbes that are causally related to protection from disease. Future therapeutic strategies might include administering a beneficial microbe/microbial product; targeting a deleterious microbe/microbial pathway; or modulating the microbial ecology, potentially by impacting keystone species.

PERSPECTIVE

The medical view of microbes has changed radically, moving from the early-twentieth-century notion that we are engaged in a constant struggle with microbes (an “us-versus-them” mentality that focused on the necessity of eradicating bacteria) to the current understanding that we live in a carefully negotiated state of détente with our commensal organisms. Instead of holding a simple view of microbes as enemies to be eliminated with antibiotics, scientists are increasingly recognizing the critical role these organisms play in maintaining human health; loss of these host–microbe interactions in the increasingly sterile environment typical of Western civilization may have predisposed to the increased incidence of autoimmune and inflammatory diseases. The field of microbiome research has made great strides over the past decade in cataloguing the normal microbiota and is now beginning to identify clinically actionable microbe–host relationships.

The explosion of “-omics” technologies (e.g., metagenomics, metatranscriptomics, metabolomics) has enabled the generation of vast amounts of data, but it is not yet clear how best to integrate data sets in order to gain useful insights into host–microbe relationships. The use of FMT has demonstrated that modulation of an individual’s microbiota can effectively treat certain diseases; however, models with which to predict specifically how a microbiota will change after modulation, and what potentially untoward effects these changes might have, are still lacking. Implicit in this limitation is our ignorance about what microbial configuration is optimal and how a given microbiota should be rationally altered to obtain an ideal outcome.

Despite initial hype and a few false starts, microbiome research now stands poised to address the fundamental basis of many diseases. As the field continues to mature, it will need to move beyond correlations and address causation. The identification of causal microbes and their mechanisms of action will create a “microbial toolbox” from which relevant bioactive strains can be chosen on a per-patient basis to correct specific underlying microbial dysbioses. In the near future, our knowledge base regarding the microbiome and its relationship to health and disease will be robust enough that this information can be applied in making important treatment decisions.