SECTION 3 Cell biology

3.1 The cell

ESSENTIALS

The cell is a dynamic entity. Cells are not simply building blocks that are linked together to create an organism: each cell comprises a dynamic network of interacting macromolecules. Just how dynamic has been brought home by recent advances in cell imaging technologies. A host of multisubunit molecular structures must assemble and disassemble in a highly coordinated, exquisitely regulated, and beautifully choreographed manner to ensure the integrity of the cell and provide its ability to function correctly as a single unit within a large multicellular organism.

Introduction

The cell is the fundamental unit of all forms of independent life on this planet, from the simplest single-celled prokaryote to the most complex multicellular eukaryote. A limiting membrane, the plasma membrane, encloses the contents of the cell and allows a host of enzymatic reactions and intermolecular interactions to occur within a confined, and regulated, environment. This raises the question, ‘What is the limiting membrane composed of and what are the contents of the cell?’

The major component of the limiting membrane is a lipid bilayer, and the major components of the lipid bilayer are amphipathic phospholipids. Amphipathic molecules have one part that is water soluble (hydrophilic) and one part that is water insoluble (hydrophobic). A property of amphipathic molecules is that, in an aqueous environment, they spontaneously organize themselves so that the hydrophobic regions face one another (shielding them from the surrounding water) leaving the hydrophilic regions (often referred to as ‘head groups’) exposed. In fact, an appropriate mixture of phospholipids in water will lead to the spontaneous formation of lipid vesicles, with water on the inside and water on the outside (Fig. 3.1.1).
This gives us the basic template for the limiting membrane of the cell, the contents of which comprise a vast range of biomolecules, from simple building blocks to large macromolecular complexes.

Prokaryotes compared with eukaryotes

Before going further, it is worthwhile clarifying the difference between prokaryotes and eukaryotes. The defining difference is that prokaryotes have no nucleus whereas eukaryotes do. Prokaryotes can be divided into two major divisions, or domains, the eubacteria and the archaebacteria, which appear to have diverged from a common ancestor at about the same time that ancestral eukaryotes evolved as the third domain of life on earth.

All three domains use DNA as their hereditary material and information store. In prokaryotes this DNA resides within the cell alongside all the other cellular contents; in eukaryotes it is contained within the nucleus and thereby physically separated from the bulk of the contents of the cell. Just as the cell itself is a membrane-bound structure, so is the nucleus within each eukaryotic cell. In fact, the membrane that surrounds the nucleus is a double lipid bilayer and is referred to as the nuclear envelope.

3.1 The cell
George Banting and Jean Paul Luzio

Fig. 3.1.1 Phospholipids spontaneously self-assemble in aqueous environments to form lipid vesicles. [Figure labels: phospholipid; hydrophilic headgroup; hydrophobic ‘tails’; lipid bilayer; lipid vesicle.]

The nucleus is just one of several membrane-bound structures within eukaryotic cells. The larger of these structures are referred to as organelles and they serve to compartmentalize the cell, allowing specific processes and reactions to occur within defined and controlled local environments. Smaller membrane-bound compartments are referred to as vesicles and tubules; these are generally involved in transport between organelles or between organelles and the plasma membrane.

The contents of a eukaryotic cell are referred to as the cytoplasm; the aqueous part of the cytoplasm outside membrane-bound compartments is referred to as the cytosol; and the inside of an organelle, vesicle, or tubule is termed the lumen of that compartment.

Transcription, translation, and macromolecular crowding

In all cells, DNA is transcribed into messenger RNA (mRNA) by the enzyme RNA polymerase; the mRNA is, in turn, translated into protein by ribosomes. In prokaryotes this all occurs within the same space, since there are essentially no intracellular membrane-bound compartments and transcription and translation can be coupled (i.e. an mRNA can be translated while it is still being transcribed). However, things are different in eukaryotes, since (in general) the transcribed mRNA leaves the nucleus (via protein-lined channels, nuclear pores, in the nuclear envelope) before being translated in the cytosol.

It has been estimated that, on average, each mammalian cell contains about 10^10 protein molecules of about 10 000 to 20 000 different kinds. On top of this there are multiple copies of a range of other large macromolecules, notably nucleic acids and complex sugars. All of this is contained within a cell that is around 20 µm in diameter (although there is massive variation here).
It is, therefore, hardly surprising that there is a very high total concentration of macromolecules within mammalian cells (estimated to be up to 400 g/litre), meaning that anything up to 40% of the cell volume is physically occupied by these molecules. Thus, while the inside of the cell is clearly an aqueous environment, it is a very crowded and highly ordered aqueous environment, and probably has a consistency rather like that of thick soup or porridge.

Lipid bilayers and integral membrane proteins

A lipid bilayer, be it the plasma membrane at the cell surface or the defining membrane of an intracellular organelle, is, among other things, a permeability barrier. It is permeable to small lipophilic molecules, partially permeable to water, but impermeable to ions and large molecules. It therefore not only retains the contents of the cell or organelle, but also provides a physical barrier between the contents of the cell or organelle and the exterior environment.

A cell, however, clearly has to interact with its exterior environment, whether it be a single-celled prokaryote (and most living organisms are single cells) or a complex multicellular organism such as a human being with an estimated 10^13 cells assembled in such a way, and communicating with one another, so that they create a living being that is far more than the sum of the individual parts.

This interaction with the exterior environment of the cell is dependent upon proteins that reside in the lipid bilayer. Many of these proteins actually span the lipid bilayer, with part of the protein residing outside the cell, part within the cell, and part within the hydrophobic core of the lipid bilayer. In the case of eukaryotic cells, the vast majority of these integral membrane proteins have been post-translationally modified by the addition of specific sugar residues to create glycoproteins.
In many cases the correct sugar modifications are critical for the correct function of the glycoprotein, particularly for those glycoproteins that are involved in cell–cell or cell–substrate interactions (NB: aberrant glycosylation is frequently observed on glycoproteins at the surface of cells present in tumours).

Membrane proteins perform a multitude of roles, just about all of which can be considered to be involved in some way in communication between the inside and the outside of the cell. They act as transporters of ions, sugars, amino acids, peptides, hormones, and other molecules; they act as receptors for extracellular ligands such as hormones and neurotransmitters, and transmit signals to the inside of the cell; they act to link cells to one another or to the underlying substrate; in short, they are the physical link between the inside of the cell and the outside world. It is not surprising, therefore, that it has been estimated that about a third of all the proteins encoded by the human genome are membrane proteins.

Organelles

The main organelles within eukaryotic cells are the nucleus, the endoplasmic reticulum (ER), the Golgi apparatus, mitochondria, lysosomes, endosomes, and peroxisomes (and chloroplasts in plants), with a range of specialized organelles occurring in different cells of higher eukaryotes (see Fig. 3.1.2).

While there remains much speculation about the origin and evolution of intracellular organelles, there is little doubt that mitochondria are derived from an extinct α-proteobacterium that lived inside another prokaryotic cell (a process called endosymbiosis). One hypothesis proposes that the modern eukaryotic cell derives from one ancestral prokaryote phagocytosing another, which evolved into the modern mitochondrion (a second phagocytic event could account for the origin of chloroplasts in plant cells). Other organelles could result from infolding and vesiculation of the prokaryotic plasma membrane.
An alternative hypothesis is that membrane-bound blebs were extruded by the ancestral prokaryotic cell, which is homologous to the modern-day nucleus. These blebs are proposed to have expanded around the proto-mitochondria, generated the organelles of the secretory system and fused to form the plasma membrane.

Nucleus

The nucleus can be considered to be at the heart of the eukaryotic cell. It harbours the vast majority of DNA within the cell (i.e. the nuclear genome). It is also, among other things, the site of gene transcription, the site of mRNA processing, and the site of ribosome assembly. The control of gene expression, which often occurs at the level of transcriptional regulation, is fundamental to the regulation of cell function and involves a complex interplay between the genomic DNA in the nucleus and a host of cellular proteins. Many of these proteins shuttle, in a controlled manner and in response to specific signals, between the nucleus and the cytosol. They are able to do so because there are gaps, termed nuclear pores, in the membrane that envelops the nucleus. As mentioned earlier, the nuclear membrane is a double lipid bilayer. Thus, there is an inner nuclear membrane in contact with the contents of the nucleus and an outer lipid bilayer in contact with the cytosol. The space between the two

lipid bilayers is the lumen of the nuclear membrane and is a space that is contiguous with the lumen of the endoplasmic reticulum. Nuclear pores are complex multiprotein assemblies that allow certain proteins to pass in and out of the nucleus while excluding others. Fully processed mRNA also leaves the nucleus via nuclear pores before being translated in the cytosol by ribosomes. The dynamic traffic within, and in and out of, the nucleus is imperative for cell function.

Mitochondria

The mitochondria are also enveloped in two layers of membrane, an inner one and an outer one. They are involved in the oxidation of molecular fuels, including pyruvate derived from sugars and fatty acids, to generate the adenosine triphosphate (ATP) that is needed as an energy source for the reactions of the cell. Mitochondria, along with chloroplasts in plant cells, are enclosed in a double-layered membrane and are the only non-nuclear organelles to contain DNA, reflecting their endosymbiotic origin. The mitochondrial outer membrane is freely permeable to ions and small molecules but the inner membrane acts as a diffusion barrier, only allowing the transport of selected ions and metabolites by means of transport proteins. The generation of ATP by mitochondria is fundamental to eukaryotic cell function and it is the inner membrane that contains the mitochondrial ATP synthase as well as the machinery for electron transport (the respiratory chain) that enables oxygen consumption. The experiments that led to an understanding of the chemiosmotic process that couples oxidation energy to ATP production constitute one of the great achievements of late 20th-century biochemistry. The ATP synthase is a remarkable nanomachine, comprising over 20 individual proteins, that functions as a proton-driven turbine to produce ATP by a process called rotary catalysis.
The rotor stalk in a single ATP synthase can spin at 8000 revolutions per minute, generating 400 ATP molecules per second.

Mitochondrial disorders are an important cause of neurological diseases and disorders of muscle, including cardiac muscle. Mitochondrial abnormalities may be inherited or acquired. Those specifically affecting the intrinsic genome of this organelle are transmitted in a matrilinear fashion. Descriptions of important mitochondrial disorders are to be found in Chapter 24.19.5.

Peroxisomes

Peroxisomes are membrane-bound organelles that contain high concentrations of oxidative enzymes such as catalase and are therefore important for a range of oxidative processes necessary for the elimination of multiple substances (e.g. in the breakdown of very long chain fatty acids). Several diseases caused by defects in peroxisomal proteins are described in Chapter 12.9.

Endoplasmic reticulum (ER)

The membranes of the ER are contiguous with those of the nuclear membrane, thus the lumen of the ER is contiguous with the space between the two membranes of the nuclear envelope. The lumen of the ER serves as a calcium store for the cell, with concentrations of calcium in the ER lumen being around 10^-3 M, compared with 10^-8 M to 10^-6 M in the cytosol. Regulated release of calcium from the ER, in response to extracellular signals detected by membrane proteins at the cell surface and transmitted via specific intracellular second-messenger molecules, leads to changes in the activity of a host of cellular processes. This is because many intracellular proteins bind calcium, and their activities and/or interactions with other proteins are dependent upon whether or not they are calcium bound. In fact, the calcium concentration in the cytosol has to be very carefully controlled because it is an important regulator of many intracellular processes including muscle contraction and secretion. Excess calcium within the cytosol can rapidly lead to cell death.
There is, therefore, a complex array of transporters operating to ensure that calcium levels remain high in the lumen of the ER and are only transiently raised in the cytosol in response to specific stimuli. Some of these transporters are in the ER membrane, others in the plasma membrane, others in the mitochondria.

Fig. 3.1.2 Cartoon showing some of the main components of a higher eukaryotic cell. [Figure labels: microtubules; actin; intermediate filaments; endoplasmic reticulum; Golgi apparatus; trans Golgi network (TGN); secretory vesicle; late endosome; recycling endosome; lysosome; nucleus; endocytic vesicle; mitochondrion; peroxisome.]
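To put the ER calcium figures quoted above in perspective, the short Python sketch below computes the fold difference between the ER lumen (around 10^-3 M) and the cytosol (10^-8 M to 10^-6 M). The concentrations are those given in the text; the function and variable names are illustrative assumptions, and the calculation is simple arithmetic rather than a model of calcium handling.

```python
import math

# Calcium concentrations quoted in the text (mol/litre).
ER_LUMEN_CA = 1e-3
CYTOSOL_CA_LOW = 1e-8
CYTOSOL_CA_HIGH = 1e-6


def fold_gradient(high: float, low: float) -> float:
    """Return how many times more concentrated `high` is than `low`."""
    if high <= 0 or low <= 0:
        raise ValueError("concentrations must be positive")
    return high / low


def orders_of_magnitude(high: float, low: float) -> float:
    """Return the same gradient expressed as a base-10 logarithm."""
    return math.log10(fold_gradient(high, low))


if __name__ == "__main__":
    # Compare the ER lumen with both ends of the quoted cytosolic range.
    for label, cytosol in (("lowest cytosolic", CYTOSOL_CA_LOW),
                           ("highest cytosolic", CYTOSOL_CA_HIGH)):
        fold = fold_gradient(ER_LUMEN_CA, cytosol)
        print(f"ER lumen vs {label} level: about {fold:,.0f}-fold")
```

On these figures the ER lumen holds calcium at roughly a thousand to a hundred thousand times the cytosolic concentration, which gives a sense of the gradient that the transporters must maintain.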

The ER is not, however, simply a calcium store. It is also a major site of lipid biosynthesis within the cell. It is also the site of synthesis for proteins that are destined to be secreted from the cell or to be membrane proteins. Such proteins are synthesized by ribosomes that become attached to the cytosolic face of the ER membrane soon after they have started to translate an mRNA encoding a secretory or integral membrane protein. Before folding up as a secretory protein in the lumen of the ER, or passing laterally from the translocation channel into the ER membrane if it is destined to be an integral membrane protein, the nascent protein is translocated through a protein-lined channel in the ER membrane. This channel opens only when a ribosome synthesizing a secretory or integral membrane protein is bound to its cytosolic face. The ER is thus the start of the secretory pathway in eukaryotic cells. The fact that the translocation channel opens only when a ribosome is bound to it preserves the integrity of the ER membrane as a permeability barrier and ensures no leakage of ions, such as calcium, from the ER lumen.

Some of the most abundant proteins in eukaryotic cells reside in the lumen of the ER; these are proteins that are involved in assisting the correct folding (see Box 3.1.1) of newly synthesized proteins in the secretory pathway and include proteins such as the enzyme protein disulphide isomerase (which ensures that the correct disulphide bonds are formed in proteins), calnexin, and calreticulin. The latter two are also calcium-binding proteins.
Thus, although most proteins that enter the secretory pathway at the ER are destined to be secreted or to become integral membrane proteins in the plasma membrane, certain proteins (both soluble proteins within the lumen of organelles along the secretory pathway and integral membrane proteins within the membranes of organelles along the secretory pathway) are primarily localized to specific compartments along the secretory pathway.

It has become increasingly clear over recent years that a combination of retention and retrieval signals (often short linear sequences of amino acids) within proteins serves to ensure these localizations, with retention signals serving to hold proteins in place and retrieval signals operating to bring proteins back to their steady-state localization from a point further along the secretory pathway.

Most diagrams of eukaryotic cells in textbooks, including Fig. 3.1.2 here, show the ER as a membranous organelle linked to the nucleus; this is correct, but it fails to illustrate the extent of the ER since, in most cells, it pervades much of the extranuclear space of the cell and is a highly dynamic organelle. More than half of the total membrane area of a mammalian cell can be ER.

Golgi apparatus

The step beyond the ER in the secretory pathway is the Golgi apparatus. The Golgi apparatus has been likened to a small stack of pitta bread, with each pitta corresponding to a cisterna (segment) of the Golgi. Traffic through the secretory pathway, from the ER to the Golgi and beyond, is mediated by shuttling transport vesicles, which transfer cargo molecules between organelles. The process is called vesicular transport, whereby vesicles bud from a donor compartment in a process that enables protein sorting to allow selective incorporation of soluble and integral membrane protein cargo into the forming vesicles, while leaving organelle-resident proteins behind.
The vesicles are subsequently targeted to a specific acceptor compartment into which they unload their cargo upon fusion of their limiting membranes. Thus, in the transport step from the ER, vesicles are delivered to the cis face of the Golgi apparatus. The recruitment of specific cargo into the vesicles, the budding of the vesicles, and their fusion with the Golgi are all steps that involve discrete, and transient, assemblies of proteins.

The different cisternae of the Golgi apparatus (known as cis, medial, and trans, although there may well be many more than three in certain cell types) are the next steps along the secretory pathway. It now appears that passage through this part of the pathway can be quite complex, with both forward (anterograde) and backward (retrograde) vesicular traffic occurring. The anterograde traffic moves cargo towards the trans side of the Golgi apparatus and the retrograde traffic retrieves material that is required earlier in the secretory pathway (i.e. in the medial or cis cisternae of the Golgi apparatus or in the ER).

Vesicular traffic within the Golgi apparatus also seems to be complemented by a process that has been termed cisternal maturation. This describes the maturation of a cis cisterna into a medial cisterna by the vesicular retrieval of material that should not be present in a medial cisterna. The retrieved vesicles fuse with newly arrived vesicles from the ER to form a new cis cisterna; meanwhile the medial cisterna matures into a trans cisterna via the same process.

As all of this is happening, the proteins that are passing along the secretory pathway are being sequentially post-translationally modified, primarily by the addition of a series of sugar residues to generate glycoproteins, by specific enzymes with discrete steady-state localizations maintained by retention and retrieval signals, within specific cisternae of the Golgi apparatus.
Beyond the trans cisterna of the Golgi apparatus lies the trans Golgi network (TGN) from which a range of vesicles and tubules bud to deliver their cargo to its destination. This may be the cell surface, for proteins that are to be secreted or to become integral membrane proteins in the plasma membrane, but may also be an intracellular organelle. Thus, for example, lysosomal enzymes have to be delivered to the lysosome and this is done via the secretory pathway.

Box 3.1.1 Proteostasis
• Ensuring that proteins are correctly folded is part of the maintenance of proteome homeostasis (known as proteostasis), something which is crucial for cellular and organismal health.
• Proteostasis is achieved by an integrated network of several hundred proteins (maybe even a couple of thousand), including most prominently (1) molecular chaperones and their regulators, which assist in de novo folding or refolding, and (2) the ubiquitin-proteasome system (UPS) and autophagy system, which mediate the timely removal of irreversibly misfolded and aggregated proteins.
• Deficiencies in proteostasis have been linked to the progression of numerous diseases, such as neurodegeneration and dementia, type 2 diabetes, amyloidosis, lysosomal storage disease, cystic fibrosis, cancer, and cardiovascular disease.
• Pharmacological approaches to improve the folding, trafficking, and function of misfolded proteins are currently being developed.

Lysosomes

Lysosomes can be considered as the recycling centres of the cell. The cytosolic surface of the lysosome membrane is now recognized as a major site of action of signalling complexes that regulate cellular

metabolism, including a transcription factor, TFEB, which can translocate to the nucleus and upregulate autophagy (a process of orderly degradation and recycling of cellular components) and lysosome biogenesis genes when the cell is starved. Macromolecules are delivered to lysosomes to be broken down into their constituent building blocks (e.g. proteins to amino acids, polysaccharides to monosaccharides) by a host of acid hydrolases (proteases, glycosidases, nucleases, lipases, and so on). The building blocks are then exported from the lysosome and used by the cell to make new macromolecules.

It is clearly important that the hydrolysis of macromolecules is strictly compartmentalized, otherwise the cell would destroy itself. In fact, the cell not only compartmentalizes lysosomal enzymes within the lysosome, but also ensures that these enzymes only become fully active once they have been delivered into the lysosome. The lysosomal enzymes are acid hydrolases, that is to say that they function at low pH. The pH of the lumen of the lysosome is about 4.5. This is in contrast to the pH in the lumen of the Golgi and TGN (approximately 6.5 to 6.7) or the pH of the cytosol (approximately 7.4). Thus, as lysosomal enzymes are delivered from the TGN to the lysosome (by vesicular transport) they become activated because of the lower pH in the lysosome.

In fact, lysosomal enzymes are not delivered directly from the TGN into the lysosome, but are delivered to an intermediate compartment, the late endosome. This is an interface between the secretory pathway and the endocytic pathway (a pathway carrying material that has been internalized from the cell surface). A late endosome fuses with a mature lysosome, delivering its contents to the lysosomal hydrolases in a transient hybrid organelle, the endolysosome, in which digestion commences (Fig. 3.1.3).
Membrane lipids and integral membrane proteins that should be in the late endosome are then retrieved from the endolysosome and recycled to form a new late endosome and allow regeneration of a mature lysosome. A similar process of delivery to lysosomal hydrolases occurs in the autophagic pathway in which an autophagosome, formed by a double membrane enclosing a region of cytoplasm to be degraded, fuses with a lysosome to form an autolysosome from which a mature lysosome can be regenerated.

Fig. 3.1.3 Correlative light and electron microscopy of endolysosomes. A cultured fibroblastic cell was incubated with a membrane permeable cathepsin substrate called Magic Red™, which releases red fluorescent cresyl violet dye within the endolysosomes after hydrolysis (upper left). The location of the endolysosomes within the cell can be identified by merging the fluorescence image with a differential interference contrast microscopy image (upper right). Transmission electron microscopy of the same cell shows the ultrastructure of the individual endolysosomes (lower images), identified at the centre of boxes 1 and 2. Scale bars: upper images, 10 microns; lower images, 0.2 microns.

The luminal pH of the late endosome is intermediate between that of the TGN and that of the lysosome (i.e. between 5 and 6). This reduced pH is generated by the action of

a proton pump (a vacuolar ATPase) in the limiting membrane of the late endosome and lysosome.

Many inherited disorders of lysosomal function have been identified. These diseases and their treatments are described in Chapter 12.8. The special capacity of the lysosomal compartment for complementation by internalizing proteins supplied externally has allowed several important therapeutic enzymes to be developed. There are also many disease-causing single gene disorders associated with malformation or malfunction of lysosome-related organelles that exist in a variety of differentiated and specialized cells. These have some properties and proteins in common with lysosomes but also contain cell type-specific proteins, often destined for secretion as a result of the lysosome-related organelle fusing with the plasma membrane. Examples include secretory lysosomes in cytotoxic T lymphocytes and natural killer cells, melanosomes in melanocytes, and a variety of platelet granules.

Endocytosis and endosomes

The existence of late endosomes implies that there must also be early endosomes. There is clearly a flow of membrane and protein along the secretory pathway culminating in the fusion of vesicles with the plasma membrane. In the absence of any compensatory membrane internalization, the surface area of the cell would therefore continually increase. Such internalization does occur, thereby ensuring that most cells remain relatively constant in size.

The internalization of membrane from the cell surface (a process termed endocytosis) occurs via a variety of routes, the best characterized being clathrin-mediated endocytosis. Just as with vesicular transport in the secretory pathway, the process involves the assembly of specific protein machinery at the cytosolic face of the plasma membrane, the invagination of the plasma membrane, and the pinching off of clathrin-coated, membrane-bound vesicles.
The protein coat (including clathrin) that has been instrumental in the formation of these vesicles then disassembles and the uncoated vesicles fuse to form early endosomes; the vacuolar ATPase is already active in early endosomes and they have a luminal pH of approximately 6.5 to 6.8. The endocytic process not only ensures that a balance is maintained between the amount of membrane inserted in the plasma membrane and the amount removed, but also allows the selective recruitment of specific integral membrane proteins (often with extracellularly bound ligand) into the endocytic pathway. The different endocytic mechanisms selectively recruit different cargo and thereby serve as molecular filters, internalizing certain integral membrane proteins while leaving others at the cell surface.

The complexity of the endomembrane system (i.e. the membranes of the endocytic compartments) in mammalian cells has become apparent in recent years. Thus, for example, further protein machinery is used to sort integral membrane proteins destined for degradation away from the limiting membrane of endosomes into intraluminal vesicles that accumulate in late endosomes, giving them the appearance of multivesicular bodies in the electron microscope.

In addition to early and late endosomes, there are also recycling endosomes. These are compartments from which material that is to be returned to the plasma membrane is retrieved. Such material might be receptors that have been internalized along with their ligand, but which need to be returned to the cell surface having released their ligand at the lower pH of the early endosome/recycling endosome. An example of such a receptor is the transferrin receptor, which releases the iron that is bound to transferrin in the early endosome/recycling endosome, leaving the transferrin receptor and apotransferrin to be recycled to the cell surface for reuse.
Other receptors that are internalized are destined for degradation in the lysosome and are delivered to the late endosome. An example of such a receptor is the epidermal growth factor receptor. When this receptor binds its ligand at the cell surface it transmits a cascade of signals across the cell which trigger cell growth and cell division. Such signals should only be transient, otherwise unregulated cell growth and cell division occur; the cell ensures that the signals transmitted by the receptor are transient by internalizing the receptor and sending it to the lysosome for degradation.

Cytoskeleton

Movement of vesicles and tubules between compartments does not occur at random but is dependent upon motor proteins and the cytoskeleton. The cytoskeleton is the name given to a framework within the cell which gives the cell its shape and provides a structure to which organelles and proteins can be attached, thus providing an architecture that gives spatial organization to the cell.

There are three main components of the cytoskeleton in mammalian cells: microtubules, actin filaments, and intermediate filaments (Fig. 3.1.2). Each of these components is a polymer of protein subunits and all three are dynamic structures with the potential for assembly and disassembly according to the needs of the cell.

Microtubules are highly dynamic polymers of heterodimers of α- and β-tubulin which assemble to form long hollow tubes approximately 25 nm in diameter. Monomers of globular (G) actin polymerize to form filamentous (F) actin, which is a double-stranded helical polymer with a diameter of 5 to 9 nm. Elongated and fibrous subunits assemble to form intermediate filaments with a diameter of approximately 10 nm (e.g. lamins A, B, and C assemble to form the nuclear lamina that provides the inner lining to the nuclear envelope).

The different components of the cytoskeleton provide complementary features of the cellular architecture.
Microtubules serve to localize organelles within the cell and provide the tracks along which many classes of transport vesicles and tubules move, the movement being powered by motor proteins (notably kinesin and dynein) attached to the membranes of the vesicles or tubules. Microtubules also play a critical role during cell division, since they are pivotal in the physical separation of chromosomes during mitosis. Actin filaments can cross the cell and provide the structure that determines the shape of the cell’s surface. These filaments play major roles in protrusions from the cell surface. For example, they run along the length of the microvilli that extend from the apical surface of polarized epithelial cells, and they are absolutely necessary for cell locomotion since concerted rearrangements of the actin cytoskeleton underlie cell movement. Myosin motor proteins also interact with actin, the best characterized such interaction being between actin and myosin II in skeletal muscle; this interaction is responsible for generating the force that is required for muscle contraction. Both microtubules and actin filaments are highly dynamic structures. Their assembly and disassembly are

3.1  The cell 215 precisely and finely regulated by a host of cellular proteins in re- sponse to a range of extracellular signals. Intermediate filaments are relatively stable by comparison, providing mechanical strength to the cell. The dynamic cell The preceding overview of the secretory and endocytic pathways in mammalian cells highlights their dynamic nature. The elegant car- toons that grace most textbooks in this field indicate the subcellular organelles and other cellular components and their relative posi- tions within the cell, but, because they are two-​dimensional static images, cannot give any indication of the complex dynamics that operate within cells. All the different vesicle budding, vesicle trans- port, vesicle targeting, and vesicle fusion steps involve the assembly and disassembly of specific and discrete macromolecular complexes. There is exquisite spatiotemporal control of each of these events. The dynamic nature of microtubules and actin filaments adds to the complexity of the interactions that occur within cells. The dynamic nature of the eukaryotic cell has been made evident over the past 10 years or so following the widespread use of a range of tools and microscopy systems that allow the imaging of specific pro- teins within live cells. One real breakthrough came with the isolation of the DNA sequence encoding green fluorescent protein (GFP), which is encoded by the genome of the jellyfish Aequorea victoria and naturally fluorescent, emitting green light when it is illumin- ated with blue light. The now standard techniques of molecular gen- etics have allowed researchers to link the DNA sequence encoding GFP to the DNA sequences encoding a range of different proteins. These hybrid DNA sequences can be introduced into eukaryotic cells and the localizations and intracellular movements of the hybrid proteins they encode can be monitored by appropriate microscopy techniques. 
This has allowed us to see the dynamic instability of microtubules within living cells, the movement of proteins along the secretory pathway, the sorting of proteins in the endocytic pathway, and many other cellular processes.

Genetic engineering has also provided us with a suite of spectral variants of GFP, each emitting light of a different wavelength (i.e. a different colour), thereby allowing the imaging of two or more different proteins in the same cell at the same time. It is remarkable that in most cases the presence of a fluorescent protein attached to a protein of interest has little if any effect on the function of that protein.

Developments in microscopy are also making a huge contribution to our understanding of cell dynamics. Remarkably, it has been discovered that there are ways of overcoming Abbe’s diffraction limit, derived from the laws of physics, which dictates that visible light cannot distinguish between objects closer to each other than around 200 nm (about half the wavelength of visible light). This has led to the construction of so-called super-resolution microscopes with greater than an order of magnitude improvement in resolution. The development of lattice light sheet microscopy, with scanning speeds of hundreds of planes per second and exceptionally low photobleaching and phototoxicity, has recently enabled extraordinary spatiotemporal resolution in living cells. We may soon see the results of studies on the interplay of a dozen or more proteins in the same cell at the same time, helping us to understand how they orchestrate a complex cellular function such as vesicular traffic.

The overall importance of the dynamic nature of the cell is highlighted by the pathogenesis of the diverse group of genetic disorders that make up the hereditary spastic paraplegias.
In these diseases, lower limb spasticity and weakness are caused by a progressive distal axonopathy that mainly involves the longest corticospinal tract axons, which can reach one metre in length. The existence of protrusions of this length from a single cell undoubtedly presents many challenges in cell dynamics, considering that most mammalian cells are only a few tens of microns across at their greatest width. Many of the proteins encoded by hereditary spastic paraplegia genes are now recognized as being involved in intracellular vesicular traffic or shaping of intracellular organelles, with c. 60% of cases due to mutations in the gene encoding the protein spastin, which couples membrane modelling to the severing of microtubules.

Biological membranes

It is over 40 years since Singer and Nicolson proposed the ‘fluid mosaic’ model for biological membranes. This proposed that integral membrane proteins could diffuse freely in the sea of the lipid bilayer. The imaging of populations of GFP-tagged proteins has confirmed earlier studies which show that this is essentially the case, but more sophisticated single-particle tracking studies have shown that the plasma membrane is partitioned with regard to molecular diffusion in the plane of the lipid bilayer. One major reason for this is specific interactions with the underlying actin cytoskeleton that can create molecular picket fences around two-dimensional domains (incorporating both leaflets of the bilayer). These interactions tend to be between actin and the cytosolic domain of specific integral membrane proteins (Fig. 3.1.4).

Interactions in biological membranes are often indirect (i.e. via one or more intermediate proteins), thereby providing the opportunity for regulation of the interaction. For example, the cytosolic domain of the cystic fibrosis transmembrane conductance regulator (CFTR) interacts with a cytosolic protein called EBP50 at the apical surface of polarized human airway epithelial cells.
EBP50 in turn binds another protein, ezrin, and ezrin binds the actin cytoskeleton, thereby tethering CFTR to the actin cytoskeleton and keeping it in the right place in the plasma membrane. Similar interactions between the cytosolic domains of specific integral membrane proteins and the actin cytoskeleton most probably also occur in the context of organellar membranes.

In addition to the plasma membrane having two-dimensional organization as a result of picket fences, there are also raft domains enriched in cholesterol and glycosphingolipids (together with glycosylphosphatidylinositol-anchored proteins) as a result of the affinities of these lipids for each other. Although there is some disagreement about the best practical way to define a raft, it is clear that they exist, at least as microdomains. Interactions between rafts and the actin cytoskeleton have been detected and seem to function as signalling nodes due to the recruitment and concentration of proteins required for signalling pathways on the cytosolic side. As with other aspects of cellular organization, the two-dimensional organization of the plasma membrane is dynamic and reversible.

Differential gene expression

A human body clearly arises from a single cell, the fertilized egg. This single cell eventually gives rise to the multitude of different cell types within the body. For this to occur, cells must grow and divide (and sometimes die) in a highly regulated manner. Not only do the cells need to grow and divide, different populations of cells must differentiate along different lineages in order to generate the different cell types required to populate the different tissues and organs of the body.

Each of the 10¹³ or so cells in the human body is a phenomenally complex and dynamic entity. Furthermore, different subsets of the approximately 20 000 to 25 000 genes in the human genome are expressed in different cell types, with further variations in gene expression occurring during development and in response to external stimuli. This differential gene expression leads, at least in part, to the diversity of cell types found throughout the body. Thus, for example, a neuron is clearly very different from an epithelial cell lining the gut. However, both have the same fundamental organization described in the preceding paragraphs. They both express a core set of shared genes, providing the fundamental cellular organization, but each expresses a different set of specific genes. The specific genes expressed will help to define the phenotype of the cell. The differences between cells can often be quite subtle (e.g. different cell types have different protein subunits making up their intermediate filaments, different motor proteins are expressed in different cell types, and differential glycosylation of glycoproteins and glycolipids occurs in different cell types).

Alternative splicing and post-translational modifications

The fact that there appear to be only 20 000 to 25 000 genes in the human genome does not mean that only this number of proteins can be encoded by the genome. Many genes are subject to the process of alternative splicing, whereby specific exons are included or excluded as the precursor mRNA is processed (spliced) in the nucleus to remove the noncoding intron sequences.
Thus, one gene can give rise to several related, but different, mRNA transcripts. Furthermore, differential processing and differential post-translational modification of proteins lead to further variety in the range of protein products produced from the genome.

The range of proteins in a cell (the proteome) is therefore potentially considerably larger than the number of genes in its genome. In the case of cytosolic proteins, and the cytosolic domains of integral membrane proteins, many of the post-translational modifications that occur are transient and reversible. Thus, many such proteins are subject to phosphorylation (the addition of a phosphate group to the side chain of a specific amino acid). This process is catalysed by specific enzymes (kinases) and occurs on specific serine, threonine, or tyrosine residues in target proteins. This modification is reversible by the action of members of another family of enzymes (phosphatases) that remove the phosphate. Phosphorylated proteins have different activities, and often interact with a different subset of proteins, from their nonphosphorylated counterparts; thus, reversible phosphorylation is a mechanism whereby the cell can regulate interactions and thereby processes occurring within it.

It is not uncommon for a kinase to be activated by phosphorylation and it is not uncommon for one kinase to activate another by phosphorylation, thus establishing a signalling cascade that has built-in amplification of the initial signal; the amplification arises because each kinase is an enzyme capable of acting upon multiple substrate molecules while it is in the active state. The initial signal might be the binding of a ligand to its receptor at the cell surface (e.g. the binding of epidermal growth factor to its receptor at the cell surface).
As mentioned earlier, this initiates a cascade of signals across the cell which trigger cell growth and cell division, a cascade which is essentially a cascade of phosphorylation events. Such a process clearly needs to be transient or it would lead to unregulated cell growth and cell division. The initial signal is removed by the internalization and degradation of the epidermal growth factor receptor, as previously outlined, but this still leaves an activated kinase cascade perpetuating the ‘grow and divide’ message. It is the action of specific phosphatases removing the phosphate groups from the kinases in the cascade that turns off the signalling pathway. Thus, once again, we have a highly dynamic cellular system with exquisite spatiotemporal control.

Reversible phosphorylation is one example of several reversible post-translational modifications that serve to regulate cellular function. The principles relating to phosphorylation as a form of post-translational regulation (i.e. that phosphorylated proteins interact with different proteins compared to their nonphosphorylated counterparts, or that phosphorylated enzymes have different activities from their nonphosphorylated counterparts, and that this plays a role in the regulation of cell function) also apply to other forms of reversible post-translational modification.

Fig. 3.1.4 Integral membrane proteins in the lipid bilayer. Some are tethered to the underlying actin cytoskeleton, keeping them in place and providing barriers to the free diffusion of those integral membrane proteins that are not so tethered. Figure labels: integral membrane protein tethered to the actin cytoskeleton via intermediate proteins; integral membrane protein not tethered to the actin cytoskeleton and free to diffuse in the plane of the lipid bilayer, but hindered from doing so by the tethered proteins; lipid bilayer; actin cytoskeleton.
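The built-in amplification of a kinase cascade, described in the text above, can be illustrated with a toy calculation. The sketch below is not a kinetic model: it simply assumes that every active molecule at one tier activates a fixed number of substrate molecules at the next tier. The function name `cascade_output` and the figure of 10 substrates per kinase are hypothetical choices made purely for illustration.

```python
def cascade_output(tiers: int, substrates_per_kinase: int,
                   initial_active: int = 1) -> list[int]:
    """Return the number of active molecules at each tier of a toy cascade.

    Assumption (illustrative only): each active molecule activates exactly
    `substrates_per_kinase` molecules in the next tier while it is active.
    """
    if tiers < 0 or substrates_per_kinase < 0 or initial_active < 0:
        raise ValueError("all arguments must be non-negative")
    counts = [initial_active]
    for _ in range(tiers):
        # Each tier multiplies the signal by the per-kinase substrate count.
        counts.append(counts[-1] * substrates_per_kinase)
    return counts


# One activated receptor, three kinase tiers, 10 substrates per kinase.
print(cascade_output(tiers=3, substrates_per_kinase=10))  # [1, 10, 100, 1000]
```

Under these assumptions the signal grows geometrically with the number of tiers, which is why removing the initial signal alone is not enough and phosphatases are needed to switch the cascade off.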

Post-transcriptional gene silencing (miRNA)

The 20 000 to 25 000 genes in the human genome account for only about 2% of the total DNA in the genome. So, what is the role of all the other DNA? A significant amount of it serves structural purposes (e.g. the sequences at the centromeres (middles) and telomeres (ends) of chromosomes), but recent evidence shows that much of it plays crucial regulatory roles, working by the process of post-transcriptional gene silencing.

The phenomenon of post-transcriptional gene silencing was first described in plants but has subsequently been shown to be widespread in eukaryotes. In this process a short (19–23 nucleotides long) double-stranded RNA molecule associates with a target mRNA (the nucleotide sequence of one of the RNA strands in the double-stranded molecule is complementary to the sequence of the target mRNA). This occurs in the context of a multiprotein complex and leads to either a block in translation or the degradation of the target mRNA. This mechanism therefore regulates protein expression post-transcriptionally, hence the designation post-transcriptional gene silencing.

The short double-stranded RNA molecules involved in post-transcriptional gene silencing are produced from slightly larger precursor RNA molecules known as micro RNAs (miRNAs), which are themselves produced by transcription of relevant DNA sequences by RNA polymerase in the cell’s nucleus. The number of DNA sequences within the human genome that encode miRNAs has yet to be finalized, but there appear to be at least as many such sequences as there are protein-encoding DNA sequences (i.e. conventional genes). miRNA sequences have been shown to play critical regulatory roles in a range of processes (e.g. during development and in the immune response to pathogens). They have also been implicated as playing a role in several disease states, such as heart disease and cancer.
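The complementarity between one strand of the short RNA and its target mRNA, described above, can be sketched as a simple string search. This is a minimal illustration that assumes perfect Watson–Crick complementarity along the whole guide strand; real targeting commonly tolerates imperfect pairing, so this is a simplification. The function names and both sequences are hypothetical and do not correspond to any known miRNA.

```python
# Watson-Crick pairing for RNA: A-U and G-C.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence (5'->3')."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def find_target_sites(guide: str, mrna: str) -> list[int]:
    """Return 0-based positions in `mrna` that pair perfectly with `guide`."""
    site = reverse_complement(guide)
    mrna = mrna.upper()
    positions = []
    start = mrna.find(site)
    while start != -1:
        positions.append(start)
        start = mrna.find(site, start + 1)
    return positions


# Hypothetical 21-nucleotide guide strand and a made-up target mRNA that
# contains one perfectly complementary site after a 6-nucleotide prefix.
guide = "AUGGCUAGCUAGGAUCCGAUA"
mrna = "GGGAAA" + reverse_complement(guide) + "CCC"
print(find_target_sites(guide, mrna))  # [6]
```

The search uses the reverse complement because the two strands of an RNA duplex pair in antiparallel orientation.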
Future developments

The availability of the human genome sequence has given us access to information concerning the basic building blocks of the cell, but it is how those building blocks are modified and used in a multitude of different dynamic interactions that gives organization, function, and life to the cell. A major challenge of the next decade is to integrate the vast amounts of data that are now available, and will continue to become available, on the molecular mechanisms that underlie cellular organization, structure, and function. Such a challenge will have to be met if we are to achieve a clearer, and more complete, understanding of the cell and are to develop the capacity to refine our means of modifying cellular functions that are disturbed in disease.

FURTHER READING

Alberts B, et al. (2015). Molecular biology of the cell, 6th edition. Garland Science, New York.
Baum DA, Baum B (2014). An inside-out origin for the eukaryotic cell. BMC Biology, 12, 76.
Berridge MJ (2006). Cell signalling biology. Portland Press, Colchester. http://www.cellsignallingbiology.org/
Blackstone C, et al. (2011). Hereditary spastic paraplegias: membrane traffic and the motor pathway. Nat Rev Neurosci, 12, 31–40.
Bonifacino JS, Glick BS (2004). The mechanisms of vesicle budding and fusion. Cell, 116, 153–66.
Brooks SA, et al. (2008). Altered glycosylation of proteins in cancer: what is the potential for new anti-tumour strategies. Anticancer Agents Med Chem, 8, 2–21.
Bushati N, Cohen SM (2007). microRNA functions. Annu Rev Cell Dev Biol, 23, 175–205.
Chalfie M, et al. (1994). Green fluorescent protein as a marker for gene expression. Science, 263, 802–5.
Chen B-C, et al. (2014). Lattice light sheet microscopy: imaging molecules to embryos at high spatiotemporal resolution. Science, 346, 1257998.
Clapham DE (2007). Calcium signaling. Cell, 131, 1047–58.
Ellis RJ, Minton AP (2003). Join the crowd. Nature, 425, 27–8.
Giepmans BN, et al. (2006). The fluorescent toolbox for assessing protein location and function. Science, 312, 217–24.
Goldman RD, et al. (2008). Intermediate filaments: versatile building blocks of cell structure. Curr Opin Cell Biol, 20, 28–34.
Huotari J, Helenius A (2011). Endosome maturation. EMBO J, 30, 3481–500.
Irannejad R, et al. (2015). Effects of endocytosis on receptor-mediated signaling. Curr Opin Cell Biol, 35, 137–43.
Kusumi A, et al. (2012). Dynamic organizing principles of the plasma membrane that regulate signal transduction: commemorating the fortieth anniversary of Singer and Nicolson’s fluid-mosaic model. Annu Rev Cell Dev Biol, 28, 215–50.
Lanzetti L (2007). Actin in membrane trafficking. Curr Opin Cell Biol, 19, 453–8.
Levine B, Kroemer G (2019). Biological functions of autophagy genes: a disease perspective. Cell, 176, 11–42.
Lewin B, et al. (2007). Cells. Jones and Bartlett, Sudbury, MA.
Lippincott-Schwartz J (2004). Dynamics of secretory membrane trafficking. Ann N Y Acad Sci, 1038, 115–24.
Luzio JP, et al. (2014). The biogenesis of lysosomes and lysosome-related organelles. Cold Spring Harb Perspect Biol, 6, a016840.
Ross JL, Ali MY, Warshaw DM (2008). Cargo transport: molecular motors navigate a complex cytoskeleton. Curr Opin Cell Biol, 20, 41–7.
Settembre C, et al. (2013). Signals from the lysosome: a control centre for cellular clearance and energy metabolism. Nat Rev Mol Cell Biol, 14, 283–96.
Singer SJ, Nicolson GL (1972). The fluid mosaic model of the structure of cell membranes. Science, 175, 720–31.
Stadler BM, Ruohola-Baker H (2008). Small RNAs: keeping stem cells in line. Cell, 132, 563–6.
Stagg SM, LaPointe P, Balch WE (2007). Structural design of cage and coat scaffolds that direct membrane traffic. Curr Opin Struct Biol, 17, 221–8.
Stefani G, Slack FJ (2008). Small non-coding RNAs in animal development. Nat Rev Mol Cell Biol, 9, 219–30.
Ungewickell EJ, Hinrichsen L (2007). Endocytosis: clathrin-mediated membrane budding. Curr Opin Cell Biol, 19, 417–25.
Yang YX, Rastetter RH, Wilhelm D (2016). Non-coding RNAs: an introduction. Adv Exp Med Biol, 866, 13–32.
Zhang J, et al. (2002). Creating new fluorescent probes for cell biology. Nat Rev Mol Cell Biol, 3, 906–18.

3.2 The genomic basis of medicine

ESSENTIALS

It is now possible to determine the entire DNA information content of living organisms: the genome. The completion of the human reference DNA sequence has provided an enormous tool for genomic analyses and has enhanced our view of the genetic and genomic variation contributing to the genetic bases of disease. Several human genomic studies in diverse populations (e.g. the International HapMap Project (HapMap), the ENCyclopedia Of DNA Elements Project (ENCODE), and 1000 Genomes Project) have revealed that the tremendous amount of genetic variation in humans consists of two major types: nucleotide sequence variants and genomic structural changes. The contribution of rare variants and de novo mutations to disease, embodied in the Clan Genomics hypothesis, is of great clinical utility, has gathered extensive supportive data, and further aligns clinical practice with human biology in the context of evolution.

The first phase of the studies on genetic variation in humans has been focused on single nucleotide polymorphisms and common variation. The large number of single nucleotide polymorphisms identified has enabled successful genome-wide association studies for disease susceptibility risk of complex traits (e.g. diabetes and cancer), but for the most part has had limited practical applications in clinical medicine.

Technological developments enabling a higher-resolution analysis of the human genome have uncovered extensive submicroscopic structural variation, including copy-number variants. Copy-number variants involving dosage-sensitive genes result in several diseases and contribute to human diversity and evolution. An emerging group of genetic diseases have been described that result from DNA rearrangements (e.g. copy-number variants and other structural variations including copy-number neutral inversions and translocations), rather than from single nucleotide changes. Such conditions have been referred to as genomic disorders.
Recurrent rearrangements of the human genome, or those of common size that contain the same genomic interval in different individual personal genomes and have clustered breakpoints, most frequently result from a mechanism of nonallelic homologous recombination between region-specific low-copy repeats, or segmental duplications. Nonrecurrent rearrangements, or those for which breakpoints do not cluster and that are generally different in size and genome content among families, can result from a non-homologous end-joining recombination mechanism. More recently, DNA replication mechanisms involving template switching have been shown to play a major role in the origin of nonrecurrent rearrangements; template switching also plays an important role in many Alu–Alu mediated events. Iterative template switches during replicative repair can result in complex genomic rearrangements.

The development of array-based comparative genomic hybridization and single nucleotide polymorphism arrays has enabled high-resolution screening of genomic imbalances throughout the entire genome, with the level of resolution depending only on the size of, and distance between, the arrayed interrogating probes. This has had tremendous clinical applications for both postnatal and prenatal diagnosis.

Advances in massively parallel next-generation sequencing technologies have led to the development, and the research and clinical implementation, of exome sequencing and whole-genome sequencing, revolutionizing medical genetic diagnostics. These studies document a role for new mutations (either copy-number variants or single nucleotide variants) in sporadic disease traits. Such genome-wide variation studies are beginning to yield insights into multilocus effects on disease trait manifestations.
In the current postgenomic era both high-resolution genome analysis by chromosome microarray analyses and personalized diploid genomic sequencing applied to the study of inherited and complex traits promise a continued revolution in our understanding of normal physiology and the pathophysiology of disease, heralding the genomic basis of medicine and the precision medicine initiative.

3.2 The genomic basis of medicine Paweł Stankiewicz and James R. Lupski

Introduction

The elucidation of the DNA double helix establishing the chemical basis of heredity in 1953 and the determination of the correct number of human chromosomes 3 years later laid the fundamentals for the development of two major fields in human and medical genetics: clinical molecular genetics and clinical cytogenetics. Although developing independently for the first four decades, these two fields have contributed enormously to the molecular diagnosis of disease and provided a better understanding of the genetic bases of human physiology and pathophysiology. Around 25 years ago, technological advances, mainly in fluorescence microscopy, led to the development of molecular cytogenetics techniques that, by enabling the identification of submicroscopic chromosome rearrangements, bridged the genetic variation ‘resolution’ gap between molecular genetics and clinical cytogenetics. As a consequence, since the early 1990s, the genomic aspects of inheritance have come to be recognized, as elucidated, for example, through studies of the submicroscopic genomic duplication at chromosome 17p12 causing the common autosomal dominant adult-onset Charcot–Marie–Tooth type 1A distal symmetric polyneuropathy (CMT1A).

The beginning of human genetics can be traced to the rediscovery of Gregor Mendel’s observations on the inheritance of phenotypic traits in the garden pea Pisum sativum and Archibald Garrod’s elucidation of the genetics of biochemical traits such as alkaptonuria. Mendel found that during gamete formation, each member of the diploid allelic pair separates from the other one to form the genetic constitution of the haploid gamete. This phenomenon of independent segregation is now known as Mendel’s first law. Mendel’s second law states that the segregation of two alleles (corresponding DNA loci on homologous chromosomes) during gamete formation is independent from the segregation of the alleles of other allelic pairs. We now know that linkage, the physical proximity of two genetic loci on a linear map, results in exceptions to Mendel’s second law. Such linkage information has been used to map disease traits in humans. Mendel’s ‘inheritance factors’ encoding the genetic information were further defined and termed ‘genes’ many years later.

During the last two decades it has become possible to determine the sequence and variation of the entire DNA content of the living organism: the genome.
The first human haploid reference genome sequence became available at the turn of this century. In the current postgenomic era, both high-resolution genome analysis by chromosome microarray analyses (CMA), and personalized diploid genomic sequencing using exome sequencing (ES) or whole-genome sequencing (WGS) applied to the study of inherited and complex traits promise a continued revolution in our understanding of the genetic bases of human biology and the pathophysiology of disease.

Genes, chromosomes, and our genome

A gene is defined as a set of segments of DNA (deoxyribonucleic acid) that carries the information necessary to produce (transcribe) a functional RNA (ribonucleic acid). Despite the completion of the Human Genome Project (HGP), the exact number of protein-coding genes in the human genome is still unknown; the current estimate is between 20 000 and 25 000.

The DNA double helix is a three-dimensional polymer composed of units called nucleotides. Two purine bases, adenine (A) and guanine (G), and two pyrimidine bases, thymine (T) and cytosine (C), are combined with deoxyribose sugars and linked by phosphodiester bonds (base + sugar + phosphate = nucleotide) to constitute a single strand. The two strands are held together and stabilized by hydrogen bonds that enable Watson–Crick base pairs to form. The base A forms two hydrogen bonds with T while C forms three hydrogen bonds with G. A combination of three nucleotides constitutes a triplet codon that encodes an individual amino acid by a specific universal genetic code. Different triplets can encode the same amino acid or stop codon during translation; there are 4³ = 64 different codon combinations possible, but only 20 amino acids, making the genetic code degenerate.

Most genes consist of coding regions termed exons that are separated by intervening introns.
The exonic and intronic portions of a gene are transcribed by RNA polymerase II into messenger RNA, or mRNA, that usually begins with a cap on the 5' end and terminates with a polyadenylated (polyA) tail on the 3' end. The introns are removed in a process called splicing and the resulting mature transcript, or spliced mRNA, is translated into a polypeptide chain starting with a methionine encoded by the AUG triplet (in RNA, thymine is replaced by uracil, U) at the 5' end (N-terminal, NH2, or amino end of the polypeptide) and terminated by the stop codons UAA (also known as ochre), UAG (amber), or UGA (opal or umber) at the 3' end (C-terminal, COOH, or carboxyl end of the polypeptide).

The human haploid genome consists of approximately 3 × 10⁹ bp and the normal diploid human genome in each cell is composed of approximately 6 × 10⁹ bp. Most of the human genome consists of repetitive DNA elements. These can be divided into tandem repeats represented by satellites (e.g. in centromeres), telomeric repeats, micro- (di-, tri-, and tetranucleotide repeats), mini-, and macrosatellites, and interspersed repeats derived from transposable elements (e.g. Alu elements and L1 elements), which together comprise about 60% of the human genome. It has been estimated that greater than 6% of the haploid human genome is present in two or more copies, which have been termed low-copy repeats (LCRs) or segmental duplications. They are defined as DNA fragments larger than 1 kb in size and of more than 90% DNA sequence identity in the haploid reference genome.

The unique DNA sequence portion of the human genome includes genes, regulatory elements, and other nongenic sequences. It has been shown that most of this DNA might be transcribed, but protein-coding sequences occupy only about 1.5% of the human genome.
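The triplet code, its degeneracy, and the start and stop codons described in this section can be made concrete with a short sketch. It builds the standard genetic code from its conventional compact ordering (bases in the order U, C, A, G; one-letter amino acid codes, with `*` marking a stop codon), counts how many codons specify each amino acid, and translates a spliced mRNA from the first AUG to the first in-frame stop codon. The function name `translate` and the example sequence are hypothetical; scanning for the first AUG is a simplifying assumption, and the input is assumed to contain only A, C, G, and U.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; '*' = stop.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Degeneracy: 4**3 = 64 codons share 20 amino acids and 3 stop signals.
codons_per_residue = Counter(CODON_TABLE.values())


def translate(mrna: str) -> str:
    """Translate from the first AUG to the first in-frame stop codon."""
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon found
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "*":
            break
        peptide.append(residue)
    return "".join(peptide)


print(len(CODON_TABLE))              # 64
print(codons_per_residue["L"])       # 6 codons encode leucine
print(translate("GGAUGGCCUGGUAAGG"))  # MAW
```

In the example, translation starts at AUG (methionine, M), reads GCC (alanine, A) and UGG (tryptophan, W), and stops at UAA.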
For every human, it is important to inherit the proper amount of genomic information, with contributions from both parents and the correct copy-number of each genetic locus for proper function.

The genes in a human genome are distributed along 46 chromosomes. There are 22 pairs of autosomal chromosomes and two sex chromosomes: X and Y in males and two X chromosomes in females. In a conventional clinical cytogenetic analysis using a light microscope, chromosomes can be recognized and distinguished from each other when their chromatin is condensed (arrested in metaphase of the cell cycle) and specifically stained (e.g. with Giemsa), revealing a characteristic G-banding pattern. Each human metaphase chromosome consists of two chromatids forming the chromosome arms connected by a centromere. Centromeres in the human genome consist of α-satellite DNA (arranged by monomers of approximately 171 bp) and occupy about 2 to 3% of the human genome. Depending on the relative location of the centromere, chromosomes have been divided into three types: metacentric (with similar-sized arms), submetacentric (with one arm significantly longer than the other, the shorter arm referred to as p and the longer as q), and acrocentric (with a centromere located very close to one end of a chromosome: chromosomes 13, 14, 15, 21, and 22).

Human chromosome ends are capped by telomeres that contain thousands of copies of a telomeric repeat sequence TTAGGG. Telomeres are synthesized by the ribonucleoprotein telomerase. Based on the size and relative centromere position, human chromosome pairs have been enumerated and arranged in a karyogram that is routinely applied in clinical cytogenetics. A clinical karyotype always designates the number and chromosomal sex: e.g. normal female: 46,XX; normal male: 46,XY; male with trisomy 21 associated with Down syndrome: 47,XY,+21.

Pathogenic genetic variants

The normal flow of the genetic information is susceptible to perturbation at different levels. Changes in the base pairs are called variants or mutations and can arise as a result of replication, recombination, and repair errors, or by exposure to external environmental factors (e.g. radiation or chemical mutagens). To prevent a disease connotation of the term ‘mutation’, ‘variant’ is now used more commonly. Structurally, small variants can be divided into point mutations (substitutions) and insertions or deletions (indels). The most common variants, involving exchange of pyrimidine for pyrimidine (e.g. C to T) or purine for purine (e.g. A to G), are called transitions. The rarer transversions substitute purine by pyrimidine (e.g. A to C) or the reciprocal (e.g. G to T). The CpG dinucleotide is particularly prone to transition variants (about tenfold relative to other bases) because methylated C (after CpG island methylation) becomes T if deaminated, and now pairs with A.

DNA alterations that do not lead to a change in an amino acid, because of the degenerate code, are called silent (synonymous) variants. These do not change an amino acid but can have functional consequences, e.g. if they create a cryptic splice site or affect an exon splice enhancer.
Missense mutations result in an amino acid change, and nonsense mutations introduce stop codons that truncate the protein prematurely. Small insertions and deletions (indels) whose length is not a multiple of three nucleotides shift the reading frame and thus alter the primary sequence of the protein; they are called frameshift mutations.

Abnormally truncated or erroneous transcripts with a premature termination codon (PTC) due to nonsense, frameshift, or splice mutations are eliminated from cells by a surveillance mechanism called nonsense-mediated decay (NMD). NMD is usually triggered by a PTC in any exon except the last and a portion of the penultimate exon; PTCs in the last 50 to 55 bp of the penultimate exon or in the final exon escape NMD, presumably because the machinery cannot distinguish such a PTC from the normal stop codon. About one-third of all human disease-associated point mutations result from PTCs due to nonsense or frameshift alleles.

Mutations have also been categorized on the basis of their phenotypic outcomes. Loss-of-function mutations (hypomorphic if the loss is partial, amorphic if it is complete) manifest phenotypically when a decreased amount of protein is insufficient for normal cell function (e.g. in haploinsufficient genes). Gain-of-function mutations (neomorphic) enhance the normal protein function or confer a new one, and dominant negative mutations (antimorphic) result in a protein that acts antagonistically to the normal product from the other allele or to another subunit of a protein complex.

A genetic locus is said to be homozygous when two alleles have the same status (e.g. both alleles are mutated) and heterozygous when one allele is mutated and the second is normal (wild type). Compound heterozygotes have different mutations in both alleles of one gene. Double heterozygotes have two mutant alleles, but each is at a different gene locus. The status in which one of the alleles is absent (e.g. for most of the X chromosome genes in males) is described as hemizygous.

Typically, different mutations in a gene manifest with the same phenotype, a phenomenon described as allelic heterogeneity. However, different mutations in the same gene can sometimes lead to varied phenotypes. Such a situation is described as allelic affinity. Finally, if the same phenotype is caused by mutations in different genes, this is described as genetic or locus heterogeneity.

Patterns of inheritance

Mendelian inheritance

Genetic traits can show mendelian or nonmendelian inheritance patterns. Mendelian traits involve a single locus, are usually monogenic, and segregate in autosomal dominant, autosomal recessive, or X-linked fashion.

Autosomal dominant inheritance

In autosomal dominant inheritance, the mutated allele is transmitted to 50% of the gametes and thus is expected to be present in one-half of the progeny. However, if the trait is lethal, incompletely penetrant, age-dependent, or variable in expressivity, the proportion with manifestations of disease may vary from 0 to 50%. In pedigree analysis, autosomal dominant inheritance is observed as a vertical transmission of the trait.

Autosomal recessive inheritance

In an autosomal recessive trait, the affected individuals, representing one-fourth of the progeny, carry two mutant alleles at a locus as compound heterozygous or homozygous variants, each one usually inherited from a carrier parent. Among the healthy siblings in such families, two-thirds are expected to be carriers of the mutant allele and one-third (one-fourth of all progeny) to have two wild-type alleles. When the mutated alleles in the affected subject are the same, the family is usually consanguineous. In pedigree analysis, autosomal recessive inheritance is revealed as horizontal transmission of the trait.
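The segregation ratios quoted above (one-half for a dominant allele, one-fourth affected and two-thirds of healthy siblings as carriers for a recessive allele) follow from enumerating the equally likely gamete combinations. The sketch below does that enumeration under simplifying assumptions: a single locus, full penetrance, and no new mutation. The function name and the single-letter allele symbols are hypothetical conventions chosen for illustration.

```python
# Illustrative sketch: enumerate offspring genotypes for a single-locus cross.
# Each parent is written as a two-letter genotype, e.g. "Aa"; every gamete is
# assumed equally likely (simple mendelian segregation).
from collections import Counter
from fractions import Fraction
from itertools import product


def offspring_genotypes(parent1: str, parent2: str) -> dict:
    """Return {genotype: probability} for all allele combinations."""
    counts = Counter(
        "".join(sorted(a + b)) for a, b in product(parent1, parent2)
    )
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


if __name__ == "__main__":
    # Autosomal recessive: two carrier parents ("a" is the mutant allele).
    recessive = offspring_genotypes("Aa", "Aa")
    affected = recessive["aa"]
    carriers_among_healthy = recessive["Aa"] / (1 - affected)
    print("affected:", affected)
    print("carriers among healthy siblings:", carriers_among_healthy)

    # Autosomal dominant: affected heterozygote ("D" mutant) x unaffected.
    dominant = offspring_genotypes("Dd", "dd")
    print("progeny carrying the dominant allele:", dominant["Dd"])
```

Using `Fraction` keeps the ratios exact, which makes it easy to see that the two-thirds figure is simply one-half divided by three-fourths.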
Of note, for a few autosomal recessive traits it has been shown that heterozygous carriers of the mutated allele may have an increased susceptibility to complex or multifactorial traits (Table 3.2.1). Moreover, at some loci the carrier state conveys a selective advantage (for example, the haemoglobin B gene (HBB) and protection from malaria, or CFTR and protection from death from cholera); hence in some world populations the carrier state can reach a relatively high frequency.

The X chromosome

In females, the vast majority of genes on the X chromosome undergo random inactivation and thus represent structural disomy but functional monosomy. If one of the X chromosomes harbours a mutated recessive allele, X inactivation is usually nonrandomly skewed, with the X chromosome harbouring the mutant allele being preferentially inactivated. Therefore, X-linked recessive diseases generally do not manifest in carrier females but affect all males who carry the mutant allele, since males have only one X chromosome. However, the X-linked recessive disease can manifest in females with incomplete X inactivation (e.g. the efficiency of X inactivation decreases significantly with age) or skewed X inactivation (e.g. 5–10% of females have an 80:20 ratio of X inactivation), in females with Turner syndrome and a 45,X karyotype, and in females carrying a balanced translocation between the X chromosome and an autosome (X material on the derivative chromosomes is not inactivated). For example, some carriers of mutations in the PHF6 gene in families with Börjeson-Forssman-Lehmann (BFL) syndrome manifest mild features or different phenotypes, likely due to escape from X inactivation in some tissues or cell types. However, classical X-inactivation studies cannot explain these findings.

In contrast, X-linked dominant diseases are present in both males and females, and twice as many females as males are affected. However, the phenotype is usually milder in females than in males. Occasionally, if the disease is lethal in males, the trait is seen only in females (e.g. Rett syndrome). In X-linked diseases, no male-to-male transmission is observed and all daughters of affected fathers are obligate carriers of the mutated allele.

Penetrance, expressivity, and age of onset

The determination of the mendelian segregation pattern in pedigree analysis can be complicated by incomplete penetrance, wherein a phenotypic feature can be present or absent (e.g. in Marfan syndrome); by variable expressivity, when the same mutation leads to different severity or pattern of the phenotype (e.g. in cystic fibrosis); or by manifestations that depend on age (e.g. in Huntington disease).

Nonmendelian inheritance

There are many genetic abnormalities that show familial recurrence but do not demonstrate mendelian segregation patterns.
Such distortions from mendelian expectations, or nonmendelian traits, can be due to multiple aetiologies, including genomic imprinting, uniparental disomy, mosaicism, mitochondrial DNA mutations, compound inheritance, digenic or triallelic inheritance, or mutational burden.

Genomic imprinting

If a phenotypic trait is transmitted through only one sex (parent-of-origin effect), genomic imprinting should be considered. During the passage through meiosis, several genes are silenced (imprinted) in a sex-specific manner. For example, the paternal copy of the UBE3A gene on chromosome 15q11.2 is imprinted during spermatogenesis. In the progeny, only the allele inherited from the mother is expressed, in a neuronal tissue-specific manner, and it is sufficient to produce enough RNA for normal cell function. When this single active allele is mutated or deleted, the individual is affected (in this case with Angelman syndrome). The sex-specific imprint of UBE3A is erased upon the entrance of chromosome 15 into meiosis and, depending on the sex, a new imprint is established; again, only one allele is expressed.

Uniparental disomy

The active allele of an imprinted gene will not be transmitted to the progeny if both homologous chromosomes harbouring the imprinted allele are inherited from one parent. Such lack of normal biparental inheritance of homologous chromosomes has been defined as uniparental disomy. Two major types of uniparental disomy have been described: heterodisomy and isodisomy. In heterodisomy, the two different homologous chromosomes of one parent are transmitted to the child; in isodisomy, both homologous chromosomes in the offspring originate from only one of the parental homologues. If a recessive disease gene is present on the isodisomic chromosome, the disease will manifest even though only one parent is a carrier for the mutation.
The most frequent mechanism responsible for uniparental disomy is chromosome nondisjunction in meiosis I followed by trisomy-to-disomy rescue in an early postzygotic stage of the embryo. In humans, among autosomes only trisomies 13, 18, and 21 are compatible with life. In certain tissues, the cells carrying an extra (trisomic) chromosome can survive only if one of the three copies of the homologous chromosomes (or a large portion thereof) is lost. This process of elimination of the extra chromosome is called trisomy rescue. Since this is a random event, in one-third of cases the remaining chromosomes will be from the same parent, thus representing uniparental disomy. Because most cases result from an initial nondisjunction event, uniparental disomy is associated with advanced maternal age.

Table 3.2.1  Recessive disorders and heterozygous predisposition to multifactorial disease

| Monogenic disease | OMIM | Gene | Risk for multifactorial disease | OMIM |
|---|---|---|---|---|
| Aicardi–Goutières syndrome 1 | 225750 | TREX1 | Chilblain lupus | 610448 |
| Ataxia-telangiectasia | 208900 | ATM | Breast cancer | 114480 |
| α1-Antitrypsin deficiency | 107400 | AAT | Chronic obstructive lung disease | 606963 |
| Cystic fibrosis | 219700 | CFTR | Pancreatic insufficiency, chronic rhinosinusitis, idiopathic bronchiectasis | 167800, 211400 |
| Familial hypercholesterolemia | 143890 | LDLR | Coronary artery disease | 108725 |
| Gaucher disease | 230800, 230900, 231000 | GBA | Parkinson disease, late-onset | 168600 |
| Hyperlipoproteinemia | 238600 | LPL | Ischaemic heart disease | 612030 |
| Parkinson disease | 600116 | PARK2 | Lung cancer; ovarian cancer | 211980; 167000 |
| Progressive familial intrahepatic cholestasis | 602347 | ABCB4 | Intrahepatic cholestasis of pregnancy | 147480 |
| Tay-Sachs disease | 272800 | HEXA | GM2-gangliosidosis, several forms | 272800 |
| Stargardt disease | 248200 | ABCR (ABCA4) | Age-related macular degeneration | 153800 |

Mitochondrial inheritance

If a phenotypic trait is inherited only from mothers and never from fathers, then mitochondrial disease should be considered.

Mitochondrial DNA (mtDNA) is present in multiple copies in the cell cytoplasm and is transmitted to progeny only through the oocytes; the sperm carries a negligible amount of mtDNA. A broad spectrum of disease phenotypes may be caused by defects in the mitochondrial genome. In some cases, both the mother and her child may present with varying severity of the phenotype due to different proportions of mutated mtDNA in the cytoplasm, a phenomenon called heteroplasmy. In mitochondrial disease, the most energy-dependent tissues (e.g. skeletal muscle, brain, heart, eyes) may be the first to reveal clinical signs or symptoms.

Digenic inheritance

Several diseases that are neither complex traits nor simple single-gene mendelian disorders have been shown to be caused by mutations in two different genes, with the second allele of each gene being normal. Such double heterozygotes, in which the two mutations interact genetically to manifest the phenotype, have been described for example in retinitis pigmentosa (ROM1 and RDS encode interacting gene products) and deafness (GJB6 and GJB2) (Table 3.2.2).

Triallelic inheritance

Bardet–Biedl syndrome, a pleiotropic mendelian disorder characterized by postnatal obesity, postaxial polydactyly, and progressive retinal dystrophy, can be caused in some families by mutations in at least two genes. Mutation analyses have revealed that in some patients with Bardet–Biedl syndrome, three mutant alleles in two different genes segregated with expression of the disease. This phenomenon has been described as triallelic inheritance and has also been observed in other diseases (e.g. familial hypercholesterolemia and cortisone reductase deficiency). Based on this observation, an oligogenic type of inheritance (i.e. mutations in a small number of genes combined rather than a single locus mutation) was proposed to explain some other phenotypes, such as Hirschsprung disease (see Table 3.2.2).
Multiple copy-number variants (CNVs) and compound inheritance

Sporadic disease can also potentially result from a combination of two CNVs at a single locus (e.g. analogous to the autosomal recessive neuromuscular disease spinal muscular atrophy or the renal disease nephronophthisis type I) or theoretically from two or more CNVs at different loci from two normal parents. Systematic evaluations of patients with the same-sized CNVs and variable phenotypic expressivity showed that additional CNVs could modify the severity of the phenotypic manifestations. This phenomenon has been referred to as the two-hit (or second-hit) model.

More recently, a compound inheritance model was found to account for up to 11% of congenital scoliosis cases in a Chinese population. The model combines a rare null mutation of TBX6 with a common noncoding haplotype, T-C-A (rs2289292, rs3809624, and rs3809627), that acts as a hypomorphic allele and has a prevalence of 44% among Han Chinese. These findings were elegantly recapitulated in a mouse model of TBX6 null and hypomorphic mutations. Moreover, complex compound inheritance of coding and noncoding regulatory variants, e.g. in tissue-specific enhancers or evolutionarily conserved regions, has recently been described for congenital anomalies of the kidney and urinary tract (CAKUT), lethal lung developmental diseases, and neurodevelopmental disorders. These studies help unravel the complex molecular bases of incomplete disease penetrance.

Mosaicism

Another distortion of mendelian inheritance can be caused by mosaicism. Two or more cell lines can be present either in the gonads only (germline mosaicism) or in somatic cells (somatic mosaicism). Combined somatic and germline (gonosomal) mosaicism has been identified in parents of patients with several genetic conditions, thus raising the possibility that mosaic individuals might be detected by routine blood tests rather than requiring direct examination of germ cells.
Mosaicism should be suspected when healthy parents have two or more children with a dominant disease. It has been shown that mutations that exist only in the mosaic state, presumably because constitutional mutations are embryonic lethal (e.g. Proteus syndrome, OMIM 176920; hemimegalencephaly, OMIM 615937), can have profound effects on the phenotype of an individual.

Pleiotropy and epistasis

Pleiotropy occurs when a variant in a single gene has more than one distinguishable phenotypic effect. Epistasis refers to interaction between genes, in which a phenotypic effect is different from what would be expected if mutations of the genes were expressed independently. In both situations, the inheritance pattern in pedigree analysis can appear nonmendelian.

Clan genomics

Analyses of CMA, ES, WGS, and 1000 Genomes Project data, and of targeted genomic sequencing derived from very large sample sizes, reveal that an abundance of rare and private single nucleotide variants (SNVs) and CNVs with large effect have arisen recently in population history. Such rare and novel variants, including new mutations, contribute significantly to sporadic clinical traits. Clan genomics posits that recent mutation may have a greater influence on disease susceptibility or protection, and account for many more medically actionable variants, than common variations that arose in distant ancestors.

Genetic and genomic variation

The Human Genome Project (HGP) has provided a haploid reference human genome, enabling the generation of an enormous amount of DNA sequence data and personal genome analyses for disease-associated CNVs and SNVs, but it did not assess the scale of genetic and genomic polymorphic variation between different individuals and among populations.
HapMap was developed in 2002 to determine the common patterns of human DNA sequence variation and to generate a haplotype map (i.e. a linked set of genetic markers) of the human genome that in turn would help to identify genes affecting health and responses to drugs and environmental factors. Concurrently, the Human Genome Diversity Project (HGDP) was established to help to understand the diversity and unity of the entire human species, and the ENCODE Project was launched to help interpret this information, elucidate the functional consequences of noncoding variation, and better understand the biology of human health and disease.

By using high-throughput methods, the ENCODE Project has generated a comprehensive catalogue of the structural and functional components encoded in the human genome sequence, including protein-coding genes, nonprotein-coding genes, transcriptional regulatory elements, and sequences that mediate chromosome structure and dynamics.

HGP, HapMap, HGDP, ENCODE, and several additional studies have revealed that human genetic variation is tremendous and consists of two major types: nucleotide sequence variations and genomic structural changes. Human genetic and genomic studies have resulted in the proliferation of several databases required to interpret the potential clinical significance or relevance of variation (Table 3.2.3).

Table 3.2.2  Models of disease allele transmission

| Locus | Allele | Example (disease/gene) |
|---|---|---|
| Monogenic | Monoallelic | Angelman (UBE3A) |
| Monogenic | Biallelic | Cystic fibrosis (CFTR) |
| Monogenic | Triallelic | CMT1A (PMP22) |
| Digenic | Biallelic | RP (ROM1-RDS) |
| Digenic | Triallelic | BBS (BBS1–19) |
| Oligogenic | | Hirschsprung disease |

Types of variations

Single nucleotide polymorphisms (SNPs)

A genetic polymorphism is defined as a heterozygous DNA variation found in more than 1% of the general population. The first phase of studies on genetic variation in humans focused on SNPs. A SNP is a nucleotide change that results from a base substitution during DNA replication. The frequency of such events has been estimated as 10⁻⁸ per base pair per generation. The vast majority of SNPs represent inherited changes that have accumulated over thousands of human generations.
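The 1% threshold in the definition of a polymorphism can be made concrete with genotype counts from a sample. The sketch below is illustrative only: it assumes a biallelic site and applies the 1% cut-off to the minor allele frequency, which is one common reading of the definition rather than something the text specifies. The function names are hypothetical.

```python
# Illustrative sketch: minor allele frequency from genotype counts at a
# biallelic site, and a check against the 1% polymorphism threshold.
# Assumption: the threshold is applied to the allele frequency.


def minor_allele_frequency(n_ref_hom: int, n_het: int, n_alt_hom: int) -> float:
    """Frequency of the less common allele among all sampled chromosomes."""
    total_alleles = 2 * (n_ref_hom + n_het + n_alt_hom)
    if total_alleles == 0:
        raise ValueError("no genotypes supplied")
    # Heterozygotes contribute one alternate allele, homozygotes two.
    alt_frequency = (n_het + 2 * n_alt_hom) / total_alleles
    return min(alt_frequency, 1.0 - alt_frequency)


def is_polymorphism(frequency: float, threshold: float = 0.01) -> bool:
    """True when the variant is found at more than the threshold frequency."""
    return frequency > threshold


if __name__ == "__main__":
    maf = minor_allele_frequency(n_ref_hom=9800, n_het=190, n_alt_hom=10)
    print(f"MAF = {maf:.4f}, polymorphism: {is_polymorphism(maf)}")
```

In this example the alternate allele is seen on 210 of 20 000 chromosomes, just above the 1% cut-off; with 10 heterozygotes and no alternate homozygotes in the same sample size it would fall well below it.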
Depending on random genetic drift or natural selection, the frequency of a particular SNP in the population can change significantly over generations. Typically, there are two major alleles for a SNP locus (i.e. biallelic); however, any base position can potentially be altered to one of three other nucleotides (i.e. multiallelic), and as more personal genomes are sequenced, the evidence indeed supports the view that all base positions are multiallelic. SNPs are predominantly localized in the noncoding portion of the human genome; only c. 15% of over 650 million currently known SNPs map within genes.

A SNP that does not change the polypeptide sequence is termed synonymous (or a silent mutation); if it leads to a polypeptide sequence change it is described as nonsynonymous. It has been shown that SNPs that are not in protein-coding regions may still have functional consequences (e.g. they can generate splicing mutations, modify transcription factor binding sites or gene regulatory elements, or change the sequence of noncoding RNAs).

A combination of closely linked SNPs is defined as a haplotype. Haplotypes result from reduced recombination (crossing-over) between closely linked genetic markers during meiosis and are generally shared between different populations; however, their frequency can differ widely. The nonrandom association of SNPs is described as linkage disequilibrium. The identification of a large number of SNPs has enabled successful genome-wide association studies for risk alleles for susceptibility to common complex disease traits like diabetes and cancer (Table 3.2.4). Often such susceptibility variants map to noncoding regulatory regions.

Repetitive DNA elements

Variable number of tandem repeats (VNTR);
short tandem repeats (STRs)

Di-, tri-, and tetra-nucleotide repeats such as (GT)n, (CAA)n, or (GATA)n, respectively, have been referred to as microsatellites. They are very unstable and polymorphic genomic loci, thus useful in DNA fingerprinting and personal identification, population genetics, pedigree analysis, recombination, and linkage studies, as well as in determining paternity or parental origin of chromosomes.

Minisatellites are DNA segments that consist of a short series (10–100 bp) of GC-rich tandem repeats and are present at more than 1000 locations in the human genome.

Table 3.2.3  Summary of useful genomic websites

| Name | URL | Description |
|---|---|---|
| 1000 Genomes | http://www.1000genomes.org/ | The 1000 Genomes Project is the first project to sequence the genomes of a large number of people, to provide a comprehensive resource on human genetic variation. |
| ClinVar | http://www.ncbi.nlm.nih.gov/clinvar/ | ClinVar is a freely accessible, public archive of reports of the relationships among human variations and phenotypes, with supporting evidence. |
| DECIPHER, DatabasE of Chromosomal Imbalance and Phenotype in Humans | http://www.sanger.ac.uk/PostGenomics/decipher | Database of submicroscopic chromosomal imbalance; describes clinical phenotype associated with submicroscopic rearrangements |
| Database of Genomic Variants | http://projects.tcag.ca/variation/ | A curated catalogue of structural variation in the human genome |
| dbSNP | http://www.ncbi.nlm.nih.gov/projects/SNP/ | A public-domain archive for a broad collection of simple genetic polymorphisms |
| Ensembl | http://www.ensembl.org/ | Wellcome Trust funded software system which produces and maintains automatic annotation on selected eukaryotic genomes |
| GeneTests/GeneReview | http://www.genetests.org/ | A publicly funded medical genetics information resource developed for physicians, other healthcare providers, and researchers |
| Human Gene Mutation Database | http://www.hgmd.cf.ac.uk/ac/index.php? | HGMD constitutes a comprehensive core collection of data on germline mutations in nuclear genes underlying or associated with human inherited disease |
| OMIM, Online Mendelian Inheritance in Man | http://www.ncbi.nlm.nih.gov/sites/entrez?db=OMIM | A catalogue of human genes and genetic disorders authored and edited by Dr Victor A. McKusick and his colleagues at Johns Hopkins and elsewhere |
| UCSC Genome Bioinformatics | http://genome.ucsc.edu/ | This site contains the reference sequence and working draft assemblies for a large collection of genomes |

Retrotransposons

Alu elements are approximately 300 bp in size, present in about 1 million copies, and occupy approximately 10% of the human genome. The pathogenic function of Alu elements has been demonstrated to be exerted by two major mechanisms: insertional mutagenesis, utilizing an RNA intermediate to move or transpose Alu into exons or near splice junctions, and postinsertional 'activity' whereby they act as substrates to mediate rearrangements.

Alu elements that share high sequence identity can serve as potential homologous recombination substrates. However, most Alu elements share less than 97–98% sequence identity but can act as potential substrates for microhomology-mediated DNA replication errors via template switching, as in Fork Stalling and Template Switching (FoSTeS) and microhomology-mediated break induced replication (MMBIR). Often, recombination occurs between imperfectly matched substrates, a phenomenon described as homeology. Examples of diseases caused by Alu element insertions include haemophilia A and B and breast cancer due to disruption of the BRCA1 gene; diseases caused by Alu–Alu rearrangements involve the LDLR, FOXF1, and SPAST genes and the C1 inhibitor locus.

L1 elements are approximately 6 kb long, present in about 500 000 copies, and account for some 17% of the human genome. They are an important source of genomic variation. Like Alu repetitive elements, L1 elements use an RNA intermediate and can be mutagenic by insertion into genes.
Due to their abundance in the human genome, L1 elements that have high sequence identity can also stimulate and mediate nonallelic homologous recombination (NAHR). L1 elements have been shown to mutate the genes responsible, for example, for Alport syndrome, colon cancer, haemophilia, β-thalassemia, neurofibromatosis type 1, and Duchenne muscular dystrophy.

Dynamic mutations

Unlike the aforementioned DNA changes, which are usually transmitted through many generations without any change, more than 20 diseases have been described to date that are caused by unstable dynamic mutations occurring during DNA replication, repair, or recombination. Most of these mutations are represented by an expansion of a simple triplet or trinucleotide repeat sequence (e.g. CAG, CGG, CTG, and AAG) in coding regions (e.g. in Huntington disease) or in noncoding regions such as introns (e.g. in Friedreich ataxia), 5' untranslated regions (e.g. in fragile X syndrome), or 3' untranslated regions (e.g. in myotonic dystrophy). These triplet repeat diseases have been shown to be inherited as autosomal dominant (e.g. in myotonic dystrophy), autosomal recessive (e.g. in Friedreich ataxia), or X-linked (e.g. in fragile X syndrome) traits due to gain- or loss-of-function mutations.

The minimal number of disease-causing triplet repeats varies among different disorders, with 36 in Huntington disease and about 200 in fragile X syndrome. An intermediate number of repeats, lower than in affected individuals but greater than normal, is called a premutation. It has been shown that premutations can also have phenotypic effects. For example, an increased incidence of ovarian failure in females and a late-onset neurological disorder in males have been reported in individuals carrying premutations in the fragile X syndrome FMR1 gene. Premutations have the potential to expand during meiosis and thus manifest the disease in the next generation.
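The repeat thresholds above lend themselves to a small worked example: count the longest uninterrupted run of a repeat unit in a sequence and compare it with a disease threshold. The sketch below is illustrative only. The full-expansion thresholds (36 for Huntington disease, about 200 for fragile X syndrome) come from the text; the lower bound of the premutation range is not given here, so the value passed in the example is a labelled assumption, and both function names are hypothetical.

```python
# Illustrative sketch: size a tandem repeat tract and classify it.
# Real repeat sizing relies on dedicated laboratory assays; this only
# mirrors the counting logic.
import re
from typing import Optional


def longest_repeat_run(sequence: str, unit: str) -> int:
    """Number of units in the longest uninterrupted tandem run of `unit`."""
    pattern = re.compile(f"(?:{re.escape(unit.upper())})+")
    run_lengths = [m.end() - m.start() for m in pattern.finditer(sequence.upper())]
    return max(run_lengths, default=0) // len(unit)


def classify_repeat_count(
    count: int,
    full_threshold: int,
    premutation_threshold: Optional[int] = None,
) -> str:
    """Classify a repeat count against caller-supplied thresholds."""
    if count >= full_threshold:
        return "full expansion"
    if premutation_threshold is not None and count >= premutation_threshold:
        return "premutation"
    return "normal range"


if __name__ == "__main__":
    allele = "GG" + "CAG" * 40 + "TT"  # synthetic sequence, not real data
    n = longest_repeat_run(allele, "CAG")
    print(n, classify_repeat_count(n, full_threshold=36))
    # 55 is an assumed premutation lower bound used only for illustration.
    print(classify_repeat_count(90, full_threshold=200, premutation_threshold=55))
```

Keeping the thresholds as parameters rather than constants reflects the point made in the text that the minimal pathogenic repeat number differs from one disorder to another.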
The nucleotide expansion often occurs in a sex-specific manner and can be observed in a pedigree as a parent-of-origin effect. For example, expansions in fragile X syndrome arise during oogenesis and not in spermatogenesis. The number of pathogenic repeats can correlate inversely with the age of onset and directly with the severity of the disease; this provides a molecular explanation for the clinical phenomenon referred to as anticipation. Anticipation is observed in pedigree analysis as reduced age of onset in successive generations.

In Huntington disease, the expandable triplet repeat CAG encodes the amino acid glutamine. Individuals with Huntington disease have 36 or more CAG repeats, which leads to polyglutamine expansion with subsequent huntingtin protein misfolding, aggregation, and degradation that exert toxic effects upon neurons. Similar polyglutamine expansions have been reported in several other neurological diseases (e.g. in spinocerebellar ataxia type 1). Expansions of polyalanine tracts beyond a certain threshold have been described as pathogenic, for example in congenital malformations, skeletal dysplasia, and nervous system anomalies. Other pathogenic expansions have been shown to involve tetranucleotides (e.g. CCTG in myotonic dystrophy type 2), pentanucleotides (e.g. ATTCT in spinocerebellar ataxia type 10), and even dodecamers (e.g. CCCCGCCCCGCG in progressive myoclonus epilepsy of the Unverricht–Lundborg type).

Table 3.2.4  Genome-wide association studies: SNPs and complex traits

| Disease | Locus | Reference |
|---|---|---|
| Breast cancer | FGFR2 | Easton et al. (2007), Nature, 447, 1087–93; Hunter et al. (2007), Nat Genet, 39, 870–4 |
| Coronary heart disease | SNP, rs1333049, 9p21.3 | Samani et al. (2007), N Engl J Med, 357, 443–53 |
| Crohn disease | IRGM | Parkes et al. (2007), Nat Genet, 39, 830–2 |
| Diabetes | 12q24 (HNF1A), 12q13, 16p13 (CLEC16A), and 18p11 | Todd et al. (2007), Nat Genet, 39, 857–64 |
| Macular degeneration | CFH | Li et al. (2006), Nat Genet, 38, 1049–54; Maller et al. (2006), Nat Genet, 38, 1055–9 |
| Obesity | FTO | Dina et al. (2007), Nat Genet, 39, 724–6; Frayling et al. (2007), Science, 316, 889–94 |
| Prostate cancer | 8q24 (MYC, rs1447295, rs16901979, and rs6983267) | Gudmundsson et al. (2007), Nat Genet, 39, 631–7; Yeager et al. (2007), Nat Genet, 39, 645–9 |
| Rheumatoid arthritis | SNP, rs10499194, 6q23 | Plenge et al. (2007), Nat Genet, 39, 1477–82 |

Secondary DNA structures

Abnormal secondary DNA structures can also be mutagenic. Several DNA conformations, different from the canonical right-handed B-form, have been described. The best-known non-B DNA structures include triplexes, left-handed DNA, bent DNA, cruciforms, nodule DNA, flexible and writhed DNA, G4 tetrad (quadruplexes), slipped structures, and sticky DNA. Some of these structures have been described as pathogenic for more than 20 neurological and psychiatric diseases.

One of the best-known examples of the pathogenic role of non-B DNA structures involves the AT-rich cruciforms in proximal chromosome 22q11.2, which are responsible for genomic instability and susceptibility to the most common recurrent non-Robertsonian translocation in humans, t(11;22)(q23.3;q11.2).

Copy-number variants

In contrast to the c. 500 000 insertion or deletion polymorphisms less than 1 kb in size that have been well studied and annotated, little was known about polymorphic changes larger than this. The application of CMA to analyse the genomes of normal humans has led to the discovery of extensive genomic structural variation, ranging in size from thousands to millions of bases, which is not recognizable by chromosomal banding. These changes have been termed CNVs and result in deviation from the normal diploid state at a given locus. Deletions, duplications, triplications, quadruplications, insertions, or translocations can all result in CNVs. The total number, position, size, gene content, and population distribution of CNVs remain elusive.
Data are still evolving, but even several years ago estimates suggested approximate figures of 6000 CNVs in 4000 regions overlapping 1500 genes; most of these represent common variant CNVs and thus are not associated with disease. However, they may contribute to pathology as recessive alleles. CNVs may account for as much as 360 to 500 Mb and represent 12 to 20% of the human genome. These numbers may still represent a conservative estimate because CNVs ranging in size from 50 bp to 200 bp, those involving Alu and L1 variation at a single locus, and single-exon dropout alleles resulting from error-prone DNA replicative mechanisms have not been well ascertained on a genome-wide scale in different populations.

It is anticipated that with the wider application of higher-resolution CMA techniques, and of next-generation sequencing to determine individual diploid genomes, the amount of structural variation identified will increase significantly. The genomic distribution of CNVs has been shown to be nonrandom and correlates with exons, segmental duplications, and mobile elements such as Alu repetitive elements, probably reflecting their ongoing evolutionary role.

Like many other genomic rearrangements, CNVs can be inherited or sporadic. A commonly used and useful standard is to assume that de novo CNVs in association with sporadic clinical phenotypes are more likely to be disease causative. However, the phenotypic effects of CNVs are sometimes unclear and depend mainly on whether dosage-sensitive genes are affected by the genomic rearrangement. Some CNVs have been shown to be responsible for mendelian diseases, nonmendelian traits such as complex diseases, and common traits (including behavioural traits), or to represent benign polymorphic variation (Fig. 3.2.1; Table 3.2.5). CNVs have also been proposed to be a major factor responsible for human diversity and evolution.
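Because a CNV is defined above as a deviation from the normal diploid state, it can help to see how such a deviation looks numerically. The sketch below computes the idealized log2 ratio of observed to expected copy number, a quantity commonly reported by array-based assays. This is an assumption introduced for illustration: the text does not describe how CMA signal is reported, and real measurements are noisier than these ideal values. The function name is hypothetical.

```python
# Illustrative sketch: idealized log2(copy number / ploidy) for a locus.
# Assumes a pure, non-mosaic sample and a diploid baseline by default.
import math


def expected_log2_ratio(copy_number: int, ploidy: int = 2) -> float:
    """Ideal log2 ratio of a locus copy number relative to the baseline."""
    if copy_number < 0 or ploidy <= 0:
        raise ValueError("copy_number must be >= 0 and ploidy > 0")
    if copy_number == 0:
        return float("-inf")  # complete absence of the locus
    return math.log2(copy_number / ploidy)


if __name__ == "__main__":
    for copies in range(0, 5):
        print(copies, "copies ->", round(expected_log2_ratio(copies), 3))
```

The asymmetry is worth noting: losing one copy halves the dose, whereas gaining one copy raises it by only half, so single-copy gains produce a smaller shift than single-copy losses.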
CNVs have been catalogued in public databases such as the Toronto Database of Genomic Variants and the 1000 Genomes phase III studies. Clinically relevant CNVs can be found in DECIPHER (see Table 3.2.3).

Chromothripsis

Next-generation DNA sequencing of human tumours has led to the discovery of chromothripsis, a phenomenon of complex rearrangements in one or a few chromosomal loci that arise in a single catastrophic event. One proposed mechanism for chromothripsis is chromosome shattering with random reassembly in the subsequent interphase by nonhomologous end joining (NHEJ). A similar phenomenon has been observed in constitutional rearrangements associated with developmental disorders. Errors of replicative repair, in which DNA replication initiates serial, microhomology-mediated template switching, are proposed to produce such rearrangements in a process termed chromoanasynthesis.

Genomic disorders

Over the past two decades it has become evident that higher-order genomic architectural features can confer susceptibility to DNA rearrangements that are a frequent cause of disease in humans. Conditions that result from such rearrangements of the human genome have been referred to as genomic disorders.

Many genomic disorders occur sporadically, and these frequent events are often caused by de novo rearrangements. Various calculations have shown that the de novo locus-specific mutation rates for genomic rearrangements are between 10⁻⁴ and 10⁻⁵; this is at least 100- to 10 000-fold more frequent than point mutations (Table 3.2.6).

Genomic rearrangements can cause mendelian diseases (e.g. CMT1A, MODY5) or complex traits such as behaviours and intellectual disability, or may represent benign polymorphic changes. The major mechanism by which rearrangements convey phenotypes is altered gene dosage due to a variation in gene copy-number. When the deleted or duplicated region harbours a dosage-sensitive gene, the rearrangement will lead to an abnormal phenotype.
Other mechanisms include gene interruptions, gene fusions, position effects, and unmasking of coding-region variants or other functional SNVs in the second allele (Fig. 3.2.1).

For a few genomic disorders, significant differences in incidence have been observed in different world populations. In some of them, structural variations of the genomic region have been found in the patients' parents, demonstrating that variation of genomic architecture is a significant factor for disease susceptibility. For instance, submicroscopic genomic inversions can result in haplotype blocks (due to reduced recombination) and generate an architecture with directly oriented LCRs that can act as NAHR substrates. This can lead to susceptibility to deletion/duplication rearrangements only in those individuals within the population who harbour the inversion variant with the rearrangement-prone genome architecture (e.g. in Williams–Beuren syndrome or 17q21.31 microdeletion syndrome).
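The gene dosage mechanism can be illustrated with a short calculation. The following is a minimal sketch, assuming that the amount of gene product scales linearly with copy number at an autosomal locus (an idealization); the function name `relative_dosage` is hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: assumes gene product scales linearly with
# copy number, which is a simplification of real gene regulation.
NORMAL_COPIES = 2  # normal diploid state at an autosomal locus


def relative_dosage(copies: int) -> float:
    """Return the gene dosage relative to the normal diploid state."""
    if copies < 0:
        raise ValueError("copy number cannot be negative")
    return copies / NORMAL_COPIES


for label, copies in [
    ("heterozygous deletion", 1),
    ("normal", 2),
    ("duplication", 3),
    ("triplication", 4),
]:
    print(f"{label}: {copies} copies -> {relative_dosage(copies):.1f}x normal dosage")
```

Under this assumption a heterozygous deletion halves the dosage, whereas a duplication raises it by only half, which is one way to see why a dosage-sensitive gene may tolerate a gain better than a loss.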

Genomic alteration

Nonallelic homologous recombination

Many LCRs have a complex structure and have arisen during primate speciation over the last 25 to 40 million years as a result of serial segmental duplications. LCRs longer than 10 kb and of more than about 97% sequence identity can lead to local genomic instability. LCRs have been shown to stimulate and/or mediate constitutional (both recurrent and nonrecurrent), evolutionary, and somatic genomic rearrangements. When located less than 5 to 10 Mb from each other, LCRs can mediate NAHR and potentially result in unequal crossing-over. NAHR between directly oriented LCRs leads to deletions or reciprocal duplications of the genomic region located between them, and NAHR between inverted LCRs results in an inversion of the intervening genomic segment.

[Figure 3.2.1: five schematic panels, (A) gene dosage, (B) gene interruption, (C) gene fusion, (D) position effect, and (E) unmasking of a recessive allele or functional polymorphism; for each panel the figure lists example traits, NAHR substrates, diseases, and dosage-sensitive genes.]

Fig. 3.2.1 Schematic models for molecular mechanisms of genomic disorders. For each model, examples of trait, nonallelic homologous recombination (NAHR) substrate, and disease are shown. (a) Gene dosage, where there is a dosage-sensitive gene within the rearrangement; (b) gene interruption, wherein the rearrangement breakpoint disrupts a gene; (c) gene fusion, whereby a fusion gene is created at the breakpoint that either fuses coding sequences or a novel regulatory sequence to the gene. For example, two genes encoding cytochrome P450 enzymes, CYP11B2 (aldosterone synthase) and ACTH-regulated CYP11B1 (11-β-hydroxylase, cortisol biosynthesis), located on chromosome 8q21 are 45 kb apart and have 10 kb segments of 95% sequence identity. NAHR between these two genes results in a chromosome deletion, yielding a fusion hybrid CYP11B1/CYP11B2 gene. CYP11B1/CYP11B2 is under the regulation of ACTH and leads to glucocorticoid-remediable aldosteronism (GRA, MIM 103900). All symptoms of the disease can be normalized by the administration of glucocorticoid analogues and are exacerbated by administration of ACTH; (d) position effect, in which the rearrangement has effects on expression/regulation of a gene near the breakpoint, potentially by removing or altering a regulatory sequence; and (e) unmasking recessive allele, where a deletion results in hemizygous expression of a recessive mutation or further uncovers/exacerbates effects of a functional polymorphism. In each model, both chromosome homologues are depicted as horizontal lines. The rearranged genomic interval is enclosed by brackets. Dashed lines indicate genomic regions either deleted or duplicated, an absent line indicates deletion with phenotypic effects from the remaining allele unmasked because of the rearrangement, and a dotted line represents deletion but where phenotypic effects result from the absence of interactions between alleles. A gene is depicted by a filled horizontal rectangle, while a regulatory region is shown as a hatch-marked square. Asterisks denote point mutations. CMT1A, Charcot–Marie–Tooth disease type 1A; DFNB3, deafness, neurosensory, autosomal recessive 3; HNPP, hereditary neuropathy with liability to pressure palsies; ID, intellectual disability; PWS, Prader–Willi syndrome; SMS, Smith–Magenis syndrome; PMD, Pelizaeus–Merzbacher syndrome; PTLS, Potocki–Lupski syndrome; SoS, Sotos syndrome.
Adapted from Lupski JR, Stankiewicz P (2005). Genomic disorders: molecular mechanisms for rearrangements and conveyed phenotypes. PLoS Genet, 1, e49.

Table 3.2.5 CNVs and complex traits
Disease | Gene | Reference
Alzheimer disease | APP | Rovelet-Lecrux et al. (2006), Nat Genet, 38, 24–6
Chronic pancreatitis | PRSS1 | Le Maréchal et al. (2006), Nat Genet, 38, 1372–4
Crohn disease | IRGM | McCarroll et al. (2008), Nat Genet, 40, 1107–12
Lupus with glomerulonephritis | FCGR3B | Aitman et al. (2006), Nature, 439, 851–5
Parkinson disease | SNCA | Singleton et al. (2003), Science, 302, 841
Systemic lupus erythematosus | Complement C4 component | Yang et al. (2007), Am J Hum Genet, 80, 1037–54

In LCRs
of a more complex genomic structure consisting of both direct and inverted subunits, distinct portions can serve as NAHR substrates leading to deletions/duplications or inversions, respectively.

Recombination hot spots

Interestingly, the strand exchanges for NAHR are not scattered throughout the length of homology within LCRs, but cluster in recombination hot spots. Normal allelic homologous recombination, like NAHR, is characterized by hot spots and cold spots throughout the genome. A meiosis-specific histone methyltransferase, PR domain zinc finger protein 9 (PRDM9), has been shown to recognize a 13-mer motif, CCNCCNTNNCCNC, enriched at human hot spots and to cause histone H3 lysine 4 trimethylation. This reorganization is thought to be associated with an increased probability of recombination. Interestingly, sequence variation in the PRDM9 recognition sequence may be responsible for hot spot differences between species. Moreover, variation of PRDM9 in human populations has been directly linked to recombination rates at NAHR-generated duplications and deletions.

Microdeletion and microduplication syndromes

Two common autosomal dominant peripheral neuropathies, CMT1A and hereditary neuropathy with liability to pressure palsies (HNPP), are among the first and best-characterized genomic disorders. CMT1A and HNPP are caused in the vast majority of cases by copy-number change of a dosage-sensitive myelin gene, PMP22, as a result of reciprocal duplication and deletion, respectively, of an approximately 1.4 Mb genomic fragment within 17p12. This genomic segment is flanked by two LCRs of about 24 kb, approximately 98.7% identical, termed the proximal CMT1A-REP and the distal CMT1A-REP, which serve as substrates for NAHR.

The proximal chromosome 17p also harbours another meiotically unstable genomic region with a haploinsufficient RAI1 gene.
Deletions and point mutations of RAI1 result in Smith–Magenis syndrome (SMS), a disorder with multiple congenital anomalies and intellectual disability characterized by minor craniofacial and skeletal anomalies such as brachycephaly, frontal bossing, synophrys, midfacial hypoplasia, short stature, and brachydactyly, neurobehavioral abnormalities such as aggressive and self-injurious behaviour and sleep disturbances, and ophthalmic, otolaryngological, cardiac, and renal anomalies. A genomic segment of approximately 4 Mb encompassing RAI1 and flanked by large, complex, highly identical, and directly oriented, proximal (approximately 256 kb) and distal (approximately 176 kb) LCRs termed SMS-REPs is deleted in 70 to 80% of patients with Smith–Magenis syndrome via the NAHR mechanism (i.e. common recurrent deletion). The remainder of CNVs in these patients are mediated by NAHR using alternate flanking LCRs as substrates (uncommon, recurrent) or due to nonrecurrent rearrangements mediated by template switching DNA replication error mechanisms. The latter CNVs have been recently shown to be often accompanied by megabase long hypermutation clusters of SNVs.

The reciprocal duplication dup(17)(p11.2p11.2) of this region has been described in patients with Potocki–Lupski syndrome (PTLS). Clinical features observed in patients with this syndrome are distinct from those seen with Smith–Magenis syndrome and include infantile hypotonia, failure to thrive, intellectual disability, autistic features, sleep apnoea, and structural cardiovascular anomalies.

Other well-characterized microdeletion syndromes include Williams–Beuren syndrome (7q11.23), Prader–Willi and Angelman syndromes (15q11.2q12), DiGeorge/velocardiofacial syndrome (22q11.2), microdeletion 17q21.31 syndrome, and Sotos syndrome (5q35). For all these microdeletions, the reciprocal microduplications predicted by the NAHR model have been reported.
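The prediction made by the NAHR model can be written down as a simple rule. The sketch below is illustrative only: it treats the approximate figures quoted for NAHR substrates (LCRs longer than 10 kb, more than about 97% identical, and less than 5 to 10 Mb apart) as hard cut-offs, which is an assumption, and the names `LcrPair` and `predict_nahr_outcome` are hypothetical.

```python
from dataclasses import dataclass

# Approximate figures quoted in the text; using them as strict
# thresholds is an assumption made for this sketch.
MIN_LENGTH_KB = 10.0
MIN_IDENTITY_PCT = 97.0
MAX_DISTANCE_MB = 10.0  # upper end of the quoted 5 to 10 Mb range


@dataclass
class LcrPair:
    length_kb: float     # length of the repeats
    identity_pct: float  # sequence identity between the two copies
    distance_mb: float   # distance separating the two copies
    orientation: str     # "direct" or "inverted"


def predict_nahr_outcome(pair: LcrPair) -> str:
    """Return the rearrangement expected if the pair mediates NAHR."""
    if pair.orientation not in ("direct", "inverted"):
        raise ValueError("orientation must be 'direct' or 'inverted'")
    is_substrate = (
        pair.length_kb > MIN_LENGTH_KB
        and pair.identity_pct > MIN_IDENTITY_PCT
        and pair.distance_mb < MAX_DISTANCE_MB
    )
    if not is_substrate:
        return "NAHR unlikely"
    if pair.orientation == "direct":
        return "deletion or reciprocal duplication"
    return "inversion"


# CMT1A-REPs as described in the text: about 24 kb, approximately 98.7%
# identical, directly oriented, flanking an approximately 1.4 Mb segment.
cmt1a_reps = LcrPair(length_kb=24, identity_pct=98.7, distance_mb=1.4, orientation="direct")
print(predict_nahr_outcome(cmt1a_reps))
```

For the CMT1A-REP pair the rule returns a deletion or reciprocal duplication, matching the HNPP deletion and CMT1A duplication described for 17p12.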
Typically, the phenotypic manifestation of microduplication syndromes is milder than that of their reciprocal microdeletion counterparts. In chromosome duplications, the increase from 2 to 3 in gene copy-number results in a 1.5-fold increase (50% change) in the amount of protein, versus the decrease from 2 to 1 in gene copy-number leading to a twofold reduction (100% change) in the amount of protein in the reciprocal deletions.

Mirror traits

For some genomic regions in humans, deletion and reciprocal duplication CNVs have been found in patients with opposing phenotypes. Deletion of 1q21.1 was associated with microcephaly and schizophrenia, whereas 1q21.1 duplication was found in patients with macrocephaly and a trend towards autism. Conversely, deletion in 16p11.2 was identified in association with macrocephaly and autism, while duplication of 16p11.2 was associated with microcephaly and schizophrenia.

Table 3.2.6 New mutation rates for genomic rearrangements
Rearrangement hot spot | Direct measure: deletion | Direct measure: duplication | Method (direct) | Indirect estimate | Method (indirect)
CMT1A-REP | 4.2 × 10⁻⁵ | 1.73 × 10⁻⁵ | Real-time PCR on sperm DNA | 1.7–2.6 × 10⁻⁵ | Prevalence + molecular
AZFa-HERV | 2.16 × 10⁻⁵ | 5.26 × 10⁻⁶ | Real-time PCR on sperm DNA | |
LCR17p | 1.87 × 10⁻⁶ | 8.74 × 10⁻⁷ | Real-time PCR on sperm DNA | |
WBS-LCR | 9.55 × 10⁻⁶ | 4.54 × 10⁻⁶ | Real-time PCR on sperm DNA | 2.0–12.5 × 10⁻⁵ | Prevalence
DGS/VCFS; SMS | | | | 2.0–12.5 × 10⁻⁵ | Prevalence
DMD | | | | 1.0 × 10⁻⁴ (deletion); 1.0 × 10⁻⁴ (duplication) | Prevalence + molecular
α-Globin | 4.2 × 10⁻⁵ | | Sperm PCR | |
t(11;22) | 1.2–9.5 × 10⁻⁵ (translocation) | | Sperm PCR | |
Normal controls | 1.7 × 10⁻⁶ | 1.7 × 10⁻⁶ | Array CGH of trios | |
Adapted from Turner DJ, et al. (2008). Germline rates of de novo meiotic deletions and duplications causing several genomic disorders. Nat Genet, 40, 90–5 and Lupski JR (2007). Genomic rearrangements and sporadic disease. Nat Genet, 39, S43–7.

Moreover, patients with SMS were observed to be overweight with high body mass index (BMI), whereas those with PTLS due to dup17p11.2 are usually underweight, which
was recapitulated in the mouse models of SMS and PTLS. These mirror trait findings are consistent with a theory regarding evolution of the social brain which posits that autism and schizophrenia represent opposing phenotypic extremes of normal human behaviour.

Nonhomologous end joining and Fork Stalling and Template Switching (FoSTeS)

The NAHR/low-copy repeat mechanism is the most prevalent and is responsible for the common recurrent deletions, duplications, or inversions. Nonrecurrent rearrangements have been shown in selected cases to arise by the NHEJ mechanism, where the low-copy repeats, if present, stimulate but do not mediate the recombination events.

DNA replication errors have been shown to play an important role in the origin of some genomic disorders due to nonrecurrent rearrangements. These models all incorporate the concept of template switching during DNA replication: short-distance switching within a replication fork, and long-distance template switching as embodied in the Fork Stalling and Template Switching (FoSTeS) model.

The microhomology-mediated break-induced replication (MMBIR) model was initially proposed based on data from E. coli, yeast, and human genome rearrangements not readily explained by the current models for generating rearrangements. This model appears to account for complex genomic rearrangements due to iterative template switches.

Chromosome aberrations

Variations of the human genome larger than about 5 Mb in size can be visible in the light microscope and are referred to as chromosome aberrations. Chromosome aberrations are frequent events, with the total incidence estimated as 1 in approximately 160 live births. They can be categorized as numerical or structural abnormalities. Numerical abnormalities (1 in approximately 250 newborns) are observed more frequently than structural ones (1 in approximately 375 newborns).

Numerical aberrations

Deviations from the normal chromosome number are usually unbalanced and defined as aneuploidy.

Triploidies and tetraploidies

Triploid (3n) complements of chromosomes, 69,XXX, 69,XXY, or 69,XYY, typically result from an egg being fertilized by two sperm. Tetraploid (4n) sets, 92,XXYY or 92,XXXX, are caused by a failure in zygote division. Both triploid and tetraploid states are lethal.
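The relation between the chromosome count in a karyotype and the type of numerical aberration can be sketched in a few lines. This is a minimal illustration, assuming simple karyotype strings of the form "count,sex chromosomes" (such as 69,XXY) and a haploid number of 23; the function name `classify_chromosome_count` is hypothetical, and full karyotype nomenclature is far richer than this parser handles.

```python
HAPLOID_NUMBER = 23  # n for the human genome


def classify_chromosome_count(karyotype: str) -> str:
    """Classify a simple karyotype string such as '69,XXY' by its count.

    Only the leading chromosome count is inspected; structural
    rearrangements and mosaicism are ignored in this sketch.
    """
    count = int(karyotype.split(",")[0].strip())
    if count <= 0:
        raise ValueError("chromosome count must be positive")
    if count % HAPLOID_NUMBER == 0:
        sets = count // HAPLOID_NUMBER
        names = {1: "haploid", 2: "diploid", 3: "triploid", 4: "tetraploid"}
        return names.get(sets, f"{sets}n polyploid")
    return "aneuploid"


for k in ["46,XX", "69,XXY", "92,XXYY", "47,XXY", "45,X"]:
    print(k, "->", classify_chromosome_count(k))
```

Counts that are whole multiples of 23 are reported as complete sets (2n, 3n, 4n), and any other count is reported as aneuploid.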
Trisomies, monosomies

The more commonly observed aneuploidies that result from numerical changes of a single chromosome, trisomies and monosomies, are caused by chromosome nondisjunction in meiosis I, or sometimes in meiosis II, and in some cases (particularly those involving the sex chromosomes) are compatible with life. Although the most frequent aneuploidy in humans is trisomy 21 in patients with Down syndrome (1 in 670 newborns), aneuploidies of sex chromosomes are more frequent (1 in 440 newborns) than those involving autosomes (1 in 700 newborns). This is due to the fact that, in addition to trisomy 21, only trisomies of chromosome 18 (Edwards syndrome, 1 in 7500 newborns) and chromosome 13 (Patau syndrome, 1 in 22 700 newborns) are compatible with life. Although approximately 99% of fetuses with monosomy X are spontaneously aborted, patients with Turner syndrome account for 1 in 4000 female newborns. Very often, however, the 45,X cell line is mosaic, accompanied by another cell line with either a normal chromosome complement or structural rearrangements of chromosome X (e.g. deletion of the short arm, ring chromosome, or isochromosome of the long or short arms). The most frequent aneuploidy during fetal life, trisomy 16 (one-third of all trisomies), leads to early miscarriages and is not identified in live newborns. The karyotype 47,XXY is found in patients with Klinefelter syndrome (1 in 1000 male newborns).

Marker chromosomes

Marker chromosomes (SMCs) are small supernumerary chromosomes and are detected with a frequency of 0.24/1000 in newborns, 0.4 to 1.5/1000 in prenatal studies, 2 to 3/1000 among phenotypically abnormal individuals, and 0.5/1000 in the general population. Marker chromosomes are usually derived from acrocentric autosomes (c. 85%), and particularly from chromosome 15 (some 40–50%). The risk of an abnormal phenotype in de novo cases has been estimated to be about 28%.
The severity of the phenotype depends on the size of the marker chromosome and the extent of mosaicism.

Structural chromosome aberrations

Deletions and duplications

Chromosome deletions involving autosomes lead to structural and functional monosomies of the missing genomic material. In XY males, deletions of sex chromosomes result in structural and functional nullisomies. The phenotypic manifestation of a deletion is caused by the haploinsufficient gene(s) located in the deleted fragment or disrupted by the deletion breakpoint (Fig. 3.2.1). If more than one haploinsufficient gene is present in the deleted region, the abnormality is referred to as a contiguous gene deletion syndrome, as in the Potocki–Shaffer syndrome (11p11.2) or Langer–Giedion syndrome (8q23q24). Many of the smaller deletions in the unstable genomic regions have been shown to have the same size and are recurrent. These microdeletion genomic disorders are usually caused by NAHR between directly oriented low-copy repeats and are frequent events (Table 3.2.6).

Reciprocal translocations

Reciprocal translocation is defined as an exchange of chromosome segments between two chromosomes (homologous or non-homologous). Balanced reciprocal translocations are found in approximately 1 in 600 individuals; hence, approximately 1 in 300 couples are at risk of unbalanced progeny. In most cases, balanced reciprocal translocations are not associated with an abnormal phenotype; however, it has been shown that up to 40% of the apparently balanced reciprocal chromosome translocations in patients with abnormal phenotype are accompanied by a chromosome imbalance either at the translocation breakpoint or elsewhere in the genome. Balanced translocations can also have clinical consequences for normal individuals.
Depending on the type of meiotic segregation and the size of the translocated chromosome material, the unbalanced meiotic products of the segregating translocation chromosomes can result in chromosome imbalance and be associated with either spontaneous abortions or births of affected children.
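The step from a carrier frequency of about 1 in 600 individuals to about 1 in 300 couples can be checked with a short calculation. This is a minimal sketch, assuming that the two partners are carriers independently of each other and that a couple is at risk when at least one partner is a carrier; the function name `couple_risk` is hypothetical.

```python
from fractions import Fraction

# Carrier frequency of balanced reciprocal translocations quoted in the text.
CARRIER_FREQUENCY = Fraction(1, 600)


def couple_risk(carrier_frequency: Fraction) -> Fraction:
    """Probability that at least one of two independent partners is a carrier."""
    return 1 - (1 - carrier_frequency) ** 2


risk = couple_risk(CARRIER_FREQUENCY)
print(f"at-risk couples: about 1 in {float(1 / risk):.0f}")
```

Because the carrier frequency is small, the result is very close to twice the individual frequency, which is where the figure of approximately 1 in 300 couples comes from.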

The products of reciprocal chromosome translocation can be transmitted to progeny in a balanced or unbalanced form as a consequence of alternate or adjacent segregation.

In the vast majority of cases, reciprocal translocations appear to be random events. However, two of the most common constitutional non-Robertsonian translocations in humans have been shown to result from specific genomic architectural features predisposing to recurrent events; the breakpoints of translocation t(11;22)(q23.3;q11.2) utilize AT-rich cruciforms, whereas low-copy repeats on 4p, 8p, 11p, and 12p mediate the translocations t(4;8)(p16;p23) (olfactory receptor-gene clusters), t(4;11)(p16.2;p15.4), and t(8;12)(p23.1;p13.31). Genomic architecture involving low-copy repeats has also been shown to play a role in the formation of the most frequent somatic chromosome abnormality found in chronic myeloid leukaemia, the translocation der(22)t(9;22)(q34;q11) (the Philadelphia chromosome).

Robertsonian translocations

Translocation between two acrocentric chromosomes (13, 14, 15, 21, or 22), with breakpoints occurring in the short arms within or close to the centromere, is defined as Robertsonian translocation or centric fusion. Inverted repeats in acrocentric short arms have been proposed to mediate Robertsonian translocation.

One in approximately 900 newborns carries a Robertsonian translocation, making it the most common chromosome rearrangement in humans. In some cases, the rearrangement involving long arms of one chromosome is not a product of the centric fusion between two homologous chromosomes but a consequence of replication of one chromosome arm, and thus represents an isochromosome.

The karyotype of the carrier of a Robertsonian translocation is balanced and consists of 45 chromosomes (the acentric heterochromatic short arms contain no genes and are lost during cell division).
All combinations of acrocentric chromosomes have been found; however, translocations between chromosomes 13 and 14 or 14 and 21 are most prevalent, with the Robertsonian translocation 13;14 being the most common chromosome aberration in humans (1 in 1300). Carriers of Robertsonian translocation have a significantly increased risk of abnormal progeny; for example, carriers of translocation 21q21q have an almost 100% chance of having a child with Down syndrome.

The carriers of Robertsonian translocation are also at increased risk of having offspring with uniparental disomy for the acrocentrics involved in the rearrangement due to the trisomy rescue mechanism (see earlier). Uniparental disomy has clinical consequences for carriers of Robertsonian translocations involving acrocentric chromosomes 14 and 15, which are known to contain imprinted genes.

Insertions

A nonreciprocal translocation of DNA material from one chromosome arm into another arm is described as an insertion or insertional translocation. The carrier of a balanced insertion has up to a 50% chance of abnormal progeny.

Inversions

An inversion is defined as a double-break chromosome rearrangement, in which a segment of a chromosome is reversed and reinserted back into the chromosome. Some inversions (particularly those on chromosome 8p) have been shown to be mediated by a specific genomic architecture involving low-copy repeats in an inverted orientation.

When the inverted fragment contains the centromere, the inversion is described as pericentric. The recombination products of the pericentric inversion are a chromosome with a terminal deletion of one chromosome arm and a terminal duplication of the second arm. Paracentric inversions do not include the centromere; both breaks occur in one arm of the chromosome. The product of the paracentric inversion is either an acentric or dicentric chromosome; in both cases it is unstable and usually a lethal event.
Typically, inversions are balanced; however, occasionally imbalances are found at their breakpoints. In addition, an inversion breakpoint can disrupt a dosage-sensitive gene (e.g. the most common cause of severe haemophilia A, representing over 40% of cases), resulting in an abnormal phenotype, or convey a phenotype because of a position effect.

Complex chromosome rearrangements

When more than two breakpoints involve two or more chromosomes, the resulting aberration is referred to as a complex chromosome rearrangement. These usually arise in spermatogenesis but are more often transmitted to subsequent generations through oogenesis.

Ring chromosomes

Ring chromosomes are usually formed when two chromosome arms break and fuse, thus forming a circular structure. Rings are often associated with abnormal phenotypes because of loss of genomic material at one or both chromosome ends. In rare cases, the breaks occur on one chromosome arm and the resulting ring chromosomes do not contain alphoid centromeres. Such acentric rings can generate neocentromeres from euchromatic material and can be transmitted to the daughter cells. Rings are mitotically unstable, are often found in a mosaic state, and can form double ring structures as a result of crossing-over events.

Isochromosomes

When one chromosome arm is lost and the other is duplicated, the resulting mirror-image chromosome is called an isochromosome. When the breakpoint is within the centromere (centromere misdivision), the resulting isochromosome is monocentric and stable. If the original chromosome breaks outside the centromere, the derivative chromosome product is dicentric and thus unstable. To stabilize such a chromosome, one of the centromeres becomes inactive. Such chromosomes are then called pseudodicentric (pseudoisodicentric). Clinically relevant isochromosomes include, for example, isochromosomes of the long arms of chromosome X found in patients with Turner syndrome.
Moreover, an isodicentric chromosome idic(17)(p11.2) occurring as a somatic event is frequently found in chronic myeloid leukaemia and in childhood primitive neuroectodermal tumours. The idic(17)(p11.2) is recurrently formed utilizing large cruciform structures containing some 38 to 49 kb low-copy repeats of approximately 99.8% identity localized in the Smith–Magenis syndrome common deletion region in chromosome 17p11.2. The specific mechanism is NAHR using inverted LCRs located on non-sister chromatids.

Centromere fission

Very rarely, as a result of centromere misdivision, the short arms of a chromosome are separated from its long arms and after replication
form two isochromosomes, representing a balanced rearrangement. Such events are known as centromeric fission.

Heterochromatin variants

In addition to aberrations involving euchromatin, nonpathogenic variations of heterochromatin are often seen in karyotype analysis. The most common polymorphisms involve differences in size of satellite DNA of the short arm of acrocentric chromosomes and size or location of heterochromatin in 1qhet, 9qhet, 16qhet, and Yqhet.

Chromosome mosaicism

The presence of two or more different chromosome complements in one individual is defined as chromosomal mosaicism. Somatic chromosomal mosaicism is a well-known cause of birth defects, intellectual disability, and, in some instances, specific genetic syndromes such as hypomelanosis of Ito and Pallister–Killian syndrome (tetrasomy 12p). Chromosomal mosaicism is found in up to 50% of embryos at the eight-cell stage and up to 75% of blastocysts. The most common cause of chromosomal mosaicism is chromosome nondisjunction followed by trisomy rescue in a subpopulation of cells.

Routine clinical G-banded karyotype analysis is performed in peripheral blood T lymphocytes stimulated to divide by phytohaemagglutinin. Thus, only a subpopulation of nucleated cells, and only those healthy enough to respond to stimulation, are expanded and examined. Application of array comparative genomic hybridization (array CGH) technology to genomic DNA extracted directly from uncultured peripheral blood has enabled the identification of mosaic chromosome abnormalities that were undetected by conventional karyotype analysis. Thus, array CGH has enabled better detection of mosaicism of unbalanced chromosome abnormalities than traditional cytogenetic techniques.
Genetic and genomic analyses

The pathogenic abnormalities in the human genome vary in size from SNVs (locus-specific mutation rates approximately 10⁻⁶ to 10⁻⁸) to CNVs involving entire genes (mutation rates 10⁻⁴ to 10⁻⁵) to microscopically visible chromosome aberrations (found in 1 in 160 newborns). Despite the broad spectrum of techniques that have been developed recently to analyse the human genome, no single method can identify all types of genetic and genomic variation (Fig. 3.2.2).

Single nucleotide changes and next-generation sequencing

Point mutations are commonly analysed using conventional DNA sequencing with polymerase chain reaction (PCR) amplification followed by chain termination with fluorescently labelled dideoxynucleotides. However, this method is low-throughput and relatively expensive.

The large numbers of SNPs examined in genome-wide association studies are currently analysed using hybridization-based oligonucleotide microarrays (Table 3.2.4). The available technologies (Affymetrix, Illumina) enable analysis of more than 1 million SNPs in one experiment.

[Figure 3.2.2: diseases and traits (Down, Edwards, Turner, Charcot–Marie–Tooth disease type 1A, Prader–Willi, DiGeorge, Smith–Magenis, Potocki–Lupski, Friedreich ataxia, Huntington, haemophilia A, Apert, β-thalassaemia, colon cancer, breast cancer, fragile X, cystic fibrosis, recurrent translocation t(11;22)(q23;q11)) arranged along a logarithmic axis of mutation size from 10⁰ to 3 × 10⁹ bp, with mutation types (point mutations, SNPs, dynamic mutations, DNA repeats, retrotransposons, LCRs, CNVs, non-B DNA) and analysis methods (DNA sequencing, PCR, Southern analysis, PFGE, FISH, array CGH, chromosome banding) shown beneath.]

Fig. 3.2.2 Genomic rearrangements, phenotypic traits, and methods used to assay. Above are shown the traits that can be due to DNA rearrangements. Below are ranges of DNA changes, descriptions of rearrangements, and the methods of assaying different sized changes.
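The locus-specific mutation rates quoted at the start of this section for SNVs and for CNVs can be compared directly. The sketch below is illustrative only; it works with the powers of ten given in the text to avoid floating-point rounding, and the names used are hypothetical.

```python
# Powers of ten for the approximate locus-specific mutation rates in the text.
SNV_RATE_EXPONENTS = (-6, -8)  # point mutations: 10^-6 to 10^-8
CNV_RATE_EXPONENTS = (-4, -5)  # rearrangements:  10^-4 to 10^-5


def fold_difference(cnv_exponent: int, snv_exponent: int) -> int:
    """How many times more frequent a rearrangement is than a point mutation."""
    return 10 ** (cnv_exponent - snv_exponent)


for cnv in CNV_RATE_EXPONENTS:
    for snv in SNV_RATE_EXPONENTS:
        print(f"10^{cnv} vs 10^{snv}: {fold_difference(cnv, snv)}-fold")
```

With the upper rearrangement rate of 10⁻⁴, the comparison spans 100-fold to 10 000-fold, depending on which end of the point mutation range is used.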

Development of massively parallel next-generation sequencing technologies, along with the bioinformatic pipelines to analyse large data sets, has enabled successful research and clinical implementation of ES. This enables robust, accurate, fast, and cost-effective DNA sequencing of the entire coding portion of the genome in one assay. The volume of next-generation sequencing data being generated and the numerous variants identified are challenging to interpret. However, data sharing among large databases and the 1000 Genomes Project have substantially facilitated appropriate classification of variants and discovery of new disease-causing genes. As a result, ES, which has a diagnostic rate of 20–30%, has revolutionized the diagnosis of mendelian diseases.

Most of the identified pathogenic variants are in autosomal dominant traits, demonstrating an important role of de novo germline point mutations in both rare and common genetic disorders associated with reduced fitness. WGS has revealed that the average genome harbours 50 to 100 de novo point variants; in ES trio analyses, on average, up to five apparent de novo SNVs (1–2 nonsynonymous and 2–3 synonymous) have been identified per exome. The frequency of de novo variants increases with paternal age at a rate of about one new paternally derived variant for every 2 years past the age of 30 years. These variants are readily identified using a parent-child family trio-based exome sequencing approach.

Next-generation sequencing technologies also allow for analysis of the entire genome from a single cell as well as of cell-free DNA from maternal serum. The latter is extremely useful in noninvasive prenatal screening for the most common aneuploidies.
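The paternal age effect described above amounts to a simple linear relation. The following is a rough sketch, assuming the quoted rate (about one additional paternally derived de novo variant per 2 years past the age of 30) holds linearly and that no additional variants are counted at or below 30 years; the function name is hypothetical and the result is an approximation, not a clinical estimate.

```python
BASELINE_AGE_YEARS = 30   # age past which the quoted rate applies
YEARS_PER_VARIANT = 2     # about one additional variant per 2 years


def additional_paternal_variants(paternal_age_years: float) -> float:
    """Approximate extra paternally derived de novo variants past age 30."""
    extra_years = max(0.0, paternal_age_years - BASELINE_AGE_YEARS)
    return extra_years / YEARS_PER_VARIANT


for age in (25, 30, 40, 50):
    print(f"paternal age {age}: about {additional_paternal_variants(age):.0f} additional variants")
```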
Mutational burden and dual molecular diagnoses

In addition to the aforementioned digenic or triallelic inheritance, and the two-hit (or second-hit) model, the severity of a disease manifestation can be influenced by abnormal copy number of dosage-sensitive genes. An intrafamilial increase in the copy number of PMP22, due to an NAHR-mediated change from duplication to triplication, has been associated with more severe atrophy of the lower leg and hand muscles, and severe pes cavus deformity due to decreased nerve conduction velocity. A similar effect on phenotypic severity has been reported for CHRNA7 triplications on chromosome 15q13.3, STS triplications on Xp22.31, and homozygous duplication (tetrasomy) of the DiGeorge syndrome critical region on chromosome 22q11.2.

Moreover, systematic aggregate ES analyses in multiple unrelated families with CMT-like peripheral neuropathy refractory to previous molecular diagnosis revealed a significantly increased number of rare variants across 58 neuropathy-associated genes in subjects versus controls, suggesting that the combinatorial effect of rare variants contributes to disease burden and variable expressivity.

In contrast to conventional genetic analyses, genomic approaches to disease, such as ES of a large cohort of subjects, have revealed that two or more pathogenic variants can be found in as many as 5% of patients when compared with unrelated control individuals. These studies also facilitated dual molecular diagnoses of ‘blended mendelian phenotypes’ as well as explanation of intrafamilial clinical variability by multilocus variation among affected siblings. By deconvolution of the complicated phenotypic presentations due to coexistence of multiple genetic conditions, they demonstrated how combinatorial effects of rare variants contribute to disease burden. In dual molecular diagnoses, each disease will segregate according to its known associated trait.
This is in contrast with digenic inheritance, whereby both heterozygous variants are required for disease manifestation.

Detecting genome structural changes

Genomic rearrangements such as deletions, duplications, or inversions that are up to 30 kb in size can be detected using the polymerase chain reaction or Southern blot hybridization. More recently, small genomic rearrangements have also been detected using next-generation sequencing and digital droplet PCR (ddPCR).

Large visible chromosome rearrangements can be analysed using the light microscope by conventional banding techniques (most often G-banding). The detection of genomic changes between 30 kb and 5 Mb in size had remained beyond the level of resolution of available methods until the development of fluorescence in situ hybridization (FISH) techniques. Likewise, pulsed-field gel electrophoresis enabled the resolution of genomic changes of similar magnitude. However, both these technologies are still limited to the examination of specific genomic regions (i.e. they represent locus-specific tests).

The development of array CGH and SNP arrays has enabled high-resolution screening of genomic imbalances throughout the entire genome. The level of resolution depends on the size of, and distance between, the arrayed interrogating probes. Initially, large genomic clones (bacterial or P1 artificial chromosomes) were immobilized and arrayed on glass slides and used as interrogating probes. Such microarrays enabled detection of CNVs throughout the entire human genome with a resolution of approximately 100 kb. The bacterial clones have since been replaced by oligonucleotide probes. The currently commercially available arrays have several hundred thousand to millions of oligonucleotide probes. This technology has revolutionized clinical cytogenetics and may replace much of chromosome analysis with high-resolution genome analysis (Fig. 3.2.3).
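The size ranges described above can be summarized as a simple lookup. The sketch below is only an illustration of that reasoning: it treats the 30 kb and 5 Mb figures as hard cut-offs and uses the approximately 100 kb resolution quoted for clone-based arrays as the lower bound for array analysis. All of these are simplifying assumptions (oligonucleotide arrays resolve smaller changes, and arrays detect imbalances only, not balanced inversions). The function name is hypothetical.

```python
KB = 1_000
MB = 1_000_000


def candidate_methods(size_bp: int) -> list[str]:
    """Suggest assay types for a rearrangement of the given size.

    Thresholds follow the text; treating them as sharp boundaries
    is an assumption made for illustration.
    """
    if size_bp <= 0:
        raise ValueError("size must be a positive number of base pairs")

    methods: list[str] = []
    if size_bp <= 30 * KB:
        # Small rearrangements: PCR or Southern blot hybridization.
        methods += ["PCR", "Southern blot hybridization"]
    elif size_bp <= 5 * MB:
        # Intermediate range: locus-specific tests.
        methods += ["FISH (locus-specific)", "PFGE (locus-specific)"]
    else:
        # Microscopically visible changes.
        methods.append("chromosome banding (G-banding)")

    if size_bp >= 100 * KB:
        # Genome-wide screening; lower bound assumed from clone-based arrays.
        methods.append("array CGH / SNP array (genome-wide, imbalances only)")
    return methods


if __name__ == "__main__":
    for size in (10 * KB, 1 * MB, 10 * MB):
        print(size, candidate_methods(size))
```

The lookup makes explicit why the arrays changed practice: for intermediate-sized changes the earlier options were locus-specific, whereas array-based analysis screens the whole genome in a single assay.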
As an alternative to genome-wide screening, a quantitative technique based on the polymerase chain reaction, called multiplex ligation-dependent probe amplification (MLPA), has been developed for the detection of specific large deletions or duplications. This technique relies on sequence-specific probe hybridization to genomic DNA, followed by amplification of the hybridized probe and semi-quantitative analysis of the resulting polymerase chain reaction products. The relative peak heights or band intensities from each target indicate the initial concentration of that target. This has proven to be an inexpensive, simple, rapid, and sensitive tool to detect dosage alterations in selected genomic regions. More recently, ddPCR has been used for precise quantitative measurement. The analysed DNA sample is separated into a large number of partitions and the reaction is carried out in each partition individually, providing an absolute quantification of target DNA molecules with previously unachievable accuracy and sensitivity.

Human genetics approaches to drug development

Mendelian diseases are a good model in which to study critical pathways, and the knowledge gained could benefit the understanding of, and therapeutic developments for, multifactorial diseases.

Hypercholesterolemia and coronary atherosclerosis have been shown to result from elevated levels of low-density lipoprotein (LDL) cholesterol or a reduced number of LDL receptors (LDLR).

In addition to alterations in LDLR and its ligand, ApoB, hypercholesterolemia has also been shown to be caused by missense gain-of-function mutations in PCSK9, which encodes a serine protease in the secretory pathway. Conversely, nonsense/truncating variants in PCSK9 have been found in individuals with low LDL cholesterol and a substantial reduction in the incidence of coronary events. Thus, inhibition of PCSK9 became a potential therapeutic approach for preventing coronary atherosclerosis. Intravenous or subcutaneous administration of monoclonal antibodies to PCSK9 (alirocumab or evolocumab) significantly lowered LDL cholesterol levels in clinical studies of healthy subjects and of subjects with familial or nonfamilial hypercholesterolemia. Currently, this approach is considered as a treatment in adults whose cholesterol levels are not controlled by diet and treatment with statins.

Recent technological advances in developing antisense oligonucleotides (ASOs) have opened a promising and unparalleled potential for treatment of monogenic diseases. Some of the therapeutic approaches using ASOs have demonstrated successful phenotypic amelioration both in animal models and in humans. For example, application of ASOs successfully corrected defective pre-mRNA splicing of transcripts from the Ush1c gene in a mouse model of human hereditary deafness. Importantly, these effects were sustained for several months, demonstrating the therapeutic potential of ASOs in the treatment of deafness.

ASO targeting of a long noncoding RNA (Ube3a-ATS) in the mouse model of the genomic imprinting disorder Angelman syndrome led to specific reduction of Ube3a-ATS and sustained unsilencing of paternal Ube3a in neurons both in vitro and in vivo. As a result, reversal of imprinting partially restored UBE3A protein and ameliorated some cognitive deficits associated with the disease.
Humans have two near-identical copies of the survival motor neuron gene: SMN1 and SMN2. A C-to-T transition (C6T) within exon 7 of SMN2 disrupts a modulator of splicing, leading to the exclusion of exon 7 from c.90% of the mRNA transcripts. Deletion or mutation of SMN1, combined with the inability of SMN2 to compensate for the loss of SMN1, results in spinal muscular atrophy (SMA), a severe lethal neurodegenerative disease. Diverse treatment strategies aimed at improving the function of SMN2 have been envisioned (e.g. manipulation of transcription, correction of aberrant splicing, and stabilization of SMN mRNA). Several studies applying ASOs targeting the splicing site in SMN2 demonstrated successful elevation of SMN protein from SMN2 and improved survival and function.

Fig. 3.2.3 Genome architecture and methods to resolve structure of varying DNA. Above is shown a scale of the human genome from 1 (10⁰) bp to 3 × 10⁹ bp and the size ranges (colour coded) in which the different methods can physically resolve differences. Chromosomal banding (green) examines the whole genome at once, but cannot resolve changes smaller than c.5 Mb (10⁶–10⁷ bp) in size. DNA sequencing (purple) can resolve single nucleotide changes and changes of several bases, but cannot identify CNVs. Pulsed-field gel electrophoresis (PFGE) and FISH (yellow) extend the reach of conventional karyotyping and resolve changes from 10⁴ to 10⁶ bp in size. Array CGH can resolve changes causing genomic imbalance from 10³ to 10⁸ bp (including aneuploidies), simultaneously performing thousands of locus-specific FISH procedures as well as detecting imbalances seen by chromosome analysis.
[Figure not reproduced: size scale from 10⁰ to 10⁹ bp with bars for DNA sequencing, PFGE/FISH, oligonucleotide microarrays, and chromosome banding, above a G-banded human male karyotype (chromosomes 1–22, X, and Y).]
Adapted from Lupski JR (2003). 2002 Curt Stern Award Address. Genomic disorders: recombination-based disease resulting from genomic architecture. Am J Hum Genet, 72, 246–52; Lupski JR (2007). Genomic rearrangements and sporadic disease. Nat Genet, 39, S43–7.

Conclusions

In a classical mendelian monogenic model of a disease, Watson–Crick DNA base-pair changes in a single gene are recognized as a mechanism affecting the structure, function, or regulation of the encoded protein. Completion of the human reference DNA sequence and recent advances in novel technologies that enable us to study the entire human genome of a given patient have extended our view of the genetic bases of disease in humans. It has become apparent that many disease traits are caused by genomic alterations rather than by single nucleotide changes. The genetic heterogeneity of several complex traits is being resolved. Also, the contributions of variant alleles at more than one locus in a given personal genome to disease manifestations are being better understood.

Genome-wide studies have led to important discoveries of large-scale CNVs in the human genome. The clinical consequences of the overwhelming majority of CNVs are not known. Many, if not most, CNVs are likely benign but some have been shown to be responsible for mendelian traits and others lead to increased susceptibility for complex traits.

Personalized and precision genomic medicine

The concept of personalized medicine has been developed with the Human Genome Project. In contrast to conventional medicine, where the patients’ diagnoses and treatments are based on disease signs and symptoms, personalized medicine refers to the genetic bases of the patient’s traits and susceptibility to traits. The hypothesis underlying personalized genomic medicine is that personalized medical care can be guided by the unique genomic content of an individual patient.
The aim of personal genomic medicine is the interpretation of unique information encoded in the individual patient’s genome to be able to anticipate genetic risks and liability and adjust personal lifestyle changes, diet, medications, prevention, and therapy to mitigate the consequences of genetic risk.

More recently, to avoid misinterpretations that unique treatments can be designed for each individual, the term precision medicine has been coined. In precision medicine, individuals can be classified into subpopulations that differ in their biology, susceptibility, prognosis, development, or in their response to a specific treatment for a particular disease.

The increasing ability to assay an individual’s DNA polymorphisms (both SNPs and CNVs) will continue to further enable prediction of personal responses to different drugs depending on an individual’s genetic background (i.e. pharmacogenomics). With the clinical implementation of new technologies, including massively parallel sequencing and high-resolution oligonucleotide array CGH and SNP arrays that offer analysis of the individual diploid human genome (DNA sequence and CNVs) within a relatively short time, the information content of entire genomes of individuals is expected to become affordable. Recent whole-genome studies, however, suggest that interpretation of the complexity of the genetic load of an individual or selected patients will require better understanding of genotype/phenotype correlations to provide clinically relevant information in a format commensurate with clinical implementation. Such an approach will potentially revolutionize clinical diagnostics and therapy and may provide tremendous benefits for the patients’ health.

3.3 Cytokines

ESSENTIALS

Cytokines are small glycoprotein mediators that are involved in every facet of immune effector function and regulation, and moreover serve to integrate immune function with other physiologic processes (e.g. metabolism, neurologic function). More than 200 cytokines have been identified. They can (1) function through binding to specific receptors that in turn signal via complex transduction pathways to regulate gene expression, thereby mediating positive and negative regulatory activities; (2) operate as soluble mediators in the extracellular domain or within cells, where they may also traffic to the nucleus and exhibit dual function as transcriptional regulators; (3) be expressed initially on the cell membrane, where they may exert effector function directly in cell–cell interactions, or from which they can be subsequently cleaved to yield bioactive soluble molecules, thereby mediating autocrine and paracrine activities around their cellular source.

The innate immune response: this is designed to offer immediate defence and comprises particular leucocyte lineages including neutrophils, mast cells, eosinophils, monocytes/macrophages, and natural killer cells. It is regulated by cytokines such as the type I interferons (IFN), tumour necrosis factor (TNF) α, interleukin (IL) 1 superfamily, and IL-6 superfamily, which are essential for the integrity of the innate immune response. Additional cytokines such as IL-10 and transforming growth factor (TGF) β promote resolution of immune responses and mediate healing of tissue damage, but they can play a substantial role in chronic inflammation when improperly downregulated.

Adaptive immune responses: these are an evolutionary refinement to facilitate recall responses to specific antigens derived from invading organisms or damaged self-tissue.
Professional antigen-presenting cells are activated by cytokines such as IFNs, TNFα, and IL-1α to increase antigen uptake, processing, and presentation to naive T cells and, critically, dendritic cells use cytokines to define the phenotype of subsequent T-cell responses: Th1 (IFNγ predominant); Th17 (IL-17A, IL-17F, IL-22 predominant); and Th2 (IL-4, IL-5, IL-13 predominant). Regulatory T-cell subsets may also be activated in this phase by cytokines.

Activation of host-tissue cell types: cytokines mediate many effector functions in the immune system by action on host-tissue cell types, including (1) activation of endothelial and lymphendothelial cells to express adhesion molecules and to alter their permeability properties to facilitate induction and resolution of inflammation; (2) regulation of differentiated tissue-specific cells, which can contribute indirectly to host-tissue defence; (3) regulation of host-tissue function in the context of inflammation (e.g. by facilitating metabolic responses to sustain energy requirements for host-defence responses).

Clinical context: understanding of the cytokine network has increasing importance in clinical practice with the advent of therapeutic strategies that target particular cytokines with exquisite specificity using biological agents, leading to remarkable advances in the treatment of inflammatory disorders (e.g. anti-TNF therapy in rheumatoid arthritis and anti-IL-17A in psoriasis). The therapeutic potential in their manipulation has not yet been maximized and the future will hold remarkable advances as these molecular networks give up their secrets to provide for highly specific and well-tolerated interventions.

Introduction

In higher organisms, the immune system has evolved to provide flexible and comprehensive host defence against microbial organisms.
It also plays a critical role in recognition of, and immediate response to, altered integrity of self-tissues, for example as occurs in trauma or neoplasia. The immune system itself comprises a complex interaction between cells of discrete lineage and function that must act in a cooperative manner to achieve satisfactory, long-lived protection with minimal damage to the host. Cytokines are small glycoprotein messengers (8–40 kDa) that facilitate regulation and effector function of cells in an autocrine or paracrine manner. Typically, cytokines exhibit functional activities that are of critical importance in the immune system, but also mediate wider effects across a range of tissues. Cytokines thus also play a role in a variety of normal physiological and metabolic processes. The field of cytokine biology has expanded considerably in the last decade with the recognition of large numbers of moieties and the advent of effective therapeutics based on cytokine targeting, particularly in inflammatory diseases.

3.3 Cytokines
Iain B. McInnes

Classification of cytokines

Cytokines were originally discovered and defined on the basis of their functional activities observed in bioassays (e.g. macrophage activating factor, lymphocyte activating factor). Subsequent advances in molecular biology, bioinformatics, and recently the Human Genome Project have facilitated discovery of large numbers of cytokines, posing considerable challenges in resolving their individual and synergistic functions in complex tissues in health and disease. In the absence of a unified classification system, cytokines may be variously identified by:

• numerical order of discovery (e.g. the interleukins (IL): currently IL-1 to IL-40)
• specific functional activity (e.g. tumour necrosis factor, TNF; granulocyte colony stimulating factor, G-CSF); this is usually an inaccurate descriptor of the potential activities of a given cytokine and therefore interpretation of function from such nomenclature should be cautious
• predominant primary cell or tissue of origin (monokine = monocyte derived; lymphokine = lymphocyte derived; adipocytokine = adipose tissue derived)
• structural homologies shared with related molecules

Cytokines may thus be considered in ‘family groups’ that share sequence similarity and also exhibit homology and some promiscuity in their reciprocal receptor systems. Note that they need not exhibit functional similarity despite common structural domains or features. These so-called ‘cytokine superfamilies’ often contain regulatory cell membrane receptor–ligand pairs that use common structural motifs in diverse immune functions in higher mammals, reflecting evolutionary pressures. This is best exemplified in the IL-1/IL-1 receptor superfamily that contains cytokines such as IL-1β, IL-1α, IL-1 receptor antagonist, IL-1 F5–F10, IL-18, IL-33, IL-36, and IL-37 which mediate physiological and host-defence function.
IL-1 receptor signalling cascades share remarkable similarities with the signalling pathways induced by toll-like receptors (TLR) and NOD-like receptors (NLR), a series of mammalian pattern-recognition molecules with a crucial role in recognition of microbial species early in innate responses. These common motifs allow integration of responses at the signal transduction level between cytokine and other immune receptor systems to allow fine-tuning of responses over time.

Increasingly, cytokines might also be usefully organized on the basis of their cooperative functional properties and thereby assigned to a given immunologic process (e.g. T-cell activation, T-cell differentiation, or tissue repair). One cytokine can contribute to several activities and one cytokine superfamily can by corollary contribute to many discrete immunologic processes.

Basic cytokine biology

See Fig. 3.3.1.

Synthesis, expression, and regulation

Cytokines can be produced by almost every leucocyte and host tissue cell type (see next). Cytokines can be secreted or expressed as cell membrane-bound proteins, or may be processed into cytosolic forms that can traffic intracellularly, even to the nucleus where they can act as transcriptional regulators. Thus, cytokines mediate autocrine function either through release or membrane expression and immediate receptor ligation on the source cell, or intracellularly within the source cell. Alternatively, cytokines operate in a paracrine manner, allowing cellular communication beyond that facilitated by local cell–cell contact.
The distance over which cytokines can mediate effects is unclear and probably depends on many factors, not least the site of synthesis and the extracellular matrix components of a given tissue: in silico models predict meaningful bioactivity for many cytokines no further than a 40 µm diameter from the source cell due to physicochemical considerations of the peptide structure itself, extracellular matrix binding (e.g. to heparan sulphate), enzymatic degradation, or the presence of soluble receptors or cytokine-binding proteins. Some cytokines, however, clearly exhibit systemic activities (e.g. IL-1β promotes pyrexia via either local expression or circulation to the central nervous system, and IL-6 induces the acute-phase response by circulating to the liver). To facilitate such broader effects, the IL-6 receptor also circulates as a soluble entity (soluble IL-6 receptor) and thereby confers on this cytokine the ability to activate any cell expressing its other cognate receptor (gp130), since IL-6/IL-6R complexes can preform in the circulation and then directly bind and activate cells via gp130. One implication of this is that simple measurement of a cytokine concentration alone may not reflect its functional contribution at the biologic level unless other functioning components are also assessed, which has confounded efforts to use cytokines as biomarkers in clinical development programmes.

The triggers to cytokine release are diverse and depend upon the cell type and tissue concerned (Box 3.3.1). Many of these triggers can also be used ex vivo to study cytokine function, together with a variety of surrogate stimuli such as chemical entities, including phorbol esters, calcium ionophores, lectins, and receptor-specific agonistic antibodies (e.g. anti-CD3/CD28 to mimic T-cell costimulatory activation), and microbially derived products (e.g. LPS, BLP).
Detailed studies using such agents in vitro together with in vivo observations using gene knock-in and knock-out mice have unravelled many layers of regulation for cytokine production. Since cytokines exhibit such potent effects in immune function, it is unsurprising that these pathways are tightly regulated.

Transcriptional and post-transcriptional regulation

Transcription of a cytokine gene usually depends upon the recruitment of multiple transcription factors (TFs) to the cytokine gene promoter region. TF binding allows numerous signal pathways to regulate cytokine expression. Some transcription factors (e.g. NF-κB; activator protein-1, AP-1; nuclear factor of activated T cells, NF-AT) appear to have particular importance in cytokine regulation in disease states. However, there are many other TF binding sites in most cytokine genes and increasingly the functional organization of chromatin structures and transcriptional machinery is recognized to be rather different depending upon the cell type and cytokine studied. The recognition and manipulation of cytokine transcription is at present an area of enormous complexity, but also of great therapeutic opportunity to achieve context-specific inhibition of cytokine production (e.g. inflammatory vs. protective cytokine release). Transcription factors that define cell types/differentiation status and that are directly cytokine regulated offer particular hope in this regard (e.g. t-bet and ror-c, which regulate Th1 and Th17 cell differentiation, respectively).

Sequence polymorphism within cytokine promoters offers potential for differential cytokine expression between individuals that could confer selective advantage against infection, but could also increase susceptibility to, or progression of, autoimmunity or chronic inflammation. Intronic single nucleotide polymorphisms (SNPs) elsewhere in the cytokine gene structure are also likely to be of importance although they are in general poorly characterized at present.

Post-transcriptional regulation is important in determining longevity of cytokine expression. This may operate by promoting translational initiation, mRNA stability, and polyadenylation. AU-rich elements within the 5' or 3' untranslated regions (UTR) of cytokine mRNA are crucial for stability. Alternatively, cytokines may generate stable mRNA a priori to facilitate subsequent rapid response in tissues. By this means cytokines generated by distinct processing of this mRNA can be rapidly synthesized and trafficked to cytosolic or extracellular domains as required.

Finally, the cytosolic activity of microRNA species provides a further level of cytokine regulation. These recently recognized small RNA sequences have the capacity to bind to mRNA and down-regulate translation to protein. Myriad microRNAs have now been identified, each considered capable of regulating up to 20 mRNAs, usually in a negative direction. Genetic targeting of such microRNAs in vivo has led to remarkable effects on global immune function mediated through altered cellular differentiation and cytokine release. In terms of cytokine regulation there has been particular interest in the biology of miR146a and miR155, though many others are now emerging.
[Figure 3.3.1: schematic of cytokine-mediated interactions between immune and tissue cell types; panel labels include Adaptive response, Innate response, Tissue structure, Inflammation, Matrix damage, Atherogenesis, and Metabolic syndrome.]

Fig. 3.3.1 Cytokines mediate pleiotropic activities within complex cellular systems. Cytokines form coordinated networks that regulate multiple cellular interactions locally and in turn promote systemic responses. This is particularly well-defined in the pathogenesis of rheumatoid arthritis. The principal relationships existing between cellular lineages are consistent across a range of inflammatory disorders. The figure highlights the multifaceted roles played by cytokines in the range of manifestations of disease. The activities are broadly defined on the basis of their roles in promoting adaptive or innate immune function. In the novel immune response these will occur sequentially. In the context of chronic inflammation, or persistent recall responses, the two arms of the immune system will likely overlap, allowing for considerable cross-talk and interlinked function for cellular components, but particularly in the cytokine network.
Reprinted by permission from Macmillan Publishers Ltd: Nat Rev Immunol. Cytokines in the pathogenesis of rheumatoid arthritis. McInnes IB, Schett G. 2007 Jun; 7(6):429–42, © 2007.

Post-translational regulation

Cytokine production is regulated by post-translational modifications. Patterns of glycosylation are important for cytokine function in the extracellular domain and may also regulate intracellular trafficking. Modified leader sequences can alter intracellular trafficking of cytokines. Moreover, some cytokines are translated without functional leader sequences. Their secretion depends on nonconventional secretory pathways that are thus far poorly understood but may include ion channels, chaperone proteins, or specific membrane pores. Enzymatic activation of preformed procytokines is common. Such enzymatic cleavage is often conducted within an assembly of proteins (e.g. the inflammasome) designed to carefully coordinate and regulate the amount and duration of active cytokine release. A variety of enzymes are implicated in these processes, including caspases, serine proteases (proteinase 3 and elastase), and adamalysin family members. Enzyme cleavage pathways operate both within and outside cells, providing for extracellular cytokine activation. Thus, cell membrane enzymes serve to cleave membrane-expressed cytokine to generate soluble cytokines.

In summary, extensive molecular machinery exists to tightly regulate not only the production and stability of cytokine mRNA, but also its translation and cellular expression and distribution.

Mediation of cytokine effects

Cytokines mediate their effects primarily by binding to a receptor, or combination of receptors, expressed on the membrane or in the cytosol of a target cell. Like cytokines, and arising from similar evolutionary mechanisms, receptors for cytokines exist in structurally related superfamilies (Fig. 3.3.2). Cytokine receptors thus comprise high-affinity molecular signalling complexes that facilitate cytokine-mediated communication.
Such complexes often include heterodimeric or heterotrimeric structures that use unique, cytokine-specific recognition receptors together with common receptor chains shared across a cytokine superfamily. The common γ chain is used by receptor complexes of many α-helix cytokines (e.g. IL-2, IL-7, IL-15, IL-21) and the gp130 receptor is similarly utilized by IL-6 and many homologues.

Similar promiscuity across cytokine receptors is shown by the intracellular molecules used to transduce the signal to the nucleus (e.g. JAK/STAT pathways). Single signal molecules in turn can function on behalf of many receptors, although subtle differences may operate as to precise dimerization or phosphorylation events for any given receptor leading to specificity of the response. The JAK/STAT receptor signalling complex is particularly useful to demonstrate the very broad range of functions that can be achieved, with the potential for fine tuning based on a relatively limited array of receptor and transduction components.

Cytokines and their receptors can adopt a variety of orientations to mediate their effector functions. Membrane receptors, with intracellular signalling domains intact, can transmit signals to the target cell nucleus following soluble cytokine binding and thereby promote effector function. Membrane receptors may bind cell membrane cytokines, facilitating cross-talk between adjacent cells that can alter the behaviour of both participating cells, particularly if a cytokine exhibits the potential to ‘reverse signal’ (the property to send a signal back into the producer cell via membrane cytokine expression). Membrane-bound and soluble cytokines may thus promote distinct functions.

Cytokine receptor–cytokine complexes may also operate in trans, whereby component parts of the ligand–receptor complex are derived from adjacent cells.
This renders target cells capable of responses to a cytokine when they only express a part of the necessary cytokine receptor complex.

Receptors also exist in soluble form, derived either from alternative mRNA processing to generate receptors lacking transmembrane or intracellular domains, or by enzymatic cleavage of receptor from the cell surface. Soluble receptors can antagonize cytokine function, or preform complexes with cytokine to promote subsequent ligand–receptor assembly on the target cell membrane, and thereby enhance function as another example of trans signalling. Furthermore, soluble receptors can deliver cytokine to the cell membrane via ligand passing, in essence the donation of a cytokine from one receptor to another on the surface of a cell to confer sequential function.

Box 3.3.1 Factors that regulate cytokine production
• Cytokines (forming amplificatory and regulatory loops)
• Cell–cell contact (e.g. lectins, integrins, Ig superfamily members)
• Immune complexes/autoantibodies
• Complement activation products
• Microbial species and their soluble products (particularly via TLR/NLR pathways)
• Reactive oxygen and nitrogen intermediates
• Trauma
• Shear stress and barotrauma
• Ischaemia
• Radiation
• Ultraviolet light
• Extracellular matrix components
• DNA (mammalian or microbial) and RNA (including microRNA and lncRNA)
• Heat-shock proteins

Fig. 3.3.2 Cytokine receptors and the intracellular molecules used to transduce the signal.

These details of how cytokines mediate their effects are crucially important in devising effective therapeutic cytokine inhibitors. The homeostasis of cytokine release and function is maintained by this network arrangement. An inhibitor that blocks only part of the cytokine receptor interaction may be ineffective or, worse, paradoxically enhance effects (Fig. 3.3.2).

Finally, although in general cytokines bind only their cognate receptor, there appears to be some plasticity in the system since close cross-communication on the cell membrane between seemingly unrelated cytokine receptor systems also occurs, thereby allowing a cell to integrate a variety of external stimuli to optimize signalling pathways. This allows cells to constantly sense and respond to complex changes in the local environment delivered by many cytokines either simultaneously or in sequence.

Activities of principal cytokines

Members of the major cytokine superfamilies are listed in Table 3.3.1. Further hierarchical relationships exist within and between superfamilies, reflecting the ancestral genes from which cytokines have been derived. The diversity and density of data now available to describe the activities of individual cytokines are beyond the scope of this text, but the following paragraphs summarize the activities of selected cytokines implicated in inflammatory disorders.

TNF superfamily

Tumour necrosis factor

Tumour necrosis factor (TNF) is the prototypic proinflammatory cytokine of this superfamily. It is produced by a wide variety of immune cell types, particularly macrophages, T and B lymphocytes, and neutrophils, but is also released by tissue cell types including keratinocytes, fibroblasts, glial cells, and smooth muscle cells. It comprises a homotrimer (each subunit of 26 kDa) that binds to either of two receptors, TNF receptor I (p55) or TNF receptor II (p75).
TNF and its receptors are synthesized as membrane proteins that can be cleaved to soluble form by the activity of members of the ADAMs family of enzymes. Downstream signalling is mediated via MAPK and NF-κB, and can if appropriate involve recruitment of death domains to facilitate apoptosis in target cells. TNF sits in a pivotal position in many inflammatory cytokine networks. Thus, inhibition of TNF in inflammatory tissues in vitro, such as those derived from inflammatory synovitis, psoriatic skin, or inflammatory bowel disease mucosal biopsies, leads to down-regulation of many other inflammatory cytokines, such as IL-6 and IL-8, and of the production of many inflammatory chemokines. This ‘cytokine hierarchy’ is considered to explain in part the efficacy of single cytokine targeting in chronic disease states.

The precise effects of TNF receptor binding depend upon the lineage and activation status of the target cell. TNF induces monocyte activation and maturation and promotes chemokinesis, release of reactive oxygen and nitrogen intermediates (ROI/RNI), and prostaglandin/leukotriene production. Similarly, polymorphonuclear leucocytes are primed and induced to oxidative burst by TNF. Effects on T cells are predominantly regulatory such that long-standing exposure to TNF leads to relative hypofunction of T cells and impaired T-cell receptor signalling. Indeed, TNF blockade in humans can lead to enhanced T-cell autoreactivity. TNF is a critical activator of tissue cells. Thus, endothelial cells (vascular) and lymphendothelial cells (lymphatics) are induced to express high levels of adhesion molecules and chemokines upon TNF exposure. The net effect of this is to increase cellular trafficking into and out of inflammatory lesions. TNF further promotes vascular permeability and is directly implicated in the hypotension and oedema associated with septic shock.
Local effects are also mediated upon nociception such that TNF increases pain sensation via modulated local neurotransmitter release. Systemic metabolic effects can promote cachexia and altered lipid metabolism in adipose tissues. Mice in which TNF is transgenically overexpressed develop spontaneous autoinflammatory diseases including inflammatory arthritis and inflammatory bowel disease. Targeting TNF is effective in numerous disease states, particularly rheumatoid arthritis, Crohn’s disease, and psoriasis, providing formal proof of concept of a pivotal role for this cytokine in pathogenesis.

Table 3.3.1 Members of the major cytokine superfamilies

Cytokine family/activity | Key members(a)
TNF-like | TNF, lymphotoxin, BLyS, APRIL, RANKL, TWEAK
IL-1-like | IL-1α, IL-1β, IL-1Ra, IL-18, IL-33, IL-36α,β,γ, IL-37, IL-38, IL-36Ra
IL-6-like | IL-6, oncostatin M, leukaemia inhibitory factor, IL-11, IL-31, cardiotrophin-1, cardiotrophin-like cytokine, ciliary neurotrophic factor
IL-12-like | IL-12, IL-23, IL-27, IL-35
IL-10-like | IL-10, IL-19, IL-20, IL-22, IL-24
Growth factors | TGFβ, BMPs, PDGF
Angiogenesis | VEGF, bFGF, endostatin
Colony stimulating factors | G-CSF, M-CSF, GM-CSF
Adipocytokines | adiponectin, resistin
IL-17 family | IL-17A, IL-17F, IL-17B, IL-17C, IL-17E

(a) The cytokines shown for each family are examples, not an exhaustive list.

Lymphotoxin

Lymphotoxin is a 22 to 26 kDa cytokine that shares broad inflammatory properties with TNF but is particularly and additionally implicated in structural organization of the immune system. Thus, formation of germinal centres in lymph nodes and spleen and the creation of ‘ectopic’ germinal centres (i.e. formed outwith lymphoid organs) in chronically inflamed tissues is particularly regulated by lymphotoxin, together with the chemokines CXCL13 and CCL21.

B lymphocyte stimulator protein

B lymphocyte stimulator protein (BLyS; also known as B-cell activating factor, BAFF) and a proliferation inducing ligand (APRIL) are two further members of the TNF superfamily that regulate B-cell function. BLyS and APRIL can be synthesized by monocytes, T cells, dendritic cells, fibroblasts, and some tumour cells. Their primary activity lies in supporting B-cell maturation and activation transduced via BLyS receptor and TACI. Following antigen-driven B-cell activation, BLyS and APRIL promote isotype switching and immunoglobulin secretion and delay B-cell apoptosis. Effects beyond B cells have been observed including T-cell costimulation and tumour proliferation.

Receptor activator of NF-κB ligand

Receptor activator of NF-κB ligand (RANKL) was identified as a cytokine (35 kDa) promoting dendritic cell–T-cell interactions but is now primarily recognized as a critical regulator of bone homeostasis. RANKL is produced by monocytes, T cells, osteoblasts, and stromal cells and binds to RANK, its cognate receptor. RANKL mediates effects in physiological bone remodelling by promoting maturation of osteoclast precursors to yield osteoclasts that are fully functional for resorption of calcified tissues. Production by osteoblasts facilitates integrated resorption and new bone formation to maintain structural integrity and morphology of bone. In inflammatory states, RANKL expression is increased, leading to increased osteoclast maturation and effector function and hence to net bone loss. RANKL exhibits synergistic effects with TNF, IL-1β, IL-17, and IL-6 in this regard.
Osteoprotegerin (OPG) is a 55 kDa soluble decoy receptor for RANKL that acts as a competitive inhibitor to limit the activity of RANKL. OPG-deficient mice exhibit significant osteoporosis and transgenic mice are osteopetrotic, suggesting a critical role in physiologic bone remodelling. The balance between RANKL and OPG synthesis defines the net resorptive activity of bone and is dysregulated in inflammatory states leading to systemic bone loss (i.e. osteoporosis). An increased inflammatory milieu usually increases the RANKL/OPG ratio of expression leading to accelerated local bone loss, designated ‘erosions’. These are particularly prevalent in rheumatoid arthritis and psoriatic arthritis, but also occur in septic arthritis. This can also mediate systemic bone loss and predispose patients with chronic inflammatory diseases to osteoporosis.

IL-1 superfamily

The IL-1 superfamily contains a variety of moieties involved in local and systemic regulation of immune responses that share structural homology. This family has recently expanded and now comprises a group of moieties with agonist activity that is generally proinflammatory, including IL-1α, IL-1β, IL-18, IL-33, IL-36α, IL-36β, and IL-36γ. IL-37 is an agonist that appears to be predominantly anti-inflammatory in its net effects. Three receptor antagonists have now been recognized, designated IL-1Ra, IL-36Ra, and IL-38.

IL-1

Most is known about IL-1α and IL-1β, which are synthesized as promolecules of approximately 35 kDa that are in turn cleaved by caspase 1 within the inflammasome to yield active 18 kDa cytokines. IL-1Ra is a homologue of IL-1α and IL-1β that competes with these agonists for receptor binding. The IL-1 cytokines effect function via binding to a heterodimeric receptor comprising IL-1 receptor I (IL-1RI) and IL-1 receptor accessory protein (IL-1RAcP). IL-1RII is a further receptor that has decoy function.
IL-1 cytokines signal through a canonical signal pathway that they share with the TLR superfamily. Via a series of protein interactions and kinase-dependent events, this pathway leads to NF-κB activation and inflammation-related gene activation.

IL-1α and IL-1β are differentiated by the primarily membrane expression of the former, which also retains activity as a full-length promolecule. Their functions are, however, rather similar, reflecting the shared receptor components (and are designated ‘IL-1’ hereafter). Thus they promote monocyte activation, cytokine and ROI/RNI release, and prostaglandin production. They further drive fibroblast activation, collagen synthesis, prostaglandin release, and proliferation. IL-1 is a potent inducer of endothelial cell activation and adhesion molecule expression; of osteoclast maturation and activation in synergy with RANKL; and of chondrocyte activation, catabolism, and matrix degradation via metalloproteinase production.

IL-1 is a potent pyrogen, commensurate with the fever syndromes that arise when genetic abnormalities occur in IL-1 regulation (e.g. Muckle–Wells syndrome/cold autoinflammatory disorders). Such syndromes are accordingly amenable to therapeutic IL-1 blockade using, for example, synthetic IL-1Ra (anakinra) or anti-IL-1 antibodies. IL-1 is also implicated in the pathogenesis of a variety of common chronic inflammatory diseases including rheumatoid arthritis, ankylosing spondylitis, and psoriasis. Its hierarchical position in inflammation cascades relative to TNF is unclear since targeting with IL-1Ra in practice has proved disappointing in most diseases in which TNF blockade is effective. However, IL-1 has found a favourable role as a target in diseases associated with excess inflammasome activation, particularly gout. There is also interest in its metabolic activities in type 2 diabetes, in which therapeutic targeting may be beneficial.
Finally, high levels of IL-1 expression within the central nervous system may regulate several central pathways implicated in cognition and mood state.

IL-18

IL-18 is an innate response cytokine produced by monocytes, fibroblasts, neutrophils, and dendritic cells as a 33 kDa promolecule that can be cleaved by the actions of caspase 1 in the NALP3 inflammasome to an 18 kDa active moiety. IL-18 activates a heterodimeric receptor (IL-18Rα/IL-18Rβ) and is antagonized in vivo by soluble IL-18Rα and by a distinct IL-18 binding protein family that contains several members generated by alternative mRNA splicing. IL-18 activates neutrophils to promote maturation, chemotaxis, ROI/RNI production, and cytokine/chemokine release. It also drives NK-cell activation, and monocyte/macrophage maturation and activation. Its primary effects are likely mediated by driving T-cell differentiation towards a mainly Th1 phenotype in synergy with IL-12. IL-18 is expressed at high levels in rheumatoid arthritis, psoriasis, and inflammatory bowel disease in human tissues and rodent models and has net proinflammatory function therein. Very high systemic levels have been reported in Still’s disease (juvenile and adult onset) and in systemic leukophagocytosis syndromes. Recently effects on hepatocytes and adipocytes have been reported for IL-18. The net effect on metabolism and accrued vascular risk is unclear since epidemiological and several in vitro studies suggest that IL-18 can promote atherogenic risk and metabolic syndrome whereas in vivo studies, mainly in IL-18-deficient animals, suggest that IL-18 may be protective in this regard. This may offer therapeutic opportunity in due course.

IL-33

IL-33 is produced mainly by fibroblasts, and is synthesized as a 33 kDa protein that is cleaved by enzyme pathways as yet undefined. It functions via binding to ST2L and IL-1RAcP and signals via the canonical IL-1 receptor pathway. It also acts within the nucleus of the synthesizing cell as a transcriptional repressor by virtue of a direct DNA binding domain. IL-33 promotes differentiation and expansion of Th2 cells, ILC2, and iNKT cells. It directly activates mast cells and eosinophils and induces their cytokine production. As such, the effects attributed to IL-33 are mainly in allergy and anaphylaxis. It can also activate B cells, macrophages, and neutrophils. Recently a role for IL-33 in promoting immune homeostasis in the gastrointestinal mucosa has been suggested via promotion of regulatory T-cell functions.

IL-36 and IL-37

IL-36α, β, and γ may be considered together at this time. They are produced by macrophages, dendritic cells, and some lymphocyte subsets, and also by epithelial cells and synovial fibroblasts, rendering them of interest in a variety of chronic disease states. Their effects can be antagonized by IL-36Ra.
Via activation of keratinocytes, IL-36 is implicated in psoriasis, and has also been shown to promote neutrophilic pulmonary disease. IL-37 exists in five splice variants, but at this time its functions are relatively poorly understood, although in general they appear to be anti-inflammatory.

Type I interferons

Interferons were first described more than half a century ago and are of pivotal importance in primary host defence, but increasingly have been attributed effector roles across a range of cancers and inflammatory diseases.

Type I interferons comprise IFNα (of which there are 13 subtypes in humans that share broad homology), IFNβ, and others including IFNε, IFNτ, IFNκ, and IFNω. They are synthesized by many immune and tissue cells as part of the primary response to viral infection, usually induced by pattern-recognition receptors (PRR) that ‘sense’ a variety of shared structures derived from microbial species, or components of tissue damage (e.g. soluble DNA, RNA). PRR in turn activate IFN-regulatory family transcription factors leading to IFNα and IFNβ synthesis. Type I IFNs in turn bind their receptor IFNAR1/IFNAR2, leading to activation of interferon-stimulated response elements in many promoters, which permits induction of expression of many so-called IFN-stimulated genes (ISGs).

The functional consequences are complex. Type I IFNs serve to restrict viral synthesis within infected cells and transmission to adjacent tissue cells. They also mediate wider immune modulatory effects across many cells of the innate immune response. Specifically, they activate dendritic cells and monocytes, CD4 and CD8 T cells, NK cells, and B cells. Thus, they contribute both to host defence and to host-tissue damage, with the latter becoming dominant in the context of chronic inflammatory diseases. For example, in rheumatoid arthritis and especially in systemic lupus erythematosus, type I IFN responses are predominant and present in many patients.
Recently, it was recognized that type I IFNs are also up-regulated within tumour microenvironments, both within tumour cells and infiltrating immune effector cells, such that they can exert autocrine and paracrine regulation of immune surveillance.

Based on these varied clinical observations it is unsurprising that there remains intense interest in the use of type I IFNs both as therapeutics per se (e.g. as established agents for the treatment of multiple sclerosis), and also as immune targets in the context of clinical trials (e.g. the development of anti-IFN and anti-IFNAR antibodies in systemic lupus erythematosus). Moreover, the presence of ISGs is also considered to carry rich potential to serve as biomarkers in chronic inflammatory diseases (e.g. high ISG expression is associated with low responses to rituximab in rheumatoid arthritis).

Type II and type III interferons

The type II interferon class comprises only IFNγ, which has a well-established role in the regulation of monocyte and dendritic cell activation and arises as a product of Th1 cells, ILC1, and NKT cells. The wider role of IFNγ is discussed later in the context of cellular activation and T-cell function.

Type III interferons include IFNλ1, IFNλ2, and IFNλ3 (also designated IL-29, IL-28A, and IL-28B, respectively), but little is currently known about their relevant effector biology.

Other cytokine activities

Regulation and mediation of T-cell function

T cells of CD4+ and CD8+ lineages are functionally defined on the basis of their release of effector cytokines.

Th1 cells

Th1 cells (defined by T-bet transcription factor expression) release IFNγ and TNFα and promote granuloma formation and host defence to intracellular organisms. Th1 cell formation is driven by IL-12 and IL-18 together with relative absence of TGFβ, IL-4, and IL-17.
IFNγ is a 20 to 25 kDa molecule that has pleiotropic functions including priming and activation of neutrophils, activation of macrophages particularly to induce cytotoxic pathways, and activation of NK cells. However, IFNγ also promotes tissue repair and as such is an example of the multifunctional potential in cytokines that must be elucidated with care prior to therapeutic intervention. Inherited deficiencies in the IFNγ/IFNγR signalling pathway, or in the components of the IL-12 pathway, engender susceptibility to intracellular infections, particularly tuberculosis, highlighting the critical role for this cytokine in host defence.

Th17 cells

Th17 cells (RORγt) are a recently described subset implicated in autoimmunity and host defence. They release IL-17A, IL-17F, and IL-22, and when targeted in many rodent models of autoimmunity are found to be of profound pathologic importance. Their generation depends variously on the activities of IL-21, IL-1β, IL-6, IL-23, and TGFβ. IL-17A is a potent effector cytokine (20–30 kDa) that operates in synergy with IL-1 and TNF to promote leucocyte activation, bone marrow leucocyte maturation, haemopoiesis, and matrix degradation, the latter via direct effects on fibroblasts and chondrocytes. IL-22 similarly promotes autoimmune-mediated tissue damage by directly activating tissue cells such as keratinocytes and invading leucocytes. IL-17F is a homologue that exerts similar effector function but apparently with lower potency. Recently it has emerged that IL-17F may play a more important role in human immunity than previously thought from murine studies, raising interest in dual IL-17A/F inhibition.

Th2 cells

Th2 cells (GATA3) are characterized by IL-4, IL-5, IL-13, and IL-25 release, and have their primary role in driving humoral immunity and host defence to many parasites, particularly in mucosal/barrier defence. In disease states, Th2 cells especially promote allergy and anaphylaxis. Th2 differentiation is driven by IL-33 and IL-4.

A further subset of regulatory T cells (Foxp3; Tr) is now described that comprises naturally occurring T cells that function in a predominantly suppressive manner via direct cell contact with adjacent leucocytes or via release of IL-10 and TGFβ. Tr differentiation is favoured by TGFβ and IL-35. Mechanisms that upregulate regulatory T-cell function (e.g. via use of CD28 superagonists or indeed of recombinant IL-35 itself) are now under investigation.
Miscellaneous activities

IL-6 is a pleiotropic proinflammatory cytokine that mediates function via IL-6Rα and the common coreceptor, gp130. IL-6 also activates B cells, promoting isotype switching and immunoglobulin production, and T cells, promoting proliferation and differentiation. It has an important role in haemopoiesis and thrombopoiesis. IL-6 mediates intriguing systemic effects: it critically regulates the acute-phase response via direct effects on hepatocytes. Moreover, it has a role in integrating inflammatory responses with function of the hypothalamic-pituitary-adrenal axis, indicative of a role in the immediate stress response. High levels of IL-6 expression are described in rheumatoid arthritis and Crohn’s disease and in juvenile inflammatory arthritis, particularly systemic variants thereof, and Still’s disease. High levels of IL-6 are also detected in Castleman’s disease. IL-6 deficiency or blockade is anti-inflammatory in several in vivo and in vitro disease models, providing strong rationale for IL-6 blockade in the clinic.

In contrast, IL-10 exhibits predominantly anti-inflammatory effects. It is released by a variety of leucocytes including macrophages, and T and B lymphocytes. Acting via IL-10RI and IL-10RII, it inhibits macrophage cytokine and RNI/ROI production, T-cell activation, dendritic cell priming and maturation, and fibroblast activation. Note, however, that IL-10 promotes B-cell activation and immunoglobulin secretion. Several members of the IL-10 superfamily are now defined with effects often manifest in barrier defence. In particular, IL-20 and IL-22 are implicated in keratinocyte responses to cutaneous inflammation.

The range of known cytokine activities described is now substantial across a variety of tissue compartments and processes. Often their original designation belies pleiotropic effects across a range of physiology and pathologies.
Colony-stimulating factors such as GM-CSF, G-CSF, and M-CSF were originally defined on the basis of leucocyte precursor differentiation and maturation, but now are recognized to play a role in effector immune responses. GM-CSF in particular is now attracting attention as a pivotal effector cytokine in the pathogenesis of some chronic inflammatory diseases; phase II trials are ongoing in rheumatoid arthritis using a GM-CSFR inhibitor.

TGFβ isoforms have broad effects in tissue maintenance and repair and enjoy broad expression and functional promiscuity. They initially promote inflammatory cell recruitment and activation together with a key role in fibroblast activation and matrix synthesis, but over time promote suppression of inflammation to permit efficient wound repair. The related cytokine family of bone morphogenetic proteins (BMP) exhibits similarly wide activities in tissue morphogenesis and repair. Comprising a large family of homo- and heterodimers, they regulate chemotaxis, mitosis, and differentiation during chondrogenesis, osteogenesis, and tissue morphogenesis in heart, skin, eye, and beyond, rendering them interesting moieties for therapeutic manipulation of tissue repair following injury, neoplasia, or inflammatory insult. Other fundamental processes in tissue repair are also subject to cytokine regulation. Angiogenesis is critically regulated by vascular endothelial growth factor (VEGF) and by basic fibroblast growth factor (FGF). Together with naturally occurring inhibitors of angiogenesis, such as endostatin, this pathway is permissive for tissue repair, but also for maintenance of inflammation, or metastasis. These angiogenic factors carefully orchestrate recruitment of endothelial precursors and their subsequent maturation and organization into vessel structure; relative deficiency or excess can have significant consequences for tissues.
Cytokines as therapeutic targets and biomarkers

General concepts in cytokine therapeutics

Cytokine immunology has provoked the creation of a dynamic new field in drug discovery. Cytokines represent therapeutic entities in themselves, best exemplified in their use to amplify cancer therapeutics via immune stimulation or to enhance immune competency in immunosuppressed patients. However, their short half-life and tendency to mediate systemic effects that are in general undesirable have limited their value in this respect. More success has been achieved in the treatment of chronic viral infection, whereby treatment of hepatitis C infection with type I interferon as part of combination antiviral therapy leads to improved viral clearance. Type I interferons have also delivered benefits in the treatment of multiple sclerosis.

Similarly, the utility of ‘anti-inflammatory’ cytokines as therapeutic entities per se in inflammatory diseases has been limited by their half-life and by toxicities arising from non-disease-related, but plausible, biological effector function. For example, IL-10, a cytokine with generally anti-inflammatory effects, exhibited some efficacy in patients with psoriasis and rheumatoid arthritis but offered an unacceptable toxicity/benefit ratio overall, due in part to systemic adverse effects. Current efforts are focused upon delivering high local concentrations of cytokine, perhaps by gene delivery methodologies, often in structurally modified form (e.g. by addition of an Fc domain). IL-2 is a promising emerging therapeutic with the potential to expand regulatory T cells and thus promote immune tolerance. Alternatively, tissue-localizing molecules (e.g. monoclonal antibodies against tissue endothelium specific targets, or damaged tissue epitopes) may be engineered on to cytokines at the molecular level with the objective of providing high concentrations of local cytokine agonist activity in areas of maximal inflammatory insult. This remains an area under development.

The crucial advance in recent years, however, has been the recognition that specific cytokine inhibition can bring about remarkable changes in inflammatory disease expression. Cytokines can be inhibited by interruption of any part of their synthesis, secretion, and effector pathway. Current therapeutics relies mainly on inhibition of cytokines in the extracellular or membrane compartment mediated by large biological drugs. Biological drugs generated thus far comprise antibodies and soluble cytokine receptor fusion proteins. Monoclonal antibodies specific for a given cytokine, which are either ‘fully human’ (human sequence; e.g. adalimumab, ustekinumab), ‘humanized’ (rodent sequence modified to resemble human structure), or chimaeric (part of the antibody is rodent derived and part of human structural origin, e.g. infliximab), offer effective high-affinity inhibition. Similarly, soluble cytokine receptors, fused with Fc domains of immunoglobulin to enhance their half-life and stability, provide high-affinity therapeutic inhibitors (e.g. etanercept). Novel modifications to biologic agents, such as addition of polyethylene glycol residues (PEGylation), are in development to provide further refinement of their pharmacokinetic and pharmacodynamic properties.
Other approaches to cytokine inhibition include the generation of small-molecule inhibitors that may target several points in the synthesis and release pathway:

• enzymes that cleave cytokines intracellularly, or from the cell membrane (e.g. TNF-α-converting enzyme (TACE), caspase 1)
• signal transduction molecules that mediate receptor intracellular function (e.g. syk, JAK, p38MAPK, JNK, NF-κB)
• receptor antagonists selected for direct inhibition of cytokine receptor–protein interaction

An important corollary to small-molecule inhibition, at least for signal transduction inhibitors, is the loss of the exquisite specificity for a single cytokine contained in the biologicals. Thus, pathways such as the MAPK and JAK appear tractable to drug targeting, but since these molecules subserve several inflammatory cytokine receptor pathways, their inhibition is no longer specific to one pathway. The capacity to inhibit several cytokine effector pathways (and hence checkpoints) may be advantageous in bringing higher efficacy but must be carefully balanced with potential toxicity, whether predictable or idiosyncratic. At this time, the approach to inhibition of MAPK has failed in inflammatory illnesses: p38 inhibitors in particular have been trialled in numerous scenarios and as yet have offered no benefit. By contrast, JAK inhibitors are currently offering promise. Tofacitinib is a JAK1/JAK3 inhibitor approved for use in some geographic areas in the treatment of rheumatoid arthritis. Follow-on compounds that inhibit JAK1/JAK2 (baricitinib) and JAK1 (upadacitinib, filgotinib) are either now approved or will be so shortly. The critical safety-efficacy window for such agents remains uncertain.

Future approaches to cytokine inhibition will likely entail gene silencing via delivery of inhibitory RNA species, or gene therapeutic delivery of inhibitory biologic agents.
Cytokine targeting in inflammatory diseases

Cytokine inhibitors are now widely used in several chronic inflammatory diseases including rheumatoid arthritis, inflammatory bowel disease, sarcoidosis, psoriasis, psoriatic arthritis, uveitis, and vasculitis and are under investigation in many more conditions. They are dealt with in detail in disease-related chapters. Whereas we have clearly used such therapeutics to inform the pathogenesis of diseases, we have also learned critical lessons from the use of these agents about the basic biology of cytokines:

• First, it appears that cytokines exist in highly regulated networks and that there are critical checkpoints in such networks at which inhibition leads to general suppression of the cascade.
• Second, there is redundancy in the cytokine system such that single cytokine blockade, although of therapeutic utility, does not lead to paralysis of the host-defence capability. That said, there are some defence processes that may heavily rely on one cytokine; thus TNF appears critical for granuloma formation, hence susceptibility to tuberculosis and some fungal infections in TNF-blocked patients.
• Third, from pharmacokinetic (PK) studies we can deduce that cytokine effects in tissues are quantitatively regulated and thresholds may exist for some of their effector function.
• Fourth, we have not yet established sufficient rationale or understanding of cytokine networks to the level that we can formally combine targeting approaches. Combination targeting in rodent models is highly effective in many disease models, whereas in humans combined biological therapies have thus far led to increased toxicity with no yield in improved efficacy.

Measuring cytokines in biological fluids

There is increasing interest in measurement of cytokines in serum and plasma and ex vivo in cellular supernatants. This is in part driven by basic discovery research.
However, there is intense interest in the potential for cytokines as biomarkers of disease activity and prognosis or for pharmacodynamic purposes to predict drug response or toxicities. Cytokines were originally measured and defined in bioassays using live cell assays in vitro that are impractical for such purposes. Thereafter, enzyme-linked immunosorbent assay (ELISA) or radioimmunoassay became the techniques of choice. Though these are validated and still widely used, the future analysis of cytokines will be facilitated by the use of multiplex technologies, based on laser analysis, which measure up to 30 cytokines in small (<50 µl) volumes. Techniques based on protein chips offer further utility to measure the expression of wider ranges (up to 360 proteins per assay) of cytokines. The critical advance here will be to allow analysis of changes in patterns of cytokines as opposed to changes in single moieties, which will in turn considerably increase the power of this approach. Moreover, such techniques can be adapted to evaluate expression not only in fluid phase but also in tissue extracts to allow evaluation in biopsies.


3.4 Ion channels and disease

Frances Ashcroft and Paolo Tammaro

ESSENTIALS

Ion channels are membrane proteins that act as gated pathways for the movement of ions across cell membranes. They are found in both surface and intracellular membranes and play essential roles in the physiology of all cell types. An ever-increasing number of human diseases are now known to be caused by defects in ion channel function. Ion channel diseases may arise in several different ways:

• Mutations in the coding region of the gene, or its control elements, leading to the gain, or loss, of channel function. Such diseases are often known as channelopathies and their frequency in the general population is usually very low. Many channelopathies are genetically heterogeneous and the same clinical phenotype may be caused by mutations in different genes, as is the case for long-QT syndrome. Conversely, mutations in the same gene may produce different phenotypes. For example, gain-of-function mutations in the epithelial Na+ channel produce Liddle’s syndrome, whereas loss-of-function mutations cause pseudohypoaldosteronism type 1. Disease severity may vary with different mutations in the same gene, as is seen with gain-of-function mutations in KATP channel subunits: all cause neonatal diabetes, but the most functionally severe also cause neurological problems. Defects in expression levels and trafficking, leading to the gain, or loss, of channel density may also cause disease.
• Defective regulation of channel activity by intracellular or extracellular ligands, or by channel modulators. This can be due to mutations in the genes encoding the regulatory molecules themselves, or defects in the pathways leading to their production. For instance, glucokinase mutations cause one type of maturity-onset diabetes of the young (MODY2) by impairing the metabolic regulation of ATP-sensitive K+ channels in pancreatic β cells.
• Autoantibodies to ion channel proteins, which may either downregulate or enhance channel function.
• Ion channels that act as lethal agents. These are secreted by cells and insert into the membrane of the target cell to form large non-selective pores that cause cell lysis and death. Examples include bacterial toxins such as staphylococcal α-toxin and the amoebapore of Entamoeba histolytica. The membrane-attack complex of complement, perforin, and the defensins also act in this way.

Properties of ion channels

To understand how ion channel defects give rise to disease, it is helpful to understand how ion channel proteins work. This chapter therefore considers what is known of ion channel structure, explains the properties of the single ion channel, and shows how single-channel currents give rise to action potentials and synaptic potentials.

Ion channel structure

Some ion channels consist of a single subunit, as in the case of the Ca2+-release channel of the sarcoplasmic reticulum. In other cases, the channel pore is formed from a single (α) subunit but associated regulatory subunits may modify the ion channel properties, as in the case of voltage-gated Na+ and Ca2+ channels. Yet other ion channels are multimeric and several subunits are involved in pore formation: the nicotinic acetylcholine receptor comprises five subunits (2α, β, δ, and either γ or ε), while the voltage-gated K+ channels are composed of four subunits (which are sometimes, but not invariably, identical). Mutations in both pore-forming and regulatory subunits can cause disease.

The multimeric nature of an ion channel may influence whether a channelopathy is inherited in a dominant or recessive fashion. Individuals who are heterozygous for voltage-gated K+ channel mutations will express both mutant and wild-type subunits in the same cell. If the mutant subunits coassemble with wild-type subunits to form hetero-oligomeric channels that are non-functional, the resulting K+ current will be much smaller than if hetero-multimerization does not occur. This is known as the ‘dominant-negative’ effect and may give rise to a disease that is dominantly inherited.

Single-channel properties

An ion channel can either be open or closed. When it is open, permeant ions are able to move through the channel pore. The current flowing through the open pore is known as the single-channel current. Its magnitude is determined by the ion concentrations on either side of the membrane (the chemical gradient), by the membrane potential (the electrical gradient), and by the ease with which the ion can move through the channel pore (its permeability). At the equilibrium potential of an ion, the electrical and chemical gradients are equal in magnitude but opposite in direction, and thus there is no net ion flux. The single-channel conductance (γ) is a measure of the permeability of the ion and can be approximated by the single-channel current (i) divided by the membrane potential (γ = i/V).

Ion channels are often highly selective in the ions they conduct. K+ channels, for example, are far more permeable to K+ than to Na+, while Na+ channels conduct Na+ but discriminate against K+. Ion selectivity takes place within a narrow region of the pore known as the selectivity filter. While some ions are excluded on the basis of their size or their charge, hydrophobic interactions and the energy required to remove the waters of hydration can also be important. Different types of ion channel may utilize different mechanisms to achieve selectivity.

The fraction of time the channel spends in the open state is known as the open probability. Some channels open and close at random, but in other channels gating is regulated. In voltage-gated channels the open probability is determined by the membrane potential, whereas in ligand-gated channels it is regulated by the binding of extracellular or intracellular ligands. Gating may also be subject to modulation, a process in which channel opening or closing is modified, usually by one of several factors, such as ion or lipid binding, G-protein interactions, or post-translational modifications like protein phosphorylation and sumoylation (covalent attachment of a small ubiquitin-related modifier, SUMO, to a protein). Gating is believed to involve conformational changes in the channel structure that result in the opening or closing of the pore.

Ion channels are also influenced by the potential difference across the cell membrane, which usually lies between –60 and –100 mV at rest.
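The equilibrium potential and single-channel conductance described above can be illustrated numerically. The following is a minimal sketch, not part of the original chapter: it assumes the standard Nernst equation for the equilibrium potential (the text names the concept but does not give the formula), uses illustrative K+ concentrations of 5 mM outside and 140 mM inside, and the function names are hypothetical. The conductance function accepts an optional reversal potential; with the default of zero it reduces to the approximation γ = i/V given in the text.

```python
import math

GAS_CONSTANT = 8.314       # J/(mol*K)
FARADAY_CONSTANT = 96485.0  # C/mol


def nernst_potential_mV(conc_out_mM, conc_in_mM, valence, temp_C=37.0):
    """Equilibrium potential (mV) of one ion species from the Nernst equation.

    Assumption: the Nernst equation is used here as the standard way to
    compute the equilibrium potential; the source text does not state it.
    """
    temp_K = temp_C + 273.15
    volts = (GAS_CONSTANT * temp_K) / (valence * FARADAY_CONSTANT) * math.log(
        conc_out_mM / conc_in_mM
    )
    return 1000.0 * volts


def single_channel_conductance_pS(current_pA, voltage_mV, reversal_mV=0.0):
    """Conductance gamma = i / (V - E_rev), returned in picosiemens.

    With reversal_mV = 0 this is the text's approximation gamma = i / V.
    """
    driving_force_mV = voltage_mV - reversal_mV
    if driving_force_mV == 0:
        raise ValueError("no net current flows at the reversal potential")
    # pA / mV gives nS; multiply by 1000 to express the result in pS.
    return 1000.0 * current_pA / driving_force_mV


if __name__ == "__main__":
    # Illustrative (assumed) K+ concentrations: 5 mM outside, 140 mM inside.
    e_k = nernst_potential_mV(5.0, 140.0, valence=1)
    print(f"E_K is roughly {e_k:.1f} mV")
    # A hypothetical 2 pA inward current recorded at -100 mV.
    print(f"gamma = {single_channel_conductance_pS(-2.0, -100.0):.1f} pS")
```

With these assumed concentrations the K+ equilibrium potential comes out near –89 mV, which sits within the resting range of –60 to –100 mV quoted in the text.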
A change in the membrane potential to a more positive value is known as depolarization; hyperpolarization is a change to more negative potentials. At the resting potential of the cell, most voltage-gated channels are closed. In response to a membrane depolarization, the open probability of the channel is increased. This voltage-dependent activation may be followed by a further conformational transition (inactivation) to an inactivated state in which the channel no longer conducts ions. Recovery from inactivation occurs after a variable period following repolarization to the resting potential. Although most voltage-gated ion channels are opened by depolarization, a few types of voltage-gated channel are activated by hyperpolarization.

Ligand-gated channels are opened (or more rarely closed) by binding of an appropriate ligand to a specific site on the channel protein, which induces a conformational change that allosterically opens the ion pore. The channel may open and close several times while the ligand remains bound to its receptor, but this intrinsic gating ceases on ligand dissociation.

There are numerous different types of channel. For example, even among the inwardly rectifying K+ channels there are seven subfamilies, most of which have several members. In general, ion channels are named after their gating and/or selectivity properties.

Single-channel currents summate to produce macroscopic currents

The cell membrane contains many hundreds of ion channels. The macroscopic current (I) flowing through all ion channels of the same type is determined by the product of the number of channels in the membrane (N), the channel open probability (P), and the single-channel current (i); in other words, I = NPi. Disease-causing mutations may affect any or all of these parameters and thereby influence the macroscopic current.

Cell membranes also contain several different types of channel.
The total current that flows across the cell membrane (the membrane current) represents the sum of the ion fluxes through all the different kinds of ion channel open in the membrane. If it is sufficiently large, the membrane current may cause a change in membrane potential. The size of this voltage change is given by Ohm’s law (V = IR) and is therefore influenced by both the current amplitude (I) and by the membrane resistance (R) (which in turn reflects the number of open channels).

Action potentials

In excitable cells, a depolarizing stimulus may elicit an action potential, a transient change in membrane potential. For example, in nerve axons and skeletal muscle fibres, the action potential results from the initial activation of voltage-gated Na+ channels followed shortly afterwards by activation of voltage-gated K+ channels. Because Na+ channels open rapidly on depolarization, there is an initial inward Na+ current. If this is greater than the outward current flowing through (voltage-independent) K+ channels which are open at the resting potential, it will produce a further depolarization. This activates more Na+ channels and depolarizes the membrane even more. In this way, a regenerative increase in membrane potential is produced. The membrane is returned to its resting level by inactivation of the Na+ channels (which reduces the inward current) and the opening of K+ channels (which produces an outward, hyperpolarizing current).

The potential at which the inward Na+ current exactly balances the outward current through resting K+ channels is known as the threshold potential. It is a critical potential: any increase in the Na+ current will elicit an action potential, while any reduction in the inward current (or increase in the outward current) will prevent action potential generation.
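The two relations given above, I = NPi for the macroscopic current and V = IR for the resulting voltage change, can be combined in a short numerical sketch. This is an illustration under stated assumptions rather than anything taken from the chapter: the channel count, open probability, single-channel current, and membrane resistance are hypothetical values, and the function names are invented for the example.

```python
def macroscopic_current_pA(n_channels, open_probability, single_channel_current_pA):
    """Macroscopic current I = N * P * i for one channel type, in pA."""
    if not 0.0 <= open_probability <= 1.0:
        raise ValueError("open probability must lie between 0 and 1")
    return n_channels * open_probability * single_channel_current_pA


def voltage_change_mV(current_pA, membrane_resistance_Mohm):
    """Voltage change from Ohm's law, V = I * R, in mV.

    pA * MOhm = 1e-12 A * 1e6 Ohm = 1e-6 V, i.e. 1e-3 mV.
    """
    return current_pA * membrane_resistance_Mohm * 1e-3


if __name__ == "__main__":
    # Hypothetical values: 1000 channels, open 20% of the time, 1 pA each.
    wild_type = macroscopic_current_pA(1000, 0.2, 1.0)
    # A mutation that halves channel density (N) halves the current.
    mutant = macroscopic_current_pA(500, 0.2, 1.0)
    for label, current in (("wild type", wild_type), ("mutant", mutant)):
        dv = voltage_change_mV(current, membrane_resistance_Mohm=100.0)
        print(f"{label}: I = {current:.0f} pA, voltage change = {dv:.0f} mV")
```

Because I is a simple product, a change in any one of N, P, or i scales the macroscopic current proportionally, which is why mutations affecting channel density, gating, or conductance can all alter the membrane potential.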
Ion channel mutations may increase nerve or muscle excitability either by enhancing the inward current (as in hyperkalaemic periodic paralysis), or by reducing the outward current (as in some forms of long-QT syndrome). This will produce a larger depolarization, so that the threshold potential is reached more easily and a subsequent action potential is initiated. Other mutations produce a depolarizing block of action potential activity. This results from a maintained membrane depolarization of sufficient amplitude to inactivate the voltage-dependent Na+ channels.

In some cells, additional types of ion channel contribute to the action potential: the ventricular action potential is mediated by voltage-dependent Na+ and Ca2+ channels, and at least four kinds of K+ channel. Several different kinds of K+ channel contribute to the repolarization of action potentials in mammalian neurons, and chloride (Cl–) channels play an important role in the electrical activity of skeletal muscle. The functional importance of these different ion channels is exemplified by the fact that mutations in the genes that encode them produce a range of nerve and muscle diseases.

Synaptic potentials

When a nerve impulse arrives in the presynaptic terminal it opens voltage-gated Ca2+ channels, producing a rise in the intracellular Ca2+ concentration ([Ca2+]i) that triggers the exocytosis of synaptic vesicles. The amount of transmitter released varies with [Ca2+]i and thus with the magnitude of the presynaptic Ca2+ current. In turn, this is influenced by the duration of the membrane depolarization and thus by the amplitude of the voltage-gated K+ current that underlies membrane repolarization. A reduction in the presynaptic K+ current therefore leads to excess transmitter release and postsynaptic hyperexcitability, as in episodic ataxia type 1 and acquired neuromyotonia. Conversely, a reduction in the presynaptic Ca2+ current is associated with reduced transmitter release, as occurs in Lambert–Eaton myasthenic syndrome when the density of presynaptic Ca2+ channels is decreased by receptor internalization induced by the binding of autoantibodies.

Once released, the transmitter diffuses across the synaptic cleft and binds to receptors in the postsynaptic membrane. At the neuromuscular junction, for example, acetylcholine (ACh) binds to the nicotinic acetylcholine receptor (AChR), and opens an intrinsic ion channel. The resulting synaptic current produces a depolarization of the postsynaptic membrane (the endplate potential) which, if it is sufficiently large, triggers an action potential in the muscle fibre. A reduction in AChR density, as in myasthenia gravis, decreases effective transmission, and leads to muscle weakness. Gain-of-function mutations in AChR may also induce myasthenia, by causing prolonged depolarization of the postsynaptic membrane and thereby Na+ channel inactivation. This depolarizing block is the basis of the slow-channel syndromes. Mutations in the voltage-gated Na+ channel of skeletal muscle may cause paralysis, or myotonia.

In skeletal muscle, the action potential is conducted into the interior of the fibre via invaginations of the surface membrane known as the transverse tubules (T-tubules). Depolarization of the T-tubule membrane stimulates the opening of Ca2+-release channels (RyR) in the membrane of the sarcoplasmic reticulum (SR), the intracellular Ca2+ store.
The T-tubule and SR membranes are not directly connected and the precise mechanism by which they interact is not fully understood. However, there is evidence that the α1-subunit of the voltage-gated Ca2+ channel in the T-tubule membrane acts as the voltage sensor for the Ca2+-release channels in the SR membrane. Mutations in the Ca2+-release channel of skeletal muscles cause malignant hyperthermia and central core disease.

The channelopathies

This section provides brief descriptions of a selected range of channelopathies. Table 3.4.1 lists these diseases, the channels involved, their gene names, and chromosomal locations. The list is far from exhaustive. Additional details may be found elsewhere in the Oxford textbook of medicine or in the books and websites referenced at the end of this chapter.

Neuronal channelopathies

Epilepsy

Many different ion channels have been implicated in the epilepsies, including both voltage-gated and ligand-gated channels. Channelopathies make up a major group of genes implicated in the epileptic encephalopathies, which are severe epilepsies typically beginning in infancy and childhood and associated with developmental slowing and often regression.

The prototypic form of these disorders is Dravet syndrome (previously known as severe myoclonic epilepsy of infancy) in which more than 80% of patients have mutations in the gene SCN1A, which encodes the α1-subunit of the voltage-gated Na+ channel. Seizures are often precipitated by fever, hot temperatures, or vaccination. Recently, increasing numbers of patients have been identified with encephalopathies due to mutations in other Na+ channel (SCN2A, SCN8A), K+ channel (KCNQ2, KCNT1), and Ca2+ channel (CACNA1A) genes. Mutations in ligand-gated ion channels, such as γ-aminobutyric acid (GABA) receptors and glutamate receptors, also cause epileptic encephalopathies.
Most of these mutations arise de novo in the affected individual, but parental mosaicism is becoming increasingly appreciated and is important for reproductive counselling regarding recurrence risk.

An important, emerging picture is that many of the channelopathies are associated with a spectrum of epilepsies from self-limited (previously called benign) epilepsies to severe phenotypes such as the encephalopathies. SCN1A mutations are found in 10–20% of large families with genetic epilepsy with febrile seizures plus (GEFS+). These families have marked phenotypic heterogeneity, ranging from febrile seizures in some individuals to more severe focal and generalized epilepsies in others. GEFS+ has also been linked to mutations in the β1-subunit of the voltage-gated Na+ channel (SCN1B). The presence of the β-subunit accelerates both the rate of inactivation, and the rate of recovery from inactivation, of the voltage-gated Na+ channel. Precisely how SCN1A and SCN1B mutations lead to GEFS+ remains unclear. Similarly, SCN2A is associated with the syndrome of benign familial neonatal-infantile epilepsy, while KCNQ2 and KCNQ3 are linked with benign familial neonatal epilepsy; both are mild self-limited disorders occurring in individuals of normal intellect (see below).

Identification of a causative mutation is important as it may carry treatment implications. For example, in Dravet syndrome, in which loss-of-function SCN1A mutations are usual, Na+ channel blockers such as carbamazepine are contraindicated as they bring out and/or exacerbate myoclonic seizures. In contrast, clinical observations suggest that Na+ channel blockers such as phenytoin and carbamazepine are effective in SCN2A and SCN8A encephalopathies in which gain-of-function mutations are seen.
Benign familial neonatal convulsions

Benign familial neonatal convulsions (BFNC) is characterized by neonatal convulsions within the first 7 days after birth that normally show spontaneous remission by the third month of life. There is an increased risk of epilepsy in later life in 10 to 15% of individuals. Mutations in the voltage-gated K+ channel genes KCNQ2 and KCNQ3 are associated with BFNC.

KCNQ2 and KCNQ3 associate in a heteromeric complex to form the M-channel. This channel plays a critical role in determining the electrical excitability of many neurons. It is slowly activated when the membrane is depolarized to around the threshold level for action potential firing, thereby hyperpolarizing the membrane back towards its resting level. This reduces neuronal excitability by limiting the spiking frequency and decreasing the responsiveness of the neuron to synaptic inputs. Some BFNC mutations result in reduced channel density. Others alter the channel kinetics. Both are expected to lead to neuronal hyperexcitability, accounting for the epileptic seizures. Because the M-channel is a heteromer of KCNQ2 and KCNQ3, mutations in either gene will disrupt channel function and cause BFNC.

Table 3.4.1  Examples of ion channel genes associated with disease

Gene | Chromosome location | Protein | Disease

Neuronal diseases
SCN1A | 2q24 | Voltage-gated Na+-channel α-subunit, Nav1.1 | Dravet syndrome, epilepsy (GEFS+ type-2)
SCN2A | 2q23–q24.3 | Voltage-gated Na+-channel α-subunit, Nav1.2 | Benign familial infantile seizures
SCN8A | 12q13.3 | Voltage-gated Na+-channel α-subunit, Nav1.6 | Infantile epileptic encephalopathy
SCN9A | 2q24 | Voltage-gated Na+-channel α-subunit, Nav1.7 | Erythermalgia, paroxysmal extreme pain disorder, congenital indifference to pain
SCN1B | 19q13.1 | Voltage-gated Na+-channel β-subunit | Epilepsy (GEFS+ type-1)
KCNA1 | 12p13 | Voltage-gated K+ channel, Kv1.1 | Episodic ataxia type-1
KCNQ2 | 20q13.3 | Voltage-gated K+ channel | Epilepsy (BFNC)
KCNQ3 | 8q24 | Voltage-gated K+ channel | Epilepsy (BFNC)
CACNA1A | 19p13.1 | Voltage-gated Ca2+ channel α-subunit (P/Q type) | Episodic ataxia type-2, familial hemiplegic migraine, and spinocerebellar ataxia type-6
CACNB4 | 2q22–q23 | Voltage-gated Ca2+ channel β4-subunit | Juvenile myoclonic epilepsy; generalized epilepsy and praxis seizures
CHRNA4 | 20q13.2–13.3 | nACh-receptor α4-subunit | Epilepsy (nocturnal frontal lobe epilepsy type-1)
CHRNB2 | 1q21 | nACh-receptor β-subunit | Epilepsy (nocturnal frontal lobe epilepsy type-3)
GLRA1 | 5p32 | Glycine receptor α1-subunit | Hyperekplexia (startle disease)
GJB1 | Xq13.1 | Connexin 32 | Charcot–Marie–Tooth disease

Cardiac muscle diseases
SCN5A | 3p21–24 | Voltage-gated Na+-channel α-subunit | Long-QT syndrome (LQT3), Brugada syndrome, congenital conduction defects, atrial fibrillation
KCNQ1 | 11p15.5 | Voltage-gated K+ channel α-subunit | Long-QT syndrome (LQT1), short QT syndrome, atrial fibrillation (Romano–Ward syndrome, Jervell–Lange–Nielsen syndrome)
KCNH2 | 7q35–36 | Voltage-gated K+ channel α-subunit (HERG) | Long-QT syndrome (LQT2), short QT syndrome
KCNE1 | 21q22.1–q22.1 | Voltage-gated K+-channel β-subunit (MinK) | Long-QT syndrome (LQT5), Jervell–Lange–Nielsen syndrome
KCNE2 | 21q22.1 | Voltage-gated K+-channel β-subunit (MiRP1) | Long-QT syndrome (LQT6), atrial fibrillation
KCNJ2 | 17q24.3 | Inwardly rectifying K+ channel (Kir2.1) | Andersen syndrome, atrial fibrillation
HCN4 | 15q24.1 | Hyperpolarization-activated K+ channel | Sick sinus syndrome
CACNA1C | 12p13.33 | Voltage-gated Ca2+ channel | Timothy syndrome
RYR2 | 1q42.1–q43 | Ca2+ release channel of cardiac SR | Ventricular tachycardia

Skeletal muscle diseases
SCN4A | 17q23–q25 | Voltage-gated Na+-channel α-subunit | HyperPP, PAM, paramyotonia congenita
CACNA1S | 1q32 | Voltage-gated Ca2+ channel α-subunit (L-type) | Hypokalaemic periodic paralysis, malignant hyperthermia
KCNE3 | 11q13–14 | Voltage-gated K+-channel β-subunit (MiRP2) | Hypokalaemic periodic paralysis
KCNJ2 | 17q23 | Inward rectifier K+ channel, Kir2.1 | Andersen syndrome
CLCN1 | 7q35 | Voltage-gated Cl– channel, ClC-1 | Myotonia congenita, generalized myotonia
RYR1 | 19q13.1 | Ca2+-release channel of SR | Malignant hyperthermia, central core disease
CHRNA1 | 2q24–q32 | nACh-receptor α1-subunit | Slow-channel syndrome (SCS), fast-channel syndrome (FCS)
CHRNB1 | 17p12–p11 | nACh-receptor β-subunit | SCS, nAChR deficiency syndrome
CHRND | 2q33–q34 | nACh-receptor δ-subunit | SCS, FCS
CHRNE | 17p13.1 | nACh-receptor ε-subunit | SCS, nAChR deficiency syndrome

Kidney diseases
KCNJ1 | 11q24 | Inward rectifier K+ channel, Kir1.1 | Bartter’s syndrome (type II)
KCNJ10 | 1q23.2 | Inward rectifier K+ channel, Kir4.1 | SeSAME syndrome
CLCNKB | 1p36 | Voltage-gated Cl– channel | Bartter’s syndrome (type III)

(continued)

Table 3.4.1  Continued

Gene | Chromosome location | Protein | Disease

Kidney diseases (continued)
CLCN5 | Xp11.22 | Voltage-gated Cl– channel, ClC-5 | Nephrolithiasis (Dent’s disease; see note a)
SCNN1A | 12p13 | Epithelial Na+-channel α-subunit | Pseudohypoaldosteronism (PHA-1)
SCNN1B | 16p13–p12 | Epithelial Na+-channel β-subunit | Liddle’s syndrome, PHA-1, bronchiectasis (BESC)
SCNN1G | 16p13–p12 | Epithelial Na+-channel γ-subunit | Liddle’s syndrome, PHA-1, BESC
AQP2 | 12q13 | Aquaporin 2 (water channel) | Nephrogenic diabetes insipidus
PKD1 | 16p13.3 | Polycystin 1 (associates with PKD2) | Polycystic kidney disease
PKD2 | 4q22.1 | TRPP2 channel (polycystin 2) | Polycystic kidney disease

Other diseases
KCNJ11 | 11p15.1 | ATP-sensitive K+ channel subunit, Kir6.2 | Neonatal diabetes, congenital hyperinsulinaemia of infancy
ABCC8 | 11p15.1 | ATP-sensitive K+ channel subunit, SUR1 | Neonatal diabetes, congenital hyperinsulinaemia of infancy
KCNJ8 | 12p12.1 | ATP-sensitive K+ channel subunit, Kir6.1 | Cantu syndrome
ABCC9 | 12p12.1 | ATP-sensitive K+ channel subunit, SUR2 | Cantu syndrome
CFTR | 7q31 | CFTR Cl– channel | Cystic fibrosis
CLCN7 | 16p13 | Voltage-gated Cl– channel, ClC-7 | Osteopetrosis
CNGA1 | 4p12–cen | Cyclic nucleotide-gated channel α-subunit | Retinitis pigmentosa
STIM1 | 11p15.5 | CRAC channel subunit | Immunodeficiency and autoimmunity syndrome
ORAI1 | 12q24 | CRAC channel subunit | Immunodeficiency and autoimmunity syndrome
GJB2 | 13q11–q12 | Connexin 26 | Deafness (DFNA3 and DFNB1), Vohwinkel’s syndrome
GJB3 | 1p35.1 | Connexin 31 | Nonsyndromal sensorineural deafness (DFNA2), erythrokeratodermia variabilis
GJB6 | 13q12 | Connexin 30 | Deafness (DFNA3), ectodermal dysplasia
GJA3 | 13q11 | Connexin 46 | Cataract (zonular pulverulent type-3)
GJA8 | 1q21.1 | Connexin 50 | Cataract (zonular pulverulent type-1)

BFNC, benign familial neonatal convulsions; GEFS+, generalized epilepsy with febrile seizures plus; HyperPP, hyperkalaemic periodic paralysis; PAM, potassium-aggravated myotonia; PHA-1, pseudohypoaldosteronism type 1; BESC, bronchiectasis with or without elevated sweat chloride.
Note a: Dent’s disease is now recognized to include X-linked recessive nephrolithiasis, X-linked hypophosphataemic rickets, and a renal tubular defect in Japanese children.

Episodic ataxia type 1

Episodic ataxia type 1 (familial periodic cerebellar ataxia with myokymia) is an autosomal dominant disorder that causes ataxia accompanied by myokymia, nausea, vertigo, and headache. It results from mutations in the voltage-gated K+ channel Kv1.1, which is expressed in the synaptic terminals and dendrites of many brain neurons. These mutations either prevent the formation of functional channels or result in a reduced K+ current. This is expected to prolong the neuronal action potential, inducing repetitive firing and excessive and unregulated transmitter release, and thereby produce the clinical symptoms of ataxia and myokymia.

Familial hemiplegic migraine, episodic ataxia type 2, and spinocerebellar ataxia type 6

There are three human diseases with different phenotypes that are associated with mutations in the same Ca2+-channel gene, CACNA1A. These are familial hemiplegic migraine (FHM), episodic ataxia type 2 (EA-2), and spinocerebellar ataxia type 6 (SCA-6). All three diseases result in progressive cerebellar atrophy, but they differ in the extent and rate of progression of neuronal degeneration, with SCA-6 showing the greatest atrophy, and FHM the least. Migraine-like symptoms also occur in all three diseases and are most severe in patients with FHM, who suffer transient hemiparesis. EA-2 and SCA-6 are also characterized by ataxia and nystagmus. FHM is associated with missense mutations. In mice, these lead to an increase in the P/Q type Ca2+ current of cerebellar and cortical neurons and an enhanced tendency to cortical spreading depression, which may underlie the migraine.

Startle disease (hyperekplexia)

Glycine is the major inhibitory transmitter in the brainstem and spinal cord. It binds to a ligand-gated Cl– channel, producing an increase in Cl– permeability that reduces the membrane depolarization and neuronal firing induced by excitatory neurotransmitters. The glycine receptor is a pentamer of three α-subunits, which contain the glycine-binding site, and two β-subunits. In humans, two types of the α-subunit have been identified. Mutations in the gene encoding the α1-subunit of the glycine receptor give rise to startle disease (hyperekplexia). This is an autosomal dominant neurological disorder characterized by muscle spasm in response to an unexpected stimulus. It manifests as facial grimacing, hunching of the shoulders, clenching of the fists, exaggerated jerks of the limbs and sudden falls. Startle disease mutations produce a dramatic decrease in glycine-activated currents. Because glycinergic interneurons are important for normal spinal cord reflexes, muscle tone, and the pattern of motor neuron firing during movement, this leads to excessive and uncontrolled movements.

Charcot–Marie–Tooth disease

Charcot–Marie–Tooth disease type 1 (CMT1) causes progressive degeneration and demyelination of the peripheral nerves. It is genetically heterogeneous, but the X-linked form of the disease results from mutations in the gap junction channel connexin 32 (Cx32). It shows incomplete dominant inheritance, with heterozygous females being affected less severely than hemizygous males. The phenotype may vary from mild, in which the patient has a normal gait, to a severe form which may necessitate the use of a walking stick or wheelchair.

More than 100 mutations in Cx32 have been identified. They fall into two main groups: those in which the protein never reaches the plasma membrane, and those where the protein reaches the membrane but forms channels with altered functional properties. The former give rise to a severe phenotype, whereas the latter may be associated with either mild or severe phenotypes, according to whether they partially or completely disrupt channel function.

The Cx32 protein is primarily expressed in the Schwann cells of peripheral myelinated nerves, at the nodes of Ranvier and at Schmidt–Lanterman incisures. In these regions, the myelin is not complete and there is a thin layer of cytoplasm between each of the enveloping turns of the Schwann cell. This suggests that Cx32 may serve as a short-cut pathway for nutrients and other substances moving to the innermost layers of the Schwann cell, and perhaps also to the axon itself. This might explain why loss of Cx32 function causes axonal degeneration and demyelination.

Familial pain syndromes

Mutations in the peripheral nerve voltage-gated Na+ channel Nav1.7 (SCN9A) cause familial pain disorders.
Gain-of-function mutations produce inherited erythermalgia, paroxysmal extreme pain disorder (PEPD), and idiopathic small fibre neuropathy. Erythermalgia is characterized by episodes of erythema and burning pain of the lower legs and feet that usually are provoked by warmth or exercise. The pain can be extreme. PEPD is associated with severe pain triggered by bowel movements: it may be accompanied by nonepileptic seizures and cardiac problems. These symptoms arise because Nav1.7 is expressed in nociceptive neurons and activating mutations enhance their excitability.

By contrast, loss-of-function mutations in Nav1.7 lead to impaired action potential transmission and a reduced ability to sense pain. Patients may not recognize they have hurt themselves as they feel no pain from bone fractures or walking on hot coals. A drug that inhibits Nav1.7 might be an effective therapy for chronic pain and is the subject of much current pharmaceutical research effort.

Cardiac muscle channelopathies

The ion channels that underlie the cardiac action potential differ in different regions of the heart (ventricle, atria, Purkinje cell, SA node, and so on), accounting for the fact that the action potentials in these regions have a different time course and duration. Mutations in the genes encoding these channels can cause a range of cardiac arrhythmias.

Long-QT syndrome

Long-QT syndrome is a congenital cardiac disorder associated with an abrupt loss of consciousness and sudden death from ventricular arrhythmia in children and young adults. It is characterized by an abnormally long QT interval in the electrocardiogram, which reflects the delayed repolarization of the ventricular action potential. This predisposes to torsade de pointes and ventricular fibrillation. The duration of the cardiac action potential is determined by the balance between the inward and outward currents flowing during the plateau phase.
Prolongation of the action potential can therefore be caused by a persistent inward current or by a reduction in outward K+ currents.

Several different cardiac ion channels are associated with long-QT syndrome, the most common being KCNQ1, KCNH2 (HERG), and SCN5A (Table 3.4.1). The IKs channel is a complex of two different proteins, KCNQ1 and MinK. Likewise, IKr is a complex of HERG and MiRP1. Mutations in these four genes either abolish or markedly decrease the repolarizing K+ currents IKs and IKr, and are therefore expected to prolong the cardiac action potential and increase the QT interval. Mutations in the cardiac muscle Na+ channel gene (SCN5A) also cause long-QT syndrome. These mutations affect Na+ channel inactivation, producing a sustained inward current that results in an increased action potential duration. The larger the component of non-inactivating current, the more severe the phenotype.

In many cases, long-QT syndrome is not inherited but acquired. For example, drugs that block IKr or IKs currents prolong the cardiac action potential and induce long-QT syndrome. Among these are the antibiotic erythromycin, the class III antiarrhythmic agents such as sotalol, dofetilide, and quinidine (which selectively block IKr) and the antihistamine H1-receptor antagonists terfenadine and astemizole (which block HERG). In most people, terfenadine does not produce cardiac problems as it is rapidly broken down in the liver and its metabolite, terfenadine carboxylate, does not block IKr. However, if the activity of the P450 enzymes that break down terfenadine is impaired (due to liver disease or drugs such as ketoconazole and the macrolide antibiotics), there is a risk of torsade de pointes.

Other cardiac arrhythmias

Many other cardiac arrhythmias result from ion channel mutations. These include short QT syndrome, Brugada syndrome, Lev–Lenègre syndrome, Timothy syndrome, and atrial fibrillation.
Short QT syndrome is associated with a reduced QT interval and is caused by mutations in the K+ channels KCNH2, KCNJ2, and KCNQ1 that are believed to lead to a gain of function. Brugada syndrome was initially identified as being due to mutations in the Na+ channel SCN5A; other channels have subsequently been implicated, although the most recent work has cast doubt on many such reports. It leads to right ventricular conduction abnormalities, ventricular fibrillation, and sudden cardiac death in young people. Lev–Lenègre syndrome is a progressive conduction disorder also caused by loss-of-function mutations in SCN5A.

Timothy syndrome is characterized by multiorgan dysfunction, including severe arrhythmias, and is associated with high mortality. It is due to gain-of-function mutations in the Ca2+ channel CACNA1C, which lead to a longer QT interval.

Catecholaminergic ventricular tachycardia is a cardiac arrhythmia triggered by physical or emotional stress that can lead to syncope or sudden cardiac death. About half of cases are due to mutations in the sarcoplasmic Ca2+ channel RYR2 which lead to increased intracellular Ca2+ release and thereby arrhythmia.

Andersen syndrome is associated with mutations in the inwardly rectifying K+ channel KCNJ2. It is a complex multisystem disorder characterized by ventricular arrhythmia, periodic paralysis, and dysmorphic features of both the skeleton and face.

Atrial channelopathies

Atrial fibrillation is one of the most common arrhythmias, occurring in about 1% of the general population and increasing with age. It can also be caused by mutations in Na+ (SCN5A, SCN1B, SCN2B) and K+ channels (KCNQ1, KCNE2, KCNA5, KCNJ2, and KCNH2). These mutations generally cause loss of Na+ current or gain of K+ current and may result in shortening of action potential duration and effective refractory period, which can precipitate atrial fibrillation. Of note is the case of a mutation in KCNQ1 (R14C) which causes an increased K+ current only when the cell is exposed to stretch (experimentally achieved with exposure to a hypotonic solution). This finding emphasizes the importance of genetic and environmental interactions in the development of the disease.

Rare mutations in the hyperpolarization-activated cyclic nucleotide-gated channel HCN4, which underlies the pacemaker current in sinoatrial node cells, lead to sick sinus syndrome. This is characterized by idiopathic sinus bradycardia and chronotropic incompetence. In some families, long-QT and torsade de pointes have also been seen. In addition, increased HCN4 expression may occur during cardiac hypertrophy and congestive heart failure, and contribute to the increased risk of arrhythmia.

In addition to mutations in ion channel genes themselves, an increasing number of disorders have been found to associate with mutations in genes that dictate the density of ion channels in the membrane or regulate their function.
For example, mutations in caveolin-3 or α1-syntrophin enhance SCN5A currents, so causing long-QT syndrome, and mutations in calsequestrin affect the extent of Ca2+ release through RYR2 and give rise to catecholaminergic ventricular tachycardia.

Skeletal muscle channelopathies

Myasthenia gravis, slow-channel, fast-channel, and AChR deficiency syndromes

Myasthenia gravis is usually produced by autoantibodies directed against the nicotinic acetylcholine receptor (nAChR), as discussed elsewhere. These antibodies lead to loss of nAChR due to internalization and thus to a smaller endplate potential that fails to reach the threshold for action potential initiation.

At least three different congenital myasthenic syndromes are produced by mutations in the muscle nAChR channel. Slow-channel syndrome (SCS) mutations are found in all four subunits of the adult channel (α, β, δ, ε) and result in protracted channel activation by acetylcholine. The increase in channel open probability produces a prolonged synaptic current and endplate potential. Impaired neuromuscular transmission is thought to result from a combination of three pathogenic mechanisms. First, temporal summation of endplate potentials can occur at physiological rates of stimulation, leading to prolonged depolarization of the muscle membrane, inactivation of voltage-gated Na+ channels, and failure of muscle excitability. A similar ‘depolarization block’ is observed with acetylcholinesterase (AChE) inhibitors or with nAChR agonists like suxamethonium. Second, the prolonged endplate potential causes excess Ca2+ entry and activation of proteolytic enzymes, which may account for the progressive destruction of the postsynaptic neuromuscular junction observed in SCS: loss of junctional nAChRs and destruction of the junctional folds have been reported. Abnormal channel openings in the absence of acetylcholine may also contribute to the ‘endplate myopathy’. Third, the slow-channel mutations give an increased propensity for the nAChR to enter a desensitized state in which it is unable to respond to acetylcholine.

Fast-channel syndrome (FCS) is the converse of SCS: nAChR mutations shorten channel openings, thereby reducing the endplate potential amplitude below that required to trigger action potentials.
nAChR deficiency, the most common congenital myasthenic syndrome, results from mutations (often in the ε subunit) that impair channel assembly and insertion into the plasma membrane.

Mutations in genes that affect the clustering and/or density of nAChR at the synapse, such as AGRN, LRP4, MuSK, and DOK7, are another cause of congenital myasthenic syndromes, and mutations in the early steps of the N-glycosylation pathway may affect both channel assembly and insertion into the plasma membrane as well as AChR clustering.

Acetylcholinesterase inhibitors ameliorate the symptoms of nAChR deficiency and FCS but exacerbate those of SCS. Patients with DOK7 and MuSK mutations show a dramatic response to salbutamol whereas AChE inhibitors are detrimental, although the mechanism is unclear. Recently it has been found that patients with severe nAChR deficiency on treatment with cholinesterase inhibitors respond very well to the addition of oral salbutamol or ephedrine. SCS often benefits from treatment with open channel blockers of nAChR, such as fluoxetine or quinidine. As expected, nAChR genetic disorders are unresponsive to immunotherapies.

The periodic paralyses

Hyperkalaemic periodic paralysis, paramyotonia congenita, and the potassium-aggravated myotonias result from mutations in the α-subunit of the human skeletal muscle Na+ channel. All are inherited as dominant traits and usually present within the first or second decade of life.

Hyperkalaemic periodic paralysis (HyperPP) may occur spontaneously, but attacks are commonly precipitated by exercise, stress, fasting, or eating potassium-rich foods. Paralysis is often preceded by signs of muscle hyperexcitability such as myotonia or fasciculations. Attacks vary in duration (minutes to hours) and may be so severe that the patient is unable to remain standing. The condition is associated with a raised blood K+ concentration (5–7 mM).
Paramyotonia congenita is precipitated by cold and (in contrast to most classical myotonias) aggravated by exercise. In some patients, the myotonia may be followed by prolonged paralysis. Potassium-aggravated myotonia is characterized by myotonia without muscle weakness or paralysis. It can be distinguished from classical myotonias by the fact that the myotonia is exacerbated by a mild elevation of the plasma K+ concentration.

All three types of disorder result from mutations in the α-subunit of the skeletal muscle Na+ channel (SCN4A), which disrupt Na+ channel inactivation. As a consequence, they produce a persistent inward current that causes a tonic depolarization of the muscle membrane (the larger the current, the greater the depolarization). The magnitude of the depolarization determines whether myotonia or paralysis occurs. A small depolarization causes membrane hyperexcitability by lowering the action potential threshold, whereas a large depolarization can lead to Na+ channel inactivation and thereby paralysis. It is still not understood how cold or an elevated plasma K+ level precipitates attacks.

Myotonia

Loss-of-function mutations in the gene CLCN1 encoding the skeletal muscle Cl– channel produce two forms of myotonia: autosomal dominant myotonia congenita (Thomsen’s disease) and autosomal recessive generalized myotonia (Becker’s disease). Clinical descriptions of the disease can be found in Chapter 24.19.3.

In normal skeletal muscle, the Cl– conductance accounts for between 70 and 80% of the resting membrane conductance. Mutations in CLCN1 that result in a loss of functional Cl– channels will therefore produce a marked increase in the input resistance of the muscle fibre. Consequently, muscle excitability will be enhanced (because a smaller Na+ current will be sufficient to trigger an action potential). The elevated input resistance also produces a reduced rate of action potential repolarization, which enhances muscle excitability.

An important role of the muscle Cl– conductance is to counteract the depolarizing effect of K+ accumulation in the transverse tubular system that accompanies muscle activity. During an action potential, K+ ions leave the muscle fibre. In normal muscle, the amount of K+ entering the transverse tubular system during a single action potential is not sufficient to alter the membrane potential, because the tubular Cl– conductance is very high. But in myotonic muscle, the Cl– conductance is very low and a small rise in tubular K+ produces a significant depolarization following an action potential. If several action potentials occur in rapid succession, summation of the after-depolarizations may be sufficient to trigger spontaneous action potentials and thereby myotonia.
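To put rough numbers on the conductance argument above: if Cl– channels carry 70 to 80% of the resting membrane conductance, losing them leaves only 20 to 30% of the total, so the input resistance rises roughly 3.3- to 5-fold and the current needed to produce a given depolarization falls in the same proportion. The following is a minimal sketch of that arithmetic. It assumes a purely passive (ohmic) membrane and complete loss of Cl– conductance; both simplifications, and the function name, are illustrative assumptions rather than anything stated in the text.

```python
def input_resistance_fold_change(cl_fraction: float) -> float:
    """Fold increase in input resistance when the Cl- conductance is lost.

    Assumes a passive membrane, where input resistance is the reciprocal of
    total resting conductance, and that cl_fraction is the share of resting
    conductance carried by Cl- channels (0 <= cl_fraction < 1).
    """
    if not 0.0 <= cl_fraction < 1.0:
        raise ValueError("cl_fraction must be in [0, 1)")
    remaining_conductance = 1.0 - cl_fraction  # fraction left without Cl- channels
    return 1.0 / remaining_conductance


if __name__ == "__main__":
    # The 70-80% range quoted in the text for normal skeletal muscle.
    for cl_fraction in (0.70, 0.80):
        fold = input_resistance_fold_change(cl_fraction)
        # By Ohm's law, the current needed for a fixed depolarization scales as 1/R_in.
        relative_current = 1.0 / fold
        print(
            f"Cl- share {cl_fraction:.0%}: input resistance x{fold:.1f}, "
            f"current for the same depolarization x{relative_current:.2f}"
        )
```

Real muscle fibres are not purely passive, and most disease-causing mutations reduce rather than abolish the Cl– conductance, so these figures are best read as an upper bound on the effect.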
Mutations in CLCN1 give rise to both recessive and dominant forms of myotonia. This may be because the muscle Cl– channel is a dimer. In heterozygotes, mutant subunits might combine with wild-type subunits to form heteromeric channels. The extent to which the mutant subunit reduced the function of the heteromeric channel would thus dictate the severity of myotonia. Total inactivation of the channel by a single mutant subunit (the dominant-negative effect) would produce dominant myotonia, whereas recessive myotonia might occur if the heteromeric channel was unaffected by the mutant subunit.

Malignant hyperthermia and central core disease

Mutations in the ligand-gated Ca2+ channel of skeletal muscle cause malignant hyperthermia and central core disease. This channel mediates Ca2+ release from the sarcoplasmic reticulum, allowing Ca2+ to enter the cytoplasm and activate the contractile proteins. It is also known as the ryanodine receptor (or RYR1) because it binds the alkaloid ryanodine with high affinity.

Malignant hyperthermia (MH) is one of the main causes of death due to anaesthesia. In susceptible individuals, common inhalation anaesthetics or depolarizing muscle relaxants trigger accelerated skeletal muscle metabolism, muscle contractures, hyperkalaemia, arrhythmias, respiratory and metabolic acidosis, and a rapid rise in body temperature (as much as 1°C every 5 min). It is thought that this is due to stimulation of Ca2+ release from the SR, which produces a sustained increase in intracellular Ca2+. This activates both metabolic and contractile activity; the former results in respiratory and metabolic acidosis and the latter produces the elevation in body temperature. The syndrome can be treated with dantrolene sodium, which blocks Ca2+ release from the SR. Malignant hyperthermia is genetically heterogeneous and is not linked to RYR1 in all families.
Central core disease (CCD) is an autosomal dominant, non-progressive myopathy that presents in infancy as proximal muscle weakness and hypotonia. Diagnosis is by muscle biopsy, which reveals that regions of type 1 skeletal muscle fibres (known as ‘central cores’) are depleted of mitochondria and oxidative enzymes. The disease is often associated with a predisposition to malignant hyperthermia and results from mutations in RYR1. Thus CCD and MH are allelic disorders of the same gene. It is not clear how the different phenotypes arise, especially because the same mutation can give rise to MH in some individuals and CCD in others. Because all CCD patients are MH-susceptible, it is possible that additional factors are necessary for the development of central core disease.

Kidney channelopathies

Liddle’s syndrome

Liddle’s syndrome is a congenital form of salt-sensitive hypertension characterized by a very high rate of renal Na+ uptake despite low levels of aldosterone, secondary hypokalaemia, and metabolic alkalosis. It is caused by gain-of-function mutations in the epithelial Na+ channel (ENaC). This channel consists of three subunits (α, β, γ), and disease-causing mutations have been identified in both the β- and γ-subunits. All are located in the C-terminus of the protein and result in constitutive channel hyperactivity.

The increase in ENaC current causes enhanced Na+ uptake. This is accompanied by increased water uptake, thereby producing a chronic increase in blood volume and ultimately hypertension. An increased Na+ uptake also has secondary consequences: in particular, K+ secretion into the tubule lumen is stimulated because the apical membrane depolarizes and so increases the driving force for K+ efflux. In addition, more K+ enters the cell due to the enhanced activity of the Na+/K+-ATPase.
This explains why excess ENaC activity in Liddle’s syndrome is associated with hypokalaemia and, conversely, why reduced ENaC activity, as in pseudohypoaldosteronism type 1, is accompanied by hyperkalaemia. Treatment is a low-salt diet and K+-sparing diuretics such as amiloride, which directly block ENaC.

Pseudohypoaldosteronism type 1

While gain-of-function mutations in ENaC cause enhanced Na+ uptake and hypertension, loss-of-function mutations produce salt-wasting, hypotension, and dehydration in newborns and infants. Pseudohypoaldosteronism type 1 results from loss-of-function mutations in the α, β, or γ ENaC subunits. The marked reduction in ENaC activity leads to decreased Na+ absorption by the kidney. This stimulates renin and aldosterone secretion, but salt reabsorption cannot be augmented as ENaC is not functional. The high Na+ concentration in the tubular fluid causes water to be osmotically retained in the tubule lumen, leading to diuresis and dehydration.

Gitelman’s syndrome

Gitelman’s syndrome is the most common genetic cause of hypokalaemia and is an autosomal recessive condition typically caused by biallelic inactivating mutations in the SLC12A3 gene that codes for the thiazide-sensitive Na–Cl cotransporter (NCCT). See Chapter 21.2.2 for further discussion.

Bartter’s syndrome

Bartter’s syndrome generally presents in childhood with features including growth failure and mental retardation, polyuria, and polydipsia, associated with hypokalaemia and metabolic alkalosis. The syndrome is both phenotypically and genetically heterogeneous, and several subtypes have been distinguished.

Antenatal Bartter’s syndrome results from loss-of-function mutations in the genes encoding proteins involved in salt transport in the cells of the nephron. These include the inwardly rectifying K+ channel Kir1.1 (KCNJ1; Bartter’s syndrome type II), the Na-K-2Cl cotransporter (SLC12A1; Bartter’s syndrome type I), and the voltage-gated Cl– channel CLC-Kb (CLCNKB; Bartter’s syndrome type III). See Chapter 21.2.2 for further discussion.

SeSAME syndrome

Loss-of-function mutations in the inwardly rectifying K+ channel Kir4.1 (KCNJ10) give rise to SeSAME syndrome (also called EAST syndrome). This complex disorder is characterized by seizures, sensorineural deafness, ataxia, mental retardation, and electrolyte imbalance (e.g. hypokalaemia, hypomagnesaemia, metabolic alkalosis). Kir4.1 is expressed in the kidney, inner ear, and glial cells. It is postulated that K+ recycling in the distal convoluted tubule is mediated by Kir4.1 and that in its absence the Na+/K+-ATPase is inhibited, reducing Na+ uptake. This stimulates Na+ uptake in other regions of the kidney tubule, which leads to increased K+ and H+ secretion and thereby hypokalaemia and metabolic alkalosis.

Dent’s disease

Dent’s disease describes a spectrum of related inherited disorders of renal function that result from mutations in the renal chloride channel gene, CLCN5. Different mutations can produce phenotypically distinct syndromes (Table 3.4.1), which may involve low molecular weight proteinuria, hypercalciuria, hyperphosphaturia, nephrocalcinosis, and nephrolithiasis. ClC-5 is found in apical endosomes of kidney proximal tubule cells.
Mouse models suggest that ClC-5 mutations result in reduced uptake of protein (including parathyroid hormone) by the proximal tubules. This leads to impaired metabolism of calciotropic hormones and ultimately to hyperphosphaturia and kidney stones.

Nephrogenic diabetes insipidus

Familial nephrogenic diabetes insipidus (NDI) results from impaired water uptake by the kidney tubules. The disease manifests within the first few weeks of life and is characterized by the excretion of large amounts of hypotonic urine and excessive thirst. In early infancy these may not be noticed and the disease is often recognized by signs of dehydration, such as poor feeding, poor weight gain, irritability, and fever. In most cases, familial NDI is caused by a mutation in the vasopressin receptor, but in some families it results from loss-of-function mutations in the aquaporin 2 (AQP2) gene. AQP2 is expressed exclusively in the collecting duct of the kidney and plays a fundamental role in the production of a concentrated urine because it acts as a water channel. Vasopressin stimulates water uptake by causing the insertion of AQP2 channels into the apical membranes of the principal cells of the collecting duct. Loss-of-function mutations in AQP2 result in a dramatic reduction in water channels, thereby accounting for the polyuria.

Polycystic kidney disease

Autosomal dominant polycystic kidney disease is characterized by the gradual development of multiple fluid-filled renal cysts that ultimately lead to kidney failure. It is one of the most common inherited human diseases and is caused by mutations in either the transient receptor potential polycystin 2 channel (TRPP2, PKD2) or polycystin-1 (PKD1). TRPP2 is a Ca2+-permeable nonselective cation channel found in both the plasma membrane and several subcellular compartments, where it appears to have different functions. PKD1 associates with TRPP2 to form a large receptor-channel complex.
How the mutations cause the disease is poorly understood.

Other channelopathies

Cystic fibrosis

Of all the channelopathies, the best known is probably cystic fibrosis (CF). Its clinical features are described in Chapter 18.10. Cystic fibrosis is a recessively inherited disorder that results from mutations in an epithelial chloride channel known as the cystic fibrosis transmembrane conductance regulator (CFTR). Although its primary sequence is highly homologous to that of the ATP-binding cassette transporters, it is now well established that CFTR functions as a chloride channel. It also regulates the activity of the epithelial Na+ channel.

All disease-causing CF mutations result in the complete absence of, or a marked reduction in, CFTR function. Those which result in the total loss of channel activity, either because the protein does not reach the plasma membrane or because it is present but completely inactive, give rise to a severe form of the disease. Mutations that result in a reduced Cl– current are associated with a milder form of the disease. Compound heterozygotes carrying one allele with a severe mutation and another with a mild mutation will have significant residual channel activity and therefore a mild form of the disease.

Although a large number of mutations (more than 2000) have been identified in CFTR, it is uncertain how the loss of channel function gives rise to the clinical features of the disease, especially in the lungs. However, it is recognized that lack of Cl– and HCO3– secretion leads to the accumulation of sticky mucus and increases the risk of bacterial infection.

Insulin secretory disorders

The pancreatic β-cell ATP-sensitive K+ (KATP) channel consists of two types of subunit: a pore-forming subunit, Kir6.2 (KCNJ11), and a regulatory subunit, SUR1 (ABCC8). Loss-of-function mutations in either subunit cause congenital hyperinsulinaemia (CHI), whereas gain-of-function mutations lead to neonatal diabetes.
This is because the KATP channel plays a crucial role in glucose-stimulated insulin secretion. When the plasma glucose level is low (less than 3 mM), the channel is open and keeps the β-cell membrane potential at a hyperpolarized level. When plasma glucose levels rise, increasing glucose uptake and metabolism by the β-cell, ATP levels rise, causing KATP channels to close. This produces a membrane depolarization that activates voltage-gated Ca2+ channels, increases Ca2+ influx, and so stimulates insulin release. Two classes of therapeutic drugs modulate insulin secretion by interacting with KATP channels. Sulphonylureas inhibit channel activity and are used to enhance insulin secretion in patients with type 2 diabetes mellitus, whereas K-channel openers (e.g. diazoxide) activate KATP channels, hyperpolarizing the β-cell and preventing insulin release.
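The coupling between glucose, KATP channel state, and insulin release described above can be sketched as a small piece of logic. The Python below is only a qualitative toy model, not a physiological simulation: the class name, the mutation and drug labels, and the use of 3 mM as a single switching threshold are all assumptions made for illustration, and drug effects are assumed to override everything else, which is a deliberate oversimplification.

```python
from dataclasses import dataclass

# Assumption: a single switching point. The text states only that the channel
# is open when plasma glucose is low (less than 3 mM).
GLUCOSE_THRESHOLD_MM = 3.0


@dataclass
class BetaCell:
    """Toy beta-cell; all labels are hypothetical and purely illustrative."""
    mutation: str = "none"  # "none", "loss_of_function", "gain_of_function"
    drug: str = "none"      # "none", "sulphonylurea", "diazoxide"

    def katp_open(self, glucose_mm: float) -> bool:
        # Drugs are assumed to act directly on the channel and win outright.
        if self.drug == "sulphonylurea":
            return False    # channel inhibited
        if self.drug == "diazoxide":
            return True     # channel activated
        if self.mutation == "loss_of_function":
            return False    # no channel activity, whatever the glucose level
        if self.mutation == "gain_of_function":
            return True     # ATP fails to close the channel
        # Wild type: open at low glucose, closed once metabolism raises ATP.
        return glucose_mm < GLUCOSE_THRESHOLD_MM

    def secretes_insulin(self, glucose_mm: float) -> bool:
        # Closed channel -> depolarization -> Ca2+ influx -> insulin release.
        return not self.katp_open(glucose_mm)


if __name__ == "__main__":
    for cell in (BetaCell(),
                 BetaCell(mutation="loss_of_function"),
                 BetaCell(mutation="gain_of_function"),
                 BetaCell(mutation="gain_of_function", drug="sulphonylurea")):
        print(cell, [cell.secretes_insulin(g) for g in (2.0, 8.0)])
```

In this sketch a loss-of-function cell secretes insulin even at low glucose, while a gain-of-function cell fails to secrete at high glucose unless a sulphonylurea closes the channel.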

CHI is characterized by unregulated insulin secretion and profound hypoglycaemia that presents at birth or within the first year of life. This is because CHI mutations result in loss of KATP channel activity, which causes continuous depolarization of the β-cell, persistent Ca2+ influx, and thereby constitutive insulin secretion. Some patients respond to treatment with diazoxide, but in others the most effective treatment is resection of the pancreas (more than 90% is usual). Many patients develop diabetes in later life.

Mutations that impair ATP inhibition and so increase KATP channel activity cause neonatal diabetes (ND), by holding the β-cell hyperpolarized and preventing Ca2+ influx and insulin secretion even when plasma glucose rises. Around 50% of ND patients have KATP channel mutations. All have diabetes, usually presenting within the first six months of life, which may be either permanent or exhibit a remitting-relapsing time course. These patients were once thought to have an unusually early form of type 1 diabetes and thus were treated with insulin. Recognition that they possess activating KATP channel mutations has enabled more than 90% of patients to switch to sulphonylurea therapy: these drugs close the open KATP channels, so stimulating endogenous insulin secretion. Importantly, glucose homeostasis is improved on sulphonylurea therapy, with blood glucose being lower and showing fewer fluctuations than on insulin therapy, which suggests the risk of diabetic complications will be lower.

In addition to diabetes, some mutations that produce a severe reduction in ATP inhibition cause muscle weakness, motor and mental developmental delay, and hyperactivity (iDEND syndrome), and occasionally also epilepsy (DEND syndrome). This is because KATP channels are also expressed in neurones. The motor symptoms are sometimes helped by sulphonylureas, but the cognitive benefits are less clear.
Because of the marked clinical benefits of sulphonylurea therapy, it is advisable to test all patients with diabetes presenting before six months for KATP channel mutations.

KATP channels are also found in the heart, where they are composed of Kir6.2 and SUR2A, and in smooth muscle, where they comprise Kir6.1 and either SUR2A or SUR2B (which differ only in their final 42 amino acids). Mutations in Kir6.1 or SUR2 cause Cantu syndrome. This is characterized by congenital hypertrichosis, macrocephaly, a distinctive facial appearance, cardiomegaly, a patent ductus arteriosus, and various other symptoms. How the mutations cause the phenotype is unclear.

Nonsyndromic deafness

About 70% of all cases of prelingual deafness are nonsyndromic. The disorder shows marked genetic heterogeneity, but in some families it results from loss-of-function mutations in the gene (GJB2) encoding the gap junction channel connexin 26. Both recessive and dominant mutations have been described. Connexin 26 is expressed in the cochlea, but the mechanism by which the lack of functional connexin 26 leads to hearing loss remains obscure. In some individuals, mutations in connexin 26 are associated with Vohwinkel’s syndrome or other skin abnormalities. Many patients also suffer from deafness.

Cancer

A wide variety of ion channels have been implicated in tumour growth and metastasis. This is hardly surprising given that ion channels are involved in multiple processes involved in tumourigenesis, including cell cycle progression, proliferation, volume regulation, and cell death. Although we are unaware of a cancer caused by an ion channel mutation, enhanced expression of numerous ion channels has been found in many different types of cancer. For example, voltage-gated Na+ channels are upregulated in breast, lung, and prostate cancer (among others) and their enhanced activity potentiates migration, invasion, and metastasis in vivo.
Chloride channels are important for glioma invasion, and a natural peptide inhibitor (chlorotoxin) labels glioma cells and is a potential future tool both for glioma detection and for targeting of therapeutic agents. The K+ channel Kv10.1 is ectopically expressed in more than 75% of human tumours, and in mice blockade of this channel slows tumour growth. In many cases, however, the extent to which changes in ion channel expression are the cause or consequence of cancer is not fully understood. A recent investigation shows that persistent changes in the cell membrane potential, determined by altered expression of ion channels, can lead to clustering of negatively charged lipid in the inner membrane leaflet and recruitment of the signalling protein K-Ras, which enhances its ability to promote cell proliferation. It is still unclear if targeting ion channel expression or activity will be of therapeutic benefit. Nevertheless, changes in ion channel expression may provide a useful diagnostic biomarker.

Concluding remarks

Numerous ion channels are now known to play important roles in human disease, and recognition that this is the case has had a profound influence on both diagnosis and therapy. Identification of a specific channel mutation can now be accomplished much more quickly than was possible only a few years ago, enabling newly presenting patients to be diagnosed and (where possible) treated without undue delay. Indeed, channelopathies have provided several examples of personalized medicine, where therapy is tailored to the patient’s genetic constitution. A genetic diagnosis also enables testing of family members, leading to identification of mutation carriers and those at risk of the disease. It is important to remember, however, that genetic counselling can be complex: even where a mutation is not detected in either parent, a second child may be born with the same mutation due to parental mosaicism.

FURTHER READING

Ashcroft FM (2000). Ion channels and disease. Academic Press, San Diego, CA.
Ashcroft FM (2006). From molecule to malady. Nature, 440, 440.
Imbrici P, et al. (2016). Therapeutic approaches to genetic ion channelopathies and perspectives in drug discovery. Front Pharmacol, 7, 121.
Lehmann-Horn F, Jurkat-Rott K (1999). Voltage-gated ion channels and hereditary disease. Physiol Rev, 79, 1317–72.
National Center for Biotechnology Information. http://www.ncbi.nlm.nih.gov/
Online Mendelian Inheritance in Man (OMIM). http://www.ncbi.nlm.nih.gov/omim/
Ptáček LJ (2015). Episodic disorders: channelopathies and beyond. Ann Rev Physiol, 77, 475–9.
Washington University, Neuromuscular Disease Center. http://www.neuro.wustl.edu/neuromuscular/mother/chan.html
Zheng J, Trudeau MC (2015). Handbook of ion channels. CRC Press, Boca Raton, FL.
Zipes DP (2013). Cardiac electrophysiology: from cell to bedside. Saunders, Philadelphia, PA.

3.5 Intracellular signalling


ESSENTIALS

This chapter outlines the general principles of intracellular signalling. Focusing on cell surface receptors, the requirements for effective transmission of information across the plasma membrane are outlined. The principal mechanisms utilized in mammalian signal transduction are described. For each, the pathological consequences of aberrant signalling and means by which pathways can be pharmacologically targeted are described in molecular terms.

Intracellular signalling pathways permit the transmission and integration of information within cells. Mammalian receptor signalling relies on only a small number of distinct molecular processes which interact to determine cellular responses. Rapid advances in our knowledge of the mechanisms of intracellular signalling have greatly increased understanding of how cells function physiologically, how they malfunction pathologically, and how their behaviour might be manipulated therapeutically.

Introduction

The evolution of cellular life was only possible through the development of an insulating barrier to the external world, the plasma membrane, allowing manipulation of the intracellular environment. However, to be able to respond to the extracellular milieu and to each other, primitive cells needed to transmit information across the plasma membrane, leading to the evolution of intracellular signalling processes. The progression to multicellularity appears to have depended on the development of robust and sophisticated signal transduction pathways. These evolved through episodes of gene duplication and subsequent protein sequence divergence which peaked at the time of animal–plant–fungi separation (1000 million years ago) and again after the Cambrian explosion (500 million years ago). Moreover, co-option of proteins originally involved in cell structure and metabolism further contributed to the diversification and development of signalling pathways. The resulting complex nomenclature, originating from diverse sources of biological research (especially the fruit fly, Drosophila), can be confusing and alienating for the nonspecialist.
Transmitting information across the plasma membrane barrier is achieved in one of three ways:

1. Nuclear receptor signalling (e.g. utilized by steroid hormones) employs lipophilic, membrane-permeable ligands which diffuse through the plasma membrane and directly interact with intracellular receptors to alter gene expression and subsequently cell function. Nuclear receptor signalling is limited by the physical properties of the ligand, the absence of a signal amplification step (limiting sensitivity), and slow response times (since these depend on de novo protein synthesis).

2. Ion channel activation permits rapid changes in membrane voltage and intracellular ion concentrations. These processes, which underlie nerve conduction and muscle contraction, also mediate signalling events in nonexcitable cells and are discussed elsewhere.

3. Cell surface receptors, in contrast, detect extracellular ligand binding and transmit an intracellular signal to alter cell function. There are seven main solutions to the problem of transmembrane signal transduction that have evolved in mammalian cells. Heterotrimeric G-protein-coupled receptors (GPCRs), Wnt, and Hedgehog (Hh) signalling pathways all utilize cell surface molecules with seven transmembrane-spanning domains (7TM) which undergo conformational change on ligand binding, triggering intracellular signalling cascades. Alternatively, ligand engagement can be sensed through activation (upon receptor aggregation) of intracellular enzyme cascades. This mechanism is utilized in tyrosine kinase (TK)-dependent receptor signalling and in the serine/threonine-dependent signalling of the transforming growth factor beta (TGF-β) receptor superfamily. Receptors have also evolved which, upon ligand-induced aggregation, recruit cytoplasmic molecules into large signalling complexes via homotypic protein domain interactions. These include the TNFα and Fas receptors as well as the Toll-like receptors (TLRs). Finally, Notch signalling employs ligand-dependent receptor cleavage and nuclear translocation of the receptor fragment to induce gene expression changes.

3.5 Intracellular signalling
R. Andres Floto

Principles of receptor signalling

Receptor signalling pathways, in transmitting an extracellular message to the cell interior, have to deal with the same fundamental problems of information transmission as other processes, such as electronic systems. These include signalling sensitivity, robustness, resolution, and integration.

Sensitivity

The sensitivity of signalling pathways varies enormously and is determined by both activation threshold and signal amplification.

The activation threshold for a pathway can be set by (1) the affinity, avidity, and dissociation rates of receptor-ligand interaction, and also (2) the amount of activated intermediary molecules required to propagate the signal. For example, activation of IgG receptors (FcγR) is determined by both the density and subclass of IgG coating an antigen (thereby affecting receptor engagement) but also by the level of receptor tyrosine phosphorylation achieved (which is influenced by the balance of receptor-associated tyrosine kinases and phosphatases).

Amplification is usually achieved through one or more enzymatic steps in the signalling pathway and permits extremely low levels of stimuli to trigger signal transduction. Examples include the ability of rod photoreceptors to respond to individual photons of light and the successful recognition of individual peptide-bound major histocompatibility complex molecules by T-cell receptors. Very high amplification tends to lead to yes–no binary outputs (a pathway is either ‘on’ or ‘off’) as well as low signal-to-noise ratios. In contrast, nonamplified systems permit more fidelity of signal representation (with high signal-to-noise ratios), as illustrated by the response of TGFβ superfamily receptors to morphogenic gradients during development.
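The interplay of threshold and amplification can be illustrated with a few lines of arithmetic. The Python sketch below assumes, purely for illustration, that each enzymatic step multiplies the number of active molecules by a fixed gain and that the pathway is 'on' once the output reaches a threshold; the function names and all numbers are hypothetical and are not taken from any measured pathway.

```python
from math import prod
from typing import Sequence


def cascade_output(engaged_receptors: float, step_gains: Sequence[float]) -> float:
    """Active molecules at the end of a cascade of enzymatic steps.

    Each step is assumed to multiply the signal by its gain; an empty
    sequence models a nonamplified pathway (output equals input).
    """
    return engaged_receptors * prod(step_gains)


def pathway_fires(engaged_receptors: float,
                  step_gains: Sequence[float],
                  threshold: float) -> bool:
    """Binary read-out: is the amplified signal at or above the threshold?"""
    return cascade_output(engaged_receptors, step_gains) >= threshold


if __name__ == "__main__":
    # Two amplifying steps let a single engaged receptor cross the threshold.
    print(pathway_fires(1, [100.0, 1000.0], threshold=5e4))
    # Without amplification the output simply tracks the input.
    print([cascade_output(n, []) for n in (1, 10, 100)])
```

With high gain even one engaged receptor saturates the binary read-out, whereas the nonamplified case preserves the graded size of the input, mirroring the trade-off described above.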
By altering sensitivity (through changes in threshold and amplification), signalling pathways can greatly extend the dynamic range of stimulus intensities they respond to without saturating. This process is known as adaptation.

Signal robustness

Robustness refers to the ability of systems to function correctly in the presence of invalid inputs or hostile environments. Robustness in cellular signalling can be enhanced by both positive and negative feedback loops. For example, a pathway which enhances the formation of its own ligand amplifies, stabilizes, and prolongs signalling. Such events are commonly seen in development (where correct signal transmission is critical) but also occur aberrantly in cancer (through the establishment of autocrine signalling loops). Negative feedback cycles also stabilize fluctuations, rapidly returning signals to pre-excitation levels and thus minimizing subthreshold signalling. Another mechanism of increasing signal robustness is the use of parallel, redundant signalling cascades which ensure that interruption of one pathway does not disrupt signal transmission.

Signal resolution

Signalling pathways need to respond appropriately to temporal and spatial changes in stimulus.

• Temporal resolution is determined by the speed of initiation and termination of signalling and varies from milliseconds (e.g. ion channel activation and GPCR-like sensory transduction), through seconds to minutes (as seen in TK-dependent signalling of immunoreceptors), to hours (such as Notch signalling in development). While fast temporal resolution permits rapid detection of changing external environments, slow receptor kinetics will, in effect, average extracellular signals over time, removing fluctuations in signal intensity and so improving signal-to-noise ratios.

• Spatial resolution is the ability of cells to detect the localization of a stimulus. It is critical for many processes including cell migration during embryogenesis and inflammation, cell–cell interactions (e.g. immune synapse formation), and phagocytosis. Spatial resolution requires subcellular containment of activated signalling components, which can be achieved by: (1) the restriction of lateral diffusion of activated membrane receptors by sphingolipid microdomains or cytoskeletal barriers; (2) the presence of a cordon of inhibitory molecules limiting signal spread (as observed at the leading edge of chemotactic cells where the phosphatase PTEN localizes phosphatidylinositol 3,4,5-trisphosphate (PIP3) production); and (3) the sequestration of active signalling proteins within a large multimolecular complex which localizes activity to a specific region of the cell.

Signal integration

As with complex neurological systems, individual cells can integrate multiple input signals (through interactions between different signalling pathways) to perform simple Boolean operations (sensing ‘signal 1 AND signal 2’, ‘signal 1 OR signal 2’, and ‘signal 1 NOT signal 2’). For example, to achieve full activation, T lymphocytes need to receive simultaneous signals from both the T-cell receptor (TCR) and its coreceptor, CD28. They thus identify ‘TCR signal AND CD28 signal’. However, if only TCR signalling is triggered (TCR signal NOT CD28 signal), cells respond by becoming unresponsive (anergic) to further stimulation. For certain aspects of T-cell function, however, another coreceptor, ICOS, may substitute for CD28 signalling and lead to full cellular activation (ICOS signal OR CD28 signal).

Cells are also able to use various signal transduction topologies to stabilize outputs and integrate information about signal amplitude, duration, frequency, and (sometimes) spatial orientation. Examples include threshold detection to allow transformation of graded (analogue) inputs into an all-or-nothing (digital) output (e.g. cell fate decisions) and negative feedback amplification (where an input is enzymatically amplified and then inhibited by the resultant output) to reduce noise and smooth outputs.

Cells also undertake more complex signal processing. The recent application of high-throughput genetic manipulation, the identification of protein–protein interactions by mass spectrometry, and the application of bioinformatic analysis have revealed that, far from being a series of linear processes, intracellular signalling is, in reality, a complex, integrated, and interdependent network of signalling pathways. However, within this web of multiple protein–protein interactions and enzymatic cascades, there are clear signalling ‘nodes’: important molecules where multiple signal inputs converge, which represent points of physiological, and potentially pharmacological, control of cellular responses.
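The Boolean operations and threshold detection described under signal integration can be written out directly. The following Python sketch is illustrative only: the function names and return labels are hypothetical, ICOS is treated as a full substitute for CD28 even though the text limits this to certain aspects of T-cell function, and the threshold value is arbitrary.

```python
def t_cell_response(tcr: bool, cd28: bool, icos: bool = False) -> str:
    """Toy integration of T-cell receptor and coreceptor signals."""
    costimulation = cd28 or icos       # 'ICOS signal OR CD28 signal'
    if tcr and costimulation:          # 'TCR signal AND coreceptor signal'
        return "activation"
    if tcr and not costimulation:      # 'TCR signal NOT CD28 signal'
        return "anergy"
    return "no response"               # no TCR signal at all


def digital_output(signal: float, threshold: float) -> bool:
    """Threshold detection: graded (analogue) input to all-or-nothing output."""
    return signal >= threshold


if __name__ == "__main__":
    for tcr, cd28, icos in [(True, True, False), (True, False, False),
                            (True, False, True), (False, True, False)]:
        print(tcr, cd28, icos, "->", t_cell_response(tcr, cd28, icos))
    # A graded input is collapsed to an on/off decision at an arbitrary threshold.
    print([digital_output(x, threshold=0.5) for x in (0.1, 0.5, 0.9)])
```

Writing the rules this way makes explicit that anergy is not simply the absence of activation but a distinct output produced by one specific combination of inputs.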

Specific signalling pathways

The description of signalling pathways here is limited to the seven main types of receptor signal transduction mechanisms used by mammalian cells: GPCR, Wnt, Hh, tyrosine kinase-dependent signalling (using the B-cell receptor as an example), TGFβ superfamily receptors, TLRs (as an example of pathways utilizing homotypic protein domain interactions), and Notch signalling. For each pathway, the main physiological roles, the mechanism by which signalling is initiated and controlled, the pathological consequences of signalling dysfunction, and the potential for therapeutic manipulation are described.

G-protein-coupled receptors

The G-protein-coupled receptors (GPCRs) are a large family of approximately 800 7TM proteins which are involved in virtually all aspects of human biology including sensory transduction (of vision, olfaction, taste, and pain) and signalling by peptide hormones, glycoproteins, neurotransmitters, and chemokines. More than 60% of all marketed drugs (and a large proportion of those in development) target GPCRs. The main signalling pathways are summarized in Fig. 3.5.1.

GPCRs constitutively associate with heterotrimeric G proteins, which consist of a guanosine diphosphate (GDP)-bound α subunit (of which there are 16 types) complexed to a βγ dimer. Ligand binding to a GPCR ruptures an ionic bond between transmembrane domain (TM)-3 and TM-6, inducing a large cytosolic conformational change which permits binding of a G protein. The Gα subunit of the G protein is then able to bind guanosine triphosphate (GTP) instead of GDP, which allows the Gα and βγ subunits to dissociate and interact with effector enzymes (such as adenylyl cyclase and phospholipase C) and small G proteins. Intrinsic hydrolysis of bound GTP to GDP inactivates Gα (thus acting as a molecular stopwatch to limit Gα activity) and permits binding to βγ subunits and re-association with receptors.
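The G-protein activation cycle described above behaves like a two-state switch, which can be sketched as a minimal state machine. This Python example is a deliberately reduced illustration: the state names and the `step` function are hypothetical, nucleotide exchange is collapsed into a single 'ligand bound' event, and effector activation, β-arrestin signalling, and kinetics are all ignored.

```python
from enum import Enum


class GState(Enum):
    """Two coarse states of a heterotrimeric G protein (illustrative labels)."""
    INACTIVE_TRIMER = "G-alpha-GDP complexed with beta-gamma"
    ACTIVE_DISSOCIATED = "G-alpha-GTP and free beta-gamma"


def step(state: GState, ligand_bound: bool, gtp_hydrolysed: bool) -> GState:
    """Advance the cycle by one event.

    Ligand binding to the receptor allows GTP to replace GDP and the subunits
    to dissociate; intrinsic GTP hydrolysis resets the switch.
    """
    if state is GState.INACTIVE_TRIMER and ligand_bound:
        return GState.ACTIVE_DISSOCIATED
    if state is GState.ACTIVE_DISSOCIATED and gtp_hydrolysed:
        return GState.INACTIVE_TRIMER
    return state


if __name__ == "__main__":
    state = GState.INACTIVE_TRIMER
    state = step(state, ligand_bound=True, gtp_hydrolysed=False)
    print(state.name)   # subunits free to engage effector enzymes
    state = step(state, ligand_bound=False, gtp_hydrolysed=True)
    print(state.name)   # hydrolysis acts as the 'molecular stopwatch'
```

In this picture anything that blocks the hydrolysis transition would leave the switch stuck in the active state, while anything that blocks nucleotide exchange would hold it inactive.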
In parallel, GPCRs also interact, through C-terminal phosphorylation by G-protein-coupled receptor kinases (GRKs), with β-arrestins, molecules that mediate ubiquitination-dependent receptor endocytosis, recruitment of c-Src family tyrosine kinases (such as Hck and Yes), and activation of the ERK MAP kinase signalling pathway. Signalling by β-arrestins mediates receptor desensitization, cellular degranulation, chemotaxis, and cell survival.

Many GPCRs have multiple physiological ligands, binding at distinct extracellular sites, which can selectively activate specific intracellular signalling pathways. These ‘biased ligands’ have driven pharmaceutical efforts to develop compounds, binding orthosterically or allosterically, that might selectively modulate specific intracellular signalling pathways. More recently, in vitro studies have shown the feasibility of using membrane-permeable peptides or intrabodies to conformationally alter GPCRs from the cytosolic surface and discretely target specific signalling cascades.

As might be expected from their critical role in peptide hormone signal transduction, loss-of-function and gain-of-function mutations of multiple GPCRs and heterotrimeric G proteins have been implicated in both hereditary and sporadic endocrine diseases. Somatic Gsα mutations, which disrupt intrinsic GTPase function resulting in prolonged activation, are found in 40% of growth-hormone-secreting pituitary adenomas as well as McCune–Albright syndrome (bone fibrous dysplasia, endocrinopathy with hormone oversecretion). In contrast, heterozygous inactivating mutations of Gsα result in Albright’s hereditary osteodystrophy. Simple loss-of-function mutations in GPCRs usually cause recessive conditions. For example, mutations in the luteinizing hormone (LH) receptor cause autosomal recessive familial hypogonadism. By contrast, gain-of-function mutations may result from changes in: (i) ligand specificity (e.g.
mutant FSH receptors responding to hCG in ovarian hyperstimulation syndrome); (ii) ligand sensitivity (e.g. full activation of mutant calcium-sensing receptors at physiological calcium levels in Bartter syndrome type V); (iii) increased basal activity in the absence of ligand (e.g. LH receptor mutations causing male-limited precocious puberty); and (iv) decreased desensitization (e.g. KISS-1 receptor mutations causing precocious puberty).

Fig. 3.5.1  G-protein-coupled receptors (GPCRs). GPCRs constitutively associate with the guanosine diphosphate (GDP)-bound α (Gα) and βγ subunits of heterotrimeric G proteins. Ligand binding induces receptor conformational change allowing Gα to bind guanosine triphosphate (GTP), which permits the subunits to dissociate and interact with effector enzymes. GPCRs also interact with β-arrestins, following C-terminal phosphorylation by GPCR kinases (GRK), which mediate receptor internalization and other signalling events.

GPCR signalling has also been exploited by pathogens. HIV binding to the chemokine receptor CCR5 mediates cellular invasion and alters immune cell function. Vibrio cholerae toxin A1 induces ADP-ribosylation of Gαs, preventing GTP hydrolysis (resulting in persistent activation and leading to cyclic AMP-driven secretory diarrhoea). Bordetella pertussis toxin A freezes Gαi in an inactive GDP-bound conformation (through ADP-ribosylation), which prevents phagocyte chemotaxis, bacterial engulfment, and intracellular killing.

Wnt signalling

Wnt signalling plays a major role in epidermal, haematopoietic, and neural stem cell development and has been implicated in oncogenesis (particularly of colonic, ovarian, and hepatocellular carcinoma and melanoma) thought to arise through stem cell dysfunction. Nineteen genes encoding Wnt ligands have been defined in humans. These ligands are a family of secreted, palmitoylated, cysteine-rich proteins which bind to surface receptors (called Frizzled-class proteins) variably complexed to different coreceptors (LRP5, 6; MUSK; PTK7; ROR1, 2; RYK; Syndecan; and Glypican). The name Wnt is derived from Wingless, a Drosophila gene, and Int-1 (integration of mammary tumour virus).

Canonical Wnt signalling is summarized in Fig. 3.5.2. In the absence of Wnt ligands, newly synthesized β-catenin is complexed within the cytoplasm to two scaffolding proteins, Axin and APC (adenomatous polyposis coli). Serine/threonine phosphorylation of β-catenin by two proteins, GSK3β and CK1, initiates its ubiquitination and subsequent proteasomal degradation. In the absence of β-catenin, the transcriptional complex Tcf/Lef represses gene expression.
Wnt ligands induce coaggregation of LRP5/6 (an LDL receptor family member) and the 7TM receptor, Frizzled, resulting in phosphorylation of the scaffolding protein Dishevelled and sequestration of Axin. The resultant inhibition of the β-catenin destruction complex increases cytoplasmic levels of β-catenin and permits its nuclear translocation and binding to Tcf/Lef to form an activatory transcription complex which triggers gene expression.

β-catenin-independent signalling pathways can also be triggered by Wnt signalling. These include (a) planar cell polarity signalling, which regulates cell polarity and cytoskeletal rearrangements through activation of the GTPases Rac1 and RhoA and is essential for correct gastrulation, neural tube closure, and orientation of inner ear stereocilia; and (b) Wnt-calcium signalling, involving activation of heterotrimeric G proteins leading to phospholipase C-mediated production of inositol 1,4,5-trisphosphate and subsequent calcium signalling, which influences cell motility and gene expression during tumorigenesis, inflammation, and neurodegeneration. Evidence suggests that both these noncanonical signalling pathways inhibit canonical β-catenin-dependent signalling and are favoured by certain combinations of Wnt ligands (Wnt5A, Wnt11) and coreceptors (ROR1 and ROR2).

The critical role of Wnt signalling in stem cell regulation within intestinal villi underlies its association with colonic malignancies. Raised nuclear β-catenin levels (leading to persistent Wnt-dependent gene expression and eventually malignant transformation) occur in the presence of (1) mutations in either APC (found in most sporadic colorectal cancers as well as familial adenomatous polyposis) or Axin which impair β-catenin binding or (2) activating mutations of β-catenin, preventing its phosphorylation.

Therapeutic manipulation of Wnt pathway signalling is currently being investigated.
Small molecule agonists to enhance tissue repair and wound healing are under preclinical development. Lithium, at least in vitro, enhances canonical Wnt signalling (through inhibition of GSK3β), which may underlie some of its effects in psychiatric disorders. Antagonists of Wnt pathway signalling are in clinical trials as antitumour agents. These include inhibitors of Porcupine (an acyltransferase which specifically palmitoylates Wnt ligands, enabling secretion), monoclonal antibodies targeting various Frizzled isoforms, and blockers of β-catenin interaction with its transcriptional coactivator CBP.

Fig. 3.5.2 Wnt signalling. In the absence of Wnt ligands, newly synthesized β-catenin is bound by Axin and APC (adenomatous polyposis coli), phosphorylated by GSK3β and CK1, and consequently degraded by the ubiquitin-proteasome system. Wnt ligands induce coaggregation of the surface receptors Frizzled and LRP, leading to phosphorylation of Dishevelled (Dsh), sequestration of Axin, and inhibition of β-catenin degradation. β-catenin can then translocate to the nucleus, bind the transcription complex LEF/TCF, and trigger gene expression.

Hedgehog

Hedgehog (Hh) signalling has important roles in embryogenesis, tissue repair, and tumorigenesis. Hh proteins are named after the appearance of the embryo in classical Drosophila mutants and have been conserved as regulators of development in vertebrates. They act as short- and long-range morphogens (determining cell fate), mitogens (controlling cell proliferation), and inducing factors (regulating the form of developing organs). There are three human homologues of Hh (Sonic Hh, Indian Hh, and Desert Hh), secreted as lipid-conjugated hydrophobic peptides with distinct patterns of spatial and temporal distribution. In vertebrates, all Hh signalling appears to take place on, and is regulated by, primary cilia.

Hedgehog binds to a cell surface receptor, Patched-1, relieving constitutive repression of a (predominantly endosomal) 7TM protein, Smoothened, which is then recruited to the plasma membrane and primary cilium (Fig. 3.5.3). Active Smoothened increases the formation of the activator form of the transcription factor complex Gli (GliA), which stimulates Hh target gene transcription. Control of Hh signalling occurs through (1) constitutive repression of Smoothened; (2) phosphorylation of Gli by other signalling pathways (such as Notch), which generates a repressor transcription complex, GliR, preventing gene expression; and (3) inhibition of nuclear translocation of GliA by two cytoplasmic proteins, SUFU and Iguana.
In addition, noncanonical (Gli-independent) Hh signalling can occur through Smoothened-independent pathways (for example through Patched-1-dependent regulation of cyclin B1 and consequently apoptosis) and through cytoskeletal regulation and calcium oscillations by Smoothened.

Mutations in human Sonic Hh, the best characterized member of these developmental regulatory proteins in mammals, result in developmental disorders such as holoprosencephaly, which is frequent in aborted fetuses and characterized by severe malformations including cyclopia. Drugs which interfere with sterol synthesis cause such malformations because they interfere with the addition of cholesterol to the N-terminal domain of the Sonic Hedgehog protein after processing, thereby preventing normal trafficking and secretion of the ligand. Uncontrolled Hh signalling appears to promote tumorigenesis. Gorlin syndrome, caused by an inactivating mutation of Patched-1, is characterized by the development of multiple basal cell carcinomas and medulloblastomas. Moreover, most sporadic basal cell carcinomas show evidence of inactivating mutations in Patched-1 or activating mutations in Smoothened, while a proportion of medulloblastomas demonstrate increased Hh signalling (due to inactivating mutations of Patched-1 or SUFU). In addition, a truncated alternative splice variant of GLI1 has been identified in many glioblastomas and breast cancers.

Drugs that disrupt Hh signalling by blocking Smoothened (e.g. vismodegib) or GLI1 (arsenic trioxide) are now licensed as antitumour agents, while Shh inhibition by small molecules or monoclonal antibodies shows promising efficacy in vitro and in vivo.
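The canonical control logic described above can be summarized as a toy Boolean model. The sketch below is an illustrative simplification, not a quantitative or validated model: the class, field, and function names are hypothetical, each component is reduced to an on/off flag, and regulation by GliR, SUFU, and Iguana is deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HedgehogState:
    """On/off flags for a toy model of canonical Hedgehog signalling (hypothetical names)."""
    ligand_bound: bool = False         # Hedgehog bound to Patched-1
    patched1_functional: bool = True   # False models an inactivating Patched-1 mutation
    smoothened_blocked: bool = False   # True models a Smoothened blocker such as vismodegib


def smoothened_active(state: HedgehogState) -> bool:
    """Smoothened is repressed only by functional, unliganded Patched-1."""
    repressed_by_patched = state.patched1_functional and not state.ligand_bound
    return not repressed_by_patched and not state.smoothened_blocked


def target_genes_on(state: HedgehogState) -> bool:
    """Assumption: active Smoothened is sufficient to generate GliA and switch on target genes."""
    return smoothened_active(state)
```

In this sketch, `target_genes_on(HedgehogState())` is false (resting cell), while both `HedgehogState(ligand_bound=True)` and `HedgehogState(patched1_functional=False)` switch target genes on; the second case mirrors the ligand-independent signalling expected when Patched-1 is inactivated. Setting `smoothened_blocked=True` turns the output off again in either case.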
Fig. 3.5.3 Hedgehog signalling. Binding of the soluble ligand Hedgehog to its receptor, Patched-1, relieves constitutive repression of the protein Smoothened, which now acts on the transcription factor complex Gli to increase the amount of activator form (GliA) relative to repressor form (GliR) and thus promote gene expression. Control of signalling is achieved by inhibition of nuclear translocation of GliA by two cytoplasmic proteins, SUFU and Iguana, and by phosphorylation of Gli (by several different signalling pathways including Notch), promoting GliR formation.

Tyrosine kinase-dependent signalling

Tyrosine kinases (TKs) mediate signalling by several different receptor families, including those with receptor-associated TK activity (such as epidermal growth factor receptors) and those which recruit soluble TKs to initiate signalling (such as immunoreceptors, integrins, and cytokine receptors). In general, signal transmission is receptor aggregation-dependent, rapid in onset (of the order of seconds to minutes) and, once initiation thresholds are surpassed, greatly amplified (due to multiple enzyme-dependent steps). As expected, dysregulated TK signalling contributes to both oncogenesis and immunodeficiency.

The B-cell receptor (BCR) serves as a useful example. The BCR complex consists of a surface immunoglobulin noncovalently associated with Igα and Igβ subunits, each of which contains a cytoplasmic immunoreceptor tyrosine activation motif (ITAM). Antigen-induced receptor aggregation permits loosely associated c-Src family TKs (such as Lyn and Fyn) to phosphorylate subunit ITAMs, which can then strongly bind c-Src family and Syk family tyrosine kinases. A second wave of adaptor molecules (such as BLNK), small G proteins (such as Ras and Rac), and kinases such as phosphatidylinositol-3-kinase (PI-3K) is recruited to the signalling complex. PI-3K generates phosphatidylinositol 3,4,5-trisphosphate (PIP3) from the plasma membrane lipid phosphatidylinositol 4,5-bisphosphate (PIP2(4,5)). PIP3 recruits cytoplasmic molecules (through their Pleckstrin homology domains). These include: (1) Bruton’s tyrosine kinase (BTK), mutations of which result in X-linked agammaglobulinaemia; (2) AKT; and (3) phospholipase C (PLC), which generates inositol 1,4,5-trisphosphate (IP3) and diacylglycerol (DAG) from PIP2(4,5), leading to intracellular calcium signalling and protein kinase C (PKC) activation. The fully formed signalling complex can then activate downstream signalling pathways such as ERK, JNK, and p38 MAP kinases, NFκB, and NFAT (Fig. 3.5.4).
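The phosphoinositide bookkeeping in the preceding paragraph can be made concrete with a small sketch. It is a minimal illustration under stated assumptions: a lipid head group is represented only by the set of phosphorylated positions on its inositol ring, and the function names are hypothetical labels for the two enzyme activities described in the text.

```python
# Phosphorylated inositol ring positions of the plasma membrane lipid PIP2(4,5)
PIP2_4_5 = frozenset({4, 5})


def pi3_kinase(lipid: frozenset) -> frozenset:
    """PI-3K adds a phosphate at position 3, converting PIP2(4,5) into PIP3."""
    if lipid != PIP2_4_5:
        raise ValueError("this sketch only models PI-3K acting on PIP2(4,5)")
    return lipid | {3}


def phospholipase_c(lipid: frozenset) -> tuple:
    """PLC splits PIP2(4,5) into soluble IP3 and membrane-bound DAG.

    The phosphate that linked the inositol ring to the lipid backbone stays
    on position 1, so the soluble product carries phosphates at 1, 4 and 5.
    """
    if lipid != PIP2_4_5:
        raise ValueError("this sketch only models PLC acting on PIP2(4,5)")
    ip3 = frozenset({1}) | lipid
    return ip3, "DAG"


pip3 = pi3_kinase(PIP2_4_5)
ip3, second_product = phospholipase_c(PIP2_4_5)
print(sorted(pip3))  # [3, 4, 5]
print(sorted(ip3))   # [1, 4, 5]
```

The point of the representation is that the same substrate, PIP2(4,5), feeds two different outputs: a membrane-bound docking lipid (PIP3) and a pair of second messengers (IP3 and DAG).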
Signal transduction is regulated at certain steps including: (1) CD45-dependent dephosphorylation of Src family kinases, which is necessary to permit ITAM engagement; (2) the activity of the phosphatidylinositol phosphatase, PTEN, which converts PIP3 back to PIP2(4,5), thereby limiting signalling complex formation; and (3) the density of immunoreceptor tyrosine inhibitory motif (ITIM)-containing inhibitory receptors (such as FcγRIIb and CD22) associated with the signalling complex. PTEN (phosphatase and tensin homologue) is a human tumour suppressor gene; it is one of the most frequently lost tumour suppressors in cancer and is mutated in both Cowden’s and Proteus syndromes. The ITIM-containing inhibitory receptors recruit soluble tyrosine phosphatases (such as SHP1), which limit phospho-ITAM generation; inositol phosphatases (such as SHIP), which hydrolyse PIP3 to PIP2(3,4); and inhibitors of small G-protein signalling (such as p62 DOK).

Small molecule tyrosine kinase inhibitors are in clinical use in the treatment of chronic myeloid leukaemia (imatinib), and renal cell carcinoma and gastrointestinal stromal tumours (sunitinib). Several strategies have been adopted to disrupt BCR signalling in chronic lymphocytic leukaemia (CLL), where proliferation is driven by aberrant BCR activation, and in autoimmune conditions, where inappropriate BCR signalling leads to autoantibody production and self-antigen presentation by B cells. Small molecule inhibitors of Src-like tyrosine kinases (dasatinib), BTK (ibrutinib), PI3 kinase δ (idelalisib), and Syk kinase (fostamatinib) are now licensed therapies for CLL, with the latter also increasingly used for autoimmune conditions. Interest has also focused on ways to recruit ITIM-containing inhibitory receptors to the BCR complex to reduce activatory signalling in autoimmunity.
Monoclonal antibody therapy targeting the inhibitory receptor CD22 (epratuzumab) and promoting its BCR coaggregation and internalization is currently in late stage clinical trials for systemic lupus erythematosus (SLE), while a bi-specific antibody-based molecule is being developed to co-ligate the inhibitory Fcγ receptor, FcγRIIb, with the Igβ subunit of the BCR to block activation in autoimmune conditions.

Fig. 3.5.4 B-cell receptor (BCR) signalling. Antigen induces BCR aggregation, leading to phosphorylation of cytoplasmic ITAMs (immunoreceptor tyrosine activation motifs) and subsequent binding and activation of soluble tyrosine kinases such as Lyn and Syk. A second wave of adaptor molecules (such as BLNK), small G proteins (such as Rac and Ras), and kinases, including phosphatidylinositol 3-kinase (PI-3K), is then recruited to the signalling complex. PI-3K generates phosphatidylinositol 3,4,5-trisphosphate (PIP3) from PIP2(4,5), recruiting further molecules including Bruton’s tyrosine kinase (BTK) and phospholipase C (PLC). The latter splits PIP2(4,5) into inositol 1,4,5-trisphosphate (IP3) and diacylglycerol (DAG), leading to calcium signalling and protein kinase C activation. Signalling is controlled by coaggregation of inhibitory receptors, such as FcγRIIb, which, through their ITIMs (immunoreceptor tyrosine inhibitory motifs), recruit and activate tyrosine phosphatases (such as SHP1), limiting ITAM phosphorylation, and the inositol phosphatase SHIP, which, together with the phosphatidylinositol phosphatase PTEN, reduces PIP3 levels.

Transforming growth factor beta (TGF-β) superfamily

The TGF-β superfamily of about 20 ligands, including TGF-β, activin, nodal, endoglin, bone morphogenetic proteins (BMP), and growth and differentiation factors (GDFs), has important roles in embryogenesis (where its members form morphogenic gradients), extracellular matrix (ECM) remodelling and wound healing, and immunoregulation. Pathologically, TGF-β overactivity has also been shown to drive epithelial-mesenchymal transition (a process that contributes to cancer progression, neo-intimal hyperplasia, and tissue fibrosis), promote connective tissue disruption in conditions such as Marfan syndrome, and suppress antitumour immunity.

Receptors contain cysteine-rich extracellular domains, a single TM domain, and an intracellular serine/threonine kinase domain. Ligands (all of which contain three intramolecular disulphide bonds termed a ‘cysteine knot’) are secreted as inactive homodimers bound within a large latent complex (LLC) which, in the case of TGF-β, consists of a latency-associated peptide (LAP) and latent TGF-β-binding protein (LTBP). The LLC is extensively bound to extracellular matrix components. Proteolytic cleavage (by matrix metalloproteinases and other enzymes), integrin binding, and pH changes can release active TGF-β, which can then trigger the aggregation of type I and type II receptor homodimers into heteromeric complexes (Fig. 3.5.5). Phosphorylation of type I receptors (by type II receptor serine/threonine kinases) permits recruitment and subsequent phosphorylation of intracellular signalling molecules called receptor (R-) SMADs.
SMADs are homologues of the Caenorhabditis elegans protein SMA and the Drosophila protein Mothers against decapentaplegia. Phosphorylated R-SMADs, in turn, bind the key regulator, SMAD4. The R-SMAD/SMAD4 complex translocates to the nucleus where, after associating with other cofactors, it regulates gene transcription. Inhibition of signalling is achieved through transcriptional induction of inhibitory (I-) SMADs, which competitively bind type I receptors (preventing SMAD complex formation) and target receptors for ubiquitin-dependent degradation, and through prevention of nuclear translocation of R-SMAD/SMAD4 via phosphorylation by ERK, MAPK, and CDK kinases. Noncanonical signalling involves activation by the heteromeric receptor complex of a series of pathways including: TNF receptor-associated factor (TRAF) 4, TRAF6, TGFβ-activated kinase (TAK)-1, MAP kinase, PI3 kinase-AKT, and NF-κB.

Defective signalling of TGF-β superfamily pathways has been implicated in Camurati–Engelmann disease (a progressive diaphyseal dysplasia affecting long bones), oncogenesis (particularly skin cancers), several fibrotic conditions (including systemic sclerosis), familial primary pulmonary hypertension (BMP receptor 2 mutations), and hereditary haemorrhagic telangiectasia (endoglin mutations).

Multiple strategies for therapeutic regulation of TGF-β signalling in cancer and fibrosis are currently undergoing clinical and preclinical evaluation, including antisense oligonucleotides and antisense RNA to block ligand synthesis, ligand traps (e.g. soluble Fc-receptor fusion proteins to reduce active ligand concentrations), inhibitors of ligand activation, monoclonal antibodies targeting either ligands or receptors, and small molecule inhibitors of receptor signalling. Pirfenidone, an antifibrotic drug which probably acts by reducing TGF-β activity, has recently been approved for use in idiopathic pulmonary fibrosis.

Fig. 3.5.5 TGF-β signalling. TGF-β is secreted as an inactive dimer bound to a latency-associated peptide (LAP) and a latent TGF-β binding protein (LTBP) to form a large latent complex (LLC). Following release from the extracellular matrix and proteolytic cleavage, active TGF-β binding induces heteromeric receptor complexes, leading to phosphorylation of type I receptor cytoplasmic tails and permitting recruitment and activation of R-(receptor) SMADs, which in turn bind SMAD4. The R-SMAD/SMAD4 complex then translocates to the nucleus, where it interacts with other cofactors to control gene expression. Inhibition of signalling is achieved by transcriptional induction of inhibitory SMADs (I-SMADs), which prevent R-SMAD/SMAD4 complex formation and target receptors for degradation. Negative regulation of R-SMAD/SMAD4 nuclear accumulation and transcriptional activation is achieved through serine/threonine phosphorylation by ERK, MAPK, and CDK kinases.

Toll-like receptor signalling

Several signalling pathways utilize protein–protein binding via homotypic domain interaction. These include TNFα receptor signalling, which uses death domains (DD), caspase signalling (utilizing CARD domains), and TLR signalling (which uses DD and Toll/interleukin 1 (TIR) domain interactions). I have focused on TLR signalling as an example.

The nine types of mammalian TLR are found on both the cell surface (TLR1, 2, 4, 5, 6) and within endosomal compartments (TLR3, 7, 8, 9) and recognize distinct microbial products (as well as some endogenous ligands). They direct the innate immune response against pathogens, triggering inflammatory and antiviral mediator release. In addition, TLR-induced maturation of dendritic cells permits processing and surface presentation of internalized antigen, resulting in stimulation of cognate T cells and induction of adaptive immunity.

In the case of TLR4 (summarized in Fig. 3.5.6), engagement of lipopolysaccharide (LPS) triggers receptor aggregation and conformational change which recruits cytoplasmic adaptor proteins (MyD88, MAL, TRIF, and TRAM) through TIR domain interactions.
MyD88 in turn, through DD interactions, recruits and activates the serine/threonine kinases IRAKs, which mediate ubiquitination-dependent formation of a large oligomeric signalling complex (the ‘signalosome’) that permits activation of NFκB (generating production of pro-inflammatory cytokines such as TNFα). TRIF, in contrast, activates interferon-regulatory factors (IRFs), which trigger the generation of IFNα and β (which are crucial to antiviral host immunity). Regulation of signalling occurs at several levels including: (1) reduced membrane recruitment of MyD88; (2) disruption of IRAK signalling by the inhibitory molecule IRAK-M; and (3) inhibition of TRIF signalling by the cytoplasmic protein SARM.

Polymorphisms in components of TLR signalling have been associated with increased susceptibility to Gram-negative infections and septic shock (TLR4), Gram-positive infections (TLR2, IRAK4, MAL), and tuberculosis (TLR2, MAL). TLR polymorphisms have also been implicated in the development of atherosclerosis.

There has been growing interest in therapeutic manipulation of TLR signalling for a variety of conditions. Synthetic TLR agonists are in development or already licensed as vaccine adjuvants (e.g. the TLR4 activator monophosphoryl lipid A), as antiviral therapies (e.g. the TLR7 agonist imiquimod), as antitumour agents (e.g. CpG-based oligonucleotides stimulating TLR9, imiquimod), and as antiallergy therapy (e.g. TLR7 agonists for asthma). TLR antagonists are being developed for acute and chronic inflammation (targeting TLR2 and TLR4), for sepsis (targeting TLR4), for autoimmunity (e.g. Poly TLR antagonists for SLE), and for specific diffuse B-cell lymphomas containing MyD88 oncogenic mutations (blocking TLR7, 8, 9 activation).

Fig. 3.5.6 Toll-like receptor signalling. Lipopolysaccharide (LPS) binding to TLR4 triggers receptor aggregation and conformational change, recruiting cytoplasmic adaptor molecules (MyD88, MAL, TRIF, and TRAM) through homotypic TIR (Toll/interleukin 1) domain interactions. MyD88 recruits and activates the serine/threonine kinases IRAKs (through death domain interactions), which in turn activate the nuclear transcription factor NFκB and switch on transcription of pro-inflammatory cytokines such as TNFα. In contrast, TRIF activates interferon-regulatory factors (IRFs), which trigger generation of type 1 interferon (IFNα and β). Signal regulation is achieved at several levels, including inhibition of signalling through IRAK and TRIF by IRAK-M and SARM, respectively.

Notch

Named after a Drosophila protein mutation resulting in a ‘notched’ wing phenotype, Notch signalling pathways are widely conserved across species and have roles in embryonic development (particularly binary cell fate decisions and terminal differentiation), maintenance of stem cells, and lymphocyte differentiation and signalling. Four Notch receptors (Notch 1 to 4) and five canonical ligands (jagged1, jagged2, Delta-like 1, 3, and 4) have been identified (as well as several noncanonical ligands including contactin).

Although synthesized as a single polypeptide, surface Notch receptors are heterodimers consisting of an extracellular region non-covalently linked to a transmembrane/intracellular portion. As shown in Fig. 3.5.7, ligand binding permits extracellular cleavage of Notch heterodimers by TACE (TNFα converting enzyme), an ADAM protease. This allows ubiquitin-dependent endocytosis of Notch.
Subsequent cleavage by an endosomal γ-secretase (presenilin) releases the intracellular fragment of Notch, which can then associate with the nuclear transcription factor CSL, switching on gene expression (particularly of the HES family of transcription factors). In addition, Notch may also signal through CSL-independent nuclear and cytoplasmic pathways (as yet incompletely understood) that may be independent of receptor cleavage or occur via cross-talk with NF-κB, TGFβ, and hypoxia-induced signalling pathways.

Notch signalling is exquisitely sensitive to quantitative changes in receptor and/or ligand concentration, since transduction does not involve enzymatic amplification and is consequently linear. For example, haplo-insufficiency of Notch 2 leads to Alagille’s syndrome, while activating Notch 2 mutations result in diffuse large cell lymphomas. Notch signalling can be physiologically regulated by: (a) alteration in receptor fucosylation regulated by Fringe proteins (which alters ligand specificity); (b) rerouting of intracellular Notch fragments for lysosomal degradation via ubiquitination within multivesicular bodies (which limits intracellular signalling); and (c) inhibitory interactions of ligands with receptors when both are expressed on the same cell. These negative cis interactions are critical for the formation of sharp boundaries and lateral inhibition patterns during development.

Mutations in Notch pathway proteins have been identified in developmental disorders such as congenital aortic valve disease (Notch 1), neurovascular syndromes such as cerebral autosomal dominant arteriopathy with subcortical infarcts and leukoencephalopathy (CADASIL; Notch 3 mutations), and T-cell acute lymphocytic leukaemia (50% of which have activating mutations of Notch 1 caused either by chromosomal translocation or viral promoter integration).
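The dosage sensitivity described above follows from the one-to-one relationship between cleaved receptors and signalling fragments. The sketch below contrasts that linear response with a hypothetical amplified, saturating pathway. The saturating function, the function names, and all parameter values are assumptions chosen only for illustration; they are not measurements and are not taken from the text.

```python
def linear_signal(receptor_dose: float, fragments_per_receptor: float = 1.0) -> float:
    """Unamplified pathway: output scales directly with receptor dose."""
    return fragments_per_receptor * receptor_dose


def amplified_signal(receptor_dose: float, half_max_dose: float = 0.1) -> float:
    """Hypothetical amplified pathway whose output saturates at low receptor dose."""
    return receptor_dose / (half_max_dose + receptor_dose)


def retained_fraction(signal, normal_dose: float = 1.0, reduced_dose: float = 0.5) -> float:
    """Fraction of normal output retained when receptor dose is halved."""
    return signal(reduced_dose) / signal(normal_dose)


print(f"linear pathway retains    {retained_fraction(linear_signal):.0%}")     # 50%
print(f"amplified pathway retains {retained_fraction(amplified_signal):.0%}")  # 92%
```

Under these assumed parameters, halving receptor dose halves the output of the linear pathway but costs the saturating pathway under a tenth of its output, which is one way to see why loss of a single allele can be enough to cause disease in an unamplified pathway.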
Therapeutic manipulation of Notch signalling has so far focused on γ-secretase inhibitors, which have shown promise as antitumour agents in preclinical studies and early phase clinical trials.

Fig. 3.5.7 Notch signalling. Ligand binding permits extracellular cleavage of surface Notch receptors (which form as heterodimers) by the ADAM protease TACE, which allows ubiquitin-mediated internalization of the membrane-associated receptor fragment. Further cleavage by an endosomal γ-secretase releases a cytoplasmic fragment which associates with the nuclear transcription factor CSL and drives gene expression. Negative signal regulation can occur through lysosomal degradation of internalized receptor fragments or of ‘cis’ receptor-ligand complexes.

Conclusion

The transmission of information across the plasma membrane is achieved through a limited number of types of signalling pathway. Recent progress in defining the mechanisms of signal transduction (‘learning the language of intracellular communication’) has permitted huge advances in our understanding of disease pathophysiology. The last decade has seen an explosion of new therapies aimed at manipulating signalling therapeutically, with the promise of accelerated future drug discovery with improved molecular characterization of key nodal signalling events.

3.6 Apoptosis in health and disease

ESSENTIALS

Apoptosis is the process by which single cells die in the midst of living tissues. It is responsible for most, perhaps all, of the cell death events that occur during the formation of the early embryo and the sculpting of organs. Apoptotic cell death continues to play a critical role in the maintenance of cell numbers in those tissues in which cell turnover persists into adult life, such as the epithelium of the gastrointestinal tract, the bone marrow, and the lymphoid system including both B- and T-cell lineages.

Clinical context: apoptosis appears in the reactions of many tissues to injury, including mild degrees of ischaemia, exposure to ionizing and ultraviolet radiation, or treatment with cancer chemotherapeutic drugs. Excessive or insufficient apoptosis plays a significant part in the pathogenesis of autoimmunity, infectious disease, AIDS, stroke, myocardial disease, and cancer. When cancers regress, apoptosis is part of the mechanism involved.

Mechanism: the process is rapid, taking minutes to hours. (1) Structural changes: dying cells lose contact with their neighbours and undergo loss of volume, explosive blebbing from the cell surface, and fragmentation into a cluster of membrane-bounded apoptotic bodies, with chromatin condensing in discrete aggregates under the nuclear membrane. The fragments are rapidly phagocytosed. (2) Cellular processes: many of the morphological features of apoptosis are attributable to activation of a family of proteases called ‘caspases’, which are activated by two pathways: (a) the extrinsic pathway, via death-signalling receptors (members of the tumour necrosis factor α (TNFα) receptor family); and (b) the intrinsic pathway, triggered by many signals from the cell interior including the heat shock response, the unfolded protein response, the stress-activated kinase response, and the DNA damage response.
Introduction

Apoptosis is the process by which single cells die in the midst of living tissues, playing a crucial role in the formation of the early embryo, the sculpting of organs, the maintenance of cell numbers in those tissues in which cell turnover persists into adult life, and the reactions of many tissues to injury. Abnormalities of apoptosis play a significant part in many disease processes. This chapter gives an overview of apoptosis in health and disease. Figure 3.6.1 gives an overview of the process to orient the reader for the account that follows.

Structural changes in apoptosis

Apoptosis can be recognized because of its characteristic, stereotyped sequence of structural changes (Fig. 3.6.2). The dying cells lose contact with their neighbours and undergo a rapid loss of volume. There is explosive blebbing from the cell surface, followed by fragmentation of the cell into a cluster of subcellular bodies (apoptotic bodies), each membrane-bounded and containing a variety of compacted cytoplasmic organelles. The nucleus undergoes similar shrinkage and fragmentation. Chromatin condenses under the nuclear membrane in knob-like, hemilunar, or toroidal aggregates. Nuclear membranes overlying residual uncondensed chromatin are rich in pores, but these are absent adjacent to condensed chromatin, suggesting that redistribution takes place. The nucleolus falls apart, its argyrophilic fibrillar centre remaining apparently tethered to the peripheral aggregates of chromatin, while the osmiophilic particles associated with transcription complexes disperse in the central nucleoplasm. Eventually the nuclear membrane disappears and the entire nuclear remnant fragments into several near-spherical masses of condensed chromatin. Within the cytoplasm, the endoplasmic reticulum dilates. The cell surface loses any pre-existing microvilli or other indices of polarity. The shrunken cell and the apoptotic bodies into which it fragments tend to become spherical.
3.6 Apoptosis in health and disease
Mark J. Arends and Christopher D. Gregory

Isolated apoptotic cells lose the ability to maintain ionic homeostasis within an hour or so, lose density, swell in volume, and permit the entry of various dyes classically used to mark dead cells (such as trypan blue and propidium iodide). Within tissues, however, this phase is seldom seen, because the apoptotic cell and its fragments undergo rapid phagocytosis. Often this is undertaken by ‘professional’ phagocytes (the resident tissue macrophages), but where unusually large numbers of apoptotic cells are generated, other cell types share in ingesting them, including their viable neighbours. Once within the phagosome of the ingesting cell, the apoptotic cell and its fragments rapidly become indistinguishable from the contents of any other large secondary lysosome.

For reasons to be expanded later, the process of apoptotic cell phagocytosis inhibits the neutrophil-dominated inflammatory reaction that is often seen when macrophages are activated in other circumstances. Cell loss by apoptosis can therefore be effected with little disruption of the tissue concerned. Moreover, apoptosis, once initiated, is completed swiftly. Although the interval from the initial application of a lethal stimulus to the first manifestations of shrinkage and blebbing can vary from minutes to many hours, phagocytosis may be complete within an hour thereafter. Hence, the evidence for cell loss by apoptosis provided by the ‘snapshot’ of a histological section is often surprisingly scanty relative to the reduction in cell number that occurs.

Apoptosis is not the only mode of cell death. In classical necrosis, dying cells show a different pattern of change, dominated by volume overload and, eventually, plasma membrane breakdown and leakage of intracellular contents into the extracellular space. At first, the nucleus retains its general structure, although the chromatin patterns coarsen. Later, following equilibration of the cytosol with extracellular calcium, and the resultant widespread activation of degradative enzymes such as cathepsins, vestiges of nuclear structure fade away (karyolysis) and only ghost-like cellular outlines remain. Usually there is an associated acute inflammatory reaction. This pattern of death is frequently found when tissues are overwhelmed by high concentrations of toxic substances or in severe ischaemic damage, where vascular perfusion has been arrested.
Although apoptosis is the most established and widely studied mode of programmed cell death, it is worth noting other modes that have been described more recently. These include necroptosis, a form of programmed necrosis that occurs when apoptosis is inhibited; pyroptosis, an inflammatory mode of cell death in response to infection; and autophagy-associated cell death, which can occur in response to nutrient deprivation, during involution of tissues, or on exposure to toxins. In autophagy, which is an overt cell survival response, cells typically show a set of structural changes characterized by portions of cytoplasm, including mitochondria and other organelles (but not the nucleus), becoming enveloped by their own cell’s endosomal membranes and undergoing destruction through fusion with lysosomes. It is one of the principal mechanisms responsible for cell atrophy (the organized reduction in volume and complexity of cytoplasm), but is probably not intrinsic to the process of death. Studies of gene expression show many differences between autophagy and apoptosis, and autophagy can occur without cell death. Although both may occur in parallel within involuting tissue, autophagy appears to be an adaptive response, effected by cells living through adverse conditions, whereas apoptosis always implies cell death.

Fig. 3.6.1 Overview of apoptosis: intrinsic and extrinsic pathways. The intrinsic pathway is triggered by many signals from the cell interior, such as genotoxic injury including radiation and chemotherapy, or growth factor withdrawal, leading to release of cytochrome c from the mitochondrial intermembranous space, formation of the apoptosome, and activation of caspase 9. The extrinsic pathway involves ligation of death-signalling receptors by their ligands and recruitment of a cluster of proteins collectively called the death-inducing signalling complex, leading to activation of caspase 8. Caspases from both pathways lead to activation of effector caspases 3 and 7, which bring about dismantling of the cell, plasma membrane changes mediating its recognition and removal by phagocytosis, and additional responses involved in tissue repair. See text for details.

Caspases: Effectors of apoptosis with other functions

Many of the morphological features of apoptosis are attributable to activation of a family of proteases known as caspases (so called because of the presence of the amino acid cysteine (c) in their catalytic site, and their preferential cleavage of peptides immediately C-terminal to aspartate (asp) residues).

There are at least 18 mammalian caspases (Table 3.6.1). All are initially synthesized as inactive proenzymes and undergo proteolysis to generate two fragments of around 10 and 20 kDa molecular weight (p10 and p20), together with a fragment of variable length from the original N terminus and a linker fragment (Fig. 3.6.3). These p10 and p20 subunits oligomerize in pairs to form a tetramer, which is the active enzyme. Long N-terminal sequences provide the opportunity for regulation through interaction with various binding proteins. Caspases recognize motifs of four amino acids that are present in many proteins.
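The cleavage rule described above (a four-residue recognition motif, with the cut made immediately C-terminal to an aspartate) can be illustrated with a minimal sketch. The motif DEVD, commonly cited as a caspase-3 recognition sequence, is used purely as an assumed example, since the chapter does not name specific motifs; the function names and the toy sequence are hypothetical, and real substrate recognition depends on structural context that a string search ignores.

```python
import re
from typing import List

# Assumption for illustration only: DEVD as the example tetrapeptide motif.
EXAMPLE_MOTIF = "DEVD"


def find_cleavage_sites(sequence: str, motif: str = EXAMPLE_MOTIF) -> List[int]:
    """Return 1-based positions of the aspartate after which cleavage occurs."""
    if len(motif) != 4 or not motif.endswith("D"):
        raise ValueError("motif must be four residues ending in aspartate (D)")
    # A lookahead is used so that overlapping occurrences are all reported.
    pattern = re.compile("(?=" + re.escape(motif) + ")")
    return [m.start() + 4 for m in pattern.finditer(sequence.upper())]


def cleave(sequence: str, motif: str = EXAMPLE_MOTIF) -> List[str]:
    """Split a sequence into the fragments left after cutting at every site."""
    fragments = []
    start = 0
    for site in find_cleavage_sites(sequence, motif):
        fragments.append(sequence[start:site])
        start = site
    fragments.append(sequence[start:])
    return fragments


if __name__ == "__main__":
    toy_protein = "MKTDEVDGAALDEVDSK"  # hypothetical sequence, not a real protein
    print(find_cleavage_sites(toy_protein))
    print(cleave(toy_protein))
```

Because each cut falls after the final aspartate of the motif, every fragment except the last ends in D, which mirrors the C-terminal-to-aspartate specificity stated in the text.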
Significantly, such caspase target sites are often highly conserved between species, and frequently occur in strategic intramolecular locations, such that caspase cleavage would radically alter the function of the substrate protein. In particular, the cleavage of caspase substrates accounts for many of the structural changes of apoptosis already described. Particularly interesting substrates include proteases, kinases, cytoskeletal proteins, proteins involved in DNA damage and repair, and cell-cycle regulatory proteins.

Fig. 3.6.2 The structure of apoptosis. (a) Scanning electron micrograph of a normal macrophage shows its surface sprouting many pseudopodia. In (b) the cell has been injured (in this case by oxidized lipid of the type often present in high concentration in atheromatous plaques) and is throwing out and retracting multiple surface blebs (explosive surface blebbing). In (c) the whole cell is fragmenting into roughly spherical apoptotic bodies. Some of these are cratered by the orifices of the dilated endoplasmic reticulum fusing with the membrane. (d) Transmission electron micrograph (TEM) of an ultrathin section of a normal macrophage. (e) An apoptotic macrophage (TEM) shows the condensed chromatin (arrowhead), nucleolar remnant (arrow), and convoluted surface with dilated endoplasmic reticulum. Micrographs by courtesy of Dr Jeremy Skepper and Dr Jing Xia, Cambridge School of Biology Multi-imaging Centre.

Caspases and proteases

The cleavage sites involved in the processing of caspases to their active form are themselves typical caspase target sequences. Hence, caspase activation can occur either by autocatalysis or through a sequential cascade-like process in which initiator caspases with long N-terminal sequences (caspases 8, 9, 10, and 2) are activated first and then activate, by cleavage, the short effector or executioner caspases (3 and 7). Caspases can also activate other proteases. Thus, the calpain-inhibitor protein calpastatin is inactivated by caspase cleavage, so turning on calpain digestion within the dying cell.

Caspases and protein kinases

The small G-protein rho regulates the mobility of the cell surface. Two rho-dependent kinases, PAK2 and ROCK-1, are rendered constitutively active by caspase cleavage, through excision of their negative regulatory domains. PAK2 activity is a factor in the early retraction of the apoptotic cell from its neighbours or from substrate attachment, while ROCK-1 activity is responsible for the enhanced action of a myosin light-chain kinase that drives the cell-membrane blebbing immediately preceding fragmentation of the apoptotic cell.

FAKp125 is the kinase associated with focal adhesion plaques. It is a critical element in the signalling pathway that links cellular awareness of substrate attachment (through integrins) to other cellular functions, including movement, attachment, and new transcription. FAKp125 is cleaved and inactivated by caspases, hence isolating the cell from such signals, many of which would normally promote survival. Somewhat similarly, the adenomatous polyposis coli protein and β-catenin are cleaved by caspases, at molecular sites that ensure loss of their function. Both are members of the Wnt-1 signalling pathway, connecting cell-to-cell signals with regulation of cell function.

Caspases and cytoskeletal proteins

Actin (the major protein of the cytoskeleton), fodrin (which provides the deformable shell underlying the plasma membrane), vimentin (an intermediate filament protein of the cytoskeleton), and the lamins (which form a major component of the nuclear envelope) are all caspase substrates.
Caspase cleavage of these large polymeric proteins ensures they are rapidly disassembled to monomers. Gelsolin, a further caspase substrate, is an actin-binding protein that cleaves actin filaments in a calcium-dependent manner. Caspase cleavage of gelsolin separates the calcium-sensitive negative regulatory domain from the protease domain, and hence actin-filament cleavage is effected under normal intracellular calcium concentrations. These cytoskeletal proteolytic events probably contribute to the rounded shape of apoptotic bodies and to the eventual dissolution of the nuclear envelope.

Table 3.6.1 Caspases

Function | Caspase | Domain structure | Size (AAs) | Species | Comment
Apoptosis, initiator | Caspase-2 | CARD-L-S | 452 | h,m | Nedd2. CARD domain for activation
Apoptosis, initiator | Caspase-8 | DED-DED-L-S | 479 | h,m | DED domains for activation
Apoptosis, initiator | Caspase-9 | CARD-L-S | 416 | h,m | CARD domain for activation
Apoptosis, initiator | Caspase-10 | DED-DED-L-S | 521 | h | DED domains for activation
Apoptosis, executioner | Caspase-3 | L-S | 277 | h,m | Lack N-terminal interaction domains
Apoptosis, executioner | Caspase-6 | L-S | 293 | h,m | Lack N-terminal interaction domains
Apoptosis, executioner | Caspase-7 | L-S | 303 | h,m | Lack N-terminal interaction domains
Inflammation | Caspase-1 | CARD-L-S | 404 | h,m | ICE
Inflammation | Caspase-4 | CARD-L-S | 377 | h |
Inflammation | Caspase-5 | CARD-L-S | 434 | h |
Inflammation | Caspase-11 | CARD-L-S | 373 | m |
Inflammation | Caspase-12-L* | CARD-L-S | 419 | h,m |
Inflammation | Caspase-12-S* | CARD-L | 231 | h |
Other | Caspase-13 | CARD-L-S | 377 | b |
Other | Caspase-14 | L-S | 242 | h,m | Skin epidermis: cornification
Other | Caspase-16 | L-S | 183 | h,m |

Caspases, cysteine-ASPartic proteases; CARD, caspase recruitment domain; DED, death effector domain; L, large subunit; S, small subunit; L*, long form; S*, short form; AAs, number of amino acids; ICE, interleukin-1β-converting enzyme, the early name for caspase-1; Nedd2, the early name for caspase-2. Eighteen mammalian caspases are known, but the recently identified caspases-15, -17, and -18 are absent from placental mammals; caspase-5 is not present in mice; caspase-11 and -13 are murine (m) and bovine (b) orthologues of human (h) caspase-4, respectively. Caspase-12 has long (L*) and short (S*) forms in humans, but only the long form in rodents. Caspase-14 is expressed in skin epidermis and has a role in cornification. Apoptosis executioner caspases-3, -6, and -7 lack N-terminal interaction domains, whereas apoptosis initiator caspases possess N-terminal interaction domains of either DED (caspase-8 or -10) or CARD (caspase-2 or -9) types, to mediate dimerization and/or recruitment into larger complexes for their activation. Caspase-8 mainly mediates the extrinsic pathway (at the cell membrane) and caspase-9 the intrinsic pathway (at the mitochondrion, involving mitochondrial outer membrane permeabilization [MOMP]) of activation of apoptosis. Caspase-2 can be activated upstream (CARD-mediated) or downstream of MOMP and can be recruited to the PIDDosome multiprotein complex (including RAIDD and PIDD). As well as apoptosis, some caspases can be activated during pyroptosis (macrophage death after shigella or salmonella infection, involving caspase-1 or -11) and autophagy-associated cell death (involving ATG proteins, beclin-1, and several caspases: -2, -3, -6, and -8).

Fig. 3.6.3 Schematic diagram of caspase activation. The proenzyme on the left is processed by cleavage to the active form shown on the right: the N-terminal sequence and the linker are lost and two pairs of p10 and p20 subunits combine, each contributing to the active sites of the enzyme.

Caspases and DNA damage and repair

ICAD (inhibitor of caspase-activated DNase) is a cytoplasmic chaperone that binds a double-strand DNase, CAD (caspase-activated DNase). The ICAD–CAD complex is normally cytoplasmic. ICAD, however, is a caspase substrate and once cleaved ceases to chaperone CAD, which unfolds, displaying a nuclear localizing signal. Once within the nucleus, CAD initiates the digestion of DNA, first to large fragments of around 50 kilobase pairs and eventually, through cleavage of chromatin at internucleosomal sites, to a series of fragments that are multiples of the 180- to 200-base pair unit wrapped around each nucleosome. The genesis of these DNA fragments is exploited in several cytological and electrophoretic methods for identifying apoptosis.

DNA-PK, ATM, PARP, and Rad51 are all DNA repair proteins concerned with the recognition of, and response to, double-strand DNA breaks. Significantly, all are cleaved in apoptosis at sites that separate their DNA-binding and catalytic domains, thus removing their ability to repair DNA. This may be important in preventing re-ligation of the heavily digested DNA of the apoptotic nucleus, so avoiding the generation of large numbers of undesirable recombinant DNA molecules.

Caspases and cell-cycle proteins

Unexpectedly, several proteins that normally inhibit movement around the cell cycle are targets for caspase cleavage. These include p21CIP1 and p27KIP1 (inhibitors of cyclin-dependent kinases that catalyse movement through the G1-S and G2 phases of the cell cycle), WEE-1 (which blocks movement from G2 to mitosis), and CDC27 (which inhibits entry into mitosis). The purpose of this potential reactivation of cell-cycle activity during the process of death is obscure.
It occurs during the apoptosis of cells such as neurons that have long since ceased movement around the cycle.

Nonapoptotic roles of caspases

It is becoming clear that caspases play a multitude of roles in addition to those in apoptosis. Caspase 1, for example, is key to processing interleukin-1β in inflammatory responses, and several others are involved in inflammation (Table 3.6.1). Even the executioner caspases of apoptosis have additional nonapoptotic roles. Thus, caspases 3 and 8 play important, nonapoptotic roles in immune regulation. It seems that caspases are, perhaps unsurprisingly, highly pleiotropic proteins playing roles in cell proliferation and differentiation in a range of developmental and adult tissue contexts. In this way they may be regarded as important cell-fate decision-making enzymes contributing to the fundamental processes of cell population expansion, specialization, and death.

The activation of apoptosis

Two pathways converge on and activate the effector or executioner caspases. One connects extracellular cytokine-based stimuli to the caspase cascade, through death-signalling receptors on the cell surface, and is often referred to as the extrinsic pathway. The other, termed the intrinsic pathway, links the caspase cascade to a great variety of signals from the cell interior, reflecting dysfunction in metabolism, genotoxic injury, hypoxia, and the status of the cytoskeleton. Both pathways may be triggered by physiological as well as pathological stimuli.

Death-signalling receptors coupled to apoptosis

The death-signalling receptors are all members of the tumour necrosis factor alpha (TNFα) receptor family. They are type 1 membrane receptors (that is, with the N terminus on the external surface), containing a series of cysteine-rich incomplete repeats in the ligand-binding domain, a single transmembrane domain, and a cytoplasmic moiety with one or more signalling domains (Fig. 3.6.4).
Their ligands are homologues of the cytokine TNFα. The prototype death-signalling receptor is Fas (also called CD95 or Apo-1). On binding its ligand, FasL, this receptor trimerizes and immediately recruits to its cytoplasmic moiety a cluster of proteins collectively called the death-inducing signalling complex (DISC). The aggregation of DISC proteins is the result of protein–protein interaction at an α-helical region called the death domain (DD). Through its DD, Fas interacts with an adapter protein called FADD (Fas-associated protein with death domain) that contains a further interactive region called DED (for death effector domain). Through DED, FADD recruits to the DISC procaspase 8, an initiator caspase with two DEDs in its N-terminal sequence. Because they are at high local concentration in the DISC, the procaspase 8 molecules can catalyse their own activation, and so initiate the proteolytic cascade that ultimately turns on the effector caspases. While Fas is widely expressed in many tissues, FasL expression is largely restricted to cytotoxic lymphocytes and to cells in immunologically privileged sites. In this way, the Fas/FasL system plays a major role in cell killing by cytotoxic T lymphocytes (CTLs) but can repulse CTLs at immunologically privileged sites.

TNFR1, the high-affinity TNFα receptor, also trimerizes on binding its ligand TNFα, but the downstream pathways are more diverse than those of Fas. Three types of protein complex form around the cytoplasmic moiety of the activated TNF receptor. Each initially comprises a basic DISC that includes a DD-containing adapter protein called TRADD (for TNF receptor-associated death domain protein), a threonine kinase called RIP (for receptor interacting protein), and a third protein, TRAF-2 (for TNF receptor associated factor). From this common origin, three types of protein complex develop, each responsible for a different pattern of signal transduction (Fig. 3.6.4 and Box 3.6.1).
DR3 is a receptor closely similar in structure to TNFR1 but with a narrower tissue distribution. Whereas TNFR1 is ubiquitous, DR3 is expressed predominantly in the lymphocytes of spleen, thymus, and peripheral blood. Interestingly, the expression of the ligands appears to adopt the opposite pattern, with TNFα being a product predominantly of activated macrophages and lymphocytes, whereas the DR3 ligand (variously also called Apo3L and TWEAK) is expressed in many tissue types.

DR4 and DR5 are similar receptors that bind a ligand called TRAIL (TNF-related apoptosis-inducing ligand). The downstream signalling appears to involve both FADD and caspase 8. Both TRAIL and its receptors are expressed in many tissue types. TRAIL has excited attention as a potential therapeutic agent because it is frequently cytotoxic to tumour cells under conditions in which normal cells are unharmed. Variant receptors that lack the cytoplasmic signalling moieties (e.g. DcR1, Fig. 3.6.4) are expressed in many normal tissues and appear to act as inhibitory decoys for TRAIL.

Mitochondrial signals coupled to apoptosis

The mitochondrial pathway depends upon the release of cytochrome c, together with deoxyATP (dATP), from the intermembranous space of mitochondria. Cytochrome c and dATP bind to and effect a conformational change in a protein of the outer mitochondrial membrane, APAF-1 (for apoptotic protease activating factor), so that it exposes a protein-binding domain (generically called a CARD, for caspase recruitment domain) capable of recruiting and activating procaspase 9. This molecular assembly has been called the apoptosome. Caspase 9 then activates the executioner caspases.

Triggers for the release of cytochrome c include reactive oxygen species, cellular redox stress, and proteins of the BCL-2 family (Fig. 3.6.5).

BCL-2 is a protein with a C-terminal hydrophobic domain that allows it to anchor to the outer mitochondrial membrane. It was first identified because of its consistent activation (through a chromosome translocation) in follicular B-cell lymphoma. Its major physiological role is that of a survival factor, and thus it can cooperate with other oncogenes during carcinogenesis to sustain the life of clones of cells that otherwise might be deleted by apoptosis. The mammalian BCL-2 family contains at least 15 members in three major branches, distinguished on the basis of their function, which may facilitate either survival or apoptosis, and the presence or absence of certain conserved BCL-2 homology domains, called BH1 to BH4 (Fig. 3.6.6).
Among the pro-survival molecules are BCL-2 itself, BCL-xL, BCL-w, MCL-1, and A1, all of which share all four BH homology domains. In contrast, BAX and BAK form a branch of the BCL-2 family that possesses BH3, BH2, and BH1 domains but exerts pro-death functions. The third, and still expanding, family branch consists of proapoptotic proteins whose sole region of homology with the others is a single BH3 domain (amounting to no more than 9–16 amino acids): BID, BAD, BIK, BIM, BNIP3, NOXA, PUMA, BMF, HRK, and Mule/ARF-BP1, as well as others.

Fig. 3.6.4 Death-signalling receptors. The Fas receptor, with its ligand (FasL) and DISC, signalling exclusively to death, is shown in (A). The more complex TNFα receptor 1 (TNFR1) is shown in (B), (C), and (D). One outcome of activation of this receptor (shown diagrammatically in B) signals for survival through the transcription factor NF-κB, by a RIP kinase dependent pathway. Another (C), dependent upon recruitment of ASK-1 to the DISC, activates the Jun kinase/p38 pathway and can support either survival or apoptosis, depending on cell type and conditions. The third (D) requires internalization of the receptor, and activates caspase 8 through protein–protein interaction between TRADD and FADD. (E) shows DcR1, one of the decoy receptors for TRAIL (TNF-related apoptosis-inducing ligand). This receptor has no membrane anchor and so competes for TRAIL with the death-signalling membrane receptors, DR4 and DR5.

Box 3.6.1 TNFα signalling

When TNFα binds to its receptor, three pathways with different outcomes may be activated.
• First, the basic DISC, comprising TRADD, RIP, and TRAF2, may directly recruit regulatory elements of the MAP kinase pathway, leading to activation of the transcription factor NF-κB and a set of pro-survival, NF-κB-dependent events.
• Second, and apparently following internalization of the receptor and its DISC, TRADD can recruit FADD and hence procaspase 8 or 10, so providing the means of activating apoptosis.
• Third, activation of the TNF receptor can lead to the dissociation from it of a protein called AIP (for ASK-interacting protein). While it is bound to the TNF receptor, AIP is in an inactive, folded form, but on release it unfolds, becomes phosphorylated by RIP, and contributes to a new signalling complex comprising TRAF-2, RIP, AIP, and ASK-1. ASK-1 (for apoptosis signal-regulating kinase, also called MAP3K5) is an upstream kinase in the MAP kinase cascade and ultimately directs the activation of JNK and p38 kinase, as will be discussed later.
Thus, activation of the TNF receptor may induce survival or apoptosis, depending on the cell type and local environmental conditions.

The BH1, BH2, and BH3 domains of the pro-survival family members together form a hydrophobic groove into which BH3 domains of the BH3-only proteins and the multidomain proapoptotic proteins can fit, in much the same way as a ligand binds to its receptor. Such binding prevents the oligomerization of BAX and BAK and in so doing neutralizes their proapoptotic functions. However, in the presence of the ‘BH3-only’ family members, most of which bind to the hydrophobic cleft with high affinity, BAX and BAK are displaced from the groove to form homo-oligomeric permeability pore structures that lodge in the outer mitochondrial membrane, creating there the conditions that permit efflux of cytochrome c and dATP (see Box 3.6.2) and hence procaspase 9 activation, as previously described. An alternative scenario suggests that the BH3-only proteins may bind to the hydrophobic groove of BAX or BAK, so catalysing their oligomerization directly.

The BH3-only, proapoptotic proteins play important roles in coupling the powerful mitochondrial pathway to a broad variety of stimuli, physiological and pathological, in the cellular environment (Fig. 3.6.5). Notably, BID is activated through cleavage by caspase 8 of a small peptide from its N terminus. The truncated BID, now activated, translocates from the cytosol to mitochondria and effects the mitochondrial release of cytochrome c.
In this way, stimuli emanating from cytokine receptors but too small to activate the effector caspases directly can be amplified by recruitment of the mitochondrial pathway. Put another way, activation of BID lowers the threshold at which cytokines trigger apoptosis. Somewhat similarly, BAD is involved in a mechanism to raise the threshold at which apoptosis is engaged, depending on the availability of cytokine growth factors. BAD is phosphorylated by the kinases AKT (protein kinase B) and RSK, both in turn dependent on PI3 kinase and the growth factors responsible for its activation. Normally, phosphorylated BAD is sequestered in the cytoplasm by the chaperone 14-3-3. In conditions of growth factor deprivation, however, unphosphorylated BAD becomes available, translocates to the mitochondria, and activates cytochrome c release. BNIP3 is a mitochondrial protein that accumulates under conditions of hypoxia. It may thus provide a trigger linking hypoxia to apoptosis. Normally, BIM binds to the light chain of dynein and BMF to myosin V, cytoskeletal proteins that appear to generate signals relating to microtubule integrity and cell attachment, respectively. The transcription of PUMA and NOXA is directly dependent on p53, hence providing a link between nuclear DNA damage and apoptosis. Mule/ARF-BP1 is a ubiquitin ligase that targets for proteasomal destruction the cell-cycle regulator CDC6. This has the effect of arresting the cell cycle, but as discussed next, can also initiate apoptosis. There is also specificity as to which members of the pro-survival BCL-2 family proteins are targeted by individual BH3-only proteins: whereas tBID, BIM, and PUMA bind to all five pro-survival BCL-2 family proteins, the others have more limited affinities. In this way the BH3-only proteins provide a summation of injury and physiological death signals from all over the cell, and translate that, in a cell-type-dependent manner, to the final decision between life and death.

Fig. 3.6.5 A summary of the processes involved in intrinsic pathway activation involving caspase 9 and the BCL-2 family proteins. BAX, activated in the cytoplasm, translocates to the surface of mitochondria where it initially binds to BCL-2 (or combinations of BAX or BAK with BCL-2 or BCL-xL). Excess BAX forms BAX-BAX homo-oligomers that generate a permeability pore and are responsible for the release of cytochrome c and dATP. Alternatively, BAX may be displaced from BCL-2 by competition with the BH3-only proteins. Caspase 9 complexed with APAF-1 is activated by cytochrome c and dATP released from the intermembranous space of the mitochondrion, but this activation can be inhibited by IAPs. Release of SMAC from the intermembranous space blocks this inhibition. The roles of several BH3-only proteins (purple boxes) in connecting the apoptotic machinery to a variety of stimuli are also shown, including the caspase-8-mediated cleavage of BID to truncated BID (tBID), which links extrinsic pathway activation involving Fas/TNFR1 to intrinsic pathway activation.

Mitochondria are not unique among cellular organelles in providing the location for procaspase-containing protein complexes whose activation is affected by BCL-2 family members. Procaspase 2 can be found in the nucleus and Golgi apparatus of some cells. BCL-2 is present on nuclear and endoplasmic reticulum membranes. Activated BAX locates to endoplasmic reticulum as well as to mitochondria. Hence, multiple organelles may contribute to the execution of apoptosis as well as the audit of its initiating stimuli.

Apoptosis and cell stress

The question arises how apoptosis relates to the other molecular mechanisms whereby cells respond to stresses of various kinds. Injured cells activate stereotyped reactions, of which the heat shock response, the unfolded protein response, the stress-activated kinase response, the metabolic response, and the DNA damage response are of particular relevance here.
Fig. 3.6.6 Examples of the human BCL-2 family, showing schematically the relative positions in the unfolded protein of the BCL-2 homology domains (BH1–4), and the transmembrane domain (TM). (a) Four pro-survival members. The orientation of the BH domains in A1 is similar, but this protein lacks a transmembrane domain. (b) The two major multidomain proapoptotic proteins, BAX and BAK. (c) Some BH3-only proteins. Note that although some of these possess transmembrane domains, the great majority have only the BH3-homology domain in common, and they differ greatly in total size. Thus, PUMAα (the longest splice-variant isoform of the PUMA gene) has 193 amino acids, while ARF-BP1 has 4374, and hence is not drawn to scale relative to the others.

Box 3.6.2 Mitochondrial outer membrane permeabilization

There has been controversy over the precise mode of action of BAX and BAK in effecting the release of cytochrome c and dATP from mitochondria. Under normal conditions, there is an electrical potential across the mitochondrial membrane (ΔΨm) sustained by proton pumping. Immediately prior to apoptosis, ΔΨm dissipates abruptly, associated with mitochondrial outer membrane permeabilization (MOMP) or depolarization, suggesting unselective passage of ions. MOMP has been described as the point of no return or commitment to execution of apoptosis. One possibility is that osmotic expansion of the inner mitochondrial compartment could lead to rupture of the outer membrane and hence escape into the cytoplasm of cytochrome c and dATP.
However, direct experiments with artificial reconstructions of lipid bilayers and super-resolution microscopy show that homo-oligomers of BAX can insert directly into such membranes, creating grommet-like channels, rings, and arcs through which large molecules can move, and that the collapse of ΔΨm is secondary to the appearance of such channels. These permeability pores permit release of cytochrome c and dATP from the mitochondrial intermembrane space, with binding to the apoptosome and activation of caspase 9.

The heat shock response

Heat shock proteins (HSPs) are molecular chaperones of diverse molecular weight that share the property of greatly enhanced transcription following cell stress. Thermal, osmotic, and redox stress, and ultraviolet and ionizing radiation, may all induce HSP transcription. The heat shock response sustains cell survival under adverse circumstances and inhibits apoptosis in several different ways: Hsp27 inhibits caspase 8 cleavage of BID, and hence the release of cytochrome c from mitochondria; Hsp40 and Hsp70 inhibit BAX translocation to mitochondrial membranes; Hsp70 and Hsp90 may dissociate the components of the apoptosome. Presumably, each cell has a threshold at which full activation of caspases and entry to apoptosis become inevitable. The HSPs raise that threshold, but little is known of how the threshold itself is defined in the first place.

The unfolded protein response (UPR)

The UPR regulates the rate of protein synthesis so that correct folding and export from the endoplasmic reticulum (ER) occur. Without the UPR, insoluble aggregates of misfolded protein begin to accumulate in the ER, a manifestation of ‘ER stress’. In summary, the UPR is initiated by three receptor proteins: PERK, ATF6, and IRE-1. PERK is a kinase, and responds to the presence of misfolded protein by inhibiting (by phosphorylation) the translation initiation function of eIF2. ATF6 is a transcription factor, and migrates to the nucleus where it stimulates the transcription of chaperone proteins GRP78, GRP94, and XBP1. IRE-1 is a dual-function serine-threonine kinase and ribonuclease. It splices XBP1 mRNA to generate a further transcription factor for chaperones. However, the UPR has a clearly recognizable boundary at which its function changes from cytoprotective to proapoptotic. On prolonged stimulation, certain specific proteins are translated at high abundance, despite the general inhibition of eIF2.
Among them is a protein called CHOP that lowers the cell’s apoptosis threshold by inhibiting transcription of BCL-2. Further, IRE-1 forms an activating complex with TRAF-2 and ASK-1, proapoptotic elements of the JNK/p38 kinase pathway to be described next.

The stress-activated kinase response

The MAP kinases are serine-threonine protein kinase cascades, described in detail elsewhere in this textbook (see Chapter 3.5). Activation of these cascades is initiated by phosphorylation of regulatory upstream members, the MAP kinase kinase kinases (MAP3Ks), and leads ultimately to activation of transcription factors. Two of the three major sets of mammalian MAP kinase cascades are directly involved in transduction of stimuli that lead to apoptosis: the p38 kinases often being part of a stress-related proapoptotic process, the JNK kinases sometimes, depending on local circumstances. That these kinases have a role in the activation of apoptosis is clearly demonstrated by the attenuation of apoptosis in cells from appropriate knockout animals, but how and why these roles are played have proved more difficult to define. The stress kinase cascades appear to engage with the apoptosis effectors in several different ways. Thus, they activate (by phosphorylation) p53, CHOP, and several BH3-only proapoptotic BCL-2 family members, inactivate (again by phosphorylation) BCL-2 and BCL-XL, and activate the transcription of Fas ligand, all processes that lower the threshold for apoptosis. By stimulating cell-cycle movement through transcriptional activation of c-MYC, under conditions in which cycle movement is blocked (e.g. by p53), they also promote apoptosis, as described next. ASK-1 is a significant upstream MAP3K connecting the relevant environmental stimulus to the stress kinase cascades. ASK-1 is itself activated by reactive oxygen species, as it is normally bound in inactive conformation by the redox sensor thioredoxin.
In a strongly oxidative environment, thioredoxin dissociates, ASK-1 is activated, and the JNK/p38 cascades are stimulated. ASK-1 is also activated as part of a complex with TRAF-2 following TNF receptor stimulation and in the UPR.

The metabolic response

The close association between the mitochondrion and the regulation of apoptosis, along with the long-standing supposition that metabolic stresses are signalling routes to apoptosis, points to tight coupling between apoptotic and metabolic circuitries. Perhaps the best example of this coupling at the molecular level is cytochrome c, whose prototypic function is in the production of mitochondrial ATP during oxidative phosphorylation. As just described, cytochrome c is also essential for the initiation of apoptosis via the intrinsic pathway. Additional links between metabolic intermediates and regulated cell death processes are provided by ATP, acetyl-CoA, NAD+, NADP+, and reactive oxygen species. Although the details are not yet clear, it seems that ‘metabolic checkpoints’ exist in order to determine whether a cell responds to metabolic imbalances by eliciting an appropriate adaptive response or, alternatively, by signalling its own demise.

The DNA damage response

Damage to nuclear DNA is a particularly important source of injury-related stimuli for caspase activation. Separate molecular mechanisms exist for responding to the presence of inappropriately inserted bases (base excision repair), nucleotides that have become modified through intrastrand cross-linking or the formation of covalently bound adducts (nucleotide excision repair), nucleotide mismatch, insertion-deletion loops, or abnormal methylation (mismatch repair), interstrand cross-links (Fanconi repair), and double-strand breaks (homologous recombination or nonhomologous end-joining).
In mismatch repair, MSH2 and MLH1 are recruited sequentially into a molecular complex at the injury site, which activates p53, effects cell-cycle arrest, and, in the meantime, initiates repair at the site of damage. Similarly, among the first molecules to bind to DNA double-strand breaks in nonhomologous end-joining are the DNA kinases ATM, ATR, and DNA-PK. In turn, these recruit and activate p53 and other molecules (e.g. CHK-1 and CHK-2). In surviving cells, these effect arrest at a variety of points around the cell cycle, so ensuring that there is opportunity to load the repair machinery on to the damaged DNA template before this is further altered by DNA replication (in S-phase) or chromatid separation (in mitosis).

A profoundly different means of limiting the effect of genome damage, however, is to commit the damaged cell to apoptosis. Elements such as p53 within the repair complex in both mismatch repair and nonhomologous end-joining can do this. The molecular basis for the decision between apoptosis and survival with repair is still largely unknown. Activation of p53 is common to both outcomes, and it is therefore reasonable to search in and around this molecule for clues to the nature of the life or death decision. Activated p53 alters the transcription of a large number of genes. Some are well-known inhibitors of cell-cycle progression, such as p21CIP1, but others (e.g. BAX, Fas, a membrane protein called PERP, and the BH3-only proapoptotic molecules NOXA and PUMA) are associated almost exclusively with apoptosis. A further transcriptional target of p53 is the nontranslated microRNA miR-34 (see Box 3.6.3), activation of which is associated with both cell-cycle arrest and apoptosis. The situation is further complicated by the fact that p53 also signals to the apoptosis effector process by nontranscriptional means, via an N-terminal sequence that does not appear to be instrumental in effecting cell-cycle arrest. Phosphorylation provides one of the critical signals for p53 activation, and there are several different phosphorylation sites that respond preferentially to the various kinases (including Jun kinase, as mentioned earlier). Thus, the precise phosphorylation status of p53 could provide a molecular signature indicative of the nature, and perhaps the outcome, of the DNA damage. Another potential factor in controlling the outcome of DNA injury is a kinase (called DAP kinase because it was originally discovered as a death-associated protein) that influences the selection of p14ARF rather than p16INK4A, alternative splice forms from the same gene. Whereas p16INK4A is a cell-cycle regulator, inhibiting the cyclin-dependent kinases, p14ARF displaces p53 from its inhibitor, MDM2, so generating a sustained p53 signal that may favour apoptosis.

Further evidence for integration of multiple factors in the response to double-strand DNA breaks comes from detailed study of the injured nucleus. Within an hour of DNA damage, large complexes of phosphorylated proteins form around the damaged site, including ATM, p53, many repair proteins, and the phosphorylated histone γH2AX. Within a few hours, these foci come to lie in close juxtaposition with pre-existing intranuclear bodies called PML bodies, into which p53 and many other proteins in the DNA damage response are recruited.
Nuclei without PML mount only an attenuated version of the expected p53-dependent apoptotic response to DNA damage, even though p53 itself is available. One explanation for this is that the PML body is an intracellular location for the activation of p53 protein by acetylation. The PML body may thus form the local environment in which the state of the injured DNA is evaluated and final decisions made regarding the ultimate fate of the cell.

The replicative status of the cell is a further important determinant of its sensitivity to apoptosis following DNA injury. The proto-oncogene c-MYC is normally among the earliest genes to be expressed when cells are stimulated by growth factors to leave quiescence and enter the replicative cycle. Paradoxically, however, c-MYC expression is also a powerful factor lowering the threshold for apoptosis. In particular, c-MYC expression without concurrent molecular evidence of external growth factor stimulation (such as phosphatidylinositol-3 (PI3) kinase and AKT activation) is interpreted as a death signal. Similarly, other early regulators of cell-cycle entry, including inhibition of function of the retinoblastoma protein and the release of the transcription factor E2F-1 from its binding pocket, also trigger apoptosis in the absence of concurrent evidence of external mitogenic stimulation. Perhaps this represents a means whereby tissues are protected from autonomous cell replication: survival of replicating cells is made conditional on the presence of appropriate stimuli in the cellular environment. The benefits of removing cells that show a tendency for such replicative autonomy are obvious, but the precise mechanism that couples replication to death except in acceptable circumstances is far less clear. It seems probable that the cell cycle itself includes checkpoints at which the decision to engage the apoptosis machinery can be taken should any of the appropriate conditions for replication be absent.
Indeed, it is possible that injured cells may force the activation of such checkpoints as one way to access their apoptosis programme. This might explain the paradoxical activation of cyclin-dependent kinases by caspases in cells such as neurons that normally do not engage in replicative cycles at all, as mentioned earlier.

Inhibitors of caspase activation

The role of the BCL-2 family proteins in the activation and inhibition of apoptosis has been described, but there are other powerful endogenous inhibitors of caspase-associated cell death. One is FLIP, a DED-containing version of procaspase 8 that lacks caspase activity. High local concentrations of FLIP compete with procaspase 8 for recruitment into the DISC and so inhibit further propagation of death signals originating from the TNF family of receptors.

IAPs (inhibitors of apoptosis proteins) inhibit caspase activity after autocatalytic processing of the procaspase has begun. All contain an element called a BIR domain, which binds to the N-termini of the short fragment of partially processed caspases in such a way that adjacent elements of the IAP molecule drape across the caspase active site and sterically hinder substrate attachment. There are several such proteins: IAP1 and IAP2, ILP, the neuronal NAIP, and an X-linked family member X-IAP, all of which possess several BIR domains, and LIVIN and SURVIVIN, which contain a single BIR domain. One manifestation of the importance of IAPs is the presence of an IAP inhibitor, variously called SMAC or DIABLO, which is released from mitochondria along with cytochrome c during caspase activation by the mitochondrial pathway. The inhibitor SMAC has an N-terminal sequence that competes with partially processed caspase for the binding site in the BIR domain, and so allows the caspase to escape from the inhibitory embrace of the IAPs.

The IAPs provide a further example of the interconnections between the cell cycle and cell death.
SURVIVIN, apparently associated with caspase 9, forms a complex with and is phosphorylated by active CDK1 (cyclin-dependent kinase-1) during mitosis. Loss of phosphorylation leads to dissociation of the SURVIVIN–caspase 9 heterodimer, activation of caspase 9, and apoptosis. As normal mitosis proceeds, SURVIVIN associates with kinetochore proteins, the spindle microtubules, and finally, at cytokinesis, with the mid-body. Complexes of SURVIVIN with cyclin-dependent kinases active earlier in the cycle (e.g. CDK4) have also been identified and promote transit through G1. Thus, SURVIVIN may form part of a regulatory network, providing a means whereby the threshold for apoptosis is varied through the cell cycle. Finally, IAPs are multifunctional proteins: they are themselves potential substrates of caspase attack, activators of the survival factor NF-κB, and downstream products of NF-κB-directed transcription. They thus form part of positive-feedback systems for both survival and death.

Box 3.6.3 MicroRNA

MicroRNAs (miRNA) are a family of short RNA species that are not translated but exert profound influence over the patterns of translation. They bind to regions of sequence homology in the 3' untranslated regions of messenger RNA, inhibiting translation and creating double-stranded RNA that becomes a target for digestion by double-strand RNA-specific ribonucleases called argonaute proteins. There appear to be only a few hundred distinct types of miRNA, each capable of inhibiting its own spectrum of specific messenger RNA types. Hence altered patterns of miRNA transcription can swiftly alter the overall pattern of messenger RNA that is available for translation. Certain miRNA profiles are characteristic of some types of cancer.

Recognition of apoptotic cells

The rapid clearance of apoptotic cells requires that they are efficiently sensed by phagocytes at an early stage. Engulfment by juxtaposed neighbours (including ‘nonprofessional’ phagocytes) does not require a migratory response on the part of the phagocyte. By contrast, ‘find me’ signals released by apoptotic cells induce chemotactic migratory responses in mononuclear (‘professional’) phagocytes. These chemoattractant molecules encompass lipid, protein, and nucleotide moieties (Fig. 3.6.7) and are released from apoptotic cells via cleavage and channel-activating events, at least some of which are caspase-dependent. For example, extracellular ATP acts as a potent ‘find me’ signal following release from apoptotic cells through caspase-3-activated pannexin 1 channels.

Fig. 3.6.7 Clearance of apoptotic cells by phagocytes. Apoptosis elicits plasma membrane changes (‘eat me’ signals), notably exposure of phosphatidylserine, that permit interaction with multiple receptors of phagocytes either directly (e.g. TIM-4) or via bridging molecules (e.g. Gas6 or Protein S bridging to the Tyro-Axl-Mer receptor tyrosine kinases, TAM-RTK). Receptor-ligand interactions lead to engulfment as well as anti-inflammatory and repair mechanisms. Interaction is also dependent upon inhibition of ‘don’t eat me’ signals. In the case of professional phagocytes, engulfment is preceded by sensing of chemotactic molecules (‘find me’ signals) released by apoptotic cells (see inset).
[Figure labels: phagocyte; apoptotic cell; chemotaxis; ‘find me’ signals released; ‘don’t eat me’ signals lost; CD31; CD47; SIRP1; phosphatidylserine exposure; MFG-E8; Gas6 or Protein S; TSP; CRT; C1q; exposed sugars; phagocyte lectins; BAI1; Tim-4; Stabilin-2; integrins; TAM-RTK; SCARF1; LRP1; CD14; CD36; engulfment; anti-inflammatory and reparatory responses.]

Macrophages subsequently recognize and bind to the surface of apoptotic cells by virtue of multiple molecular ‘eat me’ signals (Fig. 3.6.7). The disposition of phosphatidylserine (PS) residues on the apoptotic cell surface is one of the most characteristic of these. Normally PS appears only on the inner leaflet of the cell membrane, but this strict polarity is lost very early in apoptosis: around the time of rounding up, substantially earlier than chromatin condensation and DNA cleavage. PS exposure requires the caspase-mediated inactivation of ATP11C, a member of the P4-ATPase family, which, in viable cells, acts as a PS ‘flippase’, maintaining the phospholipid’s asymmetric distribution on the inner plasma membrane leaflet. In concert, the PS scramblase Xkr8 is activated by caspase cleavage during apoptosis to promote PS externalization. Macrophages possess multiple receptors that bind to the exposed PS residues. The exposed PS may also bind to ‘bridging’ molecules in the extracellular environment that then form linkers to receptors on macrophage surfaces. Thus MFG-E8 helps bind PS on the apoptotic cell surface to β3 and β5 integrins on the macrophage surface. Gas6 and protein S similarly bridge PS to the TAM (Tyro-Axl-Mer) family of receptor tyrosine kinases (TAM-RTK). Other bridging molecules include the complement fragment iC3b, which links to macrophage β2 integrins; thrombospondin 1, which links to β3 integrins and CD36; and the near-ubiquitous extracellular molecule β2 glycoprotein-1, which links to a macrophage receptor specific for it. In the same way, extracellular complement component C1q links specific binding sites on the apoptotic cell surface to receptors on the macrophages.
Along with the macrophage tethering receptor CD14 (whose binding moieties have not been clearly established), a group of scavenger receptors (SRA, CD36, CD68, LOX-1) may tether directly to poorly defined oxidized lipid groups (similar to those in oxidized low-density lipoproteins) exposed on the surfaces of apoptotic cells. Endogenous macrophage surface lectins also bind to sugars (such as N-acetylglucosamine) selectively exposed on apoptotic cell membranes.

These multiple mechanisms that facilitate macrophage phagocytosis of apoptotic cells ensure that degradation of dying cells does not usually occur before they are securely engulfed within the phagosomes of the ingesting cells. Presumably this forestalls innate and acquired immune reactions to intracellular proteins, or the voiding into extracellular space of potentially recombinogenic and immunogenic fragments of genomic DNA.

A distinctive feature of macrophage binding to apoptotic cells is the concurrent effect on macrophage function. Macrophages that phagocytose particles opsonized by immunoglobulin or complement component C3b effect a sharp increase in oxygen usage (the respiratory burst), generate reactive oxygen species and nitric oxide, and release inflammatory cytokines such as TNFα. These recruit other acute inflammatory cells to the site. In contrast, macrophages that ingest apoptotic bodies show suppression of pro-inflammatory responses, mediated through the release of different cytokines, such as TGFβ. The basis of these contrasting effector responses appears to be the different signalling pathways that are activated by the macrophage receptors engaged by apoptotic bodies as opposed to opsonized particles.

Responses to apoptotic cells are not limited either to those of macrophages or to anti-inflammatory effects (although the latter are the most renowned).
In certain situations, such as during development or as a consequence of wounding or radiotherapy, apoptosis elicits compensatory proliferative responses of neighbouring cells. Apoptotic cells can also engender angiogenic responses. These effects suggest that apoptosis can promote tissue repair and regeneration. Furthermore, although apoptosis is generally regarded as a tolerogenic process, apoptosis induced by chemotherapy with anthracyclines and oxaliplatin can be immunogenic, an effect that is mediated by dendritic cells which engulf apoptotic tumour cells and activate T cells to stimulate adaptive antitumour immunity.

Are caspases necessary and sufficient for cell death?

Although caspase activation plays a dominant role in the effector phase of apoptosis, it is not responsible for all the phenomena of apoptosis. For example, developmentally programmed cell death can sometimes occur on schedule in embryonic tissues in which caspases have been inhibited, or key members of the caspase activation system (such as APAF-1) rendered deficient through germline gene knockout. In all these circumstances, the morphology of the caspase-free death is not that of apoptosis. The nuclei swell rather than undergoing chromatin condensation. The cytoplasm shows signs of fluid overload, sometimes with the formation of conspicuous fluid-filled vacuoles. Some of these changes are reminiscent of necrosis rather than apoptosis. Rather similar changes take place during the developmental death of phylogenetically ancient multicellular organisms that do not possess recognizable close homologues to the caspases, such as the slime mould Dictyostelium discoideum.

These observations suggest that caspase activation, although intrinsic to the subtle and highly coordinated death process recognized as apoptosis, may not be the only event that commits cells to death.
The existence of at least one caspase-independent death pathway is highlighted by a flavoprotein released from the mitochondria of injured cells called AIF (apoptosis-inducing factor). AIF translocates to the nucleus, where it can effect chromatin cleavage to large fragments, but not the extreme condensation observed in apoptosis. It also appears to reproduce the cellular volume overload described earlier, even in the presence of caspase inhibition. Phylogenetically close homologues of AIF are found in bacteria and plants as well as invertebrate and vertebrate animals.

Apoptosis and disease

There are few disease processes in which apoptosis does not feature, but the examples that follow are chosen because they exemplify how various steps in the apoptosis pathways may be critical for, or are subverted in, the course of disease pathogenesis.

Immunity and its disorders

Apoptosis is used extensively in the normal function of the immune system to facilitate the process of clonal selection. Antigen stimulation of T-cell proliferation is usually followed by expression of both Fas and FasL, a recipe for apoptosis on a grand scale (called activation-induced cell death, AICD) unless there is rescue by a survival stimulus. This can be provided by costimulation from the immediate environment (adhesion molecules or cytokine receptors). A particularly important route for costimulation is through CD28, a receptor on T cells for signals transmitted from antigen-presenting cells, which increases the expression of several cytokines and their receptors. Similarly, clonally expanded populations of stimulated B cells in the bone marrow or those undergoing affinity maturation in lymph-node follicle centres are deleted by Fas signalling, but can be selectively rescued by costimulation through CD40.

Cytotoxic T lymphocytes (CTLs) kill their targets by delivering to them the contents of their granules. Among these are perforin, which creates regions in the target-cell membrane of enhanced permeability at the points of contact with the CTL, and granzyme B, a protease that directly activates the caspases of the target cell. In this way, CTLs induce target-cell apoptosis. The main effect of PD-1 signalling in T cells upon PD-L1/2 binding is usually functional inactivation rather than programmed cell death.

The importance of apoptosis for the normal function of the immune system is underscored by the effects of genetic defects. Strains of mice with loss-of-function mutations in the genes encoding Fas or Fas ligand (called lpr and gld, respectively) show similar immunological phenotypes, characterized by massive lymphoproliferation and autoimmune disorders. The human homologue is the rare condition of Canale–Smith syndrome (childhood autoimmune lymphoproliferative syndrome or ALPS), in which there is a mutation in the death domain (DD) of Fas. Inherited deficiency in C1q also leads to an autoimmunity syndrome: affected individuals almost always develop systemic lupus erythematosus. The pathogenesis here appears to be ineffective recognition and phagocytosis of endogenous apoptotic cells, so that their intracellular antigens are inappropriately processed. In particular, the persistence of nondegraded DNA that results from failed clearance of apoptotic cells has fundamental importance for the development of autoimmune disease.

Infective disorders

Shigella dysentery is due to pathogenic strains of Shigella flexneri. Pathogenicity is conferred by plasmid-borne genes that neutralize the primary host defence: phagocytosis and destruction of the bacteria by macrophages in the intestinal lamina propria. The plasmid-encoded protein IpaB activates macrophage caspase 1, so annihilating the defence by inducing macrophage apoptosis. This strategy appears to be successful, because the bacterium that would normally be destroyed if it persisted within the phagosome of the ingesting macrophage can escape from the cytoplasm of macrophages that undergo apoptosis.

The initial response to Trypanosoma cruzi, the parasite responsible for Chagas’ disease, is dominated by T-lymphocyte activation. The resultant AICD generates a population of apoptotic lymphocytes. These impinge upon the macrophages that, suitably armed by pro-inflammatory cytokine stimulation, would be one of the most effective elements in the host defence against the parasite. As described earlier, sustained macrophage phagocytosis of these large numbers of apoptotic cells leads to suppression of pro-inflammatory cytokine release. The parasite subverts this aspect of the physiology of apoptosis into a source of protection from the host-defence reaction.

The intracellular parasite Chlamydia makes a protein (CPAF) that comprehensively targets BH3-only proapoptotic molecules for proteasomal destruction. This illustrates the value to the organism of keeping a live cell environment around it, but also provides vivid affirmation of the key role played by BH3-only proteins in activating apoptosis.

Viruses engage with the machinery of apoptosis in many ways. Even lytic viruses have strategies designed to conserve the life of their host cells for some time.
DNA viruses, in particular, require means to abort apoptosis, as they must activate the cellular DNA synthesis machinery in order to replicate their own genomes, yet must then avoid the apoptosis that would otherwise follow DNA synthesis unaccompanied by commensurate external stimuli. The E6 gene of high-risk human papillomaviruses (HPV) 16 and 18 encodes a protein that targets p53 for ubiquitination and subsequent degradation, and so permits cellular survival as the viral E7 protein binds Rb and initiates entry into S-phase. The transforming genes of adenoviruses pair up to effect rather similar outcomes: E1A binds Rb and initiates DNA synthesis, the 55-kDa subunit of E1B binds and inhibits p53, and the 19-kDa subunit neutralizes proapoptotic members of the BCL-2 family. Human herpesviruses such as HHV8 encode their own version of FLIP (v-FLIP). They also have their own pro-survival BCL-2 family members, such as BHRF1 in the Epstein–Barr virus (EBV) and KS-BCL2 in HHV8. The HHV8 strategy is particularly subtle, because the virus also destroys the endogenous BCL-2. Unlike endogenous BCL-2, this viral surrogate lacks an internal caspase site, and cannot be converted into a killer peptide by caspase cleavage. Baculovirus encodes a 35-kDa protein with BIR domains that is a prototypical IAP.

Apoptosis plays a key role in the pathogenesis of AIDS. The progressive loss of circulating CD4+ T cells, by which the course of HIV-1 infection to clinical AIDS can be charted, involves loss of numbers of cells that are several orders of magnitude greater than the numbers that ever carry the virus. It is therefore clear that the overwhelming majority of the dying cells must be bystanders, sensitized to apoptosis by the presence of infection but not infected themselves. Viral proteins released from infected cells effect this sensitization by several parallel routes. The HIV proteins Tat and Nef induce Fas, FasL, and TRAIL.
Tat alters the cellular redox equilibrium in a manner that may activate ASK-1. Vpr binding protein modulates p53-induced apoptosis. A type of AICD may be induced by stimulation of CD4 and the cytokine receptor CXCR4 (both of which bind HIV epitopes). In infected cells, however, Nef inhibits ASK-1, and so may selectively protect these from apoptosis. Rather similar mechanisms underlie the deletion of neurons in HIV-associated dementia.

Cardiovascular disease

Pathogenetic mechanisms that interface with apoptosis are relatively poorly understood in cardiovascular disease, but there are several observations of potential relevance. Laminar flow inhibits ASK-1 in endothelium, while the generation of reactive oxygen species induces the p38 and JNK stress kinase pathways. Thus, turbulence and the presence of generators of reactive oxygen species such as oxidized low-density lipoproteins (both known risk factors in the genesis of atheroma) are liable to promote apoptosis in endothelium. Other elements of the vascular wall are also abnormal in atheroma. Vascular smooth muscle cells from atheromatous vessels express p53, induce Fas, and undergo apoptosis in increased numbers, particularly in the shoulders of the plaque, thus weakening attachment of the fibrous cap and rendering plaque rupture more probable. Macrophages also undergo apoptosis in response to the oxidized lipids that are present in atheromatous plaques. Death of the lipid-filled macrophages (foam cells) produces extracellular depots of oxidized lipid in the plaque core, a key step in plaque progression.

3.6  Apoptosis in health and disease 279 Although necrosis is the pattern of the cell death that immedi- ately follows episodes of infarction, there is now substantial evidence that apoptosis occurs in the surrounding tissue over several hours thereafter, probably in response to relative ischaemia and the local generation of reactive oxygen species. In animal models of stroke, this apoptosis can be down-​regulated by a variety of manoeuvres, including caspase inhibition, with objective evidence of improved cerebral function. These observations have generated enthusiasm for the development of antiapoptotic drugs for use following stroke and myocardial infarction. Another approach, potentially applicable to ischaemic myocardium, is to promote angiogenesis, perhaps by the use of angiogenic stem cells. Experimental models suggest that this improves the remodelling of the peri-​infarct tissue, including decreased apoptosis of myocytes and improved cardiac function. Degeneration of the central nervous system Despite the importance of the subject, there is still much doubt over the role of apoptosis in the chronic degenerative disorders such as Alzheimer’s and Parkinson’s diseases. Much of the problem stems from the relative inaccessibility of the brain for sequential studies following injury. In both conditions there is clear evidence of a loss of neurons, and those that remain accumulate abnormal cytoplasmic material, such as presenilins 1 and 2, and amyloid protein Aβ in Alzheimer’s dis- ease. Cell culture and animal models suggest that the presence of these proteins may induce oxidative stress, which can lower the threshold for apoptosis. The protective effect of BCL-​2 and caspase inhibition has also been recorded. The difficulties are compounded by the fact that neurons that undergo severe overstimulation (e.g. 
by local high concentrations of the neurotransmitter glutamate) can also be induced to die (a phenomenon called excitotoxicity), but it is not clear whether the pathways involved overlap with or are identical to those of apoptosis.

Tumour biology

Apoptosis is of significance in cancer biology for several reasons. First, carcinogenesis is almost invariably associated with escape from mechanisms that normally activate apoptosis. Second, a large component of tumour regression following therapy is attributable to apoptosis. Third, and perhaps most surprisingly, apoptosis harbours sinister, protumour properties.

Carcinogenesis

Carcinogenesis involves inappropriate cell proliferation, driven by release from tumour suppressor gene inhibition or by hyperactive oncogene expression. Under normal circumstances, however, the accelerated movement around the cell cycle renders the cells vulnerable to DNA damage, which activates p53 (the 'DNA damage checkpoint') and ensures either cessation of replication or apoptosis of the affected cells. For the inappropriately driven population to progress to tumour growth, the affected cells must silence this p53 response. This affords one reason for the frequent appearance of deletions and loss-of-function mutations of p53 in tumours, and the observation that the cells of many tumours and some premalignant (but progressing) lesions appear to be in a perpetual state of uncompleted DNA repair. A more subtle mechanism couples inappropriate proliferation to activation of p14ARF. ARF increases the half-life of p53 and thereby activates the p53 pathway, so uncoupling of this 'oncogene checkpoint' also permits the continuing replication of cells that would otherwise have been arrested in the cell cycle or committed to apoptosis.
Suppression of these pathways in the early genesis of tumours has the effect of permitting repeated escape from the DNA damage or oncogene checkpoints and so giving cancer cells the opportunity to explore the consequences of further genomic rearrangements or mutations that are denied normal cells. Some of these prove incompatible with continuing life but others lead to selective, progressive growth advantage towards malignancy (Fig. 3.6.8).

Tumour regression

These considerations have an important bearing on therapeutically induced tumour regression. Many therapeutic agents are effective because they create DNA lesions that activate a DNA damage checkpoint. However, as discussed earlier, most if not all tumours will already be derived from clones of cells that have lost critical damage- or oncogene-activated checkpoints. If these happen to be the same checkpoints as those targeted by the therapeutic agent, there is a high likelihood that the tumour will be resistant to the agent. Further, animal experiments have tested the effect on tumour behaviour of selective restoration of p53 function. Significantly, although regression was initiated almost immediately, tumour regrowth often occurred, accompanied by loss of function in the tumour cells of either ARF or the restored p53. The immediate regression of these tumours demonstrates that the downstream effectors of the p53 pathway are still intact, and still capable of response to p53 when it is provided. However, the swift tumour recurrence also shows that single-agent therapy has a high chance of failure: the tumour's genomic instability leads to rapid selection of alternative resistant clones.

Fig. 3.6.8  Failure to activate apoptosis following DNA damage by genotoxic carcinogens, because of the absence of functional p53, leads to the inappropriate survival of clones of cells bearing double-strand breaks (DSB) and illegitimate recombination events. Although some of these clones may fail to proliferate further (purple bars), others survive to become the founder clones of tumours. Constitutionally, these survivors have unstable genomes, as on further exposure to similar genotoxic stimuli they may again undergo genomic rearrangements or other forms of mutation yet fail to enact apoptosis. Although the example given is for cells lacking normal p53, and hence unable to respond appropriately to DNA DSBs, similar mechanisms apply to cells that fail to identify nucleotide mismatches through defective DNA mismatch repair (mutated/inactivated MSH2 or MLH1), or fail to repair DNA interstrand cross-links due to inactivation of the Fanconi DNA repair pathway or other DNA repair pathways (e.g. inactivated BRCA1 and BRCA2). Such cells can tolerate and survive extensive DNA damage and have very high mutation rates.
(Figure labels: Genotoxic damage; Injured but surviving; Normal; Apoptosis; Tumour clone; DSB; p53+; p53−.)

Protumour properties

Finally, and on the face of it counterintuitively, apoptosis has protumour properties. True to their description as 'wounds that fail to heal', malignant tumours hijack innate host responses to cell death such as compensatory proliferation and angiogenesis to ensure sustained net growth. It is notable that high-grade tumours display high apoptotic indices alongside their high mitotic indices. The main cause of such high constitutive apoptosis is likely to be the out-pacing of oxygen and nutrient supplies through rapid proliferation leading to microenvironmental stress. However, the apoptotic portion of the tumour cell population helps to perpetuate and progress the malignant disease through multiple mechanisms. Perhaps the most important of these is the regulation of tumour-associated macrophages, cells of the tumour stroma which tend to be associated with poor prognosis. At least in some tumours, apoptosis drives the accumulation of tumour-associated macrophages displaying protumour activities, including stimulation of angiogenesis and inhibition of antitumour immunity. These along with possibly additional 'regenerative' properties of apoptotic tumour cells are likely to be important underlying causes of relapse following apoptosis-inducing cancer therapies.

FURTHER READING

Adams J, Cory S (2007). The Bcl-2 apoptotic switch in cancer development and therapy. Oncogene, 26, 1324–37.
Anwar S, Whyte MK (2007). Neutrophil apoptosis and infectious disease. Exp Lung Res, 33, 519–28.
Czabotar PE, et al. (2014). Control of apoptosis by the BCL-2 protein family: implications for physiology and therapy. Nat Rev Mol Cell Biol, 15, 49–63.
Feig C, Peter ME (2007). How apoptosis got the immune system in shape. Eur J Immunol, 37, S61–S70.
Green DR (2018). Cell Death: Apoptosis and other means to an end, 2nd edition. Cold Spring Harbor Laboratory Press, US.
LaCasse EC, et al. (2008). IAP-targeted therapies for cancer. Oncogene, 27, 6252–75.
Levy OA (2009). Cell death pathways in Parkinson's disease: proximal triggers, distal effectors, and final steps. Apoptosis, 14, 478–500.
Li J, Yuan J (2008). Caspases in apoptosis and beyond. Oncogene, 27, 6194–206.
Merino D, Bouillet P (2009). The Bcl-2 family in autoimmune and degenerative disorders. Apoptosis, 14, 570–83.
Nagata S, Suzuki J, Segawa K, Fujii T (2016). Exposure of phosphatidylserine on the cell surface. Cell Death Differ, 23(6), 952–61.
Perez-Garijo A, Steller H (2015). Spreading the word: non-autonomous effects of apoptosis during development, regeneration and disease. Development, 142(19), 3253–62.
Salvador-Gallego R, et al. (2016). Bax assembly into rings and arcs in apoptotic mitochondria is linked to membrane pores. EMBO J, 35, 389–401.
Serhan CN, Savill J (2005). Resolution of inflammation: the beginning programmes the end. Nat Immunol, 6, 1191–97.
Shalini S, et al. (2015). Old, new and emerging functions of caspases. Cell Death Differ, 22, 526–39.
Singh R, Letai A, Sarosiek K (2019). Regulation of apoptosis in health and disease: the balancing act of BCL-2 family proteins. Nat Rev Mol Cell Biol, 20, 175–93.
Wyllie AH (2010). 'Where, O death, is thy sting?' A brief review of apoptosis biology. Mol Neurobiol, 42, 4–9.

3.7 Stem cells and regenerative medicine


ESSENTIALS

There is a great and unmet need for treatments that will deliver restorative solutions to patients with diseases hitherto considered irreparable. Advances in human pluripotent stem cell biology and gene-editing technology offer unprecedented opportunities for both drug discovery and translational therapies that will likely herald a new chapter of regenerative and personalized medicine.

Requirements for regenerative therapy

A prerequisite for any regenerative therapy is the generation of scalable and enriched numbers of defined cell types appropriate to the target condition. Preclinical work-up requires demonstration of sustained stem-cell mediated functional recovery in appropriate models of injury. General principles include the need for ensuring appropriate distribution, connectivity, survival, and functional integration of stem cells in the context of injury, without the hazards of tumour generation or immune rejection. Efficacy demonstrated in early phase II trials needs to be extended to the demonstration of sustained clinical benefit in definitive phase III studies.

How might the promise of stem cells be realized?

Consideration of three major target conditions for regenerative medicine (Parkinson's disease, heart failure, and diabetes mellitus) emphasizes distinct and common challenges that must be overcome in order to realize the stem cell promise. Novel approaches to induce pluripotency from differentiated somatic cells and targeted genetic manipulation of stem cell populations, along with new insights derived from improved understanding of human pluripotent stem cell biology and increased recognition of endogenous stem cells, offer a range of mechanisms through which stem cells may be therapeutic.
In addition to classic cell/tissue replacement approaches, creation of disease models using human, potentially patient-specific, pluripotent stem cell systems, and linked high-throughput cell drug screening platforms offers hope for accelerated discovery of new targets and medicines.

When will stem cell treatments become available?

This will vary from disease to disease. The history of haematological stem cell medicine, from which much of the template of regenerative medicine is borrowed, suggests an incremental and combinatorial approach to treatment.

Introduction

Regenerative medicine is not a new discipline: the 1990 Nobel Prize in Physiology or Medicine to Joseph Murray and E. Donnall Thomas was in recognition of pioneering kidney and bone marrow transplantations undertaken in the 1950s. The surge of renewed interest has been catalysed by recent and rapid advances in human stem cell biology and gene technology, which offer the prospect of the development of novel reparative strategies for a host of diseases hitherto considered irreparable. These include diabetes mellitus, neurodegenerative diseases, and heart failure.

Stem cells

The human body is organized into discrete but interrelated organs and tissues that each contain differentiated or specialized functional cells. Stem cells are defined as cells that possess three functional characteristics: an immature phenotype, self-renewal capacity, and the ability to differentiate into one or more functional or specialized derivatives (Fig. 3.7.1). The first or earliest stem cell is the embryonic stem cell (ESC) that arises from the epiblast (Fig. 3.7.2). Embryonic stem cells are pluripotent cells capable of generating all cell types in the body and can be considered as transient stem cells. During development and through adulthood other stem cells emerge that display progressively more restricted phenotypical range and can be considered tissue- or organ-specific.
Endogenous tissue-specific stem cells are multipotent, with a differentiation repertoire normally confined to those cells of the tissue of origin. They persist through adulthood and are responsible for regenerating tissues with a rapid cell turnover, such as the gastrointestinal tract epithelium, skin epidermis, and haematopoietic system. Stem cells have also been identified in relatively quiescent tissues including the heart and central nervous system, where their precise functional role has yet to be determined.

Technical advances enabling long-term ex vivo culture of human-derived pluripotent and adult tissue-specific stem cells, along with increased recognition of endogenous adult stem cells and the possibility of directed reprogramming and targeted gene editing, have generated intense excitement in the experimental and therapeutic potential of human stem cell biology.

3.7 Stem cells and regenerative medicine
Alexis J. Joannides, Bhuvaneish T. Selvaraj, and Siddharthan Chandran

Fig. 3.7.1  Stem cells and their sources. (a) All stem cells, irrespective of developmental stage, share two fundamental properties: self-renewal and differentiation to progressively lineage-restricted cell types, ultimately generating terminally differentiated, functional progeny. (b) Human stem cells can be derived from embryonic, fetal, or adult tissue. Each has its relative merits and drawbacks, and choosing the most appropriate source largely depends on the requirements of each specific experimental or therapeutic context.
(Figure labels, panel a: Pluripotent stem cell; Multipotent tissue stem cell; Transit-amplifying progenitor; Terminally differentiated functional progeny. Panel b: Embryonic stem cells; Fetal tissue stem cells; Adult tissue stem cells; iPS cells, compared by Safety; Cell yield; Plasticity; Histocompatibility potential; Ethical acceptability.)

Historical perspective

It has long been known that the cells in certain tissues are constantly replaced, but it is only recently that we have come to realize the number of these areas, the many ways by which a balance is achieved between cell production and loss, and particularly the speed of the renewal process. The remarkable behaviour of many cell populations raises not only histological but biochemical questions which are yet unanswered.

The concept of tissue stem cells emerged from the pioneering work of Charles Leblond, James Till, and Ernest McCulloch in the mid-20th century. Leblond developed the technique of autoradiography which led to the identification of continuous cellular self-renewal in certain tissues and culminated in the description of what we now know as stem cell-mediated renewal in spermatogenesis. Till and McCulloch independently proposed a similar model for haematopoiesis. During the 1960s they identified 'spleen colony-forming cells', which were able to reconstitute the haematopoietic system of a lethally irradiated animal host, and together with Louis Siminovitch went on to demonstrate their self-renewal capacity by serial transplantation.

Subsequent advances in stem cell biology have enabled identification, propagation, and directed differentiation of stem cells from a variety of adult tissues, including bone marrow stroma, skin epidermis, and brain (Fig. 3.7.3). Parallel pioneering studies on embryonal carcinoma cells derived from teratocarcinomas led to the isolation of embryonic stem cells from mouse blastocysts in 1981. Recognition of the fundamental advance of this finding and the enabling of the gene modification era led to the 2007 Nobel Prize in Physiology or Medicine.
Successful isolation of human ESCs, coupled with the discovery of a comparatively simple technique to induce pluripotency from adult differentiated cells, opened up the field of regenerative stem cell-applied biology to include the possibility of generating patient-specific cells or tissues through induced pluripotency. John Gurdon and Shinya Yamanaka were awarded the 2012 Nobel Prize in Physiology or Medicine for showing that mature cells can be reprogrammed to become pluripotent. Complementing progress in stem cell generation, recent advances in targeted gene-editing technology have significantly increased the repertoire of stem cell-based therapeutic possibilities.

What can human stem cells offer regenerative medicine?

Experimental and therapeutic opportunities are the short answer. Regenerative medicine can be summarized as treatments (cell and drug based) that seek to restore structure and function following injury (Fig. 3.7.4a). Stem cells can achieve this goal in a variety of ways, direct and indirect (Fig. 3.7.4b). Perhaps the simplest and most intuitive therapeutic contribution is through cell replacement of lost or damaged cells. The use of cultured autologous keratinocytes for skin loss is a well-established current example of cell-based therapy. Cell replacement therapies for Parkinson's disease and type 1 diabetes represent future therapeutic targets. Cell-based therapies have also been demonstrated to have beneficial effects independent of their differentiation potential, such as the immunomodulatory effects of mesenchymal stem cells in the treatment of graft versus host disease.

Beyond stem cells as direct therapy, human stem cells offer complementary opportunities to study human development and model disease, as well as providing a unique resource for drug discovery and testing. Such insights are likely to lead to novel disease-modifying and regenerative therapies, and ultimately provide the largest clinical dividend.

Fig. 3.7.2  Human stem cells grown in vitro. Representative light microscopy and immunostained micrographs of (from left to right) embryonic stem cells (ESC), fetal-derived neural stem cells (NSC), adult skin-derived mesenchymal stem cells (MSC), and induced pluripotent stem cells (iPSC).
(Figure labels: ESC; Fetal NSC; Adult MSC; iPSC; Oct4; Nestin; Vimentin; Nanog.)

Fig. 3.7.3  Timeline of key advances and future prospects in stem cell biology and regenerative medicine.

Stem cell biology
1954: demonstration of continuous tissue self-renewal using autoradiography
1962: cloning by somatic cell nuclear transfer in Xenopus laevis
1963: discovery of endogenous stem-cell mediated self-renewal
1970: isolation and expansion of bone marrow stromal cells
1975: isolation and expansion of epidermal keratinocytes
1981: isolation and expansion of embryonic stem cell lines from mouse blastocysts
1992: isolation and expansion of neural stem cells from the embryonic and adult brain
1997: mammalian cloning in sheep by somatic cell nuclear transfer
1998: generation of embryonic stem cell lines from human blastocysts
2006: generation of induced pluripotent stem cells from adult tissue
2007: generation of primate embryonic stem cell lines by somatic cell nuclear transfer
2009: generation of integration-free induced pluripotent stem cells
2013: reproducible targeted gene editing using CRISPR/Cas9 system

Regenerative medicine
1905: first successful keratoplasty using cadaveric corneal tissue
1954: first successful kidney transplant between monozygotic twins
1957: first successful bone marrow transplant by intravenous infusion
1981: autologous keratinocytes used to treat third-degree burns
1990: transplantation of primary fetal tissue for Parkinson's disease
1997: autologous limbal stem cells used for corneal grafting
1998: trials of autologous chondrocyte implantation in knee injury
2000: first successful pancreatic islet cell transplant from cadaveric donor tissue
2002: trials of autologous mesenchymal stem cells for graft vs host disease
2004: trials of autologous bone marrow infusion following myocardial infarction
2012: transplantation of embryonic stem-cell derived retinal pigment epithelium cells for AMD
2014: clinical trials of human embryonic stem cell-derived β-cells in type 1 diabetes
2016: clinical improvement following cultured bone marrow cell injection in heart failure

Future prospects (ordered in the figure from short term to medium to long term)
Implantation of β-cells, cardiomyocytes, and dopaminergic neurones from pluripotent cells
Implantation of stem cell-derived retinal pigment epithelium for macular degeneration
Oligodendrocyte precursor transplantation for spinal cord injury and multiple sclerosis
Tissue protection trials for stroke, motor neurone disease, and Alzheimer's disease
Development of drug compounds from stem-cell-based screening
Stem-cell-based immunotherapy for cancer and infectious diseases
Stem-cell-derived organ transplantation

Fig. 3.7.4  Therapeutic principles of stem cell-based treatments. (a) Organ function is dependent on a dynamic equilibrium between the extent of pathological injury (from any cause) and the extent of self-repair from endogenous tissue stem cells (which is highly variable between organs). An imbalance leads to progressive tissue damage and/or loss, ultimately resulting in organ impairment and functional decompensation. (b) Stem cell-based interventions can be directed towards multiple points in disease progression. Stem cells and progenitors, delivered as single cells or within a tissue scaffold, may have a disease-modifying effect independent of differentiation potential through trophic support or immunomodulatory properties, while differentiated progeny can be used to replace lost cells. In addition, in vitro stem cell-based studies can lead to the development of drug compounds for mobilizing endogenous stem cells and shifting the organ equilibrium towards self-repair.
(Figure labels: Normal tissue; Disease tissue; Cell loss/dysregulation; Inadequate endogenous stem-cell-mediated repair; Damaged tissue; Pluripotent stem cells; Multipotent tissue stem cells; Target cell type; Exogenous cell replacement; Tissue protection; Promoting endogenous repair; In vitro disease modelling and drug testing ± gene editing.)

Current therapeutic applications of stem cells

Not infrequently in medical discovery, application of scientific innovation precedes biological or mechanistic understanding. Stem cells are no exception. Within the translational arena, stem cell transplantation has been performed (even unknowingly) for over a century (Fig. 3.7.3). Eduard Zirm carried out the first successful keratoplasty in 1905, while E. Donnall Thomas performed the first successful bone marrow transplantation by intravenous infusion in 1957. Haemopoietic stem cell transplantation for haematological disease is now a routine procedure (see Chapter 22.8.2).

Rheinwald and Green's success in using autologous, ex vivo expanded human keratinocytes for treating patients with third-degree burns in 1981 established an important proof of concept for stem cell-based therapy. The potential of regenerative therapy is apparent from the process of autologous epidermal grafting. Keratinocytes, although relatively quiescent in vivo, can be expanded exponentially in culture with a doubling time of 16 to 18 hours, achieving a 10 000-fold expansion over a 2- to 3-week period. Sufficient cell numbers can thus be obtained from very small full-thickness skin biopsies, making it possible to treat patients with large area skin loss where split-thickness skin grafts are not feasible. Together with autologous chondrocyte implantation for articular cartilage defects, this is a fast-growing area that is now mainstream. Use of limbal epithelial stem cells for corneal disease is a further example of an emerging clinical application of autologous adult stem cells. Finally, combination of autologous material with the disciplines of material science and biotissue engineering raises the imminent prospect of ex vivo tissue organogenesis. Use of tissue-engineered bladder augmentation for neurogenic bladders is an exciting advance and likely to herald wider application (see later).
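The keratinocyte expansion figures quoted in this section (a doubling time of 16 to 18 hours and a 10 000-fold expansion over 2 to 3 weeks) can be checked with a short calculation. The Python sketch below is illustrative only: it assumes idealized, uninterrupted exponential growth, which real cultures will not achieve, and the function names are invented for this example.

```python
import math


def doublings_needed(fold_expansion: float) -> float:
    """Number of population doublings required for a given fold expansion."""
    return math.log2(fold_expansion)


def days_to_expand(fold_expansion: float, doubling_time_hours: float) -> float:
    """Days of uninterrupted exponential growth needed to reach the fold expansion.

    Assumption: every cell divides once per doubling time, with no lag phase,
    no cell loss, and no pause for passaging.
    """
    return doublings_needed(fold_expansion) * doubling_time_hours / 24.0


if __name__ == "__main__":
    fold = 10_000
    print(f"Doublings needed: {doublings_needed(fold):.1f}")
    for hours in (16, 18):
        print(f"{hours} h doubling time: {days_to_expand(fold, hours):.1f} days")
```

Under this idealized assumption, a 10 000-fold expansion needs roughly 13 doublings, or about 9 to 10 days of continuous growth. The quoted 2 to 3 weeks is therefore consistent with the stated doubling time, and plausibly leaves a margin for the periods in which a culture is not growing at its maximal rate.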
Current barriers to clinical application

Clinical application of stem cells has now become routine practice in haematology, plastic surgery, orthopaedics, and ophthalmology. However, beyond these areas the promise of using stem cells in mainstream therapies remains anticipated and unrealized, and raises many issues. Although comprehensive and detailed analysis of individual disease requirements is beyond the scope of this chapter, some general themes for clinical application of stem cell-based regenerative medicine emerge (Fig. 3.7.5). These include:
1. Identifying the correct stem cell source
2. Generating appropriate numbers of specialized cells and validating sustained in vivo function in injury models
3. Establishing the infrastructure and correct trial design methodology to rigorously clinically evaluate putative regenerative therapies

These separate areas are considered by way of illustration and in reference to three principal target medical conditions that could benefit from regenerative medicine: neurodegenerative diseases such as Parkinson's disease, cardiac failure, and type 1 diabetes mellitus.

Identifying the correct human stem cell source

Accepting the need for human material, this is in many ways an issue of determining the appropriate developmental stage of stem cells. Adult tissue-specific stem cells possess some advantage, being potentially autologous, often readily accessible, as well as being ethically less controversial (see Fig. 3.7.1b). However, their limited proliferative capacity and restricted differentiation potential place significant practical constraints on their widespread utility. Although several studies have reported adult stem cell 'transdifferentiation' to other lineages, these findings have not always been reproducible and are likely accounted for by alternative explanations such as cell fusion in vivo or genetic transformation in vitro. Notwithstanding their intuitive attraction, some populations (e.g.
neural stem cells) are inaccessible and would require invasive methods, with attendant risks, for harvesting. By contrast, embryonic and induced pluripotent stem cells are scientifically attractive on account of their unique ability to respond predictably to developmental cues, which together with their nontransformed nature and almost unlimited proliferative capacity allow the realistic prospect of generating scalable numbers of all cell types.

A stem cell source should thus ideally combine the practical and ethical acceptability of adult stem cells with the biological potential of embryonic or pluripotent stem cells. Successful somatic cell nuclear transfer (SCNT) in mammalian reproductive cloning demonstrated the conceptual feasibility of nuclear reprogramming to generate embryonic cells from an adult mammalian somatic cell source. Primate and human SCNT has since been demonstrated and independently confirmed, but a significant practical hurdle of SCNT is the need for large numbers of oocytes.

An alternative approach proposes fusion of existing embryonic stem cell lines and dermal fibroblasts, and a key milestone in this field has been the demonstration that somatic (nonstem) cell reprogramming can be induced by overexpression of a limited number of transcription factors in both adult mouse and human systems. The resulting induced pluripotent stem (iPS) cells show many of the characteristics of embryonic stem cells, and ongoing work on the underlying mechanisms of somatic reprogramming has improved the efficiency and reliability of the process. This approach to 'reprogramming' opens up the possibility of an era of patient-specific stem cell-based studies and conceivably even personalized 'stem cell' medicine.

Generating appropriate numbers of functional cell type(s)

There are two overriding requirements: generation of the correct functional cell type, and the absence of any contaminating undifferentiated stem cells.
Although it is axiomatic that cells of appropriate regional identity and function are required for experimental or therapeutic application, directed differentiation has lagged behind advances in stem cell isolation and culture. Pluripotent stem cells possess a clear advantage over (non-reprogrammed) adult stem cells, responding predictably to developmental signals and retaining imposed positional identity following transplantation. It is worth noting that most cell types in future regenerative therapies, including neuronal subtypes, pancreatic islet β cells, and cardiomyocytes, physiologically emerge at a defined developmental stage and are not normally generated from resident adult stem cell populations. It is unclear whether any other stem cell populations have the in vitro or in vivo potential to generate these functional cell types. Nevertheless, applying insights borrowed from developmental principles of patterning and specification is

Fig. 3.7.5  Criteria for clinical implementation of cell-based therapy. The many challenges and requirements for cell-based strategies can be classified into three fundamental steps (see text for a detailed discussion). Initially, the right cell type needs to be generated in sufficient numbers, in high purity and in the right form for therapy (step 1, shaded in yellow). Subsequently, putative therapies need to be tested in appropriate animal models of disease for both safety and efficacy, and criteria for patient phenotypes most likely to benefit from therapy should be established (step 2, shaded in pink). Finally, clinical cell therapy needs to be evaluated in the context of other existing therapies, and functional improvement should be monitored for a sufficient period to demonstrate benefits in disease morbidity and/or mortality (step 3, shaded in blue).
(Figure labels: Derivation; Expansion; Differentiation; Genetic editing; Bioengineering; Final product; Preclinical studies; Clinical evaluation; Patient selection; Outcome measures; Ethical acceptability; Potentially autologous; Scaleable proliferation while retaining plasticity; Clinical grade conditions; Homogeneous population; Correct physiological function; Tissue scaffolds; Encapsulation; In vitro organogenesis; Develop accurate disease models; Demonstrate functional recovery; Correct disease stage and subtype; Route, dose ± co-treatment; Functional improvement.)

likely to be critical for the generation of region-specific cell types, regardless of the age or source of the stem cell.

In addition to the conventional application of soluble factors in two-dimensional culture, the development of advanced tissue engineering approaches can potentially enable better definition of the extracellular microenvironment. Thus, novel techniques such as micropatterning, biomaterial scaffolds, and bioprinting have the potential to further enhance cell differentiation potential, yield, and long-term survival.

Minimizing the risk of tumorigenic 'rogue' cells is a major obstacle. Potential approaches include a combination of positive and negative ex vivo selection techniques, predifferentiation, insertion of inducible (preferably pluripotency-associated) 'suicide' genes, or use of oncolytic viruses. Such methods will require customized developments particular to individual stem cells. Although such methods are standard practice in a laboratory setting, their clinical application will require further refinement. While teratoma incidence has been well controlled in recent cell-based trials, the increased risk of leukaemia following gene therapy for X-linked severe combined immunodeficiency highlights the need for robust long-term evaluation in a clinical setting.

Donor cell developmental stage

The final stem cell-derived product can be either a progenitor population or ex vivo predifferentiated cells. Predifferentiation has the additional challenge of further controlled differentiation steps with complex protocols utilizing a combination of soluble factors and biological scaffolds, and using specialized selection techniques to isolate a possibly rarer differentiated subtype (e.g. separating differentiating mature β cells from islet progenitors and from other endocrine cell types). The importance of the local cellular niche in cell fate subspecification also adds to the complexity of ex vivo differentiation.
However, predifferentiation may prove necessary, particularly where the potential pathological host environment may otherwise impose inappropriate differentiation cues upon implanted progenitors. For example, the inflammatory environment in spinal cord demyelination models has been shown to promote astrocyte specification from neural precursors, while prior ex vivo oligodendrocyte lineage specification enables effective exogenous remyelination. Predifferentiation is also a method to reduce the risk of uncontrolled in vivo proliferation.

Gene editing as a means of introducing novel cellular functions

Stem cell manipulation ex vivo beyond directed differentiation can offer additional opportunities for introducing novel functions that are not usually present in the unmodified cell type. While traditional methods for targeted gene editing with homologous recombination have been costly and inefficient, recent advances in this area have made the scaleable use of genetically edited stem cells a therapeutic possibility.

New techniques for gene targeting include the use of transcription activator-like effector nucleases (TALENs: nucleases fused to a synthetic protein binding to a specific DNA base sequence), and clustered regularly interspaced short palindromic repeats (CRISPR) together with CRISPR-associated proteins (Cas). For example, the Type II CRISPR/Cas9 system enables a double strand break in the genome by the Cas9 nuclease at a specific locus determined by complementary RNA sequences (crRNA and tracrRNA, which can be combined into a single chimeric sgRNA molecule). The host cell predominantly repairs the double strand break by non-homologous end joining, and the highly error-prone nature of this pathway leads to disruption of the target gene, resulting in a gene knock-out.
As introduction of a double strand break at the target locus can also increase the efficiency of homologous recombination, introducing donor DNA containing a novel sequence provides the opportunity for recombination with cleaved sequences via homology-directed repair, thus resulting in a gene knock-in or correction. A key advantage of the CRISPR/Cas9 system is the ability to change the target specificity by altering the RNA sequence, which is cheaper and faster to synthesize than a new protein.

Potential therapeutic possibilities of such technologies include the correction of genetic defects in stem cells (multipotent or iPS) which can then be autologously transplanted back into their donor. More novel targets include conferring host resistance to infectious disease (e.g. CCR5 receptor deletion for resistance to HIV), somatic gene transfer in cystic fibrosis, and cancer immunotherapy using engineered T cells (e.g. chimeric antigen receptor-modified T cells in acute lymphoblastic leukaemia).

Recent progress using animal models has shown the potential of in vivo gene therapy using CRISPR-Cas9. Adeno-associated virus-mediated delivery of CRISPR-Cas9 to remove the mutated exon 23 of the dystrophin gene in mice with Duchenne muscular dystrophy resulted in a partial restoration of muscle function. A much wider application of CRISPR-Cas9 in gene therapy lies in its potential to correct the disease-causing genetic mutation. Disease-causing phenotypes were rescued in mouse models of a hereditary liver disease, tyrosinemia, by homology-directed repair-mediated gene correction in hepatocytes. Although the efficacy of the repair was low, the gene-corrected cells had a positive growth advantage, resulting in functional rescue.

However, the benefits of the wider therapeutic repertoire offered by gene editing are in part offset by the need for a higher burden of biological safety.
Efficacy of gene correction, DNA target sequence specificity, and exclusion of off-target gene editing will all need to be demonstrated prior to clinical use. Nonetheless, the use of gene-editing technologies to generate genetically bespoke cell populations for in vitro disease modelling and drug development is likely to precede their application in cell-based therapies (see later).

Particular tissue types

Nervous tissue

The ability to programme or direct neuroectodermal differentiation from human embryonic stem cells, and by extension iPS cells, has progressed more rapidly than for other lineages. This reflects in part the ‘default’ nature of neural induction from pluripotent stem cells when grown in simplified conditions with limited extrinsic signalling.

There are several neural differentiation protocols with the potential for scaleable derivation of neural stem cells under clinical grade conditions. Methods of derivation and/or enrichment include utilizing stage-specific cell surface markers (including CD133, Notch, and β1-integrin) for neural progenitor selection. While further differentiation of neural progenitors into astrocytes (for use in neuroprotective approaches) is relatively straightforward, the generation of the whole range of functional region-specific neuronal subtypes remains problematic. This is a major challenge for regenerative neurology given that regional identity subserves distinct physiological function(s) and thus absolute precision of spatial identity is a prerequisite for functional restitution. Midbrain dopaminergic and spinal cord motor neuron differentiation from human pluripotent stem cells are arguably the most advanced, and the former is primed for clinical trials in Parkinson’s disease. Derivation of other neuronal subtypes, however, has been less successful. A combination of developmentally based approaches along with use of positive selection exploiting region-specific surface markers is likely to overcome this hurdle.

Cardiac tissue

Attempts to generate cardiomyocytes from adult cells have been problematic, and initial reports suggesting that bone marrow stromal cells and skeletal muscle satellite cells could ‘transdifferentiate’ into cardiomyocytes proved to be flawed. Early protocols utilizing pluripotent stem cells were dependent on spontaneous differentiation or coculture with visceral endoderm cells or conditioned medium, with relatively low efficiency, highlighting the need for a more rational, developmentally rooted approach to differentiation. Over the last decade this approach has been highly successful, and the serial application of activin-A and BMP-4 has been demonstrated to reproducibly generate high yields of functional cardiomyocytes for use in preclinical studies. In parallel, the existence of defined surface markers to identify cardiac progenitors (e.g. Flk1+ CXCR4+) permits prospective identification and isolation for clinical use.

Pancreatic tissue

Cadaveric islet cell transplantation offers proof of concept of cell-based therapy for type 1 diabetes.
The fundamental requirement is the generation of islet β-cells that display physiological glucose-stimulated insulin secretion. Early studies on hESCs demonstrated very low rates of differentiation to islet β cells and several purported examples later emerged as probable culture artefacts. The multiple serial differentiation stages between pluripotent cells and β-cells (including definitive endoderm, posterior foregut, and islet precursor) have made developmentally guided protocols challenging to develop. The first functional β-cell derivation from hES cells using a developmental approach, via sequential differentiation over an 18-day period, was reported in 2006. Since then several protocols for multistage differentiation have been reported, utilizing defined developmental cues such as activin-A and retinoic acid. However, further protocol optimization is required to identify key steps utilizing the minimum number of factors and stages. Furthermore, although glucose-stimulated insulin secretion has been demonstrated in generated β-cells using glucose stimulation assays, differences have been noted when comparing such responses with primary human β-cells. These findings could be consistent with an immature β-cell phenotype, highlighting the need for correlation with functional outcomes (i.e. reversal of diabetes) from preclinical and clinical studies (see later).

Validating stem cell-mediated functional recovery

Even when challenges in obtaining scaleable numbers of appropriate specialized cell types have been overcome, significant barriers to clinical application remain. Foremost is the need to demonstrate, in appropriate experimental systems, restoration of lost function. Notwithstanding the reasonable view that for certain diseases (untreatable and fatal, e.g.
motor neuron disease) a lower burden of mechanistic proof is required before commencing experimental clinical trials, it remains a fundamental tenet of drug or cell medicine development that prior demonstration of behavioural recovery is necessary. Although welcome evidence from animal studies of in vivo stem cell-mediated function is emerging, robust and sustained restoration of lost function remains elusive. This problem reflects in part the limitations of experimental systems in accurately modelling human disease.

The challenge of restoring lost function varies according to disease and the organ(s) involved, and is most severe in regenerative neurology where reconnection of circuitry is required over and above restoration of macroscopic structure. Due to the syncytial and pacing nature of myocardium, a prerequisite for regenerative cardiology is a method that ensures electrophysiological synchronization on cell implantation. By contrast, stem cell-based therapeutics for type 1 diabetes is comparatively straightforward; restoration of endocrine function does not require homotopic transplantation, and cadaver-derived transplantation of islet β cells into the hepatic portal vein has established proof of concept for a cell replacement strategy.

Other issues

Survival, engraftment, and connectivity

Sustained functional integration represents the holy grail of regenerative medicine, and is largely unmet. Clinical context matters, but some general principles can be rehearsed. Donor cell survival, acute and chronic, requires a primed host environment and that immunological mismatching be overcome. Achieving long-term integration requires a permissive host environment. Combined approaches with, for example, immunomodulatory treatment, are one way not just to manage ongoing disease activity but also to limit donor cell vulnerability to immune attack.
A common problem in the treatment of autoimmune-mediated disease is protecting the implanted cell population, including autologous material, from the host immunological response.

Ensuring appropriate connectivity is likely to require supplementary approaches (e.g. brain and spinal cord injury is associated with an inhibitory glial scar that behaves as a physical and biochemical barrier to axonal growth). Approaches under clinical trial to permit appropriate axonal regrowth include cotreatment with enzymes targeting the inhibitory extracellular proteoglycan matrix, and excision of the glial scar combined with a nerve graft bridge. In cardiac repair, a functionally integrated cardiac syncytium with appropriate excitation–contraction coupling, without the risk of potentially fatal arrhythmias, is a minimal requirement. Indeed, a recent primate study demonstrating remuscularization of myocardial infarcts by human ESC-derived cardiomyocytes following intramyocardial delivery also noted the development of non-fatal ventricular arrhythmias. By contrast, studies reporting restoration of left ventricular function after human bone marrow cell infusion have failed to show histological integration or sustained improvement, with short-term benefits most likely due to trophic support.

Overcoming immune rejection

Stem cells allow the development of novel approaches, beyond classic immune suppression, to manage immune mismatch. Personalized cells, masking strategies, and derivation from predetermined tissue-matched banks of cell lines are all rational methods under study. Although a conceptually attractive method, generation of autologous stem cell lines for each patient would be impractical and cost-prohibitive to implement for common conditions. However, a study focusing on the United Kingdom population has estimated that as few as 10 hESC lines homozygous for common HLA haplotypes (which could be derived by iPS, SCNT, or parthenogenetic ES cells) could achieve complete HLA matching for 38%, and a beneficial match for 67% of cases.

Microencapsulation in a permeable substance such as alginate or poly-l-ornithine is another option for creating an immunological protective barrier, with promising results in human islet transplantation trials. This approach is confined to grafts that do not require cell–cell contact for function. Subcutaneously implanted encapsulated stem cell-derived β-cells are in phase II trials for type 1 diabetes.

Whether long-term immunotherapy is necessary is unknown in the context of some stem cell-based interventions (e.g. within the relative immune privilege of the brain). Indirect evidence supporting such an idea comes from the demonstration of successful and early withdrawal of antirejection drugs after dopaminergic fetal neuroblast transplantation for Parkinson’s disease.

Route and location of delivery

Distribution of cell therapy poses very different challenges compared with small molecules and macromolecules; again, context matters. In some cases, such as β-cell replacement, donor cell function is largely independent of location. Conceptually there is no compelling reason for glucose-sensitive insulin-secreting cells to be located within the pancreas.
By contrast, precise focal targeting is required for neurological disorders and cardiac failure. The problem is compounded in diseases characterized by multifocal pathology. Stereotactic implantation is comparatively straightforward for site-specific disorders such as Parkinson’s disease or spinal cord injury, but unfeasible for diffuse and multifocal disorders, such as Alzheimer’s disease and multiple sclerosis, respectively.

Recent studies that highlight the property of stem cell ‘pathotropism’ or ‘homing’ to sites of injury in response to cytokine/chemokine gradients offer a means to circumvent this long-standing conceptual obstacle to cell-based therapies for a range of disorders. In situ hybridization analysis of Y chromosomes in female hearts transplanted into male recipients provides some evidence for extracardiac origin of cardiac cells, although these were predominantly endothelial cells. Several experimental studies have also shown homing of peripherally delivered cells to the injured heart and brain. However, the significance of limited homing is unclear. Estimates from experimental and clinical studies suggest less than 5% cardiac retention 2 h after infusion of bone marrow-derived cells. This may, in part, explain why clinical trials in cellular cardiomyoplasty have not thus far shown long-term benefit.

Reproducibility and scale

Regardless of the precise method deployed to generate a functional cell type from stem cells, widespread clinical application requires scale and targeted delivery. In many ways this is essentially indistinguishable from standard pharmaceutical practice, which requires upscaling and automation. Ultimately, protocol effectiveness will need to be user independent, with adoption of mass production techniques sufficient to generate scaleable production of cells. Aside from logistical and manufacturing issues, the variability of cell lines needs to be addressed.
No two lines are the same with regard to epigenetic, molecular, or immunological factors, or indeed ease of differentiation to a given germ layer and its cellular derivatives. The potential for personalized cell lines both complicates and potentially resolves these issues.

In summary, successful regeneration is an incremental process that begins with in vitro generation of uniform and scaleable numbers of the correct cell type, followed by in vitro and ultimately in vivo demonstration of appropriate distribution, connectivity, survival, and function.

Translational considerations: Testing novel regenerative therapeutics

An overlooked area in regenerative medicine is the critical importance of patient selection and optimal trial design to ensure correct evaluation of novel reparative therapies. Identifying patients with the right disease is an obvious point, but not as simple as it may appear. With the emergence of genetically stratified trials for mendelian disease (such as Huntington’s disease), and the increased recognition of molecular subtypes affecting disease progression (such as in malignant gliomas), it is likely that genetically selective study cohorts for sporadic disease will follow. Furthermore, it is also essential that any patients studied are at the appropriate stage of disease for the proposed intervention. To do otherwise is likely to introduce noise, lead to type 2 errors, and contribute to inconsistent results in early phase clinical trials.

Efficacy demonstrated in early phase II trials needs to be extended to the demonstration of sustained clinical benefit and safety in definitive phase III studies, with sham treatment included as a robust control where feasible.

Recognizing the limitations of preclinical animal studies, it is often necessary to enter the clinic in advance of understanding the mechanism of efficacy.
Indeed, creative trial design and outcome measures should be sought to allow early trials not only to test efficacy but also to inform on putative mechanism of action. This is illustrated by the experience to date of cell replacement in Parkinson’s disease, cellular cardiomyoplasty, and the use of adult mesenchymal stem cells in graft-versus-host disease (see next).

Neurological repair

Although more than 250 cell transplantations involving Parkinson’s disease patients have been undertaken, it is only recently that the importance of patient selection has emerged. Historically, and not unexpectedly for a novel treatment, cell transplantation was undertaken in patients with comparatively advanced disease who had become refractory to conventional treatments. Unexpected results from a randomized study have since led to the re-evaluation of the role of cell implantation, and an emerging consensus is that patients with comparatively early onset Parkinson’s disease are the ideal recipients of cell implantation therapy, to minimize adverse events such as graft-induced dyskinesias. Disability scores, the need for adjunctive pharmacological therapies, and functional imaging together provide reasonable metrics of efficacy. The recognition of such clinical parameters and variation has led to the development of larger scale, multisite studies for definitive evaluation of cell transplantation in Parkinson’s disease.

The importance of identifying the right cohort for the proposed intervention can be further illustrated in neurological medicine with regard to multiple sclerosis. Patients with early active relapsing–remitting disease require disease-modifying therapy (immunomodulatory), whereas those with advanced progressive disease characterized by significant neurodegeneration require neuroprotection and repair.

Cardiac repair

Regardless of cell type, clinical indication and the timing of intervention matter (e.g. the needs of acute versus chronic ischaemia differ from those of end-stage heart failure). For example, stem cell therapy for non-ischaemic heart failure (e.g. dilated cardiomyopathy) in phase II studies has been associated with more consistent functional improvement compared to stem cell trials in ischaemic heart disease. Furthermore, in contrast to diabetes or Parkinson’s disease, where the mechanism of efficacy is known, this is currently less understood in cardiomyoplasty. It follows therefore that standardization of endpoints should necessarily focus on functional measures (ventricular ejection fraction) and patient disability scores. In this respect, consideration of disease trajectory is important. For example, a recent randomized study of intracardiac injection of expanded bone marrow cells in patients with advanced heart failure (NYHA III-IV) has demonstrated a 37% reduction in adverse clinical events at one year. This is likely to be of clinical value given the baseline prognosis of the condition, despite no significant changes being noted in left ventricular function.

Diabetes

The Edmonton experience has been instrumental in providing proof of concept of islet transplantation and has also revealed that insulin independence, the ultimate goal, is short-lived. An understanding of the mechanisms of normal islet β-cell self-renewal and of the fate of transplanted islets is needed to take forward further transplantation studies. However, trials to date demonstrate that those with ‘brittle’ diabetes and recurrent hypoglycaemia appear to benefit the most, regardless of insulin independence, illustrating the value of cohort subselection. The outcomes of ongoing phase II trials using subcutaneously implanted encapsulated β-cells are likely to inform further translational work in this area.

Regulatory considerations

Irrespective of source, stem cell culture and expansion need to fulfil several mandatory criteria for therapeutic application. Although clinical keratinocyte protocols presently use bovine serum and feeder cells, future stem cell therapies will need to conform to good manufacturing practice conditions, which are most likely to stipulate exclusive use of chemically defined and human-derived components. Currently, many culture (and differentiation) protocols require animal products or unknown factors present in conditioned media or proprietary supplements. Regardless of the precise details, stem cell-based therapies will ultimately need to conform to internationally agreed guidelines laid down by regulatory bodies such as the United States Food and Drug Administration (FDA) and the European Medicines Agency (EMA).

Future prospects

Tissue replacement and solid organ transplantation

Using stem cells to generate solid organs is an important goal in regenerative medicine. Ex vivo organogenesis represents both an engineering and a biological challenge. The use of appropriate scaffolds for cells to grow and differentiate is one approach that has yielded some success.
Tissue-engineered autologous bladders from urothelial and muscle cells seeded on a collagen–polyglycolic acid matrix have been successfully used in patients requiring cystoplasty, and some biological and synthetic tissue scaffolds have since been developed which will inform future clinical trials.

A key challenge in generating tissue replacements for tubular organs for clinical evaluation is determining optimal cell-scaffold combinations and cell seeding to achieve successful epithelialization. Aside from urology, this is an area of ongoing work for organs such as the gastrointestinal tract and trachea.

Use of a natural organ scaffold has been suggested as a potential solution for more complex organs. Building on previous studies using decellularized heart valve grafts, successful recolonization of a completely decellularized heart (with an extracellular matrix and vascular structure) with cardiac and endothelial cells has been demonstrated, with some evidence of pump function.

A second challenge is achieving the right architecture when the organ is composed of multiple cell types (e.g. despite the success and life-saving nature of autologous keratinocyte grafts, reconstruction of sweat glands, hair follicles, and melanocytes has not been achieved). This challenge is particularly relevant to bioengineering an artificial kidney, arguably the organ with the highest demand. The kidney’s characteristic anatomical and topographical nephron arrangement develops from a specific reciprocal induction process between the ureteric bud and the metanephrogenic mesenchyme; replicating this in vitro is still a long way from being achieved.

Stem cell repair independent of differentiation potential

Stem cells can be therapeutic by two mechanisms: first, by supplementing repair (exogenous) and, second, by enhancing endogenous repair. Although exogenous repair through cell/tissue replacement is conceptually straightforward, the promotion of endogenous repair and tissue protection is an area of active research that may ultimately deliver the larger clinical gain.

Using stem cells therapeutically for properties independent of their ability to be differentiated into a specific cell type is complementary to the classic view of stem cells as a means of replacing lost cells. This notion proposes that stem cells that display unexpected properties, including immunoregulation, pathotropism, and the ability to function as cellular ‘mini-pumps’, can be harnessed to promote tissue protection and endogenous repair.

Stem cells as cellular immunomodulators have already entered the clinic and are undergoing clinical trials in various disease contexts. In 2004, Le Blanc and colleagues reported striking remission of severe treatment-refractory graft-versus-host disease following intravenous infusion of allogeneic mesenchymal stem cells, an innovative approach that was undertaken in advance of definitive experimental proof of concept. Similar findings have since been reported in preclinical studies on animal models of autoimmune disease including multiple sclerosis, Crohn’s disease, and rheumatoid arthritis. These studies highlight the potential value of stitching together two increasingly recognized properties of stem cells (the ability to traffic to sites of injury and to recalibrate a dysregulated hostile immune system) in the context of inflammatory or immune-mediated disease.

Alternatively, stem cells can be used as cellular vehicles for the delivery of protective or reparative factors, which may be produced by default or by genetic overexpression. Growth factors have been shown to have a beneficial effect in several neurological diseases including Parkinson’s disease and motor neuron disease. In this regard accumulating evidence suggests that some of the more promising results from stem cell trials in cardiac repair cannot be accounted for by graft-derived cell/tissue replacement but rather by graft-derived trophic-mediated support.

More recently, stem cells have also been used as a means of enzyme replacement in metabolic diseases. Implanted neural stem cells have been shown to prolong survival in an animal model of Sandhoff’s disease through a variety of mechanisms, and display synergy with oral medication. This study further highlights the multifaceted action(s) of stem cells, with cell replacement, anti-inflammatory, and enzyme replacement properties all implicated as contributory to efficacy.

Endogenous repair, disease modelling, and drug discovery

Endogenous repair

The promotion of endogenous repair is an intuitive and attractive long-term regenerative strategy. Recognition of adult stem cells in organs hitherto considered incapable of self-renewal (brain and heart) has only fuelled such a proposition. The evidence for endogenous niche-resident adult neural stem cells is irrefutable, notwithstanding the disputed ‘multipotentiality’ of widely distributed oligodendrocyte precursor cells. Increasingly persuasive studies also appear to confirm the presence of endogenous cardiac progenitor/stem cells, and a recent mammalian report provides strong evidence for endogenous cardiac repair that occurs after injury but not age-related loss. Other findings that suggest endogenous replacement of islet β cells raise the prospect of parallel and complementary strategies to cell implantation in patients with diabetes with some intact β-cell tissue.

An outstanding question is whether limited numbers of stem cells in restricted niches are relevant to organ repair given that damage is often extensive and geographically distant. Furthermore, the physiological role of such cells as well as their response to injury is unknown. Nevertheless, these cells and their progeny provide a rational cellular target for pharmacological compounds to activate, mobilize, and thus promote cell-mediated repair (Fig. 3.7.6). A complementary cell-based approach could seek to isolate endogenous (typically slow-cycling) stem cells and reimplant them to the injured organ after ex vivo expansion. Such a strategy is well established for haematological stem cell therapy in the context of malignancy.

Direct in vivo reprogramming to facilitate endogenous repair

Successful reprogramming of differentiated cells into pluripotent stem cells using four transcription factors has led to a second wave of cellular reprogramming in which lineage-restricted transcription factors are harnessed to convert one type of somatic cell to another. Direct cellular reprogramming has been successfully used to generate neurons, glial cells, hepatocytes, and cardiomyocytes in vitro using combinations of transcription factors that control the corresponding cell fate and development. This concept provides an attractive proposition in endogenous tissue repair: regenerating affected cells in the damaged organ by switching the fate of locally residing support cells. Recent studies in rodent models have demonstrated the therapeutic potential of direct reprogramming, such as the direct conversion of hepatic myofibroblasts to hepatocytes in vivo by overexpression of four transcription factors; the resulting hepatocytes were subsequently capable of ameliorating chemically induced liver fibrosis. Similar strategies have also been demonstrated to convert gastric antrum cells to insulin-producing β cells in vivo.

As with other proposed regenerative therapies, the major challenge for clinical translation of these promising proof-of-concept studies will be demonstration of target-specific delivery, clinical safety (particularly for viral-based gene delivery), and long-term efficacy.

Disease modelling and drug discovery

By virtue of their proliferation and differentiation potential, stem cells offer a unique experimental resource for drug discovery and in vitro disease modelling. These opportunities converge on improved understanding of disease pathogenesis, endogenous repair, and failure to repair normally, and thus together they provide clues to novel regenerative approaches (Fig. 3.7.6).
Human stem cells and their derivatives provide a unique opportunity for disease modelling and for understanding the genetic and/or environmental influences on many human disorders. The scalable and precise differentiation of human pluripotent stem cells into functional derivatives can be adapted to high-throughput and automated systems, which have the potential to revolutionize drug discovery. Refinements to a human stem cell-based platform include the incorporation of several complementary strategies: (1) generating iPS cells from patients with specific (or unknown) gene mutations or polymorphisms and differentiating these to the lineages affected in that disease (e.g. cystic fibrosis); (2) modelling disease directly by genetic overexpression or by gene inactivation or silencing, facilitated by the development of more powerful gene-editing technologies; and (3) modulating environmental parameters to replicate disease conditions (e.g. hyperglycaemia in diabetes). Furthermore, the development of advanced culture techniques such as microfluidics and 3D bioprinting can enable the evaluation of more complex cellular environments over a longer time period.

Following on from an understanding of pathological mechanisms, in vitro disease models can be used for drug evaluation and testing. In this respect, stem cells offer distinct advantages over current human sources used in drug screening, which include primary tissue (capable of only limited proliferation) and tumour cell lines (which have a grossly aneuploid genome). Evaluation of drug targets can be approached through either a high-throughput phenotyping-based screening approach (with subsequent deconvolution) or testing of candidate compounds with a disease-modifying rationale. Simple and measurable outcomes, such as fluorescent gene reporters or cell survival over time, will be necessary for any drug-based assay in order to allow sufficient scalability. In addition to disease modelling and drug discovery, stem cell-derived lineages can be utilized for evaluation of drug toxicity.

Fig. 3.7.6  Stem cells and drug discovery. A source of potentially unlimited numbers of non-transformed human cell types presents multiple opportunities in drug discovery and development. High-throughput stem cell-based screening can result in the identification of novel disease-modifying compounds. Their safety and differential efficacy can subsequently be determined in secondary assays utilizing stem cell-derived material, ultimately leading to the development of candidate drugs that can be evaluated through the conventional clinical trial route. (Figure labels: assay development; high-throughput screening; secondary assays; clinical testing; gene editing; patient cell lines; environmental modulation; fluorescent gene reporters; cell survival; colorimetric assays; physiological parameters; toxicology; pharmacokinetics; human variants; lead compounds; candidate drug; safety and efficacy.)

Conclusion

Regenerative medicine, although in its infancy, will become of increasing importance in the face of the rising global challenge of diseases such as diabetes, neurodegeneration, and heart failure. Human stem cell biology is rapidly advancing, with significant progress already made in technology enabling efficient pluripotent cell derivation, genetic manipulation, and directed differentiation. It is likely to lead to significant gains in understanding of disease mechanisms and thus open new therapeutic opportunities (cell and pharmacologically based), both to modify disease course and to promote repair of the injured organ.

Stem cells can be exploited directly and indirectly to promote repair. Specifically, stem cell-based methods or insights seek to supplement and enhance, where appropriate, endogenous repair. Cell implantation strategies require the ability to generate large numbers of defined functional cell populations appropriate to clinical need (e.g. pancreatic islet cells for diabetes or midbrain dopaminergic neuroblasts for Parkinson’s disease). In this regard human embryonic- or iPS-derived populations offer significant advantages on account of their developmental competence.
However, beyond generation of specific cell populations for replacement strategies, it is an oversimplification to view repair as simply recapitulation of development given the distinct cellular architecture of adulthood complicated by injury-related structural and biochemical changes. In addition to classic cell or tissue replacement, the evolving concept of ‘therapeutic stem cell plasticity’ offers additional methods through which stem cells may be useful for regenerative medicine. Outside of drug discovery, these include utilizing stem cells to limit damage and promote tissue repair by acting as cellular vehicles to deliver trophic/angiogenic factors or as cellular immunomodulators.

Time to clinic is less easily predicted. This will vary and, as with any innovative treatment, there will be a trade-off between justifiable risk and benefit. The ability of human stem cells to both inform on and potentially treat devastating and frequently untreatable disorders provides cautious grounds for optimism that stem cells will accelerate the emergence of novel therapeutics for regenerative medicine.

3.8 The evolution of therapeutic antibodies

Herman Waldmann and Greg Winter

ESSENTIALS

The development of rodent monoclonal antibodies opened the door to the creation of antibodies specific to soluble and cell-surface antigens. ‘Humanized’ therapeutic antibodies, the so-called biologics, have emerged as blockbuster drugs for the treatment of cancer, immune, and inflammatory disorders. In this short chapter, two scientists who made seminal contributions to this field and remain actively engaged in its development give a personal account of how these remarkable developments came about.

Introduction

There has been a revolution in the pharmaceutical industry: antibodies have emerged as major blockbuster drugs for the treatment of cancer and immune or inflammatory disorders. Much of this revolution was spearheaded in Cambridge, England, initiated by the research of César Milstein and Georges Köhler at the MRC Laboratory of Molecular Biology, who, with N.K. Jerne, shared the 1984 Nobel Prize in Physiology or Medicine. As related in this personal perspective, Cambridge scientists and clinicians took up the challenge of developing the original murine antibodies into powerful pharmaceuticals that can be administered repeatedly without the dire consequences of alloimmunization.

Monoclonal antibodies

The technological discoveries related to the generation of rodent monoclonal antibodies (mAbs) by Köhler and Milstein in 1975 opened the door to the creation of antibodies specific to soluble antigens and to cell-surface antigens. Such antibodies not only had the potential to kill the cells or block the molecules involved in disease processes, but were amenable to industrial production in cell cultures. There were, however, several uncertainties about their potential as therapeutic agents. For example, it was not clear whether (as agents directed to a single site on a cell-surface antigen) they would be capable of recruiting lytic payloads of the body’s complement system and myeloid cells. Nor was it clear whether the immunogenicity of rodent mAbs in humans would lead to human antimouse antibodies that would block therapy. Indeed, by the mid-1980s, immunogenicity was emerging as a key concern for the application of mAbs as therapeutic agents.

We were witnesses to the discovery and early development of mAbs, and independently, and for differing reasons, embarked on research programmes leading to the reduction of the immunogenicity of antibodies while ensuring therapeutic efficacy. One of us (HW) sought to reprogram the immune system to make it more tolerant to foreign antigens, and to restore tolerance in autoimmune disease; the other (GW) sought to use genetic engineering to render rodent antibodies as human as possible.

Making ‘humanized’ monoclonal antibodies

The starting point came from the work of several scientists, including the late Michael Neuberger, a close colleague at the MRC Laboratory of Molecular Biology. By genetic engineering, Neuberger created mouse-human chimeric antibodies in which the antigen-binding (variable) domains of rodent mAbs were linked to the effector (constant) domains of human antibodies. Chimeric IgG antibodies, however, comprised light chains that were only 50% human and heavy chains that were still 25% rodent, leaving a substantial degree of ‘foreignness’ and potential for immunogenicity. GW reasoned that it should be possible to reduce ‘foreignness’ still further.

It had long been supposed that the six hypervariable regions of antibodies, which were mainly loops located on one face of the associated variable domains, were responsible for binding antigen (leading to their naming by Kabat as complementarity determining regions, or CDRs). GW’s innovation was to replace the CDRs of model human antibodies by those from rodent mAbs, and thereby endow the human antibodies with the binding activities of the rodent mAbs. These ‘humanized’ antibodies could comprise as little as 5% foreign sequences, and as the CDRs differed between human antibodies anyway, it was suspected that they might be no more immunogenic than fully human antibodies. Indeed, these antibodies were originally termed ‘reshaped’ human antibodies, and can, in this context, be regarded as a synthetic species of human antibody.
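The proportions quoted for chimeric antibodies follow from simple domain counting, and the short Python sketch below makes that arithmetic explicit. It is an illustration only and rests on two simplifying assumptions that are not stated in the text: that every immunoglobulin domain is the same size, and that only the constant domains are human. The names `LIGHT_CHAIN`, `HEAVY_CHAIN`, `human_fraction`, and `igg_human_fraction` are hypothetical.

```python
# Illustrative domain counts for one IgG light chain and one IgG heavy chain.
# Assumption: all domains are the same size (in reality only roughly so).
LIGHT_CHAIN = {"variable": 1, "constant": 1}   # VL + CL
HEAVY_CHAIN = {"variable": 1, "constant": 3}   # VH + CH1, CH2, CH3


def human_fraction(chain):
    """Human fraction of a chimeric chain: constant domains / all domains."""
    total = chain["variable"] + chain["constant"]
    return chain["constant"] / total


def igg_human_fraction():
    """Human fraction of a whole chimeric IgG (two heavy + two light chains)."""
    human = 2 * HEAVY_CHAIN["constant"] + 2 * LIGHT_CHAIN["constant"]
    total = 2 * sum(HEAVY_CHAIN.values()) + 2 * sum(LIGHT_CHAIN.values())
    return human / total


if __name__ == "__main__":
    print(f"light chain: {human_fraction(LIGHT_CHAIN):.0%} human")
    print(f"heavy chain: {human_fraction(HEAVY_CHAIN):.0%} human")
    print(f"whole IgG:   {igg_human_fraction():.0%} human")
```

On this count a chimeric light chain is half human, a chimeric heavy chain is three-quarters human (one-quarter rodent), and the whole IgG is about two-thirds human. The figure of about 5% foreign sequence for humanized antibodies cannot be derived in this way, because it depends on the lengths of the CDRs rather than on domain counts.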

Moving humanized monoclonal antibodies into clinical practice

At the time of GW’s discovery, HW’s group in Cambridge had generated a lytic rat antilymphocyte antibody (CAMPATH-1) with potential in reprogramming the immune system in autoimmune diseases and transplantation, as well as for treatment of lymphocyte malignancies. Concerns about its potential immunogenicity were creating uncertainties for its development as a therapeutic agent, especially for repeat treatments. HW and GW decided to collaborate on the creation of a humanized version of the CAMPATH-1 antibody, but the work did not prove to be straightforward. We discovered that simply ‘transplanting’ murine CDRs onto a human framework was not sufficient to transfer antigen binding. This could, however, be ‘corrected’ by mutating framework residues thought to be important for the folding of the CDRs. We also knew that different human IgG isotypes varied in their ability to activate complement and harness Fc-dependent ‘myeloid-based’ lytic mechanisms, and showed that the human IgG1 isotype was the most effective for these lytic functions.

We moved quickly to test the efficacy of the humanized antibody in the clinic. We were fortunate, as Geoff Hale in HW’s group had established a manufacturing facility in Cambridge, the Therapeutic Antibody Centre (TAC), where we could manufacture clinical grade antibody. The humanized CAMPATH-1 antibody was used for treatment of three patients at Addenbrooke’s hospital, two with lymphocyte malignancies and one with an intractable severe vasculitis. We were all amazed at the spectacular effects of the antibody in these three patients. In the two patients with lymphocyte malignancies we saw a substantial reduction of tumour mass without significant side effects, and the patient with autoimmune disease underwent a long-term remission from the short-term therapy in what had been an otherwise refractory disease.
These exciting outcomes provided the platform for the evolution of Alemtuzumab/Lemtrada as a treatment in chronic lymphocytic leukaemia, and later, through work with Alastair Compston and Alastair Coles at the Department of Clinical Neuroscience, for the treatment of relapsing remitting multiple sclerosis. Furthermore, they helped to validate the use of humanized antibodies in the clinic, and to catalyse the antibody engineering revolution from which so many valuable new drugs have emerged in the past 25 years.

Development of other types of monoclonal antibodies

The interest in creating human antibodies by genetic engineering did not stop there. GW and colleagues developed approaches to derive human antibodies from large libraries of human antibody variable domains without the need to immunize animals, work for which GW was awarded a Nobel Prize in 2018. In turn this led to the development of Humira by Cambridge Antibody Technology in collaboration with the biotechnology company Knoll; this was the first fully human antibody to be approved for therapy by the US FDA. In parallel, Marianne Bruggemann (at AFRC Babraham), Neuberger, and colleagues pioneered the development of transgenic mice with a human V-gene locus, allowing hybridoma technology to be used for the isolation of human antibodies from immunized mice. Other ingenious approaches have subsequently been developed to make human antibodies. The historical progression of engineering antibodies towards a more human form is outlined in Fig. 3.8.1, with many human-like antibodies emerging as effective therapeutics.

The problem of immunogenicity

Immunogenicity directed to CDRs does occur for many antibodies, including fully human antibodies, at least in some patients, although reporting of immunogenicity has not been as extensive as one might hope. As there is no natural tolerance to these CDR regions, additional strategies are desirable to eliminate that residual immunogenicity.
One strategy has been to create mutations in the residual immunogenic sites, such mutants being designed to eliminate the T-helper and/or B-cell epitopes of an antibody, but there is as yet no longer-term clinical evaluation of the extent to which this is achievable.

Another strategy has been to establish immunological tolerance to the immunogenic epitopes within the CDR regions. Building on classical studies on tolerance, HW noted that most foreign antibodies binding to blood cells (and thereby aggregated) were potently immunogenic, but that non-binders (that did not aggregate) were able to induce immunological tolerance to themselves. Indeed, by making a non-binding mutant of the humanized CAMPATH-1 antibody, HW and colleagues were able to induce tolerance to the therapeutic form. This approach may ultimately allow tolerogenicity to be built directly into therapeutic antibodies.

Fig. 3.8.1  The various engineered forms of therapeutic antibodies (rodent, chimeric, humanized, and ‘fully’ human), where the intention has been to replace rodent gene sequences with human-derived ones. In blue are shown the regions of an antibody genetically derived from the rodent. In yellow are those derived from human genes. VH and VL, variable domains of heavy and light chains, respectively; Fc, the fragment crystallizable region that carries antibody effector functions; CDRs, complementarity determining regions.

Future prospects

More generally, the modular nature of antibodies has enabled a whole new generation of engineered antibody-based products. These have allowed variations in size and pharmacokinetics, and provided opportunities for incorporation of multiple specificities in the same antibody molecule as well as the addition of domains delivering a range of desired payloads. One burgeoning area where these novel constructs have been exploited is in cancer immunotherapy, where the intent has been to recruit and activate T-cells to tumours so as to exploit their lytic and diverse proinflammatory properties. In particular, encouraging outcomes have been seen from the use of bispecific antibodies, and of chimeric antigen receptors in which antibody variable regions have been connected to T-cell receptor signalling domains.

Continuing innovations based on understanding antibody molecules and their functional interactions will surely generate new waves of therapeutic advances, targeting extracellular structures in ways that conventional small drugs have not yet achieved.

FURTHER READING

Bruggemann M, Neuberger MS (1996). Strategies for expressing human antibody repertoires in transgenic mice. Immunol Today, 17, 391–7.
Coles AJ, et al. (2006). The window of therapeutic opportunity in multiple sclerosis: evidence from monoclonal antibody therapy. J Neurol, 253, 98–108.
Gilliland LK, et al. (1999). Elimination of the immunogenicity of therapeutic antibodies. J Immunol, 162, 3663–71.
Hale G, et al. (1988). Remission induction in non-Hodgkin lymphoma with reshaped human monoclonal antibody CAMPATH-1H. Lancet, 2, 1394–9.
Jones PT, et al. (1986). Replacing the complementarity-determining regions in a human antibody with those from a mouse. Nature, 321, 522–5.
Köhler G, Milstein C (1975). Continuous cultures of fused cells secreting antibody of predefined specificity. Nature, 256, 495–7.
Mathieson PW, et al. (1990). Monoclonal-antibody therapy in systemic vasculitis. N Engl J Med, 323, 250–4.
Riechmann L, et al. (1988). Reshaping human antibodies for therapy. Nature, 332, 323–7.
Waldmann H (1989). Manipulation of T-cell responses with monoclonal antibodies. Annu Rev Immunol, 7, 407–44.
Winter G, et al. (1994). Making antibodies by phage display technology. Annu Rev Immunol, 12, 433–55.

3.9 Circulating DNA for molecular diagnostics

Y.M. Dennis Lo and Rossa W.K. Chiu

ESSENTIALS

Short fragments of cell-free DNA are released into the plasma when cells die. In patients with cancer some of this circulating DNA is released by tumour cells; in pregnant women some is derived from the fetus; and increased amounts are found in many pathological conditions associated with cell death. In each of these circumstances, analysis of cell-free DNA can provide useful clinical information (e.g. detection or monitoring of cancer, determination of mutation status of a fetus). With further improvement in analytical technologies and development of new markers, it is likely that the application of circulating cell-free DNA and cell-free RNA species in molecular diagnostics will increase in the future.

Introduction

Cell-free DNA is present in the plasma of human subjects. These DNA molecules are short fragments that are released when cells die (Fig. 3.9.1). In cancer patients a proportion of such circulating cell-free DNA is released by tumour cells and thus carries molecular signatures of cancer. Such signatures include oncogene mutations, copy number aberrations, DNA methylation changes, and viral sequences in cancers associated with virus infections. By detecting such signatures in plasma, it is possible to detect, monitor, and prognosticate cancer, and to obtain information guiding targeted therapy (Table 3.9.1).

In pregnant women, fetal DNA is found circulating in maternal plasma. Such fetal DNA is released by the placenta and carries genetic and DNA methylation signatures of the fetus (Fig. 3.9.1). Hence, analysing the plasma DNA of a pregnant woman allows determination of certain genetic characteristics of a fetus (e.g. sex, blood group type, mutation status, and chromosomal constitution; see Table 3.9.1). Such an approach has been referred to as non-invasive prenatal testing and is now used worldwide, particularly for screening for fetal chromosomal disorders such as Down’s syndrome.

Apart from oncological and prenatal applications, cell-free DNA analysis has also been explored in several emerging areas. Such applications are built on the concept that circulating DNA is a marker of cell death and hence is released in increased amounts in many pathological conditions associated with cell death (Fig. 3.9.1). Examples include the monitoring of rejection episodes following transplantation, trauma, and stroke. With further improvement in analytical technologies and development of new markers, it is likely that the application of circulating cell-free DNA in molecular diagnostics will increase in the future.

For many years, nucleic acids extracted from cellular materials (e.g. blood cells and buccal cells) were the predominant materials used for molecular analysis. However, over the last few years, there has been increased interest in the use of extracellular nucleic acids for a variety of molecular diagnostics. This chapter provides an overview of this emerging area (Fig. 3.9.1 and Table 3.9.1).

History

The discovery of cell-free DNA in plasma is attributed to Mandel and Métais in 1948. This work is remarkable as it predated the discovery of the double helical structure of DNA by Watson and Crick in 1953 and, perhaps because it was so far ahead of its time, its significance remained unrecognized for many years. In the 1970s, researchers showed that the concentrations of circulating DNA in cancer patients were higher than in those without cancer. However, the origin of such excess circulating DNA remained uncertain for many years due to the limitations in technology at that time. With the advent of the polymerase chain reaction (PCR), it was shown in 1994 by Vasioukhin et al. and Sorenson et al. that circulating DNA in cancer patients carried mutations present in the tumour cells, thus demonstrating that a proportion of such circulating DNA molecules is released by the tumour cells. This realization laid the foundation for performing ‘liquid biopsies’ of tumours, in which the sampling of bodily fluids, most typically blood, allows genomic information regarding cancer to be obtained in a non-invasive manner.

The presence of circulating tumour DNA in plasma prompted other researchers to look for other types of circulating DNA. In 1997, Lo et al. demonstrated that cell-free fetal DNA was present in the plasma and serum of pregnant women. This discovery laid the foundation for the development of non-invasive prenatal testing (NIPT). In 1998, the same group of researchers showed that DNA

was released by a transplanted organ (e.g. kidney and liver) into the plasma of a transplant recipient, which opened up a new approach for monitoring rejection following transplantation. In 1999, two independent groups of researchers demonstrated that mRNA released by tumours could be detected in the plasma and serum of cancer patients. In 2000, it was shown that fetal mRNA could also be detected in the plasma of pregnant women. This discovery of circulating mRNA has opened up the possibility of performing gene expression profiling of a tumour or fetus in a non-invasive manner. Developments since then have provided many powerful methods for analysing such circulating DNA and RNA species.

Fig. 3.9.1  Circulating nucleic acid molecules are released from dying cells into plasma, either as a result of normal cell turnover or pathologies. The detection of circulating nucleic acids has been applied as a non-invasive means for cancer assessment, prenatal assessment, transplantation monitoring, and the assessment of other pathologies, such as those associated with inflammatory, ischaemic, and immunological cellular damage. Circulating nucleic acid species that have been detected in human plasma include DNA, messenger RNA (mRNA), micro RNA (miRNA), DNA mutations, DNA methylation signatures, DNA polymorphic sequences, and viral nucleic acid sequences.

Table 3.9.1  Clinical applications of circulating nucleic acid analysis

Applications: Clinical utilities
Cancer assessment: Diagnosis; Prognostication; To inform choice of therapy; Treatment monitoring; Screening
Prenatal assessment: Fetal sex determination (sex-linked diseases, congenital adrenal hyperplasia); Blood group determination (rhesus D blood group incompatibility); Chromosomal aneuploidies (e.g. trisomy 21, trisomy 18, trisomy 13); Subchromosomal aneuploidies (microdeletions, microduplications); Single gene disease diagnosis
Transplantation: Graft rejection; Monitoring
Other pathologies (inflammatory, ischaemia, immunological): Detection; Prognostication; Identify organs involved (tissue mapping)

Circulating nucleic acids for cancer detection

The first species of circulating tumour-derived DNA detected in the plasma of cancer patients consisted of oncogene mutations. Since then, many other types of tumour-derived DNA have been detected in plasma or serum. The main difference between plasma and serum in the context of cell-free DNA is that DNA is released from the white blood cells during the blood clotting process through which serum is formed. Hence, the fractional concentration of tumour DNA in serum is typically lower than that in plasma. For this reason, most workers in the field prefer to use plasma. In addition to oncogene mutations, microsatellite alterations, fusion genes, DNA methylation changes, and viral sequences associated with cancer have been detected in plasma or serum.

In general, a cancer-associated DNA sequence that is easily differentiated from any sequence present in the human genome represents a good marker for detection. Thus, it was perhaps not surprising that viral sequences, which are not present in the human genome, were used in some of the earliest work elucidating the molecular characteristics and clinical applications of circulating DNA in cancer. One such example is the measurement of Epstein–Barr virus (EBV) DNA sequences in the plasma of patients with nasopharyngeal carcinoma, a cancer that is particularly common in south China, where it is associated with EBV infection and EBV DNA is found in the tumour tissues of virtually all cases. In the late 1990s, it was shown that high concentrations of EBV DNA could be found in the plasma and serum of nasopharyngeal carcinoma patients, and the concentrations of such circulating EBV DNA were found to increase with the stage of disease, thus suggesting that they are related to tumour load.
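The earlier point that serum carries a lower fractional concentration of tumour DNA than plasma can be checked with a small calculation. The Python sketch below is a hedged illustration, not a measurement: the function name `fractional_concentration` and all three input values are hypothetical, chosen only to show that adding background DNA from white blood cells dilutes the tumour-derived share while leaving the absolute amount of tumour DNA unchanged.

```python
def fractional_concentration(tumour_dna, background_dna):
    """Tumour-derived share of total cell-free DNA (same units for both inputs)."""
    total = tumour_dna + background_dna
    if total <= 0:
        raise ValueError("total cell-free DNA must be positive")
    return tumour_dna / total


# Hypothetical values in arbitrary units (for example, genome copies per ml).
TUMOUR_DNA = 10.0
BACKGROUND_PLASMA = 90.0
EXTRA_FROM_CLOTTING = 100.0   # DNA released by white blood cells during clotting

plasma_fraction = fractional_concentration(TUMOUR_DNA, BACKGROUND_PLASMA)
serum_fraction = fractional_concentration(
    TUMOUR_DNA, BACKGROUND_PLASMA + EXTRA_FROM_CLOTTING
)

if __name__ == "__main__":
    print(f"plasma: {plasma_fraction:.1%}")
    print(f"serum:  {serum_fraction:.1%}")
```

With these illustrative inputs the tumour fraction halves on moving from plasma to serum, which is the dilution effect that leads most workers to prefer plasma.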
Following treatment, the concentrations of circulating EBV DNA would typically decrease, and upon relapse or progression of disease the concentrations of circulating EBV DNA would increase. It has been demonstrated that such circulating EBV DNA consists of short fragments of DNA, rather than intact virions. All of the aforementioned characteristics of circulating EBV DNA are shared by many other species of tumour-derived DNA in plasma.

One powerful application of tumour DNA in plasma is for guiding treatment using targeted therapy. The presence in the plasma of cancer-associated mutations that are the targets for specific agents (e.g. epidermal growth factor receptor gene mutations that respond to tyrosine kinase inhibitors) can be used as a predictor that a patient is likely to respond to a particular therapy. Following initiation of effective therapy, DNA fragments carrying the targeted mutations would reduce in concentration in plasma. Upon emergence of a tumour cell clone that was resistant to the targeted therapy, DNA fragments carrying the originally targeted mutations would typically increase in plasma, together with those carrying the genetic signature of resistance. With the advent of massively parallel sequencing, it is now possible to detect and measure the concentration of multiple mutations in plasma accurately and sensitively.

Apart from mutations that alter the sequence of a cancer genome, cancer cells also exhibit numerous epigenetic changes, which are biochemical modifications of the genome that do not involve a change in the DNA sequence. One of the best-studied epigenetic changes is DNA methylation. There are numerous changes in DNA methylation in a cancer genome when compared with the genome of non-malignant cells, and such changes have been used as targets for detecting circulating DNA in the plasma of cancer patients. Examples of such markers include the p16 and RASSF1A genes, which are hypermethylated in multiple cancer types.
Another example is the hypermethylation of the septin 9 gene, which has been used for the screening of colorectal carcinoma.

In addition to circulating DNA-based markers, several mRNA and miRNA species that are preferentially expressed in tumour cells, when compared with nonmalignant cells, have been detected in plasma. However, most efforts to use circulating nucleic acids for cancer detection are based on DNA rather than RNA markers. Reasons for this course of development include the relative stability of DNA over RNA in general, the ease of extraction and analysis, and the specificity of the resulting tests.

Circulating fetal DNA and noninvasive prenatal testing

Cell-free fetal DNA was first demonstrated in maternal plasma in 1997 through the detection of Y-chromosomal sequences released by male fetuses into their mothers’ plasma. While such sequences are generally referred to as ‘fetal’ in origin, they are actually released from the placenta. Such cell-free fetal DNA sequences are detectable from the early first trimester. By the 10th week of pregnancy, cell-free fetal DNA represents a mean of 15% of the DNA in maternal plasma. The absolute concentration of cell-free fetal DNA per unit volume of maternal plasma continues to increase during pregnancy. However, following delivery of the fetus, cell-free fetal DNA is cleared very quickly, with an estimated half-life of some 16 minutes. Certain pregnancy-associated disorders, such as pre-eclampsia and preterm labour, have been associated with an increase in the absolute concentration of cell-free fetal DNA in maternal plasma.

Cell-free fetal DNA in maternal plasma consists of short fragments of DNA which, interestingly, have a shorter size distribution than the background maternally derived DNA in plasma. This size difference has been exploited as an approach for enriching fetal DNA, either physically (e.g. through electrophoresis) or bioinformatically (e.g. by computationally ‘targeting’ the short DNA fragments that have been sequenced from maternal plasma). It should, however, be noted that the size difference between circulating fetal and maternal DNA does not allow the complete separation of these two species of circulating DNA.

The first diagnostic applications of circulating cell-free DNA involve DNA sequences that the fetus has inherited from its father and which are absent from the pregnant mother’s genome. The first example is the Y-chromosomal sequences of a male fetus. The detection of fetal Y-chromosomal sequences in maternal plasma enables a noninvasive approach for determining the sex of a fetus. Such an approach is useful in the prenatal investigation of pregnant women who are carriers of mutations for sex-linked disorders (e.g. haemophilia). The second example is the RHD sequences of an RhD-positive fetus, which code for the RhD blood group antigen and are absent from the genome of an RhD-negative pregnant mother. This approach is useful for investigating RhD blood group incompatibilities between the pregnant mother and her fetus. The third example of a sequence that is present in the fetus but absent from the pregnant mother’s genome is a mutation that the fetus has inherited from its father. When such an approach is used for the detection of an autosomal dominant disorder, the presence of a paternally inherited mutation in the plasma of a pregnant woman who does not have that mutation herself indicates that the fetus has inherited it and thus is at risk of the disorder. The first such mutation detected was a mutation of the fibroblast growth factor receptor 3 gene causing achondroplasia.

When such an approach is used for an autosomal recessive disorder, one method is to focus on genetic disorders which can be caused by multiple mutations and in which the father and the mother of the fetus carry different mutations. One then attempts to detect the paternally inherited mutation in the maternal plasma. Provided that the assay is sensitive enough, the absence of such a mutation in maternal plasma is taken to indicate that the fetus has not inherited the paternal mutation and hence does not suffer from the disorder. Conversely, the detection of the paternally inherited mutation in maternal plasma would indicate that the fetus has inherited the paternal mutation, but it does not provide any information regarding inheritance of the maternal mutation; hence invasive prenatal diagnosis would still be necessary in this circumstance.

The development of more precise methods for measuring DNA sequences in plasma, such as single molecule PCR (or ‘digital’ PCR), has allowed the informativeness of plasma DNA-based noninvasive prenatal testing for autosomal recessive disorders to be enhanced. This approach is based on the concept that, in the genome of a pregnant woman who is a carrier for an autosomal recessive disorder, the ratio of the mutant copy to the normal copy of the gene implicated in the disorder should be 1:1.
Hence, in the maternal plasma, the ratio of the pregnant mother’s own mutant and normal copies of the gene should also be 1:1. However, the fetus would be releasing its own DNA into maternal plasma as well. If the fetus had one copy of the mutant gene and one copy of the normal gene, then the ratio of these versions of the gene would remain unchanged at 1:1 (Fig. 3.9.2). On the other hand, if the fetus had two copies of the mutant gene, then the concentration ratio in maternal plasma would be biased in favour of the mutant gene. Finally, if the fetus had two copies of the normal gene, then the concentration ratio in maternal plasma would be biased in favour of the normal gene. This method has been referred to as the relative mutation dosage approach and has been used in autosomal recessive disorders such as β-thalassaemia and sickle cell anaemia. Apart from autosomal recessive disorders, this approach can also be used for sex-linked disorders in which the mother is a carrier of the disease gene on the X chromosome (e.g. haemophilia A).

In addition to measuring the dosage of the mutant and normal gene, one can also measure the relative dosage of alleles of single nucleotide polymorphisms (SNPs) that are linked to the gene. Such an approach is particularly robust when multiple SNPs, which are grouped together in a haplotype, are analysed. It is referred to as the relative haplotype dosage (RHDO) method (Fig. 3.9.2), and has been used for the noninvasive prenatal testing of genetic disorders such as congenital adrenal hyperplasia, β-thalassaemia, and haemophilia A. Haplotype information is conventionally constructed by analysing DNA samples from multiple family members. However, the recent availability of methods for direct haplotype determination has simplified the generation of haplotype information for RHDO analysis.

Fig. 3.9.2  Noninvasive fetal genotyping for single gene disease diagnosis is performed for women who are carriers of a disease mutation by comparing the relative abundance of the mutant and normal alleles in maternal plasma. When a fetus is homozygous for the mutation, the plasma DNA sample shows an overrepresentation of molecules carrying the mutation (left panel). When a fetus is heterozygous for the maternal mutation, there is equal representation of the mutant and normal alleles in maternal plasma (middle panel). When a fetus is homozygous for the normal allele, the plasma DNA sample shows an overrepresentation of molecules carrying the normal allele (right panel). The assessment could be performed by directly comparing the amounts of the DNA molecules carrying the normal and mutant alleles, termed relative mutation dosage (RMD). Alternatively, a haplotype-based quantitative comparison could be made, termed relative haplotype dosage (RHDO) analysis. The cumulative abundance of the polymorphic alleles linked to the mutation would contribute to the measured representation of the mutant allele. Similarly, the cumulative abundance of the polymorphic alleles linked to the normal allele would contribute to the measured representation of the normal allele. Fetal DNA molecules in maternal plasma are shorter than the maternal molecules, hence these are depicted as the smaller molecules in the illustration. The paternally inherited fetal DNA molecules are depicted as the orange molecules.
[Figure key: mutation; normal allele; linked polymorphic alleles. In all three panels the mother is heterozygous.]

Perhaps the most important application of NIPT to date is the use of this technology for detecting fetal chromosomal aneuploidies such as Down’s syndrome. The general concept of this approach is the detection, in maternal plasma, of the subtle quantitative aberration in the chromosome involved in the aneuploidy. For example, for the detection of fetal Down’s syndrome, there would be a small increase in the concentration of chromosome 21-derived DNA sequences in maternal plasma compared with the other chromosomes. One widely used method for detecting such a quantitative aberration of sequences in maternal plasma is massively parallel sequencing. Apart from Down’s syndrome, such an approach has also been used for the NIPT of trisomy 18, trisomy 13, sex chromosome aneuploidies, and even aberrations involving only part of a chromosome (e.g. a subchromosomal deletion or duplication).
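As a rough illustration of the counting approach described above, the following Python sketch compares the proportion of sequenced fragments that map to chromosome 21 in a test sample with the proportions seen in a set of reference pregnancies, and expresses the difference as a z-score. This is one commonly described way of quantifying such an aberration and is not necessarily the method used in any particular clinical test. The reference proportions, the z-score cut-off of 3, and all function names are assumptions made for illustration only.

```python
from statistics import mean, stdev

# Hypothetical chromosome 21 read proportions from euploid reference
# pregnancies (illustrative values, not real data).
REFERENCE_CHR21_PROPORTIONS = [0.01290, 0.01310, 0.01295, 0.01305, 0.01300]

# Illustrative cut-off (an assumption, not taken from the text).
Z_CUTOFF = 3.0


def chr21_z_score(test_proportion, reference_proportions):
    """Return how many reference standard deviations the test sample's
    chromosome 21 proportion lies from the reference mean."""
    mu = mean(reference_proportions)
    sd = stdev(reference_proportions)
    return (test_proportion - mu) / sd


def expected_trisomy_proportion(euploid_proportion, fetal_fraction):
    """Expected chromosome 21 proportion when the fetus is trisomic.

    Fetal DNA carries 3 copies of chromosome 21 instead of 2, so the fetal
    contribution from that chromosome is scaled by 1.5.
    """
    p, f = euploid_proportion, fetal_fraction
    chr21 = p * (1 - f) + 1.5 * p * f  # maternal plus fetal chr21 DNA
    other = 1 - p                      # all other chromosomes, unchanged
    return chr21 / (chr21 + other)


if __name__ == "__main__":
    baseline = mean(REFERENCE_CHR21_PROPORTIONS)
    for fetal_fraction in (0.02, 0.10):
        p = expected_trisomy_proportion(baseline, fetal_fraction)
        z = chr21_z_score(p, REFERENCE_CHR21_PROPORTIONS)
        flag = "raised" if z > Z_CUTOFF else "not distinguishable"
        print(f"fetal fraction {fetal_fraction:.0%}: z = {z:.1f} ({flag})")
```

With these illustrative numbers, the trisomic sample with a 10% fetal fraction exceeds the cut-off whereas the one with a 2% fetal fraction does not, which shows why the fractional concentration of fetal DNA matters for this kind of analysis.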
Although NIPT for chromosomal aberrations such as Down’s syndrome is highly accurate, its accuracy is not 100%, and so it is widely regarded as a screening test rather than as a diagnostic test. Hence, an abnormal result would still need to be confirmed with an invasive method (e.g. amniocentesis). There are multiple reasons why NIPT is not, and perhaps cannot be, 100% accurate. Firstly, the robustness of NIPT is related to the fractional concentration of fetal DNA in a particular maternal plasma sample. It is known that some 1–2% of maternal plasma samples contain very low amounts of circulating fetal DNA and hence would be regarded as uninformative for NIPT for chromosomal aneuploidies. Secondly, fetal DNA in maternal plasma is derived from the placenta. It is possible that the placenta might contain a clone of cells with a different chromosomal constitution from that of the cells of the fetus’s body, a phenomenon known as confined placental mosaicism. Thirdly, in twin pregnancies, DNA is released by both fetuses into the maternal plasma. Occasionally, if one twin has died, in a scenario that has been referred to as a ‘vanishing twin’, its placenta might continue to release DNA into maternal plasma. If the vanished twin had a chromosomal aneuploidy, then such an aberration might be observed in maternal plasma and might be regarded as a ‘false-positive’ result for the remaining healthy twin.

NIPT for chromosomal aneuploidies is now performed in over 90 countries, in millions of pregnant women per year, as a result of which some centres have reported a reduction in the use of invasive prenatal testing by some 30%.

With the increased use of NIPT worldwide, several groups have encountered women who have cancer during pregnancy. Such subjects exhibit genomic aberrations that are detectable in their plasma and that originate from the cancer, rather than from the fetus.
These studies suggest that pregnant women should be informed about this possibility during the counselling for NIPT.

Several proof-of-concept publications have shown the possibility of performing noninvasive prenatal fetal whole genome sequencing from maternal plasma. However, the high costs of such analysis and the difficulty of data interpretation and genetic counselling have restricted these to the research domain at the present time. Fetal RNA and miRNA have also been detected in maternal plasma, but owing to the relative complexity of sample stabilization, extraction, and analysis, such approaches have not entered clinical use.

Other emerging applications

In 1998, it was shown that Y-chromosomal DNA sequences could be found in the plasma of female subjects who had received a liver or kidney from a male donor through transplantation. The presence of donor-derived DNA in the plasma of transplantation recipients has now been generalized to bone marrow transplantation, heart transplantation, and lung transplantation. Furthermore, this approach has now been extended beyond sex-mismatched transplantation through the use of SNP markers which are able to differentiate a donor’s from a recipient’s DNA. In addition, through the precise measurement of the concentration of the donor’s DNA in the recipient’s plasma, using methods such as single molecule PCR or massively parallel sequencing, it has been found that rejection episodes are associated with an elevation in the donor’s DNA in plasma. This observation is consistent with the concept that DNA is released into plasma when cells die. Plasma DNA analysis is thus a noninvasive approach for monitoring rejection following transplantation.

The concept that plasma DNA is a marker for cell death phenomena is an important one, as it suggests that the concentration of plasma DNA would be elevated in many clinical scenarios associated with tissue damage.
Indeed, increased levels of circulating DNA have been reported in pulmonary embolism, stroke, trauma, and autoimmune diseases, among other conditions. Early work in this area was performed using DNA markers that were not tissue specific. However, the recent development of DNA methylation markers that are specific for particular tissues will likely lead to renewed interest in this area of work. If future technologies allow plasma DNA to be analysed more rapidly and cheaply, it is possible that circulating DNA analysis will eventually be regarded as a type of biochemical body scan, similar to how serum biochemistry is used nowadays.
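The donor-derived DNA measurement discussed in this chapter for transplantation monitoring can be sketched in a few lines of Python. The example assumes a simplified case in which, at each SNP used, the recipient is homozygous for one allele and the donor is homozygous for the other, so that every read carrying the donor allele comes from the graft. The read counts, the function names, and the rule of comparing against a patient's own baseline with a two-fold threshold are illustrative assumptions, not values from the text.

```python
def donor_fraction(snp_read_counts):
    """Estimate the fraction of plasma DNA that is donor-derived.

    snp_read_counts: iterable of (donor_allele_reads, recipient_allele_reads)
    at SNPs where donor and recipient are homozygous for different alleles
    (a simplifying assumption).
    """
    counts = list(snp_read_counts)
    donor_reads = sum(d for d, _ in counts)
    total_reads = sum(d + r for d, r in counts)
    if total_reads == 0:
        raise ValueError("no reads at informative SNPs")
    return donor_reads / total_reads


def is_elevated(current_fraction, baseline_fraction, fold_threshold=2.0):
    """Flag a rise over the patient's own baseline (hypothetical rule)."""
    return current_fraction >= fold_threshold * baseline_fraction


if __name__ == "__main__":
    baseline = donor_fraction([(5, 995), (15, 985)])    # hypothetical counts
    follow_up = donor_fraction([(40, 960), (60, 940)])  # hypothetical counts
    print(f"baseline donor fraction:  {baseline:.3f}")
    print(f"follow-up donor fraction: {follow_up:.3f}")
    print("elevated" if is_elevated(follow_up, baseline) else "stable")
```

Pooling reads across several informative SNPs, rather than relying on a single marker, is a design choice made here to reduce the effect of sampling noise at any one site.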

SECTION 4 Immunological mechanisms

Section editors: John D. Firth, Christopher P. Conlon, and Timothy M. Cox

4.1 The innate immune system (Paul Bowness)
4.2 The complement system (Marina Botto and Matthew C. Pickering)
4.3 Adaptive immunity (Paul Klenerman and Constantino López-Macias)
4.4 Immunodeficiency (Sophie Hambleton, Sara Marshall, and Dinakantha S. Kumararatne)
4.5 Allergy (Pamela Ewan)
4.6 Autoimmunity (Antony Rosen)
4.7 Principles of transplantation immunology (Elizabeth Wallin and Kathryn J. Wood)