
Efforts to determine the antibody repertoire encoded by B cells in the blood or lymphoid organs using high-throughput DNA sequencing technologies have been advancing at an extremely rapid pace and are transforming our understanding of humoral immune responses. Information gained from high-throughput DNA sequencing of immunoglobulin genes (Ig-seq) can be applied to detect B-cell malignancies with high sensitivity, to discover antibodies specific for antigens of interest, to guide vaccine development and to understand autoimmunity. Rapid progress in the development of experimental protocols and informatics analysis tools is helping to reduce sequencing artifacts, to achieve more precise quantification of clonal diversity and to extract the most pertinent biological information. That said, broader application of Ig-seq, especially in clinical settings, will require the development of a standardized experimental design framework that will enable the sharing and meta-analysis of sequencing data generated by different laboratories.

A potent adaptive immune system is fundamentally reliant upon the generation of a diverse repertoire of B-lymphocyte antigen receptors (BCRs, the membrane-bound form of antibodies expressed on the surface of B cells). BCRs are assembled by somatic recombination of a large number of immunoglobulin gene segments (Fig.). Existing mechanisms for BCR diversification can yield an astronomical number of possible BCRs (in theory, >10¹³ in humans; refs. 1, 2). This number exceeds the total number of B lymphocytes in the human body (∼1–2 × 10¹¹; ref. 1), and the repertoire of BCRs expressed in any given individual is continuously shaped by exposure to exogenous antigens and endogenous host factors. Because of labor and cost considerations, it is completely impractical to analyze such a diverse BCR repertoire using traditional Sanger sequencing. However, Ig-seq (a term coined by Andrew Fire, Stanford University) has allowed us to determine antibody gene repertoires at an unprecedented depth. The information gained by Ig-seq is proving invaluable for understanding antibody responses in health and disease and for diagnostic purposes. In addition, Ig-seq can be combined with other techniques, including expression and isolation of antigen-specific antibodies, sequencing of multiple RNAs from single cells (ref. 4), and proteomic analyses of antibodies in blood or secretions, to help elucidate the properties of antibodies that mediate protection against infectious diseases or, alternatively, that mediate autoimmune responses. In this Review we describe the experimental approaches and technical challenges related to high-throughput antibody gene sequencing, as well as the ways in which Ig-seq might be applied to advance our understanding of immunology and to address unmet clinical needs related to infectious diseases, immune dysregulation and cancer.

Figure legend. (a) In the top chains, domains encoded from germline V, D, J and C segments are indicated. Nontemplated N-nucleotides are shown in red. The top chains delineate the 5′ to 3′ genetic composition of the antibody. In the bottom chains, framework (FR) and complementarity-determining regions (CDRs) are indicated. The bottom chains delineate the N-terminal to C-terminal protein sequence. (b) Key steps in antibody diversification. The primary antibody heavy chain repertoire is created predominantly by the somatic recombination of variable (V), diversity (D) and joining (J) gene segments, and by the random nontemplated addition of N-nucleotides. The antigen-binding site of a heavy chain is formed by the juxtaposition of the hypervariable complementarity-determining regions (CDR-H1, H2 and H3) and the framework 3 region (FR3). After productive IgH rearrangement, recombination of the light chain (IgL) ensues, and the heterodimeric pairing of H and L chains forms the complete antibody of the IgM isotype that is expressed on the surface of a newly formed immature B cell.
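To make the combinatorial origin of heavy-chain diversity more concrete, the following is a minimal toy sketch of V(D)J joining with nontemplated N-nucleotide addition. It is an illustration only, not a model used in this Review: the segment names and sequences (`V_SEGMENTS`, `D_SEGMENTS`, `J_SEGMENTS`), the maximum N-addition length and the function names are all hypothetical, and the sketch deliberately omits other processes such as exonuclease trimming of segment ends and light-chain pairing.

```python
import random

# Hypothetical, made-up segment sequences for illustration only.
# These are NOT real germline immunoglobulin gene segments.
V_SEGMENTS = {"V1": "GAGGTGCAGCTG", "V2": "CAGGTCCAGCTT"}
D_SEGMENTS = {"D1": "GGTATAGC", "D2": "TACTACGG"}
J_SEGMENTS = {"J1": "TTTGACTAC", "J2": "CTTCGATCTC"}


def random_n_nucleotides(rng, max_len=6):
    """Return a random nontemplated stretch of 0..max_len bases.

    max_len is an assumed toy parameter, not a measured value.
    """
    length = rng.randint(0, max_len)
    return "".join(rng.choice("ACGT") for _ in range(length))


def recombine_heavy_chain(rng):
    """Pick one V, D and J segment and join them as V-N1-D-N2-J."""
    v = rng.choice(sorted(V_SEGMENTS))
    d = rng.choice(sorted(D_SEGMENTS))
    j = rng.choice(sorted(J_SEGMENTS))
    n1 = random_n_nucleotides(rng)  # N-addition at the V-D junction
    n2 = random_n_nucleotides(rng)  # N-addition at the D-J junction
    sequence = V_SEGMENTS[v] + n1 + D_SEGMENTS[d] + n2 + J_SEGMENTS[j]
    return {"v": v, "d": d, "j": j, "n1": n1, "n2": n2, "sequence": sequence}


def germline_combinations():
    """Count V x D x J combinations before any junctional diversity."""
    return len(V_SEGMENTS) * len(D_SEGMENTS) * len(J_SEGMENTS)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs give the same draws
    print("Germline-only combinations:", germline_combinations())
    for _ in range(3):
        rearrangement = recombine_heavy_chain(rng)
        print(rearrangement["v"], rearrangement["d"], rearrangement["j"],
              rearrangement["sequence"])
```

Even in this toy setting, the number of distinct junctions grows far faster than the germline-only count once random N-nucleotides are allowed, which is the qualitative point behind the very large theoretical repertoire sizes quoted in the text.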
