Clustering-based identification of clonally-related immunoglobulin gene sequence sets

Authors

  • Zhiliang Chen
  • Andrew M Collins
  • Yan Wang
  • Bruno A Gaëta
Abstract

BACKGROUND: Clonal expansion of B lymphocytes, coupled with somatic mutation and antigen selection, allows the mammalian humoral immune system to generate highly specific immunoglobulins (IG), or antibodies, against invading bacteria, viruses and toxins. The availability of high-throughput DNA sequencing methods is providing new avenues for studying this clonal expansion and identifying the factors guiding the generation of antibodies. The identification of groups of rearranged immunoglobulin gene sequences descended from the same rearrangement (clonally-related sets) in very large sets of sequences is facilitated by immunoglobulin gene sequence alignment and partitioning software that can accurately predict the component germline genes, but it has required painstaking visual inspection and analysis of sequences.

RESULTS: We have developed and implemented an algorithm for identifying sets of clonally-related sequences in large human immunoglobulin heavy chain gene variable region sequence sets. The program processes sequences that have been partitioned using iHMMune-align, and uses pairwise comparisons of CDR3 sequences and similarity in IGHV and IGHJ germline gene assignments to construct a distance matrix. Agglomerative hierarchical clustering is then used to identify likely groups of clonally-related sequences. The program is available for download from http://www.cse.unsw.edu.au/~ihmmune/ClonalRelate/ClonalRelate.zip.

CONCLUSIONS: The method was evaluated on several benchmark datasets and identified clonally-related immunoglobulin gene sequences more accurately and considerably faster than visual inspection by domain experts.
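The clustering step described in the abstract (a distance matrix built from CDR3 comparisons and IGHV/IGHJ assignments, followed by agglomerative hierarchical clustering) can be illustrated with a minimal Python sketch. This is not the ClonalRelate implementation: the abstract does not give the distance formula, the linkage criterion or the cut-off, so the `PartitionedSeq` record, the weights in `pair_distance`, the choice of average linkage and the `threshold` value are all assumptions made here for illustration.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class PartitionedSeq:
    """Hypothetical record for one partitioned sequence (not iHMMune-align's format)."""
    seq_id: str
    ighv: str   # assigned IGHV germline gene
    ighj: str   # assigned IGHJ germline gene
    cdr3: str   # CDR3 nucleotide sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using a two-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]


def pair_distance(x, y, w_cdr3=0.6, w_v=0.2, w_j=0.2):
    """Illustrative distance in [0, 1]; the weights are assumptions."""
    longest = max(len(x.cdr3), len(y.cdr3)) or 1
    d_cdr3 = edit_distance(x.cdr3, y.cdr3) / longest
    return w_cdr3 * d_cdr3 + w_v * (x.ighv != y.ighv) + w_j * (x.ighj != y.ighj)


def cluster(seqs, threshold=0.15):
    """Average-linkage agglomerative clustering, stopped at an assumed threshold."""
    n = len(seqs)
    # Upper-triangular distance matrix stored as a dict keyed by (i, j), i < j.
    dist = {(i, j): pair_distance(seqs[i], seqs[j])
            for i, j in combinations(range(n), 2)}
    clusters = [[i] for i in range(n)]

    def linkage(a, b):
        total = sum(dist[min(i, j), max(i, j)] for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the closest pair of clusters under average linkage.
        (p, q), best = min(
            (((p, q), linkage(clusters[p], clusters[q]))
             for p, q in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best > threshold:
            break
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return [sorted(seqs[i].seq_id for i in c) for c in clusters]


# Made-up example records: s1 and s2 differ by one CDR3 base and share V/J genes.
example = [
    PartitionedSeq("s1", "IGHV3-23", "IGHJ4", "GCGAAAGATCTGGGG"),
    PartitionedSeq("s2", "IGHV3-23", "IGHJ4", "GCGAAAGATCTGGGA"),
    PartitionedSeq("s3", "IGHV1-69", "IGHJ6", "GCGAGAGGGTACTAC"),
]
print(cluster(example))  # [['s1', 's2'], ['s3']]
```

The naive merge loop recomputes every linkage on each pass, which is acceptable for a small illustration but would be too slow for the large sequence sets the abstract has in mind; a library routine for hierarchical clustering would be the more practical choice there.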

Similar resources

Performance-optimized partitioning of clonotypes from high-throughput immunoglobulin repertoire sequencing data

Motivation: During adaptive immune responses, activated B cells expand and undergo somatic hypermutation of their immunoglobulin (Ig) receptor, forming a clone of diversified cells that can be related back to a common ancestor. Identification of B cell clonotypes from high-throughput Adaptive Immune Receptor Repertoire sequencing (AIRR-seq) data relies on computational analysis. Recently, we pr...

Molecular Identification of Rare Clinical Mycobacteria by Application of 16S-23S Spacer Region Sequencing

Objective(s) In addition to several molecular methods and in particular 16S rDNA analysis, the application of a more discriminatory genetic marker, i.e., 16S-23S internal transcribed spacer gene sequence has had a great impact on identification and classification of mycobacteria. In the current study we aimed to apply this sequencing power to conclusive identification of some Iranian clinical ...

Identification of Bifidobacterium Strains Isolated from Fecal Samples of Some Iranian Subjects Using 16SrRNA Gene Sequence Analysis and PCR-based Gene Specific Primers

For the first time in Iran 40 strains of Bifidobacterium were isolated from feces of Iranian subjects. By using phenotypic tests, 18 isolates were identified as Bifidobacterium longum, 10 as Bifidobacterium bifidum and one as Bifidobacterium catenulatum. In order to validate these results and also to identify other isolates that had not been identified by phenotypic tests, two methods of PCR wi...

Comprehensive Assessment of Potential Multiple Myeloma Immunoglobulin Heavy Chain V-D-J Intraclonal Variation Using Massively Parallel Pyrosequencing

Multiple myeloma (MM) is characterized by the accumulation of malignant plasma cells (PCs) in the bone marrow (BM). MM is viewed as a clonal disorder due to lack of verified intraclonal sequence diversity in the immunoglobulin heavy chain variable region gene (IGHV). However, this conclusion is based on analysis of a very limited number of IGHV subclones and the methodology employed did not per...

The bone marrow of multiple myeloma patients contains B cell populations at different stages of differentiation that are clonally related to the malignant plasma cell

One of the distinguishing features of multiple myeloma (MM) is the proliferation of a clonal plasma cell population in the bone marrow (BM). It is of particular interest that the tumor plasma cells appear to be restricted to the microenvironment of the BM and are rarely detected in the peripheral system, yet the disease is found widely disseminated throughout the axial skeleton. Furthermore, is...


Journal title:

Volume 6, Issue

Pages -

Publication date: 2010