Abstracts


Exploring antibody repertoires using deep language models

Presenter: Lea Brönnimann

Authors:
Lea Brönnimann; Chiara Rodella; Thomas Lemmin

University of Bern, Faculty of Medicine

Deep learning is now widely used in biomedicine to analyze large biological datasets. Among these datasets, collections of antibody sequences have grown significantly in recent years, with the aim of advancing therapeutic design. Antibodies are crucial to adaptive immunity, with diversity arising from heavy-light chain pairing and somatic hypermutation. To better understand the architecture of antibody repertoires, we trained deep learning models to investigate the sequence and pairing dynamics of human antibodies.

An in-depth analysis of the latent space embeddings learned by our language models revealed biologically meaningful clusters, including distinctions based on gene families and B-cell differentiation states. To explore the interdependence of heavy and light chains, we trained machine translation models to generate light chain sequences from heavy chain inputs. Although the generated sequences achieved a moderate average sequence identity with the true paired light chains, the sequence identity distribution showed distinct modes, potentially reflecting biological patterns of heavy-light chain compatibility.
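As an illustration of the evaluation described above, the following minimal sketch computes a per-pair sequence identity between a generated light chain and its true partner. The abstract does not state how identity was measured, so everything here is an assumption: the function name `global_identity`, the use of a simple Needleman-Wunsch global alignment with unit match, mismatch, and gap scores, and the toy sequences, which are hypothetical placeholders rather than real antibody data.

```python
def global_identity(a: str, b: str, match: int = 1,
                    mismatch: int = -1, gap: int = -1) -> float:
    """Fraction of identical columns in one optimal global alignment.

    Assumed scoring scheme (not from the abstract): linear gap penalty,
    unit match/mismatch scores.
    """
    n, m = len(a), len(b)
    if n == 0 and m == 0:
        return 1.0

    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back one optimal path, counting aligned columns and identities.
    i, j = n, m
    identical = columns = 0
    while i > 0 or j > 0:
        columns += 1
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                identical += a[i - 1] == b[j - 1]
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1  # gap in b
        else:
            j -= 1  # gap in a
    return identical / columns


# Hypothetical (generated, true) light chain fragments, for illustration only.
pairs = [("ACDE", "ACDE"), ("ACDE", "ACDF"), ("ACDE", "ACE")]
identities = [global_identity(generated, true) for generated, true in pairs]
```

Collecting `identities` over a full test set and plotting them as a histogram would expose the kind of multimodal distribution mentioned above, under whatever identity definition is actually used.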

These findings highlight the utility of deep learning models in antibody repertoire analysis, providing tools to study gene usage, maturation, and pairing. This approach offers a powerful framework for understanding antibody diversity and could inform the development of therapeutic antibodies.

 
