Protein Engineering / Protein Properties / Developability / Hot Spot Analysis / Antibody Modeling / Humanization / Molecular Surfaces
The course covers approaches for structure-based antibody design and includes protein-protein interaction analysis, in silico protein engineering, affinity modeling and antibody homology modeling. The interaction of a co-crystallized antibody-antigen complex will be studied by generating and examining molecular surfaces and visualizing protein-protein contacts in 3D. Antibody properties will be evaluated using specialized calculated protein property descriptors and by analyzing protein patches. The application of protein engineering tools for affinity and property optimization of antibodies in the context of developability will be studied. Antibody homology modeling optimization examples will include identification of glycosylation sites and their selective modification using a specialized MOE Project antibody database. All the steps necessary for a high-throughput antibody homology modeling workflow, from sequence to structure to property calculations for developability analysis, will be described.
Barbara Sander, Senior Applications Scientist, Chemical Computing Group
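As a rough illustration of the glycosylation-site identification mentioned in the course description, the sketch below scans a sequence for the canonical N-linked glycosylation sequon N-X-S/T, where X is any residue except proline. It is a sequence-only baseline written for this program text, not the MOE workflow taught in the course; the function name and the toy sequence are hypothetical.

```python
import re

# Canonical N-linked glycosylation sequon: N-X-S/T with X != P.
# The lookahead lets overlapping sequons (e.g. "NNTS") all be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_n_glyc_sequons(sequence):
    """Return (1-based position, tripeptide) for every N-X-S/T sequon."""
    seq = sequence.upper()
    return [(m.start() + 1, seq[m.start():m.start() + 3])
            for m in SEQUON.finditer(seq)]


if __name__ == "__main__":
    toy_sequence = "QVQLNGSAANPTWNNTS"  # hypothetical, not a real antibody
    for position, motif in find_n_glyc_sequons(toy_sequence):
        print(position, motif)
```

A sequon only marks a potential site; whether it is actually glycosylated depends on structural context such as surface exposure, which is what a structure-based analysis adds.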
Biologics: Protein Alignments, Modeling and Docking
Alignments and Superposition / Loop and Linker Modeling / Homology Modeling / Protein Docking / Solubility Analysis / 2D Hot Spot Mapping / Protein Ligand Interaction Fingerprints / QSAR Modeling
The course covers methods for aligning protein sequences, superposing structures, homology modeling of fusion proteins and conducting protein-protein docking. In particular, an approach for aligning and superposing multiple structures will be described for determining structural and surface protein variations in relation to protein property modulation. A method for grafting and refining antibody CDR loops, as well as a knowledge-based approach to scFv fusion protein modeling using the MOE linker application, will be described. An approach to generating homology models of a murine antigen structure from a human template, as well as protein-protein docking of an antibody to an antigen, will be discussed. A QSAR model for predicting and analyzing protein/biologics solubility will be described.
Purvi Gupta, Applications Scientist, Chemical Computing Group
Biophysical Cartography of the Native and Human-Engineered Antibody Landscapes Quantifies the Plasticity of Antibody Developability
Habib Bashour, Postdoctoral Researcher, Dept. of Immunology, University of Oslo
Designing effective monoclonal antibody (mAb) therapeutics faces a multi-parameter optimization challenge known as "developability", which reflects an antibody's ability to progress through development stages based on its physicochemical properties. While natural antibodies may provide valuable guidance for mAb selection, we lack a comprehensive understanding of natural developability parameter (DP) plasticity (redundancy, predictability, sensitivity) and how the DP landscapes of human-engineered and natural antibodies relate to one another. These gaps hinder fundamental developability profile cartography. To chart natural and engineered DP landscapes, we computed 40 sequence- and 46 structure-based DPs of over two million native and human-engineered single-chain antibody sequences. We found lower redundancy among structure-based DPs than among sequence-based DPs. Sequence DP sensitivity to single amino acid substitutions varied by antibody region and DP, and structure DP values varied across the conformational ensemble of antibody structures. Sequence DPs were more predictable than structure-based ones across different machine-learning tasks and embeddings, indicating a constrained sequence-based design space. Human-engineered antibodies were localized within the developability and sequence landscapes of natural antibodies, suggesting that human-engineered antibodies explore mere subspaces of the natural landscape. Our work quantifies the plasticity of antibody developability, providing a fundamental resource for multi-parameter therapeutic mAb design.
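One simple way to picture "redundancy" between developability parameters is pairwise correlation: two DPs that rise and fall together across many antibodies carry largely the same information. The sketch below assumes Pearson correlation and an arbitrary cut-off of 0.6; the abstract does not state which measure or threshold was used, and the DP names and values are invented toy data.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant column carries no correlation signal
    return cov / (sd_x * sd_y)


def redundant_pairs(dp_table, threshold=0.6):
    """Return (dp_a, dp_b, r) for DP pairs with |r| >= threshold.

    dp_table maps a DP name to its values over the same antibodies.
    The threshold is an assumption chosen for illustration.
    """
    pairs = []
    for a, b in combinations(sorted(dp_table), 2):
        r = pearson(dp_table[a], dp_table[b])
        if abs(r) >= threshold:
            pairs.append((a, b, round(r, 3)))
    return pairs


if __name__ == "__main__":
    toy = {  # hypothetical values for four antibodies
        "length": [110, 115, 120, 125],
        "mol_weight": [12.1, 12.65, 13.2, 13.75],
        "charge": [1, -2, 3, 0],
    }
    print(redundant_pairs(toy))
```

In the toy table, length and molecular weight are flagged as redundant while charge is not, which is the kind of relationship a redundancy analysis aims to expose at scale.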
Predicting Post-Translational Modifications through Structural In Silico Analysis
Michael Knight, Senior Scientist, UCB
Post-translational modifications (PTMs) such as deamidation, oxidation, isomerisation and glycation can potentially affect the efficacy and pharmacokinetics of biotherapeutics. As such, they must be considered in the critical quality attribute (CQA) risk assessment and be appropriately managed as part of a control strategy.
Our recent work has used state-of-the-art in silico molecular modelling, including physics-based molecular simulations and machine learning algorithms, to test previously published models for the prediction of aspartic acid isomerisation, asparagine deamidation and oxidation of methionine and tryptophan residues. These predictions have been validated against data from our historical forced degradation studies (FDS), with accuracies between 81% and 92%. In addition, we have developed our own random forest model, using features extracted from molecular dynamics simulations, for the prediction of lysine glycation; it achieves an accuracy of 97% on the training set and 79% on the independent test set. To our knowledge this is the first such model for predicting lysine glycation in antibodies.
Early prediction of PTM liabilities is highly desirable because it would inform multiple areas of research and development such as candidate selection, characterisation, process development, formulation and control strategy.
Wolfgang Große, Associate Director, Protein Design, CureVac SE
mRNA-based therapeutics have attracted rising interest in recent years, boosted tremendously by the licensing of coronavirus vaccines during the pandemic. In addition to the great success in the field of prophylactic vaccines, advances are being reported in cancer therapies as well as molecular therapies, the latter covering both protein replacement and passive immunization. Among protein-based therapies, monoclonal antibodies are a rapidly growing class with applications in many areas, including cancer treatment, allergy and passive immunization. Despite these undoubted successes, the accessibility of the technology still faces bottlenecks. Some of these can be addressed by transient production of the therapeutic in the host organism.
A classic full-length antibody is assembled from two heavy and two light chains to yield a disulphide-bridged, 150 kDa glycoprotein. These molecules, once produced, can be purified, characterized, mixed and stored. In the past decades a multitude of design approaches have been successfully applied in this field to increase yield and valency and to modify specificities. Several studies have shown that the antibody platform can indeed be combined with mRNA technology to produce the molecules directly in a host organism, with evidence ranging from rodent models to clinical trials. When these molecules are produced in situ within a host organism from mRNA, correct assembly needs to be assured, especially when raising the complexity beyond a single encoded antibody species. Different adaptations can and must be made to foster this process. On the other hand, some adaptations required for a classical recombinant production process can be omitted. The plethora of structural information on antibodies allows a software-supported, in-depth assessment and selection of structural features prior to more time-consuming testing, making it possible, for example, to select assembly promoters and fine-tune the molecules for efficient application directly in the desired host organism.
Computational Approaches for the Design and Development of Multispecific Therapeutics
Dilyana Dimova, Senior Data Scientist, Sanofi
Our novel, automated high-throughput engineering platform enables the fast generation and multiparametric screening of tens of thousands of multi-specific molecules. However, the search for variants with improved developability profiles in such huge genotype-phenotype data sets is a challenging task. Herein, we demonstrate how AI-based virtual screening workflows and in silico approaches can guide and support the selection and optimization of multi-specific biologics.
Antibody discovery is a complex problem because it requires searching a large design space for candidates that fulfil multiple desirable properties at once, such as efficacy, stability and manufacturability. Conventional methods often involve the sequential optimisation of different properties through rational design, but such methods can be time-consuming and make inefficient use of data. To address these ongoing challenges, we have developed an iterative, efficient framework consisting of four main stages: design, build, learn, test. We adaptively design antibodies by using ML models to extract as much information as possible from our already observed designs and subsequently use this knowledge to propose new designs to be tested next. Specifically, we achieve this using a class of methods known as multi-objective Bayesian optimisation which, at each iteration of our framework, targets a desired trade-off between exploitation of existing information and exploration of less understood areas of the design space. In the presented case study, our framework allowed us to search the design space more quickly and efficiently than conventional methods while improving the killing selectivity of a solid tumour-targeted T-cell engager to a level that is 400 times greater than the clinical benchmark.
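Multi-objective optimisation of the kind described above rests on the idea of a Pareto front: the set of candidates that no other candidate beats on every objective at once. The sketch below shows only that non-dominated filtering step, assuming all objectives are to be maximised; it does not implement the surrogate models or acquisition functions of Bayesian optimisation, and the candidate names and scores are hypothetical.

```python
def dominates(a, b):
    """True if score tuple a is at least as good as b on every objective
    and strictly better on at least one (all objectives maximised)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(candidates):
    """Return the sorted names of non-dominated candidates.

    candidates maps a design name to a tuple of objective values.
    """
    return sorted(
        name
        for name, score in candidates.items()
        if not any(dominates(other, score) for other in candidates.values())
    )


if __name__ == "__main__":
    designs = {  # hypothetical (selectivity, stability) scores
        "design_A": (0.9, 0.2),
        "design_B": (0.5, 0.5),
        "design_C": (0.4, 0.4),
        "design_D": (0.1, 0.9),
    }
    print(pareto_front(designs))
```

Here design_C drops out because design_B is better on both objectives, while the other three each represent a different trade-off that cannot be improved on one axis without losing on the other.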
Bio: Lida is a Senior Machine Learning Engineer at LabGenius, primarily working on active learning methods in drug discovery. Lida has built a closed-loop optimisation process for multi-specific antibody drug discovery based on multi-objective Bayesian optimisation. In her previous role, she worked as an Applied Scientist at Improbable, a metaverse company, and completed a 3-year Postdoctoral Fellowship at the MRC Biostatistics Unit at the University of Cambridge. Lida holds a PhD in Statistics from the University of Glasgow, specialising in optimal experimental design methods for the study of complex real-world phenomena.
Validation of Computational and Machine Learning Strategies for Biologics Design
Massimo Sammito, Associate Principal Scientist, AstraZeneca
Turning science into medicine with computational & ML-empowered tools for biologic molecule design. Andrew will give an overview of the languages of biotherapeutics & machine learning, discuss the impact of quality curated data, and finally, demonstrate two applications of machine learning in biomolecule design.
A Machine-learning Algorithm for the Selection of Excipients During Biopharmaceutical Formulation Development
Mark Teese, Product Owner, Bioinformatics, Leukocare AG
Biopharmaceutical formulation development has become increasingly challenging, e.g. due to new modalities and higher desired drug substance concentrations. The limited drug substance supply and the numerous analytical methods mean that only a small selection of excipients can be thoroughly tested with standard wet lab approaches. To reduce the bias and risk associated with excipient selection, we developed the Excipient Prediction Software (ExPreSo), a machine-learning algorithm that suggests protective inactive ingredients based on the properties of the drug substance and target product profile. A dataset was created of >350 formulations with proven long-term stability, including >200 unique peptide/protein drug substances. Supervised learning was conducted to obtain suggested excipients for each drug substance in the dataset. A leave-one-group-out cross-validation methodology was developed to prevent highly similar drug substances being split between train and test sets. A blind test set revealed minimal overfitting. ExPreSo had high predictive power for six excipients, and moderate predictive power for four others. To our knowledge, ExPreSo is the first machine-learning algorithm for the prediction of stabilizing biopharmaceutical excipients.
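The leave-one-group-out idea can be sketched in a few lines: every sample carries a group label, and each fold holds out one whole group so that near-identical drug substances never sit on both sides of the split. How ExPreSo defines its groups is not stated in the abstract; the labels below (for instance, clusters of similar sequences) are an assumption, and the function is illustrative rather than the authors' implementation.

```python
from collections import defaultdict


def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices) per fold.

    groups[i] is the group label of sample i. Every sample that shares
    the held-out label goes to the test set, never to training.
    """
    by_group = defaultdict(list)
    for index, label in enumerate(groups):
        by_group[label].append(index)
    for label in sorted(by_group):
        test = by_group[label]
        train = [i for i, g in enumerate(groups) if g != label]
        yield label, train, test


if __name__ == "__main__":
    # Hypothetical group labels for five formulations.
    labels = ["mAb_family_1", "mAb_family_1", "peptide_A",
              "mAb_family_2", "peptide_A"]
    for held_out, train_idx, test_idx in leave_one_group_out(labels):
        print(held_out, train_idx, test_idx)
```

Compared with a random split, this gives a more conservative estimate of performance on genuinely new drug substances, because the model is never tested on a close relative of something it was trained on.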
Dr. Mark Teese is the Product Owner of Bioinformatics at Leukocare AG, holding a PhD in Chemistry from the Australian National University. His prior roles include group leader in biochemistry and data science at the University of Münster and Technical University of Munich, as well as experience in software consultancy at TNG Technology GmbH. At Leukocare, he focuses on guiding the development of AI and data science tools to support the optimization of biopharmaceutical development by the company's R&D scientists.
Predicting Antibody Developability Using Interpretable Machine Learning
Peter M. Tessier, Albert M. Mattocks Professor, Depts of Pharmaceutical Sciences, Chemical Engineering and Biomedical Engineering, University of Michigan
The development, delivery, and efficacy of therapeutic antibodies are strongly influenced by three types of molecular interactions mediated by their variable regions, namely affinity, off-target, and self-interactions. Here we report interpretable machine-learning models for identifying high-affinity mAbs with optimal combinations of low off-target binding and low self-association, and demonstrate that these co-optimal antibodies display drug-like in vitro (formulation) and in vivo (pharmacokinetic) properties.