BAGIM:
Boston Area Group for Informatics and Modeling

Past Presentations


Statistical reasoning with Ligands: Is it possible?
Terry Stouch, Science for Solutions, LLC.
May 19, 2010 AstraZeneca, Waltham, MA
The practices of QSAR and Cheminformatics are based on several fundamentals: numerical descriptions (descriptors) of molecules; our approach/method of analyses; assumptions about the relatedness and distribution of descriptors, molecular structure, and biological activity; and our interpretation of the 'raw' data such as properties and activities. Often statistical approaches are taken with the goal/hope of applying well-established and well-validated statistical inference and validation. The current talk will pose questions for discussion about whether or not statistical approaches are relevant to molecular description, properties, and activities. This discussion will go hand in hand with example-driven consideration of the interpretation and impact of the different types of error in data.

Ligand Pocket Detection in Biological Assemblies
Chris Williams, Howard Feldman, Chemical Computing Group Inc.
November 18, 2009 Novartis Institutes for Biomedical Research, Cambridge
Protein cavity similarity can be used to discover similar ligand binding motifs in proteins that have low sequence and structural similarity. This can be useful for predicting side effects and drug toxicity. While numerous protein cavity databases exist, the majority are based on the contents of the asymmetric unit only, and often only include sites where crystal structures with bound ligands exist. In an attempt to improve on this, we have developed a cavity database using the Biological Assembly information available in v3.2 PDB files, and independent of whether bound ligands are present or not. A simple Alpha Shapes approach is used to detect pockets in the biological assembly of sufficient size and hydrophobicity to potentially bind ligands. An analysis was run over the entire PDB to determine reasonable size cutoffs to use. Employing fingerprinting techniques, we can perform similarity searches using an initial pocket as the query. SCOP domain information is stored along with each cavity to aid in detecting hits with minimal structural similarity to the query.

Improving accuracy in molecular docking and virtual screening experiments
Oleg Stroganov and Val Kulkov, BioMolTech Corp.
October 21, 2009 AstraZeneca R&D Boston
Molecular docking emerged over two decades ago as a promising technique for modeling protein-ligand interactions and predicting ligand bioactivity. So far, however, it has fallen short of the expectations of practical researchers. The current state of theory behind energy representation in protein-ligand interactions and the computational challenges of global optimization make the development of an accurate and reliable molecular docking tool an extremely difficult task, since small inaccuracies in the energy representation or deficiencies of the global search and optimization procedure can often have a dramatic impact on the final result of a molecular docking experiment.

In our attempt to develop a more accurate and reliable tool for molecular docking, we tried to address the known deficiencies in scoring functions and the global optimization techniques. We have shown that small details can sometimes make a significant difference. The exploration of novel features of genetic algorithms, the application of various multilevel local optimization procedures and the optimization of computational expenses resulted in a notable improvement in docking accuracy compared to the currently available methods. The inclusion of original energy terms in the Lead Finder scoring functions and the adjustment of individual sets of scaling coefficients for pose ranking during docking, binding energy estimations and virtual screening also increased the accuracy of docking predictions.

In virtual screening experiments, we have been able to repair some of the deficiencies of the scoring functions by applying the structural filtration technique. The structural filter is defined by a protein-specific set of interactions that a) are structurally conserved in available structures of a particular protein with its bound ligands, and b) can be viewed as playing a crucial role in protein-ligand binding. The application of structural filtration resulted in a considerable improvement of the enrichment factor, ranging from several-fold to several hundred-fold depending on the protein target. It appeared that the structural filtration had effectively decreased the scoring function’s false positive rate. The ability of structural filtration to recover relatively small but specifically bound molecules shows promise for the application of this technology in fragment-based drug discovery.
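
For readers unfamiliar with the metric, the enrichment factor mentioned above can be computed roughly as in the following minimal Python sketch. The function name, the convention that lower scores are better, and the modeling of the structural filter as a pass/fail flag that pushes failing compounds to the bottom of the ranking are all illustrative assumptions; none of this is taken from Lead Finder itself.

```python
def enrichment_factor(scores, is_active, top_fraction=0.01, passes_filter=None):
    """Enrichment factor of a ranked screen.

    EF = (fraction of actives in the top slice) / (fraction of actives overall).
    Assumes lower scores are better. If passes_filter is given, compounds that
    fail the (hypothetical) structural filter are ranked after all that pass.
    """
    n_total = len(scores)
    if n_total == 0 or len(is_active) != n_total:
        raise ValueError("scores and is_active must be non-empty and equal in length")
    if passes_filter is None:
        passes_filter = [True] * n_total
    if len(passes_filter) != n_total:
        raise ValueError("passes_filter must have one entry per compound")

    n_actives = sum(bool(a) for a in is_active)
    if n_actives == 0:
        raise ValueError("at least one known active is required")

    # Size of the top slice that would be selected for follow-up.
    n_top = max(1, round(n_total * top_fraction))

    # Rank: compounds passing the filter first, then by ascending score.
    order = sorted(range(n_total), key=lambda i: (not passes_filter[i], scores[i]))
    hits_in_top = sum(bool(is_active[i]) for i in order[:n_top])

    return (hits_in_top / n_top) / (n_actives / n_total)
```

An enrichment factor of 1 corresponds to random selection. Comparing the value with and without `passes_filter` on the same ranked list shows how much of the improvement comes from removing high-scoring false positives.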

Antibacterial agents: current challenges and future directions
Joseph Iaconis and Richard Alm, Infection Discovery, AstraZeneca R&D Boston
September 23, 2009 AstraZeneca R&D Boston
Antimicrobial resistance of clinically significant gram-positive and gram-negative bacteria is increasing at an alarming rate worldwide. Bacterial expression of altered target sites, novel beta-lactamases, permeability problems, and efflux have compromised many of the potent antimicrobials in common practice today. Unfortunately, this has translated to greater morbidity and mortality despite longer hospitalization care, increased use of infection control procedures and more expensive antimicrobials.

A major concern over the last 5 years has been the increasing prevalence and rapid dissemination of methicillin-resistant Staphylococcus aureus (MRSA) in the hospital and community setting, respectively. These bacteria frequently colonize the skin and nasal passages of healthy people but pose no threat unless the host integrity or immune system is compromised. While newer agents have been developed to treat MRSA infections to mitigate this public health crisis, there is currently a lack of effective agents to treat patients infected with multi-drug resistant gram-negative pathogens, e.g., Pseudomonas aeruginosa, Acinetobacter baumannii, and Klebsiella pneumoniae. Antimicrobial treatment of these latter organisms often requires combination therapy, including polymyxin, a nephrotoxic agent.

The development of novel broad-spectrum agents for gram-negative bacteria is difficult since successful agents must cross the outer and cytoplasmic membranes to reach the target site and bypass the many different efflux systems, which can vary among gram-negative species. Discovery strategies to deal with these multi-faceted issues may include development of monotherapeutic agents for specific organisms coupled with molecular diagnostics that can rapidly identify pathogens in clinical specimens.

(Not so) Novel compounds by virtual screening: In search of the mythical scaffold hop.
Richard W. Dixon, Vertex Pharmaceuticals
June 24, 2009 Vertex Pharmaceuticals
We investigate the claim that ligand-based similarity is a suitable tool to identify "scaffold hops" contained in a screening collection. Investigation into the structure dependence of 3D superposition methods demonstrates that screening campaign success is independent of the number of conformers considered in the superposition. Expansion of the investigation to include 2D similarity methods demonstrates that a wide variety of approaches are essentially equivalent with respect to overall virtual screening impact. Analysis of the few methods displaying lesser performance in these tests leads to the conclusion that some representation of the chemical environment must be encoded in the features comprising the similarity representation to achieve optimal performance. Failure to do so, either through deliberate modulation of 3D structure or choice of environment-independent representations such as pharmacophore types or molecular shape, results in reduced hit set enrichment rates. Data fusion strategies involving multiple active reference compounds and iterative selection are found to enhance enrichment rates. Fusion of similarity methods, by contrast, is found to be only minimally effective in this effort. Taken together, these results suggest that no ligand-based similarity method considered here is independent of molecular scaffold and in fact all successful methods recapitulate the molecular scaffold in a consistent manner. Scaffold hopping between markedly different molecular topologies within a decoy population is therefore unlikely with current technology.
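
As an illustration of the kind of 2D similarity search and multi-reference data fusion discussed above, here is a minimal Python sketch. The abstract does not say which similarity coefficient or fusion rule was used; the Tanimoto coefficient on sets of "on" fingerprint bits and the MAX fusion rule shown here are common choices and are assumptions, as are the function names.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def max_fusion_rank(references, library):
    """Rank library compounds by their best similarity to any reference.

    references: list of fingerprints (sets of on-bits) of known actives.
    library:    dict mapping compound name -> fingerprint.
    Returns a list of (name, fused_score) sorted from most to least similar.
    """
    if not references:
        raise ValueError("at least one reference compound is required")
    fused = {
        name: max(tanimoto(fp, ref) for ref in references)
        for name, fp in library.items()
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)
```

Because the score depends only on shared fingerprint bits, compounds whose substructure features match a reference rank highest. That is consistent with the abstract's conclusion that such methods tend to recapitulate the reference scaffold.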

Methodologies for Efficient Knowledge-Based Antibody Homology Modeling
Johannes Maier, Chemical Computing Group
March 24, 2009 Millennium Pharmaceuticals
Antibodies are globular proteins composed of two heterodimers, each containing a heavy chain (VH) and a light chain (VL). In most antibodies, binding to an antigen is facilitated by six loops: three originating from the VL domain, termed L1, L2 and L3, and three from the VH domain, termed H1, H2 and H3. Due to their modular composition and high target specificity, antibodies have become increasingly attractive for use as drugs. Antibody homology modeling techniques have often been applied in generating therapeutically more effective antibodies. Here, we demonstrate a collection of procedures as well as an interface to meet the demands of effective antibody homology modeling. The application has flexible components allowing the integration of various workflows associated with this specific form of modeling. The routines account for the particular structural composition of antibodies when searching for template candidates and building models. A knowledge-based approach is applied with an underlying database of antibody structures originating from the Protein Data Bank (PDB), clustered by class, species, subclass and framework sequence identity. A specially designed loop grafting routine allows for generation of xenogeneic antibody models.

The Integration of Biological Data Using Semantic Web Technologies
Susie Stephens* and Eric Prud'hommeaux**, * Eli Lilly, ** W3C and MIT
November 19, 2008 Millennium Pharmaceuticals

Good BREEDing, techniques for generating hybrid molecules
Joe Leonard, Chemical Computing Group
September 23, 2008 Vertex Pharmaceuticals
This work presents a method for generating novel structures from aligned molecules, using the BREED methodology of Pierce et al (J Med Chem, 2004). It automates the practice of joining fragments of two known structures to create new compounds, using tethered optimization to preserve the position and orientation of each fragment. To increase the structural diversity, the resulting molecules can be bred again to further exchange fragments.

BREED selects all proximate bonds when creating new molecules, which can lead to a combinatorial explosion when working with a large number of initial structures. This explosion can be reduced by restricting bond selection via retrosynthetic schemes such as Lewell et al's RECAP (J Chem Inf Comput Sci, 1998). This scheme labels bonds that are considered synthetically accessible using SMARTS patterns and unlabeled bonds are ignored, reducing the number of uninteresting structures. These patterns can be easily modified or extended to reflect the project's chemistry.

Finally, if protein/ligand cocrystal data is present, pseudo-docking is accomplished by relaxing the bred structures in the receptor site and scoring them with the MM/GBVI scoring function. Because the initial structures are aligned, the pose selection and placement steps are unnecessary.

From pathophysiology to mechanism with Pfizer Systems Biology
David de Graaf, Director, Systems Biology Center for Bioinformatics, Pfizer Research Technology Center
April 30, 2008 Research Technology Center, Pfizer
The Pfizer Systems Biology team has used a combination of experimental and computational approaches to understand the mechanistic molecular basis of pathophysiological processes such as hepatotoxicity and TNF-α production in the context of RA. I will use a number of examples to demonstrate how one can get from pathophysiology to a molecular entity which modulates the relevant phenotype.

To Link or not to Link: That is the Question! Computational Tools for Fragment-Based Lead Discovery and Optimization
Matthias Rarey, University of Hamburg, Center for Bioinformatics, Hamburg, Germany
March 12, 2008 AstraZeneca Pharmaceuticals
Today, structure-based design techniques like molecular docking and scoring are well-established computational approaches in the early phases of pharmaceutical research. Although de novo design methods were developed roughly 15 years ago, most methods applied today focus on the analysis of individual molecules. More recently, fragment-based approaches have gained much popularity as the method of choice for finding new drugs. In this presentation, we will address the question of how computational methods can support this fragment-oriented approach.

The foundation of all methods is a formal description of chemical fragment spaces, which allows virtual synthesis to be modeled. The fragment space represents the underlying search space for FlexNovo, a structure-based design tool, and Recore for redesigning core fragments. Due to the complexity of fragment spaces, proper creation, visualization and navigation play an important role. Some new tools for these tasks together with sample spaces (generic and focussed) as well as test cases will be presented.

False Negatives and False Positives in High Throughput Screening data
Vlado Dancik, Primera Biosystems
November 14, 2007 Millennium Pharmaceuticals
Traditionally, a single cutoff threshold is applied to high throughput screening (HTS) results to decide which compounds should be followed up. Subsequent retests will reveal how many of the hits can be confirmed and how many are false positives. Unfortunately, there may also be false negatives that are never discovered and represent a lost opportunity. We have developed an informatics solution to deal with the false negative/positive compounds which takes into account many different features of the HTS. We can use mixtures of normal and uniform distributions to identify appropriate cutoff thresholds. Based on the historical screening data we define selective and cross-reactive compounds. Based on the structure we predict which compounds are more likely to be novel or appealing to medicinal chemists. We also take into account positions of compounds on plates and compensate for the variability of HTS results among wells. The results of these computational applications on a number of diverse HTS programs at Millennium will be discussed.
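
The idea of deriving a cutoff from a mixture of normal and uniform distributions can be sketched as follows. This is only one plausible reading of the abstract: it assumes a single normal component for the inactive bulk, a uniform component spanning the observed range for everything else, a fit by expectation-maximization (EM), and a cutoff placed where the two weighted densities are equal. The function names and these modeling choices are assumptions, not the speaker's implementation.

```python
import math


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_normal_uniform(values, n_iter=200):
    """Fit weight * Normal(mu, sigma) + (1 - weight) * Uniform(min, max) by EM."""
    lo, hi = min(values), max(values)
    if hi <= lo:
        raise ValueError("values must not all be identical")
    u_density = 1.0 / (hi - lo)          # uniform density over the observed range
    n = len(values)
    mu = sum(values) / n                 # crude starting values
    sigma = math.sqrt(sum((x - mu) ** 2 for x in values) / n)
    weight = 0.9
    for _ in range(n_iter):
        # E-step: probability that each value belongs to the normal component.
        resp = []
        for x in values:
            p_norm = weight * normal_pdf(x, mu, sigma)
            p_unif = (1.0 - weight) * u_density
            resp.append(p_norm / (p_norm + p_unif))
        total = sum(resp)
        if total == 0.0:
            break
        # M-step: responsibility-weighted mean, spread, and mixing weight.
        mu = sum(r * x for r, x in zip(resp, values)) / total
        var = sum(r * (x - mu) ** 2 for r, x in zip(resp, values)) / total
        sigma = max(math.sqrt(var), 1e-6)
        weight = min(max(total / n, 1e-6), 1.0 - 1e-6)
    return mu, sigma, weight, u_density


def prob_outlier(x, mu, sigma, weight, u_density):
    """Posterior probability that x came from the uniform (non-bulk) component."""
    p_norm = weight * normal_pdf(x, mu, sigma)
    p_unif = (1.0 - weight) * u_density
    return p_unif / (p_norm + p_unif)


def hit_cutoffs(mu, sigma, weight, u_density):
    """Lower and upper thresholds where the two weighted densities are equal."""
    ratio = (1.0 - weight) * u_density * sigma * math.sqrt(2.0 * math.pi) / weight
    if ratio >= 1.0:
        return mu, mu                    # uniform component dominates everywhere
    z = math.sqrt(-2.0 * math.log(ratio))
    return mu - z * sigma, mu + z * sigma
```

Values outside the returned thresholds are more likely to belong to the uniform component than to the inactive bulk. Using `prob_outlier` directly, instead of a hard cutoff, gives a graded score that can help flag borderline compounds that a single threshold would miss.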

Topomer technologies
Richard Cramer, Tripos, Inc.
September 26, 2007 AstraZeneca Pharmaceuticals

A different way of thinking about HERG blockade
Robert Pearlstein, William Egan, Qi-Ying Hu, and Chris Harwell, Novartis Institutes for BioMedical Research
June 19, 2007 Millennium Pharmaceuticals
The hERG potassium channel is a key cardiac ion channel that terminates the plateau phase of ventricular repolarization. The inactivated form of hERG promiscuously binds small organic compounds in the ion conduction pathway, leading to adverse effects. In the absence of co-crystal structures, overall binding mode(s), bound conformation(s), and interactions have yet to be determined, and the basis for promiscuity explained. We created a homology model of the homo-tetrameric pore domain of hERG, and docked a set of known blockers. Our calculations suggest the symmetry of the pore domain, and multiplicity of key side chains identified from mutagenesis, are responsible for the propensity of the cavity to bind compounds containing aromatic and basic groups. Predicted binding geometries and interactions will be described for representative blockers, together with general insights learned from analysis of our homology model.

ZRANK: Re-ranking protein docking predictions with an optimized energy function
Zhiping Weng, Boston University
April 17, 2007 AstraZeneca Pharmaceuticals

An Integrated Approach to Library Design
Patrick Walters, Vertex Pharmaceuticals
May 30, 2006 AstraZeneca Pharmaceuticals
A chemist designing a combinatorial library must consider many criteria when selecting reagents for synthesis. Factors such as target potency, physical properties, metabolic stability, and off-target activity are among many parameters that must be optimized. Although computational models exist to aid the chemist, these models are often poorly validated and are not easily integrated into the drug discovery process. As part of a continuing effort to provide library design tools for medicinal chemists, we have created a software tool known as Reaction Planner. This software provides an easy means of linking a virtual combinatorial library with a well-validated set of computational models. The application of these models can dramatically reduce the size of a virtual library, and help to focus a chemistry effort on the most relevant compounds. Models in Reaction Planner are constructed using NOMAD, an internally developed software platform that allows computational chemists to identify optimal combinations of molecular descriptors and machine learning methods. Models generated using NOMAD can then be published to Reaction Planner where they become part of the medicinal chemistry workflow. This presentation will provide an overview of NOMAD and Reaction Planner, as well as example applications of both programs.

Hi Fidelity Chemistry: Addressing the Concepts of 'Structure' and 'Compound'
Brian Masek, Tripos
March 15, 2006 ArQule, Inc.

Structure-Based Drug Discovery
Greg Petsko, Brandeis University
January 26, 2006 ArQule, Inc.