Cheminformatics and QSAR
MOE provides a suite of applications for manipulating and analyzing large collections of molecules. The built-in database engine stores molecular data (small molecules, proteins, antibodies, alignments, docking results, etc.) as well as numeric and character data for compound library design and analysis. Data can be imported into a MOE database from external sources and used in clustering, diversity selection and QSAR modeling. The seamless integration between the Database Viewer and MOE modeling applications facilitates chemometrics for QSAR model development.
MOE's spreadsheet-style Database Viewer allows intuitive manipulation of molecular data and properties. Browse through a large collection of molecules with the Database Browser. Transfer molecules seamlessly from the Database Viewer into the MOE 3D visualization environment for further analysis. Analyze correlation plots and matrices generated from molecular descriptors. The database applications are available as command line processing tools (sdtools) and can be incorporated into KNIME as well as Pipeline Pilot workflows.
Molecules can be imported from SD, SMILES, and other common file types as well as SQL databases. A unified small-molecule tautomer and titration enumerator is included to prepare input structures for calculations, along with high-performance tools for conformational sampling. These applications can be accessed from the graphical interface or from the pipeline command line tools. Operate directly on SD files for structure depiction, acid/base titration and tautomer enumeration, database filtering, sorting and descriptor calculations. Remove records that do not satisfy a series of filters (e.g., lead-like, reactive groups, drug-like), sort records and remove duplicate entries from SD files. Calculate descriptors and write the output to SD or ASCII formats.
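The filter, sort and de-duplicate steps described above can be sketched in plain Python. This is only an illustration of the idea, not the MOE pipeline tools or their command line: the function names, the `ID` and `MW` field names and the 500 cutoff are assumptions made for the example, and duplicates are detected by a data field rather than by comparing structures.

```python
from typing import Dict, Iterator, List


def read_sd_records(text: str) -> Iterator[str]:
    """Yield each record of an SD file; a record ends with a '$$$$' line."""
    block: List[str] = []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            yield "\n".join(block)
            block = []
        else:
            block.append(line)


def sd_fields(record: str) -> Dict[str, str]:
    """Collect the '> <NAME>' data items of one record into a dict."""
    fields: Dict[str, str] = {}
    lines = record.splitlines()
    i = 0
    while i < len(lines):
        line = lines[i]
        if line.startswith(">") and "<" in line and ">" in line[1:]:
            name = line[line.index("<") + 1 : line.rindex(">")]
            values = []
            i += 1
            # The value runs until the next blank line.
            while i < len(lines) and lines[i].strip():
                values.append(lines[i])
                i += 1
            fields[name] = "\n".join(values)
        i += 1
    return fields


def filter_sort_dedupe(text, key="ID", prop="MW", max_value=500.0):
    """Keep records with prop <= max_value, drop repeated keys, sort by prop."""
    seen = set()
    kept = []
    for record in read_sd_records(text):
        fields = sd_fields(record)
        try:
            value = float(fields[prop])
        except (KeyError, ValueError):
            continue  # property missing or not numeric: record fails the filter
        ident = fields.get(key)
        if ident is None or ident in seen or value > max_value:
            continue
        seen.add(ident)
        kept.append((value, record))
    kept.sort(key=lambda pair: pair[0])
    return [record for _, record in kept]
```

Writing the surviving records back out is a matter of joining them with `$$$$` lines again.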
Enumerate reaction-based or R-group-based virtual libraries in either 2D or 3D. Filter the virtual compounds by properties or pharmacophores. Build combinatorial libraries by combining scaffold and R-group databases or by using MOE's reaction-based library generation method. Reactions can be chosen from a list or sketched in standard sketchers and applied to enumerate libraries of compounds. The default reagent database contains over 3,000 reagents curated from commercial vendors; however, custom reagent databases can be specified.
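As a rough illustration of R-group-based enumeration, the sketch below combines one scaffold with every combination of R-group fragments. It is a naive text substitution on SMILES strings, not MOE's enumerator: the `[R1]`/`[R2]` placeholder syntax and the function name are assumptions for this example, each fragment must be written with its attachment atom first, and placeholders must sit in a branch or at the end of the scaffold string so that the fragment bonds to the preceding atom.

```python
from itertools import product
from typing import Dict, Iterator, List


def enumerate_library(scaffold: str, rgroups: Dict[str, List[str]]) -> Iterator[str]:
    """Yield one product string per combination of R-group fragments."""
    labels = sorted(rgroups)
    for combo in product(*(rgroups[label] for label in labels)):
        smiles = scaffold
        for label, fragment in zip(labels, combo):
            # Swap the placeholder, e.g. "[R1]", for the fragment text.
            smiles = smiles.replace(f"[{label}]", fragment)
        yield smiles


scaffold = "c1ccc([R1])cc1[R2]"
rgroups = {"R1": ["C", "OC"], "R2": ["N", "Cl", "C(=O)O"]}
library = list(enumerate_library(scaffold, rgroups))
print(len(library))  # 2 fragments x 3 fragments = 6 products
```

The library size is the product of the R-group counts at each position, which is why property or pharmacophore filtering is usually applied before or during enumeration rather than afterwards.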
Enumerated libraries can be built within the receptor active site and ranked by binding affinity scores and pharmacophore, QSAR and similarity models. Symmetric substitution, peptide substitution, bidentate connections and ring creation are supported. A focused library [Labute 2002] can be generated by assigning an activity score to each R-group. The activity is estimated by statistical sampling of the chemical space spanned by all possible R-group substitutions.
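The idea of scoring each R-group by sampling the combinatorial space can be sketched as a simple Monte Carlo average. This is an illustrative stand-in, not the method of [Labute 2002]: the function name, the sample count and the `score` callback (which would be a QSAR or other activity model in practice) are all assumptions.

```python
import random
from statistics import mean
from typing import Callable, Dict, List, Tuple


def rgroup_scores(
    rgroups: Dict[str, List[str]],
    score: Callable[[Dict[str, str]], float],
    samples: int = 200,
    seed: int = 0,
) -> Dict[Tuple[str, str], float]:
    """Estimate a score per R-group as the mean model score of random
    products that contain it."""
    rng = random.Random(seed)
    positions = sorted(rgroups)
    result = {}
    for pos in positions:
        for fragment in rgroups[pos]:
            values = []
            for _ in range(samples):
                # Fix this fragment; draw the other positions at random.
                combo = {
                    p: fragment if p == pos else rng.choice(rgroups[p])
                    for p in positions
                }
                values.append(score(combo))
            result[(pos, fragment)] = mean(values)
    return result


# Hypothetical additive activity model, for demonstration only.
contribution = {"a": 1.0, "b": 0.0, "x": 0.1, "y": 0.0}
toy_score = lambda combo: sum(contribution[f] for f in combo.values())
scores = rgroup_scores({"R1": ["a", "b"], "R2": ["x", "y"]}, toy_score)
```

R-groups with the highest estimated scores would then be kept to build the focused library, so the full product space never has to be enumerated.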
Calculate over four hundred 2D and 3D molecular descriptors including topological indices, structural keys, E-state indices, physical properties, topological polar surface area (TPSA) and CCG's VSA descriptors [Labute 2003] with wide applicability to both biological activity and ADME property prediction. Apply Extended Hückel-based descriptors, such as LogP, LogD, and molar refractivity, for computing molecular properties. Calculate pKa and pKb of small molecules and determine the populations of ligand protonation states at a given pH. Use descriptors for classification, clustering, filtering and predictive model construction. Add custom descriptors using MOE's built-in Scientific Vector Language.
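To make the protonation-state populations concrete, the sketch below applies the Henderson-Hasselbalch relationship to a list of macroscopic pKa values. It does not predict pKa and is not MOE's model: it assumes the pKa values are already known, ignores microstates and tautomers, and the function name is hypothetical.

```python
from typing import List, Sequence


def state_populations(pkas: Sequence[float], ph: float) -> List[float]:
    """Fractions of the species that have lost 0, 1, 2, ... protons at a given pH.

    Each successive deprotonation changes the relative weight by a factor
    of 10**(pH - pKa), which is the Henderson-Hasselbalch ratio [A-]/[HA].
    """
    weights = [1.0]  # fully protonated form as the reference
    for pka in sorted(pkas):
        weights.append(weights[-1] * 10 ** (ph - pka))
    total = sum(weights)
    return [w / total for w in weights]


# A single acidic group with pKa 4.76 (acetic acid) at pH 7.4:
protonated, deprotonated = state_populations([4.76], 7.4)
print(f"{deprotonated:.3f}")  # more than 99% deprotonated
```

When the pH equals the pKa, the two forms are equally populated, which is a quick sanity check on any such calculation.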
Build QSAR/QSPR models using PLS, PCR, probabilistic and decision-tree methodologies. CCG's unique Binary QSAR methodology [Labute 1999] is ideal for building pass/fail models from data with high error content. Linear models, built with PCR or PLS, can support biological activity or ADME clustering assessments.
Perform similarity searching and diverse subset selection using Descriptor, Conformation and Molecular Fingerprint methodologies. Choose between a number of fingerprint systems including 2-, 3- and 4-point pharmacophore fingerprints in 2D/3D, MACCS keys and EigenSpectrum shape fingerprints.
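Fingerprint similarity and diverse subset selection can be illustrated with the Tanimoto coefficient and a greedy MaxMin pick. Both are standard techniques chosen here for the example; the source does not say which metrics or selection algorithms MOE uses. Fingerprints are modeled as Python sets of "on" bit indices, and the function names are assumptions.

```python
from typing import List, Sequence, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Shared bits divided by total distinct bits of two fingerprints."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def maxmin_subset(fps: Sequence[Set[int]], k: int, start: int = 0) -> List[int]:
    """Greedy MaxMin: repeatedly add the item farthest from the chosen set."""
    chosen = [start]
    # Distance from every item to its nearest already-chosen item.
    min_dist = [1.0 - tanimoto(fp, fps[start]) for fp in fps]
    while len(chosen) < min(k, len(fps)):
        candidates = [i for i in range(len(fps)) if i not in chosen]
        nxt = max(candidates, key=lambda i: min_dist[i])
        chosen.append(nxt)
        for i, fp in enumerate(fps):
            min_dist[i] = min(min_dist[i], 1.0 - tanimoto(fp, fps[nxt]))
    return chosen


fps = [{1, 2, 3}, {1, 2, 4}, {7, 8, 9}, {1, 2, 3, 4}]
print(tanimoto(fps[0], fps[1]))  # 2 shared bits / 4 distinct bits = 0.5
print(maxmin_subset(fps, 3))     # [0, 2, 1]
```

Similarity searching uses the same coefficient in the other direction: rank the database by `tanimoto(query, fp)` and keep the entries above a chosen threshold.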
SAReport is a web-based tool for performing SAR analysis and visualization of project data [Clark 2008]. Analyze multiple-scaffold project ligands in 2D by detecting scaffolds, consistently numbering R-groups across scaffolds and depicting all compounds in a consistently oriented manner, aligned by scaffold. The output is a portable and interactive HTML document that displays R-group tables, heat maps and property graphs, providing insight into key scaffold/R-group combinations.
[Clark 2006] Clark, A., Labute, P., Santavy, M.; 2D Structure Depiction; J. Chem. Inf. Model. 46 (2006) 1107-1123
[Clark 2008] Clark, A.M., Labute, P.; Detection and Assignment of Common Scaffolds in Project Databases of Lead Molecules; J. Med. Chem. 52 (2008) 469-483.
[Clark 2008] Clark, A.M., Labute, P.; SD File Processing with MOE Pipeline Tools; JCCG (2008)
[Labute 1999] Labute, P.; Binary QSAR: A New Method for Quantitative Structure Activity Relationships; Proceedings of the 1999 Pacific Symposium; Altman et al. eds. (1999) World Scientific
[Labute 2000] Labute, P.; A Widely Applicable Set of Descriptors; J. Mol. Graph. Mod. 18 (2000) 464-477
[Labute 2002] Labute, P., Nilar, S., Williams, C.; A Probabilistic Approach to High Throughput Drug Discovery; Combinatorial Chemistry & High Throughput Screening 5 (2002) 135-145
[Labute 2003] Labute, P.; The Derivation and Applications of Molecular Descriptors Based Upon (Approximate) Surface Area; Chemoinformatics: Concepts, Methods, and Tools for Drug Discovery; J. Bajorath ed. (2003)