Frequently Asked Questions

Leadscope Model Applier and the ICH M7 Impurities Guidelines

Why do I need to run an expert alert system or a QSAR statistical model for impurities or degradants?
The ICH M7 draft consensus guideline titled "Assessment and control of DNA reactive (mutagenic) impurities in pharmaceuticals to limit potential carcinogenic risk" states that (Q)SAR methodologies can be used to predict the outcome of a bacterial mutagenicity assay to support hazard assessment. In certain cases this can avoid having to test impurities or degradants for bacterial mutation.
The ICH M7 guidelines state: Two (Q)SAR prediction methodologies that complement each other should be applied. One methodology should be expert rule-based and the second methodology should be statistical-based. What methodology does Leadscope use?
The Leadscope Model Applier contains both statistical-based QSAR models and expert rule-based alerts, satisfying the requirement for two complementary methodologies.
Which software applications does the FDA use in a regulatory capacity to support M7 submissions?
The U.S. Food and Drug Administration (FDA) Center for Drug Evaluation and Research accepts M7 submissions containing QSAR results. The agency also runs its own internally validated and accepted QSAR genotoxicity models on all impurities that are submitted. Currently, the supported applications include the Leadscope Model Applier, Lhasa Derek Nexus, and MultiCASE CASE Ultra. All predictions are performed using the latest versions of the software and models. Leadscope is committed to making the most current versions of the QSAR models commercially available to customers as soon as the FDA begins using the models for regulatory purposes.
In the Leadscope Model Applier, which expert alerts should be used to support the ICH M7 guidelines, i.e. to "predict the outcome of a bacterial mutagenicity assay"?
The Bacterial Mutation expert alerts (from the Genetox expert alerts suite) should be run to support ICH M7.
Within the Leadscope Model Applier, the Leadscope Genetic Toxicity Statistical QSAR Suite contains around 30 different models. Which ones do I need to run to support the ICH M7 guidelines, i.e. to "predict the outcome of a bacterial mutagenicity assay"?
Only the prediction results from the two Gene Mutation-Microbial In Vitro models, Salmonella Mut and E Coli Sal 102 A-T Mut, should be used to support the ICH M7 guidelines.
I understand the Leadscope Genetic Toxicity QSAR suite contains models other than these models. Is it possible to run a QSAR analysis and not generate results for other models?
Yes. You have the option to select individual predictive models within a Model Suite to apply to your test set. For instance, in the Leadscope Genetic Toxicity QSAR suite you can limit the models to be applied to only those models selected.
How should a positive result be interpreted in another model from the Leadscope Genetic Toxicity QSAR suite?
Only the prediction results from the two Gene Mutation-Microbial In Vitro models, Salmonella Mut and E Coli Sal 102 A-T Mut, should be used to support the ICH M7 guidelines.
Can you explain how the Leadscope Genetox Expert Alerts methodology works?
Leadscope worked with expert genetic toxicologists in the development of this alerts system. The Leadscope Genetox Expert Alerts are based on well-defined mutagenicity structural alerts from the literature. These alerts have been validated against a large database of over 7,000 chemicals with Ames data (referred to as the reference set). Alongside this list of alerts, deactivating factors as well as active subclasses (which represent possible cohorts of concern) are encoded. A positive prediction is made where one or more alerts are present with no defined deactivating factor. In addition, the software determines whether the test compound is similar enough to known classes of chemicals that the system is not extrapolating to areas of chemistry it has never seen. A positive prediction is only made when the compound is within this applicability domain. A negative prediction is made for chemicals within the applicability domain that either contain no alert or contain only deactivated alerts.

See the Leadscope Genetox Expert Alerts white paper for more details.
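
The decision logic described above can be sketched in a few lines of Python. This is only an illustration of the rules as stated in this answer, not Leadscope's implementation: the `AlertMatch` class, the function name and the "not in domain" label for out-of-domain compounds are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AlertMatch:
    """A structural alert matched in the test compound (hypothetical type)."""
    name: str          # illustrative alert label
    deactivated: bool  # True if a defined deactivating factor is present


def expert_alert_call(matches: List[AlertMatch], in_domain: bool) -> str:
    """Apply the rules from the answer above to one test compound."""
    # No call is made outside the applicability domain (label is an assumption).
    if not in_domain:
        return "not in domain"
    # Positive: at least one alert with no defined deactivating factor.
    if any(not m.deactivated for m in matches):
        return "positive"
    # Negative: in domain with no alert, or only deactivated alerts.
    return "negative"
```

For example, a compound that is in domain and matches a single alert that is not deactivated would be called positive, while the same compound with that alert deactivated would be called negative.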
Can you explain how the Leadscope (Q)SAR statistical-based methodology works?
The models were built using the Leadscope Predictive Data Miner software, and the training data sets were compiled at the US Food and Drug Administration (FDA) by the Division of Drug Safety Research (DDSR). The QSAR models are implemented with molecular descriptors that include structural features and seven calculated properties. The structural features include a set selected from Leadscope's 27,000 pre-defined structural features, predictive scaffolds (larger structural features that show association with the toxicity endpoint) and structural alerts identified from the literature. The seven calculated properties are: parent molecular weight, aLogP, polar surface area, hydrogen bond acceptors, hydrogen bond donors, number of rotatable bonds and Lipinski score (rule violations). The structural features and properties serve as the model descriptors, also described as x- or independent variables. The models encode the relationship between these descriptors and the toxicity endpoint, such as the result of the bacterial mutagenesis assay (i.e., the y- or response variable). The modeling technique used to generate these models is referred to as partial logistic regression.

When a prediction is made on a new chemical, the same structural features and properties in the model are calculated for the test compounds. These descriptors are then used with the models to calculate a probability of a positive result. See appendix A for more details.
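
As a rough illustration of how a set of descriptors can be turned into a probability of a positive result, the sketch below uses an ordinary logistic function over a weighted sum of descriptor values. This is a simplified stand-in and not the partial logistic regression technique itself; the function name, descriptor names, weights and intercept are all hypothetical and do not come from any Leadscope model.

```python
import math
from typing import Dict


def positive_probability(descriptors: Dict[str, float],
                         weights: Dict[str, float],
                         intercept: float) -> float:
    """Map descriptor values to a probability with a plain logistic function.

    descriptors: descriptor name -> value for the test compound
                 (1.0/0.0 for presence/absence of a structural feature,
                 or a calculated property value).
    weights:     descriptor name -> hypothetical model coefficient.
    """
    # Descriptors the model does not use contribute nothing to the score.
    z = intercept + sum(weights.get(name, 0.0) * value
                        for name, value in descriptors.items())
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)
```

A descriptor with a positive weight pushes the probability above the baseline set by the intercept, and a descriptor with a negative weight pulls it below.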
Does the Leadscope Model Applier methodology follow the validation principles set forth by the Organisation for Economic Co-operation and Development (OECD)?
Yes. Accompanying each model is a QMRF (QSAR Model Reporting Format) report detailing how each model adheres to these principles.
I understand the statistical-based QSAR models are derived from a training set of historical results. How many chemicals does it include, where do they come from, and what sort of chemistries do they cover?
The Salmonella Mut training set was developed at the U.S. FDA using a non-proprietary set of 3,979 compounds from FDA approval packages and the published literature. This includes a recent addition of 445 chemicals, 247 of which are drug molecules marketed between 1970 and 2011. Others were added to specifically fill data gaps within the training set that were identified using structural features derived from known toxicophores. These include 155 new examples containing functional groups such as azides, amine oxides, hindered epoxides, propiolactones, quinones, amine halides, diazines, azo compounds, diazoniums, sulfates, aziridine chlorides, nitrites, hydrazines, nitriles, isocyanates, and sulfur mustards and 40 compounds containing previously unmodeled atoms such as boron, silicon, selenium, and tin.
The E Coli Sal 102 A-T Mut training set was constructed to predict A-T base pair mutations and to improve performance and coverage over the previous E Coli models. The training set of 1,198 compounds was composed of non-proprietary data from publicly available FDA approval packages and the published literature for E. coli WP2 uvrA, E. coli WP2 uvrA (pKM101), and S. typhimurium TA102. The training set was expanded to include molecular features from more recently marketed drugs and to target areas of chemical space where previous models were known to have weaknesses.
Both this model and the Salmonella model are compliant with the ICH M7 guideline.
Who developed the Leadscope statistical models and expert alerts?
All the genotoxicity QSARs were constructed at the US Food and Drug Administration (FDA) Center for Drug Evaluation and Research by the Division of Drug Safety Research (DDSR).

Leadscope scientists along with Dr. Errol Zeiger and Dr. Ronald Snyder developed the Leadscope Model Applier - Genetox Expert Alerts system.
What sorts of organizations use the Leadscope Model Applier?
The Leadscope Model Applier Genetox Suite is used by pharmaceutical (large and small), biotech, chemical, cosmetic and food product companies. It is also used by regulatory agencies in Europe, the United States and Canada. Users can obtain the same results for their compounds as are obtained by the regulatory agencies.

In addition, Leadscope has over 25 toxicology consulting firms participating in our toxicity consulting program. Those consultants use the Leadscope Model Applier with the Genetox Suite to perform predictions for their clients. Predictions include analysis of impurities in accordance with the ICH M7 draft guidelines. Results of the consultants' predictions made using the Leadscope Model Applier have been used in regulatory submissions.
I understand that the Leadscope Model Applier calculates a probability of a positive prediction - what cut-off should I use to determine a positive or negative result?
It is generally accepted to use a cut-off of greater than or equal to 0.6 for a positive call, less than 0.4 for a negative call and assign calls between 0.4 and 0.6 as equivocal. These are the cutoffs presently used by the FDA in evaluating QSAR submissions using Leadscope.
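
The cut-offs above can be written as a small Python helper. This is a minimal sketch for illustration; the function name and the default parameter names are assumptions, while the 0.6 and 0.4 thresholds are the ones stated in this answer.

```python
def classify_probability(probability: float,
                         positive_cutoff: float = 0.6,
                         negative_cutoff: float = 0.4) -> str:
    """Convert a predicted probability of a positive result into a call."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability >= positive_cutoff:   # greater than or equal to 0.6
        return "positive"
    if probability < negative_cutoff:    # less than 0.4
        return "negative"
    return "equivocal"                   # between the two cut-offs
```

Note that a probability of exactly 0.6 is a positive call, while a probability of exactly 0.4 falls in the equivocal range.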
How do I assign confidence to a prediction result?
The Leadscope Model Applier will calculate a probability of a positive prediction for the QSAR models. This value, along with information used to explain the results (such as the structural analogs from the training set), can be used to assess the weight of the evidence. The genetox expert alerts are accompanied by a detailed explanation, including the structural definition and the literature source for any matched alert, a description of the mechanistic rationale for the alert, and data on historically tested chemicals that also contained the alert.
I have a positive result. What should I do now?
If negative bacterial mutagenicity data is available for the compound then this information would override a positive prediction. The draft guidelines mention that if the (Q)SAR is positive, but the alerting substructure is contained in the API (or related substance) in the same chemical environment and it has tested negative, then the alert or (Q)SAR prediction could be refuted. In addition, the draft guideline also allows for an expert review that could refute the positive prediction with sufficient supporting evidence (Amberg et al., 2016). Otherwise, the compound would be predicted positive and should be controlled at or below acceptable limits (generic or adjusted TTC) or a bacterial mutagenicity test should be performed.
What is applicability domain? And how is the domain defined by Leadscope?
In (Q)SAR modeling, the applicability domain represents the physico-chemical, structural or biological space covered by the predictive model. Your test set must fit within this space for a reliable prediction to be made. The Leadscope Model Applier evaluates the fit of each test chemical to a model domain using two criteria: 1) the test chemical must have at least one of the predictive model's chemical descriptors (as well as all of the property descriptors) and 2) the test chemical must have at least one structural analog in the model training set. If either criterion is not met, no prediction is generated and the test chemical is labeled as Not in Domain.
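
The two domain criteria can be expressed as a simple check. The sketch below is illustrative only: the function name and its inputs are assumptions, and it takes the number of training-set analogs as given because how structural similarity is measured is not described here.

```python
from typing import Dict, Optional, Set


def is_in_domain(test_features: Set[str],
                 model_features: Set[str],
                 property_values: Dict[str, Optional[float]],
                 analog_count: int) -> bool:
    """Return True only if both domain criteria described above are met.

    test_features:   structural features found in the test chemical
    model_features:  structural features used as descriptors by the model
    property_values: property descriptor -> value (None if not calculable)
    analog_count:    number of structural analogs found in the training set
    """
    # Criterion 1: at least one model feature, and all property descriptors.
    has_model_feature = bool(test_features & model_features)
    has_all_properties = all(v is not None for v in property_values.values())
    # Criterion 2: at least one structural analog in the training set.
    has_analog = analog_count >= 1
    return has_model_feature and has_all_properties and has_analog
```

A chemical that fails this check would be reported as Not in Domain rather than being given a probability.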
On occasion the Leadscope Model Applier does not calculate a prediction and reports Not in Domain or Indeterminate (i.e. equivocal). How should I interpret this for assessing impurities and degradants? Can I turn off domain assessment and what are the implications?
A conservative approach would be to treat an out-of-domain or indeterminate/equivocal result as positive; however, supporting evidence such as the results from an analog search of the Leadscope SAR Genetox database could be used to assign it as negative. Turning off the domain assessment is not recommended, since the training set and the models contain limited information to support a prediction for any chemical identified as out-of-domain.
How do I interpret seemingly conflicting results, for instance what if the Salmonella Mut model predicts positive but the E Coli Sal 102 A-T model predicts either negative or out-of-domain?
A positive result in either the Salmonella Mut or the E Coli Sal 102 A-T Mut model would result in a positive call (see the previous two questions and responses for handling out-of-domain predictions). If one model predicts negative and the other is out-of-domain (or equivocal), there is more evidence to assign an overall negative result. Supporting evidence from an analog search of the Leadscope SAR Genetox database could be used to further support this assignment.
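
The combination rule in this answer can be sketched as follows. The function name and call labels are assumptions made for the example. The case where neither model gives a positive or negative call is not covered by this answer, so the sketch returns "unresolved" and leaves that decision to the reviewer.

```python
ALLOWED_CALLS = {"positive", "negative", "equivocal", "not in domain"}


def overall_statistical_call(salmonella_call: str, ecoli_call: str) -> str:
    """Combine the Salmonella Mut and E Coli Sal 102 A-T Mut model calls."""
    calls = {salmonella_call, ecoli_call}
    if not calls <= ALLOWED_CALLS:
        raise ValueError(f"unexpected call(s): {sorted(calls - ALLOWED_CALLS)}")
    # A positive result in either model gives a positive overall call.
    if "positive" in calls:
        return "positive"
    # Negative in one model, with the other negative, equivocal or
    # out-of-domain, leans towards an overall negative.
    if "negative" in calls:
        return "negative"
    # Both models equivocal or out-of-domain: not resolved by this rule
    # (an assumption of this sketch; see the previous question).
    return "unresolved"
```

An overall negative reached from one negative and one out-of-domain result is the case where an analog search is most useful as supporting evidence.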
Can the Leadscope Model Applier support the need to provide additional supportive evidence?
Yes. Included with the Leadscope Model Applier is a high-quality toxicity database that can be used to identify structural analogs that provide additional supportive evidence. In addition, there is an explain capability that identifies which portions of the compound were used to calculate a positive or negative result. This can provide additional supportive evidence, especially where it can be linked to mechanistic information.
How much does the Leadscope Model Applier cost?
The Leadscope Model Applier Genetox Statistical Model Suite can be licensed in two ways: 1) as an annual subscription permitting company-wide unlimited use or 2) on a Pay-As-You-Go basis. The subscription license is $20,000 per year, including unlimited users and unlimited predictions across the whole organization. The Pay-As-You-Go license is $150 per compound.

As an additional feature, an organization can include an analog search for their compounds when they make a prediction with the Genetox Suite using either the unlimited license or the pay-per-compound license. The analog searching function utilizes the Leadscope SAR Genetox and SAR Carcinogenicity databases.

The Leadscope Model Applier Genetox Expert Alerts Suite also can be licensed either via an unlimited use license or a pay-per-compound license. The unlimited license for the expert alerts is $20,000 per year. The pay-per-compound license for the Genetox Expert Alerts Suite is $150 per compound.

A 10% discount is provided if a customer licenses both the Genetox Statistical QSAR Suite and the Genetox Expert Alerts Suite through an unlimited license ($36,000 annually for both after discount). The discounted pay-per-compound license for both the Genetox Statistical QSAR Suite and the Genetox Expert Alerts Suite is $250 per compound.
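
To compare the two licensing options, it can help to work out how many compounds per year make the unlimited license the cheaper choice. The short calculation below uses only the prices quoted above; the function and constant names are assumptions made for the example.

```python
# Prices quoted above, in US dollars.
ANNUAL_FEE_ONE_SUITE = 20_000
PER_COMPOUND_ONE_SUITE = 150
ANNUAL_FEE_BOTH_SUITES = 36_000
PER_COMPOUND_BOTH_SUITES = 250


def break_even_compounds(annual_fee: int, per_compound_fee: int) -> int:
    """Smallest yearly compound count at which the unlimited license
    costs no more than paying per compound (integer ceiling division)."""
    return -(-annual_fee // per_compound_fee)


# The 10% discount on two unlimited licenses gives the quoted combined price.
assert (2 * ANNUAL_FEE_ONE_SUITE) * 90 // 100 == ANNUAL_FEE_BOTH_SUITES

one_suite = break_even_compounds(ANNUAL_FEE_ONE_SUITE, PER_COMPOUND_ONE_SUITE)
both_suites = break_even_compounds(ANNUAL_FEE_BOTH_SUITES, PER_COMPOUND_BOTH_SUITES)
print(one_suite, both_suites)
```

With these prices the break-even point is 134 compounds per year for a single suite and 144 compounds per year for both suites together.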
I have no training in using (Q)SAR models or alerts. How easy is the Leadscope Model Applier to operate?
Once you have drawn a chemical structure or have access to a SMILES string for the chemical, running the software is easy. Simply paste in a chemical structure, select the models you want to run (or click the "Use ICH M7 Setting" button), and then view the results in a customizable table. Reports can be generated directly from this results screen.
Does Leadscope, Inc. cooperate with Toxicology consultants who could run the Leadscope Model Applier over a list of impurities and degradants and who could assist in interpreting my prediction results and preparing a report for the FDA?
If a company chooses not to run its own predictions using the Genetox Suite, Leadscope can run the predictions for impurities and degradants in accordance with the ICH M7 guidelines. Reports will be provided by Leadscope and can be included in regulatory submissions. In addition, the Leadscope Model Applier includes reports required by OECD for REACH submissions. The Model Applier system can generate the QMRF and QPRF reports automatically. Both report formats are OECD compliant.

Leadscope also partners with over 25 consulting firms that use the Leadscope Model Applier to make predictions through the Leadscope Toxicity Consulting Program. Participants in this program can also provide expert review of predictions made with the Model Applier. Companies have included reports from the Leadscope Model Applier in submissions to the U.S. FDA and other regulatory agencies.
How well do the Leadscope genetox expert alerts work?

The Leadscope Genetox Expert Alerts have been validated against two data sets: the expert alerts reference set of over 7,000 compounds (chemicals from the Leadscope SAR genetox database and the training sets) and the Hansen set of over 3,700 compounds. The following table includes the results against the reference set.

The results against the Hansen set are presented below and compared with those from Derek Nexus run against the same test set. The U.S. FDA presented the results of an external validation of Derek Nexus in a poster presented at the Society of Toxicology (SOT) Annual Meeting in 2013 (L. Stavitskaya, B. L. Minnier, R. D. Benz, N. L. Kruhlak, FDA Center for Drug Evaluation and Research, SOT 2013 "Development of Improved Salmonella Mutagenicity QSAR Models Using Structural Fingerprints of Known Toxicophores"). The table below is a comparison of the latest version of the Leadscope Genetox Expert Alerts (version 4.0) to the results for Derek Nexus (version 3.0.1) presented in this poster.

Have the Leadscope QSAR models been externally validated and what are the results?

The U.S. FDA presented the results of an external validation of the Salmonella Mut model in a poster presented at the SOT in 2013 (L. Stavitskaya, B. L. Minnier, R. D. Benz, N. L. Kruhlak, FDA Center for Drug Evaluation and Research, SOT 2013 "Development of Improved Salmonella Mutagenicity QSAR Models Using Structural Fingerprints of Known Toxicophores"). Two data sets were used: the Hansen set and a set from a commercial database (Leadscope external validation). The Hansen set comprises public data collected by Hansen et al. from the scientific literature. The entire set contained 6,512 compounds; however, 2,680 were in the training set or were stereo or geometric isomers of structures already in the training set. A further 132 were perceived duplicates within the test set or un-modelable structures, which were removed, leaving a total of 3,700 compounds in the final test set. The Leadscope external validation set comprises non-proprietary data harvested from FDA approval packages and the published literature. The entire set contained 3,005 compounds; however, 719 were in the training set, were stereo or geometric isomers of structures already in the training set, or were perceived duplicates within the set. These 719 structures were removed, leaving a total of 2,286 compounds in the final test set. The following table from the 2013 SOT poster summarizes the Leadscope QSAR performance statistics:

The E Coli Sal 102 A-T Mut model does not have an external validation set. However, the FDA performed a leave-10%-out cross-validation on the model's training set and presented the results at the 2013 Genetic Toxicology Association (GTA) meeting (L. Stavitskaya, B. L. Minnier, R. D. Benz, N. L. Kruhlak, FDA Center for Drug Evaluation and Research, "Development of Improved QSAR Models for Predicting A-T Base Pair Mutations", GTA 2013b poster). The performance is summarized below*:

*The cross-validation studies are carried out using slightly different methodologies developed by each software provider and therefore are not directly comparable.
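
For readers unfamiliar with leave-10%-out cross-validation, the general idea is to split the training set into ten parts, hold out each part in turn, rebuild the model on the remaining 90% and predict the held-out 10%. The sketch below shows only the splitting step, as a generic illustration. It is not the procedure used by the FDA or by any software provider, and the function name, shuffling and seed are assumptions.

```python
import random
from typing import Iterator, List, Tuple


def leave_ten_percent_out_folds(n_compounds: int,
                                n_folds: int = 10,
                                seed: int = 0
                                ) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (training_indices, held_out_indices) for each fold.

    Every compound index is held out exactly once across the folds.
    """
    indices = list(range(n_compounds))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducibility
    # Deal the shuffled indices into n_folds groups of near-equal size.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for held_out in folds:
        held_out_set = set(held_out)
        training = [i for i in indices if i not in held_out_set]
        yield training, held_out
```

For a training set of 1,198 compounds, this produces ten folds in which 119 or 120 compounds are held out each time.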
What are the performance statistics of combining a Leadscope QSAR prediction with another system?

The table below reports the performance statistics for the Leadscope statistical QSAR models and genetox expert alerts as well as the M7 consensus call for the Hansen test set:

The following table, presented by the US FDA at the 2013 SOT meeting (L. Stavitskaya, B. L. Minnier, R. D. Benz, N. L. Kruhlak, FDA Center for Drug Evaluation and Research, SOT 2013 "Development of Improved Salmonella Mutagenicity QSAR Models Using Structural Fingerprints of Known Toxicophores") summarizes the performance of combining the Leadscope QSAR models (LSE) with Derek Nexus alerts from Lhasa Limited (DX) and CASE Ultra from MultiCASE (CU) using the external validation sets discussed in the previous question.

What options are available for training or support?
Leadscope provides unlimited training and support for the product, including on-site or on-line training, tutorials and telephone/email support at no additional cost.


Where is Leadscope being used?
Leadscope is being used throughout the pharmaceutical, biotechnology, consumer products and chemical industries as well as regulatory agencies around the world.
What is Leadscope being used for?
Leadscope is being used to analyze datasets of chemical structures and related biological or toxicity data. It is the fastest way to become familiar with your compound collection. It is being used for data analysis in the following areas: high-throughput screening, lead optimization, compound acquisition, structure-based data mining and in silico toxicology.
Who uses Leadscope?
Leadscope is being used by medicinal chemists, computational chemists, chemoinformaticians as well as toxicologists and regulators.

Importing and Exporting Chemical Structures, Data, and Results

How do I load chemical structures and data into Leadscope?
Leadscope has an easy-to-use wizard for importing structures and data into the software. Structures and/or data can be loaded from an SD file and data can be loaded from a text file, such as a CSV file.
How many compounds can I import into Leadscope?
Leadscope markets a number of products, and each has a different limit on the number of compounds that can be loaded. Leadscope Data Manager can load up to 10,000 compounds; Leadscope Personal and Leadscope Hosted can load up to 100,000. There is no limit on the number of compounds that can be loaded into Leadscope Enterprise.
What properties are calculated by Leadscope when chemicals are loaded?
Leadscope automatically calculates the following properties for all compounds imported into the Leadscope software: aLogP, polar surface area, number of hydrogen bond donors, number of hydrogen bond acceptors, number of rotatable bonds, molecular weight of the parent compound, molecular weight, number of atoms (for the parent compound) and Lipinski score.
Can I export chemical structures, data, and results?
Structures and data, along with informatics results, can be exported in a number of file formats, including SD files, text files and Word RTF documents.

Data Mining in Leadscope

Is it possible to extend the Leadscope chemical feature hierarchy with my own custom chemical features?
The Leadscope chemical feature hierarchy can be extended with your own branches of substructures, such as toxic groups or privileged fragments. These branches should be defined using an SD file.
How can I compare sets of chemical structures and data?
Multiple properties for the same set of compounds can be analyzed using the chemical hierarchy. Additionally, multiple sets of compounds can be compared using the common chemical substructure hierarchy.
Can I sort the Leadscope chemical feature hierarchy based on data?
The chemical structure hierarchy can be sorted according to the z-score, the mean activity or the frequency.
How does Leadscope group chemical compounds?
Leadscope provides a number of ways to group a set of compounds, including the chemical feature hierarchy (27,000 named substructures), recursive partitioning/simulated annealing (a method for identifying active classes characterized by combinations of structural features), structure-based clustering and dynamically-generated significant scaffolds or substructures.
How can I search in Leadscope?
How can I filter my set of chemicals?
Any data that has been calculated by Leadscope or imported into the program can be used to dynamically filter the collections of chemical structures.
Can I correlate my chemicals with a property?
Chemical structure groups can be annotated to show where unusually high or low levels of activity are associated with that set of compounds.