hERG Inhibition

From ACD Percepta
Revision as of 10:00, 15 June 2017 by Kirilas (talk | contribs)

Overview


Cardiotoxicity of drug-like compounds associated with inhibition of the human ether-a-go-go related gene (hERG) channel is an increasingly common cause of drug candidate attrition. The hERG potassium channel is required for normal cardiac repolarization, and its blockage can lead to QT interval prolongation and life-threatening arrhythmias.

The hERG Inhibition module lets you quickly identify hERG inhibitors. Training the model with ‘in-house’ experimental (screening) hERG inhibition data, which are usually very extensive, expands the Applicability Domain of the model and yields more reliable predictions for compounds synthesized in your company. Training also customizes the model so that it correctly handles data originating from the particular screening protocol used in your company, which may differ significantly from the standard protocols described in the literature.

Features

  • Predicts the probability that a compound inhibits the hERG channel at clinically relevant concentrations (Ki < 10 μM).
  • The predictive model is based on a data set of more than 6500 compounds with experimental results collected from published hERG inhibition studies that used either patch-clamp or competitive binding methods.
  • Calculates a Reliability Index (RI) for each prediction, indicating whether the tested compound belongs to the Applicability Domain of the predictive model.
  • Performs a similarity search and displays up to 5 of the most similar structures from the training set of the model, along with their names, experimental results, and literature references.
  • Supports training of the model with ‘in-house’ data generated by an ‘in-house’ screening protocol.

IMPORTANT NOTE:

If you installed ACD/Percepta v. 2016 as an in-place upgrade over the previous version, the hERG Inhibition module will continue using the same set of Self-training libraries that was configured in your previous installation. To take advantage of the new, significantly extended built-in library that comes with the current version of the software, click "Configure" and manually select the following entry: hERG-I (Ki less than 10 uM) v. 1.3 (read-only).

In case of a clean installation, this library is selected automatically, and no further action is required.

Interface



  1. Estimated probability that the compound is a hERG channel inhibitor.
  2. Indication of the prediction reliability along with the Reliability Index value:
    • RI < 0.3 – Not Reliable,
    • RI in range 0.3-0.5 – Borderline Reliability,
    • RI in range 0.5-0.75 – Moderate Reliability,
    • RI >= 0.75 – High Reliability
  3. Up to 5 similar structures in the training set with names, experimental results (Inhibitor, Non-inhibitor, Inconclusive data), and references
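
The reliability bands listed in item 2 can be expressed as a small lookup. This is a minimal sketch, not part of ACD/Percepta: the function name `reliability_label` is hypothetical, and because the source does not say how values exactly at 0.3 or 0.5 are handled, the sketch assumes each band includes its lower bound.

```python
def reliability_label(ri: float) -> str:
    """Map a Reliability Index (expected range 0..1) to its reliability band.

    Assumption: each band includes its lower bound (0.3 counts as Borderline,
    0.5 as Moderate); only the 0.75 boundary is stated explicitly in the text.
    """
    if not 0.0 <= ri <= 1.0:
        raise ValueError("RI is expected to lie between 0 and 1")
    if ri < 0.3:
        return "Not Reliable"
    if ri < 0.5:
        return "Borderline Reliability"
    if ri < 0.75:
        return "Moderate Reliability"
    return "High Reliability"
```

For example, `reliability_label(0.62)` falls in the 0.5-0.75 band and returns "Moderate Reliability".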


Technical information


Experimental data

The built-in library included in the hERG Inhibition module contains 6670 compounds with experimental values determined using patch-clamp (conventional and automated) and competitive radioligand displacement assays (reference ligands: dofetilide, astemizole, MK-499). The data were collected from the ChEMBL database as well as from original literature publications.

The information provided below applies to the original 'baseline' model, which was parameterized on a smaller data set of 663 molecules with high-quality quantitative data. Predictions performed by the module, however, use this 'baseline' model trained with the full database of 6670 molecules to ensure the best possible coverage of pharmaceutically relevant chemical space.

Assignment of qualitative categories

The following criteria were applied to convert continuous data representing the strength of a compound's interaction with the hERG channel into a binary representation:

  • In patch-clamp studies, compounds with IC50 < 10 μM were considered hERG inhibitors, and those with IC50 > 10 μM hERG non-inhibitors.
  • For data from radioligand displacement assays, the corresponding thresholds were: Ki < 0.5 μM for inhibitors and Ki > 100 μM for non-inhibitors, while compounds in the intermediate range (0.5 μM < Ki < 100 μM) were labeled inconclusive.

Stricter criteria were applied to radioligand displacement data than to patch-clamp data because the former method does not provide a direct measure of hERG channel inhibition, but rather represents hERG binding affinity. To ensure high quality of the data set, only sufficiently strong or weak binders were considered inhibitors or non-inhibitors, respectively, while no definitive category was assigned to compounds with moderate binding affinities.
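
The two sets of thresholds above can be written down as a pair of small functions. This is an illustrative sketch only: the function names and the `Category` type are hypothetical, and values falling exactly on a threshold (which the source does not address) are assumed here to be inconclusive.

```python
from typing import Literal

Category = Literal["inhibitor", "non-inhibitor", "inconclusive"]


def categorize_patch_clamp(ic50_um: float) -> Category:
    """Binary label from a patch-clamp IC50 given in micromolar."""
    if ic50_um < 10.0:
        return "inhibitor"
    if ic50_um > 10.0:
        return "non-inhibitor"
    # Exactly 10 uM is not covered by the stated criteria (assumption).
    return "inconclusive"


def categorize_binding(ki_um: float) -> Category:
    """Label from a radioligand displacement Ki given in micromolar."""
    if ki_um < 0.5:
        return "inhibitor"
    if ki_um > 100.0:
        return "non-inhibitor"
    # Moderate binders (and, by assumption, the boundary values themselves)
    # receive no definitive category.
    return "inconclusive"
```

Under these rules a compound with Ki = 5 μM is inconclusive by binding data, whereas an IC50 of 5 μM from patch-clamp would label it an inhibitor, which reflects the stricter treatment of binding data.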

Model features & prediction accuracy

The predictive model of hERG inhibition was derived using GALAS (Global, Adjusted Locally According to Similarity) modeling methodology (please refer to [1] for more details).

Each GALAS model consists of two parts:

  • Global baseline statistical model employing binomial PLS with multiple bootstrapping and a predefined set of fragmental descriptors, which reflects general trends in hERG inhibition.
  • Similarity-based routine that performs local correction of baseline predictions taking into account the differences between baseline and experimental values for the most similar training set compounds.
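
The idea of a similarity-based local correction can be illustrated with a toy example. This is not the GALAS algorithm itself (see [1] for that); it is a deliberately simplified sketch in which the names `Neighbor` and `locally_corrected`, the similarity-weighted averaging rule, and the choice to work directly on probabilities are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Neighbor:
    """A training set compound similar to the query (hypothetical structure)."""
    similarity: float  # similarity to the query compound, 0..1
    baseline: float    # baseline model probability for this neighbor
    observed: float    # experimental outcome: 1.0 inhibitor, 0.0 non-inhibitor


def locally_corrected(baseline_query: float, neighbors: Sequence[Neighbor]) -> float:
    """Shift the baseline prediction by the similarity-weighted mean residual
    (observed minus baseline) of the most similar training compounds."""
    total_weight = sum(n.similarity for n in neighbors)
    if total_weight == 0:
        # No similar compounds: nothing to correct with, keep the baseline.
        return baseline_query
    delta = sum(n.similarity * (n.observed - n.baseline) for n in neighbors) / total_weight
    # Keep the result a valid probability.
    return min(1.0, max(0.0, baseline_query + delta))
```

If the baseline model underestimates the experimental values for close analogues, the correction term is positive and the prediction for the query compound moves toward "inhibitor"; with no similar compounds, the baseline value is returned unchanged.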


The GALAS methodology also provides the basis for estimating the reliability of predictions by means of a calculated Reliability Index (RI), a value ranging from 0 to 1 that takes into account the following two criteria:

  • Similarity of the tested compound to the training set molecules (a prediction is unreliable if no similar compounds have been found).
  • Consistency between experimental values and baseline model predictions for the most similar compounds from the training set (discrepant data for similar molecules, i.e. alternating hERG blockers and hERG non-blockers, lead to lower RI values).

The same method also provides the basis for model Trainability. The 'Trainable model' methodology addresses the problem that the chemical space of ‘in-house’ libraries is considerably wider than that of publicly available data, which limits the applicability of most third-party QSARs to ‘in-house’ data. The ‘Training engine’ corrects for systematic deviations produced by the baseline QSAR model based on analysis of similar compounds from the experimental data library. Expanding this Self-training Library with user-defined experimental data for new compounds leads to an instant improvement of prediction accuracy for the respective compound classes. Moreover, adding 'in-house' data adapts the existing model to the particular experimental protocol used in your company and avoids potential issues related to discrepancies between different experimental methods used to determine drug interactions with hERG (see the Model Trainability Demonstration section).

The accuracy of predictions for compounds within the model Applicability Domain (indicated by Reliability Index values) is comparable to screening results. Predictions that are not reliable may be instantly improved by adding experimental data for a few similar compounds to the model Self-training Library.

The table below shows performance of the model on the internal validation set consisting of 151 molecules. Predictions for 103 compounds (68.2% of the validation set) within Model Applicability Domain (indicated by Reliability Index (RI) value > 0.3) are highly accurate:

                    Predicted True    Predicted False
Observed True             60                 4            Sensitivity 93.4%
Observed False             5                34            Specificity 87.2%
                                                          Accuracy 91.3%
  • Only compounds within Applicability Domain (RI > 0.3) were considered in testing.
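
The summary statistics can be recomputed from the four counts in the table. The sketch below assumes that rows are the observed (experimental) classes and columns the predicted classes; the function name `binary_metrics` is hypothetical. Under that reading, accuracy and specificity reproduce the reported 91.3% and 87.2%, while sensitivity comes out as 60/64 = 93.75%, slightly above the reported 93.4%; the source does not explain this difference.

```python
def binary_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Accuracy, sensitivity and specificity from confusion matrix counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,   # correct predictions / all predictions
        "sensitivity": tp / (tp + fn),   # observed inhibitors predicted as such
        "specificity": tn / (tn + fp),   # observed non-inhibitors predicted as such
    }


# Counts as read from the validation table (layout is an assumption).
metrics = binary_metrics(tp=60, fn=4, fp=5, tn=34)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```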

Model Trainability Demonstration

Figure: Distribution of Test Set compounds by RI values of predictions after addition of different portions of the PubChem data set to the Self-training Library

Trainability of the described predictive model of hERG inhibition was tested using an external data set derived from an HTS fluorescence assay that has recently become available in the PubChem database. The validation procedure was performed as follows:

  • HTS fluorescence assay data for 1609 compounds were extracted from the PubChem database. Quantitative values provided in the PubChem database (PubChem scores: fluorescence increase over negative control compared to the reference compound terfenadine) were converted to a binary representation: compounds with a PubChem score > 40% were considered hERG inhibitors; those with a PubChem score from -20 to 20% were considered non-inhibitors.
  • Part of this external data library was reserved as a test set. The remaining data were added to the Self-training Library in three steps.

  • The resulting models containing different portions of HTS data were validated against the reserved test set.
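
The conversion of PubChem scores to binary labels described in the first step can be sketched as follows. The function name `pubchem_score_to_label` is hypothetical. The source does not say what was done with scores outside both bands (between 20% and 40%, or below -20%), nor whether the band edges are inclusive; this sketch assumes such compounds receive no label and that -20% and 20% belong to the non-inhibitor band.

```python
from typing import Optional


def pubchem_score_to_label(score_percent: float) -> Optional[bool]:
    """Return True for inhibitor, False for non-inhibitor, None for no label."""
    if score_percent > 40.0:
        return True
    if -20.0 <= score_percent <= 20.0:
        return False
    # Outside both bands: treated as unlabeled (assumption, not from the source).
    return None


# Hypothetical scores, for illustration only.
scores = [75.0, 12.5, 30.0, -35.0]
labels = [pubchem_score_to_label(s) for s in scores]
```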

When calculations for the test set are made using the Built-in Self-training Library, predicted values for many compounds are marked ‘Not reliable’ (i.e. they fall outside of the Model Applicability Domain; red bars in the figure). However, as discussed above, prediction accuracy remains high if only calculations of at least borderline reliability (RI ≥ 0.3) are considered. The key point is the appearance of a considerable number of moderate (RI ≥ 0.5) and high quality (RI ≥ 0.7) predictions when even a small part of the external data set is added to the Self-training Library (green bars in the figure). The percentage of reliable predictions rises further as the Library is expanded, while the same or better overall accuracy of calculations is maintained:

Reliability           RI > 0.5              RI > 0.7
Library               N      Accuracy       N      Accuracy
Built-in              104    96.15%         6      83.33%
Built-in + 320        302    99.01%         150    99.33%
Built-in + 623        345    99.13%         177    99.48%
Built-in + 935        376    98.34%         192    99.48%

These results demonstrate the ability of our ‘Trainable model’ methodology to adapt the existing model to the particular chemical space represented by an external compound set. They also indicate that our Training engine corrects for differences in experimental estimation when data from different assays are combined, and is therefore particularly suitable for analysis of ‘in-house’ data.