HERG Inhibition

From ACD Percepta
Revision as of 07:51, 28 May 2012 by Kristina (talk | contribs) (→‎Assignment of qualitative categories)

Overview


Cardiotoxicity of drug-like compounds associated with inhibition of the human ether-a-go-go-related gene (hERG) channel is an increasingly common cause of drug candidate attrition. The hERG potassium channel is required for normal cardiac repolarization, and its blockage can lead to QT interval prolongation and life-threatening arrhythmias.

The hERG Inhibition module lets you quickly identify likely hERG inhibitors. Training the model with ‘in-house’ experimental (screening) data on hERG inhibition, which are usually very large data sets, expands the Applicability Domain of the model and yields more reliable predictions for compounds synthesized in your company. Training also customizes the model so that it correctly handles data originating from the particular screening protocol used in your company, which may differ significantly from the standard protocols described in the literature.

Features

  • The binomial QSAR model uses a training set of more than 600 compounds, with experimental results collected mainly from published patch-clamp studies of hERG inhibition.
  • Predicts the probability that a compound inhibits the hERG channel at clinically relevant concentrations (Ki < 10 μM).
  • Calculates a Reliability Index (RI) for each prediction, indicating whether the tested compound falls within the Applicability Domain of the predictive model.
  • Performs a similarity search and displays the 5 most similar structures from the training set of the model, along with their names, experimental results, and literature references.
  • Allows training of the model with ‘in-house’ data generated by an ‘in-house’ screening protocol.


Interface


File:herg inhibition.png


  1. Estimated probability of a compound being a hERG channel inhibitor.
  2. Indication of the prediction reliability along with the Reliability Index value:
    • RI < 0.3 – Not Reliable,
    • RI in range 0.3-0.5 – Borderline Reliability,
    • RI in range 0.5-0.75 – Moderate Reliability,
    • RI >= 0.75 – High Reliability
  3. Up to 5 similar structures in the training set with names, experimental results (Inhibitor, Non-inhibitor, Inconclusive data), and references
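
The reliability categories listed under item 2 can be expressed as a simple lookup. The following is a minimal sketch, not part of the software: the function name `reliability_label` is hypothetical, and assigning RI = 0.5 to the moderate category is an assumption, since the listed ranges share that boundary.

```python
def reliability_label(ri: float) -> str:
    """Map a Reliability Index (0 to 1) to the category shown in the interface."""
    if not 0.0 <= ri <= 1.0:
        raise ValueError("RI is expected to lie between 0 and 1")
    if ri < 0.3:
        return "Not Reliable"
    if ri < 0.5:
        return "Borderline Reliability"
    # Assumption: RI == 0.5 is treated as moderate; the source ranges overlap here.
    if ri < 0.75:
        return "Moderate Reliability"
    return "High Reliability"


if __name__ == "__main__":
    for value in (0.1, 0.3, 0.6, 0.75):
        print(value, reliability_label(value))
```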


Technical information


Experimental data

The data set used for model development consisted of 663 binary values (inhibitor, non-inhibitor). These were collected from original publications and cover two types of experiments:

  • Electrophysiological patch-clamp assay - hERG current inhibition expressed as IC50 constants (512 compounds).
  • Radioligand (dofetilide, astemizole, MK-499) displacement assay providing Ki values (161 compounds).

Assignment of qualitative categories

The following criteria were applied for conversion of continuous data representing strength of compounds' interaction with hERG channel to binary representation:

  • In patch-clamp studies, compounds that exhibited IC50 < 10 μM were considered hERG inhibitors, while those with IC50 > 10 μM were considered hERG non-inhibitors.
  • For the data coming from radioligand displacement assay the corresponding thresholds were as follows: Ki < 0.5 μM - inhibitors, Ki > 100 μM - non-inhibitors, while compounds in the intermediate range (0.5 μM < Ki < 100 μM) were labeled inconclusive.

Stricter criteria were applied to radioligand displacement data than to patch-clamp data, since the former method does not provide a direct measure of hERG channel inhibition but rather reflects hERG binding affinity. To ensure high quality of the data set, only sufficiently strong or weak binders were considered inhibitors or non-inhibitors, respectively, while no definitive category was assigned to compounds with moderate binding affinities.
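
The thresholds above can be summarized in a short function. This is an illustrative sketch only: the name `herg_category` and the assay identifiers `"patch_clamp"` and `"radioligand"` are hypothetical, and the handling of values that fall exactly on a threshold (10 μM, 0.5 μM, 100 μM) is an assumption, because the stated criteria use strict inequalities.

```python
def herg_category(value_um: float, assay: str) -> str:
    """Return 'inhibitor', 'non-inhibitor', or 'inconclusive'.

    value_um is an IC50 (patch-clamp) or a Ki (radioligand displacement),
    expressed in micromolar.
    """
    if value_um <= 0:
        raise ValueError("concentration must be positive")
    if assay == "patch_clamp":
        if value_um < 10.0:
            return "inhibitor"
        if value_um > 10.0:
            return "non-inhibitor"
        # Assumption: exactly 10 uM is not covered by the stated criteria.
        return "inconclusive"
    if assay == "radioligand":
        if value_um < 0.5:
            return "inhibitor"
        if value_um > 100.0:
            return "non-inhibitor"
        # Intermediate binders (and, by assumption, the boundary values).
        return "inconclusive"
    raise ValueError(f"unknown assay type: {assay!r}")


if __name__ == "__main__":
    print(herg_category(2.0, "patch_clamp"))   # inhibitor
    print(herg_category(2.0, "radioligand"))   # inconclusive
```

The same 2 μM value is classified differently depending on the assay, which reflects the stricter treatment of binding data described above.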

File:herg scale.png

Model features & prediction accuracy

The model was developed with Algorithm Builder using a novel methodology consisting of two parts:

  • Global baseline statistical model employing binomial PLS with multiple bootstrapping using a predefined set of fragmental descriptors.
  • Local correction to baseline prediction based on analysis of experimental data for similar compounds.

The underlying methodology provides an intrinsic evaluation of prediction confidence by means of Reliability Index (RI) values calculated for each prediction. RI, ranging from 0 to 1, indicates whether a submitted compound falls within the Model Applicability Domain. Two criteria influence the calculation of the Reliability Index of a prediction:

  • Similarity of the analyzed molecule to compounds in the Self-training Library (prediction is unreliable if no similar compounds have been found in the Library).
  • Consistency of experimental data for similar compounds (discrepant data for similar molecules, i.e. alternating hERG blockers and hERG non-blockers lead to lower RI values).
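
To make the interplay of the two criteria concrete, here is a toy calculation. It is not the formula used by the software, which is not given in this document; the function name `toy_reliability_index`, the use of a mean similarity, and the consistency term |2p - 1| are all assumptions chosen only to show how low similarity or conflicting neighbor data would each pull the index down.

```python
from typing import List, Tuple


def toy_reliability_index(neighbors: List[Tuple[float, bool]]) -> float:
    """Toy RI from (similarity, is_inhibitor) pairs for the nearest library compounds.

    Hypothetical illustration only; similarities are assumed to lie in [0, 1].
    """
    if not neighbors:
        return 0.0  # no similar compounds found: prediction is unreliable
    total_similarity = sum(sim for sim, _ in neighbors)
    if total_similarity == 0:
        return 0.0
    mean_similarity = total_similarity / len(neighbors)
    # Similarity-weighted fraction of inhibitors among the neighbors.
    p_inhibitor = sum(sim for sim, is_inh in neighbors if is_inh) / total_similarity
    # 1.0 when all neighbors agree, 0.0 when they are evenly split.
    consistency = abs(2.0 * p_inhibitor - 1.0)
    return mean_similarity * consistency


if __name__ == "__main__":
    print(toy_reliability_index([(0.9, True), (0.9, True)]))   # similar and consistent
    print(toy_reliability_index([(0.8, True), (0.8, False)]))  # similar but discrepant
```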

The presented method also forms the basis of model Trainability. The ‘Trainable model’ methodology addresses the fact that the chemical space of ‘in-house’ libraries is considerably wider than that of publicly available data, which limits the applicability of most third-party QSARs to ‘in-house’ data. The ‘Training engine’ corrects for systematic deviations produced by the baseline QSAR model, based on analysis of similar compounds from the experimental data library. Expansion of this Self-training Library with user-defined experimental data for new compounds leads to instant improvement of prediction accuracy for the respective compound classes. Moreover, addition of ‘in-house’ data allows adapting the existing model to the particular experimental protocol used in your company and avoiding potential issues related to discrepancies between different experimental methods used for determination of drug interactions with hERG (see the Model Trainability Demonstration section).

The accuracy of predictions for compounds within the model Applicability Domain (indicated by Reliability Index values) is comparable to screening results. Predictions that are not reliable may be instantly improved by adding experimental data for a few similar compounds to the model Self-training Library.

The table below shows performance of the model on the internal validation set consisting of 151 molecules. Predictions for 103 compounds (68.2% of the validation set) within Model Applicability Domain (indicated by Reliability Index (RI) value > 0.3) are highly accurate:

             Predicted
             True    False
True          60       4      Sensitivity  93.4%
False          5      34      Specificity  87.2%
                              Accuracy     91.3%

  • Only compounds within Applicability Domain (RI > 0.3) were considered in testing.
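
The reported statistics can be checked against the counts in the table. The sketch below assumes that the rows are the experimental classes and the columns the predicted classes, which the table does not state explicitly; the function name `binary_metrics` is hypothetical. Under that reading, accuracy and specificity reproduce the reported 91.3% and 87.2%, while sensitivity comes out as 60/64, close to but not identical with the reported 93.4%.

```python
def binary_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # fraction of true inhibitors found
        "specificity": tn / (tn + fp),   # fraction of true non-inhibitors found
    }


# Assumed reading of the table: rows = experimental class, columns = predicted class.
metrics = binary_metrics(tp=60, fn=4, fp=5, tn=34)

if __name__ == "__main__":
    for name, value in metrics.items():
        print(f"{name}: {value:.1%}")
```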

Model Trainability Demonstration

Distribution of Test Set compounds by RI values of predictions after addition of different portions of the PubChem data set to the Self-training Library

Trainability of the described predictive model of hERG inhibition was tested using an external data set derived from an HTS fluorescence assay that has recently become available in the PubChem database. The validation procedure was performed as follows:

  • HTS fluorescence assay data for 1609 compounds were extracted from the PubChem database. Quantitative values provided in the PubChem database (PubChem scores - fluorescence increase over negative control compared to the reference compound terfenadine) were converted to binary representation: compounds with a PubChem score > 40% were considered hERG inhibitors; those with a PubChem score from -20% to 20% were considered non-inhibitors.
  • Part of this external data library was reserved as a test set. The remaining data were added to the Self-training Library in three steps.

  • The resulting models containing different portions of HTS data were validated against the reserved test set.
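
The score conversion in the first step can be sketched as follows. The function name `pubchem_score_to_label` is hypothetical, the inclusive handling of the -20% and 20% boundaries is an assumption, and treating scores outside both ranges as unlabeled is also an assumption, since the procedure does not say how such compounds were handled.

```python
from typing import Optional


def pubchem_score_to_label(score_percent: float) -> Optional[bool]:
    """Convert a PubChem score (percent) to a binary hERG label.

    Returns True for inhibitors, False for non-inhibitors, and None for
    scores that fall in neither range (assumed here to be left unlabeled).
    """
    if score_percent > 40.0:
        return True
    if -20.0 <= score_percent <= 20.0:
        return False
    return None


if __name__ == "__main__":
    scores = [85.0, 5.0, 30.0, -35.0]
    labeled = {s: pubchem_score_to_label(s) for s in scores}
    print(labeled)
```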

When calculations for the test set are made using the built-in Self-training Library, predicted values for many compounds are marked ‘Not reliable’ (i.e. they fall outside the Model Applicability Domain; red bars in the figure). However, as discussed above, prediction accuracy remains high when only predictions of at least borderline reliability (RI ≥ 0.3) are considered. The key point is the appearance of a considerable number of moderate-quality (RI ≥ 0.5) and high-quality (RI ≥ 0.7) predictions when even a small part of the external data set is added to the Self-training Library (green bars in the figure). The percentage of reliable predictions rises further as the Library is expanded, while the same or better overall accuracy of calculations is maintained:

Reliability          RI > 0.5            RI > 0.7
Library              N     Accuracy      N     Accuracy
Built-in             104   96.15%        6     83.33%
Built-in + 320       302   99.01%        150   99.33%
Built-in + 623       345   99.13%        177   99.48%
Built-in + 935       376   98.34%        192   99.48%

These results demonstrate the ability of our ‘Trainable model’ methodology to adapt the existing model to the particular chemical space represented by an external compound set. They also indicate that our Training engine successfully corrects for differences in experimental estimation when data from different assays are combined, and is therefore particularly suitable for analysis of ‘in-house’ data.