Optimal drug combinations and minimal hitting sets
© Vazquez; licensee BioMed Central Ltd. 2009
Received: 9 February 2009
Accepted: 6 August 2009
Published: 6 August 2009
Identifying effective drug combinations that significantly improve over single agents is a challenging problem. Pairwise combinations already represent a huge screening effort. Beyond two-drug combinations the task seems infeasible.
In this work we introduce a method to uncover drug combinations with a putative effective response when presented to a heterogeneous population of malignant agents (strains), such as cancer cell lines or viruses. Using data quantifying the effect of single drugs over several individual strains, we search for minimal drug combinations that successfully target all strains. We show that the latter problem can be mapped to a minimal hitting set problem in mathematics. We illustrate this approach using data for the NCI60 panel of tumor derived cell lines, uncovering 14 anticancer drug combinations.
The drug-response graph and the associated minimal hitting set method can be used to uncover effective drug combinations in anticancer drug screens and drug development programs targeting heterogeneous populations of infectious agents such as HIV.
Mainstream drug discovery has focused on identifying compounds targeting specific malignant agents, such as cancer subtypes or virus strains. In many cases, however, the target of drug therapy is a heterogeneous population of malignant agents, each characterized by a different degree of aggressiveness and response to therapy. Drug resistance is a clear example, whereby an induced or preexisting subpopulation of malignant agents is not responsive to a drug and escapes treatment.
Drug combinations can improve over single therapeutic agents in two ways. Synergy between two drugs may result in a better response than the two drugs acting independently. A drug combination may also be more effective when targeting heterogeneous populations of malignant agents. In the latter case, although each single drug may be effective for only a subset of the malignant agents, the drug set as a whole may cover all malignant agents.
Uncovering drug combinations by direct screening is quite challenging due to the large number of potential combinations. A recent high-throughput screen was able to systematically test about 120,000 different two-drug combinations. Yet programs like the NCI60 anticancer drug screen hold a stock of more than 100,000 potential therapeutic agents, resulting in more than $5 \times 10^{9}$ two-drug combinations. The situation becomes even worse when addressing combinations of more than two drugs. More importantly, assuming that most drug combinations will not improve significantly over single drugs, attempting such high-throughput screens is highly inefficient.
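The scale of the screening problem can be checked with a short calculation. The sketch below assumes a stock of exactly 100,000 drugs (the text only says "above 100,000") and counts unordered combinations; the function name `count_combinations` is introduced here for illustration only.

```python
import math

def count_combinations(stock_size, drugs_per_combination):
    """Number of unordered drug combinations of a given size."""
    return math.comb(stock_size, drugs_per_combination)

# Assumption: a stock of exactly 100,000 drugs.
pairs = count_combinations(100_000, 2)
triples = count_combinations(100_000, 3)
print(f"two-drug combinations:   {pairs:.2e}")
print(f"three-drug combinations: {triples:.2e}")
```

The pair count is already on the order of $5 \times 10^{9}$, and each additional drug in the combination multiplies the count by several further orders of magnitude.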
Some interesting techniques are starting to emerge to tackle the potential scarcity of good combinations. The discovery process can be accelerated and the screening costs reduced using stochastic search algorithms and closed-loop optimization. Modeling and network approaches can help us anticipate synergistic effects [4–6]. Yet, there is no general method to identify effective drug combinations from a very large drug stock.
In this work we introduce a systematic framework to uncover effective drug combinations. Our approach is based on the existence of a population of malignant agents (strains), a stock of drugs to target them and a measure quantifying the response of each strain to each single drug. Starting from these data we construct a strain-drug response graph. Using this graph we show that the problem of finding the minimal number of drugs with a putative effective response over all strains is equivalent to the minimal hitting set problem in mathematics. We illustrate the applicability of this framework using data from the NCI60 anticancer drug screen as a case study. We report 14 drug combinations with a putative effective response over cancer types represented by the NCI60 panel of tumor derived cell lines.
Mapping to a minimal hitting set problem
To start addressing the drug combination problem, let us assume we have a stock of drugs to target the different strains that can be found in the patient population. The strains are characterized, in principle, by a different response to the drugs in our stock. Our goal is to find a minimal set of drugs, taken from the available stock, such that each of the strains will respond well to at least one drug in our set.
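Stated formally, and using notation introduced here only for illustration ($S$ for the set of strains, $D$ for the drug stock, and $D_s \subseteq D$ for the drugs to which strain $s$ responds well), the goal above is the standard minimal hitting set problem:

```latex
\[
\min_{H \subseteq D} \; |H|
\quad \text{subject to} \quad
H \cap D_s \neq \emptyset \quad \text{for every } s \in S .
\]
```

A set $H$ satisfying the constraint "hits" every strain; a minimal hitting set is one with the fewest possible drugs.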
Let us show how this works in a specific example. The NCI60 is a program developed by the NCI/NIH aimed at discovering new chemotherapeutic agents to treat cancer. Their drug stock is made of more than 100,000 compounds, and response data for 40,000 compounds is publicly available. Their population of cancer cell lines (the strains in this context) is made of 60 tumor derived cell lines, representing nine tissues of origin. The response of the cell lines to the chemical agents is quantified by the IC50, the drug concentration necessary to inhibit the growth of an exposed cell line culture to 50% relative to the untreated control.
To determine what constitutes a good response we use as a reference the IC50 distribution over all pairs (cell line, drug), after performing a z-transformation of the IC50s in a logarithmic scale (Fig. 1a, solid line). This reference distribution peaks at zero and decays very fast beyond two standard deviations. Values to the left denote low sensitivity (bad response) and values to the right denote high sensitivity (good response). In the following we take as a good response positive values above two standard deviations (Fig. 1a, dashed line). Applying this criterion to each (cell line, drug) pair we obtain a graph equivalent to that in Fig. 1 for the NCI60 system.
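The thresholding step can be sketched in a few lines of Python. This is a minimal illustration, not the original analysis code: the function name `build_response_graph` and the input layout (a dictionary keyed by (strain, drug) pairs) are hypothetical, and we assume sensitivity is scored as $-\log_{10}(\mathrm{IC50})$ so that larger z-scores mean higher sensitivity, in line with the convention described above.

```python
import math
from statistics import mean, pstdev

def build_response_graph(ic50, threshold=2.0):
    """Return {strain: set of drugs with a good response}.

    ic50: dict mapping (strain, drug) -> IC50 concentration (positive float).
    Assumption: sensitivity = -log10(IC50), so a lower IC50 gives a
    higher score and a z-score to the right of the distribution.
    """
    scores = {pair: -math.log10(value) for pair, value in ic50.items()}
    mu = mean(scores.values())
    sigma = pstdev(scores.values())
    graph = {strain: set() for strain, _ in ic50}
    if sigma == 0:
        return graph  # all responses identical: no pair stands out
    for (strain, drug), score in scores.items():
        # Edge = z-score more than `threshold` standard deviations above the mean.
        if (score - mu) / sigma > threshold:
            graph[strain].add(drug)
    return graph

# Toy data: two strains, five drugs, one clearly sensitive pair.
toy = {(s, f"d{i}"): 1e-5 for s in ("A", "B") for i in range(5)}
toy[("A", "d0")] = 1e-9
toy_graph = build_response_graph(toy)
```

In the toy data only the pair (A, d0) passes the two-standard-deviation cutoff, so strain B is left without any connected drug; such strains cannot be hit by any drug set.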
Finding minimal hitting sets
Covering any of these drugs will automatically halve the size of our computational problem. Thus, we start with a greedy algorithm, first reported in , that recursively covers and removes a drug randomly selected among those drugs with the current highest number of connections, until there are no more samples connected to drugs (Methods, highest-degree-first).
Minimal hitting sets
Drugs in the minimal hitting sets
Mechanism of action
Quinoline-4-carboxamide, N,N'-[(1,4-piperazinediyl)bis(3,1-propanediyl)]bis(2-phenyl-, dihydrochloride
1H-Inden-1-one, 2,3-dihydro-2-[(4-hydroxy-3,5-dimethylphenyl)methylene]-5,6-dimethoxy-, (2E)-
Benzo[1,2-b:4,5-b']dithiophene-4,8-dione, 2-(1-hydroxyethyl)-
7-methoxy-5-oxo-8-[3-(9-oxo-9,10-dihydro-4-acridinylcarboxamido)propoxyl]-(11aS)-1H,2H,3H,5H-benzo[e]pyrrolo[1,2-a][1,4]diazepine
Discussion and conclusion
Exhaustive screening of all possible drug combinations is an ineffective strategy to identify good drug combinations. Current screens for single drugs should help to anticipate potentially effective drug combinations, allowing us to narrow down from a sea of drug combinations to a short list. The latter can be subject to direct testing, but now with a dramatic decrease of the screening costs.
The strain-drug response graph and the associated minimal hitting set problem provide a systematic framework to tackle this problem. The single agent screen data is represented by a bipartite graph, with one class of vertices representing drugs and another representing malignant agents/strains. Furthermore, the good response of a strain to a drug is represented by a connection between the corresponding vertices in the graph. Using this construction as input, we can search for effective drug combinations, defined as a minimal set of drugs such that each strain responds well to at least one drug. The latter problem is mapped to the minimal hitting set problem in mathematics.
The analysis of the NCI60 anticancer drug screen shows how these ideas can be implemented in practice. In this specific example it was possible to identify all minimal hitting sets by exhaustive evaluation of all combinations up to three-drug cocktails. An approximate algorithm based on simulated annealing was able to identify all minimal hitting sets as well. The latter algorithm is far more efficient and could be used in problems that are more computationally demanding, with a larger drug stock or a potentially larger number of drugs in the minimal hitting sets.
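The exhaustive evaluation mentioned above can be illustrated as follows. This is a sketch under stated assumptions rather than the code used in the study: the function name `minimal_hitting_sets` and the graph layout (a dictionary from strain to the set of drugs it responds to) are hypothetical, and strains with no connected drug are ignored because no drug set can hit them.

```python
from itertools import combinations

def minimal_hitting_sets(graph, max_size=3):
    """Return all hitting sets of the smallest size found, up to max_size.

    graph: dict mapping strain -> set of drugs giving a good response.
    """
    strain_sets = [drugs for drugs in graph.values() if drugs]
    all_drugs = sorted(set().union(*strain_sets))
    for size in range(1, max_size + 1):
        # A combination is a hitting set if it shares a drug with every strain.
        found = [set(combo) for combo in combinations(all_drugs, size)
                 if all(set(combo) & drugs for drugs in strain_sets)]
        if found:
            return found  # first size with a solution is the minimum
    return []

# Toy graph: three strains, four drugs.
toy_graph = {"s1": {"a", "b"}, "s2": {"b", "c"}, "s3": {"c", "d"}}
solutions = minimal_hitting_sets(toy_graph)
```

Because sizes are tried in increasing order, the first size that yields any hitting set is the minimal one. The cost grows combinatorially with the drug stock and the set size, which is why this approach is only practical for small combinations.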
The exhaustive search is not a feasible strategy for very large datasets. Therefore, even when the strain-drug response graph is complete, we would rely on approximate algorithms to obtain an upper bound to the minimal hitting set size. Besides the highest-degree-first and simulated annealing algorithms discussed here, there are other heuristic algorithms [8, 11] that in some specific problems may result in better estimates.
From the biological point of view, the identified drug combinations are minimal hitting sets for the NCI60 panel of cell lines. A cell line not included in this panel may not respond well to any of these combinations. Furthermore, using the single drug response data we cannot anticipate potential interactions between the drugs in a given minimal set. Finally, we have not addressed other important issues such as toxicity which may exclude a drug combination for clinical use.
In spite of these caveats, the strain-drug response graph and the associated minimal hitting set problem provide a solid mathematical foundation to the drug combination problem. When information is incomplete and the estimates are approximate, it provides an upper bound to the actual minimal hitting set size. It can be applied to larger panels of cancer cell lines to increase the coverage over the population of cancer cell lines. It narrows down to a short list of drug combinations which can be subject to validation, testing combinatorial effects and toxicity.
From a more general perspective, our formulation can also find applications in drug discovery programs targeting viruses with high mutation rates such as HIV. In this context we would require a collection of virus strains found in the patient population, a stock of antiviral drugs, and a quantitative measure of how well each virus strain responds to each antiviral drug.
The IC50 data for the NCI60 panel of tumor derived cell lines was obtained from the Developmental Therapeutics Program of NCI/NIH. It consists of IC50 values for 45,344 compounds against the 60 cancer cell lines.
Given a strain-drug response graph, start by setting all drugs as uncovered. Then recursively transform the drug states and the drug-response graph as follows: (i) Identify the set of drugs having the largest number of connections in the current drug-response graph. If this set contains a single drug, select that drug. Otherwise, randomly select one of the drugs in the set. (ii) Set that drug as covered, then remove the drug, all the samples connected to that drug and the edges connecting the drug and the samples. (iii) Stop if the drug-response graph does not contain any samples connected to at least one drug. Otherwise go to step (i). Note: the application of rule (i) introduces randomness in the algorithm and, as a consequence, different runs may result in different outcomes. Specifically, we may obtain different estimated minimal hitting set sizes and/or different hitting sets with the same size. This fact can be exploited by running the algorithm several times and retaining those solutions having the minimum reported hitting set size.
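Steps (i) to (iii) can be sketched in Python as follows. The function names `highest_degree_first` and `best_of_runs`, and the dictionary layout of the graph, are hypothetical choices made for this illustration; the logic follows the description above, with ties broken at random.

```python
import random

def highest_degree_first(graph, rng=None):
    """Greedy estimate of a hitting set.

    graph: dict mapping sample (strain) -> set of drugs giving a good response.
    """
    rng = rng or random.Random()
    # Samples without any connected drug cannot be hit and are ignored.
    remaining = {s: set(drugs) for s, drugs in graph.items() if drugs}
    covered = set()
    while remaining:  # step (iii): stop when no connected samples are left
        # Step (i): count connections of each drug in the current graph.
        degree = {}
        for drugs in remaining.values():
            for drug in drugs:
                degree[drug] = degree.get(drug, 0) + 1
        top = max(degree.values())
        chosen = rng.choice(sorted(d for d, k in degree.items() if k == top))
        # Step (ii): cover the drug and remove every sample connected to it.
        covered.add(chosen)
        remaining = {s: d for s, d in remaining.items() if chosen not in d}
    return covered

def best_of_runs(graph, runs=100, seed=0):
    """Repeat the randomized greedy search and keep the smallest result."""
    rng = random.Random(seed)
    return min((highest_degree_first(graph, rng) for _ in range(runs)), key=len)

toy_graph = {"s1": {"a", "b"}, "s2": {"b", "c"}, "s3": {"c", "d"}}
estimate = best_of_runs(toy_graph)
```

The result is a valid hitting set by construction, but its size is only an upper bound on the minimal hitting set size.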
Simulated annealing algorithm
Given a strain-drug response graph, introduce the state variable $x_i$, taking the value $x_i = 1$ when element (drug) $i$ is covered and 0 otherwise, and the energy or cost function $E = \sum_i x_i$ counting the number of covered elements. Proceed as follows: (i) Generate a random set cover and set an initial inverse temperature $\beta = \beta_0$. The random set cover does not need to be of minimal size. We generate it by covering one element (drug) selected at random from each set (strain) with at least one element. (ii) Perform $T_{eq}$ equilibration steps. At each step randomly select an element. If it is covered, and uncovering it does not leave any set uncovered, then uncover it. If it is uncovered, then cover it with probability $e^{-\beta}$, where $\beta$ is the equivalent of the inverse temperature in physics. (iii) Increase $\beta$, $\beta \to \beta + \Delta\beta$, and return to step (ii). Stop the loop when some convergence criterion is satisfied or $\beta = \beta_{\max}$. Note: the generation of the initial state and the application of rule (ii) introduce randomness in the algorithm and, as a consequence, different runs may result in different outcomes. Specifically, we may obtain different estimated minimal hitting set sizes and/or different hitting sets with the same size. This fact can be exploited by running the algorithm several times and retaining those solutions having the minimum reported hitting set size. In the NCI60 study we identified all minimal hitting sets using $\beta_0 = 0$, $\Delta\beta = 0.1$, $\beta_{\max} = 20$, $T_{eq} = 10 \times$ number of drugs and 1,000 random covering seeds. A run for each seed took 92 seconds on a 1.86 GHz desktop computer; 1,000 seeds took 25 and a half hours.
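A compact Python sketch of this annealing schedule is given below. It is an illustration under stated assumptions, not the original implementation: the function name `anneal_hitting_set`, the graph layout, and the choice to remember the smallest cover seen so far are ours, and a covered drug is uncovered whenever doing so leaves every strain hit by at least one other covered drug. The default parameters mirror the values quoted for the NCI60 study.

```python
import math
import random

def anneal_hitting_set(graph, beta0=0.0, dbeta=0.1, beta_max=20.0,
                       steps_per_drug=10, rng=None):
    """Simulated annealing estimate of a minimal hitting set.

    graph: dict mapping strain -> set of drugs giving a good response.
    """
    rng = rng or random.Random()
    strains = {s: set(d) for s, d in graph.items() if d}
    drugs = sorted(set().union(*strains.values()))
    if not drugs:
        return set()
    strains_of = {d: [s for s, ds in strains.items() if d in ds] for d in drugs}
    # Step (i): random initial cover, one drug picked per strain.
    covered = {rng.choice(sorted(ds)) for ds in strains.values()}
    hits = {s: len(ds & covered) for s, ds in strains.items()}
    best = set(covered)
    t_eq = steps_per_drug * len(drugs)
    levels = int(round((beta_max - beta0) / dbeta))
    for level in range(levels + 1):
        beta = beta0 + level * dbeta  # step (iii): raise inverse temperature
        for _ in range(t_eq):         # step (ii): equilibration steps
            drug = rng.choice(drugs)
            if drug in covered:
                # Uncover only if every strain it hits stays hit by another drug.
                if all(hits[s] > 1 for s in strains_of[drug]):
                    covered.remove(drug)
                    for s in strains_of[drug]:
                        hits[s] -= 1
                    if len(covered) < len(best):
                        best = set(covered)
            elif rng.random() < math.exp(-beta):
                covered.add(drug)
                for s in strains_of[drug]:
                    hits[s] += 1
    return best

toy_graph = {"s1": {"a", "b"}, "s2": {"b", "c"}, "s3": {"c", "d"}}
result = anneal_hitting_set(toy_graph, rng=random.Random(1))
```

The current cover is a hitting set at every step, so the returned set is always valid; as with the greedy method, repeating the run with different seeds and keeping the smallest results is how multiple minimal hitting sets would be collected.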
Research at the IAS was funded by the Simons Foundation and the Helen and Martin Chooljian Founders' Circle Member.
- Borisy AA, Elliott PJ, Hurst NW, Lee MS, Lehár J, Price ER, Serbedzija G, Zimmermann GR, Foley MA, Stockwell BR, Keith CT: Systematic discovery of multicomponent therapeutics. Proc Natl Acad Sci USA. 2003, 100: 7977-7982. 10.1073/pnas.1337088100.
- Shoemaker RH: The NCI60 human tumour cell line anticancer drug screen. Nature Rev Cancer. 2006, 6: 813-823. 10.1038/nrc1951.
- Wong PK, Yu F, Shahangian A, Cheng G, Sun R, Ho C: Closed-loop control of cellular functions using combinatory drugs guided by a stochastic search algorithm. Proc Natl Acad Sci USA. 2008, 105: 5105-5110. 10.1073/pnas.0800823105.
- Yildirim MA, Goh KL, Cusick M, Barabási AL, Vidal M: Drug-target network. Nature Biotech. 2007, 25: 1119-1126. 10.1038/nbt1338.
- Nelander S, Wang W, Nilsson B, She QB, Pratilas C, Rosen N, Gennemark P, Sander C: Models from experiments: combinatorial drug perturbations of cancer cells. Mol Syst Biol. 2008, 4: 216. 10.1038/msb.2008.53.
- Campillos M, Kuhn M, Gavin A, Jensen LJ, Bork P: Drug target identification using side effect similarity. Science. 2008, 321: 263-266. 10.1126/science.1158140.
- Garey MR, Johnson DS: Computers and intractability: A guide to the theory of NP-completeness. 2002, WH Freeman, New York.
- Vazirani V: Approximation Algorithms. 2004, Springer, Berlin.
- Johnson DS: Approximation algorithms for combinatorial problems. J Comp Syst Sci. 1974, 9: 256-278.
- Teachey DT, Sheen C, Hall J, Ryan T, Brown VI, Fish J, Reid GS, Seif AE, Norris R, Chang YJ, Carroll M, Grupp SA: mTOR inhibitors are synergistic with methotrexate: an effective combination to treat acute lymphoblastic leukemia. Blood. 2008, 112: 2020-2023. 10.1182/blood-2008-02-137141.
- Mézard M, Tarzia M: Statistical mechanics of the hitting set problem. Phys Rev E. 2007, 76: 041124. 10.1103/PhysRevE.76.041124.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.