 Research
 Open Access
 Published:
Predicting and understanding comprehensive drugdrug interactions via seminonnegative matrix factorization
BMC Systems Biology volume 12, Article number: 14 (2018)
Abstract
Background
Drugdrug interactions (DDIs) always cause unexpected and even adverse drug reactions. It is important to identify DDIs before drugs are used in the market. However, preclinical identification of DDIs requires much money and time. Computational approaches have exhibited their abilities to predict potential DDIs on a large scale by utilizing premarket drug properties (e.g. chemical structure). Nevertheless, none of them can predict two comprehensive types of DDIs, including enhancive and degressive DDIs, which increases and decreases the behaviors of the interacting drugs respectively. There is a lack of systematic analysis on the structural relationship among known DDIs. Revealing such a relationship is very important, because it is able to help understand how DDIs occur. Both the prediction of comprehensive DDIs and the discovery of structural relationship among them play an important guidance when making a coprescription.
Results
In this work, treating a set of comprehensive DDIs as a signed network, we design a novel model (DDINMF) for the prediction of enhancive and degressive DDIs based on seminonnegative matrix factorization. Inspiringly, DDINMF achieves the conventional DDI prediction (AUROC = 0.872 and AUPR = 0.605) and the comprehensive DDI prediction (AUROC = 0.796 and AUPR = 0.579). Compared with two stateoftheart approaches, DDINMF shows it superiority. Finally, representing DDIs as a binary network and a signed network respectively, an analysis based on NMF reveals crucial knowledge hidden among DDIs.
Conclusions
Our approach is able to predict not only conventional binary DDIs but also comprehensive DDIs. More importantly, it reveals several key points about the DDI network: (1) both binary and signed networks show fairly clear clusters, in which both drug degree and the difference between positive degree and negative degree show significant distribution; (2) the drugs having large degrees tend to have a larger difference between positive degree and negative degree; (3) though the binary DDI network contains no information about enhancive and degressive DDIs at all, it implies some of their relationship in the comprehensive DDI matrix; (4) the occurrence of signs indicating enhancive and degressive DDIs is not random because the comprehensive DDI network is equipped with a structural balance.
Background
When two or more drugs are taken together, they would unexpectedly influence each other in terms of pharmacokinetic or pharmacodynamical behavior [1]. This kind of influence is termed as DrugDrug Interaction (DDI), which would cause adverse drug reactions (e.g. the reduction in efficacy or the increment on unexpected toxicity among the coprescribed drugs). As the number of approved drugs increases, the number of unidentified DDIs is rapidly rising, such that more and more adverse effects among the drugs may occur. Unidentified DDIs would lead patients, who are treated with numerous medications, to be in the unsafe treatment of medication errors [2,3,4,5]. Understanding DDI is also the first step towards drug combination, which is one of promising strategies for multifactorial complex diseases [6]. Therefore, it is an urgent need to screen and analyze DDIs before clinical medications are administered. However, traditional experimental approaches for DDI identification (e.g. testing cytochrome P450 [7] or transporterassociated interactions [8]) face challenges, such as high cost, long duration, animal welfare considerations [9], the very limited number of participants and the great number of drug combinations under screening in clinical trials. As a result, only a few of DDIs could be identified during drug development (usually the clinical trial phase), and most of them are reported after the drugs are approved.
Computational approaches are promising to discover potential DDIs on a large scale, and they have gained many concerns from both academy and industry recently [10, 11]. Textmining based computational approaches have been developed for detecting DDIs from different sources [9], including scientific literature [12, 13], electronic medical records [14], and the Adverse Event Reporting System of FDA (http://www.fda.gov). These approaches heavily rely on the clinical evidence in postmarket, thus cannot provide alerts of potentially DDIs before clinical medications are administered. In contrast, machine learningbased computational approaches (e.g. naïve similaritybased approach [15], network recommendationbased [9], classificationbased [16]) are able to provide such alerts by utilizing premarketed drug features or similarities [17], such as, chemical structures [15], targets [18], hierarchical classification codes [16] and side effects [9, 19]. Most of the existing machine learningbased approaches were designed for conventional binary prediction, which only indicates how likely a pair of drugs is a DDI. However, two interacting drugs may increase or decrease their own pharmaceutical behaviors or effects in vivo.
It is more important to know exactly whether the interaction increases or decreases the pharmaceutical behaviors of the drugs when making optimal patient care, establishing the dosage of a drug, designing prophylactic drug therapy, or finding the resistance to therapy with a drug [20]. As one of the important pharmaceutical indices, serum concentration reflects the amount of a drug in the pharmacokinetic circulation [21]. When interacting with other drugs, a drug would increase or decrease the level of its own and its partners’ serum concentration. For example, the serum concentration of Dofetilide (whose DrugBank Id is DB00204) decreases when it is taken with Dabrafenib (DB08912) together, whereas its serum concentration increases when taken with Dalfopristin (DB01764). For short, we name the first case of DDI as a degressive DDI and the second case of DDI as an enhancive DDI in the following texts.
To summarize, predicting approaches for comprehensive DDIs helps to uncover the underlying mechanism of how DDIs occur [19]. However, most of current approaches have been developed only for conventional binary DDIs, but not for enhancive and degressive DDIs. Furthermore, there is a lack of systematic analysis on the structural relationship hidden among known DDIs. Revealing such a structural relationship is very important, because it is able to help understand how DDIs occur. A promising solution of these two issues is able to guide medical doctors to make safe coprescriptions.
In this paper, to address abovementioned issues, we firstly design a novel model (DDINMF) for DDI prediction based on Nonnegative Matrix Factorization (NMF). Representing DDIs as a binary network and a signed network respectively, we then make an attempt to reveal the structural relationship hidden among DDIs by NMF and social balance theory.
Methods
Dataset
We collected 2329 approved drugs from DrugBank [22] and selected 603 drugs, which have DDIs recorded in DrugBank. Among the set of drugs, we also removed those drugs without chemical structures or without the offlabel side effects recorded in OFFSIDES [23], and kept the remaining 568 drugs having both of them. The final DDI dataset contains 21,351 DDIs, including 16,757 enhancive DDIs (EDDI) and 4594 degressive DDIs (DDDI). Each drug is represented as an 881dimensional feature vector f_{ str } based on PubChem structure descriptor and also a 9149 dimensional feature vector f_{ se } according to the offlabel side effects provided by OFFSIDES [23]. The second feature f_{ se } was proposed firstly in [19]. These two vectors are binary, of which a value of 1 denotes the occurrence of a specific structure fragment in its chemical structure or the observation of a specific side effect in clinic, and 0 if this does not occur or is not observed. By organizing DDIs as a DDI graph, we also investigated the degrees of DDIs, enhancive DDIs and degressive DDIs respectively. The details of our DDI dataset are listed in Table 1.
Problem formulation
Without loss of generality, let D = {d_{ i }}, i = 1, 2, …, m be a set of m approved drugs. Their interactions can be organized as an m × m symmetric interaction matrix A_{m × m} = {a_{ ij }}, which can be regarded as the adjacent matrix of DDI network. For the conventional DDI, a_{ ij } = 1 if d_{ i } interacts with d_{ j }, and a_{ ij } = 0 otherwise. For the comprehensive DDI, a_{ ij } ∈ {−1, 0, +1}. In details, if d_{ i } and d_{ j } do not interact with each other, a_{ ij } = 0. When there is an enhancive DDI or a degressive DDI between d_{ i } and d_{ j }, a_{ ij } = + 1 or a_{ ij } = − 1 respectively. Obviously, as the special case of the comprehensive DDI matrix A, the conventional binary DDI matrix A_{ b } can be obtained by A_{ b } = Binary(A).
In addition, each drug d_{ i } in D is represented as a pdimensional feature vector f_{ i } = [f_{1}, f_{2}, …, f_{ k }, …, f_{ p }], where f_{ k } = 1 denotes the kth specific structure fragment or observed side effect and f_{ k } = 0 otherwise . All the drugs in D are sequentially organized as an m × p feature matrix F_{m × p}.
To obtain a brief description, a known drug is referred to as a drug in D and a new drug is defined as the drug having no interaction with any drug in D. A screening scenario for new drugs is considered in this paper. Formally, the task is to predict how likely a newly given drug d_{ x } interacts with one or more known drugs, and how likely a DDI of d_{ x } is an enhancive DDI or a degressive DDI. Fig. 1 illustrates the task.
Prediction model
Since the new drugs have no existing interaction with the drugs in D and are regarded as isolated nodes in the DDI network (Fig. 1), we cannot utilize their topological information to deduce their potential interactions. This is also the wellknown cold start problem in recommendation system. Obviously, we need their additional properties (e.g. chemical structure fragments or side effects), which are called drug features in terms of machine learning. Once the additional features are obtained, we would build a supervised model to perform DDI prediction. The main idea is to build the relationship between the features of known drugs in D and their DDI network topology. We built the model (DDINMF) for predicting DDI based on nonnegative matrix factorization. It includes two phases (Fig. 2).
(1) In the training phase of DDINMF, the adjacent matrix A_{m × m} among m known drugs is first decomposed into two matrices A_{m × m} ≈ W_{m × r} × H_{r × m}, where r is the dimension of the latent topological space. Denote F_{m × p} as the input feature matrix of those m known drugs (the training drugs). Then the relationship between F_{m × p} and H_{r × m} can be modeled by a regression (H^{T})_{m × r} = F_{m × p} × B_{p × r}, where B_{p × r} is the regression coefficient matrix.
(2) In the predicting phase, the learned B_{p × r} firstly maps the n × p input feature matrix of n newly given drugs (denoted as F_{ x }) into the latent topological space by \( {\mathbf{H}}_x^T={\mathbf{F}}_x\times \mathbf{B} \). Then the n × r mapped latent feature matrix H_{ x } is used to generate the predicted interactions between the new drug and the known drugs by A_{ x } = (WH_{ x })^{T}.
DDINMF applies the regular nonnegative matrix factorization (NMF) to decompose the DDI adjacent matrix when given a conventional binary adjacent matrix, and applies a variant of NMF, semiNMF, when given a comprehensive adjacent matrix. A brief introduction of both NMF and semiNMF can be found in the next section. The regression in DDINMF is solved by Partial Least Square Regression (PLSR) because there is a multicollinearity between some columns of F or H. The model of multivariate PLSR can be solved by SIMPLS algorithm [24].
Nonnegative matrix factorization and seminonnegative matrix factorization
Given an instance matrix X_{m × n} = [x_{ 1 }, x_{ 2 }, …, x_{ n }] ∈ ℝ^{+}, Nonnegative Matrix Factorization (NMF) aims to find a base matrix W_{m × r} = [w_{ ir }] ∈ ℝ^{+} and an encoding matrix H_{r × n} = [h_{ ri }] ∈ ℝ^{+}, whose product can well approximate the original matrix X^{+} ≈ W^{+}H^{+}, where r < min(m, n). These approximating factors are typically obtained by solving the constrained least square minimization problem [25]:
The algorithm minimizing the objective function is as follows:
The factor matrices W and H in the regular NMF are able to provide basic clustering. If we view W = [w_{1}, w_{2}, …, w_{ r }] as the cluster centroids, then H = [h_{1}, h_{2}, …, h_{ n }] can be viewed as the cluster indicators for each datapoint. Especially, when X^{+} is a symmetric binary nonnegative matrix (e.g. the adjacent matrix of DDI A_{ b }), W denotes the communities in a network while H denotes the membership of instances to those communities.
Though there is a variant of NMF, symmetric NMF (SymNMF) [26], which looks more natural to the symmetric adjacent matrix of binary DDIs, it has disadvantages from a technical point of view. SymNMF has higher complexity than NMF, thus runs slower. Most importantly, it has a larger reconstructed error, especially a higher divergence of the diagonal elements in its reconstructed matrix comparing to allzero diagonal of the original matrix. Therefore, we apply NMF but not SymNMF to decompose the binary adjacent matrix of DDI.
NMF has a strong constraint of X ∈ ℝ^{+}. When the instance matrix has mixed signs (e.g. the adjacent matrix of comprehensive DDI A ), we consider SemiNMF X^{±} ≈ W^{±}H^{+}, in which both \( \mathbf{X}\in {\mathrm{R}}_{\pm}^{m\times n} \) and \( \mathbf{W}\in {\mathbf{R}}_{\pm}^{m\times r} \) have mixed signs while only \( \mathbf{H}\in {\mathbf{R}}_{+}^{r\times n} \) is restricted to comprise strictly nonnegative components [27]. SemiNMF has the same form of cost/objective function as that of NMF, except for no constraint on W. But the algorithm minimizing the objective function is totally different:
where H^{†} is the Moore–Penrose pseudoinverse of H, A^{pos} is a matrix that has the negative elements of matrix A replaced with 0, and A^{neg} is the one that has the positive elements of A replaced with 0:
Though there are other methods of matrix factorization (e.g. Principal Component Analysis, PCA) that can be used to decompose the comprehensive adjacent matrix of DDIs, semiNMF has the exclusive advantage that both its W and H show physical meanings, such as the ordinary degree, the difference of positive degree and negative degree, and the balance properties. See also Section Results and Discussion.
Finally, NMF and SemiNMF are adopted to decompose the binary and the comprehensive adjacent matrices of DDIs in our predicting model respectively.
Assessment
Kfold Crossvalidation (KCV) is the wellestablished approach to validate the power of generalization of algorithms in machine learning. To reflect the fact that new drugs have no interaction and to avoid overoptimistic prediction [17, 28,29,30,31], the KCV scheme should be elaborately designed. For the given drugs having NO known interaction, the KCV scheme tries to assess the task of predicting new potential interactions between them and those drugs having known interactions. The task corresponding to the KCV scheme is useful when one tries to extend existing coprescriptions by adding new drugs.
The generation of both training samples and testing samples is as follows. 1/K drugs are randomly removed out of all the given drugs in D as the testing drugs and the remaining drugs are taken as the training drugs. The drug pairs among the training drugs are selected as the training samples. The testing drugs are regarded as new drugs, and only the drug pairs between the testing drugs and the training drugs are selected as the testing samples, which are blind to the training (see also Fig. 1). Finally, the above procedures are repeated K times and the average of predicting performance in all rounds of CV is taken as the final performance. Usually, K = 10.
Two measures are usually adopted to assess the predicting performance, including the area under receiver operating characteristic curve (AUROC) and the area under precisionrecall curve (AUPR) [32]. In the prediction of conventional binary DDIs, since positive and negative samples are interactions and noninteractions, both AUROC and AUPR can be calculated by comparing their predicted scores. Considering that the predicted scores of enhancive DDIs tend to be greater than ZERO and the predicted scores of degressive DDIs tend to be less than ZERO, we need to extend the calculation of AUROC and AUPR. In the prediction of comprehensive DDIs, to adopt the same way of calculating AUROC and AUPR, both enhancive and degressive DDIs first are labeled as positive samples while noninteractions are still labeled as negative samples. Then, the union of the predicted scores of enhancive DDIs and the MINUS of the predicted scores of degressive DDIs is compared with those scores of noninteractions by the same way as that in measuring conventional prediction.
Results and Discussion
Feature preprocessing
Because each drug was represented as an 881dimensional feature vector f_{ str } or as a 9149dimensional feature vector f_{ se }, we reduced the high dimensions to accelerate the calculation. Though the value of the reduced dimension was unknown, we estimated it by performing ordinary PCA on the input feature matrix and counting the obtained PCs from the first one until the one, of which the entries having the values near to zeros. Then, using the number of PCs as the estimated dimension, we applied Kernel PCA [33] to the original input feature matrix and obtained the mapped feature matrix having significantly smaller dimensions. Gaussian function was adopted as the kernel in Kernel PCA with its variance equal to 4 for f_{ str } and 16 for f_{ se }. Finally, the reduced dimensions of f_{ str } and f_{ se } were 487 and 557 respectively.
Comparison with the stateoftheart approaches
Before running prediction, we assigned two parameters in our DDINMF, including the dimension of latent space (r) in matrix factorization and the number of latent factors (k) in PLSR as follows. We fixed the former with rank(A)/2 and tuned the latter from the list {10, 20, 30, 40, 50, 60, 70, 80, 90, 100} with the prediction measure of AUROC under 10CV. Using the structure feature f_{ str } and the side effect feature f_{ se }, we run NMF for binary DDI prediction and SemiNMF for comprehensive DDI prediction respectively. When performing the conventional prediction of binary DDIs, we turned the entries equal to − 1 into + 1 in A to form the binary DDI adjacent matrix A _{ b } (see also Section Problem Formulation). The best values of the number of latent factors (k = 40 and k = 100 respectively) were picked up for two kinds of DDI prediction. In addition, we concatenated f_{ str }with f_{ se } and run the prediction again (k = 100). The results show that f_{ str } are significantly better than both f_{ se } and their concatenation (Fig. 3). Therefore, only using the chemical structure feature f_{ str }, we performed DDI prediction in the following experiments.
First, we compared our approach with two stateoftheart approaches, including Naïve similaritybased approach [15] and label propagationbased approach [9]. To obtain the robust predicting performance, 10CV was repeated 50 times under different random seeds. Their final performance was reported by the average performance over 50 repetitions of CV (Table 2). The results show that DDINMF is significantly superior to them in terms of both AUROC and AUPR, and also indicate that the comprehensive prediction is more difficult than the conventional prediction.
We also compared NMF with SymNMF in the case of binary DDI prediction. We decomposed our binary DDI adjacent matrix with the dimension of latent space = rank(A)/2 by NMF and SymNMF respectively. NMF spends only 4.76 s on the binary adjacent matrix of DDIs while SymNMF spends 97.6 s using a computer equipped with Windows 10 (64bits), Intel Core i74700MQ 2.40G and 16 GB RAM. On the other hand, the reconstructed error of NMF is only 26.1122 while that of SymNMF is up to 94.7374. The mean and standard derivation of the diagonal elements in the reconstructed matrix achieved by NMF are 0.3888 and 0.4003, whereas those achieved SymNMF are 1.4352 and 0.8303 respectively. In addition, the performance (AUROC = 0.870 and AUPR = 0.583) achieved by SymNMF is worse than those achieved by NMF (see also Table 2), especially on AUPR. In summary, NMF is better than SymNMF.
Similarly, we compared semiNMF with PCA in the case of comprehensive DDI prediction. PCA achieves the prediction performance with AUROC = 0.780 and AUPR = 0.535. Obviously, semiNMF with AUROC = 0.796 and AUPR = 0.579 outperforms PCA. More importantly, semiNMF has the advantage that both its W and H show physical meanings, such as the ordinary degree, the difference of positive degree and negative degree, and the balance properties. See also next section.
Finally, considering the desirable performance of predicting conventional DDIs, we performed a novel prediction. In detail, we randomly collected 50 extra drugs from DrugBank and checked whether they interact with any of the original 568 drugs. According the predicted scores, the topk drug pairs then were picked out and compared with the records in DrugBank. We counted the records matching the drug pairs of interest. For convenience, the number of such records over k is called hitting ratio. We checked the records from top10 to top100 with step 10 (Table 3). The validation demonstrates that our DDINMF is effective to predict DDIs for newly coming drugs.
Properties of DDI network
In this section, two questions are lifted up: (1) how the decomposed factor matrices W and H reflect the properties of DDI network; (2) whether the binary DDI matrix A_{ b }implies some relationship about enhancive and degressive DDIs in the comprehensive DDI matrix. Since NMF has the nature of clustering, the analysis based on it would give insights on the corresponding answers.
To describe briefly, we refer to a set of DDIs as a binary DDI network or a signed DDI network according to the type of DDI. For a drug, we also refer to the number of it DDIs, the number of its enhancive DDIs, and the number of its degressive DDIs as Degree, Positive Degree, and Negative Degree respectively. In this context, we treated the rows of W as the communityderived features \( \left\{{\mathbf{f}}_i^{\mathbf{W}}\right\} \) and the columns of H as the encoding features \( \left\{{\mathbf{f}}_i^{\mathbf{H}}\right\} \) of drugs respectively. The former indicates how each drug contributes to comprise the communities, which account for all the columns of W, while the latter indicates how likely each drug belongs to those communities. Since their dimensions are high, we applied kernel PCA to them and used the first 3 principal components (PC) of the mapped instances to visualize them in a 3d space.
We firstly tried to visualize the binary DDI network (Fig. 4). Each datapoint in the two spaces is rendered by its drug degrees (Fig. 4a and c) and the difference between its positive degree and negative degree (Fig. 4b and d) respectively. In Fig. 4, Red means the largest degree or the degree difference, blue means the smallest value and other colors transiting from blue to red denote the middle values in an ascending order. Several interesting aspects can be found:
(1) As expected, the mapped spaces show significant clusters. Four significant branches are observed in the mapped space of communityderived features (Fig. 4a) while both two big clusters and one small cluster are found in the mapped space of encoding features (Fig. 4b). Degree shows a fairly clear distribution along those branches or in those clusters. For example, the Spearman Correlation between the degrees of the data points in the biggest branch (Fig. 4a) and their first mapped PC (denoted as X) is up to 0.9250.
(2) Surprisingly, the difference between Positive Degree and Negative Degree shows a significant distribution. Especially, both the branch X in Fig. 4c and the left cluster in Fig. 4d contain all the data points, of which their positive degrees are definitely greater than their negative degrees. In addition, the points having large degrees tend to have a large difference between Positive Degree and Negative Degree. The observation on binary DDIs surprisingly implies that the occurrence of signs indicating enhancive and degressive DDIs is not random.
Similarly, we then tried to visualize the comprehensive DDI network (Fig. 5). The mapped space of communityderived features contains four sectors, which show a significant distribution of both degree and positivenegative degree difference in the plane of the first and the third PCs (denoted as XZ). The mapped space of encoding features also shows that the positive degrees of data points are definitely greater than their negative degrees and that the points having large degrees tend to have a large difference between positive degree and negative degree. Differentially, the mapped space of encoding features presents that its first PC (denoted as X) and the degree are almost completely correlated with each other (Spearman Correlation = 0.9822). In general, the observation is consistent with that in Fig. 4.
In summary, one just keeps it in mind that the binary DDI network contains NO information about enhancive and degressive DDIs at all, but it shows a consistency with the comprehensive DDI network, especially about Degree, Positive Degree and Negative Degree. Therefore, the binary DDI network implies some relationship about enhancive and degressive DDIs in the comprehensive DDI matrix. In other words, the occurrence of signs indicating enhancive and degressive DDIs is not random.
To validate this point further, we checked the signed DDI network to see whether or not it is of a structural balanced network. The relationships in signed networks can be positive (“enhancive”, “like”, “friend”) or negative (“degressive”, “dislike”, “enemy”). Thus, there are four kinds of triangle relation patterns according to the type of DDI, including (positive, positive, positive), (positive, negative, negative), (negative, positive, positive) and (negative, negative, negative). The first two triangles are called the balanced pattern, the third is called as the unbalanced pattern and the last is named as the weakly balanced pattern. We counted four kinds of triangle patterns in the comprehensive (signed) DDI network and listed their numbers as follows: 374,215, 47,389, 46,478 and 7773. Obviously, ~ 90% triangles are balanced patterns. After checking the balance tendency in the mapped space of communityderived features, we found: (1) all the triangles (54,722) among the drugs having the first PC < 0 are purely balanced, and (2) the interactions of the drugs having the second PC < 0 are purely positive and negative in the two branches respectively. In addition, the small cluster in the mapped space of encoding features contains no weakly balanced triangles but 253,284 balanced triangles as well as 8597 imbalanced triangles. The results demonstrate that the comprehensive DDI network is equipped with a structural balance, such that the occurrence of enhancive and degressive DDIs is not random. The observation helps understand how DDIs gather together.
Conclusions
Existing computational approaches are able to screen potential DDIs on a large scale before drugs are used in the medicine market. However, none of them can predict comprehensive DDIs, including enhancive and degressive DDIs, though it is important to know whether the interaction increases or decreases the behavior of the interacting drugs before making a coprescription. Moreover, the important structural relationship hidden among DDIs still remains unknown.
To address the abovementioned issues, we have designed a novel approach DDINMF for comprehensive DDI prediction in different scenarios. Treating the set of DDI as a network, it provides a unified framework for predicting conventional binary DDIs as well as comprehensive DDIs. Experiments on the dataset demonstrate that DDINMF is significantly superior to two stateoftheart approaches in the conventional binary DDI prediction and also shows an acceptable performance in the comprehensive DDI prediction. More importantly, DDINMF enables us to discover crucial knowledge hidden among DDIs, including degree distribution and tendency, as well as the implication of binary network to signed network, especially the structural balance of comprehensive DDI network.
There are still unknown but important factors among DDIs to be uncovered. In the future, we will focus on extracting and integrating more features of drug (e.g. drug targets and pathway) together to achieve improved DDI prediction, building an interpretable mapping between drug features and DDI, and digging out more structural patterns of comprehensive/signed DDI network and even their dynamic properties.
Abbreviations
 AUPR:

The area under precisionrecall curve
 AUROC:

The area under the receiver operating characteristic curve
 CV:

Crossvalidation
 DDI:

Drugdrug interaction
 NMF:

Nonnegative matrix factorization
 PCA:

Principal Components Analysis
 PLSR:

Partial LeastSquares Regression
References
Wienkers LC, Heath TG. Predicting in vivo drug interactions from in vitro drug discovery data. Nat Rev Drug Discov. 2005;4(10):825–33.
Leape LL, Bates DW, Cullen DJ, Cooper J, Demonaco HJ, Gallivan T, Hallisey R, Ives J, Laird N, Laffel G, et al. Systems analysis of adverse drug events. ADE prevention study group. JAMA. 1995;274(1):35–43.
Businaro R. Why we need an efficient and careful pharmacovigilance. Aust J Pharm. 2013;1:4.
Karbownik A, Szałek E, Sobańska K, Grabowski T, Wolc A, Grześkowiak E. Pharmacokinetic drugdrug interaction between erlotinib and paracetamol: a potential risk for clinical practice. Eur J Pharm Sci. 2017;102:55–62.
Mulroy E, Highton J, Jordan S. Giant cell arteritis treatment failure resulting from probable steroid/antiepileptic drugdrug interaction. N Z Med J. 2017;130(1450):102–4.
Zhao XM, Iskar M, Zeller G, Kuhn M, van Noort V, Bork P. Prediction of drug combinations by integrating molecular and pharmacological data. PLoS Comput Biol. 2011;7(12):e1002323.
Veith H, Southall N, Huang R, James T, Fayne D, Artemenko N, Shen M, Inglese J, Austin CP, Lloyd DG, et al. Comprehensive characterization of cytochrome P450 isozyme selectivity across chemical libraries. Nat Biotechnol. 2009;27(11):1050–5.
Huang SM, Temple R, Throckmorton DC, Lesko LJ. Drug interaction studies: study design, data analysis, and implications for dosing and labeling. Clin Pharmacol Ther. 2007;81(2):298–304.
Zhang P, Wang F, Hu J, Sorrentino R. Label propagation prediction of drugdrug interactions based on clinical side effects. Sci Rep. 2015;5:12339.
Wiśniowska B, Polak S. The role of interaction model in simulation of drug interactions and QT prolongation. Curr Pharmacol Rep. 2016;2(6):339–44.
Zhou D, Bui K, Sostek M, AlHuniti N. Simulation and prediction of the drugdrug interaction potential of Naloxegol by physiologically based pharmacokinetic modeling. CPT Pharmacometrics Syst Pharmacol. 2016;5(5):250–7.
Bui QC, Sloot PMA, van Mulligen EM, Kors JA. A novel featurebased approach to extract drugdrug interactions from biomedical text. Bioinformatics. 2014;30(23):3365–71.
Zhang Y, Wu HY, Xu J, Wang J, Soysal E, Li L, Xu H. Leveraging syntactic and semantic graph kernels to extract pharmacokinetic drug drug interactions from biomedical literature. BMC Syst Biol. 2016;10 Suppl 3:67.
Duke JD, Han X, Wang ZP, Subhadarshini A, Karnik SD, Li XC, Hall SD, Jin Y, Callaghan JT, Overhage MJ, et al. Literature based drug interaction prediction with clinical assessment using electronic medical records: novel Myopathy associated drug interactions. PLoS Comput Biol. 2012;8(8):e1002614.
Vilar S, Uriarte E, Santana L, Lorberbaum T, Hripcsak G, Friedman C, Tatonetti NP. Similaritybased modeling in largescale prediction of drugdrug interactions. Nat Protoc. 2014;9(9):2147–63.
Cheng F, Zhao Z. Machine learningbased prediction of drugdrug interactions by integrating drug phenotypic, therapeutic, chemical, and genomic properties. J Am Med Inform Assoc. 2014;21(e2):e278–86.
Li JX, Shi JY, Gao K, Lei P, Yiu SM. Predicting combinative drug pairs via integrating heterogeneous features for both known and new drugs. Springer International Publishing Switzerland: ISBRA: 2016; 2016. p. 297–8.
Luo H, Zhang P, Huang H, Huang J, Kao E, Shi L, He L, Yang L. DDICPI, a server that predicts drugdrug interactions through implementing the chemicalprotein interactome. Nucleic Acids Res. 2014;42(Web Server issue):46–52.
Shi JY, Huang H, Li JX, Lei P, Zhang YN, Yiu SM. In: IWBBIO, editor. Predicting comprehensive drugdrug interactions for new drugs via triple matrix factorization, vol. 2017. Spain: Lecture Notes in Computer Science: Bioinformatics and Biomedical Engineering, Springer; 2017. p. 108–17.
KochWeser J. Serum drug concentrations in clinical perspective. Ther Drug Monit. 1981;3(1):3–16.
Bichteler A, Wikoff DS, Loko F, Harris MA. Estimating serum concentrations of dioxinlike compounds in the US population effective 20052006 and 20072008: a multiple imputation and trending approach incorporating NHANES pooled sample data. Environ Int. 2017;105:112–25.
Law V, Knox C, Djoumbou Y, Jewison T, Guo AC, Liu Y, Maciejewski A, Arndt D, Wilson M, Neveu V, et al. DrugBank 4.0: shedding new light on drug metabolism. Nucleic Acids Res. 2014;42(Database issue):D1091–7.
Tatonetti NP, Ye PP, Daneshjou R, Altman RB. Datadriven prediction of drug effects and interactions. Sci Transl Med. 2012;4(125):125ra131.
Dejong S. Simpls  an alternative approach to partial leastsquares regression. Chemometr Intell Lab. 1993;18(3):251–63.
Lee DD, Seung HS. Algorithms for nonnegative matrix factorization. Adv Neur In. 2001;13:556–62.
Kuang D, Ding C, Park H. Symmetric nonnegative matrix factorization for graph clustering. Anaheim: 12th SIAM International Conference on Data Mining; 2012. p. 106–17.
Ding C, Li T, Jordan MI. Convex and seminonnegative matrix factorizations. IEEE Trans Pattern Anal Mach Intell. 2010;32(1):45–55.
Pahikkala T, Airola A, Pietila S, Shakyawar S, Szwajda A, Tang J, Aittokallio T. Toward more realistic drugtarget interaction predictions. Brief Bioinform. 2015;16(2):325–37.
Shi JY, Yiu SM, Li YM, Leung HCM, Chin FYL. Predicting drugtarget interaction for new drugs using enhanced similarity measures and supertarget clustering. Methods. 2015;83:98–104.
Shi JY, Gao K, Shang XQ, Yiu SM. LCMDS: a novel approach of predicting drugdrug interactions for new drugs via DempsterShafer theory of evidence: BIBM IEEE; 2016. p. 512–5.
Shi JY, Li JX, Lu HM. Predicting existing targets for new drugs base on strategies for missing interactions. BMC Bioinformatics. 2016;17(Suppl 8):282.
Jiao Y, Du P. Performance measures in evaluating machine learning based bioinformatics predictors for classifications. Quant Biol. 2016;4(4):320–30.
Scholkopf B, Smola A, Muller KR. Nonlinear component analysis as a kernel eigenvalue problem. Neural Comput. 1998;10(5):1299–319.
Acknowledgements
The author would like to thank the reviewers for their constructive comments that help make the paper much clearer.
Funding
Research in this work was supported by Aviation Science Fund of China (Grant No.2016ZC53028), the Program of Peak Experience in Northwestern Polytechnical University (2016), the Seed Foundation of Innovation and Creation for Graduate Students in Northwestern Polytechnical University(2018), and China National Training Programs of Innovation and Entrepreneurship for Undergraduates (201710699330). The publication of this article was sponsored by the Fundamental Research Funds for the Central Universities of China (No. 3102017jghk02010).
Availability of data and materials
The dataset used in this work can be download from https://github.com/JustinShi2016/APBC2018.
About this supplement
This article has been published as part of BMC Systems Biology Volume 12 Supplement 1, 2018: Selected articles from the 16th Asia Pacific Bioinformatics Conference (APBC 2018): systems biology. The full contents of the supplement are available online at https://bmcsystbiol.biomedcentral.com/articles/supplements/volume12supplement1.
Author information
Affiliations
Contributions
HY and JYS conceived and designed the experiments, and draft the manuscript. KTM and HH collected the dataset and performed the experiments. ZC and KD analyzed the results. KTM and HH contributed materials/analysis tools. HY and JYS developed the codes used in the analysis. SMY helped to draft the manuscript. JYS is the corresponding author. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Yu, H., Mao, KT., Shi, JY. et al. Predicting and understanding comprehensive drugdrug interactions via seminonnegative matrix factorization. BMC Syst Biol 12, 14 (2018). https://doi.org/10.1186/s1291801805327
Published:
DOI: https://doi.org/10.1186/s1291801805327
Keywords
 Drugdrug interaction
 Nonnegative matrix factorization
 Regression
 Network community
 Balance theory