An improved sparse representation model with structural information for Multicolour Fluorescence In-Situ Hybridization (M-FISH) image classification
- Jingyao Li^{1},
- Dongdong Lin^{1},
- Hongbao Cao^{1, 2} and
- Yu-Ping Wang^{1, 2, 3}
https://doi.org/10.1186/1752-0509-7-S4-S5
© Li et al.; licensee BioMed Central Ltd. 2013
Published: 23 October 2013
Abstract
Background
Multicolour Fluorescence In-Situ Hybridization (M-FISH) images are employed for detecting chromosomal abnormalities such as translocations, deletions, duplications and inversions. This technique uses mixed colours of fluorochromes to paint whole chromosomes for rapid detection of chromosome rearrangements. The M-FISH data sets used in our research are obtained from microscopic scanning of a metaphase cell labelled with five different fluorochromes and DAPI staining. The reliability of the technique lies in accurate classification of chromosomes (24 classes for male and 23 classes for female) from M-FISH images. However, due to imaging noise, mis-alignment between multiple channels and many other imaging problems, classification errors occur and lead to incorrect detection of chromosomal abnormalities. Accurately classifying the different types of chromosomes from M-FISH images therefore remains a challenging problem.
Methods
This paper presents a novel sparse representation model that incorporates structural information for the classification of M-FISH images. The sparse representation based classification model proposed in our previous work employed only individual pixel information. By using the structural information of neighbouring pixels together with the information of each pixel itself, the new approach extends the previous one to the regional case. Based on Orthogonal Matching Pursuit (OMP), we developed a simultaneous OMP (SOMP) algorithm to derive an efficient solution of the improved sparse representation model.
Results
The p-value from the comparison of the two models shows that the newly proposed model incorporating structural information is significantly superior to our previous one. In addition, we evaluated the effect of several parameters, such as sparsity level, neighbourhood size, and training sample size, on the classification accuracy.
Conclusions
The comparison with our previously used sparse model demonstrates that the improved sparse representation model is more effective than the previous one on the classification of the chromosome abnormalities.
Keywords
Background
To detect the chromosomal abnormalities associated with genetic diseases or cancers with the M-FISH technique, it is important to improve the accuracy of chromosome classification. Before classification, preprocessing methods [3–7] are necessary to increase the accuracy by reducing the noise of the original images. There are two major types of classifiers: pixel-by-pixel classifiers [8–10] and region-based classifiers [6, 7]. For classification, we have proposed a Bayesian classifier [11] and sparse representation based classification (SRC) [12]. For segmentation, we have developed an Adaptive Fuzzy C-Means (AFCM) segmentation method [6]. To bring the imaging technique into clinical use, further effort is needed to improve the classification accuracy.
Sparse representation methods, including compressive sensing, have recently been widely studied in applied mathematics and signal/image processing for their advantages in processing high dimensional data [13, 14]. Many algorithms are available for solving sparse models, e.g., greedy algorithms such as Matching Pursuit (MP) [15] and OMP [16], as well as Homotopy [17]. Recently, Multiple Measurement Vectors (MMV) based models have also been proposed to recover a set of vectors that share a common support. Such models find wide application in many research fields, e.g., multiple signal classification (MUSIC) [18], blind multiband signal reconstruction [19] and compressive diffuse optical tomography [20]. Motivated by these efforts on the MMV problem, we proposed a novel sparse representation model that incorporates structural information into the classification of M-FISH image sets, which was reported in our preliminary study [21]. This improved model considers the correlations of neighbouring pixels, which often share the same features and belong to the same class. By utilizing information both from the neighbourhood of a pixel and from different spectral channels, the proposed sparse model gives better classification results than the sparse model we used before [12].
The paper is organized as follows. First, we introduce the SRC model without structural information and then propose an improved sparse model as well as the corresponding algorithm (i.e., SOMP) for estimating the solution. Next, we apply the improved model to M-FISH classification and compare it with the conventional sparse model employed in our previous work [12]. Finally, the paper concludes with a short summary and discussion of the proposed model.
Methods
The SRC model has been successfully used in many fields (e.g., hyperspectral imaging classification [22] and M-FISH chromosome classification [12]). Before introducing the improved sparse model, we first review the sparse model and show how to apply it to M-FISH image data analysis. Then, we present the improved sparse model with structural information for M-FISH chromosome classification, which utilizes the correlated information of the neighbouring pixels within a region. Finally, we describe the numerical algorithm, SOMP, for solving this improved model.
SRC algorithm for M-FISH data
where m represents the number of different classes, and ${\widehat{x}}_{i}$ is the sparse solution corresponding to class i. The class of y is determined by assigning y to the class for which the distance between y and the estimated reconstruction ${A}_{i}{\widehat{x}}_{i}$ is minimal.
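The decision rule described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the helper names `omp` and `src_classify`, the sparsity level `k0`, and the toy dictionary are assumptions made for the example, and a plain OMP loop is used as the sparse solver.

```python
import numpy as np

def omp(A, y, k0):
    """Plain OMP (illustrative): pick up to k0 columns of A to approximate y."""
    support = []
    residual = y.astype(float).copy()
    coef = np.zeros(0)
    for _ in range(k0):
        scores = np.abs(A.T @ residual)
        if support:
            scores[support] = -np.inf  # never reselect an atom
        support.append(int(np.argmax(scores)))
        # Least-squares fit of y on the atoms selected so far
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
        if np.linalg.norm(residual) < 1e-10:
            break
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x

def src_classify(A, labels, y, k0=3):
    """Assign y to the class i with the smallest residual ||y - A_i x_i||_2."""
    x_hat = omp(A, y, k0)
    classes = np.unique(labels)
    residuals = [
        np.linalg.norm(y - A[:, labels == c] @ x_hat[labels == c])
        for c in classes
    ]
    return classes[int(np.argmin(residuals))]

# Toy dictionary (hypothetical): 5 spectral channels, two atoms per class
A = np.eye(5)[:, :4]
labels = np.array([0, 0, 1, 1])
y = 0.9 * A[:, 0] + 0.1 * A[:, 1]   # a pixel built from class-0 atoms
print(src_classify(A, labels, y))
```

The residual is computed per class using only the coefficients that belong to that class, so a pixel explained mostly by one class's atoms yields a small residual for that class only.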
Improved sparse model with structural information for M-FISH data analysis
where $\lVert X\rVert_{0,q}$ indicates the number of non-zero rows of X, and $x^{i}$ indicates the i-th row of X. $I\left(\lVert x^{i}\rVert_{q}>0\right)$ is an indicator function that has the value 1 if $\lVert x^{i}\rVert_{q}>0$ and 0 otherwise. In this work, we set $q=2$. The solution vectors $\{x_{j}\}_{j=1,\ldots,s}$ have row-wise sparsity (i.e., their non-zero entries lie in the same rows), which reflects the high correlation of the neighbouring pixels.
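As a small illustration of the row-wise sparsity measure just defined, the following sketch counts the rows of a matrix whose q-norm is non-zero. The function name `row_sparsity` and the tolerance argument `tol` are assumptions introduced for the example.

```python
import numpy as np

def row_sparsity(X, q=2, tol=0.0):
    """Count the rows of X whose q-norm exceeds tol (the row-wise l0 measure)."""
    row_norms = np.linalg.norm(X, ord=q, axis=1)
    return int(np.sum(row_norms > tol))

# Three neighbouring pixels (columns) sharing the same two active atoms (rows)
X = np.array([
    [0.9, 0.8, 0.7],
    [0.0, 0.0, 0.0],
    [0.1, 0.2, 0.3],
    [0.0, 0.0, 0.0],
])
print(row_sparsity(X))  # two non-zero rows
```

With floating-point solutions, a small positive `tol` is usually a safer choice than an exact comparison with zero.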
where $y_{c}$ is the central pixel of a neighbourhood and $\lVert Y-A_{i}\widehat{X}_{i}\rVert_{F}$ is the residual between an input matrix Y, consisting of the neighbouring pixels around $y_{c}$, and the product of the solution $\widehat{X}_{i}$ and the corresponding sub-matrix $A_{i}$. The class with the minimum residual is the class to which the central pixel is assigned.
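The neighbourhood-based decision rule can be sketched as follows. This is an illustration under stated assumptions: `classify_neighbourhood` is a hypothetical helper, the toy dictionary is invented, and the row-sparse solution is replaced by a pseudo-inverse fit purely to keep the example self-contained.

```python
import numpy as np

def classify_neighbourhood(A, labels, Y, X_hat):
    """Label the central pixel by the class minimising ||Y - A_i X_i||_F."""
    classes = np.unique(labels)
    residuals = []
    for c in classes:
        mask = labels == c
        # Reconstruct the whole neighbourhood from the atoms of class c only
        residuals.append(np.linalg.norm(Y - A[:, mask] @ X_hat[mask, :], ord="fro"))
    return classes[int(np.argmin(residuals))], residuals

# Toy data (hypothetical): 5 channels, two atoms per class, 3 neighbouring pixels
A = np.eye(5)[:, :4]
labels = np.array([0, 0, 1, 1])
Y = A[:, :2] @ np.array([[0.9, 0.8, 0.7],
                         [0.1, 0.2, 0.3]])
X_hat = np.linalg.pinv(A) @ Y   # stand-in for a row-sparse solver such as SOMP
label, residuals = classify_neighbourhood(A, labels, Y, X_hat)
print(label, residuals)
```

Because every column of Y contributes to the Frobenius residual, a single noisy pixel in the neighbourhood has less influence on the label than it would in the pixel-by-pixel rule.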
Algorithms for the solution of the improved sparse model
Simultaneous Orthogonal Matching Pursuit (SOMP) algorithm
Algorithm 1: SOMP |
---|
(1): Input: training sample matrix $A$, testing sample matrix $Y$ |
(2): Output: row-wise sparse solution $\widehat{X}$ |
(3): Initialization: residual $R_{0}=Y$, $\widehat{X}_{0}=0$, non-zero row support $\Omega=\varnothing$, i = 1 |
(4): While stopping criterion false do |
1). Find a new atom from matrix A that best approximates the current residual based on the q-norm: $w=\arg\max_{k\notin \Omega}\lVert a_{k}^{T}R_{i-1}\rVert_{q}$ |
2). Update the non-zero row support $\Omega=\Omega\cup \left\{w\right\}$. |
3). Update the signal estimate $\widehat{X}_{i}=\left(A_{\Omega}^{T}A_{\Omega}\right)^{+}A_{\Omega}^{T}Y$, where $A_{\Omega}$ denotes the sub-matrix of A consisting of the atoms indexed by $\Omega$, and the residual: $R_{i}=Y-A_{\Omega}\widehat{X}_{i}$. |
4). i = i + 1. |
(5): End while |
(6): Return: $\widehat{X}=\widehat{X}_{i-1}$ |
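A minimal NumPy sketch of Algorithm 1 is given below. It is not the authors' code. The stopping criterion is not spelled out in the algorithm, so the sketch assumes it is "at most `k0` atoms selected, or residual norm below `tol`"; both parameter names are introduced here. Atom selection is restricted to atoms not yet in the support, as is usual for greedy pursuit.

```python
import numpy as np

def somp(A, Y, k0, q=2, tol=1e-8):
    """Simultaneous OMP sketch: find a row-sparse X with A @ X close to Y.

    A: (n, N) dictionary, Y: (n, s) neighbouring pixels as columns.
    Returns the (N, s) solution and the sorted list of selected atoms.
    """
    n_atoms, s = A.shape[1], Y.shape[1]
    support = []
    residual = Y.astype(float).copy()
    coef = np.zeros((0, s))
    while len(support) < k0 and np.linalg.norm(residual) > tol:
        # Step 1: q-norm of each atom's correlation with all residual columns
        corr = np.linalg.norm(A.T @ residual, ord=q, axis=1)
        if support:
            corr[support] = -np.inf  # skip atoms already selected
        # Step 2: grow the shared support
        support.append(int(np.argmax(corr)))
        # Step 3: least-squares update; pinv(A_sub) equals (A^T A)^+ A^T
        A_sub = A[:, support]
        coef = np.linalg.pinv(A_sub) @ Y
        residual = Y - A_sub @ coef
    X_hat = np.zeros((n_atoms, s))
    if support:
        X_hat[support, :] = coef
    return X_hat, sorted(support)

# Toy example (hypothetical data): three pixels sharing atoms 0 and 2
A = np.eye(5)[:, :4]
X_true = np.zeros((4, 3))
X_true[0] = [1.0, 0.9, 1.1]
X_true[2] = [0.5, 0.4, 0.6]
Y = A @ X_true
X_hat, support = somp(A, Y, k0=2)
print(support)
```

All columns of Y share one support, which is what distinguishes SOMP from running OMP independently on each pixel.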
Results and analysis
M-FISH database
Segmentation of chromosome regions
M-FISH training and testing data
The improved sparse model with structural information was applied to the classification of M-FISH image data. Twenty cells (10 male, 10 female) were chosen from our database [24]. The features of the different types of chromosomes were constructed by randomly sampling pixels from M-FISH images to form the training matrix A, such that the sampled features satisfy the sparsity concentration index (SCI) criterion proposed in [25]. SCI measures the sparsity concentration of the feature vectors. Matrix A is an n×N matrix, in which n represents the spectral dimension of the pixels and N represents the number of training features. In the case of M-FISH image data, n equals 5. After the training matrix was constructed, the remaining pixels were used as testing data to validate our proposed classification method.
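For reference, the SCI of a coefficient vector x over k classes, as defined in [25], is (k · max_i ‖δ_i(x)‖₁ / ‖x‖₁ − 1) / (k − 1), where δ_i(x) keeps only the coefficients of class i. The sketch below computes it. The function name `sci` and the toy vectors are assumptions, and how SCI was thresholded when sampling the training pixels is not stated here, so no threshold is shown.

```python
import numpy as np

def sci(x, labels):
    """Sparsity concentration index in [0, 1] for coefficient vector x.

    labels[j] is the class of the j-th atom. A value of 1 means all
    coefficient mass lies in one class; 0 means it is spread evenly.
    """
    classes = np.unique(labels)
    k = len(classes)
    total = np.sum(np.abs(x))
    if total == 0 or k < 2:
        return 0.0  # convention chosen for this sketch
    per_class = [np.sum(np.abs(x[labels == c])) for c in classes]
    return float((k * max(per_class) / total - 1) / (k - 1))

labels = np.array([0, 0, 1, 1, 2, 2])
concentrated = np.array([0.0, 0.0, 2.0, 1.0, 0.0, 0.0])  # all mass in class 1
spread = np.ones(6)                                      # equal mass per class
print(sci(concentrated, labels), sci(spread, labels))
```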
The analysis of the classification results with different models
The correct classification ratio of each class in an M-FISH image
Class number | New sparse model | General sparse model | Class number | New sparse model | General sparse model |
---|---|---|---|---|---|
1 | 0.951282 | 0.894359 | 13 | 0.903226 | 0.895439 |
2 | 0.988194 | 0.961629 | 14 | 0.832192 | 0.785388 |
3 | 0.930451 | 0.929825 | 15 | 0.969388 | 0.926304 |
4 | 0.972441 | 0.919948 | 16 | 1 | 0.993921 |
5 | 0.983595 | 0.905136 | 17 | 1 | 0.998273 |
6 | 0.975627 | 0.965181 | 18 | 0.929553 | 0.917526 |
7 | 0.967769 | 0.953719 | 19 | 1 | 0.977175 |
8 | 0.959322 | 0.881356 | 20 | 0.930556 | 0.882937 |
9 | 0.997059 | 0.978431 | 21 | 0.832817 | 0.758514 |
10 | 1 | 0.991313 | 22 | 0.997361 | 0.960422 |
11 | 0.958773 | 0.967402 | 23 | 0.981279 | 0.976599 |
12 | 0.997038 | 0.99309 | 24 | 1 | 0.990506 |
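As a quick descriptive reading of the table, the following sketch counts the classes in which each model has the higher ratio. It only summarises the tabulated values and is not the significance test used in the paper; the variable names are introduced here.

```python
import numpy as np

# Per-class correct classification ratios transcribed from the table
# (classes 1..24, in order).
new_model = np.array([
    0.951282, 0.988194, 0.930451, 0.972441, 0.983595, 0.975627,
    0.967769, 0.959322, 0.997059, 1.0, 0.958773, 0.997038,
    0.903226, 0.832192, 0.969388, 1.0, 1.0, 0.929553,
    1.0, 0.930556, 0.832817, 0.997361, 0.981279, 1.0,
])
general_model = np.array([
    0.894359, 0.961629, 0.929825, 0.919948, 0.905136, 0.965181,
    0.953719, 0.881356, 0.978431, 0.991313, 0.967402, 0.99309,
    0.895439, 0.785388, 0.926304, 0.993921, 0.998273, 0.917526,
    0.977175, 0.882937, 0.758514, 0.960422, 0.976599, 0.990506,
])

diff = new_model - general_model
wins = int(np.sum(diff > 0))                           # classes where the new model is higher
worse_classes = (np.where(diff < 0)[0] + 1).tolist()   # 1-based class numbers
print(wins, worse_classes)
print(new_model.mean(), general_model.mean())
```

The new model has the higher ratio in 23 of the 24 classes. Class 11 is the only one where the general sparse model scores higher.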
Significance analysis of the new sparse model with structural information
Effects of parameters used
Conclusions and discussion
The sparse model based classifier that we proposed previously [12] performed pixel-by-pixel classification and overlooked structural information, so its results contain many more isolated spots, which lowers the classification accuracy. In this paper we proposed an improved sparse model, in which the information of a central pixel and of its neighbouring pixels is used simultaneously for improved classification. This is validated by the comparison of chromosomal classification accuracy between the two models on a real M-FISH database [24]. The comparison (as illustrated by Figure 5) shows that there are more isolated spots (i.e., misclassifications) in the classification results of our previous model [12] than in those of the new sparse model incorporating structural information. The correct classification ratio in Table 2 also shows the improved accuracy of the improved sparse model. The statistical comparison between the two models indicates that the new sparse model with structural information is superior to the previously used sparse model, with a significance level of less than 1e-6. The effects of the model parameters on the classification accuracy were also investigated. We have shown how the sparsity level ($K_{0}$), the neighbourhood size (s) and the training sample size ($N_{i}$) affect the RCC of our improved sparse model incorporating structural information, and how the training sample size ($N_{i}$) affects the RCC of both our previously used model and the improved model. A proper choice of sparsity level ($K_{0}\le 5$) and neighbourhood size (s = 9) is recommended based on our experiments.
In summary, the results show that the proposed improved sparse model incorporating structural information can significantly improve the classification accuracy compared with the general sparse model that we proposed before [12]. This will in turn improve the M-FISH imaging technique for detecting chromosome abnormalities to better diagnose genetic diseases and cancers.
Declarations
Acknowledgements
Based on "Classification of multicolor fluorescence in-situ hybridization (M-FISH) image using structure based sparse representation model", by Jingyao Li, Dongdong Lin, Hongbao Cao and Yu-Ping Wang, which appeared in 2012 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). © 2012 IEEE [21]. This work has been partially supported by the NIH and NSF.
Declarations
The publication costs for this article were funded by the corresponding author.
This article has been published as part of BMC Systems Biology Volume 7 Supplement 4, 2013: Selected articles from the IEEE International Conference on Bioinformatics and Biomedicine 2012: Systems Biology. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcsystbiol/supplements/7/S4.
Authors’ Affiliations
References
- Schrock E, duManoir S, Veldman T, Schoell B, Wienberg J, FergusonSmith MA, Ning Y, Ledbetter DH, BarAm I, Soenksen D, et al.: Multicolor spectral karyotyping of human chromosomes. Science. 1996, 273 (5274): 494-497. 10.1126/science.273.5274.494.
- Speicher MR, Ballard SG, Ward DC: Karyotyping human chromosomes by combinatorial multi-fluor FISH. Nat Genet. 1996, 12 (4): 368-375. 10.1038/ng0496-368.
- Choi H, Bovik AC, Castleman KR: Feature normalization via expectation maximization and unsupervised nonparametric classification for M-FISH chromosome images. IEEE Trans Med Imaging. 2008, 27 (8): 1107-1119.
- Choi H, Castleman K, Bovik A: Joint segmentation and classification of M-FISH chromosome images. Conf Proc IEEE Eng Med Biol Soc. 2004, 3: 1636-1639.
- Cao HB, Deng HW, Wang YP: Segmentation of M-FISH Images for Improved Classification of Chromosomes With an Adaptive Fuzzy C-means Clustering Algorithm. IEEE T Fuzzy Syst. 2012, 20 (1): 1-8.
- Karvelis PS, Fotiadis DI, Tsalikakis DG, Georgiou IA: Enhancement of multichannel chromosome classification using a region-based classifier and vector median filtering. IEEE Trans Inf Technol Biomed. 2009, 13 (4): 561-570.
- Karvelis PS, Tzallas AT, Fotiadis DI, Georgiou I: A multichannel watershed-based segmentation method for multispectral chromosome classification. IEEE Trans Med Imaging. 2008, 27 (5): 697-708.
- Sampat MP, Bovik AC, Aggarwal JK, Castleman KR: Pixel-by-pixel classification of MFISH images. 24th IEEE Ann Intern Conf (EMBS). 2002, Houston, TX, 2: 999-1000.
- Schwartzkopf WC, Bovik AC, Evans BL: Maximum-likelihood techniques for joint segmentation-classification of multispectral chromosome images. IEEE T Med Imaging. 2005, 24 (12): 1593-1610.
- Sampat MP, Bovik AC, Aggarwal JK, Castleman KR: Supervised parametric and non-parametric classification of chromosome images. Pattern Recogn. 2005, 38 (8): 1209-1223. 10.1016/j.patcog.2004.09.010.
- Wang YP, Castleman KR: Normalization of multicolor fluorescence in situ hybridization (M-FISH) images for improving color karyotyping. Cytom Part A. 2005, 64A (2): 101-109. 10.1002/cyto.a.20116.
- Cao HB, Deng HW, Li M, Wang YP: Classification of Multicolor Fluorescence In Situ Hybridization (M-FISH) Images With Sparse Representation. IEEE T Nanobiosci. 2012, 11 (2): 111-118.
- Simoncelli EP, Olshausen BA: Natural image statistics and neural representation. Annu Rev Neurosci. 2001, 24: 1193-1216. 10.1146/annurev.neuro.24.1.1193.
- Li Y, Cichocki A, Amari S: Analysis of sparse representation and blind source separation. Neural Comput. 2004, 16 (6): 1193-1234. 10.1162/089976604773717586.
- Mallat SG, Zhang ZF: Matching Pursuits with Time-Frequency Dictionaries. IEEE T Signal Proces. 1993, 41 (12): 3397-3415. 10.1109/78.258082.
- Tropp JA, Gilbert AC: Signal recovery from random measurements via orthogonal matching pursuit. IEEE T Inform Theory. 2007, 53 (12): 4655-4666.
- Donoho DL, Tsaig Y: Fast Solution of l(1)-Norm Minimization Problems When the Solution May Be Sparse. IEEE T Inform Theory. 2008, 54 (11): 4789-4812.
- Kim JM, Lee OK, Ye JC: Compressive MUSIC: Revisiting the Link Between Compressive Sensing and Array Signal Processing. IEEE T Inform Theory. 2012, 58 (1): 278-301.
- Mishali M, Eldar YC: Blind Multiband Signal Reconstruction: Compressed Sensing for Analog Signals. IEEE T Signal Proces. 2009, 57 (3): 993-1009.
- Lee O, Kim JM, Bresler Y, Ye JC: Compressive diffuse optical tomography: noniterative exact reconstruction using joint sparsity. IEEE Trans Med Imaging. 2011, 30 (5): 1129-1142.
- Li J, Lin D, Cao H, Wang Y: Classification of multicolor fluorescence in-situ hybridization (M-FISH) image using structure based sparse representation model. Bioinformatics and Biomedicine (BIBM), 2012 IEEE International Conference on: 4-7 October 2012. 2012, 1-6. 10.1109/BIBM.2012.6392672.
- Chen Y, Nasrabadi NM, Tran TD: Hyperspectral Image Classification Using Dictionary-Based Sparse Representation. IEEE T Geosci Remote. 2011, 49 (10): 3973-3985.
- Tropp JA, Gilbert AC, Strauss MJ: Algorithms for simultaneous sparse approximation. Part I: Greedy pursuit. Signal Process. 2006, 86 (3): 572-588. 10.1016/j.sigpro.2005.05.030.
- M-Fish Database website. [https://sites.google.com/site/xiaobaocao006/database-for-download]
- Wright J, Yang AY, Ganesh A, Sastry SS, Ma Y: Robust Face Recognition via Sparse Representation. IEEE T Pattern Anal. 2009, 31 (2): 210-227.
Copyright
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.