Identifying Tmem59 related gene regulatory network of mouse neural stem cell from a compendium of expression profiles

Background Neural stem cells offer a potential treatment for neurodegenerative disorders such as Alzheimer's disease (AD). While much progress has been made in understanding neural stem cell function, a precise description of the molecular mechanisms regulating neural stem cells has not yet been established. This lack of knowledge is a major barrier to the discovery of therapeutic uses of neural stem cells. In this paper, the regulatory mechanism of mouse neural stem cell (NSC) differentiation by tmem59 is explored at the genome level. Results We identified regulators of tmem59 during the differentiation of mouse NSCs from a compendium of expression profiles. Based on the microarray experiment, we developed the parallelized SWNI algorithm to reconstruct gene regulatory networks of mouse neural stem cells. From the inferred tmem59 related gene network of 36 genes, pou6f1 was identified as a significant regulator of tmem59 that might play an important role in the differentiation of NSCs in mouse brain. Four pathways appear in the gene network, indicating that tmem59 lies downstream in the signalling pathway. The real-time RT-PCR results showed that over-expression of pou6f1 significantly up-regulated tmem59 expression in the C17.2 NSC line. 16 out of 36 predicted genes in our constructed network have been reported to be AD-related, including Ace, aqp1, arrdc3, cd14, cd59a, cds1, cldn1, cox8b, defb11, folr1, gdi2, mmp3, mgp, myrip, Ripk4, rnd3, and sncg. The localization of tmem59 related genes and functionally related gene groups based on Gene Ontology (GO) annotation were also identified. Conclusions Our findings suggest that the expression of tmem59 is an important factor contributing to AD. The parallelized SWNI algorithm significantly increased the efficiency of network reconstruction. This study enables us to highlight novel genes that may be involved in NSC differentiation and provides a shortcut to identifying genes for AD.


Background
One of the main goals of systems biology is to determine biological networks by combining high-performance computing methods with high-throughput data [1,2]. Compared with traditional biology, whose basic strategy is to decipher biological functions by concentrating on a very limited set of molecules, this system-centric approach has had enormous success in producing complex biological networks composed of various types of molecules (genes, proteins, microRNAs, etc.) from large amounts of data [3].
Microarray technology facilitates large-scale surveys of gene expression for whole-genome mapping and gene expression analysis under various conditions [4]. A major focus of microarray data analysis is the reconstruction of gene regulatory networks, which aims to find new gene functions and provide insights into the transcriptional regulation that underlies biological processes [5]. A wide variety of approaches have been proposed to infer gene regulatory networks from microarray data. These approaches are based on different theories, including Boolean networks [6], Bayesian networks [7], relevance networks [8], graphical models [9], genetic algorithms [10], neural networks [11], controlled language-generating automata [12], linear differential equations [13], and nonlinear differential equations [14]. Two difficulties must be addressed when constructing gene networks from gene expression data. First, a single set of gene expression data contains a limited number of time points under a specific condition, so determining the gene regulatory network becomes an ill-posed problem that is difficult to overcome. Second, while microarray experiments collect an increasing amount of data to be correlated, network reconstruction is an NP-hard problem. Applying the statistical framework to a large set of genes therefore requires a prohibitive amount of computing time on a single CPU. A fundamental problem with sequential algorithms is their limited ability to handle large data sets within reasonable time and memory resources.
Neurodegenerative disorders, including Alzheimer's disease (AD), Parkinson's disease and Huntington's disease, involve a progressive loss of neurons. Recently, transplantation of NSCs within the adult brain has been proposed as one of the potential therapies for neurodegenerative disorders [15]. NSCs are multipotent progenitor cells with long-term self-renewal and differentiation capabilities that generate the three major cell types of the central nervous system (CNS): neurons, astrocytes and oligodendrocytes [16]. They are identified as neuroepithelial cells extending from the ventricle to the basal lamina of the pial surface in the initial stage of brain development. During histogenesis, radial glial stem cells divide asymmetrically to produce neurons and give rise to astrocytes. NSCs then become neural progenitor cells residing in the neurogenic regions of the adult brain: the sub-ventricular zone (SVZ) and the sub-granular zone (SGZ) [17][18][19][20].
So far, stem cell therapy for neurodegenerative disorders remains a challenging goal [21]. The mechanisms that control the proliferation, differentiation, migration and integration of NSCs are still poorly understood. Comprehending the gene regulatory network of NSCs by integrating data and analyzing them with efficient algorithms is a crucial part of systems biology.
Mouse transmembrane protein 59 (TMEM59) is an uncharacterized single-pass transmembrane protein. Our previous in vitro study suggested that TMEM59 is differentially expressed during the differentiation of primary NSCs from Sprague-Dawley rat striatum [22]. In particular, down-regulation of TMEM59 by RNA interference in the mouse C17.2 neural stem cell line increases the differentiation of NSCs into neurons and astrocytes [23]. These studies indicated that TMEM59 is related to the differentiation and status maintenance of NSCs. So far, the functions of TMEM59 have not been reported. Exploring the tmem59 related gene regulatory network of NSCs would help us better understand the molecular mechanism underlying NSC differentiation.
In this paper, we constructed gene regulatory networks of mouse NSCs by applying a parallel strategy to the stepwise network inference method. By integrating our microarray data with public data, the regulatory mechanism of mouse NSC differentiation by tmem59 is explored throughout the genome. The important pathways and the core gene, pou6f1, are investigated by real-time RT-PCR, which suggests that over-expression of pou6f1 significantly up-regulates tmem59 expression. We also show that many genes in the tmem59 related gene network have been implicated in AD mechanisms. The findings enable us to highlight novel genes that may be involved in NSC differentiation and provide a shortcut to identifying genes for AD.

Original data
Microarrays simultaneously quantify thousands of genes on a single glass slide, and their use has greatly expanded the breadth of quantified gene expression [24]. In our previous work, six wild-type and tmem59 knockout mice were separately immersed in 75% alcohol for disinfection [25,26]. Under aseptic conditions, the hippocampi were made into a single-cell suspension by mechanical whipping. The supernatant was discarded after centrifugation at 900 rpm for 5 min. The cells were then resuspended in medium (DMEM/F12 culture medium with B27, EGF and bFGF) and cultured in a glass bottle in a CO2 incubator (5% CO2, 37°C). Gene expression was measured 4 days later. To understand the biological functions of tmem59, we investigated the genes that were differentially expressed following tmem59 knockout. From the tmem59 knockout microarray datasets, 627 genes that were differentially expressed with more than a 2-fold change were selected as our source of data (data not shown).

Significantly expressed genes selection
In order to focus on the most significantly expressed genes related to tmem59, we selected 80 genes for further analysis based on the Differential Ratio following tmem59 knockout. The 80 genes and their functions are described in Additional File 1: Table S1.
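The two-stage selection described above (a fold-change cut-off followed by ranking) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes that the Differential Ratio is the knockout/wild-type expression ratio, and the function name `select_genes`, the gene labels and the ratios are hypothetical.

```python
import math

def select_genes(ratios, min_fold=2.0, top_k=None):
    """Keep genes changed by more than `min_fold` in either direction,
    ranked by the magnitude of the change (largest first).

    ratios: dict mapping gene -> knockout/wild-type ratio (assumed > 0).
    """
    cutoff = math.log2(min_fold)
    # Work on |log2 ratio| so that up- and down-regulation are treated alike.
    scored = {gene: abs(math.log2(r)) for gene, r in ratios.items()}
    kept = [gene for gene, score in scored.items() if score > cutoff]
    kept.sort(key=lambda gene: scored[gene], reverse=True)
    return kept if top_k is None else kept[:top_k]

# Hypothetical ratios, for illustration only.
toy_ratios = {"geneA": 4.0, "geneB": 0.2, "geneC": 1.5, "geneD": 0.5}
selected = select_genes(toy_ratios)          # genes with > 2-fold change
top_one = select_genes(toy_ratios, top_k=1)  # strongest change only
```

With real data, `top_k=80` would mirror the reduction from the 627 differentially expressed genes to the 80 genes used for further analysis.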

Public data selection
In order to examine the regulatory mechanism between tmem59 and the corresponding genes, it is necessary to integrate much more microarray data, which can come from either in-house or public sources. A good resource for public microarray data is the National Institutes of Health Gene Expression Omnibus (GEO), http://www.ncbi.nlm.nih.gov/geo/. In this study, all the data we used are MIAME compliant and were selected from GEO.

Microarray data normalization
We converted the probe-level data into standard gene expression data. Because a single gene is typically represented on the array by a set of 11-20 probe pairs, we mapped probes to their corresponding Entrez GeneIDs. Affymetrix probes were mapped to Entrez GeneIDs using the 3 Sep 2010 release of the NetAffx annotations. Where probes had multiple GeneID mappings, the one at the top of the GeneID list was selected, because it has been observed that in the majority of such cases the first identifier tends to be the only one with a published symbol rather than an automatically generated one. We calculated the Average Difference over all the probes of the corresponding gene to compare the expression levels of the probe sets. The more highly a probe set is expressed, the larger its Average Difference. The expression levels of the probe sets mapped to the same gene were then summarized. Probe intensities from Affymetrix oligonucleotide microarrays were normalized to gene expression levels using robust multichip analysis (RMA) [27], which is reported to be the best normalization method compared with MAS5 (Affymetrix), GCRMA, and dChip PM [28]. The use of ratios or raw intensities is governed by the capabilities of the microarray technology, not by our algorithm.
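The probe-to-gene step can be illustrated with a short sketch. It assumes that each probe set carries an ordered list of Entrez GeneIDs, that the first listed GeneID is used, and that probe sets sharing a gene are summarized by their mean; the text does not state the summary statistic, so the mean is an assumption, and the identifiers below are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def probes_to_genes(expression, annotation):
    """Collapse probe-set values to gene-level values.

    expression: dict probe set ID -> normalized expression value
    annotation: dict probe set ID -> ordered list of Entrez GeneIDs
    """
    by_gene = defaultdict(list)
    for probe, value in expression.items():
        gene_ids = annotation.get(probe, [])
        if not gene_ids:
            continue  # probe sets without any GeneID are skipped
        by_gene[gene_ids[0]].append(value)  # first listed GeneID is used
    # Summarize all probe sets of a gene by their mean (assumed statistic).
    return {gene: mean(values) for gene, values in by_gene.items()}

# Hypothetical probe sets and GeneIDs, for illustration only.
toy_expression = {"ps_1": 8.0, "ps_2": 10.0, "ps_3": 6.5, "ps_4": 7.0}
toy_annotation = {
    "ps_1": ["1001"],
    "ps_2": ["1001", "2002"],  # multiple mappings: "1001" is chosen
    "ps_3": ["3003"],
    "ps_4": [],                # unannotated
}
gene_levels = probes_to_genes(toy_expression, toy_annotation)
```

Choosing the first identifier keeps the mapping deterministic, at the cost of ignoring the alternative GeneIDs of ambiguous probe sets.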

Parallelized SWNI Network inference algorithms
We designed and evaluated the Stepwise Network Inference (SWNI) algorithm in a previous study [29]. The SWNI algorithm is a rapid and scalable method for reconstructing gene regulatory networks from gene expression measurements without any prior information about gene functions or network structure. It addresses the small-sample-size problem for high-dimensional data through strict selection in a stepwise regression model. More precisely, the SWNI algorithm infers a module network in two major stages. First, a model is built with ordinary differential equations to describe the dynamics of a gene expression network under perturbation. Second, a regression subset-selection strategy is adopted to choose significant regulators for each gene. Statistical hypothesis testing is then used to evaluate the regression model, and the gene expression network with significant edges and genes is predicted.
However, the SWNI algorithm is essentially a sequential method. When dealing with a large set of genes, it requires a prohibitive amount of computing time. To overcome this computational requirement, we developed a parallel implementation of the SWNI algorithm in this study. Using the message passing interface (MPI), the parallelized SWNI algorithm achieves higher computing efficiency than the sequential SWNI method.
In this study, the multiple public datasets were selected from the same experimental platform as our own microarray data, GPL1261, and were normalized with the RMA algorithm. We subsequently combined all the datasets into a composite training set. A batch adjustment algorithm was applied to the combined training set to ensure that all the datasets were well intermixed [30]. The details of the parallelized SWNI algorithm are as follows.
A gene expression network is described by a set of linear differential equations with the expression level of each gene as a variable:

$$\dot{X} = AX + P$$

where $A = (a_{ij})_{n \times n}$ is the $n \times n$ gene regulatory coefficient matrix, which encodes the connectivity of genes in the predicted network; $X$ is an $n \times m$ matrix of gene expression levels at time $t$; and $P = (p_{ij})_{n \times m}$ is a matrix representing the external stimuli (such as perturbations) or environmental conditions. The computational complexity of the sequential SWNI algorithm is $O(n^3)$. In order to reduce the computational cost, we decomposed $P$ by row to partition the work into parallel tasks.
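The row-wise decomposition can be sketched as follows. This is an illustration under explicit assumptions rather than the SWNI implementation: it assumes the model has the form dX/dt = AX + P and that measurements are taken at steady state, so that each row a_i of A satisfies a_i X = -p_i and can be estimated independently of the other rows. Plain least squares stands in for the stepwise subset selection and hypothesis testing of SWNI, and a thread pool stands in for MPI processes. All function names and numbers are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def solve_linear(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def fit_row(X, p_row):
    """Least-squares estimate of one row a_i from a_i X = -p_i,
    via the normal equations (X X^T) a_i^T = -X p_i^T."""
    n, m = len(X), len(X[0])
    gram = [[sum(X[j][k] * X[l][k] for k in range(m)) for l in range(n)]
            for j in range(n)]
    rhs = [-sum(X[j][k] * p_row[k] for k in range(m)) for j in range(n)]
    return solve_linear(gram, rhs)

def infer_network(X, P, workers=2):
    """One independent task per row of P (that is, per target gene)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda p_row: fit_row(X, p_row), P))

# Hypothetical 2-gene, 3-experiment example: P is built as -A_true X,
# so the estimate should recover A_true.
A_true = [[-1.0, 0.5], [0.0, -2.0]]
X = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]]
P = [[-sum(A_true[i][j] * X[j][k] for j in range(2)) for k in range(3)]
     for i in range(2)]
A_est = infer_network(X, P)
```

Because the tasks share only the read-only matrix X, they need no communication until the rows are gathered, which is the property that makes a row-wise partition attractive for a message-passing implementation.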

Assessment of the parallelized SWNI algorithm
Artificial gene networks with a random scale-free structure were generated, in which the degree distribution of the vertices follows a power law. The parallelized SWNI algorithm and the SWNI algorithm have the same computing precision, which has been discussed in [29]. The performance of the SWNI algorithm was assessed by comparing the inferred network with the pre-determined artificial network.
The performance of the parallel strategy was evaluated on the artificial gene networks in two important respects: speedup and efficiency. Compared with the SWNI algorithm, the parallelized SWNI algorithm performed better in efficiency, and as the number of processors increased, we obtained almost linear speedups.
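Speedup and efficiency are used here in their standard sense: for p processors, speedup is S_p = T_1 / T_p and efficiency is E_p = S_p / p, where T_1 is the sequential running time and T_p the running time on p processors. A minimal sketch follows; the timings are hypothetical and are not the measurements reported in this paper.

```python
def speedup_and_efficiency(t_serial, t_parallel, processors):
    """Return (speedup, efficiency) for one parallel run."""
    speedup = t_serial / t_parallel
    return speedup, speedup / processors

# Hypothetical running times in seconds, keyed by processor count.
toy_timings = {1: 120.0, 2: 62.0, 4: 32.0, 8: 17.0}
performance = {
    p: speedup_and_efficiency(toy_timings[1], t, p)
    for p, t in toy_timings.items()
}
```

A speedup that stays close to p, with an efficiency close to 1, is what "almost linear" speedup means.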

RNA Isolation and Real-time RT-PCR analysis
To study the regulation of tmem59 by pou6f1 and to quantify mRNA by real-time RT-PCR in C17.2 NSCs, we used the ReverTra Ace® qPCR RT kit and SYBR® Green Realtime PCR Master Mix (Toyobo Life Science Department).
C17.2 neural stem cells were plated onto 24-well plates at a density of 5 × 10^5 cells per well and cultured at 37°C with 5% CO2 for 24 hours before transfection. After reaching about 90% confluence, the cells were split. The murine cerebellum-derived immortalized neural stem cell line C17.2 was originally described by Snyder et al. [31].
A full-length cDNA fragment of Pou6f1 was then amplified by RT-PCR using total RNA from mouse brain. The forward primer was 5'-GAAGATCTATGCCCGGGATCAGCAGTC-3' and the reverse primer was 5'-TCCGGAATTCCGGGATCTGAAAGACGTTC-3'. The cDNA was digested with Bgl II/EcoR I and subcloned into the pEGFP-N2 vector, and the construct was sequenced by Invitrogen. A total of 1 μg of pEGFP-N2-Pou6f1 DNA per well was used to transfect C17.2 cells with Lipofectamine 2000 at a proportion of 1:1 (according to the manufacturer's protocol). C17.2 cells transfected with pEGFP-N2 under the same conditions were used as the control group.
Finally, total RNA was isolated from each group according to the manufacturer's standard Trizol protocol (Takara Bio Inc). PCR primers for amplification of the mouse tmem59 gene were specifically designed (Invitrogen). Chloroform and isopropanol were used to extract and precipitate the total RNA. RT-PCR analysis was performed on a PE9700 PCR machine. All reactions were repeated three times. The relative quantity of tmem59 mRNA in the cells was calculated using the equation $RQ = 2^{-\Delta\Delta Ct}$. β-actin was used as the internal control gene for normalization, and the calibrator was the mean threshold cycle (Ct) value of each control group transfected with the pEGFP-N2 vector. The forward primer sequence for the tmem59 gene is 5'-ATGCTTGTCATCTTGGCTG-3' and the reverse primer sequence is 5'-TCACTTCAGAACGACCTCA-3'. The forward primer sequence for β-actin is 5'-TGTCCCTGTATGCCT and the reverse primer sequence is 5'-TCACGCACGATTTCCCTC-3'.
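The relative quantification can be written out step by step. In the usual ΔΔCt scheme, ΔCt is the Ct of the target gene minus the Ct of the internal control, and ΔΔCt is the ΔCt of the treated sample minus the ΔCt of the calibrator. The sketch below follows that scheme; the Ct values are hypothetical and are not measurements from this study.

```python
def relative_quantity(ct_target_sample, ct_ref_sample,
                      ct_target_calibrator, ct_ref_calibrator):
    """RQ = 2 ** (-ddCt), with the reference gene as internal control."""
    delta_sample = ct_target_sample - ct_ref_sample              # dCt, treated
    delta_calibrator = ct_target_calibrator - ct_ref_calibrator  # dCt, control
    delta_delta = delta_sample - delta_calibrator                # ddCt
    return 2.0 ** (-delta_delta)

# Hypothetical Ct values: target reaches threshold one cycle earlier in the
# treated sample, with an unchanged reference gene.
rq = relative_quantity(22.0, 16.0, 23.0, 16.0)
rq_unchanged = relative_quantity(23.0, 16.0, 23.0, 16.0)
```

An RQ above 1 indicates higher target expression in the treated sample than in the calibrator, and an RQ of 1 indicates no change.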

Statistical analysis
Statistical analysis and graph creation were performed with SigmaStat 3.5, SigmaPlot 10.0 and Pajek. Data were obtained from at least three independent experiments. Results are presented as means ± SEM. One-way ANOVA was used to analyze the results of real-time PCR. Proportions were analyzed by z-test, and the Yates correction was applied to the calculations.

NSCs related microarrays are selected
We selected microarrays about NSCs, neurogenesis, glia and the central nervous system (CNS), because NSCs are the principal source of constitutive neurogenesis and glia in the CNS. 146 microarray datasets were selected from 21 different platforms. The species, accession numbers, precise descriptions and number of datasets of the 21 platforms are given in Additional File 2: Table S2. The comparability of gene expression data generated with different microarray platforms is still a matter of concern. Mixing data from various platforms could lead to poor results because of quantitative biases among the technologies [32]. Therefore, we selected only datasets whose profiles come from a single experimental platform, identified as GPL1261 in the GEO database. In particular, we selected 62 mouse stem cell related sample datasets for further analysis from the Affymetrix Mouse Genome 430 2.0 Array ([Mouse430_2]), which includes approximately 45,000 probe sets. The 62 mouse NSC related microarray datasets included in the analysis are listed in Table 1.

The performance of the parallelized SWNI algorithm
Following the scale-free topology, we simulated two types of artificial gene networks, with 1000 nodes and 3054 edges, and with 1500 nodes and 4630500 edges, respectively. The performance of the parallelized SWNI algorithm was assessed on the workstation described in the Methods. The speedup and efficiency of the parallel SWNI algorithm are illustrated in Figure 1, and the running time is shown in Table 2. Figure 1 shows that as the network scale increases, the parallelized SWNI algorithm performs better in both efficiency and speedup. Table 2 shows that as the number of processors increases, the computing time of the algorithm falls dramatically. The results demonstrate that the parallelized SWNI algorithm performs well on the artificial gene networks.

Gene regulatory networks of mouse neural stem cell
GRNs related to tmem59 were constructed from a compendium of expression profiles by the parallelized SWNI algorithm (Figure 2). As illustrated in Figure 2A, NSC-GN1 contains 56 genes and 230 edges, with an average degree of 4. In NSC-GN1, tmem59 is negatively regulated by cd59a and positively regulated by sncg. The global importance of a node in a network can be evaluated by its node degree [33]. The basic principle is that the larger the degree of a node, or the closer the node is to the centre of the network, the more important it is. According to this principle, NSC-GN1 contains 22 important nodes, whose in-degree is higher than the average degree: aqp1, calml4, cd59a, clic6, cxcl1, cyb561, flvcr2, igfbpl1, lgals3bp, pou6f1, psmb8, s3-12, sncg, arrdc3, axud1, cds1, folr1, gpnmb, paqr9, ptprv, ripk4 and slc35f3. Among these 22 nodes, nine are more important still, with an in-degree of twice the average degree or more: arrdc3, axud1, cds1, folr1, gpnmb, paqr9, ptprv, ripk4 and slc35f3.
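The degree-based ranking can be sketched in a few lines. The sketch assumes that the average degree of a directed network is the number of edges divided by the number of nodes, which is consistent with the reported figures (230 edges / 56 genes ≈ 4.1). The toy edge list is hypothetical and is not taken from NSC-GN1.

```python
from collections import Counter

def high_in_degree_nodes(edges):
    """Return (average degree, nodes whose in-degree exceeds it).

    edges: list of (regulator, target) pairs of a directed network.
    """
    nodes = {node for edge in edges for node in edge}
    in_degree = Counter(target for _, target in edges)
    average = len(edges) / len(nodes)  # assumed definition: edges / nodes
    hubs = sorted(node for node in nodes if in_degree[node] > average)
    return average, hubs

# Hypothetical regulator -> target edges, for illustration only.
toy_edges = [("g1", "g4"), ("g2", "g4"), ("g3", "g4"),
             ("g1", "g2"), ("g4", "g5")]
average_degree, important_nodes = high_in_degree_nodes(toy_edges)
```

Replacing the comparison with `in_degree[node] >= 2 * average` would give the stricter subset described for the nine most important nodes.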
In order to focus on more significant genes, we raised the significance threshold of the hypothesis testing in the parallelized SWNI algorithm to remove less significant nodes. NSC-GN1 was thereby reduced to a sparser network, called NSC-GN2 (Figure 2B). Its nodes and edges have higher positive and negative rates than those in NSC-GN1. In NSC-GN2, 36 genes have a significant relationship with tmem59 and 46 significant regulatory relationships were identified, giving an average node degree of 1.2. Pou6f1 regulates 11 genes in NSC-GN2, suggesting that it is the most important gene in the network. Rnd3 and cds1 are each related to 5 different genes. It is worth mentioning that three genes are found to regulate tmem59: it is negatively regulated by cd59a and positively regulated by sncg and myrip. Both cd59a and sncg were also found in NSC-GN1. Combining our results with published data, we constructed an integrated network containing both gene regulations and protein-protein interactions, with 68 nodes and 98 edges (NSC-GN3, illustrated in Figure 2C). The average node degree of NSC-GN3 is 1.4. NSC-GN3 includes 39 genes, 29 encoded proteins, 66 regulatory relationships and 32 protein-protein interactions. It partially captures the gene regulatory relationships of mouse NSCs and the differentiation mechanism of NSCs at the protein level.

Novel regulatory pathways
We used the predicted regulatory network of mouse NSCs to infer new gene interactions. We rearranged the layout of the nodes in NSC-GN2 to obtain NSC-GN4 (Figure 2D). In NSC-GN4, four pathways related to the expression of tmem59 were clearly identified: Pou6f1-Cd59a-Tmem59, Pou6f1-Sncg-Tmem59, Pou6f1-Wfdc2-Rnd3-Mgp-Myrip-Tmem59, and Pou6f1-Wfdc2-Rnd3-Sncg-Tmem59. All four pathways start from the transcription factor pou6f1. Moreover, the expression of tmem59 is regulated directly by myrip, sncg and cd59a, all of which are regulated by pou6f1 directly or indirectly.
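Reading pathways off a network amounts to enumerating the simple directed paths from a regulator to a target. The sketch below does this by depth-first search on the sub-network formed only by the edges named in the four pathways above; it is not the full NSC-GN2, and the function name `simple_paths` is illustrative.

```python
def simple_paths(graph, source, target, path=None):
    """Enumerate all simple directed paths from source to target."""
    path = (path or []) + [source]
    if source == target:
        return [path]
    found = []
    for successor in graph.get(source, []):
        if successor not in path:  # never revisit a node on the current path
            found.extend(simple_paths(graph, successor, target, path))
    return found

# Edges taken from the four pathways listed in the text (regulator -> targets).
pathway_edges = {
    "Pou6f1": ["Cd59a", "Sncg", "Wfdc2"],
    "Wfdc2": ["Rnd3"],
    "Rnd3": ["Mgp", "Sncg"],
    "Mgp": ["Myrip"],
    "Cd59a": ["Tmem59"],
    "Sncg": ["Tmem59"],
    "Myrip": ["Tmem59"],
}
paths = simple_paths(pathway_edges, "Pou6f1", "Tmem59")
direct_regulators = sorted({p[-2] for p in paths})  # last node before Tmem59
```

On this sub-network the search recovers the four pathways, and the nodes immediately upstream of Tmem59 are Cd59a, Myrip and Sncg, in agreement with the direct regulators named in the text.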

A novel regulator, pou6f1, regulates the expression of tmem59
In Figure 2D, pou6f1 appears as a densely connected node, hinting that pou6f1 may play an important role in tmem59 expression. To test this supposition, we constructed an expression vector to over-express the transcription factor POU6F1 fused with EGFP (pEGFP-N2-POU6F1) for real-time observation and quantification in C17.2 NSCs. The results suggested that POU6F1, a transcription factor, was expressed successfully in the nucleus of NSCs, in contrast to the ubiquitous localization of EGFP (Figure 3A, B, C, D). C17.2 NSCs transfected with the pEGFP-N2 vector were used as a control group. C17.2 NSCs showed a 37.06% ± 4.31% (P < 0.01) increase in tmem59 expression caused by the over-expression of pou6f1 (Figure 3E). This study is the first to identify a regulator, pou6f1, that may account for tmem59 expression.

Localization of tmem59 related genes and identification of functional-related gene groups
In NSC-GN2 (Figure 2B), 36 genes were predicted to be related to tmem59, and 27 of them are annotated in Gene Ontology (GO). Among the 27 annotated proteins, 4, 1, 2 and 4 proteins are localized to the cytoplasm, membrane, nucleus and extracellular region, respectively. Figure 4 illustrates that 10.8%, 6.0%, 5.4% and 10.8% of all the 37 proteins in NSC-GN2 are localized to these different sites, while 27% are unannotated. The novel membrane protein TMEM59 has been reported to modulate complex glycosylation [34]. Based on GO annotation, 42% of the 37 proteins, including TMEM59, are involved in metabolism (Table 3), suggesting that most of the genes are functionally similar to tmem59. In addition, more than 20% of the 37 proteins are reported to transport materials within cells. The analysis of the tmem59 related GRN of mouse NSCs highlights new candidate genes involved in (i) peptidase activity, hydrolase activity, kinase activity, and transferase activity; (ii) transport of water, lipids and metal ions; (iii) protein binding; (iv) transcription.

Discussion
Tmem59 has been reported to sustain the status of NSCs in vitro. Knockout of tmem59 in mouse brain induces expression changes in 627 genes in neonatal mouse NSCs. Until now, the underlying function of tmem59, especially in the differentiation of mouse NSCs, has remained unclear. In this study, we sought regulators likely to affect gene expression in mouse NSCs, and new mechanisms of neurodegeneration in AD, from a compendium of expression profiles.
Firstly, 36 genes were identified as tmem59 related. In the predicted network NSC-GN2, tmem59 is regulated directly by cd59a, myrip and sncg. Meanwhile, four pathways from pou6f1 were found in NSC-GN2 to regulate the expression of tmem59. Tmem59 is located downstream in all the pathways, indicating that tmem59 is probably regulated by all the other genes. These conclusions are in accordance with observations from earlier studies [23]. Our study suggests that the 36 genes probably act on the differentiation of NSCs and have functions similar to that of tmem59.
Secondly, our RT-PCR results showed that tmem59 is positively regulated by pou6f1. Pou6f1 has been reported to play an important role during the development of the mouse telencephalon [54]. Our study suggests that the influence of pou6f1 on mouse telencephalon development originates from its effect on NSCs during mouse embryonic development. This study provides further insights into the differentiation of NSCs.
Thirdly, our study suggests that TMEM59 has a localization similar to that of most of its regulators. Recently, TMEM59 was reported to be a Golgi-localized protein that is crucial in modulating complex glycosylation, cell surface expression and secretion of amyloid precursor protein [34]. As is known, proteins in the cytoplasm are synthesized directly on free ribosomes, while other proteins, such as membrane proteins and those transferred to the nucleus, are synthesized in the rough endoplasmic reticulum. Proteins of the second type are transported to their subcellular locations via the Golgi complex. Among the 27 annotated genes in the predicted network NSC-GN2, more than 85% were identified as non-cytoplasmic. This suggests that 85% of the 27 proteins are Golgi-localized during maturation and have a localization similar to that of TMEM59.
Furthermore, our study suggests that the tmem59 related gene regulatory network (NSC-GN2) is probably AD-related. β-amyloid precursor protein (APP), the precursor of β-amyloid protein (Aβ), is the gene in which the first AD-linked mutations were identified. The deposition of Aβ in plaques in the brain has been identified as the cause of AD. As reported, TMEM59 is Golgi-localized in the Hek293 cell line and modulates the complex glycosylation, cell surface expression and secretion of APP. That study indicates that TMEM59 may be associated with AD. In our predicted mouse NSC related network NSC-GN2, three genes that directly regulate Tmem59 are identified: sncg, cd59a and myrip. Sncg (γ-synuclein) has been found to be correlated with dementia in the hippocampus in AD and with the pathology of Parkinson's disease (PD) [55]. Deficiency of the complement regulator cd59a is a cause of neurodegeneration in AD [56]. The Rab27 binding protein MYRIP is involved in insulin exocytosis, the impairment of which is implicated in the pathogenesis of AD [57,58]. In addition, nearly 50% of all the genes in NSC-GN2 have been reported to be directly or indirectly related to AD. Therefore, tmem59, which is directly regulated by cd59a, myrip and sncg, is suggested to be associated with AD, and the unreported genes in NSC-GN2 are probably related to AD as well.

Conclusions
In this study, we predicted the mouse NSC related GRNs with the parallelized SWNI algorithm, integrating data from the tmem59 knockout microarray datasets and 62 mouse stem cell related microarray datasets in GEO. The parallelized SWNI algorithm significantly increased the efficiency of network reconstruction. In particular, a high-confidence network of mouse NSCs (NSC-GN2) was predicted. In this network, 36 key genes regulating tmem59 expression were identified. The RT-PCR result suggested that tmem59 is significantly and positively regulated by pou6f1. Moreover, 17 out of 36 genes in our network, including tmem59, are predicted to be AD related. This is consistent with the published literature.
Table 3 Function of the 37 differentially expressed genes identified in Figure
The present work provides new insights into the gene regulation of NSCs. The parallel methods presented in this paper might also become a scalable tool for large-scale analysis of various types of cells and species, and the integration of multiple datasets will open new research directions in microarray analysis. This study enables us to highlight novel genes that may be involved in NSC differentiation and provides a shortcut to identifying genes for AD.

Additional material
Additional file 1: Table S1, listing the 80 selected genes from the tmem59 knockout microarray experiment included in the analysis. From the tmem59 knockout microarray datasets, 627 genes that were differentially expressed with more than a 2-fold change were selected as our source of data. In order to focus on the most significantly expressed genes related to tmem59, we selected 80 genes for further analysis based on the Differential Ratio following tmem59 knockout. The symbol, Gene ID and function of each gene can be searched in GenBank.
Additional file 2: Table S2, listing the 21 platforms related to the 146 microarray datasets about mouse NSCs. Microarrays about NSCs, neurogenesis, glia and the central nervous system (CNS) were selected, because NSCs are the principal source of constitutive neurogenesis and glia in the CNS. 146 microarray datasets were selected from 21 different platforms for constructing the gene regulatory network of mouse NSCs. The species, accession numbers, precise descriptions and number of datasets of the 21 platforms are given.