Activation of tumor suppressor genes in breast cancer cells by a synthetic chromatin effector

Disease states, such as breast cancer, arise from the disruption of chromatin, the central DNA-protein structures that package human genetic material. Mounting evidence from genome-wide studies across cancers shows that Polycomb-mediated repression of sets of genes, called Polycomb modules, is strongly linked to a poor prognosis. We developed a synthetic transcriptional activator to release silenced genes from the repressed state. The Polycomb-based Transcription Factor (PcTF) is a synthetic effector that accumulates at methyl-histone marks and regulates hundreds of gene targets, including tumor suppressors. We recently reported the activity of PcTF in bone, blood, and brain sarcoma-derived model cell lines. Here, we expand our investigation of PcTF to three breast cancer-derived cell lines. We expressed PcTF in drug-responsive (MCF-7, BT-474) and nonresponsive triple-negative (BT-549) breast cancer cell lines. RNA-seq showed that hundreds of genes were up- or down-regulated by PcTF as early as 24 hours after transfection. BT-549, the triple-negative cancer cell line, showed the highest number of PcTF-activated genes. We demonstrate the anti-cancer potential of PcTF by identifying 15 tumor suppressor genes that are upregulated across the three cell types. The data also provide new mechanistic insights into the relationship between chromatin organization and PcTF-mediated regulation of genes. Our results have exciting implications for cancer treatment with engineered biologics.


INTRODUCTION
Eukaryotic chromosomes are organized as chromatin, a dynamic network of interacting proteins, DNA, and RNA in the nucleus. These interactions regulate gene transcription and coordinate distinct, genome-wide expression profiles in different cell types. Chromatin mediates epigenetic inheritance [1,2] by regulating expression states that persist through cellular mitosis and in generations of sexually reproducing organisms [3,4]. Trimethylation of histone H3 (a component of the nucleosome protein octamer core, the fundamental subunit of chromatin) at lysine 27 (H3K27me3) plays a central role in the epigenetic regulation of genes that control cell differentiation [5,6]. Several landmark studies have revealed that hyperactivity of the histone-methyltransferase enhancer of zeste 1 and 2 (EZH1, EZH2), which generates H3K27me3, is a feature shared by many types of cancer (recently reviewed in [7]). In breast cancer, elevated EZH2 has been linked to cell proliferation and metastasis [8,9] and a poor prognosis for breast cancer patients [10][11][12][13]. In stem cells and cancer cells, EZH2 generates the repressive H3K27me3 mark at nucleosomes near the promoters of developmental genes to prevent differentiation and maintain the proliferative state in stem cells or to generate neoplasia in cancer (reviewed in [5]). Polycomb Repressive Complex 1 (PRC1) binds to the H3K27me3 mark through the polycomb chromodomain (PCD) motif of the CBX protein to stabilize the repressed state. Silencing is reinforced by other chromatin regulators including histone deacetylase (HDAC) and DNA methyltransferase (DMT) [14].
The PRC module, a group of genes that is silenced by H3K27me3 and Polycomb transcriptional regulators [15,16], is a high priority for cancer research and the development of epigenetic drugs. Epigenetic therapy targets aberrant chromatin within cancer cells. This approach overcomes the problem of resistance to hormone therapy, which requires the presence of specific transmembrane receptors on the surfaces of cancer cells. Relatively high expression or upregulation of PRC module genes is associated with a non-proliferative state, cell adhesion, organ development, and normal anatomical structure morphogenesis [15]. Knockdown (depletion) of chromatin proteins (reviewed in [16,17]) and inhibition of Polycomb proteins with low molecular weight compounds and peptides [18][19][20] stimulate expression of developmental genes and perturb cancer-associated cell behavior. The success of epigenetic interventions in clinical trials [21,22] demonstrates that mis-regulated chromatin is a druggable target in cancer. Basic research has revealed certain limitations for epigenetic inhibitor compounds. Inhibitors indirectly activate silenced genes by blocking repressors, generate incomplete conversion of silenced chromatin into active chromatin [23,24], interact with off-target proteins outside of the nucleus [25], and do not affect resistant Polycomb protein mutants [26][27][28]. These limitations can be addressed with alternative molecular technologies.
Biologics, therapies composed of macromolecules such as proteins, are richer in biochemical information compared to low molecular weight compounds. Until recently, none have been designed to decode epigenetic information in cancer cells. To this end, we developed the Polycomb-based Transcription Factor (PcTF), which binds H3K27me3 [29] and recruits endogenous transcription factors to PRC-silenced genes. In bone, brain, and blood cancer-derived cell lines, PcTF expression stimulates transcriptional activation of several anti-oncogenesis genes [30]. PcTF-mediated activation leads to the eventual loss of the silencing mark H3K27me3 and elevation of the active mark H3K4me3 at the tumor suppressor locus CASZ1. To advance PcTF towards medical translation, we sought to investigate the behavior of this protein in breast cancer cell lines that have been established as models for tumorigenesis [31][32][33].
Here, we extend our investigation of PcTF target genes to three breast cancer-relevant cell lines. First, we investigated the transcription profiles of predicted PRC module genes in drug-responsive (MCF-7, BT-474) and nonresponsive triple-negative (BT-549) breast cancer cell lines. Receptor-negative BT-549 cells have a transcription profile and histology similar to aggressive tumor cells from patient samples [34,35]. We show that the transcription profiles of the untreated breast cancer cells are distinct from MCF10A, a breast tissue-derived control cell line. We also show that predicted Polycomb-regulated genes are repressed in the breast cancer cells compared to MCF10A.
Overexpression of PcTF in transfected breast cancer cells led to the upregulation of dozens of genes, including many predicted PRC module genes and 15 well-characterized tumor suppressor genes, as early as 24 hours after transfection. The transcriptome of BT-549 (triple-negative) showed the highest degree of PcTF-sensitivity. Our results also provide new mechanistic insights into the relationship between chromatin structure and activation of genes by an artificial regulator. We observed that PcTF-sensitive genes are enriched for silencing marks and carry low levels of activation-associated chromatin marks, suggesting that PcTF regulates genes that are poised for activation.

Differential regulation of genes in breast cancer cell lines
To determine expression levels of predicted PRC module genes, we profiled the transcriptomes of three breast cancer cell lines and the non-invasive, basal B cell line MCF10A [36,37] [38,39]. The basal class exhibits a stem cell-like expression profile [40], which is consistent with high levels of Polycomb-mediated repression at genes involved in development and differentiation [41,42]. Levels of the repressor protein EZH2 and the histone modification that it generates (H3K27me3) are elevated in MCF7, BT-474, and BT-549 compared to non-metastatic cells such as MCF10A (Table 1).
Comparison of the expression profiles in untreated cells showed that the three breast cancer Differential expression between cell lines for individual genes (Fig. S1) followed trends similar to those observed for the global JSD analysis. We used an expression comparison algorithm (Cuffdiff [50]) to identify genes that were differentially expressed (2-fold or greater difference in expression, q value ≤ 0.05) or similarly expressed (less than 2-fold difference, q value ≤ 0.05) between cell types.
Comparisons that included MCF10A showed the highest numbers of differentially-expressed genes, as well as the lowest numbers of similarly expressed genes. This result further supports transcriptional differences between the cancerous cell lines and MCF10A (Fig. S1).
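The thresholds stated above can be illustrated with a minimal Python sketch. This is not the Cuffdiff implementation; the function name `classify_gene`, the category labels, and the handling of zero expression values are assumptions made for illustration only.

```python
import math

def classify_gene(fpkm_a: float, fpkm_b: float, q_value: float,
                  min_fold: float = 2.0, max_q: float = 0.05) -> str:
    """Label one gene in a pairwise comparison between two cell types.

    Follows the criteria in the text: q value <= 0.05 in both categories,
    with a 2-fold or greater difference called "differential" and a smaller
    difference called "similar". Zero handling is an assumption.
    """
    if q_value > max_q:
        return "not_significant"
    high, low = max(fpkm_a, fpkm_b), min(fpkm_a, fpkm_b)
    if high == 0:
        fold = 1.0          # both values are zero: no difference
    elif low == 0:
        fold = math.inf     # expressed in only one cell type
    else:
        fold = high / low   # direction-independent fold difference
    return "differential" if fold >= min_fold else "similar"

# Hypothetical values, not taken from the study.
print(classify_gene(10.0, 25.0, 0.01))
print(classify_gene(10.0, 15.0, 0.01))
```

The fold difference is computed without regard to direction, so the same function applies to genes that are higher in either cell type.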
Next, we determined expression levels within groups of predicted PRC-regulated genes and observed that expression within these subsets is lower in the three cancer cell types than in MCF10A. We

PcTF-responsive genes include PRC module genes and other loci
We investigated PcTF-mediated gene regulation in the three breast cancer cell lines by profiling the transcriptomes of PcTF-expressing cells (Fig. 2). We transfected cells with PcTF-encoding plasmid DNA (previously described [30]) via Lipofectamine LTX and allowed transfected cells to grow for 24, 48, and 72 hours before extracting total RNA for sequencing. RNA-seq reads were aligned to the human reference genome GRCh38, which included the coding region for PcTF (see Methods). No reads aligned to the PcTF coding sequence in control, untransfected cells. In the transfected cells, PcTF expression levels were highest at 24 hours and decreased 1.6- to 5.5-fold every 24 hours (Fig. 2A).
We observed a similar trend with other cancer cell lines in a previous study [30]. One outlier sample, a replicate for BT-474 cells expressing PcTF for 48 hours, had a markedly different PcTF expression level (Fig. 2A) and genome-wide transcription profile (Fig. S2) and was therefore omitted from further analyses. Different subsets of genes were up- or down-regulated at least two-fold (q value ≤ 0.05) early, late, or across all time points during PcTF expression (Fig. 2B). Of the genes that showed at least a two-fold change in either direction, the vast majority were up-regulated ( by others [54][55][56][57][58]. These genes may also be direct targets of PcTF. MX1, HERC6, and UBE2L6 belong to the PRC module groups shown in Figure 1 (panels B and C). Other studies have linked high levels of expression from interferon pathway genes with a non-cancerous phenotype. In breast cancer, an immune response gene-expressing subgroup, which includes ISG15, MX1, and other interferon genes, has been associated with improved prognosis in triple-negative breast cancers [59,60]. Therefore, our results suggest that PcTF shifts the transcription profiles of breast cancer cells towards an anti-cancer state. Our discovery of 19 commonly upregulated genes indicates that diverse cancer subtypes can be similarly affected by a single synthetic transcriptional regulator.
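Identifying genes that are upregulated in every cell line amounts to a set intersection. The sketch below shows the operation in Python; the function name `common_upregulated` and the placeholder gene labels are hypothetical and do not reproduce the gene lists from this study.

```python
from typing import Dict, Set

def common_upregulated(up_by_line: Dict[str, Set[str]]) -> Set[str]:
    """Return genes present in the upregulated set of every cell line."""
    gene_sets = list(up_by_line.values())
    if not gene_sets:
        return set()
    return set.intersection(*gene_sets)

# Placeholder labels only; real input would be the per-cell-line lists of
# genes passing the two-fold, q value <= 0.05 criteria.
example = {
    "MCF7":   {"GENE_A", "GENE_B", "GENE_C"},
    "BT-474": {"GENE_A", "GENE_C"},
    "BT-549": {"GENE_A", "GENE_C", "GENE_D"},
}
print(sorted(common_upregulated(example)))
```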

Genes become upregulated over time in the presence of stable PcTF levels
We sought to determine the dynamics of transcriptional regulation at PcTF-regulated genes. Using there is little difference between the basal versus activated expression level. Furthermore, these genes may have been slightly upregulated prior to dox treatment since PcTF was detected at low levels before induction (Fig. 3C).
In all cases except for SP100 , the truncated fusion protein Pc Δ TF did not upregulate

PcTF-sensitive genes are located in chromatin regions that contain both silencing and activation-associated marks
To investigate the contribution of local chromatin states to PcTF-mediated gene regulation, we compared chromatin modifications at several loci with the expression profiles of the corresponding genes. Previously, we showed that CASZ1, a direct target of PcTF, was enriched for H3K27me3 up to 10 kb upstream of the promoter and showed an immediate (10-24 hours) response to PcTF in a U-2 OS model cell line [30]. Here, we utilized the extensive public ChIP-seq data that are available for MCF7 to investigate chromatin modifications at PcTF-responsive genes. In untreated MCF7 cells, silenced, non-responsive genes (Fig. 3B). These data provide further evidence that PcTF acts upon a select subset of H3K27me3-enriched loci, as we previously observed in other cell lines [30]. Class 2 PcTF-responsive genes generally lack the H3K27 methylation mark and may represent downstream targets of the products expressed from direct PcTF targets.
To investigate potential off-target binding with a similar histone modification, we analyzed enrichment profiles for the mark H3K9me3, a modification that is frequently found at constitutive pericentric heterochromatin and non-coding DNA [63][64][65]. PCD peptides have shown some affinity for both H3K9me3 and H3K27me3 peptide tails in vitro [61,62]. The gene subsets showed no significant differences in mean H3K9me3 enrichment (Fig. 4B). We observed that PcTF-responsive genes tended to be distributed along chromosome arms rather than concentrated near centromeres (Fig. S4). PcTF target sites coincide more closely with the distribution of facultative chromatin and epigenetically-regulated cell development genes [41,66], which are characterized by H3K27me3 enrichment in specific cell types.
Enrichment for two active gene-associated marks, H3K27ac and H3K4me3, at  showed that median expression was lower in untreated BT-474 and MCF7 than in the non-cancerous MCF10A cell line (Fig. 5A). This result is consistent with the idea that epigenetic repression of TSGs supports a cancerous cell phenotype. In PcTF-expressing cells, the median expression of the fifteen tumor suppressor genes was increased at all time points compared to the untreated samples for each cancer cell line (Fig. 5A). Interestingly, the median FPKM value for the 15 TSGs was higher in BT-549 than in MCF10A. We examined the expression levels of the individual genes and found that BMP2, CEACAM1, CDKN1A, and DSP are lower in BT-549, as well as in BT-474 and MCF7, than in MCF10A (Fig. 5B). These genes become upregulated in PcTF-expressing cells.

CONCLUSIONS & DISCUSSION
We have demonstrated that PcTF, a synthetic transcription factor that is designed to recognize H3K27me3, leads to broad changes in the transcription profiles of cell lines that represent different breast cancer subtypes. We hypothesized that genes enriched for Polycomb-associated chromatin marks would become up-regulated by PcTF. While many H3K27me3-enriched genes were upregulated in MCF7, many were non-responsive. Several genes that were not identified as Polycomb-enriched became up-regulated by PcTF. Enrichment of H3K9me3 did not distinguish PcTF-responsive genes; therefore, we can rule out the off-target binding that has been observed for PCD peptides in biochemical studies [61,62]. At PcTF-responsive genes, levels of H3K4me3 and H3K27ac were higher than at silenced non-responsive genes, but RNA PolII was depleted compared to active non-responsive genes. Therefore, the chromatin at PcTF-responsive genes may support a low or intermediate expression state. Berrozpe et al. recently reported that Polycomb complexes preferentially accumulate at weakly expressed genes rather than strongly silenced or highly expressed genes [70]. In our experiments, specific PRC-regulated genes may have been expressed at low to intermediate levels and then further upregulated upon exposure to PcTF. Our analysis of PcTF-regulated genes and chromatin states paves the way for future studies to further resolve chromatin features that distinguish regulatable PRC-repressed genes in cancer cells.
Deregulation of histone-modifying enzymes contributes to epigenomic diversity in breast cancer cells [13]. Other factors can also contribute to transcription profile variations, such as differences in the abundance of wild type or mutated transcription factors, or mutations that impact the stability and turnover of RNA transcripts. Past work has begun to illuminate the relationship between phenotypic subclasses and transcription profiles [15,36,49,71]. Such investigations help to elucidate cancer mechanisms and drug targets for more effective treatments. However, the link between transcriptome and phenotype is not entirely straightforward. We observed that the transcription profile of BT-549 (invasive basal B) is more similar to MCF7 (luminal) than either is to BT-474 (luminal).
In contrast, other reports have shown clear distinctions between the transcription profiles and phenotypes of BT-549 and MCF7 [36,49] . Differences in transcript profiling methods, our RNA-seq and JSD analysis versus the DNA oligomer arrays used by others, may account for this conflicting result. Further, we acknowledge that the JSD may be driven by a few genes with high expression and high variance, which could account for some of the patterns. Cancer cell transcriptome diversity poses a formidable challenge for the development of drugs that pinpoint specific genes and pathways. The results reported here demonstrate that PcTF co-regulates cohorts of genes, many of which are associated with anticancer functions, in diverse model breast cancer cell lines with different basal gene expression levels. BT-549 showed higher basal expression levels for many genes, including predicted PRC module genes, that became upregulated by PcTF. In contrast, the same genes were strongly repressed in BT-474. In spite of this difference, several of these genes became activated in the presence of PcTF.
PcTF represents a new class of biologic, a medicinal protein-based macromolecule. So far, low molecular weight compounds, rather than biologics, have been the predominant method for epigenetic intervention. Their ease of delivery, orally or intravenously, makes these compounds a very attractive approach for cancer treatment. However, small compounds have a very limited range of biological activity, e.g. as ligands for specific proteins, compared to macromolecules. Transgenic and synthetic transcription factors expand the repertoire of epigenetic drug activity by allowing selective control of therapeutic genes in cancer cells [72][73][74][75]. Protein expression often relies on inefficient and possibly mutagenic nucleic acid delivery, which poses a significant barrier for many potential biologics. Recent advances in large molecule carriers such as cell penetrating peptides [76][77][78] provide a positive outlook for cellular delivery of purified proteins. We have recently shown that PcTF has affinity for histone H3K27me3 peptides in vitro [29]. Here we demonstrate the potential utility of PcTF in additional cancer types. It will eventually be important to determine if cell-penetrating PcTF proteins meet or exceed the efficacy of small molecule epigenetic drugs in tumor models. At present, PcTF shows promise for activating tumor suppressor genes and provides a potential alternative to traditional tumor therapies.

DNA constructs
Plasmids were constructed to express fusion proteins either constitutively or in the presence of doxycycline. The plasmid for constitutive expression of PcTF, hPCD-TF_MV2 (KAH126), was constructed as previously described [79].

Preparation of total mRNA
Total messenger RNA was extracted from 90% confluent cells (~1-2 x 10^6 (Fig. 3C), or as double delta Cp = Cp dox treated cells / delta Cp no dox for gene expression levels in the stable cell lines (Fig. 3D).

Transcriptome analysis
RNA-seq reads were quality-checked before and after trimming and filtering using FastQC [81]. TrimmomaticSE was used to clip bases that were below the PHRED-scaled threshold quality of 10 at the 5' end and 25 at the trailing 3' end of each read for all samples [82]. A sliding window of 4 bases was used to clip reads when the average quality per base dropped below 30. Reads of less than 50 bp were removed. A combined reference genome index and dictionary for GRCh38.p7 (1-22, X, MT, and non-chromosomal sequences) [83], which included the full coding region of the synthetic PcTF protein, were created using Spliced Transcripts Alignment to Reference (STAR v2.5.2b) [84] and the Picard tools (version 1.1.19) [85]. Trimmed RNA-seq reads were mapped, and splice junctions extracted, using the STAR v2.5.2b read aligner [84]. Bamtools 2.4.0 [86] was used to check alignment quality using the 'stats' command. Mapped reads in BAM format were sorted, duplicates were marked, read groups were added, and the files were indexed using the Bamtools 2.4.0 package. Cuffdiff, a program in the Cufflinks package [53], was used to identify genes and transcripts that showed significant expression changes in pairwise comparisons between conditions. Fastq and differential expression analysis files are available at the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database (Accession GSE103520, release date September 8, 2017).
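The read-trimming rules described above (5' threshold of 10, 3' threshold of 25, a 4-base sliding window with a mean quality cutoff of 30, and a 50 bp minimum length) can be sketched in plain Python. This is a simplified illustration of the logic, not the Trimmomatic implementation; the function name `trim_read`, the order in which the steps are applied, and the exact cut position of the sliding window are assumptions.

```python
from typing import List, Optional, Tuple

def trim_read(seq: str, quals: List[int], leading: int = 10, trailing: int = 25,
              window: int = 4, window_min_avg: float = 30.0,
              min_len: int = 50) -> Optional[Tuple[str, List[int]]]:
    """Apply simplified quality trimming to one read; return None if dropped."""
    # Step 1: clip bases from the 5' end while quality is below `leading`.
    start = 0
    while start < len(quals) and quals[start] < leading:
        start += 1
    # Step 2: clip bases from the 3' end while quality is below `trailing`.
    end = len(quals)
    while end > start and quals[end - 1] < trailing:
        end -= 1
    seq, quals = seq[start:end], quals[start:end]
    # Step 3: scan 5' to 3' and cut at the start of the first window whose
    # mean quality falls below `window_min_avg` (assumed cut position).
    cut = len(quals)
    for i in range(0, len(quals) - window + 1):
        if sum(quals[i:i + window]) / window < window_min_avg:
            cut = i
            break
    seq, quals = seq[:cut], quals[:cut]
    # Step 4: discard reads shorter than `min_len`.
    if len(seq) < min_len:
        return None
    return seq, quals

# Hypothetical 60-base read whose last 8 bases have quality 26.
result = trim_read("A" * 60, [40] * 52 + [26] * 8)
print(len(result[0]) if result else None)
```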
CummeRbund [53] was used to calculate distances between features and to generate graphs and charts (JSD plots). R ggplot2 [83,87] and VennDiagram [88] were used to generate heat maps and Venn diagrams, respectively. The entire workflow is provided as a readme file at: https://github.com/WilsonSayresLab/PcTF_differential_expression
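For readers unfamiliar with the JSD measure used in the distance plots, the following Python sketch computes a Jensen-Shannon distance between two expression profiles. It is an illustration only and is not the CummeRbund code; the normalization of expression values to probabilities and the use of base-2 logarithms are assumptions, and the function names are hypothetical.

```python
import math
from typing import List, Sequence

def to_probs(values: Sequence[float]) -> List[float]:
    """Scale non-negative expression values so that they sum to 1."""
    total = sum(values)
    if total <= 0:
        raise ValueError("profile must contain positive expression")
    return [v / total for v in values]

def shannon_entropy(p: Sequence[float]) -> float:
    """Shannon entropy in bits; zero entries contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def js_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Square root of the Jensen-Shannon divergence between two profiles."""
    p, q = to_probs(a), to_probs(b)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    divergence = shannon_entropy(m) - (shannon_entropy(p) + shannon_entropy(q)) / 2
    return math.sqrt(max(divergence, 0.0))  # guard against tiny negative rounding

# Hypothetical profiles over the same ordered gene list.
print(js_distance([1.0, 0.0], [0.0, 1.0]))
```

Because each profile is normalized to sum to 1, a few highly expressed genes can dominate the distance, which is relevant to the caveat about JSD raised in the Discussion.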

Bioinformatics analyses and sources of public shared data
For the results shown in Figure 1B, genome-wide H3K27me3 enrichment in MCF7 cells, determined by chromatin immunoprecipitation followed by deep sequencing (ChIP-seq), was downloaded from the ENCODE project (accession UCSC-ENCODE-hg19:wgEncodeEH002922) [89]. We classified genes with a ChIP-seq peak within 5000 bp up or downstream of the transcription start site as H3K4me3 (ENCFF530LJW.bigWig), and RNA PolII (ENCFF690CUE.bam) and used to generate plots using DeepTools [90] (computeMatrix, plotProfile, plotHeatmap) in the Galaxy online platform at usegalaxy.org [91]. Prior to plotting, the RNA PolII data were converted to bigWig format using bamCoverage. Figure S4 was generated using REViGO [92]. Unique differentially expressed genes were researched using GeneCards [69] and GOrilla analysis [93]. For the data shown in Figure S5, REViGO [92] was used to compare the biological functions for the differentially expressed genes. The results in Figure 5 are based on a list of human tumor suppressor genes (983 total) that are reported to show lower expression in cancer samples of The Cancer Genome Atlas (TCGA) compared to the TCGA normal tissue samples; the list was downloaded from https://bioinfo.uth.edu/TSGene/download.cgi. Of these 983 genes, 589 are breast cancer specific [67,68].
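The peak-based classification mentioned above (a ChIP-seq peak within 5000 bp up or downstream of the transcription start site) can be illustrated with a short Python sketch. The function name `has_peak_near_tss`, the tuple layout for peaks, and the choice to count any overlap with the window (rather than, for example, the peak summit) are assumptions for illustration and may differ from the procedure actually used.

```python
from typing import Iterable, Tuple

Peak = Tuple[str, int, int]  # (chromosome, start, end), hypothetical layout

def has_peak_near_tss(chrom: str, tss: int, peaks: Iterable[Peak],
                      flank: int = 5000) -> bool:
    """Return True if any peak on `chrom` overlaps [tss - flank, tss + flank]."""
    lo, hi = tss - flank, tss + flank
    return any(c == chrom and start <= hi and end >= lo
               for c, start, end in peaks)

# Hypothetical coordinates, not taken from the ENCODE tracks.
peaks = [("chr1", 10_000, 10_500)]
print(has_peak_near_tss("chr1", 15_000, peaks))
print(has_peak_near_tss("chr1", 16_000, peaks))
```

For genome-scale data, an interval index or sorted peak lists would be more efficient than this linear scan.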