Adipose-derived mesenchymal stem cells may reduce intestinal epithelial damage in ulcerative colitis by communicating with macrophages and blocking inflammatory pathways: an analysis in silico

Ulcerative colitis is a chronic, non-specific inflammatory disease that affects mainly the colonic mucosa and submucosa. The pathogenesis of ulcerative colitis is unclear, which limits the development of effective treatments. In this study, single-cell sequencing data from 18 ulcerative colitis samples and 12 healthy controls were downloaded from the Single Cell Portal database, cell types were defined through cluster analysis, and genes in each cluster that were differentially expressed in ulcerative colitis were identified. These genes were enriched in functional pathways related to apoptosis, immunity and inflammation. Analysis using iTALK software suggested extensive communication among immune cells. Single-cell sequencing data from adipose-derived mesenchymal stem cells from three healthy female donors were obtained from the Sequence Read Archive database. The SingleR package was used to identify different cell types, for each of which a stemness score was calculated. Pseudotime analysis was performed to infer the trajectory of cells. SCENIC software was used to identify the gene regulatory network in adipose-derived mesenchymal stem cells, and iTALK software was performed to explore the relationship among macrophages, adipose-derived mesenchymal stem cells and enterocytes. Molecular docking confirmed the possibility of cell-cell interactions via binding between surface receptors and their ligands. The bulk data were downloaded and analyzed to validate the expression of genes. Our bioinformatics analyses suggest that ulcerative colitis involves communication between macrophages and enterocytes via ligand-receptor pairs. Our results further suggest that adipose-derived mesenchymal stem cells may alleviate ulcerative colitis by communicating with macrophages to block inflammation.

but they are often ineffective and the disease can recur [3].
The pathogenesis of UC is unclear. Pathological examination of intestinal tissues during active UC shows extensive damage to intestinal epithelial cells as well as diffuse inflammation [2,4]. The intestinal mucosal acts as an immune and mechanical barrier [5] that maintains the stability of the intestinal flora and host immune tolerance toward intestinal microbes [6]. The chronic inflammation in UC can weaken the tight junctions between epithelial cells [7], leading to the destruction of the mucus layer on the surface of the intestinal epithelium [8]. Apoptosis and autophagy may also contribute to damage of the intestinal mucosa in UC [9].
Immune cells appear capable of influencing the course of UC. The subtype of monocytes called macrophages regulate immune responses in the intestinal microenvironment in UC [10]. Macrophages remove apoptotic cells [11] and regulate inflammatory processes [9]. Adipose-derived mesenchymal stem cells (ADMSCs) regulate macrophage function, and they down-regulate pro-inflammatory factors (INFγ, IL-6 and IL-8) while up-regulating anti-inflammatory factors (IL-10, IL-4), thereby weakening the local inflammatory response [12]. Clarifying the roles and interactions of macrophages and ADMSCs in UC may help clarify how the disease occurs and progresses, which may lead to therapeutic targets.
In this bioinformatics study, we found evidence that macrophages may damage the intestinal mucosal barrier by promoting inflammation and intestinal epithelial cell apoptosis/autophagy, likely contributing to UC. We found evidence that, conversely, ADMSCs may communicate with macrophages to block inflammation and thereby alleviate UC.

Single cell data collection and quality control
Single-cell RNA sequencing data from colon biopsies of 18 patients with UC and 12 healthy individuals were collected from the Single Cell Portal database (accession number SCP259) [1]. Single-cell RNA sequencing data from ADMSCs from thigh source of three healthy female donors were obtained from the Sequence Read Archive database (accession number SRP148833) [13].
Sequencing data were subjected to quality control based on the following criteria [14]: gene number between 200 and 6000, unique molecular identifiers (UMI) count > 1000, and mitochondrial gene percentage < 0.1. All 23 samples with sequencing data were used for cellclustering analysis.

Dimensional reduction, clustering and cell type identification
The most variable genes in single cells were identified as described [15]. In brief, the average expression and dispersion of each gene were calculated, then the genes were assigned to eight bins based on their expression. The "NormalizeData" function in Seurat [14] was used to normalize the expression matrix of single cells. The expression matrix was multiplied by 10000 using the "LogNormalize" function, then divided by the size of the total library, so that different cells could be compared. The expression levels of highly variable genes were scaled and centered using the "ScaleData" function in order to exclude the influence of mitochondrial genes and the total number of molecules detected within a cell.
Data were visualized in two dimensions using the "uniform manifold approximation and projection for dimension reduction" (UMAP) method. The SingleR package in R (version 0.2.2) was used to independently infer the cell source and identify the type of each single cell, based on the "Immgen" data set [15]. The "Findallmarkers" function in Seurat version 3.1.2 was used to identify differentially expressed genes (DEGs) in UC. DEGs were defined as those showing |log2(fold change)| > 1 and P < 0.05 with respect to controls.
To uncover the potential biological significance of DEGs, their enrichment in functional pathways was examined using the Kyoto Encyclopedia of Genes and Genomes (KEGG) within the "clusterprofiler" package in R [16]. P < 0.05 was considered to indicate enrichment.

Stemness score
The "stemness" gene set was downloaded from the Molecular Signatures database (https://www.gseamsigdb.org/gsea/msigdb) [17]. Gene set variation analysis was performed using the "GSVA" package in R in order to estimate variation in stemness across different cell types in an unsupervised manner [18].

Pseudotime analysis
Monocle 3 was used to simulate an evolutionary trajectory through pseudotime [15]. The "importCDS" function in Monocle was used to convert the original count in the Seurat object into the "CellDataSet" data set, and the "differentialGeneTest" function was used to identify genes that may help identify genes whose expression changes across pseudotime (qval < 0.01). The "dimension reduction" function was used for clustering, while the "orderCells" function was used to infer the trajectory based on default parameters. Gene expression was mapped using the "plot_genes_pseudotime" function.

Cell-cell crosstalk between cell clusters
The iTALK package in R [19] was used to investigate cell-cell crosstalk between cell clusters. Briefly, the top 50% of highly expressed genes in each cell cluster were matched to the 2,648 non-redundant ligand-receptor pairs included in the iTALK package. These pairs fall into four categories, based on whether the ligand serves as a checkpoint protein, cytokine, growth factor, or "other" protein. The top 20 ligand-receptor pairs for each type were visualized as a ligand-receptor interaction network.

Gene regulatory network and regulons
We used a modified version of the "Single-Cell Regulatory Network Inference" (SCENIC) approach [20,21] to construct a gene regulatory network from the single-cell RNA sequencing data [22]. First, coexpression modules of transcription factors (TFs) and their potential target genes were identified. Second, the most likely target genes were identified based on enrichment of the appropriate binding motifs in the TFs. The resulting regulons of TFs with their most likely target genes were assigned a "regulon activity score" (RAS) in each single cell, based on the area under the receiver operating characteristic curve.

The bioinformatical analysis on the bulk data level
The microarray data of UC patients and controls have been downloaded from the Gene Expression Omnibus (GEO) database (accession number: GSE38713, the platform: GPL570) [23]. The bulk data from intestinal mucosa of 13 healthy controls and 30 UC patients. The differentially expressed genes were identified using the Linear Models for Microarray data (limma) package [24] in R software. P < 0.05 was considered to indicate a statistically significant difference.

Molecular docking
Molecular docking studies of different cell types via surface receptors and ligands were performed using Hex8.0.0.0 software [25] and protein crystal structures available in the Protein Database (https://www.rcsb.org/ pages/contactus) [26]. Binding energy < 0 was taken to indicate possible binding. Potential complexes were visualized with Pymol software [27].

Total cellular landscape in UC
The sampling, sequencing and analysis workflow was show in Figure 1A. Through single-cell RNA sequencing of 68 colonoscopy specimens from 18 patients with UC and 12 healthy individuals, 366,650 high-quality cells were obtained. These cells were divided into 51 clusters based on the UMAP method. And we found that the distribution and number of cells of different subtypes was of difference ( Figure 1B). The violin pictures of healthy individuals-specific marker genes for each cell type further supported these cell types ( Figure 1C). In addition, the expression of marker genes common to UC patients and healthy controls was different in each cell type ( Figure 1D). Furthermore, we identified marker genes specific to UC patients was whose expression differed in UC patients and healthy controls ( Figure 1E and Supplementary Table 1).

Immune cell landscape in UC
UC impairs the integrity of the intestinal mucosa, which can compromise host immune tolerance toward intestinal microbes. Therefore we examined the landscape of immune cells in UC. Analysis of the DEGs in 51 types of cells (Figure 2A) showed functional enrichment in pathways involving immunity, apoptosis and inflammation, including pathways mediated by NF-κB, IL-17, PPAR, ErbB, and T cell receptors ( Figure 2B). We also found that immune cells may communicate with each other via binding between surface receptors and their cognate ligands ( Figure 2C). In particular, iTALK predicted that macrophages may communicate with Best4 + enterocytes, enterocytes and immature enterocytes (Supplementary Table 2). The interactions among the function pathways and genes in enterocytes of the above three subtypes were identified (Supplementary Tables 4-6). Interestingly, we found that some receptor or ligand genes (ITGB1, CD44, VCAN, CD4, ITGB2, AXL CANX and PLAUR) were significantly highly expressed in UC patients compared with healthy controls based on the bulk data (Supplementary Table 3).

Single-cell atlas of ADMSCs
High-quality transcriptome data were obtained from 24,358 single ADMSCs ( Figure 3A). UMAP dimensionality reduction showed that the cells were divided into four clusters ( Figure 3B). The stemness score of 4 clusters was calculated, the Cluster 3 showed the highest stemness score, while clusters 1 and 2 showed the lowest ( Figure 3C). Pseudo-time analysis suggested that ADMSCs in clusters 0, 1 and 2 differentiated from the cells in clusters 3 ( Figure 3D). The specific marker genes showed different expression patterns across the clusters ( Figure 3E). Using the SCENIC method, we organized regulons of TFs with their most likely target genes into six modules. For each module, we identified several representative regulators and cell types based on average activity scores ( Figure  3F), and UMAP analysis identified specific regulators in each cluster of ADMSCs ( Figure 3G).

Potential therapeutic mechanism of ADMSCs in UC
Previous studies have suggested that ADMSCs can alleviate UC [16], but the mechanism involved remains unclear. We found that the expression pattern of genes targeted by regulons in each cluster of ADMSCs ( Figure 4A). In addition, correlation analysis showed that expression level of regulons expression correlated with that of genes encoding surface ligand ( Figure 4B). The iTALK analysis result suggested that ADMSCs may communicate with macrophages via ligandreceptor interactions. The result provided the evidence that macrophages may also communicate with Best 4 + enterocytes, enterocytes and immature enterocytes ( Figure 4C). To examine the possibility of these cell-cell interactions via ligand-receptor interactions, molecular docking studies were performed,  which indicated nine potential ligand-receptor pairs ( Figure 4D). Based on the findings in this study, we proposed a mechanism that ADMSCs may alleviate UC after enter into the human body through communicating with macrophages, thereby further communicating with enterocytes by ligand-receptor pairs ( Figure 5).

DISCUSSION
In this bioinformatics study, we identified DEGs in different cell types of UC patients, which may participate in pathways involving immunity, apoptosis and inflammation. And we found that macrophages may communicate with enterocytes or ADMSCs. These results may help guide future research into the onset, progression and treatment of UC.
Enrichment results of the DEGs identified in our study of UC showed that the signaling pathways were significantly with immunity, apoptosis and inflammation. In this study, MAPK, Ras and mTOR signaling pathway, as well as E. coli-related biological function were significantly enriched based on the DEGs in UC patients. MAPK cascade in turn participates in cell migration, differentiation and proliferation [28]. In addition, Ras proteins function as molecular switches in signaling pathways involved in cell differentiation, growth, migration, survival and proliferation [29]. mTOR, a highly conserved serine/threonine protein kinase that regulates several biological processes [30]. Enteropathogenic E.coli (EPEC) and enterohemorrhagic E.coli (EHEC) are closely related pathogenic strains of Escherichia coli [31], which damage enterocytes and may increase risk of UC [32].
UC involves extensive damage to enterocytes and diffuse inflammation [2]. Macrophages interact with enterocytes [33] and may compromise gap junctions between enterocytes during intestinal inflammation [34]. Our iTALK analysis suggests cross-talk between them, implying that the interaction between macrophages and enterocytes may contribute to UC. Our results further suggest that ADMSCs communicate with macrophages to block inflammation and thereby alleviate UC. These in silico findings are consistent with biological studies showing that ADMSCs interact with immune cells [35] and can shift macrophages from a pro-inflammatory M1 phenotype to an antiinflammatory M2 phenotype [36].
Using the SCENIC method, we proposed a gene regulatory network in ADMSCs. Module M1 in the network contains regulators of gene expression, morphogenesis, and differentiation, and it also contains genes encoding zinc finger proteins required for normal development of the epithelial barrier, such as HOX12 [37], HOXA6 [38], KLF4 [39] and KLF6 [40]. Module M2 contains the regulators SP1, which is involved in many cellular processes [41], and TCF12, which is involved in cell cycle regulation or DNA replication [42]. Module M3 contains regulators FOXO3 and TEAD1. FOXO3 likely activates genes that promote apoptosis and autophagy [43,44], while the TF TEAD1 may promote apoptosis and restrict proliferation [45]. The consistency between our bioinformatics analyses of ADMSCs and previous biological studies suggests the reliability of our approach.
There are some limitations in this study. On the other hand, the data in this study come from a relatively small sample and were analyzed using only bioinformatics techniques, but they provide a useful reference for experimental work to clarify the pathogenesis of UC and develop effective treatments. On the other hand, the expression of genes in UC patients and healthy individuals was preliminarily validated on the bulk data level, the findings in this study needs to be validated in more bulk data or the biological experiments.

AUTHOR CONTRIBUTIONS
The guarantor of integrity of the entire study: Yufeng Lv, Qiong Song and Shaowen Mo; study design: Yufeng Lv, Qiong Song and Shaowen Mo; data analysis: Chengyu Huang, Mengxin Wei, Yixuan Chen and Ting Li; manuscript preparation: Nan Zhang; manuscript editing: Nan Zhang; manuscript review: Nan Zhang.

CONFLICTS OF INTEREST
The authors declare that they have no conflicts of interest.

FUNDING
This research received no specific grant from any funding agency in the public, commercial, or not-forprofit sectors.