The Screening of Critical Related Genes in Celiac Disease Based on Intraepithelial Lymphocytes Investigation: A Bioinformatics Analysis

Background: Celiac disease (CD) is an immune-mediated intestinal disorder characterized by a response to gluten. In addition to environmental factors and dysbiosis of the gut microbiota, genetic susceptibility plays an important role in the pathogenesis of this multifactorial disorder. Therefore, this study aims to identify the crucial genes involved in CD pathogenesis. Materials and Methods: In this bioinformatics study, significant differentially expressed genes (DEGs) of intraepithelial lymphocyte (IEL) samples from celiac patients versus clinical controls, obtained from the Gene Expression Omnibus (GEO) database, were screened via a protein-protein interaction (PPI) network. The critical nodes were determined based on degree values, betweenness centrality, and fold changes, and were enriched by ClueGO to find the related biological terms. Results: According to the network analysis, five central nodes (IL2, PIK3CA, PRDM10, AKT1, and SRC) and eight significant DEGs were determined as the critical genes related to CD. In addition, CD4-positive, CD25-positive, alpha-beta regulatory T cell differentiation was identified as the prominent biological term in celiac disease patients. Conclusion: The identified genes form a possible biomarker panel for CD that may serve as a therapeutic or diagnostic tool to manage the disease.


Introduction
Celiac disease (CD) is a small intestinal disorder that can lead to villous atrophy, malabsorption, and malignancy in the small intestine [1,2]. CD is triggered by the ingestion of gluten proteins found in wheat, barley, and rye [3]. The main genetic predisposition factor in this disorder is the expression of HLA-DQ2 and HLA-DQ8, antigen-presenting molecules of the human leukocyte antigen system [4]. HLA-DQ2 and HLA-DQ8 bind to gluten peptides and activate destructive intestinal T cells [5]. Gluten peptides induce the secretion of IgA-class autoantibodies in the small-intestinal mucosa, which are targeted against tissue transglutaminase (tTG). Interestingly, after implementation of a gluten-free diet, these autoantibodies can disappear from the circulation more rapidly than the small-intestinal mucosal abnormality resolves [6]. The only approved treatment of CD is a lifelong gluten-free diet [7]. The histopathology of celiac disease was classified by Marsh et al. [9]. Besides the different diagnostic methods, proteomic analysis of the patient's serum could be a clue to developing new diagnostic and therapeutic markers for CD [10]. In addition, analysis of protein-protein interaction networks built from proteomics data is one of the supportive approaches for discovering biomarkers of celiac disease pathogenesis [11]. Stulík et al. reported that sera of patients with celiac disease recognized 11 proteins with various frequencies. They identified actin, ATP synthase b chain, and two charge variants of enolase as autoantigens. Therefore, we assume that protein-protein interaction network analysis is a suitable method for screening the numerous known genes related to CD and introducing the crucial ones. The findings can be considered as potential biomarkers for prognosis and treatment.
GMJ.2019;8:e1407 www.gmj.ir

Materials and Methods
Profiles of three clinical controls were compared with the celiac samples. The cells were extracted from the small intestine (duodenal mucosa) of active celiac patients and clinical controls. The data set is recorded in the database under the title "Gene expression assessed by genome-wide hybridization bead array in IELs isolated from small intestinal biopsies of celiac disease patients with active and clinical controls." The platform is GPL6883 Illumina human ref-8 v3.0 expression beadchip. The top 250 DEGs were selected, and the uncharacterized ones were excluded. The selected significant DEGs were included in the PPI network via the STRING database using Cytoscape software version 3.6.0 (Applied Biosystems, Foster City, CA, USA). The constructed network was analyzed by the NetworkAnalyzer application of Cytoscape. The hub nodes were selected based on degree value, using a cutoff of mean + 2SD. The top 5% of nodes based on betweenness centrality were identified as bottleneck nodes [12]. The nodes that were both hubs and bottlenecks were introduced as hub-bottleneck nodes [13]. The eight top up-regulated and down-regulated DEGs were determined as critical DEGs. The critical DEGs and hub-bottlenecks were enriched to obtain biological terms via KEGG, WikiPathways, REACTOME pathways, GO molecular function, GO cellular component, and GO biological process by the ClueGO v2.5.0 plugin of Cytoscape. The terms were clustered based on the Kappa score. A log fold change (log FC) > 2 and P ≤ 0.05 were considered statistically significant.

Results
Table-2. Since gene function plays an important role in the molecular mechanism of diseases, gene ontology analysis of the identified central nodes and the top DEGs was performed, and the biological processes, cellular components, molecular functions, and biochemical pathways related to these critical nodes were determined (Figure-2).
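The hub and bottleneck selection described in the methods above can be illustrated with a minimal Python sketch. It is not the authors' workflow, which used Cytoscape; it only restates the selection rules on precomputed centrality values. The function name `hub_bottlenecks`, the input layout, the use of the population standard deviation, and the rounding of the 5% count are all assumptions made for illustration, since the text does not specify them.

```python
import math
import statistics


def hub_bottlenecks(centrality, top_fraction=0.05):
    """Return (hubs, bottlenecks, hub_bottlenecks) as sets of node names.

    `centrality` maps node name -> (degree, betweenness), for example
    values exported from a network analysis tool (hypothetical layout).
    """
    degrees = [deg for deg, _ in centrality.values()]

    # Hub cutoff: mean degree plus two standard deviations.
    # Assumption: population SD; the text does not say which SD was used.
    cutoff = statistics.mean(degrees) + 2 * statistics.pstdev(degrees)
    hubs = {node for node, (deg, _) in centrality.items() if deg > cutoff}

    # Bottlenecks: top 5% of nodes ranked by betweenness centrality.
    # Assumption: round the count up and keep at least one node.
    n_top = max(1, math.ceil(top_fraction * len(centrality)))
    ranked = sorted(centrality, key=lambda node: centrality[node][1], reverse=True)
    bottlenecks = set(ranked[:n_top])

    # Hub-bottlenecks: nodes that satisfy both criteria.
    return hubs, bottlenecks, hubs & bottlenecks


# Toy network (invented values, not data from the study):
# 28 ordinary nodes plus one highly connected node "A" and one
# low-degree, high-betweenness node "B".
toy = {f"n{i}": (2, 0.01) for i in range(28)}
toy["A"] = (30, 0.5)
toy["B"] = (3, 0.4)

hubs, bottlenecks, both = hub_bottlenecks(toy)
print(sorted(both))  # ['A'] for this toy network
```

In this toy example, node "B" ranks in the top 5% by betweenness but falls below the degree cutoff, so only "A" qualifies as a hub-bottleneck. Requiring both criteria keeps the set of central nodes small.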

Discussion
Molecular mechanisms of many diseases have been studied via PPI network analysis. Such analyses have led to the introduction of several central genes, which can potentially be considered as a biomarker panel. In this study, IELs were targeted to reveal new aspects of the molecular mechanism of celiac disease.
As shown in Figure-1, the samples are comparable because the expression profiles are matched according to boxplot analysis. Among the numerous DEGs, only a limited number of genes were included in the PPI network.
As represented in Table-1, there are five central nodes that play a crucial role in the network. The central nodes and the deregulated genes (Table-2) are suitable candidates for a biomarker panel. In the following part, the roles of these critical genes in celiac disease are described and discussed. IL2, the top hub-bottleneck node, is a highly connected node that interacts directly with 113 nodes [14]. IL2, as an important lymphokine, is involved in several cellular processes of T cells. The cells that secrete IL2 are responsible for responses to antigenic or mitogenic stimulation [15]. Nilsen et al. (1998) reported that interferon-γ, IL2, IL4, IL6, and tumor necrosis factor-α were rapidly elicited in duodenal biopsies of treated celiac patients exposed to gluten [16]. The regulatory relationship between SRC and AKT isoforms, which play critical roles in cell survival, growth, proliferation, angiogenesis, metabolism, and migration, has been emphasized; both genes were originally known as oncogenes [17]. The pivotal role of PRDM proteins in cell growth, differentiation, and neoplastic transformation has also been highlighted [18]. Therefore, SRC, AKT, and PRDM are involved in the regulation of cell differentiation and growth, and deregulation of these processes is important in cancer. ADGRE1 has been introduced as a macrophage marker, and its role in several phenomena such as bone regeneration has been highlighted [19]. ALDH1L2 is a member of the ALDH superfamily, which plays key roles in various life processes, mostly in the detoxification of pharmaceuticals and environmental pollutants. The process proceeds via NAD(P)+-dependent oxidation of aldehyde substrates [20]. As shown in Table-2, ADGRE1 and ALDH1L2 are the top up-regulated and down-regulated DEGs.
As depicted in Figure-2, CD4-positive, CD25-positive, alpha-beta regulatory T cell differentiation involved in the immune response is the prominent biological process related to the critical genes. All query genes, including the DEGs and the central nodes, are involved in this cluster. Based on our findings, T cells are essential elements in celiac disease. The intestinal T cell response in celiac disease has been investigated and discussed in several studies [21,22]. As Jabri and Sollid reported, the main feature of the molecular mechanism of celiac disease is related to the response of CD4 T cells to dietary gluten. This response, which promotes antigen-antibody reactions, results in the most developed lesion in the proximal small intestine of patients [23]. The gene ontology findings reflect the direct involvement of the important genes in CD.
The 13 introduced critical genes may form a suitable biomarker panel, or at least some of them may be considered as individual biomarkers of CD; however, more investigation is needed to confirm this.

Conclusion
In conclusion, achieving a molecular diagnostic tool for CD appears feasible. The introduced possible biomarkers may also be suitable therapeutic targets, pending further investigation.