Proteome-wide screening for the analysis of protein targeting of Chlamydia pneumoniae in endoplasmic reticulum of host cells and their possible implication in lung cancer development

Available reports have reported a link between bacterial infection and the progression of several types of cancer, including colon, lung, and prostate cancer. Here we report Chlamydia pneumoniae proteins predicted to target the endoplasmic reticulum (ER) using in-silico approaches, and their possible role in lung cancer etiology. We predicted 48 proteins that target the human ER, which may be associated with protein folding and protein-protein interactions during infection. These targeted proteins may be involved in competitive interactions between host and bacterial proteins, which may alter normal pathway functions and trigger the development of lung cancer. Moreover, accumulation of unfolded C. pneumoniae proteins in the human ER possibly induces ER stress, consequently activating the unfolded protein response (UPR) and providing a favorable microenvironment for cancer growth. These results may help researchers better manage lung cancer and establish a molecular mechanism for the association between C. pneumoniae and lung cancer.

List of abbreviations
ER: Endoplasmic reticulum


Introduction
Lung cancer is one of the most common cancers, with an estimated 1.8 million new cases worldwide in 2012 (approximately 13% of all cancers) (Ferlay et al., 2015). Lung carcinoma is among the most commonly diagnosed cancers and a leading cause of cancer death, together with breast cancer in women and prostate cancer in men (Bray et al., 2018). The association of bacterial infection with the progression of different types of cancer has been reported in various studies (Arthur et al., 2012; Khan, 2015; Magat et al., 2020). Chlamydia pneumoniae is a Gram-negative, obligate intracellular bacterium. C. pneumoniae causes acute respiratory infection and has been potentially associated with bronchitis, asthma, atherosclerosis, and an elevated risk of lung cancer; however, the causal pathogenetic mechanisms are not well understood. C. pneumoniae infection has been reported to alter several pathways of the infected host cell, including the calcium-mediated nuclear factor-κB (NF-κB) pathway, apoptosis, and necrosis (Chaturvedi et al., 2010; Littman et al., 2005; Wahl et al., 2003). NF-κB has a crucial role in proliferation, inflammation, and the regulation of apoptosis. Alteration of programmed cell death through the NF-κB pathway has been associated with cancer (Wahl et al., 2003). Chronic bacterial infection in humans may link inflammation to the carcinogenic process (Littman et al., 2005). The lungs of smokers may be more susceptible to the localization of C. pneumoniae (Von Hertzen, 1998). Induced monocytes generate superoxide radicals, tumor necrosis factor, IL-8, and IL-1β. These inflammatory factors may trigger carcinogenesis in lung cells (Koyi et al., 1999).
C. pneumoniae is capable of interfering with the host cell's apoptotic machinery. Alteration of apoptosis has been related to the survival strategy of intracellular C. pneumoniae. Many reports have demonstrated that C. pneumoniae infection inhibits host cell apoptosis (Airenne et al., 2002; Fischer et al., 2001; Wahl et al., 2003). However, some contradictory studies have shown proapoptotic activity of C. pneumoniae during infection (Sessa et al., 2009). Members of the inhibitor of apoptosis protein family are important regulators of apoptotic cell death. C. pneumoniae infection induced the mRNA and protein expression of cellular inhibitor of apoptosis 2 (c-IAP2) in the human cell line Mono Mac 6, which may aid the intracellular survival of the bacterium (Wahl et al., 2003). Understanding of how C. pneumoniae infection contributes to lung carcinogenesis is still in its infancy.
Cancer is a serious problem worldwide and arises from various factors, including infection, mutations, smoking, and disturbance of the homeostasis of various biological processes. The endoplasmic reticulum (ER) is an important subcellular organelle of eukaryotes that delivers proteins to their subcellular locations or to the cell surface (Braakman and Bulleid, 2011). The ER contains various molecular chaperones involved in protein synthesis, folding, and maturation. Several biological processes depend on the normal functioning of the ER. Interestingly, activation of the ER stress response is a cellular process that can be triggered by a wide variety of pathological human conditions, such as Alzheimer's disease and cancer. Conditions including inflammatory bowel disease, diabetes, and various types of cancer are related to activation of the ER stress response pathway. ER stress is induced in numerous pathological and physiological conditions through the accumulation of misfolded proteins in the ER lumen.
In the current work, we predicted C. pneumoniae proteins that target the ER of host cells and their possible involvement in the growth of lung cancer. We used a bioinformatics approach to predict, from the whole set of C. pneumoniae protein sequences, which proteins may localize to the host ER. Furthermore, we examined the possible role of the molecular weight and isoelectric point of ER-targeted proteins in protein targeting and the growth of lung cancer. We expect that our results will provide useful direction for wet-lab experimental research.

Selection of database
Various databases were consulted in order to obtain the maximum number of C. pneumoniae protein sequences. In the current study we considered databases such as NCBI, UniProt, and EMBL to retrieve the complete proteome of C. pneumoniae.

C. pneumoniae proteome retrieval
C. pneumoniae infects the host as an obligate intracellular bacterium and is a possible factor in the growth of lung cancer (Hua-Feng et al., 2015; Xiong et al., 2019). Six isolates (CWL029, AR39, J138, B21, TW-183, LPCoLN) were available in the UniProt database with extensive protein sequence information (Kalman et al., 1999; Myers et al., 2009; Read et al., 2000; Shirai et al., 2000). The whole set of protein sequences of the TW-183 isolate was retrieved from the UniProt database. These protein sequences were used to predict ER-targeted proteins using a bioinformatics approach.
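As an illustration of how a proteome downloaded in FASTA format could be prepared for the predictors described in this section, the following minimal sketch reads UniProt-style FASTA text into a dictionary keyed by accession. The function name, the accessions `ACC001` and `ACC002`, the entry names, and the sequences are hypothetical placeholders and are not taken from the TW-183 proteome.

```python
def parse_uniprot_fasta(text):
    """Return {accession: sequence} for UniProt-style FASTA text."""
    records = {}
    accession, chunks = None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Store the previous record before starting a new one.
            if accession is not None:
                records[accession] = "".join(chunks)
            fields = line[1:].split("|")
            # UniProt headers look like ">db|ACCESSION|ENTRY_NAME description";
            # otherwise fall back to the first word of the header.
            if len(fields) >= 3:
                accession = fields[1]
            else:
                accession = (line[1:].split() or ["unknown"])[0]
            chunks = []
        elif accession is not None:
            chunks.append(line.upper())
    if accession is not None:
        records[accession] = "".join(chunks)
    return records


# Hypothetical two-record example; accessions, names and sequences are made up.
EXAMPLE_FASTA = """\
>tr|ACC001|EXAMPLE1_CHLPN Hypothetical record one
MKIIHTAIEF
APVIKAG
>tr|ACC002|EXAMPLE2_CHLPN Hypothetical record two
MSEETTPLC
"""

proteome = parse_uniprot_fasta(EXAMPLE_FASTA)
print(len(proteome), "sequences parsed")
```

Keying the records by accession makes it straightforward to join the sequences with per-protein results such as predicted localization, molecular weight, and pI.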
Deciphering the endoplasmic reticulum targeting proteins of C. pneumoniae and their role in lung cancer
The online bioinformatics predictor Hum-mPLoc 2.0 was used in the current study to predict C. pneumoniae proteins targeting the ER of human cells. This predictor has been used to predict the subcellular localization of proteins in various organisms (Shen and Chou, 2009). All protein sequences of the TW-183 isolate of C. pneumoniae were used to predict protein targeting to the host ER. Hum-mPLoc 2.0 predicts protein targeting to various subcellular locations, such as the endoplasmic reticulum, cytoplasm, mitochondrion, nucleus, centriole, cytoskeleton, endosome, extracellular space, Golgi apparatus, and lysosome. The predictor uses sequential evolution information and functional domain information with an ensemble classifier. The prediction protocol is shown as a general scheme in Fig. 1.

Deciphering the possible relation of the molecular weight of C. pneumoniae proteins to endoplasmic reticulum targeting and their role in lung cancer
The Compute pI/Mw tool on the ExPASy server was used to calculate the molecular weight of all protein sequences of the TW-183 isolate. This tool is used to analyse various properties of proteins in proteomics research. It relies on experimentally derived values for individual amino acids and provides reliable theoretical estimates.
Deciphering the possible relation of the isoelectric point of C. pneumoniae proteins to endoplasmic reticulum targeting and their role in lung cancer
Similarly, the Compute pI/Mw tool on the ExPASy server was used to calculate the isoelectric point (pI) of all protein sequences of the TW-183 isolate.
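The two quantities used in this section can be approximated directly from a sequence. The sketch below is not the Compute pI/Mw implementation: it sums standard average residue masses plus one water molecule for the molecular weight, and finds the pI by bisection on the net charge. The pKa values are illustrative textbook-style assumptions, so the resulting pI may differ from the value reported by the ExPASy tool; all function names are hypothetical.

```python
# Average residue masses in Da; one water molecule is added per chain.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528

# Illustrative pKa values (assumption, NOT the set used by Compute pI/Mw).
PKA_POSITIVE = {"Nterm": 9.0, "K": 10.5, "R": 12.5, "H": 6.0}
PKA_NEGATIVE = {"Cterm": 3.1, "D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}


def molecular_weight(seq):
    """Theoretical average mass in Da; raises KeyError on non-standard residues."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def net_charge(seq, ph):
    """Net charge of the chain at a given pH (Henderson-Hasselbalch)."""
    counts = {aa: seq.count(aa) for aa in "KRHDECY"}
    counts["Nterm"] = counts["Cterm"] = 1
    pos = sum(counts[g] / (1 + 10 ** (ph - pka)) for g, pka in PKA_POSITIVE.items())
    neg = sum(counts[g] / (1 + 10 ** (pka - ph)) for g, pka in PKA_NEGATIVE.items())
    return pos - neg


def isoelectric_point(seq, tol=1e-4):
    """pH at which the net charge is zero, found by bisection on [0, 14]."""
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if net_charge(seq, mid) > 0:
            lo = mid   # still positively charged: pI lies at higher pH
        else:
            hi = mid
    return (lo + hi) / 2


print(round(molecular_weight("GA") / 1000, 4), "kDa")
```

Bisection is sufficient here because the net charge decreases monotonically with pH, so there is exactly one zero crossing between pH 0 and 14.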

Selection of database and retrieval of proteome
In the present work, we selected the UniProt database to retrieve the protein sequences of C. pneumoniae. The UniProt database contains information on most bacteria and is considered a reliable source for bioinformatic studies.
The TW-183 isolate of C. pneumoniae was selected for retrieval of the whole set of protein sequences because of its large proteome, which contained the maximum number of protein sequences.
Deciphering the endoplasmic reticulum targeting proteins of C. pneumoniae and their role in lung cancer
The Hum-mPLoc 2.0 results showed that only 48 proteins of the TW-183 isolate of C. pneumoniae were predicted to target the ER of host cells. Details of these proteins, such as accession number, possible function, and number of amino acids, are shown in Tab. 1.
The normal functions of host cells may be altered through different strategies owing to protein-protein interactions in the ER lumen in the presence of C. pneumoniae proteins. The ER plays a very important role in protein folding and the formation of active proteins and enzymes.
Deciphering the possible relation of the molecular weight of C. pneumoniae proteins to endoplasmic reticulum targeting and their role in lung cancer
The whole set of C. pneumoniae protein sequences was used to calculate the theoretical molecular weight (MW) using the Compute pI/Mw tool (https://web.expasy.org/compute_pi/). This tool is useful for estimating the MW and theoretical pI of uncharacterized protein sequences. The distributions of ER-targeted proteins and of all proteins of the TW-183 isolate of C. pneumoniae across molecular weight ranges are shown in Figs. 2A and 2B.
Our results showed that ER targeting of proteins in host cells decreased consistently with increasing molecular weight. Proteins in the lowest molecular weight range (0-20 kDa) showed high potential to target the ER of host cells, while proteins with the highest molecular weight (>80 kDa) showed very little potential to target the ER.
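The comparison by molecular weight range can be sketched as a simple binning step. In the example below only the 0-20 kDa and >80 kDa ranges come from the text; the three intermediate 20 kDa bins, the function names, and the protein identifiers and weights are assumptions used for illustration, not data from this study.

```python
from collections import Counter

# Bin edges in kDa. Only "0-20" and ">80" are named in the text; the three
# intermediate 20 kDa bins are an assumption.
BINS = [(0, 20, "0-20"), (20, 40, "20-40"), (40, 60, "40-60"), (60, 80, "60-80")]


def mw_bin(mw_kda):
    """Return the label of the range containing mw_kda (80 kDa and above -> '>80')."""
    for low, high, label in BINS:
        if low <= mw_kda < high:
            return label
    return ">80"


def er_fraction_by_bin(mw_by_protein, er_targeted):
    """Fraction of proteins in each occupied bin that are predicted ER-targeted."""
    total = Counter(mw_bin(mw) for mw in mw_by_protein.values())
    hits = Counter(mw_bin(mw_by_protein[p]) for p in er_targeted)
    return {label: hits[label] / total[label] for label in total}


# Hypothetical molecular weights (kDa) and ER predictions.
example_mw = {"p1": 12.4, "p2": 18.0, "p3": 35.2, "p4": 95.0}
example_er = {"p1", "p3"}
fractions = er_fraction_by_bin(example_mw, example_er)
print(fractions)
```

Reporting the fraction per bin, rather than the raw count, separates the effect of molecular weight on targeting from the number of proteins that happen to fall in each range.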
Deciphering the possible relation of the isoelectric point of C. pneumoniae proteins to endoplasmic reticulum targeting and their role in lung cancer
Similarly, the isoelectric point (pI) values of all protein sequences of the TW-183 isolate of C. pneumoniae were calculated using the online Compute pI/Mw tool. The distributions of ER-targeted proteins and of all proteins of the TW-183 isolate of C. pneumoniae across pI ranges are shown in Figs. 3A and 3B.
A previous report demonstrated that pathogen protein translocation is supported by a near-neutral isoelectric point (pI) between 6.0 and 8.4. Nevertheless, for proteins >50 kDa in size, pI alone does not explain their localisation (Smigielski et al., 2019). Our results showed no consistent relationship between pI and protein targeting to the ER of host cells (Figs. 3A and 3B). Proteins with pI values of 9-10 showed the highest potential to target the ER of host cells (24 proteins).
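A tally of this kind can be sketched as follows, assuming one-unit pI intervals such as 9-10 and the 6.0 to 8.4 window mentioned above. The function names and the pI values are hypothetical and do not reproduce the counts of this study.

```python
import math
from collections import Counter

NEUTRAL_WINDOW = (6.0, 8.4)  # pI range cited from Smigielski et al. (2019)


def pi_histogram(pi_values):
    """Count proteins per one-unit pI interval; '9-10' covers 9 <= pI < 10."""
    counts = Counter()
    for pi in pi_values:
        low = min(math.floor(pi), 13)  # keep pI = 14.0 in the last interval
        counts[f"{low}-{low + 1}"] += 1
    return counts


def share_in_neutral_window(pi_values):
    """Fraction of proteins whose pI lies inside the near-neutral window."""
    low, high = NEUTRAL_WINDOW
    inside = sum(1 for pi in pi_values if low <= pi <= high)
    return inside / len(pi_values)


# Hypothetical pI values for a handful of ER-targeted proteins.
example_pi = [9.3, 9.8, 6.5, 4.9, 9.0]
hist = pi_histogram(example_pi)
print(hist.most_common(1), share_in_neutral_window(example_pi))
```

Comparing the most populated interval with the share inside the near-neutral window gives a quick view of whether the ER-targeted set follows the previously reported pI preference.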

Discussion
Although wet-lab experiments for analysing protein targeting signals can provide clear evidence and distinguish between retention and retrieval signals, they are generally more time-consuming and costly. Therefore, in-silico computational research is recognized as an alternative approach that provides informative and useful direction for experimental research. The cell is considered the smallest unit of life and holds numerous molecules, including proteins and enzymes. The various subcellular locations, generally called cell organelles, contain different proteins with important functions for the cell's survival. The ER is a major subcellular organelle in eukaryotic cells. Approximately one-third of proteins are transported to their subcellular locations through the ER (Hetz et al., 2015). The ER is involved in the regulation of various essential biological processes, including protein folding, post-translational modification, cell metabolism, and protein synthesis (Schwarz and Blower, 2016).
In our current study, 48 proteins from the entire proteome of C. pneumoniae were predicted to target the ER of human cells. These targeted proteins may alter different functions of the human cell during the course of infection. A previous study showed that the ER of infected epithelial cells mostly contained chlamydial major outer membrane protein (MOMP), lipopolysaccharide (LPS), and the inclusion membrane protein A (IncA) (Giles and Wyrick, 2008). In-silico prediction of protein targeting in human cells is essential for understanding the growth of cancer, particularly when cancer progression is related to intracellular bacterial infection. Various studies have shown that many bacterial proteins targeted to different subcellular organelles of the host cell (including mitochondria, Golgi complex, cytoplasm, and nucleus) exert adverse effects and act as etiological factors in the progression of various types of cancer (Khan et al., 2016; Khan et al., 2020; Khan et al., 2017). The proteins and enzymes that reside in the ER are known as ER-resident proteins, an important topic in ER-related studies. Several ER-resident proteins and enzymes carry specific sorting signals, including KDEL or KKXX, while others have no such signals (Stornaiuolo et al., 2003). Information on the localization of proteins and enzymes in different subcellular locations may provide clues to their functions. It is a prerequisite for revealing the functions of proteins and enzymes located in the intricate pathways of different cell organelles.
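The C-terminal sorting signals mentioned above can be checked with a very small sequence test. The sketch below looks for a C-terminal KDEL tetrapeptide or a dilysine motif, commonly written KKXX (lysines at positions -4 and -3 from the C-terminus). It is only an illustration and was not part of the prediction protocol of this study; these are eukaryotic ER signals, so a match in a bacterial sequence is not evidence of ER targeting on its own. The function name and example sequences are hypothetical.

```python
def er_retrieval_signal(seq):
    """Return 'KDEL', 'KKXX' or None based on the last four residues."""
    tail = seq.upper()[-4:]
    if len(tail) < 4:
        return None          # too short to carry a four-residue signal
    if tail == "KDEL":
        return "KDEL"        # luminal retrieval signal
    if tail[:2] == "KK":
        return "KKXX"        # dilysine signal: K at positions -4 and -3
    return None


# Hypothetical sequences used only to exercise the function.
for example in ("MAAAKDEL", "MAAAKKSL", "MAAAA"):
    print(example, er_retrieval_signal(example))
```

Only the last four residues are inspected because both signals are defined relative to the C-terminus of the protein.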
Infections are considered an important part of the natural course of cancer. Pneumonia is a very common disease that affects approximately 450 million people worldwide every year (Biscevic-Tokic et al., 2013). Approximately 50%-70% of patients with lung cancer have been reported to present with pneumonia (Akinosoglou et al., 2013). Pneumonia is a major cause of death, accounting for about 7% of all deaths worldwide every year (Ruuskanen et al., 2011). Cases of lung cancer and pneumonia have increased noticeably (Shen et al., 2016). The molecular mechanism of the association of C. pneumoniae with lung cancer has been poorly investigated to date. We predicted the targeting of various enzymes and proteins to the ER of human cells, including glycogen synthase (Accession No. Q9Z6V8), glucose-6-phosphate 1-dehydrogenase (G6PD) (Accession No. Q9Z8U6), ABC transporter (Accession No. Q7VPR2), phosphatidylserine decarboxylase proenzyme (Accession No. Q9Z767), lipoprotein signal peptidase (Accession No. Q9Z817), glycogen phosphorylase (Accession No. Q9Z8N1), and lipid-A-disaccharide synthase (Accession No. Q9Z6U3). Although we predicted 48 proteins to target the ER, the functions of several of these proteins are still unknown. The targeted enzymes and proteins may change the normal process of protein folding in the ER during infection and contribute to the growth of lung cancer. Accumulation of unfolded C. pneumoniae enzymes and proteins in the human ER possibly acts as a source of ER stress and promotes the cleavage of ATF6, which consequently facilitates activation of the unfolded protein response (UPR). UPR activation has been associated with the progression of cancer and lung diseases (Barabutis, 2020; Piton et al., 2016; Wang et al., 2014).
It has been reported that the expression of G6PD is enhanced in various tumors, such as colon cancers, leukemia, gastrointestinal cancers, breast cancers, endometrial carcinomas, liver cancer, and lung cancer (Batetta et al., 1999; Jiang et al., 2013; Rao et al., 1997; van Driel et al., 1999). The functions and other details of the ER-targeted proteins of C. pneumoniae are shown in Tab. 1. Hence, an important basic goal in proteomics and molecular cell biology is to determine the subcellular targeting of proteins and enzymes in the whole cell. It is also essential for prioritizing and selecting appropriate targets for drug development. Unfortunately, both processes are time-consuming, laborious, and expensive. Confirming the subcellular location of proteins and enzymes in the various cell organelles depends entirely on laboratory work.
Cells constantly translate a great variety of proteins and enzymes. The translated proteins need to be folded correctly, post-translationally modified, assembled into complexes, and delivered to their final subcellular destinations. If errors arise during this maturation process, the normal functioning of the cell is disturbed. In the current study we focused on predicting the targeting of C. pneumoniae proteins to the endoplasmic reticulum of host cells.
With the availability of next-generation sequencing data for protein sequences, in-silico approaches are highly desirable for rapidly and efficiently identifying the subcellular locations of uncharacterized proteins based on their sequence information alone. The ER is the main site of protein modification and protein folding. Misfolded proteins stimulate the UPR in the ER (Schmidt et al., 2019), which increases protein-folding capacity to restore homeostasis (Choi and Song, 2020). Dysregulated UPR signalling contributes to pathogenesis during infection. ER stress is related to the pathogenesis of different diseases, including diabetes, neurodegenerative disorders, and cancer. Pathogen proteins targeted to the ER may disturb homeostasis and alter the normal functioning of host cells. These changes may be connected with the abnormal growth of infected cells and act as a prominent factor in cancer growth.

Conclusion
The ER is responsible for various housekeeping functions in the cell and acts as a key organelle for the final maturation of proteins, which controls its homeostatic state. Changes in homeostasis have been associated with the progression of various conditions, including cancer, which is a highly challenging problem at present. C. pneumoniae proteins targeting the ER of human lung cells may disturb the normal functioning of the ER and act as a potential factor in the growth of lung cancer. Alterations in the UPR caused by the accumulation of pathogen proteins are connected with the growth and development of cancer. In the present study we predicted C. pneumoniae proteins targeting the ER and their possible association with lung cancer. Our predictions suggest that intracellular infection of human cells by C. pneumoniae may act as an etiological factor in lung cancer. Therefore, our findings provide a new direction for researchers, scientists, and clinicians in the management and treatment of C. pneumoniae-associated lung cancer.