The relation of TMPRSS2 Gene Polymorphism to COVID-19 Severity of Indonesian Population in Jakarta

Background: COVID-19, the disease caused by the SARS-CoV-2 virus, has diverse manifestations, ranging from asymptomatic infection and mild symptoms such as flu-like illness, through pneumonia, to acute respiratory distress syndrome, which can end in death. The mechanism that produces this wide range of symptoms and severity, and the factors that influence it, are still unclear. During viral internalization, the viral spike protein needs to be cleaved by the serine protease encoded by the TMPRSS2 gene. It is hypothesized that higher expression of the TMPRSS2 gene causes higher virus internalization into cells, leading to more severe symptoms in patients. Methods: A Single Nucleotide Polymorphism (SNP) genotyping test was carried out to examine whether the TMPRSS2 gene affects the severity of COVID-19, as evidenced in other viral respiratory diseases. With a better understanding of gene expression related to this disease, it is hoped that we can better understand the mechanism of COVID-19 and establish better therapies and prevention against it. In this study, 68 COVID-19 patients participated and were categorized into two groups based on their clinical symptoms, namely a mild-asymptomatic group (n=12) and a moderate-severe group (n=56). PBMCs were isolated from the patients. DNA was then extracted and used as a template for SNP genotyping of the TMPRSS2 rs2070788 gene variant. Results: Of the 68 samples, 35 had the homozygous A/A genotype, 29 the heterozygous A/G genotype, and 4 the homozygous G/G genotype. Most samples in the moderate-severe group were homozygous A/A (n=29) or heterozygous A/G (n=23), whereas only 4 were homozygous G/G. In addition, the homozygous G/G genotype was detected only in the moderate-severe group. Conclusions: A larger number of samples from the mild-asymptomatic group is needed to show statistically whether the homozygous G/G variant, or the G allele in general, is associated with the severity of COVID-19.


Introduction
COVID-19 is a newly emerging infectious disease caused by Severe Acute Respiratory Syndrome Coronavirus type 2 (SARS-CoV-2). SARS-CoV-2 belongs to the family Coronaviridae and is classified in the same genus, Betacoronavirus, as SARS-CoV and Middle East Respiratory Syndrome coronavirus (MERS-CoV). The virus is enveloped and spherical, with spikes formed by the S protein on its surface. The virus also produces the M protein, which builds its shape, the transmembrane glycoprotein E, and the N protein (nucleocapsid), which protects its positive-sense single-stranded RNA genome (Wan et al., 2020; Xu et al., 2022).
The main transmission routes of SARS-CoV-2 infection are droplets and fomites. SARS-CoV-2 enters the cell by binding to its receptor, the ACE2 protein, on the surface of type 2 pneumocytes. ACE2 is expressed in many types of cells. As in SARS-CoV and MERS-CoV pathogenesis, the SARS-CoV-2 spike protein binds to its receptor on the cell surface (ACE2 in the case of SARS-CoV-2) and needs to be cleaved to perform membrane fusion. The spike (S) protein consists of two domains, known as S1 and S2. The S1 domain binds to ACE2, and then S1 and S2 are cleaved. After cleavage, S2 undergoes a drastic conformational change, so that fusion between the cell membrane and the viral envelope can occur. This cleavage process requires the activity of host proteases. Proteases known to play a role in this process include the cell-surface transmembrane serine protease 2 (TMPRSS2) and cathepsin B and L (CatB&L) (Shang et al., 2020).
COVID-19 symptoms are very diverse. An infected person can be asymptomatic or can show mild symptoms such as flu-like illness, digestive tract disorders, or pneumonia. In some patients, the condition may lead to acute respiratory distress syndrome and death (Huang et al., 2020). To date, no mechanism, gene expression profile, or protein has been proven to be associated with the diversity of COVID-19 clinical symptoms and severity.
Some current research examines the variability of the TMPRSS2 gene among populations and races, as well as its correlation with the structural properties of the protein, using bioinformatics. Paniri et al. (2020) found 21 polymorphisms in the TMPRSS2 gene that correlate with differences in protein structure and function; for example, rs12329760 affects the protein pocket, rs875393 creates a donor site, and rs12627374 affects the miRNA profile. Irham et al. (2020) showed that the TMPRSS2 genetic variants rs464397, rs469390, rs2070788, and rs383510 affect TMPRSS2 gene expression in the lung. Cheng et al. (2015) showed that TMPRSS2 genetic variants are correlated with the severity of Influenza A H1N1pdm09 and Influenza A H7N9 infection, as well as with susceptibility to infection. In that study, the TMPRSS2 variants that correlated positively with the severity of Influenza A disease were rs2070788 and rs383510, which include the variant used in the current research (Cheng et al., 2015).
Genetic variants affecting TMPRSS2 expression may vary between populations (Irham et al., 2020). However, genetic variant or expression profile data for the TMPRSS2 gene in Indonesia are not yet available. Nevertheless, some research indicates that rs2070788 and rs383510 are responsible for TMPRSS2 expression in the lung in Asian populations, including South East Asia (Irham et al., 2020; Paniri et al., 2020).
Hence, in this research, we examined the hypothesis that TMPRSS2 gene polymorphism correlates with the severity of COVID-19 symptoms, specifically in Indonesia. We performed a Single Nucleotide Polymorphism (SNP) genotyping assay on Indonesian patients with different levels of severity. By understanding the genetic variants of the TMPRSS2 gene at different severity levels among Indonesian COVID-19 patients, we hope to better understand the mechanism and outcome of COVID-19 and thereby support better therapeutic design and prevention.

Research Design and Samples
This research is an analytic observational study using whole blood obtained from hospitalized COVID-19 patients in two hospitals: Pondok Kopi Jakarta Islamic Hospital and Goenawan Partowidigdo Pulmonary Hospital. The Ethical Committee Board of the Faculty of Medicine, Universitas Muhammadiyah Prof. Dr. Hamka Jakarta, approved the ethical clearance. This research was conducted from August 2020 to February 2021. All clinical data during hospitalization and 9 mL of whole blood were obtained from patients who gave their consent. The samples were divided into two categories: (1) the mild-asymptomatic group and (2) the moderate-severe group. Patients with no symptoms, or with mild symptoms such as flu-like illness, anosmia, diarrhea and other digestive tract disorders, up to mild breathing difficulty, were categorized into the mild-asymptomatic group. Patients who needed breathing support due to pneumonia, such as a nasal cannula or a ventilator, or who developed ARDS or died, were categorized into the moderate-severe group. Whole blood samples were processed to obtain PBMCs in the Research Laboratory of the Faculty of Medicine, Universitas Muhammadiyah Prof. Dr. Hamka Jakarta.

Sample or Participant
Whole blood (9 mL) collected from each patient was centrifuged to separate the plasma and cells. Buffy coats were obtained and diluted in phosphate-buffered saline (PBS) (Vivantis). PBMCs were separated from other cells and impurities in the blood by gradient centrifugation with Lymphoprep™ (Stem Cell Technologies), at a 1:1 volume ratio of Lymphoprep to the buffy coat suspension in PBS. PBMCs were washed twice with PBS, resuspended in PBS, and divided into four cryotubes. Afterward, the PBMCs were stored at -80°C until analysis.

Instrument
DNA was isolated from the PBMCs using the Quick-DNA™ Miniprep Plus Kit (Zymo Research) according to the manufacturer's instructions. The isolated DNA was stored at -80°C before being used in the real-time PCR step.

Data collection
Before the SNP genotyping analysis, we reviewed the literature on genetic variants that may affect TMPRSS2 gene expression in Indonesia. The genetic variants rs464397, rs469390, rs2070788, rs383510, rs2298659, rs17854725, rs12329760, and rs3787950 have been predicted bioinformatically to increase TMPRSS2 expression in several different populations (Irham et al., 2020; Paniri et al., 2020; Asselta et al., 2020). Variants rs2070788 and rs383510 were found to affect TMPRSS2 in Asian populations. Furthermore, rs2070788 was related to higher TMPRSS2 expression in the South East Asian population. Hence, we chose rs2070788 as the genetic variant analyzed in this study. SNP genotyping was done using a TaqMan® SNP Genotyping Assay kit (Life Technologies™) ordered and designed for genotyping of the TMPRSS2 gene at rs2070788. The isolated DNA was used as the template for SNP genotyping with the TaqMan® SNP Genotyping Assay kit, per the manufacturer's instructions. Real-time polymerase chain reaction (PCR) for SNP genotyping was run on a QuantStudio™ 5 instrument (Applied Biosystems, Thermo Fisher Scientific).

Data analysis
The SNP genotyping results were analyzed using QuantStudio™ Design and Analysis software (Applied Biosystems, Thermo Fisher Scientific). The allele discrimination plot of the samples was generated using the software. The genotype of each sample was then compared with the severity of the patient's disease. Statistical analysis was done in SPSS software using a non-parametric comparative test (the chi-square test).
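The chi-square comparison described here can be sketched in plain Python. This is only a minimal illustration, not the SPSS procedure used in the study: the function name `pearson_chi_square` is hypothetical, and the input is the genotype-by-severity table reported in the Results (A/A, A/G, and G/G counts for each group).

```python
# Observed genotype counts per severity group, as reported in the Results.
# Column order: A/A, A/G, G/G.
observed = {
    "mild-asymptomatic": [6, 6, 0],
    "moderate-severe": [29, 23, 4],
}


def pearson_chi_square(table):
    """Return (statistic, degrees_of_freedom) for a rows x columns count table.

    Hypothetical helper for illustration; assumes no row or column total is zero.
    """
    rows = list(table.values())
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for r, row in enumerate(rows):
        for c, obs in enumerate(row):
            # Expected count under independence of genotype and severity.
            expected = row_totals[r] * col_totals[c] / grand_total
            statistic += (obs - expected) ** 2 / expected

    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return statistic, dof


chi2_stat, dof = pearson_chi_square(observed)
print(f"chi-square = {chi2_stat:.3f}, degrees of freedom = {dof}")
```

The p-value would then be read from the chi-square distribution with the reported degrees of freedom, for example in a statistics package. The approximation behind that p-value is generally considered unreliable when expected cell counts are small.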

Sample Collection, DNA Isolation
During the research, 100 hospitalized patients from two hospitals agreed to join this study. PBMCs were successfully isolated from 80 samples. However, only 68 samples were included in the SNP genotyping assay. Of these 68 samples, only 12 were categorized in the mild-asymptomatic group, while the other 56 were included in the moderate-severe group. This disparity in sample numbers between groups likely reflects the hospitalization regulations for COVID-19 patients applied in Indonesia. Patients with mild symptoms were not hospitalized but were isolated in COVID-19 isolation facilities provided by the government or self-isolated at home. Therefore, mild or asymptomatic COVID-19 patients were difficult to find at the hospitals where this research was conducted.

SNP Genotyping Assay
SNP genotyping was conducted using the TaqMan® SNP Genotyping Assay kit to detect the polymorphism at rs2070788. The assay was run on a QuantStudio 5 real-time PCR instrument. Subsequently, the allele discrimination plot was generated by the software (Figure 1).

Figure 1. Allele discrimination plot of TMPRSS2 gene SNP genotyping assay in rs2070788
In Figure 1, patients who carry only the A allele at rs2070788 (homozygote A/A) appear as red dots, blue dots represent the homozygote G/G variant, and green dots represent the heterozygote A/G variant. Only four samples were detected as homozygote G/G, while the numbers of homozygote A/A and heterozygote A/G samples were relatively similar (Figure 1). The genotype of each sample was then compared with the severity of the clinical outcome (Table 1).
Thirty-five samples were detected as homozygote A/A; 6 of these were mild-asymptomatic samples and the remaining 29 were moderate-severe samples. Of the 29 samples detected as heterozygote A/G, 6 were classified as mild-asymptomatic and the remaining 23 as moderate-severe. Only four samples were found to be homozygote G/G, and all of them originated from moderate-severe patients. No homozygote G/G was detected in the mild-asymptomatic samples.
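The counts above form a 2 x 3 contingency table, and the imbalance between the two groups (12 versus 56 samples) can be made concrete by computing the expected count of each cell under independence. The sketch below is illustrative only. The threshold of 5 is a commonly used rule of thumb for the chi-square approximation and is an assumption here, not something stated by the study.

```python
# Genotype counts by severity group as reported in the text.
genotypes = ["A/A", "A/G", "G/G"]
observed = {
    "mild-asymptomatic": [6, 6, 0],
    "moderate-severe": [29, 23, 4],
}

row_totals = {group: sum(counts) for group, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
grand_total = sum(col_totals)

# Expected count per cell under independence: row total * column total / N.
expected = {
    group: [row_totals[group] * col / grand_total for col in col_totals]
    for group in observed
}

# Cells whose expected count falls below 5 (assumed rule-of-thumb threshold).
small_cells = [
    (group, genotypes[i], round(value, 2))
    for group, values in expected.items()
    for i, value in enumerate(values)
    if value < 5
]

for group, values in expected.items():
    print(group, [round(v, 2) for v in values])
print("cells with expected count < 5:", small_cells)
```

With these counts, the cells that fall below the threshold are the two G/G cells, which is where the small mild-asymptomatic group limits the analysis most.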

Discussion
During this study, 100 samples were collected from hospitalized COVID-19 patients in 5 months. However, PBMCs were successfully isolated from only 80 samples. The rest failed due to improper storage and distribution from the hospital to the laboratory where the study was carried out. Moreover, in some samples, only very small amounts of PBMCs were isolated. This may be caused by the lymphopenia and leukopenia that commonly occur in COVID-19 patients (DbSNP Reference SNP (Rs) Stand Orientation Reporting Updates, 2019).
The SNP genotyping assay of the TMPRSS2 gene at rs2070788 was done by examining the A and G alleles in the samples. Based on the data for rs2070788 in the NCBI database of Single Nucleotide Polymorphisms (DbSNP Reference SNP (Rs) Stand Orientation Reporting Updates, 2019), both the G and A alleles are commonly found. Data in dbSNP show that allele G is more common than A overall. However, allele frequencies may vary between populations. As seen in DbSNP Reference SNP (Rs) Stand Orientation Reporting Updates (2019) for rs2070788, the populations used in some studies have a higher frequency of A than G.
Several studies involving Asian populations have concluded that the frequency of the A allele is higher than that of G. However, no studies involving Indonesian residents have been recorded in the database for rs2070788. So far, the population in the database that can be used as a reference for the genetic profile of the Indonesian population at rs2070788 is the Vietnamese population (allele frequency G<A), which is the same as other Asian populations (DbSNP Reference SNP (Rs) Stand Orientation Reporting Updates, 2019).
A study by Stropes et al. (2020) presented the use of a peptidomimetic with a ketobenzothiazole warhead, which led to the identification of N-0385, a compound that strongly inhibits TMPRSS2 proteolytic activity (IC50 = 1.9 nM). This study demonstrated a major antiviral candidate showing potent inhibition of SARS-CoV-2 infection in Calu-3 cells, with an EC50 of 2.8 ± 1.4 nM and a selectivity index higher than 1 × 10^6. The potency of N-0385 was validated using two intracellular viral biomarkers of infection and by measuring the release of infectious viral particles. Furthermore, complete inhibition of infection was achieved with 100 nM N-0385 in human donor-derived colonoids, confirming the low nanomolar potency of N-0385 against SARS-CoV-2.
In this research involving the Indonesian population, the homozygote A/A genotype was found in 51.5% of samples, followed by heterozygote A/G in 42.6%, and the remaining 5.9% were homozygote G/G. Homozygote A/A and heterozygote A/G were therefore relatively similar in number. These results suggest that allele A is predominant at rs2070788 in Indonesia. However, more samples from different ethnic groups and islands in Indonesia are required to draw this conclusion with confidence.
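The genotype percentages, and the allele frequencies they imply, follow directly from the genotype counts. The sketch below shows the arithmetic, counting two alleles per sample; the variable names are illustrative only.

```python
# Genotype counts at rs2070788 reported for the 68 genotyped samples.
genotype_counts = {"A/A": 35, "A/G": 29, "G/G": 4}

n_samples = sum(genotype_counts.values())
genotype_freq = {g: n / n_samples for g, n in genotype_counts.items()}

# Each sample carries two alleles: homozygotes contribute two copies of one
# allele, heterozygotes contribute one copy of each.
n_alleles = 2 * n_samples
allele_counts = {
    "A": 2 * genotype_counts["A/A"] + genotype_counts["A/G"],
    "G": 2 * genotype_counts["G/G"] + genotype_counts["A/G"],
}
allele_freq = {a: n / n_alleles for a, n in allele_counts.items()}

for genotype, freq in genotype_freq.items():
    print(f"{genotype}: {freq:.1%}")
for allele, freq in allele_freq.items():
    print(f"allele {allele}: {freq:.1%}")
```

On these counts the A allele accounts for 99 of 136 alleles (about 73%) and the G allele for 37 of 136 (about 27%), consistent with allele A being predominant in this sample.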
Regarding the distribution of the homozygote G/G variant between severity groups, Cheng et al. (2015) stated that the homozygote G/G variant of rs2070788 correlated with severe outcomes of Influenza A infection in an Asian (Hong Kong) population. Our results found homozygote G/G in the moderate-severe group but none in the mild-asymptomatic group. However, these results are still insufficient to establish a correlation between severity level and genetic variant. The discrepancy in sample numbers between the two severity groups is very large. Hence, the statistical test cannot be performed yet. The number of samples from the mild-asymptomatic group needs to be increased to obtain more data on the genetic variation so that the statistical test can be performed.
ISSN: 2614-1558

Conclusions
The homozygote A/A genotype of TMPRSS2 rs2070788 was the predominant genotype among COVID-19 patients in Indonesia. The homozygote G/G variant, however, was found only in moderate-severe COVID-19 patients in the same population.

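Table 1. Genotype distribution between severity level groups

Genotype            Mild-asymptomatic   Moderate-severe   Total
Homozygote A/A      6                   29                35
Heterozygote A/G    6                   23                29
Homozygote G/G      0                   4                 4
Total               12                  56                68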