Accuracy of administrative claim data for gastric adenoma after endoscopic resection

Article information

Clin Endosc. 2023;56(3):325-332
Publication date (electronic) : 2023 March 21
doi :
1Division of Gastroenterology, Department of Internal Medicine, Seoul St. Mary’s Hospital, College of Medicine, The Catholic University of Korea, Seoul, Korea
2Division of Gastroenterology, Department of Internal Medicine, Uijeongbu St. Mary’s Hospital, College of Medicine, The Catholic University of Korea, Uijeongbu, Korea
3Catholic Photomedicine Research Institute, Seoul, Korea
4Department of Internal Medicine, Myongji Hospital, Hanyang University College of Medicine, Goyang, Korea
Correspondence: Jae Myung Park Division of Gastroenterology, Department of Internal Medicine, Seoul St. Mary’s Hospital, College of Medicine, The Catholic University of Korea, 222 Banpo-daero, Seocho-gu, Seoul 06591, Korea E-mail:

Ga-Yeong Shin and Hyun Ho Choi contributed equally to this study.

Received 2022 May 6; Revised 2022 June 11; Accepted 2022 June 22.



Administrative databases provide valuable information for large-cohort studies. This study aimed to evaluate the diagnostic accuracy of an administrative database for resected gastric adenomas.


Data of patients who underwent endoscopic resection for benign gastric lesions were collected from three hospitals. Gastric adenoma cases were identified in the hospital database using International Classification of Diseases (ICD) 10-codes. The non-adenoma group included patients without gastric adenoma codes. The diagnostic accuracy for gastric adenoma was analyzed based on the pathological reports of the resected specimen.


Among 5,095 endoscopic resections with codes for benign gastric lesions, 3,909 patients were included in the analysis. Among them, 2,831 and 1,078 patients were allocated to the adenoma and non-adenoma groups, respectively. Regarding the overall diagnosis of gastric adenoma with ICD-10 codes, the sensitivity, specificity, positive predictive value, and negative predictive value were 98.7%, 88.5%, 95.2%, and 96.8%, respectively. There were no significant differences in these parameters between the tertiary and secondary centers.


Administrative codes of gastric adenoma, according to ICD-10 codes, showed good accuracy and can serve as a useful tool to study prognosis of these patients in real-world data studies in the future.


Korea has a National Health Insurance System (NHIS) that covers more than 90% of the country’s medical costs. This system contains a complete set of health-related information including demographic, clinical, diagnostic, and therapeutic databases. The source of the NHIS is the Health Insurance Review and Assessment database, which includes all insurance claims information of approximately 97% of the Korean population. In this database, the names of the diseases are coded according to the International Classification of Diseases, 10th revision edition (ICD-10 code), published by the World Health Organization.1

Administrative databases have been widely used for medical research because they include medical information of a large population along with long-term follow-up data. Their reliability is critical for proper analysis of the results derived from these databases. However, the data may not be completely accurate because they are mainly used for insurance claim purposes. Some studies have shown that only about 70% of primary diagnosis codes concurred with medical records, and clinical diagnoses were made using different subjective diagnostic criteria.2,3 Therefore, the NHIS data should be validated before applying it to various studies.4

Gastric cancer is the fifth most common cancer worldwide and has the third highest cancer-related mortality.5 Gastric adenoma is a precursor to gastric cancer, and its diagnosis and treatment are important for early detection and prevention of gastric cancer.6 Gastric adenoma is histologically subdivided into high- and low-grade pathology, both of which have the potential to progress to cancer. Since highly dysplastic adenomas show malignant changes in 60% to 85% of cases,7-9 these lesions should be removed. Low-grade adenoma has a less than 10% chance of progressing to cancer.8,10 However, 12% to 63% of low-grade adenomas diagnosed by forceps biopsy are upgraded to either high-grade adenoma or early gastric cancer in the pathology of the resected specimen.11-13 Therefore, current clinical guidelines recommend removal of gastric adenoma regardless of the pathological grade.14,15

In areas where gastric cancer is common, such as Korea, screening endoscopy is performed, resulting in frequently encountered gastric adenomas. With the growing recognition of gastric adenoma, correctly identifying patients with gastric adenoma is vital to define the risk stratification or follow-up strategies. Therefore, validated algorithms for identifying patients with gastric adenomas are essential to define a cohort of these patients accurately and consistently for further nationwide studies. Direct validation of the accuracy between the administrative dataset and the NHIS data is impossible because of the Personal Information Protection Act in Korea. Therefore, validation of the accuracy and usefulness of diagnostic codes could only be performed at individual hospitals where the diagnosis of each disease was performed and reported to the Health Insurance Review and Assessment for insurance claims.

We hypothesized that the combination of a diagnostic definition using ICD-10 codes and endoscopic procedure codes can accurately identify gastric adenomas. The aim of our study was to determine the accuracy of diagnosing gastric adenoma resected by endoscopic procedure.



From January 2009 to December 2019, cases of endoscopic resection for gastric lesions were retrospectively collected from the Clinical Data Warehouse of a tertiary university hospital, Seoul St. Mary’s Hospital (Seoul, Korea). We also collected data from two secondary referral centers, a university hospital (Uijeongbu St. Mary’s Hospital, Uijeongbu, Korea) and a secondary referral hospital (Myongji Hospital, Goyang, Korea), between January 2019 and December 2021. The electronic medical record system contains information on the hospital departments visited, principal diagnoses, surgical and diagnostic procedures, endoscopy documentation, and pathology reports for each patient.

Study population

We retrospectively collected data from patients without cancer who underwent endoscopic resection for gastric lesions (Table 1), including age, sex, diagnostic codes, and pathology reports. Chart reviews were manually conducted by the investigators (GYS, HHC, and JMP) using standardized data extraction forms. Patients in the adenoma group were defined as those with D002, D131, or D371 codes according to the ICD-10 (Table 1) within 6 months before or after endoscopic resection. The non-adenoma group included cases without any ICD-10 codes for gastric adenoma. We excluded cases with a history of gastric cancer, a gastric cancer code (C16) within 6 months after endoscopic resection, or a first diagnosis of carcinoma on biopsy.
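The grouping rule above can be sketched in code. The following is a simplified illustration, not the extraction logic actually used in the study: the function name, the input layout (a list of code and date pairs), and the approximation of 6 months as 183 days are all assumptions, and the exclusions based on prior cancer history or carcinoma on biopsy are omitted because they require chart review.

```python
from datetime import date, timedelta

# ICD-10 codes treated as gastric adenoma codes in Table 1.
ADENOMA_CODES = {"D002", "D131", "D371"}
# Assumption: "6 months" is approximated as 183 days.
WINDOW = timedelta(days=183)


def classify_case(resection_date, coded_diagnoses):
    """Return "excluded", "adenoma", or "non-adenoma" for one resection.

    coded_diagnoses is a list of (ICD-10 code, date coded) pairs.
    This hypothetical helper covers only the code-based rules.
    """
    in_window = [
        code for code, coded_on in coded_diagnoses
        if abs(coded_on - resection_date) <= WINDOW
    ]
    # A gastric cancer code (C16 or a subcode) near the resection: excluded.
    if any(code.startswith("C16") for code in in_window):
        return "excluded"
    if any(code in ADENOMA_CODES for code in in_window):
        return "adenoma"
    # An adenoma code entered more than 6 months after resection: excluded.
    if any(
        code in ADENOMA_CODES and coded_on - resection_date > WINDOW
        for code, coded_on in coded_diagnoses
    ):
        return "excluded"
    return "non-adenoma"


resection = date(2019, 5, 1)
print(classify_case(resection, [("D131", date(2019, 4, 1))]))
```

Applying the cancer-code check before the adenoma-code check is a design choice of this sketch, so that a case carrying both codes is excluded rather than counted as an adenoma.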

Disease and treatment code used in the inclusion and exclusion criteria


Gastric adenoma was diagnosed as low- or high-grade dysplasia by reviewing endoscopy and pathology reports.

Statistical analysis

We evaluated the validity of gastric adenoma-related ICD-10 codes by comparing them with the diagnosis derived from a comprehensive manual review of medical records. After reviewing the charts and grouping each patient, the sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated. Results are expressed as mean±standard deviation. In addition, 95% confidence intervals (CIs) were calculated using IBM SPSS ver. 20.0 for Windows (IBM Corp.). The significance level was set at p<0.05.
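For readers who want to retrace these definitions, the four measures can be computed from a 2×2 table of administrative code against pathology. The sketch below is illustrative only: the function name is hypothetical, and the input counts are the overall counts reported in Table 2.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV, and NPV from a 2x2 table.

    tp: adenoma code, adenoma on pathology
    fp: adenoma code, no adenoma on pathology
    fn: no adenoma code, adenoma on pathology
    tn: no adenoma code, no adenoma on pathology
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Overall counts as reported in Table 2.
overall = diagnostic_metrics(tp=2696, fp=135, fn=35, tn=1043)
for name, value in overall.items():
    print(f"{name}: {value:.1%}")
```

With these counts the sketch yields 98.7%, 88.5%, 95.2%, and 96.8%, matching the overall values reported in the Results.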

Ethical statements

The Institutional Review Board of the Catholic University of Korea (IRB No: KC19ZESI0679) approved this study, and the requirement of informed consent was waived because anonymous data were used.


Diagnostic accuracy for code of gastric adenoma

A total of 5,095 patients were identified as registered for endoscopic treatment of benign gastric lesions (Fig. 1). Among them, we excluded patients with a code of gastric cancer 6 months before or after the resection date (n=959), previous history of gastric cancer (n=168), and those with a code of gastric adenoma more than 6 months after endoscopic resection (n=59). Finally, 3,909 patients were included in the analysis. Their mean age was 62.4±11.3 years, and 2,336 patients (59.8%) were male.
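The exclusion flow can be checked with simple arithmetic. The short sketch below uses only the counts stated in this paragraph and the group sizes reported for the adenoma and non-adenoma groups; the variable names are illustrative.

```python
identified = 5095
exclusions = {
    "gastric cancer code within 6 months of resection": 959,
    "previous history of gastric cancer": 168,
    "adenoma code more than 6 months after resection": 59,
}

remaining = identified
for reason, n in exclusions.items():
    remaining -= n
    print(f"minus {n:>3} ({reason}): {remaining} remain")

# Group sizes reported for the analyzed population.
adenoma_group, non_adenoma_group = 2831, 1078
assert remaining == adenoma_group + non_adenoma_group
```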

Fig. 1.

Study algorithm for the inclusion and classification of subjects.

In the adenoma group (n=2,831), a pathologic diagnosis of low- or high-grade dysplasia was found in 2,696 cases (95.2%), 2,108 (74.5%) of which were low-grade and 588 (20.8%) were high-grade. According to the referral centers, the number of patients with adenomas was 2,185 and 407 in the tertiary and secondary university hospitals, respectively, and 104 in the secondary referral hospital (Table 2).

Diagnosis of gastric adenoma after endoscopic resection

The pathological diagnoses of the remaining cases (n=135) were subepithelial tumors (leiomyoma, ectopic pancreas), polyps (inflammatory, hyperplastic, fundic, or hamartomas), and chronic gastritis. In the non-adenoma group (n=1,078), 1,043 patients did not have adenomas, whereas 35 were found to have gastric adenoma after resection.

Regarding the overall diagnosis of gastric adenoma, the sensitivity, specificity, PPV, and NPV were 98.7% (95% CI, 98.2%–99.1%), 88.5% (95% CI, 86.5%–90.3%), 95.2% (95% CI, 94.4%–96.0%), and 96.8% (95% CI, 95.5%–97.7%), respectively.
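The text does not state which method was used to compute the confidence intervals. As one plausible illustration, the sketch below applies the Wilson score interval to the overall sensitivity counts (2,696 of 2,731 pathologic adenomas carried an adenoma code). The choice of the Wilson method and the function name are assumptions, not the authors' stated procedure.

```python
from math import sqrt


def wilson_interval(successes, total, z=1.959964):
    """Wilson score interval for a binomial proportion (default: 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half_width = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half_width, center + half_width


# Sensitivity: adenoma code present among all pathologic adenomas.
low, high = wilson_interval(2696, 2696 + 35)
print(f"{low:.1%} to {high:.1%}")
```

For the sensitivity this gives roughly 98.2% to 99.1%. Other interval methods can differ in the last decimal place, so small discrepancies with reported intervals are expected.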

Figure 2A shows the enrollment of the study population from a tertiary hospital. A total of 3,081 patients were analyzed after applying the exclusion criteria. Among them, 2,296 patients had gastric adenoma codes and 785 had codes of non-adenomatous gastric lesions. Their mean age was 61.5±11.2 years, and 60.0% were male. In the tertiary hospital, the sensitivity, specificity, PPV, and NPV were 99.4% (95% CI, 99.0%–99.7%), 86.4% (95% CI, 83.9%–88.5%), 94.7% (95% CI, 93.7%–95.6%), and 98.3% (95% CI, 97.1%–99.1%), respectively.

Fig. 2.

Patient enrollment in each hospital. A tertiary (A) and secondary (B, C) referral center.

Figure 2B shows the enrollment of the study population from a secondary university hospital. A total of 609 patients were analyzed after applying the exclusion criteria. Among them, 429 patients had gastric adenoma codes and 180 had codes of non-adenomatous gastric lesions. Their mean age was 65.4±10.3 years, and 366 (60.1%) were male. In this hospital, the sensitivity, specificity, PPV, and NPV were 96.9% (95% CI, 94.6%–98.3%), 88.4% (95% CI, 82.7%–92.4%), 94.9% (95% CI, 92.2%–96.7%), and 92.8% (95% CI, 87.7%–95.9%), respectively.

Figure 2C shows the enrollment of the study population from a secondary referral hospital. A total of 219 patients were included in this analysis. Among them, 106 had gastric adenoma codes and 113 did not. Their mean age was 65.9±13.3 years, and 121 (55.3%) were male. For the gastric adenoma codes, the sensitivity, specificity, PPV, and NPV were 92.0% (95% CI, 85.0%–96.1%), 98.1% (95% CI, 92.7%–99.7%), 98.1% (95% CI, 92.7%–99.7%), and 92.0% (95% CI, 85.0%–96.1%), respectively.

Diagnostic accuracy according to the individual codes for gastric adenoma

The most common code for gastric adenoma was D131 (n=2,530, 89.4%), followed by D002 (n=291, 10.3%) and D371 (n=10, 0.4%). Among the three codes, the most sensitive was D131, with a sensitivity of 98.9% (95% CI, 98.4%–99.3%). When this code was combined with D002, the sensitivity of adenoma diagnosis improved to 99.0% (95% CI, 98.6%–99.4%) (Table 3). Adding D371 to this combination did not significantly improve the diagnostic accuracy compared with that of the combination of D131 and D002, as shown in Table 3.

Diagnostic accuracy according to the diagnosis codes and their combinations


This study demonstrated that the gastric adenoma codes of patients who underwent endoscopic resection in an administrative database were valid in population-based large-cohort studies. There was a high diagnostic accuracy of greater than 95% for sensitivity, PPV, and NPV. The specificity was greater than 87%. The level of diagnostic accuracy was equally high in the administrative code for gastric adenoma between the tertiary and secondary referral centers. Our results support the reliability of previous large-cohort studies using the administrative databases in Korea.

Administrative databases of various disease registries have been used for population-based clinical studies. The reliability of the results is affected not only by the size of the data, but also by the setting of the correct patient and non-adenoma groups. In Korea, endoscopy is performed every two years for Koreans over 40 years of age, as a nationwide screening. Therefore, the detection rate for stomach-related diseases is high. Clinical studies using national big data have been conducted for gastric cancer because its diagnosis using administrative codes is highly accurate.16-19

We sought to determine the accuracy of the ICD-10 codes for gastric adenoma registered in the NHIS because gastric adenoma is a gastric neoplasm, which is a major public health concern in Korea. Gastric adenoma is one of the precursors of gastric cancer and is removed endoscopically in the belief that its removal reduces gastric cancer-related mortality. However, securing diagnostic accuracy is a priority for determining the prognosis of adenoma patients and for stratifying gastric cancer risk after resection at the national level.

We included only patients treated for adenomas in this study. It is better to confirm the diagnostic code of gastric adenoma after resection rather than by endoscopic biopsy because of the possibility of pathological up- or downgrade, considering its high degree of pathological heterogeneity.9,12,20 Our present study also excluded about 20% of patients with gastric adenoma because a gastric cancer code (C16) had been entered. In contrast, some cases were pathologically downgraded to chronic inflammation.21 Notably, pathologists have low inter-observer agreement in diagnosing lesions as low-grade. Furthermore, diagnostic codes before resection are frequently mixed with those of other diseases, such as gastric subepithelial tumors.

Gastric adenoma is a precursor of gastric cancer. Therefore, once detected, gastric adenomas are removed through endoscopy in most cases. This explains the high sensitivity of the codes. In contrast to the remarkably high sensitivity, the specificity was less than 90%. This may be related to the downgrading of the pathology after an initial diagnosis of low-grade adenoma. Since low-grade dysplasia in flat lesions shows minimal architectural disarray and cytological atypia,22 it is often difficult to distinguish inflammation from low-grade adenoma, which increases inter-observer variation in the distinction of reactive atypia from true dysplasia.23

Based on the results of this study, when a patient who underwent endoscopic resection had a gastric adenoma code, the accuracy and PPV for the diagnosis of gastric adenoma were very high. Since Korea has a screening strategy for gastric cancer using endoscopy, the diagnosis of gastric adenoma is expected to increase further in the future. Therefore, the number of patients with gastric adenoma removal will also increase, and the results of this study will be useful to study the clinical significance of gastric adenoma removal and in studies on the frequency and intensity of follow-up in patients with gastric adenoma removal. Furthermore, it seems necessary to check the change in the gastric cancer-related mortality rate, using data of the results of the endoscopic treatment of adenoma.

Our study has several strengths. First, this is the first study to investigate the validity of administrative codes for gastric adenomas. Second, all cases were diagnosed by pathological confirmation. Third, we compared the validity of administrative codes by considering a non-adenoma group of patients, without gastric adenomas, who underwent endoscopic resection of gastric lesions.

The present study has several limitations. First, the study was conducted at only three centers. However, these centers were in different cities and included one tertiary and two secondary centers. We believe that our data represent the entire gastric adenoma dataset of the NHIS in Korea. Second, this study included patients treated for gastric adenomas for the first time. Therefore, some patients with both gastric cancer and adenoma may have been excluded from this study. Third, the codes assigned to high-grade dysplasia, which is close to carcinoma, and low-grade dysplasia, which is close to adenoma, can differ from hospital to hospital. For example, some hospitals do not assign code D002 even for high-grade dysplasia and instead use code D131. Therefore, big data obtained in this way may have limitations in clearly differentiating high- and low-grade adenomas.

In conclusion, we validated the ICD-10 diagnostic codes for gastric adenomas. The ICD-10 codes of gastric adenoma after endoscopic resection in an administrative database are acceptable for use in population-based, large-cohort studies.


Conflicts of Interest

The authors have no potential conflicts of interest.


This research was supported by the Basic Science Research Program through the National Research Foundation of Korea, funded by the Ministry of Education, Science, and Technology (2019R1A5A2027588 and 2020R1F1A1076448).

Author Contributions

Conceptualization: GYS, JMP; Data curation: GYS, HHC, JMP; Formal analysis: GYS, HHC, JMP; Methodology: DK, SYK, YKC, SSK, MGC; Supervision: DK, YKC, SSK, MGC; Writing–original draft: GYS, JMP, HHC, SYK, MGC; Writing–review & editing: GYS, HHC, SYK, JYP, DK, YKC, SSK, MGC.


1. Office of the Secretary, HHS. Administrative simplification: adoption of a standard for a unique health plan identifier; addition to the National Provider Identifier requirements; and a change to the compliance date for the International Classification of Diseases, 10th edition (ICD-10-CM and ICD-10-PCS) medical data code sets. Final rule. Fed Regist 2012;77:54663–54720.
2. Jeong HS, Lee JH, Shin JW, et al. Scale and structure of 2006 total health expenditure in Korea constructed according to OECD/WHO/EUROSTAT’s SHA (System of Health Accounts). Korean J Health Econ Policy 2008;14:151–169.
3. Park B, Park P, Sung K. Validity of diagnosis code on National Health Insurance Claim Database Seoul National University School of Medicine; 2003.
4. Aljunid SM, Srithamrongsawat S, Chen W, et al. Health-care data collecting, sharing, and using in Thailand, China mainland, South Korea, Taiwan, Japan, and Malaysia. Value Health 2012;15(Suppl 1):S132–S138.
5. Bray F, Ferlay J, Soerjomataram I, et al. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin 2018;68:394–424.
6. Lauwers GY, Riddell RH. Gastric epithelial dysplasia. Gut 1999;45:784–790.
7. Di Gregorio C, Morandi P, Fante R, et al. Gastric dysplasia: a follow-up study. Am J Gastroenterol 1993;88:1714–1719.
8. Yamada H, Ikegami M, Shimoda T, et al. Long-term follow-up study of gastric adenoma/dysplasia. Endoscopy 2004;36:390–396.
9. Lansdown M, Quirke P, Dixon MF, et al. High grade dysplasia of the gastric mucosa: a marker for gastric carcinoma. Gut 1990;31:977–983.
10. Rugge M, Cassaro M, Di Mario F, et al. The long term outcome of gastric non-invasive neoplasia. Gut 2003;52:1111–1116.
11. Maekawa A, Kato M, Nakamura T, et al. Incidence of gastric adenocarcinoma among lesions diagnosed as low-grade adenoma/dysplasia on endoscopic biopsy: a multicenter, prospective, observational study. Dig Endosc 2018;30:228–235.
12. Cho SJ, Choi IJ, Kim CG, et al. Risk of high-grade dysplasia or carcinoma in gastric biopsy-proven low-grade dysplasia: an analysis using the Vienna classification. Endoscopy 2011;43:465–471.
13. Kang DH, Choi CW, Kim HW, et al. Predictors of upstage diagnosis after endoscopic resection of gastric low-grade dysplasia. Surg Endosc 2018;32:2732–2738.
14. Dinis-Ribeiro M, Areia M, de Vries AC, et al. Management of precancerous conditions and lesions in the stomach (MAPS): guideline from the European Society of Gastrointestinal Endoscopy (ESGE), European Helicobacter Study Group (EHSG), European Society of Pathology (ESP), and the Sociedade Portuguesa de Endoscopia Digestiva (SPED). Endoscopy 2012;44:74–94.
15. ASGE Standards of Practice Committee, Evans JA, Chandrasekhara V, et al. The role of endoscopy in the management of premalignant and malignant conditions of the stomach. Gastrointest Endosc 2015;82:1–8.
16. Kim MH, Chang J, Kim WJ, et al. Cumulative dose threshold for the chemopreventive effect of aspirin against gastric cancer. Am J Gastroenterol 2018;113:845–854.
17. Shin GY, Park JM, Hong J, et al. Use of proton pump inhibitors vs histamine 2 receptor antagonists for the risk of gastric cancer: population-based cohort study. Am J Gastroenterol 2021;116:1211–1219.
18. Nam JH, Jang SI, Park HS, et al. The effect of menopausal hormone therapy on gastrointestinal cancer risk and mortality in South Korea: a population-based cohort study. BMC Gastroenterol 2021;21:440.
19. Kim J, Hyun HJ, Choi EA, et al. Metformin use reduced the risk of stomach cancer in diabetic patients in Korea: an analysis of Korean NHIS-HEALS database. Gastric Cancer 2020;23:1075–1083.
20. Kim JM, Sohn JH, Cho MY, et al. Pre- and post-ESD discrepancies in clinicopathologic criteria in early gastric cancer: the NECA-Korea ESD for Early Gastric Cancer Prospective Study (N-Keep). Gastric Cancer 2016;19:1104–1113.
21. Kim YJ, Park JC, Kim JH, et al. Histologic diagnosis based on forceps biopsy is not adequate for determining endoscopic treatment of gastric adenomatous lesions. Endoscopy 2010;42:620–626.
22. Goldstein NS, Lewin KJ. Gastric epithelial dysplasia and adenoma: historical review and histological criteria for grading. Hum Pathol 1997;28:127–133.
23. Kim JM, Sohn JH, Cho MY, et al. Inter-observer reproducibility in the pathologic diagnosis of gastric intraepithelial neoplasia and early carcinoma in endoscopic submucosal dissection specimens: a multi-center study. Cancer Res Treat 2019;51:1568–1577.

Table 1.

Disease and treatment code used in the inclusion and exclusion criteria

Classification                      Code               Name
Gastric adenoma (ICD-10 code)       D002               Carcinoma in situ of stomach
                                    D131               Benign neoplasm of stomach
                                    D371               Neoplasm of uncertain behavior of stomach
Endoscopic resection (claim code)   Q7652              Endoscopic operation of upper gastrointestinal tumor-mucosal resection and submucosal resection
                                    Q7651              Endoscopic operation of upper gastrointestinal tumor-removal or ablation
                                    QZ933              Endoscopic operation of upper gastrointestinal tumor-submucosal dissection-stomach
                                    Q7653 (from 2018)  Endoscopic operation of upper gastrointestinal tumor-submucosal dissection-stomach

ICD, International Classification of Diseases.

Table 2.

Diagnosis of gastric adenoma after endoscopic resection

Administrative code            Pathology in resected specimen
                               Adenoma         Non-adenoma
Adenoma code
  Overall (n=2,831)            2,696 (95.2)    135 (4.8)
  Tertiary center (n=2,296)    2,185 (95.2)    111 (4.8)
  Secondary center (n=429)     407 (94.9)      22 (5.1)
  Secondary hospital (n=106)   104 (98.1)      2 (1.9)
Non-adenoma code
  Overall (n=1,078)            35 (3.2)        1,043 (96.8)
  Tertiary center (n=785)      13 (1.7)        772 (98.3)
  Secondary center (n=180)     13 (7.2)        167 (92.8)
  Secondary hospital (n=113)   9 (8.0)         104 (92.0)

Values are presented as number (%).

Table 3.

Diagnostic accuracy according to the diagnosis codes and their combinations

Diagnosis code    Sensitivity (%)    Specificity (%)    Positive predictive value (%)    Negative predictive value (%)
D131              98.9 (98.4–99.3)   87.8 (85.7–89.7)   94.9 (94.0–95.6)                 97.3 (96.1–98.2)
D002              91.8 (88.1–94.5)   99.8 (99.2–99.9)   99.3 (97.3–99.8)                 97.3 (96.2–98.1)
D371              21.2 (9.0–38.9)    99.7 (99.1–99.9)   70.0 (38.7–89.6)                 97.3 (96.8–97.7)
D002+D131         99.0 (98.6–99.4)   87.7 (85.6–89.6)   95.3 (94.6–96.0)                 97.3 (96.1–98.2)
D002+D131+D371    98.7 (98.2–99.1)   88.5 (86.6–90.3)   95.2 (94.5–95.9)                 96.8 (95.5–97.6)

Values in parentheses are 95% confidence intervals.