Missing laboratory results data in electronic health databases: implications for monitoring diabetes risk
Abstract
Aim: Laboratory test (lab) results may be useful to detect incident diabetes in electronic health record and claims-based studies. Research design & methods: Using the Mini-Sentinel distributed database, we assessed the value of lab results added to diagnosis codes and dispensing claims to identify incident diabetes. Results: Inclusion of lab results increased the number of diabetes outcomes identified by 21%. In settings where capture of lab results was relatively complete, the absence of lab results was associated with implausibly low rates of the outcome. Conclusion: Lab results can increase sensitivity of algorithms for detecting diabetes, and missing lab results are associated with much lower rates of diabetes ascertainment regardless of algorithm. Patterns of missing lab results may identify ascertainment bias.
First draft submitted: 24 May 2016; Accepted for publication: 21 September 2016; Published online: 9 December 2016
Incident Type 2 diabetes mellitus (T2DM) is defined in part by elevated hemoglobin A1c (HbA1c), serum glucose or capillary glucose (collectively, diabetes labs). Results from these tests are necessary to make the clinical diagnosis, and are often the first and only evidence of asymptomatic disease [1]. As laboratory test (lab) results become more widely available in the large databases used to evaluate the real-world benefits and risks of medications, the use of those lab results will presumably increase the sensitivity of computable phenotypes for diabetes [2]. But it is unclear whether this translates into real advantages in terms of ascertaining more outcomes, more accurately and sooner.
Against the theoretical advantages of using lab results are also theoretical pitfalls, particularly missing data. Two common patterns of missingness are routinely encountered. First, a lab may be performed but its results may not be captured by a claims or electronic health record (EHR)-based data source. In this case, there may be evidence from a procedure claim that a lab was performed but no associated result, or there may be no direct evidence at all that the lab was done. Because this missingness results from the way that data are recorded and handled within a health system, we term it ‘organization-level missingness’.
A second type of missingness, with more complex properties, may occur if providers only choose to order a test in certain patients. For example, providers may be more likely to order an HbA1c in patients who have diabetes risk factors. This kind of missingness has a high probability of occurring not at random. It should not be unique to lab results, because providers might also be less likely to record a diagnosis or prescribe a treatment in some patients than in others. However, it may be easier to identify this type of missingness in lab data because, in databases with good capture of all lab results, it is clear when a lab was completed more often for some patient groups than for others. We term this type of missingness ‘patient-level missingness’.
We undertook a cohort study in the Mini-Sentinel distributed database (MSDD) to assess the potential impact of using lab results, added to diagnosis codes and medication records, to identify incident T2DM, and to assess whether missing lab results limited their usefulness. The cohort study, which was intended as a test case to examine methodological rather than clinical issues, identified adult new initiators of second-generation antipsychotics (SGAs). SGAs are known to increase the risk of T2DM [3–7]; the risk is lower with aripiprazole than with the other commonly used SGAs (olanzapine, quetiapine and risperidone).
Research design & methods
We used the MSDD to compare rates of incident diabetes in new users of different SGAs, defined with and without the use of lab results. This study was a test case to assess the effect of the use of lab results on estimated incident diabetes rates and relative risk estimates, as well as the potential effect of missing lab results on these findings. Missing lab results were defined as the absence of any glucose or HbA1c result values available in the MSDD during the baseline period (baseline labs) or during the follow-up period (follow-up labs).
Data source
The MSDD, developed under the US FDA's Mini-Sentinel program, is an increasingly important tool for active surveillance of medical product safety. Participating data partners (institutions that collect healthcare data and can contribute it to the MSDD – in other words, large national insurers, and organizations that incorporate both insurance and healthcare functions) transform data derived from claims records as well as EHRs into a common data model (a standardized list of variables and definitions that enables data from different data partners to be directly compared and analyzed).
On an individual patient level, the date at which patients begin and end enrollment with a data partner is routinely captured. In addition to standard demographic fields, diagnosis codes and prescription claim records, this data model includes a lab results table, which is populated to the extent possible depending on the amount of lab data captured by each data partner. Data are held at each data partner, and queries can be distributed to each partner, run at that site and the results later combined, a process that maintains local control of data, maximizing data security and confidentiality [8].
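The distributed pattern described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Mini-Sentinel implementation: the `SiteResult` type, the `run_local_query` and `combine` functions, and the toy records are all hypothetical. The point it shows is that patient-level rows stay at each data partner and only aggregate counts are pooled.

```python
from dataclasses import dataclass


@dataclass
class SiteResult:
    """Aggregate counts returned by one data partner (hypothetical shape)."""
    site: str
    n_patients: int
    n_with_lab: int


def run_local_query(site, records):
    """Runs at the data partner: reduce patient-level rows to counts."""
    n_with_lab = sum(1 for r in records if r["has_lab_result"])
    return SiteResult(site=site, n_patients=len(records), n_with_lab=n_with_lab)


def combine(results):
    """Runs centrally: pool the site-level aggregates only."""
    return {
        "n_patients": sum(r.n_patients for r in results),
        "n_with_lab": sum(r.n_with_lab for r in results),
    }


# Toy patient-level data (made-up values) that never leaves each site.
site_a = [{"has_lab_result": True}, {"has_lab_result": False}]
site_b = [
    {"has_lab_result": True},
    {"has_lab_result": True},
    {"has_lab_result": False},
]

results = [run_local_query("A", site_a), run_local_query("B", site_b)]
pooled = combine(results)
```

In this sketch the central step never sees an individual record, which mirrors the local-control property described in the text.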
MSDD has 19 data partners, not all of whom contribute results to the lab results table. For this test case, three data partners were chosen to contribute data. These were a small integrated delivery system (site one), a larger integrated delivery system (site two) and a large national insurer (site three). Sites one and two capture outpatient lab results for 90–100% of their members. The third (site three) was a commercial insurer that captured outpatient lab results only from certain vendors; this arrangement typically resulted in capture of lab results for 15–30% of patients. The low rate of available lab results data for this site reflects that, apart from specific arrangements with a nonrandom minority of vendors, large national health insurers do not routinely have lab results data available to them. The mix of sites one, two and three was chosen to provide examples of different types of data partners, which are potentially very diverse in terms of both how they capture data as well as patient and provider characteristics [9].
Point of care testing is generally not captured at any data partner, and most data partners have low or nonexistent capture rates from inpatient or emergency department encounters.
The director of the Department of Health and Human Services Office for Human Research Protections has determined that the Common Rule does not apply to activities conducted as part of the Sentinel initiative. Mini-Sentinel activities are public health surveillance activities not under the purview of institutional review boards [10]. Health Insurance Portability and Accountability Act regulations for public health surveillance do apply. Personally identifiable health information is not transmitted to the Mini-Sentinel Operations Center or to the FDA [11].
Patients
The cohort identified consisted of adult new users of SGAs, defined as individuals >21 years of age who had a minimum 183 days of health plan enrollment with insurance coverage prior to a first dispensing of an SGA between 1 January 2008 and 31 October 2012. Patients were classified into one of four mutually exclusive exposure groups depending on the specific SGA dispensed: aripiprazole, olanzapine, quetiapine and risperidone. Rates of use of other SGAs were extremely low, so they were not included. Individuals with evidence of diabetes were excluded; exclusion criteria were any dispensing of an antidiabetic medication, coded diabetes diagnosis or elevated diabetes lab during the baseline period. Patients with baseline pregnancy or polycystic ovarian syndrome were also excluded.
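The eligibility rules above can be expressed as a single predicate. The sketch below is illustrative only: the dictionary keys and the two example patients are hypothetical, and the baseline-evidence flags are assumed to have been derived already from diagnosis, dispensing and lab tables.

```python
from datetime import date

STUDY_START = date(2008, 1, 1)
STUDY_END = date(2012, 10, 31)
MIN_BASELINE_DAYS = 183
STUDY_SGAS = {"aripiprazole", "olanzapine", "quetiapine", "risperidone"}


def is_eligible(p):
    """Return True if a patient record (hypothetical keys) meets the cohort criteria."""
    baseline_days = (p["first_sga_date"] - p["enrollment_start"]).days
    return (
        p["sga"] in STUDY_SGAS
        and p["age"] > 21
        and STUDY_START <= p["first_sga_date"] <= STUDY_END
        and baseline_days >= MIN_BASELINE_DAYS
        # Diabetes diagnosis code, antidiabetic dispensing or elevated lab at baseline.
        and not p["baseline_diabetes_evidence"]
        and not p["baseline_pregnancy_or_pcos"]
    )


# Made-up example patients.
eligible_patient = {
    "sga": "quetiapine",
    "age": 45,
    "enrollment_start": date(2009, 1, 1),
    "first_sga_date": date(2009, 9, 1),
    "baseline_diabetes_evidence": False,
    "baseline_pregnancy_or_pcos": False,
}
# Same patient, but with too little enrollment before the first dispensing.
short_baseline_patient = dict(eligible_patient, enrollment_start=date(2009, 6, 1))
```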
Follow-up & outcomes
Follow-up continued until 365 days after the first dispensing of an SGA, discontinuation of insurance coverage, death, end of study period or occurrence of an outcome. The outcome of incident T2DM was defined in two different ways. The outcome determined without use of labs (O1) occurred if there was any single inpatient or outpatient diabetes diagnosis code or antidiabetic medication dispensing during follow-up. The outcome determined with labs (O2) included any patient who met criteria for O1 or who had at least one captured lab result compatible with diabetes (fasting glucose ≥126 mg/dl, random glucose ≥200 mg/dl or HbA1c ≥6.5%) during the follow-up period. Compared with formal diagnostic criteria for diabetes [1], this outcome definition is simplified to avoid the use of information not consistently captured in the Mini-Sentinel data (such as oral glucose tolerance tests and symptoms) and to maximize sensitivity, as would be used in a drug surveillance project.
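The two outcome definitions can be written as simple rules. The following is a minimal sketch assuming a hypothetical per-patient record in which follow-up events have already been reduced to two flags and a list of `(test, value)` lab results; glucose values are in mg/dl and HbA1c in percent, as in the thresholds above.

```python
# Thresholds from the outcome definition (glucose in mg/dl, HbA1c in %).
THRESHOLDS = {"fasting_glucose": 126.0, "random_glucose": 200.0, "hba1c": 6.5}


def elevated_lab(test, value):
    """True if a single lab result meets the diabetes threshold for its test type."""
    return test in THRESHOLDS and value >= THRESHOLDS[test]


def outcome_o1(p):
    """O1: any diabetes diagnosis code or antidiabetic dispensing during follow-up."""
    return p["followup_diabetes_dx"] or p["followup_antidiabetic_rx"]


def outcome_o2(p):
    """O2: O1, or at least one captured follow-up lab result compatible with diabetes."""
    return outcome_o1(p) or any(elevated_lab(t, v) for t, v in p["followup_labs"])


# Made-up patient detected only through a lab result.
lab_only = {
    "followup_diabetes_dx": False,
    "followup_antidiabetic_rx": False,
    "followup_labs": [("random_glucose", 140.0), ("hba1c", 6.7)],
}
# Made-up patient with no follow-up labs captured at all.
no_labs = {
    "followup_diabetes_dx": False,
    "followup_antidiabetic_rx": False,
    "followup_labs": [],
}
```

Note that a patient with no captured labs can only be an O2 case through the O1 criteria, which is why missing follow-up labs matter for ascertainment.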
Analysis
Incident diabetes outcome rates based solely on diagnosis codes for diabetes mellitus and records of dispensing of drugs used to treat diabetes mellitus (O1) were compared with incident diabetes outcome rates that included lab results in addition to diagnosis codes for diabetes mellitus and records of dispensing of drugs to treat diabetes mellitus (O2). The primary comparison was between overall incidence rates with and without lab results, although time to event was also assessed. Baseline lab results refer to glucose or HbA1c measured prior to SGA exposure. By definition, for a patient to be eligible for cohort entry, any available baseline lab results were required to be in the normal range. This is distinct from the lab results used for outcome assessment. Outcome lab results had to be obtained during the follow-up period after SGA exposure started.
Predictors of missing lab results were assessed in univariable and multivariable analyses. Multivariable analysis used logistic regression in which presence or absence of follow-up lab results was a binary dependent variable. Patient demographics, comorbidities present in the 183 days prior to SGA initiation, presence or absence of baseline lab results and year of study entry were variables included in multivariable analysis.
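For a single binary predictor, the univariable association with missing follow-up labs can be summarized as a crude odds ratio from a 2×2 table. The sketch below uses the standard log-odds (Woolf) confidence interval; the function name and the counts are hypothetical and are not taken from the study. The multivariable analysis described above would instead require logistic regression software, which is not shown here.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio with a Woolf (log-scale) confidence interval.

    a: predictor present, follow-up labs missing
    b: predictor present, follow-up labs available
    c: predictor absent,  follow-up labs missing
    d: predictor absent,  follow-up labs available
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Made-up counts: patients with a baseline lab are less often missing follow-up labs.
crude_or, ci_low, ci_high = odds_ratio_ci(a=100, b=400, c=300, d=200)
```

An odds ratio below 1 here would mean that the predictor is associated with a lower likelihood of missing follow-up lab results, which is how Table 3 should be read.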
Analyses were initially conducted across all three sites with site as a variable; however, due to large differences in missing lab results by site, further analyses were conducted stratified by site. O1 and O2 were also compared in patients within each site with and without missing data.
Cox proportional hazard models were used to compare rates of outcome (incident diabetes) across the four SGA exposure groups, with aripiprazole as the reference category. Models were adjusted for prespecified covariates (patient demographics, comorbidities and year of study entry). This analysis was done for combined sites, stratified by site, with the outcome defined using O1 and O2, and after restriction to patients with baseline lab results available.
Results
The final cohort included 81,785 individuals (Supplementary Figure 1), 48% of whom were exposed to quetiapine, 24% to risperidone, 18% to aripiprazole and 11% to olanzapine. The average age was 60.3 years, and 62.3% of the cohort was female. These distributions varied across the three sites. Data on race were incomplete at all three sites, with 17.3% overall having unknown race, but among individuals with race recorded, 71.5% were white and 7.5% were African–American. At all three sites, the most common baseline comorbidities were psychosis (present in 56.0% of patients overall), depression (53.6% overall) and hypertension (45.0% overall) (Table 1).
During the year following SGA initiation, 5.1% of individuals developed a preliminary incident diabetes outcome as defined using diagnosis codes and medication dispensing only (O1). Crude incidence rates without labs ranged from 0.018 to 0.078 per person–year across the three sites (p < 0.001). Overall, 46.4% of the cohort had diabetes labs during the follow-up period. When labs were also used to detect outcomes (O2), incidence rates increased substantially, ranging from 0.038 to 0.082 per person–year across the sites (p < 0.001; Table 2); overall diabetes incidence was 6.1% using O2. Median time to detection of diabetes was minimally affected by the inclusion of lab results (data not shown).
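As a worked check on the size of the gain from adding labs, the relative increase can be computed from the rounded figures reported here and in Table 2. The variable names are illustrative. Because the inputs are rounded, the overall result (about 20%) only approximates the 21% quoted in the abstract, which was presumably computed from unrounded counts.

```python
def relative_increase(before, after):
    """Relative change from `before` to `after`, as a fraction of `before`."""
    return (after - before) / before


# Overall cumulative incidence (%), O1 vs O2, as reported in the text.
overall = relative_increase(5.1, 6.1)

# Site-level crude rates per person-year (O1, O2) from the "Total" rows of Table 2.
site_rates = {
    "site one": (0.024, 0.038),
    "site two": (0.018, 0.039),
    "site three": (0.078, 0.082),
}
site_increase = {
    site: relative_increase(o1, o2) for site, (o1, o2) in site_rates.items()
}

for site, inc in site_increase.items():
    print(f"{site}: {inc:.0%} increase when labs are added")
```

The site-level figures suggest that the gain is concentrated at the two sites with near-complete lab capture, and is small at the site where most lab results are not captured.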
Rates of available follow-up lab results varied significantly across sites, with 31.8% missing at site one, 33.8% missing data at site two and 68.1% missing data at site three (p < 0.001 for difference between groups). At all sites, restriction to patients with available follow-up lab results increased the incidence rate whether or not those labs were used in outcome ascertainment. At sites one and two, the rate of incident outcomes identified by any means was negligible in the population who did not have follow-up lab results (Table 2).
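The missingness percentages quoted above follow directly from the "Any follow-up diabetes lab" counts in Table 1. A short sketch of that arithmetic, with illustrative variable names:

```python
# (patients with any follow-up diabetes lab, total patients) from Table 1.
followup_lab_counts = {
    "site one": (2567, 3764),
    "site two": (20268, 30637),
    "site three": (15123, 47384),
}

# Percentage of patients with no follow-up diabetes lab, rounded to one decimal.
missing_pct = {
    site: round(100 * (total - with_lab) / total, 1)
    for site, (with_lab, total) in followup_lab_counts.items()
}

# Number of patients without follow-up labs (matches the Table 2 subgroup sizes).
missing_n = {
    site: total - with_lab for site, (with_lab, total) in followup_lab_counts.items()
}
```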
Predictors of missing data varied across sites. Particularly at sites one and two, patients were less likely to have missing data for SGA exposures other than aripiprazole. At site three only, presence of baseline labs was a strong predictor of not having missing follow-up lab data (odds ratio: 0.19; 95% CI: 0.18–0.20). These differences persisted after adjustment for demographic factors and preselected clinical conditions (Table 3).
When all sites were combined, the hazard ratio for incident diabetes was non-significantly greater than 1 (range: 1.01–1.07, with all 95% CI including 1) for olanzapine, quetiapine and risperidone, and did not appreciably change between the different outcome definitions. Site-specific models yielded different point estimates, with wide and overlapping CIs. Due to this lower precision, these differences were not significant (Table 4).
Conclusion
This analysis of the FDA's MSDD highlights two major contributions lab results can make to studies of diabetes risk. These points likely apply to many other asymptomatic medical conditions that are diagnosed primarily through laboratory testing (such as hyperlipidemia, subclinical liver injury and subclinical renal injury). First, the use of lab results substantially increases the number of cases identified when results are available. From a study design standpoint, the substantial increase in outcomes seen with use of lab results argues for their value in offering more accurate and precise estimates of incidence and risk. However, our analysis also reveals that the availability of lab test results varies widely across data partners. As has been observed across a variety of clinical scenarios, the implications of different clinical and demographic factors for rates of missingness were also complex and variable across sites [9].
Second, patient-level missing lab data can offer important insights into potential ascertainment bias. This claim is prompted by the finding that having missing lab results is associated with far lower apparent rates of incident diabetes even when the method of diabetes ascertainment does not use labs (Table 2). This effect was pronounced even at site three, and at the other sites it was so strong that diabetes was almost never seen in patients without follow-up lab results. There is a plausible explanation for this: since diabetes diagnosis relies on lab tests, any new diagnosis should have an associated diagnostic lab result somewhere. In a system with low rates of organization-level missing data, like sites one and two, it will be quite unusual for a patient to have a new diabetes diagnosis without also having a diagnostic lab result.
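The strength of this association can be illustrated with the O1 rates in Table 2, which do not use labs for ascertainment. The sketch below divides the rate in patients with follow-up labs by the rate in patients without them. The inputs are the rounded published rates, so the ratios are approximate, and the variable names are illustrative.

```python
# O1 incidence rates per person-year from Table 2:
# (with follow-up labs, without follow-up labs).
o1_rates = {
    "site one": (0.033, 0.004),
    "site two": (0.025, 0.003),
    "site three": (0.119, 0.060),
}

# Crude rate ratio: patients with follow-up labs vs patients without.
rate_ratios = {
    site: with_labs / without_labs
    for site, (with_labs, without_labs) in o1_rates.items()
}

for site, ratio in rate_ratios.items():
    print(f"{site}: O1 rate is about {ratio:.1f} times higher with follow-up labs")
```

On these rounded figures the ratio is roughly eightfold at sites one and two and roughly twofold at site three, consistent with the pattern described in the text.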
This association could reflect either symptomatic diabetes leading to a lab workup, or screening labs revealing asymptomatic diabetes. Both scenarios likely occur in any large cohort. However, the paradigm for diabetes diagnosis is screening for a largely asymptomatic disease, supporting the clinical sense that in most cases diabetes will be silent and detected only if screening labs are performed [1].
When disease detection relies on screening rather than evaluation of acute clinical presentations, the potential for ascertainment bias is high. For example, at sites one and two, adjusted rates of missing lab results were higher for aripiprazole and lower for all other SGAs. One explanation may be different intensity of monitoring for diabetes, since aripiprazole is known to confer less additional risk for diabetes. In principle, this kind of differential monitoring could become a ‘self-fulfilling prophecy’, if closer scrutiny of patients on high-risk drugs results in higher diagnosis rates and falsely higher apparent risk [12]. Importantly, omitting lab results from outcome ascertainment algorithms would not help, since the lab result is a key part of how the clinical diagnosis is made. Instead, the missing lab results offer an opportunity to identify when ascertainment bias might be present.
It is less clear how to capitalize on this opportunity, although it is clear that lab results should neither be ignored nor naively assumed to be negative if missing. Unlike for missing data at baseline, where methods including multiple-imputation are well-studied and recommended, there is not a strong consensus on how to handle missing data in follow-up [13]. If the disease is truly asymptomatic and detected only on screening, imputation or complete case analysis may be justified under the assumption that data are missing at random after adjustment for baseline characteristics. If some labs are ordered due to disease symptoms or risk factors that emerge after baseline, this assumption will be violated. As an initial step, any discrepancy of follow-up lab frequency between exposure categories should be noted, and quantitative estimates undertaken regarding whether it may account for any difference in risk seen.
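One simple form such a quantitative estimate could take is a deterministic sensitivity analysis for differential outcome ascertainment. The sketch below is not a method used in this study. It assumes that there are no false-positive outcomes, so that the observed risk in each group equals the true risk multiplied by the fraction of true cases detected, and it treats those detection fractions as inputs chosen by the analyst (for example, informed by the share of each exposure group with follow-up labs). The function name and all numbers are hypothetical.

```python
def ascertainment_adjusted_rr(observed_rr, detect_frac_exposed, detect_frac_reference):
    """Risk ratio corrected for differential detection, assuming perfect specificity.

    Under that assumption, observed risk = true risk * detection fraction, so
    true RR = observed RR * (reference detection / exposed detection).
    """
    for frac in (detect_frac_exposed, detect_frac_reference):
        if not 0 < frac <= 1:
            raise ValueError("detection fractions must be in (0, 1]")
    return observed_rr * detect_frac_reference / detect_frac_exposed


# Made-up scenario: the comparator drug appears riskier (RR 1.20), but its users
# are also monitored more closely (70% vs 60% of true cases detected).
adjusted = ascertainment_adjusted_rr(
    observed_rr=1.20, detect_frac_exposed=0.70, detect_frac_reference=0.60
)
print(f"adjusted RR: {adjusted:.2f}")
```

Varying the two detection fractions over a plausible range gives a rough sense of whether differential monitoring alone could account for an observed difference in risk.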
This observation has limitations. First, it relies on low rates of ‘organization-level’ missingness. If an EHR or claims-based data source does not capture most labs that are performed, missing data patterns will primarily reflect that organization-level missingness and will be less useful as evidence of ascertainment bias. Second, some potentially important patient and provider characteristics that can influence missing data (e.g., patient level of education) are not available in the MSDD. However, all cohort members had insurance, adding similarity from a socioeconomic perspective. This work is primarily relevant for largely asymptomatic conditions in which lab testing is the key diagnostic step. For more occult conditions, such as renal impairment, asymptomatic liver injury or other subclinical abnormalities in commonly ordered labs, the fact that lab results capture the key step in the diagnostic process makes them an indispensable tool.
| Variable | Site one (n = 3764) | Site two (n = 30,637) | Site three (n = 47,384) | All sites (n = 81,785) |
|---|---|---|---|---|
| Any baseline diabetes lab, n (%); yes: | 2154 (57.2) | 17,973 (58.7) | 12,990 (27.4) | 33,117 (40.5) |
| – Baseline HbA1c | 105 (2.8) | 2170 (7.1) | 1716 (3.6) | 3991 (4.9) |
| – Baseline fasting glucose | 622 (16.5) | 6738 (22.0) | 151 (0.3) | 7511 (9.2) |
| – Baseline random glucose | 1790 (47.6) | 13,575 (44.3) | 12,745 (26.9) | 28,110 (34.4) |
| Any follow-up diabetes lab: | 2567 (68.2) | 20,268 (66.2) | 15,123 (31.9) | 37,958 (46.4) |
| – Any follow-up HbA1c | 231 (6.1) | 3627 (11.8) | 3177 (6.7) | 7035 (8.6) |
| – Any follow-up fasting glucose | 1196 (31.8) | 10,991 (35.9) | 370 (0.8) | 12,557 (15.4) |
| – Any follow-up random glucose | 1972 (52.4) | 13,229 (43.2) | 14,689 (31.0) | 29,890 (36.5) |
| Gender, n (%); female | 2402 (63.8) | 19,103 (62.4) | 29,416 (62.1) | 50,921 (62.3) |
| Age at cohort entry, mean (SD); years | 55.4 (21.1) | 54.9 (21.1) | 64.2 (19.0) | 60.3 (20.4) |
| Year of cohort entry, n (%): | | | | |
| – 2008 | 467 (12.4) | 4290 (14.0) | 5858 (12.4) | 10,615 (13.0) |
| – 2009 | 862 (22.9) | 7128 (23.3) | 11,173 (23.6) | 19,163 (23.4) |
| – 2010 | 852 (22.6) | 6851 (22.4) | 10,572 (22.3) | 18,275 (22.3) |
| – 2011 | 885 (23.5) | 6622 (21.6) | 10,172 (21.5) | 17,679 (21.6) |
| – 2012 | 698 (18.5) | 5746 (18.8) | 9609 (20.3) | 16,053 (19.6) |
| Hispanic, n (%) | 300 (8.0) | 3542 (11.6) | 836 (1.8) | 4678 (5.7) |
| Race, n (%): | | | | |
| – White | 2807 (74.6) | 22,986 (75.0) | 32,676 (69.0) | 58,469 (71.5) |
| – African–American | 139 (3.7) | 2605 (8.5) | 3369 (7.1) | 6113 (7.5) |
| – Other | 83 (2.2) | 2393 (7.8) | 546 (1.2) | 3022 (3.7) |
| – Unknown | 735 (19.5) | 2653 (8.7) | 10,793 (22.8) | 14,181 (17.3) |
| Individual comorbidities, n (%); yes: | | | | |
| – Alcohol abuse | 435 (11.6) | 3391 (11.1) | 2687 (5.7) | 6513 (8.0) |
| – Anemia | 312 (8.3) | 3023 (9.9) | 8741 (18.4) | 12,076 (14.8) |
| – Cardiac arrhythmia | 384 (10.2) | 3004 (9.8) | 7703 (16.3) | 11,091 (13.6) |
| – Heart failure, chronic | 251 (6.7) | 1802 (5.9) | 6230 (13.1) | 8283 (10.1) |
| – Dementia | 534 (14.2) | 2872 (9.4) | 11,463 (24.2) | 14,869 (18.2) |
| – Fluid/electrolyte disorder | 566 (15.0) | 3224 (10.5) | 8423 (17.8) | 12,213 (14.9) |
| – Hypertension | 1146 (30.4) | 10,285 (33.6) | 25,378 (53.6) | 36,809 (45.0) |
| – Liver disease | 96 (2.6) | 846 (2.8) | 1162 (2.5) | 2104 (2.6) |
| – Psychosis | 2769 (73.6) | 19,857 (64.8) | 23,165 (48.9) | 45,791 (56.0) |
| – Pulmonary disease | 576 (15.3) | 4635 (15.1) | 9919 (20.9) | 15,130 (18.5) |
| – PVD | 167 (4.4) | 1683 (5.5) | 5700 (12.0) | 7550 (9.2) |
| – Renal | 338 (9.0) | 2271 (7.4) | 4758 (10.0) | 7367 (9.0) |
| – Tumor | 178 (4.7) | 1634 (5.3) | 3706 (7.8) | 5518 (6.7) |
| – Depression | 2264 (60.1) | 17,482 (57.1) | 24,111 (50.9) | 43,857 (53.6) |
Lab: Laboratory test; PVD: Peripheral vascular disease; SD: Standard deviation.
| Setting | Incidence rate (defined using diagnosis and medication claims but not using labs) | Incidence rate (defined using diagnosis and medication claims, and lab results) |
|---|---|---|
| Site one: | | |
| – Total (n = 3764) | 0.024 (0.019–0.029) | 0.038 (0.032–0.045) |
| – With follow-up labs (n = 2567) | 0.033 (0.027–0.041) | 0.055 (0.046–0.065) |
| – Without follow-up labs (n = 1197) | 0.004 (0.002–0.01) | 0.004 (0.002–0.01) |
| Site two: | | |
| – Total (n = 30,637) | 0.018 (0.016–0.019) | 0.039 (0.037–0.041) |
| – With follow-up labs (n = 20,268) | 0.025 (0.023–0.028) | 0.058 (0.055–0.061) |
| – Without follow-up labs (n = 10,369) | 0.003 (0.002–0.004) | 0.003 (0.002–0.004) |
| Site three: | | |
| – Total (n = 47,384) | 0.078 (0.076–0.081) | 0.082 (0.079–0.084) |
| – With follow-up labs (n = 15,123) | 0.119 (0.113–0.124) | 0.13 (0.124–0.136) |
| – Without follow-up labs (n = 32,261) | 0.06 (0.057–0.062) | 0.06 (0.057–0.062) |
All differences in incidence rates with and without labs are significant at all three sites (p < 0.001).
Lab: Laboratory test.
Associations with missing follow-up lab results: adjusted odds ratios (95% CI) by data partner site.

| Characteristic | All sites combined | Site one | Site two | Site three |
|---|---|---|---|---|
| Number | 81,785 | 3764 | 30,637 | 47,384 |
| SGA, aripiprazole reference: | | | | |
| – Olanzapine | 0.93 (0.87–0.98) | 0.72 (0.50–1.04) | 0.78 (0.71–0.87) | 1.02 (0.94–1.11) |
| – Quetiapine | 0.87 (0.84–0.91) | 0.84 (0.66–1.07) | 0.95 (0.88–1.02) | 0.96 (0.90–1.02) |
| – Risperidone | 0.91 (0.86–0.95) | 0.69 (0.55–0.88) | 0.78 (0.72–0.85) | 0.97 (0.91–1.04) |
| Sex: male vs female | 1.08 (1.04–1.11) | 1.08 (0.93–1.26) | 0.99 (0.94–1.04) | 1.09 (1.04–1.14) |
| Age (per 10 years) | 0.99 (0.98–1.00) | 0.81 (0.77–0.85) | 0.84 (0.83–0.86) | 1.05 (1.03–1.07) |
| Any baseline diabetes lab | 0.29 (0.28–0.29) | 0.93 (0.79–1.09) | 1.07 (1.01–1.13) | 0.19 (0.18–0.20) |
| Year of cohort entry – 2008 reference: | | | | |
| – 2009 | 0.94 (0.89–0.99) | 0.96 (0.75–1.23) | 0.95 (0.88–1.03) | 0.85 (0.79–0.92) |
| – 2010 | 0.91 (0.86–0.96) | 0.89 (0.69–1.15) | 0.96 (0.88–1.04) | 0.82 (0.76–0.88) |
| – 2011 | 0.83 (0.79–0.88) | 0.77 (0.60–1.00) | 0.95 (0.87–1.03) | 0.68 (0.63–0.74) |
| – 2012 | 0.87 (0.83–0.92) | 1.01 (0.78–1.31) | 0.99 (0.91–1.08) | 0.69 (0.64–0.75) |
| Hispanic – no/unknown reference | 0.44 (0.41–0.47) | 0.91 (0.68–1.22) | 0.71 (0.65–0.78) | 0.36 (0.30–0.42) |
| Race – unknown reference: | | | | |
| – African–American | 0.52 (0.49–0.56) | 0.99 (0.66–1.49) | 0.62 (0.54–0.70) | 0.65 (0.59–0.72) |
| – White | 0.57 (0.54–0.59) | 0.83 (0.68–1.01) | 0.67 (0.61–0.74) | 0.74 (0.69–0.79) |
| – Other | 0.41 (0.37–0.45) | 0.77 (0.47–1.27) | 0.59 (0.52–0.67) | 0.75 (0.61–0.92) |
| Alcohol abuse | 0.86 (0.82–0.92) | 1.00 (0.79–1.27) | 0.98 (0.90–1.06) | 0.98 (0.89–1.07) |
| Anemia | 1.12 (1.07–1.17) | 1.06 (0.77–1.46) | 0.91 (0.83–1.01) | 1.10 (1.04–1.17) |
| Cardiac arrhythmia | 0.97 (0.92–1.02) | 0.83 (0.61–1.12) | 0.93 (0.84–1.03) | 1.00 (0.94–1.07) |
| Heart failure – chronic | 1.13 (1.06–1.20) | 1.13 (0.77–1.67) | 1.21 (1.05–1.38) | 1.07 (1.00–1.16) |
| Dementia | 1.45 (1.38–1.51) | 1.26 (0.97–1.64) | 1.08 (0.98–1.19) | 1.38 (1.30–1.46) |
| Fluid/electrolyte disorder | 1.05 (1.00–1.11) | 0.80 (0.61–1.04) | 0.93 (0.84–1.03) | 1.08 (1.01–1.16) |
| Hypertension | 1.01 (0.97–1.05) | 0.96 (0.78–1.18) | 0.85 (0.79–0.91) | 0.90 (0.86–0.95) |
| Liver disease | 0.78 (0.71–0.86) | 0.74 (0.43–1.27) | 0.81 (0.69–0.95) | 0.74 (0.65–0.85) |
| Psychosis | 0.75 (0.72–0.77) | 0.74 (0.61–0.90) | 0.74 (0.70–0.79) | 0.86 (0.82–0.90) |
| Pulmonary disease | 1.03 (0.99–1.07) | 1.12 (0.90–1.40) | 0.98 (0.91–1.06) | 0.97 (0.91–1.02) |
| PVD | 1.07 (1.01–1.13) | 1.53 (1.02–2.28) | 1.05 (0.92–1.20) | 0.99 (0.92–1.06) |
| Renal | 0.76 (0.72–0.81) | 1.33 (0.98–1.81) | 0.97 (0.87–1.09) | 0.80 (0.74–0.86) |
| Tumor | 0.95 (0.89–1.02) | 0.81 (0.53–1.24) | 1.19 (1.05–1.36) | 0.87 (0.79–0.94) |
| Depression | 0.94 (0.91–0.98) | 1.14 (0.96–1.35) | 1.03 (0.97–1.09) | 0.91 (0.86–0.95) |
Not shown: baseline presence of HIV, coagulopathy, hemiplegia, metastatic cancer, pulmonary circulatory disorder, weight loss, acute myocardial infarction, ischemic stroke, intracranial hemorrhage, osteoarthritis; as well as baseline number of medication classes used, number of ambulatory care visits, number of hospital visits, baseline hospitalization and baseline institutional stay.
Lab: Laboratory test; PVD: Peripheral vascular disease; SGA: Second-generation antipsychotic.
Associations with diabetes outcomes: adjusted hazard ratio (95% CI).

| Second-generation antipsychotic agent | Outcome 1: diagnosis code or antidiabetic medication dispensing (n = 81,785) | Outcome 2: diagnosis code, antidiabetic medication dispensing or diabetes labs (HbA1c ≥6.5%, fasting glucose ≥126 mg/dl, random glucose ≥200 mg/dl) (n = 81,785) |
|---|---|---|
| All sites, aripiprazole reference | | |
| Olanzapine | 1.02 (0.90–1.15) | 1.07 (0.96–1.20) |
| Quetiapine | 1.03 (0.94–1.13) | 1.04 (0.95–1.13) |
| Risperidone | 1.03 (0.93–1.15) | 1.01 (0.92–1.11) |
| Site-specific, aripiprazole reference | | |
| Site one: | | |
| – Olanzapine | 0.54 (0.14–2.04) | 1.20 (0.46–3.13) |
| – Quetiapine | 0.92 (0.44–1.93) | 1.53 (0.80–2.90) |
| – Risperidone | 1.18 (0.57–2.44) | 1.43 (0.75–2.73) |
| Site two: | | |
| – Olanzapine | 0.79 (0.55–1.12) | 1.05 (0.83–1.32) |
| – Quetiapine | 0.90 (0.70–1.15) | 0.92 (0.77–1.10) |
| – Risperidone | 0.88 (0.66–1.19) | 0.90 (0.73–1.11) |
| Site three: | | |
| – Olanzapine | 1.06 (0.93–1.21) | 1.06 (0.93–1.21) |
| – Quetiapine | 1.05 (0.95–1.16) | 1.06 (0.95–1.17) |
| – Risperidone | 1.05 (0.93–1.17) | 1.04 (0.93–1.16) |
Lab test results are an important component of clinical databases and can be used to help define outcomes such as incident diabetes.
The Mini-Sentinel distributed database is a new resource developed by the FDA for drug safety surveillance. It includes lab data but with significant levels of missingness.
An observational cohort study comparing rates of possible incident diabetes with different drug exposures was performed in Mini-Sentinel. Most cases of diabetes could be detected using just diagnosis codes and prescribing records, but when lab data were also used the rate of outcomes identified increased by 21%.
Numerous baseline variables predicted the likelihood of lab testing for diabetes during follow-up, including the drug exposure group.
In patients with missing lab data during follow-up, rates of incident diabetes were implausibly low, despite the fact that prescribing and diagnosis data were also used to identify new cases of diabetes.
A pattern in which rates of a diagnosis are extremely low when lab testing is absent would be expected in a disease such as diabetes, which is usually asymptomatic and for which lab tests are a necessary part of the diagnostic process.
If rates of lab testing for an asymptomatic disease differ according to the exposure groups, there is a high likelihood of ascertainment bias, in which the percentage of true occurrences of the outcome detected may differ across exposure groups.
This experience shows that examining the patterns of missingness in lab data can provide evidence that ascertainment bias may be present, and may enable quantitative approaches to estimating the potential strength of such bias.
Financial & competing interests disclosure
The Mini-Sentinel program is funded by the US FDA through contract HHSF22301012T-0008 under Master Agreement HHSF223020091006I from the Department of Health and Human Services. The authors have no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript apart from those disclosed.
No writing assistance was utilized in the production of this manuscript.
Ethical conduct of research
The authors state that they have obtained appropriate institutional review board approval or have followed the principles outlined in the Declaration of Helsinki for all human or animal experimental investigations. In addition, for investigations involving human subjects, informed consent has been obtained from the participants involved.
Supplementary Material
File (suppl_figure_1.pdf)
References
1. American Diabetes Association. Diagnosis and classification of diabetes mellitus. Diabetes Care 33(Suppl. 1), S62–S69 (2010).
2. Nichols GA, Desai J, Elston Lafata J et al. Construction of a multisite DataLink using electronic health records for the identification, surveillance, prevention, and management of diabetes mellitus: the SUPREME-DM Project. Prev. Chron. Dis. 9, 110311 (2012).
3. Newcomer JW. Second-generation (atypical) antipsychotics and metabolic effects: a comprehensive literature review. CNS Drugs 19(Suppl. 1), 1–93 (2005).
4. De Hert M, Detraux J, van Winkel R, Yu W, Correll CU. Metabolic and cardiovascular adverse effects associated with antipsychotic drugs. Nat. Rev. Endocrinol. 8(2), 114–126 (2011).
5. Correll CU. Monitoring and management of antipsychotic-related metabolic and endocrine adverse events in pediatric patients. Int. Rev. Psychiatry 20, 195–201 (2008).
6. Goeb JL, Marco S, Duhamel A et al. Metabolic side effects of risperidone in children and adolescents with early-onset schizophrenia. Prim. Care Companion J. Clin. Psych. 10, 486–487 (2008).
7. Pringsheim T, Panagiotopoulos C, Davidson J, Ho J. Evidence-based recommendations for monitoring safety of second generation antipsychotics in children and youth. J. Can. Acad. Child Adolesc. Psychiatry 20(3), 218–233 (2011).
8. Raebel MA, Haynes K, Woodworth TS et al. Electronic clinical laboratory test results data tables: lessons from Mini-Sentinel. Pharmacoepidemiol. Drug Saf. 23(6), 609–618 (2014).
9. Raebel MA, Shetterly S, Lu CY et al. Methods for using clinical laboratory test results as baseline confounders in multi-site observational database studies when missing data are expected. Pharmacoepidemiol. Drug Saf. 25(7), 798–814 (2016).
10. Rosati K, Evans B, McGraw D. HIPAA and common rule compliance in the Mini-Sentinel pilot, white paper (2010). http://mini-sentinel.org/work_products/About_Us/HIPAA_and_CommonRuleCompliance_in_the_Mini-SentinelPilot.pdf.
11. Department of Health and Human Services, Office of the Secretary. Standards for privacy of individually identifiable health information; final rule. 45 CFR Parts 160 and 164. 2002. Federal Register 53182–53273 (2014). www.ihs.gov/privacyact/documents/privrulepd.pdf.
12. Dolin P. Pioglitazone, bladder cancer, and detection bias. J. Diabetes 6(2), 193–194 (2014).
13. Wood AM, White IR, Thompson SG. Are missing outcome data adequately handled? A review of published randomized controlled trials in major medical journals. Clin. Trials 1(4), 368–376 (2004).
Copyright
© Future Medicine Ltd.
History
Published online: 9 December 2016
How to Cite
Missing laboratory results data in electronic health databases: implications for monitoring diabetes risk. (2016) Journal of Comparative Effectiveness Research. DOI: 10.2217/cer-2016-0033
