Data entry quality of double data entry vs automated form processing technologies: A cohort study validation of optical mark recognition and intelligent character recognition in a clinical setting
Peer reviewed, Journal article
Published version
Permanent link
https://hdl.handle.net/11250/3042387
Publication date
2020
Original version
Paulsen, A., Harboe, K., & Dalen, I. (2020). Data entry quality of double data entry vs automated form processing technologies: A cohort study validation of optical mark recognition and intelligent character recognition in a clinical setting. Health science reports, 3(4), e210. 10.1002/hsr2.210
Abstract
Background and Aims: Patient-reported outcome measures (PROMs) are increasingly
used in health services. Paper forms are still often used to register such data. Manual
double data entry (DDE) has been defined as the gold standard for transferring data
to an electronic format but is laborious and costly. Automated form processing (AFP)
is an alternative, but validation in a clinical context is warranted. The study objective
was to examine and validate a local hospital AFP setup.
Methods: Patients over 18 years of age who were scheduled for knee or hip
replacement at Stavanger University Hospital from 2014 to 2017 and who answered
PROMs were included in the study and contributed PROM data. All paper PROMs were
scanned using the AFP techniques of optical mark recognition (OMR) and intelligent
character recognition (ICR) and were processed by DDE by health secretaries using a
data entry program. OMR and ICR were used to capture different types of data. The
main outcome was the proportion of correctly entered numbers, defined as the same
response recorded in AFP and DDE or by consulting the original paper questionnaire
at the data field, item, and PROM level.
Results: A total of 448 questionnaires from 255 patients were analyzed. There was
no statistically significant difference in error proportions per 10 000 data fields
between OMR and DDE for data from check boxes (3.52, 95% confidence interval
[CI] 2.17-5.72, and 4.18, 95% CI 2.68-6.53, respectively; P = .61). The error
proportion for ICR (nine errors) was statistically significantly higher than that
for DDE (two errors), that is, 3.53 (95% CI 1.87-6.57) vs 0.78 (95% CI 0.22-2.81)
per 100 data fields/items/questionnaires; P = .033. OMR (0.04% errors) outperformed
ICR (3.51% errors; P < .001, Fisher's exact test).
Conclusions: OMR can produce an error rate that is comparable to that of DDE. In
our setup, ICR is still problematic and is highly dependent on manual validation. When AFP is used, data quality should be tested and documented.
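
The interval estimates reported in the Results for ICR and DDE can be illustrated with a short calculation. The sketch below is not the authors' analysis code. It rests on two assumptions that the abstract does not state: that the intervals are Wilson score intervals, and that the denominator is 255 (inferred from nine errors corresponding to 3.53 per 100). The function name `wilson_interval` and the constant `N_ASSUMED` are hypothetical.

```python
import math


def wilson_interval(errors, n, z=1.96):
    """Wilson score interval for a binomial proportion (errors out of n)."""
    p = errors / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


# Assumption: 255 is inferred from "nine errors" ~ 3.53 per 100;
# the abstract itself does not give the denominator.
N_ASSUMED = 255

for label, errors in (("ICR", 9), ("DDE", 2)):
    lower, upper = wilson_interval(errors, N_ASSUMED)
    rate = 100 * errors / N_ASSUMED
    print(f"{label}: {rate:.2f} per 100 "
          f"(95% CI {100 * lower:.2f} to {100 * upper:.2f})")
```

Under these assumptions the computed intervals agree with the reported ones to two decimal places, which suggests, but does not confirm, that a Wilson-type interval was used.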