Automatic DFØ Bill Recognition based on Deep Learning
Bachelor thesis
Permanent link
https://hdl.handle.net/11250/3077495
Publication date
2023
Collections
- Studentoppgaver (TN-IDE) [823]
Description
Full text not available
Abstract
This thesis presents the development of an internal Optical Character Recognition (OCR) system for receipt scanning, aimed at replacing Direktoratet for Forvaltning og Økonomistyring’s current system with an in-house one. The primary motivation behind the development of a new solution is to minimize costs, improve data privacy and security, increase efficiency and accuracy, and extract more data that the company requires. The current system is designed to detect and recognize text on receipts.
The OCR system is built from two components: a Convolutional Neural Network (CNN)-based object detector that localizes regions of interest (ROIs) and a Transformer-based text recognizer that converts the ROIs into text output. The study uses receipts acquired from DFØ, which were manually labeled, annotated, and pre-processed as part of the work before being used for the training and evaluation of the two component models. Cross-validation and bootstrapping were used to assess the models’ performance.
Experimental results demonstrate promising performance for both models when used independently, with the object detection model achieving 92% global average accuracy across all classes. The text recognition results were also promising, with a 2% character error rate (CER) and 92% precision when bootstrapping. As a bonus, we provide an API and a hosted web page, available until the grading deadline, that demonstrate the use of the developed system.
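The abstract does not state how the CER or the bootstrap estimate was computed. As a rough illustration only, the sketch below uses the common definition of CER (total character-level edit distance divided by total reference characters) and a simple percentile bootstrap over receipt fields. The function names `levenshtein`, `cer`, and `bootstrap_cer`, the example strings, and the resampling settings are all assumptions made for this example, not the thesis's implementation.

```python
import random


def levenshtein(ref: str, hyp: str) -> int:
    """Character-level edit distance (insertions, deletions, substitutions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(refs, hyps) -> float:
    """Total edit distance divided by total number of reference characters."""
    total_chars = sum(len(r) for r in refs)
    if total_chars == 0:
        raise ValueError("reference text is empty")
    total_edits = sum(levenshtein(r, h) for r, h in zip(refs, hyps))
    return total_edits / total_chars


def bootstrap_cer(refs, hyps, n_resamples=1000, seed=0):
    """Percentile bootstrap: resample (reference, prediction) pairs with
    replacement and return an approximate 95% interval for the CER."""
    rng = random.Random(seed)
    pairs = list(zip(refs, hyps))
    scores = []
    for _ in range(n_resamples):
        sample = [rng.choice(pairs) for _ in pairs]
        sample_refs, sample_hyps = zip(*sample)
        scores.append(cer(sample_refs, sample_hyps))
    scores.sort()
    lower = scores[int(0.025 * n_resamples)]
    upper = scores[int(0.975 * n_resamples) - 1]
    return lower, upper


# Hypothetical ground-truth fields and recognizer outputs.
refs = ["TOTAL 129,00", "MVA 25%"]
hyps = ["TOTAL 129,00", "MVA 26%"]
print(cer(refs, hyps))
print(bootstrap_cer(refs, hyps, n_resamples=200))
```

With real data, `refs` would hold the manually annotated text of each ROI and `hyps` the recognizer's output for the same ROI.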
Our findings suggest that it is feasible to develop an OCR system capable of replacing the external solution currently used by DFØ. This work lays a strong foundation for the development of a fully operational in-house solution and has the potential to yield significant cost savings, improved efficiency and accuracy, and extraction of more data.