Object Character Recognition from patient monitor screen

Bukhari, Syed Irtza Akhtar

Master thesis

Open

no.uis:inspera:73533758:56611800.pdf (2.624Mb)

Permanent link

https://hdl.handle.net/11250/2786169

Publication date

2021

Collections

Student theses (TN-IDE) [866]

Abstract

The rapid growth of data requires better ways to manage it. This data can take the form of text or digital images and arises in many domains. When managed well, it can be of great value. Text data is considered comparatively easy to handle because it has a specific structure, whereas data in digital images is much larger and requires considerable effort to handle. Reducing such large data (video) to smaller amounts (text and numbers in different data structures) while preserving the important information brings many benefits, such as sorting, easy transmission, and searching.

In the medical field, data from patient monitor screens is available in the form of video recordings. SimCapture uses these recordings for medical training purposes. Handling such data well is believed to be very useful in this domain. Handling it manually is very time-consuming, so automated solutions are required.

For this purpose, we propose a model, developed using different OCR techniques, that extracts important information from patient monitor screens. The proposed algorithm follows three main steps. First, the input (a stream of videos from SimCapture) is pre-processed and frames are captured based on a certain threshold for the video feeds. Next, OCR techniques are applied to identify bounding regions of information; different OCR engines are available, each with its own computational complexity and limitations on different kinds of data. Lastly, the extracted information is stored in data frames for further use. Such solutions can be computationally very demanding, but the project also aims to provide a low-complexity solution for companies with limited resources by reducing the information size for each video from megabytes to kilobytes of text.
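The three steps described above might be sketched as follows. This is a minimal illustration, not the thesis implementation. It assumes the threshold applies to the average pixel change between frames, which the abstract does not specify. Frames are represented as nested lists of grayscale values, and `ocr` is a hypothetical callable standing in for whichever OCR engine is used. The records are plain dictionaries, which could later be loaded into a data frame (for example with `pandas.DataFrame(records)`).

```python
from typing import Callable, Dict, Iterable, List, Tuple

Frame = List[List[int]]    # grayscale frame: rows of pixel intensities (0-255)
Region = Tuple[str, str]   # (label, recognised text), e.g. ("HR", "72")


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average absolute pixel difference between two equally sized frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count if count else 0.0


def select_frames(frames: Iterable[Frame], threshold: float) -> List[Tuple[int, Frame]]:
    """Step 1 (assumed): keep a frame only if it differs enough from the last kept one."""
    kept: List[Tuple[int, Frame]] = []
    last = None
    for index, frame in enumerate(frames):
        if last is None or mean_abs_diff(frame, last) > threshold:
            kept.append((index, frame))
            last = frame
    return kept


def extract_records(
    frames: Iterable[Frame],
    ocr: Callable[[Frame], List[Region]],  # hypothetical OCR wrapper
    threshold: float = 5.0,                # illustrative value, not from the thesis
) -> List[Dict[str, object]]:
    """Steps 2 and 3: run OCR on the kept frames and collect flat records."""
    records: List[Dict[str, object]] = []
    for index, frame in select_frames(frames, threshold):
        for label, text in ocr(frame):
            records.append({"frame": index, "label": label, "value": text})
    return records
```

Skipping near-identical frames before calling the OCR engine is one way such a pipeline could keep the computational cost down, since the OCR step is usually the most expensive part.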

To achieve the aims of this project, we test two OCR engines on our data set: Tesseract and EasyOCR. While Tesseract is considered a good solution for other problems, it showed very low accuracy on SimCapture's dataset, although it is computationally very cheap. EasyOCR, on the other hand, solves the problem with much better accuracy but is computationally expensive. Thus, if resources are not a limitation, EasyOCR is the better model for extracting the information from a patient monitor screen and presenting it in the form of data frames to be fed to a training model.
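The accuracy versus cost trade-off described here could be measured with a small harness like the one below. It is a sketch under stated assumptions, not the evaluation used in the thesis. `engine` is a hypothetical callable that takes an image and returns the recognised text; in practice it might wrap `pytesseract.image_to_string` or an `easyocr.Reader(...).readtext` call. Accuracy is taken to be the exact-match rate against labelled samples, which is an assumption.

```python
import time
from typing import Any, Callable, Dict, Iterable, Tuple


def evaluate_engine(
    engine: Callable[[Any], str],            # hypothetical OCR wrapper
    samples: Iterable[Tuple[Any, str]],      # (image, expected text) pairs
) -> Dict[str, float]:
    """Return the exact-match accuracy and mean wall-clock time per sample."""
    samples = list(samples)
    correct = 0
    start = time.perf_counter()
    for image, expected in samples:
        # Strip surrounding whitespace, which OCR engines often append.
        if engine(image).strip() == expected:
            correct += 1
    elapsed = time.perf_counter() - start
    n = len(samples)
    return {
        "accuracy": correct / n if n else 0.0,
        "seconds_per_sample": elapsed / n if n else 0.0,
    }
```

Running the same labelled samples through both engines gives two comparable accuracy and timing figures, which is the kind of comparison the conclusion above rests on.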

Publisher

uis