dc.contributor.advisor: Krisztian Balog
dc.contributor.author: David Gordon Ramsay and Rebeca Pop
dc.date.accessioned: 2021-09-07T16:30:12Z
dc.date.available: 2021-09-07T16:30:12Z
dc.date.issued: 2021
dc.identifier: no.uis:inspera:78872743:36748811
dc.identifier.uri: https://hdl.handle.net/11250/2774415
dc.description: Full text not available
dc.description.abstract: Tables are common and important in scientific publications. They serve as the main elements for presenting findings in a structured way. This project concerns the extraction of tables from scientific papers that have been published on arxiv.org. ArXiv is an open archive for scholarly articles, where articles are published not only in PDF format, but the respective LaTeX sources are also made available for most. The specific project objectives are: (i) Developing a method for identifying and extracting tables from a LaTeX document; (ii) Enriching the extracted table data with metadata from the article; (iii) Creating a large-scale table corpus that can be distributed; (iv) Setting up batch processes to continuously update the table corpus.
dc.language: eng
dc.publisher: uis
dc.title: arXiv Table Extractor
dc.type: Bachelor thesis