Table2Vec: Neural Word and Entity Embeddings for Table Population and Retrieval
Master thesis
Date
2018-06
Abstract
Tables contain a significant amount of valuable knowledge in structured form. In recent years, a growing body of work on tables has appeared across different application domains. To the best of our knowledge, however, the use of neural embeddings trained on a table corpus remains largely unexplored. In this thesis, our goal is to employ neural language modeling approaches to embed tabular data into vector spaces, which can then be leveraged in table-related tasks. Specifically, we consider different kinds of tabular data in relational tables, namely sequences of words, table entities, core column entities, and heading labels, for training word and entity embeddings.
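To make the four kinds of training data concrete, the following is a minimal sketch of how one relational table might be flattened into sequences. The table layout (a dict with `caption`, `headings`, and `rows` of cells carrying `text` and an optional `entity` link), the function name `table_to_sequences`, and the choice of the leftmost column as the core column are all assumptions made for illustration; the abstract does not specify the thesis's actual data format or extraction procedure.

```python
def table_to_sequences(table):
    """Flatten one table into four kinds of sequences (illustrative only)."""
    rows = table["rows"]

    # Word sequence: caption, heading labels, and cell text, lowercased and
    # split on whitespace (a deliberately simple tokenizer).
    texts = [table["caption"], *table["headings"]]
    texts += [cell["text"] for row in rows for cell in row]
    words = []
    for text in texts:
        words.extend(text.lower().split())

    # Entity sequence: every cell that carries an entity link, in row order.
    entities = [cell["entity"] for row in rows for cell in row if cell.get("entity")]

    # Core column entities: ASSUMPTION that the leftmost column is the core column.
    core_entities = [row[0]["entity"] for row in rows if row and row[0].get("entity")]

    # Heading labels: kept whole, one item per column.
    headings = list(table["headings"])

    return {
        "words": words,
        "entities": entities,
        "core_entities": core_entities,
        "headings": headings,
    }


# Toy table with made-up entity identifiers, not data from the thesis.
toy_table = {
    "caption": "Norwegian cities",
    "headings": ["City", "County"],
    "rows": [
        [{"text": "Oslo", "entity": "Oslo"}, {"text": "Oslo", "entity": "Oslo_(county)"}],
        [{"text": "Bergen", "entity": "Bergen"}, {"text": "Vestland", "entity": "Vestland"}],
    ],
}

sequences = table_to_sequences(toy_table)
```

Each resulting sequence could then be passed to an off-the-shelf neural language model trainer, with one embedding space per kind of sequence.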
These embeddings are subsequently used in three table-related tasks, namely row population, column population, and table retrieval, by incorporating them into existing retrieval models as additional semantic similarity signals. The main novel contribution of Table2Vec is a neural method for performing multiple table-related tasks, trained specifically on a table corpus.
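As an illustration of what an additional semantic similarity signal could look like in row population, the sketch below scores each candidate entity by its mean cosine similarity to the seed entities already in the table and interpolates that with a baseline retrieval score. The linear interpolation, the `weight` parameter, the function names, and the toy vectors are assumptions for illustration only, not the combination method or data used in the thesis.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def embedding_similarity(candidate, seeds, embeddings):
    """Mean cosine similarity between a candidate and the seed entities."""
    if candidate not in embeddings:
        return 0.0
    sims = [
        cosine(embeddings[candidate], embeddings[seed])
        for seed in seeds
        if seed in embeddings
    ]
    return sum(sims) / len(sims) if sims else 0.0


def rank_candidates(candidates, seeds, embeddings, baseline_scores, weight=0.5):
    """Rank candidates by a linear mix of baseline score and embedding similarity.

    ASSUMPTION: a simple linear interpolation stands in for whatever
    combination the underlying retrieval model actually uses.
    """
    scored = []
    for candidate in candidates:
        sim = embedding_similarity(candidate, seeds, embeddings)
        base = baseline_scores.get(candidate, 0.0)
        scored.append((candidate, (1 - weight) * base + weight * sim))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional vectors and baseline scores, invented for the example.
toy_embeddings = {
    "Oslo": [0.9, 0.1, 0.0],
    "Bergen": [0.8, 0.2, 0.1],
    "Stavanger": [0.85, 0.15, 0.05],
    "Python": [0.0, 0.1, 0.9],
}
toy_baseline = {"Stavanger": 0.4, "Python": 0.5}

ranking = rank_candidates(
    candidates=["Stavanger", "Python"],
    seeds=["Oslo", "Bergen"],
    embeddings=toy_embeddings,
    baseline_scores=toy_baseline,
)
```

In this toy setup the embedding signal lifts the candidate that is close to the seed entities above one that has a slightly higher baseline score but lies far from them in the vector space.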
We further evaluate the table embeddings at the task level. The results show that Table2Vec significantly and substantially improves on the performance of state-of-the-art baselines. In the best case, Table2Vec outperforms the corresponding baseline by 40%.
Description
Master's thesis in Computer science