Automatic Generation of Presentations for Research Papers
Abstract
This thesis presents methods for automatically generating presentations from academic research papers using Large Language Models (LLMs), with the goal of saving academics time when preparing presentations of their research. Four methods are explored: a baseline LLM method, a Retrieval-Augmented Generation (RAG) method, a Two-Step method, and a combined RAG Two-Step method.
A dataset of 282 research paper and presentation pairs in LaTeX format was curated and made publicly available. The study introduces the novel BERTScore Slide Level Similarity (BSLS) metric for assessing the quality of generated presentations. Evaluated with several metrics, including ROUGE and BSLS, the baseline method performed best; the RAG, Two-Step, and RAG Two-Step methods showed no measurable improvement over it.
The tex2beam Python package was developed to facilitate this process and has been made publicly available. Although none of the alternative methods improved on the baseline, further research is proposed to refine these techniques and explore new evaluation metrics.