Abstract
In late 2022, OpenAI released ChatGPT to the public, a chatbot built on a Large Language Model that could solve a plethora of tasks through text-based inputs. However, despite the immense popularity of ChatGPT, most people do not know how it works or how it came to be. This master's thesis aims to explain the core building blocks behind the Large Language Model by covering neural networks, word vectors, and the ChatGPT Transformer model. The thesis also traces the history of neural networks as they led to Large Language Models and, subsequently, the Transformer model. The potential future of the model is discussed in terms of model collapse and how it may result in a “dumber” chatbot. Finally, the thesis discusses the usage of ChatGPT with regard to model collapse and examines an example of the chatbot producing incorrect results from its given inputs. Hopefully, through understanding the composition, capabilities, and shortcomings of the Large Language Model, readers will have a better idea of how to educate others about ChatGPT and gain a broader understanding of how to utilize it.