Vis enkel innførsel

dc.contributor.advisorMeling, Hein
dc.contributor.authorNygaard, Racin
dc.date.accessioned2023-04-13T06:07:44Z
dc.date.available2023-04-13T06:07:44Z
dc.date.issued2023
dc.identifier.citationImproving Data Availability in Decentralized Storage Systems by Racin Nygaard, Stavanger : University of Stavanger, 2023 (PhD thesis UiS, no. 693)en_US
dc.identifier.isbn978-82-8439-158-8
dc.identifier.issn1890-1387
dc.identifier.urihttps://hdl.handle.net/11250/3062750
dc.descriptionPhD thesis in Information technologyen_US
dc.description.abstractPreserving knowledge for future generations has been a primary concern for humanity since the dawn of civilization. State-of-the-art methods have included stone carvings, papyrus scrolls, and paper books. With each advance in technology, it has become easier to record knowledge. In the current digital age, humanity may preserve enormous amounts of knowledge on hard drives with the click of a button. The aggregation of several hard drives into a computer forms the basis for a storage system. Traditionally, large storage systems have comprised many distinct computers operated by a single administrative entity. With the rise in popularity of blockchain and cryptocurrencies, a new type of storage system has emerged. This new type of storage system is fully decentralized and comprises a network of untrusted peers cooperating to act as a single storage system. During upload, files are split into chunks and distributed across a network of peers. These storage systems encode files using Merkle trees, a hierarchical data structure that provides integrity verification and lookup services. While decentralized storage systems are popular and have a user base in the millions, many technical aspects are still in their infancy. As such, they have yet to prove themselves viable alternatives to traditional centralized storage systems. In this thesis, we contribute to the technical aspects of decentralized storage systems by proposing novel techniques and protocols. We make significant contributions with the design of three practical protocols that each improve data availability in different ways. Our first contribution is Snarl and entangled Merkle trees. Entangled Merkle trees are resilient data structures that decrease the impact hierarchical dependencies have on data availability. Whenever a chunk loss is detected, Snarl uses the entangled Merkle trees to find parity chunks to repair the lost chunk. Our results show that by encoding data as an entangled Merkle tree and using Snarl’s repair algorithm, the storage utilization in current systems could be improved by over five times, with improved data availability. Second, we propose SNIPS, a protocol that efficiently synchronizes the data stored on peers to ensure that all peers have the same data. We designed a Proof of Storage-like construction using a Minimal Perfect Hash Function. Each peer uses the PoS-like construction to create a storage proof for those chunks it wants to synchronize. Peers exchange storage proofs and use them to efficiently determine which chunks they are missing. The evaluation shows that by using SNIPS, the amount of synchronization data can be reduced by three orders of magnitude in current systems. Lastly, in our third contribution, we propose SUP, a protocol that uses cryptographic proofs to check if a chunk is already stored in the network before doing wasteful uploads. We show that SUP may reduce the amount of data transferred by up to 94 % in current systems. The protocols may be deployed independently or in combination to create a decentralized storage system that is more robust to major outages. Each of the protocols has been implemented and evaluated on a large cluster of 1,000 peers.en_US
dc.language.isoengen_US
dc.publisherUniversity of Stavanger, Norwayen_US
dc.relation.ispartofseriesPhD thesis UiS;
dc.relation.ispartofseries;693
dc.relation.haspartPaper 1: Nygaard, R., Estrada-Galiñanes, V., Meling, H. (2021) Snarl: Entangled Merkle Trees for Improved File Availability and Storage Utilization. Proceedings of the 22nd International Middleware Conference. Middleware ’21. Québec city, Canada: Association for Computing Machinery, 2021, pp. 236–247. URL: https://doi.org/10.1145/3464298.3493397en_US
dc.relation.haspartPaper 2: Nygaard, R., Meling, H. SNIPS: Succinct Proof of Storage for Efficient Data Synchronization in Decentralized Storage Systems. To be submitted.en_US
dc.relation.haspartPaper 3: Nygaard, R., Meling, H., Olsen, J.I. (2023) Cost-effective Data Upkeep in Decentralized Storage Systems. Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. Tallinn, Estonia, 2023. URL: https://doi.org/10.1145/3555776.3577728en_US
dc.relation.haspartPaper 4: Nygaard, R. (2022) Lessons Learned from a Bare-metal Evaluation of Erasure Coding Algorithms in P2P Networks. in: arXiv abs/2208.12360 (2022). URL: https://arxiv.org/abs/2208.12360en_US
dc.rightsCopyright the author
dc.subjectdatalagringen_US
dc.subjectinformasjonsteknologien_US
dc.subjectstorage systemsen_US
dc.titleImproving Data Availability in Decentralized Storage Systemsen_US
dc.typeDoctoral thesisen_US
dc.rights.holder© 2023 Racin Nygaarden_US
dc.subject.nsiVDP::Mathematics and natural science: 400::Information and communication science: 420en_US


Tilhørende fil(er)

Thumbnail

Denne innførselen finnes i følgende samling(er)

Vis enkel innførsel