Show simple item record

dc.contributor.author: Ming, Hao
dc.date.accessioned: 2012-09-26T08:42:00Z
dc.date.available: 2012-09-26T08:42:00Z
dc.date.issued: 2012
dc.identifier.uri: http://hdl.handle.net/11250/181795
dc.description: Master's thesis in Computer science [no_NO]
dc.description.abstract: Ontologies are representations of the entities and relationships that structure an application area. They are important for tasks such as data integration, natural-language processing, information retrieval, and decision support. The NCBO Resource Index is a system for ontology-based annotation and indexing of biomedical data. As its data volume grows, a distributed processing method is needed that can store, process, and query this large-scale data efficiently. This work builds on the master's thesis of B. Byambajav, Methods for Large-scale Semantic Expansion on Hadoop Architecture, and seeks a better solution for processing NCBO Resource Index data, focusing on performance optimization of the left outer join on the map side. We studied and compared different kinds of join algorithms. To run more effective experiments, we examined the characteristics of HDFS and DistributedCache and then implemented a map-side left outer join algorithm on the Hadoop platform. For performance optimization, we evaluated several methods of controlling the number of map tasks. Based on the experimental results, we tuned critical parameters and drew a number of conclusions. From these conclusions, we found that the map-side join works well and achieves better results than previous work. [no_NO]
dc.language.iso: eng [no_NO]
dc.publisher: University of Stavanger, Norway [no_NO]
dc.relation.ispartofseries: Masteroppgave/UIS-TN-IDE/2012;
dc.subject: informasjonsteknologi (information technology) [no_NO]
dc.subject: signalbehandling (signal processing) [no_NO]
dc.subject: MapReduce [no_NO]
dc.subject: HDFS [no_NO]
dc.subject: left outer join [no_NO]
dc.subject: map side join [no_NO]
dc.subject: performance analysis and optimization [no_NO]
dc.subject: datateknikk (computer engineering) [no_NO]
dc.title: Performance analysis and optimization of left outer join on map side [no_NO]
dc.type: Master thesis [no_NO]
dc.subject.nsi: VDP::Technology: 500::Information and communication technology: 550 [no_NO]


