dc.contributor.author: Morten, Waersland
dc.date.accessioned: 2016-10-10T10:24:04Z
dc.date.available: 2016-10-10T10:24:04Z
dc.date.issued: 2016-06-15
dc.identifier.uri: http://hdl.handle.net/11250/2413858
dc.description: Master's thesis in Computer science [nb_NO]
dc.description.abstract: This thesis presents a technique for discovering and extracting unknown patterns in structured data. No prior knowledge is needed to discover the patterns, but applying prior knowledge allows them to be classified. When merging information from structured data, it is important that the correct information is merged together. Achieving this requires multiple techniques for analysing the information, and this thesis provides one that can increase the accuracy. Unknown patterns are discovered and extracted by collecting unique values in a trie structure. These patterns are represented as regular expressions and classified using a decision tree. The technique produces regular expressions that are efficient and accurate, and the decision tree classifies correctly with a score greater than 80%. The technique can be used to improve accuracy when merging structured data, increase knowledge about a file, detect ID values, and calculate other measurements, including the consistency of a file and whether it contains typographical errors. [nb_NO]
dc.language.iso: eng [nb_NO]
dc.publisher: University of Stavanger, Norway [nb_NO]
dc.relation.ispartofseries: Masteroppgave/UIS-TN-IDE/2016;
dc.subject: informasjonsteknologi [nb_NO]
dc.subject: information technology [nb_NO]
dc.subject: computer science [nb_NO]
dc.subject: machine learning [nb_NO]
dc.subject: trie structure [nb_NO]
dc.subject: regular expression [nb_NO]
dc.title: Text Pattern Discovery and Extraction [nb_NO]
dc.type: Master thesis [nb_NO]
dc.subject.nsi: VDP::Technology: 500::Information and communication technology: 550::Computer technology: 551 [nb_NO]

