Probabilistic field mapping for product search
Master thesis
Permanent link
http://hdl.handle.net/11250/299612
Date issued
2015-06
Collections
- Studentoppgaver (TN-IDE) [823]
Abstract
Online shopping has grown rapidly in the last few years. Robust
search systems are arguably fundamental to e-commerce sites. Most importantly,
sites should have smart retrieval systems that present optimized results
that best satisfy customers' purchase intent. To address the demand for
such systems, we adapted retrieval approaches based on a generative language
modeling framework, representing products as semi-structured documents.
We present and experimentally compare three alternative ranking functions
that use different prior estimates. The first method is a static field
weighting approach that relies on each field’s individual performance, with
nDCG as the effectiveness measure. The other two methods dynamically assign
term-field weights according to the distribution of terms in each field’s
collection. These retrieval functions probabilistically infer, from the
user’s search keywords, the most likely matching product property. The two
methods differ in that one assumes a uniform field prior whereas the other
uses a performance-based prior. The methods were evaluated with a relatively
new evaluation methodology, Living Labs, which evaluates ranking systems
while real customers shop online, here at the toy webshop ‘regiojatek.hu’.
In the experiment, the lab presents customers with an interleaved result
list, built with Team Draft Interleaving from the production site’s ranking
and our experimental rankings. The lab employs an evaluation metric called
“outcome”, which we used to compare our methods and to interpret our
results. Our results show that both term-specific mapping methods
outperformed the static weight assignment approach. In addition, the results
suggest that estimating field mapping priors from historical clicks does not
outperform the setting where the priors are uniformly distributed.
Furthermore, we discovered that a TREC-style evaluation that treats
historical clicks as relevance indicators ordered the methods inversely
relative to Living Labs. A possible implication is that Living Labs
evaluation platforms are essential in IR tasks.
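
The abstract does not give the formulas behind the term-field mapping, so
the following is only a minimal sketch of one plausible reading: each query
term is mapped to fields by Bayes' rule, combining the term's relative
frequency in a field's collection with a field prior (uniform by default, or
supplied by the caller as a stand-in for a performance-based prior). The
function name, the field names, and the toy counts are hypothetical and are
not taken from the thesis.

```python
from collections import Counter


def field_mapping(term, field_collections, prior=None):
    """Return an estimate of P(field | term) for every field.

    field_collections: dict mapping a field name to a Counter of term
        frequencies aggregated over that field across all products.
    prior: optional dict mapping a field name to a non-negative weight;
        a uniform prior is assumed when it is omitted.
    """
    fields = list(field_collections)
    if prior is None:
        prior = {f: 1.0 for f in fields}
    prior_total = sum(prior[f] for f in fields)
    prior = {f: prior[f] / prior_total for f in fields}  # normalise the prior

    scores = {}
    for f in fields:
        counts = field_collections[f]
        total = sum(counts.values())
        # Relative frequency of the term in this field's collection.
        p_term = counts.get(term, 0) / total if total else 0.0
        scores[f] = p_term * prior[f]

    norm = sum(scores.values())
    if norm == 0:
        # Term unseen in every field: fall back to the prior itself.
        return prior
    return {f: s / norm for f, s in scores.items()}


# Hypothetical toy statistics for three product fields.
collections_by_field = {
    "title": Counter({"lego": 3, "car": 1}),
    "brand": Counter({"lego": 1, "mattel": 1}),
    "description": Counter({"car": 2, "red": 2, "lego": 1, "toy": 3}),
}

uniform = field_mapping("lego", collections_by_field)
weighted = field_mapping(
    "lego",
    collections_by_field,
    prior={"title": 0.2, "brand": 0.7, "description": 0.1},
)
```

With these toy counts, the uniform prior maps "lego" most strongly to the
title field, while the hand-picked non-uniform prior shifts the mapping
toward the brand field, which illustrates how the choice of prior can change
the ranking behaviour that the abstract compares.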
Description
Master's thesis in Computer science