Probabilistic field mapping for product search

2015-06

Online shopping has grown rapidly in the last few years, and robust search
systems are arguably fundamental to e-commerce sites. Most importantly, sites
need retrieval systems that present the results that best satisfy customers'
purchase intent. To address the demand for such systems, we adapted retrieval
approaches based on a generative language modeling framework, representing
products as semi-structured documents.

We present and experimentally compare three alternative ranking functions
that make use of different prior estimates. The first method is a static field
weighting approach that relies on each field’s individual performance, taking
nDCG as the effectiveness measure. The two other methods dynamically assign
term-field weights according to the distribution of terms in each field’s
collection. These retrieval functions probabilistically infer, from the user’s
search keywords, the most likely matching product property. The two methods
differ in that one assumes a uniform field prior, whereas the other uses a
performance-based prior. The methods were evaluated using a relatively new
evaluation methodology, Living Labs, which evaluates ranking systems while real
customers are shopping online, in this case at the toy webshop ‘regiojatek.hu’.
In the experiment, the lab presents customers with an interleaved result list,
built with Team Draft interleaving from the production site’s ranking and our
experimental rankings. The lab employs an evaluation metric called “outcome”,
which we used to compare our methods and to interpret our results. Our results
show that both term-specific mapping methods outperformed the static weight
assignment approach. In addition, the results suggest that estimating field
mapping priors based on historical clicks does not outperform the setting where
the priors are uniformly distributed. Furthermore, we discovered that a
TREC-style evaluation that treats historical clicks as relevance indicators
ordered the methods inversely relative to Living Labs. One possible implication
is that Living Labs evaluation platforms are essential in IR tasks.
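
The term-specific field mapping summarized above can be illustrated with a
minimal sketch. It assumes a Bayes-rule formulation in which the probability of
a field given a query term is proportional to the term's relative frequency in
that field's collection times a field prior. The abstract does not spell out
the estimator, so the formula, the function names (field_term_probs,
mapping_probs), the toy field collections, and the prior values are all
hypothetical and chosen only for illustration.

```python
from collections import Counter


def field_term_probs(field_collections):
    """Maximum-likelihood estimate of P(term | field collection) per field."""
    probs = {}
    for field, tokens in field_collections.items():
        counts = Counter(tokens)
        total = sum(counts.values())
        probs[field] = {t: c / total for t, c in counts.items()} if total else {}
    return probs


def mapping_probs(term, term_probs, priors=None):
    """P(field | term) via Bayes' rule (assumed formulation, not the thesis's code).

    priors=None means a uniform field prior; otherwise pass a dict of
    field -> prior weight (for example, derived from per-field performance).
    """
    fields = list(term_probs)
    if priors is None:
        priors = {f: 1.0 / len(fields) for f in fields}
    joint = {f: term_probs[f].get(term, 0.0) * priors[f] for f in fields}
    norm = sum(joint.values())
    if norm == 0.0:
        # Term unseen in every field: fall back to the normalized prior.
        z = sum(priors[f] for f in fields)
        return {f: priors[f] / z for f in fields}
    return {f: p / norm for f, p in joint.items()}


# Hypothetical toy field collections (all tokens of a field across products).
collections = {
    "title": ["lego", "city", "police", "lego", "duplo"],
    "brand": ["lego", "mattel", "lego", "hasbro"],
    "description": ["police", "car", "with", "two", "figures", "city", "set", "lego"],
}
term_probs = field_term_probs(collections)

# Uniform prior versus a made-up performance-based prior.
uniform = mapping_probs("lego", term_probs)
weighted = mapping_probs(
    "lego", term_probs, priors={"title": 0.5, "brand": 0.3, "description": 0.2}
)
print(uniform)
print(weighted)
```

On this toy data the uniform prior maps "lego" most strongly to the brand
field, while the title-heavy prior shifts the most likely field to title. This
shows how the choice of prior alone can change the inferred product property.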

Master's thesis in Computer science

University of Stavanger, Norway

Masteroppgave/UIS-TN-IDE/2015;