Full description

This is the Wand and Block-Max Wand (BMW) code used for the experiments in the WSDM 2017 paper "A Comparison of Document-at-a-Time and Score-at-a-Time Query Evaluation". All of the systems were implemented to use the same underlying index, an index from the ATIRE search engine, so this ATIRE index must be built before the Wand/BMW indexes can be built. We have provided end-to-end scripts that build the appropriate ATIRE index, construct the Wand/BMW indexes from it, and then run the queries: see run_gov2.sh and run_clueweb_09.sh. Note that the ATIRE syntax differs between the GOV2 and ClueWeb collections, and that the run_clueweb_09.sh script will work for ClueWeb12 too.

Abstract

We present an empirical comparison between document-at-a-time (DAAT) and score-at-a-time (SAAT) document ranking strategies within a common framework. Although both strategies have been extensively explored, the literature lacks a fair, direct comparison: such a study has been difficult due to vastly different query evaluation mechanics and index organizations. Our work controls for score quantization, document processing, compression, implementation language, implementation effort, and a number of details, arriving at an empirical evaluation that fairly characterizes the performance of three specific techniques: WAND (DAAT), BMW (DAAT), and JASS (SAAT). Experiments reveal a number of interesting findings. The performance gap between WAND and BMW is not as clear as the literature suggests, and both methods are susceptible to tail queries that may take orders of magnitude longer than the median query to execute. Surprisingly, approximate query evaluation in WAND and BMW does not significantly reduce the risk of these tail queries. Overall, JASS is slightly slower than either WAND or BMW, but exhibits much lower variance in query latencies and is much less susceptible to tail query effects. Furthermore, JASS query latency is not particularly sensitive to the retrieval depth, making it an appealing solution for performance-sensitive applications where bounds on query latencies are desirable.
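
To make the DAAT/SAAT distinction in the abstract concrete, the following is a minimal Python sketch of the two traversal orders over a toy in-memory index. It is not the released code and it is not WAND, BMW, or JASS: both functions are plain exhaustive variants with no pruning. The index contents, the function names (daat, saat, top_k), and the max_postings budget are all hypothetical and chosen only for illustration; the integer impacts stand in for quantized scores.

```python
from collections import defaultdict
import heapq

# Toy index (made-up data): term -> list of (doc_id, impact), where the
# integer impact stands in for a quantized score contribution.
INDEX = {
    "wand": [(1, 3), (2, 1), (4, 2)],
    "index": [(1, 1), (3, 2), (4, 3)],
}


def top_k(scores, k):
    """Return the k best (doc_id, score) pairs, best score first, ties by doc_id."""
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:k]


def daat(query, k):
    """Exhaustive document-at-a-time: walk doc-id-ordered lists in parallel
    and finish scoring one document before moving to the next."""
    lists = [sorted(INDEX.get(term, [])) for term in query]
    cursors = [0] * len(lists)
    heap = []  # min-heap of (score, -doc_id) holding the current top k
    while True:
        heads = [lst[c][0] for lst, c in zip(lists, cursors) if c < len(lst)]
        if not heads:
            break
        doc_id = min(heads)  # next document in doc-id order
        score = 0
        for i, lst in enumerate(lists):
            c = cursors[i]
            if c < len(lst) and lst[c][0] == doc_id:
                score += lst[c][1]
                cursors[i] += 1
        heapq.heappush(heap, (score, -doc_id))
        if len(heap) > k:
            heapq.heappop(heap)  # drop the weakest candidate
    return sorted(((-neg_id, s) for s, neg_id in heap),
                  key=lambda pair: (-pair[1], pair[0]))


def saat(query, k, max_postings=None):
    """Exhaustive score-at-a-time: process postings from highest impact to
    lowest, adding into per-document accumulators. The hypothetical
    max_postings budget stops early and returns an approximate ranking."""
    postings = [(impact, doc_id)
                for term in query
                for doc_id, impact in INDEX.get(term, [])]
    postings.sort(key=lambda p: (-p[0], p[1]))  # highest impact first
    if max_postings is not None:
        postings = postings[:max_postings]
    accumulators = defaultdict(int)
    for impact, doc_id in postings:
        accumulators[doc_id] += impact
    return top_k(accumulators, k)


query = ["wand", "index"]
exact_daat = daat(query, k=2)
exact_saat = saat(query, k=2)
budgeted_saat = saat(query, k=2, max_postings=2)
```

When run to completion, both traversals return the same ranking for this toy query. With a budget of two postings, the SAAT sketch still surfaces the same two documents but with partial scores, which hints at why a cap on processed postings can bound the work done per query in a score-at-a-time design.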

Identifiers

Local : cb43266c2cd7e0630d0baed1a3f31dba

Wand and Block-Max Wand Code for "A Comparison of Document-at-a-Time and Score-at-a-Time Query Evaluation"

This dataset is part of a larger collection
