Spoken Conversational Search Dataset

RMIT University, Australia
Johanne Trippas (Principal investigator); Damiano Spina (Associated with)
URL: https://jtrippas.github.io/Spoken-Conversational-Search/
Creator: Johanne Trippas
Date: 2018
Related publication: http://dx.doi.org/10.1145/3020165.3022144
Subjects: Speech-only search interactions; Conversational patterns; Interaction behaviours; Mixed initiative conversational behaviour; Information Retrieval and Web Search; Information and Computing Sciences; Library and Information Studies
Type: dataset
Language: English

Licence & Rights:

CC BY-NC: Attribution-Noncommercial 3.0 AU
http://creativecommons.org/licenses/by-nc/3.0/au

All rights reserved

Access:

Data Available In Link

Contact Information


Github

Full description

This dataset was created by observing participants addressing information needs of different complexity in an acoustic setting.

The file ConversationalSearchDataSet.csv contains a set of 101 transcribed conversations to solve information needs based on backstories. The collection and transcription process is described in Trippas, Spina, Cavedon, and Sanderson (2017), along with an initial analysis.

We provide all the releasable data in different files:

Transcripts (ConversationalSearchDataSet.csv)
Backstories (backstories_ConversationalSearchDataSet.csv)
Code book (CodeBook_CHIIR.pdf)

Transcript file structure
The file contains 10 columns:

Start time: Start time of the utterance.
Stop time: Stop time of the utterance.
Query: The reference to the information need participants are solving.
Query complexity: One of three levels, referencing the task complexity type (remember, understand, and analyze).
Role: Which of the participants is talking in that particular utterance. The roles are annotated as A_User (the participant who has the information need to be solved) and B_Receiver (the person who has access to the computer and search engine).
Action: The action the participant takes in that utterance. These actions are described in the code book and allow the results to be reproduced.
Transcript: Transcript of the utterance of the particular user in that particular time slot.
Notes: Comments, for example that the search was stopped by the user or the researcher, or extra notes relating to the participant's action in the search session.
Query counter: A counter that tracks how many turns there have been between the participants in that conversation. For the initial data release, only the first two turns are given. However, the first three turns are presented if the second turn is classified under the Meta-communication Theme (see the CHIIR 2017 paper for further information).
File name: Indicating the group number (2-14) and the date of the experiment.
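
The column layout above can be read with a few lines of Python. The following is a minimal sketch, not part of the dataset release. It assumes that the header row of ConversationalSearchDataSet.csv uses exactly the column names listed above, which should be checked against the actual file. The function names, the grouping key (File name plus Query), and the sample rows are hypothetical and exist only for illustration.

```python
import csv
import io
from collections import defaultdict

# Column names as listed in the description above. ASSUMPTION: the header row
# of ConversationalSearchDataSet.csv uses exactly these strings; check the
# actual file and adjust if it differs.
COLUMNS = [
    "Start time", "Stop time", "Query", "Query complexity", "Role",
    "Action", "Transcript", "Notes", "Query counter", "File name",
]


def load_utterances(handle):
    """Read utterance rows from an open CSV file handle into a list of dicts."""
    reader = csv.DictReader(handle)
    missing = [c for c in COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing expected columns: {missing}")
    return list(reader)


def group_by_session(rows):
    """Group utterances by (File name, Query).

    ASSUMPTION: this pair is used here as a convenient grouping key; the
    description does not state that it uniquely identifies a conversation.
    """
    sessions = defaultdict(list)
    for row in rows:
        sessions[(row["File name"], row["Query"])].append(row)
    return dict(sessions)


# Invented placeholder rows for illustration only; NOT taken from the dataset.
SAMPLE = io.StringIO(
    ",".join(COLUMNS) + "\n"
    "00:01,00:05,Q1,remember,A_User,action_x,placeholder text,,1,group02_date\n"
    "00:06,00:09,Q1,remember,B_Receiver,action_y,placeholder text,,1,group02_date\n"
    "00:01,00:04,Q2,analyze,A_User,action_x,placeholder text,,1,group02_date\n"
)

rows = load_utterances(SAMPLE)
sessions = group_by_session(rows)
user_turns = [r for r in rows if r["Role"] == "A_User"]
print(len(rows), len(sessions), len(user_turns))
```

To run this against the released file, replace SAMPLE with a handle from open("ConversationalSearchDataSet.csv", newline=""); the file encoding is not stated in the description, so it may need to be passed explicitly.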

Backstories file structure
The file backstories_ConversationalSearchDataSet.csv contains a set of nine selected information needs or topic backstories for this Spoken Conversational Search study that were authored in 2014 collectively by: Peter Bailey, Alistair Moffat, Falk Scholer, Paul Thomas.

Further information about the selected backstories can be found in Moffat, Bailey, Scholer, and Thomas (2014).

Preliminary Analysis
The released data comes from a preliminary analysis of a spoken conversational search experimental setup. Please note that coding the data set is an iterative process, so the coding has changed in subsequent analyses. We plan to release the full data set with an updated code book once it is finalized.

Abstract

We present preliminary findings from a study of mixed initiative conversational behaviour for informational search in an acoustic setting. The aim of the observational study is to reveal insights into how users would conduct searches over voice where a screen is absent but where users are able to converse interactively with the search system. We conducted a laboratory-based observational study of 13 pairs of participants each completing three search tasks with different cognitive complexity levels. The communication between the pairs was analyzed for interaction patterns used in the search process. This setup mimics the situation of a user interacting with a search system via a speech-only interface.

This dataset is part of a larger collection

Identifiers
  • Local : eedf3814b58151674e452915cc907551