Presentation Title

Towards Multilingual Search – Gathering Multilingual Query Logs using Crowdsourcing

Faculty Mentor

Ben Steichen

Start Date

17-11-2018 8:30 AM

End Date

17-11-2018 10:30 AM

Location

HARBESON 38

Session

POSTER 1

Type of Presentation

Poster

Subject Area

Engineering and Computer Science

Abstract

Research on improving Web search engines relies heavily on knowledge of current search trends, particularly typical user queries and their associated search intents (i.e., what the user was looking for). While commercial search engines can gather query information from their live query logs, these logs typically do not capture more precise intent information. Researchers have therefore begun to explore other techniques, such as crowdsourcing, to gather additional information related to a user query. However, while such query/intent log studies have been performed successfully in prior work, they have typically focused only on English queries.

By contrast, our research aims to develop novel search systems that support users with abilities in multiple languages (i.e., bilingual users). We are therefore interested in gathering information about typical user queries in different languages and from different geographical locations. To this end, we ran a large-scale crowdsourcing study to gather typical user queries in multiple languages (English, Spanish, and Chinese), together with descriptions of the intent that users had when issuing those queries. Results from the study show some clear differences between query/intent logs in different languages, and show that bilingual users sometimes choose different query languages depending on the search topic. An additional contribution of the work is the query/intent log itself, which other researchers can reuse when designing studies involving multilingual search engines.

This document is currently not available here.
