Reasoning about Quality in Crowdsourced Enumeration Queries

Human perception and experience are powerful tools for collecting data to answer user queries, a capability exploited by hybrid human/machine database systems like CrowdDB. Consider queries asking for lists of items, such as “indoor plants that tolerate low light conditions” or “restaurants in San Francisco serving scallops”. The items in such a list could be spread across the web and/or may require human interpretation to find.

When we ask members of the crowd to provide individual items in the list, however, we are faced with questions regarding the quality of the answer set, including:

  • Is the set complete?
  • If not, how much progress have we made so far?
  • How hard would it be to get another new answer?

As crowd workers supply answers one by one, the arrival of new unique answers is rapid at first but then plateaus. This accumulation curve provides insight into reasoning about answer set completeness (in this example, workers are giving items from a set of size 50).
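
To make the accumulation curve concrete, here is a minimal sketch (our illustration, not code from the paper) that tracks the number of unique answers seen as each crowd response arrives; the answer stream below is hypothetical.

```python
def accumulation_curve(answers):
    """Return the number of unique answers seen after each response arrives."""
    seen = set()
    curve = []
    for answer in answers:
        seen.add(answer.strip().lower())  # crude normalization to catch duplicates
        curve.append(len(seen))
    return curve

# Hypothetical stream of crowd answers to "restaurants in San Francisco serving scallops"
stream = ["Waterbar", "Anchor & Hope", "waterbar", "Farallon", "Anchor & Hope", "Scoma's"]
print(accumulation_curve(stream))  # [1, 2, 2, 3, 3, 4] -- growth slows as duplicates arrive
```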

The key idea of our technique is to use the arrival rate of new answers from the crowd to reason about the completeness of the set, adapting techniques used by biologists for species estimation.

Imagine trying to determine the number of unique animal species on an island by repeatedly putting out traps overnight and recording which animals were caught (then releasing them). Species estimation algorithms infer the total number of species from the rate at which new species are identified.

The sequence of answers the crowd provides is analogous to these observations of animals: we can estimate the expected total number of answers based on how often each distinct answer has appeared so far.
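
For intuition, here is a minimal sketch using the classic Chao1 species estimator, one well-known estimator from this literature (the answers are hypothetical, and this particular estimator is shown only as an illustration, not as the paper's exact method). It estimates the total number of distinct answers from how many answers have been observed exactly once or exactly twice:

```python
from collections import Counter

def chao1_estimate(answers):
    """Estimate the total number of distinct answers with the Chao1 estimator:
    observed richness plus a correction driven by the number of answers seen
    exactly once (f1) and exactly twice (f2)."""
    counts = Counter(answers)
    s_obs = len(counts)                               # distinct answers observed
    f1 = sum(1 for c in counts.values() if c == 1)    # singletons
    f2 = sum(1 for c in counts.values() if c == 2)    # doubletons
    # bias-corrected form; avoids division by zero when there are no doubletons
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))

# Hypothetical answers to "indoor plants that tolerate low light conditions"
answers = ["pothos", "snake plant", "pothos", "zz plant", "peace lily",
           "snake plant", "cast iron plant", "philodendron"]
print(chao1_estimate(answers))  # 8.0: the estimator expects more answers beyond the 6 seen
```

When singletons dominate, the estimate sits well above the observed count, signaling that many answers remain; as duplicates accumulate and the curve plateaus, the estimate converges toward what has already been seen.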

However, we discovered that the way in which workers provide their answers differs from how species observations are made. Namely:

  • Individual workers do not give the same answer multiple times
  • Some workers provide more answers than others
  • Workers employ different techniques to find answers, which informs the order in which they provide them

In our work, we characterize the effect of these worker behaviors and devise a technique to reduce their impact on the estimates.
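
To see why these behaviors matter, the simulation sketch below (our illustration, not the paper's technique) models the first two of them: every simulated worker samples without replacement from the same underlying set of 50 items, and one prolific worker, a "streaker", contributes far more answers than the rest. The singleton and doubleton counts that species estimators rely on tend to look different under this process than under the with-replacement sampling those estimators assume.

```python
import random
from collections import Counter

def simulate_crowd(pool_size=50, n_workers=20, answers_per_worker=5,
                   streaker_answers=30, seed=0):
    """Simulate crowd answers to an enumeration query over pool_size items.
    Each worker samples without replacement (never repeating an answer),
    and worker 0 is a prolific 'streaker' who supplies many more answers."""
    rng = random.Random(seed)
    pool = list(range(pool_size))
    stream = []
    for worker in range(n_workers):
        k = streaker_answers if worker == 0 else answers_per_worker
        stream.extend(rng.sample(pool, k))   # per-worker sampling without replacement
    return stream

counts = Counter(simulate_crowd())
f1 = sum(1 for c in counts.values() if c == 1)   # answers seen exactly once
f2 = sum(1 for c in counts.values() if c == 2)   # answers seen exactly twice
print(f"distinct: {len(counts)} of 50, singletons f1: {f1}, doubletons f2: {f2}")
```

Feeding streams like this into an estimator such as the Chao1 sketch above shows how its output can shift, which is the effect the paper characterizes and reduces.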

For more, see our full paper, Crowdsourced Enumeration Queries.

Beth Trushkowsky, AMPLab, UC Berkeley
Tim Kraska, Brown University
Mike Franklin, AMPLab, UC Berkeley
Purna Sarkar, AMPLab, UC Berkeley