The Human Flesh Search: Large-Scale Crowdsourcing for a Decade and Beyond

Human Flesh Search (HFS, 人肉搜索 in Chinese) is a Web-enabled, large-scale crowdsourcing phenomenon, driven mostly by voluntary crowd power without cash rewards, that originated in China a decade ago. It is a new form of search and problem solving that relies on collaboration among a potentially large number of voluntary Web users. The term “human flesh” is an unfortunate literal translation of the Chinese name and actually refers to human empowerment; “crowd-powered search” would be a more accurate English term. HFS has seen tremendous growth since its inception in 2001 (Figure 1).

Figure 1. (a) Types of HFS episodes, and (b) evolution of HFS episodes based on social desirability

HFS has been a unique Web phenomenon for just over 10 years, and it presents a valuable test-bed for scientists to validate existing and new theories in social computing, sociology, behavioral sciences, and related fields. Based on a comprehensive dataset of HFS episodes collected from participants’ discussions on the Internet, we performed a series of empirical studies focusing on the scope of HFS activities, the patterns of the HFS crowd collaboration process, and the unique characteristics and dynamics of HFS participant networks. Further results on HFS participant networks can be found in two papers published in 2010 and 2012 (Additional readings 1 and 2).

In this paper, we surveyed HFS participants to gain an in-depth understanding of the HFS community and the various factors that motivate participants to contribute. The survey results shed light on HFS participants and, more broadly, on the people involved in crowdsourcing systems. Most participants contribute to HFS voluntarily, without expecting monetary rewards (real-world or virtual).

The findings point to great potential for researchers to explore how to design more effective and efficient crowdsourcing systems, and how to better harness the power of the crowd for social good, for complex problem solving, and even for business purposes such as marketing and management.

For more, see our full paper, The Chinese “Human Flesh” Web: the first decade and beyond (a preprint is also available upon request).

Qingpeng Zhang, City University of Hong Kong

Additional readings:

  1. Wang F-Y, Zeng D, Hendler JA, Zhang Q, et al. (2010). A study of the human flesh search engine: Crowd-powered expansion of online knowledge. Computer, 43: 45–53. doi:10.1109/MC.2010.216
  2. Zhang Q, Wang F-Y, Zeng D, Wang T (2012). Understanding crowd-powered search groups: A social network perspective. PLoS ONE 7(6): e39749. doi:10.1371/journal.pone.0039749

CrowdCamp Report: HelloCrowd, The “Hello World!” of human computation

The first program a new computer programmer writes in any new programming language is the “Hello world!” program – a single line of code that prints “Hello world!” to the screen.

We ask, by analogy: what should the first “program” written by a new user of crowdsourcing or human computation be? “HelloCrowd!” is our answer.

The “Hello, World!” task: the simplest possible “human computation program”

Crowdsourcing and human computation are becoming ever more popular tools for answering questions, collecting data, and providing human judgment. At the same time, there is a disconnect between interest and ability: potential new users of these powerful tools often don’t know how to get started. Not everyone wants to take a graduate course in crowdsourcing just to get their first results. To bridge this gap, we set out to build an interactive tutorial that teaches the fundamentals of crowdsourcing.

After creating an account, HelloCrowd tutorial users get their feet wet by posting three simple tasks to the crowd platform of their choice. In addition to the “Hello, World” task above, we chose two common crowdsourcing tasks: image labeling and information retrieval from the web. In the first, workers provide a label for an image of a fruit; in the second, workers must find the phone number of a restaurant. These tasks can be reused and posted to any crowd platform you like; we provide simple instructions for some common platforms, and the interactive tutorial auto-generates the task URLs for each tutorial user and each platform.

Mmm, crowdsourcing is delicious
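The tasks can be posted through each platform’s own web interface, but for readers who prefer an API, here is a minimal sketch of posting the “Hello, Crowd!” task to Amazon Mechanical Turk’s sandbox. It assumes the boto3 MTurk client and configured AWS credentials; the task URL shown is a hypothetical placeholder for the one the tutorial generates for you.

# Minimal sketch: post the "Hello, Crowd!" task as an ExternalQuestion on the
# MTurk sandbox. Assumes the boto3 MTurk client and AWS credentials already set
# up; the task URL is a placeholder for the one HelloCrowd auto-generates.
import boto3

mturk = boto3.client(
    "mturk",
    region_name="us-east-1",
    endpoint_url="https://mturk-requester-sandbox.us-east-1.amazonaws.com",
)

task_url = "https://example.com/hellocrowd/hello-world"  # hypothetical URL
question_xml = f"""
<ExternalQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd">
  <ExternalURL>{task_url}</ExternalURL>
  <FrameHeight>400</FrameHeight>
</ExternalQuestion>"""

hit = mturk.create_hit(
    Title="Hello, Crowd!",
    Description="Type the phrase shown on the page.",
    Keywords="tutorial, hello world",
    Reward="0.05",                     # US dollars, passed as a string
    MaxAssignments=3,                  # redundancy: ask three different workers
    LifetimeInSeconds=3600,
    AssignmentDurationInSeconds=300,
    Question=question_xml,
)
print("Posted HIT:", hit["HIT"]["HITId"])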

More than just another tutorial on “how to post tasks to MTurk,” Hello Crowd aims to teach fundamental concepts. After posting tasks, new crowdsourcers will learn how to interpret their results (and how to get even better results next time). For example, what might a new crowdsourcer learn from the results of the “hello world” task or the business phone number task? Phone numbers are simple, right? What about “867-5309” vs. “555.867.5309” vs. “+1 (555) 867 5309”? Our goal is to get new users of these tools up to speed on how to get good results: form validation (or not), redundancy, task instructions, and so on.
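As a small illustration of that last point (our own example, not part of the tutorial itself), here is one way a requester might normalize free-text phone-number answers after the fact so that superficially different responses can be compared:

# Illustrative only: normalize worker-supplied US phone numbers so that
# "867-5309", "555.867.5309" and "+1 (555) 867 5309" can be compared.
import re

def normalize_us_phone(raw):
    """Reduce a free-text US phone number to its bare digits, or None."""
    digits = re.sub(r"\D", "", raw)           # strip spaces, dots, dashes, parens
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]                   # drop a leading country code
    if len(digits) in (7, 10):                # local or full 10-digit number
        return digits
    return None                               # unusable answer; ask again?

answers = ["867-5309", "555.867.5309", "+1 (555) 867 5309"]
print([normalize_us_phone(a) for a in answers])
# ['8675309', '5558675309', '5558675309']  -- two of the three now agree

Whether to enforce a format up front (form validation) or to clean up afterwards is exactly the kind of trade-off mentioned above.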

In addition to teaching new crowdsourcers how to crowdsource, our tutorial system will be collecting a longitudinal, cross-platform dataset of crowd responses.  Each person who completes the tutorial will have “their” set of worker responses to the standard tasks, and these are all added together into a public dataset that will be available for future research on timing, speed, accuracy and cost.

We’re very proud of HelloCrowd, and hope you’ll consider giving our tutorial a try.

Christian M. Adriano, Donald Bren School, University of California, Irvine
Juho Kim, MIT CSAIL
Anand Kulkarni, MobileWorks
Andy Schriner, University of Cincinnati
Paul Zachary, Department of Political Science, University of California, San Diego

Paying human computers by the bit

Collective human computation – presenting objective questions to multiple humans and collecting their judgments – is a powerful and increasingly popular paradigm for performing computational tasks beyond the reach of today’s algorithms. From image classification to data validation, the human computer is making a comeback.

But how should we measure the performance of a human doing a computational task? Speed without accuracy is worthless, and accuracy itself is hard to measure in classification or estimation tasks in which a close-to-correct judgment still has value.

I assert that the value of a judgment is the amount by which it reduces the surprise of learning the correct answer to a question. This is a basic concept in information theory: the pointwise mutual information between the judgment and the answer.

For example, a classification problem with four equally-likely categories has entropy of 2 bits per question. If you correctly classify a series of objects, you’re giving the full 2 bits of information for each. If you’re a spammer giving judgments that are statistically independent of the correct categories, you’re giving zero information no matter what your spamming strategy is.

Thus, the net value of a contributor’s judgments is the total amount of information they give us, a well-defined extensive quantity that we can measure in bits (or nats or digits, if you please).
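To make the metric concrete, here is a toy sketch for the four-category example above; the “perfect” and “spammer” contributor models are invented for illustration and are not taken from the paper.

# Toy illustration of the metric: the value of a judgment j about answer a is
#   pmi(j, a) = log2( P(j | a) / P(j) )  bits.
# The "perfect" and "spammer" contributor models below are assumed examples.
import math

K = 4
p_answer = [1.0 / K] * K                      # four equally likely categories

def pmi_bits(p_j_given_a, p_answer, j, a):
    """Pointwise mutual information between judgment j and answer a, in bits."""
    # Marginal P(j) = sum over a of P(j | a) * P(a)
    p_j = sum(p_j_given_a[k][j] * p_answer[k] for k in range(len(p_answer)))
    return math.log2(p_j_given_a[a][j] / p_j)

# Perfect contributor: always reports the correct category.
perfect = [[1.0 if j == a else 0.0 for j in range(K)] for a in range(K)]
# Spammer: judgment is independent of the answer (say, always category 0).
spammer = [[1.0 if j == 0 else 0.0 for j in range(K)] for a in range(K)]

print(pmi_bits(perfect, p_answer, j=2, a=2))  # 2.0 bits: the full entropy
print(pmi_bits(spammer, p_answer, j=0, a=2))  # 0.0 bits: no information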

This metric has the advantages of being naturally motivated, task- and model-agnostic, and free of tuning, and it plugs easily into any resolution algorithm that models contributors and answers as random variables.

Expected values (i.e., entropy) can be used to predict a contributor’s performance on a given question, conditioned on what is already known about that question. Contributors can then be preferentially given the questions for which they are likely to be most informative. By applying this technique to data from Galaxy Zoo 2 (a crowdsourced deep-field galaxy classification project, part of the Zooniverse program), I was able to demonstrate a substantial improvement in accuracy compared to random assignment of questions to contributors.
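A rough sketch of that routing idea, using an invented contributor confusion matrix and made-up question posteriors: compute the expected information (the mutual information between the contributor’s judgment and the answer, given the current belief about the question) and send the contributor to the question where it is largest.

# Illustrative routing by expected information. The contributor's confusion
# matrix and the per-question posteriors are invented for this example.
import math

def expected_information_bits(confusion, posterior):
    """I(J; A) in bits for one contributor on one question with prior P(A)."""
    K = len(posterior)
    p_j = [sum(confusion[a][j] * posterior[a] for a in range(K)) for j in range(K)]
    info = 0.0
    for a in range(K):
        for j in range(K):
            p_aj = confusion[a][j] * posterior[a]     # joint P(A=a, J=j)
            if p_aj > 0.0:
                info += p_aj * math.log2(confusion[a][j] / p_j[j])
    return info

# A fairly reliable contributor: 80% correct, errors spread evenly.
confusion = [[0.8 if j == a else 0.2 / 3 for j in range(4)] for a in range(4)]

questions = {
    "q1": [0.25, 0.25, 0.25, 0.25],   # nothing known yet
    "q2": [0.97, 0.01, 0.01, 0.01],   # already nearly resolved
}
gains = {q: expected_information_bits(confusion, p) for q, p in questions.items()}
print(gains)                          # q1 promises far more information
print(max(gains, key=gains.get))      # -> 'q1': give this contributor q1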

Finally, we can measure the cost-effectiveness of the judgment collection process or the information efficiency of the resolution algorithm in terms of the total information received from contributors. Related metrics can be used to measure the overlap in information between two contributors or the information wasted by collecting redundant judgments.

The metrics I present can be mixed into any human computation resolution algorithm that uses a statistical model to turn judgments into answers, by using the model’s estimated parameters to compute a set of conditional probabilities and then plugging these into the definitions of the information-theoretic quantities. The paper includes worked examples for several models.

For more, see the full paper:
Pay by the Bit: An Information-Theoretic Metric for Collective Human Judgment

Tamsyn P Waterhouse, Google Inc.

CfP: new Crowdsourcing area at ACM Multimedia 2013

Crowdsourcing Area at ACM MM 2013
The 21st ACM International Conference on Multimedia
October 21–25, 2013, Barcelona, Spain
Call for Papers: http://acmmm13.org/submissions/call-for-papers/

Following the successful CrowdMM workshop at ACM Multimedia last year, we have added Crowdsourcing as an official technical program area (long and short papers) for ACM MM 2013 in Barcelona, Spain. ACM Multimedia is the flagship conference of ACM SIGMM.

AREA DESCRIPTION

Crowdsourcing makes use of human intelligence and a large pool of contributors to address problems that are difficult to solve using conventional computation. This new area cross-cuts traditional multimedia topics and solicits submissions dedicated to results and novel ideas in multimedia that are made possible by the crowd, i.e., they exploit crowdsourcing principles and techniques. Crowdsourcing is considered to encompass the use of: microtask marketplaces, games-with-a-purpose, collective intelligence and human computation. Topics include, but are not limited to:

  • Exploiting crowdsourcing for multimedia generation, interpretation, sharing or retrieval
  • Learning from crowd-annotated or crowd-augmented multimedia data
  • Economics and incentive structures in multimedia crowdsourcing systems
  • Crowd-based design and evaluation of multimedia algorithms and systems
  • Crowdsourcing in multimedia systems and applications such as Art & Culture, Authoring, Collaboration, Mobile & Multi-device, Multimedia Analysis, Search, and Social Media.

Submissions should have both a clear focus on multimedia and also a critical dependency on crowdsourcing techniques.

CONFERENCE INFO

Since the founding of ACM SIGMM in 1993, ACM Multimedia has been the premier worldwide conference and a key event for presenting scientific achievements and innovative industrial products in the multimedia field. ACM Multimedia 2013 celebrates the conference’s twenty-first iteration with an extensive program of technical sessions covering all aspects of multimedia in the form of oral and poster presentations, tutorials, panels, exhibits, demonstrations, and workshops, bringing into focus the principal subjects of investigation, along with competitions of research teams on challenging problems and an interactive art program that brings artists and computer scientists together to explore the frontiers of artistic communication.

IMPORTANT DATES

  • Abstract for Full Papers: March 1, 2013
  • Manuscript for Full/Short Papers: March 8, 2013
  • Rebuttal: May 8–17, 2013
  • Author-to-Author’s Advocate contact period: May 8–13, 2013
  • Notification of Acceptance: June 25, 2013
  • Camera-ready submission: July 30, 2013
  • Conference: October 21–25, 2013, Barcelona, Spain

CONFERENCE WEBSITE

http://acmmm13.org

CfP: Crowdsourcing in Virtual Communities track at AMCIS 2013

Mini-track: Crowdsourcing in Virtual Communities
19th Americas Conference on Information Systems (AMCIS 2013)
August 15-17, 2013 in Chicago, Illinois, USA
Link: http://amcis2013.aisnet.org/?option=com_content&id=69

Following the successful crowdsourcing tracks at ACIS 2011 and AMCIS 2012, we are accepting submissions to this year’s AMCIS 2013 crowdsourcing track in Chicago. AMCIS is one of the largest annual conferences in the field of Information Systems, with about 1,000 participants.

DESCRIPTION
Crowdsourcing harnesses the potential of large networks of people via open calls for contribution and thus enables organizations to tap into a diversity of knowledge, skills, and perspectives. Fueled by the increasing pervasiveness of the Internet, crowdsourcing has been rapidly gaining importance in a wide range of contexts, both in research and practice. In order to provide better guidance for future crowdsourcing efforts, it is crucial to gain a deeper and integrated understanding of the phenomenon. While research on crowdsourcing is multidisciplinary, information systems take a central role in realizing crowdsourcing approaches by interconnecting organizations and globally distributed contributors. By viewing crowdsourcing from an IS perspective, this track aims to channel related research directions and move from the consideration of isolated aspects and applications to a systemic foundation for the design of socio-technical crowdsourcing systems.

We encourage submissions from theoretical, empirical, and design science research on the following and adjacent topics:
- Crowdsourcing ecosystems and markets
- Platforms, tools, and technologies
- Task characteristics, task design, and task choice
- Contributor motivation and incentive structures
- Design of workflows and processes
- Mobile crowdsourcing
- Quality assurance and evaluation of contributions
- Economics of crowdsourcing
- Case studies of crowdsourcing effectiveness
- Adoption of crowdsourcing business models
- Innovative applications

IMPORTANT DATES
January 4, 2013: Bepress will start accepting paper submissions
February 22, 2013 (11:59 pm CST): Deadline for paper submissions
April 22, 2013: Authors notified of acceptance decisions
May 9, 2013: Camera-ready copy due for accepted papers

Announcing HCOMP 2013 – Conference on Human Computation and Crowdsourcing

Bjoern Hartmann, UC-Berkeley 
Eric Horvitz, Microsoft Research

Announcing HCOMP 2013, the Conference on Human Computation and Crowdsourcing, to be held in Palm Springs, November 7-9, 2013. The paper submission deadline is May 1, 2013. Thanks to the HCOMP community for bringing HCOMP to life as a full conference, following the successful workshop series.

HCOMP 2013 at Palm Springs

The First AAAI Conference on Human Computation and Crowdsourcing (HCOMP 2013) will be held November 7-9, 2013 in Palm Springs, California, USA. The conference was created by researchers from diverse fields to serve as a key focal point and scholarly venue for the review and presentation of the highest quality work on principles, studies, and applications of human computation. It aims to promote the scientific exchange of advances in human computation and crowdsourcing among researchers, engineers, and practitioners across a spectrum of disciplines. Paper submissions are due May 1, 2013, with author notification on July 16, 2013. Workshop and tutorial proposals are due May 10, 2013. Poster and demonstration submissions are due July 25, 2013.

For more information, see the HCOMP 2013 website.

CfP: Journal of Business Economics Special Issue on Crowdsourcing and Cloudsourcing (Deadline Sep 30, 2012)

This special issue of the Journal of Business Economics (JBE), one of the leading professional journals in the business economics sector, addresses contemporary facets of crowdsourcing and cloudsourcing: cloud computing as a delivery channel for new applications (Software-as-a-Service), platforms, and infrastructures, and the use of the highly specialized talents and expertise of the crowd, as facilitated by the cloud.

Crowdsourcing and cloudsourcing enable organizations to minimize time to project completion and to maximize access to the smartest global talent. Companies are able to scale up quickly and enhance overall performance. The recent announcement of IBM’s future job model, which seeks to replace tenure with temporary assignments, is a prominent example of how the adoption of crowdsourcing concepts may have a vast economic and social impact far exceeding the boundaries of the firm.

Crowdsourcing track at AMCIS 2012 in Seattle

Building on our crowdsourcing track at ACIS 2011 in Sydney, we are organizing a crowdsourcing mini-track at this year’s AMCIS conference in Seattle (August 9-12), covering a broad range of topics from human computation to open innovation. We look forward to a varied discussion on the role of information systems in crowdsourcing and related fields. The submission deadline is March 1st.

CrowdNet Workshop in Karlsruhe, Germany, Jan 26th 2012

CrowdNet 2012, the 2nd Workshop on Cloud Labor and Human Computation, is coming to Karlsruhe, Germany, on Thursday, January 26th, 2012. The CrowdNet workshop aims to bring together researchers and practitioners from various disciplines and industries who are interested in the scientific and economic challenges of cloud labor and human computation. This year’s workshop is intended as an informal platform for presenting results from research and practice, as well as for discussing innovative and breakthrough ideas.

The workshop features over fifteen invited talks. Among others, industry representatives from IBM, Clickworker, SAP, and CrowdEngineering will share first-hand insights from their respective platforms. Researchers will then present on various challenges of cloud labor, including human computation with games, motivation, quality assurance, and programmable human computers.

Worker Motivation in Crowdsourcing and Human Computation

Many human computation systems use crowdsourcing markets like Amazon Mechanical Turk to recruit human workers. Payment in these markets is usually very low, and yet the demographic data collected so far show that participants are a very diverse group that includes highly skilled full-time workers. Many existing studies of their motivation are rudimentary and not grounded in established motivation theory. We therefore adapt models from classic motivation theory, work motivation theory, and Open Source Software development to crowdsourcing markets. The model is tested with a survey of 431 workers on Mechanical Turk. We find that the extrinsic motivational categories (immediate payoffs, delayed payoffs, social motivation) have a strong effect on the time spent on the platform. For many workers, however, intrinsic motivation is more important, especially the different facets of enjoyment-based motivation such as “task autonomy” and “skill variety”.

Model for Worker Motivation in Crowdsourcing and Human Computation

Our proposed model is based on motivation theory (Deci & Ryan 1985, 2000; Lindenberg 2001), work motivation theory (Hackman & Oldham 1980), and open source software development (Lakhani & Wolf 2003). It comprises motivating factors that can be classified as either intrinsic or extrinsic. Each category is influenced by one or more constructs that affect the overall motivation of workers.

Intrinsic Motivation

Intrinsic motivation exists if an individual acts for the sake of the fulfillment generated by the activity itself (e.g., acting just for fun). Within intrinsic motivation, two categories are distinguished: Enjoyment Based and Community Based Motivation. The category of Enjoyment Based Motivation contains the factors that lead to the sensation of “fun” that may be perceived by the workers. These factors are measured by the following constructs:

  • Skill Variety: Use of a diverse set of skills that are needed to solve a specific task and that fit the worker’s skill set; e.g., a worker picks a translation task because he likes translating and wants to use his skills in his favorite foreign language.
  • Task Identity: Refers to the extent to which a worker perceives the completeness of the task he has to do. The more tangible the result of his work, the higher his motivation; e.g., a task that allows him to see how the result of his work will be used.
  • Task Autonomy: Refers to the degree of freedom allowed to the worker during task execution; e.g., a worker who is motivated because a certain task allows him to be creative.
  • Direct Feedback from the Job: Covers the extent to which a sense of achievement can be perceived during or after task execution. This is explicitly limited to direct feedback from the work on the task itself, not feedback from other persons.
  • Pastime: Covers acting just to “kill time”. It appears when a worker does something in order to avoid boredom.

The category of Community Based Motivation covers the acting of workers guided by the platform community:

  • Community Identification: Covers the acting of workers guided by the subconscious adoption of norms and values from the crowdsourcing platform community, which is caused by a personal identification process.
  • Social Contact: Covers motivation caused by the sheer existence of the community that offers the possibility to foster social contact; i.e. meeting new people.

Extrinsic Motivation

In the case of extrinsic motivation, the activity is just an instrument for achieving a certain desired outcome (e.g., acting for money or to avoid sanctions). Three motivational categories belong to extrinsic motivation: Immediate Payoffs, Delayed Payoffs, and Social Motivation. The category of Immediate Payoffs covers all kinds of immediately received compensation for the work on crowdsourcing tasks:

  • Payment: Motivation by the monetary remuneration received for completing a task.

Delayed Payoffs address all kinds of benefits that can be used strategically to generate future material advantages. This type of motivation is measured by:

  • Signaling: Refers to the use of actions as strategic signals to one’s surroundings; e.g., a worker selects tasks in order to show presence and improve his chance of being noticed by potential employers.
  • Human Capital Advancement: Refers to motivation arising from the possibility of training skills that could be useful for generating future material advantages.

The category of Social Motivation is the extrinsic counterpart of intrinsic motivation by community identification. It covers extrinsic motivation arising from values, norms, and obligations outside the platform community, as well as indirect feedback from the job and the need for social contact:

  • Action Significance by External Values: Captures the significance an action has, in terms of compliance with values from outside the crowdsourcing community, as perceived by the worker when contributing to the community or working on a task.
  • Action Significance by External Obligations & Norms: Motivation induced by a third party from outside the platform community, tracing back to obligations a worker has or social norms he wants to comply with in order to avoid sanctions (material obligations are not included).
  • Indirect Feedback from the Job: Covers motivation caused by the prospect of feedback about the delivered working results by other individuals; e.g. working on tasks to get positive feedback from requesters.

A more detailed overview of the individual constructs, including examples and references, is given in the appendix of our HCOMP paper. We tested the model with a survey on Amazon Mechanical Turk. Details about the survey design, data collection, data analysis, and the influence of demographics on motivation can be found in our AMCIS 2011 paper, which is an extended version of the paper that will be presented at HCOMP 2011 in August. On MTurk, many intrinsic motivation factors seem to dominate the extrinsic ones. Task-related factors play a major role in the continuum of factors that motivate workers, including the use of a variety of skills, deciding on one’s own how to solve a task, and the “feasibility” of work results. Surprisingly, this list also explicitly includes the (extrinsic) motivation to work on tasks in order to learn new skills or train existing ones, which the related literature has not yet considered to be that important.

A general model of worker motivation in paid crowdsourcing environments and human computation systems is a prerequisite for many further research directions in this area. An interesting research question is the connection between the properties of tasks and platforms and the resulting motivation. The question of how workers can be motivated to contribute better results is also very promising. We welcome researchers to apply and extend our model to different platforms and to use the constructs alongside demographic data in related research on Amazon Mechanical Turk.

Kaufmann, N., Schulze, T., Veit, D. (2011). “More than fun and money. Worker Motivation in Crowdsourcing – A Study on Mechanical Turk.” AMCIS 2011 Proceedings (forthcoming). PDF; HCOMP paper (including appendix).