Attacking Crowd-Powered Systems with Crowds

Do you trust your crowd?

Crowd-powered systems are being used for more and more complex tasks these days, but not much is known about the potential risks originating from workers themselves.

Types of Threats

Threats come in all shapes and sizes. A single worker can effectively extract information from a single task, but may have a hard time targeting vulnerable tasks in systems that only periodically include sensitive information. Individual workers are also usually ineffective at disrupting the final output of systems that combine input from multiple workers. Groups, however, can attack these systems, and can more successfully extract even sparse pieces of sensitive information or reconstruct content that was divided up to help protect privacy.
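
To make the intuition concrete, here is a minimal sketch (our own illustration, not something from the study; the answer values and worker counts are invented) of why a lone attacker rarely changes the output of a system that aggregates redundant answers by majority vote, while a colluding group can:

    from collections import Counter

    def majority_vote(answers):
        """Return the most common answer among redundant worker responses."""
        return Counter(answers).most_common(1)[0][0]

    honest_answer, attack_answer = "4512", "9999"

    # One attacker among five workers: the aggregated output is unaffected.
    print(majority_vote([honest_answer] * 4 + [attack_answer] * 1))  # -> 4512

    # A colluding group of three among five workers flips the final output.
    print(majority_vote([honest_answer] * 2 + [attack_answer] * 3))  # -> 9999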

Raising an Army

But can attackers realistically be expected to gather large groups of people to attack these systems? Could they use the crowd itself to boost their numbers? Would crowd workers help a requester do something like steal a user’s credit card for an extra $0.05?

To find out, we ran two sets of experiments using workers from Mechanical Turk. For both, we posed as two different requesters: one [potentially] malicious requester (who posted an “Attack” task), and one requester being attacked (who posted a “Target” task). Workers started at the Attack task, were shown a set of instructions, and were then asked to continue on to the Target task.
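
For concreteness, the two-requester structure can be sketched roughly as follows. This is our own illustration of the flow (the class names, requester labels, and instruction strings are invented), not code from the actual experiments:

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Task:
        requester: str       # who appears to have posted the HIT
        instructions: str    # what the worker is asked to do

    @dataclass
    class WorkerSession:
        steps: List[str] = field(default_factory=list)

        def visit(self, task: Task):
            self.steps.append(f"{task.requester}: {task.instructions}")

    # The (potentially) malicious requester posts the Attack task; a separate
    # requester posts the Target task that is being attacked.
    attack = Task("Requester A (attacker)",
                  "Go to the Target task, then come back and report what you saw.")
    target = Task("Requester B (attacked)",
                  "Copy all of the text out of the image shown.")

    # Workers start at the Attack task, continue to the Target task, and return.
    session = WorkerSession()
    for task in (attack, target, attack):
        session.visit(task)
    print("\n".join(session.steps))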

Information Extraction

One way the crowd can attack a system is by collecting private information from a task. This is of particular concern because systems that leverage the crowd to support assistive technology, such as captioning a user’s conversation [2] or answering visual questions [1,3], can expose personal information (e.g., a credit card number accidentally captured in an image). To simulate this, our Target task asked workers to copy all of the text out of an image they were shown (Fig. 1).

Figure 1. The Attack task asks workers to go to a Target task and return with information, after which they come back to the Attack task.

As a baseline, the Attack task asked workers to complete the Target task without returning any information. We then showed workers an image of a drawn credit card that clearly contained no real personal information (the “Innocent” condition), and contrasted the response rate with the case where the Target task contained an image of a real-looking credit card (the “Malicious” condition). Although the two images contained the same amount of information, we saw a significant drop in response rate when the task looked potentially harmful (Fig. 2).

Figure 2. Results for the Information Extraction tests: Baseline 73.8%; Innocent 62.1%; Malicious 32.8%.
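
For readers who want a sense of what “a significant drop” means statistically, the sketch below runs a standard two-proportion z-test on the rates in Fig. 2. The rates are taken from our results, but N is a placeholder per-condition sample size chosen purely for illustration, not the number of workers in the actual study:

    from math import sqrt, erf

    def two_proportion_z(p1, n1, p2, n2):
        """Two-sided z-test for the difference between two independent proportions."""
        pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
        se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
        z = (p1 - p2) / se
        p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # two-sided normal tail
        return z, p_value

    N = 60  # placeholder sample size per condition, for illustration only
    z, p = two_proportion_z(0.621, N, 0.328, N)  # Innocent vs. Malicious rates (Fig. 2)
    print(f"z = {z:.2f}, p = {p:.4f}")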

Information Manipulation

Another way workers can attack a system is by manipulating the answer that is provided to the user. We again recruited workers to attack a system, but this time the Attack task gave workers a specific answer to enter into the Target task (Fig. 3). Our Target task asked workers to transcribe handwritten text they saw in an image.

Figure 3. The Attack task asks workers to go to a Target task and enter specific information.

As a baseline, we asked workers to complete the Target task with no special instructions. We then asked workers to provide a specific but plausible answer for the image (the “Innocent” case), and compared the answers we received with those we got when workers were asked to give a clearly wrong answer (the “Malicious” case). We again saw a significant drop in the number of workers who were willing to complete the Attack task as instructed (Fig. 4).

Figure 4. Results for the Information Manipulation tests: Baseline 73.8%; Innocent 75.0%; Malicious 27.9%.

Future Work

Now the question is, how do we avoid these attacks? Future approaches can leverage the fact that hired groups of workers appear to contain some people who recognize when a task involves potentially harmful information, and use them to protect against the workers who either do not notice the risk or will complete the task regardless (an alarming ~30% of workers).
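
As a back-of-the-envelope illustration of why this could work, suppose (as our results suggest) that roughly 70% of workers will not comply with a clearly malicious request. If a system assigns each sensitive task to several workers independently and gives them a way to flag it, the chance that at least one worker balks grows quickly with redundancy. The independence assumption and the flagging mechanism here are simplifications of our own, not a design from the paper:

    def p_detected(non_compliance_rate=0.7, redundancy=3):
        """Probability that at least one of `redundancy` independent workers balks or flags."""
        return 1 - (1 - non_compliance_rate) ** redundancy

    for k in (1, 2, 3, 5):
        print(f"{k} worker(s) -> P(at least one balks) ~= {p_detected(redundancy=k):.3f}")
    # 1 -> 0.700, 2 -> 0.910, 3 -> 0.973, 5 -> 0.998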

References

[1] J.P. Bigham, C. Jayant, H. Ji, G. Little, A. Miller, R.C. Miller, R. Miller, A. Tatarowicz, B. White, S. White, and T. Yeh. VizWiz: Nearly Real-time Answers to Visual Questions. UIST 2010.
[2] W.S. Lasecki, C.D. Miller, A. Sadilek, A. Abumoussa, D. Borrello, R. Kushalnagar, and J.P. Bigham. Real-time Captioning by Groups of Non-Experts. UIST 2012.
[3] W.S. Lasecki, P. Thiha, Y. Zhong, E. Brady, and J.P. Bigham. Answering Visual Questions with Conversational Crowd Assistants. ASSETS 2013.


Full paper: Information Extraction and Manipulation Threats in Crowd-Powered Systems.
Walter S. Lasecki, University of Rochester / Carnegie Mellon University
Jaime Teevan, Microsoft Research
Ece Kamar, Microsoft Research

Comments

  1. There are easier attacks than this.
    Check out the multiple papers on malicious crowdsourcing systems, where participants know full well that what they are doing is against ethical rules, user agreements, etc.:

    1. Wang et al. Serf and Turf: Crowdturfing for Fun and Profit. WWW 2012.

    2. Motoyama et al. Dirty Jobs: The Role of Freelance Labor in Web Service Abuse. USENIX Security 2011.

    • These are great references, thanks! Attacks on existing systems are relatively common (for example, signing up for bogus accounts as part of a task). MTurk, as one example, has varied in how quickly it responds to these attacks, but there are definitely workers who will choose not to do them. We look at a slightly different case, where the crowd is used specifically to attack other crowd-powered systems (which are, in theory, more accustomed to dealing with potentially erroneous human input).

      Also, a shout out to this paper, which raised similar concerns: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6113302&tag=1
