For On-Demand Workers, It’s All About the Story

From mystery shopping to furniture assembly, apps such as TaskRabbit and Gigwalk leverage the power of distributed, mobile workers who complete physical world tasks instantly and beyond the constraints of traditional office workspaces. We refer to these workers as the “on-demand mobile workforce.” Mobile workforce services allow task requesters to “crowdsource” tasks in the physical world and aim to disrupt the very nature of employment and work (for good and bad; this may be a matter for another post).

Our paper describes an on-demand workforce service categorization based on two dimensions: (1) task location and (2) task complexity (see figure below). Based on marketplace reviews, user testimonies, and informal observations of the services, we placed four main workforce services into the quadrants to exemplify the categorization.

Categorization of on-demand workforce services.

Although a long line of research on incentives and motivations for crowdsourcing exists, especially on platforms like Amazon’s Mechanical Turk, there hasn’t been much work on physical crowdsourcing, despite the recent appearance of many such platforms. We conducted interviews (see the paper here to learn more about the complete methods and findings) of mobile workforce members to learn more about the extrinsic and intrinsic factors that influence the selection and completion of physical world tasks.

To mention a couple of findings, we found certain task characteristics were highly important to workers as they select and accept tasks:

Knowing the person
Because physical world tasks introduce a different set of personal risks compared to virtual world tasks (e.g., physical harm, deception), workers creatively investigated requesters, scrutinizing profile photos, email addresses, and task descriptions. Profile photos helped workers know whom to expect on-site, and email addresses were used to cross-reference information on social networking sites.

Knowing the “story”
Tasks that listed intended purposes or background stories of the tasks appealed to the mobile workforce. Tasks for an anniversary surprise or to verify the conditions of a grave plot through a photo affected workers’ opinions and influenced future task selections. Workers also appreciated non-financial incentives of unique experiences that occurred as byproducts of task completion (e.g., meeting new people). Tasks with questionable, unethical intentions (e.g., mailing in old phones, posting fake reviews online, writing student papers) were less likely to be fulfilled.

Generally, this study has broader implications for the design of effective, practical, novel, and well-reasoned social and technical crowdsourcing applications that organize help and support in the physical world. In particular, we hope our findings inform the future development of mobile workforce services whose incentives are not strictly monetary.

Want to learn more? Check out our full paper here at CSCW 2014.

Rannie Teodoro
Pinar Ozturk
Mor Naaman
Winter Mason
Janne Lindqvist

Remote Shopping Advice: Crowdsourcing In-Store Purchase Decisions

Recent Pew reports, as well as our own survey, have found that consumers shopping in brick-and-mortar stores are increasingly using their mobile phones to contact others while they shop. The increasing capabilities of smartphones, combined with the emergence of powerful social platforms like social networking sites and crowd labor marketplaces, offer new opportunities for turning solitary in-store shopping into a rich social experience.

We conducted a study to explore the potential of friendsourcing and paid crowdsourcing to enhance in-store shopping. Participants selected and tried on three outfits at a Seattle-area Eddie Bauer store; we created a single, composite image showing the three potential purchases side-by-side. Participants then posted the image to Facebook, asking their friends for feedback on which outfit to purchase; we also posted the image to Amazon's Mechanical Turk service and asked up to 20 U.S.-based Turkers to identify their favorite outfit, provide comments explaining their choice, and provide basic demographic information (gender, age).

Study participants posted composite photos showing their three purchase possibilities; these photos were then posted to Facebook and Mechanical Turk to crowdsource the shopping decision.

None of our participants had used paid crowdsourcing before, and all were doubtful it would be useful when we described our plan at the start of the study session. Yet the shopping feedback provided by paid crowd workers turned out to be surprisingly compelling to participants – more so than the friendsourced feedback from Facebook – in part because the crowd workers were more honest, explaining not only what looked good but also what looked bad, and why. Participants also enjoyed seeing how opinions varied among demographic groups (e.g., did male raters prefer a different outfit than female raters?).

Although Mechanical Turk had a speed advantage over Facebook, both sources generally provided multiple responses within a few minutes – fast enough that a shopper could get real-time decision-support information from the crowd while still in the store.

Our CSCW 2014 paper on “Remote Shopping Advice” describes our study in more detail, as well as how our findings can be applied toward designing next-generation social shopping experiences.

For more, see our full paper, Remote Shopping Advice: Enhancing In-Store Shopping with Social Technologies.

Meredith Ringel Morris, Microsoft Research
Kori Inkpen, Microsoft Research
Gina Venolia, Microsoft Research

Voyant: Generating Structured Feedback on Visual Designs Using a Crowd of Non-Experts

Crowdsourcing offers an emerging opportunity for users to receive rapid feedback on their designs. A critical challenge for generating feedback via crowdsourcing is to identify what type of feedback is desirable to the user, yet can be generated by non-experts. We created Voyant, a system that leverages a non-expert crowd to generate perception-oriented feedback from a selected audience as part of the design workflow.

The system generates five types of feedback: (i) Elements are the individual elements that can be seen in a design. (ii) First Notice refers to the visual order in which elements are first noticed in the design. (iii) Impressions are the perceptions formed in one’s mind upon first viewing the design. (iv) Goals refer to how well the design is perceived to meet its communicative goals. (v) Guidelines refer to how well the design is perceived to meet known guidelines in the domain.

Voyant decomposes feedback generation into a description and interpretation phase, inspired by how a critique is taught in design education. In each phase, the tasks focus a worker’s attention on specific aspects of a design rather than soliciting holistic evaluations to improve outcomes. The system submits these tasks to an online labor market (Amazon Mechanical Turk). Each type of feedback typically requires a few hours to generate and costs a few US dollars.

Our evaluation shows that users were able to leverage the feedback generated by Voyant to develop insight and discover previously unknown problems with their designs. For example, consider the Impressions feedback Voyant generated on one user's poster (see the video above): the user intended the poster to be perceived as Shakespeare, but was surprised to learn of an unintended interpretation (see "dog" in the word cloud).

To use Voyant, the user imports a design image and configures the crowd demographics. Once generated, the feedback can be utilized to help iterate toward an effective solution.

Try it: http://www.crowdfeedback.me


For more, see our full paper, Voyant: Generating Structured Feedback on Visual Designs Using a Crowd of Non-Experts.
Anbang Xu, University of Illinois at Urbana-Champaign
Shih-Wen Huang, University of Illinois at Urbana-Champaign
Brian P. Bailey, University of Illinois at Urbana-Champaign

CrowdCamp Report: HelloCrowd, The “Hello World!” of human computation

The first program a new computer programmer writes in any new programming language is the “Hello world!” program – a single line of code that prints “Hello world!” to the screen.

We ask, by analogy, what should be the first “program” a new user of crowdsourcing or human computation writes?  “HelloCrowd!” is our answer.

Hello World task

The simplest possible “human computation program”

Crowdsourcing and human computation are becoming ever more popular tools for answering questions, collecting data, and providing human judgment.  At the same time, there is a disconnect between interest and ability, where potential new users of these powerful tools don’t know how to get started.  Not everyone wants to take a graduate course in crowdsourcing just to get their first results. To fix this, we set out to build an interactive tutorial that could teach the fundamentals of crowdsourcing.

After creating an account, HelloCrowd tutorial users get their feet wet by posting three simple tasks to the crowd platform of their choice. In addition to the "Hello, World" task above, we chose two common crowdsourcing tasks: image labeling and information retrieval from the web. In the first, workers provide a label for an image of a fruit; in the second, workers must find the phone number of a restaurant. These tasks can be reused and posted to any crowd platform you like; we provide simple instructions for some common platforms. The interactive tutorial auto-generates the task URLs for each tutorial user and each platform.
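(The tutorial generates these task URLs for you; the snippet below is only our own illustrative sketch of what posting a similar free-text task to the Mechanical Turk requester sandbox looks like with the boto3 client. The title, reward, and question text are placeholders, not HelloCrowd's actual task definitions.)

    import boto3

    # Illustrative only: post a trivial free-text task to the MTurk requester sandbox.
    mturk = boto3.client(
        "mturk",
        region_name="us-east-1",
        endpoint_url="https://mturk-requester-sandbox.us-east-1.amazonaws.com",
    )

    question_xml = """
    <QuestionForm xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2005-10-01/QuestionForm.xsd">
      <Question>
        <QuestionIdentifier>hello</QuestionIdentifier>
        <IsRequired>true</IsRequired>
        <QuestionContent><Text>Please type: Hello crowd!</Text></QuestionContent>
        <AnswerSpecification><FreeTextAnswer/></AnswerSpecification>
      </Question>
    </QuestionForm>
    """

    hit = mturk.create_hit(
        Title="Hello Crowd (tutorial example)",   # placeholder values
        Description="Type a short greeting.",
        Keywords="tutorial, hello",
        Reward="0.05",
        MaxAssignments=3,                         # redundancy: ask several workers
        LifetimeInSeconds=3600,
        AssignmentDurationInSeconds=300,
        Question=question_xml,
    )
    print("HIT id:", hit["HIT"]["HITId"])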

Mmm, crowdsourcing is delicious

More than just another tutorial on "how to post tasks to MTurk", Hello Crowd aims to teach fundamental concepts. After posting tasks, new crowdsourcers learn how to interpret their results (and how to get even better results next time). For example: what concepts might a new crowdsourcer learn from the results of the "hello world" task or the business phone number task? Phone numbers are simple, right? What about "867-5309" vs. "555.867.5309" vs. "+1 (555) 867 5309"? Our goal is to get new users of these tools up to speed on how to get good results: form validation (or not), redundancy, task instructions, and so on.
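As one small illustration of the post-processing a new crowdsourcer quickly discovers they need (our own toy snippet, not part of the tutorial itself), even stripping answers down to digits does not make the three formats above agree, because one of them omits the area code entirely:

    import re

    def normalize_phone(raw):
        """Keep only the digits so superficially different formats compare equal."""
        return re.sub(r"\D", "", raw)

    answers = ["867-5309", "555.867.5309", "+1 (555) 867 5309"]
    print({normalize_phone(a) for a in answers})
    # {'8675309', '5558675309', '15558675309'}: three formats, three "different" answers,
    # which is exactly why redundancy, validation, and clear instructions matter.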

In addition to teaching new crowdsourcers how to crowdsource, our tutorial system will be collecting a longitudinal, cross-platform dataset of crowd responses.  Each person who completes the tutorial will have “their” set of worker responses to the standard tasks, and these are all added together into a public dataset that will be available for future research on timing, speed, accuracy and cost.

We’re very proud of HelloCrowd, and hope you’ll consider giving our tutorial a try.

Christian M. Adriano, Donald Bren School, University of California, Irvine
Juho Kim, MIT CSAIL
Anand Kulkarni, MobileWorks
Andy Schriner, University of Cincinnati
Paul Zachary, Department of Political Science, University of California, San Diego

Can we achieve reliable inference using unreliable crowd workers?

Let us assume a set of N crowd workers is given the task of classifying a dog image into one of M possible breeds. Since the workers may not be canine experts, they may be unable to perform the fine-grained classification directly, so we should ask simpler questions. Two basic properties of crowd workers degrade the performance of crowdsourcing systems:

  • Lack of domain expertise (which may necessitate asking binary questions rather than asking for fine classification), and
  • Unreliability (which may necessitate intelligently deployed redundancy)

The above problems can be handled by the use of error-correcting codes. Using code matrices, we can design binary questions for crowd workers that allow the task manager to reliably infer correct classification even with unreliable workers.

The performance of a classification task depends heavily on the design of these simple binary questions. The question-design problem is equivalent to designing an M x N binary code matrix A = {a_li}: the rows correspond to the different classes, while column a_i encodes the question posed to the i-th worker. As an example, consider the task of classifying a dog image into one of four breeds: Pekingese, Mastiff, Maltese, or Saluki. The binary question of whether the dog has a snub nose or a long nose differentiates {Pekingese, Mastiff} from {Maltese, Saluki}, whereas the binary question of whether the dog is small or large differentiates {Pekingese, Maltese} from {Mastiff, Saluki}.

Example code matrix for the dog-breed classification task.

An illustrative example is shown in the figure above for the dog-breed classification task. Let the columns corresponding to the i-th and j-th workers be a_i = [1010]' and a_j = [1100]', respectively. The i-th worker is asked "Is the dog small or large?", since she is to differentiate the first (Pekingese) and third (Maltese) breeds from the others. The j-th worker is asked "Does the dog have a snub nose or a long nose?", since she is to differentiate the first two breeds (Pekingese, Mastiff) from the others. These questions are designed from the code matrix using a taxonomy of dog breeds. The task manager makes the final classification decision by choosing the hypothesis whose code word (row) is closest in Hamming distance to the received vector of decisions. A good code matrix can be designed using simulated annealing or optimization based on cyclic column replacement.
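To make the decoding step concrete, here is a minimal sketch (our own illustration, not the authors' code) of Hamming-distance decoding with the two example columns above; adding more, redundant columns is what gives the scheme its error-correcting capability.

    import numpy as np

    # Code matrix A: rows = breeds (classes), columns = workers' binary questions.
    # Column 1 = "Is the dog small (1) or large (0)?", column 2 = "Snub nose (1) or long nose (0)?"
    breeds = ["Pekingese", "Mastiff", "Maltese", "Saluki"]
    A = np.array([
        [1, 1],  # Pekingese: small, snub nose
        [0, 1],  # Mastiff:   large, snub nose
        [1, 0],  # Maltese:   small, long nose
        [0, 0],  # Saluki:    large, long nose
    ])

    def classify(answers):
        """Return the class whose code word (row of A) is closest in Hamming
        distance to the received vector of worker answers."""
        distances = (A != np.asarray(answers)).sum(axis=1)
        return breeds[int(np.argmin(distances))]

    # Workers answer "small" (1) and "long nose" (0): decoded as Maltese.
    print(classify([1, 0]))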

To evaluate the performance of this scheme, each worker's reliability can be modeled as a random variable, using either the spammer-hammer model or the Beta model. The average probability of misclassification can then be derived as a function of the mean (μ) of the workers' reliability, and the proposed scheme can be compared with the traditional voting-based scheme. The results can be summarized as follows:

  • Crowd Ordering: Better crowds yield better performance in terms of average error probability
  • Coding is better than majority vote: Good codes perform better than majority vote as they diversify the binary questions and use human cognitive energy more efficiently
  • The performance gap generally increases with system size

For more, see our ICASSP 2013 paper, Reliable Classification by Unreliable Crowds.
Aditya Vempaty, Syracuse University
Lav R. Varshney,  IBM Thomas J. Watson Research Center
Pramod K. Varshney, Syracuse University

Truthful Incentives in Crowdsourcing Tasks using Regret Minimization Mechanisms

Monetary incentives in Crowdsourcing platforms

Designing the right incentive structure and pricing policies for workers is a central component of online crowdsourcing platforms such as Mechanical Turk.

  • The job requester’s goal is to maximize the utility derived from the task under a limited budget.
  • Workers’ goal is to maximize their individual profit by deciding which tasks to perform and at what price.

Yet current crowdsourcing platforms offer requesters only limited capabilities for designing pricing policies, and tasks are often priced using rules of thumb. This can result in inefficient use of the requester's budget or in workers losing interest in the task.

Price negotiation with workers

Previous work in this direction [Singer et al., HCOMP'11] has focused on designing online truthful mechanisms in the bidding model. This requires eliciting the true costs experienced by the workers, which can be challenging for such platforms. In this paper, we focus on the posted-price model, where workers are offered a take-it-or-leave-it price, which is more easily implemented in online crowdsourcing platforms. Figure 1 shows how price negotiation happens in our posted-price model.

Figure 1: Negotiation between the requester and a worker in the posted-price model, compared to the bidding model. b_i denotes the bid submitted by the i-th worker and c_i the true cost that worker experiences; p_i denotes the payment offered to the worker.

Our approach

The main challenge in determining the payments is the unknown distribution of the workers' costs (the "cost curve") F(p), illustrated in Figure 2. This leads to the challenge of trading off exploration and exploitation: the mechanism needs to "explore" by experimenting with potentially suboptimal prices, and it has to "exploit" what it learns by offering the price that appears best so far. We cast this problem as a multi-armed bandit (MAB) problem under a strict budget constraint B and use the regret-minimization approach from online learning to design our mechanism.
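For intuition, here is a toy sketch of a UCB-style posted-price loop under a budget. This is our own simplification with made-up prices and a uniform cost distribution; the actual BP-UCB mechanism, its index, and its guarantees are defined in the paper.

    import numpy as np

    rng = np.random.default_rng(0)
    prices = np.linspace(0.1, 1.0, 10)   # discretized candidate prices ("arms")
    budget = 50.0
    offers = np.zeros(len(prices))       # times each price was posted
    accepts = np.zeros(len(prices))      # accepted offers per price
    utility, t = 0, 0

    while budget >= prices.min():
        t += 1
        # UCB-style index: estimated acceptance rate plus exploration bonus, per unit price.
        mean = np.where(offers > 0, accepts / np.maximum(offers, 1), 1.0)
        bonus = np.sqrt(2 * np.log(t) / np.maximum(offers, 1))
        index = (mean + bonus) / prices
        index[prices > budget] = -np.inf          # never offer more than the remaining budget
        arm = int(np.argmax(index))
        offers[arm] += 1
        cost = rng.uniform(0.0, 1.0)              # worker's private cost, drawn from the unknown F(p)
        if prices[arm] >= cost:                   # take-it-or-leave-it: accept iff the price covers the cost
            accepts[arm] += 1
            budget -= prices[arm]
            utility += 1                          # one unit of utility per completed task

    print(f"tasks completed: {utility}, leftover budget: {budget:.2f}")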

Figure 2: Upper bound on utility achievable through budget constraint (in red), unknown price curve of workers (in blue) and optimal price (in green). B is the fixed budget and N is the fixed number of workers.

Our mechanism BP-UCB

  • We present a novel posted-price mechanism, BP-UCB, for online budgeted procurement that is guaranteed to be budget feasible, achieves near-optimal utility for the requester, and is incentive compatible (truthful) for workers.
  • We prove no-regret bounds for it; our analysis yields an explicit separation of the regret into budget wasted through overpayment and offers rejected through underpayment.

Experimental Results

We carried out extensive experiments to understand the practical performance of our mechanism on simulated cost distributions (Figure 3 below). Additionally, to demonstrate the effectiveness of our approach on real-world inputs, we carried out a Mechanical Turk study to collect real cost distributions from workers (Figure 4 below).

Figure 3: Simulated cost distributions. Left: compared to the state-of-the-art posted-price mechanism (pp'12) [Badanidiyuru et al., EC'12], our mechanism (bp-ucb) shows up to a 150% increase in utility for a given budget. Right: the average regret of our mechanism bp-ucb diminishes with increasing budget.

Figure 4: Cost distributions from Mechanical Turk. Left: utility achieved under random arrival of workers, showing a 180% increase in utility compared to the state-of-the-art posted-price mechanism (bp-ucb vs. pp'12). Right: robustness of the mechanism against extreme adversarial inputs, simulated by workers arriving in ascending order of their costs.

For more, see our full paper, Truthful Incentives in Crowdsourcing Tasks using Regret Minimization Mechanisms.
Adish Singla, ETH Zurich
Andreas Krause, ETH Zurich

Let the crowd wrap the web

The web is a valuable source of information, but most of its data cannot be automatically processed, since it is intended for human consumption.
Wrappers are specialized programs that extract the data from the source code of HTML pages and organize it in a more structured, machine-processable way.

For example, suppose we want to collect data about movies (e.g., titles, directors, actors) by means of a set of wrappers extracting the data from sites available on the Web. Beyond the most famous sites (e.g., IMDB), many others need to be considered: [Dalvi et al., VLDB 2012] have shown that in many domains, covering 90% of the entities present on the Web requires more than 10,000 sites.
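To make this concrete, a wrapper for a movie page can be as simple as a handful of XPath extraction rules. The snippet below is a toy illustration with a made-up page layout; it is not ALFRED's wrapper format.

    from lxml import html

    # A toy wrapper: XPath extraction rules for a (hypothetical) movie page layout.
    WRAPPER = {
        "title":    "//h1[@class='title']/text()",
        "director": "//span[@class='director']/text()",
    }

    page = """
    <html><body>
      <h1 class="title">City of God</h1>
      <span class="director">Fernando Meirelles</span>
    </body></html>
    """

    tree = html.fromstring(page)
    record = {field: tree.xpath(rule) for field, rule in WRAPPER.items()}
    print(record)  # {'title': ['City of God'], 'director': ['Fernando Meirelles']}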

Fully automated approaches for learning wrappers have already been proposed (e.g., RoadRunner [Crescenzi and Merialdo, AAI 2008]), but they exhibit limited accuracy. On the other hand, supervised wrapper generators have limited applicability at web scale. The crowd could be the key to wrapping very large numbers of data-intensive web sites with high accuracy.

Figure 1: The web application interface of ALFRED. The query and the page are shown, and the worker answers with a binary value (Yes/No). To help the worker, the queried value is highlighted.

We propose ALFRED [Crescenzi et al., WWW 2013; DBCrowd 2013], a wrapper inference system supervised by the crowd. To generate wrappers, the system poses sequences of simple questions that require a boolean answer (e.g., "Is 'City of God' the title of the movie in the page?" Y/N). The answers provided by workers recruited on a crowdsourcing platform are exploited to generate the correct wrapper.

Preliminary results are promising:

  • To generate accurate wrappers, just a few queries are needed. Even in the presence of inaccurate workers, ALFRED can generate a correct wrapper with fewer than 15 queries.

  • The accuracy of the output wrapper is highly predictable, with an average F-measure close to 100% and a standard deviation below 1%, i.e., an almost perfect wrapper with small variability.

  • Workers' error rates are estimated accurately, and spammers and unreliable workers are detected early.

  • Costs are contained and highly predictable, thanks to a technique that dynamically engages a minimal number of workers at runtime; in 92% of cases just two workers suffice.

Many challenges are still open:

  • To further reduce costs, we aim to adopt a hybrid approach that partially relies on automatic wrapper generation techniques, with light supervision from the crowd.

  • Gamification is a promising direction to engage workers and scale out wrapper generation: people could play games while teaching ALFRED how to wrap the web.

For more, see our project website, ALFRED, and the full paper, A framework for learning web wrappers from the crowd.

Valter Crescenzi, Università Roma Tre
Paolo Merialdo, Università Roma Tre
Disheng Qiu, Università Roma Tre 

Towards Supporting Search over Trending Events with Social Media

Trending events are events that serve as novel or evolving sources of widespread online activity. Such events range from anticipated events to breaking news, and topics vary from politics to sporting events to celebrity gossip. Recently, search engines have started reflecting search activity around trending events back to users (e.g. Bing Popular Now or Google Hot Searches).

Real-time content published via social media can provide valuable information about time-sensitive topics, but the topics being discussed can change quite rapidly over time. In our analysis, we aimed to answer the following questions: For what types of trending events will real-time information be useful, and for how long will it continue to align with the information needs of users searching about these events?

Figure: Information Types

We surveyed 288 users about their experience with trending events over a week in August 2012. Among other things, they reported on the utility of various types of information when making sense of such events; here, we see how important real-time information is to the users surveyed.

In order to identify ways to better support users issuing such queries, we examined hundreds of trending events during the summer of 2012, using three sources of data: (1) qualitative survey data, (2) query logs from Bing, and (3) Twitter updates from the complete Twitter firehose.

Our findings revealed that:

  • Searchers who click Trending Queries links engage less and with different result content than users who search manually for the same topics. This may be due to a preference for real-time information that is perhaps not currently being satisfied.
  • Search query and social media activity follow similar temporal patterns, but social media activity tends to lead by 4.3 hours on average, providing enough time for a search engine to index and process relevant content.
  • User interest becomes more diverse during the peak of activity for a trending event, but a corresponding increase in overlap between content searched and shared highlights opportunities for supporting search with social media content.

Search vs. Social Media Delays - Histogram

Each data point in this histogram corresponds to a single trending event. The value represents the delay between patterns of query activity and social media activity (negative values indicate that social media precedes search). The dotted red line shows the mean h = -4.3 hours.
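As a rough illustration of how such a lag can be estimated (a toy sketch on synthetic hourly counts, not the event-level analysis used in the paper):

    import numpy as np

    # Estimate the lag between two hourly activity series via cross-correlation.
    rng = np.random.default_rng(1)
    social = rng.poisson(5, 200).astype(float)           # hourly social media volume
    search = np.roll(social, 4) + rng.normal(0, 1, 200)  # search activity trails social by ~4 hours

    social_z = (social - social.mean()) / social.std()
    search_z = (search - search.mean()) / search.std()

    lags = np.arange(-24, 25)
    corr = [np.corrcoef(np.roll(social_z, lag), search_z)[0, 1] for lag in lags]
    best = lags[int(np.argmax(corr))]
    # Positive lag = search trails social media, matching the negative delays in the histogram above.
    print(f"estimated lag: {best} hours")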

Many current search interfaces leveraging social media content tend to provide a reverse-chronologically ordered list of keyword-matched updates. Our finding that search activity often lags behind social media activity means that there may be time for more complex indexing and ranking computation to present more relevant “near-real-time” content in search results.

For more about our study and implications for supporting search over trending events, see our full paper, Towards Supporting Search over Trending Events with Social Media.
Sanjay Kairam, Stanford University
Meredith Ringel Morris, Microsoft Research
Jaime Teevan, Microsoft Research
Dan Liebling, Microsoft Research
Susan Dumais, Microsoft Research

Hey Twitter crowd … What else is there?

Journalists and news editors use Twitter to contextualize and enrich their articles by examining the public response, from comments and opinions to pointers to related news. This is possible because some users in Twitter devote a substantial amount of time and effort to news curation: carefully selecting and filtering news stories highly relevant to specific audiences.

We developed an automatic method that groups together all the users who tweet a particular news item, and later detects new content posted by them that is related to the original news item.

We call each such group a transient news crowd. The beauty of this approach, in addition to being fully automatic, is that there is no need to pre-define topics and the crowd becomes available immediately, allowing journalists to cover news beats incorporating the shifts of interest of their audiences.

Figure 1. Detection of follow-up stories related to a published article using the crowd of users that tweeted the article.

Transient news crowds

We define the crowd of a news article as the set of users who tweeted the article within the first 6 hours after it was published. We followed the users in each crowd for one week, recording every public tweet they posted during this period. We used Twitter data around news stories published by two prominent international news portals: BBC News and Al Jazeera English.
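In code, the crowd definition is just a filter over a tweet table. The sketch below uses a hypothetical schema (user, url, time) purely for illustration; it is not the pipeline used in the paper.

    import pandas as pd

    # Hypothetical tweet table: one row per (user, linked article, timestamp).
    tweets = pd.DataFrame({
        "user": ["a", "b", "a", "c"],
        "url":  ["article1", "article1", "other", "article1"],
        "time": pd.to_datetime(["2012-12-28 10:05", "2012-12-28 12:30",
                                "2012-12-30 09:00", "2012-12-29 08:00"]),
    })
    published = pd.to_datetime("2012-12-28 09:00")

    # Crowd = users who tweeted the article within 6 hours of publication.
    in_window = (tweets["url"] == "article1") & \
                tweets["time"].between(published, published + pd.Timedelta(hours=6))
    crowd = set(tweets.loc[in_window, "user"])

    # Follow the crowd: every public tweet its members post during the following week.
    follow_up = tweets[tweets["user"].isin(crowd) &
                       tweets["time"].between(published, published + pd.Timedelta(days=7))]
    print(crowd)           # {'a', 'b'}
    print(len(follow_up))  # 3 tweets posted by crowd members that week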

What did we find?

  • After they tweet a news article, people's subsequent tweets are correlated with that article for a brief period of time.
  • The correlation is weak but significant, in the sense that it reflects the similarity between the articles that originate a crowd.
  • While the majority of crowds simply disperse over time, parts of some news crowds come together again around new newsworthy events.

Crowd summarisation

We illustrate the outcome of our automatic method with the article Central African rebels advance on capital, posted on Al Jazeera on 28 December, 2012.

Figure 2. Word clouds generated for the crowd of the article "Central African rebels advance on capital", by considering the terms appearing in stories filtered by our system (top) and in the top stories by frequency (bottom).

Without using our method (in the figure, bottom), we obtain frequently-posted articles which are weakly related or not related at all to the original news article. Using our method (in the figure, top), we observe several follow-up articles to the original one. Four days after the news article was published, several members of the crowd tweeted an article about the fact that the rebels were considering a coalition offer. Seven days after the news article was published, crowd members posted that rebels had stopped advancing towards Bangui, the capital of the Central African Republic.

You can find more details in our papers:

  • Janette Lehmann, Carlos Castillo, Mounia Lalmas and Ethan Zuckerman: Transient News Crowds in Social Media. Seventh International AAAI Conference on Weblogs and Social Media (ICWSM 2013), Massachusetts.
  • Janette Lehmann, Carlos Castillo, Mounia Lalmas and Ethan Zuckerman: Finding News Curators in Twitter. WWW Workshop on Social News on the Web (SNOW 2013), Rio de Janeiro, Brazil.


Janette Lehmann, Universitat Pompeu Fabra
Carlos Castillo, Qatar Computing Research Institute
Mounia Lalmas, Yahoo! Labs Barcelona
Ethan Zuckerman, MIT Center for Civic Media

What do users really want in an event summarization system?

The wide usage of social media means that users now have to keep up with a large amount of incoming content, motivating the development of several stream-monitoring tools, such as Palanteer, Topsy, and Tweet Archivist. Such tools could be used to aid sensemaking about real-life events by detecting and summarizing social media content about these events. Given the large amount of content being shared and the limited attention of users, what information should we provide to users about special events as they are detected in social media?

We analyzed tweets related to four diverse events:

  1. Facebook IPO
  2. Obamacare
  3. Japan Earthquake
  4. BP Oil Spill

The figure below shows the temporal patterns of usage for words related to the Facebook IPO launch price. By exploiting the content similarity between tweets written around the same time, we can discover various aspects (topics) of an event.

Facebook IPO Launch Price

These plots show frequency of usage over time for various words related to the Facebook IPO. We can see similarities and differences in the temporal profiles of the usage of each of these words.
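As a rough illustration of how such temporal profiles can be computed (a toy sketch over a hypothetical tweet table; the model used in the paper goes well beyond raw counts):

    import pandas as pd

    # Hypothetical tweet table with timestamps and text.
    tweets = pd.DataFrame({
        "time": pd.to_datetime(["2012-05-18 10:05", "2012-05-18 10:40",
                                "2012-05-18 11:10", "2012-05-18 11:55"]),
        "text": ["FB launch price set at $38", "what a launch price!",
                 "trading begins on NASDAQ", "launch price looks shaky"],
    })

    # Hourly frequency profile of a keyword: the kind of curve shown in the figure above.
    keyword = "launch price"
    hits = tweets["text"].str.contains(keyword, case=False)
    profile = hits.groupby(tweets["time"].dt.floor("h")).sum()
    print(profile)
    # 2012-05-18 10:00    2
    # 2012-05-18 11:00    1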

The figure below shows how the volume of content related to various aspects (topics) of an event changes over time, as the event unfolds. Notice that some aspects have a longer lifespan of attention from tweeters, while others peak and die off quickly.

Topics through time

These two figures show how the topics within an event change over time. The figure on the left shows raw volumes, while the figure on the right shows underlying patterns used in our model. Notice how topics spike at different times and with different amounts of concentration over time.

We used our model to generate summaries and hired workers on Amazon Mechanical Turk to provide feedback. Please refer to this link for the summaries we showed to our workers. Which summary do you like best? This is what some of our respondents had to say:

  1. Number 3 has the most facts.
  2. Summary 2 is more straight forward information & not personal appeal pieces like live chats and other stuff with people who are unqualified to speak about the issue.
  3. None. All too partisan
  4. Summary 3 has most news with less personal commentary than the others.
  5. I believe that summary 1 and 2 had a large amount of personal opinion and not fact.
  6. I think summary 3 best summarize Facebook IPO because it shows a broad range of information related to the event.
  7. Summary 3 is more comprehensive and offers better overall summary.

Overall, we received feedback from users that they want summaries that are comprehensive, covering a broad range of information. Furthermore, they want summaries to be objective, factual, and non-partisan. While we believe we have done well in giving users comprehensive, broad-ranging information, we think that future work in summarization will reduce the gap between what researchers are doing and what users really want.

For more, see our full paper,  Automatic Summarization of Events from Social Media.
Freddy Chua, Living Analytics Research Centre, Singapore Management University
Sitaram Asur, Social Computing Research Group, Hewlett Packard Research Labs