Unfolding the Event Landscape on Twitter: Classification and Exploration of User Categories

Munmun De Choudhury, Microsoft Research
Nick Diakopoulos, Rutgers University
Mor Naaman, Rutgers University

Over the past couple of years, social media has emerged as the first place where millions react to large-scale events, be they catastrophes (Haiti), political upheavals (Iran), or globally or socially relevant festivities (the royal wedding). Reactions to an event are often shared by a variety of users, ranging from organizational entities and celebrities to journalists and ordinary individuals. Understanding the participation and attention patterns of different “user types” can enable us to infer information creation and consumption behavior around events and the composition of the major stakeholders of information, and can support better event-based information search and exploration in social media. Our work takes initial steps towards building an automatic classifier for user types on Twitter, differentiating between organizations, journalists/media bloggers, and ordinary individuals. We then use those classifications to characterize a series of diverse events.

Do Editors or Articles Drive Collaboration? Multilevel Statistical Network Analysis of Wikipedia Coauthorship

Brian Keegan, Northwestern University
Darren Gergle, Northwestern University
Noshir Contractor, Northwestern University

Wikipedia editors fulfill distinct and diverse collaboration roles and different types of articles employ different forms of coordination. However, extant scholarship has not examined the interaction between these features: how do editors with particular skills self-organize around articles requiring different forms of collaboration? Using statistical network analysis methods called p*/exponential random graph models (p*/ERGMs), we analyze multi-level processes which structure diverse Wikipedia collaborations.

Supporting reflective public thought with ConsiderIt

Travis Kriplean, Computer Science & Engineering, U. Washington
Jonathan Morgan, Human Centered Design & Engineering, U. Washington
Deen Freelon, Communication, American University
Alan Borning, Computer Science & Engineering, U. Washington
Lance Bennett, Political Science, Communication, U. Washington

There are surprisingly few intuitive tools for supporting large groups in making decisions together, whether they are citizens, employees, or even program committee members. This is problematic if we are to address challenges that we face as a society. We have invented a new model of public deliberation and have implemented it as the ConsiderIt platform. By encouraging people to think through tradeoffs together and consider the perspectives of others, we believe ConsiderIt can help build public trust while improving upon our collective ability to take more effective action on problems such as financial reform and climate change.

Friends, Romans, Countrymen: Lend me your URLs

Abhinay Nagpal, Computer Science Dept., Stanford University
Sudheendra Hangal, Computer Science Dept., Stanford University
Rifat Reza Joyee, Computer Science Dept., Stanford University
Monica S. Lam, Computer Science Dept., Stanford University

Human curation is useful for obtaining high-quality information. In fact, in the early days of the web, people found information using directories of useful pages. Over time, we’ve moved to search engines that use algorithms to infer relevant and authoritative pages on the web. However, the commercial importance of search engines has meant that they face the problem of web spam. Bringing back elements of human curation may be one way to solve this problem. And who better than your friends to curate information for you! Such social curation also offers implicit personalization since people often share common interests and affiliations with their friends.

We exploit the fact that your social chatter already contains a list of sites that are useful to you. For example, people email or tweet about links they find interesting and would like to share with friends. In our research, we created customized search indexes for users that bias web search results towards domains present in their social chatter. We found that this approach is effective at combating web spam and delivering high-quality search results. It also solves another problem with current search engines: user privacy. Search engines need to generate detailed profiles of users to deliver personalized results. This information includes the user’s social graph, location, etc. In contrast, our approach enables personalized search without revealing a lot of detail to the search engine. Moreover, this form of personalization can be better, since only the user has access to all of their chatter; it is not limited by commercial arrangements between search engines and channels of social chatter.

High level workflow of Slant

We have developed a system called Slant that extracts URLs from email archives and Twitter feeds, and uses the domains therein to create a personalized Google Custom Search engine for each user.  This engine restricts search results to the domains mentioned in social chatter — in essence, these domains are treated as a whitelist. We evaluated the results from various personalized custom search engines, and found that, even though the personalized indexes used only a few thousand domains, their results as rated by users matched or exceeded the results from personalized Google search.
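The domain-extraction idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Slant's actual implementation: the function names `extract_domains` and `filter_results` are hypothetical, the URL pattern is deliberately simple, and the local filtering step merely stands in for the restriction that the real system delegates to a Google Custom Search engine.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Deliberately simple URL pattern (an assumption; real chatter needs more care).
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def _normalize(host):
    """Drop a leading 'www.' so that www.example.org and example.org match."""
    return host[4:] if host.startswith("www.") else host


def extract_domains(messages):
    """Count how often each domain appears in a collection of message texts."""
    counts = Counter()
    for text in messages:
        for url in URL_PATTERN.findall(text):
            # Strip sentence punctuation that often trails a pasted link.
            host = urlparse(url.rstrip(".,;:!?")).hostname
            if host:
                counts[_normalize(host)] += 1
    return counts


def filter_results(result_urls, whitelist):
    """Keep only the results whose domain is in the whitelist."""
    kept = []
    for url in result_urls:
        host = urlparse(url).hostname or ""
        if _normalize(host) in whitelist:
            kept.append(url)
    return kept
```

In this sketch the counts could also be used to rank domains by how often friends mention them, although the post only describes treating the domains as a whitelist.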

Specifically, in our study, we asked users to compare results from different search engines:

  • Google’s personalized results.
  • Results from a custom search engine that had domains from the TopTweets account.
  • Results from a custom search engine that had domains from the user’s Twitter account.
  • Results from a custom search engine that had domains from the user’s email account.
  • Results from a search engine where we appended the names of the user’s friends to the original query.

The results are shown below, and indicate that both the email and Twitter-based indexes frequently match or exceed personalized Google ratings.

We further categorized queries along Broder’s taxonomy, as one of transactional, navigational or informational, and obtained insights about which search indices do better for each category. Please see our paper for details.

A secondary benefit of Slant is that it lets users consume information implicitly, by piping the recommendations implicit in their social feeds into a search engine. This means that users can follow more people on Twitter, or subscribe to more mailing lists, without having to read all the content manually.

For more, see our full CSCW-2012 paper, Friends, Romans, Countrymen: Lend me your URLs. Using Social Chatter to Personalize Web Search. Interested readers can try out the Slant research prototype here.

Organizing online productions without formal organizations

Haiyi Zhu, Carnegie Mellon University
Robert E. Kraut, Carnegie Mellon University
Aniket Kittur, Carnegie Mellon University

A challenge for many online production communities is to direct their members to accomplish tasks that are important to the group, even when these tasks may not match individual members’ interests.  For example, many people may want to work on the same popular areas (e.g., an article on “Barack Obama” in Wikipedia) while ignoring less popular areas that require work.

Many techniques used in conventional employment organizations are not effective in organizing online volunteers’ behaviors due to the fundamental characteristics of online communities, including lack of employment contracts, weak external incentives, weak interpersonal bonds, impoverished communication, large size, and high turnover. For example, if a project tries to exert too much managerial control, volunteers can simply leave, with fewer economic or social consequences than if they had quit a job or left a real-life social group.

Instead, communities must turn to other means of managing volunteers. One technique is to leverage group identification, the perception of belonging to a group. The literature shows that, if volunteers feel that their identities are tied to the identity of the group, their goals may be more likely to reflect those that are important to the group. However, group identification by itself does not specify which particular tasks to work on. In contrast, direction setting (for example, specifying goals) can be an effective mechanism for highlighting specific tasks. However, direction setting by itself may not be enough to motivate people to work on those highlighted tasks.

We hypothesize that group identification and direction setting can complement each other in managing volunteers’ efforts. Group identification can align the individual volunteer’s goals with the group’s goals; and direction setting can focus people’s group-oriented motivation towards the group’s important tasks.

We tested our hypotheses in the context of WikiProjects’ Collaborations of the Week (COTWs), a project-level direction-setting mechanism that designates one or two articles to improve in a defined period. Figure 1 shows example announcements. We examined editors’ contributions to the same articles during the collaboration period (the period when the articles are selected as collaboration targets) and the non-collaboration period (the pre- and post-collaboration periods). We included in our sample editors who were aware of the events. We operationalized self-identified group members as those who edited any project pages (including the project member lists). The results are shown in Figure 2.

We found that people in general contributed more during collaboration periods, but the effect was dramatically larger for self-identified group members. The results support our hypothesis and suggest that people who identify themselves as group members voluntarily follow directions and perform tasks valued by the group.
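The comparison behind this finding can be pictured as a difference of differences: the change in contributions between periods for group members, minus the same change for non-members. The Python sketch below is a simplification for illustration only and is not the statistical model used in the paper. The record layout (dictionaries with `member`, `period` and `edits` keys) and the function names are assumptions.

```python
from statistics import mean


def mean_edits(records, member, period):
    """Average edit count for one group (member or not) in one period."""
    values = [
        r["edits"]
        for r in records
        if r["member"] == member and r["period"] == period
    ]
    return mean(values)


def collaboration_effect(records, member):
    """Change in average edits from the baseline to the collaboration period."""
    return mean_edits(records, member, "collab") - mean_edits(records, member, "baseline")


def interaction(records):
    """How much larger the collaboration effect is for members than non-members."""
    return collaboration_effect(records, True) - collaboration_effect(records, False)
```

A positive value from `interaction` would correspond to the pattern reported above, where the increase during collaboration periods is larger for self-identified group members.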

These results were obtained in the context of projects within Wikipedia. However, we believe that the basic idea of combining group identification and direction setting, as an unobtrusive management method, can be generalized to other kinds of online communities and offline volunteer organizations.

For more, see our full paper, Organizing without formal organization: Group Identification, Goal Setting and Social Modeling in Directing Online Production.

CrowdSearch 2012 Workshop @ WWW 2012

A workshop about crowdsourcing web search will be held April 17, 2012 in Lyon, France, co-located with the WWW 2012 conference.  I am on the program committee for this workshop.  It will bring together research on all aspects of involving humans in web search, including crowdsourcing content discovery, quality assessment, social search services, designing effective incentives, cognitive factors, and software architecture.  Submissions are due February 8, with an abstract submission by February 1.

CrowdNet Workshop in Karlsruhe, Germany, Jan 26th 2012

CrowdNet 2012, the 2nd Workshop on Cloud Labor and Human Computation, is coming to Karlsruhe, Germany, on Thursday, January 26th, 2012. The CrowdNet workshop aims to bring together researchers and practitioners from various disciplines and industries who are interested in the scientific and economic challenges of cloud labor and human computation. This year’s workshop is intended as an informal platform to present results from research and practice as well as to discuss innovative and breakthrough ideas.

The workshop features over fifteen invited talks. Among others, industry representatives from IBM, Clickworker, SAP, and CrowdEngineering will share first-hand insights from their respective platforms. Researchers will then present on various challenges of cloud labor, human computation with games, motivation, quality assurance, and programmable human computers.

Collaboratively Crowdsourcing Workflows with Turkomatic

Anand Kulkarni, UC Berkeley, MobileWorks
Matthew Can, Stanford
Bjoern Hartmann, UC Berkeley

A central challenge in crowd computing is the workflow design problem: how can we divide a complex job — for instance, editing a paper or writing a computer program — into a sequence of microtasks that can be solved by a pool of crowd workers on the web? Effective workflow design is a difficult process, requiring careful task design, extensive software development, and iterated testing with a live crowd. The complexity of workflow design limits participation in crowdsourcing marketplaces to experts willing to invest substantial time and effort, and limits the kinds of tasks that can be crowdsourced today.

What if we could use the crowd to attack the workflow design problem itself? We present Turkomatic, a tool that allows requesters to collaboratively design and execute workflows in conjunction with the crowd.

Social transparency in networked information exchange

Colleen Stuart, Carnegie Mellon University
Laura Dabbish, Carnegie Mellon University
Sara Kiesler, Carnegie Mellon University
Peter Kinnaird, Carnegie Mellon University
Ruogu Kang, Carnegie Mellon University

More information than ever before is being revealed about content, people, and their interactions. Facebook broadcasts user activities. Twitter explicitly supports the attribution of retweets. Microsoft Academic Search visualizes co-authorship networks. It is technically possible for Internet applications to make almost any action on a piece of information visible to users within or across websites. Social transparency, which we define as the availability of social meta-data surrounding information exchange, is increasing at a rapid rate.

Briefing News Reporting with Mobile Assignments – Perceptions, Needs and Challenges

Heli Väätäjä, Tampere University of Technology, Finland
Paul Egglestone, UCLAN, UK

How well do mobile assignments suit creative and complex work, particularly when the completion criteria for the tasks may not be clearly defined at the time of creation? What kinds of needs and challenges arise in relation to mobile processes, co-operation and task content?

Mobile assignments sent to or accessible through mobile handheld devices (such as smartphones) are one possible approach to engaging readers in news reporting activities. We studied the use of mobile assignments for briefing news reporting in the case of mobile professionals. In this post we present findings from our CSCW 2012 paper that are applicable to crowdsourced news reporting with readers as participants in news making.

Mobile assignments were described by participants as “quick” and “easy” and especially suitable for small stories and fast reporting situations. The importance of clear and sufficient task instructions was emphasized, since missing or unclear information leads to a need for contact and communication. In crowdsourcing, it is usually undesirable for crowdworkers to contact the “employers”, since this creates extra workload and takes time, reducing the efficiency and savings aimed for. Expressed information needs included 1) details on the content requested, such as multimedia quality, number of photos, and length of the text and video clips, 2) reporting schedule, 3) type of reporting (one-shot or drip-feeding), 4) intended usage context of the material, 5) type of story asked for and 6) perspective of the story to be covered.
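One way to picture these six information needs is as the fields of an assignment record that can be checked for gaps before it is sent out. The Python sketch below is purely illustrative and does not describe the system used in the study. The class name `MobileAssignment`, its field names and the helper `missing_information` are all hypothetical.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class MobileAssignment:
    """A briefing record with one field per information need listed above."""
    content_details: Optional[str] = None  # e.g. number of photos, text length
    schedule: Optional[str] = None         # when the material is due
    reporting_type: Optional[str] = None   # "one-shot" or "drip-feed"
    usage_context: Optional[str] = None    # where the material will be used
    story_type: Optional[str] = None       # kind of story asked for
    perspective: Optional[str] = None      # angle the story should cover


def missing_information(assignment):
    """Return the names of the fields that are still empty, in declaration order."""
    return [f.name for f in fields(assignment) if not getattr(assignment, f.name)]
```

An editor-facing tool could run such a check before sending an assignment, so that gaps are filled up front rather than through follow-up contact with the reporter.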

The news editors responsible for creating the news briefings expressed several information needs as well as concerns related to the mobile assignment processes. First, when creating the assignments, information related to the skills, equipment, availability and location of the reporters could be used in targeting the assignments to specific sub-groups. Second, to be able to foresee and plan the reporting, the editors would need to know whether any of the respondents are going to carry out the assignment, when to expect the material, and whether it will be delivered in time.

When the mobile reporters sent the requested material to the newsroom, they wanted to know whether the task was complete as is, or whether the editors had further wishes for completing the reporting. Editors could, for example, want the mobile reporters to drip-feed news content in certain types of reporting situations. In addition, they might want to check some facts related to the reported material.

We also found that locating mobile reporters was perceived as useful for coordinating reporting. However, privacy issues arose, and trust, safety and security in particular were raised for discussion in the case of certain countries.

This paper can be found along with another paper related to crowdsourced news reporting at

https://sites.google.com/site/helivaataja/papers