People post millions of updates to social media sites like Facebook and Twitter every day. When it comes to understanding what groups of people are experiencing, knowing the area where these messages originate can make a huge difference: are “I feel sick” posts pointing to a breaking epidemic or just run-of-the-mill flu? Do the posts promoting a political candidate reveal widespread support or just a loud minority from their hometown?
However, one of the big challenges in doing geographic analyses is estimating
where people are. For example, only 0.7% of all Twitter messages come with some
kind of GPS data. Our work starts from this data and then uses an old social
principle: People are often friends with others who live nearby. If we
know the locations of only a small number of people, we can look at a person’s
social network and try to infer their location based on where their friends are.
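
As a rough illustration of the idea, here is a minimal Python sketch. It is not the method from the paper: the post does not say how friends' locations are combined, so this sketch assumes one simple choice, picking the friend location with the smallest total distance to the other friends' locations (a medoid), and repeating for a few rounds so that newly located people can help locate their own friends. The function names, the `friends` and `known` input formats, and the `rounds` parameter are all hypothetical.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def medoid(points):
    """Return the point with the smallest total distance to all the others."""
    return min(points, key=lambda p: sum(haversine_km(p, q) for q in points))

def infer_locations(friends, known, rounds=3):
    """Spread locations from located users to their friends (illustrative only).

    friends: dict mapping user -> set of friends (assumed input format)
    known:   dict mapping user -> (lat, lon) for the few users with GPS data
    """
    located = dict(known)
    for _ in range(rounds):
        updates = {}
        for user, nbrs in friends.items():
            if user in located:
                continue  # keep locations that are already known or inferred
            pts = [located[f] for f in nbrs if f in located]
            if pts:
                updates[user] = medoid(pts)
        if not updates:
            break  # nothing new could be inferred this round
        located.update(updates)
    return located

# Made-up example: two of ana's friends are near Rome, one is in Paris.
friends = {"ana": {"bo", "cy", "di"}, "eve": {"ana"}}
known = {"bo": (41.90, 12.50), "cy": (41.85, 12.55), "di": (48.86, 2.35)}
estimates = infer_locations(friends, known)
```

In this made-up example, `ana` is placed at one of the Rome-area friends because that point is closest overall to her friends, and `eve`, whose only friend is `ana`, picks up the same estimate in the second round.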
We looked at a Twitter social network of 47.7 million people, where two people are connected if they have each talked to the other at least once. We could estimate a location for most people in the network (95%), and the estimates were often very close to where people actually were: over half fell within 10 km (6 mi). Moreover, our method enables geo-tagging of over 77% of all Twitter messages.
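
The "both talked to each other" rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it treats "talking to" someone as a directed (author, addressed user) pair, and the function name and input format are hypothetical, not taken from the paper.

```python
from collections import defaultdict

def mutual_network(messages):
    """Build an undirected friend map from directed (author, addressed_user) pairs.

    Two users are linked only if each has addressed the other at least once.
    """
    # Deduplicate pairs and ignore people addressing themselves.
    directed = {(a, b) for a, b in messages if a != b}
    friends = defaultdict(set)
    for a, b in directed:
        if (b, a) in directed:  # keep the tie only when it is reciprocated
            friends[a].add(b)
            friends[b].add(a)
    return dict(friends)

# Made-up example: only ana and bo have addressed each other.
network = mutual_network([("ana", "bo"), ("bo", "ana"), ("ana", "cy")])
```

Requiring reciprocity drops one-way ties, such as a fan addressing a celebrity who never replies, which say little about where either person lives.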
In the paper, we examined many other hypotheses and found:
- The method works regardless of a person’s country of origin
- Locations can be accurately inferred across a variety of social network sizes, even for people who have only one friend
- Locations can even be inferred using data from other social networks like Foursquare, provided you can find individuals who have identities in both
- Only a small amount of location data is needed
For more, see our full paper, “That’s What Friends Are For: Inferring Location in Online Social Media Platforms Based on Social Relationships.”
David Jurgens, Sapienza University of Rome