Abstract
Massive social media data produced from microblog platforms provide a new data source for studying human dynamics at an unprecedented scale. Meanwhile, population bias in geotagged Twitter users is widely recognized. Understanding the demographic and socioeconomic biases of Twitter users is critical for making reliable inferences on the attitudes and behaviors of the population. However, existing global models cannot capture the regional variations of these demographic and socioeconomic biases. To bridge the gap, we modeled the relationships between different demographic/socioeconomic factors and geotagged Twitter users for the whole contiguous United States, aiming to understand how the demographic and socioeconomic factors relate to the number of Twitter users at the county level. To effectively identify the local Twitter users for each county of the United States, we integrated three commonly used methods and developed a query approach in a high-performance computing environment. The results demonstrate that we can not only identify how the demographic and socioeconomic factors relate to the number of Twitter users, but can also measure and map how the influence of these factors varies across counties.
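The idea of letting the influence of a factor vary across counties can be illustrated with geographically weighted regression, which is listed among the keywords. The following is a minimal sketch on synthetic data, not the authors' implementation: the Gaussian kernel, the fixed bandwidth, the function name `gwr_coefficients`, and all variable names are assumptions made for illustration only.

```python
import numpy as np

def gwr_coefficients(coords, X, y, bandwidth):
    """Fit one weighted least-squares regression per location.

    Assumption: Gaussian distance-decay kernel with a fixed bandwidth.
    Returns an (n, k+1) array of local coefficients, intercept first.
    """
    n = X.shape[0]
    Xd = np.column_stack([np.ones(n), X])  # design matrix with intercept
    betas = np.empty((n, Xd.shape[1]))
    for i in range(n):
        # Distance from location i to every observation
        d = np.linalg.norm(coords - coords[i], axis=1)
        # Nearby observations receive larger weights
        w = np.exp(-0.5 * (d / bandwidth) ** 2)
        XtW = Xd.T * w  # equivalent to X^T W with W = diag(w)
        betas[i] = np.linalg.solve(XtW @ Xd, XtW @ y)
    return betas

# Synthetic example (hypothetical data): the effect of one factor
# grows from west to east across a unit square.
rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 1.0, size=(200, 2))
x = rng.normal(size=200)
true_slope = 1.0 + 2.0 * coords[:, 0]
y = 1.0 + true_slope * x + rng.normal(scale=0.05, size=200)

betas = gwr_coefficients(coords, x, y, bandwidth=0.2)
local_slope = betas[:, 1]  # one slope estimate per location, ready to map
```

A single global regression would return one slope for the whole study area; here each location receives its own estimate, which is what makes it possible to map spatial variation in a factor's influence. With a very large bandwidth the local estimates collapse to the global ordinary least-squares fit.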
Original language | English (US) |
---|---|
Pages (from-to) | 228-242 |
Number of pages | 15 |
Journal | Cartography and Geographic Information Science |
Volume | 46 |
Issue number | 3 |
State | Published - May 4 2019 |
Externally published | Yes |
All Science Journal Classification (ASJC) codes
- Geography, Planning and Development
- Management of Technology and Innovation
- Civil and Structural Engineering
Keywords
- Social media
- big data
- geographically weighted regression
- population bias
- spatial statistics