What are we missing? An empirical exploration in the structural biases of hashtag-based sampling on Twitter

  • Evelien D'heer, imec – MICT, Department of Communication Sciences, Ghent University
  • Baptist Vandersmissen, imec – IDLab, Department of Electronics and Information Systems, Ghent University
  • Wesley De Neve, imec – IDLab, Department of Electronics and Information Systems, Ghent University
  • Pieter Verdegem, Communication and Media Research Institute (CAMRI), University of Westminster
  • Rik Van de Walle, imec – IDLab, Department of Electronics and Information Systems, Ghent University
Keywords: Twitter, Methodology, Hashtag, Elections, Conversation

Abstract

The hashtag is a recognized and frequently used method to collect Twitter messages. However, it has its limits with respect to the inclusion of follow-on messages, or @replies, that do not contain a hashtag. This paper explored to what extent the inclusion of non-hashtagged responses affected the study of interactions between Twitter users. We drew from the Twitter debate on the 2014 Belgian elections, collected under the #vk2014 hashtag. Our dataset included non-hashtagged responses to assess (1) how they differ from hashtagged responses; and (2) how this affects the conversation network. The findings showed that (1) hashtagged responses were more likely to include other interactive elements (e.g., hyperlinks); and (2) the inclusion of non-hashtagged responses generated larger and more reciprocal networks. However, central users further strengthened their position in the network.

Author Biographies

Evelien D'heer, imec – MICT, Department of Communication Sciences, Ghent University

Postdoctoral researcher studying political communication on social media, based in the Department of Communication Sciences at Ghent University, research group imec — MICT, Belgium.

Baptist Vandersmissen, imec – IDLab, Department of Electronics and Information Systems, Ghent University

Research fellow in the IDLab of the Department of Electronics and Information Systems at Ghent University — imec, Belgium.

Wesley De Neve, imec – IDLab, Department of Electronics and Information Systems, Ghent University

Professor in the IDLab of the Department of Electronics and Information Systems at Ghent University — imec, Belgium and in the Image and Video Systems Lab of the Department of Electrical Engineering at KAIST, South Korea.

Pieter Verdegem, Communication and Media Research Institute (CAMRI), University of Westminster

Senior Lecturer in the Communication and Media Research Institute (CAMRI) at the University of Westminster, U.K.

Rik Van de Walle, imec – IDLab, Department of Electronics and Information Systems, Ghent University

Full Professor and head of the IDLab of the Department of Electronics and Information Systems at Ghent University — imec, Belgium.

Published
2017-01-16
How to Cite
D’heer, E., Vandersmissen, B., De Neve, W., Verdegem, P., & Van de Walle, R. (2017). What are we missing? An empirical exploration in the structural biases of hashtag-based sampling on Twitter. First Monday, 22(2). https://doi.org/10.5210/fm.v22i2.6353