A comparative study on two ground truth inference algorithms based on manually labeled social media data

Xiaoyu Sean Lu, Mengchu Zhou, Haoyue Liu, Liang Qi

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution

Abstract

In the booming information era, smart devices such as smartphones accompany people's lives all the time. Social media platforms give users uninterrupted means of communication and information acquisition, including posting their feelings and sharing ideas. This study focuses on short texts posted by users. The true meaning of such a text is defined as its ground truth. However, acquiring it directly from the users is extremely difficult and time-consuming. In other words, in many cases short texts come without ground truth, so we face a no-ground-truth problem. In this work, we ask labelers to label short texts based entirely on their own judgment of these texts. Two ground truth inference approaches, majority voting (MV) and positive label frequency threshold (PLAT), integrate the labels from different labelers and deduce the ground truth. We then analyze which one is better suited to labeling unlabeled short texts. The work is of great significance in helping us obtain useful knowledge from massive social media data.
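To make the label-integration step in the abstract concrete, the following is a minimal sketch, assuming binary labels (1 = positive, 0 = negative) collected from several labelers for one short text. The function names `majority_vote` and `positive_frequency_rule` are hypothetical. The second function only illustrates the general idea of thresholding the positive label frequency with a caller-supplied threshold; it is not the PLAT algorithm itself and does not reproduce how PLAT determines its threshold, nor the authors' experimental setup.

```python
from collections import Counter


def majority_vote(labels):
    """Return the label given by the most labelers.

    Ties are resolved in favor of the label seen first (Python 3.7+).
    """
    counts = Counter(labels)
    return counts.most_common(1)[0][0]


def positive_frequency_rule(labels, threshold, positive=1, negative=0):
    """Illustrative rule (an assumption, not PLAT itself): call the text
    positive when the fraction of positive labels reaches `threshold`."""
    frequency = sum(1 for label in labels if label == positive) / len(labels)
    return positive if frequency >= threshold else negative


# Five hypothetical labelers judge one short text.
labels_for_text = [1, 0, 0, 0, 0]
mv_label = majority_vote(labels_for_text)
threshold_label = positive_frequency_rule(labels_for_text, threshold=0.15)
```

With a threshold below 0.5, the frequency rule can return the positive label even when most labelers chose negative, which is where the two schemes may disagree on the same set of labels.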

Original language: English (US)
Title of host publication: Proceedings of the 2019 IEEE 16th International Conference on Networking, Sensing and Control, ICNSC 2019
Editors: Haibin Zhu, Jiacun Wang, MengChu Zhou
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 436-441
Number of pages: 6
ISBN (Electronic): 9781728100838
State: Published - May 2019
Event: 16th IEEE International Conference on Networking, Sensing and Control, ICNSC 2019 - Banff, Canada
Duration: May 9 2019 to May 11 2019

Publication series

Name: Proceedings of the 2019 IEEE 16th International Conference on Networking, Sensing and Control, ICNSC 2019

Conference

Conference: 16th IEEE International Conference on Networking, Sensing and Control, ICNSC 2019
Country/Territory: Canada
City: Banff
Period: 5/9/19 to 5/11/19

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Networks and Communications
  • Control and Optimization
  • Instrumentation

Keywords

  • Ground truth inference algorithms
  • Short text classification
  • Social media data
