Understanding and detecting the intended meaning of social media messages is challenging because they contain many kinds of noise that are irrelevant to the theme of interest. For example, conventional supervised classification approaches produce inconsistent results when determining whether a given Twitter message is really about a wildfire event. To address this, a revised workflow was designed and implemented. The workflow consists of four sequential procedures: (1) apply latent semantic analysis and cosine similarity to measure the similarity between Twitter messages; (2) apply Affinity Propagation to identify exemplars of Twitter messages; (3) apply cosine similarity again to automatically match the exemplars to known training results; and (4) apply the accumulated exemplars to classify Twitter messages using a support vector machine. The overall rate of correct classification was over 90% when a series of ongoing and historical wildfire events was examined.
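The four procedures can be illustrated with a minimal sketch, assuming scikit-learn as the toolkit. The abstract does not name an implementation, so this is not the authors' code. The function name `classify_tweets`, the toy messages, the TF-IDF weighting, the number of LSA components, the linear kernel, and the fallback used when Affinity Propagation does not converge are all assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.svm import SVC


def classify_tweets(labeled_texts, labels, new_tweets, n_components=3):
    """Hypothetical helper sketching the four-step workflow."""
    # Step 1: latent semantic analysis (TF-IDF followed by truncated SVD),
    # then pairwise cosine similarity between the new messages.
    # Assumption: the LSA space is fitted on labeled and new messages together.
    tfidf = TfidfVectorizer().fit_transform(labeled_texts + new_tweets)
    lsa = TruncatedSVD(n_components=n_components, random_state=0).fit_transform(tfidf)
    train_vecs, new_vecs = lsa[: len(labeled_texts)], lsa[len(labeled_texts):]
    similarity = cosine_similarity(new_vecs)

    # Step 2: Affinity Propagation on the precomputed similarity matrix
    # picks exemplar messages.
    ap = AffinityPropagation(affinity="precomputed", random_state=0).fit(similarity)
    exemplar_idx = ap.cluster_centers_indices_
    if len(exemplar_idx) == 0:
        # Assumed fallback: if AP does not converge, treat every message as an exemplar.
        exemplar_idx = np.arange(len(new_tweets))
    exemplar_vecs = new_vecs[exemplar_idx]

    # Step 3: match each exemplar to its most similar labeled message
    # and borrow that message's label.
    nearest = cosine_similarity(exemplar_vecs, train_vecs).argmax(axis=1)
    exemplar_labels = [labels[i] for i in nearest]

    # Step 4: add the labeled exemplars to the training set and
    # classify all new messages with a support vector machine.
    X = np.vstack([train_vecs, exemplar_vecs])
    y = list(labels) + exemplar_labels
    predictions = SVC(kernel="linear").fit(X, y).predict(new_vecs)
    return predictions, exemplar_idx, similarity


# Toy data, invented for illustration only.
labeled_texts = [
    "wildfire spreading fast near the canyon evacuations ordered",
    "firefighters battle wildfire as smoke covers the valley",
    "this new mixtape is fire go listen now",
    "got fired from my job today feeling down",
]
labels = ["wildfire", "wildfire", "other", "other"]
new_tweets = [
    "evacuations ordered as wildfire moves toward town",
    "smoke from the wildfire covers the highway",
    "firefighters contain wildfire near the canyon",
    "that concert was fire best night ever",
    "my boss fired me today",
    "listen to this mixtape now",
]

predictions, exemplar_idx, similarity = classify_tweets(labeled_texts, labels, new_tweets)
for tweet, label in zip(new_tweets, predictions):
    print(f"{label:9s} {tweet}")
```

With so few messages the predicted labels are not meaningful; the sketch only shows how the output of each step feeds the next.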
All Science Journal Classification (ASJC) codes
- Computer Science Applications
- Earth and Planetary Sciences(all)
Keywords
- Social media
- Supervised learning