Auditing data streams for correlated glitches

Ji Meng Loh, Tamraparni Dasu

Research output: Contribution to conference, Paper, peer-review

Abstract

Cellular networks carry vast amounts of voice, text and data traffic every second. The networks are monitored constantly to measure network performance, detect traffic congestion, identify anomalies, and serve other customer service and network support functions. The data collected from mobility networks is used to make many critical decisions, and the quality of this information plays an important role in the effectiveness of those decisions. Therefore, it is important to ensure that the data collected from cellular networks meets quality standards. In particular, identifying glitches that are correlated can help in identifying root causes and facilitate more efficient problem solving in the network as well as quicker data repairs. In this paper, we present a methodology for automated auditing of massive, complex data streams with a focus on correlated glitches, and a case study that illustrates the application of this methodology. The methodology has two main components: a set of logical constraints that embody domain-specific information, and statistical methods for identifying correlated glitches to enable automated quantitative cleaning of data. Together, the two components provide a comprehensive yet customizable set of criteria for evaluating information quality as a function of time and network topology. We demonstrate the use of the cross g function to identify correlations in glitches. In the case study, we focus on duplicate, missing, inconsistent and anomalous data, and on correlations between glitches across time, space and topology.

Original language: English (US)
Pages: 204-218
Number of pages: 15
State: Published - 2011
Externally published: Yes
Event: 16th International Conference on Information Quality, ICIQ 2011 - Adelaide, SA, Australia
Duration: Nov 18 2011 to Nov 20 2011

Other

Other: 16th International Conference on Information Quality, ICIQ 2011
Country/Territory: Australia
City: Adelaide, SA
Period: 11/18/11 to 11/20/11

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Safety, Risk, Reliability and Quality

Keywords

  • Automated detection
  • Correlated glitches
  • Data quality
