Abstract
Cellular networks carry vast amounts of voice, text and data traffic every second. The networks are monitored constantly to measure network performance, detect traffic congestion, identify anomalies, and serve other customer service and network support functions. The data collected from mobility networks is used to make many critical decisions, and the quality of that information plays an important role in the effectiveness of these decisions. It is therefore important to ensure that the data collected from cellular networks meets quality standards. In particular, identifying correlated glitches can help pinpoint root causes, enabling more efficient problem solving in the network as well as quicker data repairs. In this paper, we present a methodology for automated auditing of massive, complex data streams with a focus on correlated glitches, and a case study that illustrates the application of this methodology. The methodology has two main components: a set of logical constraints that embody domain-specific information, and statistical methods for identifying correlated glitches to enable automated quantitative cleaning of data. Together, the two components provide a comprehensive yet customizable set of criteria for evaluating information quality as a function of time and network topology. We demonstrate the use of the cross-G function to identify correlations in glitches. In the case study, we focus on duplicate, missing, inconsistent and anomalous data, and on correlations between glitches across time, space and topology.
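The abstract does not define the cross-G function, so the following is only a minimal sketch of the standard spatial-statistics notion: for two glitch types i and j, the empirical cross-G at radius r is the fraction of type-i events whose nearest type-j event lies within distance r. The function name `cross_g`, the (time, location) coordinates, and the sample events are hypothetical, no edge correction is applied, and the paper's actual estimator may differ.

```python
import math

def cross_g(points_i, points_j, radii):
    """Naive empirical cross-G function (no edge correction).

    For each radius r, returns the fraction of type-i points whose
    nearest type-j point lies within distance r.
    """
    if not points_i or not points_j:
        raise ValueError("both point sets must be non-empty")
    # Distance from each type-i point to its nearest type-j point.
    nearest = [min(math.dist(p, q) for q in points_j) for p in points_i]
    return [sum(d <= r for d in nearest) / len(nearest) for r in radii]

# Hypothetical glitch events as (time, location) coordinates.
missing = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
duplicates = [(0.0, 1.0), (1.0, 1.0), (9.0, 9.0)]

g = cross_g(missing, duplicates, [0.5, 1.0, 6.0])
print(g)
```

In practice such a curve would be compared against a reference expected when the two glitch types occur independently; values well above that reference at small radii would suggest that the two types tend to co-occur.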
Original language | English (US) |
---|---|
Pages | 204-218 |
Number of pages | 15 |
State | Published - 2011 |
Externally published | Yes |
Event | 16th International Conference on Information Quality, ICIQ 2011 - Adelaide, SA, Australia (Duration: Nov 18 2011 → Nov 20 2011) |
Other
Other | 16th International Conference on Information Quality, ICIQ 2011 |
---|---|
Country/Territory | Australia |
City | Adelaide, SA |
Period | 11/18/11 → 11/20/11 |
All Science Journal Classification (ASJC) codes
- Information Systems
- Safety, Risk, Reliability and Quality
Keywords
- Automated detection
- Correlated glitches
- Data quality