On protocol-independent data redundancy elimination

Yan Zhang, Nirwan Ansari

Research output: Contribution to journal › Article › peer-review

28 Scopus citations

Abstract

Data redundancy elimination (DRE), also known as data de-duplication, reduces the amount of data to be transferred or stored by identifying both intra-object and inter-object duplicated data elements and replacing them with a reference or pointer to the unique data copy. Large-scale trace-driven studies have shown that packet-level DRE techniques can achieve 15-60% bandwidth savings when deployed at the access links of service providers, up to almost 50% bandwidth savings in Wi-Fi networks, and as much as 60% mobile data volume reduction in cellular networks. In this paper, we survey the state-of-the-art protocol-independent redundancy elimination techniques. We give an overview of the system architecture and main processing steps of protocol-independent DRE techniques, followed by a discussion of the major mechanisms involved in protocol-independent DRE, including the fingerprinting mechanism, cache management mechanism, chunk matching mechanism, and decoding error recovery mechanism. We also present several redundancy elimination systems deployed in wireline, wireless, and cellular networks, respectively. Several other techniques to enhance DRE performance are further discussed, such as DRE bypass techniques, non-uniform sampling, and chunk overlap.
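
The core idea in the abstract, replacing duplicated data elements with a reference to the unique copy, can be illustrated with a minimal sketch. It is not the method of any system surveyed in the paper: the fixed-size chunking, the SHA-256 fingerprint, the unbounded dictionary cache, and the names `CHUNK_SIZE`, `fingerprint`, `encode`, and `decode` are all assumptions chosen for brevity, and cache management and decoding error recovery are left out entirely.

```python
import hashlib

CHUNK_SIZE = 8  # hypothetical chunk size in bytes, chosen only for illustration


def fingerprint(chunk: bytes) -> bytes:
    """Fingerprint a chunk (SHA-256 here is an assumption, not the paper's choice)."""
    return hashlib.sha256(chunk).digest()


def encode(payload: bytes, cache: dict) -> list:
    """Split the payload into fixed-size chunks and replace every chunk
    already present in the sender-side cache with a reference to it."""
    tokens = []
    for i in range(0, len(payload), CHUNK_SIZE):
        chunk = payload[i:i + CHUNK_SIZE]
        fp = fingerprint(chunk)
        if fp in cache:
            tokens.append(("ref", fp))      # duplicate: send the pointer only
        else:
            cache[fp] = chunk               # first occurrence: remember it
            tokens.append(("raw", chunk))   # and send the data itself
    return tokens


def decode(tokens: list, cache: dict) -> bytes:
    """Rebuild the payload, resolving references from the receiver-side cache."""
    out = bytearray()
    for kind, value in tokens:
        if kind == "ref":
            out += cache[value]                # look up the unique copy
        else:
            cache[fingerprint(value)] = value  # keep the cache in sync with the sender
            out += value
    return bytes(out)
```

In this sketch, repeated chunks inside one payload become references (intra-object redundancy), and encoding the same payload a second time against the same caches yields only references (inter-object redundancy). The receiver can decode only while its cache mirrors the sender's, which is the situation that the cache management and decoding error recovery mechanisms named in the abstract are meant to handle.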

Original language: English (US)
Article number: 6524464
Pages (from-to): 455-472
Number of pages: 18
Journal: IEEE Communications Surveys and Tutorials
Volume: 16
Issue number: 1
State: Published - Mar 2014

All Science Journal Classification (ASJC) codes

  • Electrical and Electronic Engineering

Keywords

  • Data redundancy elimination (DRE)
  • content delivery acceleration
  • data de-duplication
  • protocol-independent DRE
  • wide area network (WAN) optimization
