Semantic deep web: Automatic attribute extraction from the deep web data sources

Yoo Jung An, James Geller, Yi Ta Wu, Soon Ae Chun

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

30 Scopus citations

Abstract

The "Deep Web" refers to the rich information and data hidden in backend databases, etc., that search engines and Web crawlers cannot access. It is mostly accessible through manual query interfaces. This paper introduces the Semantic Deep Web, which uses an ontology to determine the relevance of query interface attributes for accessing the Deep Web. In addition, we present a novel approach to automatically extracting attributes from query interfaces in order to address current limitations in accessing Deep Web data sources. Our Automatic Attribute Extraction method identifies (1) attributes that are used by the designers of query Web pages, called Programmer Viewpoint Attributes, and (2) attributes that are presented as labels to users, called User Viewpoint Attributes. An ontology enriches the candidate query attributes by providing synonyms and by supporting the attributes used by designers and users. Our experimental results in several e-commerce domains show that the attributes obtained by our algorithm compare favorably with manually determined attributes to be used for Deep Web queries.
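
The two attribute viewpoints named in the abstract can be illustrated with a minimal sketch. The abstract does not describe the authors' algorithm, so everything here is an assumption made for illustration: the sketch treats the `name` values of form fields as Programmer Viewpoint Attributes and the visible text of `<label>` elements as User Viewpoint Attributes, and it replaces the ontology with a hypothetical toy synonym table. The names `QueryInterfaceParser`, `extract_attributes`, `enrich`, `TOY_SYNONYMS`, and `SAMPLE_FORM` are invented for this example.

```python
from html.parser import HTMLParser


class QueryInterfaceParser(HTMLParser):
    """Collects candidate attributes from the form fields of one HTML page."""

    FIELD_TAGS = {"input", "select", "textarea"}

    def __init__(self):
        super().__init__()
        self.programmer_attributes = []  # values of name="..." on form fields
        self.user_attributes = []        # visible text of <label> elements
        self._in_label = False
        self._label_text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in self.FIELD_TAGS and attrs.get("name"):
            # Assumption: the field's internal name reflects the designer's view.
            self.programmer_attributes.append(attrs["name"])
        elif tag == "label":
            self._in_label = True
            self._label_text = []

    def handle_data(self, data):
        if self._in_label:
            self._label_text.append(data)

    def handle_endtag(self, tag):
        if tag == "label" and self._in_label:
            # Collapse whitespace and drop a trailing colon such as "Author:".
            text = " ".join("".join(self._label_text).split())
            text = text.rstrip(":").strip()
            if text:
                self.user_attributes.append(text)
            self._in_label = False


def extract_attributes(html):
    """Return (programmer_viewpoint, user_viewpoint) attribute lists."""
    parser = QueryInterfaceParser()
    parser.feed(html)
    parser.close()
    return parser.programmer_attributes, parser.user_attributes


# Hypothetical stand-in for an ontology lookup; a real ontology is far richer.
TOY_SYNONYMS = {"author": ["writer"], "title": ["name"]}


def enrich(attributes, synonyms):
    """Map each lower-cased attribute to its known synonyms (possibly none)."""
    return {a.lower(): synonyms.get(a.lower(), []) for a in attributes}


# Invented sample query interface, not taken from the paper's experiments.
SAMPLE_FORM = """
<form action="/search">
  <label for="au">Author:</label> <input type="text" name="au">
  <label for="ti">Title:</label> <input type="text" name="ti">
  <select name="fmt"><option>Hardcover</option></select>
</form>
"""

if __name__ == "__main__":
    programmer, user = extract_attributes(SAMPLE_FORM)
    print("Programmer viewpoint:", programmer)
    print("User viewpoint:", user)
    print("Enriched:", enrich(user, TOY_SYNONYMS))
```

In this toy form the internal names (`au`, `ti`, `fmt`) say little on their own while the labels are readable, which suggests why a method might want both viewpoints, and why a synonym source could help match attributes across differently worded interfaces.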

Original language: English (US)
Title of host publication: Proceedings of the 2007 ACM Symposium on Applied Computing
Pages: 1667-1672
Number of pages: 6
State: Published - 2007
Event: 2007 ACM Symposium on Applied Computing - Seoul, Korea, Republic of
Duration: Mar 11 2007 → Mar 15 2007

Publication series

Name: Proceedings of the ACM Symposium on Applied Computing

Other

Other: 2007 ACM Symposium on Applied Computing
Country/Territory: Korea, Republic of
City: Seoul
Period: 3/11/07 → 3/15/07

All Science Journal Classification (ASJC) codes

  • Software

Keywords

  • Automatic attribute extraction
  • Deep web
  • Semantic deep web
  • Semantic web
