Topical classification of domain names based on subword embeddings

Chong Wang, Yi Chen

Research output: Contribution to journal › Article › peer-review

Abstract

A good domain name can help a company rapidly increase its brand awareness, attract more visitors, and thereby obtain more customers. Due to the exponential increase in the number of domain names, registrants are often frustrated to find that their preferred domain names are already taken. To improve registrants’ satisfaction and efficiency, and to increase the revenue of registrars (e.g., GoDaddy, Yahoo, Squarespace), it is important to suggest alternative domain names that are still available. The first step is to detect a registrant’s needs by classifying the attempted domain name into one of a set of pre-defined categories. This study is the first to define this problem of domain name classification. The paper proposes deep neural networks with subword embeddings built using multiple strategies: embeddings for the character n-grams of a domain name are learned from the training data, learned from an external corpus, or learned from an external corpus and then adjusted based on the training data. The experiments show that the proposed methods significantly outperform the baselines.
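To make the idea of character n-gram (subword) embeddings concrete, the following is a minimal sketch rather than the authors' implementation. The abstract does not state the n-gram lengths, the use of boundary markers, or how n-gram vectors are combined, so the 3-to-5 range, the `<` and `>` markers, and the averaging step are all assumptions, and the function names `char_ngrams` and `embed_domain` are hypothetical. The random vectors stand in for embeddings that would be learned from training data or an external corpus.

```python
import random


def char_ngrams(name, n_min=3, n_max=5):
    """Split a domain name into character n-grams (assumed lengths 3 to 5)."""
    # Assumption: drop the last dot-separated label (the TLD),
    # e.g. "bestpizza.com" -> "bestpizza".
    label = name.lower().rsplit(".", 1)[0]
    # Assumption: mark the start and end of the label with "<" and ">".
    padded = f"<{label}>"
    return [
        padded[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(padded) - n + 1)
    ]


def embed_domain(name, table, dim=8, rng=None):
    """Average the vectors of a domain name's n-grams.

    `table` maps n-gram -> vector. Unseen n-grams get a random vector,
    a placeholder for an embedding that would normally be learned.
    """
    rng = rng or random.Random(0)
    grams = char_ngrams(name)
    if not grams:
        return [0.0] * dim
    for gram in grams:
        if gram not in table:
            table[gram] = [rng.uniform(-0.5, 0.5) for _ in range(dim)]
    return [sum(table[g][k] for g in grams) / len(grams) for k in range(dim)]


if __name__ == "__main__":
    table = {}
    vector = embed_domain("bestpizza.com", table)
    shared = set(char_ngrams("bestpizza.com")) & set(char_ngrams("pizzahut.com"))
    print(len(vector), sorted(shared))
```

Because "bestpizza.com" and "pizzahut.com" share n-grams such as "pizza", their averaged vectors draw on the same table entries, which is what lets a subword model relate domain names that are not segmented into words. In a real classifier the averaged vector would feed a neural network, and the table would be trained rather than randomly filled.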

Original language: English (US)
Article number: 100961
Journal: Electronic Commerce Research and Applications
Volume: 40
State: Published - Mar 1 2020

All Science Journal Classification (ASJC) codes

  • Computer Science Applications
  • Computer Networks and Communications
  • Marketing
  • Management of Technology and Innovation

Keywords

  • Domain names
  • E-Commerce
  • Internet
  • Text classification
  • WWW
