TY - JOUR
T1 - Embedding Imputation With Self-Supervised Graph Neural Networks
AU - Varolgunes, Uras
AU - Yao, Shibo
AU - Ma, Yao
AU - Yu, Dantong
N1 - Funding Information:
This work was supported by the Department of Energy under Grant DE-SC0022346.
Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Embedding learning is essential in various research areas, especially in natural language processing (NLP). However, given the nature of unstructured data and word frequency distributions, general pre-trained embeddings, such as word2vec and GloVe, are often inferior in language tasks for specific domains because of missing or unreliable embeddings. In many domain-specific language tasks, pre-existing side information can often be converted into a graph that depicts the pairwise relationships between words. Previous methods use kernel tricks to pre-compute a fixed graph for propagating information across words and imputing missing representations. These methods require the optimal graph construction strategy to be defined before any model training, resulting in an inflexible two-step process. In this paper, we leverage recent advances in graph neural networks and self-supervision strategies to simultaneously learn a similarity graph and impute missing embeddings in an end-to-end fashion while keeping the overall time complexity well controlled. We undertake extensive experiments to show that the integrated approach outperforms several baseline methods.
AB - Embedding learning is essential in various research areas, especially in natural language processing (NLP). However, given the nature of unstructured data and word frequency distributions, general pre-trained embeddings, such as word2vec and GloVe, are often inferior in language tasks for specific domains because of missing or unreliable embeddings. In many domain-specific language tasks, pre-existing side information can often be converted into a graph that depicts the pairwise relationships between words. Previous methods use kernel tricks to pre-compute a fixed graph for propagating information across words and imputing missing representations. These methods require the optimal graph construction strategy to be defined before any model training, resulting in an inflexible two-step process. In this paper, we leverage recent advances in graph neural networks and self-supervision strategies to simultaneously learn a similarity graph and impute missing embeddings in an end-to-end fashion while keeping the overall time complexity well controlled. We undertake extensive experiments to show that the integrated approach outperforms several baseline methods.
KW - Embedding imputation
KW - graph neural networks
KW - natural language processing
UR - http://www.scopus.com/inward/record.url?scp=85166481717&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85166481717&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2023.3292314
DO - 10.1109/ACCESS.2023.3292314
M3 - Article
AN - SCOPUS:85166481717
SN - 2169-3536
VL - 11
SP - 70610
EP - 70620
JO - IEEE Access
JF - IEEE Access
ER -