A method for constructing word sense embeddings based on word sense induction
Publisher
Springer Nature
Abstract
Polysemy is an inherent characteristic of natural language. In order to make it easier to distinguish
between different senses of polysemous words, we propose a method for encoding multiple different
senses of polysemous words using a single vector. The method first uses a two-layer bidirectional
long short-term memory neural network and a self-attention mechanism to extract the contextual
information of polysemous words. Then, a K-means algorithm, which is improved by optimizing
the density peaks clustering algorithm based on cosine similarity, is applied to perform word sense
induction on the contextual information of polysemous words. Finally, the method constructs the
corresponding word sense embedded representations of the polysemous words. The experimental
results show that the proposed method, which clusters by cosine similarity, produces better word sense
induction than variants based on Euclidean distance, Pearson correlation, or KL-divergence, and more
accurate word sense embeddings than mean shift, DBSCAN, spectral clustering, or agglomerative clustering.
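
The clustering step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the density peaks idea is used only to choose the initial K-means centers, that similarity is measured by cosine on unit-length context vectors, and that local density is a simple neighbor count within a cutoff. The function names and the cutoff parameter `d_c` are hypothetical, and the context vectors are taken as given rather than produced by the BiLSTM and self-attention encoder.

```python
import math

def normalize(v):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)

def cosine(u, v):
    """Cosine similarity of two unit-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def density_peak_seeds(points, k, d_c):
    """Pick k initial centers with a density-peaks score under cosine distance.

    Assumption: density rho is the number of neighbors closer than d_c, and
    delta is the distance to the nearest point of higher density.
    """
    n = len(points)
    dist = [[1.0 - cosine(points[i], points[j]) for j in range(n)] for i in range(n)]
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c) for i in range(n)]
    delta = []
    for i in range(n):
        # Ties in density are broken by index so exactly one point is the global peak.
        higher = [dist[i][j] for j in range(n)
                  if rho[j] > rho[i] or (rho[j] == rho[i] and j < i)]
        delta.append(min(higher) if higher else max(dist[i]))
    # Points that are both dense and far from any denser point make good seeds.
    order = sorted(range(n), key=lambda i: rho[i] * delta[i], reverse=True)
    return [points[i] for i in order[:k]]

def cosine_kmeans(vectors, k, d_c=0.2, max_iter=100):
    """K-means with cosine similarity, seeded by density_peak_seeds."""
    points = [normalize(v) for v in vectors]
    centers = density_peak_seeds(points, k, d_c)
    labels = None
    for _ in range(max_iter):
        # Assign each point to the most similar center.
        new_labels = [max(range(k), key=lambda c: cosine(p, centers[c])) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        # Recompute each center as the normalized mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                centers[c] = normalize(mean)
    return labels, centers

# Toy context vectors for one polysemous word: two directions, two senses.
contexts = [[1.0, 0.05], [1.0, -0.05], [0.9, 0.1], [1.0, 0.0],
            [0.05, 1.0], [-0.05, 1.0], [0.1, 0.9], [0.0, 1.0]]
labels, centers = cosine_kmeans(contexts, k=2)
```

In this sketch each returned center is a unit-length vector that could stand in for one induced sense. How the paper combines the induced senses into its final embedding is not specified in the abstract and is not modeled here.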
Citation
Scientific Reports. 2023, vol. 13, issue 1, art. no. 12945.