A method for constructing word sense embeddings based on word sense induction

dc.contributor.author: Sun, Yujia
dc.contributor.author: Platoš, Jan
dc.date.accessioned: 2024-03-06T14:23:54Z
dc.date.available: 2024-03-06T14:23:54Z
dc.date.issued: 2023
dc.description.abstract: Polysemy is an inherent characteristic of natural language. In order to make it easier to distinguish between different senses of polysemous words, we propose a method for encoding multiple different senses of polysemous words using a single vector. The method first uses a two-layer bidirectional long short-term memory neural network and a self-attention mechanism to extract the contextual information of polysemous words. Then, a K-means algorithm, which is improved by optimizing the density peaks clustering algorithm based on cosine similarity, is applied to perform word sense induction on the contextual information of polysemous words. Finally, the method constructs the corresponding word sense embedded representations of the polysemous words. The results of the experiments demonstrate that the proposed method produces better word sense induction than Euclidean distance, Pearson correlation, and KL-divergence and more accurate word sense embeddings than mean shift, DBSCAN, spectral clustering, and agglomerative clustering. [cs]
dc.description.firstpage: art. no. 12945 [cs]
dc.description.issue: 1 [cs]
dc.description.source: Web of Science [cs]
dc.description.volume: 13 [cs]
dc.identifier.citation: Scientific Reports. 2023, vol. 13, issue 1, art. no. 12945. [cs]
dc.identifier.doi: 10.1038/s41598-023-40062-3
dc.identifier.issn: 2045-2322
dc.identifier.uri: http://hdl.handle.net/10084/152294
dc.identifier.wos: 001045574100067
dc.language.iso: en [cs]
dc.publisher: Springer Nature [cs]
dc.relation.ispartofseries: Scientific Reports [cs]
dc.relation.uri: https://doi.org/10.1038/s41598-023-40062-3 [cs]
dc.rights: Copyright © 2023, The Author(s) [cs]
dc.rights.access: openAccess [cs]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/ [cs]
dc.title: A method for constructing word sense embeddings based on word sense induction [cs]
dc.type: article [cs]
dc.type.status: Peer-reviewed [cs]
dc.type.version: publishedVersion [cs]
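
The clustering step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the context vectors have already been produced by some encoder (the BiLSTM and self-attention stage is omitted), it seeds K-means with density peaks computed on cosine distance, and it treats each normalized cluster centroid as the sense embedding. The function names, the Gaussian density kernel, and the `cutoff` parameter are all assumptions made for illustration.

```python
import numpy as np


def normalize(X):
    """Scale each row to unit length so dot products equal cosine similarity."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.clip(norms, 1e-12, None)


def density_peak_seeds(X, k, cutoff=0.2):
    """Return indices of k seed points chosen by a density-peaks criterion.

    rho   : local density (Gaussian kernel over cosine distance, assumed here)
    delta : cosine distance to the nearest point of higher density
    Points with large rho * delta are dense and far from denser points.
    """
    Xn = normalize(X)
    dist = 1.0 - Xn @ Xn.T                      # pairwise cosine distance
    np.fill_diagonal(dist, 0.0)
    rho = np.exp(-(dist / cutoff) ** 2).sum(axis=1) - 1.0  # drop self term

    order = np.argsort(-rho, kind="stable")     # densest point first
    delta = np.empty(len(X))
    delta[order[0]] = dist[order[0]].max()      # convention for the global peak
    for pos in range(1, len(X)):
        i = order[pos]
        delta[i] = dist[i, order[:pos]].min()   # nearest denser point
    return np.argsort(-(rho * delta))[:k]


def cosine_kmeans(X, k, cutoff=0.2, n_iter=50):
    """K-means on unit vectors with cosine similarity and density-peak seeds.

    Returns (labels, centres); each centre is used here as a hypothetical
    sense embedding for the corresponding induced sense.
    """
    Xn = normalize(X)
    centres = Xn[density_peak_seeds(X, k, cutoff)]
    labels = np.zeros(len(X), dtype=int)
    for it in range(n_iter):
        new_labels = np.argmax(Xn @ centres.T, axis=1)  # most similar centre
        if it > 0 and np.array_equal(new_labels, labels):
            break                                       # assignments stable
        labels = new_labels
        for j in range(k):
            members = Xn[labels == j]
            if len(members):
                centres[j] = members.mean(axis=0)
        centres = normalize(centres)
    return labels, centres
```

Calling `cosine_kmeans(context_vectors, k)` on an array with one row per occurrence of a polysemous word would return a sense label for each occurrence and one unit-length vector per induced sense. The number of senses `k` is supplied by the caller in this sketch; how the published method chooses it is not stated in the abstract.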

Files

Original bundle

Name:
2045-2322-2023v13i1an12945.pdf
Size:
2.15 MB
Format:
Adobe Portable Document Format