FASTSUBS: an efficient and exact procedure for finding the most likely lexical substitutes based on an N-gram language model

dc.contributor.department	Department of Computer Engineering
dc.contributor.facultymember	Yes
dc.contributor.kuauthor	Yüret, Deniz
dc.contributor.schoolcollegeinstitute	College of Engineering
dc.date.accessioned	2024-11-09T12:19:35Z
dc.date.issued	2012
dc.description.abstract	Lexical substitutes have found use in areas such as paraphrasing, text simplification, machine translation, word sense disambiguation, and part-of-speech induction. However, the computational complexity of accurately identifying the most likely substitutes for a word has made large-scale experiments difficult. In this letter we introduce a new search algorithm, FASTSUBS, that is guaranteed to find the K most likely lexical substitutes for a given word in a sentence based on an n-gram language model. The computation is sublinear in both K and the vocabulary size V. An implementation of the algorithm and a dataset with the top 100 substitutes of each token in the WSJ section of the Penn Treebank are available at https://goo.gl/jzKH0.
dc.description.fulltext	YES
dc.description.indexedby	WOS
dc.description.indexedby	Scopus
dc.description.issue	11
dc.description.openaccess	YES
dc.description.publisherscope	International
dc.description.sponsoredbyTubitakEu	N/A
dc.description.sponsorship	N/A
dc.description.version	Author's final manuscript
dc.description.volume	19
dc.identifier.doi	10.1109/LSP.2012.2215587
dc.identifier.embargo	NO
dc.identifier.filenameinventoryno	IR00241
dc.identifier.issn	1070-9908
dc.identifier.quartile	N/A
dc.identifier.scopus	2-s2.0-84866393731
dc.identifier.uri	https://doi.org/10.1109/LSP.2012.2215587
dc.identifier.wos	308963900004
dc.keywords	Lexical substitutes
dc.keywords	Statistical language modeling
dc.language.iso	eng
dc.publisher	Institute of Electrical and Electronics Engineers (IEEE)
dc.relation.ispartof	The IEEE Signal Processing Letters
dc.relation.uri	http://cdm21054.contentdm.oclc.org/cdm/ref/collection/IR/id/1266
dc.subject	Engineering
dc.subject	Electrical and electronic
dc.title	FASTSUBS: an efficient and exact procedure for finding the most likely lexical substitutes based on an N-gram language model
dc.type	Journal Article
dspace.entity.type	Publication
local.contributor.kuauthor	Yüret, Deniz
relation.isOrgUnitOfPublication	89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isOrgUnitOfPublication.latestForDiscovery	89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isParentOrgUnitOfPublication	8e756b23-2d4a-4ce8-b1b3-62c794a8c164
relation.isParentOrgUnitOfPublication.latestForDiscovery	8e756b23-2d4a-4ce8-b1b3-62c794a8c164

Files

Original bundle

Name: 1266.pdf
Size: 94.36 KB
Format: Adobe Portable Document Format

Collections

Publications with Fulltext
