Publication:
FASTSUBS: an efficient and exact procedure for finding the most likely lexical substitutes based on an N-gram language model

dc.contributor.department: Department of Computer Engineering
dc.contributor.kuauthor: Yüret, Deniz
dc.contributor.kuprofile: Faculty Member
dc.contributor.other: Department of Computer Engineering
dc.contributor.schoolcollegeinstitute: College of Engineering
dc.contributor.yokid: 179996
dc.date.accessioned: 2024-11-09T12:19:35Z
dc.date.issued: 2012
dc.description.abstract: Lexical substitutes have found use in areas such as paraphrasing, text simplification, machine translation, word sense disambiguation, and part-of-speech induction. However, the computational complexity of accurately identifying the most likely substitutes for a word has made large-scale experiments difficult. In this letter we introduce a new search algorithm, FASTSUBS, that is guaranteed to find the K most likely lexical substitutes for a given word in a sentence based on an n-gram language model. The computation is sublinear in both K and the vocabulary size V. An implementation of the algorithm and a dataset with the top 100 substitutes of each token in the WSJ section of the Penn Treebank are available at https://goo.gl/jzKH0.
dc.description.fulltext: YES
dc.description.indexedby: WoS
dc.description.indexedby: Scopus
dc.description.issue: 11
dc.description.openaccess: YES
dc.description.publisherscope: International
dc.description.sponsoredbyTubitakEu: N/A
dc.description.sponsorship: N/A
dc.description.version: Author's final manuscript
dc.description.volume: 19
dc.format: pdf
dc.identifier.doi: 10.1109/LSP.2012.2215587
dc.identifier.embargo: NO
dc.identifier.filenameinventoryno: IR00241
dc.identifier.issn: 1070-9908
dc.identifier.link: https://doi.org/10.1109/LSP.2012.2215587
dc.identifier.quartile: N/A
dc.identifier.scopus: 2-s2.0-84866393731
dc.identifier.uri: https://hdl.handle.net/20.500.14288/1498
dc.identifier.wos: 308963900004
dc.keywords: Lexical substitutes
dc.keywords: Statistical language modeling
dc.language: English
dc.publisher: Institute of Electrical and Electronics Engineers (IEEE)
dc.relation.uri: http://cdm21054.contentdm.oclc.org/cdm/ref/collection/IR/id/1266
dc.source: The IEEE Signal Processing Letters
dc.subject: Engineering
dc.subject: Electrical and electronic
dc.title: FASTSUBS: an efficient and exact procedure for finding the most likely lexical substitutes based on an N-gram language model
dc.type: Journal Article
dspace.entity.type: Publication
local.contributor.authorid: 0000-0002-7039-0046
local.contributor.kuauthor: Yüret, Deniz
relation.isOrgUnitOfPublication: 89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isOrgUnitOfPublication.latestForDiscovery: 89352e43-bf09-4ef4-82f6-6f9d0174ebae

Files

Original bundle

Name: 1266.pdf
Size: 94.36 KB
Format: Adobe Portable Document Format