Publication:
Unsupervised part of speech tagging using unambiguous substitutes from a statistical language model

dc.contributor.department: Department of Computer Engineering
dc.contributor.department: Graduate School of Sciences and Engineering
dc.contributor.kuauthor: Yatbaz, Mehmet Ali
dc.contributor.kuauthor: Yüret, Deniz
dc.contributor.schoolcollegeinstitute: College of Engineering
dc.contributor.schoolcollegeinstitute: GRADUATE SCHOOL OF SCIENCES AND ENGINEERING
dc.date.accessioned: 2024-11-09T22:51:07Z
dc.date.issued: 2010
dc.description.abstract: We show that unsupervised part of speech tagging performance can be significantly improved using likely substitutes for target words given by a statistical language model. We choose unambiguous substitutes for each occurrence of an ambiguous target word based on its context. The part of speech tags for the unambiguous substitutes are then used to filter the entry for the target word in the word-tag dictionary. A standard HMM model trained using the filtered dictionary achieves 92.25% accuracy on a standard 24,000 word corpus.
dc.description.indexedby: Scopus
dc.description.openaccess: YES
dc.description.publisherscope: International
dc.description.sponsoredbyTubitakEu: N/A
dc.description.sponsorship: National Natural Science Foundation of China
dc.description.sponsorship: Dep. Lang. Inf. Adm., Minist. Educ.
dc.description.sponsorship: BaiDu
dc.description.sponsorship: Google
dc.description.sponsorship: Fujitsu R&D Center CO., LTD.
dc.description.volume: 2
dc.identifier.link: https://www.scopus.com/inward/record.uri?eid=2-s2.0-80053400067&partnerID=40&md5=9d00490aa573a3ff0f42e50375e15a0e
dc.identifier.quartile: N/A
dc.identifier.scopus: 2-s2.0-80053400067
dc.identifier.uri: https://hdl.handle.net/20.500.14288/6771
dc.keywords: HMM models
dc.keywords: Part of speech tagging
dc.keywords: Part-of-speech tags
dc.keywords: Statistical language models
dc.keywords: Computational linguistics
dc.language.iso: eng
dc.publisher: COLING
dc.relation.ispartof: Coling 2010 - 23rd International Conference on Computational Linguistics, Proceedings of the Conference
dc.subject: Computer engineering
dc.title: Unsupervised part of speech tagging using unambiguous substitutes from a statistical language model
dc.type: Conference Proceeding
dspace.entity.type: Publication
local.contributor.kuauthor: Yüret, Deniz
local.contributor.kuauthor: Yatbaz, Mehmet Ali
local.publication.orgunit1: College of Engineering
local.publication.orgunit1: GRADUATE SCHOOL OF SCIENCES AND ENGINEERING
local.publication.orgunit2: Department of Computer Engineering
local.publication.orgunit2: Graduate School of Sciences and Engineering
relation.isOrgUnitOfPublication: 89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isOrgUnitOfPublication: 3fc31c89-e803-4eb1-af6b-6258bc42c3d8
relation.isOrgUnitOfPublication.latestForDiscovery: 89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isParentOrgUnitOfPublication: 8e756b23-2d4a-4ce8-b1b3-62c794a8c164
relation.isParentOrgUnitOfPublication: 434c9663-2b11-4e66-9399-c863e2ebae43
relation.isParentOrgUnitOfPublication.latestForDiscovery: 8e756b23-2d4a-4ce8-b1b3-62c794a8c164
