Publication:
A RoBERTa approach for automated processing of sustainability reports

dc.contributor.coauthorTasdemir, Beyza
dc.contributor.coauthorYilmaz, Cenk Arda
dc.contributor.coauthorDemiralp, Goekcan
dc.contributor.coauthorAtay, Mert
dc.contributor.coauthorAngin, Pelin
dc.contributor.coauthorDikmener, Gokhan
dc.contributor.departmentDepartment of International Relations
dc.contributor.kuauthorAngın, Merih
dc.contributor.kuprofileFaculty Member
dc.contributor.otherDepartment of International Relations
dc.contributor.schoolcollegeinstituteCollege of Administrative Sciences and Economics
dc.contributor.yokid308500
dc.date.accessioned2024-11-09T23:06:17Z
dc.date.issued2022
dc.description.abstractThere is a strong need and demand from the United Nations, public institutions, and the private sector for classifying government publications, policy briefs, academic literature, and corporate social responsibility reports according to their relevance to the Sustainable Development Goals (SDGs). It is well understood that the SDGs play a major role in the strategic objectives of various entities. However, linking projects and activities to the SDGs has not always been straightforward or possible with existing methodologies. Natural language processing (NLP) techniques offer a new avenue to identify linkages for SDGs from text data. This research examines various machine learning approaches optimized for NLP-based text classification tasks for their success in classifying reports according to their relevance to the SDGs. Extensive experiments have been performed with the recently released Open Source SDG (OSDG) Community Dataset, which contains texts with their related SDG label as validated by community volunteers. Results demonstrate that fine-tuned RoBERTa, in particular, achieves very high performance on this task, which is promising for the automated processing of large collections of sustainability reports to detect their relevance to the SDGs.
dc.description.indexedbyWoS
dc.description.indexedbyScopus
dc.description.issue23
dc.description.openaccessYES
dc.description.sponsorshipThis research was funded by H2020 Marie Sklodowska-Curie Actions (H2020-MSCA-IF-2019) grant number 896716. The funder had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.
dc.description.volume14
dc.identifier.doi10.3390/su142316139
dc.identifier.eissn2071-1050
dc.identifier.scopus2-s2.0-85143778158
dc.identifier.urihttp://dx.doi.org/10.3390/su142316139
dc.identifier.urihttps://hdl.handle.net/20.500.14288/8953
dc.identifier.wos897354000001
dc.keywordsCorporate social responsibility
dc.keywordsNatural language processing
dc.keywordsRoBERTa
dc.keywordsSustainable development goals
dc.languageEnglish
dc.publisherMDPI
dc.sourceSustainability
dc.subjectGreen sustainable science technology
dc.subjectEnvironmental sciences
dc.subjectEnvironmental studies
dc.titleA RoBERTa approach for automated processing of sustainability reports
dc.typeJournal Article
dspace.entity.typePublication
local.contributor.authorid0000-0003-0739-798X
local.contributor.kuauthorAngın, Merih
relation.isOrgUnitOfPublication9fc25a77-75a8-48c0-8878-02d9b71a9126
relation.isOrgUnitOfPublication.latestForDiscovery9fc25a77-75a8-48c0-8878-02d9b71a9126
