Publication: Exploring modulation spectrum features for speech-based depression level classification
dc.contributor.coauthor | Toledo-Ronen, Orith | |
dc.contributor.coauthor | Sorin, Alexander | |
dc.contributor.department | N/A | |
dc.contributor.kuauthor | Bozkurt, Elif | |
dc.contributor.kuprofile | PhD Student | |
dc.contributor.schoolcollegeinstitute | Graduate School of Sciences and Engineering | |
dc.contributor.yokid | N/A | |
dc.date.accessioned | 2024-11-10T00:04:50Z | |
dc.date.issued | 2014 | |
dc.description.abstract | In this paper, we propose a manageable Modulation Spectrum-based feature set for detecting depressed speech. The Modulation Spectrum (MS) is obtained from the conventional speech spectrogram by spectral analysis along the temporal trajectories of the acoustic frequency bins. While the MS representation of speech provides rich, high-dimensional joint frequency information, extracting discriminative features from it remains an open question. We propose a lower-dimensional representation, which first employs a Mel-frequency filterbank in the acoustic frequency domain and a Discrete Cosine Transform in the modulation frequency domain, and then applies feature selection in both domains. We compare and fuse the proposed feature set with other complementary prosodic and spectral features at the feature and decision levels. In our experiments, we use Support Vector Machines to discriminate depressed speech in a speaker-independent fashion. Feature-level fusion of the proposed MS-based features with other prosodic and spectral features after dimension reduction provides up to ~9% improvement over the baseline results and also correlates most strongly with clinical ratings of patients' depression level. | |
dc.description.indexedby | WoS | |
dc.description.indexedby | Scopus | |
dc.description.openaccess | YES | |
dc.description.publisherscope | International | |
dc.description.sponsorship | Amazon | |
dc.description.sponsorship | Baidu | |
dc.description.sponsorship | et al. | |
dc.description.sponsorship | Temasek Laboratories at Nanyang Technological University (TL at NTU) | |
dc.identifier.doi | N/A | |
dc.identifier.issn | 2308-457X | |
dc.identifier.link | https://www.scopus.com/inward/record.uri?eid=2-s2.0-84910070415&partnerID=40&md5=5bfebfff94737517e4b29deea7566531 | |
dc.identifier.scopus | 2-s2.0-84910070415 | |
dc.identifier.uri | N/A | |
dc.identifier.uri | https://hdl.handle.net/20.500.14288/16342 | |
dc.identifier.wos | 395050100253 | |
dc.keywords | Decision fusion | |
dc.keywords | Depression assessment | |
dc.keywords | Feature fusion | |
dc.keywords | Modulation spectrum | |
dc.keywords | Prosody | |
dc.keywords | Discrete cosine transforms | |
dc.keywords | Feature extraction | |
dc.keywords | Frequency domain analysis | |
dc.keywords | Modulation | |
dc.keywords | Spectrum analysis | |
dc.keywords | Speech | |
dc.keywords | Speech communication | |
dc.keywords | Speech recognition | |
dc.language | English | |
dc.publisher | International Speech Communication Association | |
dc.source | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH | |
dc.subject | Computer science | |
dc.subject | Artificial intelligence | |
dc.subject | Engineering | |
dc.subject | Electrical electronic engineering | |
dc.title | Exploring modulation spectrum features for speech-based depression level classification | |
dc.type | Conference proceeding | |
dspace.entity.type | Publication | |
local.contributor.authorid | N/A | |
local.contributor.kuauthor | Bozkurt, Elif |
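
The feature pipeline summarized in the abstract (Mel filterbank in the acoustic frequency domain, spectral analysis along each band's temporal trajectory, then a DCT in the modulation frequency domain) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the frame and hop sizes, the number of Mel bands and DCT coefficients, the log compression, and the whole-utterance analysis window are all assumptions chosen for illustration. Feature selection, fusion, and the SVM classifier are not covered.

```python
import numpy as np


def mel_filterbank(n_filters, n_fft, sr):
    """Triangular Mel filters, shape (n_filters, n_fft // 2 + 1)."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    hz_pts = mel_to_hz(mel_pts)
    freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_filters, freqs.size))
    for i in range(n_filters):
        lo, mid, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb


def log_mel_spectrogram(signal, sr, n_fft=512, hop=160, n_mels=20):
    """Log Mel-band energies, shape (n_mels, n_frames). Sizes are assumed."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[t * hop : t * hop + n_fft] * window for t in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel = power @ mel_filterbank(n_mels, n_fft, sr).T
    return np.log(mel + 1e-10).T


def dct2(x, n_coeffs):
    """Unnormalized DCT-II along the last axis, keeping n_coeffs terms."""
    n = x.shape[-1]
    k = np.arange(n_coeffs)[:, None]
    basis = np.cos(np.pi * k * (2 * np.arange(n) + 1) / (2 * n))
    return x @ basis.T


def modulation_spectrum_features(signal, sr, n_mels=20, n_dct=8):
    """Mel-band modulation spectrum compressed by a DCT, shape (n_mels, n_dct)."""
    mel = log_mel_spectrogram(signal, sr, n_mels=n_mels)
    # Remove each band's mean so the DC modulation bin does not dominate.
    mel = mel - mel.mean(axis=1, keepdims=True)
    # Spectral analysis along the temporal trajectory of each Mel band.
    ms = np.abs(np.fft.rfft(mel, axis=1))
    # DCT along the modulation frequency axis for a compact representation.
    return dct2(np.log(ms + 1e-10), n_dct)


# Synthetic stand-in for speech: a 1 s, 500 Hz tone amplitude-modulated at 4 Hz.
sr = 16000
t = np.arange(sr) / sr
signal = (1.0 + 0.5 * np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 500 * t)
features = modulation_spectrum_features(signal, sr)
```

Flattening `features` gives one fixed-length vector per utterance, which is the kind of input a feature selection step or an SVM would expect; how the paper actually segments the audio and which coefficients it keeps is not stated in this record.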