Publication: Hierarchical compact clustering attention (COCA) for unsupervised object-centric learning
Program
KU-Authors
Co-Authors
Editor & Affiliation
Compiler & Affiliation
Translator
Other Contributor
Date
Language
eng
Embargo Status
No
Journal Title
Journal ISSN
Volume Title
Alternative Title
Abstract
We propose the Compact Clustering Attention (COCA) layer, an effective building block that introduces a hierarchical strategy for object-centric representation learning while solving the unsupervised object discovery task on single images. COCA is an attention-based clustering module that, when cascaded into a bottom-up hierarchical network architecture referred to as COCA-Net, extracts object-centric representations from multi-object scenes. At its core, COCA uses a novel clustering algorithm that leverages the physical concept of compactness to highlight distinct object centroids in a scene, providing a spatial inductive bias. Thanks to this strategy, COCA-Net generates high-quality segmentation masks on both the decoder side and, notably, the encoder side of its pipeline. Additionally, COCA-Net is not bound to a predetermined number of object masks and handles the segmentation of background elements better than its competitors. We demonstrate COCA-Net's segmentation performance on six widely adopted datasets, achieving superior or competitive results against state-of-the-art models across nine different evaluation metrics.
Publisher
IEEE
Subject
Computer science
Citation
Has Part
Source
2025 IEEE CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Book Series Title
Edition
DOI
10.1109/CVPR52734.2025.02364
Link
Rights
N/A
Copyrights Note
Creative Commons license
Except where otherwise noted, this item's license is described as N/A
