Unpaired image-to-image translation with learnable tokens in DiffusionGAN

dc.contributor.advisor	Yemez, Yücel
dc.contributor.department	Graduate School of Sciences and Engineering
dc.contributor.kuauthor	Dinçer, Ege
dc.contributor.program	Computer Sciences and Engineering
dc.contributor.referee	Sahillioğlu, Yusuf||Erdem, Aykut
dc.contributor.schoolcollegeinstitute	GRADUATE SCHOOL OF SCIENCES AND ENGINEERING
dc.coverage.spatial	İstanbul
dc.date.accessioned	2025-06-30T04:36:29Z
dc.date.available	2025-04-16
dc.date.issued	2024
dc.description.abstract	Stable Diffusion models have recently achieved outstanding results in image generation tasks, surpassing prior state-of-the-art models based on Generative Adversarial Networks (GANs). While GANs are computationally efficient, their training is often unstable. We introduce a novel framework that combines the strengths of the Stable Diffusion and GAN architectures for unpaired image-to-image translation. Our approach avoids training Stable Diffusion from scratch by using pretrained token embeddings and a discriminator within a GAN-like training paradigm. This eliminates the need for pre-specified text prompts, as the framework learns suitable prompts through embeddings to perform domain-to-domain translation in an unsupervised setting. We show high-quality images generated by our framework and discuss promising directions for future improvement.
dc.description.abstract	Stable Diffusion models have marked an important turning point in image synthesis, surpassing the results previously obtained by Generative Adversarial Networks (GANs). The instability of GAN training has made Stable Diffusion a more effective alternative. In this work, we propose a new architecture for image-to-image translation that combines the strengths of Stable Diffusion with the training paradigm of GANs. The proposed model uses a pretrained Stable Diffusion model and, by means of text tokens and a discriminator, can translate between different domains without training from scratch. This removes the need for pre-specified text prompts and lets the model learn more flexibly in an unsupervised setting. In the thesis, we present high-quality images produced by the model and discuss potential areas of improvement for future work.
dc.description.fulltext	Yes
dc.format.extent	xii, 47 leaves : illustrations ; 30 cm.
dc.identifier.embargo	No
dc.identifier.endpage	59
dc.identifier.filenameinventoryno	T_2024_158_GSSE
dc.identifier.uri	https://hdl.handle.net/20.500.14288/29833
dc.identifier.yoktezid	925852
dc.identifier.yoktezlink	https://tez.yok.gov.tr/UlusalTezMerkezi/TezGoster?key=P3dtmmHrq-mzEcmCLi1CqY49iPG6P8T9RwZSjq56WGzXH37zPBhQSoqfS_Neyt_p
dc.language.iso	eng
dc.publisher	Koç University
dc.relation.collection	KU Theses and Dissertations
dc.rights	restrictedAccess
dc.rights.copyrightsnote	© All Rights Reserved. Accessible to Koç University Affiliated Users Only!
dc.subject	Mathematical models
dc.subject	Fluids
dc.subject	Fluid mechanics
dc.subject	Computer simulation
dc.subject	Artificial intelligence
dc.subject	Image processing, Digital techniques
dc.subject	Image processing
dc.subject	Computer vision
dc.title	Unpaired image-to-image translation with learnable tokens in DiffusionGAN
dc.title.alternative	DiffusionGAN ile öğrenebilir belirteçler kullanarak eşleştirilmemiş görüntü dönüşümü
dc.type	Thesis
dspace.entity.type	Publication
local.contributor.kuauthor	Dinçer, Ege
person.familyName	Dinçer
person.givenName	Ege
relation.isAdvisorOfThesis	23c08ce5-6539-43b2-a2fa-ce7e80c2b52d
relation.isAdvisorOfThesis.latestForDiscovery	23c08ce5-6539-43b2-a2fa-ce7e80c2b52d
relation.isParentOrgUnitOfPublication	434c9663-2b11-4e66-9399-c863e2ebae43
relation.isParentOrgUnitOfPublication.latestForDiscovery	434c9663-2b11-4e66-9399-c863e2ebae43

Files

Original bundle

Name: T_2024_158_GSSE.pdf
Size: 34.98 MB
Format: Adobe Portable Document Format

Collections

Theses & Dissertations