Customizing 360-Degree Panoramas through Text-to-Image Diffusion Models

Bibliographic Details
Published in	Proceedings / IEEE Workshop on Applications of Computer Vision pp. 4921 - 4931
Main Authors	Wang, Hai, Xiang, Xiaoyu, Fan, Yuchen, Xue, Jing-Hao
Format	Conference Proceeding
Language	English
Published	IEEE, 3 January 2024
Subjects	Algorithms Codes Computational photography Computer vision etc Games Generative models for image Geometry image and video synthesis Noise reduction Task analysis video
ISSN	2642-9381
DOI	10.1109/WACV57701.2024.00486

Abstract	Personalized text-to-image (T2I) synthesis based on diffusion models has attracted significant attention in recent research. However, existing methods primarily concentrate on customizing subjects or styles, neglecting the exploration of global geometry. In this study, we propose an approach that focuses on the customization of 360-degree panoramas, which inherently possess global geometric properties, using a T2I diffusion model. To achieve this, we curate a paired image-text dataset specifically designed for the task and subsequently employ it to fine-tune a pre-trained T2I diffusion model with LoRA. Nevertheless, the fine-tuned model alone does not ensure the continuity between the leftmost and rightmost sides of the synthesized images, a crucial characteristic of 360-degree panoramas. To address this issue, we propose a method called StitchDiffusion. Specifically, we perform pre-denoising operations twice at each time step of the denoising process on the stitch block consisting of the leftmost and rightmost image regions. Furthermore, a global cropping is adopted to synthesize seamless 360-degree panoramas. Experimental results demonstrate the effectiveness of our customized model combined with the proposed StitchDiffusion in generating high-quality 360-degree panoramic images. Moreover, our customized model exhibits exceptional generalization ability in producing scenes unseen in the fine-tuning dataset. Code is available at https://github.com/littlewhitesea/StitchDiffusion.
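
The stitch-block idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (the linked repository holds that). It assumes a toy 2D array in place of a diffusion latent and a simple 3-tap blur, `toy_denoise`, standing in for a denoising step. The helper names `make_stitch_block`, `write_back`, and `seam_gap`, and the region width `k`, are hypothetical and chosen only for illustration.

```python
import numpy as np


def make_stitch_block(x, k):
    """Join the rightmost k columns and the leftmost k columns of x (H, W),
    so the wrap-around seam of the panorama sits in the middle of the block."""
    return np.concatenate([x[:, -k:], x[:, :k]], axis=1)


def write_back(x, block, k):
    """Copy a processed stitch block back into the two edges of x."""
    out = x.copy()
    out[:, -k:] = block[:, :k]
    out[:, :k] = block[:, k:]
    return out


def toy_denoise(block):
    """Placeholder for a denoising step (assumption): 3-tap horizontal box blur
    with edge padding. A real model call would go here."""
    padded = np.pad(block, ((0, 0), (1, 1)), mode="edge")
    return (padded[:, :-2] + padded[:, 1:-1] + padded[:, 2:]) / 3.0


def seam_gap(x):
    """Mean absolute difference between the leftmost and rightmost columns."""
    return float(np.abs(x[:, 0] - x[:, -1]).mean())


H, W, k = 4, 16, 4
# Horizontal ramp: the left edge is 0 and the right edge is 1, so the seam is maximally broken.
x = np.tile(np.linspace(0.0, 1.0, W), (H, 1))
before = seam_gap(x)

for _ in range(10):  # stand-in for the time steps of the denoising loop
    block = make_stitch_block(x, k)
    block = toy_denoise(block)
    x = write_back(x, block, k)

after = seam_gap(x)
print(f"seam gap before: {before:.3f}, after: {after:.3f}")
```

Because the block places the right edge directly next to the left edge, any operation that makes the block locally coherent also reduces the discontinuity where the panorama wraps around. The sketch omits the global cropping step and the second pre-denoising pass mentioned in the abstract.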
Author Affiliations	Wang, Hai (University College London, hai.wang.22@ucl.ac.uk); Xiang, Xiaoyu (Meta Reality Labs, xiangxiaoyu@meta.com); Fan, Yuchen (Meta Reality Labs, ycfan@meta.com); Xue, Jing-Hao (University College London, jinghao.xue@ucl.ac.uk)
CODEN	IEEPAD
Discipline	Applied Sciences
EISBN	9798350318920
EISSN	2642-9381
Genre	orig-research
IsPeerReviewed	false
IsScholarly	true
PageCount	11
PublicationTitleAbbrev	WACV
URI	https://ieeexplore.ieee.org/document/10484269