Customizing 360-Degree Panoramas through Text-to-Image Diffusion Models

Bibliographic Details
Published in Proceedings / IEEE Workshop on Applications of Computer Vision, pp. 4921-4931
Main Authors Wang, Hai; Xiang, Xiaoyu; Fan, Yuchen; Xue, Jing-Hao
Format Conference Proceeding
Language English
Published IEEE, 3 January 2024
ISSN 2642-9381
DOI 10.1109/WACV57701.2024.00486

Abstract Personalized text-to-image (T2I) synthesis based on diffusion models has attracted significant attention in recent research. However, existing methods primarily concentrate on customizing subjects or styles, neglecting the exploration of global geometry. In this study, we propose an approach that focuses on the customization of 360-degree panoramas, which inherently possess global geometric properties, using a T2I diffusion model. To achieve this, we curate a paired image-text dataset specifically designed for the task and subsequently employ it to fine-tune a pre-trained T2I diffusion model with LoRA. Nevertheless, the fine-tuned model alone does not ensure the continuity between the leftmost and rightmost sides of the synthesized images, a crucial characteristic of 360-degree panoramas. To address this issue, we propose a method called StitchDiffusion. Specifically, we perform pre-denoising operations twice at each time step of the denoising process on the stitch block consisting of the leftmost and rightmost image regions. Furthermore, a global cropping is adopted to synthesize seamless 360-degree panoramas. Experimental results demonstrate the effectiveness of our customized model combined with the proposed StitchDiffusion in generating high-quality 360-degree panoramic images. Moreover, our customized model exhibits exceptional generalization ability in producing scenes unseen in the fine-tuning dataset. Code is available at https://github.com/littlewhitesea/StitchDiffusion.
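The stitch-block idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (their code is at the repository linked above). It assumes a latent array whose last axis is the horizontal direction, and every name here (`stitch_block`, `write_back`, `toy_denoise`, `seam_gap`, `pre_denoise_seam`) is hypothetical. The denoiser is replaced by a simple smoothing filter, so the example only shows how the leftmost and rightmost regions are joined, processed twice, and written back. LoRA fine-tuning and global cropping are not covered.

```python
import numpy as np

def stitch_block(latent, k):
    """Join the rightmost k columns to the leftmost k columns (hypothetical helper)."""
    return np.concatenate([latent[..., -k:], latent[..., :k]], axis=-1)

def write_back(latent, block, k):
    """Copy a processed stitch block back onto the two borders of the latent."""
    out = latent.copy()
    out[..., -k:] = block[..., :k]   # first half of the block is the right border
    out[..., :k] = block[..., k:]    # second half of the block is the left border
    return out

def toy_denoise(x):
    """Placeholder for a diffusion denoiser: 3-tap horizontal smoothing (assumption)."""
    pad = [(0, 0)] * (x.ndim - 1) + [(1, 1)]
    p = np.pad(x, pad, mode="edge")
    return (p[..., :-2] + p[..., 1:-1] + p[..., 2:]) / 3.0

def seam_gap(latent):
    """Mean absolute difference between the last and first columns."""
    return float(np.abs(latent[..., -1] - latent[..., 0]).mean())

def pre_denoise_seam(latent, k, passes=2):
    """Process the stitch block `passes` times, mirroring 'twice at each time step'."""
    for _ in range(passes):
        block = toy_denoise(stitch_block(latent, k))
        latent = write_back(latent, block, k)
    return latent

# Toy latent of shape (4, 8, 32): a horizontal ramp, so the wrap-around seam is large.
latent = np.tile(np.arange(32, dtype=float), (4, 8, 1))
before = seam_gap(latent)
after = seam_gap(pre_denoise_seam(latent, k=8))
```

In the stitch block the last column of the image sits directly next to the first column, so any operator applied to the block treats the wrap-around seam as an interior region. In this toy setup `after` is smaller than `before`; a real denoiser would behave differently from the smoothing filter used here.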
Author_xml – sequence: 1
  givenname: Hai
  surname: Wang
  fullname: Wang, Hai
  email: hai.wang.22@ucl.ac.uk
  organization: University College London
– sequence: 2
  givenname: Xiaoyu
  surname: Xiang
  fullname: Xiang, Xiaoyu
  email: xiangxiaoyu@meta.com
  organization: Meta Reality Labs
– sequence: 3
  givenname: Yuchen
  surname: Fan
  fullname: Fan, Yuchen
  email: ycfan@meta.com
  organization: Meta Reality Labs
– sequence: 4
  givenname: Jing-Hao
  surname: Xue
  fullname: Xue, Jing-Hao
  email: jinghao.xue@ucl.ac.uk
  organization: University College London
CODEN IEEPAD
Discipline Applied Sciences
EISBN 9798350318920
EISSN 2642-9381
EndPage 4931
ExternalDocumentID 10484269
Genre orig-research
IsPeerReviewed false
IsScholarly true
Language English
PageCount 11
PublicationDate 2024-01-03
PublicationTitle Proceedings / IEEE Workshop on Applications of Computer Vision
PublicationTitleAbbrev WACV
PublicationYear 2024
Publisher IEEE
StartPage 4921
SubjectTerms Algorithms
Codes
Computational photography
Computer vision
etc
Games
Generative models for image
Geometry
image and video synthesis
Noise reduction
Task analysis
video
Title Customizing 360-Degree Panoramas through Text-to-Image Diffusion Models
URI https://ieeexplore.ieee.org/document/10484269