Customizing 360-Degree Panoramas through Text-to-Image Diffusion Models
Personalized text-to-image (T2I) synthesis based on diffusion models has attracted significant attention in recent research. However, existing methods primarily concentrate on customizing subjects or styles, neglecting the exploration of global geometry. In this study, we propose an approach that fo...
| Published in | Proceedings / IEEE Workshop on Applications of Computer Vision pp. 4921 - 4931 |
|---|---|
| Main Authors | Wang, Hai; Xiang, Xiaoyu; Fan, Yuchen; Xue, Jing-Hao |
| Format | Conference Proceeding |
| Language | English |
| Published | IEEE, 03.01.2024 |
| Subjects | |
| ISSN | 2642-9381 |
| DOI | 10.1109/WACV57701.2024.00486 |
| Abstract | Personalized text-to-image (T2I) synthesis based on diffusion models has attracted significant attention in recent research. However, existing methods primarily concentrate on customizing subjects or styles, neglecting the exploration of global geometry. In this study, we propose an approach that focuses on the customization of 360-degree panoramas, which inherently possess global geometric properties, using a T2I diffusion model. To achieve this, we curate a paired image-text dataset specifically designed for the task and subsequently employ it to fine-tune a pre-trained T2I diffusion model with LoRA. Nevertheless, the fine-tuned model alone does not ensure the continuity between the leftmost and rightmost sides of the synthesized images, a crucial characteristic of 360-degree panoramas. To address this issue, we propose a method called StitchDiffusion. Specifically, we perform pre-denoising operations twice at each time step of the denoising process on the stitch block consisting of the leftmost and rightmost image regions. Furthermore, a global cropping is adopted to synthesize seamless 360-degree panoramas. Experimental results demonstrate the effectiveness of our customized model combined with the proposed StitchDiffusion in generating high-quality 360-degree panoramic images. Moreover, our customized model exhibits exceptional generalization ability in producing scenes unseen in the fine-tuning dataset. Code is available at https://github.com/littlewhitesea/StitchDiffusion. |
|---|---|
| Author | Wang, Hai; Xiang, Xiaoyu; Xue, Jing-Hao; Fan, Yuchen |
| Author_xml | 1. Wang, Hai (hai.wang.22@ucl.ac.uk), University College London; 2. Xiang, Xiaoyu (xiangxiaoyu@meta.com), Meta Reality Labs; 3. Fan, Yuchen (ycfan@meta.com), Meta Reality Labs; 4. Xue, Jing-Hao (jinghao.xue@ucl.ac.uk), University College London |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DOI | 10.1109/WACV57701.2024.00486 |
| Discipline | Applied Sciences |
| EISBN | 9798350318920 |
| EISSN | 2642-9381 |
| EndPage | 4931 |
| ExternalDocumentID | 10484269 |
| Genre | orig-research |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| PageCount | 11 |
| PublicationCentury | 2000 |
| PublicationDate | 2024-Jan.-3 |
| PublicationDateYYYYMMDD | 2024-01-03 |
| PublicationDate_xml | – month: 01 year: 2024 text: 2024-Jan.-3 day: 03 |
| PublicationDecade | 2020 |
| PublicationTitle | Proceedings / IEEE Workshop on Applications of Computer Vision |
| PublicationTitleAbbrev | WACV |
| PublicationYear | 2024 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| StartPage | 4921 |
| SubjectTerms | Algorithms; Codes; Computational photography; Computer vision; etc; Games; Generative models for image; Geometry; image and video synthesis; Noise reduction; Task analysis; video |
| Title | Customizing 360-Degree Panoramas through Text-to-Image Diffusion Models |
| URI | https://ieeexplore.ieee.org/document/10484269 |
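
The abstract describes operating on a "stitch block consisting of the leftmost and rightmost image regions" twice at each denoising time step. The toy sketch below illustrates only that idea on a single row of numbers and is not the authors' implementation: every function name is hypothetical, a 3-tap moving average stands in for the diffusion model's denoiser (an assumption made purely for illustration), and the global cropping step is not modeled.

```python
def make_stitch_block(row, k):
    """Join the rightmost k and leftmost k values so the wrap-around seam sits mid-block."""
    return row[-k:] + row[:k]


def write_back(row, block, k):
    """Copy a processed stitch block back into the two ends of the row."""
    out = list(row)
    out[-k:] = block[:k]   # first half of the block is the right edge
    out[:k] = block[k:]    # second half of the block is the left edge
    return out


def toy_pre_denoise(block):
    """Stand-in for a denoiser call (assumption): 3-tap average, end values kept fixed."""
    smoothed = list(block)
    for i in range(1, len(block) - 1):
        smoothed[i] = (block[i - 1] + block[i] + block[i + 1]) / 3.0
    return smoothed


def seam_gap(row):
    """Mismatch between the leftmost and rightmost values of the row."""
    return abs(row[0] - row[-1])


def toy_stitch_steps(row, k, steps, passes_per_step=2):
    """At every step, process the stitch block twice, then write it back."""
    for _ in range(steps):
        block = make_stitch_block(row, k)
        for _ in range(passes_per_step):
            block = toy_pre_denoise(block)
        row = write_back(row, block, k)
    return row


# A ramp has the worst possible seam: the left edge is 0.0 and the right edge is 15.0.
row = [float(i) for i in range(16)]
result = toy_stitch_steps(row, k=4, steps=10)
print("seam gap before:", seam_gap(row))
print("seam gap after: ", seam_gap(result))
```

Because the stitch block places the rightmost region directly before the leftmost one, the seam is treated as interior content rather than as two unrelated borders, which is why the mismatch between the two edges shrinks in this toy setting while the middle of the row is left untouched.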