LFSamba: Marry SAM With Mamba for Light Field Salient Object Detection

A light field camera can reconstruct 3D scenes using captured multi-focus images that contain rich spatial geometric information, enhancing applications in stereoscopic photography, virtual reality, and robotic vision. In this work, a state-of-the-art salient object detection model for multi-focus l...

Bibliographic Details
Published in	IEEE Signal Processing Letters, Vol. 31, pp. 3144-3148
Main Authors	Liu, Zhengyi; Wang, Longzhen; Fang, Xianyong; Tu, Zhengzheng; Wang, Linbo
Format	Journal Article
Language	English
Published	New York: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 2024
Subjects	Adaptation models; Annotations; Convolution; Costs; Datasets; Feature extraction; Field cameras; Image enhancement; Image reconstruction; light field; Machine vision; Mamba; Modelling; multi-focus; Object detection; Object recognition; Salience; salient object detection; SAM; Solid modeling; Stereophotography; Supervised learning; Three-dimensional displays; Transformers; Virtual reality
ISSN	1070-9908 1558-2361
DOI	10.1109/LSP.2024.3493799

Abstract	A light field camera can reconstruct 3D scenes using captured multi-focus images that contain rich spatial geometric information, enhancing applications in stereoscopic photography, virtual reality, and robotic vision. In this work, a state-of-the-art salient object detection model for multi-focus light field images, called LFSamba, is introduced to emphasize four main insights: (a) Efficient feature extraction, where SAM is used to extract modality-aware discriminative features; (b) Inter-slice relation modeling, leveraging Mamba to capture long-range dependencies across multiple focal slices, thus extracting implicit depth cues; (c) Inter-modal relation modeling, utilizing Mamba to integrate all-focus and multi-focus images, enabling mutual enhancement; (d) Weakly supervised learning capability, developing a scribble annotation dataset from an existing pixel-level mask dataset, establishing the first scribble-supervised baseline for light field salient object detection.
Author	Tu, Zhengzheng; Liu, Zhengyi; Wang, Longzhen; Wang, Linbo; Fang, Xianyong
Author_xml	1. Liu, Zhengyi (ORCID 0000-0003-3265-823X; liuzywen@ahu.edu.cn); 2. Wang, Longzhen (ORCID 0009-0007-3278-3088; 1774537072@qq.com); 3. Fang, Xianyong (ORCID 0000-0002-6045-8430; fangxianyong@ahu.edu.cn); 4. Tu, Zhengzheng (ORCID 0000-0002-9689-8657; zhengzhengahu@163.com); 5. Wang, Linbo (ORCID 0000-0001-7276-7065; wanglb@ahu.edu.cn). All authors: School of Computer Science and Technology, Anhui University, Hefei, China
CODEN	ISPLEM
Cites_doi	10.1109/LSP.2024.3374079 10.1109/TPAMI.2023.3235415 10.1109/TIP.2020.2990341 10.1109/ICME55011.2023.00404 10.1109/tcsvt.2024.3437685 10.1109/CVPR.2017.404 10.1109/LSP.2024.3383798 10.1609/aaai.v34i07.6860 10.1109/TIP.2022.3207605 10.1109/LSP.2020.3044544 10.1016/j.imavis.2022.104595 10.1109/LSP.2023.3291311 10.1145/3107956 10.1109/CVPR52688.2022.00180 10.1609/aaai.v35i4.16434 10.1109/TCSVT.2023.3281465 10.1016/j.imavis.2021.104352 10.1016/j.neucom.2022.03.056 10.1109/ICME55011.2023.00407 10.1109/CVPR.2014.359 10.48550/arXiv.2010.11929 10.1109/ICCV.2019.00893 10.1109/TCYB.2021.3095512 10.1109/CVPR.2019.00623 10.1109/LSP.2023.3342613 10.24963/ijcai.2019/127 10.1109/ICCV48922.2021.00467 10.1109/CVPR42600.2020.01256 10.1109/TMM.2023.3274933 10.1109/ICCV51070.2023.00371 10.1109/TIP.2021.3071691
ContentType	Journal Article
Copyright	Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2024
Copyright_xml	– notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2024
DBID	97E RIA RIE AAYXX CITATION 7SC 7SP 8FD JQ2 L7M L~C L~D
DOI	10.1109/LSP.2024.3493799
DatabaseName	IEEE Xplore (IEEE) IEEE All-Society Periodicals Package (ASPP) 1998–Present IEEE Electronic Library (IEL) CrossRef Computer and Information Systems Abstracts Electronics & Communications Abstracts Technology Research Database ProQuest Computer Science Collection Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional
DatabaseTitle	CrossRef Technology Research Database Computer and Information Systems Abstracts – Academic Electronics & Communications Abstracts ProQuest Computer Science Collection Computer and Information Systems Abstracts Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Professional
DatabaseTitleList	Technology Research Database
Database_xml	– sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher
DeliveryMethod	fulltext_linktorsrc
Discipline	Engineering
EISSN	1558-2361
EndPage	3148
ExternalDocumentID	10_1109_LSP_2024_3493799 10747120
Genre	orig-research
GrantInformation_xml	– fundername: National Natural Science Foundation of China grantid: 62376005 funderid: 10.13039/501100001809
ISSN	1070-9908
IngestDate	Mon Jun 30 12:44:38 EDT 2025 Wed Oct 01 03:03:35 EDT 2025 Wed Aug 27 03:06:43 EDT 2025
IsPeerReviewed	true
IsScholarly	true
Language	English
License	https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037
ORCID	0000-0003-3265-823X 0000-0002-6045-8430 0009-0007-3278-3088 0000-0001-7276-7065 0000-0002-9689-8657
PQID	3130932624
PQPubID	75747
PageCount	5
ParticipantIDs	proquest_journals_3130932624 ieee_primary_10747120 crossref_primary_10_1109_LSP_2024_3493799
ProviderPackageCode	CITATION AAYXX
PublicationCentury	2000
PublicationDate	20240000 2024-00-00 20240101
PublicationDateYYYYMMDD	2024-01-01
PublicationDate_xml	– year: 2024 text: 20240000
PublicationDecade	2020
PublicationPlace	New York
PublicationPlace_xml	– name: New York
PublicationTitle	IEEE signal processing letters
PublicationTitleAbbrev	LSP
PublicationYear	2024
Publisher	IEEE The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml	– name: IEEE – name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
SSID	ssj0008185
Score	2.4340384
SourceID	proquest crossref ieee
SourceType	Aggregation Database Index Database Publisher
StartPage	3144
Title	LFSamba: Marry SAM With Mamba for Light Field Salient Object Detection
URI	https://ieeexplore.ieee.org/document/10747120 https://www.proquest.com/docview/3130932624
Volume	31
hasFullText	1
inHoldings	1
isFullTextHit
isPrint
journalDatabaseRights	– providerCode: PRVIEE databaseName: IEEE Electronic Library (IEL) customDbUrl: eissn: 1558-2361 dateEnd: 99991231 omitProxy: false ssIdentifier: ssj0008185 issn: 1070-9908 databaseCode: RIE dateStart: 19940101 isFulltext: true titleUrlDefault: https://ieeexplore.ieee.org/ providerName: IEEE