SUN RGB-D: A RGB-D scene understanding benchmark suite

Bibliographic Details
Published in 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 567-576
Main Authors Shuran Song, Samuel P. Lichtenberg, Jianxiong Xiao
Format Conference Proceeding, Journal Article
Language English
Published IEEE, 01.06.2015
ISSN 1063-6919
DOI 10.1109/CVPR.2015.7298655

Abstract Although RGB-D sensors have enabled major breakthroughs in several vision tasks, such as 3D reconstruction, we have not attained the same level of success in high-level scene understanding. Perhaps one of the main reasons is the lack of a large-scale benchmark with 3D annotations and 3D evaluation metrics. In this paper, we introduce an RGB-D benchmark suite with the goal of advancing the state of the art in all major scene understanding tasks. Our dataset is captured by four different sensors and contains 10,335 RGB-D images, a scale similar to that of PASCAL VOC. The whole dataset is densely annotated and includes 146,617 2D polygons and 64,595 3D bounding boxes with accurate object orientations, as well as a 3D room layout and scene category for each image. This dataset enables us to train data-hungry algorithms for scene understanding tasks, evaluate them using meaningful 3D metrics, avoid overfitting to a small testing set, and study cross-sensor bias.
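To give a feel for how dense the annotation is, the counts quoted in the abstract can be turned into per-image averages. The following is a minimal sketch that uses only the three figures stated above; the constant and function names are illustrative and are not taken from the paper or from any dataset toolkit.

```python
# Counts quoted from the abstract; names are hypothetical, chosen for this sketch.
NUM_IMAGES = 10_335        # RGB-D images in the dataset
NUM_2D_POLYGONS = 146_617  # annotated 2D polygons
NUM_3D_BOXES = 64_595      # annotated 3D bounding boxes


def per_image(total: int, images: int = NUM_IMAGES) -> float:
    """Return the average number of annotations per image."""
    return total / images


polygons_per_image = per_image(NUM_2D_POLYGONS)
boxes_per_image = per_image(NUM_3D_BOXES)

print(f"2D polygons per image: {polygons_per_image:.2f}")
print(f"3D boxes per image:    {boxes_per_image:.2f}")
```

On average this works out to roughly 14 polygons and 6 oriented 3D boxes per image. These are plain means; the abstract does not say how annotations are distributed across images or sensors.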
Author Shuran Song (Princeton Univ., Princeton, NJ, USA)
Samuel P. Lichtenberg (Princeton Univ., Princeton, NJ, USA)
Jianxiong Xiao (Princeton Univ., Princeton, NJ, USA)
Discipline Applied Sciences
Computer Science
EISBN 1467369640
9781467369640
SubjectTerms Benchmark testing
Benchmarking
Cameras
Computer vision
Estimation
Iterative closest point algorithm
Layout
Pattern recognition
Scene analysis
Sensors
Tasks
Three dimensional
Three-dimensional displays
URI https://ieeexplore.ieee.org/document/7298655
https://www.proquest.com/docview/1770307753