Combining Vertex-Centric Graph Processing with SPARQL for Large-Scale RDF Data Analytics

Bibliographic Details
Published in IEEE Transactions on Parallel and Distributed Systems Vol. 28; no. 12; pp. 3374-3388
Main Authors Abdelaziz, Ibrahim; Harbi, Razen; Salihoglu, Semih; Kalnis, Panos
Format Journal Article
Language English
Published New York IEEE 01.12.2017
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN 1045-9219
EISSN 1558-2183
DOI 10.1109/TPDS.2017.2720174

Abstract Modern applications require sophisticated analytics on RDF graphs that combine structural queries with generic graph computations. Existing systems support either declarative SPARQL queries or generic graph processing, but not both. We bridge the gap by introducing Spartex, a versatile framework for complex RDF analytics. Spartex extends SPARQL so that generic graph algorithms (e.g., PageRank and Shortest Paths) can be combined seamlessly with SPARQL queries. Spartex builds on existing vertex-centric graph processing frameworks, such as GraphLab or Pregel. It implements a generic SPARQL operator as a vertex-centric program that interprets SPARQL queries and executes them efficiently using a built-in optimizer. In addition, any graph algorithm implemented in the underlying vertex-centric framework can be executed in Spartex. We present various scenarios in which our framework significantly simplifies the implementation of complex RDF data analytics programs. We demonstrate that Spartex scales to datasets with billions of edges, and show that our core SPARQL engine is at least as fast as state-of-the-art specialized RDF engines. For complex analytical tasks that combine generic graph processing with SPARQL, Spartex is at least an order of magnitude faster than existing alternatives.
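
The vertex-centric model mentioned in the abstract can be illustrated with a minimal sketch. It is not Spartex's API and not the authors' implementation: the function name `vertex_centric_pagerank`, its parameters, and the toy triples are hypothetical, and the code only simulates Pregel-style supersteps for PageRank on a single machine, treating each RDF triple (subject, predicate, object) as a directed edge from subject to object.

```python
from collections import defaultdict


def vertex_centric_pagerank(edges, supersteps=20, damping=0.85):
    """Simulate Pregel-style PageRank (illustrative, single machine).

    In every superstep each vertex sends an equal share of its rank
    along its out-edges, then recomputes its rank from the messages
    it received.
    """
    out_neighbors = defaultdict(list)
    vertices = set()
    for src, dst in edges:
        out_neighbors[src].append(dst)
        vertices.update((src, dst))

    n = len(vertices)
    rank = {v: 1.0 / n for v in vertices}

    for _ in range(supersteps):
        inbox = defaultdict(float)  # messages received per vertex
        dangling = 0.0              # rank held by vertices without out-edges
        # Message-sending phase: each vertex only looks at its own state.
        for v in vertices:
            targets = out_neighbors.get(v, [])
            if targets:
                share = rank[v] / len(targets)
                for t in targets:
                    inbox[t] += share
            else:
                dangling += rank[v]
        # Update phase: combine incoming messages into the new rank.
        rank = {
            v: (1.0 - damping) / n + damping * (inbox[v] + dangling / n)
            for v in vertices
        }
    return rank


# Hypothetical toy RDF triples; predicates are ignored for ranking.
triples = [
    ("a", "cites", "b"),
    ("b", "cites", "c"),
    ("c", "cites", "a"),
    ("d", "cites", "a"),
]
ranks = vertex_centric_pagerank([(s, o) for s, _, o in triples])
```

The per-vertex loop touches only local state and messages, which is what lets frameworks of this kind distribute the computation across machines. In this sketch the ranks always sum to 1, because rank held by vertices without out-edges is redistributed uniformly.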
Author_xml – sequence: 1
  givenname: Ibrahim
  surname: Abdelaziz
  fullname: Abdelaziz, Ibrahim
  email: brahim.abdelaziz@kaust.edu.sa
  organization: King Abdullah Univ. of Sci. & Technol., Thuwal, Saudi Arabia
– sequence: 2
  givenname: Razen
  surname: Harbi
  fullname: Harbi, Razen
  email: razen.harbi@aramco.com
  organization: Saudi Aramco, Thuwal, Saudi Arabia
– sequence: 3
  givenname: Semih
  surname: Salihoglu
  fullname: Salihoglu, Semih
  email: semih.salihoglu@uwaterloo.ca
  organization: Univ. of Waterloo, Waterloo, ON, Canada
– sequence: 4
  givenname: Panos
  surname: Kalnis
  fullname: Kalnis, Panos
  email: panos.kalnis@kaust.edu.sa
  organization: King Abdullah Univ. of Sci. & Technol., Thuwal, Saudi Arabia
CODEN ITDSEO
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2017
Discipline Engineering
Computer Science
EISSN 1558-2183
EndPage 3388
ExternalDocumentID 10_1109_TPDS_2017_2720174
7959641
Genre orig-research
ISSN 1045-9219
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 12
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
ORCID 0000-0003-1449-5115
PQID 2174467364
PQPubID 85437
PageCount 15
PublicationDate 2017-12-01
PublicationDecade 2010
PublicationPlace New York
PublicationTitle IEEE transactions on parallel and distributed systems
PublicationTitleAbbrev TPDS
PublicationYear 2017
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
StartPage 3374
SubjectTerms Algorithm design and analysis
Algorithms
Analytics
Data analysis
Filtering algorithms
graph analytics
Graphical models
Matched filters
Mathematical analysis
Pattern matching
Query processing
RDF data
Resource description framework
Search engines
SPARQL
State of the art
Task complexity
vertex-centric
Title Combining Vertex-Centric Graph Processing with SPARQL for Large-Scale RDF Data Analytics
URI https://ieeexplore.ieee.org/document/7959641
https://www.proquest.com/docview/2174467364
Volume 28