Goal-based composition of scalable hybrid analytics for heterogeneous architectures

Bibliographic Details
Published in: Journal of Parallel and Distributed Computing, Vol. 108, pp. 59-73
Main Authors: Coetzee, P.; Jarvis, S.A.
Format: Journal Article
Language: English
Published: Elsevier Inc, 01.10.2017
ISSN: 0743-7315 (print), 1096-0848 (electronic)
DOI: 10.1016/j.jpdc.2016.11.009

Abstract

Crafting scalable analytics in order to extract actionable business intelligence is a challenging endeavour, requiring multiple layers of expertise and experience. Often, this expertise is irreconcilably split between an organisation’s engineers and subject matter domain experts. Previous approaches to this problem have relied on technically adept users with tool-specific training. Such an approach has a number of challenges:

• Expertise: there are few data-analytic subject domain experts with in-depth technical knowledge of compute architectures.
• Performance: analysts do not generally make full use of the performance and scalability capabilities of the underlying architectures.
• Heterogeneity: calculating the most performant and scalable mix of real-time (on-line) and batch (off-line) analytics in a problem domain is difficult.
• Tools: supporting frameworks often handle several tasks, including composition, planning, code generation, validation, performance tuning and analysis, but do not typically provide end-to-end solutions embedding all of these activities.

In this paper, we present a novel semi-automated approach to the composition, planning, code generation and performance tuning of scalable hybrid analytics, using a semantically rich type system which requires little programming expertise from the user. This approach is the first of its kind to permit domain experts with little or no technical expertise to assemble complex and scalable analytics, for hybrid on- and off-line analytic environments, with no additional requirement for low-level engineering support. This paper describes (i) an abstract model of analytic assembly and execution, (ii) goal-based planning and (iii) code generation for hybrid on- and off-line analytics. An implementation, through a system which we call Mendeleev, is used to (iv) demonstrate the applicability of this technique through a series of case studies, where a single interface is used to create analytics that can be run simultaneously over on- and off-line environments. Finally, we (v) analyse the performance of the planner, and (vi) show that the performance of Mendeleev’s generated code is comparable with that of hand-written analytics.

Highlights:

• A new abstract model of assembly and execution for arbitrary analytics, centred around a semantically rich type system.
• Goal-based planning of hybrid analytic applications using this abstract model, requiring little programming ability from the user.
• Automatic code generation across scalable compute architectures, integrating heterogeneous on- and off-line runtime environments.
• Validation of the planning approach through its application to four case studies in telecommunications and image analysis, including an exploration of the performance and scalability of the planning engine for each of these case studies.
• A demonstration of comparable performance with equivalent hand-written alternatives in both on- and off-line runtime environments.
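
The idea of goal-based planning over a type system, as the abstract describes it, can be pictured with a small sketch. This is not Mendeleev's planner or type model, whose details this record does not give. It assumes a much simpler setting in which each processing element consumes and emits plain type names, and a breadth-first search finds a shortest chain of elements that produces a requested goal type. Every name here (`Element`, `plan`, `parse_cdr`, `cell_load` and so on) is hypothetical and chosen only for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """A hypothetical processing element: consumes some types, emits others."""
    name: str
    consumes: frozenset
    emits: frozenset


def plan(elements, available, goal):
    """Breadth-first search for a shortest chain of elements that yields `goal`.

    `available` is the set of type names the data sources already provide.
    Returns a list of element names, or None if the goal is unreachable.
    """
    start = frozenset(available)
    if goal in start:
        return []
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        types, steps = queue.popleft()
        for el in elements:
            # Skip elements whose inputs are missing or that add nothing new.
            if not el.consumes <= types or el.emits <= types:
                continue
            new_types = types | el.emits
            new_steps = steps + [el.name]
            if goal in new_types:
                return new_steps
            if new_types not in seen:
                seen.add(new_types)
                queue.append((new_types, new_steps))
    return None


# Illustrative element library (assumed, not taken from the paper).
ELEMENTS = [
    Element("parse_cdr", frozenset({"raw_record"}), frozenset({"call_event"})),
    Element("geolocate", frozenset({"call_event"}), frozenset({"cell_location"})),
    Element("count_by_cell", frozenset({"cell_location"}), frozenset({"cell_load"})),
    Element("extract_text", frozenset({"image"}), frozenset({"text"})),
]

if __name__ == "__main__":
    print(plan(ELEMENTS, {"raw_record"}, "cell_load"))
```

In this toy setting the user states only the goal type (`cell_load`) and the planner works out the chain of elements. A real system of the kind the abstract describes would additionally need richer type semantics and a code generation step for each target runtime, neither of which is modelled here.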
Authors
Coetzee, P. (ORCID 0000-0001-5488-6896; p.l.coetzee@warwick.ac.uk), Department of Computer Science, University of Warwick, United Kingdom
Jarvis, S.A. (s.a.jarvis@warwick.ac.uk), Department of Computer Science, University of Warwick, United Kingdom
Copyright: 2016 The Author(s)
Discipline: Computer Science
Open Access: yes
Peer Reviewed: yes
Keywords: Streaming analysis; Hybrid analytics; Heterogeneous compute; Hadoop; Data intensive computing; Data science; Analytic planning
License: This is an open access article under the CC BY license.
Open Access Link: https://www.sciencedirect.com/science/article/pii/S0743731516301666