Big Data Analytics and Knowledge Discovery: 19th International Conference, DaWaK 2017, Lyon, France, August 28-31, 2017, Proceedings

This book constitutes the refereed proceedings of the 19th International Conference on Big Data Analytics and Knowledge Discovery, DaWaK 2017, held in Lyon, France, in August 2017. The 24 revised full papers and 11 short papers presented were carefully reviewed and selected from 97 submissions. The p...


Bibliographic Details
Main Authors	Bellatreche, Ladjel; Chakravarthy, Sharma
Format	eBook Book
Language	English
Published	Cham Springer Nature 2017 Springer Springer International Publishing AG
Edition	1
Series	LNCS sublibrary. SL 3, Information systems and applications, incl. Internet/Web, and HCI
Subjects	Big data; Big data > Congresses; Data mining; Data mining > Congresses; Database management; Database management > Congresses; Special computer methods
ISBN	3319642839 9783319642833 3319642820 9783319642826

Abstract	This book constitutes the refereed proceedings of the 19th International Conference on Big Data Analytics and Knowledge Discovery, DaWaK 2017, held in Lyon, France, in August 2017. The 24 revised full papers and 11 short papers presented were carefully reviewed and selected from 97 submissions. The papers are organized in the following topical sections: new generation data warehouses design; cloud and NoSQL databases; advanced programming paradigms; non-functional requirements satisfaction; machine learning; social media and Twitter analysis; sentiment analysis and user influence; knowledge discovery; and data flow management and optimization.
Author	Bellatreche, Ladjel; Chakravarthy, Sharma
Author_xml	– sequence: 1 fullname: Bellatreche, Ladjel – sequence: 2 fullname: Chakravarthy, Sharma
BackLink	https://cir.nii.ac.jp/crid/1130282271621980672$$DView record in CiNii
ContentType	eBook Book
DBID	I4C RYH
DEWEY	005.745
DatabaseName	Casalini Torrossa eBooks Institutional Catalogue CiNii Complete
DatabaseTitleList
DeliveryMethod	fulltext_linktorsrc
Discipline	Computer Science
EISBN	3319642839 9783319642833
Edition	1
Editor	Bellatreche, Ladjel; Chakravarthy, Sharma
Editor_xml	– sequence: 1 fullname: Bellatreche, Ladjel – sequence: 2 fullname: Chakravarthy, Sharma
ExternalDocumentID	9783319642833 EBC6302265 EBC5578375 BB24760848 5416525
GroupedDBID	0D6 0DA 38. AABBV AALVI ABBVZ ABHTH ABQUB ACDJR ADCXD AEDXK AEKFX AEZAY AGIGN AGYGE AIODD ALBAV ALMA_UNASSIGNED_HOLDINGS AZZ BATQV BBABE CVWCR CZZ I4C IEZ LDH NUC SAO SBO SWYDZ TPJZQ TSXQS Z7R Z7U Z7W Z7X Z7Y Z7Z Z81 Z83 Z84 Z88 AEJLV RYH Z85
ID	FETCH-LOGICAL-a21831-cbbd119d11a6d672b14ba97844cbc204a422b634afb0cb3f920065e15724c2d93
ISBN	3319642839 9783319642833 3319642820 9783319642826
IngestDate	Fri Nov 08 04:09:20 EST 2024 Fri May 30 22:02:37 EDT 2025 Fri May 30 23:21:38 EDT 2025 Fri Jun 27 00:06:32 EDT 2025 Tue Nov 14 22:51:53 EST 2023
IsPeerReviewed	false
IsScholarly	false
LCCN	2017947764
LCCallNum_Ident	Q
Language	English
LinkModel	OpenURL
MergedId	FETCHMERGED-LOGICAL-a21831-cbbd119d11a6d672b14ba97844cbc204a422b634afb0cb3f920065e15724c2d93
Notes	"LNCS sublibrary: SL3 - information systems and applications, incl. Internet/Web and HCI"--T.p. verso. Includes bibliographical references and index. "DEXA DAWAK 17"--P. [1] of cover.
OCLC	1001287045
PQID	EBC5578375
PageCount	489
ParticipantIDs	askewsholts_vlebooks_9783319642833 proquest_ebookcentral_EBC6302265 proquest_ebookcentral_EBC5578375 nii_cinii_1130282271621980672 casalini_monographs_5416525
PublicationCentury	2000
PublicationDate	2017 c2017 2017-09-13
PublicationDateYYYYMMDD	2017-01-01 2017-09-13
PublicationDate_xml	– year: 2017 text: 2017
PublicationDecade	2010
PublicationPlace	Cham
PublicationPlace_xml	– name: Netherlands – name: Cham
PublicationSeriesTitle	LNCS sublibrary. SL 3, Information systems and applications, incl. Internet/Web, and HCI
PublicationYear	2017
Publisher	Springer Nature Springer Springer International Publishing AG
Publisher_xml	– name: Springer Nature – name: Springer – name: Springer International Publishing AG
SSID	ssj0001875988
Score	2.1433089
Snippet	This book constitutes the refereed proceedings of the 19th International Conference on Big Data Analytics and Knowledge Discovery, DaWaK 2017, held in Lyon,...
SourceID	askewsholts proquest nii casalini
SourceType	Aggregation Database Publisher
SubjectTerms	Big data; Big data -- Congresses; Data mining; Data mining -- Congresses; Database management; Database management -- Congresses; Special computer methods
Subtitle	19th International Conference, DaWaK 2017, Lyon, France, August 28-31, 2017, Proceedings
TableOfContents	3.1 Verifiable Secret Sharing -- 3.2 Order-Preserving Secret Sharing -- 3.3 Discussion -- 4 Index-Based Methods -- 4.1 Bucketization-Based Indexing -- 4.2 Order-Preserving Indexing -- 4.3 Searchable Encryption -- 4.4 Discussion -- 5 Secure Databases -- 5.1 CryptDB -- 5.2 MONOMI -- 5.3 Multi-valued Order Preserving Encryption (MV-OPE) -- 5.4 Secure Trusted Hardware -- 5.5 Discussion -- 6 Conclusion -- 6.1 Security -- 6.2 Query Post-processing -- 6.3 Storage Overhead -- 6.4 Computational Overhead -- 6.5 Wrap-up -- References -- TARDIS: Optimal Execution of Scientific Workflows in Apache Spark -- 1 Introduction -- 2 Problem Definition -- 3 Background -- 3.1 Spark -- 4 TARDIS Engine -- 4.1 Architecture -- 4.2 TARDIS Language -- 4.3 Data Placement -- 4.4 Scheduling -- 4.5 Collecting Output Files -- 5 Experiments -- 6 Conclusion -- References -- MDA-Based Approach for NoSQL Databases Modelling -- Abstract -- 1 Introduction -- 2 Research Problem and Related Work -- 3 UMLtoNoSQL Approach -- 3.1 UMLtoGenericModel Transformation -- 3.2 GenericModeltoPhysicalModel Transformation -- 4 Experiments -- 4.1 Implementation -- 4.2 Evaluation -- 5 Conclusion and Future Work -- References -- Advanced Programming Paradigms -- MiSeRe-Hadoop: A Large-Scale Robust Sequential Classification Rules Mining Framework -- 1 Introduction -- 2 Preliminaries -- 3 MiSeRe Algorithm -- 4 MiSeRe Hadoop Algorithm -- 4.1 Step I: -- 4.2 Step II: -- 5 Experiments -- 6 Conclusion and Future Work -- References -- An Efficient Map-Reduce Framework to Mine Periodic Frequent Patterns -- 1 Introduction -- 2 Background -- 2.1 Mining Periodic-Frequent Patterns on a Single Machine -- 2.2 Mining PFPs with Period Summary -- 2.3 Map-Reduce Framework -- 2.4 Parallel FP-growth -- 3 Proposed Approaches -- 3.1 Parallel Periodic Frequent Pattern Growth (PPF-growth) -- 3.2 PPF-growth Using Partition Summary -- 5 Conclusion -- References -- Modeling Data Flow Execution in a Parallel Environment -- 1 Introduction -- 1.1 Parallelizing Data Flows -- 1.2 Assumptions Regarding a Single Multi-core Machine Execution Environment -- 1.3 Motivation for Devising a New Cost Model -- 2 Other Related Work -- 3 Preliminaries -- 4 Our Cost Model -- 4.1 A Generalized Cost Model for Response Time -- 4.2 Models Without Considering the Communication Cost -- 4.3 Considering Communication Costs -- 4.4 Considering Partitioned Parallelism -- 5 Conclusions and Future Work -- References -- Machine Learning -- Accelerating K-Means by Grouping Points Automatically -- Abstract -- 1 Introduction -- 2 Related Work -- 3 Proposed Method -- 3.1 The Framework of Our Algorithm -- 3.2 Filtering for Clusters of Points -- 3.3 Fission Step: Grouping Points Automatically -- 3.4 Filtering for Groups of Points -- 3.5 Fusion Step: Limiting the Increasing Number of Groups -- 3.6 Algorithm -- 4 Experiment and Analysis -- 4.1 Experiment Design -- 4.2 Cost Comparison and Relative Speedup -- 4.3 Separability -- 4.4 Avoided Distance Calculations -- 5 Conclusion and Future Work -- References -- A Machine Learning Trainable Model to Assess the Accuracy of Probabilistic Record Linkage -- 1 Introduction -- 2 Related Work -- 3 Assessing the Accuracy of Record Linkage -- 4 Machine Learning Algorithms -- 4.1 Decision Trees -- 4.2 Gradient Boosted Trees -- 4.3 Random Forests -- 4.4 Naïve Bayes -- 4.5 Linear Support Vector Machine -- 4.6 Logistic Regression -- 4.7 Comparative Analysis -- 5 Proposed Trainable Model -- 5.1 Pre-processing -- 5.2 Transformation -- 5.3 Model Selection -- 5.4 Model Execution -- 6 Experimental Results -- 7 Conclusions and Future Work -- References -- An Efficient Approach for Instance Selection -- 1 Introduction -- 2 Related Works -- 3 Notations -- 4 The XLDIS Algorithm -- 5 Experiments -- 6 Conclusion -- References -- Search Result Personalization in Twitter Using Neural Word Embeddings -- 1 Introduction -- 2 Related Work -- 2.1 Twitter Search -- 2.2 Personalized Twitter Search -- 3 Our Approach -- 3.1 User Modeling -- 3.2 Results Re-ranking -- 4 Evaluation -- 4.1 Twitter Lists Based Evaluation -- 4.2 Hashtags Based Evaluation -- 5 Conclusions -- References -- Diverse Selection of Feature Subsets for Ensemble Regression -- 1 Introduction -- 2 Related Work -- 3 Diverse Subset Selection Strategy (DS3) -- 3.1 Problem Overview -- 3.2 Solution Overview -- 3.3 Relevance Based Generation of Initial Candidates -- 3.4 Multiple Feature Sets Based on Difference and Quality -- 3.5 Unifying Multiple Subsets by Ensemble Regression -- 3.6 Time Complexity -- 4 Experiments -- 4.1 Synthetic Data Sets -- 4.2 Real-World Data Sets -- 4.3 Parameter Analysis -- 4.4 Iterations -- 5 Conclusions -- References -- K-Means Clustering Using Homomorphic Encryption and an Updatable Distance Matrix: Secure Third Party Data Clustering with Limited Data Owner Interaction -- 1 Introduction -- 2 Related Work -- 3 Preliminaries -- 3.1 K-Means Clustering -- 3.2 Liu's Homomorphic Encryption Scheme -- 4 The Updatable Distance Matrix Concept -- 5 Secure K-Means Clustering Using the UDM Concept -- 5.1 Data Owner Process -- 5.2 Third Party Process -- 6 Evaluation -- 7 Conclusion -- References -- Reweighting Forest for Extreme Multi-label Classification -- Abstract -- 1 Introduction -- 2 Related Work -- 3 Proposed Method -- 3.1 Problem Definition and Proposed Framework -- 3.2 The Reweighting Phase -- 3.3 The Pretesting Phase -- 4 Experiments -- 4.1 Experimental Setup -- 4.2 Experimental Results -- 5 Conclusion -- References -- Social Media and Twitter Analysis -- A Relativistic Opinion Mining Approach to Detect Factual or Opinionated News Sources -- 1 Introduction -- 4 Performance Evaluation -- 5 Conclusion -- References -- MapReduce-Based Complex Big Data Analytics over Uncertain and Imprecise Social Networks -- 1 Introduction and Related Work -- 2 Background: Data Science -- 3 Mining Complex Big Data in Uncertain and Imprecise Social Networks -- 3.1 Interdependencies Between Followers and Followees in Complex Big Social Networks -- 3.2 Discovery of Popular Followees -- 3.3 The First Set of MapReduce Functions in BigUISN -- 3.4 The Second Set of MapReduce Functions in BigUISN -- 3.5 Beyond the Second Set of MapReduce Functions in BigUISN -- 4 Evaluation, Observations, and Discussion -- 5 Conclusions and Future Work -- References -- Non-functional Requirements Satisfaction -- A Case for Abstract Cost Models for Distributed Execution of Analytics Operators -- 1 Introduction -- 2 Piecewise Linear Model Structure and Training -- 3 Makespan Model for Sorting -- 3.1 Round-Time Estimation for Map and Reduce Phase -- 3.2 Exploiting Model Structure for Optimization -- 4 Dense Matrix Product -- 4.1 Makespan Model for Block-Wise Matrix Multiplication -- 4.2 Optimal Partitioning -- 5 Experiments -- 5.1 Basic Setup -- 5.2 Sorting -- 5.3 Matrix Multiplication -- 6 Related Work -- 7 Conclusions -- References -- Pre-processing and Indexing Techniques for Constellation Queries in Big Data -- 1 Introduction -- 2 Related Works -- 3 Problem Formulation -- 4 CQ Processing -- 4.1 Query Pre-processing -- 4.2 Query Transformation -- 4.3 Dataset Pre-processing -- 5 Experiments -- 5.1 Query Pre-processing -- 5.2 PH-tree Versus Quad-Tree -- 6 Conclusion -- References -- A Lightweight Elastic Queue Middleware for Distributed Streaming Pipeline -- 1 Introduction -- 2 Elastic Queue Middleware -- 2.1 The Role of EQM in Elastic Streaming Processing Engines -- 2.2 Implementing EQM Based on HBase -- 3 Experiments -- 4 Related Work -- Intro -- Preface -- Organization -- Contents -- New Generation Data Warehouses Design -- Evaluation of Data Warehouse Design Methodologies in the Context of Big Data -- Abstract -- 1 Introduction -- 2 Methodology Classification -- 3 Metrics for Design Evaluation of Methodologies -- 3.1 Metrics for Methodology Evaluation -- 3.2 Metrics for Schema Quality Evaluation -- 4 Experimental Results -- 4.1 Methodology Evaluation -- 4.2 Schema Evaluation -- 5 Conclusion -- References -- Optimal Task Ordering in Chain Data Flows: Exploring the Practicality of Non-scalable Solutions -- 1 Introduction -- 2 Preliminaries -- 2.1 Problem Complexity -- 2.2 Chains in TPC-DI -- 3 Accurate Algorithms for Linear Execution Plans -- 3.1 Backtracking -- 3.2 Dynamic Programming -- 3.3 Topological Sorting -- 4 Evaluation of the Time Overhead -- 5 Related Work -- 6 Conclusions -- References -- Exploiting Mathematical Structures of Statistical Measures for Comparison of RDF Data Cubes -- 1 Introduction -- 2 Model and Data Representation -- 3 Structural Comparison of RDF Data Cubes -- 3.1 Computability and Comparability -- 3.2 Comparison Functionalities -- 3.3 Experimentation -- 4 Conclusion -- References -- S2D: Shared Distributed Datasets, Storing Shared Data for Multiple and Massive Queries Optimization in a Distributed Data Warehouse -- 1 Introduction -- 2 Related Work -- 3 Overview of Shared Distributed Datasets -- 3.1 Phase 1: The Logical Representation -- 3.2 Phase 2: The Physical Representation -- 4 Experimental Evaluation -- 4.1 Experimental Setup -- 4.2 Experimental Results and Discussion -- 5 Conclusion and Future Work -- References -- Cloud and NoSQL Databases -- Enforcing Privacy in Cloud Databases -- 1 Introduction -- 2 Non-cryptographic Methods -- 2.1 Differential Privacy -- 2.2 Data Anonymization -- 2.3 Data Fragmentation -- 3 Secret Sharing-Based Methods -- 2 Related Work
Title	Big Data Analytics and Knowledge Discovery
URI	http://digital.casalini.it/9783319642833 https://cir.nii.ac.jp/crid/1130282271621980672 https://ebookcentral.proquest.com/lib/[SITE_ID]/detail.action?docID=5578375 https://ebookcentral.proquest.com/lib/[SITE_ID]/detail.action?docID=6302265 https://www.vlebooks.com/vleweb/product/openreader?id=none&isbn=9783319642833
Volume	10440
hasFullText	1
inHoldings	1
isFullTextHit
isPrint
link	http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwnV3dT9swELdG9zJe2KdWGMia9tZkih07afZGoROCaS9jwFvkj5R2Q61EUiT4A_Z3c-ckTum-tD3USlLrYvl3zn347kzIO85BxgmZhpFVRSi0YiGoITo0BqS7MDpm1kX5fk6OvorjC3nRHavosksq_d7c_TKv5H9QhWeAK2bJ_gOynig8gGvAF1pAGNo15dffNtuvs8sBxnYOFNYU8ZWWvYMM910MBmfeNiZ_NV3z_XWZfnVk-7k6GXBXDAns9NuFV2rr__eXl8uyGnB0cuI99nSUO_nndfPfvQe3JQ5xzOfqupgulmWbHnniR33Yjto7CTA2q4KP8rQ-D1rZb4UPCakjEm6A92tO-eKqcK-6MVi65sZo3ZgPzNvYlQsDo3CtWLYTv6MRF2mCxwFskI00BcP78f74-NNZ52EDUywbDjGhp6XTFPnq6G6STVV-B3ECoqYqUTdRpcKUVFA15rPZTwLaaR2nT0kPM1GekUfF_DnZas_foM3n-AX5AWxAkQ2oZwO4stSzAfVsQD9QZAL6ABzagRNQxwIU5yygyAABreEPaA0-deAHrgdQWwH-JTn7OD49OAqbczVChQoxC43WlrEMfiqxSco1g0UK8yJgcRoeCSU410ks1ERHsFwnGfqdZMFkyoXhNotfkd58MS9eY8p_FtkIyBkhRGLiDE1ik9lJEhWKF7ZP3q5McH5z5WIAynwFhTjuk5123nNYonWt9jKXYCxILvtkF6DIzQxbhhvtoNRi4TOWDTGQoE9oC1LuqDdhzfl4dCBBIMWp_FOXBAjyRG7_5S075EnHtm9Ir7peFrugj1Z6r2G8e0Yvg6s
linkProvider	Library Specific Holdings
openUrl	ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=book&rft.title=Big+data+analytics+and+knowledge+discovery+%3A+19th+International+Conference%2C+DaWaK+2017%2C+Lyon%2C+France%2C+August+28-31%2C+2017+%3A+proceedings&rft.au=International+Conference+on+Data+Warehousing+and+Knowledge+Discovery&rft.au=Bellatreche%2C+Ladjel&rft.au=Chakravarthy%2C+Sharma&rft.date=2017-01-01&rft.pub=Springer&rft.isbn=9783319642826&rft.externalDocID=BB24760848
thumbnail_m	http://utb.summon.serialssolutions.com/2.0.0/image/custom?url=https%3A%2F%2Fvle.dmmserver.com%2Fmedia%2F640%2F97833196%2F9783319642833.jpg