Trust-Based Scheduling Framework for Big Data Processing with MapReduce
| Published in | IEEE Transactions on Services Computing, Vol. 15, no. 1, pp. 279-293 |
|---|---|
| Main Authors | Dang, Thanh Dat; Hoang, Doan; Nguyen, Diep N. |
| Format | Journal Article |
| Language | English |
| Published | Piscataway: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.01.2022 |
| ISSN | 1939-1374 2372-0204 |
| DOI | 10.1109/TSC.2019.2938959 |
| Abstract | Security and privacy have become a major concern on cloud computing platforms, where users risk leakage of their private data. Leakage can happen while the data is at rest (in storage), in processing, or in transit within a cloud or between cloud infrastructures, e.g., from private to public clouds. This paper focuses on protecting data "in processing". For big data applications, the MapReduce framework has proven to be an efficient solution and has been widely deployed, e.g., in healthcare and business data analysis. In this article, we propose a trust-based framework for MapReduce in big data processing tasks. Specifically, we first quantify and assign sensitivity values to data and trust values to map and reduce slots. We then compute the trust value of each resource employed in the big data processing tasks. Depending on the sensitivity level of a task's data, the task requires a given level of trust (i.e., more sensitive data requires servers/slots with a higher trust level). The MapReduce scheduling problem is then formulated as a maximum weighted matching problem on a bipartite graph that aims to maximize the total trust value over all possible assignments, subject to the trust requirements of the different tasks. The problem is known to be NP-hard. To tackle it, we observe that within a computing node (VM), slots share the same trust value granted from the secured transformation phase. This helps reduce the number of slot nodes in the weighted bipartite graph. Leveraging this fact, we propose an efficient heuristic algorithm that achieves 94.7 percent of the optimal solution obtained via exhaustive search. Extensive simulations show that the trust-based scheduling scheme provides much higher protection for sensitive data while ensuring good performance for big data applications. |
|---|---|
| Author | Dang, Thanh Dat; Nguyen, Diep N.; Hoang, Doan |
| Author details | 1. Dang, Thanh Dat (ORCID 0000-0002-1827-3731; datth22@gmail.com); 2. Hoang, Doan (ORCID 0000-0003-1798-4926; doan.hoang@uts.edu.au); 3. Nguyen, Diep N. (ORCID 0000-0003-2659-8648; diep.nguyen@uts.edu.au). All authors: School of Electrical and Data Engineering, University of Technology Sydney, Ultimo, NSW, Australia |
| CODEN | ITSCAD |
| ContentType | Journal Article |
| Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022 |
| Discipline | Engineering |
| EISSN | 2372-0204 |
| EndPage | 293 |
| Genre | orig-research |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 1 |
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
| ORCID | 0000-0003-1798-4926 0000-0002-1827-3731 0000-0003-2659-8648 |
| PageCount | 15 |
| PublicationDate | 2022-01-01 (Jan.-Feb. 2022) |
| PublicationPlace | Piscataway |
| PublicationTitle | IEEE transactions on services computing |
| PublicationTitleAbbrev | TSC |
| PublicationYear | 2022 |
| Publisher | IEEE; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| StartPage | 279 |
| SubjectTerms | Algorithms; Big Data; Big Data applications; big data security; Cloud computing; Data analysis; Data privacy; Data processing; data sensitive; Graph theory; Heuristic algorithms; Heuristic methods; Leakage; MapReduce; Measurement; Privacy; Processor scheduling; Scheduling; Security; Sensitivity; Task analysis; Trust-aware framework; trust-based scheduling; Trusted computing; Trustworthiness |
| Title | Trust-Based Scheduling Framework for Big Data Processing with MapReduce |
| URI | https://ieeexplore.ieee.org/document/8823036 https://www.proquest.com/docview/2625367422 |
| Volume | 15 |
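
The scheduling idea summarized in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: it is a simple greedy heuristic, assuming each task carries a required trust level derived from its data sensitivity and each VM exposes one trust value shared by all of its slots (the observation the abstract uses to shrink the bipartite graph). The names `greedy_schedule`, `exhaustive_schedule`, and `total_trust`, the 0 to 1 trust scale, and the sample values are all hypothetical. The brute-force function is included only to compare the heuristic against the optimum on a toy instance.

```python
from itertools import product


def total_trust(assignment, vms):
    """Sum of the trust values of the VMs hosting the assigned tasks."""
    return sum(vms[vm][0] for vm in assignment.values())


def greedy_schedule(tasks, vms):
    """Hypothetical heuristic, not the paper's algorithm.

    tasks: {task_id: required_trust}
    vms:   {vm_id: (trust, free_slots)}; all slots of a VM share its trust.
    Returns {task_id: vm_id} for every task that could be placed.
    """
    free = {vm: slots for vm, (_, slots) in vms.items()}
    assignment = {}
    # Place the most demanding tasks first so they are not crowded out.
    for task, need in sorted(tasks.items(), key=lambda kv: -kv[1]):
        eligible = [vm for vm, (trust, _) in vms.items()
                    if trust >= need and free[vm] > 0]
        if not eligible:
            continue  # no sufficiently trusted slot is free; leave unscheduled
        best = max(eligible, key=lambda vm: vms[vm][0])
        assignment[task] = best
        free[best] -= 1
    return assignment


def exhaustive_schedule(tasks, vms):
    """Brute force over every task-to-VM choice (None means unscheduled)."""
    task_ids = list(tasks)
    options = [None] + list(vms)
    best, best_score = {}, -1.0
    for choice in product(options, repeat=len(task_ids)):
        used = {}
        feasible = True
        for task, vm in zip(task_ids, choice):
            if vm is None:
                continue
            used[vm] = used.get(vm, 0) + 1
            # Reject choices that violate a trust requirement or slot capacity.
            if vms[vm][0] < tasks[task] or used[vm] > vms[vm][1]:
                feasible = False
                break
        if not feasible:
            continue
        candidate = {t: v for t, v in zip(task_ids, choice) if v is not None}
        score = total_trust(candidate, vms)
        if score > best_score:
            best, best_score = candidate, score
    return best


# Toy instance with made-up values.
tasks = {"t1": 0.9, "t2": 0.5, "t3": 0.2}
vms = {"vmA": (0.95, 1), "vmB": (0.6, 2)}
plan = greedy_schedule(tasks, vms)
```

Working at the VM level means the search considers one node per VM rather than one per slot, which is what keeps even the brute-force comparison tractable on small instances. On this toy instance the greedy plan and the exhaustive optimum reach the same total trust; that is a property of the example, not a general guarantee.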