Distributed and Distribution-Robust Meta Reinforcement Learning (D$^{2}$-RMRL) for Data Pre-Storage and Routing in Cube Satellite Networks
| Published in | IEEE journal of selected topics in signal processing Vol. 17; no. 1; pp. 128 - 141 |
|---|---|
| Main Authors | Hu, Ye; Wang, Xiaodong; Saad, Walid |
| Format | Journal Article |
| Language | English |
| Published | New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.01.2023 |
| Subjects | |
| ISSN | 1932-4553 (print); 1941-0484 (electronic) |
| DOI | 10.1109/JSTSP.2022.3232944 |
| Abstract | In this paper, the problem of data pre-storage and routing in dynamic, resource-constrained cube satellite networks is studied. In such a network, each cube satellite delivers requested data to user clusters under its coverage. A group of ground gateways will route and pre-store certain data to the satellites, such that the ground users can be directly served with the pre-stored data. This pre-storage and routing design problem is formulated as a decentralized Markov decision process (Dec-MDP) in which we seek to find the optimal strategy that maximizes the pre-store hit rate, i.e., the fraction of users being directly served with the pre-stored data. To obtain the optimal strategy, a distributed distribution-robust meta reinforcement learning (D$^{2}$-RMRL) algorithm is proposed that consists of three key ingredients: value decomposition for achieving the global optimum in a distributed setting with minimum communication overhead, meta learning to obtain an optimal initialization that reduces the training time under dynamic conditions, and pre-training to further speed up the meta-training procedure. Simulation results show that, using the proposed value decomposition and meta-training techniques, the satellite networks can achieve a 31.8% improvement in pre-store hits and a 40.7% improvement in convergence speed, compared to a baseline reinforcement learning algorithm. Moreover, the proposed pre-training mechanism shortens the meta-learning procedure by up to 43.7%. |
|---|---|
| Author | Wang, Xiaodong; Saad, Walid; Hu, Ye |
| Author_xml | 1. Hu, Ye (ORCID: 0000-0001-9872-5461; email: yh3453@columbia.edu; Department of Electrical Engineering, Columbia University, New York, NY, USA). 2. Wang, Xiaodong (ORCID: 0000-0002-2945-9240; email: xw2008@columbia.edu; Department of Electrical Engineering, Columbia University, New York, NY, USA). 3. Saad, Walid (ORCID: 0000-0003-2247-2458; email: walids@vt.edu; Wireless@VT, Bradley Department of Electrical and Computer Engineering, Virginia Tech, Blacksburg, VA, USA). |
| CODEN | IJSTGY |
| CitedBy_id | crossref_primary_10_1109_JSAC_2024_3460086 crossref_primary_10_1109_TAES_2024_3438681 |
| ContentType | Journal Article |
| Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2023 |
| DatabaseName | IEEE All-Society Periodicals Package (ASPP) 2005–Present IEEE All-Society Periodicals Package (ASPP) 1998–Present IEEE Electronic Library (IEL) CrossRef Electronics & Communications Abstracts Technology Research Database Aerospace Database Advanced Technologies Database with Aerospace |
| DatabaseTitle | CrossRef Aerospace Database Technology Research Database Advanced Technologies Database with Aerospace Electronics & Communications Abstracts |
| DatabaseTitleList | Aerospace Database |
| Discipline | Engineering |
| EISSN | 1941-0484 |
| EndPage | 141 |
| ExternalDocumentID | 10_1109_JSTSP_2022_3232944 10004943 |
| Genre | orig-research |
| GrantInformation_xml | – fundername: U.S. National Science Foundation grantid: CNS-1909372 |
| ISSN | 1932-4553 |
| IngestDate | Mon Jun 30 10:18:31 EDT 2025 Wed Oct 01 03:34:40 EDT 2025 Thu Apr 24 23:03:25 EDT 2025 Wed Aug 27 02:54:06 EDT 2025 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 1 |
| Language | English |
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
| ORCID | 0000-0001-9872-5461 0000-0002-2945-9240 0000-0003-2247-2458 |
| PageCount | 14 |
| PublicationCentury | 2000 |
| PublicationDate | 2023-Jan. 2023-1-00 20230101 |
| PublicationDateYYYYMMDD | 2023-01-01 |
| PublicationDate_xml | – month: 01 year: 2023 text: 2023-Jan. |
| PublicationDecade | 2020 |
| PublicationPlace | New York |
| PublicationPlace_xml | – name: New York |
| PublicationTitle | IEEE journal of selected topics in signal processing |
| PublicationTitleAbbrev | JSTSP |
| PublicationYear | 2023 |
| Publisher | IEEE; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Publisher_xml | – name: IEEE – name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| SourceID | proquest crossref ieee |
| SourceType | Aggregation Database Enrichment Source Index Database Publisher |
| StartPage | 128 |
| SubjectTerms | Actor-critic; Algorithms; cube satellite network; data pre-storage; Decomposition; Heuristic algorithms; Logic gates; Machine learning; Markov processes; meta learning; multi-agent reinforcement learning; Optimization; Robustness; Routing; Satellite communication; Satellite networks; Satellites; Task analysis; Training; value decomposition |
| Title | Distributed and Distribution-Robust Meta Reinforcement Learning (D$^{2}$-RMRL) for Data Pre-Storage and Routing in Cube Satellite Networks |
| URI | https://ieeexplore.ieee.org/document/10004943 https://www.proquest.com/docview/2778704133 |
| Volume | 17 |
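
The abstract defines the pre-store hit rate as the fraction of users directly served with pre-stored data. As a minimal illustrative sketch of that metric (not the authors' implementation), the Python snippet below computes it for one satellite. The function name `pre_store_hit_rate`, the representation of user requests as strings, and the sample data are all hypothetical assumptions introduced here for illustration.

```python
from typing import Iterable, Set


def pre_store_hit_rate(user_requests: Iterable[str], pre_stored: Set[str]) -> float:
    """Return the fraction of users whose request is already pre-stored.

    Hypothetical helper: each element of `user_requests` is the data item
    requested by one user; `pre_stored` is the set of items held on the
    satellite. An empty request list yields 0.0 by convention.
    """
    requests = list(user_requests)
    if not requests:
        return 0.0
    # A "hit" is a user who can be served directly from pre-stored data.
    hits = sum(1 for item in requests if item in pre_stored)
    return hits / len(requests)


# Made-up example: four users, two pre-stored items.
example_requests = ["a", "b", "a", "c"]
example_store = {"a", "c"}
example_rate = pre_store_hit_rate(example_requests, example_store)  # 3 of 4 users are hits
```

In the setting the abstract describes, a quantity of this kind would be the objective that the pre-storage and routing strategy seeks to maximize; how requests and stored items are actually modeled in the paper is not stated in this record.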