Distributed and Distribution-Robust Meta Reinforcement Learning (D$^{2}$-RMRL) for Data Pre-Storage and Routing in Cube Satellite Networks

Bibliographic Details
Published in IEEE journal of selected topics in signal processing Vol. 17; no. 1; pp. 128-141
Main Authors Hu, Ye; Wang, Xiaodong; Saad, Walid
Format Journal Article
Language English
Published New York IEEE 01.01.2023
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN 1932-4553
EISSN 1941-0484
DOI 10.1109/JSTSP.2022.3232944

Abstract In this paper, the problem of data pre-storage and routing in dynamic, resource-constrained cube satellite networks is studied. In such a network, each cube satellite delivers requested data to the user clusters under its coverage. A group of ground gateways routes and pre-stores certain data to the satellites, so that ground users can be served directly with the pre-stored data. This pre-storage and routing design problem is formulated as a decentralized Markov decision process (Dec-MDP) in which we seek the optimal strategy that maximizes the pre-store hit rate, i.e., the fraction of users served directly with the pre-stored data. To obtain the optimal strategy, a distributed distribution-robust meta reinforcement learning (D$^{2}$-RMRL) algorithm is proposed that consists of three key ingredients: value decomposition to achieve the global optimum in a distributed setting with minimum communication overhead, meta learning to obtain an optimal initialization that reduces the training time under dynamic conditions, and pre-training to further speed up the meta-training procedure. Simulation results show that, using the proposed value decomposition and meta-training techniques, the satellite networks can achieve a 31.8% improvement in pre-store hits and a 40.7% improvement in convergence speed compared to a baseline reinforcement learning algorithm. Moreover, the proposed pre-training mechanism shortens the meta-learning procedure by up to 43.7%.
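
The two quantities named in the abstract, the pre-store hit rate and a decomposed joint value, can be made concrete with a minimal sketch. It assumes the simplest additive form of value decomposition, which the abstract does not confirm, and every name here (`pre_store_hit_rate`, `joint_value`, `greedy_joint_action`) as well as the sample data is a hypothetical illustration rather than the authors' implementation.

```python
from typing import Dict, Iterable, List, Sequence, Set


def pre_store_hit_rate(requests: Sequence[str], pre_stored: Set[str]) -> float:
    """Fraction of requests that can be served directly from pre-stored data."""
    if not requests:
        return 0.0
    hits = sum(1 for item in requests if item in pre_stored)
    return hits / len(requests)


def joint_value(local_values: Iterable[float]) -> float:
    """Assumed additive decomposition: the joint value is the sum of local values."""
    return sum(local_values)


def greedy_joint_action(local_q: List[Dict[str, float]]) -> List[str]:
    """Each agent picks the action with the highest local value.

    Under the additive assumption, maximizing every term separately also
    maximizes the sum, so no agent needs to see the others' values.
    """
    return [max(q, key=q.get) for q in local_q]


# Hypothetical example: four user requests, two data items pre-stored.
requests = ["a", "b", "c", "a"]
pre_stored = {"a", "c"}
rate = pre_store_hit_rate(requests, pre_stored)

# Hypothetical local action values for two agents.
local_q = [
    {"store_x": 0.2, "store_y": 0.7},
    {"store_x": 0.5, "store_y": 0.1},
]
actions = greedy_joint_action(local_q)
total = joint_value(q[a] for q, a in zip(local_q, actions))
```

The additive form is chosen only because it makes the low-communication property easy to see: each agent acts on its own local values, and only the scalar sum would need to be shared during training.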
Author Wang, Xiaodong
Saad, Walid
Hu, Ye
Author_xml – sequence: 1
  givenname: Ye
  orcidid: 0000-0001-9872-5461
  surname: Hu
  fullname: Hu, Ye
  email: yh3453@columbia.edu
  organization: Department of Electrical Engineering, Columbia University, New York, NY, USA
– sequence: 2
  givenname: Xiaodong
  orcidid: 0000-0002-2945-9240
  surname: Wang
  fullname: Wang, Xiaodong
  email: xw2008@columbia.edu
  organization: Department of Electrical Engineering, Columbia University, New York, NY, USA
– sequence: 3
  givenname: Walid
  orcidid: 0000-0003-2247-2458
  surname: Saad
  fullname: Saad, Walid
  email: walids@vt.edu
  organization: Wireless@VT, Bradley Department of Electrical and Computer Engineering, Virginia Tech, Blacksburg, VA, USA
CODEN IJSTGY
CitedBy_id crossref_primary_10_1109_JSAC_2024_3460086
crossref_primary_10_1109_TAES_2024_3438681
Cites_doi 10.1109/ICC.2012.6363993
10.1109/TWC.2020.3024629
10.1109/JSAC.2021.3118346
10.1109/JIOT.2021.3065664
10.1109/TCOMM.2017.2685383
10.1109/MWC.001.1900178
10.1109/JSAC.2017.2680898
10.1109/TGCN.2019.2954166
10.1109/MCOM.2019.1800796
10.1609/aaai.v32i1.11794
10.1007/BF00992698
10.1017/CBO9780511807213
10.1155/2018/3026405
10.1109/tnn.1998.712192
10.1016/j.comnet.2020.107213
10.1109/JSAC.2021.3088689
10.1002/9781119673811
10.1002/sat.1374
10.1002/ett.3861
10.1109/JSAC.2018.2832798
10.21236/ADA280862
10.1109/MWC.2017.1600173
10.1109/JSAC.2003.819970
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2023
Copyright_xml – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2023
DBID 97E
RIA
RIE
AAYXX
CITATION
7SP
8FD
H8D
L7M
DOI 10.1109/JSTSP.2022.3232944
DatabaseName IEEE All-Society Periodicals Package (ASPP) 2005–Present
IEEE All-Society Periodicals Package (ASPP) 1998–Present
IEEE Electronic Library (IEL)
CrossRef
Electronics & Communications Abstracts
Technology Research Database
Aerospace Database
Advanced Technologies Database with Aerospace
DatabaseTitle CrossRef
Aerospace Database
Technology Research Database
Advanced Technologies Database with Aerospace
Electronics & Communications Abstracts
DatabaseTitleList Aerospace Database

Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Xplore Digital Library
  url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
EISSN 1941-0484
EndPage 141
ExternalDocumentID 10_1109_JSTSP_2022_3232944
10004943
Genre orig-research
GrantInformation_xml – fundername: U.S. National Science Foundation
  grantid: CNS-1909372
ID FETCH-LOGICAL-c296t-b33c157ed2e15fed82a21c908d27364b8c3d644f4cda810ca6243a51de6439cf3
IEDL.DBID RIE
ISSN 1932-4553
IngestDate Mon Jun 30 10:18:31 EDT 2025
Wed Oct 01 03:34:40 EDT 2025
Thu Apr 24 23:03:25 EDT 2025
Wed Aug 27 02:54:06 EDT 2025
IsPeerReviewed true
IsScholarly true
Issue 1
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-c296t-b33c157ed2e15fed82a21c908d27364b8c3d644f4cda810ca6243a51de6439cf3
Notes ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ORCID 0000-0001-9872-5461
0000-0002-2945-9240
0000-0003-2247-2458
PQID 2778704133
PQPubID 75721
PageCount 14
ParticipantIDs crossref_citationtrail_10_1109_JSTSP_2022_3232944
crossref_primary_10_1109_JSTSP_2022_3232944
ieee_primary_10004943
proquest_journals_2778704133
ProviderPackageCode CITATION
AAYXX
PublicationCentury 2000
PublicationDate 2023-Jan.
2023-1-00
20230101
PublicationDateYYYYMMDD 2023-01-01
PublicationDate_xml – month: 01
  year: 2023
  text: 2023-Jan.
PublicationDecade 2020
PublicationPlace New York
PublicationPlace_xml – name: New York
PublicationTitle IEEE journal of selected topics in signal processing
PublicationTitleAbbrev JSTSP
PublicationYear 2023
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
SSID ssj0057614
Score 2.409784
SourceID proquest
crossref
ieee
SourceType Aggregation Database
Enrichment Source
Index Database
Publisher
StartPage 128
SubjectTerms Actor-critic
Algorithms
cube satellite network
data pre-storage
Decomposition
Heuristic algorithms
Logic gates
Machine learning
Markov processes
meta learning
multi-agent reinforcement learning
Optimization
Robustness
Routing
Satellite communication
Satellite networks
Satellites
Task analysis
Training
value decomposition
Title Distributed and Distribution-Robust Meta Reinforcement Learning (D$^{2}$-RMRL) for Data Pre-Storage and Routing in Cube Satellite Networks
URI https://ieeexplore.ieee.org/document/10004943
https://www.proquest.com/docview/2778704133
Volume 17
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
journalDatabaseRights – providerCode: PRVIEE
  databaseName: IEEE Xplore Digital Library
  customDbUrl:
  eissn: 1941-0484
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0057614
  issn: 1932-4553
  databaseCode: RIE
  dateStart: 20070101
  isFulltext: true
  titleUrlDefault: https://ieeexplore.ieee.org/
  providerName: IEEE