Model-Free μ Synthesis via Adversarial Reinforcement Learning

Bibliographic Details
Published in: Proceedings of the American Control Conference, pp. 3335-3341
Main Authors: Keivan, Darioush; Havens, Aaron; Seiler, Peter; Dullerud, Geir; Hu, Bin
Format: Conference Proceeding
Language: English
Published: American Automatic Control Council, 08.06.2022
ISSN: 2378-5861
DOI: 10.23919/ACC53348.2022.9867674


Abstract Motivated by the recent empirical success of policy-based reinforcement learning (RL), there has been a research trend studying the performance of policy-based RL methods on standard control benchmark problems. In this paper, we examine the effectiveness of policy-based RL methods on an important robust control problem, namely μ synthesis. We build a connection between robust adversarial RL and μ synthesis, and develop a model-free version of the well-known DK-iteration for solving state-feedback μ synthesis with static D-scaling. In the proposed algorithm, the K step mimics the classical central path algorithm by incorporating a recently developed double-loop adversarial RL method as a subroutine, and the D step is based on model-free finite-difference approximation. An extensive numerical study is also presented to demonstrate the utility of the proposed model-free algorithm. Our study sheds new light on the connections between adversarial RL and robust control.
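A rough structural sketch of the alternation described in the abstract, in Python. This is not the authors' algorithm: the double-loop adversarial RL subroutine of the K step is replaced here by a simple two-point zeroth-order update, and rollout_cost (a user-supplied routine returning a sampled closed-loop cost for given controller and D-scaling parameters), K0, and D0 are hypothetical placeholders.

import numpy as np

def k_step(K, D, rollout_cost, lr=1e-3, iters=200, eps=1e-4):
    # K step: improve the controller for fixed D-scaling. The paper uses a
    # double-loop adversarial RL subroutine (an inner loop trains the
    # worst-case adversary); a two-point zeroth-order update stands in here.
    for _ in range(iters):
        u = np.random.randn(*K.shape)  # random perturbation direction
        g = (rollout_cost(K + eps * u, D) - rollout_cost(K - eps * u, D)) / (2 * eps)
        K = K - lr * g * u             # descend along the estimated gradient
    return K

def d_step(K, D, rollout_cost, lr=1e-2, iters=50, eps=1e-4):
    # D step: tune the static D-scaling by model-free finite differences,
    # holding the controller fixed, as sketched in the abstract.
    for _ in range(iters):
        g = np.zeros_like(D)
        for i in range(D.size):
            e = np.zeros_like(D)
            e.flat[i] = eps
            g.flat[i] = (rollout_cost(K, D + e) - rollout_cost(K, D - e)) / (2 * eps)
        D = D - lr * g
    return D

def dk_iteration(K0, D0, rollout_cost, outer_iters=10):
    # Alternate K and D steps, mirroring classical DK-iteration but using
    # only sampled closed-loop costs (no plant model).
    K, D = K0, D0
    for _ in range(outer_iters):
        K = k_step(K, D, rollout_cost)
        D = d_step(K, D, rollout_cost)
    return K, D

# Toy usage with a stand-in quadratic cost (purely illustrative):
cost = lambda K, D: float(np.sum((K - 1.0) ** 2) + np.sum((D - 2.0) ** 2))
K, D = dk_iteration(np.zeros((1, 2)), np.ones(2), cost)

In the paper's method, each controller update would first train the adversary toward the worst-case disturbance; the sketch above only preserves the outer K/D alternation and the model-free, sampling-based character of both steps.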
Author Keivan, Darioush
Havens, Aaron
Seiler, Peter
Dullerud, Geir
Hu, Bin
Author_xml – sequence: 1
  givenname: Darioush
  surname: Keivan
  fullname: Keivan, Darioush
  email: dk12@illinois.edu
  organization: University of Illinois at Urbana-Champaign, Coordinated Science Laboratory (CSL), Department of Mechanical Science & Engineering
– sequence: 2
  givenname: Aaron
  surname: Havens
  fullname: Havens, Aaron
  email: ahavens2@illinois.edu
  organization: University of Illinois at Urbana-Champaign, Coordinated Science Laboratory (CSL), Department of Electrical and Computer Engineering
– sequence: 3
  givenname: Peter
  surname: Seiler
  fullname: Seiler, Peter
  email: pseiler@umich.edu
  organization: University of Michigan, Department of Electrical Engineering and Computer Science
– sequence: 4
  givenname: Geir
  surname: Dullerud
  fullname: Dullerud, Geir
  email: dullerud@illinois.edu
  organization: University of Illinois at Urbana-Champaign, Coordinated Science Laboratory (CSL), Department of Mechanical Science & Engineering
– sequence: 5
  givenname: Bin
  surname: Hu
  fullname: Hu, Bin
  email: binhu7@illinois.edu
  organization: University of Illinois at Urbana-Champaign, Coordinated Science Laboratory (CSL), Department of Electrical and Computer Engineering
ContentType Conference Proceeding
DOI 10.23919/ACC53348.2022.9867674
Discipline Engineering
EISBN 9781665451963
1665451963
EISSN 2378-5861
EndPage 3341
ExternalDocumentID 9867674
Genre orig-research
IsPeerReviewed false
IsScholarly false
Language English
PageCount 7
PublicationDate 2022-06-08
PublicationTitle Proceedings of the American Control Conference
PublicationTitleAbbrev ACC
PublicationYear 2022
Publisher American Automatic Control Council
StartPage 3335
SubjectTerms Approximation algorithms
Benchmark testing
MIMICs
Numerical models
Reinforcement learning
Robust control
Title Model-Free μ Synthesis via Adversarial Reinforcement Learning
URI https://ieeexplore.ieee.org/document/9867674