The Simplex and Policy-Iteration Methods Are Strongly Polynomial for the Markov Decision Problem with a Fixed Discount Rate

Bibliographic Details
Published in Mathematics of Operations Research, Vol. 36, no. 4, pp. 593-603
Main Author Ye, Yinyu
Format Journal Article
Language English
Published Linthicum: INFORMS (Institute for Operations Research and the Management Sciences), 2011-11-01
ISSN 0364-765X
EISSN 1526-5471
DOI 10.1287/moor.1110.0516

Abstract We prove that the classic policy-iteration method [Howard, R. A. 1960. Dynamic Programming and Markov Processes. MIT, Cambridge] and the original simplex method with the most-negative-reduced-cost pivoting rule of Dantzig are strongly polynomial-time algorithms for solving the Markov decision problem (MDP) with a fixed discount rate. Furthermore, the computational complexity of the policy-iteration and simplex methods is superior to that of the only known strongly polynomial-time interior-point algorithm [Ye, Y. 2005. A new complexity result on solving the Markov decision problem. Math. Oper. Res. 30(3) 733-749] for solving this problem. The result is surprising because the simplex method with the same pivoting rule was shown to be exponential for solving a general linear programming problem [Klee, V., G. J. Minty. 1972. How good is the simplex method? Technical report. O. Shisha, ed. Inequalities III. Academic Press, New York], the simplex method with the smallest index pivoting rule was shown to be exponential for solving an MDP regardless of discount rates [Melekopoglou, M., A. Condon. 1994. On the complexity of the policy improvement algorithm for Markov decision processes. INFORMS J. Comput. 6(2) 188-192], and the policy-iteration method was recently shown to be exponential for solving undiscounted MDPs under the average cost criterion. We also extend the result to solving MDPs with transient substochastic transition matrices whose spectral radii are uniformly below one.
Key words: simplex method; policy-iteration method; Markov decision problem; linear programming; dynamic programming; strongly polynomial time
MSC2000 subject classification: Primary: 90C40, 90C05; secondary: 68Q25, 90C39
OR/MS subject classification: Primary: linear programming, algorithm; secondary: dynamic programming, Markov
History: Received May 15, 2010; revised August 16, 2011.
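For readers unfamiliar with the policy-iteration method named in the abstract, the following is a minimal sketch of Howard-style policy iteration for a discounted, cost-minimizing MDP. It is an illustration only and is not taken from the paper: the function names (`solve_linear`, `policy_iteration`), the data layout, and the two-state toy instance are all assumptions made for this example. Exact rational arithmetic is used so that the stopping test is not affected by rounding.

```python
from fractions import Fraction


def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination (hypothetical helper)."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        # Pick the row with the largest entry in this column as pivot.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def policy_iteration(costs, trans, gamma):
    """costs[s][a] is the immediate cost, trans[s][a][t] is P(t | s, a),
    and gamma is the discount rate, assumed to be a Fraction in [0, 1)."""
    n = len(costs)
    policy = [0] * n  # arbitrary starting policy: action 0 everywhere
    while True:
        # Policy evaluation: solve (I - gamma * P_pi) v = c_pi exactly.
        A = [[(1 if s == t else 0) - gamma * trans[s][policy[s]][t]
              for t in range(n)] for s in range(n)]
        b = [costs[s][policy[s]] for s in range(n)]
        v = solve_linear(A, b)

        def q(s, a):
            # One-step lookahead cost of taking action a in state s.
            return costs[s][a] + gamma * sum(
                p * v[t] for t, p in enumerate(trans[s][a]))

        # Policy improvement: every state switches to a greedy action,
        # keeping its current action when there is no strict improvement.
        new_policy = []
        for s in range(n):
            best = min(range(len(costs[s])), key=lambda a: q(s, a))
            if q(s, policy[s]) <= q(s, best):
                best = policy[s]
            new_policy.append(best)
        if new_policy == policy:
            return policy, v
        policy = new_policy


# Toy instance (illustrative data only): two states, two actions each.
gamma = Fraction(1, 2)
costs = [[2, 0], [1, 3]]
trans = [
    [[1, 0], [0, 1]],  # state 0: action 0 stays, action 1 moves to state 1
    [[0, 1], [1, 0]],  # state 1: action 0 stays, action 1 moves to state 0
]
policy, v = policy_iteration(costs, trans, gamma)
```

On this toy instance the loop stops after one policy switch. The simplex method with Dantzig's rule, also discussed in the abstract, would instead change the action in a single state per step, namely the one with the most negative reduced cost.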
Audience Academic
CODEN MOREDQ
Copyright Copyright 2011, Institute for Operations Research and the Management Sciences
Discipline Engineering
Computer Science
Business
Mathematics
Genre Research Article
GeographicLocations United States
IsPeerReviewed true
IsScholarly true
References
B1 Altman, E. 1999. Constrained Markov Decision Processes.
B2 Bellman, R. 1957. Dynamic Programming.
B3 Bertsekas, D. P. 1987. Dynamic Programming, Deterministic and Stochastic Models.
B4 doi:10.1287/ijoc.6.1.15
B5 doi:10.2307/1910385
B6 doi:10.1515/9781400884179
B7 de Ghellinck, G. 1960. Cahiers du Centre d'Etudes de Recherche Opérationnelle 2, p. 161.
B8 doi:10.1287/mnsc.10.1.98
B9 doi:10.1287/moor.14.3.502
B10
B11 doi:10.1287/moor.13.1.90
B12 doi:10.1007/978-3-642-14162-1_46
B13 Howard, R. A. 1960. Dynamic Programming and Markov Processes.
B14 Kallenberg, L. C. M. 1983. Linear Programming and Finite Markovian Control Problems.
B15 doi:10.1007/BF02579150
B16 Khachiyan, L. G. 1979. Dokl. Akad. Nauk SSSR 244, p. 1086.
B17 Klee, V. 1972. Inequalities III.
B18 Littman, M. L. 1994. Proc. 11th Annual Conf. Uncertainty Artificial Intelligence (UAI–95), p. 394.
B19 Lovász, L. 1988. Geometric Algorithms and Combinatorial Optimization.
B20 doi:10.1287/mnsc.6.3.259
B21 Mansour, Y. 1999. Proc. 15th Internat. Conf. Uncertainty Artificial Intelligence, p. 401.
B22 doi:10.1287/ijoc.6.2.188
B23 doi:10.1287/moor.12.3.441
B24 doi:10.1002/9780470316887
B25
B26 doi:10.1073/pnas.39.10.1095
B27 doi:10.1016/0167-6377(90)90022-W
B28 doi:10.1007/BF02592148
B29 doi:10.1016/0024-3795(68)90002-5
B30 doi:10.1214/aoms/1177697379
B31
B32 doi:10.1287/moor.1050.0149
SubjectTerms Algorithms
Arithmetic
Discount rates
Dynamic programming
Linear programming
Markov analysis
Markov decision problem
Markov processes
Mathematical vectors
Mathematics
Methods
Optimal policy
policy-iteration method
Polynomials
Simplex method
strongly polynomial time
Studies