The Simplex and Policy-Iteration Methods Are Strongly Polynomial for the Markov Decision Problem with a Fixed Discount Rate
We prove that the classic policy-iteration method [Howard, R. A. 1960. Dynamic Programming and Markov Processes . MIT, Cambridge] and the original simplex method with the most-negative-reduced-cost pivoting rule of Dantzig are strongly polynomial-time algorithms for solving the Markov decision probl...
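The two methods named in the abstract can be sketched for a small discounted-cost MDP. This is an illustrative sketch only, not code from the article: the function names (`solve`, `evaluate`, `improve`), the data layout (`P[s][a][t]` for transition probabilities, `c[s][a]` for costs), the tolerance, and the two-state instance are all assumptions made for this example. Both rules compute the reduced cost `c[s][a] + gamma * sum_t P[s][a][t] * v[t] - v[s]` of every action under the current policy. Howard's policy iteration then switches every state that has a negative reduced cost, while the simplex method with Dantzig's rule switches only the single state whose action has the most negative reduced cost.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def evaluate(P, c, gamma, policy):
    """Value of a fixed policy: solve (I - gamma * P_pi) v = c_pi."""
    m = len(policy)
    A = [[(1.0 if i == j else 0.0) - gamma * P[i][policy[i]][j]
          for j in range(m)] for i in range(m)]
    b = [c[i][policy[i]] for i in range(m)]
    return solve(A, b)


def improve(P, c, gamma, rule="howard", tol=1e-9, max_iter=1000):
    """Run policy iteration ("howard") or Dantzig-rule simplex ("dantzig")."""
    m = len(P)
    policy = [0] * m  # hypothetical starting policy: action 0 everywhere
    for _ in range(max_iter):
        v = evaluate(P, c, gamma, policy)
        # For each state, the smallest reduced cost and the action attaining it.
        best = [min((c[i][a] + gamma * sum(P[i][a][j] * v[j] for j in range(m))
                     - v[i], a) for a in range(len(c[i])))
                for i in range(m)]
        improving = [i for i in range(m) if best[i][0] < -tol]
        if not improving:
            return policy, v  # no negative reduced cost: policy is optimal
        if rule == "dantzig":
            # Simplex pivot: change only the most-negative-reduced-cost state.
            improving = [min(improving, key=lambda i: best[i][0])]
        for i in improving:
            policy[i] = best[i][1]
    raise RuntimeError("iteration limit reached")


# Toy instance (assumed for illustration): two states, two actions each.
P = [[[1.0, 0.0], [0.0, 1.0]],   # state 0: stay, or move to state 1
     [[0.0, 1.0], [1.0, 0.0]]]   # state 1: stay, or move to state 0
c = [[1.0, 1.5], [0.0, 3.0]]
gamma = 0.5

for rule in ("howard", "dantzig"):
    print(rule, improve(P, c, gamma, rule=rule))
```

On an instance this small both rules follow the same path, since only one state ever has an improving action; they differ when several states can improve at once.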
| Published in | Mathematics of operations research Vol. 36; no. 4; pp. 593-603 |
|---|---|
| Main Author | Ye, Yinyu |
| Format | Journal Article | 
| Language | English | 
| Published | Linthicum: INFORMS (Institute for Operations Research and the Management Sciences), 01.11.2011 |
| ISSN | 0364-765X 1526-5471  | 
| DOI | 10.1287/moor.1110.0516 | 
| Abstract | We prove that the classic policy-iteration method [Howard, R. A. 1960. Dynamic Programming and Markov Processes. MIT, Cambridge] and the original simplex method with the most-negative-reduced-cost pivoting rule of Dantzig are strongly polynomial-time algorithms for solving the Markov decision problem (MDP) with a fixed discount rate. Furthermore, the computational complexity of the policy-iteration and simplex methods is superior to that of the only known strongly polynomial-time interior-point algorithm [Ye, Y. 2005. A new complexity result on solving the Markov decision problem. Math. Oper. Res. 30(3) 733-749] for solving this problem. The result is surprising because the simplex method with the same pivoting rule was shown to be exponential for solving a general linear programming problem [Klee, V., G. J. Minty. 1972. How good is the simplex method? Technical report. O. Shisha, ed. Inequalities III. Academic Press, New York], the simplex method with the smallest index pivoting rule was shown to be exponential for solving an MDP regardless of discount rates [Melekopoglou, M., A. Condon. 1994. On the complexity of the policy improvement algorithm for Markov decision processes. INFORMS J. Comput. 6(2) 188-192], and the policy-iteration method was recently shown to be exponential for solving undiscounted MDPs under the average cost criterion. We also extend the result to solving MDPs with transient substochastic transition matrices whose spectral radii are uniformly below one. |
    
|---|---|
| AbstractList | Key words: simplex method; policy-iteration method; Markov decision problem; linear programming; dynamic programming; strongly polynomial time. MSC2000 subject classification: Primary: 90C40, 90C05; secondary: 68Q25, 90C39. OR/MS subject classification: Primary: linear programming, algorithm; secondary: dynamic programming, Markov. History: Received May 15, 2010; revised August 16, 2011. |
    
| Audience | Academic | 
    
| Author | Ye, Yinyu | 
    
| Author_xml | – sequence: 1 givenname: Yinyu surname: Ye fullname: Ye, Yinyu  | 
    
| BackLink | http://www.econis.eu/PPNSET?PPN=681232110$$DView this record in ZBW - Deutsche Zentralbibliothek für Wirtschaftswissenschaften | 
    
| CODEN | MOREDQ | 
    
| CitedBy_id | crossref_primary_10_1080_10556788_2012_668906 crossref_primary_10_1016_j_automatica_2015_05_016 crossref_primary_10_2139_ssrn_3427401 crossref_primary_10_1016_j_ejcon_2024_101068 crossref_primary_10_1016_j_orl_2014_07_006 crossref_primary_10_1080_00207543_2014_931609 crossref_primary_10_3390_math10142497 crossref_primary_10_1007_s10107_023_02017_4 crossref_primary_10_1109_TAC_2023_3274791 crossref_primary_10_1137_20M1380211 crossref_primary_10_1007_s10479_012_1089_2 crossref_primary_10_1007_s10107_020_01585_z crossref_primary_10_1137_151002915 crossref_primary_10_1016_j_ijpe_2023_108997 crossref_primary_10_1287_moor_2022_0284 crossref_primary_10_2478_foli_2023_0001 crossref_primary_10_1007_s11042_018_6098_y crossref_primary_10_1007_s10994_023_06368_z crossref_primary_10_1145_2432622_2432623 crossref_primary_10_1002_nav_21743 crossref_primary_10_1080_10556788_2016_1208748 crossref_primary_10_1007_s10479_012_1199_x crossref_primary_10_1287_moor_2019_1000 crossref_primary_10_1007_s11590_017_1111_3 crossref_primary_10_1007_s40687_020_00235_2 crossref_primary_10_1016_j_orl_2024_107091 crossref_primary_10_1137_20M1367192 crossref_primary_10_1287_moor_2014_0699 crossref_primary_10_1007_s00186_017_0610_4 crossref_primary_10_1016_j_orl_2013_12_011 crossref_primary_10_1016_j_tcs_2013_08_020 crossref_primary_10_1007_s11590_016_1040_6 crossref_primary_10_1016_j_orl_2011_01_003 crossref_primary_10_1038_s41534_023_00766_w crossref_primary_10_1137_18M1197205 crossref_primary_10_2514_1_I011235 crossref_primary_10_1002_nav_21992 crossref_primary_10_1109_TCOMM_2022_3141786 crossref_primary_10_1016_j_ejor_2018_10_052 crossref_primary_10_1145_3152042_3152045 crossref_primary_10_1287_moor_2015_0753 crossref_primary_10_1007_s00224_019_09925_z crossref_primary_10_1016_j_orl_2020_07_001 crossref_primary_10_1287_opre_2013_1164 crossref_primary_10_1016_j_orl_2013_02_002 crossref_primary_10_4204_EPTCS_85_2 crossref_primary_10_1287_opre_2017_1598 
crossref_primary_10_1287_moor_2021_0345 crossref_primary_10_1016_j_jda_2017_04_004 crossref_primary_10_1109_JSAC_2015_2416987 crossref_primary_10_9746_sicetr_52_566 crossref_primary_10_1007_s10479_022_05025_3 crossref_primary_10_1137_22M149185X crossref_primary_10_1109_ACCESS_2019_2939161 crossref_primary_10_1002_net_21670 crossref_primary_10_1007_s10107_014_0802_0 crossref_primary_10_1016_j_jclepro_2013_08_033 crossref_primary_10_1016_j_orl_2024_107199 crossref_primary_10_1016_j_orl_2016_01_010 crossref_primary_10_1287_opre_2022_2392 crossref_primary_10_1007_s10107_023_01956_2 crossref_primary_10_1287_moor_2017_0912 crossref_primary_10_1287_opre_2022_2269 crossref_primary_10_1007_s11590_018_1276_4 crossref_primary_10_1007_s11750_013_0291_y crossref_primary_10_1016_j_orl_2012_01_004 crossref_primary_10_1080_10556788_2019_1695131 crossref_primary_10_1109_TCOMM_2022_3200105 crossref_primary_10_1016_j_orl_2015_10_002 crossref_primary_10_1287_opre_2021_2258  | 
    
| Cites_doi | 10.1287/moor.1050.0149 10.1007/978-3-642-14162-1_46 10.1007/BF02579150 10.1287/moor.12.3.441 10.1016/0024-3795(68)90002-5 10.1515/9781400884179 10.1287/mnsc.10.1.98 10.1073/pnas.39.10.1095 10.1214/aoms/1177697379 10.1002/9780470316887 10.1287/mnsc.6.3.259 10.1287/moor.14.3.502 10.1287/ijoc.6.1.15 10.2307/1910385 10.1007/BF02592148 10.1016/0167-6377(90)90022-W 10.1287/moor.13.1.90 10.1287/ijoc.6.2.188  | 
    
| ContentType | Journal Article | 
    
| Copyright | Copyright 2011, Institute for Operations Research and the Management Sciences |
    
| DBID | AAYXX CITATION OQ6 N95 ISR JQ2  | 
    
| DOI | 10.1287/moor.1110.0516 | 
    
| DatabaseName | CrossRef; ECONIS; Gale Business: Insights; Gale In Context: Science; ProQuest Computer Science Collection |
    
| DatabaseTitle | CrossRef; ProQuest Computer Science Collection |
    
| DatabaseTitleList | CrossRef; ProQuest Computer Science Collection |
    
| DeliveryMethod | fulltext_linktorsrc | 
    
| Discipline | Engineering; Computer Science; Business; Mathematics |
    
| EISSN | 1526-5471 | 
    
| EndPage | 603 | 
    
| ExternalDocumentID | 2546389161 A273614482 681232110 10_1287_moor_1110_0516 41412326 moor.1110.0516  | 
    
| Genre | Research Article | 
    
| GeographicLocations | United States | 
    
| GeographicLocations_xml | – name: United States | 
    
| ID | FETCH-LOGICAL-c557t-224aee7ab93cc562dbcbd3eb7a5871b2a1300e5acf105d89fd61fc4e969124073 | 
    
| ISSN | 0364-765X | 
    
| IngestDate | Fri Aug 15 20:04:02 EDT 2025 Mon Oct 20 22:22:40 EDT 2025 Thu Jun 12 23:37:23 EDT 2025 Mon Oct 20 16:09:15 EDT 2025 Thu Oct 16 14:09:27 EDT 2025 Fri May 23 01:10:44 EDT 2025 Sat Mar 08 16:11:26 EST 2025 Thu Apr 24 23:01:01 EDT 2025 Wed Oct 01 02:52:23 EDT 2025 Thu May 29 08:43:16 EDT 2025 Wed Jan 06 02:47:59 EST 2021  | 
    
| IsPeerReviewed | true | 
    
| IsScholarly | true | 
    
| Issue | 4 | 
    
| Language | English | 
    
| LinkModel | OpenURL | 
    
| MergedId | FETCHMERGED-LOGICAL-c557t-224aee7ab93cc562dbcbd3eb7a5871b2a1300e5acf105d89fd61fc4e969124073 | 
    
| PQID | 912481733 | 
    
| PQPubID | 37790 | 
    
| PageCount | 11 | 
    
| ParticipantIDs | crossref_citationtrail_10_1287_moor_1110_0516 gale_infotracacademiconefile_A273614482 informs_primary_10_1287_moor_1110_0516 proquest_journals_912481733 gale_businessinsightsgauss_A273614482 gale_incontextgauss_ISR_A273614482 gale_infotracgeneralonefile_A273614482 jstor_primary_41412326 gale_infotracmisc_A273614482 econis_primary_681232110 crossref_primary_10_1287_moor_1110_0516  | 
    
| ProviderPackageCode | Y99 RPU NIEAY CITATION AAYXX  | 
    
| PublicationCentury | 2000 | 
    
| PublicationDate | 2011-11-01 | 
    
| PublicationDateYYYYMMDD | 2011-11-01 | 
    
| PublicationDate_xml | – month: 11 year: 2011 text: 2011-11-01 day: 01  | 
    
| PublicationDecade | 2010 | 
    
| PublicationPlace | Linthicum | 
    
| PublicationPlace_xml | – name: Linthicum | 
    
| PublicationTitle | Mathematics of operations research | 
    
| PublicationYear | 2011 | 
    
| Publisher | INFORMS; Institute for Operations Research and the Management Sciences; Inst |
    
| Publisher_xml | – name: INFORMS – name: Institute for Operations Research and the Management Sciences – name: Inst  | 
    
| References | B1: Altman E. 1999. Constrained Markov Decision Processes; B2: Bellman R. 1957. Dynamic Programming; B3: Bertsekas D. P. 1987. Dynamic Programming, Deterministic and Stochastic Models; B4: doi 10.1287/ijoc.6.1.15; B5: doi 10.2307/1910385; B6: doi 10.1515/9781400884179; B7: de Ghellinck G. 1960. Cahiers du Centre d'Etudes de Recherche Opérationnelle 2, 161; B8: doi 10.1287/mnsc.10.1.98; B9: doi 10.1287/moor.14.3.502; B10 (identifier only); B11: doi 10.1287/moor.13.1.90; B12: doi 10.1007/978-3-642-14162-1_46; B13: Howard R. A. 1960. Dynamic Programming and Markov Processes; B14: Kallenberg L. C. M. 1983. Linear Programming and Finite Markovian Control Problems; B15: doi 10.1007/BF02579150; B16: Khachiyan L. G. 1979. Dokl. Akad. Nauk SSSR 244, 1086; B17: Klee V. 1972. Inequalities III; B18: Littman M. L. 1994. Proc. 11th Annual Conf. Uncertainty Artificial Intelligence (UAI–95), 394; B19: Lovász L. 1988. Geometric Algorithms and Combinatorial Optimization; B20: doi 10.1287/mnsc.6.3.259; B21: Mansour Y. 1999. Proc. 15th Internat. Conf. Uncertainty Artificial Intelligence, 401; B22: doi 10.1287/ijoc.6.2.188; B23: doi 10.1287/moor.12.3.441; B24: doi 10.1002/9780470316887; B25 (identifier only); B26: doi 10.1073/pnas.39.10.1095; B27: doi 10.1016/0167-6377(90)90022-W; B28: doi 10.1007/BF02592148; B29: doi 10.1016/0024-3795(68)90002-5; B30: doi 10.1214/aoms/1177697379; B31 (identifier only); B32: doi 10.1287/moor.1050.0149 |
    
| SSID | ssj0015714 | 
    
| Score | 2.4090855 | 
    
| SourceID | proquest gale econis crossref jstor informs  | 
    
| SourceType | Aggregation Database Index Database Enrichment Source Publisher  | 
    
| StartPage | 593 | 
    
| SubjectTerms | Algorithms; Arithmetic; Discount rates; Dynamic programming; Linear programming; Markov analysis; Markov decision problem; Markov processes; Mathematical vectors; Mathematics; Methods; Optimal policy; policy-iteration method; Polynomials; Simplex method; strongly polynomial time; Studies |
    
| Title | The Simplex and Policy-Iteration Methods Are Strongly Polynomial for the Markov Decision Problem with a Fixed Discount Rate | 
    
| URI | https://www.jstor.org/stable/41412326 http://www.econis.eu/PPNSET?PPN=681232110 https://www.proquest.com/docview/912481733  | 
    
| Volume | 36 | 
    
| hasFullText | 1 | 
    
| inHoldings | 1 | 
    
| journalDatabaseRights | – providerCode: PRVEBS databaseName: Mathematics Source customDbUrl: eissn: 1526-5471 dateEnd: 99991231 omitProxy: false ssIdentifier: ssj0015714 issn: 0364-765X databaseCode: AMVHM dateStart: 19760201 isFulltext: true titleUrlDefault: https://www.ebsco.com/products/research-databases/mathematics-source providerName: EBSCOhost  | 
    