Constrained dynamic programming with two discount factors: applications and an algorithm
| Published in | IEEE Transactions on Automatic Control, Vol. 44, no. 3, pp. 628-631 |
|---|---|
| Main Authors | , |
| Format | Journal Article |
| Language | English |
| Published | New York, NY: IEEE (Institute of Electrical and Electronics Engineers), 01.03.1999 |
| ISSN | 0018-9286 |
| DOI | 10.1109/9.751365 |
| Summary: | We consider a discrete time Markov decision process, where the objectives are linear combinations of standard discounted rewards, each with a different discount factor. We describe several applications that motivate the recent interest in these criteria. For the special case where a standard discounted cost is to be minimized, subject to a constraint on another standard discounted cost but with a different discount factor, we provide an implementable algorithm for computing an optimal policy. |
|---|---|