Constrained dynamic programming with two discount factors: applications and an algorithm

Bibliographic Details
Published in IEEE Transactions on Automatic Control Vol. 44; no. 3; pp. 628-631
Main Authors Feinberg, E.A., Shwartz, A.
Format Journal Article
Language English
Published New York, NY IEEE 01.03.1999
Institute of Electrical and Electronics Engineers
ISSN 0018-9286
DOI 10.1109/9.751365

Summary: We consider a discrete-time Markov decision process, where the objectives are linear combinations of standard discounted rewards, each with a different discount factor. We describe several applications that motivate the recent interest in these criteria. For the special case where a standard discounted cost is to be minimized, subject to a constraint on another standard discounted cost but with a different discount factor, we provide an implementable algorithm for computing an optimal policy.