Closing the Gap: A Learning Algorithm for Lost-Sales Inventory Systems with Lead Times

Bibliographic Details
Published in Management Science Vol. 66; no. 5; pp. 1962–1980
Main Authors Zhang, Huanan; Chao, Xiuli; Shi, Cong
Format Journal Article
Language English
Published Linthicum INFORMS 01.05.2020
Institute for Operations Research and the Management Sciences
ISSN 0025-1909
EISSN 1526-5501
DOI 10.1287/mnsc.2019.3288

Abstract We consider a periodic-review, single-product inventory system with lost sales and positive lead times under censored demand. In contrast to the classical inventory literature, we assume the firm does not know the demand distribution a priori and makes an adaptive inventory-ordering decision in each period based only on the past sales (censored demand) data. The standard performance measure is regret, which is the cost difference between a learning algorithm and the clairvoyant (full-information) benchmark. When the benchmark is chosen to be the (full-information) optimal base-stock policy, Huh et al. [Huh WT, Janakiraman G, Muckstadt JA, Rusmevichientong P (2009a) An adaptive algorithm for finding the optimal base-stock policy in lost sales inventory systems with censored demand. Math. Oper. Res. 34(2):397–416.] developed a nonparametric learning algorithm with a cubic-root convergence rate on regret. An important open question is whether there exists a nonparametric learning algorithm whose regret rate matches the theoretical lower bound for any learning algorithm. In this work, we provide an affirmative answer to this question. More precisely, we propose a new nonparametric algorithm termed the simulated cycle-update policy and establish a square-root convergence rate on regret, which is proven to match the lower bound for any learning algorithm. Our algorithm uses a random cycle-updating rule based on an auxiliary simulated system running in parallel and also involves two new concepts, namely the withheld on-hand inventory and the double-phase cycle gradient estimation. The techniques developed are effective for learning a stochastic system with complex system dynamics and lasting impact of decisions. This paper was accepted by Yinyu Ye, optimization.
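The setting and the regret measure described in the abstract can be illustrated with a minimal sketch. This is not the simulated cycle-update policy from the paper. It only simulates a lost-sales system with a positive lead time under a fixed base-stock level, records the sales (censored demand) the firm would observe, and measures the cost gap to the best level on a grid. The function names, the cost parameters h and p, the event order within a period, the exponential demand in the demo, and the grid search standing in for the clairvoyant benchmark are all assumptions made for illustration.

```python
import random


def simulate_base_stock(S, demands, lead_time, h=1.0, p=4.0):
    """Average per-period cost of a fixed base-stock level S (illustrative only).

    Assumed event order in each period: the order placed lead_time periods
    ago arrives, a new order raises the inventory position to S, demand is
    realized, and unmet demand is lost. h is the per-unit holding cost and
    p the per-unit lost-sales penalty (both hypothetical parameters).
    """
    if lead_time < 1:
        raise ValueError("lead_time must be a positive integer")
    on_hand = 0.0
    pipeline = [0.0] * lead_time      # pipeline[0] is the next order to arrive
    total_cost = 0.0
    sales_history = []                # what the firm observes (censored demand)
    for d in demands:
        on_hand += pipeline.pop(0)                # receive the oldest order
        position = on_hand + sum(pipeline)        # on-hand plus outstanding orders
        pipeline.append(max(0.0, S - position))   # order up to S
        sales = min(d, on_hand)                   # demand above on-hand is lost
        lost = d - sales                          # not observable by the firm
        on_hand -= sales
        total_cost += h * on_hand + p * lost
        sales_history.append(sales)
    return total_cost / len(demands), sales_history


def best_base_stock(demands, lead_time, grid, h=1.0, p=4.0):
    """Grid search for the cheapest level on this demand path (a stand-in
    for the clairvoyant benchmark, which would use the true distribution)."""
    return min(grid, key=lambda S: simulate_base_stock(S, demands, lead_time, h, p)[0])


def pathwise_regret(S, demands, lead_time, grid, h=1.0, p=4.0):
    """Cumulative cost gap between level S and the best grid level."""
    s_star = best_base_stock(demands, lead_time, grid, h, p)
    cost = simulate_base_stock(S, demands, lead_time, h, p)[0]
    cost_star = simulate_base_stock(s_star, demands, lead_time, h, p)[0]
    return len(demands) * (cost - cost_star)


if __name__ == "__main__":
    random.seed(0)
    demands = [random.expovariate(1 / 10.0) for _ in range(2000)]
    grid = range(0, 61, 2)
    print("best level on grid:", best_base_stock(demands, 2, grid))
    print("regret of S=10:", pathwise_regret(10, demands, 2, grid))
```

A learning algorithm in the sense of the abstract would have to choose its orders from sales_history alone, without seeing demands or the lost units. The sketch evaluates fixed levels only, so it shows what the regret of a policy is measured against rather than how the paper's algorithm closes the gap.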
Author Shi, Cong
Chao, Xiuli
Zhang, Huanan
Author_xml – sequence: 1
  givenname: Huanan
  orcidid: 0000-0002-0672-5227
  surname: Zhang
  fullname: Zhang, Huanan
  organization: Harold and Inge Marcus Department of Industrial and Manufacturing Engineering, Pennsylvania State University, University Park, Pennsylvania 16802
– sequence: 2
  givenname: Xiuli
  orcidid: 0000-0001-5233-4385
  surname: Chao
  fullname: Chao, Xiuli
  organization: Department of Industrial and Operations Engineering, University of Michigan, Ann Arbor, Michigan 48105
– sequence: 3
  givenname: Cong
  orcidid: 0000-0003-3564-3391
  surname: Shi
  fullname: Shi, Cong
  organization: Harold and Inge Marcus Department of Industrial and Manufacturing Engineering, Pennsylvania State University, University Park, Pennsylvania 16802
CitedBy_id crossref_primary_10_2139_ssrn_4772763
crossref_primary_10_1287_msom_2022_1086
crossref_primary_10_2139_ssrn_3775303
crossref_primary_10_1016_j_knosys_2023_110459
crossref_primary_10_1287_mnsc_2021_04241
crossref_primary_10_1016_j_ejor_2020_10_006
crossref_primary_10_1287_mnsc_2022_4382
crossref_primary_10_2139_ssrn_2836057
crossref_primary_10_1287_mnsc_2020_3799
crossref_primary_10_2139_ssrn_3637705
crossref_primary_10_1111_poms_13339
crossref_primary_10_2139_ssrn_4008913
crossref_primary_10_1111_poms_13778
crossref_primary_10_1002_nav_21949
crossref_primary_10_1287_mnsc_2022_02476
crossref_primary_10_2139_ssrn_3734594
crossref_primary_10_2139_ssrn_4781604
crossref_primary_10_1287_mnsc_2021_4222
crossref_primary_10_1287_msom_2022_1168
crossref_primary_10_1287_msom_2021_0979
crossref_primary_10_1287_opre_2021_2161
crossref_primary_10_2139_ssrn_3554042
crossref_primary_10_2139_ssrn_3602544
crossref_primary_10_1007_s10479_022_04932_9
crossref_primary_10_1051_e3sconf_202457703001
crossref_primary_10_1287_msom_2021_0135
crossref_primary_10_1287_opre_2022_0273
crossref_primary_10_1111_poms_13693
crossref_primary_10_2139_ssrn_3803777
crossref_primary_10_2139_ssrn_4456201
crossref_primary_10_1111_poms_13178
crossref_primary_10_1287_mnsc_2023_4859
crossref_primary_10_1111_poms_13326
crossref_primary_10_2139_ssrn_4652347
crossref_primary_10_1287_opre_2022_0624
crossref_primary_10_2139_ssrn_4511384
crossref_primary_10_2139_ssrn_4096320
crossref_primary_10_1287_msom_2024_1061
crossref_primary_10_1111_itor_13042
crossref_primary_10_2139_ssrn_4806687
crossref_primary_10_2139_ssrn_4348606
crossref_primary_10_1002_nav_22211
crossref_primary_10_1287_opre_2021_0032
crossref_primary_10_2139_ssrn_4414361
crossref_primary_10_1177_10591478241231858
crossref_primary_10_1287_opre_2022_2263
crossref_primary_10_1016_j_cie_2024_110490
crossref_primary_10_1287_mnsc_2023_00920
crossref_primary_10_2139_ssrn_3978123
crossref_primary_10_1287_isre_2021_1040
crossref_primary_10_1287_mnsc_2022_02533
crossref_primary_10_1287_opre_2020_612
crossref_primary_10_1287_opre_2021_2129
crossref_primary_10_1287_opre_2023_2477
crossref_primary_10_2139_ssrn_3625059
crossref_primary_10_2139_ssrn_4671416
crossref_primary_10_2139_ssrn_3552995
crossref_primary_10_1287_opre_2020_0612
crossref_primary_10_2139_ssrn_3456834
crossref_primary_10_2139_ssrn_3287560
crossref_primary_10_2139_ssrn_3619625
crossref_primary_10_1016_j_orl_2024_107196
crossref_primary_10_2139_ssrn_4794903
crossref_primary_10_1016_j_tre_2021_102335
crossref_primary_10_2139_ssrn_4626023
crossref_primary_10_1111_poms_13824
crossref_primary_10_1111_poms_13868
crossref_primary_10_1287_mnsc_2021_4171
crossref_primary_10_1287_opre_2022_2307
crossref_primary_10_2139_ssrn_4040305
crossref_primary_10_1287_msom_2022_0323
Cites_doi 10.1287/moor.2015.0760
10.1287/opre.48.3.436.12437
10.1016/j.ejor.2011.02.004
10.1287/mnsc.1080.0945
10.1287/opre.1070.0463
10.1287/opre.2018.1724
10.1287/opre.2014.1298
10.1016/0167-7152(83)90025-1
10.1287/opre.2013.1239
10.1287/moor.1080.0355
10.1287/mnsc.1120.1654
10.1287/moor.1070.0285
10.1287/opre.1070.0471
10.1287/moor.1080.0367
10.1287/opre.1070.0482
10.1287/opre.1040.0130
10.1287/opre.2015.1474
10.1137/070704277
10.1287/opre.2016.1514
10.1561/9781680831719
10.1137/1011090
ContentType Journal Article
Copyright Copyright Institute for Operations Research and the Management Sciences May 2020
DBID AAYXX
CITATION
8BJ
FQK
JBE
DOI 10.1287/mnsc.2019.3288
DatabaseName CrossRef
International Bibliography of the Social Sciences (IBSS)
International Bibliography of the Social Sciences
International Bibliography of the Social Sciences
DatabaseTitle CrossRef
International Bibliography of the Social Sciences (IBSS)
DatabaseTitleList International Bibliography of the Social Sciences (IBSS)
CrossRef

DeliveryMethod fulltext_linktorsrc
Discipline Business
EISSN 1526-5501
EndPage 1980
ExternalDocumentID 10_1287_mnsc_2019_3288
mnsc20193288
Genre Research Articles
ISSN 0025-1909
IsPeerReviewed true
IsScholarly true
Issue 5
Language English
ORCID 0000-0001-5233-4385
0000-0002-0672-5227
0000-0003-3564-3391
PageCount 19
PublicationCentury 2000
PublicationDate 2020-05-01
PublicationDateYYYYMMDD 2020-05-01
PublicationDate_xml – month: 05
  year: 2020
  text: 2020-05-01
  day: 01
PublicationDecade 2020
PublicationPlace Linthicum
PublicationPlace_xml – name: Linthicum
PublicationTitle Management science
PublicationYear 2020
Publisher INFORMS
Institute for Operations Research and the Management Sciences
Publisher_xml – name: INFORMS
– name: Institute for Operations Research and the Management Sciences
References
Karlin S (B13) 1958
Zipkin P (B27) 2000
References_xml
– ident: B6
  doi: 10.1287/moor.2015.0760
– start-page: 135
  volume-title: Studies in the Mathematical Theory of Inventory and Production
  year: 1958
  ident: B13
– ident: B4
  doi: 10.1287/opre.48.3.436.12437
– ident: B2
  doi: 10.1016/j.ejor.2011.02.004
– ident: B10
  doi: 10.1287/mnsc.1080.0945
– ident: B12
  doi: 10.1287/opre.1070.0463
– ident: B26
  doi: 10.1287/opre.2018.1724
– ident: B3
  doi: 10.1287/opre.2014.1298
– ident: B20
  doi: 10.1016/0167-7152(83)90025-1
– ident: B5
  doi: 10.1287/opre.2013.1239
– ident: B8
  doi: 10.1287/moor.1080.0355
– ident: B1
  doi: 10.1287/mnsc.1120.1654
– ident: B15
  doi: 10.1287/moor.1070.0285
– ident: B28
  doi: 10.1287/opre.1070.0471
– ident: B9
  doi: 10.1287/moor.1080.0367
– ident: B29
  doi: 10.1287/opre.1070.0482
– volume-title: Foundations of Inventory Management
  year: 2000
  ident: B27
– ident: B11
  doi: 10.1287/opre.1040.0130
– ident: B23
  doi: 10.1287/opre.2015.1474
– ident: B19
  doi: 10.1137/070704277
– ident: B25
  doi: 10.1287/opre.2016.1514
– ident: B7
  doi: 10.1561/9781680831719
– ident: B18
  doi: 10.1137/1011090
StartPage 1962
SubjectTerms Algorithms
base-stock policy
censored demand
Convergence
Demand
Inventory
Inventory management
lead time
Learning
learning algorithms
lost sales
nonparametric
Regret
regret analysis
Sales
Stochastic models
Title Closing the Gap: A Learning Algorithm for Lost-Sales Inventory Systems with Lead Times
URI https://www.proquest.com/docview/2406309141
Volume 66