Closing the Gap: A Learning Algorithm for Lost-Sales Inventory Systems with Lead Times

We consider a periodic-review, single-product inventory system with lost sales and positive lead times under censored demand. In contrast to the classical inventory literature, we assume the firm does not know the demand distribution a priori and makes an adaptive inventory-ordering decision in each...

Bibliographic Details
Published in	Management Science, Vol. 66, no. 5, pp. 1962-1980
Main Authors	Zhang, Huanan; Chao, Xiuli; Shi, Cong
Format	Journal Article
Language	English
Published	Linthicum: INFORMS (Institute for Operations Research and the Management Sciences), 01.05.2020
Subjects	Algorithms; base-stock policy; censored demand; Convergence; Demand; Inventory; Inventory management; lead time; Learning; learning algorithms; lost sales; nonparametric; Regret; regret analysis; Sales; Stochastic models
ISSN	0025-1909 (print), 1526-5501 (online)
DOI	10.1287/mnsc.2019.3288

Abstract	We consider a periodic-review, single-product inventory system with lost sales and positive lead times under censored demand. In contrast to the classical inventory literature, we assume the firm does not know the demand distribution a priori and makes an adaptive inventory-ordering decision in each period based only on the past sales (censored demand) data. The standard performance measure is regret, which is the cost difference between a learning algorithm and the clairvoyant (full-information) benchmark. When the benchmark is chosen to be the (full-information) optimal base-stock policy, Huh et al. [Huh WT, Janakiraman G, Muckstadt JA, Rusmevichientong P (2009a) An adaptive algorithm for finding the optimal base-stock policy in lost sales inventory systems with censored demand. Math. Oper. Res. 34(2):397–416.] developed a nonparametric learning algorithm with a cubic-root convergence rate on regret. An important open question is whether there exists a nonparametric learning algorithm whose regret rate matches the theoretical lower bound for any learning algorithm. In this work, we provide an affirmative answer to this question. More precisely, we propose a new nonparametric algorithm termed the simulated cycle-update policy and establish a square-root convergence rate on regret, which is proven to be the lower bound for any learning algorithm. Our algorithm uses a random cycle-updating rule based on an auxiliary simulated system running in parallel and also involves two new concepts, namely the withheld on-hand inventory and the double-phase cycle gradient estimation. The techniques developed are effective for learning a stochastic system with complex system dynamics and lasting impact of decisions. This paper was accepted by Yinyu Ye, optimization.
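To make the setting in the abstract concrete, the following is a minimal sketch of a lost-sales system with a positive lead time run under a fixed base-stock (order-up-to) level, together with a sample-path stand-in for regret. It is not the simulated cycle-update policy from the paper. The function names, the holding cost `h`, the lost-sales penalty `p`, the starting state, and the finite grid of candidate levels are all assumptions made for illustration; the paper measures regret in expectation against the optimal base-stock policy rather than on one demand path.

```python
import random


def simulate_base_stock(S, demands, lead_time, h=1.0, p=4.0):
    """Run an order-up-to-S policy with lost sales; return (total cost, observed sales)."""
    if lead_time < 1:
        raise ValueError("the setting assumes a positive lead time")
    on_hand = float(S)                 # assumed start: full position on the shelf
    pipeline = [0.0] * lead_time       # pipeline[0] is the next order to arrive
    total_cost = 0.0
    sales = []
    for d in demands:
        on_hand += pipeline.pop(0)     # receive the order placed lead_time periods ago
        # Raise the inventory position (on hand + outstanding orders) back to S.
        order = max(0.0, S - on_hand - sum(pipeline))
        pipeline.append(order)
        sold = min(on_hand, d)         # unmet demand is lost, not backlogged
        lost = d - sold
        on_hand -= sold
        # The simulator sees `lost`; the firm would only ever see `sold` (censoring).
        total_cost += h * on_hand + p * lost
        sales.append(sold)
    return total_cost, sales


def empirical_regret(S, demands, lead_time, grid, h=1.0, p=4.0):
    """Cost of level S minus the best cost over `grid` on the same demand path."""
    best = min(simulate_base_stock(s, demands, lead_time, h, p)[0] for s in grid)
    return simulate_base_stock(S, demands, lead_time, h, p)[0] - best


if __name__ == "__main__":
    rng = random.Random(0)             # fixed seed so the demo is repeatable
    demands = [rng.expovariate(0.5) for _ in range(1000)]
    grid = [0.5 * k for k in range(21)]
    for S in (2.0, 5.0, 8.0):
        print(S, empirical_regret(S, demands, lead_time=2, grid=grid))
```

With `S = 0` the observed sales are all zero whatever the true demand is, which shows why learning from sales data alone is harder than learning from demand data. A learning algorithm of the kind the abstract describes would have to adjust `S` over time using only the `sales` output.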
Author details	1. Zhang, Huanan (ORCID 0000-0002-0672-5227), Harold and Inge Marcus Department of Industrial and Manufacturing Engineering, Pennsylvania State University, University Park, Pennsylvania 16802
	2. Chao, Xiuli (ORCID 0000-0001-5233-4385), Department of Industrial and Operations Engineering, University of Michigan, Ann Arbor, Michigan 48105
	3. Shi, Cong (ORCID 0000-0003-3564-3391), Harold and Inge Marcus Department of Industrial and Manufacturing Engineering, Pennsylvania State University, University Park, Pennsylvania 16802
Copyright	Copyright Institute for Operations Research and the Management Sciences May 2020
Discipline	Business
Genre	Research Articles
IsPeerReviewed	true
IsScholarly	true
PageCount	19
URI	https://www.proquest.com/docview/2406309141