HiBid: A Cross-Channel Constrained Bidding System With Budget Allocation by Hierarchical Offline Deep Reinforcement Learning
Online display advertising platforms service numerous advertisers by providing real-time bidding (RTB) at the scale of billions of ad requests every day. The bidding strategy handles ad requests across multiple channels to maximize the number of clicks under the set financial constraints, i.e., tota...
| Published in | IEEE transactions on computers Vol. 73; no. 3; pp. 815 - 828 |
|---|---|
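| Main Authors | Wang, Hao; Tang, Bo; Liu, Chi Harold; Mao, Shangqin; Zhou, Jiahong; Dai, Zipeng; Sun, Yaqi; Xie, Qianlong; Wang, Xingxing; Wang, Dong |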
| Format | Journal Article |
| Language | English |
| Published | New York: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.03.2024 |
| ISSN | 0018-9340 (EISSN 1557-9956) |
| DOI | 10.1109/TC.2023.3343111 |
| Abstract | Online display advertising platforms service numerous advertisers by providing real-time bidding (RTB) at the scale of billions of ad requests every day. The bidding strategy handles ad requests across multiple channels to maximize the number of clicks under set financial constraints such as total budget and cost-per-click (CPC). Unlike existing work, which mainly focuses on single-channel bidding, we explicitly consider cross-channel constrained bidding with budget allocation. Specifically, we propose a hierarchical offline deep reinforcement learning (DRL) framework called "HiBid", consisting of a high-level planner equipped with an auxiliary loss for non-competitive budget allocation, and a data-augmentation-enhanced low-level executor that adapts its bidding strategy to the allocated budgets. Additionally, a CPC-guided action selection mechanism is introduced to satisfy the cross-channel CPC constraint. Through extensive experiments on both large-scale log data and online A/B testing, we confirm that HiBid outperforms six baselines in terms of the number of clicks, CPC satisfactory ratio, and return-on-investment (ROI). HiBid has also been deployed on the Meituan advertising platform, where it already serves tens of thousands of advertisers every day. |
| Author | Wang, Xingxing; Tang, Bo; Wang, Dong; Sun, Yaqi; Zhou, Jiahong; Xie, Qianlong; Liu, Chi Harold; Mao, Shangqin; Dai, Zipeng; Wang, Hao |
| Author_xml | – sequence: 1 givenname: Hao orcidid: 0009-0004-0199-0488 surname: Wang fullname: Wang, Hao organization: School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China – sequence: 2 givenname: Bo orcidid: 0000-0001-7129-0250 surname: Tang fullname: Tang, Bo organization: Meituan, Beijing, China – sequence: 3 givenname: Chi Harold orcidid: 0000-0002-0252-329X surname: Liu fullname: Liu, Chi Harold email: liuchi02@gmail.com organization: School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China – sequence: 4 givenname: Shangqin orcidid: 0000-0002-3247-0483 surname: Mao fullname: Mao, Shangqin organization: Meituan, Beijing, China – sequence: 5 givenname: Jiahong orcidid: 0000-0002-1319-2369 surname: Zhou fullname: Zhou, Jiahong organization: Meituan, Beijing, China – sequence: 6 givenname: Zipeng orcidid: 0000-0002-2479-9801 surname: Dai fullname: Dai, Zipeng organization: School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China – sequence: 7 givenname: Yaqi orcidid: 0009-0001-3961-7652 surname: Sun fullname: Sun, Yaqi organization: Meituan, Beijing, China – sequence: 8 givenname: Qianlong orcidid: 0000-0002-5400-7924 surname: Xie fullname: Xie, Qianlong organization: Meituan, Beijing, China – sequence: 9 givenname: Xingxing orcidid: 0000-0001-5495-0827 surname: Wang fullname: Wang, Xingxing organization: Meituan, Beijing, China – sequence: 10 givenname: Dong orcidid: 0000-0002-1964-3984 surname: Wang fullname: Wang, Dong organization: Meituan, Beijing, China |
| CODEN | ITCOB4 |
| CitedBy_id | crossref_primary_10_1109_TKDE_2024_3523472 |
| Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2024 |
| Discipline | Engineering Computer Science |
| Genre | orig-research |
| GrantInformation_xml | – fundername: National Natural Science Foundation of China grantid: U23A20310; U21A20519 funderid: 10.13039/501100001809 |
| IsPeerReviewed | true |
| IsScholarly | true |
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
| PageCount | 14 |
| PublicationDate | 2024-03-01 |
| PublicationTitleAbbrev | TC |
| SubjectTerms | Advertising; Budgets; Constraints; Costs; cross-channel bidding; Data augmentation; Deep learning; deep reinforcement learning; Real-time bidding systems; Real-time systems; Reinforcement learning; Resource management; Strategy; Training |