A Time Optimal Parallel Algorithm for the Dynamic Programming on the Hierarchical Memory Machine
The Hierarchical Memory Machine (HMM) is a theoretical parallel computing model that captures the essence of the architecture of CUDA-enabled GPUs. The main contribution of this paper is to present an efficient implementation of the $O(n^3)$-time dynamic programming algorithm for solving the optimal trian...
| Published in | 2014 Second International Symposium on Computing and Networking, pp. 86-95 |
|---|---|
| Main Author | Nakano, Koji |
| Format | Conference Proceeding |
| Language | English, Japanese |
| Published | IEEE, 01.12.2014 |
| ISSN | 2379-1888 |
| DOI | 10.1109/CANDAR.2014.14 |
| Abstract | The Hierarchical Memory Machine (HMM) is a theoretical parallel computing model that captures the essence of the architecture of CUDA-enabled GPUs. The main contribution of this paper is an efficient implementation of the $O(n^3)$-time dynamic programming algorithm for solving the optimal triangulation problem for a convex $n$-gon on the HMM. Although the HMM can run many threads in parallel, it is very hard to accelerate computation involving complicated memory access, such as the dynamic programming for the optimal triangulation problem. For problems involving complicated stride memory access, the acceleration rate is often limited to the bandwidth $w$ of the global memory. Quite surprisingly, our implementation of the dynamic programming algorithm for the optimal triangulation problem runs in $O(n^3/w^2)$ time units using $\max(wL, w^2 l)$ threads on the HMM with bandwidth $w$, global memory latency $L$, and shared memory latency $l$. Hence, this parallel algorithm achieves an acceleration rate of more than $w$ even though the dynamic programming algorithm involves complicated stride memory access. We also prove that this parallel algorithm is time optimal when $L = O(wl)$. |
|---|---|
| Author | Nakano, Koji |
| Affiliation | Dept. of Inf. Eng., Hiroshima Univ., Higashi-Hiroshima, Japan |
| CODEN | IEEPAD |
| Discipline | Computer Science |
| EISBN | 1479941522, 9781479941520 |
| Genre | orig-research |
| Subjects | Bandwidth; CUDA; Dynamic programming; GPU; Graphics processing units; Heuristic algorithms; Hidden Markov models; Instruction sets; memory machine models; Optimized production technology; parallel algorithms |
| URI | https://ieeexplore.ieee.org/document/7052167 |
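
For orientation, the sequential $O(n^3)$ dynamic programming that the abstract refers to can be sketched as below. This is only an illustrative sketch, not the paper's HMM implementation: the function name `optimal_triangulation`, the matrix argument `chord_weight`, and the exact recurrence are assumptions chosen for this example. It assumes a convex polygon with vertices numbered 0 to n-1, a symmetric matrix of chord weights, and a triangulation cost equal to the sum of the weights of the chords used (polygon sides cost nothing).

```python
def optimal_triangulation(chord_weight):
    """Minimum total chord weight of a triangulation of a convex n-gon.

    chord_weight[i][j] is the (assumed, illustrative) weight of the chord
    joining vertices i and j; polygon sides are treated as free.
    Runs in O(n^3) time and O(n^2) space. Assumes n >= 3.
    """
    n = len(chord_weight)

    def cw(i, j):
        # Sides of the polygon (adjacent vertices, including 0 and n-1) cost 0.
        if j - i < 2 or (i == 0 and j == n - 1):
            return 0
        return chord_weight[i][j]

    # cost[i][j]: best cost of triangulating the sub-polygon i, i+1, ..., j,
    # not counting the weight of the base segment (i, j) itself.
    cost = [[0] * n for _ in range(n)]
    for length in range(2, n):              # sub-polygon spans length+1 vertices
        for i in range(n - length):
            j = i + length
            # Vertex k forms the triangle (i, k, j) on the base segment (i, j).
            cost[i][j] = min(
                cost[i][k] + cost[k][j] + cw(i, k) + cw(k, j)
                for k in range(i + 1, j)
            )
    return cost[0][n - 1]
```

The three nested loops (over `length`, `i`, and `k`) give the $O(n^3)$ bound. Each table entry reads a row segment and a column segment of `cost`, which is the kind of strided access pattern the abstract describes as hard to accelerate on a GPU.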