A Time Optimal Parallel Algorithm for the Dynamic Programming on the Hierarchical Memory Machine

Bibliographic Details
Published in	2014 Second International Symposium on Computing and Networking, pp. 86-95
Main Author	Nakano, Koji
Format	Conference Proceeding
Language	English; Japanese
Published	IEEE 01.12.2014
Subjects	Bandwidth; CUDA; Dynamic programming; GPU; Graphics processing units; Heuristic algorithms; Hidden Markov models; Instruction sets; memory machine models; Optimized production technology; parallel algorithms
ISSN	2379-1888
DOI	10.1109/CANDAR.2014.14

Abstract	The Hierarchical Memory Machine (HMM) is a theoretical parallel computing model that captures the essence of the architecture of CUDA-enabled GPUs. The main contribution of this paper is an efficient implementation, on the HMM, of the $O(n^3)$-time dynamic programming algorithm for the optimal triangulation problem for a convex $n$-gon. Although the HMM can run many threads in parallel, it is very hard to accelerate computations that involve complicated memory access, such as the dynamic programming for the optimal triangulation problem. For problems involving complicated stride memory access, the acceleration rate is often limited to the bandwidth $w$ of the global memory. Quite surprisingly, our implementation of the dynamic programming algorithm runs in $O(n^3/w^2)$ time units using $\max(wL, w^2 l)$ threads on the HMM with bandwidth $w$, global memory latency $L$, and shared memory latency $l$. Hence, this parallel algorithm achieves an acceleration rate of more than $w$ even though the dynamic programming algorithm involves complicated stride memory access. We also prove that this parallel algorithm is time optimal when $L = O(wl)$.
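
For orientation, the sequential $O(n^3)$ dynamic programming that the abstract refers to can be sketched as follows. This is a textbook-style formulation and not the paper's HMM implementation. The abstract does not state the cost model, so the sketch assumes that each chord of the polygon carries a weight and that the goal is to minimize the total weight of the chords used. The names `optimal_triangulation` and `chord_weight` are hypothetical.

```python
def optimal_triangulation(chord_weight):
    """Minimum total chord weight of a triangulation of a convex n-gon.

    chord_weight is an n x n matrix (assumed cost model): chord_weight[a][b]
    is the weight of the chord joining vertices a and b, for a < b.
    Entries for polygon edges (adjacent vertices) are ignored.
    """
    n = len(chord_weight)
    if n < 3:
        raise ValueError("a polygon needs at least 3 vertices")

    def w(a, b):
        # Inside the sub-polygon v_a..v_b, adjacent vertices form a polygon
        # edge, which costs nothing; anything longer is a real chord.
        return chord_weight[a][b] if b - a >= 2 else 0

    # best[i][j]: minimum chord weight needed to triangulate the
    # sub-polygon v_i, v_{i+1}, ..., v_j (the side (i, j) is not counted).
    best = [[0] * n for _ in range(n)]

    # Three nested loops (span, i, k) give the O(n^3) running time.
    for span in range(3, n):
        for i in range(n - span):
            j = i + span
            best[i][j] = min(
                best[i][k] + best[k][j] + w(i, k) + w(k, j)
                for k in range(i + 1, j)
            )
    return best[0][n - 1]
```

Each entry `best[i][j]` reads the two runs `best[i][i+1..j-1]` and `best[i+1..j-1][j]`, one along a row and one along a column of the table. An access pattern of this kind is presumably what the abstract means by complicated stride memory access.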
Author affiliation	Nakano, Koji: Dept. of Inf. Eng., Hiroshima Univ., Higashi-Hiroshima, Japan
CODEN	IEEPAD
Discipline	Computer Science
EISBN	1479941522 9781479941520
PageCount	10
URI	https://ieeexplore.ieee.org/document/7052167