Analyzing the Execution Time of Different DNN Hidden Layer Types During Inference Under GPU Contention

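Deep neural networks (DNNs), which serve as the models in deep learning, consist of multiple layers. Recent work has studied methods such as splitting a single model into smaller units in order to parallelize computation. To partition a whole model effectively, the time each DNN layer takes to compute during inference must be analyzed. Factors that affect this time include the amount of computation performed in each layer and the availability of resources. In particular, changes in execution time caused by resource contention under workload have not been examined in depth in most prior studies. This paper ... the most ... in deep learning computation...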

Bibliographic Details
Published in	정보과학회 컴퓨팅의 실제 논문지 Vol. 26; no. 10; pp. 463 - 468
Main Authors	유용환(Yong-Hwan Yoo), 정혁진(Hyuk-Jin Jeong), 문수묵(Soo-Mook Moon)
Format	Journal Article
Language	Korean
Published	Korean Institute of Information Scientists and Engineers 01.10.2020 한국정보과학회
Subjects	Computer science; deep neural network (DNN); layer; partitioning; execution time; GPU workloads; resource contention
ISSN	2383-6318 2383-6326
DOI	10.5626/KTCP.2020.26.10.463

Abstract	Deep neural networks (DNNs) comprise multiple layers. Several recent studies have proposed dividing a large DNN into multiple partitions and parallelizing their computation. To partition and execute a network efficiently, we need to know the execution time of each DNN layer. Many previous studies have estimated this execution time from the amount of computation performed in the layer. Another critical factor is the availability of computation resources; in particular, layer execution time under resource contention has not been studied thoroughly. In this paper, we focus on the graphics processing unit (GPU), currently the most widely used hardware for accelerating DNN computation. We vary the level of concurrent workload on a GPU and measure the execution time of several core DNN layers. Using a decision tree trained on these measurements, we analyze which factors affect each layer's execution time and their relative importance. We also compare two regression models that can be used to predict layer execution time based on this information. KCI Citation Count: 0
DocumentTitleAlternate	Analyzing the Execution Time of Different DNN Hidden Layer Types During Inference Under GPU Contention
Title	GPU 부하 상황에서 DNN 추론 시 은닉 계층별 수행시간 분석
URI	https://www.dbpia.co.kr/journal/articleDetail?nodeId=NODE10475085 https://www.kci.go.kr/kciportal/ci/sereArticleSearch/ciSereArtiView.kci?sereArticleSearchBean.artiId=ART002636701