Approximate Subgraph Matching for Large-Scale Graphs in a Spark Environment

Bibliographic Details
Published in	KIISE Transactions on Computing Practices (정보과학회 컴퓨팅의 실제 논문지), Vol. 24, no. 9, pp. 463-469
Main Authors	Jongtae Lim (임종태), Dojin Choi (최도진), Dongmin Seo (서동민), Seok Jong Yu (유석종), Kyoungsoo Bok (복경수), Jaesoo Yoo (유재수)
Format	Journal Article
Language	Korean
Published	Korean Institute of Information Scientists and Engineers (한국정보과학회), 01.09.2018
Subjects	computer science; big data; approximate subgraph matching; Apache Spark; graph analysis; large-scale graph
ISSN	2383-6318 2383-6326
DOI	10.5626/KTCP.2018.24.9.463

Summary:	Advances in experimental equipment have led to rapid growth in scientific data in fields such as astronomy, cosmology, biology, and the humanities, and graph data accounts for a large share of it. Approximate subgraph matching is an analysis technique that searches a target graph for subgraphs similar to a query graph, and it is widely used in applications and research across many fields. Existing approximate subgraph matching schemes, however, are designed to run on a single server and do not consider distributed computing environments, which limits their ability to process large-scale graphs. This paper proposes an approximate subgraph matching scheme for large-scale graphs in a Spark environment. The proposed scheme runs on a distributed big data processing platform to handle large-scale graph data, and it uses more efficient pruning, similarity computation, and result-return techniques to improve the pruning efficiency and speed of approximate subgraph matching. KCI Citation Count: 0