LogKernel: A Threat Hunting Approach Based on Behaviour Provenance Graph and Graph Kernel Clustering

Bibliographic Details
Published in Security and Communication Networks Vol. 2022; pp. 1-16
Main Authors Li, Jiawei; Zhang, Ru; Liu, Jianyi; Liu, Gongshen
Format Journal Article
Language English
Published London Hindawi 27.09.2022
John Wiley & Sons, Inc
ISSN 1939-0114
1939-0122
DOI 10.1155/2022/4577141

Abstract Cyber threat hunting is a proactive search for hidden threats in an organization’s information system. It is a crucial component of active defense against advanced persistent threats (APTs). However, most current threat hunting methods rely on Cyber Threat Intelligence (CTI), which can find known attacks but cannot find unknown attacks that have not been disclosed by CTI. In this paper, we propose LogKernel, a threat hunting method based on graph kernel clustering that can effectively separate attack behaviour from benign activities. LogKernel first abstracts system audit logs into behaviour provenance graphs (BPGs) and then clusters the graphs by embedding them into a continuous space using a graph kernel. In particular, we design a new graph kernel clustering method based on the characteristics of BPGs, which captures both the structural information and the rich label information of the BPGs. To reduce false positives, LogKernel further quantifies the threat of abnormal behaviour. We evaluate LogKernel on a malicious dataset, which includes seven simulated attack scenarios, and on the DARPA CADETS dataset, which includes four attack scenarios. The results show that LogKernel can hunt all of these attack scenarios and that, unlike state-of-the-art methods, it can also find unknown attacks.
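The abstract does not spell out LogKernel's own kernel, so the following is only a rough illustration of the general idea of graph kernel clustering on labelled graphs. It swaps in the well-known Weisfeiler-Lehman subtree kernel and a greedy similarity-threshold grouping; neither is claimed to be the authors' method. The function names, the toy graphs, the node labels ("proc", "file", "sock"), and the 0.8 threshold are all assumptions made for this sketch.

```python
from collections import Counter
from math import sqrt


def wl_features(adj, labels, iterations=2):
    """Count Weisfeiler-Lehman subtree signatures of a labelled graph.

    adj: dict mapping each node to a list of its neighbours (hypothetical input format).
    labels: dict mapping each node to a string label such as "proc" or "file".
    Labels are assumed not to contain "(", ")" or "," so signatures stay unambiguous.
    """
    current = dict(labels)
    feats = Counter(current.values())
    for _ in range(iterations):
        nxt = {}
        for node, nbrs in adj.items():
            # New label = own label plus the sorted multiset of neighbour labels.
            nbr_part = ",".join(sorted(current[n] for n in nbrs))
            nxt[node] = current[node] + "(" + nbr_part + ")"
        current = nxt
        feats.update(current.values())
    return feats


def kernel(f, g):
    """Dot product of two signature-count vectors."""
    return sum(f[k] * g[k] for k in f.keys() & g.keys())


def normalized_kernel(f, g):
    """Cosine-style normalisation so that identical graphs score 1."""
    return kernel(f, g) / sqrt(kernel(f, f) * kernel(g, g))


def cluster(feature_list, threshold=0.8):
    """Greedy grouping: join the first cluster whose representative is similar enough."""
    clusters = []
    for i, f in enumerate(feature_list):
        for members in clusters:
            if normalized_kernel(f, feature_list[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Toy behaviour graphs (assumed data): g1 and g2 share a shape, g3 is a chain.
g1 = ({"a": ["b", "c"], "b": ["a"], "c": ["a"]},
      {"a": "proc", "b": "file", "c": "sock"})
g2 = ({"p": ["q", "r"], "q": ["p"], "r": ["p"]},
      {"p": "proc", "q": "file", "r": "sock"})
g3 = ({"x": ["y"], "y": ["x", "z"], "z": ["y"]},
      {"x": "proc", "y": "proc", "z": "file"})

features = [wl_features(adj, lab) for adj, lab in (g1, g2, g3)]
groups = cluster(features)
print(groups)
```

In this toy setting a graph that ends up alone in its cluster would be the candidate for closer inspection; a density-based clusterer could replace the greedy grouping without changing the kernel step.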
Author Liu, Gongshen
Zhang, Ru
Liu, Jianyi
Li, Jiawei
Author_xml – sequence: 1
  givenname: Jiawei
  orcidid: 0000-0003-2611-1852
  surname: Li
  fullname: Li, Jiawei
  organization: Beijing University of Posts and Telecommunications, Beijing 100876, China (bupt.edu.cn)
– sequence: 2
  givenname: Ru
  orcidid: 0000-0001-6641-3236
  surname: Zhang
  fullname: Zhang, Ru
  organization: Beijing University of Posts and Telecommunications, Beijing 100876, China (bupt.edu.cn)
– sequence: 3
  givenname: Jianyi
  orcidid: 0000-0003-3133-4452
  surname: Liu
  fullname: Liu, Jianyi
  organization: Beijing University of Posts and Telecommunications, Beijing 100876, China (bupt.edu.cn)
– sequence: 4
  givenname: Gongshen
  orcidid: 0000-0001-5194-1570
  surname: Liu
  fullname: Liu, Gongshen
  organization: Shanghai Jiao Tong University, Shanghai 200240, China (sjtu.edu.cn)
ContentType Journal Article
Copyright Copyright © 2022 Jiawei Li et al.
Copyright © 2022 Jiawei Li et al. This is an open access article distributed under the Creative Commons Attribution License (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. https://creativecommons.org/licenses/by/4.0
Discipline Engineering
EISSN 1939-0122
Editor Aswani Kumar, Ch
EndPage 16
GrantInformation_xml – fundername: Fundamental Research Funds for the Central Universities
  grantid: 2021XD-A11-1
– fundername: National Natural Science Foundation of China
  grantid: U1936216; U21B2020
ISSN 1939-0114
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Language English
License This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
https://creativecommons.org/licenses/by/4.0
ORCID 0000-0001-6641-3236
0000-0003-3133-4452
0000-0003-2611-1852
0000-0001-5194-1570
OpenAccessLink https://dx.doi.org/10.1155/2022/4577141
PageCount 16
PublicationCentury 2000
PublicationDate 2022-09-27
PublicationDateYYYYMMDD 2022-09-27
PublicationDate_xml – month: 09
  year: 2022
  text: 2022-09-27
  day: 27
PublicationDecade 2020
PublicationPlace London
PublicationPlace_xml – name: London
PublicationTitle Security and communication networks
PublicationYear 2022
Publisher Hindawi
John Wiley & Sons, Inc
Publisher_xml – name: Hindawi
– name: John Wiley & Sons, Inc
StartPage 1
SubjectTerms Accuracy
Audits
Behavior
Clustering
Datasets
Design
Graphs
Hunting
Information systems
Intelligence gathering
Kernels
Knowledge
Methods
Performance evaluation
Search process
Semantics
Subject specialists
Threat evaluation
Threats
Title LogKernel: A Threat Hunting Approach Based on Behaviour Provenance Graph and Graph Kernel Clustering
URI https://dx.doi.org/10.1155/2022/4577141
https://www.proquest.com/docview/2722973723
Volume 2022