SC-LKM: A Semantic Chunking and Large Language Model-based Cybersecurity Knowledge Graph Construction Method

Pu Wang
Yangsen Zhang
ZiCheng Zhou
YuQi Wang

Published on May 12, 2025


Abstract

In cybersecurity, an accurate knowledge graph is crucial for identifying the key entities and relationships in security incidents amid complex threats and massive volumes of unstructured data. Traditional knowledge graph construction methods rely on rules or machine learning models, but they are limited when handling large-scale unstructured data. The GraphRAG framework constructs a knowledge graph by segmenting documents, but the fragmentation and semantic discontinuity introduced during segmentation degrade the integrity and accuracy of the resulting graph. To address these challenges, this paper proposes SC-LKM, a cybersecurity knowledge graph construction method that combines the GraphRAG framework with semantic chunking. SC-LKM uses semantic chunking to construct a cybersecurity event knowledge graph accurately, overcoming the fragmentation and semantic inconsistency found in traditional methods and in GraphRAG. For fine-grained processing of unstructured documents, a hierarchical text segmentation strategy ensures logical consistency between knowledge units and addresses semantic fragmentation and information loss. In addition, SC-LKM integrates the semantic understanding capability of Qwen2.5-14B-Instruct, which significantly improves extraction accuracy and reasoning quality. Experimental results show that SC-LKM outperforms the baseline methods in entity recognition coverage, topology density, and semantic consistency.
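
The general idea of semantic chunking mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the algorithm, and the sketch is flat rather than hierarchical. The function names, the bag-of-words vectors (a toy stand-in for the sentence embeddings a real system would use), and the `threshold` value are all assumptions made for illustration. Adjacent sentences stay in the same chunk while they remain similar, and a new chunk starts when similarity drops.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def bow_vector(sentence):
    """Toy stand-in for a sentence embedding: lowercase token counts."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_chunks(text, threshold=0.2):
    """Group consecutive sentences; start a new chunk when the similarity
    to the previous sentence falls below the (assumed) threshold."""
    chunks = []
    for sentence in split_sentences(text):
        if chunks and cosine(bow_vector(chunks[-1][-1]), bow_vector(sentence)) >= threshold:
            chunks[-1].append(sentence)
        else:
            chunks.append([sentence])
    return [" ".join(chunk) for chunk in chunks]


if __name__ == "__main__":
    report = (
        "The attacker sent a phishing email. "
        "The phishing email carried a malicious macro. "
        "Quarterly revenue grew by ten percent."
    )
    for chunk in semantic_chunks(report):
        print(chunk)
```

In this toy example the two phishing sentences share vocabulary and are kept together, while the unrelated revenue sentence starts a new chunk. Keeping related sentences in one unit is the property that, per the abstract, helps preserve entities and relationships that fixed-size segmentation might split apart.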
