Knowledge Graph-Driven Training Data Construction for Urban Flood-Traffic Scenario Generation Using Small Language Models

Geunhwi Park; Juneyoung Park; Chunjoo Yoon; Jaehong Park

doi:10.32604/cmc.2026.081652

Open Access

ARTICLE

Geunhwi Park¹, Juneyoung Park²,*, Chunjoo Yoon³, Jaehong Park³

1 Department of Smart City Engineering, Hanyang University, Ansan-si, Republic of Korea
2 Department of Transportation and Logistics Engineering, Hanyang University, Ansan-si, Republic of Korea
3 Department of Highway & Transportation Research, Korea Institute of Civil Engineering and Building Technology, Goyang-si, Republic of Korea

* Corresponding Author: Juneyoung Park

Computers, Materials & Continua 2026, 88(2), 89 https://doi.org/10.32604/cmc.2026.081652

Received 06 March 2026; Accepted 14 May 2026; Issue published 15 June 2026

Abstract

Urban flooding caused by extreme rainfall events disrupts transportation systems, yet generating realistic flood-traffic scenarios for disaster preparedness remains a labor-intensive manual process. This study proposes a Knowledge Graph (KG)-driven pipeline that automatically generates domain-specific training data for fine-tuning small language models (sLLMs) to synthesize urban flood-traffic scenarios. A domain KG comprising 58 entities and 285 relationships was constructed for Jinju City, South Korea, integrating empirical flood data from 112 local documents with quantitative rainfall-traffic impact values from 14 international studies. Nine domain constraint rules, including a novel spatial consistency rule, ensure the physical plausibility of generated scenarios. Through constrained weighted graph walks, 800 semi-structured English narrative scenarios were automatically generated in approximately 5 min, substantially reducing the labor required compared to manual creation. Three sLLMs spanning different architectures and parameter scales—Flan-T5-Large (770M), Qwen2.5-3B-Instruct (3B), and Qwen2.5-7B-Instruct (7B)—were fine-tuned using QLoRA on a single GPU with 16 GB VRAM. Evaluation on 78 test samples demonstrated consistent performance improvements with increasing model scale: Qwen2.5-7B achieved BLEU-4 of 0.5524, ROUGE-L of 0.6883, BERTScore F1 of 0.9662, and KG Fact Consistency of 1.0000, representing a 33.8% BLEU-4 improvement over Flan-T5-Large. Both Qwen models achieved KG Fact Consistency of 1.0000. The 3B model achieved 98.6% of the 7B model’s BLEU-4 at 53% of the VRAM cost with identical factual consistency, representing the most cost-effective configuration. All models were trained for 10 epochs on the same GPU, demonstrating practical feasibility for municipal disaster response deployment.
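The constrained weighted graph walk mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the walk procedure, the nine rules, or the KG schema, so everything below is an assumption made for illustration: the toy graph `TOY_KG`, the `DISTRICT` lookup, the single spatial rule in `spatially_consistent`, and the helper names `constrained_walk` and `render` are hypothetical and are not taken from the paper.

```python
import random

# Hypothetical toy graph: node -> list of (relation, neighbor, weight).
# Entities and weights are illustrative only, not the paper's Jinju City KG.
TOY_KG = {
    "heavy_rain": [("causes", "river_overflow", 0.7), ("causes", "road_ponding", 0.3)],
    "river_overflow": [("floods", "underpass_A", 0.6), ("floods", "bridge_B", 0.4)],
    "road_ponding": [("floods", "underpass_A", 1.0)],
    "underpass_A": [("triggers", "road_closure", 0.8), ("triggers", "speed_reduction", 0.2)],
    "bridge_B": [("triggers", "speed_reduction", 1.0)],
    "road_closure": [("reroutes_to", "detour_C", 0.5), ("reroutes_to", "detour_D", 0.5)],
    "speed_reduction": [],
    "detour_C": [],
    "detour_D": [],
}

# Hypothetical district lookup used by the assumed spatial consistency rule.
DISTRICT = {"underpass_A": "east", "bridge_B": "west", "detour_C": "east", "detour_D": "west"}


def spatially_consistent(path, candidate):
    """Assumed rule: all location nodes on one walk share a district."""
    if candidate not in DISTRICT:
        return True  # non-location nodes are not spatially restricted
    seen = {DISTRICT[node] for node in path if node in DISTRICT}
    return not seen or DISTRICT[candidate] in seen


def constrained_walk(start, rng, max_steps=6):
    """Walk the graph, sampling each step by edge weight among allowed edges."""
    path = [start]
    triples = []
    for _ in range(max_steps):
        options = [
            (rel, nxt, weight)
            for rel, nxt, weight in TOY_KG.get(path[-1], [])
            if nxt not in path and spatially_consistent(path, nxt)
        ]
        if not options:
            break  # dead end or every outgoing edge violates a constraint
        weights = [weight for _, _, weight in options]
        rel, nxt, _ = rng.choices(options, weights=weights, k=1)[0]
        triples.append((path[-1], rel, nxt))
        path.append(nxt)
    return triples


def render(triples):
    """Turn sampled triples into a crude semi-structured narrative."""
    clean = lambda s: s.replace("_", " ")
    return " ".join(f"{clean(h)} {clean(r)} {clean(t)}." for h, r, t in triples)


if __name__ == "__main__":
    walk = constrained_walk("heavy_rain", random.Random(0))
    print(render(walk))
```

Filtering candidate edges before sampling, rather than rejecting finished walks, keeps every generated scenario valid by construction; whether the paper enforces its rules this way is not stated in the abstract.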

Keywords

Knowledge graph; training data generation; urban flood; traffic scenario; small language model; fine-tuning; text generation; QLoRA

Cite This Article

APA Style

Park, G., Park, J., Yoon, C., & Park, J. (2026). Knowledge Graph-Driven Training Data Construction for Urban Flood-Traffic Scenario Generation Using Small Language Models. Computers, Materials & Continua, 88(2), 89. https://doi.org/10.32604/cmc.2026.081652

Vancouver Style

Park G, Park J, Yoon C, Park J. Knowledge Graph-Driven Training Data Construction for Urban Flood-Traffic Scenario Generation Using Small Language Models. Comput Mater Contin. 2026;88(2):89. https://doi.org/10.32604/cmc.2026.081652

IEEE Style

G. Park, J. Park, C. Yoon, and J. Park, “Knowledge Graph-Driven Training Data Construction for Urban Flood-Traffic Scenario Generation Using Small Language Models,” Comput. Mater. Contin., vol. 88, no. 2, p. 89, 2026. https://doi.org/10.32604/cmc.2026.081652

Copyright © 2026 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
