Open Access
ARTICLE
Fine Tuned QA Models for Java Programming
Dr. G.Y. Pathrikar College of CS and IT, MGM University, Chhatrapati Sambhaji Nagar, India
* Corresponding Author: Jeevan Pralhad Tonde. Email:
(This article belongs to the Special Issue: Advances in Artificial Intelligence for Engineering and Sciences)
Journal on Artificial Intelligence 2026, 8, 107-118. https://doi.org/10.32604/jai.2026.075857
Received 10 November 2025; Accepted 14 January 2026; Issue published 13 February 2026
Abstract
As education continues to evolve alongside artificial intelligence, there is growing interest in how large language models (LLMs) can support more personalized and intelligent learning experiences. This study focuses on building a domain-specific question answering (QA) system tailored to computer science education, with a particular emphasis on Java programming. While transformer-based models such as BERT, RoBERTa, and DistilBERT have demonstrated strong performance on general-purpose datasets like SQuAD, they often struggle with technical educational content where annotated data is scarce. To address this challenge, we developed a custom dataset, JavaFactoidQA, consisting of 1000 fact-based question–answer pairs derived from Java course materials and textbooks. A two-step fine-tuning strategy was adopted, in which models were first fine-tuned on the SQuAD dataset to capture general language understanding and subsequently fine-tuned on the Java-specific dataset to adapt to programming terminology and structure. Experimental results show that RoBERTa-Base achieved the best performance, with an F1 score of 88.7% and an Exact Match (EM) score of 82.4%, followed closely by BERT-Base and DistilBERT. The results were further compared with domain-specific QA models from healthcare and finance, demonstrating that the proposed approach performs competitively despite using a relatively small dataset. Overall, this study shows that careful dataset design combined with sequential fine-tuning enables effective adaptation of transformer-based QA models for educational applications, including automated assessment, intelligent tutoring, and interactive learning environments. Future work will explore extending the approach to additional subjects, incorporating cognitive-level tagging, and evaluating performance on broader educational QA benchmarks.
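The abstract reports Exact Match (EM) and F1 scores without showing how they are computed. The following is a minimal sketch of how these two metrics are commonly calculated for extractive QA, assuming SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles). The function names and the sample answers are hypothetical illustrations, not taken from the paper or from JavaFactoidQA, and the authors' actual scoring script may differ.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """SQuAD-style normalization (assumed here, not confirmed by the paper)."""
    text = text.lower()
    # Drop ASCII punctuation, then English articles, then collapse whitespace.
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized strings are identical, otherwise 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between a predicted answer and a reference answer."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    if not pred_tokens or not ref_tokens:
        # If either side is empty, score 1.0 only when both are empty.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Hypothetical answers, for illustration only.
em = exact_match("the JVM", "JVM")
f1 = f1_score("Java Virtual Machine", "the Java Virtual Machine (JVM)")
```

In this sketch, a prediction that differs from the reference only by an article still counts as an exact match, while a partially overlapping answer earns partial F1 credit. Corpus-level scores such as those reported above would be the mean of these per-question values, expressed as percentages.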
Copyright © 2026 The Author(s). Published by Tech Science Press. This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

