Chatbot tutor for higher education – Experimental Design for Optimizing Technical Accuracy

Authors Linda Becker, Michele Martin, Loredana Tiedke, Semih Severengiz
Publication date 2025
Type Lecture

Summary

"While integrating Large Language Models (LLMs) can provide personalised, continuous feedback at scale in large courses, doing so responsibly requires technical accuracy and data privacy. In this study, we present an on-premises, privacy-preserving chatbot tutor for a university course that focuses on energy and sustainability related tasks. This tutor is built using Ollama and Docker, and is aligned to the curriculum via Retrieval-Augmented Generation (RAG). The first phase of a three-phase evaluation involves running a full-factorial experiment (three LLMs × three RAG datasets × six prompts x five repetitions; 270 runs) under an automated Puppeteer protocol. The outputs are rated on seven dimensions, including correctness, hallucinations, pedagogical value and repeatability, that reflect the failure modes observed in external audits. The result is a practical framework for optimising LLM/RAG configurations so that energy-domain learning tasks receive reliable feedback without sending student data off-campus."

Language English
Key words chatbots, education, feedback, RAG, generative AI
Digital Object Identifier 10.48544/f7dea67a-f11a-4f76-93fc-c9d8c789d1b2

General