Master thesis presentation: Benchmarking Large Language Models for Vulnerability Detection: Comparing Local and Cloud LLMs

26 May 2026 10:15 to 11:00 | Thesis defence

Alexandra Pykälistö and Karl Müller-Uri present their master thesis on May 26 in E:3139.

Abstract: This thesis investigates whether locally fine-tuned LLMs can be used to discover and flag memory-related security flaws in C and C++ code. Five locally fine-tuned models were examined and compared with each other, with their non-fine-tuned versions, and with proprietary cloud models. The models were given functions taken from C/C++ projects and asked to determine whether each function was vulnerable.

Two prompting methods were used during the evaluation: zero-shot prompting and few-shot prompting. After each evaluation, performance metrics such as accuracy and F1-score were calculated. We show that while fine-tuning improved the performance of the local models with respect to F1-score, their ability to detect vulnerabilities remained unsatisfactory. The highest-performing model, CodeLlama 7B, achieved an F1-score of only 0.12. However, the cloud models, which have orders of magnitude more parameters and more extensive pre-training, did not outperform this result, which indicates that the methods used in the thesis were suboptimal.
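For readers unfamiliar with the metrics named in the abstract, the following is a minimal sketch of how accuracy and F1-score can be computed for a binary "vulnerable / not vulnerable" classifier. It is not the evaluation code from the thesis: the function names and the example labels are invented for illustration, and the label counts are made up to show how accuracy can look high while F1-score stays low when vulnerable functions are rare.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives; 1 means 'vulnerable'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Fraction of all predictions that match the label."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the 'vulnerable' class."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical data (not from the thesis): 100 functions, 10 of them vulnerable.
y_true = [1] * 10 + [0] * 90
# The imagined model finds 1 vulnerable function, misses 9, and raises 2 false alarms.
y_pred = [1] + [0] * 9 + [1] * 2 + [0] * 88

print(f"accuracy = {accuracy(y_true, y_pred):.2f}")
print(f"F1-score = {f1_score(y_true, y_pred):.2f}")
```

In this invented example most predictions are correct simply because most functions are not vulnerable, so accuracy alone says little about detection ability. F1-score only rewards correctly flagged vulnerable functions, which is one reason it is a common headline metric for this kind of task.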

Supervisor: Christian Gehrmann

Examiner: Thomas Johansson

About the event

Location:

E:3139

Contact:

susanna [dot] lonnqvist [at] eit [dot] lth [dot] se
