A System for Automated Unit Test Generation using Large Language Models and Assessment of Generated Test Suites / Lops, Andrea; Narducci, Fedelucio; Ragone, Azzurra; Trizio, Michelantonio; Bartolini, Claudio. - (2025), pp. 29-36. (18th IEEE International Conference on Software Testing, Verification and Validation Workshops, ICSTW 2025, Italy, 2025) [10.1109/icstw64639.2025.10962454].
A System for Automated Unit Test Generation using Large Language Models and Assessment of Generated Test Suites
Lops, Andrea; Narducci, Fedelucio; Ragone, Azzurra; Trizio, Michelantonio; Bartolini, Claudio
2025
Abstract
Unit tests are fundamental for ensuring software correctness but are costly and time-intensive to design and create. Recent advances in Large Language Models (LLMs) have shown potential for automating test generation, though existing evaluations often focus on simple scenarios and lack scalability for real-world applications. To address these limitations, we present AgoneTest, an automated system for generating and assessing complex, class-level test suites for Java projects. Leveraging the Methods2Test dataset, we developed Classes2Test, a new dataset enabling the evaluation of LLM-generated tests against human-written tests. Our key contributions include a scalable automated software system, a new dataset, and a detailed methodology for evaluating test quality.

