On the Feasibility and Benefits of Extensive Evaluation

Proc. ACM Manag. Data; Proceedings of ACM International Conference on Management of Data (SIGMOD), 2025

Yujie Hui, Miao Yu, Hao Qi, Yifan Gan, Tianxi Li, Yuke Li, Xueyuan Ren, Sixiang Ma, Xiaoyi Lu, Yang Wang

Abstract

Benchmark and system parameters often have a significant impact on performance evaluation, which raises a long-standing question about which settings we should use.

This paper studies the feasibility and benefits of extensive evaluation. A full extensive evaluation, which tests all possible settings, is usually too expensive. This work investigates whether it is possible to sample a subset of the settings and, from them, generate observations that match those from a full extensive evaluation. Towards this goal, we have explored the incremental sampling approach, which starts by measuring a small subset of random settings, builds a prediction model on these samples using the popular ANOVA approach, adds more samples if the model is not accurate enough, and terminates otherwise.

To summarize our findings: 1) Enhancing a research prototype to support extensive evaluation mostly involves changing hard-coded configurations, which does not take much effort. 2) Some systems are highly predictable, which means that they can achieve accurate predictions with a low sampling rate, but some systems are less predictable. 3) We have not found a method that can consistently outperform random sampling + ANOVA. Based on these findings, we provide recommendations to improve artifact predictability and strategies for selecting parameter values during evaluation.
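The incremental sampling loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `fit_main_effects`, `incremental_sampling`, `toy_measure`, and `SPACE` are hypothetical, the model is a simplified main-effects (additive) fit standing in for a full ANOVA model, and the stopping rule (mean relative error on the next random batch) and the 5% tolerance are assumptions chosen for illustration.

```python
import itertools
import random
from statistics import mean


def fit_main_effects(samples):
    """Fit an additive model: grand mean plus one offset per (factor, level).

    A simplified stand-in for an ANOVA main-effects model (no interactions).
    """
    grand = mean(y for _, y in samples)
    residuals = {}
    for setting, y in samples:
        for factor, level in enumerate(setting):
            residuals.setdefault((factor, level), []).append(y - grand)
    effects = {key: mean(vals) for key, vals in residuals.items()}

    def predict(setting):
        # Levels never seen in the samples contribute no offset.
        return grand + sum(effects.get((f, lv), 0.0) for f, lv in enumerate(setting))

    return predict


def incremental_sampling(space, measure, batch=8, tolerance=0.05, seed=0):
    """Measure random settings batch by batch until the model is accurate enough."""
    rng = random.Random(seed)
    pool = list(itertools.product(*space))  # every possible setting
    rng.shuffle(pool)
    samples = [(s, measure(s)) for s in pool[:batch]]
    used = min(batch, len(pool))
    while used < len(pool):
        predict = fit_main_effects(samples)
        # Validate on a fresh random batch, then fold it into the samples.
        fresh = [(s, measure(s)) for s in pool[used:used + batch]]
        used += len(fresh)
        error = mean(abs(predict(s) - y) / abs(y) for s, y in fresh)
        samples.extend(fresh)
        if error <= tolerance:
            break  # accurate enough: terminate
    return fit_main_effects(samples), len(samples), len(pool)


# Hypothetical parameter space: threads, batch size, cache on/off.
SPACE = [(1, 2, 4, 8), (16, 64, 256), (0, 1)]


def toy_measure(setting):
    """Made-up, purely additive performance function (always positive)."""
    threads, batch_size, cache = setting
    return 100.0 + 10.0 * threads + 0.1 * batch_size + 20.0 * cache


if __name__ == "__main__":
    model, n_used, total = incremental_sampling(SPACE, toy_measure)
    print(f"measured {n_used} of {total} settings")
```

In this sketch a system counts as "predictable" when the loop stops after measuring only a small fraction of the settings; a system with strong parameter interactions would need more batches, or a model with interaction terms, before the error falls below the tolerance.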

Journal Article

Issue date: September 2024
Publisher: Association for Computing Machinery
Address: New York, NY, USA
DOI: 10.1145/3677137
Journal: Proc. ACM Manag. Data; Proceedings of ACM International Conference on Management of Data (SIGMOD)
Month: Oct
Article No.: 201
Series: SIGMOD '25
