Abstract
In many industries, predicting metric outcomes of large systems is a fundamental problem, previously driven largely by traditional tabular regression. However, such methods have limited applicability to complex systems data in the wild, such as configuration files or system logs, where feature engineering can be nearly impossible. In this paper, we propose text-to-text regression as a general and scalable method for performance prediction over such formats. When trained on data from Google's Compute-as-a-Service (CaaS) system, a simple randomly initialized T5 encoder-decoder of at most 200M parameters achieves near-perfect (>0.95) rank correlation across the entire system and adapts easily to new tasks with as few as 500 few-shot examples, achieving significantly more predictive power (100x lower mean squared error) than previous tabular approaches. Furthermore, we ablate important behaviors in this paradigm, namely the importance of encoder-decoders, increased sequence length, and the model's natural ability to quantify its own uncertainty. Our results and ablations pave the way towards universal simulators of various real-world outcomes.
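The core framing can be sketched with a toy target tokenizer: the input (a config file or log) stays a plain string, and the numeric target is serialized into a short token sequence (sign, mantissa digits, exponent) that a decoder can emit autoregressively. This particular vocabulary and the helper functions below are illustrative assumptions, not the paper's exact tokenization scheme.

```python
import math

# Hypothetical sign/mantissa/exponent tokenization for a regression target,
# shown only to illustrate the text-to-text framing; the paper's actual
# vocabulary may differ.

def encode_target(y: float, mantissa_digits: int = 4) -> list[str]:
    """Serialize a float into sign, mantissa-digit, and exponent tokens."""
    sign = "<neg>" if y < 0 else "<pos>"
    y = abs(y)
    if y == 0.0:
        return [sign] + ["0"] * mantissa_digits + ["E0"]
    exp = math.floor(math.log10(y))
    mantissa = round(y / 10 ** exp * 10 ** (mantissa_digits - 1))
    if mantissa == 10 ** mantissa_digits:  # rounding overflowed, e.g. 9.9999
        mantissa //= 10
        exp += 1
    return [sign] + list(str(mantissa)) + [f"E{exp}"]

def decode_target(tokens: list[str], mantissa_digits: int = 4) -> float:
    """Invert encode_target: map tokens back to an approximate float."""
    sign = -1.0 if tokens[0] == "<neg>" else 1.0
    mantissa = int("".join(tokens[1 : 1 + mantissa_digits]))
    exp = int(tokens[1 + mantissa_digits][1:])
    return sign * mantissa * 10.0 ** (exp - mantissa_digits + 1)
```

Because targets become token sequences, decoding under sampling naturally yields a distribution over outcomes, which is one way the model can quantify its own uncertainty.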
Authors
Yash Akhauri, Bryan Lewandowski, Cheng-Hsi Lin, Adrian N. Reyes, Grant C. Forbes, Arissa Wongpanich, Bangding Yang, Mohamed S. Abdelfattah, Sagi Perel, Xingyou Song
Venue
arXiv