Benchmarking the environmental footprint of BERT models

Published:

Paper Code

This is the follow-up paper to BERTchen.

Scaling large language models (LLMs) has led to remarkable gains in accuracy, but it also significantly increases resource consumption and CO₂e emissions. While performance improvements are well documented, the environmental footprint of these models is frequently under-reported. In this paper, we provide a comprehensive analysis of CO₂e emissions across various encoder-only architectures, with a particular focus on English and German BERT models. We investigate how architectural modifications, pretraining strategies, and data adjustments influence model efficiency. Our analysis reveals that, over time, environmental impact diminishes even as performance continues to improve. Using BERTchen, we highlight the challenges of accurately reporting emissions, particularly without insider knowledge. Finally, we discuss the importance of incorporating CO₂e emission metrics into model evaluation and propose guidelines to standardize their reporting in future research.
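For context on how such emission figures are commonly estimated, the sketch below applies the widely used accounting CO₂e ≈ PUE × energy consumed × grid carbon intensity. The default PUE and carbon-intensity values are illustrative assumptions, not the values or the exact methodology used in the paper.

```python
def estimate_co2e_kg(gpu_hours: float, avg_power_kw: float,
                     pue: float = 1.58,
                     carbon_intensity_kg_per_kwh: float = 0.43) -> float:
    """Rough CO2e estimate for a training run.

    Uses the common accounting scheme
        CO2e = PUE * energy (kWh) * grid carbon intensity (kg CO2e / kWh).
    The default PUE and carbon-intensity values are placeholders; real
    estimates should use the data center's and grid's actual figures.
    """
    energy_kwh = gpu_hours * avg_power_kw
    return pue * energy_kwh * carbon_intensity_kg_per_kwh

# Example: 100 GPU-hours at an average draw of 0.3 kW per GPU
print(f"{estimate_co2e_kg(100, 0.3):.1f} kg CO2e")
```

Estimates of this kind depend heavily on the assumed data-center PUE and the local grid mix, which is one reason emissions reported without insider knowledge can vary widely.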