T5Gemma
A collection of encoder-decoder models offering a strong tradeoff between quality and inference efficiency
T5Gemma adapts pretrained decoder-only Gemma 2 models into an encoder-decoder architecture. These models are trained with either PrefixLM for strong generative performance or UL2 for high-quality contextual representations.
Capabilities
- Enhanced reasoning: A dedicated encoder significantly boosts performance on tasks requiring deep context comprehension, such as math reasoning (GSM8K).
- Flexible architecture: Model adaptation techniques allow for flexible configurations, including "unbalanced" models where the encoder and decoder have different sizes.
- High efficiency: Delivers a superior quality-to-efficiency ratio without extensive compute requirements.
Model variants
- Gemma 2 sizes: Checkpoints based on the official Gemma 2 2B and 9B models, as well as the "unbalanced" 9B-2B checkpoint.
- T5 sizes: Small, Base, Large, and XL sizes following the T5 configuration, plus an additional model sized between T5 Large and T5 XL.