IBM Granite 4.0: Hybrid Mamba-2/Transformer Models Slash Memory Use, Keep Performance
IBM has released Granite 4.0, a family of hybrid Mamba-2/Transformer LLMs that the company says cuts serving memory by over 70% compared with conventional Transformer models on long-context inference, while retaining strong instruction-following and tool-use performance.
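To see why a hybrid design can save memory at long context, here is a rough back-of-the-envelope sketch. It is not IBM's methodology: the layer counts, head sizes, and per-layer state size below are hypothetical placeholders, not Granite 4.0's published configuration, and the sketch counts only cache memory (not model weights or activations). The idea is that an attention layer's key/value cache grows linearly with sequence length, while a state-space (Mamba-style) layer keeps a fixed-size state regardless of context.

```python
def kv_cache_bytes(n_attn_layers, n_kv_heads, head_dim, seq_len,
                   batch=1, bytes_per_elem=2):
    """Bytes held by the attention KV cache: one key and one value tensor per layer."""
    return 2 * n_attn_layers * n_kv_heads * head_dim * seq_len * batch * bytes_per_elem


def ssm_state_bytes(n_ssm_layers, state_bytes_per_layer, batch=1):
    """Bytes held by state-space layers: fixed per layer, independent of seq_len."""
    return n_ssm_layers * state_bytes_per_layer * batch


def hybrid_cache_saving(total_layers, attn_layers, n_kv_heads, head_dim,
                        seq_len, state_bytes_per_layer, batch=1):
    """Fractional cache-memory saving of a hybrid stack vs. an all-attention stack."""
    full = kv_cache_bytes(total_layers, n_kv_heads, head_dim, seq_len, batch)
    hybrid = (kv_cache_bytes(attn_layers, n_kv_heads, head_dim, seq_len, batch)
              + ssm_state_bytes(total_layers - attn_layers, state_bytes_per_layer, batch))
    return 1 - hybrid / full


# Hypothetical configuration (assumed for illustration only):
# 40 layers, 4 of them attention, 8 KV heads of dimension 128, fp16 cache,
# and a 4 MiB recurrent state per state-space layer.
CONFIG = dict(total_layers=40, attn_layers=4, n_kv_heads=8, head_dim=128,
              state_bytes_per_layer=4 * 1024 * 1024)

saving_long = hybrid_cache_saving(seq_len=131072, **CONFIG)
saving_short = hybrid_cache_saving(seq_len=1024, **CONFIG)

print(f"cache saving at 128K tokens: {saving_long:.1%}")
print(f"cache saving at 1K tokens:   {saving_short:.1%}")
```

Under these assumed numbers the saving is large at long context and shrinks at short context, because the fixed state-space overhead stops being negligible. That is consistent with the announcement framing the reduction in terms of long-context inference.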