
IBM Unveils Granite 4.0 Nano: Enterprise-Ready Small Models for Edge AI

IBM launches Granite 4.0 Nano: eight compact models (roughly 350M and 1B parameters) in hybrid SSM and pure transformer variants, built for local and edge inference with enterprise-grade governance and open licensing.

What Granite 4.0 Nano brings

IBM's AI team has released Granite 4.0 Nano, a family of compact, open-source language models designed for local and edge inference with enterprise governance. The series includes eight models across two parameter scales, roughly 350M and 1B, offered in both hybrid SSM (H) and pure transformer variants, each available as base and instruct-tuned versions. All models are published under an Apache 2.0 license and carry cryptographic signatures and ISO 42001 alignment for provenance.

Model lineup and formats

The Nano series pairs hybrid SSM-plus-transformer variants (the H models) with transformer-only counterparts, the latter provided to maximize compatibility with popular runtimes. Key members include Granite 4.0 H 1B (roughly 1.5B parameters) and Granite 4.0 H 350M (around 350M). In total, the release mixes hybrid and transformer architectures across base and instruct flavors to suit on-device, edge, and browser deployments.

Architecture and training approach

The H variants interleave state-space model (SSM) layers with transformer layers. This hybrid design reduces how memory grows with context length compared with pure attention, while retaining the flexibility of transformer blocks. Importantly, the Nano models were trained using the same Granite 4.0 pipeline and dataset scale — more than 15 trillion tokens — rather than a reduced-data shortcut. After pretraining they were instruction-tuned to improve tool use and instruction-following, bringing strengths from larger Granite models down to sub-2B scales.
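The memory argument behind the hybrid layout can be illustrated with a toy sketch. This is not IBM's architecture — the decay-based scan and the alternation pattern below are simplified stand-ins — but it shows the key contrast: an SSM-style layer carries a fixed-size state through the sequence, while causal attention materializes a T×T score matrix.

```python
import numpy as np

def ssm_scan(x, decay=0.9):
    # SSM-style linear recurrence: h_t = decay * h_{t-1} + x_t.
    # The running state h has size O(d) no matter how long the sequence is.
    h = np.zeros(x.shape[-1])
    out = np.empty_like(x)
    for t in range(x.shape[0]):
        h = decay * h + x[t]
        out[t] = h
    return out

def causal_attention(x):
    # Softmax attention over all earlier positions: O(T^2) scores in memory.
    T, d = x.shape
    scores = x @ x.T / np.sqrt(d)
    mask = np.tril(np.ones((T, T), dtype=bool))       # causal mask
    scores = np.where(mask, scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x

def hybrid_forward(x, n_blocks=4):
    # Interleave SSM-style and attention blocks, loosely mirroring
    # the idea behind the H variants (the real ratio/order differs).
    for i in range(n_blocks):
        x = ssm_scan(x) if i % 2 == 0 else causal_attention(x)
    return x

x = np.random.randn(16, 8)   # (sequence length, hidden dim)
y = hybrid_forward(x)
print(y.shape)               # (16, 8): shape is preserved through the stack
```

Because the SSM layers replace some quadratic-attention layers, the stack's peak memory grows more slowly with context length, which is what makes the design attractive at edge scales.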

Performance and benchmarks

IBM positions Granite 4.0 Nano as competitive with other sub-2B models such as Qwen, Gemma, and LiquidAI LFM. Reported aggregates indicate meaningful gains across general knowledge, math, code, and safety benchmarks at similar parameter budgets. On agent-centric tasks, the Nano models show strong results on IFEval and the Berkeley Function Calling Leaderboard v3, which are relevant for tool-using agents.

Governance, licensing and runtime support

All Granite 4.0 models, including Nano, are released under Apache 2.0, cryptographically signed, and aligned with ISO 42001 standards. This gives enterprises provenance and governance guarantees that are often missing from community small models. The models are available on Hugging Face and IBM watsonx.ai, with native runtime support for vLLM, llama.cpp, and MLX, enabling realistic local, edge, and browser deployments.
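For teams evaluating local inference, a minimal Hugging Face transformers sketch looks like the following. The model id `ibm-granite/granite-4.0-h-350m` is an assumption based on the announced naming and should be checked against the actual Hugging Face listing; the rest uses standard transformers APIs.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hugging Face id (verify against the ibm-granite org listing).
MODEL_ID = "ibm-granite/granite-4.0-h-350m"

def generate_locally(prompt: str, max_new_tokens: int = 64) -> str:
    """Run one instruct-style generation entirely on the local machine."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer.apply_chat_template(
        [{"role": "user", "content": prompt}],
        add_generation_prompt=True,
        return_tensors="pt",
    )
    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens, keep only the newly generated continuation.
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)

if __name__ == "__main__":
    print(generate_locally("Summarize ISO 42001 in one sentence."))
```

The same weights can be served through vLLM for higher-throughput scenarios, or converted to GGUF for llama.cpp on machines without a GPU.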

Why it matters for edge and enterprise teams

By pushing the same training recipe and governance story used for larger Granite models down to 350M and ~1B parameter scales, IBM offers small-model options that inherit capability and auditability. That combination — competitive performance, compact footprints, open licensing, and enterprise-grade provenance — makes Granite 4.0 Nano attractive for engineers and teams building on-device or edge AI solutions with stricter compliance needs.

Availability

Model weights and technical details are published on Hugging Face and IBM's watsonx.ai, with additional resources like tutorials and notebooks available on GitHub and community channels.
