Salesforce’s xGen-small Revolutionizes Enterprise AI with Efficient Long-Context Processing
Salesforce’s xGen-small is a compact AI model that delivers efficient long-context understanding with reduced costs and strong privacy, transforming enterprise AI workflows.
Challenges in Enterprise Language Processing
Enterprise language processing faces significant challenges as business workflows increasingly rely on synthesizing data from diverse sources such as internal documents, code repositories, research reports, and real-time streams. While large language models have advanced capabilities, they come with high costs, frequent hardware upgrades, and data privacy concerns. Pursuing larger models shows diminishing returns and increasing energy demands, which may limit AI progress.
Need for Balanced AI Models
Modern enterprises require AI solutions that balance long-context comprehension with cost efficiency, predictable serving costs, and strong privacy protections. Small language models that can still handle complex, high-volume inference are uniquely suited to meet these requirements.
Limitations of Traditional Approaches
Traditional methods to extend language model context length, such as retrieval-augmented generation (RAG), external tool calls, and memory mechanisms, add complexity and potential points of failure. Larger models with extended context windows increase computational overhead. Genuine long-context capability, which lets a model process an entire document or conversation in a single pass, avoids these failure modes.
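The contrast can be sketched in a few lines. The toy retriever below uses naive keyword overlap, which is an assumption for illustration only; real RAG systems use embedding models, but the structural point is the same: the retrieval stage is an extra component that can silently drop the relevant passage, while a genuine long-context model simply sees everything.

```python
# Toy sketch: RAG-style retrieval vs single-pass long-context processing.
# The chunking and keyword scoring here are illustrative placeholders,
# not a real retriever.

def chunk(document: str, size: int = 100) -> list[str]:
    """Split a document into fixed-size chunks (a typical RAG step)."""
    return [document[i:i + size] for i in range(0, len(document), size)]

def retrieve(chunks: list[str], query: str, k: int = 2) -> list[str]:
    """Rank chunks by query-word overlap. If the answer spans a chunk
    boundary or uses different phrasing, retrieval can miss it entirely:
    one of the failure points the article describes."""
    words = set(query.lower().split())
    scored = sorted(chunks, key=lambda c: -len(words & set(c.lower().split())))
    return scored[:k]

def rag_context(document: str, query: str) -> str:
    # RAG path: the model only ever sees the retrieved slice.
    return "\n".join(retrieve(chunk(document), query))

def long_context(document: str, query: str) -> str:
    # Single-pass path: the whole document fits in the context window,
    # so nothing can be lost at a retrieval stage.
    return document
```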
Introducing xGen-small
Salesforce AI Research developed xGen-small, a compact language model optimized for efficient long-context processing tailored for enterprise needs. It combines domain-specific data curation, scalable pre-training, length-extension techniques, instruction fine-tuning, and reinforcement learning to deliver high performance at predictable low costs.
Innovative "Small but Long" Architecture
xGen-small adopts a "small but long" design, shrinking model size while fine-tuning data distributions and training protocols for enterprise domains. This approach requires integrated expertise across data curation, pre-training, length-extension, and fine-tuning stages.
Data Curation and Pre-training
The pipeline starts with a multi-trillion-token corpus, applying spam filters, quality classifiers, duplicate removal, and balancing general and specialized content including code, mathematics, and natural language. Pre-training uses TPU v5p pods with the Jaxformer v8 library and advanced optimization techniques to maximize efficiency.
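A curation pipeline of the kind described, spam filtering, quality classification, and duplicate removal, can be outlined as follows. The spam markers and the vocabulary-diversity quality proxy are placeholder heuristics of my own, not Salesforce's actual filters, which would use trained classifiers over a multi-trillion-token corpus.

```python
import hashlib

# Hedged sketch of a pre-training data-curation pass: spam filtering,
# quality scoring, and exact-duplicate removal. All heuristics below are
# illustrative placeholders, not the production filters.

SPAM_MARKERS = ("click here", "buy now", "free $$$")  # assumed markers

def is_spam(doc: str) -> bool:
    text = doc.lower()
    return any(marker in text for marker in SPAM_MARKERS)

def quality_score(doc: str) -> float:
    """Toy quality proxy: longer documents with varied vocabulary score
    higher. Real pipelines use trained quality classifiers."""
    words = doc.split()
    if not words:
        return 0.0
    return len(set(words)) / len(words) * min(len(words) / 50, 1.0)

def dedup_key(doc: str) -> str:
    # Exact dedup via content hash; near-duplicate detection
    # (e.g. MinHash) is omitted from this sketch.
    return hashlib.sha256(doc.strip().lower().encode()).hexdigest()

def curate(corpus: list[str], min_quality: float = 0.1) -> list[str]:
    seen, kept = set(), []
    for doc in corpus:
        key = dedup_key(doc)
        if key in seen or is_spam(doc) or quality_score(doc) < min_quality:
            continue
        seen.add(key)
        kept.append(doc)
    return kept
```

Balancing general and specialized content (code, mathematics, natural language) would then happen as a separate sampling step over the curated pool.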
Performance and Evaluation
xGen-small competes well against larger baselines, blending diverse data types to balance efficiency and performance. The 9B model achieves state-of-the-art results on the RULER benchmark, with stable performance from 4K to 128K tokens thanks to a two-stage length-extension strategy and sequence parallelism.
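One standard way a staged length extension can work is position interpolation for rotary embeddings (RoPE): positions beyond the pre-training window are scaled down so they stay within the range the model saw during training. The 4K and 128K figures come from the article; the staging and mechanics below are a generic illustration, not xGen-small's actual recipe.

```python
# Sketch of position-interpolation-style length extension for rotary
# position embeddings (RoPE). Illustrative only; the article does not
# specify xGen-small's exact extension method.

def rope_angles(position: int, dim: int = 8, base: float = 10000.0,
                scale: float = 1.0) -> list[float]:
    """Rotation angles for one position; scale > 1 interpolates positions
    back into the pre-training range."""
    pos = position / scale
    return [pos / base ** (2 * i / dim) for i in range(dim // 2)]

# Hypothetical two-stage schedule: 4K -> 32K, then 32K -> 128K.
PRETRAIN_LEN, STAGE1_LEN, STAGE2_LEN = 4096, 32768, 131072
stage1_scale = STAGE1_LEN / PRETRAIN_LEN   # 8.0
stage2_scale = STAGE2_LEN / PRETRAIN_LEN   # 32.0

# With interpolation, the angles at the far end of the 128K window match
# those the model saw at position 4096 during pre-training.
assert rope_angles(STAGE2_LEN, scale=stage2_scale) == rope_angles(PRETRAIN_LEN)
```

Sequence parallelism then shards these long sequences across devices so that 128K-token batches remain trainable.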
Instruction Fine-tuning and Reinforcement Learning
Post-training involves supervised fine-tuning with diverse instruction datasets and large-scale reinforcement learning to enhance reasoning, especially in mathematics, coding, and STEM tasks. This process ensures robust instruction-following capabilities.
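For reasoning domains like mathematics and coding, reinforcement learning commonly relies on verifiable rewards: sampled answers are checked against ground truth, and the binary outcome drives the policy update. The checker and group-relative baseline below are a minimal sketch of that pattern, not the actual training reward used for xGen-small.

```python
# Hedged sketch of RL on verifiable tasks: binary correctness rewards
# plus a group-relative advantage baseline. Illustrative, not the
# production training setup.

def math_reward(model_answer: str, ground_truth: str) -> float:
    """1.0 if the final answer matches the reference, else 0.0."""
    return 1.0 if model_answer.strip() == ground_truth.strip() else 0.0

def advantage(rewards: list[float]) -> list[float]:
    """Group-relative advantage: each sample's reward minus the group
    mean, a common baseline trick in RL fine-tuning."""
    mean = sum(rewards) / len(rewards)
    return [r - mean for r in rewards]

# Four sampled completions for the same math prompt, two correct:
rewards = [math_reward(a, "42") for a in ["42", "41", "42 ", "7"]]
advs = advantage(rewards)  # correct samples get positive advantage
```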
Enterprise Benefits
The "small but long" approach lowers inference costs and hardware needs while enabling seamless processing of extensive enterprise knowledge without external retrieval. xGen-small offers a sustainable, cost-effective, and privacy-preserving AI solution for enterprise-scale deployment.
For more information, explore the model on Hugging Face and check out the technical details. Stay updated by following Salesforce AI Research on Twitter.