
G-ACT: A Breakthrough Framework to Direct Programming Language Bias in Large Language Models

University of Michigan researchers introduce G-ACT, a novel framework to control programming language bias in large language models, enhancing reliability in scientific code generation.

Challenges in Scientific Code Generation with LLMs

Large Language Models (LLMs) have become sophisticated tools for natural language processing and have enabled agentic systems that manage complex workflows. However, their application in generating scientific code remains limited. Scientific software often relies on low-level languages such as C++ and CUDA, which are underrepresented in LLM training data. This leads to syntactic and semantic errors in generated code, causing compilation failures or unstable runtime behaviors. Current LLM-based agents depend heavily on user-supplied control primitives and carefully designed prompts, which can be misunderstood and result in unpredictable execution flows.

Limitations of Current Steering Techniques

To steer LLM outputs, several methods have been explored, including uncovering causal relationships within model activations and performing neuron-level interventions. Techniques such as Supervised Fine-Tuning (SFT), weight modulation, and Reinforcement Learning from Human Feedback (RLHF) provide direct model steering but come with high computational costs and can reduce model robustness and general performance. Activation patching, which uses corrupted inputs as baselines for fine-grained control, is widely used but requires millions of model evaluations and has mostly been applied in controlled benchmarks rather than real-world scenarios.
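For readers unfamiliar with activation patching, the toy sketch below illustrates the basic recipe in plain PyTorch: cache an activation from a clean input, splice it into a forward pass on a corrupted input, and check how much the output shifts. The model, layer choice, and tensor sizes are illustrative stand-ins, not anything from the paper.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

class ToyModel(nn.Module):
    """Two residual blocks plus a head; stands in for a transformer."""
    def __init__(self, d=16):
        super().__init__()
        self.block1 = nn.Sequential(nn.Linear(d, d), nn.ReLU())
        self.block2 = nn.Sequential(nn.Linear(d, d), nn.ReLU())
        self.head = nn.Linear(d, 4)

    def forward(self, x):
        x = x + self.block1(x)   # residual stream, so patching block1 alone
        x = x + self.block2(x)   # does not fully determine the output
        return self.head(x)

model = ToyModel()
clean, corrupted = torch.randn(1, 16), torch.randn(1, 16)

with torch.no_grad():
    clean_out = model(clean)
    corrupted_out = model(corrupted)

# 1) Cache block1's activation on the clean input.
cached = {}
save = model.block1.register_forward_hook(lambda m, i, o: cached.update(act=o.detach()))
with torch.no_grad():
    model(clean)
save.remove()

# 2) Re-run the corrupted input with the clean activation patched in.
patch = model.block1.register_forward_hook(lambda m, i, o: cached["act"])
with torch.no_grad():
    patched_out = model(corrupted)
patch.remove()

# If patching moves the output toward the clean run, this site is causally important.
print((corrupted_out - clean_out).norm().item(), (patched_out - clean_out).norm().item())
```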

Introducing the G-ACT Framework

Researchers at the University of Michigan have developed G-ACT (Gradient-refined Adaptive Activation Steering), a scalable framework to steer LLMs towards generating scientific code in specific programming languages. G-ACT analyzes activation differences per prompt, clusters them into steering directions, and employs lightweight per-layer probes that are trained and refined online to select suitable steering vectors. This approach enables concept-level control with scalability and interpretability, offering a practical solution for consistent programming language selection in scientific computing tasks.
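The sketch below is a minimal illustration of that pipeline, not the authors' implementation: per-prompt activation differences are clustered into candidate steering directions, and a lightweight probe learns to pick which direction to apply. The synthetic data, dimensions, and the `steer` helper are assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
d_model, n_prompts = 64, 200

# Per-prompt activation differences at one layer (synthetic stand-in for
# differences between e.g. a "write this in C++" prompt and a neutral prompt).
activation_diffs = rng.normal(size=(n_prompts, d_model))

# 1) Cluster the differences into a small set of candidate steering directions.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(activation_diffs)
steering_vectors = kmeans.cluster_centers_            # one direction per cluster

# 2) Train a lightweight per-layer probe that selects which direction to apply
#    for a given hidden state (here the targets are simply the cluster labels).
probe = LogisticRegression(max_iter=1000).fit(activation_diffs, kmeans.labels_)

def steer(hidden_state, alpha=4.0):
    """Add the probe-selected steering vector to a single hidden state."""
    idx = probe.predict(hidden_state[None, :])[0]
    return hidden_state + alpha * steering_vectors[idx]

steered = steer(activation_diffs[0])
print(steered.shape)   # (64,)
```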

Model Evaluation Reveals Baseline Biases

The study evaluated five instruction-tuned LLMs: Llama-3.2-3B-Instruct, Llama-3.3-70B-Instruct, Qwen2.5-Coder-32B-Instruct, Qwen2.5-14B-Instruct-1M, and QwQ-32B. The models were tested on 84 benchmark questions with 25 repetitions each at a sampling temperature of 1.0. Results showed inherent language preferences: Llama-3.2-3B favored Java (76.2%), Llama-3.3-70B preferred Python (73.8%), Qwen2.5-Coder leaned towards Python (59.5%), and Qwen2.5-14B favored Julia (66.7%). These biases arise from differences in model size, architecture, and fine-tuning data.
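A benchmark of this shape can be sketched as a simple counting loop. The snippet below assumes a hypothetical `model.generate` API and uses a crude keyword-based language detector; neither is part of the paper's evaluation code.

```python
from collections import Counter

def detect_language(code: str) -> str:
    """Crude keyword-based labeler; a real benchmark would use a robust classifier."""
    if "#include" in code:
        return "C++"
    if "public static void main" in code:
        return "Java"
    if "function" in code and "end" in code:
        return "Julia"
    return "Python"

def measure_language_bias(model, questions, n_reps=25, temperature=1.0):
    """Count which language the model chooses across repeated samples."""
    counts = Counter()
    for question in questions:
        for _ in range(n_reps):
            code = model.generate(question, temperature=temperature)  # assumed API
            counts[detect_language(code)] += 1
    total = sum(counts.values())
    return {lang: 100.0 * n / total for lang, n in counts.items()}
```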

Static Neuron Activation Enables Language Biasing

By selectively activating individual MLP neurons in Llama-3.2-3B-Instruct, researchers achieved strong causal control over programming language selection. Targeting C++ generation resulted in nearly 100% C++ code output, effectively suppressing Python, Java, and Julia outputs. Code generation tests identified two behavioral regimes: Python-leaning tasks generated 40-80% Python code for high-level operations, while C++-dominant tasks produced 60-90% C++ code for performance-critical routines. Overall, C++ accounted for roughly 73% of the generated code, ahead of Python, although the model still defaulted to Python in many cases.
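Conceptually, this kind of static intervention clamps one hidden unit to a fixed value during the forward pass. The PyTorch toy below shows the mechanism with a stand-in MLP block and a forward hook; the neuron index and clamp value are hypothetical, and a real setup would hook the corresponding MLP layer inside the transformer.

```python
import torch
import torch.nn as nn

# A stand-in for one transformer MLP block (up-projection, nonlinearity, down-projection).
mlp = nn.Sequential(nn.Linear(32, 128), nn.GELU(), nn.Linear(128, 32))

TARGET_NEURON = 7     # hypothetical index of a "generate C++" neuron
CLAMP_VALUE = 10.0    # fixed activation forced during the forward pass

def clamp_neuron(module, inputs, output):
    # Overwrite one hidden unit's activation at every position.
    output[..., TARGET_NEURON] = CLAMP_VALUE
    return output

# Hook the nonlinearity so the clamp applies before the down-projection.
handle = mlp[1].register_forward_hook(clamp_neuron)

with torch.no_grad():
    hidden = torch.randn(4, 32)   # (tokens, d_model) stand-in
    out = mlp(hidden)             # forward pass runs with the clamped neuron
handle.remove()
```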

Performance of Gradient-Refined Activation Steering

The G-ACT framework significantly improved probe classification accuracy from 0% to 61.5% in the early layers of Llama-3.2-3B. Although it introduced a modest runtime overhead (1.3-1.4 times slower generation), selective layer steering and caching optimizations maintained practical usability. Beyond programming language control, G-ACT supports concept-level control through persistent transformation matrices, ensuring consistent behavior across users. This framework sets a new standard for reliable and interpretable LLM steering in scientific computing.
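The "gradient-refined" part refers to updating the per-layer probes online so they select steering vectors more accurately. The toy below sketches that idea with a small linear probe and a few Adam steps; the dimensions, learning rate, and labels are illustrative assumptions, not values from the paper.

```python
import torch
import torch.nn as nn

d_model, n_directions = 64, 4
probe = nn.Linear(d_model, n_directions)        # lightweight per-layer probe
optimizer = torch.optim.Adam(probe.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

def refine_probe(hidden_states, target_direction, steps=5):
    """Online refinement: nudge the probe toward the steering direction that worked."""
    for _ in range(steps):
        optimizer.zero_grad()
        logits = probe(hidden_states)               # (batch, n_directions)
        loss = loss_fn(logits, target_direction)
        loss.backward()
        optimizer.step()

# Toy usage: activations for 8 prompts, all labeled with steering direction 2.
refine_probe(torch.randn(8, d_model), torch.full((8,), 2, dtype=torch.long))
```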

For more details, check out the original research paper and follow related discussions on social platforms.
