Inside GPT-5: Practical Guide to Verbosity, Function Calls, CFG and Minimal Reasoning
'A compact developer guide to GPT-5 features including verbosity control, free-form function calling, CFG enforcement, and minimal reasoning with illustrative code samples.'
Installing the libraries
Start by installing the Python packages used in the examples and set your API key in the environment.
!pip install pandas openai

import os
from getpass import getpass

os.environ['OPENAI_API_KEY'] = getpass('Enter OpenAI API Key: ')

Verbosity parameter
The verbosity parameter controls how long and detailed GPT-5's replies are without having to change the prompt.
- low: short, concise responses with minimal extra text.
- medium (default): balanced detail and clarity.
- high: very detailed answers for explanations or teaching.
The example below runs the same prompt with three verbosity settings and collects token usage and outputs.
from openai import OpenAI
import pandas as pd
from IPython.display import display
client = OpenAI()
question = "Write a poem about a detective and his first solve"
data = []
for verbosity in ["low", "medium", "high"]:
response = client.responses.create(
model="gpt-5-mini",
input=question,
text={"verbosity": verbosity}
)
# Extract text
output_text = ""
for item in response.output:
if hasattr(item, "content"):
for content in item.content:
if hasattr(content, "text"):
output_text += content.text
usage = response.usage
data.append({
"Verbosity": verbosity,
"Sample Output": output_text,
"Output Tokens": usage.output_tokens
})
# Create DataFrame
df = pd.DataFrame(data)
# Display nicely with centered headers
pd.set_option('display.max_colwidth', None)
styled_df = df.style.set_table_styles(
[
{'selector': 'th', 'props': [('text-align', 'center')]}, # Center column headers
{'selector': 'td', 'props': [('text-align', 'left')]} # Left-align table cells
]
)
display(styled_df)

In the example the output tokens scale roughly linearly with verbosity: low (731) → medium (1017) → high (1263).
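The text-extraction loop above recurs in later examples; it can be factored into a small helper. This is a sketch that assumes the same response shape used above: items in `response.output` may carry a `content` list whose entries expose a `text` attribute.

```python
def extract_output_text(response):
    """Concatenate all text fragments from a Responses API result."""
    output_text = ""
    for item in response.output:
        if hasattr(item, "content"):          # skip non-message items (e.g. reasoning)
            for content in item.content:
                if hasattr(content, "text"):  # keep only text parts
                    output_text += content.text
    return output_text
```

The SDK also exposes `response.output_text` as a convenience property, which collapses this loop into a single attribute access (it is used in the CFG example below).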
Free-form function calling
GPT-5 can emit raw text payloads intended to be executed directly by target tools or runtimes, such as Python scripts, SQL queries, or shell commands. Unlike earlier tool-call formats that wrapped arguments in structured JSON, GPT-5's free-form outputs make it easier to feed the model's result straight into an external runtime.
Use cases include:
- Code sandboxes (Python, C++, Java, etc.)
- SQL query generation for databases
- Shell command generation (Bash)
- Configuration or code generation tools
Example: GPT-5 generating a Python snippet to count vowels and compute a cube.
from openai import OpenAI
client = OpenAI()
response = client.responses.create(
model="gpt-5-mini",
input="Please use the code_exec tool to calculate the cube of the number of vowels in the word 'pineapple'",
text={"format": {"type": "text"}},
tools=[
{
"type": "custom",
"name": "code_exec",
"description": "Executes arbitrary python code",
}
]
)
print(response.output[1].input)

This output demonstrates GPT-5 producing an executable Python snippet that can be fed directly into a Python runtime without extra parsing.
Context-Free Grammar (CFG)
A Context-Free Grammar (CFG) defines production rules for a language. Using a CFG you can strictly constrain model output to match a syntax (for example, a valid SQL statement, JSON document, or a specific code pattern). GPT-5 shows improved adherence to CFG constraints, producing outputs that exactly match the grammar specification when configured.
First, an unconstrained example (older model) that may emit extra prose around the answer:
from openai import OpenAI
import re
client = OpenAI()
email_regex = r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$"
prompt = "Give me a valid email address for John Doe. It can be a dummy email"
# No grammar constraints -- model might give prose or invalid format
response = client.responses.create(
model="gpt-4o", # or earlier
input=prompt
)
output = response.output_text.strip()
print("GPT Output:", output)
print("Valid?", bool(re.match(email_regex, output)))

Now a grammar-constrained call to GPT-5 that uses a regex grammar to enforce a strict email format:
from openai import OpenAI
client = OpenAI()
email_regex = r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$"
prompt = "Give me a valid email address for John Doe. It can be a dummy email"
response = client.responses.create(
model="gpt-5", # grammar-constrained model
input=prompt,
text={"format": {"type": "text"}},
tools=[
{
"type": "custom",
"name": "email_grammar",
"description": "Outputs a valid email address.",
"format": {
"type": "grammar",
"syntax": "regex",
"definition": email_regex
}
}
],
parallel_tool_calls=False
)
print("GPT-5 Output:", response.output[1].input)

In practice, GPT-4-style responses sometimes include extra natural language that breaks strict syntactic validation (for instance, "Sure, here's a test email: johndoe@example.com"), while GPT-5 can produce the exact token sequence that matches the grammar (for example, john.doe@example.com).
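Whether or not a grammar is enforced server-side, it is cheap to re-validate the returned string locally against the same regex. The sketch below wraps the check in a hypothetical `is_valid_email` helper:

```python
import re

EMAIL_REGEX = r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$"

def is_valid_email(text: str) -> bool:
    """True only if the whole string is a single well-formed email address."""
    return re.fullmatch(EMAIL_REGEX, text) is not None

print(is_valid_email("john.doe@example.com"))       # True
print(is_valid_email("Sure, here's one: a@b.com"))  # False: extra prose fails
```

Using `re.fullmatch` (rather than `re.match`) makes the anchoring explicit: any surrounding prose causes validation to fail, which is exactly the failure mode grammar constraints are meant to prevent.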
Minimal reasoning
Minimal reasoning reduces the model's internal reasoning tokens, lowering latency and speeding up time-to-first-token for deterministic or trivial tasks like classification, formatting, or extraction. The trade-off is that tasks requiring complex multi-step reasoning may need a higher effort setting.
Example: a fast odd/even classifier with minimal reasoning.
import time
from openai import OpenAI
client = OpenAI()
prompt = "Classify the given number as odd or even. Return one word only."
start_time = time.time() # Start timer
response = client.responses.create(
model="gpt-5",
input=[
{ "role": "developer", "content": prompt },
{ "role": "user", "content": "57" }
],
reasoning={
"effort": "minimal" # Faster time-to-first-token
},
)
latency = time.time() - start_time # End timer
# Extract model's text output
output_text = ""
for item in response.output:
if hasattr(item, "content"):
for content in item.content:
if hasattr(content, "text"):
output_text += content.text
print("--------------------------------")
print("Output:", output_text)
print(f"Latency: {latency:.3f} seconds")

Minimal reasoning is well-suited for short rewrites, extraction, formatting and quick classifications when you need low latency.
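The ad-hoc timing above can be wrapped in a reusable helper when comparing reasoning efforts side by side. This is a sketch; `time_call` is a hypothetical utility, not part of the SDK.

```python
import time

def time_call(fn, *args, **kwargs):
    """Run fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

# Usage with the Responses API (assumes `client` from the examples above):
# response, latency = time_call(
#     client.responses.create,
#     model="gpt-5",
#     input="Classify 57 as odd or even. Return one word only.",
#     reasoning={"effort": "minimal"},
# )
# print(f"Latency: {latency:.3f} seconds")
```

`time.perf_counter` is preferred over `time.time` for measuring short intervals, since it is monotonic and has higher resolution.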
Next steps and resources
The examples above show how GPT-5 extends developer control via verbosity tuning, free-form function outputs, strict grammar enforcement and low-latency minimal reasoning. Explore the full code samples and repositories for deeper experimentation, and adapt the patterns to your toolchains and runtimes.