OpenAI Debuts Agent Builder and AgentKit: A Visual-First Stack for Building, Deploying, and Evaluating AI Agents

Contents

What’s new?
How the pieces fit in the puzzle?
How safety is included?

OpenAI has released AgentKit, a cohesive platform that packages a visual Agent Builder, an embeddable ChatKit UI, and expanded Evals into a single workflow for shipping production agents. Agent Builder and the Connector Registry launch in beta, while ChatKit and the new Evals capabilities are generally available.

What’s new?

Agent Builder (beta). A visual canvas for composing multi-step, multi-agent workflows with drag-and-drop nodes, connectors, per-node guardrails, preview runs, inline eval configuration, and full versioning. Teams can start from templates or a blank canvas; the Responses API powers execution. OpenAI cites internal and customer usage as evidence that the canvas shortens iteration cycles between prototype and production.
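To make the idea of a versioned node graph with per-node guardrails concrete, here is a minimal sketch in plain Python. It is not Agent Builder's schema or API: the `Node` and `Workflow` classes, the `no_secrets` check, and the lambda steps are all hypothetical stand-ins, and a real workflow would call a model through the Responses API instead of formatting strings.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Hypothetical types for illustration only; not Agent Builder's real schema.
Guardrail = Callable[[str], bool]   # returns True when the text is allowed
Step = Callable[[str], str]         # transforms the running text


@dataclass
class Node:
    name: str
    step: Step
    guardrails: List[Guardrail] = field(default_factory=list)
    next: Optional[str] = None      # name of the following node, if any


@dataclass
class Workflow:
    version: str                    # label for this revision of the graph
    start: str
    nodes: Dict[str, Node]

    def run(self, text: str) -> List[str]:
        """Walk the graph from `start` and return a trace of node outputs."""
        trace: List[str] = []
        current: Optional[str] = self.start
        while current is not None:
            node = self.nodes[current]
            # Per-node guardrails inspect the input before the node executes.
            if not all(check(text) for check in node.guardrails):
                trace.append(f"{node.name}: blocked")
                break
            text = node.step(text)
            trace.append(f"{node.name}: {text}")
            current = node.next
        return trace


def no_secrets(text: str) -> bool:
    """Toy guardrail: reject input that mentions a password."""
    return "password" not in text.lower()


workflow = Workflow(
    version="v1",
    start="classify",
    nodes={
        "classify": Node("classify", lambda t: f"[support] {t}", [no_secrets], next="answer"),
        "answer": Node("answer", lambda t: f"{t} -> drafted reply"),
    },
)
```

Calling `workflow.run("Where is my order?")` walks both nodes and returns one trace entry per node, while an input containing "password" stops at the first node. Keeping the guardrails on the node, not on the workflow as a whole, is what lets each step enforce its own policy.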

Agents SDK. A code-first alternative to the canvas with type-safe libraries in Node, Python, and Go. OpenAI positions the SDK as faster to integrate than manual prompt-and-tool orchestration while sharing the same execution substrate (Responses API).

@Albertsons used AgentKit to build an agent.

An associate can ask it to create a plan to improve ice cream sales. The agent looks at the full context — seasonality, historical trends, external factors — and gives a recommendation. pic.twitter.com/rak7G5qc5U

— OpenAI Developers (@OpenAIDevs) October 6, 2025

ChatKit (GA). A drop-in, brand-customizable chat interface for deploying agentic experiences on the web or in apps. It handles streaming, threads, and “thinking” UIs; the marketing page shows organizations using it for support and internal assistants.

Built-in tools and connectors. Agent workflows can call web search, file search, image generation, code interpreter, “computer use,” and external connectors, including Model Context Protocol (MCP) servers—reducing glue code for common tasks.
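The "glue code" that built-in tools are meant to remove is typically a dispatch layer that routes a model's tool call to the right function. The sketch below shows that pattern in plain Python. The function names, the JSON shape of the tool call, and the `TOOLS` registry are assumptions made for illustration; they do not reproduce the Responses API's tool-call format, and the real tools run on OpenAI's side.

```python
import json
from typing import Callable, Dict


# Hypothetical local stand-ins for hosted tools.
def web_search(query: str) -> str:
    return f"results for {query!r}"


def file_search(query: str) -> str:
    return f"documents matching {query!r}"


# Registry mapping a tool name to the function that implements it.
TOOLS: Dict[str, Callable[[str], str]] = {
    "web_search": web_search,
    "file_search": file_search,
}


def dispatch(tool_call_json: str) -> str:
    """Route one JSON-encoded tool call to the matching function."""
    call = json.loads(tool_call_json)
    name = call["name"]
    if name not in TOOLS:
        # Unknown tools are reported back, never executed.
        return f"error: unknown tool {name!r}"
    return TOOLS[name](call["arguments"]["query"])
```

With hosted tools and connectors, a team would not have to write or maintain this routing layer for common tasks such as search.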

Connector Registry (beta). Centralized admin governance across ChatGPT and the API for data sources such as Dropbox, Google Drive, SharePoint, Microsoft Teams, and third-party MCPs. Rollout begins for customers with the Global Admin Console.

Evals (GA) and optimization. New capabilities include datasets, trace grading for end-to-end workflow assessment, automated prompt optimization, and third-party model evaluation. OpenAI emphasizes continuous measurement to raise task accuracy.
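Trace grading scores the whole run (which steps the agent took and what it answered), not only the final text. The following is a rough sketch of that idea in plain Python, not the Evals API: the `Trace` and `Example` classes, the half-and-half scoring rule, and `fake_agent` are hypothetical, and production graders are often model-based, unlike these string checks.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Trace:
    question: str
    steps: List[str]      # names of tools or nodes the agent invoked, in order
    answer: str


@dataclass
class Example:
    question: str
    expected_answer: str
    required_step: str    # a step the run is expected to include


def grade(trace: Trace, example: Example) -> float:
    """Score from 0.0 to 1.0: half for the required step, half for the answer."""
    score = 0.0
    if example.required_step in trace.steps:
        score += 0.5
    if example.expected_answer.lower() in trace.answer.lower():
        score += 0.5
    return score


def evaluate(agent: Callable[[str], Trace], dataset: List[Example]) -> float:
    """Run the agent over every example and return the mean trace score."""
    if not dataset:
        raise ValueError("dataset must contain at least one example")
    scores = [grade(agent(ex.question), ex) for ex in dataset]
    return sum(scores) / len(scores)


def fake_agent(question: str) -> Trace:
    """Stand-in for a real agent run; always searches files and gives one answer."""
    return Trace(question, ["file_search"], "Refunds take 5 days.")


dataset = [
    Example("How long do refunds take?", "5 days", "file_search"),
    Example("What is the return window?", "30 days", "web_search"),
]
```

Here `evaluate(fake_agent, dataset)` averages one fully correct run and one fully incorrect run. Tracking that mean across workflow versions is the kind of continuous measurement the launch materials describe.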

Pricing and availability. OpenAI states that ChatKit and the new Evals features are GA, while Agent Builder is in beta. All are included under standard API model pricing (i.e., customers pay for model/compute usage rather than for separate SKUs).

How the pieces fit in the puzzle?

Design: Use Agent Builder to visually assemble agents and guardrails, or write agents with the Agents SDK against the Responses API.
Deploy: Embed with ChatKit to deliver a production chat surface without building a frontend from scratch.
Optimize: Instrument with Evals (datasets, trace grading, graders) and iterate prompts based on graded traces.
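The Optimize step above can be pictured as a loop that scores each candidate prompt against a dataset and keeps the best one. This is a simplified sketch under stated assumptions, not OpenAI's automated prompt optimization: `fake_agent` stands in for a model call, `exact_match` is a deliberately strict grader, and `pick_best_prompt` is a hypothetical helper.

```python
from typing import Callable, List, Tuple


def fake_agent(prompt: str, question: str) -> str:
    """Stand-in for a model call: answers tersely only when told to."""
    if "one word" in prompt:
        return "Paris"
    return "The capital of France is, I believe, Paris."


def exact_match(output: str, expected: str) -> float:
    """Strict grader: full credit only for an exact answer."""
    return 1.0 if output.strip() == expected else 0.0


def pick_best_prompt(
    prompts: List[str],
    dataset: List[Tuple[str, str]],
    agent: Callable[[str, str], str],
    grader: Callable[[str, str], float],
) -> Tuple[str, float]:
    """Return the prompt with the highest mean grade, along with that grade."""
    if not prompts or not dataset:
        raise ValueError("prompts and dataset must both be non-empty")
    best_prompt, best_score = prompts[0], -1.0
    for prompt in prompts:
        scores = [grader(agent(prompt, q), expected) for q, expected in dataset]
        mean = sum(scores) / len(scores)
        if mean > best_score:
            best_prompt, best_score = prompt, mean
    return best_prompt, best_score


dataset = [("What is the capital of France?", "Paris")]
prompts = ["Answer the question.", "Answer in one word."]
```

In this toy setup `pick_best_prompt(prompts, dataset, fake_agent, exact_match)` selects the second prompt, because only it produces an answer that passes the strict grader. The same loop works with any grader, including one that scores full traces.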

How safety is included?

OpenAI’s launch materials pair Agent Builder with guardrails (open-source, modular) that can detect jailbreaks, mask/flag PII, and enforce policies at the node/tool boundary. Admins govern connections and data flows through the Connector Registry spanning both ChatGPT and the API.
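As an illustration of what a guardrail at the node or tool boundary does, the sketch below masks two kinds of PII and blocks inputs containing known jailbreak phrases. It is a keyword-and-regex simplification, not OpenAI's guardrails library: the patterns, the phrase list, and the `GuardrailResult` type are assumptions, and a production guardrail would likely rely on trained classifiers.

```python
import re
from dataclasses import dataclass
from typing import List

# Deliberately simple patterns; real PII detection covers far more cases.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")
JAILBREAK_PHRASES = ("ignore previous instructions", "disregard your rules")


@dataclass
class GuardrailResult:
    text: str           # text that is safe to pass to the next node
    flags: List[str]    # what the guardrail detected
    blocked: bool       # True when the input must not proceed


def apply_guardrails(text: str) -> GuardrailResult:
    """Block suspected jailbreaks, then mask emails and phone numbers."""
    lowered = text.lower()
    if any(phrase in lowered for phrase in JAILBREAK_PHRASES):
        return GuardrailResult(text="", flags=["jailbreak"], blocked=True)

    flags: List[str] = []
    masked, count = EMAIL.subn("[EMAIL]", text)
    if count:
        flags.append("pii:email")
    masked, count = PHONE.subn("[PHONE]", masked)
    if count:
        flags.append("pii:phone")
    return GuardrailResult(text=masked, flags=flags, blocked=False)
```

Running the check before a node or tool executes means downstream steps only see the masked text, and the flags give admins a record of what was caught.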

In short, AgentKit is a consolidated stack. It packages a visual Agent Builder for graph-based workflows, an embeddable ChatKit UI, and an Agents SDK that sits on top of the Responses API, which reduces bespoke orchestration and frontend work while keeping evaluation in the loop through datasets and trace grading. Our assessment: the value is operational. Versioned node graphs, built-in tools (web/file search, computer use), connector governance, and standardized eval hooks address production concerns that previously required custom infrastructure.

Introducing AgentKit—build, deploy, and optimize agentic workflows.

💬 ChatKit: Embeddable, customizable chat UI
👷 Agent Builder: WYSIWYG workflow creator
🛤️ Guardrails: Safety screening for inputs/outputs
⚖️ Evals: Datasets, trace grading, auto-prompt optimization pic.twitter.com/pGgNHKOvj3

— OpenAI Developers (@OpenAIDevs) October 6, 2025

Michal Sutter is a data science professional with a Master of Science in Data Science from the University of Padova. With a solid foundation in statistical analysis, machine learning, and data engineering, Michal excels at transforming complex datasets into actionable insights.
