API Server
The Penguin API Server provides a web-based interface for interacting with the Penguin AI agent, enabling both programmatic access and a browser-based user interface.
Architecture
Server Initialization
The API server is built using FastAPI and initializes the core components using a factory pattern with reusable core instances:
def create_app() -> FastAPI:
    """Create and configure the FastAPI application."""
    app = FastAPI(
        title="Penguin AI",
        description="AI Assistant with reasoning, memory, and tool use capabilities",
        version=__version__,
        docs_url="/api/docs",
        redoc_url="/api/redoc"
    )

    # Configure CORS with environment-based origins
    origins_env = os.getenv("PENGUIN_CORS_ORIGINS", "").strip()
    origins_list = [o.strip() for o in origins_env.split(",") if o.strip()] or [
        "http://localhost:8000",
        "http://127.0.0.1:8000",
        "http://localhost:9000",
        "http://127.0.0.1:9000",
    ]
    app.add_middleware(
        CORSMiddleware,
        allow_origins=origins_list,
        allow_credentials=True,
        allow_methods=["*"],
        allow_headers=["*"],
    )

    # Initialize or reuse core instance
    core = get_or_create_core()
    router.core = core
    github_webhook_router.core = core

    # Include API routes
    app.include_router(router)
    app.include_router(github_webhook_router)

    # Mount static files for web UI
    static_dir = Path(__file__).parent / "static"
    if static_dir.exists():
        app.mount("/static", StaticFiles(directory=str(static_dir)), name="static")

    return app
Startup and CORS Hardening
Current server behavior is slightly stricter than older examples on this page:
- if `PENGUIN_CORS_ORIGINS` is unset, Penguin now uses a small localhost-only development allowlist instead of `"*"`
- if the server binds to a non-local interface such as `0.0.0.0` while auth is disabled, startup is blocked unless `PENGUIN_ALLOW_INSECURE_NO_AUTH=true` is set explicitly
That startup check exists to reduce accidental exposure of an unauthenticated development server.
Core Instance Management
The server uses a global core instance pattern for efficient resource usage:
_core_instance: Optional[PenguinCore] = None

def get_or_create_core() -> PenguinCore:
    """Get the global core instance or create it if it doesn't exist."""
    global _core_instance
    if _core_instance is None:
        _core_instance = _create_core()
    return _core_instance
API Endpoints
Task / Project Web Surface Notes
The task/project web surface has recently been hardened to better reflect backend runtime truth.
Key current behaviors:
- task payloads now expose richer lifecycle state, including `phase`, dependency fields, artifact evidence, recipe references, metadata, and clarification requests
- `POST /api/v1/tasks/{task_id}/execute` now runs through `RunMode`
- clarification-needed outcomes are no longer flattened into fake terminal states; this preserves non-terminal states such as `waiting_input`
- `POST /api/v1/tasks/{task_id}/clarification/resume` is available for answering the latest open clarification request and resuming execution
- `GET /api/v1/events/sse` now carries clarification-related session status visibility for OpenCode-compatible web clients
This part of the API is still under active audit, but the intended contract is clear: the web surface should expose the same task and clarification truth that the runtime uses internally.
Chat Endpoints
POST /api/v1/chat/message
Process a chat message with optional conversation support, multi-modal capabilities, and agent routing.
Request Body:
{
"text": "Hello, how can you help me with Python development?",
"conversation_id": "optional-conversation-id",
"session_id": "optional-session-id",
"agent_id": "code-expert",
"directory": "/absolute/path/to/repo",
"context": {"key": "value"},
"context_files": ["path/to/file1.py", "path/to/file2.py"],
"streaming": true,
"max_iterations": 5,
"image_paths": ["/path/to/image.png"],
"include_reasoning": false
}
Parameters:
- `text` (required): The message content
- `conversation_id` (optional): Continue an existing conversation
- `session_id` (optional): Session identity for request scoping and directory binding (takes precedence for binding when both are provided)
- `agent_id` (optional): Route to a specific agent
- `directory` (optional): Working/project directory for scoped tool execution
- `context` (optional): Additional context dictionary
- `context_files` (optional): Files to load as context
- `streaming` (optional): Accepted for compatibility; REST `/chat/message` returns final payloads (use the WebSocket endpoint for token streaming)
- `max_iterations` (optional): Maximum reasoning iterations (default: 5)
- `image_paths` (optional): List of image paths for vision models
- `include_reasoning` (optional): Include reasoning content in response
Response:
{
"response": "I can help you with Python development in several ways...",
"action_results": [
{
"action": "code_execution",
"result": "Execution result...",
"status": "completed"
}
],
"reasoning": "First, I'll analyze the requirements..."
}
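The endpoint can be exercised with a short client script. This is a minimal sketch, assuming a local development server on port 9000 and the `requests` package; the `build_chat_payload` helper is illustrative, not part of Penguin's API:

```python
import requests

BASE_URL = "http://127.0.0.1:9000"  # assumed local dev server

def build_chat_payload(text, conversation_id=None, **options):
    """Assemble a /api/v1/chat/message body, dropping unset optional fields."""
    payload = {"text": text}
    if conversation_id:
        payload["conversation_id"] = conversation_id
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload

def send_message(text, **options):
    """POST a chat message and return the parsed JSON response."""
    resp = requests.post(
        f"{BASE_URL}/api/v1/chat/message",
        json=build_chat_payload(text, **options),
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    result = send_message(
        "Summarize main.py",
        context_files=["main.py"],
        max_iterations=5,
    )
    print(result["response"])
```

Since REST `/chat/message` returns the final payload, this is the right shape for request/response use; switch to the WebSocket endpoint when you need token streaming.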
New Features:
- `agent_id`: Route messages to specific agents for specialized handling
- `include_reasoning`: Capture extended reasoning for reasoning-capable models
- Enhanced streaming with reasoning support
- Improved error handling and response consistency
- Per-agent conversation tracking
Session Directory Binding and Isolation:
- Session-to-directory binding is immutable by default. The first valid `directory` bound to a `session_id`/`conversation_id` wins.
- Rebinding the same session to a different directory returns `409 Conflict`.
- Invalid directories return `400 Bad Request`.
- Request execution is context-scoped, so concurrent requests in different repos keep tool/file/command roots isolated.
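The binding rules above can be checked from a client. A minimal sketch, assuming a local server and a conventional 200 on success; `bind_result` simply maps the documented status codes to readable outcomes:

```python
import requests

BASE_URL = "http://127.0.0.1:9000"  # assumed local dev server

def bind_result(status_code: int) -> str:
    """Map the documented directory-binding status codes to readable outcomes."""
    return {
        200: "bound",
        400: "invalid-directory",
        409: "rebind-conflict",
    }.get(status_code, "other")

def chat_in_directory(session_id: str, directory: str, text: str = "hello"):
    """Send one chat turn scoped to a session and directory."""
    resp = requests.post(
        f"{BASE_URL}/api/v1/chat/message",
        json={"text": text, "session_id": session_id, "directory": directory},
        timeout=120,
    )
    return bind_result(resp.status_code), resp

if __name__ == "__main__":
    # The first valid directory wins; rebinding the same session should 409.
    print(chat_in_directory("sess-demo", "/abs/path/repo-a")[0])
    print(chat_in_directory("sess-demo", "/abs/path/repo-b")[0])
```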
Tool Execution Contract:
- Tools are invoked by the model during chat processing; the web API does not expose a direct “run arbitrary tool by name” chat contract.
- Native provider tool calls are the primary path when supported by the selected provider.
- ActionXML remains a fallback compatibility path and is skipped when native provider tool calls have already executed for the turn.
- Tool arguments and results are persisted as structured metadata where possible, with stable JSON serialization for structured arguments.
- Tool execution is scoped to the session-bound `directory`, and approval-gated tools pause until the approval flow resolves.
- REST responses include completed tool runs in `action_results`; live clients receive corresponding message/tool-part events.
WebSocket /api/v1/chat/stream
Stream chat responses in real-time with support for reasoning models and multiple agents.
Authentication note:
- when web auth is enabled, protected WebSocket routes now enforce authentication during the handshake before `accept()`
- header-based API key or bearer-token auth is supported only for clients that can send auth headers as part of the WebSocket handshake
- native browser `WebSocket(...)` clients cannot attach arbitrary `Authorization` or `X-API-Key` headers, so the example below only works for public routes or routes intentionally exposed via `PENGUIN_PUBLIC_ENDPOINTS`
- query-string API key auth is not supported
- for authenticated connections, use a server-side or Node-based WebSocket client that can set headers, or introduce an authenticated upgrade/ticket flow in front of the browser client
Connection:
// Works for public or intentionally exposed routes only.
const ws = new WebSocket('ws://127.0.0.1:9000/api/v1/chat/stream');
Send Message:
{
"text": "Explain this code",
"agent_id": "code-expert",
"conversation_id": "conv_123",
"session_id": "conv_123",
"directory": "/absolute/path/to/repo",
"include_reasoning": true,
"max_iterations": 5,
"context_files": ["main.py"],
"image_paths": ["/path/to/screenshot.png"]
}
WebSocket Events:
- `start`: Response stream started
- `token`: Individual content tokens (coalesced for performance)
  `{"event": "token", "data": {"token": "Hello"}}`
- `reasoning`: Reasoning tokens (if `include_reasoning: true`)
  `{"event": "reasoning", "data": {"token": "Analyzing code structure..."}}`
- `progress`: Iteration progress updates
  `{"event": "progress", "data": {"iteration": 2, "max_iterations": 5}}`
- `complete`: Final response with all results
  `{"event": "complete", "data": {"response": "Complete response text...", "action_results": [...], "reasoning": "Full reasoning content..."}}`
- `error`: Error information
  `{"event": "error", "data": {"message": "Error details"}}`
Features:
- Token coalescing for improved UI performance (5-character buffer)
- Separate reasoning token stream for reasoning-capable models
- Progress tracking for multi-iteration processing
- Agent-scoped conversations
- Session-scoped directory binding for per-repo isolation
- Automatic buffer flushing on timeout (100ms)
Multi-Agent Endpoints
Penguin now supports multi-agent orchestration with per-agent conversations, model configs, and tool defaults.
GET /api/v1/agents
List all registered agents with their configurations.
Query Parameters:
- `simple`: Return simplified agent list (optional, default: false)
Response:
[
{
"id": "default",
"persona": null,
"persona_description": null,
"model": {
"model": "openai/gpt-5",
"provider": "openai",
"max_output_tokens": 8000,
"max_context_window_tokens": 399000
},
"parent": null,
"children": ["sub-agent-1"],
"default_tools": ["file_read", "bash"],
"active": true,
"paused": false,
"is_sub_agent": false
}
]
POST /api/v1/agents
Create a new agent with optional model config and persona.
Request Body:
{
"id": "code-reviewer",
"model_config_id": "claude-opus",
"persona": "code-review-expert",
"system_prompt": "You are a code review expert...",
"activate": true,
"default_tools": ["file_read", "grep"],
"initial_prompt": "Ready to review code"
}
Response:
{
"id": "code-reviewer",
"persona": "code-review-expert",
"model": {
"model": "anthropic/claude-opus-20240229",
"provider": "anthropic"
},
"active": true
}
GET /api/v1/agents/{agent_id}
Get detailed information about a specific agent.
PATCH /api/v1/agents/{agent_id}
Update agent state (e.g., pause/resume).
Request Body:
{
"paused": true
}
DELETE /api/v1/agents/{agent_id}
Remove an agent from the registry.
Query Parameters:
- `preserve_conversation`: Keep conversation history (default: true)
POST /api/v1/agents/{agent_id}/delegate
Delegate a task to a specific agent (parent→child delegation pattern).
Request Body:
{
"content": "Review this code for security issues",
"parent": "default",
"channel": "code-review",
"metadata": {"priority": "high"}
}
GET /api/v1/agents/{agent_id}/history
Get conversation history for a specific agent.
Query Parameters:
- `include_system`: Include system messages (default: true)
- `limit`: Maximum number of messages
GET /api/v1/agents/{agent_id}/sessions
List all sessions for a specific agent.
GET /api/v1/agents/{agent_id}/sessions/{session_id}/history
Get history for a specific agent session.
Message Routing Endpoints
POST /api/v1/messages
Send a message to a specific recipient (agent or human).
Request Body:
{
"recipient": "code-reviewer",
"content": "Please review this PR",
"message_type": "message",
"channel": "code-review",
"metadata": {"pr_id": "123"}
}
POST /api/v1/messages/to-agent
Send a message directly to an agent.
POST /api/v1/messages/to-human
Send a message to the human user (for agent-initiated communication).
POST /api/v1/messages/human-reply
Send a human reply to a specific agent.
WebSocket Event Stream
WebSocket /api/v1/events/ws
Stream real-time events from agents and system with filtering.
Query Parameters:
- `agent_id`: Filter by agent ID (optional)
- `message_type`: Filter by message type (`message`|`action`|`status`)
- `channel`: Filter by channel (optional)
- `include_ui`: Include UI events (default: true)
- `include_bus`: Include MessageBus events (default: true)
Event Types:
- `bus.message`: MessageBus protocol messages
- `message`: User/assistant messages
- `stream_chunk`: Streaming response chunks
- `human_message`: Messages to human user
WebSocket /api/v1/ws/messages
Alias of /api/v1/events/ws for message streaming (backward compatibility).
Telemetry Endpoints
GET /api/v1/telemetry
Get comprehensive telemetry summary including per-agent token usage.
Response:
{
"agents": {
"default": {
"messages_sent": 42,
"tasks_completed": 5
}
},
"tokens": {
"total": {"prompt": 15000, "completion": 8000},
"per_agent": {
"default": {"prompt": 12000, "completion": 6000},
"code-reviewer": {"prompt": 3000, "completion": 2000}
}
},
"uptime_seconds": 3600
}
WebSocket /api/v1/ws/telemetry
Stream telemetry updates in real-time.
Query Parameters:
- `agent_id`: Filter by agent ID (optional)
- `interval`: Update interval in seconds (default: 2)
Multi-Agent Coordinator Endpoints
The coordinator provides role-based routing and workflow orchestration across agents.
POST /api/v1/coord/send-role
Send a message to all agents with a specific role.
Request Body:
{
"role": "code-reviewer",
"content": "Review the latest commit",
"message_type": "message"
}
POST /api/v1/coord/broadcast
Broadcast a message to multiple roles simultaneously.
Request Body:
{
"roles": ["code-reviewer", "tester"],
"content": "New PR ready for review",
"message_type": "message"
}
POST /api/v1/coord/rr-workflow
Execute a round-robin workflow across agents with a specific role.
Request Body:
{
"role": "worker",
"prompts": [
"Process task 1",
"Process task 2",
"Process task 3"
]
}
POST /api/v1/coord/role-chain
Execute a sequential workflow across different roles.
Request Body:
{
"roles": ["analyzer", "implementer", "tester"],
"content": "Implement feature X with full testing"
}
POST /api/v1/agents/{agent_id}/register
Register an existing agent with a coordinator role.
Request Body:
{
"role": "code-reviewer"
}
GitHub Integration Endpoints
Penguin includes built-in GitHub webhook support for automated workflows.
POST /api/v1/integrations/github/webhook
Receive GitHub webhook events (configured in GitHub repository settings).
Supported Events:
- `push`: Code push events
- `pull_request`: PR creation, updates, and merges
- `issues`: Issue creation and updates
- `issue_comment`: Comments on issues
- `pull_request_review`: PR reviews
- `workflow_run`: GitHub Actions workflow results
Event Processing: The webhook handler automatically:
- Validates webhook signatures (if `GITHUB_WEBHOOK_SECRET` is set)
- Routes events to appropriate agents based on configuration
- Triggers automated analysis, review, or response workflows
- Logs events for debugging and auditing
Configuration: Set these environment variables:
GITHUB_WEBHOOK_SECRET=your-webhook-secret
GITHUB_APP_ID=your-app-id # Optional
GITHUB_APP_PRIVATE_KEY_PATH=/path/to/key.pem # Optional
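If you terminate webhooks in front of Penguin (for example in a relay or gateway), signature validation follows GitHub's standard `X-Hub-Signature-256` scheme. A stdlib-only sketch of that check, independent of Penguin's own handler:

```python
import hashlib
import hmac

def verify_github_signature(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Check a GitHub X-Hub-Signature-256 header against the raw request body."""
    if not signature_header.startswith("sha256="):
        return False
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences
    return hmac.compare_digest("sha256=" + expected, signature_header)

# A signature computed with the same secret verifies; anything else fails.
secret = b"your-webhook-secret"
body = b'{"action": "opened"}'
sig = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
assert verify_github_signature(secret, body, sig)
assert not verify_github_signature(secret, b"tampered", sig)
```

The HMAC must be computed over the raw request bytes, not a re-serialized JSON body, or valid deliveries will fail verification.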
Conversation Endpoints
GET /api/v1/conversations
List all available conversations.
GET /api/v1/conversations/{conversation_id}
Retrieve a specific conversation by ID.
GET /api/v1/conversations/{conversation_id}/history
Get the message history for a specific conversation.
Query Parameters:
- `include_system`: Include system messages (default: true)
- `limit`: Maximum number of messages
- `agent_id`: Filter by agent ID (optional)
- `channel`: Filter by channel (optional)
- `message_type`: Filter by message type (optional)
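These filters compose as ordinary URL query parameters. A sketch assuming a local server; `history_url` is an illustrative helper that drops unset filters:

```python
import requests
from urllib.parse import urlencode

BASE_URL = "http://127.0.0.1:9000"  # assumed local dev server

def history_url(conversation_id, **filters):
    """Build the conversation-history URL, omitting filters that are None."""
    query = urlencode({k: v for k, v in filters.items() if v is not None})
    url = f"{BASE_URL}/api/v1/conversations/{conversation_id}/history"
    return f"{url}?{query}" if query else url

if __name__ == "__main__":
    resp = requests.get(history_url("conv_123", limit=50, agent_id="code-expert"), timeout=30)
    print(resp.json())
```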
POST /api/v1/conversations/create
Create a new conversation.
Project Management
POST /api/v1/projects/create
Create a new project.
Request Body:
{
"name": "My New Project",
"description": "Optional project description"
}
POST /api/v1/tasks/execute
Execute a task in the background.
Request Body:
{
"name": "Task name",
"description": "Task description",
"continuous": false,
"time_limit": 30
}
Utility Endpoints
GET /api/v1/token-usage
Get current token usage statistics.
GET /api/v1/context-files
List all available context files.
POST /api/v1/context-files/load
Load a context file into the current conversation.
Request Body:
{
"file_path": "path/to/context/file.md"
}
Checkpoint Management
Penguin now supports conversation checkpointing for branching and rollback functionality.
POST /api/v1/checkpoints/create
Create a manual checkpoint of the current conversation state.
Request Body:
{
"name": "Before refactoring",
"description": "Checkpoint before starting code refactoring"
}
Response:
{
"checkpoint_id": "ckpt_abc123",
"status": "created",
"name": "Before refactoring",
"description": "Checkpoint before starting code refactoring"
}
GET /api/v1/checkpoints
List available checkpoints with optional filtering.
Query Parameters:
- `session_id`: Filter by session ID (optional)
- `limit`: Maximum number of checkpoints (default: 50)
Response:
{
"checkpoints": [
{
"id": "ckpt_abc123",
"name": "Before refactoring",
"description": "Checkpoint before starting code refactoring",
"created_at": "2024-01-01T10:00:00Z",
"type": "manual",
"session_id": "session_xyz"
}
]
}
POST /api/v1/checkpoints/{checkpoint_id}/rollback
Rollback conversation to a specific checkpoint.
Response:
{
"status": "success",
"checkpoint_id": "ckpt_abc123",
"message": "Successfully rolled back to checkpoint ckpt_abc123"
}
POST /api/v1/checkpoints/{checkpoint_id}/branch
Create a new conversation branch from a checkpoint.
Request Body:
{
"name": "Alternative approach",
"description": "Exploring different solution path"
}
Response:
{
"branch_id": "conv_branch_xyz",
"source_checkpoint_id": "ckpt_abc123",
"status": "created",
"name": "Alternative approach",
"description": "Exploring different solution path"
}
GET /api/v1/checkpoints/stats
Get statistics about the checkpointing system.
Response:
{
"enabled": true,
"total_checkpoints": 25,
"auto_checkpoints": 20,
"manual_checkpoints": 4,
"branch_checkpoints": 1,
"config": {
"frequency": 1,
"retention_hours": 24,
"max_age_days": 30
}
}
POST /api/v1/checkpoints/cleanup
Clean up old checkpoints according to retention policy.
Response:
{
"status": "completed",
"cleaned_count": 5,
"message": "Cleaned up 5 old checkpoints"
}
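A typical checkpoint workflow (create, then roll back after a risky change) can be scripted against these endpoints. A sketch assuming a local server; `checkpoint_ids` is an illustrative helper over the documented list shape:

```python
import requests

BASE_URL = "http://127.0.0.1:9000"  # assumed local dev server

def checkpoint_ids(listing: dict, session_id=None) -> list:
    """Extract checkpoint IDs from a GET /api/v1/checkpoints payload."""
    return [
        c["id"]
        for c in listing.get("checkpoints", [])
        if session_id is None or c.get("session_id") == session_id
    ]

def create_checkpoint(name: str, description: str = "") -> str:
    resp = requests.post(
        f"{BASE_URL}/api/v1/checkpoints/create",
        json={"name": name, "description": description},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["checkpoint_id"]

def rollback(checkpoint_id: str) -> dict:
    resp = requests.post(
        f"{BASE_URL}/api/v1/checkpoints/{checkpoint_id}/rollback", timeout=30
    )
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    ckpt = create_checkpoint("Before refactoring")
    # ...make risky changes, then restore the saved state:
    print(rollback(ckpt)["message"])
```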
Model Management
Penguin supports runtime model switching and model discovery.
GET /api/v1/models
List all available models from configuration.
Response:
{
"models": [
{
"id": "claude-3-sonnet",
"name": "anthropic/claude-3-sonnet-20240229",
"provider": "anthropic",
"vision_enabled": true,
"max_output_tokens": 4000,
"current": true
},
{
"id": "gpt-4",
"name": "openai/gpt-4",
"provider": "openai",
"vision_enabled": false,
"max_output_tokens": 8000,
"current": false
}
]
}
POST /api/v1/models/load
Switch to a different model at runtime.
Request Body:
{
"model_id": "gpt-4"
}
Response:
{
"status": "success",
"model_id": "gpt-4",
"current_model": "openai/gpt-4",
"message": "Successfully loaded model: gpt-4"
}
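Listing and loading compose naturally: list the configured models, pick one by capability, and load it. A sketch assuming a local server; `pick_model` is an illustrative helper over the documented `/api/v1/models` response shape:

```python
import requests

BASE_URL = "http://127.0.0.1:9000"  # assumed local dev server

def pick_model(models: list, want_vision: bool = False):
    """Return the first model ID matching a capability filter, else None."""
    for m in models:
        if not want_vision or m.get("vision_enabled"):
            return m["id"]
    return None

def switch_model(model_id: str) -> dict:
    resp = requests.post(
        f"{BASE_URL}/api/v1/models/load", json={"model_id": model_id}, timeout=60
    )
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    models = requests.get(f"{BASE_URL}/api/v1/models", timeout=30).json()["models"]
    target = pick_model(models, want_vision=True)
    if target:
        print(switch_model(target)["message"])
```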
GET /api/v1/models/current
Get information about the currently loaded model.
Response:
{
"model": "anthropic/claude-3-sonnet-20240229",
"provider": "anthropic",
"client_preference": "native",
"max_output_tokens": 4000,
"temperature": 0.7,
"streaming_enabled": true,
"vision_enabled": true
}
GET /api/v1/models/discover
Discover available models from OpenRouter catalogue (requires OPENROUTER_API_KEY).
Response:
{
"models": [
{
"id": "anthropic/claude-3-opus-20240229",
"name": "Claude 3 Opus",
"provider": "anthropic",
"context_length": 200000,
"max_output_tokens": 4096,
"pricing": {
"prompt": "0.000015",
"completion": "0.000075"
}
},
{
"id": "openai/gpt-4-turbo",
"name": "GPT-4 Turbo",
"provider": "openai",
"context_length": 128000,
"max_output_tokens": 4096,
"pricing": {
"prompt": "0.00001",
"completion": "0.00003"
}
}
]
}
Configuration: Set environment variables for attribution:
OPENROUTER_API_KEY=your-key
OPENROUTER_SITE_URL=https://your-app.com # Optional
OPENROUTER_SITE_TITLE=YourAppName # Optional
Enhanced Task Execution
POST /api/v1/tasks/execute-sync
Execute a task synchronously using the Engine layer with enhanced error handling.
Request Body:
{
"name": "Create API endpoint",
"description": "Create a REST API endpoint for user management",
"continuous": false,
"time_limit": 300
}
Response:
{
"status": "completed",
"response": "I've created a REST API endpoint for user management...",
"iterations": 3,
"execution_time": 45.2,
"action_results": [
{
"action": "file_creation",
"result": "Created api/users.py",
"status": "completed"
}
],
"task_metadata": {
"name": "Create API endpoint",
"continuous": false
}
}
WebSocket /api/v1/tasks/stream
Stream task execution events in real-time for long-running tasks.
WebSocket Events:
- `start`: Task execution started
- `progress`: Progress updates during execution
- `action`: Individual action execution results
- `complete`: Task completed successfully
- `error`: Error occurred during execution
Runtime Configuration Management
Penguin provides runtime configuration management for dynamically adjusting project and workspace roots without restarting the server.
GET /api/v1/system/config
Get the current runtime configuration.
Response:
{
"status": "success",
"config": {
"project_root": "/Users/username/my-project",
"workspace_root": "/Users/username/penguin_workspace",
"execution_mode": "project",
"active_root": "/Users/username/my-project"
}
}
Configuration Fields:
- `project_root`: The current project directory (typically a git repository)
- `workspace_root`: The Penguin workspace directory (for notes, conversations, memory)
- `execution_mode`: Current execution mode (`project` or `workspace`)
- `active_root`: The currently active root based on execution mode
POST /api/v1/system/config/project-root
Dynamically change the project root directory at runtime.
Request Body:
{
"path": "/Users/username/new-project"
}
Response:
{
"status": "success",
"message": "Project root set to /Users/username/new-project",
"path": "/Users/username/new-project",
"active_root": "/Users/username/new-project"
}
Features:
- Validates that the path exists and is a directory
- Updates all subscribed components (ToolManager, FileMap, etc.)
- Changes take effect immediately without server restart
- Preserves workspace root and other settings
Error Responses:
- `400`: Path does not exist or is not a directory
- `500`: Internal error during configuration update
POST /api/v1/system/config/workspace-root
Dynamically change the workspace root directory at runtime.
Request Body:
{
"path": "/Users/username/custom-workspace"
}
Response:
{
"status": "success",
"message": "Workspace root set to /Users/username/custom-workspace",
"path": "/Users/username/custom-workspace",
"active_root": "/Users/username/custom-workspace"
}
Use Cases:
- Switch between different workspace directories
- Point to shared team workspaces
- Temporarily work in an isolated environment
POST /api/v1/system/config/execution-mode
Switch between project and workspace execution modes.
Request Body:
{
"path": "workspace"
}
Valid Modes:
- `project`: File operations target the project root
- `workspace`: File operations target the workspace root
Response:
{
"status": "success",
"message": "Execution mode set to workspace: /Users/username/penguin_workspace",
"execution_mode": "workspace",
"active_root": "/Users/username/penguin_workspace"
}
Behavior:
- `project` mode: Tools operate in your code repository
- `workspace` mode: Tools operate in the Penguin workspace (isolated from project)
- Active root updates automatically based on mode
- File operations respect the selected mode immediately
Error Responses:
- `400`: Invalid mode (must be 'project' or 'workspace')
Architecture: Observer Pattern
The runtime configuration system uses an observer pattern for clean component synchronization:
Benefits:
- Immediate Updates: Changes propagate to all components instantly
- No Restart Required: Server continues running without interruption
- Type-Safe: Validation ensures configuration integrity
- Observable: Components can react to configuration changes autonomously
Example: Switching Projects
import requests

base_url = "http://127.0.0.1:9000"

# Get current configuration
response = requests.get(f"{base_url}/api/v1/system/config")
print(response.json())

# Switch to a different project
response = requests.post(
    f"{base_url}/api/v1/system/config/project-root",
    json={"path": "/Users/username/other-project"}
)
print(response.json())

# Switch execution mode to project
response = requests.post(
    f"{base_url}/api/v1/system/config/execution-mode",
    json={"path": "project"}
)
print(response.json())
Environment Variables (Initial Configuration)
While runtime endpoints allow dynamic changes, you can set initial values via environment variables:
# Set initial project root
PENGUIN_PROJECT_ROOT=/path/to/project
# Set initial workspace root
PENGUIN_WORKSPACE=/path/to/workspace
# Set initial execution mode
PENGUIN_WRITE_ROOT=project # or 'workspace'
These environment variables are read at server startup and can be overridden at runtime via the API.
System Diagnostics
GET /api/v1/system/info
Get comprehensive system information including component status.
Response:
{
"penguin_version": "0.4.0",
"engine_available": true,
"checkpoints_enabled": true,
"current_model": {
"model": "anthropic/claude-3-sonnet-20240229",
"provider": "anthropic",
"streaming_enabled": true,
"vision_enabled": true
},
"conversation_manager": {
"active": true,
"current_session_id": "session_xyz",
"total_messages": 42
},
"tool_manager": {
"active": true,
"total_tools": 15
},
"memory_provider": {
"initialized": true,
"provider_type": "LanceProvider"
}
}
GET /api/v1/system/status
Get current system status and runtime information.
Response:
{
"status": "active",
"runmode_status": "RunMode idle.",
"continuous_mode": false,
"streaming_active": false,
"token_usage": {
"total": {"input": 1500, "output": 800},
"session": {"input": 300, "output": 150}
},
"timestamp": "2024-01-01T12:00:00Z",
"initialization": {
"core_initialized": true,
"fast_startup_enabled": true
}
}
Enhanced Capabilities Discovery
GET /api/v1/capabilities
Get comprehensive model and system capabilities.
Response:
{
"capabilities": {
"vision_enabled": true,
"streaming_enabled": true,
"checkpoint_management": true,
"model_switching": true,
"file_upload": true,
"websocket_support": true,
"task_execution": true,
"run_mode": true,
"multi_modal": true,
"context_files": true
},
"current_model": {
"model": "anthropic/claude-3-sonnet-20240229",
"provider": "anthropic",
"vision_enabled": true
},
"api_version": "v1",
"penguin_version": "0.4.0"
}
Security & Permissions
The web server now has active auth controls instead of relying entirely on network trust.
Current behavior:
- protected HTTP routes can require API-key or JWT authentication
- protected WebSocket routes require explicit auth checks before connection acceptance
- query-parameter API keys are not accepted
- additional unauthenticated routes can be exposed intentionally with `PENGUIN_PUBLIC_ENDPOINTS`
Default public routes include:
- `/`
- `/api/docs`
- `/api/redoc`
- `/api/openapi.json`
- `/api/v1/health`
- `/static/...`
GitHub webhook note: When global auth is enabled, GitHub webhook delivery usually requires either:
- exposing `/api/v1/integrations/github/webhook` through `PENGUIN_PUBLIC_ENDPOINTS`, or
- placing Penguin behind a relay/gateway that handles the trust boundary
GitHub will not send Penguin API keys on webhook delivery.
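Authenticated clients attach credentials via headers. A sketch of a request helper; the `X-API-Key` and bearer-token header forms follow the notes above, while the `PENGUIN_JWT` and `PENGUIN_API_KEY` environment variable names are illustrative assumptions:

```python
import os

import requests

BASE_URL = "http://127.0.0.1:9000"  # assumed local dev server

def auth_headers() -> dict:
    """Build auth headers from environment variables.

    Header names follow the server's auth notes (bearer token or X-API-Key);
    the environment variable names here are illustrative, not canonical.
    """
    token = os.getenv("PENGUIN_JWT", "")
    if token:
        return {"Authorization": f"Bearer {token}"}
    key = os.getenv("PENGUIN_API_KEY", "")
    if key:
        return {"X-API-Key": key}
    return {}

def get_protected(path: str):
    """GET a protected route with whatever credentials are configured."""
    resp = requests.get(f"{BASE_URL}{path}", headers=auth_headers(), timeout=30)
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    print(get_protected("/api/v1/system/status"))
```

Remember that query-parameter API keys are not accepted, so credentials must travel in headers.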
Penguin provides security endpoints for managing permissions, approvals, and audit logs.
The security and approval endpoints are currently implemented in penguin/api/routes.py and are being migrated to penguin/web/routes.py. See context/architecture/api-routes-audit.md for the full migration plan. Both router files are functional, but web/routes.py is the primary active file.
GET /api/v1/security/audit
Get recent permission audit log entries.
Query Parameters:
- `limit`: Maximum entries to return (1-1000, default: 100)
- `result`: Filter by result (`allow`, `ask`, `deny`)
- `category`: Filter by category (`filesystem`, `process`, `network`, `git`, `memory`)
- `agent_id`: Filter by agent ID
Response:
{
"entries": [
{
"id": "audit_abc123",
"timestamp": "2024-01-01T12:00:00Z",
"operation": "filesystem.write",
"resource": "/path/to/file.py",
"result": "allow",
"reason": "Within workspace boundary",
"policy": "workspace-boundary",
"agent_id": "default",
"tool_name": "write_file"
}
],
"total": 42,
"filters": {
"limit": 100,
"result": null,
"category": null,
"agent_id": null
}
}
GET /api/v1/security/audit/stats
Get audit statistics summary.
Response:
{
"total": 1250,
"by_result": {
"allow": 1180,
"ask": 45,
"deny": 25
},
"by_category": {
"filesystem": 890,
"process": 200,
"network": 100,
"git": 60
}
}
GET /api/v1/approvals
List pending approval requests.
Response:
{
"pending": [
{
"id": "apr_xyz789",
"tool_name": "bash",
"operation": "process.execute",
"resource": "rm -rf ./build",
"reason": "Destructive command requires approval",
"session_id": "sess_abc",
"created_at": "2024-01-01T12:00:00Z",
"expires_at": "2024-01-01T12:05:00Z"
}
]
}
POST /api/v1/approvals/{request_id}/approve
Approve a pending permission request.
Request Body:
{
"scope": "session"
}
Scope Options:
- `once`: Approve this single request only
- `session`: Approve similar operations for the current session
- `pattern`: Approve operations matching the resource pattern
Response:
{
"status": "approved",
"request_id": "apr_xyz789",
"scope": "session"
}
POST /api/v1/approvals/{request_id}/deny
Deny a pending permission request.
Response:
{
"status": "denied",
"request_id": "apr_xyz789"
}
POST /api/v1/approvals/pre-approve
Pre-approve an operation pattern for a session.
Request Body:
{
"operation": "filesystem.delete",
"pattern": "./build/*",
"session_id": "sess_abc"
}
Response:
{
"status": "pre-approved",
"operation": "filesystem.delete",
"pattern": "./build/*"
}
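Pending approvals can be drained programmatically. A sketch assuming a local server; the `looks_destructive` triage rule is purely illustrative, since real policy belongs in the permission system rather than client heuristics:

```python
import requests

BASE_URL = "http://127.0.0.1:9000"  # assumed local dev server

DESTRUCTIVE_HINTS = ("rm -rf", "drop table", "force push")

def looks_destructive(resource: str) -> bool:
    """Naive client-side triage: flag obviously destructive resources."""
    lowered = resource.lower()
    return any(hint in lowered for hint in DESTRUCTIVE_HINTS)

def resolve_pending(auto_deny_destructive: bool = True) -> None:
    """Approve or deny each pending request exactly once."""
    pending = requests.get(f"{BASE_URL}/api/v1/approvals", timeout=30).json()["pending"]
    for req in pending:
        if auto_deny_destructive and looks_destructive(req["resource"]):
            url = f"{BASE_URL}/api/v1/approvals/{req['id']}/deny"
            requests.post(url, timeout=30).raise_for_status()
        else:
            url = f"{BASE_URL}/api/v1/approvals/{req['id']}/approve"
            requests.post(url, json={"scope": "once"}, timeout=30).raise_for_status()

if __name__ == "__main__":
    resolve_pending()
```

Note that requests expire (`expires_at` in the listing), so a polling client should resolve them promptly.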
WebSocket Approval Events
When using WebSocket streaming, approval requests are sent as events:
{
"event": "approval_required",
"data": {
"request_id": "apr_xyz789",
"tool_name": "bash",
"operation": "process.execute",
"resource": "rm -rf ./build",
"reason": "Destructive command requires approval",
"expires_at": "2024-01-01T12:05:00Z"
}
}
After approval/denial:
{
"event": "approval_resolved",
"data": {
"request_id": "apr_xyz789",
"status": "approved",
"scope": "session"
}
}
File Upload and Multi-Modal Support
POST /api/v1/upload
Upload files (primarily images) for use in conversations.
Security notes:
- this endpoint is currently restricted to image uploads
- empty uploads are rejected
- server-side size enforcement is controlled by `PENGUIN_MAX_UPLOAD_BYTES`
- partially written files are removed if validation fails
Request Body: (multipart/form-data)
file: The file to upload
Response:
{
"path": "/workspace/uploads/abc123.png",
"filename": "image.png",
"content_type": "image/png"
}
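Since non-image uploads are rejected server-side, a client can pre-check the content type before hitting the endpoint. A sketch assuming a local server; `guess_image_type` is an illustrative stdlib helper:

```python
import mimetypes
from pathlib import Path

import requests

BASE_URL = "http://127.0.0.1:9000"  # assumed local dev server

def guess_image_type(path):
    """Return the MIME type if the path looks like an image, else None."""
    ctype, _ = mimetypes.guess_type(str(path))
    return ctype if ctype and ctype.startswith("image/") else None

def upload_image(path) -> str:
    """Upload an image via multipart/form-data and return its server-side path."""
    ctype = guess_image_type(path)
    if ctype is None:
        raise ValueError(f"{path} is not an image; the endpoint rejects non-image uploads")
    with open(path, "rb") as fh:
        resp = requests.post(
            f"{BASE_URL}/api/v1/upload",
            files={"file": (Path(path).name, fh, ctype)},
            timeout=60,
        )
    resp.raise_for_status()
    return resp.json()["path"]

if __name__ == "__main__":
    # The returned path can then be passed as an image_paths entry in chat requests.
    print(upload_image("screenshot.png"))
```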
Web Interface
Penguin includes a simple web-based chat interface for interacting with the assistant directly in the browser.
Web UI Features
The browser interface provides a simple chat experience with:
- Conversation history management
- Markdown rendering for formatted responses
- Code syntax highlighting
- Real-time updates
- Conversation switching and creation
PenguinAPI - Programmatic Access
The PenguinAPI class provides a Python interface for embedding Penguin in other applications:
from penguin.web import PenguinAPI

# Create API instance (reuses or creates core)
api = PenguinAPI()

# The await examples below assume an async context (e.g. inside an async function)
# Send a message with streaming
response = await api.chat(
    message="Help me debug this function",
    agent_id="code-expert",
    streaming=True,
    include_reasoning=True,
    on_chunk=lambda chunk, msg_type: print(f"[{msg_type}] {chunk}")
)
print(response["assistant_response"])
print(response["reasoning"])

# Stream responses
async for msg_type, chunk in api.stream_chat(
    message="Explain this code",
    agent_id="code-expert"
):
    print(f"[{msg_type}] {chunk}", end="")

# Manage conversations
conversation_id = await api.create_conversation(name="Debug Session")
conversations = await api.list_conversations()
history = await api.get_conversation_history(conversation_id)

# Execute tasks via Engine
result = await api.run_task(
    task_description="Implement user authentication",
    max_iterations=10,
    project_id="my-project"
)
Key Features
- Async/Await Support: Full async support for non-blocking operations
- Streaming Callbacks: Real-time token streaming with `on_chunk` callback
- Agent Routing: Direct messages to specific agents via `agent_id`
- Reasoning Support: Capture extended reasoning with `include_reasoning`
- Conversation Management: Create, list, and manage conversation history
- Task Execution: High-level task execution via Engine layer
- Health Monitoring: Check system health and component status
Integration with Core Components
The API server integrates with Penguin's core components using a factory pattern:
def _create_core() -> PenguinCore:
    """Create a new PenguinCore instance with proper configuration."""
    config_obj = Config.load_config()
    model_config = config_obj.model_config

    # Initialize components
    api_client = APIClient(model_config=model_config)
    api_client.set_system_prompt(SYSTEM_PROMPT)

    config_dict = config_obj.to_dict() if hasattr(config_obj, 'to_dict') else {}
    tool_manager = ToolManager(config_dict, log_error)

    # Create core with proper Config object
    core = PenguinCore(
        config=config_obj,
        api_client=api_client,
        tool_manager=tool_manager,
        model_config=model_config
    )
    return core
This integration ensures:
- ModelConfig: Configures model behavior with provider abstraction
- APIClient: Handles LLM communication with streaming and reasoning support
- ToolManager: Executes tools with lazy loading and fast startup
- PenguinCore: Coordinates multi-agent system with event-driven architecture
- ConversationManager: Manages per-agent sessions with checkpointing
- Engine: Provides high-level task orchestration and stop conditions
Concurrency Isolation Audit (Recommended)
For substantial web/runtime changes, run a focused isolation audit before calling a deployment production-safe:
- Verify request-dependent state (`session_id`, `conversation_id`, `agent_id`, `directory`) is context-scoped and not stored in shared mutable globals.
- Verify concurrent requests against different repos do not cross-write files or leak tool cwd roots.
- Verify event payloads (`message.*`, `tool.*`, stream events) preserve the correct scoped session/conversation IDs.
- Verify fallback paths (non-Engine or legacy paths) either keep equivalent isolation guarantees or are explicitly documented.
- Re-run targeted multi-session tests with parallel chat/tool workloads before release.
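The cross-write check can be smoke-tested with a small parallel workload. A sketch assuming a local server and two sessions bound to hypothetical repo paths; `crossed_sessions` is a crude illustrative check for one session's output echoing another session's directory:

```python
from concurrent.futures import ThreadPoolExecutor

import requests

BASE_URL = "http://127.0.0.1:9000"  # assumed local dev server

SESSIONS = [  # hypothetical sessions bound to different repos
    ("sess-a", "/abs/path/repo-a"),
    ("sess-b", "/abs/path/repo-b"),
]

def chat(session_id, directory):
    """Send one directory-scoped chat turn and return (session_id, body)."""
    resp = requests.post(
        f"{BASE_URL}/api/v1/chat/message",
        json={"text": "List files in the repo root",
              "session_id": session_id, "directory": directory},
        timeout=300,
    )
    resp.raise_for_status()
    return session_id, resp.json()

def crossed_sessions(results):
    """Flag any result whose payload mentions a different session's directory."""
    dirs = dict(SESSIONS)
    return [sid for sid, body in results
            if any(d in str(body) for s, d in dirs.items() if s != sid)]

if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=len(SESSIONS)) as pool:
        results = list(pool.map(lambda sd: chat(*sd), SESSIONS))
    print("cross-talk:", crossed_sessions(results) or "none detected")
```

A clean run is necessary but not sufficient; it complements, rather than replaces, the targeted multi-session tests listed above.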
Running the Server
Command Line
Start the server using the CLI:
# Using the penguin-web command
penguin-web
# Or using Python module
python -m penguin.web.server
Programmatic Startup
Start the server from Python code:
from penguin.web import start_server

# Start with custom settings
start_server(
    host="0.0.0.0",
    port=9000,
    debug=False  # set True to enable auto-reload in development
)
Environment Variables
Configure the penguin-web entrypoint via environment variables:
# Server settings
HOST=0.0.0.0
PORT=9000
DEBUG=false
# CORS configuration
PENGUIN_CORS_ORIGINS=http://localhost:3000,https://myapp.com
# GitHub webhook integration
GITHUB_WEBHOOK_SECRET=your-secret
GITHUB_APP_ID=123456
GITHUB_APP_PRIVATE_KEY_PATH=/path/to/key.pem
# Model configuration
OPENROUTER_API_KEY=your-key # For model discovery
Access Points
By default, the server runs on port 9000 and provides:
- Web UI: http://127.0.0.1:9000/
- API Documentation (Swagger): http://127.0.0.1:9000/api/docs
- API Documentation (ReDoc): http://127.0.0.1:9000/api/redoc
- GitHub Webhook: http://127.0.0.1:9000/api/v1/integrations/github/webhook
- WebSocket Events: ws://127.0.0.1:9000/api/v1/events/ws
- WebSocket Chat: ws://127.0.0.1:9000/api/v1/chat/stream
API Documentation
FastAPI automatically generates OpenAPI documentation for all endpoints, available at:
- Swagger UI: http://127.0.0.1:9000/api/docs
- ReDoc: http://127.0.0.1:9000/api/redoc