Overview

The Responses API (/v1/responses) is OpenAI’s most advanced interface for generating model responses, with richer interaction capabilities and tool integration. This endpoint follows the OpenAI Responses API format and provides features beyond the standard chat completions endpoint.

Core Features

  • Multimodal Input: Supports text, image, and file inputs
  • Text Output: Generates high-quality text responses
  • Stateful Interaction: Uses outputs from previous responses as subsequent inputs, maintaining conversation coherence
  • Built-in Tools: Integrates file search, web search, code interpreter, and other functions
  • Function Calling: Allows models to access external systems and data sources
  • Streaming Support: Real-time streaming responses via Server-Sent Events (SSE)
  • Reasoning Models: Supports reasoning configuration for gpt-5 and o-series models

Important Notes

Model Variations: Different model providers may support different request parameters and return varying response fields. We strongly recommend consulting the model catalog for complete parameter lists and usage instructions for each model.
Response Pass-through Principle: WisGate typically does not modify model responses beyond reverse-engineered formats, ensuring you receive response content consistent with the original API provider.
When to Use Responses API: Use the /v1/responses endpoint for OpenAI Pro series models (like o3-pro, o3-mini) and when you need advanced features like built-in tools, multimodal inputs, or stateful conversations. For standard chat completions, use /v1/chat/completions.

Auto-Generated Documentation: The request parameters and response format are automatically generated from the OpenAPI specification. All parameters, their types, descriptions, defaults, and examples are pulled directly from openapi.json. Scroll down to see the interactive API reference.

FAQ

What’s the difference between /v1/chat/completions and /v1/responses?

The /v1/responses endpoint is OpenAI’s more advanced interface that offers:
  • Built-in tools: Web search, file search, code interpreter
  • Multimodal inputs: Support for images and files in addition to text
  • Stateful conversations: Better conversation state management
  • Required for Pro models: OpenAI Pro series models (o3-pro, o3-mini) must use this endpoint
Use /v1/chat/completions for standard chat interactions with most models. Use /v1/responses when you need advanced features or are using Pro series models.

How do I use multimodal inputs (text + images)?

You can combine text and images in a single request:
response = client.responses.create(
    model="gpt-4.1",
    input=[
        {
            "role": "user",
            "content": [
                {
                    "type": "input_text",
                    "text": "What is in this image?"
                },
                {
                    "type": "input_image",
                    "image_url": "https://example.com/image.jpg"
                }
            ]
        }
    ]
)
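When the image is a local file rather than a public URL, one common approach is to embed it as a base64 data URL. The sketch below assumes the endpoint accepts data URLs in image_url, as OpenAI's Responses API does; build_image_input is a hypothetical helper for illustration, not part of any SDK.

```python
import base64


def build_image_input(prompt: str, image_bytes: bytes, mime_type: str = "image/jpeg") -> list:
    """Build a Responses-style `input` list with one text part and one inline image."""
    # Encode the raw bytes and wrap them in a data URL (assumed to be accepted by the endpoint).
    encoded = base64.b64encode(image_bytes).decode("ascii")
    data_url = f"data:{mime_type};base64,{encoded}"
    return [
        {
            "role": "user",
            "content": [
                {"type": "input_text", "text": prompt},
                {"type": "input_image", "image_url": data_url},
            ],
        }
    ]


# Hypothetical usage, assuming `client` is already configured:
# with open("photo.jpg", "rb") as f:
#     response = client.responses.create(
#         model="gpt-4.1",
#         input=build_image_input("What is in this image?", f.read()),
#     )
```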

How do I use built-in tools?

Enable built-in tools by including them in the tools array:
response = client.responses.create(
    model="gpt-4.1",
    input="What was a positive news story from today?",
    tools=[
        {"type": "web_search_preview"}
    ]
)
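Web search answers usually carry source citations. As a rough sketch, the helper below collects them from a response that has been converted to a plain dict (for example with response.model_dump() in the OpenAI Python SDK). It assumes the OpenAI output layout, where message items hold output_text parts with url_citation annotations; other providers may return different fields. extract_url_citations is a hypothetical name.

```python
def extract_url_citations(response_dict: dict) -> list:
    """Collect (title, url) pairs from url_citation annotations in a response dict."""
    citations = []
    for item in response_dict.get("output") or []:
        if item.get("type") != "message":
            continue  # skip web_search_call and other non-message items
        for part in item.get("content") or []:
            for annotation in part.get("annotations") or []:
                if annotation.get("type") == "url_citation":
                    citations.append((annotation.get("title"), annotation.get("url")))
    return citations
```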

How do I maintain conversation state?

Use previous_response_id to create multi-turn conversations:
# First message
response1 = client.responses.create(
    model="gpt-4.1",
    input="Hello, my name is Alice."
)

# Follow-up message
response2 = client.responses.create(
    model="gpt-4.1",
    input="What's my name?",
    previous_response_id=response1.id
)
Alternatively, use the conversation parameter to manage conversation state automatically.
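For longer conversations, threading previous_response_id by hand gets repetitive. A minimal sketch of a wrapper that remembers the last response ID is shown here; ConversationThread is a hypothetical class, and it assumes client exposes responses.create as in the examples above.

```python
class ConversationThread:
    """Tracks the most recent response ID so each turn continues the previous one."""

    def __init__(self, client, model: str):
        self.client = client
        self.model = model
        self.last_response_id = None

    def send(self, text: str):
        kwargs = {"model": self.model, "input": text}
        # Only the first turn omits previous_response_id.
        if self.last_response_id is not None:
            kwargs["previous_response_id"] = self.last_response_id
        response = self.client.responses.create(**kwargs)
        self.last_response_id = response.id
        return response


# Hypothetical usage, assuming `client` is already configured:
# thread = ConversationThread(client, "gpt-4.1")
# thread.send("Hello, my name is Alice.")
# reply = thread.send("What's my name?")
```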

How do I use function calling?

Define custom functions and include them in the tools array:
response = client.responses.create(
    model="gpt-4.1",
    input="What is the weather like in Boston today?",
    tools=[
        {
            "type": "function",
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"]
                    }
                },
                "required": ["location", "unit"]
            }
        }
    ],
    tool_choice="auto"
)
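The request above only declares the function; when the model decides to call it, the response output contains a function_call item that your code must execute. The following is a sketch of that step, assuming the OpenAI item layout (name, call_id, and arguments as a JSON string). The get_current_weather body is a stand-in, and run_function_calls and LOCAL_FUNCTIONS are hypothetical names.

```python
import json


def get_current_weather(location: str, unit: str) -> dict:
    """Stand-in implementation; a real one would query a weather service."""
    return {"location": location, "unit": unit, "temperature": None}


# Maps function names declared in `tools` to local callables.
LOCAL_FUNCTIONS = {"get_current_weather": get_current_weather}


def run_function_calls(output_items) -> list:
    """Execute each function_call item and build matching function_call_output items."""
    results = []
    for item in output_items:
        if getattr(item, "type", None) != "function_call":
            continue  # ignore messages, reasoning items, and so on
        handler = LOCAL_FUNCTIONS[item.name]
        arguments = json.loads(item.arguments)  # arguments arrive as a JSON string
        results.append(
            {
                "type": "function_call_output",
                "call_id": item.call_id,
                "output": json.dumps(handler(**arguments)),
            }
        )
    return results
```

The returned items can then be sent as the input of a follow-up request, with previous_response_id set to the ID of the response that contained the calls, so the model can turn the function result into a final answer.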

How do I use reasoning models (o3, gpt-5)?

For reasoning models, you can configure reasoning effort:
response = client.responses.create(
    model="o3-mini",
    input="How much wood would a woodchuck chuck?",
    reasoning={
        "effort": "high"  # Options: low, medium, high; gpt-5 models also accept minimal
    }
)
Higher effort values result in more thorough reasoning but may take longer and use more tokens.
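To see how much of a response's output budget went to reasoning, you can inspect the usage object. This sketch assumes the OpenAI usage layout (output_tokens and output_tokens_details.reasoning_tokens); other providers may omit these fields, in which case the helper returns 0.0. reasoning_token_share is a hypothetical helper name.

```python
def reasoning_token_share(usage) -> float:
    """Return the fraction of output tokens spent on reasoning (0.0 if unavailable)."""
    details = getattr(usage, "output_tokens_details", None)
    reasoning = getattr(details, "reasoning_tokens", 0) or 0
    output = getattr(usage, "output_tokens", 0) or 0
    return reasoning / output if output else 0.0


# Hypothetical usage after a request:
# share = reasoning_token_share(response.usage)
```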

How do I enable streaming?

Set stream to true (stream=True in the Python SDK) to enable Server-Sent Events streaming:
stream = client.responses.create(
    model="gpt-4.1",
    input="Tell me a story",
    stream=True
)

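for event in stream:
    # Each item is a typed event; print only the text deltas as they arrive
    if event.type == "response.output_text.delta":
        print(event.delta, end="", flush=True)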
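If you need the complete text once the stream finishes, you can accumulate the text deltas instead of printing them. This is a small sketch that assumes OpenAI-style stream events, where text arrives in events of type response.output_text.delta with a delta attribute; collect_output_text is a hypothetical helper.

```python
def collect_output_text(events) -> str:
    """Concatenate the text deltas from an iterable of stream events."""
    parts = []
    for event in events:
        # Non-text events (created, completed, tool calls, ...) are skipped.
        if getattr(event, "type", None) == "response.output_text.delta":
            parts.append(event.delta)
    return "".join(parts)


# Hypothetical usage, assuming `client` is already configured:
# stream = client.responses.create(model="gpt-4.1", input="Tell me a story", stream=True)
# story = collect_output_text(stream)
```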