Gemini Content Generation

Overview

The Gemini API endpoint allows you to generate content using Google’s Gemini models in their native format. This endpoint follows the official Gemini API specification, making it easy to integrate with existing Gemini-compatible code.

Latest News: gemini-3-pro-preview is now supported!

Quick Start

To use the endpoint, replace the base URL and API key in the official SDK or in your raw HTTP requests:

Base URL: https://api.wisgate.ai (replace generativelanguage.googleapis.com)
API Key: Replace $GEMINI_API_KEY with your $WISDOM_GATE_KEY

Basic Example: Text Generation

curl "https://api.wisgate.ai/v1beta/models/gemini-3-pro-preview:generateContent" \
  -H "x-goog-api-key: $WISDOM_GATE_KEY" \
  -H 'Content-Type: application/json' \
  -X POST \
  -d '{
    "contents": [
      {
        "parts": [
          {
            "text": "How does AI work?"
          }
        ]
      }
    ]
  }'
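The same request can be sent from Python. The following is a minimal sketch that uses only the standard library; the helper names build_request and generate are illustrative and not part of any SDK, and the sketch assumes the key is exported as the WISDOM_GATE_KEY environment variable.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.wisgate.ai"


def build_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build a generateContent request (hypothetical helper, not an SDK call)."""
    url = f"{BASE_URL}/v1beta/models/{model}:generateContent"
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"x-goog-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str) -> dict:
    """Send the request and return the decoded JSON response."""
    req = build_request(model, prompt, os.environ["WISDOM_GATE_KEY"])
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)
```

Calling generate("gemini-3-pro-preview", "How does AI work?") should return the parsed response body; the generated text is normally found under candidates[0].content.parts.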

Important Notes

Model Differences: Different Gemini model versions may support different request parameters and return different response fields. We strongly recommend consulting the model catalog for complete parameter lists and usage instructions for each model.

Response Pass-through Principle: WisGate typically does not modify model responses beyond any required format conversion, so the content you receive is consistent with what the original Gemini API provider returns.

Streaming Support: WisGate supports Server-Sent Events (SSE) for streaming responses. Use the streamGenerateContent method with the ?alt=sse query parameter to enable real-time streaming, which is useful for chat applications.

Auto-Generated Documentation: The request parameters and response format are automatically generated from the OpenAPI specification. All parameters, their types, descriptions, defaults, and examples are pulled directly from openapi.json. Scroll down to see the interactive API reference.

FAQ

How to control Thinking?

Gemini models support a “thinking” process to improve reasoning capabilities. The control method depends on the model version. For details, please refer to the official documentation: Gemini Thinking Guide

Gemini 3 Series (e.g., `gemini-3-pro-preview`)

Use the thinkingLevel parameter to control thinking intensity ("LOW" or "HIGH").

curl "https://api.wisgate.ai/v1beta/models/gemini-3-pro-preview:generateContent" \
  -H "x-goog-api-key: $WISDOM_GATE_KEY" \
  -H 'Content-Type: application/json' \
  -X POST \
  -d '{
    "contents": [{ "parts": [{ "text": "Explain quantum physics simply." }] }],
    "generationConfig": {
      "thinkingConfig": {
        "thinkingLevel": "LOW"
      }
    }
  }'

Gemini 2.5 Series (e.g., `gemini-2.5-pro`)

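Use the thinkingBudget parameter to set the token budget for thinking.

0: Disable thinking (not every model allows this; in the official Gemini API, gemini-2.5-pro cannot turn thinking off).
-1: Dynamic thinking (the model decides automatically; this is the default).
> 0: Set a specific token limit (e.g., 1024).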

curl "https://api.wisgate.ai/v1beta/models/gemini-2.5-pro:generateContent" \
  -H "x-goog-api-key: $WISDOM_GATE_KEY" \
  -H 'Content-Type: application/json' \
  -X POST \
  -d '{
    "contents": [{ "parts": [{ "text": "Solve this logic puzzle." }] }],
    "generationConfig": {
      "thinkingConfig": {
        "thinkingBudget": 1024
      }
    }
  }'
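Because the two series use different parameters, client code that targets both may want to pick the right one per model. The sketch below is one possible way to do that; thinking_config is a hypothetical helper, and dispatching on the model-name prefix is an assumption made for illustration, not a rule from the API.

```python
def thinking_config(model: str, *, level: str = "LOW", budget: int = -1) -> dict:
    """Return a generationConfig fragment for the given model (illustrative helper)."""
    if model.startswith("gemini-3"):
        # Gemini 3 series: thinkingLevel is "LOW" or "HIGH".
        if level not in ("LOW", "HIGH"):
            raise ValueError("thinkingLevel must be 'LOW' or 'HIGH'")
        return {"thinkingConfig": {"thinkingLevel": level}}
    if model.startswith("gemini-2.5"):
        # Gemini 2.5 series: thinkingBudget is -1 (dynamic), 0, or a token count.
        if budget < -1:
            raise ValueError("thinkingBudget must be -1, 0, or a positive token count")
        return {"thinkingConfig": {"thinkingBudget": budget}}
    # Unknown series: send no thinking settings rather than guess.
    return {}
```

The returned dictionary can be placed under the generationConfig key of the request body, next to contents.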

How to use Streaming Responses?

Streaming responses allow you to receive results incrementally as the model generates content, reducing perceived latency. For details, please refer to the official documentation: Gemini Text Generation - Streaming Responses

Note: The URL must point to streamGenerateContent, and adding ?alt=sse is recommended so that the response uses the Server-Sent Events format.

curl "https://api.wisgate.ai/v1beta/models/gemini-2.5-flash:streamGenerateContent?alt=sse" \
  -H "x-goog-api-key: $WISDOM_GATE_KEY" \
  -H 'Content-Type: application/json' \
  --no-buffer \
  -d '{
    "contents": [
      {
        "parts": [
          {
            "text": "Explain how AI works"
          }
        ]
      }
    ]
  }'
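On the client side, the SSE stream has to be split into events and decoded. The following sketch shows one way to pull the text out of the stream; iter_stream_text is a hypothetical helper, and it assumes each event arrives as a single "data:" line holding one JSON response chunk.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from the SSE lines of a streamGenerateContent response."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data fields
        chunk = json.loads(line[len("data:"):].strip())
        for candidate in chunk.get("candidates", []):
            for part in candidate.get("content", {}).get("parts", []):
                if "text" in part:
                    yield part["text"]
```

Any iterable of decoded lines works as input. With the requests library, for example, the lines could come from a streamed response via response.iter_lines(decode_unicode=True).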

How to maintain conversation context?

Include the complete conversation history in the contents array:

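import os

import requests

# Full conversation so far, oldest turn first; roles alternate between "user" and "model".
conversation = [
    {
        "role": "user",
        "parts": [{"text": "What is Python?"}]
    },
    {
        "role": "model",
        "parts": [{"text": "Python is a programming language..."}]
    },
    {
        "role": "user",
        "parts": [{"text": "What are its advantages?"}]
    }
]

response = requests.post(
    "https://api.wisgate.ai/v1beta/models/gemini-3-pro-preview:generateContent",
    headers={
        # Read the key from the environment instead of hard-coding it.
        "x-goog-api-key": os.environ["WISDOM_GATE_KEY"],
        "Content-Type": "application/json"
    },
    json={"contents": conversation},
    timeout=60,
)
response.raise_for_status()  # surface HTTP errors before reading the body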
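To continue the conversation, the model's reply has to be appended to the history before the next user turn is sent. A minimal sketch of that step follows; extend_history is a hypothetical helper, and it assumes the reply is found under candidates[0].content in the parsed response.

```python
def extend_history(history: list, response: dict, next_user_message: str) -> list:
    """Return a new history with the model reply and the next user turn appended."""
    # Reuse the content object as returned so no part of the reply is lost.
    model_turn = dict(response["candidates"][0]["content"])
    model_turn.setdefault("role", "model")
    user_turn = {"role": "user", "parts": [{"text": next_user_message}]}
    return history + [model_turn, user_turn]
```

The returned list can be sent as the contents array of the next request. The original history list is left unmodified, which makes it easier to retry a failed turn.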

What does finishReason mean?

The finishReason field in the response indicates why the model stopped generating:

Value	Meaning
`STOP`	Natural completion
`MAX_TOKENS`	Reached maxOutputTokens limit
`SAFETY`	Triggered safety filter
`RECITATION`	Detected recitation of training data
`OTHER`	Other reason
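In application code it is usually worth checking this field before using the output. The sketch below is one possible policy, not a prescribed one: extract_text and GenerationStopped are hypothetical names, and the sketch assumes a non-streaming response whose first candidate carries finishReason.

```python
class GenerationStopped(Exception):
    """Raised when the model stopped for a reason other than STOP or MAX_TOKENS."""


def extract_text(response: dict) -> tuple:
    """Return (text, truncated) for the first candidate of a parsed response."""
    candidate = response["candidates"][0]
    reason = candidate.get("finishReason", "OTHER")
    parts = candidate.get("content", {}).get("parts", [])
    text = "".join(part.get("text", "") for part in parts)
    if reason == "STOP":
        return text, False
    if reason == "MAX_TOKENS":
        # Output is usable but was cut off at the maxOutputTokens limit.
        return text, True
    # SAFETY, RECITATION, OTHER: treat as a failure and report the reason.
    raise GenerationStopped(reason)
```

Treating MAX_TOKENS as a soft condition lets the caller decide whether to retry with a higher limit or accept the truncated text.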

How to control costs?

Use maxOutputTokens in generationConfig to limit output length
Choose appropriate models (e.g., gemini-2.5-flash is more economical than gemini-3-pro-preview)
Streamline prompts and avoid redundant context
Monitor token consumption in the usageMetadata field of responses
Use thinking budgets wisely for reasoning models to control reasoning token usage
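Two of these points can be sketched in code: capping output length and reading token counts back from the response. The helper names are hypothetical, and the usageMetadata field names (promptTokenCount, candidatesTokenCount, thoughtsTokenCount, totalTokenCount) follow the official Gemini API; that they arrive unchanged through WisGate is an assumption based on the pass-through principle.

```python
def capped_request(prompt: str, max_output_tokens: int) -> dict:
    """Build a request body whose output length is limited by maxOutputTokens."""
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {"maxOutputTokens": max_output_tokens},
    }


def summarize_usage(response: dict) -> dict:
    """Collect token counts from usageMetadata; missing fields count as 0."""
    usage = response.get("usageMetadata", {})
    return {
        "prompt": usage.get("promptTokenCount", 0),
        "output": usage.get("candidatesTokenCount", 0),
        "thinking": usage.get("thoughtsTokenCount", 0),
        "total": usage.get("totalTokenCount", 0),
    }
```

Logging the summary for each call gives a simple per-request record that can be aggregated to see which prompts or models consume the most tokens.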

How to use multimodal input (text and images)?

Gemini supports multimodal input through the parts array. You can include both text and images in a single request:

data = {
    "contents": [
        {
            "parts": [
                {"text": "What is in this image?"},
                {
                    "inlineData": {
                        "mimeType": "image/jpeg",
                        "data": "base64_encoded_image_data_here"
                    }
                }
            ]
        }
    ]
}
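The data field expects the image bytes encoded as base64 text. A minimal sketch of building such a request body follows; image_part and multimodal_request are hypothetical helper names, and the caller is assumed to know the image's MIME type.

```python
import base64


def image_part(image_bytes: bytes, mime_type: str) -> dict:
    """Wrap raw image bytes as an inlineData part with base64-encoded data."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {"inlineData": {"mimeType": mime_type, "data": encoded}}


def multimodal_request(prompt: str, image_bytes: bytes, mime_type: str) -> dict:
    """Build a request body with one text part followed by one image part."""
    return {
        "contents": [
            {"parts": [{"text": prompt}, image_part(image_bytes, mime_type)]}
        ]
    }
```

For a file on disk, the bytes can be read with pathlib.Path("photo.jpg").read_bytes() and passed together with "image/jpeg"; the file name here is only an example.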
