minimax/minimax-m3

$0.2400 / M input tokens · $0.9600 / M output tokens

MiniMax M3 is a MiniMax model route on OurToken for developers who need hosted API access for coding, agent workflows, long-context tasks, multimodal evaluation, and production assistants.

24H Status Monitor

99.3% uptime · Available · 2026-07-23 14:34:21 UTC

Pricing

Pay-per-use

No upfront costs, pay only for what you use

40% of official price

Token type	Official price	OurToken price
Input	$0.60 / M tokens	$0.2400 / M tokens
Output	$2.40 / M tokens	$0.9600 / M tokens
Cached input	$0.12 / M tokens	$0.0480 / M tokens
Cache writes	$0 / M tokens	$0 / M tokens

API Usage

API Access Guide

Base URL	https://api.ourtoken.ai/v1
API Endpoint	chat/completions
Full URL	https://api.ourtoken.ai/v1/chat/completions
Model ID	minimax-m3

Code examples

Use the OurToken API endpoint for this model. The example below uses a direct HTTP request to the recommended endpoint for the model family.

curl https://api.ourtoken.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "minimax-m3",
    "messages": [
      {
        "role": "user",
        "content": "Hello!"
      }
    ],
    "max_tokens": 256
  }'
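
The same request can be sent from Python with only the standard library. This is a minimal sketch that mirrors the curl example above. The helper names `build_request` and `send` are illustrative and are not part of any OurToken SDK.

```python
import json
import urllib.request

BASE_URL = "https://api.ourtoken.ai/v1"


def build_request(api_key: str, prompt: str, max_tokens: int = 256) -> urllib.request.Request:
    """Build the same POST request as the curl example (hypothetical helper)."""
    body = {
        "model": "minimax-m3",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> str:
    """Send the request and return the first choice's text content."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        data = json.load(response)
    return data["choices"][0]["message"]["content"]
```

Calling `send(build_request("YOUR_API_KEY", "Hello!"))` performs the network request. Building and sending are kept separate so the payload can be inspected or logged before any tokens are spent.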

Chat Completions API Reference

Create a chat response with the OpenAI Chat Completions-compatible endpoint. Use https://api.ourtoken.ai/v1 as the SDK Base URL and POST /chat/completions as the endpoint.

Authorization

Content-Type	application/json
Authorization	Bearer YOUR_API_KEY

Request Body

Field	Type	Required	Description
model	string	Required	Model ID to call.
messages	array<object>	Required	Conversation messages sent to the model.
max_tokens	integer	Optional	Maximum number of output tokens.
temperature	number	Optional	Sampling temperature.
top_p	number	Optional	Nucleus sampling parameter.
stream	boolean	Optional	Whether to return a streaming response.
stream_options	object	Optional	Additional options for streaming responses.
tools	array<object>	Optional	Tools available to the model.
tool_choice	string | object	Optional	Controls how the model selects tools.
response_format	object	Optional	Controls structured output, such as JSON object responses.
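
As a rough illustration of how the optional fields combine with the required ones, the sketch below assembles a request body and rejects field names that are not in the table above. `build_payload` is a hypothetical helper. The `{"type": "json_object"}` value follows the OpenAI convention and is assumed, not confirmed, to be what OurToken expects.

```python
import json
from typing import Any

# Optional field names copied from the request body table.
OPTIONAL_FIELDS = (
    "max_tokens", "temperature", "top_p", "stream",
    "stream_options", "tools", "tool_choice", "response_format",
)


def build_payload(messages: list[dict[str, Any]], **options: Any) -> dict[str, Any]:
    """Combine the required fields with any optional ones that were set."""
    unknown = set(options) - set(OPTIONAL_FIELDS)
    if unknown:
        raise ValueError(f"Unsupported fields: {sorted(unknown)}")
    payload: dict[str, Any] = {"model": "minimax-m3", "messages": messages}
    # Leave out options that were passed as None so server defaults apply.
    payload.update({key: value for key, value in options.items() if value is not None})
    return payload


payload = build_payload(
    [{"role": "user", "content": "Reply with a JSON object containing a greeting."}],
    max_tokens=256,
    temperature=0.2,
    top_p=None,
    response_format={"type": "json_object"},  # assumed OpenAI-style value
)
print(json.dumps(payload, indent=2))
```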

Response Body

Field	Type	Required	Description
id	string	Required	Unique chat completion identifier.
object	"chat.completion"	Required	Object type returned by the Chat Completions API.
created	integer	Required	Unix timestamp when the response was created.
model	string	Required	Model that produced the response.
choices	array<object>	Required	Candidate responses returned by the model.
choices[].message.role	string	Required	Role of the returned chat message.
choices[].message.content	string	Optional	Text content in the returned chat message.
choices[].finish_reason	string	Optional	Reason generation stopped.
usage	object	Optional	Token usage information for the chat completion.
usage.prompt_tokens	integer	Optional	Input token count.
usage.completion_tokens	integer	Optional	Output token count.
usage.total_tokens	integer	Optional	Total token count.
usage.prompt_tokens_details	object	Optional	Breakdown of input token usage.
usage.prompt_tokens_details.cached_tokens	integer	Optional	Tokens served from cache.
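
Because several response fields are optional, client code should read them defensively. The sketch below pulls out the fields from the table above. The `sample` response is invented for illustration and does not come from a real call.

```python
from typing import Any


def summarize_response(response: dict[str, Any]) -> dict[str, Any]:
    """Extract text, finish reason, and token counts, tolerating missing optional fields."""
    choice = response["choices"][0]
    usage = response.get("usage") or {}
    details = usage.get("prompt_tokens_details") or {}
    return {
        "content": choice["message"].get("content"),
        "finish_reason": choice.get("finish_reason"),
        "prompt_tokens": usage.get("prompt_tokens", 0),
        "completion_tokens": usage.get("completion_tokens", 0),
        "cached_tokens": details.get("cached_tokens", 0),
    }


# Hypothetical response body; every value here is made up.
sample = {
    "id": "chatcmpl-example",
    "object": "chat.completion",
    "created": 1700000000,
    "model": "minimax-m3",
    "choices": [
        {"message": {"role": "assistant", "content": "Hi there."}, "finish_reason": "stop"}
    ],
    "usage": {
        "prompt_tokens": 12,
        "completion_tokens": 4,
        "total_tokens": 16,
        "prompt_tokens_details": {"cached_tokens": 8},
    },
}
summary = summarize_response(sample)
```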

Model Introduction

MiniMax minimax-m3

MiniMax M3 gives teams a MiniMax route for application work where long context, coding workflows, multimodal prompts, and predictable API pricing matter. Use the MiniMax M3 API when you want to test MiniMax workflows through the OurToken unified API while keeping model IDs, usage logs, cache costs, and price review in one dashboard.

Why It Looks Great

40% of the official MiniMax M3 reference price for input, output, and cache read tokens.
OpenAI-compatible API setup through the same OurToken endpoint used by other supported models.
Cache write is listed as $0, while standard input, output, and cache read tokens remain paid categories.
Useful for evaluating coding agents, long-context tasks, tool-use experiments, and multimodal workflows without separate provider-specific integration.
Dashboard logs and usage visibility help teams review request cost after launch.

Key Features

Model ID: minimax-m3
Input price: $0.2400 per 1M tokens on OurToken
Output price: $0.9600 per 1M tokens on OurToken
Cache read price: $0.0480 per 1M tokens on OurToken
Cache write price: $0 per 1M tokens on OurToken
Provider: MiniMax

Specifications

Provider	MiniMax
Model Type	Large Language Model (LLM)
Model ID	minimax-m3
Context Length	1M tokens
Max Output	512K tokens
Input	Text, image, video
Output	Text
OurToken Input Price	$0.2400 / 1M tokens
OurToken Output Price	$0.9600 / 1M tokens
OurToken Cache Read Price	$0.0480 / 1M tokens
OurToken Cache Write Price	$0 / 1M tokens
Official Input Reference	$0.60 / 1M tokens
Official Output Reference	$2.40 / 1M tokens
Official Cache Read Reference	$0.12 / 1M tokens

MiniMax M3 API Features

Use the MiniMax M3 API for unified access, transparent pricing, cache-cost visibility, multimodal evaluation, and production agent workflows.

Unified Access

Call the MiniMax M3 API through OurToken's unified endpoint while keeping model access, API key management, and usage history in one place. Use minimax-m3 as the model ID and reuse OpenAI-compatible request patterns for coding agents, chat systems, and long-context workflows.

Pricing Clarity

Review MiniMax M3 pricing before rollout. OurToken lists $0.2400 per 1M input tokens and $0.9600 per 1M output tokens, so teams can estimate costs for coding, multimodal prompts, and high-volume assistant workloads.

Cache Costs

Explicit cache pricing lets you separate cache behavior from normal prompt spend. MiniMax M3 cache read is listed at $0.0480 per 1M tokens on OurToken, while cache write is $0. Cache write is the only free token category; input, output, and cache read tokens are still billed.
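
A per-request estimate can be sketched from the listed prices. This assumes that `cached_tokens` is counted inside `prompt_tokens` (the OpenAI convention) and that those tokens are billed at the cache read rate instead of the input rate. Confirm both assumptions against your dashboard logs before relying on the numbers.

```python
from decimal import Decimal

# USD per 1M tokens, taken from the OurToken prices listed on this page.
PRICE_PER_MILLION = {
    "input": Decimal("0.2400"),
    "output": Decimal("0.9600"),
    "cache_read": Decimal("0.0480"),
}


def estimate_cost(prompt_tokens: int, completion_tokens: int, cached_tokens: int = 0) -> Decimal:
    """Estimate the USD cost of one request, splitting cached from uncached input."""
    # Assumption: cached_tokens is a subset of prompt_tokens.
    uncached = prompt_tokens - cached_tokens
    if uncached < 0:
        raise ValueError("cached_tokens cannot exceed prompt_tokens")
    total = (
        uncached * PRICE_PER_MILLION["input"]
        + cached_tokens * PRICE_PER_MILLION["cache_read"]
        + completion_tokens * PRICE_PER_MILLION["output"]
    )
    return total / Decimal(1_000_000)


# 1M prompt tokens (400K of them cached) plus 100K output tokens.
example = estimate_cost(1_000_000, 100_000, cached_tokens=400_000)
```

With these inputs the estimate is $0.2592: $0.144 for uncached input, $0.0192 for cache reads, and $0.096 for output.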

Agent Workflows

Evaluate MiniMax M3 for coding agents, tool-use experiments, and multi-step automation. Competitor material highlights agentic capability and OpenCode-style workflows, but teams should validate MiniMax M3 behavior in OpenCode with their own prompts and acceptance criteria.
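
A tool-use experiment might start from a payload like the one sketched below. The `run_tests` tool is hypothetical. The tool schema and the `tool_calls` response field follow the OpenAI function-calling convention; this page lists `tools` and `tool_choice` but does not document their shape, so treat both as assumptions to verify.

```python
import json
from typing import Any

# Hypothetical tool definition in the OpenAI function-calling shape (assumed).
run_tests_tool = {
    "type": "function",
    "function": {
        "name": "run_tests",
        "description": "Run the project's test suite and return the failures.",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string", "description": "Directory to test."}},
            "required": ["path"],
        },
    },
}

payload = {
    "model": "minimax-m3",
    "messages": [{"role": "user", "content": "Run the tests under ./src and summarize failures."}],
    "tools": [run_tests_tool],
    "tool_choice": "auto",
    "max_tokens": 512,
}


def extract_tool_calls(response: dict[str, Any]) -> list[tuple[str, dict[str, Any]]]:
    """Return (tool name, parsed arguments) pairs, or an empty list if none were requested."""
    message = response["choices"][0]["message"]
    calls = message.get("tool_calls") or []
    return [
        (call["function"]["name"], json.loads(call["function"]["arguments"]))
        for call in calls
    ]
```

An agent loop would execute each returned call, append the result as a follow-up message, and send the conversation back to the model.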

Multimodal Context

Evaluate long-context and multimodal tasks such as document review, repository analysis, visual inputs, video-grounded prompts, and multi-turn collaboration. Competitor material describes a 1M-token context and native multimodality; test both claims against your own production-like workload.
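
For image prompts, OpenAI-compatible endpoints commonly accept a list of content parts in place of a plain string. The sketch below builds such a message. Whether OurToken accepts this exact shape for minimax-m3 is not stated on this page, so treat it as an assumption to verify; video input is not shown.

```python
import base64
from typing import Any


def image_message(prompt: str, image_bytes: bytes, mime_type: str = "image/png") -> dict[str, Any]:
    """Build a user message with text plus an inline image (OpenAI-style content parts, assumed)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": f"data:{mime_type};base64,{encoded}"}},
        ],
    }
```

The returned dictionary can be placed in the `messages` array of a request body with `"model": "minimax-m3"`.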

Deployment Choices

Searches such as MiniMax M3 HuggingFace and MiniMax M3 Ollama point toward model discovery or local-style deployment. OurToken instead focuses on hosted access: managed API keys, usage logs, pricing visibility, and simple integration.

How to Use MiniMax M3 API on OurToken

Create an API key, copy the minimax-m3 model ID, call the unified endpoint, compare pricing, test your workflows, and monitor real usage.

Create API Key

Create an OurToken API key from the dashboard and store it in a secure server-side environment variable. This gives your backend access to MiniMax M3 API while keeping credentials out of client code, notebooks, and public repositories.
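
One way to keep the key out of source code is to read it from the environment at startup and fail fast when it is missing. The variable name `OURTOKEN_API_KEY` is an illustrative choice, not a name OurToken defines.

```python
import os


def load_api_key(env_var: str = "OURTOKEN_API_KEY") -> str:
    """Read the API key from a server-side environment variable (name is an assumption)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} in the server environment before calling the API.")
    return key


def auth_headers(api_key: str) -> dict[str, str]:
    """Build the two headers that the API reference lists."""
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
```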

Copy Model ID

Use minimax-m3 as the model value in your request body. Keeping the exact MiniMax M3 model ID in configuration helps developers avoid naming mistakes when comparing MiniMax API routes across local tests, staging traffic, and production deployments.

Call Endpoint

Send requests to the OurToken unified API endpoint with your API key, model ID, and prompt payload. Existing OpenAI-compatible chat request patterns can usually be reused after changing the base URL, credential, and model value.

Compare Pricing

Compare MiniMax M3 API pricing before rollout: OurToken lists $0.2400 input, $0.9600 output, and $0.0480 cache read per 1M tokens. Cache write is $0; it is the only free token category, so track it separately from paid input, output, and cache read tokens.
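
To put the two price lists side by side, a projected monthly volume can be priced under each. The volume figures below are hypothetical. The prices are the official reference and OurToken values listed on this page.

```python
from decimal import Decimal

# USD per 1M tokens.
OFFICIAL = {"input": Decimal("0.60"), "output": Decimal("2.40"), "cache_read": Decimal("0.12")}
OURTOKEN = {"input": Decimal("0.2400"), "output": Decimal("0.9600"), "cache_read": Decimal("0.0480")}


def monthly_cost(prices: dict[str, Decimal], tokens: dict[str, int]) -> Decimal:
    """Price a token volume, category by category, under one price list."""
    return sum(prices[name] * tokens[name] for name in prices) / Decimal(1_000_000)


# Hypothetical monthly volume in tokens per category.
volume = {"input": 500_000_000, "output": 50_000_000, "cache_read": 200_000_000}

official_total = monthly_cost(OFFICIAL, volume)
ourtoken_total = monthly_cost(OURTOKEN, volume)
ratio = ourtoken_total / official_total
```

For this volume the official reference total is $444.00 and the OurToken total is $177.60, a ratio of 0.4. The ratio is the same for any mix of tokens because each OurToken price is 40% of its reference price.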

Test Workflows

Run representative coding, agent, long-context, image, and video-input prompts before scaling. If you are evaluating OpenCode workflows with MiniMax M3, compare tool behavior, response quality, latency, and token usage against your production acceptance criteria.
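
A small harness can record latency and token usage per prompt. The sketch below takes the model call as a function argument, so it can be run against a stub first and a real client later. The `run_eval` name and the result fields are illustrative.

```python
import time
from typing import Any, Callable


def run_eval(
    prompts: list[str],
    call_model: Callable[[str], dict[str, Any]],
) -> list[dict[str, Any]]:
    """Call the model once per prompt and record latency, finish reason, and token usage."""
    results = []
    for prompt in prompts:
        started = time.perf_counter()
        response = call_model(prompt)
        latency = time.perf_counter() - started
        usage = response.get("usage") or {}
        results.append(
            {
                "prompt": prompt,
                "latency_s": round(latency, 3),
                "finish_reason": response["choices"][0].get("finish_reason"),
                "total_tokens": usage.get("total_tokens", 0),
            }
        )
    return results
```

Response quality still needs human or scripted review against your acceptance criteria; this harness only captures the measurable parts.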

Monitor Cost

After launch, review history logs for request count, input tokens, output tokens, cache read tokens, and spend. Real usage data helps teams compare MiniMax M3 price against actual traffic instead of relying only on benchmark pages or provider listings.
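
If usage logs are exported as records, totals and spend can be recomputed independently of the dashboard. The record field names below (`input_tokens`, `output_tokens`, `cache_read_tokens`) are hypothetical, and the sketch assumes that `input_tokens` excludes cache reads. Adjust both to match the actual export format.

```python
from collections import Counter
from decimal import Decimal
from typing import Any

# USD per 1M tokens, keyed by hypothetical log field names.
PRICES = {
    "input_tokens": Decimal("0.2400"),
    "output_tokens": Decimal("0.9600"),
    "cache_read_tokens": Decimal("0.0480"),
}


def summarize_logs(records: list[dict[str, Any]]) -> tuple[dict[str, int], Decimal]:
    """Total requests and tokens per category, plus the implied spend in USD."""
    totals: Counter = Counter()
    for record in records:
        totals["requests"] += 1
        for field in PRICES:
            totals[field] += record.get(field, 0)
    spend = sum(totals[field] * PRICES[field] for field in PRICES) / Decimal(1_000_000)
    return dict(totals), spend
```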

MiniMax M3 API FAQ

Answers about MiniMax M3 API pricing, MiniMax API access, free cache-write usage, model setup, OpenCode workflows, and deployment comparisons.

What is MiniMax M3 API?

MiniMax M3 API is the MiniMax M3 model route available through OurToken for teams that want hosted access to a coding, agent, long-context, and multimodal model. Developers can use the minimax-m3 model ID, create an OurToken API key, and call it through the same unified API flow used by other supported models.

What is MiniMax M3 API pricing on OurToken?

MiniMax M3 API pricing on OurToken is $0.2400 per 1M input tokens and $0.9600 per 1M output tokens. The catalog also lists cache read at $0.0480 per 1M tokens and cache write at $0, so teams can estimate MiniMax M3 pricing by token category before scaling traffic.

Is MiniMax M3 free on OurToken?

Not as a whole. On OurToken, "free" applies only to cache write pricing, which is listed as $0 per 1M tokens. Standard input, output, and cache read tokens are still paid categories. Treat free cache write as a useful cost detail, not as a free MiniMax M3 API plan.

What is the MiniMax M3 price for cache read and cache write?

Cache read costs $0.0480 per 1M tokens on OurToken, compared with the official $0.12 reference. Cache write is listed as $0 per 1M tokens. Because these rates differ from normal input and output rates, track cache usage separately from input and output spend.

Can I use OpenCode workflows with MiniMax M3 through OurToken?

You can evaluate OpenCode workflows with MiniMax M3 by calling minimax-m3 through OurToken and testing coding, tool-use, and multi-step agent prompts. Competitor material highlights agentic and coding capability, but production decisions should compare repository prompts, latency, tool behavior, and output quality against your own acceptance criteria.

How do MiniMax M3 HuggingFace, MiniMax M3 Ollama, and OurToken API access compare?

MiniMax M3 HuggingFace and MiniMax M3 Ollama are common searches for model discovery or local-style deployment paths. OurToken focuses on hosted MiniMax M3 API access with API keys, usage logs, and pricing visibility. Choose based on whether your priority is managed API integration, local experimentation, or model research.