# Connect Curl & HTTP to Unsloth

Unsloth exposes three OpenAI/Anthropic-compatible wire formats at the same base URL, on the port Unsloth was started on. All of them take an `Authorization: Bearer sk-unsloth-…` header and return either JSON or SSE, depending on whether you set `stream`.

This page groups the recipes by endpoint (`/v1/chat/completions`, `/v1/messages`, `/v1/responses`, `/v1/models`) and ends with a shared section on Unsloth's built-in **server-side tools**, which work across all the chat endpoints.

{% hint style="info" %}
If you're not sure what URL / key / model name to use, read the API overview first. It walks you through starting Unsloth, loading a model, and creating an `sk-unsloth-…` key.
{% endhint %}

### 🔑 Authentication

Every request needs an `Authorization` header:

```
Authorization: Bearer sk-unsloth-xxxxxxxxxxxx
```

To avoid pasting the key into every command, export it once and reference the env var:

```bash
export UNSLOTH_STUDIO_AUTH_TOKEN=sk-unsloth-xxxxxxxxxxxx
```

The snippets below inline the key as `sk-unsloth-xxxxxxxxxxxx` for clarity. In practice, substitute `$UNSLOTH_STUDIO_AUTH_TOKEN`.

### 📋 List loaded models

```bash
curl http://localhost:8888/v1/models \
  -H "Authorization: Bearer sk-unsloth-xxxxxxxxxxxx"
```

Response:

```json
{
  "object": "list",
  "data": [
    {"id": "unsloth/gemma-3-27b-it-GGUF", "object": "model", "owned_by": "local"}
  ]
}
```
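To pull just the model IDs out of a `/v1/models` response in a script, a small filter along these lines may help. This is a sketch: `list_model_ids` is a hypothetical helper name, it assumes `"id"` appears only on model entries, and the sample JSON is the response shown in this section rather than live server output.

```bash
# Print every "id" value found in a /v1/models response read from stdin.
# Assumes IDs contain no escaped quotes (true for the sample response).
list_model_ids() {
  grep -o '"id": *"[^"]*"' | sed 's/.*"\([^"]*\)"$/\1/'
}

# Sample response, used here in place of a live server.
sample='{"object": "list", "data": [{"id": "unsloth/gemma-3-27b-it-GGUF", "object": "model", "owned_by": "local"}]}'
printf '%s\n' "$sample" | list_model_ids
```

Against a running server, pipe the `curl` command for `/v1/models` into `list_model_ids` instead of the sample. If `jq` is installed, `jq -r '.data[].id'` is a more robust equivalent.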

Use the `id` field whenever a request needs a `"model"` value (or when a client like opencode asks for a **Model ID**).

### 💬 Chat Completions (`/v1/chat/completions`)

The OpenAI Chat Completions dialect. It has the broadest compatibility surface and works with the OpenAI SDK, opencode, Cursor, Continue, Cline, Open WebUI, SillyTavern, and most OpenAI-compatible tools.

#### Basic request

```bash
curl http://localhost:8888/v1/chat/completions \
  -H "Authorization: Bearer sk-unsloth-xxxxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "default",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```

#### Streaming

Add `"stream": true` and the response switches to Server-Sent Events (`text/event-stream`). Tell `curl` to flush as bytes arrive with `--no-buffer` (`-N`):

```bash
curl -N http://localhost:8888/v1/chat/completions \
  -H "Authorization: Bearer sk-unsloth-xxxxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen-local",
    "messages": [{"role": "user", "content": "Write a haiku about locally-run LLMs."}],
    "stream": true
  }'
```

Each event in the response is a line of the form `data: {"choices":[{"delta":{"content":"..."}}]}`, and the stream ends with `data: [DONE]`.
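To post-process the stream in a script, the SSE framing has to come off first. The sketch below keeps only the JSON payload of each `data:` line and stops at the `[DONE]` sentinel. `sse_payloads` is a hypothetical helper name, and the sample stream is hand-written to match the shape described here, not captured server output.

```bash
# Read an SSE stream on stdin and print one JSON payload per line,
# stopping at the `data: [DONE]` sentinel. Non-data lines are skipped.
sse_payloads() {
  cr=$(printf '\r')
  while IFS= read -r line; do
    line=${line%"$cr"}   # tolerate CRLF line endings
    case "$line" in
      "data: [DONE]") break ;;
      "data: "*) printf '%s\n' "${line#data: }" ;;
    esac
  done
}

# Hand-written sample in the documented shape.
sample_stream='data: {"choices":[{"delta":{"content":"Hel"}}]}

data: {"choices":[{"delta":{"content":"lo"}}]}

data: [DONE]'
printf '%s\n' "$sample_stream" | sse_payloads
```

With a live server, replace the `printf` with the `curl -N` command. Assuming `jq` is installed, piping the result into `jq -rj '.choices[0].delta.content // empty'` should then print only the generated text.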

#### Images (vision)

Attach an image as an `image_url` content part in the user message. The URL can be HTTPS or a base64 `data:` URI:

```bash
# Embed a local file as base64 (trimmed for brevity)
IMG=$(base64 -w 0 test.jpg)
cat > /tmp/request.json <
```
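A complete request body for the base64 case might look like the sketch below. It follows the OpenAI `image_url` content-part shape; the helper names (`b64_oneline`, `build_vision_request`), the prompt text, and the JPEG MIME type are assumptions for illustration, not taken from Unsloth's documentation.

```bash
# Portable single-line base64: GNU base64 wraps lines by default,
# so strip newlines instead of relying on the GNU-only `-w 0` flag.
b64_oneline() {
  base64 < "$1" | tr -d '\n'
}

# Write a chat-completions body with one text part and one image part.
# $1 = image path, $2 = output JSON path. JPEG is assumed for the MIME type.
build_vision_request() {
  img=$(b64_oneline "$1")
  cat > "$2" <<EOF
{
  "model": "default",
  "messages": [{
    "role": "user",
    "content": [
      {"type": "text", "text": "Describe this image."},
      {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,${img}"}}
    ]
  }]
}
EOF
}
```

After `build_vision_request test.jpg /tmp/request.json`, the file can be sent to `/v1/chat/completions` with `-d @/tmp/request.json`.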

#### Function calling (OpenAI tools)

Pass OpenAI-style `tools` and (optionally) `tool_choice`. Your client runs each tool call and returns the result on the next turn.

```bash
curl http://localhost:8888/v1/chat/completions \
  -H "Authorization: Bearer sk-unsloth-xxxxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "default",
    "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get current weather for a city.",
        "parameters": {
          "type": "object",
          "properties": {"city": {"type": "string"}},
          "required": ["city"]
        }
      }
    }],
    "tool_choice": "required"
  }' | jq '.choices[0].message.tool_calls, .usage'
```

### 📨 Anthropic Messages (`/v1/messages`)

Unsloth's Anthropic-compatible dialect, used by Claude Code, the Anthropic SDK, OpenClaw, and any client that speaks the Messages API.

#### Basic request

```bash
curl http://localhost:8888/v1/messages \
  -H "Authorization: Bearer sk-unsloth-xxxxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "default",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```

{% hint style="warning" %}
`max_tokens` is required on `/v1/messages` (it's optional on `/v1/chat/completions`).
{% endhint %}

#### Streaming

```bash
curl -N http://localhost:8888/v1/messages \
  -H "Authorization: Bearer sk-unsloth-xxxxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen-local",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Explain LoRA in two sentences."}],
    "stream": true
  }'
```

Events follow Anthropic's SSE shape: `message_start`, `content_block_start`, `content_block_delta`, `content_block_stop`, `message_delta`, `message_stop`, plus Unsloth's custom `tool_result` event for server-side tool output.

#### Images (vision)

Anthropic-style image content uses a `source` block with base64 data:

```bash
IMG=$(base64 -w 0 test.jpg)
cat > /tmp/request.json <
```
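For the `source` block form, a full request body might be assembled as in the sketch below. The block layout follows Anthropic's Messages API image format; `build_messages_image_request`, the prompt text, and the JPEG media type are illustrative assumptions, not taken from Unsloth's documentation.

```bash
# Write a /v1/messages body with one image block and one text block.
# $1 = image path, $2 = output JSON path. JPEG is assumed for media_type.
build_messages_image_request() {
  # Strip newlines because GNU base64 wraps its output by default.
  data=$(base64 < "$1" | tr -d '\n')
  cat > "$2" <<EOF
{
  "model": "qwen-local",
  "max_tokens": 1024,
  "messages": [{
    "role": "user",
    "content": [
      {"type": "image", "source": {"type": "base64", "media_type": "image/jpeg", "data": "${data}"}},
      {"type": "text", "text": "Describe this image."}
    ]
  }]
}
EOF
}
```

After `build_messages_image_request test.jpg /tmp/request.json`, the file can be sent to `/v1/messages` with `-d @/tmp/request.json`.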

#### Tool calling (Anthropic tools)

```bash
curl http://localhost:8888/v1/messages \
  -H "Authorization: Bearer sk-unsloth-xxxxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen-local",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "What is the weather in Tokyo?"}],
    "tools": [
      {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "input_schema": {
          "type": "object",
          "properties": {
            "city": {"type": "string"}
          },
          "required": ["city"]
        }
      }
    ],
    "tool_choice": {"type": "auto"}
  }'
```

`tool_choice` values map to the OpenAI dialect as follows:

| Anthropic                   | OpenAI                                          |
| --------------------------- | ----------------------------------------------- |
| `auto`                      | `auto`                                          |
| `any`                       | `required`                                      |
| `{type: "tool", name: "x"}` | `{type: "function", function: {name: "x"}}`     |
| `none`                      | `none`                                          |
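The `tool_choice` mapping can be written out as a small lookup. This is only an illustration of the correspondence described here, not Unsloth's implementation; `map_tool_choice` is a hypothetical helper name.

```bash
# Translate an Anthropic tool_choice into its OpenAI Chat Completions form.
# $1 = Anthropic type (auto | any | tool | none), $2 = tool name (for "tool").
map_tool_choice() {
  case "$1" in
    auto) printf '"auto"\n' ;;
    any)  printf '"required"\n' ;;
    none) printf '"none"\n' ;;
    tool) printf '{"type": "function", "function": {"name": "%s"}}\n' "$2" ;;
    *)    echo "unknown tool_choice type: $1" >&2; return 1 ;;
  esac
}

map_tool_choice any
map_tool_choice tool get_weather
```

The helper prints the JSON value that would go in the OpenAI request's `tool_choice` field, and returns a non-zero status for any type outside the four listed.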

### 🧬 Responses (`/v1/responses`)

Unsloth also speaks the newer **OpenAI Responses API**, the protocol Codex and other recent OpenAI clients have moved to.

```bash
curl http://localhost:8888/v1/responses \
  -H "Authorization: Bearer sk-unsloth-xxxxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen-local",
    "input": "Write a one-sentence greeting."
  }'
```

Streaming works the same way as Chat Completions: add `"stream": true` to the body and pass `-N` to `curl`.

### 🧰 Unsloth server-side tools (shorthand)

In addition to client-side function calling, Unsloth can execute **Python**, **bash**, and **web search** server-side and stream the results back as custom `tool_result` events. This is the feature that makes Unsloth feel like a "real" agent out of the box, with no round-tripping of tool calls through your client.

Opt in by passing these extra fields to **either** `/v1/chat/completions` or `/v1/messages`:

| Field             | Type      | Notes                                                                     |
| ----------------- | --------- | ------------------------------------------------------------------------- |
| `enable_thinking` | `boolean` | Set to `false` to disable thinking. Defaults to `true`.                   |
| `enable_tools`    | `boolean` | Set to `true` to enable server-side tool execution.                       |
| `enabled_tools`   | `array`   | Which tools the model can call. Supports `python`, `bash`, `web_search`.  |
| `session_id`      | `string`  | Optional. Persists tool state (e.g. Python kernel) across calls.          |

#### Thinking mode

Thinking mode is enabled by default.

```bash
curl http://localhost:8888/v1/chat/completions \
  -H "Authorization: Bearer sk-unsloth-xxxxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "default",
    "messages": [{"role": "user", "content": "summarize quantum tunneling in one sentence"}],
    "stream": false
  }'
```

The model will think before providing an answer.

To disable thinking, pass `"enable_thinking": false` in the request body. The model will then answer without thinking first.
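As a sketch, the thinking-mode request with thinking switched off could be kept in a file and posted with a small wrapper. The file path and the `send_chat` helper are hypothetical names chosen for this example; the wrapper reads the key from the `UNSLOTH_STUDIO_AUTH_TOKEN` variable exported in the Authentication section.

```bash
# Write the request body to a file so the JSON needs no shell escaping.
cat > /tmp/no-thinking.json <<'EOF'
{
  "model": "default",
  "messages": [{"role": "user", "content": "summarize quantum tunneling in one sentence"}],
  "stream": false,
  "enable_thinking": false
}
EOF

# Hypothetical helper: POST a JSON file to /v1/chat/completions.
# $1 = path to the request body.
send_chat() {
  curl http://localhost:8888/v1/chat/completions \
    -H "Authorization: Bearer $UNSLOTH_STUDIO_AUTH_TOKEN" \
    -H "Content-Type: application/json" \
    -d @"$1"
}
```

With Unsloth running, `send_chat /tmp/no-thinking.json` sends the request.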

#### Python execution

```bash
curl http://localhost:8888/v1/chat/completions \
  -H "Authorization: Bearer sk-unsloth-xxxxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen-local",
    "messages": [{"role": "user", "content": "What is 123 * 456? Use code to compute it."}],
    "stream": false,
    "enable_tools": true,
    "enabled_tools": ["python"],
    "session_id": "my-session"
  }'
```

#### Web search + Python (streaming)

```bash
curl -N http://localhost:8888/v1/chat/completions \
  -H "Authorization: Bearer sk-unsloth-xxxxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen-local",
    "messages": [{"role": "user", "content": "Search for Python 3.13 features"}],
    "stream": true,
    "enable_tools": true,
    "enabled_tools": ["web_search", "python"],
    "session_id": "my-session"
  }'
```

#### On `/v1/messages`

The same shorthand works against the Anthropic Messages endpoint:

```bash
curl -N http://localhost:8888/v1/messages \
  -H "Authorization: Bearer sk-unsloth-xxxxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen-local",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Search for Python 3.13 features"}],
    "stream": true,
    "enable_tools": true,
    "enabled_tools": ["web_search", "python"],
    "session_id": "my-session"
  }'
```

Unsloth streams its own `tool_result` SSE events in addition to the standard Anthropic / OpenAI event types. The model sees each tool's output on its next turn.

### ❔ Troubleshooting

**`401 Unauthorized`** - The `Authorization` header is missing or the key is wrong. Double-check: `Authorization: Bearer sk-unsloth-…`.

**`curl` hangs on streaming requests** - Add `-N` (same as `--no-buffer`). Without it, `curl` buffers the SSE stream and you see nothing until the end.

**Base64 encoding differs between OSes** - Linux's `base64` wraps lines by default; macOS / BSD does not. Use `base64 -w 0` on Linux, `base64` on macOS, or pipe the output through `tr -d '\n'`.

**JSON escaping in shells** - Reading the body from a file (`-d @file.json`) is cleaner than an inline string once the body gets complex. Example: `curl ... -d @body.json`.

**`max_tokens` errors on `/v1/messages`** - The Anthropic dialect requires it. Add `"max_tokens": 1024` (or whatever limit you want).

For endpoint-level issues (model not loading, connection dropped, wrong port), see the API overview page.