# Connect Curl & HTTP to Unsloth

Unsloth exposes three OpenAI/Anthropic-compatible chat dialects (plus a model-listing endpoint) at the same base URL, on the port Unsloth started on. All of them take an `Authorization: Bearer sk-unsloth-…` header and return either JSON or SSE, depending on whether you set `stream`.

This page groups the recipes by endpoint (`/v1/chat/completions`, `/v1/messages`, `/v1/responses`, `/v1/models`) and ends with a shared section on Unsloth's built-in **server-side tools**, which work across all the chat endpoints.

{% hint style="info" %}
If you're not sure what URL / key / model name to use, read the API overview first. It walks you through starting Unsloth, loading a model, and creating an `sk-unsloth-…` key.
{% endhint %}

### 🔑 Authentication

Every request needs an `Authorization` header:

```
Authorization: Bearer sk-unsloth-xxxxxxxxxxxx
```

To keep keys out of your shell history, export the key once and reference the env var:

```bash
export UNSLOTH_STUDIO_AUTH_TOKEN=sk-unsloth-xxxxxxxxxxxx
```

The snippets below inline the key as `sk-unsloth-xxxxxxxxxxxx` for clarity. In practice, substitute `$UNSLOTH_STUDIO_AUTH_TOKEN`.

### 📋 List loaded models

```bash
curl http://localhost:8888/v1/models \
  -H "Authorization: Bearer sk-unsloth-xxxxxxxxxxxx"
```

Response:

```json
{
  "object": "list",
  "data": [
    {"id": "unsloth/gemma-3-27b-it-GGUF", "object": "model", "owned_by": "local"}
  ]
}
```

<figure><img src="/files/P7uayFTXkv36B22HRBye" alt=""><figcaption></figcaption></figure>

Use the `id` field whenever a request needs a `"model"` value (or when a client like opencode asks for a **Model ID**).
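
If you're scripting against the endpoint, picking a usable model id out of this response is a one-liner. A minimal sketch that just parses the JSON shown above:

```python
import json

# The /v1/models response body from the example above.
response_body = '''
{
  "object": "list",
  "data": [
    {"id": "unsloth/gemma-3-27b-it-GGUF", "object": "model", "owned_by": "local"}
  ]
}
'''

models = json.loads(response_body)
model_ids = [m["id"] for m in models["data"]]
print(model_ids[0])  # use this as the "model" value in requests
```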

### 💬 Chat Completions (`/v1/chat/completions`)

The OpenAI Chat Completions dialect, and the broadest compatibility surface: it works with the OpenAI SDK, opencode, Cursor, Continue, Cline, Open WebUI, SillyTavern, and most other OpenAI-compatible tools.

#### Basic request

```bash
curl http://localhost:8888/v1/chat/completions \
  -H "Authorization: Bearer sk-unsloth-xxxxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "default",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```

<figure><img src="/files/tXGzWlxxh76cnIYXtaxX" alt=""><figcaption></figcaption></figure>

#### Streaming

Add `"stream": true` and the response switches to Server-Sent Events (`text/event-stream`). Tell `curl` to flush as bytes arrive with `--no-buffer` (`-N`):

```bash
curl -N http://localhost:8888/v1/chat/completions \
  -H "Authorization: Bearer sk-unsloth-xxxxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen-local",
    "messages": [{"role": "user", "content": "Write a haiku about locally-run LLMs."}],
    "stream": true
  }'
```

Each line of the response looks like `data: {"choices":[{"delta":{"content":"..."}}]}`, ending with `data: [DONE]`.

<figure><img src="/files/nrGfv5aISlkPONDyOlJP" alt=""><figcaption></figcaption></figure>
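
If you consume the stream from code instead of curl, the per-line format above is easy to reassemble. A sketch that assumes only the `data: {...}` / `data: [DONE]` framing shown above:

```python
import json

def collect_stream_text(lines):
    """Join the delta fragments from OpenAI-style SSE lines into one string."""
    parts = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank lines and other event fields
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        parts.append(delta.get("content") or "")  # role-only deltas have no content
    return "".join(parts)

# Two captured lines plus the terminator:
sample = [
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    'data: [DONE]',
]
print(collect_stream_text(sample))  # Hello
```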

#### Images (vision)

Attach an image as an `image_url` content part in the user message. The URL can be HTTPS or a base64 `data:` URI:

```bash
# Embed a local file as base64 (trimmed for brevity)
IMG=$(base64 -w 0 test.jpg)

cat > /tmp/request.json <<EOF
{
  "model":"default",
  "messages":[{"role":"user","content":[
    {"type":"text","text":"Describe the image."},
    {"type":"image_url","image_url":{"url":"data:image/jpeg;base64,$IMG"}}
  ]}],
  "max_tokens":200,
  "stream":false
}
EOF

curl http://localhost:8888/v1/chat/completions \
  -H "Authorization: Bearer sk-unsloth-xxxxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d @/tmp/request.json
```

{% hint style="info" %}
The loaded model must be multimodal. If you load a text-only model the request succeeds structurally but the model won't process the image.
{% endhint %}

<figure><img src="/files/KhxG79l5Gdmtc61xTCsZ" alt=""><figcaption></figcaption></figure>
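
The same data-URI construction can be done from code, sidestepping the OS-specific `base64` wrapping issue entirely. A sketch (the stand-in bytes below are not a real image):

```python
import base64

def to_data_uri(image_bytes, media_type="image/jpeg"):
    """Encode raw image bytes as a base64 data: URI for the image_url field."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{media_type};base64,{b64}"

# Usage with a real file:
#   part = {"type": "image_url",
#           "image_url": {"url": to_data_uri(open("test.jpg", "rb").read())}}
uri = to_data_uri(b"\xff\xd8\xff")  # tiny stand-in for JPEG bytes
print(uri)  # data:image/jpeg;base64,/9j/
```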

#### Function calling (OpenAI tools)

Pass OpenAI-style `tools` and (optionally) `tool_choice`. Your client runs each tool call and returns the result on the next turn.

```bash
curl http://localhost:8888/v1/chat/completions \
  -H "Authorization: Bearer sk-unsloth-xxxxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model":"default",
    "messages":[{"role":"user","content":"What is the weather in Paris?"}],
    "tools":[{
      "type":"function",
      "function":{
        "name":"get_weather",
        "description":"Get current weather for a city.",
        "parameters":{
          "type":"object",
          "properties":{"city":{"type":"string"}},
          "required":["city"]
        }
      }
    }],
    "tool_choice":"required"
  }' | jq '.choices[0].message.tool_calls, .usage'
```

<figure><img src="/files/7deYtiUtBsoPEJR3NGl0" alt=""><figcaption></figcaption></figure>
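
Because the model only *requests* the call, your client is responsible for executing it and sending the result back. A hedged sketch of that loop (the message shape is the standard Chat Completions `tool_calls` format; the `get_weather` implementation is a stand-in):

```python
import json

def run_tool_calls(message, registry):
    """Execute each requested tool call and build the follow-up 'tool' messages."""
    follow_ups = []
    for call in message.get("tool_calls", []):
        fn = call["function"]
        args = json.loads(fn["arguments"])       # arguments arrive as a JSON string
        result = registry[fn["name"]](**args)    # dispatch to your local function
        follow_ups.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(result),
        })
    return follow_ups

# A captured assistant message requesting one call:
assistant_msg = {
    "role": "assistant",
    "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {"name": "get_weather", "arguments": "{\"city\": \"Paris\"}"},
    }],
}
registry = {"get_weather": lambda city: {"city": city, "temp_c": 18}}
print(run_tool_calls(assistant_msg, registry))
```

Append `assistant_msg` and these follow-up messages to `messages` and send the next request; the model then answers using the tool output.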

### 📨 Anthropic Messages (`/v1/messages`)

Unsloth's Anthropic-compatible dialect used by Claude Code, the Anthropic SDK, OpenClaw, and any client that speaks the Messages API.

#### Basic request

```bash
curl http://localhost:8888/v1/messages \
  -H "Authorization: Bearer sk-unsloth-xxxxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "default",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```

{% hint style="warning" %}
`max_tokens` is required on `/v1/messages` (it's optional on `/v1/chat/completions`).
{% endhint %}

<figure><img src="/files/GmnVpXYdI4662b3mahJH" alt=""><figcaption></figcaption></figure>

#### Streaming

```bash
curl -N http://localhost:8888/v1/messages \
  -H "Authorization: Bearer sk-unsloth-xxxxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen-local",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Explain LoRA in two sentences."}],
    "stream": true
  }'
```

Events follow Anthropic's SSE shape: `message_start`, `content_block_start`, `content_block_delta`, `content_block_stop`, `message_delta`, `message_stop`, plus Unsloth's custom `tool_result` event for server-side tool output.
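
From code, accumulating the visible text means watching only the `content_block_delta` events. A sketch under those event names (Anthropic-style `text_delta` payloads assumed):

```python
import json

def collect_messages_text(events):
    """events: (event_name, json_data_str) pairs decoded from the SSE stream."""
    parts = []
    for name, data in events:
        if name != "content_block_delta":
            continue  # ignore lifecycle events like message_start / message_stop
        delta = json.loads(data)["delta"]
        if delta.get("type") == "text_delta":
            parts.append(delta["text"])
    return "".join(parts)

sample = [
    ("message_start", "{}"),
    ("content_block_delta", '{"delta":{"type":"text_delta","text":"LoRA adds "}}'),
    ("content_block_delta", '{"delta":{"type":"text_delta","text":"low-rank adapters."}}'),
    ("message_stop", "{}"),
]
print(collect_messages_text(sample))  # LoRA adds low-rank adapters.
```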

#### Images (vision)

Anthropic-style image content uses a `source` block with base64 data:

```bash
IMG=$(base64 -w 0 test.jpg)

cat > /tmp/request.json <<EOF
{
  "messages":[{"role":"user","content":[
    {"type":"text","text":"Describe the image."},
    {"type":"image","source":{"type":"base64","media_type":"image/jpeg","data":"$IMG"}}
  ]}],
  "max_tokens":200,
  "stream":true
}
EOF

curl -s http://localhost:8888/v1/messages \
  -H "Authorization: Bearer sk-unsloth-xxxxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d @/tmp/request.json
```

<figure><img src="/files/EOkRmhPmvpTnsjgZJoob" alt=""><figcaption></figcaption></figure>

#### Tool calling (Anthropic tools)

```bash
curl http://localhost:8888/v1/messages \
  -H "Authorization: Bearer sk-unsloth-xxxxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen-local",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "What is the weather in Tokyo?"}],
    "tools": [
      {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "input_schema": {
          "type": "object",
          "properties": {
            "city": {"type": "string"}
          },
          "required": ["city"]
        }
      }
    ],
    "tool_choice": {"type": "auto"}
  }'
```

`tool_choice` values map to the OpenAI dialect as follows:

| Anthropic                       | OpenAI equivalent                                 |
| ------------------------------- | ------------------------------------------------- |
| `auto`                          | `auto`                                            |
| `any`                           | `required`                                        |
| `{"type": "tool", "name": "x"}` | `{"type": "function", "function": {"name": "x"}}` |
| `none`                          | `none`                                            |
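
The mapping above can be written as a small translation function (a sketch mirroring the mapping, not Unsloth's internal code):

```python
def anthropic_to_openai_tool_choice(choice):
    """Translate an Anthropic tool_choice value to its OpenAI equivalent."""
    kind = choice["type"]
    if kind == "auto":
        return "auto"
    if kind == "any":
        return "required"       # "any" forces some tool; OpenAI calls this "required"
    if kind == "none":
        return "none"
    if kind == "tool":
        return {"type": "function", "function": {"name": choice["name"]}}
    raise ValueError(f"unknown tool_choice type: {kind}")

print(anthropic_to_openai_tool_choice({"type": "any"}))  # required
print(anthropic_to_openai_tool_choice({"type": "tool", "name": "x"}))
```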

<figure><img src="/files/Jqk1nRiGMBuFKMaUKxER" alt=""><figcaption></figcaption></figure>

### 🧬 Responses (`/v1/responses`)

Unsloth also speaks the newer **OpenAI Responses API**, the protocol Codex and other recent OpenAI clients have moved to.

```bash
curl http://localhost:8888/v1/responses \
  -H "Authorization: Bearer sk-unsloth-xxxxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen-local",
    "input": "Write a one-sentence greeting."
  }'
```

<figure><img src="/files/mFRVvtAHk7kC9o1irYie" alt=""><figcaption></figcaption></figure>

<figure><img src="/files/6oK1YDaFoNZauk9x6aaC" alt=""><figcaption></figcaption></figure>
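
The non-streaming response nests the generated text inside an `output` array of items. Extracting it from code looks roughly like this (field names follow the standard OpenAI Responses shape, assumed rather than Unsloth-specific):

```python
def response_text(resp):
    """Concatenate the output_text parts of a Responses API result."""
    parts = []
    for item in resp.get("output", []):
        if item.get("type") != "message":
            continue  # skip non-message items (e.g. tool calls)
        for part in item.get("content", []):
            if part.get("type") == "output_text":
                parts.append(part["text"])
    return "".join(parts)

sample = {"output": [{"type": "message", "content": [
    {"type": "output_text", "text": "Hello from Unsloth!"}]}]}
print(response_text(sample))  # Hello from Unsloth!
```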

Streaming works the same way as on Chat Completions: add `"stream": true` and run `curl` with `-N`.

### 🧰 Unsloth server-side tools (shorthand)

In addition to client-side function calling, Unsloth can execute **Python**, **bash**, and **web search** server-side and stream the results back as custom `tool_result` events. This is the feature that makes Unsloth feel like a "real" agent out of the box: no round-tripping tool calls through your client.

Opt in by passing these extra fields to **either** `/v1/chat/completions` or `/v1/messages`:

| Field             | Type            | Notes                                                                    |
| ----------------- | --------------- | ------------------------------------------------------------------------ |
| `enable_thinking` | `boolean`       | `false` to disable thinking. `true` by default                           |
| `enable_tools`    | `boolean`       | `true` to enable server-side tool execution.                             |
| `enabled_tools`   | `array<string>` | Which tools the model can call. Supports `python`, `bash`, `web_search`. |
| `session_id`      | `string`        | Optional. Persists tool state (e.g. Python kernel) across calls.         |
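
These fields simply ride along in the normal request body. A small helper that merges them into a Chat Completions payload (a sketch using the field names from the table above):

```python
def with_server_tools(payload, tools, session_id=None, thinking=True):
    """Return a copy of a chat payload with Unsloth's server-side tool fields set."""
    extra = {
        "enable_thinking": thinking,
        "enable_tools": True,
        "enabled_tools": list(tools),  # e.g. ["python", "bash", "web_search"]
    }
    if session_id is not None:
        extra["session_id"] = session_id  # persists tool state across calls
    return {**payload, **extra}

body = with_server_tools(
    {"model": "default", "messages": [{"role": "user", "content": "2+2?"}]},
    tools=["python"], session_id="my-session",
)
print(body["enabled_tools"])  # ['python']
```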

#### Thinking mode

Thinking mode is enabled by default.

```bash
curl http://localhost:8888/v1/chat/completions \
  -H "Authorization: Bearer sk-unsloth-xxxxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "default",
    "messages": [{"role": "user", "content": "summarize quantum tunneling in one sentence"}],
    "stream": false
  }'
```

The model will think before providing an answer.

<figure><img src="/files/0SmaIDjxf7sTCwX4jvYx" alt=""><figcaption></figcaption></figure>

To disable thinking, pass `"enable_thinking": false` in your request. The model will answer without thinking first.

<figure><img src="/files/bi4WMn7nSESZFm33q0jZ" alt=""><figcaption></figcaption></figure>

#### Python execution

```bash
curl http://localhost:8888/v1/chat/completions \
  -H "Authorization: Bearer sk-unsloth-xxxxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen-local",
    "messages": [{"role": "user", "content": "What is 123 * 456? Use code to compute it."}],
    "stream": false,
    "enable_tools": true,
    "enabled_tools": ["python"],
    "session_id": "my-session"
  }'
```

<figure><img src="/files/RMVz0lh2r9f0KR7N0qpp" alt=""><figcaption></figcaption></figure>

#### Web search + Python (streaming)

```bash
curl http://localhost:8888/v1/chat/completions \
  -H "Authorization: Bearer sk-unsloth-xxxxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen-local",
    "messages": [{"role": "user", "content": "Search for Python 3.13 features"}],
    "stream": true,
    "enable_tools": true,
    "enabled_tools": ["web_search", "python"],
    "session_id": "my-session"
  }'
```

<figure><img src="/files/28ynYadAnwRUOIqKCvl9" alt=""><figcaption></figcaption></figure>

#### On `/v1/messages`

The same shorthand works against the Anthropic Messages endpoint:

```bash
curl http://localhost:8888/v1/messages \
  -H "Authorization: Bearer sk-unsloth-xxxxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen-local",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Search for Python 3.13 features"}],
    "stream": true,
    "enable_tools": true,
    "enabled_tools": ["web_search", "python"],
    "session_id": "my-session"
  }'
```

<figure><img src="/files/BCgLj2illiC4RKJquR0h" alt=""><figcaption></figcaption></figure>

Unsloth streams its own `tool_result` SSE events in addition to the standard Anthropic / OpenAI event types; the model sees each tool's output on its next turn.

### ❔ Troubleshooting

**`401 Unauthorized`** **-** The `Authorization` header is missing or the key is wrong. Double-check: `Authorization: Bearer sk-unsloth-…`.

**`curl` hangs on streaming requests -** Add `-N` (same as `--no-buffer`). Without it, `curl` buffers the SSE stream and you see nothing until the end.

**Base64 encoding differs between OSes** **-** Linux's `base64` wraps lines by default; macOS / BSD's does not. Use `base64 -w 0` on Linux, plain `base64` on macOS, or pipe the output through `tr -d '\n'`.
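
A portable alternative is to do the encoding in Python, whose `base64` module never wraps lines regardless of OS:

```python
import base64
from pathlib import Path

def b64_file(path):
    """Base64-encode a file with no line wrapping, on any OS."""
    return base64.b64encode(Path(path).read_bytes()).decode("ascii")

# The same encoding applied to raw bytes:
print(base64.b64encode(b"hello").decode("ascii"))  # aGVsbG8=
```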

**JSON escaping in shells** **-** Writing the body to a file (e.g. via a heredoc) and passing it with `-d @body.json` is cleaner than inline single-quoted strings once the body gets complex.

**`max_tokens` errors on `/v1/messages`** **-** The Anthropic dialect requires it. Add `"max_tokens": 1024` (or whatever limit you want).

For endpoint-level issues (model not loading, connection dropped, wrong port) see the API overview page.

