# Unsloth Inference

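Unsloth natively supports 2x faster inference. For an inference-only notebook, click [here](https://colab.research.google.com/drive/1aqlNQi7MMJbynFDyOQteD2t0yVfjb9Zh?usp=sharing).

All QLoRA, LoRA, and non-LoRA inference paths are 2x faster. This requires no code changes and no new dependencies.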

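<pre class="language-python"><code class="lang-python"><strong>from unsloth import FastLanguageModel
</strong>from transformers import TextStreamer

# max_seq_length, dtype and load_in_4bit should match the values used for training
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "lora_model", # the model you used for training
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)
FastLanguageModel.for_inference(model) # enable native 2x faster inference

# Example prompt only; assumes a CUDA device is available
inputs = tokenizer(["Your prompt here"], return_tensors = "pt").to("cuda")
text_streamer = TextStreamer(tokenizer) # prints tokens as they are generated
_ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 64)
</code></pre>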

#### NotImplementedError: A UTF-8 locale is required. Got ANSI

Sometimes [this error appears](https://github.com/googlecolab/colabtools/issues/3409) when you execute a cell. To fix it, run the following in a new cell:

```python
import locale
locale.getpreferredencoding = lambda: "UTF-8"
```
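The override above takes no arguments, while the standard-library `locale.getpreferredencoding` accepts an optional `do_setlocale` parameter. As a minimal sketch, assuming some code in your session calls it with that argument, a slightly more tolerant variant could look like this (the helper name `_utf8_preferred_encoding` is hypothetical):

```python
import locale

def _utf8_preferred_encoding(do_setlocale=True):
    # Accept the optional argument so both call styles keep working,
    # and always report UTF-8 regardless of the environment's locale.
    return "UTF-8"

# Replace the standard-library function for the rest of this session.
locale.getpreferredencoding = _utf8_preferred_encoding

print(locale.getpreferredencoding())       # called without arguments
print(locale.getpreferredencoding(False))  # called with do_setlocale
```

Like the one-line version, this only changes what Python reports for the current process; it does not change the locale of the underlying system.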
