Unsloth Inference
Learn how to run your fine-tuned model with Unsloth's faster inference.
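from unsloth import FastLanguageModel
from transformers import TextStreamer

# Reuse the settings from your training run; the values below are placeholders.
max_seq_length = 2048
dtype = None          # None lets Unsloth pick the dtype automatically
load_in_4bit = True

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "lora_model", # The model you used for training
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)
FastLanguageModel.for_inference(model) # Enable native 2x faster inference

# Placeholder prompt; format it the same way as your training data. Assumes a CUDA GPU.
inputs = tokenizer(["Your prompt here"], return_tensors = "pt").to("cuda")
text_streamer = TextStreamer(tokenizer)
_ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 64)

If a cell fails with the error below, run the two lines that follow it in a new cell:
NotImplementedError: A UTF-8 locale is required. Got ANSI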
import locale
locale.getpreferredencoding = lambda: "UTF-8"
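
The override above replaces the function unconditionally. As a minimal sketch, the same workaround can be applied only when the reported encoding is not already UTF-8. The helper name `ensure_utf8_locale` is hypothetical and not part of Unsloth. Unlike the one-liner above, the replacement lambda also accepts the optional `do_setlocale` argument that `locale.getpreferredencoding` takes.

```python
import locale


def ensure_utf8_locale() -> bool:
    """Patch locale.getpreferredencoding to report UTF-8 when it does not already.

    Returns True if the patch was applied, False if UTF-8 was already reported.
    Hypothetical helper for illustration; not part of Unsloth.
    """
    current = locale.getpreferredencoding()
    # Normalize spellings such as "utf-8", "UTF-8", and "utf8" before comparing.
    if current.upper().replace("-", "") == "UTF8":
        return False
    # Same monkey-patch as the one-liner above, but tolerant of the optional argument.
    locale.getpreferredencoding = lambda do_setlocale=True: "UTF-8"
    return True


patched = ensure_utf8_locale()
```

Run this in a new cell before the cell that raised the error. Calling it a second time is harmless, because the patched function already reports UTF-8.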

