Fine-tuning LLMs with NVIDIA DGX Spark and Unsloth

A tutorial on fine-tuning OpenAI gpt-oss and training it with reinforcement learning (RL) on NVIDIA DGX Spark.

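Unsloth enables local fine-tuning of large language models with up to 200B parameters on the NVIDIA DGX™ Spark. With 128 GB of unified memory, you can train large models such as gpt-oss-120b, and run or deploy inference directly on the DGX Spark.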

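As shown at OpenAI DevDay, gpt-oss-20b was trained with RL and Unsloth on a DGX Spark to win the game 2048 automatically. On the DGX Spark, you can train with Unsloth either in a Docker container or in a virtual environment.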

In this tutorial, we install Unsloth on the DGX Spark and then train gpt-oss-20b with RL using an Unsloth notebook. For reference, gpt-oss-120b will use about 68 GB of unified memory.
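Before starting a long run, it can help to confirm that the machine reports enough memory for the model you plan to train. The sketch below assumes that `MemTotal` in `/proc/meminfo` is a reasonable proxy for the unified memory available to the system. The function name `has_enough_memory` is hypothetical and not part of Unsloth.

```shell
# Hypothetical helper (not part of Unsloth): succeed if total system memory
# is at least the requested number of GB.
#   $1 = required memory in GB
#   $2 = meminfo file to read (defaults to /proc/meminfo)
has_enough_memory() {
    local required_gb="$1"
    local meminfo="${2:-/proc/meminfo}"
    local total_kb

    # MemTotal is reported in kB, e.g. "MemTotal:  125000000 kB"
    total_kb=$(awk '/^MemTotal:/ {print $2}' "$meminfo") || return 2
    [ -n "$total_kb" ] || return 2

    # Assumption: 1 GB is treated as 1024 * 1024 kB
    [ "$total_kb" -ge $(( required_gb * 1024 * 1024 )) ]
}
```

For example, `has_enough_memory 68 || echo "Less than 68 GB reported"` would flag a machine that reports less memory than the gpt-oss-120b figure quoted above.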

After 1,000 steps and 4 hours of RL training, the gpt-oss model substantially outperforms the original model at 2048, and longer training improves results further.

⚡ Step-by-Step Tutorial

Start with the Unsloth Docker image for DGX Spark

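First, build the Docker image from the DGX Spark Dockerfile (Dockerfile_DGX_Spark in the unslothai/notebooks repository). To download it, run the following commands in a terminal on the DGX Spark: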

sudo apt update && sudo apt install -y wget
wget -O Dockerfile "https://raw.githubusercontent.com/unslothai/notebooks/main/Dockerfile_DGX_Spark"

Then, build the training Docker image from the saved Dockerfile:

docker build -f Dockerfile -t unsloth-dgx-spark .

For reference, the full DGX Spark Dockerfile is:

FROM nvcr.io/nvidia/pytorch:25.09-py3

# Set CUDA environment variables
ENV CUDA_HOME=/usr/local/cuda-13.0/
ENV CUDA_PATH=$CUDA_HOME
ENV PATH=$CUDA_HOME/bin:$PATH
ENV LD_LIBRARY_PATH=$CUDA_HOME/lib64:$LD_LIBRARY_PATH
ENV C_INCLUDE_PATH=$CUDA_HOME/include:$C_INCLUDE_PATH
ENV CPLUS_INCLUDE_PATH=$CUDA_HOME/include:$CPLUS_INCLUDE_PATH

# Install triton from source for the latest Blackwell support
RUN git clone https://github.com/triton-lang/triton.git && \
    cd triton && \
    git checkout c5d671f91d90f40900027382f98b17a3e04045f6 && \
    pip install -r python/requirements.txt && \
    pip install . && \
    cd ..

# Install xformers from source for Blackwell support
RUN git clone --depth=1 https://github.com/facebookresearch/xformers --recursive && \
    cd xformers && \
    export TORCH_CUDA_ARCH_LIST="12.1" && \
    python setup.py install && \
    cd ..

# Install unsloth and other dependencies
RUN pip install unsloth unsloth_zoo bitsandbytes==0.48.0 transformers==4.56.2 trl==0.22.2

# Launch a shell
CMD ["/bin/bash"]

Launch the container

Launch the training container with GPU access and volume mounts:

docker run -it \
    --gpus=all \
    --net=host \
    --ipc=host \
    --ulimit memlock=-1 \
    --ulimit stack=67108864 \
    -v $(pwd):$(pwd) \
    -v $HOME/.cache/huggingface:/root/.cache/huggingface \
    -w $(pwd) \
    unsloth-dgx-spark
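The command above bind-mounts `$HOME/.cache/huggingface` from the host. If that directory does not exist yet, Docker typically creates it for you, owned by root, which can be inconvenient for later non-root use. A minimal sketch that creates it up front follows. The function name `ensure_hf_cache` is hypothetical.

```shell
# Hypothetical helper: create the Hugging Face cache directory on the host
# (if it is missing) and print its path.
ensure_hf_cache() {
    local cache_dir="${HOME}/.cache/huggingface"
    mkdir -p "$cache_dir" || return 1
    printf '%s\n' "$cache_dir"
}
```

Calling `ensure_hf_cache` once before `docker run` should be enough. The directory is then owned by your user, and downloaded model weights persist across container restarts.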

Start Jupyter and run the notebook

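Inside the container, start Jupyter and run the notebook you need. For this tutorial, use the "Reinforcement Learning gpt-oss 20b to win 2048" notebook. In fact, all Unsloth notebooks run on the DGX Spark, including the 120b notebooks. Just remove the installation cells, since the Docker image already includes Unsloth and its dependencies.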

The commands below download the RL notebook and start Jupyter. Once Jupyter Notebook is running, open "gpt_oss_20B_RL_2048_Game.ipynb".

NOTEBOOK_URL="https://raw.githubusercontent.com/unslothai/notebooks/refs/heads/main/nb/gpt_oss_(20B)_Reinforcement_Learning_2048_Game_DGX_Spark.ipynb"
wget -O "gpt_oss_20B_RL_2048_Game.ipynb" "$NOTEBOOK_URL"

jupyter notebook --ip=0.0.0.0 --port=8888 --no-browser --allow-root
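Because `wget -O` creates the output file even when the download fails, a failed fetch can leave an empty or non-notebook file behind. The sketch below is a rough sanity check to run between the `wget` and `jupyter` steps. It only confirms that the file is non-empty and starts with `{`, as a JSON notebook would, and does not validate the notebook format. The function name `looks_like_notebook` is hypothetical.

```shell
# Hypothetical helper: succeed only if the file exists, is non-empty,
# and begins with '{' (the first character of a JSON notebook).
#   $1 = path to the downloaded .ipynb file
looks_like_notebook() {
    local file="$1"
    [ -s "$file" ] || return 1
    [ "$(head -c 1 "$file")" = "{" ]
}
```

For example: `looks_like_notebook "gpt_oss_20B_RL_2048_Game.ipynb" || echo "Download failed, re-run wget"`.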