💧Liquid LFM2.5: How To Run & Fine-tune

Run and fine-tune LFM2.5 Instruct and Vision locally on your device!

Liquid AI has released LFM2.5, including instruct and vision models. LFM2.5-1.2B-Instruct is a 1.17B-parameter hybrid reasoning model trained on 28T tokens and further tuned with RL, delivering best-in-class performance at the 1B scale for instruction following, tool use, and agentic tasks.

LFM2.5 runs in under 1 GB of RAM and achieves 239 tokens/s decode speed on an AMD CPU. You can also fine-tune it locally with Unsloth.


Model Specifications:

  • Parameters: 1.17B

  • Architecture: 16 layers (10 double-gated LIV convolution blocks + 6 GQA blocks)

  • Training Budget: 28T tokens

  • Context Length: 32,768 tokens

  • Vocabulary Size: 65,536

  • Languages: English, Arabic, Chinese, French, German, Japanese, Korean, Spanish

⚙️ Usage Guide

Liquid AI recommends these settings for inference:

  • temperature = 0.1

  • top_k = 50

  • top_p = 0.1

  • repetition_penalty = 1.05

  • Maximum context length: 32,768
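
As an illustration, here is a minimal sketch of how these settings map onto a standard transformers generate() call. The checkpoint name LiquidAI/LFM2.5-1.2B-Instruct is an assumption; substitute the actual Hugging Face repo from the model card.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LiquidAI/LFM2.5-1.2B-Instruct"  # assumed repo name; check the model card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Give me three uses for a 1B-parameter model."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Liquid AI's recommended sampling settings
output = model.generate(
    input_ids,
    do_sample=True,
    temperature=0.1,
    top_k=50,
    top_p=0.1,
    repetition_penalty=1.05,
    max_new_tokens=512,
)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```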

Chat Template Format

LFM2.5 uses a ChatML-like chat template; the exact template string ships with the tokenizer.
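
Rather than hard-coding the special tokens, you can render the template directly from the tokenizer. A minimal sketch, again assuming the checkpoint name above:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("LiquidAI/LFM2.5-1.2B-Instruct")  # assumed repo name

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
]

# Render the ChatML-like template as a string to inspect the exact special tokens.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```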

Tool Use

LFM2.5 supports function calling with special tokens <|tool_call_start|> and <|tool_call_end|>. Provide tools as a JSON object in the system prompt:
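
A minimal sketch of a tool-calling prompt using the transformers tools= path, which injects the tool JSON into the system prompt for you. The get_weather tool below is a hypothetical example and the repo name is assumed:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("LiquidAI/LFM2.5-1.2B-Instruct")  # assumed repo name

# Hypothetical tool definition in JSON-schema form
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Tokyo right now?"}]

# The template places the tool JSON in the system prompt; the model is expected to
# answer with a call wrapped in <|tool_call_start|> ... <|tool_call_end|>.
prompt = tokenizer.apply_chat_template(
    messages, tools=tools, tokenize=False, add_generation_prompt=True
)
print(prompt)
```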

🖥️ Run LFM2.5-1.2B-Instruct

📖 llama.cpp Tutorial (GGUF)

1. Build llama.cpp

Obtain the latest llama.cpp from GitHub. Change -DGGML_CUDA=ON to -DGGML_CUDA=OFF if you don't have a GPU.

2. Run directly from Hugging Face

3. Or download the model first
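
If you want the files on disk first, a small huggingface_hub sketch does the job. The GGUF repo name and quant filename pattern below are assumptions; check the model card for the actual uploads:

```python
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="unsloth/LFM2.5-1.2B-Instruct-GGUF",   # assumed GGUF repo name
    local_dir="LFM2.5-1.2B-Instruct-GGUF",
    allow_patterns=["*Q4_K_M*"],                   # assumed quant; pick the one you need
)
```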

4. Run in conversation mode

🦥 Fine-tuning LFM2.5 with Unsloth

Unsloth supports fine-tuning LFM2.5 models. The 1.2B model fits comfortably on a free Colab T4 GPU. Training is 2x faster with 50% less VRAM.

Free Colab Notebook:

LFM2.5 is recommended for agentic tasks, data extraction, RAG, and tool use. It is not recommended for knowledge-intensive tasks or programming.

Unsloth Config for LFM2.5
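
A minimal LoRA configuration sketch in the usual Unsloth style. The model name is an assumption (check Unsloth's uploads), and the target-module list mirrors the standard attention/MLP projections, which may need adjusting for the LIV convolution blocks:

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/LFM2.5-1.2B-Instruct",  # assumed upload name
    max_seq_length=4096,
    load_in_4bit=True,
)

model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],  # may need adjusting for conv blocks
    lora_alpha=16,
    lora_dropout=0,
    bias="none",
    use_gradient_checkpointing="unsloth",
    random_state=3407,
)
```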

Training Setup
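
A training-loop sketch with TRL's SFTTrainer, following the usual Unsloth notebook pattern. The dataset path and the "text" column are placeholders for your own data:

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("json", data_files="train.jsonl", split="train")  # placeholder dataset

trainer = SFTTrainer(
    model=model,            # the LoRA model from the previous step
    tokenizer=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(
        dataset_text_field="text",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        warmup_steps=5,
        max_steps=60,        # raise this (or use num_train_epochs) for real runs
        learning_rate=2e-4,
        logging_steps=1,
        optim="adamw_8bit",
        weight_decay=0.01,
        lr_scheduler_type="linear",
        seed=3407,
        output_dir="outputs",
    ),
)
trainer.train()
```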

Save and Export
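
After training you can keep just the LoRA adapters or export merged weights; the GGUF export below assumes llama.cpp conversion is supported for this architecture:

```python
# LoRA adapters only (small; reload later with FastLanguageModel.from_pretrained)
model.save_pretrained("lfm2.5-lora")
tokenizer.save_pretrained("lfm2.5-lora")

# Merged 16-bit weights for serving with vLLM or further conversion
model.save_pretrained_merged("lfm2.5-merged", tokenizer, save_method="merged_16bit")

# Direct GGUF export for llama.cpp (quantization method is an example choice)
model.save_pretrained_gguf("lfm2.5-gguf", tokenizer, quantization_method="q4_k_m")
```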

🎉 llama-server Serving & Deployment

To deploy LFM2.5 for production with an OpenAI-compatible API:

Test with OpenAI client:
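
Assuming llama-server is running locally on its default port (8080), any OpenAI-compatible client can talk to it, for example:

```python
from openai import OpenAI

# llama-server ignores the API key and model name, but the client requires both.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="sk-no-key-required")

response = client.chat.completions.create(
    model="lfm2.5-1.2b-instruct",
    messages=[{"role": "user", "content": "Summarize LFM2.5 in one sentence."}],
    temperature=0.1,
)
print(response.choices[0].message.content)
```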

📊 Benchmarks

LFM2.5-1.2B-Instruct delivers best-in-class performance at the 1B scale and offers fast CPU inference with low memory usage:

💧 Liquid LFM2.5-1.2B-VL Guide

LFM2.5-VL-1.6B is a vision LLM built on top of LFM2.5-1.2B-Base and tuned for stronger real-world performance. You can now fine-tune it locally with Unsloth.


Model Specifications:

  • LM Backbone: LFM2.5-1.2B-Base

  • Vision encoder: SigLIP2 NaFlex shape-optimized 400M

  • Context length: 32,768 tokens

  • Vocabulary size: 65,536

  • Languages: English, Arabic, Chinese, French, German, Japanese, Korean, and Spanish

  • Native resolution processing: Handles images up to 512×512 pixels without upscaling and preserves non-standard aspect ratios without distortion

  • Tiling strategy: Splits large images into non-overlapping 512×512 patches and includes thumbnail encoding for global context

  • Inference-time flexibility: User-tunable maximum image tokens and tile count for speed/quality tradeoff without retraining

⚙️ Usage Guide

Liquid AI recommends these settings for inference:

  • Text: temperature=0.1, min_p=0.15, repetition_penalty=1.05

  • Vision: min_image_tokens=64, max_image_tokens=256, do_image_splitting=True
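
A sketch of where those knobs go, assuming the repo name LiquidAI/LFM2.5-VL-1.6B and that the image-token settings are accepted when loading the processor (check the model card for the exact interface):

```python
from transformers import AutoModelForImageTextToText, AutoProcessor

model_id = "LiquidAI/LFM2.5-VL-1.6B"  # assumed repo name

# Vision-side settings (assumed to be processor-level options)
processor = AutoProcessor.from_pretrained(
    model_id,
    min_image_tokens=64,
    max_image_tokens=256,
    do_image_splitting=True,
)
model = AutoModelForImageTextToText.from_pretrained(model_id, device_map="auto")

# Text-side settings, passed to model.generate() in the chat-template example below
generation_kwargs = dict(do_sample=True, temperature=0.1, min_p=0.15, repetition_penalty=1.05)
```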

Chat Template Format

LFM2.5-VL uses a ChatML-like chat template; the exact template string ships with the processor.
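
Messages interleave image and text parts, and the processor's own template handles the special tokens. Continuing from the processor and model loaded above (the image URL is a placeholder):

```python
from transformers.image_utils import load_image

image = load_image("https://example.com/sample.jpg")  # placeholder image URL

conversation = [{
    "role": "user",
    "content": [
        {"type": "image", "image": image},
        {"type": "text", "text": "Describe this image in one sentence."},
    ],
}]

inputs = processor.apply_chat_template(
    conversation,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(**inputs, max_new_tokens=256, **generation_kwargs)
print(processor.batch_decode(output, skip_special_tokens=True)[0])
```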

🖥️ Run LFM2.5-VL-1.6B

📖 llama.cpp Tutorial (GGUF)

1. Build llama.cpp

Obtain the latest llama.cpp from GitHub. Change -DGGML_CUDA=ON to -DGGML_CUDA=OFF if you don't have a GPU.

2. Run directly from Hugging Face

🦥 Fine-tuning LFM2.5-VL with Unsloth

Unsloth supports fine-tuning LFM2.5 models. The 1.6B model fits comfortably on a free Colab T4 GPU. Training is 2x faster with 50% less VRAM.

Free Colab Notebook:

Unsloth Config for LFM2.5
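
For the vision model, Unsloth uses FastVisionModel, which lets you choose whether to train the vision encoder, the language backbone, or both. The model name below is an assumption:

```python
from unsloth import FastVisionModel

model, tokenizer = FastVisionModel.from_pretrained(
    "unsloth/LFM2.5-VL-1.6B",         # assumed upload name
    load_in_4bit=True,
    use_gradient_checkpointing="unsloth",
)

model = FastVisionModel.get_peft_model(
    model,
    finetune_vision_layers=True,      # set False to freeze the SigLIP2 encoder
    finetune_language_layers=True,
    finetune_attention_modules=True,
    finetune_mlp_modules=True,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    bias="none",
    random_state=3407,
)
```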

Training Setup
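
Training mirrors the text setup but swaps in Unsloth's vision data collator; the dataset is a placeholder for a list of chat-style samples containing image and text parts:

```python
from trl import SFTConfig, SFTTrainer
from unsloth import FastVisionModel
from unsloth.trainer import UnslothVisionDataCollator

FastVisionModel.for_training(model)   # switch the model into training mode

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    data_collator=UnslothVisionDataCollator(model, tokenizer),
    train_dataset=converted_dataset,  # placeholder: [{"messages": [...]}] with image + text parts
    args=SFTConfig(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        warmup_steps=5,
        max_steps=30,
        learning_rate=2e-4,
        optim="adamw_8bit",
        weight_decay=0.01,
        lr_scheduler_type="linear",
        seed=3407,
        output_dir="outputs",
        # Required for vision datasets so TRL doesn't try to pre-tokenize them
        remove_unused_columns=False,
        dataset_text_field="",
        dataset_kwargs={"skip_prepare_dataset": True},
    ),
)
trainer.train()
```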

Save and Export

📊 Benchmarks

LFM2.5-VL-1.6B delivers best-in-class performance:

| Model | MMStar | MM-IFEval | BLINK | InfoVQA (Val) | OCRBench (v2) | RealWorldQA | MMMU (Val) | MMMB (avg) | Multilingual MMBench (avg) |
|---|---|---|---|---|---|---|---|---|---|
| LFM2.5-VL-1.6B | 50.67 | 52.29 | 48.82 | 62.71 | 41.44 | 64.84 | 40.56 | 76.96 | 65.90 |
| LFM2-VL-1.6B | 49.87 | 46.35 | 44.50 | 58.35 | 35.11 | 65.75 | 39.67 | 72.13 | 60.57 |
| InternVL3.5-1B | 50.27 | 36.17 | 44.19 | 60.99 | 33.53 | 57.12 | 41.89 | 68.93 | 58.32 |
| FastVLM-1.5B | 53.13 | 24.99 | 43.29 | 23.92 | 26.61 | 61.56 | 38.78 | 64.84 | 50.89 |

📚 Resources
