Version v1.7

You're viewing docs for v1.7. If you are cloning the repository, make sure to check out this release: git checkout v1.7

Quickstart

Install Cactus and run your first on-device AI completion.

Installation

React Native | Flutter | Kotlin | Swift | Python | Rust | CLI | C++

React Native:

npm install cactus-react-native react-native-nitro-modules

Platform Integration

Homebrew (macOS):

brew install cactus-compute/cactus/cactus

From Source (macOS):

git clone https://github.com/cactus-compute/cactus && cd cactus && source ./setup

From Source (Linux):

sudo apt-get install python3 python3-venv python3-pip cmake build-essential libcurl4-openssl-dev
git clone https://github.com/cactus-compute/cactus && cd cactus && source ./setup

Include the Cactus header in your project:

#include <cactus.h>

See the Cactus repository for CMake build instructions.

Your First Completion

React Native | Flutter | Kotlin | Swift | Python | Rust | CLI | C++

React Native:

import { useEffect } from 'react';
import { Button, Text } from 'react-native';
import { useCactusLM } from 'cactus-react-native';

const App = () => {
  const cactusLM = useCactusLM();

  useEffect(() => {
    if (!cactusLM.isDownloaded) {
      cactusLM.download();
    }
  }, []);

  const handleGenerate = () => {
    cactusLM.complete({
      messages: [{ role: 'user', content: 'What is the capital of France?' }],
    });
  };

  if (cactusLM.isDownloading) {
    return <Text>Downloading: {Math.round(cactusLM.downloadProgress * 100)}%</Text>;
  }

  return (
    <>
      <Button onPress={handleGenerate} title="Generate" />
      <Text>{cactusLM.completion}</Text>
    </>
  );
};

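Rust:

use cactus_sys::*;
use std::ffi::CString;
use std::os::raw::c_char;

fn main() {
    // Strings passed across the FFI boundary must be NUL-terminated.
    let model_path = CString::new("path/to/weight/folder").unwrap();
    let messages = CString::new(
        r#"[{"role": "user", "content": "What is the capital of France?"}]"#
    ).unwrap();

    // Zero-filled buffer for the response text.
    let mut response = vec![0u8; 4096];

    unsafe {
        let model = cactus_init(model_path.as_ptr(), std::ptr::null(), false);
        cactus_complete(
            model, messages.as_ptr(),
            response.as_mut_ptr() as *mut c_char, 4096,
            std::ptr::null(), std::ptr::null(),
            None, std::ptr::null_mut(),
        );
        cactus_destroy(model);
    }

    // Print only the bytes before the first NUL, not the whole 4096-byte buffer.
    let len = response.iter().position(|&b| b == 0).unwrap_or(response.len());
    println!("{}", String::from_utf8_lossy(&response[..len]));
}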
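
The Rust example passes the conversation as a hand-written JSON string literal. Once the prompt comes from user input, quotes or newlines in the text would produce invalid JSON. The sketch below shows one way to build the messages string with only the standard library. The helper names `escape_json` and `build_messages` are hypothetical and not part of `cactus_sys`. The message shape (an array of objects with `role` and `content`) is taken from the example above.

```rust
/// Escape a string so it can be embedded inside a JSON string literal.
/// (Hypothetical helper, not part of cactus_sys.)
fn escape_json(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for ch in input.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build a JSON messages array from (role, content) pairs.
/// (Hypothetical helper, not part of cactus_sys.)
fn build_messages(turns: &[(&str, &str)]) -> String {
    let items: Vec<String> = turns
        .iter()
        .map(|(role, content)| {
            format!(
                r#"{{"role": "{}", "content": "{}"}}"#,
                escape_json(role),
                escape_json(content)
            )
        })
        .collect();
    format!("[{}]", items.join(", "))
}

fn main() {
    let json = build_messages(&[("user", "What is the capital of \"France\"?")]);
    // prints [{"role": "user", "content": "What is the capital of \"France\"?"}]
    println!("{json}");
}
```

The resulting `String` can be wrapped with `CString::new(json)` before it is passed to `cactus_complete`. Because NUL and other control characters are escaped, the conversion does not fail on an interior NUL byte. A JSON library such as `serde_json` is a reasonable alternative if the project already depends on one.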

CLI:

cactus run LiquidAI/LFM2-350M

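C++:

#include <cactus.h>
#include <cstdio>

int main() {
    // Load the model; the second argument points to a folder of RAG documents.
    cactus_model_t model = cactus_init(
        "path/to/weight/folder",
        "path/to/rag/documents",
        false
    );

    const char* messages = R"([
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"}
    ])";

    // Zero-initialized so the buffer is always a valid C string.
    char response[4096] = {0};
    int result = cactus_complete(
        model, messages, response, sizeof(response),
        nullptr, nullptr, nullptr, nullptr
    );

    std::printf("%s\n(cactus_complete returned %d)\n", response, result);

    // Release the model, as the Rust example does.
    cactus_destroy(model);
    return 0;
}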

Supported Models

LLMs: Gemma-3 (270M, FunctionGemma-270M, 1B), LiquidAI LFM2 (350M, 2.6B) / LFM2.5 (1.2B-Instruct, 1.2B-Thinking) / LFM2-8B-A1B, Qwen3 (0.6B, 1.7B) (completion, tools, embeddings)
Vision: LFM2-VL, LFM2.5-VL (with Apple NPU), Qwen3.5 (0.8B, 2B)
Transcription: Whisper (Tiny/Base/Small/Medium with Apple NPU), Parakeet (CTC-0.6B/CTC-1.1B/TDT-0.6B-v3 with Apple NPU), Moonshine-Base
VAD: Silero VAD for voice activity detection
Embeddings: Nomic-Embed, Qwen3-Embedding

See the full list on Hugging Face.

Next Steps

Engine API -- Full inference API reference
Graph API -- Zero-copy computation graph for custom models
Fine-tuning & Deployment -- Convert and deploy custom fine-tunes
Choose Your SDK -- Help picking the right SDK for your project