Model Formats: Safetensors, GGUF, ONNX & More
Understand the different file formats for AI models and when to use which one.
01 Why Different Formats?
AI models can be saved in different file formats. Each format has specific trade-offs regarding security, loading speed, compatibility, and file size. Choosing the right format therefore makes a noticeable difference in how fast a model loads and, above all, how safely.
02 Format Overview
The most important model formats:
- Safetensors (.safetensors): The recommended format for diffusion models. Safe (no code execution risk), fast to load, and supported by all modern tools.
- CKPT (.ckpt): Older PyTorch format. WARNING: Can contain arbitrary Python code! Only use from trusted sources.
- GGUF (.gguf): Standard for quantized LLMs. Supports various quantization levels, CPU/GPU split, and metadata. Used by llama.cpp and Ollama.
- ONNX (.onnx): Vendor-independent format. Good for inference optimization and cross-platform deployment. Less common for local use.
- Diffusers: Hugging Face format with multiple files. Good for programming with the diffusers library, less practical for ComfyUI.
- PyTorch (.pt/.pth/.bin): Standard PyTorch format. Carries the same security concerns as CKPT, since the weights are stored as Python pickles.
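Part of what makes Safetensors safe is how simple the file layout is: 8 bytes with the header length, a JSON header describing the tensors, then raw tensor bytes. There is nothing executable to run. The following sketch builds a minimal one-tensor file in memory and reads its header using only the standard library (the tensor name "weight" is just an example):

```python
import json
import struct

def read_safetensors_header(data: bytes) -> dict:
    # Safetensors layout: 8-byte little-endian header size,
    # then a JSON header, then the raw tensor bytes.
    (header_len,) = struct.unpack("<Q", data[:8])
    return json.loads(data[8 : 8 + header_len])

# Build a tiny in-memory safetensors file: one F32 tensor with 2 values (8 bytes).
header = json.dumps(
    {"weight": {"dtype": "F32", "shape": [2], "data_offsets": [0, 8]}}
).encode()
blob = struct.pack("<Q", len(header)) + header + b"\x00" * 8

print(read_safetensors_header(blob))
```

Because the header is plain JSON, a loader can inspect tensor names, dtypes, and shapes without deserializing anything else, which is also why Safetensors files open so quickly.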
03 Security Warning
CKPT and PT files are Python pickle archives, and pickle can embed arbitrary code that runs the moment the file is loaded. This means a manipulated model could install malware! Always use Safetensors files when possible, and only download models from trusted sources like Hugging Face or Civitai.
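To see why this matters, here is a harmless, pure-stdlib demonstration of the mechanism. The class and function names are made up for illustration; a real attack would call something like `os.system` instead of our stand-in:

```python
import pickle

executed = []

def run_payload(msg):
    # Stand-in for a malicious call such as os.system(...)
    executed.append(msg)
    return msg

class FakeCheckpoint:
    def __reduce__(self):
        # Tells pickle: "to rebuild this object, call run_payload(...)".
        # That call happens during loading -- before any weights are used.
        return (run_payload, ("arbitrary code ran during load",))

payload = pickle.dumps(FakeCheckpoint())
pickle.loads(payload)  # the payload executes here
print(executed)
```

Nothing in the file needs to look suspicious: the code runs as a side effect of deserialization itself, which is exactly what Safetensors rules out by design.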
04 Format Conversion
Most formats can be converted into one another. Converting older CKPT models to Safetensors is especially worthwhile. Tools like 'safetensors-convert' or the Hugging Face Hub make the conversion easy. For GGUF conversion, use the llama.cpp convert tools, which support various quantization levels.
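A CKPT-to-Safetensors conversion can also be done in a few lines of Python. This is a minimal sketch, assuming the `torch` and `safetensors` packages are installed; the filenames are placeholders, and `weights_only=True` may reject some older checkpoints that pickle non-tensor objects:

```python
import torch
from safetensors.torch import save_file

# weights_only=True refuses to execute pickled code while loading
state = torch.load("model.ckpt", map_location="cpu", weights_only=True)

# Many training checkpoints nest the weights under a "state_dict" key
state = state.get("state_dict", state)

# Safetensors stores raw tensors only, so drop anything that is not a tensor
tensors = {k: v.contiguous() for k, v in state.items() if isinstance(v, torch.Tensor)}
save_file(tensors, "model.safetensors")
```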
05 Recommendation
For image generation (ComfyUI): Use Safetensors. For LLMs (Ollama/llama.cpp): Use GGUF with a suitable quantization (Q4_K_M is a good default). For development: Diffusers format for maximum flexibility.