Model Formats: Safetensors, GGUF, ONNX & More
Understand the different file formats for AI models and when to use which one.
01 Why Different Formats?
AI models can be saved in different file formats. Each format has specific trade-offs regarding security, loading speed, compatibility, and file size. Choosing the right format therefore makes a noticeable difference in how fast a model loads and, above all, how safely.
02 Format Overview
The most important model formats:
- Safetensors (.safetensors): The recommended format for diffusion models. Safe (no code execution risk), fast to load, and supported by all modern tools.
- CKPT (.ckpt): Older PyTorch format. WARNING: Can contain arbitrary Python code! Only use from trusted sources.
- GGUF (.gguf): Standard for quantized LLMs. Supports various quantization levels, CPU/GPU split, and metadata. Used by llama.cpp and Ollama.
- ONNX (.onnx): Vendor-independent format. Good for inference optimization and cross-platform deployment. Less common for local use.
- Diffusers: Hugging Face format with multiple files. Good for programming with the diffusers library, less practical for ComfyUI.
- PyTorch (.pt/.pth/.bin): Standard PyTorch format. Carries the same security concerns as CKPT, since the weights are stored as Python pickles.
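Part of what makes Safetensors safe is how simple the file layout is: 8 bytes with the header length, a JSON header describing the tensors, then raw tensor bytes. There is nothing executable to run. The following sketch builds a minimal one-tensor file in memory and reads its header using only the standard library (the tensor name "weight" is just an example):

```python
import json
import struct

def read_safetensors_header(data: bytes) -> dict:
    # Safetensors layout: 8-byte little-endian header size,
    # then a JSON header, then the raw tensor bytes.
    (header_len,) = struct.unpack("<Q", data[:8])
    return json.loads(data[8 : 8 + header_len])

# Build a tiny in-memory safetensors file: one F32 tensor with 2 values (8 bytes).
header = json.dumps(
    {"weight": {"dtype": "F32", "shape": [2], "data_offsets": [0, 8]}}
).encode()
blob = struct.pack("<Q", len(header)) + header + b"\x00" * 8

print(read_safetensors_header(blob))
```

Because the header is plain JSON, a loader can inspect tensor names, dtypes, and shapes without deserializing anything else, which is also why Safetensors files open so quickly.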
03 Security Warning
CKPT and PT files are Python pickle archives, and pickle can embed arbitrary code that runs the moment the file is loaded. This means a manipulated model could install malware! Always use Safetensors files when possible, and only download models from trusted sources like Hugging Face or Civitai.
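To see why this matters, here is a harmless, pure-stdlib demonstration of the mechanism. The class and function names are made up for illustration; a real attack would call something like `os.system` instead of our stand-in:

```python
import pickle

executed = []

def run_payload(msg):
    # Stand-in for a malicious call such as os.system(...)
    executed.append(msg)
    return msg

class FakeCheckpoint:
    def __reduce__(self):
        # Tells pickle: "to rebuild this object, call run_payload(...)".
        # That call happens during loading -- before any weights are used.
        return (run_payload, ("arbitrary code ran during load",))

payload = pickle.dumps(FakeCheckpoint())
pickle.loads(payload)  # the payload executes here
print(executed)
```

Nothing in the file needs to look suspicious: the code runs as a side effect of deserialization itself, which is exactly what Safetensors rules out by design.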
04 Format Conversion
Most formats can be converted into one another. Converting older CKPT models to Safetensors is especially worthwhile. Tools like 'safetensors-convert' or the Hugging Face Hub make the conversion easy. For GGUF conversion, use the llama.cpp convert tools, which support various quantization levels.
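A CKPT-to-Safetensors conversion can also be done in a few lines of Python. This is a minimal sketch, assuming the `torch` and `safetensors` packages are installed; the filenames are placeholders, and `weights_only=True` may reject some older checkpoints that pickle non-tensor objects:

```python
import torch
from safetensors.torch import save_file

# weights_only=True refuses to execute pickled code while loading
state = torch.load("model.ckpt", map_location="cpu", weights_only=True)

# Many training checkpoints nest the weights under a "state_dict" key
state = state.get("state_dict", state)

# Safetensors stores raw tensors only, so drop anything that is not a tensor
tensors = {k: v.contiguous() for k, v in state.items() if isinstance(v, torch.Tensor)}
save_file(tensors, "model.safetensors")
```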
05 Recommendation
For image generation (ComfyUI): Use Safetensors. For LLMs (Ollama/llama.cpp): Use GGUF with a suitable quantization (Q4_K_M is a good default). For development: Diffusers format for maximum flexibility.