Beginner's Guide
Run My AI Model.
Find the perfect AI model for your GPU in seconds
Select Your GPU
GPUs
Number of GPUs. VRAM scales with this count for multi-GPU setups.
or
VRAM (GB)
Total video memory. Enter a custom value if no specific GPU is selected.
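As a rough rule of thumb (an assumption for illustration, not this tool's exact formula), a model's VRAM footprint is about its parameter count times the bytes per weight at a given quantization level, plus some overhead for runtime buffers:

```python
def estimate_vram_gb(params_b: float, bits_per_weight: int = 4,
                     overhead: float = 1.2) -> float:
    """Rough VRAM estimate in GB for a quantized model.

    params_b: parameter count in billions (e.g. 7 for a 7B model)
    bits_per_weight: quantization level (16 = fp16, 4 = 4-bit quant)
    overhead: assumed ~20% extra for runtime buffers and cache
    """
    weights_gb = params_b * bits_per_weight / 8  # 1B params at 8 bits ~ 1 GB
    return weights_gb * overhead

# A 7B model at 4-bit quantization needs roughly 4.2 GB of VRAM
print(round(estimate_vram_gb(7, 4), 1))
```

By this estimate, a 7B model quantized to 4 bits fits comfortably in 8 GB of VRAM, while the same model at fp16 (16 bits) needs well over 16 GB.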
Minimum Context Window
Max text memory. 128k+ is ideal for analyzing full documents.
Any
2k
4k
8k
16k
32k
64k
128k
256k
512k
1M
2M+
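The context window itself consumes VRAM through the KV cache, which grows linearly with token count. A sketch of that cost, using hypothetical Llama-style architecture numbers (32 layers, 8 KV heads, head dimension 128, fp16 values), not any specific model's actual configuration:

```python
def kv_cache_gib(tokens: int, layers: int = 32, kv_heads: int = 8,
                 head_dim: int = 128, bytes_per_val: int = 2) -> float:
    """KV-cache size in GiB; defaults are assumed Llama-style numbers."""
    # 2 = one key tensor and one value tensor per layer
    total_bytes = 2 * layers * kv_heads * head_dim * bytes_per_val * tokens
    return total_bytes / 1024**3

print(round(kv_cache_gib(8 * 1024), 2))    # 8k context  -> 1.0 GiB
print(round(kv_cache_gib(128 * 1024), 2))  # 128k context -> 16.0 GiB
```

Under these assumptions, going from an 8k to a 128k context multiplies the cache cost sixteenfold, which is why long-context filters matter even when the weights themselves fit.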
Minimum Tokens/Sec
Generation speed. 15-20 t/s is reading speed; 50+ is very fast.
Any
10
20
30
40
50
100
Required Features
Filter by capabilities like Coding (high SWE-bench), MoE (efficiency), Math, Vision, or Reasoning.
Any Features
Minimum Quality Tier
Tier derived from benchmark scores, from Excellent (most capable) down to Basic.
Any Quality
Excellent
Great
Good
Fair
Basic
Model Family
Specific architectures like Llama, Qwen, or Mistral.
Any Family
Llama
Qwen
DeepSeek
Mistral
Gemma
Phi
Other
System RAM
(optional)
System memory used when the model doesn't fit in VRAM (CPU offloading). Much slower than running fully on GPU.
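Offloading works by keeping as many transformer layers on the GPU as VRAM allows and running the rest from system RAM. A simplified sketch, assuming equally sized layers (real runtimes such as llama.cpp expose this directly via a GPU-layer-count flag):

```python
def split_layers(n_layers: int, model_gb: float, vram_gb: float) -> tuple[int, int]:
    """Hypothetical layer split between GPU and CPU for an offloaded model.

    Assumes every layer is the same size; returns (layers on GPU,
    layers offloaded to system RAM).
    """
    gb_per_layer = model_gb / n_layers
    gpu_layers = min(n_layers, int(vram_gb / gb_per_layer))
    return gpu_layers, n_layers - gpu_layers

# A 13 GB model with 40 layers on an 8 GB GPU: 24 layers fit, 16 offload
print(split_layers(40, 13.0, 8.0))
```

Every offloaded layer runs at RAM bandwidth rather than VRAM bandwidth, so tokens/sec drops sharply as the offloaded share grows.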
GB
Pick a GPU or enter VRAM to get started
Results
Select a GPU or enter your VRAM to see results.