Beginner's Guide
Run My AI Model.
Find the perfect AI model for your GPU in seconds
Select Your GPU
GPUs
Number of GPUs. VRAM scales with this count for multi-GPU setups.
or
VRAM (GB)
Total video memory. Enter a custom value if no specific GPU is selected.
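As a rough rule of thumb (an assumption for illustration, not this tool's exact formula), a model's VRAM footprint is about its parameter count times the bytes per weight at a given quantization level, plus some overhead for runtime buffers:

```python
def estimate_vram_gb(params_b: float, bits_per_weight: int = 4,
                     overhead: float = 1.2) -> float:
    """Rough VRAM estimate in GB for a quantized model.

    params_b: parameter count in billions (e.g. 7 for a 7B model)
    bits_per_weight: quantization level (16 = fp16, 4 = 4-bit quant)
    overhead: assumed ~20% extra for runtime buffers and cache
    """
    weights_gb = params_b * bits_per_weight / 8  # 1B params at 8 bits ~ 1 GB
    return weights_gb * overhead

# A 7B model at 4-bit quantization needs roughly 4.2 GB of VRAM
print(round(estimate_vram_gb(7, 4), 1))
```

By this estimate, a 7B model quantized to 4 bits fits comfortably in 8 GB of VRAM, while the same model at fp16 (16 bits) needs well over 16 GB.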
Minimum Context Window
Max text memory. 128k+ is ideal for analyzing full documents.
Any
2k
4k
8k
16k
32k
64k
128k
256k
512k
1M
2M+
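The context window itself consumes VRAM through the KV cache, which grows linearly with token count. A sketch of that cost, using hypothetical Llama-style architecture numbers (32 layers, 8 KV heads, head dimension 128, fp16 values), not any specific model's actual configuration:

```python
def kv_cache_gib(tokens: int, layers: int = 32, kv_heads: int = 8,
                 head_dim: int = 128, bytes_per_val: int = 2) -> float:
    """KV-cache size in GiB; defaults are assumed Llama-style numbers."""
    # 2 = one key tensor and one value tensor per layer
    total_bytes = 2 * layers * kv_heads * head_dim * bytes_per_val * tokens
    return total_bytes / 1024**3

print(round(kv_cache_gib(8 * 1024), 2))    # 8k context  -> 1.0 GiB
print(round(kv_cache_gib(128 * 1024), 2))  # 128k context -> 16.0 GiB
```

Under these assumptions, going from an 8k to a 128k context multiplies the cache cost sixteenfold, which is why long-context filters matter even when the weights themselves fit.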
Minimum Tokens/Sec
Generation speed. 15-20 t/s is reading speed; 50+ is very fast.
Any
10
20
30
40
50
100
Required Features
Filter by capabilities like Coding (high SWE-bench), MoE (efficiency), Math, Vision, or Reasoning.
Any Features
Minimum Quality Tier
Tier derived from benchmark scores, from Excellent (most capable) down to Basic.
Any Quality
Excellent
Great
Good
Fair
Basic
Model Family
Specific architectures like Llama, Qwen, or Mistral.
Any Family
Llama
Qwen
DeepSeek
Mistral
Gemma
Phi
Other
System RAM
(optional)
System memory used when the model doesn't fit in VRAM (CPU offloading). Much slower than running fully on GPU.
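Offloading works by keeping as many transformer layers on the GPU as VRAM allows and running the rest from system RAM. A simplified sketch, assuming equally sized layers (real runtimes such as llama.cpp expose this directly via a GPU-layer-count flag):

```python
def split_layers(n_layers: int, model_gb: float, vram_gb: float) -> tuple[int, int]:
    """Hypothetical layer split between GPU and CPU for an offloaded model.

    Assumes every layer is the same size; returns (layers on GPU,
    layers offloaded to system RAM).
    """
    gb_per_layer = model_gb / n_layers
    gpu_layers = min(n_layers, int(vram_gb / gb_per_layer))
    return gpu_layers, n_layers - gpu_layers

# A 13 GB model with 40 layers on an 8 GB GPU: 24 layers fit, 16 offload
print(split_layers(40, 13.0, 8.0))
```

Every offloaded layer runs at RAM bandwidth rather than VRAM bandwidth, so tokens/sec drops sharply as the offloaded share grows.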
GB
Pick a GPU or enter VRAM to get started
Results
Select a GPU or enter your VRAM to see results.