Apex Jr
Llama 3.1 8B (~13 tok/s)
DeepSeek R1 7B (~20 tok/s · reasoning)
DeepSeek R1 14B (~12 tok/s · reasoning)
DeepSeek Coder V2 16B (~33 tok/s · code)
Qwen 2.5 Coder 7B (~22 tok/s)
Qwen 2.5 Coder 14B (~12 tok/s)
Qwen 2.5 14B (~2 tok/s)
Qwen 2.5 Coder 32B (~6 tok/s)
Self-hosted inference (experimental)
This is not Apex; it's Apex Jr.
Trying out different self-hosted models
Running on dedicated hardware · 48 threads · 188 GB RAM · Beauharnois, QC
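The per-model throughput figures quoted above (e.g. ~22 tok/s for Qwen 2.5 Coder 7B) are typically derived by dividing the number of generated tokens by the wall-clock generation time. A minimal sketch of that calculation, assuming a hypothetical `tokens_per_second` helper (not part of this app):

```python
import time

def tokens_per_second(token_count: int, elapsed_s: float) -> float:
    """Decode throughput in tokens/second, as quoted in the model list above."""
    if elapsed_s <= 0:
        raise ValueError("elapsed time must be positive")
    return token_count / elapsed_s

# Example: 440 tokens generated in 20 s of decoding gives ~22 tok/s,
# matching the figure listed for Qwen 2.5 Coder 7B.
print(tokens_per_second(440, 20.0))  # 22.0
```

In practice, prompt-processing (prefill) time is usually excluded so the figure reflects steady-state decode speed on the hardware listed below.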