Show HN: Best local LLM setup I found for a 5090 (llama.cpp fork + turboquant)
TL;DR
Hi folks, I found a setup that seems to give great results on consumer hardware:
- qwen 3.6 q6
- 450K context using the turboquant llama.cpp fork in turbo3 mode
- multimodal support

This AI-generated blog article is a kind of "report" of what I did, how I did it, and example results. I hope it is useful to some people.
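To give a rough feel for why KV cache quantization matters at a 450K context, here is a minimal back-of-the-envelope sketch. It is not taken from the post. The model shape below is hypothetical, and the 3 bits per value for turbo3 is an assumption based only on the mode's name. The formula covers standard attention layers only and ignores any per-block scale overhead a real quantization format adds.

```python
def kv_cache_gib(n_layers, n_kv_heads, head_dim, n_ctx, bits_per_value):
    """Approximate KV cache size in GiB for standard attention layers."""
    # Two tensors (K and V) per layer, one head_dim vector per KV head per token.
    values = 2 * n_layers * n_kv_heads * head_dim * n_ctx
    return values * bits_per_value / 8 / 2**30


# Hypothetical model shape, NOT taken from the post or from any model card.
shape = dict(n_layers=48, n_kv_heads=4, head_dim=128, n_ctx=450_000)

# The bit widths are illustrative; "3-bit" is an assumption about turbo3.
for label, bits in [("16-bit", 16), ("8-bit", 8), ("3-bit (assumed)", 3)]:
    size = kv_cache_gib(bits_per_value=bits, **shape)
    print(f"{label:>16}: {size:.1f} GiB")
```

Plugging in the real layer count, KV head count, and head dimension of the model you run gives an estimate to compare against the card's VRAM after the weights are loaded.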