Show HN: Best local LLM setup I found for a 5090 (llama.cpp fork + turboquant)

TL;DR

Hi folks, I found a setup for consumer hardware that seems to give great results for running an LLM locally:

- Qwen 3.6 at Q6
- 450K context, using the turbo3 mode of the turboquant llama.cpp fork
- multimodal support

This AI-generated blog article is a kind of "report" on what I did, how I did it, and some example results. I hope it is useful to some people.
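To give a rough feel for why KV-cache quantization matters at this context length, here is a minimal back-of-the-envelope sketch. It is not taken from the post: the model shape (layers, KV heads, head dimension) is hypothetical and is not the real Qwen 3.6 configuration, and treating turbo3 as roughly 3 bits per cached value is an assumption based on the name. It also ignores per-block scale overhead and any layers that do not keep a full KV cache.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_tokens, bits_per_value):
    """Bytes needed to store keys and values for n_tokens of context."""
    # Each token stores one key and one value vector per layer and KV head.
    values_per_token = 2 * n_layers * n_kv_heads * head_dim
    return values_per_token * n_tokens * bits_per_value / 8


# Hypothetical model shape for illustration only, NOT the real Qwen 3.6 config.
LAYERS, KV_HEADS, HEAD_DIM = 48, 4, 128
CONTEXT = 450_000  # context length mentioned in the post

# The ~3-bit figure for turbo3 is an assumption, not a documented value.
for label, bits in [("16-bit", 16), ("8-bit", 8), ("~3-bit (assumed turbo3)", 3)]:
    gib = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, CONTEXT, bits) / 2**30
    print(f"{label:>24}: {gib:6.1f} GiB")
```

Under these assumed numbers, a 16-bit cache alone would exceed the 32 GB of VRAM on a 5090, while a roughly 3-bit cache leaves room for the model weights. The real figures depend on the actual model architecture and the fork's cache format.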
