I was browsing Reddit (yetch) while waiting for some stuff to finish when I came across this post:

https://old.reddit.com/r/LocalLLM/comments/1tek00h/why_is_llm_is_so_expensive/

The author makes a (very) interesting claim: if table stakes are $6K (they’re not…but go with it for now), then most folks are cooked from the get-go.

Personally, I have been figuring out how to get more from less. For example, people have found ways to run Qwen3.6 35B on a GTX 1060 with 6GB of VRAM at ~20 tok/s (–ctx 64K IIRC, but go check the vids yourself):

https://youtu.be/8F_5pdcD3HY

I think there’s a lot of juice to squeeze by turning LLMs from “all-seeing sages” into basically mouthpieces for shit that actually runs fast on regular silicon - but that’s just me and my crazy brain. YMMV.

  • hendrik@palaver.p3x.de
    2 days ago

    I don’t think that’s a new insight 😂. That’s how the AI game works. There have always been two classes: big corpo and the GPU poor. Of course the big AI companies get to shape AI. Economies of scale also work in their favour. They’ve bought most of the skill, and they have all the money. They simply buy a 4x EPYC + 3TB RAM node connected to 16 Nvidia AI cards, and then a few hundred more nodes. You don’t even buy one. It’s a very unequal environment if you want to compare the two.