

Edit: also, I have a very strong suspicion that someone will figure out a way to make most matrix multiplications in an LLM sparse, doing mostly the same shit in a different basis. An answer to a specific query does not intrinsically use every piece of information the LLM has memorized.
Like MoE (Mixture of Experts) models? This technique is already in use by many models: DeepSeek, Llama 4, Kimi K2, Mixtral, Qwen3 30B and 235B, and many more. I read that GPT-4 was leaked and confirmed to use MoE, and Grok is confirmed to use MoE; I suspect most large, hosted, proprietary models use MoE in some manner.
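To make the sparsity concrete, here is a minimal numpy sketch of the top-k routing idea behind MoE layers: a small router scores the experts for each token, and only the top-scoring few actually run. All the names and sizes here (`n_experts`, `d_model`, `top_k`, `gate_w`, `experts`) are illustrative toy values, not taken from any real model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy MoE layer: 8 experts, each token routed to its top-2 experts.
# Sizes are illustrative only.
n_experts, d_model, top_k = 8, 16, 2
gate_w = rng.normal(size=(d_model, n_experts))            # router weights
experts = rng.normal(size=(n_experts, d_model, d_model))  # one weight matrix per expert

def moe_forward(x):
    """x: (d_model,) -> (d_model,). Only top_k experts are evaluated."""
    logits = x @ gate_w                        # router score for each expert
    chosen = np.argsort(logits)[-top_k:]       # indices of the top_k experts
    weights = np.exp(logits[chosen])
    weights /= weights.sum()                   # softmax over the chosen experts
    # Sparse compute: the other n_experts - top_k experts are skipped entirely,
    # so per-token FLOPs scale with top_k, not with n_experts.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, chosen))

out = moe_forward(rng.normal(size=d_model))
print(out.shape)  # (16,)
```

The point of the sketch is the last comment: adding experts grows the parameter count (memorized information) without growing per-token compute, since each query only touches a few experts.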
We need more unions like this one.