I think it’s Fromi-Owa. Maybe half Japanese?
- 0 Posts
- 13 Comments
SGforce@lemmy.ca to science@lemmy.world • NoFap (anti-porn group) and the group’s founder, Alexander Rhodes, sued Taylor & Francis and a UCLA researcher over a paper critical of the group
202 · 19 days ago
I assumed they’d turn up in the Epstein files. He funded tons of adjacent shit.
SGforce@lemmy.ca to Data is Beautiful@mander.xyz • What's going to kill you? [US, 2018-2023]
4 · 7 months ago
Huge gap there on that second graph. Hiding a massive spike, are we?
SGforce@lemmy.ca to CasualEurope@piefed.social • Tell me what country you're from without telling me what country you're from
9 · 8 months ago
How’s it going, eh?
I’m going to make the crookedest arrows, guaranteed.
SGforce@lemmy.ca to TechTakes@awful.systems • Gemini 2.5 "reasoning", no real improvement on river crossings.
311 · 1 year ago
Exactly. It’s overtrained on the test, ignoring the differences. If you instead used something it recognises, but doesn’t recognise as the test pattern (having the same tokens/embeddings), it will perform better. I’m not joking; it’s a common tactic to get around censoring. You’re just going around the issue. What I’m saying is they’ve trained the model so much on benchmarks that it is indeed dumber.
SGforce@lemmy.ca to TechTakes@awful.systems • Gemini 2.5 "reasoning", no real improvement on river crossings.
19 · 1 year ago
Bet
SGforce@lemmy.ca to TechTakes@awful.systems • Gemini 2.5 "reasoning", no real improvement on river crossings.
214 · 1 year ago
It’s just overtrained on the puzzle, such that it mostly ignores your prompt. Changing a few words out doesn’t change that it recognises the puzzle. Try writing it out in ASCII, or uploading an image with it written out, or some other weird way that it hasn’t been specifically trained on, and I bet it actually performs better.
SGforce@lemmy.ca to TechTakes@awful.systems • Lol. Lmao even. "DeepSeek R1 reproduced for $30: Berkeley researchers replicate DeepSeek R1 for $30—casting doubt on H100 claims and controversy"
1118 · 1 year ago
They fine-tuned 1.5-3B models. This is a non-story.
SGforce@lemmy.ca to TechTakes@awful.systems • DeepSeek Tiananmen Square controversy gets weirder
7 · 1 year ago
The local models are distilled versions of Qwen or Llama or whatever else, not really DeepSeek’s model. So you get refusals based primarily on the base model, plus whatever it learned from the distilling. If it’s Qwen or another Chinese model, it’s more likely to refuse, but a Llama model or something else could pick it up to a lesser extent.
SGforce@lemmy.ca to TechTakes@awful.systems • Stubsack: weekly thread for sneers not worth an entire post, week ending 2nd February 2025
8 · 1 year ago
When hedge funds decide to flip the switch on something, the reaction never looks rational. Meta was green today, ffs.
Yep, was going to mention a study from a few years ago that threw out most of this data as junk, since they found many were counting stretched flaccid as “erect”.
Such is life in the zone