

New post from tante: The “Data” Narrative eats itself, using the latest Pivot to AI as a jumping off point to talk about synthetic data.
Naturally, the best and most obvious fix — don’t hoard all that shit in the first place — wasn’t suggested.
At this point, I’m gonna chalk the refusal to stop hoarding up to ideology more than anything else. The tech industry clearly sees data not as information to be taken sparingly, used carefully, and deleted when necessary, but as Objective Reality Units™ which are theirs to steal and theirs alone.
Starting things off with a newsletter by Jared White that caught my attention: Why “Normies” Hate Programmers and the End of the Playful Hacker Trope, which directly discusses how the public perception of programmers has changed for the worse, and how best to rehabilitate it.
Adding my own two cents, the rise of gen-AI has definitely played a role here - I’m gonna quote Baldur Bjarnason directly, since he said it better than I could:
It’s turned the tech industry from a potential political ally to environmentalism to an outright adversary. Water consumption of individual queries is irrelevant because now companies like Google and Microsoft are explicitly lined up against the fight against climate disaster. For that alone the tech should be burned to the ground.
People in a variety of fields are watching the “AI” industry outright promise to destroy their field, their industry, their work, and their communities. Illustration, filmmaking, writers, and artists don’t need any other reason to be against the tech other than the fact that the industry behind the tech is openly talking about destroying them.
Those who fight for progressive politics are seeing authoritarians use the tech to generate propaganda, litter public institutions with LLM “accountability sinks” that prevent the responsibility of destroying people’s lives from falling on individual civil servants, and efforts to leverage the centralised nature of Large Language Model chatbots into political control over our language.
If AI slop is an insult to life itself, then this shit is an insult to knowledge. Any paper that actually uses “synthetic data” should be immediately retracted (and ideally destroyed altogether), but it’ll probably take years before the poison is purged from the scientific record.
Artificial intelligence is the destruction of knowledge for profit. It has no place in any scientific endeavor. (How you managed to maintain a calm, detached tone when talking about this shit, I will never know.)
Saw an AI-extruded “art” “timelapse” in the wild recently - the “timelapse” in question isn’t gonna fool anyone who actually cares about art, but it’s Good Enough™ to pass muster on someone mindlessly scrolling, and its creation serves only to attack artists’ ability to prove their work was human made.
This isn’t the first time AI bros have pulled this shit (Exhibit A, Exhibit B), by the way.
Burke and Goodnough are working to rectify the report. That sounds like removing the fake stuff but not the conclusions based on it. Those were determined well ahead of time.
In a better world, those conclusions would’ve been immediately thrown out as lies and Burke and Goodnough would’ve been immediately fired. We do not live in that world, but a man can dream.
This isn’t the first time I’ve heard about this - Baldur Bjarnason’s talked about how text extruders can be poisoned to alter their outputs before, noting its potential for manipulating search results and/or serving propaganda.
Funnily enough, calling a poisoned LLM a “sleeper agent” wouldn’t be entirely inaccurate - spicy autocomplete, by definition, cannot be aware that its word-prediction attempts are being manipulated to produce specific output. That said, it’s still treating these spicy autocompletes with more sentience than they actually have.
Not to mention, Cursor’s going to be training on a lot of highly sensitive material (sensitive data, copyrighted code, potential trade secrets) - the moment that shit starts to leak, all hell’s gonna break loose on the legal front.
With AI, of course
Now, you might object: Anysphere wouldn’t be abusing just their customers’ data. Their customers’ customers’ data may have non-disclosure agreements with teeth. Then there’s personal data covered by the GDPR and so on.
If we’re lucky, this will spook customers into running for the hills and hasten Cursor’s demise. Whatever magical performance benefits Cursor’s promising aren’t gonna be worth getting blamed for a data breach.
The report claims it’s about ethical AI use, but all I see is evidence that AI is inherently unethical, and an argument for banning AI from education forever.
OpenAI’s choices don’t make any long term sense if AGI isn’t coming. The obvious explanation is that at this point he simply plans to grift and hype (while staying technically within the bounds of legality) to buy a few years of personal enrichment.
Another possibility is that Altman’s bought into his own hype, and genuinely believes OpenAI will achieve AGI before the money runs out. Considering the tech press has been uncritically hyping up AI in general, and Sammy Boy himself has publicly fawned over “metafiction” “written” by an in-house text extruder, it’s a possibility I’m not gonna discount.
New premium column from Ed Zitron, digging into OpenAI and Oracle’s deal.
Richard Stallman successfully kamikazes his reputation for good after multiple close attempts over the years
He still maintains a solid reputation with FOSS freaks, fascists and pedophiles to this day. Given the Venn diagram of these three groups is a circle, this isn’t particularly shocking.
Literally what I and many others have been warning about. Using LLMs in your work is effectively giving US authorities central control over the bigotry and biases of your writing.
- Baldur Bjarnason, talking about Trump’s plan to turn LLMs into propaganda machines
Remember when I told you that using these LLMs was like giving US tech a bigotry dial for all your writing?
- Baldur, once again, on Facebook obeying in advance with Llama 4
(On a brighter note, that’s a pretty clever Touhou nod in the thumbnail)
Being compared to whackjobs with a worse grip on reality than him definitely helped.
Somehow, ~~Palpatine returned~~ Scott came off as a voice of reason
In more low-key news, French voice actress Françoise Cadol has accused Aspyr Media of making an AI replica of her voice for the latest Tomb Raider remaster (the French dub specifically, though it’s probably not the only one with AI slop voices).
Examples of the AI-generated voicelines have popped up on social media.
Nausicaä of the Valley of the Wind cost $1 million to make in 1984. (No idea what that would be adjusted for inflation.)
I checked a few random inflation calculators, and it comes out to roughly $3.1 million.
If there is, I haven’t heard of it. To try and preemptively coin one, “artificial industry” (“AI” for short) would be pretty fitting - as far as I can tell, no industry had ever unmoored itself from reality like this until the tech industry pulled it off via the AI bubble.
I genuinely forgot the metaverse existed until I read this.