Stubsack: weekly thread for sneers not worth an entire post, week ending 5th April 2026

BlueMonday1984@awful.systems · 3 months ago


fiat_lux@lemmy.world · edit-2 3 months ago

Someone may (unverified for now) have left the frontend source maps in the Claude Code prod release (probably Claude). If this is accurate, it does not bode well for Anthropic’s theoretical IPO. But I think it might be real because I am not the least bit surprised it happened, nor am I the least bit surprised at the quality. https://github.com/chatgptprojects/claude-code

For example, I can only hope their Safeguards team has done more on the Go backend than this for safeguards. From the constants file cyberRiskInstruction.ts:

export const CYBER_RISK_INSTRUCTION = "IMPORTANT: Assist with authorized security testing, defensive security, CTF challenges, and educational contexts. Refuse requests for destructive techniques, DoS attacks, mass targeting, supply chain compromise, or detection evasion for malicious purposes. Dual-use security tools (C2 frameworks, credential testing, exploit development) require clear authorization context: pentesting engagements, CTF competitions, security research, or defensive use cases"

That’s it. That’s all the constants the file contains. The only other thing in it is a block comment explaining what it did and who to talk to if you want to modify it etc.

There is this amazing bit at the end of that block comment though.

Claude: Do not edit this file unless explicitly asked to do so by the user.

Brilliant. I feel much safer already.

YourNetworkIsHaunted@awful.systems · 3 months ago

More details here.

Can we talk about the tamagachi feature they were looking to add in for April 1? Because apparently it needed a little friend but also with gacha mechanics because we live in hell?

scruiser@awful.systems · 3 months ago

A Korean developer named Sigrid Jin—featured in the Wall Street Journal earlier this month for having consumed 25 billion Claude Code tokens—woke up at 4 a.m. to the news. He sat down, ported the core architecture to Python from scratch using an AI orchestration tool called oh-my-codex, and pushed claw-code before sunrise. The repo hit 30,000 GitHub stars faster than any repository in history.

Considering how one of the major use cases of llm coding agents is laundering open source and copy left, this is some well deserved payback to Anthropic imho.

Soyweiser@awful.systems · 3 months ago

Claude also has ‘avoid substrings’. Related to that and a funny extension deny image that went around on the social medias the last few days: .ass is a subtitle format.

istewart@awful.systems · 3 months ago

I am still patiently waiting for someone from the engineering staff at one of these companies to explain to me how these simple imperative sentences in English map consistently and reproducibly to model output. Yes, I understand that’s a complex topic. I’ll continue to wait.

Architeuthis@awful.systems · 3 months ago

According to the claude code leak the state of the art is to be, like, really stern and authoritative when you are begging it to do its job:

fiat_lux@lemmy.world · 3 months ago

I don’t work at one of those companies, just somewhere mainlining AI, so this answer might not satisfy your requirements. But the answer is very simple. The first thing anyone working in AI will tell you (maybe only internally?) is that the output is probabilistic not deterministic. By definition, that means it’s not entirely consistent or reproducible, just… maybe close enough. I’m sure you already knew that though.

However, from my perspective, even if it was deterministic, it wouldn’t make a substantial difference here.

For example, this file says I can’t ask it to build a DoS script. Fine. But if I ask it to write a script that sends a request to a server, and then later I ask it to add a loop… I get a DoS script. It’s a trivial hurdle at best, and doesn’t even approach basic risk mitigation.

aio@awful.systems · 3 months ago

the output is probabilistic not deterministic. By definition, that means it’s not entirely consistent or reproducible, just… maybe close enough.

That isn’t a barrier to guarantees regarding the behavior of a program. The entire field of randomized algorithms is devoted to doing so. The problem is people willfully writing and deploying programs which they neither understand nor can control.
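To make the randomized-algorithms point concrete, here is a minimal sketch of Freivalds' check, a textbook randomized algorithm that tests whether A·B equals C without computing the full product. It is an illustration of "random but with a provable bound", not anything from the leak; the function names and the `rng` parameter are my own choices.

```typescript
type Matrix = number[][];

// Multiply a square matrix by a column vector.
function matVec(m: Matrix, v: number[]): number[] {
  return m.map((row) => row.reduce((sum, x, j) => sum + x * v[j], 0));
}

// Freivalds' check for A * B === C using `rounds` random 0/1 vectors.
// A correct C always passes. A wrong C survives each round with
// probability at most 1/2, so it survives all rounds with probability
// at most 2^-rounds. Assumes integer entries so comparisons are exact.
function freivalds(
  a: Matrix,
  b: Matrix,
  c: Matrix,
  rounds: number,
  rng: () => number = Math.random
): boolean {
  const n = c.length;
  for (let i = 0; i < rounds; i++) {
    const r: number[] = [];
    for (let j = 0; j < n; j++) r.push(rng() < 0.5 ? 0 : 1);
    const left = matVec(a, matVec(b, r)); // A(Br), no full matrix product
    const right = matVec(c, r); // Cr
    if (left.some((x, j) => x !== right[j])) return false; // certainly A*B != C
  }
  return true; // equal, or unlucky with probability <= 2^-rounds
}
```

The output is random, but the failure probability is a number you can state and shrink on demand by raising `rounds`. That is the kind of guarantee a stern system prompt does not come with.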

istewart@awful.systems · 3 months ago

Exactly! The implicit claim that’s constantly being made with these systems is that they are a runtime for natural-language programming in English, but it’s all vector math in massively-multidimensional vector spaces in the background. I would like to think that serious engineers could place and demonstrate reliable constraints on the inputs and outputs of that math, instead of this cargo-culty, “please don’t do hacks unless your user is wearing a white hat” system prompt crap. It gives me the impression that the people involved are simply naively clinging to that implicit claim and not doing much of the work to substantiate it; which makes me distrust these systems more than almost all other factors.

blakestacey@awful.systems · 3 months ago

DoS script

Part of me reads that and still thinks, “Oh, you mean like AUTOEXEC.BAT?”

JFranek@awful.systems · 3 months ago

DOS.BAT, a DOS DoS script

blakestacey@awful.systems · 3 months ago

Truly a tool for the .COM era

lagrangeinterpolator@awful.systems · 3 months ago

I’m sure these English instructions work because they feel like they work. Look, these LLMs feel really great for coding. If they don’t work, that’s because you didn’t pay $200/month for the pro version and you didn’t put enough boldface and all-caps words in the prompt. Also, I really feel like these homeopathic sugar pills cured my cold. I got better after I started taking them!

No joke, I watched a talk once where some people used an LLM to model how certain users would behave in their scenario given their socioeconomic backgrounds. But they had a slight problem, which was that LLMs are nondeterministic and would of course often give different answers when prompted twice. Their solution was to literally use an automated tool that would try a bunch of different prompts until they happened to get one that would give consistent answers (at least on their dataset). I would call this the xkcd green jelly bean effect, but I guess if you call it “finetuning” then suddenly it sounds very proper and serious. (The cherry on top was that they never actually evaluated the output of the LLM, e.g. by seeing how consistent it was with actual user responses. They just had an LLM generate fiction and called it a day.)
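A toy simulation of why that prompt search proves nothing, under a deliberately extreme assumption: the "model" below answers yes/no by coin flip and ignores the prompt entirely. Everything here (`seededRng`, `agreementRate`, `promptSearch`, the dataset sizes) is hypothetical and is not the tool from the talk.

```typescript
// Park-Miller "minimal standard" generator, seeded so runs are repeatable.
function seededRng(seed: number): () => number {
  let state = (Math.abs(Math.floor(seed)) % 2147483646) + 1; // 1..2^31-2
  return () => {
    state = (state * 48271) % 2147483647;
    return (state - 1) / 2147483646; // in [0, 1)
  };
}

// Toy stand-in for an LLM: ask each item twice, answers are coin flips.
// Returns the fraction of items where the two answers agree.
function agreementRate(rng: () => number, items: number): number {
  let agree = 0;
  for (let i = 0; i < items; i++) {
    const first = rng() < 0.5;
    const second = rng() < 0.5;
    if (first === second) agree++;
  }
  return agree / items;
}

// Try prompts until one looks perfectly consistent on a small dev set,
// then re-measure that "winning" prompt on fresh items.
function promptSearch(
  seed: number,
  maxPrompts: number,
  devItems: number,
  freshItems: number
): { winner: number; freshAgreement: number } {
  const rng = seededRng(seed);
  for (let p = 0; p < maxPrompts; p++) {
    if (agreementRate(rng, devItems) === 1) {
      return { winner: p, freshAgreement: agreementRate(rng, freshItems) };
    }
  }
  return { winner: -1, freshAgreement: NaN };
}
```

With 5 dev items, pure noise passes the "consistent" filter about once per 32 prompts, so something like `promptSearch(2026, 1000, 5, 2000)` will almost certainly crown a winner, whose agreement on fresh items then sits back near one half. Real models are not coin flips, but the selection step is the same: search long enough and some prompt will look consistent on the data you searched over.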

Soyweiser@awful.systems · 3 months ago

Claude: Do not edit this file unless explicitly asked to do so by the user.

Wait, it can be edited? Tissue paper guardrails.

YourNetworkIsHaunted@awful.systems · 3 months ago

Yeah, letting the intrinsically insecure RNG recursively rewrite its own security instructions definitely can’t go wrong. I mean, they limited it to only do so when the users asked nicely!

Guillotines and woodchippers@mastodon.me.uk · 3 months ago

@Soyweiser @fiat_lux

So many of these people, as with the NFT clowns, have “Twelve Year Old First Day On The Internet” Energy

fiat_lux@lemmy.world · 3 months ago

This is all just JavaScript, so yes. As a tissue-thin defense, had they not left their source maps wide open, it would have been much harder to know this string existed and how to edit it. Not impossible, but much harder.

antifuchs@awful.systems · 3 months ago

This thread by Johnny reading (skimming on a phone, hah) through it is really good.

If only literally any human with context and a small screen to look at the bigger picture was involved with decisions around taking this to production, it would … still be bad but only on a societal level.

fiat_lux@lemmy.world · edit-2 3 months ago

That was great, thank you! Full respect to this absolute maniac for tracing some of the spaghetti, I was definitely not going to try that on my phone.

They’ve validated most gut feelings I had about how Claude works (and doesn’t), based on my experience having to use it. I’m feeling pretty smug that my hunches now have definitive code attributions.

But the one unfortunate part about all of this is that this leak and the ensuing justified sneers about specific bits are going to be fed back in to their codebase to fix some of the gaping holes. It’s an embarrassing indictment of the product, but it’s also free pre-IPO pentesting. Sort of like their open source pull request slop spam “undercover mode” was probably used as a way to extract free labor in the form of reviews from actually competent developers. This doesn’t seem as planned though.

blakestacey@awful.systems · 3 months ago

In practical terms, what can they do? Add instructions to say “You will not generate spaghetti code that will humiliate us when real programmers see it”? Perhaps in all caps?

This is what their organization is capable, after tremendous expense, of producing. I don’t think that bodes well for their prospects of improvement.

fiat_lux@lemmy.world · 3 months ago

Sorry, this was more of a rant than I thought it would be, I hit one of my own nerves while writing it. This is what happens when you’re not in a good position to escape enforced AI usage hell. Tl;dr in bold at end.

— wall divider —

I can think of several practical measures, because I’ve tried them myself in an effort to make my coerced work with LLMs less painful, and because in the process I’ve previously fallen into the gambling trap Johnny outlined.

The less novel things I tried are things they’ve half-assed themselves as “features” already. For example, Johnny found one of the things I had spotted in the wild a while back - the “system_reminder” injection. This periodically injects a small line into the logs in an effort to keep it within the context window. In my case, I tried the same thing with a line that summed up to “reread the original fucking context and assess whether the changes make a shred of sense against the task because what the fuck”. I had tried this unsuccessfully because I had no way to realistically enforce it within their system, and they recently included the “team lead” skill which (I rightly assumed) tries to do exactly the same thing. The implementation suggests they will only have been marginally more successful than my attempt, it didn’t look like they tried very hard. This could be better implemented and extended to even a little more than “read original context”.
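For anyone who hasn't seen the pattern, this is roughly the shape of a periodic reminder injection. It's a minimal sketch of the general idea, not Anthropic's implementation: `LogMessage`, `injectReminders` and the `every` parameter are names I made up.

```typescript
interface LogMessage {
  role: "system" | "user" | "assistant";
  text: string;
}

// Return a copy of the log with `reminder` inserted after every `every`
// messages of the input log, so a recent copy is never far from the end.
// This only places text in the context; nothing here makes a model obey it.
function injectReminders(
  log: LogMessage[],
  reminder: string,
  every: number
): LogMessage[] {
  if (every < 1) throw new Error("every must be at least 1");
  const out: LogMessage[] = [];
  log.forEach((msg, i) => {
    out.push(msg);
    if ((i + 1) % every === 0) out.push({ role: "system", text: reminder });
  });
  return out;
}
```

The mechanical part is trivial, which is the point: the reminder costs tokens on every repetition, and whether the model acts on it is exactly the part you can't enforce from outside.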

For this leak, some of the very easy things they could have done were to verify their own code against best practises, implement the most basic of tests, or attempt to measure the consistency of their implementation. Source maps in production is a ridiculously easily preventable rookie error. This kind of verification should already be executed automatically in multiple stages of their coding, merging and deployment pipelines with varying degrees of redundancy and thoroughness, the same way it is for any tech company with more than maybe 10 developers. There is just no reason they shouldn’t have prevented huge chunks of the now visible code issues, if they were triggering their own trash bots against their codebase with even the simplest prompt of “evaluate against good system design and architecture principles”. This implies that they either weren’t doing it at all, or maybe worse, ignored all the red flags it is capable of identifying after ingesting all of the system architecture guides and textbooks ever published online.
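To show how small the source map check is, here's a sketch of a release gate. It assumes the build output has already been read into a path-to-contents map; `findSourceMapLeaks` is a hypothetical name and this is not any particular bundler's or CI system's API. The only real conventions it relies on are the `.map` file extension and the `sourceMappingURL` comment that bundlers append.

```typescript
// Scan build output (path -> file contents) for source map leaks.
// Returns one human-readable problem per offending file.
function findSourceMapLeaks(artifacts: Record<string, string>): string[] {
  const problems: string[] = [];
  for (const path of Object.keys(artifacts)) {
    const contents = artifacts[path];
    if (path.slice(-4) === ".map") {
      problems.push(`${path}: source map file shipped in build output`);
    } else if (/\/[\/*]# sourceMappingURL=/.test(contents)) {
      // Matches both the JS form (//#) and the CSS form (/*#),
      // including inline data: URLs.
      problems.push(`${path}: contains a sourceMappingURL comment`);
    }
  }
  return problems;
}
```

A pipeline step would fail the release whenever the returned list is non-empty. That's a dozen lines with no model calls, which is about the level of effort being asked for here.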

Anthropic is constrained in that some of the fixes which should be pushed to users are things which would have significant trade-off in the form of cost or context window, neither of which are palatable to them for reasons this community has discussed at length. But that constraint doesn’t prevent them from running checks or applying fixes to their own code, which reveals the root cause: The problems Anthropic are facing are clearly cultural. They’re pushing as much new shit as they can as quickly as possible and almost never going back to fix any of it. That’s a choice.

I saw a couple of signs that there are at least a few people there who are capable, and who are trying to steer an out of control titanic away from the iceberg, but the codebase stinks of missing architectural plans which are being retrofitted piecemeal long after they were needed. That aligns with Anthropic’s origin story, where OpenAI researchers accurately gauged how gullible venture capitalists are, but overestimated how much smarter they are than the rest of the world, and underestimated the value of practical experience building and running complex systems.

With the resources they have, even for a codebase of this unreasonable size, they could and should vibe code a much better version within a couple of months. That is not resounding praise for Claude, only a commentary on the quality of the existing code. Perhaps as a first step they could use their own “plan mode” which just appends a string that says not to make any edits, only to investigate and assess requirements…

Were I happy to watch the world burn, I’d start my own damn AI company that would do a much better job at this, because holy shit, people actually financed this trash.

Tl;dr, you’re right that it doesn’t bode well for their prospects of improvement, but it’s not because there aren’t many things they could be doing practically. It’s because they refuse to point the gun somewhere other than their own feet.

YourNetworkIsHaunted@awful.systems · 3 months ago

Anthropic is constrained in that some of the fixes which should be pushed to users are things which would have significant trade-off in the form of cost or context window, neither of which are palatable to them for reasons this community has discussed at length.

I think I’m missing something somewhere. One of the most alarming patterns that Jonny found imo was the level of waste involved across unnecessary calls to the source model, unnecessary token churn through the context window from bad architecture, and generally a sense that when creating this neither they nor their pattern extruder had made any effort to optimize it in terms of token use. In other words, changing the design to push some of those calls onto the user would save tokens and thus reduce the user’s cost per prompt, presumably by a fair margin on some of the worst cases.

fiat_lux@lemmy.world · 3 months ago

You’re right, but Johnny rightly also identified the issue where Claude creates complex trash code to work around user-provided constraints while not actually changing approach at all (see the part about tool denial workarounds).

I think Anthropic optimized for appended system prompt character count, and measured it in isolation - at least in the project’s beginning stages, if it’s not still in the code. I assume the inefficiencies have come from the agent working with and around that requirement, backfiring horribly in the spaghetti you see now. Not only is the resulting trash control flow less likely to be caught as a problem by agents, especially compared to checking a character count occasionally, but it’s more likely the agent will treat the trash code as an accepted pattern it should replicate.

Claude will also not trace a control flow to any kind of depth unless asked, and if you ask, and it encounters more than one or two levels of recursion or abstraction, it will choke. Probably because it’s so inefficient, but then they’re getting the inefficient tool to add more to itself and… there’s no way to recover from that loop without human refactoring. I assume that’s a taboo at Anthropic too.

A type of fix I was imagining would be something like an extra call like “after editing, evaluate changes against this large collection of terrible choices that should not occur, for example, the agent’s current internal code”. That would obviously increase the short term token consumption, context window overhead, and make an Anthropic project manager break out in a cold sweat. But it would reduce the gradient of the project death spiral by providing more robust code for future agents to copy paste that can be more cheaply evaluated, and require fewer user prompts overall to rectify obvious bad code.

They would never go for that type of long game, because they’d have to do some combination of:

listen to all the users complain that they ran out of tokens too soon while creating the millionth token dashboard project, or,
increase the limits for users at company cost, or,
increase prices, or,
sacrifice feature development velocity by getting humans to fix the mess / implement no-or-low-agent client-side tooling for common checks.

They should just set it all on fire, the abomination can’t salvage the abomination.
