I think this summarizes in one conversation what is so fucking irritating about this thing: I am supposed to believe that it wrote that code.
No siree, no RAG, no trickery with training a model to transform the code while maintaining an identical expression graph; it just goes from word-salading all over the place on a natural-language task to outputting 100 lines of coherent code.
Although that does suggest a new dunk on computer touchers of the AI-enthusiast kind: you can point at that and say that coding clearly does not require any logical reasoning.
(Also, as usual with AI, it is not always that good. Sometimes it fucks up the code, too.)
They scripted the river crossing puzzle into LLMs months ago. It’s a demo set-piece to convince users that the bot can solve any class of problem - the only issue is that it’ll often turn other problems into more river-crossing problems.
Yeah, I’m thinking this one may be special-cased; perhaps they wrote a generator of river-crossing puzzles with a corresponding conversion to “is_valid_state” or some such. I should see if I can get it to write something really ridiculous into “is_valid_state”.
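(For reference, roughly the sort of thing I’d expect such a generator to emit, sketched here for the classic wolf/goat/cabbage setup. This is purely my own guess at the shape of the output; apart from “is_valid_state”, which is just the name I speculated about above, every name and detail in it is mine, not anything the bot actually produced.)

```python
# A minimal sketch (my own guess, not the bot's actual output) of a generated
# solver for the classic wolf/goat/cabbage river-crossing puzzle.
from collections import deque

ITEMS = ("wolf", "goat", "cabbage")
ALL = frozenset(ITEMS) | {"farmer"}  # a state = the set of things on the left bank

def is_valid_state(left_bank):
    """A state is valid if no unattended bank holds wolf+goat or goat+cabbage."""
    for bank in (left_bank, ALL - left_bank):
        if "farmer" not in bank:
            if {"wolf", "goat"} <= bank or {"goat", "cabbage"} <= bank:
                return False
    return True

def neighbours(left_bank):
    """Yield every valid state reachable by the farmer crossing alone or with one item."""
    on_left = "farmer" in left_bank
    current_bank = left_bank if on_left else ALL - left_bank
    for cargo in [set()] + [{item} for item in current_bank - {"farmer"}]:
        moved = {"farmer"} | cargo
        new_left = left_bank - moved if on_left else left_bank | moved
        if is_valid_state(new_left):
            yield new_left

def solve():
    """Breadth-first search from 'everything on the left' to 'left bank empty'."""
    start, goal = ALL, frozenset()
    queue, seen = deque([(start, [start])]), {start}
    while queue:
        state, path = queue.popleft()
        if state == goal:
            return path
        for nxt in neighbours(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [nxt]))

if __name__ == "__main__":
    for step in solve():
        print(sorted(step))
```

Note that “is_valid_state” is the only puzzle-specific part; the rest is boilerplate breadth-first search, which is exactly why getting it to write something ridiculous into that function would be telling.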
Other thing is that in real life it’s like “I need to move 12 golf carts, one has a low battery, I probably can’t tow more than 3 uphill, I can ask Bob to help but he will be grumpy…”, just a tremendous amount of information (most of it irrelevant) with a tremendous number of possible moves (most of them possible to eliminate by actual thinking).