• 2 Posts
  • 128 Comments
Joined 2 years ago
Cake day: July 19th, 2023

  • I guess I’m the local bertologist today; look up Dr. Bender for a similar take.

    When we say that LLMs only have words, we mean that they only manipulate syntax with first-order rules; the LLM doesn’t have a sense of meaning, only an autoregressive mapping which associates some syntax (“context”, “prompt”) to other syntax (“completion”). We’ve previously examined the path-based view and bag-of-words view. Bender or a category theorist might say that syntax and semantics are different categories of objects and that a mapping from syntax to semantics isn’t present in an LLM; I’d personally say that an LLM only operates with System 3 — associative memetic concepts — and is lacking not only a body but also any kind of deliberation. (Going further in that direction, the “T” in “GPT-4” is for Transformers; unlike e.g. Mamba, a Transformer doesn’t have System 2 deliberation or rumination, and Hofstadter suggests that this alone disqualifies Transformers from being conscious.)
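
    Since the autoregressive mapping is the load-bearing idea here, a minimal toy sketch might help (my own illustration in Python; nothing like a real LLM’s architecture): a sampler that maps a context to a completion purely by counting which token followed which context in its training text, with no representation of meaning anywhere.

    ```python
    # Toy autoregressive "language model": pure syntax-to-syntax association.
    # Illustrative only; real LLMs use learned neural networks, not count tables.
    import random
    from collections import defaultdict, Counter

    def train(tokens, context_len=2):
        table = defaultdict(Counter)
        for i in range(len(tokens) - context_len):
            context = tuple(tokens[i:i + context_len])
            table[context][tokens[i + context_len]] += 1
        return table

    def complete(table, prompt, n_tokens=10, context_len=2):
        out = list(prompt)
        for _ in range(n_tokens):
            followers = table.get(tuple(out[-context_len:]))
            if not followers:              # unseen context: nothing to associate
                break
            choices, weights = zip(*followers.items())
            out.append(random.choices(choices, weights=weights)[0])
        return out

    corpus = "the cat sat on the mat and the cat sat on the hat".split()
    model = train(corpus)
    print(" ".join(complete(model, ["the", "cat"])))
    ```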

    If you made a perfect copy of me, a ‘model’, I think it would have consciousness. I would want the clone treated well even if some of the copied traits weren’t perfect.

    I think that this collection of misunderstandings is the heart of the issue. A model isn’t a perfect copy. Indeed, the reason that LLMs must hallucinate is that they are relatively small compared to their training data and therefore must be lossy compressions, or blurry JPEGs as Ted Chiang puts it. Additionally, no humans are cloned in the training of a model, even at the conceptual level; a model doesn’t learn to be a human, but to simulate what humans might write. So when you say:

    Spinal injuries are terrible. I don’t think ‘text-only-human’ should fail the consciousness test.

    I completely agree! LLMs aren’t text-only humans, though. An LLM corresponds to a portion of the left hemisphere, particularly Broca’s area, except that it drives a tokenizer instead of a mouth; chain-of-thought “thinking” corresponds to rationalizations produced by the left-brain interpreter. Humans are clearly much more than that! For example, an LLM cannot feel hungry because it does not have a stomach which emits a specific hormone that is interpreted by a nervous system; in this sense, LLMs don’t have feelings. Rather, what should be surprising to you is the ELIZA effect: a bag of words that can only communicate by mechanically associating memes to inputs is capable of passing a Turing test.
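
    The ELIZA effect is easier to appreciate after seeing how little machinery it takes. A minimal sketch in the spirit of Weizenbaum’s ELIZA (my own toy in Python, not his program): canned patterns mechanically associated to canned templates, no model of the speaker anywhere, yet this style of program famously got treated as understanding.

    ```python
    # Tiny ELIZA-style responder: mechanical pattern -> template association.
    # A toy in the spirit of Weizenbaum's ELIZA, not a reproduction of it.
    import re

    RULES = [
        (re.compile(r"\bi feel (.+)", re.I), "Why do you feel {0}?"),
        (re.compile(r"\bi am (.+)", re.I), "How long have you been {0}?"),
        (re.compile(r"\bbecause (.+)", re.I), "Is that the real reason?"),
    ]
    FALLBACK = "Please, go on."

    def respond(utterance):
        for pattern, template in RULES:
            match = pattern.search(utterance)
            if match:
                return template.format(*match.groups())
        return FALLBACK

    print(respond("I feel like it understands me"))
    # -> Why do you feel like it understands me?
    ```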

    Also, from one philosopher to another: try not to get hung up on questions of consciousness. What we care about is whether we’re allowed to mistreat robots, not whether robots are conscious; the only reason to ask the latter question is to have presumed that we may not mistreat the conscious, a hypocrisy that doesn’t withstand scrutiny. Can matrix multiplication be conscious? Probably not, but the shape of the question (“chat is this abstractum aware of itself, me, or anything in its environment”) is kind of suspicious! For another fun example, IIT (integrated information theory) is probably bogus not because thermostats are likely not conscious but because “chat is this thermostat aware of itself” is not a lucid line of thought.


  • I think it’s the other way around. The memes are incredibly good at the left-vs-right game because left- and right-leaning people presume underlying facts, and the memes reassure people that those facts are true and good (or false and bad, etc.) without doing any fact-finding.

    When we say “the right can’t meme” what we mean is that the right’s memes are about projecting bigotry. It’s like saying that the right has no comedians; of course they have people that stand up in front of an audience and emit words according to memes, tropes, and narremes, such that the audience laughs. Indeed, stand-up was invented by Frank Fay, an open fascist. (His Behind the Bastards episodes are quite interesting.) What we’re saying is that the stand-up routine is bigoted. If this seems unrelated, please consider: the Haitians-eating-pets joke is part of a stand-up routine that a clown tells in order to get his circus elected.


  • My name is Schmidt F. I’m 27 years old. My house is in the Mennonite region of Dutch Pennsylvania, where all the farms are, and I am trad-married. I work as the manager for the Single Sushi matchmaking service, and I get home every day by sunset at the latest. I don’t smoke, but I occasionally drink. I’m in bed by two candles and make sure I sleep until sunrise, no matter what. After having a glass of warm unpasteurized milk and doing about twenty minutes of prayer before going to bed, I usually have no problems sleeping until morning. Just like a real Mennonite, I wake up without any fatigue or stress in the morning. I was told there were no issues at my last one-on-one with my pastor. I’m trying to explain that I’m a person who wishes to live a very quiet life, as long as I have Internet access. I take care not to trouble myself with any enemies, like JavaScript and Python, that would cause me to lose sleep at night. That is how I deal with society, and I think that is what brings me happiness. Although, if I were to write code I wouldn’t lose to anyone.




  • The original article is a great example of what happens when one only reads Bostrom and Yarvin. Their thesis:

    If you claim that there is no AI-risk, then which of the following bullets do you want to bite?

    1. If a race of aliens with an IQ of 300 came to Earth, that would definitely be fine.
    2. There’s no way that AI with an IQ of 300 will arrive within the next few decades.
    3. We know some special property that AI will definitely have that will definitely prevent all possible bad outcomes that aliens might cause.

    Ignoring that IQ doesn’t really exist beyond about 160-180 depending on population choice, this is clearly an example of rectal philosophy that doesn’t stand up to scrutiny. (1) is easy, given that the people verified to be high-IQ are often wrong, daydreaming, and otherwise erring like humans; Vos Savant and Sidis are good examples, and arguably the most impactful high-IQ person, Newton, could not be steelmanned beyond Sherlock Holmes: detached and aloof, mostly reading in solitude or being hedonistic, occasionally helping answer open questions but usually not even preventing or causing crimes. (2) is ignorant of previous work, as computer programs which deterministically solve standard IQ tests like RPM (Raven’s Progressive Matrices) and the SAT have been around since the 1980s yet are not considered dangerous or intelligent. (3) is easy; linear algebra is confined in the security sense, while humans are not, and confinement definitely prevents all possible bad outcomes.

    Frankly I wish that they’d understand that the capabilities matter more than the theory of mind. Fnargl is one alien at 100 IQ, but he has a Death Note and goldlust, so containing him will almost certainly result in deaths. Containing a chatbot is mostly about remembering how systemctl works.


  • Jeff “Coding Horror” Atwood is sneering — at us! On Mastodon:

    bad news “AI bubble doomers”. I’ve found the LLMs to be incredibly useful … Is it overhyped? FUCK Yes. … But this is NOTHING like the moronic Segway (I am still bitter about that crap), Cryptocurrency, … and the first dot-com bubble … If you find this uncomfortable, I’m sorry, but I know what I know, and I can cite several dozen very specific examples in the last 2-3 weeks where it saved me, or my team, quite a bit of time.

    T. chatbot booster rhetoric. So what are those examples, buddy? Very specifically? He replies:

    a friend confided he is unhoused, and it is difficult for him. I asked ChatGPT to summarize local resources to deal with this (how do you get ANY id without a valid address, etc, chicken/egg problem) and it did an outstanding, amazing job. I printed it out, marked it up, and gave it to him.

    Um hello‽ Maybe Jeff doesn’t have a spare room or room to sublet, but surely he can spare a couch or a mailbox? Let your friend use your mailing address. Store some of their stuff in your garage. To use the jargon of hackers, Jeff should be a better neighbor. The address chicken-and-egg problem is a common issue for unhoused folks, and they cannot climb back up the ladder into society without that kind of help. Jeff’s reinvented the Hulk tacos meme, but his friend can’t even eat the tacos because printer paper tastes awful.




  • I love how this particular sci-fi plot gets rewritten every few years. We ought to make it a creative-writing exercise for undergraduates. I was struck by this utterly unhinged and somewhat offensive response on the orange site which starts with the single word “stirrups” and goes places:

    Despite speaking as if he’s doing his utmost to have a love affair with the Cambridge dictionary (and sounding like a twat at the same time) he’s not wrong in so far as not giving a shit is going to screw him over when the ability to push buttons in front of a television no longer matters. What happens when the guys hanging around doing meth on the sidewalk become the engineers that end up becoming the super biologist supermen that cure cancer make us able to hear what dogs hear and see extra colors? It’s unlikely, but it’s even less likely that everyone who is a middle class engineer will be so tomorrow. There is no moat in any profession outside of entrenched wealth or guns at the moment. There just isn’t - we’re in a permanent state of future shock along with the singularity. In large part because that’s what people decided that they wanted.




  • House Democrats have dripped more details from the Epstein files, and we have surprise guests! They released an un-OCR’d PDF; I’ll transcribe the mentions of our favorite people:

    Sat[urday] Dec[ember] 6, 2014 ZORRO … Reminder: Elon Musk to island Dec[ember] 6 (is this still happening?)

    Zorro is a ranch in New Mexico that Epstein owned; Epstein was scheduled to be there from December 5-8, meaning that he and Musk would not have been at the island together. Combined with the parenthetical uncertainty over whether the visit was still happening, did Epstein perhaps want to grant Musk some plausible deniability by not being present?

    Mon[day] Nov[ember] 27, 2017 NY … 12:00pm LUNCH w/ Peter Thiel [REDACTED]

    From the rest of the schedule formatting, the redacted block following Thiel’s name is probably not a topic; it might be a name. Lunch between two rich financiers is not especially interesting but lunch between a blackmail-gathering Mossad asset and an influencer-funding accelerationist could be.

    Sat[urday] Feb[ruary] 16, 2019 NY-LSJ 7:00am BREAKFAST w/ Steve Bannon

    Well now, this is the most interesting one to me. This isn’t Epstein’s only breakfast of the day; at 9 AM he meets with Reid Weingarten, one of his attorneys, about some redacted topic. Bannon’s not exactly what I think of as a morning person or somebody who is ready to go at a moment’s notice, so what could drag him out of bed so early? (Edit: This vexed me so I looked it up and sunrise was 6:48 AM that morning at sea level. It would have been the crack of dawn!) Epstein’s Friday evening had had two haircuts, too, with plenty of redacted info; was he worried about appearing nice for Bannon? (The haircuts might not have been for Epstein, given context.) This was a busy day for Epstein; he had a redacted lunch date, and he also had somebody flying in/out that morning via JFK connecting to Saint Thomas and staying in a hotel room there. He then flew out of Newark in the evening to visit the infamous island itself, Little Saint James. The redaction doesn’t quite tell us who this guest is, but it can’t be Bannon because the Dems fucked up the redaction! I can see the edges of the descenders on the name, including a ‘g’ and ‘j’/‘q’, but Bannon’s name doesn’t have any descenders.

    Also Prince Andrew’s in there, I guess?


  • There isn’t a way to solve problems without some value judgements. As long as there are Algol descendants and a lineage of C, there will be people with more machismo than awareness of systems, and they will always be patrician and sadistic in their language-design philosophy. Even left-leaning folks like Kelley (Zig) or DeVault (Hare) are not reasonable language designers; they might not be social conservatives but they aren’t interested in advancing the art of programming. Zig’s explicitly an attempt to iterate on C and C++ without giving up their core unsafety, while Hare is explicitly trying to travel decades back in time to fit onto a 1.41MiB floppy disk.

    I’d recommend stepping outside of the Algol world for a little bit. Hare, Rust, Zig, Go, and Odin have — at least to me, and to a few other PLT folks — the same semantics; they’re all built on C++’s memory model and fully inherit its unsafety. (Yes, safe Rust is a safe subset; no, most production Rust is not safe Rust.) Instead, deliberately force yourself to use a Smalltalk, a Forth, a Lisp, an ML, or a Prolog; solve one or two problems in them over a period of about one month per language. This is the only way to understand the computer without the lens of Algol. Also, consider learning a deliberately unpleasant language like Brainfuck or Thue; an alien toy model keeps you from getting mind-locked on the industry’s concerns. If you like reading papers, I’d suggest exactly one paper to cure Algol sickness: the Galois theory of algorithms.

    Discussions on technology are excuses for dick-measuring and insulting people only to later claim that actually you are Dutch and it is in your culture to be an asshole.

    This is your call. Personally I’ve found that I can be blunt with evidence and technical claims while empathizing with the difficulty of understanding those claims, and this still allows for fruitful technical discussions. (Also, I have the free time to be vindictive, to paraphrase Yet Another Apolitical Programmer.) I’ve found that GvR (Python, Dutch) doesn’t really understand most of the criticisms I’ve brought to the table, even when I wrote them up for the Python core team, and that the design-by-committee process left multiple Python committee members with a deep contempt for anybody who actually has to use their language. I’ve also found that “Ginger” Bill (Odin, British) is completely unable to have a discussion on this basis as he is too busy negging, sapping, and otherwise playing rhetorical tricks in order to get his way. Unrelated: I also found that DeVault (American) was willing to be less of a sex pest when threatened with a ban, which is a useful trick for moderators to know; in general, being harsh-but-fair to DeVault seems to have pushed him further and further to leftism and public decency over time.

    Also, sometimes people get removed from their communities! Walter Bright (D, American) was kicked out of the wider D community for generally having shitty politics in all arenas of life; the catalyst was likely some particularly transphobic remarks made a few years ago. Similarly, if Blow’s Jai actually had anything interesting to contribute besides the soa and aos keywords then there would already be open-source knockoffs because Blow livestreams so many bigoted takes; arguably Odin is a Jai clone.





  • It’s because of research in the mid-80s leading to Moravec’s paradox — sensorimotor stuff takes more neurons than basic maths — and Sharp’s 1983 international release of the PC-1401, the first modern pocket computer, along with everybody suddenly learning about Piaget’s research with children. By the end of the 80s, AI research had accepted that the difficulty with basic arithmetic tasks must be in learning simple circuitry which expresses those tasks; actually performing the arithmetic is easy, but discovering a working circuit can’t be done without some sort of process that reduces intermediate circuits, so the effort must also be recursive in the sense that there are meta-circuits which also express those tasks. This seemed to line up with how children learn arithmetic: a child first learns to add by counting piles, then by abstracting to symbols, then by internalizing addition tables, and finally by specializing some brain structures to intuitively make leaps of addition. But sometimes these steps result in wrong intuition, and so a human-like brain-like computer will also sometimes be wrong about arithmetic too.
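
    As a toy illustration of that last step (my own sketch, not a model from any of the research above): an “adder” that has only internalized a small addition table and generalizes by analogy to the nearest memorized fact is fast and confident inside the table, and confidently wrong outside it, which is exactly the failure mode a calculator never has.

    ```python
    # Toy "intuitive" adder: drilled addition facts plus analogy to the nearest fact.
    # Illustrates learned-but-lossy arithmetic; not any published cognitive model.
    TABLE = {(a, b): a + b for a in range(10) for b in range(10)}  # drilled facts

    def intuitive_add(a, b):
        if (a, b) in TABLE:                 # recall a drilled fact exactly
            return TABLE[(a, b)]
        # Generalize by analogy to the closest memorized problem.
        nearest = min(TABLE, key=lambda k: abs(k[0] - a) + abs(k[1] - b))
        return TABLE[nearest]               # confident, and sometimes wrong

    print(intuitive_add(3, 4))    # 7  (inside the table)
    print(intuitive_add(12, 7))   # 16 (analogy lands on 9 + 7; the true answer is 19)
    ```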

    As usual, this is unproblematic when applied to understanding humans or computation, but not a reasonable basis for designing a product. Who would pay for wrong arithmetic when they could pay for a Sharp or Casio instead?

    Bonus: Everybody in the industry knew how many transistors were in Casio and Sharp’s products. Moravec’s paradox can be numerically estimated, and Moore’s law gives an estimate for how many transistors can be fit onto a chip; put the two together and you get a date for when sensorimotor competence becomes affordable. This is why so much sci-fi of the 80s and 90s suggests that we will have a robotics breakthrough around 2020. We didn’t actually get the breakthrough IMO; Moravec’s paradox is mostly about kinematics and moving a robot around in the world, and we are still using the same kinematic paradigms from the 80s. But this is why bros think that scaling is so important.
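
    For the flavour of that estimate, here is a back-of-envelope sketch with constants I picked purely for illustration (they are not Moravec’s published figures, and the comment above doesn’t commit to any of them): assume a compute budget sufficient for sensorimotor competence, a late-80s starting point, and a fixed doubling time, then read off the year. Nudging any constant moves the date by years, which is roughly why the sci-fi predictions cluster loosely rather than precisely.

    ```python
    # Back-of-envelope Moore's-law extrapolation.
    # Every constant below is an illustrative assumption, not a sourced figure.
    import math

    start_year = 1988        # assumed baseline year
    start_mips = 1.0         # assumed: a late-80s desktop, ~1 MIPS
    target_mips = 1.0e8      # assumed budget for sensorimotor competence
    doubling_years = 1.2     # assumed doubling time for price-performance

    doublings = math.log2(target_mips / start_mips)
    print(round(start_year + doublings * doubling_years))   # -> 2020, give or take
    ```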


  • Wolfram has a blog post about lambda calculus. As usual, there are no citations and the bibliography is for the wrong blog post and missing many important foundational papers. There are no new results in this blog post (and IMO barely anything interesting) and it’s mostly accurate, so it’s okay to share the pretty pictures with friends as long as the reader keeps in mind that the author is writing to glorify themselves and make drawings rather than to communicate the essential facts or conduct peer review. I will award partial credit for citing John Tromp’s effort in defining these diagrams, although Wolfram ignores that Tromp and an entire community of online enthusiasts have been studying them for decades. But yeah, it’s a Mathematica ad.

    In which I am pedantic about computer science (but also where I'm putting most of my sneers too, including a punchline)

    For example, Wolfram’s wrong that every closed lambda term corresponds to a combinator; it’s a reasonable assumption that turns out not to make sense upon closer inspection. It’s okay, because I know that he was just quoting the same 1992 paper by Fokker that I cited when writing the esolangs page for closed lambda terms, which has the same incorrect claim verbatim as its first sentence. Also, credit to Wolfram for listing Fokker in the bibliography; this is one of the foundational papers that we’d expect to see. With that in mind, here are some differences between my article and his.

    The name “Fokker” appears over a dozen times in my article and nowhere in Wolfram’s article. Also, I love being citogenic and my article is the origin of the phrase “Fokker size”. I think that this is a big miss on his part because he can’t envision a future where somebody says something like “The Fokker metric space” or “enriched over Fokker size”. I’ve already written “some closed lambda terms with small Fokker size” in the public domain and it’s only a matter of time until Zipf’s law wears it down to “some small Fokkers”.

    Also, while “Tromp” only appears once in my article, it appears next to somebody known only as “mtve” when they collaborated to produce what Wolfram calls a “size-7 lambda” known as Alpha. I love little results like these which aren’t formally published and only exist on community wikis. Would have been pretty fascinating if Alpha were complete, wouldn’t it Steve!? Would have merited a mention of progress in the community amongst small lambda terms, huh Steve!?

    I also checked the BB Gauge for Binary Lambda Calculus (BLC), since it’s one of the topics I already wrote up, and found that Wolfram’s completely omitted Felgenhauer from the picture too; that name appears in neither the text nor the bibliography. Felgenhauer’s made about as many constructions in BLC as Tromp; Felgenhauer 2014 constructs a Goodstein sequence, for example. Also, Wolfram didn’t write that sequence himself; he sourced it from a living paper not in the bibliography, written by…Felgenhauer! So it’s yet another case of Wolfram just handily choosing to omit a name from a decade-old result in the hopes that somebody will prefer his new presentation to the old one.
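
    For readers who haven’t met BLC: Tromp’s encoding turns every De Bruijn-indexed lambda term into a bit string, and “size” in this setting is just the length of that string. A minimal sketch of the standard encoding (the encoding is Tromp’s; the code and the example terms are mine):

    ```python
    # Tromp's Binary Lambda Calculus encoding of De Bruijn terms.
    #   ("lam", body) -> 00 <body>
    #   ("app", f, x) -> 01 <f> <x>
    #   ("var", i)    -> i ones then a zero, for the 1-based De Bruijn index i
    def blc(term):
        tag = term[0]
        if tag == "lam":
            return "00" + blc(term[1])
        if tag == "app":
            return "01" + blc(term[1]) + blc(term[2])
        if tag == "var":
            return "1" * term[1] + "0"
        raise ValueError(term)

    identity = ("lam", ("var", 1))                        # λx.x
    self_app = ("lam", ("app", ("var", 1), ("var", 1)))   # λx.x x
    print(blc(identity), len(blc(identity)))    # 0010 4
    print(blc(self_app), len(blc(self_app)))    # 00011010 8
    ```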

    Finally, what’s the point of all this? I think Wolfram writes these posts to advertise Mathematica (which is actually called Wolfram Mathematica and uses a programming language called Wolfram, BuT DiD YoU KnOw). He also promotes his attempt at rewriting all of physics to have his logo upon it, and this blog post is a gateway to that project in the sense that Wolfram genuinely believes that staring at these chaotic geometries will reveal the equations of divine nature. Meanwhile, I wrote my article in order to win an IRC argument against, er, make a reasonable presentation of an interesting phenomenon in computer science directly to Felgenhauer & Tromp, and while they don’t fully agree with me, we together can’t disagree with what’s presented in the article. That’s peer review, right?