Want to wade into the snowy surf of the abyss? Have a sneer percolating in your system but not enough time/energy to make a whole post about it? Go forth and be mid: Welcome to the Stubsack, your first port of call for learning fresh Awful you’ll near-instantly regret.

Any awful.systems sub may be subsneered in this subthread, techtakes or no.

If your sneer seems higher quality than you thought, feel free to cut’n’paste it into its own post — there’s no quota for posting and the bar really isn’t that high.

The post-Xitter web has spawned soo many “esoteric” right-wing freaks, but there’s no appropriate sneer-space for them. I’m talking redscare-ish, reality-challenged “culture critics” who write about everything but understand nothing. I’m talking about reply-guys who make the same 6 tweets about the same 3 subjects. They’re inescapable at this point, yet I don’t see them mocked (as much as they should be).

Like, there was one dude a while back who insisted that women couldn’t be surgeons because they didn’t believe in the moon or in stars? I think each and every one of these guys is uniquely fucked up and if I can’t escape them, I would love to sneer at them.

(Credit and/or blame to David Gerard for starting this. Merry Christmas, happy Hanukkah, and happy holidays in general!)

  • lagrangeinterpolator@awful.systems
    2 days ago

    AI researchers are rapidly embracing AI reviews, with the new Stanford Agentic Reviewer. Surely nothing could possibly go wrong!

    Here’s the “tech overview” for their website.

    Our agentic reviewer provides rapid feedback to researchers on their work to help them to rapidly iterate and improve their research.

    The inspiration for this project was a conversation that one of us had with a student (not from Stanford) that had their research paper rejected 6 times over 3 years. They got a round of feedback roughly every 6 months from the peer review process, and this commentary formed the basis for their next round of revisions. The 6 month iteration cycle was painfully slow, and the noisy reviews — which were more focused on judging a paper’s worth than providing constructive feedback — gave only a weak signal for where to go next.

    How is it that when people try to argue for the magical benefits of AI on a task, it always comes down to arguing “well actually, humans suck at the task too! Look, humans make mistakes!” That seems to be the only way they can justify the fact that AI sucks. At least it spews garbage fast!

    (Also, this is a little mean, but if someone’s paper got rejected 6 times in a row, perhaps it’s time to throw in the towel, accept that the project was never that good in the first place, and try better ideas. Not every idea works out, especially in research.)

    When modified to output a 1-10 score by training to mimic ICLR 2025 reviews (which are public), we found that the Spearman correlation (higher is better) between one human reviewer and another is 0.41, whereas the correlation between AI and one human reviewer is 0.42. This suggests the agentic reviewer is approaching human-level performance.

    Actually, all my concerns are now completely gone. They found that one number is bigger than another number, so I take back all of my counterarguments. I now have full faith that this is going to work out.

    Reviews are AI generated, and may contain errors.

    We had built this for researchers seeking feedback on their work. If you are a reviewer for a conference, we discourage using this in any way that violates the policies of that conference.

    Of course, we need the mandatory disclaimers that will definitely be enforced. No reviewer will ever be a lazy bum and use this AI for their actual conference reviews.

    • V0ldek@awful.systems
      11 hours ago

      we found that the Spearman correlation (higher is better) between one human reviewer and another is 0.41

      This stinks to high heaven. Why would you want these to be more highly correlated? There’s a reason you assign multiple reviewers, preferably with slightly different backgrounds, to a single paper. Reviews are obviously subjective! There’s going to be some consensus (especially with very bad papers; really bad papers are almost always reviewed poorly across the board, because you know, they suck), but whether a particular reviewer likes what you did and how you presented it is a bit of a lottery.

      Also, the worth of a review is much more than a 1-10 score: it should contain detailed justification for the reviewer’s decision so that a meta-reviewer can then look and pinpoint relevant feedback, or even decide that a low-scoring paper is worthwhile and can be published after small changes. All of this is an abstraction, of course a slightly flawed one, but of humans talking to each other. Show your paper to 3 people and you’ll get 4 different impressions. This is not a bug!
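
      For reference, the Spearman correlation in the quoted claim is just the Pearson correlation computed on ranks instead of raw scores, so it only measures whether two reviewers order papers similarly. Below is a minimal sketch of that calculation in plain Python. The function names and the reviewer scores are hypothetical and invented for illustration; none of this is taken from the Stanford project or from ICLR data.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start+1..end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical 1-10 scores from two reviewers on the same six papers.
reviewer_a = [3, 5, 6, 8, 4, 7]
reviewer_b = [5, 3, 8, 6, 6, 5]
print(round(spearman(reviewer_a, reviewer_b), 2))
```

      Note what the statistic leaves out: it sees only the ordering of the scores, not the written justification, and two reviewers can disagree on half the papers for perfectly good reasons while still landing somewhere around the middle of the scale.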

    • JFranek@awful.systems
      2 days ago

      Problem: Reviewers do not provide constructive criticism or at least reasons for a paper to be rejected. Solution: Fake it with a clanker.

      Genius.

    • blakestacey@awful.systems
      2 days ago

      the noisy reviews — which were more focused on judging a paper’s worth than providing constructive feedback

      dafuq?

      • lagrangeinterpolator@awful.systems
        edit-2
        1 day ago

        Yeah, it’s not like reviewers can just write “This paper is utter trash. Score: 2” unless ML is somehow an even worse field than I previously thought.

        They referenced someone who had a paper get rejected from conferences six times, which to me is an indication that their idea just isn’t that good. I don’t mean this as a personal attack; everyone has bad ideas. It’s just that at some point, you just have to cut your losses with a bad idea and instead use your time to develop better ideas.

        So I am suspicious that when they say “constructive feedback”, they don’t mean “how do I make this idea good” but instead “what are the magic words that will get my paper accepted into a conference”. ML has become a cutthroat publish-or-perish field, after all. It certainly won’t help that LLMs are effectively trained to glaze the user at all times.

      • scruiser@awful.systems
        2 days ago

        Going from lazy, sloppy human reviews to absolutely no humans is still a step down. LLMs don’t have the capability to generalize outside the (admittedly enormous) training dataset they have, so cutting edge research is one of the worse use cases for them.

        • IndustryStandard@lemmy.world
          2 days ago

          An LLM is better than literally nothing. There have been scandals of papers at conferences being basically copies of previous papers, and that was only caught because some random person online read the papers.

          Nobody is reading papers. Universities are a clout machine.

          • V0ldek@awful.systems
            5 hours ago

            Nobody is reading papers. Universities are a clout machine.

            Sokal, you should log off

            • blakestacey@awful.systems
              4 hours ago

              Funny story: Just yesterday, I wrote to a journal editor pointing out that a term coined in a paper they had just printed had actually been used with the same meaning 20 years ago. They wrote back to say that I was the second person to point this out and that an erratum would be issued.

          • Seminar2250@awful.systems
            edit-2
            1 day ago

            Alice: what is 2 + 2?

            LLM: random.random() + random.random()

            Alice: 1.2199404515268157 is better than nothing, i guess

          • self@awful.systems
            1 day ago

            you’ve got so much in common with an LLM, since you also seem to be spewing absolute bullshit to an audience that doesn’t like you

          • scruiser@awful.systems
            edit-2
            1 day ago

            What value are you imagining the LLM providing or adding? They don’t have a rich internal model of the scientific field to provide an evaluation of novelty or contribution to the field. They could maybe spot some spelling or grammar errors, but so can more reliable algorithms. I don’t think they could accurately spot if a paper is basically a copy or redundant, even if given RAG on all the past papers submitted to the conference. A paper carefully building on a previous paper vs. a paper blindly copying a previous paper would look about the same to an LLM.

          • swlabr@awful.systems
            1 day ago

            Your premise is total bullshit. That being said, I’d prefer a world where nobody reads papers and journals stop existing to a world where we are boiling the oceans to rubber-stamp papers.

        • swlabr@awful.systems
          2 days ago

          me, thinking it’s a waste to not smoke indoors because my landlord won’t fix the CO detectors: oh

          (jk I don’t smoke)