i only really made this so i can link to it from my comment on this post. whatever-


Explanation Time!

the idea here is that those “code blocks” aren’t regular code blocks, but a special syntax which the LM writes so that the UI can present them as verifiable “hyperlinks” that show the exact text from the actual source.

so here, the LM specified exactly which lines it wants to highlight.

meaning: its not hallucinating, and if it is, you notice it because the highlighting is wrong or doesnt match at all.

we essentially use the LM as a “highlighter” rather than a regurgitator, making mistakes obvious and correct answers immediately verifiably correct, cuz u can see the source.
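a rough sketch of how a UI could check such a “highlight”. everything here is made up for illustration (the `Citation` fields, the `resolve` helper, the sample source), its not what any real chatbot does. the LM only emits line numbers plus the text it expects there, and the UI pulls the real text from the source and compares:

```python
from dataclasses import dataclass


@dataclass
class Citation:
    """A hypothetical citation the LM emits instead of retyping the source."""
    start_line: int  # 1-indexed, inclusive
    end_line: int    # 1-indexed, inclusive
    quote: str       # text the LM claims those lines contain


def resolve(source: str, cite: Citation) -> tuple[str, bool]:
    """Return the real text at the cited lines and whether it matches the quote."""
    lines = source.splitlines()
    if not (1 <= cite.start_line <= cite.end_line <= len(lines)):
        return "", False
    actual = "\n".join(lines[cite.start_line - 1:cite.end_line])
    # Compare with whitespace normalized so indentation differences don't count.
    matches = " ".join(actual.split()) == " ".join(cite.quote.split())
    return actual, matches


source = "extends Control\n\n# comment here\nfunc _ready():\n    pass\n"
good = Citation(start_line=4, end_line=5, quote="func _ready():\n    pass")
bad = Citation(start_line=1, end_line=1, quote="extends Node2D")

print(resolve(source, good))
print(resolve(source, bad))
```

the UI always shows the text it pulled from the source itself, so a wrong citation shows up as a mismatch instead of a convincing fake quote.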

explanation done-


i like mockups. and godot. so here we are.

this uses the solarized theme which looks somewhat close to the claude theme they use. somewhat close.

whatever something something ai bad or whatever, is this what u need to hear? sigh

i hope u have a nice day <3

this is very much a post i first posted on the Qwen community but then i decided that this stuff doesnt belong on blahaj zone and moved it here… oh well.

  • hendrik@palaver.p3x.de
    10 hours ago

    Isn’t that kind of what some commercial tools do? Famously Perplexity AI, or maybe some coding agents when they do a code review?

    I like the UI mockup.

  • Womble@piefed.world
    13 hours ago

    There’s this which seems like an implementation of what you want with other features included (like web search and calculator tools as well as persistent memory). I’ve been meaning to check it out for a while but haven’t got round to it.

    • hendrik@palaver.p3x.de
      10 hours ago

      I wonder if there’s anything to check out… I don’t see any obvious way to run it. And the code looks like it’s a first draft to add some real code later?!

      • Womble@piefed.world
        3 hours ago

        Huh, yeah you’re right. I could have sworn it was functional when I looked at it before…

    • maria [she/her]@lemmy.blahaj.zoneOP
      13 hours ago

      im not really looking for something like this, im just disappointed to see that todays chatbots with web search still just make u

      • either trust the LM with its sources
      • or read thru every source completely to actually trust it

      even tho… it feels so obvious to just show a preview of the exact text that backs up the answer, since thats the most immediately trustworthy thing.

  • e0qdk@reddthat.com
    15 hours ago

    I implemented a system for exploring my own source code via tools a while back. I have find_files, read_source_code and a few others that allow putting in a project name and/or filename as parameters (restricted heavily based on permissions I’ve set in my custom harness). It’s been pretty good at following tasks like “Read the source code in the foo project and update the documentation in such-and-such.md” – which I have mermaid.js sequence diagrams embedded into. (I don’t give it direct file write access; it just gives me output in my chat client, and I copy over what it spits out, diff against what was in git, then tweak if needed.)

    Quoting directly from the code works well with no particular special effort. It is absolutely terrible at giving line numbers though (hallucinates everything when it tries to do that). I have a few ideas on how I might be able to improve that – the most straightforward is to just inject comments with the line number into the return from the tool call (so that it can quote the number instead of trying to estimate position). If that’s not good enough, I’ve also got an AST-based source code reader (only for JS and Python though) that can return line numbers. It was intended for skeletonizing code so that I could throw larger files at an LLM without it having to read the entire thing and then just pull chunks out with read_source_code based on line number ranges – but the LLM hasn’t been particularly effective at making good use of that capability. Maybe that concept could be repurposed for quoting code to the user though if the simpler approaches aren’t good enough… 🤔️

    TL;DR: This is relevant to my interests and I might build my own too!
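For anyone curious what an AST-based skeleton with line numbers might look like, here is a minimal sketch using Python's standard `ast` module. It is an assumption about the general approach, not the reader described above: the `skeleton` function and the sample classes are hypothetical, and it only handles Python source.

```python
import ast


def skeleton(source: str) -> list[tuple[str, str, int, int]]:
    """List (kind, name, first_line, last_line) for every class and function."""
    entries = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            kind = "class"
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            kind = "def"
        else:
            continue
        # lineno/end_lineno are 1-indexed and come from the parser, not the LLM.
        entries.append((kind, node.name, node.lineno, node.end_lineno))
    return sorted(entries, key=lambda e: e[2])


example = '''class FooBarFactory:
    def make(self):
        return FooBar()


def make_foo_bar():
    return FooBarFactory().make()
'''

for kind, name, first, last in skeleton(example):
    print(f"{kind} {name}: lines {first}-{last}")
```

Because the ranges come straight from the parser, a tool could hand them to the model as-is, and the model would only have to pick an entry rather than estimate a position.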

    • Onno (VK6FLAB)@lemmy.radio
      15 hours ago

      I’m not sure if I’m missing something, but can this not be solved with one Linux command?

      grep

      The command has a -n option to output line numbers and -C x to provide x lines of context.

      There’s no extra software required, doesn’t need an LLM, doesn’t hallucinate, just a plain search.

      • e0qdk@reddthat.com
        14 hours ago

        What the LLM can do that grep can’t is find things by imperfect description. You need to know a text string that’s exactly in the file to get grep/ack/etc. to locate it; you can be vague with an LLM and it may still be able to figure it out. It’s the difference between searching for FooBarFactory already knowing the exact name and trying to find the file it’s in (where grep, etc. are great) and “find the code that instantiates FooBar objects in foo project and tell me what it’s called” when you don’t know if it was FooBarManager or FooBarFactory or it’s actually a function called make_foo_bar() instead of a factory class or there are actually three different ways to do it because of legacy code.

          • e0qdk@reddthat.com
            13 hours ago

            More like a conceptual search. e.g. I’ve used my source code explorer to get a survey of how the template handling works in llama.cpp since the sample chat code doesn’t apply the same logic that llama-cli actually does.

          • maria [she/her]@lemmy.blahaj.zoneOP
            14 hours ago

            yea no ur right, grep works great for doin codebase search. dunno what this other peeps is going on about, for the case of file search, grep rules.

    • maria [she/her]@lemmy.blahaj.zoneOP
      15 hours ago

      i know that generally, quoting stuff from files works well. the point here is less about being useful but more about being 100% verifiable.

      indexing things with line numbers absolutely works, but u gotta actually put line numbers in the tool output. meaning, the read_source_code tool should return the line number at the start of each line, e.g.

      1:extends Control
      2:
      3:# comment here
      4:and so on
      

      it eats up tokens, but it means the line numbers the LM gives are almost always correct.
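a tiny sketch of that tool output, assuming a python harness. the `number_lines` helper and this `read_source_code` body are hypothetical, only the `N:` prefix format comes from the example above:

```python
def number_lines(text: str) -> str:
    """Prefix every line with its 1-indexed number, e.g. '3:# comment here'."""
    return "\n".join(
        f"{i}:{line}" for i, line in enumerate(text.splitlines(), start=1)
    )


def read_source_code(path: str) -> str:
    """Hypothetical tool body: read a file and return it with line numbers."""
    with open(path, encoding="utf-8") as f:
        return number_lines(f.read())


print(number_lines("extends Control\n\n# comment here\nand so on"))
```

the LM then just copies the number it sees next to the line instead of counting lines itself.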