a mockup of what "trustworthy LM search" could look like [OC, brainmade]

maria [she/her]@lemmy.blahaj.zone · 15 hours ago


hendrik@palaver.p3x.de · edit-2 10 hours ago

Isn’t that kind of what some commercial tools do? Famously Perplexity AI, or maybe some coding agents when they do a code review?

I like the UI mockup.

Womble@piefed.world · 13 hours ago

There’s this which seems like an implementation of what you want with other features included (like web search and calculator tools as well as persistent memory). I’ve been meaning to check it out for a while but haven’t got round to it.

hendrik@palaver.p3x.de · 10 hours ago

I wonder if there’s anything to check out… I don’t see any obvious way to run it. And the code looks like it’s a first draft to add some real code later?!

Womble@piefed.world · 3 hours ago

Huh, yeah you’re right. I could have sworn it was functional when I looked at it before…

maria [she/her]@lemmy.blahaj.zone · 13 hours ago

im not really looking for something like this, im just disappointed to see that todays chatbots with web search still just make u

either trust the LM with its sources
or read thru every source completely to actually trust it

even tho… it feels so obvious to just show a preview of the exact text that backs up the answer, since that is the most immediately trustworthy thing.

e0qdk@reddthat.com · 15 hours ago

I implemented a system for exploring my own source code via tools a while back. I have find_files, read_source_code and a few others that allow putting in a project name and/or filename as parameters (restricted heavily based on permissions I’ve set in my custom harness). It’s been pretty good at following tasks like “Read the source code in the foo project and update the documentation in such-and-such.md” – which I have mermaid.js sequence diagrams embedded into. (I don’t give it direct file write access; it just gives me output in my chat client and I copy over what it spits out and diff against what was in git then tweak if needed.)

Quoting directly from the code works well with no particular special effort. It is absolutely terrible at giving line numbers though (hallucinates everything when it tries to do that). I have a few ideas on how I might be able to improve that – the most straightforward is to just inject comments with the line number into the return from the tool call (so that it can quote the number instead of trying to estimate position). If that’s not good enough, I’ve also got an AST-based source code reader (only for JS and Python though) that can return line numbers. It was intended for skeletonizing code so that I could throw larger files at an LLM without it having to read the entire thing and then just pull chunks out with read_source_code based on line number ranges – but the LLM hasn’t been particularly effective at making good use of that capability. Maybe that concept could be repurposed for quoting code to the user though if the simpler approaches aren’t good enough… 🤔️

TL;DR: This is relevant to my interests and I might build my own too!
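For anyone curious what an AST-based skeleton with line numbers might look like, here is a minimal sketch for Python sources only, using the standard `ast` module. The function name `skeletonize` and the `start-end: label` output format are made up for illustration; this is not the reader described above, just one way the idea could work.

```python
import ast


def skeletonize(source: str) -> list[str]:
    """Return one 'start-end: kind name' entry per class or function.

    Hypothetical helper: the name and output format are assumptions.
    """
    tree = ast.parse(source)
    entries = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            kind = "class" if isinstance(node, ast.ClassDef) else "def"
            # lineno/end_lineno are 1-based and come from the parser,
            # so the model never has to estimate a position.
            entries.append((node.lineno, node.end_lineno, f"{kind} {node.name}"))
    entries.sort()  # ast.walk is breadth-first; sort back into file order
    return [f"{start}-{end}: {label}" for start, end, label in entries]


if __name__ == "__main__":
    sample = "class A:\n    def f(self):\n        return 1\n\ndef g():\n    pass\n"
    print("\n".join(skeletonize(sample)))
```

The line ranges in the skeleton could then be passed straight to a range-based reader, so any number the model quotes is one it was handed rather than one it guessed.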

Onno (VK6FLAB)@lemmy.radio · 15 hours ago

I’m not sure if I’m missing something, but can this not be solved with one Linux command?

grep

The command has a -n option to output line numbers and -C x to provide x lines of context.

There’s no extra software required, doesn’t need an LLM, doesn’t hallucinate, just a plain search.

e0qdk@reddthat.com · 14 hours ago

What the LLM can do that grep can’t is that it can find things by imperfect description. You need to know a text string that’s exactly in the file to get grep/ack/etc. to locate it; you can be vague with an LLM and it may still be able to figure it out. It’s the difference between searching for FooBarFactory already knowing the exact name and trying to find the file it’s in (where grep, etc. are great) and “find the code that instantiates FooBar objects in the foo project and tell me what it’s called” when you don’t know if it was FooBarManager or FooBarFactory or it’s actually a function called make_foo_bar() instead of a factory class or there are actually three different ways to do it because of legacy code.

Onno (VK6FLAB)@lemmy.radio · 14 hours ago

So, a fuzzy search then?

There’s several command line tools for that too.

e0qdk@reddthat.com · 13 hours ago

More like a conceptual search. e.g. I’ve used my source code explorer to get a survey of how the template handling works in llama.cpp since the sample chat code doesn’t apply the same logic that llama-cli actually does.

maria [she/her]@lemmy.blahaj.zone · 14 hours ago

yea no ur right, grep works great for doin codebase search. dunno what these other peeps are going on about, for the case of file search, grep rules.

maria [she/her]@lemmy.blahaj.zone · 15 hours ago

i know that generally, quoting stuff from files works well. the point here is less about being useful but more about being 100% verifiable.

indexing things with line numbers absolutely works, but u gotta actually put line numbers in the tool output. Meaning, the read_source_code tool should return the line number at the start of each line, e.g.

1:extends Control
2:
3:# comment here
4:and so on

it eats up tokens, but it makes the line numbers the LM cites almost always correct.
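A minimal sketch of that numbering step in Python, producing the same `N:text` format as the example above. The helper name `number_lines` and the `start` parameter are assumptions for illustration, not part of anyone's actual tool.

```python
def number_lines(text: str, start: int = 1) -> str:
    """Prefix every line with its 1-based line number and a colon.

    Hypothetical helper: a tool like read_source_code could run its
    output through this before handing it to the model.
    """
    return "\n".join(
        f"{number}:{line}"
        for number, line in enumerate(text.splitlines(), start=start)
    )


if __name__ == "__main__":
    print(number_lines("extends Control\n\n# comment here\nand so on"))
```

Passing `start` lets the same helper number a chunk pulled from the middle of a file, so the prefixes still match the real file positions.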

e0qdk@reddthat.com · 15 hours ago

Are you thinking of making something like a quote_snippet tool that you give a file and line range to and have it (deterministically) present that to the user as part of the response?

maria [she/her]@lemmy.blahaj.zone · 14 hours ago

yyyyes exactly.
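A rough sketch of what such a quote_snippet tool could look like in Python. The signature, the header format, and the error handling are all assumptions; the one thing it tries to show is that the quoted text is read from disk by the harness, so the model only chooses the file and the range and cannot alter the words.

```python
from pathlib import Path


def quote_snippet(path: str, first: int, last: int) -> str:
    """Return lines first..last (1-based, inclusive) of a file, verbatim.

    Hypothetical tool: the harness would show this string to the user
    directly instead of letting the model retype the quote.
    """
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    if not (1 <= first <= last <= len(lines)):
        # Refuse bad ranges instead of guessing, so a hallucinated
        # line number shows up as an error rather than a wrong quote.
        raise ValueError(f"line range {first}-{last} is outside 1-{len(lines)}")
    body = "\n".join(f"{n}:{lines[n - 1]}" for n in range(first, last + 1))
    return f"{path} lines {first}-{last}\n{body}"
```

A call such as `quote_snippet("main.gd", 3, 4)` (the file name is only an example) would return a small header followed by the numbered lines, which the chat client could render as the preview.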


Explanation Time! ⏱