[deleted by user]
I’m no dev so I don’t understand all the technicalities, but if I got it right, you made it so the AI itself shows how confident it is in its own answers? That is neat.
Not sure I understand the downvotes? Isn’t it a good idea to make it harder for AI to tell bullshit without blushing?
[deleted by user]
I think it’s interesting? It’s kind of hard to tell.
You are going to have to significantly tone down the editorialization and platitudes to get this to a place where a journal might consider it.
Make the point of how it’s novel or useful by explaining what it does, not by repeating that it’s novel and useful.
[deleted by user]
The description has such an unsettling, overconfident, LLM-style tone for a project described as something to challenge LLM hallucinations.
[deleted by user]
Good for you; welcome to the internet, where people’s opinions abound. I didn’t accuse you of writing it with an LLM, I said it was LLM style. If you don’t like my opinion, that’s fine with me. I simply found the writing style unsettling. Cheers!
[deleted by user]
LLMs were created by reading millions of *social media posts written by neurodivergent people sharing their passions online.
*edit: spelling
[deleted by user]
Although I’m opposed to AI in general and LLMs in particular, this project seems really cool. Might actually change my stance on LLM usage. Kudos, and I hope this gets more attention and development!
[deleted by user]
I have trouble understanding what makes it list “Context” as its source as opposed to “Model”, and how that makes it any more deterministic. Can you give a more detailed example?
[deleted by user]
I was like, why aren’t you publishing it to a conference/journal if it is good? Then I realized that you are doing exactly that. Kudos for the work, looking forward to the progress!
[deleted by user]
As a physicist, my favorite referee comment ever was [That my claim was wrong] “should be obvious to anyone who has ever sat through an elementary electromagnetism course.” He was wrong BTW, and the paper was finally published in a different journal.
I am from a decidedly different field, so I don’t know if I can vouch for you in any meaningful way.
[deleted by user]
Thanks for sharing. I’ve not yet delved into reading it in depth but appreciate your goals and the fact that you documented it all.
[deleted by user]
So I was curious about how you accomplished this and took a look with the robots to figure it out.
TL;DR: the router is a massive decision tree using heuristics and regex to avoid LLM calls on unprefixed prompts.
I think this is an interesting, brute force approach to the problem, but one that will always struggle with edge cases. The other bit it will struggle with is transparency. Yes, it might be deterministic because it is a decision tree, but unless you really understand how that decision tree works under the hood and know where the pitfalls are, you’re going to end up talking to the LLM a lot of the time anyhow.
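To make that concrete, the routing is conceptually something like this (hypothetical sketch; the patterns and handler names are mine, the real tree is far bigger):

```python
import re

# Hypothetical sketch of the routing idea: cheap deterministic checks first,
# fall through to the LLM only when nothing matches. Patterns are made up.
RULES = [
    (re.compile(r"^\s*(what is|define)\b", re.I), "glossary_lookup"),
    (re.compile(r"\b\d+\s*[-+*/]\s*\d+\b"), "calculator"),
    (re.compile(r"^\s*summari[sz]e\b", re.I), "summarizer"),
]

def route(prompt: str) -> str:
    """Return the name of a deterministic handler, or 'llm' as the fallback."""
    for pattern, handler in RULES:
        if pattern.search(prompt):
            return handler
    return "llm"  # the branch you end up on a lot of the time anyhow
```

Every pattern you add fixes one case and quietly shifts the behaviour of others, which is where the transparency problem bites.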
Something you might want to consider is doing a fine-tune of a smol model (think something like qwen3:1.7B or even smaller, like one of the gemma3n sub-1B) that will do the routing for you. You can easily build the dataset synthetically or harvest your own logs. I think this might end up covering more edge cases more smoothly without resorting to a big call to a larger model.
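The dataset for that fine-tune can literally just be (prompt, label) pairs. A sketch of bootstrapping it synthetically (templates and route names are made up; seed them from your own logs):

```python
import json
import random

# Sketch of synthetic dataset generation for a routing fine-tune.
# Templates and route names are made up; harvest real ones from your logs.
TEMPLATES = {
    "calculator": ["what is {a} + {b}?", "compute {a} * {b} for me"],
    "summarizer": ["summarize this: {text}", "tl;dr: {text}"],
    "llm":        ["why do you think {text}?", "write a poem about {text}"],
}

def make_dataset(n: int = 1000) -> list[dict]:
    rows = []
    for _ in range(n):
        label = random.choice(list(TEMPLATES))
        template = random.choice(TEMPLATES[label])
        prompt = template.format(a=random.randint(1, 99),
                                 b=random.randint(1, 99),
                                 text="<paste harvested log text here>")
        rows.append({"prompt": prompt, "label": label})
    return rows

with open("routing_dataset.jsonl", "w") as f:
    for row in make_dataset():
        f.write(json.dumps(row) + "\n")
```

Then the smol model just trains as a plain classifier over that file.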
[deleted by user]
Cool man. It is really refreshing to see this level of engagement. You’ve really thought this through. You’re right about the routing model moving it up a level and also about retraining. It’s all trade-offs.
Are you intending this for others to use or is this really just for you? Because I think what you’re slowly building is a power tool with a whack-a-mole set of routing tweaks specifically for you. Nothing wrong with that, but the barrier to entry for others to use this is reading that routing and understanding the foibles that have been baked in with your preferences in mind, and even adding fixes and tweaks of their own which kinda breaks the magic a little.
This was really the point I was making about transparency.
I appreciate others also doing real work with potato GPUs because I, too, have a potato GPU (6GB). I think there is real utility in continuing to develop this.
I’ll give this a star and follow along. It doesn’t really fit my mental model of how I’d like my harness to behave, but I will totally steal some of these ideas.
[deleted by user]
So uh… what does it do? Summarize short articles?
[deleted by user]
Looks interesting. Will give it a whirl on my home server.
In this article, they talk about bringing up a local RAG system to let people run an LLM off a large document corpus: https://en.andros.dev/blog/aa31d744/from-zero-to-a-rag-system-successes-and-failures/
Wonder if this, connected to something like that and wrapped in an easy, end-user-friendly script or UI, could be a good combination for a local, domain-specific, grounded knowledge base?
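Very roughly what I’m picturing, assuming something like chromadb for retrieval and ollama for the model (collection and model names are placeholders, not that article’s actual stack):

```python
import chromadb
import ollama

# Rough sketch of the combination: retrieve from a local corpus, then hand
# the hits to a local model as context. Names here are placeholders.
client = chromadb.Client()
collection = client.get_or_create_collection("knowledge_base")

def grounded_answer(question: str) -> str:
    hits = collection.query(query_texts=[question], n_results=3)
    context = "\n\n".join(hits["documents"][0])
    prompt = ("Answer ONLY from the context below. If the answer isn't there, say so.\n\n"
              f"Context:\n{context}\n\nQuestion: {question}")
    reply = ollama.chat(model="llama3", messages=[{"role": "user", "content": prompt}])
    return reply["message"]["content"]
```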
[deleted by user]
The problem with CAG is not just that it hogs memory, but that to keep it fresh you have to keep re-indexing. If the corpus is large and dynamic, it can easily fall out of date and, at runtime, blow out the context window.
GraphRAG has some promise. NVIDIA has a playbook for converting text into a knowledge graph: https://build.nvidia.com/spark/txt2kg
It’ll probably have the same issues with reindexing, but that will be a common problem until someone comes up with better incremental training/indexing.
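In the meantime you can at least cheapen the re-index by only touching what changed. A sketch of hash-based incremental indexing (the manifest layout is made up):

```python
import hashlib
import json
from pathlib import Path

# Poor-man's incremental indexing sketch: hash every document and only
# re-embed / re-graph the ones whose hash changed since the last run.
MANIFEST = Path("index_manifest.json")

def changed_docs(corpus_dir: str) -> list[Path]:
    old = json.loads(MANIFEST.read_text()) if MANIFEST.exists() else {}
    new, dirty = {}, []
    for doc in sorted(Path(corpus_dir).glob("**/*.txt")):
        digest = hashlib.sha256(doc.read_bytes()).hexdigest()
        new[str(doc)] = digest
        if old.get(str(doc)) != digest:
            dirty.append(doc)  # only these need re-indexing
    MANIFEST.write_text(json.dumps(new))
    return dirty
```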
Can’t it source other LLM outputs as a “verified source” and thus still say whatever sounds good, like any LLM? Providing “technical” verification, e.g. a SHA hash, gives no assurance that the content itself is from a reputable source. I don’t think adding confidence and sourcing changes anything; the user STILL has to verify that whatever is provided is coherent and that any third party is actually a good source. Thanks for making the process public though, doing better than OpenAI does.
[deleted by user]
Isn’t “source: model” basically roulette? We’re back to the initial problem. Also, anything that is not “model” might also be hallucinated if, at any point, the string that gives back “source:” goes through the model.
[deleted by user]
Fair, but that’s the same problem human thinkers face. Faulty inputs == faulty outputs. You should always be validating your sources.
Right, but if one person keeps giving me wrong answers, knowingly or not, my distrust in them is not linear. They’ll have to “earn” my trust back, and that’s going to be very challenging. If they do learn, though, it might come back faster. In this setup I have no guarantee of any progress. There’s no “one” in there trying to fix any mistake.
[deleted by user]
AI will think for you if you prompt it to do so. It’s up to the user to use the tool in a way that suits their style.
So basically, you created a prompt wrapper that removes position bias by evaluating both candidates on trust, and forces an evidence path with a scratchpad. This is a really cool development. It probably won’t solve everything, but it solves a lot.
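If I’m reading that right, the de-biasing bit amounts to something like this sketch (my guess at the mechanism, not the author’s actual code): judge the pair in both orders and only trust agreement.

```python
from typing import Callable, Optional

# My guess at the de-biasing mechanism: run the judge over the pair in both
# orders and only accept a verdict when the two runs agree. `judge` is a
# placeholder for one LLM judging call returning "first" or "second".
Judge = Callable[[str, str, str], str]  # (prompt, first, second) -> "first" | "second"

def debiased_pick(judge: Judge, prompt: str, a: str, b: str) -> Optional[str]:
    ab = judge(prompt, a, b)   # a shown in the first slot
    ba = judge(prompt, b, a)   # b shown in the first slot
    if ab == "first" and ba == "second":
        return a               # both orderings preferred a
    if ab == "second" and ba == "first":
        return b               # both orderings preferred b
    return None                # verdict flipped with position: don't trust it
```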
Is Llama open source?
[deleted by user]
TLDR.
So you basically solved humanity’s problems with LLMs; you should sell it to NVIDIA and get rich. No more hallucination.
[deleted by user]
You should have made it clear in the title. The title is the most important part, and you’re literally saying you’ve made LLMs stop “bullshitting”. A pretentious phrase to draw everyone’s attention. Then in the body you correct everybody’s assumption. Dirty move.
[deleted by user]
For someone who has never run an AI locally: can you set this up on a regular laptop? How would you do that?
[deleted by user]
[deleted by user]