Happy to answer any questions and open to suggestions :)
It's basically an LLM with access to a search engine and the ability to query a vector DB.
The top n results from each search query (issued by the LLM) are scraped, split into small chunks, and saved to the vector DB. The LLM can then query the vector DB to retrieve the relevant chunks. This obviously isn't as comprehensive as having a 128k-context LLM just summarize everything, but at least on local hardware it's a lot faster and far more resource-friendly. The demo on GitHub runs on a normal consumer GPU (AMD RX 6700 XT) with 12GB of VRAM.
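The chunking step described above can be sketched in a few lines of Go. This is a minimal illustration with made-up sizes, not the project's actual code; real pipelines usually split on sentence or token boundaries rather than raw characters.

```go
package main

import "fmt"

// splitChunks splits text into overlapping chunks of roughly chunkSize
// characters, stepping forward by chunkSize-overlap each time. The
// overlap keeps context that would otherwise be cut at a chunk border.
func splitChunks(text string, chunkSize, overlap int) []string {
	var chunks []string
	step := chunkSize - overlap
	for start := 0; start < len(text); start += step {
		end := start + chunkSize
		if end > len(text) {
			end = len(text)
		}
		chunks = append(chunks, text[start:end])
		if end == len(text) {
			break
		}
	}
	return chunks
}

func main() {
	// Tiny example input; real chunk sizes are in the hundreds of tokens.
	fmt.Println(splitChunks("abcdefghij", 4, 2))
	// [abcd cdef efgh ghij]
}
```

Each chunk would then be embedded and written to the vector DB, so a later similarity query can pull back only the few chunks relevant to the question.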
Wonderful work!
Is it possible to make it use only a subset of the web (only sites that I trust and think are relevant to producing an accurate answer)? Are there ways to make it work offline on pre-installed websites (Wikipedia, some other wikis, and possibly news sites that are archived locally)? And how about other forms of documents (books and research papers as PDFs)?
Seconded. I tried to do this many years ago for my dissertation and failed, but this would be a dream of mine.
Would it not be possible to create a search engine that only crawls certain sites?
I was most interested in the offline aspect of it, which I wouldn't know where to even start with if I were to fork.
How do you parse and efficiently store large, unstructured information for arbitrary, unstructured queries?
You put it in a search server, like ElasticSearch or Meili.
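To make the "put it in a search server" suggestion concrete, here is a hedged sketch of the JSON body an Elasticsearch full-text `match` query expects. The field and query strings are examples; the HTTP call to the server's `_search` endpoint is omitted.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildMatchQuery builds the request body for an Elasticsearch "match"
// query, which scores documents by full-text relevance against one field.
func buildMatchQuery(field, query string) ([]byte, error) {
	body := map[string]any{
		"query": map[string]any{
			"match": map[string]any{
				field: query,
			},
		},
	}
	return json.Marshal(body)
}

func main() {
	b, err := buildMatchQuery("content", "local llm search")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
	// {"query":{"match":{"content":"local llm search"}}}
}
```

POSTing that body to an index's `_search` endpoint returns ranked hits, which is the "arbitrary, unstructured queries" part handled for you.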
LLocalSearch uses SearXNG, which has a feature to blacklist/whitelist sites for various purposes.
Also a great idea to expose this in the frontend. Thanks :)
Uhhh, both ideas are great. Would you like to turn them into GitHub issues? I will definitely look into both of them :)
What is the search engine that it uses?
SearXNG, a locally running metasearch engine that combines a lot of different sources (including Google and co).
This might be more of a SearXNG question, but doesn't it quickly run up against anti-bot measures, like CAPTCHA challenges and Forbidden responses? I can see the manual has some support for dealing with CAPTCHAs [1], but in practical terms, I would guess a tool like this can't be used extensively all day long.
I'm wondering if there's a search API that would make the backend seamless for something like this.
1. https://docs.searxng.org/admin/answer-captcha.html
As a last resort, we could have AI working on top of a real web browser, solving CAPTCHAs as well. It should look like normal usage. I think these kinds of systems (LLM + RAG + web agent) will become widespread and the preferred way to interact with the web.
We can escape all ads and dark UI patterns by delegating this task to AI agents. We could have it collect our feeds, filter, rank and summarize them to our preferences, not theirs. I think every web browser, operating system and mobile device will come equipped with its own LLM agent.
The development of AI screen agents will probably get a big boost from training on millions of screen capture videos with commentary on YouTube. They will become a major point of competition on features. Not just browser, but also OS, device and even the chips inside are going to be tailored for AI agents running locally.
If everyone consumes like that, what's even the incentive for content creators?
If content creators can't find anything that is uniquely human and cannot be made by AI, then maybe they are not creative enough for the job. The thing about generative AI is that it can take context: you can put a lot or very little guidance into it. The more you specify, the more you can mix your own unique sauce into the final result.
I personally use AI for text style changes, as a summarizer of ideas, and as a rubber duck, something to bounce ideas off of. It's good for getting ideas flowing, and sometimes it can help you realize things you missed, or frame something better than you could.
I didn't run into a lot of timeouts while using it myself, but you would probably need another search source if you plan to host this service for multiple users at the same time.
There are projects like FlareSolverr which might be interesting.
If you're open to it, it would be great if you could make a post explaining how you built this. Even if it's brief. Trying to learn more about this space and this looks pretty cool. And ofc, nice work!
a primer - https://github.com/nilsherzig/LLocalSearch/issues/17
Guys, I didn't think there would be this much interest in my project haha. I feel kinda bad for just posting it in this state. I would love to make a more detailed post on how it works in the future (keep an eye on the repo?).
When scraping the websites, do you just blindly cut all of the HTML into fixed-size chunks, or is there some more sophisticated logic to extract the text of interest?
I'm wondering because most news websites now have a lot of polluting elements like popups; would those also end up in the database?
If you look at the vector handler in his code, he is using the bluemonday sanitizer and doing some `replaceAll` calls.
So I think there may be some useless data in the vectors, but that may not be an issue since it comes from multiple sources (for simple questions at least).
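For intuition, here is a stdlib-only stand-in for that sanitize-then-replace step. It is not the project's code, and bluemonday parses HTML properly rather than using a regexp; this sketch just shows why boilerplate text (like popup copy) can survive into the chunks.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// tagRe matches HTML tags. A real sanitizer like bluemonday parses the
// markup; a regexp is only good enough for a rough sketch.
var tagRe = regexp.MustCompile(`<[^>]*>`)

// cleanHTML strips tags and collapses leftover whitespace, roughly what
// a strict sanitize plus replaceAll pipeline produces.
func cleanHTML(html string) string {
	text := tagRe.ReplaceAllString(html, " ")
	return strings.Join(strings.Fields(text), " ")
}

func main() {
	fmt.Println(cleanHTML(`<div class="popup">Subscribe!</div><p>Real   article text.</p>`))
	// Subscribe! Real article text.
}
```

Note that the popup's "Subscribe!" text survives sanitization, which is exactly the useless-data-in-the-vectors concern; in practice the retrieval step mostly ignores it because it doesn't match the query.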
Your project looks very cool. I had on my ‘list’ to re-learn Typescript (I took a TS course about 5 years ago, but didn’t do anything with it) so I just cloned your repo so I can experiment with it.
EDIT: I just noticed that most of the code is Go. Still going to play with it!
Thanks :). Yeah, only the web part is TypeScript, and I really wouldn't recommend learning from my TypeScript haha.
"normal consumer GPU"... well mine is a 4GB 6600.. so I guess that varies.
Sorry, it wasn't my intention to gatekeep, but my 300€ card really is on the low end for LLM things.
Any plans to support other backends besides Ollama?
Sure (if they are OpenAI-API-compatible I can add them within minutes); otherwise I'm open to pull requests :)
Also, I don't own an Nvidia card or a Windows/macOS machine.
This is awesome. I would love it if there were executable files with these dependencies bundled in. That would make it way more accessible, rather than just to those who know how to use the command line and resolve dependencies (yes, even Docker runs into that when fighting the local system).