Guess it’s all in the subject. I’ve found some implementations of AI practical, but they’re always asking for more data, more everything. Just curious how others use AI as carefully as possible.
I’m all for local models, but if you don’t have a giant computer, pay a few bucks for a no-log API like the Cerebras API (rough usage sketch below). Or any company, in any jurisdiction; take your pick.
But you can use them with any number of chat front ends, including easily self-hostable ones.
Obviously the caveat for sending anything over the internet or “trusting” a cloud business applies, but we’re talking about inference-only companies that mostly host open LLMs for other businesses to use; you aren’t their product. They don’t do any training, and their business isn’t invading your privacy.
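For reference, here’s roughly what using one of those looks like from a script, assuming the provider exposes an OpenAI-compatible endpoint (Cerebras does, or at least did); the base URL, model name, and env var below are assumptions to swap for whatever your provider documents:

```python
# Minimal sketch: talking to a no-log inference provider through an
# OpenAI-compatible endpoint. Base URL, model name, and env var are
# assumptions; check your provider's docs.
import os
from openai import OpenAI  # pip install openai

client = OpenAI(
    base_url="https://api.cerebras.ai/v1",     # assumed endpoint
    api_key=os.environ["INFERENCE_API_KEY"],   # hypothetical env var name
)

resp = client.chat.completions.create(
    model="llama-3.3-70b",                     # whichever open model they host
    messages=[{"role": "user", "content": "Summarize this text for me: ..."}],
)
print(resp.choices[0].message.content)
```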
You can use local models, or use a VPN + private window and then a site like Venice.ai, which doesn’t require an account.
What is the business model of Venice.ai?
Selling premium subscriptions for uncensored models, I think. But it doesn’t really matter, as it doesn’t require an account.
Run local models.
Indeed. I saved a set of instructions and have just been waiting for the time to implement.
The chatbot only knows what you tell it. Don’t tell it what you don’t want it to know.
The bigger issue to me would be them crawling data for training. If you don’t want it training on your data then keep it offline/hidden.
So you never have the odd question for it?
Define odd. I ask it stuff all the time. But I anonymize it, or feed it purposefully wrong info as part of my prompt.
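If you want that anonymizing to happen automatically instead of by hand, a tiny scrubber along these lines can run over a prompt before it leaves your machine; the names and patterns here are just placeholders for whatever you actually tend to leak:

```python
# Rough sketch of a pre-send scrubber: swap out identifying strings and
# obvious patterns before a prompt goes to any remote model.
# The replacement map and regexes are placeholders, not a complete list.
import re

REPLACEMENTS = {
    "Jane Doe": "the user",        # hypothetical real name
    "Acme Corp": "my employer",    # hypothetical employer
}

PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[email]"),              # email addresses
    (re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"), "[phone]"),    # US-style phone numbers
]

def scrub(prompt: str) -> str:
    for real, fake in REPLACEMENTS.items():
        prompt = prompt.replace(real, fake)
    for pattern, placeholder in PATTERNS:
        prompt = pattern.sub(placeholder, prompt)
    return prompt

print(scrub("Hi, I'm Jane Doe (jane@acme.com), draft a resignation letter to Acme Corp."))
```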
Yes! Mine thinks I’m a banker and I wear a top hat.
Use a local model, learn some tool calling (rough sketch further down), and have it retrieve factual answers from a service like Wolfram Alpha if needed. We have a community over at c/localllama@sh.itjust.works all about local models. If you’re not very techy, I recommend starting with a simple llamafile, which is a one-click executable that packages engine and model together in a single file.
Then move on to a real local model engine like kobold.cpp, running a quantized model that fits on your hardware, especially if you have a graphics card and want to offload layers via CUDA or Vulkan. Feel free to reply/message me if you need further clarification/guidance.
https://github.com/mozilla-ai/llamafile
https://github.com/LostRuins/koboldcpp
I would start with a 7B Q4_K_M quant and see if your system can run that.
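Here’s the tool-calling idea sketched out, assuming your local engine (llamafile, koboldcpp, etc.) is serving an OpenAI-compatible endpoint on localhost and the model you load actually supports tool calls; the port, model name, and Wolfram Alpha Short Answers API usage are all things to verify against the docs of whatever you end up running:

```python
# Rough sketch of tool calling against a locally hosted model.
# Assumptions: a local OpenAI-compatible server on port 8080, a model that
# handles tool calls, and a Wolfram Alpha app id in the environment.
import json
import os
import urllib.parse
import urllib.request

from openai import OpenAI  # pip install openai

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed-locally")

def ask_wolfram(query: str) -> str:
    """Fetch a short factual answer from Wolfram Alpha's Short Answers API."""
    url = "https://api.wolframalpha.com/v1/result?" + urllib.parse.urlencode(
        {"appid": os.environ["WOLFRAM_APP_ID"], "i": query}
    )
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode()

tools = [{
    "type": "function",
    "function": {
        "name": "ask_wolfram",
        "description": "Look up a factual or mathematical answer.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

messages = [{"role": "user", "content": "What is the population of Iceland?"}]
first = client.chat.completions.create(model="local-model", messages=messages, tools=tools)

call = (first.choices[0].message.tool_calls or [None])[0]
if call:  # the model decided it needs the tool
    answer = ask_wolfram(json.loads(call.function.arguments)["query"])
    messages.append(first.choices[0].message)
    messages.append({"role": "tool", "tool_call_id": call.id, "content": answer})
    final = client.chat.completions.create(model="local-model", messages=messages, tools=tools)
    print(final.choices[0].message.content)
else:
    print(first.choices[0].message.content)
```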
Thank you so much. My work schedule lightens for a few days in December, and this is at the top of my to-do list.
I’m actually quite excited to learn more about local models so your guidance is richly appreciated and I will surely check out the community you mention here.
Go in with realistic expectations though. The performance is going to be very limited unless you have a pretty beefy PC.
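A quick back-of-the-envelope way to set those expectations is to check whether a given quant even fits in your RAM/VRAM; the bits-per-weight numbers below are rough approximations, and real usage runs a bit higher once you add context:

```python
# Back-of-the-envelope check: will a quantized model fit in memory?
# Bits-per-weight values are approximations for common GGUF quants.
APPROX_BITS_PER_WEIGHT = {"Q8_0": 8.5, "Q6_K": 6.6, "Q4_K_M": 4.8, "Q3_K_M": 3.9}

def model_size_gb(params_billion: float, quant: str) -> float:
    bits = params_billion * 1e9 * APPROX_BITS_PER_WEIGHT[quant]
    return bits / 8 / 1024**3

for quant in APPROX_BITS_PER_WEIGHT:
    print(f"7B at {quant}: ~{model_size_gb(7, quant):.1f} GB")
# A 7B Q4_K_M file lands around 4 GB, so it fits on an 8 GB GPU or in modest
# system RAM; a 70B model at the same quant is ~40 GB and does not.
```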
Alpaca on Flathub makes it simple to set up a local instance and get chatting. https://flathub.org/en/apps/com.jeffser.Alpaca
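If you ever want to script against it too: Alpaca sits on top of a local Ollama instance, and a minimal sketch of talking to that instance directly looks like the below, assuming it’s reachable at Ollama’s default port (worth verifying for the Flatpak) and that you’ve pulled the model named here:

```python
# Sketch of talking to a local Ollama-style endpoint from a script.
# The URL and model name are assumptions to verify against your setup.
import json
import urllib.request

payload = json.dumps({
    "model": "llama3.2",                 # whichever model you pulled
    "messages": [{"role": "user", "content": "Hello from a local model!"}],
    "stream": False,
}).encode()

req = urllib.request.Request(
    "http://localhost:11434/api/chat",   # Ollama's default local API
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["message"]["content"])
```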
Thank you! I come here to be the fly on the wall listening to tech magicians but I’ve got to start upping my game when I get some free time.