Frog put Claude in a box

cm0002@toast.ooo · 2 days ago


verstra@programming.dev · 2 days ago

It’s probably something like “I’ve disabled the agent’s removeFile tool, but the LLM figured out that it can still use the bash tool”.

It looks like the “AI bad” or “Claude insecure” mantra.
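
To illustrate the scenario guessed at above, here is a minimal sketch of a name-based tool denylist. Everything in it is hypothetical (the `DISABLED_TOOLS` set, the `is_blocked` helper, and the shape of the tool calls); it is not any real agent framework's API. It only shows why blocking one tool by name does not block the same action through a general-purpose shell tool.

```python
# Hypothetical denylist: only the tool *name* is checked, not what the call does.
DISABLED_TOOLS = {"removeFile"}


def is_blocked(tool_name: str) -> bool:
    """Return True if the tool name is on the denylist."""
    return tool_name in DISABLED_TOOLS


# Two hypothetical tool calls that would have the same effect on disk.
requested = [
    ("removeFile", {"path": "notes.txt"}),
    ("bash", {"command": "rm notes.txt"}),
]

# The denylist stops the first call but lets the second through.
decisions = {
    name: ("blocked" if is_blocked(name) else "allowed")
    for name, _args in requested
}
print(decisions)
```

A check like this never looks at the arguments, so the bash call is allowed even though it deletes the same file.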

dumnezero@piefed.social · 2 days ago

mantra

you mean facts?

Scipitie@lemmy.dbzer0.com · 2 days ago

“It’s my circlejerk - so it’s a fact!”

dumnezero@piefed.social · 2 days ago

I hope that you’re hired for long enough to learn what having security means in the context of using LLM “agents” and the like.

kingofras@lemmy.world · 1 day ago

deleted by creator

OwOarchist@pawb.social · 2 days ago

It looks like the “AI bad” or “Claude insecure” mantra.

Until you solve prompt injection, they are indeed extremely bad for security and should never be given permissions that would allow them to do anything catastrophic.

verstra@programming.dev · 6 hours ago

I say mantra because there are a large number of people who just hate AI outright, without any grounded reasoning.

Granted, coding agents are insecure by default - they are built to execute remote code - but that does not mean they are generally useless/harmful/bad. I run them in a container, with access to the codebase only.

Also, they hallucinate, produce over-convoluted abstractions, and do not know when to give up on a task instead of blindly trying to find a way through a brick wall.

But also, they can answer questions about gigantic codebases way faster than I could. They can generate tests, find missing test coverage, review code, and many other things.
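
For anyone wondering what the container setup mentioned above might look like, here is a rough sketch that builds a `docker run` command with only the codebase mounted. The image name `agent-image:latest`, the `agent` entrypoint, and the `build_agent_command` helper are placeholders, not a real product. Disabling the network by default is also an assumption; an agent that calls a hosted model would need some network access.

```python
import shlex
from pathlib import Path


def build_agent_command(codebase, image="agent-image:latest",
                        agent_cmd=("agent",), allow_network=False):
    """Build a `docker run` argv that exposes only the codebase directory."""
    repo = Path(codebase).resolve()
    cmd = [
        "docker", "run", "--rm",
        "--cap-drop", "ALL",         # drop all Linux capabilities
        "--read-only",               # container root filesystem is read-only
        "--tmpfs", "/tmp",           # scratch space that vanishes on exit
        "-v", f"{repo}:/workspace",  # the only host path the agent can see
        "-w", "/workspace",
    ]
    if not allow_network:
        cmd += ["--network", "none"]  # assumption: no network unless requested
    cmd += [image, *agent_cmd]        # placeholder image and entrypoint
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(build_agent_command(".")))
```

The mounted codebase is still writable, so the agent can damage the repository itself; keeping it under version control with a remote copy covers that case.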

kingofras@lemmy.world · 1 day ago

mantra

The way LLMs work is that they will actively make multiple attempts to get past hurdles (because they have no intelligence or methodology), so guardrails need to be extremely tight to work; otherwise the model will simply treat them as one more challenge to overcome.

That’s the mantra, and that is very poor technology to put in the hands of people who don’t understand how it works.

kingofras@lemmy.world · 1 day ago

deleted by creator