How do you vet open source software if you don't know code well?

cm0002@lemmy.cafe · 2 months ago

How do you vet open source software if you don't know code well?

entwine@programming.dev · 2 months ago

You can’t, obviously. I know how to read code, but I still rarely do it since it’s very time consuming. Usually, if I’m nervous about something, I’ll first look at the author and see if they’re well-known, or at least tied to a real identity. In the rare cases that I have reviewed a code base (I’m not a security expert or anything) to check for malware, the things I looked for were:

obvious red flags, like URLs to fishy sites, calls to filesystem APIs where they don’t make sense, paths that it shouldn’t be trying to access, etc.
anything that looks obfuscated, poorly written, or deliberately designed to be difficult to read

But if it’s anything related to Node/NPM, I always use a throwaway rootless podman container without filesystem access. Even if the author is trustworthy, their dependency graph is likely a bag of used needles that they picked up on the side of the road.
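A rough first pass over the red flags listed above can be automated. The following is a minimal sketch, not a malware detector: the pattern names and regular expressions are assumptions chosen for illustration, and a clean result proves nothing.

```python
import re

# Hypothetical patterns for illustration only; a real review needs far more.
RED_FLAGS = {
    "url": re.compile(r"https?://[^\s'\"]+"),
    "dynamic_exec": re.compile(r"\b(?:eval|exec)\s*\("),
    "long_encoded_blob": re.compile(r"[A-Za-z0-9+/=]{80,}"),
    "sensitive_path": re.compile(
        r"(?:~|/home/[^/\s]+)/\.(?:ssh|aws|gnupg)\b|/etc/(?:passwd|shadow)\b"
    ),
}


def scan_text(text):
    """Return (line_number, flag_name, matched_text) for each suspicious hit."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in RED_FLAGS.items():
            match = pattern.search(line)
            if match:
                hits.append((lineno, name, match.group(0)))
    return hits
```

Running `scan_text(open(path).read())` over each source file gives a list of lines worth reading by hand. Most hits will be harmless (documentation links, test fixtures), so treat the output as a reading list rather than a verdict.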

treadful@lemmy.zip · edit-2 2 months ago

There’s not a ton you can do, but you can look out for indicators of a healthy project.

A good sign is if the repo has a lot of different contributors. If something has hundreds or thousands of contributors, there are more eyes on it to catch something malicious. Other activity, like bug reports, also demonstrates a strong user base, which is like crowdsourcing trust.

Another thing is, if your distro packages it in one of their main repos that’s a reasonable indicator that you can trust it. Def not 100% but when you don’t have a lot to go on, it’s something.

Any other tips I think I have are more technical.

pinball_wizard@lemmy.zip · 2 months ago

Read along on Lemmy.

We will complain about the smallest impurities in our selected open source software.

If we’re not complaining about a piece of open source software, then we don’t trust it enough to use it.

ikidd@lemmy.world · 2 months ago

99% of people who can read code are only going to catch obvious things like cryptominers. Most aren’t going to catch something like the XZ malware that was an entirely serendipitous finding from timing how long a certain part of the process took and noticing it was off. True malware is using unique loopholes and malformed requests that will get past nearly everyone.

There really needs to be a concentrated effort put into vetting code, but of course, funding for that is non-existent. 60% of code in the wild is maintained by hobbyists getting paid almost nothing. We’re screwed.

SMillerNL@piefed.social · 2 months ago

Not disagreeing with you, but since the author is asking about GitHub… the XZ GitHub didn’t actually have any malicious code. Only the website tarball did.
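The general point, that a release tarball can differ from what is in the repository, is something you can check yourself. A minimal sketch, assuming you have already cloned the repository and extracted the tarball into two local directories (the function names here are hypothetical):

```python
import hashlib
from pathlib import Path


def hash_tree(root):
    """Map each file's relative path to the SHA-256 of its contents."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def diff_trees(git_checkout, extracted_tarball):
    """Report files that exist only in the tarball or differ from the checkout."""
    git_hashes = hash_tree(git_checkout)
    tar_hashes = hash_tree(extracted_tarball)
    only_in_tarball = sorted(set(tar_hashes) - set(git_hashes))
    changed = sorted(
        path
        for path in set(tar_hashes) & set(git_hashes)
        if tar_hashes[path] != git_hashes[path]
    )
    return only_in_tarball, changed
```

Differences are not proof of tampering: release tarballs often ship generated files, such as configure scripts, that are not tracked in the repository. The output only tells you which files deserve a closer look.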

jimmy90@lemmy.world · edit-2 2 months ago

this is a small-ish problem with FOSS that doesn’t have an easy solution. the open source supply chain of code, libraries, tools and apps needs constant peer review, validation and enforcement

i think the tech behind NixOS will go some way to automating this but a coordinated human collaborative effort will be required too

dare i say it even AI might be able to lend a hand

tiny_hedgehog@piefed.social · edit-2 2 months ago

Probably a simple way without looking at ANY code is to just look at a few key points on GitHub (or GitLab or other):

Stars: This is the number of people who have favourited the package. In general, if a package has more stars (500+, 1000+) it is probably good and has had a lot of people looking at it and using it. Beware packages with only a few stars (fewer than 20, though context matters).

Forks: Also look at the number of forks the repo has. In general, the more forks it has, the more people in the community have contributed to it, fixing bugs, tightening security, etc. Again, the more eyes the package has on it, the higher the chance that key vulnerabilities have been identified and fixed.

Number of contributors: Same reason as forks.

When the files in the repo were last updated: Occasionally you’ll find a package that meets the above heuristics very well, but was last updated 5 to 10 years ago. Be wary of these: they aren’t being maintained and so may carry unpatched vulnerabilities.

All these points are just rough heuristics and there will be exceptions, but they can generally point even experienced developers in the right direction.
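The checklist above can be written down as a small function. This is only a sketch: `RepoStats` and `health_warnings` are hypothetical names, you would fill in the numbers by hand from the repo page, and the thresholds are assumptions (the 20-star and 5-year cut-offs are taken from the points above, while the fork and contributor cut-offs are made up for illustration).

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class RepoStats:
    stars: int
    forks: int
    contributors: int
    last_updated: date


def health_warnings(stats, today, min_stars=20, max_age_years=5):
    """Return a list of warnings; an empty list means no heuristic tripped."""
    warnings = []
    if stats.stars < min_stars:
        warnings.append("few stars")
    if stats.forks == 0:  # assumed cut-off
        warnings.append("no forks")
    if stats.contributors <= 1:  # assumed cut-off
        warnings.append("single contributor")
    age_days = (today - stats.last_updated).days
    if age_days > max_age_years * 365:
        warnings.append("not updated in %d+ years" % max_age_years)
    return warnings
```

For example, `health_warnings(RepoStats(5, 0, 1, date(2015, 1, 1)), today=date(2024, 1, 1))` trips all four checks. An empty list does not mean the package is safe, only that none of these rough signals fired.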

mesa@piefed.social · 2 months ago

I usually start with whatever the entry point is: main.cpp, main.py, etc. You can tell a lot about software by how easy or hard it is to read, and by its dependencies.

Flyswat@lemmy.dbzer0.com · 2 months ago

In addition to what others said, I use https://openhub.net/ which provides a summary and key indicators.