There are oodles of neat and singular programs on GitHub and similar sites. Curious what steps people take to vet for malware before downloading and trying stuff, especially if you’re not very familiar with the language it’s written in.
There’s not a ton you can do, but you can look out for indicators of a healthy project.
A good sign is if the repo has a lot of different contributors. If something has hundreds or thousands of contributors, there are more eyes on it to catch something malicious. Other activity, like bug reports, also demonstrates a strong user base, which is like crowdsourcing trust.
Another thing: if your distro packages it in one of its main repos, that’s a reasonable indicator that you can trust it. Def not 100%, but when you don’t have a lot to go on, it’s something.
Any other tips I think I have are more technical.
We will complain about the smallest impurities in our selected open source software.
If we’re not complaining about a piece of open source software, then we don’t trust it enough to use it.
You can’t, obviously. I know how to read code, but I still rarely do it since it’s very time consuming. Usually, if I’m nervous about something, I’ll first look at the author and see if they’re well-known, or at least tied to a real identity. In the rare cases that I have reviewed a code base (I’m not a security expert or anything) to check for malware, the things I looked for were:
- obvious red flags, like URLs to fishy sites, calls to filesystem APIs where they don’t make sense, paths that it shouldn’t be trying to access, etc.
- anything that looks obfuscated, poorly written, or deliberately designed to be difficult to read
But if it’s anything related to Node/NPM, I always use a throwaway rootless podman container without filesystem access. Even if the author is trustworthy, their dependency graph is likely a bag of used needles that they picked up on the side of the road.
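For the red flags listed above, a rough first pass can be automated before you read anything by hand. This is only a sketch: the regexes, the 120-character blob threshold, the 1000-character line limit, and the list of "sensitive" paths are my own assumptions, not a vetted ruleset. It will miss anything subtle and will flag plenty of harmless things like documentation links.

```python
import re
from pathlib import Path

# Heuristic patterns only; illustrative assumptions, not a vetted ruleset.
URL_RE = re.compile(r"https?://[^\s'\"<>)]+")
# Long unbroken base64/hex-like runs often mean an embedded or obfuscated payload.
BLOB_RE = re.compile(r"[A-Za-z0-9+/=]{120,}")
# Paths a typical program has little reason to touch.
SENSITIVE_RE = re.compile(r"\.ssh/|/etc/passwd|/etc/shadow|\.aws/credentials|\.gnupg/")


def scan_text(text):
    """Return a list of (line_number, kind, snippet) findings for one file's text."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in URL_RE.finditer(line):
            findings.append((lineno, "url", match.group(0)))
        for match in SENSITIVE_RE.finditer(line):
            findings.append((lineno, "sensitive-path", match.group(0)))
        if BLOB_RE.search(line):
            findings.append((lineno, "long-encoded-blob", line.strip()[:60]))
        elif len(line) > 1000:
            # Minified or deliberately unreadable code tends to show up as huge lines.
            findings.append((lineno, "very-long-line", line.strip()[:60]))
    return findings


def scan_tree(root):
    """Scan every readable UTF-8 text file under root; returns {relative_path: findings}."""
    root = Path(root)
    report = {}
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        if not path.is_file() or ".git" in rel.parts:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # binary or unreadable; skipped here, but binaries deserve their own look
        findings = scan_text(text)
        if findings:
            report[rel.as_posix()] = findings
    return report
```

Run `scan_tree("path/to/checkout")` on a fresh clone and skim the output. Every hit is a prompt to go read that line, not a verdict either way.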
99% of people who can read code are only going to catch obvious things like cryptominers. Most aren’t going to catch something like the XZ malware that was an entirely serendipitous finding from timing how long a certain part of the process took and noticing it was off. True malware is using unique loopholes and malformed requests that will get past nearly everyone.
There really needs to be a concentrated effort put into vetting code, but of course, funding for that is non-existent. 60% of code in the wild is maintained by hobbyists getting paid almost nothing. We’re screwed.
Not disagreeing with you, but since the author is asking about GitHub… the XZ GitHub repo didn’t actually have any malicious code. Only the website tarball did.
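If the worry is a release tarball that doesn’t match the repo, one thing you can check yourself is whether the unpacked tarball contains files that are missing from, or different in, a git checkout of the same tag. Below is a minimal sketch, assuming you have already extracted the tarball and checked out the matching tag into two directories; the function names are made up for illustration. Release tarballs often legitimately include generated files (configure scripts and the like) that aren’t in git, so a difference means "go look", not "this is malware".

```python
import hashlib
from pathlib import Path


def hash_tree(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        if path.is_file() and ".git" not in rel.parts:
            digests[rel.as_posix()] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def compare_trees(repo_dir, tarball_dir):
    """List files that exist only in the tarball, and files whose contents differ."""
    repo = hash_tree(repo_dir)
    tarball = hash_tree(tarball_dir)
    only_in_tarball = sorted(set(tarball) - set(repo))
    changed = sorted(p for p in set(repo) & set(tarball) if repo[p] != tarball[p])
    return {"only_in_tarball": only_in_tarball, "changed": changed}
```

Calling `compare_trees("checkout/", "unpacked-tarball/")` gives two short lists to review by hand, which is a lot more manageable than reading the whole code base.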



