The Irony of 'You Wouldn't Download a Car' Making a Comeback in AI Debates

@FatCat@lemmy.world · 2 months ago

The Irony of 'You Wouldn't Download a Car' Making a Comeback in AI Debates

@masterspace@lemmy.ca · 2 months ago

For the purposes of this conversation. That’s pretty much just a pedantic difference. They are paying to train those models and then providing them to the public to use completely freely in any way they want.

It would be like developing open source software and then not calling it open source because you didn’t publish the market research that guided your UX decisions.

@Arcka@midwest.social · 2 months ago

Tell me you’ve never compiled software from open source without saying you’ve never compiled software from open source.

The only differences between open source and freeware are pedantic, right guys?

@masterspace@lemmy.ca · 2 months ago

Tell me you’ve never developed software without telling me you’ve never developed software.

A closed source binary that is copyrighted and illegal to use, is totally the same thing as a all the trained weights and underlying source code for a neural network published under the MIT license that anyone can learn from, copy, and use, however they want, right guys?

@WalnutLum@lemmy.ml · 2 months ago

You said open source. Open source is a type of licensure.

The entire point of licensure is legal pedantry.

And as far as your metaphor is concerned, pre-trained models are closer to pre-compiled binaries, which are expressly not considered Open Source according to the OSD.

@masterspace@lemmy.ca · edit-2 2 months ago

You said open source. Open source is a type of licensure.

The entire point of licensure is legal pedantry.

No. Open source is a concept. That concept also has pedantic legal definitions, but the concept itself is not inherently pedantic.

And as far as your metaphor is concerned, pre-trained models are closer to pre-compiled binaries, which are expressly not considered Open Source according to the OSD.

No, they’re not. Which is why I didn’t use that metaphor.

A binary is explicitly a black box. There is nothing to learn from a binary, unless you explicitly decompile it back into source code.

In this case, literally all the source code is available. Any researcher can read through their model, learn from it, copy it, twist it, and build their own version of it wholesale. Not providing the training data, is more similar to saying that Yuzu or an emulator isn’t open source because it doesn’t provide copyrighted games. It is providing literally all of the parts of it that it can open source, and then letting the user feed it whatever training data they are allowed access to.