Beating GPT-4 on HumanEval with a Fine-Tuned CodeLlama-34B

Blaed@lemmy.world · 3 years ago

Beating GPT-4 on HumanEval with a Fine-Tuned CodeLlama-34B

drspod@lemmy.ml · 3 years ago

Is this model trained specifically for problem solving, or does it also perform as well as ChatGPT on conversational and generic text-generation tasks?

L_Acacia · 3 years ago

Specifically problem solving. ChatGPT has multiple models too; it’s just hidden from the user.

0421008445828ceb46f496700a5fa6@kbin.social · 3 years ago

We fine-tuned on a proprietary dataset of ~80k high-quality programming problems and solutions.

ChrisLicht@lemm.ee · 3 years ago

Dumb question: Does one install the Python model, or access it online?

L_Acacia · edit-2 3 years ago

The best way to run a Llama model locally is with Text generation web UI. The model will most likely be quantized to 4- or 5-bit GGML / GPTQ today, which will make it possible to run on a “normal” computer.
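To see roughly why quantization matters for a 34B model, here is a back-of-the-envelope sketch of the memory needed for the weights alone. It assumes "34B" means exactly 34 billion parameters and ignores per-block scale factors, the KV cache, and runtime overhead, so real GGML / GPTQ files will come out somewhat larger. The function name `weight_memory_gb` is made up for this illustration.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in gigabytes (10^9 bytes).

    Assumption: every weight is stored at exactly `bits_per_weight` bits,
    with no extra metadata. Real quantized formats add some overhead.
    """
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: "34B" is taken as exactly 34 billion parameters.
N_PARAMS = 34e9

if __name__ == "__main__":
    # Compare unquantized fp16 with the 5-bit and 4-bit cases from the comment.
    for bits in (16, 5, 4):
        size = weight_memory_gb(N_PARAMS, bits)
        print(f"{bits:>2}-bit weights: about {size:.2f} GB")
```

Under these assumptions the 16-bit weights need about four times the memory of the 4-bit ones, which is the difference between needing server hardware and fitting on a well-equipped desktop.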

Phind might make it accessible on their website soon, but it doesn’t seem to be the case yet.

EDIT: Quantized versions are available thanks to TheBloke.

ChrisLicht@lemm.ee · 3 years ago

You are awesome; thanks for the clue-in!

babysharknanana@lemmy.world · 3 years ago

Exciting, but as far as I know, we can’t use LLaMA commercially. So I ask myself how to use it in a non-commercial context. Isn’t it expensive to embed such a model in free/open-source software?

L_Acacia · 3 years ago

Llama 2 now uses a license that allows for commercial use.

babysharknanana@lemmy.world · 3 years ago

I know, but the text only mentions Llama. So this is using Llama 2?

abhibeckert@lemmy.world · edit-2 3 years ago

Llama 2 and Llama are basically the same model, except the “2” version has a more permissive license and was trained on a larger source data set. Nobody should ever use the old one, and I expect the noncommercial license was part of a contract Meta signed with someone who provided source material.

This is “CodeLlama”, which was built on Llama 2 and allows commercial use.

backgroundcow@lemmy.world · edit-2 3 years ago

I understand LLaMA and some other models come with instructions that say that they cannot be used commercially. But, unless the creators can show that you have formally accepted a license agreement to that effect, on what legal grounds can that be enforceable?

If we look at the direction US law is moving, the current legal theory seems to be that AI-generated works fall in the public domain. That means restricting their commercial use should be impossible regardless of other circumstances: public domain means that anyone can use them for anything. (But it also means that your commercial use isn’t protected from others likewise using the exact same output.)

Suppose we instead ask what legal grounds could support restrictions on the output of these models if you never agreed to a license agreement to access the model. Copyright doesn’t restrict use; it restricts redistribution. The creators of LLMs cannot reasonably take the position that output created from their models is a derivative work of the model when the model itself is created from copyrighted works, many of which they have no right to redistribute. The whole basis of LLMs rests on the premise that “training data” -> “model” produces a model that isn’t encumbered by the copyright of the training data. How can one take that position and simultaneously believe that “model” -> “inferred output” produces copyright-encumbered output? That would be a fundamentally inconsistent view.

(Note: the above is not legal advice, only free-form discussion.)

