Meta has released Llama 3.1. It looks like a significant improvement on an already quite good model: it is now multilingual, has a 128k-token context window, has some sort of tool chaining support and, overall, performs better on benchmarks than its predecessor.

Alongside this release, Meta also published a 405B-parameter model, together with updated 70B and 8B versions.

I’ve been using the 3.0 version and was already satisfied, so I’m excited to try this.

    • @chayleaf@lemmy.ml
      8 points · 4 months ago

      The code is FOSS, the weights aren’t. This is pretty common with, e.g., FOSS games; the only difference here is that weights are much costlier to remake from scratch than game assets.

      • Possibly linux
        6 points · 4 months ago

        The license has limitations and isn’t something standard like Apache.

        • Pennomi
          5 points · 4 months ago

          True, but it hardly matters for the source since the architecture is pulled into open source projects like transformers (Apache) and llama.cpp (MIT). The weights remain under the dubious Llama Community License, so I would only call the data “available” instead of “open”.

  • @greysemanticist
    1 point · 4 months ago

    It doesn’t follow instructions and insists on being “conversational” despite being told not to be.

    • RachelRodent
      3 points · 4 months ago

      That is the base model. Wait for people to fine-tune it for specific tasks.