I’m doing a lot of coding and what I would ideally like to have is a long context model (128k tokens) that I can use to throw in my whole codebase.

I’ve been experimenting e.g. with Claude and what usually works well is to attach e.g. the whole architecture of a CRUD app along with the most recent docs of the framework I’m using and it’s okay for menial tasks. But I am very uncomfortable sending any kind of data to these providers.

Unfortunately I don’t have a lot of space so I can’t build a proper desktop. My options are either renting out a VPS or going for something small like a MacStudio. I know speeds aren’t great, but I was wondering if using e.g. RAG for documentation could help me get decent speeds.

I’ve read that especially on larger contexts Macs become very slow. I’m not very convinced but I could get a new one probably at 50% off as a business expense, so the Apple tax isn’t as much an issue as the concern about speed.

Any ideas? Are there other mini pcs available that could have better architecture? Tried researching but couldn’t find a lot

Edit: I found some stats on GitHub on different models: https://github.com/ggerganov/llama.cpp/issues/10444

Based on that I also conclude that you’re gonna wait forever if you work with a large codebase.

  • 0x01@lemmy.ml
    link
    fedilink
    English
    arrow-up
    6
    ·
    edit-2
    1 day ago

    I do this on my ultra, token speed is not great, depending on the model of course, a lot of source code sets are optimized for Nvidia and don’t even use native Mac gpu without modifying the code, defaulting to cpu. I’ve had to modify about half of what I run

    Ymmv but I find it’s actually cheaper to just use a hosted service

    If you want some specific numbers lmk

    • shaserlark@sh.itjust.worksOP
      link
      fedilink
      English
      arrow-up
      1
      ·
      1 day ago

      Interesting, is there any kind of model you could run at reasonable speed?

      I guess over time it could amortize but if the usability sucks that may make it not worth it. OTOH really don’t want to send my data to any company.