Title says it all. Why spend so much effort fine-tuning and serving models locally when any closed-source model will do the same for cheaper in the long run? Is it a philosophical argument (as in freedom vs. free beer)? Or are there practical cases where a local model does better?

Where I’m coming from: I need a copilot, primarily for code but maybe for automating personal tasks as well, and I’m wondering whether to put down the $20/mo for GPT-4 or roll my own personal assistant and run it locally (I have an M2 Max, so compute wouldn’t be a huge issue).

    • oppenbhaimerOPB · 10 months ago

      The alternative here isn’t Uber; it’s a fast public transportation system. Local LLMs still don’t hold a candle to GPT-4’s performance in my experience, no matter what the benchmarks say.

      • a_beautiful_rhindB · 10 months ago

        I have decent public transportation in my city. It still takes two hours to get anywhere, and it won’t drop me at my door on my schedule.

        Autonomy counts for something. Best case is always “get both”.

  • Monkey_1505B · 10 months ago

    Why do people brew their own beer, or grow their own weed?

    It’s because they want to be more connected to the process, in control of it, and cut out the middleman. Also, local models probably won’t destroy civilization.

  • Bright-Question-6485B · 10 months ago

    Maybe I missed it, but the most important argument may have slipped by, which is quite simply this: GPT-4 looks and feels good, but if you have a clear task (anything, literally; examples are data-structuring pipelines, information extraction, repairing broken data models), then a fine-tuned Llama model will make GPT-4 look like a toddler. It’s crazy, and if you don’t believe me I can only recommend that everyone give it a try and benchmark the results. It is that much of a difference. Plus, fine-tuning lets you iron out bugs in understanding that, with GPT-4, you can only attack through prompting, and there are clear limits to where prompt engineering can take you.

    To be clear, I am really saying that there are things GPT-4 just cannot do where a fine-tuned Llama simply gets the job done.
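    For anyone wondering what "a clear task" means concretely: most of the work in a fine-tune like this is assembling the training pairs, not the training itself. A minimal sketch (the invoice examples and field names are made up for illustration) of prepping an information-extraction dataset in the chat-style JSONL that most Llama fine-tuning stacks accept:

    ```python
    import json

    # Hypothetical (raw text, structured fields) pairs for an
    # invoice-extraction task. A real fine-tune wants hundreds or more.
    samples = [
        ("Invoice #4417 from Acme Corp, due 2024-03-01, total $1,280.00",
         {"invoice_id": "4417", "vendor": "Acme Corp",
          "due": "2024-03-01", "total": "1280.00"}),
        ("Invoice #9021 from Globex, due 2024-04-15, total $310.50",
         {"invoice_id": "9021", "vendor": "Globex",
          "due": "2024-04-15", "total": "310.50"}),
    ]

    def to_chat_example(text, fields):
        # One training example: system instruction, the raw input,
        # and the exact JSON we want the model to emit.
        return {
            "messages": [
                {"role": "system",
                 "content": "Extract the invoice fields as JSON."},
                {"role": "user", "content": text},
                {"role": "assistant", "content": json.dumps(fields)},
            ]
        }

    # One JSON object per line, the usual fine-tuning input format.
    with open("train.jsonl", "w") as f:
        for text, fields in samples:
            f.write(json.dumps(to_chat_example(text, fields)) + "\n")
    ```

    The point of the narrow, repetitive format is exactly what the comment above describes: the model stops needing clever prompts because the task is baked into the weights.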

  • ccbaddB · 10 months ago

    For me it’s just censorship and privacy. API costs may become an issue too once we get more apps.