This is purely out of curiosity, but if anybody has some insights I’d love to hear them.

I am running 70B Q4 models on my M1 Max MacBook Pro (10 CPU cores, 32 GPU cores, 64 GB RAM). The lid is closed because I have an external 4K monitor attached via USB-C, so the built-in display doesn’t draw any power.

I am using both llama.cpp and LM Studio, and in both cases I run the LLMs with Metal acceleration.

Now, when running the LLM, I notice that according to iStat Menus my MacBook is drawing between 95 and 110W 😮

(The fans get loud quickly, just like in the good old Intel days. But it seems to be able to sustain this.)

But how is that possible?

Where is that power draw coming from? The GPU alone maxes out at about 45W, and the CPU at around 30W (I forgot the exact value), but the CPU isn’t even used much: in the screenshot it pulls a meager ~12W. So that’s a total of ~57W for CPU+GPU combined. Where do the other 40-50W go?
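To make the gap concrete, here is a minimal sketch of the arithmetic, using only the figures quoted in this post. The helper name `unaccounted` is made up for illustration, and the 45W GPU ceiling is the number recalled above, not a measured value.

```python
# Figures quoted in the post, in watts.
GPU_MAX_W = 45                # GPU ceiling as recalled in the post (an assumption)
CPU_OBSERVED_W = 12           # CPU draw read off the screenshot
TOTAL_REPORTED_W = (95, 110)  # range reported by iStat Menus


def unaccounted(total_w, *known_w):
    """Return the part of total_w not explained by the known component draws."""
    return total_w - sum(known_w)


accounted = GPU_MAX_W + CPU_OBSERVED_W
gap_low = unaccounted(TOTAL_REPORTED_W[0], GPU_MAX_W, CPU_OBSERVED_W)
gap_high = unaccounted(TOTAL_REPORTED_W[1], GPU_MAX_W, CPU_OBSERVED_W)

print(f"CPU+GPU: {accounted} W, unexplained: {gap_low}-{gap_high} W")
```

With these inputs the unexplained part comes out at 38 to 53W, which is the amount that would have to come from everything the CPU and GPU readings do not cover.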

I know there are lots of other components here: RAM (probably single-digit power draw?), fans, memory controller, etc. But we are talking about a large chunk of power.

Does anybody know? :)

https://preview.redd.it/6xxet64ash0c1.png?width=2869&format=png&auto=webp&s=ca3a1f416b9f2764e7143d262a5540fb2d02fa44

  • Herr_DrosselmeyerB
    1 year ago

    Under full load, and if thermals allow it, that machine can draw up to 120W from the wall. Likely the tool isn’t reading the SoC power draw correctly.

    • k_michaelOPB
      1 year ago

      Hm, you are right, I do also remember the AnandTech article on M1 Max power draw. Maybe the tool really isn’t reading the draw correctly 🤔 It’s still interesting though: if I run a 3D game on my MBP, it draws maybe 65-70W under full load. The LLM must be using some component that the 3D game isn’t 🤷‍♂️

      • mcmoose1900B
        1 year ago

        “The LLM must be using some component that the 3D game isn’t 🤷‍♂️”

        The 3D game is probably throttling itself with vsync, or just utilizing the GPU less.

        And yeah, as others said, package power can be much higher than what the cores/iGPU report. On the M-series, that may even include the RAM itself. This is true on AMD and Intel hardware as well.

  • Infamous_Charge2666B
    1 year ago

    Training models on a laptop is counterproductive. It will eventually wear out your battery and may damage the laptop. A laptop doesn’t have the airflow to run intensive workloads for long periods of time. Apple silicon runs cooler, but you’ll eventually realize you are better off building a server and using the laptop to log in remotely for training or inference, or renting online pods.

    • k_michaelOPB
      1 year ago

      I’m not training, just running inference for fun every now and then. This question was mostly just for curiosity. But thank you, you are right!