With the proof of concept done and users able to get over 180 GB/s on a PC with AMD’s 3D V-Cache, it sure would be nice if we could figure out a way to use that bandwidth for CPU-based inference. I think it only worked on Windows, but if that is the case we should be able to come up with a way to do it under Linux too.

  • mcmoose1900B
    10 months ago

    There are actually TSVs (through-silicon vias) for 3D V-Cache on the AMD 7900 series, but AMD doesn’t use them, presumably because the stacked cache would make the chip run hotter, so they’d have to downclock it.

    But I think it would be a great candidate for an ML card. Not for directly accelerating models, but for keeping any kind of intermediate calculations in cache, so that all the RAM bandwidth is preserved for model weights.
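
    To put rough numbers on that idea, here is a back-of-the-envelope sketch in Python. It assumes a very crude model that is not from the thread: generating a token streams every weight from RAM once, and any intermediate data that does not fit in cache gets written to RAM and read back once. The function names and all the figures (4 GB of weights, 60 GB/s of RAM bandwidth, 200 MB of intermediates, the two cache sizes) are hypothetical, picked only to show the shape of the argument.

```python
def spilled_traffic(intermediate_bytes: float, cache_bytes: float) -> float:
    """RAM traffic caused by intermediates that overflow the cache.

    Assumption: overflowing bytes are written out once and read back once.
    """
    overflow = max(0.0, intermediate_bytes - cache_bytes)
    return 2 * overflow


def max_tokens_per_second(weight_bytes: float,
                          ram_bandwidth: float,
                          intermediate_bytes: float,
                          cache_bytes: float) -> float:
    """Bandwidth-bound ceiling on tokens/s under the crude model above."""
    bytes_per_token = weight_bytes + spilled_traffic(intermediate_bytes, cache_bytes)
    return ram_bandwidth / bytes_per_token


# Hypothetical figures, for illustration only.
WEIGHTS = 4e9          # 4 GB of model weights
RAM_BW = 60e9          # 60 GB/s of RAM bandwidth
INTERMEDIATES = 200e6  # 200 MB of intermediate data per token

small_cache = max_tokens_per_second(WEIGHTS, RAM_BW, INTERMEDIATES, 32e6)
large_cache = max_tokens_per_second(WEIGHTS, RAM_BW, INTERMEDIATES, 256e6)

print(f"small cache: {small_cache:.2f} tokens/s ceiling")
print(f"large cache: {large_cache:.2f} tokens/s ceiling")
```

    Once the intermediates fit entirely in cache, the ceiling is just RAM bandwidth divided by weight size, which is the "preserve all the RAM bandwidth for model weights" case. Real workloads will not follow this model exactly, so treat it as an illustration rather than a prediction.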