What are you using? I’m kind of unsure what the right way is here: compile models to WebGPU, or use safetensors/GGUF and load them directly in a wasm-based inference runtime?