I have a Stealth 15M laptop with 16 GB of RAM and an RTX 3060 with 6 GB of VRAM. Can it run 13B models decently well? I'm pretty new to LLM stuff, and so far I can only get it to generate around 2-3 tokens per second, which feels pretty slow. Is there any way I can bump that to 5+ tokens per second, or is 2-3 tokens per second the limit of my laptop?