I have been trying to get open-source models to work with LangChain tools. So far the only model that has worked is Llama 2 70b Q4, following James Briggs' tutorial. Both Llama 2 13b and Mistral 7b Instruct call the tool correctly and observe the answer, but then return an empty string as the final output, whereas Llama 2 70b returns something like “It looks like the answer is X”.
I want to experiment with Qwen 14b to see whether it works with LangChain tools, since it is a relatively small model that may be more efficient to run than Llama 2 70b. I read on the GitHub page for Qwen 14b that it was trained specifically for tool usage, so it seems like one of the most promising candidates. There has also been quite a lot of positive sentiment about it on this sub.
When I try to load Qwen 14b on my M1 Mac, I get an error related to auto-gptq. When I tried to install auto-gptq with pip, the install failed with an error that mentions CUDA. Does auto-gptq work on macOS, or does it require CUDA? Is there any way to get some version of Qwen 14b to run on macOS?
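In case it helps with diagnosing this, here is a minimal sketch of how I would report my environment. It uses only the Python standard library. `describe_environment` is a name made up for this post, and checking for `nvcc` on the PATH is only a rough proxy I am assuming for whether a CUDA toolkit is present, not something taken from the auto-gptq docs.

```python
import platform
import shutil


def describe_environment():
    """Collect basic facts about this machine relevant to a CUDA-related install error."""
    system = platform.system()    # "Darwin" on macOS
    machine = platform.machine()  # "arm64" on Apple Silicon (may differ under Rosetta)
    return {
        "system": system,
        "machine": machine,
        "python": platform.python_version(),
        "is_apple_silicon": system == "Darwin" and machine == "arm64",
        # nvcc is the CUDA compiler; if it is not on PATH, a CUDA toolkit is
        # probably not installed (assumption: rough proxy only).
        "nvcc_on_path": shutil.which("nvcc") is not None,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

I can paste the output of this if it would help anyone answer.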
Has anyone experimented with Qwen 14b and LangChain tool usage?
Does anyone have suggestions for models smaller than Llama 2 70b that might work for LangChain tool usage?