I was hoping to run some context-window retrieval testing on open-source long-context models such as Yarn-Mistral-128k, but I'm only working with a 16GB Mac M2. Does anyone have experience with inference on such a setup?

I have an automated evaluation script that generates various contexts and retrieval prompts, iterating over context lengths. I was hoping to call the model iteratively from this script; what would be your preferred method to achieve this? llama.cpp? oobabooga? Anything else?
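
To give an idea, something like this is what I'm imagining, just as a sketch, assuming llama-cpp-python with a GGUF quant of the model (`make_context` / `needle_prompt` stand in for my own generation code, and the filename and context sizes are placeholders):

```python
from llama_cpp import Llama

llm = Llama(
    model_path="yarn-mistral-7b-128k.Q4_K_M.gguf",  # hypothetical quant filename
    n_ctx=32768,       # whatever context actually fits in 16GB of unified memory
    n_gpu_layers=-1,   # offload all layers to Metal on the M2
)

for ctx_len in [2048, 4096, 8192, 16384, 32768]:
    context = make_context(ctx_len)   # filler text with a "needle" fact inserted
    prompt = needle_prompt(context)   # retrieval question targeting the needle
    out = llm(prompt, max_tokens=64, temperature=0.0)
    print(ctx_len, out["choices"][0]["text"].strip())
```

Mainly I'm wondering whether something along these lines is feasible on this hardware, or whether there's a better way to drive the model from a script.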