• 1 Post
  • 5 Comments
Joined 1 year ago
cake
Cake day: November 17th, 2023

help-circle



  • I don’t understand the motivation behind this. It’s not like we need models that are almost as good at things computers are excellent at, while using orders of magnitude more resources. It would be way more useful to train tiny models to predict when a calculator should be used

    I’m 100% on the fact that we should use calculators directly instead of LLMs for Math. I mean we humans also use calculators. But the thing is, many claim LLMs cannot do Math etc. This project is just to prove them wrong that LLMs when learnt the process especially in case of Math can do it. I mean if we consider the LLM to be the brain, it should ideally be able to master Math similar to how human brain does right

    Fine, you’ve ran an experiment out of curiosity and you got the result, but why would you want to finetune more language models on this?

    I mentioned this in Github Repo but missed out in the Reddit post. But here is the rationale: When initially tested using incontext learning for Gpt4, the model was able to follow the procedure showed in the 1shot prompting but it was sometimes failing with the overall result due to OpenAI’s addition technique. And given this is a procedural technique, doing a n-shot prompting would lead to even more tokens. Given these limitations, it was evident that finetuning would solve these issue. And also the model was able to quickly learn the procedure and the overall training and validation loss of model went close to 0 within 0.1 epochs