As the title says, I’m curious whether grokking has been proven to happen with LLMs. Could it be the case with GPT-4?