[R]eading List for Andrej Karpathy’s “Busy person’s intro to Large Language Models” Video

FallMindless3563 · 1 year ago

You can certainly combine all the tasks and datasets into a single instruction fine-tuning dataset. You would then keep a separate dataset for the reinforcement learning stage, where the model learns human preferences.
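Here is a rough sketch of what that split could look like in plain Python. Everything in it is illustrative: the task names, the example records, the `build_sft_dataset` helper, and the field names (`instruction`, `input`, `output`, `prompt`, `chosen`, `rejected`) are assumptions for the sake of the example, not a required schema or any particular library's format.

```python
import random

# Hypothetical per-task datasets; the records are invented for illustration.
summarization = [
    {"instruction": "Summarize the text.",
     "input": "The cat sat on the mat all day.",
     "output": "A cat stayed on a mat."},
]
translation = [
    {"instruction": "Translate to French.",
     "input": "Good morning",
     "output": "Bonjour"},
]
question_answering = [
    {"instruction": "Answer the question.",
     "input": "What is 2 + 2?",
     "output": "4"},
]


def build_sft_dataset(task_datasets, seed=0):
    """Merge per-task examples into one shuffled instruction fine-tuning set."""
    merged = []
    for task_name, examples in task_datasets.items():
        for example in examples:
            # Tag each record with its source task so the mix can be inspected.
            merged.append({"task": task_name, **example})
    # Shuffle with a fixed seed so tasks are interleaved reproducibly.
    random.Random(seed).shuffle(merged)
    return merged


# Dataset 1: every task combined into a single instruction fine-tuning set.
sft_dataset = build_sft_dataset({
    "summarization": summarization,
    "translation": translation,
    "question_answering": question_answering,
})

# Dataset 2: kept separate, used only for the preference-learning stage.
# Each record pairs a prompt with a preferred and a dispreferred response.
preference_dataset = [
    {"prompt": "Explain what a neural network is.",
     "chosen": "A neural network is a function built from layers of weighted sums and nonlinearities.",
     "rejected": "It is a computer brain."},
]
```

The point is just that the two datasets have different shapes: the first holds single target outputs per prompt, the second holds comparisons between outputs.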