Hello

I’m using Axolotl to fine-tune meta-llama/Llama-2-13b-chat-hf. How should I choose the value for warmup_steps and for val_set_size in the config yaml file of Axolotl? In the example config files 10 warmup steps and a val set size of 0.05 is used but others also used 100 warm up steps and 0.01 or 0.02 for val set size. I have a dataset with around 3800 samples.
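For reference, both settings are top-level keys in the Axolotl config yaml. A minimal fragment (values shown are the example-config defaults mentioned above, not a recommendation):

```yaml
# Axolotl config fragment - keys as used in the example configs
base_model: meta-llama/Llama-2-13b-chat-hf
warmup_steps: 10     # learning-rate warmup steps
val_set_size: 0.05   # fraction of the dataset held out for evaluation
```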

  • FPhamB
    1 year ago

    If your epoch is only 50 steps, then you are not going to use 100 warmup steps.

    In the Training Pro extension I use 0.1 of the total steps for warmup, capped at 100 (there isn’t much point going higher; after 100 steps you should have primed most of the weights).

    So if you have 3800 samples, which is a ton, 100 warmup steps is as good a value as any.
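    The heuristic above can be sketched in plain Python (the function name and the batch-size/epoch numbers are illustrative assumptions, not Axolotl API):

    ```python
    def pick_warmup_steps(total_steps: int, ratio: float = 0.1, cap: int = 100) -> int:
        # 10% of total optimizer steps, capped at 100, as described above
        return min(cap, max(1, int(total_steps * ratio)))

    # e.g. ~3800 samples, batch size 8, 3 epochs -> 1425 optimizer steps
    total_steps = (3800 // 8) * 3
    print(pick_warmup_steps(total_steps))  # -> 100 (the cap kicks in)
    print(pick_warmup_steps(50))           # -> 5 (a 50-step run warms up briefly)
    ```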

    val_set_size is the fraction of your data held out as an evaluation set. First decide whether you want evaluation data at all (some types of training have no reason to use it, as it will not evaluate anything useful). Again, with a big dataset 0.04 is fine. With a small dataset, 0.04 may create just 1 evaluation sample - in that case you are far better off not having ANY evaluation dataset.
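    A quick back-of-the-envelope check of that arithmetic (plain Python; the function name is illustrative):

    ```python
    def eval_split_size(dataset_size: int, val_set_size: float) -> int:
        # number of samples carved off for evaluation
        return int(dataset_size * val_set_size)

    print(eval_split_size(3800, 0.04))  # -> 152, a useful evaluation set
    print(eval_split_size(30, 0.04))    # -> 1, better to skip evaluation entirely
    ```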