Dry_Long3157B to LocalLLaMA@poweruser.forumEnglish · 3 years ago

Training LLMs on fewer epochs

I was going through a paper called MILAN, a pre-training method for teaching a model good visual representations, and one thing that struck me is the large number of epochs these models are trained for (see image), even though we want the model to generalize well. So I'm curious why even base LLMs are trained with only a low epoch count.

TIA.

https://preview.redd.it/un1mdjoodx2c1.png?width=1312&format=png&auto=webp&s=2f80e328b05c3aee00a32c1e1ee8289810d8ddf0
