[P] Versioning code & large models together in GitHub

semicausalB to Machine Learning@academy.garden · English · 2 years ago

Hey r/MachineLearning!

Last year, u/rajatarya showcased how we scaled Git to handle large datasets. One piece of feedback we kept getting is that people didn’t want to move their source code over to XetHub.

So we built a GitHub app & integration that lets you continue storing code in GitHub while XetHub handles the large datasets & models.

https://about.xethub.com/blog/xetdata-scale-github-repos-100-tb

We’ve enjoyed using it to host open-source LLMs like Llama 2 and Mistral with our fine-tuning code side by side.

The whole thing is in beta so we’re eager for any feedback you have to offer :)

