I have a research background in ML, but I’ve never actually developed a model because all of my work was theoretical. I got lucky at the interview stage for this role because my research impressed them. My project involves fine-tuning a GPT-3 model for a specific task and hosting it on a website. Does anyone have tips on how to learn what I need to know to do this? Also, what should I consider when curating a custom dataset for fine-tuning? I really want this to be a learning experience.

  • JDubbleu@programming.dev
    10 months ago

    I don’t have any resources as I’m a SWE, but I do have some advice.

    Ask your mentor and the other engineers for help. Seriously, I’m a software engineer (non-ML, but ML teams operate similarly to SWE teams), and we don’t expect interns to know much of anything. We understand they’re going to need quite a bit of hand-holding. I know I did. It’s okay! That’s how we all learn, and being able to ask for help when you need it is one of the most vital skills in software. The absolute worst thing you could do is struggle through the whole internship without getting the help you need.

    All you gotta do is say, “Hey, I’m struggling with fine-tuning this model for my project. My research and academic experience have all been extremely theoretical, and I never got the chance to do much practical tuning. Do you have any suggestions given where I’m at?” Obviously, provide a lot of extra context about where you’re at and what you’re struggling with, but you get the point. They’re not gonna fire you, so don’t worry about that (literally every intern’s worst fear), and they want you to learn! Asking will reflect well on you too, since it shows that 1) you know your shortcomings and 2) you are actively working to overcome them. If you can do both of those things, you’re already ahead of most people.

    Good luck!