Howdy,
I’m a backend developer, and management recently asked me to train an LLM on our company data. I’m in a bit over my head here, so I figured I’d ask for high-level advice rather than keep going down Google rabbit holes.
What I’ve tried so far:
- I spun up some GPU instances on AWS. I couldn’t get Llama to work at all, except through gpt4all, which wasn’t very performant and makes a network call to a GitHub page for a list of models.
- I tried following a Google Cloud tutorial here. It didn’t work in their Colab notebook, so I gave up on it; if their own documentation didn’t work, it didn’t seem promising.
Any advice is appreciated!