I am a beginner to ML and cloud. I am planning to create web APIs for some pre-trained models I got from Hugging Face. My code needs to take in text from other endpoints and return classification data. I want to deploy to Google Cloud, but I am unable to zero in on the best way. My options:

  1. Dockerize. But then how do I scale it to use GPUs in the future?
  2. Colab notebook. I was able to get the code working, but then how do I create a web endpoint? I am missing some knowledge here.
  3. Should I just wrap my code in a Flask endpoint and upload it to an instance? Need some advice…
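
To make option 3 concrete, this is roughly what I have in mind. It is only a minimal sketch, assuming Flask and the `pipeline` helper from the transformers library. The `/classify` route, the `text` field, and all function names are placeholders I made up, and the default text-classification pipeline stands in for my actual models.

```python
from typing import Any, Callable, Dict, List, Tuple

# A classifier takes one string and returns a list of {"label", "score"} dicts,
# which is the shape a transformers text-classification pipeline returns.
Classifier = Callable[[str], List[Dict[str, Any]]]


def classify_payload(payload: Any, classifier: Classifier) -> Tuple[Dict[str, Any], int]:
    """Validate the JSON payload and run the classifier.

    Returns a (response body, HTTP status code) pair.
    """
    if not isinstance(payload, dict) or not isinstance(payload.get("text"), str):
        return {"error": 'expected JSON like {"text": "..."}'}, 400
    text = payload["text"].strip()
    if not text:
        return {"error": "text must not be empty"}, 400
    top = classifier(text)[0]
    return {"label": top["label"], "score": float(top["score"])}, 200


def load_hf_classifier() -> Classifier:
    """Load the model once at startup (assumption: the default pipeline model)."""
    # Imported here so the helper above can be used without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification")


def create_app(classifier: Classifier):
    """Build a Flask app exposing POST /classify (hypothetical route name)."""
    # Imported here so the helper above can be tested without Flask installed.
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.route("/classify", methods=["POST"])
    def classify():
        # silent=True yields None instead of raising on a missing or invalid JSON body.
        body, status = classify_payload(request.get_json(silent=True), classifier)
        return jsonify(body), status

    return app
```

I would then start it with something like `create_app(load_hf_classifier()).run(host="0.0.0.0", port=8080)` and POST `{"text": "some input"}` to `/classify`. The same file could presumably go into a Docker image for option 1, so the two options do not seem mutually exclusive.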