Predibase (http://predibase.com), also mentioned in the article, is a platform designed for exactly that. It has "repos" for fine-tuning multiple models, comparing their performance, and keeping things organized. It also lets you query any of the fine-tuned models on the fly from a single GPU with multi-LoRA serving. (Predibase founder here)