
This is what containers solve. Don't waste time manually installing things. Store state in a database via the app on a different host.


Speaking as someone who has encountered similar difficulties, this response has strong 'Draw the rest of the owl' vibes


Speaking as someone who has solved these difficulties hundreds of times: unlike "draw the rest of the owl", this advice does tell you the specific things to google for detailed examples and tutorials on how millions of others have sidestepped these same issues.


Yep... you spend hours messing around with docker containers and debugging all the weird build errors.

I'm less familiar with storing data in a DB (at least for ML hosting), but I'd imagine it would add overhead compared to accessing files on disk.

You also have to deal with hosting a db and configuring the schema.


You "spend hours messing around" with everything you don't know or understand at first. One could say the same about writing the software itself. At its core Dockerfiles are just shell scripts with worse syntax, so it's not really that much more to learn. Once you get it done once, you don't have to screw around with it anymore, and you have it on any box you want in seconds.

In either case you have to spend hours screwing around with your environment. If those hours result in a Dockerfile, then it's the last time. If they don't, then it's each time you want it on a new host (which, as was correctly pointed out, is a pain in the ass).
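To make that concrete, a minimal Dockerfile for this kind of project really is little more than the shell commands you'd run by hand on a fresh box. This is a rough sketch only; the base image, packages, and serve.py entrypoint are placeholders, not a known-good recipe:

    # Hypothetical sketch: pin the environment for a small ML side project.
    FROM python:3.11-slim

    # The same apt/pip commands you'd otherwise rerun by hand on each new host.
    RUN apt-get update && apt-get install -y --no-install-recommends git \
        && rm -rf /var/lib/apt/lists/*
    RUN pip install --no-cache-dir torch diffusers transformers

    WORKDIR /app
    COPY . /app

    # serve.py is a placeholder for whatever actually runs the model.
    CMD ["python", "serve.py"]

Build it once with docker build, and docker run gives you the same environment on any host that has Docker installed.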

Storing data in a database vs. in files on disk is application development 101 and pretty much a required skill, period. Almost all applications revolve around storing some kind of state, and, as was noted, you can't reasonably expect that state to persist on the app server without additional ops headaches.

Plenty of providers will host a DB for you without you having to think about it. A schema is only required if you use a structured DB (which is advisable), and it doesn't take that long to set up.
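The state-in-a-DB part is similarly small in code. A minimal sketch, assuming a managed Postgres and the psycopg2 driver; the DATABASE_URL variable and the app_state table are made up for illustration:

    # Sketch only: keep app state in a hosted Postgres instead of local files.
    import os
    import psycopg2

    # e.g. a connection string from a managed Postgres provider (hypothetical).
    conn = psycopg2.connect(os.environ["DATABASE_URL"])

    with conn, conn.cursor() as cur:
        # A single key/value table already covers a lot of simple app state.
        cur.execute("""
            CREATE TABLE IF NOT EXISTS app_state (
                key   TEXT PRIMARY KEY,
                value TEXT NOT NULL
            )
        """)
        cur.execute(
            "INSERT INTO app_state (key, value) VALUES (%s, %s) "
            "ON CONFLICT (key) DO UPDATE SET value = EXCLUDED.value",
            ("last_job_id", "1234"),
        )

Because the state lives off the app host, blowing away the container and rebuilding it loses nothing.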


I applaud your experience, but honestly I agree with parent: knowledge acquisition for a side project may not be the best use of their time, especially if it significantly impedes actually launching/finishing a first iteration.

It's a similar situation for most apps/services/startup ideas: you don't necessarily need a planet-scale solution in the beginning. Containers are great and solve lots of problems, but they are not a panacea and come with their own drawbacks. Anecdotally, I once wanted to build a small local 3-node Kubernetes cluster on my beefy hypervisor. By the time I'd learned the ins and outs of Kubernetes networking, I had lost momentum, and it didn't end up giving me what I wanted anyway. Educational, sure, but in the end not useful to me.


I'm having trouble imagining what data I would store in a database as opposed to a filesystem if my goal is to experiment with large models like Stable Diffusion.


I would take GP's kind of dogmatic jibber jabber with a grain of salt. There is an unspoken and timeless elegance to the simplicity of running a program from a folder, with files as state.


Isn't the terminfo db famous for this filesystem-as-DB approach? Files vs. DB: I say do whatever works for you. There is certainly more overhead in the DB route.


> Storing data in a database vs in files on disk is like application development 101

It's OK until you're dealing with, say, 130GiB of tensors: effectively one big binary blob that needs to be mostly in VRAM somehow.

I really don't want to read 130GiB of blobs from a database all the time.
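For what it's worth, the usual way to handle this is to leave the weights as files on disk and memory-map them, so you only ever page in the bytes you touch. A rough sketch; the filename, dtype, and slice size are made up:

    # Sketch: memory-map a large weights file from disk rather than pulling it
    # through a database. Filename, dtype and shape here are hypothetical.
    import numpy as np

    weights = np.memmap("model_weights.bin", dtype=np.float16, mode="r")

    # The mapping is lazy: "opening" a 130GiB file is cheap, and only the
    # slices you actually read get faulted in from disk.
    first_chunk = np.array(weights[:1_000_000])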


IMO tensors & other large binary blobs are fairly edge-casey. You might as well treat them like video files, and video file servers don't store large videos in a database either. Most devs just don't have much experience managing large binary blobs.
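Sketching that video-server pattern: the database holds metadata and a path, and the blob itself stays on the filesystem (or object storage). SQLite, the table, and the example row here are just for illustration:

    # Sketch of the metadata-in-DB, blob-on-disk pattern. Table, columns and
    # the example row are hypothetical.
    import sqlite3

    db = sqlite3.connect("models.db")
    db.execute("""
        CREATE TABLE IF NOT EXISTS model_files (
            name     TEXT PRIMARY KEY,
            path     TEXT NOT NULL,   -- where the blob actually lives
            size_gib REAL
        )
    """)
    db.execute(
        "INSERT OR REPLACE INTO model_files VALUES (?, ?, ?)",
        ("example-model", "/data/weights/example-model.safetensors", 130.0),
    )
    db.commit()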


'Shell scripts with worse syntax', lol. I wish shell could emulate Alpine on a non-Linux box. A 'shell script with worse syntax' for configuring a VM would be closer to a qemu cloud-init file.


Well, then you're wasting time managing your containers. Have you ever used k8s? It's a full-time job lol


I don't think anyone is suggesting k8s for running an AI model as a side project on a single machine.


> a side project on a single machine

Sorry, I don't see how containers will help here.


I used to think that the 300MB containers I shipped to prod were reasonably unoptimized.

Those AMD ROCm containers are like 14GiB compressed.



