
That is really a shame, because all I really want is a version of Craiyon that I can modify and run on my own hardware.

The amount of enjoyment I have derived from playing with Craiyon over the last two months is ridiculous.



IIRC Craiyon runs Dalle-mega. https://huggingface.co/dalle-mini/dalle-mega

Note: I think you need 16 GB of VRAM to run it.


You can run craiyon / dalle-mini on a card with 8GB of VRAM if you decrease batch size to 1 and skip the CLIP step. Takes about 7 sec to generate an image on a 3070.

I started with https://github.com/borisdayma/dalle-mini/blob/main/tools/inf... and pared it down.


Have you checked out MidJourney? Makes Craiyon look like crayons :P


Craiyon is free, whereas Midjourney is not. If you want MJ level quality, check out Disco Diffusion or go straight to Visions of Chaos, which runs just about every AI diffusion script in existence. The dev is very active and adds new features every couple days, such as recently the ability to train your own diffusion models, which I've been doing the last 3 days nonstop on my little 3060 Ti (8GB VRAM, which is barely sufficient to run at mostly default settings).


MidJourney does give you 25 minutes of free compute time though, which is enough to try it roughly 40 times.
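
For a rough sense of what that works out to per attempt, here is the arithmetic as a quick sketch. The 25 minutes and ~40 attempts are the figures above; the function name is made up for illustration, and the actual cost per image will depend on the settings used.

```python
def seconds_per_attempt(free_minutes: float, attempts: int) -> float:
    """Average compute time available per attempt, in seconds."""
    return free_minutes * 60 / attempts


# Figures from the comment: 25 free minutes, roughly 40 attempts.
print(seconds_per_attempt(25, 40))  # 37.5
```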

I've checked out Disco Diffusion but hadn't heard of Visions of Chaos, thanks. The biggest shortcoming of DD is that there's simply not yet a sufficiently trained model to produce stuff at the level of MidJourney or Craiyon.


Are they giving out trials without an invite now? I was invited by someone who was paying, became addicted by the end of my trial, subscribed for a month, then gave out invites to friends, some of whom also ended up paying - not a bad business model!

It was bad timing though: the day after I joined, I discovered Disco Diffusion and haven't stopped rendering since (roughly 10k images rendered, mostly for animations). It takes longer, and the results are often less realistic compared to Midjourney, Dall-E (1/2) or Stable Diffusion (which I've been toying with for a few weeks), but it's somehow much more satisfying to wait xx minutes for a render to complete on your own local PC, without having to use bots with 1000 other people in a channel spamming their prompts, and with TONS of parameters to play around with. I have a Google Drive full of docs from my own studies, comparing parameter values, models, prompts, etc.

I'm really looking forward to Stable Diffusion releasing their models; I know VoC will add those models as soon as they are available. On top of that, VoC has been adding support for diffusion models (of which I'm training my own), and new ones are added constantly as more and more people build models for e.g. pixel art, medieval style, monochromatic, etc.

Also the results vary drastically if you have enough VRAM to load more models e.g. a 3090 (24GB) or an A6000 (48GB). I've been saving money and waiting impatiently for 4090s to drop. Check out the Disco Diffusion or VoC Discord - people post their works in there and often you will see results that make you wonder if they're cheating ;)


Which are the best models the 3090 enables you to load?


I'd start with VITL14 and add one or more RN types. I personally like to use multiple VIT and RN models just to fill up VRAM. What exactly will fit depends on your output resolution and requires a lot of trial and error. I always have Task Manager open to monitor VRAM usage. In general, VIT = more realistic, RN = more artistic. It can take a lot of experimentation to find what exactly tickles your fancy. I constantly change which models I'm using depending on what I'm going for. This redditor did a nice comparison[0], and there are many more "studies" for which models to use - you can google around, there are new articles/studies being posted daily.
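
If you'd rather log VRAM than watch Task Manager, something like the following might work on an NVIDIA card. It's only a sketch: it assumes `nvidia-smi` is on your PATH, and the helper names (`parse_vram`, `query_vram`, `headroom_mib`) are made up for illustration, not part of Disco Diffusion or VoC.

```python
import subprocess
from typing import List, Tuple


def parse_vram(line: str) -> Tuple[int, int]:
    """Parse one 'used, total' CSV line (values in MiB) from nvidia-smi."""
    used, total = (int(field.strip()) for field in line.split(","))
    return used, total


def headroom_mib(used: int, total: int) -> int:
    """VRAM still free, in MiB."""
    return total - used


def query_vram() -> List[Tuple[int, int]]:
    """Return (used_mib, total_mib) for each GPU. Requires nvidia-smi."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [parse_vram(line) for line in out.splitlines() if line.strip()]


# Parsing a sample line (made-up numbers for an 8 GB card):
used, total = parse_vram("7321, 8192")
print(headroom_mib(used, total))  # 871
```

Calling `query_vram()` between renders, after adding or removing a model, gives a rough idea of how much room is left before the next out-of-memory error.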

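You can also try disabling use_checkpoints if you have extra VRAM, since it will render a bit faster but use more VRAM. AFAIK this is gradient checkpointing: with it enabled, intermediate activations are recomputed when needed instead of being kept in memory, which saves VRAM at the cost of speed.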

When/if you get bored, try disabling use_secondary_models which will use a lot more VRAM but can deliver results on a completely different level. You will likely struggle for a few days figuring out which parameters to tweak to get good results (e.g. tv_scale, sat_scale, etc, which are otherwise AFAIK ignored).

In any case I recommend reading A Traveler's Guide to the Latent Space[1], which I call "The Bible" since it covers so many topics, has links to various studies, and will keep you busy reading for months ;)

Also check out the Discord for Disco Diffusion and Visions of Chaos, as you can read endless tips and tricks to getting amazing results.

Have fun! :)

[0] https://www.reddit.com/r/DiscoDiffusion/comments/t7p4bi/seas...

[1] https://sweet-hall-e72.notion.site/A-Traveler-s-Guide-to-the...


I think I might disagree with your assessment of DD.

I can't use it, but check out this guy's work. Incredible detail:

https://instagram.com/textrnr



