Run Database in GitHub Actions, Persisting Data to S3, and Access It Publicly (wesql.io)
72 points by earayu on Dec 12, 2024 | 45 comments


Is that really cheaper than just using a $5/month VPS, which offers compute, storage, and everything else?

The article says this about VPSs:

    You often end up paying for resources even when you're not using them.

But how is that different with S3? They also do not store the data for free.

And you can delete a VPS via API just like you can delete an S3 bucket.


GitHub Actions is completely free, and you can always use free S3-compatible storage like Cloudflare R2, making the entire setup cost nothing.

VPS has two main drawbacks:

- Many don't guarantee persistent disk storage.
- It's not ideal for CI/CD scenarios where you need ephemeral databases for testing.

But you're right - if you need a simple, always-on database, a $5 VPS might be a good choice.


Wait, this is news to me - which VPS providers do not have persistent data storage? Are you thinking of Heroku-like deployments? I feel like every VPS provider I've encountered always listed storage as a feature?


I was thinking about EC2's default instance storage - it's ephemeral and gets wiped when you restart or stop the instance. Without paying for EBS volumes, EC2's storage is non-persistent by design.


The d variant machines you mean?


You can't rely on VPS disk - backups, data retention, and recovery are all up to you in case of node failure. There are other, much more expensive and much slower products (external networked volumes) that do offer guarantees, but that's an additional charge.


> backups

Just to point out, if the data is important you'll want backups anyway. Even with S3.

Just in case. :D


Why would you want backups for S3? Were there data loss incidents?


Well, Amazon might fail as a company at some point and then all your data will be gone. Theoretically.

Much more likely, though, is that you, or some sysadmin at your company, or even some user will accidentally hit the "delete" button on something important, and then without a backup, you can't get it back. Which is honestly the thing that people usually need their backups for, anyway. This is what most "data loss incidents" are: people just messing up and deleting things they shouldn't have. Wetware is much more prone to failure than hardware, after all.


There are ways to protect against delete or overwrite, for example by using versions.


Does that work when the sysadmin fat fingers the deletion of a bucket or account?


> Why would you want backups for S3?

In case something goes wrong, i.e., your account unexpectedly has a problem, or they do indeed have a data loss or corruption problem, etc.


Most VPS providers offer a backup solution.


Yes, and you never use those, because if the VPS company fails, your backups are gone. So use the backup services of a second (and third) company if you value your data.


As I said, that's an additional charge.


GitHub Actions is not free; you get a limited number of free minutes. For public repositories, your jobs usually sit far back in the queue if you trigger a lot of them.


> GitHub Actions is completely free

I mean kinda? It's free for public repositories, but free doesn't mean free for anything. The use of GitHub Actions, like most things in life, has terms of service[1]. This use case arguably breaks the term "for example, don't use Actions [...] as part of a serverless application". If you start using this for demos, you'd probably also be breaking "You may only access and use GitHub Actions to develop and test your application(s)".

It's up to GitHub if they choose to enforce any of these terms. I just want to point out that there are limits to "free".

[1]: https://docs.github.com/en/site-policy/github-terms/github-t...


Yeah, you are right.

> for example, don't use Actions as a content delivery network or as part of a serverless application, but a low benefit Action could be ok if it’s also low burden

As the WeSQL article states, this approach is not recommended for:

- Long-term database hosting or production workloads.
- Maintaining an always-on public database endpoint.
- Circumventing GitHub Actions usage policies.


If you want to run a database on a VPS with an SSD disk, you'll need at least two replicas for data redundancy, which would cost about $10.


The article scopes this as "for development, testing, demos, or short-lived workloads". Do you really need HA/replication for any of those workloads?


But when the VPS fails, you lose all your data.


You probably need to define "fails" for me. I've never had a VPS straight up "fail" before, as in a hardware level, can't access the data, dead. You may temporarily lose access to the data, but I've always been able to recover said data.


VPS backups are $1 for the $5 1GB VPS


$5 a month gets you 500GB of storage on R2. Free tier of 10GB is more than enough for CI/CD.


I get why these posts are popular but I always hate them. It's this sort of activity that results in free tiers being limited and removed.


Just use it for CI/CD.


Free tiers are meant to be abused, because they should not be provided for free, and this is the market reacting to remove it.


Imagine a restaurant offering free samples or complimentary dishes to their customers. It helps in product discovery, and indirectly in adoption of other products that they pay for.

Then a person comes in and starts gobbling every sample, because they believe "free tiers are meant to be abused".


> Then a person comes in and starts gobbling every sample, because they believe "free tiers are meant to be abused".

That’s what happens in practice, even if they don’t believe this.


In practice, they have someone giving out the free samples.

Also, social shame generally applies and people conform to social norms.


Then you kick them out. It's not very hard.


Of course, because having someone eating them all is the natural process; you need to kick the people out to prevent it.


People like you are the reason we can't have nice things.


I use an approach with GitHub Actions where, if I need small amounts of persistent data between runs, I use a filesystem based on a fresh branch in the same repository.

That branch is rebased down to $tip so it shares no commits, and the 'database' doesn't affect development in trunk.

Obviously this doesn't work for significant data analysis, but if it's just a question of 'what was the state at the last run' it's cheap and easy.
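
One way to read this, as a rough sketch: keep the state file as the only commit on an orphan branch that is force-pushed on every run, so it never shares history with trunk. The branch name `ci-state`, the file name `state.json`, and the helper functions are all hypothetical, and this may differ from the exact rebase workflow described above.

```python
import subprocess

STATE_BRANCH = "ci-state"  # hypothetical branch name
STATE_FILE = "state.json"  # hypothetical state file name


def load_state_commands(branch: str = STATE_BRANCH,
                        state_file: str = STATE_FILE) -> list[list[str]]:
    """Git commands that fetch the state branch and print the state file."""
    return [
        ["git", "fetch", "origin", branch],
        ["git", "show", f"FETCH_HEAD:{state_file}"],
    ]


def save_state_commands(branch: str = STATE_BRANCH,
                        state_file: str = STATE_FILE) -> list[list[str]]:
    """Git commands that publish state_file as a single parentless commit."""
    return [
        # Start a branch with no parent commit, so it shares no history.
        ["git", "checkout", "--orphan", branch],
        # Empty the index; working-tree files stay on disk, untracked.
        ["git", "rm", "-r", "--cached", "--quiet", "--ignore-unmatch", "."],
        ["git", "add", "--force", state_file],
        # Placeholder identity, since CI runners may have none configured.
        ["git", "-c", "user.name=ci", "-c", "user.email=ci@example.invalid",
         "commit", "--message", "Update CI state"],
        # Replace whatever the remote branch held before.
        ["git", "push", "--force", "origin", branch],
    ]


def run_all(commands: list[list[str]]) -> None:
    """Run each command in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

In a workflow step, `run_all(save_state_commands())` would be called after the job has written `state.json`. It assumes a fresh checkout where the branch does not yet exist locally and a token that is allowed to push.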


Thanks, that's a cool idea.


  > Here's how to abuse GitHub Actions to run a database

  > DISCLAIMER: DO NOT ABUSE GITHUB ACTIONS TO RUN A DATABASE!

Meanwhile, a cloud VPS is down to €3.60/mo.

Provisioning a VPS takes ~30-60 seconds, and bootstrapping one with cloud-init is an option at most providers, so you could even spin them up and down as you need. But honestly, the effort isn't worth it to save €3.60/mo.


Using it for CI/CD purposes does not violate GitHub's Terms of Service.


The benefits: you don't pay for owning the machine and disk, compute nodes are spun up only when needed, compute is no longer billed once your task completes, and your data is stored on cheaper S3.

These are the benefits brought by WeSQL, a new MySQL-compatible database built on S3.


You can persist data on Wasabi, which is similar to S3, and pay even less...


Or just use Docker Compose?


Sure. That's the fascinating aspect of WeSQL database - its compute-storage separation allows you to run the database anywhere while data persists in S3.

You can shut down the compute service and restart it anywhere else - the data will simply restore from S3.


This sounds really interesting, but the article's motivation is test databases.

Edit: suggest looking at this other post https://wesql.io/blog/every-database-will-be-rearchitectured...


Yep, GitHub Actions isn't meant for production deployment - it's a CI/CD service with a 6-hour limit per run.

But that's the cool thing about WeSQL - you can run it anywhere! Just fire it up like any other container. For production, throw it on ECS/EKS and you're good to go.


Yeah, I’m not following why this approach is radically different from just doing that.


next stop: run database inside notepad powered by blockchain



