CircleCI security alert: Rotate any secrets stored in CircleCI (circleci.com)
304 points by j_kao on Jan 5, 2023 | 84 comments


> We wanted to make you aware that we are currently investigating a security incident, and that our investigation is ongoing. We will provide you updates about this incident, and our response, as they become available. At this point, we are confident that there are no unauthorized actors active in our systems; however, out of an abundance of caution, we want to ensure that all customers take certain preventative measures to protect your data as well.

Is anyone else a little annoyed by the messaging here? I read it as: "We think something bad happened to your ultra-secret data, but we don't know, so we're asking teams to spend potentially hours or days fixing things while we aren't really able to tell you if your stuff was actually compromised."

What I find more troubling is: if they don't quite know what happened, or aren't telling us, and we do the work to change everything, how do they know it won't just happen again in the next day or so, with people still accessing our systems? Where are the details?

> At this point, we are confident that there are no unauthorized actors active in our systems.

Confident isn't really a good enough word to use here, in my opinion. We've just blocked Circle CI from all our systems for now until we hear more, and will likely start to move to another build system.

I know accidents happen, but this is likely the beginning of the end for our team's relationship with Circle CI. Trust has been broken.


> so we're asking teams to spend potentially hours or days fixing things

At the risk of sounding pedantic, this is why you have everything as IaC. These kinds of changes should not cost days. It should take mere minutes, or an hour tops, to change all your keys. It should be trivial, for cases just like this.


You can't use IaC to change third-party API keys. And woe unto any service that doesn't allow multiple keys because then you're looking at outages.



I get that you can manage the values in Circle, but you can't actually generate the values. I.e., if you have an API token to write to Salesforce, you have to go into the Salesforce admin and generate a new token. Pasting the value into the Circle UI or a Terraform descriptor is not the hard part. For lots of services, you can only have one key at a time, meaning that generating a new one invalidates the old one, which means you'd have an outage while you're pasting and deploying.


I fully agree. Our team just had to change one set of keys; other teams didn't follow best practices and are in a bad situation.

It's not Circle's fault that people didn't do things properly, but I do think they owe us a better explanation.


I can see you are bamboozled, but the writing has been on the wall for CircleCI for quite some time now.


Agreed, how was I supposed to know that Circle CI would lose all the secret keys? I mean I always knew it was a possibility and our team planned for it...but what are you actually talking about?


I'm curious what indicators you have that this would have been the case. This is not a comment made in snark; I'm genuinely curious, as a CircleCI user who did not see the writing on the wall.


Can you elaborate on where I would have seen the writing for this? What indicators did you see?


Layoffs and outages at Circle. Software supply chain attacks becoming more and more popular.


Outages at CircleCI have been common since the service launched.

Amazon is about to lay off 18k people. Circle isn't in a unique position, to my understanding. Did CircleCI lay off some group or set of really important and key personnel?

Software supply chain attacks affect everyone. Is there some way that Circle is more vulnerable to this type of attack?


I would imagine GitHub Actions and the ever-maturing CI tooling baked into cloud providers are rapidly eroding the market share of dedicated services like CircleCI.


Great reminder for folks to switch any AWS actions you perform from CI/CD to use OIDC role assumption instead of static IAM user credentials. Then even if an attacker stole all your secrets, they couldn't do anything in your AWS account.
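
A minimal sketch of what that role assumption looks like under the hood, in Python with boto3; the role ARN and the env var holding the CI-issued OIDC token are placeholders, so check your provider's docs for the real names:

    import os
    import boto3

    # The CI provider injects a short-lived, signed OIDC token into the job
    # (the variable name here is a placeholder).
    web_identity_token = os.environ["CI_OIDC_TOKEN"]

    sts = boto3.client("sts")
    creds = sts.assume_role_with_web_identity(
        RoleArn="arn:aws:iam::123456789012:role/ci-deploy-role",  # placeholder
        RoleSessionName="ci-deploy",
        WebIdentityToken=web_identity_token,
        DurationSeconds=900,  # keep sessions as short as the job allows
    )["Credentials"]

    # The resulting credentials expire on their own; there is nothing static to steal.
    s3 = boto3.client(
        "s3",
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )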


I recently did this for one of my GitHub repos which runs several test suites (cumulatively taking >1h). If your actions are slow, pay attention to the IAM role session duration. The maximum duration with role chaining is 1 hour.

In the end your credentials need to outlive your CI/CD actions.


Throwaway for reasons:

From experience, be careful and ensure you properly scope your OIDC connection. It's very easy to allow ANY GitHub repo with the proper OIDC connection bits (SA email, connector pool, etc.) to get an OIDC token, rather than what you expect, whether that's any repo in your private org or a specific single repository. As always, RTFM.
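
For the AWS flavor of this, a rough sketch of a trust policy that only lets one specific repo assume the role (account ID, org, repo, and role name are placeholders, and GitHub's public OIDC issuer is assumed). Dropping the "sub" condition is exactly the trap described above:

    import json
    import boto3

    # Without the "sub" condition, any repo that can get a token from the same
    # OIDC provider could assume this role.
    trust_policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {
                "Federated": "arn:aws:iam::123456789012:oidc-provider/token.actions.githubusercontent.com"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
                },
                "StringLike": {
                    "token.actions.githubusercontent.com:sub": "repo:my-org/my-repo:*"
                },
            },
        }],
    }

    iam = boto3.client("iam")
    iam.create_role(
        RoleName="ci-deploy-role",  # placeholder
        AssumeRolePolicyDocument=json.dumps(trust_policy),
    )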


I believe the max duration of an assumed role session is 12 hours, but this can be changed per-role.


For assuming one role it can be up to 12 hours. If you're doing role chaining like the parent mentioned (where the 1st assumed role then assumes a 2nd role) then the maximum session duration is 1 hour. AWS has this documented here:

https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_te...

> Role chaining limits your AWS CLI or AWS API role session to a maximum of one hour.
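
For the per-role part, a small sketch of bumping a role's cap with boto3 (the role name is a placeholder); the one-hour role-chaining limit still applies no matter what this is set to:

    import boto3

    iam = boto3.client("iam")

    # Raise the cap for direct assumption of this role. Chained sessions are
    # still limited to one hour by AWS regardless of this setting.
    iam.update_role(
        RoleName="ci-deploy-role",        # placeholder
        MaxSessionDuration=12 * 60 * 60,  # 12 hours, the maximum AWS allows
    )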


Can you elaborate, as someone with little AWS experience? Are OIDC based creds just more scoped? What makes them special?


To assume a role with OIDC you'd need to do it from the context of a specific CircleCI job run - getting access to the secrets of a particular CircleCI account alone would not be enough to authenticate to AWS (unlike when you use IAM user credentials).

Even if the attacker had access to env vars from running jobs (which include the signed token needed to do an OIDC role assumption), those tokens have a short expiry time. And even if an attacker stole such a token and performed a role assumption, that session can only be valid for a maximum of 12 hours in AWS, after which you know the attacker is out of your account.

It just significantly reduces, practically nullifies, anything that an attacker can do in your AWS account.


This only applies if the stolen credentials can’t create roles and can’t modify existing roles.


...and don't leave a payload behind, to maintain persistent access (unless I'm missing something?)


This is a good reminder to always follow least-permission best practices.


I’d add drift detection on everything IAM / SCP / Org to this list too.

A session token with only a few minutes validity can be enough for someone to make their access permanent.


I assume what they meant is having AWS accept short-lived OIDC tokens from Circle's OIDC provider, which in turn would generate them on demand when the CI is actually run. There'd be no secrets at rest and the attack surface would be (in principle) smaller.


While OIDC is a good option, at StepSecurity we are building an open-source project that allows using your MFA tokens for deployments in CI/CD. So far, it is implemented for GitHub Actions - https://github.com/step-security/wait-for-secrets. In this method, you get a link in the build log, click the link, and can enter credentials at run time, which are then used in the next step in the pipeline for deployment. So there are no persistent secrets stored in the CI/CD pipeline and no need for managing/rotating separate deployment credentials.


A bit of a shameless plug for a relevant Terraform module I made (specific to GitHub in this case): https://github.com/unfunco/terraform-aws-oidc-github


https://twitter.com/sanitybit/status/1610829345676996609

> I've been investigating the use of a @ThinkstCanary AWS token that was improperly accessed on December 27th and suspected as much.


Perhaps just unfortunate timing, but of note: this comes approximately a month after CircleCI reduced their staff by about 17%[1].

[1]: https://circleci.com/blog/ceo-jim-rose-email-to-circleci-emp...


Not even a month; only 14 days between your linked post (Dec 7) and when secrets might have been leaked ("starting from December 21, 2022 through today, January 4, 2023").


What's tricky is this is not the first interesting recent post from Rob; he previously posted "An Update on CircleCI Reliability" (Dec '22) [1] and "CircleCI remains secure; be vigilant and aware of phishing attempts for your credentials" (Nov '22) [2]. Overall, CircleCI has had a rough run of it lately.

[1] https://circleci.com/blog/an-update-on-circleci-reliability/

[2] https://circleci.com/blog/circleci-security-update/


Someone please correct me if I'm wrong... but there was a kerfuffle in 2017 about Circle using third-party JS which could be an attack vector: https://news.ycombinator.com/item?id=15442636

To give credence to this, a GitLabber spoke up in that thread and said it was a serious thing, and that they deliberately had no third-party stuff on their site for that reason.

And I just logged into Circle today and used the Safari network inspector to see what JS it loads... and there's still plenty of third-party stuff that I can see:

* Amplitude
* Segment
* cci-growth-utils
* Statuspage
* DataDog
* HotJar
* Pusher

Not sure if this is an issue, but it doesn't make me comfortable.


@dang this is currently #198, off the front page, yet this is basically an emergency (literally every customer's secrets are exposed?)... either CircleCI has no more customers, or people are very calm about this...

we need to rotate:

  - secrets in context environment variables
  - secrets in project environment variables
  - project deploy keys
  - circleci api tokens
then we have to go back and look at all audit logs for... basically everything... and try to find something that looks weird. :/
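
A rough sketch of enumerating the first two of those with the v2 API (the token env var, org, and project slug are placeholders, pagination is ignored, and the endpoints are assumed to behave as documented); deploy keys and API tokens still have to be handled by hand:

    import os
    import requests

    API = "https://circleci.com/api/v2"
    headers = {"Circle-Token": os.environ["CIRCLE_TOKEN"]}

    # Env vars attached to a single project (values come back masked, names do not).
    project_slug = "gh/my-org/my-repo"  # placeholder
    envvars = requests.get(f"{API}/project/{project_slug}/envvar", headers=headers).json()
    for item in envvars.get("items", []):
        print("project env var:", item["name"])

    # Env vars attached to each context owned by the org.
    contexts = requests.get(f"{API}/context", headers=headers,
                            params={"owner-slug": "gh/my-org"}).json()  # placeholder org
    for ctx in contexts.get("items", []):
        ctx_vars = requests.get(f"{API}/context/{ctx['id']}/environment-variable",
                                headers=headers).json()
        for var in ctx_vars.get("items", []):
            print(f"context {ctx['name']}:", var["variable"])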


This is such a clusterfuck... the CircleCI API doesn't even let you automate most of the steps, and the ones that should work error out with "internal server error". Of course, support is completely unresponsive.


No email? I found out about this from a random HN post?


I received an email to my inbox, but a few of my colleagues did not.


Nobody at my org did, either.

Fun night when you need to reroll your credentials... at least it's nice to have a list in the CircleCI UI, but it sucks when you need to make sure that you have all of the scopes available to you.


Based on who received and didn't receive it at my workplace, we concluded that it was only sent to users who had not unsubscribed from marketing mail.


That would be it. It should have been sent as a forced account-related email. How strange of them to do that.


I received an email in my spam folder.


We received an email


Had one legacy app still on CircleCI and figured we may as well move it over to GH Actions if we're already rotating tokens anyway. Really hard to recommend anything else these days.


People on my team are talking about it; I'd say this incident is the end of our trust in Circle CI going forward.

On the other hand, I'm becoming increasingly wary of putting all my eggs in the Microsoft basket if we move our source code, build system, and dev environments (Codespaces) to GitHub. Is it just me?


I'm kinda in the same boat. We've just started to experiment with GH Actions and I'm really liking it. This is just going to move the needle faster.


I legitimately don't understand how the ranking on HN works sometimes. How is it that there are older, less-commented posts ranking higher than this story? @dang?

edit: I sincerely think this should be bumped, given how many folks don't seem to be getting the news here in a timely fashion.


Our hodgepodge of microservices (developed over more than a decade) never got coordinated env variables, so now we've got to go through ~50 services and libraries, one by one, updating secrets. Yuck.

If you do your shit right, you can just dump most of your secrets into some Contexts (containers of env variables) and apply them. Then when this stuff rolls around, it's easy to update everything centrally: change the context and everyone sees it. We, alas, can't easily do that, since we have so many differing env var names. New Year, new fun!
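
For teams that did centralize on contexts, the rotation itself is small against the v2 API; a sketch (context id, variable name, and token env var are placeholders), where putting a new value on the context once updates every project wired to it:

    import os
    import requests

    API = "https://circleci.com/api/v2"
    headers = {"Circle-Token": os.environ["CIRCLE_TOKEN"]}

    context_id = "00000000-0000-0000-0000-000000000000"  # placeholder

    # Overwrite the secret in one place; every project using this context sees it.
    resp = requests.put(
        f"{API}/context/{context_id}/environment-variable/AWS_SECRET_ACCESS_KEY",
        headers=headers,
        json={"value": os.environ["NEW_AWS_SECRET_ACCESS_KEY"]},
    )
    resp.raise_for_status()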


> Then when this stuff rolls around, it's easy to update everything centrally: change the context and everyone sees it.

But one still has to update their credentials on any downstream service, e.g. third-party API keys. In general, this is highly individual for each service, and can mostly only be done manually.


Such is the tragedy of for-profit software engineering. The trade-offs we see today lead to choices that tie our hands when facing trade-offs we didn't foresee. Also why experience comes at such a premium. Seeing further down the line and knowing how to argue about it prevents whole classes of problems.


Switch to contexts and add the same secret under multiple names


A decade of chances to fix this?


Why on earth haven't I received an email from Circle about this??

I guess the answer is, why on earth am I still using Circle CI....

Thankfully all of my secrets/env variables are just dummy data for tests, and I'm already using OIDC.


Check your personal GitHub account email.


yeah I did. None anywhere


I have. Maybe PEBKAC


I've created a tool, prompted by this incident, to help you find your secrets in CircleCI.

https://github.com/rupert-madden-abbott/circleci-audit

It can:

* List env vars attached to your repos and contexts
* List SSH keys attached to your repos
* List which repos are configured with Jira (a secret that might need rotating)


Thanks for taking the initiative!

Circle CI have also released something similar [0] linked to near the bottom of their blog post[1].

[0]: https://github.com/CircleCI-Public/CircleCI-Env-Inspector

[1]: https://circleci.com/blog/january-4-2023-security-alert/


Does this also include deploy SSH keys?



Thanks. Sigh, I have ~20 OSS repos with them that I'll need to generate new keys for then.


Given the wording of "any and all secrets" I would not take any chances.


"Immediately rotate any and all secrets stored in CircleCI. These may be stored in project environment variables or in contexts."

The blog post calls out "environment variables" and "contexts"


Emphasis on may be; not to mention, they are actively investigating the breach and do not have all the information at this time.


We contacted CircleCI support and they clarified their blog post statement with the following info:

"Thank you for contacting CircleCI Support.

This does also apply to SSH Keys, as such we do recommend to rotate SSH Keys as well as to take extra caution.

If you have any other concerns please reach out."


Better to be cautious and rotate those too, right?


Another thread which may need merging to this: https://news.ycombinator.com/item?id=34255189


PSA: It seems like deleting deploy keys on the CircleCI end doesn't actually delete them from GitHub, so you need to do it on both ends.
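
A rough sketch of double-checking the GitHub side (owner/repo and the token env var are placeholders; the delete call is left commented out so nothing disappears before you've confirmed which key is which):

    import os
    import requests

    owner, repo = "my-org", "my-repo"  # placeholders
    headers = {
        "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
        "Accept": "application/vnd.github+json",
    }

    # List deploy keys still attached to the repo.
    keys = requests.get(f"https://api.github.com/repos/{owner}/{repo}/keys",
                        headers=headers).json()
    for key in keys:
        print(key["id"], key["title"], key["created_at"])
        # Once you've confirmed a key is the stale CircleCI one:
        # requests.delete(f"https://api.github.com/repos/{owner}/{repo}/keys/{key['id']}",
        #                 headers=headers)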


You have to trust a CI provider almost as much as your production host. Circle has not earned the same trust as organizations like AWS.


Subpar product, never enjoyed using it. Constant downtime and incidents.


I really don't understand why you'd use someone else's computer to compile and test your stuff.

When their computers are compromised, by internal or external crooks, the crooks have full access to your code and, in some cases, your data. If they wanted, they could inject their own shit into your binaries, totally ruining your reputation.

As a bonus, you get to pay a premium!

I still compile and test my code on my own machines, in my own network. It's much faster than CircleCI, cheaper, and it's ∞ safer.


You do need to trust someone else's computer if they're going to build and run your code. I think Google is doing some good work here in helping champion things like Supply-chain Levels for Software Artifacts (SLSA) [0][1]. I'd argue your build/CI/CD system should never have access to production data, but it effectively has it indirectly, by being able to mutate your production environment (to deploy things).

Compiling and testing on your own machine isn't necessarily safer, though. Compare a typical CI/CD build instance, which is usually a VM or container that has been freshly booted or is being reused from a recent build, with your own machine, which you likely also use to browse the internet and run many other apps. The (ideal of the) former is a reproducible on-demand environment with a specific toolchain, while the latter is a bespoke assortment of different toolchains, software, and unfinished projects. Not to mention your machine will not be the same as someone else's on your team.

I think as an industry we still have a lot of work to do around establishing trusted computing environments for CI/CD and enabling the level of auditability and observability to verify that. There are also CI/CD providers that you can run on your own infrastructure.

[0] https://cloud.google.com/blog/products/application-developme...

[1] https://slsa.dev (edit: fixed this link)


I don't like to run my code on someone else's machine either, but having a separate build system allows you to run full, long-running tests while you continue with your work.

I can see why you would use GitHub Actions if you already host your code there, but I don't feel comfortable sharing my signing keys.


It's nice that you can do that, but it doesn't work for large distributed teams.


Sure it does. Do engineers not compile their code locally constantly as part of the process of writing it? Store deterministic hashes of expected binaries with signed commits in PRs. Then untrusted CI merely needs to generate and sign -matching- hashes, and we are good as long as the engineer and the CI system are not compromised at the same time.
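
The hashing half of that is tiny; a sketch in Python (artifact and digest file names are placeholders), with the signing left to your normal commit/tag signing setup:

    import hashlib
    import sys

    def sha256_of(path: str) -> str:
        # Deterministic digest of a build artifact, read in chunks.
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()

    # CI recomputes the digest and refuses to proceed unless it matches the
    # value the engineer committed (and signed) with the PR.
    expected = open("artifact.sha256").read().split()[0]  # placeholder digest file
    actual = sha256_of("build/output.bin")                # placeholder artifact
    if actual != expected:
        sys.exit(f"hash mismatch: expected {expected}, got {actual}")
    print("artifact hash verified")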


You're talking about creating reproducible builds, which is a good idea, but in most cases you will still need to deliver that binary somewhere.

That typically requires authentication, whether you're deploying to Kubernetes or copying the files somewhere using scp, etc.

So either your laptop or the CI system needs some level of secrets present to put the artifact in the correct place.


Engineers simply commit artifacts with Git LFS as a signed commit. Totally unprivileged build systems can append reproducible-build signatures via git tags. That repo can be webhook-triggered to be -pulled- by a lambda job or similar in the target environment, which will then verify the tags and signatures to assert that they are valid, and deploy artifacts to signature-approved environments using ephemeral role credentials.

A VCS system or CI system should never have secrets or be trusted in any way. Doing so always dramatically increases attack surface for no reason.

I run a security consulting firm and this is often one of the first things I help my clients to fix.


What about testing? In my company, before any code goes to production it has to go through hundreds, if not thousands, of unit tests. This can't be done on a dev laptop (see xkcd #303).


Testing is a separate concern from supply chain security. Testing should also never require any secrets useful to an adversary, so third-party hosted CI is low risk here.


I think that depends entirely on the priorities and incentives driving the company.


I want the following option in my account settings for all critical services:

    [X] In case of a "security incident", lock down my account until I take action.
I understand why they can't do that by default, but it's crazy that every time this happens, I have to scramble to secure my assets when, in many cases, I'd be perfectly fine with things just shutting down until I have time to take care of them.

Better yet, also give me a button that does this even when there's no official incident reported. That means disabling all access tokens, resetting the password, halting any scheduled jobs, and revoking access for any connected OAuth services until I manually re-enable them.


I don't think locking down the account will do anything. It sounds like secrets were already stolen. GitHub access tokens, etc. Locking the account won't unsteal that stuff.


Right. You'd need lock-down-all-AWS-controlled-by-the-foo-key because CircleCI got hacked and it had the foo-key.

Sounds like a separate product (something about breaches and blast radii) and not a CircleCI feature.


The product already exists: it's called OAuth. All you need is an additional role that you can authorize:

    CircleCI would like to:
    
    - Upload build artifacts
    - Report security incidents
Then in GitHub (or wherever), you have the aforementioned checkbox. So when CircleCI reports the incident, the GitHub account is locked down.


You mean hanging whole sections of our value chain on other companies' assets was not the best idea?


Of course; it's the GitHub account that would need to be locked down in this case, and yes, it should be possible to do this automatically. The problem is that even though OAuth exists (which could be used to specify such an action during authorization), many services still rely on manually copying secrets around, which means that GitHub is not necessarily aware that another service has access to it.



