
I'm curious how this was uploaded to GitHub successfully. I guess they do less actual introspection on the repo's contents than I thought. Did it wreak havoc on any systems behind the scenes (similar to big repos like Homebrew's)?


There isn't anything wrong with the objects. A 'fetch' succeeds but the 'checkout' is what blows up.
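For context: the repo holds only a handful of deduplicated objects, but they expand to billions of paths at checkout time, because every tree entry at each level points at the same subtree. A minimal sketch of the same construction at a harmless scale (depth and width of 4 instead of 10; the repo name `mini-bomb` is made up for illustration):

```shell
#!/bin/sh
set -e
rm -rf mini-bomb
git init -q mini-bomb
cd mini-bomb
git config user.email you@example.com
git config user.name you
# One blob, shared by every leaf path.
blob=$(echo hello | git hash-object -w --stdin)
# Bottom-level tree: four entries, all pointing at the same blob.
tree=$(for i in 0 1 2 3; do
  printf '100644 blob %s\tf%s\n' "$blob" "$i"
done | git mktree)
# Wrap three more times: each level is four entries pointing at the
# same subtree, so paths multiply by 4 while objects grow by only 1.
for d in 1 2 3; do
  tree=$(for i in 0 1 2 3; do
    printf '040000 tree %s\td%s\n' "$tree" "$i"
  done | git mktree)
done
commit=$(git commit-tree -m 'mini bomb' "$tree")
git update-ref HEAD "$commit"
git ls-tree -r HEAD | wc -l   # 4^4 = 256 paths from 6 unique objects
```

Fetch and clone only move the 6 objects; it's the path expansion (10^10 in the real git-bomb) that kills checkout.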


Good point. For those that are curious:

Clone (--no-checkout):

    $ git clone --no-checkout https://github.com/Katee/git-bomb.git
    Cloning into 'git-bomb'...
    remote: Counting objects: 18, done.
    remote: Compressing objects: 100% (6/6), done.
    remote: Total 18 (delta 2), reused 0 (delta 0), pack-reused 12
    Unpacking objects: 100% (18/18), done.
From there, you can do some operations like `git log` and `git cat-file -p HEAD` (I use the "dump" alias[1]: `git config --global alias.dump "cat-file -p"`), but not others like `git checkout` or `git status`.

[1] Thanks to Jim Weirich and Git-Immersion, http://gitimmersion.com/lab_23.html. I never knew the guy, but, ~~8yrs~~ (corrected below) 3.5yrs after his passing, I still go back to his presentations on Git and Ruby often.

Edit: And, to see the whole tree:

  # Walk a single branch of the tree: every sibling at a level is the
  # same object, so following just "d0"/"f0" shows the whole structure.
  NEXT_REF=HEAD
  while [ -n "$NEXT_REF" ]; do
    echo "$NEXT_REF"
    git dump "${NEXT_REF}"
    echo
    # cat-file -p prints "<mode> <type> <hash>\t<name>"; grab the hash
    # ($3) of the first child entry, named d0 (subtree) or f0 (blob).
    NEXT_REF=$(git dump "${NEXT_REF}^{tree}" 2>/dev/null | awk '{ if($4 == "d0" || $4 == "f0"){ print $3 } }')
  done


Sad one to nitpick, but Jim died in 2014. So ~3.5 years ago.

Had the pleasure of meeting him in Singapore in 2013.

Still so much great code of his we use all the time.


Thanks for the correction, he truly was a brilliant mind. One of my regrets is not being active and outgoing enough to go meet him myself. I lived in the Cincinnati area from 2007-2012. I first got started with Ruby in 2009, and quickly became aware of who he was (Rake, Builder, etc.) and that he lived/worked close by. But, at the time, I wasn't interested in conferences, meetups, or simply emailing someone to say thanks.


I too was curious about this.

https://github.com/Katee/git-bomb/commit/45546f17e5801791d4b... shows:

"Sorry, this diff is taking too long to generate. It may be too large to display on GitHub."

...so they must have some kind of backend limits that may have prevented this from becoming an issue.

I wonder what would happen if it was hosted on a GitLab instance? Might have to try that sometime...


Since GitHub paid a bounty and OK'd the release, perhaps they've already patched some aspects of it. It might be impossible to recreate the issue now.

My naive question is whether CLI "git" would need or could benefit from a patch. Part of me thinks it doesn't, since there are legitimate reasons for each individual aspect of creating the problematic repo. But I probably don't understand god deeply enough to know for sure.


is this a git->god typo, or a statement about your feelings towards Linus?


Please don't let Linus read this


Yes, hosting providers need rate limiting mitigations in place. GitHub's is called gitmon (at least unofficially), and you can learn more at https://m.youtube.com/watch?v=f7ecUqHxD7o

Visual Studio Team Services has a fundamentally different architecture, but we employ some similar mechanisms despite that. (I should do some talks about it - but it's always hard to know how much to say about your defenses lest it give attackers clever new ideas!)


> how much to say about your defenses lest it give attackers clever new ideas

attackers will try clever new ideas anyway if their less clever old ideas don't work :P


How does the saying go? Something like "security through obscurity isn't security"?


It's not security through obscurity. It's defense in depth.


GitLab uses a custom Git client called Gitaly [0].

> Project Goals

> Make the git data storage tier of large GitLab instances, and GitLab.com in particular, fast.

[0]: https://gitlab.com/gitlab-org/gitaly

Edit: It looks like Gitaly still spawns git for low level operations. It is probably affected.


Spawning git doesn't mean that it can't just check for a timeout and stop the task with an error.

Someone will probably have to actually try an experiment with Gitlab.
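One way to sketch that timeout idea, assuming GNU coreutils `timeout` is available (the function name `run_limited` is made up for illustration, not anything Gitaly actually ships):

```shell
#!/bin/sh
# Bound a spawned git command by wall-clock time. GNU coreutils
# `timeout` sends SIGTERM once the budget is exceeded and exits with
# status 124, which a server can turn into a clean error for the client.
run_limited() {
  budget="$1"
  shift
  timeout "$budget" "$@"
  st=$?
  if [ "$st" -eq 124 ]; then
    echo "command exceeded ${budget}s budget" >&2
  fi
  return "$st"
}
```

Something like `run_limited 10 git -C suspect-repo rev-list --all` would then fail fast instead of hanging a worker indefinitely.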


Tested locally on a GitLab instance: trying to push the repo results in a unicorn worker allocating ~3GB and pegging a core, then being killed on a timeout by the unicorn watchdog.

    Counting objects: 18, done.
    Delta compression using up to 4 threads.
    Compressing objects: 100% (17/17), done.
    Writing objects: 100% (18/18), 2.13 KiB | 0 bytes/s, done.
    Total 18 (delta 3), reused 0 (delta 0)
    remote: GitLab: Failed to authorize your Git request: internal API unreachable
    To gitlab.example.com: lloeki/git-bomb.git
     ! [remote rejected] master -> master (pre-receive hook declined)
    error: failed to push some refs to 'git@gitlab.example.com:lloeki/git-bomb.git'
I had "Prevent committing secrets to Git" enabled, though. Disabling it makes the push work. The repo can then be browsed at the first level from the web UI, but clicking into any folder brings the whole thing down, with multiple git processes hanging on git rev-list.

EDIT: reported at https://gitlab.com/gitlab-org/gitlab-ce/issues/39093 (confidential).



Thanks. Here is the comment from a GitHub engineer addressing the root cause:

https://github.com/cocoapods/cocoapods/issues/4989#issuecomm...



