I wish rebase was taught as the default - I blame the older inferior version control software. It’s honestly easier to reason about a rebase than a merge since it’s so linear.
Understanding of local versus origin branches is also missing or mystical to a lot of people, and it's what gives you the confidence to mess around and find things out.
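For anyone who finds that mystical, a minimal sketch of the distinction (assuming a local main tracking origin/main):

    git fetch origin                      # updates origin/main; your local main is untouched
    git log --oneline main..origin/main   # commits the remote has that your local main doesn't
    git reset --hard origin/main          # make local main match the remote (discards local commits)

Once you internalize that origin/main is just a read-only snapshot of the remote that fetch updates, experimenting locally gets a lot less scary.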
The end result of a git rebase is arguably superior. However, I don't do it, because the process of running git rebase is a complete hassle. git merge is one-shot, whereas git rebase replays commits one-by-one.
Replaying commits one-by-one is like a history quiz. It forces me to remember what was going on a week ago when I did commit #23 out of 45. I'm grateful that git stores that history for me when I need it, but I don't want it to force me to interact with the history. I've long since expelled it from my brain, so that I can focus on the current state of the codebase. "5 commits ago, did you mean to do that, or can we take this other change?" I don't care, I don't want to think about it.
Of course, this issue can be reduced by the "squash first, then rebase" approach. Or judicious use of "git commit --amend --no-edit" to reduce the number of commits in my branch, therefore making the rebase less of a hassle. That's fine. But what if I didn't do that? I don't want my tools to judge me for my workflow. A user-friendly tool should non-judgmentally accommodate whatever convenient workflow I adopted in the past.
If Git says, "oops, you screwed up by creating 50 lazy commits, now you need to put in 20 minutes figuring out how to cleverly combine them into 3 commits, before you can pull from main!" then I'm going to respond, "screw you, I will do the next-best easier alternative". I don't have time for the judgement.
> "oops, you screwed up by creating 50 lazy commits, now you need to put in 20 minutes figuring out how to cleverly combine them into 3 commits, before you can pull from main!"
You can also just squash them into 1, which will always work with no effort.
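A sketch of the easiest version of that, assuming the branch was cut from main:

    git reset --soft $(git merge-base main HEAD)   # keep all the changes, drop the 50 commits
    git commit -m "the whole feature as one commit"
    git rebase main                                # only a single commit left to replay

Then any conflicts show up once, against the single squashed commit.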
Then rebase is not your problem, but all your other practices: long-lived feature branches with lots of unorganized commits with low cohesion.
Sometimes it's OK to work like this, but asking git not to be judgmental is like saying your Roomba should accommodate you by not asking you to empty its dust bag.
You can make long lived feature branches work with rebase, you just have to regularly rebase along the way.
I had a branch that lived for more than a year and ended up with 800+ commits on it. I rebased along the way, and predictably the final merge was smooth and easy.
Adding to your comment, I've found that frequent squashing of commits on the feature branch makes rebasing considerably easier - you only have to deal with conflicts on one commit.
And of course, making it easier to rebase makes it more likely I will do it frequently.
This works because 1) git rerere remembers the resolutions to 2) the small conflicts that come up when rebasing the long-lived branch onto the main branch.
If instead I delayed any rebasing until the long-lived branch was done, I'd have no idea of the scale of the conflicts, and the task could be very, very different.
Granted, in some cases there would be no or very few conflicts, and then both approaches (long-lived branch with or without rebases along the way) would be similar.
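(In case anyone hasn't run into it: rerere is off by default, so this only helps once you've enabled it:

    git config --global rerere.enabled true

After that, git records each conflict resolution and replays it automatically the next time the same conflict shows up.)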
If you do a single rebase at the end, there is nothing to remember, you just get the same accumulated conflicts you also collectively get with frequent rebases. Hence I don’t understand the benefit of the latter in terms of avoiding conflicts.
You don't see a difference between dealing with conflicts within a few days of you doing the work that led to them (or someone else), and doing them all at once, perhaps months later?
"If you do a single rebase at the end, there is nothing to remember, you just get the same accumulated conflicts you also collectively get with frequent rebases."
There is _everything_ to remember. You no longer have the context of what commits (on both sides) actually caused the conflicts, you just have the tip of your branch diffed against the tip of main.
"Hence I don’t understand the benefit of the latter in terms of avoiding conflicts."
You don't avoid conflicts, but you move them from the future to the present. If main is changing frequently, the conflicts are going to be unavoidable. Why would you want to wait to resolve them all at once at the very end, when you could be resolving them as they happen, with all the context of the surrounding commits readily at hand? Letting the conflicts accumulate to be dealt with at the end with very little context just sounds terrifyingly inefficient.
If you rebase from main often, it keeps the difference to main quite small, so that when it comes time to do the final merge to main, it's either able to be fast-forwarded (keep it linear, good job!), or at least at very low risk of being conflicted (some people like merge commits, but at least your incoming branch will be linear). Because even though you might have commits that are a year old, initially branched from main a year ago, their "base" has gradually become whatever main is _now_.
It's just like doing merges _from_ main during the lifetime of the branch. If you don't do any, you'll likely have lots of conflicts on the final merge. If you do it a lot, the final merge will go smooth, but your history will be pretzels all the way down.
In other words, frequent rebasing from main moves any conflicts from the future to "right now", but keeps the history nice and linear, on both sides!
I always do long lived feature branches, and rarely have issues. When I hear people complain about it, I question their workflow/competence.
Lots of commits is good. The thing I liked about mercurial is you could squash, while still keeping the individual commits. And this is also why I like jj - you get to keep the individual commits while eliminating the noise it produces.
>Replaying commits one-by-one is like a history quiz. It forces me to remember what was going on a week ago when I did commit #23 out of 45.
While I agree this is a rather severe downside of rebase... if you structure your commits into isolated goals, this can actually be a very good thing. Which is (unsurprisingly) what many rebasers recommend doing - make your history describe your changes as the story you want to tell, not how you actually got there.
You don't have to remember commit #23 out of 45 if your commit is "renamed X to Y and updated callers" - it's in the commit message. And your conflict set now only contains things that you have to rename, not all the renames and reorders and value changes and everything else that might happen to be nearby. Rebase conflicts can sometimes be significantly smaller and clearer than merge conflicts, though you have to deal with multiple instead of just one.
While it is a bit of a pain, it can be made a lot easier with the --keep-base option. This article is a great example https://adamj.eu/tech/2022/03/25/how-to-squash-and-rebase-a-... of how to make rebasing with merge conflicts significantly easier. Like you said though, it's not super user-friendly but at least there are options out there.
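Roughly, the squash-then-rebase flow looks like this (my own sketch, not necessarily the article's exact commands; assumes the target branch is origin/main):

    git rebase -i --keep-base origin/main   # squash/fixup everything into one commit, without moving the base
    git rebase origin/main                  # replay that single commit; any conflicts show up only once

The --keep-base in the first step is what lets you squash against your original base, before taking on main's changes.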
A merge can have you doing a history quiz as well. Conflicts can occur in merges just as easily as rebases. Trouble with trying to resolve conflicts after a big merge is that now you have to keep the entire history in your head, because you don't have the context of which commit the change happened in. With rebase you'd be right there in the flow of commits when resolving conflicts.
This seems crazy to me as a self-admitted addict of “git commit --amend --no-edit && git push --force-with-lease”.
I don’t think the tool is judgmental. It’s finicky. It requires more from its user than most tools do. Including bending over to make your workflow compliant with its needs.
I don't mind rebasing a single commit, but I hate it when people rebase a list of commits, because that creates commits which never existed before, which have probably never been tested, and which generally never will be.
I've had failures while git bisecting, hitting commits that clearly never compiled, because I'm probably the first person to ever check them out.
Sometimes it feels like the least-bad alternative.
e.g. I'm currently working on a substantial framework upgrade to a project - I've pulled every dependency/blocker out that could be done on its own and made separate PRs for them, but I'm still left with a number of logically independent commits that by their nature will not compile on their own. I could squash e.g. "Update core framework", "Fix for new syntax rules" and "Update to async methods without locking", but I don't know that reviewers and future code readers are better served by that.
In Mercurial you could keep those in a hidden phase for future reference.
In jujutsu you can have those in a local set, but not push upstream. Only unfortunate thing with jujutsu is because it is trying to be a git overlay, you lose state that a mercurial clone on another machine would have.
It seems to me the "Not Rocket Science" invariant is upheld if you just require all PRs to be fast-forward changes. Which I guess is an argument in support of rebase, but a clean merge counts too. If the test suite passes on the PR branch, it'll pass on main, because that's what main will be afterward. Ideally you don't even test the same commit hash twice.
If you have expensive e2e tests, then you might want to keep a 'latest' tag on main that's only updated when those pass.
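A rough sketch of both pieces (branch and tag names are placeholders):

    git checkout main
    git merge --ff-only feature-branch   # refuses anything that isn't a fast-forward
    # later, once the expensive e2e suite passes on this commit:
    git tag -f latest
    git push -f origin latest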
Rebase your local history, merge collaborative work. It helps to just relabel rebase as "rewrite history". That makes it more clear that it's generally not acceptable to force push your rewritten history upstream. I've seen people trying to force push their changes and overwrite the remote history. If you need to force push, you probably messed up. Maybe OK on your own pull request branches assuming nobody else is working on them. But otherwise a bad idea.
I tend to rebase my unpushed local changes on top of upstream changes. That's why rebase exists. So you can rewrite your changes on top of upstream changes and keep life simple for consumers of your changes when they get merged. It's a courtesy to them. When merging upstream changes gets complicated (lots of conflicts), falling back to merging gives you more flexibility to fix things.
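Concretely, that's just (assuming the branch is tracking origin/main):

    git fetch origin
    git rebase origin/main   # replay unpushed local commits on top of upstream
    # or, as one step:
    git pull --rebase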
The resulting pull requests might get a bit ugly if you merge a lot. One solution is squash merging when you finally merge your pull request. This has as the downside that you lose a lot of history and context. The other solution is to just accept that not all change is linear and that there's nothing wrong with merging. I tend to bias to that.
If your changes are substantial, conflict resolution caused by your changes tends to be a lot easier for others if they get lots of small commits, a few of which may conflict, rather than one enormous one that has lots of conflicts. That's a good reason to avoid squash merges. Interactive rebasing is something I find too tedious to bother with usually. But some people really like those. But that can be a good middle ground.
It's not that one is better than the other. It's really about how you collaborate with others. These tools exist because in large OSS projects, like Linux, where they have to deal with a lot of contributions, they want to give contributors the tools they need to provide very clean, easy to merge contributions. That includes things like rewriting history for clarity and ensuring the history is nice and linear.
Maybe I'm old, but I still think a repository should be a repository: sitting on a server somewhere, receiving clean commits with well written messages, running CI. And a local copy should be a local copy: sitting on my machine, allowing me to make changes willy-nilly, and then clean them up for review and commit. That's just a different set of operations. There's no reason a local copy should have the exact same implementation as a repository, git made a wrong turn in this, let's just admit it.
> And a local copy should be a local copy: sitting on my machine, allowing me to make changes willy-nilly, and then clean them up for review and commit.
That's exactly what Git is. You have your own local copy that you can mess about with and it's only when you sync with the remote that anyone else sees it.
"There's no reason a local copy should have the exact same implementation as a repository, git made a wrong turn in this."
Who is forcing you to keep a local copy in the exact same configuration as upstream? Nothing at all is stopping you from applying your style to your repos. You're saying that not being opinionated about project structure is a "wrong turn"? I don't think so.
I think most "ground truth" open-source repos do end up operating like this. They're not letting randos push branches willy-nilly and kick off CI. Contributors fork it, work on their own branches, open a PR upstream (hence that name: PULL Request), reviews happen, nice clean commits get merged to the upstream repository that is just being a repository on a server somewhere running CI.
I agree but I think git got the distributed (ie all nodes the same) part right. I also think what you say doesn't take it far enough.
I think it should be possible to assign different instances of the repository different "roles" and have the tooling assist with that. For example, a "clean" instance that will only ever contain fully working commits and can be used in conjunction with production and debugging, and various "local" instances - per feature, per developer, or per something else - that might be duplicated across any number of devices.
You can DIY this using raw git with tags, a bit of overhead, and discipline. Or the github "pull" model facilitates it well. But either you're doing extra work or you're using an external service. It would be nice if instead it was natively supported.
This might seem silly and unnecessary but consider how you handle security sensitive branches or company internal (proprietary) versus FOSS releases. In the latter case consider the difficulty of collaborating with the community across the divide.
> I still think a repository should be a repository: sitting on a server somewhere, receiving clean commits with well written messages, running CI. And a local copy should be a local copy: sitting on my machine, allowing me to make changes willy-nilly, and then clean them up for review and commit
This is one way to see things and work and git supports that workflow. Higher-level tooling tailored for this view (like GitHub) is plentiful.
> There's no reason a local copy should have the exact same implementation as a repository
...Except to also support the many git users who are different from you and in different contexts. Bending git's API to your preferences would make it less useful, harder to use, or not even suitable at all for many others.
> git made a wrong turn in this, let's just admit it.
Nope. I prefer my VCS decentralized and flexible, thank you very much. SVN and Perforce are still there for you.
Besides, it's objectively wrong calling it "a wrong turn" if you consider the context in which git was born and got early traction: Sharing patches over e-mail. That is what git was built for. Had it been built your way (first-class concepts coupled to p2p email), your workflow would most likely not be supported and GitHub would not exist.
If you are really as old as you imply, you are showing your lack of history more than your age.
If this was the main strategy used even for public/shared branches, then everyone would have to deal with changing, conflicting histories all the time.
I've had recent interns who've struggled with rebase, and they've never known anything but Git. Never understood why that was, given they seem OK with basic commits and branching. I would agree that rebase is easier to reason about than merging, yet I'm still needing to give what feels like a class on it.
The fact that people have a harder time understanding rebase is evidence that rebase is harder to reason about. Whether you update your understanding based on that evidence is up to you. If I have to pick between merge and rebase, I would generally pick merge. It seems to cause fewer conflicts with long-lived branches. Commits maintain their identity, so each one has to be conflict-resolved at most once.
However, even better for me (and my team) is squash on PR resolve.
IMO it's one of those things where rebase is at first less intuitive, but once you get it, it's a lot simpler and easier to reason about. In contrast, merging at first seems more straightforward but is actually less so.
That's not a value judgement in either direction; both initially-simpler and long-term-simpler have their merits.
I've heard people say before that it is easier to reason about a linear history, but I can't think of a situation where this would let me solve a problem more easily. All I can think of is a lot of downsides. Can you give an example where it helps?
Funnily enough in all my years of using git, this thread is the first time I've encountered merge. It sounds easier I suppose, but I don't really have a problem with rebase and will likely just continue as is
git rebase squash as a single commit on a single main branch is the one true way.
I know a lot of people want to maintain the history of each PR, but you won't need it in your VCS.
You should always be able to roll back main to a real state. Having incremental commits between two working stages creates more confusion during incidents.
If you need to consult the work history of transient commits, that can live in your code review software with all the other metadata (such as review comments and diagrams/figures) that never make it into source control.
Merging merge requests as merge commits (rather than fast-forwarding them) gives the same granularity in the main branch, while preserving the option to have bisect dive inside the original MR to actually find the change that made the interesting change in behavior.
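For example (commit hashes are placeholders), you can walk main at merge-commit granularity first and only then descend into the offending MR:

    git bisect start --first-parent <bad-commit> <good-commit>
    # bisect now only visits the merge commits on main;
    # once it points at an MR's merge commit, bisect again inside that range without --first-parent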
But they have, with pull requests. When you merge a pull request it is done via the "subtree" merge strategy, which preserves partial commits and also does not flatten them.
This is one of the few hills I will die on. After working on a team that used Phabricator for a few years and going back to GitHub when I joined a new company, it really does make life so much nicer to just rebase -> squash -> commit a single PR to `main`
What was stopping you from squash -> merge -> push two new changesets to `main`? Isn't your objection actually to the specifics of the workflow that was mandated by your employer as opposed to anything inherent to merge itself?
> You should always be able to roll back main to a real state.
Well there's your problem. Why are you assuming there are non-working commits in the history with a merge based workflow? If you really need to make an incremental commit at a point where the build is broken you can always squash prior to merge. There's no reason to conflate "non-working commits" and "merge based workflow".
Why go out of the way to obfuscate the pathway the development process took? Depending on the complexity of the task the merge operation itself can introduce its own bugs as incompatible changes to the source get reconciled. It's useful to be able to examine each finished feature in isolation and then again after the merge.
> with all the other metadata (such as review comments and diagrams/figures) that never make it into source control.
I hate that all of that is omitted. It can be invaluable when debugging. More generally I personally think the tools we have are still extremely subpar compared to what they could be.
> I know a lot of people want to maintain the history of each PR, but you won't need it in your VCS.
I strongly disagree. Losing this discourages swarming on issues and makes bisect worse.
> You should always be able to roll back main to a real state. Having incremental commits between two working stages creates more confusion during incidents.
If you only use merge commits this shouldn't be any more difficult. You just need to make sure you specify that you want to use the first parent when doing reverts.
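Concretely (the hash is a placeholder):

    git revert -m 1 <merge-commit>   # treat the first (main-branch) parent as the mainline when reverting

That undoes the whole MR as one revert commit, regardless of how many commits it contained.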
> I know a lot of people want to maintain the history of each PR, but you won't need it in your VCS.
Having worked on a maintenance team for years, this is just wrong. You don't know what someone will or won't need in the future. Those individual commits have had extra context that have been a massive help for me all sorts of times.
I'm fine with manually squashing individual "fix typo"-style commits, but just squashing the entire branch removes too much.
When your PR build takes more than an hour you'll think twice before creating multiple PRs for multiple related commits (e.g. refactoring+feature) when working on a single issue.
I completely agree. It also forces better commit messages, because "maintaining the history of each PR" is forced into prose written by the person responsible for the code instead of hand-waving it away into "just check the commits" -- no thanks.
Oh, that's why. I barely used any VCS before Git, so I was always puzzled about the "weird" opinions on this topic. I'm still puzzled by the fact that some people seem to reject entirely the idea of rewriting history - even locally, before you have pushed/published it anywhere.
Sometimes people look sort of "superstitious" to me about Git. I believe this is caused by learning Git through web front-ends such as Github, GitLab, Gitea etc., that don't tell you the entire truth; desktop GUI clients also let the users only see Git through their own, more-or-less narrow "window".
TBH, sometimes Git can behave in ways you don't expect, like showing conflicts where you thought there wouldn't be any (though so far never anything like choosing the "wrong" version when doing merges, something I did fear when I started using it a decade or so ago).
However one usually finds an explanation after the fact. Something I've learned is that Git is usually right, and forcing it to do things is a good recipe to mess things up badly.