Ah! My people.
I, too, much prefer a rebase-heavy workflow. It allows me to both have a dirty "internal" history and clean up for publication.
As a side-effect, it also makes me comfortable having a mostly linear history for what I publish, as opposed to a many-branched, merge-heavy one, which I dislike because it makes history confusing.
I reject the argument that a no-rebase, merge-only history "preserves the true history of how commits were created", because I believe that is irrelevant. What is relevant is what the tree looks like once the merge (or rebase) lands.
Should a merge conflict arise, in a rebase workflow the conflict resolution is folded into the rebased commit, so it looks like it was fine all along. In a merge workflow, the fix is in the separate merge commit. In both cases you still have to handle the merge conflict. And in my opinion it is not significant for the merge conflict resolution to be separate from the original commit itself because, again: what's important is the final state of the repo.
That's kind of the point though: being reasonably sure that a commit contains a tree that the committer had seen at some point, instead of making up history with commits that contain trees that the committer never saw at any point at all.
When someone rebases `n` commits, experience has taught me I can't trust any commits other than `HEAD`; chances are any commit printed by `git log "HEAD~${n}..HEAD^"` was never checked out by anyone, much less tested at all.
CI pipelines also usually run only against HEAD at the moment of push, so if someone pushes `n` commits, then `n-1` of them are usually ignored by the CI pipeline.
Modifying compiler, linter, or formatter options; adding a new dependency or updating an existing dependency's version; changing the project's default options. Changes like that can make those commits useless, and if someone notices a problem in HEAD after the rebase and decides to fix it, even if the fix is moved to the earliest possible point, nobody will bother re-testing all those `n-1` commits after the fix is added, leaving broken commits that are useless for git bisect.
So I agree that rebase is nice. How most people use it, though, not so nice.
In a merge-based workflow you can have commits like "wip" or "before lunch"; no reason to believe those were ever tested either.
I like rebasing, but it's ultimately up to the author. Even tools like Fossil, which don't have official history-rewriting tools, don't ensure that history has never been rewritten, because people can use external tools to do the rewriting (and I've done this).
Use a temporary branch for those. When you come back, undo the commit (`git reset --hard` if memory serves, but I just have an "uncommit" alias) and commit the fully finished work to the real branch.
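For reference, a minimal sketch of such an alias (the name and the --soft variant are my guess; --soft keeps the undone changes staged so nothing is lost):

```
# Hypothetical "uncommit" alias: undo the last commit but keep its
# changes staged, ready to be committed again on the real branch.
git config --global alias.uncommit 'reset --soft HEAD^'
git uncommit
```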
This is destroying the "real history" though. (Which again, I'm fine with, I like rebasing.)
Two months from now I'm quite likely to say something like "oh yeah, I remember I encountered a bug related to that, I was trying to fix it before lunch". The "wip" and "before lunch" commits are just as likely to be relevant in the future as any other.
It's nice to assume that all commits will compile and pass the tests, but it's sometimes useful to have a snapshot of that weird compiler error you encountered. So much for our nice assumption.
This is why I say it's all up to the author, and if the author likes rebasing, I don't think anyone should have a problem with that. (Don't rewrite public branches, of course.)
There's levels of granularity that matter. You could just as well record all your edits in realtime. Make a script that makes a commit every second or every time you finish editing a line. It might be interesting later, yet that's usually not how people use git. Those changes wouldn't be meaningful units of work.
If you make a commit "wip" or "before lunch" because you want a backup of your work or want to continue on a different computer, then it's not a meaningful unit either. It's OK to throw away.
Most people prefer less granular commits, but not to the point of having 1 commit per issue/PR. For example, after inheriting someone else's code written in a hurry and not tested, I often end up dividing my work into several commits - first there's a cleanup of all the things that need renaming for consistency, adding docs/tests, removing redundant/unused code, etc. Sometimes this ends up being more commits as I reveal more tech debt. Then, when I'm confident I actually understand the code and it's up to my standards, I make the actual change. This can again be multiple commits. The first and second group are often mixed.
And it's important when it later turns out I broke something - I can focus on the commits that make functional changes, as the issue is usually there and not in the cleanup commits, which can be 10x larger.
BTW what git is really missing is a way to mark multiple commits as one unit of work so the granularity stays there but is hidden by default and can be expanded.
Is that not just a non-FF'd, non-squashed merge of a branch?
That's the closest you get today, but it means having to make, merge, and delete branches all the time. What I propose is something like git squash but that keeps the history internally. It would present as one commit in gitk and other GUIs but could be expanded to see more detail.
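For what it's worth, a sketch of that closest-you-get-today approach (branch name hypothetical): group the unit of work under a non-FF merge commit, then use --first-parent as the "collapsed" view:

```
git switch main
git merge --no-ff feature/foo -m "Add foo support"

git log --first-parent --oneline   # collapsed: one entry per unit of work
git log --oneline                  # expanded: every individual commit
```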
Does gitk have an equivalent of `git log --first-parent`?
In the View menu dialog, there's a checkbox for "Limit to first parent"
Isn't this something that git makes simple?
This is my preferred branching model. Most forges seem to call it "semi-linear history". If you have a lot of people working on the repo you'll probably want a merge queue to handle landing PRs, but that's pretty straightforward.
It works really well with things like git bisect. It also means history is actually useful.
Every JetBrains IDE does this, and VSCode has its own equivalent feature. They don't use git, but it's the same thing really. It's one of the most useful features ever IMO.
If people say "preserve history" as in "literally don't delete anything", then yeah I see where you're coming from.
I'm not against rebase, and even use it myself. But having a repo where every 3rd commit is a dice roll for git bisect just because straight line pretty, is just as annoying as people shipping their reflog.
A rebase of one commit is harmless. A squash is harmless. A rebase of multiple commits where every commit is deliberate (verifying all rebased commits, etc) is harmless.
A rebase that ignores the fact that any commit whose hash changed can now fail is irresponsible. Shipping `wip` commit messages is irresponsible. A merge commit with the default message is irresponsible (it's no different from a `wip`-style commit). So is having a branch with merge commits that could have been cherry-picks[3].
Also, to me the lie is not some aesthetic thing like commit order or some easily forgeable timestamp; the lie is having a commit that (for example) assumes the `p4tcc` driver is being used[1], and you read the diff and indeed it has assumptions that imply that driver is being used[2], but when you actually checkout that commit and see if that driver exists it turns out no it fucking doesn't, and hours were wasted chasing ghosts. Only because when that commit was created, the p4tcc driver was being used, but when you checked out weeks later now that commit magically uses the `est` driver instead.
If you're going to keep straight line, then test every change; if you don't do it, don't complain about broken middle commits.
If you're going to do merge commits, then keep each commit clean[4], even the merge commit[5]; if you don't, don't complain about a history that is polluted with weird commits and looks like the timeline of a time-travelling show.
[1]: Because it did when that commit was created.
[2]: Because, again, it did when that commit was created.
[3]: This assumes the branch will later be integrated into main with a merge commit.
[4]: Squash is harmless. It's just omission. If anyone complains about purity, then just keep them happy with `git reset $COMMIT ; git add --all ; git commit -m "This is a new commit from scratch"`
[5]: Write something that helps those who use `git log --first-parent`. If you're on GitHub, at least use the PR title and description as the default (can be overridden on a case-by-case basis). If not, then even just "${JIRA_ID}: ${JIRA_TITLE}" is more useful than the default merge commit message while still letting you be lazy.
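A sketch of what [5] could look like in practice (ticket ID and branch name made up):

```
# Give the merge commit a message that reads well under --first-parent:
git merge --no-ff feature/login-rate-limit \
    -m "PROJ-123: Add rate limiting to the login endpoint"
```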
It depends on what you view as the real history. If you link each pull request to a work item, you're not really going to need all the commits on a development branch, because the only part of the history which matters is the pull request.
I think people should just use what works for them; if that's rebase, who cares? The important part is being able to commit "asd" 9 billion times. If you can't do that, it will tax your developers with needlessly having to come up with reasons why they committed: before lunch… that meeting… going to the toilet, and so on.
Yeah, exactly, it’s up to the author to determine what’s important to preserve. Note this is always true, because the author is the one who commits, and can do anything before committing. If keeping the “before lunch” commit is useful for the history, rebasing does not prevent that in any way. Personally, I doubt that particular comment really is just as likely to be useful as something describing what the change is, but I’m with you that it’s author’s choice. It seems like squashing “WIP” and “before lunch” and describing the change content & reasoning has quite a bit higher likelihood of usefulness down the road than a comment on when you planned to eat lunch, and that has been true for me in practice for many years.
There is no “real history” in git, and it’s kind of a fictitious idea, even in Fossil or other VCSes that don’t offer rebase. Think about it: commit order of “WIP” ideas in a branch is already arbitrary, and commits only capture what you committed, not what you typed, nor when you ran the build, nor what bugs you fixed before committing, nor what you had for lunch, nor anything you didn’t choose to commit. Taking away rebase only adds extra pressure to plan your commits and be careful before committing, which means that people will do more editing that is not captured by the commit “history” before committing! Having rebase allows you to commit willy-nilly messes as you go and know that nobody has to see it. It seems like rebase might very well be safer in general because it encourages use of the safety net rather than discouraging frequent messes… and we’re all making frequent messes regardless of VCS, all we’re talking about is whether we force the rest of the team to have to be subjected to our messes.
Git provides change dependencies, and does not offer “history” in the sense you’re implying. People overload the word “history”, and git’s sense of history is to show the chain of state dependencies known as commits, and those have editable metadata on them. In other words, git’s “history” is a side-effect, a view of the dependencies. Git’s “history” does usually have loose association with an order of events, but nothing is or ever was guaranteed. It is by design that you can edit them (meaning build a new set of dependencies with rewritten metadata… the old one is still there until garbage collection), therefore there is no “real history”, that’s not a real thing.
I think most people these days just look at PRs. Everything else is largely noise.
Not at all. Someone’s got to look at your commits in the future when your code breaks ;)
Yep, this is why I'm mostly against squashing (and completely against blind squash-merges).
I'm not sure I get the advantage. The only thing I know is that the last commit on each PR is the one that is claimed to work. All others might as well be noise at that point, since those random intermediates were never HEAD on the main branch, might be broken, incomplete, have failing tests, etc. Squashing every PR into a single commit is at least an honest history of what's actually going out.
If you squash you have a history where every commit was tested and works (bugs notwithstanding) which to me is way more useful.
I mean you should be designing your commits such that each individual commit builds. That's the point of using squashes to fix up your history!
Commit refactoring can be really hard work, however. Basically you do something like taking N random commits and converting them into M logical ones - where each one delivers incremental value and builds upon the other.
For some types of work it is easy, N=M: you were able to do high quality value adding atomic commits for the whole PR without rework.
For other types N >> M. This can happen when trying different approaches to a hard problem. I suppose research type work could always be considered a POC and the actual implementation could be a kind of cleanroom re-implementation of the POC but there isn't always time for such things and (again) the PR is far more important than the commits that built up to it - particularly if the resultant code is of equal quality. Note that I am not advocating for long running branches here - trunk based development is generally better provided it doesn't over incentivize teams to avoid hard problems (but that is a topic for another day).
This is why I think git should include the PR as a first class concept. For simple N=M type work, 1 PR should generally be 1 commit. Why not, after all, make the PR small and easy to review when you can? For harder N >> M type work, you get one PR with many commits that one can dig into if necessary.
This is the reason. I've been on a maintenance team for years where almost everything we handle was written by people no longer at the company, and often enough I've seen bugs get introduced during the original work, where the fix ends up being obvious because I can see the original commits and how the code got into its current state. A squash of any sort would've hidden the refactor and made it much more difficult.
My favorite are ones where "linting" and code formatter commits introduce bugs. Keep those separate from your actual work, please.
Oh I'm for squashing to make the history make sense. Please do not blindly squash-merge though.
And messy intermediate states won't help him at all. Nor will it help when related commits are intertwined with unrelated commits in history rather than being together.
I don't really understand why this would be important. If I'm the one committing, I can rebase however I want to rewrite history before merging, so if I'm super adamant that a commit that looks a certain way exists, I can just make that commit and then put commits around it as needed to ensure that it can be merged by fast-forward to preserve it. If I'm not the one committing, why should I care about intermediate states that the person who committed them doesn't even care about enough to preserve?
To me, the issue seems more that the UX for doing this sort of thing is not intuitive to most people, so the amount of effort needed to get the history rebased to what I described above often ends up being higher than people are willing to spend. This isn't a particularly compelling argument to me in favor of merging workflows though, because it doesn't end up making the history better; it just removes most of the friction of merging by giving up any semblance of sane commit history.
(edited to add the below)
I definitely agree that generating broken commits during a rebase is not a good thing for anyone, and I'd be super frustrated if I had teammates doing that. At least personally, I make sure to compile and run unit tests before continuing after each step of a rebase once I've fixed conflicts; there's even the `x` option in an interactive rebase to execute a command on each commit (which will halt and let you amend the commit before continuing if the command fails), which is unfortunately not super well known.
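A sketch of that, assuming your project's test entry point is `make test`:

```
# Re-run the tests on every rebased commit; the rebase halts at the
# first commit where the command fails so it can be amended in place.
git rebase -x 'make test' origin/main
```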
It is important because not everyone does this:
Good quality rebases like those are more likely to happen on patch-based workflows (not necessarily email), compared to PR-based workflows, because there's more focus on the individual commits themselves being meaningful, with straight line history being mostly a nice side-effect. With "more likely" I mean literally that, more likely; I'm not saying it only happens there.
In PR-based workflows, on the other hand, people tend to care only about HEAD. PR color is green? LGTM ship it :rocketemoji:. Most just read a blog post by a git shaman saying straight line pretty, then go to GitHub and enable the setting for that without thinking more than that; or learn that you can reorder commits to tell a pretty story and do it without thinking more than that.
Though it's also true that some repository owners only care about the tagged commits; all untagged commits could be broken and they don't care because "it's supposed to be in progress" and "as long as the most recent commit works, it's fine". They've never needed to checkout any specific commit on any repository (understandable if they never contribute to others' repositories).
---
Also, you probably noticed already because of your edit, but re:
With "intermediate states" I don't mean what other people committed; I mean all your own commits that you just rebased (all your own commits whose hash changed) that are not the most recent one.
You are in the minority that fixes those; most people I've met would be like:
* All commits are tested and work fine.
* Create PR.
* See CI fails because branch is outdated.
* Rebase PR onto most recent commit of main branch.
* See CI fails because, idk, let's say it's something easy to fix like a more strict linter config.
* Make a new commit that fixes the linter errors.
* CI passes.
* Everyone LGTM's the PR and it gets fast-forwarded.
* The PR had `n` commits, but now `n-1` of those fail the linter because they contain the new config for the linter, but the committer never bothered to look at those commits; they only cared about HEAD. Those `n-1` commits "contain trees that the committer never saw at any point at all" (copy-pasting that quote from my message). And it doesn't matter that those commits are broken, because for those people having a pretty straight line is way more important than a working commit.
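For completeness, those `n-1` commits can at least be reviewed one by one after a rebase; a sketch, assuming the reflog still has the pre-rebase tip at `HEAD@{1}`:

```
# Compare each commit before and after the rebase.
git range-diff origin/main HEAD@{1} HEAD
```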
The recent FreeBSD/Netflix thingy[1] had a successful bisect only because when people rebase stuff in there, they don't YOLO those `n-1` rebased commits. If that had been any of my previous workplaces, or anyone who only rebase because "straight line pretty" without thinking anything more than that, then that whole bisect could have gone way worse.
[1]: https://news.ycombinator.com/item?id=40630699
All of your points seem accurate to me, but I don't see how merge workflows fix any of it. It seems like the same thing could happen where each commit along the way is broken until the final one, and then it's merged as-is. I don't think that having those intermediate commits being the exact ones that the person made is a solution because the problem you're describing is social, not technical; people not caring about committing messy intermediate state to the repo isn't going to be fixed by using merging rather than rebasing. The only workflow that would eliminate the problem entirely is to completely remove all intermediate state by squashing to a single commit before any merge, at which point doing a merge versus a rebase won't matter.
Neither workflow fixes anything. Each strategy helps with some things but requires discipline in others.
Using merges lets you commit as you go, without needing to go back to repeat a test on a previous commit, and only worry about conflicts at the end of your development. Write code, test, commit. Write more code, test, commit. Cherry-pick, test, commit. Merge into main, fix conflicts, finish merge. There's never a need to go back and re-test, like with rebase, because the commits that were already tested are still there. But they require discipline to not pollute history, and being open to squashing commits that don't add any useful information (you want to avoid having "WIP"-style commits).
Using rebases lets you rewrite commits to take advantage of the most recent changes from the main branch, instead of waiting until you finish with your feature. But they require discipline to go back and repeat tests to ensure that any commit that changed still works as expected (and it's needed because the commits changed, hence their different hash, so they are no longer the commit hashes that were tested), and being open to having some merge commits (you want to avoid rebasing a 10 commit migration of your telemetry library because if 3 months later you find out your costs in production were way higher than what they told you they would be, reverting a single merge commit is more dumbproof compared to reverting a manually provided range of commits).
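A sketch of that revert asymmetry (hashes made up):

```
# Merge workflow: revert the whole 10-commit migration in one step.
# -m 1 keeps the first-parent (main branch) side as the surviving state.
git revert -m 1 abc1234

# Linear workflow: you have to identify and revert the range yourself.
git revert def5678^..9abcdef
```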
So yes, choosing one or the other is a social problem. Both are good solutions with good discipline, and both are bad solutions with bad discipline. One of those makes it less likely for people in my bubble to make a mess out of that repo. It might be the same as for your bubble, or it might be different.
But on a good project it doesn't really matter which one is done.
I appreciate your explanations! I think I understand your point of view now, and I do actually agree with it. In particular, I hadn't fully considered that the problem ultimately being social means that the "best" choice will be mostly dependent on what consensus a group is able to come to.
Thinking about this more, it almost seems like having a preference could become self-reinforcing; it's hard to be a member of a group that reaches a consensus on using merges as someone who prefers rebases (and likewise for the reverse), which over time manifests as more and more anecdotal evidence in favor of the preference working better than the alternative. It's no wonder that debates about this sort of thing become so contentious over time...
I don't see how this follows. Merge-heavy histories in my experience tend to be far less bisectable. They have all sorts of "oops, fixup" nonsense going on, precisely because the author did not take the time to get things right the first time.
Any workflow that happens on a number of patches greater than 1 accepts poor bisectability as a risk. But the only real solution there is Giant Monolithic Commits, which we all agree is even worse, right?
Yeah if "merge-heavy" means "ship the reflog", I get what you mean.
But if "merge-heavy" means "use merges when it makes sense, use rebase when it makes sense", then you can get a nice history with `git log --first-parent` that groups related commits together, and also a nice history with `git log --cherry` that shows what the "always-rebase-never-merge" dogmatic people want.
If for this particular project it just so happens that merge doesn't make sense because of the specific needs of the project, then so be it, nothing wrong with that. Same with rebases.
Unfortunately this topic is another holy war where the ship-the-reflog dogma fights against the always-rebase-never-merge dogma.
No balance.
That sounds more like merge-only (a.k.a. "ship the reflog"). Doesn't have to be that way.
Evaluate trade-offs and choose based on that evaluation.
Does adding a new commit have any actual advantage (e.g. easily reverting one or the other) compared to just amending/squashing it, or is it just some developer's own subjective sense of purity?
Does re-ordering the commits have any actual advantage (e.g. change has a smaller context and can be more easily reverted that way) compared to just leaving those commits in that order, or is it just some developer's own subjective sense of aesthetics?
Does using merge commits bring any actual advantage (e.g. the project benefits from being able to bisect on PRs or features as a whole) compared to rebasing (not fast-forwarding), or is it just some developer's own subjective sense of purity?
Does rebasing bring any actual advantage (e.g. each commit is already atomic, fully self-contained, and well tested against the new base, so "grouping" them with a merge commit doesn't make sense) compared to doing a merge-commit, or is it just some developer's own subjective sense of aesthetics?
Poor bisectability, or developers putting actual effort into ensuring commits are atomic and testing them.
Bisectability is nice with good rebased commits. Bisectability is nice with good merge commits.
Bisectability is bad when developers don't care about keeping bisectability good.
It depends.
Those commits might not be easy to understand, but they sure as hell are easy to revert (more likely than not) if something goes wrong, because they tend to correspond almost 1:1 to GitHub issues (or Jira tickets, or whatever equivalent). Keyword "almost" because sometimes you can get 2 of those for the same issue/ticket/whatever.
But those 2 improperly split commits (therefore huge) are still easier to revert compared to a spray of 10 improperly rebased tiny commits where 9 of them are broken (because of what I mention in other comments where people only test HEAD).
Too much text. But what I will say is that the "good" merge workflow you posit really only exists in one place (Linux) and requires a feudal hierarchy of high value maintainers individually enforcing all the rules via personal virtuosity. I've never seen it scale to a "typical" project run by managers and processes.
Where the straightforward "get your stuff rebased into a linear tree" tends to work pretty well in practice. The rules are simpler and easier to audit and enforce.
This assumes the rebasing will be done in a shared branch. Two rules of rebasing:
1. Never rebase a shared branch
2. Never break rule 1
No, that describes rebasing and preserving the intermediate commits. Of course, if you squash into one commit at merge time, this won't happen.
I do.
If you want the full history of someone's work, you need all the edits. Including all the times they backspaced over the typo. With down-to-the-millisecond timestamps attached!
Honestly all of this is just a function of Git’s tooling being pretty bad.
There’s no reason that a merge based history can’t be presented as a linear history. That’s purely a matter of how the information is displayed!
Similarly there’s absolutely no reason that git bisect should try to operate on every single commit. We live in a world where CI systems need to land a queue of commits at a time. No one can afford to run every test on every commit. Git should have support for tagging commits that had varying levels of tests and then running bisect against only those commits. Easy peasy.
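Funnily enough, you can approximate that today by teaching `git bisect run` to skip untested commits via exit code 125; a sketch, where the `ci-*` tag convention is entirely made up:

```
#!/bin/sh
# bisect-test.sh, for use with `git bisect run`.
# Exit code 125 tells bisect to skip commits that CI never tested
# (here: commits without a hypothetical "ci-*" tag pointing at them).
git describe --exact-match --match 'ci-*' HEAD >/dev/null 2>&1 || exit 125
make test
```

Then `git bisect start <bad> <good>` followed by `git bisect run ./bisect-test.sh` only ever judges the tagged commits.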
When I rebase I diff after each rebase to check that the only diffs are the ones I intended. So I, the committer, have seen all my rebased commits.
I'll add what should be obvious to any seasoned git user: rebase is only possible for "private" commits. If the commits were published, then merge is the only solution, because otherwise with a rebase you will "change (git) history" and break people's repos that have yours as upstream.
I recommend reading this 2008 exchange between top Linux developers learning to use git, and Torvalds's... characteristic language when talking about rebasing a "public" repo:
https://www.yarchive.net/comp/linux/git_rebase.html
(I'm putting "private" and "public" in quotes because they're really on a spectrum)
If you work on a branch also worked on by others, then yes, rebase is "only" possible on private commits.
But if you work on a feature branch, you can publish that branch as long as everybody else understands that it is yours and nobody should push to it. Then you can safely git push --force to the feature branch so that others can (a) see what you're up to (b) test the results. When you're done, you can do a final (or only) rebase against the target branch then merge to that branch.
What annoys me about this is that every branch ends up being "private" and people don't collaborate (in practice). So you end up with: "I can't start work on X until feature Y is merged, and there's a lot of PR comments left unaddressed, so let's just merge Y since X is urgent and fix the PR comments later..."
That's not at all our experience at ardour.org.
Most PRs happen against the main branch, maybe after a private feature branch, maybe not.
Meanwhile, major developments happen in a feature branch, and these are almost always owned by a single person who is responsible for rebasing against the main branch (and eventually merging back there).
But maybe our development team and pace are not good tests for this model.
That still seems very annoying. People have unplanned time off all the time, live in different timezones, or are perhaps just busy ATM and can't do the rebase.
This can be mostly avoided with short-lived branches that are merged by the end of each day. If that isn't feasible, people can still collaborate on a feature branch as long as a single person is nominated to rebase it at the end.
In practice I’ve found:
- Unplanned time off means the feature just doesn’t make it in
- Living in different time zones means that you just wait and work slows down
In either event, there is rebase-and-squash auto-merge from major git providers like GitHub, which helps with this a ton. With that enabled, unless the repo is super high traffic, this is usually an afterthought, except when there are merge conflicts, which is semi-rare.
That being said, that’s just the reality of what I’ve seen play out in various organizations. I personally like to work off other people’s branches a lot more than my peers, because I work in platform engineering, and very often developers come to me with feature branches that need some CI/infra/other change or optimization, and very often it’s easiest for me to just drop a commit into their existing branch. The change falls under their ownership and it gets mainlined along with their feature.
They are the only ones working on the feature branch. If they take a vacation or fall ill or whatever, it causes no issues for anyone else.
Off topic, but thank you for having a perfectly modern and beautiful skeuomorphic GUI in Ardour. Tactile buttons with descriptive names instead of vague flat squares with incomprehensible icons; I wish more software offered this kind of interface.
OT but I love Ardour so much. Thanks to you and your team!
Cooperating on half-baked code sux so much that I happily take the trade-off.
If feature Y is incomplete, then sharing a feature branch (which sounds like a road to a messy commit history) doesn't change that. You could also use the unmerged feature branch Y temporarily as a base for X if you need some code from it.
Do you think that works any better if you're using merges Vs rebasing though?
If the problem is that you urgently need some small fix in an otherwise broken branch, you can always just cherry-pick that across to your own fresh pr.
My team ran into this early on when attempting to adopt a rebase-preferred workflow for feature branches...
...then we got used to it. A few workflow changes were necessary:
- configure pull.rebase=true. This is kinda just nice in general, but critical if someone might have rebased the branch you're working on overnight.
- get used to pushing up your changes regularly - at the very least before you quit for the day.
- get used to pulling remote changes regularly - at the very least before you start work on a branch each day. Rebase after pulling - especially important if you've branched off another feature branch (which may have been rebased). This way you avoid doing a bunch of work on an out-of-date base.
- *Talk to each other.* If it isn't already obvious what a collaborator is doing... ask them! And err on the side of over-communicating what you're doing.
It turns out, all of these behaviors are kinda just generally useful, and after a bit you forget about rebase being the motivation and just enjoy having a bit less friction when working together.
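For reference, the config side of that is a one-liner (the autoStash line is an extra I find handy, not something mentioned above):

```
git config --global pull.rebase true        # make `git pull` rebase by default
git config --global rebase.autoStash true   # stash/unstash around the rebase
```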
What I hate about this workflow is it teaches you to git push --force.
Then one day, you're in the wrong window and you accidentally git push --force on master.
Much prefer a workflow where you never git push --force
Git hosting platforms generally have controls that allow you to prohibit force pushes on protected branches (i.e. master)
This doesn't fix the problem since you still can do a lot of damage doing that on a (perhaps wrong) feature branch.
That's where reflog comes in handy
It certainly diminishes it significantly. Very few people are usually collaborating on the same branch together that isn’t the main branch in most orgs. So the chances of you destroying yours or someone else’s work that way is pretty low. You can also branch protect any arbitrary branch, at least on GitHub, though I’ve never been in an org that protects branches other than the main one, besides an org that used release branches. I’ve also never seen someone clobber someone else’s feature branch, though it probably does happen. Much more often people are accidentally clobbering their own branches.
The ability to rewrite history on feature branches is powerful in a good way and slots right into the way Git is philosophically designed. I would probably not be interested in removing that feature to prevent the rare case that someone footguns themselves or someone else
You should do this instead:
git push --force-with-lease repo branch
This. --force-with-lease is what --force should have been in the first place. Hopefully they will eventually make it so, and rename --force to something less accessible.
Any repo worth its salt shouldn't allow a force push to the main branch.
I fully agree, and also think that in a professional environment, pushing directly to the main branch itself is very questionable for most code repos. Not only because of the Sarbanes-Oxley Act for public companies.
Never use git push --force. Always use git push --force-with-lease.
Then if someone changed your remote branch, git will always let you know :)
Been using git every day for ten+ years and I never knew about --force-with-lease.
That's great to have learned, thanks!
I think feature branches typically should be "mostly" private - but sometimes a few quick fixes are better pushed, rather than just described in a review comment.
In practice it’s fine to rebase and force push to your own published branch, even when it’s an open PR.
Just mark it draft or otherwise communicate when it’s no longer a moving target.
Especially when it's an open PR. That's effectively essential in order to, say, address feedback
I try not to rebase and force push for branches under active review though. It makes it much harder for the reviewer to see the changes that have happened since they last reviewed.
I version-number PR branches for upstreams on GitHub. If my branch 'for-upstream-13.0' does not apply cleanly on top of upstream 'main', I sync my fork's 'main' to upstream in GitHub's Web GUI, fetch to my desktop, and build a new 'for-upstream-13.1' on top, with the help of git-cherry-pick. Might need a new PR. All history is preserved, both on GH and locally.
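A sketch of one round of that, using the branch names above and a hypothetical commit range:

```
git fetch origin                                # fork's main synced via the web GUI
git switch -c for-upstream-13.1 origin/main     # new branch on the fresh base
git cherry-pick origin/main..for-upstream-13.0  # replay the old branch's commits
git push -u origin for-upstream-13.1
```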
That's only a problem when using GitHub. GitLab and Gerrit correctly version such force push PR updates and provide a nice view of what has changed.
This is false. Well, it can be false. Git sucks and makes it harder than it needs to be.
It’s relatively easy to have multiple people working on a feature branch that is continuously rebased.
Meta’s Mercurial-like VCS system automagically stores 100% of commits in the cloud. Feature branches are anonymous rather than named. It’s all just a tree of commit hashes.
Having multiple people work on one “branch” requires some basic communication as to what is considered “tip”. But since it’s all anonymous there’s no issues with different people incrementing a named branch to different commits.
Honestly it’s pretty simple and easy. Git makes it harder than it actually is.
As far as I can tell, what is really meant by “rebase is only possible for "private" commits.” is that for public commits, rebasing would require to coordinate with everyone interested in that branch.
This seems to be very similar to what you say with “requires some basic communication as to what is considered ‘tip’”
Requiring coordination (or a constrained model of allowed changes) when modifying things (such as what the tip/branch points to) is a general principle unrelated to git and applies to every workflow.
That said, I do understand that Meta’s centralized tool is more useful for your usecase.
One of my chief Git complaints is that 99.9% of projects are de facto centralized on GitHub. Genuinely decentralized projects are vanishingly rare.
The way Git auto-advances branch tags causes a lot of pain. Having lots of people commit to a shared feature branch that has lots of rebasing is quite easy when the “tip” is infrequently updated. Each person thinking they can declare the new tip with every commit is the source of pain.
I like your point. The exception is if the project you work on is itself downstream, in which case maintainers will rebase those “published” commits, probably employing the same technique they use to keep their local development branches up-to-date with the downstream project.
I've been a developer for twenty four years. I've seen a lot of people have strong opinions about whether to merge or rebase, to squash or not to squash. I haven't seen one single time where it made a difference whether something was merged or squashed, as I've never seen anyone look at commit messages in depth, versus tickets or PRs or bisecting.
The problem I run into is when a developer merges the main line branch into their feature branch repeatedly to get new changes. Now your branch is completely tangled up and difficult to change
I do this like... all the time. At least weekly for the last 5 years maybe? I can't remember a time where this has caused anything to get tangled or difficult.
...why? Isn't this exactly what rebase is for? Just plain, boring, non-interactive rebase.
Maybe? But it's also what merge is for.
If I used rebase like this regularly it would become difficult to determine if a given commit is an ancestor of HEAD. Sometimes I like to do that.
https://lore.kernel.org/lkml/CAHk-=wjbtip559HcMG9VQLGPmkurh5...
Merges I create from the CLI don't really look that different from the ones created through some forge UI PR process. I don't know what Linus is getting at. If I was working under someone who had a process like this, I could ask them to clarify what they wanted me to do. I don't have that. And I need to be able to justify all of my decisions. So this isn't useful to me.
I don't doubt that Linus has good reasons for all this. I just don't know what they are. And I don't know if they're applicable to other repos.
Cherry-picking is harder. Splitting up the branch is harder.
I cherry pick occasionally. I don't see how it's affected at all? Edit: If another dev and I both cherry pick the same commit concurrently, that might do something weird. Maybe that's it?
I don't know what splitting up a branch means.
Really?
I certainly have used commit messages and seen others do the same. Perhaps this is more an indictment of the quality of the commit messages than anything else.
In my experience, rebase and squash makes it easier to collect work into meaningful groups and thus write more helpful/informative commit messages.
I can think of a few times off the top of my head when I referred back to a detailed commit message in a repo to understand why a change was made.
If I'm making a change that people are likely to wonder about, I tend to add comments. Basically, anything that's not obvious tends to get a comment, change or no.
I would like to do it more, but it's hard to convince people to not produce absolutely useless commit messages, and trust me I've tried.
Looking through my work project git log, it's a sea of random ticket numbers followed by absolutely nothing helpful or descriptive, usually just the title of the ticket if you're lucky.
For what it's worth, I often look at commits/commit messages in-depth.
I wouldn't do so if the history were a mess
I read 'git log -p' often, and linear history makes it so much easier. The same with bisecting. I've yet to see any practical advantage of not allowing rebase/squash in feature branches. Preserving all history of feature branches is not an advantage to me. The history I care about is the history of the main branch, and I don't want to see it split into many tiny commits, especially not into commits looking like add/fix/fix/fix/fix where at every point except the last the code is broken.
Sure it looks nice, but it's a fake history. After a rebase you have commits where no one's ever run the tests, no author ever intended to get the repo into that state. Hell, most likely no one knows if the code even compiles.
It's assuming that merge conflicts will never be too difficult, and that people won't make mistakes in resolving them. If so, too bad, the original commits are lost by the rebase process. (They might or might not be kept in any given clone. No guarantees, good luck trawling through the reflog.)
It's corrupting your database because it makes the UI for exploring it nicer.
This is the opposite of what we should be doing! If the truth is messy we should build better UIs for exploring it. Add additional information on top of the true history, allow people to group and describe commits. Add more information to get the UI that you want, rather than destroying information because it makes things look cluttered.
That is the role of the CI system.
CI will usually only run the latest commit. Even if commit a, b, and c were all tested, the resulting commits a', b', and c' after rebase would usually only have the last one tested.
Since you shouldn't publish commits that weren't tested, this suggests you should publish only one commit at a time.
(Unless you think "works on my machine" is good enough testing. Sometimes it is.)
So rebase then force push a', wait for CI, push b', wait for CI, push c'? Personally I just squash on merge.
Only true if the commit author doesn't do so, and it is trivial to do so, either during or after a rebase using the rebase exec command. So given this is a discipline issue no different from a developer authoring a change without testing it, I fail to see how this is "rebase"'s fault.
Not to imply I accept the "corrupt database" opinion, but I think it's worth saying that aside from the collaborative element of VCS, commits exist for the purpose of exploring past code changes. A practice which improves that seems sound to me.
Go right ahead :)
"Undisciplined enough to use rebase, disciplined enough to put in extra effort to mitigate some of the harms of rebase" is an imaginary intersection.
This is begging the question (the fallacy, not the colloquial expression). It's only undisciplined to rebase if it's a bad practice, which is the topic under consideration here.
You can't use your preferred answer to the debate as justification for dismissing your opponent's arguments.
Same with merge, they are just somewhere in the middle.
Nothing prevents people from committing untested code that doesn't even compile without any rebase involved. Even hooks don't help since the incompetent coworkers are just barely smart enough to learn about -n
I take it that you've never reviewed a PR where the commits were a series of "wip" commits that don't even type check, much less pass the tests?
Undisciplined developers will be undisciplined. Forcing a rebase-free workflow mostly makes it less likely that these developers will lose work, it doesn't magically give you a clean commit history from them.
Was the bug introduced by the rebase, or was that code always broken?
Code doesn't exist until it is merged, so these are equivalent
Err, what?
If I'm trying to fix the PR, the distinction is often a critical starting point for finding the root problem.
If your PR is big enough that you're relying on a git bisect to find a bug, that's a problem in its own right.
A workflow like what OP is describing (code doesn't exist until it's merged) typically also assumes that your PR is one atomic change, not a whole feature. You'd use feature flags or something similar to decide when to actually release a feature, not the PR process.
Who said anything about bisecting?
But yes, sometimes changes are big and I'd rather land them as one cohesive unit that demonstrates how it actually solves the root issue than get stuck down a wrong path because "this is what we already landed the infra for".
That's not a workflow, it's a misunderstanding of reality.
I love rebasing, and use it most of the time. At work we also enforce squash-merging, which is the only scalable way to prevent low quality commits from polluting the main branch history.
While rebasing works most of the time, the problem arises when actually collaborating on feature branches - especially when your collaborator is not confident enough with git to realize when a rebase conflict might lose data. Merging works better in these situations. So while standardizing on rebasing is great for productivity across the org, you also have to watch out for this and make sure developers don't lose the ability to collaborate on branches.
What's the point of rebasing when you in the end squash-merge?
In my mind the only real benefit of rebasing is that you split your work into a couple of "nice" commits, but if you squash them anyway...
Ok, I've seen people doing this to ease the review process and I guess it makes sense sometimes, but to me it's still a largely "not worth it" effort.
The squash merge is for the sanity of master/main, whereas rebase is for the sanity of the reviewer. That's how I look at it at least
I've always felt the rebase vs. merge debate was pointless, because I never have dirty internal history by the time that I'm ready to make an MR/PR.
It's just that squashing takes 2 seconds. I just shift-select the commits in my Git client and click 'squash'. To me it's just like cleaning up after cooking. And if I want to re-order my commits, I just drag and drop them.
If I told you right now to squash or re-order your last 5 commits and you think it'd take you longer than 2 seconds, you are honestly using a bad tool.
Yet hundreds of people have spent many human hours writing these long blog posts about how you have to do things a certain way yadda yadda yadda when the real problem is that they use bad tools, squashing is such a pain, they don't bother to clean up their "fix 1" "fix 2" "try again" commits, and everyone has to deal with Git gymnastics at the end.
Exactly. The only true history is the clean, properly rebased history with one change per commit, as I want it.
Preach, brutha!
I generally prefer a rebase-free workflow (mostly due to my upbringing. Long story...) But other than rebasing shared branches, which (with notable exceptions) is a Wrong Thing, it's mostly up to teams to decide on a workflow that gets things done for them. As long as there's a consensus on how to do things, go with god.
Honestly I don't even do history anymore for small projects. I bisect twice a year tops, and I have never had to look at a 3-year-old commit to figure out a bug; not that the code couldn't be 3 years old, but the blame is so tangled at that point...