For better or worse, my experience as a GitHub cofounder and author of several Git books (Pro Git, etc) is that the Git commit message is a unique vector for code documentation that is highly sub-optimal.
The main issue is that most of the tooling (in Git or GitHub or whatever) generally only shows the first line. So in the case of this commit example, that would be the very simple message of a generic "US-ASCII error" problem. Everything they talk about in this article is what is great about the _rest_ of the commit message, which, given modern tools, is _almost never_ seen by anyone.
The main problem is that Git was built so that the commit message is the _email body_, meant to be read by everyone in the project. But for better or worse, that is not generally the role of this text today. Almost nobody ever sees it. Unless it's discussed in a bunch of patch series over a mailing list, nobody reads anything other than the first 50 chars of the headline. It's actively difficult to do, by nearly every tool built around the Git ecosystem.
Even if you're _very good_ at Git, the correct invocation of "git blame" (is it "-w -C -C -C"? Or just _two_ dash C's?) to find the messages relevant to the code blocks you care about is not widely known, and even if you find it, the output still only shows the first line. Then you need to "git show" the identified commit SHA to get the long form message. There is just no good way to find this information, even if it's well written.
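For reference, a minimal sketch of that two-step lookup, run against a throwaway repository so it is self-contained. The file name, commit messages, and identity are hypothetical. `-w` ignores whitespace-only changes and each additional `-C` widens the search for lines moved or copied from other files, so how many of them you need depends on the history in question.

```shell
# Throwaway repository so the sketch runs anywhere (all names are hypothetical).
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.name "Example Dev"
git config user.email "dev@example.com"

printf 'alpha\nbeta\n' > notes.txt
git add notes.txt
git commit -q -m "Add notes"

printf 'alpha\nBETA\n' > notes.txt
git commit -q -a -m "Uppercase beta" \
    -m "The importer downstream is case-sensitive, so beta must be uppercase."

# Step 1: blame the line of interest (line 2) and take the commit SHA from the
# first field of the porcelain output.
sha=$(git blame -w -C -C -C --porcelain -L 2,2 -- notes.txt | head -n 1 | cut -d ' ' -f 1)

# Step 2: print the whole commit message (subject and body), not just line one.
message=$(git show -s --format=%B "$sha")
printf '%s\n' "$message"
```

The second step is the one most tools skip: `%B` asks for the complete message rather than only the subject line.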
This is one of my biggest complaints with Git (or, indeed, any VCS before it), and I think why people just don't care much about good commit messages. It's just not easy to get this data back once it's written.
If you want an example of this, search through the Git project's history. Run a blame on any file. It's _so hard_ to figure out a story of any function implementation in any file, but the commit messages are _pristine_. Paragraphs and paragraphs of high quality explanation for almost every single commit. Look at any single commit that Jeff King has done for the last decade. Hundreds of hours of amazing documentation from a true genius that almost nobody will ever appreciate. It's horrifying.
I don't know exactly what the answer is, but the sad truth of Git is that writing amazing documentation via commit message, for most communities, is almost entirely a waste of time. It's just too difficult to find them.
I know the OP didn't mean it this way, but after reading HackerNews for the last decade or whatnot, it never ceases to surprise me how often developer complaints stem from developers just not doing their damn job.
"Almost nobody ever sees it.... nobody reads anything other than the first 50 chars of the headline."
On the one hand, I get it. If a tool makes something difficult, people are less likely to do it, and as engineers we want to make tools to cause people to fall into the pit of success. So, improving this part of git makes sense.
On the other hand, just do your damn job. If a coworker doesn't understand a code change, because they didn't bother to read the commit message, they're a bad developer. If they didn't write a git commit message because "no one is going to read it anyway", they're a lazy engineer. These things aren't excuses, they're incompetence, and not everything needs to cater to the least competent people in our profession.
If writing good commit messages isn't specifically defined as part of your job, why would you waste business hours writing commit messages that are beyond what is expected of you and frankly useless since nobody would ever read it anyway?
Writing good commit messages is part of your job in the sense that no reviewer should be approving anything without them, knowing that you may or may not be available if it breaks next year at 3 AM.
I put that kind of thing in the pull request, since that's where the review happens. Every commit links back to a pull request, and people actually do refer back to it. Writing good documentation there is part of the job.
No-one's going to ALSO write commit messages that no-one will see.
The git history will last longer than the platform hosting the PR.
Not only is “git log” always up, I can also skim it without opening a hundred browser tabs.
pull requests don't live in the repository (as far as I can tell...) and require you to use whatever online interface creates them.
Not sure why e.g. Github Pull request merges don't include the entire pull request description in the merge commit message.
In my opinion PR/changeset description is exactly what should be in the commit description. In cases where we had 1 commit per PR (i.e. squashing before merging) just copying the PR description into merge commit worked really well - the goal for a PR description and a commit is essentially the same.
I wish github allowed to make the copying automatic and ensure that it happens (it doesn't, unfortunately).
If someone wants to learn the full history - the remarks during code review, perhaps all the WIP commits - they can read the PR/code review comments. I found it to be very rarely needed.
This is exactly how Microsoft Azure DevOps works when you enable the squash-on-merge behavior (which is how we used it while working at Microsoft). I thought this was completely logical and I'm surprised that GitHub can't be configured the same way.
All of our commit messages were nice, long and detailed, with a link back to the PR if you really wanted to go back and see the individual commits and/or discussion that occurred on that PR. I think I only looked at individual commits maybe once or twice since they were usually useless in isolation (woops, WIP, fix typo, etc.).
settings > allow squash merging > default commit message > pull request title and description
I'd do it for CYA purposes in case something goes wrong and my commit is involved.
In my company we norm the titles of MRs, give it a ticket number and squash it all. If you can’t concisely describe what you did you need to split your MR.
Helps with reviewing as well as blaming.
One thing I learned is that any forum that appeals to software engineers will appeal to software engineers of all skill levels, from the guy that did a 6 week coding camp because he heard SWEs make a lot of money but didn't really learn anything but thinks he's an expert now, to geniuses with 10+ years experience.
For every comment from someone who really knows what they're doing, there's one from someone that really doesn't.
Yep. This is one of the big problems with online communities. When someone makes a bold statement, I have no idea if they’re a grizzled engineer with grizzled, hard earned engineering opinions, or some kid fresh out of a coding bootcamp who thinks they’re all that. In person, I’d treat those two people incredibly differently. Online? It’s impossible to spot the difference.
It doesn’t help that we all think of ourselves as programmers, even though people in our industry have a wide range of jobs. Someone working at a feature factory banging out websites and mobile apps has a very different job from someone slowly puzzling out a new cryptography algorithm or debugging a kernel driver. You can tell they’re different jobs because excelling in those roles takes different skills. In the first case, you want to know your domain backwards, have great social skills and work consistently. In the latter cases, you need deep CS knowledge, patience and insight. It’s different.
Who is this site for? What does everyone do for their job? It’s all quite unclear.
Once I started interviewing - mind you, interviewing candidates that already got through several filters before getting in front of me - I realized how mediocre the average engineer is. Should it then follow that the average engineering opinion online is mediocre?
pretty much yes, even here you see a lot of overconfidently mediocre takes
It’s important to know how to stick to your course in the face of bad advice.
I know there are limits to this, but isn't that a good thing, too? It's all too easy to treat a newbie as if they can't possibly have useful things to offer to the community; being forced to treat all comments equally, without being able to fall back on the crutch of reputation, arguably forces one to read and engage more deeply with the content, and offers the chance to surface the occasional genuinely valuable contribution from a newbie (or, for that matter, to avoid letting someone get away with an ill considered or overbroad statement just because they have such a big reputation that people are too afraid to stand up to them).
True, but occasionally a newbie offers a solution to a problem and rejects the rejection when people try to tell them why their solution won't work.
For example, several years ago here on HN, during a thread about cryptography, someone admitted to not knowing much about cryptography, but offered one-time pads as a solution to the weaknesses of PKI. I tried to tell them that OTP solve a different problem that is unrelated to PKI, but they wanted nothing of it. They claimed that my response was purely emotional and that just because I knew more than them about cryptography doesn't automatically make them wrong. When I tried the Socratic Method to lead them towards an understanding of why they were wrong, they accused me of being condescending and said that I should answer my own questions.
If I see a bold claim that I have a hard time believing, then I'll ask follow-up questions. But when someone makes a bold claim that is just factually incorrect, while admitting they don't know much, and then get upset when someone tells them why it's incorrect, then that's just plain frustrating.
But that doesn't seem a situation that would have been addressed by fixing the problem that you originally identified:
Here, it sounds like you knew on which side of the divide a person fell. The resulting problem is still a problem, but it's one that also occurs offline!
When I document, or write commit messages, I don't really _care_ if other folks will ever look at them. Documentation is a gift for future me. If something wasn't obvious to figure out, or a potential source of future problems, I want it written down, so if _I_ go looking for info, it's there.
The fact that things are now documented for other folks is just a side benefit.
This.
I write documentation for me.
Very few folks ever use my published code, which is fine by me. I publish it, because treating my packages as atomic, ship-ready, high-Quality products, forces me to take great care, in each and every one.
Which means, when I use them in my other work, I don't have to worry about them.
My take on documentation is thus: https://littlegreenviper.com/miscellany/leaving-a-legacy/
Documentation is for everyone. Put it where everyone can read it, not hidden in the most esoteric place possible, via a very unfriendly tool.
I do that often and call it the developer guide. After the user and ops guides.
Not to mention comment and doc strings about why in the current code.
This entire thread is making me feel like I'm taking a crazy pill.
A simple git log is considered "esoteric" these days? No extra command line arguments are required to read the entire commit message. If so software "engineering" is truly a dead discipline. I guess the "move fast and break things" crowd have taken over.
Never worked with other people? Git as a thing is off the table.
Which commit? One month ago or three? Pro{gram,ject,duct}M, SME, QA, or user wants to know why? Do they even have a login to the systems they’d need? Is there search so they could find it themselves when you quit?
Or you could send them a link to the docs.
Not sure why you're attacking me. I don't remember saying anything offensive.
Anyway, I've never been a fan of "moving fast and breaking."
Might want to give that blog entry I linked, a read.
Why would I waste time reading this paragraphs long commit message when I can look at a diff and a 40 character headline and completely understand the issue? You think it's lazy, I think this is wasteful. Personally I don't need an epic story about making a one character change because your editor isn't configured to catch gremlins... it's just not that interesting.
Because you don’t actually understand the subtleties of the side effects of that 40-character change and building intuition about it takes a paragraph.
It’s all fun and games until your codebase is >1,000,000 loc.
This code never worked but made it into production. What I see is a developer hucking garbage over the wall, not testing their own code, passing reviews assuming they exist, and eventually stopping the train in its tracks because they're more concerned with pretty commit messages. I also think this is beyond simple to catch early be it the editor, the pre-commit hook, or any other range of tools that could have and should have prevented this.
I'm not saying a detailed commit message is never warranted, I'm saying this fuck up doesn't warrant a short story let alone a prize for being overly verbose. BTW, I did a loc . on the repo I work in, came back with 7400000 lines of code. Does this mean I'm cool enough to be in your club?
If you have to ask, then the answer is no. I don’t make the rules.
I feel like it's not a question of "doing your damn job". It's a question of what value can you expect to get from a particular investment. If blame is your tool and every line happens to be changed from a different blame invocation (is it "-w", "-w -C", "-C -C -C", etc), how do you learn the story of this block of code best? Maybe you then need to read a story _per line_ of code. But that's not actually worst case. Maybe you need to drill down to the commit _before_ that because the last change isn't semantically important. Maybe the one before that, etc. How many commits that touch those lines significantly do you need to research and read amazingly well written commit messages before you totally understand the context of this particular block of code?
Git visualization isn’t git’s job, providing primitives like blame for git visualizers is. If you use VSCode with gitlens (pycharm is similar), the exercise you just mentioned is trivial. Focus a line, get the correct blame. Click on that, view the commit history. Click from there, see the diff.
Coding is a really interesting field in how quickly it's developed, and I think there's a lot of people who assume their environment is the only environment and it should be that way for everyone.
Spending time digging through commit messages when tooling and design makes it harder, not easier, is a risky proposition if it turns out it was all fucking useless and you didn't find anything worth reading and are now even farther behind. Either due to the quality of the messages or a lack of ability on your end to find what you need.
I'd love to work in the sort of environments these people seem to but it's just not been the case. I'm at a smaller company where I get to wear many hats, and coding/development is just one of them, but I know plenty of people at very large companies who also don't really do things like that because "putting in the effort" isn't rewarded as much as whatever arbitrary metric they're graded on.
This is the problem with any kind of documentation; while you can write the highest quality, meticulous, most obvious and clearest prose, it's moot if nobody reads it.
And nobody reads it because there's so much of it and there's no clear starting point. People just want the summary of what they're looking for.
I started to learn Java almost 20 years ago, we had a text book and everything. After the first two chapters, I learned how to google and instead of reading everything, just find what I need. I never went in-depth with reading because... it's mostly useless knowledge that quickly becomes outdated.
Nah, we have documentation with a clear starting point - there is an index page with the most common topics, a "new user" page with links to what a new user should read, and some error messages actually contain wiki links to pages with instructions.
And yet we still have people who don't read documentation.
With commit messages, there is a very clear starting point: the commit message for the commit that last touched the line of code you're looking at with git blame, which is my standard solution for finding out the reasoning behind any piece of code I don't quite understand. Only works for projects that don't destroy their history with squashes or otherwise write uninsightful commit messages (e.g. "fix bug").
Any system where the proposed solution is "be better" without an outline of "and here is how" and some method of enforcement is doomed to fail.
Checklists, build checks, linters, tests, SLOs, post incident responses, follow up tickets, etc all serve to unload "be a better software developer" into actual systems and processes that can continuously enable the better behavior.
Simply stating "do a better job" won't work as organizations scale. Related, you can expect what you inspect.
There were two hands in that comment (on the one hand/the other). One hand said that Git and other tools should be better. Only the other hand said to be better.
If an engineer spends an hour writing a commit message that no one reads, that's an unproductive engineer, compared to where they should be.
I have to admit, I am lazy. I don't spread seeds by hand; I use a tractor. I don't swim across the ocean; I use an air plane. Likewise, I don't write documentation in commit messages; I write documentation in PR descriptions, READMEs, and official document sources. You got me, I'm incompetent.
My "job" is to write software, not follow some arbitrary "pure" practices.
And I would argue we shouldn't cater to developers who make documentation difficult to access for everyone else by hiding it where only crappy tools can reach it.
Okay, maybe don't spend an hour. It would take a special kind of commit to need more than a few minutes writing a decent commit message.
Yeah. Like web browsers. And PDF viewers.
The non-caustic point here is that clearly different people have different ideas about what is accessible.
The "just do your damn job" retort presupposes that their job is to read the entire body of every git commit. That's the question - you can't just presuppose it.
Counterpoint: engineering systems which require constantly overriding basic human nature, and therefore significant regular effort to avoid mistakes, is bad engineering.
I'd hate to say that laziness makes a person an incompetent developer. Often my problems stem from an excess of sincere hard work rather than from laziness.
But if you don’t cater to the worst on your team you are often viewed as the problem
As someone who has contributed to Git since before GitHub existed and who maintains legacy code, I simply cannot disagree more. I use `git blame`, `git log`, and `git show` in the terminal all the time. It's trivial to follow the history of a file. It takes me seconds to use `git log -G` to find when something was added or removed.
Nothing pains me more than to track down the commit and then find a commit message that's of the form "bleh" or "add a thing" when the developer could have spent 60 seconds to write down why they did it.
Nothing gives me more joy than to find a commit message (often my own) that explains in detail why something was done. A single good commit message can save me hours or days of work.
Let me also just say, and this is a bit of a shot: GitHub contributes to the problem of bad commit messages. If I'm lucky, folks have put some amount of detail in the PR description, but sadly that's not close at hand to the commit log. It's another tool I have to open. Usually though, the PR is just a link to Jira, so that's another degree of indirection I need to follow. Then the Jira is a link to a Slack conversation. And the Slack conversation probably links to a Google doc.
As an industry, we're _terrible_ at documentation. But folks like Jeff King are fighting the good fight. At the end of the day, I don't think the problem is with the technology. I think it's a people problem. Folks perceive writing documentation as extra work, so they don't. There's no immediate value to it. The payoff comes days, weeks, or months later.
Please, write good commit messages. Just spend a minute saying why you did something so that every commit isn't a damn Chesterton's fence exercise. Put it in the commit message where I can easily find it. Your future self and I thank you.
Edit to add: I didn't address your argument, that commit messages are too hard to find.
First, I don't find this to be true. I rarely have trouble following the history of a line of code, a function, or a file.
Second, commit messages have value at the time they are written even if they are never seen again. I find that writing a good commit message helps ensure that I've written in code what I've intended to (I often view the diff while writing the commit message) and they have value to the people reviewing my code.
The thing is that writing a good commit message for future people doing `git blame` is only worth it if it's a line of code which someone in the future will look at and need to know why it was changed from its previous form to the current form.
If you simply want to comment the current state of the code, you should add a comment in the code.
No one will ever need to know in the future why that particular space character is an ascii space, so the whole commit message is just a blog entry in the wrong place.
It would have made sense to just put a comment at the top of the file saying "make sure encoding is whatever".
Then don't write commit messages for the future, write them for reviewers.
Seriously, as somebody who reviews a lot of code, well-written commit messages are a godsend.
It's an awful shame that GitHub doesn't allow commenting on commit messages. It's as if GitHub is being run by people who just don't know how Git is meant to be used.
You actually can comment on a commit itself. I'm in the habit of middle-clicking on the sha1 link of commits in a PR and looking at the commit itself. You can comment on lines in the commit, and there's a text area at the bottom where you can comment on the entire commit itself. I'll then follow up with making a comment on the PR linking the commit (pasting the sha1 link) and saying I made a few comments here.
Github wasn't really designed with code review in mind. A lot of the features they added over the years for review appear to be hacked on rather than fixing fundamental design issues (like being able to comment on commit messages without having to jump through a bunch of hoops).
Review systems like gerrit, phabricator, review board, or even email, do a much better job at exposing individual commits and their associated metadata like the commit message.
I don't think they were suggesting to review the individual commits, rather the (individual) commit messages. Commit messages are text, so you could have a similar line by line click-and-comment review interface as you already have for the code changes.
This is the issue; are commits in your codebase for the reviewers? Or for FutureDev to see what the hell happened? These are often very different things, and most process models favor one of these at the total expense of the other.
I write commit messages for future-me. Sooner or later I'm going to encounter the same problem again and wonder how I solved it last time. If I have a vague inkling that I dealt with this before, all I have to do is search through my commit history and I can find it again. I can search by author (me), I can search by date, I can search by what files I touched. It's lovely.
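As a rough illustration of those three filters combined (author, date, and touched file), here is a sketch against a throwaway repository. The names, files, and messages are all made up for the example.

```shell
# Throwaway repository with two authors (all names are hypothetical).
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.name "Example Dev"
git config user.email "dev@example.com"

printf 'v1\n' > parser.txt
git add parser.txt
git commit -q -m "Fix parser crash on empty input"

printf 'v2\n' > parser.txt
git commit -q -a --author="Other Dev <other@example.com>" -m "Reformat parser"

printf 'hello\n' > readme.txt
git add readme.txt
git commit -q -m "Add readme"

# Commits authored by me, from the last year, that touched parser.txt.
mine=$(git log --author="Example Dev" --since="1 year ago" --format=%s -- parser.txt)
printf '%s\n' "$mine"
```

Dropping `--format=%s` shows the full messages, which is where the "how did I solve it last time" detail would live.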
Regarding commenting in the code vs. in the commit message, sometimes I copy-paste my explanatory comment if there is one into my commit message.
Right. You've given an example of exactly what I said was the only reasonable use case for detailed info in the commit message: someone in the future will need to know the history of that particular piece of code. It seems like the point of your specific example is to say 'in some cases you might want to know the history of a gap'. Fine. That seems like a nitpick to me.
No one in the future will need to know the history of a particular ascii encoded blank space (among a whole file of ASCII encoded blank spaces). Anyone who needs the general info that the file needs to be ascii will be helped by it being somewhere else, as opposed to in a random commit message.
But virtually every diff is one that someone in the future might want more information on. You can't know that they won't until you get to the end of the future and haven't needed it.
I think you mean "past state of the code"...
These comments rarely get updated. My favorite recent one was several sentences describing a data structure and how it mapped out statuses, written about a decade ago. Barely a year after that comment and its code were written, the entire thing was rewritten with a completely different structure, and the comment was left unchanged. It left a co-worker completely baffled due to inexperience with Perl; we figured out what happened because of svn blame.
Mistakes can happen. But if code comments are frequently not updated as the code evolves, it is a level of lazy that will probably manifest in other ways as well.
The drifting of code from comments is a problem I would love to see solved.
I've seen tools that can compare the git commit dates of code with nearby comments and that's a good start. However, there are potential problems with that, such as code and the comments that discuss the code not being near each other; or the code being updated and there being no need to update the comment.
I think literate programming might help here, but that's an entirely different topic really.
Looking for more advanced tools than that and I suppose we're into the world of AI - asking the tool to understand both the code and the comment and to compare the underlying meaning.
Code review is an option but outside of an organisation that's difficult to do and besides, I think the problem would be best solved by something that is repeatable and part of the build process. And I'd love to be able to have a git commit hook that can say, "hold on! you've updated code but there's a comment that now looks old". That's the dream.
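A very crude sketch of what the core of such a hook might look like, purely as an illustration. It only flags a diff that changes code lines without changing any comment line, and it assumes comments start with `#` or `//`. It understands nothing about meaning, which is the hard part described above. The function name and the sample diffs are hypothetical.

```shell
# Reads a unified diff on stdin. Succeeds (exit 0) when the diff changes code
# lines but no comment lines, a weak hint that a nearby comment may be stale.
code_changed_without_comments() {
    awk '
        /^(\+\+\+|---) / { next }                   # skip file headers
        /^[+-][ \t]*(#|\/\/)/ { comments++; next }  # changed comment line
        /^[+-]/ { code++ }                          # any other changed line
        END { exit !(code > 0 && comments == 0) }
    '
}

# Sample 1: the code changes but the comment above it is only context.
code_only_diff='--- a/status.pl
+++ b/status.pl
@@ -1,2 +1,2 @@
 # Maps status codes to labels.
-my %status = (1 => "open");
+my @status = ("open", "closed");'

# Sample 2: the comment is updated together with the code.
code_and_comment_diff='--- a/status.pl
+++ b/status.pl
@@ -1,2 +1,2 @@
-# Maps status codes to labels.
+# Lists status labels in order.
-my %status = (1 => "open");
+my @status = ("open", "closed");'

if printf '%s\n' "$code_only_diff" | code_changed_without_comments; then
    first_result="possibly stale comment"
else
    first_result="ok"
fi

if printf '%s\n' "$code_and_comment_diff" | code_changed_without_comments; then
    second_result="possibly stale comment"
else
    second_result="ok"
fi

printf '%s\n%s\n' "$first_result" "$second_result"
```

In a real pre-commit hook the input would come from `git diff --cached` rather than a hard-coded string, and the result should probably be a warning rather than a rejection, since code often changes legitimately without its comment needing an update.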
Sometimes a comment is appropriate. Sometimes a commit message is appropriate. Sometimes I need both. Often when dealing with legacy code I find neither. I'd be happy with either.
A commit message lets me tell a short story about a change that touches multiple locations in the code base. Maybe no one part of the change is all that tricky.
A commit message also allows me to explain why I'm making the change, whereas a comment may explain why the code is the way it is.
Commit messages and comments have overlapping use cases, but the Venn diagram is not a circle.
$0.02.
And often you get neither...
The example in the blog post would be a much better example if some kind of test or linting step was added to catch these white-space errors, to explain the need for catching such errors.
Pro tip, you can write both comments and commit messages.
Right. All this wonderful information and detailed error messages need to be findable by someone searching the same error. Someone digging into the code is a very different use case and they need a tiny fraction of that information.
I'm mixed on this. My project has a bug tracker. A commit is required to have a bug id. The bug tracker has entire discussions of what led to the commit so it's not clear to me that a detailed commit message is a plus when the real detailed info is in the tracker. Yes it's indirect but there's no way I'm going to summarize the entire issue discussion.
Maybe this is a job for machine learning. Read the code, read the commits, read the bug tracker, add a git super-blame that asks the LLM to summarize why every line is the way it is and what it's doing
Companies do change bug trackers and ticketing systems and those links may no longer work years down the line.
But summarizing it can be one of the most valuable things you can do for a maintainer who has to make changes years after you've moved on. For one thing, the problem and discussion is fresh in your mind and you understand the context. In a few minutes, you could summarize the problem, the approach taken to fix it and alternatives that were considered but not used because the chosen solution clearly didn't have an issue/was more efficient, etc.
Even if you didn't want to do that, you could just copy and paste the entire discussion text at the end of the commit message so that even if the bug tracker is no longer in use in the future, the discussion itself was preserved in the commit history and accessible via git log or blame.
I've experienced this twice, we switched from Bugzilla to FogBugz to Jira in my time. With one relatively small exception in the FogBugz to Jira transition, all past case information was lost.
This is why at work the only required rule for commit messages is that they include the story number, so we can very easily find at least the general reason for a change from git blame.
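A rule like that is usually enforced with a `commit-msg` hook. Here is a minimal sketch of the check, assuming story numbers look like `PROJ-123` (that format, the function name, and the sample messages are assumptions, not anything from the setup described above).

```shell
# Succeeds when the commit message file contains something shaped like a
# story number. Assumed format: capital letters, a hyphen, then digits.
has_story_id() {
    # $1 is the path to the message file, as Git passes it to a commit-msg hook.
    grep -Eq '[A-Z]+-[0-9]+' "$1"
}

# Try the check on two sample messages written to temporary files.
msg_dir=$(mktemp -d)
printf 'PROJ-123 Fix US-ASCII error in importer\n' > "$msg_dir/good"
printf 'Fix US-ASCII error in importer\n' > "$msg_dir/bad"

if has_story_id "$msg_dir/good"; then good_result=accepted; else good_result=rejected; fi
if has_story_id "$msg_dir/bad"; then bad_result=accepted; else bad_result=rejected; fi

printf 'good: %s\nbad: %s\n' "$good_result" "$bad_result"
```

Placed in `.git/hooks/commit-msg` and made executable, a script that calls `has_story_id "$1"` and exits non-zero on failure would block commits that lack a story number.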
If the committer uses the "show history of a file" process model. These days, it's mostly "squash commits until I get a good/flattering 'story' of what I did", which removes typos, failed experiments, mid-thought commits, and any other blemishes of _what actually happened_.
Commits CAN be used as great history, if history is allowed, but I've found that "modern" workflows tend to the rebase/squash side of things and also are mostly write-only.
I still wonder how much of rebase/squash-heavy workflows would disappear if UIs like GitHub and the CLI itself defaulted to `--first-parent` style views (with optional drilldown) and used the power of navigating a DAG for a little more good. All of this good commit metadata lost just because subway diagrams are pretty but also so many people find them confusing and messy. Or because GitHub shows commits as a flat confusing list with an order that only makes sense if you saw the subway diagram but GitHub's default view doesn't draw the subway diagram.
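To make the difference concrete, here is a small sketch in a throwaway repository (branch name and messages are hypothetical). The feature branch keeps its messy intermediate commits, and `--first-parent` still gives the tidy one-entry-per-merge view.

```shell
# Throwaway repository with one unsquashed feature branch merged in.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.name "Example Dev"
git config user.email "dev@example.com"

printf 'base\n' > app.txt
git add app.txt
git commit -q -m "Initial commit"
main_branch=$(git symbolic-ref --short HEAD)

git checkout -q -b feature
printf 'first attempt\n' >> app.txt
git commit -q -a -m "WIP"
printf 'second attempt\n' >> app.txt
git commit -q -a -m "fix typo"

git checkout -q "$main_branch"
git merge -q --no-ff -m "Add feature X" feature

# Mainline view: only commits reachable by following first parents.
summary=$(git log --first-parent --format=%s)
# Full view: every commit, including the intermediate ones on the branch.
full=$(git log --format=%s)

printf 'first-parent view:\n%s\n\nfull view:\n%s\n' "$summary" "$full"
```

Nothing is lost in this arrangement: the drilldown is just the same `git log` without `--first-parent`.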
How does this work in the face of unrelated refactoring? Say you first fix a bug somewhere, with a great commit comment. Then some refactoring happens, and the affected function is moved to a new class in a new file. Are you still able to track the original git comment?
Say you found the current commit through `git blame`. You run `git show` on that commit. The diff shows that the function you're interested in was actually moved so from the diff output you can see the previous filename and you have the function name. You could use:
That will then find the commit where `someRandomFunction` was removed from `source.ext` and then the commit before that where it was added to `source.ext`. Git log has a gazillion options and I've probably used them all at one time or another, but 99% of the time, the only ones I need are `-G` and following a particular file.
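The lookup described above might be sketched like this, using the same `someRandomFunction` and `source.ext` names. The throwaway repository, the `moved.ext` file, and the commit messages are hypothetical.

```shell
# Throwaway repository: a function is added, then later moved to a new file.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.name "Example Dev"
git config user.email "dev@example.com"

printf 'someRandomFunction() { return 0; }\n' > source.ext
git add source.ext
git commit -q -m "Add someRandomFunction"

printf 'unrelated\n' > other.txt
git add other.txt
git commit -q -m "Add unrelated file"

printf 'someRandomFunction() { return 0; }\n' > moved.ext
git add moved.ext
git rm -q source.ext
git commit -q -m "Move someRandomFunction to moved.ext"

# -G lists commits whose diff adds or removes a line matching the regex;
# the pathspec after -- restricts the search to the old file.
history=$(git log --format=%s -GsomeRandomFunction -- source.ext)
printf '%s\n' "$history"
```

The newest match is the commit that removed the function from the old file and the oldest is the one that introduced it, so the original explanation is one `git show` away.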
https://git-scm.com/docs/git-log#Documentation/git-log.txt--...
I also have `diff.renames` set to "copies" and `diff.algorithm` to "patience" in my `.gitconfig`:
https://git-scm.com/docs/git-config#Documentation/git-config...
https://git-scm.com/docs/git-config#Documentation/git-config...
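For anyone who wants to try those two settings, a sketch of setting them from the command line. It writes to a temporary file via `--file` so that it is safe to run as-is; replacing `--file "$cfg"` with `--global` would write to your own `.gitconfig` instead.

```shell
# Scratch config file so the example does not touch any real configuration.
cfg=$(mktemp)

# Detect copies as well as renames when generating diffs.
git config --file "$cfg" diff.renames copies
# Use the patience diff algorithm instead of the default.
git config --file "$cfg" diff.algorithm patience

renames=$(git config --file "$cfg" --get diff.renames)
algorithm=$(git config --file "$cfg" --get diff.algorithm)
printf 'diff.renames=%s\ndiff.algorithm=%s\n' "$renames" "$algorithm"
```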
I don’t think this is a proper way of reasoning. What is hard and easy is subjective. And you discuss it as if it were objective. Word against word. It would be wise to have some poll and see results.
Just because one geek is writing and reading commit messages doesn’t mean it’s easily accessible by everyone. It’s hard to make something a widespread standard if tooling doesn’t make it super easy to access. Allow people to leave kudos and emoji on other people's commit messages and people will start making them better :D And later show heroic people with git --stats
I 100% agree with this. I do this all the time. I also agree with the rest of the post. The sentiment raging against these longer git commit messages smells very much like elitism to me.
His credentials indicate that it may be possible that his arguments are based on data while your credentials and evidence indicate personal, anecdotal experience. Therefore I would trust his reasoning more. Additionally, I personally identify with it.
I mean a git developer finds git easy to use? That's biased data.
I love how both of you dropped your street cred before launching into your reasoning. It just shows how much more credentials convince people than the argument itself. Normally that stuff logically doesn't matter and people are just doing it to grab some "authoritah", but in this case your backgrounds actually contributed to the arguments.
Definitely agree there: be it git or svn, I spent a huge amount of my bugfixing and refactoring time in the history figuring out why things are the way they are.
Assuming all those links in the chain still exist. Before Jira we had FogBugz, almost all those old cases are gone (some were imported). And we used Flowdock for 10 years, that's completely gone.
Commit messages are the only thing we can rely on for this history. Use it. And try to avoid squashing commits, that erases this history - yes, even for a feature branch, changes from code review should be separate from the initial push, explain why it's being changed so we don't make the same mistake later.
I am terrible at git on the terminal, but with IntelliJ or emacs and magit, I can trivially find every commit ever to change a file, and easily navigate the commits to see every full commit message. It's not hard when you use a proper tool, and I have a feeling almost everyone has something like that?! Do you really try to stick with the git CLI and memorize hundreds of commands and flags?? Why?!
Really simple answer: Repeatability. I am not saying it is the only one right blessed answer, but if you really want to know why people haven't moved to pure GUI interfaces, imagine describing to someone how to add a new directory to their path.
Or:
1. Either hit WINDOWS-E and right click on This PC and select properties (it might be called something other than This PC if someone renamed it) or either press WINDOWS key or click Start or click the Windows icon (if you don't see them try mousing into a corner of your screen (typically bottom left) until they and the rest of the bar un-autohide) look for and click a gear symbol (should expand to say Settings if you hover), click System, on the left and the bottom you should see About.
2. Click the text Advanced system settings (on the right), look for a new window with a set of tabs, you want Advanced. Click the button Environment Variables.
3. In the top columnar box EITHER find a variable named Path, highlight and click button Edit, in a new window click button New, type '/opt/bin/git' in a text field that has appeared at the bottom list items, click OK OR click the button New, in a new window enter Path for Variable name and /opt/git/bin for Variable value, click OK (you shouldn't need to Browse Directory or Browse File).
4. Click OK button, click OK button, close Settings window.
Your fictional example is not a good comparison -- I just can't imagine the scenario where you need to explain to someone who doesn't know how to modify their path why they need to add something to it.
For someone actually using git (and the CLI, at that) I'd expect to be able to say "oh, make sure git is in your path" and for them to understand how to check and set that, or at least be able to Google it and follow the instructions themselves. Likewise I'd ask something like "Can you cherry pick just that bug fix into a new PR so we can merge and deploy it today?", not give them a series of git CLI commands to paste in.
My observation of git beginners is ones using CLI say things like "oh, I screwed up my repo and had to clone a new copy". Good GUIs don't easily cause this situation, and mostly let you see and fix what happens when you do some weird accidental merge or rebase or someone else has force-pushed.
Yeah, I think I conflated a git specific question from the GP and a more general CLI question, my bad.
The argument can be made that interfacing with git is bad whether with mouse or with keyboard. My git secret weapon is to ask myself how to make git do the thing that is easy in Subversion or Fossil; then I do that thing and write it down so I can do it again in X number of months.
Sounds like someone hasn’t had to train fresh graduate engineers for awhile ;)
Insufficient explanation about how to add something to PATH (specifically the tools for compiling java) meant that I started programming 2 years later than I would have otherwise.
On Windows it's actually just:
1. Press Win key 2. Type env 3. Choose system or account
Thanks! This is what I was secretly hoping for. I am doing this a lot lately.
It's even better because thanks to the Start Menu randomization process either could appear first in the results. Sometimes they will switch position after being presented.
Have you ever heard of "abstraction"? People that actually use windows can handle opening the start menu as a single part of a step. There's no conscious checklist for how the UI can be customized.
If you're going to make that into a complicated mess, then you absolutely do not get to assume the user understands "add this line somewhere" or has "fleet" installed and set up the way you expect.
Yes, I think we agree that abstraction is great (with or without scare quotes). My point is that CLIs are valued as tools of explication, repeatable explication. I have actually used Windows since 3.1. I cherry-picked a particularly juicy example that I run into a lot.
As far as tooling goes, the GP mentioned IntelliJ so I rewrote the code with fleet; I could easily have picked emacs or vim, or bash or zsh or tcsh instead of fish, and the complexity of the interface would have remained static. I think HN formatting tools are partly to blame for the messiness, but if you look at any quality set of docs describing a complicated computer interaction, to achieve the same level of repeatability as text-based, POSIXy interactions you are going to need a lot of screenshots and a few this-or-thats. WHICH is fine! Remember, software engineering is about trade-offs!
EDIT: CLI allows for abstractions like $EDITOR and $SHELL
If the tool calls git(1) then it can show you the script that your actions produced. Magit has something like this but I’ve never used it for that (since I also use git(1)) so I don’t know if it captures the whole context/commands.
I used a GUI frontend to R in a statistics course. Never needed to write R myself.
Don't even need the set -x, can just use fish_add_path for convenience.
IME git abstractions make it easy to read and navigate standard workflows, but incredibly difficult to repair issues that arise due to divergence of some kind or another because they are so opinionated.
I use git 99% in the terminal, and 1% in some git tool for visualization, but I find that a lot of people use it in the opposite way and have problems working with others that use a very slightly different workflow. You don't need to memorize hundreds of commands and flags, honestly a dozen or two gets you to expert status in most respects.
I don't have any problem at all, when some really tricky stuff needs to be done, I google for a solution and run whatever command magic I find. If you don't need to google for git commands to do uncommon things, I imagine you have a huge capacity to memorize things, good for you, but most of us don't.
I do understand how git works and could use the CLI most of the time if I wanted to, but there's exactly zero reason to do so. The GUIs offered by modern tools make it much more convenient and efficient to do things correctly. You really shouldn't commit stuff without doing a careful review of the changes first, which is terrible to do in the terminal compared with using a GUI, for example.
Terminal for repeatability, gitlab for visualization is a good combo I’ve found. Push your branch and a great diff is waiting for you.
ChatGPT is really good at giving me the git invocation I need for weird complex stuff that doesn't come up every day.
I don't know IntelliJ well, but I would be surprised if they did the rather expensive rename following that the multiple -C invocations did. Maybe someone can inform us here? GitHub definitely does not, but that is 100% my personal fault I assume.
Because IntelliJ is... less capable than it should be. Personally, I find `git add/commit -p`, `git diff` far easier to use than IntelliJ, and because Python is a fucking mess I had to install the codecommit git helper into a Python venv... but you can't tell IntelliJ to use that venv's $PATH for `git pull`/`git push`.
Oh, and you can't really macro complex stuff in IntelliJ, whereas I can do a single-command release and push-tag of a project with about 30 Git submodules in a (convoluted) Bash one-liner.
I don't find it more difficult to use or remember commands for than remembering how to accomplish similar tasks in some GUI (especially if that GUI is emacs). And unlike most GUIs (emacs may be an exception), I can trust that my knowledge of the git CLI won't become out of date when my GUI tool inevitably undergoes a UI redesign of some sort.
But more importantly, the CLI allows my typical workflow where I chain together a bunch of git (and other) commands in a row, allowing me to just type in, for instance, several different commits, their messages, and what files should go into each in one go without having to break my concentration by having to move around in some GUI between commits. Sprinkle in some stash manipulation and interactive rebases, compilation, and unit testing, and you'll really start to see how the CLI allows you to offload some of your working memory to your invocation in a way that a GUI just can't.
I was mind blown reading this also - are we not programmers for the sake of laziness in the face of these kinds of "problems"? I have to hail Tim Pope for Fugitive.vim also. HAIL TIM POPE!
100% this
Well, `git` is still the primary way I interact with a git repository, and `git log` shows the entire commit message by default. So I don't run into this problem.
If some "modern" git frontend is only capable of displaying the first line of a commit message, then this is a problem with that tool, not git itself.
(I'm also not convinced this is a limitation of all modern tooling...)
I can't tell if this is engaging with trolls or not, but I can't imagine that all of your interactions with your codebase are via `git log` with no other flags. Even with the normal Git CLI that most of us use daily, most of us use `--oneline` or whatever to simplify useful calculations and visualizations like `--graph`, etc. But we're talking here mostly about code archeology, learning about the history of a block of code, so this comment seems somewhat ridiculous in that context.
Is it possible that you’ve been hit by https://xkcd.com/2501/ ?
Works OK for those of us who don’t know any git flags.
The only sets of arguments I use to git log regularly:
* `git log branch` because I want to cherrypick or checkout parts of another branch.
* `git log --stat` because what files changed can be a big clue for what I'm looking for.
* `git log -- dir1/ file1/` because I only care about commits to a certain part of the tree.
Other than that, `git log` already provides so much information to /search or even `grep` through that I can't think of any other flags I use regularly, and if you don't use them regularly you forget them.
The real GOAT that people are sleeping on is `git rebase --interactive` where you can go back and edit part of your branch to clean it up before rebasing or merging towards main. The cleaner the commits are, the more useful they become later for other tools like log, merge, rebase, cherry-pick, bisect, etc.
A rebase to clean up your branch is great, and I lean on my team to do this. Unfortunately it's impossible to automate, because it amounts to craftsmanship. I've seen larger teams fall back to squash-merging, which at least discards checkpoint/broken/WIP commits. But it loses the nuance of more complex changes performed in logical stages.
I don't know how my comment was understood to mean that I am unfamiliar with git. My point was that those of us that use the git CLI have no issues seeing the rest of a commit message besides the first line, and in fact this is the default.
When did I say anything like that?
My point is just that the `git log` command, by default, shows the full commit message. The same goes for `git show`. So a user of the git CLI will regularly see complete commit messages, unless they purposefully request a different format. So, it is not some inherent problem in git that the complete commit message is hard to find. That's just a limitation of certain Git frontends.
You speculate that someone who uses git log without listing (or complaining about) all their flags is a troll?
periodic reminder that `gitk` exists, and has come with git since... pretty much forever? If you're reading `git log`, you really owe it to yourself to run `gitk` at least once to see what you've been missing for over a decade now.
What gave you the impression that I haven't heard of gitk?
When someone mentions an approach, and someone else mentions a different approach, neither comment is "for one person only".
Well, I’ve heard of gitk too but gitk is not available by default on my default installation of macOS so there’s that.
I'm not sure why anyone would use default installations of any developer tooling on Mac? Was the first thing you did when you finished initial startup on your mac not "install homebrew, then install the most up to date versions of git, python, etc. etc."?
gitk is great if you love subway diagrams and want a built-in tool for it. There's so much power in `git log --first-parent` and recursing into drilldowns that current UIs, including gitk, are bad at expressing as a workflow.
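For anyone who hasn't played with it, here is a toy model of what a first-parent view plus drilldown means. This is only a sketch over a made-up history (the commit names `A`, `B`, `F1`, `M` and so on are invented, and the function names are mine, not anything git ships); it is not how git implements it.

```python
from typing import Dict, List, Set

# Toy history: each commit maps to its parents, first parent first.
HISTORY: Dict[str, List[str]] = {
    "A": [],
    "B": ["A"],
    "F1": ["B"],        # feature branch work
    "F2": ["F1"],
    "C": ["B"],         # mainline work done in the meantime
    "M": ["C", "F2"],   # merge of the feature branch into the mainline
    "D": ["M"],
}


def first_parent_log(history: Dict[str, List[str]], tip: str) -> List[str]:
    """Walk back from `tip`, following only the first parent of each commit."""
    out: List[str] = []
    current = tip
    while True:
        out.append(current)
        parents = history[current]
        if not parents:
            return out
        current = parents[0]


def ancestors(history: Dict[str, List[str]], tip: str) -> Set[str]:
    """Every commit reachable from `tip`, including `tip` itself."""
    seen: Set[str] = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(history[commit])
    return seen


def drilldown(history: Dict[str, List[str]], merge: str) -> List[str]:
    """Commits a merge brought in: reachable from its other parents
    but not from its first parent. Returned sorted by name."""
    parents = history[merge]
    if len(parents) < 2:
        return []
    mainline = ancestors(history, parents[0])
    brought: Set[str] = set()
    for other in parents[1:]:
        brought |= ancestors(history, other) - mainline
    return sorted(brought)
```

Here `first_parent_log(HISTORY, "D")` gives the flat mainline story, and `drilldown(HISTORY, "M")` expands the one merge you care about. With real git, the rough equivalents would be `git log --first-parent` and, for a two-parent merge, `git log M^1..M^2`.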
I don't know how it's for everyone else, but I do value the body of the commits from others. It's true that I see only the subject line for most commits. But I eventually read the full body of commits I'm interested in. Honestly, it's frustrating when commit messages don't carry enough context. Sometimes that context fits in the subject line. For others, I expect an elaborate body.
On my work I make 1-15 commits a day. If I have to spend thought cycles on the commit message, that is time that goes from other productive endeavours.
I think, as the original commenter also wrote, this might be worth it in much slower-paced projects that are run at another cadence / over mailing lists.
I particularly think that high-paced application development does not benefit from git as documentation.
Do you apply that to everything? Like not answering questions from your colleagues, not writing tests, not refactoring, not optimizing, etc?
I personally don't measure my productivity by the number of commits I push. If I did, I could easily make 100 commits a day. And there of course it would be better for me to not care about the commit description, because it would take thought cycles and anyway the commits would make no sense.
Like the sibling comment, this comment reads like a person who doesn't realise the breadth of projects git is used for.
Not answering questions from my colleagues? No need to be snarky, let's keep a good tone here.
Small refactorings are a good example of code for which I would not write long commit messages. Like going through a function improving its clarity and adding comments – I would not redo that effort in the commit message. Text updates, style updates, etc. are also things that rarely merit big messages.
Great for you that you don't make 100 commits a day – but watch out that you don't mix disparate changes into a single commit.
I didn't mean to be snarky, sorry if I read like that! I was trying to list examples of "non-coding" that I find are important :-).
I totally agree! Now I am starting to think that we all agree here. I was just confused because your comment seemed to disagree with its parent, which says: "Sometimes that context fits in the subject line. For others, I expect an elaborate body".
But apparently you do agree with that: sometimes it is worth writing a long commit message, sometimes it is not. It depends on the situation, and then it's a matter of common sense/experience.
Until it's 7 years later, the original developers are gone, the ticketing system has changed twice, and you have no clue why something is the way it is.
When you're committing is exactly when you already have the context of "why" loaded and even a short explanation should be quick to write. The thought cycles argument feels lazy unless you're doing a bunch of quick exploratory commits and clean up/squash your git history later and add context once a solution solidifies.
In that case, why should the commit history be the place to go? Commit histories are extremely exclusive – everybody not a part of the programming process will be locked out of that information. That is not fair.
Regardless, what you describe is more an organisational failure than an issue with commit messages.
In my experience with open source projects, the history is very much where I go. Say I read some code and don't understand a line (say it is weird, but it does not feel like complete garbage because the rest of the code is actually good), then I will definitely `git blame` or even start digging in the history to see where that line comes from.
Good commit messages have saved me more than once in that situation. It doesn't have to be a whole essay, but something meaningful. Something like "apparently Travis CI wants two white spaces here" is already useful: it says that back then, they used a CI called "Travis" and it required that weird extra space. Now I feel safe removing it because the project does not rely on Travis anymore. (For example).
Note: it could be in a comment. But comments rot, move, get out of sync, disappear. It's much harder to check all the revisions of a file in the last 7 years to look for a potential comment on another line than it is to find the commits that actually edited the line of code.
I would argue (rightly or wrongly) that there are two common truths to such a scenario:
Scenario 1, you’re doing a bunch of small changes that work towards a larger purpose. They’re what I like to call “checkpoint commits”. They aren’t the whole story, just a step along the way to whatever you’re trying to accomplish.
Scenario 2, you’re coding instead of thinking. Making “random” changes until you get what you want, but because you’re continuously delivering, they all go to production. Note that “you” here might be the developer, or it might be business people demanding things from said developer.
In scenario 1, IMO you should be working on a branch. Then, when you’re finished, you squash your commits and replace the countless mini-messages (“fixed”, “Oops”, “wtf?”) with the actual message you want to be there when you merge it.
In scenario 2, especially if it’s driven by business, you’re probably SOL. In this instance, however, I tend to feel like people are making more work for themselves. If they stopped and thought it through for half an hour before starting work, it might only take an hour’s worth of work and one commit, instead of a day and thirty commits.
Of course, there are always shades in between. :)
None of these, the granularity of the changes is just smaller.
Yes, you could argue that we should go with preview envs and only merge larger changes into main. But then again, this adds considerable complexity to the infrastructure – something that might be merited when we scale to 10+ software engineers.
This is the nature of products where you work close with designers, POs, etc.
You simply don't make this effort to update text, positioning, colors, etc.
In particular: Remember that git is _not_ just for kernel-style projects.
Very much this.
If I'm modifying some rather obvious and overall simple thing like an obvious Grafana config, adding a customer to a config and such things... it's hard to really bother with a long commit message. Also, with modern tools like VSCode with the Gremlin plugin, I don't think I'd have spent many words on removing a weird whitespace from a code base, to be honest.
On the other hand, if I've spent 4 hours thinking and 2 of those hours discussing the change with another DBA changing a 2 into an 8 in the config of an SLA-critical postgres cluster... spending 10 minutes on a commit message in the config management is - with regard to time - a footnote, irrelevant and inconsequential.
But it can be worth more than gold down the road if you ask "Why 8? Why not 6!"
I make roughly that many commits a day as well. If something's easy to understand I'll put in a simple commit message (e.g. [1]), but I do put in the effort for more complicated ones (e.g. [2]).
[1] https://github.com/nextest-rs/nextest/commit/efd194b2e1d8d61...
[2] https://github.com/oxidecomputer/omicron/commit/b07a8f593325...
This is a bad excuse against writing proper commit messages, since it can be easily extended to user and development documentation. If you want to classify these as productive endeavors while commit messages as non-productive, it basically boils down to doing as little as possible that you can get away with.
That is hardly hectic enough to avoid good commit messages. I have seen people writing good commit messages at much higher commit rates. Frankly, good commit messages are actually time savers if you have a high commit rate.
Things like good commit messages and a lot of other best practices are completely avoidable in the name of high pace. However, the time savings are marginal compared to the quality you sacrifice.
It's great for historical research though. It's one of the few pieces of documentation that will live with the code forever. github and other forms of centralization are not open data formats that folks trivially backup/convert/carry forward. They usually leave the data behind if they move the project somewhere else.
So no, I don't think it helps the current community much either. But it helps the debugger years later.
Is it great for historical research? I feel like the format and tooling around it is uniquely _not great_ for historical research. I think it's optimized for discussions before integration, which is largely what PR descriptions and comments are used for now.
I feel like given great commit messages, determining a story and useful history around any block of code given the Git tooling is incredibly difficult even if there are _amazing_ commit messages.
Like say you are trying to determine why a 10 line function is the way that it is. You blame it. Not even with the stupid-simple GitHub UI that _I_ originally wrote, but with the more expensive CLI interface that follows renames and ignores whitespace changes, etc. Now you get a list of SHAs of commits and the first 50 chars of commit messages for each line for the last modifications, etc. How do you even stitch those messages into a useful story (in order) to tell you how that function evolved to what it is now and why?
As a GitHub co-founder, whose fault is that? I have seen many great PR descriptions on GitHub that never make their way into the final inclusion in the main/master git history.
Meanwhile the git project links every commit to the message id whence the original patch came (for many years now, though not for the whole history). Which will be available as long as the email archives are out there somewhere.
And the commit messages get reviewed into a good shape. Something that I’ve never seen anyone do on GitHub.
But Github and similar tools actually solved this problem where Git failed to do so. Nowadays people have a setup with Github or bitbucket where they can navigate from a piece of code right to the pull request, where they can read the code review discussion, see the build log, reach linked resources like the Jira, etc.
“Just go to our web app” is not solving the same problem as what Git is trying to solve (the latter sometimes badly, it might be added).
Tediously, commit by commit. But it's often better than the alternative. Design decisions and business logic kept separately from the code or source control are infinitely harder to reference code against, and realistically that documentation will be lost.
If you have the git repo then there's at least some chance of being able to go through the history of some code, kept with the code. Especially for stuff that code cannot document, and when you're working with devs who seem to firmly believe that code is self-documenting.
Doesn't mean that every code base needs to have amazing git commits. But code bases expected to live a long time at least give some possibility to string together a history after some work.
This isn't even a git concept though; it's something that was tacked on top of it. What you seem to be saying here is that a third-party tool building on top of git spawned a social movement that moved this layer up a level. Not every project uses github or a github workflow.
It's optimized for discussion of the purpose of the code unit in question. That discussion can be useful before integration; but pre-integration discussion can happen any way you like. PR discussions work, e-mails on mailing lists work. Face-to-face discussion works.
The real value (for me, I guess; apparently you just don't see it that way) is explaining the purpose (and possibly circumstances) of the commit, after the fact, when I'm looking at it for some reason or other. Not finding the commit, but explaining it once I'm there. A well-written commit message can be absolutely priceless.
Maybe this last point should go in a top-level response to your original comment, but I'm already here, so I'll just say it here. Saying that commit messages are terrible because only short-messages (the "subject line") are shown by default, seems to me about the same as saying e-mail bodies are useless for the same reason, or that file contents are terrible because `find` only lists file names by default. You 'have' to collapse by default, or you'd drown in a sea of commit messages anytime you tried to list anything.
Okay, I hear you, this is not the most ergonomic procedure to one-off. But seriously, you have the SHA commits. If you need to do this often, write a tool that takes those SHA commits, orders them based on log order (or chronological order, w/e, pick an ordering mechanism), and prints out whatever information is interesting to you. A simple display that can expand/collapse full messages, diffs, etc. would probably do nicely. It can be a GUI tool, a CLI tool (menu-driven, maybe); whatever works for you. This should not be a big deal to write for the common case, and if you think it's that critical to the community, publish it.
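As a starting point for such a tool, here is a minimal sketch. It assumes you feed it the text printed by `git blame --porcelain` for the lines you care about, plus the SHAs printed by `git rev-list HEAD` (newest first); the function names `blamed_commits` and `oldest_first` are made up for this example.

```python
import re
from typing import Dict, List

# A porcelain header line looks like "<sha> <orig-line> <final-line> [<count>]".
# 40 hex digits for SHA-1 repositories, up to 64 for SHA-256 ones.
_HEADER = re.compile(r"^([0-9a-f]{40,64}) \d+ \d+(?: \d+)?$")


def blamed_commits(porcelain: str) -> Dict[str, str]:
    """Map each blamed commit SHA to its one-line summary."""
    commits: Dict[str, str] = {}
    current = None
    for line in porcelain.split("\n"):
        if line.startswith("\t"):
            continue  # a line of file content, not metadata
        match = _HEADER.match(line)
        if match:
            current = match.group(1)
            commits.setdefault(current, "")
        elif line.startswith("summary ") and current is not None:
            # Metadata is only printed the first time a commit appears.
            commits[current] = line[len("summary "):]
    return commits


def oldest_first(shas: List[str], log_order: List[str]) -> List[str]:
    """Order `shas` so the story reads from oldest to newest.

    `log_order` is the history newest first; SHAs missing from it are dropped."""
    position = {sha: index for index, sha in enumerate(log_order)}
    known = [sha for sha in shas if sha in position]
    return sorted(known, key=lambda sha: position[sha], reverse=True)
```

From there, printing the full message for each SHA in order (for example with `git show -s --format=%B <sha>`) gets you the expandable story; whether that is a GUI or a pager is up to taste.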
P4V (Perforce Visual Client) is amazing for visual historical research. I haven't seen a git tool like it, but I'd love one. https://www.perforce.com/video-tutorials/vcs/using-time-laps...
It might depend on which tools you're using. When I'm doing historical research for how a function evolved, I normally run "gitk" on the file, and walk through the commits; the full commit message is shown together with each diff to the file. It used to be even better in the past, when gitk showed the full commit diff, instead of the diff to just the file I passed on the command line, but "git show" on the commit hash (or another gitk which is not filtered to a path) is good enough.
I feel like you're complaining about a problem which you helped create.
So, with all due respect, do your part to fix it. For example, by allowing review comments on commit messages in GitHub. Gerrit gets this right, FWIW.
Till the team you are handing off the code to just copies the files and commits into a fresh new repo without any of the history. I had this happen once to a server I wrote, and then like 2 years later the new team comes and asks me if I knew of the server, and I'm like "I wrote it" and then they are all confused.
Maybe I do it wrong, but the most basic interface I use to check the git history is `git log`, which shows the whole commit message.
GitHub takes me 18 clicks to find the commits, I don't see why I would even bother using it.
Many engineers primarily or even exclusively use git via GitHub's interface and have never made a commit with a body.
Right, but then maybe the main issue is those engineers, and not the tooling? When I see someone using a hammer the wrong way, I don't usually blame the hammer.
The problem is the network effects. If enough people start hammering with the side to pull out nails, like 95%, there's a chance that could become known as "the right way".
I think a lot of the "get shit done" crowd for instance sees using Github and not "futzing around with git" as the right way to do things for getting shit done.
Those "engineers" go on _The List_
Completely agree, the value with the message is really just to link an external ticket Id, the user experience is much better in external ticketing systems for all of the story telling that the article loves.
Don't read "external ticket system" as closed either, plenty of systems are open to the public.
Right. The massive commit with minimal description and a PR number which I can look up in Azure DevOps to find a review with no description, no discussion and a mention of a number I can go and look up in Jira, where some Scrum master wrote half a sentence of what needs to be done and asking to "reach out to Jeff" for explanation.
So much more valuable and great user experience
Just wait till next year when your employer migrates away from Azure DevOps and that PR number will be a dangling link forever lost.
Or you just move the repository to a different project
Many editors have great git blame integration that makes these messages quite accessible.
It's really easy in emacs with magit to view commit messages from git blame view.
I believe vim, vscode, and jetbrains IDEs all make this simple.
Yeah, a lot of these also have Github and ticket tracker (Jira, etc) integration so they'll also pull in context from those, too
Most of the stuff I work on uses merge commits on Github so you can just click the PR # in the merge commit message and arrive at the PR, browse through commit messages, discussion, etc
Using vim-fugitive it's
I find this very hard to believe. Isn't it "everyone who is interested in the commit subject/files touched should read the body". Why would anyone else read immutable historical documentation?
This sounds like you are joking. Any good IDE will be able to annotate each line with blame info, and show the diff at the press of a button. On such diffs, the IDE should allow recursive blaming on context/deleted lines. Tools like Tig allow exactly that.
GitHub certainly does make it hard to see commit messages, I give you that :)
?? It's not like it was written for fun. This documentation attached to a commit exists to reduce the risk of accepting the patch from someone who might not be around in future, to fix any problems introduced. By disclosing all their relevant thoughts, the author shows their good intentions: they enable others to build on top of their work. If the author kept their thoughts to themselves they would gradually build up exclusive ownership of the code, which is often not a good idea. Also a commit message serves as proof of work, which can be important when there's too many patches. For commercial projects some of this is less important.
If you think about who created Git (Linus), then it suddenly makes sense that the commit message is like an email body, since most of the Linux kernel collaboration is done via a mailing list.
I might be in the minority, but parent's comment is probably about people like me: most of my coworkers have context free, or at best succinct commit messages. I never read more than the first line listed in the commit list, and don't even assume the description is always accurate.
Instead I'll spend my time stalking the related merge request, where the full description of the whole change resides, with probably a link to the ticket or reference documentation, and all the back and forth on why something is or isn't a good idea.
I think the world could be a better place if all of that was in git directly, but that's also putting much more burden on an already complex tool.
I take Scott's point with a different perspective.
Though commit messages are ephemeral and hard to utilize in the future, they're the stream of consciousness of the project.
They convey very important shifts in direction, discoveries in the making, code smells, limits of current architecture, and markers of tech debt. We don't know what this beast will be. And we figure it out commit by commit. Document it.
Commit messages are the very opposite of ephemeral; they are the longest-lasting history a project is likely to have!
Yes, I misworded. The usefulness of commit messages, Scott's point.
This has been an issue with version control tooling for quite a long time. I'm fairly certain both CVS and SVN did the same thing. But I agree that you're still right.
I'm also very amused by the number of replies to your comment along the lines of, "oh, it's actually very easy because I always use <third party tool>".
Which is, of course, rather proving the point.
No, because the point is about common tooling, and the common tooling does not, actually, make this difficult.
That's rather a matter of opinion. One that I and GP both clearly disagree with. Insisting it's easy doesn't mean everyone will agree with you.
I feel half vindicated about my rant a few weeks ago[1] arguing that we should make commit messages as long as we like instead of the stupid 50 character or whatever limit. If enough people do that, maybe tools like GH will stop wrapping the message by default. Even if not, at least the first line is usually easy to see in most tools by hovering over it or something.
[1] https://news.ycombinator.com/item?id=38831282
Presumably you're referring to commit subjects.
And no, they should absolutely not be as long as you like. It breaks things
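For anyone unsure where the subject ends and the body begins: Git treats everything up to the first blank line as the subject, and whatever follows as the body. A minimal Python sketch of that split, with a length check. The 50-character limit is the convention mentioned in this thread, not something Git enforces, and both function names are made up for illustration:

```python
import re

# Conventional subject length from the discussion above; an assumption, not a Git rule.
SUBJECT_LIMIT = 50


def split_commit_message(message):
    """Return (subject, body): the text before and after the first blank line."""
    parts = re.split(r"\n[ \t]*\n", message.strip(), maxsplit=1)
    # A subject wrapped over several lines is joined into a single line.
    subject = " ".join(parts[0].split())
    body = parts[1].strip() if len(parts) > 1 else ""
    return subject, body


def subject_too_long(message, limit=SUBJECT_LIMIT):
    """True when the subject exceeds the chosen limit; the body is never checked."""
    subject, _ = split_commit_message(message)
    return len(subject) > limit
```

The point of the split is that only the subject needs to stay short; the body can run as long as the explanation requires.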
So then I am not wrong that I do all my git commit messages via the "-m" commandline option with a short phrase like "frob the baz"?
(Initially I started using -m to avoid getting trapped in Vim. But even after I gained the option to use e.g. Notepad++ as the editor, I never saw the point in using anything more than "-m 'message'".)
Git respects the EDITOR environment variable and has done for decades (so likely before many here really used it) - you should probably be setting that (or equivalent on your platform) to the editor you want anyway.
Weird workaround just to avoid basic configuration seems like more work in the long run.
Isn't it obvious? Write better tools. There is no reason you have to be stuck with the deficiencies of what someone else has built. That's the whole point of open-source software.
It's more than a little concerning that a "GitHub cofounder and author of several Git books" has to have this pointed out to them.
There is a concerning trend of "we only use vscode" and popular preference shifting to "adjust to popular tool" rather than "use best tool".
This means sadly things like GitHub start to define git even more for your coworkers.
It's amazing, your experience with Git is so different than my own.
I routinely open a file in my editor, hit "Ctrl-c v B" for Git Blame mode, go to the line I'm interested in, and hit "Enter". Bam, there's the full commit message. From there I can continue to trace backwards, blaming lines and reading full commit messages.
But, you know, not everyone uses Emacs and Magit, fair. How about just using "git gui blame file"? Click on a blame line, see the full commit message. This is a tool included with Git (available in a separate package in some installations).
OK, rather use an IDE? Install GitLens in VSCode. Easily accessible blame in your editor, where you can hover or click in various places to see full commit messages.
I mean, I agree in part; there are some tools which make good commit messages hard to write or find. The tiny little commit message edit box in VScode is not ideal. Lots of people use a workflow of "commit lots of crappy commits with one liner commit messages, let GitHub/GitLab squash them on merge."
But as an expert Git user who has managed to convince some teams to have a good commit message culture, if you do get people used to writing good commit messages, they can be very easy to find and read later on, there are tons of tools that make them easy to browse.
I regularly use the git command line, and "git show (pasted SHA)" in my second terminal doesn't really feel like the road block to understanding the grandparent seems to make it out to be. It takes me many orders of magnitude more time understanding what is output rather than searching for it, and like you mentioned there are any number of UIs (third party, editor integration, or even shipped with git like gitk) that wire everything up into a nice UI.
And I also disagree with the GP's complaint that "Most people only read the shortlog" being any kind of disadvantage. The commit message isn't for everyone, it's for the one time someone needs to figure out exactly what it did and why that commit was made, and why a change in X causes a behavior change in Y, and can save hours of work. It's like code comments, 99 times out of 100 you don't need them as you're just interacting with a documented API, but that 1 other time they are a godsend.
If you follow a pull-request based workflow, and if you typically squash down to one commit, then finding these messages isn't too bad, since the commit description pre-populates into the pull request description. I often track changes down not to their commit, but to their pull request.
Granted, that's not exactly `git`, but rather `github`…
I never considered the idea that it was atypical, but I read full commit message text all the time. There are many different ways to drill down into a commit, and then read the entire commit once you know it's relevant. Even doing a simple git log, and then searching for some keyword through every full commit message, can be useful.
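A rough sketch of that keyword search in Python, for illustration only. It assumes `git log -z --format=%H%n%B`, which prints each commit's hash and full message with NUL bytes between commits; the function names are hypothetical. (`git log --grep=<pattern>` does the filtering natively, so this is mainly useful if you want the messages in a script.)

```python
import subprocess

# Hash on the first line, full message after it, commits separated by NUL (-z).
LOG_CMD = ["git", "log", "-z", "--format=%H%n%B"]


def parse_log(raw):
    """Split raw `git log -z` output into (sha, full_message) pairs."""
    commits = []
    for record in raw.split("\0"):
        record = record.strip()
        if not record:
            continue
        sha, _, message = record.partition("\n")
        commits.append((sha, message.strip()))
    return commits


def find_commits(raw, keyword):
    """Keep commits whose full message (subject and body) mentions the keyword."""
    needle = keyword.lower()
    return [(sha, msg) for sha, msg in parse_log(raw) if needle in msg.lower()]


def search_repo(keyword, repo="."):
    """Run git in `repo` and search every commit message there."""
    raw = subprocess.run(LOG_CMD, cwd=repo, capture_output=True,
                         text=True, check=True).stdout
    return find_commits(raw, keyword)
```

Something like `search_repo("invalid byte sequence")` would then return every commit whose subject or body contains that phrase.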
In practice, I think GitHub/GitLab/etc solve this UX problem pretty well. Inline git tools let you jump immediately to the PR that generated the code change, and the PR description + code reviews + snapshot of the commit help to understand what the point of the change was. You can search the PRs when you want to find some context. (It's unfortunate that PRs are not stored in the repository itself. I mean, Git is not a great database for a multi-user webpage, so this wouldn't quite work... but it would be nice if the archive was durable and easy to export/share.)
Your entire argument boils down to the fact that it's hard to view git blames. It's not.
As stated by other people, IDEs like VSCode and IntelliJ do an extremely good job of showing the blame. And they DO show the entire commit, body and everything at once.
To be clear from reading some of the other comments, I don't work at GitHub anymore so while I may have partially caused the issues I'm complaining about, I don't have the ability to fix them anymore.
Also, while most GUIs and editors have blame capability (as does GitHub actually), most of them don't ignore whitespace changes (-w), code movement or renames (the -C options) so they're often of limited use.
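To make those flags concrete, here is a small Python sketch, not how any particular tool does it. It builds a `git blame --porcelain -w -C -C -C` call and maps each line of the file to the commit that last touched it; the function names are invented for illustration. The parser relies on the porcelain format starting each entry with a header of the form "<sha> <original line> <final line>".

```python
import re
import subprocess

# Porcelain header: "<sha> <orig line> <final line> [<lines in group>]".
HEADER = re.compile(r"^([0-9a-f]{40,64}) (\d+) (\d+)")


def blame_command(path, copy_depth=3):
    """-w ignores whitespace changes; each extra -C searches more widely
    for lines moved or copied from other files."""
    return ["git", "blame", "--porcelain", "-w"] + ["-C"] * copy_depth + ["--", path]


def parse_porcelain(output):
    """Map final line number -> SHA of the commit blamed for that line."""
    blamed = {}
    for line in output.split("\n"):
        if line.startswith("\t"):
            continue  # tab-prefixed lines carry the file content itself
        match = HEADER.match(line)
        if match:
            blamed[int(match.group(3))] = match.group(1)
    return blamed


def full_message(sha, repo="."):
    """Fetch the whole commit message (subject and body) for one commit."""
    cmd = ["git", "show", "-s", "--format=%B", sha]
    return subprocess.run(cmd, cwd=repo, capture_output=True,
                          text=True, check=True).stdout
```

Feeding the output of `blame_command(...)` through `parse_porcelain` and then calling `full_message` on the SHA for a given line is the blame-then-show round trip described above, in one step.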
Finally, I _would_ like people to write good commit messages, I just would like to see a tool that actually uses that work in a way that helps document your code in an easy and valuable way, and the Git/Hub tooling makes that process at best "tedious" as someone in the thread says.
I am working on a new Git client called GitButler[1] and would like to address this at some point down the line, so maybe it ends up being me who helps fix this after all :)
1: https://gitbutler.com
git log. git blame, grab hash, git log hash. You make it sound like some arcane magic...
Of course Emacs has a mode for it:
https://github.com/redguardtoo/vc-msg
The reason people (myself included) rather like good Git commit messages is evident when one compares them to the alternative.
You're working in a commercial/closed source environment and want to find out why line 57 in src/blah/db/utils.py does that. Where do you look?
- inline code comments. Usually non-existent. Often out-of-date, sometimes misleading, frequently tells you no more than you can discern from just reading the code itself (especially now type annotations are trendy again). Rarely explains why the code exists. There's a reason people caution against too many comments, and that translates into people probably not putting enough comments in.
- calling code? Helpful, but thanks to microservices and increased levels of abstraction (APIs, DI frameworks, messaging buses, config parsing) you've got to go check 900 different repos out to work out what is going on.
- email? Give up. You'll find invitations to the company Christmas party and Q2 sales figures but actual tech explanations are in short supply.
- Slack etc - same problems as email, plus developers who hide away all the interesting stuff in private team channels
- Google Docs - you probably don't have access to the relevant doc, and there's no way to know that you don't
- wiki/docs? Half baked, wrong etc. Or it'll be autogenerated JavaDoc type stuff that'll tell you what you already know or can reasonably infer from the code. Also, findability sucks. Or the developers just avoid the whole thing because the software is nasty and corporate and barely usable.
- bug tracker/ticketing system? You ask around and someone says "oh yeah, Dave made that change two years ago" and then you search for tickets that match related keywords only to find out that those tickets weren't brought over from Trello into JIRA, and now you need to go ask IT to give you access to the legacy Trello board which they don't want to do because then it'll put them over the five users per month limit or whatever.
- Architecture Decision Records / decision logs / whatever you want to call them - nice if they exist, I guess.
- ask the person who wrote it? This assumes they still work there and can remember. Plus you gotta do the asking around routine which takes days and destroys all hope and joy in the world.
By a process of elimination, commit messages are the closest you're going to get. They're right there - on your computer, neatly integrated into your editor, hopefully. You can search them fast in a terminal window rather than in some slow web-based monstrosity. If you're lucky, they're actually useful. Even if they aren't, they're at least contextually useful in helping you narrow down your search strategy for the inevitable plunge through email/slack/JIRA/Trello/internal wiki etc.
Ideally what should happen is the really useful commit messages get copied into stable technical documentation like decision logs or a properly maintained wiki. If people did that, great, but it's pretty rare. A culture of sharing weird interesting tech things in a Slack-type system can help because future devs can at least search but you do that at the cost of more interruptions for colleagues now.
The broader issue is of all the bad options you can choose, it often tracks the wrong thing. In something like Trello/JIRA/whatever, if you're looking for the technical reasons, it'll have the business reasons without the technical stuff, or vice versa. You generally want both, and most systems only give you half the story.
I’m surprised that you (in particular) would say this. git-log is, to me, fine for displaying the whole message (not just the subject). And sure, I often fiddle with copy-pasting SHA1s like a caveman, but it’s fast enough for some quick history spelunking.
Finding the history of a particular code change is even more manual for me: maybe doing a chain of `git log -S'line'` where `line` copy-pasted in at every step. But doable and not a time-sink for my off-hand what’s-this thoughts. (But: something more convenient that isn’t an unreadable Unix pipeline one-liner would be very nice.)
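One way to wrap a single step of that chain without a shell pipeline, as a hedged Python sketch: `-S<string>` is Git's "pickaxe" option, which lists commits that change how many times the string occurs. The helper names are made up, and the output format `%H %s` (hash, then subject) is just a convenient choice.

```python
import subprocess


def pickaxe_command(snippet, path=None):
    """Build a `git log -S<snippet>` call, optionally limited to one path."""
    cmd = ["git", "log", "--format=%H %s", "-S" + snippet]
    if path is not None:
        cmd += ["--", path]
    return cmd


def parse_hits(output):
    """Turn 'sha subject' lines into (sha, subject) pairs, in git log order."""
    hits = []
    for line in output.splitlines():
        if line.strip():
            sha, _, subject = line.partition(" ")
            hits.append((sha, subject))
    return hits


def history_of(snippet, path=None, repo="."):
    """Run the pickaxe search in `repo` and return the matching commits."""
    out = subprocess.run(pickaxe_command(snippet, path), cwd=repo,
                         capture_output=True, text=True, check=True).stdout
    return parse_hits(out)
```

Each hop in the chain is then one `history_of(...)` call with the previous version of the line pasted in as `snippet`.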
My litmus test is simple and doesn’t involve hallucinating that other people are even reading my messages: am I reading my own past commit messages? Yes. I am curious why I did or didn’t do something on a daily basis(!)
While this may be how most people interact with git, I couldn't disagree more when it comes to my personal use.
I use 'git blame' (I've never needed to pass any options to it) and 'git show' liberally if I'm trying to understand a change that was made, and if the committer took the time to write a commit message body, of course I'll see it and read it.
I think people don't care much about good commit messages because they are unprofessional and sloppy. They just want to get the commit in, push the PR/MR, get it reviewed and merged, close that Jira ticket, and get credit for those sweet sweet story points (ugh). And on top of that, they generally don't care to document their changes because they personally don't see the value of doing so. Surely they'll remember the change if they ever revisit it (no of course not, but many people think they will), and they don't really give much thought to the possibility that others might need more context.
And besides, all the discussion about the bug or feature or whatever was happening in the bug tracker, so providing a link to that issue in the commit message is enough, right? (No, it's not; I hate it when people do that and think that's all they need to do.)
Then maybe this is GitHub's fault; fix your web UI, then. I avoid GUI interfaces to my dev tools as much as possible, and I think the git command line is perfectly fine for this. It absolutely does not only show the first line, generally. 'git log', 'git show', etc. give you the full message by default. In general I would say you have to go out of your way (by providing more command line options) to hide the message when using the command line tools.
Sure, because it's not a vector for code documentation, it's a vector for change documentation. And there's no better place to put the description of a change than in the record of the change happening.
While I agree that many people write very poor commit messages, I don't think the tooling and discoverability is why.
In my experience it all depends on what kind of codebase it is (product? library/framework? private company? opensource?), commit velocity, release cadence & how the codebase is used in general.
In low-velocity opensource libraries, good and clean commit messages can be really helpful when debugging arcane issues. I used to be a maintainer of a frontend framework & widget library and we tried to have good commit messages as we'd often go back over old commits when fixing bugs.
I agree that using git from command line for blame is not easy, this is something I always do from GitHub UI instead.
When GitHub is the repo's choice for PRs, and the codebase is product codebase with high velocity, having a pristine git history and clean commits and commit messages is not practical; however, the expectation should be to at least have good PR descriptions. When blaming commits in GH UI, it's easy to go to the PR which introduced the commit (it's linked below commit title); and PR descriptions can be enforced via templates in .github folder.
PR descriptions have an advantage that they can use images, videos etc. to better explain what they change. This is especially useful for frontend codebases.
I work on a big frontend monorepo. We have tools in place to do visual bisect between pull requests (each PR gets its own preview env). We very much do read PR descriptions when doing bisect to confirm which of the recently merged dozens of PRs introduced a regression in production N hours ago.
But in general I agree that commit messages are not a good place to store general knowledge (they're good for "what and why is changing here"). For documenting gotchas etc. I prefer to have code comments in relevant places of code; or README.md in subfolders. (Sadly, I notice most programmers just don't document anything anywhere at all).
To tack one additional problem onto your excellent list: the commit message is usually only the start of a conversation about why a change should be made. The rest of that discussion is whether it meets the bar and what needs to be adjusted before it can land on the collaborative trunk. Done well, that is valuable reading.
Git was designed with the distributed viewpoint. A commit message, as written by the author, is necessarily correct: I’ve decided this is right, and it’s on you to decide if you want to merge it into your history too.
In our current systems we usually have a URL in the commit message that links to the actual story behind the commit — the discussion on the pull request, merge request, or code review. I rarely see the results of these discussions being amended into the commit message. If the repo lives forever but the database behind the code review tool gets toasted then something just as important is lost forever.
(I come from a background of one idea equals one amended, fast forwarded commit to master. It’s possible other people rely on branch history to reflect the evolution of ideas and how they go from a request for review to approved code. In my experience branch histories tend to have very low quality commit messages and even then they only show one side of the conversation — the author’s responses to their reviewer’s and their own critiques.)
Making sense of code (or any system that changes over time) vis a vis its own history is one of those things where I really think AI/ML tools can really shine. Even with relatively low quality commit messages, I can look at something that happened 15 years ago in a codebase I am familiar with, and there will probably be enough information that I can assemble the full context, even if finding some of that information is challenging or time consuming. git log, git blame, look at the other code made in the commits, read the issue descriptions, read the code reviews. It just seems like a model could slurp that up and do a decent job of giving you a couple of paragraphs about why the line of code you are staring at is the way it is.
TBH putting such a detailed writeup in the git log doesn't really have any return -- for it to ever be useful to you again, you have to know the information is there; you then have to actively seek it out, with the hope that whatever you did to make it 'searchable' is going to work for you again. I can say with surety that if I were looking at a bug similar to the one linked from this article, I would not look to the git log for inspiring a fix; I'd just fix it. Any extra time I would take would be to understand how a UTF8 nbsp ended up where it shouldn't have been in the first place -- something that the author of this commit seemed to have no interest in doing, but which likely has greater relevance than the documentation of the fix.
I want to be clear that I support commit messages that say what they do though; I'm not advocating for -m 'fixed' shenanigans, however at the same time I believe that -m 'fixes #1234' is often enough
I use fugitive.vim, and blaming is very convenient there as well as every other git workflow. I can press a shortcut to see when every line in the current file was changed, and who changed it along with the commit hash. If I need more – I can expand every hash to see the full context, including full commit text and diff. Maybe cli git is not too easy to use given how complex it is, but there exists a git wrapper so awesome it should be illegal
This is a failure of GitHub etc. GitHub tries to dumb things down for users because I guess it's judged they can't reason with commit histories and this is one of the consequences. The mess in especially private GitHub repos is beyond belief sometimes.
The thing is there's nowhere else for such documentation to go. It's not appropriate for a code comment. But we've got a whole generation of developers now who think git is GitHub and the only purpose of git is uploading changes to GitHub.
Git sucks, but it sucks a lot less than everything else. But we need to go back to basics and understand what version control is actually for.
This is a feature, and a crucial one. No one would include fifty lines of explanation if everyone had to see it. It would be better to throw the information away than to inflict it on everyone who was scanning through the commit history looking for a particular change.
Yet it is valuable information that only makes sense in the context of that change. There is nothing in the corrected version you can connect to the issue that was fixed. It's obnoxious to include comments about errors that have been removed, like this:
(This is ridiculous, but not unrealistic. I've seen code comments that said things like "# removed syntax error in invocation of query generator." This is what you get from programmers trying to juice their LOC stats.)
The commit message is the right place for this kind of information, but most people reading the commit messages don't care. They're scanning through looking for something else, and all they need is a few words that tell them if this is the commit they're looking for. The person who needs to see the full story is the person who is interested in this change in particular. Maybe they found it by grepping the git log for "invalid byte sequence". Maybe they found it because they're looking at all the changes in that file, because some tooling that occasionally modifies that file keeps messing it up. What matters is that if they have a special interest in that change, they have a way to see whatever information the committer felt was worth preserving, and the committer has a place to put that information where only someone with a special interest will see it.
This sounds like an excellent sales pitch for good email-based workflows such as those advocated for by Drew DeVault[1].
1: https://git-send-email.io/
One tool that I think promotes commit messages like the OP is magit in Emacs. Before using magit, I always used `git commit -m '...'` and didn't realize that commit messages could be longer than a line.
I agree that this is a tooling problem, but magit is a breath of fresh air in many ways (including verbose commit messages).
The good way to browse git blame and read commit messages is to use Magit. It is also great at letting you seamlessly rebase/split/merge long patch series.
Author of nit, here. I tried to move the landscape towards semantic reasoning. It’s on github but kind of abandonware. Life and incompetency happened ;)
No shilling. I commented here because I still think my framework was decently thought out, and mostly that calling someone a nit or a git is exactly what linus was thinking. Make it easy enough for anyone to use.
Nit is something people could take as a thought experiment.
I completely agree that well-written git log messages are goldmines of information.
I wish makers of popular git forges had made it easier to create and consume this information.
Almost all my wiki pages start with piping git log messages into a text file.
Git logs are the entry point to good project documentation.
(edit: fix formatting)
I agree for the use case of scrolling through a git history, yes, but when I land at a certain commit, e.g. by hitting the blame label in IntelliJ on a line whose reason d'change I'm interested in, then I will totally read the whole commit message in the hope that it helps me understand the change (in addition to looking at and trying to understand the change itself).
I feel like this explains a lot about why GitHub is so consistently hostile towards showing or writing decent commit messages.
Which has helped push people away from writing useful ones, on an unprecedented scale, which makes it a self-fulfilling prophecy.
Great.
Just great.
Just do a threaded conversation in a comment at the top of each file. Add your name and the date.