My personal experience with story points is that the number never really mattered, but the process of the team discussing how to rate the complexity of a task was very useful. In terms of having utility of estimating how long something will take, I personally have never been able to translate story points into a reliable indicator for that for many reasons (e.g team changing, domain changing, variability in operational load outside of development work).
Overall I tend to avoid using story points, but on the few teams I worked on who really wanted to use it, I always framed it around building shared understanding rather than a metric that is actually useful for estimating work.
If your measure of the utility of story points is how well they help you estimate time, then you're right, they're useless to you. If you're on a scrum team, they're a useful back-of-the-envelope way to estimate which bits of validated functionality you're going to be able to get into the codebase this sprint. No one outside the scrum team should care about story points, and they certainly shouldn't be used to generate reports. Velocity is for the team's benefit, as a tool to help manage workload and schedule.
How are these different?
Same way that throughput and latency is different. You can use it to predict what the team can deliver in a quarter for example. But difficult to give ETAs to individual stakeholders.
if you are at a point where you employ such a factory throughput thinking for information workers then you're misusing information workers in my opinion. Focusing on a throughput metric is just as nonsensical as focusing on how many lines of code somebody creates, in fact, it's really astonishingly similar in how nonsensical it is. Bug reports, issues and features are not just lumps of coal that you need somebody with a pickaxe to beat it with. If you are treating employees like cogs then they will mold themselves to be empty, unthinking cogs. They will not think of outside solutions anymore, they will not care anymore, they will just take the next issue and beat it as ordered.
In my opinion it could even be seen as the biggest red flag of a team when they start using story points. It fundamentally means that this team started to measure their work in terms of raw issue throughput instead of real value. It may work for a while. Maybe your managers are just so awesome that all the issues they create are perfect and great for the business and you never have to think for yourself at all. But inevitably there will come a point where the company would be better off if everyone was using their full potential but by that point you are stuck with a bunch of cogs that you have molded into cogs over years.
ah, we are not using it as a measure of teams performance, like throughput though. Its just used to predict what the team can deliver in a sprint or a quarter etc and make business decisions based on that.
That is kinda the same thing. With story, it’s either we roughly know how to do it or we need to do some research. In te first case, estimates will be wrong because of edge cases and contextual differences.
Businesses decisions should be based on things like feature requests from customers, not the amount of “points“ a team can get done.
It’s extremely useful for a business to have a rough idea of when they’ll be able to deliver something. Budgets, contracts, client relationships, marketing, etc are affected by it.
So should Engineering give a best effort to estimate that, or just throw up their hands and say “it’s done when it’s done”?
Counterpoint: it's extremely useful for the Trisolarans to know when their next stable period will be, but that doesn't make it reasonable to demand an estimate and factor it into a contract unless you're really really clear about the uncertainty in the estimate.
Yes, we should all live in reality and use our best judgement. If estimating based on story point velocity is literally useless, then no one should be doing it. However if it does help make planning more accurate, even by a little, then it might be worth the cost of doing so. It’s a conversation to be had between different functions, hopefully based on empirical evidence.
I feel like a lot of engineers overlook that there’s more to a viable business than just producing high-quality software. People don’t ask for estimates just to annoy developers.
No, I know, there's just a systemic difficulty with scheduling dev items that are difficult to estimate.
My team is currently working on performance improvements for a certain service in order to get it to a level that our biggest client is happy with. Based on some profiling and some intuition, I built some Jira cards with rough ideas of what changes might make some improvements and added some rough estimates of how long it would take to trial each idea. Of course, what's actually happening is that we try one idea, it counter-intuitively makes performance worse, we dig into why that's the case and refine the idea, then try a different version of it. It's fundamentally impossible to give an estimate for when we will have run through all the ideas (plus new ones we think up along the way) and how much of an improvement it will make.
I just came out of a slightly painful standup where the dev manager repeatedly asked "Is this doable by mid-August?", and I repeatedly answered that it's not possible to give such an accurate estimate, that we would do what we can in that time and hope to give a better idea of what's possible by the end of this month. Of course it's not great for the dev manager to hear, because they have a client who needs to know how quickly they can scale up their use of the service. There's a conflict between what we can possibly know and what the client "needs" to know. I wish that I knew how to resolve it. It feels wrong to agree to deliver something with such a level of uncertainty, since it just leads to a ridiculous amount of pressure like this.
That’s all fair. It’s an inherently difficult problem. In a healthy organization, leadership is aware of this, has reasonable expectations and factors in the risk. Unfortunately not all organizations are mature in this way.
The assumption is that it's already unreasonable to expect developers to estimate how many hours something will take, and then due to meetings and such it's effectively impossible for them to then those already bad numbers into calendar days. So instead, the proper answer is to bypass all that by having developers estimate in made-up units that can be added up and trivially converted into calendar days.
The assumption that anyone can estimate work they have never done with a reasonable degree of accuracy is just plain wrong. That is why management pretends they arent doing that either
They aren’t. I’ve seen this argument rehashed a thousand times - “we arent estimating time, just complexity so we can estimate how much we can do in a sprint. which is two weeks. wink wink”
Which is just time and effort estimations with extra steps and cargo culting.
Only one requires an overpaid clown (SCRUM master/Program Manager/etc) to do the unit conversion.
These two sentences are somewhat in conflict imo. Workload and schedule is tied into capacity planning and resourcing, which happens at a higher level than the scrum team. The people having those conversations need something to go off of, so it's pretty easy to see why they latch onto story points and velocity.. those are numbers that their teams are already producing!
I do agree with you that this is a misuse of velocity and story points, but I don't think it's possible to keep such things from being used (abused?) by upper managers above your team
There are other numbers that kept getting ignored like bugs count, backlog size, time lost to tooling and meetings,… maybe those can help with planning and resourcing too.
It’s always measure the team’s productivity, not make the team productive.
Apart from the impact on team motivation and the incentive to inflate estimates, another problem with that approach is that the story points might not correspond cleanly with how much time was actually spent on the task. Story points are anyways an example of multidimensional data squeezed into one dimensions and in the process losing valuable information.
Are there teams out there that correct the story points based on actual amount of work and complexity?
The points are needed because it's convenient to do reports based off of it. If we used estimation language such as "easy, medium, hard, dummy-thicc" they'd still need to assign points to those labels so they can do math on reports and graphs to watch your performance
The biggest sin of course is then trying to predict velocity, but the consequences of that usually just make people doing the reporting look silly for no reason. I think even the slow developers rarely get fired, and you also get nothing for clearing more story points than other developers.
No bonuses for higher velocity is the real reason no one takes it seriously.
Bonuses for higher velocity based on guessed story points would be an even bigger reason for not taking it seriously - or rather to overestimate everything in order to gain the impression of higher velocity and the bonus associated.
Non-developers might have to learn something about development so they can push back on inflated estimates, which honestly would be helpful to everyone.
If I quoted you $1,000 to unclog your kitchen sink trap, you'd call bullshit because you can probably find out how easy it is to do that. Likewise, if I quote you 13 points to add a generic error toast message because a POST went badly, you should call bullshit on that too.
The problem with your kitchen sink example is that you'd think that they've unclogged 7k kitchen sinks so that they can you give you a great estimate.
If we've built the same software before ....we wouldn't have to build anything we'd just use the previously built software.
Software isn't built like a kitchen but by exploring an infinite list of design choices and then choosing one way and implementing it ..hopefully without impacting existing architecture. And it seldom stays static - the name of the game is constant change and then add group dynamics to that.
A better example is to call a plumber and ask how much would it be to add a "rainbow spout to the horse-drawn wheelie-do" oh and there will be several other teams of plumbers there as well. I bet you get a few questions.
Oh come on. There are certainly a bunch of things which nobody on a team has done before in any given project, an error toast is not one of them. Almost every project is using previously built software for them, and the way they fit together is reasonably well known.
this is nonsensical. first of all more programming these days is hardly invegigative, it is literally plumbing. attach this service this queue. translate this request to this api. compile this code under a new version. put this thing in a container.
secondly, and I've done alot of this, its entirely possible to build useful estimates around greenfield projects like 'define a new language', even though there are lots of variables in play. one useful technique is to work backwards. 'if we dont have a working draft spec in 2 weeks then we can really start the parser, so we're gonna say that, and if we cant hit it, we know we're in trouble'
if your ability to assign semantics and use your experience to plan out the work stops at 'horse drawn weheelie-do', i wonder if this is the right profression for you.
if as a culture we just throw up our hands 'whelp, its all unknowable, we'll just do the best we can and thats you can ask for', then we cant really criticize our customers for being frustrated with us and not 'appreciating our brilliance and how hard this is'.
and the truth is, we can and have done better than that.
And yet Fred will take 2 weeks to do the work, and the team is dysfunctional so only certain people work in certain areas.
Fred sucks. We can say his work is easy, and continually make him feel bad without firing him, or we can pretend it’s two weeks.
that really is the truth behind storypoints, somebody wants a way to create pressure. Which is also why storypoints are so bad because now you are pressuring everybody, even the good workers, despite only wanting pressure on a couple people. And on top of that you are wasting the time of the good workers with the whole story point estimation dance.
So, you’re saying it’s convenient for management, yes? Making things convenient for management at the expense of the front-line workers inverts the priorities of a company. The managers ought to be working harder to make things as efficient and convenient for the workers to build the right thing. Putting management first is the stupidest thing you can do for your company. You immediately separate yourself from quality and by and by your company will whither and rot. If it doesn’t die, it certainly will never be great.
The purpose of a company is to make money. Even if you hate shareholders (like for example your own 401k) and think invested capital deserves zero return, making money is still what allows the company to pay your salary.
Most companies do this by selling things to customers, whether services or products.
If you're building software for a specific customer, they'll want to know when then can start using it.
If you're building something for retail, marketing will want to know what release date to advertise.
These are the mind of things that management uses those reports for. Saying that actually customers don't need to be able to make plans, or marketing doesn't need to be able to advertise, is effectively saying that your employer doesn't need to make any money.
Which is quite a silly thing to say, at least if you recognize that devs don't exist in isolation and maybe even aren't the center of the universe.
That's not at all what I said. I said, if you choose to make management more efficient to the detriment of the people making the thing your company sells, you've made a bad choice. Bad managers like story points because it usually doesn't hurt their heads too much to add and compare scalar values. But a story point doesn't come anywhere close to representing the complexity of building any product. I've had one or two bad managers who wanted to make all things subservient to their precious Excel file. The productivity of that team (when the manager and story points process were introduced) plummeted and the entire team quit within a month of each other leaving.
The best managers I have had have worked by discussing customer and business needs with engineering and walking away with a richer picture of the tasks involved and an estimated timeline. It is very hard work to do this—it usually takes someone with a little knowledge of how software engineering works. I've had good product managers like this and I've been able to deliver some of the highest-value code I've written (measured strictly monetarily) thanks to managers who took a rich, nuanced view of the process.
I am not advocating for no management or no planning. I am advocating for management not requiring over-simplified metrics because it makes their life easier. Management does not exist to make its own tasks easier. It exists to make it easier for the people making the product to build the right thing at the right time and sell it. You can make plans without story points.
You'd end up having a points system whether you like it or not. If it wasn't story points, it'd be days/weeks/months estimates.
Almost nowhere you will find "it's done when it's done, don't bother me until it's done" as an option for a developer.
If you want an even more utopian arrangement, management would know how to look at git and use the software once in a while to see how things are progressing.
You, like tbrownaw, are arguing against something I didn't say. (Please go read the sibling thread.) I didn't say that planning, goals, etc. are bad. What I said was that story points lead to bad plans because they oversimplify the complexities of software development. A project timeline can be fine—but only if it is born out of a nuanced discussion and not by adding up a bunch of unitless points.
Developers (at least, ones like me) despise story points because they ask developers to collapse a ton of nuance and detail into a single scalar. It’s about as meaningful as this comparison of food recipes: compress the cook time, number of ingredients, cost of ingredients, quality of ingredients, whether or not some of the ingredients are allergens, and how much you like the food independent of the weather and your mood into a single scalar value and using that to plan what you will eat during the week. Madness!
Story points can enable corporate despotism. Bad managers who demand developers make up story points, and then turn around and use that to dictate what developers will work on—without engaging in a thoughtful discussion on what is feasible and what will keep the product stable—are on a power trip and are treating developers like cogs in a machine. They know the price of everything, but the value of nothing.
The good managers that I have had in the past regularly discussed with the dev team what was needed and—with the developers' input and without artificially compressing the complexities of the task at hand—set good goals that gave the dev team focus and direction.
Using story points is lazy management.
Having worked where a megacorp attempted to normalise points and incentivise velocity, there is a fatal flaw. The people doing the work estimate the size of the work! Increased “velocity” will happen automatically. Actual work throughput will remain the same. Prediction becomes impossible.
The most damning problem with “points” is that you can’t even do basic math with them. Does 4 points take twice as long as 2? No, no it doesn’t. The whole thing is just a giant waste of time.
My team needs to get 17 points done daily, but no-shows-no-calls keep kneecapping efforts. When they do show up, they complain incessantly about how management makes everything a 0.5 no matter the list of features in the story, and why can't they get more than 29 hours a week so they can have insurance. But it's attitudes like that which keep them from getting raises on their yearly performance reviews. It's a good thing scrum was invented so we could finally measure the consistent low performance of you computer science nerds.
It's also easy to game the system. Turn a couple of 3's into 5's. Your "velocity" goes up for no real increase in work.
To your own point, the reports serve little value aside from a fabricated narrative that some companies like to build feel-goods around.
Discuss the complexities and needs. Break the work into small chunks. Define what progress means. Set expectations for making progress. Regularly and honestly review why you/the team are or aren’t meeting expectations. Rinse and repeat.
Perhaps this is over simplifying it, but these are the tried-and-true high notes in my experience. If at any point one of those steps isn’t feasible, then it’s a larger issue that implementation process likely isn’t going to solve, so the “to measure velocity or not” point seems moot.
I used story points for years with my teams and they worked "as advertised", which is they helped the teams understand the effort, complexity and risk for each story involved.
When there was disagreement, it helped them dig deeper to understand why and usually reveal somebody's incorrect assumptions.
It helped make sure teams didn't overcommit to the amount of stories they stuffed into a sprint and avoid either burning out, or worse, normalizing not finishing the sprint causing negative impacts to morale/motivation. (For some reason my teams often thought they could do more than the points implied!)
Most importantly, when large projects were proposed or were in progress, we were able to give realistic estimates to the various stakeholders about when to expect the various milestones to arrive, which bought us engineers a ton of credibility, trust, and respect with the rest of the company.
And yes, management wanted to see the story points and measure the team against them. I told them to F-off. Nicely. Kinda.
It helped that I was either a CTO or a senior enough exec in those cases with 3-8 agile teams. I essentially was the middle management and could put a stop to any destructive practices like evaluating teams against their velocity.
I am sure teams quite appreciated you shielding them from overzealous management. But here is a thought: Doesn't this stand or fall with you being there or leaving? Will the next middle management be as capable and looking out to shield the teams from the destructive influence? Why not change the system, so that the middle management does not need to shield the engineers?
A manager cannot protect teams after they leave. The new manager can change all the existing process when they take over, and you're back to square one.
I understand that. So the question arises: Is it more difficult for a new manager to ruin the work processes through inaction (not shielding the team) or by reworking established processes? My bet is on it being very easy through inaction.
Most of time, I find new manager was brought in to be a yes person because previous manager quit due to conflicts with their management that maybe you didn't see.
So story points work when they are used as an internal tool for a team to understand themselves.
And story points don't work when they are a tool used to communicate to the external world outside the team.
They are good for 'hey can I get this crap done in a sprint'. Once someone starts measuring it for 'how good a team is' that is when it falls apart.
One teams points almost always do not equal another teams points either.
Agile has a TON of anti-patterns that look good to do and are enticing to do. But in the end are self destructive. Usually making it about the process instead of 'I have X amount of work and Y number of people how much can I get done in Z time'.
For example velocity. I measure it so I do not overcommit. Trying to do 50 points when 20 is the norm and something will happen that we do not want. But now that you have a measurable number some manager will want to brag on it (that is their job to brag about you). In the end being put on some spreadsheet to be presented to some other manager. It becomes a score to measure you against other teams and an anti-pattern. As actually testing if something is being productive is hard. But numbers you can get all sorts of them out of agile, leading straight to anti-patterns.
Story Points are inherently team specific as is Velocity. Trying to normalize them across an org for the purposes of a performance metric is folly. Velocity should be used internally on the team when doing timeline estimations, but exposing that outside the team is, again, folly. Even on a singular team, SP and V are subject to small drift or large corrections based on team makeup, time of year, or numerous other metrics. They are simply a planning estimation tool. As others have pointed out, the act of assigning SPs is a useful tool itself as it requires a team to collaboratively estimate complexity and it frequently helps surface miscommunication and missing details.
Absolutely my experience too. Story points have been an effective tool for me with multiple teams in several different companies over the past two decades. They aren't, by themselves, the complete answer to any problem - they need to be applied within the context of a healthy team and engineering culture. And some senior management persistently misunderstand them and want to do insane things like compare velocities between teams. But they are a good and useful tool in the hands of a good team.
Healthy engineering culture is a cure for many, many issues.
You sound like an excellent technical leader and that fixes a lot.
One of the main drivers for writing this down is to make it easy to pass along for people who aren't as aware of the problems that come from those anti-patterns, as well as to explain why they are so destructive. The hope is to raise some awareness for people in tougher situations.
Software does not exist in a vacuum. You are probably right in thinking that story points should not matter to you, as a developer; but they do matter to external stakeholders.
Forecasting and management of expectations is necessary because software often has external real-world dependencies: available funding for a first release, marketing materials, hardware that can't be released without software, trade shows, developer conferences, yearly retail release parties, OEM partners, cyclic stable releases for enterprise customers who won't push software into production without extensive pre-testing, graphics adapters that can't be released without drivers, rockets that won't launch, cars that won't drive, etc.
All of these things require some degree of forecasting and appropriately-evolving management of expectations. Here's where we stand. Here's what we can commit to. Here's what we might deliver, but are willing to defer to a future release. Here are the (low value) items were under consideration that we will definitely defer to a future release, here are the features you will need to drop to get feature b in this release.
The purpose of story points is to help provide forecasting and management of expectations (with appropriately limited commitments) to stakeholders, on the understanding that forecasts are approximate.
Calibrated burndown of story points is pretty much the only basis on which forecasting and management of expectations can be done in an agile process. The key is to make sure that stakeholders understand the difference between forecasting and commitment, and to make sure your development team is appropriately protected by a healthy development process.
Whether the author's claim that you get better forecasts by just counting stories, instead of summing story points... color me skeptical. I do get that it prevents some obvious abuses of process, while enabling other that are just as bad. If somebody is using story points as a developer performance metric (which they shouldn't), there's nothing that prevents them from using completed stories as a developer performance metric (which they shouldn't). The corresponding abuse of process to combat that metric would be to hyper-decompose stories.
There was a part of the article that discussed that you create tasks, not stories.
You can group the tasks into stories or milestones or iterations or epics.
In general, he’s saying you should always keep breaking down a task until it’s a 1, since 1s are easy to estimate.
The key part that may be missed is the “mob programming” or “pair programming” aspect where all the engineers on a team sit together and work through a story or milestones or epic to come up with a list one 1 point tasks.
Obviously this still can’t be done, so the only effective end solution is maximum pair/mob programming unless all tasks in an iteration are accounted for and broken down into easily understandable and estimable bits of work.
There is at least some truth to the notion that if you use mob programming, estimating becomes pointless.
My issue with this has always been that once an issue is straightforward enough to estimate as a 1 point task, you could’ve implemented the task already during the estimation process. The unknown effort is almost never in the writing code part, but figuring out the complexities around business rules & externalities.
But this doesn’t fix the process, it just moves the variability of effort & time into a different part of the process.
I have actually always had the same feeling about 1 pointers, but in this case you're talking about a collection of small tasks that make up a larger feature.
The larger feature probably couldn't have been implemented during the estimation process, but a single isolated small task could have.
Do you mean in an Agile process?
Sorry I have a visceral reaction to this, having seen teams try to accomplish large things by breaking them down into story-level tasks and then sum up the estimates, and then watch the slow motion train wreck as gaps emerge, requirements evolve, learnings accumulate, and everyone throws their hands in the air and points to the long paper trail of tickets that prove they did their job.
Scrum and story points are a reasonable way to drive local incremental improvements when you have no greater ambitions than local optimizations of an established system, but they have a low ceiling for achieving anything ambitious in a large system. At best you'll get solid utilization of engineering resources when you have a steady backlog of small requests, at worst you'll redirect engineers' attention off of the essential details that make or break difficult projects towards administrative overhead and ticket shuffling. I understand why this might be the way to go in a low trust environment, but it's really no way to live for an experienced and talented cross-functional team.
I would love to hear your view of handling ambitious changes in a large system.
(In the middle of this scenario. Mix of experienced/talented and low trust environment.)
You're right about that, but that's also one of the benefits to the approach. Inflating a point estimate is easy and there's no real audit trail for why the value is what it is.
On the other hand, if a team creates tasks like "Commit the code" or "save the file" it's pretty easy to identify as fluff.
In addition, I like the whole "Full Queues Amplify Variability" chart.
It's intuitively obvious, but I never realized it was that bad.
I'm going to have to run that down and see if it's actually backed by real data. Too many business books are complete flimflam.
Reinertsen's books are very, very good. The most recent and most up-to-date is Principles of Product Development Flow, which that chart came from. The previous one, Managing the Design Factory, is pretty similar and I think it's a better read.
Another great takeaway from it is to prioritize things by cost of delay, or even better, what he calls Weighted Shortest Job First, where you divide the cost of delay by the expected length of the task. The only problem is that the same people that want to get oddly formal and inappropriately rigorous with things like story points will want to turn cost-of-delay into an accounting exercise - or object that it's impossible because they don't have the accounting system for it - which misses the point entirely.
Yes, and WSJF gives you two units of data: priority and sequencing - because everybody's pet is priority 1. Much better than MoSCoW.
I hadn't specifically seen MoSCoW before. Some people need to have everything spelled out for them, I guess.
That is very true. I'm going to write something up about WSJF in practice in the future, but it has some real benefits to fixing that issue.
Particularly when you get the decision makers in a room and get them all to agree on the estimated value of each item. Not only does it remove the numerous priority #1's, it also gets everybody aligned on what the real priority #1 is and why.
I've seen it done where people are surveyed separately and it only works well when the people involved are forced to have a conversation to put real numbers to their assumptions, coming out with agreement. The other side effect is that it solves the squeaky wheel problem.
Every CTO needs to read that book.
The shape is about right for the variance of the M/M/1 queue (what is an M/M/1/∞ queue??), but it's a little disingenuous since variance is a higher-order moment. Standard deviation is probably more sensible to think about here, since that brings it back down to the same scale as the queue length.
https://en.wikipedia.org/wiki/M/M/1_queue#Average_number_of_...
In the cited example, going from 75% utilization -> 95% utilization drags your standard deviation from a modest 3.5 items in the queue all the way up to 19.5 items. And, in both cases, that's not far off from your average queue length, either.
Those items after your eg are part of the problem why story points don't work for you.
You really do need to control the amount of time your devs get distracted by other things. If they're solely focusing on that one project, you'll be fine.
Its why i personally appreciate XP more than plain old scrum.
This, 100%. When engineers have focus, no context switching... things get done. When they're interrupted with useless Zoom meetings every 2 hours, productivity drops off the cliff. If you have 2 meetings 1 hour apart, that 1 hour is basically useless: by the time you're in the zone, you're interrupted again.
And then when we're given a nice long stretch of interruption-free time, we ruin it by hitting HN.
You haven't discovered the secret formula: make your estimate and then mindlessly triple it.
multiplying estimates by PI has proven to be a scarily accurate rule of thumb over the course of my career.
Story points were never meant for forecasting. To do that you can use Monte Carlo analysis https://observablehq.com/@danielfrey/monte-carlo-simulation-...
Please don't
"Plans are useless, but planning is indispensable."
Eisenhower. Here's another from the great philosopher Mike Tyson: "Everybody has a plan until they get punched in the face."
Point being, have a continuously ready backlog consumed by short iterations with great telemetry - like SpaceX.
Yeah, the discussion can be really illuminating. If the points weren't recorded or mentioned afterwards I think it would be a net positive in some cases.
That's one of the main points of the approach mapped out.
You still get the discussion value, you just end up documenting things better as a practice and make a more concerted effort to add clarity where needed.
This is my experience as well. Story points in and of themselves are worthless, but as an excuse to discuss the project as a team, they can have some value.
My thing for any agile methodology is the self organizing team aspect really is the most important part, any time-like metric going up the chain is gonna be painful.
I liked story points when the team used them to get aligned on the work being done. I did not like story points when middle management had them on a chart they would review.
They're also useful in retro for the simple fact that something was written down. If your story point estimate was way off for a particular task, you can pull up the original conversation, identify what assumptions were violated or what new info was surfaced that wasn't present at the original estimation and whether there's learnings/process changes to be made.
Of course, retro is usually the first thing to go in a deluded attempt to increase velocity and story points hang on as this vestigial tail, contributing to more cargo cult software engineering.
edit: One scenario that might play out:
The team all agreed that tasks A, B & C were worth 3, 5 & 1 points respectively but Steve and Mei thought task D was worth 5 points but Carol thought it should be 13 because it involved integration of an external API where her experience was that other API integrations in the past had exposed hidden complexity.
Task D ultimately was not completed because during the process of integration, it was discovered that the API did support a key feature that was needed and instead, an in house alternative needed to be built.
It was decided that going forward, the team would instead assign a 1 point task to building a toy app for any new API integrations that would then feed into the process of deciding the story points for features requiring that integration.
That's a very healthy approach to it and deserves applause. Kudos.
The goal of mobbing around task breakdowns is to drive the communication and building that shared understanding. Just with an output of writing most of it down in a way that makes it easy to track progress and approximate the size at the same time.