I miss the college days when professors would argue endlessly about Bayesian vs. frequentist statistics.
The article is succinct and well written, and it even explains why my Bayesian professors had different approaches to research and analysis. I never knew about the third camp, pragmatic Bayes, but it definitely is in line with one professor's research, which was very thorough about probability fit and the many iterations needed to get the prior and joint PDF just right.
Andrew Gelman has a very cool talk, "Andrew Gelman - Bayes, statistics, and reproducibility (Rutgers, Foundations of Probability)", which I highly recommend to data scientists.
I’m always puzzled by this, because while I come from a country where the frequentist approach generally dominates, the fight with Bayesians basically doesn’t exist. These are just mathematical theories and tools. Just use what’s useful.
I’m still convinced that Americans tend to dislike the frequentist view because it requires a stronger background in mathematics.
I think the distaste Americans have for frequentism has much more to do with the history of science. The eugenics movement had a massive influence on science in America, and it used frequentist methods to justify (or rather validate) its scientific racism. Authors like Gould brought this up in the 1980s, particularly in relation to factor analysis and intelligence testing, and he was in a sense proven right when Herrnstein and Murray published The Bell Curve in 1994.
The p-hacking exposures of the 1990s only cemented the notion that it is very easy to get away with junk science by using frequentist methods to unjustly validate your claims.
That said, frequentist statistics is still the default in the social sciences, which, ironically, is where the damage was worst.
What is the protection against someone using a Bayesian analysis but abusing it with hidden bias?
I’m sure there are creative ways to misuse Bayesian statistics, although I think it is harder to hide your intentions while doing so. With frequentist approaches your intentions become obscured in the whole mess of computations, and at the end of it you get to claim a simple “objective” truth because the p-value shows < 0.05. In Bayesian statistics the data you put in is front and center: the chance of my theory being true given this data is greater than 95% (or was it the chance of getting this data given my theory?). In reality, most hoaxes and junk science came from bad data which didn’t get scrutinized until much too late (scrutinizing the data is what Gould did).
But I think the crux of the matter is that bad science has been demonstrated with frequentist methods and is now part of our history. So people must either find a way to fix the frequentist approaches or throw them out for something different. Bayesian statistics is that something different.
The first statement assumes that parameters (i.e. a state of nature) are random variables. That's the Bayesian approach. The second statement assumes that parameters are fixed but unknown values, not random. That's the frequentist approach.
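To make the contrast concrete, here is a minimal sketch with a coin-flip example (the counts and the Beta(1, 1) prior are invented for illustration):

    import numpy as np
    from scipy import stats

    # Invented data: 60 heads out of 100 flips.
    heads, n = 60, 100

    # Frequentist: the bias p is fixed but unknown; the interval is random.
    # 95% Wald confidence interval for p.
    p_hat = heads / n
    se = np.sqrt(p_hat * (1 - p_hat) / n)
    ci = (p_hat - 1.96 * se, p_hat + 1.96 * se)

    # Bayesian: p is a random variable; a uniform Beta(1, 1) prior gives
    # the posterior Beta(1 + heads, 1 + n - heads).
    posterior = stats.beta(1 + heads, 1 + n - heads)
    credible = tuple(posterior.ppf([0.025, 0.975]))

    print(ci)        # covers the fixed p in 95% of repeated experiments
    print(credible)  # contains p with 95% posterior probability

The two intervals come out numerically similar here, but the probability statements attached to them differ, which is exactly the distinction above.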
My knee-jerk reaction is replication, and studying a problem from multiple angles, such as experimentation and theory.
I don't think the basic assertion, that frequentist statistics is less favored in American academia, is true.
I’m not actually in any statistician circles (although I did work at a statistics startup that used Kalman filters in Reykjavík 10 years ago, and I did drop out of statistics at the University of Iceland).
But what I gathered after moving to Seattle is that Bayesian statistics is a lot more trendy (accepted, even) here west of the ocean. Frequentism is very much the default, especially in hypothesis testing, so you are not wrong. However, I’m seeing a lot more Bayesian advocacy over here than I did back in Iceland. So I’m not sure my parent is wrong either: Americans tend to dislike frequentist methods, at least more than Europeans do.
I can attest that the frequentist view is still very much the mainstream here too and fills almost every college curriculum across the United States. You may get one or two Bayesian classes if you're a stats major, but generally it's hypothesis testing, point estimates, etc.
Regardless, the idea that frequentist statistics requires a stronger background in mathematics is just flat-out silly; I'm not even sure what you mean by that.
I also thought it was silly, but maybe they mean that frequentist methods still have analytical solutions in some settings where Bayesian methods must resort to Monte Carlo methods?
Note that Bayesian methods also have analytical solutions in some settings.
There is a reason why conjugate priors were a thing.
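For instance, a normal likelihood with known variance and a normal prior on the mean yields a closed-form normal posterior, no Monte Carlo required (the numbers below are invented for illustration):

    import numpy as np

    # Invented measurements of an unknown mean.
    data = np.array([4.8, 5.1, 5.3, 4.9, 5.2])
    sigma2 = 0.25         # assumed known observation variance
    mu0, tau2 = 0.0, 1.0  # conjugate prior on the mean: N(mu0, tau2)

    # Conjugacy: normal prior x normal likelihood -> normal posterior.
    n = len(data)
    post_var = 1.0 / (1.0 / tau2 + n / sigma2)
    post_mean = post_var * (mu0 / tau2 + data.sum() / sigma2)

    print(post_mean, post_var)  # exact posterior, no sampling needed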
I don’t think mathematical ability has much to do with it.
I think it’s useful to break down the anti-Bayesians into statisticians and non-statistician scientists.
The former are mathematically savvy enough to understand Bayes but object on philosophical grounds; the latter don't care about the philosophy so much as they feel that an attack on frequentism is an attack on their previous research, and they take it personally.
This is a reasonable heuristic. I studied in a program that (for both philosophical and practical reasons) questioned whether the Bayesian formalism should be applied as widely as it is. (Which for many people is, basically everywhere.)
There are some cases, which do arise in practice, where you can’t impose a prior, and/or where the “Dutch book” arguments used to justify Bayesian decisions don’t apply.
This statement is correct only in a very basic, fundamental sense, and it disregards research practice. Let's say you're a mathematician who studies analysis or algebra. Sure, technically there is no fundamental reason for constructive logic and classical logic to "compete"; you can simply choose whichever one is useful for the problem you're solving. In fact, {constructive logic + LEM + choice axioms} is equivalent to classical math, so why not just study constructive math, since it's a higher level of abstraction, and add those axioms "later" when you have a particular application?
In reality, on a human level, it doesn't work like that. When you have disagreements on the very foundations of your field, both camps can agree that each other's results follow, but the fact that the results (and thus terminology) are incompatible makes it too difficult to research both at the same time. Practically speaking, you need to be familiar with both, but definitely specialize in one. This creates hubs of different sorts of math/stats/CS departments, etc.
If you're, for example, working on constructive analysis, you'll have to spend a tremendous amount of energy on understanding contemporary techniques like localization just to work around a basic logical axiom, which is likely irrelevant to a lot of applications. Really, this is like trying to understand the mathematical properties of binary arithmetic (Z/2Z) while day-to-day studying group theory in general. Sure, Z/2Z is a group, but really you're interested in a single, tiny, finite abelian group, and now you need to do a whole bunch of work on non-abelian groups, infinite groups, non-cyclic groups, etc., just to ignore all those facts.
I would follow, but neither Bayesian nor frequentist probability is rocket science.
I’m not following your example about binary and group theory either. Nobody looks at the properties of binary arithmetic and stops there. If you are interested in number theory, group theory will be a useful part of your toolbox for sure.
It's because practitioners of one camp say that the other camp is wrong and question each other's methodologies. And in academia, questioning someone's methodology is akin to calling them dumb.
To understand both camps, I summarize them like this:
Frequentist statistics has very sound theory but is misapplied through many heuristics, rules of thumb, and prepared tables. It's very easy to use any method and hack the p-value away to get statistically significant results.
Bayesian statistics has an interesting premise and inference methods, but until the recent advancements in computing power, it was near impossible to run the simulations needed to validate the complex distributions used, the goodness of fit, and so on. And even in the current year, some Bayesian statisticians don't question their priors and iterate on their research.
I recommend using both methods whenever convenient, as fits the problem at hand.
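As one concrete way of questioning and iterating on priors, here is a minimal prior predictive check; the model and all numbers are invented for illustration:

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical model: heights ~ Normal(mu, sigma), with priors on mu, sigma.
    mu = rng.normal(170, 20, size=10_000)           # prior: mu ~ N(170, 20) cm
    sigma = np.abs(rng.normal(0, 50, size=10_000))  # prior: sigma ~ HalfNormal(50)

    # Prior predictive: data implied by the priors alone, before any real data.
    heights = rng.normal(mu, sigma)

    # If the priors imply absurd observations (negative or 3-meter heights),
    # revise the priors and repeat before fitting.
    print((heights < 0).mean(), (heights > 250).mean())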
I'd suggest you read "The Book of Why"[1]. It is mostly about Judea Pearl's next creation, causality, but he also covers the Bayesian approach, the history of statistics, his motivation behind Bayesian statistics, and some success stories.
Reading the book will serve you much better than applying "Hanlon's Razor"[2] because you see no other explanation.
[1] https://en.wikipedia.org/wiki/The_Book_of_Why
[2] https://en.wikipedia.org/wiki/Hanlon's_razor
The opposite is true. Bayesian approaches require more mathematics. The Bayesian approach is perhaps more similar to PDEs, where problems are so difficult that the only way we can currently solve them is with numerical methods.
Regarding the frequentist vs. Bayesian debates, my slightly provocative take on these three cultures is:
- subjective Bayes is the strawman that frequentist academics like to attack
- objective Bayes is a naive self-image that many Bayesian academics tend to possess
- pragmatic Bayes is the approach taken by practitioners that actually apply statistics to something (or in Gelman’s terms, do science)
I see, so academics are frequentists (attackers) or objective Bayes (naive), and the people Doing Science are pragmatic (correct).
The article gave me the same vibe, nice, short set of labels for me to apply as a heuristic.
I never really understood this particular war, I'm a simpleton, A in Stats 101, that's it. I guess I need to bone up on Wikipedia to understand what's going on here more.
Bayes lets you use your priors, which can be very helpful.
I got all riled up when I saw you wrote "correct", I can't really explain why... but I just feel that we need to keep an open mind. These approaches to data are choices at the end of the day... Was Einstein a Bayesian? (spoiler: no)
Using your priors is another way of saying you know something about the problem. It is exceedingly difficult to objectively analyze a dataset without interjecting any bias. There are too many decision points where something needs to be done to massage the data into shape. Priors are just an explicit encoding of some of that knowledge.
A classic example is analyzing data on mind reading or ghost detection. Your experiment shows you that your ghost detector has detected a haunting with p < .001. What is the probability the house is haunted?
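A back-of-the-envelope Bayes computation makes the point; the prior and error rates below are invented for illustration:

    # Skeptical prior: hauntings are essentially nonexistent.
    prior = 1e-9  # invented P(haunted)

    # Read p < .001 loosely as a 0.1% false-alarm rate, and generously
    # assume the detector always fires on a real haunting.
    p_alarm_given_haunted = 1.0
    p_alarm_given_not = 0.001

    # Bayes' rule: P(haunted | alarm).
    p_alarm = p_alarm_given_haunted * prior + p_alarm_given_not * (1 - prior)
    posterior = p_alarm_given_haunted * prior / p_alarm

    print(posterior)  # ~1e-6: almost certainly still not haunted

Even a highly "significant" detection barely moves a sufficiently skeptical prior.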
With a prior like that, why would you even bother pretending to do the research?
The fact that you are designing an experiment and not trusting it is bonkers. The experiment concludes that the house is haunted, but you had already decided it wasn't before running the experiment.
You're absolutely right; I'm trying to walk a delicate tightrope that doesn't end with me giving my unfiltered "you're wrong, so let's end the conversation" response.
Me 6 months ago would have written: "this comment is unhelpful and boring, but honestly, that's slightly unfair to you, as it just made me realize how little help the article is, and it set the tone. is this even a real argument with sides?"
For people who want to improve on this aspect of themselves, like I did for years:
- Show, don't tell (e.g., here I made the oddities more explicit, enough that people could reply to me spelling out what I shouldn't have done).
- Don't assert anything that wasn't said directly; e.g., don't remark on the commenter, or on subjective qualities you assess in the comment.
Frequentist and Bayesian approaches are both correct if the research and methodology have scientific rigor. Both can be wrong if the research is whack or sloppy.
I've used both in some papers and report two results (why not?). The golden rule in my mind is to fully describe your process and assumptions, then let the reader decide.
I understand the war between Bayesians and frequentists. Frequentist methods have been misused for over a century now to justify all sorts of pseudoscience and hoaxes (and have produced a fair share of honest mistakes), so it is understandable that people would come forward and claim there must be a better way.
What I don’t understand is the war between naive Bayes and pragmatic Bayes. If it is real, it seems like an extension of philosophers vs. engineers. Scientists should see value in both. Naive Bayes is important to the philosophy of science, without which a lot of junk science would go unscrutinized for far too long, and engineers should be able to see the value of philosophers saving them work by debunking wrong science before they start to implement theories which simply will not work in practice.
Academics can be pragmatic; I've known ones who've used both Bayesian statistics and MLE.
A few things I wish I knew when I took statistics courses at university some 25 or so years ago:
- Statistical significance testing and hypothesis testing are two completely different approaches, with different philosophies behind them, developed by different groups of people. They kinda do the same thing, but not quite, and textbooks tend to blur the distinction completely.
- The above approaches were developed in the early 1900s in the context of farms and breweries where 3 things were true - 1) data was extremely limited, often there were only 5 or 6 data points available, 2) there were no electronic computers, so computation was limited to pen and paper and slide rules, and 3) the cost in terms of time and money of running experiments (e.g., planting a crop differently and waiting for harvest) were enormous.
- The majority of classical statistics was focused on two simple questions: 1) what can I reliably say about a population based on a sample taken from it, and 2) what can I reliably say about the differences between two populations based on the samples taken from each? That's it. An enormous mathematical apparatus was built around answering those two questions in the context of the limitations in point #2.
My understanding is that frequentist statistics was developed in response to the Bayesian methodology which was prevalent in the 1800s and which was starting to be perceived as having important flaws. The idea that the invention of Bayesian statistics made frequentist statistics obsolete doesn't quite agree with the historical facts.
That was a nice summary.
The data-poor and computation-poor context of old school statistics definitely biased the methods towards the "recipe" approach scientists are supposed to follow mechanically, where each recipe is some predefined sequence of steps, justified based on an analytical approximations to a sampling distribution (given lots of assumptions).
In modern computation-rich days, we can get away from the recipes by using resampling methods (e.g. permutation tests and bootstrap), so we don't need the analytical approximation formulas anymore.
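For example, a permutation test for a difference in means needs no analytical formula, just relabeling and recomputing; the two samples below are invented:

    import numpy as np

    rng = np.random.default_rng(42)

    # Invented measurements from two groups.
    a = np.array([5.1, 4.9, 6.2, 5.8, 5.5])
    b = np.array([4.2, 4.8, 4.5, 5.0, 4.1])

    observed = a.mean() - b.mean()
    pooled = np.concatenate([a, b])

    # Permutation test: shuffle group labels, recompute the statistic.
    n_perm = 10_000
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        count += (pooled[:len(a)].mean() - pooled[len(a):].mean()) >= observed

    p_value = count / n_perm  # one-sided p-value, no normality assumption
    print(observed, p_value)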
I think there is still room for small-sample methods, though... it's not like the biological and social sciences are dealing with very large samples.
I don’t get what all the hate for subjective Bayesianism is about. It seems the most philosophically defensible approach, in that all it assumes is our own subjective judgments of likelihood, the idea that we can quantify them (however inexactly), and the idea that we want to be consistent (most people do), which is what avoiding Dutch books amounts to.
Whereas, objective Bayes is basically subjective Bayes from the viewpoint of an idealised perfectly rational agent - and “perfectly rational” seems philosophically a lot more expensive than anything subjective Bayes relies on.
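The Dutch book argument is very concrete: if your degrees of belief are incoherent, a bookie can sell you bets at your own stated prices and guarantee your loss. A minimal sketch, with invented beliefs:

    # Incoherent beliefs: P(rain) = 0.6 and P(no rain) = 0.6 (they sum to 1.2).
    # A bet on an event at personal price p costs p and pays 1 if it happens.
    p_rain, p_no_rain = 0.6, 0.6

    # You buy both bets at your own prices.
    # If it rains: the rain bet nets (1 - 0.6), the no-rain bet loses 0.6.
    payoff_if_rain = (1 - p_rain) - p_no_rain
    # If it's dry: the rain bet loses 0.6, the no-rain bet nets (1 - 0.6).
    payoff_if_dry = -p_rain + (1 - p_no_rain)

    print(payoff_if_rain, payoff_if_dry)  # -0.2 and -0.2: a sure loss either way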
Link to talk: https://youtu.be/xgUBdi2wcDI
Thank you.
In fact, the whole talk series (https://foundationsofprobabilityseminar.com/) and channel (https://www.youtube.com/@foundationsofprobabilitypa2408/vide...) seem interesting.
Funnily enough, I also recently heard about fiducial statistics as a third camp, in an intriguing podcast: episode 581 of Super Data Science, with the editor-in-chief of Harvard Business Review.