Stepping back, the high-order bit here is that an ML method is beating physically based methods at accurately predicting the world.
What happens when the best methods for computational fluid dynamics, molecular dynamics, nuclear physics are all uninterpretable ML models? Does this decouple progress from our current understanding of the scientific process - moving to better and better models of the world without human-interpretable theories and mathematical models / explanations? Is that even iteratively sustainable in the way that scientific progress has proven to be?
Interesting times ahead.
If you're a scientist who works in protein folding (or one of those other areas) and strongly believe that science's goal is to produce falsifiable hypotheses, these new approaches will be extremely depressing, especially if you aren't proficient enough with ML to reproduce this work in your own hands.
If you're a scientist who accepts that probabilistic models beat interpretable ones (articulated well here: https://norvig.com/chomsky.html), then you'll be quite happy, because this is yet another validation of the value of statistical approaches in moving our ability to predict the universe forward.
If you're the sort of person who believes that human brains are capable of understanding the "why" of how things work in all its true detail, you'll find this an interesting challenge: can we actually interpret these models, or are human brains too feeble to understand complex systems without sophisticated models?
If you're the sort of person who likes simple models with as few parameters as possible, you're probably excited because developing more comprehensible or interpretable models that have equivalent predictive ability is a very attractive research subject.
(FWIW, I'm in the camp of "we should simultaneously seek simpler, more interpretable models, while also seeking to improve native human intelligence using computational augmentation")
The goal of science has always been to discover underlying principles and not merely to predict the outcome of experiments. I don't see any way to classify an opaque ML model as a scientific artifact since by definition it can't reveal the underlying principles. Maybe one could claim the ML model itself is the scientist and everyone else is just feeding it data. I doubt human scientists would be comfortable with that, but if they aren't trying to explain anything, what are they even doing?
What if the underlying principles of the universe are too complex for human understanding but we can train a model that very closely follows them?
Then we should dedicate large fractions of human engineering towards finding ethical ways to improve human intelligence so that we can appreciate the underlying principles better.
That sounds like useful engineering, but not useful science.
The ML model can also be an emulator of parts of the system that you don't want to personally understand, to help you get on with focusing on what you do want to figure out. Alternatively, the ML model can pretend to be the real world while you do experiments with it to figure out aspects of nature in minutes rather than hours-days of biological turnaround.
That's the aspirational goal. And I would say that it's a bit of an inflexible one. For example, if we had an ML model that could generate molecules that cure diseases and pass FDA approval, I wouldn't really care if scientists couldn't explain the underlying principles. But I'm an ex-scientist who is now an engineer, because I care more about tools that produce useful predictions than about understanding underlying principles. I used to think that in principle we could identify all the laws of the universe, simulate them with enough accuracy, inspect the results, and gain enlightenment, but over time I've concluded that's mostly a way to waste lots of time, money, and resources.
What if our understanding of the laws of the natural sciences are subtly flawed and AI just corrects perfectly for our flawed understanding without telling us what the error in our theory was?
Forget trying to understand dark matter. Just use this model to correct for how the universe works. What is actually wrong with our current model and if dark matter exists or not or something else is causing things doesn't matter. "Shut up and calculate" becomes "Shut up and do inference."
ML is comfortable with the idea that all models are bad, and there are ways to test how good or bad they are. It's all approximations and imperfect representations, but they can be good enough for some applications.
If you think about it carefully, humans operate in the same regime. Our concepts are all like that - imperfect, approximate, glossing over some details. Our fundamental grounding and test is survival, an unforgiving filter, yet lax enough to allow for anti-vaxxer movements during the pandemic - the survival test doesn't test for truth directly, only against ideas that fail to support life.
All models are wrong, but some models are useful.
High accuracy could result from pretty incorrect models. When and where that would then go completely off the rails is difficult to say.
I'm in the following camp: it is wrong to think about the world or the models as "complex systems" that may or may not be understood by human intelligence. There is no meaning beyond that which is created by humans. There is no 'truth' that we can grasp in parts but not entirely. Being unable to understand these complex systems means that we have framed them in a way (e.g., millions of matrix operations) that does not suit our symbol-based, causal reasoning mode. That is on us, not on our capabilities or the universe.
All our theories are built on observation, so these empirical models yielding such useful results is a great thing - it satisfies the need for observing and acting. Missing explainability of the models merely means we have less ability to act more precisely - but it does not devalue our ability to act coarsely.
But the human brain has limited working memory and experience. Even in software development we are often teetering at the edge of the mental power to grasp and relate ideas. We have tried so much to manage complexity, but real world complexity doesn't care about human capabilities. So there might be high dimensional problems where we simply can't use our brains directly.
A human mind is perfectly capable of following the same instructions as the computer did. Computers are stupidly simple and completely deterministic.
The concern is about "holding it all in your head", and depending on your preferred level of abstraction, "all" can perfectly reasonably be held in your head. For example: "This program generates the most likely outputs" makes perfect sense to me, even if I don't understand some of the code. I understand the system. Programmers went through this decades ago. Physicists had to do it too. Now, chemists I suppose.
If anyone actually thought this way -- no one does -- they definitely wouldn't build models like this.
I don't quite understand this point — could you elaborate?
My understanding is that the ML model produces a hypothesis, which can then be tested via normal scientific method (perform experiment, observe results).
If we have a magic oracle that says "try this, it will work", and then we try it, and it works, we still got something falsifiable out of it.
Or is your point that we won't necessarily have a coherent/elegant explanation for why it works?
There is an issue scientifically. I think this point was expressed by Feynman: the goal of scientific theories isn’t just to make better predictions, it’s to inform us about how and why the world works. Many ancient civilizations could accurately predict the position of celestial bodies with calendars derived from observations of their period, but it wasn’t until Copernicus proposed the heliocentric model and Galileo provided supporting observations that we understood the why and how, and that really matters for future progress and understanding.
People will be depressed because they spent decades getting into professorship positions and publishing papers with ostensibly comprehensible interpretations of the generative processes that produced their observations, only to be "beat" at the game by a system that processed a lot of observations and can make predictions in a way that no individual human could comprehend. And those professors will have a harder time publishing, and therefore getting promoted, in the future.
Whether ML models produce hypotheses is something of an epistemological argument that I think muddies the waters without bringing any light. I would only say "ML models generate predictions." In a sense, the model itself is the hypothesis, not any individual prediction.
There have been times in the past when usable technology surpassed our scientific understanding, and instead of being depressing it provided a map for scientific exploration. For example, the steam engine was developed by engineers in the 1600s/1700s (Savery, Newcomen, and others) but thermodynamics wasn’t developed by scientists until the 1800s (Carnot, Rankine, and others).
I think the various contributors to the invention of the steam engine had a good idea of what they were trying to do and how their idea would physically work. Wikipedia lists the prerequisites as the concepts of a vacuum and pressure, methods for creating a vacuum and generating steam, and the piston and cylinder.
> If you're the sort of person who believes that human brains are capable of understanding the "why" of how things work in all its true detail
This seems to me an empirical question about the world. It’s clear our minds are limited, and we understand complex phenomena through abstraction. So either we discover we can continue converting advanced models to simpler abstractions we can understand, or that’s impossible. Either way, it’s something we’ll find out and will have to live with in the coming decades. If it turns out further abstractions aren’t possible, well, enlightenment thought had lasted long enough. It’s exciting to live at a time in humanity’s history when we enter a totally uncharted new paradigm.
It means we now have an accurate surrogate model or "digital twin" that can be experimented on almost instantaneously. So we can massively accelerate the traditional process of developing mechanistic understanding through experiment, while also immediately being able to benefit from accurate predictions, even without understanding.
In reality, science has already pretty much gone this way long ago, even if people don't like to admit it. Simple, reductionist explanations for complex phenomena in living systems don't really exist. Virtually all of medicine nowadays is empirical: try something, and if you can prove it's safe and effective, you keep doing it. We almost never have a meaningful explanation for how it really works, and when we think we do, it gets proven wrong repeatedly, while the treatment keeps working as always.
instead of "in mice", we'll be able to say "in the cloud"
In vivo in humans in the cloud
One of the companies I worked for, "insitro", is specifically named to mean the combination of "in vivo, in vitro, in silico".
"In nimbo" (though what people actually say is "in silico").
Medicine can be explained fairly simply, and why it works the way it does can be explained like this:
Imagine a very large room that has every surface covered by on-off switches.
We cannot see inside of this room. We cannot see the switches. We cannot fit inside of this room, but a toddler fits through the tiny opening leading into the room. The toddler cannot reach the switches, so we equip the toddler with a pole that can flip the switches. We train the toddler, as much as possible, to flip a switch using the pole.
Then, we send the toddler into the room and ask the toddler to flip the switch or switches we desire to be flipped, and then do tests on the wires coming out of the room to see if the switches were flipped correctly. We also devise some tests for other wires to see if that naughty toddler flipped other switches on or off.
We cannot see inside the room. We cannot monitor the toddler. We can't know what _exactly_ the toddler did inside the room.
That room is the human body. The toddler with a pole is a medication.
We can't see or know enough to determine what was activated or deactivated. We can invent tests to narrow the scope of what was done, but the tests can never be 100% accurate because we can't test for every effect possible.
We introduce chemicals, then we hope and pray that the chemicals only turned on or off the things we wanted turned on or off. We craft some qualification tests as proof, and do a 'long-term' study to determine whether other things were turned on or off, a short circuit occurred, or we broke something.
I sincerely hope that even without human understanding, our AI models can determine what switches are present, which ones are on and off, and how best to go about selecting for the correct result.
Right now, modern medicine is almost a complete crap-shoot. Hopefully modern AI utilities can remedy the gambling aspect of medicine discovery and use.
It depends whether the value of science is human understanding or pure prediction. In some realms (for drug discovery, and other situations where we just need an answer and know what works and what doesn’t), pure prediction is all we really need. But if we could build an uninterpretable machine learning model that beats any hand-built traditional ‘physics’ model, would it really be physics?
Maybe there’ll be an intermediate era for a while where ML models outperform traditional analytical science, but then eventually we’ll still be able to find the (hopefully limited in number) principles from which it can all be derived. I don’t think we’ll ever find that Occam’s razor is no use to us.
The success of these ML models has me wondering if this is what Quantum Mechanics is. QM is notoriously difficult to interpret yet makes amazing predictions. Maybe wave functions are just really good at predicting system behavior but don't reflect the underlying way things work.
OTOH, Newtonian mechanics is great at predicting things under certain circumstances yet, in the same way, doesn't necessarily reflect the underlying mechanism of the system.
So maybe philosophers will eventually tell us the distinction we are trying to draw, although intuitive, isn't real
That’s what thermodynamics is - we initially only had laws about energy/heat flow, and only later we figured out how statistical particle movements cause these effects.
At that point I wonder if it would be possible to feed that uninterpretable model back into another model that makes sense of it all and outputs sets of equations that humans could understand.
Pure prediction is only all we need if the total end-to-end process is predicted correctly - otherwise there could be pretty nasty traps (e.g., drug works perfectly for the target disease but does something unexpected elsewhere etc.).
It makes me think about how Einstein was famous for making falsifiable real-world predictions to accompany his theoretical work. And sometimes it took years for proper experiments to be run (such as measuring a solar eclipse during the outbreak of a world war).
Perhaps the opportunity here is to provide a quicker feedback loop for theory about predictions in the real world. Almost like unit tests.
Agreed. At the very least, models of this nature let us iterate/filter our theories a little bit more quickly.
The model isn't reality. A theory that disagrees with the model but agrees with reality shouldn't be filtered, but in this process it will be.
Or jumping the gap entirely to move towards more self-driven reinforcement learning.
Could one structure the training setup to be able to design its own experiments, make predictions, collect data, compare results, and adjust weights...? If that loop could be closed, then it feels like that would be a very powerful jump indeed.
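As a toy sketch of what closing that loop might look like (the "experiment", the bootstrap-ensemble surrogate, and the pick-the-most-uncertain-point query rule are all invented for illustration, nothing from the paper):

    import numpy as np

    rng = np.random.default_rng(0)

    def run_experiment(x):
        # Stand-in for a real-world experiment: hidden ground truth plus noise.
        return np.sin(3 * x) + 0.1 * rng.normal(size=np.shape(x))

    def fit_ensemble(X, y, n_models=10, degree=5):
        # Bootstrap ensemble of polynomial fits; the spread across members
        # is a crude stand-in for model uncertainty.
        models = []
        for _ in range(n_models):
            idx = rng.integers(0, len(X), len(X))
            models.append(np.polyfit(X[idx], y[idx], degree))
        return models

    def predict(models, x):
        preds = np.stack([np.polyval(m, x) for m in models])
        return preds.mean(axis=0), preds.std(axis=0)

    # Closed loop: propose the experiment where the model is least certain,
    # run it, retrain on the enlarged dataset, repeat.
    X = rng.uniform(-1, 1, 12)            # small seed dataset
    y = run_experiment(X)
    candidates = np.linspace(-1, 1, 200)  # experiments we *could* run

    for round_ in range(5):
        models = fit_ensemble(X, y)
        mean, std = predict(models, candidates)
        x_next = candidates[np.argmax(std)]   # design the next experiment
        y_next = run_experiment(x_next)       # "collect data"
        X, y = np.append(X, x_next), np.append(y, y_next)
        print(f"round {round_}: queried x={x_next:+.2f}, max std={std.max():.3f}")

The hard part in the real setting is of course the middle step: "run_experiment" is a wet lab with costs, delays, and noise, not a one-line function.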
In the area of LLMs, the SPAG paper from last week was very interesting on this topic, and I'm very interested in seeing how this can be expanded to other areas:
https://github.com/Linear95/SPAG
Thank God! As a person who uses my brain, I think I can say, pretty definitively, that people are bad at understanding things.
If this actually pans out, it means we will have harnessed knowledge/truth as a fundamental force, like fire or electricity. The "black box" as a building block.
This type of thing is called an "oracle".
We've had stuff like this for a long time.
Notable examples:
- Temple priestesses
- Tea-leaf reading
- Water scrying
- Palmistry
- Clairvoyance
- Feng shui
- Astrology
The only difference is, the ML model is really quite good at it.
That's the crux of it: we've had theories of physics and chemistry since before writing was invented.
None of that mattered until we came upon the ones that actually work.
The most moneyed and well-coordinated organizations have honed a large hammer, and they are going to use it for everything. So for future big findings in the areas you mention, probabilistically inclined models coming from ML will almost certainly be the new gold standard.
And yet the only thing that can save us from ML will be ML itself, because ML has the best chance of extracting patterns from these black-box models and turning them into human-interpretable ones. I hope we do dedicate explicit effort to this endeavor, so that human knowledge keeps advancing and expanding in tandem with human ingenuity, with computers assisting us.
Spoiler: "Interpretable ML" will optimize for output that either looks plausible to humans, reinforces our preconceptions, or appeals to our aesthetic instincts. It will not converge with reality.
That is not considered interpretable then, and I think most people working in the field are aware of this gotcha.
IIRC, when the EU required banks to have interpretable rules for loans, a plain explanation was not considered enough. What was required was a clear process used from the beginning - i.e., you can use AI to develop an algorithm that makes the decision, but you can't use AI to make the decision and explain the reasons afterwards.
My argument is: weather.
I think it is fine, and better for society, to have applications and models for things we don't fully understand... We can model lots of small aspects of weather, and we have a lot of factors nailed down, but not necessarily all the interactions... and not all of the factors. (Another example for the same reason: gravity.)
Used responsibly, of course. I wouldn't think an AI model designing an airplane whose workings no engineer understands is a good idea :-)
And presumably all of this is followed by people trying to understand the results (expanding potential research areas)
It would be cool to see an airplane made using generative design.
How about spaceship parts? https://www.nasa.gov/technology/goddard-tech/nasa-turns-to-a...
I wonder if ML can someday be employed in deciphering such black box problems; a second model that can look under the hood at all the number crunching performed by the predictive model, identify the pattern that resulted in a prediction, and present it in a way we can understand.
That said, I don’t even know if ML is good at finding patterns in data.
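For what it's worth, one concrete form of "a second model looking under the hood" that people use today is a simple probe trained on the black box's internal activations. A toy sketch (the "black box" and the probed property are made up for illustration):

    import numpy as np

    rng = np.random.default_rng(1)

    # A stand-in "black box": a random two-layer network we pretend we can't read.
    W1, W2 = rng.normal(size=(16, 4)), rng.normal(size=(1, 16))
    def black_box(x):
        h = np.tanh(W1 @ x)        # hidden activations
        return W2 @ h, h

    # Record hidden activations plus a human-meaningful property of the input
    # (here: its norm) that we suspect the model internally encodes.
    X = rng.normal(size=(500, 4))
    H = np.stack([black_box(x)[1] for x in X])   # (500, 16) activations
    prop = np.linalg.norm(X, axis=1)             # property we probe for

    # Linear probe: ordinary least squares from activations to the property.
    coef, *_ = np.linalg.lstsq(H, prop, rcond=None)
    pred = H @ coef
    r2 = 1 - np.sum((prop - pred) ** 2) / np.sum((prop - prop.mean()) ** 2)
    print(f"probe R^2: {r2:.2f}   # high R^2 => property is linearly readable")

That only tells you a concept is recoverable from the activations, not how the model uses it, but it's a start.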
That's the only thing ML does.
I mean, it's just faster, no? I don't think anyone is claiming it's a more _accurate_ model of the universe.
Collision libraries and fluid libraries have had baked-in memorized look-up tables that were generated with ML methods nearly a decade ago.
World is still here, although the Matrix/metaverse is becoming more attractive daily.
We should be thankful that we live in a universe that obeys math simple enough for us to comprehend, so that we were able to reach this level at all.
Imagine if optics were complex enough that it required an ML model to predict anything.
We'd be in permanent stone age without a way out.
What would a universe look like that lacked simple things, and somehow only complex things existed?
It makes me think of rings like Z[√-5], which have irreducible elements that aren't prime, so some things cannot be uniquely expressed as a combination of smaller things.
As a steelman, wouldn't the abundance of infinitely generate-able situations make it _easier_ for us to develop strong theories and models? The bottleneck has always been data. You have to do expensive work in the real world and accurately measure it before you can start fitting lines to it. If we were to birth an e.g. atomically accurate ML model of quantum physics, I bet it wouldn't take long until we have mathematical theories that explain why it works. Our current problem is that this stuff is super hard to manipulate and measure.
Maybe; AI chess engines have improved human understanding of the game very rapidly, even though humans cannot beat engines.
I would assume that given enough hints from AI and if it is deemed important enough humans will come in to figure out the “first principles” required to arrive at the conclusion.
I believe this is the case too. With a sufficiently well-performing AI/ML/probabilistic model, where you can change the input parameters and get a highly accurate prediction basically instantly, we can test theories approximately and extremely fast, rather than running completely new experiments, which always come with their own set of errors and problems.
I asked a friend of mine who is chemistry professor at a large research university something along these lines a while ago. He said that so far these models don't work well in regions where either theory or data is scarce, which is where most progress happens. So he felt that until they can start making progress in those areas it won't change things much.
Major breakthroughs happen when clear connections can be made and engineered between the many bits of solved but obscured solutions.
For me the big question is how we confidently validate the output of these models.
It's the right question to ask, and the answer is that we will still have to confirm them by experimental structure determination.
Many of our existing physical models can be decomposed into a "high-confidence, well-tested bit" plus a "hand-wavy, empirically fitted bit". I'd like to see progress via ML replacing the empirical part - the real scientific advancement then becomes steadily shrinking that contribution by improving the robust physical model incrementally. Computational performance is another big influence, though. Replacing the whole of a simulation with an ML model might still make sense if the model training is transferable and we can take advantage of the GPU speed-ups, which might not be so easy to apply to the foundational physical-model solution. Whether your model needs to be verified against real physical models depends on the seriousness of your use case; for nuclear weapons, aerospace, and weather forecasting I imagine it will remain essential, while for a lot of consumer-facing things the ML will be good enough.
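As a toy sketch of that decomposition (everything here is invented for illustration: the "trusted physics" is gravity-only free fall, the "hand-wavy bit" is an unknown drag term, and a small polynomial fit stands in for the ML correction):

    import numpy as np

    rng = np.random.default_rng(2)

    # "True" system: falling object with drag (drag coefficient unknown to us).
    def true_accel(v):
        g, k = -9.81, 0.4
        return g - k * v * np.abs(v)

    # High-confidence physical model: gravity only.
    def physics_accel(v):
        return -9.81

    # Noisy observations of acceleration at various velocities.
    v_obs = rng.uniform(-10, 10, 200)
    a_obs = true_accel(v_obs) + 0.05 * rng.normal(size=200)

    # Learn only the residual (observation minus trusted physics).
    residual = a_obs - physics_accel(v_obs)
    corr = np.polyfit(v_obs, residual, deg=3)

    def hybrid_accel(v):
        return physics_accel(v) + np.polyval(corr, v)

    for v in np.linspace(-10, 10, 5):
        print(f"v={v:+5.1f}  physics={physics_accel(v):+6.2f}  "
              f"hybrid={hybrid_accel(v):+6.2f}  true={true_accel(v):+6.2f}")

The appeal of this shape is exactly what you describe: the learned part is small, scoped, and replaceable as the physical model improves.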
Physics-informed machine learning is a whole (nascent) subfield that is very much in line with this thinking. Steve Brunton has some good stuff about this on YouTube.
This is a topic in the epistemology of science, discussed in books such as "New Directions in the Philosophy of Mathematics" [1], and it came up before with problems such as the four color theorem [2], where AI was not involved.
Going back to the uninterpretable ML models in the context of AlphaFold 3, I think one method for trying to explain the findings is similar to the experimental methods of physics with respect to reality: you perform experiments on the reality (in this case, AlphaFold 3) to come up with sound conclusions. AI/ML is an interesting black-box system.
There are other open discussions on this topic. For example, can our human brains absorb that knowledge, or are they somehow limited by the scientific language that we have now?
[1] https://www.google.com.ar/books/edition/New_Directions_in_th...
[2] https://en.wikipedia.org/wiki/Four_color_theorem
No, science doesn't work that way. You can't just calculate your way to scientific discoveries; you have to test them in the real world. Learning, both in humans and AI, is based on the signals provided by the environment. There are plenty of things not written anywhere, so models can't simply train on human text to discover new things. They have to learn directly from the environment to do that, like AlphaZero did when it beat humans at Go.
You are conflating the whole scientific endeavor with one very specific problem, for which this specific approach is effective at producing results that fit the observable world. This has nothing to do with science as a whole.
In case it's not clear, this does not "beat" experimental structure determination. The matches to experiment are pretty close, but they will be closer in some cases than others and may or may not be close enough to answer a given question about the biochemistry. It certainly doesn't give much information about the dynamics or chemical perturbations that might be relevant in biological context. That's not to pooh-pooh alphafold's utility, just that it's a long way from making experimental structure determination unnecessary, and much much further away from replacing a carefully chosen scientific question and careful experimental design.
A few things:
1. Research can then focus on where things go wrong
2. ML models, despite being "black boxes," can still be assessed by brute force: sweep the input parameter space and compare behavior in regions covered and not covered by the training data (see the sketch after this list)
3. We tend to assume parsimony (i.e., Occam's razor) and give preference to simpler models when all else is equal. When more complex black-box models win on prediction, that tells us the actual causal pathway may be more complex than simple models allow. This is okay too. We'll get it figured out. Not everything is closed-form, especially considering that quantum effects may produce statistical/expected outcomes instead of deterministic ones.
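A toy sketch of point 2 (the black-box model, the training distribution, and the coverage threshold are all invented for illustration):

    import numpy as np

    rng = np.random.default_rng(3)

    # Stand-in black-box model: we can only call predict(), not inspect it.
    W = rng.normal(size=(8, 2))
    def predict(x):
        return float(np.tanh(W @ x).sum())

    # Training inputs the model actually saw (here: clustered in one corner).
    X_train = rng.normal(loc=[1.0, 1.0], scale=0.3, size=(300, 2))

    # Brute-force sweep of a 2-D parameter grid: record the prediction and
    # how far each grid point is from any training example (coverage proxy).
    grid = np.stack(np.meshgrid(np.linspace(-2, 2, 41),
                                np.linspace(-2, 2, 41)), axis=-1).reshape(-1, 2)
    preds = np.array([predict(x) for x in grid])
    dist_to_train = np.min(
        np.linalg.norm(grid[:, None, :] - X_train[None, :, :], axis=-1), axis=1)

    covered = dist_to_train < 0.5
    print(f"grid points near training data: {covered.mean():.0%}")
    print(f"prediction range (covered):   [{preds[covered].min():+.2f}, {preds[covered].max():+.2f}]")
    print(f"prediction range (uncovered): [{preds[~covered].min():+.2f}, {preds[~covered].max():+.2f}]")

The point is just that you can map where the model is interpolating versus extrapolating, even without opening the box; in high dimensions you'd sample rather than grid, but the idea is the same.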
I suspect that ML will be state-of-the-art at generating human-interpretable theories as well. Just a matter of time.
There will be an iterative process built around curated training datasets: continually improved, top-tier models; teams reverse-engineering the models' understanding and reasoning; and applying that to improve the datasets and training.
The ML models will help us understand that :)
Reminds me of the novel Blindsight - in it there are special individuals who work as synthesists, whose job it is to observe and understand, and then somehow translate the seemingly undecipherable actions/decisions of advanced computers and augmented humans back to the "lay person".
Our metaphors and intuitions were already crumbling and stagnating. See quantum physics: sometimes a particle, sometimes a wave, and what constitutes a measurement anyway?
I’ll take prediction over understanding if that’s the best our brains can do. We’ve evolved to deal with a few orders of magnitude around a meter and a second. Maybe dealing with light-years and femtometer/seconds is too much to ask.
A new-ish field of "mechanistic interpretability" is trying to poke at weights and activations and find human-interpretable ideas w/in them. Making lots of progress lately, and there are some folks trying to apply ideas from the field to Alphafold 2. There are hopes of learning the ideas about biology/molecular interactions that the model has "discovered".
Perhaps we're in an early stage of Ted Chiang's story "The Evolution of Human Science", where AIs have largely taken over scientific research and a field of "meta-science" developed where humans translate AI research into more human-interpretable artifacts.
It's interesting to compare this situation to earlier eras in science. Newton, for example, gave us equations that were very accurate but left us with no understanding at all of why they were accurate.
It seems like we're repeating that here, albeit with wildly different methods. We're getting better models but by giving up on the possibility of actually understanding things from first principles.
Is AlphaFold doing model generation, or is it just reducing a massive state space?
The current computational and systems biochemistry approaches struggle to model large biomolecules and their interactions due to the large degrees of freedom of the models.
I think it is reasonable to rely on statistical methods to lead researchers down paths that have a high likelihood of being correct versus brute forcing the chemical kinetics.
After all, chemistry is inherently stochastic…
Science has always given us better, but error-prone, tooling to see further and make better guesses. There is still a scientific test: in a clinical trial, is this new drug safe and effective?
Some machine learning models might be more interpretable than others. I think the recent "KAN" (Kolmogorov-Arnold Network) model might be a step forward.
I believe it simply tells us that our understanding of mechanical systems, especially chaotic ones, is not as well defined as we thought.
https://journals.aps.org/prresearch/abstract/10.1103/PhysRev...
That is not a real concern, just a confusion about how statistics works :(
We will get better at understanding black boxes. If a model can be compressed into a simple math formula, then it's both easier to understand and cheaper to compute.
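A toy sketch of that compression idea (everything invented for illustration; a plain polynomial fit stands in for a real symbolic-regression search): query the opaque model densely on the region you care about, distill it into a short closed-form surrogate, and report how much fidelity you lose.

    import numpy as np

    # Pretend this is an opaque learned model we can only query.
    def black_box(x):
        return 0.5 * x**3 - x + np.sin(0.2 * x)   # unknown to us

    # Query it densely on the region we care about...
    x = np.linspace(-3, 3, 400)
    y = black_box(x)

    # ...and compress it into a short closed-form surrogate.
    coeffs = np.polyfit(x, y, deg=3)
    surrogate = np.poly1d(coeffs)

    max_err = np.max(np.abs(surrogate(x) - y))
    print("surrogate formula:", surrogate)
    print(f"max error on [-3, 3]: {max_err:.3f}")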
I think it likely that instead of replacing existing methods, we will see a fusion. Or rather, many different kinds of fusions - depending on the exact needs of the problems at hand (or in science, the current boundary of knowledge). If nothing else then to provide appropriate/desirable level of explainability, correctness etc. Hypothetically the combination will also have better predictive performance and be more data efficient - but it remains to be seen how well this plays out in practice. The field of "physics informed machine learning" is all about this.
These processes are both beyond human comprehension, because they contain vast layers of tiny interactions, and impractical to simulate directly. This tech will allow exploration via accurate simulations to better understand new ideas when needed.
Every time the two systems disagree, it's an opportunity to learn something. Both kinds of models can be improved with new information obtained through real-world experiments.
I think at some point, we will be able to produce models that are able to pass data into a target model and observe its activations and outputs and put together some interpretable pattern or loose set of rules that govern the input-output relationship in the target model. Using this on a model like AlphaFold might enable us to translate inferred chemical laws into natural language.
I can only hope the models will be sophisticated enough and willing to explain their reasoning to us.
Whatever it is, if we needed to, we could follow each instruction through the black box. It's never going to be as opaque as something organic.
"better and better models of the world" does not always mean "more accurate" and never has.
We already know how to model the vast majority of things, just not at a speed and cost which makes it worthwhile. There are dimensions of value - one is accuracy, another speed, another cost, and in different domains additional dimensions. There are all kinds of models used in different disciplines which are empirical and not completely understood. Reducing things to the lowest level of physics and building up models from there has never been the only approach. Biology, geology, weather, materials all have models which have hacks in them, known simplifications, statistical approximations, so the result can be calculated. It's just about choosing the best hacks to get the best trade off of time/money/accuracy.
This is a neat observation. Slightly terrifying, but still interesting. Seems like there will also be cases where we discover new theories through the uninterpretable models—much easier and faster to experiment endlessly with a computer.
We need to advance mechanistic interpretability (the field of reverse-engineering neural networks): https://www.youtube.com/watch?v=P7sjVMtb5Sg https://www.youtube.com/watch?v=7t9umZ1tFso https://www.youtube.com/watch?v=2Rdp9GvcYOE
I can only assume that existing methods would still be used for verification. At least we understand the logic behind those methods. The ML models might become more accurate on average, but they could still occasionally throw out results that are way off, so their error rate would have to at least match that of the existing methods.
This is exactly how the physicists felt at the dawn of quantum physics - the loss of meaningful human inquiry to blindly effective statistics. Sobering stuff…
Personally, I’m convinced that human reason is less pure than we think it to be, and that the move to large mathematical models might just be formalizing a lack-of-control that was always there. But that’s less of a philosophy of science discussion and more of a cognitive science one
A better analogy is "weather forecasting".
In physics, we already deal with the fact that many of the core equations cannot be analytically solved for more than the most basic scenarios. We've had to adapt to using approximation methods and numerical methods. This will have to be another place where we adapt to a practical way of getting results.
The top HN response to this should be: what happens is that an opportunity has entered the chat.
There is a wave coming - I won't try to predict if it's the next one - where the hot thing in AI/ML is going to be profoundly powerful tools for analyzing other such tools and rendering them intelligible to us,
which I imagine will mean providing something like a zoomable explainer. At every level there are footnotes; if you want to understand why the simplified model is a simplification, you look at the fine print. Which has fine print. Which has...
Which doesn't mean there isn't a stable level at which some formal notion of "accurate" can be said to exist - the minimum viable level of simplification.
Etc.
This sort of thing will of course be the input to many other things.
We could be entering a new age of epicycles - high accuracy but very flawed understanding.
Perhaps for understanding the structure itself, but having the structure available allows us to focus on a coarser level. We also don't want to use quantum mechanics to understand the everyday world, and that's why we have classic mechanics etc.
The frontier in model space is kind of fluid. It's all about solving differential equations.
In theoretical physics, you know the equations and you solve them analytically, but you can only do that when the model is simple.
In numerical physics, you know the equations, you discretize the problem on a grid, and you satisfy the constraints defined by the equations with various numerical integration schemes like RK4 - but you can only do that when the model is small, and you find a single solution.
Then you want the result faster, so you use mesh-free methods and adaptive grids. That works on bigger models, but you still have to know the equations, and you still find a single solution to the differential equations.
Then you compress this adaptive grid with a neural network, while still knowing the governing equations, and you have things like Physics-Informed Neural Networks (https://arxiv.org/pdf/1711.10561 and following papers), where you can bound the approximation error. This method lets you solve for all solutions of the differential equations simultaneously, sharing the computation.
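As a rough, self-contained illustration of the PINN idea (an arbitrary toy ODE and architecture, nothing from the AlphaFold paper): the residual of the governing equation itself becomes part of the training loss, so no solution data is needed.

    import torch

    torch.manual_seed(0)

    # Tiny physics-informed network for the ODE u'(x) = -u(x), u(0) = 1
    # (exact solution: exp(-x)).
    net = torch.nn.Sequential(
        torch.nn.Linear(1, 32), torch.nn.Tanh(),
        torch.nn.Linear(32, 32), torch.nn.Tanh(),
        torch.nn.Linear(32, 1),
    )
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)

    x = torch.linspace(0, 2, 64).reshape(-1, 1).requires_grad_(True)
    x0 = torch.zeros(1, 1)

    for step in range(3000):
        u = net(x)
        du_dx = torch.autograd.grad(u, x, torch.ones_like(u), create_graph=True)[0]
        residual = du_dx + u                 # u' + u should be 0 everywhere
        loss = (residual ** 2).mean() + (net(x0) - 1.0).pow(2).mean()
        opt.zero_grad(); loss.backward(); opt.step()

    with torch.no_grad():
        x_test = torch.tensor([[0.0], [1.0], [2.0]])
        print(net(x_test).squeeze())          # network's solution
        print(torch.exp(-x_test).squeeze())   # exact exp(-x) for comparison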
Then, when knowing your governing equations explicitly is too complex, you assume there are some implicit governing stochastic equations and learn the end result of the dynamics with a diffusion model - that's what AlphaFold is doing.
ML is kind of a memoization technique, analogous to Hashlife in the Game of Life, that lets you reuse your past computational effort. You are free to choose where on this ladder, and which memory/compute trade-off, you want to use to model the world.
Might be easier to come up with new models with analytic solutions if you have a probabilistic model at hand. A lot easier to evaluate against data and iterate. Also, I wouldn't be surprised if we develop better tools for introspecting these models over time.
Interesting times indeed. I think the early history of medicines takes away from your observation though. In the 19th and early 20th century people didn't know why medicines worked, they just did. The whole "try a bunch of things on mice, pick the best ones and try them on pigs, and then the best of those and try a few on people" kind of thing. In many ways the mice were a stand in for these models, at the time scientists didn't understand nearly as much about how mice worked (early mice models were pretty crude by today's standards) but they knew they were a close enough analog to the "real thing" that the information provided by mouse studies was usefully translated into things that might help/harm humans.
So when your tools can produce outputs that you find useful, you can then use those tools to develop your understanding and insights. As a tool, this is quite good.
"Best methods" is doing a lot of heavy lifting here. "Best" is a very multidimensional thing, with different priorities leading to different "bests." Someone will inevitably prioritize reliability/accuracy/fidelity/interpretability, and that's probably going to be a significant segment of the sciences. Maybe it's like how engineers just need an approximation that's predictive enough to build with, but scientists still want to understand the underlying phenomena. There will be an analogy to how some people just want an opaque model that works on a restricted domain for their purposes, but others will be interested in clearer models or unrestricted/less restricted domain models.
It could lead to a very interesting ecosystem of roles.
Even if you just limit the discussion to using the best model of X to design a better Y, limited to the model's domain of validity, that might translate the usage problem to finding argmax_X of valueFunction of modelPrediction of design of X. In some sense a good predictive model is enough to solve this with brute force, but this still leaves room for tons of fascinating foundational work. Maybe you start to find that the (wow so small) errors in modelPrediction are correlated with valueFunction, so the most accurate predictions don't make it the best for argmax (aka optimization might exploit model errors rather than optimizing the real thing). Or maybe brute force just isn't computationally feasible, so you need to understand something deeper about the problem to simplify the optimization to make it cheap.
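A toy sketch of that last failure mode (all functions and numbers invented for illustration): brute-force argmax against a surrogate whose small errors are structured, so the optimizer drifts toward points where the error flatters the prediction rather than toward the true optimum.

    import numpy as np

    # True (unknown) property of a design x, and an imperfect learned model
    # of it: small errors, but structured in a way the optimizer can exploit.
    def true_property(x):
        return -(x - 0.3) ** 2

    def model_prediction(x):
        return true_property(x) + 0.05 * np.sin(25 * x)   # "wow so small" errors

    def value_function(p):
        return p                                          # just maximize the property

    # Brute-force argmax over candidate designs using only the model.
    candidates = np.linspace(0, 1, 10_001)
    scores = value_function(model_prediction(candidates))
    x_best_model = candidates[np.argmax(scores)]

    x_best_true = candidates[np.argmax(value_function(true_property(candidates)))]
    print(f"model's pick: x={x_best_model:.3f}, true value={true_property(x_best_model):+.4f}")
    print(f"true optimum: x={x_best_true:.3f}, true value={true_property(x_best_true):+.4f}")

The gap between the two printed values is the optimizer exploiting the model rather than reality, which is exactly why the "accurate enough to brute-force" framing still leaves real work to do.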