Discord links! As citations for major results in foundational computer science!
Can someone point me to some resources as to how to interpret the table, which I assume is some sort of description for a Turing machine?
The "states" (A, B, C) correspond to goto targets. The "colors" (0, 1, 2, 3) are runtime data. At each state, the current color is read, and an instruction is executed (print some color, move left or right, goto some state) based on which color is there. Transliterated into C it looks like this:
#include "machine.h"

int main(void)
{
A:
    switch (SCAN) {
    case 0: WRITE(1); RIGHT; goto B;
    case 1: WRITE(3); LEFT;  goto B;
    case 2: WRITE(1); RIGHT; goto H;
    case 3: WRITE(2); RIGHT; goto A;
    }
B:
    switch (SCAN) {
    case 0: WRITE(2); LEFT;  goto C;
    case 1: WRITE(3); RIGHT; goto B;
    case 2: WRITE(1); LEFT;  goto C;
    case 3: WRITE(2); RIGHT; goto A;
    }
C:
    switch (SCAN) {
    case 0: WRITE(3); RIGHT; goto B;
    case 1: WRITE(1); LEFT;  goto B;
    case 2: WRITE(3); LEFT;  goto C;
    case 3: WRITE(2); RIGHT; goto C;
    }
H:
    HALT;
}
Question: are there any opportunities to rewrite this logic in a more "structured" style, or to make any other optimizations?
What's the initial state supposed to be?
Infinite tape of zeros.
Zero is neither A nor B nor C...?
The initial state is A, the tape is full of 0 (on both sides).
The start state is A.
My understanding is that the states are conventionally listed in order, so A would be the initial state:
A TM string is in lexical normal form iff the following conditions obtain: …The non-initial active states first occur in ascending order…
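For concreteness, here's a small Python sketch of just that ordering condition (my reading of the quoted rule, applied to the compact machine string used in this thread; the full lexical-normal-form definition has more conditions than this one):

```python
def states_in_ascending_order(tm: str) -> bool:
    """Check that non-initial states, on first occurrence (scanning
    transitions left to right), appear in ascending order.
    'Z' is the halt state, not an active state."""
    seen = []
    for row in tm.split("_"):            # one row of the table per state
        for i in range(0, len(row), 3):  # each transition is 3 chars
            nxt = row[i + 2]             # third char = next state
            if nxt != "Z" and nxt not in seen:
                seen.append(nxt)
    non_initial = [s for s in seen if s != "A"]
    return non_initial == sorted(non_initial)

print(states_in_ascending_order("1RB3LB1RZ2RA_2LC3RB1LC2RA_3RB1LB3LC2RC"))  # True
```

So the champion string itself is consistent with A being the initial state.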
I love these puzzles. GNU C supports labels as values for computed goto. This is useful for direct threaded dispatch. You trade off a branch instruction for an address lookup, but it makes the code more structured.
int main(void) {
    void *A[] = {&&A0, &&A1, &&A2, &&A3};
    void *B[] = {&&B0, &&B1, &&B2, &&B3};
    void *C[] = {&&C0, &&C1, &&C2, &&C3};
    goto *A[SCAN];
A0: WRITE(1); RIGHT; goto *B[SCAN];
A1: WRITE(3); LEFT ; goto *B[SCAN];
A2: WRITE(1); RIGHT; HALT; return 0;
A3: WRITE(2); RIGHT; goto *A[SCAN];
B0: WRITE(2); LEFT ; goto *C[SCAN];
B1: WRITE(3); RIGHT; goto *B[SCAN];
B2: WRITE(1); LEFT ; goto *C[SCAN];
B3: WRITE(2); RIGHT; goto *A[SCAN];
C0: WRITE(3); RIGHT; goto *B[SCAN];
C1: WRITE(1); LEFT ; goto *B[SCAN];
C2: WRITE(3); LEFT ; goto *C[SCAN];
C3: WRITE(2); RIGHT; goto *C[SCAN];
}
GNU C supports labels as values for computed goto.
Why doesn't any modern C standard like C23 include this? Seems like a glaring omission.
Sometimes, the fact that one implementation includes it can make it actually more difficult to standardize, if there are some implementation details that are disagreeable (see GNU's nested functions)
Question: are there any opportunities to rewrite this logic in a more "structured" style, or to make any other optimizations?
Because A and C only jump to B, it is possible to structure this using only loops and one boolean. Let us use Rust to demonstrate, as it lacks goto:
let mut a = true;
loop {
    loop {
        if a { // state A
            match scan() {
                0 => { write(1); right(); break }
                1 => { write(3); left(); break }
                2 => { write(1); right(); return }
                3 => { write(2); right() }
            }
        } else { // state C
            match scan() {
                0 => { write(3); right(); break }
                1 => { write(1); left(); break }
                2 => { write(3); left() }
                3 => { write(2); right() }
            }
        }
    }
    a = loop { // state B
        match scan() {
            0 => { write(2); left(); break false }
            1 => { write(3); right() }
            2 => { write(1); left(); break false }
            3 => { write(2); right(); break true }
        }
    };
}
Of course it is possible to rewrite this as a single loop if you are willing to accept two bits of extra state rather than one.
Wow! I don't know if I would call that "structured", but it's pretty clever. And horrifying. Well done!
I found this quite helpful: https://nickdrozd.github.io/2020/10/04/turing-machine-notati...
Despite a CS undergrad I don’t recall really learning any of these canonical representations of TMs before.
Each triplet seems to be the symbol to write, the direction to move in, and the next state. You can think of the behavior of a Turing machine as a function of the current state and the symbol read, applied iteratively.
I've made a small repository with current record holders that also shows examples of running these machines with Wolfram Language: https://datarepository.wolframcloud.com/resources/The-Busy-B.... I guess I also need to update it now.
Each row is a state and each column is the symbol that was just read off the tape. So the first row & first column mean "I have just read symbol 0 and am currently in state A".
The cell in the table describes which actions to perform. The first row & first column has "1RB" which means: "replace the symbol on the tape with '1', shift 1 symbol to the right on the tape and switch to state 'B'".
The state 'Z' corresponds to the halting state.
In Python:
def L():
    global index, tape
    if index: index -= 1
    else: tape.insert(0, 0)

def R():
    global index, tape
    index += 1
    if index >= len(tape): tape.append(0)

table = {
    ('A', 0): (1, R, 'B'),
    ('A', 1): (3, L, 'B'),
    ('A', 2): (1, R, 'Z'),
    ('A', 3): (2, R, 'A'),
    ('B', 0): (2, L, 'C'),
    ('B', 1): (3, R, 'B'),
    ('B', 2): (1, L, 'C'),
    ('B', 3): (2, R, 'A'),
    ('C', 0): (3, R, 'B'),
    ('C', 1): (1, L, 'B'),
    ('C', 2): (3, L, 'C'),
    ('C', 3): (2, R, 'C'),
}

state = 'A'
tape = [0]
index = 0
while state != 'Z':
    tape[index], direction, state = table[state, tape[index]]
    direction()
There's a brief explanation at [0]. (Note that "1RZ" is understood to be a halting transition, since state "Z" has no rules.) Wikipedia also has a more elaborated example of a TM state table [1]. You can see the trace of this particular TM at [2].
[0] https://bbchallenge.org/story#turing-machines
[1] https://en.wikipedia.org/wiki/Turing_machine#Formal_definiti...
[2] https://bbchallenge.org/1RB3LB1RZ2RA_2LC3RB1LC2RA_3RB1LB3LC2...
The new BB(3,4) record holder is
      0    1    2    3
A   1RB  3LB  1RZ  2RA
B   2LC  3RB  1LC  2RA
C   3RB  1LB  3LC  2RC
with the triple (t', d, s') in row s, column t specifying the transition from state s with symbol t under the tape head. The machine overwrites symbol t with t', moves the tape head Left/Right according to direction d, and changes state from s to s', halting if s' == Z.
This is 3*4*log2(4*2*log2(4+1)) or about 64 bits of information. Meanwhile, in only 49 bits, BBλ(49) far exceeds Graham's number [1].
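For anyone wanting to play with the table, the compact string decodes mechanically. A sketch of a parser (format as used on bbchallenge and quoted in this thread: rows separated by `_`, three characters per transition, transitions in symbol order):

```python
def parse_tm(s: str):
    """Parse a machine string like '1RB3LB1RZ2RA_...' into
    {(state, symbol): (write, direction, next_state)}."""
    table = {}
    for i, row in enumerate(s.split("_")):
        state = chr(ord("A") + i)            # rows are states A, B, C, ...
        for sym in range(len(row) // 3):     # columns are symbols 0, 1, 2, ...
            w, d, nxt = row[3*sym: 3*sym + 3]
            table[(state, sym)] = (int(w), d, nxt)
    return table

table = parse_tm("1RB3LB1RZ2RA_2LC3RB1LC2RA_3RB1LB3LC2RC")
print(table[("A", 0)])  # (1, 'R', 'B')
print(table[("C", 3)])  # (2, 'R', 'C')
```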
Can you help me understand the log2(4+1) term?
I calculate 3*4*log2(4*2*log2(4+1)) ≈ 51.
I (a layperson) would have thought the calculation would be 3*4*log2(4*2*4) = 60.
Is it perhaps 3*4*log2(4*2*log2(3*3*4-1)) ≈ 64?
I miscalculated. It should be 3*4*log2(4*2*(3+1))= 60 bits of information.
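Checking the corrected arithmetic in Python (each of the 12 table cells independently encodes a written symbol, a direction, and a next state including halt):

```python
from math import log2

states, symbols = 3, 4
cells = states * symbols                  # 12 transitions in the table
choices = symbols * 2 * (states + 1)      # write * direction * next state (incl. halt)
bits = cells * log2(choices)
print(bits)  # 60.0
```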
You also need to subtract some bits from symmetries right?
Like you can mirror the tape. You can relabel states B and C. And you can relabel symbols 1, 2, 3. Though the analysis is complicated a little bit by the fact that some (combinations) of those operations may yield the exact same machine.
Also, is there any sense in restricting the universe of machines to those that include one and only one halting symbol? Machines with more or fewer than one halting symbol are possible, but they're less interesting, right?
If you mean multiple halting transitions, then it doesn't really help for BB purposes. If the machine does halt, then only one of the halting transitions will ever be taken, and the others will be 'dead code'.
If you mean multiple halting states, then that is possible, and it can be used as a model of computation for, e.g., computable functions returning a single bit as output. But you still don't gain any additional running time before the machine halts.
As for no halting transitions, the problem is choosing what point to measure the tape at. One thing you can do is to put a mark on one of the transitions, then see whether the machine only runs that transition a finite number of times, or if it keeps running it forever. This yields the "Beeping Busy Beaver" numbers [0], which are uncomputable even if you have an oracle for the halting problem.
[0] https://www.sligocki.com/2021/03/06/beeping-busy-beaver/
What I mean is that instead of measuring the bits of a particular program BB(n,k), we could instead define a function BBWOHT(n,k) ("Busy Beaver With One Halting Transition"), and measure the bits of a particular BBWOHT(n,k) program, which would be fewer bits than in the equivalent BB(n,k) program.
Perhaps this is semantic games, but I'm trying to get at the idea that while there may be 60 bits of information in the BB(3,4) program, there are fewer in the BBWOHT(3,4) program, because we can prima facie exclude a bunch of "uninteresting" BB(n,k) programs from consideration if we restrict ourselves to BBWOHT(n,k), in the same way that we can exclude a bunch of "uninteresting" BB(n,k) programs from consideration if we restrict ourselves to BB(n,k) programs that aren't symmetries of other BB(n,k) programs.
There are many inefficiencies in Turing Machines which you could try to work around with increasingly complicated encodings. But that will never yield a provably optimal machine encoding such as you can obtain with BBλ2 [1].
Can you clarify what you mean by BBλ being "provably optimal"? IIUC BB functions for any Turing-complete computation model should be equivalent up to a constant. Maybe something like: there exists N, c: forall n >= N: BBλ(n) < BB(n + c) and vice-versa. I am not sure what the exact equation will be off-hand, maybe in this case we will need to replace BB() with a version based on binary encoding of TMs.
As the OEIS entry says:
for any other busy beaver BB based on self-delimiting programs, there is a constant c such that a(c+n) >= BB(n)
This requires an information theoretic measure of program size, such as bits or bytes, whereas the TM-based BB measures in states. I don't believe there are additively tight bounds on how many states are needed to encode an arbitrary string of n bits.
Limiting to one halt does easily shave off a few bits.
Looking at the state transitions:
12 * log2(3+1) is 24 bits
log2(12) + 11 * log2(3) is only 21.02 bits
And since the specific output symbol and movement direction of the halting state don't matter, that's another few bits you can throw away.
So that gets you down to 54.02 bits before you even consider factoring out symmetries.
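The arithmetic above, spelled out (assuming exactly one of the 12 cells is the halting transition, and that the halting cell's written symbol and direction are ignored):

```python
from math import log2

cells = 12
halt_choice = log2(cells)                         # which cell is the halting one
next_states = (cells - 1) * log2(3)               # the other 11 cells pick a state A/B/C
sym_and_dir = (cells - 1) * (log2(4) + log2(2))   # write symbol + direction, non-halt cells
total = halt_choice + next_states + sym_and_dir
print(round(total, 2))  # 54.02
```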
Only if you want to count the number of different possible TM behaviours. The official count of BB TMs as given in OEIS sequence A052200 ignores such symmetries.
Counting the number of distinct TMs is not a simple task. The way you count it is the broadest (the count of all tables of values where each cell can have any (symbol, direction, state) combination), but this is a significant overcount of the bits needed to describe an arbitrary TM. For the BB(3, 4) case, there are only about 600B distinct TMs using Tree Normal Form, aka Brady's algorithm (https://nickdrozd.github.io/2022/01/14/bradys-algorithm.html), which comes out to <40 bits.
Do you know of any (hand-wavy is ok) intuitive explanation for why this machine will halt, beyond the inductive proofs given in your article? Just watching the machine run, I would guess at some kind of infinite behavior. It is remarkable that this is not the case.
Do you know of any (hand-wavy is ok) intuitive explanation for why this machine will halt, beyond the inductive proofs given in your article?
If the Halting problem could be solved by intuition, it wouldn't be much of a problem.
Yet in this case there is a proof that it halts, and that proof contains an argument. I was asking for an explanation of the essence of that argument, if simplifying it is possible.
Similarly, 2^n would be a huge overcount of the number of n-bit closed terms, which is given by OEIS sequence A114852 [1]. There are 36140122614 ~ 2^35 closed terms of size 49 bits (length of a straightforward self-delimiting encoding). So in that sense it's still a smaller lambda term computing an incomprehensibly larger output.
Hmm, so I think the 1R in 1RZ doesn't matter then for this program and was arbitrarily chosen, since the machine halts there, so it doesn't matter what's still written on the tape or where the tape head moves?
(Even writing of the 1 doesn't matter, though a 0 would have been suboptimal I guess. A 2 was already written on the tape there, it's being replaced by a 1 but the 2 counted just as much for 'number of symbols on the tape')
It is sometimes thought that extremely long-running Turing machine programs must be deeply complicated, or "spaghetti code". This new reigning champion program is a counterexample. All things considered, it is relatively simple.
There are three states: A, B, C. (States are just like goto targets in C.) State B passes control to A and C, but states A and C don't "know about" each other; they only pass control back to B. This is a sort of modular construction, whereas in true spaghetti code each state would be able to pass to all the others.
This program has some other interesting features. It never prints a blank (that is, whenever it scans a 0, it prints 1, 2, or 3). Additionally, every instruction changes either the state or the color -- there are no "lazy instructions" like B1 -> 1LB that just move position without changing anything else.
There is some debate in the bbchallenge project regarding the current long-running champions: do their properties reflect those of the actual longest-running machine of that size, or are their properties just the easiest to automatically search for and prove things about, as a kind of streetlamp effect? There's no way to know until the entire search space has been ruled out, either definitively or heuristically. (Every size past BB(5, 2) contains chaotic, pseudorandom machines that are expected to run forever, but can't be proven to do so without big advances in number theory.)
Though I suspect that no long-running machine can be utterly chaotic, at least when you look at the patterns it produces on the tape. To run for a long time, a machine must simulate some sort of higher-level rule: if it were dumping symbols on the tape at random, then it would very soon reach a halting configuration, a cyclical configuration, or some other simplified pattern. Still, one can have long-running machines that do simulate something chaotic on a higher level, spending an inordinate amount of time between each high-level step until it halts.
streetlamp effect
I agree completely, all of these kinds of conjectures are shaped by what is detectable. If there are any "dark matter" programs out there, they by definition will be difficult to find. That said, I find it entirely plausible that the champions will win by exploiting complex and exotic mathematical facts, while the implementations of the math do not themselves need to be complex or exotic at the code level.
More rambling thoughts about this: https://nickdrozd.github.io/2021/09/25/spaghetti-code-conjec...
Yeah, thankfully we're still in the range where halting machines are at least estimable in terms of inductive notations like the up-arrow. But as the machines gain more breathing room for working with variable-length data, we can start getting monsters like the TREE sequence.
At a high level, wouldn’t you expect Champions to be chaotic in the limit? As in, the halting problem tells us that any increasing sequence of Champions is not computable.
Yes, for a busy beaver properly defined in information theoretic terms, like BBλ2 [1], one can prove that its champions must be incompressible up to some constant. Mikhail Andreev's article "Busy Beavers and Kolmogorov complexity" [2] explores these connections.
Handwaving here, but I think longest running machines can't follow a specific structure in general.
Rough sketch of an argument:
Let's construct a number from all intermediate states of the machine concatenated. The number of digits of this number should correspond to the runtime (sketchy). We only care about the halting machines, so it's finite. We know that it must be unique, because if a smaller machine computed the same number, we could get a bigger number by simply running the smaller program and then doing other nonsense. That means the biggest programs are Kolmogorov-optimal, and the numbers themselves should be K-trivial and thus nearly, but not quite, computable. Since they're not computable, the programs themselves can't follow a structure to generate them (since that would be computable in turn for larger values).
Of course, there can't be some grand global structure to describe all the champions in one go. But there could be properties that many have in common, e.g., predominantly consisting some number of nested recursions, characterized by some transfinite ordinal. For instance, this particular machine recurses over the first several hyperoperations.
whereas in true spaghetti code each state would be able to pass to all the others.
An n-state s-symbol TM can transition to n other states at most (or halt). Thus, for s=4 or s=2, only the smallest of TMs could be spaghetti like.
There are only so many Turing machines that we can possibly describe with a not too large amount of symbols such as 1RB3LB1RZ2RA_2LC3RB1LC2RA_3RB1LB3LC2RC. The fact that some of these can make such a crazy large number of steps before halting to me is mind blowing.
There are 2^60 such 3 state 4 symbol Turing machines. A 49-bit lambda term whose output (normal form) exceeds Graham's Number should blow your mind even more.
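The 2^60 count is the same quantity as the 60 bits discussed elsewhere in the thread; each of the 12 table cells independently picks one of 4*2*4 = 32 options:

```python
# 12 cells, each: 4 write symbols * 2 directions * (3 states + halt)
machines = (4 * 2 * (3 + 1)) ** (3 * 4)
print(machines == 2**60)  # True
```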
2^60 is very little! Is it known what fraction of them has an insanely large run time?
If you know, you know I guess. I certainly have no idea.
The acronyms refer to:
https://en.wikipedia.org/wiki/Busy_beaver
https://en.wikipedia.org/wiki/Ackermann_function
As I understand it, the game around functions like this is to get as close to infinity as you can, but not quite, and then to try to uncover properties about what you find there.
I'm under the impression that it's a certain kind of fun because the results are all way too large to work with computationally, so rather than comparing the values (which you can't calculate directly) you have to reason about the various algorithms that yield them.
That's all I got. The gist of the post is Greek to me. I wish I had more computer science and less software engineering under my belt. Then maybe this could be my kind of fun too.
Well, the goal is to get as close to infinity as possible with the smallest program possible. We can name big numbers just fine, by recursing over recursion for some number of steps, but the fun part is to have these fall out 'naturally' from the space of small Turing machines.
In this case, we can actually name the number of symbols in terms of Knuth's up-arrow notation [0], which describes the sequence of operations starting with multiplication, exponentiation, repeated exponentiation (called tetration, e.g., 2↑↑5 = 2^[2^[2^[2^2]]]), repeated tetration (called pentation, e.g., 2↑↑↑5 = 2↑↑[2↑↑[2↑↑[2↑↑2]]]), and so on. This number is big enough that it needs 15 arrows in a row to describe it reasonably. So it's not just that the number is very large, it's that we can also put a neat lid over its value. For instance, we know it's still nothing compared to Graham's number, which can't be described with any reasonable number of up-arrows.
[0] https://en.wikipedia.org/wiki/Knuth%27s_up-arrow_notation
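For small arguments, the up-arrow hierarchy described above is easy to compute directly (a sketch; the values explode almost immediately, which is the whole point):

```python
def up_arrow(a: int, n: int, b: int) -> int:
    """Knuth's a ↑^n b: one arrow is exponentiation; each additional
    arrow iterates the previous operation, associating to the right."""
    if n == 1:
        return a ** b
    if b == 0:
        return 1
    return up_arrow(a, n - 1, up_arrow(a, n, b - 1))

print(up_arrow(2, 2, 4))  # 2↑↑4 = 2^(2^(2^2)) = 65536
print(up_arrow(2, 3, 3))  # 2↑↑↑3 = 2↑↑(2↑↑2) = 2↑↑4 = 65536
```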
Is it not true that BB(5) > BB(3,4)? On the https://bbchallenge.org site it said they're trying to prove or disprove the conjecture that BB(5) is about 47 million but BB(3,4) is apparently much bigger than that.
Yes, it seems that BB(3, 4) >>> BB(5, 2) (BB(5) = BB(5, 2)). This is not too surprising, since BB(3, 4) has 12 transitions in its table (3*4) while BB(5, 2) only has 10. But it seems that also BB(3, 4) >> BB(6, 2), which has the same number of transitions, so it appears that having more symbols is valuable to these little TMs.
I think I can explain that. If you only have two symbols, you can encode the same information as one four-symbol cell in two cells, so performing the same operation on them requires an extra state to read the first of the two cells and then move to one of two other states to handle the second. Fewer symbols require more states.
This is related to the linear speed-up theorem, which roughly states that you can speed up any TM by expanding its alphabet. And speed-up is not what BB is about.
So actually, it would make sense to limit the busy beavers to BB(n, 2) only.
Can someone give a quick description of why this one is special? I'm familiar with Turing machines (sort of), but I can't deduce why this specific instruction set is impressive. What's an Ackermann-level function? What's it actually computing?
I was curious as to how it works, so I implemented it here: turingmachine.io/?import-gist=c862f28918f3d889f964797694d28fcc
If you run it for a bit you see what's going on: state B turns 0's into 2's and 1's into 1's, transitioning to C, and state C turns 3's into 2's and transitions to A. So you just iteratively lengthen your run of 3's exponentially, since it requires a full pass through all the 3's to fix a 2->1.
It's fairly easy to make a turing machine that goes exponential and blows up forever.
But the really tricky part to understand is why it eventually stops. After an unimaginably high number of steps.
This all sounds like extreme Code Golf to me.
Here's a tangent to explore:
A BitGrid[1] (my personal hobby horse) only has 4 bits of state per cell, so a 4x4 grid of cells can't count more than 2^64, no matter what. Finding the highest it could count would be interesting. For small grids, the edge connections would dominate the outcome.
I don’t know why, but these probably-useless-results (which frankly I don’t 100% understand) intrigue me more than the incredibly-useful LLM advances. I guess because I’m naturally drawn to “simple” mathematical truths more than “messy” engineering results.
And you thought your for-loops were too deeply nested!
Nice that the author provides some context...
Why not? The idea that the only valid way to publish a scientific result is in a (so-called) peer-reviewed journal is a relic from 200 years ago when the scientific community was small enough to fit within the monkeysphere [1]. It is being clung to only because it is profitable for a small group of powerful academics and publishing houses, not because it has any actual merit in terms of the advancement of science. In fact, it's probably in no small measure responsible for contemporary replication crises. I'm as staunch an advocate of the scientific method as you will hope to find (I'm writing a book about it!) but IMHO traditional peer review is long past its sell-by date.
[1] https://en.wikipedia.org/wiki/Dunbar%27s_number
To be fair, the responsibility for the replication crises is much trickier. It’s based on human motivations and poor science:
1. Outright fraud
2. Lack of independent verification of results before building up a body of work resulting in a bunch of garbage. Contributes to making #1 “easy”.
3. Financial pressures to publish or perish contributing to 1 & 2. If peer review didn't exist you'd still have similar pressure to produce "recognized" results. This is also probably why we haven't made any major breakthroughs in particle physics, which now has a much longer-term thinking phase to come up with anything.
The biggest problem with peer review is actually the establishment of an orthodoxy which discourages itself being upended by controlling the career success of anyone who dare challenge it - you have to support the orthodoxy to get career success and you fail to get recognition if your idea challenges the peer reviewers ideas and careers. That being said, such pressures existed before, but at least you could try to build your own following and it was competing orthodoxies instead of a single one winning out.
But those only exist because of the current peer-review model. There is a huge non-linearity built in to the publication process that provides disproportionate rewards for conning a small number of people, fewer than half a dozen. That, combined with a presumption of trustworthiness, produces a perverse incentive for attempting such cons because they are very easy to pull off, much easier than actually doing good science.
The rewards come from the grant model which I forgot to mention and also has similar problems to peer review. The stipulation to publish is a secondary effect. If peer review didn’t exist there would still be pressure to show that you somehow convinced the leading experts in the field.
And how do you tell who are the leading experts in the field?
It doesn’t matter. There’s plenty of ways. Tenure, those who get cited the most as contributing to other works, whoever manages to shmooze their way onto the grant review board, whoever has been doing “good” work in a space etc etc. If you think that peer review is the only way status is established, I’m afraid you’re failing to grok how humans work - we’re very status oriented and will come up with any mechanism to formalize and broadcast that status and academia and the scientific world is not immune from this.
And how do you think that gets decided?
And how do you get cited without first getting published in a peer-reviewed publication?
Do you think it's possible to do that without a publication record?
As decided by whom?
The point is that the current system is based on a small cadre of people assessing each other's work on the assumption that they are all competent and trustworthy. The bigger the community, the easier it becomes to game the system, and the bigger the incentives to do so, and so the less reliable traditional peer review becomes as a predictor of scientific quality. To say nothing of the sheer horrible inefficiency. It takes months to do something that should take days. If anything was ever ripe for disruption, it's peer review.
BTW, here is an example of what happens when someone actually sets out to game the system and is fairly good at it:
https://www.youtube.com/watch?v=ODgYbmmgOss
Tenure precedes peer review afaik which I think pretty obviously negates this question - humans established tenure somehow so whatever that mechanism was. Peer review as a concept is quite old (17th century) and what these guys did on discord is peer review and collaboration. I’m assuming you’re just using shorthand to refer to journal peer review which is the more recent phenomenon.
Citations exist independently of peer review. Not sure why you think you can’t have one without the other. Journals are certainly not the only source cited. For example, math I believe doesn’t even generally use journals and yet citations are going strong there.
Possible? Of course. Pick 10 random bureaucrats and have them pick admissions at random. Good? Well, now you seem to be arguing the pro-publication position as a way of coming up with a better review board. But anyway, yes, obviously there are better ways of establishing a grant review board by trying to populate it with some amount of "freethinkers".
We're agreed that the peer review system sucks for all sorts of reasons, but we're now very far afield from what I was trying to correct, which is that the replication crises have many origins and aren't just the fault of peer review. You'd have them even if journals and publish-or-perish weren't a thing.
Um, no. Tenure decisions turn largely on publication record, which turns on peer review.
To cite something there has to be something to cite. It is of course possible to cite a non-peer-reviewed publication, but in academia this is heavily frowned upon. Non-peer-reviewed is generally considered synonymous with crackpottery.
I meant do you think it's possible to "shmooze [your] way onto the grant review board" without a publication record in the real world, not some counterfactual world where you have stacked the deck.
I didn't say it was just the fault of peer review. What I said was that peer review was "probably in no small measure responsible for [the] replication crises", and I stand by that.
Ok, so what you seem to have said in this thread is that the system today is a huge contributor to the replication crisis but any suggestion that things could be done differently is met with an endless barrage of questions and resistance that no, it had to be done this way. So your complaint is that there’s a system at all instead of anarchy? Really not sure what point you’re trying to make.
Maybe re-reading my original comment will help clarify: https://news.ycombinator.com/item?id=40456188
Not really, no. I suggested that you can decouple tenure (1900s) from modern peer review (1970s). Citations aren't an issue when publishing a result (you can publish anywhere) but are maybe more of an issue when you have a collaborative body of work (e.g. closer to open-source software development). But even then you can have citations (e.g. the citation using the Discord channel). For some reason you seem to have taken the position that citations are inextricable from peer review. The grant mechanism is definitely a problem because of how it interacts with university funding, but the grant selection mechanism can evolve to be closer to how entrepreneurs work in the business market (which has its own pluses and minuses). What I suggested, though, is that even if you completely remove the modern peer review system, you'll still be left with the replication crises. You seem to have taken issue both with the suggestion that peer review is removable from academia and to have completely failed to engage with the issues that have nothing to do with peer review.
1. Issues around funding are a higher order problem with peer review only being a symptom at best (if at all). For example, Sabine talks about the issues and focuses on grants and spends 0 time on modern peer review.
2. Fraud didn't come into being because of peer review but grows with funding. The more funding the bigger the problem. Conversely the less funding the more likely proportionally the research is fraudulent or of poor quality because there's a smaller community checking the work. We know that the more funding we spend, the more research activity a field experiences. We don't have good ways to sift out fraud proactively - it takes disproportionately more work to root out fraud and bad science than it is to publish that & reap the rewards. This is true beyond academia - it's easier to spew BS than it is to explain the truth.
3. Not registering for null results has nothing to do with peer review. It's more just "hey I did this work and I'm not going to get rewarded so I'm not going to bother spending the work to publish the null result". That exists independent of the modern peer review system & even publish/perish is ancillary to this - a null result amounts to "failure" emotionally and that can be hard to deal with. That's why there's systems now to mandate pre-registration of the experiment - so that meta analysis can determine whether or not a result has actually been replicated enough to reduce the risk of p-hacking.
4. The replication crises for particle physics is a stark example how peer review is not really contributing as much. There's two schools of thought. The first is that we just follow the math and use data to refine which mathematical model to pick. The second is that we need to come up with better philosophical underpinnings for what the math is telling us. For now the first school is winning in terms of funding dollars (& results), but it's really hard to determine a priori which path is actually the one we should be following. Moreover, the orthodoxy exists independent of the peer review system (& even independent of grant proposals).
Well, then I don't know what to tell you. My point is that the contemporary peer review process is still operating under constraints that date back to the pre-internet age, and so that process could probably stand to be improved, and using the web might be part of that, and so the presence of a discord link as a citation is not necessarily something to lament. It might be part of the solution rather than the problem.
I'm pretty sure your parent meant that the concept of tenure precedes the concept of peer review. However, this too seems to be false, according to the repository of truth, Wikipedia, which says that:
(https://en.wikipedia.org/wiki/Scholarly_peer_review) but:
(https://en.wikipedia.org/wiki/Academic_tenure).
Even if that were true, what does the historical development of these institutions have to do with the claim that contemporary peer review is responsible for the contemporary replication crisis?
Yeah I obviously am taking about the peer reviewed journal not peer review as a concept (which is how this discussion started). ~~But it does look like tenure is after journals not before.~~ Correction: peer review as we know of began in the mid 1970s, so tenure precedes the modern peer review system.
Because Discord is a proprietary glorified IRC SaaS; its contents are, by nature, ephemeral and under control of the vendor. I'd expect such links to rot very quickly.
Collaborating on Discord is fine. Important results, including citations backing them, should really be published or at least replicated in more durable medium that's less susceptible to link rot, and easier to archive. Today, that's PDFs in paper repositories, or even regular blog posts.
I don't see any reason why the original publication venue has to end up being the canonical reference. That's not even true for traditional peer-reviewed papers. Have you ever seen an original physical copy of, say, Einstein's annus mirabilis papers? I haven't. I suspect that these are extremely rare and valuable collector's items, and only trained archivists are even allowed to handle them.
The right way to reference scientific publications is by URN [1], not URL. That makes the location irrelevant, as it should be.
[1] https://en.wikipedia.org/wiki/Uniform_Resource_Name
I'm not sure that they're collector's items, but they're probably not in that many university libraries. For example, the University of Michigan library has a physical copy in its special collection, but my university's considerably smaller library does not. But that's just because of age: this is a 119-year-old paper; were it a little younger, say, about 70 years, it would be in my university's holdings. I think that's a considerably different order of magnitude of its lifetime from a Discord link that I'd be absolutely astounded to see last a decade, and that in practice will probably last much less time than that.
What difference does it make if the original lasts a year or a century? The original is irrelevant except for historical purposes. What matters from a scientific point of view is that the results withstand scrutiny, and that they are reliably replicated and accessible.
We discover new approaches to replication all the time. Have you never come across foundational papers and arguments that everyone loved at the time, but made big methodological mistakes that led to the wrong conclusion? Or worse, found a source everyone references based on yet other secondary sources, only to look at the original context to discover that everyone's been misquoting it for decades? That happens regularly.
Sure, but I don't see what that has to do with the choice of publication venue. All of these things happen in traditional peer-review publications too.
In this case, Ligocki's presentation of the proofs in the blog post is really more rigorous than anything that went on in the Discord server. There's not some golden Truth in there that's being imperfectly mediated; it's just about the attribution. You might have more of a point for results originating from programmatic searches, but that's why the programs are all published outside of Discord, so their output can be replicated.
I was a grad student at the institute for theoretical physics in Heidelberg. It's famously housed in two old villas with little room for books, so the walls in almost all rooms are lined with shelves. In the office I shared with five other students, there was one shelf that was locked. The only one in the building. In it was one book from 1905 that had a different color than all the others.
That was the physical copy of the papers you mean. They had issues with theft, so they had to have it rebound and then locked up. It probably wasn't even the original, though.
MySpace: 16 years (2003–2019)
Friendster: 12 years (2002–2013)
Google+: 8 years (2011–2019)
Vine: 4 years (2013–2017)
Orkut: 10 years (2004–2014)
Path: 8 years (2010–2018)
Yik Yak: 4 years (2013–2017)
Meerkat: 2 years (2015–2017)
Windows Live Messenger (MSN Messenger): 15 years (1999–2014)
AIM (AOL Instant Messenger): 20 years (1997–2017)
ICQ: Ongoing (since 1996) but significantly declined after the early 2000s
Yahoo Messenger: 20 years (1998–2018)
Bebo: 14 years (2005–2019, relaunched in 2021)
Google Wave: 2 years (2009–2011)
Ping (Apple): 2 years (2010–2012)
Discord: 8 years (2015–)
so clearly discord is inherently different and here to stay forever! /s
feels like time is a circle sometimes ha
Even with services that lived over a decade, it's not clear whether messages were accessible for all that time. E.g. Google Talk/Meet/Whatever seemingly lost all messages before ~2013. Links to Facebook posts tend to die quickly, as both users and Meta itself seem to constantly play with privacy features. Etc.
I really like your point (I'm one of the bbchallenge maintainers). I think that Discord is close to optimal for us in the short term, but bad in the mid/long term for the reasons you and others have mentioned.
ICQ 1996- June 26th, 2024.
It’s shuttering.
I understand the complaint here, but a lot of recent impressive progress in mathematics has been by rapid collaboration + iteration (e.g. the project to improve Zhang's prime gap bound), and it may be the case that different communication tools are not easily fungible with Discord in this regard. You have to go where the people actually are.
But Discord isn't citable. Somebody needs to archive the Discord and make that available through the proper channels (e.g. a website, a book, the Internet Archive).
that post was just a (work-in-progress) update. Shawn's blog post is a proper announcement - much better than I would have written :)
Well it was cited so by definition it's citable.
I don't like discord but... The people doing research have chosen this method of collaboration. I like their research. Let's not be choosing beggars and tell them how to conduct their research.
There are a ton of "personal communication" cites out there that are dead ends. The point of the cite isn't to provide a handy link, though it's nice if it is one, but to hand the credit where the credit is due.
But papers aren't citable. Somebody needs to archive the paper and make it available through the proper channels (e.g. a website, library, journal, the Internet Archive).
Finding bigger busy beaver numbers is not exactly foundational. More like recreational. If it were foundational it would be peer reviewed in a journal article, not posted on a blog.
What I think you are doing here is to DEFINE "foundational work" as something that gets published in a journal.
I don't mind if you use that definition, but if you do then the fact that all foundational work is published in journals is not insightful or informative, it is merely the definition.
If, on the other hand, you intended for the statement to say something meaningful about how important scientific and mathematical work is communicated, then you need a different definition of "foundational". And you would have to look to that definition to decide whether this work was foundational, because if it was then it disproves your hypothesis: it would be a counter example that illustrates that some foundational work is shared in places other than journals.
In typical mathematical style, a good definition comes from the properties you observe and desire to hold true in your axiomatic system.
To me, figuring out the halting behavior of small Turing machines is similar in spirit to going over all short logical propositions and trying to determine whether they are true or not.
Like it sounds like it could end up being useful somehow.
As a member of these chats: it's often like hitting on an idea on a break-room blackboard and working it out, except the interaction can be cited. That's a positive change, if we can follow through and add to the literature in due time. Here's hoping.
That's fine, but the citation shouldn't be in the form of a Discord link, or at least not exclusively in that form. Make a copy of the relevant messages, host them elsewhere, and include that copy in the citation. Discord has been pretty good so far but being a durable archive has never been their core selling point so I don't trust that to continue indefinitely.
I thought that was great, too. They should also stream themselves solving this stuff on Twitch.
Better than math proofs on 4chan[1], huh?
[1] https://en.wikipedia.org/wiki/Superpermutation#Lower_bounds,...
FWIW, it's a public Discord server, you can find the invite link at the top-right of https://bbchallenge.org.
Also, I'd consider these more as attributions than citations. All the major arguments supporting the results have been replicated in the blog post (in a more rigorous form), so it can stand on its own: the Discord links just provide historical context for those interested.