I don't understand how this shows Turing completeness. The implementation of the rule 110 automaton seems to be limited by both width (not Turing complete because there is only a finite number of states at a given width) and iteration count (not Turing complete because it always terminates).
Can you write an implementation of rule 110 with arbitrary (i.e. unbounded) width and depth?
It's still ok if the implementation limits it rather than the concept. I mean, your computer has finite memory rather than infinite tape, so it doesn't meet that requirement either regardless of language/method.
I am not talking about limitations of find or mkdir like other commenters are.
I can write a Python program that simulates rule 110 with unbounded state width and unbounded iteration depth. I might not be able to execute it on any computer (it's going to run out of memory at some point), but I can still reason about it and its behavior with a (theoretical) infinite memory. After reading the blog post, I am not convinced I can write such a program using `find` and `mkdir`, since the provided example uses explicit limits for WIDTH and ITER in the program itself.
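For concreteness, here is a minimal sketch of what such a Python program could look like (the function names and the set-of-live-cells representation are my own choices): the tape is the set of integer positions holding a 1, so neither the width nor the number of iterations is bounded by a constant written into the program, only by available memory.

```python
# Sketch: rule 110 with no WIDTH or ITER constants in the program.
# The tape is represented sparsely as the set of positions holding a 1;
# Python sets and ints grow without any program-level bound (only the
# machine's physical memory limits them).

def rule110_step(cells):
    """One rule 110 update; `cells` is the set of positions holding a 1."""
    if not cells:
        return set()  # pattern 000 maps to 0, so an empty tape stays empty
    new = set()
    # Only cells adjacent to a live cell can become 1, so scanning a
    # finite window is enough even though the tape is conceptually infinite.
    for pos in range(min(cells) - 1, max(cells) + 2):
        pattern = (pos - 1 in cells, pos in cells, pos + 1 in cells)
        # Rule 110: the new cell is 1 unless the neighborhood is 111, 100, or 000.
        if pattern not in {(True, True, True), (True, False, False),
                           (False, False, False)}:
            new.add(pos)
    return new

def rule110(initial, steps=None):
    """Yield successive tapes; with steps=None the loop has no built-in bound."""
    cells, n = set(initial), 0
    while steps is None or n < steps:
        yield cells
        cells = rule110_step(cells)
        n += 1
```

Starting from a single 1, the live region grows one cell to the left per step, the familiar rule 110 edge; nothing in the code caps how wide it gets or how long it runs.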
The same argument would make C non-Turing-complete, because the size of pointers is a compile-time constant, and because everything needs to have an address, that puts a large but hard limit on tape length.
There are ways to argue around that, e.g. C might be able to interface with an infinite tape file via the standard library, and maybe strict aliasing and pointer provenance let you create a system where bit-identical pointers can be different. But the mental model most people have of C wouldn't be Turing complete.
While strictly speaking true, I don't think it is the same argument at all. You are talking about a restriction of the runtime (much like mkdir's argument length or the maximum filesystem depth), even though it leaks into the standard, because the standard's authors care about physical hardware, not theoretical machines.
The WIDTH and ITER limits being actual constants that are part of the program makes all the difference compared to C's pointer limitations, which are part of the execution environment.
But C is not Turing-complete, because its numbers (and pointers) are required to have an arbitrary limit which itself must be representable and usable in C. Python, on the other hand, has no such requirement, and you could imagine an implementation that could grow indefinitely (well, until it ran out of the physical universe). C is prohibited from doing that; there is always a theoretical limit present. You can set that limit hella high, but it's still there.
"Python" doesn't exist though. It's silly to claim that Python is more powerful just because it doesn't tell you what it's going to do (run on some limited hardware) and C lets you choose. Reality is fundamentally, necessarily different from theory. Everything real is a mere approximation of theory, and vice versa. Even Turing's machine is limited by all the paper in the universe, unless you ignore reality (which is fine!). It's false precision to say that C in reality with bounded pointers is different from C with unbounded pointers, but Python in reality with secretly bounded pointers is the same as Python with unbounded pointers.
No, it's not false precision. C requires numbers (and memory size) to have an upper limit, which it is obliged to tell you. Python doesn't require such limitations to exist. They will, of course, exist in practice, since it's impossible to build a Turing machine, only a sufficiently huge finite-state machine, but that is a practical consideration. In practice, all language implementations are FSMs. But C is an FSM even in theory, unlike Python.
You would have better luck trying to argue that there is a reading of the standard that allows for unboundedly huge file streams, and that fread()/fwrite()/fseek() could then be used to faithfully implement a Turing machine.
Write a Python interpreter in C and it's clear to see why your logic fails. You've reaped your claimed benefits of Python while remaining in C.
Python-the-language can be Turing-complete even if Python-as-actually-implemented is not.
Then C-the-language can be Turing complete, even if C-as-actually-implemented is not. Just implement a Python interpreter. (Or you can also just implement bignums in C and use those for your computation.)
You can't implement a Python interpreter with access to infinite memory in C as specified. That is the point.
Why not? Whether you're running Python code in your C interpreter or just running C code, the same memory restrictions will apply based on your hardware. CPython doesn't place any tighter bound on bignums than a non-C-based implementation would.
EDIT: See the GMP library, which states "There is no practical limit to the precision except the ones implied by the available memory in the machine GMP runs on"[0]
https://gmplib.org/#WHAT
The C specification limits programs to addressing a finite amount of memory, though it can be made arbitrarily large by an implementation. The Python specifications do not imply this though real interpreters do.
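A small illustration of that asymmetry, just a sketch of the language-level difference rather than a formal argument: Python's `int` has no specified upper bound, so values can exceed any fixed-width C integer type while arithmetic stays exact.

```python
import sys

# Python-the-language specifies no ceiling for int: this value exceeds any
# fixed-width C integer type, yet arithmetic on it stays exact.
n = 2 ** 10_000                  # about 3000 decimal digits
assert n.bit_length() == 10_001
assert (n - 1) + 1 == n          # no overflow, no rounding

# sys.maxsize bounds practical things like container lengths in CPython,
# but int values themselves sail right past it.
assert n > sys.maxsize
```

Real interpreters will of course fail once physical memory runs out, but no constant in the language spec tells you where that point is.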
Yes, this is my entire point
Why should I care what the language specification states in a computability-theory discussion? There only needs to exist a method to accomplish our goal; whether the method conforms to the specification doesn't seem relevant to me.
Would it be fair to say, then, that "Python" is Turing complete, while the CPython/PyPy implementations are not, because they will always implicitly run up against C's memory limitations and therefore do have a hard limit? Python itself as a language is Turing complete because it does not place such limitations on the user like C does?
I don't think there's a real difference between Python and C. With Python you can't make your program use more than 4 GB of address space if you run it with a 32-bit interpreter. You have to swap the interpreter. In C the same goes for the compiler. And yes, you can inspect the upper limit by looking at the width of size_t, but it will be seen differently with 32 and 64-bit compilers although the program will be the same. And you _can_ make the program behave differently based on size_t's width, but you're not required to. It doesn't change that fundamentally Python is no more Turing-complete than C just because you can't do it in Python (that's my assumption, I don't know Python well enough actually).
Maybe it all boils down to how CPUs work, and maybe it's safe to say that the incompleteness comes from the CPU implementation? You can of course argue that Python interpreters are written in C/C++, but of course we can imagine they can be written in assembly.
Edit: after I read some other comments I think I see the point - that indeed the problem is the implementation (on CPU).
It's still not comparable to mkdir and find.
The arbitrary limit in C is not fixed by the code. So if you run out of space on a 32-bit machine with sizeof(size_t) == 4, you can run the same code on a 64-bit machine with sizeof(size_t) == 8. With mkdir and find, you have to change the code to do this.
You can translate any Turing Machine into a single C program, which will behave identically so long as you have enough memory. You cannot do this if you need to change the program when the amount of memory changes.
I'd argue that the process of taking a C program and compiling and running it on ever larger pointer sizes is Turing complete, but not a single iteration of this process.
A C program which doesn’t inspect its pointer representations can conceivably use unlimited memory. For pointer values whose representation can be statically determined to be uninspected by the program, the C implementation can use memory locations outside of the representable address space, and thus make unlimited memory accessible. Functionally there is no need for self-contained C programs to ever inspect their pointer representation.
The difference is very small though: you could say that WIDTH and ITER must be defined in the execution environment (shell) before execution and that the rest of the code is the program, and we are in the same situation as in C.
If you define WIDTH and ITER prior to execution you are just giving arguments to your program.
Maybe using the term "environment" was not the best choice; what I mean is that WIDTH and ITER are program variables that impact program behavior and output (they appear in regexes, etc.), whereas (most) C programs don't actually reference or depend on the pointer width (other than crashing if it's too small); it is an internal detail of the C compiler and underlying hardware that only happens to be visible to the programmer due to leaky abstractions. I don't think those are comparable.
I mean... isn't this just `sizeof` in C?
Honestly, I'm struggling to think of any real world code base I've worked with in C that didn't care about pointer size.
C is indeed not Turing-complete for more or less this reason.
Neither is the universe: we have (as far as we know) a limited number of matter and energy that can be converted to computation, which limits the size of an implementable system.
Real Turing completeness is necessarily theoretical.
Yes. C is not Turing-complete even in theory. Other languages are. It doesn't especially matter.
We don’t know that at all. It’s a sensible assumption that the universe is infinite in size, and we have no indication to the contrary. The biggest impediments are the accelerating expansion of the universe (which however isn’t fully explained and thus may not be inevitable) and the heat death, which limits time for meaningful causal interaction.
On the other hand, a C implementation running on a machine with a significantly larger address space would have appropriately larger pointers. The C standard does not specify any particular pointer bitwidth. With these things together, C as a language has a decent claim to Turing-completeness.
For any C program there is a number N, that depends on the program, compiler, architecture, etc., but does not depend on the program input, such that the program won't be able to access more than N bits of state at any moment in any of its possible executions. Hence, the program is equivalent to a finite state automaton.
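The pigeonhole argument behind this can be sketched in a few lines of Python (a toy model, with a hypothetical `step` function standing in for one execution step of the compiled program): if the whole state fits in n bits, there are at most 2**n distinct states, so a deterministic run must revisit a state within 2**n steps and cycle from then on.

```python
# Toy pigeonhole model: `step` stands in for one execution step of a
# program whose entire state fits in n_bits bits (a hypothetical stand-in,
# not real compiled C). With at most 2**n_bits distinct states, a
# deterministic run must revisit some state within 2**n_bits steps.

def first_repeat(step, initial_state, n_bits):
    """Iterate `step` until a state repeats; return (steps_taken, state)."""
    seen = set()
    state, t = initial_state, 0
    while state not in seen:
        seen.add(state)
        state = step(state)
        t += 1
        assert t <= 2 ** n_bits  # pigeonhole bound can never be exceeded
    return t, state

# Example: a 4-bit machine that counts modulo 13; it revisits state 0
# after 13 steps and is therefore periodic forever after.
steps, state = first_repeat(lambda s: (s + 1) % 13, 0, 4)
```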
Yes, but you still configure it (by choosing a compiler) to a fixed size before running; that is in my mind no different from specifying a fixed tape size, like in the find + mkdir example.
C is definitely not Turing complete. The standard library provides no escape, because file sizes are also limited (due to ftell(3)), and there is no chdir in the C standard library, so the total number of files is also limited. I have a recollection of an attempt to construct a possibly Turing-complete interpretation of the C standard involving recursion and va_arg, but I don't think it went anywhere.
Looks like you're treating Python as a spec; in this case I'd say we should treat mkdir+find as a spec too.
Programming language implementations often have some hard limits built in. E.g.: https://peps.python.org/pep-0611/
I'm treating mkdir+find as a spec. The values for WIDTH and ITER are in the program itself and will impact any implementation of mkdir and find you use, including theoretical ones.
Thanks for explaining. I missed that WIDTH and ITER are in the program.
Thank you for your comment on my article. I think I've managed to fix the proof by implementing a tag system. Would you (anyone) mind reviewing my code [1][2]? The key point is using back references, which I think gave us capabilities beyond regular expressions to achieve Turing completeness. However, I'm not very familiar with tag systems, and I'm worried I might have missed something.
[1] https://onecompiler.com/bash/42mux2442 [2] https://onecompiler.com/bash/42mux3nf8
I am not an expert in either tag systems or `find`, but this seems right. Using backreferences to handle copies sounds right (backreferences do add expressive power to regular expressions, but do not by themselves give Turing completeness, afaik).
I think your first example is missing the "any word of length < 2 is a halting word" condition but it is present in your second example.
Thank you! I've updated the article with a link to my comment.
I think it's not that simple. It's always confusing to talk about Turing machines and the requirement of infinite memory vs. the reality of finite memory. I think "Turing completeness" is not so obvious to define rigorously, and the way people use it maybe does not exactly capture the idea of "arbitrary computation" being possible. I'll try to clarify some things for myself and maybe others.
First of all, recall that a dynamical system is a set X with a map f: X -> X. The evolution of the system is given by the iterated application of f. A dynamical system is finite if the set X is finite.
I think it is useful to broaden this concept and define an IO-system as three sets X and I and O with a map f: I × X -> O × X. This means at every evolution step an "input" value i ∈ I has to be provided and an "output" value o ∈ O is obtained.
A Turing machine m consists of a finite alphabet A of symbols and a finite IO-system h: A × S -> O × S, where O = {move left, move right, print symbol a ∈ A}. This represents how the "head" of the Turing machine updates its internal state s ∈ S when reading a symbol from the alphabet A. We call this IO-system h the head of the Turing machine. You could specify the Turing machine with the data T = (A, S, O, h).
You now couple this Turing machine with another IO-system, which we call the "tape". It is either an infinite (N = ∞) tape or a finite, circular tape of length N. It has states X = {1, ..., N} × A × ... × A, where the product A × ... × A has length N. Its set of inputs is the set O and its set of outputs is A. Its operation is given by a function t: O × X -> A × X, which describes the intended reaction of the tape to the instructions from the head: depending on the instruction in O it either moves the "position counter" of the tape to the left, to the right, or prints a symbol onto the tape. After it has performed this, it reads the symbol at the current position and gives this output back to the head.
We can now combine the head h and the tape t into a "machine" dynamical system m: X × S × O -> X × S × O where m(x, s, o) = (t(o, x)_X, h(t(o, x)_A, s)_S, h(t(o, x)_A, s)_O). This represents the evolution of the Turing machine together with the tape. We call this the [machine dynamical system with memory N of the Turing machine T].
Definition 1. Let's say that [the dynamical system f: X -> X simulates another dynamical system g: Y -> Y] if there exists an injective map u: Y -> X such that u(g(y)) = f(u(y)). In order to compute the evolution g^n(y) = g(g(...(g(y))...)) we can instead compute f^n(u(y)) and use injectivity of u to get back a result in Y.
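Here is a toy instance of Definition 1 in Python (my own example): a mod-2 counter g is simulated by a system f on a four-element set via the injective embedding u(y) = 2y.

```python
# Toy instance of Definition 1: f on X = {0, 1, 2, 3} simulates
# g on Y = {0, 1} via the injective map u, since u(g(y)) = f(u(y)).
g = lambda y: (y + 1) % 2   # the simulated system (a mod-2 counter)
f = lambda x: (x + 2) % 4   # the simulating system
u = lambda y: 2 * y         # injective embedding Y -> X

# The conjugacy condition holds pointwise on Y.
for y in (0, 1):
    assert u(g(y)) == f(u(y))

def g_via_f(y, n):
    """Compute g^n(y) by evolving inside X and mapping back through u."""
    x = u(y)
    for _ in range(n):
        x = f(x)
    return x // 2           # inverse of u on its image {0, 2}
```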
Lemma 2. Any finite dynamical system is simulated by the machine dynamical system of some Turing machine with tape length N = 1. proof: Just set the head of the Turing machine to be the desired dynamical system and trivialize all the other objects.
This is a triviality result and tells us that this is not a good attempt to investigate universality of Turing machines in a "finite memory" setting.
False Hypothesis 3. There exists a universal Turing machine U in the sense that this Turing machine has the property that its machine dynamical system with infinite memory simulates the machine dynamical system with infinite memory of any other Turing machine T.
As far as I know this hypothesis is false because the sense of simulation mentioned above is far too strong. At this point I think there are many definitions one can make so let's stick with the one of Alan Turing.
Definition 4. We say that [the dynamical system f: X -> X simulates another dynamical system g: Y -> Y with respect to the "result" functions R: X -> {null, 0, 1} and Q: Y -> {null, 0, 1}] if there exists an injective map u: Y -> X such that the sequences Q(g^n(y)) and R(f^n(u(y))) are "result equivalent", meaning they are equal if you delete all instances of "null".
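The "result equivalent" condition is easy to make concrete (a trivial sketch, using Python's None for "null"): delete every null from both result streams and compare what remains.

```python
# "Result equivalent" made concrete: delete every null (here: None)
# from both result streams, then compare.
def results(seq):
    return [r for r in seq if r is not None]

def result_equivalent(a, b):
    return results(a) == results(b)

assert result_equivalent([None, 0, None, 1], [0, None, 1, None])
assert not result_equivalent([0, 1], [1, 0])
```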
We now extend the concept of a Turing machine T by adding to it a result function r: O -> {null, 0, 1}.
Definition 5 (A. Turing, 1936). We say that [the Turing machine T with result function r: O -> {null, 0, 1} (N,M)-simulates another Turing machine T' with result function r': O' -> {null, 0, 1}] if the machine dynamical system of T with memory N simulates the machine dynamical system of T' with memory M, with respect to the result functions R: X × S × O -> {null, 0, 1} given by R(x, s, o) = r(o) and R': X' × S' × O' -> {null, 0, 1} given by R'(x, s, o) = r'(o).
Definition 6. We say that [a Turing machine U with result function r is (N,M)-universal] if it (N,M)-simulates any other Turing machine with result function.
Theorem 7 (A. Turing, 1936). There exists a (∞,∞)-universal Turing machine.
Definition 8. We say that [a Turing machine U with result function r is finite-weakly universal] if for any finite M there exists some finite N such that it (N,M) simulates any other Turing machine with result function.
Now it gets difficult because I don't actually know the answers anymore. I am pretty sure that any (∞,∞)-universal Turing machine is also finite-weakly universal. Even more so, it might be the case that finite-weak universality is equivalent to (∞,∞)-universality. Most certainly finite-weak universality is not a trivial concept and captures an interesting aspect of the concept of computation. I want to make the point that, in my opinion, infinite memory should not be seen as a requirement in order to talk about these concepts of computation like Turing machines and universality.
It is also unclear how exactly to define the "Turing completeness" of a system, as I don't think there exists a definition of Turing completeness for dynamical systems. You have to specify how you are allowed to put an input into the dynamical system at least. I think that in some sense one could use what OP found and prove a rigorous result that with `find` + `mkdir` one can somehow construct a finite-weakly universal Turing machine.
I interpreted `-maxdepth` as a safety device.
I don't know this field very well so I might be misunderstanding, but I think this is different than "infinite tape" in Turing Machines. As I understand it, the proof of universality for Rule 110 required that the program code which is called the "production rules" be repeated infinitely on the tape even for a finite size program. https://en.wikipedia.org/wiki/Rule_110#:~:text=An%20infinite...
If you had a halting problem oracle to tell you how much runtime is needed to run a certain program to completion, you could get away with having only a finite number of repetitions of the "production rules", and simply pretending that they're infinitely repeated. This would only work for programs that halt.
If I understand correctly, any program that loops forever, if implemented within Rule 110 Cyclic Tags, requires infinite repetition of the production rules. I think this is a difference of Rule 110 vs Turing Machine tape. If I understand correctly, a Turing Machine with finite, even quite small, tape can loop forever. But a Rule 110 program must have infinitely sized tape to be able to loop forever.
Basically (if I understand correctly), Rule 110 Cyclic Tags essentially "consume" tape symbols as basically a non-renewable resource, like an electrical computer server powered by the burning of coal. Infinite runtime (looping forever) requires infinite repetition of the tape symbols (both the "production rules" and the "clock pulses" - see the Wiki page above). I believe this is unlike Turing Machines, which can loop forever without "consuming" any non-renewable resource.
To clearly state this again: Running a simple "while(true)" loop in a Turing Machine only needs finite tape, but requires infinite tape in Rule 110.
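A bounded simulation makes the Turing-machine half of that claim concrete (a sketch with a made-up two-state machine, not anything from the Rule 110 construction): the machine below runs forever, bouncing between two cells, so the set of tape cells it ever touches stays finite no matter how long you run it.

```python
# A made-up two-state Turing machine that loops forever on a blank tape,
# visiting only cells 0 and 1. Running it for any number of steps never
# enlarges the set of visited cells: "while(true)" on finite tape.

def run_tm(steps):
    # transition table: (state, symbol) -> (write, move, next_state)
    delta = {
        ("A", 0): (0, +1, "B"),  # step right...
        ("B", 0): (0, -1, "A"),  # ...then step back left, forever
    }
    tape, head, state = {}, 0, "A"
    visited = set()
    for _ in range(steps):
        visited.add(head)
        symbol = tape.get(head, 0)           # blank cells read as 0
        write, move, state = delta[(state, symbol)]
        tape[head] = write
        head += move
    return visited
```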
To be fair I've never been 100% sold on the Turing-completeness of rule 110, but your argument isn't landing with me either.
"Allowed" is probably covering too wide a meaning in your description. Just because something is capable of defining infinite consumption does not mean the proof allowed it to do so.
The initial state of a Turing Machine tape is that beyond the finite-length input, all of the rest of the infinite tape is blank.
https://en.wikipedia.org/wiki/Turing_machine#:~:text=Cells%2...
Whereas the Rule 110 Cyclic Tags engine requires the infinite tape to contain infinite repetitions of structured patterns, even in order to simply run "while(true)". That's a key difference.
I also agree with zamadatix's sibling comment.
C is maybe technically not Turing complete either: https://cs.stackexchange.com/questions/60965/is-c-actually-t...
According to the second answer, C99 is Turing complete.
The author has since updated their post: