How does one do proper concurrency in Python? As in separate "processes" running with encapsulation, fault tolerance, etc. For example, how does one bundle up a TCP client in a "process" such that the process handles failed connections, broken connections, retrieving data at periodic intervals (relatively fast), receiving "messages" from other "processes", etc.?
There's Ray, Pyro, Pykka, Celery, multiprocessing, asyncio, threads, Qt, and more, but all of them have issues. And a lot of it boils down to the GIL, although "processes" doing I/O such as TCP and other network communication should release the GIL while blocked and so reduce its effect, to my understanding (is that right?).
From what I can tell, it's basically one of the worst language choices for systems of this nature, but I am trying to figure out how to do it because I need to.
Can you give an example of what you consider a good solution in a different language? That will help us see exactly what you're looking for and how/whether it might be achieved in Python.
Erlang and Elixir. :)
Well RabbitMQ is built in Erlang so there you go :)
But RabbitMQ is just a messaging bus, isn't it? That doesn't handle the processes (operating system processes or virtual processes) actually processing the messages and those processes handling fault-tolerance for the things they're connected to.
That is correct. Honestly I’m not sure this is a good use case for Python but it’s definitely possible. I’ve used the RabbitMQ + gevent + multiprocessing pattern in the past and it works but I find the code extremely hard to reason about. If I were doing it again from scratch I’d probably choose another language with better concurrency primitives.
It definitely isn't, but this is a new role with a rushed timeline in a place dominated by Python.
This is also a major concern of mine.
Rather than handling the multiprocessing and message passing yourself, it would be much easier to use Celery + gevent and let Celery do the work of spinning up worker processes to execute the tasks.
It may feel limiting, but my advice would be to keep all the Celery job-queue stuff isolated from your server, especially if you are using an async web framework. Have your web server just put all the jobs in the Celery queue and let it handle executing them, regardless of whether they're CPU- or I/O-bound. If you try to optimize too much by, say, leaning on Celery for CPU-bound tasks but letting your web server handle the I/O-bound ones, you're going to be in for a world of hurt when it comes to both debugging and enforcing the order of execution. Celery has its warts, but you'll at least know where in the system your problem is and have reasonably good control over the pipeline.
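For illustration, a minimal sketch of that split, assuming a placeholder broker URL and a hypothetical process_job task; the web app only ever calls process_job.delay(...) and a worker (started with something like `celery -A tasks worker -P gevent`) does the actual execution:

```python
# tasks.py -- a minimal Celery sketch; the broker URL and task body are
# placeholders, not a statement about any particular setup.
from celery import Celery

app = Celery("tasks", broker="amqp://guest@localhost//")

@app.task
def process_job(payload: dict) -> None:
    # All the CPU- or I/O-bound work lives here, in the worker,
    # never in the web server itself.
    ...

# In the web handler you only enqueue and return:
#   process_job.delay({"url": "https://example.com"})
```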
If you are I/O-bound then asyncio or good old-fashioned gevent will do great. If you are CPU-bound, then use multiprocessing. If you need to accept "jobs" from elsewhere, use RabbitMQ (with or without Celery). If you have a mixed CPU/I/O workload that fits the worker pattern, then you would do all three: at the top level you have a RabbitMQ consumer fetching jobs from a remote queue and putting them into a multiprocessing queue processed by roughly N = cpu_count processes, and each of those uses asyncio/gevent to do its work.
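A rough sketch of that layering, assuming a local RabbitMQ broker, a queue named "jobs", and pika as the client (all placeholders), with each worker process running its own asyncio loop:

```python
# Top level: blocking RabbitMQ consumer feeding a multiprocessing queue.
# Workers: one process per CPU, each running asyncio for its I/O.
import asyncio
import json
import multiprocessing as mp

import pika  # third-party RabbitMQ client


async def handle_job(job: dict) -> None:
    # Placeholder for the real I/O-bound work (TCP calls, HTTP requests, ...).
    await asyncio.sleep(0.1)
    print("processed", job)


def worker(job_queue: mp.Queue) -> None:
    async def run() -> None:
        loop = asyncio.get_running_loop()
        while True:
            # The blocking get() is pushed to a thread so the event loop stays free.
            job = await loop.run_in_executor(None, job_queue.get)
            if job is None:  # shutdown sentinel
                return
            await handle_job(job)

    asyncio.run(run())


def main() -> None:
    job_queue: mp.Queue = mp.Queue()
    procs = [mp.Process(target=worker, args=(job_queue,)) for _ in range(mp.cpu_count())]
    for p in procs:
        p.start()

    conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = conn.channel()
    channel.queue_declare(queue="jobs")

    def on_message(ch, method, properties, body):
        job_queue.put(json.loads(body))

    channel.basic_consume(queue="jobs", on_message_callback=on_message, auto_ack=True)
    channel.start_consuming()


if __name__ == "__main__":
    main()
```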
Why are some people extremely averse to RabbitMQ?
I saw this at one company with a RabbitMQ / Celery setup: every time a new software engineer came in, they complained about RabbitMQ and asked why the company would use it. The infrastructure had been running like this without hiccups for years. At one point the company let go of many experienced engineers, and then one developer found some issue with the code and blamed it on Rabbit, claiming it was locking the queue. There were no senior developers left to contest it, so he convinced a manager to swap it out for Redis. He took about two months to rewrite it. Surprise: the same issue existed on Redis. The Redis solution works fine, but it has its own limitations...
So do you recommend a single Python process running asyncio or multiprocessing or both? Or do people normally split these things amongst several Python processes?
My understanding is that multiprocessing creates multiple interpreters, but that you still run into some GIL issues if everything stays under the same Python process.
I am in general quite comfortable with the actor model, and I would ideally use Erlang/Elixir here, but I can't for various reasons.
Use asyncio. Bind a socket, set it non-blocking, then asyncio.run(on_new_connection(sock)).
Inside that coroutine, get the running loop and await loop.sock_accept(sock).
Then asyncio.create_task(on_connection_data(connection)) for each accepted connection.
The only gotcha is you need to keep a reference to that task so it doesn't get garbage collected.
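Putting those steps together, a minimal sketch (the port, buffer size, and handler names are placeholders, not a real API):

```python
import asyncio
import socket

background_tasks = set()  # keep strong refs so tasks aren't garbage collected


async def on_connection_data(conn: socket.socket) -> None:
    loop = asyncio.get_running_loop()
    try:
        # Read until the peer closes the connection.
        while data := await loop.sock_recv(conn, 4096):
            print(f"received {len(data)} bytes")
    finally:
        conn.close()


async def on_new_connection(sock: socket.socket) -> None:
    loop = asyncio.get_running_loop()
    while True:
        conn, _addr = await loop.sock_accept(sock)
        conn.setblocking(False)
        task = asyncio.create_task(on_connection_data(conn))
        background_tasks.add(task)
        task.add_done_callback(background_tasks.discard)


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(("0.0.0.0", 9000))
sock.listen()
sock.setblocking(False)
asyncio.run(on_new_connection(sock))
```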
Get garbage collected by what? Doesn’t Python use reference counting?
asyncio doesn't store strong references to tasks. It is your responsibility to keep the ref: https://docs.python.org/3/library/asyncio-task.html#asyncio....
Consider organizing the code using TaskGroup.
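For example, a tiny sketch with asyncio.TaskGroup (Python 3.11+), with worker as a placeholder coroutine; the group holds references to its tasks and awaits them on exit, so the garbage-collection gotcha above goes away:

```python
import asyncio


async def worker(name: str) -> None:
    await asyncio.sleep(1)
    print(f"{name} done")


async def main() -> None:
    # TaskGroup keeps references to its tasks and waits for all of them
    # before the async-with block exits.
    async with asyncio.TaskGroup() as tg:
        for i in range(3):
            tg.create_task(worker(f"worker-{i}"))


asyncio.run(main())
```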
Something something libuv?