I find defaultdict, OrderedDict, namedtuple among other data structures/classes in the collections module to be incredibly useful.
Another module that's packaged with the stdlib that's immensely useful is itertools. I especially find takewhile, cycle, and chain to be incredibly useful building blocks for list-related functions. I highly recommend a quick read.
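For a quick taste of those three, here is a small sketch. The sample data and variable names are made up for illustration.

```python
from itertools import chain, cycle, islice, takewhile

# chain: iterate over several iterables as if they were one.
merged = list(chain([1, 2, 3], (4, 5), range(6, 8)))

# takewhile: yield items until the predicate first fails, then stop.
small = list(takewhile(lambda n: n < 4, merged))

# cycle: repeat an iterable forever; islice bounds it so the result is finite.
colours = list(islice(cycle(["red", "green"]), 5))

print(merged)   # [1, 2, 3, 4, 5, 6, 7]
print(small)    # [1, 2, 3]
print(colours)  # ['red', 'green', 'red', 'green', 'red']
```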
EDIT: functools is also great! Fantastic module for higher-order functions on callable objects.
Why do you use OrderedDict now that regular dicts are ordered by default?
OrderedDicts have some convenience methods and features that ordinary dicts don't have.
Also, dicts can become unordered at any time in the future. Right now the OrderedDict implementation is a thin layer over dict, but there are no guarantees it’ll always be that.
Therein lies another reason why OrderedDicts are still useful, even in 3.12.
Not really. It was pointed out that since 3.7 the order preserving behaviour is part of the spec for dicts.
I guess for most purposes OrderedDicts are then obsolete. I believe they have some extra convenience methods, but I've only ever needed them to preserve order.
Makes you think what other parts of Python have become obsolete.
They can, but ordered dict can also become unordered in the future, should the steering committee decide.
But seriously: It’s no longer an implementation detail that dictionaries are ordered in Python. It’s a specification of how Python works.
I missed that in the 3.7 release notes.
Not true as of 3.7[0]
[0] https://docs.python.org/3.7/whatsnew/3.7.html
Oh well... This is what I get for not reading the release notes with lawyer eyes. Thanks for the correction.
Dicts are ordered to preserve argument order when functions are called with named arguments. So it would be a non-trivial breaking change to revert this now.
I would argue that OrderedDict is more likely to be deprecated than dict is to become unordered again, since there is now little value in keeping OrderedDict around (and the methods currently specific to OrderedDict could be added to dict).
Do you have any examples?
Check out the docs: https://docs.python.org/3/library/collections.html#collectio...
I work with different versions of Python3 (and 2 unfortunately) and some code is still in 3.6, hence I used OrderedDicts.
3.6 was the first version with the new ordered-by-default dicts, even though the ordering wasn't part of the spec until 3.7.
It worked as an accidental implementation detail in CPython from some other optimization, but it wasn't intentional at the time. Because it wasn't intentional and wasn't part of the spec, that code could be incompatible with other interpreters like pypy or jython.
See my comment and the linked email at https://github.com/ericvsmith/dataclasses?tab=readme-ov-file... for dataclasses and 3.6. I think it's still true.
The reason Guido didn't want 3.6 to guarantee dict ordering was to protect 3.5 projects from mysteriously failing when using code that implicitly relied on 3.6 behaviors (for example, cutting and pasting a snippet from StackOverflow).
He thought that one cycle of "no ordering assumptions" would give a smoother transition. All 3.6 implementations would have dict ordering, but it was safer to not have people rely on it right away.
pypy implemented naturally ordered dict before cpython did.
Jython never released a Python 3 version, so it is irrelevant; IronPython has yet to progress beyond 3.4, so it is also irrelevant.
As someone who just had to backport a fairly large script to support 3.6, I found myself surprised at how much had changed. Dataclasses? Nope. `__future__.annotations`? Nope. `namedtuple.defaults`? Nope.
It's also been frustrating with the lack of tooling support. I mean, I get it – it's hideously EOL'd – but I can't use Poetry, uv, pytest... at least it still has type hints.
Not even the VS Code extension works anymore.
It may be more explicit: OrderedDict has move_to_end() which may be useful e.g., for implementing lru_cache-like functionality (like deque.rotate but with arbitrary keys).
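A minimal sketch of that idea, assuming a hypothetical `LRUCache` class (the name and its methods are made up here, and it is not thread-safe):

```python
from collections import OrderedDict


class LRUCache:
    """Tiny least-recently-used cache (illustrative only)."""

    def __init__(self, maxsize=2):
        self.maxsize = maxsize
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        # Mark the key as most recently used by moving it to the end.
        self._data.move_to_end(key)
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.maxsize:
            # last=False pops from the front, i.e. the least recently used key.
            self._data.popitem(last=False)

    def keys(self):
        return list(self._data)


cache = LRUCache(maxsize=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" becomes the most recently used key
cache.put("c", 3)  # over capacity, so "b" is evicted
print(cache.keys())  # ['a', 'c']
```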
OTOH that’s a lot less useful now that functools.lru_cache exists: it’s more specialised so it’s lighter, more efficient, and thread-safe. So unless you have extended flexibility requirements around your LRU, OD loses a lot there.
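For comparison, the ready-made decorator in use. The `fib` function is just a stand-in for any expensive pure function:

```python
from functools import lru_cache


@lru_cache(maxsize=128)
def fib(n):
    """Naive recursive Fibonacci; the cache avoids recomputing subproblems."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


print(fib(30))           # 832040
print(fib.cache_info())  # hit/miss statistics kept by the decorator
```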
And if you’re using a FIFO cache, threading a regular dict through a separate fifo (whether linked list or deque) is more efficient in my experience of implementing both S3 and Sieve.
Two dictionaries with equal keys and values are considered equal in Python, even if the order of the entries differs. By contrast, two OrderedDict objects are only equal if their respective entries are equal and if their order does not differ.
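A short sketch of that difference, with made-up keys and values:

```python
from collections import OrderedDict

a = {"x": 1, "y": 2}
b = {"y": 2, "x": 1}

# Plain dicts ignore insertion order when comparing.
print(a == b)  # True

oa = OrderedDict(a)
ob = OrderedDict(b)

# Two OrderedDicts must also agree on order.
print(oa == ob)  # False

# Comparing an OrderedDict with a plain dict is order-insensitive again.
print(oa == b)  # True
```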
I mostly migrated to frozen dataclasses from namedtuples when dataclasses became available. I’m curious about your preference for the namedtuple. Is it the lighter weight, the strong immutability, the easy destructuring? Or is it that most tuples might as well be namedtuples? Those are the advantages I can think of anyway :)
The main thing I find myself using them for is `_make()`. From the canonical [0] example:
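A sketch adapted from that docs example, using an in-memory SQLite table with made-up rows so it runs on its own (the table contents are assumptions, not part of the docs):

```python
import sqlite3
from collections import namedtuple

EmployeeRecord = namedtuple(
    "EmployeeRecord", "name, age, title, department, paygrade"
)

# Hypothetical data standing in for a real company database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees "
    "(name TEXT, age INTEGER, title TEXT, department TEXT, paygrade INTEGER)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?, ?)",
    [("Ada", 36, "Engineer", "R&D", 5), ("Grace", 45, "Analyst", "Ops", 7)],
)

cursor = conn.execute(
    "SELECT name, age, title, department, paygrade FROM employees ORDER BY name"
)
# _make() builds a record from any iterable, here each row tuple.
employees = list(map(EmployeeRecord._make, cursor.fetchall()))
conn.close()

for emp in employees:
    print(emp.name, emp.title)

# _asdict() gives the same record as a dict keyed by field name.
print(employees[0]._asdict())
```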
You could of course accomplish the same with a dictionary comprehension, but I find this to be less noisy. Also, they have `_asdict()` should you want to have the contents as a dict.
[0]: https://docs.python.org/3/library/collections.html#collectio...
I'm not quite sure how the fetchall() return type looks, but couldn't you just
Of course you have to come up with different variable names, but it still seems more elegant to just unpack.
Honestly, the "proper" way should be passing something like this
to the fetchall(), to automatically keep the names in sync with those in the SQL query string.
Implemented as `conn.row_factory = sqlite3.Row`.
By default, `List[Tuple]`. You could do a list comp over the fetchall(), but at that point there’s already some magic happening, so why not make it explicit?
You don't need `_make()` with dataclasses, and you get `asdict()` as a stand-alone function so it doesn't clash with each class's namespace. Here's what your code might look like with them:
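A sketch of how that could look, assuming the same hypothetical `employees` table as in the docs example. This is an illustration with made-up rows, not the original snippet:

```python
import sqlite3
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class EmployeeRecord:
    name: str
    age: int
    title: str
    department: str
    paygrade: int


# Hypothetical data standing in for a real company database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees "
    "(name TEXT, age INTEGER, title TEXT, department TEXT, paygrade INTEGER)"
)
conn.execute(
    "INSERT INTO employees VALUES (?, ?, ?, ?, ?)",
    ("Ada", 36, "Engineer", "R&D", 5),
)

cursor = conn.execute(
    "SELECT name, age, title, department, paygrade FROM employees"
)
# Star-unpacking each row tuple plays the role of namedtuple's _make().
employees = [EmployeeRecord(*row) for row in cursor.fetchall()]
conn.close()

for emp in employees:
    print(emp.name, emp.title)

# asdict() is a module-level function rather than a method on the class.
print(asdict(employees[0]))
```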
For that, you might as well set `conn.row_factory = sqlite3.Row`.
That would be a better option, as long as the goal isn’t to demonstrate how namedtuples or dataclasses work.
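For reference, a minimal sketch of the `sqlite3.Row` approach. The table and its single row are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Rows now support access by column name as well as by index.
conn.row_factory = sqlite3.Row

conn.execute("CREATE TABLE employees (name TEXT, title TEXT)")
conn.execute("INSERT INTO employees VALUES (?, ?)", ("Ada", "Engineer"))

row = conn.execute("SELECT name, title FROM employees").fetchone()
conn.close()

print(row["name"], row["title"])  # names come straight from the query
print(row[0])                     # positional access still works
print(row.keys())                 # column names, in SELECT order
print(dict(row))                  # convert to a plain dict if needed
```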
Dictionary comprehensions can be very elegant. List and dictionary comprehensions are very powerful and expressive abstractions. In fact, while it's not good practice, you can write pretty much all Python code inside comprehensions, including code that mutates state.
This is valid (as in, it will run) but highly unidiomatic code:
quicksort = lambda arr: [pivot:=arr[0], left:= [x for x in arr[1:] if x < pivot], right := [x for x in arr[1:] if x >= pivot], quicksort(left) + [pivot] + quicksort(right)][-1] if len(arr) > 1 else arr
print(quicksort([1, 33, -4, -2, 110, 5, 88]))
Sometimes mutations in comprehensions are very expressive.
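One possible illustration (an assumed example, not necessarily the snippet the commenter had in mind) is a `scan` that rebinds an accumulator with an assignment expression inside the comprehension. It needs Python 3.8 or later:

```python
from operator import add, mul


def scan(func, iterable, initial):
    """Return the running results of folding func over iterable."""
    acc = initial
    # The walrus rebinds acc in the enclosing function on every step,
    # and each new value becomes an element of the result list.
    return [acc := func(acc, x) for x in iterable]


running_sums = scan(add, [1, 2, 3, 4], 0)
running_products = scan(mul, [1, 2, 3, 4], 1)

print(running_sums)      # [1, 3, 6, 10]
print(running_products)  # [1, 2, 6, 24]
```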
There are lots of other short ways to write `scan`, but I don't think any of them map so clearly to a naive definition of what it's supposed to do.
That's incredibly clever; generators are underrated. I once challenged my friend to do LeetCode problems with only expressions. Here's Levenshtein distance, though it's incredibly clunky.
Holy shit that’s really clever. Didn’t know about _make, thank you!
I didn’t either until I read docs tbf. It’s just kind of thrown in as an afterthought for the section, too.
Any reason not to consider pydantic as the next step?
Every time I’ve used Pydantic I’ve found it to be a tonne of friction. The developer ergonomics just don’t seem right.
These days I use attrs and cattrs, and I’m much happier. Everything feels a lot more straightforward.
attrs is what Python’s dataclasses were based on, but they kept on improving it, so attrs just feels like standard Python with a little bit extra.
If I find myself writing `a[0]`, `a[1]`, `a[2]` in more than one place, I upgrade it to a namedtuple. Much better readability, and it can be defined inline like `MyTuple = namedtuple('MyTuple', 'k1 k2 k3')`.
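A small sketch of that upgrade, with made-up values:

```python
from collections import namedtuple

MyTuple = namedtuple("MyTuple", "k1 k2 k3")

t = MyTuple(1, 2, 3)

# Named access reads better, and index access keeps working.
print(t.k1, t[0])  # 1 1

# It is still a tuple, so unpacking works as before.
k1, k2, k3 = t

# Instances are immutable; _replace() returns a modified copy.
updated = t._replace(k3=30)
print(updated)  # MyTuple(k1=1, k2=2, k3=30)
```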
ChainMap might be the most underrated bit in the standard library.
For anyone wanting some more explanation, ChainMap can be used to build nested namespaces from a series of dicts without having to explicitly merge the names in each level. Updates to the whole ChainMap go into the top-level dict.
The docs are here [0].
Some simple motivating applications:
- Look up names in Python locals before globals before built-in functions: `pylookup = ChainMap(locals(), globals(), vars(builtins))`
- Get config variables from various sources in priority order: `var_map = ChainMap(command_line_args, os.environ, defaults)`
- Simulate layered filesystems
- etc
[0] https://docs.python.org/3/library/collections.html#collectio...
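A runnable sketch of the first two applications above. The config dicts are made-up stand-ins (a real program would use parsed arguments and `os.environ`):

```python
import builtins
from collections import ChainMap

# Hypothetical config sources, highest priority first.
command_line_args = {"colour": "blue"}
environ = {"user": "alice"}  # stand-in for os.environ
defaults = {"colour": "red", "user": "guest"}

var_map = ChainMap(command_line_args, environ, defaults)
print(var_map["colour"])  # blue (command line beats defaults)
print(var_map["user"])    # alice (environment beats defaults)

# Writes only touch the first mapping; the lower layers stay untouched.
var_map["debug"] = True
print("debug" in command_line_args)  # True
print("debug" in defaults)           # False

# Name lookup in the same order Python uses: locals, globals, builtins.
pylookup = ChainMap(locals(), globals(), vars(builtins))
print(pylookup["len"] is len)  # True
```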
I found it perfect for structured logging, where you might want to modify some details of the logged structures (e.g. a password) without changing the underlying data.
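A minimal sketch of that masking trick. The record fields are hypothetical:

```python
from collections import ChainMap

record = {"user": "alice", "password": "hunter2", "action": "login"}

# The overlay comes first, so its keys shadow those in the original record.
safe_view = ChainMap({"password": "***"}, record)

print(safe_view["password"])  # ***
print(safe_view["user"])      # alice
print(record["password"])     # hunter2 (the underlying data is unchanged)
```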
I just wish python had some better ergonomics/syntactic sugar working with itertools and friends. Grouping and mapping and filtering and stuff quickly become so unwieldy without proper lambdas etc, especially as the typing is quite bad so after a few steps you're not sure what you even have.
Just as recently as today I went to Kotlin to process something semi-complex even though we're a Python shop, just because I wanted to bash my head in after a few attempts in Python. A DS could probably solve it in minutes with pandas or something, but again that's stringly typed and involves lots of guesswork.
(It was actually a friendly algorithmic competition at work, I won, and even found a bug in the organizer's code that went undetected exactly because of this)
I find converting things from map objects or filter objects back to lists to be a bit clunky. Not to mention that chaining operations makes it even clunkier. Some syntactic sugar would go a long way.
Also more_itertools! Even less known than itertools, but equally useful.
I use defaultdict a lot - for accumulators when you're not sure about what is coming. Here's a simplified example:
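A sketch of such an accumulator. The row layout `(star, instrument, date)` and the sample values are assumptions made for illustration:

```python
from collections import defaultdict

# Hypothetical stream of observation rows: (star, instrument, date).
rows = [
    ("Vega", "HIRES", "2024-01-01"),
    ("Vega", "HIRES", "2024-01-02"),
    ("Vega", "ESPRESSO", "2024-01-02"),
    ("Sirius", "HIRES", "2024-01-01"),
    ("Vega", "HIRES", "2024-01-01"),  # duplicate, absorbed by the set
]

# Missing stars and instruments are created on first access.
observations = defaultdict(lambda: defaultdict(set))
for star, instrument, date in rows:
    observations[star][instrument].add(date)

for star, by_instrument in observations.items():
    for instrument, dates in by_instrument.items():
        print(star, instrument, sorted(dates))
```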
This is a dict-of-dict-of-set that is accumulating from a stream of rows, and I don't know what stars and instruments will be present.
Another related tool is Counter (https://docs.python.org/3/library/collections.html#collectio...)
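For the simpler case of plain tallying, a small `Counter` sketch (the word list is made up):

```python
from collections import Counter

words = "the quick brown fox jumps over the lazy dog the end".split()

counts = Counter(words)

print(counts["the"])           # 3
print(counts["missing"])       # 0 (absent keys count as zero, no KeyError)
print(counts.most_common(1))   # [('the', 3)]

# Counters can be updated incrementally from further iterables.
counts.update(["fox", "fox"])
print(counts["fox"])           # 3
```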
My faves are the lru_cache, namedtuples, deques, chainmap, and all of the itertools.
Although it's not part of the standard library, toolz is wonderful for rounding out these modules.
https://toolz.readthedocs.io/en/latest/