HN comments for: Lesser known parts of Python standard library

judicious

50 replies

3d15h

2024-09-05 02:56:10 UTC

I find defaultdict, OrderedDict, namedtuple among other data structures/classes in the collections module to be incredibly useful.

Another module that's packaged with the stdlib that's immensely useful is itertools. I especially find takewhile, cycle, and chain to be incredibly useful building blocks for list-related functions. I highly recommend a quick read.

EDIT: functools is also great! Fantastic module for higher-order functions on callable objects.

https://docs.python.org/3/library/itertools.html
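
As a rough illustration of the three itertools functions named above, here is a minimal sketch. The sample data and variable names are made up for the example.

```python
from itertools import chain, cycle, islice, takewhile

# chain: iterate over several iterables as if they were one.
merged = list(chain([1, 2], (3, 4), range(5, 7)))

# takewhile: yield items until the predicate first fails, then stop.
leading_small = list(takewhile(lambda n: n < 4, merged))

# cycle: repeat an iterable forever; islice bounds the infinite stream.
round_robin = list(islice(cycle("abc"), 7))

print(merged)         # [1, 2, 3, 4, 5, 6]
print(leading_small)  # [1, 2, 3]
print(round_robin)    # ['a', 'b', 'c', 'a', 'b', 'c', 'a']
```

All three return lazy iterators, so nothing is computed until something (here `list`) consumes them.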

padthai

23 replies

3d15h

2024-09-05 03:25:58 UTC

Why do you use OrderedDict now that regular dicts are ordered by default?

heavyset_go

11 replies

3d14h

2024-09-05 03:45:30 UTC

OrderedDicts have some convenience methods and features that ordinary dicts don't have.

rbanffy

8 replies

3d14h

2024-09-05 04:00:12 UTC

Also, dicts can become unordered at any time in the future. Right now the OrderedDict implementation is a thin layer over dict, but there are no guarantees it’ll always be that.

judicious

2 replies

3d14h

2024-09-05 04:10:28 UTC

Therein lies another reason why OrderedDicts are still useful even in 3.12

rbanffy

1 replies

3d1h

2024-09-05 17:19:16 UTC

Not really. It was pointed out that since 3.7 the order preserving behaviour is part of the spec for dicts.

judicious

0 replies

2024-09-05 18:24:25 UTC

I guess OrderedDicts are then obsolete for most purposes. I believe they have some extra convenience methods, but I've only ever needed them to preserve order.

Makes you wonder what other parts of Python have become obsolete.

wodenokoto

1 replies

3d14h

2024-09-05 04:09:54 UTC

They can, but ordered dict can also become unordered in the future, should the steering committee decide.

But seriously: It’s no longer an implementation detail that dictionaries are ordered in Python. It’s a specification of how Python works.

rbanffy

0 replies

3d1h

2024-09-05 17:18:26 UTC

I missed that in the 3.7 release notes.

3eb7988a1663

1 replies

3d14h

2024-09-05 04:25:13 UTC

Not true as of 3.7[0]

  the insertion-order preservation nature of dict objects has been declared to be an official part of the Python language spec.

[0] https://docs.python.org/3.7/whatsnew/3.7.html

rbanffy

0 replies

3d1h

2024-09-05 17:17:51 UTC

Oh well... This is what I get for not looking at release notes with lawyer eyes. Thanks for the correction.

SuchAnonMuchWow

0 replies

3d10h

2024-09-05 08:22:27 UTC

dicts are ordered to keep argument order when using named arguments in function calls. So it would be a non-trivial breaking change to revert this now.

I would argue that OrderedDict has a better chance of being deprecated than dict has of becoming unordered again, since there is now little value in keeping OrderedDict around (and the methods currently specific to OrderedDict could be added to dict).

wodenokoto

1 replies

3d14h

2024-09-05 04:11:38 UTC

Do you have any examples?

heavyset_go

0 replies

3d12h

2024-09-05 05:42:13 UTC

Check out the docs: https://docs.python.org/3/library/collections.html#collectio...

judicious

7 replies

3d14h

2024-09-05 03:33:33 UTC

I work with different versions of Python3 (and 2 unfortunately) and some code is still in 3.6, hence I used OrderedDicts.

mixmastamyk

4 replies

3d13h

2024-09-05 04:31:46 UTC

3.6 was the first with the new ordered-by-default dicts, even though it wasn't specced until 3.7.

Izkata

3 replies

3d13h

2024-09-05 05:08:24 UTC

It worked as an accidental implementation detail in CPython from some other optimization, but it wasn't intentional at the time. Because it wasn't intentional and wasn't part of the spec, that code could be incompatible with other interpreters like pypy or jython.

ericvsmith

1 replies

3d12h

2024-09-05 05:50:16 UTC

See my comment and the linked email at https://github.com/ericvsmith/dataclasses?tab=readme-ov-file... for dataclasses and 3.6. I think it's still true.

raymondh

0 replies

3d11h

2024-09-05 07:28:50 UTC

The reason Guido didn't want 3.6 to guarantee dict ordering was to protect 3.5 projects from mysteriously failing when using code that implicitly relied on 3.6 behaviors (for example, cutting and pasting a snippet from StackOverflow).

He thought that one cycle of "no ordering assumptions" would give a smoother transition. All 3.6 implementations would have dict ordering, but it was safer to not have people rely on it right away.

masklinn

0 replies

3d9h

2024-09-05 09:22:45 UTC

pypy implemented naturally ordered dict before cpython did.

jython never released a P3 version so is irrelevant, ironpython has yet to progress beyond 3.4 so is also irrelevant.

sgarland

1 replies

3d4h

2024-09-05 13:35:14 UTC

As someone who just had to backport a fairly large script to support 3.6, I found myself surprised at how much had changed. Dataclasses? Nope. `__future__.annotations`? Nope. `namedtuple.defaults`? Nope.

It's also been frustrating with the lack of tooling support. I mean, I get it – it's hideously EOL'd – but I can't use Poetry, uv, pytest... at least it still has type hints.

neves

0 replies

3d4h

2024-09-05 13:53:35 UTC

Not even the VSCode extension works anymore

d0mine

1 replies

3d11h

2024-09-05 06:31:49 UTC

It may be more explicit: OrderedDict has move_to_end() which may be useful e.g., for implementing lru_cache-like functionality (like deque.rotate but with arbitrary keys).
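
A minimal sketch of that idea, assuming a tiny fixed-size cache. The `LRUCache` class and its method names are hypothetical, and the sketch is not thread-safe. It relies only on `OrderedDict.move_to_end()` and `OrderedDict.popitem(last=False)`.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny least-recently-used cache built on OrderedDict (illustrative only)."""

    def __init__(self, maxsize=2):
        self.maxsize = maxsize
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.maxsize:
            self._data.popitem(last=False)  # evict the least recently used entry


cache = LRUCache(maxsize=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" becomes the most recently used entry
cache.put("c", 3)  # evicts "b", the least recently used entry
print(list(cache._data))  # ['a', 'c']
```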

masklinn

0 replies

3d9h

2024-09-05 09:18:01 UTC

OTOH that’s a lot less useful now that functools.lru_cache exists: it’s more specialised so it’s lighter, more efficient, and thread-safe. So unless you have extended flexibility requirements around your LRU, OD loses a lot there.

And if you’re using a FIFO cache, threading a regular dict through a separate fifo (whether linked list or deque) is more efficient in my experience of implementing both S3 and Sieve.

Flimm

0 replies

3d6h

2024-09-05 12:05:26 UTC

Two dictionaries with equal keys and values are considered equal in Python, even if the order of the entries differs. By contrast, two OrderedDict objects are only equal if their respective entries are equal and if their order does not differ.
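
A short sketch of that difference, using made-up sample data:

```python
from collections import OrderedDict

a = {"x": 1, "y": 2}
b = {"y": 2, "x": 1}
print(a == b)  # True: plain dicts ignore insertion order when comparing

oa = OrderedDict(a)
ob = OrderedDict(b)
print(oa == ob)  # False: same entries, different order

# Comparing an OrderedDict with a plain dict falls back to the
# order-insensitive dict comparison.
print(oa == b)  # True
```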

sevensor

16 replies

3d5h

2024-09-05 12:55:54 UTC

I mostly migrated to frozen dataclasses from namedtuples when dataclasses became available. I’m curious about your preference for the namedtuple. Is it the lighter weight, the strong immutability, the easy destructuring? Or is it that most tuples might as well be namedtuples? Those are the advantages I can think of anyway :)

sgarland

12 replies

3d5h

2024-09-05 13:28:50 UTC

The main thing I find myself using them for is `_make()`. From the canonical [0] example:

    import sqlite3
    from collections import namedtuple

    EmployeeRecord = namedtuple('EmployeeRecord', 'name, age, title, department, paygrade')
    conn = sqlite3.connect('/companydata')
    cursor = conn.cursor()
    cursor.execute('SELECT name, age, title, department, paygrade FROM employees')
    for emp in map(EmployeeRecord._make, cursor.fetchall()):
        print(emp.name, emp.title)

You could of course accomplish the same with a dictionary comprehension, but I find this to be less noisy. Also, they have `_asdict()` should you want to have the contents as a dict.

[0]: https://docs.python.org/3/library/collections.html#collectio...

wodenokoto

3 replies

3d1h

2024-09-05 16:39:29 UTC

I'm not quite sure how the fetchall() return type looks, but couldn't you just

    for name, age, title in cursor.fetchall():
        print(name, age, title)

Of course you have to come up with different variable names, but it still seems more elegant to just unpack.

Joker_vD

1 replies

3d1h

2024-09-05 17:25:28 UTC

Honestly, the "proper" way should be passing something like this

    def namedtuple_factory(cursor, row):
        fields = [column[0] for column in cursor.description]
        cls = namedtuple("Row", fields)
        return cls._make(row)

to the fetchall(), to automatically keep the names in sync with those in the SQL query string.
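
For reference, in the sqlite3 module a factory like this is attached to the connection (or a cursor) through `row_factory` rather than handed to `fetchall()` itself. Here is a minimal sketch against an in-memory database. The table contents are made up for the example.

```python
import sqlite3
from collections import namedtuple


def namedtuple_factory(cursor, row):
    # cursor.description holds one 7-tuple per column; item 0 is the column name.
    fields = [column[0] for column in cursor.description]
    cls = namedtuple("Row", fields)
    return cls._make(row)


conn = sqlite3.connect(":memory:")
conn.row_factory = namedtuple_factory
conn.execute("CREATE TABLE employees (name TEXT, title TEXT)")
conn.execute("INSERT INTO employees VALUES ('Ada', 'Engineer')")

rows = conn.execute("SELECT name, title FROM employees").fetchall()
print(rows[0].name, rows[0].title)  # Ada Engineer
conn.close()
```

The field names now follow whatever the SELECT returns, so they cannot drift out of sync with the query string.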

ttyprintk

0 replies

2d7h

2024-09-06 11:11:46 UTC

Implemented as conn.row_factory = sqlite3.Row

sgarland

0 replies

2d6h

2024-09-06 11:44:57 UTC

By default, List[Tuple]. You could do a list comp over the fetchall(), but at that point there’s already some magic happening, so why not make it explicit?

kstrauser

2 replies

2024-09-05 18:27:07 UTC

You don't need `_make()` with dataclasses, and you get `asdict()` as a stand-alone function so it doesn't clash with each class's namespace. Here's what your code might look like with them:

    import sqlite3
    from dataclasses import asdict, dataclass
    
    @dataclass
    class EmployeeRecord:
        name: str
        age: int
        title: str
        department: str
        paygrade: str
    
    conn = sqlite3.connect("/companydata")
    cursor = conn.cursor()
    cursor.execute("SELECT name, age, title, department, paygrade FROM employees")
    for emp in (EmployeeRecord(*row) for row in cursor.fetchall()):
        print(emp.name, emp.title)
        print(asdict(emp))

ttyprintk

1 replies

2d7h

2024-09-06 11:10:35 UTC

For that, you might as well conn.row_factory = sqlite3.Row

kstrauser

0 replies

2d5h

2024-09-06 12:58:34 UTC

That would be a better option, as long as the goal isn’t to demonstrate how namedtuples or dataclasses work.

judicious

2 replies

3d3h

2024-09-05 15:04:22 UTC

Dictionary comprehensions can be very elegant. List and dictionary comprehensions are very powerful and expressive abstractions. In fact, while not good practice you can pretty much write all Python code inside comprehensions including stuff regarding mutation.

This is valid (as in it will run, but highly unidiomatic) code:

  quicksort = lambda arr: [pivot:=arr[0], left:= [x for x in arr[1:] if x < pivot], right := [x for x in arr[1:] if x >= pivot], quicksort(left) + [pivot] + quicksort(right)][-1] if len(arr) > 1 else arr

  print(quicksort([1, 33, -4, -2, 110, 5, 88]))

hansvm

1 replies

2024-09-05 18:29:37 UTC

Sometimes mutations in comprehensions are very expressive.

  def scan(items, f, initial):
    x = initial
    return (x := f(x, y) for y in items)

There are lots of other short ways to write `scan`, but I don't think any of them map so clearly to a naive definition of what it's supposed to do.
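
For comparison, one of those other ways is `itertools.accumulate`. The sketch below runs both on made-up inputs. Note that `accumulate(..., initial=...)` also emits the initial value, while the generator version does not.

```python
from itertools import accumulate
from operator import add


def scan(items, f, initial):
    x = initial
    # The walrus rebinds x in scan's scope each time the generator advances.
    return (x := f(x, y) for y in items)


print(list(scan([1, 2, 3, 4], add, 0)))                # [1, 3, 6, 10]
print(list(accumulate([1, 2, 3, 4], add, initial=0)))  # [0, 1, 3, 6, 10]
```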

judicious

0 replies

2d23h

2024-09-05 19:03:26 UTC

That's incredibly clever, generators are underrated. I once challenged my friend to do leetcode problems with only expressions. Here's levenshtein distance, however it's incredibly clunky.

  levenshtein_distance = lambda s1, s2: [
      # Seed row 0 and column 0 with the distances from the empty string.
      matrix := [[i + j if i * j == 0 else 0 for j in range(len(s2) + 1)]
                 for i in range(len(s1) + 1)],
      [
          [
              (matrix[i].__setitem__(j, min(matrix[i-1][j] + 1,
                                            matrix[i][j-1] + 1,
                                            matrix[i-1][j-1] + (0 if s1[i-1] == s2[j-1] else 1))),
               matrix[i][-1])[1]
              for j in range(1, len(s2) + 1)
          ]
          for i in range(1, len(s1) + 1)
      ],
      matrix[-1][-1]][-1]

gcr

1 replies

3d2h

2024-09-05 15:53:34 UTC

Holy shit that’s really clever. Didn’t know about _make, thank you!

sgarland

0 replies

2d6h

2024-09-06 11:40:06 UTC

I didn’t either until I read docs tbf. It’s just kind of thrown in as an afterthought for the section, too.

gcr

1 replies

3d2h

2024-09-05 15:53:58 UTC

Any reason not to consider pydantic as the next step?

JimDabell

0 replies

2024-09-05 17:32:51 UTC

Every time I’ve used Pydantic I’ve found it to be a tonne of friction. The developer ergonomics just don’t seem right.

These days I use attrs and cattrs, and I’m much happier. Everything feels a lot more straightforward.

attrs is what Python’s dataclasses were based on, but they kept on improving it, so attrs just feels like standard Python with a little bit extra.

est

0 replies

3d3h

2024-09-05 14:54:28 UTC

If I find myself writing a[0], a[1], a[2] in more than one place, I upgrade it to a namedtuple. Much better readability, and it can be defined inline like `MyTuple = namedtuple('MyTuple', 'k1 k2 k3')`

BerislavLopac

2 replies

3d11h

2024-09-05 06:56:37 UTC

ChainMap might be the most underrated bit in the standard library.

stevesimmons

1 replies

3d4h

2024-09-05 13:41:44 UTC

For anyone wanting some more explanation, ChainMap can be used to build nested namespaces from a series of dicts without having to explicitly merge the names in each level. Updates to the whole ChainMap go into the top-level dict.

The docs are here [0].

Some simple motivating applications:

- Look up names in Python locals before globals before built-in functions: `pylookup = ChainMap(locals(), globals(), vars(builtins))`

- Get config variables from various sources in priority order: `var_map = ChainMap(command_line_args, os.environ, defaults)`

- Simulate layered filesystems

- etc

[0] https://docs.python.org/3/library/collections.html#collectio...
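
The config-priority case above can be sketched as follows. The three dicts are stand-ins invented for the example (a real program would pass parsed arguments and `os.environ`).

```python
from collections import ChainMap

# Hypothetical sources, highest priority first.
command_line_args = {"color": "blue"}
environment = {"user": "alice"}
defaults = {"color": "red", "user": "guest", "debug": False}

var_map = ChainMap(command_line_args, environment, defaults)

print(var_map["color"])  # blue  (found in the first mapping)
print(var_map["user"])   # alice (found in the second mapping)
print(var_map["debug"])  # False (falls through to the defaults)

# Writes only touch the first mapping; the lower layers stay untouched.
var_map["debug"] = True
print(command_line_args)  # {'color': 'blue', 'debug': True}
print(defaults["debug"])  # False
```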

BerislavLopac

0 replies

2d7h

2024-09-06 10:54:26 UTC

I found it perfect for structured logging, where you might want to modify some details of the logged structures (e.g. a password) without changing the underlying data.

matsemann

1 replies

2024-09-05 17:51:56 UTC

I just wish python had some better ergonomics/syntactic sugar working with itertools and friends. Grouping and mapping and filtering and stuff quickly become so unwieldy without proper lambdas etc, especially as the typing is quite bad so after a few steps you're not sure what you even have.

Just as recently as today I went to Kotlin to process something semi-complex even though we're a Python shop, just because I wanted to bash my head in after a few attempts in Python. A DS could probably solve it in minutes with pandas or something, but again it's stringly typed and involves lots of guesswork.

(It was actually a friendly algorithmic competition at work, I won, and even found a bug in the organizer's code that went undetected exactly because of this)

judicious

0 replies

2024-09-05 18:11:45 UTC

I find converting things from map objects or filter objects back to lists to be a bit clunky. Not to mention chaining operations makes it even more clunky. Some syntactic sugar would go a long way.

tpoacher

0 replies

3d4h

2024-09-05 13:36:15 UTC

also more_itertools ! even less known than itertools, but equally useful.

mturmon

0 replies

2d22h

2024-09-05 19:38:37 UTC

I use defaultdict a lot - for accumulators when you're not sure about what is coming. Here's a simplified example:

    # a[star_name][instrument] = set of (seed, planet index) of visited planets
    a = defaultdict(lambda: defaultdict(set))
    for row in rows:
        a[row.star][row.inst].add((row.seed, row.planet))

This is a dict-of-dict-of-set that is accumulating from a stream of rows, and I don't know what stars and instruments will be present.

Another related tool is Counter (https://docs.python.org/3/library/collections.html#collectio...)
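
A minimal Counter sketch, with a made-up list of star names standing in for the stream of rows:

```python
from collections import Counter

# Hypothetical observations; in practice these would come from the rows.
observed_stars = ["Vega", "Sirius", "Vega", "Altair", "Vega", "Sirius"]

counts = Counter(observed_stars)
print(counts["Vega"])         # 3
print(counts["Polaris"])      # 0 (missing keys count as zero, no KeyError)
print(counts.most_common(2))  # [('Vega', 3), ('Sirius', 2)]
```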

mont_tag

0 replies

3d9h

2024-09-05 09:06:16 UTC

My faves are the lru_cache, namedtuples, deques, chainmap, and all of the itertools.

daniel_grady

0 replies

1d20h

2024-09-06 21:36:37 UTC

Although it's not part of the standard library, toolz is wonderful for rounding out these modules.

https://toolz.readthedocs.io/en/latest/

chadash

13 replies

3d5h

2024-09-05 12:40:07 UTC

> OrderedDict - dictionary that maintains order of key-value pairs (e.g. when HTTP header value matters for dealing with certain security mechanisms).

Word to the wise... as of Python 3.7, the regular dictionary data structure guarantees order. Declaring an OrderedDict can still be worthwhile for readability (to let code reviewers/maintainers know that order is important) but I don't know of any other reason to use it anymore.

willcipriano

3 replies

3d2h

2024-09-05 15:56:10 UTC

Being as specific as possible with your types is how you make things more readable in Python. OrderedDict where the order matters, set where there are no duplicate items possible. The newish enums are great for things that have a limited set of values (dev, test, qa, prod) vs using a string. You can say a lot with type choice.

Another reason is I think that 3.7 behavior is just a CPython implementation detail, other interpreters may not honor it.

7bit

1 replies

2d10h

2024-09-06 08:25:03 UTC

> Being as specific as possible with your types is how you make things more readable in Python.

If Dict already guarantees to keep Order, nothing is won by using both Dict and OrderedDict. Just use Dict.

pests

0 replies

1d17h

2024-09-07 01:15:13 UTC

The mere fact we're having this discussion means people don't know the guarantees, and by being more explicit there is less room for confusion.

sestep

0 replies

3d2h

2024-09-05 16:08:25 UTC

You're thinking of 3.6: https://stackoverflow.com/a/39980744/5044950

WhyNotHugo

3 replies

2d23h

2024-09-05 18:49:16 UTC

Does dict now guarantee that it maintains order? IIRC, it was originally a mere side effect of the algorithm chosen (which was chosen for performance), but it could change in future releases or alternative implementations.

metalliqaz

1 replies

2d23h

2024-09-05 19:11:36 UTC

afaik the documentation states that it could change in the future

nilslindemann

0 replies

2d22h

2024-09-05 19:38:02 UTC

Nope, "Dict keeps insertion order" is the ruling.

https://mail.python.org/pipermail/python-dev/2017-December/1...

eurleif

0 replies

2d23h

2024-09-05 19:18:41 UTC

Changed in version 3.7: Dictionary order is guaranteed to be insertion order. This behavior was an implementation detail of CPython from 3.6.

https://docs.python.org/3/library/stdtypes.html#dict:~:text=....

dataflow

2 replies

3d3h

2024-09-05 14:42:29 UTC

> Word to the wise... as of Python 3.7, the regular dictionary data structure guarantees order.

Which means you still should use it if you might run on 3.6 or earlier.

sestep

1 replies

3d2h

2024-09-05 16:07:35 UTC

Even for 3.6 it's still true, just not guaranteed in writing: https://stackoverflow.com/a/39980744/5044950

And Python <=3.7 is already end-of-life anyways: https://devguide.python.org/versions/

Helmut10001

0 replies

2d23h

2024-09-05 18:44:01 UTC

This hit me bad once. I tested the regular dict and it _looked_ like it was ordered. Turned out, 1 out of about 100000 times it was not. And I had a lot of trouble identifying the reason 3 weeks later, when the bug was buried deep in complex code, and it appeared at what looked like random.

LudwigNagasena

1 replies

3d5h

2024-09-05 13:21:29 UTC

Comparison of OrderedDict is order-sensitive. They also have some extra methods.

buildbot

0 replies

3d1h

2024-09-05 16:40:39 UTC

Yep, those extra methods are extremely useful. Basically turns them into stack/queues in addition to being dictionaries which can be very helpful.

BiteCode_dev

9 replies

3d4h

2024-09-05 14:10:56 UTC

Add functools to the list. Especially functools.wraps() and functools.partial().

The stdlib is full of goodies.

Now I always appreciated the batteries-included logic in Python. But I noticed this week that LLMs diminish that need. It's so easy to prompt for small utilities and it saves you from using entire libraries for a few tools.

And the AI can create doc and tests for them as quickly.

So while I was really enthusiastic when things like pairwise() were added to itertools, it's not as revolutionary as before.

morkalork

3 replies

2024-09-05 17:41:27 UTC

I wish there were some syntactic sugar for partial but knowing how patterns like that get abused in other languages maybe it is for the better that there isn't.

globular-toast

2 replies

2024-09-05 17:56:50 UTC

What would it look like? Wouldn't it just be lambda?

morkalork

0 replies

2d16h

2024-09-06 01:50:05 UTC

Yes, sugar so instead of having `partial(foo, w=1, x=2)` or `lambda y,z: foo(1,2,y,z)` you'd just have something like `foo(1,2,_,_)`
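
To make the two existing spellings concrete, here is a small sketch with a hypothetical four-argument `foo`. It binds the leading arguments positionally, since binding `w` and `x` by keyword would force the remaining arguments to be passed by keyword as well.

```python
from functools import partial


def foo(w, x, y, z):
    # Hypothetical function, only here to show which arguments got bound.
    return (w, x, y, z)


bound_partial = partial(foo, 1, 2)
bound_lambda = lambda y, z: foo(1, 2, y, z)

print(bound_partial(3, 4))  # (1, 2, 3, 4)
print(bound_lambda(3, 4))   # (1, 2, 3, 4)

# Unlike a lambda, a partial object exposes what it wraps.
print(bound_partial.func is foo)  # True
print(bound_partial.args)         # (1, 2)
```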

BiteCode_dev

0 replies

2d23h

2024-09-05 19:05:44 UTC

I wish it would at least be a method on any callable.

gcr

3 replies

3d2h

2024-09-05 15:52:07 UTC

If you’re saying that LLMs trade idiomatic tools for ease-of-boilerplate-generation, shouldn’t that be a point against them, not in their favor?

Pardon the hyperbole but it’s a bit like lauding an IDE for automatically generating thousands of Java class stubs.

rurp

1 replies

3d2h

2024-09-05 16:21:29 UTC

Agreed, rewriting standard functions is much worse than using standard tools that already exist.

In addition to the extra boilerplate and reduced readability, that also sounds like an easy way to introduce subtle bugs. Standard library functions have been exhaustively field tested, a similar looking LLM generated function could easily include a footgun.

BiteCode_dev

0 replies

3d1h

2024-09-05 16:49:54 UTC

Sure, but have you tried to introduce a new standard tool in the stdlib?

It's not a fun process.

Writing the code is the easy part.

And installing more-itertools for one function is a bit silly

BiteCode_dev

0 replies

3d1h

2024-09-05 16:49:04 UTC

For decades you had the famous itertools recipes taunting you in the doc: https://docs.python.org/3/library/itertools.html#itertools-r...

They were super useful, but not included in the stdlib, despite being a few lines long.

We also had more-itertools, boltons, and others, to bridge that gap.

Now, there was always a tension between adding more stuff to the stdlib, or letting 3rd party libs handle it. Remember the saying: the stdlib is where projects go to die.

And of course tensions about installing full on 3rd party libs just for a few functions.

The result is that many people copy/pasted a lot of small utilities, and endless debates on python-ideas to include some more.

I think this is going to slow down. Now if you want "def first_true(iterable, default=False, predicate=None)", you ask chatgpt, and you don't care.
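
For context, a helper with that signature fits in one line. The sketch below follows the approach of the itertools recipes (`next` over `filter`). The sample inputs are made up.

```python
def first_true(iterable, default=False, predicate=None):
    """Return the first item that is truthy (or satisfies predicate), else default."""
    # filter(None, ...) keeps truthy items; next() falls back to default when empty.
    return next(filter(predicate, iterable), default)


print(first_true([0, "", 3, 4]))                            # 3
print(first_true([1, 8, 9], predicate=lambda n: n > 5))     # 8
print(first_true([1, 2, 3], None, lambda n: n > 5))         # None
```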

The cost of adding those into the project is negligible.

It's nowhere near generating thousands of class stubs. It's actually the opposite: very targeted, specific code needs being filled instead of haunting python debates or your venv.

But to stimulate your anxiety a bit, I do think code gen is also going to make a big comeback with LLMs :)

BeetleB

0 replies

3d3h

2024-09-05 15:15:03 UTC

And itertools!

sgarland

8 replies

3d4h

2024-09-05 13:44:27 UTC

Adding `array` [0] to the list. It's generally slower than a list, but massively more memory-efficient. You're limited to a heterogeneous type, of course, but they can be quite useful for some operations.

[0]: https://docs.python.org/3/library/array.html
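
A small sketch of what that looks like in practice. Every element of an array shares the C type chosen by the type code. The choice of `"q"` (signed 64-bit integers) and the sample values are just for illustration.

```python
from array import array

values = array("q", range(1000))  # "q": signed 64-bit integers

print(values.itemsize)        # 8 bytes per element
print(len(values.tobytes()))  # 8000 bytes of raw payload for 1000 items

# Anything that does not fit the type code is rejected.
try:
    values.append(1.5)
    rejected = False
except TypeError:
    rejected = True
print(rejected)  # True
```

A list of the same numbers stores a pointer per element plus a separate int object for each value, which is where the memory difference comes from.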

cgopalan

5 replies

3d3h

2024-09-05 14:50:06 UTC

You mean homogenous instead of heterogenous, right?

tomjakubowski

1 replies

3d2h

2024-09-05 16:04:46 UTC

To add to this, arrays are also restricted to primitive C types. A Python array object is simply a heap allocated `unsigned long *` or what have you.

https://docs.python.org/3/library/array.html

https://github.com/python/cpython/blob/main/Modules/arraymod...

And you can use struct for heterogenous data =) It has a neat DSL for packing/unpacking the data, reminiscent of the "little languages" from classic book The Practice of Programming. Python is actually pretty nice working with binary data.

https://docs.python.org/3/library/struct.html
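
A minimal sketch of that format DSL. The record layout below is invented for the example.

```python
import struct

# "<"  little-endian, no padding
# "H"  unsigned 16-bit int, "I" unsigned 32-bit int,
# "f"  32-bit float, "4s" four raw bytes
RECORD = struct.Struct("<HIf4s")

packed = RECORD.pack(7, 123456, 1.5, b"ABCD")
print(RECORD.size)            # 14
print(RECORD.unpack(packed))  # (7, 123456, 1.5, b'ABCD')
```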

sgarland

0 replies

2d6h

2024-09-06 11:58:00 UTC

> Python is actually pretty nice working with binary data.

It really is! I’ve been working on a project to generate large amounts of synthetic data, and it calls out to C for various shared libraries to do the heavy lifting *. Instead of encoding and decoding back and forth, I can just ship bytes around, and then directly write them out to a file. Saves a lot of time.

*: yes, I should just rewrite it into a faster language entirely. I intend to, but for the time being it’s been “how fast can I make Python without anything but stdlib,” as long as you accept ctypes as being included in that definition.

banannaise

1 replies

2d4h

2024-09-06 14:06:13 UTC

It would be very funny to have an iterable type where the items are required to contain incompatible types.

Sohcahtoa82

0 replies

1d20h

2024-09-06 22:17:19 UTC

Somewhere out there, I'm sure someone could come up with a use case.

sgarland

0 replies

2d7h

2024-09-06 11:06:47 UTC

Ugh… yes. Thank you.

lang4d

1 replies

2d21h

2024-09-05 21:09:58 UTC

Why is it slower compared to a normal list?

sgarland

0 replies

2d6h

2024-09-06 11:49:09 UTC

The simplistic answer is that lists have been heavily optimized over the years, arrays haven’t been touched much.

I’m not positive on why lists are faster to create than arrays, though. Retrieval makes sense (lists already store the Python object, arrays have to cast it back), but creation I’m unsure about. I’ll check dis.dis.

EDIT: from a sibling comment above [0], maybe because array reallocs are done much more granularly than lists, so as it grows, it’ll have to do so more frequently compared to lists?

[0]: https://github.com/python/cpython/blob/main/Modules/arraymod...

joshdavham

6 replies

3d13h

2024-09-05 05:07:19 UTC

I did not know about that webbrowser module. This will definitely come in handy!

fuzztester

5 replies

3d11h

2024-09-05 07:25:40 UTC

Try:

  import antigravity

and

  import braces

separately.

adm_

3 replies

3d9h

2024-09-05 08:59:11 UTC

Actually it is

  from __future__ import braces

joshdavham

1 replies

2d23h

2024-09-05 19:16:10 UTC

That's cool! I didn't know about this one.

Also, just to make sure I understand the joke. It's basically just saying that they'll never add braces to the python syntax, right?

Sohcahtoa82

0 replies

1d19h

2024-09-06 22:41:52 UTC

I mean, technically braces ARE part of Python syntax. They're used by sets and dictionaries.

But I know what you meant, and yeah...they'll never use braces as block delimiters. IMO, that's a good thing. Whitespace-as-syntax means you're FORCED to have a minimal level of decent code formatting or it doesn't work.

fuzztester

0 replies

3d6h

2024-09-05 12:02:47 UTC

Oh yeah, thanks.

That also makes it more funny. :)

chuckadams

0 replies

3d4h

2024-09-05 14:07:05 UTC

Don’t forget

    import this

guhcampos

5 replies

3d3h

2024-09-05 15:22:33 UTC

I've significantly reduced my use of `namedtuple` since dataclasses were introduced, but I confess I never did many performance comparisons between the two.

I assume the `namedtuple` syntax is more pleasing for functional-leaning programmers, but this makes me wonder if the stdlib should choose one of them?

crabbone

3 replies

3d3h

2024-09-05 15:30:00 UTC

Last I looked at named tuple implementation, it was along these lines:

* generate source code from a template string.

* eval generated code.

* call constructor.

This is woefully slow and wasteful compared to a sensible solution: writing it in C. But, nobody really cares.

twixfel

2 replies

3d2h

2024-09-05 15:41:47 UTC

I suspect nobody cares because it's not a problem. That bit of code you're moaning about will only be called once per namedtuple. It's unlikely to be a problem.

crabbone

1 replies

3d2h

2024-09-05 16:29:29 UTC

Guess who cares? The person you replied to... and it would've been really easy to figure that out, given that parent went all the way to look for implementation, isn't it?

Anyways. The reason I cared is because I was working on a Protobuf parser, where named tuple was supposed to play a key role: the message class. Imagine my disappointment when I started to run benchmarks.

twixfel

0 replies

23h43m

2024-09-07 18:47:58 UTC

But namedtuple makes classes, it does not instantiate them. If you're in a situation where your bottleneck is defining classes rather than instantiating them, then I would hope you can appreciate that this is an edge case.

pletnes

0 replies

3d2h

2024-09-05 15:34:20 UTC

In my opinion, namedtuple was created to allow usage of a tuple (they are required in many places) while giving names to the members, rather than plain indexes.

m463

4 replies

2d21h

2024-09-05 21:22:18 UTC

what do people use when they want shorthand for something like this:

  a['foo'] = 20
  a['bar'] = 9

where you want to be able to do:

  a.foo = 20
  a.bar = 9

Am4TIfIsER0ppos

1 replies

2d6h

2024-09-06 12:04:59 UTC

Use Lua?

Sohcahtoa82

0 replies

1d19h

2024-09-06 22:42:17 UTC

Or JavaScript.

hooverd

0 replies

2d21h

2024-09-05 21:25:44 UTC

Pydantic, attrs, or dataclasses (standard library option). Or you can override __getattr__ on a dict subclass.

Too

0 replies

1d11h

2024-09-07 06:58:06 UTC

You don't. Use dataclasses instead to get type checking.

Otherwise there are tons of tricks to get what you want. To add to the list posted in sibling:

    a = vars(a) # readonly
    print(a.foo)

    class Obj: pass; 
    a = Obj()
    a.foo = 20
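
Two more stdlib-only options, sketched under the assumption that you start from a plain dict. `AttrDict` is a hypothetical name for the dict-subclass approach mentioned in a sibling comment.

```python
from types import SimpleNamespace

a = {"foo": 20, "bar": 9}

# Option 1: copy the dict into a namespace object with attribute access.
ns = SimpleNamespace(**a)
ns.baz = 1
print(ns.foo, ns.bar, ns.baz)  # 20 9 1
print(vars(ns))                # {'foo': 20, 'bar': 9, 'baz': 1}


# Option 2: a dict subclass that forwards attribute access to its keys.
class AttrDict(dict):
    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        self[name] = value


d = AttrDict(a)
d.baz = 1
print(d.foo, d["baz"])  # 20 1
```

Neither option gives a type checker anything to work with, which is the trade-off against dataclasses.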

magicmicah85

3 replies

3d3h

2024-09-05 14:44:08 UTC

Probably silly, but I went five years of programming in python before I learned about the help function. Only learned about it when I had to take an intro to python class for school.

stavros

1 replies

3d2h

2024-09-05 15:44:33 UTC

Well, I went twenty-five years of programming in Python before I saw your comment, sooo...

magicmicah85

0 replies

2d18h

2024-09-05 23:52:53 UTC

Haha, I feel like it’s such an unknown feature and it’s incredibly useful too.

ttyprintk

0 replies

2d7h

2024-09-06 10:58:24 UTC

Try pydoc -p 0

tpoacher

2 replies

3d4h

2024-09-05 13:51:37 UTC

I was not aware of zipapp ... but it's interesting to see it exists as a method for enabling python to run 'zipped packages' ... since python can already do that by default with normal zipfiles, as long as the zipfile appears / is added to the python path (which is roughly analogous to how one can add .jar files to the classpath in java). E.g.:

  export PYTHONPATH="package.zip"
  python3 -m packagename

will work just fine.

(PS. I document this technique in one of my python-template projects: https://git.sr.ht/~tpapastylianou/python-self-contained-runn...)

I suppose, if the intent is to package something in a manner that attempts to make it newbie-proof, then requiring a PYTHONPATH incantation before the python part might be one step too far ... but then again, one could argue the same about people not quite knowing what to do with a .pyz file and getting stuck.
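
For the zipapp side of that comparison, here is a minimal sketch that builds a .pyz from a throwaway directory and runs it. The directory name, file contents, and message are made up. `zipapp.create_archive` needs the source directory to contain a `__main__.py` (or a `main=` argument).

```python
import subprocess
import sys
import tempfile
import zipapp
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "myapp"  # hypothetical application directory
    src.mkdir()
    (src / "__main__.py").write_text('print("hello from a zipapp")\n')

    target = Path(tmp) / "myapp.pyz"
    zipapp.create_archive(src, target)

    # The archive is run by handing it straight to the interpreter.
    result = subprocess.run(
        [sys.executable, str(target)],
        capture_output=True, text=True, check=True,
    )
    output = result.stdout.strip()

print(output)  # hello from a zipapp
```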

BiteCode_dev

1 replies

3d4h

2024-09-05 14:07:19 UTC

Check out shiv for even more zipapp goodness: https://shiv.readthedocs.io/en/latest/

ttyprintk

0 replies

2d7h

2024-09-06 10:55:19 UTC

shiv has two things over zipapp:

A bootstrap boilerplate that allows the shiv to be able to run as an interpreter. I think zipapp can only be given code.interact as main.

Unpacking wheels into ~/.shiv, which might be faster. I can’t remember if this permits running compiled C, which is not possible from within a zipapp.

rurp

2 replies

3d2h

2024-09-05 16:31:00 UTC

I see a lot of mentions of itertools in this thread, which is indeed a great library, but I want to mention that itertools.groupby is one of the easiest to misuse functions I've seen. It's not necessarily intuitive that it groups contiguous records. Passing it an unsorted list won't break, but also might not return the results you're expecting.
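
A quick sketch of the pitfall, with a made-up word list grouped by first letter:

```python
from itertools import groupby

words = ["apple", "avocado", "banana", "apricot", "blueberry"]


def first_letter(word):
    return word[0]


# Unsorted input: a new group starts every time the key changes.
unsorted_groups = [(k, list(g)) for k, g in groupby(words, key=first_letter)]
print(unsorted_groups)
# [('a', ['apple', 'avocado']), ('b', ['banana']), ('a', ['apricot']), ('b', ['blueberry'])]

# Sorting by the same key first gives one group per key.
by_key = sorted(words, key=first_letter)
sorted_groups = [(k, list(g)) for k, g in groupby(by_key, key=first_letter)]
print(sorted_groups)
# [('a', ['apple', 'avocado', 'apricot']), ('b', ['banana', 'blueberry'])]
```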

globular-toast

1 replies

2d11h

2024-09-06 07:04:08 UTC

Yes, it's similar to Unix's `uniq` (which is mentioned in the doc). Some SQL stuff also requires prior sorting to work right (although usually you can't accidentally miss it).

Nowadays the GNU `uniq` can sort and the `sort` can unique because there are performance benefits. I assume the same is true in Python, so if you're worried about performance, `groupby(sorted(...))` might not be the best.

The other thing that is a bit odd is it returns iterators. It's up to you to build concrete groups if that's what you need.

ttyprintk

0 replies

2d7h

2024-09-06 11:01:38 UTC

I feel like non-trivial uses always involve a complex key function. At that point, I reach for defaultdict(list)

pjot

2 replies

3d4h

2024-09-05 14:30:40 UTC

To run a localhost webserver on port 8000, serving the content of the current directory:

  python -m http.server

Pass -h for more options.

macNchz

1 replies

2024-09-05 17:33:21 UTC

This is one I use all the time, super handy. Another CLI module I regularly make use of is `python -m json.tool`, for formatting and validating json.

Last year I ran http.server with -h to remind myself of something, and the --cgi flag caught my eye... funnily enough, there's built-in support in the web server for running CGI scripts. Alas, it's deprecated and will be removed in 3.13 later this year, but when I discovered it I couldn't resist the opportunity to write a CGI script for the first time in 20-something years: https://github.com/drien/python-httpserver-upload

ttyprintk

0 replies

2d7h

2024-09-06 11:18:37 UTC

You’ll probably like -m zipfile for those times unzip is not installed.

dairiki

1 replies

3d9h

2024-09-05 09:29:37 UTC

I just discovered graphlib.TopologicalSorter the other day.

Nice! When you need it, you need it. It's nice not to have to implement it oneself.
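For anyone who hasn't seen it, a small sketch with a made-up build pipeline (Python 3.9+). The input is a mapping from each node to the set of nodes that must come before it.

```python
from graphlib import TopologicalSorter

# Made-up dependency graph: each key depends on the nodes in its value set.
dependencies = {
    "deploy": {"test", "package"},
    "package": {"build"},
    "test": {"build"},
    "build": set(),
}

# static_order() yields nodes so that every node comes after its predecessors.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

The relative order of "test" and "package" is not something to rely on; only the dependency constraints are guaranteed. A cycle in the graph raises `graphlib.CycleError`.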

crabbone

0 replies

3d2h

2024-09-05 15:36:01 UTC

Once I found myself needing to sort something topologically... and the interface to this sorter is so bad that you cannot really retrofit any kind of graph data to make it work with the sorter. So, it's kinda worthless, unless you specifically design your graph to be sorted with this sorter.

Also, topological sort is like five lines of code... so, it doesn't matter if the function is there.

timdiggerm

0 replies

3d4h

2024-09-05 13:47:25 UTC

The contrast between links and the background colors is too low, making this very hard to read.

teddyh

0 replies

3d8h

2024-09-05 09:41:27 UTC

file_url = 'file://' + os.path.realpath('test.html')

You have to encode the file name!

  file_url = 'file://' + urllib.parse.quote(os.path.realpath('test.html'))
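A sketch of why the quoting matters, using a hypothetical file name with a space and a `#` in it. `pathlib.Path.as_uri()` is shown alongside as an alternative that does the percent-encoding itself; that is a swapped-in suggestion, not what the article used.

```python
import os.path
import urllib.parse
from pathlib import Path

# Hypothetical file name; the file does not need to exist for this demo.
name = "my report #1.html"

# Unquoted: the space and '#' end up raw in the URL, and '#' starts a fragment.
raw_url = "file://" + os.path.realpath(name)

# Quoted as suggested above: special characters are percent-encoded.
quoted_url = "file://" + urllib.parse.quote(os.path.realpath(name))

# Alternative: let pathlib build the file URL from an absolute path.
pathlib_url = Path(name).resolve().as_uri()

print(raw_url)
print(quoted_url)
print(pathlib_url)
```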

skinner927

0 replies

3d6h

2024-09-05 12:12:34 UTC

contextlib.ExitStack is a lesser known trick for limiting context manager nesting.
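A minimal sketch, assuming you want to open a variable number of files at once (the file names and contents are made up). Each context manager is registered on the stack, and all of them are exited in reverse order when the single `with` block ends.

```python
import tempfile
from contextlib import ExitStack
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Create a few throwaway files to open together.
    paths = [Path(tmp) / f"part{i}.txt" for i in range(3)]
    for i, path in enumerate(paths):
        path.write_text(f"chunk {i}\n")

    # One with-statement instead of three nested ones.
    with ExitStack() as stack:
        files = [stack.enter_context(path.open()) for path in paths]
        combined = "".join(f.read() for f in files)

    # Leaving the ExitStack block closed every file.
    all_closed = all(f.closed for f in files)

print(combined, end="")
print(all_closed)
```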

openrisk

0 replies

3d11h

2024-09-05 07:01:47 UTC

Some modules are essential additions while others are handy so as not to have to manage dependencies.

Good example of the latter use case is the statistics module.

There is a price to pay though: it is 10x slower than numpy. So it's mostly useful when the required calculation is not a bottleneck.

The benefit is you are good to go (batteries included) without any virtual environments, pip, etc.

nurettin

0 replies

1d21h

2024-09-06 21:05:13 UTC

I am surprised that they didn't mention pack/unpack. And a namedtuple can take the results of unpack which means you can easily parse binary files. Like

    header = Header._make(unpack(header_format, f.read(64)))

    print(header)
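A self-contained sketch of the same idea. The header layout, field names, and sample bytes are all made up, and an in-memory `io.BytesIO` stands in for the binary file. Note that `struct.unpack` takes the format string first and the buffer second.

```python
import io
from collections import namedtuple
from struct import calcsize, pack, unpack

# Hypothetical layout: 4-byte magic, two unsigned shorts, one unsigned int,
# little-endian with no padding.
header_format = "<4sHHI"
Header = namedtuple("Header", "magic major minor payload_size")

# Stand-in for an open binary file.
f = io.BytesIO(pack(header_format, b"DEMO", 1, 2, 1024) + b"payload...")

# Read exactly as many bytes as the format describes, then name the fields.
header = Header._make(unpack(header_format, f.read(calcsize(header_format))))
print(header)
```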

nick238

0 replies

3d11h

2024-09-05 06:58:38 UTC

The following are identical

    fractions.Fraction(numerator=1, denominator=3)
    fractions.Fraction(1) / 3

ChainMap is maybe better described/used as inheritance for dicts, where something like

    settings = ChainMap(instance_settings, region_settings, global_settings)

would give you one object to look in.
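Spelled out with made-up settings dicts: lookups walk the maps left to right and return the first hit, while writes only touch the first map.

```python
from collections import ChainMap

# Made-up settings at three levels, most specific first in the chain.
global_settings = {"timeout": 30, "retries": 3, "region": "global"}
region_settings = {"timeout": 10, "region": "eu-west"}
instance_settings = {"retries": 5}

settings = ChainMap(instance_settings, region_settings, global_settings)

print(settings["retries"])  # found in instance_settings
print(settings["timeout"])  # falls through to region_settings
print(settings["region"])   # region_settings shadows global_settings

# Writes go to the first mapping only; the others are left untouched.
settings["debug"] = True
print(instance_settings)
```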

jjice

0 replies

3d3h

2024-09-05 15:00:12 UTC

This is the reason I love python for small personal projects. I can get up and going in a heartbeat and the stdlib has so much that I'd need. If there was a Flask-style HTTP server and a more requests-like HTTP client in the stdlib, I'd be a content man. Maybe I need to suck it up, but I just find venvs and Python packaging in general annoying to deal with, especially for something small.

That said, Go has those things so it's crept in a little bit into my quick programming, but I'll always love python.

h4l

0 replies

3d9h

2024-09-05 09:30:46 UTC

MappingProxyType is another handy one. It wraps a regular dict/Mapping to create a read-only live view of the underlying dict. You can use it to expose a dict that can't be modified, but doesn't need copying.

https://docs.python.org/3/library/types.html#types.MappingPr...
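A short sketch with a made-up config dict: writes through the proxy raise `TypeError`, but changes made to the underlying dict show up in the view.

```python
from types import MappingProxyType

# Made-up private state that the owner is still allowed to change.
_config = {"debug": False}

# Read-only live view to hand out to callers.
config_view = MappingProxyType(_config)

try:
    config_view["debug"] = True
except TypeError:
    write_rejected = True
else:
    write_rejected = False

# The owner updates the real dict; the view reflects it without copying.
_config["debug"] = True

print(write_rejected)
print(config_view["debug"])
```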

globular-toast

0 replies

2024-09-05 17:54:50 UTC

ChainMap is one of my favourites. I like when I find a use for it. The obvious one is cascading options type thing (like cmdline options -> env -> defaults). I also found a use for it recently when changing the underlying storage layer of a class without breaking the API.

My other favourite parts of the stdlib are functools and itertools. They are both full of stuff that gives you superpowers. I always find it a shame when I see developers do an ad hoc reimplementation of something in functools/itertools.

flakiness

0 replies

3d1h

2024-09-05 17:09:14 UTC

Didn't know "dis". It looks nice!

Everyone these days is using ast [1] but there might be room for dis instead in some cases.

[1] https://docs.python.org/3/library/ast.html#module-ast

djoldman

0 replies

3d4h

2024-09-05 13:54:15 UTC

TIL decimal and fraction. Pretty cool.

brianyu8

0 replies

3d15h

2024-09-05 02:47:43 UTC

If you liked this blog post, I can’t recommend PyMOTW[0] highly enough. It’s my go-to for a concise introduction whenever I need to pick up a new Python stdlib module.

[0]: https://pymotw.com/3/

alexpotato

0 replies

3d6h

2024-09-05 11:55:08 UTC

For a funny and insightful tour of the Python "built in" functions, I highly recommend Dave Beazley's talk:

https://www.youtube.com/watch?v=j6VSAsKAj98

Uptrenda

0 replies

3d11h

2024-09-05 06:45:25 UTC

Throwing frozensets out, too. If regular sets aren't obscure enough, frozensets might be your thing. It looks like a set, it acts like a set, but it's... hashable (for indexing) and immutable. Why use this? For algorithms that rely on combinations (not permutations), frozensets can be very useful. E.g. NOT this: (0, 1) and (1, 0) are distinct as tuples, whereas frozenset([0, 1]) and frozenset([1, 0]) are equal and hash the same. You can use this for indexing algorithms and things like that. Sometimes, sets are very convenient because they naturally 'normalise' entries so that order doesn't matter. This can simplify a lot of code.
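A small sketch of the indexing use, with made-up node names: an undirected edge keyed by a frozenset can be looked up in either order, which a tuple key would not allow.

```python
# Made-up undirected graph: store one distance per unordered pair of nodes.
distances = {}
distances[frozenset(["A", "B"])] = 5

# Lookup works regardless of the order the pair is given in.
same_edge = distances[frozenset(["B", "A"])]

# Tuples, by contrast, treat the two orders as different keys.
tuples_differ = ("A", "B") != ("B", "A")
frozensets_equal = frozenset(["A", "B"]) == frozenset(["B", "A"])

print(same_edge, tuples_differ, frozensets_equal)
```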

Qem

0 replies

3d8h

2024-09-05 09:55:23 UTC

For people eager to join the AI/ML revolution it provides Naive Bayes classifier - an algorithm that can be considered a minimum viable example of machine learning.

I don't think this is true. It allows you to specify and calculate parameters for normal distributions, which allows you to jury-rig a naive Bayes classifier, which is shown as a doc example. This is not the same as providing a built-in classifier.
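To illustrate the distinction, here is a rough sketch of that kind of jury-rigging with `statistics.NormalDist`. The training data, labels, and the `classify` helper are made up, equal priors are assumed, and with a single feature the "naive" independence assumption never comes into play. You write the classifier yourself; the module only supplies the distributions.

```python
from statistics import NormalDist

# Made-up training data: heights in cm for two made-up classes.
heights = {
    "short": [150, 155, 160, 158],
    "tall": [180, 185, 178, 190],
}

# Fit one normal distribution per class from its samples.
models = {
    label: NormalDist.from_samples(samples) for label, samples in heights.items()
}


def classify(x):
    """Hypothetical helper: pick the class with the highest likelihood.

    Equal priors are assumed, so the prior term is left out.
    """
    return max(models, key=lambda label: models[label].pdf(x))


print(classify(157))
print(classify(183))
```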

OutOfHere

0 replies

2d18h

2024-09-06 00:04:43 UTC

It has an HTTP server that can compete with Flask for simple use.