Ok I’m about four months late on this, but I’m not sure anyone else blogs about MediaWiki Python utility library releases so I’m still gonna tag this as a news post. In December last year, mwparserfromhell released version 0.6! There are two super exciting changes, and you should follow that link for the full changelog, since I’m only going to go over these two changes:
- Underscores and spaces are now equivalent in the `matches()` method
- `Template.get()` now takes a fallback parameter (and also supports `dict` syntax for accessing params)!
(Code examples were found in my leaguepedia_archive repository or written for this post.)
Underscores and spaces thing
Previously, we would write:
And now we can write:
This may just seem like a minor convenience, but this is a pretty huge improvement for a few reasons:
- The fact that this didn’t “just work” before was a source of “accidental complexity” - especially for beginners just starting to learn the ins and outs of this library (also I’m pretty sure I forgot this wasn’t already supported and messed up at least a few times, oops)
- If you forget to support underscores, the resulting bugs only show up when a wiki user happens to have “messed up” (in a sense) and used the other spelling, so they look nondeterministic and are hard to notice
- `Infobox Team` and `Infobox_Team` are just two variations, what about `Template:This template name has many different words`? You get exponential growth, yikes
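To put a number on that exponential growth, here's a small pure-Python sketch. `name_variants` is a helper I made up for this post, not part of mwparserfromhell; it just enumerates every way to write each gap as a space or an underscore:

```python
from itertools import product

def name_variants(name):
    """Return every spelling of `name` with each gap as a space or an underscore."""
    words = name.split(" ")
    variants = []
    # One separator choice per gap between words: 2 ** (len(words) - 1) combinations.
    for seps in product(" _", repeat=len(words) - 1):
        parts = [words[0]]
        for sep, word in zip(seps, words[1:]):
            parts.append(sep + word)
        variants.append("".join(parts))
    return variants

print(name_variants("Infobox Team"))  # ['Infobox Team', 'Infobox_Team']
print(len(name_variants("This template name has many different words")))  # 64
```

Seven words means six gaps, so 64 spellings you would otherwise have to list by hand.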
So this is actually something to be really excited about!!
Here’s a direct link to the PR.
Template.get() fallback parameter
I am SO EXCITED!! about this one!!!
Previously, we would write:
And this can now be written as:
If we’re certain that a param exists, then we can also now just access parameters as if the template were a dict. I’m a bit mixed about this syntax: it only saves a couple of characters, and in my opinion it costs a bit of clarity.
This is going to be slightly weird, though, because, remember, you get a `Parameter` object, not the value of the key:
If this were actually a dictionary with key-value pairs of Sona data (rather than a wiki template), we’d expect to get `<class 'str'>`. But instead, what we get is:
name=Sona <class 'mwparserfromhell.nodes.extras.parameter.Parameter'>
Of course, we knew that; that’s how `template.get()` has always worked. But when accessing via the dict syntax, this definitely could feel just a bit unexpected - so be careful! And maybe stick to the `.get()` method for clarity.
An argument in favor of the dict syntax
There’s an argument in favor of the dict syntax, though, which is that it makes it more obvious whether we expect a parameter to be in the template or not - just like when working with dicts.
- `template['name']` - we know the template has a `name` param (and we’ll get an error if it doesn’t)
- `template.get('name', None)` - the template may not have a `name` param, and we fall back to `None`
So, things to balance.
By the way, `template.get('name')` is not QUITE dict-like: with no fallback, it still raises an error when the param is missing instead of returning `None`.
We DO still need the fallback `None` that I wrote above, unlike when working with normal dicts. (And to be clear, this is NOT a criticism of the implementation; it would be a breaking change to have it any other way, as there could be a lot of code that depends on try/catching `template.get()` failures. A library as low-level as `mwparserfromhell` needs to be really, really, really stable, so breaking changes are to be avoided at all costs, especially for something that is, at the end of the day, really just syntactic sugar.)
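For the curious, one common way to add an optional fallback without breaking old callers is a sentinel default. This is a toy stand-in I wrote for this post to show the idea (`ToyTemplate` and its internals are made up), not mwparserfromhell's actual code:

```python
_UNSET = object()  # sentinel: tells "no fallback given" apart from fallback=None

class ToyTemplate:
    """Toy stand-in for a template; NOT mwparserfromhell's implementation."""

    def __init__(self, params):
        self._params = dict(params)

    def get(self, name, default=_UNSET):
        if name in self._params:
            return self._params[name]
        if default is _UNSET:
            # Old behavior preserved: code that catches ValueError keeps working.
            raise ValueError(name)
        return default

    def __getitem__(self, name):
        # Dict-style access is just get() with no fallback.
        return self.get(name)

toy = ToyTemplate({"name": "Sona"})
print(toy.get("title", None))  # None
try:
    toy.get("title")
except ValueError:
    print("still raises without a fallback")
```

Because `None` is a perfectly valid fallback, the sentinel is what lets `get('title', None)` and `get('title')` behave differently.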
I love love love love love love this library, and I’m so happy to see it continuing to be developed!
mwparserfromhell is crazy impressively good at what it does, and just a joy to develop with, and these two patches are making it even more so!
I do recommend against using the dict-access syntax - I think it can be a convenient nice-to-have, but it varies just a bit too much in behavior from real dicts to make it a net positive. Stick to `template.get('name', fallback).value` and don’t forget to `.strip()` the result!