What's up with Variables and Parsoid?

If you go to the mediawiki.org page for the Variables extension, you will see the following rather scary warning:

Warning: This extension is incompatible with plans to parallelize parsing, as is intended by the use of Parsoid. Therefore, the future of this extension is uncertain, and it is expected to become incompatible with the standard MediaWiki parser within a few years. For further information, see task T250963 and Parsoid/Extension API § No support for sequential, in-order processing of extension tags. [1]

But it’s one thing for Wikipedia, with its huge contributor base including several volunteer developers, to use solely Lua in places where Extension:Variables would otherwise be needed. It’s quite another for small third-party wikis on topics like video games, software documentation, and personal projects. On wikis like these, there are often only a couple of people managing every aspect of the site, and it’s unlikely that any of them has significant development experience.

Over the past few years, there’s been a lot of confusion in the third-party wiki space over what to do about this situation. Is Extension:Variables safe to use? Will Parsoid support forced-linear parsing? [2] Will the legacy parser be permanently available as a separate content model? Can variables be made compatible with Parsoid? And are we talking about #vardefine and #var only, or also #var_final?

At MWCon Spring 2024 I spoke with cscott from the Parsoid team at the Wikimedia Foundation about the situation. This post contains my assessment of the current state of Variables & Parsoid & MediaWiki based on our conversation; cscott has reviewed it (extensively, he helped a lot!!), but please don’t take this post to be any kind of official communication itself. I don’t work for WMF and this isn’t Phabricator. Thanks also to alex4401 for reviewing.

TL;DR

I believe you are safe to continue using Variables on your own third-party wiki. This includes #var_final as this function has been patched to be exactly as Parsoid-compatible as the rest of the extension. (But try not to use #var_final anyway because it’s not a great coding practice.)

Code quality recommendation

So you can use variables, but should you? Keep in mind that every variable is a global variable. This is slightly off-topic for this article, but I don’t want to say “yes use variables!” without qualifying a bit.

There is no such thing as a locally-scoped variable.

Recently I was doing a demo where I built a tiny wiki with some Cargo; the point of the demo was to explain the use of Cargo, so I did it entirely in wikitext. I wanted to show a sum of some things:

{{#vardefine:
   cweight|
      {{#expr:
          {{#var:cweight|0}} + {{{quantity}}} * {{{weight}}}
      <!-- end expr -->}}
}}

Okay, easy enough, then retrieve it when you are done with:

{{#var:cweight}}

Not so fast: there can be multiple instances of this entire thing per page, and the variable keeps its value from one instance to the next. So we have to retrieve it and reset it in one go:

{{#var:cweight}}{{#vardefine:cweight|0}}

Scoping does not exist.
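
To make the leak concrete, here is a minimal Python sketch that models the page as a single shared dictionary. It is not how Extension:Variables is implemented; page_vars, add_row, read_total and render_two_lists are hypothetical names invented for this illustration, and the quantities are made up.

```python
# Hypothetical model: one dict shared by the whole page, like page-wide variables.
page_vars = {}

def add_row(quantity, weight):
    """Mimics the #vardefine call that adds quantity * weight to cweight."""
    page_vars["cweight"] = page_vars.get("cweight", 0) + quantity * weight

def read_total(reset):
    """Mimics {{#var:cweight}}, optionally followed by {{#vardefine:cweight|0}}."""
    total = page_vars.get("cweight", 0)
    if reset:
        page_vars["cweight"] = 0
    return total

def render_two_lists(reset):
    """Two independent lists on the same page, each wanting its own total."""
    page_vars.clear()
    add_row(2, 5)                 # first list contributes 10
    first = read_total(reset)
    add_row(1, 3)                 # second list contributes 3
    second = read_total(reset)
    return first, second

leaky = render_two_lists(reset=False)  # second total still includes the first list
clean = render_two_lists(reset=True)   # each list reports only its own total
```

Without the reset, the second list reports 13 instead of 3, because nothing separates its total from the first list's.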

If you are comfortable with Lua, please use Lua instead of Extension:Variables whenever possible. Don’t worry about the performance cost of jumping in and out of the Lua interpreter; write maintainable (and working!) code. cscott also recommends checking out Extension:ArrayFunctions to get some extra rendering logic while staying “safely” in wikitext.

(Specifically, what ArrayFunctions affords you is the ability to keep a simple list of data that’s easily changeable, and then to write all your display logic around that list. This fulfills the principle of separation of data & display, which is something programmers usually strive for. I learned about this extension for the first time about a week before writing this post, and I can definitely understand the appeal, but I am a bit concerned that it will be somewhat “write-only” as you’ll create a complicated template and then have no idea how any of it works two months later.)

My code quality recommendation regarding variables is thus:

  1. If you legitimately need a global singleton then use variables (specifically this is the case if you want to impose an order on several rows of Cargo stored to the same page)
  2. Else if you can use Lua then use Lua
  3. Else go ahead and use variables

What’s the deal with Parsoid?

You may have heard that Parsoid lets the parser parse articles in any order. This is more or less true, but it’s a bit more complicated than that.

Parsoid will expand the article into many different “fragments” (template expansion, parser function, etc.), parse them independently - reusing the result of any fragments it can - and then compile the results to give you a completed document. There are two situations in which a fragment can be reused:

  • An identical fragment was encountered earlier in the parse; or
  • The parser has access to an identical fragment from an earlier revision. The exact implementation of this “cross-revision fragment reuse” is not yet determined, but it will be implemented in one of these two ways:
        1. As the parser computes the output of each fragment, it caches that output keyed by the wikitext that was used to generate it. These cached values can be reused on subsequent parses of any page.
        2. There will be no global cache, and the parser will do no extra work to cache fragments for global reuse. Instead, when it comes time to read from cache, the parser will look at the current page’s history and pull previously-computed HTML directly out of the previous DOM structure.

If a fragment is being reused, the parser will NOT re-evaluate the fragment, and will instead insert the same HTML that was computed from this fragment last time.

So it’s not necessarily true that the parser will parse articles out of order, but if the parser skips parsing some fragment and instead reuses an older version’s HTML, we get the same effect as out-of-order parsing: a fragment can output HTML that was computed before the code that appears above it on the page was evaluated.

In the following block of code, that’s not an issue:

{{#if:x|hello|world}}

{{#if:|world|hello}}

The two lines of code here do not interact in any way; the parser can evaluate either one first and be fine. And if it reuses one of the statements’ values from a previous parse, that’s also fine; unless the wikitext code inside a block changes, its value will always be the same.

But let’s look at this example instead:

{{#vardefine:x|17}}

{{#var:x}}

What if on the last revision of the page we had written {{#vardefine:x|34}} instead of {{#vardefine:x|17}}? Because the wikitext of the fragment {{#var:x}} is untouched, the parser will insist on reusing the cached HTML of that fragment, and we’ll always see 34 output! We won’t be able to update the value until the cache expires.
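
The stale read can be reproduced with a toy model. This is only a sketch under heavy assumptions: it treats each parser function call as one fragment, caches output keyed by the fragment's wikitext (the first of the two possible implementations described earlier), and uses invented names (fragment_cache, render). Real Parsoid works on a DOM and is far more involved.

```python
import re

fragment_cache = {}  # hypothetical cache: fragment wikitext -> rendered output

def render(fragments, reuse):
    """Render fragments in page order; variables are page-global state."""
    variables = {}
    out = []
    for text in fragments:
        if reuse and text in fragment_cache:
            # Reused fragment: NOT re-evaluated, old output is inserted as-is.
            out.append(fragment_cache[text])
            continue
        define = re.fullmatch(r"\{\{#vardefine:(\w+)\|(.*)\}\}", text)
        read = re.fullmatch(r"\{\{#var:(\w+)\}\}", text)
        if define:
            variables[define.group(1)] = define.group(2)
            result = ""
        elif read:
            result = variables.get(read.group(1), "")
        else:
            result = text
        fragment_cache[text] = result
        out.append(result)
    return "".join(out)

# Previous revision of the page, then the edited revision, both with reuse on.
old = render(["{{#vardefine:x|34}}", "{{#var:x}}"], reuse=True)
new = render(["{{#vardefine:x|17}}", "{{#var:x}}"], reuse=True)
# The same edited revision, parsed linearly with no reuse.
linear = render(["{{#vardefine:x|17}}", "{{#var:x}}"], reuse=False)
```

In this model the edited revision still renders 34, because the unchanged {{#var:x}} fragment is served from the cache; only the linear parse produces 17.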

Possible issues during the transition to Parsoid

The assumptions that fragments can be reused & that parsing can happen out of order are the primary issues with Parsoid and variables. But while Parsoid is under development, there may be some additional problems due to Parsoid relying on the legacy parser for some of its computations. These problems should be resolved by the time Parsoid is considered stable.

What to expect

Most likely, under Parsoid, extensions will have some mechanism to opt out of fragment reuse and force linear parsing. This will come with a performance penalty compared to what Parsoid would otherwise be capable of, but it won’t be a regression compared to current behavior.
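
One way to picture such an opt-out is a page-level check: if any fragment on the page uses a stateful parser function, nothing on that page is reused. The sketch below only illustrates that idea. STATEFUL_PREFIXES and plan_parse are hypothetical names, and whatever mechanism Parsoid eventually ships may look nothing like this.

```python
# Hypothetical names throughout; this is not Parsoid's extension API.
STATEFUL_PREFIXES = ("{{#vardefine:", "{{#var:", "{{#var_final:")

def plan_parse(fragments, cached):
    """Return (fragment, action) pairs, where action is "reuse" or "evaluate".

    If any fragment on the page is stateful, the whole page is evaluated
    linearly and no cached output is reused.
    """
    needs_linear = any(f.startswith(STATEFUL_PREFIXES) for f in fragments)
    plan = []
    for f in fragments:
        if not needs_linear and f in cached:
            plan.append((f, "reuse"))
        else:
            plan.append((f, "evaluate"))
    return plan

# Fragments whose output is assumed to be available from an earlier parse.
cached = {"{{#if:x|hello|world}}", "{{#var:x}}"}

plain_page = plan_parse(["{{#if:x|hello|world}}"], cached)
vars_page = plan_parse(
    ["{{#vardefine:x|17}}", "{{#if:x|hello|world}}", "{{#var:x}}"], cached
)
```

A page without variables keeps the benefit of reuse, while a page that touches variables falls back to evaluating everything in order, which matches today's behavior.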

You should expect a timeline somewhat like this:

  • 1.41 - no change at all!
  • 1.42 - Extension:ParserMigration allows you to test Parsoid in the main namespace, and Extension:DiscussionTools uses Parsoid by default. This shouldn’t impact you unless you go out of your way to make it do so.
  • 1.43? (tentatively) - Using Parsoid for wikitext will be possible in vanilla MediaWiki, but it won’t be the default. Don’t opt into it if you want to use Variables.
  • 1.?? - Parsoid will become the default parser, but the legacy parser will still be bundled with MediaWiki; if you want to use Variables, continue to use the legacy parser.
  • 1.?? - The legacy parser will eventually stop being bundled with MediaWiki, but will instead be available as an extension. WMF can be expected to continue providing security updates as long as there is a community maintainer.
  • 1.?? - Parsoid adds support to disable fragment reuse & to force linear parsing; Variables should be able to invoke this functionality whenever it’s used. Wikis using variables can switch to Parsoid. (Note: Depending on the needs of WMF-supported extensions, this milestone may happen earlier.)
  • 1.?? - Hopefully, Parsoid’s support is universal enough that the legacy parser is no longer required & becomes unmaintained.

Conclusion

Variables are almost certainly safe to keep using, including #var_final, but you may have to delay updating to Parsoid. You’ll permanently lose out on performance benefits that Parsoid will bring if you keep Variables enabled forever, but if you’re happy with your wiki’s current performance, then nothing should change. And “nonlinear parsing” is a bit of a misnomer, but it’s a good enough approximation to what’s going on that if you don’t totally understand fragments, you can roll with that.

Finally, check out cscott’s MWCon presentation “Don’t be a [[Square]]! Keeping up with wikitext changes,” which you can find linked in the MWCon presentation list - it’s a very cool look at planned changes to MediaWiki syntax and into the development philosophy of the Parsoid team!


  1. This warning went through several iterations in March and April 2021, of varying levels of severity. This was by far the scariest version and it should be noted that WMF staff did not create or review this text.

  2. As it turns out, this is not exactly what should be asked. The more accurate question might be, “Will Parsoid support disabling fragment reuse when certain parser functions are used in a page?”


Written by River
River is a developer most at home in MediaWiki and known for building Leaguepedia. She likes cats.

