
Live I18n


I wrote a new MediaWiki extension this past weekend, LiveI18n! It’s a parser function (also available as a built-in Lua function) that gives you an easy way to provide internationalization (i18n) for your wiki’s UI & navigational elements, such as infobox labels. There were already a couple of ways you could do this, but both were kinda bad, for reasons I’ll explain below.

Usage examples

Parser function

With your user language set to pt-br and $wgLiveI18nDefaultLanguageCode set to es:

This should be falling back to pt: {{#live_i18n:en=hello|es=not hello|pt=this is a fallback}}<br>
This should be displaying in the default, pt-br: {{#live_i18n:en=hello|es=not hello|pt=this is a fallback|pt-br=this is the real thing}}<br>
This should say hola because nothing good is available: {{#live_i18n:en=hello|es=hola}}<br>
Nothing should display here: {{#live_i18n:nothing is here!}}<br>
Or here: {{#live_i18n:de=nothing is here!}}

will give the following outputs:

This should be falling back to pt: this is a fallback
This should be displaying in the default, pt-br: this is the real thing
This should say hola because nothing good is available: hola
Nothing should display here:
Or here: 

Lua

Again, with your user language set to pt-br and $wgLiveI18nDefaultLanguageCode set to es:

local p = {}

function p.main(frame)
    local ret = {}
    ret[#ret+1] = 'This should be falling back to pt: ' .. mw.ext.live_i18n.translate{ en = "Hello", es = "Hola", pt = 'this is a fallback' }
    ret[#ret+1] = 'This should be displaying in the default, pt-br: ' .. mw.ext.live_i18n.translate{ en = "Hello", pt = 'this is a fallback', ['pt-br'] = 'this is the real thing' }
    ret[#ret+1] = 'This should say hola because nothing good is available: ' .. mw.ext.live_i18n.translate{ en = 'hello', es = 'hola' }
    ret[#ret+1] = 'Nothing should display here: ' .. mw.ext.live_i18n.translate{ 'nothing is here!' }
    ret[#ret+1] = 'Or here: ' .. mw.ext.live_i18n.translate{ de = 'nothing is here!' }
    return table.concat(ret, '<br>')
end

return p

will give the same outputs, namely:

This should be falling back to pt: this is a fallback
This should be displaying in the default, pt-br: this is the real thing
This should say hola because nothing good is available: hola
Nothing should display here:
Or here: 
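
To make the lookup order in these examples concrete, here is a minimal sketch in plain Lua (no MediaWiki required). It is not the extension's actual code: the `FALLBACKS` table and the `pick_translation` function are hypothetical names made up for illustration, and the order (user language, then its fallbacks, then the configured default, then nothing) is inferred only from the outputs shown above.

```lua
-- Hypothetical fallback table, hard-coded for illustration only.
-- The real extension gets its fallback chain from MediaWiki itself.
local FALLBACKS = {
    ['pt-br'] = { 'pt' },
    pt = { 'pt-br' },
}

-- Try the user's language, then its fallbacks, then the default language.
-- Returns '' when nothing matches, like the empty outputs in the examples.
local function pick_translation(translations, user_lang, default_lang)
    local candidates = { user_lang }
    for _, code in ipairs(FALLBACKS[user_lang] or {}) do
        candidates[#candidates + 1] = code
    end
    candidates[#candidates + 1] = default_lang

    for _, code in ipairs(candidates) do
        local text = translations[code]
        if text ~= nil then
            return text
        end
    end
    return ''
end

-- User language pt-br, default language es, as in the examples above.
print(pick_translation({ en = 'hello', es = 'not hello', pt = 'this is a fallback' }, 'pt-br', 'es')) -- this is a fallback
print(pick_translation({ en = 'hello', es = 'hola' }, 'pt-br', 'es')) -- hola
print(pick_translation({ de = 'nothing is here!' }, 'pt-br', 'es') == '') -- true
```

Note that `en` never wins here unless it is the user's language, one of its listed fallbacks, or the configured default, which matches the "hola" example.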

Why is there a config variable?

Why do I provide $wgLiveI18nDefaultLanguageCode at all, instead of always falling back to $wgLanguageCode?

  • You could choose to use qqx for the former, and give your i18n labels names when they’re untranslated, e.g. cat-color-label. This would be recommended only if you have a private wiki which you know is used by speakers of only 2 or 3 languages, and you know everything will be fully translated.
  • I can’t really think of any other reasonable use case, but I’m generally in favor of extra customization options rather than locking people into could-be-inconvenient defaults.

Alternatives (and why this is better)

MyVariables

You can use MyVariables’s variable {{USERLANGUAGECODE}} to get the current user’s preferred language code. Then, you can use ParserFunctions’s #switch parser function to pick from the available languages and display the correct output to the user. Indeed, this is the approach that I’m currently using on Leaguepedia as of writing this article.

There are some disadvantages to this approach:

  • The syntax is a bit unwieldy: instead of using a purpose-built parser function, you’re switching on a variable.
  • There is no Lua support, so if you’re in Lua, you’re using either frame:callParserFunction() or frame:preprocess(), both of which are not great.
  • You can’t take advantage of fallback languages (e.g. pt and pt-br should fall back to each other preferentially over en) unless you write your own support for them.
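
As a rough illustration of the last two bullets, here is what the switch-on-a-variable approach looks like from Lua. This is a sketch, not code from Leaguepedia: `fake_frame` is a hypothetical stand-in for Scribunto's frame object so the snippet runs outside a wiki, and `switch_label` is an invented name. On a real wiki you would pass the actual frame, and `frame:preprocess` would expand the variable for you.

```lua
-- Hypothetical stand-in for Scribunto's frame object (illustration only).
-- On a real wiki, frame:preprocess expands the wikitext it is given.
local function fake_frame(user_lang)
    return {
        preprocess = function(self, wikitext)
            if wikitext == '{{USERLANGUAGECODE}}' then
                return user_lang
            end
            return wikitext
        end,
    }
end

-- Exact-match lookup, equivalent to switching on {{USERLANGUAGECODE}}.
local function switch_label(frame, translations, default_text)
    local code = frame:preprocess('{{USERLANGUAGECODE}}')
    return translations[code] or default_text
end

local labels = { en = 'hello', es = 'hola', pt = 'this is a fallback' }
print(switch_label(fake_frame('pt'), labels, labels.en))    -- this is a fallback
print(switch_label(fake_frame('pt-br'), labels, labels.en)) -- hello
```

The second call shows the fallback problem: a pt-br user drops straight to the default even though a pt string is sitting right there.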

int magic word

You can also use the int magic word. This approach would have you create one system message per i18n message per language; for example, if the “Cats” infobox has a label called “fur color,” you might create the articles MediaWiki:customi18n-infobox-cats-fur-color, MediaWiki:customi18n-infobox-cats-fur-color/de, etc. The customi18n prefix is there to ensure you don’t have any namespacing conflicts with any other system messages, and then you’d further categorize within the name of the template, and then the message name.
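
The naming scheme is easier to see as code. Here is a small plain-Lua sketch that builds the page titles from the example above. The `slug` and `message_title` helpers are hypothetical, and the slug rules (lowercase, whitespace turned into hyphens) are an assumption based only on the two example titles.

```lua
-- Assumed slug rule: lowercase, runs of whitespace become single hyphens.
-- The parentheses drop gsub's second return value (the match count).
local function slug(text)
    return (text:lower():gsub('%s+', '-'))
end

-- Build the system message page title for one label, optionally in one language.
-- Omitting lang gives the base page; passing it appends a /code subpage.
local function message_title(template, label, lang)
    local title = 'MediaWiki:customi18n-infobox-' .. slug(template) .. '-' .. slug(label)
    if lang ~= nil then
        title = title .. '/' .. lang
    end
    return title
end

print(message_title('Cats', 'fur color'))       -- MediaWiki:customi18n-infobox-cats-fur-color
print(message_title('Cats', 'fur color', 'de')) -- MediaWiki:customi18n-infobox-cats-fur-color/de
```

In wikitext, the label would then be displayed with {{int:customi18n-infobox-cats-fur-color}}, and MediaWiki picks the language subpage for you.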

Because only sysops can edit the MediaWiki namespace, and also because editing this many separate pages is a bit annoying, you might want to create an infrastructure that can read from Lua modules ending in /i18n and then update the corresponding system messages via a crontab script running on a per-minute basis, from an account with the correct permissions. Because you’re namespacing to messages prefixed by customi18n, this shouldn’t be any more “dangerous” than using the MyVariables approach in the first place, even though you’re effectively allowing anyone to edit a restricted namespace.

Of course there are also disadvantages to this approach:

  • Anything that requires a crontab is a huge amount of infrastructure to set up unless you already have that in place (and lacking that setup, it’s a giant pain to edit the individual pages).
  • Crontabs can crash.
  • The up-to-a-minute delay can be really annoying when trying to see changes; also people may not be aware that the delay exists, and think that their edits aren’t working.

I considered using this approach for my i18n; the first bullet isn’t an issue for me, but the third certainly is, and so I decided to go for the MyVariables approach.

Other?

I’m not aware of any other pre-existing way to do i18n on a single wiki (short of the Translate extension, which is overkill if all you want to translate is navigational/interface elements, and not your full content). But maybe there is another way to do it. In any case, I’m pretty sure having a dedicated parser function with Lua support is ideal, and that’s why I wrote LiveI18n!

Development

It took me probably 7-8 hours in total to write this extension. All I knew going in was that I was going to use $parser->getOptions()->getUserLang() to get the user language code. This is not an exaggeration by any means; this is literally the only thing I knew about what had to be done, but I figured the rest of it would be “easy enough to figure out.” And I guess I was more or less right, considering that the extension seems to work.

(Incidentally, this is why I knew even that much - I had assumed that MyVariables and the int magic word used the same way of getting the user language when I first implemented my i18n. Little did I know, MyVariables actually used a different approach and disabled parser cache every time you called it! Uh oh. That was a fun surprise the first time I turned that module on. So I read a bunch of source code, figured out what was going on, and then we patched MyVariables & sent the patch upstream as well.)

My general approach was to look at extensions that I knew were doing similar things to what I needed to do and then copy the general approach of what they were doing. For example, I started by copying my own first extension, CustomLogs’s extension.json file (at least the parts of it I knew I’d need, like, uh, name and author and having a config variable).

Extensions that I used (this is probably not a complete list):

  • Arrays (it has parser functions!)
  • ParserFunctions (it also has parser functions!)
  • Cargo (it has a Lua function!)
  • Scribunto (it has…lots of Lua functions)
  • I18nTags (I went on a super long tangent trying to set up CLDR for my language fallbacks before, umm, reading the docs and realizing that I wanted a built-in LanguageFallback…but this was only after I literally could not get CLDR to work, so…good thing I could not get CLDR to work I guess. Incidentally, this is why this extension requires 1.35+)
  • ExtJSBase (I have no idea what this is, but I searched GitHub in the mediawiki org (I did a lot of that) for uses of LanguageFallback to figure out how I was supposed to get one of these things, and that was what had an example!)

I also did a lot of print-debugging by returning things that were not my actual output but rather some construction based on intermediate objects when my code wasn’t working. Now that it’s done, I kinda sorta maybe know what I’m doing. A bit. Haha. In any case, it was a super fun learning experience, and I think this extension is legitimately super useful, so definitely some of the most productive 7-8 hours I’ve spent this year.

Next, I want to write an extension to generate documentation pages for JSON contentmodel pages….

WRITTEN BY
River
River is a developer most at home in MediaWiki and known for building Leaguepedia. She likes cats.

