This page looks best with JavaScript enabled

Optimizing Cargo - query 2

 ·  ☕ 13 min read

Here’s the second of three Cargo queries I recently optimized.

(Note: All code in this post was originally written by me for Leaguepedia and is licensed under CC BY-SA 3.0.)

The original query

Here’s the original query:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
function h.query(team)
	local tbl = {
		tables = {
			'MatchSchedule=MS',
			'MatchScheduleGame=MSG',
			'TeamRedirects=MTR1',
			'TeamRedirects=MTR2',
			'TeamRedirects=GTR1',
			'TeamRedirects=GTR2',
		},
		join = {
			'MS.MatchId=MSG.MatchId',
			'MS.Team1=MTR1.AllName',
			'MS.Team2=MTR2.AllName',
			'MSG.Blue=GTR1.AllName',
			'MSG.Red=GTR2.AllName',
		},
		fields = {
			-- if game is null then we'll assume blue and red are done properly in the match
			'CONCAT(CASE WHEN MS.BestOf = "1" AND MSG.Blue IS NOT NULL THEN MSG.Blue ELSE MS.Team1 END) = Team1',
			'CONCAT(CASE WHEN MS.BestOf = "1" AND MSG.Red IS NOT NULL THEN MSG.Red ELSE MS.Team2 END) = Team2',
			'CONCAT(CASE WHEN MS.BestOf = "1" AND GTR1._pageName IS NOT NULL THEN GTR1._pageName ELSE MTR1._pageName END) = Team1Page',
			'CONCAT(CASE WHEN MS.BestOf = "1" AND GTR2._pageName IS NOT NULL THEN GTR2._pageName ELSE MTR2._pageName END) = Team2Page',
			'MS.DateTime_UTC=UTC',
			'MS.OverviewPage',
			'CONCAT(CASE WHEN MS.BestOf = "1" AND MSG.Winner IS NOT NULL THEN MSG.Winner ELSE MS.Winner END) = Winner[number]',
			'MS.Tab',
			'MS.Round',
			
			-- the following 2 cases auto fallback to MS when MSG is null
			'CONCAT(CASE WHEN MS.BestOf = "1" AND MSG.Winner = "1" THEN "1" WHEN MS.BestOf = "1" AND MSG.Winner = "2" THEN "0" ELSE MS.Team1Score END) = Team1Score [number]',
			'CONCAT(CASE WHEN MS.BestOf = "1" AND MSG.Winner = "1" THEN "0" WHEN MS.BestOf = "1" AND MSG.Winner = "2" THEN "1" ELSE MS.Team2Score END) = Team2Score [number]',
			'MS.FF [number]',
			'MS.ShownName',
			'MS.BestOf [number]',
			'MS._pageName=DataPage',
		},
		where = ('(MTR1._pageName="%s" OR MTR2._pageName="%s") AND MS.Winner IS NOT NULL'):format(team, team),
		orderBy = 'DateTime_UTC DESC',
		groupBy = 'MS.MatchId',
		limit = 200,
	}
	return tbl
end

To be clear, this query is somewhat loaded with (what should be) application logic in its list of fields that it has no right to be performing, and my life would be a lot easier if I just did the application logic in the right place. This query is not great, and I don’t really recommend cramming this much CASE/WHEN logic into a query. If I were forced to rewrite this entire module from scratch, I would not attempt to reconstruct this query; instead, I would get a simple list of fields, perform the CASE/WHEN logic (as a bunch of if statements) in application logic following the query, and have a much easier time of things.

That said, there is never a need to rewrite things from scratch, and when you refactor you should generally refactor incrementally. In this case (haha) I was able to fix the query with a very small change and I just left things as they were otherwise. The CASE/WHEN logic did get a bit more verbose but it’s ok.

So, let’s break this down. What is this query doing?

There are two tables we are getting information from: MatchSchedule (MS) and MatchScheduleGame (MSG). The first one is, as its name suggests, matches with timestamps (aka a schedule of matches). The second one has one row per game per match.

Quick note about League of Legends: There’s two sides in each game, like white and black in chess. In League, they’re called blue and red. Just like in chess, in each game, which team is on blue side and which team is on red side is something interesting.

Which team is blue & which is red is stored in the MSG table and can’t be stored in MS, since it changes each game. If a series is best-of-one (BO1), then MS & MSG are 1:1 with each other, and I can proxy MS.Team1 and MS.Team2 with MSG.Blue & MSG.Red (in some order); however, if the series was longer than one game, then there are many entries of MSG and this substitution no longer works.

That’s what all of the verbose CASE/WHEN logic above is handling: If the series is a BO1, then I want to get information from MSG to print in my one-row-per-match table, because that’s more interesting, but if it’s BO2/BO3/BO5, then I need to stick to the MS values.

There’s one more non-complicated but extra-verbose issue: Sometimes MSG data is simply omitted from an entire tournament because we don’t have game-specific data available. In this case, the values will be null, so I need to add an AND NOT NULL check to everything.

What’s wrong with the query?

Because there’s 4 copies of TeamRedirects, a table with almost 4000 rows at the time of this article’s publication, as well as an OR in the where condition, this query was causing performance issues. It wasn’t nearly as bad as the query mentioned last time, but I still wanted to fix it.

How to fix?

There were a couple options I could do to split up the query into multiple parts:

  • Split up on the OR in the where - first, query rows where the team is MS.Team1 (via TeamRedirects of course, which accounts for teams renaming) and then query rows where the team is MS.Team2. These rows should be non-overlapping because a team can’t play vs itself. Because of the limit that I’m displaying, a bit of new application logic would be needed; I’d have to query 200 in each one individually, sort, then display only the first 200 rows of the combined result, but otherwise this shouldn’t be too bad.
  • Query MS & MSG data separately. This would require a lot of new application logic because the rows WOULD be overlapping, i.e. rows would require data from both queries. But I could grab just the MS data, get the MatchIds, then grab the MSG data from those specific MatchIds, splice it in, and go from there.
  • Delete the code completely. lolwut? Yeah turns out the schedule history pages are barely used by anyone, this query sucks to maintain, and if it was going to be a ton of work to fix the performance issue, it wasn’t really worth it to keep these around. Just throwing this out there - if you think maybe your users aren’t using a high-effort feature, ask them if they’d mind if you just canned it!

Instead I did none of these. I fixed what amounted little more than a bug.

A major design oversight

Where are all of these copies of TeamRedirects being used? Let’s look:

MTR1 & MTR2

First, in these fields:

1
2
'CONCAT(CASE WHEN MS.BestOf = "1" AND GTR1._pageName IS NOT NULL THEN GTR1._pageName ELSE MTR1._pageName END) = Team1Page',
'CONCAT(CASE WHEN MS.BestOf = "1" AND GTR2._pageName IS NOT NULL THEN GTR2._pageName ELSE MTR2._pageName END) = Team2Page',

And also in the where condition:

1
where = ('(MTR1._pageName="%s" OR MTR2._pageName="%s") AND MS.Winner IS NOT NULL'):format(team, team)

Yes, we definitely need these copies of TeamRedirects. It’s in a field and also in the where condition; the second one at least is non-negotiable.

GTR1 & GTR2

They show up only in these fields:

1
2
'CONCAT(CASE WHEN MS.BestOf = "1" AND GTR1._pageName IS NOT NULL THEN GTR1._pageName ELSE MTR1._pageName END) = Team1Page',
'CONCAT(CASE WHEN MS.BestOf = "1" AND GTR2._pageName IS NOT NULL THEN GTR2._pageName ELSE MTR2._pageName END) = Team2Page',

Hm, these are the same fields as before….And wait, the teams in GameSchedule are the SAME as the teams in MatchSchedule (we know this via business logic). Could we just…reuse the MTR1 & MTR2 values here?? Yes, of course we can!

The solution

Remember that the entire point of the CASE/WHEN logic is that we don’t know which of the following two equalities holds, just that one of them holds:

  • MS.Team1 = MSG.Blue and MS.Team2 = MSG.Red
  • MS.Team1 = MSG.Red and MSG.Team2 = MSG.Blue

So we do have to expand our CASE/WHEN by one additional clause to check which one of these conditions is true, then “fill in the blank” with the appropriate value of MTR1 or MTR2, but at that point we can totally ditch two of our four copies of TeamRedirects entirely for free, with zero additional application logic and no other refactoring!

The new query

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
function h.query(team)
	local tbl = {
		tables = {
			'MatchSchedule=MS',
			'MatchScheduleGame=MSG',
			'TeamRedirects=MTR1',
			'TeamRedirects=MTR2',
		},
		join = {
			'MS.MatchId=MSG.MatchId',
			'MS.Team1=MTR1.AllName',
			'MS.Team2=MTR2.AllName',
		},
		fields = {
			-- if game is null then we'll assume blue and red are done properly in the match
			'CONCAT(CASE WHEN MS.BestOf = "1" AND MSG.Blue IS NOT NULL THEN MSG.Blue ELSE MS.Team1 END) = Team1',
			'CONCAT(CASE WHEN MS.BestOf = "1" AND MSG.Red IS NOT NULL THEN MSG.Red ELSE MS.Team2 END) = Team2',
			'CONCAT(CASE WHEN MS.BestOf = "1" AND MSG.Blue = MS.Team1 THEN MTR1._pageName WHEN MS.BestOf = "1" AND MSG.Red = MS.Team1 THEN MTR2._pageName ELSE MTR1._pageName END) = Team1Page',
			'CONCAT(CASE WHEN MS.BestOf = "1" AND MSG.Red = MS.Team1 THEN MTR1._pageName WHEN MS.BestOf = "1" AND MSG.Blue = MS.Team1 THEN MTR2._pageName ELSE MTR2._pageName END) = Team2Page',
			'MS.DateTime_UTC=UTC',
			'MS.OverviewPage',
			'CONCAT(CASE WHEN MS.BestOf = "1" AND MSG.Blue IS NOT NULL THEN MSG.Winner ELSE MS.Winner END) = Winner[number]',
			'MS.Tab',
			'MS.Round',
			
			'MSG.Blue',
			'MSG.Red',
			
			-- the following 2 cases auto fallback to MS when MSG is null
			'CONCAT(CASE WHEN MS.BestOf = "1" AND MSG.Winner = "1" THEN "1" WHEN MS.BestOf = "1" AND MSG.Winner = "2" THEN "0" ELSE MS.Team1Score END) = Team1Score [number]',
			'CONCAT(CASE WHEN MS.BestOf = "1" AND MSG.Winner = "1" THEN "0" WHEN MS.BestOf = "1" AND MSG.Winner = "2" THEN "1" ELSE MS.Team2Score END) = Team2Score [number]',
			'MS.FF [number]',
			'MS.ShownName',
			'MS.BestOf [number]',
			'MS._pageName=DataPage',
		},
		where = ('(MTR1._pageName="%s" OR MTR2._pageName="%s") AND MS.Winner IS NOT NULL'):format(team, team),
		orderBy = 'DateTime_UTC DESC',
		groupBy = 'MS.MatchId',
		limit = 200,
	}
	return tbl
end

As it turned out, this was enough to fix any performance issues we had and stopped the query from triggering any monitoring alerts, no further refactoring or any application logic changes needed! Yay!

Conclusion

Sometimes, instead of doing a huge refactor, you might have an easy optimization to make that can save your query’s performance without any major changes needed.

Also, some queries shouldn’t have existed in the first place, and some features should be sunset, but in the end this one was too easy to fix for that to happen to it.

And yes, I know TeamRedirects is using the _pageName field! It really shouldn’t, but I have not yet gotten around to adding an OverviewPage field to this table, like I have for PlayerRedirects. It’s on my todo list I promise.

Full module code

As usual, I’m including the full code of Module:TeamSchedule in case you want to read it:

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
local util_args = require('Module:ArgsUtil')
local util_cargo = require('Module:CargoUtil')
local util_esports = require("Module:EsportsUtil")
local util_map = require("Module:MapUtil")
local util_time = require('Module:TimeUtil')
local util_vars = require("Module:VarsUtil")
local m_team = require('Module:Team')
local lang = mw.getLanguage('en')

local COLUMNS = { 'Date', 'Event', 'Round', 'Team 1', 'Team 2', 'Side', 'Result', 'Score' }
local SIDES = { [1] = 'Blue', [2] = 'Red' }

local DEBUG = false

local h = {}

local p = {}
function p.main(frame)
	local args = util_args.merge()
	h.castArgs(args)
	h.setDebugStatus(args)
	h.setColumnsBasedOnDebugStatus()
	local team = args.team and m_team.teamlinkname(args.team) or mw.title.getCurrentTitle().baseText
	local query = h.query(team)
	local result = util_cargo.queryAndCast(query)
	local processed = h.processData(result, team)
	return h.printTable(processed)
end

function h.castArgs(args)
	args.debug = util_args.castAsBool(args.debug)
end

function h.setDebugStatus(args)
	DEBUG = args.debug
end

function h.setColumnsBasedOnDebugStatus()
	if not DEBUG then return end
	COLUMNS[#COLUMNS+1] = 'Team1'
	COLUMNS[#COLUMNS+1] = 'Team2'
	COLUMNS[#COLUMNS+1] = '_side'
	COLUMNS[#COLUMNS+1] = 'Winner'
	COLUMNS[#COLUMNS+1] = 'Team1Page'
	COLUMNS[#COLUMNS+1] = 'Team2Page'
end

function h.query(team)
	local tbl = {
		tables = {
			'MatchSchedule=MS',
			'MatchScheduleGame=MSG',
			'TeamRedirects=MTR1',
			'TeamRedirects=MTR2',
		},
		join = {
			'MS.MatchId=MSG.MatchId',
			'MS.Team1=MTR1.AllName',
			'MS.Team2=MTR2.AllName',
		},
		fields = {
			-- if game is null then we'll assume blue and red are done properly in the match
			'CONCAT(CASE WHEN MS.BestOf = "1" AND MSG.Blue IS NOT NULL THEN MSG.Blue ELSE MS.Team1 END) = Team1',
			'CONCAT(CASE WHEN MS.BestOf = "1" AND MSG.Red IS NOT NULL THEN MSG.Red ELSE MS.Team2 END) = Team2',
			'CONCAT(CASE WHEN MS.BestOf = "1" AND MSG.Blue = MS.Team1 THEN MTR1._pageName WHEN MS.BestOf = "1" AND MSG.Red = MS.Team1 THEN MTR2._pageName ELSE MTR1._pageName END) = Team1Page',
			'CONCAT(CASE WHEN MS.BestOf = "1" AND MSG.Red = MS.Team1 THEN MTR1._pageName WHEN MS.BestOf = "1" AND MSG.Blue = MS.Team1 THEN MTR2._pageName ELSE MTR2._pageName END) = Team2Page',
			'MS.DateTime_UTC=UTC',
			'MS.OverviewPage',
			'CONCAT(CASE WHEN MS.BestOf = "1" AND MSG.Blue IS NOT NULL THEN MSG.Winner ELSE MS.Winner END) = Winner[number]',
			'MS.Tab',
			'MS.Round',
			
			'MSG.Blue',
			'MSG.Red',
			
			-- the following 2 cases auto fallback to MS when MSG is null
			'CONCAT(CASE WHEN MS.BestOf = "1" AND MSG.Winner = "1" THEN "1" WHEN MS.BestOf = "1" AND MSG.Winner = "2" THEN "0" ELSE MS.Team1Score END) = Team1Score [number]',
			'CONCAT(CASE WHEN MS.BestOf = "1" AND MSG.Winner = "1" THEN "0" WHEN MS.BestOf = "1" AND MSG.Winner = "2" THEN "1" ELSE MS.Team2Score END) = Team2Score [number]',
			'MS.FF [number]',
			'MS.ShownName',
			'MS.BestOf [number]',
			'MS._pageName=DataPage',
		},
		where = ('(MTR1._pageName="%s" OR MTR2._pageName="%s") AND MS.Winner IS NOT NULL'):format(team, team),
		orderBy = 'DateTime_UTC DESC',
		groupBy = 'MS.MatchId',
		limit = 200,
	}
	return tbl
end

function h.processData(result, team)
	util_map.rowsInPlace(result, h.processRow, team)
	return result
end

function h.processRow(row, team)
	-- coalesce in THIS fields list would just be, too much, so do it here
	row.Team1Page = row.Team1Page or row.Team1
	row.Team2Page = row.Team2Page or row.Team2
	row._side = h.getSide(team, row.Team1Page, row.Team2Page, row.DataPage)
	local winner = h.getWinner(row.Winner, row._side)
	local result = winner == 'Draw' and 'Draw' or winner and 'Win' or 'Loss'
	local isBO1 = h.isBO1(row.Team1Score, row.Team2Score)
	util_esports.setScoreDisplays(row)
	row.Date = util_time.dateInLocal(row.UTC)
	row.Event = ('[[%s|%s]]'):format(row.OverviewPage, row.ShownName or '')
	row.Round = row.Tab or row.Round or ''
	row['Team 1'] = m_team.rightmediumlinked(row.Team1)
	row['Team 2'] = m_team.rightmediumlinked(row.Team2)
	row.Side = isBO1 and SIDES[row._side] or 'Series'
	row.Result = result
	row.Score = h.getScore(row)
	row.classes = {
		('teamschedule-side-%s'):format(isBO1 and row._side or 'Series'),
		('teamschedule-result-%s'):format(result),
		Side = 'standings-mhside' .. (isBO1 and row._side or 'Series'),
		Result = 'standings-mhwl' .. (winner == 'Draw' and '' or winner and 1 or 0)
	}
	return row
end

function h.getSide(team, team1, team2, dataPage)
	if team1 and not team2 then return 1 end
	if team2 and not team1 then return 2 end
	if team1 and mw.ustring.lower(team1) == mw.ustring.lower(team) then
		return 1
	end
	if team2 and mw.ustring.lower(team2) == mw.ustring.lower(team) then
		return 2
	end
	error(("team %s seems not to be in match - page %s, %s vs %s"):format(
		team,
		dataPage,
		team1,
		team2
	))
end

function h.getWinner(winner, side)
	if winner == 0 then
		return 'Draw'
	end
	return winner == side
end

function h.isBO1(score1, score2)
	score1 = score1 or 0
	score2 = score2 or 0
	local max = math.max(score1, score2)
	local total = score1 + score2
	return max == 1 and total == 1
end

function h.getScore(row)
	return ('%s - %s'):format(
		row.Team1ScoreDisplay or 'Missing Data',
		row.Team2ScoreDisplay or 'Missing Data'
	)
end

function h.printTable(processed)
	local tbl = mw.html.create('table')
		:addClass('wikitable hoverable-rows')
		:attr('id', 'teamschedule-history-table')
	h.addHeading(tbl)
	for _, row in ipairs(processed) do
		local tr = tbl:tag('tr')
		for _, class in ipairs(row.classes) do
			tr:addClass(class)
		end
		for _, col in ipairs(COLUMNS) do
			tr:tag('td')
				:wikitext(row[col])
				:addClass(row.classes[col])
		end
	end
	return tbl
end

function h.addHeading(tbl)
	local tr = tbl:tag('tr')
	for _, col in ipairs(COLUMNS) do
		tbl:tag('th'):wikitext(col)
	end
	return
end

return p
Share on

river
WRITTEN BY
River
River (RheingoldRiver) is a MediaWiki developer and the manager of Leaguepedia. She likes cats.


What's on this Page