Some quick notes on how to get the GP score tables into a useful shape, in order to be able to compute something like the below top 10 by best four results in the first five GPs (using preliminary results for GP 5). So I don’t have to figure this out again next time, and so you can play along at home.
rank | name | points ------+----------------------------+-------- 1 | Ken Endo | 3706.7 2 | Ulrich Voigt | 3285.9 3 | Hideaki Jo | 3235.5 4 | Kota Morinishi | 3102.1 5 | Nikola Zivanovic | 3054.5 6 | Robert Vollmert | 2905.2 7 | Will Blatt | 2896.7 8 | Prasanna Seshadri | 2834.7 9 | Yuki Kawabe | 2834.5 10 | Michael Ley | 2806.7
The scripts used to do this are available here.
First part is getting the HTML tables into a more useful shape; I wanted CSV. Copy and paste into a spreadsheet works, but gets tedious quickly. The following monstrosity does the job for me (using the Mac OS X command line, so your awk
might well behave differently in subtle ways). EDIT: small improvements.
$ curl http://gp.worldpuzzle.org/content/preliminary-results-wpf-gp-puzzle-5-competitive-section \ | egrep '(<td|<tr)' \ | tr -d '\t\n\r' | tr ',' ';' \ | sed 's/]*>//g' \ | awk ' BEGIN { RS=" *</tr> *" FS=" *</td> *" OFS="," } /[0-9]\./ { $1 = $1 print $0 } ' > gp5.csv
And do the same for GPs 1 through 4. Then we throw this into postgres.
$ createdb gp $ psql gp # create table raw1 (rk text, name text, country text, nick text, points double precision, sent int, ok int, time interval); CREATE TABLE # create table raw2 (like raw1); ... # copy raw1 from '/tmp/gp1.csv' (FORMAT 'csv'); COPY 348 ...
Then we combine the raw tables.
# create view combined as select *, 1 as round from raw1 union select *, 2 as round from raw2 union select *, 3 as round from raw3 union select *, 4 as round from raw4 union select *, 5 as round from raw5;
This is enough to get a total top 10.
# select name, sum(points) as points, count(*) as played from combined group by name order by points desc limit 10; name | points | played ------------------+--------+-------- Ken Endo | 4466.2 | 5 Nikola Zivanovic | 3724.5 | 5 Hideaki Jo | 3651.5 | 5 Kota Morinishi | 3562.1 | 5 Yuki Kawabe | 3447.5 | 5 Will Blatt | 3404.7 | 5 Robert Vollmert | 3387.2 | 5 James McGowan | 3355.1 | 5 Michael Ley | 3299.7 | 5 Ulrich Voigt | 3285.9 | 4
Eventually, two results will be dropped, so for an intermediate state, let’s drop one.
# create view ranked as select *, rank() over (partition by name order by points desc) as prank from combined;
Now we can get, for example, the current leader’s results ordered by score.
# select round, points, time from ranked where nick = 'EKBM' order by prank; round | points | time -------+--------+---------- 5 | 991 | 00:57:36 3 | 942.3 | 00:50:40 4 | 901.2 | 00:56:53 2 | 872.2 | 00:53:53 1 | 759.5 | 01:18:49
And an overall leaderboard considering the best four results so far.
# create view best4 as select name, sum(points) as points, count(*) as played from ranked where prank <= 4 group by name; # select rank() over (order by points desc) as rank, name, points, played from best4 order by points desc; rank | name | points | played ------+----------------------------+--------+-------- 1 | Ken Endo | 3706.7 | 4 2 | Ulrich Voigt | 3285.9 | 4 3 | Hideaki Jo | 3235.5 | 4 4 | Kota Morinishi | 3102.1 | 4 5 | Nikola Zivanovic | 3054.5 | 4 6 | Robert Vollmert | 2905.2 | 4 7 | Will Blatt | 2896.7 | 4 8 | Prasanna Seshadri | 2834.7 | 4 9 | Yuki Kawabe | 2834.5 | 4 10 | Michael Ley | 2806.7 | 4