Whatfettle

One CSV, thirty stories: 16. Mapination

This is day 16 of One CSV, 30 stories, a series of articles exploring price paid data from the Land Registry found on GOV.UK. The code for this and the other articles is available as open source from GitHub.

Yesterday we made a map with the total volume of transactions over 20 years. I wanted to see how that distribution changed over time. A spot of knife-and-forking:

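# ⋯ is shown here in place of a literal tab character
cut -d'⋯' -f2,3 data/pp.tsv |    # keep the date and postcode columns
    sed -e 's/ //' |             # remove the first space on each line
    awk '$2' |                   # drop rows where the second field is empty
    sort |
    uniq -c |                    # count identical date/postcode rows
    sort -rn |
    sed -e 's/^ *//' -e 's/  */⋯/' -e 's/ *$//' |    # trim, then tab-separate the count
    sort -k2 > daily-postcodes.tsv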

gives a count of the number of transactions for each postcode on each date.
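To illustrate the shape of that output, here is a minimal sketch of the counting step on its own. The function name `daily_counts` and the sample rows are made up for the example, and it uses awk rather than sed to tab-separate the count, so treat it as an approximation of the pipeline above rather than a copy of it.

```shell
# Hypothetical helper: reads "date<TAB>postcode" rows on stdin and
# prints "count<TAB>date<TAB>postcode", ordered by date then postcode.
daily_counts() {
    LC_ALL=C sort |
        uniq -c |
        awk '{ printf "%s\t%s\t%s\n", $1, $2, $3 }' |
        LC_ALL=C sort -k2
}

# Made-up sample rows, not real price-paid data.
printf '2007-01-03\tZZ99ZZ\n2007-01-02\tAA11AA\n2007-01-02\tAA11AA\n' | daily_counts
```

With these three invented rows, the first output line carries a count of 2 for AA11AA on 2007-01-02 and the second a count of 1 for ZZ99ZZ on 2007-01-03.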


Making a version of the map for each of the days in the price-paid dataset requires an awfully large number of postcode lookups. Rather than sorting and merging individual files 7181 times, I elected to write some code to read the OS OpenData™ Code-Point® data into a Perl hash table:

# Read the Code-Point postcodes into a hash keyed by postcode
my $geocodes = "data/codepo_gb.tsv";
my %postcode = ();
open my $file, "<", $geocodes or die "unable to open $geocodes: $!";
while (my $line = <$file>) {
    chomp $line;
    my ($postcode, $easting, $northing) = split /\t/, $line;
    $postcode{$postcode} = { easting => $easting, northing => $northing };
}
close $file;

We can then use this to look up the easting and northing and draw a circle for each postcode:

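my $c = $postcode{$p->{postcode}};
# scale grid coordinates to pixels
my $x = $width * $c->{easting} / $max_easting;
# flip y: northings increase upwards, image rows increase downwards
my $y = $height - ($height * $c->{northing} / $max_northing);
# the circle grows with the number of transactions
my $size = $p->{count};
printf($fp "circle %d,%d,%d,%d\n", $x, $y, $x+$size, $y+$size);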
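A worked example may make the scaling easier to follow. This sketch redoes the same arithmetic in awk rather than Perl. The helper name `to_pixel`, the 700 by 1300 pixel canvas and the grid limits of 700000 m and 1300000 m are all assumptions chosen for illustration, not values taken from the script.

```shell
# Hypothetical helper mirroring the scaling in the Perl snippet.
# Arguments: width height max_easting max_northing easting northing
to_pixel() {
    awk -v w="$1" -v h="$2" -v me="$3" -v mn="$4" -v e="$5" -v n="$6" 'BEGIN {
        x = int(w * e / me)          # eastings grow left to right
        y = int(h - (h * n / mn))    # northings grow upwards, image rows downwards
        print x, y
    }'
}

# Assumed canvas and grid limits; the point is roughly central London.
to_pixel 700 1300 700000 1300000 530000 180000    # prints "530 1120"
```

A point at the grid origin lands in the bottom-left corner of the image (`0 1300` here), which is why the northing term is subtracted from the height.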

The complete script generates an individual image for each day. Here are the 365 images for 2007:

Daily volume of Land Registry transactions by postcode 2007

Stringing these individual images together using gifsicle gives a rather large animated gif:
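The exact gifsicle invocation isn't shown, so the following is only a sketch of how the frames might be combined. The wrapper name `animate`, the delay of 10 hundredths of a second and the file names in the usage comment are assumptions, as is the premise that the daily images are already GIFs.

```shell
# Hypothetical helper: animate <output.gif> <frame.gif>...
animate() {
    out=$1
    shift
    # --delay is in hundredths of a second; --loopcount=forever repeats endlessly
    gifsicle --delay=10 --loopcount=forever "$@" > "$out"
}

# Usage, assuming the daily frames are GIFs whose names sort by date:
#   animate 2007.gif images/2007-*.gif
```

Because the shell expands the glob in sorted order, date-stamped file names keep the frames in calendar order.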

If you are minded, you can upload these to gifprint.com to make a flip book. A partial success:

The daily images are quite noisy and should benefit from some polishing, so more iteration on this hack tomorrow.