28 September 2016

Opening up the Ministry for the Environment data and web-scraping the 2015 free allocation of emission units

Let's look at the latest data on the very generous free give-aways of emission units to emitters made by the New Zealand Ministry for the Environment.
N.B. Update on 10 December 2016. The allocation decisions have moved to the web page of the Environmental Protection Authority.

The Environmental Protection Authority now hosts the 2015 Industrial Allocation Decisions that show the final free allocation of emission units to emitters for 2015 under the New Zealand Emissions Trading Scheme.

The New Zealand Ministry for the Environment no longer hosts the unit allocation data and the old link returns an Access Denied page.

I looked at the 2010 to 2014 data in my post Opening up the data on emissions units in the NZ emissions trading scheme. So in this post I will repeat my steps in web-scraping the freebie emission unit data into a sensible open-data format (but with the links updated to the EPA).

The url of the old Ministry for the Environment web page is http://www.mfe.govt.nz/climate-change/reducing-greenhouse-gas-emissions/new-zealand-emissions-trading-scheme/participatin-4

The url of the EPA web page is http://www.epa.govt.nz/e-m-t/taking-part/Industrial-allocations/allocations-decisions/Pages/decisions-2010.aspx. And unfortunately, the Google sheet 'scrape the table' script does not seem to work with the EPA page.

Go to Google and open a new Google sheet.

Following the tip from the School of Data Liberating HTML Data Tables, enter this text in cell A1 of the Google sheet.


Add the url of the Ministry for the Environment's free allocation web-page between the double speech marks so you have this exact text in cell A1.


It was a good thing that I kept a screenshot to show that it worked perfectly! We now have a Google sheet of the 2015 free unit allocation to NZ emissions trading scheme emitters.

I have saved it as NZETS-2015-final-allocations-for-eligible-activities.
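For anyone who would rather use a script than a spreadsheet, the same "pull a table out of a web page" step can be sketched in Python with only the standard library. This is a rough sketch and not what I actually ran: the names TableScraper and extract_table are made up for illustration, and the sample HTML uses placeholder names and numbers, not the real allocation data.

```python
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collect every <table> in a page as a list of rows of cell text.

    Assumption: tables are not nested inside each other.
    """

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of rows per <table> seen so far
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def extract_table(html, index=1):
    """Return the index-th table on the page, counting from 1."""
    scraper = TableScraper()
    scraper.feed(html)
    return scraper.tables[index - 1]


# Placeholder page: invented names and numbers, for illustration only.
SAMPLE_HTML = """
<table>
  <tr><th>Activity / applicant</th><th>Final allocation</th></tr>
  <tr><td>Example activity A*</td><td></td></tr>
  <tr><td>Example Company One Ltd</td><td>1,000</td></tr>
</table>
"""

rows = extract_table(SAMPLE_HTML, 1)
```

For a live page, the HTML string would come from a download (for example with urllib.request) rather than from a constant, and there is no guarantee that the EPA page's markup is as simple as this sample.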

However, the data does not have a "tidy" structure, where each variable is a column and each observation is a row (Wickham, Hadley. "Tidy Data", Journal of Statistical Software [Online], Volume 59, Issue 10 (12 September 2014)).

The first column includes both industry names and types of industries classified by the type of emissions the industry produces. And lots of asterisks. A tidy format would have these attributes (or variables) as separate columns so that each company/emitter would have a row each.
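The reshaping described above can be sketched in a few lines of Python. This is a minimal sketch under my own assumptions, not a record of the steps I took: it assumes that a row with an empty units cell is an activity heading that applies to the company rows below it, and the function names tidy_rows and to_csv and the sample rows (placeholder names and numbers) are invented for illustration.

```python
import csv
import io


def tidy_rows(raw_rows):
    """Turn (first column, units) pairs into one record per applicant.

    Assumption: a row with no units is an activity heading, and it
    applies to every applicant row until the next heading.
    """
    tidy = []
    activity = None
    for name, units in raw_rows:
        name = name.replace("*", "").strip()  # drop footnote asterisks
        if not units.strip():
            activity = name  # heading row: remember it, emit nothing
            continue
        tidy.append({
            "activity": activity,
            "applicant": name,
            "units": int(units.replace(",", "")),  # "1,000" -> 1000
        })
    return tidy


def to_csv(records):
    """Serialise the tidy records as comma-separated values text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["activity", "applicant", "units"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Placeholder rows: invented names and numbers, for illustration only.
raw = [
    ("Example activity A*", ""),
    ("Example Company One Ltd", "1,000"),
    ("Example Company Two Ltd**", "2,500"),
    ("Example activity B", ""),
    ("Example Company Three Ltd", "300"),
]

records = tidy_rows(raw)
csv_text = to_csv(records)
```

Each applicant now has a row of its own, with the activity repeated in a separate column.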

I used a programme called Open Refine (which is also at Github) to wrangle the data into tidy format and to save it as a comma-separated values file, which is in this Google sheet NZETS-2015-final-allocations-for-eligible-activities. It's a bit fiddly using Open Refine, and I have not documented the steps, so I won't describe how I did it. Yes, I know, from the point of view of reproducing the tidied data I should have done the tidying with a script or code. Next time I will.

As usual, the big emitters get the most emission units! Of 4.417 million units allocated to industries, 90% went to 11 large companies. New Zealand Steel Development Limited, of arbitrage profits fame, gets 1,067,501 free units. New Zealand Aluminium Smelters Limited gets 772,706 free units.
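To put the two named allocations in proportion, here is a small sketch of the arithmetic in Python. It uses only the figures quoted above; the total is the rounded "4.417 million", so the percentages are approximate, and the helper name share_percent is made up for illustration.

```python
# Total allocated to industries, as quoted above (rounded to the nearest thousand).
TOTAL_UNITS = 4_417_000

# The two allocations named in the text.
allocations = {
    "New Zealand Steel Development Limited": 1_067_501,
    "New Zealand Aluminium Smelters Limited": 772_706,
}


def share_percent(units, total=TOTAL_UNITS):
    """Return units as a percentage of total, rounded to one decimal place."""
    return round(100 * units / total, 1)


shares = {name: share_percent(units) for name, units in allocations.items()}
combined = share_percent(sum(allocations.values()))

for name, pct in shares.items():
    print(f"{name}: {pct}%")
print(f"Combined: {combined}%")
```

On these figures, New Zealand Steel gets roughly 24% of the units and New Zealand Aluminium Smelters roughly 17.5%, so the two together account for about 42% of the total.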

This is the updated free emission unit allocation data from 2010 to 2015.

I did a bit of data visualising with the 2015 data and created this pie chart in the R programming language.

The R script for that is:

Did I not get the End the Rainbow memo? So I picked a better colour scale from Colour Brewer.

The R script for this non-rainbow pie chart is:
