The Ontario government released its annual Sunshine List on March 24, detailing public sector employees earning more than $100,000 per year. I created a table so readers can explore the list in more detail, letting you search, sort and filter by name, salary and more.
The list is published each year on the government’s website in a format that’s hard to search, impossible to sort and difficult to navigate. The Globe wanted to pull the data from this year’s list and publish it in a more usable form, as a tool for our reporters and our readers.
Here’s a little background on how I made the tool.
I started by building a scraper, a program that trawls web pages for content and saves it in a more structured way than copy-paste. Using a coding language called Python, I built a single scraper that could pull the data for every year back to 1997, the first year the list was released.
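For the curious, here is a minimal sketch of the scraping idea, not the scraper used for this project. It assumes each page presents its records as rows of an HTML table; the column names and the sample page are made up for illustration, and a real run would first download each year's page before parsing it.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
            self._cell = None
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def scrape_table(html_text):
    """Return the table rows found in html_text as lists of strings."""
    parser = TableRowParser()
    parser.feed(html_text)
    parser.close()
    return parser.rows


# Made-up sample page standing in for one downloaded page of the list.
SAMPLE_PAGE = """
<table>
  <tr><th>Employer</th><th>Surname</th><th>Given Name</th><th>Position</th><th>Salary Paid</th></tr>
  <tr><td>Example College</td><td>Doe</td><td>Jane</td><td>Dean</td><td>$123,456.78</td></tr>
</table>
"""

rows = scrape_table(SAMPLE_PAGE)
header, records = rows[0], rows[1:]
```

Because the parser only looks for table rows and cells, the same function can be pointed at each year's page in turn, provided the pages share that basic layout.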
I cleaned this data using Google Refine, converting encoded HTML characters and renaming some categories. The next challenge was cutting the data down as much as possible. There were only a few thousand records back in the 1990s, but other years had as many as 79,000, which made the files very large. Chrome and Firefox handled them well, but Internet Explorer chugged more slowly with each new megabyte I pushed its way.
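The same kind of cleanup can also be done in code. The sketch below is an illustration only: the cleaning for this project was done in Google Refine, and the field names, the category mapping and the compact file layout are all assumptions. It decodes HTML entities, renames a category, turns the salary into a number, and then writes the records without repeating the field names on every row, which is one common way to shrink a data file.

```python
import html
import json

# Hypothetical mapping from an inconsistent category label to a single name.
SECTOR_RENAMES = {"Univ.": "Universities"}


def clean_record(record):
    """Decode HTML entities, normalise the sector label, parse the salary."""
    cleaned = {key: html.unescape(value).strip() for key, value in record.items()}
    cleaned["sector"] = SECTOR_RENAMES.get(cleaned["sector"], cleaned["sector"])
    cleaned["salary"] = float(cleaned["salary"].replace("$", "").replace(",", ""))
    return cleaned


def to_compact_json(records, fields):
    """Serialise as one list of field names plus bare rows, with no extra spaces."""
    payload = {"fields": fields, "rows": [[r[f] for f in fields] for r in records]}
    return json.dumps(payload, separators=(",", ":"))


# Made-up raw records, as they might come out of a scraper.
raw = [
    {"name": "Jane Doe", "sector": "Univ.",
     "employer": "Smith &amp; Jones College", "salary": "$123,456.78"},
    {"name": "Ren&eacute; Roe", "sector": "Hospitals",
     "employer": "Example Hospital", "salary": "$100,000.00"},
]

records = [clean_record(r) for r in raw]
fields = ["name", "sector", "employer", "salary"]

verbose = json.dumps(records)               # field names repeated on every record
compact = to_compact_json(records, fields)  # field names stored once
print(len(verbose), len(compact))
```

The saving per record is small, but across tens of thousands of records the repeated field names and spaces add up.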
Since I had data from the 2010 release, I added a pop-up feature that lets readers see whether a salary increased or decreased compared with that release. In an earlier version of the table, the pop-up also showed the employer name and position from 2010. But I had to cut those late in development because they nearly doubled the size of the 2010 dataset.
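A comparison like this comes down to matching each person against the earlier release and subtracting. The sketch below is a simplified illustration, not the code behind the table. It assumes a person can be matched on name and employer, which will miss anyone who changed jobs and may confuse people who share a name; the field names and figures are made up.

```python
def salary_changes(current, previous):
    """Return {(name, employer): change} for people found in both releases."""
    previous_by_key = {(p["name"], p["employer"]): p["salary"] for p in previous}
    changes = {}
    for person in current:
        key = (person["name"], person["employer"])
        if key in previous_by_key:
            # Positive means a raise, negative means a drop.
            changes[key] = round(person["salary"] - previous_by_key[key], 2)
    return changes


# Made-up records for two consecutive releases.
previous_release = [
    {"name": "Jane Doe", "employer": "Example College", "salary": 125000.50},
    {"name": "Sam Poe", "employer": "Example Hospital", "salary": 110000.00},
]
current_release = [
    {"name": "Jane Doe", "employer": "Example College", "salary": 130000.00},
    {"name": "Sam Poe", "employer": "Example Hospital", "salary": 105000.00},
    {"name": "Alex New", "employer": "Example College", "salary": 101000.00},
]

changes = salary_changes(current_release, previous_release)
```

Only the earlier salary is needed for the subtraction, which is why the lookup keeps nothing else from the earlier release.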
The final tool is very simple to use and, admittedly, not very flashy. But it lets readers dig a little deeper into the list, find notable people or search specific jobs.