Monday, 28 January 2013

The Challenges of Creating Data for Ontario

It's not easy getting the numbers you need. Any fool can go into Excel and start typing stuff, of course, but how do you get methodologically sound data to use in your projections and other number-crunching pursuits?

Several years back, Canada (or at least certain political and public-policy circles) had a big public debate on research methodology. It was a reassuring reminder of the elevated state of our public discourse (we had to justify knowing fewer things, cf. the country to the south of us), but the result - a sabotaged 2011 census - was very disappointing indeed. For that reason I've completely ruled out using the data from that survey here and on Québec Projections.

Also disappointing is the fact that Elections Ontario, as far as I can tell, doesn't publish reports on provincial electoral ridings, as does the DGEQ (Québec elections officials helpfully provided me with an Excel file of their socio-economic reports upon request, which they've since made public). I've bypassed this problem for Ontario by digging into the Federal Electoral District profiles, which use the more robust (but dated) 2006 Census data. Since Ontario's current provincial riding boundaries are more or less the same, I'm using the federal data as a substitute.

The Northern Ontario ridings, however, have kept their old boundaries from the 1996 Federal Representation Order. Or, to put it more precisely, the 2005 Representation Act (Ontario) incorporated the 1996 boundaries by reference, with minor modifications. So I've gone back and looked at the old profiles from the 2001 Census, which use the 1996 federal boundaries. I've found the corresponding fields, calculated new ones where needed, and I'm plugging them into the data from the newer profiles.

Unfortunately, that leaves some gaps. We don't have religion data for 2006 and beyond, since the religion question was, interestingly, left out of the 2006 Census. Going backwards, the 2001 profiles are missing the interesting data on sources of income, industrial categories and so on.
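To make the patching-together a bit more concrete, here is a minimal Python sketch of the idea: rows from the 2006-based profiles and rows from the 2001-based Northern profiles go into one table, each row is tagged with its census year, and any field a profile lacks is left empty rather than guessed. The riding names, field names and numbers are all hypothetical placeholders (not real census figures), and this is only one way to do it, not necessarily how the actual spreadsheet is built.

```python
# Hypothetical field names and placeholder values, for illustration only.
# None of these numbers are real census figures.
profiles_2006 = {
    "Riding A": {"median_age": 38.0, "pct_francophone": 4.0, "pct_manufacturing": 15.0},
    "Riding B": {"median_age": 41.0, "pct_francophone": 2.0, "pct_manufacturing": 11.0},
}
profiles_2001_north = {
    "Northern Riding C": {"median_age": 40.0, "pct_francophone": 25.0, "pct_catholic": 50.0},
}

# Every field we would like to have, whether or not a given profile supplies it.
FIELDS = ["median_age", "pct_francophone", "pct_manufacturing", "pct_catholic"]


def build_rows(profiles, census_year):
    """Turn one set of riding profiles into rows with a common set of fields."""
    rows = []
    for riding, values in profiles.items():
        row = {"riding": riding, "census_year": census_year}
        for field in FIELDS:
            # .get() returns None when the profile lacks the field,
            # which marks the gap instead of inventing a value.
            row[field] = values.get(field)
        rows.append(row)
    return rows


def missing_fields(rows):
    """List, per riding, the fields that are still empty."""
    return {r["riding"]: [f for f in FIELDS if r[f] is None] for r in rows}


combined = build_rows(profiles_2006, 2006) + build_rows(profiles_2001_north, 2001)
gaps = missing_fields(combined)
```

Keeping the census year on every row makes it easy to see later which numbers are five years staler than the rest, and `gaps` shows at a glance which fields can't be used across the whole province.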

You don't have to be familiar with Geocities, but it helps
Doesn't that bring into question the changes that have taken place since 2001 and 2006? Won't these changes make the data less valid? Maybe, is the answer to that. The 2000s did see a lot of immigration to Ontario. We're hoping that the fact that most of the data came from 2006, and that voting data came from 2007 and 2011, will keep our analyses relevant overall. The US, after all, uses a much less detailed census, and only every 10 years.

For Northern Ontario, the outmigration (or at best the stagnant population) since 2001, which would have cost the region a riding if it hadn't been for a legislative intervention, probably means that Northern Ontario has become an even more exaggerated version of whatever patterns make up Northern Ontario. Not really knowing too well myself, at this point let's pretend I can guess that Northern Ontario is older, whiter, more working class, and more francophone than the rest of Ontario (or ROO -- today's candidate for worst acronym). If immigrants and young people have been leaving / not moving to the North since then, the tendencies described by the lagging data will probably be rather conservative estimates of where the North is now.

We can, of course, also run analyses for data that exclude Northern Ontario.
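As a rough illustration of what running things with and without the North might look like, here is a small Python sketch. The `northern` flag, the field name and the values are hypothetical placeholders, not real data; the point is only that one switch lets the same calculation run on both versions of the table.

```python
from statistics import mean

# Placeholder rows for illustration only; not real census figures.
rows = [
    {"riding": "Riding A", "northern": False, "pct_francophone": 4.0},
    {"riding": "Riding B", "northern": False, "pct_francophone": 2.0},
    {"riding": "Northern Riding C", "northern": True, "pct_francophone": 24.0},
]


def field_mean(rows, field, include_north=True):
    """Average a field across ridings, optionally leaving out the North."""
    values = [
        r[field]
        for r in rows
        if (include_north or not r["northern"]) and r[field] is not None
    ]
    return mean(values)


all_ontario = field_mean(rows, "pct_francophone")
south_only = field_mean(rows, "pct_francophone", include_north=False)
```

If the two results differ a lot for a given field, that's a hint the Northern ridings (and their older data) are driving the pattern.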

The next step is to see how much the federal riding boundaries have changed in Northern Ontario from 1996 (i.e. the boundaries today in the North) to 2003 (the most recent FED profiles available for that geographic territory).

It's not a perfect solution, but with more fluidity in the definition of ridings likely to come in the next few years, as well as the provinces' distrust of Federal data, we might see Elections Ontario decide to publish some data for the benefit of parties, the media, academics, and the all-blogging, all-tweeting Peanut Gallery. Or we can build the profiles ourselves, census tract by census tract.

