January 24, 2014

What good is #opendata when it’s wrong?

From: Visualize Regulations: Regulation Data Like You’ve Never Seen It Before


From reporters to academics, crunching numbers on regulations is sexy (especially at election time). So before I dive into the details behind this post’s title, here’s a tip: you can’t count regulations that don’t get published. I take on this concept and many others in my post on the positives and pitfalls of available regulation datasets.

Riddle me this, my regulation-counting colleagues: what if I told you that your stats are wrong if you are using Reginfo.gov data?

As I described in my methodology post, I created a new mashed-up dataset based on bulk data from Reginfo.gov and the Federal Register. In this post, I describe the errors I found in the XML files available on Reginfo.gov. More specifically, I focus on errors in the entries for regulations that were reviewed by the Office of Information and Regulatory Affairs (OIRA) and marked “yes” for economically significant.

My research centered on final regulations deemed economically significant and on distinguishing them from those that are not. As a result, I built all of my (mostly “by hand”) quality assurance processes around making sure that every final regulation marked “economically significant” really was. I have no idea about the data quality of the non-economically significant rules in the Reginfo.gov XML files, but I can only assume that there are errors in those entries as well.
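
For readers who want to try a similar check themselves, here is a minimal Python sketch of the general idea: pull the final rules marked economically significant out of an XML file, then compare them against a list that has been verified by hand. It is not the process used for this research. The tag names (`REVIEW`, `RIN`, `STAGE`, `ECONOMICALLY_SIGNIFICANT`), the sample records, and the identifiers are all made up for illustration, and the real Reginfo.gov files may be structured differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data. The tag names and RINs below are invented for
# this sketch and are NOT taken from the actual Reginfo.gov XML files.
SAMPLE_XML = """
<REVIEWS>
  <REVIEW>
    <RIN>0000-AA01</RIN>
    <STAGE>Final Rule</STAGE>
    <ECONOMICALLY_SIGNIFICANT>Yes</ECONOMICALLY_SIGNIFICANT>
  </REVIEW>
  <REVIEW>
    <RIN>0000-AA02</RIN>
    <STAGE>Proposed Rule</STAGE>
    <ECONOMICALLY_SIGNIFICANT>Yes</ECONOMICALLY_SIGNIFICANT>
  </REVIEW>
  <REVIEW>
    <RIN>0000-AA03</RIN>
    <STAGE>Final Rule</STAGE>
    <ECONOMICALLY_SIGNIFICANT>No</ECONOMICALLY_SIGNIFICANT>
  </REVIEW>
  <REVIEW>
    <RIN>0000-AA04</RIN>
    <STAGE>Final Rule</STAGE>
    <ECONOMICALLY_SIGNIFICANT>Yes</ECONOMICALLY_SIGNIFICANT>
  </REVIEW>
</REVIEWS>
"""


def flagged_final_rules(xml_text):
    """Return the set of RINs for final rules marked economically significant."""
    root = ET.fromstring(xml_text.strip())
    flagged = set()
    for review in root.iter("REVIEW"):
        rin = (review.findtext("RIN") or "").strip()
        stage = (review.findtext("STAGE") or "").strip()
        flag = (review.findtext("ECONOMICALLY_SIGNIFICANT") or "").strip().lower()
        if rin and stage == "Final Rule" and flag == "yes":
            flagged.add(rin)
    return flagged


def compare_with_verified(flagged, verified):
    """Report where the XML flags and the hand-verified list disagree."""
    return {
        "flagged_but_unverified": sorted(flagged - verified),
        "verified_but_unflagged": sorted(verified - flagged),
    }


# A hand-verified list would come from manual review; this one is invented.
hand_verified = {"0000-AA01", "0000-AA03"}
flagged = flagged_final_rules(SAMPLE_XML)
report = compare_with_verified(flagged, hand_verified)
print(report)
```

Either list in the report is only a starting point for manual review: a mismatch can mean the XML is wrong, or that the hand-verified list is.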
