Digital India: What India’s Open Data Program could learn from USAFacts.org

Data is growing in importance, not just in the private sector but also in government. It supports decision making: citizens benefit from it, and governments too can take better-informed decisions. Let us look at what three sites do: Data.gov [US Govt], USAFacts.org [US, non-government] and Data.gov.in [Indian Govt].

Any government open data platform must let departments publish their data and make it available for others to analyse. Such platforms don’t host the data themselves; rather, they aggregate metadata about open data resources in one centralized location.

Open Data Day

Open Data Day is a day devoted to encouraging governments to make public data freely available in machine-readable formats under open licenses.

Data.Gov – Open Data from US Government

It must be noted that the US government already has a superb Open Data site, Data.gov. The site offers data, tools, and resources to conduct research, develop web and mobile applications, and design data visualizations.

Under the terms of the 2013 Federal Open Data Policy, newly generated government data is required to be made available in open, machine-readable formats, while continuing to ensure privacy and security.

Data.gov is built on WordPress & CKAN (world’s largest open source data portal platform). Data.gov source code is available on GitHub.
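Because the catalog runs on CKAN, its metadata can in principle be queried through CKAN's Action API. The following is a minimal sketch, not an official client: it assumes the catalog is reachable at `https://catalog.data.gov` and exposes the standard `package_search` action. The helper names and the sample response are made up for illustration.

```python
import json
from urllib.parse import urlencode

# Assumption: the Data.gov catalog is served by CKAN at this base URL.
CATALOG_BASE = "https://catalog.data.gov"


def package_search_url(query, rows=5, base=CATALOG_BASE):
    """Build a CKAN Action API URL for the package_search action."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base}/api/3/action/package_search?{params}"


def dataset_summaries(response_text):
    """Return (title, sorted resource formats) for each dataset in a reply."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("CKAN reported an unsuccessful request")
    summaries = []
    for dataset in payload["result"]["results"]:
        formats = sorted(
            {res["format"] for res in dataset.get("resources", []) if res.get("format")}
        )
        summaries.append((dataset.get("title", ""), formats))
    return summaries


# Hand-written sample in the shape CKAN returns; NOT real catalog data.
SAMPLE_RESPONSE = json.dumps({
    "success": True,
    "result": {
        "count": 1,
        "results": [
            {
                "title": "Example traffic dataset",
                "resources": [
                    {"format": "CSV", "url": "https://example.org/data.csv"},
                    {"format": "JSON", "url": "https://example.org/data.json"},
                ],
            }
        ],
    },
})

if __name__ == "__main__":
    print(package_search_url("traffic fatalities", rows=2))
    print(dataset_summaries(SAMPLE_RESPONSE))
```

Note that the response carries only metadata (titles, formats, links to resources); the data files themselves live with the publishing departments, which matches the aggregation model described earlier.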

USAFacts.org – the Balance Sheet of US Government

Former Microsoft CEO Steve Ballmer launched USAFacts, which has detailed statistical reports on local, state and federal governments. As he rightly said, in the era of fake news the numbers speak for themselves; they show how the country (US) is being run. You can also listen to Steve Ballmer’s podcast.

Before we get into the current state of Open Data in India, it would be useful to get familiar with the highlights of USAFacts:

  1. USAFacts uses only government data as its source. Hence, many reports are based on data released in 2014 or 2015 (so let us stop complaining about Govt of India being slow!)
  2. It avoids forecasts and does not propose policy.
  3. Reports are based on what has already happened (the past).
  4. Data uncovered by the project – crime, emissions, traffic fatalities, lifespan and infrastructure.
  5. A report on the government’s performance (operational results, risk factors, analysis of financials) is released. The report follows the format of a public company’s annual 10-K report to the Securities and Exchange Commission (SEC). Though they follow a corporate reporting structure, they don’t propose that the government should be run as a business.
  6. It aggregates government statistics by combining federal, state, and local statistics to show the full picture of government. The data from each of these sources are in different formats and are compiled into a single database.
  7. The same data from various departments could contradict each other. USAFacts decides which one to use.
  8. The methodology for revenue and expenses is well explained on their site. It addresses double counting, such as grants from state/federal govts.
  9. One interesting high level report for 2014: Govt earning $5.2 trillion, Govt spending $5.4 trillion
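Points 6 and 8 can be illustrated with a small sketch of how figures from several levels of government might be combined without counting grants twice. This is not USAFacts' actual methodology, which is documented on their site; the class, the function names and every figure in `LEVELS` are hypothetical. Only the last calculation uses the 2014 totals quoted in point 9, expressed in billions of dollars.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LevelAccounts:
    level: str
    revenue: int          # total revenue, including grants received (billions)
    spending: int         # total spending, including grants paid (billions)
    grants_received: int  # transfers received from other levels of government
    grants_paid: int      # transfers paid to other levels of government


def naive_totals(levels):
    """Add up every level as reported; grant money is counted twice."""
    return (sum(a.revenue for a in levels), sum(a.spending for a in levels))


def combined_totals(levels):
    """Combine levels while removing intergovernmental transfers.

    A grant is already revenue for the level that raised the money and is
    spent by the level that receives it, so it is dropped from the
    recipient's revenue and from the payer's spending.
    """
    revenue = sum(a.revenue - a.grants_received for a in levels)
    spending = sum(a.spending - a.grants_paid for a in levels)
    return revenue, spending


def balance(revenue, spending):
    """Positive means surplus, negative means deficit."""
    return revenue - spending


# Hypothetical figures, chosen only so that grants paid equal grants received.
LEVELS = [
    LevelAccounts("federal", 3000, 3500, grants_received=0, grants_paid=600),
    LevelAccounts("state", 1800, 1700, grants_received=550, grants_paid=450),
    LevelAccounts("local", 1500, 1500, grants_received=500, grants_paid=0),
]

if __name__ == "__main__":
    print("naive:", naive_totals(LEVELS))
    print("combined:", combined_totals(LEVELS))
    # Figures from point 9: $5.2 trillion revenue, $5.4 trillion spending.
    print("2014 balance (billions):", balance(5200, 5400))
```

With the totals from point 9, the balance works out to a shortfall of $0.2 trillion for 2014.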

To summarise:

Factual: Only official government data
Comprehensive: Integrated federal, state and local government data
Contextual: Relevant statistics and historical trends
Comprehensible: Logically organized by government mission
Unbiased: No political agenda or commercial motive

The Need For Open Data in India

It has become a common spectacle to see political parties make claims about what they have achieved that are the opposite of reality. This is where data could help bring the real facts to citizens.

The media focuses on many unwanted topics that confuse readers, one of them being how many foreign trips the current Prime Minister Narendra Modi has been on, and how this number compares to his predecessors. It would be best for the government to provide all such data officially on data.gov.in.

Many may not be aware that, as part of the Digital India program, the Govt of India has developed the Open Government Data (OGD) Platform India, Data.gov.in, built on Drupal. This is a joint initiative of the Government of India and the US Government. A good number of countries today have Open Data sites.

There are many good reports & visualizations available on Data.gov.in, but I wish the raw data were made available to Data Scientists to build some interesting models & reports (I still think it is available on data.gov.in but I am not finding it). We need access to datasets the way it is provided in the US by Data.gov.

Do we know how many employees Govt of India has?

Govt Jobs in Data Science

The private sector in India has Data Science departments. Were you aware that these positions exist in the Govt of India today?

  1. Head Big Data Initiative, Department of Science & Technology
  2. Director, Data Management and Dissemination Division, (Chief Data Officer) Reserve Bank of India
  3. Head – Data Analytics Cell, NITI Aayog

I learnt about the above positions from Data Science Congress. On browsing the DST, RBI and NITI Aayog sites I wasn’t able to find any page that talked about their work in Data Science. #fail

Conclusion

Data.gov.in and Data.gov are similar: they are the official providers of government data. India needs a third party to build and analyse reports along the lines of USAFacts.org. There are many data enthusiasts in the country these days who could pool their talent / passion / energy to work on this interesting project.

The Govt of India and the State Govts need to provide a lot more data for public consumption. This increases the accountability of the various govt departments, which is in the interest of the country.

Only two states in India have an Open Data Policy: Sikkim & Telangana.

India should champion this first in English, then make all the reports available in all Indian languages (not just Hindi). The data is structured, and creating reports from structured data in multiple Indian languages is very much achievable.
