The importance of data is growing not just in the private sector but also in government. Data supports decision making: it helps citizens, and governments too can benefit by making better-informed decisions. Let us touch on what three sites do – Data.gov [US Govt], USAFacts.org [US Non-Govt owned] & Data.gov.in [Indian Govt].
Any open data platform from a government must allow its departments to provide data & make the data available for others to analyse. Such platforms don’t host the data themselves but rather aggregate metadata about open data resources in one centralized location.
Data.Gov – Open Data from the US Government
It must be noted that the US government already has a superb Open Data site, Data.gov. The site has data, tools, and resources to conduct research, develop web and mobile applications, and design data visualizations.
Under the terms of the 2013 Federal Open Data Policy, newly generated government data is required to be made available in open, machine-readable formats while continuing to ensure privacy and security.
Data.gov is built on WordPress & CKAN (world’s largest open source data portal platform). Data.gov source code is available on GitHub.
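Because the catalog runs on CKAN, its metadata can in principle be searched programmatically. The snippet below is a minimal sketch, assuming the Data.gov catalog exposes the standard CKAN Action API at `https://catalog.data.gov/api/3/action` (that base URL is an assumption and may change). The helper names `build_search_url`, `summarize_results` and `search_datasets` are made up for this illustration. Note that the response contains metadata only: each dataset lists resources that point to files hosted by the publishing agency.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the Data.gov catalog serves the standard CKAN Action API here.
CKAN_BASE = "https://catalog.data.gov/api/3/action"


def build_search_url(query, rows=5, base=CKAN_BASE):
    """Build a CKAN package_search URL for a free-text query."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base}/package_search?{params}"


def summarize_results(payload):
    """Return (title, sorted resource formats) for each dataset in a response."""
    if not payload.get("success"):
        raise ValueError("CKAN reported an unsuccessful request")
    summaries = []
    for dataset in payload["result"]["results"]:
        # Collect the distinct, non-empty formats of the dataset's resources.
        formats = sorted(
            {res["format"] for res in dataset.get("resources", []) if res.get("format")}
        )
        summaries.append((dataset.get("title", ""), formats))
    return summaries


def search_datasets(query, rows=5):
    """Query the catalog over HTTP and summarize the matching datasets."""
    with urlopen(build_search_url(query, rows), timeout=30) as response:
        return summarize_results(json.load(response))
```

Calling `search_datasets("traffic fatalities")` would then return a short list of dataset titles with the formats on offer (CSV, JSON and so on), provided the endpoint is reachable.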
USAFacts.org – the Balance Sheet of US Government
Former Microsoft CEO Steve Ballmer launched USAFacts [see video], which has detailed statistical reports on local, state and federal governments. He rightly said that in the era of fake news, “numbers” speak for themselves: they show how the country (US) is being run. You can listen to Steve Ballmer’s podcast here.
Before we get into the current state of Open Data in India, it would be useful to get familiar with the highlights of USAFacts:
- USAFacts only uses government data as its source. Hence, many reports are based on data released in 2014 or 2015 (so let us stop complaining about Govt of India being slow!)
- It avoids forecasts and does not propose policy.
- Reports are based on what has already happened (the past).
- Data uncovered by the project – crime, emissions, traffic fatalities, lifespan, and infrastructure.
- A report on the government’s performance (operational results, risk factors, analysis of financials) is released. The report follows the format of a public company’s annual 10-K report to the Securities and Exchange Commission (SEC). Though they follow a corporate reporting structure, they don’t propose that Govt should be run as a business.
- It aggregates government statistics by combining federal, state, and local statistics to show the full picture of government. The data from each of these sources are in different formats and are compiled into a single database.
- The same data from various departments could contradict each other. USAFacts decides which one to use.
- The methodology for revenue and expenses is well explained on their site. It addresses double counting and grants from state/federal govt.
- One interesting high level report for 2014: Govt earning $5.2 trillion, Govt spending $5.4 trillion
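The double-counting point is easier to see with numbers. The sketch below is not USAFacts' actual methodology, only one simple way to consolidate accounts across levels of government: a federal grant to a state appears once as federal spending and once as state revenue, so both entries are dropped and only the state's final spending is counted. The `Flow` record, the `consolidate` function and all the amounts are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    level: str   # "federal", "state" or "local"
    kind: str    # "revenue" or "spending"
    amount: int  # illustrative units, e.g. billions of dollars
    intergovernmental: bool = False  # True for grants between levels of government


def consolidate(flows):
    """Sum revenue and spending, skipping transfers between governments."""
    totals = {"revenue": 0, "spending": 0}
    for flow in flows:
        if flow.intergovernmental:
            continue  # the money is counted where it is finally raised or spent
        totals[flow.kind] += flow.amount
    totals["balance"] = totals["revenue"] - totals["spending"]
    return totals


# Made-up figures: a 50-unit federal grant that a state then spends.
flows = [
    Flow("federal", "revenue", 300),
    Flow("federal", "spending", 250),
    Flow("federal", "spending", 50, intergovernmental=True),  # grant paid out
    Flow("state", "revenue", 50, intergovernmental=True),     # same grant received
    Flow("state", "revenue", 120),
    Flow("state", "spending", 180),
]

print(consolidate(flows))
```

Without the `intergovernmental` flag, the same 50 units would inflate both revenue and spending. The balance is simply revenue minus spending; applied to the 2014 headline figures above ($5.2 trillion earned, $5.4 trillion spent), that is a shortfall of roughly $0.2 trillion.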
To summarise:
- Factual: Only official government data
- Comprehensive: Integrated federal, state and local government data
- Contextual: Relevant statistics and historical trends
- Comprehensible: Logically organized by government mission
- Unbiased: No political agenda or commercial motive
The Need For Open Data in India
It has become a common spectacle to see political parties claiming what they have achieved, while the claims are often the opposite of reality. This is where data could help bring real facts to the public.
Many may not be aware that, as part of the Digital India program, the govt of India has developed a platform, Open Government Data (OGD) Platform India – Data.gov.in, built on Drupal. This is a joint initiative of Government of India and the US Government. A good number of countries today have Open Data sites.
There are many good reports & visualizations available on Data.gov.in but I wish the raw data were made available to Data Scientists to build some interesting models & reports (I still think it is available on data.gov.in but I am not finding it). We need access to datasets the way they are provided in the US by Data.gov.
Do we know how many employees Govt of India has?
Govt Jobs in Data Science
The private sector in India has Data Science departments. Were you aware these positions exist in Govt of India today?
- Head Big Data Initiative, Department of Science & Technology
- Director, Data Management and Dissemination Division, (Chief Data Officer) Reserve Bank of India
- Head – Data Analytics Cell, NITI Aayog
I learnt about the above positions from Data Science Congress. On browsing the DST, RBI and NITI Aayog sites, I couldn’t find any pages that talked about their work in Data Science. #fail
Data Journalism in India
A few sites in India that focus on data:
- Factly: Making Public Data Meaningful
- How India Lives: aims to organise a massive amount of public data on India and make it available in a searchable, comparable and visual format.
- IndiaSpend: India’s first data journalism initiative.
- IndiaStat: India’s most comprehensive e-resource for socio-economic statistical data
- SocialCops: On a mission to confront the world’s most critical problems through data intelligence.
Conclusion
Data.gov.in and Data.gov are similar. They are the official providers of government data. India needs a third party to build and analyze reports on the lines of USAFacts.org. There are many data enthusiasts these days in the country who could crowdsource their talent / passion / energy to work on this interesting project.
The government of India and State Governments need to provide a lot more data for public consumption. This will increase accountability for various government departments, which is in the country’s interest.
Only two states in India have an Open Data Policy: Sikkim & Telangana.
Delhi Govt rolls out an Open Data Platform for the city’s public transit system. Users can access static datasets and real-time feeds via APIs about routes, stops, and live GPS bus positions.
India should champion this first in English and then make all the reports available in all Indian languages (not just Hindi). The data is structured, and creating reports from structured data in multiple Indian languages is very much achievable.
Also see:
- Open Govt Data in India has issues of quality, disparate schema and metadata standardisation, and a lack of high-value data
- How to Get Facts & Figures For An Election Digital Campaign in India
- USAFacts.org Summary 2017
- USAFacts.org Annual Report 2017 – USA in Numbers
- Data USA: Instead of searching through multiple data sources that are often incomplete and difficult to access, you can simply point to Data USA to answer your questions.
- Wharton Research Data Services (WRDS) – A data research platform for over 50,000 commercial, academic, and government users in 30+ countries
- Open Data Day – a day devoted to encouraging governments to make public data freely available in machine-readable formats under open licenses
- Telangana’s ‘open data’ policy to help start-ups address public issues
- Big Picture with Kal Penn – brings lesser-known interesting facts about a variety of topics, where the information was collected using data mapping