Because we also started hating data portals. As a purveyor of one, I know it’s a bold statement to make.
But as a company born of open data, we’ve spent the past four years not only deploying but hacking on open data portals of all kinds. In fact, all our early experiments – from NYCDataWeb, to NYCFacets, to NYCpedia – were attempts at improving data portals by linking, contextualizing and humanizing data.
Because as Mark Headd points out, we know we can do a lot better.
So instead of railing against the state of open data, we decided to build CivicDashboards.
A Data Portal is just One Step of Many
Open data involves far more – building connections to data silos (ETL); enriching, sanitizing, linking and contextualizing the published data; building knowledge products with actionable information from the data.
And that’s just the tip of the technical iceberg – there’s far harder stuff. Marshalling the political will to publish open data; building and funding teams to steward open data innovation inside government; getting buy-in from the various custodians of data; empowering data custodians with the incentives, tools and training to proactively contribute data; all while navigating the broken government procurement process that effectively discourages innovation and the requisite experimentation that comes with it.
With CivicDashboards, we aimed to address all these:
Open Data with Open Source
Imagine if the Web had not been made freely available. Would we have all the modern web-based conveniences, businesses and business models we have now? Or would we still have walled gardens like Prodigy, CompuServe, AOL and Minitel? Would Wikipedia, Google, Facebook and smartphones be around? Would the whole concept of “the Cloud” even exist?
We believe that Open Data will only realize its full potential if the base infrastructure that supports it is also Open. That’s why we chose CKAN as a core component of CivicDashboards.
Already, CKAN powers the largest data portals in the world, and it has a vibrant, active ecosystem, with hundreds of extensions developed by the community, all innovating, collaborating and competing with each other without needing to sign a license agreement or ask for permission.
Because having an API is just not enough. Having full open source access gives you the freedom to change everything and anything – the freedom to integrate best-of-breed tools, to use different authentication schemes, to create and share custom schemas, to store datasets in third-party cloud infrastructure, to create API-compatible solutions like DKAN, to address the gaps that Mark pointed out, and yes, to install CKAN on-premise yourself without signing a contract.
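For readers who have not used it, here is a minimal Python sketch of the kind of API in question: building a dataset search request against CKAN’s Action API and reading the titles out of the JSON reply. The portal address and the sample reply are made up for illustration, and the sketch makes no network call.

```python
import json
from urllib.parse import urlencode


def build_search_url(portal_base, query, rows=5):
    """Build a CKAN Action API (v3) package_search URL for a portal."""
    params = urlencode({"q": query, "rows": rows})
    return f"{portal_base.rstrip('/')}/api/3/action/package_search?{params}"


def dataset_titles(response_text):
    """Pull dataset titles out of a package_search JSON response."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("CKAN reported a failed request")
    return [dataset["title"] for dataset in payload["result"]["results"]]


# Hypothetical portal URL and a hand-written sample reply, for illustration only.
url = build_search_url("https://data.example.gov/", "snow plow")
sample_reply = (
    '{"success": true, "result": {"count": 1, '
    '"results": [{"title": "Snowplow Routes"}]}}'
)
titles = dataset_titles(sample_reply)
```

An API-compatible portal should, in principle, accept the same request shape. What open source adds is the freedom to change what sits behind that endpoint.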
To stand on the shoulders of the global CKAN community as we have with CivicDashboards.
GovTech vs CivicTech
We also realized that Ontodia was primarily in the business of GovTech, not CivicTech – the difference being that GovTech is mission-critical government infrastructure, while CivicTech is the wider, subsuming field of technology applied to civic problems.
One perennial criticism of CivicTech is “solutionism” – especially in the realm of hackathons and app challenges. As a product of one, we know firsthand that a lot of governments hesitate to deploy “apps” from these events, largely because of sustainability concerns.
Is there a company behind this app? Who else is using it? Does it conform to our mission-critical requirements? Can we customize it to support all our citizens – will it be multi-lingual, accessible and mobile-friendly? Is it backed by a Service Level Agreement? Will the technology be around next year? Five years from now? Are there multiple vendors supporting the technology? Or are we locked in and have to sole-source all follow-up work?
Open Source!?! What is the quality of the code? Who else is contributing to the project? Our developers are busy with production issues and cannot be expected to compile the latest version and manually apply patches. What is my Total Cost of Ownership when I have to hire an expensive development team to maintain our deployment?
These questions have been asked before in another realm of mission-critical open source – Linux.
So what Red Hat did for Linux, Ontodia aims to do for CKAN.
#BuildWithNotFor also applies to GovTech vendors
Open Data is a Process. Ultimately, the success of Open Data is not going to be measured by the number of high quality, linked, machine-readable datasets in a portal, nor the number of Civic apps using that data.
As Stephen Goldsmith and Susan Crawford put it in the conclusion of their book, The Responsive City:
“The real payoff will come when technology changes legacy processes for good to create truly data-smart and responsive cities.”
Great open data programs are only possible when the agencies publishing the data implement processes and enabling mechanisms that ensure the data stays current, relevant and responsive to their constituents’ requirements. And those constituents include not only citizens and businesses, but other agencies as well, who, surprisingly, often do not have ready access to their partner agencies’ data.
As Anthea Watson Strong puts it in her screed, “Hey Uncle Sam, Eat Your Own Dogfood”:
“The real problem behind our data quality issues, is that the people who have the power to fix the data, don’t have an incentive to understand the problem or improve it. Government officials are lovely people who work hard in under-resourced offices. Although many of them believe deeply in transparency and citizen engagement, these portals tend to generate additional burdens that get in the way of their primary functions. When data is stale or data is inaccurate, someone has to take the time to update it or fix it. It is difficult for any one group to see beyond the limits of their own projects. The real trick is to align incentives. What we actually need, is for Uncle Sam to start dogfooding his own open data.” (highlighting ours)
That’s why we created our “Analytics-as-a-Service.” Because after helping a client launch a data portal, we wanted to build with, not for, our clients as they embark on their open data journey.
Not because we fancy ourselves as experts with ready-made, off-the-shelf solutions that we want to upsell to them, but because:
- we want to learn about the local issues, so we can become a better partner and help them employ the Data Scientific Method and formulate the questions that can be answered with their data
- we want to help our clients answer low-hanging questions faster by letting them focus on the hard task of opening data, while we focus on the secondary, though necessary, task of data wrangling
- we want to help them create a showcase of open data-powered answers, to generate buy-in from all stakeholders
- we want to build up each other’s capacity: for us, to develop the product driven by real-world use cases; for our clients, so they can eventually wrangle data themselves instead of us doing all the wrangling for them
To accelerate success, we maintain a Solution Template Library – pre-built Solution Templates including Crime Maps, Economic Activity Dashboards, Snowplow Trackers and open source projects like Citygram and Mapzen.
These Solution Templates often include solutions developed by someone else, cherry-picked from other successful civic tech projects and best-of-breed commercial tools like CartoDB, Accela, Tableau and Necto that solve a similar problem.
And when we do develop a new extension or extend an existing one, as we’ve done with our CKAN Discourse Extension, the CitySDK CKAN Connector, and the OpenRefine CKAN connector, we add it to the Library and, when it makes sense, open source it as well so everybody wins.
Answering Questions often requires 3rd-party Data
Ontodia has already compiled Key Place Indicators™ (KPIs) for all 3,000+ counties and 35,000+ townships/municipalities in the US, drawing on maps and data from various high-value trusted sources like the U.S. Census, the Bureau of Labor Statistics, the FBI, etc., going back several years. Clients can use these KPIs and the data behind them to compute their own KPIs.
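As a rough illustration of what computing a custom indicator from base figures can look like, here is a minimal Python sketch that derives a per-1,000-residents rate. The function name, the field names and the numbers are all hypothetical; they do not come from the actual Key Place Indicators data.

```python
def rate_per_1000(count, population):
    """Return occurrences per 1,000 residents, rounded to two decimals."""
    if population <= 0:
        raise ValueError("population must be positive")
    return round(count * 1000 / population, 2)


# Made-up figures for a hypothetical county; not real data.
county = {"population": 50000, "reported_incidents": 125}

incidents_per_1000 = rate_per_1000(
    county["reported_incidents"], county["population"]
)
```

Normalizing by population in this way is one common choice for comparing places of different sizes; a real indicator would need its own definition agreed with the data’s custodians.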
And as we’ve done with NYCpedia and data.beta.nyc, we can also wrangle data from other non-governmental public sources – like job postings on Indeed, hyperlocal news from DNAinfo, and curated, released FOILed data from news organizations like WNYC and the New York Times.
And with our Up-to-Data™ subscriptions, our clients can have continuing, up-to-date access to these datastreams. They can even commission custom feeds that further expand the datastream library.
So this is why we built CivicDashboards…
With the CivicDashboards suite, we offer a different path towards 21st Century government. Building on the solid foundation of CKAN – the best open source data portal platform – we added all the other missing pieces to start putting open data to work.
Our novel approach of combining an open source data portal not only with enterprise support, but with performance management, analytics and data subscriptions, allows us the freedom to tap best-of-breed tools/techniques, and to #BuildWithNotFor our clients, partners and the CKAN community, all without artificially locking in our clients to a proprietary technology or the product roadmap of a single vendor.
We call this approach #OpenInfrastructure – Open Data with Open Source with Confidence.