Open data comes of age

If you live in Italy and are curious about your local authority’s pattern of spending and taxing, you are in luck. Since last week, OpenBilanci has been publishing on the web detailed financial data from all 8,092 Italian local authorities for the past ten years. Both budgets and closed ex post accounts are available, along with indicators galore, such as financial autonomy or spending velocity. Not only are all data downloadable and open: OpenBilanci sports a nifty web interface for preliminary data exploration. The latter is a feature also found in other highly successful Italian open data projects like the mighty OpenCoesione, which released spending data on 749,112 projects funded by the country’s cohesion policy. And no surprise: though OpenCoesione is a government initiative and OpenBilanci is a not-for-profit one, the same team of visionary coders stands behind both projects, through both a non-profit and a for-profit arm.

In the space of only a few years, open data have become a formidable force for openness, transparency and even data literacy in a country that badly needs all three. Forward-thinking civil servants and political leaders in some of Italy’s 20 regions (and some cities) have been working together with civic hackers for years now: Lazio has funded OpenBilanci through its SME-centred innovation policy, whereas Emilia Romagna has successfully built a partnership with the largest Italian open data community, Spaghetti Open Data. In a veritable stroke of genius, the city of Matera has decided to host on its own open data portal any open dataset produced by the local community.

When public authorities do not play ball, Italian civic hackers simply proceed to open up government data anyway. One of my favourite projects in this sense is Confiscati bene, started during an epic Spaghetti Open Data hackathon. The group wrote a crawler to extract data from the (non-open) website of ANBSC, a government agency tasked with reallocating assets confiscated from Mafia bosses and other assorted mobsters (the Italian police are doing a sterling job there, since ANBSC is juggling over 11,000 such assets). The group cleaned the data up, geocoded them, made them downloadable, built the customary sleek interface for web exploration, embedded them into a brand new website and released everything as a gift to ANBSC. OpenBilanci itself entailed scraping over two million web pages.

I know Italy’s scene best, but exciting open data projects are appearing everywhere. My absolute favourite is British: OpenCorporates gathers data on over 60 million corporations all over the planet. Using unique identifiers and information on ownership structure, OpenCorporates shines a light on the corporate world, which is subject to far looser legal requirements on transparency than government. This OpenCorporates-based visualization, for example, will teach you much about Goldman Sachs.

It looks like the open data movement has come of age. It was surprisingly fast: in less than four years we went from a small cadre of nerds obsessing over Tim Berners-Lee’s famous “raw data now” speech to a strong community (there are almost 1,000 subscribers to the Spaghetti Open Data mailing list, churning out twenty messages a day, 365 days a year) and a phalanx of young decision makers who understand the issue and are plugged into the community. I am proud of you all, my sisters and brothers in arms. And the best is yet to come – especially as we come together all across Europe, as I am sure we will soon, since the time is ripe for this to happen. Who knows, data culture might even be able to shift European politics away from populism and onto evidence-based debate.

Transparency brings growth

A few weeks ago, the open data community received a bit of a shock, in the form of a sharp, well-argued critique by Evgeny Morozov. He claims that the “open government” meme hides a disconnect between those of us who interpret openness as transparency, accountability and ultimately human rights, and those who interpret it as contestability, market competition, efficiency and GDP generation. In the extreme, he argues, “open government” can be made to mean “government open to competition from the private sector for the management of public goods”.

I agree with Morozov that the difference in approach is there. However, I propose a point of view to reconcile the two different outlooks in practice. To do this, I borrow the words of the Italian blogger Michele Vianello, whom Morozov would place firmly in the “openness for growth” camp. Michele is unconvinced by the sort of data that the Italian open data movement wants to see opened.

How can we not understand that the most important data to open are those that enable citizens, businesses and government to generate economic and social value? […] How many GDP percentage points do we gain by live streaming the meetings of the parliamentary commissions?

Michele’s idea seems similar to what Morozov attributes to the O’Reilly camp:

  1. The purpose of open data is mainly to stimulate economic growth.
  2. We can do so by releasing data that are amenable to being used for building value added services rather than government transparency data.

1 is the point debated by Morozov. It comes down to what your values are. Let me leave it at that for now – I’ll come to it in a moment.

2 is definitely false – in the scientific sense of being contradicted by a veritable tide of economic literature. In the case of Italy, Corte dei Conti, our top administrative tribunal, estimates that corruption costs 60 billion euro a year. A few years ago this paper (Mo, 2001) was quite popular, and there are many others. It says: increasing corruption by 1% (measured by polling indicators: corruption is by definition elusive to measure “at the tap”) implies a decrease of the GDP growth rate by a bit more than half a percentage point. Over the years, obviously, forgone growth itself follows an exponential growth curve – compound interest is an ugly beast – so that even small differences in the levels of corruption can lead to large ones in the absolute levels of national wealth.

In the following chart I imagine two initially identical economies (GDP at time 0 = 100), which would each grow by 2% a year in the absence of changes in their levels of corruption. I then posit that one of the two experiences a 1% increase in the level of corruption in year 1. Following Mo (2001), quoted above, this decreases its rate of growth by 0.54 percentage points, from 2% to 1.46% a year. Corruption levels and everything else remain stable thereafter.
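The thought experiment can be reproduced with a few lines of Python. This is only a minimal sketch under the assumptions stated in the text (both economies start at 100, the baseline grows at 2% a year, and the 0.54-point penalty applies from year 1 onward); the function and variable names are mine, chosen for illustration.

```python
def gdp_path(initial_gdp, growth_rate, years):
    """Return GDP levels for years 0..years under constant compound growth."""
    return [initial_gdp * (1 + growth_rate) ** t for t in range(years + 1)]


BASELINE_GROWTH = 0.02   # 2% a year, as posited in the text
GROWTH_PENALTY = 0.0054  # 0.54 percentage points, the figure quoted from Mo (2001)
YEARS = 20

# The "virtuous" economy keeps growing at the baseline rate.
virtuous = gdp_path(100.0, BASELINE_GROWTH, YEARS)
# The other economy grows more slowly from year 1 onward (assumption of the sketch).
corrupt = gdp_path(100.0, BASELINE_GROWTH - GROWTH_PENALTY, YEARS)
# Gap between the two, in points of the initial GDP (base 100).
gap = [v - c for v, c in zip(virtuous, corrupt)]

for year in (0, 5, 10, 15, 20):
    print(
        f"year {year:2d}: virtuous {virtuous[year]:6.1f}  "
        f"corrupt {corrupt[year]:6.1f}  gap {gap[year]:5.1f}"
    )
```

Because the gap compounds, it widens every year rather than staying constant: the difference after twenty years is more than twice the difference after ten.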


Twenty years later, the GDP of the more virtuous economy leads that of the less virtuous one by about fifteen points (roughly 149 versus 134, on a base of 100 at time 0). Not by chance, the World Bank, OECD, UNDP and every other development agency worth its salt are turning an increasingly interested eye to pro-transparency, anti-corruption policies. Nor is it just corruption: the combination of open data, a lively data journalism scene and an attentive public opinion can help improve the effectiveness of public spending that is executed legally but poorly, which depresses growth by the same mathematical logic. This is the rationale behind projects like Opencoesione (public expenditure data on over 600,000 projects funded with regional cohesion funds). Remember Linus’s Law: given enough eyeballs, all bugs are shallow.

Now, open data are not only a component of transparency; my experience (and, I think I can say, the Italian community’s – just look at Spaghetti Open Data) is that they generate additional demand for transparency, a drive to understand better, and more order in data generation and in the policy process.

Conclusion: we, the open government/open data community, can disagree about our value system. But in practical terms, value systems do not make a lot of difference: we should all support radical transparency policies. If you agree with Morozov, you will do it because you think we all have a right to know in depth what the government (and, hopefully in the near future, corporates) is doing, since that impacts our lives. If you agree with Vianello, you’ll do it because it’s good for GDP. Either way, open data have a tried and true channel for economic impact: the increased efficiency that transparency brings to the overall economy. It’s there and it is solid. At the time of writing, it looks far more solid than impact via jobs created by startups that sell apps on Apple’s App Store or Google Play.

Civic hacking made easy: four small innovations to ramp up citizen involvement in hacktivism

I am still reeling from the impact of #SOD13, the first Spaghetti Open Data gathering. What they wrote about it, though a little overenthusiastic, is largely true. #SOD13 was a major energy high; it let us glimpse another possible world, and I am proud to have contributed to making it happen.

In my perception, a fundamental driver of #SOD13’s success was its uncompromisingly inclusive stance. Spaghetti Open Data was inclusive from day one, because we designed it that way. At the time (2010) I was thinking of a mutually respectful encounter between experts from different domains, hackers and civil servants, computer scientists and policy makers. #SOD13 aimed higher than that, all the way to the inclusion, as a protagonist, of anyone wishing to be part of the community. The problem was how to achieve this without watering down the gathering, without giving up on the opportunity to push to the edge the technical expertise of the more skilled members of the community. We wanted to boost inclusivity not by forcing everyone not to hack (which we could have done by, say, focusing on discussing the first principles of open data), but by presenting participants with a sort of menu of things to do. Different items on the menu required different skills: programming, understanding and writing legal code, statistical data analysis, but also citizen engagement, data cleanup, monitoring. Each participant decides what she wants to do on the basis of her inclinations and skills, and all these activities are part of a workflow to build technically complex projects. In this way, everyone involved, whatever his or her skill level, can jump in and be a civic hacker, immediately and in the narrow sense of the word. All it takes is being willing to get your hands dirty.

#SOD13 designed and deployed four of these activities.

  1. The non-technical hackathon. Led by two young lawyers passionate about open data, Morena and Francesco, one of the hackathon tracks revised the European Commission’s EPSI Scorecard for the part concerning Italy. A while earlier, we had noticed that these data were in urgent need of an update. The revision required half a day of work, a lot of web searching for supporting evidence, and a lot of patience, but it yielded an extraordinary result: Italy’s score leaped from 300 to 450, even with a conservative interpretation. The EPSI people profusely thanked the civic hackers at #SOD13 and updated their official site with our data. If, for once, Italy is proudly at the top end of a tech chart (in fourth position, after France, the Netherlands and the UK), we Italians owe it to them. In the same logic, SOD’s volunteers are manually populating the database supporting Twitantonio, an app that lets citizens find candidates in the upcoming elections: this work requires no more expertise than looking for someone on Twitter, but if it does not get done the app is useless.
  2. The monithon. This simple, elegant idea was proposed by the Opencoesione crowd. Opencoesione is a government project, and easily the largest-scale open data project ever deployed in Italy. The people in charge of it have been hanging out in SOD since its inception, well before Opencoesione was dreamed up, and I suspect that they got the idea from the mailing list. Monithon works like this: you query the Opencoesione database and find which projects near you your taxpayer euro went into. Then you go there, ring the bell and ask to see how it’s doing. I did not take part in this, but the report (Italian) is great. A bunch of Ministry wonks roaming the renovated schools of Bologna, on a Saturday, with citizens in tow? That’s a sterling silver punk attitude to monitoring in my book!
  3. The documentathon. Developers – especially in open source communities, where many contribute in their spare time and are unpaid – don’t like to write documentation for the code they write. Often they are spread thin across a project, and they simply don’t have the time for it. Result: tons of undocumented code, very difficult to make sense of, improve and even use. Even rookie programmers with minimal experience can be very useful to a development project simply by adding comments to code written by others (“here we simulate clicking on the top right link”). If you want to see an example, check out this scraper, written originally by Vincenzo and modified and commented by me.
  4. The iron pact with the ladies. The hacker community has a problem, and that’s that women tend to stay away from it – so it loses half of its potential! After reading a sobering post by Asher Wolf, I asked the community for help in making SOD more attractive to women. That led to a fruitful collaboration with the Bologna chapter of Girl Geek Dinner on the whole three-day gathering. Thanks in part to them and their Spaghetti Open Data Gender Survey, rolled out as a track of our hackathon, the female presence at the gathering was numerically strong and high quality on all fronts, writing code included. I intend to keep trying to do more to build female-friendly environments.
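For the curious, the first step of the monithon in item 2 (finding funded projects near you) can be sketched in a few lines of Python. This is an illustration only: the records and field names below are made up and are not the real Opencoesione schema, and the sketch assumes each project already carries a latitude and a longitude. Distance is computed with the standard haversine formula.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def projects_near(projects, lat, lon, radius_km):
    """Return the projects within radius_km of (lat, lon), nearest first."""
    with_distance = [
        (haversine_km(lat, lon, p["lat"], p["lon"]), p) for p in projects
    ]
    with_distance.sort(key=lambda pair: pair[0])
    return [p for distance, p in with_distance if distance <= radius_km]


# Hypothetical records, invented for this example (not real Opencoesione data).
SAMPLE_PROJECTS = [
    {"title": "Road works (made up)", "lat": 41.90, "lon": 12.50},
    {"title": "School renovation (made up)", "lat": 44.50, "lon": 11.35},
    {"title": "Library refit (made up)", "lat": 44.52, "lon": 11.30},
]

# Roughly the centre of Bologna, with a 10 km search radius.
for project in projects_near(SAMPLE_PROJECTS, 44.4949, 11.3426, 10):
    print(project["title"])
```

With a list like this in hand, the rest of the monithon is not code at all: you go there, ring the bell and ask.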

All in all, #SOD13 was definitely a step in the direction of “everybody is a civic hacker” – and that, as far as I’m concerned, is the right direction. We’ll see how it pans out.