
Governing by rankings – How the Global Open Data Index helps advance the open data agenda

This blogpost was jointly written by Danny Lämmerhirt and Mária Žuffová (University of Strathclyde). It was originally published by Open Knowledge International on blog.okfn.org.

We are pleased to announce our latest report Governing by rankings – How the Global Open Data Index helps advance the open data agenda. The Global Open Data Index (GODI) is one of the largest worldwide assessments of how well governments publish open data, coordinated by Open Knowledge International since 2013. Over the years we have observed how GODI is used to monitor open data publication. But to date, less was known about how GODI may translate into open data policies and publication. How does GODI mobilise support for open data? Which actors are mobilised? Which aspects of GODI are useful, and which are not? Our latest report provides insights into these questions.

Why does this research matter?

Global governance indices like GODI enjoy great popularity due to their capacity to count, calculate, and compare what is otherwise hardly comparable. A wealth of research – from science and technology studies to sociology of quantification and international policy – shows that the effects of governance indicators are complex (our report provides an extensive reading list). Different audiences can take up indices to different (unintended) ends. It is therefore paramount to trace the effects of governance indicators to inform their future design.

The report argues that there are multiple ways of looking for ‘impacts’ depending on different audiences, and how they put GODI into practice. Does a comparative open data ranking like GODI help mobilise high-level policy commitments? Does it incentivise individual government agencies to adjust and improve the publication of open data? Does it open up spaces for discussion and deliberation between government and civil society? This thinking builds on an earlier report by Open Knowledge International arguing that indicators have different audiences, with different lived experiences, needs, and agendas. While any form of measurement needs to align with these needs to become actionable (which affects how the impact of indicators will take shape), it also needs to retain comparability.

Our findings

We used Argentina, the United Kingdom and Ukraine as case studies representing different degrees of open data publication, economic development and political set-up. Our report, drawing on a series of twelve interviews and document analysis, suggests that GODI drives change primarily from within government. We assume this finding is partly due to our limited sample size. While key actors in government are easy to identify, as open data publication is often one of their job responsibilities, further research is needed to identify more civil society actors and how they engage with GODI.

Below we describe nine ways in which GODI influences open data policy and publication.

  1. Gaining international visibility, achieving progress in the country rankings, or holding a generally high ranking may incentivise and maintain high-level political support for open data, despite the non-comparability of results across years.
  2. In the absence of open data legislation, GODI has been used by the Argentinian government as a soft policy tool to pressure other government agencies to publish data.
  3. Government agencies tasked with implementing open data used GODI to reward and point out progress made by other agencies, but also to flag blockages to high-level politicians.
  4. GODI sets standards for which datasets to publish and sets a baseline for improvement. Outcomes are debatable in categories where the central government does not have easy political levers to publish data.
  5. GODI may be conflated with broader commitments to open government and used as an argument to reduce investment in other aspects of the open government agenda. In the past, some high-level politicians presented a high ranking in GODI as evidence of government transparency, suggesting that other ways of providing government information had become obsolete.
  6. This effect may be exacerbated by superficial media coverage that reports on the ranking without engaging with broader information and transparency policies. An analysis of Google News results suggests that journalists tend to reproduce (mostly politicians’) misconceptions and conflate a good ranking in GODI with a high degree of government transparency and openness.
  7. Our findings suggest that individuals and organisations working on transparency and anti-corruption make little use of GODI due to a lack of detail and a misalignment with their specialised work tasks. For instance, Transparency International Ukraine uses the Transparent Public Procurement Rating to evaluate the legal framework, alongside the publication of open data.
  8. On the other hand, academics show interest in GODI for developing new governance indicators. They also often use country scores as a proxy for measuring open data availability.
  9. GODI has potential for use in data journalism. Data journalism trainers may use it as a source of government data during their trainings.

What we learned and the road ahead

Our research suggests that governments in all analysed countries pay attention to GODI. With a few exceptions, they use it mostly to support open data publication and pave the way for new open data policies. While this is a promising finding, it has important implications for GODI and its design. If GODI sets standards in open data publication, as some government interviewees suggest, it needs to represent different data demands in the assessment and to encourage the implementation of sound policies. The challenge is to support policy development, which is often a lengthy process, as opposed to short-lived rank-seeking.

Some interviewees suggested valuable avenues for GODI’s design. For instance, assessing progress in open data publication continuously rather than once a year over a limited timespan would require a long-term commitment to open data publication and create better opportunities for civic engagement, as it would prevent governments from updating datasets only once a year before GODI’s deadline. Another route forward is discussed in other recent research by OKI, which highlights the potential to adjust an open data index to align it more closely with the specific needs of topical expert organisations. Beyond engaging via GODI, civil society and academia might also participate in the development of new data monitoring instruments, such as the Open Data Survey, that are relevant to their mission.


How do open data measurements help water advocates to advance their mission?

This blogpost was jointly written by Danny Lämmerhirt and Nisha Thompson (DataMeet). It was originally published by Open Knowledge International on blog.okfn.org.

Since its creation, the open data community has been at the heart of the Global Open Data Index (GODI). By teaming up with expert civil society organisations, we define key datasets that should be opened by government to align with civil society’s priorities. We assumed that GODI also teaches our community to become more literate about the government institutions, regulatory systems and management procedures that create data in the first place – making GODI a tool for engagement with government.

Tracing the publics of water data

Over the past few months we have reevaluated these assumptions. How do different members of civil society perceive the data assessed by GODI? Is the data usable for advancing their mission? How can GODI be improved to accommodate and reflect the needs of civil society? How would we go about developing user-centric open data measurements, and would it be worthwhile to run more local and contextual assessments?

As part of this user research, OKI and DataMeet (a community of data science and open data enthusiasts in India) teamed up to investigate the needs of civic organisations in the water, sanitation and health (WASH) sector. GODI assesses whether governments release information on water quality – that is, pollution levels – per water source. In detail this means that we check whether water data is available, potentially at household level or for each untreated public water source such as a lake or river. The research was conducted by DataMeet and supervised by OKI, and included interviews and workshops with fifteen different organisations.

In this blogpost we share insights on how law firms, NGOs, academic institutions, funding and research organisations perceive the usefulness of GODI for their work. Our research focussed on the South Asian countries India, Pakistan, Nepal, and Bangladesh. All these countries face similar issues in ensuring safe water for their populations because of an over-reliance on groundwater, geogenic pollutants like arsenic, and heavy pollution from industry, urbanisation, farming, and poor sanitation.

According to the latest GODI results, openness of water quality data remains low worldwide.

What kinds of water data matter to organisations in the water sector?

Whilst all interviewed organisations have a stake in access to clean water for citizens, they have very different motivations for using water quality data. Governmental water quality data is needed to:

  1. Monitor government activities and highlight general issues with water management (for advocacy groups).
  2. Create a baseline to compare against civil society data (for organisations implementing water management systems).
  3. Detect geographic areas of under-provision as well as specific water management problems to guide investment choices (for funding agencies and decision-makers).

Each use case requires data of different quality. Some advocacy interviewees told us that government data, despite potentially poor reliability, is enough to make the case that water quality is severely affected across their country. In contrast, researchers need data that is provided continuously and at short update cycles. Such data may not be provided by government, so government data is seen as support for their background research, but not as a primary source of information. Funders and other decision-makers use water quality data largely for monitoring and evaluation – mostly to make sure their money is being used and is having impact. They will sometimes use their own water quality data to make the point that government data is not adequate. Funders push for data collection at project level rather than continuous monitoring, which can lead to gaps in understanding.

GODI’s definition of water quality data is output-oriented and of general usefulness. It enables finding the answer to whether the water that people can access is clean or not. Yet organisations on the ground need other data – some of which is process-oriented – to understand how water management services are regulated and governed, or which laboratory is tasked with collecting data. A major issue for meaningful engagement with water-related data is the complexity of water management systems.

In the context of South Asia, managing, tracking, and safeguarding water resources for use today and in the future is complex. Water management systems, from domestic to industrial to agricultural ones, are diverse and hard to examine and keep accountable. Where water is coming from, how much of it is being used and for what, and then how waste is being disposed of are all crucial questions to these systems. Yet there is very little data available to address all these questions.

How do organisations in the WASH sector perceive the GODI interface?

GODI has an obvious drawback for the interviewed organisations: transparency is not in itself a goal for organisations working on the ground, and does not by itself bring about increased access to safe water or environmental conservation. GODI measures the publication of water quality data, but is not seen to stimulate improved outcomes. Nor does it interact with the corresponding government agencies.

One part of GODI’s theory of change is that civil society becomes literate about government institutions and can engage with government via the publication of government data. Our interviews suggest that our theory of change needs to be reconsidered or new survey tools need to be developed that can enhance engagement between civil society and government. Below we share some ideas for future scenarios.

Our learnings and the road ahead

Adding questions to GODI

Interviews show that GODI’s current definition of water quality data does not always align with the needs of organisations on the ground. If GODI wants to be useful to organisations in the WASH sector, new questions could be added to the survey and used as a jumping-off point for outreach to these groups. Some examples include:

  1. Add a question about metadata and methodology documentation to capture the quality and provenance of water data, as well as where we found and selected the data.
  2. Add a question about who collected the data – government or a partner organisation. This would allow community members to trace the data producers and engage with them.
  3. Assess the transparency of water reports. Reports should be considered since they are an important source of information for civil society.

Customising the Open Data Survey for regional and local assessments

Many interviewees showed an interest in assessing water quality data at the regional and hyperlocal level. DataMeet is planning to customise the Open Data Survey and to team up with local WASH organisations to develop and maintain a prototype for a regional assessment of water quality. India will be our test case, since local data is available for the whole country, albeit to varying degrees across states. This may also include assessing the quality of data and access to metadata.

The highest level of transparency would mean having water data from each individual lab where samples are sent. Another use case for the Open Data Survey would be to measure the transparency of water laboratories. Bringing more transparency and accountability to labs would be most valuable for groups on the ground sending samples to labs across the country.

Storytelling through data

Whilst some interviewees saw little use in governmental water quality data, its usefulness can be greatly enhanced when combined with other information. As discussed earlier, governmental water data gives snapshots and may provide baseline values that serve NGOs as a rough orientation for their work. Data visualisations could present river and water basin quality and tell stories about the ecological and health effects.

Map of high (> 30 mg/l) fluoride values from 2013–14. From: The Gonda Water Data story

Behaviour change is a big issue in adopting sanitation and hygiene interventions. Water quality and health data can be combined to educate people: If you got sick, have you checked your water? Do you use a public toilet? Are you washing your hands? This type of narration does not require granular, accurate data.

Comparing water quality standards

Different countries and organisations have different standards for what counts as a high water pollution level. Another project could assess how the needs of South Asian countries are being served by comparing pollution levels against different standards. For instance, fluorosis is an issue in certain parts of India – not just because of high fluoride levels, but also because of poor nutrition in those areas. Should fluoride-affected areas in poorer countries have lower permissible amounts? These questions could be used to make water quality data actionable for advocacy groups.
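The idea of assessing the same measurements against several standards at once can be sketched in a few lines of code. The threshold values, standard names and site names below are purely illustrative assumptions of ours, not actual regulatory limits:

```python
# Illustrative sketch: flag water samples whose fluoride concentration
# (mg/l) exceeds the limit set by a chosen standard. The limits here
# are invented for illustration, not authoritative values.
STANDARDS = {
    "strict": 1.0,   # hypothetical stricter acceptable limit
    "lenient": 1.5,  # hypothetical relaxed permissible limit
}

def flag_samples(samples, standard):
    """Return the samples that exceed the given standard's limit."""
    limit = STANDARDS[standard]
    return {site: value for site, value in samples.items() if value > limit}

samples = {"well_a": 0.8, "well_b": 1.2, "river_c": 2.4}
print(flag_samples(samples, "strict"))   # well_b and river_c exceed 1.0 mg/l
print(flag_samples(samples, "lenient"))  # only river_c exceeds 1.5 mg/l
```

Running the same samples through different standards makes the policy question visible in the data itself: the set of "problem" sites changes with the chosen limit.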


Understanding the costs of scholarly publishing – Why we need a public data infrastructure of publishing costs

This blogpost was originally published by Open Knowledge International on blog.okfn.org. I’d like to thank Stuart Lawson and Jonathan Gray for their thoughtful comments and advice while writing this blogpost. This post draws from their paper “Opening the Black Box of Scholarly Communication Funding: A Public Data Infrastructure for Financial Flows in Academic Publishing.” Open Library of Humanities, 2(1). https://doi.org/10.16995/olh.72

Scholarly communication has undergone a seismic shift away from closed publishing towards ever-growing support for open access. With closed publishing models, academic libraries faced a so-called “serials crisis” and were not able to afford the materials they needed for their researchers and students. Partly in response to this problem, open access advocates have argued for increased access, whilst also changing the cost structure of scholarly publishing. In many countries this has led to experiments with ‘author pays’ models, where the prices of large commercial publishers have remained high, but the costs have shifted from readers to researchers.

Public data about the costs of these changing publishing models remains scarce. There is increasing concern that they may perpetuate oligopolistic and dysfunctional structures that do not serve the interests of researchers or their students, readers and audiences. Some studies suggest that the prices of open access publishing might unfairly discriminate against some institutions, and point out the sometimes stark pricing differences across institutions. Funding organisations and institutions worry that hybrid journals might levy ‘Article Processing Charges’ (a common way of funding open access publishing) while not providing a proportionate decrease in subscription costs – thereby charging researchers twice (so-called “double dipping”).

Yet the evidence is fragmented and incomplete. Members of Open Knowledge International’s network have been following this issue for several years. Jenny Molloy wrote a blogpost on the issue for Open Access Week three years ago. We have supported research in this area undertaken by Stuart Lawson, Jonathan Gray and Michele Mauri, and we published an associated white paper as part of the PASTEUR4OA project. To date, public data about scholarly publishing finances remains fragmentary, partial and scattered.

The lack of publicly accessible financial information is problematic for at least three reasons:

  1. It hinders the evaluation of existing publishing policies and financing models. For example, incomplete and conflicting data prevents funders from making the best decisions about where to allocate resources.
  2. Financial opacity also prevents us from getting a detailed view of how much money is paid in a country, per funder, academic sector, university, library, and individual researcher.
  3. Ultimately, a lack of knowledge about payments weakens the negotiating power of universities and libraries in market-coordinated forms of scholarly publishing.

As we celebrate International Open Access Week, Open Knowledge International strongly pushes for public data infrastructures of scholarly finances. Such infrastructures would enable the tracking, documentation, publication, and discussion of the different costs associated with scholarly publishing. Public data infrastructures would thereby provide the evidence base for a well-informed discussion about alternative ways of organising and financing scholarly publication in a world where open access to academic outputs increasingly becomes the norm. Below you see a model of the financial flows that could be captured by such a data infrastructure, focussing on the United Kingdom.

“Model of Financial Flows in Scholarly Publishing for the UK, 2014”, from Lawson, S., Gray, J., & Mauri, M. (2016). Opening the Black Box of Scholarly Communication Funding: A Public Data Infrastructure for Financial Flows in Academic Publishing. Open Library of Humanities, 2(1). https://doi.org/10.16995/olh.72

There is rising momentum within the larger research community – including funders, institutions and institutional libraries – to address the current lack of financial data around publishing. Earlier this year, Knowledge Exchange published a report underlining the importance of understanding the total cost of publishing, and the role of standard documentation formats and information systems in capturing those costs. Funding bodies in different countries are inserting reporting clauses into their funding policies to gain a better picture of how funds are spent. The UK’s higher education infrastructure body Jisc has worked with Research Councils UK to create a template for UK higher education institutions to report open access expenditure in a standardised way and release it openly. This effort should support negotiations with journal publishers around the total costs of publishing.

In different European countries, funders and institutional associations have started to create databases collecting the amounts paid through APCs to individual journals. The Wellcome Trust published information on how much it spent on open access publishing each year from 2010 to 2014. In a similar vein, the German Open APC initiative, part of the initiative Transparent Infrastructure for Article Charges, crowdsources data on GitHub to publicly disclose money spent by different European institutions on open access publishing. And Open Knowledge International hosts a wiki for payment documents requested via FOI.
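To illustrate what such crowdsourced payment databases make possible, here is a minimal sketch that aggregates per-article charges by institution. The column names and figures are invented for illustration and do not reflect the actual Open APC data model:

```python
import csv
import io
from collections import defaultdict

# Invented sample records in a simplified layout: one APC payment per row.
raw = """institution,journal,euro
University A,Journal X,1500
University A,Journal Y,2200
University B,Journal X,1800
"""

def total_apc_by_institution(csv_text):
    """Sum APC payments per institution from CSV text."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["institution"]] += float(row["euro"])
    return dict(totals)

print(total_apc_by_institution(raw))
# {'University A': 3700.0, 'University B': 1800.0}
```

Once many institutions disclose payments in a shared format, the same handful of lines scales to cross-institutional comparisons, which is precisely the evidence base the blogpost argues for.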

More financial transparency enables us to rethink how scholarly publishing is organised

These examples are important signposts towards public data infrastructures of scholarly publishing costs. Yet more concerted efforts and collaborations are needed to bring about a deeper shift in how scholarly publishing is organised. Full transparency would require knowing how much each institution pays to each publisher for each journal, ideally allowing these payments to be related to public funds. To gain these insights, it is necessary to understand the ways scholarly publishing is organised and to address diverse obstacles to transparency, including:

  • Multiple income sources and institutional financial management practices that prevent a disaggregated view of how much public funding is spent on open access
  • Payment models such as bundles, ‘big deals’, or annual lump sums that prevent a clear picture of open access costs
  • Policies that mandate the reporting of payments differently and only cover certain disciplines
  • Non-disclosure agreements that prevent transparent cost evaluations
  • Inaccurate or diverging price information, such as price lists that do not necessarily display real payments

What can be done to start contributing to a public database?

To lay the ground for collaboration, Open Knowledge International wants to spark a dialogue across open access advocates, funders, universities, libraries, individual researchers, and publishers. In what follows, we outline next steps that can be taken together towards a public data infrastructure:

Funders should insert reporting and disclosure clauses into funding policies, addressing both subscription payments and APC charges at a micropayment level (costs per article). Legal measures against non-disclosure agreements could include restricting or stopping funds to publishers that insist on them.

Funding organisations, institutions and individual researchers should increase (inter)national research activities on the topic: 1) to understand the magnitude of different cost types, 2) to identify new cost factors, the data needed to represent them, and the factors rendering them opaque, and 3) to analyse the benefits of alternative payment models for specific disciplines and institutions. Research should reflect rising administration costs and offer recommendations on how to mitigate them.

Institutions, funding organisations and individual researchers can deliver evidence by disclosing their payments in databases accessible to everyone. If other institutions follow their model, this allows for public comparisons of actual payments, making it possible to detect unreasonable price discrimination and to identify publishers not complying with open access funding policies.



Open data quality – the next shift in open data?

This blog post is part of our Global Open Data Index blog series. It is a call to recalibrate our attention to the many different elements contributing to the ‘good quality’ of open data, the trade-offs between them and how they support data usability (see some vital work on this by the World Wide Web Consortium). Focusing on these elements could help governments publish data that can be easily used. The blog post was jointly written by Danny Lämmerhirt and Mor Rubinstein.

Some years ago, open data was heralded to unlock information to the public that would otherwise remain closed. In the pre-digital age, information was locked away, and an array of mechanisms was necessary to bridge the knowledge gap between institutions and people. So when the open data movement demanded “Openness By Default”, many data publishers followed the call by releasing vast amounts of data in its existing form to bridge that gap.

To date, it seems that opening this data has not reduced but rather shifted and multiplied the barriers to the use of data, as Open Knowledge International’s research around the Global Open Data Index (GODI) 2016/17 shows. Together with data experts and a network of volunteers, our team searched, accessed, and verified more than 1400 government datasets around the world.

We found that data is often stored in many different places on the web, sometimes split across documents, or hidden many pages deep on a website. Often data comes in various access modalities. It can be presented in various forms and file formats, sometimes using uncommon signs or codes that are in the worst case only understandable to their producer.

As the Open Data Handbook states, these emerging open data infrastructures resemble the myth of the ‘Tower of Babel’: more information is produced, but it is encoded in different languages and forms, preventing data publishers and their publics from communicating with one another. What makes data usable under these circumstances? How can we close the information chain loop? The short answer: by providing ‘good quality’ open data.

Understanding data quality – from quality to qualities

The open data community needs to shift its focus from mass data publication towards an understanding of good data quality. Yet there is no shared definition of what constitutes ‘good’ data quality.

Research shows that there are many different interpretations and ways of measuring data quality. They include data interpretability, data accuracy, timeliness of publication, reliability, trustworthiness, accessibility, discoverability, processability, and completeness. Since people use data for different purposes, certain data qualities matter more to one user group than to others. Some of these areas are covered by the Open Data Charter, but the Charter does not explicitly name them as ‘qualities’ which add up to high quality.

Current quality indicators are incomplete – and miss the opportunity to highlight quality trade-offs

Existing indicators also assess data quality very differently, potentially framing our language and thinking about data quality in opposing ways. For example:

Some indicators focus on the content of data portals (the number of published datasets) or on access to data. A small fraction focus on the datasets themselves – their content, structure, understandability, or processability. Even GODI and the Open Data Barometer from the World Wide Web Foundation do not share a common definition of data quality.

At the moment GODI sets out the following indicators for measuring data quality:

  • Completeness of dataset content
  • Accessibility (access-controlled or public access?)
  • Findability of data
  • Processability (machine-readability and amount of effort needed to use data)
  • Timely publication

This leaves out other qualities. We could ask whether data is actually understandable by people. For example, is there a description of what each part of the data content means (metadata)?

Improving quality by improving the way data is produced

Many data quality metrics are (rightfully so) user-focussed. However, it is critical that governments, as data producers, better understand, monitor and improve the inherent quality of the data they produce. Measuring data quality can incentivise governments to design data for impact, by raising awareness of the quality issues that would otherwise make data files practically impossible to use.

Good data quality requires solutions jointly working together. Therefore, we would love to hear your feedback. What are your experiences with open data quality? Which quality issues hinder you from using open data? How do you define these data qualities? Please let us know by joining the conversation about GODI on our forum.


Mapping open data governance models: Who makes decisions about government data and how?

This piece was originally published by Open Knowledge International on blog.okfn.org.

Different countries have different models to govern and administer their open data activities. Ana Brandusescu, Danny Lämmerhirt and Stefaan Verhulst call for a systematic and comparative investigation of the different governance models for open data policy and publication.

The challenge

An important value proposition behind open data involves increased transparency and accountability of governance. Yet little is known about how open data itself is governed. Who decides, and how? How accountable are data holders to both the demand side and policy makers? How do data producers and other actors assure the quality of government data? Who, if anyone, are the data stewards within government tasked with making its data open?

Getting a better understanding of open data governance is not only important from an accountability point of view. With better insight into the diversity of decision-making models and structures across countries, the implementation of common open data principles, such as those advocated by the International Open Data Charter, can be accelerated.

In what follows, we seek to develop the initial contours of a research agenda on open data governance models. We start from the premise that different countries have different models to govern and administer their activities – in short, different ‘governance models’. Some countries are more devolved in their decision making, while others seek to organize “public administration” activities more centrally. These governance models clearly shape how open data is governed – producing a broad patchwork of open data governance arrangements across the world and making it difficult to identify who the open data decision makers and data gatekeepers or stewards are within a given country.

For example, if one wants to accelerate the opening up of education data across borders, in some countries this may fall under the authority of sub-national government (such as states, provinces, territories or even cities), while in other countries education is governed by central government or implemented through public-private partnership arrangements. Similarly, transportation or water data may be privatised, while in other cases it may be the responsibility of municipal or regional government. Responsibilities are therefore often distributed across administrative levels and agencies affecting how (open) government data is produced, and published.

Why does this research matter? Why now?

A systematic and comparative investigation of the different governance models for open data policy and publication has been missing to date. To steer the open data movement toward its next phase of maturity, it is urgent to understand these governance models and their role in open data policy and implementation.

For instance, the International Open Data Charter states that government data should be “open by default” across entire nations. But the variety of governance systems makes it hard to understand the different levers that could be used to enable nationwide publication of open government data by default. Who effectively holds the power to decide what gets published and what does not? By identifying the strengths and weaknesses of governance models, the global open data community (along with the Open Data Charter) and governments can work together to identify the most effective ways to implement open data strategies and to understand what works and what doesn’t.

In the next few months we will seek to increase our comparative understanding of decision-making mechanisms related to open data within and across governments, and to map the relationships between data holders, decision makers, data producers, data quality assurance actors, data users and gatekeepers or intermediaries. This may provide insights into how to improve the open data ecosystem by learning from others.

Additionally, our findings may identify the “levers” within governance models that can be used to publish government data more openly. And finally, greater transparency about who is accountable for open data decisions could allow for a more informed dialogue with other stakeholders on the performance of open government data publication.

We are interested in how different governance models affect open data policies and practices – including the implementation of global principles and commitments. We want to map the open data governance process and ecosystem by identifying the following key stakeholders, their roles and responsibilities in the administration of open data, and examining how they are connected:

  • Decision makers – Who leads/asserts decision authority on open data in meetings, procedures, conduct, debate, voting and other issues?
  • Data holders – Which organizations / government bodies manage and administer data?
  • Data producers – Which organizations / government bodies produce what kind of public sector information?
  • Data quality assurance actors – Who are the actors ensuring that produced data adhere to certain quality standards, and does this conflict with their publication as open data?
  • Data gatekeepers/stewards – Who controls open data publication?

We plan to research the governance approaches to the following types of data:

  • Health: mortality and survival rates, levels of vaccination, levels of access to health care, waiting times for medical treatment, spend per admission
  • Education: test scores for pupils in national examinations, school attendance rates, teacher attendance rates
  • National Statistics: population, GDP, unemployment
  • Transportation: times and stops of public transport services – buses, trains
  • Trade: import and export of specific commodities, balance of trade data against other countries
  • Company registers: list of registered companies in the country, shareholder and beneficial ownership information, lobbying register(s) with information on company and association representatives at parliamentary bodies
  • Legislation: national legal code, bills, transcripts of debates, finances of parties

Output of research

We will use different methods to get rapid insights. This includes interviews with stakeholders such as government officials, as well as with representatives of open government initiatives from various sectors (e.g. public health services, public education, trade). Interviewees may be open data experts, as well as policymakers or open data champions within government.

Beyond the broad topic of “who is doing what”, the questions we will seek to answer include:

  • Who holds power to assert authority over open data publication? What roles do different actors within government play to design policies and to implement them?
  • What forms of governance models can be derived from these roles and responsibilities? Can we see a common pattern of how decision-making power is distributed? How do these governance models differ?
  • What are the criteria to evaluate the “performance” of the observed governance models? How do they, for instance, influence open data policy and implementation?

Data and the City: How public data is fostering civic engagement in urban regions

This blogpost was originally published by Open Knowledge International on blog.okfn.org.

How can city data infrastructures support public participation in local governance and policy-making? Research by Jonathan Gray and me examines the new relationships and public spaces emerging between public institutions, civil society groups, and citizens.

The development of urban regions will significantly affect the lives of millions of people around the world. Urbanization poses challenges including housing shortages, the growth of slums and urban decay, inadequate provision of infrastructure and public services, poverty, and pollution. At the same time, cities around the world publish a wide variety of data, reflecting the diversity and heterogeneity of the information systems used in local governance, policy-making and service delivery.

These “data infrastructures” are commonly considered a “raw” resource – a trove of data for mining, composed of databases, APIs, cables, and servers. Opening up these data infrastructures to the public is said to advance progress towards a range of goals – including transparency, accountability, public participation, public service delivery, technological innovation, efficiency and economic growth. However, little is known about how the public sphere, and civil society in particular, engages with data infrastructures to advance progress around urban issues.

To shed light on this question, Jonathan Gray and I have published a new report titled “Data And The City: How Can Public Data Infrastructures Change Lives in Urban Regions?” We are interested in how city data infrastructures can support public participation in local governance and policy-making. The report demonstrates how public data infrastructures create new kinds of relationships and public spaces between public institutions, civil society groups, and citizens.

In contrast to more supply-oriented ideas around opening (government) data, we argue that data infrastructures are not a mere “raw” resource that can be exploited. Instead they are best conceived as a lively network or ecosystem in which publics creatively use city data to engage with urban institutions.

We intend to spark imagination and conversation about the role that public data infrastructures may play in civic life – not just as neutral instruments for creating knowledge, but also as devices to organise publics and evidence around urban issues; creating shared spaces for public participation and deliberation around official processes and institutions; and securing progress around major social, economic and environmental challenges that cities face.

Our report describes six case studies from cities around the world to demonstrate civil society’s vast action repertoire for engaging with urban data infrastructures. One case study shows how a British civil society organisation gathered budget data from municipal governments through freedom of information requests. This information was fed into an open database and made accessible to finance experts and scholars so that they could run a “public debt audit”. This audit enabled government officials and the larger public to debate the extent of public debt in British cities and to uncover how a lack of public scrutiny increased the profits of financial institutions while putting a strain on the public purse.

Another case shows how a research group re-appropriated data on incarceration to highlight structural issues in urban neighbourhoods and to reform criminal justice in the United States. The case studies highlight how official data is often creatively repurposed, aggregated or augmented with other data sources in the context of evolving data infrastructures which are attuned to the specific needs and interests of civil society actors.

In detail, civic actors can engage with data infrastructures to:

  • Identify spaces for intervention. Having cadastral data at hand helped civic actors to identify vacant publicly-owned land, to highlight possibilities for re-using it and to foster community building in neighbourhoods around its re-use.
  • Open spaces for accountability. Using government’s own accounting measurements may provide civil society with evaluation criteria for the effectiveness of public sector programs. Civil society actors may develop a ‘common ground’ or ‘common language’ for engaging with institutions around the issues that they care about.
  • Enable scrutiny of official processes, institutional mechanisms and their effects. By opening public loan data, civil society was able to identify how decentralised fiscal audit mechanisms may have negative effects on public debt.
  • Change the way an issue is framed or perceived. By using aggregated, anonymized data about home addresses of inmates, scholars could shift focus from crime location to the origin of an offender – which helped to address social re-entry programs more effectively.
  • Mobilise community engagement and civic activism, including facilitating the assembly and organisation of publics around issues.

The report argues that a broader vision of public data infrastructures is needed, one that goes beyond matters of technical and legal openness. Drawing on ongoing research around participatory data infrastructures, our report foregrounds how governments may take steps to make public information systems responsive to the interests and concerns of different publics.

You can find the full report here.