By Steve MacFeely, Head of Statistics at UNCTAD
What distinguishes data evolution from data revolution?
In 2013, the UN High-Level Panel of Eminent Persons on the Post-2015 Development Agenda called for a Data Revolution to exploit the opportunities presented by the new data landscape. The idea of a data revolution seemed to capture the zeitgeist perfectly. In retrospect, it seems as if the term ‘data revolution’ had just been waiting for someone to say it aloud. No sooner had the words been uttered than the term was immediately adopted in public discourse and diplomatic declarations.
What did it mean? For an event or movement to be revolutionary, as opposed to evolutionary, it must presumably be in some way disruptive or transformational. What events or movements in the world of official statistics would be sufficiently disruptive or transformational to deserve being called revolutionary? Is the emergence of big data sufficiently transformational to qualify as a revolution? Or what about the Open Data movement – is that disruptive enough? Is the data revolution about new partnerships between NSOs and civil society or citizen science? Or is something more profound like a deepening of human capital and statistical literacy required to justify the term?
The data revolution, if indeed there has been such a revolution, is a curious one. Notwithstanding the chatter about big data and data analytics, there has been no obvious coup d’état, no shouting or marches on the streets, no Viva la Revolución, no data riots. But that does not mean there hasn’t been one! Perhaps, like many scientific revolutions before it, the data revolution has been a silent one and we may not fully grasp the implications until it is too late.
These are some of the issues and questions I set out to investigate in my paper ‘In Search of the Data Revolution: Has the Official Statistics Paradigm Shifted?’. There are so many different data ecosystems, and the data revolution is such a huge topic, that I limited my investigation to the world of official statistics. But even then, the canvas was enormous. Using definitions provided by the Independent Expert Advisory Group on a Data Revolution for Sustainable Development in their report ‘A World that Counts’, I identified potential data revolutions. I then evaluated these revolutions using a framework derived from Thomas Kuhn’s work ‘The Structure of Scientific Revolutions’.
The paper looks at multiple data revolutions: the data privacy revolution, the open data revolution, the big data revolution and the social data revolution, to name a few. A surprise coming from this investigation is that the term data revolution, fittingly, can be traced all the way back to the 1960s. In summary, not only is the term ‘data revolution’ not new, but the meaning of the term hasn’t evolved that much, if at all. On the contrary, it is remarkable how little has changed, raising the question of whether it is possible for a revolution to continue for 60 years. The paper also examines the antecedents of the data revolution in the digital revolution, and how digitalisation has utterly changed the concept of data, from a narrow numeric viewpoint to a much broader concept that now comprises audio, visual and text information. This rescoping has led to an explosion of data, including the ‘paradigm destroying phenomena’ of big data, which has facilitated ‘correlation is enough’ algorithm-based decisions. This Copernican shift in discovery and decision-making poses profound questions for us all.
The impact of secularisation, the emergence of risk, and the two great historical tidal waves of industrialisation and empire on the growth in demand for data is examined. So too are the ‘statistical revolutions’ that emerged from the Great Depression and WW2, which led to some of our most enduring statistical concepts and indicators, notably the development of national income, labour force and trade statistics. The connections between Taylorist performance metrics, New Public Management and evidence-informed policy making are also discussed.
The paper also examines the journey of official statistics from serving only the state to that of a public good, serving democracy and accountability. This progressive view is quite a recent development, and was only formalised globally in 2014, when the Fundamental Principles of Official Statistics were endorsed by the United Nations General Assembly. In doing so, heads of state from around the world were explicitly saying that official statistics were a public good. Now, 25 years after the principles were first adopted by the UN Statistical Commission in 1994, that view is being challenged, as a new cold war for the ownership of our data is underway. Some see Open Data as the solution, and indeed it might be. But there are risks here too, as most ‘open data’ initiatives are drives to open government data only, and this may inadvertently contribute to data inequality. There are already asymmetries in openness between private and public sector data, as public data are classified as public goods whereas corporate data are classified as marketable and proprietary assets.
The data deluge has created a real challenge for both privacy and confidentiality. So much so that one can’t help but wonder whether privacy as an ‘ideal’ might be alive and well while privacy in ‘practice’ is on life support. One of the biggest challenges for official statistics is how, and whether it is sustainable, to protect the confidentiality of super large multinational enterprises. We are likely to hear about differential privacy and differential confidentiality in the coming years. But perhaps it is already too late; an unsuspecting public makes Faustian bargains every day under the illusion that by signing away their rights they have secured valuable discounts or better services.
In this paper, I argue that there has not been a single data revolution, but many. The Data Revolution is in fact a series of revolutions. Those revolutions are a function or consequence of other revolutions: digital, informational, cultural and social. I also speculate on future crises that may trigger new data revolutions.
The data revolution(s) as we now understand the term is inextricably linked to the SDGs. It began as an aspiration, a plea for better data, but quickly transformed into a fact. Diplomats make reference to it, not as a future state, but as the solution. But to realise the aspirations, I (and other chief statisticians) argue that we need a Global Data Convention to safely access and use data (and by extension statistics) while protecting the rights of citizens. Such a convention will need to be global in order to address ethical and sovereignty issues. It must re-establish some sort of social contract that strikes a balance between community dataveillance and individual and human rights, between security and privacy, between commerce and public good, between asymmetries in private and public openness, between data ownership and reward. In an era of faltering multilateralism, it may be convenient to turn a blind eye, but given the importance of data to all of our futures, the United Nations cannot ignore this challenge. Governments cannot abdicate their responsibility either.