
Preface:

Governments cannot embrace, much less promote, Big Data, Open Data, Analytics, Machine Learning & Ubiquitous Algorithms without protecting the Citizens for whom they work. Social Engineering must be by choice, not by default through illiterate political leaders.

Body:

The UK Government, as part of its “Digital Economy” initiative, has just released with great fanfare the “Data Science Ethical Framework”. Its ministerial champion has characterized it as “harnessing the Progressive power of Data Science while protecting the Public”. It does neither, but it clearly illuminates the lengths to which the UK Government (along with others) will go in trying to influence or dictate behavior in areas where it has no literacy at all, understanding neither the underlying capabilities (Data, Analytics & Algorithms) nor the consequences of the harm (or actual good) that can come if left to their own devices. Not to be left to a footnote, however, is the fact that these attempts at behavioral influence do not apply to the Intelligence community or Police services, both of whom want unlimited powers to surveil, to gather data on everyone’s daily lives (and perhaps thoughts) and then to use that data to predict behaviors, i.e. the Snoopers’ Charter.

Ever since the notion of Big Data came onto the scene, many have extolled its virtues in changing the world as we know and understand it. With a zeal not previously seen, they have hyped the notions of Data Science, Data Scientists, Algorithms & Machine Learning, etc. Virtually all of them have advocated for its wide-scale use to analyze and predict citizens’ behavior in order to gain deeper insights, without any controls as to “just how creepy” this activity could get in terms of interacting with the public at large. Any attempt to limit “how and where” Big Data & Analytics should be applied was met by the fury of these same advocates, who characterized it as “stifling economic growth and wealth creation”. Not surprisingly, most advocates have been highly influential in getting Governments to go along with their thinking and to take a “hands off” approach. This has not worked out well for consumers, who now see their daily lives dissected, analyzed and ultimately manipulated by the algorithms & machine learning associated with the deep behavioral insights now available to almost every organization that invests in Data & Analytics capabilities.

The backlash that has now arisen from this lack of control is significant enough that many Governments have created Ethics Councils and other bodies that have gone on to generate reports & recommendations on the issue of “Ethics in the age of the Algorithm”. Additionally, these same governments (US, UK, EU, etc.) are also major advocates of Digital and have undertaken major Digital Strategy & Transformation efforts within their countries[1]. These efforts have served to further exacerbate the Ethics Problem that we are now experiencing. A common thread found amongst all of this is the seeming cluelessness that Government Leaders, Ministers & Civil Servants exhibit each and every time they make an address or pronouncement on the topic of Privacy, Ethics, Governance, etc. associated with Big Data, Analytics, Algorithms, Digital, etc. Clearly, they understand neither the underpinnings of the issues nor the reasons why this topic has become so paramount in the public’s mind, along with the public’s stated demand that it be resolved to their satisfaction.

Data (Big or Small), Analytics (Creepy or Helpful) & Algorithms (Evil or Good) are major influences on how the Digital World around us evolves, let alone how it serves us. Beyond the well-rehearsed platitudes, there needs to be a fundamental mastery of the details associated with these domains by the Leaders & Policy Makers who are ultimately accountable for making Citizens’ lives better, not to mention protecting them from threats. Without strong & competent Leadership, and controls (governance), these same citizens will be victimized rather than benefited by Data, Analytics, Algorithms & Digital. The requirement for competent leadership is not a political platform for campaigning on, but a focal point for Government action in order to uphold basic human rights, no matter what pace of transformational change the country is experiencing.

An Ethics Framework that relies on self-governance, best efforts and serendipity to ensure that consumer Privacy is protected and that Citizens are not victimized by their own data is a recipe for disaster. Government Leaders must commit themselves to leading at all levels and across all domains. They must be literate and competent in the areas that they promote as catalysts for change and not leave Citizens to the vagaries of Data Science, and all that it portends.

[1] The UK Government has gone so far as to make the “Digital Economy” a centerpiece of the Queen’s Speech in spite of not being able to come up with a companion “Digital Strategy” that was promised quite some time ago.

  • An edited version of this posting appeared in the June 2016 issue of Information Age (UK) (www.information-age.com)

Preface: The Oxford English Dictionary (OED) defines a Citizen Scientist as a member of the general public who undertakes scientific work in collaboration with or under the direction of professional scientists and/or scientific institutions. The OED currently has no definition for either Data Science (it is also not recognized as a legitimate science by any scientific body in the world) or Data Scientist, aka Unicorn.

Body*:

The notion of Data Science was born from the recent idea that “if you have enough data, you don’t need much (if any) science to divine the truth and foretell the future” (as opposed to the long-established rigors of Statistical or Actuarial Science, which most often require painstaking effort and substantial time to produce their version of “the truth”). Practitioners of this so-called science are the self-proclaimed Data Scientists, purported to have the “sexiest job” one can have today. The Data Scientist is a catchall role that defies a common definition[1] but claims that “We can do anything you want with any data that you have” (akin to “Torturing the Data to obtain some version of the Truth”). Much of the hype of Data Science has been coupled with the virtues of Big Data (and all that entails). Now that we are starting to see Big Data wane, without much of a solid foundation having been built to date, it has become clear to me (and many others no doubt) that Data Science is on the cusp of being relegated to the “Junk Science[2]” rubbish bin in fairly short order. I for one will not mourn the death of Data Science, or the abatement of hype surrounding it (much less Big Data).

Rather than embracing this untested (and perhaps doomed) form of science and aimlessly searching for Unicorns, aka Data Scientists, to pay vast sums to, many Organizations are now embracing the idea of “making everyone Data & Analytics Literate”. This leads me to what my column is really meant to focus on: The Rise of the Citizen Scientist.

The Citizen Scientist is not a new idea whatsoever, having seen action in the Space & Earth Sciences world for decades now (NASA SETI), and it has really come into its own as we enter “The Age of Open Data”. Given the exponential growth of Open Data initiatives across the world (the UK remains the leader, but has growing competition from all geographies), the need for Citizen Scientists is now paramount. As Governments open up vast repositories of new data of every type, the opportunity for these same Governments (and Commercial interests) to leverage the passion, skills & collective know-how of Citizen Scientists to help garner deeper insights into the Scientific & Civic challenges of the day is substantial. They can then take this knowledge and the collective energy of the Citizen Scientist community to develop common solution sets and applications to meet the needs of all their Constituencies without expending much in terms of financial resources or suffering substantial development time lags. This can be a windfall of benefits for every level (National to Local) or type of Government (or Industry sector) found around the world.

The use of Citizen Scientists to tackle so-called “Grand Challenge” problems has been a driving force behind many Governments’ commitments to and investments in Open Data to date. There are so many challenges in governing today that it would be foolish not to employ these very capable resources to help tackle them. The benefits manifested from this approach are substantial and well proven. Many are well articulated in the Open Data Success Stories to date. Additionally, one need only attend a local “hack fest” to see how engaged Citizen Scientists of every age, gender & race are and to feel the sense of community that these events foster as everyone focuses on the challenge(s) at hand and works diligently to surmount them using very creative approaches. To get a flavor of some of the larger challenges being addressed by Citizen Scientists & Governments you can visit the (US) Citizen Science Alliance site (http://www.citizensciencealliance.org/index.html) or the (UK) (http://www.ukeof.org.uk/our-work/citizen-science).

As Open Data becomes pervasive in use and mature in respect to the breadth & richness of the data sets being curated, the benefits returned to both Government and its Constituents will be manifold. The catalyst to realizing these benefits (and attendant ROI) will be the role of the Citizen Scientist. Citizen Scientists are not going to be Statisticians, Actuaries or so-called Data Scientists, but ordinary people with a passion for science and learning and a desire to contribute to solving the many Grand Challenges facing society at large. I believe that their efforts will do more to turn the tide on societal and environmental challenges than all other undertakings combined.

[1] Neither Data Science nor Data Scientist merits an entry in the current OED.

[2] Junk Science is commonly regarded as “untested or unproven theories presented as scientific fact”. In Actuarial terminology it constitutes “Unsound Results”.

*This posting in an edited version originally appeared in the June 2015 issue of Information Age (UK)