Questioning the role of data ethnography in government databases
As a Code for Canada fellow at the Canada Energy Regulator, I am part of a team redesigning a key energy database that contains historical energy, environmental, Indigenous, and energy regulation information for the nation. Much AI is embedded in its functioning, and it relies heavily on an accurate timeline of events. For example, the database holds over 50 years of permafrost data for our tundra, which must be accurate over time for us to track changes in the climate throughout these decades. In my work, one key component of AI policy development that hasn't been touched on here is the government's role in data ethnography, which refers to the analysis of how people live with data over time.
All AI algorithms are trained with datasets; these are, by necessity, datasets that have been collected from our past. For example, copyrights expire after 50 years or more, and the works then enter the public domain, where they can be released as open data. This data from books, art, and more has a history and, with it, the many biases of that history, which we may not want to repeat. This issue becomes apparent when assistant bots such as Siri and Alexa are given female voices, or when Google's image recognition labels photos of Black people as gorillas. Using datasets from our past repeats that past. Moreover, this data is incomplete and always changing, which is concerning given that data forms the basis of many policy decisions. Understanding where that data comes from, where it has traveled, and how it generally behaves is key to trusting the data. Measuring this path not only ensures that decision-making is conducted in an informed way, but also lays the groundwork for assessing algorithmic impact. It gives us a way to interrogate the data and find answers informed by the limits of the data.
To put words to this ethnographic analysis of AI and algorithms, Rahwan of the Max Planck Institute for Human Development proposes a radical idea: the best way to understand them is to observe their behavior in the wild. In the journal Nature, Rahwan and 22 colleagues call for the inauguration of a new field of science called "machine behavior" (https://blog.experientia.com/the-anthropologist-of-artificial-intelligence/).
Terms have yet to be created to define this ethnographic analysis of AI and algorithms; machine behavior is only one way to describe it. Another approach is to refer to this kind of data as "thick data", in which "ethnography and big data analytics can work together to provide a more comprehensive picture of big data, and can thus, generate more societal value together than each approach on its own." (https://link.springer.com/chapter/10.1007/978-3-319-93061-9_2) To me, this combination asks: what of machine sociology? In what ways are algorithms socialized? How does this impact our decision-making as individuals, as a society, as governments? In what ways can we educate the public about AI and AI algorithms, and also facilitate their access to them? Can people create their own AIs with their own data?
In Estonia, citizens have a portal through which they can access their own data at any time. This record of their data can behave as a governmental ethnographic analysis of each citizen's data, upon which informed decisions can be made, policy or otherwise. David Eaves of Harvard Kennedy School, giving evidence to a Canadian parliamentary committee earlier this year, stated that this makes it "easy to pull disparate information about a citizen all together to get a very clear view about who that person is, and then to offer that information to different parts of government as it's trying to do its service. This is very different from many other countries..." (https://www.ourcommons.ca/DocumentViewer/en/42-1/ETHI/meeting-135/evidence).
I'd venture to say that this is just one way to do it. We have yet to re-imagine what this can look like in other countries, in other government contexts, and in our many futures as a global society.
Published in response to the call for comments on the OECD Observatory of Public Sector Innovation’s Primer on Artificial Intelligence: https://oecd-opsi.org/ai-consultation/#comment-6838