Why all the fuss about the safety of AI? Is it because some people are frightened of how ChatGPT could change the world? Or is it that we are more interested in machines and algorithms, forgetting that it is data that feeds AI and can lead to biased, even dangerous decisions? In this blog, I will explore the concept of digital self-determination, a principle that offers insights into understanding your digital identity amid a rapidly evolving technological landscape.
You have just discovered a three-part blog series on something called digital self-determination (DSD). If you are interested in your data but not entirely sure what it is, where it goes, and what can be done so that it is not manipulated, we think it is important that you know more about digital self-determination and have an opportunity to help make it happen. If you work with data more professionally, these blogs will give you a new way to understand data challenges and what can be done about them.
These blogs are intended to give you an understanding of DSD, its key features, and why it matters to you as a means of engaging with, and better managing, your data. There is a discussion of why we should participate in DSD, how it supplements other governance approaches, and where DSD is compatible with current debates about inclusion and responsible data governance. Blog 1 introduces DSD by clearing up confusion about what data is, who ‘owns’ it, whether it creates ‘rights’, and how it is traded. Your managing and controlling of data is at the heart of DSD, so we all need to be on the same page when considering what DSD can do for us as the creators and receivers of data. Is data just another ‘thing’ that big market players make money from, or can it be more personal and possible to trust? We are all bombarded with information about digital transformation and digital literacy, but can we be anything more than passive players in our digital spaces? And before we sign off from this blog, let’s think about how existing efforts to make AI ‘responsible’ and personal data ‘protected’ could do with some new approaches in which we are more included.
What is data?
Maybe you can give a complicated scientific or technical answer, or perhaps you are not sure. Put simply, when people create data it is usually messages sent between and within a digital community with a clear purpose or intention. If you send a social media post to your chat group, or comment on something streamed to you as a member of an internet list, then a community exists or is created that receives and responds to your communication. In this way, your data is intended for a specific community and not just anyone out in cyberspace.
People can communicate data with machines and receive data in return. Technology and algorithms can watch us through surveillance cameras and gather information about us from chips and magnetic strips. Whatever the source or the intention, data is more often than not about people. Once this data is reused by those within or outside that digital community, it can take on other forms, like predictions of our purchasing preferences sent to us through advertising we did not request, or predictions of who is likely to commit a crime. Whether we want it or not, and despite protective regulations, this is currently the way our data is reused, and for us it is one of the main reasons for DSD intervention.
Data is intangible – it is not a thing, or even a ‘bit’. In the past, it was reduced to numbers or some other material representation for us to understand. Today data is digitized. It can be used and reused over and over again without losing value. Unlike property, data is not marketed as a ‘surplus’ of something, because it can be processed endlessly. These features make data difficult to secure, contain, or render scarce. If all this is so, then how can data be owned, if we see ownership as relating to ‘things’ that are limited to our control alone? And is it better to see data as something that can be responsibly and respectfully shared for the benefit of all who have an interest? DSD talks this language.
Digital Self-Determination: Three components
Do you already have a first impression of DSD? Good – now you need to know how DSD works, and whether it can do what other data governance approaches are not doing. DSD involves three components:
- Digital – the digital world is where data is largely transacted, constantly shifting between data spaces. A ‘space’ is where people socialize; if that space is created digitally and socializing occurs through data exchange, then the people in that space have interests in that data. DSD is located in digital spaces and deals with data management across digital arrangements and relationships, whether personal or commercial. By creating and maintaining trust between data stakeholders – whether small actors who start a data communication or big players who reuse that data for commercial benefit – DSD ensures beneficial access and respectful relationships around data. The creation of safe data spaces is necessary for data subjects to manage and transact their data and that of their communities. Safety in this sense is more likely when people trust each other’s data use. Let’s give an example of what is meant here. If we buy or sell something on a digital commerce platform, it only becomes a safe space when we trust that the transaction will be completed to our satisfaction and that the platform will give us some back-up if our purchase or sale turns out not to be as anticipated. Sometimes this trust is ‘blind’, if the platform does not offer sufficient transparency about who is transacting, what is being traded, and the consequences of something going wrong. To make sure trust is earned rather than blind, it may be necessary for the data providers and other stakeholders in the transaction to make us aware of how they are using our data and to give us some say in that use. In this way, DSD offers more than rights claims, such as pursuing a right to privacy through expensive court litigation for which we may have neither the money nor the energy. It is about being given an informed choice to manage data about us and our communities in a safe and trusted setting.
- Self – DSD is focused on the data subject (you and me), but it is not limited to an individualist notion of ‘self’. Rather, as data is created between individuals (and their communities), those who share and use this data should have duties and responsibilities to each other. The ‘self’ in this sense centres on making data subjects (in their data communities, and wherever their data may be transacted beyond these communities) more informed and capable about data access and use, ensuring their digital personality or identity is as they intend it. Another example will explain the ‘self’ a bit better. If you post something on social media, say a selfie, it says something about your identity and personality. The platform might claim that because you consented to their access, they have a legitimate interest in the post and can use that data to make money. In this way, what was once data about you, meant to say something about you to your data community, has been converted through reuse into business data. Did you have any part in this change? This concern is much more than a question of privacy, which we often risk with our data posts. It is about making sure that we know how our data is passed on and used, and have an opportunity to control this process.
- Determination – in DSD, data subjects are at the centre of data management considerations and need information about their data in order to control it. DSD works towards the informed transaction and management of data in arrangements that can benefit you, the people you communicate with, and those who want to access and use your data. DSD involves informed choice and being given the opportunity to make actual and genuine data decisions. Data subjects and their communities become the first line of data access, management and use. DSD does not have to rely on external law, principles or regulations to be activated on our behalf. Instead, it operates on the understanding that we value informed choice in managing our data, and that we cannot make that choice unless we know about data use and are given some space to make data arrangements that suit us. For instance, when I ask a bank for a loan, the bank will require my personal credit information to decide whether I am a suitable risk. If I am denied the loan, I need to know whether this was due to something in that information, and how the bank used it to come to the negative decision.
How is DSD different from other approaches to data access and use?
DSD can be seen as supplementary to, not a replacement for, other data governance modes such as data rights, personal data protection, or ethics. It relies on mutualizing the benefits of responsible data access, based on more equitable relationships among public, private and individual stakeholders in controlling and managing data. By focusing on enabling data subjects (like us) in safe digital/data spaces (like the relationship between a patient and her doctor), DSD supplies what is missing in these other governance modes: the creation of respectful and responsible social bonds between data stakeholders (powerful and less powerful) that position data control more fairly and thereby enliven responsible data access.
DSD is what can be described as a ‘context-specific’ governance strategy with stakeholders working towards mutualized interests in responsible data sharing, essentially different from top-down governance models.
DSD returns the governance direction and momentum to those who create data and as such should have the first option for its management. This repositioning of data priorities relies on (and generates) trust between data stakeholders (those who make, use and reuse data) instead of relying on rights, or laws, or best practices.
Now because data governance discussions are shifting from data security to responsible data access, there is an opportunity to create a self-regulatory model where:
- power is redirected to the data subject, initially through correcting information deficits;
- powerful data holders, users and marketers are willing to concede power in that direction out of self-interest, such as reputational benefit, data integrity, and trusted data access; and
- the language of data access and trading moves from concealment and exploitation to respectful and responsible co-creation, with mutual benefits grounded in the trust that responsible use, and the social benefit flowing from it, generate.
Obviously, DSD depends on the willingness of stakeholders to participate, and therefore on the capacity to identify those stakeholders and their data links. Rather than treating this as a major failing of the DSD frame, it is important to recognise and adapt to the limits of information sharing and the mutualised interests of stakeholders. Recognising these as key factors of DSD in different contexts ensures that all stakeholders in a digital/data space, including digital communities, effectively ‘own’ the process.
Why isn't requiring responsible AI use or regulating its risks sufficient?
The international interest in the regulation and governance of AI is moving in a risk/safety direction. Earlier concentrations on ethics, personal data protection and even data rights are being surpassed by legislative and policy considerations of risk minimization and resilience. In this context the requirement for responsible AI has emerged. This strategy is primarily interested in governing AI technology, its creation and uses, and novel applications like generative AI and large language models. What is mainly missing from this discussion is a focus on responsible data access, use and reuse. At present, any AI technology/ecosystem relies for its operation on the input of data largely produced through human communication and interaction. AI cannot process and reproduce data (even synthetic data) without some original data harvesting and feeding. However, in the era of Big Data and mass data sharing, data is moving further away from its sources, and tracing and confirming its accuracy and integrity is complex. Additionally, unauthorised or uncharted data reuse in voracious data markets, and the opening up of new demands for data submission, mean that responsible data use is at least as crucial to ensuring responsible AI as requiring the safety and robustness of AI technology and its new creations and applications.
Digital self-determination is a co-creation, co-production data governance model that engages all stakeholders in the data ecosystem to ensure that the mutual interests around data access and use are responsible, equitable, accountable, respectful and transparent. Obviously, this requires that stakeholders trust each other despite coming to data access and use issues with very different interests and understandings. Trust is the grease that makes DSD run.
DSD can be directed to all data access/use contexts where data subjects and their communities have identified interests, and other data market players can be encouraged to negotiate mutually acceptable access and use regimes. Such a model can not only ensure data practices that stakeholders endorse and trust, but also identify potential bias and improve data accuracy. These preconditions are essential for any responsible AI project focusing on technology and its applications.
Some of you by now are thinking: this all sounds great in theory, but it won’t work in real life. Why would big tech companies want to share data with people like us? In any case, why would I want to be involved in managing my data, and what benefit is there if I do? We will tackle these and other challenges in the blogs to follow. See you then.