XXX.3 May + June 2023
Page: 60

Data Practices and Data Stewardship

Janis Wong

Data protection laws and technologies limit the vast personal data collection, processing, and sharing in our digital society. These tools, however, may lack support for protecting individual autonomy over personal data, given the limited recourse individuals have when going up against large, multinational companies. Additionally, existing data stewardship solutions may not facilitate cocreated and collaborative solutions for supporting data protection rights and online safety. To address this, I propose the creation of a data commons for data protection to encourage cocreating data protection solutions and to redistribute power from companies back to individuals and communities.

Insights

Data protection laws, technologies, and stewardship frameworks undo potential data harms, but generally focus on individualized solutions.
A commons attempts to redistribute power to increase the personal and social value of data while ensuring its quality and secure storage through collaboration.
A commons can help distribute the benefits of data as a resource widely and equitably, without commodifying or privatizing it.

In our data-driven society, personal data is increasingly, knowingly or unknowingly, being gathered about us (data subjects) by companies (data controllers) in an attempt to learn more about our behaviors and consumer preferences. Such data practices that encourage the vast collection of our data have led to increased data protection and safety-related challenges, amplified by surveillance networks and infrastructures [1].

With an increasing number of data breaches and privacy scandals, more people are becoming cautious about what information they put online, in an effort to take back control over their personal data and undo any data-related harms. To address these challenges, legal and technological tools have been developed to support new ways for individuals to control, manage, and protect their personal data and increase their online safety. For example, data protection regulations such as the European Union General Data Protection Regulation and the California Consumer Privacy Act aim to protect individuals and their data by supporting the exercise of their data subject rights. Industry-specific guidelines can also support practical considerations for companies when it comes to data protection and safety. ACM, for example, has its own Code of Ethics and Professional Conduct to guide the ethical conduct of computing professionals.

Privacy and data protection tools offer data subjects granular control over their data. These tools include Jumbo Privacy (a privacy and security assistant that protects users from online risks), Solid (a platform for linked data applications that are decentralized and under users' control rather than controlled by other entities), and the Data Transfer Project (a common framework with open-source code that connects two online service providers, enabling direct user-initiated data portability).

While the implementation of these laws, policies, and technologies represents a step in the right direction, it results in the responsibilization of data protection from data controllers to data subjects [2], where individuals, rather than data controllers, assume the burden of protecting their own personal data. Existing solutions rely on data subjects having a thorough understanding of both the law and the technological resources available for individual redress, often after data collection or processing. Furthermore, the focus on individual protections and safeguards disregards the power imbalance between users as data subjects and large corporations as data controllers: Do individuals know what rights they have? What can they do if a data controller isn't responsive to their data protection requests?

Even if such tools are effective, the fines that result from data breaches and privacy scandals may have little impact on big technology companies, given their large market capitalizations. For example, plaintiffs in the U.S. successfully argued in a class-action lawsuit that user privacy was violated in the Cambridge Analytica scandal. Meta agreed to settle the suit by paying $725 million to the 250 million to 280 million people affected. This amounts to only $2 to $3 per person, barely making a dent in Meta's $319 billion market cap.
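The arithmetic behind these figures can be checked with a short sketch. It uses only the numbers quoted above and assumes, purely for illustration, that the settlement is split evenly among the affected people with no deductions for fees; the variable names are not from the source.

```python
# Figures quoted in the paragraph above.
settlement_usd = 725_000_000
affected_low, affected_high = 250_000_000, 280_000_000
market_cap_usd = 319_000_000_000

# Assumption: an even split with no legal fees or other deductions.
per_person_high = settlement_usd / affected_low    # fewer people, larger share
per_person_low = settlement_usd / affected_high    # more people, smaller share

# Settlement as a fraction of the quoted market cap.
share_of_market_cap = settlement_usd / market_cap_usd

print(f"Per person: ${per_person_low:.2f} to ${per_person_high:.2f}")
print(f"Share of market cap: {share_of_market_cap:.2%}")
```

Under these assumptions the payout is a little under $3 per person, and the settlement is well below 1 percent of the quoted market cap.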

Ultimately, data subjects lack a meaningful voice when it comes to creating solutions that involve protecting their own personal data both ex ante and ex post (before and after the event, respectively), as there are few opportunities to help improve individual and collective privacy and safety outcomes by pooling knowledge, resources, and expertise with other people's.

The Role of Data Stewardship

More recently, to undo the potential harms caused by these data practices and address the power imbalances between data subjects and data controllers, data stewardship frameworks have attempted to provide data subjects with more agency over which of their personal data is used and how. Data stewardship refers to the process by which individuals or teams within data-holding organizations are empowered to proactively initiate, facilitate, and coordinate data in the public interest. Those responsible for data stewardship, known as data stewards, may facilitate collaboration to unlock the value of data, protect actors from harms caused by data sharing, and monitor users to ensure that their data use is appropriate and can generate data insights. Data stewardship frameworks may help mobilize data protection by introducing new avenues for data subjects to directly confront large companies' motivations and ability to extensively collect, process, and share their data for profit.


Types and forms of data stewardship are wide ranging, each with distinct goals, functions, and infrastructures [3]. Data trusts apply trust law to establish fiduciary duties related to data management with the aim of redistributing power. Data foundations aim to minimize the risks of personal data breaches and other noncompliant data-related activities by building data usage, sharing, and reuse environments. Data cooperatives involve legal cooperative registration where pooled (collective) data is managed by its members, advancing their collective interests alongside societal considerations.

Domain applications of data stewardship models are also extensive. For example, Driver's Seat is a data cooperative owned by rideshare and delivery drivers. Its app allows drivers to gain insights from their driving data with the aim of limiting power imbalances between drivers and platform companies in the gig economy. Another community-driven data stewardship approach is PescaData, a mobile application that small-scale fisheries in Mexico, Latin America, and the Caribbean can use to register as well as track their own fishing storage and expenses involved in fishing days. PescaData enables workers to offer their catch to local markets without intermediaries. These organizations and tools have allowed for greater agency and control of personal and nonpersonal data by their own communities, where such data has traditionally been in the hands of corporations that may not share community interests.

Data stewardship infrastructures are not without limitations, however. In the case of data trusts, questions of operational strategy remain with regard to how they are deployed, as examples have so far been tested only in theory, not in practice. Data subjects have limited rights in a data foundation compared with a trust, with limited opportunities for direct engagement. In data cooperatives, data subjects may not be able to act independently from the group given the cooperative's group aims, particularly where contract or incorporation to establish data-related rights may be difficult. Scale is also a challenge when it comes to implementing data stewardship.

Additionally, although there are current initiatives that aim to standardize and produce practical guidance on how these data stewardship mechanisms could be implemented, not all of these mechanisms are focused on data protection and safety. Rather, they may be focused on data sharing and increasing the value of data through privacy-preserving means for commercial and economic benefits, without consideration of supporting data subject recourse in cases of data breach or the manifestation of data protection harms. Crucially, these processes may not include data subjects in the iterative process of adopting, building, and deploying the framework to cocreate data protection solutions, and still result in the responsibilization of the data protection process.

In sum, existing data stewardship frameworks incorporate data protection considerations and support practices that improve online safety. However, data stewardship may not include data subjects or encourage collaboration in the process itself, resulting in limited emancipation from the existing data-driven infrastructures that continue to be incentivized to collect, process, and share individuals' personal data.

The Commons: Centering Individual and Group Collective Action

To address these data-related governance challenges, the commons, a framework that centers around individual and group collective action, trust, and cooperation, has been considered to limit the spillovers created by the reuse of data, thus increasing its value over time.

The commons, as developed by Elinor Ostrom in her key work Governing the Commons [4], guards a common-pool resource (CPR) in order to reduce its exploitation. A CPR is a resource system that is sufficiently large so as to make it costly to exclude potential beneficiaries from obtaining benefits.

Commons governance follows several key principles. Respecting the competitive relationships that may exist when managing a CPR, the commons depends on human activities where CPR management follows the norms and rules of the community autonomously. The CPR enables transparency, accountability, citizen participation, and management effectiveness, where each stakeholder has an equal interest in the commons. Crucially, governing the commons recognizes polycentricity, a complex form of governance with multiple centers of decision making, each of which operates with some degree of autonomy. Its success relies on stakeholders entering contractual and cooperative undertakings or having recourse to central mechanisms to resolve conflicts. The norms created by the commons are bottom-up, focusing on the needs and wants of the community and collectively discussing the best way to address any issues. Given these principles, the commons can encourage dialogue among data subjects, experts, policymakers, and additional citizens, creating new knowledge together for the common good with the aim of undoing harmful data practices.

To address the rise of distributed, digital information in our data-driven society, Ostrom and Charlotte Hess developed the information or knowledge commons, where knowledge is the CPR [5]. As new technologies enable the capture of information, the knowledge commons recognizes that information is no longer a free and open public good. Instead, it needs to be managed, monitored, and protected for archival sustainability and accessibility.

Crucially, the knowledge commons addresses data-related governance challenges that arise due to spillovers created by the reuse of data. The knowledge commons can increase the value of such data reuse where the data is linked together and shared, creating new uses and value for the same data. Without a commons, the newly generated knowledge may not be available to the original creators of the data in the first place. As a result, the knowledge commons can support data subjects in accessing the personal and social value of their data while ensuring its quality and secure storage.

Ostrom's commons framework has also been applied to data commons that guard data as a CPR. Research data commons such as the Australian Research Data Commons, the National Cancer Institute's Genomic Data Commons, and the European Open Science Cloud all attempt to further open-science and open-access initiatives.

Traditionally, such data commons focus on data distribution and sharing rather than data protection. Recent research, however, has explored how the commons can practically answer questions of data ownership, storage, use, privacy, and regulation [6]. For example, a commons can be useful for considering the intellectual property rights of mass-participation content creation on social networking sites and in pervasive computing, where it could support the use of collective intelligence and knowledge sharing to address systemic problems that threaten the sustainability of institutions and physical infrastructures. A commons can help distribute the benefits of data as a resource widely and equitably, without commodifying or privatizing it. The commons has also been considered for governing emerging technologies, as it can help mitigate individual and collective risk.

Reducing Risk and Increasing Data Subject Safety

So how can a commons be considered as a cocreated and collaborative solution to giving data subjects more agency over their personal data and increasing their online safety? When a commons is created and applied to a specific use case, such as choosing a data archiving service or social media platform that is privacy friendly, a data subject can specify to what extent they would like their data to be protected without prior knowledge of law, policies, or technical expertise. The data subject's suggested actions are automatically generated by the system based on their preferences and specifications. Data subject decisions may override existing preferences, policies, or standards set by stakeholders, and data subjects can review and update their outcome, add their experiences, and participate in cocreation. Data protection and privacy rights, such as rights related to accessing data, porting data, and objecting to automated decision making, can be individually and collectively exercised against data controllers.
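The flow described above, in which preferences go in, suggested actions come out, and the data subject has the final say, can be illustrated with a minimal sketch. Everything in it is hypothetical: the preference keys, the `DataSubject` class, and the functions `suggest_actions` and `collective_request` are invented for illustration and do not describe any existing commons implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

# Hypothetical mapping from plain-language preferences to the rights
# named in the text (access, portability, objecting to automated decisions).
SUGGESTED_RIGHTS = {
    "see_my_data": "right of access",
    "move_my_data": "right to data portability",
    "no_automated_decisions": "right to object to automated decision making",
}

@dataclass
class DataSubject:
    name: str
    preferences: Set[str]
    # Explicit choices by the data subject; these win over community defaults.
    overrides: Dict[str, bool] = field(default_factory=dict)

def suggest_actions(subject: DataSubject, community_defaults: Set[str]) -> List[str]:
    """Start from community defaults, add personal preferences, apply overrides."""
    wanted = {pref: True for pref in community_defaults}
    wanted.update({pref: True for pref in subject.preferences})
    wanted.update(subject.overrides)
    return sorted(
        SUGGESTED_RIGHTS[pref]
        for pref, enabled in wanted.items()
        if enabled and pref in SUGGESTED_RIGHTS
    )

def collective_request(subjects: List[DataSubject], community_defaults: Set[str]) -> Dict[str, List[str]]:
    """Group subjects by right so requests to a controller can be made together."""
    grouped: Dict[str, List[str]] = {}
    for subject in subjects:
        for right in suggest_actions(subject, community_defaults):
            grouped.setdefault(right, []).append(subject.name)
    return grouped

# Example: the community default suggests access requests; Bob opts out of that.
defaults = {"see_my_data"}
alice = DataSubject("alice", {"move_my_data"})
bob = DataSubject("bob", {"no_automated_decisions"}, overrides={"see_my_data": False})

alice_actions = suggest_actions(alice, defaults)
grouped_requests = collective_request([alice, bob], defaults)
print(alice_actions)
print(grouped_requests)
```

The point of the sketch is that the data subject states preferences in plain terms, never needs to name a legal right, and can still override whatever the community has set as a default.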


Ongoing research has tested the theoretical and practical aspects of creating a data commons in the context of data protection as well as other fundamental rights and values, where such opportunities to increase data subject agency are increasingly welcome and gaining traction among lawyers, computer scientists, data protection officers, and civil society [7].

Ongoing efforts by regulators, researchers, and activists to redistribute power away from data controllers and toward data subjects have made data and privacy protections easier to understand and implement, and these protections continue to improve for individuals and groups. These efforts have allowed us to set new privacy preferences in our social media settings, opt out of automated decision making, and reject tracking cookies when we browse the Web. Taking commons principles on board, here are a few steps you can take to undo current data malpractices:

  • If in doubt, opt out. With the rise of artificial intelligence and machine-learning technologies such as OpenAI's ChatGPT and Prisma Labs' Lensa, an increasing amount of the data that individuals put online is being scraped into the datasets used to train these systems. If you don't want your data to be included, search for ways you can opt out of the use of your personal data as well as optimization services.
  • Find out who your country or state data authority or ombudsman is. Beyond exercising your privacy rights with the companies that collect your data, you can seek additional information or escalate any issues with authorities that have regulatory oversight.
  • Make a Freedom of Information Act request. Public authorities in many countries are required to reveal information about how and what sorts of data are being collected and shared as well as how that data is being used.
  • Find your group. If you have concerns over your data and online safety, ask your friends, family, and colleagues. It is likely that they have similar concerns. SIGCHI as well as other digital crowdsourced resources are widely available, such as the Coronavirus Tech Handbook and A Comprehensive Guide to Tech Ethics and Zoom Class. Don't assume you have to do it alone.

Beyond regulatory and technological solutions, collaborative and multidisciplinary ones are necessary for data stewardship as big data and AI innovations continue to use more data to generate individual, social, and public value. This includes applying commons theories, principles, and practices to how we individually and collectively protect our personal data and online safety. While creating a commons is not a one-size-fits-all solution for solving privacy and safety issues, the framework represents an alternative sociotechnical solution that supports data subject agency to prevent and rectify data-related harms.

References

1. Ball, K. and Webster, W. Big Data and surveillance: hype, commercial logics and new intimate spheres. Big Data & Society 7, 1 (2020), 1–5.

2. Mahieu, R., Asghari, H., and van Eeten, H. Collectively exercising the right of access: Individual effort, societal effect. GigaNet (Global Internet Governance Academic Network) Annual Symposium 2017.

3. Ada Lovelace Institute. Exploring legal mechanisms for data stewardship, 2021.

4. Ostrom, E. Governing the Commons: The Evolution of Institutions for Collective Action. Cambridge Univ. Press, 1990.

5. Hess, C. and Ostrom, E., eds. Understanding Knowledge as a Commons: From Theory to Practice. MIT Press, 2007.

6. Bloom, G., Raymond, A., Tavernier, W., Siddarth, D., Motz, G., Dulong de Rosnay, M., and Ruhaak, A. A practical framework for applying Ostrom's principles to data commons governance, 2021.

7. Zygmuntowski, J.J., Zoboli, L., and Nemitz, P.F. Embedding European values in data governance: a case for public data commons. Internet Policy Review 10, 3 (2021).

Author

Janis Wong is a postdoctoral research associate researching data protection, ethics, and governance at the Alan Turing Institute, the U.K.'s national AI and data science institute. She holds a Ph.D. in computer science from the University of St Andrews and a bachelor of laws from the London School of Economics.

Copyright held by author. Publication rights licensed to ACM.
