Introduction
As an Undergraduate Data Science Intern at DSE and a member of the Indigenous Environmental Stewardship team, I have learned extensively about the role and potential of technology, particularly data, in transforming environmental efforts. The growth of AI and machine learning presents numerous opportunities to improve efficiency and optimize workflows, so that more human effort can go into developing and designing solutions rather than into repetitive and tedious data cleaning or classification. AI can also improve how we monitor biodiversity, predict changes in ecosystems, and allocate resources for sustainability efforts, because it can compare Big Data efficiently.
However, as society grows more technologically dependent, there is greater concern over the protection and privacy of data. It has become increasingly difficult for census data to be accurately provided to the public without risking the security that anonymity is meant to provide from hackers and cyber threats. In addition, inappropriate use of AI has fostered colonialism over data, with many case studies involving Indigenous communities. To work with data means to be ethically aware of the impacts of collaboration, transparent and shared data, and the inclusion of marginalized communities in environmental stewardship through data.
This blog is meant to share vocabulary that I have learned over the course of my involvement in the field, and I will continue to add to this list as I learn more. Many of these terms are commonly used but rarely properly defined, in part because definitions can vary with the scope of the topic. My goal is to provide fundamental knowledge of terminology related to data sovereignty to help interested audiences better understand the significance of ethical work with data. The following definitions are based on my own understanding, so please note that they may not fully apply to every scenario. I am always open to learning and discussing more, so please feel free to reach out to me with your own thoughts and resources!
General Terms
AI Alignment: ensuring that AI acts in line with the user’s intended goals and purposes, and with human values more broadly. This is important for AI ethics and for ensuring that AI is used in a safe and reliable manner.
- AI Misalignment: when AI behavior does not reflect the user’s intentions. Some examples are AI discarding important data, making inaccurate predictions, or giving biased responses.
Big Data: large, diverse, and complex data. Think about card scanners at metro stations or social media algorithms. Big data is typically defined by the 3 V’s:
- Volume: the amount of data
- Velocity: the speed of receiving and processing data
- Variety: the different types of data
Data Ethics: moral obligations when collecting, using, or sharing data. Some example questions that may be asked are:
- Will people’s personal information remain anonymous if this data is released to the public?
- Do we have permission to use this data?
- Which communities might be impacted by the use of this data?
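To make the first question more concrete, here is a minimal Python sketch of one way to check whether a table could point back to individual people before it is released. It counts how many records share each combination of indirectly identifying fields; a combination held by only one person is easy to trace. The field names, sample records, and threshold are all hypothetical, and this kind of check (often called k-anonymity) is only one possible safeguard, not a guarantee of privacy.

```python
from collections import Counter

def smallest_group_size(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the
    same values for the given fields (hypothetical indirect identifiers)."""
    groups = Counter(
        tuple(record[field] for field in quasi_identifiers)
        for record in records
    )
    return min(groups.values())

# Made-up sample records, for illustration only.
records = [
    {"zip_code": "94720", "age_range": "20-29", "species_reported": "oak"},
    {"zip_code": "94720", "age_range": "20-29", "species_reported": "pine"},
    {"zip_code": "94704", "age_range": "60-69", "species_reported": "oak"},
]

k = smallest_group_size(records, ["zip_code", "age_range"])

# Assumed threshold: a group of fewer than 2 people is too identifying.
safe_to_release = k >= 2
print(k, safe_to_release)
```

In this sample, one person is the only record with their zip code and age range, so the check flags the table as unsafe to release as-is. A real review would also need to answer the other two questions, which code alone cannot do.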
Data Localization: storing and using data within the geographical boundaries of where it is originally from.
Data Sovereignty: in simple terms, an individual’s or group’s right to have full control over their own data; this involves collecting, maintaining, protecting, interpreting, and sharing of the data. This term is especially key in understanding Indigenous Rights.
Indigenous Stewardship
CARE Principles: defined in 2020 for Indigenous Data Governance.
- Collective Benefit: data facilitates equitable benefit, innovation, and inclusion.
- Authority to Control: Indigenous communities have full control over their data and are empowered.
- Responsibility: those who work with Indigenous data should be accountable for showing how the data supports and respects Indigenous communities.
- Ethics: prioritize Indigenous rights and well-being at all stages of using data.
Data Decolonization: the process of removing Western ideas and influences in data practices, particularly for Indigenous communities to have full ownership and control over their own data. It is important to acknowledge and understand historical influences of colonialism in order to determine the best practices for decolonization.
FAIR Principles: defined in 2016 in Scientific Data.
- Findable: data is uniquely and persistently identifiable.
- Accessible: data can be obtained by machines and humans.
- Interoperable: ability for data to be used by different systems while producing the same results.
- Reusable: data is useful for other potential users (with correct permissions and licensing).
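As a rough illustration, the four principles can be read as a checklist for a dataset's metadata. The Python sketch below uses a hypothetical metadata record with made-up field names (`identifier`, `access_url`, `format`, `license`). Real repositories define their own schemas, and FAIR covers much more than these four fields, so treat this as a simplified picture rather than an official test.

```python
# Hypothetical mapping from each FAIR principle to one metadata field
# that offers minimal evidence for it. The field names are assumptions.
FAIR_FIELDS = {
    "Findable": "identifier",     # a unique, persistent ID such as a DOI
    "Accessible": "access_url",   # where humans or machines can obtain it
    "Interoperable": "format",    # a format that other systems can read
    "Reusable": "license",        # terms under which others may use it
}

def fair_gaps(metadata):
    """Return the FAIR principles whose field is missing or empty."""
    return [
        principle
        for principle, field in FAIR_FIELDS.items()
        if not metadata.get(field)
    ]

# Made-up dataset record; the identifier and URL are placeholders.
dataset = {
    "identifier": "doi:10.0000/example",
    "access_url": "https://example.org/data.csv",
    "format": "text/csv",
    "license": "",
}

print(fair_gaps(dataset))
```

Here the record has no license, so "Reusable" is reported as a gap. Passing a checklist like this says nothing about whether the community the data comes from has consented to its use, which is where the CARE Principles come in.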
Indigenous Sovereignty: the right of Indigenous communities to manage their own affairs, and for any decision involving Indigenous data to be made with their participation and consent.
Landback: the movement by Indigenous people to reclaim their ancestral lands.
Traditional Ecological Knowledge (TEK): knowledge and practices about sustainability, environmentalism, restoration, kinship, and the connection between humans and nature that have been passed down for generations by Indigenous communities.
- Also known as Indigenous and Local Knowledge (ILK) and Indigenous Traditional Knowledge (ITK).