Commit 5eac372 by NimaBoscarino
Parent(s): 066de29
Rename charters/multimodal-project.txt to charters/multimodal-project.md

charters/{multimodal-project.txt → multimodal-project.md} (RENAMED)

## Purpose of the ethical charter

It has been well documented that machine learning research and applications can potentially lead to "data privacy issues, algorithmic biases, automation risks and malicious uses" (NeurIPS 2021 ethics guidelines). The purpose of this short document is to formalize the ethical principles that we (the multimodal learning group at Hugging Face) adopt for the project we are pursuing. By defining these ethical principles at the beginning of the project, we make them core to our machine learning lifecycle.

This document is the result of discussions led by the multimodal learning group at Hugging Face (composed of machine learning researchers and engineers), with the contributions of multiple experts in ethics operationalization, data governance, and personal privacy.

## Limitations of this ethical charter

This document is a work in progress and reflects a state of reflection as of May 2022. There is no consensus on, nor official definition of, "ethical AI", and our considerations are very likely to change over time. In case of updates, we will reflect changes directly in this document while providing the rationale for changes and tracking the history of updates through GitHub. This document is not intended to be a source of truth about best practices for ethical AI. We believe that even though it is imperfect, thinking about the impact of our research, the potential harms we foresee, and the strategies we can take to mitigate these harms is a step in the right direction for the machine learning community. Throughout the project, we will document how we operationalize the values described in this document, along with the advantages and limitations we observe in the context of the project.

## Content policy

Studying current state-of-the-art multimodal systems, we foresee several misuses of the technologies we aim to build as part of this project. We provide guidelines on some of the use cases we ultimately want to prevent:

- Promotion of content and activities that are detrimental in nature, such as violence, harassment, bullying, harm, hate, and all forms of discrimination, including prejudice targeted at specific identity subpopulations based on gender, race, age, ability status, LGBTQA+ orientation, religion, education, socioeconomic status, and other sensitive categories (such as sexism/misogyny, casteism, racism, ableism, transphobia, homophobia).
- Violation of regulations, privacy, copyrights, human rights, cultural rights, fundamental rights, laws, and any other form of binding documents.
- Generating personally identifiable information.
- Generating false information without any accountability and/or with the purpose of harming or triggering others.
- Incautious usage of the model in high-risk domains - such as medicine, law, finance, and immigration - that can fundamentally damage people's lives.

## Values for the project

We note that some of the values below can sometimes be in conflict (for instance, being fair and sharing open and reproducible work, or respecting individuals' privacy and sharing datasets), and emphasize the need to consider the risks and benefits of our decisions on a case-by-case basis.

### Be transparent

We are transparent and open about our intent, sources of data, tools, and decisions. By being transparent, we expose the weak points of our work to the community, and can thus be held responsible and accountable.

### Share open and reproducible work

Openness touches on two aspects: the processes and the results. We believe it is good research practice to share precise descriptions of the data, tools, and experimental conditions. Research artifacts, including tools and model checkpoints, must be accessible - for use within the intended scope - to all without discrimination (e.g., religion, ethnicity, sexual orientation, gender, political orientation, age, ability). We define accessibility as ensuring that our research can be easily explained to an audience beyond the machine learning research community.
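
As a minimal sketch of how such sharing can carry the relevant context with it, assuming checkpoints are released on the Hugging Face Hub, the snippet below attaches machine-readable metadata (license, training datasets, tags) to a release with the `huggingface_hub` library; the dataset identifier and field values are hypothetical placeholders, not a real release:

```python
from huggingface_hub import ModelCardData

# Minimal sketch: machine-readable model-card metadata, so the license,
# training data sources, and tags travel with a released checkpoint.
# All concrete values are hypothetical placeholders, not a real release.
card_data = ModelCardData(
    license="apache-2.0",                     # explicit terms of use
    tags=["multimodal"],                      # aids discoverability
    datasets=["example-org/example-corpus"],  # hypothetical dataset id
)

# YAML front matter to place at the top of the repository's README.md.
print(card_data.to_yaml())
```

Embedding this metadata in the repository's README.md keeps the license and data provenance attached to the checkpoint wherever it is reused.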

### Be fair

We define fairness as the equal treatment of all human beings. Being fair implies monitoring and mitigating unwanted biases that are based on characteristics such as race, gender, disabilities, and sexual orientation. To limit negative outcomes as much as possible, especially outcomes that impact marginalized and vulnerable groups, reviews of unfair biases - such as racism for predictive policing algorithms - should be conducted on both the data and the model outputs.
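
As a minimal sketch of one quantitative check such a review could include, assuming evaluation results are available as (group, output label) pairs, the hypothetical `outcome_rates_by_group` helper below (with made-up records) compares how often each group receives a given output:

```python
from collections import Counter, defaultdict

def outcome_rates_by_group(records):
    """Per-group rate of each model output label."""
    counts = defaultdict(Counter)
    for group, label in records:
        counts[group][label] += 1
    return {
        group: {label: n / sum(labels.values()) for label, n in labels.items()}
        for group, labels in counts.items()
    }

# Made-up evaluation records, not real data: (group, model output label).
records = [
    ("group_a", "flagged"), ("group_a", "ok"), ("group_a", "ok"),
    ("group_b", "flagged"), ("group_b", "flagged"), ("group_b", "ok"),
]
rates = outcome_rates_by_group(records)

# Surface large gaps between groups for human review; a gap is a signal
# to investigate, not a verdict on its own.
gap = abs(rates["group_a"]["flagged"] - rates["group_b"]["flagged"])
print(rates, f"flagged-rate gap: {gap:.2f}")
```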

### Be self-critical

We are aware of our imperfections and should constantly look out for ways to better operationalize ethical values and other responsible AI decisions. For instance, this includes better strategies for curating and filtering training data. We should not overclaim or entertain spurious discourses and hype.
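
As one illustration, a first-pass curation step could look like the sketch below; the `keep_document` helper and its e-mail pattern are hypothetical simplifications, and a real pipeline would combine many such strategies (deduplication, PII scrubbing, quality scoring):

```python
import re

# Crude illustrative filter: drop training documents that contain
# e-mail-like strings, as one small step of PII-aware data curation.
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")

def keep_document(text: str) -> bool:
    """Return True if the document passes this single, crude filter."""
    return EMAIL_RE.search(text) is None

corpus = [
    "a caption describing a mountain landscape",
    "contact me at jane@example.com for the originals",  # would be dropped
]
filtered = [doc for doc in corpus if keep_document(doc)]
print(filtered)  # only the first document survives
```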

### Give credit

We should respect and acknowledge people's work through proper licensing and credit attribution.