The Santa Clara Principles

On Transparency and Accountability in Content Moderation



Santa Clara Principles 2.0 Open Consultation Report


Table of Contents

I. Executive Summary

II. The Original Santa Clara Principles

  1. Numbers
  2. Notice
  3. Appeal

III. Results of the 2020-21 Open Global Consultation

A. Overarching and Cross-Cutting Comments and Recommendations

  1. Broadening the definition of “Content Moderation” beyond takedowns and account suspensions
  2. Due Process Throughout The Content Decision-Making System
  3. Cultural Competence Throughout the System
  4. Special Consideration for the Use of Automation in Content Moderation
  5. Government Engagement with Content Moderation Processes
  6. A Scaled Set of Principles?

B. Numbers Principle

  1. Summary of Comments

    a. Government Influence on Content Moderation

    b. Transparency around Individual Content Moderation Decisions

    c. Appeals and Content Moderation Errors

    d. More Information About the use of Automated Tools

    e. Information That Will Identify Trends in Content Moderation Abuse and Discriminatory Practices

    f. Reporting Data to Assist in Identifying Systemic Trends and Effects

    g. Improving Transparency Reporting Practices, While Acknowledging Scale

    h. Numbers related to crisis and emergency-related data

  2. Reflections and Observations

  3. Recommendations

C. Notice Principle

  1. Summary of Comments

    a. Improving Notice Given to Actioned Users

    1. Notice about the Origin of the Flag
    2. Provide users adequate information to support appeal
    3. Increased Emphasis on Timeliness

    b. Expanding Those who Should Receive Notice

  2. Reflections and Observations

  3. Recommendations

D. Appeals Principle

  1. Summary of Comments

    a. Procedures to Ensure Due Process in Appeals

    1. Clarify the definition of “appeal”
    2. Clarify the scope of appeals
    3. Allow users to submit additional information
    4. Appeal procedures must be clear
    5. Procedures for expedited appeals
    6. Cultural competence in appeals
    7. Procedures need to show sensitivity and responsiveness to abuse of takedown schemes
    8. Clearly explain appeal results and consequences
    9. Need for external review

    b. Yield Data to Further Research and Systemic Accountability

  2. Reflections and Observations

  3. Recommendations

E. Advertising and the Santa Clara Principles

Reflections and Observations
Recommendations

Acknowledgements


I. Executive Summary

Throughout the past few years, as tech companies have taken on an increasingly complicated role in policing the world’s speech, the demand for greater due process and transparency in content moderation has grown. At the same time, and particularly during the pandemic, gains made in recent years on transparency and accountability measures have in some instances regressed.

The call for transparency from social media companies dates back more than a decade. Following Yahoo!’s handing over of user data to the Chinese government—an act that resulted in the imprisonment of the dissidents Wang Xiaoning and Shi Tao—collective outrage from digital rights activists and organizations led to the creation of the Global Network Initiative (GNI), a multi-stakeholder organization that seeks to hold tech companies accountable to a set of principles, one of which is public transparency.

This effort, along with others, resulted in the publication of the first transparency reports by major companies such as Google. It is because of these transparency reports that we know how many pieces of user data companies turn over to various governments, or how many government censorship demands have been imposed on platforms. Such transparency enabled users to make more informed decisions about which products to use and avoid, and empowered advocacy groups to insist that companies follow established legal processes when complying with such government demands.

Over time, however, it became clear that transparency around how companies respond to governments was insufficient; civil society began to demand that companies provide information about how they enforce their own policies and practices. The formation of Ranking Digital Rights—an organization which works to promote freedom of expression and privacy on the internet by creating global standards and incentives for companies to respect and protect users’ rights—and the launch of their Corporate Accountability Index in 2015 pushed the field forward, as did other efforts such as the Electronic Frontier Foundation’s Who Has Your Back?, a project that ranks companies based on their adherence to an annually updated set of transparency standards.

In 2018, a small convening of academics, lawyers, and activists culminated in the creation of the Santa Clara Principles on Transparency and Accountability in Content Moderation, a clear set of baseline principles designed to obtain meaningful transparency around internet platforms’ increasingly aggressive moderation of user-generated content. The principles comprise three categories: Numbers, Notice, and Appeal. Each category is then broken down into more detail.

The “Numbers” category asks companies to publish the numbers of posts removed and accounts permanently or temporarily suspended due to violations of their content guidelines. “Notice” indicates that companies should provide ample notice to users when their content is removed or their account suspended, with sufficient detail about the policy(ies) violated and information on how to appeal. Finally, “Appeal” asks that companies provide a meaningful opportunity for timely appeal, with a set of minimum standards that emphasizes human review.

In 2018, in the wake of several high-profile censorship events by Facebook, a coalition that included several of the original authors penned an open letter to Mark Zuckerberg demanding that the company institute an appeals process across all of its policy categories. That effort succeeded in part, opening a broader dialogue with Facebook and resulting in the expansion of their appeals processes to cover most areas of policy. In 2019, the Electronic Frontier Foundation’s “Who Has Your Back?” project was successful in obtaining endorsements of the principles by twelve major companies, including Apple, Facebook, Twitter, Reddit, and YouTube. Unfortunately, however, only one company—Reddit—has implemented the principles in full and the three largest platforms—Facebook, YouTube, and Twitter—were significantly lacking in their implementation.

At the same time—and particularly as many companies have taken an increasingly aggressive and often automated approach to content moderation, particularly in the Global South—the original group of authors heard feedback from many of our allies working on content moderation and digital rights around the world that the original principles missed certain key issues and could benefit from a review. As such, we undertook a lengthy deliberative process that involved more than fifteen organizations and an open call for comments with the goal of eventually expanding the Santa Clara Principles.

The pandemic brought with it a number of new complexities. In March 2020, as workplaces shut down and many countries went into lockdown, content moderators were by and large sent home and replaced with automated processes, many of which are buggy at best and deeply insufficient at worst. Many of the organizations working in our field have reported a great increase in the number of complaints they receive from users seeking redress and assistance in regaining their accounts. In April 2020, a global group of more than 40 organizations addressed the companies, urging them to ensure that removed content would be preserved, that decisions would be made transparent, and that information would, in the future, be shared with researchers and journalists.

At the same time, we are seeing an increase in demands for censorship, both by the public and governments, the latter of which in particular has resulted in what appears to be widespread silencing of dissent and opposition movements in various locales, including India, Palestine, the United States, and Colombia.

The pandemic has also directly affected our own work. Our plans to organize a convening alongside RightsCon2020 where we could receive in-person feedback from digital rights activists were dashed when travel was grounded. Furthermore, many organizations have struggled with capacity issues over the past year. Nevertheless, we received nearly thirty in-depth submissions from groups and individuals in roughly eighteen countries, from Brazil to Kenya to Canada, including several real-time group consultations conducted by partners in Africa, Latin America, India, and North America. These submissions were thoughtful, nuanced, and reflect a wide range of views on how transparency and accountability efforts by companies can be expanded to benefit a diverse range of users.

This report reflects those submissions and the trends observed across them, which are detailed in the sections that follow.

II. The Original Santa Clara Principles

Numbers

Companies should publish the numbers of posts removed and accounts permanently or temporarily suspended due to violations of their content guidelines.

At a minimum, this information should be broken down along each of these dimensions:

This data should be provided in a regular report, ideally quarterly, in an openly licensed, machine-readable format.
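
As an illustration only, and not part of the Principles themselves, the short Python sketch below shows one way such an openly licensed, machine-readable quarterly report might be structured; the service name, field names, and figures are all hypothetical.

```python
import json

# Hypothetical quarterly "Numbers" report: actioning counts broken down
# along dimensions such as the rule violated, the source of the flag,
# and the type of action taken. All values are invented for illustration.
report = {
    "service": "example-platform",
    "period": "2021-Q1",
    "license": "CC-BY-4.0",
    "actions": [
        {
            "action": "post_removal",
            "policy_violated": "hate_speech",
            "flag_source": "automated_tool",
            "count": 1204,
        },
        {
            "action": "account_suspension_temporary",
            "policy_violated": "spam",
            "flag_source": "user_flag",
            "count": 311,
        },
    ],
}

# Publish in a machine-readable format (JSON here) for researchers to reuse.
with open("transparency_report_2021-Q1.json", "w") as fh:
    json.dump(report, fh, indent=2)
```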

Notice

Companies should provide notice to each user whose content is taken down or account is suspended about the reason for the removal or suspension.

In general, companies should provide detailed guidance to the community about what content is prohibited, including examples of permissible and impermissible content and the guidelines used by reviewers. Companies should also provide an explanation of how automated detection is used across each category of content. When providing a user with notice about why her post has been removed or an account has been suspended, a minimum level of detail for an adequate notice includes:

Notices should be available in a durable form that is accessible even if a user’s account is suspended or terminated. Users who flag content should also be presented with a log of content they have reported and the outcomes of moderation processes.

Appeal

Companies should provide a meaningful opportunity for timely appeal of any content removal or account suspension.

Minimum standards for a meaningful appeal include:

In the long term, independent external review processes may also be an important component for users to be able to seek redress.

III. Results of the 2020-21 Open Global Consultation

A. Overarching and Cross-Cutting Comments and Recommendations

The open consultation revealed several overarching and cross-cutting comments and recommendations.

1. Broadening the definition of “Content Moderation” beyond takedowns and account suspensions

The initial version of the Santa Clara Principles focused on a limited set of moderation decisions: the removal of posts and the suspension of accounts, both temporary and permanent.

But the taxonomy of content moderation is full of numerous other types of “actioning,” the term used by platforms to describe a range of enforcement actions including removal, algorithmic downranking, and more. Commenters were asked whether the Principles should be extended to a range of other decisions.

The overwhelming majority of respondents supported extending the Principles to other types of moderation decisions, both intermediate moderation actions such as downranking, as well as AI-based content recommendation and auto-complete. Some respondents reasoned that users should have greater control over what is shown to them; others believed that if certain content is recommended, other pieces of content are less likely to be seen, raising the risk of discrimination as to who sees what content. Still others argued, however, that content promotion/downranking was sufficiently different from content removal to require a different approach, and even those who favored including such actions did not have specific recommendations for what companies should be required to do. Several respondents highlighted the importance of defining the term “intermediate restrictions” so, at a minimum, it is clear to what actions the Santa Clara Principles apply.

Recommendations
  • As revised, the new principles apply broadly to negative actions taken by companies in response to either the content of users’ speech or the users’ identity. The revised principles apply to all “actioning” of user content by companies, defined as any form of enforcement action taken by a company with respect to a user’s content or account due to non-compliance with their rules and policies, including (but not limited to) the removal of content, algorithmic downranking of content, and the suspension (whether temporary or permanent) of accounts.
  • We also explored extending the Santa Clara Principles to autocomplete and recommendations systems. However, we decided not to include auto-complete and recommendations systems within these Principles since to do so seemed unworkable given how frequent such actions are and the varying contexts in which they occur.
2. Due Process Throughout The Content Decision-Making System

Although such expectations were implicit, the first version of the Santa Clara Principles did not specify steps that companies engaged in content moderation should take before and while making the initial moderation decision, as opposed to the appeal of that decision. Many of the comments received during the consultation were directed at that first level of moderation, with commenters wanting greater assurances of transparency and due process throughout the entire moderation process. Indeed, much of the desire for transparency around the first-level decision was founded in a lack of trust that those decisions were being made fairly, consistently, and with respect for human rights. While transparency is important to evaluating adherence to human rights values, it is not itself a substitute for them.

Several comments asked for more detail and clarity around a platform’s content policies: although most companies publish their content policies online, these rules are not always comprehensible. Some companies do not have clearly defined and named policies and categories, which makes it difficult to understand them. In addition, policies often make use of vague and broadly interpreted terms, such as “offensive,” raising concerns around overbroad removals. Further, companies do not clearly outline how they implement and enforce their content policies (including which tools they use to moderate content), making it difficult to understand the full spectrum of their content moderation efforts and hold them accountable. Another issue raised relates to remedy measures that platforms should take when reversing content moderation decisions following a successful appeal (e.g. by providing greater content visibility and reviewing their policies).

Commenters also seek greater assurances that both human and automated moderators are sufficiently trained and have sufficient expertise. Commenters wanted to know whether human moderators are employees, contractors, or outsourced workers; how they are supervised; how their performance is assessed so that quality decision-making is optimized; and what their working conditions are.

Specific suggestions included the following:

Recommendations
  • As revised, the new principles include a foundational principle of Human Rights and Due Process that emphasizes the need for due process and respect for human rights throughout the entire content moderation process and an Integrity and Explainability Principle that requires that all content moderation processes have integrity and be administered fairly.
3. Cultural Competence Throughout the System

Perhaps the most consistently identified need among all of the comments received was a need for greater cultural competence—knowledge and understanding of local language, culture and contexts—throughout the content moderation system, with the commenters correctly identifying it as an essential component of due process.

Cultural competence generally requires that those making first level moderation and appeal decisions understand the language, culture, and social context of the posts they are moderating. Commenters thus seek both guidelines for companies and greater transparency around the demographic background of human moderators, their geographical distribution, and the language proficiency of moderators to ensure that such competency is in place.

A few commenters recognized that the collection and disclosure of personal data about moderators raises considerable privacy concerns.

Specific comments included the following:

Recommendations
  • As revised, the new principles establish understanding of local cultures and contexts as a foundational principle that must be considered and aimed for throughout the operational principles. Specific steps are also included within each operational principle.
4. Special Consideration for the Use of Automation in Content Moderation

Internet platforms increasingly rely on automated tools powered by artificial intelligence and machine learning to enhance and manage the scale of their content moderation processes. Companies use algorithmic curation mechanisms, such as downranking and recommending, to moderate content that violates or partially violates their rules. Companies use content filters to scan content before it is published to the service—in some cases preventing the upload of content that violates certain rules.

Although these tools can make it easier for companies to moderate and curate content at great scale, they have also resulted in numerous concerning outcomes, including overbroad takedowns, deamplification of content, and other moderation measures. Automated moderation tools are generally limited in that they are unable to assess context or make subjective decisions; as a result, they cannot make accurate decisions when flagging or moderating content that lacks clearly defined parameters or that requires nuance and specific cultural or linguistic understanding. This has had an outsized impact on certain communities; for instance, the use of automation in moderating violent extremist content has arguably resulted in the widespread erasure of evidence of human rights violations in countries such as Syria.

We asked stakeholders their opinions on expanding the Santa Clara Principles to specifically include guidelines related to the use of automated tools and AI in detecting and removing posts, for the purposes of ranking content, and for content recommendation and auto-complete.

Commenters strongly supported amending the Santa Clara Principles to account for the ever-increasing use of automated systems in the moderation process, as well as for content ranking.

One commenter proposed a new principle specifically aimed at the use of automated decision-making tools.

Although normative guidelines regarding whether automated systems should ever be used without any human moderation are appropriate, most of the suggestions received focused on transparency—companies should disclose when they use automated systems and how accurate those systems have proven to be. Specific suggestions regarding transparency reporting will be addressed below in the section on the Numbers Principle.

Recommendations
  • As revised, the new principles have specific provisions regarding the use of automated systems and transparency about that use, as set forth below.
  • As revised, the new principles require that companies have high confidence in all systems used, including human review, automated systems, and all combinations of them. Users should have the ability to seek human review of many moderation actions.
5. Government Engagement with Content Moderation Processes

Government actors typically engage with the content moderation process in several ways: by insisting that community standards and terms of service reflect local law; by submitting legal requests to restrict content based on local law; by flagging legal content that violates the platform’s rules through mechanisms designed for end-users; or by utilizing extra-legal backchannels such as Internet Referral Units (IRUs). While companies have long included details about government takedown requests in their transparency reporting, the other practices are often conducted without any transparency.

Respondents strongly agreed that governmental involvement in a platform’s content moderation processes raises special human rights concerns, and that those concerns should be specifically addressed in any revision to the Santa Clara Principles. Respondents suggested that platforms provide more transparency generally on measures that they take in response to government demands, including agreements that may be in place, more information about government pressure or requests, and whether a content moderation action was required by a state regulation. Stakeholders also noted, however, the potential benefits of cooperation between companies and government experts in areas such as election administration and public health.

Recommendations
  • As revised, the new principles include a foundational principle addressing state involvement in content moderation that addresses this particular concern.
  • As revised, the new principles require specific reporting of numbers regarding governmental involvement, and additional notice to users, as set forth in the Notice section below.
  • As revised, the new principles include directions to states, namely that they regularly report their involvement in content moderation decisions, including specific referrals or requests for actioning of content, and that they not exploit or manipulate companies’ content moderation systems to censor dissenters, political opponents, social movements, or any person. The principles also affirm states’ obligations to respect freedom of expression. States must recognize and minimize their roles in obstructing transparency by the companies.
6. A Scaled Set of Principles?

As we sorted through the comments, which largely suggested numerous additional reporting and process requirements, we were mindful of a familiar tension: by increasing content moderation reporting, process, and appeal requirements, we risk making them extremely difficult to meet for all but the current resource-rich, market-dominant intermediaries that uniquely have the resources to fulfill such requirements. Such requirements may then discourage innovation and competition, thus entrenching the existing dominant companies. On the other hand, we recognized that even newer and smaller intermediaries should incorporate human-rights-by-design as they roll out new services. Any new set of principles must be careful not to generate standards that are impossible for small and medium-sized enterprises (SMEs) to meet compared to large companies. Yet newer and smaller services must understand and plan for compliance with the standards as they scale up, and not wait until they control the speech of millions of users. This concern runs through almost all of the suggested revisions.

Commenters frequently noted the challenge of scaling the Santa Clara Principles so that they were relevant to companies of varying sizes and resources. One participant in the Americas consultation explained that in addition to the issues of scale, strict reporting requirements often prevent companies from publishing data that outlines insights that are unique and most relevant to their type of service.

The original Santa Clara Principles set minimum standards and we are mindful that raising them might create unscalable burdens for smaller and new companies that do not have the staff and other resources to provide numbers, notice, and appeals procedures at the same level as the currently dominant platforms. We are also mindful that if we are to undertake scaling, the metrics themselves are problematic. What are the correct measures—Number of users? Capitalization? Geographic reach of service? Extent of moderation? Maturity of service?

We were also mindful that a fixed set of specific requirements may prove overly rigid given how quickly content moderation practices change and the ecosystem evolves.

We considered several ways to approach this problem, such as: tiering the principles such that new requirements were triggered when certain benchmarks were met; keeping the principles as baseline guidelines and providing guidance on how those baselines should be scaled up; and considering whether proportionality could be built into the principles in a way that would provide adequate guidance to companies and adequate protections to users.

Ultimately, we believe that the Santa Clara Principles must establish standards against which to evaluate a company’s practices—not minimum standards that everyone must meet, but a mean of sufficient practices. Some companies will and should be able to do more; some will be unable to meet them all. What any one company should do will depend on many factors – age, user base, capitalization, focus, and more—and will evolve over time.

Recommendations
  • As revised, the new principles establish standards that emphasize the purpose of each principle, which must be front of mind regardless of scale. That is, the principles should emphasize the goals of numbers, notice, and appeals, and how each specific practice furthers those goals. The revised principles also include more robust requirements to be adopted as an online service provider matures. They are supplemented with a toolkit that provides more specific guidance for companies to consider as they plan for growth, so that such measures are adopted as companies mature, not after they do so.

The following sections report the range of comments received, and largely do not include an evaluation of the merits or an endorsement of any particular comment.

B. Numbers Principle

1. Summary of Comments

During the consultation process, the Santa Clara Principles coalition solicited feedback on a range of questions which sought to understand if, and how, the Santa Clara Principles should be amended to include broader and more granular requirements for the “Numbers” section. There were plentiful suggestions, and they are set forth below. The countervailing concerns for privacy and competition are addressed in the Recommendations section.

The feedback the coalition received fell into eight categories: government influence on content moderation; transparency around individual content moderation decisions; appeals and content moderation errors; more information about the use of automated tools; information to help identify trends in content moderation abuse and discriminatory practices; data to help identify systemic trends and effects; improving transparency reporting practices; and numbers related to crisis and emergency data.

a. Government Influence on Content Moderation

The original Santa Clara Principles are minimum standards for transparency and accountability around companies’ enforcement of their own Terms of Service. A number of stakeholders expressed a strong need for the revised Principles to directly confront the troubling role state and state-sponsored actors play in shaping companies’ content moderation policies and practices, through takedown requests and other actions. Currently, some internet platforms publish separate transparency reports outlining the scope and scale of the government requests for content removals they receive. However, the granularity and consistency of this data varies from region to region, and many stakeholders noted that government cooperation is not always clearly disclosed in these reports. Stakeholders stated that companies should be required to explicitly state any form of cooperation they have with governments and indicate when decisions were required by national and local laws.

Some of the metrics and data points suggested under this category include aggregate data on:

This data should be broken down by the legal reasoning/basis used to justify requests and information on which government agency submitted the requests, including the existence of any court orders.

b. Transparency around Individual Content Moderation Decisions

Numerous stakeholders outlined that although several internet platforms currently publish aggregate information on the scope and scale of their content moderation efforts, there is still a fundamental lack of transparency and accountability around individual content moderation decisions. As a result, these stakeholders recommended that the Principles be amended to encourage companies to publish metrics such as:

c. Appeals and Content Moderation Errors

Many submissions underscored a desire for platforms to publish more data on appeals and proactive recognition of content moderation errors in order to paint a more granular picture of the integrity and efficacy of a platform’s content moderation operations. Some of the data points suggested include:

d. More Information About the use of Automated Tools

Many submissions touched on the need for internet platforms to provide greater quantitative transparency around the role automated tools play in their content moderation efforts, among other reasons, to raise awareness among users of the use of AI in content moderation.

At the same time, some respondents also noted that while we need more transparency around automated tools, we should not frame their use as inherently bad or problematic, and should recognize that some use of automation for the purposes of reviewing content was inevitable and sometimes necessary.

Numerous stakeholders recommended that companies publish data outlining the scope and scale of automated moderation efforts, and inform users which specific categories of content or rules violations are targeted by automated systems. They recommended that companies publish data on the amount of content that has been filtered pre-publication, and how much content was prevented from being uploaded. They also recommended that companies preserve all filtered content so researchers can access it. In particular, they want to see transparency around when automated processes are used for content moderation purposes and for what types of content; the criteria used by the automated process for making decisions; and the confidence/accuracy/success rates of the processes, including changes over time. In practice, much of this transparency would need to be provided through qualitative information rather than quantitative data, although some transparency around the latter (e.g. accuracy rates over time) might be possible. It would also be important for any transparency requirements to recognize that humans are often also involved in content moderation that involves automated processes.

With respect to ranking, respondents simply wanted to know whether ranking (whether downranking or promotion) is used at all and, if so, for what purposes. In practice, most if not all of this transparency would need to be provided through qualitative information rather than quantitative data. A distinction would need to be made between downranking that occurs due to the content itself (e.g. for coming close to a breach of the content moderation policy) and downranking/promotion as a result of data related to the user.
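
To make the idea of reporting accuracy rates over time concrete, the following minimal Python sketch (an illustration under assumed data, not drawn from any company’s actual practice) computes the share of automated flags that were later upheld on human review, per quarter; the log entries and field names are hypothetical.

```python
from collections import defaultdict

# Hypothetical log of automated flags and the eventual human review outcome
# (upheld_by_human is True when the reviewer agreed with the automated flag).
flag_log = [
    {"quarter": "2021-Q1", "upheld_by_human": True},
    {"quarter": "2021-Q1", "upheld_by_human": False},
    {"quarter": "2021-Q2", "upheld_by_human": True},
    {"quarter": "2021-Q2", "upheld_by_human": True},
]

def accuracy_by_quarter(log):
    """Return, per quarter, the fraction of automated flags upheld by humans."""
    upheld = defaultdict(int)
    total = defaultdict(int)
    for entry in log:
        total[entry["quarter"]] += 1
        upheld[entry["quarter"]] += int(entry["upheld_by_human"])
    return {quarter: upheld[quarter] / total[quarter] for quarter in total}

print(accuracy_by_quarter(flag_log))  # e.g. {'2021-Q1': 0.5, '2021-Q2': 1.0}
```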

Some of the metrics that were recommended for inclusion in the revised Principles include:

e. Information That Will Identify Trends in Content Moderation Abuse and Discriminatory Practices

As noted above, many stakeholders voiced concerns that marginalized and minority groups are often disproportionately affected by content moderation harms. Abusive flagging and removal of these communities’ content was a particular concern. In order to identify and understand these trends, stakeholders recommended that companies publish data that will enable researchers and civil society groups to identify patterns in which users and communities are being targeted (including through abusive flagging), and by whom, and to suggest ways platforms could modify their practices to address them. This data includes:

Further, several consultations and submissions outlined the need for data points to be adapted based on geography in order to elucidate certain patterns in abuse of the content moderation system. For example, commenters suggested that companies publish the content policy that a flag asserts a piece of content violated and adapt these metrics based on the region. In the United States, categories focused on hateful content, discriminatory content, and gender-based violence should be adapted to include a breakdown of data based on race. In other regions, such as India, the categories should be adapted to allow a focus on caste. One stakeholder noted that these kinds of localized data points are important and that the Principles “should encourage disclosures of information that is not just directly relevant for users but also beneficial in [the] larger public interest of general policymaking.”

Some of these suggestions, as well as suggestions below, reveal a tension between the benefits of analyzing personal information about the users of these services and the significant privacy concerns related to the collection, retention, and further use of such information. The revised Principles must be sensitive to not encourage the otherwise unwarranted collection of user data, and must steadfastly avoid urging online services to collect user data as a condition to providing services. However, when a company nevertheless persists in the collection of personal user data, aggregate reporting of that data can be helpful to identify discriminatory trends in content moderation.

f. Reporting Data to Assist in Identifying Systemic Trends and Effects

Many stakeholders expressed concerns that flaws and errors in the content moderation system often disproportionately affect already vulnerable and marginalized communities, and that moderators and systems administrators often lack cultural competence. As a result, they suggested that platforms publish contextual data that can promote understanding of who was affected by a platform’s content moderation efforts, when they were affected, where these individuals are based, and why and how they were affected. The data points suggested by contributors who focused on this theme are broad, but generally advocated for data on:

Facebook specifically noted information that it considered should not be included in transparency reporting, namely the quantity of content removed by a specific mechanism of the overall content moderation system (i.e., a specific type of technology or means of review), unless such data addresses a question not capable of being answered by any other metric. Facebook also noted that its content review process is conducted by a combination of people and automation that complement each other, so a piece of content could be flagged by automated tools and then removed by a human reviewer, and so on. In these circumstances, it would be misleading to suggest that this content was solely moderated by humans or by an automated process. GitHub recommended that any principles developed focus on content moderation and not include other decisions about content display and distribution.

g. Improving Transparency Reporting Practices, While Acknowledging Scale

Several stakeholders also put forth recommendations on how company transparency reports can be improved in order to provide greater visibility into and accountability around corporate content moderation processes and to generate meaningful transparency. At least one submission suggested that companies should publish transparency reports on a more frequent basis; currently, many platforms publish reports on a quarterly basis. Others suggested that companies provide more information about the AI technologies used in content moderation, including the type of content involved (e.g., video, image, text) and further detail on how human intervention interacts with automated decisions. Commenters also noted the relevance of providing more information about which content moderation decisions involve human intervention and the steps, procedures, and companies involved in the review process.

Many stakeholders underscored the need for the Principles to encourage and reflect the elements above, adding that the Principles would be most useful if they focused on end results instead of breaking down data into overly granular pieces. A participant in the Americas consultation opined that the Principles place too much emphasis on quantitative transparency and not enough on qualitative transparency, as companies often use numbers to justify how well they are doing in terms of content moderation, but the qualitative points provide important context.

h. Numbers related to crisis and emergency-related data

Companies should create a separate category when reporting numbers to outline data related to content removals and restrictions made during crisis periods, such as during the COVID-19 pandemic and periods of violent conflict.

2. Reflections and Observations

The Santa Clara Principles were initially drafted to provide a minimum set of standards that companies should meet in order to provide adequate transparency and accountability around their content moderation efforts. The metrics included in the Numbers section were similarly intended to provide a baseline for what a good transparency report on Terms of Service enforcement would look like. These metrics were released in anticipation of the first-ever industry Terms of Service enforcement reports, which were published by YouTube, Facebook, and later Twitter in 2018. One key question we grappled with was whether the new set of Principles should similarly provide a baseline of standards that companies should meet, whether they should outline a more advanced tier of requirements, or something else. The former would likely allow the coalition to conduct more advocacy with smaller platforms and services that do not currently publish Terms of Service enforcement reports. The latter would put pressure on companies to demonstrate more transparency and accountability in response to changes in the content moderation landscape that have occurred over the past three years.

The metrics and data points noted above all reflect a desire for companies to provide more transparency and accountability around how they design and implement their content moderation systems and what impact these policies and practices have.

However, there are at least three countervailing concerns to more detailed transparency reporting.

First, as some stakeholders recommended, the reporting requirements in the Numbers section may need to be proportional to a platform’s level of content moderation, its risk profile, its size, its age, its revenue, and its geographic spread. Additional expertise may be needed to fully comprehend the implications of such a decision.

Second, some of the metrics suggested would require the unnecessary and undesirable collection of users’ personally identifiable information (PII). For example, many platforms assert that they do not currently collect demographic data such as race or other potentially sensitive information such as location data. Reporting metrics such as these would require platforms to begin collecting this data. In addition, some of the metrics suggested by stakeholders would require the unlawful disclosure of PII. These include metrics that require platforms to identify a user’s or moderator’s demographic information. Alternative methods, such as aggregate or qualitative reporting, that obtain the information proposed by stakeholders without requiring PII collection or disclosure would have to be identified. Asking internet platforms to provide more information about how the PII they collect factors into content moderation decisions may be useful, since many platforms assert that all PII they collect is essential for their operations.

Third, some stakeholders expressed concerns around pushing companies to report on metrics related to timelines for content removal. Over the past several years, several governments have complained that platforms have not been removing content quickly enough, and some have subsequently introduced legislation requiring companies to moderate content along a specific timeline or face penalties. While we recognize the importance of quick action and encourage platforms to moderate content on a timely basis, requiring companies to act along a specific timeline could result in overbroad removals, raising significant free expression concerns. Because of this, the group chose not to include timeline-related metrics in the redrafted Principles.

3. Recommendations

  • As revised, the new Numbers Principle requires that companies report data that reveals how much actioning is the result of a request or demand by a state actor, unless such disclosure is prohibited by law. Companies should also report the basis for such requests or demands—whether a violation of law or of the terms of service and, if so, which specific provisions—and identify the state actor making the request. Companies should also report the number of requests for actioning made by state actors that the company did not act on.
  • As revised, the new principles also acknowledge the role states play in supporting and promoting transparency in content moderation. States should remove the barriers to transparency that they erect, including refraining from banning companies from disclosing state requests and demands for actioning.
  • As revised, the new principles require that intermediaries report the number of appeals the platform received from users, the number of/percentage of successful appeals that resulted in posts or accounts being reinstated, and the number of/percentage of unsuccessful appeals.
  • As revised, the new principles require that companies report the number of posts the company reinstated after proactively recognizing they had been erroneously removed, and the number of accounts the company reinstated after proactively recognizing they had been erroneously removed.
  • As revised, the new principles require greater transparency around the use of AI and automated processes in content moderation, including information both about a company’s confidence in its automated tools generally and disclosure of all tools used in any specific actioning.
  • As revised, the new principles include a special emphasis on transparency around flagging. The revised Numbers Principle encourages reporting data that will allow users and researchers to assess the frequency of flagging abuse and the measures a company takes to prevent it. Specific metrics and/or qualitative reporting could be devised to help identify abuse-related trends in particular regional contexts, including total number of flagged posts; total number of flagged posts removed; total number of posts flagged by users; the volume of flags received by region; and the number of heavily flagged posts reported to authorities. Companies are encouraged to assess and adapt metrics to regional/local contexts in terms of providing useful information to identify trends in content moderation abuse and discriminatory practices.
  • As revised, the new principles require companies to break down the disclosed data by country and by language, as well as by content policy violated. Companies should provide similar granularity of data across different countries, so as to achieve a greater global consistency in transparency reporting, but also be mindful of specificities of the local context.
  • As revised, the new principles require companies to provide information about the geographical/country and language distribution of content moderators.

C. Notice Principle

1. Summary of Comments

The comments received focused mainly on two areas:

  1. Expanding what is in the notice provided to users who post actioned content, and
  2. Expanding who receives notice
a. Improving Notice Given to Actioned Users
1. Notice about the Origin of the Flag

The original version of the Santa Clara Principles already recommends that notices indicate whether action against the user was taken as a result of a flag and whether the flag originated with a government. However, recognizing competing data privacy concerns, it does not require platforms to identify non-governmental flaggers by name. Throughout the consultation responses, participants repeatedly emphasized the importance of receiving more information about the source of all flags; this was particularly noted by respondents outside the US and EU. But commenters generally agreed that the Notice principle should not require a company to reveal private information about other users.

Regarding notices about law-enforcement or government flags, commenters recommended including specific information about court orders and other legal bases and authority provided to the company by the officials, even if the purported authority for the removal was the company’s own Terms of Service.

However, some respondents raised concerns about overly complex notices providing less useful or overwhelming information to users.

2. Provide users adequate information to support appeal

Several commenters suggested that the principle specify the minimum amount of information the companies should provide to users about the takedown so as to enable a meaningful appeal. This includes screenshots and URLs related to the content violation.

3. Increased Emphasis on Timeliness

Commenters also suggested that the Notice principle emphasize the importance of prompt and timely notices to enable users to seek meaningful redress. Commenters explained that providers should publicly articulate timelines they intend to meet for providing notice and consider giving pre-action notice in some cases. And users should receive more urgent notifications for more significant post/account restrictions.

GitHub suggested that the principle also acknowledge that there are times when it is not appropriate to provide notifications and describe a set of guidelines for when a provider can or should decline to provide more detail, such as spam, botnets or legal prohibition on notice itself. As GitHub explained, providing a detailed response in these situations can be counterproductive, particularly because these users may not otherwise use their accounts again.

Also, the Notice principle could specify how to provide notifications to users, with specific recommendations that notifications be sent by at least two different methods (e.g. in-app + by email).

b. Expanding Those who Should Receive Notice

Several commenters suggested that the principle require that users other than the actioned user receive notice or at least some other type of public-facing communications, beyond the notifications provided to users when their content is restricted.

2. Reflections and Observations

As with other principles, the recommended additions to the Notice Principle raised concerns around the asymmetric application of the new principles and the need to avoid generating standards that are impossible for small and medium-sized enterprises (SMEs) to meet compared to large companies. As noted above, the new principles are thus no longer framed as minimum standards but rather as mean standards against which any company’s practices may be compared.

3. Recommendations

  • As revised, the new principles require that the Notice be in the language of the original post, or at a minimum in the user interface language selected by the user.
  • As revised, the new principles include further recommendations about the information that should be provided to users when a state actor is involved with the flagging and removal of the user’s post or account.
  • As revised, the new principles specify the full range of actioning for which notice is required.
  • As revised, the new principles specify that companies should provide information about the removal of a post or account at the content’s original URL (tombstoning).
  • As revised, the new principles require that notice be timely and that users be notified of any time limitations on appeals procedures.
  • As revised, the new principles specify that any exceptions to the Notice Principle, for example when the content amounts to spam, phishing or malware, be clearly set out in the company’s rules and policies.
  • As revised, the new principles require notice to users other than the author of the post, including group administrators, flaggers, and in some circumstances, the public.

D. Appeals Principle

1. Summary of Comments

The comments we received generally sought to advance one of two goals: (1) to ensure due process in the manner in which appeals are considered, and (2) to yield reportable data for research and systemic accountability. And as with the comments on the other principles, many concerns were raised regarding cultural competency.

a. Procedures to Ensure Due Process in Appeals
1. Clarify the definition of “appeal”

We received several comments about the need to further define what is meant by an “appeal.” Does the principle require a formal process? Should it require a panel rather than a single decision-maker? Should the appeal be heard by a certain authority?

Several comments sought clarification of the phrase “not involved in the initial decision,” with one commenter suggesting that the principle specify that the appeal be heard by people “not in the same chain of command” as the original decision-makers.

2. Clarify the scope of appeals

Several comments pertained to expanding or narrowing the scope of appeals.

With respect to broadening the scope of appeals, one commenter suggested that appeals be available to those users who request that abusive content be removed, but whose request is denied, and to users who flag content for other reasons but where no action is taken. Others suggested that decisions other than takedowns and account suspensions be appealable. These included downranking and shadowbanning.

“We additionally recommend that the principle be expanded to include the need for an appeals process for those who report abusive content but are informed that it does not violate terms of service - or receive no response at all. In these cases, we recommend that a second review can be requested, that the reporting party be provided with additional information regarding how the case was evaluated, and that the reporting party be given the opportunity to provide context on the content they have reported.” - PEN America (US)

With respect to narrowing appeals, GitHub, a platform that has to make such decisions, suggested that where a user is contesting the basis of a decision, a platform not be required to consider appeals unless the user provides additional information regarding the facts of their dispute.

GitHub also suggested that the principle allow for exceptions for removals or suspensions of spam, phishing, or malware in certain circumstances, such as where there is large-scale, apparent malicious activity or intent.

Others agreed that some takedown decisions deserved greater appellate consideration than others. For example, a more robust appeal mechanism may be required for removals based on editorial and other subjective judgments, in contrast to those based on the application of reasonably objective rules.

3. Allow users to submit additional information

Several commenters requested that the principle specify information that users should be able to submit to support their appeal. This should include information about takedowns and suspensions from other websites, similar patterns of content posting, and any evidence the user has regarding discriminatory targeting.

4. Appeal procedures must be clear

Several commenters suggested that the principle require that the appeal procedures be clearly set out and easy to understand, with one commenter suggesting that the process be illustrated with visuals.

Other commenters focused on the need for specific information regarding appeal timing and deadlines—the time available to file an appeal and the time a user should expect for the appeal to be resolved—and mechanisms to allow users to track the progress of appeals. And users should be promptly notified of the decision on the appeal and the consequences and finality of that decision.

5. Procedures for expedited appeals

Some commenters noted that in cases of crisis and urgency, an appeal may be truly timely only if it is expedited and considered on an emergency or as-soon-as-possible basis. The principle should thus require that users be able to request such expedited consideration.

6. Cultural competence in appeals

Numerous commenters would require the principle to ensure cultural competence in appeals processes. Many would define “meaningful opportunity for appeal” to include reviewers with language fluency and cultural and regional knowledge of the context of the post, that appeal procedures be available in the language of the users, and that assistance be available during business hours in the user’s time zone. Responses to appeals should be issued in the language of the original post.

7. Procedures need to show sensitivity and responsiveness to abuse of takedown schemes

Several commenters suggested that the principle require that appeal procedures show special sensitivity to users who have been subject to abusive targeting, or whose content removals are part of a broader scheme of online abuse.

8. Clearly explain appeal results and consequences

Several commenters also suggested that the principle require the platforms to clearly explain the results of the appeal and any available remedies. One commenter suggested that platforms allow users the opportunity to revise a removed post to address noted community standards or terms of service violations. Another suggested that platforms make remedies available to address damages caused by reversed takedowns or wrongly rejected takedown requests.

9. Need for external review

Some commenters suggested that external review be required in certain cases, or at least to assess the efficacy of internal review processes. This was a special concern for oversight of automated decision-making. An external reviewer or ombudsperson might be able to detect larger trends in how rules are enforced, rather than seeing each moderation decision in isolation from the platform’s numerous other decisions. The Facebook Oversight Board was given as an example.

b. Yield Data to Further Research and Systemic Accountability

The second category of comments addressed the need for appeals to yield data that facilitates research and external oversight of the platform and the larger online information ecosystem. This data will allow researchers to assess the accuracy of content moderation decisions and the effectiveness of appeals. It includes data on both successful and unsuccessful appeals.

Several commenters suggested that the principle require platforms to report data regarding the results of the appeals; for instance, how many were successful and how many were unsuccessful, as part of larger data collection to assess overall efficacy of content moderation systems.

“The number of successful appeals. This would serve as an indication of the mechanisms that are currently being employed for flagging content and provide information on the areas upon which they can be improved.

“The number of unsuccessful appeals. This data would indicate the level of certainty and familiarity that users have with regards to the rules used on online platforms.

“The average speed of detection of inappropriate content. This would enable various companies to determine whether they are detecting flaggable content in a timely manner and therefore preventing its spread on their platforms.

“The amount of times inappropriate content was shared and/or viewed before being flagged and taken down. Recording such data would allow companies to establish the means of the spreading of flagged content as well as to determine the damage and harm that is caused by the spreading of said content.

“The number of repeat offences. This would enable various companies to determine the perception of the rules and regulations concerning content moderation among offenders and help them model them in a manner that would deter future violations.” - Lawyers Hub (Kenya)

Commenters also suggested that platforms provide information that will assist future users in developing their appeals. This would include information about what a successful appeal depended on: the quality of the submission, executive intervention, and so on.

2. Reflections and Observations

Many commenters found Appeals to be the most complete component of the existing Santa Clara Principles, and there were far fewer comments on this principle than on Numbers and Notice.

Over time, platforms have expanded the ways in which they moderate content beyond removals and account suspensions. Enforcement actions now range from the application of labels to downranking to demonetization. The ability to appeal these newer types of moderation has not kept pace. In some situations, such as demonetization or the application of warning labels, expanding appeals may be relatively straightforward. In others, such as downranking or shadowbanning, users may not even know that such measures were applied. Platform ranking algorithms make a number of decisions that affect the placement of particular pieces of content; these decisions may be fundamentally different from decisions made through a traditional flagging process.

3. Recommendations

  • As revised, the new principles include a specific reference to the importance of cultural competence among staff conducting the review, and require that appeals be considered in the user’s language by those with cultural understanding of the post’s context.
  • As revised, the new principles specify that users be able to submit additional information in support of their appeal.
  • As revised, the new principles specify the full range of actioning that should be appealable, and consideration should be given as to whether certain types of actions require more or less rigorous appeals processes.
  • As revised, the new principles require companies to provide a meaningful opportunity for timely appeal of decisions to remove content, to keep up content that had been flagged, to suspend an account, or to take any other type of action affecting users’ human rights, including the right to freedom of expression.
  • As revised, the new principles reflect the importance of providing users with information about their access to any independent review processes that may be available, and confirm that any such independent review process should also embrace the Santa Clara Principles and provide regular transparency reporting and clear information to users about the status of their appeal and the rationale for any decision.
  • As revised, the new principles direct companies to consider whether targets of abusive takedown schemes should, by default, be entitled to expedited appeals in certain situations.
  • As revised, the new principles require that appeal procedures be clear, complete, and understandable, including specifying all deadlines. Companies should also develop systems for users to track the progress of appeals.
  • As revised, the new principles require the periodic reporting of appeals data, whether to the public or to researchers.

E. Advertising and the Santa Clara Principles

We asked respondents whether the Santa Clara Principles should have special provisions for how advertisements are served to users, and whether advertisements should otherwise be incorporated into the Principles.

The overwhelming majority of respondents supported the proposal to extend the principles to include recommendations around AI-based advertisement targeting and delivery. On the question of whether the principles should apply to the general moderation of advertisements, around half of respondents simply answered “yes,” suggesting that the Santa Clara Principles should apply in full to the moderation of advertisements. The other half suggested that while there should be greater transparency over the moderation of advertising, the existing Santa Clara Principles would need adaptation. This was justified on the basis that different principles and considerations apply to how platforms develop policies for, and moderate, commercial and paid-for advertising content as opposed to user-generated content.

Commenters put forward a number of suggestions as to what types of information should be required for the moderation of advertising generally, including:

Commenters did not have many specific suggestions about what transparency should look like when it came to AI-based targeting and delivery of advertisements; the two most common themes were the need for information about the types of data collected from users for advertising purposes, and for information on how users are categorised or segmented for advertising purposes.

Further suggestions for information made by a smaller number of respondents were:

Facebook specifically suggested that any transparency relating to advertising (or “paid content”) through the Santa Clara Principles should be framed to take account of the special characteristics of the advertising ecosystem. Participants in the Latin America virtual consultation highlighted that companies should aim for greater global consistency in their transparency initiatives, although platforms’ advertising policies in each country may be affected by local law.

Reflections and Observations

  • The overwhelming majority of respondents wanted to see more transparency around the use of AI and automated processes in advertising targeting. Here, the most common requests were for the Santa Clara Principles to include requirements for information on the types of data that are collected from users for advertising purposes, and on how users are categorised or segmented for advertising purposes.
  • The overwhelming majority of participants also wanted to see more transparency around the moderation of advertisements. Many respondents noted that there were differences between paid content and unpaid user-generated content which required different considerations. Despite this, around half of the respondents in support of greater transparency felt that the existing Santa Clara Principles were sufficient, while the other half felt that they needed adaptation. While political advertising was seen by many as a particularly important focus, recommendations on transparency relating to advertising generally included requiring more information on how advertisements are clearly labelled and identified as distinct from other pieces of content; the criteria by which advertisements are targeted; how much money was paid for each advertisement; and the source of payment for advertisements, especially political advertisements.

Recommendations

Acknowledgements

Thank you to all of the organizations and individuals who submitted comments, participated in the group consultations, and reviewed and commented on preliminary work. Organizations submitting comments include the following: 7amleh, Association for Progressive Communications, Centre for Internet & Society, Facebook/Meta, Fundación Acceso, GitHub, Institute for Research on Internet and Society, InternetLab, Laboratório de Políticas Públicas e Internet (LAPIN), Lawyers Hub, Montreal AI Ethics Institute, PEN America, Point of View, Public Knowledge, Taiwan Association for Human Rights, The Dialogue, Usuarios Digitales. The list of individuals and groups who coordinated and hosted consultations, and otherwise contributed to the process, includes, but is not limited to: ALT Advisory, UNESCO, Eduardo Celeste, Derechos Digitales, Ivar A.M. Hartmann, Amélie Heldt, Tomiwa Ilori, Julian Jaursch, Clara Iglesias Keller, Paddy Leerssen, Martin J. Riedl, Christian Strippel.