Photo Source: © Peter Dazeley via Getty Images

The Perils of Loosening Hate Speech Protections

18 July 2025

Commentary

As the international community marks the 20th anniversary of the Responsibility to Protect (R2P), a global political commitment to prevent and protect populations from genocide, war crimes, crimes against humanity and ethnic cleansing, one of the most urgent threats to that promise is unfolding online. At a time of increasing global instability, some of the world’s most powerful social media and tech companies are retreating from content moderation. This retreat comes as civil society, some governments, legal institutions, regional organizations and the UN have been working – often against the tide – to build guidelines and policy around the risks of unfettered discourse. By abandoning these responsibilities, these companies are not just eroding progress and global norms – they are putting lives at risk. As digital platforms scale back content moderation in the name of free speech, they can become accelerants for identity-based violence.

The decisions made by Meta earlier this year to walk back content moderation in favor of “free speech” is a prime example. On 7 January Meta, the parent company of Facebook, Instagram and WhatsApp, announced a major rollback of content moderation policies, beginning in the United States (US). The changes include replacing its fact-checking program, loosening hate speech standards and resisting what the company calls “dangerous censorship.” While framed as a defense of free expression, the policy shift effectively allows more inflammatory content, including speech targeting marginalized communities. The policies were originally implemented in response to criticism that its platforms were fueling global misinformation and violence. In a post titled “More Speech, Fewer Mistakes,” Meta’s Chief Global Affairs Officer, Joel Kaplan, defended the move, saying, “Meta’s platforms are built to be places where people can express themselves freely. That can be messy. On platforms where billions of people can have a voice, all the good, bad and ugly is on display. But that’s free expression.”

This follows a broader industry shift. After Elon Musk’s 2022 takeover of Twitter (now known as X), hate speech, including racist, homophobic and transphobic slurs, surged on the platform by 50 percent, according to a University of California, Berkeley study of 4.7 million English-language posts. Meta’s changes risk normalizing this deregulated approach across the tech sector, with potentially devastating global consequences.

Artificial intelligence (AI) is also rapidly transforming content creation, powering everything from news generation to social media posts, and also risks contributing to the promotion of dangerous and derogatory speech. In just one example, X’s AI chatbot, Grok, has reportedly surfaced and promoted antisemitic conspiracy theories and hate speech, raising serious concerns about the platform’s content moderation and the ethical safeguards guiding its AI development.

As digital platforms play an increasing role in shaping social dynamics and public discourse, relaxing content moderation and other online safeguards is not just a tech policy shift – it’s a challenge to the global commitment to R2P. In environments marked by ethno-political tensions, the idea that “more speech” on deregulated platforms is always the answer falls dangerously short. Unchecked digital content can, and has, put lives at risk by amplifying incitement to violence and/or eroding social cohesion. Ethiopia and Myanmar (Burma) serve as stark reminders of how digital spaces played a role in inciting hatred that preceded atrocities.

When social media becomes a tool for targeting communities and coordinating violence, digital deregulation doesn’t just weaken content policy, it erodes the norms and infrastructure designed to prevent atrocities. As digital technologies evolve, so too must our understanding of how R2P applies in an era where hate can go viral in a matter of minutes and translate into real world violence. The international community cannot afford to ignore lessons we have already learned and allow digital spaces to be blind spots in atrocity prevention.

Ethiopia

The two-year war in northern Ethiopia between the Tigrayan Defense Forces and allied groups against the federal Ethiopian National Defense Forces and their Eritrean and regional allies was marked by deep ethnic polarization and atrocities.

Estimates suggest up to 600,000 people were killed and at least a million remain displaced, though the true toll remains unclear. The UN and human rights organizations documented ethnic-based killings, sexual violence and forced displacement, likely amounting to war crimes and crimes against humanity by all sides. In Western Tigray, the systematic targeting of the Tigrayan population may amount to ethnic cleansing. Despite the signing of a cessation of hostilities agreement in November 2022, tensions remain elevated.

Throughout the crisis, prominent figures in Ethiopia, including Prime Minister Abiy Ahmed and other politicians, publicly denigrated populations from specific ethnic groups and politically manipulated the feelings these comments evoked to incite further violence and divisions. Digital platforms were widely used to spread inflammatory content, with many spreading hate speech and disinformation targeting Tigrayans via Facebook. The consequences were deadly. In one example, Meareg Amare Abrha, a Tigrayan professor, was shot and killed after Facebook posts falsely accused him of massacres and revealed his photo and location.

Facebook’s response was wholly inadequate. Overall, moderation was lacking – particularly in Amharic and Oromo, languages spoken by more than half the country. In one rare instance of enforcement, Facebook removed a post by Prime Minister Abiy for hate speech. In 2021 Facebook whistleblower Frances Haugen revealed the company knew its platform was being used to incite violence in Ethiopia, yet failed to invest in content moderation in local languages. Meareg’s son, Abraham, alleges his family repeatedly warned Facebook of the threats to his father, but no action was taken. Today, Abraham and Amnesty International’s former Ethiopia researcher, Fisseha Tekle, are suing Facebook in Kenya. They seek almost $2 billion dollars for a restitution fund for victims of hate speech and violence incited through Facebook.

Myanmar

Starting in August 2017, during the so-called “clearance operations,” security forces in Myanmar carried out systematic killings, torture, sexual violence and arson in Rakhine State, destroying villages, killing thousands and forcing over 720,000 Rohingya to flee to Bangladesh. A UN Fact-Finding Mission (FFM) concluded that senior members of the military, including the junta’s current leader, General Min Aung Hlaing, should be prosecuted for genocide against the Rohingya and for crimes against humanity and war crimes in Kachin, Rakhine and Shan states.

In the years leading up to the 2017 Rohingya genocide, hate speech and anti-Rohingya propaganda spread widely online. Military-linked actors and radical Buddhist nationalist groups flooded online spaces with anti-Muslim propaganda, including incitement to violence against the Rohingya. The result was an ecosystem ripe for the commission of atrocities; hate speech was not only tolerated but normalized, amplified and weaponized. This digital content stoked nation-wide hate and hostility, helping pave the way for the clearance operations that many across the country openly supported. In 2018 the FFM found that Facebook played a “determining role” in fueling atrocities perpetrated during the genocide.

In 2018, following a self-commissioned report, Facebook acknowledged it failed to prevent its platform from being used to “foment division and incite offline violence.” Alex Warofka, a Facebook Product Policy Manager went further, stating, “We agree that we can and should do more,” highlighting that the company would invest resources in addressing the abuse on its platform.

Facebook has been banned in Myanmar since the military coup in February 2021, but not because of its role in fomenting hate. After the coup, the platform became a vital tool for populations to communicate and organize resistance, triggering the junta to block access while continuing to weaponize it to post disinformation. Despite the ban, social media remains a tool of repression. On 13 March 2023 a group of 16 UN experts released a statement highlighting the responsibility of social media companies in protecting populations, particularly in post-coup Myanmar. The experts warned, “Online rhetoric has spilled into real world terror, with military supporters using social media to harass and incite violence against pro-democracy activists and human rights defenders. … Failing to cement its grip on power by locking up political prisoners and gunning down peaceful protesters, the junta has escalated its ruthless suppression of dissent to virtual spaces.”

Multiple civil society organizations and Rohingya groups continue to pursue redress for the atrocities they’ve endured, including calling for reparations and through the filing of universal jurisdiction cases. Rohingya activists have called on Meta to provide reparations – specifically a $1 million dollar education program in the refugee camps in Bangladesh that are currently home to one million Rohingya. Meta – a $1.88 trillion dollar company – has denied the request. The International Court of Justice and other bodies are actively examining the role of social media in the Rohingya crisis, raising serious legal and ethical questions about platform accountability.

While these two case studies focus on situations where Facebook bears some implied culpability, other platforms and mass communication channels have also served as vehicles for spreading hate speech and stoking violence. In Kenya, for example, unaddressed hate speech helped fuel widespread protests and mass violence, including crimes against humanity, following the 2007 presidential election results. In the months leading up to the election, the government failed to respond to critical warning signs, particularly rampant ethnic-based hate speech and incitement spread via SMS text messages. In the years that followed, the Kenyan government, with support from regional actors and the international community, invested in stronger regulation of hate speech across public and SMS platforms, contributing to more peaceful subsequent election cycles.

Another more recent case underscores how digital evidence is reshaping pathways to accountability. The International Criminal Court (ICC) is currently reviewing a confidential legal report providing evidence that the Russia-linked Wagner Group has promoted videos and images of atrocities in Mali and Burkina Faso. These posts – shared on X and Telegram – featured uniformed men committing grave abuses, often accompanied by mocking or dehumanizing language. While some of the graphic content has since been removed, other material remains accessible, including behind paywalls on Telegram. The report urges the ICC to investigate crimes “committed through the internet,” arguing they are “inexplicably linked to the physical crimes and add a new dimension of harm to an extended group of victims.” This marks the first known instance where the act of circulating images online is itself argued to constitute a war crime. European courts have already set legal precedent by prosecuting the war crime of outrages on personal dignity based largely on social media evidence.

Content Moderation and R2P

Each of these cases demonstrates the potential value of hate speech regulation and content moderation in volatile contexts. Various UN frameworks guide states, platforms and civil society to prevent and respond to hate speech while protecting free expression. The 2019 UN Strategy and Plan of Action on Hate Speech was the first system-wide framework to tackle hate speech, urging states and tech companies to adopt strong detection, monitoring and counter-speech policies aligned with human rights. It also calls for investments in media literacy, early warning systems and civil society support.

Building on this, the UN Special Adviser on the Prevention of Genocide released in 2023 a guide – “Countering and Addressing Online Hate Speech” – offering practical recommendations for policymakers and practitioners. It emphasizes the need for coordinated action: states should work with platforms, media, educators and communities to curb hate speech, pass rights-based laws, strengthen moderation in local languages and ensure platforms are transparent and accountable.

Both documents stress that no single actor, whether governments, tech companies or civil society, can tackle this threat alone. While social media companies must be accountable for moderating their platforms, as evidenced by both the crises in Ethiopia and Myanmar, they cannot be left to self-regulate. States must also establish clear policy guidelines that enforce accountability and uphold international standards.

The Global Centre for the Responsibility to Protect has published on both the protection possibilities and overwhelming dangers posed by digital technologies. Key takeaways include:

Algorithms Matter. Social media algorithms are designed to maximize engagement, but inflammatory and sensationalist content tends to spread fastest, creating fertile ground for hate speech, incitement and conspiracy theories. This increases the risk of real-world violence.
Unequal Safeguards. Protections and moderation are often weakest in places at higher risk of atrocities, especially where moderators with expertise in local languages is under-resourced. Many platforms do not invest sufficiently in content moderation staff or in fragile contexts.
Opportunities for Accountability. Digital technologies can strengthen accountability by preserving evidence, crowdsourcing documentation and supporting investigations by independent bodies, courts and civil society organizations.

The rollback of content moderation policies on social media platforms is not just a localized or technical issue – it carries global implications. Under-resourced moderation in local languages is not a passive oversight but a direct risk factor for violence. No society is immune to atrocities. Digital tools are double-edged swords: while they can mobilize awareness and prevention, they can just as easily be weaponized to incite and organize violence and deepen division. As such, responsible digital governance must be central to atrocity prevention strategies – just as atrocity prevention and human rights protections should be embedded in digital governance policy.

Content moderation policies on digital platforms have been increasingly tied to states’ obligations under the R2P framework. Under R2P, states have a duty to protect their populations from atrocities and support others in upholding that obligation. This includes preventing conditions that allow hate speech and incitement to flourish. A world where tech companies operate without oversight or accountability mechanisms undermines the essence of R2P.

Meanwhile, the cost of content moderation is also felt by those on the frontlines of digital harm. Across several African states, moderators decry low pay, sweatshop-like conditions and psychological trauma, including developing post-traumatic stress disorder, from the content they work to remove. Social media companies must ensure those protecting their users from incitement and hate speech are given adequate working conditions to do their crucial jobs.

States must take the lead to prevent and respond to digital harms that may lead to mass atrocities. This includes regulating tech companies, strengthening policy frameworks and supporting media literacy. While private companies are not bound by international law like states are, their platforms shape global discourse and influence political and social behavior in deeply consequential ways. Weak content moderation not only undermines human rights but also undercuts the moral imperatives embedded in R2P.

And the impact is not confined to any one country. Deregulation in the US impacts populations around the globe. Diaspora communities – often active participants in shaping narratives about their countries of origin – can be both a source of solidarity and a channel for incitement.

The international community must evolve its understanding of atrocity prevention to meet this moment and treat digital governance as a frontline issue.

Recommendations

As we mark the 20th anniversary of R2P, it is imperative that we immerse atrocity prevention into the realities of the digital age. Governments, tech companies, international institutions and civil society organizations must work together to create digital spaces that uphold international law and protect human dignity. Just as states have a responsibility to protect their populations, tech companies have a duty to safeguard their users and civilians more broadly. Unchecked hate speech on social media is not merely a risk, it is a proven driver of atrocity crimes. By recommitting to robust content moderation and global accountability, we can ensure that the digital sphere does not undermine, but rather supports our shared responsibility to protect through the following recommended actions:

1. Enhanced Content Moderation: Platforms must reinvest in content moderation infrastructure and continue to expand in local languages and regions with a high risk of violence. Companies should ensure atrocity prevention and human rights experts are integrated in these policymaking and training discussions. This should include both artificial intelligence (AI) tools and human moderators. As the influence and use of AI grows, it’s essential that these systems are trained not just for efficiency, but to actively prevent the spread of hate speech, incitement and misinformation that can contribute to atrocities and human rights abuses.
2. Transparency and Accountability: Companies should publicly disclose their moderation practices and cooperate with international investigations. Transparency about how decisions are made and how harmful content is addressed is imperative.
3. Collaboration with Governments and Civil Society: Tech companies must work with stakeholders to design frameworks and corporate policies that balance free expression with atrocity prevention and human rights obligations as set out by the UN Guiding Principles on Business and Human Rights. Governments must enact laws to hold companies accountable for facilitating hate speech and incitement.
4. Policy Reforms and International Cooperation: Expand upon the existing UN frameworks and establish global norms and regulatory standards for digital platforms to ensure ethical and consistent moderation practices across jurisdictions.
5. Community Engagement with Local Civil Society: Empower local civil society and communities to identify and counteract harmful content. This grassroots approach enhances early warning and resilience and can more swiftly identify gaps and overlooked risks.
6. Digital Literacy and Education: States should commit to launching initiatives to promote media literacy and critical thinking, especially among young users, to build long-term resistance to manipulation and disinformation.
7. Creation of a Reparations Fund: Tech companies, particularly those who have been directly implicated in abuses and incitement, must utilize their immense annual profits to give back via corporate social responsibility or other targeted programs to help rebuild the communities that have been destroyed by the company’s failures.

No one should face life-threatening risks simply for posting updates or reading the news on Facebook or X, or while engaging on WhatsApp or Telegram. As these companies continue to amass immense wealth and influence, they must be held accountable for prioritizing profit over the safety of millions. Until platforms treat digital incitement with the same urgency as physical threats, atrocity prevention will remain dangerously incomplete.

Sarah Hunter, Senior Research and Advocacy Officer - Communications Specialist