38,470 Subscribers

Q3 2024 Safety & Security Report: Election Recap and Renaming our Content Policy

Hi redditors,

As we begin 2025, we’re back with another Quarterly Safety & Security Report. This time, to report on how the US election went on Reddit and share the renaming of our Content Policy as “Reddit Rules”. But first, the Q3 numbers.

Q3 By The Numbers

Category	Volume (April - June 2024)	Volume (July-Sept 2024)
Reports for content manipulation	440,694	591,315
Admin content removals for content manipulation	25,062,571	25,785,092
Admin-imposed account sanctions for content manipulation	4,908,636	2,903,258
Admin-imposed subreddit sanctions for content manipulation	194,079	181,663
Reports for abuse	2,797,958	2,815,991
Admin content removals for abuse	639,986	616,443
Admin-imposed account sanctions for abuse	445,919	454,835
Admin-imposed subreddit sanctions for abuse	2,498	1,409
Reports for ban evasion	15,167	14,555
Admin-imposed account sanctions for ban evasion	273,511	186,739
Protective account security actions	2,159,886	1,190,348

Elections Recap

TL;DR: We saw no significant content interference related to the election, though we did see a temporary increase in abuse (as well as a corresponding increase in admin enforcement against abuse) in the days following the election.

The U.S. held general elections on Tuesday, November 5th. As noted in earlier posts, Reddit’s Policy, Safety, and Community teams were prepared and monitoring to ensure the integrity and safety of our platform.

Content manipulation and foreign interference

Content manipulation, which includes things like inauthentic content, manipulated content presented to mislead, and foreign interference, is against our policies. In the weeks leading to and immediately following the election, our teams were on high-alert for violating content, but ultimately found no significant malicious activity on our platform:

We conducted 65 in-depth investigations, and only one piece of content with minimal visibility was confirmed to be connected to a foreign actor.
We reviewed almost 3,000 accounts for potential political manipulation, with only 0.7% warranting actioning.

Abuse and harassment

On the day of and following the election, we observed a very short increase in abuse, including hate and harassment. We saw a corresponding increase in both admin enforcement actions and community level removals, aided by our community tools — our Safety Filters usage by mods peaked, flagging 200,000 pieces of content in communities in which they were enabled.

Community engagement

Early in the year, we held a roundtable with mods from across the political spectrum to hear their perspective on modding through elections. Their top priorities – inauthentic content and hateful content – aligned with ours. We also worked with communities throughout the US election cycle to ensure mods had the necessary resources and could escalate investigations to our teams.

We reached out to over 2,100 communities with resources, education, and reminders about site policy. Only 6 communities received Moderator Code of Conduct violations.
Our Mod Tip Line resulted in the identification of a few political spammers (not connected to foreign actors) that were actioned.

2024 was a big year for elections. As always, our focus during these significant world events — and every day — is fairly and consistently upholding sitewide rules when we review content and enforce our policies, and ensuring the integrity of our platform so that people of all political persuasions can learn, engage, and debate on Reddit. We’re able to do this thanks to the perspectives and participation of our internal teams and community partners — so thank you!

New Year, New Name: Reddit Rules

Last quarter we refreshed the name of this subreddit from r/redditsecurity to r/redditsafety to make it easier for people to know what to expect in this subreddit.

In a similar vein, today we’re renaming Reddit’s Content Policy to “Reddit Rules” (longtime redditors might remember that “rules” was actually the original name.). The name “Reddit Rules” better reflects that our policies govern both content AND behavior on Reddit. This is just a name change and doesn’t affect the content of the rules themselves.

We've already implemented the new name across a number of surfaces, though we expect that it will take some time to update all mentions, so please bear with us. We also know that many communities have descriptions or rules referencing the old name – Content Policy – and understand it may take mods some time to update. We set up an automatic redirect of the old link to the new page so things don't break as this change rolls out.

Happy new year to the entire Reddit community!

77 Comments

2025/01/13
20:14 UTC

Reddit Transparency Report: Jan-Jun 2024

Hello, redditors!

Today we published our Transparency Report for the first half of 2024, which shares data and insights about our content moderation and legal requests from January through June 2024.

Reddit’s biannual Transparency Reports provide insights and metrics about content moderation on Reddit, including content that was removed as a result of automated tooling and accounts that were suspended. It also includes legal requests we received from governments, law enforcement agencies, and third parties around the world to remove content or disclose user data.

Some key highlights include:

~5.3B pieces of content were shared on Reddit (incl. posts, comments, PMs & chats)
Mods and admins removed just over 3% of the total content created (1.6% by mods and 1.5% by admins)
Over 71% of the content removed by mods was done through automated tooling, such as Automod.
As usual, spam accounted for the majority of admin removals (66.5%), with the remainder being for various Content Policy violations (31.7%) and other reasons, such as non-spam content manipulation (1.8%)
There were notable increases in legal requests from government and law enforcement agencies to remove content (+32%) and in non-emergency legal requests for account information (+23%; this is the highest volume of information requests that Reddit has ever received in a single reporting period) compared to the second half of 2023
- We carefully scrutinize every request to ensure it is legally valid and narrowly tailored, and include the data on how we’ve responded in the report
- Importantly, we caught and rejected a number of fraudulent legal requests purporting to come from legitimate government and law enforcement agencies; we subsequently reported these bogus requests to the appropriate authorities.

You can read more insights in the full document: Transparency Report: January to June 2024. You can also see all of our past reports and more information on our policies and procedures in our Transparency Center.

Please let us know in the comments section if you have any questions or are interested in learning more about other data or insights.

118 Comments

2024/10/16
17:15 UTC

Q2’24 Safety & Security Quarterly Report

Hi redditors,

We’re back, just as summer starts to recede into fall, with an update on our Q2 numbers and a few highlights from our safety and policy teams. Read on for a roundup of our work on banning content from “nudifying” apps, the upcoming US elections, and our latest Content Policy update. There’s also an FYI that we’ll be updating the name of this subreddit from r/redditsecurity to r/redditsafety going forward. Onto the numbers:

Q2 By The Numbers

Category	Volume (January - March 2024)	Volume (April - June 2024)
Reports for content manipulation	533,455	440,694
Admin content removals for content manipulation	25,683,306	25,062,571
Admin-imposed account sanctions for content manipulation	2,682,007	4,908,636
Admin-imposed subreddit sanctions for content manipulation	309,480	194,079
Reports for abuse	3,037,701	2,797,958
Admin content removals for abuse	548,764	639,986
Admin-imposed account sanctions for abuse	365,914	445,919
Admin-imposed subreddit sanctions for abuse	2,827	2,498
Reports for ban evasion	15,215	15,167
Admin-imposed account sanctions for ban evasion	367,959	273,511
Protective account security actions	764,664	2,159,886

Preventing Nonconsensual Media from Nudifying Apps

Over the last year, a new generation of apps leveraging AI to generate nonconsensual nude images of real people have emerged across the Internet. To be very clear: sharing links to these apps or content generated by them is prohibited on Reddit. Our teams have been monitoring this trend and working to prevent images produced by these apps from appearing on Reddit.

Working across our threat intel and data science teams, we honed in on detection methods to find and ban such violative content. As of August 1, we’ve enforced ~9,000 user bans and over ~40,000 content takedowns. We have ongoing enforcement on content associated with a number of nudifying apps, and we’re continuously monitoring for new ones. If you see content posted by these apps, please report it as nonconsensual intimate media via the report flow. More broadly, we are also partnered with the nonprofit SWGfl to implement their StopNCII tool, which enables victims of nonconsensual intimate media to protect their images and videos online. You can access the tool here.

Harassment Policy Update

In August, we revised our harassment policy language to make clear that sexualizing someone without their consent violates Reddit’s harassment policy. This update prohibits posts or comments that encourage or describe a sex act involving someone who didn’t consent to it, communities dedicated to sexualizing others without their consent, or sending an unsolicited sexualized message or chat.

We haven’t observed significant changes to reporting since this update, but we will be keeping an eye out.

Platform Integrity During Elections

With the US election on the horizon, our teams have been working to ensure that Reddit remains a place for diverse and authentic conversation. We highlighted this in a recent post:

“Always, but especially during elections, our top priority is ensuring user safety and the integrity of our platform. Our Content Policy has long prohibited content manipulation and impersonation – including inauthentic content, disinformation campaigns, and manipulated content presented to mislead (e.g. deepfakes or other manipulated media) – as well as hateful content and incitement of violence.”

For a deeper dive into our efforts, read the full post and be sure to check out the comments for great questions and responses.

Same Subreddit, New Subreddit Name

What's in a name? We think a lot. Over the next few days, we’ll be updating this subreddit name from r/redditsecurity to r/redditsafety to better reflect what you can expect to find here.

While security is part of safety, as you may have noticed over the last few years, much of the content posted in this subreddit reflects the work done by our Safety, Policy, and Legal teams, so the name r/RedditSecurity doesn’t fully encompass the variety of topics we post here. Safety is also more inclusive of all the work we do, and we’d love to make it easier for redditors to find this sub and learn about our work.

Our commitment to transparency with the community remains the same. You can expect r/redditsafety to have our standard reporting from our Quarterly Safety & Security report (like this one!) our bi-annual Transparency Reports, as well as additional policy and safety updates.

Once the change is made, if you visit r/redditsecurity, it will direct you to r/redditsafety. If you’re currently a subscriber here, you’ll be subscribed there. And all of our previous r/redditsecurity posts will remain available in r/redditsafety.

Edit: Column header typo

49 Comments

2024/09/10
18:47 UTC

254

Update on enforcing against sexualized harassment

Hello redditors,

This is u/ailewu from Reddit’s Trust & Safety Policy team and I’m here to share an update to our platform-wide rule against harassment (under Rule 1) and our approach to unwanted sexualization.

Reddit's harassment policy already prohibits unwanted interactions that may intimidate others or discourage them from participating in communities and engaging in conversation. But harassment can take many forms, including sexualized harassment. Today, we are adding language to make clear that sexualizing someone without their consent violates Reddit’s harassment policy (e.g., posts or comments that encourage or describe a sex act involving someone who didn’t consent to it; communities dedicated to sexualizing others without their consent; sending an unsolicited sexualized message or chat).

Our goals with this update are to continue making Reddit a safe and welcoming space for everyone, and set clear expectations for mods and users about what behavior is allowed on the platform. We also want to thank the group of mods who previewed this policy for their feedback.

This policy is already in effect, and we are actively reviewing the communities on our platform to ensure consistent enforcement.

A few call-outs:

This update targets unwanted behavior and content. Consensual interactions would not fall under this rule.
This policy applies largely to “Safe for Work” content or accounts that aren't sexual in nature, but are being sexualized without consent.
Sharing non-consensual intimate media is already strictly prohibited under Rule 3. Nothing about this update changes that.

Finally, if you see or experience harassment on Reddit, including sexualized harassment, use the harassment report flow to alert our Safety teams. For mods, if you’re experiencing an issue in your community, please reach out to r/ModSupport. This feedback is an important signal for us, and helps us understand where to take action.

That’s all, folks – I’ll stick around for a bit to answer questions.

342 Comments

2024/08/15
19:07 UTC

Supporting our Platform and Communities During Elections

Hi redditors,

I’m u/LastBlueJay from Reddit’s Public Policy team. With the 2024 US election having taken some unexpected turns in the past few weeks, I wanted to share some of what we’ve been doing to help ensure the integrity of our platform, support our moderators and communities, and share high-quality, substantiated election resources.

Moderator Feedback

A few weeks ago, we hosted a roundtable discussion with mods to hear their election-related concerns and experiences. Thank you to the mods who participated for their valuable input.

The top concerns we heard were inauthentic content (e.g., disinformation, bots) and moderating hateful content. We’re focused on these issues (more below), and we appreciate the mods’ feedback to improve our existing processes. We also heard that mods would like to see an after-election report discussing how things went on our platform along with some of our key takeaways. We plan to release one following the US election, as we did after the 2020 election. Look for it in Q1 2025.

Protecting our Platform

Always, but especially during elections, our top priority is ensuring user safety and the integrity of our platform. Our Content Policy has long prohibited content manipulation and impersonation – including inauthentic content, disinformation campaigns, and manipulated content presented to mislead (e.g. deepfakes or other manipulated media) – as well as hateful content and incitement of violence.

Content Manipulation and AI-Generated Disinformation

We use AI and ML tooling that flags potentially harmful, spammy, or inauthentic content. Often, this means we can remove this content before anyone sees it. One example of how this works is the attempted coordinated influence operation called “Spamouflage Dragon.” As a result of our proactive detection methods, 85–90% of the content Spamouflage accounts posted on Reddit never reached real redditors, and mods removed the remaining 10-15%.

We are always investing in new and expanded tooling to address this issue. For example, we are testing and evolving tools that can detect AI-generated media, including political content (such as images of sitting politicians and candidates for office), as an additional signal for our teams to consider when assessing threats.

Hateful Content and Violent Rhetoric

Our policies are clear: hate and calls for violence are prohibited. Since 2020, we have continued to build out the teams and tools that address this content, and we have seen reductions in the prevalence of hateful content and improvements in how we action this content. For instance, while user reports remain an important signal, the majority of reports reviewed for hate and harassment are proactively detected via our automated tooling.

Enforcement

Our internal teams enforce these policies using a combination of automated tooling and human review, and we speak regularly with industry colleagues as well as civil society organizations and other experts to complement our understanding of the threat landscape. We also enforce our Moderator Code of Conduct and take action against any mod teams approving or encouraging rule-breaking content in their communities, or interfering with other communities.

So far, these efforts have been effective. Through major elections this year in India, the EU, the UK, France, Mexico, and elsewhere, we have not seen any significant or out of the ordinary election-related malicious activity. That said, we know our work is not done, and the unpredictability that has marked the US election cycle may be a driver for harmful content. To address this, we are adding training for our Safety teams on a range of potential scenarios, including content manipulation and hateful content, with a focus on political violence, race, and gender-based hate.

Support for Moderators and Communities

We provide moderators with support and tools to foster safe, on-topic communities. During elections, this means sharing important resources and proactively reaching out to communities likely to experience an increase in traffic to offer assistance, including via our Mod Reserves program, Crowd Control tool, and Temporary Events feature. Mods can also use our suite of tools to help filter out abusive and spammy content. For instance, we launched our Harassment Filter this year and have seen positive feedback from mods so far. You can read more about the filter here. Currently, the Harassment Filter is flagging more than 25,000 comments per day in over 15,000 communities.

We are also experimenting with ways to allow moderators to escalate election-related concerns, such as a dedicated tip line (currently in beta testing with certain communities - let us know in the comments if your community would like to be part of the test!) and adding a new report flow for spammy, bot-related links.

Voting Resources

We also work to provide redditors access to high-quality, substantiated resources during elections. We share these through our u/UptheVote Reddit account as well as on-platform notifications. And as in previous years, we have arranged a series of AMA (Ask Me Anything) sessions about voting and elections, and maintain partnerships with National Voter Registration Day and Vote Early Day.

Political Ads Opt-Out

I know that was a lot of information, so I’ll just share one last thing. Yesterday, we updated our “Sensitive Advertising Categories” to include political and activism-related advertisements – that means you’ll be able to opt-out of such ads going forward. You can read more about our Political Ads policy here.

I’ll stick around for a bit to answer any questions.

[edit: formatting]

100 Comments

2024/08/01
19:18 UTC

Reddit & HackerOne Bug Bounty Announcement

Hello, Redditors!

We are thrilled to announce some significant updates to our HackerOne public bug bounty program, which encourages hackers and researchers to find (and get paid for finding) vulnerabilities and bugs on Reddit’s platform. We are rolling out a new bug bounty policy and upping the rewards across all severity levels, with our highest bounty now topping out at $15,000. Reddit is excited to make this investment into our bug bounty community!

These changes will take effect starting today, June 26, 2024. Check out our official program page on HackerOne to see all the updates and submit your findings.

We’ll stick around for a bit to answer any questions you have about the updates. Please also feel free to cross-post this news into your communities and spread the word.

28 Comments

2024/06/26
17:07 UTC

Q1 2024 Safety & Security Report

Hi redditors,

I can’t believe it’s summer already. As we look back at Q1 2024, we wanted to dig a little deeper into some of the work we’ve been doing on the safety side. Below, we discuss how we’ve been addressing affiliate spam, give some data on our harassment filter, and look ahead to how we’re preparing for elections this year. But first: the numbers.

Q1 By The Numbers

Category	Volume (October - December 2023)	Volume (January - March 2024)
Reports for content manipulation	543,997	533,455
Admin content removals for content manipulation	23,283,164	25,683,306
Admin imposed account sanctions for content manipulation	2,534,109	2,682,007
Admin imposed subreddit sanctions for content manipulation	232,114	309,480
Reports for abuse	2,813,686	3,037,701
Admin content removals for abuse	452,952	548,764
Admin imposed account sanctions for abuse	311,560	365,914
Admin imposed subreddit sanctions for abuse	3,017	2,827
Reports for ban evasion	13,402	15,215
Admin imposed account sanctions for ban evasion	301,139	367,959
Protective account security actions	864,974	764,664

Combating SEO spam

Spam is an issue we’ve dealt with for as long as Reddit has existed, and we have sophisticated tools and processes to address it. However, spammers can be creative, so we often work to evolve our approach as we see new kinds of spammy behavior on the platform. One recent trend we’ve seen is an influx of affiliate spam-related content (i.e., spam used to promote products or services) where spammers will comment with product recommendations on older posts to increase visibility in search engines.

While much of this content is being caught via our existing spam processes, we updated our scaled, automated detection tools to better target the new behavioral patterns we’re seeing with this activity specifically — and our internal data shows that our approach is effectively removing this content. Between April and June 2024, we actioned 20,000 spammers, preventing them from infiltrating search results via Reddit. We’ve also taken down more than 950 subreddits, banned 5,400 domains dedicated to this behavior, and averaged 17k violating comment removals per week.

Empowering communities with LLMs

Since launching the Harassment Filter in Q1, communities across Reddit have adopted the tool to flag potentially abusive comments in their communities. Feedback from mods was positive, with many highlighting that the filter surfaces content inappropriate for their communities that might have gone unnoticed — helping keep conversations healthy without adding additional moderation overhead.

Currently, the Harassment filter is flagging more than 24,000 comments per day in almost 9,000 communities.

We shared more on the Harassment Filter and the LLM that powers it in this Mod News post. We’re continuing to build our portfolio of community tools and are looking forward to launching the Reputation Filter, a tool to flag content from potentially inauthentic users, in the coming months.

On the horizon: Elections

We’ve been focused on preparing for the many elections happening around the world this year–including the U.S. presidential election–for a while now. Our approach includes promoting high-quality, substantiated resources on Reddit (check out our Voter Education AMA Series) as well as working to protect our platform from harmful content. We remain focused on enforcing our rules against content manipulation (in particular, coordinated inauthentic behavior and AI-generated content presented to mislead), hateful content, and threats of violence, and are always investing in new and expanded tools to assess potential threats and enforce against violating content. For example, we are currently testing a new tool to help detect AI-generated media, including political content (such as AI-generated images featuring sitting politicians and candidates for office). We’ve also introduced a number of new mod tools to help moderators enforce their subreddit-level rules.

We’re constantly evolving how we handle potential threats and will share more information on our approach as the year unfolds. In the meantime, you can see our blog post for more details on how we’re preparing for this election year as well as our Transparency Report for the latest data on handling content moderation and legal requests.

Edit: formatting

Edit: formatting again

Edit: Typo

Edit: Metric correction

41 Comments

2024/06/13
17:22 UTC

Sharing our Public Content Policy and a New Subreddit for Researchers

0 Comments

2024/05/09
16:43 UTC

Reddit Transparency Report: Jul-Dec 2023

Hello, redditors!

Today we published our Transparency Report for the second half of 2023, which shares data and insights about our content moderation and legal requests from July through December 2023.

Reddit’s biannual Transparency Reports provide insights and metrics about content that was removed from Reddit – including content proactively removed as a result of automated tooling, accounts that were suspended, and legal requests we received from governments, law enforcement agencies, and third parties from around the world to remove content or disclose user data.

Some key highlights include:

Content Creation & Removals:
- Between July and December 2023, redditors shared over 4.4 billion pieces of content, bringing the total content on Reddit (posts, comments, private messages and chats) in 2023 to over 8.8 billion. (+6% YoY). The vast majority of content (~96%) was not found to violate our Content Policy or individual community rules.
  - Of the ~4% of removed content, about half was removed by admins and half by moderators. (Note that moderator removals include removals due to their individual community rules, and so are not necessarily indicative of content being unsafe, whereas admin removals only include violations of our Content Policy).
  - Over 72% of moderator actions were taken with Automod, a customizable tool provided by Reddit that mods can use to take automated moderation actions. We have enhanced the safety tools available for mods and expanded Automod in the past year. You can see more about that here.
  - The majority of admin removals were for spam (67.7%), which is consistent with past reports.
- As Reddit's tools and enforcement capabilities keep evolving, we continue to see a trend of admins gradually taking on more content moderation actions from moderators, leaving moderators more room to focus on their individual community rules.
  - We saw a ~44% increase in the proportion of non-spam, rule-violating content removed by admins, as opposed to mods (admins remove the majority of spam on the platform using scaled backend tooling, so excluding it is a good way of understanding other Content Policy violations).
New “Communities” Section
- We’ve added a new “Communities” section to the report to highlight subreddit-level actions as well as admin enforcement of Reddit’s Moderator Code of Conduct.
Global Legal Requests
- We continue to process large volumes of global legal requests from around the world. Interestingly, we’ve seen overall decreases in global government and law enforcement legal requests to remove content or disclose account information compared to the first half of 2023.
  - We routinely push back on overbroad or otherwise objectionable requests for account information, and fight to ensure users are notified of requests.
  - In one notable U.S. request for user information, we were served with a sealed search warrant from the LAPD seeking records for an account allegedly involved in the leak of an LA City Council meeting recording that resulted in the resignation of prominent, local political leaders. We fought to notify the account holder about the warrant, and while we didn’t prevail initially, we persisted and were eventually able to get the warrant and proceedings unsealed and provide notice to the redditor.

You can read more insights in the full document: Transparency Report: July to December 2023. You can also see all of our past reports and more information on our policies and procedures in our Transparency Center.

Please let us know in the comments section if you have any questions or are interested in learning more about other data or insights.

94 Comments

2024/04/16
17:12 UTC

Is this really from Reddit? How to tell:

1 Comment

2024/02/22
15:54 UTC

Q4 2023 Safety & Security Report

Hi redditors,

While 2024 is already flying by, we’re taking our quarterly lookback at some Reddit data and trends from the last quarter. As promised, we’re providing some insights into how our Safety teams have worked to keep the platform safe and empower moderators throughout the Israel-Hamas conflict. We also have an overview of some safety tooling we’ve been working on. But first: the numbers.

Q4 By The Numbers

Category	Volume (July - September 2023)	Volume (October - December 2023)
Reports for content manipulation	827,792	543,997
Admin content removals for content manipulation	31,478,415	23,283,164
Admin imposed account sanctions for content manipulation	2,331,624	2,534,109
Admin imposed subreddit sanctions for content manipulation	221,419	232,114
Reports for abuse	2,566,322	2,813,686
Admin content removals for abuse	518,737	452,952
Admin imposed account sanctions for abuse	277,246	311,560
Admin imposed subreddit sanctions for abuse	1,130	3,017
Reports for ban evasion	15,286	13,402
Admin imposed account sanctions for ban evasion	352,125	301,139
Protective account security actions	2,107,690	864,974

Israel-Hamas Conflict

During times of division and conflict, our Safety teams are on high-alert for potentially violating content on our platform.

Most recently, we have been focused on ensuring the safety of our platform throughout the Israel-Hamas conflict. As we shared in our October blog post, we responded quickly by engaging specialized internal teams with linguistic and subject-matter expertise to address violating content, and leveraging our automated content moderation tools, including image and video hashing. We also monitor other platforms for emerging foreign terrorist organizations content to identify and hash it before it could show up to our users. Below is a summary of what we observed in Q4 related to the conflict:

As expected, we had increased the required removal of content related to legally-identified foreign terrorist organizations (FTO) because of the proliferation of Hamas-related content online
- Reddit removed and blocked the additional posting of over 400 pieces of Hamas content between October 7 and October 19 — these two weeks accounted for half of the FTO content removed for Q4
Hateful content, including antisemitism and islamophobia, is against Rule 1 of our Content Policy, as is harassment, and we continue to aggressively take action against it. This includes October 7th denialism
- At the start of the conflict, user reports for abuse (including hate) rose 9.6%. They subsided by the following week. We had a corresponding rise in admin-level account sanctions (i.e., user bans and other enforcement actions from Reddit employees).
- Reddit Enforcement had a 12.4% overall increase in account sanctions for abuse throughout Q4, which reflects the rapid response of our teams in recognizing and effectively actioning content related to the conflict
Moderators also leveraged Reddit safety tools in Q4 to help keep their communities safe as conversation about the conflict picked up
- Utilization of the Crowd Control filter increased by 7%, meaning mods were able to leverage community filters to minimize community interference
- In the week of October 8th, there was a 9.4% increase in messages filtered by the modmail harassment filter, indicating the tool was working to keep mods safe

As the conflict continues, our work here is ongoing. We’ll continue to identify and action any violating content, including FTO and hateful content, and work to ensure our moderators and communities are supported during this time.

Other Safety Tools

As Reddit grows, we’re continuing to build tools that help users and communities stay safe. In the next few months, we’ll be officially launching the Harassment Filter for all communities to automatically flag content that might be abuse or harassment — this filter has been in beta for a while, so a huge thank you to the mods that have participated, provided valuable feedback and gotten us to this point. We’re also working on a new profile reporting flow so it’s easier for users to let us know when a user is in violation of our content policies.

That’s all for this report (and it’s quite a lot), so I’ll be answering questions on this post for a bit.

57 Comments

2024/02/13
18:40 UTC

Q3 2023 Safety & Security Report

Hi redditors,

As we come to the end of 2023, we’re publishing our last quarterly report in this year. In this edition, in addition to our quarterly numbers, you’ll find an update on our advanced spam capabilities, product highlights, and a welcome to Reddit’s new CISO.

One note: Because this report reflects July through September 2023, we will be sharing insights into the Israel-Hamas conflict in our following report that covers Q4 2023.

Now onto the numbers…

Q3 By The Numbers

Category	Volume (April - June 2023)	Volume (July - September 2023)
Reports for content manipulation	892,936	827,792
Admin content removals for content manipulation	35,317,262	31,478,415
Admin imposed account sanctions for content manipulation	2,513,098	2,331,624
Admin imposed subreddit sanctions for content manipulation	141,368	221,419
Reports for abuse	2,537,108	2,566,322
Admin content removals for abuse	409,928	518,737
Admin imposed account sanctions for abuse	270,116	277,246
Admin imposed subreddit sanctions for abuse	9,470	1,130
Reports for ban evasion	17,127	15,286
Admin imposed account sanctions for ban evasion	266,044	352,125
Protective account security actions	1,034,690	2,107,690

Mod World

In December, Reddit’s Community team hosted Mod World: an interactive, virtual experience that brought together mods from all around the world to learn, share, and hear from one another and Reddit Admins. Our very own Director of Threat Intel chatted with a Reddit moderator during a session focused on spam and provided a behind-the-scenes look at detecting and mitigating spam. We also had a demo of our Contributor Quality Score & Ban Evasion tools that launched earlier this year.

If you missed Mod World, you can rewatch the sessions on our new Reddit for Community page, a one-stop-shop for moderators that was unveiled at the event.

Spam Detection Improvements

Speaking of spam, our team launched a new detection method to assess content and user-level patterns that help us more decisively predict whether an account is exhibiting human or bot-like behavior. After a rigorous testing period, we integrated this methodology into our spam actioning systems and are excited about the positive results:

We identified at least an additional 2 million spam accounts for enforcement
Actioned 3x more spam accounts within 60 seconds of posting a post or comment

These are big improvements to how we’re able to keep spam off the site so users and mods never need to see or action it.

What’s Launched

Reports & Removals Insights for Communities

Last week, we revamped the Community Health page for all communities and renamed it “Reports & Removals.” This updated page provides mods with clear and new insights around content moderation in their communities, including data about Admin removals. A quick summary of what changed:

We renamed the page to “Reports and Removals” to better describe exactly what you can find on the page.
We introduced a new “Content Removed by Admins” chart which displays admin content removals in your community and also distinguishes between spam and policy removals.
We created a new Safety Filters Monthly Overview to help visualize the impact of Crowd Control and the Ban Evasion Filter in your community.
We modernized the page’s interface so that it’s easier to find, read, and tinker with the dashboard settings.

You can find the full post here.

Simplifying Enforcement Appeals

In Q3, we launched a simpler appeals flow for users who have been actioned by Reddit admins. A key goal of this change was to make it easier for users to understand why they had been actioned by Reddit by tying the appeal process to the enforcement violation rather than the user’s sanction.

The new flow has been successful, with the number of appealers reporting “I don’t know why I was banned” dropping 50% since launch.

Reddit’s New CISO

We’re happy to share that a few months back, we welcomed a new Chief Information Security Officer: Fredrick Lee, aka Flee (aka u/cometarystones), officially the coolest CISO name around! He oversees our Security and Privacy teams and you may see him stop by in this community every once in a while to answer your burning security questions. Fun fact: In addition to being a powerlifter, Flee also lurks in r/MMA, so bad folks better watch out.

54 Comments

2023/12/19
19:08 UTC

Q2 2023 Quarterly Safety & Security Report

Hi redditors,

It’s been a while between reports, and I’m looking forward to getting into a more regular cadence with you all as I pick up the mantle on our quarterly report.

Before we get into the report, I want to acknowledge the ongoing Israel-Gaza conflict. Our team has been monitoring the conflict closely and reached out to mods last month to remind them of resources we have available to help keep their communities safe. We also shared how we plan to continue to uphold our sitewide policies. Know that this is something we’re working on behind the scenes and we’ll provide a more detailed update in the future.

Now, onto the numbers and our Q2 report.

Q2 by the Numbers

Category	Volume (Jan - Mar 2023)	Volume (April - June 2023)
Reports for content manipulation	867,607	892,936
Admin content removals for content manipulation	29,125,705	35,317,262
Admin imposed account sanctions for content manipulation	8,468,252	2,513,098
Admin imposed subreddit sanctions for content manipulation	122,046	141,368
Reports for abuse	2,449,923	2,537,108
Admin content removals for abuse	227,240	409,928
Admin imposed account sanctions for abuse	265,567	270,116
Admin imposed subreddit sanctions for abuse	10,074	9,470
Reports for ban evasion	17,020	17,127
Admin imposed account sanctions for ban evasion	217,634	266,044
Protective account security actions	1,388,970	1,034,690

Methodology Update

For folks new to this report, we share user reporting and our actioning numbers each quarter to ensure a level of transparency in our efforts to keep Reddit safe. As our enforcement and data science teams have grown and evolved, we’ve been able to improve our reporting definitions and precision of our reporting methodology.

Moving forward, these Quarterly Safety & Security Reports will be more closely aligned with our more in-depth, now bi-annual Reddit Transparency Report, which just came out last month. This small shift has changed how we share some of the numbers in these quarterly reports:

Reporting queries are refined to reflect the content and accounts (for ban evasion) that have been reported instead of a mix of submitted reports and reported content
Time window for reporting reports queries now uses a definition based on when a piece of content or an account is first reported
Account sanction reporting queries are updated to better categorize sanction reasons and admin actions
Subreddit sanction reporting queries are updated to better categorize sanction reasons

It’s important to note that these reporting changes do not change our enforcement. With investments from our Safety Data Science team, we’re able to generate more precise categorization of reports and actions with more standardized timing. That means there’s a discontinuity in the numbers from previous reports, so today’s report shows the revamped methodology run quarter over quarter for Q1’23 and Q2’23.

A big thanks to our Safety Data Science team for putting thought and time into these reporting changes so we can continue to deliver transparent data.

Dragonbridge

We’re sharing our internal investigation findings on the coordinated influence operation dubbed “Dragonbridge” or “Spamoflauge Dragon.” Reddit has been investigating activities linked to this network for about two years and though our efforts are ongoing, we wanted to share an update about how we’re detecting, removing, and mitigating behavior and content associated with this campaign:

Dragonbridge operates with a high-volume strategy. Meaning, they create a significant number of accounts as part of their amplification efforts. While this tactic might be effective on other platforms, the overwhelming majority of these accounts have low visibility on Reddit and do not gain traction. We’ve actioned tens of thousands of accounts for ties to this actor group to date.
Most content posted by Dragonbridge accounts is ineffective on Reddit: 85-90% never reaches real users due to Reddit’s proactive detection methods
Mods remove almost all of the remaining 10-15% because it’s recognized as off-topic, spammy, or just generally out of place. Redditors are smart and know their communities: you all do a great job of recognizing actors who try to enter under false pretenses.

Although connected with a state actor, most Dragonbridge content was spammy by nature — we would action these posts under our sitewide policies, which prohibit manipulated content or spam. The connection to a state actor elevates the seriousness with which we view the violation, but we want to emphasize we would be taking this content down.

Please continue to use our anti-spam and content manipulation safeguards (hit that report button!) within your communities.

New tools for keeping communities safe

In September, we launched the Contributor Quality Score in AutoMod to give mods another tool to combat spammy users. We also shipped Mature Content filters to help SFW communities keep unsuitable content out of their spaces. We’re excited to see the adoption of these features and to build out these capabilities with feedback from mods.

We’re also working on a brand new Safety Insights hub for mods which will house more information about reporting, filtering, and removals in their community. I’m looking forward to sharing more on what’s coming and what we’ve launched in our Q3 report.

Edit: fixed a broken link

52 Comments

2023/11/14
17:27 UTC

Reddit Transparency Report: Jan-Jun 2023

Greetings, redditors!

Today we published our Transparency Report for the first half of 2023, which focuses on data and insights from January through June for both content moderation and legal requests.

We have historically published these reports on an annual basis, covering the entire year prior. To provide deeper analysis across a shorter period of time and increase our reporting cadence, we have now transitioned into a biannual schedule – starting with today’s report! You’ll begin to see these pop up more frequently in r/redditsecurity, and all reports past and present are housed in our Transparency Center.

As a quick refresher, our Transparency Reports provide quantitative findings and metrics about content removed from Reddit. This includes, but is not limited to, proactively removed content as a result of automated tooling, accounts suspended, and legal requests from governments, law enforcement agencies, and third parties to remove content or obtain private user data.

Content Creation & Removals: From January to June 2023, redditors created 4.4 billion pieces of content across Reddit communities. This is on track to surpass the content created in 2022.

Mods and admins removed 3.8% of the content created on Reddit, across all content types (1.96% by mods and 1.85% by admins) during this time. As usual, spam accounted for the supermajority of admin removals (78.6%), with the remainder being for various Content Policy violations (19.6%) and other content manipulation removals, such as report abuse (1.8%).
Close to 72% of content removal actions by mods were the result of proactive Automod removals.

Report Updates: We expanded reporting about admin enforcement of Reddit’s Moderator Code of Conduct, which sets out our expectations for community moderators. The new data includes a breakdown of the types of investigations conducted in response to potential Code of Conduct violations, with the majority (53.5%) falling under Rule 3: Respect Your Neighbors.

In addition, we have expanded our reporting in a number of areas, including moving data about removing terrorist content into its own section and expanding insights into legal requests for account information with new data about investigation types, disclosure impact, and how we handle fraudulent requests.

Global Legal Requests: We have continued to process large volumes of global legal requests from around the world. Interestingly, we received 29% fewer legal requests to remove content from government and law enforcement agencies during this reporting period, in contrast with receiving 21% more legal requests to disclose account information from global officials.

We routinely push back on overbroad or otherwise objectionable requests for account information, including, if necessary, in court. As an example, during the reporting period, we successfully defeated a production company’s efforts to unmask Reddit users by asserting our users’ First Amendment rights to engage in anonymous online speech.

We also started sharing new insights about fraudulent law enforcement requests. We identified and rejected three fraudulent emergency disclosure requests and one non-emergency disclosure request that sought to inappropriately obtain private user data under false premises.

You can read more insights in our Transparency Report: January to June 2023. Please let us know in the comments section if you have any questions or are interested in learning more about other data or insights.

26 Comments

2023/10/04
18:06 UTC

Q1 Safety & Security Report

Hello! I’m not the u/worstnerd but I’m not far from it, maybe third or fourth worst of the nerds? All that to say, I’m here to bring you our Q1 Safety & Security report. In addition to the quarterly numbers, we’re highlighting some results from the ban evasion filter we launched in Q1 to help mods keep their communities safe, as well as updates to our Automod notification architecture.

Q1 By The Numbers

Category	Volume (Oct - Dec 2022)	Volume (Jan - Mar 2023)
Reports for content manipulation	7,924,798	8,002,950
Admin removals for content manipulation	79,380,270	77,403,196
Admin-imposed account sanctions for content manipulation	14,772,625	16,194,114
Admin-imposed subreddit sanctions for content manipulation	59,498	88,772
Protective account security actions	1,271,742	1,401,954
Reports for ban evasion	16,929	20,532
Admin-imposed account sanctions for ban evasion	198,575	219,376
Reports for abuse	2,506,719	2,699,043
Admin-imposed account sanctions for abuse	398,938	447,285
Admin-imposed subreddit sanctions for abuse	1,202	897

Ban Evasion Filter

Ban evasion has been a persistent problem for mods (and admins). Over the past year, we’ve been working on a ban evasion filter, an optional subreddit setting that leverages our ability to identify posts and comments authored by potential ban evaders. Our goal in offering this feature was to help reduce time mods spent detecting ban evaders and prevent their potential negative community impact.

Initially piloted in August 2022, we released the ban evasion filter to all communities this May after incorporating feedback from mods. Since then we’ve seen communities adopting the filter and keeping it on — with positive qualitative feedback too. We have a few improvements on the radar, including faster detection of ban evaders, and are looking forward to continuing to iterate with y’all.

Adoption
- 7,500 communities have turned on the ban evasion filter
Volume
- 5,500 pieces of content are ban evasion-filtered per week from communities that have adopted the tool
Reversal Rate
- Mods keep 92% of ban evasion filtered content out of their communities, indicating the filter is catching the right stuff
Retention
- 98.7% of communities that have turned on the ban evasion filter have kept it on

Automod Notification Checks

Last week, we started rolling out changes to the way our notification systems are architected. Automod will now run before post and comment reply notifications are sent out. This includes both push notifications and email notifications. The change will be fully rolled out in the next few weeks.

This change is designed to improve the user experience on our platform. By running the content checks before notifications are sent out, we can ensure that users don't see content that has been taken down by Automod.

Up Next

More Community Safety Filters

We’re working on another new set of community moderation filters for mature content to further prevent this content from showing up in places where it shouldn’t or where users might not expect it, which we’ve heard from mods that they want. We already employ automated tagging at the site level for sexually explicit content, so this will add to those protections by providing a subreddit-level filter for a wider range of mature content. We’re working to get the first version of these filters to mods in the next couple of months.

29 Comments

2023/07/26
17:27 UTC

Content Policy updates: clarifying Rule 3 (non-consensual intimate media) and expanding Rule 4 (minor safety)

Hello Reddit Community,

Today, we're rolling out updates to Rule 3 and Rule 4 of our Content Policy to clarify the scope of these rules and give everyone a better sense of the types of content and behaviors that are not allowed on Reddit. This is part of our ongoing work to be transparent with you about how we’re evolving our sitewide rules to keep Reddit safe and healthy.

First, we're updating the language of our Rule 3 policy prohibiting non-consensual intimate media to more specifically address AI-generated sexual media. While faked depictions of non-consensual intimate media are already prohibited, this update makes it clear that sexually explicit AI-generated content violates our rules if it depicts a real, identifiable person.

This update also clarifies that AI-generated sexual media that depict fictional people, or artistic depictions such as cartoons or anime whether AI-generated or not, do not fall under this rule. Keep in mind however that this type of media may violate subreddit-specific rules or other policies (such as our policy against copyright infringement), which our Safety teams already enforce across the platform.

Sidenote: Reddit also leverages StopNCII.org, a free, online tool that supports platforms to detect and remove non-consensual intimate media while protecting the victim’s privacy. You can read more information about how StopNCII.org works here. If you've been affected by this issue, you can access the tool here.

Now to Rule 4. While the vast majority of Reddit users are adults, it is critical that our community continues to prioritize the safety, security, and privacy of minors regardless of their engagement with our platform. Given the importance of minor safety, we are expanding the scope of this Rule to also prohibit non-sexual forms of abuse of minors (e.g., neglect, physical or emotional abuse, including, for example, videos of things like physical school fights). This represents a new violative content category.

Additionally, we already interpret Rule 4 to prohibit inappropriate and predatory behaviors involving minors (e.g., grooming) and actively enforce against this content. In line with this, we’re adding language to Rule 4 to make this even clearer.

You'll also note that we're parting ways with some outdated terminology (e.g., "child pornography") and adding specific examples of violative content and behavior to shed light on our interpretation of the rule.

As always, to help keep everyone safe, we encourage you to flag potentially violative content by using the in-line report feature on the app or website, or by visiting this page.

That's all for now, and I'll be around for a bit to answer any questions on this announcement!

58 Comments

2023/07/06
20:05 UTC

162

Introducing Our 2022 Transparency Report and New Transparency Center

Hi all, I’m u/outersunset, and I’m here to share that Reddit has released our full-year Transparency Report for 2022. Alongside this, we’ve also just launched a new online Transparency Center, which serves as a central source for Reddit safety, security, and policy information. Our goal is that the Transparency Center will make it easier for users - as well as other interested parties, like policymakers and the media - to find information about how we moderate content, deal with complex things like legal requests, and keep our platform safe for all kinds of people and interests.

And now, our 2022 Transparency Report: as many of you know, we publish these reports on a regular basis to share insights and metrics about content removed from Reddit – including content proactively removed as a result of automated tooling - as well as accounts suspended, and legal requests from governments, law enforcement agencies, and third parties to remove content or lawfully obtain private user data.

Reddit’s Biggest Content Creation Year Yet

Content Creation: This year, our report shows that there was a lot of content on Reddit. 2022 was the biggest year of content creation on Reddit to date, with users creating an eye-popping 8.3 billion posts, comments, chats, and private messages on our platform (you can relive some of the beautiful mess that was 2022 via our Reddit Recap).
Content Policy Compliance: Importantly, the overwhelming majority – over 96% – of Reddit content in 2022 complied with our Content Policy and individual community rules. This is a slight increase from last year’s 95%. The remaining 4% of content in 2022 was removed by moderators or admins, with the overwhelming majority of admin removals (nearly 80%) being due to spam, such as karma farming.

Other key highlights from this year include:

Content & Subreddit Removals: Consistent with previous years, there were increased content and subreddit removals across most policy categories. Based on the data as a whole, we believe this is largely due to our evolving policies and continuous enforcement improvements. We’re always looking for ways to make our platform a healthy place for all types of people and interests, and this year’s data demonstrates that we’re continuing to improve over time.
- We’d also like to give a special shoutout to the moderators of Reddit, who accounted for 58% of all content removed in 2022. This was an increase of 4.7% compared to 2021, and roughly 69% of these were a result of proactive Automod removals. Building out simpler, better, and faster mod tooling is a priority for us, so watch for more updates there from us.
Global Legal Requests: We saw increased volumes across nearly all types of global legal requests. This is in line with industry trends.
- This includes year-over-year increases of 43% in copyright notices, 51% in legal removal requests submitted by government and law enforcement agencies, 61% in legal requests for account information from government and law enforcement agencies, and 95% in trademark notices.

You can read more insights in the full-year 2022 Transparency Report here.

Starting later this year, we’ll be shifting to publishing this full report - with both legal requests and content moderation data - on a biannual cadence (our first mid-year Transparency Report focused only on legal requests). So expect to see us back with the next report later in 2023!

Overall, it’s important to us that we remain open and transparent with you about what we do and why. Not only is “Default Open” one of our company values, we also think it’s the right thing to do and central to our mission to bring community, empowerment, and belonging to everyone in the world. Please let us know in the comments what other kinds of data and insights you’d be interested in seeing. I’ll stick around for a bit to hear your feedback and answer some questions.

99 Comments

2023/03/29
17:14 UTC

122

Q4 Safety & Security Report

Happy Women’s history month everyone. It's been a busy start to the year. Last month, we fielded a security incident that had a lot of snoo hands on deck. We’re happy to report there are no updates at this time from our initial assessment and we’re undergoing a third-party review to identify process improvements. You can read the detailed post on the incident by u/keysersosa from last month. Thank you all for your thoughtful comments and questions, and to the team for their quick response.

Up next: The Numbers:

Q4 By The Numbers

Category	Volume (Jul - Sep 2022)	Volume (Oct - Dec 2022)
Reports for content manipulation	8,037,748	7,924,798
Admin removals for content manipulation	74,370,441	79,380,270
Admin-imposed account sanctions for content manipulation	9,526,202	14,772,625
Admin-imposed subreddit sanctions for content manipulation	78,798	59,498
Protective account security actions	1,714,808	1,271,742
Reports for ban evasion	22,813	16,929
Admin-imposed account sanctions for ban evasion	205,311	198,575
Reports for abuse	2,633,124	2,506,719
Admin-imposed account sanctions for abuse	433,182	398,938
Admin-imposed subreddit sanctions for abuse	2,049	1,202

Modmail Harassment

We talk often about our work to keep users safe from abusive content, but our moderators can be the target of abusive messages as well. Last month, we started testing a Modmail Harassment Filter for moderators and the results are encouraging so far. The purpose of the filter is to limit harassing or abusive modmail messages by allowing mods to either avoid or use additional precautions when viewing filtered messages. Here are some of the early results:

Value
- 40% (!) decrease in mod exposure to harassing content in Modmail
Impact
- 6,091 conversation have been filtered (average of 234 conversations per day)
  - This is an average of 4.4% of all modmail conversations across communities that opted in
Adoption
- ~64k communities have this feature turned on (most of this is from newly formed subreddits).
- We’re working on improving adoption, because…
Retention
- ~100% of subreddits that have it turned on, keep it on. This number is the same for the subreddits that have manually opted in and the new subreddits that were defaulted in and sliced several different ways. Basically, everyone keeps it on.

Over the next few months we will continue to make model iterations to further improve performance and to keep up with the latest trends in abuse language on the platform (because shitheads never rest). We are also exploring new ways of introducing more explicit feedback signals from mods.

Subreddit Spam Filter

Over the last several years, Reddit has developed a wide variety of new, advanced tools for fighting spam. This allowed us to do an evaluation of one of the oldest spam tools that we have: the Subreddit Spam Filter. During this analysis, we discovered that the Subreddit Spam Filter was markedly error prone compared to our newer site-wide solutions, and in many cases bordered on completely random as some of you were well aware. In Q4, we performed experiments and the results validated our hypothesis. Our results showed 40% of posts removed by this system were not actually spam, and the majority of true spam that was flagged was also caught by other systems. After seeing these results, in December 2022, we disabled the Subreddit Spam Filter in the background, and it turned out that no one noticed! This was because our modern tools catch the bad content with a higher degree of accuracy than the Subreddit spam filter. We will be removing the ‘Low’ and ‘High’ settings associated with the old filter, but we will maintain the functionality for mods to “Filter all posts” and will update the Community Settings to reflect this.

We know it’s important that spam be caught as quickly as possible, and we also recognize that spammy content in communities may not be the same thing as the scaled spam campaigns that we often focus on at the admin level.

Next Up

We will continue to invest in admin-level tooling and our internal safety teams to catch violating content at scale, and our goal is that these updates for users and mods also provide even more choice and power at the community level. We’re also in the process of producing our next Transparency Report, which will be coming out soon. We’ll be sure to share the findings with you all once that’s complete.

Be excellent to each other

70 Comments

2023/03/06
18:07 UTC

297

We had a security incident. Here’s what we know.

0 Comments

2023/02/09
19:44 UTC

142

Q3 Safety & Security Report

As we kick off the new year, we wanted to share the Q3 Safety and Security report. Often these reports focus on our internal enforcement efforts, but this time we wanted to touch on some of the things we are building to help enable moderators to keep their communities safe. Subreddit needs are as diverse as our users, and any centralized system will fail to fully meet those needs. In 2023, we will be placing even more of an emphasis on developing community moderation tools that make it as easy as possible for mods to set safety standards for their communities.

But first, the numbers…

Q3 By The Numbers

Category	Volume (Apr - Jun 2022)	Volume (Jul - Sep 2022)
Reports for content manipulation	7,890,615	8,037,748
Admin removals for content manipulation	55,100,782	74,370,441
Admin-imposed account sanctions for content manipulation	8,822,056	9,526,202
Admin-imposed subreddit sanctions for content manipulation	57,198	78,798
Protective account security actions	661,747	1,714,808
Reports for ban evasion	24,595	22,813
Admin-imposed account sanctions for ban evasion	169,343	205,311
Reports for abuse	2,645,689	2,633,124
Admin-imposed account sanctions for abuse	315,222	433,182
Admin-imposed subreddit sanctions for abuse	2,528	2049

Ban Evasion

Ban Evasion is one of the most challenging and persistent problems that our mods (and we) face. The effectiveness of any enforcement action hinges on the action having actual lasting consequences for the offending user. Additionally, when a banned user evades a ban, they rarely come back to change their behavior for the better; often it leads to an escalation of the bad behavior. On top of our internal ban evasion tools we’ve been building out over the last several years, we have been working on developing ban evasion tooling for moderators. I wanted to share some of the current results along with some of the plans for this year.

Today, mod ban evasion filters are flagging around 2.5k-3k pieces of content from ban evading users each day in our beta group at an accuracy rate of around 80% (the mods can confirm or reject the decision). While this works reasonably well, there are still some sharp edges for us to address. Today, mods can only approve a single piece of content, instead of all content from a user, which gets pretty tedious. Also, mods can set a tolerance level for the filter, which basically reflects how likely we think the account is to be evading, but we would like to give mods more control over exactly which accounts are being flagged. We will also be working on providing mods with more context about why a particular account was flagged, while still respecting the privacy of all users (yes, even the privacy of shitheads).

We’re really excited for this feature to roll out to GA this year and optimistic that this will be very helpful for mods and will reduce abuse from some of the most…challenging users.

Karma Farming

Karma farming is another consistent challenge that subreddits face. There are some legitimate reasons why accounts need to quickly get some karma (helpful mod bots, for example, need some karma to be able to post in relevant communities), and some karma farming behaviors are often just new users learning how to engage (while others just love internet points). Mods historically have had to rely on overall karma restrictions (along with a few other things) to help minimize the impact. A long requested feature has been to give automod access to subreddit-specific karma. Last month, we shipped just such a feature. So now, mods can write rules to flag content by users that may have positive karma overall, but 0 or negative karma in their specific subreddit.

But why do we care about users farming for fake internet points!? Karma is often used as a proxy for how trusted or “good” a user is. Through automod, mods can create rules that treat content by low karma users differently (perhaps by requiring mod approval). Low, but non-negative, karma users can be spammers, but they can also be new users…so it’s an imperfect proxy. Negative karma is often a strong signal of an abusive user or a troll. However, the overall karma score doesn’t help with the situation in which a user may be a positively contributing member in one set of communities, but a troll in another (an example might be sports subreddits, where a user might be a positive contributor in say r/49ers, but a troll in r/seahawks.)

Final Thoughts

Subreddits face a wide range of challenges and it takes a range of tools to address them. Any one tool is going to leave gaps. Additionally, any purely centralized enforcement system is going to lack the nuance, and perspective that our users and moderators have in their space. While it is critical that our internal efforts become more robust and flexible, we believe that the true superpower comes when we enable our communities to do great things (even in the safety space).

Happy new year everyone!

37 Comments

2023/01/04
18:33 UTC

132

Q2 Safety & Security Report

Hey everyone, it’s been awhile since I posted a Safety and Security report…it feels good to be back! We have a fairly full report for you this quarter, including rolling out our first mid-year transparency report and some information on how we think about election preparedness.

But first, the numbers…

Q2 By The Numbers

Category	Volume (Jan - Mar 2022)	Volume (Apr - Jun 2022)
Reports for content manipulation	8,557,689	7,890,615
Admin removals for content manipulation	63,587,487	55,100,782
Admin-imposed account sanctions for content manipulation	11,283,586	8,822,056
Admin-imposed subreddit sanctions for content manipulation	51,657	57,198
3rd party breach accounts processed	313,853,851	262,165,295
Protective account security actions	878,730	661,747
Reports for ban evasion	23,659	24,595
Admin-imposed account sanctions for ban evasion	139,169	169,343
Reports for abuse	2,622,174	2,645,689
Admin-imposed account sanctions for abuse	286,311	315,222
Admin-imposed subreddit sanctions for abuse	2,786	2,528

Mid-year Transparency Report

Since 2014, we’ve published an annual Reddit Transparency Report to share insights and metrics about content moderation and legal requests, and to help us empower users and ensure their safety, security, and privacy.

We want to share this kind of data with you even more frequently so, starting today, we’re publishing our first mid-year Transparency Report. This interim report focuses on global legal requests to remove content or disclose account information received between January and June 2022 (whereas the full report, which we’ll publish in early 2023, will include not only this information about global legal requests, but also all the usual data about content moderation).

Notably, volumes across all legal requests are trending up, with most request types on track to exceed volumes in 2021 by year’s end. For example, copyright takedown requests received between Jan-Jun 2022 have already surpassed the total number of copyright takedowns from all of 2021.

We’ve also added detail in two areas: 1) data about our ability to notify users when their account information is subject to a legal request, and 2) a breakdown of U.S. government/law enforcement legal requests for account information by state.

You can read the mid-year Transparency Report Q2 here.

Election Preparedness

While the midterm elections are upon us in the U.S., election preparedness is a subject we approach from an always-on, global perspective. You can read more about our work to support free and fair elections in our blog post.

In addition to getting out trustworthy information via expert AMAs, announcement banners, and other things you may see throughout the site, we are also focused on protecting the integrity of political discussion on the platform. Reddit is a place for everyone to discuss their views openly and authentically, as long as users are upholding our Content Policy. We’re aware that things like elections can bring heightened tensions and polarizations, so around these events we become particularly focused on certain kinds of policy-violating behaviors in the political context:

Identifying discussions indicative of hate speech, threats, and calls to action for physical violence or harm
Content manipulation behaviors (this covers a variety of tactics that aim to exploit users on the platform through behaviors that fraudulently amplify content. This can include actions like vote manipulation, attempts to use multiple accounts to engage inauthentically, or larger coordinated disinformation campaigns).
Warning signals of community interference (attempts at cross-community disruption)
Content that equates to voter suppression or intimidation, or is intended to spread false information about the time, place, or manner of voting which would interfere with individuals’ civic participation.

Our Safety teams use a combination of automated tooling and human review to detect and remove these kinds of behaviors across the platform. We also do continual, sophisticated analyses of potential threats happening off-platform, so that we can be prepared to act quickly in case these behaviors appear on Reddit.

We’re constantly working to evolve our understanding of shifting global political landscapes and concurrent malicious attempts to amplify harmful content; that said, our users and moderators are an important part of this effort. Please continue to report policy violating content you encounter so that we can continue the work to provide a place for meaningful and relevant political discussion.

Final Thoughts

Overall, our goal is to be transparent with you about what we’re doing and why. We’ll continue to push ourselves to share these kinds of insights more frequently in the future - in the meantime, we’d like to hear from you: what kind of data or insights do you want to see from Reddit? Let us know in the comments. We’ll stick around for a bit to answer some questions.

61 Comments

2022/10/31
18:35 UTC

621

Reddit Onion Service Launch

Hi all,

We wanted to let you know that Reddit is now available as an “onion service” on Tor at the address:

https://www.reddittorjg6rue252oqsxryoxengawnmo46qy4kyii5wtqnwfj4ooad.onion

As some of you likely know, an onion service enables users to browse the internet anonymously. Tor is a free and open-source software that enables this kind of anonymous communication and browsing. It’s an important tool frequently used by journalists, human rights activists, and others who face threats of surveillance or censorship. Reddit has always been accessible via Tor, but with the launch of our official onion service, we’re able to improve the user experience when browsing Reddit on Tor: quicker loading times for the site, shorter network hops through Tor network and eliminating opportunities for Reddit being blocked or someone maliciously monitoring your traffic, and a cryptographic assurance that your connection is direct to reddit.com.

The goal with our onion service is to provide access to most of the site’s functionality at minimum this will include our standard post/comment functionality. While some functionality won’t work with Javascript disabled, core browsing should work. If you happen to find something broken, feel free to report it over at r/bugs and we’ll look into it.

A huge thank you to the work of Alec Muffett (@AlecMuffett) and all the predecessors who helped build the Enterprise Onion Toolkit, which this launch is largely based on. We’ll be open sourcing our Kubernetes deployment pattern and helping modernize the existing codebase and sharing our signal enhancements to help spot and block abuse against our new onion service.

For more information about the Tor network please visit https://www.torproject.org/.

Edit: There's of course an old reddit flavor at https://old.reddittorjg6rue252oqsxryoxengawnmo46qy4kyii5wtqnwfj4ooad.onion.

172 Comments

2022/10/25
14:40 UTC

140

Three more updates to blocking including bug fixes

Hi reddit peoples!

You may remember me from a few weeks ago when I gave an update on user blocking. Thank you to everyone who gave feedback about what is and isn’t working about blocking. The stories and examples many of you shared helped identify a few ways blocking should be improved. Today, based on your feedback, we’re happy to share three new updates to blocking. Let’s get to it…

Update #1: Preventing people from using blocking to shut down conversations

In January, we changed the tool so that when you block someone, they can’t see or respond to any of your comment threads. We designed blocking to prevent harassment, but we see that we have also opened up a way for users to shut down conversations.

Today we’re shipping a change so that users aren’t locked out of an entire comment thread when a user blocks them, and can reply to some embedded replies (i.e., the replies to your replies). We want to find the right balance between protecting redditors from being harassed while keeping conversations open. We’ll be testing a range of values, from the 2nd to 15th-level reply, for how far a thread continues before a blocked user can participate. We’ll be monitoring how this change affects conversations as we determine how far to turn this ‘knob’ and exploring other possible approaches. Thank you for helping us get this right.

Update #2: Fixing bugs

We have fixed two notable bugs:

When you block someone in the same thread as you, your comments are now always visible in your profile.
Blocking on old Reddit works the same way as it does on the rest of the platform now. We fixed an issue on old Reddit that was causing the block experience to sometimes revert back to the old version, and other times it would be a mix of the new and the old experience.

If you see any bugs, please keep reporting them! Your feedback helps keep reddit a great place for everyone to share, discuss, and debate — (What kind of world would we live in if we couldn’t debate the worst concert to go to if band names were literal?)

Update #3: People want more controls over their experience

We are exploring new features that will enable more ways for you to filter unwanted content, and generally provide you with more control over what you see on Reddit. Some of the concepts we are thinking about include:

Community muting: filters communities from feeds, recommendations, and notifications
Word filters: allows users to proactively establish words they don’t want to see
Topic filters: allows users to tune what types of topics they don’t want to see
User muting: allows users to filter out unwanted content without resorting to anti-harassment tools, such as blocking

Thank you for your feedback and bug reports so far. This has been a complex feature to get right, but we are committed to doing so. We’ll be sticking around for a bit to answer questions and respond to feedback.

That is, if you have not blocked us already.

110 Comments

2022/09/13
20:38 UTC

163

Update on user blocking

Hello people folks of Reddit,

Earlier this year we made some updates to our blocking feature. The purpose of these changes is to better protect users who experience harassment. We believe in the good — that the overwhelming majority of users are not trying to be jerks. Blocking is a tool for when someone needs extra protection.

The old version of blocking did not allow users to see posts or comments from blocked users, which often left the user unaware that they were being harassed. This was a big gap, and we saw users frequently cite this as a problem in r/help and similar communities. Our recent updates were aimed at solving this problem and giving users a better way to protect themselves. ICYMI, my posts in December and January cover in more detail the before and after experiences. You can also find more information about blocking in our Help Centers here and here.

We know that the rollout of these changes could have been smoother. We tried our best to provide a seamless transition by communicating early and often with mods via Mod Council posts and calls. When it came time to launch the experience, we ran into scalability issues that hindered our ability to rollout the update to the entire site, meaning that the rollout was not consistent across all users.

This issue meant that some users temporarily experienced inconsistency with:

Viewing profiles of blocked users between Web and Mobile platforms
How to reply to users who have blocked you
Viewing users who have blocked you in community and home feeds

As we worked to resolve these issues, new bugs would pop up that took us time to find, recreate, and resolve. We understand how frustrating this was for you, and we made the blocking feature our top priority during this time. We had multiple teams contribute to making it more scalable, and bug reports were investigated thoroughly as soon as they came in.

Since mid-June, the feature is fully functional on all platforms. We want to acknowledge and apologize for the bugs that made this update more difficult to manage and use. We understand that this created an inconsistent and confusing experience, and we have held multiple reviews to learn from our mistakes on how to scale these types of features better next time.

While we were making the feature more durable, we noticed multiple community concerns about blocking abuse. We heard this concern before we launched, and added additional protections to limit suspicious blocking behavior as well as monitoring metrics that would alert us if the suspicious behavior was happening at scale. That said, it concerned us that there was continued reference to this abuse, and so we completed an investigation on the severity and scale of block abuse.

The investigation involved looking at blocking patterns and behaviors to see how often unwelcome contributors systematically blocked multiple positive contributors with the assumed intent of bolstering their own posts.

In this investigation, we found that:

There are very few instances of this kind of abuse. We estimated that 0.02% of active communities have been impacted.
Of the 0.02% of active communities impacted, only 3.1% of them showed 5+ instances of this kind of abuse. This means that 0.0006% of active communities have seen this pattern of abuse.
Even in the 0.0006% of communities with this pattern of abuse, the blocking abuse is not happening at scale. Most bad actors participating in this abuse have blocked fewer than 10 users each.

While these findings indicate that this kind of abuse is rare, we will continue to monitor and take action if we see its frequency or severity increase. We also know that there is more to do here. Please continue to flag these instances to us as you see them.

Additionally, our research found that the blocking revamp is more effective in meeting user’s safety needs. Now, users take fewer protective actions than users who blocked before the improvements. Our research also indicates that this is especially impactful for perceived vulnerable and minority groups who display a higher need for blocking and other safety measures. (ICYMI read our report on Prevalence of Hate Directed at Women here).

Before we wrap up, I wanted to thank all the folks who have been voicing their concerns - it has helped make a better feature for everyone. Also, we want to continue to work on making the feature better, so please share any and all feedback you have.

261 Comments

2022/07/20
18:44 UTC

143

Q1 Safety & Security Report

Hey-o and a big hello from SF where some of our resident security nerds just got back from attending the annual cybersecurity event known as RSA. Given the congregation of so many like-minded, cyber-focused folks, we’ve been thinking a lot about the role of Reddit not just in providing community and belonging to everyone in the world, but also about how Reddit interacts with the broader internet ecosystem.

Ain’t no party like a breached third party

In last quarter’s report we talked about the metric “Third Party Breach Accounts Processed”, because it was jumping around a bit, but this quarter we wanted to dig in again and clarify what that number represents.

First-off, when we’re talking about third-party breaches, we’re talking about other websites or apps (i.e., not Reddit) that have had a breach where data was leaked or stolen. When the leaked/stolen data includes usernames and passwords (or email addresses that include your username, like worstnerd@reddit.com), bad actors will often try to log-in using those credentials at all kinds of sites across the internet, including Reddit -- not just on the site/app that got hacked. Why would an attacker bother to try a username and password on a random site? The answer is that since many people reuse their passwords from one site to the next, with a big file of passwords and enough websites, an attacker might just get lucky. And since most login “usernames” these days are an email address, it makes it even easier to find when a person is reusing their password.

Each username and password pair in this leaked/stolen data is what we describe as a “third-party breach account”. The number of “third-party breach accounts” can get pretty large because a single username/email address could show up in breaches at multiple websites, and we process every single one of those instances. “Processing” the breach account means we (1) check if the breached username is associated with a Reddit account and (2) whether that breached password, when hashed, matches the Reddit account’s current hashed password. (TL;DR: a “hashed” password means the password has been permanently turned into a scrambled version of itself, so nobody ever sees or has access to your password.) If the answer to both questions is yes, we let that Reddit user know it’s time to change their password! And we recommend they add some 2FA on top to double-plus protect that account from attackers.

There are a LOT of these stolen credential files floating around the internet. For a while security teams and specialized firms used to hunt around the dark web looking for files and pieces of files to do courtesy checks and keep people safe. Now, anyone is able to run checks on whether they’ve had their information leaked by using resources like Have I Been Pwned (HIBP). It’s pretty cool to see this type of ecosystem innovation, as well as how it’s been adopted into consumer tech like password managers and browsers.

Wrapping it up on this particular metric, last quarter we were agog to see “3rd party breach accounts processed” jump up to ~1.4B breach accounts, and this quarter we are relieved to see that has come back down to a (still whopping) ~314M breach accounts. This means that in Q1 2022 we received 314M username/password combos from breaches at other websites. Some subset of those accounts might be associated with people who use Reddit, and then a smaller subset of those accounts may have reused their breached passwords here. Specifically, we took protective action on 878,730 Reddit accounts this quarter, which means that many of you got a message from us to please change your passwords.

How we think about emerging threats (on and off of Reddit)

Just like we take a look at what’s going on in the dark web and across the ecosystem to identify vulnerable Reddit accounts, we also look across the internet to spot other trends or activities that shed light on potential threats to the safety or security of our platform. We don’t just want to react to what shows up on our doorstep, we get proactive when we can by trying to predict how events happening elsewhere might affect Reddit. Examples include analyzing the internet ecosystem at large to understand trends and problems elsewhere, as well as analyzing our own Reddit telemetry for clues that might help us understand how and where those activities could show up on our platform. And while y’all know from previous quarterly reports we LOVE digging into our data to help shed light on trends we’re seeing, sometimes our work includes really simple things like keeping an eye on the news. Because as things happen in the “real world” they also unfold in interesting ways on the internet and on Reddit. Sometimes it seems like our ecosystem is the web, but we often find that our ecosystem is the world.

Our quarterly reports talk about both safety AND security issues (it’s in the title of the report, lol), but it’s pretty fluid sometimes as to which issues or threats are “safety” related, and which are “security” related. We don’t get too spun-up about the overlap as we’re all just focused on how to protect the platform, our communities, and all the people who are participating in the conversations here on Reddit. So when we’re looking across the ecosystem for threats, we’re expansive in our thinking -- keeping eyes open looking for spammers and scammers, vulns and malware, groups organizing influence campaigns and also groups organizing denial of service attacks. And once we understand what kind of threats are coming our way, we take action to protect and defend Reddit.

When the ecosystem comes a knockin’ - Log4j

Which brings me to one more example - being a tech company on the internet means there are ecosystem dynamics in how we build (and secure) the technology itself. Like a lot of other internet companies we use cloud technology (an ecosystem of internet services!) and open source technology (and ecosystem of code!). In addition to the dynamics of being an ecosystem that builds together, there can be situations where we as an ecosystem all react to security vulnerabilities or incidents together -- a perfect example is the Log4j vulnerability that wreaked havoc in December 2021. One of the things that made this particular vulnerability so interesting to watch (for those of you who find security vulnerabilities interesting to watch) is how broadly and deeply entities on the internet were impacted, and how intense the response and remediation was.

Coordinating an effective response was challenging for most if not all of the organizations affected, and at Reddit we saw firsthand how amazing people will come together in a situation. Internally, we needed to work together across teams quickly, but this was also an internet-wide situation, so while we were working on things here, we were also seeing how the ecosystem itself was mobilized. For example, we were able to swiftly scale up our response by scouring public forums where others were dealing with these same issues, devoting personnel to understanding and implementing those learnings, and using ad-hoc scanning tools (e.g. a fleet-wide Ansible playbook execution of an rubo77's log4j checker and Anchore’s tool Syft) to ensure our reports were accurate. Thanks to our quick responders and collaboration with our colleagues across the industry, we were able to address the vulnerability while it was still just a bug to be patched, before it turned into something worse. It was inspiring to see how defenders connected with each other on Reddit (oh yeah, plenty of memes and threads were generated) and elsewhere on the internet, and we learned a lot both about how we might tune up our security capabilities & response processes, but also about how we might leverage community and connections to improve security across the industry. In addition, we continue to grow our internal community of folks protecting Reddit (btw, we’re hiring!) to scale up to meet the next challenge that comes our way.

Finally, to get back to your regularly scheduled programming for these reports, I also wanted to share across our Q1 numbers:

Q1 By The Numbers

Category	Volume (Oct - Dec 2021)	Volume (Jan - Mar 2022)
Reports for content manipulation	7,798,126	8,557,689
Admin removals for content manipulation	42,178,619	52,459,878
Admin-imposed account sanctions for content manipulation	8,890,147	11,283,586
Admin-imposed subreddit sanctions for content manipulation	17,423	51,657
3rd party breach accounts processed	1,422,690,762	313,853,851
Protective account security actions	1,406,659	878,730
Reports for ban evasion	20,836	23,659
Admin-imposed account sanctions for ban evasion	111,799	139,169
Reports for abuse	2,359,142	2,622,174
Admin-imposed account sanctions for abuse	182,229	286,311
Admin-imposed subreddit sanctions for abuse	3,531	2,786

Until next time, cheers!

35 Comments

2022/06/29
17:44 UTC

535

Prevalence of Hate Directed at Women

For several years now, we have been steadily scaling up our safety enforcement mechanisms. In the early phases, this involved addressing reports across the platform more quickly as well as investments in our Safety teams, tooling, machine learning, etc. – the “rising tide raises all boats” approach to platform safety. This approach has helped us to increase our content reviewed by around 4x and accounts actioned by more than 3x since the beginning of 2020. However, in addition to this, we know that abuse is not just a problem of “averages.” There are particular communities that face an outsized burden of dealing with other abusive users, and some members, due to their activity on the platform, face unique challenges that are not reflected in “the average” user experience. This is why, over the last couple of years, we have been focused on doing more to understand and address the particular challenges faced by certain groups of users on the platform. This started with our first Prevalence of Hate study, and then later our Prevalence of Holocaust Denialism study. We would like to share the results of our recent work to understand the prevalence of hate directed at women.

The key goals of this work were to:

Understand the frequency at which hateful content is directed at users perceived as being women (including trans women)
Understand how other Redditors respond to this content
Understand how Redditors respond differently to users perceived as being women (including trans women)
Understand how Reddit admins respond to this content

First, we need to define what we mean by “hateful content directed at women” in this context. For the purposes of this study, we focused on content that included commonly used misogynistic slurs (I’ll leave this to the reader’s imagination and will avoid providing a list), as well as content that is reported or actioned as hateful along with some indicator that it was directed at women (such as the usage of “she,” “her,” etc in the content). As I’ve mentioned in the past, humans are weirdly creative about how they are mean to each other. While our list was likely not exhaustive, and may have surfaced potentially non-abusive content as well (e.g., movie quotes, reclaimed language, repeating other users, etc), we do think it provides a representative sample of this kind of content across the platform.

We specifically wanted to look at how this hateful content is impacting women-oriented communities, and users perceived as being women. We used a manually curated list of over 300 subreddits that were women-focused (trans-inclusive). In some cases, Redditors self-identify their gender (“...as I woman I am…”), but one the most consistent ways to learn something about a user is to look at the subreddits in which they participate.

For the purposes of this work, we will define a user perceived as being a woman as an account that is a member of at least two women-oriented subreddits and has overall positive karma in women-oriented subreddits. This makes no claim of the account holder’s actual gender, but rather attempts to replicate how a bad actor may assume a user’s gender.

With those definitions, we find that in both women-oriented and non-women-oriented communities, approximately 0.3% of content is identified as being hateful content directed at women. However, while the rate of hateful content is approximately the same, the response is not! In women-oriented communities, this hateful content is nearly TWICE as likely to be negatively received (reported, downvoted, etc.) than in non-women-oriented communities (see chart). This tells us that in women-oriented communities, users and mods are much more likely to downvote and challenge this kind of hateful content.

Title: Community response (hateful content vs non-hateful content)

	Women-oriented communities	Non-women-oriented communities	Ratio
Report Rate	12x	6.6x	1.82
Negative Reception Rate	4.4x	2.6x	1.7
Mod Removal Rate	4.2x	2.4x	1.75

Next, we wanted to see how users respond to other users that are perceived as being women. Our safety researchers have seen a common theme in survey responses from members of women-oriented communities. Many respondents mentioned limiting how often they engage in women-oriented communities in an effort to reduce the likelihood they’ll be noticed and harassed. Respondents from women-oriented communities mentioned using alt accounts or deleting their comment and post history to reduce the likelihood that they’d be harassed (accounts perceived as being women are 10% more likely to have alts than other accounts). We found that accounts perceived as being women are 30% more likely to receive hateful content in response to their posts or comments in non-women-oriented communities than accounts that are not perceived as being women. Additionally, they are 61% more likely to receive a hateful message on their first direct communication with another user.

Finally, we want to look at Reddit Inc’s response to this. We have a strict policy against hateful content directed at women, and our Rule 1 explicitly states: Remember the human. Reddit is a place for creating community and belonging, not for attacking marginalized or vulnerable groups of people. Everyone has a right to use Reddit free of harassment, bullying, and threats of violence. Communities and users that incite violence or that promote hate based on identity or vulnerability will be banned. Our Safety teams enforce this policy across the platform through both proactive action against violating users and communities, as well as by responding to your reports. Over a recent 90 day period, we took action against nearly 14k accounts for posting hateful content directed at women and we banned just over 100 subreddits that had a significant volume of hateful content (for comparison, this was 6.4k accounts and 14 subreddits in Q1 of 2020).

Measurement without action would be pointless. The goal of these studies is to not only measure where we are, but to inform where we need to go. Summarizing these results we see that women-oriented communities and non-women-oriented-communities see approximately the same fraction of hateful content directed toward women, however the community response is quite different. We know that most communities don’t want this type of content to have a home in their subreddits, so making it easier for mods to filter it will ensure the shithead users are more quickly addressed. To that end, we are developing native hateful content filters for moderators that will reduce the burden of removing hateful content, and will also help to shrink the gap between identity-based communities and others. We will also be looking into how these results can be leveraged to improve Crowd Control, a feature used to help reduce the impact of non-members in subreddits. Additionally, we saw a higher rate of hateful content in direct messages to accounts perceived as women, so we have been developing better tools that will allow users to control the kind of content they receive via messaging, as well as improved blocking features. Finally, we will also be using this work to identify outlier communities that need a little…love from the Safety team.

As I mentioned, we recognize that this study is just one more milestone on a long journey, and we are constantly striving to learn and improve along the way. There is no place for hateful content on Reddit, and we will continue to take action to ensure the safety of all users on the platform.

269 Comments

2022/04/07
18:08 UTC

191

Announcing an Update to Our Post-Level Content Tagging

Hi Community!

We’d like to announce an update to the way that we’ll be tagging NSFW posts going forward. Beginning next week, we will be automatically detecting and tagging Reddit posts that contain sexually explicit imagery as NSFW.

To do this, we’ll be using automated tools to detect and tag sexually explicit images. When a user uploads media to Reddit, these tools will automatically analyze the media; if the tools detect that there’s a high likelihood the media is sexually explicit, it will be tagged accordingly when posted. We’ve gone through several rounds of testing and analysis to ensure that our tagging is accurate with two primary goals in mind: 1. protecting users from unintentional experiences; 2. minimizing the incidence of incorrect tagging.

Historically, our tagging of NSFW posts was driven by our community moderators. While this system has largely been effective and we have a lot of trust in our Redditors, mistakes can happen, and we have seen NSFW posts mislabeled and uploaded to SFW communities. Under the old system, when mistakes occurred, mods would have to manually tag posts and escalate requests to admins after the content was reported. Our goal with today’s announcement is to relieve mods and admins of this burden, and ensure that NSFW content is detected and tagged as quickly as possible to avoid any unintentional experiences.

While this new capability marks an exciting milestone, we realize that our work is far from done. We’ll continue to iterate on our sexually explicit tagging with ongoing quality assurance efforts and other improvements. Going forward, we also plan to expand our NSFW tagging to new content types (e.g. video, gifs, etc.) as well as categories (e.g. violent content, mature content, etc.).

While we have a high degree of confidence in the accuracy of our tagging, we know that it won’t be perfect. If you feel that your content has been incorrectly marked as NSFW, you’ll still be able to rely on existing tools and channels to ensure that your content is properly tagged. We hope that this change leads to fewer unintentional experiences on the platform, and overall, a more predictable (i.e. enjoyable) time on Reddit. As always, please don’t hesitate to reach out with any questions or feedback in the comments below. Thank you!

143 Comments

2022/03/23
18:38 UTC

354

Evolving our Rule on Non-Consensual Intimate Media Sharing

Hi all,

We want to let you know that we are making some changes to our platform-wide rule 3 on involuntary pornography. We’re making these changes to provide a clearer sense of the content this rule prohibits as well as how we’re thinking about enforcement.

Specifically, we are changing the term “involuntary pornography” to “non-consensual intimate media” because this term better captures the range of abusive content and behavior we’re trying to enforce against. We are also making edits and additions to the policy detail page to provide examples and clarify the boundaries when sharing intimate or sexually explicit imagery on Reddit. We have also linked relevant resources directly within the policy to make it easier for people to get support if they have been affected by non-consensual intimate media sharing.

This is a serious issue. We want to ensure we are appropriately evolving our enforcement to meet new forms of bad content and behavior trends, as well as reflect feedback we have received from mods and users. Today’s changes are aimed at reducing ambiguity and providing clearer guardrails for everyone—mods, users, and admins—to identify, report, and take action against violating content. We hope this will lead to better understanding, reporting, and enforcement of Rule 3 across the platform.

We’ll stick around for a bit to answer your questions.

[EDIT: Going offline now, thank you for your questions and feedback. We’ll check on this again later.]

87 Comments

2022/03/07
21:11 UTC

206

Q4 Safety & Security Report

Hey y’all, welcome to February and your Q4 2021 Safety & Security Report. I’m /u/UndrgrndCartographer, Reddit’s CISO & VP of Trust, just popping my head up from my subterranean lair (kinda like Punxsutawney Phil) to celebrate the ending of winter…and the publication of our annual Transparency Report. And since the Transparency Report drills into many of the topics we typically discuss in the quarterly safety & security report, we’ll provide some highlights from the TR, and then a quick read of the quarterly numbers as well as some trends we’re seeing with regard to account security.

2021 Transparency Report

As you may know, we publish these annual reports to provide deeper clarity around our content moderation practices and legal compliance actions. It offers a comprehensive and quantitative look at what we also discuss and share in our quarterly safety reports.

In this year’s report, we offer even more insight into how we handle illegal or unwelcome content as well as content manipulation (such as spam, artificial content promotion), how we identify potentially violating content, and what we do with bad actors on the site (i.e., account sanctions). Here’s a few notable figures from the report, below:

Content Removals

In 2021, admins removed 108,626,408 pieces of content in total (27% increase YoY), the vast majority of that for spam and content manipulation (e.g., vote manipulation, “brigading”). This is accompanied by a ~14% growth in posts, comments, and PMs on the platform, and doesn’t include legal / copyright removals, which we track separately.
For content policy violations:
- Not including spam and content manipulation, we removed 8,906,318 pieces of content.

Legal Removals

We received 292 requests from law enforcement or government agencies to remove content, a 15% increase from 2020. We complied in whole or part with 73% of these requests.

Requests for User Information

We received a total of 806 routine (non-emergency) requests for user information from law enforcement and government entities, and disclosed user information in response to 60% of these requests.

And here’s what y’all came for -- the numbers:

Q4 By The Numbers

Category	Volume (July - Sept 2021)	Volume (Oct - Dec 2021)
Reports for content manipulation	7,492,594	7,798,126
Admin removals for content manipulation	33,237,992	42,178,619
Admin-imposed account sanctions for content manipulation	11,047,794	8,890,147
Admin-imposed subreddit sanctions for content manipulation	54,550	17,423
3rd party breach accounts processed	85,446,982	1,422,690,762
Protective account security actions	699,415	1,406,659
Reports for ban evasion	21,694	20,836
Admin-imposed account sanctions for ban evasion	97,690	111,799
Reports for abuse	2,230,314	2,359,142
Admin-imposed account sanctions for abuse	162,405	182,229
Admin-imposed subreddit sanctions for abuse	3,964	3,531

Account Security

Now, I’m no /u/worstnerd, but there are a few things that jump out at me here that I want to dig into with you. One is this steep drop in admin-imposed subreddit sanctions for content manipulation. In Q3, we saw that number jump up, as the team was battling with some persistent spammers and was tackling the problem via a bunch of large, manual bulk bans of subs that were being used by specific spammers. In Q4, we see that number drop back to down, in the aftermath of that particular battle.

My eye also goes to the number of Third Party Breach Accounts Processed -- that’s a big increase from last quarter! To be fair, that particular number moves around quite a bit - it’s more of an indicator of excitement elsewhere in the ecosystem than on Reddit. But this quarter, it’s also paired with an increase in proactive account security actions. That means we’re taking steps to reinforce the security on accounts that hijackers may be targeting. We have some tips and tools you can use to amp-up the security on your own account, and if you haven’t yet added two-factor authentication to your account - no time like the present.

When it comes to account security, we keep our eyes on breaches at third parties because a lot of folks still reuse passwords from one site to the next, and so third party breaches provide a leading indicator of incoming hijacking attempts. But another indicator isn’t something that we look at per se -- it’s something that smells a bit…phishy. Yep. And I have about a 1000 phish-related puns where that came from. Unfortunately, we've been hearing/seeing/smelling an uptick in phishing emails impersonating Reddit, that are being sent to folks both with and without Reddit accounts. Below is an example of this phishing campaign, where they’re using the HTML template of our normal emails but substituting links to non-Reddit domains and the senders aren’t our redditemail.com sender.

First thing -- when in doubt or if something is even just a little bit suspish, go to reddit.com directly or open your app. Hey, you were just about to come check out some rad memes anyway. But for those who want to dissect an email at a more detailed level (am I the only one who digs through my spam folder occasionally, to see what tricks are trending?), here’s a quick guide on to recognize a legit Reddit email

Email is from either noreply@reddit.com (system emails), noreply@redditmail.com (notifications), or advertising@redditads.com (Reddit Ads platform)
Mouse-over links all go to reddit.com or a subdomain, redditinc.com or reddithelp.com

Of course, if your account has been hacked, we have a place for that too, click here if you need help with a hacked or compromised account.

Our Public Bug Bounty Program

Bringing the conversation back out of the phish tank and back to transparency, I also wanted to give you a quick update on the success of our public bug bounty program. We announced our flip from a private program to a public program ten months ago, as an expansion of our efforts to partner with independent researchers who want to contribute to keeping the Reddit platform secure. In Q4, we saw 217 vulnerabilities submitted into our program, and were able to validate 26 of those submissions -- resulting in $28,550 being paid out to some awesome researchers. We’re looking forward to publishing a deeper analysis when our program hits the one year mark, and then incorporating some of those stats into our quarterly reporting to this community. Many eyes make shallow bugs - TL;DR: Transparency works!

Final Thoughts

I want to thank you all for tuning in as we wrap up the final Safety & Security report of 2021 and announce our latest transparency report. We see these reports as a way to update you about our efforts to keep Reddit safe and secure - but we also want to hear from you. Let us know in the comments what you’d be interested in hearing more (or less) about in this community during 2022.

67 Comments

2022/02/16
22:45 UTC

172

Q3 Safety & Security Report

Welcome to December, it’s amazing how quickly 2021 has gone by.

Looking back over the previous installments of this report, it was clear that we had a bit of a topic gap. We’ve spoken a good bit about content manipulation, and we discussed particular issues associated with abusive and hateful content, but we haven’t really done a high level discussion about scaling enforcement against abusive content (which is distinct from how we approach content manipulation). So this report will start to address that. This is a fairly big (and rapidly evolving) topic, so this will really just be the starting point.

But first, the numbers…

Q3 By The Numbers

Category	Volume (Apr - Jun 2021)	Volume (July - Sept 2021)
Reports for content manipulation	7,911,666	7,492,594
Admin removals for content manipulation	45,485,229	33,237,992
Admin-imposed account sanctions for content manipulation	8,200,057	11,047,794
Admin-imposed subreddit sanctions for content manipulation	24,840	54,550
3rd party breach accounts processed	635,969,438	85,446,982
Protective account security actions	988,533	699,415
Reports for ban evasion	21,033	21,694
Admin-imposed account sanctions for ban evasion	104,307	97,690
Reports for abuse	2,069,732	2,230,314
Admin-imposed account sanctions for abuse	167,255	162,405
Admin-imposed subreddit sanctions for abuse	3,884	3,964

DAS

The goal of policy enforcement is to reduce exposure to policy-violating content (we will touch on the limitations of this goal a bit later). In order to reduce exposure we need to get to more bad things (scale) more quickly (speed). Both of these goals inherently assume that we know where policy-violating content lives. (It is worth noting that this is not the only way that we are thinking about reducing exposure. For the purposes of this conversation we’re focusing on reactive solutions, but there are product solutions that we are working on that can help to interrupt the flow of abuse.)

Reddit has approximately three metric shittons of content posted on a daily basis (3.4B pieces of content in 2020). It is impossible for us to manually review every single piece of content. So we need some way to direct our attention. Here are two important factoids:

Most content reported for a site violation is not policy-violating
Most policy-violating content is not reported (a big part of this is because mods are often able to get to content before it can be viewed and reported)

These two things tell us that we cannot rely on reports alone because they exclude a lot, and aren’t even particularly actionable. So we need a mechanism that helps to address these challenges.

Enter, Daily Active Shitheads.

Despite attempts by more mature adults, we succeeded in landing a metric that we call DAS, or Daily Active Shitheads (our CEO has even talked about it publicly). This metric attempts to address the weaknesses with reports that were discussed above. It uses more signals of badness in an attempt to be more complete and more accurate (such as heavily downvoted, mod removed, abusive language, etc). Today, we see that around 0.13% of logged in users are classified as DAS on any given day, which has slowly been trending down over the last year or so. The spikes often align with major world or platform events.

Decrease of DAS since 2020

A common question at this point is “if you know who all the DAS are, can’t you just ban them and be done?” It’s important to note that DAS is designed to be a high-level cut, sort of like reports. It is a balance between false positives and false negatives. So we still need to wade through this content.

Scaling Enforcement

By and large, this is still more content than our teams are capable of manually reviewing on any given day. This is where we can apply machine learning to help us prioritize the DAS content to ensure that we get to the most actionable content first, along with the content that is most likely to have real world consequences. From here, our teams set out to review the content.

Increased admin actions against DAS since 2020

Our focus this year has been on rapidly scaling our safety systems. At the beginning of 2020, we actioned (warning, suspended, banned) a little over 3% of DAS. Today, we are at around 30%. We’ve scaled up our ability to review abusive content, as well as deployed machine learning to ensure that we’re prioritizing review of the correct content.

Increased tickets reviewed since 2020

Accuracy

While we’ve been focused on greatly increasing our scale, we recognize that it’s important to maintain a high quality bar. We’re working on more detailed and advanced measures of quality. For today we can largely look at our appeals rate as a measure of our quality (admittedly, outside of modsupport modmail one cannot appeal a “no action” decision, but we generally find that it gives us a sense of directionality). Early last year we saw appeals rates that fluctuated with a rough average of around 0.5% but often swinging higher than that. Over this past year, we have had an improved appeal rate that is much more consistently at or below 0.3%, with August and September being near 0.1%. Over the last few months, as we have been further expanding our content review capabilities, we have seen a trend towards a higher rate of appeals and is currently slightly above 0.3%. We are working on addressing this and expect to see this trend shift in early next year with improved training and auditing capabilities.

Appeal rate since 2020

Final Thoughts

Building a safe and healthy platform requires addressing many different challenges. We largely break this down into four categories: abuse, manipulation, accounts, and ecosystem. Ecosystem is about ensuring that everyone is playing their part (for more on this, check out my previous post on Internationalizing Safety). Manipulation has been the area that we’ve discussed the most. This can be traditional spam, covert government influence, or brigading. Accounts generally break into two subcategories: account security and ban evasion. By and large, these are objective categories. Spam is spam, a compromised account is a compromised account, etc. Abuse is distinct in that it can hide behind perfectly acceptable language. Some language is ok in one context but unacceptable in another. It evolves with societal norms. This year we felt that it was particularly important for us to focus on scaling up our abuse enforcement mechanisms, but we recognize the challenges that come with rapidly scaling up, and we’re looking forward to discussing more around how we’re improving the quality and consistency of our enforcement.

189 Comments

2021/12/14
22:05 UTC