/r/pushshift

Subreddit for users of the pushshift.io API

Rules:

  • Please be kind to each other.
  • Please see this thread before posting "Is Pushshift Down?"
  • Camas, RedditSearch, and similar tools are unlikely to return due to recent API changes made by Reddit. Please do not post asking about those tools.

Want your data removed? Use this Removal Request Form.

14,524 Subscribers

1

Subreddit metadata

Hi everyone, any pointers/resources to retrieve metadata about subreddits by year, similar to this? https://academictorrents.com/details/c902f4b65f0e82a5e37db205c3405f02a028ecdf

I need to retrieve some info about the time of the earliest post. Thank you so much in advance!

0 Comments
2024/12/12
10:57 UTC

2

Are API tokens still issued? I have access to Pushshift, but after following the given link, I am just taken to Reddit's frontend. No key is listed.

0 Comments
2024/12/10
17:16 UTC

0

Is there a way to look for a deleted reddit account if I only know a part of their username?

I'm trying to look for a deleted reddit account. The username consists of 3 words and I can only remember 2 out of the 3 words.

1 Comment
2024/12/09
09:48 UTC

1

PushshiftDumps/scripts/filter_file.py

Hello!

I am struggling to get the code you have posted on your GitHub (https://github.com/Watchful1/PushshiftDumps/blob/master/scripts/filter_file.py) to work. I kept everything in the code unchanged after I downloaded it. The only things I changed were the end date, which I set to 2005-02-01, and the path to the files. Nevertheless, after it finishes going through the file, I have 0 entries in my CSV file. Any ideas on how to fix that? Would really appreciate it! Thanks a lot in advance!

8 Comments
2024/11/24
13:18 UTC

1

Need help with data processing for my Master's thesis

Hi everyone,

For my Master's thesis I want to test whether there is an empirical correlation between the development of meme stocks and Reddit activity. To do so I need Reddit data from the subreddits r/wallstreetbets and r/mauerstrassenwetten, from the beginning of 2020 to the most recent date possible. To download the yearly dumps I followed the step-by-step explanation from u/watchful1, but the files, especially the one from r/wallstreetbets, are too big to process using R (I have to use R). I only need 4 of the 125 columns, but I can't delete the unnecessary ones as long as I'm unable to import the data into R. Does anyone have a solution for this problem? And does anyone have an idea how to get data for 2024?

I would be very grateful for any help.

Best,

2 Comments
2024/11/23
17:35 UTC
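
For the column-selection problem in the post above, one possible approach is to shrink the dump before it ever reaches R. The sketch below is not u/watchful1's script. It assumes the dump has already been decompressed into newline-delimited JSON (one object per line), and the four field names in KEEP_FIELDS are hypothetical placeholders for whichever columns are actually needed.

```python
import csv
import io
import json

# Hypothetical field names; replace with the four columns actually needed.
KEEP_FIELDS = ["id", "created_utc", "title", "score"]


def project_fields(lines, out_file, fields=KEEP_FIELDS):
    """Stream NDJSON lines and write only the selected fields as CSV rows.

    `lines` can be any iterable of text lines (for example an open file),
    so the whole dump never has to fit in memory.
    Returns the number of rows written.
    """
    writer = csv.DictWriter(out_file, fieldnames=fields)
    writer.writeheader()
    written = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines rather than abort the run
        # Fields missing from an object are written as empty cells.
        writer.writerow({name: obj.get(name) for name in fields})
        written += 1
    return written


if __name__ == "__main__":
    # Tiny in-memory demo; real input would be an open text file.
    demo = [
        '{"id": "a1", "created_utc": 1577836800, "title": "hello", "score": 3, "extra": 1}',
        '{"id": "b2", "created_utc": 1577836900, "title": "world", "score": 7}',
    ]
    out = io.StringIO()
    print(project_fields(demo, out), "rows written")
    print(out.getvalue())
```

The resulting CSV contains only the selected columns and should be far smaller than the original dump, so it can then be read in R in the usual way. Reading the .zst file directly, without decompressing it to disk first, would need a third-party decompression library such as the `zstandard` package; that step is left out here.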

0

Can someone view my data, even if I delete my account, through other services like this one?

I am very concerned about my privacy now, thanks.

8 Comments
2024/11/09
07:49 UTC

2

Any mod who can help me!

I'm struggling with my university research, for which I have to collect a fairly large amount of data about posts and comments on some subreddits. Does anyone have access to the API (I need a token)? I also want to know whether the API allows access to historic data from 2021 to 2023. Is this possible?

4 Comments
2024/11/05
03:04 UTC

3

Why are some banned subreddits missing data months before their ban?

I am a researcher looking at the gendercritical subreddit. Although the subreddit was banned at the end of June, the comment dumps stop in mid-April. Does the data exist anywhere? And if not, why is that? I would like to at least give a reason for why the data cuts off.

Thanks

2 Comments
2024/11/04
11:27 UTC

2

Method Not Allowed error

I've been getting this error for the past couple of days. I had access in the past. Is there anything I can do to fix the issue? Or is it happening to others too?

This is after trying to authorize from https://api.pushshift.io/signup

1 Comment
2024/09/08
22:28 UTC

0

Any clue why I get this when I try to authenticate?

{"detail":"User is not an authorized moderator."}

6 Comments
2024/09/04
22:31 UTC

5

Need Access for Research

Hi all,

I want to access Reddit data using the Pushshift API. I have submitted a request. Can anyone tell me how I can get access as soon as possible?

Thanks!

17 Comments
2024/09/04
16:58 UTC

1

Gab data for research purpose.

Hi, I've been searching for a dataset containing Gab posts. I finally came across a link, but it leads to a login page. I signed up and logged in, but there is another guardrail: requests require approval and can only be submitted by moderators, so I am unable to get access.

Is there any way of getting access to the data through my researcher credentials?

9 Comments
2024/08/25
03:51 UTC

4

Help with handling big data sets

Hi everyone :) I'm new to using big data dumps. I downloaded the r/Incels and r/MensRights data sets from u/Watchful1 and am now stuck with these big data sets. I need them for my Master's thesis, which includes NLP. I just want to sample about 3k random posts from each subreddit, but have absolutely no idea how to do it on data sets this big that are still compressed as .zst files (which are too big to open). Does anyone have a script or any ideas? I'm kinda lost.

8 Comments
2024/08/22
09:33 UTC
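
For the sampling question in the post above, reservoir sampling is one standard way to draw a fixed number of random items from a stream without loading it all into memory. The sketch below is an illustration, not a script from u/Watchful1's repository. It assumes the dump is available as newline-delimited JSON text lines, and the function name and the `seed` parameter are hypothetical choices made for this example.

```python
import json
import random


def reservoir_sample(lines, k, seed=None):
    """Return k items chosen uniformly at random from an iterable of NDJSON lines.

    Only k lines are held in memory at any time, so the input can be
    arbitrarily large. If the input has fewer than k non-empty lines,
    all of them are returned.
    """
    rng = random.Random(seed)  # a fixed seed makes the sample reproducible
    sample = []
    seen = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        seen += 1
        if len(sample) < k:
            # Fill the reservoir with the first k items.
            sample.append(line)
        else:
            # Replace a random slot with probability k / seen.
            j = rng.randrange(seen)
            if j < k:
                sample[j] = line
    # Parse only the sampled lines, which keeps the pass over the data cheap.
    return [json.loads(s) for s in sample]


if __name__ == "__main__":
    # Tiny in-memory demo; real input would be an open text file.
    demo = ['{"n": %d}' % i for i in range(1000)]
    picked = reservoir_sample(demo, 5, seed=42)
    print(len(picked), "posts sampled")
```

Calling `reservoir_sample(f, 3000)` on an open text file `f` would give a sample of about the size mentioned in the post. Streaming the .zst file directly would additionally need a third-party decompression library such as the `zstandard` package, which is not shown here.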

1

How can I view a deleted post

I'm not a programmer, but I know that Pushshift functions as an archive for Reddit. Many posts I've interacted with have been deleted, and sometimes I'd like to see what the original post said. How can I view it?

Additionally, sometimes the post itself isn't deleted, but the original poster's account is gone, and I want to remember who made the post.

1 Comment
2024/08/06
17:13 UTC

0

Action Needed: Reauthorization of API access

Hello all,

Earlier this week, Pushshift experienced a security breach, and as a result the application configuration had to be updated. The updated application that authorizes you now goes by the name "ncri_ingest". All users will need to reauthorize for API access through https://api.pushshift.io/signup.

Users that have a long-running script using the refresh functionality will also need to replace the token with a new one after reauthorizing.

We apologize for any inconvenience caused and appreciate your patience during this period.

On behalf of Team NCRI

11 Comments
2024/08/01
00:00 UTC

9

FYI: Reddit is scaling up their "Reddit for Researchers" program

1 Comment
2024/07/31
21:46 UTC

8

Error code when trying to reauthorize

When it goes to the Reddit page, I get:

bad request (reddit.com)

you sent an invalid request

— invalid client id.

11 Comments
2024/07/30
16:42 UTC

3

How long does it take Pushshift to respond to removal requests?

Requested nearly a week ago, I’ve heard nothing.

4 Comments
2024/07/18
13:44 UTC

8

Does pushshift support need to be notified when it's down?

I've just started using it again recently. What's the protocol? Does it go down often?

It's been down for me for a few days now.

3 Comments
2024/07/14
15:32 UTC

30

Reddit dump files through July 2024

https://academictorrents.com/details/20520c420c6c846f555523babc8c059e9daa8fc5

I've uploaded a new centralized torrent for all monthly dump files through the end of July 2024. This will replace my previous torrents.

If you previously seeded the other torrents, loading up this torrent should recheck all the files (took me about 6 hours) and then download only the new files. Please don't delete and redownload your old files.

21 Comments
2024/07/13
03:55 UTC

2

Indexing Pushshift

Hi all,

I am a researcher and I used to collect Pushshift data using the API. Now I need to collect data again. The issue is that I do not need a specific subreddit but specific posts that contain a targeted expression, and then I need to collect the posts of the users who made these comments, say over the last 5 years.
I was thinking of indexing the data in our lab (the last 5-6 years of Pushshift comments and posts).
Has anyone done that before, or is there any guide or project for this, so I can save the time of experimenting with tools and structure?

Edit: What I mean exactly is, if you have indexed Pushshift data yourself, what did you use: MongoDB or Elasticsearch?
Does anyone have a Dockerfile or code that would get me started with this task faster?

Thanks,

Kind regards

10 Comments
2024/07/11
11:42 UTC

2

Confirmation of an account being removed?

Does anyone know how we can get confirmation that an account was removed after we submit the request? I can see the link to submit it, but I don't see how we would get notified once it happens. Or maybe someone knows what website I could check?

2 Comments
2024/06/22
17:42 UTC
