/r/redditdev

A subreddit for discussion of Reddit's API and Reddit API clients.

Please confine discussion to Reddit's API instead of using this as a soapbox to talk to the admins. In particular, use /r/ideasfortheadmins for feature ideas and /r/bugs for bugs. If you have general reddit questions, try /r/help.

To see an explanation of recent user-facing changes to reddit (and the code behind them), check out /r/changelog.


To report a security issue with reddit, please send an email to security@reddit.com.

This is an admin-sponsored subreddit.

/r/redditdev

74,879 Subscribers

2

Help! Error while calling the .json endpoint anonymously through an API route

I am building a simple app around the Reddit .json endpoints. I am new to the Reddit API, so I hope no one is going to judge me. I am using the endpoint https://api.reddit.com/search.json?q=<keyword> anonymously, without creating an app. When I hit this URL directly with the Fetch API, it shows the result/response, but as soon as I create an API route in my Next.js project that calls the above URL, and then call that route (e.g. mydomain.com/api/search), it shows a 500 Internal Server Error. It seems Reddit is not allowing API calls proxied through a custom domain's API route due to some restrictions.

Any help will be appreciated.
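
A common cause of exactly this symptom (not confirmed here, but worth ruling out first) is that server-side proxies send a default User-Agent, which Reddit rejects, while the browser's Fetch call sends a browser one. A minimal sketch of the proxied call with a descriptive User-Agent, written in Python with requests purely for illustration; the app name and username are placeholders:

    import requests

    # Hypothetical descriptive User-Agent; Reddit tends to reject default
    # library and server User-Agents, so identify your app and account.
    HEADERS = {"User-Agent": "web:com.example.myapp:v0.1 (by /u/your_username)"}

    def search_reddit(keyword):
        resp = requests.get(
            "https://api.reddit.com/search.json",
            params={"q": keyword},  # requests URL-encodes the keyword
            headers=HEADERS,
            timeout=10,
        )
        resp.raise_for_status()     # surface a 403/429 instead of a bare 500
        return resp.json()

    print(search_reddit("any keyword")["data"]["dist"])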

2 Comments
2024/04/26
19:22 UTC

1

Why does my script fetch only 30 top posts instead of 500?

Hi friends,

I’m trying to export 500 top posts from a select sub in a specific timeframe.

For some reason, the script returns only 30 top posts.

Any tips?

import praw
import csv
from datetime import datetime

client_id = "client_id"
client_secret = "client_secret"
user_agent = "my-app by u/Own_Island5749"

reddit = praw.Reddit(client_id=client_id,
                     client_secret=client_secret,
                     user_agent=user_agent)

subreddit_name = "subreddit_name"

start_date = datetime(2024, 4, 1).timestamp()
end_date = datetime(2024, 4, 21, 23, 59, 59).timestamp()

with open('posts.csv', 'w', newline='', encoding='utf-8') as csvfile:
    fieldnames = ['Date', 'Title', 'Author', 'Link', 'Upvotes']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()

    # Scrape posts
    post_count = 0
    for post in reddit.subreddit(subreddit_name).top(limit=None):  # Fetching all top posts
        post_date = datetime.utcfromtimestamp(post.created_utc)
        
        # Check if post is within the specified date range
        if start_date <= post.created_utc <= end_date:
            writer.writerow({
                'Date': post_date.strftime('%Y-%m-%d %H:%M:%S'),
                'Title': post.title,
                'Author': post.author.name if post.author else '[deleted]',
                'Link': f'https://www.reddit.com{post.permalink}',
                'Upvotes': post.score
            })
            post_count += 1

            if post_count >= 500:
                break

print("Scraping complete. Data saved to posts.csv.")
2 Comments
2024/04/26
17:17 UTC

1

Trying to get Location header from Reddit video URL succeeds with one version of curl/openssl, fails with another

Hi there,

I have a weird problem with retrieving response headers from a curl request to a Reddit video URL. On one Linux system (Debian 12, curl 7.88.1, OpenSSL 3.0.11), it works (I get back a 301 status code and the expected Location response header):

$ curl -v 'https://www.reddit.com/video/93lsuhlo9pwc1'
* Trying 151.101.201.140:443...
* Connected to www.reddit.com (151.101.201.140) port 443 (#0)
* ALPN: offers h2,http/1.1
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* CAfile: /etc/ssl/certs/ca-certificates.crt
* CApath: /etc/ssl/certs
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256
* ALPN: server accepted h2
* Server certificate:
* subject: C=US; ST=California; L=SAN FRANCISCO; O=REDDIT, INC.; CN=*.reddit.com
* start date: Jan 15 00:00:00 2024 GMT
* expire date: Jul 13 23:59:59 2024 GMT
* subjectAltName: host "www.reddit.com" matched cert's "*.reddit.com"
* issuer: C=US; O=DigiCert Inc; CN=DigiCert TLS RSA SHA256 2020 CA1
* SSL certificate verify ok.
* using HTTP/2
* h2h3 [:method: GET]
* h2h3 [:path: /video/93lsuhlo9pwc1]
* h2h3 [:scheme: https]
* h2h3 [:authority: www.reddit.com]
* h2h3 [user-agent: curl/7.88.1]
* h2h3 [accept: */*]
* Using Stream ID: 1 (easy handle 0x55a2e2882c80)
> GET /video/93lsuhlo9pwc1 HTTP/2
> Host: www.reddit.com
> user-agent: curl/7.88.1
> accept: */*
>
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
< HTTP/2 301
< content-type: text/html; charset=utf-8
< location: https://www.reddit.com/r/UkraineWarVideoReport/comments/1cd4d7d/after_the_military_aid_was_announced_the_american/

On another system (Ubuntu 22.04.4 LTS, curl 7.81.0, OpenSSL 3.0.2), the very same request returns a 403/Forbidden:

$ curl -v 'https://www.reddit.com/video/93lsuhlo9pwc1'
* Trying 151.101.41.140:443...
* Connected to www.reddit.com (151.101.41.140) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* CAfile: /etc/ssl/certs/ca-certificates.crt
* CApath: /etc/ssl/certs
* TLSv1.0 (OUT), TLS header, Certificate Status (22):
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS header, Certificate Status (22):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS header, Finished (20):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.2 (OUT), TLS header, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256
* ALPN, server accepted to use h2
* Server certificate:
* subject: C=US; ST=California; L=SAN FRANCISCO; O=REDDIT, INC.; CN=*.reddit.com
* start date: Jan 15 00:00:00 2024 GMT
* expire date: Jul 13 23:59:59 2024 GMT
* subjectAltName: host "www.reddit.com" matched cert's "*.reddit.com"
* issuer: C=US; O=DigiCert Inc; CN=DigiCert TLS RSA SHA256 2020 CA1
* SSL certificate verify ok.
* Using HTTP2, server supports multiplexing
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
* Using Stream ID: 1 (easy handle 0x5e3c3a1f8eb0)
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
> GET /video/93lsuhlo9pwc1 HTTP/2
> Host: www.reddit.com
> user-agent: curl/7.81.0
> accept: */*
>
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
< HTTP/2 403

If it matters, the body of the 403 response says "You've been blocked by network security." Also, if it matters, I tried forcing curl to use TLS 1.2 only on both systems (thinking that maybe the switching back and forth between TLS 1.3 and TLS 1.2 during negotiation was what reddit didn't like), but this didn't change anything.
Anyone have any ideas on this?
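
For anyone reproducing this outside curl, a minimal sketch in Python with requests; the browser-style User-Agent is an assumption, on the theory that the "network security" block keys on client fingerprints rather than the URL:

    import requests

    resp = requests.get(
        "https://www.reddit.com/video/93lsuhlo9pwc1",
        allow_redirects=False,  # keep the 301 instead of following it
        headers={"User-Agent": "Mozilla/5.0 (X11; Linux x86_64)"},  # assumed workaround
        timeout=10,
    )
    print(resp.status_code, resp.headers.get("location"))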

1 Comment
2024/04/26
15:41 UTC

1

Getting 403 when trying to download pictures from a gallery

Hi,

I'm trying to download pictures from a gallery.

For that, I use links under media_metadata.

However, I always get a 403 Forbidden response.

I've taken a deeper look, and it seems that reddit is now doing a redirection.

Is there any way to download these pictures?
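
One known gotcha worth checking (a sketch, assuming PRAW and a gallery submission already in hand): the URLs inside media_metadata come back HTML-escaped (& appears as &amp;), and the escaped form is rejected with a 403. Un-escaping before the request often fixes the download; the "s"/"u" keys below hold the full-size URL for still images:

    import html
    import requests

    headers = {"User-Agent": "script:gallery-downloader:v0.1 (by /u/your_username)"}  # assumed UA

    for media_id, meta in submission.media_metadata.items():  # `submission` from PRAW
        url = html.unescape(meta["s"]["u"])  # un-escape &amp; or the CDN returns 403
        img = requests.get(url, headers=headers, timeout=10)
        img.raise_for_status()
        with open(f"{media_id}.jpg", "wb") as f:
            f.write(img.content)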

1 Comment
2024/04/26
14:27 UTC

4

Query about Reddit's Post-to-Profile Feature Rollout Date

Hello, Reddit community!
Back around 2017, Reddit began testing a new post-to-profile feature, allowing redditors to 'follow' specific users' profiles, as indicated in this link: https://www.reddit.com/r/modnews/comments/60i60u/tomorrow_well_be_launching_a_new_posttoprofile/.

However, I'm having trouble pinpointing the exact time when this feature was fully implemented. Does anyone know when the testing phase concluded and the feature officially went live? This information is crucial for my research. Thanks in advance for your help!

1 Comment
2024/04/26
08:36 UTC

1

prawcore.exceptions.ServerError: received 500 HTTP response

Every now and then, sometimes after days of successful operation, my Python script receives an exception as stated in the title while listening to modmails, coded as follows:

for modmail in subreddit.mod.stream.modmail_conversations():

I don't think it's a bug, just a server hiccup as suggested here.

Anyhow, I'm asking for advice on how to properly deal with this in order to continue automatically rather than starting the script anew.

Currently, the whole for block is pretty trivial:

    for modmail in subreddit.mod.stream.modmail_conversations():
        process_modmail(reddit, subreddit, modmail)

Thus the question is: how should the above block be enhanced to catch the error and continue? Should it involve a cooldown period?
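
A minimal sketch of such a wrapper, assuming the stream can simply be rebuilt after a pause (the 30-second cooldown is an arbitrary choice, and a fresh stream may re-deliver recent conversations):

    import time
    import prawcore

    while True:
        try:
            for modmail in subreddit.mod.stream.modmail_conversations():
                process_modmail(reddit, subreddit, modmail)
        except prawcore.exceptions.ServerError:
            time.sleep(30)  # cooldown, then rebuild the stream and continue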

Thank you very much in advance!

----

For documentation purposes I'd add the complete traceback, but it won't let me, not even as a comment; I reckon it's too much text. Here's just the end, then:

  ...

  File "C:\Users\Operator\AppData\Local\Programs\Python\Python311\Lib\site-packages\prawcore\sessions.py", line 162, in _do_retry
    return self._request_with_retries(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Operator\AppData\Local\Programs\Python\Python311\Lib\site-packages\prawcore\sessions.py", line 267, in _request_with_retries
    raise self.STATUS_EXCEPTIONS[response.status_code](response)
prawcore.exceptions.ServerError: received 500 HTTP response

1 Comment
2024/04/26
04:39 UTC

1

Searching across multiple content types using API

It seems that there is no exact equivalent in the official API to the search field sitting on top of www.reddit.com. A similar endpoint exists: https://www.reddit.com/dev/api/#GET_search, but it only allows searching for subreddits, posts, and users. What about comments and media, two other important content types?
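
For the types the endpoint does support, the type parameter is the documented route; a sketch with plain requests, assuming an OAuth bearer token is already in hand (the token and User-Agent below are placeholders):

    import requests

    access_token = "YOUR_TOKEN"  # placeholder
    headers = {
        "Authorization": f"bearer {access_token}",
        "User-Agent": "script:search-demo:v0.1 (by /u/your_username)",  # assumed UA
    }
    resp = requests.get(
        "https://oauth.reddit.com/search",
        params={"q": "some keyword", "type": "sr,link,user"},  # no comment/media type exists
        headers=headers,
        timeout=10,
    )
    print(resp.json()["data"]["dist"])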

4 Comments
2024/04/25
17:47 UTC

3

Retrieving Modqueue with AsyncPraw

Hey All,

I'm looking to make a script that watches the Modqueue to help clean out garbage/noise from Ban Evaders.

When the ban evasion filter is enabled, and a ban evader comes and leaves a dozen or two comments and then deletes their account, the modqueue continually accumulates dozens of items from [deleted] accounts that are filtered as "reddit removecomment Ban Evasion: This comment is from an account suspected of ban evasion".

While one here and there isn't too bad, it's a huge annoyance and I'd like to just automate removing them.

My issue is with AsyncPraw. Here's the initial code I'm trying (which is based on another script that monitors modmail and works fine):

import asyncio
import asyncpraw
import asyncprawcore
from asyncprawcore import exceptions as asyncprawcore_exceptions
import traceback
from datetime import datetime
   
debugmode = True

async def monitor_mod_queue(reddit):
    while True:
        try:
            subreddit = await reddit.subreddit("mod")
            async for item in subreddit.mod.modqueue(limit=None):
                print(item)
                #if item.author is None or item.author.name == "[deleted]":
                #    if "Ban Evasion" in item.mod_reports[0][1]:
                #        await process_ban_evasion_item(item)
        except (asyncprawcore.exceptions.RequestException, asyncprawcore.exceptions.ResponseException) as e:
            print(f"{datetime.utcnow().strftime('%Y-%m-%d %H:%M:%S UTC')}: Error in mod queue monitoring: {str(e)}. Retrying...")
            if debugmode:
                traceback.print_exc()
            await asyncio.sleep(30)  # Wait for a short interval before retrying

async def process_ban_evasion_item(item):
    print(f"{datetime.utcnow().strftime('%Y-%m-%d %H:%M:%S UTC')}: Processing ban evasion item: {item.permalink} in /r/{item.subreddit.display_name}")
    # item.mod.remove()  # Remove the item

async def main():
    reddit = asyncpraw.Reddit("reddit_login")
    await monitor_mod_queue(reddit)

if __name__ == "__main__":
    asyncio.run(main())

But I keep getting an unexpected-mimetype error in the traceback:

Traceback (most recent call last):
  File "/mnt/nvme/Bots/monitor_modqueue/modqueue_processing.py", line 37, in <module>
    asyncio.run(main())
  File "/usr/lib/python3.9/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
    return future.result()
  File "/mnt/nvme/Bots/monitor_modqueue/modqueue_processing.py", line 34, in main
    await monitor_mod_queue(reddit)
  File "/mnt/nvme/Bots/monitor_modqueue/modqueue_processing.py", line 17, in monitor_mod_queue
    async for item in subreddit.mod.modqueue(limit=None):
  File "/mnt/nvme/Bots/monitor_modqueue/venv/lib/python3.9/site-packages/asyncpraw/models/listing/generator.py", line 34, in __anext__
    await self._next_batch()
  File "/mnt/nvme/Bots/monitor_modqueue/venv/lib/python3.9/site-packages/asyncpraw/models/listing/generator.py", line 89, in _next_batch
    self._listing = await self._reddit.get(self.url, params=self.params)
  File "/mnt/nvme/Bots/monitor_modqueue/venv/lib/python3.9/site-packages/asyncpraw/util/deprecate_args.py", line 51, in wrapped
    return await _wrapper(*args, **kwargs)
  File "/mnt/nvme/Bots/monitor_modqueue/venv/lib/python3.9/site-packages/asyncpraw/reddit.py", line 785, in get
    return await self._objectify_request(method="GET", params=params, path=path)
  File "/mnt/nvme/Bots/monitor_modqueue/venv/lib/python3.9/site-packages/asyncpraw/reddit.py", line 567, in _objectify_request
    await self.request(
  File "/mnt/nvme/Bots/monitor_modqueue/venv/lib/python3.9/site-packages/asyncpraw/util/deprecate_args.py", line 51, in wrapped
    return await _wrapper(*args, **kwargs)
  File "/mnt/nvme/Bots/monitor_modqueue/venv/lib/python3.9/site-packages/asyncpraw/reddit.py", line 1032, in request
    return await self._core.request(
  File "/mnt/nvme/Bots/monitor_modqueue/venv/lib/python3.9/site-packages/asyncprawcore/sessions.py", line 370, in request
    return await self._request_with_retries(
  File "/mnt/nvme/Bots/monitor_modqueue/venv/lib/python3.9/site-packages/asyncprawcore/sessions.py", line 316, in _request_with_retries
    return await response.json()
  File "/mnt/nvme/Bots/monitor_modqueue/venv/lib/python3.9/site-packages/aiohttp/client_reqrep.py", line 1166, in json
    raise ContentTypeError(
aiohttp.client_exceptions.ContentTypeError: 0, message='Attempt to decode JSON with unexpected mimetype: text/html; charset=utf-8', url=URL('https://oauth.reddit.com/r/mod/about/modqueue/?limit=1024&raw_json=1')
Exception ignored in: <function ClientSession.__del__ at 0x7fc48d3afd30>
Traceback (most recent call last):
  File "/mnt/nvme/Bots/monitor_modqueue/venv/lib/python3.9/site-packages/aiohttp/client.py", line 367, in __del__
  File "/usr/lib/python3.9/asyncio/base_events.py", line 1771, in call_exception_handler
  File "/usr/lib/python3.9/logging/__init__.py", line 1471, in error
  File "/usr/lib/python3.9/logging/__init__.py", line 1585, in _log
  File "/usr/lib/python3.9/logging/__init__.py", line 1595, in handle
  File "/usr/lib/python3.9/logging/__init__.py", line 1657, in callHandlers
  File "/usr/lib/python3.9/logging/__init__.py", line 948, in handle
  File "/usr/lib/python3.9/logging/__init__.py", line 1182, in emit
  File "/usr/lib/python3.9/logging/__init__.py", line 1171, in _open
NameError: name 'open' is not defined
Exception ignored in: <function BaseConnector.__del__ at 0x7fc48d4394c0>
Traceback (most recent call last):
  File "/mnt/nvme/Bots/monitor_modqueue/venv/lib/python3.9/site-packages/aiohttp/connector.py", line 285, in __del__
  File "/usr/lib/python3.9/asyncio/base_events.py", line 1771, in call_exception_handler
  File "/usr/lib/python3.9/logging/__init__.py", line 1471, in error
  File "/usr/lib/python3.9/logging/__init__.py", line 1585, in _log
  File "/usr/lib/python3.9/logging/__init__.py", line 1595, in handle
  File "/usr/lib/python3.9/logging/__init__.py", line 1657, in callHandlers
  File "/usr/lib/python3.9/logging/__init__.py", line 948, in handle
  File "/usr/lib/python3.9/logging/__init__.py", line 1182, in emit
  File "/usr/lib/python3.9/logging/__init__.py", line 1171, in _open
NameError: name 'open' is not defined

Just wondering if anyone can spot what I might be doing wrong, or whether this is instead a current bug with asyncpraw and the modqueue?

As a test, I changed over to regular PRAW to try the example that prints all modqueue items, here: https://praw.readthedocs.io/en/latest/code_overview/other/subredditmoderation.html#praw.models.reddit.subreddit.SubredditModeration.modqueue

import praw
from prawcore import exceptions as prawcore_exceptions
import traceback
import time
from datetime import datetime

debugmode = True

def monitor_mod_queue(reddit):
    while True:
        try:
            for item in reddit.subreddit("mod").mod.modqueue(limit=None):
                print(item)
                #if item.author is None or item.author.name == "[deleted]":
                #    if "Ban Evasion" in item.mod_reports[0][1]:
                #        process_ban_evasion_item(item)
        except (prawcore_exceptions.RequestException, prawcore_exceptions.ResponseException) as e:
            print(f"{datetime.utcnow().strftime('%Y-%m-%d %H:%M:%S UTC')}: Error in mod queue monitoring: {str(e)}. Retrying...")
            if debugmode:
                traceback.print_exc()
            time.sleep(30)  # Wait for a short interval before retrying

def process_ban_evasion_item(item):
    print(f"{datetime.utcnow().strftime('%Y-%m-%d %H:%M:%S UTC')}: Processing ban evasion item: {item.permalink} in /r/{item.subreddit.display_name}")
    # item.mod.remove()  # Remove the item

def main():
    reddit = praw.Reddit("reddit_login")
    monitor_mod_queue(reddit)

if __name__ == "__main__":
    main()

But that too throws errors:

2024-04-25 16:39:01 UTC: Error in mod queue monitoring: received 200 HTTP response. Retrying...
Traceback (most recent call last):
  File "/mnt/nvme/Bots/monitor_modqueue/venv/lib/python3.9/site-packages/requests/models.py", line 971, in json
    return complexjson.loads(self.text, **kwargs)
  File "/usr/lib/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.9/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.9/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 2 column 5 (char 5)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/mnt/nvme/Bots/monitor_modqueue/venv/lib/python3.9/site-packages/prawcore/sessions.py", line 275, in _request_with_retries
    return response.json()
  File "/mnt/nvme/Bots/monitor_modqueue/venv/lib/python3.9/site-packages/requests/models.py", line 975, in json
    raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: Expecting value: line 2 column 5 (char 5)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/mnt/nvme/Bots/monitor_modqueue/modqueue_processing.py", line 12, in monitor_mod_queue
    for item in reddit.subreddit("mod").mod.modqueue(limit=None):
  File "/mnt/nvme/Bots/monitor_modqueue/venv/lib/python3.9/site-packages/praw/models/listing/generator.py", line 63, in __next__
    self._next_batch()
  File "/mnt/nvme/Bots/monitor_modqueue/venv/lib/python3.9/site-packages/praw/models/listing/generator.py", line 89, in _next_batch
    self._listing = self._reddit.get(self.url, params=self.params)
  File "/mnt/nvme/Bots/monitor_modqueue/venv/lib/python3.9/site-packages/praw/util/deprecate_args.py", line 43, in wrapped
    return func(**dict(zip(_old_args, args)), **kwargs)
  File "/mnt/nvme/Bots/monitor_modqueue/venv/lib/python3.9/site-packages/praw/reddit.py", line 712, in get
    return self._objectify_request(method="GET", params=params, path=path)
  File "/mnt/nvme/Bots/monitor_modqueue/venv/lib/python3.9/site-packages/praw/reddit.py", line 517, in _objectify_request
    self.request(
  File "/mnt/nvme/Bots/monitor_modqueue/venv/lib/python3.9/site-packages/praw/util/deprecate_args.py", line 43, in wrapped
    return func(**dict(zip(_old_args, args)), **kwargs)
  File "/mnt/nvme/Bots/monitor_modqueue/venv/lib/python3.9/site-packages/praw/reddit.py", line 941, in request
    return self._core.request(
  File "/mnt/nvme/Bots/monitor_modqueue/venv/lib/python3.9/site-packages/prawcore/sessions.py", line 330, in request
    return self._request_with_retries(
  File "/mnt/nvme/Bots/monitor_modqueue/venv/lib/python3.9/site-packages/prawcore/sessions.py", line 277, in _request_with_retries
    raise BadJSON(response)
prawcore.exceptions.BadJSON: received 200 HTTP response
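
Not a root cause, but a pattern worth noting: both tracebacks amount to "Reddit returned an HTML error page where JSON was expected", which surfaces as aiohttp's ContentTypeError in the async version and prawcore's BadJSON in the sync one, and neither is caught by the except clauses above, so the loop dies instead of retrying. A sketch widening the async handler (same reddit instance as the original script assumed):

    import asyncio
    import aiohttp
    import asyncprawcore

    async def monitor_mod_queue(reddit):
        while True:
            try:
                subreddit = await reddit.subreddit("mod")
                async for item in subreddit.mod.modqueue(limit=None):
                    print(item)
            except (asyncprawcore.exceptions.RequestException,
                    asyncprawcore.exceptions.ResponseException,
                    aiohttp.client_exceptions.ContentTypeError) as e:  # HTML-instead-of-JSON case
                print(f"Error in mod queue monitoring: {e}. Retrying...")
                await asyncio.sleep(30)
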
12 Comments
2024/04/25
16:25 UTC

2

Question about extracting posts and comments from a certain time period (weekly, monthly)?

Hi, I am currently using the Reddit Python API to extract posts and comments from subreddits. So far I am listing out posts based on the date uploaded, including the post description, popularity, etc. I am also re-arranging the comments, with the most upvoted comments listed on top.

I am wondering if there is a way to extract posts (perhaps top, hot, or all):

  1. based on a certain time limit
  2. based on "top posts last week", "top posts last month", etc. (see the sketch after this list)
  3. extract the comments / comment tree
  4. summarize the comments, if there is already a recommended way to do so
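
A sketch of points 1 and 2, assuming an authenticated reddit instance: PRAW exposes the "top posts last week/month" views through time_filter, and arbitrary cutoffs can be done against created_utc on the new() listing:

    import time

    # Top posts of the last week (accepts "hour", "day", "week", "month", "year", "all")
    for submission in reddit.subreddit("SomeSubreddit").top(time_filter="week", limit=25):
        print(submission.score, submission.title)

    # An arbitrary cutoff, e.g. everything from the last 7 days, via new()
    cutoff = time.time() - 7 * 24 * 3600
    recent = [s for s in reddit.subreddit("SomeSubreddit").new(limit=None)
              if s.created_utc >= cutoff]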

So far I am storing the information in JSON format. The code is below.

flairs = ["A", "B"]

# Get all submissions in the subreddit
submissions = []
for submission in reddit.subreddit('SomeSubreddit').hot(limit=None):
    if submission.link_flair_text in flairs:
        created_utc = submission.created_utc
        post_created = datetime.datetime.fromtimestamp(created_utc)
        post_created = post_created.strftime("%Y%m%d")
        submissions.append((submission, post_created))

# Sort the submissions by their creation date in descending order
sorted_submissions = sorted(submissions, key=lambda s: s[1], reverse=True)

# Process each submission and add it to a list of submission dictionaries
submission_list = []
for i, (submission, post_created) in enumerate(sorted_submissions, start=1):
    title = submission.title
    titletext = submission.selftext
    titleurl = submission.url
    score = submission.score
    Popularity = score
    post = post_created

    # Sort comments by score in descending order
    submission.comments.replace_more(limit=None)
    sorted_comments = sorted(
        [c for c in submission.comments.list() if not isinstance(c, praw.models.MoreComments)],
        key=lambda c: c.score, reverse=True)

    # Prefix each comment with "comment" followed by the comment number,
    # ensuring each new comment starts on a new line
    formatted_comments = []
    for j, comment in enumerate(sorted_comments, start=1):
        formatted_comment = f"comment {j}: {comment.body}\n"
        formatted_comments.append(formatted_comment)

    submission_info = {
        'title': title,
        'description': titletext,
        'metadata': {
            'reference': titleurl,
            'date': post,
            'popularity': Popularity
        },
        'comments': formatted_comments
    }

    submission_list.append(submission_info)

# Write the submission_list to a single JSON file
with open("submissionsmetadata.json", 'w') as json_file:
    json.dump(submission_list, json_file, indent=4)

0 Comments
2024/04/25
13:30 UTC

2

MoreChildren API in R

Hi there,

I have been trying to get the children of a reply in a thread, using the morechildren API. However, I get an incomplete response object. The data kind is "more", which I think is right, but the data only has its parent_id and depth. Any thoughts?

 response <- GET(paste0(baseurl, "api/morechildren?api_type=json&link_id=t3_1c7renm&children=l0icuco&sort=confidence"),
                 user_agent("discussion-analyzer"),
                 add_headers(Authorization = authorization_bearer)) %>%
             content(as = 'text') %>%
             fromJSON()

list(subreddit_id = NA, approved_at_utc = NA, author_is_blocked = NA, comment_type = NA, edited = NA, mod_reason_by = NA, banned_by = NA, ups = NA, num_reports = NA, author_flair_type = NA, total_awards_received = NA, subreddit = NA, author_flair_template_id = NA, likes = NA, replies = NA, user_reports = list(NULL), saved = NA, id = "_", banned_at_utc = NA, mod_reason_title = NA, gilded = NA, archived = NA, collapsed_reason_code = NA, no_follow = NA, spam = NA, can_mod_post = NA, created_utc = NA,
ignore_reports = NA, send_replies = NA, parent_id = "t1_l0icuco", score = NA, author_fullname = NA, report_reasons = list(NULL), approved_by = NA, all_awardings = list(NULL), collapsed = NA, body = NA, awarders = list(NULL), gildings = list(), author_flair_css_class = NA, author_patreon_flair = NA, downs = NA, author_flair_richtext = list(NULL), is_submitter = NA, body_html = NA, removal_reason = NA, collapsed_reason = NA, associated_award = NA, stickied = NA, author_premium = NA, can_gild = NA,
removed = NA, unrepliable_reason = NA, approved = NA, author_flair_text_color = NA, score_hidden = NA, permalink = NA, subreddit_type = NA, locked = NA, name = "t1__", created = NA, author_flair_text = NA, treatment_tags = list(NULL), author = NA, link_id = NA, subreddit_name_prefixed = NA, controversiality = NA, top_awarded_type = NA, depth = 10, author_flair_background_color = NA, collapsed_because_crowd_control = NA, mod_reports = list(NULL), mod_note = NA, distinguished = NA, count = 0, children = list(
list()))

0 Comments
2024/04/25
11:20 UTC

2

Best Practices for Automating Posts with PRAW Without Getting Blocked?

Hello r/redditdev,

I've been working on automating posting on Reddit using PRAW and have encountered an issue where my posts are not appearing — they seem to be getting blocked or filtered out immediately, even in a test subreddit I created. Here's a brief overview of my setup:

I am using a registered web app on Reddit. Tokens are refreshed properly before posting. The software seems to function correctly without any errors in the code or during execution. Despite this, none of my posts are showing up, not even in the test subreddit. I am wondering if there might be some best practices or common pitfalls I'm missing that could be causing this issue.

Has anyone faced similar challenges or have insights on the following?

  • Any specific settings or configurations in PRAW that might help avoid posts being blocked or filtered?

  • Is there a threshold of activity or "karma" that my bot account needs before it can post successfully?

  • Could this be related to how frequently I am attempting to post? Are there rate limits I should be aware of, even in a testing environment?

  • Are there any age or quota requirements for accounts to be able to post without restrictions?
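
One diagnostic sketch that may help narrow this down, assuming PRAW and a test subreddit you moderate (the subreddit name is a placeholder): re-fetch the submission right after posting and inspect its removal attributes, which distinguishes "created but filtered" from "never created":

    submission = reddit.subreddit("your_test_sub").submit(  # placeholder sub
        title="bot test", selftext="hello")
    submission = reddit.submission(id=submission.id)  # refresh from the API
    print(submission.removed_by_category)  # e.g. "reddit" suggests the site-wide spam filter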

Any advice or pointers would be greatly appreciated!

Thanks in advance!

4 Comments
2024/04/24
14:25 UTC

7

Bots replying again and again to same comments

Hey folks! I'm currently working on a cool bot project that fetches dictionary definitions whenever it's called upon, but I'm facing an issue.

The bot is set to run only while the script or command window is open. I understand that I need to host it somewhere to keep it constantly running. I've tried skip_existing=True, but it didn't work. I'm also wondering what happens if the server hosting the bot fails (and about free hosting alternatives). Will the bot reply again upon re-running?

I'm also considering adding a time limit, like only replying to comments made within the past 7 days or any reasonable duration. Do you think this is a better approach? Would love to hear your thoughts and any suggestions.
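
On the restart question: a bot will re-reply after a crash unless it persists what it has already answered; skip_existing=True only skips items that predate the stream, not items replied to in a previous run. A sketch of the usual pattern, assuming a PRAW subreddit object and a hypothetical "!define" trigger:

    import pathlib

    SEEN_FILE = pathlib.Path("replied_ids.txt")
    seen = set(SEEN_FILE.read_text().split()) if SEEN_FILE.exists() else set()

    for comment in subreddit.stream.comments(skip_existing=True):
        if comment.id in seen or "!define" not in comment.body:  # hypothetical trigger
            continue
        comment.reply("definition goes here")
        seen.add(comment.id)
        with SEEN_FILE.open("a") as f:
            f.write(comment.id + "\n")  # survives restarts, unlike skip_existing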

3 Comments
2024/04/23
20:03 UTC

3

Is it feasible to build a bot that removes all comments by a user?

Hello everyone,

I'm currently in the process of developing a Reddit bot, and it's my first time trying this out, so please bear with me if my questions seem trivial.

I'm curious to know if it's feasible to delete all posts or comments made by a specific user within a subreddit where I, as the bot, hold moderator privileges. I've already granted all necessary permissions to the bot. You can find the code I've been working on at this link:

https://codeshare.io/XLNDwM.
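
On feasibility: yes, a moderator bot can remove (not delete) another user's comments. A sketch of the core loop, assuming a configured PRAW instance with mod credentials; the username and subreddit are placeholders, and user-history listings stop at roughly the 1,000 most recent items:

    TARGET_USER = "some_username"  # placeholder
    TARGET_SUB = "your_subreddit"  # placeholder

    for comment in reddit.redditor(TARGET_USER).comments.new(limit=None):
        if comment.subreddit.display_name.lower() == TARGET_SUB.lower():
            comment.mod.remove()  # mod removal; only the author can truly delete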

Additionally, I'm seeking advice on the optimal method for hosting this bot. Are there any entirely free alternatives aside from PythonAnywhere that you would recommend for mere display purposes?

9 Comments
2024/04/23
19:06 UTC

1

Reddit Architecture and its scalability

I would like to know what kind of architecture Reddit uses and how it has affected its scalability and performance.

2 Comments
2024/04/23
15:40 UTC

0

"after" always null in json files taken from online?

I might be doing something obviously wrong, but I am looking at the JSON of a comment section, and the "after" field is always null for every comment. For example: https://www.reddit.com/r/soccer/comments/1caz0gv/daily_discussion/.json

Is there an obvious reason for this, or is it something I am doing wrong?
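
For context, in comment-tree JSON "after" is expected to be null: it is a pagination cursor for listings (subreddit feeds, search results), while comment trees expand through "more" objects instead. A sketch of where "after" does apply, using plain requests with an assumed User-Agent:

    import requests

    headers = {"User-Agent": "script:pagination-demo:v0.1 (by /u/your_username)"}  # assumed UA
    after = None
    for _ in range(3):  # fetch three pages of r/soccer posts
        resp = requests.get("https://www.reddit.com/r/soccer/new.json",
                            params={"limit": 100, "after": after},
                            headers=headers, timeout=10).json()
        after = resp["data"]["after"]  # None only when the listing is exhausted
        print(len(resp["data"]["children"]), after)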

5 Comments
2024/04/23
09:58 UTC

4

Reddit API throwing 500 errors

Seems to be a general error happening since at least half an hour ago, but I can't find anything about it, and redditstatus doesn't show any issues.

1 Comment
2024/04/22
20:48 UTC

3

403 (Blocked) on any subreddit/random, with read scope and user-agent specified (Node.js)

Hi all, I'm writing a little script in Node.js to fetch random posts. I've been able to authorize the app properly, with both the password and client_credentials grant types; I get my bearer token regardless. I'm also able to query /api/v1/me properly. However, every time I try to GET /r/[any subreddit]/random, the API gives me a 403 page with a blocked header.

I'm using the following headers, which should be sufficient.

const fetchOptions = {
    method: 'GET',
    headers: new Headers({
        "User-Agent": "nodejs:com.curbbiter:v0.0.3 (by /u/Tiamut_)",
        "Authorization": `bearer ${access_token}`,
    }),
};
const res = await fetch(`https://oauth.reddit.com/r/${subredditName}/random`, fetchOptions);

Note that I've confirmed the access_token and subredditName variables are valid.

I've also specified the read scope (and tried with the default wildcard scope), to no avail. Is there something I'm doing wrong? Thanks in advance for any help you can provide!

2 Comments
2024/04/22
20:26 UTC

1

Is crossposting prohibited?

I made a subreddit and then wrote a script to crosspost submissions from other subs to my subreddit.

My script is run with a different username than the username that started the subreddit.

The crossposting works the first time, but not the second time, and the first crossposts are deleted.

I am wondering if Reddit prohibits automated crossposting?

Is it possible that I might need to enable crossposts in my subreddit?

2 Comments
2024/04/22
19:05 UTC

2

403 only on .json endpoints

I've been running some data collection scripts on Google Apps Script unauthenticated, only with a user agent. Every other request or so now fails, since Reddit is cracking down on unauthenticated use. So I thought I might as well do it right instead.

I have a function to log in and authenticate:

    var response = UrlFetchApp.fetch('https://www.reddit.com/api/v1/access_token?scope=read', {
      method: 'post',
      'muteHttpExceptions': true,
      headers: {
        'User-Agent': AGENT,
        'Authorization': 'Basic ' + Utilities.base64Encode(client_id + ':' + client_secret)
      },
      payload: {
        grant_type: 'client_credentials'
      }
    });

From this I get a token that I can use in further requests:

var options = {
  'method': 'get',
  'muteHttpExceptions': true,
  'headers': {
    Authorization: 'Bearer ' + token,
    'User-Agent': AGENT,
  }
};
var response = UrlFetchApp.fetch(url, options);

But whenever I call this last code block it gives me: {"message": "Forbidden", "error": 403}

More specifically I've called it with var url = "https://www.reddit.com/r/polandball.json" which I can fetch perfectly fine without doing all the authorization.

I can get "https://www.reddit.com/r/polandball" perfectly fine. But the JSON is not allowed.

So what the heck is going on here?
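
The classic explanation for exactly this symptom: OAuth bearer tokens are only honored on oauth.reddit.com, and authenticated requests sent to www.reddit.com get a 403. A sketch of the fixed request, in Python for illustration; in the Apps Script version only the hostname would change (and the .json suffix is unnecessary on the OAuth host):

    import requests

    token = "YOUR_ACCESS_TOKEN"  # placeholder; from the access_token call above
    headers = {
        "Authorization": f"Bearer {token}",
        "User-Agent": "script:polandball-reader:v0.1 (by /u/your_username)",  # assumed UA
    }
    resp = requests.get("https://oauth.reddit.com/r/polandball/hot",  # note the host
                        headers=headers, timeout=10)
    print(resp.status_code)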

2 Comments
2024/04/22
03:21 UTC

3

Why does this PRAW code return submissions for some subreddits but not others?

I'm using this to query the top 20 posts of the week for a few subreddits, but some subreddits return nothing - despite having submissions from the past week:

reddit = praw.Reddit(client_id=clientID,
                     client_secret=clientSecret,
                     user_agent=userAgent,
                     username=redditUsername,
                     password=redditPassword)

try:
    submissions = reddit.subreddit(<subreddit>).top(time_filter='week', limit=20)

There doesn't seem to be a pattern to which subreddits do or don't work. All are public, all have posts from the past week (some greater than 20, some less than), some of the working subreddits are NSFW, some are not.

3 Comments
2024/04/20
13:36 UTC

2

403 Forbidden PRAW

As I am working on creating a survey, I am trying to use the following to help distribute the link (I plan to ask those who want the link to dm me).

Anyway, I am getting a 403 Forbidden error; can anyone help in solving this? Here is my code:

import praw
import time
import traceback
import prawcore.exceptions

# Authenticate with Reddit
reddit = praw.Reddit(
    client_id='XXX',      # Replace with your actual client_id
    client_secret='XXXX', # Replace with your actual client_secret
    user_agent='script:Automatic DM responder for survey:v1.0 (by u/NK524563)',
    username='NK524563',  # Your Reddit username
    password='XXX',       # Your Reddit password
    scopes=['read', 'identity', 'submit', 'privatemessages']
)

# List of survey URLs
urls = [
    'https://qualtrics.com/survey1',
    'https://qualtrics.com/survey2',
    'https://qualtrics.com/survey3'
]

# Counter to keep track of the last used index
current_index = 0

def check_and_respond():
    global current_index
    try:
        for message in reddit.inbox.unread(limit=None):
            if isinstance(message, praw.models.Message):
                message.reply(f"Thank you for your interest! Please take our survey here: {urls[current_index]}")
                message.mark_read()  # Mark message as read
                current_index = (current_index + 1) % len(urls)  # Cycle back to 0 after the last URL
    except Exception as e:
        print("An error occurred: ", str(e))
        print("Traceback: ", traceback.format_exc())
        # Attempt to get more information from the PRAW exception if it's related to HTTP
        if isinstance(e, prawcore.exceptions.ResponseException):
            response = e.response
            print("Detailed HTTP response:")
            print("Status code:", response.status_code)
            print("Headers:", response.headers)
            print("Content:", response.text)

try:
    while True:
        check_and_respond()
        time.sleep(60)  # Sleep for 60 seconds before checking again
except Exception as e:
    print("SCRIPT ERROR:", e)
    raise

5 Comments
2024/04/19
20:01 UTC

2

503 error when hitting ad_groups endpoint

Anyone else getting a 503 error when hitting the GET /api/v2/accounts/{account_id}/ad_groups endpoint for Reddit ads?

2 Comments
2024/04/18
13:34 UTC

1

Looking for a PHP library like PRAW

Hi, I was searching for a library like PRAW for PHP, as most I have found seem to be rather old and no longer maintained. Or is it better to use JRAW instead of looking for a PHP version? Thank you in advance!

4 Comments
2024/04/18
09:49 UTC

3

Get comments of a given subreddit's users with PRAW

I'm working on a dataset for an authorship attribution algorithm. For this purpose, I've decided to gather comments from a single subreddit's users.

The way I'm doing it right now consists of two steps. First, I look through all comments on a subreddit (via subreddit.comments) and store all of the unique usernames of their authors. Afterwards, I look through each user's history and store all comments that belong to the appropriate subreddit. If their number exceeds a certain threshold, they make it into the proper dataset; otherwise the user is discarded.

Ideally, this process would repeat until all users have been checked; however, I'm always cut off from PRAW long before that, with my most numerous dataset hardly exceeding 11,000 comments. Is this normal, or should I look for issues with my user_agent? I'm guessing this solution is far from optimal, but how could I further streamline it?
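
Most likely it is not the user_agent: Reddit listings cap out at roughly 1,000 items, so both the subreddit comment listing and each user's history are hard-limited no matter how the client identifies itself. A sketch of the two-step harvest with those caps made explicit, assuming a PRAW subreddit object and an arbitrary threshold:

    authors = set()
    for comment in subreddit.comments(limit=None):  # listing caps at ~1000 comments
        if comment.author:
            authors.add(comment.author.name)

    THRESHOLD = 50  # arbitrary cutoff
    dataset = {}
    for name in authors:
        mine = [c.body for c in reddit.redditor(name).comments.new(limit=None)  # ~1000 cap
                if c.subreddit.display_name == subreddit.display_name]
        if len(mine) >= THRESHOLD:
            dataset[name] = mine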

2 Comments
2024/04/17
13:59 UTC

6

Reverse Reddit mobile app to access hidden api

Some data displayed in the mobile app and on new.reddit is not available through the official api: Things like listing subreddit category or global subscriber rank.

My question is if someone has tried to reverse engineer the Reddit mobile app to get ahold of these endpoints, if they are even accessible through a conventional API and not a custom protocol or handshake.

My own attempts have been to use a custom certificate on an Android phone to capture HTTPS data with the "Package Capture" Android app. This used to work fine for some old apps using HTTPS back in 2018 or so, but nowadays I'm having problems decrypting HTTPS data when using the Chrome app. Even worse, the Reddit app will not even load any data when using the "Package Capture" proxy, indicating that they might be using SSL pinning or other measures to prevent circumventing their private certificate.

I made some progress trying to decompile the Reddit app apk, but looking through decompile code is very annoying, and I had problems finding the actual requests being made to get this data.

Has anyone attempted something similar?

One alternative is web scraping, but even new.reddit doesn't provide subreddit categories afaik.

10 Comments
2024/04/17
11:13 UTC

4

Making a simple reddit bot post API changes?

Hi, I want to make a bot that simply scrapes all of my subreddit's posts and comments and relevant metadata. It would also make some comments.

Pre-API-changes I would know where to start, but now I'm having trouble finding out how to use the new paid system. I can't even find Reddit's API website for it, if they have one. Any good tutorials?

2 Comments
2024/04/17
08:48 UTC

2

API for "#X in <category>" for subreddits?

When I visit r/Marvel in the Reddit app I can see the text "#3 in Comics". I can also click on this to see the top 25 subreddits in the category of "Comics".

I cannot see this on the Reddit web GUI. Is this available in the API? Or maybe via a hidden endpoint for the app?

1 Comment
2024/04/16
16:00 UTC

2

[PRAW] Local host refused to connect / OSError: [Errno 98] Address already in use

Hello! I've been having trouble authenticating with the Reddit API using PRAW for weeks. Any help would be greatly appreciated, because I've got no idea where I'm going wrong. I've created a personal-use script to obtain basic data from subreddits, but my code isn't running and my Reddit instance doesn't work with the credentials I'm using, so I cannot get a refresh token.

I know this is a long read but I am a complete beginner so I figured the more info I show the better!! Thanks in advance :)

import praw
import random
import socket
import sys

def receive_connection():
  server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
  server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
  server.bind(("localhost", 8080))
  server.listen(1)
  client = server.accept()[0]
  server.close()
  return client


def send_message(client, message):
  print(message)
  # \r\n\r\n terminates the HTTP headers; the original /r/n was a typo
  client.send(f"HTTP/1.1 200 OK\r\n\r\n{message}".encode("utf-8"))
  client.close()


def main():
  print("Go here while logged into the account you want to create a token for: "
  "https://www.reddit.com/prefs/apps/")
  print("Click the create an app button. Put something in the name field and select the"
  " script radio button.")
  print("Put http://localhost:8080 in the redirect uri field and click create app")
  client_id=input("Enter the client id: ")
  client_secret=input("Enter the client secret: ")
  commaScopes=input("Now enter a comma separated list of scopes, or all for all tokens")

  if commaScopes.lower()=="all":
    scopes=["*"]
  else:
    scopes = commaScopes.strip().split(",")
  
  reddit = praw.Reddit(
      client_id=client_id.strip(),
      client_secret=client_secret.strip(),
      redirect_uri="http://localhost:8080",
      user_agent="praw_refresh_token_example")
  
  state = str(random.randint(0, 65000))
  url = reddit.auth.url(scopes, state, "permanent")
  print(f"Now open this url in your browser: {url}")
  sys.stdout.flush()

  client = receive_connection()
  data = client.recv(1024).decode("utf-8")
  param_tokens = data.split(" ", 2)[1].split("?", 1)[1].split("&")
  params = {
      key: value for (key, value) in [token.split("=") for token in param_tokens]
  }
  
  if state != params["state"]:
    send_message(
        client,
        f"State mismatch. Expected: {state} Received: {params['state']}",
    )
    return 1 
  elif "error" in params:
    send_message(client, params["error"])
    return 1

  refresh_token = reddit.auth.authorize(params["code"])
  send_message(client, f"Refresh token: {refresh_token}")
  return 0 

if __name__ == "__main__":
  sys.exit(main())

I enter my client id and my secret, it goes to the page where I click to authorise my application with my account, but then, when it is meant to redirect to localhost to give me a token, it just says localhost refused to connect, and the code returns "OSError: [Errno 98] Address already in use".

I am also just having trouble with my credentials: without this code, I have entered my client id, secret, user agent, username and password. The code runs, but when I run the checks below, it returns True and None. I have checked my credentials a million times over. Is there likely a problem with my application? Or my account, potentially? I'm using Colaboratory to run this code.
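
Two observations, hedged since the full setup isn't shown: read_only being True while user.me() returns None usually means PRAW never received (or could not use) the username/password pair, so it fell back to read-only mode; and a localhost redirect flow generally cannot complete inside Colaboratory, since your browser cannot reach the notebook VM's port 8080. A minimal credentials self-test, with placeholder values:

    import praw

    reddit = praw.Reddit(
        client_id="YOUR_ID",  # placeholders, not real values
        client_secret="YOUR_SECRET",
        username="YOUR_USERNAME",
        password="YOUR_PASSWORD",
        user_agent="script:credential-test:v0.1 (by /u/your_username)",
    )
    print(reddit.read_only)  # expect False once the credentials are accepted
    print(reddit.user.me())  # expect your username, not None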

print(reddit.read_only)
True

print(reddit.user.me())
None
3 Comments
2024/04/16
11:36 UTC

1

Total newb here. Can someone help me with a task?

I posted about this in r/dataengineering and got a reply (it's here) that said the task I'm trying to do is pretty easy.

Easy for who?? Not me, apparently! But the reply mentioned PRAW and the Reddit API, so I thought I'd pop on over here and see whether anyone is in a giving kind of mood. Can someone help me figure out how to do this? I'd be happy to give you the gift of free books (audiobook, even!) in return.

Hello dataengineers....

I'm scheduled to give a short talk this June at a conference, and to prepare for it I thought I'd invite a group to discuss the topic in a subreddit I moderate which is currently all of 6 members strong.

I'd like to invite those who've commented on my posts/whose posts I've commented on.
I've downloaded my Reddit data, no problem there, but I really imagined it would be easier to get the usernames of those I've interacted with. I thought there would be a field for the usernames, but there is not.
Posts/comments are listed by "ID" (and in some cases "parent"). Is there some way I can use this info to get what I need?
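
Yes: a sketch of one way, assuming PRAW and that the export's comments file lists your own comment IDs (the IDs and credentials below are placeholders). For each of your comments, parent() is the post or comment you replied to, and its author is the username you interacted with:

    import praw

    reddit = praw.Reddit(client_id="YOUR_ID", client_secret="YOUR_SECRET",
                         user_agent="script:invite-helper:v0.1 (by /u/your_username)")

    comment_ids = ["abc123", "def456"]  # placeholder: the "id" column from your export
    partners = set()
    for cid in comment_ids:
        parent = reddit.comment(cid).parent()  # the comment/post you replied to
        if parent.author:                      # None if the account was deleted
            partners.add(parent.author.name)
    print(partners)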

19 Comments
2024/04/15
13:14 UTC

1

Query for NSFW subreddits not returning results

When I use the PRAW API to search an NSFW subreddit for a query (keyword), no submissions are returned.
This was possible until a few weeks ago, but I can't find any information on what has changed.

Specifically, I tried to search the r/DrugNerds subreddit for the query "Ketanserin", which yields no results while there are numerous posts discussing this pharmaceutical.

Does anyone know why this is not possible anymore, or have any info related to the change?
Thank you in advance.

for submission in reddit.subreddit("drugnerds").search("ketanserin", time_filter='all', sort='top', limit=1000):
    print(submission.title)
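
One thing worth trying, hedged since Reddit has also been restricting sexually explicit content in API search results lately: search omits 18+ results unless the request opts in, and PRAW can pass that flag through extra params:

    for submission in reddit.subreddit("drugnerds").search(
            "ketanserin",
            time_filter="all",
            sort="top",
            params={"include_over_18": "on"},  # opt in to NSFW results
    ):
        print(submission.title)
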
2 Comments
2024/04/15
07:16 UTC
