Getting Started

In this tutorial we’ll go over everything needed to create a bot or program that uses reddit’s API through the Python Reddit API Wrapper (PRAW). We’re going to write a program that breaks down a redditor’s karma by subreddit, just like the built-in reddit feature. Unlike that feature, our program can break down the karma of any redditor, not just our own. It will be less precise due to limitations in the reddit API, but we’re getting ahead of ourselves.

This is a beginner’s tutorial to PRAW. We’ll go over the hows and whys of everything from getting started to writing your first program that accesses reddit. What we won’t go over in this tutorial is the Python language itself.

Connecting to reddit

Start by firing up Python and importing PRAW. You can find the installation instructions on the main page.

>>> import praw

Next we need to connect to reddit and identify our script. We do this through the user_agent we supply when our script first connects to reddit.

>>> user_agent = "Karma breakdown 1.0 by /u/_Daimon_"
>>> r = praw.Reddit(user_agent=user_agent)

Care should be taken when we decide on what user_agent to send to reddit. The user_agent field is how we uniquely identify our script. The reddit API wiki page has the official and updated recommendations on user_agent strings and everything else. Reading it is highly recommended.

In addition to reddit’s recommendations, your user_agent string should not contain the keyword bot.

Breaking Down Redditor Karma by Subreddit

Now that we’ve established contact with reddit, it’s time for the next step in our script: breaking down a user’s karma by subreddit. There isn’t a function that does this, but luckily it’s fairly easy to write the Python code ourselves.

We use the function get_redditor() to get a Redditor instance that represents a user on reddit. In the following example, user will provide access to the reddit user “_Daimon_”.

>>> user_name = "_Daimon_"
>>> user = r.get_redditor(user_name)

Next we can use the functions get_comments() and get_submitted() to get that redditor’s comments and submissions. Comments and submissions are both kinds of Thing, the base class described on the reddit API wiki page. Both functions can be called with the parameter limit, which limits how many things we receive. By default, reddit returns 25 items. When the limit is set to None, PRAW will try to retrieve all the things. However, due to limitations in the reddit API (not PRAW) we might not get all the things, but more about that later. During development you should be nice and set the limit lower to reduce reddit’s workload if you don’t actually need all the results.

>>> thing_limit = 10
>>> gen = user.get_submitted(limit=thing_limit)

Next we take the generator containing things (either comments or submissions) and iterate through them to create a dictionary with the subreddit display names (like python or askreddit) as keys and the karma obtained in those subreddits as values.

>>> karma_by_subreddit = {}
>>> for thing in gen:
...     subreddit = thing.subreddit.display_name
...     karma_by_subreddit[subreddit] = (karma_by_subreddit.get(subreddit, 0)
...                                     + thing.score)
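To try the aggregation step without a network connection, here is a minimal sketch that runs the same kind of loop over stand-in objects. FakeSubreddit, FakeThing, karma_breakdown and the sample scores are hypothetical, introduced only for illustration; they are not part of PRAW. The sketch assumes only what the loop above relies on: each thing has a score and a subreddit.display_name.

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-ins for PRAW objects; only the two attributes
# used by the tutorial's loop are modelled here.
FakeSubreddit = namedtuple("FakeSubreddit", ["display_name"])
FakeThing = namedtuple("FakeThing", ["subreddit", "score"])


def karma_breakdown(things):
    """Sum the score of each thing under its subreddit's display name."""
    totals = defaultdict(int)
    for thing in things:
        totals[thing.subreddit.display_name] += thing.score
    return dict(totals)


# Made-up sample data, purely for illustration.
sample_things = [
    FakeThing(FakeSubreddit("python"), 12),
    FakeThing(FakeSubreddit("askreddit"), 3),
    FakeThing(FakeSubreddit("python"), 5),
]

breakdown = karma_breakdown(sample_things)
print(breakdown)
```

Using defaultdict(int) is just an alternative to the dict.get(subreddit, 0) idiom above; both start every subreddit at zero.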

Finally, let’s output the karma breakdown in a pretty format.

>>> import pprint
>>> pprint.pprint(karma_by_subreddit)

And we’re done. The program could use a better way of displaying the data, exception catching, etc. If you’re interested, you can check out a more fleshed out version of this Karma-Breakdown program.
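As one possible way to display the data more readably, the sketch below sorts the breakdown by karma and lines the numbers up in a column. format_breakdown and the sample dictionary are hypothetical and not part of PRAW; the function only assumes a dictionary shaped like karma_by_subreddit.

```python
def format_breakdown(karma_by_subreddit):
    """Return one line per subreddit, highest karma first."""
    if not karma_by_subreddit:
        return "No karma found."
    # Pad subreddit names to the longest one so the numbers line up.
    width = max(len(name) for name in karma_by_subreddit)
    # Sort by karma descending, then by name to keep ties stable.
    rows = sorted(karma_by_subreddit.items(),
                  key=lambda item: (-item[1], item[0]))
    return "\n".join("{:<{w}}  {:>6}".format(name, karma, w=width)
                     for name, karma in rows)


# Made-up sample data, purely for illustration.
sample = {"python": 17, "askreddit": 3, "learnpython": 17}
report = format_breakdown(sample)
print(report)
```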

Obfuscation and API limitations

As mentioned before, there are limits in reddit’s API. There is a limit to the number of things reddit will return before it barfs. Any single reddit listing will display at most 1000 items. This is true for all listings, including subreddit submission listings, user submission listings, and user comment listings.

You may also have realized that the karma values change from run to run. This inconsistency is due to reddit’s obfuscation of the upvotes and downvotes. The obfuscation is done to everything and everybody to thwart potential cheaters. There’s nothing we can do to prevent this.

Another thing you may have noticed is that retrieving a lot of elements takes time. reddit allows requests of up to 100 items at once. So if you request <= 100 items, PRAW can serve your request in a single API call, but for larger requests PRAW will break it into multiple API calls of 100 items each, separated by a small 2 second delay to follow the API guidelines. So requesting 250 items will require 3 API calls and take at least 2x2=4 seconds due to the API delay. PRAW does the API calls lazily, i.e. it will not send the next API call until you actually need the data. This means the time spent per batch is roughly max(api_delay, code execution time): the time your code spends processing one batch counts toward the delay before the next call.
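The arithmetic above can be sketched as a small helper. estimate_requests and the constant names are hypothetical; the numbers (100 items per call, a 2 second delay, a 1000 item listing cap) are taken from this section, not read from PRAW. It gives only the lower bound caused by the delay and ignores network time.

```python
ITEMS_PER_CALL = 100    # per the text: up to 100 items per request
API_DELAY_SECONDS = 2   # per the text: delay between consecutive calls
LISTING_CAP = 1000      # per the text: a listing shows at most 1000 items


def estimate_requests(limit):
    """Return (api_calls, minimum_delay_seconds) for requesting `limit` items."""
    items = min(limit, LISTING_CAP)
    # Integer ceiling division: a partial batch still costs one call.
    calls = -(-items // ITEMS_PER_CALL)
    # The delay applies between calls, so the first call is not delayed.
    delay = max(calls - 1, 0) * API_DELAY_SECONDS
    return calls, delay


print(estimate_requests(250))   # (3, 4), matching the example in the text
print(estimate_requests(100))   # (1, 0)
```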

Continue to the next tutorial: Writing a reddit Bot.

The full Karma Breakdown program.

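import pprint

import praw

# Identify the script to reddit.
user_agent = "Karma breakdown 1.0 by /u/_Daimon_"
r = praw.Reddit(user_agent=user_agent)

# Fetch a limited number of the redditor's submissions.
thing_limit = 10
user_name = "_Daimon_"
user = r.get_redditor(user_name)
gen = user.get_submitted(limit=thing_limit)

# Sum the score of each thing under its subreddit's display name.
karma_by_subreddit = {}
for thing in gen:
    subreddit = thing.subreddit.display_name
    karma_by_subreddit[subreddit] = (karma_by_subreddit.get(subreddit, 0)
                                     + thing.score)

# Output the karma breakdown.
pprint.pprint(karma_by_subreddit)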