Methods for Finding Related Reddit Subreddits with Simple Set Theory (2024)

I recently wrote a post on how to visualize network graphs of Reddit subreddits.

One of the reasons I’ve been researching the topic is to find a good way to facilitate discovery of lesser-known subreddits, as Reddit is doing a terrible job at it (although they have been trying a few new experiments very recently). As it turns out, invoking graph theory is overkill. Even fancy machine learning approaches like collaborative filtering, while powerful, may not be required to help Redditors discover new things.

Let’s say we have two sets: Set A, the set of active users in a given subreddit, and Set B, the set of active users in another subreddit. The intersection of Sets A and B (A ∩ B) represents users who are active in both subreddits. In the formulas that follow, (A), (B), and (A ∩ B) denote the number of users in each set.

Using BigQuery, I can get the comment data from ALL public Reddit subreddits; this technique would not work well on any smaller subset. The network graph edgelist, obtained as described in my previous post, conveniently gives (A ∩ B): it contains the number of shared active users for every pair of subreddits (defining “active users” as users who have commented in at least 5 unique threads in a given subreddit within the past 6 months).


In this case, we can filter the edgelist to only allow intersections where there are at least 10 active users; this prevents including dead and personal subreddits.

We can run another similar query to get the number of active users for each subreddit.
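As a rough illustration of these two steps, here is a minimal Python sketch. It is not the BigQuery code used for this post: the `(author, subreddit, thread_id)` tuple layout and the function names `active_users`, `build_edgelist`, and `subreddit_sizes` are assumptions made for the example, and the 6-month window is assumed to have been applied before the rows are passed in.

```python
from collections import defaultdict
from itertools import combinations

MIN_THREADS = 5   # threads a user must comment in to count as "active"
MIN_OVERLAP = 10  # shared active users needed to keep a subreddit pair


def active_users(comments, min_threads=MIN_THREADS):
    """Map each subreddit to its set of active users.

    `comments` is an iterable of (author, subreddit, thread_id) tuples,
    a hypothetical stand-in for rows exported from the comment data.
    """
    threads = defaultdict(set)  # (subreddit, author) -> distinct thread ids
    for author, subreddit, thread_id in comments:
        threads[(subreddit, author)].add(thread_id)

    active = defaultdict(set)
    for (subreddit, author), ids in threads.items():
        if len(ids) >= min_threads:
            active[subreddit].add(author)
    return dict(active)


def build_edgelist(active, min_overlap=MIN_OVERLAP):
    """Count shared active users for every pair of subreddits."""
    by_user = defaultdict(set)  # author -> subreddits where they are active
    for subreddit, users in active.items():
        for user in users:
            by_user[user].add(subreddit)

    overlap = defaultdict(int)
    for subs in by_user.values():
        # Sorting gives each pair one canonical (a, b) key.
        for a, b in combinations(sorted(subs), 2):
            overlap[(a, b)] += 1
    # Drop weak pairs, which mostly come from dead or personal subreddits.
    return {pair: n for pair, n in overlap.items() if n >= min_overlap}


def subreddit_sizes(active):
    """Number of active users per subreddit."""
    return {subreddit: len(users) for subreddit, users in active.items()}
```

Iterating over each user's subreddits, rather than over every pair of subreddits, only ever touches pairs that share at least one user, which keeps the sketch workable when most pairs have no overlap at all.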


After that, for a given subreddit A, find:

(A ∩ B) / (B)

for all subreddits B where (A ∩ B) > 0 (i.e. only neighbors of A). This computation takes less than a second. Additionally, because (A ∩ B) can never exceed (B), the output is always a percentage between 0% and 100%. For the visualizations, we plot the Top 15 subreddits with the highest overlap with the specified subreddit A (and color the bars with a nice viridis palette to provide another easy way to perceive the relative magnitude of relatedness).
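The calculation can be sketched in a few lines of Python. This is only an illustration, not the code behind the charts: the function name `related_subreddits`, the dictionary layouts, and all of the numbers are made up for the example.

```python
def related_subreddits(target, edgelist, sizes, top_n=15):
    """Rank the neighbors B of `target` (A) by (A ∩ B) / (B)."""
    scores = {}
    for (a, b), shared in edgelist.items():
        if target == a:
            other = b
        elif target == b:
            other = a
        else:
            continue  # this pair does not involve the target subreddit
        scores[other] = shared / sizes[other]
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_n]


# Made-up numbers, purely for illustration.
edgelist = {("aww", "cats"): 40, ("aww", "news"): 300, ("cats", "news"): 20}
sizes = {"aww": 5000, "cats": 50, "news": 6000}

for name, score in related_subreddits("aww", edgelist, sizes):
    print(f"{name}: {score:.1%}")
# cats: 80.0%
# news: 5.0%
```

Dividing by the size of B rather than A is what pushes small subreddits up the ranking in this toy data: the small subreddit shares far fewer users in absolute terms, but most of its active users are also active in the target.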

The methodology may sound arbitrary, but the results are very interesting. Here’s a chart of the top related subreddits for /r/aww, one of the most popular places on the internet for cat pictures.


I have honestly never heard of any of these subreddits before. Yet by analyzing public user activity alone, I found a few new places to get more cute pics.

This methodology is excellent for finding subreddit-specific subsubreddits which may not be documented. The related subreddits for /r/buildapc offer more places to get PC building advice.


Related subreddits for sport-specific subreddits, like /r/cfb (college football), include the corresponding teams.


/r/food related subreddits list a surprising number of subreddits dedicated to specific foods.


There is a surprising amount of depth to the /r/me_irl network.


The chart for /r/programming can tell you which subreddits exist for specific programming languages and technologies.


The methodology can also reveal a lack of related subreddits, by the large contrast between subreddits with high relatedness and low relatedness. For example, while /r/cfb may have large numbers of obviously-related subreddits as a sports subreddit, /r/golf has only 2.


You can view Related Subreddit charts for the Top 200 Subreddits in this GitHub repository.

Finding Similar Subreddits

Another method for finding related subreddits would be to find subreddits with similar communities. An academic approach to finding similarity between sets is the Jaccard Index. Using the same set A and set B definitions above, the formula now becomes:

(A ∩ B) / [(A) + (B) - (A ∩ B)]

which outputs the Jaccard Index, between 0 and 1. This formula only requires a few tweaks to the original code. The results from this computation tell a different story.
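As a small sanity check of the formula, here is a Python sketch that computes the index from the three counts alone and compares it with the definition on actual sets. The function name `jaccard_index` and the toy usernames are assumptions for illustration only.

```python
def jaccard_index(size_a, size_b, shared):
    """(A ∩ B) / [(A) + (B) - (A ∩ B)], computed from counts alone."""
    union = size_a + size_b - shared
    return shared / union if union else 0.0


# Toy sets of usernames, purely for illustration.
a = {"ann", "bob", "cat", "dan"}
b = {"cat", "dan", "eve"}

from_counts = jaccard_index(len(a), len(b), len(a & b))
from_sets = len(a & b) / len(a | b)
print(from_counts, from_sets)  # 0.4 0.4
```

Working from counts means the index can be computed from the edgelist and the per-subreddit totals, without keeping the full user sets around.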

Here are the most-similar subreddits to /r/aww:


In this implementation, the default Reddit subreddits must be removed from the results, as the communities of default subreddits are largely similar to most others by design. Even former defaults like /r/adviceanimals and /r/technology still have large numbers of holdout users, which skew the results. As /r/aww is a mass-appeal subreddit, it makes sense that its community is similar to those of other mass-appeal subreddits.
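One way to apply that filter is to skip excluded subreddits while ranking neighbors by Jaccard Index. The Python sketch below is a hypothetical illustration: the function name `similar_subreddits`, the `excluded` parameter, the contents of the exclusion set, and all of the counts are invented for the example.

```python
def similar_subreddits(target, edgelist, sizes, excluded=frozenset(), top_n=15):
    """Rank neighbors of `target` by Jaccard Index, skipping excluded names."""
    scores = {}
    for (a, b), shared in edgelist.items():
        if target not in (a, b):
            continue
        other = b if a == target else a
        if other in excluded:
            continue  # e.g. default subreddits that would swamp the ranking
        union = sizes[target] + sizes[other] - shared
        scores[other] = shared / union
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_n]


# Made-up counts and a made-up exclusion list, purely for illustration.
edgelist = {("aww", "pics"): 2000, ("aww", "cats"): 40, ("aww", "rabbits"): 100}
sizes = {"aww": 5000, "pics": 6000, "cats": 50, "rabbits": 200}
defaults = {"pics"}

for name, score in similar_subreddits("aww", edgelist, sizes, excluded=defaults):
    print(f"{name}: {score:.4f}")
```

Without the exclusion set, the large mass-appeal neighbor in this toy data would rank first by a wide margin.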

The magnitude of the Jaccard Index measures the strength of the similarity. Most subreddit relationships have a low Jaccard Index, but the relative magnitude across all of a subreddit's neighbors still allows comparisons between potential related subreddits (this is also the reason why the x-axis is not fixed across plots). The subreddit relationship with the highest absolute similarity is /r/arrow and /r/flashtv at 0.345, which makes sense given the massive overlap between the two CW television shows.


The Jaccard Index is more useful for finding similar subreddits to niche subreddits. Let’s try a few of the subreddits mentioned previously and see how the results changed.

/r/buildapc is a niche subreddit, and the output now identifies well-established subreddits, unlike the previous related-subreddit methodology.


The subreddit most similar to /r/cfb (college football) is /r/collegebasketball!


The subreddit most similar to /r/food is /r/cooking!


The subreddit most similar to /r/programming is /r/linux! (of course)


You can view the Similar Subreddit charts for the Top 200 Subreddits in this GitHub repository.

Again, Reddit has significantly better internal data for identifying user activity between subreddits, such as voting patterns and clickthrough tracking. But the results shown using these two set methodologies are pretty good for using public data. In fact, these two set approaches can theoretically work with any set of categorized, settable data, which may give me a few ideas for new blog posts in the future.

And there are still the fancy machine learning approaches to try.

As always, the full code used to process the comment data and generate the visualizations is available in this Jupyter notebook, open-sourced on GitHub.

If you do find any other interesting trends in the related/similar charts of other subreddits and write about it, it would be greatly appreciated if proper attribution is given back to this post and/or myself. Thanks!
