Bulk Downloader for Reddit

Overview

saveddit is a bulk media downloader for Reddit. Install it from PyPI:

pip3 install saveddit

Setting up authorization

  • Register an application with Reddit to obtain a Reddit client ID and client secret
  • Register an application with Imgur to obtain an Imgur client ID

These registrations will authorize you to use the Reddit and Imgur APIs to download publicly available information.

User configuration

The first time you run saveddit, you will see something like this:

foo@bar:~$ saveddit
Retrieving configuration from ~/.saveddit/user_config.yaml file
No configuration file found.
Creating one. Please edit ~/.saveddit/user_config.yaml with valid credentials.
Exiting
  • Open the generated ~/.saveddit/user_config.yaml
  • Update the client IDs and secrets from the previous step
  • If you plan on using the user API, add your Reddit username as well:
imgur_client_id: ''
reddit_client_id: ''
reddit_client_secret: ''
reddit_username: ''
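
The configuration file is plain YAML with the four keys shown above. Below is a minimal, illustrative sketch of loading and validating it with the pyyaml package (listed as a dependency in the comments further down); the load_config helper is this editor's example, not saveddit's actual API.

    import os
    import sys
    import yaml  # pyyaml

    CONFIG_PATH = os.path.expanduser("~/.saveddit/user_config.yaml")

    def load_config(path=CONFIG_PATH):
        # Read the YAML file and make sure the required keys are filled in.
        with open(path) as f:
            config = yaml.safe_load(f) or {}
        missing = [k for k in ("imgur_client_id", "reddit_client_id",
                               "reddit_client_secret") if not config.get(k)]
        if missing:
            sys.exit(f"Please fill in {', '.join(missing)} in {path}")
        return config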

Download from Subreddit

foo@bar:~$ saveddit subreddit -h
Retrieving configuration from /Users/pranav/.saveddit/user_config.yaml file

usage: saveddit subreddit [-h] [-f categories [categories ...]] [-l post_limit] [--skip-comments] [--skip-meta] [--skip-videos] -o output_path subreddits [subreddits ...]

positional arguments:
  subreddits            Names of subreddits to download, e.g., AskReddit

optional arguments:
  -h, --help            show this help message and exit
  -f categories [categories ...]
                        Categories of posts to download (default: ['hot', 'new', 'rising', 'controversial', 'top', 'gilded'])
  -l post_limit         Limit the number of submissions downloaded in each category (default: None, i.e., all submissions)
  --skip-comments       When true, saveddit will not save comments to a comments.json file
  --skip-meta           When true, saveddit will not save meta to a submission.json file on submissions
  --skip-videos         When true, saveddit will not download videos (e.g., gfycat, redgifs, youtube, v.redd.it links)
  -o output_path        Directory where saveddit will save downloaded content

Example Usage: Download the hottest 5 posts each from /r/pics and /r/aww

foo@bar:~$ saveddit subreddit pics aww -f hot -l 5 -o ~/Desktop

You can download from multiple subreddits and use multiple filters:

foo@bar:~$ saveddit subreddit funny AskReddit -f hot top new rising -l 5 -o ~/Downloads/Reddit/.
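
Under the hood, saveddit talks to Reddit through PRAW (visible in the tracebacks quoted in the comments below). For orientation, here is a rough sketch of the equivalent request made directly with PRAW; it is not saveddit's actual code, and the credentials and user_agent string are placeholders.

    import praw

    # Credentials come from ~/.saveddit/user_config.yaml; the user_agent here is made up.
    reddit = praw.Reddit(
        client_id="<reddit_client_id>",
        client_secret="<reddit_client_secret>",
        user_agent="saveddit example (illustrative)",
    )

    # Roughly what `saveddit subreddit pics -f hot -l 5 -o ...` asks Reddit for.
    for submission in reddit.subreddit("pics").hot(limit=5):
        print(submission.title, submission.url)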

Download from User's page

foo@bar:~$ saveddit user -h
Retrieving configuration from /Users/pranav/.saveddit/user_config.yaml file

usage: saveddit user [-h] users [users ...] {saved,gilded,submitted,upvoted,comments} ...

positional arguments:
  users                 Names of users to download, e.g., Poem_for_your_sprog
  {saved,gilded,submitted,upvoted,comments}

optional arguments:
  -h, --help            show this help message and exit

Example Usage: Download a user's top 10 comments

saveddit user "Poem_for_your_sprog" comments -s top -l 10 -o ~/Desktop

Example Output

foo@bar:~$ tree ~/Downloads/www.reddit.com
/Users/pranav/Downloads/www.reddit.com
├── r
│   └── aww
│       └── new
│           ├── 000_We_decided_to_foster_a_litter_of...
│           │   ├── comments.json
│           │   ├── files
│           │   │   └── 7fjt2gkp32s61.jpg
│           │   └── submission.json
│           ├── 001_Besties_
│           │   ├── comments.json
│           │   ├── files
│           │   │   └── zklpm1qo32s61.jpg
│           │   └── submission.json
│           ├── 002_My_cat_dice_with_his_best_friend...
│           │   ├── comments.json
│           │   ├── files
│           │   │   └── av3yrbmo32s61.jpg
│           │   └── submission.json
│           ├── 003_Digging_makes_her_the_happiest_
│           │   ├── comments.json
│           │   ├── files
│           │   │   └── zjw5f3yl32s61.jpg
│           │   └── submission.json
│           └── 004_Our_beloved_pup_needs_some_help_...
│               ├── comments.json
│               ├── files
│               │   ├── 66su4i9b32s61.mp4
│               │   ├── 66su4i9b32s61_audio.mp4
│               │   └── 66su4i9b32s61_video.mp4
│               └── submission.json
└── u
    └── Poem_for_your_sprog
        └── gilded
            ├── 000_Comment__The_guy_was_the_biggest_deal_an...
            │   └── comments.json
            ├── 001_Comment__tl_dr_life_is_long_Journey_s_h...
            │   └── comments.json
            ├── 002_Comment_From_Northwind_mine_to_Talos_shr...
            │   └── comments.json
            ├── 003_Comment__I_feel_terrible_having_people_j...
            │   └── comments.json
            └── 004_Comment_I_often_stop_a_time_or_two_At_...
                └── comments.json

21 directories, 22 files

Supported Links:

  • Direct links to images or videos, e.g., .png, .jpg, .mp4, .gif etc.
  • Reddit galleries reddit.com/gallery/...
  • Reddit videos v.redd.it/...
  • Gfycat links gfycat.com/...
  • Redgif links redgifs.com/...
  • Imgur images imgur.com/...
  • Imgur albums imgur.com/a/... and imgur.com/gallery/...
  • YouTube links youtube.com/... and youtu.be/...
  • Other sites supported by youtube-dl
  • Self posts
  • For all other cases, saveddit will simply fetch the HTML of the URL (a rough sketch of such a fallback follows below)
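
For that last case, the sketch below shows what a minimal HTML fallback can look like, assuming the requests library; the function name, output filename, and timeout are illustrative, not saveddit's actual implementation.

    import requests

    def save_page_html(url, out_path):
        # Fetch the page and write the raw HTML to disk.
        response = requests.get(url, timeout=30)
        response.raise_for_status()
        with open(out_path, "w", encoding="utf-8") as f:
            f.write(response.text)

    save_page_html("https://example.com/some/post", "page.html")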

Contributing

Contributions are welcome, have a look at the CONTRIBUTING.md document for more information.

License

The project is available under the MIT license.

Comments
  • filenames for large multireddits

    Hi, I've just encountered a problem. When I try to make an anonymous multireddit with about 90 subreddits in it, the name of the generated folder throws this error: [Errno 36] File name too long

    Is there a way to bypass this?

    opened by kjboa 2
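
    One hedged workaround sketch for the issue above (not part of saveddit): truncate the generated directory name to a safe byte length before creating it. The 255-byte per-component limit is typical for Linux filesystems but is an assumption here.

        def truncate_name(name, max_bytes=255):
            # Most Linux filesystems cap a single path component at 255 bytes.
            clipped = name.encode("utf-8")[:max_bytes]
            # Drop any partial multi-byte character left at the cut point.
            return clipped.decode("utf-8", errors="ignore")

        print(truncate_name("multireddit_with_90_subreddit_names_joined_together" * 20))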
  • Permission Error

    I am using Linux Mint 20.1. The following error occurred.

        Traceback (most recent call last):
          File "/home/mobi/.local/bin/saveddit", line 8, in <module>
            sys.exit(main())
          File "/home/mobi/.local/lib/python3.8/site-packages/saveddit/saveddit.py", line 68, in main
            downloader.download(args.o,
          File "/home/mobi/.local/lib/python3.8/site-packages/saveddit/subreddit_downloader.py", line 79, in download
            os.makedirs(category_dir)
          File "/usr/lib/python3.8/os.py", line 213, in makedirs
            makedirs(head, exist_ok=exist_ok)
          File "/usr/lib/python3.8/os.py", line 213, in makedirs
            makedirs(head, exist_ok=exist_ok)
          File "/usr/lib/python3.8/os.py", line 213, in makedirs
            makedirs(head, exist_ok=exist_ok)
          [Previous line repeated 2 more times]
          File "/usr/lib/python3.8/os.py", line 223, in makedirs
            mkdir(name, mode)
        PermissionError: [Errno 13] Permission denied: '/Downloads'

    opened by mubashir-rehman 2
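
    The traceback above shows saveddit trying to create '/Downloads' at the filesystem root, which a normal user cannot write to. A hedged sketch of the usual fix on the caller's side, expanding '~' and creating directories tolerantly; this is an illustration, not saveddit's code.

        import os

        # '~' has to be expanded before the path reaches os.makedirs; '/Downloads' at the
        # filesystem root is what you get when that step is skipped or the path is mistyped.
        output_path = os.path.expanduser("~/Downloads")
        os.makedirs(output_path, exist_ok=True)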
  • http 401 error

    Hello, I have the following error when running:

        python -m saveddit.saveddit -r "Ebony" -f "new" -l 2000 -o "E:\E\D\saveddit\test"

        [saveddit ASCII art banner]

        Downloader for Reddit version : v1.0.0
        URL : https://github.com/p-ranav/saveddit

        E:\E\D\saveddit\test
        Downloading from /r/Ebony/new/
        Traceback (most recent call last):
          File "C:\Users\theo\AppData\Local\Programs\Python\Python37\lib\runpy.py", line 193, in _run_module_as_main
            "__main__", mod_spec)
          File "C:\Users\theo\AppData\Local\Programs\Python\Python37\lib\runpy.py", line 85, in _run_code
            exec(code, run_globals)
          File "C:\Users\theo\Downloads\saveddit-master\saveddit-master\saveddit\saveddit.py", line 73, in <module>
            main(args)
          File "C:\Users\theo\Downloads\saveddit-master\saveddit-master\saveddit\saveddit.py", line 32, in main
            categories=args.f, post_limit=args.l, skip_videos=args.skip_videos, skip_meta=args.skip_meta, skip_comments=args.skip_comments)
          File "C:\Users\theo\Downloads\saveddit-master\saveddit-master\saveddit\subreddit_downloader.py", line 74, in download
            for i, submission in enumerate(category_function(limit=post_limit)):
          File "C:\Users\theo\AppData\Local\Programs\Python\Python37\lib\site-packages\praw\models\listing\generator.py", line 63, in __next__
            self._next_batch()
          File "C:\Users\theo\AppData\Local\Programs\Python\Python37\lib\site-packages\praw\models\listing\generator.py", line 73, in _next_batch
            self._listing = self._reddit.get(self.url, params=self.params)
          File "C:\Users\theo\AppData\Local\Programs\Python\Python37\lib\site-packages\praw\reddit.py", line 566, in get
            return self._objectify_request(method="GET", params=params, path=path)
          File "C:\Users\theo\AppData\Local\Programs\Python\Python37\lib\site-packages\praw\reddit.py", line 672, in _objectify_request
            path=path,
          File "C:\Users\theo\AppData\Local\Programs\Python\Python37\lib\site-packages\praw\reddit.py", line 855, in request
            json=json,
          File "C:\Users\theo\AppData\Local\Programs\Python\Python37\lib\site-packages\prawcore\sessions.py", line 331, in request
            url=url,
          File "C:\Users\theo\AppData\Local\Programs\Python\Python37\lib\site-packages\prawcore\sessions.py", line 257, in _request_with_retries
            url,
          File "C:\Users\theo\AppData\Local\Programs\Python\Python37\lib\site-packages\prawcore\sessions.py", line 164, in _do_retry
            retry_strategy_state=retry_strategy_state.consume_available_retry(),  # noqa: E501
          File "C:\Users\theo\AppData\Local\Programs\Python\Python37\lib\site-packages\prawcore\sessions.py", line 257, in _request_with_retries
            url,
          File "C:\Users\theo\AppData\Local\Programs\Python\Python37\lib\site-packages\prawcore\sessions.py", line 164, in _do_retry
            retry_strategy_state=retry_strategy_state.consume_available_retry(),  # noqa: E501
          File "C:\Users\theo\AppData\Local\Programs\Python\Python37\lib\site-packages\prawcore\sessions.py", line 260, in _request_with_retries
            raise self.STATUS_EXCEPTIONS[response.status_code](response)
        prawcore.exceptions.InvalidToken: received 401 HTTP response

    This happened the first time at the 514th file and happened again when I retried.

    opened by reuspppp 2
  • Move client IDs and secrets to a separate configuration file

    Main issue: In your script files you store your client IDs and secrets as constants. This can pose a number of problems.

    Main problems:

    • Sensitive data exposure. Client IDs and secrets are considered rather sensitive data. Storing them as constants is highly discouraged, as basically anyone can get hold of them.
    • Difficult configuration. If you need to change or update your tokens, this is complicated for end users, because they have to edit the script files themselves, which is rather discouraging.
    • Code redundancy. You define your credentials twice, which makes them harder to change (you need to go to every file and update them manually, which is inefficient at best) and leaves you with duplicate variables.

    Solution: Move all this data into a separate configuration file (.yaml or .json) and create a function to parse it. That way you store all your data in one place and retrieve it via a simple function call, updating it becomes much simpler, the code gets a bit leaner, and end users feel more comfortable working with a configuration file than with the raw codebase.

    If you do not mind, please assign me to this issue.

    Thanks for this awesome project!

    enhancement 
    opened by NickolaiBeloguzov 2
  • Fix for bugs plaguing the last commit & added a way to update configuration from the first start

    This update includes:

    • Emergency fixes for the last commit, which made it impossible for the script to run in some cases.
    • Updates to both documentations (readme & readmepy).
    • A way for users to add their OAuth credentials right from the terminal on first start. (This is very useful for people who use Termux on Android and have difficulty editing the yaml file.)
    opened by Theoneflop 1
  • [Help] Comments Limits & Possible no-duplicate

    Hi! I just downloaded this and was wondering how I could remove the limit on how many comments it downloads? (The current limit is top comments only.)

    Also, I was wondering how I could prevent it from re-downloading posts I have already downloaded.

    Thanks!

    opened by Theoneflop 1
  • Added configuration file support

    All website IDs and secrets were moved to an external file called 'user_config.yaml'. A new saveddit.configuration module was also created to parse this configuration file.

    The module searches for the file and, if it does not find one, creates an empty configuration file, prompts the user to place their valid credentials into it, and exits the script.

    A new dependency was added: pyyaml==5.4.1

    'user_config.yaml' was added to .gitignore to prevent leaking sensitive data.

    opened by NickolaiBeloguzov 1
  • Comments limit argument & duplicates avoidance for subreddits.

    What does this pull request change in the code? Simple.

    • From now on, users can choose whether to download only top-level comments or the whole comment section of a post inside a subreddit by including the argument "--all-comments".
    • In case of a technical error / electricity shutdown / internet issues etc. while downloading a subreddit, saveddit will make sure not to re-download already downloaded posts when the same command is run with the same output directory.
    opened by Theoneflop 0
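
    In PRAW terms, the difference between top-level-only and the full comment section usually comes down to whether the "MoreComments" placeholders get expanded. The sketch below is a rough illustration of that distinction, not the code in this pull request; credentials are placeholders and the submission id is taken from the Windows issue further down.

        import praw

        reddit = praw.Reddit(client_id="...", client_secret="...", user_agent="example")
        submission = reddit.submission(id="haucpf")  # any submission id

        # Top-level comments only: iterate the forest without expanding "more" placeholders.
        top_level = [c for c in submission.comments if hasattr(c, "body")]

        # Whole comment section (what --all-comments wants): expand every placeholder first.
        submission.comments.replace_more(limit=None)
        all_comments = submission.comments.list()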
  • Make saveddit a CL-callable module

    Main issue: Using the python3 -m saveddit.saveddit [args] command is not very convenient, for multiple reasons.

    Reasons:

    • You need to be in the same directory as saveddit, which is an unnecessary step that can be eliminated
    • You need to call the module directly, which a) can be confusing and b) can be eliminated

    Solution: Make this module callable from anywhere by creating a setup.py script and assembling it into a Python package. This way you can make saveddit available for download via PyPI - the largest Python project repository - with a simple pip install saveddit command. To use the package, you just execute the saveddit [args] command without changing your working directory. Users can also easily update the package, and you can modify its contents with ease.

    opened by NickolaiBeloguzov 0
  • Need error handling or processing of non-media posts.

    Getting the following error occasionally:

         * This is a redgif link
           - Looking for submission.preview.reddit_video_preview.fallback_url
    Traceback (most recent call last):
      File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
        return _run_code(code, main_globals, None,
      File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
        exec(code, run_globals)
      File "/home/christopher/saveddit/saveddit/saveddit.py", line 65, in <module>
        main(args)
      File "/home/christopher/saveddit/saveddit/saveddit.py", line 31, in main
        downloader.download(args.o,
      File "/home/christopher/saveddit/saveddit/subreddit_downloader.py", line 141, in download
        self.download_gfycat_or_redgif(submission, files_dir)
      File "/home/christopher/saveddit/saveddit/subreddit_downloader.py", line 371, in download_gfycat_or_redgif
        if "reddit_video_preview" in submission.preview:
      File "/home/christopher/.local/lib/python3.8/site-packages/praw/models/reddit/base.py", line 35, in __getattr__
        return getattr(self, attribute)
      File "/home/christopher/.local/lib/python3.8/site-packages/praw/models/reddit/base.py", line 36, in __getattr__
        raise AttributeError(
    AttributeError: 'Submission' object has no attribute 'preview'
    
    
    opened by cmullins83 0
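
    PRAW raises AttributeError lazily when a submission carries no preview data, which is what the traceback above shows. A hedged sketch of one defensive pattern, not the project's actual fix; the helper name is illustrative.

        def preview_fallback_url(submission):
            # `submission` is a praw Submission; not every post has preview data, so guard
            # the lazy attribute access instead of letting AttributeError bubble up.
            preview = getattr(submission, "preview", None)
            if preview and "reddit_video_preview" in preview:
                return preview["reddit_video_preview"]["fallback_url"]
            return None  # treat it as a non-media post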
  • Scraping comments in order

    Does this library scrape comments of a given post in the order of their occurrence without messing with the hierarchy? The praw library helps in scraping all the comments but they are not in order. Please let me know if this library can do that and the command I should use.

    I used the command below and got an error:

    python3 -m bdfr download ./path/to/output --all-comments -l "https://www.reddit.com/r/germany/comments/yydfai/what_is_your_opinion_of_graffiti_all_over_walls/"

    Error: No such option: --all-comments

    Thank you

    opened by naveenmalla046 0
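
    For reference, PRAW's comment forest does preserve the thread structure. Below is a hedged sketch of a depth-first walk that keeps the hierarchy and an oldest-first order, independent of either saveddit or bdfr; credentials are placeholders and the URL is the one quoted above.

        import praw

        reddit = praw.Reddit(client_id="...", client_secret="...", user_agent="example")
        submission = reddit.submission(
            url="https://www.reddit.com/r/germany/comments/yydfai/"
                "what_is_your_opinion_of_graffiti_all_over_walls/")

        submission.comment_sort = "old"               # oldest first, i.e. order of occurrence
        submission.comments.replace_more(limit=None)  # pull in every "load more comments" stub

        def walk(comments, depth=0):
            # Depth-first traversal keeps each reply nested under its parent.
            for comment in comments:
                print("  " * depth + comment.body.replace("\n", " ")[:80])
                walk(comment.replies, depth + 1)

        walk(submission.comments)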
  • "..." in directory names doesn't work for Windows users.

    Line 41 in submission_downloader.py causes issues for Windows users because directories can't have "..." at the end of their names. For Windows users that line should be commented out.

    opened by doctorrmcb 1
  • After merging audio and video, audio and video stay around

    When downloading a file that splits audio and video into 2 files, the individual audio and video files stay around after saveddit merges them into 1 file. Is this intended functionality, or can there be an option to only keep the merged file when downloading?

    enhancement 
    opened by Kyle-Mickan 2
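
    For anyone who wants this behaviour today, a hedged sketch of deleting the part files only after the merged output exists; the filenames follow the pattern in the example tree above, and the cleanup itself is this editor's illustration, not something saveddit currently does (hence this issue).

        import os

        def cleanup_parts(merged_path, part_paths):
            # Only delete the separate audio/video downloads once the merged file really exists.
            if os.path.isfile(merged_path) and os.path.getsize(merged_path) > 0:
                for path in part_paths:
                    if os.path.isfile(path):
                        os.remove(path)

        cleanup_parts("66su4i9b32s61.mp4",
                      ["66su4i9b32s61_audio.mp4", "66su4i9b32s61_video.mp4"])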
  • FileNotFoundError when downloading a post with a title that is truncated on Windows

    System: Windows 10 64 bit. Python 3.9.5

    Steps to reproduce: Run this on Windows: saveddit subreddit pics -f top -l 5 -o .

    As of today, the top post is this: https://old.reddit.com/r/pics/comments/haucpf/ive_found_a_few_funny_memories_during_lockdown/ Trying to download it gives this output:

    #000 "I’ve found a few funny memories during lockdown. This is from my 1st tour in 89, backstage in Vegas."
         * Processing `https://i.redd.it/f58v4g8mwh551.jpg`
    Traceback (most recent call last):
      File "c:\users\bad_g\appdata\local\programs\python\python39\lib\runpy.py", line 197, in _run_module_as_main
        return _run_code(code, main_globals, None,
      File "c:\users\bad_g\appdata\local\programs\python\python39\lib\runpy.py", line 87, in _run_code
        exec(code, run_globals)
      File "C:\Users\bad_g\AppData\Local\Programs\Python\Python39\Scripts\saveddit.exe\__main__.py", line 7, in <module>
      File "c:\users\bad_g\appdata\local\programs\python\python39\lib\site-packages\saveddit\saveddit.py", line 346, in main    downloader.download(args.o,
      File "c:\users\bad_g\appdata\local\programs\python\python39\lib\site-packages\saveddit\subreddit_downloader.py", line 67, in download
        SubmissionDownloader(submission, i, self.logger, category_dir,
      File "c:\users\bad_g\appdata\local\programs\python\python39\lib\site-packages\saveddit\submission_downloader.py", line 68, in __init__
        files_dir = create_files_dir(submission_dir)
      File "c:\users\bad_g\appdata\local\programs\python\python39\lib\site-packages\saveddit\submission_downloader.py", line 62, in create_files_dir
        os.makedirs(files_dir)
      File "c:\users\bad_g\appdata\local\programs\python\python39\lib\os.py", line 225, in makedirs
        mkdir(name, mode)
    FileNotFoundError: [WinError 3] The system cannot find the path specified: '.\\www.reddit.com\\r\\pics\\top\\000_I_ve_found_a_few_funny_memories_...\\files'
    PS C:\Users\bad_g\Downloads\Saveddit>
    

    This is probably because Windows automatically removes the ellipsis at the end of the directory name. Maybe add an option to disable the truncation and/or simply drop the "..." appended to the directory name on Windows.

    opened by Satiriques 3
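
    Both this issue and the "..." issue above come down to Windows rejecting or silently stripping trailing dots and spaces in directory names. A hedged sketch of one way to sanitize the generated name, not the project's actual fix; the function name and fallback are illustrative.

        def windows_safe_dirname(name):
            # Windows silently drops trailing dots and spaces, so strip them up front and the
            # path used later will match the directory that actually gets created.
            return name.rstrip(". ") or "untitled"

        print(windows_safe_dirname("000_I_ve_found_a_few_funny_memories_..."))
        # -> 000_I_ve_found_a_few_funny_memories_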
  • Add support for the XDG Base Directory Specification

    This is a feature request for supporting the XDG Base Directory Specification.

    The specification works around a bug from the early UNIX v2 rewrite which caused files prefixed with a '.' to be hidden from the output of ls. While this "bug" has become a feature for some, it has also become a headache for users, because developers continue to assume HOME is a great place to dump configuration files and local caches.

    To address these issues, the XDG Base Directory specification was created to give developers a standard location for these files and to give users control over where they are placed in their HOME.

    If you were to support the XDG specification the following locations would change:

    Change ~/.saveddit/ to $XDG_CONFIG_HOME/saveddit and fall back to $HOME/.config/saveddit if XDG_CONFIG_HOME is not defined.

    opened by SaraSmiseth 0
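
    A hedged sketch of resolving the config directory along the lines described above, with a fallback to the current ~/.saveddit location so existing installs keep working; the fallback order is this editor's illustration, not a committed design.

        import os

        def config_dir():
            # Prefer $XDG_CONFIG_HOME/saveddit, fall back to ~/.config/saveddit, and keep
            # honouring a pre-existing ~/.saveddit so old installs are not broken.
            legacy = os.path.expanduser("~/.saveddit")
            if os.path.isdir(legacy):
                return legacy
            xdg = os.environ.get("XDG_CONFIG_HOME") or os.path.expanduser("~/.config")
            return os.path.join(xdg, "saveddit")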