Python script to download all images/webms of a 4chan thread

Overview

4chan-downloader

Python script to download all images/webms of a 4chan thread

Download Script

The main script is called inb4404.py and can be called like this: python inb4404.py [thread/filename]

usage: inb4404.py [-h] [-c] [-d] [-l] [-n] [-r] thread

positional arguments:
  thread              url of the thread (or filename; one url per line)

optional arguments:
  -h, --help          show this help message and exit
  -c, --with-counter  show a counter next the the image that has been
                      downloaded
  -d, --date          show date as well
  -l, --less          show less information (surpresses checking messages)
  -n, --use-names     use thread names instead of the thread ids
                      (...4chan.org/board/thread/thread-id/thread-name)
  -r, --reload        reload the queue file every 5 minutes
  -t, --title         save original filenames

You can parse a file instead of a thread url. In this file you can put as many links as you want, you just have to make sure that there's one url per line. A line is considered to be a url if the first 4 letters of the line start with 'http'.

If you use the --use-names argument, the thread name is used to name the respective thread directory instead of the thread id.

Thread Watcher

This is a work-in-progress script but basic functionality is already given. If you call the script like

python thread-watcher.py -b vg -q mhg -f queue.txt -n "Monster Hunter"

then it looks for all threads that include mhg inside the vg board, stores the thread url into queue.txt and adds /Monster-Hunter at the end of the url so that you can use the --use-names argument from the actual download script.

Legacy

The current scripts are written in python3, in case you still use python2 you can use an old version of the script inside the legacy directory.

Comments
  • "Something went wrong"

    Regardless of thread, board, script location, drive letter, attribute, or host machine the script returns "Something went wrong" and times out after several tries. image Above picture showing different threads and boards and different attribute flags, all three failing almost immediately. image Interestingly however it works fine on an arch install. I've tried it on two machines running Windows 10, reinstalled python 3 as well Not entirely sure why it's dropping almost immediately on Windows. Assume this is a Windows problem, but nothing changed since I used it yesterday. will close if I find a solution to using it on windows. sorry there isn't much information to go off of.

    opened by cherioux 16
  • multiple threads from file error

    multiple threads from file error

    File "Python\Python38-32\lib\multiprocessing\process.py", line 315, in _bootstrap self.run() Process Process-1: File "Python\Python38-32\lib\multiprocessing\process.py", line 108, in run self._target(*self._args, **self._kwargs) File "4chdl\inb4404.py", line 44, in download_thread if args.use_names or os.path.exists(os.path.join(workpath, 'downloads', board, thread_tmp)): AttributeError: 'NoneType' object has no attribute 'use_names' Traceback (most recent call last): File "Python\Python38-32\lib\multiprocessing\process.py", line 315, in _bootstrap self.run() File "Programs\Python\Python38-32\lib\multiprocessing\process.py", line 108, in run self._target(*self._args, **self._kwargs) File "4chdl\inb4404.py", line 44, in download_thread if args.use_names or os.path.exists(os.path.join(workpath, 'downloads', board, thread_tmp)): AttributeError: 'NoneType' object has no attribute 'use_names'

    opened by th3illu 8
  • Filenames...

    Filenames...

    I've run into a few issues with filenames. One of them isn't entirely Windows specific, and I have an idea of what needs to be done to fix it, but no idea how to. Duplicate filenames within a thread simply overwrite any preceding files. Usually Spoiler_Image or file.png.

    The other issue is Windows specific and I've managed to solve it at least locally by:

    from django.utils.text import get_valid_filename
    ...snip...
                    img_path = ntpath.join(directory, get_valid_filename(img))
    ...snip...
    

    The issue in question, filenames like this used to make the script halt on Windows:

    C:\vtai>inb4404.py -c -d -l -t https://boards.4channel.org/vt/thread/34806758
    [2022-10-11 07:50:46 PM] [  9/278] vt/34806758/[sound=files.catbox.moe%2F7a2f1v.m4a]{{takanashi kiara}}, {{{1girl}}}, {begging},[[[lipstick]]],[[[lip gloss]]],kusogaki,closed eyes,{{pov}}, {{incoming kiss}}, close-up, {{horny}}, blushing, solo, puffy sleeves, orange skirt, aqua choker, orange hair, ba.png
    Traceback (most recent call last):
      File "C:\vtai\inb4404.py", line 171, in <module>
        main()
      File "C:\vtai\inb4404.py", line 31, in main
        download_thread(thread, args)
      File "C:\vtai\inb4404.py", line 112, in download_thread
        with open(img_path, 'wb') as f:
    OSError: [Errno 22] Invalid argument: 'C:\\vtai\\downloads\\vt\\34806758\\[sound=files.catbox.moe%2F7a2f1v.m4a]{{takanashi kiara}}, {{{1girl}}}, {begging},[[[lipstick]]],[[[lip gloss]]],kusogaki,closed eyes,{{pov}}, {{incoming kiss}}, close-up, {{horny}}, blushing, solo, puffy sleeves, orange skirt, aqua choker, orange hair, ba.png'
    

    After importing it appears to work.

    opened by vt-idiot 5
  • Script can't download files from a thread with a other thread quoted (Cross-thread)

    Script can't download files from a thread with a other thread quoted (Cross-thread)

    NSFW

    How to reproduce:

    1° Try to download from this link: https://boards.4chan.org/gif/thread/13521437

    This error appears

    File TTT Part 2 Top Tier Titties \1527053877481.webm from https://i.4cdn.org/gif/1536882732639.webm UNKNOWN ERROR OCCURRED [WinError 183] cannot create an existing file: 'C:\\Users\\mario\\4channer\\TTT Part 2 Top Tier Titties '

    But the "TTT Part 2 Top Tier Tittiles" and the two .webm files doens't exist. Using a custom folder location doens't work.

    opened by ghost 5
  • Corrupt downloads

    Corrupt downloads

    Hi! Just downloaded your script but every file downloaded is corrupt. I picked a random thread https://boards.4chan.org/b/thread/573855146/pink-ids-will-have-pink-eye-for-the-rest-of-their, but also tried with some more with same result.

    Is there any kind of logging i can enable to help you sort this out, if you're not experiencing the same problem?

    Cheers! Luca

    opened by LucaNonato 4
  • auto download/ all images in one folder?

    auto download/ all images in one folder?

    i want all the images of a certian tag to go into just one folder instead of each individual one, is there a way to do that? also is there a feature to autodownload threads in the queue text file when one appears

    question 
    opened by azzerzzzeqwe 3
  • Logo for the Downloader Readme

    Logo for the Downloader Readme

    Hey, i would like to propose a Logo for the readme header and maybe even as icon for a PyQt GUI-version if that one ever gets a go. I have altered the original 4chan Logo.

    opened by ThisLimn0 2
  • invalid syntax?

    invalid syntax?

    I guess I miss something very simple, but cannot figure it out, so here goes nothing

    I followed the instructions to in README.md and typed

    $ python inb4404.py http://boards.4chan.org/w/thread/2096591/lain-thread-anyone-have-any-arisu

    The only result was:

    File "inb4404.py", line 129 print(line.replace(link, '-' + link), end='') ^ SyntaxError: invalid syntax

    And I'm too dumb to know if this is a typo to fix or I should use a different URL (direct one to the first pic in thread maybe?).

    (After hundreds of shameful edits): also the output points at end='' equation, but for love of god I have no idea how to paste it here so markdown does not cut out all spaces.

    opened by tcheerno 2
  • Rewrite/Convert to python3?

    Rewrite/Convert to python3?

    The 2to3 tool should automate this as far as possible. I think the script should be converted to python3 because this is the new standard, support for python2 will be dropped in near future.

    opened by KopfKrieg 2
  • Stuck on checking the first post of the thread

    Stuck on checking the first post of the thread

    The script keeps "checking" the first post of the thread specified as thread_link

    I'm using python 2.7.13, script doesn't work with python 3 because of an error at line 73.

    opened by elleborgo 2
  • Reloading the file doesn't work properly

    Reloading the file doesn't work properly

    Single thread and multiple thread (using a file) works just fine, but the script is unable to reload the file and update the queue properly as of now.

    opened by Exceen 2
  • Configurable threads

    Configurable threads

    I added a lot between commits, which was a mistake, but the rundown is that I re-worked the threaded downloads.

    Previously each 4chan thread would get its own process to check and download new images. I created a queue (manager.list) system and made static workers/processes. By default, 4 will be created, but that is configurable with -p. As 'jobs' are pulled from the queue, a worker thread will work on it, then wait until another job is available to pull from the queue.

    Let me know if you have any questions or want me to change anything.

    Thanks!

    opened by Zand3r24 4
Releases(1.0)
  • 1.0(Jul 23, 2018)

    • Supports downloading images and webm-files of single and multiple (multi-threading) 4chan-threads with continuous checks for new posts.
    • Assign names to threads to locate them easier on your hard drive
    • When using a file to download multiple threads at once, 404'd threads will be marked automatically with a "-" in front of the URL.
    • Separates downloads into a "download" directory which serves as archive and a "new" directory. Downloaded files are put into both directories. If a files is deleted inside the "download" directory is will be downloaded again. On the other hand, if a file inside the "new" directory is deleted it won't be downloaded again. This serves as an easy way to keep a whole thread archived and to track new downloads. Therefore, deleting a file inside the "new" directory serves as some kind of a "mark as read" feature.
    • Thread Watcher is more or less an Add-On for the actual download file which checks for threads including a specified text and adds the respective URL into a text file which can be used with the download script.
    Source code(tar.gz)
    Source code(zip)
Owner
Micha Fink
Micha Fink
python code used to download all images contained in a facebook uid , the uid can be profile,group,fanpage

python code used to download all images contained in a facebook uid , the uid can be profile,group,fanpage

VVHai 2 Dec 21, 2021
Python script to download entire campaign images and navigation.

Squidle campaign downloader Python script to download entire campaign images and navigation. usage: squidle_campaign_downloader.py [-h] [--api-token A

Miquel Massot 2 Nov 17, 2021
Simple Python script to download images and videos from public subreddits without using Reddit's API 😎

Subreddit Media Downloader Download images and videos from any public subreddit without using Reddit's API Made with ❤ by Nico ?? About: This script a

Nico 106 Jan 7, 2023
A Python script that allows you to download all of an anime's episodes at once.

BitAnime A Python script that allows you to download all of an anime's episodes at once. · Download executable version · About BitAnime BitAnime is a

sh1nobu 17 Aug 10, 2022
Python library to download bulk of images from Bing.com

Python library to download bulk of images form Bing.com. This package uses async url, which makes it very fast while downloading.

Guru Prasad Singh 105 Dec 14, 2022
bing image downloader app used to download bulk images for a specific search term created using streamlit and bing_image_downloader python packages

bing image downloader app bing image downloader app is used to download bulk images for a specific search term. bing image downloader app gets the sea

Siva Prakash 8 Apr 5, 2022
Download images where login is required using har python and js

이미지 다운로드(har, python, js 사용) 로그인이 필요한 사이트에서 DevTools로 이미지를 다운받는 방법은 조금 까다로웠다. 가장 쉽게 할 수 있는 방법을 찾아보았다. 사용법 F12를 눌러 DevTools를 실행 Network 탭으로 이동 페이지 새로고침

null 0 Jul 22, 2022
A tool written in Python to download all Snapmaps content from a specific location.

snapmap-archiver A tool written in Python to download all Snapmaps content from a specific location.

null 46 Dec 9, 2022
This is a python based web scraping bot for windows to download all ACCEPTED submissions of any user on Codeforces

CODEFORCES DOWNLOADER This is a python based web scraping bot for windows to download all ACCEPTED submissions of any user on Codeforces Requirements

Mohak 6 Dec 29, 2022
Used Insta Loader to download high quality images from instagram account

Insta Dp Downloader Project Description: In this project, I have used "Insta Loader" to download high quality images from instagram account. You only

Hassan Shahzad 3 Oct 31, 2022
Download YouTube videos/music and images in MP4, JPG with this tool.

ABOUT THE TOOL Download YouTube videos, music and images in MP4, JPG with this tool, with an easy to understand interface. This tool works with both,

TrollSkull 5 Jan 2, 2023
Download all your URI Online Judge source codes and upload to GitHub with simple steps.

URI-Code-Downloader Download all your URI Online Judge source codes and upload to GitHub with simple steps. Prerequisites Python 3.x Installing Downlo

Luan Simões 9 Mar 23, 2022
Download all games from a public Itch.io Game Jam

Itch Jam Downloader Downloads all games from a public Itch.io Game Jam. What you'll need: Python 3.8+ pip install -r requirements.txt For site mirrori

Dragoon Aethis 19 Dec 7, 2022
Download all games from a public Itch.io Game Jam

Itch Jam Downloader Downloads all games from a public Itch.io Game Jam. What you'll need: Python 3.8+ pip install -r requirements.txt For site mirrori

Dragoon Aethis 19 Dec 7, 2022
Download all posts and comments in a subreddit

subreddit downloader This subreddit downloader downloads all posts and comments in a subreddit For a tutorial to use this program please follow this m

Guneet 6 Dec 16, 2022
Download all posts and comments in a subreddit

subreddit downloader This subreddit downloader downloads all posts and comments in a subreddit For a tutorial to use this program please follow this m

Guneet 6 Dec 16, 2022
You Can download any video/image in all social medias very easy and High Speed.

All-Downloader You Can download any video/image in all social medias very easy and High Speed. also you can easily download videos from web browsers s

Razor Kenway 14 Dec 16, 2022
👻🟡 Download all Snapchat video & photo memories from a data export.

Snapchat "Memories" Fetcher In compliance with the California Consumer Privacy Act of 2018 (“CCPA”), businesses which collect and store user data must

Todd Birchard 18 Dec 26, 2022
Python script designed to search and fetch direct download links from nxbrew.com

SwitchGamesDownloader Only for windows nxbrew.com is a website, accessible only using a proxy, where the majority of games for the Nintendo Switch are

Backend 91 Dec 28, 2022