Overview

proxy-scraper-checker

License: MIT

Fast and configurable script that scrapes free HTTP, SOCKS4 and SOCKS5 proxies from different sources, checks them, and saves the working ones to files. The script can also determine each proxy's geolocation and check whether it is anonymous.

You can get ready-made proxy lists produced by this script in monosans/proxy-list (updated roughly every 15 minutes).

Requirements

  • maxminddb
  • requests
  • PySocks

Using this script

  • Make sure your Python version is >= 3.6.
  • Install libraries from requirements.txt.
  • Edit config.py according to your needs.
  • Run main.py.
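
For example, assuming pip and python point at a Python 3.6+ installation, a typical setup and run looks like this (a sketch, not exhaustive documentation):

    pip install -r requirements.txt
    python main.py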

Folders description

proxies - proxies of any anonymity level.

proxies_anonymous - anonymous proxies.

proxies_geolocation - same as proxies but including geolocation info.

proxies_geolocation_anonymous - same as proxies_anonymous but including geolocation info.
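
As an illustration of consuming the output, here is a minimal Python sketch that picks a random proxy from a saved list and routes a request through it with requests. The file name proxies/http.txt is an assumption; the actual output names depend on your config.py.

    import random

    import requests

    # Assumed layout: one "ip:port" per line in proxies/http.txt.
    with open("proxies/http.txt", encoding="utf-8") as f:
        proxy_list = [line.strip() for line in f if line.strip()]

    proxy = random.choice(proxy_list)
    proxies = {"http": "http://" + proxy, "https": "http://" + proxy}
    # httpbin.org/ip echoes the caller's IP, confirming the proxy is in use.
    print(requests.get("https://httpbin.org/ip", proxies=proxies, timeout=10).json())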

Buy me a coffee

If you want to thank me financially, ask for details on Telegram or VK.

License

Licensed under the MIT license.

This product includes GeoLite2 data created by MaxMind, available from http://www.maxmind.com.

Comments
  • Checking Download / Upload Speed

    Hello,

    would it be possible for you to add a function that checks the actual download and upload speed of the proxies? That way you could always take the fastest proxy from the output and use it. (A rough sketch of such a speed check appears after this comment list.)

    Thank you

    opened by AIQubeAI 3
  • Does not work

    F:\proxy>python main.py
    https://api.proxyscrape.com/v2/?request=getproxies&protocol=http | No proxies found | Status code 403
    https://api.proxyscrape.com/v2/?request=getproxies&protocol=socks4 | No proxies found | Status code 403
    https://api.proxyscrape.com/v2/?request=getproxies&protocol=socks5 | No proxies found | Status code 403
    https://raw.githubusercontent.com/hanwayTech/free-proxy-list/main/socks5.txt | No proxies found
    https://raw.githubusercontent.com/hanwayTech/free-proxy-list/main/socks4.txt | No proxies found
    https://proxysearcher.sourceforge.net/Proxy%20List.php?type=socks | No proxies found
    https://raw.githubusercontent.com/hanwayTech/free-proxy-list/main/https.txt | No proxies found
    https://raw.githubusercontent.com/hanwayTech/free-proxy-list/main/http.txt | No proxies found
    https://proxysearcher.sourceforge.net/Proxy%20List.php?type=socks | No proxies found
    Traceback (most recent call last):
      File "C:\Users\Users\AppData\Local\Programs\Python\Python310\lib\asyncio\runners.py", line 44, in run
        return loop.run_until_complete(main)
      File "C:\Users\Users\AppData\Local\Programs\Python\Python310\lib\asyncio\base_events.py", line 628, in run_until_complete
        self.run_forever()
      File "C:\Users\Users\AppData\Local\Programs\Python\Python310\lib\asyncio\base_events.py", line 595, in run_forever
        self._run_once()
      File "C:\Users\Users\AppData\Local\Programs\Python\Python310\lib\asyncio\base_events.py", line 1845, in _run_once
        event_list = self._selector.select(timeout)
      File "C:\Users\Users\AppData\Local\Programs\Python\Python310\lib\selectors.py", line 324, in select
        r, w, _ = self._select(self._readers, self._writers, [], timeout)
      File "C:\Users\Users\AppData\Local\Programs\Python\Python310\lib\selectors.py", line 315, in _select
        r, w, x = select.select(r, w, w, timeout)
    ValueError: too many file descriptors in select()

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "F:\proxy\main.py", line 403, in <module>
        asyncio.run(main())
      File "C:\Users\Users\AppData\Local\Programs\Python\Python310\lib\asyncio\runners.py", line 47, in run
        _cancel_all_tasks(loop)
      File "C:\Users\Users\AppData\Local\Programs\Python\Python310\lib\asyncio\runners.py", line 63, in _cancel_all_tasks
        loop.run_until_complete(tasks.gather(*to_cancel, return_exceptions=True))
      File "C:\Users\Users\AppData\Local\Programs\Python\Python310\lib\asyncio\base_events.py", line 628, in run_until_complete
        self.run_forever()
      File "C:\Users\Users\AppData\Local\Programs\Python\Python310\lib\asyncio\base_events.py", line 595, in run_forever
        self._run_once()
      File "C:\Users\Users\AppData\Local\Programs\Python\Python310\lib\asyncio\base_events.py", line 1845, in _run_once
        event_list = self._selector.select(timeout)
      File "C:\Users\Users\AppData\Local\Programs\Python\Python310\lib\selectors.py", line 324, in select
        r, w, _ = self._select(self._readers, self._writers, [], timeout)
      File "C:\Users\Users\AppData\Local\Programs\Python\Python310\lib\selectors.py", line 315, in _select
        r, w, x = select.select(r, w, w, timeout)
    ValueError: too many file descriptors in select()
    Scraper :: HTTP   ---------------------------------------- 100% 42/42
    Scraper :: SOCKS4 ---------------------------------------- 100% 21/21
    Scraper :: SOCKS5 ---------------------------------------- 100% 23/23
    Checker :: HTTP   ---------------------------------------- 0% 0/21368
    Checker :: SOCKS4 ---------------------------------------- 0% 0/7759
    Checker :: SOCKS5 ---------------------------------------- 0% 0/4395
    Exception ignored in: <coroutine object main at 0x0000020784BB02E0>
    Traceback (most recent call last):
      File "F:\proxy\main.py", line 369, in main
      File "F:\proxy\main.py", line 316, in main
      File "C:\Users\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\rich\progress.py", line 1178, in __exit__
      File "C:\Users\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\rich\progress.py", line 1164, in stop
      File "C:\Users\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\rich\live.py", line 147, in stop
      File "C:\Users\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\rich\console.py", line 869, in __exit__
      File "C:\Users\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\rich\console.py", line 827, in _exit_buffer
      File "C:\Users\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\rich\console.py", line 2017, in _check_buffer
    ImportError: sys.meta_path is None, Python is likely shutting down
    Exception ignored in: <coroutine object Proxy.check at 0x0000020784BB3BC0>
    Traceback (most recent call last):
      File "F:\proxy\main.py", line 242, in check_proxy
    RuntimeError: coroutine ignored GeneratorExit
    Exception ignored in: <coroutine object Proxy.check at 0x0000020784CE8350>
    Traceback (most recent call last):

    opened by Roboxkin 2
  • Sources

    New sources would be nice: lesser-known GitHub lists or obscure proxy websites would be good additions to the sources list. For SOCKS there are not many sources.

    opened by ImInTheICU 2
  • I get an error and it crashes

    Exception in callback _ProactorBasePipeTransport._call_connection_lost(None)
    handle: <Handle _ProactorBasePipeTransport._call_connection_lost(None)>
    Traceback (most recent call last):
      File "C:\Users\Administrator\AppData\Local\Programs\Python\Python38\lib\asyncio\events.py", line 81, in _run
        self._context.run(self._callback, *self._args)
      File "C:\Users\Administrator\AppData\Local\Programs\Python\Python38\lib\asyncio\proactor_events.py", line 162, in _call_connection_lost
        self._sock.shutdown(socket.SHUT_RDWR)
    ConnectionResetError: [WinError 10054] An existing connection was closed by the remote host

    opened by 9dl 2
  • Possible to scrape proxies without checking them?

    Thank you first of all.

    I wonder if it is possible to save proxies before checking them. I mean, is there any way to have 3 additional outputs like http_pre.txt, socks4_pre.txt and socks5_pre.txt?

    opened by paynemux 2
  • SOURCES

    Getting https://raw.githubusercontent.com/clarketm/proxy-list/master/proxy-list-raw.txt
    Getting https://raw.githubusercontent.com/TheSpeedX/PROXY-List/master/http.txt
    Getting https://raw.githubusercontent.com/mmpx12/proxy-list/master/http.txt
    Getting https://raw.githubusercontent.com/mmpx12/proxy-list/master/https.txt
    Getting https://raw.githubusercontent.com/ShiftyTR/Proxy-List/master/https.txt
    Getting https://raw.githubusercontent.com/ShiftyTR/Proxy-List/master/http.txt
    Getting https://api.proxyscrape.com/v2/?request=getproxies&protocol=http
    Checking proxies...
    Traceback (most recent call last):
      File "d:/proxy-scraper-checker-main/main.py", line 70, in <module>
        t.start()
      File "C:\Python38-32\lib\threading.py", line 852, in start
        _start_new_thread(self._bootstrap, ())
    RuntimeError: can't start new thread

    opened by evheniu 2
  • [WinError 10054] Windows 10

    Hi bro!! Thank you very much for your project.

    I want to report a problem I am having (screenshot attached).

    Describe the bug: it always shows [WinError 10054] when run. I tried setting MaxConnections = 800 but got the same result.

    Technical information

    • OS: Windows 10
    • Python version:
    • Output of pip freeze:
    aiohttp==3.8.3
    aiohttp-socks==0.7.1
    aiosignal==1.2.0
    asgiref==3.5.2
    async-timeout==4.0.2
    attrs==22.1.0
    certifi==2022.9.24
    cffi==1.15.1
    charset-normalizer==2.1.1
    commonmark==0.9.1
    Django==4.1.3
    django-embed-video==1.4.8
    easy-thumbnails==2.8.3
    frozenlist==1.3.1
    idna==3.4
    multidict==6.0.2
    numpy==1.23.5
    pandas==1.5.2
    paperclip==2.6.1
    Pillow==9.3.0
    prettytable==3.5.0
    pycares==4.2.2
    pycparser==2.21
    Pygments==2.13.0
    PyPDF4==1.27.0
    pyperclip==1.8.2
    python-dateutil==2.8.2
    python-socks==2.0.3
    pytz==2022.6
    pywin32==304
    requests==2.28.1
    rich==12.6.0
    six==1.16.0
    sqlparse==0.4.3
    tabulate==0.9.0
    tzdata==2022.7
    urllib3==1.26.13
    vboxapi==1.0
    wcwidth==0.2.5
    yarl==1.8.1
    
    Exception in callback _ProactorBasePipeTransport._call_connection_lost(None)
    handle: <Handle _ProactorBasePipeTransport._call_connection_lost(None)>
    Traceback (most recent call last):
      File "C:\Python311\Lib\asyncio\events.py", line 80, in _run
        self._context.run(self._callback, *self._args)
      File "C:\Python311\Lib\asyncio\proactor_events.py", line 162, in _call_connection_lost
        self._sock.shutdown(socket.SHUT_RDWR)
    ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host
    (the same traceback repeats many more times)

    What should I do?

    opened by Milor123 1
  • change api url

    Hello sir, I changed the API URL to http://ip-api.com/json/?fields=country,regionName,city,mobile,proxy,hosting,query, and then on line 62 of main.py I added data["country"], data["regionName"], data["city"], data["mobile"], data["proxy"], data["hosting"]

    I want to add new output fields, i.e. mobile, proxy and hosting. (A sketch of such a query appears after this comment list.)

    The app runs but there are some errors, and when it finishes the result is still the same.

    (screenshot attached)

    opened by sensei457 1
  • Add more sites to scrape

    Hello there,

    Check out https://geonode.com/free-proxy-list/; they usually list a good number of working proxies. Is it possible to implement it as well?

    opened by NightFuryPrime 1
  • user agent

    When I debugged the connection, I found out that the custom user-agent function is not working (screenshot attached).

    I also scraped with the user agent myself to make sure the problem is on the checker's side (screenshot attached).

    The API works properly when using a browser (screenshot attached).

    opened by rafalohaki 1
  • [scrape support website]

    Please add support for https://spys.one/en/free-proxy-list/ and https://proxypremium.top/. Here is an example scraping script for spys.one: https://github.com/motebaya/socks5-proxy/blob/main/main.py

    opened by rafalohaki 1
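
Regarding the "Checking Download / Upload Speed" request above: a minimal sketch, assuming requests is installed, of how download speed through a proxy could be estimated by timing a fixed-size download. The test URL is only a placeholder; substitute any stable file of known size.

    import time

    import requests

    def download_speed(proxy, url="http://speedtest.tele2.net/1MB.zip"):
        # Returns the approximate download speed in bytes per second
        # when fetching `url` through the given "ip:port" HTTP proxy.
        proxies = {"http": "http://" + proxy, "https": "http://" + proxy}
        start = time.perf_counter()
        content = requests.get(url, proxies=proxies, timeout=30).content
        return len(content) / (time.perf_counter() - start)

    # Example: keep the fastest proxy from a list of already-checked ones.
    # fastest = max(checked_proxies, key=download_speed)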
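
And for the "change api url" comment: a sketch of querying ip-api.com with the extra fields the commenter wants (mobile, proxy, hosting). The field names follow the public ip-api.com JSON API; wiring the values into main.py's output is left out.

    import requests

    FIELDS = "country,regionName,city,mobile,proxy,hosting,query"

    def geolocate(ip):
        # ip-api.com returns a JSON object restricted to the requested fields.
        url = "http://ip-api.com/json/" + ip + "?fields=" + FIELDS
        return requests.get(url, timeout=10).json()

    print(geolocate("8.8.8.8"))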