Tool to scan for secret files on HTTP servers

Overview

snallygaster

Finds file leaks and other security problems on HTTP servers.

what?

snallygaster is a tool that looks for files accessible on web servers that shouldn't be public and can pose a security risk.

Typical examples include publicly accessible git repositories and backup files that may contain passwords or database dumps. In addition, it contains a few checks for other security vulnerabilities.
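The core technique behind most of these tests is simple: first learn how the server answers requests for files that certainly don't exist, then probe a list of known-risky paths and report the ones that unexpectedly come back. A minimal sketch of that idea using urllib3, one of snallygaster's dependencies (illustrative code and paths, not the tool's actual implementation):

import random
import string

import urllib3

http = urllib3.PoolManager()

def has_real_404(base):
    # Request a random path that cannot exist; servers that answer 200 to
    # everything ("soft 404") would turn every probe into a false positive.
    rand = "".join(random.choices(string.ascii_lowercase, k=8))
    return http.request("GET", f"{base}/{rand}.htm").status == 404

def probe(base, path):
    # A path counts as a potential leak if it unexpectedly returns 200.
    return http.request("GET", f"{base}{path}").status == 200

if has_real_404("https://example.org"):
    for path in ("/.git/config", "/backup.sql", "/.env"):
        if probe("https://example.org", path):
            print(f"possible leak: {path}")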

As an introduction to these kinds of issues, you may want to watch this talk:

See the TESTS.md file for an overview of all tests and links to further information about the issues.

install

snallygaster is available via PyPI:

pip3 install snallygaster

It's a simple Python 3 script, so you can also just download the file "snallygaster" and execute it. Dependencies are urllib3, beautifulsoup4 and dnspython. On Debian- or Ubuntu-based distributions you can install them via:

apt install python3-dnspython python3-urllib3 python3-bs4

distribution packages

Some Linux and BSD systems ship snallygaster as a package.

faq

Q: I want to contribute / send a patch / a pull request!

A: That's great, but please read the CONTRIBUTIONS.md file.

Q: What's that name?

A: Snallygaster is the name of a dragon that, according to some legends, was seen in Maryland and other parts of the US. There's no particular backstory for why the tool got this name, other than that I was looking for a fun and interesting name.

I thought the name of some mythical creature would be nice, but most of those had the problem that I would have had name collisions with other software. Checking the list of dragons on Wikipedia, I learned about the Snallygaster. The name sounded funny, the idea that there are dragon legends in the US was interesting, and I found no other piece of software with that name.

credit and thanks

  • Thanks to Tim Philipp Schäfers and Sebastian Neef from the Internetwache for plenty of ideas about things to look for.
  • Thanks to Craig Young for many discussions during the development of this script.
  • Thanks to Sebastian Pipping for some help with Python programming during the development.
  • Thanks to Benjamin Balder Bach for teaching me lots of things about Python packaging.
  • Thanks to the organizers of Bornhack, Driving IT, SEC-T and the Rights and Freedom track at 34C3 for letting me present this work.

author

snallygaster is developed and maintained by Hanno Böck.

Comments
  • Add check for apache server info

    I suggest adding a check for Apache's server-info and perl-status handlers.

    See the following links for details: https://httpd.apache.org/docs/2.4/mod/mod_info.html and https://perl.apache.org/docs/2.0/api/Apache2/Status.html

    opened by security-companion 12
  • Vulnerability: local denial of service (DoS) attack.

    When snallygaster is scanning a website, the client can be attacked by the server and forced to consume all available CPU resources. This attack works by exploiting a ReDoS vulnerability in the regex used by the heartbleed check.

    Vulnerable code: https://github.com/hannob/snallygaster/blob/master/snallygaster#L431

    Here you can see the performance impact per 3 kB sent by the server: asciicast

    poc.py:

    import time
    import re
    import sys

    # One letter followed by N-1 spaces, with N taken from the command line
    data = 'a' + (' ' * (int(sys.argv[1]) - 1))
    print('Checking {}Kb of data'.format(len(data) / 1000))

    start = time.time()
    # The regex from the heartbleed check: the overlapping " *" and " +"
    # quantifiers backtrack heavily when no second word follows the spaces
    regex = re.compile("^[a-zA-Z]+(-[a-zA-Z]+)? *( +[a-zA-Z]+(-[a-zA-Z]+)? *)+$")
    regex.match(data)
    print("Checked regex in: %dms" % ((time.time() - start) * 1000))
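
    For comparison, here is a sketch of a possible backtracking-safe rewrite (my assumption, not the project's actual fix): giving each space run exactly one owning quantifier removes the ambiguous splitting that makes the match time superlinear.

    import re
    import time

    VULN = re.compile(r"^[a-zA-Z]+(-[a-zA-Z]+)? *( +[a-zA-Z]+(-[a-zA-Z]+)? *)+$")
    # Intended to accept the same inputs (two or more space-separated words,
    # optional trailing spaces) without the overlapping " *"/" +" quantifiers:
    SAFE = re.compile(r"^[a-zA-Z]+(?:-[a-zA-Z]+)?(?: +[a-zA-Z]+(?:-[a-zA-Z]+)?)+ *$")

    payload = "a" + " " * 5000  # one word, then only spaces: worst case for VULN

    for name, rx in (("vulnerable", VULN), ("rewritten", SAFE)):
        start = time.time()
        rx.match(payload)
        print("%s: %dms" % (name, (time.time() - start) * 1000))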
    
    opened by menzow 10
  • Add test for common backup archive files

    This pull request adds a simple check for common backup archive files. The list of files consists of backup.zip, www.zip, wwwroot.zip, backup.tar.gz, www.tar.gz and wwwroot.tar.gz (the file names were inspired by https://github.com/unamer/CTFHelper/blob/master/CTFhelper.py#L82). Even though these files exist only rarely (approx. 0.1% of the checked hostnames), the security implications of a found backup are huge. A backup archive not only contains source code that provides insight into the site structure, it may also contain secret keys, database passwords or database dumps. Additionally, the test is inexpensive, requiring only six HTTP requests.
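
    For illustration, such a test could look roughly like this (a sketch; `fetch` stands in for the tool's real HTTP helper and is not snallygaster's actual API):

    ARCHIVE_FILES = ("backup.zip", "www.zip", "wwwroot.zip",
                     "backup.tar.gz", "www.tar.gz", "wwwroot.tar.gz")

    def test_backup_archive(url, fetch):
        # Six requests per host; report any archive the server actually serves.
        for fname in ARCHIVE_FILES:
            response = fetch("%s/%s" % (url, fname))
            if response is not None and response.status == 200:
                print("backup_archive: %s/%s" % (url, fname))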

    opened by timonegk 8
  • dns.resolver.query() causes deprecation warning when using dnspython >=2.0.0

    In dnspython 2.0.0, dns.resolver.query() has been deprecated and dns.resolver.resolve() should be used instead. Therefore, with dnspython >= 2.0.0, snallygaster causes deprecation warnings. However, using dns.resolver.resolve() would make the tool incompatible with dnspython < 2.0.0. If that is not an issue, the following patch would remove the warnings:

    diff --git a/snallygaster b/snallygaster
    index 8f32ff6..2dcdb3c 100755
    --- a/snallygaster
    +++ b/snallygaster
    @@ -215,7 +215,7 @@ def dnscache(qhost):
         except OSError:
             pass
         try:
    -        dnsanswer = dns.resolver.query(qhost, 'A')
    +        dnsanswer = dns.resolver.resolve(qhost, 'A')
         except (dns.exception.DNSException, ConnectionResetError):
             dns_cache[qhost] = None
             return None
    @@ -738,7 +738,7 @@ def test_wpdebug(url):
     @HOSTNAME
     def test_axfr(qhost):
         try:
    -        ns = dns.resolver.query(qhost, 'NS')
    +        ns = dns.resolver.resolve(qhost, 'NS')
         except (dns.exception.DNSException, ConnectionResetError):
             return
         for r in ns.rrset:
    

    If you want to apply this, I can create a PR.
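
    Alternatively, a small compatibility shim could support both major versions at once. A sketch (the helper name is hypothetical, not part of snallygaster):

    import dns.resolver

    def resolve_compat(qhost, rdtype):
        # dnspython >= 2.0 provides resolve(); older versions only have query().
        func = getattr(dns.resolver, "resolve", None) or dns.resolver.query
        return func(qhost, rdtype)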

    opened by mgollo 7
  • Add minimalistic docker image

    Makes the project easy to run with some minimum level of isolation.

    Alpine base image keeps things small & safe:

    REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
    snallygaster        latest              b41c47c92bfc        5 minutes ago       98.2MB
    

    Usage:

    $ docker run -ti --rm snallygaster /app/snallygaster --help
    usage: snallygaster [-h] [-t TESTS] [--useragent USERAGENT] [--nowww]
                        [--nohttp] [--nohttps] [-i] [-n] [-j] [-d]
                        [hosts [hosts ...]]
    
    positional arguments:
      hosts                 hostname to scan
    
    optional arguments:
      -h, --help            show this help message and exit
      -t TESTS, --tests TESTS
                            comma-separated tests to run.
      --useragent USERAGENT 
                            User agent to send
      --nowww               skip scanning www.[host]
      --nohttp              Don't scan http
      --nohttps             Don't scan https
      -i, --info            Enable all info tests (no bugs/security
                            vulnerabilities)
      -n, --noisy           show noisy messages that indicate boring bugs, but no
                            security issue
      -j, --json            produce JSON output
      -d, --debug           show detailed debugging info
    

    If you're happy with this, it's easy to publish it on Docker Hub or on quay.io.

    opened by pieterlange 7
  • Unhandled Exception

    Reporting a bug, as asked by the script :)

    Installed using pip3

    This was run as a test between two Raspberry Pis, with apache2 running on the target. Not sure what else you need; here's the dump from the SSH session.

    pi@weather:~ $ snallygaster -i -d http://192.168.1.101/check
    [[debug]] All hosts: http://192.168.1.101/check,www.http://192.168.1.101/check
    [[debug]] Scanning http://192.168.1.101/check
    [[debug]] running test_ilias_defaultpw test
    [[debug]] running test_symphony_databases_yml test
    [[debug]] running test_svn_dir test
    [[debug]] running test_sql_dump test
    [[debug]] checking 404 page state of http://http://192.168.1.101/check/cctcbcvr.htm
    [[debug]] checking 404 page state of https://http://192.168.1.101/check/kxyoqvxm.htm
    [[debug]] running test_filezilla_xml test
    [[debug]] running test_invalidsrc test
    [[debug]] running test_ds_store test
    [[debug]] running test_rails_database_yml test
    [[debug]] running test_cvs_dir test
    [[debug]] running test_axfr test
    [[debug]] running test_privatekey test
    [[debug]] running test_git_dir test
    [[debug]] running test_sftp_config test
    [[debug]] running test_xaa test
    [[debug]] running test_idea test
    [[debug]] running test_cgiecho test
    [[debug]] running test_drupal_backup_migrate test
    [[debug]] running test_phpunit_eval test
    Oh oh... an unhandled exception has happened. This shouldn't be.
    Please report a bug and include all output.

    called with /usr/local/bin/snallygaster -i -d http://192.168.1.101/check

    Traceback (most recent call last):
      File "/usr/local/bin/snallygaster", line 686, in <module>
        test("http://" + host)
      File "/usr/local/bin/snallygaster", line 562, in test_phpunit_eval
        body='<?php echo(substr_replace("hello", "12", 2, 2));')
      File "/usr/lib/python3/dist-packages/urllib3/request.py", line 72, in request
        **urlopen_kw)
      File "/usr/lib/python3/dist-packages/urllib3/request.py", line 135, in request_encode_body
        **urlopen_kw)
    TypeError: urlopen() got multiple values for keyword argument 'body'

    pi@weather:~ $ pip3 --version
    pip 1.5.6 from /usr/lib/python3/dist-packages (python 3.4)

    Let me know if you need anything else. I assume I have the command line parameters correct :)

    opened by nimzoking 5
  • Test: Nette Framework - config.neon

    Nette (https://nette.org/en/) is a very popular PHP web framework. It stores its configuration in the config.neon file. This file should not be accessible via the browser, but many sites don't follow best practices.

    opened by lynt-smitka 4
  • Unhandled exception

    When running ./snallygaster -d hkk.de, snallygaster crashes with the following output:

    [[debug]] All hosts: hkk.de,www.hkk.de
    [[debug]] Scanning hkk.de
    [[debug]] Running test_lfm_php test
    [[debug]] Running test_idea test
    [[debug]] Running test_symfony_databases_yml test
    [[debug]] Running test_rails_database_yml test
    [[debug]] Running test_git_dir test
    [[debug]] Running test_svn_dir test
    [[debug]] Running test_apache_server_status test
    [[debug]] Running test_coredump test
    [[debug]] Running test_sftp_config test
    [[debug]] Running test_wsftp_ini test
    [[debug]] Running test_filezilla_xml test
    [[debug]] Running test_winscp_ini test
    [[debug]] Running test_ds_store test
    [[debug]] Running test_php_cs_cache test
    [[debug]] Running test_backupfiles test
    [[debug]] Checking 404 page state of http://hkk.de/stutpqto.htm
    [[debug]] Checking 404 page state of https://hkk.de/eyahirwt.htm
    [[debug]] Running test_backup_archive test
    [[debug]] Running test_deadjoe test
    [[debug]] Running test_sql_dump test
    [[debug]] Running test_bitcoin_wallet test
    [[debug]] Running test_drupal_backup_migrate test
    [[debug]] Running test_magento_config test
    [[debug]] Running test_xaa test
    [[debug]] Running test_optionsbleed test
    [[debug]] Running test_privatekey test
    [[debug]] Running test_sshkey test
    [[debug]] Running test_dotenv test
    [[debug]] Running test_invalidsrc test
    [[debug]] Running test_ilias_defaultpw test
    [[debug]] Running test_cgiecho test
    [[debug]] Running test_phpunit_eval test
    [[debug]] Running test_acmereflect test
    [[debug]] Running test_drupaldb test
    [[debug]] Running test_phpwarnings test
    [[debug]] Running test_adminer test
    [[debug]] Running test_elmah test
    [[debug]] Running test_citrix_rce test
    [[debug]] Running test_installer test
    [[debug]] Running test_wpsubdir test
    [[debug]] Running test_axfr test
    /home/osboxes/snallygaster/./snallygaster:706: DeprecationWarning: please use dns.resolver.resolve() instead
      ns = dns.resolver.query(qhost, 'NS')
    Oh oh... an unhandled exception has happened. This shouldn't be.
    Please report a bug and include all output.
    
    called with
    ./snallygaster -d hkk.de
    
    Traceback (most recent call last):
      File "/usr/lib/python3.9/site-packages/dns/inet.py", line 87, in af_for_address
        dns.ipv4.inet_aton(text)
      File "/usr/lib/python3.9/site-packages/dns/ipv4.py", line 52, in inet_aton
        raise dns.exception.SyntaxError
    dns.exception.SyntaxError: Text input is malformed.
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/usr/lib/python3.9/site-packages/dns/inet.py", line 91, in af_for_address
        dns.ipv6.inet_aton(text, True)
      File "/usr/lib/python3.9/site-packages/dns/ipv6.py", line 165, in inet_aton
        raise dns.exception.SyntaxError
    dns.exception.SyntaxError: Text input is malformed.
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/home/osboxes/snallygaster/./snallygaster", line 916, in <module>
        test(host)
      File "/home/osboxes/snallygaster/./snallygaster", line 712, in test_axfr
        axfr = dns.zone.from_xfr(dns.query.xfr(r, qhost))
      File "/usr/lib/python3.9/site-packages/dns/zone.py", line 1184, in from_xfr
        for r in xfr:
      File "/usr/lib/python3.9/site-packages/dns/query.py", line 919, in xfr
        (af, destination, source) = _destination_and_source(where, port,
      File "/usr/lib/python3.9/site-packages/dns/query.py", line 226, in _destination_and_source
        af = dns.inet.af_for_address(where)
      File "/usr/lib/python3.9/site-packages/dns/inet.py", line 94, in af_for_address
        raise ValueError
    ValueError
    

    OS: Fedora 33 kernel: 5.8.18-300.fc33.x86_64

    opened by KommX 3
  • Not finding server-status

    I have done some tests and the Apache server-status URL needs to end with a "/". If it is not present, the page doesn't show up.

    opened by Themercee 3
  • base URL

    Hi Hanno,

    Often web servers have a mapping that routes requests to different backends depending on the URL. In those cases they work as a proxy.

    E.g. by supplying /app you may hit a Tomcat, by /main a Node/Express server, and under /whatever yet another server. The bottom line is that the user should be able to supply a base URL to which all test paths are appended.

    Q: If it's just one server and / is redirected to /someotherpath, does snallygaster follow the redirect?

    Cheers, Dirk
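
    For illustration, base-URL support would mostly come down to careful path joining; note how urllib.parse.urljoin treats a base without a trailing slash (a sketch of the pitfall, not existing snallygaster behavior):

    from urllib.parse import urljoin

    # With a trailing slash, test paths land under the base path:
    print(urljoin("https://example.org/app/", ".git/config"))
    # -> https://example.org/app/.git/config

    # Without it, the last path segment gets replaced:
    print(urljoin("https://example.org/app", ".git/config"))
    # -> https://example.org/.git/config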

    opened by drwetter 3
  • Add fast and extensive scanning options.

    I like the sound of increasing the wordlist and adding flags to pick between a fast scan and a more extensive scan that includes the entire wordlist; a sketch of such flag handling follows the list below. This would allow you to extend the current tests with megplus's list (note that some of these tests are already performed by snallygaster):

    /.AppleDB
    /.aws.yml
    /.aws/.credentials.swp
    /.aws/credentials
    /.babelrc
    /.bash_history
    /.bash_profile
    /.bashrc
    /.bowerrc
    /.bzr/repository/format
    /.cvsignore
    /.dockerignore
    /.DS_Store
    /.editorconfig
    /.env
    /.git/config
    /.git/HEAD
    /.gitconfig
    /.gitignore
    /.gitlab-ci.yml
    /.hg
    /.hg/branch
    /.hgignore
    /.htaccess
    /.htpasswd
    /.idea
    /.idea/.rakeTasks
    /.idea/dataSources
    /.idea/dataSources.local.xml
    /.idea/dataSources.xml
    /.idea/modules.xml
    /.idea/vcs.xml
    /.idea/workspace.xml
    /.jestrc
    /.jshintrc
    /.keys.yml
    /.keys.yml.swp
    /.muttrc
    /.mysql_history
    /.nbproject
    /.netrc
    /.npmignore
    /.npmrc
    /.pgpass
    /.profile
    /.psql_history
    /.s3.yml
    /.sh_history
    /.ssh
    /.ssh/authorized_keys
    /.ssh/id_dsa
    /.ssh/id_dsa.pub
    /.ssh/id_rsa
    /.ssh/id_rsa.pub
    /.ssh/known_hosts
    /.svn/all-wcprops
    /.svn/entries
    /.svn/format
    /.svn/wc.db
    /.svnignore
    /.swp
    /.terraform.tfstate.swp
    /.terraform.tfvars.swp
    /.travis.composer.config.json
    /.travis.yml
    /.travis.yml.swp
    /.wp-config.php
    /.wp-config.php.swp
    /.zsh_history
    /.zsh_profile
    /.zshrc
    /_admin/operations.aspx
    /_vti_bin/admin.asmx
    /admin
    /autoconfig
    /aws.yml
    /backup
    /backup.asp
    /backup.aspx
    /backup.do
    /backup.html
    /backup.jsp
    /backup.php
    /backup.txt
    /backup/
    /beans
    /bower.json
    /build.xml
    /cgi-bin/printenv.pl
    /cgi-bin/status.pl
    /cgi-bin/test-cgi.pl
    /circle.yml
    /composer.json
    /composer.lock
    /config
    /config.gypi
    /config.json
    /configprops
    /CVS/Entries
    /CVS/Root
    /cvsroot/CVSROOT
    /cvsroot/CVSROOT/val-tags
    /debug
    /debug.asp
    /debug.aspx
    /debug.do
    /debug.html
    /debug.jsp
    /debug.php
    /debug.txt
    /debug/
    /Dockerfile
    /dump
    /e2e-tests
    /env
    /examples/jsp/error/error.html
    /examples/jsp/num/numguess.jsp
    /examples/servlet/HelloWorldExample
    /features
    /flex
    /Gemfile
    /Gemfile.lock
    /gruntfile.coffee
    /Gruntfile.coffee
    /gruntfile.js
    /Gruntfile.js
    /Gulpfile
    /Gulpfile.js
    /gulpfile.js
    /index.asp
    /index.aspx
    /index.jsp
    /index.php
    /index.txt
    /info
    /info.asp
    /info.aspx
    /info.do
    /info.html
    /info.jsp
    /info.php
    /info.txt
    /info/
    /invoker/EJBInvokerServlet
    /invoker/JMXInvokerServlet
    /Jenkinsfile
    /jmx-console/HtmlAdaptor
    /karma.conf.js
    /keys.yml
    /license
    /LICENSE
    /license.md
    /LICENSE.md
    /LICENSE.txt
    /license.txt
    /Makefile
    /metrics
    /mkdocs.yml
    /nginx_status
    /npm-debug.log
    /npm-shrinkwrap.json
    /package.json
    /pagespeed_admin
    /php.php
    /phpinfo.php
    /phptest.php
    /phpunit.xml
    /readme
    /README
    /readme.html
    /README.html
    /readme.md
    /README.md
    /readme.mkd
    /README.mkd
    /README.txt
    /readme.txt
    /robots.txt
    /routes
    /s3.yml
    /s3.yml.swp
    /server-info
    /server-status
    /serverinfo
    /tags
    /terraform.tfstate
    /terraform.tfstate.backup
    /terraform.tfvars
    /terraform.tfvars.json
    /test
    /test.asp
    /test.aspx
    /test.do
    /test.html
    /test.jsp
    /test.php
    /test.txt
    /test/
    /tests
    /Thumbs.db
    /tmp
    /tmp.asp
    /tmp.aspx
    /tmp.do
    /tmp.html
    /tmp.jsp
    /tmp.php
    /tmp.txt
    /tmp/
    /tomcat-docs/appdev/sample/web/hello.jsp
    /trace
    /travis.yml
    /tsconfig.json
    /unit-tests
    /Vagrantfile
    /web-console/AOPBinding.jsp
    /web-console/applet.jsp
    /web-console/Invoker
    /web-console/listMonitors.jsp
    /web-console/ServerInfo.jsp
    /web-console/status
    /web-console/SysProperties.jsp
    /web-console/WebModule.jsp
    /WEB-INF/struts-config.xml
    /WEB-INF/web.xml
    /web.config
    /web.xml
    /webpack.config.js
    /wp-config.php
    /yarn-debug.log
    /yarn-error.log
    /yarn.lock
    /zephyr
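
    A sketch of how such flag handling could be wired up (illustrative names and paths, not snallygaster's actual CLI):

    import argparse

    QUICK_PATHS = ["/.git/config", "/.env", "/server-status"]
    EXTENSIVE_PATHS = QUICK_PATHS + ["/.aws/credentials", "/composer.json",
                                     "/Dockerfile", "/phpinfo.php"]

    parser = argparse.ArgumentParser()
    parser.add_argument("--extensive", action="store_true",
                        help="probe the full wordlist instead of the quick default")
    args = parser.parse_args()

    paths = EXTENSIVE_PATHS if args.extensive else QUICK_PATHS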
    
    opened by EdOverflow 3
  • License change to 0BSD

    There is some controversy about the CC0 license and its patent clause, which means it's not OSI-approved and not recommended by the FSF; recently, Fedora decided to disallow CC0 code: https://lwn.net/Articles/902410/

    In summary: while the choice of CC0 was intended to make use of the code as easy as possible, in practice it does the opposite. I would therefore like to change the license to 0BSD, which I believe is a license in the same spirit, but widely accepted as a good FOSS license (it's just a standard disclaimer, with no restrictions on reuse): https://opensource.org/licenses/0BSD

    So I'd like to ask all contributors whether they agree to this change. While some contributions may be too small to justify copyright, I would prefer to get approval from everyone, as this is certainly the legally safest option.

    Tagging everyone who made a pull request in the past that got merged. Please just post "I agree" on this issue if you agree to the license change.

    @security-companion @timonegk @jopi2016 @sebix @lynt-smitka @mohdshakir @cfi-gb @gvarisco @roman-mueller @gabeguz @pieterlange @ppepos @undergroundwires @wireghoul

    opened by hannob 10
  • Fix crash due to DNS timeout

    Fixes crash due to dns.resolver.LifetimeTimeout: The resolution lifetime expired after 19.275 seconds.

    snallygaster crashes during the axfr test if the target's DNS server responds slowly. Instead of crashing completely, it should simply skip the test.

    During the crash, snallygaster logs the following to stderr:

    Traceback (most recent call last):
      File "/tools/snallygaster/bin/snallygaster", line 1022, in <module>
        test(host)
      File "/tools/snallygaster/bin/snallygaster", line 801, in test_axfr
        ipv4 = dns.resolver.resolve(r, 'a').rrset
      File "/tools/snallygaster/lib/python3.10/site-packages/dns/resolver.py", line 1193, in resolve
        return get_default_resolver().resolve(qname, rdtype, rdclass, tcp, source,
      File "/tools/snallygaster/lib/python3.10/site-packages/dns/resolver.py", line 1066, in resolve
        timeout = self._compute_timeout(start, lifetime,
      File "/tools/snallygaster/lib/python3.10/site-packages/dns/resolver.py", line 879, in _compute_timeout
        raise LifetimeTimeout(timeout=duration, errors=errors)
    dns.resolver.LifetimeTimeout: The resolution lifetime expired after 19.275 seconds: 
    

    And the following to stdout:

    Oh oh... an unhandled exception has happened. This shouldn't be.
    Please report a bug and include all output.
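
    A minimal sketch of the kind of guard this needs (dns.resolver.LifetimeTimeout derives from dns.exception.Timeout, itself a DNSException, so the broad except below also covers the timeout):

    import dns.exception
    import dns.resolver

    def resolve_a_safely(name):
        # Return the A rrset, or None on failure or timeout, so a slow
        # nameserver skips the test instead of crashing the whole scan.
        try:
            return dns.resolver.resolve(name, "A").rrset
        except dns.exception.DNSException:
            return None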
    
    opened by undergroundwires 0
  • Extend svn and add cvs test

    I compared the ZAP proxy plugin for finding hidden files with snallygaster and found some differences, so I created this PR to add some missing checks.

    opened by security-companion 6
  • Wait time between requests

    Hi, have you thought about adding an option to enable a wait time between requests, e.g. to reduce server load or to avoid triggering a WAF? If you agree with such a feature, I could work on a pull request. Greetings
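
    A sketch of what such an option could look like (the delay parameter is hypothetical, not an existing snallygaster option; `fetch` stands in for the tool's HTTP helper):

    import time

    def fetch_with_delay(urls, fetch, delay=0.5):
        # Sleep between probes to reduce server load and avoid tripping a WAF.
        results = []
        for url in urls:
            results.append(fetch(url))
            time.sleep(delay)
        return results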

    opened by security-companion 0
  • stuck invalidsrc check with streaming responses

    Hi, I have one server where snallygaster fails to run because it encounters a response with content-type multipart/x-mixed-replace.

    Maybe snallygaster should just do a HEAD request, or at least apply a timeout to invalidsrc requests?
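
    A sketch of a bounded fetch with urllib3 (snallygaster's HTTP dependency); the hard timeout and preload_content=False keep a multipart/x-mixed-replace stream from holding the scan open (illustrative code, not the tool's actual fetch logic):

    import urllib3

    http = urllib3.PoolManager()

    def fetch_bounded(url, limit=1024 * 1024):
        # Stream the body and read at most `limit` bytes, then stop.
        r = http.request("GET", url, timeout=urllib3.Timeout(total=10),
                         preload_content=False)
        body = r.read(limit)
        r.release_conn()
        return r.status, body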

    opened by mphilipps 2