🕳️
CygnusX1
Code by Trong-Dat Ngo.
Overview
Key features
- 🥰 No prior knowledge is required to get up and running.
- 🚀 Download images using a customizable number of threads.
- ⛏️ Crawl all possible images (search results and recommendations).
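The threaded download feature can be pictured with a minimal sketch. This is not the code from this repository: the names `download_all` and `fake_fetch` are hypothetical, and the fetch function is passed in so the example runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor


def download_all(urls, fetch, workers=2):
    """Apply `fetch` to every URL using at most `workers` threads.

    Returns a dict mapping each URL to its result, or to the exception
    that `fetch` raised for it. (Hypothetical helper, for illustration.)
    """
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Submit everything first so the pool can run tasks concurrently.
        futures = {url: pool.submit(fetch, url) for url in urls}
        for url, future in futures.items():
            try:
                results[url] = future.result()
            except Exception as exc:  # one failed download should not stop the rest
                results[url] = exc
    return results


if __name__ == "__main__":
    # Stand-in fetcher so the sketch runs offline; a real one would
    # issue an HTTP request and write the image to disk.
    def fake_fetch(url):
        return len(url)

    print(download_all(["a.jpg", "bb.jpg"], fake_fetch, workers=8))
```

Raising `workers` lets more downloads overlap, which mostly helps because image fetching is I/O-bound.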
Installation
This repository is tested on Python 3.6+ and selenium 3.141.0+, and it works on macOS, Windows, and Linux.
You should set up and run CygnusX1 in a virtual environment.
First, create a virtual environment with the version of Python you're going to use and activate it. (You can skip this step if you prefer to install directly into the system environment.)
python3 -m venv venv
source venv/bin/activate
Then download the source code:
git clone https://github.com/dat821168/CygnusX1.git
Finally, install the dependencies listed in requirements.txt:
pip install -r requirements.txt
Run
Use run.py to start the script:
python run.py --keywords "keyword 1, keyword 2" --workers 8 --use_suggestions --headless
Argument details:
- --keywords: The keywords/keyphrases you want to search for. Separate multiple keywords with commas.
- --out_dir: Path where the results are saved. Default = './IMAGES'.
- --workers: The maximum number of workers used to crawl images. Default = 2.
- --use_suggestions: Also crawl search engine suggestions/recommendations. Default = False.
- --headless: Hide the browser during scraping. Default = False.
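As a rough illustration of how these options fit together, here is a hypothetical argparse sketch. It is not the actual contents of run.py: the helper names `build_parser` and `split_keywords` are made up, and treating --keywords as required is an assumption. Only the flag names and defaults come from the list above.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(description="Sketch of the documented options")
    # Assumption: keywords are mandatory; the README does not say so explicitly.
    parser.add_argument("--keywords", required=True,
                        help="comma-separated keywords/keyphrases")
    parser.add_argument("--out_dir", default="./IMAGES",
                        help="where to save results")
    parser.add_argument("--workers", type=int, default=2,
                        help="maximum number of crawl workers")
    # Boolean switches: absent means False, present means True.
    parser.add_argument("--use_suggestions", action="store_true")
    parser.add_argument("--headless", action="store_true")
    return parser


def split_keywords(raw):
    """Split a comma-separated string into trimmed, non-empty keywords."""
    return [k.strip() for k in raw.split(",") if k.strip()]


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["--keywords", "keyword 1, keyword 2", "--workers", "8",
         "--use_suggestions", "--headless"]
    )
    print(split_keywords(args.keywords), args.out_dir, args.workers)
```

Because --use_suggestions and --headless are modeled as switches, they take no value on the command line, which matches the example invocation above.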
Future Releases
- Support Google search engine.
- Support Bing search engine.
- Support Baidu search engine.