Python script to download hundreds of images from 'Google Images'. The code is ready to run!

Hardik Vasa

Last update: Jan 5, 2023

Related tags

Third-party APIs Wrappers python terminal command-line image-gallery python-script image-processing google-images image-search image-dataset command-line-tool image-download image-database image-scraper download-images color-filter

Overview

Google Images Download

Python Script for 'searching' and 'downloading' hundreds of Google images to the local hard disk!

Documentation

Disclaimer

This program lets you download tons of images from Google. Please do not download or use any image that violates its copyright terms. Google Images is a search engine that merely indexes images and allows you to find them. It does NOT produce its own images and, as such, it doesn't own copyright on any of them. The original creators of the images own the copyrights.

Images published in the United States are automatically copyrighted by their owners, even if they do not explicitly carry a copyright warning. You may not reproduce copyrighted images without their owner's permission, except in "fair use" cases, or you could risk running into lawyers' warnings, cease-and-desist letters, and copyright suits. Please be very careful before using them! Use this script/code only for educational purposes.

Comments

Looks like we cannot locate the path the 'chromedriver' if limit is included

OS: osx 10.12.6
Python version: 3.6.5
Issue steps: When running the CLI with the -l or --limit option specified, the following error is returned:

Looks like we cannot locate the path the 'chromedriver' (use the '--chromedriver' argument to specify the path to the executable.) or google chrome browser is not installed on your machine

Executing without -l works fine.

Attempted to use JSON input, but it still returns the same error. JSON file contents below:

{
    "Records": [
        {"keywords": "tops", "limit": 1000},
        {"keywords": "jacket", "limit": 1000}
    ]
}
question

opened by kevng9 31

>100 images on linux OS

Hello, loving this program - so useful, thanks for creating it!

I am having trouble setting it up to download more than 100 images. I installed via CLI and ran setup.py, so I am assuming, based on your instructions, that Selenium is installed. I also followed the geckodriver instructions, and that seems to have been successful. I'm running Linux Mint. Any support would be much appreciated!

$ googleimagesdownload -k "Siamese cat, Domestic cat" -l 199 -o "cat images" -f jpg

Item no.: 1 --> Item name = Siamese cat
Evaluating...
Starting Download...


Unfortunately all 199 could not be downloaded because some images were not downloadable. 0 is all we got for this search filter!

Item no.: 2 --> Item name =  Domestic cat
Evaluating...
Starting Download...


Unfortunately all 199 could not be downloaded because some images were not downloadable. 0 is all we got for this search filter!

Everything downloaded!
Total Errors: 0

Total time taken: 0.14542031288146973 Seconds

feature-request

opened by freekeys 17

fix newlines on table
This PR consists only of a fix for newlines in the table.

Most of the diff from the actual PR is in the readme file (before, after) and setup.py (before, after).

maybe squash the commits between https://github.com/hardikvasa/google-images-download/tree/618d1c2cf8842b7d8adaa518109f542bc90e81a8 and this pr after this pr is merged?

pypitest result can be found here https://test.pypi.org/project/google_images_download/

changes

setup.py

remove rst generation; the readme file already uses rst

add mit license to classifier

readme

change readme to rst

remove the python2/3 usage section and example: with pip installation it is already a single command

this also means the examples no longer invoke python directly

remove the bold effect on values in the argument rows

the explanation of where the images are downloaded

what is left

change author_email in setup.py

choose one license file. either license.txt or license.md

either remove the example for downloading similar images or add that feature to the program

If you want, I can also upload it to PyPI. Just reply when everything is ready.

Or you can upload it to PyPI yourself.
opened by rachmadaniHaryono 17
Unicode keywords cause problems when printing to console on Windows

Hi Hardik and guys,

Thanks for this great tool.

I'd like to report an issue: using a non-English search word causes the program to throw an error and quit.

I'm not sure if this is specific to some platform or locale. Mine is Windows 10 64-bit English version.

I tried to use some Thai words, and the error occurs on the print(iteration) line. This is just a status message, so the program works fine if the line is commented out.

I learned that we're only supposed to use characters in string.printable when outputting to the console. So my solution is to replace the line with print(iteration.encode(encoding='utf-8', errors='replace')), which prints non-ASCII characters as escaped UTF-8 bytes, e.g. "\xe0\xb8\xad", and works fine for me.

But I'm not sure which is the best way to fix or which is your preferred way. So let me know if I can help.
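One possible variant of the workaround above, as a minimal sketch rather than the project's actual fix: instead of printing raw UTF-8 bytes, encode against the console's own encoding and escape only the characters it cannot represent. The names console_safe and safe_print are hypothetical helpers introduced for illustration.

```python
import sys


def console_safe(text, encoding=None):
    """Return text with characters the target encoding cannot represent escaped."""
    # Use the console's reported encoding; fall back to ASCII if none is reported.
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "ascii"
    # backslashreplace turns unencodable characters into backslash escapes
    # instead of raising UnicodeEncodeError.
    return text.encode(encoding, errors="backslashreplace").decode(encoding)


def safe_print(text):
    """Print a status message without crashing on unencodable characters."""
    print(console_safe(text))
```

With this approach, a console that supports the characters shows them unchanged, and only a console that cannot encode them falls back to escapes.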

Thanks.
bug error-reporting pending-info

opened by krissdap 15
Corrupted images

While trying to download images using the config file method, some images are getting corrupted. Is there any way to remove those corrupted images after the download completes?
bug docs pending-info

opened by devajith 13
Can't possibly make it find chromedriver

Tried on bash for Windows AND Windows cmd. It always fails to find chromedriver.

chromedriver --version
ChromeDriver 2.38.552522 (437e6fbedfa8762dec75e2c5b3ddb86763dc9dcb)

chromedriver
Starting ChromeDriver 2.38.552522 (437e6fbedfa8762dec75e2c5b3ddb86763dc9dcb) on port 9515
Only local connections are allowed.

`googleimagesdownload -cf=cf.json -cd=C:\Users\Kem\AppData\Local\Programs\Python\Python36\Scripts\chromedriver.exe

Item no.: 1 --> Item name = Metal Gear Solid screenshot
Evaluating...
Looks like we cannot locate the path the 'chromedriver' (use the '--chromedriver' argument to specify the path to the executable.) or google chrome browser is not installed on your machine (exception: argument of type 'NoneType' is not iterable)`
docs error-reporting

opened by devingDev 11
Unable to download more than 100 images in windows

Hi,

As mentioned in the documentation, I am calling your library from another Python file. I have also set the path to chromedriver in the arguments, but I am still unable to download more than 100 images. I am, however, able to download up to 100 images. How do I fix this, as I need more than 100 image downloads?

Thanks for sharing this awesome tool and thank you for reading.
docs error-reporting

opened by shravik 11
SSL Error

While trying to download 100 images found by searching "pokemon go para ios", I get the following error:

ssl.CertificateError: hostname 'assets.phonedog.com' doesn't match either of 'cloudfront.net', '*.cloudfront.net'

I'm using the Python 3.6 fork. Is there any way to bypass this?
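For reference, this is how certificate and hostname checks can be switched off with Python's standard ssl and urllib modules. It is a generic sketch, not a patch to the tool, and the helper names are hypothetical. Disabling verification removes protection against tampered downloads, so it is probably only reasonable for throwaway image scraping.

```python
import ssl
import urllib.request


def make_unverified_context():
    """Build an SSL context that skips certificate and hostname checks."""
    ctx = ssl.create_default_context()
    # Order matters: check_hostname must be disabled before verify_mode is relaxed.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def fetch(url, timeout=10):
    """Download url without verifying the server certificate (use with care)."""
    context = make_unverified_context()
    with urllib.request.urlopen(url, context=context, timeout=timeout) as resp:
        return resp.read()
```

A gentler alternative is to catch the certificate error for the single failing image and skip it, leaving verification on for everything else.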
bug

opened by Jaunter 10
'0 is all we got for the search filter'

I have seen this issue before and am not sure why I am getting it now, but I understand that the problem is in the '_get_next_item' function: the HTML page does not contain 'rg_meta notranslate', so it returns -1 and no links are found for the images. I don't know why this is happening. Any help is appreciated, thanks.
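The failure mode described here can be illustrated with a small stand-alone sketch. This is not the library's code; extract_between and extract_all are hypothetical names. The point is that a marker-based parser built on str.find silently yields nothing once the page stops containing the marker, which would match the "0 is all we got" message if Google changed its markup.

```python
def extract_between(page, start_marker, end_marker, pos=0):
    """Return (text, next_pos) for the next marked span, or (None, -1) if absent."""
    start = page.find(start_marker, pos)
    if start == -1:
        # Marker missing: str.find reports this as -1, not as an exception.
        return None, -1
    start += len(start_marker)
    end = page.find(end_marker, start)
    if end == -1:
        return None, -1
    return page[start:end], end + len(end_marker)


def extract_all(page, start_marker, end_marker):
    """Collect every marked span; an empty list means the marker never appeared."""
    items, pos = [], 0
    while True:
        item, pos = extract_between(page, start_marker, end_marker, pos)
        if item is None:
            return items
        items.append(item)
```

Saving the fetched HTML to a file and searching it for the marker by hand is a quick way to confirm whether the page layout is the cause.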

opened by jap101 8
Downloading related images produces incorrect directory syntax

When downloading with the --related-images parameter, the downloader tries to create a directory with an invalid name and crashes.

For example (Windows 10, Python 3.6):

python google_images_download.py -k "disc brake" -l 10 -ri

...produces:

Item no.: 1 --> Item name = disc brake
Evaluating...
Starting Download...
Completed Image ====> 1. 1200px-disk_brake_dsc03682.jpg
Completed Image ====> 2. 1175-front.jpg
...
Completed Image ====> 10. car-disc-brake-cluster.jpg

Getting list of related keywords...this may take a few moments

Now Downloading - disc brake - disc+brake,g_1:motorcycle:SsiXX7r2me8%3D&usg=AI4_-kQTz_e6pgr8yN7IAZTX6buWsweNYg&sa=X&ved=0ahUKEwjC-_Hw0f_gAhWRonEKHYzLB1kQ4lYIKCgB

OSError: [WinError 123] The filename, directory name, or volume label syntax is incorrect: 'downloads\\disc brake - disc+brake,g_1:motorcycle:SsiXX7r2me8%3D&usg=AI4_-kQTz_e6pgr8yN7IAZTX6buWsweNYg&sa=X&ved=0ahUKEwjC-_Hw0f_gAhWRonEKHYzLB1kQ4lYIKCgB'

The create_directories() function is trying to create a directory that contains invalid characters, or sometimes the directory path is too long.

I think the problem arises because, at the end of the download() function, it calls get_all_tabs() and extracts item_name, which is then appended to the path. I can probably fix this and submit a pull request, although I am a bit unsure what the desired behaviour should be, so any suggestions are welcome.
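One candidate behaviour, sketched under the assumption that replacing bad characters and truncating is acceptable: sanitize the name before it reaches the directory-creation step. safe_dir_name is a hypothetical helper, and the 100-character limit is an arbitrary choice for illustration.

```python
import re

# Characters Windows rejects in file and directory names, plus control characters.
_INVALID = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def safe_dir_name(name, max_length=100, replacement="_"):
    """Replace invalid characters and trim the name to a conservative length."""
    cleaned = _INVALID.sub(replacement, name)
    # Windows also rejects names that end in a space or a dot.
    cleaned = cleaned[:max_length].rstrip(" .")
    return cleaned or replacement
```

Applied to the name in the report above, the colons become underscores and the long tracking suffix is cut off, so the directory can be created. Stripping the URL parameters from the related keyword before building the name would be a cleaner fix, but that depends on how the related-images link is parsed.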
bug error-reporting pending-info

opened by 4OH4 7
Chromedriver Error Mac

I am getting the error "Looks like we cannot locate the path the 'chromedriver' (use the '--chromedriver' argument to specify the path to the executable.) or google chrome browser is not installed on your machine"

I have chromedriver located in my project directory, and my arguments appear below.

arguments = {"chromedriver": "/Users/Documents/google-images-download", "keywords": "Polar bears", "limit": 1000, "print_urls": False}  # creating list of arguments

Any help would be appreciated. Thank you!
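One thing worth checking: the error message asks for "the path to the executable", while the value passed above looks like a directory. The sketch below is a hypothetical helper, not part of the library. It accepts either a file or a directory, looks for a chromedriver binary, and falls back to searching PATH.

```python
import os
import shutil


def resolve_chromedriver(path=None):
    """Return a usable chromedriver executable path, or None if none is found."""
    names = ("chromedriver", "chromedriver.exe")
    candidates = []
    if path:
        if os.path.isdir(path):
            # A directory was given: look for the binary inside it.
            candidates += [os.path.join(path, name) for name in names]
        else:
            candidates.append(path)
    for candidate in candidates:
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    # Fall back to whatever chromedriver is on PATH (None if there is none).
    return shutil.which("chromedriver")
```

The returned value could then be placed under the "chromedriver" key of the arguments dictionary; if it is None, the driver is either missing or not marked executable.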
command-line-query

opened by gottlism 7
'NoneType' object is not subscriptable error

Item no.: 1 --> Item name = weller_12 Bourbon whiskey
Evaluating...
Starting Download...
'NoneType' object is not subscriptable

Now receiving this error for each download I try

opened by hamman33 5
finds zero pics to be downloaded (in a few millis)

Tried with both multiple keys and a single key. Snippet and output attached.

Code snippet:

from google_images_download import google_images_download

response = google_images_download.googleimagesdownload()

# triple key
argument = {"keywords": "templar assassin,drow ranger,mirana", "limit": 100, "print_urls": True}
absolute_image_paths = response.download(argument)

# triple key
argument = {"keywords": "invoker,spectre,abaddon", "limit": 100, "print_urls": True}
absolute_image_paths = response.download(argument)

# triple key
argument = {"keywords": "cat,dog,plane", "limit": 100, "print_urls": True}
absolute_image_paths = response.download(argument)

# single key
argument = {"keywords": "invoker", "limit": 100, "print_urls": True}
absolute_image_paths = response.download(argument)

On a MacBook Pro 2012, Mojave 10.14.6.

scraper_output.txt

opened by porcherface 1
AttributeError: 'WebDriver' object has no attribute 'find_element_by_css_selector'

I want to get more than 100 images, so I tried using chromedriver.
Here is my code:

from google_images_download import google_images_download
response = google_images_download.googleimagesdownload()
arguments = {"keywords": "Michael Jordan", "limit": 200, "print_urls": True, "raw_google_data": True, "no_download": True, "chromedriver": r"C:\Program Files (x86)\Google\Chrome\Application\chromedriver.exe"}  # raw string keeps the backslashes literal
paths = response.download(arguments)

Here is the result:

Item no.: 1 --> Item name = Michael Jordan
Evaluating...
Traceback (most recent call last):
  File "C:/Users/MJ Zhou/PycharmProjects/SEV/download_image.py", line 8, in <module>
    paths = response.download(arguments) #passing the arguments to the function
  File "D:\anaconda3\envs\paper1\lib\site-packages\google_images_download-2.8.0-py3.8.egg\google_images_download\google_images_download.py", line 970, in download
  File "D:\anaconda3\envs\paper1\lib\site-packages\google_images_download-2.8.0-py3.8.egg\google_images_download\google_images_download.py", line 1111, in download_executor
  File "D:\anaconda3\envs\paper1\lib\site-packages\google_images_download-2.8.0-py3.8.egg\google_images_download\google_images_download.py", line 310, in download_extended_page
AttributeError: 'WebDriver' object has no attribute 'find_element_by_css_selector'

How to solve this problem?
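This error is what newer Selenium 4 releases raise, because the find_element_by_* helper methods were removed in favour of find_element(By.CSS_SELECTOR, ...). A minimal compatibility sketch follows; find_css is a hypothetical helper, and the library's source would need to call it (or the new API directly) wherever it currently uses the old method.

```python
def find_css(driver, selector):
    """Locate one element by CSS selector on both old and new Selenium drivers."""
    legacy = getattr(driver, "find_element_by_css_selector", None)
    if legacy is not None:
        return legacy(selector)  # Selenium 3 style
    # Selenium 4 style; "css selector" is the value of By.CSS_SELECTOR.
    return driver.find_element("css selector", selector)
```

Installing an older Selenium release that still ships the removed methods is the other common workaround, at the cost of staying on an outdated driver API.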

opened by ZhouMingjie-code 3
Does not work with any search terms.

Item no.: 1 --> Item name = dogs
Evaluating...
'NoneType' object is not subscriptable
Image objects data unpacking failed.

It used to work just fine a few months ago.

opened by Zoltanio 4
Images do get downloaded according to the terminal and the log file but do not show up in the folder

I want to download multiple images per keyword. I want to use 50 different keywords, so I import them from a CSV file. But for some keywords, the images do not show up in the folder, even though the log file and terminal say they were downloaded. The strange part is that when I put the keywords into the programme directly, the images are downloaded and saved.
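Since the same keywords work when typed in directly, one possibility (a guess, not a confirmed diagnosis) is that the values read from the CSV carry stray whitespace, blank rows, or a byte-order mark, which would change the folder name the images are saved under. The sketch below reads keywords defensively; load_keywords is a hypothetical helper, and it assumes one keyword per row in the first column.

```python
import csv


def load_keywords(csv_path, column=0):
    """Read one keyword per row, stripping whitespace and skipping blank rows."""
    keywords = []
    # utf-8-sig drops a leading byte-order mark if the file has one.
    with open(csv_path, newline="", encoding="utf-8-sig") as fh:
        for row in csv.reader(fh):
            if len(row) > column:
                keyword = row[column].strip()
                if keyword:
                    keywords.append(keyword)
    return keywords
```

Printing repr(keyword) for a keyword that fails is a quick way to see whether hidden characters are present.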

opened by Maksimke 0
Is there a way for the JSON to show the image description if it's in Hebrew?

When I run the script and save the metadata to a JSON file, I get Unicode escapes such as \u05de\u05d3\u05d5\u05d6\u05d4 \u05d7\u05d5\u05e3 \u05d0\u05e9\u05d3\u05d5\u05d3, so I added the following to lines 1132-1134:

json_file = open("logs/" + search_keyword[i] + ".json", "w", encoding="utf-8")
json.dump(items, json_file, indent=4, sort_keys=True, ensure_ascii=False)
json_file.close()

But it didn't resolve the issue. Any suggestions?
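It may help to verify the approach in isolation. The stand-alone sketch below uses the same json.dump options with a hypothetical helper name, dump_readable, and writes non-ASCII text as readable characters rather than \uXXXX escapes.

```python
import json


def dump_readable(items, path):
    """Write items as UTF-8 JSON, keeping non-ASCII characters unescaped."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(items, fh, indent=4, sort_keys=True, ensure_ascii=False)
```

If this works on its own but the tool's output still shows escapes, the file may be written by a different code path, or the edited source may not be the installed copy that actually runs; both are guesses worth ruling out.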

opened by tallevy22 0

Owner

Hardik Vasa

GitHub

Google scholar share - Simple python script to pull Google Scholar data from an author's profile

google_scholar_share Simple python script to pull Google Scholar data from an au

9 Sep 15, 2022

A simple telegram Bot, Upload Media File| video To telegram using the direct download link. (youtube, Mediafire, google drive, mega drive, etc)

URL-Uploader (Bot) A Bot Upload file|video To Telegram using given Links. Features: Only Auth Users (AUTH_USERS) Can Use The Bot; Upload YTDL Sup

18 Dec 17, 2022

google-resumable-media (🥉28 · ⭐ 27) - Utilities for Google Media Downloads and Resumable.. Apache-2

google-resumable-media Utilities for Google Media Downloads and Resumable Uploads See the docs for examples and usage. Experimental asyncio Support Wh

36 Nov 22, 2022

An attendance bot that joins google meet automatically according to schedule and marks present in the google meet.

Google-meet-self-attendance-bot An attendance bot which joins google meet automatically according to schedule and marks present in the google meet. I

12 Sep 20, 2022

Google Drive, OneDrive and Youtube as covert-channels - Control systems remotely by uploading files to Google Drive, OneDrive, Youtube or Telegram

covert-control Control systems remotely by uploading files to Google Drive, OneDrive, Youtube or Telegram using Python to create the files and the lis

52 Dec 6, 2022

DDoS Script (DDoS Panel) with Multiple Bypass ( Cloudflare UAM,CAPTCHA,BFM,NOSEC / DDoS Guard / Google Shield / V Shield / Amazon / etc.. )

KARMA DDoS DDoS Script (DDoS Panel) with Multiple Bypass ( Cloudflare UAM,CAPTCHA,BFM,NOSEC / DDoS Guard / Google Shield / V Shield / Amazon / etc.. )

256 Jan 2, 2023

Easy Google Translate: Unofficial Google Translate API

Async ready API wrapper for Revolt API written in Python.

A modern, easy to use, feature-rich, and async ready API wrapper for Discord written in Python.

A modern,feature-rich, and async ready API wrapper for Discord written in Python

An async-ready Python wrapper around FerrisChat's API.

A modern, easy to use, feature-rich, and async ready API wrapper for Discord written in Python.

Python script for download course from platzi.com

A python script to download twitter space, only works on running spaces (for now).

Python script to download WAX transactions

A modern, easy to use, feature-rich, and async ready API wrapper improved and revived from original discord.py.

This repository contains ready to deploy automations on AWS

A simple python script for rclone. Use multiple Google Service Accounts and cycle through them.

A small python script which runs a speedtest using speedtest.net and inserts it into a Google Docs Spreadsheet.
