gdown

Description

Download a large file from Google Drive. If you use curl/wget, it fails on large files because of the security warning from Google Drive; gdown works around that warning. It also supports downloading Google Drive folders (max 50 files per folder).

Installation

pip install gdown

Usage

From Command Line

$ gdown --help
usage: gdown [-h] [-V] [-O OUTPUT] [-q] [--fuzzy] [--id]  [--proxy PROXY]
             [--speed SPEED] [--no-cookies] [--no-check-certificate]
             [--continue] [--folder]
             url_or_id
...

$ # a large file (~500MB)
$ gdown https://drive.google.com/uc?id=1l_5RK28JRL19wpT22B-DY9We3TVXnnQQ
$ # gdown --id 1l_5RK28JRL19wpT22B-DY9We3TVXnnQQ
$ md5sum fcn8s_from_caffe.npz
256c2a8235c1c65e62e48d3284fbd384

$ # a small file
$ gdown https://drive.google.com/uc?id=0B9P1L--7Wd2vU3VUVlFnbTgtS2c
$ cat spam.txt
spam

$ # download with fuzzy extraction of a file ID
$ gdown --fuzzy 'https://drive.google.com/file/d/0B9P1L--7Wd2vU3VUVlFnbTgtS2c/view?usp=sharing&resourcekey=0-WWs_XOSctfaY_0-sJBKRSQ'
$ cat spam.txt
spam

$ # --fuzzy option also works with Microsoft Powerpoint files
$ gdown --fuzzy "https://docs.google.com/presentation/d/15umvZKlsJ3094HNg5S4vJsIhxcFlyTeK/edit?usp=sharing&ouid=117512221203072002113&rtpof=true&sd=true"

$ # a folder
$ gdown https://drive.google.com/drive/folders/1ivUsJd88C8rl4UpqpxIcdI5YLmRD0Mfj -O /tmp/folder --folder

$ # as an alternative to curl/wget
$ gdown https://httpbin.org/ip -O ip.json
$ cat ip.json
{
  "origin": "126.169.213.247"
}

$ # write stdout and pipe to extract
$ gdown https://github.com/wkentaro/gdown/archive/refs/tags/v4.0.0.tar.gz -O - --quiet | tar zxvf -
$ ls gdown-4.0.0

From Python

import gdown

# a single file
url = 'https://drive.google.com/uc?id=0B9P1L--7Wd2vNm9zMTJWOGxobkU'
output = '20150428_collected_images.tgz'
gdown.download(url, output, quiet=False)

# same file, but cached: skipped when the md5 matches, then extracted
md5 = 'fa837a88f0c40c513d975104edf3da17'
gdown.cached_download(url, output, md5=md5, postprocess=gdown.extractall)

# a folder
url = 'https://drive.google.com/drive/folders/1ivUsJd88C8rl4UpqpxIcdI5YLmRD0Mfj'
gdown.download_folder(url, quiet=True, use_cookies=False)
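The --fuzzy option shown in the command-line examples amounts to extracting the Drive file ID from a full sharing URL before downloading. As a rough illustration of that idea (this is a sketch, not gdown's actual implementation; extract_drive_id is a hypothetical helper):

```python
import re
from typing import Optional

def extract_drive_id(url: str) -> Optional[str]:
    """Pull a Drive file ID out of common URL shapes.

    Illustrative only -- gdown's real --fuzzy logic handles more cases.
    """
    # Matches /file/d/<id>/..., /presentation/d/<id>/..., and uc?id=<id>.
    for pattern in (r"/d/([0-9A-Za-z_-]{10,})", r"[?&]id=([0-9A-Za-z_-]{10,})"):
        m = re.search(pattern, url)
        if m:
            return m.group(1)
    return None

print(extract_drive_id(
    "https://drive.google.com/file/d/0B9P1L--7Wd2vU3VUVlFnbTgtS2c/view?usp=sharing"
))
# 0B9P1L--7Wd2vU3VUVlFnbTgtS2c
```

In recent gdown versions the same behavior is available from Python by passing fuzzy=True to gdown.download.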

License

See LICENSE.

Comments
  • Permission denied although I have set to "anyone with link"

    I have a 355 MB file on Google Drive that I downloaded with gdown before (around Nov 2021), and it worked. Today, 14 Feb 2022, when I tried again it didn't work.

    I use gdown --id ... and it fails with "Access denied" and the following error:

        Cannot retrieve the public link of the file. You may need to change
        the permission to 'Anyone with the link', or have had many accesses. 
    

    You may still be able to access the file from the browser:

         https://drive.google.com/uc?id=1QWzmHdF1L_3hbjM85nOjfdHsm-iqQptG 
    

    I am the owner of the file on Google Drive, and I checked the status of the link again (it is set to "Anyone with the link").

    I asked a colleague to download the file with the same command (gdown --id ...) and it worked for him. I uninstalled and reinstalled gdown; it still doesn't work. Any idea?

    enhancement · opened by thibaulttabarin · 21 comments
  • Add support for folder downloads

    Disclaimer

    Since the PR #75 hasn't had any updates for some time, I am making this one with changes made by me and @motivationalreposter. I suggest @wkentaro close the older PR in favor of this one, which preserves @motivationalreposter's commits and contains an implementation that is up-to-date and working.

    TODO:

    • [x] Check to see if it is still working 0.o, it is :)
    • [ ] Raise an exception on folders with >= 50 files; skip raising it when the remaining_ok=True flag is set
    • [ ] Document the 50-file limitation in the docs and in --help
    • [ ] Check hashsum of files downloaded in integration tests
    • [ ] Add tests for remaining_ok
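The remaining_ok behavior proposed in the TODO list could be sketched like this (illustrative only; the names MAX_NUMBER_FILES and check_folder_limit are placeholders, not the PR's actual code):

```python
MAX_NUMBER_FILES = 50  # a Drive folder page only lists the first 50 files

def check_folder_limit(file_ids, remaining_ok=False):
    """Raise when a folder hits the 50-file listing limit, unless the
    caller opts in with remaining_ok=True (sketch of the proposed
    behavior, not the PR's actual code)."""
    if len(file_ids) >= MAX_NUMBER_FILES and not remaining_ok:
        raise RuntimeError(
            "The folder contains 50 or more files; only the first 50 can be "
            "listed. Pass remaining_ok=True to download what was listed."
        )
    return file_ids

check_folder_limit(["id1", "id2"])                 # fine: under the limit
check_folder_limit(["x"] * 50, remaining_ok=True)  # fine: caller opted in
```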

    What this PR adds

    This PR adds support for Google Drive folders by parsing the JS in the folder page to get the required file IDs, and it also adds proper CI testing against a Drive folder containing some CC0-licensed images.
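As a rough sketch of that parsing approach (the embedded-data format below is a simplified stand-in, not Drive's actual page structure, and extract_file_ids is a hypothetical helper):

```python
import re

# Simplified stand-in for a Drive folder page: the real page embeds file
# metadata in a JS blob, and this exact format is hypothetical.
SAMPLE_HTML = """
<script>window['_DRIVE_ivd'] = '[["1AbCdEfGhIjKlMnOpQrStUvWxYz12345","cat.jpg"],
["1ZyXwVuTsRqPoNmLkJiHgFeDcBa54321","dog.jpg"]]';</script>
"""

def extract_file_ids(html):
    """Pull Drive file IDs out of the embedded JS blob (illustrative
    regex; the PR's real parser is more involved)."""
    return re.findall(r'\["([0-9A-Za-z_-]{20,})"', html)

print(extract_file_ids(SAMPLE_HTML))
# ['1AbCdEfGhIjKlMnOpQrStUvWxYz12345', '1ZyXwVuTsRqPoNmLkJiHgFeDcBa54321']
```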

    This PR also contains the following (written by @motivationalreposter in PR #75):

    ' This commit solves the problem of downloading entire Google Drive folders, which has been an issue for quite a while (#62, #65).

    This solution uses Beautiful Soup to get the