Make Selenium work on Github Actions

Overview

Make Selenium work on Github Actions

Scraping with BeautifulSoup on GitHub Actions is easy-peasy. But what about Selenium?? After you jump through some hoops it's just as simple.

How to use this

I think you can just fork this repository and get to work on scraper.py.

Otherwise, just steal scraper.py and .github/workflows/scrape.yml and you'll be good to go.

Saving CSV files

If you want every scrape to update a CSV or something like this, you'll need to edit your scrape.yml to commit to the repo after the scraper is done running.

Take a look at the bottom of autoscraper-history to see how to do that. It should be one more step, something like this:

      - name: Commit and push if content changed
        run: |-
          git config user.name "Automated"
          git config user.email "[email protected]"
          git add -A
          timestamp=$(date -u)
          git commit -m "Latest data: ${timestamp}" || exit 0
          git push

I'm pretty sure I lifted that code 100% from Simmon Willison.

Scheduling

Right now the yaml is set to only scrape when you click the "Run workflow" button in the Actions tab, but you can add something like

  schedule:
    - cron: '0 * * * *'

To make it run the first minute of every hour.

How this all works

To make Selenium + GitHub Actions work, this repo does a few magic things. In a normal world, you start Chrome like this:

from selenium import webdriver

driver = webdriver.Chrome()

But we.... do things a little differently. Let me walk you through the changes!

webdriver-manager to manage the webdriver

You can add a special action to set up Chromedriver but I feel it's honestly easier to use webdriver-manager. It magically picks out the right version of chromedriver for you.

from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager

driver = webdriver.Chrome(ChromeDriverManager().install())

It works on your local machine, too!

Chromium, not Chrome

When you're running GitHub Actions, it's probably on a nice little Ubuntu Linux machine. In those situations, you install software using apt. Since you can't install Chrome with apt, you'll install Chromium instead, the open-source version of Chrome. Works the same, just opens a little differently.

As a result, our chromedriver install code changed a little more:

from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
from webdriver_manager.utils import ChromeType

driver_path = ChromeDriverManager(chrome_type=ChromeType.CHROMIUM).install()
driver = webdriver.Chrome(driver_path)

This also means we need to add a line that does apt-get install -y chromium-browser to our scrape.yml)

Headless Chromium

Since we don't have a monitor plugged into GitHub Actions, we can't actually see what's going on in the browser. In the olden days you had to construct some odd technical fake screen, but these days you just run in headless mode!

from selenium import webdriver 
from selenium.webdriver.chrome.options import Options

chrome_options = Options()
chrome_options.add_argument("--headless")
driver = webdriver.Chrome(options=chrome_options)

Other options and more management of things

To combine the webdriver-manager driver path and the headless Chrome option, there are a lot of hoops to jump through. During the research process a lot of extra chrome options poked up, so I thought hey, let's just add all those, too.

from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
from webdriver_manager.utils import ChromeType
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service

chrome_service = Service(ChromeDriverManager(chrome_type=ChromeType.CHROMIUM).install())

chrome_options = Options()
options = [
    "--headless",
    "--disable-gpu",
    "--window-size=1920,1200",
    "--ignore-certificate-errors",
    "--disable-extensions",
    "--no-sandbox",
    "--disable-dev-shm-usage"
]
for option in options:
    chrome_options.add_argument(option)

driver = webdriver.Chrome(service=chrome_service, options=chrome_options)

The biggest change here beyond all those options is the Service thing. Apparently just giving it the path to chromdriver isn't good enough? Who knows, I just do what works.

You might also like...
Fully functioning price detector built with selenium and python

Fully functioning price detector built with selenium and python

Akulaku Create NewProduct Automation using Selenium Python
Akulaku Create NewProduct Automation using Selenium Python

Akulaku-Create-NewProduct-Automation Akulaku Create NewProduct Automation using Selenium Python Usage: 1. Install Python 3.9 2. Open CMD on Bot Folde

Youtube Tool using selenium Python
Youtube Tool using selenium Python

YT-AutoLikeComment-AutoReportComment-AutoComment Youtube Tool using selenium Python Auto Comment Auto Like Comment Auto Report Comment Usage: 1. Insta

Selenium Page Object Model with Python

Page-object-model (POM) is a pattern that you can apply it to develop efficient automation framework.

Aplikasi otomasi klik di situs popcat.click menggunakan Python dan Selenium
Aplikasi otomasi klik di situs popcat.click menggunakan Python dan Selenium

popthe-popcat Aplikasi Otomasi Klik di situs popcat.click. aplikasi ini akan secara otomatis melakukan click pada kucing viral itu, sehingga anda tida

Python Webscraping using Selenium

Web Scraping with Python and Selenium The code shows how to do web scraping using Python and Selenium. We use as data the https://sbot.org.br/localize

This file will contain a series of Python functions that use the Selenium library to search for elements in a web page while logging everything into a file

element_search with Selenium (Now With docstrings 😎 ) Just to mention, I'm a beginner to all this, so it it's very possible to make some mistakes The

Compiles python selenium script to be a Window's executable

Problem Statement Setting up a Python project can be frustrating for non-developers. From downloading the right version of python, setting up virtual

Automated tests for OKAY websites in Python (Selenium) - user friendly version

Okay Selenium Testy Aplikace určená k testování produkčních webů společnosti OKAY s.r.o. Závislosti K běhu aplikace je potřeba mít v počítači nainstal

Comments
  • ValueError: There is no such driver by url

    ValueError: There is no such driver by url

    Hi, i just fork the repo and runned on github, but got this error:

    ValueError: There is no such driver by url https://chromedriver.storage.googleapis.com/105.0.5177/chromedriver_linux64.zip

    opened by andreileonsalas 2
  • ValueError: There is no such driver by url

    ValueError: There is no such driver by url

    Hello, getting this issue:

     ValueError: There is no such driver by url https://chromedriver.storage.googleapis.com/106.0.5249.61/chromedriver_mac64_m1.zip
    

    After short debugging, figured out, it's due to changed format of url here: https://chromedriver.storage.googleapis.com/

    it used to be: https://chromedriver.storage.googleapis.com/106.0.5249.61/chromedriver_mac64_m1.zip

    but, for version 106.0.5249.61 it's actually became: https://chromedriver.storage.googleapis.com/106.0.5249.61/chromedriver_mac_arm64.zip

    mac64_m1.zip (old) vs mac_arm64.zip (new). Hard to say if it's permanent change, or just somebody's typo..

    opened by alexander-gridnev 1
  • ChromeDriverManager can't find driver by url

    ChromeDriverManager can't find driver by url

    When running:

    chrome_service = Service(ChromeDriverManager(chrome_type=ChromeType.CHROMIUM).install())
    

    I get the following error. I think it's caused because on https://chromedriver.storage.googleapis.com/ version 106.0.5235 doesn't exists.

    How can I fix this?

    File "/home/runner/work/shipping-data/shipping-data/routescanner_automated.py", line 183, in <module>
        soups_ods_2 = get_webpages2(od_ports, headless=True)
      File "/home/runner/work/shipping-data/shipping-data/routescanner_automated.py", line 55, in get_webpages2
        chrome_service = Service(ChromeDriverManager(chrome_type=ChromeType.CHROMIUM).install())
      File "/opt/hostedtoolcache/Python/3.10.7/x64/lib/python3.10/site-packages/webdriver_manager/chrome.py", line 39, in install
        driver_path = self._get_driver_path(self.driver)
      File "/opt/hostedtoolcache/Python/3.10.7/x64/lib/python3.10/site-packages/webdriver_manager/core/manager.py", line 30, in _get_driver_path
        file = self._download_manager.download_file(driver.get_url())
      File "/opt/hostedtoolcache/Python/3.10.7/x64/lib/python3.10/site-packages/webdriver_manager/core/download_manager.py", line 28, in download_file
        response = self._http_client.get(url)
      File "/opt/hostedtoolcache/Python/3.10.7/x64/lib/python3.10/site-packages/webdriver_manager/core/http.py", line 33, in get
        self.validate_response(resp)
      File "/opt/hostedtoolcache/Python/3.10.7/x64/lib/python3.10/site-packages/webdriver_manager/core/http.py", line [16](https://github.com/EwoutH/shipping-data/actions/runs/3249621607/jobs/5332229194#step:7:17), in validate_response
        raise ValueError(f"There is no such driver by url {resp.url}")
    ValueError: There is no such driver by url https://chromedriver.storage.googleapis.com/106.0.5[23](https://github.com/EwoutH/shipping-data/actions/runs/3249621607/jobs/5332229194#step:7:24)5/chromedriver_linux64.zip
    Error: Process completed with exit code 1.
    
    opened by EwoutH 0
  • Description(README.md) and code(scraper.py) should be updated

    Description(README.md) and code(scraper.py) should be updated

    opened by hi-Nobody 0
Owner
Jonathan Soma
baby data journo wrangler @ledeprogram + @littlecolumns, cat wrangler @cat-republic
Jonathan Soma
Python package to easily work with selenium and manage tabs effectively.

Simple Selenium The aim of this package is to quickly get started with working with selenium for simple browser automation tasks. Installation Install

Vishal Kumar Mishra 1 Oct 27, 2021
A library to make concurrent selenium tests that automatically download and setup webdrivers

AutoParaSelenium A library to make parallel selenium tests that automatically download and setup webdrivers Usage Installation pip install autoparasel

Ronak Badhe 8 Mar 13, 2022
This is a Python script for Github Bot which uses Selenium to Automate things.

github-follow-unfollow-bot This is a Python script for Github Bot which uses Selenium to Automate things. Pre-requisites :- Python A Github Account Re

Chaudhary Hamdan 10 Jul 1, 2022
pytest plugin that let you automate actions and assertions with test metrics reporting executing plain YAML files

pytest-play pytest-play is a codeless, generic, pluggable and extensible automation tool, not necessarily test automation only, based on the fantastic

pytest-dev 67 Dec 1, 2022
pytest splinter and selenium integration for anyone interested in browser interaction in tests

Splinter plugin for the pytest runner Install pytest-splinter pip install pytest-splinter Features The plugin provides a set of fixtures to use splin

pytest-dev 238 Nov 14, 2022
Selenium-python but lighter: Helium is the best Python library for web automation.

Selenium-python but lighter: Helium Selenium-python is great for web automation. Helium makes it easier to use. For example: Under the hood, Helium fo

Michael Herrmann 3.2k Dec 31, 2022
Tutorial for integrating Oxylabs' Residential Proxies with Selenium

Oxylabs’ Residential Proxies integration with Selenium Requirements For the integration to work, you'll need to install Selenium on your system. You c

Oxylabs.io 8 Dec 8, 2022
This project is used to send a screenshot by email of your MyUMons schedule using Selenium python lib (headless mode)

MyUMonsSchedule Use MyUMonsSchedule python script to send a screenshot by email (Gmail) of your MyUMons schedule. If you use it on Windows, take care

Pierre-Louis D'Agostino 6 May 12, 2022
0hh1 solver for the web (selenium) and also for mobile (adb)

0hh1 - Solver Aims to solve the '0hh1 puzzle' for all the sizes (4x4, 6x6, 8x8, 10x10 12x12). for both the web version (using selenium) and on android

Adwaith Rajesh 1 Nov 5, 2021
Spam the buzzer and upgrade automatically - Selenium

CookieClicker Usage: Let's check your chrome navigator version : Consequently, you have to : download the right chromedriver in the follow link : http

Iliam Amara 1 Nov 22, 2021