Dude is a very simple framework for writing web scrapers using Python decorators

Overview

dude uncomplicated data extraction

Dude is a very simple framework for writing web scrapers using Python decorators. The design, inspired by Flask, aims to let you build a web scraper in just a few lines of code. Dude has an easy-to-learn syntax.

🚨 Dude is currently in Pre-Alpha. Please expect breaking changes.

Installation

To install, run the following from a terminal:

pip install pydude
playwright install  # Install playwright binaries for Chrome, Firefox and Webkit.

Minimal web scraper

The simplest web scraper will look like this:

from dude import select


@select(css="a")
def get_link(element):
    return {"url": element.get_attribute("href")}

The example above gets all the hyperlink elements in a page and calls the handler function get_link() for each element.
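
To make the decorator pattern concrete, here is a minimal standard-library sketch of how a Flask-style registry could collect handlers and call them once per matching element. It is not Dude's implementation: the names select_tag, HANDLERS, and run_handlers are hypothetical, it matches plain tag names rather than real CSS selectors, and the handler receives a dict of attributes instead of a Playwright element.

```python
from html.parser import HTMLParser

# Hypothetical registry: maps a tag name to the handlers registered for it.
HANDLERS = {}


def select_tag(tag):
    """Register the decorated function as a handler for every <tag> element."""
    def decorator(func):
        HANDLERS.setdefault(tag, []).append(func)
        return func
    return decorator


class _Dispatcher(HTMLParser):
    """Calls each registered handler once per matching start tag."""

    def __init__(self):
        super().__init__()
        self.results = []

    def handle_starttag(self, tag, attrs):
        for handler in HANDLERS.get(tag, []):
            self.results.append(handler(dict(attrs)))


def run_handlers(html):
    """Feed the HTML through the dispatcher and collect handler results."""
    dispatcher = _Dispatcher()
    dispatcher.feed(html)
    dispatcher.close()
    return dispatcher.results


@select_tag("a")
def get_link(attrs):
    return {"url": attrs.get("href")}


sample = '<p><a href="/url-1.html">One</a> <a href="/url-2.html">Two</a></p>'
links = run_handlers(sample)
print(links)
```

The point of the sketch is the separation of concerns: the decorator only records which function handles which selector, and a separate runner decides when to call it.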

How to run the scraper

You can run your scraper from a terminal by passing the URLs, an output filename of your choice, and the paths to your Python scripts to the dude scrape command.

dude scrape --url "<url>" --output data.json path/to/script.py

The output in data.json should contain the extracted URLs along with metadata fields, which are prefixed with an underscore.

[
  {
    "_page_number": 1,
    "_page_url": "https://dude.ron.sh/",
    "_group_id": 4502003824,
    "_group_index": 0,
    "_element_index": 0,
    "url": "/url-1.html"
  },
  {
    "_page_number": 1,
    "_page_url": "https://dude.ron.sh/",
    "_group_id": 4502003824,
    "_group_index": 0,
    "_element_index": 1,
    "url": "/url-2.html"
  },
  {
    "_page_number": 1,
    "_page_url": "https://dude.ron.sh/",
    "_group_id": 4502003824,
    "_group_index": 0,
    "_element_index": 2,
    "url": "/url-3.html"
  }
]
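
The url values above are relative, while _page_url records the page they came from. As a small post-processing sketch (not part of Dude itself), the standard library's urljoin can combine the two. The helper name absolute_urls is hypothetical, and the records are a shortened copy of the sample output.

```python
import json
from urllib.parse import urljoin

# Two records copied from the sample output above (shortened).
raw = """
[
  {"_page_number": 1, "_page_url": "https://dude.ron.sh/", "_element_index": 0, "url": "/url-1.html"},
  {"_page_number": 1, "_page_url": "https://dude.ron.sh/", "_element_index": 1, "url": "/url-2.html"}
]
"""


def absolute_urls(records):
    """Resolve each record's relative "url" against its "_page_url"."""
    return [urljoin(record["_page_url"], record["url"]) for record in records]


resolved = absolute_urls(json.loads(raw))
print(resolved)
```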

Changing the output to --output data.csv should write the same records as CSV.
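
The exact CSV layout Dude writes is not reproduced here. As a rough illustration only, the same records can be flattened with Python's csv.DictWriter, one column per key. The column order below simply follows the JSON keys and is an assumption, not Dude's documented output.

```python
import csv
import io

# One record copied from the sample JSON output above.
records = [
    {
        "_page_number": 1,
        "_page_url": "https://dude.ron.sh/",
        "_group_id": 4502003824,
        "_group_index": 0,
        "_element_index": 0,
        "url": "/url-1.html",
    },
]

# Write to an in-memory buffer so the sketch does not touch the filesystem.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()), lineterminator="\n")
writer.writeheader()
writer.writerows(records)
csv_text = buffer.getvalue()
print(csv_text)
```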

Features

  • Simple Flask-inspired design - build a scraper with decorators.
  • Uses Playwright API - run your scraper in Chrome, Firefox and Webkit and leverage Playwright's powerful selector engine supporting CSS, XPath, text, regex, etc.
  • Data grouping - group related results.
  • URL pattern matching - run functions on matched URLs.
  • Priority - reorder functions based on priority.
  • Setup function - enable setup steps (clicking dialogs or login).
  • Navigate function - enable navigation steps to move to other pages.
  • Custom storage - option to save data to other formats or database.
  • Async support - write async handlers.
  • Option to use other parser backends aside from Playwright.
  • Option to follow all links indefinitely (Crawler/Spider).
  • Events - attach functions to startup, pre-setup, post-setup and shutdown events.
  • Option to save data on every page.
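
Two of the features above, URL pattern matching and priority, can be pictured with a short standard-library sketch. This is not Dude's code or API: the register decorator, its url_pattern and priority parameters, and the rule that lower numbers run first are all assumptions made for illustration.

```python
import re

# Hypothetical registry of (priority, compiled URL pattern, handler) tuples.
REGISTRY = []


def register(url_pattern=".*", priority=100):
    """Register a handler that applies only to URLs matching url_pattern."""
    def decorator(func):
        REGISTRY.append((priority, re.compile(url_pattern), func))
        return func
    return decorator


def handlers_for(url):
    """Return names of matching handlers, lowest priority number first."""
    matching = [(prio, func) for prio, pattern, func in REGISTRY if pattern.search(url)]
    matching.sort(key=lambda item: item[0])  # stable sort keeps registration order on ties
    return [func.__name__ for _, func in matching]


@register(priority=2)
def get_title(element):
    return {"title": element}


@register(url_pattern=r"example\.com", priority=1)
def get_price(element):
    return {"price": element}


on_example = handlers_for("https://example.com/item")
on_dude = handlers_for("https://dude.ron.sh/")
print(on_example)
print(on_dude)
```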

Supported Parser Backends

By default, Dude uses Playwright, but you can switch to a parser backend that you are already familiar with, such as BeautifulSoup4, Parsel, lxml, Pyppeteer, or Selenium.

Here is a summary of the features supported by each parser backend.

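| Parser Backend | Supports Sync? | Supports Async? | CSS Selectors | XPath Selectors | Text Selectors | Regex Selectors | Setup Handler | Navigate Handler |
|----------------|----------------|-----------------|---------------|-----------------|----------------|-----------------|---------------|------------------|
| Playwright     | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| BeautifulSoup4 | ✅ | ✅ | ✅ | 🚫 | 🚫 | 🚫 | 🚫 | 🚫 |
| Parsel         | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 🚫 | 🚫 |
| lxml           | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 🚫 | 🚫 |
| Pyppeteer      | 🚫 | ✅ | ✅ | ✅ | ✅ | 🚫 | ✅ | ✅ |
| Selenium       | ✅ | ✅ | ✅ | ✅ | ✅ | 🚫 | ✅ | ✅ |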
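
When choosing a backend, the table above can be treated as data. The sketch below transcribes it into a dictionary and filters by required features. The feature keys and the backends_supporting helper are hypothetical names, and the values are copied from the table rather than queried from Dude.

```python
# Column order: sync, async, css, xpath, text, regex, setup, navigate.
FEATURES = ("sync", "async", "css", "xpath", "text", "regex", "setup", "navigate")

# 1 = supported, 0 = not supported, transcribed from the table above.
SUPPORT = {
    "Playwright":     (1, 1, 1, 1, 1, 1, 1, 1),
    "BeautifulSoup4": (1, 1, 1, 0, 0, 0, 0, 0),
    "Parsel":         (1, 1, 1, 1, 1, 1, 0, 0),
    "lxml":           (1, 1, 1, 1, 1, 1, 0, 0),
    "Pyppeteer":      (0, 1, 1, 1, 1, 0, 1, 1),
    "Selenium":       (1, 1, 1, 1, 1, 0, 1, 1),
}


def backends_supporting(*required):
    """Return the backends that support every feature named in required."""
    indexes = [FEATURES.index(name) for name in required]
    return [name for name, row in SUPPORT.items() if all(row[i] for i in indexes)]


regex_backends = backends_supporting("regex")
xpath_navigate_backends = backends_supporting("xpath", "navigate")
print(regex_backends)
print(xpath_navigate_backends)
```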

Documentation

Read the complete documentation at https://roniemartinez.github.io/dude/. All the advanced and useful features are documented there.

Requirements

  • ✅ Any dude should know how to work with selectors (CSS or XPath).
  • ✅ Familiarity with any backends that you love (see Supported Parser Backends).
  • ✅ Python decorators... you'll live, dude!

Why name this project "dude"?

  • ✅ A recursive acronym looks nice.
  • ✅ Adding "uncomplicated" (like ufw) into the name says it is a very simple framework.
  • ✅ Puns! I also think that if you want to do web scraping, there's probably some random dude around the corner who can make it very easy for you to start with it. 😊

Author

Ronie Martinez

Contributors ✨

Thanks goes to these wonderful people (emoji key):


Ronie Martinez

🚧 💻 📖

This project follows the all-contributors specification. Contributions of any kind welcome!

Comments
  • ⬆️ Bump pybrowsers from 0.5.0 to 0.5.1

    Bumps pybrowsers from 0.5.0 to 0.5.1.

    Release notes

    Sourced from pybrowsers's releases.

    πŸ› Fix Windows Chromium

    What's Changed

    Full Changelog: https://github.com/roniemartinez/browsers/compare/0.5.0...0.5.1

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    dependencies 
    opened by dependabot[bot] 7
  • ⬆️ Bump mypy from 0.931 to 0.940

    Bumps mypy from 0.931 to 0.940.

    Commits

    dependencies 
    opened by dependabot[bot] 4
  • ⬆️ Bump selenium-wire from 4.6.4 to 4.6.5

    Bumps selenium-wire from 4.6.4 to 4.6.5.

    Changelog

    Sourced from selenium-wire's changelog.

    4.6.5 (2022-07-09)

    • Fix compatibility issue with DesiredCapabilities and older versions of the Chrome webdriver API.
    • Fix bug where verify_ssl would assume the inverse of the boolean passed (it was the wrong way round).
    • Minor update to support Python 3.10.
    • Minor README updates.
    Commits
    • 9368006 Bump version: 4.6.4 → 4.6.5
    • 3c5166a Updates for 4.6.5
    • 3f37ef9 Revert version change
    • cba952d Merge pull request #570 from royopa/patch-1
    • de51d5c Merge pull request #574 from sterliakov/master
    • 7984282 Add 3.10 readme badge
    • 8054f65 Use By locator in test instead of deprecated/removed long lookup method
    • c7ed400 Bump version: 4.6.4 → 4.6.5
    • 27e4e59 Remove werkzeug from main requirements, leave only in dev.
    • 0ccdbdf Add python 3.10 as a CI job
    • Additional commits viewable in compare view

    dependencies 
    opened by dependabot[bot] 3
  • ⬆️ Bump pytest from 7.0.1 to 7.1.0

    Bumps pytest from 7.0.1 to 7.1.0.

    Release notes

    Sourced from pytest's releases.

    7.1.0

    pytest 7.1.0 (2022-03-13)

    Breaking Changes

    • #8838: As per our policy, the following features have been deprecated in the 6.X series and are now removed:

      • pytest._fillfuncargs function.
      • pytest_warning_captured hook - use pytest_warning_recorded instead.
      • -k -foobar syntax - use -k 'not foobar' instead.
      • -k foobar: syntax.
      • pytest.collect module - import from pytest directly.

      For more information consult Deprecations and Removals in the docs.

    • #9437: Dropped support for Python 3.6, which reached end-of-life at 2021-12-23.

    Improvements

    • #5192: Fixed test output for some data types where -v would show less information.

      Also, when showing diffs for sequences, -q would produce full diffs instead of the expected diff.

    • #9362: pytest now avoids specialized assert formatting when it is detected that the default __eq__ is overridden in attrs or dataclasses.

    • #9536: When -vv is given on command line, show skipping and xfail reasons in full instead of truncating them to fit the terminal width.

    • #9644: More information about the location of resources that led Python to raise ResourceWarning can now be obtained by enabling tracemalloc.

      See resource-warnings for more information.

    • #9678: More types are now accepted in the ids argument to @pytest.mark.parametrize. Previously only str, float, int and bool were accepted; now bytes, complex, re.Pattern, Enum and anything with a __name__ are also accepted.

    • #9692: pytest.approx now raises a TypeError when given an unordered sequence (such as set).

      Note that this implies that custom classes which only implement __iter__ and __len__ are no longer supported as they don't guarantee order.

    Bug Fixes

    • #8242: The deprecation of raising unittest.SkipTest to skip collection of tests during the pytest collection phase is reverted - this is now a supported feature again.
    • #9493: Symbolic link components are no longer resolved in conftest paths. This means that if a conftest appears twice in collection tree, using symlinks, it will be executed twice.

    ... (truncated)

    Commits
    • 1dbffcc [pre-commit.ci] auto fixes from pre-commit.com hooks
    • d53a5fb Prepare release version 7.1.0
    • d306ec0 Update upcoming trainings (#9744)
    • 3e4c14b Merge pull request #9751 from fabianegli/main
    • 7f924b1 Fix typo in deprecation documentation
    • 4a8f8ad build(deps): Bump django from 4.0.2 to 4.0.3 in /testing/plugins_integration ...
    • c0fd2d8 build(deps): Bump pytest-asyncio from 0.18.1 to 0.18.2 in /testing/plugins_in...
    • 843e018 Merge pull request #9732 from nicoddemus/9730-toml-failure
    • bc43d66 [automated] Update plugin list (#9733)
    • e38d1ca Improve error message for malformed pyproject.toml files
    • Additional commits viewable in compare view

    dependencies 
    opened by dependabot[bot] 3
  • ⬆️ Bump mkdocs-material from 8.5.6 to 8.5.10

    Bumps mkdocs-material from 8.5.6 to 8.5.10.

    Release notes

    Sourced from mkdocs-material's releases.

    mkdocs-material-8.5.10

    • Adjusted CSS to better allow for custom primary and accent colors
    • Fixed #4620: Primary color is not applied (8.5.9 regression)

    mkdocs-material-8.5.9

    • Fixed #4600: Illegible links for black/white primary colors (8.5.8 regression)
    • Fixed #4594: Need to set schema to change link color

    mkdocs-material-8.5.8

    • Added support for always showing settings in cookie consent
    • Fixed #4571: Buttons invisible if primary color is white or black
    • Fixed #4517: Illegible note in sequence diagram when using slate scheme

    mkdocs-material-8.5.7

    • Deprecated additional admonition qualifiers to reduce size of CSS
    • Fixed #4511: Search boost does not apply to sections
    Changelog

    Sourced from mkdocs-material's changelog.

    mkdocs-material-8.5.10 (2022-11-11)

    • Adjusted CSS to better allow for custom primary and accent colors
    • Fixed #4620: Primary color is not applied (8.5.9 regression)

    mkdocs-material-8.5.9 (2022-11-08)

    • Fixed #4600: Illegible link colors for black and white primary colors
    • Fixed #4594: Need to set schema to change link color

    mkdocs-material-8.5.8+insiders-4.26.2 (2022-11-03)

    • Updated MkDocs to 1.4.2
    • Added support for tag compare functions when sorting on index pages
    • Fixed footnotes being rendered in post excerpts without separators
    • Fixed error in blog plugin when toc extension is not enabled
    • Fixed issues with invalid asset paths and linked post titles
    • Fixed #4572: Privacy plugin fails when symlinks cannot be created
    • Fixed #4545: Blog plugin doesn't automatically link headline to post
    • Fixed #4542: Blog plugin doesn't allow for multiple instances
    • Fixed #4532: Blog plugin doesn't allow for mixed use of date and datetime

    mkdocs-material-8.5.8 (2022-11-03)

    • Added support for always showing settings in cookie consent
    • Fixed #4571: Buttons invisible if primary color is white or black
    • Fixed #4517: Illegible note in sequence diagram when using slate scheme

    mkdocs-material-8.5.7+insiders-4.26.1 (2022-10-22)

    • Improved reporting of configuration errors in tags plugin
    • Fixed #4515: Privacy plugin fails when site URL is not defined
    • Fixed #4514: Privacy plugin doesn't fetch Google fonts (4.26.0 regression)

    mkdocs-material-8.5.7 (2022-10-22)

    • Deprecated additional admonition qualifiers to reduce size of CSS
    • Fixed #4511: Search boost does not apply to sections

    mkdocs-material-8.5.6+insiders-4.26.0 (2022-10-18)

    • Refactored privacy plugin to prepare for new features
    • Added support for rel=noopener links in privacy plugin
    • Resolve encoding issues with blog and privacy plugin

    mkdocs-material-8.5.6+insiders-4.25.5 (2022-10-16)

    • Updated MkDocs to 1.4.1
    • Added namespace prefix to built-in plugins
    • Updated content and header partial

    ... (truncated)

    Commits
    • 08bf992 Prepare 8.5.10 release
    • 6a1b86e Fixed development environment and overrides
    • c62ff2c Allowed to override primary and accent colors more easily
    • 078a411 Allowed to override primary and accent colors more easily
    • 382e870 Formatting
    • a540c33 Documentation
    • 018e5f8 Added Cash App to premium sponsors
    • 84bc19c Prepare 8.5.9 release
    • 074a0c8 Fixed issues with color overrides by always setting color attributes
    • f5f5baa Fixed illegible links for black and white primary colors
    • Additional commits viewable in compare view

    dependencies 
    opened by dependabot[bot] 2
  • ⬆️ Bump mkdocs-material from 8.5.6 to 8.5.9

    Bumps mkdocs-material from 8.5.6 to 8.5.9.

    Release notes

    Sourced from mkdocs-material's releases.

    mkdocs-material-8.5.9

    • Fixed #4600: Illegible links for black/white primary colors (8.5.8 regression)
    • Fixed #4594: Need to set schema to change link color

    mkdocs-material-8.5.8

    • Added support for always showing settings in cookie consent
    • Fixed #4571: Buttons invisible if primary color is white or black
    • Fixed #4517: Illegible note in sequence diagram when using slate scheme

    mkdocs-material-8.5.7

    • Deprecated additional admonition qualifiers to reduce size of CSS
    • Fixed #4511: Search boost does not apply to sections
    Changelog

    Sourced from mkdocs-material's changelog.

    mkdocs-material-8.5.9 (2022-11-08)

    • Fixed #4600: Illegible link colors for black and white primary colors
    • Fixed #4594: Need to set schema to change link color

    mkdocs-material-8.5.8+insiders-4.26.2 (2022-11-03)

    • Updated MkDocs to 1.4.2
    • Added support for tag compare functions when sorting on index pages
    • Fixed footnotes being rendered in post excerpts without separators
    • Fixed error in blog plugin when toc extension is not enabled
    • Fixed issues with invalid asset paths and linked post titles
    • Fixed #4572: Privacy plugin fails when symlinks cannot be created
    • Fixed #4545: Blog plugin doesn't automatically link headline to post
    • Fixed #4542: Blog plugin doesn't allow for multiple instances
    • Fixed #4532: Blog plugin doesn't allow for mixed use of date and datetime

    mkdocs-material-8.5.8 (2022-11-03)

    • Added support for always showing settings in cookie consent
    • Fixed #4571: Buttons invisible if primary color is white or black
    • Fixed #4517: Illegible note in sequence diagram when using slate scheme

    mkdocs-material-8.5.7+insiders-4.26.1 (2022-10-22)

    • Improved reporting of configuration errors in tags plugin
    • Fixed #4515: Privacy plugin fails when site URL is not defined
    • Fixed #4514: Privacy plugin doesn't fetch Google fonts (4.26.0 regression)

    mkdocs-material-8.5.7 (2022-10-22)

    • Deprecated additional admonition qualifiers to reduce size of CSS
    • Fixed #4511: Search boost does not apply to sections

    mkdocs-material-8.5.6+insiders-4.26.0 (2022-10-18)

    • Refactored privacy plugin to prepare for new features
    • Added support for rel=noopener links in privacy plugin
    • Resolve encoding issues with blog and privacy plugin

    mkdocs-material-8.5.6+insiders-4.25.5 (2022-10-16)

    • Updated MkDocs to 1.4.1
    • Added namespace prefix to built-in plugins
    • Updated content and header partial

    mkdocs-material-8.5.6+insiders-4.25.4 (2022-10-09)

    • Fixed other path issues for standalone blogs (4.24.2 regression)

    ... (truncated)

    Commits
    • 84bc19c Prepare 8.5.9 release
    • 074a0c8 Fixed issues with color overrides by always setting color attributes
    • f5f5baa Fixed illegible links for black and white primary colors
    • 59f981e Updated dependencies
    • 2569d4f Debug documentation build
    • 98b51db Debug documentation build
    • 18c5e9a Updated Insiders changelog
    • dca4274 Removed unnecessary overrides prefix
    • dcd4a3d Prepare 8.5.8 release
    • 9e8446e Merge pull request #4585 from squidfunk/docs/simplify-overrides
    • Additional commits viewable in compare view

    dependencies 
    opened by dependabot[bot] 2
  • ⬆️ Bump types-pyyaml from 6.0.12 to 6.0.12.1

    Bumps types-pyyaml from 6.0.12 to 6.0.12.1.

    Commits

    dependencies 
    opened by dependabot[bot] 2
  • ⬆️ Bump autoflake from 1.7.6 to 1.7.7

    Bumps autoflake from 1.7.6 to 1.7.7.

    Commits

    dependencies 
    opened by dependabot[bot] 2
  • ⬆️ Bump mkdocs-material from 8.5.6 to 8.5.7

    Bumps mkdocs-material from 8.5.6 to 8.5.7.

    Release notes

    Sourced from mkdocs-material's releases.

    mkdocs-material-8.5.7

    • Deprecated additional admonition qualifiers to reduce size of CSS
    • Fixed #4511: Search boost does not apply to sections
    Changelog

    Sourced from mkdocs-material's changelog.

    mkdocs-material-8.5.7+insiders-4.26.0 (2022-10-22)

    • Improved reporting of configuration errors in tags plugin
    • Fixed #4515: Privacy plugin fails when site URL is not defined
    • Fixed #4514: Privacy plugin doesn't fetch Google fonts (4.26.0 regression)

    mkdocs-material-8.5.7 (2022-10-22)

    • Deprecated additional admonition qualifiers to reduce size of CSS
    • Fixed #4511: Search boost does not apply to sections

    mkdocs-material-8.5.6+insiders-4.26.0 (2022-10-18)

    • Refactored privacy plugin to prepare for new features
    • Added support for rel=noopener links in privacy plugin
    • Resolve encoding issues with blog and privacy plugin

    mkdocs-material-8.5.6+insiders-4.25.5 (2022-10-16)

    • Updated MkDocs to 1.4.1
    • Added namespace prefix to built-in plugins
    • Updated content and header partial

    mkdocs-material-8.5.6+insiders-4.25.4 (2022-10-09)

    • Fixed other path issues for standalone blogs (4.24.2 regression)

    mkdocs-material-8.5.6+insiders-4.25.3 (2022-10-09)

    • Fixed #4457: Posts not collected for standalone blog (4.24.2 regression)

    mkdocs-material-8.5.6+insiders-4.25.2 (2022-10-04)

    • Fixed #4452: Blog and tags plugin crash when specifying slugify function

    mkdocs-material-8.5.6+insiders-4.25.1 (2022-10-03)

    • Updated mkdocs-rss-plugin in Dockerfile to fix MkDocs compat errors

    mkdocs-material-8.5.6+insiders-4.25.0 (2022-10-02)

    • Added support for navigation subtitles
    • Added support for defining an allow list for built-in tags plugin
    • Added support for custom slugify functions for built-in tags plugin
    • Improved stability of search plugin when using --dirtyreload

    mkdocs-material-8.5.6 (2022-10-02)

    • Modernized appearance of admonitions (with fallback, see docs)
    • Improved appearance of inline code blocks in admonition titles

    ... (truncated)

    Commits
    • 23f12fe Prepare 8.5.7 release
    • 922fde0 Fixed search boost not being applied to document sections
    • f13a552 Documentation
    • b2f310a Documentation
    • 93daab2 Updated Insiders changelog and documentation
    • b8161e0 Added warning for social plugin when site_url is not defined
    • ed6f0b1 Updated Insiders changelog
    • 34f563c Temporarily disabled no-misused-promise check due to ESLint error
    • 2b08c42 Fixed linter errors
    • b0afb7f Updated dependencies
    • Additional commits viewable in compare view

    dependencies 
    opened by dependabot[bot] 2
  • ⬆️ Bump autoflake from 1.7.4 to 1.7.5

    Bumps autoflake from 1.7.4 to 1.7.5.

    dependencies 
    opened by dependabot[bot] 2
  • ⬆️ Bump autoflake from 1.7.1 to 1.7.3

    Bumps autoflake from 1.7.1 to 1.7.3.

    dependencies 
    opened by dependabot[bot] 2
  • ⬆️ Bump isort from 5.10.1 to 5.11.4

    Bumps isort from 5.10.1 to 5.11.4.

    Release notes

    Sourced from isort's releases.

    5.11.4

    Changes

    :package: Dependencies

    5.11.3

    Changes

    :beetle: Fixes

    :construction_worker: Continuous Integration

    v5.11.3

    Changes

    :beetle: Fixes

    :construction_worker: Continuous Integration

    5.11.2

    Changes

    5.11.1

    Changes December 12 2022

    ... (truncated)

    Changelog

    Sourced from isort's changelog.

    5.11.4 December 21 2022

    5.11.3 December 16 2022

    5.11.2 December 12 2022

    5.11.1 December 12 2022

    5.11.0 December 12 2022

    Commits
    • 98390f5 Merge pull request #2059 from PyCQA/version/5.11.4
    • df69a05 Bump version 5.11.4
    • f9add58 Merge pull request #2058 from PyCQA/deps/poetry-1.3.1
    • 36caa91 Bump Poetry 1.3.1
    • 3c2e2d0 Merge pull request #1978 from mgorny/toml-test
    • 45d6abd Remove obsolete toml import from the test suite
    • 3020e0b Merge pull request #2057 from mgorny/poetry-install
    • a6fdbfd Stop installing documentation files to top-level site-packages
    • ff306f8 Fix tag template to match old standard
    • 227c4ae Merge pull request #2052 from hugovk/main
    • Additional commits viewable in compare view

    dependencies 
    opened by dependabot[bot] 1
  • ⬆️ Bump mkdocs-material from 8.5.6 to 8.5.11

    Bumps mkdocs-material from 8.5.6 to 8.5.11.

    Release notes

    Sourced from mkdocs-material's releases.

    mkdocs-material-8.5.11

    mkdocs-material-8.5.10

    • Adjusted CSS to better allow for custom primary and accent colors
    • Fixed #4620: Primary color is not applied (8.5.9 regression)

    mkdocs-material-8.5.9

    • Fixed #4600: Illegible links for black/white primary colors (8.5.8 regression)
    • Fixed #4594: Need to set schema to change link color

    mkdocs-material-8.5.8

    • Added support for always showing settings in cookie consent
    • Fixed #4571: Buttons invisible if primary color is white or black
    • Fixed #4517: Illegible note in sequence diagram when using slate scheme

    mkdocs-material-8.5.7

    • Deprecated additional admonition qualifiers to reduce size of CSS
    • Fixed #4511: Search boost does not apply to sections
    Changelog

    Sourced from mkdocs-material's changelog.

    mkdocs-material-8.5.11 (2022-11-30)

    mkdocs-material-8.5.10+insiders-4.26.6 (2022-11-28)

    • Fixed #4683: Tags plugin crashes when a tag is empty

    mkdocs-material-8.5.10+insiders-4.26.5 (2022-11-27)

    • Fixed #4632: Post excerpt title link doesn't point to top of the page

    mkdocs-material-8.5.10+insiders-4.26.4 (2022-11-27)

    • Fixed redundant file extension when using privacy plugin

    mkdocs-material-8.5.10+insiders-4.26.3 (2022-11-15)

    • Fixed #4637: Attachments w/o titles in related links error in blog plugin
    • Fixed #4631: Remote favicons not downloaded and inlined by privacy plugin

    mkdocs-material-8.5.10 (2022-11-11)

    • Adjusted CSS to better allow for custom primary and accent colors
    • Fixed #4620: Primary color is not applied (8.5.9 regression)

    mkdocs-material-8.5.9 (2022-11-08)

    • Fixed #4600: Illegible link colors for black and white primary colors
    • Fixed #4594: Need to set schema to change link color

    mkdocs-material-8.5.8+insiders-4.26.2 (2022-11-03)

    • Updated MkDocs to 1.4.2
    • Added support for tag compare functions when sorting on index pages
    • Fixed footnotes being rendered in post excerpts without separators
    • Fixed error in blog plugin when toc extension is not enabled
    • Fixed issues with invalid asset paths and linked post titles
    • Fixed #4572: Privacy plugin fails when symlinks cannot be created
    • Fixed #4545: Blog plugin doesn't automatically link headline to post
    • Fixed #4542: Blog plugin doesn't allow for multiple instances
    • Fixed #4532: Blog plugin doesn't allow for mixed use of date and datetime

    mkdocs-material-8.5.8 (2022-11-03)

    • Added support for always showing settings in cookie consent
    • Fixed #4571: Buttons invisible if primary color is white or black
    • Fixed #4517: Illegible note in sequence diagram when using slate scheme

    mkdocs-material-8.5.7+insiders-4.26.1 (2022-10-22)

    ... (truncated)

    dependencies 
    opened by dependabot[bot] 1
  • ⬆️ Bump autoflake from 1.7.6 to 1.7.8

    Bumps autoflake from 1.7.6 to 1.7.8.

    dependencies 
    opened by dependabot[bot] 1
  • ⬆️ Bump types-pyyaml from 6.0.12 to 6.0.12.2

    Bumps types-pyyaml from 6.0.12 to 6.0.12.2.

    dependencies 
    opened by dependabot[bot] 1
  • ⬆️ Bump cssselect from 1.1.0 to 1.2.0

    Bumps cssselect from 1.1.0 to 1.2.0.

    Changelog

    Sourced from cssselect's changelog.

    Version 1.2.0

    Released on 2022-10-27.

    • Drop support for Python 2.7, 3.4-3.6, add support for Python 3.7-3.11.

    • Add type annotations (PEP 484 and PEP 561).

    • More features from the CSS Selectors Level 4:

      • The :is() pseudo-class.

      • The :where() pseudo-class.

      • The :has() pseudo-class, with some limitations.

    • Fix parsing :scope after a comma.

    • Add parentheses to fix condition precedence in some cases.

    • Private API changes related to the removal of the Python 2 support:

      • Remove _unicode and _unichr aliases from cssselect.parser.

      • Remove _basestring and _unicode aliases from cssselect.xpath.

      • Deprecate cssselect.xpath._unicode_safe_getattr() and change it to just call getattr().

    • Include tests in the PyPI tarball.

    • Many CI additions and improvements.

    • Improve the test coverage.

    Commits
    • ddd9784 Merge pull request #134 from scrapy/fix-publish
    • e4493e9 Fix the tag format in the publish action.
    • 97cc517 Bump version: 1.1.0 → 1.2.0
    • cfa2959 Merge pull request #131 from scrapy/relnotes-1.2
    • 60c6146 Restore and deprecate _unicode_safe_getattr (#133)
    • 2c7c1ea Switch to the released 3.11.
    • faa595c Add a changelog entry about private API changes.
    • 89f1a86 Merge pull request #132 from scrapy/install-py.typed
    • d21b85d Fix installing py.typed.
    • e26aa4d Replace "Unicode string" with just "string".
    • Additional commits viewable in compare view

    dependencies 
    opened by dependabot[bot] 1
Releases(0.22.0)
  • 0.22.0(Jul 4, 2022)

    What's Changed

    • ⬆️ Bump mkdocs-material from 8.3.3 to 8.3.4 by @dependabot in https://github.com/roniemartinez/dude/pull/175
    • ⬆️ Bump mkdocs-material from 8.3.4 to 8.3.5 by @dependabot in https://github.com/roniemartinez/dude/pull/176
    • ⬆️ Bump mkdocs-material from 8.3.5 to 8.3.6 by @dependabot in https://github.com/roniemartinez/dude/pull/177
    • ⬆️ Bump mkdocs-material from 8.3.6 to 8.3.7 by @dependabot in https://github.com/roniemartinez/dude/pull/178
    • ⬆️ Bump webdriver-manager from 3.7.0 to 3.7.1 by @dependabot in https://github.com/roniemartinez/dude/pull/181
    • ⬆️ Bump mkdocs-material from 8.3.7 to 8.3.8 by @dependabot in https://github.com/roniemartinez/dude/pull/179
    • ⬆️ Bump types-pyyaml from 6.0.8 to 6.0.9 by @dependabot in https://github.com/roniemartinez/dude/pull/180
    • ⬆️ Bump black from 22.3.0 to 22.6.0 by @dependabot in https://github.com/roniemartinez/dude/pull/182
    • ⬆️ Bump playwright from 1.22.0 to 1.23.0 by @dependabot in https://github.com/roniemartinez/dude/pull/183
    • ⬆️ Bump mkdocs-material from 8.3.8 to 8.3.9 by @dependabot in https://github.com/roniemartinez/dude/pull/184
    • ⬆️ Bump webdriver-manager from 3.7.1 to 3.8.0 by @dependabot in https://github.com/roniemartinez/dude/pull/185
    • ⬆️ Bump lxml from 4.9.0 to 4.9.1 by @dependabot in https://github.com/roniemartinez/dude/pull/186
    • ⬆ Bump version by @roniemartinez in https://github.com/roniemartinez/dude/pull/187

    Full Changelog: https://github.com/roniemartinez/dude/compare/0.21.1...0.22.0

    Source code(tar.gz)
    Source code(zip)
  • 0.21.1(Jun 8, 2022)

    What's Changed

    • 🐛 Fix memory leak by @roniemartinez in https://github.com/roniemartinez/dude/pull/174
    • ⬆️ Bump mkdocs-material from 8.3.1 to 8.3.2 by @dependabot in https://github.com/roniemartinez/dude/pull/171
    • ⬆️ Bump mypy from 0.960 to 0.961 by @dependabot in https://github.com/roniemartinez/dude/pull/172
    • ⬆️ Bump mkdocs-material from 8.3.2 to 8.3.3 by @dependabot in https://github.com/roniemartinez/dude/pull/173

    Full Changelog: https://github.com/roniemartinez/dude/compare/0.21.0...0.21.1

  • 0.21.0(Jun 4, 2022)

    What's Changed

    • ✨ ChromeDriver version selection by @roniemartinez in https://github.com/roniemartinez/dude/pull/170
    • 🐛 Fix mkdocstrings by @roniemartinez in https://github.com/roniemartinez/dude/pull/167
    • ⬆️ Bump mkdocs-material from 8.2.16 to 8.3.0 by @dependabot in https://github.com/roniemartinez/dude/pull/168

    Full Changelog: https://github.com/roniemartinez/dude/compare/0.20.3...0.21.0

  • 0.20.3(Jun 1, 2022)

    What's Changed

    • ⬆️ Bump braveblock from 0.2.0 to 0.3.0 by @dependabot in https://github.com/roniemartinez/dude/pull/159
    • ⬆️ Bump mypy from 0.950 to 0.960 by @dependabot in https://github.com/roniemartinez/dude/pull/161
    • ⬆️ Bump mkdocs-material from 8.2.15 to 8.2.16 by @dependabot in https://github.com/roniemartinez/dude/pull/164
    • ⬆️ Bump mkdocstrings from 0.18.1 to 0.19.0 by @dependabot in https://github.com/roniemartinez/dude/pull/163
    • ⬆️ Bump lxml from 4.8.0 to 4.9.0 by @dependabot in https://github.com/roniemartinez/dude/pull/165
    • ⬆ Update dependencies by @roniemartinez in https://github.com/roniemartinez/dude/pull/166

    Full Changelog: https://github.com/roniemartinez/dude/compare/0.20.2...0.20.3

  • 0.20.2(May 25, 2022)

    What's Changed

    • 🐛 Fix helper imports by @roniemartinez in https://github.com/roniemartinez/dude/pull/158
    • ⬆️ Bump httpx from 0.22.0 to 0.23.0 by @dependabot in https://github.com/roniemartinez/dude/pull/156

    Full Changelog: https://github.com/roniemartinez/dude/compare/0.20.1...0.20.2

  • 0.20.1(May 21, 2022)

    What's Changed

    • 💚 Set latest tag in docker/build-push-action by @roniemartinez in https://github.com/roniemartinez/dude/pull/153

    Full Changelog: https://github.com/roniemartinez/dude/compare/0.20.0...0.20.1

  • 0.20.0(May 21, 2022)

    What's Changed

    • 🐳 Docker image by @roniemartinez in https://github.com/roniemartinez/dude/pull/152

    Other

    • ⬆️ Bump mkdocs-material from 8.2.13 to 8.2.14 by @dependabot in https://github.com/roniemartinez/dude/pull/148
    • ⬆️ Bump selenium-wire from 4.6.3 to 4.6.4 by @dependabot in https://github.com/roniemartinez/dude/pull/149
    • ⬆️ Bump playwright from 1.21.0 to 1.22.0 by @dependabot in https://github.com/roniemartinez/dude/pull/150
    • ⬆️ Bump mkdocs-material from 8.2.14 to 8.2.15 by @dependabot in https://github.com/roniemartinez/dude/pull/151

    Full Changelog: https://github.com/roniemartinez/dude/compare/0.19.0...0.20.0

  • 0.19.0(May 5, 2022)

    What's Changed

    • ✨ Follow dynamically-built URLs by @roniemartinez in https://github.com/roniemartinez/dude/pull/146
    • 🔨 Add ignore robots.txt warning by @roniemartinez in https://github.com/roniemartinez/dude/pull/145

    Dependencies

    • ⬆️ Bump beautifulsoup4 from 4.11.0 to 4.11.1 by @dependabot in https://github.com/roniemartinez/dude/pull/136
    • ⬆️ Bump mkdocs-material from 8.2.8 to 8.2.9 by @dependabot in https://github.com/roniemartinez/dude/pull/135
    • ⬆️ Bump pyproject-flake8 from 0.0.1a3 to 0.0.1a4 by @dependabot in https://github.com/roniemartinez/dude/pull/137
    • ⬆️ Bump playwright from 1.20.1 to 1.21.0 by @dependabot in https://github.com/roniemartinez/dude/pull/138
    • ⬆️ Bump types-pyyaml from 6.0.5 to 6.0.6 by @dependabot in https://github.com/roniemartinez/dude/pull/139
    • ⬆️ Bump types-pyyaml from 6.0.6 to 6.0.7 by @dependabot in https://github.com/roniemartinez/dude/pull/140
    • ⬆️ Bump pytest from 7.1.1 to 7.1.2 by @dependabot in https://github.com/roniemartinez/dude/pull/141
    • ⬆️ Bump mkdocs-material from 8.2.9 to 8.2.11 by @dependabot in https://github.com/roniemartinez/dude/pull/142
    • ⬆️ Bump mypy from 0.942 to 0.950 by @dependabot in https://github.com/roniemartinez/dude/pull/143
    • ⬆️ Update dependencies by @roniemartinez in https://github.com/roniemartinez/dude/pull/144
    • ⬆️ Bump mkdocs-material from 8.2.11 to 8.2.13 by @dependabot in https://github.com/roniemartinez/dude/pull/147

    Full Changelog: https://github.com/roniemartinez/dude/compare/0.18.0...0.19.0

  • 0.18.0(Apr 10, 2022)

    What's Changed

    • ✨ Follow robots.txt rules with option to ignore by @roniemartinez in https://github.com/roniemartinez/dude/pull/134
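
As a rough illustration of what following robots.txt rules with an opt-out can look like, the sketch below uses Python's standard `urllib.robotparser`. It is not dude's implementation; the `may_fetch` helper, the `ignore_robots` parameter, the user agent string, and the example rules are all assumptions made for this example.

```python
from urllib.robotparser import RobotFileParser

# Example rules, parsed from memory so the sketch needs no network access.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def may_fetch(url: str, ignore_robots: bool = False, user_agent: str = "example-bot") -> bool:
    """Return True if the URL may be fetched under the example rules.

    `ignore_robots` is a hypothetical switch standing in for an "ignore robots.txt" option.
    """
    if ignore_robots:
        return True
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(may_fetch("https://example.com/public/page"))
    print(may_fetch("https://example.com/private/page"))
    print(may_fetch("https://example.com/private/page", ignore_robots=True))
```

Parsing the rules from a string keeps the example offline; a real scraper would fetch each site's robots.txt before checking URLs against it.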

    Full Changelog: https://github.com/roniemartinez/dude/compare/0.17.0...0.18.0

  • 0.17.0(Apr 9, 2022)

    What's Changed

    • ✨ Rename url to url_match and support function/lambda as matcher by @roniemartinez in https://github.com/roniemartinez/dude/pull/131
    • ⬆️ Bump beautifulsoup4 from 4.10.0 to 4.11.0 by @dependabot in https://github.com/roniemartinez/dude/pull/132
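
To make the idea of a matcher that accepts either a pattern string or a function concrete, here is a small standalone sketch. It does not use dude itself, and `as_matcher` is a hypothetical helper; how dude normalizes `url_match` internally is not shown in these notes.

```python
from fnmatch import fnmatch
from typing import Callable, Union

UrlMatcher = Union[str, Callable[[str], bool]]


def as_matcher(url_match: UrlMatcher) -> Callable[[str], bool]:
    """Turn a wildcard string or a callable into a uniform predicate."""
    if callable(url_match):
        return url_match
    # Strings are treated as Unix-style wildcard patterns (an assumption of this sketch).
    return lambda url: fnmatch(url, url_match)


if __name__ == "__main__":
    by_pattern = as_matcher("*.com/*")
    by_function = as_matcher(lambda url: url.endswith(".pdf"))
    print(by_pattern("https://example.com/index.html"))
    print(by_function("https://example.com/report.pdf"))
```

Normalizing both forms into one predicate means the calling code only ever has to invoke `matcher(url)`.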

    Full Changelog: https://github.com/roniemartinez/dude/compare/0.16.0...0.17.0

  • 0.16.0(Apr 5, 2022)

    What's Changed

    • ✨ Support custom HTTP methods by @roniemartinez in https://github.com/roniemartinez/dude/pull/130

    Full Changelog: https://github.com/roniemartinez/dude/compare/0.15.2...0.16.0

  • 0.15.2(Apr 4, 2022)

    What's Changed

    • 🐛 Fix HTTPX async event hook by @roniemartinez in https://github.com/roniemartinez/dude/pull/129

    Full Changelog: https://github.com/roniemartinez/dude/compare/0.15.1...0.15.2

  • 0.15.1(Mar 31, 2022)

    What's Changed

    • ⬆️ Fix dependency gridlock by @roniemartinez in https://github.com/roniemartinez/dude/pull/128

    Full Changelog: https://github.com/roniemartinez/dude/compare/0.15.0...0.15.1

  • 0.15.0(Mar 31, 2022)

    What's Changed

    • 🔨 Run adblock on HTTPX request event hook by @roniemartinez in https://github.com/roniemartinez/dude/pull/126
    • docs: add roniemartinez as a contributor for maintenance, code, doc, infra by @allcontributors in https://github.com/roniemartinez/dude/pull/125

    New Contributors

    • @allcontributors made their first contribution in https://github.com/roniemartinez/dude/pull/125

    Full Changelog: https://github.com/roniemartinez/dude/compare/0.14.0...0.15.0

  • 0.14.0(Mar 29, 2022)

    What's Changed

    • ✨ Use fnmatch by @roniemartinez in https://github.com/roniemartinez/dude/pull/122

    Other

    • ⬆️ Bump pyproject-flake8 from 0.0.1a2 to 0.0.1a3 by @dependabot in https://github.com/roniemartinez/dude/pull/120
    • ⬆️ Bump black from 22.1.0 to 22.3.0 by @dependabot in https://github.com/roniemartinez/dude/pull/121

    fnmatch: The URL pattern matcher now uses Unix-style wildcards (fnmatch) instead of regular expressions.

    See: https://docs.python.org/3/library/fnmatch.html

    Wildcards are easier to understand and simpler to use than regular expressions.

    - @select(css=".title", url=r".*\.com")
    + @select(css=".title", url="*.com/*")
    def result_title(element):
        return {"title": element.text_content()}
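
The difference between the two styles can be checked with the standard library alone. The sketch below compares the old regex pattern with the new wildcard pattern from the diff above; the example URLs are made up, and how dude applies the pattern internally is not shown here.

```python
import re
from fnmatch import fnmatch

OLD_REGEX = r".*\.com"      # regular expression used before 0.14.0
NEW_WILDCARD = "*.com/*"    # Unix-style wildcard used from 0.14.0 on

urls = [
    "https://example.com/page",
    "https://example.org/page",
]

for url in urls:
    regex_hit = re.match(OLD_REGEX, url) is not None
    wildcard_hit = fnmatch(url, NEW_WILDCARD)
    print(f"{url}: regex={regex_hit}, wildcard={wildcard_hit}")
```

Note that the wildcard pattern as written requires a slash after `.com`, so a bare `https://example.com` would not match it.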
    

    Full Changelog: https://github.com/roniemartinez/dude/compare/0.13.0...0.14.0

  • 0.13.0(Mar 27, 2022)

    What's Changed

    • ✨ Make return value of decorated functions optional by @roniemartinez in https://github.com/roniemartinez/dude/pull/119

    Full Changelog: https://github.com/roniemartinez/dude/compare/0.12.2...0.13.0

  • 0.12.2(Mar 27, 2022)

    What's Changed

    • 🐛 Fix PlaywrightScraper overwriting output file by @roniemartinez in https://github.com/roniemartinez/dude/pull/118

    Full Changelog: https://github.com/roniemartinez/dude/compare/0.12.1...0.12.2

  • 0.12.1(Mar 27, 2022)

    What's Changed

    • 🔨 Refactor for Alpha by @roniemartinez in https://github.com/roniemartinez/dude/pull/112

    Full Changelog: https://github.com/roniemartinez/dude/compare/0.12.0...0.12.1

  • 0.12.0(Mar 25, 2022)

    What's Changed

    • ✨ Add shutdown event and save per page option by @roniemartinez in https://github.com/roniemartinez/dude/pull/102

    Other

    • ⬆️ Bump playwright from 1.20.0 to 1.20.1 by @dependabot in https://github.com/roniemartinez/dude/pull/101
    • ⬆️ Bump mypy from 0.941 to 0.942 by @dependabot in https://github.com/roniemartinez/dude/pull/104
    • ⬆️ Bump mkdocs-material from 8.2.6 to 8.2.7 by @dependabot in https://github.com/roniemartinez/dude/pull/105

    Full Changelog: https://github.com/roniemartinez/dude/compare/0.11.0...0.12.0

    ✨ Save data on each page

    You can now save data after each page is scraped. Decorate the save function with is_per_page=True and run the scraper with --save-per-page to enable it.

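    import json

    from dude import save

    @save("jsonl", is_per_page=True)
    def save_jsonl(data, output) -> bool:
        # jsonl_file is a module-level file handle opened elsewhere, for example in a startup handler.
        jsonl_file.writelines(json.dumps(item) + "\n" for item in data)
        return True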
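
For readers unfamiliar with the JSON Lines format that the handler above writes, the following standalone sketch serializes a list of items the same way, one JSON object per line. It uses an in-memory buffer instead of a real file, and the `write_jsonl` helper and sample items are made up for illustration.

```python
import io
import json


def write_jsonl(items, stream) -> int:
    """Write each item as one JSON document per line; return the number of lines written."""
    count = 0
    for item in items:
        stream.write(json.dumps(item) + "\n")
        count += 1
    return count


if __name__ == "__main__":
    buffer = io.StringIO()
    write_jsonl([{"url": "/url-1.html"}, {"url": "/url-2.html"}], buffer)
    print(buffer.getvalue(), end="")
```

Because every line is a complete JSON document, results from each page can be appended as they arrive without rewriting the file.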
    

    ✨ Shutdown event

    The shutdown event is called before the application terminates. This is useful for freeing resources such as file handles and database connections, or for other cleanup before exiting.

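    import shutil

    from dude import shutdown

    @shutdown()
    def zip_all():
        # SAVE_DIR is a module-level Path set earlier, for example in a startup handler.
        shutil.make_archive("images-and-pdfs", "zip", SAVE_DIR)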
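
The archiving step in the handler above relies on `shutil.make_archive`. The sketch below runs that call in isolation against a temporary directory; the `zip_directory` helper, the file name, and the file contents are placeholders, and no dude code is involved.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def zip_directory(source_dir: Path, archive_base: Path) -> Path:
    """Zip everything under source_dir; make_archive appends the .zip suffix itself."""
    return Path(shutil.make_archive(str(archive_base), "zip", str(source_dir)))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        save_dir = Path(tmp) / "temp"
        save_dir.mkdir()
        (save_dir / "example.txt").write_text("placeholder")
        archive = zip_directory(save_dir, Path(tmp) / "images-and-pdfs")
        with zipfile.ZipFile(archive) as zf:
            print(archive.name, zf.namelist())
```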
    

    ✨ How dude runs internally

    [Image: events]

  • 0.11.0(Mar 23, 2022)

    What's Changed

    Features

    • ✨ Events by @roniemartinez in https://github.com/roniemartinez/dude/pull/99
    • 🔗 Follow URLs by @roniemartinez in https://github.com/roniemartinez/dude/pull/90

    Documentation

    • 📚 Update docs by @roniemartinez in https://github.com/roniemartinez/dude/pull/93

    Fixes

    • 💚 Fix Actions rate limit error by @roniemartinez in https://github.com/roniemartinez/dude/pull/81
    • 🐛 Fix DevToolsActivePort file doesn't exist by @roniemartinez in https://github.com/roniemartinez/dude/pull/84
    • 🐛 Fix selenium failing on Windows by @roniemartinez in https://github.com/roniemartinez/dude/pull/94

    Other

    • ⬆️ Bump selenium-wire from 4.6.2 to 4.6.3 by @dependabot in https://github.com/roniemartinez/dude/pull/80
    • ⬆️ Bump mypy from 0.931 to 0.941 by @dependabot in https://github.com/roniemartinez/dude/pull/82
    • ⬆️ Bump pytest from 7.0.1 to 7.1.0 by @dependabot in https://github.com/roniemartinez/dude/pull/78
    • ⬆️ Bump braveblock from 0.1.13 to 0.2.0 by @dependabot in https://github.com/roniemartinez/dude/pull/83
    • ⬆️ Bump playwright from 1.19.1 to 1.20.0 by @dependabot in https://github.com/roniemartinez/dude/pull/87
    • ⬆️ Bump types-pyyaml from 6.0.4 to 6.0.5 by @dependabot in https://github.com/roniemartinez/dude/pull/88
    • ⬆️ Bump pytest from 7.1.0 to 7.1.1 by @dependabot in https://github.com/roniemartinez/dude/pull/91
    • ⬆️ Bump webdriver-manager from 3.5.3 to 3.5.4 by @dependabot in https://github.com/roniemartinez/dude/pull/97
    • ⬆️ Bump mkdocs-material from 8.2.5 to 8.2.6 by @dependabot in https://github.com/roniemartinez/dude/pull/100

    ✨ Basic Spider

    Example

    dude scrape ... --follow-urls
    

    or

    if __name__ == "__main__":
        import dude
    
        dude.run(..., follow_urls=True)
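
The release note does not describe how URLs are followed. As a rough illustration of what link following usually involves in a crawler, here is a minimal breadth-first sketch. It is not dude's implementation: `LinkCollector`, `crawl`, the `fetch` callable, the same-host restriction, and the in-memory `FAKE_SITE` are all assumptions made for this example.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def crawl(start_url, fetch, max_pages=10):
    """Breadth-first crawl. `fetch` is a hypothetical callable: url -> HTML string."""
    start_host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}  # URLs already queued, so each page is visited once
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        collector = LinkCollector()
        collector.feed(fetch(url))
        for href in collector.hrefs:
            absolute = urljoin(url, href)  # resolve relative links
            # Assumption of this sketch: stay on the starting host.
            if urlparse(absolute).netloc == start_host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited


# In-memory stand-in for a website so the sketch runs without network access.
FAKE_SITE = {
    "https://example.test/": '<a href="/a.html">A</a> <a href="/b.html">B</a>',
    "https://example.test/a.html": '<a href="/b.html">B</a> <a href="https://other.test/">X</a>',
    "https://example.test/b.html": '<a href="/">Home</a>',
}

pages = crawl("https://example.test/", lambda url: FAKE_SITE.get(url, ""))
print(pages)
```

The `seen` set keeps the crawl from looping when pages link back to each other, and `max_pages` bounds the run.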
    

    ✨ Events

    More details at https://roniemartinez.github.io/dude/advanced/14_events.html

    Example

    import uuid
    from pathlib import Path
    
    from dude import post_setup, pre_setup, startup
    
    SAVE_DIR: Path
    
    
    @startup()
    def initialize_csv():
        """
        Connection to databases or API and other use-cases can be done here before the web scraping process is started.
        """
        global SAVE_DIR
        SAVE_DIR = Path(__file__).resolve().parent / "temp"
        SAVE_DIR.mkdir(exist_ok=True)
    
    
    @pre_setup()
    def screenshot(page):
        """
        Perform actions here after loading a page (or after a successful HTTP response) and before modifying things in the
        setup stage.
        """
        unique_name = str(uuid.uuid4())
        page.screenshot(path=SAVE_DIR / f"{unique_name}.png")  # noqa
    
    
    @post_setup()
    def print_pdf(page):
        """
        Perform actions here after running the setup stage.
        """
        unique_name = str(uuid.uuid4())
        page.pdf(path=SAVE_DIR / f"{unique_name}.pdf")  # noqa
    
    
    if __name__ == "__main__":
        import dude
    
        dude.run(urls=["https://dude.ron.sh"])
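
Going by the docstrings above, `startup` handlers run before scraping begins, `pre_setup` handlers run after a page loads and before the setup stage, and `post_setup` handlers run after the setup stage. The sketch below mimics that ordering with a tiny decorator registry. It is only an illustration, not dude's internals: `on`, `fire`, `run`, and the string stand-ins for pages are hypothetical names invented for this example.

```python
from collections import defaultdict

# Hypothetical registry: maps an event name to the handlers registered for it.
_handlers = defaultdict(list)


def on(event_name):
    """Return a decorator that registers a function for `event_name`."""

    def decorator(func):
        _handlers[event_name].append(func)
        return func

    return decorator


def fire(event_name, *args):
    """Call every handler registered for `event_name`, in registration order."""
    for handler in _handlers[event_name]:
        handler(*args)


log = []


@on("startup")
def connect():
    log.append("startup")


@on("pre_setup")
def before_setup(page):
    log.append(f"pre_setup:{page}")


@on("post_setup")
def after_setup(page):
    log.append(f"post_setup:{page}")


def run(urls):
    fire("startup")  # once, before any page is processed
    for url in urls:
        page = url  # stand-in for a loaded page object
        fire("pre_setup", page)
        # the setup stage would run here
        fire("post_setup", page)
        # data extraction would run here


run(["page-1", "page-2"])
print(log)
```

In this sketch the startup handler is recorded once, followed by a `pre_setup`/`post_setup` pair for each page.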
    
    

    [Figure: diagram showing when events are executed]

    Full Changelog: https://github.com/roniemartinez/dude/compare/0.10.1...0.11.0

  • 0.10.1(Mar 13, 2022)

    What's Changed

    • 🏁 Fix Windows support by @roniemartinez in https://github.com/roniemartinez/dude/pull/76

    Full Changelog: https://github.com/roniemartinez/dude/compare/0.10.0...0.10.1

  • 0.10.0(Mar 13, 2022)

    What's Changed

    Added

    • ✨ Block ads by @roniemartinez in https://github.com/roniemartinez/dude/pull/74

    Changed

    • 🔨 Refactor and update docs by @roniemartinez in https://github.com/roniemartinez/dude/pull/75

    Full Changelog: https://github.com/roniemartinez/dude/compare/0.9.2...0.10.0

  • 0.9.2(Mar 12, 2022)

    What's Changed

    • 🔧 Disable notifications by @roniemartinez in https://github.com/roniemartinez/dude/pull/73

    Full Changelog: https://github.com/roniemartinez/dude/compare/0.9.1...0.9.2

  • 0.9.1(Mar 11, 2022)

    What's Changed

    • 📚 Add migration examples by @roniemartinez in https://github.com/roniemartinez/dude/pull/67

    Full Changelog: https://github.com/roniemartinez/dude/compare/0.9.0...0.9.1

  • 0.9.0(Mar 10, 2022)

    What's Changed

    Added

    • ✨ Add option to use Selenium by @roniemartinez in https://github.com/roniemartinez/dude/pull/64

    Fixed

    • 🐛 Pyppeteer fixes by @roniemartinez in https://github.com/roniemartinez/dude/pull/65

    Docs

    • 📚 Add pip install in README by @roniemartinez in https://github.com/roniemartinez/dude/pull/66

    Full Changelog: https://github.com/roniemartinez/dude/compare/0.8.0...0.9.0

  • 0.8.0(Mar 7, 2022)

    What's Changed

    • ✨ Add option to use Pyppeteer by @roniemartinez in https://github.com/roniemartinez/dude/pull/60
    • ⬆️ Bump mkdocs-material from 8.2.4 to 8.2.5 by @dependabot in https://github.com/roniemartinez/dude/pull/61

    Full Changelog: https://github.com/roniemartinez/dude/compare/0.7.1...0.8.0

  • 0.7.1(Mar 6, 2022)

    What's Changed

    • 📚 Add parser support table by @roniemartinez in https://github.com/roniemartinez/dude/pull/59

    Full Changelog: https://github.com/roniemartinez/dude/compare/0.7.0...0.7.1

  • 0.7.0(Mar 6, 2022)

    What's Changed

    • ✨ Add Text and Regex selectors for lxml by @roniemartinez in https://github.com/roniemartinez/dude/pull/57

    Full Changelog: https://github.com/roniemartinez/dude/compare/0.6.1...0.7.0

  • 0.6.1(Mar 6, 2022)

    What's Changed

    • 🐛 Fix lxml documentation by @roniemartinez in https://github.com/roniemartinez/dude/pull/56

    Full Changelog: https://github.com/roniemartinez/dude/compare/0.6.0...0.6.1

  • 0.6.0(Mar 6, 2022)

    What's Changed

    • ✨ lxml Implementation by @roniemartinez in https://github.com/roniemartinez/dude/pull/55

    Full Changelog: https://github.com/roniemartinez/dude/compare/0.5.1...0.6.0

Owner
Ronie Martinez
I am a Python and C/C++ enthusiast who has been working on open-source projects in my free time since 2013.