A web service for scanning media hosted by a Matrix media repository

Brendan Abolivier

Last update: Dec 1, 2022

Matrix Content Scanner

Installation

TODO

Development

In a virtual environment with pip ≥ 21.1, run:
```
pip install -e .[dev]
```

To run the unit tests, you can use either:
```
tox -e py
```
or
```
trial tests
```

To run the linters and mypy type checker, use ./scripts-dev/lint.sh.

Releasing

The exact steps for releasing will vary, but this is an approach taken by the Synapse developers (assuming a Unix-like shell):

Set a shell variable to the version you are releasing (this just makes subsequent steps easier):
```
version=X.Y.Z
```
Update setup.cfg so that the version is correct.
Stage the changed files and commit.
```
git add -u
git commit -m v$version -n
```
Push your changes.
```
git push
```
When ready, create a signed tag for the release:
```
git tag -s v$version
```
Base the tag message on the changelog.
Push the tag.
```
git push origin tag v$version
```
If applicable: Create a release, based on the tag you just pushed, on GitHub or GitLab.

If applicable: Create a source distribution and upload it to PyPI:
```
python -m build
twine upload dist/matrix_content_scanner-$version*
```
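
Before tagging, it can help to confirm that setup.cfg really carries the version you are about to release. The following is a minimal sketch of such a check, not part of this project: the function names are hypothetical, and it assumes the version is declared statically as a `version` key in the `[metadata]` section of setup.cfg (the usual setuptools declarative layout), which may not match how this repository stores it.

```python
import configparser


def read_setup_cfg_version(cfg_text: str) -> str:
    """Return the version declared in setup.cfg-style text.

    Assumption: the version is a literal `version = X.Y.Z` entry in the
    [metadata] section, not an `attr:` or `file:` directive.
    """
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    return parser.get("metadata", "version").strip()


def check_release_version(cfg_text: str, expected: str) -> bool:
    """Return True if setup.cfg already declares the version being released."""
    return read_setup_cfg_version(cfg_text) == expected
```

For example, `check_release_version(open("setup.cfg").read(), "X.Y.Z")` could be run after the "Update setup.cfg" step, with X.Y.Z replaced by the value assigned to the `version` shell variable.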
