A web service for scanning media hosted by a Matrix media repository

Overview

Matrix Content Scanner

A web service for scanning media hosted by a Matrix media repository

Installation

TODO

Development

In a virtual environment with pip ≥ 21.1, run

pip install -e .[dev]

To run the unit tests, you can either use:

tox -e py

or

trial tests

To run the linters and mypy type checker, use ./scripts-dev/lint.sh.

Releasing

The exact steps for releasing will vary; but this is an approach taken by the Synapse developers (assuming a Unix-like shell):

  1. Set a shell variable to the version you are releasing (this just makes subsequent steps easier):

    version=X.Y.Z
  2. Update setup.cfg so that the version is correct.

  3. Stage the changed files and commit.

    git add -u
    git commit -m v$version -n
  4. Push your changes.

    git push
  5. When ready, create a signed tag for the release:

    git tag -s v$version

    Base the tag message on the changelog.

  6. Push the tag.

    git push origin tag v$version
  7. If applicable: Create a release, based on the tag you just pushed, on GitHub or GitLab.

  8. If applicable: Create a source distribution and upload it to PyPI:

    python -m build
    twine upload dist/matrix_content_scanner-$version*
You might also like...
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
A Smart, Automatic, Fast and Lightweight Web Scraper for Python

AutoScraper: A Smart, Automatic, Fast and Lightweight Web Scraper for Python This project is made for automatic web scraping to make scraping easy. It

Async Python 3.6+ web scraping micro-framework based on asyncio
Async Python 3.6+ web scraping micro-framework based on asyncio

Ruia 🕸️ Async Python 3.6+ web scraping micro-framework based on asyncio. ⚡ Write less, run faster. Overview Ruia is an async web scraping micro-frame

Web scraping library and command-line tool for text discovery and extraction (main content, metadata, comments)
Web scraping library and command-line tool for text discovery and extraction (main content, metadata, comments)

trafilatura: Web scraping tool for text discovery and retrieval Description Trafilatura is a Python package and command-line tool which seamlessly dow

Web Content Retrieval for Humans™

Lassie Lassie is a Python library for retrieving basic content from websites. Usage import lassie lassie.fetch('http://www.youtube.com/watch?v

🥫 The simple, fast, and modern web scraping library
🥫 The simple, fast, and modern web scraping library

About gazpacho is a simple, fast, and modern web scraping library. The library is stable, actively maintained, and installed with zero dependencies. I

Transistor, a Python web scraping framework for intelligent use cases.
Transistor, a Python web scraping framework for intelligent use cases.

Web data collection and storage for intelligent use cases. transistor About The web is full of data. Transistor is a web scraping framework for collec

Html Content / Article Extractor, web scrapping lib in Python

Python-Goose - Article Extractor Intro Goose was originally an article extractor written in Java that has most recently (Aug2011) been converted to a

A scalable frontier for web crawlers

Frontera Overview Frontera is a web crawling framework consisting of crawl frontier, and distribution/scaling primitives, allowing to build a large sc

Web crawling framework  based on asyncio.
Web crawling framework based on asyncio.

Web crawling framework for everyone. Written with asyncio, uvloop and aiohttp. Requirements Python3.5+ Installation pip install gain pip install uvloo

Owner
Brendan Abolivier
Backend software engineer @matrix-org, contributor on FOSS projects such as @cozy or RSF's Collateral Freedom.
Brendan Abolivier
A social networking service scraper in Python

snscrape snscrape is a scraper for social networking services (SNS). It scrapes things like user profiles, hashtags, or searches and returns the disco

null 2.4k Jan 1, 2023
A command-line program to download media, like and unlike posts, and more from creators on OnlyFans.

onlyfans-scraper A command-line program to download media, like and unlike posts, and more from creators on OnlyFans. Installation You can install thi

null 185 Jul 23, 2022
Scrape all the media from an OnlyFans account - Updated regularly

Scrape all the media from an OnlyFans account - Updated regularly

CRIMINAL 3.2k Dec 29, 2022
Audio media crawler for lbry.

Audio media crawler for lbry. Requirements Python 3.8 Poetry 1.1.7 Elasticsearch 7.14.0 Lbry-sdk 0.99.0 Development This project uses poetry as a depe

Hound.fm 4 Dec 3, 2022
Crawler job that scrapes comments from social media posts and saves them in a S3 bucket.

Toxicity comments crawler Crawler job that scrapes comments from social media posts and saves them in a S3 bucket. Twitter Tweets and replies are scra

Douglas Trajano 2 Jan 24, 2022
Web Scraping Framework

Grab Framework Documentation Installation $ pip install -U grab See details about installing Grab on different platforms here http://docs.grablib.

null 2.3k Jan 4, 2023
A Powerful Spider(Web Crawler) System in Python.

pyspider A Powerful Spider(Web Crawler) System in Python. Write script in Python Powerful WebUI with script editor, task monitor, project manager and

Roy Binux 15.7k Jan 4, 2023
Scrapy, a fast high-level web crawling & scraping framework for Python.

Scrapy Overview Scrapy is a fast high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pag

Scrapy project 45.5k Jan 7, 2023
:arrow_double_down: Dumb downloader that scrapes the web

You-Get NOTICE: Read this if you are looking for the conventional "Issues" tab. You-Get is a tiny command-line utility to download media contents (vid

Mort Yao 46.4k Jan 3, 2023
Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.

Pattern Pattern is a web mining module for Python. It has tools for: Data Mining: web services (Google, Twitter, Wikipedia), web crawler, HTML DOM par

Computational Linguistics Research Group 8.4k Jan 8, 2023