Your self hosted Youtube media server

Simon

Last update: Dec 31, 2022

Overview

The Tube Archivist
Your self hosted Youtube media server

Core functionality

Subscribe to your favourite Youtube channels
Download Videos using yt-dlp
Index and make videos searchable
Play videos
Keep track of viewed and unviewed videos

Screenshots

Home Page

All Channels

Single Channel

Video Page

Downloads Page

Problem Tube Archivist tries to solve

Once your Youtube video collection grows, it becomes hard to search and find a specific video. That's where Tube Archivist comes in: By indexing your video collection with metadata from Youtube, you can organize, search and enjoy your archived Youtube videos without hassle offline through a convenient web interface.

Installation

Take a look at the example docker-compose.yml file provided. Tube Archivist depends on three main components split up into separate docker containers:

Tube Archivist

The main Python application that displays and serves your video collection, built with Django.

Serves the interface on port 8000
Needs a mandatory volume for the video archive at /youtube
And another recommended volume to save the cache for thumbnails and artwork at /cache.
The environment variables ES_URL and REDIS_HOST are needed to tell Tube Archivist where Elasticsearch and Redis respectively are located.
The environment variables HOST_UID and HOST_GID allow Tube Archivist to chown the video files to the main host system user instead of the container user.
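
As an illustration of how the variables listed above fit together, here is a minimal Python sketch that reads and validates them. It is not Tube Archivist's own startup code: the `ContainerEnv` class and `load_env` function are hypothetical names, and treating HOST_UID and HOST_GID as optional is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContainerEnv:
    """Settings named in the list above (hypothetical container type)."""
    es_url: str
    redis_host: str
    host_uid: Optional[int]
    host_gid: Optional[int]


def load_env(environ=os.environ):
    """Read the variables and fail early if a required one is missing."""
    missing = [key for key in ("ES_URL", "REDIS_HOST") if not environ.get(key)]
    if missing:
        raise ValueError("missing required variables: " + ", ".join(missing))

    uid = environ.get("HOST_UID")
    gid = environ.get("HOST_GID")
    return ContainerEnv(
        es_url=environ["ES_URL"],
        redis_host=environ["REDIS_HOST"],
        # Assumption: ownership variables are optional and numeric when set.
        host_uid=int(uid) if uid else None,
        host_gid=int(gid) if gid else None,
    )
```

Passing a plain dictionary instead of `os.environ` makes the function easy to try out without touching the real environment.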

Elasticsearch

Stores video metadata and makes everything searchable. Also keeps track of the download queue.

Needs to be accessible over the default port 9200
Needs a volume at /usr/share/elasticsearch/data to store data

Follow the documentation for additional installation details.

Redis JSON

Functions as a cache and temporary link between the application and the filesystem. Used to store and display messages and configuration variables.

Needs to be accessible over the default port 6379
Takes an optional volume at /data to make your configuration changes permanent.

Getting Started

Go through the settings page and look at the available options. Particularly set Download Format to your desired video quality before downloading.
Subscribe to some of your favourite Youtube channels on the channels page.
On the downloads page, click on Rescan subscriptions to add videos from the subscribed channels to your Download queue or click on Add to download queue to manually add Video IDs, links, channels or playlists.
Click on Download queue and let Tube Archivist do its thing.
Enjoy your archived collection!

Potential pitfalls

Elasticsearch in Docker requires the host machine's kernel setting vm.max_map_count to be set to at least 262144.

To temporarily set the value, run:

sudo sysctl -w vm.max_map_count=262144

How to apply the change permanently depends on your host operating system:

For example, on Ubuntu Server add vm.max_map_count = 262144 to the file /etc/sysctl.conf.
On Arch based systems create a file /etc/sysctl.d/max_map_count.conf with the content vm.max_map_count = 262144.
On any other platform, look up in its documentation how to pass kernel parameters.
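
To verify the setting before starting the containers, a small check along these lines can help. This is a sketch, not part of Tube Archivist: the function names are made up for illustration, and it assumes a Linux host where the current value is exposed at /proc/sys/vm/max_map_count.

```python
from pathlib import Path

REQUIRED_MAX_MAP_COUNT = 262144  # minimum stated in the text above
PROC_FILE = "/proc/sys/vm/max_map_count"  # Linux procfs path for this setting


def meets_requirement(raw_value):
    """Return True if the raw file content is at least the required value."""
    return int(raw_value.strip()) >= REQUIRED_MAX_MAP_COUNT


def host_is_ready(proc_file=PROC_FILE):
    """Read the current kernel value from procfs and compare it."""
    return meets_requirement(Path(proc_file).read_text())
```

Calling `host_is_ready()` on the Docker host returns False when the value still needs to be raised with one of the methods above.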

Roadmap

This should be considered a minimum viable product; there is an extensive list of future functions and improvements planned:

Known limitations

Video files created by Tube Archivist need to be mp4 video files for best browser compatibility.
Every limitation of yt-dlp will also be present in Tube Archivist. If yt-dlp can't download or extract a video for any reason, Tube Archivist won't be able to either.
For now this is meant to be run in a trusted network environment.

Comments

Multi-arch images?

Thanks for working on this, it is a promising project!

I was trying to spin-up tubearchivist on a RPi 4 running 64-bit Raspbian OS and while the containers get built fine, I run into the standard_init_linux.go:228: exec user process caused: exec format error for the redis JSON and the tubearchivist containers (the Elasticsearch seems to install fine).

Is there a plan for building multi-arch images especially for arm64?

opened by abhilesh 27
update v0.2 support thread: check the release notes and the readme.

Hi,

What is your error? After the last upgrade the app won't launch

How to reproduce? Just update the app

[archivist-redis]: ok

[archivist-es]:

SOLUTION: chown 1000:0 /path/to/mount/point of elasticsearch

[tubearchivist]:

SOLUTION: add TA_HOST in the yml

TA_HOST=YOUR_IP or TA_HOST=YOUR_DOMAIN

Like this:

Have a nice day!
documentation

opened by zarevskaya 25
Add LDAP attribute mapping env variables.

When using a default Samba DC LDAP instance, uid isn't used to hold the username, so for this, and other LDAP implementations, it's necessary to be able to specify which LDAP attributes are actually used for first name, last name, username, email, etc.

This doesn't change the default behavior, as it uses the current hardcoded values as the default values instead of requiring admins to specify it if they are upgrading to a version containing this feature.

opened by BrianCArnold 20
Get Video Player Data Using New API

The videoPlayer() function now gets its data from the API rather than the HTML. It still pulls the video ID from the button. Three functions were also added: getVideoPlayerData(), getVideoData(), and apiRequest(). apiRequest() makes an API request when passed an endpoint (e.g. /api/video/VIDEO_ID/player/) and a method (either "GET" or "POST") and returns the results in JSON. getVideoPlayerData() returns video player data in JSON when it is given a video ID (it makes a call to apiRequest()). getVideoData() isn't used right now; it's just another example and returns video data in JSON when it is given a video ID (it also makes a call to apiRequest()).
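
The pull request itself is JavaScript; the same request pattern can be sketched in Python for illustration. Only the endpoint shape /api/video/VIDEO_ID/player/ and the GET/POST restriction come from the description above. The `base_url` parameter and the function names are assumptions, and authentication is left out entirely.

```python
import json
from urllib.request import Request, urlopen


def build_api_request(base_url, endpoint, method="GET"):
    """Build a request for an API endpoint, allowing only GET or POST."""
    if method not in ("GET", "POST"):
        raise ValueError("method must be GET or POST")
    url = base_url.rstrip("/") + endpoint
    return Request(url, method=method, headers={"Accept": "application/json"})


def api_request(base_url, endpoint, method="GET"):
    """Send the request and decode the JSON response body."""
    with urlopen(build_api_request(base_url, endpoint, method)) as response:
        return json.loads(response.read().decode("utf-8"))


def get_video_player_data(base_url, video_id):
    """Fetch player data for one video through the generic helper."""
    return api_request(base_url, f"/api/video/{video_id}/player/")
```

Keeping the request construction separate from the network call mirrors the split between apiRequest() and its callers, and lets the URL building be checked without a running server.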

opened by n8detar 20

[Bug]: Synology error seccomp unavailable when starting Elasticsearch

Latest and Greatest

[X] I'm running the latest version of Tube Archivist and have read the release notes.

Operating System

Synology

Your Bug Report

Describe the bug

Hello, I want to thank you for creating this project. I already solved the redis issue. Archivist-es is continuously restarting. I will provide a log from synology docker here in a second. I primarily use portainer to manage the containers but since elasticsearch keeps bootlooping, the container log page won't even load on portainer.

Steps To Reproduce

attempting to start the containers

Expected behavior

archivist-es should run and continuously stay active instead of restarting itself every few seconds and occupying memory usage. archivist-es.csv

Relevant log output

version: '3.3'

services:
  tubearchivist:
    container_name: tubearchivist
    restart: unless-stopped
    image: bbilly1/tubearchivist
    ports:
      - 8100:8000
    volumes:
      - /volume3/TA_Creators:/youtube
      - /volume3/docker/tubearchivist/cache:/cache
    environment:
      - ES_URL=http://10.10.0.215:9200
      - REDIS_HOST=archivist-redis
      - HOST_UID=1024
      - HOST_GID=100
      - TA_HOST=10.10.0.215
      - TA_PASSWORD=REDACTED
      - ELASTIC_PASSWORD=REDACTED
      - TZ=EST
    depends_on:
      - archivist-es
      - archivist-redis
  archivist-redis:
    image: redislabs/rejson
    container_name: archivist-redis
    restart: unless-stopped
    expose:
      - "6379"
    volumes:
      - /volume3/docker/tubearchivist/redis:/data
    depends_on:
      - archivist-es
  archivist-es:
    image: bbilly1/tubearchivist-es
    container_name: archivist-es
    restart: unless-stopped
    environment:
      - "xpack.security.enabled=true"
      - "discovery.type=single-node"
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
      - ELASTIC_PASSWORD=REDACTED
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - /volume3/docker/tubearchivist/es:/usr/share/elasticsearch/data
    expose:
      - "9200"

Anything else?

I attached my log output above since that's the only place where I could and because it only allowed me to export as a formatted csv file from docker. Apologies and hopefully the solution is easily apparent.

question

opened by N72826 17

[docker] initial superuser created every time container starts
So it appears that the superuser is created on container startup every time based on variables TA_USERNAME and TA_PASSWORD, even if:

the container is started for the non-first time

even if another superuser exists

even if the initial user is manually deleted using the admin interface

One would presume from the wording on README, i.e. Change the environment variables TA_USERNAME and TA_PASSWORD to create the initial credentials and the example compose file, i.e. your initial TA credentials that it's only created when the container is run for the first time and/or if no other superuser exists. I believe this should be more clear in the README, and/or make the app not create a superuser if it's not the first time being run
enhancement
opened by kzshantonu 17
Playing downloads on Safari

Hey there – I am finding that downloaded videos won't play in Safari browsers (Mac or iOS). This seems to be an issue with how the webserver provides 'range' data to clients. This is beyond my expertise to fix. Any thoughts?

Oh…works great on Chrome though :)
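
For context on the 'range' remark: Safari is generally strict about byte-range support for media playback, which fits the symptom, although the thread does not confirm the cause. A server that honours a Range header answers with status 206 and a Content-Range header. The sketch below shows the parsing step only; it handles a single range, uses made-up function names, and is not the project's web server code.

```python
def parse_byte_range(header, file_size):
    """Parse a single 'bytes=start-end' range into inclusive (start, end).

    Returns None when the header is missing, malformed, lists several
    ranges, or cannot be satisfied for the given file size.
    """
    if not header or not header.startswith("bytes=") or file_size <= 0:
        return None
    spec = header[len("bytes="):].strip()
    if "," in spec:  # multiple ranges are out of scope for this sketch
        return None
    start_text, separator, end_text = spec.partition("-")
    if not separator:
        return None
    try:
        if start_text == "":
            # Suffix form "bytes=-N": the last N bytes of the file.
            length = int(end_text)
            if length <= 0:
                return None
            start = max(file_size - length, 0)
            end = file_size - 1
        else:
            start = int(start_text)
            end = int(end_text) if end_text else file_size - 1
    except ValueError:
        return None
    end = min(end, file_size - 1)
    if start > end:
        return None
    return start, end


def content_range(start, end, file_size):
    """Format the Content-Range header value for a 206 response."""
    return f"bytes {start}-{end}/{file_size}"
```

A handler would return bytes `start` through `end` of the file together with the `content_range(...)` value, and fall back to a full 200 response when `parse_byte_range` returns None for a missing header.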

opened by deanpribetic 17

[Bug] Autodelete unreliable in v1.1.3

I run TubeArchivist in Docker on a Synology NAS (DS918+, DSM 7.0.1-42218 Update 3).

This is the Compose file I use:

services:
  tubearchivist_julian:
    image: bbilly1/tubearchivist:latest
    container_name: tubearchivist_julian
    volumes:
      - ./app:/cache
      - /volume1/media/youtube/julian:/youtube
    environment:
      TZ: Europe/Berlin
      ES_URL: http://tubearchivist_julian_es:9200
      REDIS_HOST: tubearchivist_julian_redis
      HOST_UID: 1026
      HOST_GID: 101
      TA_USERNAME: tube_js
      TA_PASSWORD: ${TA_PASS_JULIAN}
      ELASTIC_PASSWORD: ${TA_ELASTIC_PASS}
    ports:
      - 18000:8000
    depends_on:
      - tubearchivist_julian_es
      - tubearchivist_julian_redis
    restart: unless-stopped
  tubearchivist_julian_redis:
    image: redislabs/rejson:latest
    container_name: tubearchivist_julian_redis
    volumes:
      - ./redis:/data
    ports:
      - 6379:6379
    depends_on:
      - tubearchivist_julian_es
    restart: unless-stopped
  tubearchivist_julian_es:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.17.1
    container_name: tubearchivist_julian_es
    volumes:
      - ./es:/usr/share/elasticsearch/data
    environment:
      - "xpack.security.enabled=true"
      - "ELASTIC_PASSWORD=${TA_ELASTIC_PASS}"
      - "discovery.type=single-node"
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    ports:
      - 9200:9200
    restart: unless-stopped

My relevant settings in TubeArchivist are:

Subscriptions - Page Size - 5
Downloads - Auto Delete - True
Scheduler - Rescan - 0 */4,11-20 *
Scheduler - Start Download - 5 */4,11-20 *

Since I updated to TA 1.3.0 last weekend some of my videos I mark as watched and that get deleted will get redownloaded again the next day. When asked on Discord I could not find a redownloaded video in the ignore list. This was also reported to be a bug on Unraid on Discord by somethingsuper.

bug question

opened by cpt-kuesel 16

Some more url formats would be nice
Channel links with the Channel name don't work:

tubearchivist | {'csrfmiddlewaretoken': ['AY0tO0Dav2zXgzjttPYb4lEsQ5qgYn5NE4Fzk66r983FaufAOvSgNKSTb3mBAUFQ'], 'subscribe': ['https://www.youtube.com/c/veritasium']}
tubearchivist | parsing subscribe ids failed!
tubearchivist | ['https://www.youtube.com/c/veritasium']

As a workaround I currently copy the link from the channel name when watching a video. This link has the needed channel ID.

Playlist links in this format: https://www.youtube.com/watch?v=aFPJf-wKTd0&list=UUHnyfMqiRRG1u-2MsSQLbXA&index=2 are parsed like one video but not as list.
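
A link in this format carries both IDs in its query string, so they can be separated with the standard library. The following is a minimal sketch with a hypothetical function name, not the parser Tube Archivist uses.

```python
from urllib.parse import parse_qs, urlparse


def extract_ids(url):
    """Return the video and playlist IDs found in a watch URL's query string."""
    query = parse_qs(urlparse(url).query)
    return {
        "video": query.get("v", [None])[0],
        "playlist": query.get("list", [None])[0],
    }
```

With both values available, a caller can decide whether such a link should queue the single video or the whole list. Channel links of the /c/NAME form are a different case: the name has to be resolved to a channel ID, which this sketch does not attempt.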
opened by MSDev201 16
Feature Request: Allow use of cookies.txt file to pass to YT-DLP

Just in case YT wants to be picky about age restricted, etc, files - be nice to be able to point to a cookies.txt file in the tubearchivist appdata folder that it can pass to YT-DLP
enhancement

opened by Marthisdil 15
Videos wont play

I've left all settings at default. A couple of things are happening:

It seems downloads are stalling and I have to hit download queue multiple times to get it going again. Nothing helpful in the logs.

When playing a video in Firefox (Linux) I get "No video with supported format and MIME type found". On Chrome I get no controls or anything, just a screen with what I assume is the beginning video thumbnail.

opened by Code-Slave 14

[Bug]: Change in SponsorBlock API breaks integration

I've read the documentation

[X] I'm running the latest version of Tube Archivist and have read the release notes.
[X] I have read through the wiki and the readme, particularly the common errors section.

Operating System

linux

Your Bug Report

Describe the bug

Sponsorblock changed their API, breaking our integration.

Steps To Reproduce

Activate integration on settings page and download any video that has segments registered in sponsorblock.

Expected behavior

Clean output and store relevant fields.

Relevant log output

tubearchivist  | [2022-12-31 08:06:59,973: WARNING/ForkPoolWorker-4] xxxxxxxxxxx: get sponsorblock timestamps
tubearchivist  | [2022-12-31 08:07:00,453: ERROR/ForkPoolWorker-4] Task download_pending[925b2a8e-9cd6-493a-bdcf-e7f0aaf20a4f] raised unexpected: KeyError('userID')
tubearchivist  | Traceback (most recent call last):
tubearchivist  |   File "/root/.local/lib/python3.10/site-packages/celery/app/trace.py", line 451, in trace_task
tubearchivist  |     R = retval = fun(*args, **kwargs)
tubearchivist  |   File "/root/.local/lib/python3.10/site-packages/celery/app/trace.py", line 734, in __protected_call__
tubearchivist  |     return self.run(*args, **kwargs)
tubearchivist  |   File "/app/home/tasks.py", line 90, in download_pending
tubearchivist  |     downloader.run_queue()
tubearchivist  |   File "/app/home/src/download/yt_dlp_handler.py", line 210, in run_queue
tubearchivist  |     vid_dict = index_new_video(
tubearchivist  |   File "/app/home/src/index/video.py", line 405, in index_new_video
tubearchivist  |     video.build_json()
tubearchivist  |   File "/app/home/src/index/video.py", line 158, in build_json
tubearchivist  |     self._get_sponsorblock()
tubearchivist  |   File "/app/home/src/index/video.py", line 359, in _get_sponsorblock
tubearchivist  |     sponsorblock = SponsorBlock().get_timestamps(self.youtube_id)
tubearchivist  |   File "/app/home/src/index/video.py", line 70, in get_timestamps
tubearchivist  |     sponsor_dict = self._get_sponsor_dict(all_segments)
tubearchivist  |   File "/app/home/src/index/video.py", line 81, in _get_sponsor_dict
tubearchivist  |     del segment["userID"]
tubearchivist  | KeyError: 'userID'
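
The traceback ends at `del segment["userID"]`, which raises as soon as the key is absent from a response. One defensive pattern, shown here as a sketch and not necessarily the fix that was applied, is to remove the key only if it exists. The function name and the sample fields in the usage below are illustrative.

```python
def strip_user_ids(segments):
    """Return copies of the segments without the 'userID' field, if present."""
    cleaned = []
    for segment in segments:
        item = dict(segment)  # copy so the caller's data is left untouched
        item.pop("userID", None)  # no KeyError when the key is missing
        cleaned.append(item)
    return cleaned
```

This keeps working whether or not the upstream API includes the field.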

Anything else?

No response

bug

opened by bbilly1 1

[Bug]: Thumbnail downloading error blocks video downloading

I've read the documentation

[X] I'm running the latest version of Tube Archivist and have read the release notes.
[X] I have read through the wiki and the readme, particularly the common errors section.

Operating System

Docker in Ubuntu 20, kernel 5.4.0-125-generic

Your Bug Report

Describe the bug

Unable to download single video. No messages/errors in UI.

Steps To Reproduce

Add video with alias BFOSuMc3hDc
Press "Download now"

Expected behavior

Video downloading succeeded

Relevant log output

[2022-12-30 23:39:39,829: INFO/MainProcess] Task home.tasks.download_single[5a4e8959-d092-4ccd-88f8-79d1ff21089c] received
[2022-12-30 23:39:39,833: WARNING/ForkPoolWorker-4] Added to queue with priority: BFOSuMc3hDc
[2022-12-30 23:39:46,558: WARNING/ForkPoolWorker-4] BFOSuMc3hDc: get metadata from youtube
[2022-12-30 23:39:47,511: WARNING/ForkPoolWorker-4] UCd_sTwKqVrweTt4oAKY5y4w: get metadata from es
[2022-12-30 23:39:47,521: WARNING/ForkPoolWorker-4] {"_index":"ta_channel","_id":"UCd_sTwKqVrweTt4oAKY5y4w","found":false}
[2022-12-30 23:39:47,522: WARNING/ForkPoolWorker-4] UCd_sTwKqVrweTt4oAKY5y4w: scrape channel data from youtube
[2022-12-30 23:39:47,708: WARNING/ForkPoolWorker-4] UCd_sTwKqVrweTt4oAKY5y4w: download channel thumbnail
[2022-12-30 23:39:52,755: WARNING/ForkPoolWorker-4] UCd_sTwKqVrweTt4oAKY5y4w: retry thumbnail download https://yt3.ggpht.com/3ZWqY8gwuMEl6e2oV6WPMmJyPCAG3i_lL4malTRk8xUWtwGU54wLLJT4H6QdP8bB13ybkAuRVbM=s900-c-k-c0x00ffffff-no-rj
[2022-12-30 23:39:58,804: WARNING/ForkPoolWorker-4] UCd_sTwKqVrweTt4oAKY5y4w: retry thumbnail download https://yt3.ggpht.com/3ZWqY8gwuMEl6e2oV6WPMmJyPCAG3i_lL4malTRk8xUWtwGU54wLLJT4H6QdP8bB13ybkAuRVbM=s900-c-k-c0x00ffffff-no-rj
[2022-12-30 23:40:05,843: WARNING/ForkPoolWorker-4] UCd_sTwKqVrweTt4oAKY5y4w: retry thumbnail download https://yt3.ggpht.com/3ZWqY8gwuMEl6e2oV6WPMmJyPCAG3i_lL4malTRk8xUWtwGU54wLLJT4H6QdP8bB13ybkAuRVbM=s900-c-k-c0x00ffffff-no-rj
[2022-12-30 23:40:14,854: ERROR/ForkPoolWorker-4] Task home.tasks.download_single[5a4e8959-d092-4ccd-88f8-79d1ff21089c] raised unexpected: AttributeError("'bool' object has no attribute 'convert'")
Traceback (most recent call last):
  File "/root/.local/lib/python3.10/site-packages/celery/app/trace.py", line 451, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/root/.local/lib/python3.10/site-packages/celery/app/trace.py", line 734, in __protected_call__
    return self.run(*args, **kwargs)
  File "/app/home/tasks.py", line 120, in download_single
    VideoDownloader().run_queue()
  File "/app/home/src/download/yt_dlp_handler.py", line 210, in run_queue
    vid_dict = index_new_video(
  File "/app/home/src/index/video.py", line 405, in index_new_video
    video.build_json()
  File "/app/home/src/index/video.py", line 150, in build_json
    self._add_channel()
  File "/app/home/src/index/video.py", line 204, in _add_channel
    channel.build_json(upload=True, fallback=self.youtube_meta)
  File "/app/home/src/index/channel.py", line 183, in build_json
    self.get_from_youtube(fallback)
  File "/app/home/src/index/channel.py", line 199, in get_from_youtube
    self.get_channel_art()
  File "/app/home/src/index/channel.py", line 245, in get_channel_art
    ThumbManager(self.youtube_id, item_type="channel").download(urls)
  File "/app/home/src/download/thumbnails.py", line 99, in download
    self.download_channel_art(url)
  File "/app/home/src/download/thumbnails.py", line 149, in download_channel_art
    self._download_channel_thumb(channel_thumb, skip_existing)
  File "/app/home/src/download/thumbnails.py", line 164, in _download_channel_thumb
    img_raw.convert("RGB").save(thumb_path)
AttributeError: 'bool' object has no attribute 'convert'

Anything else?

No response

question

opened by AlekseyLobanov 5

[Feature Request]: Refactor Docker image to use s6-overlay
Already implemented?

[X] I have read through the wiki.

[X] I understand the scope of this project and am aware of the known limitations and my idea is not already on the roadmap.

Your Feature Request

Is your feature request related to a problem? Please describe.

The docker image runs as root

Describe the solution you'd like

Since there are multiple processes in your image, it would be nice to refactor the current docker image to use s6-overlay. A good example that somewhat resembles this project would be Funkwhale's AIO docker image. It's also a Python application and uses nginx and celery.

Additional context

Your help is needed!

[ ] Yes I can help with this feature request!

enhancement
opened by onedr0p 1
[Feature Request]: custom metadata - turn off syncing metadata
Already implemented?

[X] I have read through the wiki.

[X] I understand the scope of this project and am aware of the known limitations and my idea is not already on the roadmap.

Your Feature Request

Short Description (Metadata Sync)

Stop syncing metadata on a per-video basis. This means I want to exclude videos by my own decision. It's already implemented for deactivated and outdated videos.

Please also give users a simple 'true/false' choice.

Short Description (Metadata Edit)

Choose the source of metadata, (yt | custom/local )

Define read only fields respective fields to sync

TA WebUI CRUD only for metadata. (without Create)

modern Inline editing, yeah nice.

Additional context

Refreshing metadata is a good starting point to get a synced version of your favorites. On the other hand, as you may have experienced yourself, information sometimes doesn't stay available for long. And, imho, an archive should prevent this situation. Is there any way to keep some kind of history or changelog? What happens with the json.info after creating and indexing all the stuff? Could it be placed somewhere, as binary in some db .. ?

Thanks for your time! Sorry if i went into too much detail. my real life job rubs off sometimes 👍

Your help is needed!

[X] Yes I can help with this feature request!

enhancement help wanted
opened by cmuc24 2
[Feature Request]: RSS feed downloader
Already implemented?

[X] I have read through the wiki.

[X] I understand the scope of this project and am aware of the known limitations and my idea is not already on the roadmap.

Your Feature Request

Is your feature request related to a problem? Please describe.

Sometimes channels delete videos shortly after uploading and scheduling a refresh of channels every hour isn’t great from a rate limit standpoint.

Describe the solution you'd like

Using an RSS feed to get notified of new uploads could trigger TA to search for only channels that have new videos. I found this article talking about how to get an RSS feed from YouTube: https://danielmiessler.com/blog/rss-feed-youtube-channel/

Additional context

I’m assuming this would require significant restructuring of the scheduler/download function. However, with this method it should “speed” up TA's downloading and refreshing because it would only look at new videos on channels, without having to go through everything.
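
To make the idea concrete, here is a rough sketch of reading video IDs from a channel feed. It assumes the feed is an Atom document at the `feeds/videos.xml?channel_id=...` address and that each entry carries a `yt:videoId` element in the namespace shown; both should be checked against a real feed. The function names are hypothetical and nothing here exists in Tube Archivist.

```python
import xml.etree.ElementTree as ET

# Assumed namespaces of the YouTube channel feed (verify against a live feed).
NAMESPACES = {
    "atom": "http://www.w3.org/2005/Atom",
    "yt": "http://www.youtube.com/xml/schemas/2015",
}


def feed_url(channel_id):
    """Build the assumed feed address for a channel ID."""
    return f"https://www.youtube.com/feeds/videos.xml?channel_id={channel_id}"


def video_ids_from_feed(xml_text):
    """Return the video IDs of all entries in the feed, in document order."""
    root = ET.fromstring(xml_text)
    return [
        entry.findtext("yt:videoId", namespaces=NAMESPACES)
        for entry in root.findall("atom:entry", NAMESPACES)
    ]


def new_video_ids(xml_text, known_ids):
    """Return only the IDs that are not already known to the archive."""
    return [vid for vid in video_ids_from_feed(xml_text) if vid not in known_ids]
```

A scheduler could fetch the feed for each subscribed channel and trigger a full channel scan only when `new_video_ids` returns something.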

Your help is needed!

[ ] Yes I can help with this feature request!

enhancement question
opened by pairofcrocs 1
[Bug]: "Add to download queue" defaults to adding an entire channel if a short video URL is added. Downloading an entire channel does not download shorts.
I've read the documentation

[X] I'm running the latest version of Tube Archivist and have read the release notes.

[X] I have read through the wiki and the readme, particularly the common errors section.

Operating System

Fedora 36, Docker, Latest image tag.

Your Bug Report

Describe the bug

The "Add to download queue" function in the Downloads page performs a few checks to determine if the URL provided is a video, playlist, or channel. If the URL does not match any of these checks, the URL is defaulted to a channel. As a result, if a shorts URL is used (example https://www.youtube.com/shorts/6QImkSXqwao), the detect_from_url function in helper.py will not correctly set the URL as a video, because there is no if statement to handle a URL matching "/shorts/".

Since there is no check for shorts in the URL, the default option on line 195 will return download type as a channel. When the entire channel is processed, all regular videos are added to the queue, but the shorts are not added since there is no processing for shorts in get_last_youtube_videos.

~~Currently the only way to download a short is to convert the URL to a standard video by replacing the /shorts/ part of the URL with /watch?v=~~ edit: shorts can be downloaded by video ID alone per the wiki.

When adding an entire channel to the download queue, short videos are not included because the get_last_youtube_videos function in subscriptions.py is only checking for videos in https://www.youtube.com/channel/{channel_id}/videos.

Adding a shorts channel page, example https://www.youtube.com/@gensho_yasuda/shorts only loads normal videos.

As a result, it is not possible to download a short using the standard shorts URL, add an entire channel to download normal videos and short videos together, or subscribe to a channel to download new shorts. This may apply to streams as well though I did not test this.

Steps To Reproduce

Click on "Add to download queue"

Add the following short video path

https://www.youtube.com/shorts/kHQCJDo_RzI

Click the "Add to download queue" button

The short will fail to be added to the queue and all videos in the /videos/ section of the channel will be added to the queue.

Expected behavior

Two parts to this.

When adding a short URL, that specific video should be downloaded. Adding the following if statement to line 194 of helper.py resolves this particular issue and treats URLs with /shorts/ as videos. I don't really know any Python, so this may be a subpar fix.

if parsed.path.startswith("/shorts/"):
    youtube_id = parsed.path.split("/")[2]
    _ = self.find_valid_id(youtube_id)
    return youtube_id, "video"

I believe that it might be better to not default to channel and instead throw an error if the URL does not match channel, video, playlist, shorts, streams, etc.
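
A standalone sketch of that suggestion follows: explicit checks per URL type and an error for anything unrecognized, in place of a silent default to channel. The `classify_url` name and the exact set of rules are illustrative and do not reproduce the logic in helper.py.

```python
from urllib.parse import parse_qs, urlparse


def classify_url(url):
    """Return (id, type) for a recognized URL or raise ValueError."""
    parsed = urlparse(url)
    query = parse_qs(parsed.query)
    parts = [part for part in parsed.path.split("/") if part]

    if parts[:1] == ["shorts"] and len(parts) >= 2:
        return parts[1], "video"  # shorts are handled as regular videos
    if parsed.path == "/watch" and "v" in query:
        return query["v"][0], "video"
    if parsed.path == "/playlist" and "list" in query:
        return query["list"][0], "playlist"
    if parts[:1] == ["channel"] and len(parts) >= 2:
        return parts[1], "channel"

    # No silent fallback: unknown formats are reported to the caller.
    raise ValueError(f"unrecognized URL format: {url}")
```

Raising on unknown input means a shorts or handle URL that is not yet supported produces a visible error and does not queue a whole channel by accident.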

There does not appear to be any provision to automatically download shorts and streams when subscribing to a channel or adding a channel to the download queue. This would be a nice feature to have, though I suspect it might not be a good default feature and should probably be configurable. I think something similar to this request is outlined in https://github.com/tubearchivist/tubearchivist/issues/368.

I think the only place that would need to be updated to add short and stream support at a really simple level is the get_last_youtube_videos function in subscriptions.py. It looks like there is only a check for the URL ending in /videos when looking up a channel ID. I think this would need to also perform a check to see if any shorts or streams exist and if so combine those together to add to the download queue.

I made an attempt that kind of worked, though I don't know python. I'm sure this is shitty code and has some unexpected problems 😭.

def get_last_youtube_videos(self, channel_id, limit=True):
    """get a list of last videos from channel"""
    obs = {
        "skip_download": True,
        "extract_flat": True,
    }
    if limit:
        obs["playlistend"] = self.config["subscriptions"]["channel_size"]

    url_videos = f"https://www.youtube.com/channel/{channel_id}/videos"
    channel_videos = YtWrap(obs, self.config).extract(url_videos)
    if not channel_videos:
        channel_videos = {}

    url_shorts = f"https://www.youtube.com/channel/{channel_id}/shorts"
    channel_shorts = YtWrap(obs, self.config).extract(url_shorts)
    if not channel_shorts:
        channel_shorts = {}

    url_live = f"https://www.youtube.com/channel/{channel_id}/streams"
    channel_streams = YtWrap(obs, self.config).extract(url_live)
    if not channel_streams:
        channel_streams = {}

    channel = channel_videos | channel_shorts | channel_streams
    if not channel:
        return False

    last_videos = [(i["id"], i["title"]) for i in channel["entries"]]
    return last_videos
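
One reason the attempt only "kind of" works: the `|` operator merges dictionaries key by key, so when more than one result has an "entries" key, only the last one's list survives. A sketch of combining the lists themselves is shown below. `merge_entries` is a hypothetical helper, and the result shape (a dict with an "entries" list of items having "id" and "title") is taken from the code above.

```python
def merge_entries(*results):
    """Combine the 'entries' of several extraction results into one list.

    Empty or missing results are skipped and duplicate IDs are dropped,
    keeping the first occurrence.
    """
    merged = []
    seen = set()
    for result in results:
        if not result:
            continue
        for entry in result.get("entries", []):
            if entry["id"] in seen:
                continue
            seen.add(entry["id"])
            merged.append((entry["id"], entry["title"]))
    return merged
```

In the function above, this would replace the `|` merge and the final list comprehension, with the three extraction results passed in as arguments.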

Relevant log output

Log after following steps to reproduce.

processing: https://www.youtube.com/shorts/DFd9_sDBMv0
ParseResult(scheme='https', netloc='www.youtube.com', path='/shorts/DFd9_sDBMv0', params='', query='', fragment='')
[{'url': 'UCpd0n9H-HdK4GplD5RGvIDg', 'type': 'channel'}]
[2022-12-10 15:04:29,492: INFO/MainProcess] Task home.tasks.extrac_dl[b3ef705b-2c4e-411e-9715-5175adb9f4cd] received
[2022-12-10 15:04:29,870: WARNING/ForkPoolWorker-16] xCXlzhS5Sz8: add to download queue
[2022-12-10 15:04:30,925: WARNING/ForkPoolWorker-16] wxpVWxkSS8g: add to download queue
[2022-12-10 15:04:31,853: WARNING/ForkPoolWorker-16] lhYCpd1k3P8: add to download queue
[2022-12-10 15:04:32,865: WARNING/ForkPoolWorker-16] jw5NWWyiW70: add to download queue

Anything else?

Apologies if this is a known issue. I tried to search for anything about shorts and not much came up. Thanks for all your work on this project. It really is useful and much appreciated.
bug
opened by maltbeverage 5

Releases(v0.3.0)

v0.3.0(Nov 30, 2022)
Project updates

The browser extension Tube Archivist Companion got a major update, basically a rewrite to v0.1.0, now injecting buttons directly into the YouTube page, making this much more user friendly.

Tube Archivist now takes system snapshots instead of json file backups for mapping changes. Make sure to activate snapshot first before updating, particularly for large indexes, to avoid a lengthy delay at first start after the update, wiki.

Your video and channel indexes will automatically get updated at startup to take the new mapping changes, and a new index to hold the comments will be created.

If you are setting your version for Elasticsearch manually, this would be a good time to update to 8.5.1. If you are using archivist-es, you'll get the update automatically to take advantage of some improvements there.

Added

Added comments archiving, wiki

API: Added endpoints for comments management, docs

Added tag cloud to video page, wiki

Added similar videos on video page, wiki

API: Added endpoint for similar videos, docs

Added startup cleanup function deleting leftover partial video files from cache/downloads

Added podman installation instructions, by @redxtech, wiki

Changed

Changed mapping or settings update in index now triggers a snapshot instead of a json file backup, if enabled

Changed video page template to better integrate the inline player for similar videos

Changed inconsistent thumbnail path building from API template

Changed wording for scheduler frequency, #358

Changed potential pitfalls section to common errors, readme

Fixed

Fixed channel deactivation error

Fixed channel reindex not triggering due to a mapping error in channel_last_refresh

Fixed playlist deactivation error

Fixed error deactivating a not set configuration, #362

Hotfix

Image pushed again with a fix for #372.

v0.2.4(Nov 5, 2022)
Project updates

This release introduces deduplicated snapshots for the Elasticsearch index. Before activating snapshots on the settings page, you’ll have to add an additional environment variable to the archivist-es container: path.repo=/usr/share/elasticsearch/data/snapshot, also see the updated docker-compose.yml file for reference.

The plan is to replace the current json file backup solution with snapshots, as this is a much faster and more resource- and storage-efficient solution. But as the snapshot files are not human readable, the current json backup solution will stay as a manual backup solution.

It’s particularly recommended to activate snapshots for large indexes, as upcoming changes in the index will otherwise trigger a slow json backup task.

Added

Added system snapshot for your metadata index, wiki

API: Added endpoints for snapshot management

Added configuration for fuzziness in searching, wiki

Added more detailed installation instructions, by @bakkot, Readme

Added more detailed contribution steps to set up your dev environment, by @bakkot, link

Added keyboard shortcuts for player, by @bakkot, wiki

Added LDAP attribute mapping, by @BrianCArnold, Readme

Changed

Changed arm64 build to use ffmpeg build from yt-dlp for better compatibility

Fixed

Fixed issue with yt-dlp api change not getting playlists videos anymore

Fixed issue where playlist was missing channel metadata

Fixed mobile layout overflow on downloads page

Fixed form validation to not allow channel page size of 0, #334

v0.2.3(Oct 23, 2022)
Added

Added channel query filter for downloads page, wiki

Added downloads link for channel pages, wiki

Added documentation for minimal system requirements, Readme

Changed

Changed sponsorblock API requests, better timeout and rate limiting handling

Changed internals how URL queries get parsed for future improvements

Fixed

Fixed is_live key error after yt-dlp API change, #336

Fixed thumbnail parsing error for malformed images, #325

Fixed missing watched_date bulk update for channels and playlists, #309

Fixed Chrome compatibility issue with description text reveal, #327

Fixed error handling in playlist thumbnail extraction

v0.2.2(Sep 19, 2022)
Added

Added LDAP disable cert check, by @DanielBatteryStapler

Added configurable grid size for downloads page

Added additional docker logs for manual import and add to queue

Changed

Changed downloads video UI to integrate in regular video list and grid classes

Fixed

Fixed manual import cleanup metadata, #311

Fixed manual import file extension split error, #311

Fixed channel creation from video metadata to catch all generic errors, #312

v0.2.1(Aug 20, 2022)
Project updates

You can now sponsor this project here directly on GitHub. Show your support by donating to this project!

Added

Offline media import from info.json file, as described in the wiki.

LDAP authentication support, by @DanielBatteryStapler, readme.

Changed

Changed and refactored thumbnail downloader, better performance for large indexes, better integration into existing classes.

Changed download form placeholder wording to better represent what you can enter there, #300.

API: Changed search form to dedicated API endpoint, by @PrivateGER.

Fixed

Fixed CSRF error for SSL reverse proxies, by @birdwing.

Fixed channel url redirect for old channel names, #276.

Fixed backup lock to prevent multiple tasks from running, #278.

Fixed error handling for thumbnail downloader, #228.

Fixed error handling on RYD api downtime, #283.

Fixed subtitle parser for empty subtitles, #288.

Fixed vertical positioning of thumbnail for full text search.

Thank you to everybody contributing to this project!
v0.2.0(Jul 23, 2022)
Breaking Changes

To validate from where the interface can be accessed, a new environment variable is required. Set TA_HOST to your hostname or IP, Link.

Tube Archivist now depends on Elasticsearch version 8, Link.

For peace of mind, make a manual backup before starting the update.

If you are using bbilly1/tubearchivist-es, you will automatically get the recommended and tested version. Otherwise, set your tag to 8.3.2 when using official Elasticsearch.

If you have been using the recommended version 7.17 before, Elasticsearch will take care of the internal upgrade automatically.

This will break backwards compatibility: you won’t be able to downgrade to ES7, at least not easily.

Be patient, the migration can take a few minutes.

Added

Added validation from where this application can be served with the TA_HOST environment variable, improving security.

Added authentication for all static user generated files such as thumbnails and media files:

This might have a hopefully negligible performance impact when using a big archive page size.

But these improved validations give the confidence to remove the security notification from the known limitations.

Added keyword based search and filter for all your queries.

Added full text search over all your indexed subtitles.

As documented in the wiki.

Changed

Changed the channel detail page into multiple subpages, based on the mockup by @pairofcrocs

Video page: Showing all videos from this channel

Playlist page: Showing all playlists from this channel

About page: Additional metadata and channel configuration form

As documented in the wiki.

Changed internal media file move functions to use shutil.move for better compatibility on some platforms, #268 by @p0358

Changed descriptions to show a few lines as preview, #269 by @p0358

Changed backup task to write page by page for better performance for big indexes but slightly slower for small indexes.

Changed backup zip file to only include relevant json files to restore for better performance for big indexes.

Changed video download cache naming structure, fixing filename sanitization issue.

Changed download progress message to use full video title instead of filename, #271

Fixed

Fixed reindex task, deactivating non existing channels

Fixed webkit fullscreen scaling issue, #264 by @samdoshi

Fixed cookie import validator from browser extension, #266

Fixed nginx user permission error for some platforms, #268 by @p0358

v0.1.7(Jul 3, 2022)
Project update

Tube Archivist Companion browser extension v0.0.3 now supports cookie sync

Added

Added periodic cookie validation

API: Added task GET view to return running tasks, by @lamusmaser

API: Added route to store cookie with POST request for browser extension

Changed

Changed skip superuser creation with lockfile at startup, by @dshoreman

Changed cookie import to not load the cookie if validation fails

Changed Redis connections to auto expire

Changed Redis message handling, refactored to not auto expire

Fixed

Fixed various CSS scaling issues for tablet and mobile

v0.1.6(Jun 4, 2022)
Project updates

First release built on the new build server, thanks to everybody contributing financially

Tube Archivist Companion Browser Extension update v0.0.2

Added

User configurable grid row size for videos

Added TrueNAS Scale instructions, wiki, shout out to @heavybullets8

Changed

Changed cookie file handling directly from redis instead of file, yt-dlp 2022.05.18

Changed use embedded metadata for videos with content id, #241, shout out to @anonamouslyginger

Changed subtitle naming convention to .lang.vtt for new downloads, #195

Changed search as you type delay for better performance

Changed delete download queue buttons, moved to settings page under Actions header

Refactored yt-dlp integration into reusable base class

General code cleanup of unused methods

Fixed

Fixed deleted video lingering in playlist metadata

Fixed subtitle parsing error without segments, #249

Fixed process truncated thumbnail images, #256

v0.1.5(May 8, 2022)
Project updates

Tube Archivist Companion, the Browser extension, is now published for Firefox and Chrome

There is a first release for Tube Archivist Metrics for Prometheus, shout out to @ainsey11

Added

Added cookie import, wiki

API: added pagination for list views

API: added sort and query filter in download view

API: added run task view

Changed

API: handle 404 in list views

Fixed

Fixed arm64 build error, #234 #240

Fixed holding on to previous Sponsorblock timestamps, shout out to @n8detar

Fixed channel validation error when subscribing to playlist, #223

Fixed error for thumbnail re-embedding task, #231

Fixed autodelete error creating malformed requests to ES, #217

Fixed reindex error when channel name has changed on YT, #211

Fixed premium trailer videos video id mismatch, #237

Fixed timeout issue with yt-dlp check-format interrupting the UI

v0.1.4(Apr 16, 2022)
Project updates

Tube Archivist has a new home: https://github.com/tubearchivist/tubearchivist

There is a minimal Browser Extension, Firefox is approved, Chrome is still pending, see installation instructions for manual install.

There is now bbilly1/tubearchivist-es, a set and forget Elasticsearch docker image that automatically updates with Tube Archivist to the recommended version, Readme. It is recommended for Unraid due to a limitation of how the version numbers are parsed, and optional for everybody else.

There is a WIP Tube Archivist Metrics container to provide Tube Archivist metrics in Prometheus/OpenMetrics format, shout-out to @ainsey11 for working on that

While developing the API, we are rewriting the frontend in NextJS/React, join us on Discord if you want to help. Shout-out to @insuusvenerati for taking the initiative.

There is an unfortunate unfixed bug in the periodic refresh task #211, requiring you to manually rename the channel folder if the name on YouTube has changed since. Check your logs for error messages.

Added

Added SponsorBlock integration support, wiki and wiki, shout-out to @n8detar for implementing the skipping in the player

Added API endpoint for login

Added API endpoint for video lists

Added API endpoint to test connectivity

Added detailed Installation instructions to wiki, shout-out to @pairofcrocs

Changed

Changed Dockerfile structure, reduced image size, faster build, better caching, shout-out to @Lickitysplitted for the help

Fixed

Fixed nginx default conf location conflict, shout-out to @Lickitysplitted

Fixed timing issue with download progress message, #210 shout-out to @ainsey11

Fixed schedule input validator, #209

Fixed subtitle parsing error, resulting in failed download, #196

Fixed startup error message for unsupported ES version, #197

Fixed pagination link building error, #221

Final notes

Thank you for every contribution, reach out if you want to get involved too!
v0.1.3(Mar 26, 2022)
Notes

This release will automatically rebuild your video and channel indexes

This release will validate a minimum Elasticsearch version of 7.17. This will be required for an upgrade to Elasticsearch 8 in a future release, as 7.17 allows for a smooth and automatic upgrade between the major releases.
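
A version check like the one described above could be sketched as follows. This is only an illustration, not the code Tube Archivist ships: the names MIN_VERSION, parse_version and is_supported are made up here, and the sketch assumes a plain numeric version string such as the one Elasticsearch reports under version.number on its root endpoint.

```python
MIN_VERSION = (7, 17)  # minimum version stated in this release note


def parse_version(version):
    """Turn a version string such as '7.17.1' into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))


def is_supported(version, minimum=MIN_VERSION):
    """Return True if the major.minor part of version is at least the minimum."""
    return parse_version(version)[:2] >= minimum
```

Comparing integer tuples instead of strings matters here: as plain strings, "7.9" would sort after "7.17" and pass the check by mistake.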

Added

Added dedicated continue watching section on top of homepage if you have any in progress videos

Added limited options for per channel customization wiki, with potential for future expansion:

Download format

Delete watched videos after x days

Index Playlist

Added startup check to validate minimum and maximum supported Elasticsearch version

Changed

Changed how subtitles are indexed to reduce overhead by joining multiple lines in one document

Changed how the download queue is indexed and built for better extensibility

Changed index playlist to be part of the per channel settings

Improved deploy.sh to now run in a testing environment without previous configurations

Improved deploy.sh to install debug tools in testing environment

Fixed

Fixed ignore progress bar if video is watched

Fixed how auto generated subtitles are parsed, #180

v0.1.2(Feb 26, 2022)
Added

Added storing playback position and continue watching from where you left off, shout out to @n8detar

Added watch progress bar indicator over video thumbnail

Changed

Changed API endpoints to return config key by default

Fixed

Fixed a bug where subscribed playlist would auto unsubscribe when a new video was added.

You might want to double check your playlist subscriptions and re-subscribe if needed

Fixed rescan error if the channel doesn't exist anymore, #175

Fixed build error and better ffmpeg URL extractor from GitHub release API

Fixed some more small bugs, so we can create more later.

Note

This broke compatibility between unstable builds and requires a reset of Redis by deleting dump.rdb in the /data volume of Redis to reset your user configurations. Regular installations with the latest tag are not affected.

v0.1.1(Feb 13, 2022)
Note

This update will automatically recreate and change the ta_video and ta_channel index to pick up the new mappings

Additionally a new index ta_subtitle will get created to index subtitles

Added

Added subtitle download, display and index support, wiki

searching subtitles is pending

Added Google Cast support, shout out to @n8detar

Added a new wiki page: FAQ

Added backfill functionality for videos to get missing returnyoutubedislike.com ratings, if integration is enabled

Added additional fields to the channel metadata indexing for future use

Added a hint of what to do when there are no videos yet, shout out to @SteVwonder

Added link to Helm Chart, shout out to @insuusvenerati

Added a few barely useful API endpoints, link

Added browser extension proof of concept, link

Changed

Changed JS player: Lots of improvements on the integrated video player, shout out to @n8detar

Show more metadata: likes, dislikes, views

Auto mark video as watched at 90%

Changed toggle: UI improvements inverting toggle to indicate current state, shout out to @GigaFyde

Major refactor and reorganization of all python code for reusability and readability improvements
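
The 90% rule for auto marking a video as watched, mentioned in the player changes above, comes down to a simple ratio check. The real player logic lives in the JavaScript frontend; the following Python sketch only illustrates the rule, and the names WATCHED_THRESHOLD and is_watched are hypothetical.

```python
WATCHED_THRESHOLD = 0.9  # the 90% mark named in the release note


def is_watched(position, duration, threshold=WATCHED_THRESHOLD):
    """Return True once playback has reached the threshold share of the video.

    position and duration are both in seconds.
    """
    if duration <= 0:
        # Unknown or invalid duration: never auto mark as watched.
        return False
    return position / duration >= threshold
```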

Fixed

Fixed last page error for more than 10k video pagination, #156

Fixed issues with non ASCII character channel name, #127, #146

If you were affected by this bug, delete the channels then future downloads should work fine.

Fixed edge case where thumbnail embed failed, added atomicparsley to the image, #155

Fixed previous workaround with django debug variable, #159

Thank You

Thank you to everybody who is contributing to the improvement of this project! Join us on our Discord.

Help needed

There is a proof of concept browser extension that is waiting for you to improve on, yes you! :-) Join us on Discord.
v0.1.0(Jan 8, 2022)
Connect

We now have a brand new discord server, join us here!

Join us in our brand new dedicated subreddit: r/TubeArchivist

Note

This update will automatically recreate the indexes to allow for better search functionality. As always, this will automatically create a backup first and can take up to a few minutes.

Added

Dedicated search page for search as you type over the whole index to dynamically get search results for videos, channels and playlists.

English language analyzer for improved search matching, fixing stemming, plural/singular and some other issues

Optional integration with returnyoutubedislike.com to get dislikes and average ratings back

Changed

All django views have been refactored, shout out to @pawwel thank you for your help! #115, #116

All search functionality is now consolidated on the dedicated search page, this will replace the previous per page search forms

The sort order functionality is now integrated in the view style switch area, making things more compact

The subscribe to channel and subscribe to playlist forms are restyled in a more compact format

Wiki pages got a refresh…

Fixed

Fixed lots of minor UI issues...

Fixed auto delete error when there was nothing to auto delete, #122

Fixed remaining orphaned playlists of deleted channels, #118

v0.0.9(Dec 17, 2021)
Manual changes

This release requires you to set your timezone environment variable TZ for TubeArchivist, otherwise the fancy brand new scheduler won’t know what time it is.

To note

Due to a mapping change this will automatically recreate the ta_playlist index in Elasticsearch on startup. As always an automatic backup of the index will be created first.

There is now an unstable release tag published to docker hub that will get updated between the different releases to quickly pull images to look at changes. As the name implies this is unstable WIP and only for your testing environment, more under Contributing.

Even though reasonably up-to-date Elasticsearch images were never vulnerable to the log4j vulnerability, this might be a good opportunity to update to the latest 7.16.1.

Added

Added cron like scheduler support to automatically:

Rescan Subscriptions

Start download

Refresh Metadata

Thumbnail check

Index backup

Scheduler configurations and examples are in the wiki.

Added optional auto delete of watched videos, #56

Added remember me to define your session’s lifetime, #77

Added login page autofocus to user name form field, #104

Added port overwrite environment variables for nginx and uwsgi to deal with otherwise unresolvable port collisions, #103, Readme

The favicons, the whole favicons and nothing but the favicons, #93

Dynamic copyright for footer, #107

Changed

Rewrite of the notifications functionality into separate message channels to have notifications for different topics on different pages for better and extendable UI feedback.

Changed the video player to theater mode to use more of the available space, #98, #95

Changed the restore backup view on the settings page to use tagged backup names to distinguish between backup files created automatically, manually and due to an update, wiki

Changed the subscribe and unsubscribe button to single color coded toggle, #62

Fixed

Fixed extractor error for old playlist format, #94

Fixed empty playlist index rescan error, #101

Fixed missing average rating due to disabled dislike button, #109

Thank you

As always, thank you to everybody opening issues and contributing to the improvement of this project!
v0.0.8(Nov 27, 2021)
Index update

This update will make changes to the Elasticsearch index storing the videos ta_video and will create a new index called ta_playlist that will contain the playlists. This should be automatic at startup and even for big archives shouldn’t take more than 1 min. Tube Archivist will automatically make a backup of the index before starting that process.

Added

Playlist support: Subscribe to playlists, add videos to the queue with "Rescan subscriptions" button from downloads page, described here

Playlist support: Find and index playlists from selected channels from the channel detail page, described here

Playlist navigation at the bottom of the video page for videos part of a playlist.

Added original youtube link to video, channel detail and playlist detail page if the link is still available, #81

Added subscribe button directly to the channel and playlist, #81

Added delete download queue button to the downloads page, #85

Added a note about disk usage on the readme, #91

Changed

Changed thumbnail extraction with --check-format to make sure the thumbnail it is trying to download is available, #83

Changed format selection to use the --check-format option to make sure the stream selected by yt-dlp is actually available, #90

Sadly both these changes will slow down adding videos to the download queue but will improve reliability...

Refactoring and performance improvements for index scanning.

Fixed

Fixed very silly youtu.be extractor error, #40

Added KB/s unit back to settings page, #87

Fixed indexing and downloading multi feed videos.

Looking for feedback

As always, thank you to everybody opening issues and helping to improve this project. If you have any feedback on playlists in particular, there is a discussion thread going.
v0.0.7(Nov 1, 2021)
Breaking Changes

There are several breaking changes in this version:

First take down all containers.

TubeArchivist requires additional environment variables:

Authentication for TubeArchivist: TA_USERNAME and TA_PASSWORD

Authentication for Elasticsearch: ELASTIC_PASSWORD

Elasticsearch requires these changes:

Enable security by adding: xpack.security.enabled=true

And the matching ELASTIC_PASSWORD.

Take a look at the updated docker-compose.yml file, use a better password than verysecret.

Naming of Redis values is now standardized to allow for per user configurations. This means all your previous configurations on the settings page will fall back to the default values. So most importantly, make sure to change the Download Format options to your preference before continuing to download.

To avoid having unused values set in Redis it is recommended to delete the file dump.rdb from the redis volume.

Not required but recommended: change the port settings for archivist-redis and archivist-es to expose. These ports don’t need to be accessible over the network. This is also changed in the updated docker-compose file.

Added

User authentication with limited multi user support:

Each user can have different interface settings

For now all users share the same videos and permissions...

Check out the Users section of the wiki for more details

Extending sort by options and add asc/desc switch on home page

Added same sorting and filtering options to the channel page as well

Implemented --throttled-rate option of yt-dlp

Re embed thumbnails into media file after downloading

Channel names are now supported and will get automatically translated to the correct channel ID, #40

Changed

Making HOST_UID and HOST_GID optional for NFS compatibility, #58

Better progress information for adding to queue and rescanning functions

Calls to elasticsearch are now authenticated with credentials set with environment variables

Input forms are now validated before processing, increasing security

Redis keys are now name and user spaced, hence the breaking change

Fixed

Fix iOS compatibility issues with format example, #61

Lots of additional bug fixes and improvements, #28 #60 #64 #73 #75

Thank you to everybody opening issues and helping to improve Tube Archivist!
v0.0.6(Oct 17, 2021)
Added

Embed thumbnail into media file postprocessor

Rescan filesystem to clean up index

Delete video button

Delete channel button

Changed

Rewrite of the artwork extraction and downloading classes, see below for more details

Showing default artwork when none is available to avoid breaking the interface.

Average video rating now shows as nostalgic stars.

The watched/unwatched checkbox is now a toggle, so you can revert the change back.

Fixed

New channel media folders will now get created with the correct permissions same as media files

Fixing an issue where a previously failed download task wouldn't clean up after itself

New architecture support

Additional installation instructions in the readme for:

arm64: Untested, looking for feedback, shout out to @lamusmaser

Unraid, shout out to @pairofcrocs

Synology, shout out to @geekedtv

Update path

The new thumbnail caching method is not backwards compatible. After updating, your already downloaded thumbnails will get reorganized into subfolders. Then Tube Archivist will scan your library and download all missing thumbnails. This can take a long time depending on your library size. docker-compose logs -f tubearchivist will confirm that something is happening. Then from this version on, new artwork will get downloaded once you add a video to the download queue instead of on demand when the interface needs it. This has a few key advantages:

Future proof the cache/video folder to not hold potentially tens of thousands of video thumbnails in one single folder.

Speed up the interface because all of the artwork will already be cached upfront.

Speed up the downloads view by using the cached thumbnails instead of loading them from youtube.

Guarantee that there is artwork available even if the video disappears later.

More in the spirit of the Archivist to make sure all relevant information is safely stored and organized.

And speed up searching with artwork preview in an upcoming version...
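
To illustrate how reorganizing thumbnails into subfolders keeps any single folder small, here is a minimal sketch. The sharding scheme is an assumption made for this example (one subfolder per lowercased first character of the video ID), and the function name thumbnail_path is hypothetical; the release note does not spell out the actual layout.

```python
from pathlib import PurePosixPath


def thumbnail_path(cache_root, video_id):
    """Place a thumbnail in a subfolder named after the first character of its ID.

    Assumed scheme for illustration only, not necessarily the real layout.
    """
    if not video_id:
        raise ValueError("video_id must not be empty")
    shard = video_id[0].lower()
    return PurePosixPath(cache_root) / shard / f"{video_id}.jpg"
```

With a scheme like this, the thumbnails are spread over a few dozen subfolders instead of piling up in one directory.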

Clean up

Due to not handling 404 errors in thumbnail extraction before, you might have ended up with some placeholder thumbnails from youtube looking like this or with an HTML error file for channel artwork. That's not really a problem, but if you want to replace them with a beautiful Tube Archivist styled placeholder instead, shut down the container and continue:

Running this command from the cache/video folder on your host system will show all failed video thumbnails:

find . -type f -exec md5sum {} \; | grep 2f5b1b159ee4893e015e1c373111919b

If that doesn't give any output, you are golden, else this command will delete all thumbnails matching that specific hash:

find . -type f -exec md5sum {} \; | grep 2f5b1b159ee4893e015e1c373111919b | awk '{$1 = "rm" ; print }' | bash
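
Where md5sum or awk are not at hand, the same check can be sketched in Python. This is a minimal sketch and not part of Tube Archivist: the function name find_by_md5 is made up for illustration, and it only lists matching files so you can review them before deleting anything.

```python
import hashlib
from pathlib import Path

# MD5 of the placeholder thumbnail, taken from the find commands above.
PLACEHOLDER_MD5 = "2f5b1b159ee4893e015e1c373111919b"


def find_by_md5(root, target_md5=PLACEHOLDER_MD5):
    """Return all regular files below root whose MD5 digest equals target_md5."""
    matches = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        digest = hashlib.md5(path.read_bytes()).hexdigest()
        if digest == target_md5:
            matches.append(path)
    return matches
```

Calling find_by_md5(".") from the cache/video folder returns the same files the first find command prints; each returned path can then be removed with its unlink() method.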

Similar for the channel artwork, 404 errors resulted in downloading an HTML page with the content of just that. To find all HTML files, run this from the cache/channel folder:

find . -print | file -if - | grep "text/html" | awk -F: '{print $1}'

Again, if you don't get any output, you are good. If you do see any matching files, don't be confused by the file extension: these aren't actually JPGs. Run this command to delete these files:

find . -print | file -if - | grep "text/html" | awk -F: '{print $1}' | xargs rm
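
The HTML detection can also be approximated in Python. Note that this is a rough stand-in for file -i, not an equivalent: it only looks at the first bytes of each file for a typical HTML opening, and the names HTML_MARKERS, looks_like_html and find_html_files are hypothetical.

```python
from pathlib import Path

# Leading bytes that suggest a file is an HTML page rather than an image.
HTML_MARKERS = (b"<!doctype html", b"<html")


def looks_like_html(path):
    """Inspect the first bytes of the file for a typical HTML opening."""
    with open(path, "rb") as handle:
        head = handle.read(512).lstrip().lower()
    return head.startswith(HTML_MARKERS)


def find_html_files(root):
    """Return all regular files below root whose content looks like HTML."""
    return [
        path
        for path in sorted(Path(root).rglob("*"))
        if path.is_file() and looks_like_html(path)
    ]
```

As with the shell version, review the returned list first; the file extension says JPG, but the content decides.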

Sorry for the complications....
v0.0.5(Oct 3, 2021)
Added

Added grid and list view switch for all archive pages

The Downloads view now has an ignored view list toggle to show all previously ignored videos and provides options to unignore them

The Github wiki is now where all the user documentation is located.

Tube Archivist can now integrate with a custom RedisJSON port.

Added a section in the contributing page about how to set up your testing environment.

Changed

Tube Archivist now utilizes the patched nightly builds of ffmpeg for best compatibility with yt-dlp, #37 #26

The About page now contains just useful links as the documentation is now consolidated into the github wiki.

Converted some true/false dropdowns to a toggle switch.

The “Download Queue” button on the download page is now called “Start Download” for better clarity.

Fixed

Cleaned up startup functions into a dedicated Django ready method to avoid double execution.

There is still a lot of refactoring and cleaning up going on.

v0.0.4(Sep 26, 2021)
Added

Added a readme section about updating Tube Archivist and expected future changes.

Added some donating links, #29

Added a docs folder to start working on a Tube Archivist wiki. Asking for help @TechnicallyOffbeat

Changed

Changed how the download queue works: Is now dynamic to allow for gracefully stopping and ungracefully killing the process.

Default download limit value is now disabled on new installations, as the dynamic queue offers better ways to stop the download process.

“Download now” function allows you to set the video as a priority download in front of an already running queue.

Additionally the download order is more logical now: New videos get added to the back of the queue, videos start downloading from the top of the queue.
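
The queue order described above can be modelled with a double ended queue. This is a toy in-memory model for illustration only: the class and method names are hypothetical, and the real queue is tracked by the application rather than in a Python deque.

```python
from collections import deque


class DownloadQueue:
    """Toy model of the queue order, not the real implementation."""

    def __init__(self):
        self._pending = deque()

    def add(self, video_id):
        """New videos go to the back of the queue."""
        if video_id not in self._pending:
            self._pending.append(video_id)

    def download_now(self, video_id):
        """Move or insert a video at the front so it is picked up next."""
        if video_id in self._pending:
            self._pending.remove(video_id)
        self._pending.appendleft(video_id)

    def next_video(self):
        """Return the next video to download, or None when the queue is empty."""
        return self._pending.popleft() if self._pending else None
```

Because the downloader asks for one video at a time, stopping gracefully just means not calling next_video again after the current download finishes.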

Fixed

Better error handling in add to download form

Sanitizing directory scan output from hidden and temporary files, #30

v0.0.3(Sep 22, 2021)
Added

Added support to restore index from backup zip file

Post-processors support for yt-dlp and optional embedding of metadata into media file, #21 shout out to @nifoc

Now showing current version number in the footer for easy reference

There is now a CONTRIBUTING.md file

Linting and code formatting rules, shout out to @cclauss

Changed

Lots and lots of improvements, refactoring, cleaning in the code base to make things presentable

Fixed

Fixed lots of grammar and spelling issues, shout out to @TechnicallyOffbeat

v0.0.2(Sep 17, 2021)
Added

backup metadata db to disk

readme section about Elasticsearch permission error

Changed

Download view now has pagination to avoid loading too many thumbnails at once

Fixed

now publishing same image to docker for latest and newest semantic version

fixed blocking issue with download now

fixed staticfile collection throwing an error on container restart because files are already there

v0.0.1(Sep 15, 2021)
Added

Importing and indexing existing video collection into archive.

Changed

Subscribe to channel now takes a list of channels

Fixed

Fixed scraping issue with EU cookie consent screen #2

Fixed issue where subscribing to invalid channel ids froze interface
