Cleaner script to normalize knock's output EPUBs

Overview

clean-epub

The excellent knock application by Benton Edmondson outputs EPUBs that seem to be DRM-free. However, if you run the application twice on the same ACSM file, the hashes of the two output files do not match.

This script normalizes EPUB files, and it is written specifically for the output files of knock. It strips away the differences between EPUB files generated for the same book, so that repeated runs produce identical files.

Usage

./clean-epub.py -i input.epub -o output.epub
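
To check that two cleaned files really are identical, their hashes can be compared. The following is a minimal sketch and not part of clean-epub itself; the helper names `sha256_of` and `same_content` are made up for illustration.

```python
import hashlib


def sha256_of(path):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large EPUBs are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True when both files have the same SHA-256 digest."""
    return sha256_of(path_a) == sha256_of(path_b)
```

For example, `same_content("first-clean.epub", "second-clean.epub")` should return True for two cleaned downloads of the same book (the file names here are placeholders).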

Details

In essence, it does this:

  • Create a temporary directory, and unzip the input EPUB into it
  • Set all the access and modification times for files and directories in the temporary directory to a fixed date
  • Zip the contents of the temporary directory into a new EPUB, without adding extra file attributes
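
The steps above can be approximated in pure Python. This sketch is not the script's own code: it swaps the temporary directory and external zip tool for Python's `zipfile` module and rewrites the archive entry by entry. The function name `normalize_epub`, the fixed date, and the fixed permission bits are assumptions chosen for illustration.

```python
import zipfile

# Assumed fixed timestamp; 1980-01-01 is the earliest date the zip format can store.
FIXED_DATE = (1980, 1, 1, 0, 0, 0)


def normalize_epub(input_path, output_path):
    """Rewrite a zip/EPUB with sorted entries and fixed per-entry metadata."""
    with zipfile.ZipFile(input_path) as source:
        # Skip explicit directory entries; only files are carried over.
        names = [n for n in source.namelist() if not n.endswith("/")]
        # Put "mimetype" first, then everything else in sorted order.
        names.sort(key=lambda n: (n != "mimetype", n))
        with zipfile.ZipFile(output_path, "w") as target:
            for name in names:
                info = zipfile.ZipInfo(name, date_time=FIXED_DATE)
                info.create_system = 3            # always record "Unix", whatever the host OS
                info.external_attr = 0o644 << 16  # fixed permission bits
                if name == "mimetype":
                    info.compress_type = zipfile.ZIP_STORED
                else:
                    info.compress_type = zipfile.ZIP_DEFLATED
                target.writestr(info, source.read(name))
```

Because the entry order, timestamps, and attributes no longer depend on the input archive or the filesystem, two inputs with identical file contents should produce identical output on the same machine. The compressed bytes may still differ between zlib versions, so hashes are only comparable between runs that use the same Python and zlib build.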

I tested it on Ubuntu 20.04, with EPUBs bought from Bol(dot)com.

There are three reasons why you might want to test it more extensively before depending on it:

  • Ebooks from other sellers might contain more identifying details than just the timestamps and order of the files in the zip.
  • The zip implementation of another OS might not deterministically order the input files of the new zip file.
  • Your OS might track more file metadata fields than just access and modification times (I am not sure whether any do).
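
One way to test for leftover differences is to compare the per-entry metadata of two cleaned files. The sketch below is an illustration only, not part of the script; `describe_entries` and `differing_entries` are hypothetical helper names. A mismatch in the CRC column would point at identifying details inside a file's contents, which this script does not touch.

```python
import zipfile


def describe_entries(epub_path):
    """List (name, timestamp, CRC, extra-field length) per zip entry, in archive order."""
    with zipfile.ZipFile(epub_path) as archive:
        return [
            (info.filename, info.date_time, info.CRC, len(info.extra))
            for info in archive.infolist()
        ]


def differing_entries(path_a, path_b):
    """Return the entry descriptions that appear in only one of the two archives."""
    entries_a = describe_entries(path_a)
    entries_b = describe_entries(path_b)
    only_a = [entry for entry in entries_a if entry not in entries_b]
    only_b = [entry for entry in entries_b if entry not in entries_a]
    return only_a + only_b
```

An empty result means the names, timestamps, checksums, and extra-field sizes agree. To also catch a different entry order, compare the two `describe_entries` lists directly.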
Comments
  • Doesn't work if /tmp is on a different filesystem than the output directory

    Some distros (for example Arch) mount a RAM filesystem at /tmp, which breaks this script because os.rename cannot move a file across filesystems. As a fix, I'd suggest creating the new EPUB directly in the target directory, instead of creating it in /tmp and then moving it.

    Log:

    ➜ ./clean-epub.py -i ../Seirei\ Gensouki:\ Spirit\ Chronicles\ Volume\ 16.epub -o ../Seirei\ Gensouki:\ Spirit\ Chronicles\ Volume\ 16\ Clean.epub 
    Archive:  ../Seirei Gensouki: Spirit Chronicles Volume 16.epub
     extracting: /tmp/tmp-epub-8016426/mimetype  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/chapter1.xhtml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/copyright.xhtml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Images/FrontMatter4.jpg  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Styles/stylesheet.css  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/insert6.xhtml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/insert3.xhtml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/chapter6-2.xhtml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/afterword.xhtml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Images/Insert10.jpg  
      inflating: /tmp/tmp-epub-8016426/OEBPS/toc.ncx  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/frontmatter2.xhtml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Images/Cover.jpg  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/insert7.xhtml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/frontmatter1.xhtml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/frontmatter6.xhtml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/signup.xhtml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/frontmatter4.xhtml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/chapter6.xhtml  
      inflating: /tmp/tmp-epub-8016426/META-INF/container.xml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Images/Insert4.jpg  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Images/Insert5.jpg  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Images/Insert2.jpg  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/interlude2.xhtml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/insert5.xhtml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Images/Insert1.jpg  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/epilogue.xhtml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Images/FrontMatter2.jpg  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/toc.xhtml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/chapter5-1.xhtml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Images/Insert9.jpg  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/chapter2-1.xhtml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/insert2.xhtml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/interlude3.xhtml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/prologue.xhtml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/chapter6-1.xhtml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/chapter1-1.xhtml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/cover.xhtml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/insert1.xhtml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Images/FrontMatter6.jpg  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/chapter3.xhtml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/chapter5.xhtml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Images/Insert3.jpg  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/insert10.xhtml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/insert4.xhtml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Images/jnovelclubCMYK.png  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/frontmatter3.xhtml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Images/FrontMatter3.jpg  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/chapter4.xhtml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Images/Insert7.jpg  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/interlude1.xhtml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Images/FrontMatter5.jpg  
      inflating: /tmp/tmp-epub-8016426/OEBPS/_page_map_.xml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/frontmatter5.xhtml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/insert9.xhtml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Images/Insert8.jpg  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Images/FrontMatter1.jpg  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/chapter7.xhtml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/volume.opf  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/insert8.xhtml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Text/chapter2.xhtml  
      inflating: /tmp/tmp-epub-8016426/OEBPS/Images/Insert6.jpg  
      adding: META-INF/ (stored 0%)
      adding: META-INF/container.xml (deflated 34%)
      adding: OEBPS/ (stored 0%)
      adding: OEBPS/volume.opf (deflated 83%)
      adding: OEBPS/_page_map_.xml (deflated 88%)
      adding: OEBPS/toc.ncx (deflated 79%)
      adding: OEBPS/Styles/ (stored 0%)
      adding: OEBPS/Styles/stylesheet.css (deflated 73%)
      adding: OEBPS/Images/ (stored 0%)
      adding: OEBPS/Images/Insert6.jpg (deflated 1%)
      adding: OEBPS/Images/FrontMatter1.jpg (deflated 0%)
      adding: OEBPS/Images/Insert8.jpg (deflated 0%)
      adding: OEBPS/Images/FrontMatter5.jpg (deflated 3%)
      adding: OEBPS/Images/Insert7.jpg (deflated 0%)
      adding: OEBPS/Images/FrontMatter3.jpg (deflated 0%)
      adding: OEBPS/Images/jnovelclubCMYK.png (deflated 1%)
      adding: OEBPS/Images/Insert3.jpg (deflated 1%)
      adding: OEBPS/Images/FrontMatter6.jpg (deflated 2%)
      adding: OEBPS/Images/Insert9.jpg (deflated 1%)
      adding: OEBPS/Images/FrontMatter2.jpg (deflated 0%)
      adding: OEBPS/Images/Insert1.jpg (deflated 1%)
      adding: OEBPS/Images/Insert2.jpg (deflated 1%)
      adding: OEBPS/Images/Insert5.jpg (deflated 1%)
      adding: OEBPS/Images/Insert4.jpg (deflated 1%)
      adding: OEBPS/Images/Cover.jpg (deflated 0%)
      adding: OEBPS/Images/Insert10.jpg (deflated 1%)
      adding: OEBPS/Images/FrontMatter4.jpg (deflated 0%)
      adding: OEBPS/Text/ (stored 0%)
      adding: OEBPS/Text/chapter2.xhtml (deflated 63%)
      adding: OEBPS/Text/insert8.xhtml (deflated 43%)
      adding: OEBPS/Text/chapter7.xhtml (deflated 66%)
      adding: OEBPS/Text/insert9.xhtml (deflated 43%)
      adding: OEBPS/Text/frontmatter5.xhtml (deflated 43%)
      adding: OEBPS/Text/interlude1.xhtml (deflated 58%)
      adding: OEBPS/Text/chapter4.xhtml (deflated 66%)
      adding: OEBPS/Text/frontmatter3.xhtml (deflated 42%)
      adding: OEBPS/Text/insert4.xhtml (deflated 43%)
      adding: OEBPS/Text/insert10.xhtml (deflated 43%)
      adding: OEBPS/Text/chapter5.xhtml (deflated 60%)
      adding: OEBPS/Text/chapter3.xhtml (deflated 62%)
      adding: OEBPS/Text/insert1.xhtml (deflated 43%)
      adding: OEBPS/Text/cover.xhtml (deflated 47%)
      adding: OEBPS/Text/chapter1-1.xhtml (deflated 58%)
      adding: OEBPS/Text/chapter6-1.xhtml (deflated 65%)
      adding: OEBPS/Text/prologue.xhtml (deflated 54%)
      adding: OEBPS/Text/interlude3.xhtml (deflated 59%)
      adding: OEBPS/Text/insert2.xhtml (deflated 43%)
      adding: OEBPS/Text/chapter2-1.xhtml (deflated 65%)
      adding: OEBPS/Text/chapter5-1.xhtml (deflated 62%)
      adding: OEBPS/Text/toc.xhtml (deflated 68%)
      adding: OEBPS/Text/epilogue.xhtml (deflated 60%)
      adding: OEBPS/Text/insert5.xhtml (deflated 43%)
      adding: OEBPS/Text/interlude2.xhtml (deflated 63%)
      adding: OEBPS/Text/chapter6.xhtml (deflated 50%)
      adding: OEBPS/Text/frontmatter4.xhtml (deflated 43%)
      adding: OEBPS/Text/signup.xhtml (deflated 46%)
      adding: OEBPS/Text/frontmatter6.xhtml (deflated 42%)
      adding: OEBPS/Text/frontmatter1.xhtml (deflated 42%)
      adding: OEBPS/Text/insert7.xhtml (deflated 43%)
      adding: OEBPS/Text/frontmatter2.xhtml (deflated 43%)
      adding: OEBPS/Text/afterword.xhtml (deflated 50%)
      adding: OEBPS/Text/chapter6-2.xhtml (deflated 54%)
      adding: OEBPS/Text/insert3.xhtml (deflated 43%)
      adding: OEBPS/Text/insert6.xhtml (deflated 43%)
      adding: OEBPS/Text/copyright.xhtml (deflated 54%)
      adding: OEBPS/Text/chapter1.xhtml (deflated 65%)
      adding: mimetype (stored 0%)
    Traceback (most recent call last):
      File "/home/laurin/clean-epub/./clean-epub.py", line 48, in <module>
        os.rename(f"{tempdirname}/{tempfilename}", outputfile)
    OSError: [Errno 18] Invalid cross-device link: '/tmp/tmp-epub-8016426/my-clean-9212175.epub' -> '../Seirei Gensouki: Spirit Chronicles Volume 16 Clean.epub'
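
A possible alternative to the fix suggested above is to replace the failing `os.rename` call with `shutil.move` from the standard library, which also works when /tmp and the output directory are on different filesystems. This is a sketch, not a tested patch for the script; the wrapper name `move_into_place` is hypothetical.

```python
import shutil


def move_into_place(temp_path, outputfile):
    """Move the finished file to its destination, even across filesystems."""
    # shutil.move tries a plain rename first and falls back to
    # copy-then-delete when the rename fails across filesystems.
    return shutil.move(temp_path, outputfile)
```

In the script this would be called with the same two arguments as the `os.rename` call on line 48 of the traceback.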
    
    opened by laurinneff 2