epub2sphinx is a tool to convert epub files to ReST for Sphinx

Overview

epub2sphinx

epub2sphinx is a tool that converts EPUB files to reStructuredText (ReST) for Sphinx.

It uses Pandoc for converting HTML data inside epub files into ReST.

It creates a directory structure similar to what sphinx-quickstart generates by default.

Installation

  • Install pandoc

    # On Ubuntu
    sudo apt-get install pandoc
    # On Arch Linux
    pacman -S pandoc

    For other platforms, see the Pandoc installation instructions.

  • Install epub2sphinx

    python setup.py install

  • Install Sphinx

    epub2sphinx can generate ReST files without Sphinx, but Sphinx is used to build the HTML files if --build or --serve flags are used.

    pip3 install sphinx

Usage

Usage: epub2sphinx [-o <output_directory_path>] [-t <sphinx_theme_name>] [-s|--serve|-b|--build] [-c] <epub_file_name>

  This tool helps you to convert your epub files into sphinx format for a better reading experience.
  Kindly provide the epub file as the argument to this command.

Options:
  -o, --output-directory PATH  The name of the output directory where the ReST files will be generated.
                               Make sure that the given directory does not already exist.
  -t, --theme TEXT             The name of the Sphinx theme. You can check for the available themes at:
                               <https://www.sphinx-doc.org/en/master/usage/theming.html#builtin-themes>
  -b, --build                  Build HTML from the generated ReST files using Sphinx.
                               Sphinx has to be installed for this to work.
  -s, --serve                  Build HTML using Sphinx and Serve the files on localhost.
                               Sphinx has to be installed for this to work.
  -c, --include-custom-css     Include the custom CSS from the EPUB for the HTML output
  --version                    Show the version and exit.
  --help                       Show this message and exit.

Example

epub2sphinx -o out_dir -t classic my_book.epub

# To generate HTML files using Sphinx
cd out_dir
make html

Use case

epub2sphinx can be used to convert public domain or CC-licensed EPUB files into static web pages that allow people to read them online. This is useful for sites like Project Gutenberg or FreeTamilEbooks. Even though Project Gutenberg has an option to read online, its presentation is very plain. Sphinx lets us apply any built-in or custom Sphinx theme to make the books look better.

Screenshots of comparison

Project Gutenberg online read vs Sphinx generated output

[Screenshots: Project Gutenberg's online reader and the Sphinx-generated output]

Comments
  • we should get only the chapter titles as the main links

    Thanks for the great work.

    Testing with this book. https://freetamilebooks.com/ebooks/namma_ooru_kovai/

    epub2sphinx -o out_dir -t alabaster namma_ooru_kovai.epub -b -s

    There are too many "Front Page" text entries as links on the index pages and sidebar.

    We should get only the chapter titles as the main links.

    Please look into this.

    opened by tshrinivasan 3
  • Make progress bar get updated for each stage

    We should find a way to update the progress bar after each stage of conversion.

    In the generate rst stage, we can update after every file is converted to ReST.

    opened by nifey 2
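
A minimal sketch of per-file progress reporting, using only the standard library. The names `convert_files`, `convert_one`, and `report` are hypothetical and not part of epub2sphinx; the real tool may drive its progress bar differently.

```python
from typing import Callable, Iterable, List


def convert_files(
    files: Iterable[str],
    convert_one: Callable[[str], str],
    report: Callable[[int, int], None],
) -> List[str]:
    """Convert each file and call report(done, total) after every file."""
    file_list = list(files)
    total = len(file_list)
    results = []
    for done, name in enumerate(file_list, start=1):
        results.append(convert_one(name))  # e.g. one HTML -> ReST conversion
        report(done, total)                # advance the progress bar one step
    return results
```

Passing the reporter in as a callback keeps the conversion loop independent of whichever progress bar library ends up being used.
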
  • Build HTML files by default

    Right now, to generate HTML files from an epub file, we do something like the following:

    epub2sphinx -o sample -t classic sample.epub   # (Step 1)
    cd sample                                      # (Step 2)
    make html                                      # (Step 3)

    With this change, (Step 1) will be enough to generate the HTML files. The HTML files will be generated even if the -b flag is not specified.

    If the -B flag is specified, execution stops without any post-conversion steps.

    In addition to this,

    • included a change to ignore .tmp folder in the project
    • refactored two methods in cli.py and convert.py
    • minor style changes
    opened by arvindpz 1
  • Show progress bar for the ReST conversion process

    Previously, the progress bar tracked the conversion of whole input files. Since there is (currently) only one input file, the bar did not look good and did not show where the time was actually being spent.

    Fix for issue #19

    Old progress bar UI

    https://user-images.githubusercontent.com/24192122/142731231-a5e70359-3d93-47b1-b213-285fd7d6bca8.mov

    New progress bar UI

    https://user-images.githubusercontent.com/24192122/142731240-588dc93c-6393-4e98-8830-65f89d0f1c1f.mov

    enhancement 
    opened by AswinChand97 1
  • Speed improvement

    Right now, the tool is bottlenecked especially by the generate rst stage. Some possible approaches to improve speed of conversion:

    • Check if unzipping the epub and directly accessing the html files is faster than obtaining it through ebooklib.
    • Check if we can execute subprocesses for pandoc conversion in parallel
    opened by nifey 1
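
The second idea above (running the pandoc conversions in parallel) could be sketched as follows. This is an illustration under assumptions, not the tool's implementation: `html_to_rst` and `convert_parallel` are hypothetical names, and the sketch assumes each chapter is available as an HTML string and that the `pandoc` executable is on the PATH.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def html_to_rst(html: str) -> str:
    """Convert one HTML string to ReST by piping it through the pandoc CLI."""
    proc = subprocess.run(
        ["pandoc", "-f", "html", "-t", "rst"],
        input=html,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if pandoc fails
    )
    return proc.stdout


def convert_parallel(
    documents: Sequence[str],
    convert: Callable[[str], str] = html_to_rst,
    max_workers: int = 4,
) -> List[str]:
    """Run convert() on every document concurrently, preserving input order."""
    # Threads are enough here: each worker mostly waits on an external process.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(convert, documents))
```

Whether this is actually faster would still need to be measured on real books, as the issue suggests.
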
  • Copy fonts and Don't copy ncx files

    Fonts are not getting copied. We also need to avoid copying files that we don't need, such as the ncx file, and avoid creating empty folders.

    We can change --include-custom-css to --include-custom-style to copy both CSS stylesheets and font files together i.e. either copy both stylesheets and fonts or none of them.

    opened by nifey 1
  • Index page is empty

    We add genindex to the index.rst but the HTML page is empty.

    We can find out why that is happening and fix it or remove genindex since most sphinx themes have a table of contents.

    opened by nifey 1
  • Enable overwriting output directory

    As of now, the tool aborts with an error message if the output directory already exists. Instead of aborting we can ask the user if it is okay to overwrite the directory and then delete the directory if the user says yes.

    We can also add a --overwrite flag for the user to explicitly tell that overwriting the output directory is okay.

    opened by nifey 1
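
A rough sketch of the behaviour proposed above. `prepare_output_dir` and its parameters are hypothetical names; the `overwrite` argument stands in for the proposed --overwrite flag, and `confirm` defaults to the built-in `input` so that the prompt can be replaced in tests.

```python
import shutil
from pathlib import Path
from typing import Callable


def prepare_output_dir(
    path: str,
    overwrite: bool = False,
    confirm: Callable[[str], str] = input,
) -> bool:
    """Return True if path is free to be created, False if the user declined."""
    target = Path(path)
    if not target.exists():
        return True
    if not overwrite:
        answer = confirm(f"{target} already exists. Overwrite? [y/N] ")
        if answer.strip().lower() not in ("y", "yes"):
            return False
    if target.is_dir():
        shutil.rmtree(target)  # delete the old directory and its contents
    else:
        target.unlink()        # a plain file is in the way; remove it
    return True
```

Defaulting to "no" on an empty answer keeps an accidental Enter key press from deleting an existing directory.
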
  • Switch to a templating library

    Instead of implementing our own templating using regex, we can use an existing templating library like jinja.

    • [x] Implement templating for all files that we modify and copy
    • [x] Convert files in templates directory to use jinja
    • [x] Convert index.rst to use templates
    opened by nifey 1
  • Handle --serve failing when the port is already used

    Right now, if the port 8000 is already being used, the tool prints a stack trace.

    We can instead catch that exception and try a different port number instead or just print that the port is unavailable.

    We can also add an optional argument for the user to specify a port to serve on if --serve is used.

    opened by nifey 0
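
One possible way to fall back to another port, sketched with the standard `socket` module. `find_free_port` is a hypothetical helper, not an existing epub2sphinx function; it probes ports by trying to bind to them and treats any `OSError` as "unavailable".

```python
import socket


def find_free_port(start: int = 8000, attempts: int = 20, host: str = "127.0.0.1") -> int:
    """Return the first port in [start, start + attempts) that can be bound."""
    for port in range(start, start + attempts):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
            try:
                probe.bind((host, port))
            except OSError:
                continue  # port is taken (or not permitted); try the next one
            return port   # the probe socket is closed on leaving the with block
    raise RuntimeError(f"no free port in {start}-{start + attempts - 1}")
```

Another process could still grab the port between the probe and the actual server start, so the server startup itself should also handle `OSError` and print a short message instead of a stack trace.
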
  • Escape quotes and double quotes for conf.py

    Whenever the title or author name contains single or double quotes, the generation of conf.py breaks, which breaks the overall flow. Add a utility function that escapes single and double quotes.

    opened by AswinChand97 0
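
Since conf.py is a Python file, one option is to let Python's own `repr()` do the escaping instead of handling each quote character by hand. The sketch below assumes the values are written as plain string assignments; `python_string_literal` and `conf_assignment` are hypothetical helper names.

```python
def python_string_literal(value: str) -> str:
    """Return value as a Python string literal that is safe to paste into conf.py."""
    # repr() picks a quote style and escapes quotes, backslashes and newlines.
    return repr(value)


def conf_assignment(name: str, value: str) -> str:
    """Build one assignment line for conf.py, e.g. for project or author."""
    return f"{name} = {python_string_literal(value)}"
```

For example, `conf_assignment("author", "O'Neil")` produces a line that Python can parse back into the original string, regardless of which quotes the name contains.
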
  • Plugin system for different publishers

    Right now we have hardcoded some HTML transformations, such as removing epub:type and converting the image tag to an img tag. Different epub creators use different structures for the title page, TOC, about page, etc.

    It would be nice to have hooks, similar to the plugins of Ebooklib, that allow extra modifications specific to an epub creator. For example, books from Project Gutenberg have a notice at the beginning and end; a hook could move the notice to a different chapter.

    opened by nifey 0
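
A hook mechanism along these lines could be as small as a registry of functions keyed by publisher. Everything in this sketch is hypothetical (the registry, the decorator, the publisher name, and the notice marker); it only illustrates the shape such a plugin system might take.

```python
from typing import Callable, Dict, List

HtmlHook = Callable[[str], str]

# Hypothetical registry: publisher name -> hooks, applied in registration order.
_HOOKS: Dict[str, List[HtmlHook]] = {}


def register_hook(publisher: str) -> Callable[[HtmlHook], HtmlHook]:
    """Decorator that registers an HTML transformation for one publisher."""
    def decorator(func: HtmlHook) -> HtmlHook:
        _HOOKS.setdefault(publisher, []).append(func)
        return func
    return decorator


def apply_hooks(publisher: str, html: str) -> str:
    """Run every hook registered for publisher over one chapter's HTML."""
    for hook in _HOOKS.get(publisher, []):
        html = hook(html)
    return html


@register_hook("example-publisher")
def strip_notice(html: str) -> str:
    # Illustrative only: drop a made-up notice marker from the chapter.
    return html.replace("<p>NOTICE</p>", "")
```

Chapters from publishers with no registered hooks pass through `apply_hooks` unchanged, so the existing hardcoded transformations could be migrated one at a time.
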
  • Ops tasks

    • [ ] Add documentation
    • [ ] Publish documentation to Read the Docs
    • [ ] Inline documentation for functions
    • [ ] Add contributing guidelines and issue template
    • [ ] Add tests
    • [ ] Add CI to check with different python versions and newer versions of dependencies
    • [x] Publish to PyPI
    opened by nifey 0
Owner
Nihaal
M.S. Computer Science student