An experimental Fang Song style Chinese font generated with skeleton-tracing and pix2pix

Overview

An experimental Fang Song style Chinese font generated with skeleton-tracing and pix2pix, with glyphs based on cwTeXFangSong. The font is optimised for vertical typesetting. Below is a sample:

The font contains roughly 13,000 glyphs, mostly for traditional Chinese.

I created the font for one of my own projects. The font is admittedly not perfect, but nevertheless have many ineteresting features; therefore I am sharing the font file and programs used to generate it.

Download

Download the font directly at dist/tkFangSong.ttf.

The name of the font is 剔骨仿宋 (thek-kwot-fang-song, So named because the algorithm that created it resembles "deboning"). It is licensed under SIL open font license. If you wish to credit the author, you might use my name 黃令東/黄令东, or the romanization "Lingdong Huang".

Features

The font builds on the elegant shapes from cwTeXFangSong to add more hand-made look and feel reminiscent of the aesthetics of old woodblock printed books.

The font has a wider proportion compared to the original cwTeXFangSong, and is further widened towards the bottom, to accentuate the finishing strokes. The "center of mass" is also moved downwards:

Above right is a visualization of the base function used to warp the skeleton.

The height of a glyph is additionally tweaked based on its vertical complexity, computed with Sobel operator and taking the max of each pixel row.

Many fonts are optimized for horizontal typesetting, and as such, when arranged vertically, the center of mass shifts left and right, giving a jagged look. This font attempts to solve the problem by computing centroids (via image moments) and aligning them.

The font has rich textures. Some of them are artifacts produced by pix2pix network; others are fine-tuned noises delibrately added.

It is to be noted that, as an automated process, it doesn't always produce optimal results; some characters might end up looking ugly, or use the wrong caligraphic movement for certain strokes; For some caligraphers, some strokes might appear too "weak" to their tastes.

Process

The medial axis (skeleton) is computed for each raster rendering of the glyphs in the original font. (The resultant hershey font can be found at ./dist/CWFS64J.HF.TXT)

Pairs of images: the original rendering vs the skeletons are sent to pix2pix for training. pix2pix learns the correspondance and becomes capable of turning skeletons to glyphs.

New skeletons are generated by warping the originals according to my (questionable) taste.

All the new skeletons are fed into the trained network to obtain the new glyphs. The new glyphs are warped in structure, but the weight and shape of the strokes still look legit.

Some post-processing is applied, and potrace is used to re-vectorize the glyphs. Finally, fontforge is used to create a TTF file.

Building the font

Note that to use the font, you can simply download it here. This sections is for reproducing the results from scratch.

The scripts used to build the font are included in the workflow/ folder. Note that making the font is a quite involved process (especially the part of training the neural net). You might also need to modify the scripts to fit your system/folder configuration, but here are some rough steps:

  • Get OpenCV, tensorflow<=1.13.1, numpy and friends
  • First install swig version of skeleton-tracing for python, then obtain cwTeXFangSong from here.
  • Run skel.py > CWFS64.HF.TXT, then join.py > CWFS64J.HF.TXT
  • Modify pairs.py to read CWFS64J.HF.TXT and output to a folder you will create.
  • Download pix2pix-tensorflow and train on the images created in the previous step. (Good luck getting a prehistoric tensorflow project running, you'll need it. Once you do it's pretty straightforward, follow their README)
  • I trained 40K steps on 1.3K images, afterwhich the quality didn't seem to improve much, your mileage might vary.
  • Run warp.py > CWFS64W3.HF.TXT. Modify pairs.py to read from it, and create an output folder like before. Run pairs.py.
  • Evaluate the new image pairs with pix2pix. Edit any of the images that have too much defect with Photoshop or software of your choice. Put the edited ones in a separated folder, say retouched/.
  • Modify refine.py to read from the images and retouched images folders, create an output folder fine/ for it, and run the script.
  • Download potrace, make it runnable from commandline, and run trace_all.py.
  • Run forgefont.py to create a TTF from SVGs generated in the previous step.
  • Done! You can also preview the glyphs with preview.py > index.html, or preview the skeleton with preview_hf.py > index.html.

A PDF containing all the glyphs can be found here. If you find this font not bad, you might also enjoy qiji-font, a more authentic reproduction of a historical typeface.

You might also like...
A Python library that provides an easy way to identify devices like mobile phones, tablets and their capabilities by parsing (browser) user agent strings.

Python User Agents user_agents is a Python library that provides an easy way to identify/detect devices like mobile phones, tablets and their capabili

Format Covid values to ASCII-Table (Only for Germany and Austria)

Covid-19-Formatter (Only for Germany and Austria) Dieses Script speichert die gemeldeten Daten des RKIs / BMSGPK und formatiert diese zu einer Asci Ta

Text to ASCII and ASCII to text

Text2ASCII Description This python script (converter.py) contains two functions: encode() is used to return a list of Integer, one item per character

Hspell, the free Hebrew spellchecker and morphology engine.

Hspell, the free Hebrew spellchecker and morphology engine.

REST API for sentence tokenization and embedding using Multilingual Universal Sentence Encoder.

MUSE stands for Multilingual Universal Sentence Encoder - multilingual extension (supports 16 languages) of Universal Sentence Encoder (USE).

Hotpotato is a recipe portfolio App that assists users to discover and comment new recipes.
Hotpotato is a recipe portfolio App that assists users to discover and comment new recipes.

Hotpotato Hotpotato is a recipe portfolio App that assists users to discover and comment new recipes. It is a fullstack React App made with a Redux st

Etranslate is a free and unlimited python library for transiting your texts

Etranslate is a free and unlimited python library for transiting your texts

Answer some questions and get your brawler csvs ready!

BRAWL-STARS-V11-BRAWLER-MAKER-TOOL Answer some questions and get your brawler csvs ready! HOW TO RUN on android: Install pydroid3 from playstore, and

The project is investigating methods to extract human-marked data from document forms such as surveys and tests.
The project is investigating methods to extract human-marked data from document forms such as surveys and tests.

The project is investigating methods to extract human-marked data from document forms such as surveys and tests. They can read questions, multiple-choice exam papers, and grade.

Comments
  • Any plans to add in Chinese punctuation?

    Any plans to add in Chinese punctuation?

    Are there any plans to add in chinese punctuation characters which fit in font style? English letters are not so important but currently the chinese fullwidth punctuation glypls are also "default placeholder" glyphs

    I could try do it myself for myself but I'm worried that (a) I'll do a bad job of it and (b) fontforge will mess up the font file if I have to decompile and recompile the ttf, thus breaking it.

    opened by Adrakaris 0
  • Is the licensing appropriate? 授权可行吗?

    Is the licensing appropriate? 授权可行吗?

    版權爭議

    由於本專案內容的版權可能存有爭議(請見 https://github.com/l10n-tw/cwtex-q-fonts/issues/15 討論串),我們建議,在可能存在法律風險的使用情境中避免使用本專案字體,並改以其它自由字體(例如 Google Noto Sans CJK 系列)取代之。

    opened by NightFurySL2001 2
Owner
Lingdong Huang
Lingdong Huang
strbind - lapidary text converter for translate an text file to the C-style string

strbind strbind - lapidary text converter for translate an text file to the C-style string. My motivation is fast adding large text chunks to the C co

Mihail Zaytsev 1 Oct 22, 2021
Getting git-style versioning working on RDFlib

Getting git-style versioning working on RDFlib

Gabe Fierro 1 Feb 1, 2022
Markup is an online annotation tool that can be used to transform unstructured documents into structured formats for NLP and ML tasks, such as named-entity recognition. Markup learns as you annotate in order to predict and suggest complex annotations. Markup also provides integrated access to existing and custom ontologies, enabling the prediction and suggestion of ontology mappings based on the text you're annotating.

Markup is an online annotation tool that can be used to transform unstructured documents into structured formats for NLP and ML tasks, such as named-entity recognition. Markup learns as you annotate in order to predict and suggest complex annotations. Markup also provides integrated access to existing and custom ontologies, enabling the prediction and suggestion of ontology mappings based on the text you're annotating.

Samuel Dobbie 146 Dec 18, 2022
🐸 Identify anything. pyWhat easily lets you identify emails, IP addresses, and more. Feed it a .pcap file or some text and it'll tell you what it is! 🧙‍♀️

?? Identify anything. pyWhat easily lets you identify emails, IP addresses, and more. Feed it a .pcap file or some text and it'll tell you what it is! ??‍♀️

Brandon 5.6k Jan 3, 2023
You can encode and decode base85, ascii85, base64, base32, and base16 with this tool.

You can encode and decode base85, ascii85, base64, base32, and base16 with this tool.

null 8 Dec 20, 2022
StealBit1.1 and earlier strings and config extraction scripts

StealBit1.1 and earlier scripts Use strings_decryptor.py to extract RC4 encrypted strings from a StealBit1.1 sample(s). Use config_extractor.py to ext

Soolidsnake 5 Dec 29, 2022
Fixes mojibake and other glitches in Unicode text, after the fact.

ftfy: fixes text for you >>> print(fix_encoding("(ง'⌣')ง")) (ง'⌣')ง Full documentation: https://ftfy.readthedocs.org Testimonials “My life is li

Luminoso Technologies, Inc. 3.4k Jan 8, 2023
The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity

Contents Maintainer wanted Introduction Installation Documentation License History Source code Authors Maintainer wanted I am looking for a new mainta

Antti Haapala 1.2k Dec 16, 2022
Implementation of hashids (http://hashids.org) in Python. Compatible with Python 2 and Python 3

hashids for Python 2.7 & 3 A python port of the JavaScript hashids implementation. It generates YouTube-like hashes from one or many numbers. Use hash

David Aurelio 1.4k Jan 2, 2023
A generator library for concise, unambiguous and URL-safe UUIDs.

Description shortuuid is a simple python library that generates concise, unambiguous, URL-safe UUIDs. Often, one needs to use non-sequential IDs in pl

Stavros Korokithakis 1.8k Dec 31, 2022