Implementation of hashids (http://hashids.org) in Python. Compatible with Python 2 and Python 3

Overview

hashids for Python 2.7 & 3

A python port of the JavaScript hashids implementation. It generates YouTube-like hashes from one or many numbers. Use hashids when you do not want to expose your database ids to the user. Website: http://www.hashids.org/

Compatibility

hashids is tested with python 2.7 and 3.5–3.8. PyPy and PyPy 3 work as well.

https://travis-ci.org/davidaurelio/hashids-python.svg?branch=master

Compatibility with the JavaScript implementation

hashids/JavaScript hashids/Python
v0.1.x v0.8.x
v0.3.x+ v1.0.2+

The JavaScript implementation produces different hashes in versions 0.1.x and 0.3.x. For compatibility with the older 0.1.x version install hashids 0.8.4 from pip, otherwise the newest hashids.

Installation

Install the module from PyPI, e. g. with pip:

pip install hashids
pip install hashids==0.8.4 # for compatibility with hashids.js 0.1.x

Run the tests

The tests are written with pytest. The pytest module has to be installed.

python -m pytest

Usage

Import the constructor from the hashids module:

from hashids import Hashids
hashids = Hashids()

Basic Usage

Encode a single integer:

hashid = hashids.encode(123) # 'Mj3'

Decode a hash:

ints = hashids.decode('xoz') # (456,)

To encode several integers, pass them all at once:

hashid = hashids.encode(123, 456, 789) # 'El3fkRIo3'

Decoding is done the same way:

ints = hashids.decode('1B8UvJfXm') # (517, 729, 185)

Using A Custom Salt

Hashids supports salting hashes by accepting a salt value. If you don’t want others to decode your hashes, provide a unique string to the constructor.

hashids = Hashids(salt='this is my salt 1')
hashid = hashids.encode(123) # 'nVB'

The generated hash changes whenever the salt is changed:

hashids = Hashids(salt='this is my salt 2')
hashid = hashids.encode(123) # 'ojK'

A salt string between 6 and 32 characters provides decent randomization.

Controlling Hash Length

By default, hashes are going to be the shortest possible. One reason you might want to increase the hash length is to obfuscate how large the integer behind the hash is.

This is done by passing the minimum hash length to the constructor. Hashes are padded with extra characters to make them seem longer.

hashids = Hashids(min_length=16)
hashid = hashids.encode(1) # '4q2VolejRejNmGQB'

Using A Custom Alphabet

It’s possible to set a custom alphabet for your hashes. The default alphabet is 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890'.

To have only lowercase letters in your hashes, pass in the following custom alphabet:

hashids = Hashids(alphabet='abcdefghijklmnopqrstuvwxyz')
hashid = hashids.encode(123456789) # 'kekmyzyk'

A custom alphabet must contain at least 16 characters.

Randomness

The primary purpose of hashids is to obfuscate ids. It's not meant or tested to be used for security purposes or compression. Having said that, this algorithm does try to make these hashes unguessable and unpredictable:

Repeating numbers

There are no repeating patterns that might show that there are 4 identical numbers in the hash:

hashids = Hashids("this is my salt")
hashids.encode(5, 5, 5, 5) # '1Wc8cwcE'

The same is valid for incremented numbers:

hashids.encode(1, 2, 3, 4, 5, 6, 7, 8, 9, 10) # 'kRHnurhptKcjIDTWC3sx'

hashids.encode(1) # 'NV'
hashids.encode(2) # '6m'
hashids.encode(3) # 'yD'
hashids.encode(4) # '2l'
hashids.encode(5) # 'rD'

Curses! #$%@

This code was written with the intent of placing generated hashes in visible places – like the URL. Which makes it unfortunate if generated hashes accidentally formed a bad word.

Therefore, the algorithm tries to avoid generating most common English curse words by never placing the following letters next to each other: c, C, s, S, f, F, h, H, u, U, i, I, t, T.

License

MIT license, see the LICENSE file. You can use hashids in open source projects and commercial products.

You might also like...
PyMultiDictionary is a Dictionary Module for Python 3+ to get meanings, translations, synonyms and antonyms of words in 20 different languages

PyMultiDictionary PyMultiDictionary is a Dictionary Module for Python 3+ to get meanings, translations, synonyms and antonyms of words in 20 different

A Python package to facilitate research on building and evaluating automated scoring models.
A Python package to facilitate research on building and evaluating automated scoring models.

Rater Scoring Modeling Tool Introduction Automated scoring of written and spoken test responses is a growing field in educational natural language pro

Deasciify-highlighted - A Python script for deasciifying text to Turkish and copying clipboard

deasciify-highlighted is a Python script for deasciifying text to Turkish and copying clipboard.

🚩 A simple and clean python banner generator - Banners

🚩 A simple and clean python banner generator - Banners

 A python Tk GUI that creates, writes text and attaches images into a custom spreadsheet file
A python Tk GUI that creates, writes text and attaches images into a custom spreadsheet file

A python Tk GUI that creates, writes text and attaches images into a custom spreadsheet file

Fixes mojibake and other glitches in Unicode text, after the fact.

ftfy: fixes text for you print(fix_encoding("(ง'⌣')ง")) (ง'⌣')ง Full documentation: https://ftfy.readthedocs.org Testimonials “My life is li

A generator library for concise, unambiguous and URL-safe UUIDs.

Description shortuuid is a simple python library that generates concise, unambiguous, URL-safe UUIDs. Often, one needs to use non-sequential IDs in pl

Format Covid values to ASCII-Table (Only for Germany and Austria)

Covid-19-Formatter (Only for Germany and Austria) Dieses Script speichert die gemeldeten Daten des RKIs / BMSGPK und formatiert diese zu einer Asci Ta

Text to ASCII and ASCII to text

Text2ASCII Description This python script (converter.py) contains two functions: encode() is used to return a list of Integer, one item per character

Comments
  • Two unique IDs convert back to the same integer

    Two unique IDs convert back to the same integer

    Hi there -- I've found a case were two unique ids decode to the same integer.

    As far as I can tell this won't cause any problems for me, I'd just like to understand why this is possible and if it's expected. I've tested the opposite direction (running through a integers to a very high value to ensure there are no conflicts when encoding)

    Unfortunately, not sure how to reproduce it here without posting what the integers convert to on my side and the salt I use.

    opened by timworx 1
  • Salt Only uses first 43 Characters

    Salt Only uses first 43 Characters

    I'm seeing that the salt is limited in usable length (contrary to popular assumptions that you should use a "long random string"). For example, here's a session:

    >>> from hashids import Hashids
    >>> Hashids('12345678901234567890123456789012345678901234').encode(1)
    'WJ'
    >>> Hashids('1234567890123456789012345678901234567890123').encode(1)
    'WJ'
    >>> Hashids('123456789012345678901234567890123456789012').encode(1)
    'QN'
    

    It doesn't seem to matter what the contents of the salt are, it's always 43 characters.

    I can't immediately see the cause of this - it may be something to do with the length of the alphabet (62) minus the length of the separators (14) and something else. It doesn't seem to be dependent on the length of the number encoded (I tried 8, 16,32,64 and 128 bit numbers).

    I'm not sure if this is a bug, an undocumented feature or my (mis)understanding, but thought it worth raising as consumers of this library do indeed recommend "a long and secure salt value...". If it is an undocumented feature, some explanation of why 43 characters would probably be helpful.

    (edit: By chance, this also happens to be issue #43 :-) )

    opened by coofercat 1
  • What is the status of this project?

    What is the status of this project?

    The last stable release is nearing three years old, and while that does not mean this project is dead, I have to wonder if the library is still reasonably on par with the JS implementation. As it is, the library is good enough for my use case, but it might not be for others, so it might be a good idea to make some updates and to prepare a new release if necessary.

    I currently cannot help out much in that front, but I am willing to give some time to update this library's test suite. Adding property-based tests using hypothesis will be a huge improvement in my opinion. Thoughts?

    opened by malefice 6
  • How to use numbers only in alphabet

    How to use numbers only in alphabet

    I use

    hashids = Hashids(salt="hello", min_length=6, alphabet='0123456789')

    But "Alphabet must contain at least 16 unique characters." execption raised.

    Why Alphabet must contain at least 16?

    opened by hcaihao 9
Owner
David Aurelio
David Aurelio
Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage.

TextDistance TextDistance -- python library for comparing distance between two or more sequences by many algorithms. Features: 30+ algorithms Pure pyt

Life4 3k Oct 1, 2022
Markup is an online annotation tool that can be used to transform unstructured documents into structured formats for NLP and ML tasks, such as named-entity recognition. Markup learns as you annotate in order to predict and suggest complex annotations. Markup also provides integrated access to existing and custom ontologies, enabling the prediction and suggestion of ontology mappings based on the text you're annotating.

Markup is an online annotation tool that can be used to transform unstructured documents into structured formats for NLP and ML tasks, such as named-entity recognition. Markup learns as you annotate in order to predict and suggest complex annotations. Markup also provides integrated access to existing and custom ontologies, enabling the prediction and suggestion of ontology mappings based on the text you're annotating.

Samuel Dobbie 133 Sep 5, 2022
🐸 Identify anything. pyWhat easily lets you identify emails, IP addresses, and more. Feed it a .pcap file or some text and it'll tell you what it is! 🧙‍♀️

?? Identify anything. pyWhat easily lets you identify emails, IP addresses, and more. Feed it a .pcap file or some text and it'll tell you what it is! ??‍♀️

Brandon 5.4k Sep 27, 2022
You can encode and decode base85, ascii85, base64, base32, and base16 with this tool.

You can encode and decode base85, ascii85, base64, base32, and base16 with this tool.

null 8 Aug 2, 2022
StealBit1.1 and earlier strings and config extraction scripts

StealBit1.1 and earlier scripts Use strings_decryptor.py to extract RC4 encrypted strings from a StealBit1.1 sample(s). Use config_extractor.py to ext

Soolidsnake 4 Apr 17, 2022
The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity

Contents Maintainer wanted Introduction Installation Documentation License History Source code Authors Maintainer wanted I am looking for a new mainta

Antti Haapala 1.2k Sep 20, 2022
A Python library that provides an easy way to identify devices like mobile phones, tablets and their capabilities by parsing (browser) user agent strings.

Python User Agents user_agents is a Python library that provides an easy way to identify/detect devices like mobile phones, tablets and their capabili

Selwin Ong 1.3k Sep 21, 2022
Etranslate is a free and unlimited python library for transiting your texts

Etranslate is a free and unlimited python library for transiting your texts

Abolfazl Khalili 16 Sep 13, 2022
a python package that lets you add custom colors and text formatting to your scripts in a very easy way!

colormate Python script text formatting package What is colormate? colormate is a python library that lets you add text formatting to your scripts, it

Rodrigo 4 Oct 7, 2021
Build a translation program similar to Google Translate with Python programming language and QT library

google-translate Build a translation program similar to Google Translate with Python programming language and QT library Different parts of the progra

Amir Hussein Sharifnezhad 3 Oct 9, 2021