cysimdjson
A fast JSON parsing library for Python, 7-12 times faster than the standard Python JSON parser.
It provides Python bindings for simdjson, built with Cython.
The standard Python JSON parser (json.load() etc.) is relatively slow; if you need to parse large JSON files or a large number of small JSON files, it can become a significant bottleneck.
Whilst there are other fast Python JSON parsers, such as pysimdjson, libpy_simdjson or orjson, they don't reach the raw speed provided by the brilliant SIMDJSON project. SIMDJSON is a C++ JSON parser based on SIMD instructions, reportedly the fastest JSON parser on the planet.
Usage
import cysimdjson
json_bytes = b'''
{
"foo": [1,2,[3]]
}
'''
parser = cysimdjson.JSONParser()
json_parsed = parser.parse(json_bytes)
# Standard Python access
print(json_parsed['foo'][2][0])
# Access using JSON Pointer (faster)
print(json_parsed.at_pointer("/foo/2/0"))
Note: the parser object can be reused for maximum performance.
json_parsed is a read-only, dictionary-like object that provides access to the JSON data.
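To make the pointer string "/foo/2/0" from the example above easier to read, here is a minimal pure-Python sketch of how a JSON Pointer (RFC 6901) walks a parsed document. It uses the standard json module only and is an illustration of the pointer semantics; resolve_pointer is a hypothetical helper, not part of cysimdjson and not how cysimdjson resolves pointers internally.

```python
import json


def resolve_pointer(document, pointer):
    """Walk a parsed JSON document along an RFC 6901 JSON Pointer.

    Hypothetical helper for illustration only; not a cysimdjson API.
    """
    if pointer == "":
        return document  # the empty pointer refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError("a non-empty JSON Pointer must start with '/'")
    current = document
    for token in pointer[1:].split("/"):
        # Unescape in this order: "~1" -> "/", then "~0" -> "~"
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            current = current[int(token)]  # array step: token is an index
        else:
            current = current[token]  # object step: token is a key
    return current


doc = json.loads(b'{"foo": [1, 2, [3]]}')
value = resolve_pointer(doc, "/foo/2/0")
print(value)
```

Each "/"-separated token is one step: "foo" selects the key, "2" selects the third array element, and "0" selects the first element of the nested array, which is the same value that json_parsed['foo'][2][0] returns.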
Installation
pip3 install cython
pip3 install git+https://github.com/TeskaLabs/cysimdjson.git
Note: cysimdjson will be released to PyPI shortly.
Performance
----------------------------------------------------------------
# 'jsonexamples/test.json' 2397 bytes
----------------------------------------------------------------
* cysimdjson parse 510291.81 EPS ( 1.00) 1223.17 MB/s
* libpy_simdjson loads 374615.54 EPS ( 1.36) 897.95 MB/s
* pysimdjson parse 362195.46 EPS ( 1.41) 868.18 MB/s
* orjson loads 110615.70 EPS ( 4.61) 265.15 MB/s
* python json loads 72096.80 EPS ( 7.08) 172.82 MB/s
----------------------------------------------------------------
SIMDJSON: 543335.93 EPS, 1241.52 MB/s
----------------------------------------------------------------
# 'jsonexamples/twitter.json' 631515 bytes
----------------------------------------------------------------
* cysimdjson parse 2556.10 EPS ( 1.00) 1614.22 MB/s
* libpy_simdjson loads 2444.53 EPS ( 1.05) 1543.76 MB/s
* pysimdjson parse 2415.46 EPS ( 1.06) 1525.40 MB/s
* orjson loads 387.11 EPS ( 6.60) 244.47 MB/s
* python json loads 278.63 EPS ( 9.17) 175.96 MB/s
----------------------------------------------------------------
SIMDJSON: 2536.16 EPS, 1527.28 MB/s
----------------------------------------------------------------
# 'jsonexamples/canada.json' 2251051 bytes
----------------------------------------------------------------
* cysimdjson parse 284.67 EPS ( 1.00) 640.81 MB/s
* pysimdjson parse 284.62 EPS ( 1.00) 640.70 MB/s
* libpy_simdjson loads 277.13 EPS ( 1.03) 623.84 MB/s
* orjson loads 81.80 EPS ( 3.48) 184.13 MB/s
* python json loads 22.52 EPS ( 12.64) 50.68 MB/s
----------------------------------------------------------------
SIMDJSON: 307.95 EPS, 661.08 MB/s
----------------------------------------------------------------
# 'jsonexamples/gsoc-2018.json' 3327831 bytes
----------------------------------------------------------------
* cysimdjson parse 775.61 EPS ( 1.00) 2581.09 MB/s
* pysimdjson parse 743.67 EPS ( 1.04) 2474.81 MB/s
* libpy_simdjson loads 654.15 EPS ( 1.19) 2176.88 MB/s
* orjson loads 166.67 EPS ( 4.65) 554.66 MB/s
* python json loads 113.72 EPS ( 6.82) 378.43 MB/s
----------------------------------------------------------------
SIMDJSON: 703.59 EPS, 2232.92 MB/s
----------------------------------------------------------------
# 'jsonexamples/verysmall.json' 7 bytes
----------------------------------------------------------------
* cysimdjson parse 3972376.53 EPS ( 1.00) 27.81 MB/s
* orjson loads 3637369.63 EPS ( 1.09) 25.46 MB/s
* libpy_simdjson loads 1774211.19 EPS ( 2.24) 12.42 MB/s
* pysimdjson parse 977530.90 EPS ( 4.06) 6.84 MB/s
* python json loads 527932.65 EPS ( 7.52) 3.70 MB/s
----------------------------------------------------------------
SIMDJSON: 3799392.10 EPS
CPU: AMD EPYC 7452
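The columns in the tables above are not defined explicitly, so the following is a sketch under stated assumptions: EPS is taken to mean parse operations per second, the number in parentheses is the fastest EPS divided by the row's EPS, and MB/s is EPS multiplied by the file size with 1 MB = 10^6 bytes. These assumptions reproduce the figures in the Python library rows. The function names are hypothetical and are not taken from test/test_benchmark.py; the timing loop uses the standard json module as a stand-in parser.

```python
import json
import time


def measure_eps(parse, payload, iterations=1000):
    """Return parse calls per second for `parse` run repeatedly on `payload`."""
    start = time.perf_counter()
    for _ in range(iterations):
        parse(payload)
    elapsed = time.perf_counter() - start
    return iterations / elapsed


def mb_per_second(eps, size_bytes):
    """Throughput in MB/s, assuming 1 MB = 10**6 bytes."""
    return eps * size_bytes / 1e6


def slowdown(fastest_eps, eps):
    """How many times slower a row is than the fastest row."""
    return fastest_eps / eps


# Figures copied from the 'jsonexamples/test.json' table (2397 bytes)
table_mb_s = round(mb_per_second(510291.81, 2397), 2)
table_ratio = round(slowdown(510291.81, 72096.80), 2)
print(table_mb_s)   # 1223.17, matching the cysimdjson row
print(table_ratio)  # 7.08, matching the python json row

# Timing the standard library parser on a small payload
payload = b'{"foo": [1, 2, [3]]}'
eps = measure_eps(json.loads, payload)
print(f"{eps:.2f} EPS, {mb_per_second(eps, len(payload)):.2f} MB/s")
```

The last line prints machine-dependent numbers, so it will not match the tables; only the two derived figures are expected to agree with them.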
More performance testing:
- Apple M1: > 1M EPS, > 3GB/s
The tests are reproducible:
pip3 install orjson
pip3 install pysimdjson
pip3 install libpy_simdjson
python3 setup.py build_ext --inplace
PYTHONPATH=. python3 ./test/test_benchmark.py
Manual build
python3 setup.py build_ext --inplace