Ultra fast JSON decoder and encoder written in C with Python bindings

Overview

UltraJSON

UltraJSON is an ultra fast JSON encoder and decoder written in pure C with bindings for Python 3.6+.

Install with pip:

$ python -m pip install ujson

Usage

May be used as a drop-in replacement for most other JSON parsers for Python:

>>> import ujson
>>> ujson.dumps([{"key": "value"}, 81, True])
'[{"key":"value"},81,true]'
>>> ujson.loads("""[{"key": "value"}, 81, true]""")
[{'key': 'value'}, 81, True]
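Because the common calls mirror the standard library's json module, a guarded import is a common pattern (a minimal sketch; the fallback keeps code working when ujson is not installed):

```python
# Prefer ujson when available, fall back to the standard library.
# Note: the two libraries differ in some output details (e.g. ujson escapes
# forward slashes and omits spaces), so compare parsed values, not strings.
try:
    import ujson as json  # fast C implementation
except ImportError:
    import json  # stdlib fallback

data = [{"key": "value"}, 81, True]
text = json.dumps(data)
assert json.loads(text) == data
```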

Encoder options

encode_html_chars

Used to enable special encoding of "unsafe" HTML characters into safer Unicode sequences. Default is False:

>>> ujson.dumps("<script>John&Doe", encode_html_chars=True)
'"\\u003cscript\\u003eJohn\\u0026Doe"'

ensure_ascii

Limits output to ASCII and escapes all extended characters above 127. Default is True. If your end format supports UTF-8, setting this option to False is highly recommended to save space:

>>> ujson.dumps("åäö")
'"\\u00e5\\u00e4\\u00f6"'
>>> ujson.dumps("åäö", ensure_ascii=False)
'"åäö"'

escape_forward_slashes

Controls whether forward slashes (/) are escaped. Default is True:

>>> ujson.dumps("http://esn.me")
'"http:\\/\\/esn.me"'
>>> ujson.dumps("http://esn.me", escape_forward_slashes=False)
'"http://esn.me"'

indent

Controls whether indentation ("pretty output") is enabled. Default is 0 (disabled):

>>> ujson.dumps({"foo": "bar"})
'{"foo":"bar"}'
>>> print(ujson.dumps({"foo": "bar"}, indent=4))
{
    "foo":"bar"
}

Benchmarks

UltraJSON calls/sec compared to other popular JSON parsers.

Test machine:

Linux 5.0.0-1032-azure x86_64 #34-Ubuntu SMP Mon Feb 10 19:37:25 UTC 2020

Versions:

  • CPython 3.8.2 (default, Feb 28 2020, 14:28:43) [GCC 7.4.0]
  • nujson : 1.35.2
  • orjson : 2.6.1
  • simplejson: 3.17.0
  • ujson : 2.0.2
                                 ujson    nujson    orjson  simplejson      json
Array with 256 doubles
  encode                        22,082     4,282    76,975       5,328     5,436
  decode                        24,127    34,349    29,059      14,174    13,822
Array with 256 UTF-8 strings
  encode                         3,557     2,528    24,300       3,061     2,068
  decode                         2,030     2,490       931         406       358
Array with 256 strings
  encode                        39,041    31,769    76,403      16,615    16,910
  decode                        25,185    24,287    34,437      32,388    27,999
Medium complex object
  encode                        10,382    11,427    32,995       3,959     5,275
  decode                         9,785     9,796    11,515       5,898     7,200
Array with 256 True values
  encode                       114,341   101,039   344,256      62,382    72,872
  decode                       149,367   151,615   181,123     114,597   130,392
Array with 256 dict{string, int} pairs
  encode                        13,715    14,420    51,942       3,271     6,584
  decode                        12,670    11,788    12,176       6,743     8,278
Dict with 256 arrays with 256 dict{string, int} pairs
  encode                            50        54       216          10        23
  decode                            32        32        30          20        23
Dict with 256 arrays with 256 dict{string, int} pairs, outputting sorted keys
  encode                            46        41       n/a           8        24
Complex object
  encode                           533       582       n/a         408       431
  decode                           466       454       n/a         154       164
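A minimal way to reproduce one of these rows yourself (a sketch using the stdlib timeit module; the numbers depend entirely on your machine, and the fallback import keeps it runnable without ujson installed):

```python
import timeit

try:
    import ujson as json_impl
except ImportError:
    import json as json_impl  # stdlib stand-in when ujson is absent

doubles = [1234.5678] * 256  # mirrors the "Array with 256 doubles" test

n = 1000
seconds = timeit.timeit(lambda: json_impl.dumps(doubles), number=n)
print(f"{n / seconds:,.0f} calls/sec")
```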
Comments
  • Add wheels to PyPI

    First, thanks for building and maintaining ujson. We at spaCy use it pretty much everywhere we deal with JSON.

    While many users don't mind installing from source, now that wheels give us a solid binary distribution format, it would be great if you could adopt them and provide wheels for OS X and Windows on PyPI.

    Setting up a build environment on Windows is particularly tricky as there is no simple guide you can follow that applies to all Python versions. What makes matters worse is that pip's error on a missing msvc compiler is unnecessarily cryptic and people are easily confused, see https://github.com/spacy-io/spaCy/issues/334

    The Python community is already pretty far along in adopting wheels (see http://pythonwheels.com); unfortunately ujson still belongs in the not-supporting camp. Since ujson is a popular Python package, I urge you to adopt wheels.

    If you provided wheels, spaCy could be installed without a build system on OS X and Windows, as ujson is the only missing non-wheel dependency. I guess many other projects would benefit from it as well. If you think that's not possible, we will probably have to switch to a slower alternative and/or provide a graceful fallback, though that's not my preference.

    If you need help setting up an automatic build system I can lend a hand. I would also be happy to provide ours (https://ci.spacy.io) to you.

    release help wanted 
    opened by henningpeters 53
  • Is this project still maintained?

    Hello, I am a core dev for spyder-ide/spyder and we would like to update this package to distribute wheels.

    Do you think this is possible @cgbystrom, @hrosenhorn, @Jahaja, @Jhonte, @jskorpan, @mhansson, @mkristo, @msjolund, @mthurlin, @ogabrielson, @oskarblom, @ronniekk, @T0bs, @tjogin?

    Thanks!

    release 
    opened by goanpeca 46
  • Float decoding problem

    I am seeing a floating point problem that I do not see in json or cjson. The first two tests below pass while the one using ujson fails. I am running Ubuntu 12.04 if that helps.

    import unittest
    import json
    import cjson
    import ujson
    
    
    class TestUJsonFloat(unittest.TestCase):
        def test_json(self):
            sut = {u'a': 4.56}
            encoded = json.dumps(sut)
            decoded = json.loads(encoded)
            self.assertEqual(sut, decoded)
    
        def test_cjson(self):
            sut = {u'a': 4.56}
            encoded = cjson.encode(sut)
            decoded = cjson.decode(encoded)
            self.assertEqual(sut, decoded)
    
        def test_ujson(self):
            sut = {u'a': 4.56}
            encoded = ujson.encode(sut)
            decoded = ujson.decode(encoded)
            self.assertEqual(sut, decoded)
    
    opened by gmnash 28
  • Fix memory leak on encoding errors when the buffer was resized

    JSON_EncodeObject returns NULL when an error occurs, but without freeing the buffer. This leads to a memory leak when the buffer is internally allocated (because the caller's buffer was insufficient or none was provided at all) and any error occurs. Similarly, objToJSON did not clean up the buffer in all error conditions either.

    This adds the missing buffer free in JSON_EncodeObject (iff the buffer was allocated internally) and refactors the error handling in objToJSON slightly to also free the buffer when a Python exception occurred without the encoder's errorMsg being set.


    I haven't added a test for this so far, and I'm not sure how to do it. The stdlib only has resource.getrusage for this (as far as I could see), and on Linux, this only provides the max RSS (ixrss, idrss, and isrss are always zero per the man page). Since the test suite constantly allocates and frees memory, including some huge objects, that makes it kind of messy (subprocess?). Let me know what you think about that.

    A quick manual test using a bytes object to trigger an exception:

    import resource
    import ujson

    ujson.dumps(['a' * 65536, ''])
    print(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)
    for i in range(1000):
        try:
            ujson.dumps(['a' * 65536, b''])
        except Exception:
            pass
    print(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)
    

    With main: 10224 → 78068 (kB). With this branch: 10152 → 10152. (The exact numbers vary slightly, so that difference of 72 kB is meaningless.)

    Note that the initial ujson.dumps causes an increase in memory usage (by ~350 kB on my machine). I haven't looked into that at all, but I assume it's static variables etc.?

    changelog: Fixed 
    opened by JustAnotherArchivist 27
  • Fix unchecked buffer overflows (CVE-2021-45958).

    Fixes #334, fixes #501, fixes #502, fixes #503.

    #402 also passes OK with this change, but it worked before it too. I could only reproduce the issue on git checkout 2.0.2, so I vote that we close that one too.

    Changes proposed in this pull request:

    • Add some extra memory resizes where needed.
    • Add some tests that used to cause memory issues but no longer do.
    • Add a DEBUG compile mode to make memory issues easier to find and fix.
    • Include said DEBUG mode in CI/CD testing.
    changelog: Fixed 
    opened by bwoodsend 25
  • Support for OrderedDict?

    Is there any way to support ordered dicts? Is it even possible to do efficiently from C? OrderedDict is a PyObject that overrides iteration, so without slow Python calls it would probably be impossible unless the iteration logic were moved into C (detect the object type and act accordingly).

    It seems that no C JSON encoder supports it, and this is probably the reason. Is that right?
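On Python 3.7+, plain dicts preserve insertion order, so an encoder that simply iterates a dict (as the C encoders do) emits keys in insertion order; converting the OrderedDict to a plain dict first is one workaround. A sketch using the stdlib json module (the assumption that ujson iterates dicts the same way is untested here):

```python
import json  # stdlib shown here; a C encoder iterating dicts behaves alike
from collections import OrderedDict

od = OrderedDict([("z", 1), ("a", 2), ("m", 3)])

# Plain dicts preserve insertion order on Python 3.7+, so the key order
# survives a round-trip through dict() before encoding.
encoded = json.dumps(dict(od))
assert list(json.loads(encoded)) == ["z", "a", "m"]
```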

    opened by 23doors 23
  • ujson long-term commitment

    Is ujson still maintained? The last release on PyPI is over 2 years old. There seems to be a backlog of issues and PRs, and we're slightly concerned. Currently, we (https://github.com/crossbario/crossbar) have ujson as an optional dependency to accelerate on CPython, but we are seeing users affected (https://github.com/esnme/ultrajson/issues/184).

    opened by oberstet 22
  • Build Windows and MacOS

    For #219

    Changes proposed in this pull request:

    New GitHub Action to build Windows and macOS wheels for Python 3.5 - 3.8.

    It works without the PyPI pushes - I can get all the wheels as build artifacts. I can't actually test the twine stuff as I don't have permission - even for TestPyPI.

    HTTPError: 403 Client Error: The user '***' isn't allowed to upload to project 'ujson'. See https://test.pypi.org/help/#project-name for more information. for url: https://test.pypi.org/legacy/

    I tried nicking the PyPI upload section from the deploy.yml so you wouldn't have to change anything, but it turns out that the pypa/gh-action-pypi-publish@master action it uses runs only on Linux.

    The secret names for twine are pypi_username and pypi_password for PyPI, and TWINE_USERNAME and TWINE_PASSWORD for TestPyPI, but obviously you can change those.

    Let me know if I can help further.

    release Windows changelog: Added 
    opened by bwoodsend 19
  • UltraJSON 2.0.0 release checklist

    Seeing as it's been over 4 years since the last release, and because support for some EOL Python versions has been dropped in that time, I propose the next release should bump to version 2.0.0. Python 2.7 support should be retained for this one.

    Some things that come to mind to make a new release:

    • [x] Move https://github.com/esnme/ultrajson to https://github.com/ultrajson/ultrajson

    • [x] Review and merge open PRs

      • Reviews welcome from everyone, please comment those which are most important
    • [x] Get access to https://pypi.org/project/ujson/

      • I've emailed previous maintainer Jonas to ask him to add @cgbystrom, @rstms and me to PyPI (same usernames)
    • [x] Get access to https://test.pypi.org/project/ujson/

      • @segfault I see you've been using this for experimenting, would you mind adding @cgbystrom, @rstms and @hugovk too?
      • https://test.pypi.org/manage/project/ujson/collaboration/
    • [x] Set up Travis CI to deploy to TestPyPI on merge to master (to test the release machinery) and to PyPI on tags

      • I can set this up, it depends on previous two items
    • [x] Anything else?

    release 
    opened by hugovk 18
  • ujson import error on alpine 3.9

    After installing ujson in alpine 3.9, Python raises an ImportError when attempting to import the package:

    ImportError: Error relocating /usr/lib/python2.7/site-packages/ujson.so: strreverse: symbol not found
    

    Possibly related to #180 as the symptoms are similar.

    Full terminal output showing steps to reproduce:

    ~ $ docker pull alpine
    Using default tag: latest
    latest: Pulling from library/alpine
    Digest: sha256:b3dbf31b77fd99d9c08f780ce6f5282aba076d70a513a8be859d8d3a4d0c92b8
    Status: Image is up to date for alpine:latest
    ~ $ docker run --rm -it alpine sh
    / # apk add py-pip python python-dev gcc musl-dev
    fetch http://dl-cdn.alpinelinux.org/alpine/v3.9/main/x86_64/APKINDEX.tar.gz
    fetch http://dl-cdn.alpinelinux.org/alpine/v3.9/community/x86_64/APKINDEX.tar.gz
    (1/25) Installing binutils (2.31.1-r2)
    (2/25) Installing gmp (6.1.2-r1)
    (3/25) Installing isl (0.18-r0)
    (4/25) Installing libgomp (8.2.0-r2)
    (5/25) Installing libatomic (8.2.0-r2)
    (6/25) Installing libgcc (8.2.0-r2)
    (7/25) Installing mpfr3 (3.1.5-r1)
    (8/25) Installing mpc1 (1.0.3-r1)
    (9/25) Installing libstdc++ (8.2.0-r2)
    (10/25) Installing gcc (8.2.0-r2)
    (11/25) Installing musl-dev (1.1.20-r3)
    (12/25) Installing libbz2 (1.0.6-r6)
    (13/25) Installing expat (2.2.6-r0)
    (14/25) Installing libffi (3.2.1-r6)
    (15/25) Installing gdbm (1.13-r1)
    (16/25) Installing ncurses-terminfo-base (6.1_p20190105-r0)
    (17/25) Installing ncurses-terminfo (6.1_p20190105-r0)
    (18/25) Installing ncurses-libs (6.1_p20190105-r0)
    (19/25) Installing readline (7.0.003-r1)
    (20/25) Installing sqlite-libs (3.26.0-r3)
    (21/25) Installing python2 (2.7.15-r3)
    (22/25) Installing py-setuptools (40.6.3-r0)
    (23/25) Installing py2-pip (18.1-r0)
    (24/25) Installing pkgconf (1.6.0-r0)
    (25/25) Installing python2-dev (2.7.15-r3)
    Executing busybox-1.29.3-r10.trigger
    OK: 171 MiB in 39 packages
    / # pip install -U ujson
    Collecting ujson
      Downloading https://files.pythonhosted.org/packages/16/c4/79f3409bc710559015464e5f49b9879430d8f87498ecdc335899732e5377/ujson-1.35.tar.gz (192kB)
        100% |████████████████████████████████| 194kB 10.3MB/s 
    Installing collected packages: ujson
      Running setup.py install for ujson ... done
    Successfully installed ujson-1.35
    / # python -c 'import ujson'
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
    ImportError: Error relocating /usr/lib/python2.7/site-packages/ujson.so: strreverse: symbol not found
    
    opened by ngaya-ll 18
  • If an object has a __json__ method, use it when encoding

    It should return a raw JSON string which will be directly included in the resulting JSON when encoding.

    Similar to #130, but it allows an object to return an already encoded string, which is then included as-is in the output. This allows one to embed existing JSON strings inside a larger structure without needing to deserialize and serialize them again.
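The proposed hook can be sketched in pure Python (hypothetical names; the helper mimics what the PR asks the C encoder to do: call __json__ and splice its return value into the output verbatim):

```python
import json

class Prerendered:
    """Hypothetical wrapper holding an already-encoded JSON string."""
    def __init__(self, payload: str):
        self.payload = payload

    def __json__(self) -> str:
        # Returned verbatim; the caller must ensure it is valid JSON.
        return self.payload

def dumps_with_hook(obj) -> str:
    """Minimal encoder sketch honouring a __json__ method (not ujson's code)."""
    hook = getattr(obj, "__json__", None)
    if callable(hook):
        return hook()
    if isinstance(obj, dict):
        items = (f"{json.dumps(k)}:{dumps_with_hook(v)}" for k, v in obj.items())
        return "{" + ",".join(items) + "}"
    if isinstance(obj, (list, tuple)):
        return "[" + ",".join(dumps_with_hook(v) for v in obj) + "]"
    return json.dumps(obj)

# The embedded JSON is spliced in without being re-parsed or re-encoded.
print(dumps_with_hook({"outer": Prerendered('{"inner":1}')}))
# prints {"outer":{"inner":1}}
```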

    opened by mitar 16
  • Supporting builds with Python's Limited API

    Following a bit of discussion in #572, I'm filing a separate issue for a decision on whether or not this is something we want to worry about in the future.

    The Limited API is a subset of the Python C API which is guaranteed not to change in future minor versions. This allows building a single 'abi3' wheel that will work on any Python version 3.x, past, present, or future, higher than some minimum version (depending on which APIs are used).

    Our current code makes use of an #ifndef Py_LIMITED_API switch in the PyUnicode-to-char* conversion to take advantage of an optimisation for certain strings which is not part of the Limited API. This originates from #417. However, ujson currently does not compile with the Limited API; this line in the decoder, introduced in #555, uses a function that is not part of the Limited API as of 3.11. I haven't tested further or with earlier Python versions. #573 uses a function that was only added to the Limited API in 3.11, so it wouldn't build on 3.7 through 3.10.

    The main advantage of using the Limited API (that I'm aware of) is the aforementioned single compilation for all 3.x versions. As a side effect, an abi3 wheel would also allow users to still easily install an old ujson version on a newer Python in the future without having to compile it themselves.

    The disadvantage is that #573 would be impossible and #555 would have to be reworked (which seems tricky, if it's even possible at all). More such cases may be present or arise in the future, and we'd be limiting ourselves in what we could add or significantly increasing the maintenance burden of constantly implementing workarounds for such builds. As with the example above, it will also have performance impacts.


    My own opinion is that the advantages are negligible and the downsides are very significant. Building separate wheels for each supported Python version is trivial with the fully automated release process, as @bwoodsend mentioned in #572, and adding support for a new Python version (assuming no backward incompatibilities) merely requires adding the version number in a couple places. If someone wants to run version combinations of ujson and Python that haven't been tested and aren't officially supported, they should have to jump through hoops before they shoot themselves in the foot. In short, while the Limited API is neat, it's too limited in some cases at this time and not remotely worth the effort.

    opened by JustAnotherArchivist 7
  • Please consider including other licenses mentioned in `LICENSE.txt`

    What did you do?

    Looked at LICENSE.txt.

    What did you expect to happen?

    All required license text is available.

    What actually happened?

    The copyright statement and license text for the main BSD-3-Clause license are present, but the file mentions that code is included or derived from libraries with different licenses.

    Portions of code from MODP_ASCII - Ascii transformations (upper/lower, etc)
    https://github.com/client9/stringencoders
    Copyright (c) 2007  Nick Galbreath -- nickg [at] modp [dot] com. All rights reserved.
    

    ~~Checking https://github.com/client9/stringencoders, the MIT license appears to apply.~~ (See my follow-up comment; this code appears to be used under a BSD-3-Clause license.)

    Numeric decoder derived from from TCL library
    https://opensource.apple.com/source/tcl/tcl-14/tcl/license.terms
     * Copyright (c) 1988-1993 The Regents of the University of California.
     * Copyright (c) 1994 Sun Microsystems, Inc.
    

    Checking https://opensource.apple.com/source/tcl/tcl-14/tcl/license.terms, the TCL license appears to apply.

    Both of these licenses require the copyright notice and the license text to be distributed in all copies. Please add the applicable full license text, either in separate files or in LICENSE.txt.

    What versions are you using?

    • OS: N/A
    • Python: N/A
    • UltraJSON: 5.5.0

    Please include code that reproduces the issue.

    Not applicable.

    documentation 
    opened by musicinmybrain 2
  • Unexpected keyword argument "reject_bytes" for "dumps"

    What did you do?

    • added this function in my source:
    def json_dumper(json_object: Any) -> str:
        return ujson.dumps(json_object, ensure_ascii=False, reject_bytes=False)
    
    • run mypy

    What did you expect to happen?

    mypy to not complain about the new function

    What actually happened?

    mypy returned the following error: Unexpected keyword argument "reject_bytes" for "dumps"

    What versions are you using?

    • OS: 21.6.0 Darwin Kernel Version 21.6.0: Sat Jun 18 17:07:25 PDT 2022; root:xnu-8020.140.41~1/RELEASE_X86_64 x86_64
    • Python: Python 3.9.13
    • UltraJSON: 5.4.0

    Please include code that reproduces the issue.

    The best reproductions are self-contained scripts with minimal dependencies.

    def json_dumper(json_object: Any) -> str:
        return ujson.dumps(json_object, ensure_ascii=False, reject_bytes=False)
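
A workaround that sidesteps the encoder flag entirely is to decode bytes before serializing (a hedged sketch, independent of ujson; the helper name is made up):

```python
from typing import Any

def decode_bytes(obj: Any) -> Any:
    """Recursively replace bytes with str so any JSON encoder accepts it.
    Hypothetical helper, not part of ujson."""
    if isinstance(obj, bytes):
        return obj.decode("utf-8")
    if isinstance(obj, dict):
        return {decode_bytes(k): decode_bytes(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [decode_bytes(v) for v in obj]
    return obj

print(decode_bytes({"raw": b"payload", "items": [b"a", 1]}))
# prints {'raw': 'payload', 'items': ['a', 1]}
```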
    
    opened by aspacca 4
  • UltraJSON accepts invalid json

    What did you do?

    Tried to load this invalid JSON: {"a": -}

    What did you expect to happen?

    I expected some exception to be raised

    What actually happened?

    I got {'a': 0} result

    What versions are you using?

    • OS: Arch Linux
    • Python: Python 3.10.5
    • UltraJSON: ujson 5.3.0

    Please include code that reproduces the issue.

    The best reproductions are self-contained scripts with minimal dependencies.

    Python 3.10.5 (main, Jun  6 2022, 18:49:26) [GCC 12.1.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import json
    >>> import ujson
    >>> json.loads('{"a": -}')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python3.10/json/__init__.py", line 346, in loads
        return _default_decoder.decode(s)
      File "/usr/lib/python3.10/json/decoder.py", line 337, in decode
        obj, end = self.raw_decode(s, idx=_w(s, 0).end())
      File "/usr/lib/python3.10/json/decoder.py", line 355, in raw_decode
        raise JSONDecodeError("Expecting value", s, err.value) from None
    json.decoder.JSONDecodeError: Expecting value: line 1 column 7 (char 6)
    >>> ujson.loads('{"a": -}')
    {'a': 0}
    

    According to JSON's RFC (https://datatracker.ietf.org/doc/html/rfc7159#section-6), a standalone minus sign isn't a valid number. The built-in Python json module doesn't accept this input and raises an exception, but ujson raises no exception and parses the minus sign as 0. I think it's a bug.
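For inputs where strictness matters, one mitigation is to cross-check against the stricter stdlib parser (a sketch; `rejects` is a made-up helper name, and the stdlib does both jobs here so the example stays self-contained):

```python
import json

def rejects(text: str) -> bool:
    """Return True if the strict stdlib parser refuses the input.
    Hypothetical helper for cross-checking a lenient, faster parser."""
    try:
        json.loads(text)
        return False
    except json.JSONDecodeError:
        return True

# The stdlib refuses the standalone minus that ujson 5.3.0 accepted.
print(rejects('{"a": -}'))   # True
print(rejects('{"a": -1}'))  # False
```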

    opened by Zerogoki00 3
  • Request: simd / AVX512 support

    Based on recent benchmarks, it looks like JSON performance might greatly benefit from access to SIMD instructions.

    https://www.phoronix.com/scan.php?page=article&item=simdjson-avx-512&num=2

    I'm not sure how difficult support would be to add and maintain. Would the default PyPI binaries need to have it disabled for compatibility? Would there need to be a custom PyPI index for different cflags, like there is with torch? Or could this be something that is enabled/disabled at runtime?

    opened by Erotemic 4
  • Benchmark stats v2

    This PR supersedes #532 and is based off of main, so it does not include extra PRs.

    This is still a work in progress, but it is coming along and starting to produce results. It is significantly more complex than the previous PR, but that is justified by several design goals, which are as follows:

    1. I want to show pretty graphs
    2. I want to output pretty tables
    3. I want to vary the size of the inputs of the current benchmark to show scalability
    4. I wanted repeated runs of the benchmarks to serialize their results so statistics from multiple runs (over multiple machines) can be accumulated and compared over time.
    5. I wanted to be able to run t-tests to quantify the probability that a performance regression has been introduced over different versions of ujson (or other json libraries for that matter).

    To this end, I've started work on an experimental module I'm currently calling "benchmarker", which I've added to the test subdirectory. There is still a lot to clean up here, but the general structure is in place.

    • The benchmarker.py file contains a wrapper around timerit that makes it simpler to express benchmarks over a grid of varied parameters.

    • The process_context collects information about the machine hardware / software so each benchmark knows the context in which it was run.

    • The result_analysis.py file is ported from another project I'm working on that runs stats over a table of results. I was originally using this to compare hyperparameters wrt machine learning performance metrics, but it also applies when that performance metric is "time" and the hyperparameters are different libraries / inputs / settings. It's highly general.

    • The util_json script contains JSON utilities I need to ensure the benchmarks are properly serialized. It might become removable as this PR matures and is focused on this use case.

    • The aggregate and visualize scripts will probably go away. I'm keeping them in for now as I continue development.

    The script that uses "benchmarker" is currently called "benchmark3.py", and it will be a superset of what "benchmark.py" currently does.

    Here is the current state of the visualization: [screenshot not reproduced]

    Current state of the statistics (currently marginalized over all sizes / inputs, but that can be refined):

    PARAMETER: impl - METRIC: mean_time
    ===================================
    mean_time   count      mean       std           min       25%       50%       75%       max
    impl                                                                                       
    orjson       20.0  0.000043  0.000074  4.760000e-07  0.000002  0.000009  0.000049  0.000300
    ujson        20.0  0.000182  0.000338  6.901000e-07  0.000004  0.000032  0.000185  0.001391
    nujson       20.0  0.000304  0.000518  6.606000e-07  0.000005  0.000044  0.000255  0.001733
    json         20.0  0.000361  0.000597  2.347300e-06  0.000010  0.000053  0.000451  0.002308
    simplejson   20.0  0.000452  0.000710  3.685600e-06  0.000011  0.000056  0.000671  0.002632
    
    ANOVA: If p is low, the param 'impl' might have an effect
      Rank-ANOVA: p=0.04903523
      Mean-ANOVA: p=0.09580734
    
    Pairwise T-Tests
      If p is low, impl=orjson may outperform impl=ujson.
        ttest_ind:  p=0.04442984
      If p is low, impl=ujson may outperform impl=nujson.
        ttest_ind:  p=0.19194550
      If p is low, impl=nujson may outperform impl=json.
        ttest_ind:  p=0.37413442
      If p is low, impl=json may outperform impl=simplejson.
        ttest_ind:  p=0.33158304
      param_name     metric  anova_rank_H  anova_rank_p  anova_mean_F  anova_mean_p
    0       impl  mean_time      9.534891      0.049035      2.033863      0.095807
    

    And the OpenSkill analysis (which can be interpreted as the probability the chosen implementation / version will be fastest):

    skillboard.ratings = {
        ('json', '2.0.9')       : Rating(mu=17.131378529535343, sigma=5.9439028153361395),
        ('nujson', '1.35.2')    : Rating(mu=11.38071601274192, sigma=6.063976195156715),
        ('orjson', '3.6.8')     : Rating(mu=60.1867466539136, sigma=7.080517291679502),
        ('simplejson', '3.17.6'): Rating(mu=3.5265870256595813, sigma=6.2092484551816725),
        ('ujson', '5.2.1.dev28'): Rating(mu=38.31167062257107, sigma=6.320048263068908),
    }
    win_probs = {
        ('json', '2.0.9'): 0.17368695029693812,
        ('nujson', '1.35.2'): 0.15447983652936728,
        ('orjson', '3.6.8'): 0.2981884925614975,
        ('simplejson', '3.17.6'): 0.12972249212852724,
        ('ujson', '5.2.1.dev28'): 0.2439222284836699,
    }
    

    I still need to:

    • [ ] Get the plots/analysis working with aggregated benchmark results
    • [ ] Reproduce the existing tables for the README (with percentage speedup / slowdown relative to ujson)
    • [ ] Clean up the plots, save them to disk, and make them work over multiple ujson versions.

    Submitting this now as it is starting to come together, and I'd be interested in feedback on adding what effectively is a benchmarking system to the repo. I'm thinking "benchmarker" can eventually become a standalone repo that is included as a benchmark dependency, but setting up and maintaining a separate repo is an endeavor, so if possible, I'd like to "pre-vendor" it here as a staging area where it can (1) be immediately useful and (2) prove itself / work out the kinks.

    opened by Erotemic 5