MessagePack serializer implementation for Python msgpack.org[Python]

Overview

MessagePack for Python


What's this

MessagePack is an efficient binary serialization format. It lets you exchange data among multiple languages, like JSON, but it's faster and smaller. This package provides CPython bindings for reading and writing MessagePack data.

Very important notes for existing users

PyPI package name

The package name on PyPI was changed from msgpack-python to msgpack as of 0.5.

When upgrading from msgpack-0.4 or earlier, run pip uninstall msgpack-python before pip install -U msgpack.

Compatibility with the old format

You can use the use_bin_type=False option to pack bytes objects into the raw type from the old msgpack spec, instead of the bin type from the new spec.

You can unpack the old msgpack format using the raw=True option. It unpacks msgpack's str (raw) type into Python bytes.

See the note below for details.

Major breaking changes in msgpack 1.0

  • Python 2

    • The extension module does not support Python 2 anymore. The pure Python implementation (msgpack.fallback) is used for Python 2.
  • Packer

    • use_bin_type=True by default. bytes are encoded in the bin type in msgpack. If you are still using Python 2, you must use unicode for all string types. You can use use_bin_type=False to encode into the old msgpack format.
    • The encoding option is removed. UTF-8 is always used.
  • Unpacker

    • raw=False by default. It assumes str (raw) values are valid UTF-8 strings and decodes them to Python str (unicode) objects.
    • The encoding option is removed. You can use raw=True to support the old format.
    • The default value of max_buffer_size is changed from 0 to 100 MiB.
    • The default value of strict_map_key is changed to True to avoid hashdos. Pass strict_map_key=False if your data contains map keys whose type is not bytes or str (see the sketch below).
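
For example (a minimal sketch, assuming msgpack >= 1.0; the exact error message differs between the C extension and the pure Python fallback):

>>> import msgpack
>>> packed = msgpack.packb({1: "one"})
>>> msgpack.unpackb(packed)
Traceback (most recent call last):
  ...
ValueError: int is not allowed for map key
>>> msgpack.unpackb(packed, strict_map_key=False)
{1: 'one'}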

Install

$ pip install msgpack

Pure Python implementation

The extension module in msgpack (msgpack._cmsgpack) does not support Python 2 or PyPy.

But msgpack provides a pure Python implementation (msgpack.fallback) for PyPy and Python 2.

Windows

When you can't use a binary distribution, you need to install Visual Studio or the Windows SDK on Windows. Without the extension, the pure Python implementation runs slowly on CPython.

How to use

NOTE: In the examples below, I use raw=False and use_bin_type=True for users on msgpack < 1.0. These options are the defaults from msgpack 1.0, so you can omit them there.

One-shot pack & unpack

Use packb for packing and unpackb for unpacking. msgpack provides dumps and loads as aliases for compatibility with json and pickle.

pack and dump pack to a file-like object. unpack and load unpack from a file-like object (see the second example below).

>>> import msgpack
>>> msgpack.packb([1, 2, 3], use_bin_type=True)
b'\x93\x01\x02\x03'
>>> msgpack.unpackb(_, raw=False)
[1, 2, 3]
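
pack and unpack work the same way with a file-like object; a minimal sketch:

>>> import io
>>> buf = io.BytesIO()
>>> msgpack.pack([1, 2, 3], buf, use_bin_type=True)
>>> buf.seek(0)
0
>>> msgpack.unpack(buf, raw=False)
[1, 2, 3]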

unpackb unpacks msgpack's array to a Python list by default, but it can also unpack to a tuple:

>>> msgpack.unpackb(b'\x93\x01\x02\x03', use_list=False, raw=False)
(1, 2, 3)

You should always specify the use_list keyword argument for backward compatibility. See the performance tips below for issues related to the use_list option.

Read the docstring for other options.

Streaming unpacking

Unpacker is a "streaming unpacker". It unpacks multiple objects from one stream (or from bytes provided through its feed method).

import msgpack
from io import BytesIO

buf = BytesIO()
for i in range(100):
    buf.write(msgpack.packb(i, use_bin_type=True))

buf.seek(0)

unpacker = msgpack.Unpacker(buf, raw=False)
for unpacked in unpacker:
    print(unpacked)

Packing/unpacking of custom data type

It is also possible to pack/unpack custom data types. Here is an example for datetime.datetime.

import datetime
import msgpack

useful_dict = {
    "id": 1,
    "created": datetime.datetime.now(),
}

def decode_datetime(obj):
    if '__datetime__' in obj:
        obj = datetime.datetime.strptime(obj["as_str"], "%Y%m%dT%H:%M:%S.%f")
    return obj

def encode_datetime(obj):
    if isinstance(obj, datetime.datetime):
        return {'__datetime__': True, 'as_str': obj.strftime("%Y%m%dT%H:%M:%S.%f")}
    return obj


packed_dict = msgpack.packb(useful_dict, default=encode_datetime, use_bin_type=True)
this_dict_again = msgpack.unpackb(packed_dict, object_hook=decode_datetime, raw=False)

Unpacker's object_hook callback receives a dict; the object_pairs_hook callback may instead be used to receive a list of key-value pairs.
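
For example, object_pairs_hook can build a different mapping type from the key-value pairs (a minimal sketch):

>>> import msgpack
>>> from collections import OrderedDict
>>> packed = msgpack.packb({"a": 1, "b": 2}, use_bin_type=True)
>>> msgpack.unpackb(packed, object_pairs_hook=OrderedDict, raw=False)
OrderedDict([('a', 1), ('b', 2)])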

Extended types

It is also possible to pack/unpack custom data types using the ext type.

>>> import msgpack
>>> import array
>>> def default(obj):
...     if isinstance(obj, array.array) and obj.typecode == 'd':
...         return msgpack.ExtType(42, obj.tobytes())
...     raise TypeError("Unknown type: %r" % (obj,))
...
>>> def ext_hook(code, data):
...     if code == 42:
...         a = array.array('d')
...         a.frombytes(data)
...         return a
...     return msgpack.ExtType(code, data)
...
>>> data = array.array('d', [1.2, 3.4])
>>> packed = msgpack.packb(data, default=default, use_bin_type=True)
>>> unpacked = msgpack.unpackb(packed, ext_hook=ext_hook, raw=False)
>>> data == unpacked
True

Advanced unpacking control

As an alternative to iteration, Unpacker objects provide unpack, skip, read_array_header and read_map_header methods. The former two read an entire message from the stream, respectively de-serialising and returning the result, or ignoring it. The latter two methods return the number of elements in the upcoming container, so that each element in an array, or key-value pair in a map, can be unpacked or skipped individually.
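
A minimal sketch of this element-by-element style, using read_array_header:

import msgpack

unpacker = msgpack.Unpacker(raw=False)
unpacker.feed(msgpack.packb([1, "two", 3], use_bin_type=True))

n = unpacker.read_array_header()  # size of the upcoming array (here: 3)
for _ in range(n):
    print(unpacker.unpack())  # or unpacker.skip() to ignore an element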

Notes

string and binary type

Early versions of msgpack didn't distinguish between string and binary types. The single type used to represent both was named raw.

You can pack into and unpack from this old spec using the use_bin_type=False and raw=True options.

>>> import msgpack
>>> msgpack.unpackb(msgpack.packb([b'spam', u'eggs'], use_bin_type=False), raw=True)
[b'spam', b'eggs']
>>> msgpack.unpackb(msgpack.packb([b'spam', u'eggs'], use_bin_type=True), raw=False)
[b'spam', 'eggs']

ext type

To use the ext type, pass an msgpack.ExtType object to the packer.

>>> import msgpack
>>> packed = msgpack.packb(msgpack.ExtType(42, b'xyzzy'))
>>> msgpack.unpackb(packed)
ExtType(code=42, data=b'xyzzy')

You can use it with default and ext_hook; see the Extended types example above.

Security

When unpacking data received from an unreliable source, msgpack provides two security options.

max_buffer_size (default: 100*1024*1024) limits the internal buffer size. It is also used to limit the preallocated list size.

strict_map_key (default: True) limits the types of map keys to bytes and str. While the msgpack spec doesn't limit the types of map keys, there is a risk of hashdos. If you need to support other types for map keys, use strict_map_key=False.
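
A minimal sketch of an Unpacker configured for untrusted input (the 1 MiB limit is an illustrative value, not a recommendation):

import msgpack

unpacker = msgpack.Unpacker(
    max_buffer_size=1024 * 1024,  # illustrative: reject payloads larger than 1 MiB
    strict_map_key=True,  # only bytes/str map keys (the default since 1.0)
    raw=False,
)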

Performance tips

CPython's GC starts when the number of allocated objects grows. This means unpacking a large message may trigger useless GC runs. You can use gc.disable() when unpacking a large message.

List is Python's default sequence type, but tuple is lighter than list. You can use use_list=False when unpacking if performance is important.
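
A minimal sketch combining both tips; gc.disable() is process-wide, so re-enable it in a finally block:

import gc
import msgpack

def unpack_large(payload):
    gc.disable()  # avoid useless GC runs while many objects are being allocated
    try:
        return msgpack.unpackb(payload, use_list=False, raw=False)
    finally:
        gc.enable()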

Issues
  • Use new buffer interface to unpack

    This PR adds support for unpacking/feeding from any object that supports the new buffer interface.

    For compatibility, an attempt is made to use the old buffer interface if that fails. On success, a RuntimeWarning is issued to inform users of possible errors and future removal of this feature.

    opened by jfolz 25
  • Backward incompatible API change toward 1.0

    DRAFT: This issue is for writing down my ideas.

    Changing the default behavior to use the new spec.

    See also: https://github.com/msgpack/msgpack/blob/master/spec.md#upgrading-messagepack-specification.

    v1.0

    Drop the encoding and unicode_errors options in Unpacker. Add a raw=False option instead. When raw=True is passed, raw data in msgpack is deserialized into a bytes object.

    Drop the encoding and unicode_errors options in Packer. Unicode is always encoded with UTF-8. 'surrogatepass' is not allowed, to keep generated msgpack clean. use_bin_type=False is the only way to create dirty msgpack (raw containing non-UTF-8 data).

    v0.5.x

    Add raw=True option to unpacker.

    v0.6

    0.6 is the version for warnings.

    Packer warns when encoding or unicode_errors is specified.

    Unpacker warns when encoding or unicode_errors is specified. ("Use raw=False instead").

    PyPI package name

    In the past, easy_install crawled the msgpack website and found msgpack-x.y.z.tar.gz. But that was the package for C. That's why I moved from msgpack to msgpack-python.

    Sadly, pip doesn't support transitional packages (empty, but depending on the package under the new name). So I will release msgpack as both msgpack-python and msgpack for a while, until 1.0.

    As of 1.0, I will release msgpack only.

    1.0 
    opened by methane 22
  • Unknown serialization issue

    Hello,

    I have been having serious problems with database connectivity since yesterday. I've traced the issue down to a piece of code that passes a long integer with Python's 'L' notation appended. This appears to be similar to #114. I am struggling a bit because some chained dependency explicitly requires the latest msgpack.

    opened by lordnynex 22
  • High CPU usage when unpacking on Ubuntu 12.04 with msgpack 4.8

    (gdb) bt
    #0  0x00007fee03ef525d in sem_post () from /lib/x86_64-linux-gnu/libpthread.so.0
    #1  0x000000000056d237 in PyThread_release_lock (lock=0x27bd4f0) at ../Python/thread_pthread.h:346
    #2  0x000000000051a1fd in PyEval_EvalFrameEx (
        f=Frame 0x2ac03c0, for file /usr/local/lib/python2.7/dist-packages/msgpack/fallback.py, line 537, in _fb_unpack (self=<Unpacker(_max_buffer_size=2147483647, _encoding=None, _max_map_len=2147483647, _max_bin_len=2147483647, _max_ext_len=2147483647, _fb_sloppiness=0, _object_hook=None, _fb_buf_o=52492, _fb_buf_n=1866623, _fb_buf_i=22, _unicode_errors='strict', _use_list=True, _max_str_len=2147483647, _ext_hook=<type at remote 0x17ed550>
    

    Recently, I've been observing this strange situation: my script simply receives msgpack-ed data from stdin and extracts it, yet during that procedure the script causes severe futex contention. I've used gdb to track it down and found that there are tons of acquires and releases on the same lock object, e.g. "lock=0x27bd4f0", which seems to me to be used by msgpack, specifically _fb_unpack.

    Just wondering, has anyone else observed a similar phenomenon? I have not had time to dig into the _fb_unpack function, so currently I have no idea why this function can cause futex contention.

    futex(0x27bd4f0, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
    futex(0x27bd4f0, FUTEX_WAIT_PRIVATE, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
    futex(0x27bd4f0, FUTEX_WAKE_PRIVATE, 1) = 1
    futex(0x27bd4f0, FUTEX_WAKE_PRIVATE, 1) = 1
    futex(0x27bd4f0, FUTEX_WAKE_PRIVATE, 1) = 1
    futex(0x27bd4f0, FUTEX_WAKE_PRIVATE, 1) = 1
    futex(0x27bd4f0, FUTEX_WAKE_PRIVATE, 1) = 1
    futex(0x27bd4f0, FUTEX_WAKE_PRIVATE, 1) = 1
    futex(0x27bd4f0, FUTEX_WAKE_PRIVATE, 1) = 1
    futex(0x27bd4f0, FUTEX_WAKE_PRIVATE, 1) = 1
    futex(0x27bd4f0, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
    futex(0x27bd4f0, FUTEX_WAKE_PRIVATE, 1) = 1
    futex(0x27bd4f0, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
    futex(0x27bd4f0, FUTEX_WAIT_PRIVATE, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
    futex(0x27bd4f0, FUTEX_WAKE_PRIVATE, 1) = 1
    futex(0x27bd4f0, FUTEX_WAKE_PRIVATE, 1) = 1
    futex(0x27bd4f0, FUTEX_WAKE_PRIVATE, 1) = 0
    futex(0x27bd4f0, FUTEX_WAKE_PRIVATE, 1) = 1
    
    opened by imcom 20
  • RFC: deprecate write_bytes option in unpackers

    I feel the write_bytes option in Unpacker makes the implementation complicated. I forgot why I added it. I'll check the history and consider another API for the use case.

    Ideas? comments?

    opened by methane 19
  • Support serializing datetime objects

    Since timestamps are now a part of the msgpack spec, shouldn't this library support serializing them?

    1.0 
    opened by gappleto97 19
  • Support packing memoryview objects

    I am working on a high-throughput scenario where I need to avoid copying memory wherever possible. I frequently create buffers from existing objects and sometimes need to hand them to msgpack. Unfortunately I could not find a way to pass buffers to the current version of msgpack without copying. So I wrote this small patch to support packing arbitrary buffer objects to binary data. It works well for my use case.

    Things I'm not so sure about:

    • Pure Python fallback is not implemented. Its structure is different from the Cython code and I'm not sure what to do exactly.
    • I'm not an expert in using this library. Does this interfere with any other functionality?
    • Information is lost in the conversion, e.g., the shape of NumPy arrays, since the lib just "swallows" everything that is a buffer. A user might expect to get the same type out again, but gets bytes instead.
    • Strided data is still copied.
    • No test cases yet. Might need some more type checks.
    opened by jfolz 18
  • memoryview objects are not fully supported

    memoryview is the Python 3 type for non-owning memory buffer objects, also backported to Python 2.7. unpackb and Unpacker.feed should unpack them without copying, and packing functions should handle them the same as bytes. The current support is quite limited due to multiple issues:

    Python 3.3

    >>> msgpack._unpacker.unpackb(memoryview(b'\x91\xc3'))
    [True]
    >>> msgpack.fallback.unpackb(memoryview(b'\x91\xc3'))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/nofitserov/.local/lib64/python3.3/site-packages/msgpack/fallback.py", line 93, in unpackb
        ret = unpacker._fb_unpack()
      File "/home/nofitserov/.local/lib64/python3.3/site-packages/msgpack/fallback.py", line 383, in _fb_unpack
        typ, n, obj = self._read_header(execute, write_bytes)
      File "/home/nofitserov/.local/lib64/python3.3/site-packages/msgpack/fallback.py", line 274, in _read_header
        b = ord(c)
    TypeError: ord() expected string of length 1, but memoryview found
    >>> msgpack._packer.Packer().pack(memoryview(b'abc'))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "_packer.pyx", line 224, in msgpack._packer.Packer.pack (msgpack/_packer.cpp:224)
      File "_packer.pyx", line 226, in msgpack._packer.Packer.pack (msgpack/_packer.cpp:226)
      File "_packer.pyx", line 221, in msgpack._packer.Packer._pack (msgpack/_packer.cpp:221)
    TypeError: can't serialize <memory at 0x7f37c3e41460>
    >>> msgpack.fallback.Packer().pack(memoryview(b'abc'))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/nofitserov/.local/lib64/python3.3/site-packages/msgpack/fallback.py", line 618, in pack
        self._pack(obj)
      File "/home/nofitserov/.local/lib64/python3.3/site-packages/msgpack/fallback.py", line 615, in _pack
        raise TypeError("Cannot serialize %r" % obj)
    TypeError: Cannot serialize <memory at 0x7f37c3e41390>
    

    Python 2.7

    >>> msgpack._unpacker.unpackb(memoryview(b'\x91\xc3'))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "_unpacker.pyx", line 105, in msgpack._unpacker.unpackb (msgpack/_unpacker.cpp:105)
    TypeError: expected a readable buffer object
    >>> msgpack.fallback.unpackb(memoryview(b'\x91\xc3'))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/nofitserov/.local/lib64/python2.7/site-packages/msgpack/fallback.py", line 93, in unpackb
        ret = unpacker._fb_unpack()
      File "/home/nofitserov/.local/lib64/python2.7/site-packages/msgpack/fallback.py", line 381, in _fb_unpack
        typ, n, obj = self._read_header(execute, write_bytes)
      File "/home/nofitserov/.local/lib64/python2.7/site-packages/msgpack/fallback.py", line 272, in _read_header
        b = ord(c)
    TypeError: ord() expected string of length 1, but memoryview found
    >>> msgpack._packer.Packer().pack(memoryview(b'abc'))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "_packer.pyx", line 206, in msgpack._packer.Packer.pack (msgpack/_packer.cpp:206)
      File "_packer.pyx", line 208, in msgpack._packer.Packer.pack (msgpack/_packer.cpp:208)
      File "_packer.pyx", line 203, in msgpack._packer.Packer._pack (msgpack/_packer.cpp:203)
    TypeError: can't serialize <memory at 0x7f506e40ddf8>
    >>> msgpack.fallback.Packer().pack(memoryview(b'abc'))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/nofitserov/.local/lib64/python2.7/site-packages/msgpack/fallback.py", line 616, in pack
        self._pack(obj)
      File "/home/nofitserov/.local/lib64/python2.7/site-packages/msgpack/fallback.py", line 613, in _pack
        raise TypeError("Cannot serialize %r" % obj)
    TypeError: Cannot serialize <memory at 0x7f506e40ddf8>
    

    The only available workaround right now is to explicitly convert memoryview objects to bytes, needlessly copying the contents, which degrades performance, especially for unpacking large objects.

    opened by himikof 17
  • use_bin_type - confusing future hint

    string and binary type
    
    Early versions of msgpack didn't distinguish string and binary types (like Python 1).
    

    Ehrm, did you really mean Python 1.x here or rather 2.x?

    The type for representing both string and binary types was named raw.
    
    For backward compatibility reasons, msgpack-python will still default all
    strings to byte strings, unless you specify the use_bin_type=True option
    in the packer.
    

    So that means the current default is still False (or rather 0, from the code).

    If you do so, it will use a non-standard type called bin to serialize byte arrays,
    and raw becomes to mean str. If you want to distinguish bin and raw in the
    unpacker, specify encoding='utf-8'.
    
    In future version, default value of ``use_bin_type`` will be changed to ``False``.
    

    Did you mean True here?

    To avoid this change will break your code, you must specify it explicitly even when you want to use old format.
    
    opened by ThomasWaldmann 17
  • exception hierarchy and its future

    http://msgpack-python.readthedocs.io/en/latest/api.html#exceptions

    For the upper layer(s) of the current msgpack exception hierarchy, it states that:

    Deprecated. Use Exception instead to catch all exception during packing.
    

    or

    Deprecated. Use ValueError instead.
    

    I am not sure why that is deprecated; to me this feels backwards.

    There are quite a lot of normal use cases where one wants to catch all msgpack exceptions, but one can't just catch Exception (or ValueError).

    For example, look at this, taken from borgbackup code:

    try:
        with IntegrityCheckedFile(hints_path, write=False, integrity_data=integrity_data) as fd:
            hints = msgpack.unpack(fd)
    except (msgpack.UnpackException, FileNotFoundError, FileIntegrityError) as e: 
        ... # the file is not there / is crap, rebuild it.
    

    So, if I'd use Exception there, it would also catch all sorts of unspecific issues in the IntegrityCheckedFile code - that's a bad idea.

    Of course one could work around this with an additional inner try/except and re-raising some specific custom exception, but why not just keep UnpackException for such cases?

    opened by ThomasWaldmann 16
  • Have build script fail on non-successful build of native (cython) extension and add explicit flag to use fallback

    I recently debugged a tricky issue with OpenStack Neutron, or rather oslo.privsep (https://github.com/openstack/oslo.privsep), which makes heavy use of your very appreciated python-msgpack. You'll find the whole story here: https://bugs.launchpad.net/cloud-archive/+bug/1937261.

    The root cause of my observed issue was that the python-msgpack backport of version 0.6.2 to Ubuntu Bionic, done by Ubuntu Cloud Archive, was using the pure Python fallback for pack/unpack. And this was simply due to Cython being too old, as it lacked the support for bytearray that was only added in 0.29 (https://github.com/cython/cython/pull/2573).

    1. The version requirement stated in the warning at https://github.com/msgpack/msgpack-python/blob/38dba9634e4efa7886a777b9e7c739dc148da457/setup.py#L57 is no longer correct; obviously >=0.29 is required, at least for python-msgpack version 0.6.2 and newer.

    2. Honestly, I believe your build script is just way too nice: it skips over every possible error condition when building the extension and then simply falls back to pure Python (https://github.com/msgpack/msgpack-python/blob/38dba9634e4efa7886a777b9e7c739dc148da457/setup.py#L50). I'd like to suggest failing hard if Cython is not available or cannot build successfully, and adding a flag to skip the extension and explicitly use the pure Python fallback. In this particular case, only looking at the human-readable build logs exposed the problem, which only gets harder to debug with every additional layer.

    3. Not allowing an at-runtime fallback is a whole other story - but it could be sensible as well for use cases that just won't work with the performance of the fallback.

    opened by frittentheke 1
  • Segmentation fault when calling getbuffer() on Packer object

    Hi there,

    I'm trying to get the internal data of the Packer object in order to avoid unneeded copying, as documented here.

    The code is as follows:

    import msgpack
    
    def do_the_job():
      packer = msgpack.Packer(autoreset=False)
      packer.pack(1)
      return packer.getbuffer()
    
    bytes(do_the_job())
    

    When running this snippet, I get the following error:

    [1]    9018 segmentation fault (core dumped)  python script.py
    

    I am using Ubuntu 18.04.5 LTS together with msgpack 1.0.2.

    Thanks in advance for your help and for your work on this package!

    opened by thibaudmartinez 9
  • Nicer error when packing a datetime without tzinfo

    When attempting to pack a datetime.datetime object with msgpack.packb(dt, datetime=True) where the dt object is missing tzinfo, the library throws the generic error TypeError: can not serialize 'datetime.datetime' object.

    That threw me off, as I thought the problem related to datetime=True, and I didn't realise I was missing tzinfo. This small patch outputs a more specific error message for that edge case.
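
    For reference, a hedged sketch of the behavior being discussed (assuming msgpack >= 1.0, where datetime=True packs timezone-aware datetime objects as msgpack timestamps; the naive case is the one this patch improves the error for):

    import datetime
    import msgpack

    aware = datetime.datetime.now(datetime.timezone.utc)  # has tzinfo, packs fine
    packed = msgpack.packb(aware, datetime=True)
    msgpack.unpackb(packed, timestamp=3)  # timestamp=3 decodes back to a datetime

    naive = datetime.datetime.now()  # no tzinfo: packb(naive, datetime=True) raises TypeError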

    opened by bem7 2
  • BufferFull exception followed up by a UnicodeDecodeError exception

    Hey,

    I have a client/server architecture where the server is Golang-based and sends msgpack messages over a unix socket, and the Python client feeds a buffer while reading messages from the socket. The server might send non-UTF-8 input values as string (Go) rather than as []byte, which means these values fail to decode when we use Unpacker(raw=False).

    What happens is, an exception is thrown during decoding (expected):

    Exception caught while serving "'utf-8' codec can't decode byte 0xd3 in position 4: invalid continuation byte": Traceback (most recent call last):
      File somefile.py", line 131, in listen
        incoming_message = next(unpacker)
      File "msgpack/_unpacker.pyx", line 518, in msgpack._cmsgpack.Unpacker.__next__
      File "msgpack/_unpacker.pyx", line 443, in msgpack._cmsgpack.Unpacker._unpack
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd3 in position 4: invalid continuation byte
    

    The issue: from now on, the unpacker, which is initialized once, is considered 💀 as it is not able to unpack any message after it hits the UnicodeDecodeError exception. But that is not the whole story: every request expands the unpacker's buffer a bit more until it hits

    Exception caught while serving "": Traceback (most recent call last):
      File somefile.py", line 131, in listen
        unpacker.feed(buf_memoryview[:max])
      File "msgpack/_unpacker.pyx", line 422, in msgpack._cmsgpack.Unpacker.feed
      File "msgpack/_unpacker.pyx", line 446, in msgpack._cmsgpack.Unpacker.append_buffer
    msgpack.exceptions.BufferFull
    

    Way to reproduce

    1. create a packer, set as the server
    2. create an unpacker, set as the client
    3. create a message on the server, make sure it contains some non-UTF-8 bytes (\xdb), and send it to the client
    4. the client tries to read and explodes with UnicodeDecodeError
    5. the server keeps sending messages; step 4 hits msgpack.exceptions.BufferFull

    Note: tested on both the 1.0.2 and 0.6.1 versions.

    Suggestion for how to fix it: when hitting an exception, reset the internal buffers to avoid ever-growing buffers.

    How I fixed it: re-initialized the unpacker on each exception, e.g.:

    unpacker = msgpack.Unpacker(raw=False)
    while True:
        # ... read from the socket
        socket.recv_into(buf)
        try:
            unpacker.feed(buf)
            msg = next(unpacker)
        except Exception:
            unpacker = msgpack.Unpacker(raw=False)
    
    opened by liranbg 6
  • Adding type stubs

    In the course of using msgpack in a project I've written some basic type stubs for the parts of the API I'm using. I don't think it would be too hard to fill out the type stubs for the rest of the API.

    Would you accept a PR adding stub files to the package? Not sure if you have a preference on adding the types to the code itself or keeping them as separate stub files.

    related: https://www.python.org/dev/peps/pep-0561/

    opened by sbdchd 4
  • OSS-Fuzz

    https://github.com/google/oss-fuzz

    Help-Wanted 
    opened by methane 0
  • Stream processing requires knowledge of the data

    I was just trying to use the recently updated msgpack library for stream processing, but it still requires knowledge of the incoming data, which I don't have in all cases. What I want is a function that works roughly like this:

    >>> u = msgpack.Unpacker(StringIO('\x94\x01\x02\x03\x04'))
    >>> u.next_marker()
    ('map_start', 4)
    >>> u.next_marker()
    ('value', 1)
    >>> u.next_marker()
    ('value', 2)
    >>> u.next_marker()
    ('value', 3)
    >>> u.next_marker()
    ('value', 4)
    

    E.g.: next_marker returns a tuple in the form (marker, value), where marker is one of map_start / array_start or value. If it's value, the second item holds the actual value; if it's a container marker, the second item holds the container size. This would allow trivial stream processing. (A value marker would never contain a map or array, just scalar values.)
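
    For reference, a hedged sketch of the closest approximation with the current Unpacker API, using read_array_header plus unpack; it reads one container header at a time rather than emitting true markers:

    import msgpack

    unpacker = msgpack.Unpacker()
    unpacker.feed(b'\x94\x01\x02\x03\x04')  # fixarray of 4 elements

    n = unpacker.read_array_header()  # roughly ('array_start', 4) in the proposed API
    for _ in range(n):
        print(unpacker.unpack())  # 1, 2, 3, 4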

    opened by mitsuhiko 14