Commit c3701df

modernize
1 parent 063752f commit c3701df

19 files changed

Lines changed: 714 additions & 86 deletions

.github/workflows/ci.yml

Lines changed: 7 additions & 6 deletions
@@ -13,11 +13,11 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        python-version: ["3.9", "3.10", "3.11", "3.12", "3.13", "3.14"]
+        python-version: ["3.10", "3.11", "3.12", "3.13", "3.14"]
 
     steps:
-      - uses: actions/checkout@v4
-      - uses: astral-sh/setup-uv@v6
+      - uses: actions/checkout@v6
+      - uses: astral-sh/setup-uv@v7
         with:
           python-version: ${{ matrix.python-version }}
           enable-cache: true
@@ -29,8 +29,9 @@ jobs:
     runs-on: ubuntu-latest
 
     steps:
-      - uses: actions/checkout@v4
-      - uses: astral-sh/setup-uv@v6
+      - uses: actions/checkout@v6
+      - uses: astral-sh/setup-uv@v7
       - name: Lint
         run: |
-          uv run ruff check --no-fix
+          uv run ruff format --check
+          uv run ruff check --no-fix --output-format=github

.github/workflows/publish.yml

Lines changed: 26 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,26 @@
1+
2+
name: Publish Python Package
3+
4+
on:
5+
release:
6+
types: [published]
7+
8+
permissions:
9+
contents: read
10+
11+
jobs:
12+
pypi-publish:
13+
runs-on: ubuntu-latest
14+
permissions:
15+
contents: read
16+
id-token: write
17+
18+
steps:
19+
- uses: actions/checkout@v6
20+
- uses: astral-sh/setup-uv@v7
21+
- name: Build release distributions
22+
run: |
23+
uv build
24+
- name: Publish release distributions to PyPI
25+
run: |
26+
uv publish

README.md

Lines changed: 185 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,186 @@
1-
# msgpack-stream
1+
# msgpack-streams
22

3-
Pure Python stream based implementation of msgpack
3+
Fast stream based implementation of msgpack in pure Python.
4+
5+
## Installation
6+
7+
```bash
8+
pip install msgpack-streams
9+
```
10+
11+
## Benchmarks
12+
13+
Average of 50 iterations each on a 3.77 MB payload, pure Python
14+
(`MSGPACK_PUREPYTHON=1`).
15+
16+
| Implementation | Operation | Speedup vs msgpack |
17+
| ------------------------------- | --------- | ------------------ |
18+
| msgpack-streams `unpack` | decode | 2.83x |
19+
| msgpack-streams `unpack_stream` | decode | 2.70x |
20+
| msgpack-streams `pack` | encode | 1.84x |
21+
| msgpack-streams `pack_stream` | encode | 1.69x |
22+
23+
## Usage
24+
25+
```python
26+
from msgpack_streams import pack, unpack
27+
28+
data = {"key": "value", "number": 42, "list": [1, 2, 3]}
29+
packed = pack(data)
30+
unpacked, excess_data = unpack(packed)
31+
assert data == unpacked
32+
assert not excess_data
33+
```
34+
35+
The stream based API is also available:
36+
37+
```python
38+
from msgpack_streams import pack_stream, unpack_stream
39+
import io
40+
41+
data = {"key": "value", "number": 42, "list": [1, 2, 3]}
42+
43+
with io.BytesIO() as stream:
44+
pack_stream(stream, data)
45+
# reset stream position for reading
46+
stream.seek(0)
47+
unpacked = unpack_stream(stream)
48+
49+
assert data == unpacked
50+
```
51+
52+
## Extensions
53+
54+
### Datetime
55+
56+
Timezone-aware `datetime` objects are natively supported and automatically
57+
encoded using the
58+
[msgpack Timestamp extension](https://github.com/msgpack/msgpack/blob/master/spec.md#timestamp-extension-type)
59+
(type code `-1`). The timestamp format (32-, 64-, or 96-bit) is chosen
60+
automatically based on the value's range and precision. Decoded timestamps are
61+
always returned as UTC `datetime` objects.
62+
63+
```python
64+
from datetime import datetime, timezone
65+
from msgpack_streams import pack_stream, unpack_stream
66+
import io
67+
68+
dt = datetime(2025, 3, 25, 12, 0, 0, tzinfo=timezone.utc)
69+
70+
with io.BytesIO() as stream:
71+
pack_stream(stream, dt)
72+
stream.seek(0)
73+
unpacked = unpack_stream(stream)
74+
75+
assert unpacked == dt
76+
```
77+
78+
Naive `datetime` objects (without `tzinfo`) will raise a `ValueError`.
79+
80+
### ExtType
81+
82+
Arbitrary msgpack extension types are supported via the `ExtType` dataclass:
83+
84+
```python
85+
from msgpack_streams import ExtType, pack_stream, unpack_stream
86+
import io
87+
88+
obj = ExtType(code=42, data=b"hello")
89+
90+
with io.BytesIO() as stream:
91+
pack_stream(stream, obj)
92+
stream.seek(0)
93+
unpacked = unpack_stream(stream)
94+
95+
assert unpacked == obj
96+
```
97+
98+
Use `ext_hook` to pack custom types as extensions, and `ext_hook` to decode them
99+
back:
100+
101+
```python
102+
from msgpack_streams import ExtType, pack, unpack
103+
from fmtspec import decode, encode, types # https://pypi.org/project/fmtspec/
104+
105+
class Point:
106+
EXT_CODE = 10
107+
108+
__fmt__ = {
109+
"x": types.u32,
110+
"y": types.u32,
111+
}
112+
113+
def __init__(self, x: int, y: int):
114+
self.x, self.y = x, y
115+
116+
def unknown_type_hook(obj):
117+
if isinstance(obj, Point):
118+
return ExtType(Point.EXT_CODE, encode(obj))
119+
return None # unsupported type → TypeError
120+
121+
def ext_hook(ext):
122+
if ext.code == Point.EXT_CODE:
123+
return decode(ext.data, shape=Point)
124+
return None # unknown → keep as ExtType
125+
126+
pt = Point(1, 2)
127+
packed = pack(pt, ext_hook=unknown_type_hook)
128+
result, _ = unpack(packed, ext_hook=ext_hook)
129+
assert result.x == pt.x and result.y == pt.y
130+
```
131+
132+
## API reference
133+
134+
```python
135+
def pack(obj: object, *, float32: bool = False, ext_hook: Callable[[object], ExtType | None] | None = None) -> bytes:
136+
...
137+
```
138+
139+
Serialize `obj` to a `bytes` object. Pass `float32=True` to encode `float`
140+
values as 32-bit instead of the default 64-bit.
141+
142+
Pass `ext_hook` to handle types that are not natively supported. The callback
143+
receives the unsupported object and should return an `ExtType` to pack in its
144+
place. If it returns `None` a `TypeError` is raised as normal.
145+
146+
---
147+
148+
```python
149+
def unpack(data: bytes, *, ext_hook: Callable[[ExtType], object | None] | None = None) -> tuple[object, bytes]:
150+
...
151+
```
152+
153+
Deserialize the first msgpack object from `data`. Returns `(obj, excess)` where
154+
`excess` is any unconsumed bytes that followed the object.
155+
156+
Pass `ext_hook` to convert `ExtType` values during decoding. The callback
157+
receives each `ExtType` and should return the decoded object, or `None` to leave
158+
it as an `ExtType`.
159+
160+
---
161+
162+
```python
163+
def pack_stream(stream: BinaryIO, obj: object, *, float32: bool = False, ext_hook: Callable[[object], ExtType | None] | None = None) -> None:
164+
...
165+
```
166+
167+
Serialize `obj` directly into a binary stream. Pass `float32=True` to encode
168+
`float` values as 32-bit instead of the default 64-bit.
169+
170+
Pass `ext_hook` to handle types that are not natively supported. The callback
171+
receives the unsupported object and should return an `ExtType` to pack in its
172+
place. If it returns `None` a `TypeError` is raised as normal.
173+
174+
---
175+
176+
```python
177+
def unpack_stream(stream: BinaryIO, *, ext_hook: Callable[[ExtType], object] | None = None) -> object:
178+
...
179+
```
180+
181+
Deserialize a single msgpack object from a binary stream, advancing the stream
182+
position past the consumed bytes.
183+
184+
Pass `ext_hook` to convert `ExtType` values during decoding. The callback
185+
receives each `ExtType` and should return the decoded object, or `None` to leave
186+
it as an `ExtType`.
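
Note on the README's Datetime section: it says the timestamp format (32-, 64-, or 96-bit) is chosen from the value's range and precision. The sketch below shows how that choice could work if it follows the rules in the msgpack Timestamp extension spec. It uses only the standard library and is not taken from this commit; the name `timestamp_payload` is hypothetical, and it produces only the extension payload, not the surrounding ext header.

```python
import struct
from datetime import datetime, timezone

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def timestamp_payload(dt: datetime) -> bytes:
    """Return the payload of a msgpack Timestamp extension (type code -1).

    Hypothetical helper for illustration; not part of msgpack-streams.
    """
    if dt.tzinfo is None:
        # Mirrors the README: naive datetimes are rejected.
        raise ValueError("naive datetime objects are not supported")

    delta = dt - _EPOCH
    # timedelta normalizes so that seconds and microseconds are non-negative,
    # which makes this the floor of the seconds since the epoch.
    seconds = delta.days * 86400 + delta.seconds
    nanoseconds = delta.microseconds * 1000

    if 0 <= seconds < (1 << 34):
        if nanoseconds == 0 and seconds < (1 << 32):
            # timestamp 32: unsigned 32-bit seconds only
            return struct.pack(">I", seconds)
        # timestamp 64: 30-bit nanoseconds in the high bits, 34-bit seconds below
        return struct.pack(">Q", (nanoseconds << 34) | seconds)
    # timestamp 96: unsigned 32-bit nanoseconds, then signed 64-bit seconds
    return struct.pack(">Iq", nanoseconds, seconds)
```

Under these rules a whole-second value in the near future fits in 4 bytes, sub-second precision needs 8 bytes, and values before 1970 or beyond the 34-bit seconds range fall back to 12 bytes.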

pyproject.toml

Lines changed: 27 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -1,16 +1,38 @@
11
[project]
2-
name = "msgpack-stream"
3-
version = "0.1.0"
4-
description = "Pure Python stream based implementation of msgpack"
2+
name = "msgpack-streams"
3+
version = "1.0.0"
4+
description = "Fast stream based implementation of msgpack in pure Python"
55
readme = "README.md"
6+
classifiers = [
7+
"Operating System :: OS Independent",
8+
"Typing :: Typed",
9+
"Development Status :: 5 - Production/Stable",
10+
]
11+
keywords = [
12+
"msgpack",
13+
"streams",
14+
"serialization",
15+
"streaming",
16+
"datetime",
17+
"ext",
18+
"fast",
19+
"minimal",
20+
]
621
authors = [
722
{ name = "Peter Gessler", email = "[email protected]" },
823
]
9-
requires-python = ">=3.9"
24+
requires-python = ">=3.10"
1025
dependencies = []
1126

27+
[project.urls]
28+
Homepage = "https://github.com/gesslerpd/msgpack-stream"
29+
Repository = "https://github.com/gesslerpd/msgpack-stream"
30+
Documentation = "https://github.com/gesslerpd/msgpack-stream/blob/main/README.md"
31+
Issues = "https://github.com/gesslerpd/msgpack-stream/issues"
32+
Changelog = "https://github.com/gesslerpd/msgpack-stream/releases"
33+
1234
[build-system]
13-
requires = ["uv_build>=0.8.6"]
35+
requires = ["uv-build>=0.10,<0.11"]
1436
build-backend = "uv_build"
1537

1638
[dependency-groups]
@@ -19,5 +41,4 @@ dev = [
1941
"pytest>=8.4.1",
2042
"pytest-cov>=6.2.1",
2143
"ruff>=0.12.8",
22-
"typing-extensions>=4.15.0",
2344
]

scripts/bench.py

Lines changed: 17 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -9,7 +9,7 @@
99

1010
from msgpack import packb, unpackb
1111

12-
from msgpack_stream import pack, pack_stream, unpack, unpack_stream
12+
from msgpack_streams import pack, pack_stream, unpack, unpack_stream
1313

1414
FILE = "scripts/obj.msgpack"
1515

@@ -93,14 +93,24 @@ def serialize_other(obj, mapped):
9393
t_stream = timeit.timeit("stream(mapped)", number=args.number, globals=_globals)
9494
t_other = timeit.timeit("other(mapped)", number=args.number, globals=_globals)
9595

96-
print(f"main: {t_main:.6f}s total, {t_main / args.number:.6f}s per call")
97-
print(f"stream: {t_stream:.6f}s total, {t_stream / args.number:.6f}s per call")
98-
print(f"other: {t_other:.6f}s total, {t_other / args.number:.6f}s per call")
96+
print(
97+
f"main: {t_main:.6f}s total, {t_main / args.number:.6f}s per call ({t_other / t_main:.2f}x speedup vs msgpack)"
98+
)
99+
print(
100+
f"stream: {t_stream:.6f}s total, {t_stream / args.number:.6f}s per call ({t_other / t_stream:.2f}x speedup vs msgpack)"
101+
)
102+
print(f"other (msgpack): {t_other:.6f}s total, {t_other / args.number:.6f}s per call")
99103

100104
t_main_s = timeit.timeit("main(obj, mapped)", number=args.number, globals=_serialize)
101105
t_stream_s = timeit.timeit("stream(obj, mapped)", number=args.number, globals=_serialize)
102106
t_other_s = timeit.timeit("other(obj, mapped)", number=args.number, globals=_serialize)
103107

104-
print(f"main serialize: {t_main_s:.6f}s total, {t_main_s / args.number:.6f}s per call")
105-
print(f"stream serialize: {t_stream_s:.6f}s total, {t_stream_s / args.number:.6f}s per call")
106-
print(f"other serialize: {t_other_s:.6f}s total, {t_other_s / args.number:.6f}s per call")
108+
print(
109+
f"main serialize: {t_main_s:.6f}s total, {t_main_s / args.number:.6f}s per call ({t_other_s / t_main_s:.2f}x speedup vs msgpack)"
110+
)
111+
print(
112+
f"stream serialize: {t_stream_s:.6f}s total, {t_stream_s / args.number:.6f}s per call ({t_other_s / t_stream_s:.2f}x speedup vs msgpack)"
113+
)
114+
print(
115+
f"other serialize (msgpack): {t_other_s:.6f}s total, {t_other_s / args.number:.6f}s per call"
116+
)
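
Note on `scripts/bench.py`: the speedup printed by the new lines is the msgpack baseline's total time divided by the candidate's total time, so values above 1 mean the candidate is faster. The six near-identical f-strings could be folded into one helper; the sketch below is one possible shape, with `format_result` being a hypothetical name that does not appear in this commit.

```python
from __future__ import annotations


def format_result(
    label: str,
    total: float,
    number: int,
    baseline_total: float | None = None,
) -> str:
    """Format one benchmark line in the style used by scripts/bench.py.

    Hypothetical helper: `total` is the timeit total for `number` calls,
    and `baseline_total` is the msgpack total used for the speedup ratio.
    """
    line = f"{label}: {total:.6f}s total, {total / number:.6f}s per call"
    if baseline_total is not None:
        # Speedup is baseline time over candidate time.
        line += f" ({baseline_total / total:.2f}x speedup vs msgpack)"
    return line
```

With such a helper, a call like `print(format_result("main", t_main, args.number, t_other))` would replace each multi-line `print(...)` in the hunk above.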

src/msgpack_stream/_ext.py

Lines changed: 0 additions & 11 deletions
This file was deleted.

src/msgpack_stream/_io.py

Lines changed: 0 additions & 19 deletions
This file was deleted.
Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-"""Pure Python stream based implementation of msgpack"""
+"""Fast stream based implementation of msgpack in pure Python."""
 
 from ._ext import ExtType as ExtType
 from ._io import pack as pack

src/msgpack_streams/_ext.py

Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,13 @@
1+
from __future__ import annotations
2+
3+
from dataclasses import dataclass
4+
5+
6+
@dataclass(frozen=True, slots=True)
7+
class ExtType:
8+
"""Extension type."""
9+
10+
code: int
11+
"""Type code (0-127)."""
12+
data: bytes
13+
"""Raw data (up to 2^32-1 bytes)."""
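
Note on `ExtType`: the documented limits (type code 0-127, data up to 2^32-1 bytes) match the ext format family in the msgpack spec. The sketch below shows how an ext header could be chosen from the payload size under that spec. It is a standalone illustration using only `struct`, not the code in `_io.py` (which this diff does not show), and `ext_header` is a hypothetical name.

```python
import struct

# fixext formats cover payloads of exactly 1, 2, 4, 8, or 16 bytes.
_FIXEXT = {1: 0xD4, 2: 0xD5, 4: 0xD6, 8: 0xD7, 16: 0xD8}


def ext_header(code: int, size: int) -> bytes:
    """Return the msgpack header preceding `size` bytes of ext data.

    Hypothetical helper for illustration; not part of msgpack-streams.
    """
    if not 0 <= code <= 127:
        raise ValueError("application ext type codes must be in 0-127")
    if size < 0:
        raise ValueError("size must be non-negative")
    if size in _FIXEXT:
        # fixext N: format byte, then the type code
        return struct.pack(">Bb", _FIXEXT[size], code)
    if size <= 0xFF:
        # ext 8: format byte, 8-bit length, type code
        return struct.pack(">BBb", 0xC7, size, code)
    if size <= 0xFFFF:
        # ext 16: format byte, 16-bit length, type code
        return struct.pack(">BHb", 0xC8, size, code)
    if size <= 0xFFFFFFFF:
        # ext 32: format byte, 32-bit length, type code
        return struct.pack(">BIb", 0xC9, size, code)
    raise ValueError("ext data is limited to 2^32-1 bytes")
```

For the README's `ExtType(code=42, data=b"hello")` example, the 5-byte payload does not match a fixext size, so this scheme would use the ext 8 form.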
