Cloud-optimized, single-file archive format for pyramids of map tiles

Protomaps

Last update: Jan 4, 2023

Overview

PMTiles

PMTiles is a single-file archive format for tiled data. A PMTiles archive can be hosted on a commodity storage platform such as S3, and enables low-cost, zero-maintenance map applications that are "serverless" - free of a custom tile backend or third party provider.

Demo - watch your network request log

How To Use

Python library: pip install pmtiles

pmtiles-convert TILES.mbtiles TILES.pmtiles
pmtiles-convert TILES.pmtiles DIRECTORY
pmtiles-show TILES.pmtiles   # see info about a PMTiles directory
pmtiles-serve TILES.pmtiles  # start an HTTP server that decodes PMTiles into traditional Z/X/Y paths

See https://github.com/protomaps/PMTiles/tree/master/python/bin for library usage

JavaScript usage:

Include the script:

<script src="https://unpkg.com/pmtiles@VERSION/pmtiles.js"></script>

Example of a raster PMTiles archive decoded and displayed in Leaflet:

const p = new pmtiles.PMTiles('osm_carto.pmtiles',{allow_200:true})
p.leafletLayer({attribution:'© <a href="https://openstreetmap.org">OpenStreetMap</a> contributors'}).addTo(map)

Specification

A detailed specification is forthcoming. PMTiles is a binary serialization format designed for two main access patterns: over the network, via HTTP 1.1 Byte Serving (Range: requests), or via memory-mapped files on disk.
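As a rough illustration of the network access pattern, the sketch below builds (without sending) an HTTP request for a byte range, here the reserved first 512,000 bytes described under Details. The helper name `range_request` and the example URL are hypothetical and not part of any PMTiles library.

```python
import urllib.request


def range_request(url, offset, length):
    """Build (but do not send) a GET request for `length` bytes starting at `offset`."""
    # HTTP byte ranges are inclusive on both ends, hence the "- 1".
    last = offset + length - 1
    return urllib.request.Request(url, headers={"Range": f"bytes={offset}-{last}"})


# Hypothetical URL, for illustration only.
req = range_request("https://example.com/osm_carto.pmtiles", 0, 512000)
print(req.get_header("Range"))  # bytes=0-511999
```

Passing the request to `urllib.request.urlopen` would send it; a server that supports byte serving normally answers with `206 Partial Content`.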

Design considerations

Directories are recursive, with a maximum of 21,845 entries per directory.
- 21,845 is the total number of tiles in a pyramid with 8 levels (zooms 0 through 7): 1+4+16+64+256+1024+4096+16384
Deduplication of tile data is handled by multiple entries pointing to the same offset in the archive.
The order of tile data in the archive is unspecified; an optimized implementation should arrange tiles on a 2D space-filling curve.
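The 21,845 figure can be checked with a few lines of arithmetic. This is only a verification of the sum quoted above; the variable names are illustrative.

```python
# A full tile pyramid has 4**z tiles at zoom level z.
LEVELS = 8  # zoom levels 0 through 7

tiles_per_level = [4 ** z for z in range(LEVELS)]
total = sum(tiles_per_level)

print(tiles_per_level)  # [1, 4, 16, 64, 256, 1024, 4096, 16384]
print(total)            # 21845
```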

Details

The first 512,000 bytes of a PMTiles archive are reserved, and contain the headers as well as a root directory.
All integer values are little-endian.
The headers begin with a 2-byte magic number, "PM"
2 bytes: the PMTiles specification version, right now always 1.
4 bytes: the length of metadata (M bytes)
2 bytes: the number of entries in the root directory (N)
M bytes: the metadata, by convention a JSON object.
N * 17 bytes: the root directory.
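A minimal sketch of reading this layout with Python's `struct` module follows. It assumes the fields appear back to back in the order listed, with no padding, and that the metadata is JSON as the convention suggests. `parse_header` is a hypothetical helper, not the API of the `pmtiles` package, and the sample bytes are constructed in memory rather than taken from a real archive.

```python
import json
import struct

# "<" = little-endian, no padding: 2-byte magic, 2-byte version,
# 4-byte metadata length (M), 2-byte root entry count (N).
HEADER = struct.Struct("<2sHIH")
ENTRY_SIZE = 17


def parse_header(buf):
    """Return (version, metadata, raw root directory bytes) from the start of an archive."""
    magic, version, meta_len, n_entries = HEADER.unpack_from(buf, 0)
    if magic != b"PM":
        raise ValueError("not a PMTiles archive")
    meta_start = HEADER.size            # 10 bytes of fixed-size fields
    dir_start = meta_start + meta_len
    # Assumption: metadata is a JSON object, as is conventional.
    metadata = json.loads(buf[meta_start:dir_start])
    root_dir = buf[dir_start:dir_start + n_entries * ENTRY_SIZE]
    return version, metadata, root_dir


# Build a tiny synthetic header: one (all-zero) root directory entry.
meta = json.dumps({"name": "demo"}).encode("utf-8")
sample = HEADER.pack(b"PM", 1, len(meta), 1) + meta + bytes(ENTRY_SIZE)

version, metadata, root_dir = parse_header(sample)
print(version, metadata, len(root_dir))
```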

Directory structure

A directory is a sequence of 17 byte entries. An entry consists of:

1 byte: the zoom level (Z) of the entry, with the top bit set to 1 instead of 0 to indicate the data is a child directory, not tile content.
3 bytes: the X (column) of the entry.
3 bytes: the Y (row) of the entry.
6 bytes: the offset of where the data begins in the archive.
4 bytes: the length of the data.
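The entry layout above can be decoded as sketched below. This assumes the fields are stored in the order listed and are little-endian like the rest of the archive. `parse_entry` and the `Entry` tuple are hypothetical names used for illustration, and the sample entry is synthetic.

```python
from collections import namedtuple

Entry = namedtuple("Entry", "is_dir z x y offset length")


def parse_entry(raw):
    """Decode one 17-byte directory entry."""
    if len(raw) != 17:
        raise ValueError("a directory entry is exactly 17 bytes")
    is_dir = bool(raw[0] & 0x80)  # top bit set: data is a child directory
    z = raw[0] & 0x7F             # remaining 7 bits: zoom level
    x = int.from_bytes(raw[1:4], "little")
    y = int.from_bytes(raw[4:7], "little")
    offset = int.from_bytes(raw[7:13], "little")
    length = int.from_bytes(raw[13:17], "little")
    return Entry(is_dir, z, x, y, offset, length)


# Synthetic entry: a child directory at z=7, x=5, y=9.
raw = (
    bytes([0x80 | 7])
    + (5).to_bytes(3, "little")
    + (9).to_bytes(3, "little")
    + (512000).to_bytes(6, "little")
    + (1234).to_bytes(4, "little")
)
print(parse_entry(raw))
```

A reader would walk a directory by slicing it into consecutive 17-byte chunks and decoding each one this way.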

License

The reference implementations of PMTiles are published under the BSD 3-Clause License. The PMTiles specification itself is public domain, or under a CC0 license where applicable.

Comments

Alternative dir structure for a more compact storage and proximity clustering
I would like to propose some changes to the directory structure, but these might be totally irrelevant due to my misunderstanding.

The current directory entry is fixed at 17 bytes, stores x and y as individual values, and requires x,y sorting order. I think all of those may benefit from a bit of restructuring:

Combine x and y values into a single interleaved value (z-curve). This will make tiles more locale-clustered, possibly reducing the number of range requests (web).

Make the combined x,y value size depend on the zoom level, rounding to the nearest byte boundary. zooms 0..3 -- 1 byte, 4..7 -- 2 bytes, etc. All leaf nodes are required to be the same zoom, making the directory entries the same size only inside a single directory block, rather than everywhere.

For the root dir, it could still be 17bytes (although it seems 6 bytes is a bit too much -- zoom 16..19 only requires 5 bytes, so unless pmtiles wants to support more... and even that can be made flex)
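To make the proposal concrete, here is a sketch of the bit interleaving (a Z-order or Morton code) together with the byte widths the comment lists. Placing x in the even bits is an arbitrary choice made for this illustration, and both function names are hypothetical; none of this is part of the format described above.

```python
def interleave(x, y, z):
    """Interleave the low z bits of x and y into one Z-order (Morton) code."""
    code = 0
    for bit in range(z):
        code |= ((x >> bit) & 1) << (2 * bit)      # x takes the even bits
        code |= ((y >> bit) & 1) << (2 * bit + 1)  # y takes the odd bits
    return code


def key_width(z):
    """Bytes per key using the grouping in the comment: zooms 0..3 -> 1, 4..7 -> 2, ..."""
    return z // 4 + 1


# The four tiles at zoom 1 map to consecutive codes.
print([interleave(x, y, 1) for y in range(2) for x in range(2)])  # [0, 1, 2, 3]
print(key_width(3), key_width(4), key_width(19))                  # 1 2 5
```

Tiles that are close together in x and y tend to receive nearby codes, which is the locality property the proposal relies on.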

v3
opened by nyurik 41
Using python Reader object with cloud storage

It seems like go-pmtiles is set up to read from cloud storage (S3, Google Storage etc), but the python version here is not. Is that correct? How hard would it be to add that? I use Google, but an S3 example would be great.

If the Reader API is robust, I can add it into a flask/fastapi endpoint pretty trivially, implement my own authentication etc, but it definitely seems like being able to read the tiles from a file in Storage (rather than locally) is the really powerful use case here.

opened by fscottfoti 12
js: slow to load due to no network parallelism
When using pmtiles.js, network performance is a big bottleneck.

This is visible in the leaflet raster demo but mitigated by a very fast backend (read from a small pmtiles file), so response-time is ~15ms per tile. Still you can see in the chrome dev tools that each fetch request is "stalled" (queued) until the previous one is answered.

My setup responds in ~150ms per tile, so time from scrolling to display is several seconds.

How can this be optimised?

Reducing number of requests using Multipart Ranges does not seem possible without adding complexity to the leaflet integration.

... but according to chrome we're supposed to have up to 6 concurrent TCP connections per origin?
opened by eddy-geek 11
Compression (spec v3)
Options:

Gzip compression - requires a library like pako, may be expensive

Cap'n proto packing: https://capnproto.org/encoding.html

Protobuf varints: https://developers.google.com/protocol-buffers/docs/encoding

Use cases:

Dense tile pyramids

Sparse pyramids (tippecanoe output)

v3
opened by bdon 9
Cordova support?

I'm using Cordova/Ionic to serve offline tiles. I'm doing this by leveraging sqlite-ext to open an mbtiles file on the device and serve the tiles from there. Is there a way to use pmtiles in Cordova in a similar manner? From the documentation it seems that there's a need to use a memory-mapped file or something similar. Have you experimented with this?

opened by HarelM 9
writer: Leaf directories: Find best base zoom
... to avoid extra indirection for as many tiles as we can

Previously, we hardcoded "base_zoom = 7", which is only optimal for a whole-world map. Instead, "7" is replaced with the minimum zoom that fits all tiles.

I only tested on one example (Bugianen, 152742 tiles) and got the expected base zoom of 14.

```sh
sqlite3 Bugianen.mbtiles "SELECT zoom_level, COUNT(*) FROM tiles GROUP BY zoom_level"
12|448
13|1792
14|7168
15|28672
16|114662
PYTHONPATH=$PWD bin/pmtiles-convert Bugianen.mbtiles optim_Bugianen.pmtiles
Num tiles: 152742
Num unique tiles: 151947
Num leaves: 7168
```

So, this needs more testing.
opened by eddy-geek 8

pmtiles-convert Attribute Error

I'm getting a python AttributeError when converting a .mbtiles file to .pmtiles

pmtiles-convert postcode.mbtiles postcode2.pmtiles
('compression:', 'disabled')
Traceback (most recent call last):
  File "/home/malcolm/.local/bin/pmtiles-convert", line 38, in <module>
    mbtiles_to_pmtiles(args.input, args.output, args.maxzoom, args.gzip)
  File "/home/malcolm/.local/lib/python2.7/site-packages/pmtiles/convert.py", line 40, in mbtiles_to_pmtiles
    writer.write_tile(row[0], row[1], flipped, force_compress(row[3], gzip))
  File "/home/malcolm/.local/lib/python2.7/site-packages/pmtiles/convert.py", line 15, in force_compress
    return gzip.decompress(data)
AttributeError: 'module' object has no attribute 'decompress'
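The traceback is consistent with running under Python 2.7, where the `gzip` module has no `decompress` function (it was added in Python 3.2). The sketch below shows one possible fallback using `zlib`. It is not the project's fix, `gunzip` is a hypothetical helper name, and the demo lines at the bottom assume Python 3.

```python
import gzip
import zlib


def gunzip(data):
    """Decompress gzip-framed bytes, with a zlib fallback for old interpreters."""
    if hasattr(gzip, "decompress"):
        return gzip.decompress(data)
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    return zlib.decompress(data, 16 + zlib.MAX_WBITS)


# Demo (Python 3): round-trip a small payload.
payload = gzip.compress(b"tile bytes")
print(gunzip(payload))  # b'tile bytes'
```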

postcode.zip

opened by mem48 7

Consumes all memory on large input

I'm trying to convert a 66 GB file and it quickly consumes 32GB ram then stops.

Is there a solution to converting large files besides launching a super huge VM to handle conversions?

opened by j 7
Unable to display map on browsers

Hi! I'm new to this project. I tried running some html files in the example section in Safari/Chrome but maps were not shown. I got the following errors: I couldn't figure out why fetch is aborted. Is there something I'm missing here?

Thank you.

opened by xinyuluo 7
Split a pmtiles file

Some hosts (like github pages) have maximum file sizes. Alternatives like https://github.com/phiresky/sql.js-httpvfs provide a way to split the tile archive until it is less than that max file size (https://github.com/phiresky/sql.js-httpvfs/blob/master/create_db.sh). Would it be possible for the pmtiles reader and writer to optionally support splitting a pmtiles file?

opened by msbarry 6
canvas tile display seems to be transparent :/

Hi, I'm using our own PMTiles tileset server. The test URL is https://tilesets.urbanease.io/cadastre/64/64102/without_protobuf.pmtiles and the location to look at is Bayonne, France, latlng=[43.492949,-1.474841].

In the viewer everything seems to be OK, but the Leaflet preview doesn't work. I'm using React and Leaflet with the npm protomaps package, version 1.19.0.

My code is simple:

const map = useMap();
const url = 'https://tilesets.urbanease.io/cadastre/64/64102/without_protobuf.pmtiles';
const layer = protomaps.leafletLayer({ url: url, id: 'cadastral' });
layer.addTo(map);

The canvases are created but seem to be transparent. The colors in the paint_rules layer are good and the opacity is OK.

I don't understand where the problem is. Is it a bad PMTiles file, or an incompatibility with Leaflet? Thanks for your help.

opened by DavidDvpt 5
More detailed Specification

Hey :)

First of all. Thank you for the great format. It is imho the perfect solution to a problem that will become more and more prevalent in the near future.

I tried to implement my own reader / writer in rust and found myself looking at the first-party implementations quite a lot, because I stumbled upon things that are not really clear in the specification.

I think the project would benefit a lot from having a WAY MORE detailed specification (at least more than something I could print on a single page) and I was wondering whether a contribution on my part would be welcomed.

I took the opportunity and wrote a really rough draft, of the section on the header. Just to give you an impression of what I would envision and get your feedback.

PS: It's also totally fine if you do not think the specification needs to be improved.

opened by DerZade 1
Refactor JS client to use streams instead of ArrayBuffers
This will require a major version bump, as the Source API should return a ReadableStream. This is made possible by Lambda now supporting Node 18.

Should also include #90 API changes as well to use If-Match.

[ ] fflate readablestream

[ ] varint readable stream
opened by bdon 1
ETag problems in JavaScript client + passing through other metadata

If the new resource has a new ETag but is shorter than the previous one, the server might return 416 Range Not Satisfiable, which does not have an accompanying ETag. This will invalidate the header, but present a bunch of errors in the console.

We should unify on using the If-Match header in sending the requests, which will return 412 Precondition Failed in the mismatch case, which is designed for this specific situation.
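The idea can be sketched outside the JavaScript client as well. The Python below builds a ranged request that carries the ETag seen earlier in an `If-Match` header and treats a 412 response as "the archive changed". The function names and URL are hypothetical, and whether a given storage backend honors `If-Match` on ranged GETs is an assumption to verify per provider.

```python
import urllib.error
import urllib.request


def conditional_range_request(url, offset, length, etag):
    """Build a ranged GET that should only succeed if the resource still has `etag`."""
    headers = {
        "Range": f"bytes={offset}-{offset + length - 1}",
        "If-Match": etag,
    }
    return urllib.request.Request(url, headers=headers)


def fetch_or_none(req):
    """Send the request; return the body, or None if the ETag no longer matches."""
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.read()
    except urllib.error.HTTPError as err:
        if err.code == 412:  # Precondition Failed: archive was replaced
            return None
        raise


# Hypothetical URL and ETag; the request is built here but not sent.
req = conditional_range_request("https://example.com/a.pmtiles", 512000, 4096, '"abc123"')
print(req.header_items())
```

On a `None` result, a client would discard its cached header and directories and re-read the archive from the start.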
bug

opened by bdon 0
Inspector app improvements
[x] SVG tile previews should be zoomable

[x] Should be able to drill down into leaf directories

[x] should be able to preview vector tiles in leaflet

[ ] SVG should be feature-level inspectable

[ ] map preview should be feature-level inspectable

[ ] see directory Len and header-level metadata

[ ] inspect SVG with mismatched extents

[ ] correctly read tile and map hash states
opened by bdon 1

Releases(v0.0.0-alpha)

Owner

Protomaps

the map foundry

GitHub https://protomaps.github.io/PMTiles/examples/leaflet.html
