Hacker News | bdon's comments

Thanks for giving it a try! If you're going to revisit this, we could improve the docs around setting CORS configuration on DigitalOcean: https://docs.protomaps.com/pmtiles/cloud-storage#digitalocea...


You can test this claim directly against an AWS S3 bucket.

First 100KB of a 100GB+ file:

curl -H "Range: bytes=0-100000" https://overturemaps-tiles-us-west-2-beta.s3.amazonaws.com/2... --output tmp -w "%{time_total}"

First 100KB at the 100GB mark:

curl -H "Range: bytes=100000000000-100000100000" https://overturemaps-tiles-us-west-2-beta.s3.amazonaws.com/2... --output tmp -w "%{time_total}"
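The same comparison can be scripted. This is only a minimal sketch using the Python standard library: `range_header` and `time_range_request` are hypothetical helper names, and the archive URL is left as a parameter because it is truncated above. Note that HTTP byte ranges are inclusive, so `bytes=0-100000` covers 100,001 bytes.

```python
import time
import urllib.request


def range_header(offset: int, length: int) -> str:
    """Build a Range header value covering `length` bytes starting at `offset`.

    HTTP byte ranges are inclusive on both ends, hence the `- 1`.
    """
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    return f"bytes={offset}-{offset + length - 1}"


def time_range_request(url: str, offset: int, length: int) -> float:
    """Fetch one byte range and return the elapsed wall-clock time in seconds."""
    request = urllib.request.Request(
        url, headers={"Range": range_header(offset, length)}
    )
    start = time.perf_counter()
    with urllib.request.urlopen(request) as response:
        response.read()  # drain the body so the transfer is included in the timing
    return time.perf_counter() - start
```

Calling `time_range_request(url, 0, 100_001)` and `time_range_request(url, 100_000_000_000, 100_001)` against the same archive mirrors the two curl commands. A single run is noisy, so repeat each call a few times before comparing.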



The latency for small files and ranges of large files is pretty similar on most storage platforms, but there are some exceptions like Cloudflare R2.

The main reason PMTiles is one file and not two or more files is that it enables atomic updates in-place (which every mature storage platform supports) as well as ETag content change detection in downstream caches. All of the server and serverless implementations at http://github.com/protomaps support this now for AWS, S3-compatible storage, Google Cloud, and Azure.
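As a rough illustration of the ETag point, here is a sketch of how a downstream cache could revalidate a single archive with a standard conditional request. The helper names are hypothetical and this is not how any particular Protomaps implementation does it. It only shows the generic HTTP mechanism (`If-None-Match` answered by `304 Not Modified`).

```python
import urllib.error
import urllib.request


def revalidation_request(url: str, cached_etag: str) -> urllib.request.Request:
    """Build a conditional GET that asks the server to skip the body if unchanged."""
    return urllib.request.Request(url, headers={"If-None-Match": cached_etag})


def archive_changed(url: str, cached_etag: str) -> bool:
    """Return True if the object behind `url` no longer matches `cached_etag`."""
    try:
        with urllib.request.urlopen(revalidation_request(url, cached_etag)):
            return True  # 2xx response: the server sent new content
    except urllib.error.HTTPError as err:
        if err.code == 304:
            return False  # Not Modified: the cached copy is still current
        raise
```

With one file there is one ETag to track, so a cache can tell whether the whole tileset changed with a single check like this.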


Now I'm curious, what causes the latency for range requests with R2?


I don't have any insight into this other than observing how their storage system works, but here are some scripts I made last year to test:

https://github.com/bdon/cloudflare-r2-latency


Range requests mean work and logic. Getting a file requires no logic.

Also, I'm pretty sure range requests are going to be difficult to cache. That implies going to origin every request which is bad.



1) The Protomaps schema is mostly a re-implementation of the Tilezen project https://tilezen.readthedocs.io/en/latest/ which is a Linux Foundation project. OpenMapTiles, which OpenFreeMap uses, while open source, does not have a license that encourages derivative works or enables distributing styles under a standard FOSS software license (https://www.npmjs.com/package/protomaps-themes-base). That's also one motivation for developing Shortbread, which is at this point less developed than Tilezen.

2) The file format (PMTiles) addresses a different audience than either MBTiles or Btrfs images. Both of those require administering a server for tiles, while PMTiles requires static blob storage and nothing else. You do have the option of using a server like MBTiles/btrfs which ought to be comparable in latency, and that's documented here: https://docs.protomaps.com/deploy/ as well as Lambda, Cloudflare Workers, Google Cloud Run and Azure Serverless functions.

3) There are no existing styles for MapLibre GL that work off the Tilezen layer, generalization and tagging scheme, so we need to develop one style, with multiple themes.


> OpenMapTiles, which OpenFreeMap uses, while open source, does not have a license that encourages derivative works

Can you elaborate on this? I'm building a derivative of OMT and am quite worried now ^^


OMT uses a CC-BY license: https://creativecommons.org/faq/#can-i-apply-a-creative-comm... (edit: link)

This means that software that implements OMT, even if written from scratch, cannot be re-used by other FOSS projects (Apache, BSD, GPL, AGPL, other software in the OpenStreetMap ecosystem, etc) without affecting the license.

Ideally for Protomaps it should be possible to re-use just one portion - like only the label layer with your own layers from other sources, or even bundle it as a JS dependency in another open source project - without affecting the license of downstream projects.


Are you talking about just the OpenMapTiles spec, or some adjacent software? I'm certain that you can build software to some specification without ever agreeing to the spec text's licence, and that a CC-BY licenced spec doesn't limit any implementing system's licence.

Even so, CC-BY is permissive and you could include CC-BY content in a, say, GPL project. You just need to include both licences.


> I'm certain that you can build software to some specification without ever agreeing to the spec text's licence

That is exactly the opposite of OMT's copyright interpretation: https://github.com/openmaptiles/openmaptiles?tab=readme-ov-f...

> You just need to include both licences.

That is the definition of license incompatibility as described in the Creative Commons documentation above. The license is open source and a good fit if you are running a paid map SaaS or free service as an end product, but is not compatible with the open source ecosystem as a building block.


> That is exactly the opposite of OMT's copyright interpretation.

They are entitled to their opinion, but I don't consider a database to be a derivative work of a schema (this should be clear, since databases of facts aren't copyrightable anyway), same for software (it's at least fair use in the US, see Google v. Oracle). This goes back to the debate on copyrightability of APIs, where a decision for copyrightability would ruin swathes of the software industry and much of free software.

Though I understand if you'd rather avoid using projects by people with weird legal opinions.

> That is the definition of license incompatibility as described in the Creative Commons documentation above.

Where do you see that? All I found was a statement that the CC share-alike licences can be converted to GPL, but no word on non-SA licences:

> Version 4.0 of CC’s Attribution-ShareAlike (BY-SA) license is one-way compatible with the GNU General Public License version 3.0 (GPLv3).

The FSF says that CC-BY is compatible with the GPL, and I believe this extends to all reasonable FLOSS licences: https://www.gnu.org/licenses/license-list.html#ccby

It's not uncommon for a piece of code to have multiple licences in a big project. One for licencing to that project, and another for sublicencing to the end user. Proprietary apps that include FLOSS code do this all the time.


> Where do you see that?

"Additionally, our licenses are currently not compatible with the major software licenses, so it would be difficult to integrate CC-licensed work with other free software. Existing software licenses were designed specifically for use with software and offer a similar set of rights to the Creative Commons licenses." https://creativecommons.org/faq/

> Though I understand if you'd rather avoid using projects by people with weird legal opinions.

The scale of the Protomaps project is small (I'm the only full-time developer) and I don't have the resources to interpret novel copyright situations. I think it's best for the ecosystem to abide by the license terms stated by the open source developer, instead of challenging their validity.


I'm the developer of Protomaps, to summarize:

The latency you see on https://pmtiles.io/?url=https%3A%2F%2Fdata.source.coop%2Fpro... is representative of how PMTiles works on AWS S3, coming from the us-west-2 region. It will be reasonable to load for those in the western US and likely quite slow from Europe or Oceania.

If you want to make a direct comparison of Protomaps to OpenFreeMap, you need to compare serving OpenFreeMap with NGINX from btrfs on disk, to running `pmtiles serve` on a `.pmtiles` file on disk, as described here: https://docs.protomaps.com/deploy/server

The OpenFreeMap page for me (in Taiwan) takes 1-2 seconds per tile, which is more than double the load time for the PMTiles in us-west-2 example linked above.

The best solution to get latency competitive with commercial providers, for a global audience, is to cache tiles at CDN edge locations. I describe automated ways to do that with Cloudflare Workers and AWS Lambda + Cloudfront here:

https://docs.protomaps.com/deploy/cloudflare https://docs.protomaps.com/deploy/aws

I'm also experimenting with http://tigrisdata.com which lets you geo-distribute a static storage bucket like in this example: https://pmtiles.io/?url=https%3A%2F%2Fdemo-bucket.protomaps....


No, the ping is 150 ms to us-west-2, and the tiles load in like 5 seconds on a cold start. Of course we cannot test cold start on HN comments because HN is the definition of hot :-)

I can imagine workers to be fast, it's the range requests which are super slow. It's also outside of your control, it depends on how Cloudflare and S3 handle range requests to 90 GB files.

I think if you could make PMTiles split into files <10 MB, it'd be perfect with range requests.


I agree, there are tradeoffs to using static storage - the intended audience for PMTiles is those that prefer using static sites instead of administering a Linux server.

I would be interested to see a comparison of Btrfs + nginx serving latency, vs `pmtiles serve` from https://github.com/protomaps/go-pmtiles on a PMTiles archive on disk. That would be a more direct comparison.

I think there's potentially some interesting use case for tiles in Btrfs volumes and incremental updates, which I haven't tackled in PMTiles yet!


I think both solutions could easily saturate a 1 Gbps line. I benchmarked Btrfs + nginx and it could do 30 Gbps, which doesn't really make a difference if your server is 1 Gbps only.

The fact that there is no service running was more important for me, mostly for security and bugs. I had so many problems with various tile servers in production: they needed daily restarting, they had memory leaks, etc.

Basically I wanted to go nginx-only for security and to avoid tile server bugs.


I see, I think that's a good approach to enable serving with stock nginx as well as for companies that are built on Nginx or a plain HTTP serving stack already.

For PMTiles the module is loadable directly as a Caddy plugin (https://docs.protomaps.com/deploy/server#caddyfile) which I prefer to nginx for security and bugs (and automatic SSL), and also enables serving PMTiles from disk or a remote storage bucket without a separate service running.


Yes, PMTiles with the Caddy plugin is very similar to nginx + Btrfs.

At that point, the difference between the two projects is mostly which schema is being used.


You both specify the filesystem to be Btrfs. Is there any advantage in this case against ZFS, ext4, XFS... or is it just a practical choice?


Yes, small files can fit in the metadata, which makes a super big difference when you have 300 million files of 405 bytes each. Also, inode handling is way better compared to ext4.
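To put rough numbers on that, here is a back-of-the-envelope sketch. The file count and size are the figures quoted above. The 4 KiB block size, and the idea that a filesystem without inlining spends one full block per small file, are assumptions for illustration and not measurements of the OpenFreeMap image.

```python
FILE_COUNT = 300_000_000   # files, as quoted in the comment
FILE_SIZE = 405            # bytes per file, as quoted in the comment
BLOCK_SIZE = 4096          # bytes; ASSUMED block size for a non-inlining filesystem


def payload_bytes(count: int, size: int) -> int:
    """Bytes of actual file content."""
    return count * size


def block_allocated_bytes(count: int, block_size: int) -> int:
    """Bytes allocated if every small file occupies one full data block."""
    return count * block_size


payload = payload_bytes(FILE_COUNT, FILE_SIZE)              # 121.5 GB of content
allocated = block_allocated_bytes(FILE_COUNT, BLOCK_SIZE)   # about 1.23 TB of blocks
overhead_ratio = allocated / payload                        # roughly 10x
```

Under those assumptions the content is about 121.5 GB but would occupy about 1.23 TB in whole blocks, which is the gap that storing small files alongside the metadata avoids.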


Oh neat! Thanks! I'm checking the docs and I guess you're referring to "Inline files". I don't have much knowledge about btrfs, so I didn't know...

The nearest thing that ZFS has (and it's not really similar, just related) is special VDEVs.

When you have an array of disks (usually "slow", like regular HDDs) you can attach to it another array of disks (usually very fast, like NVMe) where you can store metadata (file information) and optionally small files up to a size that you can define.

So for example you have a 50TB (or 500, who knows) array of SATA disks, and a small but superfast array of NVMe drives. Let's say 128GB. Or 512, or 1TB, depending on what you want to do.

File metadata is saved there, so doing a find, ls, tree... operation is now very fast. And if you save, for example, all files smaller than 32KB there (it will depend on your needs, also) all the small file operations will be way faster.


Yes, the key thing about Btrfs is how it handles inodes and how it can store data with the metadata. In the OpenFreeMap image, 60% of the files are stored with the metadata, essentially taking up no space.


I would also like to see this comparison. And for good measure it’d be great to also include Versatiles in this comparison.


If you would like to run this comparison for tileservers reading from disk, I wrote a small tool to simulate traffic based on access logs of OpenStreetMap tiles:

https://github.com/bdon/TileSiege


Here's an example of the entire planet as a PMTiles:

https://pmtiles.io/?url=https%3A%2F%2Fdata.source.coop%2Fpro...

This is zoom 0-15, or 1,431,655,765 addressed tiles. So it is designed for this scale - for a production internet site you can add a lambda, server or CDN as an additional layer for lower latency: https://docs.protomaps.com/deploy/
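The tile count follows from the pyramid structure: each zoom level z has 4^z tiles, so zoom 0-15 is the sum of 4^z for z from 0 to 15. A quick check (the function names are just for illustration):

```python
def tiles_at_zoom(zoom: int) -> int:
    """Number of addressable tiles at one zoom level: a 2^z by 2^z grid."""
    return 4 ** zoom


def total_tiles(max_zoom: int) -> int:
    """Number of addressable tiles across zoom 0..max_zoom inclusive."""
    return sum(tiles_at_zoom(z) for z in range(max_zoom + 1))


planet_tiles = total_tiles(15)  # 1,431,655,765, matching the figure above
```

This counts addressable tile positions, which is the figure quoted above.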


To mitigate the "denial of wallet" attack on S3 there are a few options:

AWS Lambda: https://docs.protomaps.com/deploy/aws
Lightweight tile server: https://docs.protomaps.com/pmtiles/cli#serve


The replacement for the UI is to use the "pmtiles extract" command line tool:

https://docs.protomaps.com/guide/getting-started#_3-extract-...

