Slow tile reads
Reading COG data always takes more than 1 s, regardless of the input dataset. At first I assumed this was just the cost of reprojecting a large raster dataset (most images used by this library are reprojected on the fly from equirectangular to web mercator). However, the slow speeds persist even with the test fixtures bundled with this library, which are windowed and downscaled to under 100 kB per file, so neither data volume nor network transfer appears to be the cause.
Digging a bit, this could be related to previous issues with GDAL reprojection encountered by rio-tiler (https://github.com/cogeotiff/rio-tiler/issues/346). The slow reads identified by @kylebarron, relating to network loading of data, might also be related (https://rasterio.groups.io/g/main/topic/72528118#468). There appears to have been a problem with rasterio wheels with GDAL < 3.3 (although the problem might still exist in some reduced form).
It's unclear how to solve this at present.
Environment
We are running rio-tiler v3, rasterio 1.2.10 (built as a wheel), and GDAL 3.4.0.
A subset of relevant environment variables:
GDAL_CACHEMAX=200
GDAL_DISABLE_READDIR_ON_OPEN=EMPTY_DIR
GDAL_HTTP_MULTIPLEX=YES
GDAL_HTTP_MERGE_CONSECUTIVE_RANGES=YES
CPL_VSIL_CURL_ALLOWED_EXTENSIONS=".tif,.TIF,.tiff"
VSI_CACHE=TRUE
VSI_CACHE_SIZE=5000000
GDAL_HTTP_VERSION=2
PROJ_NETWORK=OFF
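For anyone trying to reproduce this, the settings above can be applied from Python before rasterio is imported. This is only a sketch: `GDAL_SETTINGS` and `apply_gdal_settings` are hypothetical names, and it assumes the quotes around the extension list are shell quoting rather than part of the value.

```python
import os

# Values copied from the environment listing above.
# Assumption: the quotes around the extension list are shell quoting only.
GDAL_SETTINGS = {
    "GDAL_CACHEMAX": "200",
    "GDAL_DISABLE_READDIR_ON_OPEN": "EMPTY_DIR",
    "GDAL_HTTP_MULTIPLEX": "YES",
    "GDAL_HTTP_MERGE_CONSECUTIVE_RANGES": "YES",
    "CPL_VSIL_CURL_ALLOWED_EXTENSIONS": ".tif,.TIF,.tiff",
    "VSI_CACHE": "TRUE",
    "VSI_CACHE_SIZE": "5000000",
    "GDAL_HTTP_VERSION": "2",
    "PROJ_NETWORK": "OFF",
}


def apply_gdal_settings(settings, environ=None):
    """Copy settings into the environment, keeping values that are already set."""
    environ = os.environ if environ is None else environ
    for key, value in settings.items():
        environ.setdefault(key, value)
    return environ


# Call this before importing rasterio so GDAL and PROJ see the values.
apply_gdal_settings(GDAL_SETTINGS)
```

Using `setdefault` means values already exported in the shell or container take precedence, which makes it easier to override a single option while testing.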
Profiling
We profiled reading each dataset as a warped VRT; roughly:
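import rasterio
from rasterio.vrt import WarpedVRT

# `datasets` (paths to the COGs) and `MARS_MERCATOR` (target CRS) are defined elsewhere.
for file in datasets:
    with rasterio.open(file) as src:
        with WarpedVRT(src, crs=MARS_MERCATOR) as vrt:
            vrt.read(1)  # read band 1 through the warped VRT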
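A cumulative-time profile of a loop like the one above can be collected with the standard library's `cProfile` and `pstats`. The sketch below is not the exact script used here: `profile_cumulative` is a hypothetical helper, and the body of `open_all_vrts` is a stand-in workload that would be replaced by the rasterio loop.

```python
import cProfile
import io
import pstats


def open_all_vrts():
    # Stand-in workload (hypothetical): replace this body with the
    # rasterio.open / WarpedVRT loop being measured.
    return sum(i * i for i in range(10_000))


def profile_cumulative(func, limit=25):
    """Run func under cProfile and return a report sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(limit)
    return buffer.getvalue()


report = profile_cumulative(open_all_vrts)
print(report)
```

Sorting by cumulative time puts the call chains that dominate the runtime at the top, which is how the CRS-related calls stand out in the output below.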
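The results suggest that WarpedVRT loading itself might not be the problem: the read accounts for only about 0.010 s of the 0.929 s total. Instead, most of the time is spent parsing projection information, in the to_epsg (0.355 s) and to_authority (0.305 s) calls on the CRS. Perhaps pre-loading projections could speed this up...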
12139 function calls (11035 primitive calls) in 0.929 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.929 0.929 {built-in method builtins.exec}
1 0.000 0.000 0.929 0.929 <string>:1(<module>)
1 0.192 0.192 0.929 0.929 load-datasets:14(open_all_vrts)
8 0.000 0.000 0.356 0.044 _collections_abc.py:760(get)
8 0.000 0.000 0.356 0.044 crs.py:73(__getitem__)
8 0.000 0.000 0.356 0.044 crs.py:195(data)
8 0.000 0.000 0.356 0.044 crs.py:170(to_dict)
8 0.000 0.000 0.355 0.044 crs.py:146(to_epsg)
8 0.354 0.044 0.355 0.044 {method 'to_epsg' of 'rasterio._crs._CRS' objects}
8 0.000 0.000 0.308 0.038 crs.py:283(to_string)
8 0.000 0.000 0.305 0.038 crs.py:158(to_authority)
8 0.305 0.038 0.305 0.038 {method 'to_authority' of 'rasterio._crs._CRS' objects}
4 0.000 0.000 0.054 0.013 env.py:416(wrapper)
4 0.022 0.006 0.048 0.012 __init__.py:55(open)
12 0.000 0.000 0.025 0.002 crs.py:444(from_wkt)
12 0.025 0.002 0.025 0.002 {rasterio._crs.from_wkt}
4 0.009 0.002 0.010 0.003 {method 'read' of 'rasterio._warp.WarpedVRTReaderBase' objects}
8 0.000 0.000 0.008 0.001 env.py:257(__enter__)
8 0.000 0.000 0.007 0.001 env.py:302(defenv)
202 0.000 0.000 0.007 0.000 __init__.py:1424(debug)
8 0.005 0.001 0.007 0.001 {method 'start' of 'rasterio._env.GDALEnv' objects}
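To put the numbers above in proportion, the share of the total runtime for each of the main entries can be worked out directly. This is a rough sketch using cumulative times copied from the profile; it assumes the `to_epsg` and `to_authority` call chains do not overlap, which the profile suggests but does not prove.

```python
# Cumulative times in seconds, copied from the profile output above.
TOTAL_SECONDS = 0.929
CUMTIME = {
    "crs.py:146(to_epsg)": 0.355,
    "crs.py:158(to_authority)": 0.305,
    "__init__.py:55(open)": 0.048,
    "WarpedVRTReaderBase.read": 0.010,
}


def shares(cumtime, total):
    """Return each entry's fraction of the total runtime."""
    return {name: seconds / total for name, seconds in cumtime.items()}


for name, share in shares(CUMTIME, TOTAL_SECONDS).items():
    print(f"{name:28s} {share:6.1%}")

# Assumption: the two CRS call chains are disjoint, so their times can be added.
crs_share = (
    CUMTIME["crs.py:146(to_epsg)"] + CUMTIME["crs.py:158(to_authority)"]
) / TOTAL_SECONDS
print(f"CRS identification combined: {crs_share:.1%}")
```

Under that assumption, CRS identification accounts for roughly 71% of the run, while the warped read itself is about 1%.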