- 19 native H3 functions in pure SQL - no library imports, no cluster configuration. H3-accelerated spatial joins on 100M+ records drop from minutes to seconds by converting expensive geometry operations into integer equality joins
- GEOGRAPHY (WGS 84 sphere) and GEOMETRY (planar, any SRID) types with 60-70 native spatial functions. Covers the 80% case for spatial analytics - but no raster, no 3D, no topology
- Three spatial indexing approaches: automatic micro-partition pruning, H3 clustering (recommended for large tables), and Search Optimisation Service (Enterprise only, 10x credit rate - estimate costs before enabling)
- Snowpark Python UDFs support GeoPandas, Shapely, Fiona, and PyPROJ inside Snowflake's sandbox. Use Snowpark-optimised warehouses (256 GB memory) for heavy spatial operations - standard warehouses will OOM on complex geometries
- Zero-copy data sharing via Snowflake Marketplace is the differentiator no other platform matches. CARTO, Precisely, and Foursquare provide geospatial datasets you can join without moving data
- Typical moderate geospatial workload costs $150-300/month in Snowflake credits. A weekly spatial join on 50M features costs roughly $16/month on a Large warehouse vs $7,000/year for an ArcGIS Advanced licence
Geospatial in Cloud Series
This is Part 4 of our Geospatial in Cloud series. Each post is self-contained. Part 1 covers Databricks. Part 2 covers AWS. Part 3 covers GCP. Read the one that matches your stack.
Snowflake is the platform that teams pick when they want spatial analytics without infrastructure overhead. No clusters to configure, no libraries to install, no Spark context to debug. You write SELECT ST_DISTANCE(a.geom, b.geom) and it works.
That simplicity is real. But so are the trade-offs. Snowflake has no raster support, no 3D geometry, and its spatial function count is roughly a sixth of PostGIS. If your workflow involves satellite imagery, terrain models, or complex topology operations, this is not your platform.
For everything else - point-in-polygon joins, H3 aggregation, exposure analysis, fleet analytics, location intelligence - Snowflake delivers with less operational friction than any other cloud data platform we have worked with.
Why Snowflake for Geospatial
The pitch is simple: if your team thinks in SQL and your data is already in Snowflake, adding geospatial is close to zero effort. No new infrastructure, no new skills, no new budgets to approve.
Zero cluster management
Scale from XS to 6XL warehouse with one command. Auto-suspend after 60 seconds of idle. No Spark configuration, no driver/executor tuning.
Native H3 in pure SQL
19 built-in H3 functions. Convert a spatial join on 100M records from minutes to seconds by turning geometry operations into integer equality joins.
Zero-copy data sharing
Join your data with CARTO, Precisely, or Foursquare datasets from the Snowflake Marketplace without moving a single byte. No other platform matches this.
SQL your team already knows
No Python required for spatial analytics. Your SQL analysts can write spatial queries on day one. The learning curve is a handful of ST_* functions.
GEOGRAPHY vs GEOMETRY
Snowflake provides two spatial data types. Choosing the wrong one is one of the most common mistakes we see teams make in their first week.
GEOGRAPHY models the earth as a sphere using WGS 84 (SRID 4326). Coordinates are longitude/latitude degrees. Distance calculations return metres on the spherical surface. This is the type most GIS engineers reach for first - it maps directly to the lon/lat paradigm from QGIS and ArcGIS.
GEOMETRY uses a planar Cartesian coordinate system. It supports arbitrary SRIDs - EPSG:27700 for British National Grid, EPSG:32632 for UTM Zone 32N, whatever your data uses. Calculations are Euclidean. Use this when your data lives in a projected coordinate system and you need metre-accurate distances without spherical distortion.
Critical limitations to know
- Only 2D coordinates supported - no Z (elevation) or M (measure) values
- You cannot cast between GEOGRAPHY and GEOMETRY - they are separate worlds
- GEOGRAPHY only supports WGS 84 - no other CRS
- Column size limit is 64 MB per value (increased from 8 MB in 2025)
- GEOGRAPHY uses a perfect sphere model, not an ellipsoid - slight accuracy differences vs PostGIS for geodetic work
Rule of thumb: if your data is in lon/lat and you are doing analytics (distance, containment, aggregation), use GEOGRAPHY. If your data is in a national or UTM projection and you need precise planar geometry operations, use GEOMETRY. If you need to convert between projections, use ST_TRANSFORM - but only on GEOMETRY columns.
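The rule of thumb above can be sketched in a few lines. This is a minimal example with hypothetical table and column names; the coordinates are illustrative:

```sql
-- GEOGRAPHY: lon/lat analytics on the WGS 84 sphere (SRID 4326 only)
CREATE OR REPLACE TABLE stores (
    store_id INT,
    location GEOGRAPHY
);

INSERT INTO stores
  SELECT 1, TO_GEOGRAPHY('POINT(-0.1276 51.5074)');  -- note lon lat order

-- Spherical distance, returned in metres
SELECT ST_DISTANCE(
    location,
    TO_GEOGRAPHY('POINT(-2.2426 53.4808)')
) AS metres
FROM stores;

-- GEOMETRY: projected data with an explicit SRID (British National Grid here)
SELECT ST_DISTANCE(
    TO_GEOMETRY('POINT(530000 180000)', 27700),
    TO_GEOMETRY('POINT(531000 181000)', 27700)
) AS planar_distance;  -- Euclidean, in CRS units (metres for EPSG:27700)
```

The key design point: the type decides the maths. The same ST_DISTANCE call is spherical on GEOGRAPHY and Euclidean on GEOMETRY, so mixing the two in one pipeline silently mixes distance semantics.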
Spatial SQL Functions
Snowflake provides roughly 60-70 native spatial functions. For context, PostGIS has 400+. The gap is real but often irrelevant: the functions Snowflake does have cover the core operations that roughly 80% of geospatial SQL workflows need.
The essentials are all present: ST_DISTANCE, ST_INTERSECTS, ST_CONTAINS, ST_BUFFER, ST_UNION, ST_INTERSECTION, ST_SIMPLIFY, ST_CENTROID, ST_AREA. You can run spatial joins, create buffers, dissolve polygons, calculate distances, and do containment checks.
Two aggregate functions handle the bulk of GIS dissolve operations: ST_COLLECT combines rows into a GeometryCollection, and ST_UNION_AGG dissolves all geometries in a group - the SQL equivalent of ArcGIS Dissolve.
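As a sketch of the dissolve pattern, assuming a hypothetical parcels table with a region_code column:

```sql
-- ST_UNION_AGG merges geometries per group (ArcGIS Dissolve equivalent);
-- ST_COLLECT just bundles them into a GeometryCollection without merging
SELECT region_code,
       ST_UNION_AGG(geom) AS dissolved_boundary,
       ST_COLLECT(geom)   AS raw_collection
FROM parcels
GROUP BY region_code;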
Functions you will miss (coming from PostGIS)
- No ST_VoronoiPolygons / ST_DelaunayTriangles
- No ST_ClusterDBSCAN / ST_ClusterKMeans (use Python UDFs)
- No ST_MakeValid natively (available via Sedona Native App)
- No linear referencing (ST_LineLocatePoint, ST_LineInterpolatePoint)
- No topology operations, no routing, no network analysis
- No raster functions at all
The gap filler is the Apache Sedona Native App (SedonaSnow), available on the Snowflake Marketplace. It roughly doubles the available function count and adds ST_MakeValid, ST_SubDivide, broader CRS support via ST_Transform, and more. Worth evaluating if you hit the native function ceiling.
H3: The Killer Feature
H3 is Uber's hexagonal spatial indexing system, and Snowflake's native integration is genuinely excellent. 19 functions, all first-class SQL citizens. No library imports, no UDF wrappers, no setup.
The core insight is this: traditional spatial joins (ST_INTERSECTS on two large tables) are computationally expensive. Every row on the left must be tested against potentially every row on the right. With H3, you convert both datasets to the same hexagonal resolution, then join on integer equality. An O(n*m) geometry operation becomes a hash join. On 100M+ records, this can take spatial joins from minutes to seconds.
The key functions: H3_LATLNG_TO_CELL(lat, lng, resolution) converts a point to an H3 cell index. H3_COVERAGE(geography, resolution) covers a polygon with the minimal set of H3 cells. H3_CELL_TO_BOUNDARY(cell) converts back to a hexagonal polygon for visualisation. H3_CELL_TO_PARENT and H3_CELL_TO_CHILDREN navigate the hierarchy for multi-resolution analysis.
H3 Resolution Guide

| RESOLUTION | AVG CELL AREA | TYPICAL USE |
|---|---|---|
| 3 | ~12,400 km² | Country-level |
| 5 | ~253 km² | Regional analysis |
| 7 | ~5.16 km² | City / catchment areas |
| 9 | ~0.105 km² | Neighbourhood / postcode |
| 11 | ~0.002 km² | Building level |
| 15 | ~0.9 m² | Sub-metre precision |
The typical pattern: add an H3 column to your table, cluster by it, then join on integer equality. For a table of insurance claims joined to flood zones, resolution 9 gives neighbourhood-level precision. The join becomes a GROUP BY on an integer column - exactly what Snowflake's columnar engine is optimised for.
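Here is that pattern end to end, as a sketch using the claims/flood-zones example (table and column names are hypothetical):

```sql
-- 1. Index each claim point at resolution 9
ALTER TABLE claims ADD COLUMN h3_r9 BIGINT;
UPDATE claims
   SET h3_r9 = H3_LATLNG_TO_CELL(ST_Y(geom), ST_X(geom), 9);

-- 2. Cover each flood zone with resolution-9 cells, one row per cell
CREATE OR REPLACE TABLE flood_cells AS
SELECT f.zone_id,
       c.value::BIGINT AS h3_r9
FROM flood_zones f,
     LATERAL FLATTEN(input => H3_COVERAGE(f.geom, 9)) c;

-- 3. The spatial join is now an integer hash join
-- (approximate at cell granularity: cells straddling the zone
--  boundary count as inside - acceptable for aggregate analytics)
SELECT fc.zone_id,
       COUNT(*) AS claims_in_zone
FROM claims cl
JOIN flood_cells fc USING (h3_r9)
GROUP BY fc.zone_id;
```

The comment in step 3 is the trade-off to keep in mind: an H3 equality join answers "same hexagon", not "geometrically inside". If you need exact containment, use the H3 join as a cheap pre-filter and apply ST_CONTAINS only to the surviving candidate pairs.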
Comparison: Databricks has H3 natively since DBR 17+ and via Mosaic on earlier runtimes. BigQuery has no built-in H3 - it relies on S2 as its native spatial index. For teams that want H3 in pure SQL without Spark complexity, Snowflake is the simplest path.
Watch the cell count
Running H3_COVERAGE at resolution 12+ on country-sized polygons generates billions of cells. Start at resolution 7-9 and only increase if your analysis requires it. A medium warehouse can handle resolution 9 on most workloads. Resolution 12+ on large polygons needs an XL or larger.
Loading Geospatial Data
Snowflake natively understands GeoJSON, WKT, WKB, EWKT, and EWKB. You can load these from staged files (internal or external S3/GCS/Azure stages) via COPY INTO with a TO_GEOGRAPHY() or TO_GEOMETRY() cast.
GeoParquet is the recommended format. It stores geometry as WKB in a binary column, compresses well, and is becoming the interchange standard across Snowflake, Databricks, DuckDB, and BigQuery. Iceberg table support for GEOGRAPHY and GEOMETRY types was announced at Summit 2025 - when GA, this makes GeoParquet round-tripping straightforward across platforms.
Shapefiles are not natively supported. This is friction for every team migrating from ArcGIS. The recommended workflow: convert to GeoParquet locally using ogr2ogr or GeoPandas, upload to a Snowflake stage, then COPY INTO with WKB parsing. Alternatively, use FME's native Snowflake Spatial writer if your organisation already has a licence.
Data Loading Support

| FORMAT | SUPPORT |
|---|---|
| GeoJSON, WKT, WKB, EWKT, EWKB | Native - COPY INTO with TO_GEOGRAPHY()/TO_GEOMETRY() |
| GeoParquet | Recommended - geometry as WKB in a binary column |
| Shapefile | Not supported - convert to GeoParquet with ogr2ogr or GeoPandas first |
One trap to watch: GeoJSON FeatureCollections need to be flattened. Each Feature becomes a row via LATERAL FLATTEN. The GEOGRAPHY type rejects any SRID other than 4326 - if your GeoJSON has a different CRS, you need GEOMETRY instead.
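The FeatureCollection flattening looks like this in practice. A minimal sketch, assuming a hypothetical stage @my_stage, file name, and properties layout:

```sql
-- Land the raw GeoJSON as a single VARIANT per file
CREATE OR REPLACE TABLE raw_geojson (v VARIANT);

COPY INTO raw_geojson
FROM @my_stage/boundaries.geojson
FILE_FORMAT = (TYPE = JSON);

-- One row per Feature; TO_GEOGRAPHY accepts GeoJSON VARIANTs directly
-- (fails if the CRS is anything other than WGS 84 / 4326)
CREATE OR REPLACE TABLE boundaries AS
SELECT f.value:properties:name::STRING AS name,
       TO_GEOGRAPHY(f.value:geometry)  AS geom
FROM raw_geojson r,
     LATERAL FLATTEN(input => r.v:features) f;
```

If the source CRS is not 4326, swap TO_GEOGRAPHY for TO_GEOMETRY and carry the SRID explicitly.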
Spatial Indexing
There is no CREATE SPATIAL INDEX in Snowflake. If you come from PostGIS, this feels wrong. But Snowflake has three mechanisms that achieve similar outcomes through different means.
1. Automatic micro-partition pruning
Snowflake stores bounding-box metadata for every micro-partition. When you filter with spatial predicates, it skips partitions whose bounding boxes do not overlap. This is automatic and free. For well-clustered data, it provides significant speedup. For randomly distributed data, it helps less.
2. H3 clustering (recommended)
Pre-compute an H3 index column and cluster the table by it. This ensures spatially proximate data lives in the same micro-partitions, making bounding-box pruning highly effective. This is the cheapest and most impactful optimisation for large spatial tables. Use ALTER TABLE ... CLUSTER BY (h3_column).
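A minimal sketch of the clustering setup, assuming a hypothetical events table with a point GEOGRAPHY column:

```sql
-- Pre-compute the H3 key (resolution 8 here; pick per the guide above)
ALTER TABLE events ADD COLUMN h3_r8 BIGINT;

UPDATE events
   SET h3_r8 = H3_LATLNG_TO_CELL(ST_Y(geom), ST_X(geom), 8);

-- Co-locate spatially proximate rows in the same micro-partitions
ALTER TABLE events CLUSTER BY (h3_r8);

-- Check clustering health (depth, overlap) as data churns
SELECT SYSTEM$CLUSTERING_INFORMATION('events', '(h3_r8)');
```

Auto-reclustering then maintains the layout in the background; the credits it consumes are the only ongoing cost of this approach.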
3. Search Optimisation Service (Enterprise only)
Creates persistent search access paths for spatial predicates. Enable per-column with ALTER TABLE ... ADD SEARCH OPTIMIZATION ON GEO(column). Supports ST_INTERSECTS, ST_CONTAINS, ST_WITHIN, ST_DWITHIN.
Cost warning: SOS uses serverless compute charged at 10x the normal credit rate. Run SYSTEM$ESTIMATE_SEARCH_OPTIMIZATION_COSTS before enabling. On frequently-updated tables, this can add $150-600/month.
For most teams, option 2 (H3 clustering) is the right starting point. It costs only the standard auto-reclustering credits and provides the biggest performance improvement per dollar. Only add SOS if H3 clustering is not sufficient and you have confirmed the cost is acceptable.
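If you do evaluate SOS, the estimate-then-enable sequence is short. Table and column names here are hypothetical:

```sql
-- Always estimate first: SOS serverless compute bills at 10x credits
SELECT SYSTEM$ESTIMATE_SEARCH_OPTIMIZATION_COSTS(
    'parcels', 'GEO(geom)');

-- Enable for spatial predicates on that one column
ALTER TABLE parcels ADD SEARCH OPTIMIZATION ON GEO(geom);

-- Verify status (the SEARCH_OPTIMIZATION column shows ON once active)
SHOW TABLES LIKE 'parcels';
```

Scoping SOS to a single GEO column, rather than the whole table, keeps the maintenance cost bounded to the predicates you actually run.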
Python + GeoPandas Integration
Snowpark Python is the bridge between SQL and Python inside Snowflake. Python UDFs run in a secure sandbox with access to packages from Snowflake's Anaconda channel. The geospatial packages that matter are available: Shapely, GeoPandas, Fiona, PyPROJ, and GDAL.
The primary use case: operations that Snowflake's native SQL cannot handle. Spatial clustering with DBSCAN, complex geometry validation, coordinate transformations across CRS not supported natively, or reading Shapefiles directly from a stage using Fiona.
Performance traps with Python UDFs
- Cold start: ~400ms on small warehouses, ~900ms on medium. First call in a session is noticeably slower
- Memory limits: Standard warehouses provide ~16 GB per node. For heavy GeoPandas operations, use Snowpark-optimised warehouses (256 GB per node)
- Rasterio: Not reliably available in the Anaconda channel. Raster processing in Snowflake UDFs is essentially not viable
- Debugging: Python UDF errors surface as generic SQL errors. Add explicit try/except with descriptive messages in your UDFs
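Putting the debugging advice into practice, here is a sketch of a Snowpark Python UDF that explains why a geometry is invalid. The function name, runtime version, and table in the usage line are assumptions, not part of any Snowflake default:

```sql
CREATE OR REPLACE FUNCTION explain_invalid(wkt STRING)
RETURNS STRING
LANGUAGE PYTHON
RUNTIME_VERSION = '3.11'
PACKAGES = ('shapely')      -- pulled from the Anaconda channel
HANDLER = 'run'
AS $$
from shapely import wkt as _wkt
from shapely.validation import explain_validity

def run(w):
    # Wrap everything: uncaught exceptions surface as opaque SQL errors
    try:
        return explain_validity(_wkt.loads(w))
    except Exception as e:
        return f'parse error: {e}'
$$;

-- Usage: audit a sample of geometries
SELECT explain_invalid(ST_ASWKT(geom)) FROM parcels LIMIT 10;
```

The try/except returning a descriptive string, rather than raising, is the pattern that makes UDF failures diagnosable from plain SQL output.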
Snowflake Notebooks (Streamlit-based) let you run GeoPandas interactively inside Snowflake with Pydeck or Folium for visualisation. For exploratory spatial analysis, this is the fastest path. For production pipelines, wrap your logic in stored procedures or UDTFs.
Streamlit in Snowflake deserves mention: you can build interactive geospatial web apps (maps, dashboards) that run entirely inside your Snowflake account. Data never leaves the platform, access control inherits from Snowflake's RBAC, and there is no infrastructure to manage. For internal tools, this is a compelling alternative to standing up a separate mapping application.
Snowflake vs Databricks for Geospatial
We work with both platforms in production. This is a direct comparison based on migrating real workloads to each.
| FACTOR | SNOWFLAKE | DATABRICKS |
|---|---|---|
| Primary language | SQL (Python via UDFs) | Python/Spark (SQL via Spark SQL) |
| H3 support | 19 native SQL functions | Native (DBR 17+) + Mosaic library |
| Spatial functions | 60-70 native | 90+ native + Sedona + Mosaic |
| Raster support | None | Via Mosaic / Sedona |
| 3D geometry | None | Limited |
| Cluster management | None (serverless) | Required (Spark config) |
| Data sharing | Zero-copy (Marketplace) | Delta Sharing (open protocol) |
| Scale ceiling | Excellent to ~500M rows | Superior beyond 500M (distributed Spark) |
| ML on spatial | Cortex ML (basic) + Python UDFs | MLflow + Spark ML + Mosaic |
| Cost model | Credits (simpler) | DBUs + cloud infra (harder to predict) |
Choose Snowflake when: your team is SQL-first, your data is already in Snowflake, you need zero-copy sharing with marketplace data providers, and your workload is vector analytics without raster requirements.
Choose Databricks when: your team is Python/Spark-first, you need raster support, you are building ML pipelines on spatial features, or you are processing at extreme scale (billions of rows with complex geometry operations).
For the reinsurance use case specifically - join 50M parcels to flood zones and aggregate claims by hexagon - both platforms handle this. Snowflake is simpler to set up and operate. Databricks is more powerful for the next step: training spatial risk models on the results.
Cost Model
Snowflake charges standard compute credits for all spatial operations - no separate geospatial pricing. You pay per second of warehouse time (1-minute minimum on resume), plus storage.
| WORKLOAD | WAREHOUSE | DURATION | ~COST (ENTERPRISE) |
|---|---|---|---|
| Spatial join, 10M x 10K | Medium | ~30 sec | ~$0.09 |
| H3 polyfill, 1M polygons at res 9 | Large | ~2 min | ~$0.81 |
| Weekly flood risk analysis, 50M parcels | Large | ~15 min | ~$4/run |
| Daily point aggregation, 10M events | Medium | ~5 min | ~$0.67/run |
| Geo Search Optimisation (monthly) | Serverless (10x) | Ongoing | $150-600/mo |
A moderate geospatial analytics workload - weekly batch jobs, daily aggregations, ad-hoc analyst queries - lands in the $150-300/month range. Compare that to an ArcGIS Advanced licence at ~$7,000/year for equivalent desktop processing. The break-even point favours Snowflake for any workload that runs more than a few times per month.
Hidden costs to watch
- Search Optimisation Service at 10x credit rate on large, frequently-updated tables
- Always-on warehouses - set auto-suspend to 60 seconds for ad-hoc spatial queries
- H3 coverage at resolution 12+ on country-sized polygons generates billions of cells
- Snowpark-optimised warehouses cost 1.7x standard warehouses
- Enterprise Edition required for Geo Search Optimisation - Standard Edition lacks this
When NOT to Use Snowflake for Geospatial
Every platform has boundaries. Here are Snowflake's.
Raster or imagery processing
No GeoTIFF, no DEMs, no satellite imagery, no raster algebra. If your workflow touches gridded data, use Databricks (Mosaic), Google Earth Engine, or a dedicated raster pipeline.
3D geometry or LiDAR
Only 2D coordinates supported. No Z values, no M values. Terrain analysis, 3D building models, and point clouds require a different tool.
Sub-100ms spatial lookups
Snowflake is an OLAP platform. If you need a live map API with real-time spatial queries, you still need PostGIS or a specialised tile server.
Complex topology operations
No network analysis, no routing, no linear referencing, no Voronoi diagrams natively. PostGIS or Shapely via Python UDFs can fill some gaps, but if topology is central to your workflow, Snowflake is the wrong tool.
Heavy CRS transformations
GEOGRAPHY is locked to WGS 84. GEOMETRY supports SRIDs but the built-in CRS catalogue is limited compared to PostGIS. If your workflow requires frequent reprojection between obscure coordinate systems, expect friction.
Budget-constrained teams without Enterprise
The most impactful performance feature (Geo Search Optimisation) requires Enterprise Edition. On Standard Edition, you rely on H3 clustering and micro-partition pruning - still effective, but the ceiling is lower.
Getting Started
If you are already on Snowflake, enabling geospatial is immediate. No packages to install, no extensions to enable. The spatial functions and H3 functions are available in every warehouse.
Convert your Shapefiles to GeoParquet
Use ogr2ogr or GeoPandas locally. This is the one step you cannot skip - Snowflake does not read Shapefiles natively.
Stage and load
PUT the files to a stage, COPY INTO with TO_GEOGRAPHY() or TO_GEOMETRY(). Validate with ST_ISVALID.
Add H3 indexes
Add an H3 column at resolution 7-9, then cluster by it. This single step gives you the biggest performance improvement.
Start with SQL
ST_INTERSECTS, ST_DISTANCE, ST_AREA, GROUP BY H3 cell. Your SQL analysts can start querying spatial data immediately.
Add Python when needed
Only reach for Snowpark Python UDFs when native SQL cannot do the job - spatial clustering, complex validation, format conversion.
Explore the Marketplace
Check the Snowflake Marketplace for CARTO Analytics Toolbox (70+ extra spatial functions) and third-party geospatial datasets (flood zones, demographics, POI).
Snowflake is not trying to be PostGIS. It is trying to make spatial analytics accessible to every SQL analyst who already has a Snowflake account.
For teams that need point-in-polygon joins, H3 aggregation, exposure analysis, and location intelligence - all in SQL, with no infrastructure to manage - it delivers. The 19 native H3 functions alone are worth the evaluation. The zero-copy data sharing is something no other platform offers.
Know its boundaries. No raster. No 3D. No sub-millisecond lookups. If your workflow fits within those boundaries, Snowflake is one of the simplest paths from legacy desktop GIS to cloud-native spatial analytics.