- 80% of ArcPy functions have direct open-source equivalents (GeoPandas, rasterio, Shapely, Fiona)
- 12 ArcPy functions have no direct open-source equivalent - you need workarounds or must keep ArcPy for these
- Migration is not all-or-nothing: the hybrid approach (keep ArcPy for what it does best, migrate the rest) is the most practical path
- Typical migration timeline: 8-12 weeks for a 20-script inventory, including at least 2 weeks of parallel testing
Part 1 made the business case for migration. You're convinced the numbers work. Now what?
This post is the actual playbook: how to inventory your scripts, which functions map to what, the 12 functions that have no open-source equivalent (and what to do about them), and how to run old and new in parallel until you're confident.
Most migration guides say "use GeoPandas instead of ArcPy" and leave it there. That's not enough. You need to know which ArcPy functions translate cleanly, which require workarounds, and which mean you should keep ArcPy for those specific workflows. For a detailed function-by-function translation, see our ArcPy to GeoPandas guide.
The Inventory-First Approach
Before migrating anything, inventory everything. This step takes one week and saves months of wasted effort on scripts that should be deleted, not migrated.
Step 1: Find All ArcPy Scripts
Start with a simple recursive search across your scripts directory for any Python file that contains an import arcpy statement. This gives you a definitive list of every file that depends on ArcPy, along with a total count. Most teams are surprised - the real number is usually 20-40% higher than their initial estimate, because scripts live in project folders, shared drives, and archived directories that nobody has touched in years.
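The recursive search can be sketched in a few lines of standard-library Python. This is a minimal sketch, not a definitive tool: the function name `find_arcpy_scripts` and the regular expression are assumptions made for illustration, and the pattern only catches top-level `import arcpy` / `from arcpy ...` style statements, not dynamic imports.

```python
import re
from pathlib import Path

# Matches "import arcpy", "import os, arcpy", "from arcpy.sa import Slope", etc.
# It deliberately does not match look-alike modules such as "arcpy_helper".
ARCPY_IMPORT = re.compile(
    r"^\s*(?:from\s+arcpy\b|import\s+[^#\n]*\barcpy\b)", re.MULTILINE
)


def find_arcpy_scripts(root):
    """Return a sorted list of .py files under root that import arcpy."""
    hits = []
    for path in Path(root).rglob("*.py"):
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file (permissions, broken link): skip it
        if ARCPY_IMPORT.search(text):
            hits.append(path)
    return sorted(hits)
```

Run it once per location (project folders, shared drives, archives) and merge the lists, for example `len(find_arcpy_scripts("scripts"))` for the total count in a hypothetical `scripts` directory.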
Step 2: Categorise Each Script
| Category | Definition | Action |
|---|---|---|
| Green (easy) | Standard geoprocessing (Buffer, Intersect, Dissolve, Join) | Migrate to GeoPandas |
| Yellow (medium) | Uses arcpy.env settings, cursor operations, or field calculations | Migrate with refactoring |
| Red (hard) | Network Analyst, Spatial Analyst, 3D Analyst, or Enterprise integration | Keep in ArcPy or find workarounds |
| Grey (obsolete) | No longer used, duplicate, or superseded | Delete |
REAL-WORLD DISTRIBUTION
In a typical 20-script inventory, we see 8-10 Green, 5-7 Yellow, 2-4 Red, and 1-3 Grey. That means 65-85% of scripts can be migrated to open source. The Grey scripts are the hidden win - deleting dead code reduces maintenance burden immediately, at zero cost.
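A first-pass categorisation can be automated before the manual review. The sketch below is a rough heuristic under stated assumptions: the marker lists and the names `suggest_category` and `migratable_share` are hypothetical, they mirror the category table above rather than any official classification, and Grey can only be assigned by a person who knows whether the script is still used.

```python
from collections import Counter

# Hypothetical heuristics: tune these markers to your own codebase.
RED_MARKERS = ("arcpy.na", "arcpy.sa", "arcpy.ddd", "arcpy.un")
YELLOW_MARKERS = ("arcpy.env", "arcpy.da", "CalculateField")


def suggest_category(source: str) -> str:
    """Suggest Green/Yellow/Red from script text. Grey needs a human decision."""
    if any(marker in source for marker in RED_MARKERS):
        return "Red"
    if any(marker in source for marker in YELLOW_MARKERS):
        return "Yellow"
    return "Green"


def migratable_share(categories) -> float:
    """Fraction of scripts that are Green or Yellow, i.e. candidates for migration."""
    counts = Counter(categories)
    total = sum(counts.values())
    return (counts["Green"] + counts["Yellow"]) / total if total else 0.0


# A 20-script inventory in the middle of the ranges quoted above.
inventory = ["Green"] * 9 + ["Yellow"] * 6 + ["Red"] * 3 + ["Grey"] * 2
print(f"{migratable_share(inventory):.0%}")  # 75%
```

Treat the suggested category as a starting point for the conversation with the analyst, not as the final answer.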
Function Mapping Table
This is the reference table your team will use daily during migration. For each ArcPy function, it lists the open-source equivalent and any caveats. See our side-by-side benchmark comparison for performance data on these equivalents.

| ArcPy Function | Open-Source Equivalent | Notes |
|---|---|---|
| arcpy.analysis.Buffer | gdf.buffer(distance) | GeoPandas. Ensure projected CRS for metre units |
| arcpy.analysis.Intersect | gpd.overlay(gdf1, gdf2, how='intersection') | GeoPandas overlay |
| arcpy.analysis.Union | gpd.overlay(gdf1, gdf2, how='union') | GeoPandas overlay |
| arcpy.analysis.SpatialJoin | gpd.sjoin(gdf1, gdf2, how='inner') | GeoPandas spatial join |
| arcpy.management.Dissolve | gdf.dissolve(by='column') | GeoPandas dissolve |
| arcpy.management.Merge | pd.concat([gdf1, gdf2]) | pandas concat |
| arcpy.analysis.Clip | gpd.clip(gdf, mask) | GeoPandas clip (vector; arcpy.management.Clip is the raster tool) |
| arcpy.management.Project | gdf.to_crs(epsg=4326) | GeoPandas CRS transformation |
| arcpy.management.AddField | gdf['new_col'] = value | pandas column assignment |
| arcpy.management.CalculateField | gdf['col'] = gdf['col'].apply(func) | pandas apply |
| arcpy.management.SelectLayerByAttribute | gdf[gdf['col'] == value] | pandas filtering |
| arcpy.management.SelectLayerByLocation | gdf[gdf.intersects(geometry)] | GeoPandas spatial filtering |
| arcpy.conversion.FeatureClassToFeatureClass | gdf.to_file('output.gpkg') | Fiona via GeoPandas |
| arcpy.conversion.TableToTable | df.to_csv('output.csv') | pandas export |
| arcpy.da.SearchCursor | for idx, row in gdf.iterrows() | pandas iteration (or vectorised) |
| arcpy.da.UpdateCursor | gdf.loc[condition, 'col'] = value | pandas vectorised update |
| arcpy.da.InsertCursor | gdf = pd.concat([gdf, new_rows]) | pandas concat |
| arcpy.Describe | gdf.crs, gdf.total_bounds, gdf.dtypes | GeoPandas properties |
| arcpy.ListFeatureClasses | glob.glob('*.gpkg') | Python glob |
| arcpy.sa.Raster | rasterio.open('file.tif') | rasterio |
| arcpy.sa.ZonalStatisticsAsTable | rasterstats.zonal_stats() | rasterstats package |
| arcpy.sa.ExtractByMask | rasterio.mask.mask() | rasterio.mask |
| arcpy.sa.Slope | richdem.TerrainAttribute(dem, attrib='slope_degrees') | richdem or custom numpy |
| arcpy.sa.Aspect | richdem.TerrainAttribute(dem, 'aspect') | richdem |
24 functions mapped. This covers the vast majority of scripts in a typical GIS team's inventory.
The 12 Gaps (No Direct Equivalent)
This is the section nobody else writes. These 12 ArcPy functions have no clean open-source replacement. Knowing this upfront prevents the worst migration outcome: discovering gaps after you've committed to a timeline.
| ArcPy Function | Why No Equivalent | Workaround |
|---|---|---|
| arcpy.na.MakeServiceAreaLayer | Network Analyst is proprietary | pgRouting (PostGIS) or OSRM |
| arcpy.na.MakeClosestFacilityLayer | Network Analyst | pgRouting or NetworkX |
| arcpy.na.MakeODCostMatrixLayer | Network Analyst | pgRouting or OSRM API |
| arcpy.un.Trace | Proprietary data model | No equivalent - keep ArcPy |
| arcpy.management.CreateTopology | ArcGIS topology rules | PostGIS topology (partial) |
| arcpy.management.ValidateTopology | ArcGIS topology validation | PostGIS ST_IsValid + custom rules |
| arcpy.cartography.SimplifyLine | Cartographic generalisation | Shapely simplify (less sophisticated) |
| arcpy.cartography.CollapseHydroPolygon | Cartographic specialisation | No equivalent |
| arcpy.ia.ClassifyRaster | Image Analyst extension | scikit-learn + rasterio (more code) |
| arcpy.ddd.Viewshed | 3D Analyst extension | GRASS GIS r.viewshed (command-line) |
| arcpy.ddd.SurfaceVolume | 3D Analyst extension | Custom numpy calculation |
| arcpy.management.MakeQueryLayer | ArcGIS Enterprise integration | Direct SQL via SQLAlchemy |
The Honest Assessment
Of these 12, only Utility Network tracing has genuinely no workaround. The rest have alternatives that require more code but are functional. The question is whether the extra development effort is worth the licence savings. For Network Analyst functions, pgRouting is a capable alternative but requires PostGIS setup and a different data model - budget 2-3 weeks for the learning curve.
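To give a sense of what "more code but functional" means, here is a minimal sketch of the custom numpy calculation suggested for arcpy.ddd.SurfaceVolume. It is an approximation under stated assumptions, not a reimplementation of the ArcGIS tool: each raster cell is treated as a flat-topped column, only the volume above a reference plane is computed, and the function name `volume_above_plane` and its parameters are hypothetical. Results may differ from ArcPy's, so validate against a known dataset.

```python
import numpy as np


def volume_above_plane(dem, cell_size, plane_height, nodata=None):
    """Approximate the volume between a DEM and a horizontal plane below it.

    dem          -- 2D array of elevations (e.g. read with rasterio)
    cell_size    -- edge length of a square cell, in the same units as elevation
    plane_height -- elevation of the reference plane
    nodata       -- optional value marking cells to ignore
    """
    z = np.asarray(dem, dtype=float)
    if nodata is not None:
        z = np.where(z == nodata, np.nan, z)
    # Height of each cell above the plane; cells below the plane contribute zero.
    depth = np.clip(z - plane_height, 0.0, None)
    # Sum of column heights times the area of one cell, ignoring nodata cells.
    return float(np.nansum(depth) * cell_size ** 2)
```

For example, a 2x2 grid `[[10, 12], [8, 11]]` with 2 m cells and a plane at 10 m has 3 m of total column height over 4 m² cells, so the function returns 12.0.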
File-by-File Strategy
Each script gets migrated individually, not as a batch. This is the step-by-step process for a single script. Repeat for each Green, then Yellow.
Read the script
Understand what it does, not just what functions it calls. Talk to the analyst who runs it.
Map the functions
Use the mapping table above to identify equivalents for every ArcPy call
Check for gaps
Any Red functions? If so, decide: workaround, keep in ArcPy, or redesign the workflow
Rewrite from scratch
Use GeoPandas/rasterio idioms. Do not transliterate ArcPy line-by-line - idiomatic open-source code is vectorised, not row-by-row
Test with the same data
Run old (ArcPy) and new (GeoPandas) on identical input datasets
Compare outputs
Compare geometry count, attribute values, CRS, and spatial extent. Use the automated comparison function described under Parallel Testing.
Document differences
Some differences are expected: floating-point precision (1e-6), sort order, field name casing
Validate with the analyst
The person who runs this workflow must confirm the output is correct. Technical equivalence is not enough - business equivalence matters
DO NOT TRANSLITERATE
The biggest migration mistake is converting ArcPy line-by-line into GeoPandas. ArcPy uses cursor-based, row-by-row processing. GeoPandas is vectorised. A transliterated script will be slower than both the original ArcPy and idiomatic GeoPandas. Rewrite from scratch using GeoPandas patterns.
Parallel Testing
Never switch over without running old and new in parallel. Automate the validation rather than comparing outputs by eye - human spot-checks miss edge cases.
A well-structured comparison function reads both the ArcPy output and the GeoPandas output as GeoDataFrames, then runs four checks: row count equality, CRS match, column set equality, and spatial extent agreement within a configurable tolerance (typically 1mm for projected coordinate systems). Any failed check prints a clear diagnostic. Run this function as part of your CI pipeline so failures are caught immediately, not during analyst review a week later.
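The four checks can be sketched as follows. This is one possible shape for the function, not the definitive version: the name `compare_outputs`, the `tolerance` parameter, and the choice to return a list of messages are assumptions. It expects two GeoDataFrames that have already been loaded, for instance with `gpd.read_file`.

```python
import geopandas as gpd
import numpy as np


def compare_outputs(
    old: gpd.GeoDataFrame, new: gpd.GeoDataFrame, tolerance: float = 0.001
) -> list[str]:
    """Return diagnostics for failed checks; an empty list means all four passed."""
    failures = []

    # 1. Row count equality
    if len(old) != len(new):
        failures.append(f"Row count differs: {len(old)} vs {len(new)}")

    # 2. CRS match
    if old.crs != new.crs:
        failures.append(f"CRS differs: {old.crs} vs {new.crs}")

    # 3. Column set equality (order is ignored)
    old_cols, new_cols = set(old.columns), set(new.columns)
    if old_cols != new_cols:
        failures.append(
            f"Columns differ: only in old {sorted(old_cols - new_cols)}, "
            f"only in new {sorted(new_cols - old_cols)}"
        )

    # 4. Spatial extent agreement within an absolute tolerance (CRS units)
    if not np.allclose(old.total_bounds, new.total_bounds, rtol=0, atol=tolerance):
        failures.append(
            f"Extent differs: {old.total_bounds} vs {new.total_bounds}"
        )

    return failures
```

In a CI job, print each message and fail the build when the list is non-empty. The default tolerance of 0.001 corresponds to 1 mm only when the CRS units are metres.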
Minimum 2 Weeks Parallel
Run both old (ArcPy) and new (GeoPandas) scripts on the same data for at least 2 weeks. Compare every output. Only switch over when you've had zero discrepancies for 2 consecutive weeks. This catches edge cases that unit tests miss - unusual geometries, encoding issues, projection edge cases.
Common Migration Patterns
Three patterns cover 90% of what you'll encounter. Each shows the ArcPy original and the idiomatic GeoPandas replacement.
Pattern 1: arcpy.env.workspace to working directory
ARCPY
ArcPy scripts typically open by setting arcpy.env.workspace to a file geodatabase path and enabling overwriteOutput. These two global settings implicitly govern every subsequent geoprocessing call in the script, creating hidden state that can cause unexpected behaviour when scripts are called from automated pipelines.
GEOPANDAS
Replace with a plain Python path constant - typically a string or pathlib.Path variable set at the top of the script. No equivalent of overwriteOutput is needed: GeoPandas writes to whatever path you specify and overwrites by default. Explicit beats implicit.
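A minimal sketch of the replacement, assuming a hypothetical project layout with `data` and `output` folders. The helper name `layer_path` is an illustration, not part of GeoPandas.

```python
from pathlib import Path

# Hypothetical project layout: adjust to your own directories.
DATA_DIR = Path("data")
OUTPUT_DIR = Path("output")


def layer_path(name, base=OUTPUT_DIR, suffix=".gpkg"):
    """Build an explicit output path, creating the folder if it is missing."""
    base.mkdir(parents=True, exist_ok=True)
    return base / f"{name}{suffix}"


# Usage with GeoPandas (not executed here):
#   parcels = gpd.read_file(DATA_DIR / "parcels.gpkg")
#   parcels.to_file(layer_path("parcels_clean"), driver="GPKG")
```

Every read and write now names its path at the call site, so a pipeline can override `base` per run instead of mutating global state.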
Pattern 2: Cursors to pandas operations
ARCPY (ROW-BY-ROW)
The ArcPy pattern opens an UpdateCursor as a context manager, iterates through every row in a feature class, checks a field value, conditionally updates another field, and commits the change row by row. For a 100,000-record parcel dataset, this means 100,000 individual write operations. The overhead is substantial.
GEOPANDAS (VECTORISED)
The GeoPandas equivalent uses a boolean mask to select all rows where the area exceeds the threshold, then assigns the value to the category column in a single vectorised operation. No loop. No cursor. The entire dataset updates in one pass at the C library level - typically 10-100x faster than the cursor approach on datasets above 10,000 records.
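The vectorised version might look like this. The sample parcels, the `category` labels, and the threshold value are invented for illustration; the CRS is an arbitrary projected one so that areas come out in square metres.

```python
import geopandas as gpd
from shapely.geometry import box

# Three hypothetical square parcels of 100, 10,000 and 2,500 square metres.
parcels = gpd.GeoDataFrame(
    {"parcel_id": [1, 2, 3]},
    geometry=[box(0, 0, 10, 10), box(0, 0, 100, 100), box(0, 0, 50, 50)],
    crs="EPSG:32633",
)

AREA_THRESHOLD = 1000  # square metres; assumed value

# Default for every row, then one masked assignment instead of a cursor loop.
parcels["category"] = "small"
parcels.loc[parcels.geometry.area > AREA_THRESHOLD, "category"] = "large"
```

The boolean mask `parcels.geometry.area > AREA_THRESHOLD` is computed for all rows at once, and `.loc` writes the new value to every matching row in a single operation.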
Pattern 3: Geoprocessing to GeoPandas methods
ARCPY
ArcPy chains geoprocessing calls by passing string layer names between tools. A Buffer writes to an intermediate layer by name, then Intersect reads that named layer as input. The workspace is the implicit link. This works well inside ArcGIS Pro, but makes scripts difficult to test in isolation and impossible to run outside an ArcGIS environment.
GEOPANDAS
GeoPandas passes GeoDataFrame objects directly between operations - no intermediate files, no string references. Call .buffer() on the roads GeoDataFrame (ensuring a projected CRS first), then pass the resulting geometry directly into gpd.overlay() with the parcels layer. The intersection runs in-memory. The critical detail: buffer distance is in the CRS units - always reproject to metres before buffering.
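A sketch of the buffer-then-intersect chain, using made-up road and parcel geometries. The function name `parcels_near_roads` and the sample data are assumptions; the projected-CRS guard reflects the warning above about buffer units.

```python
import geopandas as gpd
from shapely.geometry import LineString, box

# Hypothetical inputs in a projected CRS (units: metres).
roads = gpd.GeoDataFrame(
    {"road_id": [1]},
    geometry=[LineString([(0, 0), (100, 0)])],
    crs="EPSG:32633",
)
parcels = gpd.GeoDataFrame(
    {"parcel_id": [1, 2]},
    geometry=[box(10, -5, 20, 5), box(10, 500, 20, 510)],
    crs="EPSG:32633",
)


def parcels_near_roads(roads, parcels, distance_m):
    """Intersect parcels with a buffer around roads, entirely in memory."""
    if roads.crs is None or not roads.crs.is_projected:
        raise ValueError("Reproject roads to a projected CRS before buffering")
    parcels = parcels.to_crs(roads.crs)  # both layers must share one CRS
    buffered = roads.set_geometry(roads.buffer(distance_m))
    return gpd.overlay(buffered, parcels, how="intersection")


result = parcels_near_roads(roads, parcels, distance_m=10)
```

Here only the first parcel lies within 10 m of the road, so `result` holds a single row carrying both `road_id` and `parcel_id`. No intermediate layer is written to disk.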
Handling arcpy.env
arcpy.env provides global settings that affect every geoprocessing operation. Open-source tools don't have this concept - each operation is explicit. Here's how to handle each setting.
| arcpy.env Setting | Open-Source Equivalent | Notes |
|---|---|---|
| workspace | Working directory variable | Manual path management |
| overwriteOutput | No equivalent (always overwrites) | Delete old files explicitly if needed |
| outputCoordinateSystem | gdf.to_crs() | Explicit per-operation |
| extent | gdf.clip(extent_gdf) | Explicit clipping |
| cellSize | rasterio resolution parameter | Per-operation |
| mask | rasterio.mask.mask() | Explicit masking |
| parallelProcessingFactor | dask-geopandas | Different parallelism model |
The shift from implicit (arcpy.env sets it once, everything inherits) to explicit (pass CRS, extent, and mask to each function) feels more verbose initially. In practice, explicit is better: you can see exactly what each operation does without hunting for global state set 200 lines earlier.
When to Keep ArcPy
Migration is not all-or-nothing. Keeping ArcPy for specific workflows is a legitimate engineering decision, not a failure. Here are the six scenarios where ArcPy remains the right tool.
Network Analysis Workflows
Service areas, closest facility, OD cost matrices. pgRouting exists but requires PostGIS setup and a different data model. If you depend on ESRI network datasets with turn restrictions and one-way streets, the migration effort outweighs the benefit.
Complex Raster Processing Chains
Multi-step raster workflows using Spatial Analyst - weighted overlay, cost distance, viewshed chaining. Replicating these in rasterio requires significantly more code and extensive testing. If the chain works and runs infrequently, leave it.
Cartographic Production
Map layouts, symbology, annotation. ArcGIS Pro's cartographic engine has no open-source equivalent at the same quality level. QGIS covers basic map production, but enterprise cartographic output still favours ArcGIS Pro.
Utility Network Integration
If your workflows touch Utility Network data, there is no migration path. Full stop. The Utility Network data model is proprietary to ArcGIS. This is not a technical limitation that will be solved - it's a vendor lock-in by design.
When the Team Pushes Back
Migration requires buy-in. An analyst forced to use unfamiliar tools is less productive than one using familiar tools willingly. Address resistance through training and gradual adoption, not mandates. Start with the analysts who are interested and let results speak.
When Migration Cost Exceeds 2 Years of Licence Savings
Do the arithmetic. If migrating 30 scripts costs $50K in engineering time and your annual ArcPy licence cost is $15K, the payback is 3.3 years. Technology changes in 3.3 years. Team priorities shift. Long payback periods carry execution risk that rarely shows up in the business case spreadsheet.
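The payback arithmetic is simple enough to keep next to the inventory. A minimal sketch, with hypothetical function names and the 2-year limit from the heading above as the default:

```python
def payback_years(migration_cost, annual_licence_saving):
    """Years of licence savings needed to recover the one-off migration cost."""
    if annual_licence_saving <= 0:
        raise ValueError("annual_licence_saving must be positive")
    return migration_cost / annual_licence_saving


def worth_migrating(migration_cost, annual_licence_saving, max_payback_years=2.0):
    """True when the migration pays for itself within the chosen limit."""
    return payback_years(migration_cost, annual_licence_saving) <= max_payback_years


print(round(payback_years(50_000, 15_000), 1))  # 3.3
print(worth_migrating(50_000, 15_000))          # False
```

Rerun it per workflow rather than for the whole inventory: a handful of expensive Red scripts can fail the test while the Green and Yellow scripts pass comfortably.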
Migration Timeline
For a typical 20-script inventory, this is the phase-by-phase timeline. Adjust durations based on your team size and script complexity.

MIGRATION PHASES
Inventory and Categorise (1 week)
Find all scripts, categorise as Green/Yellow/Red/Grey. Document dependencies between scripts. Interview analysts about each workflow.
Migrate Green Scripts (2-3 weeks)
Start with the easiest wins. Build confidence and establish patterns. Each Green script takes 2-8 hours. This phase also trains the team on GeoPandas idioms.
Migrate Yellow Scripts (2-3 weeks)
Medium complexity. Requires refactoring arcpy.env patterns, cursor replacements, and field calculation logic. Each Yellow script takes 4-16 hours.
Parallel Testing (2 weeks minimum)
Run old and new scripts on the same data. Compare every output using the comparison script. Fix discrepancies. Zero failures for 2 consecutive weeks before proceeding.
Red Scripts Decision (1 week)
For each Red script: find a workaround (pgRouting, GRASS GIS), keep it in ArcPy, or redesign the workflow entirely. Document the decision and rationale.
Switch Over (1 week)
Decommission old ArcPy scripts. Update documentation. Conduct team training sessions. Set up monitoring for the new workflows.
TOTAL TIMELINE
Allow 8-12 weeks for a 20-script inventory. Larger inventories (40+ scripts) take 16-20 weeks. Smaller ones (10 scripts) can complete in 5-6 weeks. The parallel testing phase is non-negotiable regardless of inventory size.
Frequently Asked Questions
Can I migrate ArcPy scripts to open source?
Yes, approximately 80% of ArcPy functions have direct open-source equivalents in GeoPandas, rasterio, and Shapely. About 12 functions - mainly Network Analyst, 3D Analyst, and Utility Network - have no direct equivalent and require workarounds or must remain in ArcPy.
How long does it take to migrate from ArcPy?
For a typical 20-script inventory, expect 8-12 weeks: inventory (1 week), easy scripts (2-3 weeks), medium scripts (2-3 weeks), parallel testing (2 weeks), and switch-over (1 week). Individual scripts take 2-16 hours depending on complexity.
What is the best replacement for ArcPy?
GeoPandas is the closest equivalent for vector analysis (buffer, intersect, dissolve, spatial join). rasterio replaces Spatial Analyst for raster operations. Shapely handles geometry operations. For network analysis, pgRouting (PostGIS) or NetworkX are alternatives. No single library replaces all of ArcPy.
Migration is a file-by-file process, not a big-bang switch. Start with the Green scripts, build confidence, and let the parallel testing results justify the next phase.
The function mapping table and 12-gap analysis give you the honest picture upfront. No surprises mid-project. The scripts that can't migrate stay in ArcPy - that's engineering pragmatism, not failure.
If you're starting from the business case, Part 1 covers the ROI analysis. For a deep dive into individual function translations with benchmarks, see our ArcPy to GeoPandas guide and the side-by-side benchmark comparison.
