Working with Spatial Data in Python

Sequence: Working with Spatial Data in Python

[Image: Bogda Mountains, September 1st, 1999, Landsat 7]

Sequence Summary:

This sequence of modules is an introduction to spatial analytics and working with large datasets in Python.

Why?

The analysis of spatial data has applications across many disciplines and is undertaken with a wide array of tools. Spatial data here refers to any data that is referenced in space (and often in time). Analysis in this context could mean looking at maps to choose your next apartment, a local council studying neighborhoods to decide where pedestrian crossings should be placed, or Amazon using road networks to optimize delivery routes. Geographic Information Systems (GIS) software is often the first tool that comes to mind for capturing, processing, and visualizing spatial data. However, Python offers flexibility, abstraction, and scale, enabling coders to work with almost any kind of information (and lots of it) and to do very sophisticated analytics.

Broadly speaking, Python is more versatile and powerful than many out-of-the-box software solutions for geographic data. GIS software is a complex tool that lets non-technical users manipulate maps and information layers fairly quickly. But the same elements that make it easy to use, namely the front-end interface, also make massive operations computationally expensive. Python, by contrast, gives the programmer the flexibility to write bare-bones code that can use data from almost any source and run operations on millions of rows of data.

The objective of this sequence is to analyze neighborhoods in NYC using open datasets, layer on socio-economic data points, and create an interactive map to display our findings. By the end of the sequence you will understand how to manipulate spatial data, perform basic analysis, and visualize results in Python using the Pandas, GeoPandas, and Shapely libraries.
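As a small taste of what manipulating spatial data in Python can look like, the sketch below uses Pandas and Shapely to check which points fall inside a polygon. It is only an illustration: the boundary coordinates, the point records, and the names `neighborhood`, `records`, and `inside_names` are made up for this example and are not taken from the NYC datasets used later in the sequence.

```python
import pandas as pd
from shapely.geometry import Point, Polygon

# Hypothetical neighborhood boundary as (longitude, latitude) pairs.
# These coordinates are invented for illustration, not real NYC data.
neighborhood = Polygon([
    (-74.00, 40.70),
    (-73.98, 40.70),
    (-73.98, 40.72),
    (-74.00, 40.72),
])

# Hypothetical point records, e.g. locations of some amenity.
records = pd.DataFrame({
    "name": ["a", "b", "c"],
    "lon": [-73.99, -73.95, -73.985],
    "lat": [40.71, 40.71, 40.715],
})

# Build a Shapely Point for each row and test whether the polygon contains it.
records["inside"] = [
    neighborhood.contains(Point(lon, lat))
    for lon, lat in zip(records["lon"], records["lat"])
]

# Keep only the names of the points that fall inside the boundary.
inside_names = records.loc[records["inside"], "name"].tolist()
print(inside_names)
```

The same idea (attach a geometry to each row, then ask a spatial question about it) is what GeoPandas builds on, so a row-by-row loop like this one would not normally be written by hand for larger datasets.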

Modules:

  1. Data manipulation
  2. Geospatial data
  3. Spatial analysis – point patterns
  4. Spatial analysis – correlation
  5. Spatial analysis – clustering