

Many thanks to Sasha Trubetskoy for the color scheme.
ZIP Codes are used by the United States Postal Service (USPS) to deliver mail and do not always correspond to polygonal areas. The Census Bureau releases ZIP Code Tabulation Areas (ZCTAs), which approximate the boundaries of ZIP Codes. ZCTAs do not cover the entire country, as shown by the grey areas in the maps below.

One problem that comes up when working with ZIP Codes is that some of them contain areas in multiple states. While ZCTAs are not official USPS ZIP Code data, the lower map shows the ZCTAs that cross state lines, of which there are 137.

The table below lists the ZCTAs highlighted in red above.
| ZCTA5 | States | Number of States |
|---|---|---|
| 86514 | {UT, NM, AZ} | 3 |
| 82082 | {WY, NE, CO} | 3 |
| 57717 | {MT, SD, WY} | 3 |
| 71749 | {AR, LA} | 2 |
| 69168 | {NE, CO} | 2 |
| 69201 | {SD, NE} | 2 |
| 69212 | {SD, NE} | 2 |
| 69216 | {SD, NE} | 2 |
| 69337 | {SD, NE} | 2 |
| 71953 | {OK, AR} | 2 |
| 71937 | {OK, AR} | 2 |
| 72338 | {TN, AR} | 2 |
| 72644 | {MO, AR} | 2 |
| 73949 | {OK, TX} | 2 |
| 75556 | {AR, TX} | 2 |
| 79835 | {NM, TX} | 2 |
| 69026 | {KS, NE} | 2 |
| 02861 | {MA, RI} | 2 |
| 79922 | {NM, TX} | 2 |
| 66541 | {KS, NE} | 2 |
| 59275 | {ND, MT} | 2 |
| 59847 | {ID, MT} | 2 |
| 63673 | {MO, IL} | 2 |
| 65729 | {MO, AR} | 2 |
| 65733 | {MO, AR} | 2 |
| 65761 | {MO, AR} | 2 |
| 66955 | {KS, NE} | 2 |
| 68978 | {KS, NE} | 2 |
| 68325 | {KS, NE} | 2 |
| 68327 | {KS, NE} | 2 |
| 68420 | {KS, NE} | 2 |
| 68719 | {SD, NE} | 2 |
| 68755 | {SD, NE} | 2 |
| 68943 | {KS, NE} | 2 |
| 79837 | {NM, TX} | 2 |
| 81120 | {NM, CO} | 2 |
| 80737 | {NE, CO} | 2 |
| 86515 | {NM, AZ} | 2 |
| 99128 | {ID, WA} | 2 |
| 99033 | {ID, WA} | 2 |
| 97913 | {ID, OR} | 2 |
| 97910 | {ID, OR} | 2 |
| 97635 | {CA, OR} | 2 |
| 89832 | {ID, NV} | 2 |
| 89439 | {CA, NV} | 2 |
| 89421 | {OR, NV} | 2 |
| 89061 | {CA, NV} | 2 |
| 89060 | {CA, NV} | 2 |
| 89019 | {CA, NV} | 2 |
| 89010 | {CA, NV} | 2 |
| 88430 | {NM, TX} | 2 |
| 87328 | {NM, AZ} | 2 |
| 86504 | {NM, AZ} | 2 |
| 59221 | {ND, MT} | 2 |
| 86044 | {UT, AZ} | 2 |
| 84536 | {UT, AZ} | 2 |
| 84531 | {UT, AZ} | 2 |
| 84034 | {UT, NV} | 2 |
| 83856 | {ID, WA} | 2 |
| 83342 | {UT, ID} | 2 |
| 83127 | {ID, WY} | 2 |
| 83120 | {ID, WY} | 2 |
| 82930 | {UT, WY} | 2 |
| 82801 | {MT, WY} | 2 |
| 82701 | {SD, WY} | 2 |
| 82063 | {WY, CO} | 2 |
| 81324 | {UT, CO} | 2 |
| 81137 | {NM, CO} | 2 |
| 59270 | {ND, MT} | 2 |
| 58653 | {ND, SD} | 2 |
| 03579 | {NH, ME} | 2 |
| 38549 | {TN, KY} | 2 |
| 54540 | {MI, WI} | 2 |
| 52626 | {MO, IA} | 2 |
| 52573 | {MO, IA} | 2 |
| 52542 | {MO, IA} | 2 |
| 51640 | {MO, IA} | 2 |
| 51557 | {IA, NE} | 2 |
| 51360 | {MN, IA} | 2 |
| 51023 | {SD, IA} | 2 |
| 51001 | {SD, IA} | 2 |
| 42602 | {TN, KY} | 2 |
| 42223 | {TN, KY} | 2 |
| 40965 | {TN, KY} | 2 |
| 38852 | {AL, MS} | 2 |
| 38769 | {AR, MS} | 2 |
| 38326 | {TN, MS} | 2 |
| 56136 | {MN, SD} | 2 |
| 38079 | {TN, KY} | 2 |
| 37752 | {VA, TN} | 2 |
| 37642 | {VA, TN} | 2 |
| 36855 | {AL, GA} | 2 |
| 30741 | {TN, GA} | 2 |
| 30165 | {AL, GA} | 2 |
| 28675 | {NC, VA} | 2 |
| 27048 | {NC, VA} | 2 |
| 24622 | {VA, WV} | 2 |
| 24604 | {VA, WV} | 2 |
| 21912 | {DE, MD} | 2 |
| 21874 | {DE, MD} | 2 |
| 20135 | {VA, WV} | 2 |
| 19973 | {DE, MD} | 2 |
| 56027 | {MN, IA} | 2 |
| 56144 | {MN, SD} | 2 |
| 58649 | {ND, SD} | 2 |
| 57660 | {ND, SD} | 2 |
| 58639 | {ND, SD} | 2 |
| 58623 | {ND, SD} | 2 |
| 58621 | {ND, MT} | 2 |
| 58568 | {ND, SD} | 2 |
| 58439 | {ND, SD} | 2 |
| 58436 | {ND, SD} | 2 |
| 58413 | {ND, SD} | 2 |
| 58225 | {ND, MN} | 2 |
| 58053 | {ND, SD} | 2 |
| 58043 | {ND, SD} | 2 |
| 58041 | {ND, SD} | 2 |
| 58030 | {ND, MN} | 2 |
| 57735 | {SD, NE} | 2 |
| 57724 | {MT, SD} | 2 |
| 57648 | {ND, SD} | 2 |
| 56164 | {MN, SD} | 2 |
| 57645 | {ND, SD} | 2 |
| 57642 | {ND, SD} | 2 |
| 57641 | {ND, SD} | 2 |
| 57638 | {ND, SD} | 2 |
| 57430 | {ND, SD} | 2 |
| 57255 | {ND, SD} | 2 |
| 57068 | {MN, SD} | 2 |
| 57034 | {SD, IA} | 2 |
| 57030 | {MN, SD} | 2 |
| 57026 | {MN, SD} | 2 |
| 56744 | {ND, MN} | 2 |
| 56257 | {MN, SD} | 2 |
| 56220 | {MN, SD} | 2 |
| 56219 | {MN, SD} | 2 |
| 99362 | {WA, OR} | 2 |
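Multi-state ZCTAs like these can be found with a spatial join in GeoPandas. The sketch below uses toy squares in place of real Census shapefiles, so every name and geometry is illustrative; a small negative buffer is applied so that ZCTAs merely touching a state border are not counted as crossing it:

```python
import geopandas as gp
from shapely.geometry import box

# Toy "states": two unit squares side by side (names are illustrative)
states = gp.GeoDataFrame(
    {"STUSPS": ["AA", "BB"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
    crs=3857,
)
# Toy "ZCTAs": one inside state AA, one straddling the AA/BB border
zctas = gp.GeoDataFrame(
    {"ZCTA5": ["11111", "22222"]},
    geometry=[box(0.1, 0.1, 0.4, 0.4), box(0.8, 0.1, 1.2, 0.4)],
    crs=3857,
)
# Shrink the ZCTAs slightly so edge-only contact with a state
# does not register as an intersection
zctas_inner = zctas.copy()
zctas_inner["geometry"] = zctas_inner.buffer(-0.01)
# Pair each ZCTA with every state it overlaps, then count states per ZCTA
joined = gp.sjoin(zctas_inner, states, predicate="intersects")
counts = joined.groupby("ZCTA5")["STUSPS"].nunique()
multi_state = counts[counts > 1]
print(multi_state.index.tolist())  # ['22222']
```

With real TIGER/Line ZCTA and state shapefiles in place of the toy squares, the same join produces the list in the table above.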
As anyone who has tried to make a map of the United States will tell you, the true locations of Alaska and Hawaii make creating good national maps difficult. Including these two states in their true locations leaves a lot of blank space on the map. As a result, maps often either cut out Alaska and Hawaii and show only the continental United States, or resize and move them so that they appear close to the other states. The latter option is definitely preferable if you want to include Alaska and Hawaii in your map.

Scaling changes the size of a geometry, and translating moves it left, right, up, or down. Given Alaska’s size, particularly in coordinate reference systems like 3857, Alaska and Hawaii need to be scaled and translated to fit in nicely with the lower 48 states. One quirk of scaling geographic data is that if you are plotting sub-state geographies of Alaska and Hawaii, the scaling step may pull these geographies apart, as in the example below:

In the above map, the counties in Alaska and Hawaii are not scaled around a fixed point, but around the centers of their own respective geometries. This is mostly fine for Hawaii, whose counties largely consist of islands, but it destroys the county adjacency relationships in Alaska. If you instead scale the county geometries around a fixed point, the adjacency relationships are maintained and Alaska looks just as you’d expect, as in the map below:
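The difference is easy to see with GeoPandas’ `scale` directly: the default `origin="center"` scales each geometry around its own bounding-box center, while a shared fixed origin scales everything together. A minimal sketch with two toy squares:

```python
import geopandas as gp
from shapely.geometry import box

# Two adjacent unit squares that share the edge x=1
counties = gp.GeoSeries([box(0, 0, 1, 1), box(1, 0, 2, 1)])

# The default origin scales each square around its own center,
# pulling the shared edge apart
apart = counties.scale(0.5, 0.5, origin="center")
print(apart.iloc[0].touches(apart.iloc[1]))  # False

# A single fixed origin scales both squares together,
# so they still share an edge afterwards
together = counties.scale(0.5, 0.5, origin=(0, 0))
print(together.iloc[0].touches(together.iloc[1]))  # True
```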

I’ve had to perform this operation many times and have found myself digging back into old code to find the exact numbers used in the scaling and translation. The code below includes the scaling and translation needed to move Alaska and Hawaii data in a GeoPandas GeoDataFrame to these locations.
The fourth parameter of the scale operation is the fixed point. Please note that the code converts the GeoDataFrame to CRS 3857; other coordinate reference systems may require different scaling and translation values.
The function takes a GeoDataFrame containing data for Alaska and Hawaii, the name of the column on which Alaska and Hawaii can be filtered, and the values identifying Alaska and Hawaii in that column.
```python
import geopandas as gp
import pandas as pd

def move_alaska_hawaii(gdf, filter_col, ak_id, hi_id):
    gdf = gdf.to_crs(3857)
    alaska = gdf[gdf[filter_col] == ak_id]
    hawaii = gdf[gdf[filter_col] == hi_id]
    remaining = gdf[~gdf[filter_col].isin([ak_id, hi_id])]
    # Scale around a fixed point (the fourth argument), then translate
    alaska = alaska.set_geometry(
        alaska.scale(.2, .2, .2, (-13452629.057, 3227683.786)).translate(.215e7, -1.36e6))
    hawaii = hawaii.set_geometry(
        hawaii.scale(1.5, 1.5, 1.5, (-14384434.819, 2342560.248)).translate(.57e7, 1e6))
    gdf = gp.GeoDataFrame(pd.concat([alaska, hawaii, remaining]), crs=3857)
    return gdf
```
Python code to dissolve GeoPandas GeoDataFrames on adjacent geometries, with a parameter to include or exclude point adjacencies.
Find the most up-to-date version of the code here.
```python
import geopandas as gp
from itertools import combinations, starmap

def adjacency_dissolve(gdf, include_point_adjacency=True):
    '''
    Takes a GeoDataFrame and returns it with adjacent geometries dissolved.
    Includes an "include_point_adjacency" parameter, with a default value of True.
    Currently the code is not fine-tuned to handle non-geometric data during the dissolve.
    '''
    # Make sure that the index column is unique
    if "index" in gdf.columns:
        raise ValueError("Column already named 'index'")
    gdf.reset_index(inplace=True, drop=False)
    if not gdf.index.is_unique:
        raise ValueError("Non-unique index column")
    adj_groups = calculate_adjacency(gdf, include_point_adjacency)
    if "Dissolve_Assignment" in gdf.columns:
        raise ValueError("Existing 'Dissolve_Assignment' column")
    gdf["Dissolve_Assignment"] = ""
    for i in range(len(adj_groups)):
        if isinstance(adj_groups[i], set):
            # A group of mutually adjacent geometries
            gdf.loc[gdf["index"].isin(adj_groups[i]), "Dissolve_Assignment"] = i
        else:
            # A single geometry with no neighbors
            gdf.loc[gdf["index"] == adj_groups[i], "Dissolve_Assignment"] = i
    dissolved = gdf.dissolve("Dissolve_Assignment")
    dissolved.reset_index(drop=False, inplace=True)
    dissolved.drop(["index", "Dissolve_Assignment"], axis=1, inplace=True)
    return dissolved

def calculate_adjacency(gdf, include_point_adjacency=True):
    '''
    Takes a GeoDataFrame and returns a list of adjacency groups.
    '''
    # Intersect the GeoDataFrame with itself
    test_intersection = gp.overlay(gdf, gdf, how="intersection", keep_geom_type=False)
    # If include_point_adjacency is False, filter out intersections that are just points
    if not include_point_adjacency:
        test_intersection = test_intersection[test_intersection.geom_type != "Point"]
    # Get value counts after the intersections
    ser = test_intersection["index_1"].value_counts()
    # Filter out self-intersections
    test_intersection = test_intersection[test_intersection["index_1"] != test_intersection["index_2"]]
    # Build the index pairs present in the intersection
    test_intersection_tuples = list(zip(test_intersection["index_1"], test_intersection["index_2"]))
    # Geometries that only intersect themselves form their own single-member groups
    return subadjacencies_faster_3([set(i) for i in test_intersection_tuples]) + list(ser[ser == 1].index)

def subadjacencies_faster_3(dup_list):
    # If no two groups share a member, the grouping is already finished
    all_intersections = starmap(set.intersection, combinations(dup_list, 2))
    finished = True
    for val in all_intersections:
        if val != set():
            finished = False
    if finished:
        return dup_list
    # Otherwise, union any groups that share a member
    final_holder = []
    for val in dup_list:
        added = False
        added_indices = []
        for idx, x in enumerate(final_holder):
            if len(x.intersection(val)) > 0:
                final_holder[idx] = x.union(val)
                added_indices.append(idx)
                added = True
        # If val bridged multiple existing groups, merge them into one
        if len(added_indices) > 1:
            for i in range(1, len(added_indices)):
                final_holder[added_indices[0]] = final_holder[added_indices[0]].union(final_holder[added_indices[i]])
            for i in range(len(added_indices) - 1, 0, -1):
                final_holder.pop(added_indices[i])
        if not added:
            final_holder.append(val)
    return final_holder
```
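The merging loop above repeatedly unions any groups that share a member. For illustration, the same idea in a compact standalone form (a sketch, not the function used above):

```python
def merge_groups(groups):
    # Union any sets that share a member until all groups are disjoint
    merged = []
    for g in groups:
        # Pull out every existing group that overlaps g and fold it in
        overlapping = [m for m in merged if m & g]
        for m in overlapping:
            merged.remove(m)
            g = g | m
        merged.append(g)
    return merged

# {1,2} and {2,3} share member 2, so they collapse into one group
print(merge_groups([{1, 2}, {2, 3}, {4, 5}]))  # [{1, 2, 3}, {4, 5}]
```

A single pass suffices here because each incoming set absorbs every existing group it touches before being appended.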
Using election results data from MEDSL, I was able to make maps showing county-level presidential election results for the 2000, 2004, 2008, 2012, and 2016 presidential elections.
– No third-party candidate won any county in these 5 elections
– No county had a tie between the top two vote getters
– No county voted for the losing candidate in every election: (00-D, 04-D, 08-R, 12-R, 16-D)
– 39 counties voted: (00-D, 04-D, 08-R, 12-R, 16-R), only getting their first winner in 2016
(Please note: the maps are fairly high resolution, which means they may take a second to load. If you are struggling to view the counties, you can open the image in a new tab and enlarge it.)

Given the large number of different outcomes, it’s a bit easier to view the data by splitting the map into counties that voted Democratic in 2000 and counties that voted Republican:


– “INCOMPLETE” refers to Broomfield County, which became a new county in 2001. The boundaries of surrounding counties changed slightly as a result of this change.
– Election data from MEDSL and county shapes from the Census
– My apologies to Hawaii and Alaska, which will be included in future versions of this post
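Voting patterns like the ones listed above can be derived from a long-format results table. A minimal sketch with toy data (the column names are assumptions for illustration, not the MEDSL schema):

```python
import pandas as pd

# Toy long-format results: one row per county/year/party vote total
results = pd.DataFrame({
    "county": ["A"] * 4 + ["B"] * 4,
    "year":   [2000, 2000, 2004, 2004] * 2,
    "party":  ["D", "R"] * 4,
    "votes":  [60, 40, 45, 55, 30, 70, 20, 80],
})

# Winner per county-year: keep the highest-vote row for each pair
winners = (results.sort_values("votes", ascending=False)
                  .drop_duplicates(["county", "year"])
                  .pivot(index="county", columns="year", values="party"))

# Voting pattern string per county, e.g. "00-D, 04-R"
patterns = winners.apply(
    lambda row: ", ".join(f"{y % 100:02d}-{p}" for y, p in row.items()), axis=1)
print(patterns["A"])  # 00-D, 04-R
print(patterns["B"])  # 00-R, 04-R
```

Grouping counties by these pattern strings then gives the counts described above, such as the 39 counties that voted (00-D, 04-D, 08-R, 12-R, 16-R).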
The code below takes in a GeoDataFrame and a unique column and returns a dictionary mapping each unique column value to a list of the column values it is adjacent to.
As written, the code uses a buffer of 1 in the 3857 CRS and, as you can see below, accounts for point (Queen’s) adjacency.
The next version of the code will attempt to do the same thing without using a buffer and will return an adjacency matrix rather than a dictionary.
```python
import geopandas as gp

def calculate_adjacency(gdf, unique_col):
    '''
    Takes a GeoDataFrame and returns a dictionary of adjacencies.
    '''
    # Convert to a projected CRS so the buffer distance behaves as expected
    gdf = gdf.to_crs(3857)
    # Make a copy of the GeoDataFrame
    gdf_buffer = gdf.copy(deep=True)
    # Add a buffer of 1 to the geometry of the copied GeoDataFrame
    gdf_buffer["geometry"] = gdf.buffer(1)
    # Intersect the buffered GeoDataFrame with the original GeoDataFrame
    test_intersection = gp.overlay(gdf_buffer, gdf, how="intersection")
    # Build the unique_col pairs present in the intersection
    test_intersection_tuples = tuple(
        zip(test_intersection[unique_col + "_1"], test_intersection[unique_col + "_2"]))
    # Map each unique_col value to a list of the other values it is adjacent to
    final_dict = {}
    for val in test_intersection_tuples:
        # Shapes intersect with themselves; skip those pairs
        if val[0] != val[1]:
            if val[0] in final_dict:
                # Append the adjacent shape to the existing list
                final_dict[val[0]].append(val[1])
            else:
                # Otherwise, create a key mapping to a list with the adjacent shape
                final_dict[val[0]] = [val[1]]
    # Shapes that only intersect with themselves get an empty list
    for val in [i for i in gdf[unique_col] if i not in final_dict]:
        final_dict[val] = []
    # Return the adjacency dictionary
    return final_dict
```
Example output from running the code on a shapefile of the US states from the Census.
