A border portmanteau is a region or town near a mutual border whose name combines the names of two, or occasionally three, adjacent states. The most famous example is probably “Texarkana,” a combination of Texas, Arkansas, and Louisiana; there is both a Texarkana, TX and a Texarkana, AR. Having seen “Pen Mar, MD” on the map, I was curious which state borders have a border portmanteau.
I generated the border geometries using a slightly modified version of the adjacency code described here. The data came from the “Border portmanteaus” section of the List of geographic portmanteaus Wikipedia article. Some border portmanteaus no longer exist (e.g., Nosodak, ND) or have no current population (e.g., Oklarado, CO), but they are included anyway.
As anyone who has tried to make a map of the United States can tell you, the true locations of Alaska and Hawaii make creating good national maps difficult. Including these two states in their true locations leaves a lot of blank space on the map. As a result, maps often either omit Alaska and Hawaii and show only the continental United States, or resize and move them so that they appear close to the other states. The latter option is definitely preferable if you want to include Alaska and Hawaii in your map.
Scaling and Translating Alaska and Hawaii
Scaling changes the size of a geometry, and translating moves it left, right, up, or down. Given Alaska’s size, particularly in coordinate reference systems like EPSG:3857, Alaska and Hawaii need to be scaled and translated to fit in nicely with the lower 48 states. One quirk of scaling geographic data is that if you are plotting sub-state geographies of Alaska and Hawaii, the scaling step may pull those geographies apart, as in the example below:
In the above map, the counties in Alaska and Hawaii are scaled not around a fixed point but around the centers of their own respective geometries. This is mostly fine for Hawaii, as its counties largely consist of islands, but it destroys the county adjacency relationships in Alaska. If you instead scale the county geometries around a fixed point, the adjacency relationships are maintained and Alaska looks just like you’d expect it to, as in the map below:
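To see why the fixed point matters, here is a small shapely sketch (my own toy example, not from the original post) with two adjacent squares standing in for counties:

```python
from shapely.geometry import box
from shapely.affinity import scale

# Two unit squares sharing the edge x = 1, standing in for adjacent counties
left, right = box(0, 0, 1, 1), box(1, 0, 2, 1)

# Scaling each square around its own center opens a gap between them
left_own, right_own = scale(left, 0.5, 0.5), scale(right, 0.5, 0.5)
print(left_own.touches(right_own))  # False: adjacency destroyed

# Scaling both around the same fixed point preserves the shared edge
left_fix = scale(left, 0.5, 0.5, origin=(0, 0))
right_fix = scale(right, 0.5, 0.5, origin=(0, 0))
print(left_fix.touches(right_fix))  # True: adjacency maintained
```

The same logic carries over to a whole GeoDataFrame of counties: as long as every geometry is scaled around one shared origin, shared borders stay shared.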
Code to Scale and Translate Alaska and Hawaii
I’ve had to perform this operation many times and have found myself digging back into old code to find the exact numbers used in the scaling and translation. The code below includes the scaling and translation needed to move the Alaska and Hawaii data in a Python GeoDataFrame into these positions.
The fourth parameter of the scale operation is the fixed point (the origin). Note that the code currently converts the GeoDataFrame to CRS 3857; other coordinate reference systems may require different scaling and translation values.
The code takes in a GeoDataFrame containing data for Alaska and Hawaii, the name of the column on which Alaska and Hawaii can be filtered, and the values identifying Alaska and Hawaii in that column.
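Since the exact snippet isn’t reproduced here, the following is a minimal sketch of that operation. The function name `move_alaska_hawaii`, the scale factors, the fixed-point origins, and the translation offsets are all illustrative assumptions, not the post’s actual values; only the pattern (filter, scale around a fixed point, translate) follows the description above.

```python
import geopandas as gp

def move_alaska_hawaii(gdf, state_col, ak_val, hi_val):
    # Work in EPSG:3857; other CRSs would need different values
    gdf = gdf.to_crs(3857)
    ak = gdf[state_col] == ak_val
    hi = gdf[state_col] == hi_val
    # Scale Alaska around a fixed point (the fourth parameter of scale) so
    # that sub-state geometries keep their adjacency, then shift it toward
    # the lower 48. The numbers below are placeholders, not the real values.
    gdf.loc[ak, "geometry"] = (
        gdf.loc[ak, "geometry"]
        .scale(0.35, 0.35, origin=(-18_000_000, 9_000_000))
        .translate(xoff=4_000_000, yoff=-8_000_000)
    )
    # Hawaii mostly just needs to be moved, not resized
    gdf.loc[hi, "geometry"] = (
        gdf.loc[hi, "geometry"]
        .scale(1.0, 1.0, origin=(-17_500_000, 2_200_000))
        .translate(xoff=6_000_000, yoff=800_000)
    )
    return gdf
```

`GeoSeries.scale` accepts either the string `'center'`/`'centroid'` or an `(x, y)` tuple as its origin; passing one shared tuple for all of Alaska’s geometries is what keeps its counties from being pulled apart.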
When calculating adjacency, there is sometimes a distinction made between queen adjacency and rook adjacency.
Queen vs. Rook Adjacency
For a given shape X, the Queen adjacent shapes are all the shapes that touch X. The Rook adjacent shapes are all the shapes that touch X at more than just a single point.
The two types of adjacency are named as a nod to the movement of the chess pieces. Rooks can only move up, down, left, or right, whereas queens can also move in any diagonal direction. From a given square X on a chessboard, the rook adjacent squares are those bordering X that a rook could travel to, and likewise for the queen adjacent squares.
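In geometric terms, rook neighbors share a border of positive length, while queen-only neighbors share just a single point, and the geometry type of the intersection tells the two apart. A small shapely illustration (my own made-up squares):

```python
from shapely.geometry import box

x = box(0, 0, 1, 1)
edge_neighbor = box(1, 0, 2, 1)    # shares the edge x = 1 with `x`
corner_neighbor = box(1, 1, 2, 2)  # touches `x` only at the point (1, 1)

# A shared border intersects in a line: rook (and therefore queen) adjacent
print(x.intersection(edge_neighbor).geom_type)    # LineString
# A corner touch intersects in a single point: queen adjacent only
print(x.intersection(corner_neighbor).geom_type)  # Point
```

This geometry-type distinction is exactly what the code below uses to split point adjacencies from the rest.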
The shapefile adjacency code I shared here, which uses a buffer, will only return the queen adjacency.
The new code, shared below, does not use a buffer. It works by intersecting the shapefile with itself and then separating out the intersections that are just of Point geometry type; these point intersections are the queen-only adjacencies, and they are what distinguish the queen results from the rook results.
import pandas as pd
import geopandas as gp
from collections import defaultdict
def calculate_adjacency(gdf, unique_col):
    '''
    Takes a GeoDataFrame and returns two DataFrames: the first with rook
    adjacencies and the second with queen adjacencies.
    Both DataFrames have two columns: the first with the unique column values,
    and the second with a list of adjacent geometries, listed by their unique
    column value.
    '''
    # Confirm that unique_col is actually unique
    if gdf[unique_col].value_counts(dropna=False).max() > 1:
        raise ValueError("Non-unique column provided")
    # Intersect the GeoDataFrame with itself, without a buffer
    all_intersections = gp.overlay(gdf, gdf, how="intersection", keep_geom_type=False)
    # Filter out self-intersections
    filtered_intersections = all_intersections[all_intersections[unique_col + "_1"] != all_intersections[unique_col + "_2"]]
    # Separate out point intersections
    point_intersections = filtered_intersections[filtered_intersections.geom_type == "Point"]
    non_point_intersections = filtered_intersections[filtered_intersections.geom_type != "Point"]
    # Pair up the unique_col values present in the non-point intersections
    non_point_intersection_tuples = tuple(zip(non_point_intersections[unique_col + "_1"], non_point_intersections[unique_col + "_2"]))
    # Map from a unique_col value to a list of the unique_col values it is adjacent to
    rook_dict = defaultdict(list)
    # Iterate over the tuples
    for val in non_point_intersection_tuples:
        rook_dict[val[0]].append(val[1])
    # Some shapes only intersect with themselves and were not added above;
    # give each of them an empty adjacency list
    for val in set(gdf[unique_col]).difference(rook_dict.keys()):
        rook_dict[val] = []
    # Create DataFrame of rook adjacencies
    df_rook = pd.DataFrame()
    df_rook[unique_col] = list(rook_dict.keys())
    df_rook["ADJ_GEOMS"] = list(rook_dict.values())
    # Copy the dictionary (and its lists) so we can add the point intersections
    queen_dict = {key: value[:] for key, value in rook_dict.items()}
    # Pair up the unique_col values present in the point intersections
    point_intersection_tuples = tuple(zip(point_intersections[unique_col + "_1"], point_intersections[unique_col + "_2"]))
    for val in point_intersection_tuples:
        queen_dict[val[0]].append(val[1])
    # Create DataFrame of queen adjacencies
    df_queen = pd.DataFrame()
    df_queen[unique_col] = list(queen_dict.keys())
    df_queen["ADJ_GEOMS"] = list(queen_dict.values())
    return df_rook, df_queen
Comparing the Code
As you can see by comparing this image with the earlier one, AZ + CO and UT + NM are point adjacent to one another.
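The point-adjacency behavior can be checked on a toy version of the Four Corners: four unit squares meeting at a single point, with the diagonal pairs playing the role of AZ + CO and UT + NM. This sketch (my own example, using made-up square "states") repeats the core overlay-and-filter step from the function above:

```python
import geopandas as gp
from shapely.geometry import box

# Four unit squares meeting at the point (1, 1); the diagonal pairs
# (SW/NE and SE/NW) touch only at that corner, like AZ + CO and UT + NM
gdf = gp.GeoDataFrame(
    {"NAME": ["SW", "SE", "NW", "NE"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(0, 1, 1, 2), box(1, 1, 2, 2)],
)
# Intersect the GeoDataFrame with itself, keeping lower-dimension results
inter = gp.overlay(gdf, gdf, how="intersection", keep_geom_type=False)
# Drop self-intersections
inter = inter[inter["NAME_1"] != inter["NAME_2"]]
# Bare Point intersections are the queen-only (point) adjacencies
point_pairs = inter[inter.geom_type == "Point"]
print(sorted(zip(point_pairs["NAME_1"], point_pairs["NAME_2"])))
```

Only the two diagonal pairs (in both directions) should come back as Point intersections; the edge-sharing neighbors intersect in LineStrings and so count as rook adjacent.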
The code below takes in a geodataframe and a unique column and returns a dictionary mapping each unique column value to a list of the column values it is adjacent to.
As written, the code uses a buffer of 1 in the 3857 CRS and, as you can see below, accounts for point (queen) adjacency.
The next version of the code will attempt to do the same thing without using a buffer and return an adjacency matrix rather than a dictionary.
def calculate_adjacency(gdf, unique_col):
    '''
    Takes a geodataframe and returns a dictionary of adjacencies
    '''
    # Convert to a projected CRS so the buffer distance is meaningful
    gdf = gdf.to_crs(3857)
    # Make a copy of the GeoDataFrame
    gdf_buffer = gdf.copy(deep=True)
    # Add a buffer of 1 to the geometry of the copied GeoDataFrame
    gdf_buffer["geometry"] = gdf.buffer(1)
    # Intersect the buffered GeoDataFrame with the original GeoDataFrame
    test_intersection = gp.overlay(gdf_buffer, gdf, how="intersection")
    # Pair up the unique_col values present in the intersection
    test_intersection_tuples = tuple(zip(test_intersection[unique_col + "_1"], test_intersection[unique_col + "_2"]))
    # Map from a unique_col value to a list of the unique_col values it is adjacent to
    final_dict = {}
    # Iterate over the tuples
    for val in test_intersection_tuples:
        # Every shape intersects with itself; don't add these to the dictionary
        if val[0] != val[1]:
            if val[0] in final_dict:
                # If the shape is already in the dictionary, append the adjacent shape
                final_dict[val[0]].append(val[1])
            else:
                # Otherwise, create a key mapping to a list with the adjacent shape
                final_dict[val[0]] = [val[1]]
    # Some shapes only intersect with themselves and were not added above
    for val in [i for i in gdf[unique_col] if i not in final_dict]:
        # For each of these, add a blank list to the dictionary
        final_dict[val] = []
    # Return the adjacency dictionary
    return final_dict
Example output from running the code on a shapefile of the US states from the Census Bureau.