Peter Horton

Zip Codes, ZCTAs and ZCTAs that Cross State Boundaries

Zip codes are used by the United States Postal Service (USPS) to deliver mail and are not always polygon areas. The Census Bureau releases Zip Code Tabulation Areas which approximate the boundaries of zip codes. Zip Code Tabulation Areas do not cover the entire country, as represented by the grey areas in the maps below.

One problem that comes up when working with Zip Codes is the fact that some Zip Codes contain areas in multiple states. While this does not use official USPS zip code data, the lower map shows Zip Code Tabulation areas that cross state lines, of which there are 137.

The below table lists out the ZCTAs highlighted in red above.

ZCTA5	States	Number of States
86514	{UT, NM, AZ}	3
82082	{WY, NE, CO}	3
57717	{MT, SD, WY}	3
71749	{AR, LA}	2
69168	{NE, CO}	2
69201	{SD, NE}	2
69212	{SD, NE}	2
69216	{SD, NE}	2
69337	{SD, NE}	2
71953	{OK, AR}	2
71937	{OK, AR}	2
72338	{TN, AR}	2
72644	{MO, AR}	2
73949	{OK, TX}	2
75556	{AR, TX}	2
79835	{NM, TX}	2
69026	{KS, NE}	2
02861	{MA, RI}	2
79922	{NM, TX}	2
66541	{KS, NE}	2
59275	{ND, MT}	2
59847	{ID, MT}	2
63673	{MO, IL}	2
65729	{MO, AR}	2
65733	{MO, AR}	2
65761	{MO, AR}	2
66955	{KS, NE}	2
68978	{KS, NE}	2
68325	{KS, NE}	2
68327	{KS, NE}	2
68420	{KS, NE}	2
68719	{SD, NE}	2
68755	{SD, NE}	2
68943	{KS, NE}	2
79837	{NM, TX}	2
81120	{NM, CO}	2
80737	{NE, CO}	2
86515	{NM, AZ}	2
99128	{ID, WA}	2
99033	{ID, WA}	2
97913	{ID, OR}	2
97910	{ID, OR}	2
97635	{CA, OR}	2
89832	{ID, NV}	2
89439	{CA, NV}	2
89421	{OR, NV}	2
89061	{CA, NV}	2
89060	{CA, NV}	2
89019	{CA, NV}	2
89010	{CA, NV}	2
88430	{NM, TX}	2
87328	{NM, AZ}	2
86504	{NM, AZ}	2
59221	{ND, MT}	2
86044	{UT, AZ}	2
84536	{UT, AZ}	2
84531	{UT, AZ}	2
84034	{UT, NV}	2
83856	{ID, WA}	2
83342	{UT, ID}	2
83127	{ID, WY}	2
83120	{ID, WY}	2
82930	{UT, WY}	2
82801	{MT, WY}	2
82701	{SD, WY}	2
82063	{WY, CO}	2
81324	{UT, CO}	2
81137	{NM, CO}	2
59270	{ND, MT}	2
58653	{ND, SD}	2
03579	{NH, ME}	2
38549	{TN, KY}	2
54540	{MI, WI}	2
52626	{MO, IA}	2
52573	{MO, IA}	2
52542	{MO, IA}	2
51640	{MO, IA}	2
51557	{IA, NE}	2
51360	{MN, IA}	2
51023	{SD, IA}	2
51001	{SD, IA}	2
42602	{TN, KY}	2
42223	{TN, KY}	2
40965	{TN, KY}	2
38852	{AL, MS}	2
38769	{AR, MS}	2
38326	{TN, MS}	2
56136	{MN, SD}	2
38079	{TN, KY}	2
37752	{VA, TN}	2
37642	{VA, TN}	2
36855	{AL, GA}	2
30741	{TN, GA}	2
30165	{AL, GA}	2
28675	{NC, VA}	2
27048	{NC, VA}	2
24622	{VA, WV}	2
24604	{VA, WV}	2
21912	{DE, MD}	2
21874	{DE, MD}	2
20135	{VA, WV}	2
19973	{DE, MD}	2
56027	{MN, IA}	2
56144	{MN, SD}	2
58649	{ND, SD}	2
57660	{ND, SD}	2
58639	{ND, SD}	2
58623	{ND, SD}	2
58621	{ND, MT}	2
58568	{ND, SD}	2
58439	{ND, SD}	2
58436	{ND, SD}	2
58413	{ND, SD}	2
58225	{ND, MN}	2
58053	{ND, SD}	2
58043	{ND, SD}	2
58041	{ND, SD}	2
58030	{ND, MN}	2
57735	{SD, NE}	2
57724	{MT, SD}	2
57648	{ND, SD}	2
56164	{MN, SD}	2
57645	{ND, SD}	2
57642	{ND, SD}	2
57641	{ND, SD}	2
57638	{ND, SD}	2
57430	{ND, SD}	2
57255	{ND, SD}	2
57068	{MN, SD}	2
57034	{SD, IA}	2
57030	{MN, SD}	2
57026	{MN, SD}	2
56744	{ND, MN}	2
56257	{MN, SD}	2
56220	{MN, SD}	2
56219	{MN, SD}	2
99362	{WA, OR}	2

Moving Alaska and Hawaii for Mapping in National Maps

Problems Mapping Alaska and Hawaii

As anyone who has tried to make a map of the United States would tell you, the true locations of Alaska and Hawaii make creating good national maps difficult. Including these two states in their true locations leaves a lot of blank space on the map. As such, maps will oftentimes either cut out Alaska and Hawaii and only show the continental United States or resize and move them so as they appear close to the other states. The latter option is definitely preferable if you want to include Alaska and Hawaii in your map.

Scaling and Translating Alaska and Hawaii

Scaling changes the size of a geometry and translating it moves left, right, up or down. Given Alaska’s size, particularly in coordinate reference systems like 3857, Alaska and Hawaii need to be scaled and translated to fit in nicely with the lower 48 states. One quirk with scaling geographic data is that if you are attempting to plot sub-state geographies of Alaska and Hawaii, the scaling step may pull these geographies apart, as in the example below:

In the above map, the counties in Alaska and Hawaii being plotted are not scaled around a fixed point, but instead to the center of their own respective geometries. This is mostly fine for Hawaii as its counties largely consist of islands, but destroys the county adjacency relationship in Alaska. If you scale the county geometries around a fixed point however, the adjacency relationships are maintained and Alaska looks just like you’d expect it to, as in the map below:

Code to Scale and Translate Alaska and Hawaii

I’ve had to perform this operation many times and have found myself digging back into old code to find the exact numbers used in the scaling and translation. The code below includes the necessary scaling and translation to move data for Alaska and Hawaii in a Python GeoDataFrame to these locations.

The fourth parameter used when performing the scale operation is the fixed point. Please note that the code currently modifies the GeoDataFrame to be in crs 3857. For other coordinate reference systems, different scaling and translating values may be required.

The code takes in a GeoDataFrame containing data for Alaska and Hawaii, a string to refer to the column name where Alaska and Hawaii can be filtered and then inputs for the values for Alaska and Hawaii in that column.

def move_alaska_hawaii(gdf, filter_col, ak_id, hi_id):
    
    gdf = gdf.to_crs(3857)
        
    alaska = gdf[gdf[filter_col] == ak_id]
    hawaii = gdf[gdf[filter_col] == hi_id]
    
    remaining = gdf[~gdf[filter_col].isin([ak_id, hi_id])]
    
    alaska = alaska.set_geometry(alaska.scale(.2,.2,.2,(-13452629.057,3227683.786)).translate(.215e7, -1.36e6))
    hawaii = hawaii.set_geometry(hawaii.scale(1.5,1.5,1.5,(-14384434.819, 2342560.248)).translate(.57e7, 1e6))
    
    gdf = gp.GeoDataFrame(pd.concat([alaska, hawaii, remaining]), crs = 3857)
        
    return gdf

Python GeoDataFrame Dissolve on Adjacencies

Python code to dissolve geopandas GeoDataFrames on adjacent geometries, with parameters to include or exclude point adjacencies.

Find the most up-to-date version of the code here.

import geopandas as gp
from itertools import combinations, starmap

def adjacency_dissolve(gdf, include_point_adjacency = True):
    '''
    Code that takes in a geodataframe and returns the geodataframe with adjacencies dissolved
    Includes a "include_point_adjacency" parameter, with a default value set to "True"
    Currently the code is not fine-tuned to handle non-geometric data during the dissolve 
    '''
    # Make sure that the index column is unique
    if "index" in gdf.columns:
        raise ValueError("Column already named 'index'")
        
    gdf.reset_index(inplace = True, drop = False)
    
    if not gdf.index.is_unique:
        raise ValueError ("Non-unique index column")
    
    adj_groups = calculate_adjacency(gdf, include_point_adjacency)
    
    if "Dissolve_Assignment" in gdf.columns:
        raise ValueError("Existing 'Dissolve_Assignment' column")
    
    gdf["Dissolve_Assignment"] = ""
    for i in range(0,len(adj_groups)):
        if type(adj_groups[i])==gdf.dtypes["index"]:
            gdf.loc[gdf["index"]== adj_groups[i],"Dissolve_Assignment"] = i
        elif type(adj_groups[i])==set:
            gdf.loc[gdf["index"].isin(adj_groups[i]) ,"Dissolve_Assignment"] = i
    
    dissolved = gdf.dissolve("Dissolve_Assignment")
    dissolved.reset_index(drop = False, inplace = True)
    dissolved.drop(["index","Dissolve_Assignment"], axis = 1, inplace = True)
    
    return dissolved


def calculate_adjacency(gdf, include_point_adjacency = True):
    '''
    Code that takes a geodataframe and returns a dictionary of adjacencies
    '''    
    # Intersected the GeoDataFrame with the buffer with the original GeoDataFrame
    test_intersection = gp.overlay(gdf, gdf, how = "intersection", keep_geom_type = False)
    
    # If the include_point_adjacency is False
    if (include_point_adjacency == False):
        # Filter out the intersections that are just points
        test_intersection = test_intersection[test_intersection.geom_type != "Point"]
    
    # Get value counts after the intersections
    ser = test_intersection["index_1"].value_counts()
    
    # Filter out self-intersections
    test_intersection = test_intersection[test_intersection["index_1"]!=test_intersection["index_2"]]
    
    # Define a tuple of zips of the unique_col pairs present in the intersection
    test_intersection_tuples = list(list(zip(test_intersection["index_1"], test_intersection["index_2"])))
    
    return subadjacencies_faster_3([set(i) for i in test_intersection_tuples]) + list(ser[ser==1].index)

def subadjacencies_faster_3(dup_list):
    all_intersections = starmap(set.intersection, combinations(dup_list, 2))
    finished = True
    for val in all_intersections:
        if val != set():
            finished = False
    if finished:
        return [sorted(list(i)) for i in dup_list]
    else:
        final_holder = []
        for val in dup_list:
            added = False
            added_indices = []
            for idx, x in enumerate(final_holder):
                if len(x.intersection(val)) > 0:
                    final_holder[idx] = x.union(val)
                    added_indices.append(idx)
                    added = True
            if len(added_indices) > 1:
                for i in range(1, len(added_indices)):
                    final_holder[added_indices[0]] = final_holder[added_indices[0]].union(final_holder[added_indices[i]])
                for i in range(len(added_indices)-1,0,-1):
                    final_holder.pop(added_indices[i])
            if not added:
                final_holder.append(val)
        return final_holder

How America’s Electoral Districts Represent its Population

The 8 OMB Categories

The 2020 U.S. census made available a large number of racial and ethnic categories. These options helped capture the diversity of the United States’ people. However, analyzing population data across these categories is somewhat difficult, given the sheer number of different categories. Simplifying the data by combining various categories helps make analysis easier.

One such simplification is the Office of Management and Budget’s (OMB) racial and ethnic categorization that place every person in the United States into one of 8 categories:

Non-Hispanic (NH) white
NH Black plus NH Black and white
NH Asian plus NH Asian and white
NH American Indian plus NH American Indian and white
NH Pacific Islander plus NH Pacific Islander and white
NH Some Other Races Alone plus NH Some Other Race Alone and white
NH Other Multiple Race (where more than one minority race listed)
Hispanic

2020 Census Data by OMB Category

According to the 2020 census data, 57.9% of the United States population falls into the first OMB category, Non-Hispanic white. The totals for the different categories are shown below:

	Non-Hispanic (NH) White Alone	NH Black Alone plus NH Black and White	NH Asian Alone + NH Asian and White	NH American Indian Alone + NH American Indian and White	NH Pacific Islander Alone + NH Pacific Islander and White	NH Some Other Race Alone + NH Some Other Race and White	NH Other Multiple-Race (where more than one minority race listed)	Hispanic
U.S. Population	57.9%	12.8%	6.7%	1.8%	0.2%	1.2%	0.7%	18.7%

Electoral “Representation” by OMB Category

One interesting question to consider, in light of these number, is how “well-represented” these groups are in American electoral politics. In particular, I was curious to know, for a given OMB category and electoral level, what percentage of seats or districts have that group as the largest (plurality) population.

As an example, people who identify as either Non-Hispanic American Indian or Non-Hispanic American Indian and white may encompass 1.8% of the population, but in how many congressional districts are they the largest OMB group?

Please note: For my analyses “largest” refers to the plurality OMB group of the total population in that district, and does not capture variations in population age, citizenship status or voting-registration.)

If you blindly took ~1.8% (this group’s percentage of the total U.S. population) and multiplied it by 435 (the total number of U.S. Congressional seats), you might expect there to be ~10 such Congressional Districts. While this is a reasonable enough estimate, in fact, there are actually 0 2022 Congressional Districts where people who identify as either Non-Hispanic American Indian or Non-Hispanic American Indian and white comprise the largest of the 8 OMB groups.

One important consideration in the gap between the “expected” seats and the actual seats is how population is spread across the United States. While Non-Hispanic whites comprise 57.9% of the U.S.’ population, they are spread throughout the country in sufficient concentrations to encompass the largest OMB group in 80.2% of U.S. Congressional Districts. Other considerations might include gerrymandering, legislative district sizes, and other state redistricting requirements.

Values in this table may be interesting to consider in light of the anticipated “majority-minority” demographics of the United States.

Values for other OMB groups at other electoral levels are shown in the table below. Please note that there are three rows each for the State Legislative Lower (SLL) district and State Legislative Upper (SLU) district levels.

The first row for each level “SL(L/U) Districts as Largest Group (%)” does not consider multi-member districts and simply tracks the percentage of districts where a particular OMB group is the largest.

The second row for each level “SL(L/U) Rep.’s Districts* as Largest Group (%)” considers multi-member districts and tracks the percentage of representatives for whom’s district a particular OMB group is the largest.

An example differentiating between the first and second rows would be a 2-member district whose largest OMB group is Non-Hispanic white alone. In the first row, this would be counted as just one district, in the second two.

Finally, the third row “SL(L/U) Rep.’s Districts (Norm.) as Largest Group (%)” considers multi-member districts similar to the second row, but also normalizes by the size of the state’s particular legislature. For example, the Alaska State House has only 40 members, whereas the New Hampshire State House has 400 members. In this row, totals in Alaska State House would be divided by 40 and totals in the New Hampshire State House by 400.

Data Table

	Non-Hispanic (NH) White Alone	NH Black Alone plus NH Black and White	NH Asian Alone + NH Asian and White	NH American Indian Alone + NH American Indian and White	NH Pacific Islander Alone + NH Pacific Islander and White	NH Some Other Race Alone + NH Some Other Race and White	NH Other Multiple-Race (where more than one minority race listed)	Hispanic
U.S. Population	57.9%	12.8%	6.7%	1.8%	0.2%	1.2%	0.7%	18.7%
Cong. Districts as Largest Group (%)	80.2%	6.0%	1.6%	0.0%	0.0%	0.0%	0.0%	12.2%
SLL Districts as Largest Group (%)	83.8%	8.4%	1.1%	0.5%	0.0%	0.0%	0.0%	6.2%
SLL Rep.’s Districts* as Largest Group (%)	84.6%	8.0%	1.0%	0.5%	0.0%	0.0%	0.0%	5.9%
SLL Rep.’s Districts* (Norm.) as Largest Group (%)	83.3%	7.2%	1.8%	0.8%	0.1%	0.0%	0.0%	6.8%
SLU Districts as Largest Group (%)	84.6%	7.2%	1.2%	0.7%	0.1%	0.0%	0.0%	6.3%
SLU Rep.’s Districts as Largest Group (%)	84.8%	7.0%	1.2%	0.7%	0.1%	0.0%	0.0%	6.2%
SLU Rep.’s Districts (Norm.) as Largest Group (%)	84.8%	6.4%	1.6%	0.7%	0.1%	0.0%	0.0%	6.4%
Senate Seats as Largest Group (%)	94.0%	0.0%	2.0%	0.0%	0.0%	0.0%	0.0%	4.0%

Sources

National Block Assignment File + 2020 PL Census Data from the Redistricting Data Hub.

County-Level 2000 – 2016 Presidential Election Result Maps

Using election results data from MEDSL, I was able to make maps showing county-level presidential election results for the 2000, 2004, 2008, 2012, and 2016 presidential elections.

Interesting Findings

– No third-party candidate won any county in these 5 elections
– No county had a tie between the top two vote getters
– No county voted for the losing candidate in every election: (00-D, 04-D, 08-R, 12-R, 16-D)
– 39 counties voted: (00-D, 04-D, 08-R, 12-R, 16-R), only getting their first winner in 2016

Maps

(Please note: the maps are fairly high resolution, which means they may take a second to load. If you are struggling to view the counties, you can open the image in a new tab and enlarge the image)

How Counties Voted in Presidential Elections from 200o to 2016

Given the large number of different outcomes, it’s a bit easier to view the data by splitting the map into counties that voted Democratic in 2000 and counties that voted Republican:

Notes

– “INCOMPLETE” refers to Broomfield County, which became a new county in 2001. The boundaries in surrounding counties changed slightly as a result of this change.
– Election data from MEDSL and county shapes from the Census
– My apologies to Hawaii and Alaska, which will be included in future versions of this post

Baltimore Police District Redistricting

It was announced last week that the Baltimore Police Department is planning on redrawing its districts for the first time since 1959.

Although exact population stats are hard to find, when the current 9 districts were drawn in 1959, they all contained roughly equal population (or to be more accurate they were called “more nearly even” relative to the prior districts – see the Baltimore Sun article “Station Sites Talked Over” from March 1, 1956).

Since 1959, changes in population within the city have led to huge population disparities among the districts, as you can see below, where 2020 census data is aggregated to each district.

As the total 2020 census population for Baltimore was 585,708, an equal population split of the city would result in 9 districts with 65,078 people each, give or take a person. On the current plan, the Northeastern District covers 122,022 people and the Eastern District only 33,546.

As the current districts were drawn on slightly different geographies than those of 2020 census blocks, the below totals are calculated by intersecting each block with a shapefile of police districts. If parts of a block are within two or more different districts, the population within the block is assigned to the district containing the largest portion of its area.

Also of interest may be the racial demographics of the various police districts, which can be seen below.

Python Shapefile Adjacency Code – Queen vs. Rook

When calculating adjacency, there is sometimes a distinction made between queen adjacency and rook adjacency.

Queen vs. Rook Adjacency

For a given shape X, the Queen adjacent shapes are all the shapes that touch X. For a given shape X, the Rook adjacent shapes are all the shapes that touch X at more than just one particular point.

The two types of adjacencies are named as such as a nod to the movement of the chess pieces. Rooks can only move up or down or left to right, whereas queens can move up, down, left, right or any direction diagonally. From a given square X on a chess board, the rook adjacent squares would be those that border square X and could be traveled to by a rook, and likewise for the queen adjacency squares.

Source: “Investigating Commuting Time in a Metropolitan Statistical Area Using Spatial Autocorrelation Analysis” by S. Hessam Miri

Calculating the Two

The shapefile adjacency code I shared here, which uses a buffer, will only return the queen adjacency.

The new code, shared below, which does not user a buffer, allows a user to specify whether they want point adjacencies to be return or not. This occurs by intersecting the shapefile with itself, without a buffer and then, depending on the “include_point_adjacency” parameter, filtering out the intersection that are just of Point geometry type. These are the point intersections

import pandas as pd
import geopandas as gp
from collections import defaultdict

def calculate_adjacency(gdf, unique_col):
    '''
    Code that takes a geodataframe and returns two dataframes: the first with rook adjacencies and the second with queen adjacencies.
    
    Both dataframes have two columns the first column with the unique column values, and the second column with a list of adjacent geometries, listed by their unique column value.
    ''' 
    
    # Confirm that unique_col is actually unique
    if not(max(gdf[unique_col].value_counts(dropna = False) == 1)):
        raise ValueError("Non-unique column provided")
    
    # Intersected the GeoDataFrame with the buffer with the original GeoDataFrame
    all_intersections = gp.overlay(gdf, gdf, how = "intersection", keep_geom_type = False)
   
    # Filter out self-intersections
    filtered_intersections = all_intersections[all_intersections[unique_col+"_1"]!=all_intersections[unique_col+"_2"]]
    
    # Separate out point intersections
    point_intersections = filtered_intersections[filtered_intersections.geom_type == "Point"]
    non_point_intersections = filtered_intersections[filtered_intersections.geom_type != "Point"]
    
    # Define a tuple of zips of the unique_col pairs present in the non-point intersections
    non_point_intersections_tuples = tuple(zip(non_point_intersections[unique_col+"_1"], non_point_intersections[unique_col+"_2"]))
    
    # Define a dictionary that will map from a unique_col value to a list of other unique_cols it is adjacent to
    rook_dict = defaultdict(list)
    
    # Iterate over the tuples
    for val in non_point_intersections_tuples:        
        rook_dict[val[0]].append(val[1])

    # Some shapes will only intersect with themselves and not be added to the above
    not_added = list(set(gdf[unique_col]).difference(set(rook_dict.keys())))
    for val in not_added:
        
        # For each of these, add a blank list to the dictionary
        rook_dict[val] = []
     
    # Create DataFrame of rook intersections
    df_rook = pd.DataFrame()
    df_rook['GEOID20'] = rook_dict.keys()
    df_rook["ADJ_GEOMS"] = rook_dict.values()
        
    # Make a copy of the dictionary so we can add the point intersections
    queen_dict = {key: value[:] for key, value in rook_dict.items()}
    
    # Define a tuple of zips of the unique_col pairs present in the point intersections
    point_intersection_tuples = tuple(zip(point_intersections[unique_col+"_1"], point_intersections[unique_col+"_2"]))
    for val in point_intersection_tuples:        
        queen_dict[val[0]].append(val[1])
         
    # Create DataFrame of queen intersections
    df_queen = pd.DataFrame()
    df_queen['GEOID20'] = queen_dict.keys()
    df_queen["ADJ_GEOMS"] = queen_dict.values()
    
    return df_rook, df_queen

Comparing the Code

As you can see in comparing this image with the earlier image, AZ + CO and UT + NM are point adjacent to one another

Geopandas Shapefile Adjacency

The below code takes in a geodataframe and a unique column and returns a dictionary mapping from each unique column value to a list of the column values it is adjacent too.

As written, the code uses a buffer of 1 in the 3857 crs and, as you can see below, accounts for point (Queen’s) adjacency.

The next version of the code will attempt to do the same thing without using a buffer and return an adjacency matrix rather than a dictionary.

def calculate_adjacency(gdf, unique_col):
    '''
    Code that takes a geodataframe and returns a dictionary of adjacencies
    '''
    
    # Convert to a crs to make sure the buffer area works
    gdf = gdf.to_crs(3857)
    
    # Make a copy of the GeoDataFrame
    gdf_buffer = gdf.copy(deep = True)
    
    # Add a buffer of 1 to the geometry of the copied GeoDataFrame
    gdf_buffer["geometry"] = gdf.buffer(1)
    
    # Intersected the GeoDataFrame with the buffer with the original GeoDataFrame
    test_intersection = gp.overlay(gdf_buffer, gdf, how = "intersection")
    
    # Define a tuple of zips of the unique_col pairs present in the intersection
    test_intersection_tuples = tuple(zip(test_intersection[unique_col+"_1"], test_intersection[unique_col+"_2"]))
    
    # Define a dictionary that will map from a unique_col value to a list of other unique_cols it is adjacent to
    final_dict = {}
    
    # Iterate over the tuples
    for val in test_intersection_tuples:
        
        # The shapes will intersect with themselves, we don't want to add these to the dictionary
        if val[0] != val[1]:
            
            # If the shape is already in the dictionary
            if val[0] in list(final_dict.keys()):
                
                # Append the adjacent shape to the list
                holder = final_dict[val[0]]
                holder.append(val[1])
                final_dict[val[0]] = holder
            else:
                
                # Otherwise, create a key in the dictionary mapping to a list with the adjacenct shape
                final_dict[val[0]] = [val[1]]
                
    # Some shapes will only intersect with themselves and not be added to the above
    for val in [i for i in gdf[unique_col] if i not in list(final_dict.keys())]:
        
        # For each of these, add a blank list to the dictionary
        final_dict[val] = []
        
    # Return the adjacency dictionary    
    return final_dict

Example output from running the code on a shape file of the US States from the census.

Mapping Baltimore’s “White L” and “Black Butterfly”

Baltimore is known for its “White L” and “Black Butterfly” patterns of settlement.

Demographics

Crime

Screenshot from Baltimore Sun’ Homicide tracker in November 2022. Last 500 homicides pictured.

Interests

Sträva heat map of most popular running routes in Baltimore from November 2022.

Housing Prices

Maps below from Zillow in November, 2022.

Beyond the Black Butterfly & White L

Compared with other large Northeast Corridor cities, Baltimore is also unique in its relatively small non-black and non-white population.

Concentrated Census Tract Population Density

I was interested in looking at state’s where the densest 2020 census tracts are all primarily in one city or area of the state.

In playing around with it, MA, LA, NY, PA, IL, and AK all stood out as states with this sort of concentrated density.

Check out the maps below to see more!

While the number of tracts for each state in the images may seem arbitrary, if you added an additional tract, the maps would no longer be contained in their regions.