Template for Jupyter notebooks running Python.

Version 0.1.0 | First Created July 12, 2023 | Updated August 01, 2023

Jupyter Notebook¶

This is a Jupyter Notebook document. For more details on using a Jupyter Notebook see https://docs.jupyter.org/en/latest/.

Reproduction of Urban Environmental Justice of Green Space Access in Chicago¶

Authors¶

  • Isaiah Bennett, ibennett@middlebury.edu, @isaiahbennett2, Middlebury College

Abstract¶

This study is a reproduction of:

The lab called "Urban Environmental Justice of Green Space Access in Chicago" from Middlebury College's Human Geography with GIS class.

The abstract of the original study is as follows:

Green spaces provide numerous public health, social, and environmental benefits to cities and their residents. These include mitigation of urban heat, improved storm water management and water quality, improved air quality, expanded access to exercise, and the social-psychological benefits of enjoying nature.

In this lab, we will conduct a GIS study similar to Wolch, Wilson and Fehrenbach's (2005) research on access to green spaces in Los Angeles, California. Wolch et al.'s purpose was to assess the environmental justice implications of municipal green spaces and recreation funding policies (Proposition K), asking whether minority groups, and especially minority children, were disproportionately excluded from access to green spaces. They operationalized the concept of "access" in terms of both the proximity and the total area of green spaces, and found significant disparities between racial/ethnic groups, rooted in histories of bias and segregation.

Our purpose is to assess people's access to green space (parks and forests) in segregated neighborhoods of Chicago. To do so, we will estimate indicators of access to green space according to regions where Asian, Black, Latinx, or White ethnic/racial groups are the majority (60% of the population or more), and Mixed neighborhoods where no single group makes up 60% or more of the population. The indicators of green space access are:

• Percentage of people living within 0.25 miles of a green space

• Green space area per person (in square meters)

Study metadata¶

  • Key words: Green space, racial majority, access, Chicago, Illinois, overlay, buffer, aggregation.
  • Subject: Social and Behavioral Sciences: Geography: Nature and Society Relations
  • Date created: November 30, 2023
  • Date modified: December 14, 2023
  • Spatial Coverage: The city of Chicago, Illinois
  • Spatial Resolution: Tracts
  • Spatial Reference System: EPSG 6454
  • Temporal Coverage: 2010
  • Temporal Resolution: Decennial census

Original study spatio-temporal metadata¶

  • Spatial Coverage: The city of Chicago, Illinois
  • Spatial Resolution: Tracts
  • Spatial Reference System: EPSG 6454
  • Temporal Coverage: 2010
  • Temporal Resolution: Decennial census

Study design¶

This is a reproduction of an unpublished geography study that was designed as a QGIS lab for Middlebury College's Human Geography with GIS class. I completed the lab last spring as a student in that course and wanted to revisit it in a computational environment as a contribution to reproducible science. Because the study is not computationally intensive, it serves as a simple example that shows new GIScientists what a reproduction looks like. It demonstrates easily digestible components of reproducible research: making a pre-analysis plan (preregistration), setting up a reproducible computational environment, sharing data and metadata, sharing methods and code, working through a reproduction study notebook, and exploring a research compendium.

The main research question is whether this QGIS study can be accurately reproduced using Python packages such as GeoPandas and Pandas.

Materials and procedure¶

Computational environment¶

Maintaining a reproducible computational environment requires some conscious choices in package management.

Please refer to 00-Python-environment-setup.ipynb for details.

In [569]:
# report python version and install required packages
# switch if statement from True to False once packages have been installed
if False:
    !python -V
    !pip install -r ../environment/requirements.txt
In [552]:
# Import modules, define directories
from pyhere import here
import pandas as pd
import geopandas as gpd
import folium as fm
from branca.colormap import LinearColormap
import matplotlib.pyplot as plt
from matplotlib.patches import Patch

# You can define your own shortcuts for file paths:
path = {
    "dscr": here("data", "scratch"),
    "drpub": here("data", "raw", "public"),
    "drpriv": here("data", "raw", "private"),
    "ddpub": here("data", "derived", "public"),
    "ddpriv": here("data", "derived", "private"),
    "rfig": here("results", "figures"),
    "roth": here("results", "other"),
    "rtab": here("results", "tables"),
    "dmet": here("data", "metadata")
}

Data and variables¶

Tracts2010¶

Standard Metadata

  • Title: Tracts2010.shp
  • Abstract: census tracts from the 2010 Census for Chicago, with demographic data joined from the P2 table
  • Spatial Coverage: The city of Chicago, Illinois
  • Spatial Resolution: Tracts
  • Spatial Reference System: EPSG 6454
  • Temporal Coverage: 2010
  • Temporal Resolution: Decennial census
  • Lineage: Data was downloaded from Middlebury College Geog 120 Week 04 Lab: Urban Models of Segregation by Race and Class which collected the data from Steven Manson, Jonathan Schroeder, David Van Riper, and Steven Ruggles. IPUMS National Historical Geographic Information System: Version 13.0 [Database]. Minneapolis: University of Minnesota. 2018. http://doi.org/10.18128/D050.V13.0.
  • Distribution: Data is available
  • Constraints: No legal constraints
  • Data Quality: No planned quality assessment
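| Label | Alias | Definition | Type | Accuracy | Domain | Missing Data Value(s) | Missing Data Frequency |
| :--: | :--: | :--: | :--: | :--: | :--: | :--: | :--: |
| fid | ... | object id | int | ... | ... | ... | ... |
| STATEFP10 | ... | state id | int | ... | ... | ... | ... |
| COUNTYFP10 | ... | county id | int | ... | ... | ... | ... |
| TRACTCE10 | ... | tract id | int | ... | ... | ... | ... |
| GEOID10 | ... | geography id | int | ... | ... | ... | ... |
| NAME10 | ... | ? | int | ... | ... | ... | ... |
| NAMELSAD10 | ... | census tract name | string | ... | ... | ... | ... |
| GISJOIN | ... | uniquely identifies tracts for purpose of joining to geographic data | string | ... | ... | ... | ... |
| PopTotal | ... | total population | int | ... | ... | ... | ... |
| Latinx | ... | total Hispanic or Latino/Latina population | int | ... | ... | ... | ... |
| NotLatinx | ... | total population that is not Hispanic or Latino/Latina | int | ... | ... | ... | ... |
| White | ... | total non-Hispanic White population | int | ... | ... | ... | ... |
| Black | ... | total non-Hispanic Black or African American population | int | ... | ... | ... | ... |
| Asian | ... | total non-Hispanic Asian population | int | ... | ... | ... | ... |
| TwoOrMore | ... | ? | int | ... | ... | ... | ... |
| MedHouseVa | ... | median house value for owner-occupied houses | int | ... | ... | ... | ... |
| MedGrossRe | ... | median gross monthly rent (including utilities) | int | ... | ... | ... | ... |
| pctWhite | ... | percent white population | double | ... | ... | ... | ... |
| pctBlack | ... | percent black population | double | ... | ... | ... | ... |
| pctLatinx | ... | percent latinx population | double | ... | ... | ... | ... |
| pctAsian | ... | percent asian population | double | ... | ... | ... | ... |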
In [396]:
#load in tracts 2010
tracts2010 = gpd.read_file( here(path["drpub"], "tracts2010.shp") )
tracts2010 = gpd.GeoDataFrame(tracts2010)

# Create folium map
mtracts = fm.Map([41.88155337370558, -87.63007007169067], zoom_start=10, tiles = "CartoDB Positron")

# Define a style dictionary to adjust line properties (e.g., line weight)
style_function = lambda x: {'color': 'blue', 'weight': 1, 'fillColor': 'blue', 'fillOpacity': 0.4}

# Add the GeoJson layer to the map
fm.GeoJson(tracts2010, style_function=style_function).add_to(mtracts)

mtracts
#print(tracts2010)

Blocks2010¶

Standard Metadata

  • Title: Blocks2010.shp
  • Abstract: census blocks from the 2010 Census for Chicago, with demographic data joined from the P2 table
  • Spatial Coverage: The city of Chicago, Illinois
  • Spatial Resolution: Block groups
  • Spatial Reference System: EPSG 6454
  • Temporal Coverage: 2010
  • Temporal Resolution: Decennial census
  • Lineage: Data was downloaded from Middlebury College Geog 120 Week 07 Lab which collected the data from Steven Manson, Jonathan Schroeder, David Van Riper, and Steven Ruggles. IPUMS National Historical Geographic Information System: Version 13.0 [Database]. Minneapolis: University of Minnesota. 2018. http://doi.org/10.18128/D050.V13.0.
  • Distribution: Data is available
  • Constraints: No legal constraints
  • Data Quality: No planned quality assessment
| Label | Alias | Definition | Type | Accuracy | Domain | Missing Data Value(s) | Missing Data Frequency |
| :--: | :--: | :--: | :--: | :--: | :--: | :--: | :--: |
| GEO.id | ... | Id | string | ... | ... | ... | ... |
| GEO.id2 | ... | Id2 | string | ... | ... | ... | ... |
| GEO.display-label | ... | Geography | geometry | ... | ... | ... | ... |
| D001 | ... | Total Population | int | ... | ... | ... | ... |
| D002 | ... | Hispanic or Latino | int | ... | ... | ... | ... |
| D003 | ... | Not Hispanic or Latino | int | ... | ... | ... | ... |
| D004 | ... | Not Hispanic or Latino: - Population of one race | int | ... | ... | ... | ... |
| D005 | ... | Not Hispanic or Latino: - White alone | int | ... | ... | ... | ... |
| D006 | ... | Not Hispanic or Latino: - Black or African American alone | int | ... | ... | ... | ... |
| D007 | ... | Not Hispanic or Latino: - American Indian and Alaska Native alone | int | ... | ... | ... | ... |
| D008 | ... | Not Hispanic or Latino: - Asian alone | int | ... | ... | ... | ... |
| D009 | ... | Not Hispanic or Latino: - Native Hawaiian and Other Pacific Islander alone | int | ... | ... | ... | ... |
| D010 | ... | Not Hispanic or Latino: - Some Other Race alone | int | ... | ... | ... | ... |
In [394]:
# Load in blocks 2010
blocks2010 = gpd.read_file( here(path["drpub"], "blocks2010.shp") )
blocks2010 = gpd.GeoDataFrame(blocks2010)

Parks¶

Standard Metadata

  • Title: parks.shp
  • Abstract: parks in Chicago
  • Spatial Coverage: The city of Chicago, Illinois
  • Spatial Resolution: park polygons
  • Spatial Reference System: EPSG 6454
  • Temporal Coverage: 2010
  • Temporal Resolution: 2010
  • Lineage: Data was downloaded from Middlebury College Geog 120 Week 07 Lab Urban Environmental Justice of Green Space Access in Chicago
  • Distribution: Data is available
  • Constraints: No legal constraints
  • Data Quality: No planned quality assessment
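| Label | Alias | Definition | Type | Accuracy | Domain | Missing Data Value(s) | Missing Data Frequency |
| :--: | :--: | :--: | :--: | :--: | :--: | :--: | :--: |
| parkName | ... | names of every park in Chicago | string | ... | ... | ... | ... |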
In [397]:
# Load in parks
parks = gpd.read_file( here(path["drpub"], "parks.shp") )
parks = gpd.GeoDataFrame(parks)

# Define a style dictionary to adjust symbology
style_function = lambda x: {'color': 'green', 'weight': 1, 'fillColor': 'green', 'fillOpacity': 1.0}

fm.GeoJson(parks, style_function=style_function).add_to(mtracts)

# Display map
mtracts

Forest¶

Standard Metadata

  • Title: forest.shp
  • Abstract: forest areas in Chicago
  • Spatial Coverage: The city of Chicago, Illinois
  • Spatial Resolution: forest polygons
  • Spatial Reference System: EPSG 6454
  • Temporal Coverage: 2010
  • Temporal Resolution: 2010
  • Lineage: Data was downloaded from Middlebury College Geog 120 Week 07 Lab Urban Environmental Justice of Green Space Access in Chicago
  • Distribution: Data is available
  • Constraints: No legal constraints
  • Data Quality: No planned quality assessment
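| Label | Alias | Definition | Type | Accuracy | Domain | Missing Data Value(s) | Missing Data Frequency |
| :--: | :--: | :--: | :--: | :--: | :--: | :--: | :--: |
| NAME | ... | single multi part polygon with all forest areas | string | ... | ... | ... | ... |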
In [399]:
# Load in forest
forest = gpd.read_file( here(path["drpub"], "forest.shp") )
forest = gpd.GeoDataFrame(forest)

# Define a style dictionary to adjust line properties 
style_function = lambda x: {'color': 'darkgreen', 'weight': 1, 'fillColor': 'darkgreen', 'fillOpacity': 1.0}

fm.GeoJson(forest, style_function=style_function).add_to(mtracts)

# Display map
mtracts

Bias and threats to validity¶

Boundary Effects: The analysis is restricted to the extent of the city of Chicago. The hard boundary of the city's border could cut off people along the edges who can reach a park within 0.25 miles outside of the city, making it seem like they have less access to green space than people in the interior of the city.

Modifiable Areal Unit Problem: The population with access to green space is calculated based on block groups, which are among the smallest enumeration units. This reduces the chance that local variation in spatial trends is generalized away. The racial majority groups are calculated at the tract level, however, which might ignore varying trends between racial majority groups at the block group level.

Spatial Heterogeneity: Green spaces differ in ways this analysis does not capture. Not all green spaces offer the same amenities or quality to their communities; a space that technically counts as green space does not necessarily have the same value as others. The size of the green space presents a parallel problem. Because access is determined by a constant buffer of 0.25 miles, small green spaces are credited with serving more people relative to their size than large ones. Perhaps smaller green spaces should have a smaller buffer, since they do not have the same capacity for use as larger parks with the same 0.25-mile buffer.
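The idea of a size-dependent buffer can be sketched as follows. This is only an illustration of the suggestion above, not part of the original study: the function `scaled_buffer_m`, the square-root scaling rule, and the 4 ha reference area are all hypothetical choices.

```python
import math

MILE_IN_M = 1609.344  # meters per mile


def scaled_buffer_m(area_sqm, max_miles=0.25, reference_area_sqm=40_000.0):
    """Hypothetical rule: the buffer grows with the square root of the green
    space area and is capped at `max_miles`. `reference_area_sqm` (4 ha here)
    is an invented area at which the full buffer applies."""
    scale = min(1.0, math.sqrt(area_sqm / reference_area_sqm))
    return max_miles * MILE_IN_M * scale


small = scaled_buffer_m(10_000.0)   # a 1 ha pocket park gets half the buffer
large = scaled_buffer_m(500_000.0)  # a 50 ha park gets the full 0.25 miles
print(round(small, 1), round(large, 1))
```

Any such rule would need its own justification; the point is only that the constant 0.25-mile distance is one choice among many.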

Data transformations¶

The first data transformation combines the parks and forest shapefiles with a union overlay, since they come from different sources but together represent the total green space of Chicago.

In [500]:
# Join parks and forest
greenspace = gpd.overlay(parks, forest, how='union')

A new variable, the majority group, must also be constructed in the tract data frame before continuing with the analysis. The majority group is assigned using a threshold of 60% of the population: if a single group makes up 60% or more of a tract's population, the tract is given the designation of that group; otherwise it is labeled Mixed. Only White, Black, Asian, and Latinx are considered in this study, as they were the most dominant groups in Chicago in 2010. The 60% threshold is a seemingly arbitrary choice in the original study design and could be modified to see how it influences the final results.

In [535]:
# Calculate Majority Group in tracts
threshold = 60.0
for index, row in tracts2010.iterrows():
    # Check the condition for each row using 'index' to access specific values
    if row['pctWhite'] >= threshold:  # Check if 'pctWhite' is greater than or equal to 60.0
        tracts2010.loc[index, 'majorGroup'] = 'White'  # Assign 'White' to 'majorGroup'
    elif row['pctBlack'] >= threshold:  
        tracts2010.loc[index, 'majorGroup'] = 'Black'
    elif row['pctAsian'] >= threshold:  
        tracts2010.loc[index, 'majorGroup'] = 'Asian'
    elif row['pctLatinx'] >= threshold:  
        tracts2010.loc[index, 'majorGroup'] = 'Latinx'
    else: 
        tracts2010.loc[index, 'majorGroup'] = 'Mixed'
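The threshold rule above can also be written as a small pure function, which makes the 60% cutoff easy to test and to vary. This is a minimal sketch, not the notebook's code: `classify_majority`, `GROUP_ORDER`, and the `shares` argument are hypothetical names, and the percentages are assumed to be on a 0-100 scale like the `pct*` columns.

```python
# Hypothetical helper (not part of the original notebook): assign a
# majority group from a mapping of group name -> percent of population.
GROUP_ORDER = ("White", "Black", "Asian", "Latinx")  # same order as the loop above


def classify_majority(shares, threshold=60.0):
    """Return the first group at or above `threshold` percent, else 'Mixed'."""
    for group in GROUP_ORDER:
        if shares.get(group, 0.0) >= threshold:
            return group
    return "Mixed"


# Toy percentages, invented for illustration only
examples = [
    {"White": 72.5, "Black": 10.0, "Asian": 5.0, "Latinx": 12.5},
    {"White": 30.0, "Black": 60.0, "Asian": 2.0, "Latinx": 8.0},
    {"White": 40.0, "Black": 25.0, "Asian": 10.0, "Latinx": 25.0},
]
labels = [classify_majority(s) for s in examples]
print(labels)
```

Passing a different `threshold` to the same function shows how sensitive the grouping is to the cutoff; the second toy tract, for example, is no longer Black-majority at a 70% threshold.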

Analysis¶

First, the green space is buffered by 0.25 miles to create a catchment area showing which blocks can access it easily. Using a spatial overlay, blocks are selected as having access if their centroids intersect the green space buffer. These access blocks, which contain population data, are then aggregated to the tract level based on their tract id. Population data for each tract is then summed by racial majority group. Area is calculated for the total area that each majority group inhabits and, after a clip overlay, for the green space area that overlaps their tracts. Finally, Percent of Population with Access and Green Space Per Person (sqm) are derived from these variables.

Buffer green space¶

In [536]:
# Buffer by 0.25 miles converted to meters (1 mile ≈ 1609.34 meters)
bufferdist= 0.25 #in miles

greenspacebuffer = greenspace.buffer(bufferdist * 1609.34)

# Convert from multiple polygons into a single multi-part polygon
greenspacebuffer_single = greenspacebuffer.unary_union

# Turn buffered green space into geodataframe
bufferedgreen = gpd.GeoDataFrame(geometry=[greenspacebuffer_single], crs=blocks2010.crs)

# Create folium map
m = fm.Map([41.88155337370558, -87.63007007169067], zoom_start=10, tiles = "CartoDB Positron")

# Define a style dictionary to adjust symbology
style_function = lambda x: {'color': '#bbde93', 'weight': 1, 'fillColor': '#bbde93', 'fillOpacity': 1.0}

# Add the GeoJson layer to the map
fm.GeoJson(greenspacebuffer, style_function=style_function).add_to(m)

m

Calculate blocks with accessibility¶

In [537]:
#create centroids
centroids = blocks2010.geometry.centroid  

#turn into geodataframe
blockscentroids = gpd.GeoDataFrame(geometry=centroids, crs=blocks2010.crs)

# Select centroids that intersect with buffered greenspaces
greenaccessblocks = gpd.sjoin(blockscentroids, bufferedgreen, how="inner", predicate="intersects")

# Merge the original block data with the selected blocks based on centroid intersections
greenaccessblocks = greenaccessblocks.merge(blocks2010, left_index=True, right_index=True)

Join to Tracts2010 by tract id¶

In [559]:
#Group by Tract id 
greenaccessPop = greenaccessblocks.groupby('TRACTCE10')['D001'].sum().reset_index()

# join by tractce10 and add totalpopaccess 
greenaccesstracts = tracts2010.merge(greenaccessPop[['TRACTCE10', 'D001']], on='TRACTCE10', how='left')

# create a Geodataframe with the greenaccesstracts
greenaccesstracts = gpd.GeoDataFrame(greenaccesstracts)
#print(greenaccesstracts)
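The aggregate-then-join logic of this step can be shown with a library-free sketch. The notebook itself uses pandas `groupby` and `merge`; the tract ids and populations below are invented, and the variable names are hypothetical.

```python
from collections import defaultdict

# Toy block records (tract id, block population); values are invented
access_blocks = [
    ("010100", 120),
    ("010100", 80),
    ("010200", 50),
]
# Toy tract table: tract id -> total population
tract_pop = {"010100": 400, "010200": 90, "010300": 300}

# Step 1: sum block population by tract id (the groupby/sum step)
access_pop = defaultdict(int)
for tract_id, pop in access_blocks:
    access_pop[tract_id] += pop

# Step 2: left join onto the tracts; tracts with no access blocks get None,
# mirroring the missing value that a pandas left merge produces
joined = {
    tract_id: {"PopTotal": total, "D001": access_pop.get(tract_id)}
    for tract_id, total in tract_pop.items()
}
print(joined)
```

The left join matters: a tract with no blocks inside the buffer is kept with a missing access population rather than dropped, so it still contributes to the total population of its majority group.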

Final calculations by majority group¶

In [560]:
#group by majority group
groupedgreenaccess = greenaccesstracts.dissolve(by='majorGroup', aggfunc={
         "majorGroup": "count","PopTotal":'sum',"D001":'sum'})
groupedgreenaccess = groupedgreenaccess.rename(columns={'majorGroup': 'Tracts'})
In [567]:
#calculate area for each majority group
groupedgreenaccess['area_sqm'] = groupedgreenaccess.geometry.area

#clip grouped green access and green space 
majorGrnSpace = groupedgreenaccess.clip(greenspace) 

# Calculate green space area
majorGrnSpace['areaGrn_sqm'] = majorGrnSpace.geometry.area

# Calculate 'pctAccess' as a percentage of access per population
majorGrnSpace['pctAccess'] = (majorGrnSpace['D001'] / majorGrnSpace['PopTotal']) * 100

# Calculate 'GreenAreaPop' as the ratio of green area to population
majorGrnSpace['GreenAreaPop'] = majorGrnSpace['areaGrn_sqm'] / majorGrnSpace['PopTotal']

majorGrnSpace
Out[567]:
| majorGroup | geometry | Tracts | PopTotal | D001 | area_sqm | areaGrn_sqm | pctAccess | GreenAreaPop |
| :--: | :--: | :--: | :--: | :--: | :--: | :--: | :--: | :--: |
| Black | MULTIPOLYGON (((355481.444 561993.089, 355458.... | 269 | 755569.0 | 483672.0 | 1.964818e+08 | 1.349878e+07 | 64.014273 | 17.865712 |
| Latinx | MULTIPOLYGON (((357408.653 576717.364, 357418.... | 146 | 579637.0 | 334119.0 | 9.621542e+07 | 2.667939e+06 | 57.642801 | 4.602775 |
| Mixed | MULTIPOLYGON (((352328.568 566122.129, 352338.... | 178 | 669499.0 | 512778.0 | 1.453426e+08 | 8.965317e+06 | 76.591302 | 13.391083 |
| White | MULTIPOLYGON (((347461.379 568394.469, 347367.... | 190 | 660728.0 | 487125.0 | 1.186064e+08 | 1.151111e+07 | 73.725497 | 17.421861 |
| Asian | MULTIPOLYGON (((358470.236 575562.435, 358470.... | 4 | 13875.0 | 11248.0 | 1.983068e+06 | 2.768250e+04 | 81.066667 | 1.995135 |
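As a quick arithmetic check, the two indicators can be recomputed by hand for one row. The sketch below uses the Black-majority values from the output above; the variable names are illustrative, and the green space area is entered at the precision shown in the output.

```python
# Values copied from the Black-majority row of the output above
pop_total = 755569          # PopTotal
pop_access = 483672         # D001: population in blocks with access
green_area_sqm = 1.349878e7  # areaGrn_sqm, at the precision displayed

# Percent of population with access
pct_access = pop_access / pop_total * 100

# Green space area per person, in square meters
green_area_per_person = green_area_sqm / pop_total

print(round(pct_access, 1), round(green_area_per_person, 1))
```

Both values agree with the pctAccess and GreenAreaPop columns for that row after rounding to one decimal place.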

Results¶

In [565]:
# Rename index column 
table1 = majorGrnSpace.rename_axis('Majority Group')

#Clean up names and round 
table1['Population'] = table1['PopTotal'].round(0).astype(int) 
table1['Population with Access'] = table1['D001'].round(0).astype(int) 
table1['Area (sqm)'] = table1['area_sqm'].round(0).astype(int) 
table1['Green Space Area (sqm)'] = table1['areaGrn_sqm'].round(0).astype(int)
table1['Percent Population with Access'] = table1['pctAccess'].round(1)
table1['Green Space Per Person (sqm)'] = table1['GreenAreaPop'].round(1)
table1 = table1.drop(columns = ["geometry", "PopTotal", "D001", "area_sqm", "areaGrn_sqm", "pctAccess", "GreenAreaPop"])
 
table1.to_csv( here(path["rtab"],"table1.csv") )  # Save table

table1
Out[565]:
| Majority Group | Tracts | Population | Population with Access | Area (sqm) | Green Space Area (sqm) | Percent Population with Access | Green Space Per Person (sqm) |
| :--: | :--: | :--: | :--: | :--: | :--: | :--: | :--: |
| Black | 269 | 755569 | 483672 | 196481794 | 13498778 | 64.0 | 17.9 |
| Latinx | 146 | 579637 | 334119 | 96215420 | 2667939 | 57.6 | 4.6 |
| Mixed | 178 | 669499 | 512778 | 145342623 | 8965317 | 76.6 | 13.4 |
| White | 190 | 660728 | 487125 | 118606374 | 11511111 | 73.7 | 17.4 |
| Asian | 4 | 13875 | 11248 | 1983068 | 27683 | 81.1 | 2.0 |
In [563]:
# Load the original results for comparison (a plain CSV, so pandas can read it directly)
original_results = pd.read_csv( here(path["ddpub"], "original_results.csv") )

original_results
Out[563]:
|  | Majority Group | Tracts | Population | Population with Access | Area (sqm) | Green Space Area (sqm) | Percent Population with Access | Green Space Per Person (sqm) |
| :--: | :--: | :--: | :--: | :--: | :--: | :--: | :--: | :--: |
| 0 | Black | 269 | 755,569 | 482,153 | 196,481,794 | 13,498,778 | 63.8 | 17.9 |
| 1 | Latinx | 146 | 579,637 | 332,655 | 96,215,420 | 2,667,939 | 57.4 | 4.6 |
| 2 | Mixed | 178 | 669,499 | 510,794 | 145,342,623 | 8,965,317 | 76.3 | 13.4 |
| 3 | White | 190 | 660,728 | 485,254 | 118,606,374 | 11,511,111 | 73.4 | 17.4 |
| 4 | Asian | 4 | 13,875 | 10,876 | 1,983,068 | 27,683 | 78.4 | 2.0 |
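The two tables above differ only in Population with Access and the percentage derived from it. A short sketch of that comparison, with the values typed in from the tables rather than read from the files (the variable names are illustrative):

```python
groups = ["Black", "Latinx", "Mixed", "White", "Asian"]

# Values copied from the reproduction and original tables above
repro_access = [483672, 334119, 512778, 487125, 11248]
orig_access = [482153, 332655, 510794, 485254, 10876]
population = [755569, 579637, 669499, 660728, 13875]

diffs = {}
for g, r, o, p in zip(groups, repro_access, orig_access, population):
    extra_people = r - o              # additional people counted as having access
    diff_pp = extra_people / p * 100  # difference in percentage points
    diffs[g] = (extra_people, round(diff_pp, 1))
    print(f"{g:<7} {extra_people:>6} {diff_pp:5.1f}")
```

Because the differences here are computed from unrounded counts, they can differ in the last digit from differences taken between the rounded percentages in the tables.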

Visualize racial majority group and greenspace map¶

In [511]:
# create a Geodataframe with the greenaccesstracts
greenaccesstracts = gpd.GeoDataFrame(tracts2010)


# Create folium map
mrace = fm.Map([41.88155337370558, -87.63007007169067], zoom_start=10, tiles = "CartoDB Positron")

def style_function(feature):
    major_group = feature['properties']['majorGroup']
    if major_group == 'Asian':
        color = '#bf8282FF'
    elif major_group == 'Black':
        color = '#bbafd0FF'
    elif major_group == 'Latinx':
        color = '#e4b586FF'
    elif major_group == 'Mixed':
        color = '#feffa6FF'
    elif major_group == 'White':
        color = '#4468a8FF'
    else:
        color = '#ffff00'  # Default color if 'majorGroup' doesn't match any condition

    return {
        'fillColor': color,
        'color': 'grey',
        'weight': 0.3,
        'dashArray': '5, 5',
        'fillOpacity': 1.0
    }
green_style = lambda x: {'color': '#bbde93', 'weight': 1, 'fillColor': '#bbde93', 'fillOpacity': 1.0}

# Add the GeoJson layer to the map
fm.GeoJson(greenaccesstracts, style_function=style_function).add_to(mrace)
fm.GeoJson(greenspace, style_function=green_style).add_to(mrace)

mrace
In [568]:
# dissolve the tracts geometry by racial majority group
greenaccesstracts = greenaccesstracts.dissolve(by= 'majorGroup')
greenaccesstracts['majorityGroup'] = greenaccesstracts.index

# Set up the base plot
fig, ax = plt.subplots(figsize=(10, 10))

# Define a function to assign colors based on the 'majorityGroup' column
def assign_color(feature):
    major_group = feature['majorityGroup']
    if major_group == 'Asian':
        return '#bf8282FF'
    elif major_group == 'Black':
        return '#bbafd0FF'
    elif major_group == 'Latinx':
        return '#e4b586FF'
    elif major_group == 'Mixed':
        return '#feffa6FF'
    elif major_group == 'White':
        return '#4468a8FF'
    else:
        return '#ffff00'

# Plot greenaccesstracts with different colors based on the 'majorityGroup' column
greenaccesstracts.plot(ax=ax, color=greenaccesstracts.apply(assign_color, axis=1), edgecolor='grey', linewidth=0.3)

# Plot greenspace 
greenspace.plot(ax=ax, color='#bbde93', edgecolor='grey', linewidth=0.05, alpha=1)

# Customize legend based on the colors for each majority group
plt.title('Racial Dimensions of Green Space in 2010 Chicago')
legend_patches = [
    Patch(facecolor='#bf8282FF', label='Asian'),
    Patch(facecolor='#bbafd0FF', label='Black'),
    Patch(facecolor='#e4b586FF', label='Latinx'),
    Patch(facecolor='#feffa6FF', label='Mixed'),
    Patch(facecolor='#4468a8FF', label='White'),
    Patch(facecolor='#bbde93', label='Greenspace')
]

# Add legend with the created patches
plt.legend(handles=legend_patches)

plt.savefig(here(path["rfig"], 'fig1.png')) # Save image

# Show the plot
plt.show()

Discussion¶

This reproduction was almost entirely successful. In Percent Population with Access, the reproduction is 0.2 to 0.3 percentage points higher than the original results for every majority group except Asian, which is 2.7 percentage points higher. It is important to note, however, that the Asian majority group has far fewer tracts than the others, with only 4 compared to a range of 146 to 269, so a small difference in population counts produces a large difference in percentage. Figure 1 from this reproduction looks identical to the figure from the original study, which is consistent with the tables: all values are identical except for the minor differences in Population with Access and Percent Population with Access.

These differences most likely stem from how geopandas and QGIS execute certain spatial analysis steps, such as buffering and creating centroids. In this reproduction the geopandas tools were called with fewer explicit specifications than QGIS presents in its tool dialogs. For example, the QGIS buffer tool exposes options for segments, end cap style, and join style. The geopandas buffer method accepts comparable parameters (resolution, cap_style, join_style), but this reproduction left them at their defaults, which may not match the settings used in QGIS.
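One way buffer settings alone can shift results is the number of segments used to approximate curved edges. The sketch below is an illustration, not a diagnosis. It assumes the original QGIS run used that tool's default of 5 segments per quarter circle while geopandas used its default of 16, and it treats the simplest case of a buffer around a single point, where the buffer is a regular polygon inscribed in the true circle. The function name `point_buffer_area` is hypothetical.

```python
import math

MILE_IN_M = 1609.344
radius = 0.25 * MILE_IN_M  # buffer distance in meters


def point_buffer_area(r, quad_segs):
    """Area of a point buffer drawn with `quad_segs` segments per quarter
    circle, i.e. a regular polygon with 4 * quad_segs vertices inscribed
    in a circle of radius r."""
    n = 4 * quad_segs
    return 0.5 * n * r ** 2 * math.sin(2 * math.pi / n)


true_area = math.pi * radius ** 2
share_5 = point_buffer_area(radius, 5) / true_area    # assumed QGIS default
share_16 = point_buffer_area(radius, 16) / true_area  # geopandas default
print(f"5 segments: {share_5:.4f}, 16 segments: {share_16:.4f}")
```

Under these assumptions the 16-segment buffer covers slightly more area than the 5-segment one, which would be consistent with the reproduction counting slightly more people with access. Other causes are not ruled out.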

This reproduction was also successful in creating a reproducible framework to be easily expanded on in the future. To further investigate the relationship between green space and racial majority groups one can adjust the variables "bufferdist" in the step of calculating the green space buffer and "threshold" in the data transformation step of creating the majority groups to raise the percent threshold above 60%. An engaging challenge in the future would be to automate the data sourcing directly from the web so that this study can be applied to different temporal and spatial extents.

Overall, this study showed that two racial majority groups have disproportionately low access to green space compared to the rest. Only 63.8% (64.0% in the reproduction) of the population in Black-majority tracts and only 57.4% (57.6% in the reproduction) of the population in Latinx-majority tracts have access, compared to a range of 73.4-78.4% (73.7-81.1% in the reproduction) for the other groups. In Green Space Per Person, two groups stand out as having far less green space: Latinx and Asian, with 4.6 sqm and 2.0 sqm respectively. Given that the Latinx majority group also had the lowest percent of population with access, this study identifies the Latinx group as the most disproportionately impacted by a lack of green space. This finding is important because of the impacts that lacking access to green space can have on an individual. These range from stronger urban heat island effects and worse air quality to less recreation area and compromised mental health. To reduce this environmental inequality, these results should be carefully considered in green space zoning and urban revival projects. Access to the natural environment is not a privilege but a right and should be represented as such in urban planning.

Integrity Statement¶

The authors of this preregistration state that they completed this preregistration to the best of their knowledge and that no other preregistration exists pertaining to the same hypotheses and research.

Acknowledgements¶

  • Initial help: Tate Sutter, also a student in Joseph Holler's OpenGIScience course at Middlebury College, and I were originally going to work on this together. He therefore contributed to the initial steps of writing the requirements txt to set up the computational environment and loading in the tracts2010 data. Everything after that was conducted by me, Isaiah Bennett.

This report is based upon the template for Reproducible and Replicable Research in Human-Environment and Geographical Sciences, DOI:10.17605/OSF.IO/W29MQ

References¶

Wolch, J., Wilson, J. P., & Fehrenbach, J. (2005). Parks and Park Funding in Los Angeles: An Equity-Mapping Analysis. Urban Geography, 26(1), 4–35. https://doi.org/10.2747/0272-3638.26.1.4