Cook County Property Assessment - Part 1

Cook County Assessments
Author

Ryan Zomorrodi

Published

February 8, 2024

Part 1 (Due 2/13, 11:59pm)

You have been tasked with undertaking a multi-part analysis of homes in Cook County, Illinois. You are provided with a database to facilitate this analysis. This database was constructed from the Cook County Open Data portal. More information is included in the database section below. Note that the database must be downloaded.

Four tables are provided:

assessments - 2021 to 2023 (not finalized)
See: https://datacatalog.cookcountyil.gov/Property-Taxation/Assessor-Assessed-Values/uzyt-m557/about_data

View Table Columns
Column Name Description Type
pin Parcel Identification Number (PIN) Plain Text
tax_year Tax year Number
class Property class Plain Text
township_code Township code Plain Text
township_name Township name Plain Text
mailed_bldg Assessor mailed building value Number
mailed_land Assessor mailed land value Number
mailed_tot Assessor mailed total value Number
certified_bldg Assessor certified building value Number
certified_land Assessor certified land value Number
certified_tot Assessor certified total value Number
board_bldg Board of Review certified building value Number
board_land Board of Review certified land value Number
board_tot Board of Review certified total value Number

characteristics - Tax year 2023 characteristics. See: https://datacatalog.cookcountyil.gov/Property-Taxation/Assessor-Single-and-Multi-Family-Improvement-Chara/x54s-btds/about_data

View Table Columns
Column Name Description Type
pin Parcel Identification Number (PIN) Plain Text
tax_year Tax year Number
card_num Card number. Each card is an improvement/building on the parcel Number
class Property class Plain Text
township_code Township code Plain Text
proration_key_pin Tieback key PIN. Prorated properties (whose value is split across multiple PINs) have a “main” or key PIN Plain Text
pin_proration_rate Tieback proration rate. Prorated properties (whose value is split across multiple PINs) pay taxes on the proportion of value on their PIN. In other words, assessed value is multiplied by proration rate to determine taxable assessed value Number
card_proration_rate Card proration rate. Prorated parcels (whose value is split across multiple cards) pay taxes on the proportion of value on their card. In other words, assessed value is multiplied by proration rate to determine taxable assessed value. Cards are divisions within parcels, such as one of multiple buildings on a single parcel. Number
cdu Condition, Desirability, and Utility code. Not well maintained. Plain Text
pin_is_multicard Multicard PIN. Indicates whether the parcel contains more than one building (ADU, coach house, etc.) Checkbox
pin_num_cards Number of cards on this parcel. Each card is an improvement/building Number
pin_is_multiland Multiland PIN. Indicates whether parcel has more than one landline Checkbox
pin_num_landlines Number of landlines on a parcel. The sum of all landline square footage should be equal to the total square footage of the parcel. Each landline can correspond to a different land price/rate Number
year_built Year built Number
building_sqft Building square feet. Square footage of the building, as measured from the exterior Number
land_sqft Land square feet. Square footage of the land (not just the building) of the property. Note that a single PIN can have multiple landlines, meaning it can be associated with more than one land price/rate Number
num_bedrooms Number of bedrooms Number
num_rooms Rooms. Number of total rooms in the building (excluding baths). Not to be confused with bedrooms Number
num_full_baths Full baths. Defined as having a bath or shower. If this value is missing, the default value is set to 1 Number
num_half_baths Half baths. Defined as bathrooms without a shower or bathtub Number
num_fireplaces Fireplaces. Counted as the number of flues one can see from the outside of the building Number
type_of_residence Type of residence Plain Text
construction_quality Construction quality Plain Text
num_apartments Apartments. Number of apartments for class 211 and 212 properties Plain Text
attic_finish Attic finish Plain Text
garage_attached Garage 1 attached Plain Text
garage_area_included Is Garage 1 physically included within the building area? If yes, the garage area is subtracted from the building square feet calculation by the field agent Plain Text
garage_size Garage 1 size Plain Text
garage_ext_wall_material Garage 1 exterior wall material Plain Text
attic_type Attic type Plain Text
basement_type Basement type Plain Text
ext_wall_material Exterior wall material Plain Text
central_heating Central heating Plain Text
repair_condition Repair condition Plain Text
basement_finish Basement finish Plain Text
roof_material Roof material Plain Text
single_v_multi_family Single vs. multi-family use Plain Text
site_desirability Site desirability Plain Text
num_commercial_units Number of commercial units on the parcel (the vast majority are for properties with class 212) Plain Text
renovation Renovation Plain Text
recent_renovation Renovation in last 3 years Checkbox
porch Porch Plain Text
central_air Central air conditioning Plain Text
design_plan Design plan Plain Text

geospatial_universe - Information on latitude/longitude and neighborhood code from tax year 2022 (released on a delay). Only a subset of columns is selected. See: https://datacatalog.cookcountyil.gov/Property-Taxation/Assessor-Parcel-Universe/nj4t-kc8j/about_data

View Table Columns
Column Name Description Type
pin Parcel Identification Number (PIN) Plain Text
pin10 Parcel Identification Number (10-digit) Plain Text
tax_year Tax year Number
class Property class Plain Text
triad_name Triad name. Reassessment of property in Cook County is done within a triennial cycle, meaning it occurs every three years. The Cook County Assessor’s Office alternates reassessments between triads: the north and west suburbs, the south and west suburbs and the City of Chicago. Plain Text
triad_code Triad code. Reassessment of property in Cook County is done within a triennial cycle, meaning it occurs every three years. The Cook County Assessor’s Office alternates reassessments between triads: the north and west suburbs, the south and west suburbs and the City of Chicago. Plain Text
township_name Township name Plain Text
township_code Township code Plain Text
neighborhood_code Assessor neighborhood code, first two digits are township, last three are neighborhood Plain Text
tax_district_code Tax district code, as seen on individual property tax bills (Not currently up-to-date) Plain Text
zip_code Property zip code Plain Text
longitude Parcel centroid longitude Number
latitude Parcel centroid latitude Number
centroid_x_crs_3435 Parcel centroid X coordinate (CRS 3435) Number
centroid_y_crs_3435 Parcel centroid Y coordinate (CRS 3435) Number
census_block_group_geoid Census block group GEOID Plain Text
census_block_geoid Census block GEOID Plain Text
census_congressional_district_geoid Census congressional district GEOID Plain Text
census_county_subdivision_geoid Census county subdivision GEOID Plain Text
census_place_geoid Census place GEOID Plain Text
census_puma_geoid Census PUMA GEOID Plain Text
census_school_district_elementary_geoid Census school district (elementary) GEOID Plain Text
census_school_district_secondary_geoid Census school district (secondary) GEOID Plain Text
census_school_district_unified_geoid Census school district (unified) GEOID Plain Text
census_state_representative_geoid Census state representative GEOID Plain Text
census_state_senate_geoid Census state senate GEOID Plain Text
census_tract_geoid Census tract GEOID Plain Text
census_zcta_geoid Census ZCTA GEOID Plain Text
census_data_year Census data year Number
census_acs5_congressional_district_geoid Census ACS5 congressional district GEOID Plain Text
census_acs5_county_subdivision_geoid Census ACS5 county subdivision GEOID Plain Text
census_acs5_place_geoid Census ACS5 place GEOID Plain Text
census_acs5_puma_geoid Census ACS5 PUMA GEOID Plain Text
census_acs5_school_district_elementary_geoid Census ACS5 school district (elementary) GEOID Plain Text
census_acs5_school_district_secondary_geoid Census ACS5 school district (secondary) GEOID Plain Text
census_acs5_school_district_unified_geoid Census ACS5 school district (unified) GEOID Plain Text
census_acs5_state_representative_geoid Census ACS5 state representative GEOID Plain Text
census_acs5_state_senate_geoid Census ACS5 state senate GEOID Plain Text
census_acs5_tract_geoid Census ACS5 tract GEOID Plain Text
census_acs5_data_year Census ACS5 data year Number
board_of_review_district_num Board of Review district number Plain Text
board_of_review_district_data_year Board of Review district data year Number
commissioner_district_num Commissioner district number Plain Text
commissioner_district_data_year Commissioner district data year Number
judicial_district_num Judicial district number Plain Text
judicial_district_data_year Judicial district data year Number
ward_num Ward number Plain Text
ward_chicago_data_year Chicago ward data year Number
ward_evanston_data_year Evanston ward data year Number
chicago_community_area_num Chicago community area number Plain Text
chicago_community_area_name Chicago community area name Plain Text
chicago_community_area_data_year Chicago community area data year Number
chicago_industrial_corridor_num Chicago industrial corridor number Plain Text
chicago_industrial_corridor_name Chicago industrial corridor name Plain Text
chicago_industrial_corridor_data_year Chicago industrial corridor data year Number
chicago_police_district_num Chicago police district number Plain Text
chicago_police_district_data_year Chicago police district data year Number
coordinated_care_area_num Coordinated Care Area number Plain Text
coordinated_care_area_data_year Coordinated Care Area data year Number
enterprise_zone_num Enterprise Zone number Plain Text
enterprise_zone_data_year Enterprise Zone data year Number
industrial_growth_zone_num Industrial Growth Zone number Plain Text
industrial_growth_zone_data_year Industrial Growth Zone data year Number
qualified_opportunity_zone_num Qualified Opportunity Zone number Plain Text
qualified_opportunity_zone_data_year Qualified Opportunity Zone data year Number
flood_fema_sfha FEMA Special Flood Hazard Area (SFHA) indicator Checkbox
flood_fema_data_year FEMA Special Flood Hazard Area (SFHA) data year Number
flood_fs_factor First Street Flood Factor Number
flood_fs_risk_direction First Street flood risk direction Number
flood_fs_data_year First Street data year Number
ohare_noise_contour_no_buffer_bool O’Hare noise contour indicator (no buffer). Indicates whether or not a parcel’s centroid is within O’Hare’s 65 DNL noise contour Checkbox
ohare_noise_contour_half_mile_buffer_bool O’Hare noise contour indicator (1/2 mile buffer). Indicates whether or not a parcel’s centroid is within O’Hare’s 65 DNL noise contour, buffered by 1/2 mile Checkbox
ohare_noise_contour_data_year O’Hare noise contour data year. The “omp” value corresponds to the projected noise contour upon completion of the O’Hare Modernization Project Number
airport_noise_dnl Airport continuous noise surface estimated DNL Number
airport_noise_data_year Airport continuous noise surface estimated data year Plain Text
school_elementary_district_geoid School district (elementary) GEOID, derived from Cook County and City of Chicago shapefiles. Chicago Public Schools are associated with attendance areas where suburban schools are associated with districts. Plain Text
school_elementary_district_name School district (elementary) name, derived from Cook County and City of Chicago shapefiles. Chicago Public Schools are associated with attendance areas where suburban schools are associated with districts. Plain Text
school_secondary_district_geoid School district (secondary) GEOID, derived from Cook County and City of Chicago shapefiles. Chicago Public Schools are associated with attendance areas where suburban schools are associated with districts. Plain Text
school_secondary_district_name School district (secondary) name, derived from Cook County and City of Chicago shapefiles. Chicago Public Schools are associated with attendance areas where suburban schools are associated with districts. Plain Text
school_unified_district_geoid School district (unified) GEOID, derived from Cook County and City of Chicago shapefiles. Chicago Public Schools are associated with attendance areas where suburban schools are associated with districts. Plain Text
school_unified_district_name School district (unified) name, derived from Cook County and City of Chicago shapefiles. Chicago Public Schools are associated with attendance areas where suburban schools are associated with districts. Plain Text
school_year School year Plain Text
school_data_year School data year Number
tax_municipality_num Municipality number Plain Text
tax_municipality_name Municipality name Plain Text
tax_school_elementary_district_num School district (elementary) number, derived from tax district. Chicago Public Schools are associated with attendance areas where suburban schools are associated with districts. Plain Text
tax_school_elementary_district_name School district (elementary) name, derived from tax district. Chicago Public Schools are associated with attendance areas where suburban schools are associated with districts. Plain Text
tax_school_secondary_district_num School district (secondary) number, derived from tax district. Chicago Public Schools are associated with attendance areas where suburban schools are associated with districts. Plain Text
tax_school_secondary_district_name School district (secondary) name, derived from tax district. Chicago Public Schools are associated with attendance areas where suburban schools are associated with districts. Plain Text
tax_school_unified_district_num School district (unified) number, derived from tax district. Chicago Public Schools are associated with attendance areas where suburban schools are associated with districts. Plain Text
tax_school_unified_district_name School district (unified) name, derived from tax district. Chicago Public Schools are associated with attendance areas where suburban schools are associated with districts. Plain Text
tax_community_college_district_num Community college district number Plain Text
tax_community_college_district_name Community college district name Plain Text
tax_fire_protection_district_num Fire protection district number Plain Text
tax_fire_protection_district_name Fire protection district name Plain Text
tax_library_district_num Library district number Plain Text
tax_library_district_name Library district name Plain Text
tax_park_district_num Park district number Plain Text
tax_park_district_name Park district name Plain Text
tax_sanitation_district_num Sanitation district number Plain Text
tax_sanitation_district_name Sanitation district name Plain Text
tax_special_service_area_num Special Service Area number Plain Text
tax_special_service_area_name Special Service Area name Plain Text
tax_tif_district_num TIF district number Plain Text
tax_tif_district_name TIF district name Plain Text
tax_districts_data_year Data year for municipality, school, community college, fire, library, park, sanitary, special service area, and tax increment financing tax districts. Number
cmap_walkability_grid_id CMAP walkability grid ID. From CMAP’s ON TO 2050 spatial data files Plain Text
cmap_walkability_no_transit_score CMAP walkability score (no transit). From CMAP’s ON TO 2050 spatial data files Number
cmap_walkability_total_score CMAP walkability total score. From CMAP’s ON TO 2050 spatial data files Number
cmap_walkability_data_year CMAP walkability data year Number
subdivision_id Subdivision ID Plain Text
subdivision_data_year Subdivision data year Number

sales - Information on sales from 2021 to present (current mid-September 2023) See: https://datacatalog.cookcountyil.gov/Property-Taxation/Assessor-Parcel-Sales/wvhk-k5uv/about_data

View Table Columns
Column Name Description Type
pin Parcel Identification Number (PIN) Plain Text
year Year Number
township_code Township code Plain Text
neighborhood_code Assessor neighborhood code, first two digits are township, last three are neighborhood Plain Text
class Property class Plain Text
sale_date Sale date (recorded, not executed) Date & Time
is_mydec_date Indicates whether the sale date has been overwritten with a more precise value from IDOR (Illinois Department of Revenue). In the past the Assessor’s ingest process truncated sale dates to the first of the month. Not all sales can be updated with dates from IDOR. Checkbox
sale_price Sale price Number
sale_document_num Sale document number. Corresponds to Clerk’s document number Plain Text
sale_deed_type Sale deed type Plain Text
mydec_deed_type Deed type from MyDec, more granular than CCAO deed type. Plain Text
sale_seller_name Sale seller name Plain Text
is_multisale Indicates whether a parcel was sold individually or as part of a larger group of PINs Checkbox
num_parcels_sale The number of parcels that were part of the sale Number
sale_buyer_name Sale buyer name Plain Text
sale_type Sale type Plain Text
sale_filter_same_sale_within_365 Remove sale with the same value (for the same PIN) within 365 days Checkbox
sale_filter_less_than_10k Indicator for whether sale is less than $10K FMW Checkbox
sale_filter_deed_type Indicator for quit claim, executor, beneficiary and missing deed types Checkbox

Conduct an exploratory data analysis

First let’s load our data and make some joins to sales data

Code
library(tidyverse)
con <- DBI::dbConnect(RSQLite::SQLite(), "../data/cook.sqlite")

sfh_sales <- tbl(con, 'sales') %>%
    collect() %>%
    filter(class %in% c(202, 203, 204, 205, 206, 207, 208, 209, 210, 234, 278)) %>%
    mutate(sale_date = as_datetime(sale_date)) %>% 
    mutate(township_code = as.character(township_code)) %>%
    distinct(doc_no, .keep_all = TRUE)

township_codes <- read_csv("../data/township_codes.csv") %>% 
    mutate(across(where(is.numeric), as.character))

sfh_characteristics <- tbl(con, 'characteristics') %>%
    collect() %>%
    distinct(pin, .keep_all = T) %>%
    select(-c(class, township_code)) %>%
    right_join(sfh_sales, by = join_by(pin, year == year)) %>%
    right_join(township_codes)

sfh_assessments <- tbl(con, 'assessments') %>%
    collect() %>%
    mutate(township_code = as.character(township_code)) %>%
    distinct(pin, .keep_all = T) %>%
    select(-c(class, township_code)) %>%
    right_join(sfh_sales, by = join_by(pin, tax_year == year))

Maps

The distributions of average sale price for single family homes by township lines up with most people’s preconceptions of Cook County. We can see that average sale prices are highest in townships in northern Cook County in Chicago and the more wealthy northern suburbs. Within these very same townships, we can see that single family homes both tend to be larger and tend to cost more per square foot. Lastly, we can see that the oldest households tend be within the city of Chicago, while the mean age of houses within Cook County seems to lower, the farther you get from the city of Chicago.

Code
library(sf)
library(leaflet)
library(scales)

townships <- read_sf("https://gis.cookcountyil.gov/traditional/rest/services/politicalBoundary/MapServer/3/query?outFields=*&where=1%3D1&f=geojson") %>%
    mutate(ORIGOID = as.character(ORIGOID)) %>%
    right_join(township_codes, by = join_by(ORIGOID == origoid))

sfh_mean_by_township <- sfh_characteristics %>%
    group_by(township_code) %>%
    reframe(sale_price_mean = mean(sale_price, na.rm = T),
        bldg_sf_mean = mean(char_bldg_sf, na.rm = T),
        sale_price_per_sf = mean(sale_price/char_bldg_sf, na.rm = T),
        years_old_mean = mean(2023 - char_yrblt, na.rm = T)) %>%
    right_join(townships) %>%
    st_as_sf()
    
qpal <- colorQuantile("Greens", sfh_mean_by_township$sale_price_mean, n = 5)

sfh_mean_by_township %>%
    leaflet() %>%
        addProviderTiles(providers$CartoDB.Positron) %>% 
        addPolygons(
            fillColor = ~ qpal(sale_price_mean),
            fillOpacity = 0.7,
            color = "Black",
            weight = 0.5,
            opacity = 0.5,
            highlightOptions = highlightOptions(
                weight = 2,
                color = "Black",
                fillOpacity = 1,
                bringToFront = T),
            label = sprintf(
                "<strong>%s</strong><br>Average Sale: %s<br/>",
                sfh_mean_by_township$NAME,
                label_currency()(sfh_mean_by_township$sale_price_mean)) %>% 
                    lapply(htmltools::HTML),
            labelOptions = labelOptions(
                style = list("font-weight" = "normal", padding = "3px 8px"),
                textsize = "12px",
                direction = "auto"))   %>%
        addLegend(
            pal = qpal,
            values = sfh_mean_by_township$sale_price_mean,
            opacity = 0.7,
            title = "Mean Sale Price of </br>Single Family Homes </br>in Cook County by </br>Township",
            position = "topright",
            na.label = "Insufficient Data",
            labFormat = function(type, cuts, p) {
                n = length(cuts)
                p = paste0(round(p * 100), '%')
                cuts = paste0(label_currency()(cuts[-n]), " - ", label_currency()(cuts[-1]))
                paste0(
                    '<span title="', p[-n], " - ", p[-1], '">', cuts,
                    '</span>'
                )
            })
Code
qpal <- colorQuantile("Oranges", sfh_mean_by_township$bldg_sf_mean, n = 5)

sfh_mean_by_township %>%
    leaflet() %>%
        addProviderTiles(providers$CartoDB.Positron) %>% 
        addPolygons(
            fillColor = ~ qpal(bldg_sf_mean),
            fillOpacity = 0.7,
            color = "Black",
            weight = 0.5,
            opacity = 0.5,
            highlightOptions = highlightOptions(
                weight = 2,
                color = "Black",
                fillOpacity = 1,
                bringToFront = T),
            label = sprintf(
                "<strong>%s</strong><br>Mean Building sqft: %s<br/>",
                sfh_mean_by_township$NAME,
                label_comma(1)(sfh_mean_by_township$bldg_sf_mean)) %>% 
                    lapply(htmltools::HTML),
            labelOptions = labelOptions(
                style = list("font-weight" = "normal", padding = "3px 8px"),
                textsize = "12px",
                direction = "auto"))   %>%
        addLegend(
            pal = qpal,
            values = sfh_mean_by_township$bldg_sf_mean,
            opacity = 0.7,
            title = "Mean Building Square Feet</br> of Single Family Homes </br>in Cook County by </br>Township",
            position = "topright",
            na.label = "Insufficient Data",
            labFormat = function(type, cuts, p) {
                n = length(cuts)
                p = paste0(round(p * 100), '%')
                cuts = paste0(label_comma(1)(cuts[-n]), " - ", label_comma(1)(cuts[-1]))
                paste0(
                    '<span title="', p[-n], " - ", p[-1], '">', cuts,
                    '</span>'
                )
            })
Code
qpal <- colorQuantile("Blues", sfh_mean_by_township$sale_price_per_sf, n = 5)

sfh_mean_by_township %>%
    leaflet() %>%
        addProviderTiles(providers$CartoDB.Positron) %>% 
        addPolygons(
            fillColor = ~ qpal(sale_price_per_sf),
            fillOpacity = 0.7,
            color = "Black",
            weight = 0.5,
            opacity = 0.5,
            highlightOptions = highlightOptions(
                weight = 2,
                color = "Black",
                fillOpacity = 1,
                bringToFront = T),
            label = sprintf(
                "<strong>%s</strong><br>%s per sqft<br/>",
                sfh_mean_by_township$NAME,
                label_currency()(sfh_mean_by_township$sale_price_per_sf)) %>% 
                    lapply(htmltools::HTML),
            labelOptions = labelOptions(
                style = list("font-weight" = "normal", padding = "3px 8px"),
                textsize = "12px",
                direction = "auto")) %>%
        addLegend(
            pal = qpal,
            values = sfh_mean_by_township$sale_price_per_sf,
            opacity = 0.7,
            title = "Dollars per Square Foot of </br>Single Family Homes in </br>Cook County by Township",
            position = "topright",
            na.label = "Insufficient Data",
            labFormat = function(type, cuts, p) {
                n = length(cuts)
                p = paste0(round(p * 100), '%')
                cuts = paste0(label_currency()(cuts[-n]), " - ", label_currency()(cuts[-1]))
                paste0(
                    '<span title="', p[-n], " - ", p[-1], '">', cuts,
                    '</span>'
                )
            })
Code
qpal <- colorQuantile("Reds", sfh_mean_by_township$years_old_mean, n = 5)

sfh_mean_by_township %>%
    leaflet() %>%
        addProviderTiles(providers$CartoDB.Positron) %>% 
        addPolygons(
            fillColor = ~ qpal(years_old_mean),
            fillOpacity = 0.7,
            color = "Black",
            weight = 0.5,
            opacity = 0.5,
            highlightOptions = highlightOptions(
                weight = 2,
                color = "Black",
                fillOpacity = 1,
                bringToFront = T),
            label = sprintf(
                "<strong>%s</strong><br>%s years old<br/>",
                sfh_mean_by_township$NAME,
                label_number(1)(sfh_mean_by_township$years_old_mean)) %>% 
                    lapply(htmltools::HTML),
            labelOptions = labelOptions(
                style = list("font-weight" = "normal", padding = "3px 8px"),
                textsize = "12px",
                direction = "auto")) %>%
        addLegend(
            pal = qpal,
            values = sfh_mean_by_township$years_old_mean,
            opacity = 0.7,
            title = "Mean Years Old of </br>Single Family Homes </br>in Cook County by </br>Township",
            position = "topright",
            na.label = "Insufficient Data",
            labFormat = function(type, cuts, p) {
                n = length(cuts)
                p = paste0(round(p * 100), '%')
                cuts = paste0(label_number(1)(cuts[-n]), " - ", label_number(1)(cuts[-1]))
                paste0(
                    '<span title="', p[-n], " - ", p[-1], '">', cuts,
                    '</span>'
                )
            })

Sale Price Distributions

It seems that single family homes built in the last 30 - 40 years tend to be more expensive, but beyond that sale prices do not seem to be associated with sale prices. As expected, higher square footage single family homes tend to sell for more, but there is still a large amount variability in sale prices. Larger land square footage single family homes seem to sell for more up until about 15,000 square feet where additional land does not seem to be associated with much greater sale prices. More beds, rooms, and bathrooms seem to all be associated with greater sale prices. However, there seems to be some odd outliers with 0 bedrooms or 2 rooms that are selling for more than would be expected. This may be something we would want to investigate later. These could be simply a result of faulty assessor data.

Code
sfh_characteristics %>%
    filter(sale_price <= 10^7) %>%
    ggplot(aes(x = char_yrblt, y = sale_price)) +
        geom_point(size = 0.1) +
        geom_smooth(method = "gam", color = "springgreen4") +
        scale_y_continuous(labels = label_currency()) + 
        labs(title = "Single Family Home Sale Price and Year Built in Cook County, IL", 
            x = "Year Built", 
            y = "Sale Price")

Code
sfh_characteristics %>%
    filter(sale_price <= 10^7) %>%
    ggplot(aes(x = char_bldg_sf, y = sale_price)) +
        geom_point(size = 0.1) +
        geom_smooth(method = "gam", color = "springgreen4") +
        scale_y_continuous(labels = label_currency()) + 
        labs(title = "Single Family Home Building Square Footage and Sale Price \nin Cook County, IL", 
            x = "Building Square Footage", 
            y = "Sale Price")

Code
sfh_characteristics %>%
    filter(sale_price <= 10^7) %>%
    filter(char_land_sf <= 10^5) %>%
    ggplot(aes(x = char_land_sf, y = sale_price)) +
        geom_point(size = 0.1) +
        geom_smooth(method = "gam", color = "springgreen4") +
        scale_y_continuous(labels = label_currency()) + 
        labs(title = "Single Family Home Building Square Footage and Sale Price \nin Cook County, IL", 
            x = "Land Square Footage", 
            y = "Sale Price")

Code
sfh_characteristics %>%
    ggplot() +
        geom_boxplot(aes(x = char_beds, y = sale_price, fill = factor(char_beds))) +
        scale_x_continuous(n.breaks = 9) +
        scale_y_continuous(labels = label_currency()) +
        labs(title = "Single Family Home Number of Bedrooms and Sale Price \nin Cook County, IL", 
            x = "Number of Bedrooms", 
            y = "Sale Price") + 
        theme(legend.position="none")

Code
sfh_characteristics %>%
    ggplot() +
        geom_boxplot(aes(x = char_rooms, y = sale_price, fill = factor(char_rooms))) +
        scale_x_continuous(n.breaks = 20) +
        scale_y_continuous(labels = label_currency()) +
        labs(title = "Single Family Home Number of Rooms and Sale Price \nin Cook County, IL", 
            x = "Number of Rooms", 
            y = "Sale Price") + 
        theme(legend.position="none")

Code
sfh_characteristics %>%
    ggplot() +
        geom_boxplot(aes(x = char_fbath + char_hbath * 0.5, y = sale_price, fill = factor(char_fbath + char_hbath * 0.5))) +
        scale_x_continuous(n.breaks = 20) +
        scale_y_continuous(labels = label_currency()) +
        labs(title = "Single Family Home Total Number of Baths and Sale Price \nin Cook County, IL", 
            x = "Total Number of Baths", 
            y = "Sale Price") + 
        theme(legend.position="none")

Use cmfproperty to conduct a sales ratio study across the relevant time period

Let’s get our data into a format that the cmfproperty package can use.

Code
library(cmfproperty)

con <- DBI::dbConnect(RSQLite::SQLite(), "../data/cook.sqlite")

sfh_sales_cmf <- sfh_sales %>% 
    select(pin, year, sale_price, doc_no) %>%
    distinct(doc_no, .keep_all = TRUE) %>%
    select(-doc_no) %>%
    filter(as.numeric(sale_price) > 2500)

assessments_cmf <- tbl(con, "assessments") %>%
    collect() %>%
    select(pin, tax_year, certified_tot)

sale_assess_cmf <- sfh_sales_cmf %>% 
    left_join(assessments_cmf, by = join_by(pin, year == tax_year)) %>% 
    rename(PIN = pin, SALE_YEAR = year, SALE_PRICE = sale_price, ASSESSED_VALUE = certified_tot) %>%
    mutate(ASSESSED_VALUE = 10 * ASSESSED_VALUE)

ratios <- reformat_data(sale_assess_cmf,
    sale_col = "SALE_PRICE",
    assessment_col = "ASSESSED_VALUE",
    sale_year_col = "SALE_YEAR")

stats <- calc_iaao_stats(ratios)

Overall, we can see that at every sale price decile, single family homes are underassessed compared to their true sale price. However, the degree to which single family homes are underassessed is not equal. Regardless of efforts of the county, it is clear that in 2023 (for the months of Jan-Sep), houses within the lowest deciles tend to be assessed at higher ratios to their true sales price. The overall picture from 2021 to Sep 2023 is more enocouraging, but nevertheless the lowest decile tends to be assessed at higher rates than all other deciles and the 2nd to 5th lowest deciles seem to be asssed at lower rates than the higher deciles.

Code
binned <- binned_scatter(ratios,
    min_reporting_yr = 2021,
    max_reporting_yr = 2023,
    jurisdiction_name = "Cook County, IL")

knitr::asis_output(htmltools::htmlPreserve(binned[[1]]))

binned[[2]]

In 2023, the most expensive homes (the top decile) were assessed at 76.2% of their value and the least expensive homes (the bottom decile) were assessed at 93.1%. In other words, the least expensive homes were assessed at 1.22 times the rate applied to the most expensive homes. Across our sample from 2021 to 2023, the most expensive homes were assessed at 77.4% of their value and the least expensive homes were assessed at 84.6%, which is 1.09 times the rate applied to the most expensive homes.

Code
pct_over <- pct_over_under(ratios,
    min_reporting_yr = 2021,
    max_reporting_yr = 2023,
    jurisdiction_name = "Cook County, IL")

knitr::asis_output(htmltools::htmlPreserve(pct_over[[1]]))

pct_over[[2]]

In Cook County, IL, 58% of the lowest value homes are overassessed and 52% of the highest value homes are overassessed.

Code
iaao_rslt <- iaao_graphs(stats,
    ratios,
    min_reporting_yr = 2021,
    max_reporting_yr = 2023,
    jurisdiction_name = "Cook County, Illinois")

knitr::asis_output(htmltools::htmlPreserve(iaao_rslt[[1]]))

iaao_rslt[[2]]

For 2023, the COD in Cook County, Illinois was 19.6 which did not meet the IAAO standard for uniformity.

Code
knitr::asis_output(htmltools::htmlPreserve(iaao_rslt[[3]]))

iaao_rslt[[4]]

In 2023, the PRD in Cook County, Illinois, was 1.074 which does not meet the IAAO standard for vertical equity.

Code
knitr::asis_output(htmltools::htmlPreserve(iaao_rslt[[5]]))

iaao_rslt[[6]]

In 2023, the PRB in Cook County, Illinois was 0.004 which indicates that sales ratios increase by 0.4% when home values double. This meets the IAAO standard.