Author

Jeremy Allen

Political violence in the US

I examine the number of people prosecuted for, and the number of deaths from, violent crimes in the United States. I exclude crimes that lack either an ideological motivation or a guilty verdict. I use data¹ from The Prosecution Project Dataset (Loadenthal et al. 2023). The data covers 1990 to the present, though the most recent events are still being processed and may not yet appear. Each row represents a defendant in a case.

After Charlie Kirk was assassinated in September 2025, the US president, Donald Trump, and Elon Musk said the left is the real problem (Suter 2025; Hutzler and Stoddart 2025). They struggle with numbers, so this report looks at reliable numbers. The left, the right, and others all commit violence. However, the lopsided swell of violence from the right is obvious to any observer more interested in knowledge than in inflammatory politics.

Prepare the data

Import the data, clean the column names, keep only US cases, and remove pending cases and cases without a clear guilty verdict.

Code

suppressPackageStartupMessages({
    library(tidyverse)
    library(janitor)
    library(RColorBrewer)
})

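# Classify a case's jurisdiction level from the `jurisdiction` value (string1)
# and the case name (string2), e.g. "State Of Michigan V. ..." -> "State".
jurisdiction_extract <- function(string1, string2) {
  # map2_chr() expects a character result, so return a typed NA
  if (is.na(string2)) {
    return(NA_character_)
  }

  # validate before tolower(), which would coerce any input to character
  if (!is.character(string2)) {
    stop("string2 must be a character string")
  }

  string1 <- tolower(string1)
  string2 <- tolower(string2)

  # isTRUE() guards against a missing jurisdiction value
  if (isTRUE(string1 == "federal")) {
    return("Federal")
  }

  first_two_words <- str_extract(string2, "^\\w+(\\s\\w+)?")

  # nothing to classify, e.g. the case name starts with punctuation
  if (is.na(first_two_words)) {
    return("Unknown")
  }

  if (first_two_words == "district of") {
    return("District")
  }

  if (first_two_words == "city of") {
    return("City")
  }

  if (first_two_words == "county of") {
    return("County")
  }

  if (first_two_words == "united states") {
    return("Federal")
  }

  if (
    str_detect(first_two_words, "state") || first_two_words == "commonwealth of"
  ) {
    return("State")
  }

  # anything else cannot be classified from the case name
  return("Unknown")
}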

Code

df <- read_csv(fs::dir_ls("./data", glob = "*.csv")) |> 
    clean_names()

# fix the mix of numeric and character in the # killed and injured columns
df <- df |> 
    mutate(
    number_killed = case_when(
      number_killed %in% c("Multiple", "Unknown") ~ NA_integer_,
      is.na(number_killed) ~ NA_integer_,
      TRUE ~ as.integer(number_killed)
    ),
    number_injured = case_when(
      number_injured %in% c("Multiple", "Unknown") ~ NA_integer_,
      is.na(number_injured) ~ NA_integer_,
      TRUE ~ as.integer(number_injured)
    )
  )

# Add missing case names
df <- df |> 
    mutate(
        name_of_case = str_to_title(name_of_case),
        name_of_case = if_else(
            case_id == "08112021_JMW_WEBB3",
            "State Of Michigan V. Justen Watkins",
            name_of_case
        ),
        name_of_case = if_else(
            case_id == "08112021_TFD_WEBB2",
            "State Of Michigan V. Thomas Denton",
            name_of_case
        ),
        name_of_case = if_else(
            case_id == "08112021_TW_WEBB1",
            "State Of Michigan V. Tristan Webb",
            name_of_case
        ),
        name_of_case = if_else(
            name_of_case == "Data Not Available",
            "Unknown",
            name_of_case
        )
    )

# only US cases
df <- df |> 
    filter(location_country == "United States")

# fix case name typos
df <- df |>
    mutate(
        name_of_case = str_replace_all(name_of_case, "Ofmassachusetts", "Of Massachusetts"),
        name_of_case = str_replace_all(name_of_case, "United Staters", "United States"),
        name_of_case = str_replace_all(name_of_case, "United Staers", "United States"),
        name_of_case = str_replace_all(name_of_case, "United Sates", "United States"),
        name_of_case = str_replace_all(name_of_case, "United Staes", "United States"),
        name_of_case = str_replace_all(name_of_case, "\\bUsa\\b", "United States"),
        name_of_case = str_replace_all(name_of_case, regex("\\bUnited State\\b"), "United States"),
        name_of_case = str_replace_all(name_of_case, regex("\\bUnites States\\b"), "United States"),
        name_of_case = str_remove(name_of_case, regex("^(The )?People Of ")),
        name_of_case = str_replace_all(name_of_case, regex("^The State"), "State"),
        name_of_case = str_replace_all(name_of_case, regex("^Virginia\\b"), "Commonwealth Of Virginia"),
        name_of_case = str_replace_all(name_of_case, regex("^New York\\b"), "State Of New York"),
        name_of_case = str_replace_all(name_of_case, regex("^Ten(n)?essee\\b"), "State Of Tennessee"),
        name_of_case = str_replace_all(name_of_case, regex("^West Virginia\\b"), "State Of West Virginia"),
        name_of_case = str_replace_all(name_of_case, regex("^State V\\. Christopher\\b"), "State Of Nevada V. Christopher"),
        name_of_case = str_replace_all(name_of_case, regex("^State V\\. Finley\\b"), "State Of Nevada V. Finley")
    )

# move appearance number from full names to its own column and clean up
df <- df |> 
  mutate(
    appearance_number = str_extract(full_legal_name, "\\s\\(\\d+\\)$"),
    full_legal_name = str_trim(str_remove(full_legal_name, "\\s\\(\\d+\\)$")),
    # strip parentheses and whitespace, then store as an integer
    # (calling str_trim() afterwards would turn it back into character)
    appearance_number = as.integer(str_remove_all(appearance_number, "[\\(\\)\\s]"))
  ) 

# add finer-grained jurisdiction type
df <- df |>
    mutate(
      jurisdiction_level = map2_chr(jurisdiction, name_of_case, jurisdiction_extract)
    )

Code

no_clear_guilt <- c("Charged but not tried", "Data not available", "Not guilty", "Hung jury/mistrial", "Pending")
num_no_clear_guilt <- df |> 
    filter(verdict %in% no_clear_guilt) |> 
    nrow()

df <- df |> 
    filter(!verdict %in% no_clear_guilt)

There are 1853 defendants without a clear guilty verdict that I remove from the analysis.

Are any defendants duplicated for the same charge? For example, being in the data on two different dates as part of the same case?

Code

duplicates <- df |> 
    group_by(full_legal_name, name_of_case) |> 
    filter(n() > 1) |> 
    ungroup()

df <- df |>
    filter(!case_id %in% c("09262001_PMC", "02142024_KC"))

# Remove incorrect leftist classification for Cody Michael Tarner (should be rightist)
df <- df |>
    filter(!(case_id == "09082020_CMT" & ideological_affiliation == "Leftist: government-focused"))

# Remove incorrect nationalist-separatist classification for Jonathan L. Xie (should be Salafi/Jihadist/Islamist)
df <- df |>
    filter(!(case_id == "09152020_JLX" & ideological_affiliation == "Nationalist-separatist"))

# Remove incorrect "Carried through" classification for Cole James Bridges (should be Attempted)
df <- df |>
    filter(!(case_id == "02032021_CJB" & completion_of_crime == "Carried through"))

# Remove complaint row for John Malcolm Bareswill (keep indictment only)
df <- df |>
    filter(!(case_id == "06112020_JMB" & date_descriptor == "Complaint"))

# Remove incorrect "Not guilty" row for Daniel Delbert Dorson (keep guilty plea row)
df <- df |>
    filter(!(case_id == "12162020_DDD_MILLER4"))

# Remove initial indictment for Adis Medunjanin (keep superseding indictment from July 2010)
df <- df |>
    filter(!(case_id == "02252010_AM_AHMEDZAY2"))

# Remove initial indictment for Thomas Heard (keep later indictment with full charges)
df <- df |>
    filter(!(case_id == "03092019_TH"))

# Remove "Charged but not tried" row for Anthony Comello (keep guilty plea row)
df <- df |>
    filter(!(case_id == "04282019_AC"))

# Remove arrest row for Jason Brown (keep indictment row)
df <- df |>
    filter(!(case_id == "11142019_JB" & date_descriptor == "Arrest/arraignment"))

# Remove complaint row for Marian Hudak (keep indictment row)
df <- df |>
    filter(!(case_id == "06222023_MH" & date_descriptor == "Complaint"))

# Remove indictment row for Rowan McManigal (keep sentencing row)
df <- df |>
    filter(!(case_id == "04042022_RM" & date_descriptor == "Indictment"))

# Remove initial indictment for Walter Edmund Bond (keep later consolidated indictment)
df <- df |>
    filter(!(case_id == "07272010_WEB"))

# Remove indictment row for Michael David Fox (keep plea row)
df <- df |>
    filter(!(case_id == "09202023_MDF" & date_descriptor == "Indictment"))

# Remove arrest row for Omar Alkattoul (keep indictment row)
df <- df |>
    filter(!(case_id == "11102022_OA" & date_descriptor == "Arrest/arraignment"))

There are eight cases where defendants appear in the data multiple times for the same charge but at different stages of prosecution (e.g., indictment, arrest, plea, sentencing). I remove eight rows to avoid double counting these defendants, keeping the row that represents the most complete or final stage of their case. This leaves us with 5753 defendants.

How many defendants are not linked or motivated politically?

Code

total_defendants <- nrow(df)

defendants_not_political <- df |>
    filter(criminal_method == "Criminal violation not linked or motivated politically") |> 
    nrow()

Of 5753 defendants, 247 (4.29%) were not politically motivated. I exclude them going forward, along with defendants whose criminal method is unknown or unspecified.
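
The percentage above follows directly from the two counts. As a quick check, the sketch below hard-codes the figures reported in the text; in the report itself they would presumably come from `defendants_not_political` and `total_defendants`.

```r
# Counts reported in the text above, hard-coded here for illustration
defendants_not_political <- 247
total_defendants <- 5753

# Share of defendants whose crime was not politically motivated, in percent
pct_not_political <- round(100 * defendants_not_political / total_defendants, 2)
pct_not_political
```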

Code

# remove defendants not politically motivated or unknown
df <- df |>
    filter(!criminal_method %in% c("Unknown/unspecified/undeveloped", "Criminal violation not linked or motivated politically"))

# parse dates
df <- df |> 
    mutate(
        date = str_trim(date),
        date = lubridate::mdy(date)
    )

Were all of these crimes fully carried through?

Code

df |> 
    count(completion_of_crime) |> 
    knitr::kable(
        col.names = c("Case Outcome", "Number of Prosecutions"),
        caption = "Table 1: Case Outcomes"
    )

Table 1: Case Outcomes
Case Outcome	Number of Prosecutions
Attempted	331
Carried through	3425
Planned but not attempted	497
Threat	203
Unknown	13
NA	1004

Code

num_threats_and_unknown <- df |> 
    filter(completion_of_crime %in% c("Threat", "Unknown")) |> 
    nrow()

I am more interested in harm that was planned, attempted, or carried through, so I filter out threats and unknowns (216 defendants). That leaves 5257 defendants.

Code

df <- df |> 
    filter(!completion_of_crime %in% c("Threat", "Unknown"))

Which dates are represented in the data?

Code

# Warn if minimum date is earlier than 1900 or latest date is later than today
date_range <- range(df$date, na.rm = TRUE)
if (date_range[1] < as.Date("1900-01-01") || date_range[2] > Sys.Date()) {
    warning("Date range is outside expected bounds.")
}


df |>
    summarise(
        `Earliest Date` = min(date, na.rm = TRUE),
        `Latest Date` = max(date, na.rm = TRUE)
    ) |> 
    knitr::kable(caption = "Table 2: Date Range of the Data")

Table 2: Date Range of the Data
Earliest Date	Latest Date
1990-01-22	2025-05-13

Which ideological affiliations are represented in the data?

Code

df |>
    count(ideological_affiliation) |> 
    knitr::kable(
        col.names = c("Ideological Affiliation", "Number of Prosecutions"),
        caption = "Table 3: Number of Prosecutions by Ideological Affiliation"
    )

Table 3: Number of Prosecutions by Ideological Affiliation
Ideological Affiliation	Number of Prosecutions
Leftist: eco-animal focused	121
Leftist: government-focused	288
Leftist: identity-focused	107
Leftist: unspecified	145
Nationalist-separatist	69
No affiliation/not a factor	121
Other	39
Rightist: abortion-focused	85
Rightist: government-focused	1244
Rightist: identity-focused	1256
Rightist: unspecified	121
Salafi/Jihadist/Islamist	540
Unclear	139
NA	982

Simplify ideologies

Because there are multiple leftist categories and multiple rightist categories, I add a column that consolidates the leftist types and the rightist types. This hides important variation within ideologies but allows me to more easily assess political statements about left vs. right violence.

Code

df <- df |> 
    mutate(
        ideology_simple = case_when(
            str_detect(ideological_affiliation, "Leftist") ~ "Leftist",
            str_detect(ideological_affiliation, "Rightist") ~ "Rightist",
            TRUE ~ ideological_affiliation
        )
    )

df |>
    count(ideology_simple, sort = TRUE) |> 
    knitr::kable(
        col.names = c("Ideological Affiliation", "Number of Prosecutions"),
        caption = "Table 4: Number of Prosecutions by Simplified Ideological Affiliation"
    )

Table 4: Number of Prosecutions by Simplified Ideological Affiliation
Ideological Affiliation	Number of Prosecutions
Rightist	2706
NA	982
Leftist	661
Salafi/Jihadist/Islamist	540
Unclear	139
No affiliation/not a factor	121
Nationalist-separatist	69
Other	39

Focus on Violence

How many defendants used violent vs. non-violent methods, by ideological affiliation?

After classifying each method, I keep only the violent cases for the visualizations.

Code

# classify methods as violent or non-violent
violent <- c(
  "Unarmed assault",
  "Hostage-taking",
  "Armed intimidation/standoff",
  "Vehicle ramming",
  "Chemical or biological weapon deployment",
  "Firearms: civilian",
  "Firearms: military",
  "Explosives",
  "Other weapons"
)

df <- df |> 
    mutate(method_type = if_else(
        criminal_method %in% violent,
        "violent",
        "non-violent"
    ))

df |> 
    group_by(ideology_simple, method_type) |> 
    summarise(n = n()) |> 
    pivot_wider(names_from = method_type, values_from = n, values_fill = 0) |>
    select(ideology_simple, violent, `non-violent`) |>
    arrange(desc(violent)) |> 
    knitr::kable(
        col.names = c("Ideological Affiliation", "Violent Prosecutions", "Non-Violent Prosecutions"),
        caption = "Table 5: Number of Violent vs. Non-Violent Prosecutions by Ideological Affiliation"
    )

Table 5: Number of Violent vs. Non-Violent Prosecutions by Ideological Affiliation
Ideological Affiliation	Violent Prosecutions	Non-Violent Prosecutions
Rightist	1200	1506
Salafi/Jihadist/Islamist	132	408
Leftist	124	537
Unclear	91	48
No affiliation/not a factor	46	75
Other	35	4
Nationalist-separatist	33	36
NA	19	963

Prosecutions by ideological affiliation

Code

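# Keep violent methods only and drop defendants with no ideological affiliation.
# Note: `!=` evaluates to NA where ideology_simple is NA,
# so filter() drops those rows as well.
violent_crimes <- df |> 
    filter(
        method_type == "violent",
        ideology_simple != "No affiliation/not a factor"
        )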
    
violent_crimes_by_ideology <- violent_crimes |> 
    count(ideology_simple, name = "violent_prosecutions") |> 
    arrange(desc(violent_prosecutions))

ggplot(violent_crimes_by_ideology, aes(x = reorder(ideology_simple, violent_prosecutions), 
                                       y = violent_prosecutions)) +
    geom_col(fill = "black", alpha = 0.8) +
    coord_flip() +
    labs(
        title = "Defendants Prosecuted for Violent Crimes in the US",
        subtitle = "By Ideological Affiliation of the Perpetrator",
        x = "Their Ideology",
        y = "Number of Defendants",
        caption = "Source: The Prosecution Project"
    ) +
    theme_minimal() +
    theme(
        plot.title = element_text(size = 14, face = "bold"),
        plot.subtitle = element_text(size = 12),
        axis.text = element_text(size = 10),
        axis.title = element_text(size = 11)
    )

Prosecutions over time

Code

df_yearly <- violent_crimes |>
    mutate(
        year = lubridate::year(date),
        year = as.integer(year)
    ) |>
    filter(!is.na(year)) |>
    count(year, ideology_simple, name = "prosecutions")

presidents <- tibble(
  year = c(1993L, 2001L, 2009L, 2017L, 2021L, 2025L),
  president = c("Clinton (D)", "Bush (R)", "Obama (D)", "Trump (R)", "Biden (D)", "Trump (R)")
)

ggplot() +
    geom_vline(data = presidents, aes(xintercept = year), 
               color = "black", linetype = "dashed", alpha = 0.7) +
    geom_text(data = presidents, aes(x = year+.25, y = 110, label = president), 
              vjust = 0, hjust = 0, angle = 0, size = 4, color = "black") +
    geom_col(data = df_yearly, aes(x = year, y = prosecutions, fill = ideology_simple), alpha = .8) +
    scale_fill_manual(values = c(
        "Rightist" = "#c14a58ff",                    # Red for Rightist
        "Leftist" = "#337fb5ff",                     # Blue for Leftist  
        "Salafi/Jihadist/Islamist" = "#e08738ff",    # Orange
        "Unclear" = "#71a771ff",                     # Green
        "Nationalist-separatist" = "#8c564b",      # Brown
        "Other" = "#e377c2"                        # Pink
    )) +
    scale_x_continuous(
        breaks = scales::pretty_breaks(n = 9),
        limits = c(1990, 2026)
    ) +
    labs(
        title = "Defendants Prosecuted for Violent Crimes in the US by Year",
        subtitle = "By Ideological Affiliation of the Perpetrator",
        x = "Year",
        y = "Number of Defendants",
        fill = "Their Ideology",
        caption = "Source: The Prosecution Project"
    ) +
    guides(fill = guide_legend(ncol = 6)) +
    theme_minimal() +
    theme(
        plot.title = element_text(size = 14, face = "bold"),
        plot.subtitle = element_text(size = 12),
        axis.text = element_text(size = 10),
        axis.title = element_text(size = 11),
        panel.grid.minor = element_blank(),
        legend.position = "bottom",
        legend.title = element_text(size = 10),
        legend.text = element_text(size = 9)
    )

Deaths

The data is structured by prosecution. A single defendant can be prosecuted in multiple jurisdictions for the same incident. For example, a shooter can appear in a federal indictment and a state indictment for the same shooting. If I am not careful I could overcount deaths from these crimes.

How often does this happen? How much would I overcount if I simply summed the number killed from all rows in the data?

Code

# were any individuals charged at multiple jurisdictional levels?
dupes <- violent_crimes |>
  group_by(full_legal_name) |>
  arrange(date) |>
  summarize(
    n_jurisdiction_levels = n_distinct(jurisdiction_level),
    jurisdictions = str_c(jurisdiction_level, collapse = "; "),
    case_ids = str_c(case_id, collapse = "; "),
    date_types = str_c(date_descriptor, collapse = "; "),
    state_locations = str_c(location_state, collapse = "; "),
    n_killed_each = str_c(number_killed, collapse = "; "),
    ideology = str_c(ideology_simple, collapse = "; ")
  ) |>
  filter(n_jurisdiction_levels > 1) |>
  arrange((full_legal_name)) |>
  select(-n_jurisdiction_levels)

# how much overcount?
# extract digits from n_killed_each column where there are multiple numbers that are the same
dupe_counts <- dupes |>
  mutate(
    n_killed_each_list = str_extract_all(n_killed_each, "\\d+"),
    n_killed_each_list = map(n_killed_each_list, as.integer),
    total_killed_if_summed = map_int(n_killed_each_list, sum),
    are_equal = map_lgl(n_killed_each_list, ~ length(unique(.x)) == 1)
  ) |>
  filter(are_equal) |>  # just the ones where the numbers are the same
  mutate(actual_killed = map_int(n_killed_each_list, ~ ifelse(length(.x) > 0, .x[1], 0))) |>
  select(-n_killed_each_list, -are_equal)

dupes |> 
    knitr::kable(
        col.names = c("Defendant Name", "Jurisdictions", "Case IDs", "Date Types", "States", "# Killed Each", "Ideology"),
        caption = "Table 6: Defendants Charged at Multiple Jurisdictional Levels"
    )

Table 6: Defendants Charged at Multiple Jurisdictional Levels
Defendant Name	Jurisdictions	Case IDs	Date Types	States	# Killed Each	Ideology
Adam Purinton	State; Federal	02232017_AWP; 06092017_AWP	Arrest/arraignment; Indictment	Kansas; Kansas	1; 1	Rightist; Rightist
Alan Dale Covington	State; Federal	11302018_ADC1; 02202019_ADC2	Indictment; Indictment	Utah; Utah	0; 0	Rightist; Rightist
Donald Hansard	State; Federal	08221997_DH_WCOTC4; 10021998_DH_VIDEO3	Crime/attack; Indictment	Florida; Florida	0; 0	Rightist; Rightist
Dylann Storm Roof	State; Federal	07072015_DSR; 07222015_DSR	Indictment; Indictment	South Carolina; South Carolina	9; 9	Rightist; Rightist
James Alex Fields Jr.	State; Federal	08182017_JAF; 06272018_JAF	Crime/attack; Indictment	Virginia; Virginia	1; 1	Rightist; Rightist
James Charles Kopp	Federal; State	10172000_JCK; 05092003_JCK	Indictment; Sentencing	New York; New York	1; 1	Rightist; Rightist
John Fitzgerald Johnson	State; Federal	12022020_JFJ; 02242021_JFJ	Indictment; Indictment	Kentucky; Kentucky	0; 0	Leftist; Leftist
Patrick Wood Crusius	State; Federal	09122019_PWC2; 07092020_PWC1	Indictment; Indictment	Texas; Texas	23; 23	Rightist; Rightist
Rachelle Ranae Pauli Shannon	State; Federal	08191993_RRS; 06071995_RRS	Crime/attack; Plea	Kansas; Multiple states	0; 0	Rightist; Rightist
Ray Lazier Lengend	Federal; State	03192012_RLL2; 03192012_RLL1	Indictment; Indictment	New York; New York	0; 0	Rightist; Rightist
Raymond A. Leone	State; Federal	08221997_RL_WCOTC1; 03291998_RAL_VIDEO1	Crime/attack; Crime/attack	Florida; Florida	0; 0	Rightist; Rightist
Ross Anthony Farca	State; Federal	06102019_RAF; 11192019_RAF	Arrest/arraignment; Indictment	California; California	0; 0	Rightist; Rightist
Steven Carrillo	State; Federal	05292020_SC; 06162020_SC_CARRILLO1	Crime/attack; Indictment	California; California	1; 1	Rightist; Rightist

Out of our 1615 violent crime prosecutions, there are 13 defendants who were charged at multiple jurisdiction levels. If I simply summed the number killed from every row, I would mistakenly double count some deaths. For these 13 defendants, I would get 72 deaths instead of the actual 36 deaths. This would be an overcount of 36 deaths.
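
The overcount arithmetic can be reproduced from Table 6 alone. This is a minimal sketch that copies the per-defendant values from the "# Killed Each" column by hand rather than reading them from `dupe_counts`; the variable names are illustrative only.

```r
# Deaths per defendant, copied from the "# Killed Each" column of Table 6
# (one value per defendant; each defendant appears in two prosecutions)
killed_per_defendant <- c(1, 0, 0, 9, 1, 1, 0, 23, 0, 0, 0, 0, 1)

actual_deaths <- sum(killed_per_defendant)      # each death counted once
naive_deaths  <- sum(2 * killed_per_defendant)  # both rows per defendant summed
overcount     <- naive_deaths - actual_deaths

c(actual = actual_deaths, naive = naive_deaths, overcount = overcount)
```

With the thirteen values above, this gives 36 actual deaths, 72 when every row is summed, and an overcount of 36, matching the figures in the text.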

Now, let’s look at the number of deaths by ideological affiliation, and avoid double counting from cases where a defendant was prosecuted twice.

Code

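# Don't double count deaths when more than one defendant is tried for the same incident.
# Rows are grouped by the date prefix of the case ID, which is treated as the incident date.
# Caveat: in Table 6 the state and federal case IDs for one defendant have
# different date prefixes, so those pairs are not merged by this grouping.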
df_deaths <- violent_crimes |>
  mutate(
    # Extract the incident date from Case ID (first part before underscore)  
    incident_date = str_extract(case_id, "^[^_]+"),
    # Parse the date correctly
    incident_date_formatted = case_when(
      str_length(incident_date) == 8 & str_detect(incident_date, "^\\d{8}$") ~ 
        paste0(str_sub(incident_date, 1, 2), "/", 
               str_sub(incident_date, 3, 4), "/", 
               str_sub(incident_date, 5, 8)),
      TRUE ~ incident_date
    )
  ) |>
  group_by(incident_date, incident_date_formatted) |>
  summarise(
    unique_incidents = 1,  
    related_cases = n(),   
    total_individuals = n_distinct(full_legal_name), 
    total_deaths = first(number_killed, na_rm = TRUE),  
    case_names = paste(unique(name_of_case), collapse = "; "),
    ideology_simple = first(ideology_simple, na_rm = TRUE),
    .groups = "drop"
  ) |>
  group_by(ideology_simple) |> 
  summarise(total_deaths = sum(total_deaths, na.rm = TRUE)) |> 
  arrange(desc(total_deaths))

plot1 <- ggplot(df_deaths, aes(x = reorder(ideology_simple, total_deaths), 
                                       y = total_deaths)) +
    geom_col(fill = "black", alpha = 0.8) +
    coord_flip() +
    labs(
        title = "Deaths Reported in Prosecutions in the US",
        subtitle = "By Ideological Affiliation of the Perpetrator",
        x = "Perpetrator Ideology",
        y = "Deaths",
        caption = "Source: The Prosecution Project"
    ) +
    theme_minimal() +
    theme(
        plot.title = element_text(size = 14, face = "bold"),
        plot.subtitle = element_text(size = 12),
        axis.text = element_text(size = 10),
        axis.title = element_text(size = 11)
    )

plot1
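
Table 6 shows that one defendant's state and federal case IDs can carry different date prefixes (for example `07072015_DSR` and `07222015_DSR`), so grouping on the case ID prefix alone may still count some deaths twice. One possible extra safeguard, sketched here on a small made-up data frame rather than the real data, is to collapse each defendant to a single row before summing. The toy rows are hypothetical, and the assumption that rows sharing a `full_legal_name` describe the same incident is mine, not the dataset's.

```r
library(dplyr)

# Toy data (made up for illustration): Defendant A is prosecuted twice
# for the same incident, Defendant B once
toy <- data.frame(
  full_legal_name = c("Defendant A", "Defendant A", "Defendant B"),
  case_id         = c("01012000_A1", "03012000_A2", "05052005_B"),
  ideology_simple = c("Rightist", "Rightist", "Leftist"),
  number_killed   = c(2L, 2L, 1L)
)

deaths_by_ideology <- toy |>
  # keep the first row per defendant so repeat prosecutions are not summed twice
  distinct(full_legal_name, .keep_all = TRUE) |>
  group_by(ideology_simple) |>
  summarise(total_deaths = sum(number_killed, na.rm = TRUE), .groups = "drop") |>
  arrange(desc(total_deaths))

deaths_by_ideology
```

This trades one risk for another: a defendant prosecuted for two genuinely separate incidents would be undercounted, and several defendants tried for one incident would still need the date-based grouping, so the two steps would likely have to be combined and checked by hand.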

Notable Trends

There is an obvious spike in leftist defendants prosecuted during the George Floyd protests in 2020.

Code

# just leftist
yearly_left <- df_yearly |> 
    filter(ideology_simple == "Leftist")

ggplot() +
    geom_vline(data = presidents, aes(xintercept = year), 
               color = "black", linetype = "dashed", alpha = 0.7) +
    geom_text(data = presidents, aes(x = year, y = 38, label = president), 
              vjust = 1.65, hjust = 0, angle = 90, size = 3, color = "black") +
    geom_col(data = yearly_left, aes(x = year, y = prosecutions, fill = ideology_simple)) +
    scale_fill_manual(values = c(
        "Rightist" = "#c14a58ff",                    # Red for Rightist
        "Leftist" = "#337fb5ff",                     # Blue for Leftist  
        "Salafi/Jihadist/Islamist" = "#e08738ff",    # Orange
        "Unclear" = "#71a771ff",                     # Green
        "Nationalist-separatist" = "#8c564b",      # Brown
        "Other" = "#e377c2"                        # Pink
    )) +
    scale_x_continuous(breaks = scales::pretty_breaks(n = 8)) +
    labs(
        title = "Leftist Defendants Prosecuted for Violent Crimes in the US by Year",
        subtitle = "",
        x = "Year",
        y = "Number of Defendants",
        fill = "Their Ideology",
        caption = "Source: The Prosecution Project"
    ) +
    guides(fill = guide_legend(ncol = 6)) +
    theme_minimal() +
    theme(
        plot.title = element_text(size = 14, face = "bold"),
        plot.subtitle = element_text(size = 12),
        axis.text = element_text(size = 10),
        axis.title = element_text(size = 11),
        panel.grid.minor = element_blank(),
        legend.position = "bottom",
        legend.title = element_text(size = 10),
        legend.text = element_text(size = 9)
    )

There is also an obvious spike in rightist defendants prosecuted following the January 6 Capitol riot in 2021.

Code

# just rightist
yearly_right <- df_yearly |> 
    filter(ideology_simple == "Rightist")

ggplot() +
    geom_vline(data = presidents, aes(xintercept = year), 
               color = "black", linetype = "dashed", alpha = 0.7) +
    geom_text(data = presidents, aes(x = year, y = 80, label = president), 
              vjust = 1.65, hjust = 0, angle = 90, size = 3, color = "black") +
    geom_col(data = yearly_right, aes(x = year, y = prosecutions, fill = ideology_simple)) +
    scale_fill_manual(values = c(
        "Rightist" = "#c14a58ff",                    # Red for Rightist
        "Leftist" = "#337fb5ff",                     # Blue for Leftist  
        "Salafi/Jihadist/Islamist" = "#e08738ff",    # Orange
        "Unclear" = "#71a771ff",                     # Green
        "Nationalist-separatist" = "#8c564b",      # Brown
        "Other" = "#e377c2"                        # Pink
    )) +
    scale_x_continuous(breaks = scales::pretty_breaks(n = 8)) +
    labs(
        title = "Rightist Defendants Prosecuted for Violent Crimes in the US by Year",
        subtitle = "",
        x = "Year",
        y = "Number of Defendants",
        fill = "Their Ideology",
        caption = "Source: The Prosecution Project"
    ) +
    guides(fill = guide_legend(ncol = 6)) +
    theme_minimal() +
    theme(
        plot.title = element_text(size = 14, face = "bold"),
        plot.subtitle = element_text(size = 12),
        axis.text = element_text(size = 10),
        axis.title = element_text(size = 11),
        panel.grid.minor = element_blank(),
        legend.position = "bottom",
        legend.title = element_text(size = 10),
        legend.text = element_text(size = 9)
    )

Perhaps the most notable trend is the rise in the number of defendants whose ideology is unclear.

Code

yearly_unclear <- df_yearly |> 
    filter(ideology_simple == "Unclear")

ggplot() +
    geom_vline(data = presidents, aes(xintercept = year), 
               color = "black", linetype = "dashed", alpha = 0.7) +
    geom_text(data = presidents, aes(x = year, y = 12, label = president), 
              vjust = 1.65, hjust = 0, angle = 90, size = 3, color = "black") +
    geom_col(data = yearly_unclear, aes(x = year, y = prosecutions, fill = ideology_simple)) +
    scale_fill_manual(values = c(
        "Rightist" = "#c14a58ff",                    # Red for Rightist
        "Leftist" = "#337fb5ff",                     # Blue for Leftist  
        "Salafi/Jihadist/Islamist" = "#e08738ff",    # Orange
        "Unclear" = "#71a771ff",                     # Green
        "Nationalist-separatist" = "#8c564b",      # Brown
        "Other" = "#e377c2"                        # Pink
    )) +
    scale_x_continuous(breaks = scales::pretty_breaks(n = 8)) +
    labs(
        title = "Defendants Prosecuted for Violent Crimes in the US by Year",
        subtitle = "Whose ideology is unclear",
        x = "Year",
        y = "Number of Defendants",
        fill = "Their Ideology",
        caption = "Source: The Prosecution Project"
    ) +
    guides(fill = guide_legend(ncol = 6)) +
    theme_minimal() +
    theme(
        plot.title = element_text(size = 14, face = "bold"),
        plot.subtitle = element_text(size = 12),
        axis.text = element_text(size = 10),
        axis.title = element_text(size = 11),
        panel.grid.minor = element_blank(),
        legend.position = "bottom",
        legend.title = element_text(size = 10),
        legend.text = element_text(size = 9)
    )

Unclear Ideological Affiliation

Some defendants in the dataset do not have a clear ideological affiliation. These cases represent violent acts where the motivation is ambiguous or not clearly tied to a specific political ideology. This section examines what methods these defendants used and what kinds of targets they attacked.

Code

unclear_ideology <- violent_crimes |>
    filter(ideology_simple == "Unclear")

# Criminal methods for unclear ideology
method_counts <- unclear_ideology |>
    count(criminal_method, sort = TRUE) |>
    filter(!is.na(criminal_method)) |>
    slice_head(n = 15)

ggplot(method_counts, aes(x = reorder(criminal_method, n), y = n)) +
    geom_col(fill = "black", alpha = 0.8) +
    coord_flip() +
    labs(
        title = "Criminal Methods Used by Defendants",
        subtitle = "With Unclear Ideological Affiliation",
        x = NULL,
        y = "Number of Defendants",
        caption = "Source: The Prosecution Project"
    ) +
    theme_minimal() +
    theme(
        plot.title = element_text(size = 14, face = "bold"),
        plot.subtitle = element_text(size = 12),
        axis.text = element_text(size = 10),
        axis.title = element_text(size = 11)
    )

Code

# Ideological targets for unclear ideology
target_counts <- unclear_ideology |>
    count(`ideological_target`, sort = TRUE) |>
    filter(!is.na(ideological_target)) |>
    slice_head(n = 15)

ggplot(target_counts, aes(x = reorder(ideological_target, n), y = n)) +
    geom_col(fill = "black", alpha = 0.8) +
    coord_flip() +
    labs(
        title = "Targets Attacked by Defendants",
        subtitle = "With Unclear Ideological Affiliation",
        x = NULL,
        y = "Number of Defendants",
        caption = "Source: The Prosecution Project"
    ) +
    theme_minimal() +
    theme(
        plot.title = element_text(size = 14, face = "bold"),
        plot.subtitle = element_text(size = 12),
        axis.text = element_text(size = 10),
        axis.title = element_text(size = 11)
    )

References

Hutzler, Alexandra, and Michelle Stoddart. 2025. “Trump Doubles down on Blaming ‘radical Left’ after Vow to Go after Political Violence.” ABC News, September 12. https://abcnews.go.com/Politics/trump-doubles-blaming-radical-left-after-vow-after/story?id=125509965.

Loadenthal, Michael, Lauren Donahoe, Madison Weaver, Sara Godfrey, Kathryn Blowers, et al. 2023. “The Prosecution Project Dataset.” The Prosecution Project [dataset]. https://theprosecutionproject.org/.

Suter, Tara. 2025. “Elon Musk: ‘The Left Is the Party of Murder.’” The Hill, September 14. https://thehill.com/policy/technology/5502535-elon-musk-charlie-kirk-death/.

Code

This descriptive analysis was created by Jeremy Allen using R, tidyverse, janitor, and RColorBrewer packages. The report is built with Quarto. The code is available on GitHub.

Footnotes

¹ See The Prosecution Project’s FAQ, which discusses what data is included, what is excluded, and why. For example, the data is likely an undercount of violence because it is compiled only from prosecutions. If a violent crime occurred and the perpetrator was killed or died, there would be no prosecution and thus no data point in this set for that crime.