Author

Jeremy Allen

Political violence in the US

I examine the number of people prosecuted for violent crimes in the United States and the number of deaths from those crimes. I do not count all violent crimes, only those with an ideological motivation and a guilty verdict. I use data¹ from The Prosecution Project Dataset (Loadenthal et al. 2023). In this data, each row is a defendant in a case.

After Charlie Kirk was assassinated in September 2025, the US president, Donald Trump, and Elon Musk said the left is the real problem (Suter 2025; Hutzler and Stoddart 2025). They struggle with numbers, so this report looks at reliable numbers. The left, the right, and others all commit violence. However, the lopsided swell of violence from the right is obvious to any observer more interested in knowledge than in inflammatory politics.

Prepare the data

Import the data, clean the column names, remove pending cases and cases without a clear guilty verdict, and keep only cases in the US.

Code
suppressPackageStartupMessages({
    library(tidyverse)
    library(janitor)
    library(RColorBrewer)
})
Code
df <- read_csv("data/tpp-2025-09-15-general.csv") |> 
    clean_names()

# number_killed mixes numbers with text such as "Multiple" and "Unknown";
# convert it to numeric so that the text entries become NA
df <- df |> 
    mutate(
    number_killed = case_when(
      number_killed %in% c("Multiple", "Unknown") ~ NA_real_,  # Convert text to NA
      is.na(number_killed) ~ NA_real_,  # Keep existing NAs
      TRUE ~ suppressWarnings(as.numeric(number_killed))  # Parse remaining values as numbers
    )
  )

no_clear_guilt <- c("Charged but not tried", "Data not available", "Not guilty", "Hung jury/mistrial", "Pending")
num_no_clear_guilt <- df |> 
    filter(verdict %in% no_clear_guilt) |> 
    nrow()

df <- df |> 
    filter(!verdict %in% no_clear_guilt)

df <- df |> 
    filter(location_country == "United States")

There are 1322 defendants without a clear guilty verdict that I remove from the analysis.

Are any defendants duplicated for the same charge? For example, being in the data on two different dates as part of the same case?

Code
duplicates <- df |> 
    group_by(full_legal_name, name_of_case) |> 
    filter(n() > 1) |> 
    ungroup()

df <- df |> 
    filter(!case_id %in% c("09262001_PMC", "02142024_KC"))

There are two cases where the defendants are in the data twice for the same charge, once for their indictment and once for a second part of their trial, on different dates. I remove two rows to avoid double counting these two defendants. This leaves us with 3302 defendants. Four others are duplicates but for different charges, so I keep them.
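
The duplicate check above can be illustrated on a toy table. This is a minimal base R sketch, not the report's own code: the names, cases, and IDs are invented, and only the column names follow the dataset. It flags every row whose defendant and case pair occurs more than once, so those rows can be reviewed by hand.

```r
# Hypothetical toy data: names, cases, and IDs are invented for illustration
toy <- data.frame(
  full_legal_name = c("Person A", "Person A", "Person B", "Person C", "Person C"),
  name_of_case    = c("Case 1",   "Case 1",   "Case 2",   "Case 3",   "Case 4"),
  case_id         = c("id_1",     "id_2",     "id_3",     "id_4",     "id_5"),
  stringsAsFactors = FALSE
)

# Key each row by defendant and case, then flag every row whose key repeats
key <- paste(toy$full_legal_name, toy$name_of_case, sep = " | ")
is_repeat <- key %in% key[duplicated(key)]

# Rows to review by hand: Person A appears twice in Case 1
toy[is_repeat, ]
```

A defendant who appears in two different cases, like Person C here, is not flagged. Whether a flagged row is a true double count still has to be judged by reading the case details.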

How many defendants committed crimes that were not politically linked or motivated?

Code
total_defendants <- nrow(df)

defendants_not_political <- df |>
    filter(criminal_method == "Criminal violation not linked or motivated politically") |> 
    nrow()

Of a total of 3302 defendants, 243 (7.36%) were not politically motivated. I exclude these defendants going forward, along with those whose criminal method is unknown or unspecified.
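
As a quick check of the arithmetic, the share can be recomputed from the two counts reported above. This is a minimal sketch with the counts typed in by hand rather than read from the data:

```r
# Counts copied from the text above (hard-coded here for illustration)
total_defendants <- 3302
defendants_not_political <- 243

# Share of defendants whose crime was not politically motivated, in percent
pct_not_political <- round(100 * defendants_not_political / total_defendants, 2)
pct_not_political
```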

Code
# remove defendants not politically motivated or unknown
df <- df |>
    filter(!criminal_method %in% c("Unknown/unspecified/undeveloped", "Criminal violation not linked or motivated politically"))

# parse dates
df <- df |> 
    mutate(
        date = str_trim(date),
        date = lubridate::mdy(date)
    ) 

Were all of these crimes fully carried through?

Code
df |> 
    count(completion_of_crime) |> 
    knitr::kable(
        col.names = c("Case Outcome", "Number of Prosecutions"),
        caption = "Table 1: Case Outcomes"
    )
Table 1: Case Outcomes

| Case Outcome | Number of Prosecutions |
|---|---|
| Attempted | 296 |
| Carried through | 2050 |
| Planned but not attempted | 493 |
| Threat | 186 |
| Unknown | 13 |

Code
num_threats_and_unknown <- df |> 
    filter(completion_of_crime %in% c("Threat", "Unknown")) |> 
    nrow()

I am more interested in harm that was attempted, planned, or occurred, so I will filter out the others (199 defendants). That leaves us with 2839 defendants.
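
The row accounting can be checked against Table 1. A small sketch, with the counts copied by hand from the table rather than computed from the data:

```r
# Counts copied from Table 1 (hard-coded for illustration)
completion_counts <- c(
  "Attempted" = 296,
  "Carried through" = 2050,
  "Planned but not attempted" = 493,
  "Threat" = 186,
  "Unknown" = 13
)

# Defendants dropped: threats and unknown outcomes
dropped <- sum(completion_counts[c("Threat", "Unknown")])

# Defendants kept: attempted, carried through, or planned
kept <- sum(completion_counts) - dropped

c(dropped = dropped, kept = kept)
```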

Code
df <- df |> 
    filter(!completion_of_crime %in% c("Threat", "Unknown"))

Which dates are represented in the data?

Code
# Warn if minimum date is earlier than 1900 or latest date is later than today
date_range <- range(df$date, na.rm = TRUE)
if (date_range[1] < as.Date("1900-01-01") || date_range[2] > Sys.Date()) {
    warning("Date range is outside expected bounds.")
}


df |>
    summarise(
        `Earliest Date` = min(date, na.rm = TRUE),
        `Latest Date` = max(date, na.rm = TRUE)
    ) |> 
    knitr::kable(caption = "Table 2: Date Range of the Data")
Table 2: Date Range of the Data

| Earliest Date | Latest Date |
|---|---|
| 1990-01-22 | 2025-05-13 |

Which ideological affiliations are represented in the data?

Code
df |>
    count(ideological_affiliation) |> 
    knitr::kable(
        col.names = c("Ideological Affiliation", "Number of Prosecutions"),
        caption = "Table 3: Number of Prosecutions by Ideological Affiliation"
    )
Table 3: Number of Prosecutions by Ideological Affiliation

| Ideological Affiliation | Number of Prosecutions |
|---|---|
| Leftist: eco-animal focused | 120 |
| Leftist: government-focused | 98 |
| Leftist: identity-focused | 24 |
| Leftist: unspecified | 6 |
| Nationalist-separatist | 70 |
| No affiliation/not a factor | 103 |
| Other | 39 |
| Rightist: abortion-focused | 85 |
| Rightist: government-focused | 402 |
| Rightist: identity-focused | 1202 |
| Rightist: unspecified | 43 |
| Salafi/Jihadist/Islamist | 539 |
| Unclear | 108 |

Simplify ideologies

Because there are multiple leftist categories and multiple rightist categories, I add a column that consolidates the leftist types and the rightist types. This hides important variation within ideologies but makes it easier to assess political statements about left vs. right violence.

Code
df <- df |> 
    mutate(
        ideology_simple = case_when(
            str_detect(ideological_affiliation, "Leftist") ~ "Leftist",
            str_detect(ideological_affiliation, "Rightist") ~ "Rightist",
            TRUE ~ ideological_affiliation
        )
    )

df |>
    count(ideology_simple, sort = TRUE) |> 
    knitr::kable(
        col.names = c("Ideological Affiliation", "Number of Prosecutions"),
        caption = "Table 4: Number of Prosecutions by Simplified Ideological Affiliation"
    )
Table 4: Number of Prosecutions by Simplified Ideological Affiliation

| Ideological Affiliation | Number of Prosecutions |
|---|---|
| Rightist | 1732 |
| Salafi/Jihadist/Islamist | 539 |
| Leftist | 248 |
| Unclear | 108 |
| No affiliation/not a factor | 103 |
| Nationalist-separatist | 70 |
| Other | 39 |

Focus on Violence

How many defendants used violent vs. non-violent methods, by ideological affiliation?

I then keep only the defendants who used violent methods for the visualizations.

Code
# classify methods as violent or non-violent
violent <- c(
  "Unarmed assault",
  "Hostage-taking",
  "Armed intimidation/standoff",
  "Vehicle ramming",
  "Chemical or biological weapon deployment",
  "Firearms: civilian",
  "Firearms: military",
  "Explosives",
  "Other weapons"
)

df <- df |> 
    mutate(method_type = if_else(
        criminal_method %in% violent,
        "violent",
        "non-violent"
    ))

df |> 
    group_by(ideology_simple, method_type) |> 
    summarise(n = n()) |> 
    pivot_wider(names_from = method_type, values_from = n, values_fill = 0) |>
    select(ideology_simple, violent, `non-violent`) |>
    arrange(desc(violent)) |> 
    knitr::kable(
        col.names = c("Ideological Affiliation", "Violent Prosecutions", "Non-Violent Prosecutions"),
        caption = "Table 5: Number of Violent vs. Non-Violent Prosecutions by Ideological Affiliation"
    )
Table 5: Number of Violent vs. Non-Violent Prosecutions by Ideological Affiliation

| Ideological Affiliation | Violent Prosecutions | Non-Violent Prosecutions |
|---|---|---|
| Rightist | 952 | 780 |
| Salafi/Jihadist/Islamist | 133 | 406 |
| Unclear | 77 | 31 |
| Leftist | 62 | 186 |
| No affiliation/not a factor | 38 | 65 |
| Other | 35 | 4 |
| Nationalist-separatist | 33 | 37 |

Defendants prosecuted for violent crimes in the US by ideological affiliation

Code
violent_crimes <- df |> 
    filter(
        method_type == "violent",
        ideology_simple != "No affiliation/not a factor"
        )
    
violent_crimes_by_ideology <- violent_crimes |> 
    count(ideology_simple, name = "violent_prosecutions") |> 
    arrange(desc(violent_prosecutions))

ggplot(violent_crimes_by_ideology, aes(x = reorder(ideology_simple, violent_prosecutions), 
                                       y = violent_prosecutions)) +
    geom_col(fill = "black", alpha = 0.8) +
    coord_flip() +
    labs(
        title = "Defendants Prosecuted for Violent Crimes in the US",
        subtitle = "By Ideological Affiliation of the Perpetrator",
        x = "Their Ideology",
        y = "Number of Defendants",
        caption = "Source: The Prosecution Project"
    ) +
    theme_minimal() +
    theme(
        plot.title = element_text(size = 14, face = "bold"),
        plot.subtitle = element_text(size = 12),
        axis.text = element_text(size = 10),
        axis.title = element_text(size = 11)
    )

Defendants prosecuted for violent crimes in the US by ideological affiliation by year

Code
df_yearly <- violent_crimes |>
    mutate(
        year = lubridate::year(date),
        year = as.integer(year)
    ) |>
    filter(!is.na(year)) |>
    count(year, ideology_simple, name = "prosecutions")

ggplot(df_yearly, aes(x = year, y = prosecutions, fill = ideology_simple)) +
    geom_col() +
    scale_fill_manual(values = c(
        "Rightist" = "#c14a58ff",                    # Red for Rightist
        "Leftist" = "#337fb5ff",                     # Blue for Leftist  
        "Salafi/Jihadist/Islamist" = "#e08738ff",    # Orange
        "Unclear" = "#71a771ff",                     # Green
        "Nationalist-separatist" = "#8c564b",      # Brown
        "Other" = "#e377c2"                        # Pink
    )) +
    scale_x_continuous(breaks = scales::pretty_breaks(n = 8)) +
    labs(
        title = "Defendants Prosecuted for Violent Crimes in the US by Year",
        subtitle = "By Ideological Affiliation of the Perpetrator",
        x = "Year",
        y = "Number of Defendants",
        fill = "Their Ideology",
        caption = "Source: The Prosecution Project"
    ) +
    guides(fill = guide_legend(ncol = 6)) +
    theme_minimal() +
    theme(
        plot.title = element_text(size = 14, face = "bold"),
        plot.subtitle = element_text(size = 12),
        axis.text = element_text(size = 10),
        axis.title = element_text(size = 11),
        legend.position = "bottom",
        legend.title = element_text(size = 10),
        legend.text = element_text(size = 9)
    )

Deaths

Code
# Don't double count deaths in cases where more than one defendant is tried for the same case
df_deaths <- violent_crimes |>
  mutate(
    # Extract the incident date from Case ID (first part before underscore)  
    incident_date = str_extract(case_id, "^[^_]+"),
    # Parse the date correctly
    incident_date_formatted = case_when(
      str_length(incident_date) == 8 & str_detect(incident_date, "^\\d{8}$") ~ 
        paste0(str_sub(incident_date, 1, 2), "/", 
               str_sub(incident_date, 3, 4), "/", 
               str_sub(incident_date, 5, 8)),
      TRUE ~ incident_date
    )
  ) |>
  group_by(incident_date, incident_date_formatted) |>
  summarise(
    unique_incidents = 1,  
    related_cases = n(),   
    total_individuals = n_distinct(full_legal_name), 
    total_deaths = first(number_killed, na_rm = TRUE),  
    case_names = paste(unique(name_of_case), collapse = "; "),
    ideology_simple = first(ideology_simple, na_rm = TRUE),
    .groups = "drop"
  ) |>
  group_by(ideology_simple) |> 
  summarise(total_deaths = sum(total_deaths, na.rm = TRUE)) |> 
  arrange(desc(total_deaths))

plot1 <- ggplot(df_deaths, aes(x = reorder(ideology_simple, total_deaths), 
                                       y = total_deaths)) +
    geom_col(fill = "black", alpha = 0.8) +
    coord_flip() +
    labs(
        title = "Who Causes More Death by Violent Crime in the US?",
        subtitle = "By Ideological Affiliation of the Perpetrator",
        x = "Perpetrator Ideology",
        y = "Deaths",
        caption = "Source: The Prosecution Project"
    ) +
    theme_minimal() +
    theme(
        plot.title = element_text(size = 14, face = "bold"),
        plot.subtitle = element_text(size = 12),
        axis.text = element_text(size = 10),
        axis.title = element_text(size = 11)
    )

plot1

References

Hutzler, Alexandra, and Michelle Stoddart. 2025. “Trump Doubles down on Blaming ‘radical Left’ after Vow to Go after Political Violence.” ABC News, September 12. https://abcnews.go.com/Politics/trump-doubles-blaming-radical-left-after-vow-after/story?id=125509965.

Loadenthal, Michael, Lauren Donahoe, Madison Weaver, Sara Godfrey, Kathryn Blowers, et al. 2023. “The Prosecution Project Dataset.” The Prosecution Project [dataset]. https://theprosecutionproject.org/.

Suter, Tara. 2025. “Elon Musk: ‘The Left Is the Party of Murder.’” The Hill, September 14. https://thehill.com/policy/technology/5502535-elon-musk-charlie-kirk-death/.

Code

This descriptive analysis was created by Jeremy Allen using R and the tidyverse, janitor, and RColorBrewer packages. The report is built with Quarto. The code is available on GitHub.

Footnotes

  1. See The Prosecution Project’s FAQ where they discuss what data is included, excluded, and why. For example, their data is likely an undercount of violence because they compile their data only from prosecutions. So, if there were a violent incident and the perpetrator were killed onsite, there would be no prosecution and no data point in this set.