Federal Sentencing Disparity
What the data shows about variation across federal districts, and what drives it.
What Is Sentencing Disparity?
Sentencing disparity exists when similarly situated defendants — those convicted of the same offense with comparable criminal histories — receive substantially different sentences. Disparity may be warranted (reflecting legitimate differences in case facts) or unwarranted (reflecting inconsistency or bias in how the sentencing system operates).
The Federal Sentencing Guidelines were created to address the wide sentencing variation documented before they took effect in 1987, when individual judges had nearly unlimited discretion within statutory limits. After United States v. Booker (2005) made the Guidelines advisory, disparity increased again, but it remains far lower than pre-Guidelines levels.
What USSC Data Reveals
The United States Sentencing Commission publishes annual datafiles on every sentenced federal offender. Analyzing this data across districts reveals:
- Average sentence length varies substantially across districts, even after accounting for offense mix.
- Within-guidelines rates differ widely: some districts sentence within the guideline range 80% or more of the time; others sentence outside it, through departures or variances, more than half the time.
- Departure direction also varies: some districts trend toward upward departures; others show predominantly downward ones.
- Case composition explains much of the variation — border districts with heavy immigration caseloads naturally show different averages than urban fraud-heavy districts.
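The case-composition point can be illustrated with a small Python sketch. All numbers are invented for illustration and are not USSC data; the district names, offense categories, and the `overall_mean` helper are hypothetical. Both toy districts impose identical average sentences for each offense type, yet their overall averages differ sharply because their caseloads differ.

```python
# Toy numbers, invented for illustration only (not USSC data).
# Both districts impose the same average sentence for each offense type.
MEAN_MONTHS = {"immigration": 12.0, "fraud": 30.0, "drug_trafficking": 70.0}

# Hypothetical case counts per offense type.
CASELOADS = {
    "border_district": {"immigration": 800, "fraud": 50, "drug_trafficking": 150},
    "urban_district": {"immigration": 50, "fraud": 500, "drug_trafficking": 450},
}


def overall_mean(caseload, mean_months):
    """Case-weighted average sentence across all offense types."""
    total_cases = sum(caseload.values())
    total_months = sum(n * mean_months[offense] for offense, n in caseload.items())
    return total_months / total_cases


for name, caseload in CASELOADS.items():
    print(f"{name}: {overall_mean(caseload, MEAN_MONTHS):.1f} months")
```

In this toy setup the border district averages 21.6 months and the urban district 47.1 months, even though no judge in either district sentences any offense type differently. That gap is entirely structural.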
Structural vs. Unexplained Disparity
Researchers distinguish between structural disparity (driven by case mix and legitimate factors) and unexplained disparity (residual variation after controlling for offense and history). Studies by the USSC and academic researchers find that a portion of variation persists after controlling for guidelines variables — though the magnitude and causes remain debated.
PlainSentencing's disparity scores reflect the average percentage difference between a district's sentence lengths and the national average for the same offense types. This is a descriptive measure — it identifies districts that deviate from national norms but does not establish cause.
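One way to read "average percentage difference for the same offense types" is sketched below in Python. This is a minimal illustration under stated assumptions, not PlainSentencing's actual formula: the function name `disparity_pct`, the case-count weighting, and every number in the example are assumptions made for this sketch.

```python
def disparity_pct(district, national_mean):
    """Case-weighted average of per-offense percentage differences.

    district: {offense: (case_count, district_mean_months)}
    national_mean: {offense: national_mean_months}
    Weighting by the district's case counts is an assumption of this sketch.
    """
    total_cases = sum(n for n, _ in district.values())
    weighted_sum = 0.0
    for offense, (n, mean_months) in district.items():
        baseline = national_mean[offense]
        pct_diff = (mean_months - baseline) / baseline * 100
        weighted_sum += n * pct_diff
    return weighted_sum / total_cases


# Invented values for a hypothetical district.
NATIONAL = {"fraud": 24.0, "drug_trafficking": 80.0}
DISTRICT_X = {"fraud": (100, 30.0), "drug_trafficking": (400, 88.0)}

print(f"{disparity_pct(DISTRICT_X, NATIONAL):+.1f}%")
```

Here fraud sentences run 25% above the national figure and drug trafficking sentences 10% above, so the case-weighted result is about +13%. Because each offense type is compared only with itself, a district's unusual caseload does not by itself move the number.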
Factors That Drive Variation
Prosecution Practices
U.S. Attorneys' Offices operate with significant independence. Charging decisions, plea agreement terms, and § 5K1.1 substantial-assistance (cooperation) motions vary by office, directly affecting guideline ranges and final sentences.
Judicial Philosophy
After Booker, district judges gained wide latitude. Some judges routinely vary below guidelines for certain offense types; others treat the guideline range as a near-ceiling. These patterns often persist within districts over time.
Defense Resources
Districts with well-resourced federal public defender offices and active private defense bars tend to show higher rates of below-guideline sentences. Better representation often results in more thorough development of mitigating factors.
Case Mix
The Southern District of Texas and Western District of Texas process enormous volumes of immigration cases, which tend to carry shorter sentences and high within-guidelines rates under fast-track programs. This naturally depresses average sentence length in those districts compared to districts focused on violent crime or financial fraud.
Reforms and Ongoing Debates
The USSC regularly publishes analyses of sentencing patterns and submits guideline amendments to Congress. Key reform areas include:
- Guideline simplification: reducing the complexity of offense level calculations.
- Criminal history reform: the USSC's 2023 amendments limited the "status points" added to a criminal history score when the offense was committed while the defendant was under another criminal justice sentence.
- Drug trafficking: ongoing debates about quantity-based minimums and their relationship to racial disparities in sentencing outcomes.
- Zero-point offenders: the 2023 amendments added an offense-level reduction (§ 4C1.1) for certain defendants with no criminal history points, allowing reduced sentences for many first-time, low-risk offenders.
How to Interpret PlainSentencing Data
When comparing districts, keep these limitations in mind:
- Average sentences reflect the mix of cases in a district, not just judicial discretion.
- Districts with high non-citizen defendant rates often process immigration cases under fast-track programs with discounted sentences.
- Year-to-year changes may reflect changes in prosecution priorities rather than judicial behavior.
- Disparity scores compare each district against national averages for the same offense types, but the comparison is still imperfect because fact-specific variables are not fully captured.
Worked example: PlainSentencing disparity-score interpretation
Each district in our dataset has a disparity score derived from the deviation of its sentence distribution from a national offense-mix-adjusted benchmark. Scores are read as ratios to that benchmark: a score of 1.18 means the district's average sentence is 18% above it, and a score of 0.85 means 15% below.
For example, the Eastern District of Kentucky has a score of 1.12 and the Northern District of California (NDCA) a score of 0.91. After offense-mix adjustment, the Kentucky district still runs about 12% longer than the national benchmark for the same offense composition, while NDCA runs about 9% shorter. Adjusting further for criminal history and acceptance of responsibility shrinks the gap to about 6% in Kentucky and 4% in NDCA. That remaining gap is the unexplained residual, and it is real and persistent.
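The arithmetic in this worked example can be checked with a few lines of Python. The helper names `pct_vs_benchmark` and `share_remaining` are hypothetical, introduced only for this sketch; the inputs are the figures quoted above, which the text itself describes as approximate.

```python
def pct_vs_benchmark(score):
    """Convert a ratio-style disparity score to a percentage deviation."""
    return round((score - 1.0) * 100, 1)


def share_remaining(offense_mix_gap_pct, residual_pct):
    """Fraction of the offense-mix-adjusted gap left after further adjustments."""
    return abs(residual_pct) / abs(offense_mix_gap_pct)


# Scores quoted in the worked example.
for district, score in {"E.D. Ky.": 1.12, "N.D. Cal.": 0.91}.items():
    print(f"{district}: {pct_vs_benchmark(score):+.1f}% vs. benchmark")

# Gaps (in percentage points) before and after the additional adjustments.
print(f"E.D. Ky. gap remaining: {share_remaining(12, 6):.0%}")
print(f"N.D. Cal. gap remaining: {share_remaining(9, 4):.0%}")
```

On these figures, roughly half of each district's offense-mix-adjusted gap is accounted for by criminal history and acceptance adjustments, and the rest is the unexplained residual.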
Disparity component decomposition
| Component | Typical magnitude | Source |
|---|---|---|
| Offense-mix effect | ±25% | Local crime patterns |
| Mandatory-minimum hit rate | ±18% | Charging decisions |
| 5K1.1 cooperation rate | ±15% | Prosecutorial culture |
| §3553(a) variance rate | ±12% | Judicial discretion |
| Criminal history mix | ±10% | Defendant pool |
| Unexplained residual | ±7% | Multiple factors |
Using disparity scores responsibly
PlainSentencing disparity scores are best used for three purposes:
- Surfacing districts where further investigation may be warranted. A district persistently 20% or more above benchmark merits closer review by researchers, defenders, and policymakers.
- Tracking trends over time within a district. A district whose disparity score moved from 1.15 to 1.02 over five years is becoming more aligned with national norms, and that trajectory is informative regardless of the starting point.
- Comparing similar-sized peer districts. A high-volume urban district with a score of 1.10 is meaningfully different from a similar district at 0.95, even if both fall within the normal range.
Avoid using disparity scores to compare individual judges, to predict outcomes in pending cases, or to allege specific instances of unfair sentencing. The scores are district-level statistical aggregates, not case-level findings.
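The first two uses can be sketched as simple screening checks in Python. The function names, the year-keyed dictionary layout, and both score series are assumptions invented for illustration; only the 1.20 threshold and the 1.15 to 1.02 trajectory come from the text above. A flag from a check like this is a prompt for closer review, not a finding.

```python
def persistently_above(scores_by_year, threshold=1.20):
    """True if the score meets or exceeds the threshold in every listed year."""
    return all(score >= threshold for score in scores_by_year.values())


def change_in_distance_from_benchmark(scores_by_year):
    """Change in |score - 1.0| from the earliest to the latest year.

    A negative value means the district moved closer to the benchmark.
    """
    years = sorted(scores_by_year)
    first, last = scores_by_year[years[0]], scores_by_year[years[-1]]
    return abs(last - 1.0) - abs(first - 1.0)


# Invented series for two hypothetical districts.
district_a = {2019: 1.24, 2020: 1.22, 2021: 1.25, 2022: 1.21, 2023: 1.23}
district_b = {2019: 1.15, 2020: 1.12, 2021: 1.08, 2022: 1.05, 2023: 1.02}

print("District A flagged for review:", persistently_above(district_a))
print("District B flagged for review:", persistently_above(district_b))
print(f"District B change: {change_in_distance_from_benchmark(district_b):+.2f}")
```

Requiring the threshold to hold in every year, rather than on average, is a deliberately conservative choice for this sketch: a single outlier year driven by a change in prosecution priorities would not trigger the flag.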