
Session 2 – Project Overview 9/24/2020 1

Robert Kelly, 2020

CSE 416, Section 1

Semester Project Discussion

Session Objectives

Understand issues and terminology used in US congressional redistricting and voting analysis
Understand data requirements to support voting analysis
Become familiar with gerrymandering and techniques to measure the relative amount of gerrymandering in a state district plan
Understand issues associated with currently available data


We will explore the project functionality in more detail in the next 1-3 class sessions


Teams

You should form your team soon (4 members)
You can register a 3-member team, but I will likely add a 4th member


You should use Piazza to look for possible students for your team
You register your team by sending me an email message with the names of the team members

Project Background

Project based on

    Fall 2017 CSE308 – apply quantitative measures of political gerrymandering
    Fall 2018 CSE308 – explore feasibility of algorithms for the generation of districts
    Spring 2019 – explore feasibility of integrating demographic data
    Fall 2019 – analyze demographic voting patterns

Lessons learned from previous projects

    Robust set of data available
    Graph algorithms for the generation of viable congressional districts


Underlying issue is the question of racial fairness in US elections


Why This is an Important and Interesting Topic

Very current
Lots of interesting CS concepts and technologies
    Multiple languages
    Multiple computers
    Database algorithms

Realistic software development project


Background Info

Every state has a number of congressional districts proportional to the state population
Population is recalculated after a US Census, and district boundaries must be redrawn if the number of representatives changes or the population shifts
District boundaries can also change due to court decisions (e.g., Pennsylvania, North Carolina, etc.)


US Census is performed every 10 years


Most states will be redistricted following the 2020 census
Population in districts must be almost equal
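One way to make "almost equal" concrete is to measure how far each district is from the ideal population (state population divided by the number of districts). The sketch below is illustrative only: the function name, the sample populations, and the 1% tolerance are assumptions, not project requirements.

```python
def max_population_deviation(district_populations):
    """Return the largest relative deviation from the ideal district population."""
    ideal = sum(district_populations) / len(district_populations)
    return max(abs(p - ideal) / ideal for p in district_populations)


# Hypothetical populations for a four-district state.
populations = [760_000, 755_000, 765_000, 760_000]
deviation = max_population_deviation(populations)

TOLERANCE = 0.01  # assumed threshold, for illustration only
is_balanced = deviation <= TOLERANCE
print(f"max deviation: {deviation:.4%}, balanced: {is_balanced}")
```

The acceptable tolerance in practice depends on the state and on court precedent, so it is left as a parameter here.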


Precincts (sometimes known as Wards)

Lowest level government division
Contained in one polling place
Data usually available for voting totals


System Background

Current redistricting approach leads to many unfair practices (Gerrymandering)
Some US states have a history of denying equal voting to some minority groups (e.g., African American)
Recent approaches involve “packing” minorities into a small number of districts, thereby minimizing their overall representation

“I propose that we draw the maps to give a partisan advantage to 10 Republicans and three Democrats because I do not believe it’s possible to draw a map with 11 Republicans and two Democrats.” – Chairman of NC House redistricting committee


What is a Gerrymander?

Refers to a voting district that resembles a salamander
Named after Elbridge Gerry, 5th VP of US


Why is Gerrymandering a Hot Topic?

Gerrymandering is a practice intended to establish an advantage for a particular party or group by manipulating district boundaries
Usually features “cracking” (splitting opposing party voters into many districts) and “packing” (concentrating the maximum number of opposing party voters into a handful of districts)
Occurring in the US since 1812
Used aggressively in 2010, resulting in congressional dysfunction


Definition from Wikipedia


Consequences of Current Gerrymandering

Many congressional seats are not competitive
Members of Congress are more concerned with a primary battle than an election battle
Republicans and Democrats represent their party’s position more than the wishes of their constituents
Extremes of each party dominate, instead of the middle


Congressional Gridlock

1965 Voting Rights Act

VRA

Forced selected states to obtain preclearance for any election change that might affect the right to vote in a state
Was very effective in restoring voting rights in the preclearance states

2013 decision of Supreme Court struck down parts (e.g., preclearance) of the VRA
Majority-minority provisions remain, which have been used to enable packing of congressional districts


Minority Packing

Congressional redistricting in many states is controlled by one political party
The party in power tries to preserve its power by drawing voting districts that favor the party
Data is now available that shows the voting preferences and racial/ethnic characteristics of individual geographic locations
Redistricting groups attempt to “pack” opposition voters (70%-80%) into a small number of districts while “cracking” the remaining opposition voters (40%-45%) into the remaining districts
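A small worked example may help show why packing and cracking are effective. The numbers below are made up and chosen only to fall inside the 70%-80% and 40%-45% ranges above: a five-district state with 100 voters per district, where the opposition holds exactly half of the statewide vote.

```python
def seats_won(opposition_share_by_district):
    """Count districts where the opposition holds a strict majority."""
    return sum(1 for share in opposition_share_by_district if share > 0.5)


# Hypothetical state: 5 districts of 100 voters, 250 opposition voters overall.
voters_per_district = 100
num_districts = 5
opposition_voters = 250

# Pack 80 opposition voters into one district, then crack the rest evenly.
packed_voters = 80
cracked_voters = (opposition_voters - packed_voters) / (num_districts - 1)

plan = [packed_voters] + [cracked_voters] * (num_districts - 1)
shares = [voters / voters_per_district for voters in plan]

statewide_share = opposition_voters / (voters_per_district * num_districts)
print(f"statewide share: {statewide_share:.0%}")
print(f"district shares: {shares}")
print(f"opposition seats: {seats_won(shares)} of {num_districts}")
```

In this toy plan the opposition has half of the votes but a majority in only one of the five districts.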


Congressional District Observations

Many districts appear oddly shaped, possibly due to attempts to pack minority groups
Recent measurements of gerrymandering exist, but they do not include racial/ethnic origins
New probabilistic approach to measuring gerrymandering developed at Duke University
MGGG applied a probabilistic approach to Virginia state legislature districting to measure racial gerrymandering
Probabilistic measures are limited by computational requirements


Fall 2020 Project

Apply a modified version of the Duke and MGGG approaches to a large number of states
Measure the extent to which current districting plans appear to be biased in terms of racial/ethnic distribution of population in congressional districts
Consider legal effects of packing in analysis (e.g., urban vs. rural)
Consider impact of VRA on packing effects


Computation of this analysis requires an interesting combination of programming (multi-processors), computer science (e.g., graph algorithms), and applied math

High Level View of the Project

Build a robust system to

    Display a current districting plan for a state
    Generate a “random” set of districting plans for the state
    Calculate the racial/ethnic population in each random district
    Generate the racial/ethnic population distribution
    Determine if the current districting plan does not appear fair
    Generate a large number of districting plans on a high performance computer
    Display graphic results to the user
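The analysis steps in this list can be sketched on a toy ensemble. The code below assumes each plan is a list of minority population shares, one per district. It sorts the districts within each plan, summarizes each rank across the ensemble (the data behind a box and whisker plot), and flags ranks where the current plan falls outside the ensemble range. The sample numbers, function names, and the "outside min/max" rule are all illustrative assumptions, not the Duke or MGGG method itself.

```python
from statistics import quantiles


def rank_summaries(ensemble):
    """Five-number summary of the minority share at each district rank."""
    ordered = [sorted(plan) for plan in ensemble]
    summaries = []
    for rank_values in zip(*ordered):
        q1, median, q3 = quantiles(rank_values, n=4, method="inclusive")
        summaries.append({
            "min": min(rank_values), "q1": q1, "median": median,
            "q3": q3, "max": max(rank_values),
        })
    return summaries


def ranks_outside_range(current_plan, summaries):
    """Ranks where the current plan lies outside the ensemble's min/max."""
    return [
        rank
        for rank, (value, s) in enumerate(zip(sorted(current_plan), summaries))
        if value < s["min"] or value > s["max"]
    ]


# Hypothetical ensemble: 5 random plans, 3 districts each.
ensemble = [
    [0.22, 0.35, 0.28],
    [0.30, 0.25, 0.31],
    [0.27, 0.33, 0.24],
    [0.26, 0.29, 0.32],
    [0.21, 0.34, 0.30],
]
current_plan = [0.46, 0.15, 0.29]  # hypothetical enacted plan

summaries = rank_summaries(ensemble)
flagged = ranks_outside_range(current_plan, summaries)
print(flagged)
```

A real run would use thousands of plans rather than five, which is why the plan generation is moved to a high performance computer.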


Generating a “Random” District

Goal - Allow congressional district boundaries to be automatically determined in a “random” way
Approach
    Treat precincts as nodes in a graph
    Edges in the graph are defined by contiguity (200+ foot shared border)
    Partition the statewide precinct graph into the set of connected subgraphs
    Rebalance the subgraphs to satisfy state districting constraints (e.g., equal population, compactness)
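The first three steps of this approach can be sketched with a plain adjacency dictionary. The code below uses a simple seed-and-grow partition, which is only one of several possible techniques and is not necessarily the algorithm used in the project. The grid graph stands in for a real precinct graph, and the rebalancing step is omitted. All function names are hypothetical.

```python
import random


def grid_graph(rows, cols):
    """Toy precinct graph: cells of a grid, adjacent if they share a side."""
    adjacency = {(r, c): [] for r in range(rows) for c in range(cols)}
    for (r, c) in adjacency:
        for dr, dc in ((0, 1), (1, 0)):
            neighbor = (r + dr, c + dc)
            if neighbor in adjacency:
                adjacency[(r, c)].append(neighbor)
                adjacency[neighbor].append((r, c))
    return adjacency


def random_partition(adjacency, num_districts, seed=None):
    """Grow districts outward from random seed precincts until all are assigned."""
    rng = random.Random(seed)
    seeds = rng.sample(sorted(adjacency), num_districts)
    assignment = {node: district for district, node in enumerate(seeds)}
    while len(assignment) < len(adjacency):
        # Every (district, unassigned neighbor) pair on the current frontier.
        candidates = [
            (assignment[node], neighbor)
            for node in assignment
            for neighbor in adjacency[node]
            if neighbor not in assignment
        ]
        if not candidates:
            raise ValueError("precinct graph is not connected")
        district, precinct = rng.choice(candidates)
        assignment[precinct] = district
    return assignment


def is_connected(adjacency, nodes):
    """Check that the given nodes form one connected subgraph."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        current = stack.pop()
        for neighbor in adjacency[current]:
            if neighbor in nodes and neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen == nodes


graph = grid_graph(4, 4)
plan = random_partition(graph, num_districts=3, seed=416)
```

Because each precinct is only ever added next to a precinct already in the same district, every district stays connected. District sizes are not balanced, so a separate rebalancing pass would still be needed to meet the population constraint.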


System Display

Districts and precinct boundaries are displayed
The characteristics of a large number of random districting plans are displayed
Setup of a run defines the constraints in districting (e.g., compactness)
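The slides do not name a specific compactness measure. As an assumption for illustration, the sketch below uses the Polsby-Popper score, a commonly used measure equal to 4π times the area divided by the square of the perimeter. The run setup dictionary and its keys are hypothetical.

```python
import math


def polsby_popper(area, perimeter):
    """Polsby-Popper score in (0, 1]; a circle scores 1.0."""
    return 4 * math.pi * area / perimeter ** 2


# Hypothetical run setup; the keys and thresholds are illustrative assumptions.
run_constraints = {"min_compactness": 0.20, "max_population_deviation": 0.01}


def satisfies_compactness(area, perimeter, constraints=run_constraints):
    """Check a district shape against the run's compactness threshold."""
    return polsby_popper(area, perimeter) >= constraints["min_compactness"]


square_ok = satisfies_compactness(area=1.0, perimeter=4.0)    # 1 x 1 square
sliver_ok = satisfies_compactness(area=1.0, perimeter=20.2)   # 10 x 0.1 strip
print(square_ok, sliver_ok)
```

Both shapes have the same area, but the long strip scores far lower, which is the kind of oddly shaped district a compactness constraint is meant to rule out.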


Data is available from multiple data sources


Project Requirements Analysis

You will generate detailed requirements (use cases)
Requirements will evolve over the first 6 weeks of the project
    Top-level functional requirements provided in first 2 weeks of class sessions
    You will develop detailed requirements based on top-level requirements
    Team requirements aggregated into a master use-case list
    Near-final set of use cases by early October


Ethnic Minority Data

Ethnic minority data should be included in your precinct data
1965 Voting Rights Act mandates the establishment of districts in which an ethnic minority constitutes a majority within the district


Each precinct includes demographic data originating from the US Census Bureau


VRA Preclearance States

Alabama
Alaska
Arizona
Georgia
Louisiana
Mississippi
South Carolina
Texas
Virginia


Your team will choose 3 states to analyze, and at least one of them will be a pre-clearance state

What Data is Needed?

Geospatial boundary data

    Precincts
    Existing Congressional districts
    Cities/counties
    Census tracts (including demographic data)

Election results data


Multiple data sources can be used to measure the party preference of a precinct (congressional vote, presidential vote, registration, etc.)
A data repository will be made available to you that will contain some of the data you need


State Selection

Your system will run for 3 states of your choosing
All states will have a large minority population (>500,000)
Maximum of 4 teams per state
You can change any state during the semester as long as it is available
Once the table is posted, send me an e-mail with your team’s choices. Be sure to include alternate selections in case your first choice is already filled


Sources of Data

Project Web site suggests many sources of data
The most accurate data source is the originator
    US Census Bureau
    State Election Office
    US Government repository of region borders
Many formal and informal data aggregation sources are available
Sometimes difficult to locate the best source of data


Some data will be provided to you


Top-Level System Architecture


[Architecture diagram. Components: GUI (JavaScript), Server Logic (Java), Resource DB, Data Population (Python), Data sources, SeaWulf (Python), Project DB, Files]

What Skills Do You Need?

Programming (Java, JavaScript, Python)
Client/server interaction (e.g., Spring, JAX-RS)
Graph algorithms (e.g., spanning tree)
Thread programming
Data serialization (migration of data from server to SeaWulf)
Performance Analysis (parallel speedup measurement)
Map system integration
Client data display (e.g., box and whisker plot)
Client framework (e.g., React)
DB
And more
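As an illustration of the parallel speedup measurement mentioned in this list, the sketch below computes speedup (serial time divided by parallel time) and efficiency (speedup divided by the number of workers). The timings are made-up placeholders, not measurements from SeaWulf.

```python
def speedup(serial_seconds, parallel_seconds):
    """Speedup: how many times faster the parallel run is than the serial run."""
    return serial_seconds / parallel_seconds


def efficiency(serial_seconds, parallel_seconds, workers):
    """Efficiency: speedup per worker, where 1.0 would be ideal scaling."""
    return speedup(serial_seconds, parallel_seconds) / workers


# Hypothetical wall-clock timings in seconds, keyed by number of workers.
timings = {1: 120.0, 2: 64.0, 4: 36.0, 8: 24.0}

for workers, seconds in timings.items():
    s = speedup(timings[1], seconds)
    e = efficiency(timings[1], seconds, workers)
    print(f"{workers} workers: speedup {s:.2f}, efficiency {e:.2f}")
```

Efficiency usually drops as workers are added, so reporting both numbers gives a clearer picture than speedup alone.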


Free SW libraries are available for everything you need
TAs were selected based on experience with these technologies


Build Your Team

Build your project team so that you have many of the needed skills (note, building a team with all of the skills will be almost impossible)
Ask students in the class you have worked with before (use Zoom window to identify students)
Use Piazza to “interview” potential teammates


We have an excellent set of TAs this semester who can help with technologies

Have You Satisfied the Objectives?

Understand issues and terminology used in US congressional redistricting and voting analysis
Understand data requirements to support voting analysis
Understand issues associated with currently available data
