CSE 416, Section 1: Semester Project Approach


  1. Session 3 – Project Approach

     CSE 416, Section 1
     Semester Project Approach

     Session Objectives
     • Understand the analysis approach taken by MGGG in their analysis of the effects of racial gerrymandering in the Virginia House of Delegates
     • Understand your high-level approach to the project
     • Begin to think about design choices
     • Begin to understand data requirements to support analysis

     © Robert Kelly, 2020 (11/9/2020)

  2. Session 3 – Project Approach

     Reading
     • Comparison of Districting Plans for the Virginia House of Delegates, Metric Geometry and Gerrymandering Group (MGGG), https://mggg.org/VA-report.pdf
     • Wikipedia, https://en.wikipedia.org/wiki/Graph_partition (basic background)

     Note: There are many approaches to graph partitioning, but we are not concerned with minimizing edges and generating equality of nodes, so we use a multi-level method.

     Project Background
     • Project based on MGGG analysis for the court
     • Virginia House of Delegates
     • Eleven state districts were ruled unconstitutional
     • Analysis examined the original (unconstitutional) plan and possible replacement plans (e.g., the Republican-suggested plan)
     • Analysis method “highlights and quantifies the dilutive effects of packing Black Voting Age Population (BVAP)” (>55% in 11 districts)
     • Analysis looked at the 11 districts and their immediate neighbors (total of 33)

     Note: The study remarks that 37% BVAP is the empirical line for African-American representation; >55% is considered excessive.

  3. Session 3 – Project Approach

     Analysis Results
     • Typical box and whisker plot
     • Districts sorted by BVAP (lowest to highest)
     • Each district shows the range of BVAP values in the ensemble

     Box and Whisker Plot
     [Figure: box and whisker plot; callout labels: 99th percentile, target BVAP range, evaluated plan result, median]
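The per-district ranges shown on such a plot can be computed roughly as follows. This is a minimal sketch, assuming each plan in the ensemble is represented simply as a list of district BVAP shares (a hypothetical representation, not one given in the slides). Within each plan the districts are sorted from lowest to highest BVAP, and values at the same rank are pooled across plans. The 1st/99th percentile bounds and the simple index-based percentile are also assumptions, and the numbers are made up.

```python
import statistics


def percentile(sorted_values, pct):
    """Simple index-based percentile of an already sorted, non-empty list."""
    index = round(pct / 100 * (len(sorted_values) - 1))
    return sorted_values[index]


def rank_ordered_summary(ensemble):
    """Summarize BVAP share by district rank across an ensemble.

    `ensemble` is a list of plans; each plan is a list of BVAP shares,
    one per district (hypothetical representation).
    """
    # Sort each plan so position i holds its i-th lowest BVAP district.
    sorted_plans = [sorted(plan) for plan in ensemble]
    summary = []
    for rank_values in zip(*sorted_plans):
        values = sorted(rank_values)
        summary.append({
            "low": percentile(values, 1),
            "median": statistics.median(values),
            "high": percentile(values, 99),
        })
    return summary


# Three toy plans with three districts each (made-up numbers).
toy_ensemble = [
    [0.40, 0.10, 0.25],
    [0.20, 0.35, 0.15],
    [0.30, 0.12, 0.45],
]
summary = rank_ordered_summary(toy_ensemble)
```

Each entry of `summary` corresponds to one box on the plot; the evaluated plan would be sorted the same way and overlaid rank by rank.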

  4. Session 3 – Project Approach

     How Do You Generate a “Random” Districting Plan?
     • Recall that a districting plan is a partition of the k-node precinct graph into n subgraphs, each of which 1) is connected and 2) adheres to state districting requirements (e.g., equal population, compact, fewer counties)
     • 2-stage process in which initially each node is considered a cluster
       1. Recursively combine neighboring clusters until the overall graph reaches n clusters
       2. Recursively balance pairs of clusters until eventually each cluster achieves the compactness goal and the population distribution goal

     How Do We Form n Clusters?
     • In each iteration
     • For each cluster, select a random neighboring cluster, and combine the two clusters into a new cluster
     • Update neighbors for each newly formed cluster
     • Terminate when the number of clusters equals n

     Note: There will likely be optional use cases in which you can try out variations on this algorithm.
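The first stage can be sketched as below. This is an illustrative variation, not the course's reference algorithm: it merges one randomly chosen pair of neighboring clusters per step (rather than sweeping over every cluster in each iteration) so that the count lands exactly on n. The function name and the adjacency-dictionary input format are assumptions.

```python
import random


def merge_to_n_clusters(adjacency, n, seed=None):
    """Merge neighboring clusters at random until n clusters remain.

    `adjacency` maps each precinct id to the set of its neighboring
    precinct ids (hypothetical input format). Returns a dict mapping a
    cluster id to the set of precincts it contains.
    """
    rng = random.Random(seed)
    # Initially every precinct is its own cluster.
    members = {node: {node} for node in adjacency}
    neighbors = {node: set(adj) for node, adj in adjacency.items()}

    while len(members) > n:
        # Only clusters that still have a neighbor can be merged.
        candidates = sorted(c for c in members if neighbors[c])
        if not candidates:
            raise ValueError("graph has more than n connected components")
        keep = rng.choice(candidates)
        absorb = rng.choice(sorted(neighbors[keep]))

        # Combine the two clusters into one.
        members[keep] |= members.pop(absorb)

        # Update neighbor sets: everything adjacent to `absorb` is now
        # adjacent to `keep`.
        for other in neighbors.pop(absorb):
            neighbors[other].discard(absorb)
            if other != keep:
                neighbors[other].add(keep)
                neighbors[keep].add(other)
        neighbors[keep].discard(keep)
    return members


# 2x3 grid of precincts, numbered row by row (toy example).
grid = {
    0: {1, 3}, 1: {0, 2, 4}, 2: {1, 5},
    3: {0, 4}, 4: {1, 3, 5}, 5: {2, 4},
}
clusters = merge_to_n_clusters(grid, n=2, seed=1)
```

Because only adjacent clusters are merged, every resulting cluster is connected. Nothing here controls population, so the clusters are generally unbalanced at this point.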

  5. Session 3 – Project Approach

     How Do We Balance the Clusters?
     • For each cluster
     • Select a neighboring cluster at random to rebalance
     • Generate a spanning tree of the combined cluster (you can try different approaches)
     • Form the set of edges to cut that will “improve” some combination of 1) compactness and 2) population equality

     “A tree is a connected undirected graph with no cycles. It is a spanning tree of a graph G if it spans G (that is, it includes every vertex of G) and is a subgraph of G (every edge in the tree belongs to G). A spanning tree of a connected graph G can also be defined as a maximal set of edges of G that contains no cycle, or as a minimal set of edges that connect all vertices.” (Wikipedia)

     When Do We Terminate the Algorithm?
     • Terminate when the redistricting plan has
       1. A population difference between the most populous cluster (district) and the least populous cluster (district) in the state that is less than a user-provided threshold
       2. A compactness measure for each district that is less than a user-provided threshold

     Questions: Will this approach provide “random” districting plans? How do you determine if the plans appear random?
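One rebalancing step might look like the following. This is a minimal sketch under stated assumptions: the combined cluster is given as a connected adjacency dictionary with at least two nodes, the spanning tree comes from a randomized depth-first search (which does not sample spanning trees uniformly), and the cut is chosen on population alone, ignoring compactness. All function names are hypothetical.

```python
import random


def random_spanning_tree(adjacency, rng):
    """Return (parent map, discovery order) for a random spanning tree.

    Built by a randomized depth-first search of a connected graph.
    """
    root = rng.choice(sorted(adjacency))
    parent = {root: None}
    order = [root]  # nodes in the order they were discovered
    stack = [root]
    while stack:
        node = stack[-1]
        unvisited = sorted(v for v in adjacency[node] if v not in parent)
        if not unvisited:
            stack.pop()
            continue
        nxt = rng.choice(unvisited)
        parent[nxt] = node
        order.append(nxt)
        stack.append(nxt)
    return parent, order


def best_balancing_cut(adjacency, population, seed=None):
    """Split a combined cluster in two by cutting one spanning-tree edge.

    Picks the edge whose removal gives the smallest population
    difference between the two sides. Returns (side_a, side_b).
    """
    rng = random.Random(seed)
    parent, order = random_spanning_tree(adjacency, rng)
    total = sum(population[v] for v in adjacency)

    # Accumulate subtree populations from the leaves up.
    subtree_pop = {v: population[v] for v in adjacency}
    for node in reversed(order):
        if parent[node] is not None:
            subtree_pop[parent[node]] += subtree_pop[node]

    # Cutting edge (node, parent[node]) separates node's subtree.
    candidates = [v for v in order if parent[v] is not None]
    cut = min(candidates, key=lambda v: abs(total - 2 * subtree_pop[v]))

    # Collect the subtree under the cut node (parents precede children).
    side_a = {cut}
    for node in order:
        if parent[node] in side_a:
            side_a.add(node)
    return side_a, set(adjacency) - side_a


def population_gap(clusters, population):
    """Difference between the most and least populous clusters."""
    totals = [sum(population[v] for v in c) for c in clusters]
    return max(totals) - min(totals)


# Toy combined cluster: a path of four precincts (made-up populations).
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
pop = {0: 10, 1: 10, 2: 10, 3: 30}
side_a, side_b = best_balancing_cut(path, pop, seed=0)
gap = population_gap([side_a, side_b], pop)
```

Both sides of a tree-edge cut are connected by construction, so contiguity is preserved. A termination test along the lines of the slide would compare `population_gap` over all districts against the user-provided threshold.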

  6. Session 3 – Project Approach

     Begin to Think about Implications
     • How do we represent a node (i.e., precinct)?
     • How do we represent a cluster (i.e., district)?
     • How do we calculate neighbors?
     • How do we measure compactness?
     • How do we display a “random” district plan to the user?
     • How do we verify that the results are random?

     Multiple District Plans
     • You will generate multiple district plans (usually referred to as districtings)
     • Run multiple Python processes, each of which will generate a plan
     • Run a small number of processes on your laptop/desktop, but a larger number on the SeaWulf
     • Each process might run multiple algorithms in sequence until the desired number of plans is generated
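As one possible answer to the compactness question, the sketch below computes the Polsby-Popper score, 4πA/P², for a district boundary. The slides do not name a specific measure, so the choice of Polsby-Popper is an assumption, as are the function names and the input format (an ordered list of boundary vertices of a simple polygon in planar coordinates).

```python
import math


def polygon_area_and_perimeter(vertices):
    """Area (shoelace formula) and perimeter of a simple polygon.

    `vertices` is a list of (x, y) pairs in order around the boundary,
    in a planar (projected) coordinate system.
    """
    area2 = 0.0
    perimeter = 0.0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        area2 += x1 * y2 - x2 * y1
        perimeter += math.hypot(x2 - x1, y2 - y1)
    return abs(area2) / 2, perimeter


def polsby_popper(vertices):
    """Polsby-Popper score: 4*pi*area / perimeter**2 (1.0 for a circle)."""
    area, perimeter = polygon_area_and_perimeter(vertices)
    return 4 * math.pi * area / perimeter ** 2


# A unit square versus a long thin strip of the same area.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
strip = [(0, 0), (10, 0), (10, 0.1), (0, 0.1)]
square_score = polsby_popper(square)
strip_score = polsby_popper(strip)
```

With this measure a higher score means a more compact shape, so the square scores well above the strip. Real precinct boundaries are in latitude/longitude and would need to be projected before applying planar formulas.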

  7. Session 3 – Project Approach

     Graph Node Geography
     • Project building block unit is the geography of a precinct
     • Boundary data is generally available, but may not be totally accurate
     • Combine/split of a cluster requires calculating the new cluster boundaries
     • BVAP (and other minority) population data is generally not available for a precinct (you will need to map census data to it)

     Note: The MGGG paper uses census blocks as the building block.

     Reports
     • Your project will generate summary data for a variety of runs
     • For example
       • Independence of results from seed
       • Change in population balance vs. iteration
       • Ideal Markov chain length
       • Single seed districting plan vs. one for each random district
       • Comparison with gerrymandering measure results
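Mapping census data onto precincts could be sketched as a simple aggregation. This assumes a block-to-precinct assignment is already available (for example from a spatial join done elsewhere) and that no block is split across precincts; the dictionary formats, field names, and numbers are all hypothetical.

```python
from collections import defaultdict


def aggregate_blocks_to_precincts(blocks, block_to_precinct):
    """Sum census-block counts into precinct totals.

    `blocks` maps block id -> {"vap": ..., "bvap": ...};
    `block_to_precinct` maps block id -> precinct id.
    Both formats are hypothetical.
    """
    totals = defaultdict(lambda: {"vap": 0, "bvap": 0})
    for block_id, counts in blocks.items():
        precinct = block_to_precinct[block_id]
        totals[precinct]["vap"] += counts["vap"]
        totals[precinct]["bvap"] += counts["bvap"]
    return dict(totals)


def bvap_share(counts):
    """BVAP as a fraction of voting age population (0 if no VAP)."""
    return counts["bvap"] / counts["vap"] if counts["vap"] else 0.0


# Made-up block counts for two precincts.
blocks = {
    "b1": {"vap": 100, "bvap": 40},
    "b2": {"vap": 50, "bvap": 35},
    "b3": {"vap": 200, "bvap": 20},
}
block_to_precinct = {"b1": "P1", "b2": "P1", "b3": "P2"}
precincts = aggregate_blocks_to_precincts(blocks, block_to_precinct)
```

The same aggregation applies one level up: district totals are sums over the precincts in a cluster, so BVAP share per district can be recomputed after each combine or split.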

  8. Session 3 – Project Approach

     Differences Between Project and MGGG Report
     This comparison is meant to help you when reading the MGGG paper.

     MGGG                                 CSE416
     -----------------------------------  ----------------------------------------
     State districting                    Congressional districting
     Virginia                             Multiple states
     Analysis of 33 VA house districts    Complete state analysis
     Markov chain                         No concern for Markov chain validity
     100 seed plans                       Phase 1 graph partition approach
     Census block building blocks         Precinct building blocks
     Seed plan population balanced        Seed plans unbalanced by population
     Spanning tree cuts balance           Spanning tree cuts reduce pop. disparity
     Phase 2: Flip, ReCom, and Mix        Phase 2: Modified ReCom only
     Specific compactness measure         Multiple compactness measures

     Things to Think About
     • What data is needed to run the algorithm?
     • What data is needed to display the results in the Web client?
     • How and when do you transmit data from the client to the server?
     • Can you store partial results when building a districting ensemble?
     • How are results passed from the SeaWulf to your server?
     • What does the GUI look like?
     • How does the user request a run of multiple districting plans?
     • How do you display summary data from a run?
     • What debugging features should be built into the GUI?
     • How do you keep track of design options/decisions/questions?

  9. Session 3 – Project Approach

     Top-Level System Architecture
     [Figure: top-level system architecture diagram; labels: Server Logic, SeaWulf, GUI, (Java), (Java), (JavaScript), Data Population, Resource DB, (Python), Project DB, Data sources]
     © Robert Kelly, 2017-2020

     Have You Satisfied the Objectives?
     • Understand the analysis approach taken by MGGG in their analysis of the effects of racial gerrymandering in the Virginia House of Delegates
     • Understand your high-level approach to the project
     • Begin to think about design choices
     • Begin to understand data requirements to support analysis
