Do the genders commit different violations?



  1. Do the genders commit different violations?
     ANALYZING POLICE ACTIVITY WITH PANDAS
     Kevin Markham, Founder, Data School

  2. Counting unique values (1)
     .value_counts(): counts the unique values in a Series
     Best suited for categorical data
     ri.stop_outcome.value_counts()
     Citation            77091
     Warning              5136
     Arrest Driver        2735
     No Action             624
     N/D                   607
     Arrest Passenger      343
     Name: stop_outcome, dtype: int64

  3. Counting unique values (2)
     ri.stop_outcome.value_counts().sum()
     86536
     ri.shape
     (86536, 13)

  4. Expressing counts as proportions
     ri.stop_outcome.value_counts()
     Citation            77091
     Warning              5136
     Arrest Driver        2735
     No Action             624
     N/D                   607
     Arrest Passenger      343
     77091/86536
     0.8908546731995932
     ri.stop_outcome.value_counts(normalize=True)
     Citation            0.890855
     Warning             0.059351
     Arrest Driver       0.031605
     No Action           0.007211
     N/D                 0.007014
     Arrest Passenger    0.003964
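The relationship between raw counts and proportions can be checked on a small example. This is a minimal sketch on made-up data (the `outcomes` Series below is hypothetical, not the `ri` dataset); it only assumes that `value_counts(normalize=True)` divides each count by the total of the counts.

```python
import pandas as pd

# Hypothetical toy data standing in for the stop_outcome column.
outcomes = pd.Series(
    ["Citation", "Citation", "Citation", "Warning", "Citation",
     "Arrest Driver", "Citation", "Warning", "Citation", "Citation"],
    name="stop_outcome",
)

# Raw counts per unique value.
counts = outcomes.value_counts()

# Proportions per unique value.
proportions = outcomes.value_counts(normalize=True)

# Dividing each count by the total should reproduce the proportions.
manual = counts / counts.sum()

print(counts)
print(proportions)
```

Because there are no missing values in the toy Series, `counts.sum()` equals the number of rows, which mirrors the comparison of `value_counts().sum()` with `ri.shape` on the earlier slide.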

  5. Filtering DataFrame rows
     ri.driver_race.value_counts()
     White       61870
     Black       12285
     Hispanic     9727
     Asian        2389
     Other         265
     white = ri[ri.driver_race == 'White']
     white.shape
     (61870, 13)

  6. Comparing stop outcomes for two groups
     white.stop_outcome.value_counts(normalize=True)
     Citation            0.902263
     Warning             0.057508
     Arrest Driver       0.024018
     No Action           0.007031
     N/D                 0.006433
     Arrest Passenger    0.002748
     asian = ri[ri.driver_race == 'Asian']
     asian.stop_outcome.value_counts(normalize=True)
     Citation            0.922980
     Warning             0.045207
     Arrest Driver       0.017581
     No Action           0.008372
     N/D                 0.004186
     Arrest Passenger    0.001674
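One way to make the two-group comparison easier to read is to place the proportions side by side. The sketch below uses hypothetical toy data with a generic `group` column and a helper named `outcome_shares` (both are illustrative names, not part of the slides); it follows the same filter-then-`value_counts(normalize=True)` pattern shown above.

```python
import pandas as pd

# Hypothetical toy data; group labels and outcomes are made up for illustration.
df = pd.DataFrame({
    "group": ["A", "A", "A", "A", "B", "B"],
    "stop_outcome": ["Citation", "Citation", "Citation", "Warning",
                     "Citation", "Warning"],
})


def outcome_shares(frame, group):
    """Return outcome proportions for the rows belonging to one group."""
    subset = frame[frame["group"] == group]
    return subset["stop_outcome"].value_counts(normalize=True)


# Building a DataFrame from two Series aligns them on the outcome labels.
comparison = pd.DataFrame({
    "A": outcome_shares(df, "A"),
    "B": outcome_shares(df, "B"),
})

print(comparison)
```

Comparing proportions rather than raw counts matters here because the groups differ in size, just as the `White` and `Asian` subsets do in the slides.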

  7. Let's practice!

  8. Does gender affect who gets a ticket for speeding?

  9. Filtering by multiple conditions (1)
     female = ri[ri.driver_gender == 'F']
     female.shape
     (23774, 13)

  10. Filtering by multiple conditions (2)
      female_and_arrested = ri[(ri.driver_gender == 'F') & (ri.is_arrested == True)]
      Each condition is surrounded by parentheses
      Ampersand (&) represents the and operator
      female_and_arrested.shape
      (669, 13)
      Only includes female drivers who were arrested

  11. Filtering by multiple conditions (3)
      female_or_arrested = ri[(ri.driver_gender == 'F') | (ri.is_arrested == True)]
      Pipe (|) represents the or operator
      female_or_arrested.shape
      (26183, 13)
      Includes all females
      Includes all drivers who were arrested

  12. Rules for filtering by multiple conditions
      Ampersand (&): only include rows that satisfy both conditions
      Pipe (|): include rows that satisfy either condition
      Each condition must be surrounded by parentheses
      Conditions can check for equality (==), inequality (!=), etc.
      Can use more than two conditions
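The filtering rules can be exercised on a tiny example. The following is a minimal sketch on hypothetical toy data (the values and the `violation` column contents are made up for illustration). The parentheses are required because Python's `&` and `|` operators bind more tightly than comparisons such as `==`.

```python
import pandas as pd

# Hypothetical toy data; values are made up for illustration.
df = pd.DataFrame({
    "driver_gender": ["F", "M", "F", "M", "F"],
    "is_arrested": [True, True, False, False, False],
    "violation": ["Speeding", "Speeding", "Equipment", "Speeding", "Speeding"],
})

# Ampersand: rows must satisfy both conditions.
both = df[(df.driver_gender == "F") & (df.is_arrested == True)]

# Pipe: rows may satisfy either condition.
either = df[(df.driver_gender == "F") | (df.is_arrested == True)]

# More than two conditions, mixing equality and inequality checks.
three = df[
    (df.driver_gender == "F")
    & (df.violation == "Speeding")
    & (df.is_arrested != True)
]

print(len(both), len(either), len(three))
```

A quick sanity check on the "or" filter is that its row count equals the number of females plus the number of arrests minus the rows that satisfy both conditions.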

  13. Correlation, not causation
      Analyze the relationship between gender and stop outcome
      Assess whether there is a correlation
      Not going to draw any conclusions about causation
      Would need additional data and expertise
      Exploring relationships only

  14. Let's practice!

  15. Does gender affect whose vehicle is searched?

  16. Math with Boolean values
      ri.isnull().sum()
      stop_date        0
      stop_time        0
      driver_gender    0
      driver_race      0
      violation_raw    0
      ...
      import numpy as np
      np.mean([0, 1, 0, 0])
      0.25
      np.mean([False, True, False, False])
      0.25
      True = 1, False = 0
      Mean of a Boolean Series represents the proportion of True values

  17. Taking the mean of a Boolean Series
      ri.is_arrested.value_counts(normalize=True)
      False    0.964431
      True     0.035569
      ri.is_arrested.mean()
      0.0355690117407784
      ri.is_arrested.dtype
      dtype('bool')
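The reason the mean of a Boolean Series gives a rate is that `True` counts as 1 and `False` as 0, so the mean is the number of `True` values divided by the number of rows. A minimal sketch on a hypothetical four-row Series (not the `ri` data):

```python
import pandas as pd

# Hypothetical toy data: one arrest out of four stops.
is_arrested = pd.Series([False, True, False, False], name="is_arrested")

# Mean of the Boolean Series.
rate = is_arrested.mean()

# The same quantity computed explicitly: count of True divided by row count.
manual_rate = is_arrested.sum() / len(is_arrested)

print(rate, manual_rate)
```

This shortcut relies on the column having no missing values, which is what the `ri.isnull().sum()` check on the earlier slide confirms for the real dataset.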

  18. Comparing groups using groupby (1)
      Study the arrest rate by police district
      ri.district.unique()
      array(['Zone X4', 'Zone K3', 'Zone X1', 'Zone X3', 'Zone K1', 'Zone K2'], dtype=object)
      ri[ri.district == 'Zone K1'].is_arrested.mean()
      0.024349083895853423

  19. Comparing groups using groupby (2)
      ri[ri.district == 'Zone K2'].is_arrested.mean()
      0.030800588834786546
      ri.groupby('district').is_arrested.mean()
      district
      Zone K1    0.024349
      Zone K2    0.030801
      Zone K3    0.032311
      Zone X1    0.023494
      Zone X3    0.034871
      Zone X4    0.048038

  20. Grouping by multiple categories
      ri.groupby(['district', 'driver_gender']).is_arrested.mean()
      district  driver_gender
      Zone K1   F                0.019169
                M                0.026588
      Zone K2   F                0.022196
      ...       ...              ...
      ri.groupby(['driver_gender', 'district']).is_arrested.mean()
      driver_gender  district
      F              Zone K1     0.019169
                     Zone K2     0.022196
      ...            ...         ...
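The groupby slides can be reproduced on a small example to confirm that `groupby` gives the same numbers as filtering one group at a time, and that swapping the grouping keys only changes the layout. This is a sketch on hypothetical toy data (district names shortened to `K1` and `K2`, values made up); the final `unstack()` call is an optional extra, not something shown in the slides.

```python
import pandas as pd

# Hypothetical toy data; values are made up for illustration.
df = pd.DataFrame({
    "district": ["K1", "K1", "K1", "K1", "K2", "K2", "K2", "K2"],
    "driver_gender": ["F", "F", "M", "M", "F", "F", "M", "M"],
    "is_arrested": [False, False, True, False, False, True, True, True],
})

# Arrest rate for one district, computed by filtering.
k1_rate = df[df.district == "K1"].is_arrested.mean()

# Arrest rate for every district at once.
by_district = df.groupby("district").is_arrested.mean()

# Arrest rate for every district and gender combination.
by_both = df.groupby(["district", "driver_gender"]).is_arrested.mean()

# Reversing the key order rearranges the index but not the values.
swapped = df.groupby(["driver_gender", "district"]).is_arrested.mean()

# Optional: pivot the inner index level into columns for easier reading.
table = by_both.unstack()

print(by_district)
print(table)
```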

  21. Let's practice!

  22. Does gender affect who is frisked during a search?

  23. ri.search_conducted.value_counts()
      False    83229
      True      3307
      ri.search_type.value_counts(dropna=False)
      NaN                            83229
      Incident to Arrest              1290
      Probable Cause                   924
      Inventory                        219
      Reasonable Suspicion             214
      Protective Frisk                 164
      Incident to Arrest,Inventory     123
      ...
      .value_counts() excludes missing values by default
      dropna=False displays missing values
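The effect of `dropna=False` can be seen on a small Series with missing entries. A minimal sketch, assuming a hypothetical five-row `search_type` Series in which `None` marks stops where no search took place:

```python
import pandas as pd

# Hypothetical toy data: None marks stops with no search.
search_type = pd.Series(
    ["Inventory", None, None, "Probable Cause", None],
    name="search_type",
)

# Default behaviour: missing values are left out of the counts.
default_counts = search_type.value_counts()

# dropna=False adds a row counting the missing values.
all_counts = search_type.value_counts(dropna=False)

# Number of missing values, counted directly.
n_missing = search_type.isnull().sum()

print(default_counts)
print(all_counts)
```

The difference between the two totals is the number of missing values, which corresponds to the slide's observation that the `NaN` count matches the number of stops where `search_conducted` is `False`.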

  24. Examining the search types
      ri.search_type.value_counts()
      Incident to Arrest                   1290
      Probable Cause                        924
      Inventory                             219
      Reasonable Suspicion                  214
      Protective Frisk                      164
      Incident to Arrest,Inventory          123
      Incident to Arrest,Probable Cause     100
      ...
      Multiple values are separated by commas
      219 searches in which "Inventory" was the only search type
      Locate "Inventory" among multiple search types

  25. Searching for a string (1)
      ri['inventory'] = ri.search_type.str.contains('Inventory', na=False)
      str.contains() returns True if the string is found, False if not found
      na=False returns False when it finds a missing value

  26. Searching for a string (2)
      ri.inventory.dtype
      dtype('bool')
      True means inventory was done, False means it was not
      ri.inventory.sum()
      441

  27. Calculating the inventory rate
      ri.inventory.mean()
      0.0050961449570121106
      0.5% of all traffic stops resulted in an inventory
      searched = ri[ri.search_conducted == True]
      searched.inventory.mean()
      0.13335349259147264
      13.3% of searches included an inventory
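The string search and the two inventory rates can be traced end to end on a small example. This is a sketch on hypothetical toy data (six made-up stops, four of them searches); it follows the same steps as the slides: flag rows whose `search_type` contains "Inventory", then take the mean over all stops and over searched stops only.

```python
import pandas as pd

# Hypothetical toy data; None marks stops where no search was conducted.
df = pd.DataFrame({
    "search_conducted": [False, True, True, True, False, True],
    "search_type": [
        None,
        "Inventory",
        "Probable Cause",
        "Incident to Arrest,Inventory",
        None,
        "Protective Frisk",
    ],
})

# True where "Inventory" appears anywhere in the string; missing values become False.
df["inventory"] = df.search_type.str.contains("Inventory", na=False)

# Share of all stops that included an inventory.
rate_all_stops = df.inventory.mean()

# Share of searches that included an inventory.
searched = df[df.search_conducted == True]
rate_searched = searched.inventory.mean()

print(df.inventory.sum(), rate_all_stops, rate_searched)
```

The two rates differ only in their denominator, which is why the slide's 0.5% (all stops) and 13.3% (searches only) describe the same 441 inventories.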

  28. Let's practice!
