types of joins
play

Types of joins PAN DAS JOIN S F OR S P READS H EET US ERS John - PowerPoint PPT Presentation

Types of joins PAN DAS JOIN S F OR S P READS H EET US ERS John Miller Principal Data Scientist Three types of joins Types One-to-one One-to-many Many-to-many Join type indicates relationship of tables Like lookup values in VLOOKUP


  1. Types of joins PAN DAS JOIN S F OR S P READS H EET US ERS John Miller Principal Data Scientist

  2. Three types of joins Types One-to-one One-to-many Many-to-many Join type indicates relationship of tables Like lookup values in VLOOKUP PANDAS JOINS FOR SPREADSHEET USERS

  3. One-to-one Datasets at the same level Lookup column same for both tables Rows match one-to-one PANDAS JOINS FOR SPREADSHEET USERS

  4. One-to-many Rows are unique at different levels Join based on the higher-level column Result based on lower-level column Similar to VLOOKUP based on lower-level data PANDAS JOINS FOR SPREADSHEET USERS

  5. Many-to-many Values are repeated Results in all possible combinations Can be hard to interpret Best supplemented with a "bridge" table PANDAS JOINS FOR SPREADSHEET USERS

  6. Let's practice! PAN DAS JOIN S F OR S P READS H EET US ERS

  7. A closer look at one- to-one joins PAN DAS JOIN S F OR S P READS H EET US ERS John Miller Principal Data Scientist

  8. Basics of pandas.merge() Used with complementary data pd.merge(left, right) Joins on one or more columns Similar to VLOOKUP PANDAS JOINS FOR SPREADSHEET USERS

  9. Left merges pd.merge(df_left, df_right, on='GameKey', how='left') List data frame names for "df_left" and "df_right" placeholders Resulting frame matches rows in left frame PANDAS JOINS FOR SPREADSHEET USERS

  10. Right merges pd.merge(df_left, df_right, on='GameKey', how='right') Same concept as left merge Resulting frame rows match right frame PANDAS JOINS FOR SPREADSHEET USERS

  11. A framework for joins After viewing and understanding the data: Determine the relationship Check for unique values in key column Write merge statement and execute PANDAS JOINS FOR SPREADSHEET USERS

  12. Joining on two keys Determine the relationship Check for unique values in key column Write merge statement and execute pd.merge(df_left, df_right, on=['GameKey', 'PlayId']) PANDAS JOINS FOR SPREADSHEET USERS

  13. Let's practice! PAN DAS JOIN S F OR S P READS H EET US ERS

  14. Combining common data with inner joins PAN DAS JOIN S F OR S P READS H EET US ERS John Miller Principal Data Scientist

  15. Object-oriented expressions pandas-level function: pd.merge(df_left, df_right, on=None) Shorthand, pandas dataframe method: df_left.merge(df_right, on=None) PANDAS JOINS FOR SPREADSHEET USERS

  16. A basic inner join Similar to VLOOKUP followed by removing NA's df1.merge(df2, on='GameKey', how='inner') Returns only rows present in both tables PANDAS JOINS FOR SPREADSHEET USERS

  17. Joining on different names Key columns may have different names No need to rename columns df1.merge(df2, left_on='GameKey', right_on='game-key', how='inner') PANDAS JOINS FOR SPREADSHEET USERS

  18. Let's practice! PAN DAS JOIN S F OR S P READS H EET US ERS

Download Presentation
Download Policy: The content available on the website is offered to you 'AS IS' for your personal information and use only. It cannot be commercialized, licensed, or distributed on other websites without prior consent from the author. To download a presentation, simply click this link. If you encounter any difficulties during the download process, it's possible that the publisher has removed the file from their server.

Recommend


More recommend