Types of joins
PAN DAS JOIN S F OR S P READS H EET US ERS
John Miller
Principal Data Scientist
Types of joins PAN DAS JOIN S F OR S P READS H EET US ERS John - - PowerPoint PPT Presentation
Types of joins PAN DAS JOIN S F OR S P READS H EET US ERS John Miller Principal Data Scientist Three types of joins Types One-to-one One-to-many Many-to-many Join type indicates relationship of tables Like lookup values in VLOOKUP
PAN DAS JOIN S F OR S P READS H EET US ERS
John Miller
Principal Data Scientist
PANDAS JOINS FOR SPREADSHEET USERS
Types One-to-one One-to-many Many-to-many Join type indicates relationship of tables Like lookup values in VLOOKUP
PANDAS JOINS FOR SPREADSHEET USERS
Datasets at the same level Lookup column same for both tables Rows match one-to-one
PANDAS JOINS FOR SPREADSHEET USERS
Rows are unique at different levels Join based on the higher-level column Result based on lower-level column Similar to VLOOKUP based on lower-level data
PANDAS JOINS FOR SPREADSHEET USERS
Values are repeated Results in all possible combinations Can be hard to interpret Best supplemented with a "bridge" table
PAN DAS JOIN S F OR S P READS H EET US ERS
PAN DAS JOIN S F OR S P READS H EET US ERS
John Miller
Principal Data Scientist
PANDAS JOINS FOR SPREADSHEET USERS
Used with complementary data
pd.merge(left, right)
Joins on one or more columns Similar to VLOOKUP
PANDAS JOINS FOR SPREADSHEET USERS
pd.merge(df_left, df_right,
List data frame names for "df_left" and "df_right" placeholders Resulting frame matches rows in left frame
PANDAS JOINS FOR SPREADSHEET USERS
pd.merge(df_left, df_right,
Same concept as left merge Resulting frame rows match right frame
PANDAS JOINS FOR SPREADSHEET USERS
After viewing and understanding the data: Determine the relationship Check for unique values in key column Write merge statement and execute
PANDAS JOINS FOR SPREADSHEET USERS
Determine the relationship Check for unique values in key column Write merge statement and execute
pd.merge(df_left, df_right,
PAN DAS JOIN S F OR S P READS H EET US ERS
PAN DAS JOIN S F OR S P READS H EET US ERS
John Miller
Principal Data Scientist
PANDAS JOINS FOR SPREADSHEET USERS
pandas-level function:
pd.merge(df_left, df_right, on=None)
Shorthand, pandas dataframe method:
df_left.merge(df_right, on=None)
PANDAS JOINS FOR SPREADSHEET USERS
Similar to VLOOKUP followed by removing NA's
df1.merge(df2, on='GameKey', how='inner')
Returns only rows present in both tables
PANDAS JOINS FOR SPREADSHEET USERS
Key columns may have different names No need to rename columns df1.merge(df2, left_on='GameKey', right_on='game-key', how='inner')
PAN DAS JOIN S F OR S P READS H EET US ERS