
pandas merge by excluding certain columns

I want to merge two dataframes like:

df1.columns = A, B, C, D

df2.columns = A, B, C, D

If I merge them, it merges on all columns. Also, since the number of columns is large, I don't want to list them all in on. I would prefer to exclude the columns I don't want to merge on. How can I do that?

mdf = pd.merge(df1, df2, exclude D)

I expect the result to be like:

mdf.columns = A, B, C, D_x, D_y

Submitted July 21st 2021 by Admin

Answers

You mentioned you don't want to use on because there are many columns.

You could still use on this way even if there are a lot of columns:

mdf = pd.merge(df1, df2, on=[i for i in df1.columns if i != 'D'])

Or use pd.Index.difference, converted to a list because on expects a label or a list of labels:

mdf = pd.merge(df1, df2, on=df1.columns.difference(['D']).tolist())
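If this comes up often, the same idea can be wrapped in a small helper. This is only a sketch: merge_excluding is a hypothetical name, not a pandas function, and the sample values are made up for illustration.

```python
import pandas as pd


def merge_excluding(left, right, exclude, **kwargs):
    """Merge on all columns shared by both frames except those in `exclude`.

    Hypothetical helper, not part of pandas. Extra keyword arguments
    (for example how="left") are passed through to pd.merge.
    """
    keys = [c for c in left.columns if c in right.columns and c not in exclude]
    return pd.merge(left, right, on=keys, **kwargs)


# Made-up sample data with the same columns as in the question
df1 = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6], "D": [10, 20]})
df2 = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6], "D": [30, 40]})

mdf = merge_excluding(df1, df2, exclude=["D"])
print(mdf.columns.tolist())  # ['A', 'B', 'C', 'D_x', 'D_y']
```

Because D is not a join key but exists in both frames, pandas keeps both copies and applies its default _x and _y suffixes.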

Admin | 4 days ago



What about dropping the unwanted column after the merge?

You can use pandas.DataFrame.drop:

mdf = pd.merge(df1, df2).drop('D', axis=1)
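One caveat, shown here with made-up data: when on is omitted, merge joins on every column the two frames share, D included. So this approach keeps only rows where D also matches, and the result has no D_x or D_y columns, which may not be what the question asks for.

```python
import pandas as pd

# Made-up sample data: D agrees in the first row only
df1 = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6], "D": [10, 20]})
df2 = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6], "D": [10, 99]})

# With no `on`, the (inner) merge uses A, B, C and D as join keys
dropped = pd.merge(df1, df2).drop("D", axis=1)

print(dropped.columns.tolist())  # ['A', 'B', 'C']
print(len(dropped))              # 1 (the row where D differs is gone)
```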

Admin | 4 days ago



Another solution can be:

cols = df1.columns.tolist()
cols.remove('D')  # remove() modifies the list in place
mdf = pd.merge(df1, df2, on=cols)
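A point worth checking with list.remove: it changes the list in place and returns None, so its return value cannot be passed directly as on (on=None makes merge fall back to all shared columns). A short sketch with a made-up frame:

```python
import pandas as pd

# Made-up single-row frame, only the column names matter here
df1 = pd.DataFrame({"A": [1], "B": [2], "C": [3], "D": [4]})

# list.remove mutates the list and returns None
result = df1.columns.tolist().remove("D")
print(result)  # None

# Build the list first, then remove from it
cols = df1.columns.tolist()
cols.remove("D")
print(cols)  # ['A', 'B', 'C']
```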

Admin | 4 days ago


