R issues with merge/rbind/concatenate two data frames -



R issues with merge/rbind/concatenate two data frames -

i beginner r apologise in advance if question asked elsewhere. here issue:

i have 2 info frames, df1 , df2, different number of rows , columns. 2 frames have 1 variable (column) in mutual called "customer_no". want merged frame match records based on "customer_no" , rows in df2 only.both data.frames have multiple rows each customer_no.

i tried following:

merged.df <- (df1, df2, by="customer_no",all.y=true)

the problem assigns values of df1 df2 instead should empty. questions are:

1) how can tell command leave unmatched columns empty? 2) how can see merged file row came df? guess if resolve above question should easy see empty columns.

i missing in command don't know what. if question has been answered somewhere else, still kind plenty rephrase in english language here r beginner?

thanks!

data example:

df1: customer_no country year 10 uk 2001 10 uk 2002 10 uk 2003 20 2007 30 au 2006 df2: customer_no income 10 700 10 800 10 900 30 1000

merged file should this:

merged.df: customer_no income country year 10 uk 2001 10 uk 2002 10 uk 2003 10 700 10 800 10 900 30 au 2006 30 1000

so: puts columns together, adds values of df2 right after lastly 1 of df1 based on same customer_no , matches customer_no df2 (merged.df not have customer_no 20). also, leaves empty other cells.

in stata utilize append not sure in r...perhaps join?

thanks!!

try:

df1$id <- paste(df1$customer_no, 1, sep="_") df2$id <- paste(df2$customer_no, 2, sep="_") res <- merge(df1, df2, by=c('id', 'customer_no'),all=true)[,-1] res1 <- res[res$customer_no %in% df2$customer_no,] res1 # customer_no country year income #1 10 uk 2001 na #2 10 uk 2002 na #3 10 uk 2003 na #4 10 <na> na 700 #5 10 <na> na 800 #6 10 <na> na 900 #8 30 au 2006 na #9 30 <na> na 1000

if want alter na '',

res1[is.na(res1)] <- '' #but, leave `na` there `numeric` columns.

or, utilize rbindlist data.table (using original datasets)

library(data.table) indx <- df1$customer_no %in% df2$customer_no rbindlist(list(df1[indx,], df2),fill=true)[order(customer_no)] # customer_no country year income #1: 10 uk 2001 na #2: 10 uk 2002 na #3: 10 uk 2003 na #4: 10 na na 700 #5: 10 na na 800 #6: 10 na na 900 #7: 30 au 2006 na #8: 30 na na 1000

r merge

Comments

Popular posts from this blog

java Multi query from Mysql using netbeans -

c# - DotNetZip fails with "stream does not support seek operations" -

c++ - StartServiceCtrlDispatcher don't can access 1063 error -