r - Merging files on the basis of columns -



r - Merging files on the basis of columns -

i have multiple files many rows , 3 columns , need merge them on basis of first 2 columns match. file1

12 13 13 15 b 14 17 c 4 9 d . . . . . . 81 23 h

file 2

12 13 e 3 10 b 14 17 c 4 9 j . . . . . . 1 2 k

file 3

12 13 m 13 15 k 1 7 x 24 9 d . . . . . . 1 2 h

and on. want merge them obtain next result

12 13 e m 13 15 b k 14 17 c c 4 9 d j 3 10 b 24 9 d . . . . . . 81 23 h 1 2 k 1 7 x

the first thing comes mind these types of problems merge, perhaps in conjunction reduce(function(x, y) merge(x, y, = "somecols", = true), yourlistofdataframes).

however, merge not efficient function, since looks want "collapse" values fill in rows left right, not default merge behavior.

instead, suggest stack 1 long data.frame , reshape after have added index variable.

here 2 approaches:

option 1: "dplyr" + "tidyr" use mget set of data.frames list. use rbind_all convert list single data.frame. use sequence(n()) in mutate "dplyr" grouping info , create index. use spread "tidyr" transform "long" format "wide" format. library(dplyr) library(tidyr) combined <- rbind_all(mget(ls(pattern = "^file\\d"))) combined %>% group_by(v1, v2) %>% mutate(time = sequence(n())) %>% ungroup() %>% spread(time, v3, fill = "") # source: local info frame [7 x 5] # # v1 v2 1 2 3 # 1 1 7 x # 2 3 10 b # 3 4 9 d j # 4 12 13 e m # 5 13 15 b k # 6 14 17 c c # 7 24 9 d option 2: "data.table" use mget set of data.frames list. use rbindlist convert list single data.table. use sequence(.n) generate sequence groups. use dcast.data.table convert "long" data.table "wide" one. library(data.table) dcast.data.table( rbindlist(mget(ls(pattern = "^file\\d")))[, time := sequence(.n), = list(v1, v2)], v1 + v2 ~ time, value.var = "v3", fill = "") # v1 v2 1 2 3 # 1: 1 7 x # 2: 3 10 b # 3: 4 9 d j # 4: 12 13 e m # 5: 13 15 b k # 6: 14 17 c c # 7: 24 9 d

both of these answers assume starting next sample data:

file1 <- structure( list(v1 = c(12l, 13l, 14l, 4l), v2 = c(13l, 15l, 17l, 9l), v3 = c("a", "b", "c", "d")), .names = c("v1", "v2", "v3"), class = "data.frame", row.names = c(na, -4l)) file2 <- structure( list(v1 = c(12l, 3l, 14l, 4l), v2 = c(13l, 10l, 17l, 9l), v3 = c("e", "b", "c", "j")), .names = c("v1", "v2", "v3"), class = "data.frame", row.names = c(na, -4l)) file3 <- structure( list(v1 = c(12l, 13l, 1l, 24l), v2 = c(13l, 15l, 7l, 9l), v3 = c("m", "k", "x", "d")), .names = c("v1", "v2", "v3"), class = "data.frame", row.names = c(na, -4l))

r merge

Comments

Popular posts from this blog

Delphi change the assembly code of a running process -

json - Hibernate and Jackson (java.lang.IllegalStateException: Cannot call sendError() after the response has been committed) -

C++ 11 "class" keyword -