python - Pandas - consecutive values must be different

I want to subsample the rows of a DataFrame such that all pairs of consecutive values in a given column are different. If 2 of them are the same, keep, say, the first one.

Here is an example:

    import pandas as pd

    p = [1,1,2,1,3,3,2,4,3]
    t = range(len(p))
    # columns listed in the order shown in the output below
    df = pd.DataFrame({'p':p, 't':t})

    df
       p  t
    0  1  0
    1  1  1
    2  2  2
    3  1  3
    4  3  4
    5  3  5
    6  2  6
    7  4  7
    8  3  8

    desireddf
       p  t
    0  1  0
    2  2  2
    3  1  3
    4  3  4
    6  2  6
    7  4  7
    8  3  8

So in desireddf, any 2 consecutive values in the p column are different.

How about this?

    >>> df[df.p != df.p.shift()]
       p  t
    0  1  0
    2  2  2
    3  1  3
    4  3  4
    6  2  6
    7  4  7
    8  3  8

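Explanation: df.p.shift() shifts the entries of column p downwards by one row, so each row lines up with the value from the row before it (the first row gets NaN). df.p != df.p.shift() then checks whether each entry of df.p is different from the previous entry, returning a boolean value for each row. Indexing df with that boolean Series keeps only the rows where the value changed.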
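To make the two steps visible, here is a minimal sketch that prints the shifted column and the resulting mask side by side. The names shifted and mask are just illustrative and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'p': [1, 1, 2, 1, 3, 3, 2, 4, 3]})

# Each row receives the value from the row above; the first row becomes NaN.
shifted = df.p.shift()

# True wherever a value differs from the one before it.
# The first row compares against NaN, so it is always True and is kept.
mask = df.p != shifted

# Show the original column, the shifted column and the mask together.
print(pd.DataFrame({'p': df.p, 'shifted': shifted, 'mask': mask}))

# Boolean indexing keeps only the rows where mask is True.
result = df[mask]
print(result)
```

Rows 1 and 5 are the only ones where mask is False, which is why they are missing from the result.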

This method works on columns with any number of consecutive identical entries: e.g. if there is a run of 3 identical values, only the first value in that run is returned.
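As a quick check of that claim, the sketch below uses a made-up column (the runs frame is hypothetical, not from the question) containing a run of three identical values. It also contrasts the result with drop_duplicates, which removes repeats anywhere in the column rather than only adjacent ones.

```python
import pandas as pd

# Hypothetical data: a run of three 5s, a run of two 7s, then a lone 5.
runs = pd.DataFrame({'p': [5, 5, 5, 7, 7, 5]})

# Keep only the first row of each run of consecutive equal values.
deduped = runs[runs.p != runs.p.shift()]
print(deduped)

# For comparison: drop_duplicates also discards the final 5,
# because it has already appeared earlier in the column.
globally_unique = runs.drop_duplicates('p')
print(globally_unique)
```

The shift-based filter keeps the last 5 because it is not adjacent to the earlier run, which matches the requirement that only consecutive values must differ.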

python pandas dataframes distinct-values subsampling

Jaimee
