python - Pandas, groupby and finding maximum in groups, returning value and count -

i have pandas dataframe log data:

        host service 0   this.com     mail service 1   this.com     mail service 2   this.com     web 3   that.com     mail service 4  other.net     mail service 5  other.net     web 6  other.net     web

and want find service on every host gives errors:

host service no 0 this.com mail service 2 1 that.com mail service 1 2 other.net web 2

the solution found grouping host , service, , iterating on level 0 of index.

can suggest better, shorter version? without iteration?

df = df_logfile.groupby(['host','service']).agg({'service':np.size})  df_count = pd.dataframe() df_count['host'] = df_logfile['host'].unique() df_count['service']  = np.nan df_count['no']    = np.nan  h,data in df.groupby(level=0):   = data.idxmax()[0]      service = i[1]                no = data.xs(i)[0]   df_count.loc[df_count['host'] == h, 'service'] = service   df_count.loc[(df_count['host'] == h) & (df_count['service'] == service), 'no']   = no

full code https://gist.github.com/bjelline/d8066de66e305887b714

given df, next step grouping host value lone and aggregate idxmax. gives index corresponds the greatest service value. can utilize df.loc[...] select rows in df correspond greatest service values:

import numpy np import pandas pd  df_logfile = pd.dataframe({      'host' : ['this.com', 'this.com', 'this.com', 'that.com', 'other.net',                'other.net', 'other.net'],     'service' : ['mail', 'mail', 'web', 'mail', 'mail', 'web', 'web' ] })  df = df_logfile.groupby(['host','service'])['service'].agg({'no':'count'}) mask = df.groupby(level=0).agg('idxmax') df_count = df.loc[mask['no']] df_count = df_count.reset_index() print("\noutput\n{}".format(df_count))

yields dataframe

host service no 0 other.net web 2 1 that.com mail service 1 2 this.com mail service 2

python numpy pandas

Search This Blog

Jaimee

python - Pandas, groupby and finding maximum in groups, returning value and count -

Comments

Post a Comment

Popular posts from this blog

assembly - What is the addressing mode for ld, add, and rjmp instructions? -

vowpalwabbit - Interpreting Vowpal Wabbit results: Why are some lines appended by "h"? -

Is there a way to convert an HTML page styled with Bootstrap CSS into email-compatible html? -