SQL-like window functions in PANDAS: Row Numbering in Python Pandas Dataframe -
i come sql background , use following data processing step frequently:
- partition table of data 1 or more fields
- for each partition, add rownumber each of rows ranks row 1 or more other fields, analyst specifies ascending or descending
ex:
df = pd.dataframe({'key1' : ['a','a','a','b','a'], 'data1' : [1,2,2,3,3], 'data2' : [1,10,2,3,30]}) df data1 data2 key1 0 1 1 1 2 10 2 2 2 3 3 3 b 4 3 30
i'm looking how pandas equivalent sql window function:
rn = row_number() on (partition key1, key2 order data1 asc, data2 desc) data1 data2 key1 rn 0 1 1 1 1 2 10 2 2 2 2 3 3 3 3 b 1 4 3 30 4
i've tried following i've gotten work there no 'partitions':
def row_number(frame,orderby_columns, orderby_direction,name): frame.sort_index(by = orderby_columns, ascending = orderby_direction, inplace = true) frame[name] = list(xrange(len(frame.index)))
i tried extend idea work partitions (groups in pandas) following didn't work:
df1 = df.groupby('key1').apply(lambda t: t.sort_index(by=['data1', 'data2'], ascending=[true, false], inplace = true)).reset_index() def nf(x): x['rn'] = list(xrange(len(x.index))) df1['rn1'] = df1.groupby('key1').apply(nf)
but got lot of nans when this.
ideally, there'd succinct way replicate window function capability of sql (i've figured out window based aggregates...that's 1 liner in pandas)...can share me idiomatic way number rows in pandas?
you can using groupby
twice along rank
method:
in [11]: g = df.groupby('key1')
use min method argument give values share same data1 same rn:
in [12]: g['data1'].rank(method='min') out[12]: 0 1 1 2 2 2 3 1 4 4 dtype: float64 in [13]: df['rn'] = g['data1'].rank(method='min')
and groupby these results , add rank respect data2:
in [14]: g1 = df.groupby(['key1', 'rn']) in [15]: g1['data2'].rank(ascending=false) - 1 out[15]: 0 0 1 0 2 1 3 0 4 0 dtype: float64 in [16]: df['rn'] += g1['data2'].rank(ascending=false) - 1 in [17]: df out[17]: data1 data2 key1 rn 0 1 1 1 1 2 10 2 2 2 2 3 3 3 3 b 1 4 3 30 4
it feels there ought native way (there may be!...).
Comments
Post a Comment