python - New column based on conditional selection from the values of 2 other columns in a Pandas DataFrame
I've got a DataFrame which contains stock values. It looks like this:
>>> data
             open   high    low  close   volume  adj close
date
2013-07-08  76.91  77.81  76.85  77.04  5106200      77.04
When I try to make a conditional new column with the following if statement:
data['test'] =data['close'] if data['close'] > data['open'] else data['open']
I get the following error:
Traceback (most recent call last):
  File "<pyshell#116>", line 1, in <module>
    data[1]['test'] =data[1]['close'] if data[1]['close'] > data[1]['open'] else data[1]['open']
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
I then tried a.all():
data[1]['test'] =data[1]['close'] if all(data[1]['close'] > data[1]['open']) else data[1]['open']
The result was that the entire ['open'] column was selected. I didn't get the condition I wanted, which is to select, every time, the biggest value between the ['open'] and ['close'] columns.
Any help is appreciated.
Thanks.
From a DataFrame like:
>>> df
         date   open   high    low  close   volume  adj close
0  2013-07-08  76.91  77.81  76.85  77.04  5106200      77.04
1  2013-07-00  77.04  79.81  71.81  72.87  1920834      77.04
2  2013-07-10  72.87  99.81  64.23  93.23  2934843      77.04
The simplest thing I can think of would be:
>>> df["test"] = df[["open", "close"]].max(axis=1)
>>> df
         date   open   high    low  close   volume  adj close   test
0  2013-07-08  76.91  77.81  76.85  77.04  5106200      77.04  77.04
1  2013-07-00  77.04  79.81  71.81  72.87  1920834      77.04  77.04
2  2013-07-10  72.87  99.81  64.23  93.23  2934843      77.04  93.23
df.ix[:,["open", "close"]].max(axis=1) might be a little faster, but I don't think it's as nice to look at.
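Note that .ix was deprecated in pandas 0.20 and removed in pandas 1.0, so on current versions the label-based .loc indexer is the closest equivalent. A minimal sketch, assuming a small frame built from the sample rows above (the frame construction here is illustrative, not from the original post):

```python
import pandas as pd

# Illustrative frame using the open/close/volume values from the sample rows.
df = pd.DataFrame({
    "open": [76.91, 77.04, 72.87],
    "close": [77.04, 72.87, 93.23],
    "volume": [5106200, 1920834, 2934843],
})

# .loc[:, [...]] selects all rows and the two named columns,
# then max(axis=1) takes the larger value within each row.
df["test"] = df.loc[:, ["open", "close"]].max(axis=1)
print(df)
```

Whether this is measurably faster than plain df[["open", "close"]] would need timing on your own data.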
Alternatively, you could use .apply on the rows:
>>> df["test"] = df.apply(lambda row: max(row["open"], row["close"]), axis=1)
>>> df
         date   open   high    low  close   volume  adj close   test
0  2013-07-08  76.91  77.81  76.85  77.04  5106200      77.04  77.04
1  2013-07-00  77.04  79.81  71.81  72.87  1920834      77.04  77.04
2  2013-07-10  72.87  99.81  64.23  93.23  2934843      77.04  93.23
Or fall back to numpy:
>>> df["test"] = np.maximum(df["open"], df["close"])
>>> df
         date   open   high    low  close   volume  adj close   test
0  2013-07-08  76.91  77.81  76.85  77.04  5106200      77.04  77.04
1  2013-07-00  77.04  79.81  71.81  72.87  1920834      77.04  77.04
2  2013-07-10  72.87  99.81  64.23  93.23  2934843      77.04  93.23
The basic problem is that if/else doesn't play nicely with arrays, because if (something) always coerces the something into a single bool. It's not equivalent to "for every element in the array something, if the condition holds" or anything like that.
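If you do want the conditional spelled out rather than a max, np.where applies it element by element. A sketch, assuming the same sample values as above; the names mask and raised are just for illustration:

```python
import numpy as np
import pandas as pd

# Illustrative frame using the open/close values from the sample rows.
df = pd.DataFrame({
    "open": [76.91, 77.04, 72.87],
    "close": [77.04, 72.87, 93.23],
})

# The comparison yields a boolean Series with one value per row.
mask = df["close"] > df["open"]

# A plain `if` needs a single True/False, so it refuses the whole Series.
raised = False
try:
    if mask:
        pass
except ValueError:
    raised = True

# np.where picks close where the mask is True and open elsewhere, row by row.
df["test"] = np.where(mask, df["close"], df["open"])
print(df)
```

This also shows why the all() attempt returned the whole ['open'] column: all() collapses the mask to one bool, and since not every row has close > open, the else branch is taken for the entire column at once.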