Pandas: find nearest higher high (OHLC)

问题

I'm trying to find the fastest solution to iterate over each row of Open High Low Close data and count out how many rows exist between the current row's high and the next high that’s greater than or equal to the original. Here is a shortened code example which I think explains it well (I'd be trying to find the final value of _next):

for i in df.index:
    while df.loc[i, 'high'] > df.loc[_next, 'high']:
         _next += 1

I think I need to use a lambda function inside apply but I can't seem to get it right. Been trying to figure this out for days and would love any advice!

Using this data, the function should return [2, 0, 0, 3, 1, 0, 0]

d = {'high': [1.2, 1.1, 1.1, 1.4, 1.3, 1.2, 1.3]}
df = pd.DataFrame(data=d)


    high   rows_to_higher_high
0   1.2    2
1   1.1    0
2   1.1    0
3   1.4    3
4   1.3    1
5   1.2    0
6   1.3    0

Edit: itertuples(name=None, index=True) is the fastest way to do it so far.

回答1:

using list comprehension and get index to list

In [166]: d = {'high': [1.2, 1.1, 1.1, 1.4, 1.3, 1.2, 1.3]}

In [167]: df = pd.DataFrame(data=d)

In [168]: df['rows_to_higher_high']=[(df['high'].values[i+1:]>=val).tolist().index(True) if True in (df['high'].values[i+1:]>=val).tolist() else le
     ...: n((df['high'].values[i+1:]>=val).tolist())  for i, val in enumerate(df['high'].values)]

In [169]: df
Out[169]:
   high  rows_to_higher_high
0   1.2                    2
1   1.1                    0
2   1.1                    0
3   1.4                    3
4   1.3                    1
5   1.2                    0
6   1.3                    0

回答2:

Base on my understanding, we can get the subtraction from current row by all value, then get the upper tri , and get the index max with argmax when the value is more than 0

Sample :

s
0    1
1    2
2    1
3    3
4    1
5    1
6    5
dtype: int64

Solution :

np.clip(np.triu((s.values-s.values[:,None])>0,0).argmax(1)-np.arange(len(s)),a_min=0, a_max=None)
array([1, 2, 1, 3, 2, 1, 0])

来源：https://stackoverflow.com/questions/62963977/pandas-find-nearest-higher-high-ohlc

标签

python

pandas

ohlc