Move non-empty cells to the left in pandas DataFrame

Anonymous (unverified), submitted 2019-12-03 01:54:01

Question:

Suppose I have data of the form

    Name    h1    h2    h3    h4
    A       1     nan   2     3
    B       nan   nan   1     3
    C       1     3     2     nan

I want to move all non-nan cells to the left (or collect all non-nan data in new columns) while preserving the order from left to right, getting

    Name    h1    h2    h3    h4
    A       1     2     3     nan
    B       1     3     nan   nan
    C       1     3     2     nan

I can of course do this row by row, but I would like to know whether there are other ways with better performance.

Answer 1:

Here's what I did:

I stacked your dataframe into a longer format, then grouped by the Name column. Within each group, I drop the NaNs and relabel the remaining values with the leftmost columns, then reindex to the full h1 through h4 set, which re-creates your NaNs on the right.

    from io import StringIO
    import pandas

    def defragment(x):
        # Drop the NaNs, then label the surviving values with the leftmost columns
        values = x.dropna().values
        return pandas.Series(values, index=df.columns[:len(values)])

    datastring = StringIO("""\
    Name    h1    h2    h3    h4
    A       1     nan   2     3
    B       nan   nan   1     3
    C       1     3     2     nan""")

    # Raw string so that \s is not treated as an invalid escape sequence
    df = pandas.read_table(datastring, sep=r'\s+').set_index('Name')
    long_index = pandas.MultiIndex.from_product([df.index, df.columns])

    print(
        df.stack()
          .groupby(level='Name')
          .apply(defragment)
          .reindex(long_index)
          .unstack()
    )

And so I get:

       h1  h2  h3  h4
    A   1   2   3 NaN
    B   1   3 NaN NaN
    C   1   3   2 NaN
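To make the per-group step in the approach above easier to follow, here is a minimal sketch of what happens to a single row (B). The DataFrame is rebuilt by hand from the question's data, and the names `row_b`, `relabelled` and `padded` are illustrative only; they do not appear in the answer.

```python
import numpy as np
import pandas as pd

# Assumption: the question's data, rebuilt by hand with Name as the index.
df = pd.DataFrame(
    {"h1": [1, np.nan, 1], "h2": [np.nan, np.nan, 3],
     "h3": [2, 1, 2], "h4": [3, 3, np.nan]},
    index=pd.Index(["A", "B", "C"], name="Name"),
)

row_b = df.loc["B"]                      # h1=NaN, h2=NaN, h3=1.0, h4=3.0

# Step 1: drop the NaNs and keep the surviving values in their original order.
values = row_b.dropna().values

# Step 2: label those values with the leftmost column names (h1, h2).
relabelled = pd.Series(values, index=df.columns[:len(values)])

# Step 3: reindex to the full column set; the missing labels become NaN.
padded = relabelled.reindex(df.columns)
print(padded)
```

The answer performs the same three steps for every group at once: `defragment` covers steps 1 and 2, and `.reindex(long_index)` covers step 3.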


Answer 2:

Here's how you could do it with a regex (possibly not recommended):

    import re
    from io import StringIO
    import pandas as pd

    pd.read_csv(StringIO(re.sub(',+', ',', df.to_csv())))

    Out[20]:
      Name  h1  h2  h3  h4
    0    A   1   2   3 NaN
    1    B   1   3 NaN NaN
    2    C   1   3   2 NaN


Answer 3:

First, define a function.

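    import numpy as np

    def squeeze_nan(x):
        original_columns = x.index.tolist()

        # Drop the NaNs and relabel the remaining values with the leftmost columns
        squeezed = x.dropna()
        squeezed.index = [original_columns[n] for n in range(squeezed.count())]

        # Restore the full set of columns; the missing ones are filled with NaN
        return squeezed.reindex(original_columns, fill_value=np.nan)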

Second, apply the function.

    df.apply(squeeze_nan, axis=1)

You can also use axis=0 and reverse the data with [::-1] to squeeze the NaNs in any direction.
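As a rough sketch of those two variations, the snippet below squeezes to the right by reversing each row before and after the call, and squeezes upward by applying the function per column. The DataFrame is the question's data rebuilt by hand, `squeeze_nan` is a lightly condensed copy of the function above, and `right` and `up` are illustrative names.

```python
import numpy as np
import pandas as pd


def squeeze_nan(x):
    """Move the non-NaN values of a Series to the front, keeping their order."""
    original_labels = x.index.tolist()
    squeezed = x.dropna()
    squeezed.index = original_labels[:len(squeezed)]
    return squeezed.reindex(original_labels)


# Assumption: the question's data, rebuilt by hand with Name as the index.
df = pd.DataFrame(
    {"h1": [1, np.nan, 1], "h2": [np.nan, np.nan, 3],
     "h3": [2, 1, 2], "h4": [3, 3, np.nan]},
    index=pd.Index(["A", "B", "C"], name="Name"),
)

# Squeeze to the right: reverse each row, squeeze to the front, reverse back.
right = df.apply(lambda row: squeeze_nan(row[::-1])[::-1], axis=1)

# Squeeze upward: apply the same function column by column.
up = df.apply(squeeze_nan, axis=0)

print(right)
print(up)
```

Both variants still call a Python function once per row or column, so they are no faster than the original; they only change the direction.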

[EDIT]

@Mxracer888, is this what you want? Rows whose name is listed in hold are returned unchanged:

    def squeeze_nan(x, hold):
        if x.name not in hold:
            original_columns = x.index.tolist()

            squeezed = x.dropna()
            squeezed.index = [original_columns[n] for n in range(squeezed.count())]

            return squeezed.reindex(original_columns, fill_value=np.nan)
        else:
            return x

    df.apply(lambda x: squeeze_nan(x, ['B']), axis=1)


