I have these packages installed:
python: 2.7.3.final.0
python-bits: 64
OS: Linux
machine: x86_64
processor: x86_64
byteorder: little
pandas: 0.13.1
Try generating the _id field with a DataFrame.apply call:
    def apply_id(x):
        # Build the composite key for a single row
        x['_id'] = "{}_{}_{}".format(x['Store'], x['Dept'], x['Date_Str'])
        return x

    # axis=1 passes each row to apply_id
    df_train = df_train.apply(apply_id, axis=1)
With apply, the _id is generated one row at a time, which keeps the memory-allocation overhead minimal.
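For reference, here is a minimal self-contained sketch of the same idea. The sample rows are made up for illustration; only the column names (Store, Dept, Date_Str) come from the question. It is also a slight variant: the hypothetical helper make_id returns just the key string instead of the whole row, and the result is assigned to a new _id column. I have not measured whether this changes memory use on a large frame.

```python
import pandas as pd

# Hypothetical sample data; only the column names are taken from the question.
df_train = pd.DataFrame({
    "Store": [1, 1, 2],
    "Dept": [1, 2, 1],
    "Date_Str": ["2010-02-05", "2010-02-05", "2010-02-12"],
})

def make_id(row):
    # Build the composite key for one row
    return "{}_{}_{}".format(row["Store"], row["Dept"], row["Date_Str"])

# axis=1 calls make_id once per row; the returned strings form a Series
df_train["_id"] = df_train.apply(make_id, axis=1)

print(df_train["_id"].tolist())
# ['1_1_2010-02-05', '1_2_2010-02-05', '2_1_2010-02-12']
```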