Question
I am trying to append a fourth column to the following DataFrame, which has 465017 rows.
0 1 2
0 228055 231908 1
1 228056 228899 1
Running the following statement
x["Fake_date"]= fake.date(pattern="%Y-%m-%d", end_datetime=None)
returns
0 1 2 Fake_date
0 228055 231908 1 1980-10-12
1 228056 228899 1 1980-10-12
but I want a different random date on each of the 465017 rows, for instance:
0 1 2 Fake_date
0 228055 231908 1 1980-10-11
1 228056 228899 1 1980-09-12
How do I randomize this?
Answer 1:
Without the faker package, you can do this:
import numpy as np
import pandas as pd
x["Fake_date"] = np.random.choice(pd.date_range('1980-01-01', '2000-01-01'), len(x))
>>> x
0 1 2 Fake_date
0 228055 231908 1 1999-12-08
1 228056 228899 1 1989-01-25
Replace the two date strings in pd.date_range() with the minimum and maximum dates that you want to draw random dates from.
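
For context, the assignment in the question calls fake.date() only once, so pandas broadcasts that single value to every row. Here is a self-contained sketch of the same per-row sampling idea that draws a random day offset for each row instead of materializing the full date range. The three-row frame, the seed, and the names start, end, rng, n_days and offsets are assumptions for illustration, not part of the original post.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the 465017-row DataFrame in the question (assumed data).
x = pd.DataFrame([[228055, 231908, 1],
                  [228056, 228899, 1],
                  [228057, 228900, 1]])

# Assumed bounds; substitute your own minimum and maximum dates.
start = pd.Timestamp("1980-01-01")
end = pd.Timestamp("2000-01-01")

# Seeded only so that the sketch is reproducible; drop the seed for fresh values.
rng = np.random.default_rng(0)

# Number of candidate days, counting both endpoints.
n_days = (end - start).days + 1

# One independent day offset per row, then shift the start date by each offset.
offsets = rng.integers(0, n_days, size=len(x))
x["Fake_date"] = start + pd.to_timedelta(offsets, unit="D")

print(x)
```

The column holds datetime values; if strings in the "%Y-%m-%d" pattern are needed, x["Fake_date"].dt.strftime("%Y-%m-%d") should convert them.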
Source: https://stackoverflow.com/questions/49522397/add-random-dates-in-400k-pandas-dataframe