Unlike every other question I can find, I do not want to create a DataFrame from a homogeneous NumPy array, nor do I want to convert a structured array into a DataFrame.
pandas.DataFrame({"col1": nparray1, "col2": nparray2})
This works if you use list(nparray) instead. Here's a generic example:
import numpy as np
import pandas as pd
alpha = np.array([1, 2, 3])
beta = np.array([4, 5, 6])
gamma = np.array([7, 8, 9])
dikt = {"Alpha": list(alpha), "Beta": list(beta), "Gamma": list(gamma)}
data_frame = pd.DataFrame(dikt)
print(data_frame)
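As a side note, in the pandas versions I'm aware of, a dict of equal-length 1D NumPy arrays can also be passed to the constructor as-is, and each column keeps the dtype of its array. Here is a minimal sketch of that; the array names and values are made up for illustration and are not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample arrays with deliberately different dtypes.
ints = np.array([1, 2, 3], dtype=np.int64)
floats = np.array([1.5, 2.5, 3.5], dtype=np.float64)
flags = np.array([True, False, True])

# Each 1D array becomes one column; no list() conversion is applied here.
mixed = pd.DataFrame({"ints": ints, "floats": floats, "flags": flags})

# Inspect the per-column dtypes to confirm they were not merged into one.
print(mixed.dtypes)
```

Going through list() should give the same values, but the arrays are then rebuilt from Python objects, so passing them directly is probably the simpler choice when the dtypes matter.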
May I suggest adding the columns one by one? It might help with efficiency. For example:
import numpy as np
import pandas as pd
df = pd.DataFrame()
col1 = np.array([1, 2, 3])
col2 = np.array([4, 5, 6])
df['col1'] = col1
df['col2'] = col2
>>> df
   col1  col2
0     1     4
1     2     5
2     3     6
I don't think this fully answers the question but it might help.
1. When you initialize a DataFrame directly from a 2D array, no copy is made.
2. You don't have 2D arrays, you have 1D arrays. How to get a 2D array from 1D arrays without making copies, I don't know.
To illustrate the points, see below:
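import numpy as np
import pandas as pd
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
c = np.array((a, b))  # stacks a and b into a new 2D array
df = pd.DataFrame(c)
print(c)
# [[1 2 3]
#  [4 5 6]]
print(df)
#    0  1  2
# 0  1  2  3
# 1  4  5  6
c[1, 1] = 10  # modify the 2D array after the DataFrame was built
print(df)
#    0   1  2
# 0  1   2  3
# 1  4  10  6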
So changing c does change df. However, changing a or b does not affect c (or df), because np.array((a, b)) copies their values into a new array.
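To make the copy visible, here is a small sketch using np.shares_memory. The second half is a possible workaround that is not part of the answer above: allocate the 2D block first and use its rows as the 1D arrays, so the rows are views rather than copies. The names block, row_a and row_b are made up for illustration. Whether a DataFrame built from block keeps sharing its memory depends on the pandas version and settings, so that part is not checked here.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# np.array((a, b)) allocates a new 2D array and copies the values into it.
c = np.array((a, b))
print(np.shares_memory(a, c))  # False

# Because c holds its own copy, writing to a leaves c untouched.
a[0] = 99
print(c[0, 0])  # 1

# Hypothetical workaround: allocate the 2D block first, then take row views.
block = np.empty((2, 3), dtype=np.int64)
row_a = block[0]  # view into the first row of block
row_b = block[1]  # view into the second row of block
row_a[:] = [1, 2, 3]
row_b[:] = [4, 5, 6]
print(np.shares_memory(row_a, block))  # True

# Writing through a row view is visible in the 2D block.
row_a[0] = 99
print(block[0, 0])  # 99
```

This only helps if you control where the 1D arrays are created. Arrays that already exist elsewhere still have to be copied into the block once.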