I want to train a simple neural network in PyTorch using a personal database. This database is imported from an Excel file and stored in df.
One of the
You can use the functions below to convert any DataFrame or pandas Series to a PyTorch tensor:
import pandas as pd
import torch
# determine the supported device
def get_device():
    if torch.cuda.is_available():
        device = torch.device('cuda:0')
    else:
        device = torch.device('cpu')  # no GPU available
    return device
# convert a df (or series) to a float32 tensor on that device
def df_to_tensor(df):
    device = get_device()
    return torch.from_numpy(df.values).float().to(device)
# usage: df is your DataFrame, series is a pandas Series
df_tensor = df_to_tensor(df)
series_tensor = df_to_tensor(series)
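One caveat with the helper above: a DataFrame read from Excel often contains text or date columns, and then df.values is an object array that torch.from_numpy cannot convert. Here is a minimal sketch, assuming you only want the numeric columns. The function name numeric_df_to_tensor and the example columns are made up for illustration:

```python
import pandas as pd
import torch

def numeric_df_to_tensor(df: pd.DataFrame) -> torch.Tensor:
    """Convert only the numeric columns of df to a float32 tensor."""
    # select_dtypes(include="number") drops string/object/datetime columns
    numeric = df.select_dtypes(include="number")
    # an explicit dtype gives a single float32 array even for mixed int/float columns
    return torch.from_numpy(numeric.to_numpy(dtype="float32"))

# hypothetical data standing in for a frame read from Excel
example_df = pd.DataFrame({
    "height": [1.5, 1.8, 1.7],
    "age": [30, 41, 25],
    "name": ["a", "b", "c"],  # non-numeric, gets skipped
})
example_tensor = numeric_df_to_tensor(example_df)
print(example_tensor.shape, example_tensor.dtype)
```

If the text columns carry information you need (categories, for example), you would have to encode them as numbers first rather than drop them.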
Simply convert the pandas DataFrame to a NumPy array and then to a PyTorch tensor. An example of this is shown below:
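import pandas as pd
import numpy as np
import torch
import torch.utils.data as data_utils
df = pd.read_csv('train.csv')
# split off the target column; the remaining columns are the features
target = pd.DataFrame(df['target'])
del df['target']
# wrap features and targets as float32 tensors in a dataset
train = data_utils.TensorDataset(torch.Tensor(np.array(df)), torch.Tensor(np.array(target)))
train_loader = data_utils.DataLoader(train, batch_size=10, shuffle=True)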
Hopefully, this will help you create your own datasets using PyTorch (compatible with the latest version of PyTorch at the time of writing).
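To check that the loader yields what you expect, you can iterate over it once and look at the batch shapes. This is only a sketch: the random demo_df stands in for train.csv, and the column names f1, f2, f3 are assumptions for the example:

```python
import numpy as np
import pandas as pd
import torch
import torch.utils.data as data_utils

# hypothetical stand-in for pd.read_csv('train.csv'): 25 rows, 3 features
rng = np.random.default_rng(0)
demo_df = pd.DataFrame(rng.random((25, 3)), columns=["f1", "f2", "f3"])
demo_df["target"] = rng.random(25)

# features: every column except the target; targets kept 2-D with shape (n, 1)
features = torch.tensor(demo_df.drop(columns="target").to_numpy(), dtype=torch.float32)
targets = torch.tensor(demo_df[["target"]].to_numpy(), dtype=torch.float32)

dataset = data_utils.TensorDataset(features, targets)
loader = data_utils.DataLoader(dataset, batch_size=10, shuffle=True)

# each iteration yields one (features, targets) pair of batches
batch_shapes = [(tuple(x.shape), tuple(y.shape)) for x, y in loader]
print(batch_shapes)
```

With 25 rows and batch_size=10 the last batch only has 5 rows. Pass drop_last=True to DataLoader if your model needs equally sized batches.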
Maybe try this to see if it fixes your problem (based on your sample code):
import numpy as np
import torch
import torch.utils.data as data_utils
# train is the DataFrame and batch_size the value from your sample code
train_target = torch.tensor(train['Target'].values.astype(np.float32))
train = torch.tensor(train.drop('Target', axis=1).values.astype(np.float32))
train_tensor = data_utils.TensorDataset(train, train_target)
train_loader = data_utils.DataLoader(dataset=train_tensor, batch_size=batch_size, shuffle=True)
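Once you have a loader like this, a training loop could look roughly like the sketch below. The data, the single nn.Linear layer, and the hyperparameters are placeholders I picked for the example, not something taken from your code. Note the squeeze: train_target above is 1-D, while the model output has shape (batch, 1), so the shapes need to be aligned before computing the loss.

```python
import torch
import torch.nn as nn
import torch.utils.data as data_utils

torch.manual_seed(0)
# hypothetical tensors standing in for the features and the 1-D target
x = torch.rand(40, 4)
y = x.sum(dim=1)  # shape (40,)

loader = data_utils.DataLoader(data_utils.TensorDataset(x, y), batch_size=8, shuffle=True)

model = nn.Linear(4, 1)  # placeholder model: one linear layer
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

epoch_losses = []
for epoch in range(20):
    total = 0.0
    for xb, yb in loader:
        optimizer.zero_grad()
        pred = model(xb).squeeze(1)  # (batch, 1) -> (batch,) to match yb
        loss = loss_fn(pred, yb)
        loss.backward()
        optimizer.step()
        total += loss.item() * xb.size(0)
    # mean loss per sample over the epoch
    epoch_losses.append(total / len(x))

print(epoch_losses[0], epoch_losses[-1])
```

If the loss does not go down on your real data, check the tensor dtypes and shapes first; a silent broadcast between (batch,) and (batch, 1) is a common cause.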
I'm referring to the question in the title, as you haven't really specified anything else in the text, so this just converts the DataFrame into a PyTorch tensor.
Without information about your data, I'm just using float values as example targets here.
Convert Pandas dataframe to PyTorch tensor?
import pandas as pd
import torch
import random
# creating dummy targets (float values)
targets_data = [random.random() for i in range(10)]
# creating DataFrame from targets_data
targets_df = pd.DataFrame(data=targets_data)
targets_df.columns = ['targets']
# creating tensor from targets_df
torch_tensor = torch.tensor(targets_df['targets'].values)
# printing out result
print(torch_tensor)
Output:
tensor([ 0.5827, 0.5881, 0.1543, 0.6815, 0.9400, 0.8683, 0.4289,
0.5940, 0.6438, 0.7514], dtype=torch.float64)
Tested with PyTorch 0.4.0.
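Note the dtype=torch.float64 in the output: pandas stores floats as float64 by default, and torch.tensor keeps that dtype. The parameters of PyTorch layers are float32 by default, so feeding a float64 tensor to a model typically fails with a dtype mismatch error. A small sketch of two ways to get float32 instead (the three values are just dummy data):

```python
import pandas as pd
import torch

# dummy targets, same layout as targets_df above
targets_df = pd.DataFrame({"targets": [0.1, 0.2, 0.3]})

# default: dtype is inherited from the NumPy array (float64)
as_float64 = torch.tensor(targets_df["targets"].values)
# option 1: request float32 when creating the tensor
as_float32 = torch.tensor(targets_df["targets"].values, dtype=torch.float32)
# option 2: convert an existing tensor with .float()
converted = as_float64.float()

print(as_float64.dtype)  # torch.float64
print(as_float32.dtype)  # torch.float32
print(converted.dtype)   # torch.float32
```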
I hope this helps, if you have any further questions - just ask. :)