My aim is to build an image classification model for flowers. The data RAR file contains a folder named train data, which holds about 16000 images labelled from 0 to 16000. Similarly, there is a folder for the test data.
Apart from this, there are two CSV workbooks. The first has two attributes, label and flower class, with 104 labels and flower classes in total. The second also has two attributes, id and flower class, and links the train images folder to the flower classes, with id as the linking attribute. For example, if the image labelled 10 in the train images folder is a sunflower, then the flower class entry for id = 10 in the second workbook is sunflower.
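To make the linkage concrete, here is a minimal sketch of how the two workbooks relate. The rows are made up for illustration; only the column names (label, flower_class, id, flower_cls) are taken from my code below, and the merge is just one possible way to express the lookup, not what my code actually does.

```python
import pandas as pd

# Hypothetical rows; only the column names come from the real files.
label_csv = pd.DataFrame({
    "label": [0, 1, 2],
    "flower_class": ["rose", "sunflower", "tulip"],
})
train = pd.DataFrame({
    "id": [10, 11, 12],
    "flower_cls": ["sunflower", "rose", "sunflower"],
})

# Join on the class name so that every image id gets its integer label.
merged = train.merge(
    label_csv, left_on="flower_cls", right_on="flower_class", how="left"
)

# Map each image id to its integer label.
id_to_label = dict(zip(merged["id"], merged["label"]))
print(id_to_label)
```

With these made-up rows, image 10 (a sunflower) maps to the label that the first workbook assigns to sunflower.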
This is my code:
# Import relevant libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
import keras
from keras.models import Sequential
from keras.layers import Dense, Flatten
from sklearn.model_selection import train_test_split
from PIL import Image
# Load the csv files
# Workbook no.1
label_csv = pd.read_csv('/content/flowers_label.csv')
# Workbook no.2
train = pd.read_csv('/content/flowers_idx.csv')
# Sort the train csv by id, from 0 to 16464
train = train.sort_values('id').reset_index(drop=True)
# Creating inputs and targets
X = []  # images
y = []  # labels
base = "/content/flower_tpu/flower_tpu/flowers_google/flowers_google//"
row = 0
for idx in range(len(train)):
    # get the flower row
    flower = train.iloc[idx]
    # create flower path
    path = f"{base}{flower.id}.jpeg"
    # load image
    img = Image.open(path)
    # convert to numpy
    img = np.array(img)
    # save to X
    X.append(img)
    # get label
    label = label_csv[label_csv['flower_class'] == flower.flower_cls].label.values[0]
    # save to y
    y.append(label)
# Train Validation split
X_train, X_validation, y_train, y_validation = train_test_split(X, y, random_state=12, test_size=0.2)
# The model
output_size = 104
hidden_layer_size = 150
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(224, 224, 3)),
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'),
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'),
    tf.keras.layers.Dense(output_size, activation='softmax')
])
# Converting all data into ndarrays
X_train = np.asarray(X_train)
y_train = np.asarray(y_train)
X_validation = np.asarray(X_validation)
y_validation = np.asarray(y_validation)
# Compilation
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
# Fitting
model.fit(X_train, y_train, epochs=3, validation_data=(X_validation, y_validation), validation_steps=10, verbose=2)
The code runs, but both the train and validation accuracy are as poor as 6%. :/ How can I improve this code?