classification

Python (numpy) crashes system with large number of array elements

这一生的挚爱 submitted on 2021-01-29 05:30:29
Question: I'm trying to build a basic character recognition model using the many classifiers that scikit-learn provides. The dataset is a standard handwritten set of alphanumeric samples (Chars74K image dataset taken from this source: EnglishHnd.tgz). There are 55 samples of each character (62 alphanumeric characters in all), each being 900x1200 pixels. After converting each image to grayscale, I'm flattening the matrix into a 1x1080000 array (each element representing a feature). for sample in sample_images: #
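A rough back-of-the-envelope sketch of why this setup can exhaust memory, assuming all samples are held at once in a single dense array. The sample counts and image size come from the question; the dtype sizes are the usual 1, 4 and 8 bytes per value, and the 30x40 downscaled size is a hypothetical choice for comparison only.

```python
# Rough memory estimate for one dense matrix holding every flattened image.
SAMPLES_PER_CHAR = 55
NUM_CHARS = 62
HEIGHT, WIDTH = 900, 1200


def dense_matrix_bytes(n_samples, n_features, bytes_per_value):
    """Bytes needed for an n_samples x n_features dense array."""
    return n_samples * n_features * bytes_per_value


n_samples = SAMPLES_PER_CHAR * NUM_CHARS   # one row per image
n_features = HEIGHT * WIDTH                # one column per pixel

for name, size in (("uint8", 1), ("float32", 4), ("float64", 8)):
    gib = dense_matrix_bytes(n_samples, n_features, size) / 2**30
    print(f"{name}: {gib:.1f} GiB")

# Hypothetical downscaled images (30x40 pixels) for comparison.
small_bytes = dense_matrix_bytes(n_samples, 30 * 40, 8)
print(f"30x40 float64: {small_bytes / 2**20:.1f} MiB")
```

If the full-resolution numbers exceed the available RAM, shrinking the images before flattening or using a smaller dtype are two options this estimate makes easy to compare.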

How to get the feature names in a different pipeline in sklearn in python

落爺英雄遲暮 submitted on 2021-01-28 18:05:38
Question: I am using the following code (source) to concatenate multiple feature extraction methods.
    from sklearn.pipeline import Pipeline, FeatureUnion
    from sklearn.model_selection import GridSearchCV
    from sklearn.svm import SVC
    from sklearn.datasets import load_iris
    from sklearn.decomposition import PCA
    from sklearn.feature_selection import SelectKBest
    iris = load_iris()
    X, y = iris.data, iris.target
    pca = PCA(n_components=2)
    selection = SelectKBest(k=1)
    # Build estimator from PCA and Univariate

How to fine-tune a keras model with existing plus newer classes?

南笙酒味 submitted on 2021-01-27 14:03:10
Question: Good day! I have a celebrity dataset on which I want to fine-tune a Keras built-in model. From what I have explored and done so far, we remove the top layers of the original model (or preferably, pass include_top=False), add our own layers, and then train the newly added layers while keeping the previous layers frozen. This approach is fairly intuitive. Now what I require is that my model learns to identify the celebrity faces, while also being able to detect all the other

How would I go about counting the amount of each alphanumerical in an array? (APL)

江枫思渺然 submitted on 2021-01-27 07:44:54
Question: I can't figure out how to take a matrix and count the occurrences of each alphanumeric value in each row. I will only be taking in matrices with the values I'm counting. For example, if I got: ABA455 7L9O36G DZLFPEI I would get something like A:2 B:1 4:1 5:2 for the first row, and each row would be counted independently. I would most like to understand the operators used, if you could please explain them too. Thank you. Answer 1: The following should work in any mainstream APL implementation. Let's
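As a cross-check of the expected output only, and not the APL solution the answer goes on to give, the per-row counting can be sketched in Python. The helper name `count_per_row` is hypothetical, and skipping spaces assumes that shorter rows of the matrix are padded with blanks.

```python
from collections import Counter


def count_per_row(rows):
    """Count each character in each row independently, ignoring padding spaces."""
    return [Counter(ch for ch in row if ch != " ") for row in rows]


rows = ["ABA455", "7L9O36G", "DZLFPEI"]
for counts in count_per_row(rows):
    # Counter keeps first-seen order, so the first row prints A:2 B:1 4:1 5:2
    print(" ".join(f"{ch}:{n}" for ch, n in counts.items()))
```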

Scikit-Learn Decision Tree: Probability of prediction being a or b?

半世苍凉 submitted on 2021-01-21 08:27:19
Question: I have a basic decision tree classifier with Scikit-Learn:
    # Used to determine men from women based on height and shoe size
    from sklearn import tree
    # height and shoe size
    X = [[65,9],[67,7],[70,11],[62,6],[60,7],[72,13],[66,10],[67,7.5]]
    Y = ["male","female","male","female","female","male","male","female"]
    # creating a decision tree
    clf = tree.DecisionTreeClassifier()
    # fitting the data to the tree
    clf.fit(X, Y)
    # predicting the gender of one sample (predict expects a 2D array: one row per sample)
    prediction = clf.predict([[68,9]])
    #print
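One way to get per-class probabilities instead of a single label is `predict_proba`, sketched here on the question's own data. The `random_state=0` argument is an addition for repeatability and is not in the original code.

```python
from sklearn import tree

# Height and shoe size, with labels, as in the question.
X = [[65, 9], [67, 7], [70, 11], [62, 6], [60, 7], [72, 13], [66, 10], [67, 7.5]]
Y = ["male", "female", "male", "female", "female", "male", "male", "female"]

clf = tree.DecisionTreeClassifier(random_state=0)  # random_state added for repeatability
clf.fit(X, Y)

# predict_proba expects a 2D array: one row per sample.
proba = clf.predict_proba([[68, 9]])[0]

# Columns of predict_proba follow the order of clf.classes_.
for label, p in zip(clf.classes_, proba):
    print(f"{label}: {p:.2f}")
```

A fully grown tree usually ends in pure leaves, so the probabilities tend to be 0 or 1. Limiting the tree (for example with `max_depth` or `min_samples_leaf`) leaves mixed leaves and gives fractional values.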

How to find the wrong predictions in Keras?

江枫思渺然 submitted on 2021-01-04 04:37:09
Question: I have built a Keras model for extracting information from raw text input. I am getting an accuracy of 0.9869. How can I find out which training samples are lowering the accuracy? I have pasted the code I am using below.
    import numpy as np
    from keras.models import Model, load_model
    from keras.layers import Input, Dense, LSTM, Activation, Bidirectional, Dot, Flatten
    from keras.callbacks import ModelCheckpoint
    x_nyha = np.load("data/x_nyha.npy")
    y_nyha = np.load("data/y/y_nyha
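A minimal sketch of one way to locate the mispredicted samples. It assumes the model outputs one score vector per sample and that the labels are integer class indices; the question's actual output shape is not shown, so this may need adapting. The arrays below are toy stand-ins for `model.predict(x)` and the true labels, and `misclassified_indices` is a hypothetical helper name.

```python
import numpy as np


def misclassified_indices(y_true, y_prob):
    """Return indices where the highest-scoring class differs from the true class."""
    y_pred = np.argmax(y_prob, axis=-1)
    return np.flatnonzero(y_pred != y_true)


# Toy stand-ins (hypothetical data) for model.predict(x) output and integer labels.
y_prob = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]])
y_true = np.array([0, 1, 1, 1])

wrong = misclassified_indices(y_true, y_prob)
print(wrong)  # indices of samples the model got wrong
```

The returned indices can then be used to pull the corresponding rows out of the input array for inspection.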

Is there class weight (or alternative way) for GradientBoostingClassifier in Sklearn when dealing with VotingClassifier or Grid search?

醉酒当歌 submitted on 2020-12-30 06:42:09
Question: I'm using GradientBoostingClassifier for my unbalanced labeled datasets. It seems that class weight doesn't exist as a parameter for this classifier in Sklearn. I see I can use sample_weight when fitting, but I cannot use it when I deal with VotingClassifier or GridSearch. Could someone help? Answer 1: Currently there isn't a way to use class_weights for GB in sklearn. Don't confuse this with sample_weight. Sample weights change the loss function and the score that you're trying to optimize. This is
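For reference, per-sample weights that mimic a "balanced" class weighting can be derived from the label frequencies. The sketch below uses the formula n_samples / (n_classes * class_count); the helper name `balanced_sample_weights` is hypothetical. How such weights are routed through VotingClassifier or a grid search depends on the scikit-learn version and is not shown here.

```python
from collections import Counter


def balanced_sample_weights(labels):
    """Weight each sample by n_samples / (n_classes * count of its class)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    class_weight = {c: n_samples / (n_classes * k) for c, k in counts.items()}
    return [class_weight[c] for c in labels]


# Toy imbalanced labels (hypothetical data): four samples of class 0, one of class 1.
y = [0, 0, 0, 0, 1]
weights = balanced_sample_weights(y)
print(weights)
```

The resulting list has one weight per sample, so it has the shape that the `sample_weight` argument of `fit` expects; rarer classes receive proportionally larger weights.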
