How to use datasets.fetch_mldata() in sklearn?

半阙折子戏 2020-12-30 05:42

I am trying to run the following code for a brief machine learning algorithm:

import re
import argparse
import csv
from collections import Counter
from sklea         


        
11 Answers
  • 2020-12-30 06:07

    Apart from what @szymon has mentioned, you can alternatively load the dataset using:

    import urllib.request

    from scipy.io import loadmat

    # Mirror of the MNIST .mat file that fetch_mldata would normally download.
    mnist_alternative_url = "https://github.com/amplab/datascience-sp14/raw/master/lab7/mldata/mnist-original.mat"
    mnist_path = "./mnist-original.mat"

    # Download the file and save it next to the script.
    with urllib.request.urlopen(mnist_alternative_url) as response, open(mnist_path, "wb") as f:
        f.write(response.read())

    # Rebuild the dictionary layout that fetch_mldata returns.
    mnist_raw = loadmat(mnist_path)
    mnist = {
        "data": mnist_raw["data"].T,
        "target": mnist_raw["label"][0],
        "COL_NAMES": ["label", "data"],
        "DESCR": "mldata.org dataset: mnist-original",
    }
    
  • 2020-12-30 06:10

    I experienced the same issue and found that mnist-original.mat had a different file size at different times while I was on a poor WiFi connection. I switched to a wired LAN and it worked fine. It may be a networking issue.
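
One way to confirm the symptom described above (a file whose size changes between attempts) is to compare the file on disk with the size the server reports. The following is only a sketch: the helper names `remote_size` and `local_file_is_complete` are hypothetical, and it assumes the server sends a `Content-Length` header, which not every server does.

```python
import os
import urllib.request


def remote_size(url):
    """Return the size in bytes that the server reports for url, or None."""
    # A HEAD request asks for headers only, so the body is not downloaded again.
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request) as response:
        length = response.headers.get("Content-Length")
    return int(length) if length is not None else None


def local_file_is_complete(url, path):
    """Compare the size of the file at path with the size reported for url.

    Returns True or False, or None when the server does not report a size.
    """
    expected = remote_size(url)
    if expected is None:
        return None
    return os.path.getsize(path) == expected
```

If the check returns False, the download was most likely cut short, and fetching the file again over a more stable connection is the obvious next step.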

  • 2020-12-30 06:10

    If you don't pass data_home, the program looks for the file at ${yourprojectpath}/mldata/mnist-original.mat. You can download the file yourself and put it at that path.
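
The manual placement described above could be scripted roughly as follows. This is a sketch under assumptions: `place_in_mldata_cache` is a hypothetical helper, and the `data_home` directory is whatever folder you choose. When `data_home` is omitted, scikit-learn's documentation gives `~/scikit_learn_data` as the default cache location, so check which directory applies in your setup.

```python
import shutil
from pathlib import Path


def place_in_mldata_cache(downloaded_file, data_home):
    """Copy a manually downloaded .mat file into <data_home>/mldata/."""
    cache_dir = Path(data_home) / "mldata"
    # Create the cache folder if it does not exist yet.
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / "mnist-original.mat"
    shutil.copyfile(downloaded_file, target)
    return target
```

In scikit-learn versions that still provide fetch_mldata, calling `fetch_mldata('MNIST original', data_home=data_home)` with the same directory should then find the cached file instead of downloading it.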

  • 2020-12-30 06:11

    Try it like this:

    dataDict = fetch_mldata('MNIST original')
    

    This worked for me. Since you used the from ... import ... syntax, you shouldn't prefix the call with datasets.

  • 2020-12-30 06:12

    I also had this problem in the past. It happens because the dataset is quite large (about 55.4 MB). I ran fetch_mldata, but because of my internet connection it took a while to download everything. I did not realize this and interrupted the process.

    The partially downloaded dataset was corrupted, and that is why the error happened.
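
An interrupted download leaves a truncated file behind, so deleting the cached mnist-original.mat and downloading again is the direct fix. To avoid the problem when downloading by hand, one option is to write to a temporary file and rename it only after the download finishes. This is a sketch, not what fetch_mldata itself does; `download_atomically` is a hypothetical helper.

```python
import os
import tempfile
import urllib.request


def download_atomically(url, dest_path, chunk_size=1 << 16):
    """Download url to dest_path without ever leaving a partial file there."""
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    # The temporary file lives in the same directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as tmp, urllib.request.urlopen(url) as response:
            while True:
                chunk = response.read(chunk_size)
                if not chunk:
                    break
                tmp.write(chunk)
        # Only a fully written file is moved into place.
        os.replace(tmp_path, dest_path)
    except BaseException:
        # Covers Ctrl+C (KeyboardInterrupt) as well as network errors.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return dest_path
```

With this approach, an interrupted run removes the temporary file and leaves dest_path either absent or holding the previous complete copy.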
