Scikit-learn - Cannot load MNIST Original dataset using fetch_openml in Python
Solution 1
Try:
from sklearn.datasets import fetch_openml
mnist = fetch_openml('mnist_784')
I found the dataset name by searching https://www.openml.org/; the dataset page is https://www.openml.org/d/554
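The dataset page URL ends in the numeric OpenML ID, and fetch_openml also accepts that ID through its data_id parameter instead of a name. Below is a minimal sketch of that route. The helper names openml_id_from_url and fetch_by_page_url are hypothetical, introduced only for illustration, and the download itself needs network access.

```python
import re


def openml_id_from_url(url):
    """Extract the numeric dataset ID from an OpenML page URL of the form .../d/<id>.

    Hypothetical helper for illustration; it is not part of scikit-learn.
    """
    match = re.search(r"/d/(\d+)", url)
    if match is None:
        raise ValueError("no OpenML dataset ID found in %r" % url)
    return int(match.group(1))


def fetch_by_page_url(url):
    """Download the dataset behind an OpenML page URL (requires network access)."""
    # Imported lazily so the ID parsing above works without scikit-learn installed.
    from sklearn.datasets import fetch_openml

    return fetch_openml(data_id=openml_id_from_url(url))


# Usage (downloads the data on the first call):
# mnist = fetch_by_page_url("https://www.openml.org/d/554")
# print(mnist.data.shape)
```

Fetching by ID avoids any ambiguity about the dataset name, at the cost of a less readable call.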
Solution 2
You can use:
from sklearn.datasets import fetch_openml
mnist = fetch_openml('mnist_784', version=1)
Solution 3
The older fetch_mldata() function downloaded datasets from mldata.org, which was unstable and often unreachable. fetch_openml() downloads from openml.org instead, but it can also fail when the server cannot be reached. An alternative is to download the dataset manually. You can download the data from Kaggle (MNIST data) and run the following code:
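from scipy.io import loadmat
# Adjust the path to wherever the downloaded .mat file is stored
mnist = loadmat("../input/mnist-original.mat")
# Transpose the data array so that each row is one sample
mnist_data = mnist["data"].T
# The labels are stored as a 2-D array; take its first row
mnist_label = mnist["label"][0]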
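If the downloaded file does not contain the variables the snippet expects, indexing fails with a bare KeyError. The sketch below wraps the same steps in a function and checks for the "data" and "label" variables first. The function names check_mnist_keys and load_mnist_mat are hypothetical, and the file path is whatever location you saved the download to.

```python
def check_mnist_keys(mat):
    """Raise KeyError if the loaded dict lacks the 'data' or 'label' variables.

    Hypothetical helper; loadmat also adds metadata keys that start with '__',
    which are left out of the error message.
    """
    missing = [key for key in ("data", "label") if key not in mat]
    if missing:
        found = sorted(key for key in mat if not key.startswith("__"))
        raise KeyError("missing %s; file contains %s" % (missing, found))


def load_mnist_mat(path):
    """Load a manually downloaded MNIST .mat file and return (data, labels)."""
    # Imported lazily so the key check above can be used without SciPy installed.
    from scipy.io import loadmat

    mat = loadmat(path)
    check_mnist_keys(mat)
    data = mat["data"].T      # transpose, as in the snippet above
    labels = mat["label"][0]  # first row of the 2-D label array
    return data, labels


# Usage (the path is an assumption; point it at your own copy of the file):
# mnist_data, mnist_label = load_mnist_mat("../input/mnist-original.mat")
# print(mnist_data.shape, mnist_label.shape)
```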
Solution 4
fetch_mldata has been deprecated since scikit-learn v0.20; use fetch_openml instead.
Check the sklearn version
import sklearn
print(sklearn.__version__)
Import Dataset
from sklearn.datasets import fetch_openml
X, y = fetch_openml('mnist_784', version=1, return_X_y=True)
Example
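The two steps above can be combined into one function that checks the installed version before attempting the download. This is only a sketch: the helper names supports_fetch_openml and load_mnist are hypothetical, and it assumes fetch_openml is available from scikit-learn 0.20 onwards, the release in which fetch_mldata was deprecated.

```python
import re


def supports_fetch_openml(version):
    """Return True if a scikit-learn version string is 0.20 or newer.

    Hypothetical helper; only the leading 'major.minor' numbers are compared.
    """
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % version)
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= (0, 20)


def load_mnist():
    """Return (X, y) for MNIST via fetch_openml (requires network access)."""
    import sklearn

    if not supports_fetch_openml(sklearn.__version__):
        raise RuntimeError(
            "scikit-learn %s is too old for fetch_openml; upgrade to 0.20 or newer"
            % sklearn.__version__
        )
    from sklearn.datasets import fetch_openml

    return fetch_openml('mnist_784', version=1, return_X_y=True)


# Usage (downloads the data on the first call):
# X, y = load_mnist()
# print(X.shape, y.shape)
```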
Inglorion
Updated on June 27, 2022
I'm trying to load the MNIST Original dataset in Python. The sklearn.datasets.fetch_openml function doesn't seem to work for this. Here is the code I'm using:
from sklearn.datasets import fetch_openml
dataset = fetch_openml("MNIST Original")
I get this error:
  File "generateClassifier.py", line 11, in <module>
    dataset = fetch_openml("MNIST Original")
  File "/home/inglorion/.local/lib/python3.5/site-packages/sklearn/datasets/openml.py", line 526, in fetch_openml
    data_info = _get_data_info_by_name(name, version, data_home)
  File "/home/inglorion/.local/lib/python3.5/site-packages/sklearn/datasets/openml.py", line 302, in _get_data_info_by_name
    data_home)
  File "/home/inglorion/.local/lib/python3.5/site-packages/sklearn/datasets/openml.py", line 169, in _get_json_content_from_openml_api
    raise error
  File "/home/inglorion/.local/lib/python3.5/site-packages/sklearn/datasets/openml.py", line 164, in _get_json_content_from_openml_api
    return _load_json()
  File "/home/inglorion/.local/lib/python3.5/site-packages/sklearn/datasets/openml.py", line 52, in wrapper
    return f()
  File "/home/inglorion/.local/lib/python3.5/site-packages/sklearn/datasets/openml.py", line 160, in _load_json
    with closing(_open_openml_url(url, data_home)) as response:
  File "/home/inglorion/.local/lib/python3.5/site-packages/sklearn/datasets/openml.py", line 109, in _open_openml_url
    with closing(urlopen(req)) as fsrc:
  File "/usr/lib/python3.5/urllib/request.py", line 163, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.5/urllib/request.py", line 472, in open
    response = meth(req, response)
  File "/usr/lib/python3.5/urllib/request.py", line 582, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python3.5/urllib/request.py", line 510, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.5/urllib/request.py", line 444, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.5/urllib/request.py", line 590, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 400: Bad Request
How can I fix this? Alternately, is there any other way to load the MNIST dataset into Python?
I'm using version 0.20.2 of scikit-learn. I'm relatively new to programming in general, so I would appreciate it if I could get a simple answer. Thanks!