Extended MNIST

class olympus.datasets.emnist.BalancedEMNIST(data_path)[source]

Bases: olympus.datasets.dataset.AllDataset

The MNIST database was derived from a larger dataset known as NIST Special Database 19, which contains handwritten digits and uppercase and lowercase letters. Extended MNIST (EMNIST), introduced in [1], is a variant of the full NIST dataset that follows the same conversion paradigm used to create the MNIST dataset. The result is a set of datasets that constitute more challenging classification tasks involving letters and digits. More details are available in the arXiv paper [1].

See also MNIST and FashionMNIST

References

[1] Gregory Cohen, Saeed Afshar, Jonathan Tapson, André van Schaik. “EMNIST: an extension of MNIST to handwritten letters”, Mar 2017.

Attributes:
classes: List[int]

Mapping from each sample index to its class

input_shape: (28, 28)

Shape of a single sample stored in this dataset (height, width)

target_shape: (47,)

The dataset is composed of 47 classes: 10 digits and 37 letters
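
As a rough illustration of how 10 digits and 37 letters add up to 47 classes, the sketch below builds a label table following the balanced mapping described in the EMNIST paper [1]: all 26 uppercase letters, plus the 11 lowercase letters that are not merged with their uppercase form. The names `BALANCED_LABELS` and `label_to_char` are hypothetical and are not part of olympus; the index order (digits, then uppercase, then lowercase) is an assumption and should be checked against the mapping file shipped with the dataset.

```python
import string

# Hypothetical label table, not an olympus API.
# Assumed order: digits, uppercase letters, then the unmerged lowercase letters.
DIGITS = list(string.digits)                # 10 classes: '0'..'9'
UPPERCASE = list(string.ascii_uppercase)    # 26 classes: 'A'..'Z'
UNMERGED_LOWERCASE = list("abdefghnqrt")    # 11 classes kept separate from uppercase

BALANCED_LABELS = DIGITS + UPPERCASE + UNMERGED_LOWERCASE


def label_to_char(index):
    """Return the character assumed to correspond to a class index in [0, 47)."""
    return BALANCED_LABELS[index]


if __name__ == "__main__":
    letters = len(UPPERCASE) + len(UNMERGED_LOWERCASE)
    print(len(DIGITS), "digits +", letters, "letters =", len(BALANCED_LABELS), "classes")
```

The remaining 15 lowercase letters share a class with their uppercase counterpart, which is why the letter count is 37 rather than 52.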

train_size: 94000

Size of the train dataset

valid_size: 18800

Size of the validation dataset

test_size: 18800

Size of the test dataset
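
The three sizes above sum to 131,600 samples. The sketch below checks that arithmetic and shows one way the splits could be laid out as contiguous index ranges. The contiguous train/valid/test ordering and the names `SPLIT_SIZES` and `split_ranges` are assumptions made for illustration; they do not describe how olympus actually assigns samples to splits. The per-class counts hold only if each split is class-balanced.

```python
NUM_CLASSES = 47

# Sizes taken from the attributes documented above.
SPLIT_SIZES = {"train": 94_000, "valid": 18_800, "test": 18_800}


def split_ranges(sizes):
    """Assign each split a contiguous index range, in dictionary order (an assumption)."""
    ranges = {}
    start = 0
    for name, size in sizes.items():
        ranges[name] = range(start, start + size)
        start += size
    return ranges


RANGES = split_ranges(SPLIT_SIZES)
TOTAL = sum(SPLIT_SIZES.values())

# Samples per class in each split, assuming the splits are class-balanced.
PER_CLASS = {name: size // NUM_CLASSES for name, size in SPLIT_SIZES.items()}

if __name__ == "__main__":
    print("total samples:", TOTAL)
    for name, r in RANGES.items():
        print(name, r.start, r.stop, "per class:", PER_CLASS[name])
```

Every split size is an exact multiple of 47, which is consistent with a balanced dataset in which each class is equally represented.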

Methods

categories()    Dataset tags, used to select datasets suited to a given task
transforms()
register_datapipe_as_function  
register_function  
static categories()[source]

Dataset tags, used to select datasets suited to a given task