Tensorflow - About mnist.train.next_batch()
11,368
Re 1, when shuffle=True
the order of examples in the data is randomized. Re 2, yes, it should respect whatever order the examples have in the numpy arrays.
Author by
YOON
Updated on June 04, 2022Comments
-
YOON almost 2 years
When I search about mnist.train.next_batch() I found this https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/learn/python/learn/datasets/mnist.py
In this code
def next_batch(self, batch_size, fake_data=False, shuffle=True): """Return the next `batch_size` examples from this data set.""" if fake_data: fake_image = [1] * 784 if self.one_hot: fake_label = [1] + [0] * 9 else: fake_label = 0 return [fake_image for _ in xrange(batch_size)], [ fake_label for _ in xrange(batch_size) ] start = self._index_in_epoch # Shuffle for the first epoch if self._epochs_completed == 0 and start == 0 and shuffle: perm0 = numpy.arange(self._num_examples) numpy.random.shuffle(perm0) self._images = self.images[perm0] self._labels = self.labels[perm0] # Go to the next epoch if start + batch_size > self._num_examples: # Finished epoch self._epochs_completed += 1 # Get the rest examples in this epoch rest_num_examples = self._num_examples - start images_rest_part = self._images[start:self._num_examples] labels_rest_part = self._labels[start:self._num_examples] # Shuffle the data if shuffle: perm = numpy.arange(self._num_examples) numpy.random.shuffle(perm) self._images = self.images[perm] self._labels = self.labels[perm] # Start next epoch start = 0 self._index_in_epoch = batch_size - rest_num_examples end = self._index_in_epoch images_new_part = self._images[start:end] labels_new_part = self._labels[start:end] return numpy.concatenate((images_rest_part, images_new_part), axis=0) , numpy.concatenate((labels_rest_part, labels_new_part), axis=0) else: self._index_in_epoch += batch_size end = self._index_in_epoch return self._images[start:end], self._labels[start:end]
I know that mnist.train.next_batch(batch_size=100) means it randomly pick 100 data from MNIST dataset. Now, Here's my question
- What is shuffle=true means?
- If I set next_batch(batch_size=100,fake_data=False, shuffle=False) then it picks 100 data from the start to the end of MNIST dataset sequentially? Not randomly?