Wraps in-memory data as a CNTK MinibatchSource object (also called a “reader”), which feeds the data into a TrainingSession.

MinibatchSourceFromData(data_streams, max_samples = IO_INFINITELY_REPEAT)

Arguments

data_streams

The in-memory data to wrap, given as named streams of numpy arrays or scipy.sparse.csr_matrix objects.

max_samples

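The maximum number of samples to read from the source. The default, IO_INFINITELY_REPEAT, places no limit on the number of samples.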

Details

Use this if your data is small enough to be loaded into RAM in its entirety, and the data is already sufficiently randomized.
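Because the source itself does not shuffle, any randomization has to happen before the data is wrapped. The following is a minimal sketch of one way to do that in Python with NumPy, assuming features and labels are stored row-aligned in two arrays. The helper name shuffle_in_unison and the toy arrays are illustrative only and are not part of CNTK.

```python
import numpy as np


def shuffle_in_unison(features, labels, seed=0):
    """Return row-shuffled copies of both arrays, keeping rows paired.

    Hypothetical helper for illustration; not a CNTK function.
    """
    if features.shape[0] != labels.shape[0]:
        raise ValueError("features and labels need the same number of rows")
    rng = np.random.default_rng(seed)
    # One permutation applied to both arrays keeps each label with its row.
    order = rng.permutation(features.shape[0])
    return features[order], labels[order]


# Toy data: row i of `features` is [2*i, 2*i + 1], and its label is i.
features = np.arange(12, dtype=np.float32).reshape(6, 2)
labels = np.arange(6, dtype=np.float32).reshape(6, 1)

shuffled_x, shuffled_y = shuffle_in_unison(features, labels)
```

Indexing with a permutation produces new arrays, so the originals stay untouched and the shuffled copies can be handed to the reader.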

While CNTK allows user code to iterate through minibatches itself and feed the data one minibatch at a time through `train_minibatch()`, the standard way is to iterate through the data using a MinibatchSource object. For example, the high-level TrainingSession interface, which manages a full training run including checkpointing and cross validation, operates at this level.

A MinibatchSource created as a MinibatchSourceFromData iterates linearly, without randomization, through the data provided by the caller as numpy arrays or scipy.sparse.csr_matrix objects. The data is not copied, so if you want to modify the data while it is being read through a MinibatchSourceFromData, pass a copy instead.
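The two properties described above, in-order iteration and the absence of a copy, can be illustrated with a small NumPy-only stand-in. This is a sketch under stated assumptions, not CNTK's implementation: the class LinearMinibatcher is hypothetical, and its choices to shorten the last minibatch of a pass and to wrap around to the start are assumptions of the sketch.

```python
import numpy as np


class LinearMinibatcher:
    """Hypothetical stand-in that mimics in-order, no-copy iteration.

    Not CNTK code; end-of-pass truncation and wrap-around are assumptions.
    """

    def __init__(self, data_streams, max_samples=None):
        lengths = {arr.shape[0] for arr in data_streams.values()}
        if len(lengths) != 1 or 0 in lengths:
            raise ValueError("streams must be non-empty and equally long")
        self._streams = data_streams      # references only, nothing is copied
        self._num_rows = lengths.pop()
        self._max_samples = max_samples   # None means repeat without limit
        self._cursor = 0
        self._served = 0

    def next_minibatch(self, size):
        if self._max_samples is not None:
            size = min(size, self._max_samples - self._served)
        # Do not read past the end of the current pass over the data.
        size = min(size, self._num_rows - self._cursor)
        if size <= 0:
            return {}
        start, stop = self._cursor, self._cursor + size
        # Basic slicing returns views for numpy arrays, so no data is copied.
        batch = {name: arr[start:stop] for name, arr in self._streams.items()}
        self._cursor = stop % self._num_rows  # wrap to the start after a pass
        self._served += size
        return batch


x = np.arange(5, dtype=np.float32).reshape(5, 1)
source = LinearMinibatcher({"x": x})
x[0, 0] = 99.0  # modified after wrapping: the reader sees the change

batches = [source.next_minibatch(2)["x"].ravel().tolist() for _ in range(4)]
print(batches)  # → [[99.0, 1.0], [2.0, 3.0], [4.0], [99.0, 1.0]]
```

Passing `x.copy()` instead of `x` to the constructor would isolate the reader from the later assignment, which is the precaution the paragraph above recommends.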

See also

get_minibatch_checkpoint_state, next_minibatch, restore_mb_from_checkpoint, mb_stream_infos