This wraps in-memory data as a CNTK MinibatchSource object (aka “reader”), used to feed the data into a TrainingSession.
MinibatchSourceFromData(data_streams, max_samples=cntk.io.INFINITELY_REPEAT)
Parameter | Description |
---|---|
data_streams | name-value pairs mapping stream names to the in-memory data (numpy arrays or scipy.sparse.csr_matrix objects) |
max_samples | maximum number of samples the reader will produce (defaults to `cntk.io.INFINITELY_REPEAT`); once reached, subsequent calls to `next_minibatch()` return empty minibatches |
Use this if your data is small enough to be loaded into RAM in its entirety, and the data is already sufficiently randomized.
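For concreteness, here is a minimal sketch (CNTK 2.x assumed; the stream names `x`/`y` and the toy arrays are illustrative, not prescribed by the API):

```python
import numpy as np
import cntk as C

N = 64
X = np.random.rand(N, 3).astype(np.float32)                   # 64 samples, 3 features each
Y = (X.sum(axis=1, keepdims=True) > 1.5).astype(np.float32)   # toy binary labels

# One named stream per input; the whole dataset lives in RAM.
source = C.io.MinibatchSourceFromData(dict(x=X, y=Y),
                                      max_samples=C.io.INFINITELY_REPEAT)
```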
While CNTK allows user code to iterate through minibatches by itself and feed data minibatch by minibatch through `train_minibatch()`, the standard way is to iterate through data using a MinibatchSource object. For example, the high-level TrainingSession interface, which manages a full training including checkpointing and cross validation, operates on this level.
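Continuing the sketch above, this is roughly how such a reader plugs into a TrainingSession; the tiny model, loss, and learner are illustrative stand-ins, not part of the MinibatchSourceFromData API:

```python
x = C.input_variable(3)
y = C.input_variable(1)
z = C.layers.Dense(1, activation=C.sigmoid)(x)                # toy model
loss = C.binary_cross_entropy(z, y)
learner = C.sgd(z.parameters, C.learning_rate_schedule(0.1, C.UnitType.minibatch))
trainer = C.Trainer(z, (loss, None), [learner])

# Map the model's input variables to the reader's streams and let the
# session drive the whole training loop.
C.training_session(
    trainer=trainer,
    mb_source=source,
    mb_size=16,
    model_inputs_to_streams={x: source.streams['x'], y: source.streams['y']},
    max_samples=N * 10,                                       # roughly ten sweeps over the data
).train()
```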
A MinibatchSource created as a MinibatchSourceFromData iterates linearly through the data provided by the caller as numpy arrays or scipy.sparse.csr_matrix objects, without randomization. The data is not copied, so if you want to modify the data while it is being read through a MinibatchSourceFromData, pass a copy instead.
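Continuing the same sketch, minibatches can also be pulled by hand via `next_minibatch()`; samples come back in linear, unrandomized order:

```python
mb = source.next_minibatch(3)                 # request 3 samples
features = mb[source.streams['x']]            # MinibatchData for stream 'x'
print(features.num_samples)                   # 3
print(features.data.asarray())                # rows of X, in their original order

# Because the arrays are not copied, pass X.copy() at construction time
# if X will be modified while the reader is in use.
```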
get_checkpoint_state: gets the current checkpoint state of the reader
next_minibatch: returns the next minibatch with the requested number of samples
restore_from_checkpoint: restores the reader's position from a checkpoint state
stream_infos: describes the streams this reader provides
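A short sketch of the checkpointing pair, again continuing with `source` from above; the returned state is treated as opaque and only round-tripped:

```python
state = source.get_checkpoint_state()         # remember the current read position
_ = source.next_minibatch(16)                 # read further into the data
source.restore_from_checkpoint(state)         # rewind to the remembered position
```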