Introduction¶
This page explains the basics of the QCoDeS dataset. Read here to figure out what the heck is going on.
Basics¶
The DataSet is directly linked to an SQLite database on the host machine’s hard drive. Here we provide an overview of the layout of that database. The specifics of the python objects implementing the API on top of that can be found in the section Dataset Design.
The database holds a number of Experiments
that each holds a number of DataSets
. One DataSet
contains exactly one run
, and we may use the words DataSet
and run
interchangeably.
Experiment¶
The intended use of the Experiment
is to group DataSets
that belong together. For example, an experiment may contain all the measurements necessary to characterise a particular sample (Is it superconducting? What is the critical current?). Once the goodness of the sample has been established, a new experiment containing all the “real” measurements can then commence. The grouping runs this way is offered as a convenience and is as such not essential for any functionality in QCoDeS.
Each Experiment
has the following attributes:
An experiment-ID
A name
A sample-name
A start time
(potentially) An end time
The experiment-ID is an ever-increasing integer enumerating the number of experiments in the current database. The experiments are kept in the SQLite database in the experiments
table. The experiment-ID is the primary key of this table.
The name and the sample name of an experiment must be provided by the user at creation time.
The start time is automatically added upon creation, and the end time is added when (if ever) the user decides to mark the experiment as completed after which point no more runs may be added.
For an example notebook showing the usage of the database as a container for experiments, see the Experiment Container Notebook.
DataSet¶
Each DataSet
has the following attributes:
A run-ID
An experiment-ID
A name
A table of results
A number of parameters
A start time
(potentially) An end time
A GUID
The run-ID is an ever-increasing integer enumerating the number of runs in the current database. The runs are kept in the SQLite database in the runs
table. The run-ID is the primary key of this table.
The experiment-ID provides the link to the experiment that this dataset belongs to.
The name must be provided by the user at creation time.
The table of results is where the actual data of the DataSet
resides. The table has a number of associated parameters as columns with data points as rows. The parameters may be QCoDeS Parameters
, but are not limited to that. The preferred procedure for associating parameters with a run and adding data is via the Measurement
object as described in the Measurement Object Example Notebook. The deep thinking behind how different parameters relate to and depend on each other within a DataSet
is explained in Interdependent Parameters.
The start time is automatically added upon creation, and the end time is added when (if ever) the user decides to mark the run as completed after which point no more data points may be added.
The GUID is a run identifier that aims to go beyond the current database. The motivation for the GUID is that collaboration and data sharing across several machines requires a run to be labelled by something more than its name and number in the local database. The GUID is a standard 36 character string GUID composed from three integer codes and a timestamp. The codes are: location code (1-256), work station code (1-16777216), sample code (1-4294967296). The idea is that the people in a certain collaboration will label their locations, machines, and samples according to a scheme agreed upon within the relevant collaboration. That way, each run is uniquely identifiable in a human-readable way. This extra label for runs is offered as a convenience as is as such not essential for any functionality in QCoDeS.
For an example notebook showing details of DataSet
class and its usage, see the DataSet class walkthrough notebook and Accessing data in DataSet notebook.