This page was generated from
docs/examples/DataSet/Extracting-runs-from-one-DB-file-to-another.ipynb.
Extracting runs from one DB file to another
This notebook shows how to use the extract_runs_into_db function to extract runs from a database (DB) file (the source DB) into another DB file (the target DB). If the target DB does not exist, it will be created. The runs are NOT removed from the source DB file; they are copied over.
Setup
Let us set up a DB file with some runs in it.
[1]:
from pathlib import Path
import numpy as np
from qcodes.dataset import (
    Measurement,
    connect,
    extract_runs_into_db,
    load_experiment_by_name,
    load_or_create_experiment,
)
from qcodes.instrument_drivers.mock_instruments import DummyInstrument
from qcodes.station import Station
[2]:
source_path = Path.cwd().parent / "example_output" / "extract_runs_notebook_source.db"
target_path = Path.cwd().parent / "example_output" / "extract_runs_notebook_target.db"
[3]:
source_conn = connect(source_path)
target_conn = connect(target_path)
[4]:
exp = load_or_create_experiment(
    experiment_name="extract_runs_experiment", sample_name="no_sample", conn=source_conn
)
my_inst = DummyInstrument("my_inst", gates=["voltage", "current"])
station = Station(my_inst)
[5]:
meas = Measurement(exp=exp)
meas.register_parameter(my_inst.voltage)
meas.register_parameter(my_inst.current, setpoints=(my_inst.voltage,))
# Add 10 runs with gradually more and more data:
# run number n contains n data points
for run_id in range(1, 11):
    with meas.run() as datasaver:
        for step, noise in enumerate(np.random.randn(run_id)):
            datasaver.add_result((my_inst.voltage, step), (my_inst.current, noise))
Starting experimental run with id: 1.
Starting experimental run with id: 2.
Starting experimental run with id: 3.
Starting experimental run with id: 4.
Starting experimental run with id: 5.
Starting experimental run with id: 6.
Starting experimental run with id: 7.
Starting experimental run with id: 8.
Starting experimental run with id: 9.
Starting experimental run with id: 10.
Extraction
Now let us extract runs 3 and 7 into our desired target DB file. All runs must come from the same experiment. To extract runs from different experiments, one may call the function several times.
The function will look in the target DB to see if an experiment with matching attributes already exists. If not, such an experiment is created.
[6]:
extract_runs_into_db(source_path, target_path, 3, 7)
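Because each call accepts runs from a single experiment, extracting from several experiments amounts to one call per experiment. The following is a minimal sketch of that loop and is not part of QCoDeS: extract_by_experiment and the runs_by_experiment mapping are hypothetical names, and the only QCoDeS call assumed is extract_runs_into_db(source, target, *run_ids) as used in this notebook. The extract argument exists only so that the helper can be tried with a stand-in function.

```python
from collections.abc import Iterable, Mapping
from pathlib import Path
from typing import Callable, Optional


def extract_by_experiment(
    source: Path,
    target: Path,
    runs_by_experiment: Mapping[str, Iterable[int]],
    extract: Optional[Callable[..., None]] = None,
) -> list[str]:
    """Call the extraction function once per experiment (hypothetical helper).

    runs_by_experiment maps an experiment name to the run IDs (in the
    source DB) that belong to it; the caller is responsible for that grouping.
    Returns the names of the experiments for which a call was made.
    """
    if extract is None:
        # Imported lazily so the helper can be exercised without QCoDeS.
        from qcodes.dataset import extract_runs_into_db as extract

    processed = []
    for name, run_ids in runs_by_experiment.items():
        ids = sorted(set(run_ids))  # drop duplicates, keep a stable order
        if not ids:
            continue  # nothing to extract for this experiment
        extract(source, target, *ids)  # one call per experiment
        processed.append(name)
    return processed
```

With the paths from this notebook, a call could look like extract_by_experiment(source_path, target_path, {"extract_runs_experiment": [3, 7]}).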
[7]:
target_exp = load_experiment_by_name(name="extract_runs_experiment", conn=target_conn)
[8]:
target_exp
[8]:
extract_runs_experiment#no_sample#1@/home/runner/work/Qcodes/Qcodes/docs/examples/example_output/extract_runs_notebook_target.db
--------------------------------------------------------------------------------------------------------------------------------
1-results-1-my_inst_voltage,my_inst_current-3
2-results-2-my_inst_voltage,my_inst_current-7
The last number printed in each line is the number of data points. As expected, we get 3 and 7.
Note that the runs will have different run_ids in the new database. Their GUIDs are, however, the same (as they must be).
[9]:
exp.data_set(3).guid
[9]:
'47db1f24-0000-0000-0000-019a154ca305'
[10]:
target_exp.data_set(1).guid
[10]:
'47db1f24-0000-0000-0000-019a154ca305'
Furthermore, note that the original run_id is preserved as captured_run_id. We will demonstrate below how to look up data via the captured_run_id.
[11]:
target_exp.data_set(1).captured_run_id
[11]:
3
Merging data from 2 databases
There are occasions where it is convenient to combine data from several databases.
Let’s first demonstrate this by creating some new experiments in another db file.
[12]:
extra_source_path = (
    Path.cwd().parent / "example_output" / "extract_runs_notebook_source_aux.db"
)
[13]:
source_extra_conn = connect(extra_source_path)
[14]:
exp = load_or_create_experiment(
    experiment_name="extract_runs_experiment_aux",
    sample_name="no_sample",
    conn=source_extra_conn,
)
[15]:
meas = Measurement(exp=exp)
meas.register_parameter(my_inst.current)
meas.register_parameter(my_inst.voltage, setpoints=(my_inst.current,))
# Add 10 runs with gradually more and more data:
# run number n contains n data points
for run_id in range(1, 11):
    with meas.run() as datasaver:
        for step, noise in enumerate(np.random.randn(run_id)):
            datasaver.add_result((my_inst.current, step), (my_inst.voltage, noise))
Starting experimental run with id: 1.
Starting experimental run with id: 2.
Starting experimental run with id: 3.
Starting experimental run with id: 4.
Starting experimental run with id: 5.
Starting experimental run with id: 6.
Starting experimental run with id: 7.
Starting experimental run with id: 8.
Starting experimental run with id: 9.
Starting experimental run with id: 10.
[16]:
exp.data_set(3).guid
[16]:
'629d00df-0000-0000-0000-019a154ca414'
[17]:
extract_runs_into_db(extra_source_path, target_path, 1, 3)
[18]:
target_exp_aux = load_experiment_by_name(
    name="extract_runs_experiment_aux", conn=target_conn
)
The GUID should be preserved.
[19]:
target_exp_aux.data_set(2).guid
[19]:
'629d00df-0000-0000-0000-019a154ca414'
And the original run_id is preserved as captured_run_id:
[20]:
target_exp_aux.data_set(2).captured_run_id
[20]:
3
Uniquely identifying and loading runs
As runs move from one database to the other, uniquely identifying a run becomes non-trivial.
Note how we now have 2 runs in the same DB sharing the same captured_run_id. This means that captured_run_id is not a unique key. We can demonstrate this by looking up the GUIDs that match this captured_run_id.
[21]:
from qcodes.dataset import get_guids_by_run_spec, load_by_guid, load_by_run_spec
[22]:
guids = get_guids_by_run_spec(conn=target_conn, captured_run_id=3)
guids
[22]:
['47db1f24-0000-0000-0000-019a154ca305',
'629d00df-0000-0000-0000-019a154ca414']
[23]:
load_by_guid(guids[0], conn=target_conn)
[23]:
results #1@/home/runner/work/Qcodes/Qcodes/docs/examples/example_output/extract_runs_notebook_target.db
-------------------------------------------------------------------------------------------------------
my_inst_voltage - numeric
my_inst_current - numeric
[24]:
load_by_guid(guids[1], conn=target_conn)
[24]:
results #4@/home/runner/work/Qcodes/Qcodes/docs/examples/example_output/extract_runs_notebook_target.db
-------------------------------------------------------------------------------------------------------
my_inst_current - numeric
my_inst_voltage - numeric
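The collision seen above follows directly from how runs are copied: the target DB assigns a fresh run_id, while the GUID and the captured_run_id travel with the run. The sketch below reproduces that bookkeeping with the standard-library sqlite3 module. It is a toy model under stated assumptions: the runs table, its columns, and the helpers make_db, add_run and copy_run are invented for illustration and do not reflect the actual QCoDeS schema or implementation.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory DB with a simplified, hypothetical runs table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE runs ("
        "run_id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "guid TEXT UNIQUE, captured_run_id INTEGER, experiment TEXT)"
    )
    return conn


def add_run(conn: sqlite3.Connection, guid: str, experiment: str) -> None:
    """Record a freshly measured run; captured_run_id equals its run_id."""
    cur = conn.execute(
        "INSERT INTO runs (guid, experiment) VALUES (?, ?)", (guid, experiment)
    )
    conn.execute(
        "UPDATE runs SET captured_run_id = run_id WHERE run_id = ?",
        (cur.lastrowid,),
    )


def copy_run(source: sqlite3.Connection, target: sqlite3.Connection, run_id: int) -> None:
    """Copy one run: guid and captured_run_id are kept, run_id is reassigned."""
    guid, captured, experiment = source.execute(
        "SELECT guid, captured_run_id, experiment FROM runs WHERE run_id = ?",
        (run_id,),
    ).fetchone()
    target.execute(
        "INSERT INTO runs (guid, captured_run_id, experiment) VALUES (?, ?, ?)",
        (guid, captured, experiment),
    )


# Two independent source DBs, each with runs 1..3, merged into one target DB.
source_a, source_b, target = make_db(), make_db(), make_db()
for i in range(1, 4):
    add_run(source_a, f"guid-a-{i}", "exp_a")
    add_run(source_b, f"guid-b-{i}", "exp_b")

copy_run(source_a, target, 3)
copy_run(source_b, target, 3)

rows = target.execute(
    "SELECT run_id, guid, captured_run_id FROM runs "
    "WHERE captured_run_id = 3 ORDER BY run_id"
).fetchall()
print(rows)  # [(1, 'guid-a-3', 3), (2, 'guid-b-3', 3)]
```

In this model both rows share captured_run_id 3, while run_id and guid remain distinct, which is why the GUID (or captured_run_id combined with further metadata) is needed to single out a run.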
To enable loading of runs that may share the same captured_run_id, the function load_by_run_spec is supplied. This function takes one or more optional pieces of metadata. If more than one run matching this information is found, the metadata of the matching runs is printed and an error is raised. It is then possible to supply more information to the function to uniquely identify a specific run.
[25]:
try:
    load_by_run_spec(captured_run_id=3, conn=target_conn)
except NameError:
    print("Caught a NameError")
  captured_run_id    captured_counter  experiment_name              sample_name      location    work_station
-----------------  ------------------  ---------------------------  -------------  ----------  --------------
                3                   3  extract_runs_experiment      no_sample               0               0
                3                   3  extract_runs_experiment_aux  no_sample               0               0
Caught a NameError
To single out one of these two runs, we can thus specify the experiment_name:
[26]:
load_by_run_spec(
    captured_run_id=3, experiment_name="extract_runs_experiment_aux", conn=target_conn
)
[26]:
results #4@/home/runner/work/Qcodes/Qcodes/docs/examples/example_output/extract_runs_notebook_target.db
-------------------------------------------------------------------------------------------------------
my_inst_current - numeric
my_inst_voltage - numeric
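When every matching run is wanted rather than exactly one, the two lookups shown earlier (get_guids_by_run_spec followed by load_by_guid) can be combined. The helper below is a hedged sketch, not a QCoDeS function: load_all_matching is a hypothetical name, and it assumes only the call shapes used in this notebook, namely get_guids_by_run_spec(conn=..., captured_run_id=...) and load_by_guid(guid, conn=...). The get_guids and load arguments are there so the logic can be tried with stand-ins.

```python
from typing import Any, Callable, Optional


def load_all_matching(
    conn: Any,
    get_guids: Optional[Callable[..., list]] = None,
    load: Optional[Callable[..., Any]] = None,
    **run_spec: Any,
) -> list:
    """Load every run matching run_spec instead of raising on ambiguity.

    Hypothetical helper: run_spec is forwarded unchanged to the GUID lookup,
    and each returned GUID is then loaded individually.
    """
    if get_guids is None or load is None:
        # Imported lazily so the helper can be exercised without QCoDeS.
        from qcodes.dataset import get_guids_by_run_spec, load_by_guid

        get_guids = get_guids or get_guids_by_run_spec
        load = load or load_by_guid
    return [load(guid, conn=conn) for guid in get_guids(conn=conn, **run_spec)]


# Example with the objects from this notebook (requires QCoDeS):
# datasets = load_all_matching(target_conn, captured_run_id=3)
```

An empty list then signals that nothing matched, and a list with more than one entry signals that the specification was ambiguous.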
Related functionality: export_datasets_and_create_metadata_db
QCoDeS also provides another function related to database operations: qcodes.dataset.export_datasets_and_create_metadata_db. This function:
Exports all datasets from a source database to NetCDF files
Creates a new metadata-only database (without raw data)
Provides space-efficient storage by offloading data to NetCDF files
This is useful when you want to reduce database file sizes while maintaining all metadata in a database format. See the API documentation for more details on usage and parameters.