Coverage for mlos_bench/mlos_bench/storage/__init__.py: 100%
3 statements
coverage.py v7.6.9, created at 2024-12-20 00:44 +0000
#
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
#
"""
Interfaces to the storage backends for mlos_bench.

Storage backends (for instance :py:mod:`~mlos_bench.storage.sql`) are used to store
and retrieve the results of experiments and implement a persistent queue for
:py:mod:`~mlos_bench.schedulers`.

The :py:class:`~mlos_bench.storage.base_storage.Storage` class is the main interface
and provides the ability to:

- Create a new :py:class:`~.Storage.Experiment`, or reload an existing one, with
  one or more associated :py:class:`~.Storage.Trial` instances, which are used by
  the :py:mod:`~mlos_bench.schedulers` during ``mlos_bench`` run time to execute
  *Trials*.

  In MLOS terms, an *Experiment* is a group of *Trials* that share the same scripts
  and target system.

  A *Trial* is a single run of the target system with a specific *Configuration*
  (e.g., set of tunable parameter values).
  (Note: other systems may call this a *sample*.)

- Retrieve the :py:class:`~mlos_bench.storage.base_trial_data.TrialData` results
  with the :py:attr:`~mlos_bench.storage.base_experiment_data.ExperimentData.trials`
  property on a :py:class:`~mlos_bench.storage.base_experiment_data.ExperimentData`
  instance via the :py:class:`~.Storage` instance's
  :py:attr:`~mlos_bench.storage.base_storage.Storage.experiments` property.

  These can be especially useful with :py:mod:`mlos_viz` for interactive exploration
  in a Jupyter Notebook interface, for instance.

The :py:func:`.from_config` :py:mod:`.storage_factory` function can be used to get a
:py:class:`.Storage` instance from a
:py:attr:`~mlos_bench.config.schemas.config_schemas.ConfigSchema.STORAGE` type JSON
config.

Example
-------

Here's a very basic example of the Storage APIs.

>>> # Create a new storage object from a JSON config.
>>> # Normally, we'd load these from a file, but for this example we'll use a string.
>>> global_config = '''
... {
...     // Additional global configuration parameters can be added here.
...     /* For instance:
...     "storage_host": "some-remote-host",
...     "storage_user": "mlos_bench",
...     "storage_pass": "SuperSecretPassword",
...     */
... }
... '''
>>> storage_config = '''
... {
...     "class": "mlos_bench.storage.sql.storage.SqlStorage",
...     "config": {
...         // Don't create the schema until we actually need it.
...         // (helps speed up initial launch and tests)
...         "lazy_schema_create": true,
...         // Parameters below must match kwargs of `sqlalchemy.URL.create()`:
...         // Normally, we'd specify a real database, but for testing examples
...         // we'll use an in-memory one.
...         "drivername": "sqlite",
...         "database": ":memory:"
...         // Otherwise we might use something like the following
...         // to pull the values from the globals:
...         /*
...         "host": "$storage_host",
...         "username": "$storage_user",
...         "password": "$storage_pass",
...         */
...     }
... }
... '''
>>> from mlos_bench.storage import from_config
>>> storage = from_config(storage_config, global_configs=[global_config])
>>> storage
sqlite::memory:
>>> #
>>> # Internally, mlos_bench will use this config and storage backend to track
>>> # Experiments and Trials it creates.
>>> # Most users won't need to do that, but it works something like the following:
>>> # Create a new experiment with a single trial.
>>> # (Normally, we'd use a real environment config, but for this example we'll use a string.)
>>> #
>>> # Create a dummy tunable group.
>>> from mlos_bench.services.config_persistence import ConfigPersistenceService
>>> config_persistence_service = ConfigPersistenceService()
>>> tunables_config = '''
... {
...     "param_group": {
...         "cost": 1,
...         "params": {
...             "param1": {
...                 "type": "int",
...                 "range": [0, 100],
...                 "default": 50
...             }
...         }
...     }
... }
... '''
>>> tunables = config_persistence_service.load_tunables([tunables_config])
>>> from mlos_bench.environments.status import Status
>>> from datetime import datetime
>>> with storage.experiment(
...     experiment_id="my_experiment_id",
...     trial_id=1,
...     root_env_config="root_env_config_info",
...     description="some description",
...     tunables=tunables,
...     opt_targets={"objective_metric": "min"},
... ) as experiment:
...     # Create a dummy trial.
...     trial = experiment.new_trial(tunables=tunables)
...     # Pretend something ran with that trial and we have the results now.
...     _ = trial.update(Status.SUCCEEDED, datetime.now(), {"objective_metric": 42})
>>> #
>>> # Now, once there's data to look at, in a Jupyter notebook or similar,
>>> # we can also use the storage object to view the results.
>>> #
>>> storage.experiments
{'my_experiment_id': Experiment :: my_experiment_id: 'some description'}
>>> # Access ExperimentData by experiment id.
>>> experiment_data = storage.experiments["my_experiment_id"]
>>> experiment_data.trials
{1: Trial :: my_experiment_id:1 cid:1 SUCCEEDED}
>>> # Access TrialData for an Experiment by trial id.
>>> trial_data = experiment_data.trials[1]
>>> assert trial_data.status == Status.SUCCEEDED
>>> # Retrieve the tunable configuration from the TrialData as a dictionary.
>>> trial_config_data = trial_data.tunable_config
>>> trial_config_data.config_dict
{'param1': 50}
>>> # Retrieve the results from the TrialData as a dictionary.
>>> trial_data.results_dict
{'objective_metric': 42}
>>> # Retrieve the results of all Trials in the Experiment as a DataFrame.
>>> experiment_data.results_df.columns.tolist()
['trial_id', 'ts_start', 'ts_end', 'tunable_config_id', 'tunable_config_trial_group_id', 'status', 'config.param1', 'result.objective_metric']
>>> # Drop the timestamp columns to make it a repeatable test.
>>> experiment_data.results_df.drop(columns=["ts_start", "ts_end"])
   trial_id  tunable_config_id  tunable_config_trial_group_id     status  config.param1  result.objective_metric
0         1                  1                              1  SUCCEEDED             50                       42

[1 rows x 6 columns]

See Also
--------
mlos_bench.storage.base_storage : Base interface for backends.
mlos_bench.storage.base_experiment_data : Base interface for ExperimentData.
mlos_bench.storage.base_trial_data : Base interface for TrialData.

Notes
-----
- See `sqlite-autotuning notebooks
  <https://github.com/Microsoft-CISL/sqlite-autotuning/blob/main/mlos_demo_sqlite_teachers.ipynb>`_
  for additional examples.
"""  # pylint: disable=line-too-long # noqa: E501

from mlos_bench.storage.base_storage import Storage
from mlos_bench.storage.storage_factory import from_config

__all__ = [
    "Storage",
    "from_config",
]
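
The docstring example mentions that values such as "$storage_host" in the storage config can be pulled from the global config. As a rough illustration of that idea only, the sketch below expands such placeholders using the standard library's string.Template. The helper name expand_placeholders is hypothetical and is not part of mlos_bench; the library's own substitution logic may well work differently.

```python
from string import Template
from typing import Any, Dict


def expand_placeholders(config: Dict[str, Any], global_values: Dict[str, str]) -> Dict[str, Any]:
    """
    Hypothetical helper (not an mlos_bench API): replace "$name" placeholders
    in string values of a flat config dict with values from the globals.
    """
    expanded: Dict[str, Any] = {}
    for key, value in config.items():
        if isinstance(value, str):
            # safe_substitute leaves unknown placeholders untouched
            # instead of raising KeyError.
            expanded[key] = Template(value).safe_substitute(global_values)
        else:
            # Non-string values (e.g., booleans) are copied as-is.
            expanded[key] = value
    return expanded


# Values taken from the commented-out sections of the docstring example.
global_values = {
    "storage_host": "some-remote-host",
    "storage_user": "mlos_bench",
    "storage_pass": "SuperSecretPassword",
}
storage_kwargs = {
    "lazy_schema_create": True,
    "drivername": "sqlite",
    "database": ":memory:",
    "host": "$storage_host",
    "username": "$storage_user",
    "password": "$storage_pass",
}
expanded = expand_placeholders(storage_kwargs, global_values)
```

Using safe_substitute rather than substitute is a deliberate choice in this sketch: a placeholder with no matching global stays visible in the output, which makes a missing value easier to spot than a silently empty string.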