Coverage for mlos_bench/mlos_bench/storage/__init__.py

# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
"""
Interfaces to the storage backends for mlos_bench.

Storage backends (for instance :py:mod:`~mlos_bench.storage.sql`) are used to store
and retrieve the results of experiments and implement a persistent queue for
:py:mod:`~mlos_bench.schedulers`.

The :py:class:`~mlos_bench.storage.base_storage.Storage` class is the main interface
and provides the ability to:

- Create a new :py:class:`~.Storage.Experiment`, or reload an existing one, with one
  or more associated :py:class:`~.Storage.Trial` instances, which are used by the
  :py:mod:`~mlos_bench.schedulers` during ``mlos_bench`` run time to execute
  `Trials`.

  In MLOS terms, an *Experiment* is a group of *Trials* that share the same scripts
  and target system.

  A *Trial* is a single run of the target system with a specific *Configuration*
  (e.g., a set of tunable parameter values).
  (Note: other systems may call this a *sample*.)

- Retrieve the :py:class:`~mlos_bench.storage.base_trial_data.TrialData` results
  with the :py:attr:`~mlos_bench.storage.base_experiment_data.ExperimentData.trials`
  property on a :py:class:`~mlos_bench.storage.base_experiment_data.ExperimentData`
  instance via the :py:class:`~.Storage` instance's
  :py:attr:`~mlos_bench.storage.base_storage.Storage.experiments` property.

  These can be especially useful with :py:mod:`mlos_viz` for interactive exploration
  in a Jupyter Notebook interface, for instance.

The :py:func:`.from_config` :py:mod:`.storage_factory` function can be used to get a
:py:class:`.Storage` instance from a
:py:attr:`~mlos_bench.config.schemas.config_schemas.ConfigSchema.STORAGE` type JSON
config.

Example
-------

Here's a very basic example of the Storage APIs.

>>> # Create a new storage object from a JSON config.
>>> # Normally, we'd load these from a file, but for this example we'll use a string.
>>> global_config = '''
... {
...     // Additional global configuration parameters can be added here.
...     /* For instance:
...     "storage_host": "some-remote-host",
...     "storage_user": "mlos_bench",
...     "storage_pass": "SuperSecretPassword",
...     */
... }
... '''
>>> storage_config = '''
... {
...     "class": "mlos_bench.storage.sql.storage.SqlStorage",
...     "config": {
...         // Don't create the schema until we actually need it.
...         // (helps speed up initial launch and tests)
...         "lazy_schema_create": true,
...         // Parameters below must match kwargs of `sqlalchemy.URL.create()`:
...         // Normally, we'd specify a real database, but for testing examples
...         // we'll use an in-memory one.
...         "drivername": "sqlite",
...         "database": ":memory:"
...         // Otherwise we might use something like the following
...         // to pull the values from the globals:
...         /*
...         "host": "$storage_host",
...         "username": "$storage_user",
...         "password": "$storage_pass",
...         */
...     }
... }
... '''
>>> from mlos_bench.storage import from_config
>>> storage = from_config(storage_config, global_configs=[global_config])
>>> storage
sqlite::memory:
>>> #
>>> # Internally, mlos_bench will use this config and storage backend to track
>>> # Experiments and Trials it creates.
>>> # Most users won't need to do that, but it works something like the following:
>>> # Create a new experiment with a single trial.
>>> # (Normally, we'd use a real environment config, but for this example we'll use a string.)
>>> #
>>> # Create a dummy tunable group.
>>> from mlos_bench.services.config_persistence import ConfigPersistenceService
>>> config_persistence_service = ConfigPersistenceService()
>>> tunables_config = '''
... {
...     "param_group": {
...         "cost": 1,
...         "params": {
...             "param1": {
...                 "type": "int",
...                 "range": [0, 100],
...                 "default": 50
...             }
...         }
...     }
... }
... '''
>>> tunables = config_persistence_service.load_tunables([tunables_config])
>>> from mlos_bench.environments.status import Status
>>> from datetime import datetime
>>> with storage.experiment(
...     experiment_id="my_experiment_id",
...     trial_id=1,
...     root_env_config="root_env_config_info",
...     description="some description",
...     tunables=tunables,
...     opt_targets={"objective_metric": "min"},
... ) as experiment:
...     # Create a dummy trial.
...     trial = experiment.new_trial(tunables=tunables)
...     # Pretend something ran with that trial and we have the results now.
...     # NOTE: Normally this would run through a TrialRunner via a Scheduler.
...     _ = trial.update(Status.SUCCEEDED, datetime.now(), {"objective_metric": 42})
>>> #
>>> # Now, once there's data to look at, in a Jupyter notebook or similar,
>>> # we can also use the storage object to view the results.
>>> #
>>> storage.experiments
{'my_experiment_id': Experiment :: my_experiment_id: 'some description'}
>>> # Access ExperimentData by experiment id.
>>> experiment_data = storage.experiments["my_experiment_id"]
>>> experiment_data.trials
{1: Trial :: my_experiment_id:1 cid:1 rid:None SUCCEEDED}
>>> # Access TrialData for an Experiment by trial id.
>>> trial_data = experiment_data.trials[1]
>>> assert trial_data.status == Status.SUCCEEDED
>>> # Retrieve the tunable configuration from the TrialData as a dictionary.
>>> trial_config_data = trial_data.tunable_config
>>> trial_config_data.config_dict
{'param1': 50}
>>> # Retrieve the results from the TrialData as a dictionary.
>>> trial_data.results_dict
{'objective_metric': 42}
>>> # Retrieve the results of all Trials in the Experiment as a DataFrame.
>>> experiment_data.results_df.columns.tolist()
['trial_id', 'ts_start', 'ts_end', 'tunable_config_id', 'tunable_config_trial_group_id', 'status', 'trial_runner_id', 'config.param1', 'result.objective_metric']
>>> # Drop the timestamp columns to make it a repeatable test.
>>> experiment_data.results_df.drop(columns=["ts_start", "ts_end"])
   trial_id  tunable_config_id  tunable_config_trial_group_id     status trial_runner_id  config.param1  result.objective_metric
0         1                  1                              1  SUCCEEDED            None             50                       42
<BLANKLINE>
[1 rows x 7 columns]

See Also
--------
mlos_bench.storage.base_storage : Base interface for backends.
mlos_bench.storage.base_experiment_data : Base interface for ExperimentData.
mlos_bench.storage.base_trial_data : Base interface for TrialData.

Notes
-----
- See `sqlite-autotuning notebooks
  <https://github.com/Microsoft-CISL/sqlite-autotuning/blob/main/mlos_demo_sqlite_teachers.ipynb>`_
  for additional examples.
"""  # pylint: disable=line-too-long # noqa: E501

from mlos_bench.storage.base_storage import Storage
from mlos_bench.storage.storage_factory import from_config

__all__ = [
    "Storage",
    "from_config",
]

Coverage for mlos_bench/mlos_bench/storage/__init__.py: 100%

3 statements