Coverage for mlos_bench/mlos_bench/config/__init_

# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
"""
A module for and documentation about the structure and management of json configs, their
schemas and validation for various components of MLOS.

.. contents:: Table of Contents
   :depth: 3

Overview
++++++++

MLOS is a framework for benchmarking and autotuning systems.
The bulk of the code that does this is written in Python, so all of the
classes documented here take Python objects in their constructors.

However, most users of MLOS will interact with the system via the ``mlos_bench`` CLI,
its json config files, and their own scripts for MLOS to invoke. This module
attempts to document some of those high-level interactions.

General JSON Config Structure
+++++++++++++++++++++++++++++

We use `json5 <https://pypi.org/project/json5/>`_ to parse the json files, since it
allows for inline C style comments (e.g., ``//``, ``/* */``), trailing commas, etc.,
so it is slightly more user friendly than strict json.

By convention, files use the ``*.mlos.json`` or ``*.mlos.jsonc`` extension to
indicate that they are an ``mlos_bench`` config file.

This allows tools that support `JSON Schema Store
<https://www.schemastore.org/json/>`_ (e.g., `VSCode
<https://code.visualstudio.com/>`_ with an `extension
<https://marketplace.visualstudio.com/items?itemName=remcohaszing.schemastore>`_) to
provide helpful autocomplete and validation of the json configs while editing.

Organization
^^^^^^^^^^^^

Ultimately, each experiment is slightly different, so it can take some time to get
the automation right.

Therefore, the configs are intended to be modular and reusable to reduce the time
needed to do that for the next set of experiments.
Hence, they are usually split into several files and directory structures.

We attempt to provide some examples and reusable templates in the core ``mlos_bench``
package, but users are encouraged to create their own configs as needed, or to
`submit PRs or Issues <https://github.com/microsoft/MLOS/CONTRIBUTING.md>`_ to add
additional ones.

References to some examples are provided below.

Additional details about the organization of the files and directories are as follows:

- ``cli/``:
    Contains the CLI configs that control the overall setup for a set of Experiments.

- ``environments/``:
    Contains the configs for :py:mod:`~mlos_bench.environments`, and their
    associated scripts (if relevant, e.g., for
    :py:class:`~mlos_bench.environments.remote.remote_env.RemoteEnv` or
    :py:class:`~mlos_bench.environments.script_env.ScriptEnv`) and
    :py:mod:`~mlos_bench.tunables`.

    There is usually one *root* environment that chains the others together to build
    a full experiment (e.g., via
    :py:class:`~mlos_bench.environments.composite_env.CompositeEnv` and the
    ``include_children`` field).
    The *root* environment is the one referenced in the CLI config ``environment``
    field.

    Note that each separate Environment config is really more of a template that
    allows for variable expansion so that the same environment can be used in
    multiple experiments with different configurations (see below).

    Similarly, Environments declare a need for a particular Service, but not which
    implementation of it.
    This allows for easy swapping of Services (e.g., a different cloud vendor) using
    a different ``services`` config in the CLI config.

    Grouping the scripts and tunables together with the environment allows for
    easier reuse, readability, and debugging.

    Note that tunables are also separated into "groups", each of which can be enabled
    for tuning or not, again controllable via ``globals`` variable expansion.

- ``experiments/``:
    Contains some ``globals`` (variables) that help expand a set of other config
    templates out into a full set of configs.
    Since each experiment may only slightly differ from a previous one, this allows
    greater reuse across individual experiments.

- ``optimizers/``:
    Contains the configs for :py:mod:`mlos_bench.optimizers`.
    The optimizer is referenced in the CLI config's ``optimizer`` field.
    This config controls which optimizer to use and any custom settings for it.

- ``services/``:
    Contains the configs for :py:mod:`mlos_bench.services`.

    In general, services can simply be referenced in the CLI config's ``services``
    field, though sometimes additional settings are required, possibly provided by
    an additional ``globals`` config in the CLI config.

- ``storage/``:
    Contains the configs for :py:mod:`mlos_bench.storage`.

    The storage config is referenced in the CLI config's ``storage`` field and
    controls how data is stored and retrieved for the experiments and trials.

See below for additional details about each configuration type.

CLI Configs
^^^^^^^^^^^

:py:attr:`~.mlos_bench.config.schemas.config_schemas.ConfigSchema.CLI` style configs
are typically used to start the ``mlos_bench`` CLI via the ``--config`` argument.
They use a restricted key-value dict form in which each key corresponds to a CLI
argument.

For instance:

.. code-block:: json

   // cli-config.mlos.json
   {
       "experiment": "path/to/base/experiment-config.mlos.json",
       "services": [
           "path/to/some/service-config.mlos.json",
       ],
       "globals": "path/to/basic-globals-config.mlos.json",
   }

.. code-block:: json

   // basic-globals-config.mlos.json
   {
       "location": "westus",
       "vm_size": "Standard_D2s_v5",
   }

Typically CLI configs will reference some other configs, especially the base
Environment and Services configs, but some ``globals`` may be left to be specified
on the command line.

For instance:

.. code-block:: shell

   mlos_bench --config path/to/cli-config.mlos.json --globals experiment-config.mlos.json

where ``experiment-config.mlos.json`` might look something like this:

.. code-block:: json

   // experiment-config.mlos.json (also a set of globals)
   {
       "experiment_id": "my_experiment",
       "some_var": "some_value",
   }

This allows some of the ``globals`` to be specified on the CLI to alter the behavior
of a set of Experiments without having to adjust many of the other config files
themselves.
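
As a rough illustration of how ``globals`` files named in a CLI config and on the
command line might be gathered together, here is a minimal Python sketch. The
``collect_globals_paths`` helper is hypothetical and is not part of ``mlos_bench``;
the ordering it uses (CLI config entries first, then command-line entries) is an
assumption made for the example, not documented behavior.

```python
import argparse


def collect_globals_paths(cli_config, argv):
    # Hypothetical helper: gather globals file paths from a CLI config dict
    # and from repeated --globals arguments on the command line.
    parser = argparse.ArgumentParser()
    parser.add_argument("--config")
    parser.add_argument("--globals", action="append", default=[])
    # Unknown arguments are tolerated rather than rejected in this sketch.
    args, _unknown = parser.parse_known_args(argv)

    from_config = cli_config.get("globals", [])
    if isinstance(from_config, str):
        # The CLI config may name a single file as a plain string.
        from_config = [from_config]

    # Assumption: files from the CLI config come first, then command-line ones.
    return list(from_config) + args.globals


cli_config = {"globals": "path/to/basic-globals-config.mlos.json"}
argv = [
    "--config", "path/to/cli-config.mlos.json",
    "--globals", "experiment-config.mlos.json",
]
print(collect_globals_paths(cli_config, argv))
```

Keeping the result as an ordered list matters because the files are later combined
in the order they appear.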

See below for examples.

Notes
-----
- See `mlos_bench CLI usage <../../../mlos_bench.run.usage.html>`_ for more details on the
  CLI arguments.
- See `mlos_bench/config/cli
  <https://github.com/microsoft/MLOS/tree/main/mlos_bench/mlos_bench/config/cli>`_
  and `mlos_bench/tests/config/cli
  <https://github.com/microsoft/MLOS/tree/main/mlos_bench/mlos_bench/tests/config/cli>`_
  for some examples of CLI configs.

Globals and Variable Substitution
+++++++++++++++++++++++++++++++++

:py:attr:`Globals <mlos_bench.config.schemas.config_schemas.ConfigSchema.GLOBALS>`
are basically just key-value variables that can be used in other configs using
``$variable`` substitution via the
:py:meth:`~mlos_bench.dict_templater.DictTemplater.expand_vars` method.

For instance:

.. code-block:: json

   // globals-config.mlos.json
   {
       "experiment_id": "my_experiment",
       "some_var": "some_value",
       // environment variable expansion also works here
       "current_dir": "$PWD",
       "some_expanded_var": "$some_var: $experiment_id",
       "location": "eastus",

       // This can be specified in the CLI config or the globals config
       "tunable_params_map": {
           // a map of tunable_params variables to their covariant group names
           "environment1_tunables": [
               "covariant_group_name",
               "another_covariant_group_name"
           ],
           "environment2_tunables": [
               // empty list means no tunables
               // are enabled for this environment
               // during this experiment
               // (e.g., only use defaults for this environment)
           ],
       },
   }
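
To make the ``$variable`` substitution idea concrete, here is a minimal Python
sketch built on the standard library's ``string.Template``. It is only a stand-in
for illustration and is not the ``DictTemplater`` implementation; the
``expand_vars`` function below and its ``use_os_env`` parameter are hypothetical
names chosen for this example.

```python
import os
from string import Template


def expand_vars(value, variables, use_os_env=True):
    # Illustrative stand-in only: recursively expand $name references in
    # strings found inside nested dicts and lists.
    if isinstance(value, str):
        # Assumption: explicitly provided variables shadow OS environment ones.
        lookup = {**os.environ, **variables} if use_os_env else dict(variables)
        # safe_substitute leaves unknown $names untouched instead of raising.
        return Template(value).safe_substitute(lookup)
    if isinstance(value, dict):
        return {key: expand_vars(val, variables, use_os_env) for key, val in value.items()}
    if isinstance(value, list):
        return [expand_vars(item, variables, use_os_env) for item in value]
    # Numbers, booleans, and None pass through unchanged.
    return value


globals_config = {
    "experiment_id": "my_experiment",
    "some_var": "some_value",
    "some_expanded_var": "$some_var: $experiment_id",
}
expanded = expand_vars(globals_config, globals_config, use_os_env=False)
print(expanded["some_expanded_var"])  # some_value: my_experiment
```

Leaving unknown names untouched is a choice made for this sketch; a stricter
variant could raise an error on unresolved variables instead.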

Users can have multiple global config files, each specified with a ``--globals``
CLI arg or ``"globals"`` CLI config property.

At runtime, parameters from these files will be combined into a single
dictionary, in the order they appear, and pushed to the root
:py:class:`Environment <mlos_bench.environments>`.

Any global or :py:class:`~.Environment` parameter can also be overridden from
the command line, by simply specifying ``--PARAMETER_NAME PARAMETER_VALUE``.
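
The combination of several globals files followed by command-line overrides can be
pictured with a short Python sketch. The ``merge_globals`` helper is hypothetical,
and the rule that a later file wins when the same key appears twice is an
assumption for the example (the text above only states that the files are combined
in order).

```python
def merge_globals(globals_files, cli_overrides=None):
    # Hypothetical helper: combine already-parsed globals dicts in order,
    # then apply command-line overrides last.
    merged = {}
    for params in globals_files:
        # Assumption: on a key conflict, the later file replaces the earlier one.
        merged.update(params)
    # Command-line values (--PARAMETER_NAME PARAMETER_VALUE) override the files.
    merged.update(cli_overrides or {})
    return merged


basic_globals = {"location": "westus", "vm_size": "Standard_D2s_v5"}
experiment_globals = {"experiment_id": "my_experiment", "location": "eastus"}

params = merge_globals(
    [basic_globals, experiment_globals],
    cli_overrides={"vm_size": "Standard_D4s_v5"},
)
print(params)
```

The input dictionaries are left unmodified, so the same parsed files can be reused
for another merge with different overrides.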

Another common use of global config files is to store sensitive data (e.g.,
passwords, tokens, etc.) that should not be version-controlled.

This way, users can keep their experiment-specific parameters separate from
the Environment configs, making those configs more reusable.

There are additional details about `Variable Propagation
<../environments/index.html#variable-propagation>`_ in the
:py:mod:`mlos_bench.environments` module.

Well Known Variables
^^^^^^^^^^^^^^^^^^^^

Here is a list of some well known variables that are provided or required by the
system and may be used in the config files:

- ``$experiment_id``: A unique identifier for the ``Experiment``.
  Typically provided in globals.
- ``$trial_id``: A unique identifier for the ``Trial`` currently being executed.
  This can be useful in the configs for :py:mod:`mlos_bench.environments` for
  instance (e.g., when writing scripts).
- ``$trial_runner_id``: A unique identifier for the ``TrialRunner``.
  This can be useful when running multiple trials in parallel (e.g., to
  provision a numbered VM per worker).
- ``$tunable_params_map``: A map of ``tunable_params`` ``$name`` to their list of covariant group names.
  This is usually set in a ``--config`` CLI config or a ``--globals``
  (e.g., "experiment") config file and is used to control what the
  ``"tunable_params": $tunable_group_name`` specified in the
  :py:mod:`mlos_bench.environments` JSONC configs resolves to.
  This can be used to control which tunables are enabled for tuning for an
  experiment without having to change the underlying Environment config.
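
The role of ``$tunable_params_map`` can be sketched in a few lines of Python. The
``resolve_tunable_groups`` function is hypothetical, as is the assumption that an
Environment config lists its ``tunable_params`` as strings such as
``"$environment1_tunables"``; raising ``KeyError`` for an unmapped name is a
choice made for this sketch.

```python
def resolve_tunable_groups(tunable_params, tunable_params_map):
    # Hypothetical helper: expand $name entries into covariant group names.
    groups = []
    for name in tunable_params:
        if name.startswith("$"):
            key = name[1:]
            if key not in tunable_params_map:
                raise KeyError(f"no entry for {name} in tunable_params_map")
            groups.extend(tunable_params_map[key])
        else:
            # Plain names are treated as covariant group names directly.
            groups.append(name)
    return groups


tunable_params_map = {
    "environment1_tunables": ["covariant_group_name", "another_covariant_group_name"],
    # An empty list means no tunables are enabled for this environment.
    "environment2_tunables": [],
}

print(resolve_tunable_groups(["$environment1_tunables"], tunable_params_map))
print(resolve_tunable_groups(["$environment2_tunables"], tunable_params_map))
```

Changing only the map entries switches which groups are tuned, while the
Environment config that references ``$environment1_tunables`` stays the same.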

Tunable Configs
^^^^^^^^^^^^^^^

There are two forms of tunable configs:

- "TunableParams" style configs, which are used to define the set of
  :py:class:`~mlos_bench.tunables.tunable_groups.TunableGroups` (i.e., tunable
  parameters).

  .. code-block:: json

      // some-env-tunables.json
      {
          // a group of tunables that are tuned together
          "covariant_group_name": [
              {
                  "name": "tunable_name",
                  "type": "int",
                  "range": [0, 100],
                  "default": 50,
              },
              // more tunables
          ],
          // another group of tunables
          // both can be enabled at the same time
          "another_group_name": [
              {
                  "name": "another_tunable_name",
                  "type": "categorical",
                  "values": ["red", "yellow", "green"],
                  "default": "green"
              },
              // more tunables
          ],
      }

  Since TunableParams are associated with an :py:mod:`~mlos_bench.environments`,
  they are typically kept in the same directory as that Environment's config and
  named something like ``env-tunables.json``.

- "TunableValues" style configs, which are used to specify the values for an
  instantiation of a set of tunable params.

  These are essentially just a dict of the tunable names and their values.
  For instance:

  .. code-block:: json

      // tunable-values.mlos.json
      {
          "tunable_name": 25,
          "another_tunable_name": "red",
      }

  These can be used with the
  :py:class:`~mlos_bench.optimizers.one_shot_optimizer.OneShotOptimizer` or
  :py:class:`~mlos_bench.optimizers.manual_optimizer.ManualOptimizer` to run a
  benchmark with a particular config or set of configs.
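
The relationship between the two forms can be illustrated with a small Python
sketch that checks a "TunableValues" dict against a "TunableParams" definition.
The ``validate_tunable_values`` helper is hypothetical and only covers the fields
shown in the examples above (``range`` for numeric types and ``values`` for
categoricals); it does not reflect how ``mlos_bench`` itself validates tunables.

```python
def validate_tunable_values(tunable_params, tunable_values):
    # Hypothetical helper: return a list of human-readable problems (empty if OK).
    # Flatten the covariant groups into a name -> definition lookup.
    definitions = {
        tunable["name"]: tunable
        for group in tunable_params.values()
        for tunable in group
    }
    errors = []
    for name, value in tunable_values.items():
        definition = definitions.get(name)
        if definition is None:
            errors.append(f"unknown tunable: {name}")
        elif definition["type"] == "categorical":
            if value not in definition["values"]:
                errors.append(f"{name}: {value!r} is not an allowed value")
        else:
            low, high = definition["range"]
            if not low <= value <= high:
                errors.append(f"{name}: {value!r} is outside [{low}, {high}]")
    return errors


tunable_params = {
    "covariant_group_name": [
        {"name": "tunable_name", "type": "int", "range": [0, 100], "default": 50},
    ],
    "another_group_name": [
        {
            "name": "another_tunable_name",
            "type": "categorical",
            "values": ["red", "yellow", "green"],
            "default": "green",
        },
    ],
}
tunable_values = {"tunable_name": 25, "another_tunable_name": "red"}

print(validate_tunable_values(tunable_params, tunable_values))
```

Returning a list of problems rather than raising on the first one makes it easier
to report every mistake in a hand-written values file at once.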

For more information on tunable configs, see the :py:mod:`mlos_bench.tunables`
module.

Class Configs
^^^^^^^^^^^^^

Class style configs cover most everything else and roughly take this form:

.. code-block:: json

   // class configs (environments, services, etc.)
   {
       // some mlos class name to load
       "class": "mlos_bench.type.ClassName",
       "config": {
           // class specific config
           "key": "value",
           "key2": "$some_var", // variable substitution is allowed here too
       }
   }

Where ``type`` is one of the core classes in the system:

- :py:mod:`~mlos_bench.environments`
- :py:mod:`~mlos_bench.optimizers`
- :py:mod:`~mlos_bench.services`
- :py:mod:`~mlos_bench.schedulers`
- :py:mod:`~mlos_bench.storage`

Each of these has its own submodules and classes, which dictate the allowed and
expected structure of the ``config`` section.
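
The general pattern of loading a class named in a ``"class"`` field can be sketched
in Python with ``importlib``. The ``instantiate_from_config`` helper is
hypothetical, and passing the ``config`` section as keyword arguments is an
assumption for the example; the real ``mlos_bench`` classes may be constructed
differently. A standard library class stands in for an MLOS class so that the
sketch runs on its own.

```python
import importlib


def instantiate_from_config(class_config):
    # Hypothetical helper: import the dotted class name and construct it.
    module_name, _, class_name = class_config["class"].rpartition(".")
    module = importlib.import_module(module_name)
    cls = getattr(module, class_name)
    # Assumption: the "config" section maps directly onto keyword arguments.
    return cls(**class_config.get("config", {}))


# A stdlib class is used here purely as a stand-in for an MLOS class.
demo_config = {
    "class": "fractions.Fraction",
    "config": {"numerator": 3, "denominator": 4},
}
obj = instantiate_from_config(demo_config)
print(obj)  # 3/4
```

Resolving the class by name at runtime is what lets a config file swap one
implementation for another without any code changes.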

In certain cases (e.g., script command execution) the variable substitution rules
take on slightly different behavior.
See various documentation in :py:mod:`mlos_bench.environments` for more details.

Config Processing
+++++++++++++++++

Config files are processed by the :py:class:`~mlos_bench.launcher.Launcher` and
:py:class:`~mlos_bench.services.config_persistence.ConfigPersistenceService` classes
at startup time by the ``mlos_bench`` CLI.

The typical entrypoint is a CLI config which references other configs, especially
the base Environment config, Services, Optimizer, and Storage.

See `mlos_bench CLI usage <../../../mlos_bench.run.usage.html>`__ for more details
on those arguments.

Schema Definitions
++++++++++++++++++

For further details on the schema definitions and validation, see the
:py:class:`~mlos_bench.config.schemas.config_schemas.ConfigSchema` class
documentation, which also contains links to the actual schema definitions in the
source tree (see below).

Debugging
+++++++++

Most problems when running an Experiment stem from the json configs and/or the
user scripts that are being run by the framework.

It can help to run ``mlos_bench`` with ``--log-level DEBUG`` to see more detailed
output about the steps it is taking.
Alternatively, it can help to add additional debug logging to the user scripts
themselves to see which part of the automation process is failing.
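
For user scripts that happen to be written in Python, debug logging can be added
with the standard ``logging`` module. The sketch below is only one possible
approach; the ``make_script_logger`` helper and the logger name are hypothetical,
and scripts in other languages would use their own logging facilities.

```python
import logging
import sys


def make_script_logger(name, level=logging.DEBUG):
    # Hypothetical helper: build a logger that writes timestamped lines to stderr.
    logger = logging.getLogger(name)
    logger.setLevel(level)
    if not logger.handlers:
        # Only attach a handler once, even if this function is called again.
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


log = make_script_logger("my_benchmark_script")
log.debug("starting benchmark run with args: %s", sys.argv[1:])
```

Writing to stderr keeps the diagnostic lines separate from anything the script
prints to stdout as its result.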

Notes
-----
See `mlos_bench/config/README.md
<https://github.com/microsoft/MLOS/tree/main/mlos_bench/mlos_bench/config/>`_ and
`mlos_bench/tests/config/README.md
<https://github.com/microsoft/MLOS/tree/main/mlos_bench/mlos_bench/tests/config/>`_
for additional documentation and examples in the source tree.
"""  # pylint: disable=line-too-long # noqa: E501
