This is the 12th day of my participation in the First Challenge 2022

In machine learning projects, the number of experiments grows rapidly. DVC can track these experiments, list and compare their most relevant parameters and metrics, help find the best ones among them, and commit to Git only the experiments we need.

In the previous article, Quick Start DVC (VI): Metric Tracking, Updating Training Parameters, and Visualizing Model Performance, we described how to track metrics, update training parameters, and visualize model performance.

In this article, we will explore the basic features of DVC experiment management using the example-dvc-experiments project.

Preparing the environment

Clone the project, create a virtual environment, and install the dependencies

It is highly recommended to create a virtual environment that isolates the libraries we use from the rest of the system, which effectively prevents version conflicts.

$ git clone https://github.com/iterative/example-dvc-experiments -b get-started
$ cd example-dvc-experiments

$ virtualenv .venv
$ . .venv/bin/activate

$ python -m pip install -r requirements.txt

Download the dataset

The Git repository we cloned does not contain the dataset. Instead of storing the data in Git, the project uses DVC to retrieve it from shared data storage. In this case, we use dvc pull to download the missing data files.

$ dvc pull

After the dataset is downloaded, the repository contains all the configuration needed to run the experiment.

Running experiments

To run the experiment with the default project settings, simply execute the following command:

$ dvc exp run

...
Reproduced experiment(s): exp-b28f0
Experiment results have been applied to your workspace.
...

This runs the command specified in dvc.yaml (python train.py), which writes the metric values to the metrics.json file.

The experiment is then associated with the values found in the parameters file (params.yaml) and with the other dependencies (data/images/) that produced these metrics.
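
As a rough illustration of what "writing the metric values to metrics.json" can look like, here is a minimal sketch. It is not the project's actual train.py: the helper name write_metrics is hypothetical, the file is written to a temporary directory rather than the project root, and the loss and acc values are simply the ones reported for the first experiment in this article.

```python
import json
import tempfile
from pathlib import Path


def write_metrics(path, loss, acc):
    """Write metric values as JSON (hypothetical helper, not the project's code)."""
    Path(path).write_text(json.dumps({"loss": loss, "acc": acc}, indent=2))


# A temporary directory keeps the sketch from touching a real project;
# in the example project the file would be metrics.json in the repository.
with tempfile.TemporaryDirectory() as tmp:
    metrics_file = Path(tmp) / "metrics.json"
    write_metrics(metrics_file, loss=0.23282, acc=0.9152)
    saved = json.loads(metrics_file.read_text())

print(saved)
```

Because the file is plain JSON text, it is small enough to be committed to Git alongside params.yaml.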

The dvc exp commands are designed to let you run, capture, and compare machine learning experiments as you iterate on a project. The models and metrics generated by each experiment are tracked by DVC, and the relevant parameters and metrics can be committed to Git as text files.

You can use dvc exp show to view the experiment results, with metrics and parameters laid out in a well-formatted table:

$ dvc exp show
─────────────────────────────────────────────────────────────────────────────────────────────
  Experiment                Created           loss      acc   train.epochs   model.conv_units
─────────────────────────────────────────────────────────────────────────────────────────────
  workspace                 -              0.23282   0.9152   10             16
  7317bc6                   Jul 18, 2021         -        -   10             16
  └── 1a1d858 [exp-6dccf]   03:21 PM       0.23282   0.9152   10             16
─────────────────────────────────────────────────────────────────────────────────────────────

The workspace row in the table shows the results of the most recent experiment available in the workspace.

In addition, the table shows each experiment in a separate row, under the Git commit it was run from. We can see that the experiment named exp-6dccf was run from commit 7317bc6.

Now, let's run some more experiments. DVC lets you update parameters defined in the pipeline without editing files by hand. We use this feature to set the number of convolutional units used by train.py.

$ dvc exp run --set-param model.conv_units=24

...
Reproduced experiment(s): exp-7b56f
Experiment results have been applied to your workspace.
...


More about (hyper)parameters:

It is common for data science projects to include configuration files that define adjustable parameters (for training the model, changing the model architecture, preprocessing, and so on). DVC provides a mechanism for experiments to depend on specific variables in such a file.

By default, DVC assumes that your project has a parameters file called params.yaml. DVC parses this file and creates dependencies on the variables found in it, here model.conv_units and train.epochs. For example:

train:
  epochs: 10
model:
  conv_units: 16

When you use dvc exp run --set-param, DVC updates the parameters in params.yaml with the values you set on the command line before running the experiment.
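
Conceptually, a --set-param argument such as model.conv_units=24 names a nested key by its dotted path and gives it a new value. The sketch below mimics that on a plain dictionary holding the params.yaml contents shown above. It only illustrates the idea and is not DVC's implementation: apply_override is a hypothetical helper, and value parsing is simplified to "digits become integers, everything else stays text".

```python
import copy


def apply_override(params, assignment):
    """Return a copy of params with 'a.b=value' applied (hypothetical helper)."""
    key, raw_value = assignment.split("=", 1)
    # Simplification: digit-only values become integers, anything else stays a string.
    value = int(raw_value) if raw_value.isdigit() else raw_value

    updated = copy.deepcopy(params)  # leave the caller's dictionary untouched
    node = updated
    *parents, leaf = key.split(".")
    for name in parents:
        # Walk down the nested dictionaries, creating levels that do not exist yet.
        node = node.setdefault(name, {})
    node[leaf] = value
    return updated


# The same values as in the params.yaml example.
params = {"train": {"epochs": 10}, "model": {"conv_units": 16}}
updated = apply_override(params, "model.conv_units=24")
print(updated)
```

Only model.conv_units changes; train.epochs keeps its value of 10, which matches what the experiment tables in this article show.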

Run multiple experiments in parallel

Instead of running experiments one by one, we can define them up front and run them as a batch. This is especially handy for long-running experiments.

We use the --queue option of dvc exp run to add experiments to the queue. As before, -S (--set-param) sets the value of a parameter.

$ dvc exp run --queue -S model.conv_units=32
Queued experiment '3cac8c6' for future execution.

$ dvc exp run --queue -S model.conv_units=64
Queued experiment '23660b6' for future execution.

$ dvc exp run --queue -S model.conv_units=128
Queued experiment '6591a57' for future execution.

$ dvc exp run --queue -S model.conv_units=256
Queued experiment '9109ea9' for future execution.

Next, run all of the queued experiments with the --run-all option. You can specify the number of parallel processes with --jobs:

$ dvc exp run --run-all --jobs 2
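
The four queue commands differ only in the value of model.conv_units, so they can be generated in a loop. The sketch below only builds and prints the command lines used in this section; it does not call DVC. build_queue_command is a hypothetical helper, and the commented-out subprocess.run call assumes dvc is installed and that the script runs inside the project directory.

```python
import shlex


def build_queue_command(conv_units):
    """Build the argument list for queuing one experiment (hypothetical helper)."""
    return ["dvc", "exp", "run", "--queue", "-S", f"model.conv_units={conv_units}"]


# One queued experiment per value, mirroring the commands in this section.
commands = [build_queue_command(units) for units in (32, 64, 128, 256)]
# Final step: run everything in the queue with two parallel jobs.
commands.append(["dvc", "exp", "run", "--run-all", "--jobs", "2"])

for argv in commands:
    print(shlex.join(argv))
    # To execute for real (assumption: dvc is on PATH, cwd is the project):
    # import subprocess; subprocess.run(argv, check=True)
```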

Compare and persist experiments

After running the experiment several times with different parameters, we use the dvc exp show command to compare all of these experiments.

$ dvc exp show
┏━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃ Experiment              ┃ Created      ┃    loss ┃    acc ┃ train.epochs ┃ model.conv_units ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ workspace               │ -            │       … │ 0.9151 │ 10           │ 24               │
│ 7317bc6                 │ Jul 18, 2021 │       - │      - │ 10           │ 16               │
│ ├── e2647ef [exp-ee8a4] │ 05:14 PM     │ 0.23146 │ 0.9145 │ 10           │ 64               │
│ ├── 15c9451 [exp-a9be6] │ 05:14 PM     │ 0.25231 │ 0.9102 │ 10           │ 32               │
│ ├── 9c32227 [exp-17dd9] │ 04:46 PM     │ 0.23687 │ 0.9167 │ 10           │ 256              │
│ ├── 8a9cb15 [exp-29d93] │ …            │       … │ 0.9134 │ 10           │ 128              │
│ ├── dfc536f [exp-a1bd9] │ …            │       … │ 0.9151 │ 10           │ 24               │
│ └── 1a1d858 [exp-6dccf] │ 03:21 PM     │ 0.23282 │ 0.9152 │ 10           │ 16               │
└─────────────────────────┴──────────────┴─────────┴────────┴──────────────┴──────────────────┘

By default, it displays all parameters and metrics along with a timestamp. If you have a large number of parameters, metrics, or experiments, the view can become cluttered. You can use the command's --drop option to hide the columns you do not need, such as specific metrics or parameters, or the timestamp column.

$ dvc exp show --drop 'Created|train|loss'
─────────────────────────────────────────────────────
  Experiment                   acc   model.conv_units
─────────────────────────────────────────────────────
  workspace                 0.9151   24
  7317bc6                        -   16
  ├── e2647ef [exp-ee8a4]   0.9145   64
  ├── 15c9451 [exp-a9be6]   0.9102   32
  ├── 9c32227 [exp-17dd9]   0.9167   256
  ├── 8a9cb15 [exp-29d93]   0.9134   128
  ├── dfc536f [exp-a1bd9]   0.9151   24
  └── 1a1d858 [exp-6dccf]   0.9152   16
─────────────────────────────────────────────────────
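
One simple way to decide which experiment to keep is to pick the row with the highest acc. The sketch below does this over the acc and model.conv_units values from the table, transcribed by hand into a list; it is only an illustration of the comparison, not something DVC runs for you, and accuracy is assumed to be the only selection criterion.

```python
# (experiment name, acc, model.conv_units) rows transcribed from the table.
experiments = [
    ("exp-ee8a4", 0.9145, 64),
    ("exp-a9be6", 0.9102, 32),
    ("exp-17dd9", 0.9167, 256),
    ("exp-29d93", 0.9134, 128),
    ("exp-a1bd9", 0.9151, 24),
    ("exp-6dccf", 0.9152, 16),
]

# Pick the row whose accuracy (second field) is largest.
best_name, best_acc, best_units = max(experiments, key=lambda row: row[1])
print(f"{best_name}: acc={best_acc}, model.conv_units={best_units}")
```

With these values the best row is exp-17dd9 (acc 0.9167, 256 convolutional units), which is the experiment turned into a branch in the next step.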

Select one of the experiments from the table, and then create a Git branch that contains the experiment and all its associated files.

$ dvc exp branch exp-17dd9 "cnn-256"

Git branch 'cnn-256' has been created from experiment 'exp-17dd9'.
To switch to the new branch run:

        git checkout cnn-256

You can then switch to the new branch with git checkout and continue working from it, or merge it into your main branch (main) using the usual Git commands.

Other experiment management operations

dvc exp has many other capabilities, such as cleaning up unused experiments, sharing experiments without committing them to Git, or showing the differences between two experiments.

For details, refer to the Experiment Management section of the user guide, or to the dvc exp command and its subcommands in the command reference.