reliability-tool 101

This section walks through an example of running reliability tests on a pre-trained model. Later, we will cover some useful tips and tricks for the tool.

Run a sample reliability test on the MNLI dataset:

recheck task=mnli

Note that experiments can only be run on the pre-defined set of datasets listed inside configs/experiment/ <https://github.com/Maitreyapatel/reliability-checklist/tree/develop/configs/experiment>.
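To give a sense of what those pre-defined experiment files look like, here is a minimal sketch of a Hydra experiment config. The group names and keys below are hypothetical placeholders, not the repository's actual schema; consult the files in configs/experiment/ for the real layout.

```yaml
# configs/experiment/my_task.yaml -- hypothetical sketch, not the real schema
# @package _global_

# override the default dataset for this experiment
# ("datamodule" and "my_dataset" are assumed names)
defaults:
  - override /datamodule: my_dataset

# experiment-specific parameters (illustrative values)
datamodule:
  batch_size: 32
```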

Running on different devices:

# eval on CPU
recheck

# eval on 1 GPU
recheck trainer=gpu

# eval on 2 GPU
recheck trainer=gpu +trainer.gpus=2

# eval on 2 GPU with specific ids
recheck trainer=gpu +trainer.gpus=[1,5]

# eval on TPU
recheck trainer=tpu +trainer.tpu_cores=8

# eval with DDP (Distributed Data Parallel) (4 GPUs)
recheck trainer=ddp trainer.devices=4

# eval with DDP (Distributed Data Parallel) (8 GPUs, 2 nodes)
recheck trainer=ddp trainer.devices=4 trainer.num_nodes=2

# simulate DDP on CPU processes
recheck trainer=ddp_sim trainer.devices=2

# accelerate training on mac
recheck trainer=mps
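All of the commands above use Hydra's dotted override syntax: `trainer.devices=4` sets a nested config key, and a leading `+` (as in `+trainer.gpus=[1,5]`) adds a key that is not in the default config. As a mental model only, here is a minimal stdlib-only sketch of how such overrides map onto a nested config; the real parsing is done by Hydra/OmegaConf and is far more featureful.

```python
# Illustrative sketch of Hydra-style dotted overrides -- NOT the real
# Hydra implementation, just a mental model of "a.b.c=value".
import ast


def apply_override(config: dict, override: str) -> dict:
    """Apply a single "a.b.c=value" override to a nested dict."""
    key, _, raw = override.partition("=")
    # In Hydra, a leading "+" means "add a new key"; in this toy
    # version we simply strip it and set the key either way.
    key = key.lstrip("+")
    try:
        # Parse numbers and lists such as [1,5]; fall back to strings.
        value = ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        value = raw
    node = config
    parts = key.split(".")
    for part in parts[:-1]:
        node = node.setdefault(part, {})
    node[parts[-1]] = value
    return config


cfg = {"trainer": {"accelerator": "gpu"}}
apply_override(cfg, "trainer.devices=4")
apply_override(cfg, "+trainer.gpus=[1,5]")
print(cfg)
```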

Saving the output of the reliability tests:

recheck logger=csv
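The CSV logger writes results as plain comma-separated files, which can be inspected with Python's standard csv module. The column names and inline sample below are hypothetical; check the run's output directory for the actual file and layout.

```python
# Sketch: parsing CSV-logger output with the stdlib csv module.
# The column names and values below are hypothetical examples,
# standing in for whatever the logger actually writes.
import csv
import io

sample = "test/accuracy,test/f1\n0.87,0.85\n"  # stand-in for the real file

rows = list(csv.DictReader(io.StringIO(sample)))
for row in rows:
    # Convert metric strings to floats for downstream analysis.
    metrics = {name: float(value) for name, value in row.items()}
    print(metrics)
```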

Going beyond the defaults and configuring each experiment:

Please refer to the hydra package and the configs/ folder to understand the different parameters and features. Once you understand them, you can override them on the CLI (as shown in the device examples above).