Recipes for Experiment Tracking and Hyperparameter Optimization
- Ensure you have a valid WandB configuration in .env (loading it is sketched below). Otherwise, set it in Python:
import os
os.environ['WANDB_NOTEBOOK_NAME'] = 'notebook.ipynb'
os.environ['WANDB_NAME'] = 'Name of this run'
os.environ['WANDB_NOTES'] = 'Subheading'
# os.environ['WANDB_MODE'] = 'offline' # enable for local mode
# os.environ['WANDB_BASE_URL'] = 'http://localhost:9595' # enable for local mode
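- If the variables already live in .env, they can be loaded with python-dotenv (a minimal sketch, assuming that package is installed):
from dotenv import load_dotenv

load_dotenv()  # reads key=value pairs from .env into os.environ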
- Login (need to register on WandB site first)
import wandb
wandb.login()
- Start a run:
params = {}
with wandb.init(project='project_name', dir='../../artifacts/wandb/', config=params) as run:
    wandb.watch(model)  # log gradients and parameter histograms
    for epoch in range(epochs):
        for batch_num, (Xs, ys) in enumerate(dataloader):
            # model code
            metrics = train_one_batch(...)
            if batch_num % k == 0:  # log every k batches
                wandb.log(metrics)
fastai & keras
- If running fastai or keras, watching and logging are already implemented in WandbCallback; you can hook into the callback system (import paths are sketched after the examples below):
- fastai:
learner.fit(..., cbs=[WandbCallback()])
- keras:
model.fit(..., callbacks=[WandbCallback()])
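- The import paths differ between frameworks (a sketch; verify against the versions you have installed):
# use whichever matches your framework
from fastai.callback.wandb import WandbCallback   # fastai ships its own integration
from wandb.keras import WandbCallback             # keras uses the callback bundled with wandb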
- You still need to save the actual model afterwards, e.g. by exporting to ONNX and uploading the file:
torch.onnx.export(
    model,
    args,  # any tensor that matches the input shape -- ok to use your input data
    'model.onnx',
    input_names=feature_col_names,
    output_names=[target_col_name],
)
wandb.save('model.onnx')
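- Alternatively, the exported file can be versioned as a wandb Artifact inside the run (a minimal sketch; the artifact name and type are arbitrary):
artifact = wandb.Artifact('model', type='model')
artifact.add_file('model.onnx')
wandb.log_artifact(artifact)  # attaches the artifact to the active run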
Sweeps (Hyperparameter Optimization)
- Declare parameter settings
# sweep.yml
method: bayes
metric:
  goal: minimize
  name: grid
parameters:
  dropout:
    values: [0.1, 0.2, 0.3, 0.4]
  learning_rate:
    values: [0.001, 0.0001, 0.00001]
early_terminate:
  type: hyperband
  min_iter: 20
Run from notebook
Note that you will have to wrap your training loop into a function (sketched after the snippet below).
import yaml

with open('sweep.yml') as f:
    sweep_config = yaml.safe_load(f)
sweep_id = wandb.sweep(sweep_config)
wandb.agent(sweep_id, function=train)
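- The train function is an ordinary function that opens its own run and reads hyperparameters from wandb.config (a minimal sketch; build_model and train_one_epoch are hypothetical placeholders):
def train():
    with wandb.init() as run:  # the sweep agent injects the sampled config
        config = wandb.config
        model = build_model(dropout=config.dropout)                     # placeholder
        for epoch in range(10):
            metrics = train_one_epoch(model, lr=config.learning_rate)   # placeholder
            wandb.log(metrics)  # must include the metric named in sweep.yml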
Run from terminal
wandb sweep sweep.yml
wandb agent {value_of_output_above}
MLflow
- Set the tracking URI, e.g. to a local file store:
import mlflow
mlflow.set_tracking_uri('file:///home/andrew/notes/artifacts/mlruns')
- Runs are then available through the web UI via:
mlflow ui --backend-store-uri file:///home/andrew/notes/artifacts/mlruns
- For supported frameworks such as LightGBM, autologging captures parameters, metrics, and the model automatically:
with mlflow.start_run() as run:
    mlflow.lightgbm.autolog()
    # model code here
- Otherwise you can track artifacts manually:
with mlflow.start_run() as run:
    model_file = 'model.pkl'
    mlflow.log_artifact(model_file)  # log_artifact takes a local file path
- You can also log parameters and metrics:
with mlflow.start_run() as run:
    params = {'lambda': 0.1, 'num_leaves': 8}
    metrics = {'mse': 1234}
    mlflow.log_params(params)
    mlflow.log_metrics(metrics)
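- Runs can also be inspected programmatically; mlflow.search_runs returns a pandas DataFrame with params.* and metrics.* columns (a sketch based on the values logged above):
runs = mlflow.search_runs()  # runs of the active experiment
print(runs[['run_id', 'params.num_leaves', 'metrics.mse']].head())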
DVC
Setup
dvc init && git commit -m "init dvc"
If you are currently tracking data with git, stop tracking:
git rm -r --cached "data" && git commit -m "stop tracking data"
Add dvc remote:
dvc remote add -d {name} s3://{bucketname}
# dvc remote modify {name} endpointurl {minio_url}   # only if using Minio
git add .dvc/config
git commit -m "add {name} as remote"
Optuna
This example shows LightGBM, but the idea is the same for similar models like XGBoost, sklearn, etc.
from pathlib import Path

import lightgbm as lgb
import optuna
import pandas as pd
from sklearn.metrics import mean_squared_error as mse
from sklearn.model_selection import train_test_split

artifacts = Path('../../artifacts')
df = pd.read_csv('https://raw.githubusercontent.com/tidyverse/ggplot2/master/data-raw/diamonds.csv')
categories = {
    'cut': {'ordered': True, 'categories': ['Fair', 'Good', 'Very Good', 'Premium', 'Ideal']},
    'color': {'ordered': True, 'categories': ['J', 'I', 'H', 'G', 'F', 'E', 'D']},
    'clarity': {'ordered': True, 'categories': ['I1', 'SI2', 'SI1', 'VS2', 'VS1', 'VVS2', 'VVS1', 'IF']},
}
df2 = categorize(df, categories)  # helper that casts these columns to ordered categoricals (sketched below)
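- A minimal version of the categorize helper might look like this (an assumption about its behavior, not the author's definition):
from pandas.api.types import CategoricalDtype

def categorize(df, categories):
    # cast each listed column to an (ordered) pandas categorical dtype
    out = df.copy()
    for col, spec in categories.items():
        out[col] = out[col].astype(CategoricalDtype(**spec))
    return out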
def objective(trial):
    X = df2.drop('price', axis=1)
    y = df2[['price']]
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
    train = lgb.Dataset(X_train, y_train)
    params = {
        "objective": "mse",
        "metric": "mae,mse",
        "verbosity": -1,
        "boosting_type": "gbdt",
        "lambda_l1": trial.suggest_float("lambda_l1", 1e-8, 10.0, log=True),
        "lambda_l2": trial.suggest_float("lambda_l2", 1e-8, 10.0, log=True),
        "num_leaves": trial.suggest_int("num_leaves", 2, 256),
        "feature_fraction": trial.suggest_float("feature_fraction", 0.4, 1.0),
        "bagging_fraction": trial.suggest_float("bagging_fraction", 0.4, 1.0),
        "bagging_freq": trial.suggest_int("bagging_freq", 1, 7),
        "min_child_samples": trial.suggest_int("min_child_samples", 5, 100),
        'nthreads': 1,  # for compatibility
    }
    model = lgb.train(params, train)
    preds = model.predict(X_test)
    return mse(y_test, preds)
import logging
logger = logging.getLogger()
logger.addHandler(logging.FileHandler(artifacts / 'optuna.log', mode='a'))
optuna.logging.enable_propagation()
optuna.logging.disable_default_handler()
study = optuna.create_study(study_name='testing', direction='minimize')
study.optimize(objective, n_trials=10)
results = study.trials_dataframe()
results.to_csv(artifacts / 'optuna_study.csv')
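- The best trial is also available directly on the study object:
print(study.best_value)   # lowest MSE found
print(study.best_params)  # hyperparameters of the best trial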
You can also log parameters to WandB with this hook:
log_study_wandb(
    study,
    wandb_kwargs={'project': 'optuna', 'name': 'hyperparam search', 'job_type': 'logging'},
)
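- Optuna also ships a WandB integration callback that logs each trial as it completes; a sketch, assuming optuna.integration.WeightsAndBiasesCallback is available in your Optuna version:
from optuna.integration import WeightsAndBiasesCallback

wandbc = WeightsAndBiasesCallback(
    metric_name='mse',
    wandb_kwargs={'project': 'optuna', 'job_type': 'logging'},
)
study.optimize(objective, n_trials=10, callbacks=[wandbc])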
fig = optuna.visualization.plot_param_importances(study)
show_plotly(fig)
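- show_plotly is a local helper; with plain plotly the figure can be rendered or saved directly, e.g.:
fig.show()                                                  # interactive view in a notebook
fig.write_html(str(artifacts / 'param_importances.html'))   # or persist alongside other artifacts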