Experiment tracking

Compare training runs and build dashboard views.

Use the Experiments page when you want a training-oriented view of the data you already log to Embedl Hub. It reads the same runs, metrics, parameters, tags and artifacts that appear elsewhere in the project, but presents them as an interactive dashboard for comparing runs over time.

What you will track

For a useful training dashboard, log:

  • Step metrics such as loss, accuracy, val_loss and val_accuracy
  • Parameters such as learning_rate, batch_size, optimizer and dataset
  • Tags such as training, imagenette, candidate or baseline
  • Artifacts such as checkpoints, exported models, reports or evaluation files

Metrics with a step value become curves. Metrics without a step are still available in tables, bars and scatter plots.

Log a training run

Create a client, select a project, then log parameters and metrics inside a run. For long training loops, buffer metrics and send them with log_batch rather than making one request per scalar.

from pathlib import Path

from embedl_hub.tracking import Client

client = Client()
client.set_project("edge-classifier")

# train_step(...) and validate(...) stand in for your own training code.
with client.start_run("training", name="resnet50-adamw-lr3e-4"):
    client.log_tag("dataset", "imagenette")
    # Log all hyperparameters in a single request.
    client.log_batch(
        params=[
            ("model", "resnet50"),
            ("optimizer", "adamw"),
            ("learning_rate", "0.0003"),
            ("batch_size", "64"),
        ]
    )

    pending_metrics = []
    for step in range(1000):
        loss, accuracy = train_step(...)
        pending_metrics.extend(
            [
                ("loss", loss, step),
                ("accuracy", accuracy, step),
            ]
        )
        # Validate every 20 steps.
        if step % 20 == 0:
            val_loss, val_accuracy = validate(...)
            pending_metrics.extend(
                [
                    ("val_loss", val_loss, step),
                    ("val_accuracy", val_accuracy, step),
                ]
            )
        # Send the buffer once at least 100 metrics have accumulated.
        if len(pending_metrics) >= 100:
            client.log_batch(metrics=pending_metrics)
            pending_metrics.clear()

    # Send whatever is left after the last step.
    if pending_metrics:
        client.log_batch(metrics=pending_metrics)

    client.log_artifact(Path("outputs/resnet50.onnx"), name="model")

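The run is marked finished when the with block exits normally. If an exception propagates out of the block, the run is marked failed, and the metrics logged before the failure remain available for analysis. Anything still buffered in pending_metrics when the exception occurs has not been sent.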
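
The buffering pattern in the example above can be wrapped in a small helper so that pending metrics are also sent when the training loop raises. This is a minimal sketch, not part of Embedl Hub: MetricBuffer, its send callback and max_size are hypothetical names, and the only library call it assumes is client.log_batch(metrics=...) as used above. A plain list stands in for the client so the sketch runs on its own.

```python
from typing import Callable, List, Tuple

Metric = Tuple[str, float, int]  # (name, value, step), the shape used above


class MetricBuffer:
    """Hypothetical helper: collects metrics and sends them in batches."""

    def __init__(self, send: Callable[[List[Metric]], None], max_size: int = 100):
        self._send = send
        self._max_size = max_size
        self._pending: List[Metric] = []

    def add(self, name: str, value: float, step: int) -> None:
        self._pending.append((name, value, step))
        if len(self._pending) >= self._max_size:
            self.flush()

    def flush(self) -> None:
        if self._pending:
            # Pass a copy so the receiver is unaffected by clear().
            self._send(list(self._pending))
            self._pending.clear()

    def __enter__(self) -> "MetricBuffer":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        # Flush whatever is left, even when the loop raised.
        self.flush()
        return False  # never swallow the exception


# Stand-in for client.log_batch so the sketch runs without a server.
sent_batches: List[List[Metric]] = []

with MetricBuffer(sent_batches.append, max_size=3) as buffer:
    for step in range(4):
        buffer.add("loss", 1.0 / (step + 1), step)

print([len(batch) for batch in sent_batches])
```

With a real client, the send argument would be something like lambda batch: client.log_batch(metrics=batch), and the buffer would sit inside the with client.start_run(...) block so that it flushes before the run closes.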

Open Experiments

Open the project, then select Experiments from the project navigation. The left sidebar is the run selector. Search or filter by status, type, custom type, tag or parameter, then choose the runs you want to compare.

Figure: The Experiments page showing selected training runs, metric curves and dashboard panels.

Build a dashboard view

Click Add in a section to create a panel. Panels start empty, so choose the chart type and inputs that match the question you are asking:

  • Line chart for step curves such as training loss or validation accuracy
  • Bar chart for comparing latest, final, min, max or best metric values
  • Scatter plot for tradeoffs such as accuracy vs latency
  • Parallel coordinates for hyperparameter and metric relationships
  • Table for run comparison with parameters, latest metrics, status and links
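
To make the bar chart options concrete, the sketch below reduces a step series to a single value per run. The page does not define these reductions, so the semantics here are assumptions: latest and final are both treated as the value at the highest step, and best depends on whether higher or lower is better for the metric. The function name summarize and the sample values are made up for illustration.

```python
from typing import List, Tuple

Point = Tuple[int, float]  # (step, value)


def summarize(points: List[Point], mode: str, higher_is_better: bool = True) -> float:
    """Reduce a step series to one value (assumed semantics, not the Hub's)."""
    if not points:
        raise ValueError("no points logged for this metric")
    values = [value for _, value in points]
    if mode in ("latest", "final"):
        # Value at the highest step, regardless of logging order.
        return max(points, key=lambda p: p[0])[1]
    if mode == "min":
        return min(values)
    if mode == "max":
        return max(values)
    if mode == "best":
        return max(values) if higher_is_better else min(values)
    raise ValueError(f"unknown mode: {mode}")


# Made-up validation accuracy series for illustration.
val_accuracy = [(0, 0.41), (20, 0.63), (40, 0.71), (60, 0.69)]
print(summarize(val_accuracy, "final"))  # 0.69
print(summarize(val_accuracy, "best"))   # 0.71
```

The difference between the two printed values is why a bar chart of final values can rank runs differently from a bar chart of best values.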

Use Configure on a panel to adjust its data. For line charts, select one or more metrics and set smoothing when raw training curves are noisy. When smoothing is enabled, the raw curve remains visible in the background so you can see the original signal.
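
The page does not say which smoothing algorithm the line chart uses. As an illustration of what a smoothing control typically does, the sketch below applies an exponential moving average, a common choice for noisy training curves. The function name ema_smooth and the weight parameter are assumptions, not Embedl Hub settings, and the sample curve is made up.

```python
from typing import List, Optional


def ema_smooth(values: List[float], weight: float = 0.6) -> List[float]:
    """Exponential moving average; weight=0 returns the raw curve."""
    if not 0.0 <= weight < 1.0:
        raise ValueError("weight must be in [0, 1)")
    smoothed: List[float] = []
    previous: Optional[float] = None
    for value in values:
        if previous is None:
            previous = value  # seed with the first point
        else:
            previous = weight * previous + (1.0 - weight) * value
        smoothed.append(previous)
    return smoothed


raw_loss = [1.0, 0.4, 0.9, 0.3, 0.8]  # made-up noisy curve
print(ema_smooth(raw_loss, weight=0.5))
```

A higher weight gives a flatter curve that reacts more slowly to real changes, which is one reason to keep the raw curve visible behind the smoothed one.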

Figure: Panel configuration for a line chart with selected metrics and smoothing.

Tailor the view for analysis

Start with a small set of runs, then add panels for the decisions you need to make:

  • Compare loss and val_loss curves to spot underfitting or overfitting
  • Plot final val_accuracy as a bar chart to rank candidate runs
  • Use scatter plots for edge AI tradeoffs, for example top1 vs mean_latency_ms
  • Add a table panel when you need exact parameter values, latest metrics, status and links to run details or artifacts
  • Create separate sections for training, evaluation, compilation and hardware profiling if one dashboard starts to mix too many questions
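
When reading a scatter plot of top1 against mean_latency_ms, the runs worth a closer look are usually those that no other run beats on both axes. The sketch below computes that set offline. It is an illustration only: pareto_front is a hypothetical helper, the run dictionaries are made-up values, and the metric names are the ones used in the list above.

```python
from typing import Dict, List


def pareto_front(runs: List[Dict], accuracy_key: str = "top1",
                 latency_key: str = "mean_latency_ms") -> List[str]:
    """Return names of runs that no other run beats on both metrics."""
    front = []
    for run in runs:
        # A run is dominated if another run is at least as good on both
        # metrics and strictly better on at least one.
        dominated = any(
            other[accuracy_key] >= run[accuracy_key]
            and other[latency_key] <= run[latency_key]
            and (other[accuracy_key] > run[accuracy_key]
                 or other[latency_key] < run[latency_key])
            for other in runs
        )
        if not dominated:
            front.append(run["name"])
    return front


# Made-up values for illustration only.
runs = [
    {"name": "baseline", "top1": 0.76, "mean_latency_ms": 12.0},
    {"name": "pruned", "top1": 0.74, "mean_latency_ms": 7.5},
    {"name": "quantized", "top1": 0.73, "mean_latency_ms": 8.0},
]
print(pareto_front(runs))  # ['baseline', 'pruned']
```

Here the quantized run drops out because the pruned run is both more accurate and faster, while baseline and pruned each win on one axis.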

Your browser keeps a personal draft of the current view. Use Save or Save as to make a dashboard available project-wide to collaborators. Use Copy link when you want to share the exact filters, selected runs and panel layout you are looking at.