Usage

Use the UseDesktop app.

The app is the control plane. Use it to manage workflow packages from Workbench, inspect training artifacts, compare model runs, and decide which environment packages are ready to publish or export.

Dashboard

Start at app.usedesktop.com/dashboard. The dashboard shows what changed recently: captured workflows, created packages, training or eval jobs, hosted models, and failed checks that need review.

Workflows

Review captured source workflows and decide which ones should become tasks.

Runs

Inspect model attempts, scores, traces, and verifier output.

Training

Track datasets, SFT jobs, RL jobs, eval jobs, and exported artifacts.

Models

Compare adapters, checkpoints, hosted endpoints, deployments, and eval results.

Control-plane workflow

1. Upload package

Receive metadata and artifact keys from Workbench or a remote runner.

2. Review quality

Check task solvability, verifier behavior, model failures, and contamination notes.

3. Run evals

Submit model attempts on local hardware, RunPod, AWS, or customer infrastructure.

4. Compare results

Use pass@k, average score, failure mode, and run evidence to decide what to train on.

5. Publish or export

Publish a public eval report or export the package for a lab or customer workflow.
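Step 4 compares runs by pass@k. This page does not say how the app computes it, so the sketch below uses the commonly cited unbiased estimator, 1 - C(n-c, k) / C(n, k), for n attempts with c passes. The function names `pass_at_k` and `mean_pass_at_k` are illustrative only and are not part of any UseDesktop API.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate from n sampled attempts, c of which passed."""
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    if n - c < k:
        # Fewer than k failures: every k-sized sample contains a pass.
        return 1.0
    # Probability that a random k-sized sample contains at least one pass.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over tasks; each entry is (attempts, passes)."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

For example, a task with 4 attempts and 1 pass gives pass@2 = 0.5, and averaging it with a task that passed all 4 attempts gives 0.75. Averaging hides the per-task spread, so read it alongside the per-task distribution and the failure traces.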

What to check before export

Do not export trajectories alone. A useful package also includes grader contracts, verifier audit notes, model pass@k distributions, failure traces, and source provenance. That is the difference between raw data and evidence.
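The checklist above can be turned into a pre-export gate. The sketch below assumes a package is available as a plain dictionary. The key names are hypothetical stand-ins for the components listed here, not the actual package schema.

```python
# Hypothetical key names; the real package schema may differ.
REQUIRED_EVIDENCE = (
    "trajectories",
    "grader_contracts",
    "verifier_audit_notes",
    "pass_at_k_distributions",
    "failure_traces",
    "source_provenance",
)


def missing_evidence(package: dict) -> list[str]:
    """Return the required components that are absent or empty."""
    return [name for name in REQUIRED_EVIDENCE if not package.get(name)]


if __name__ == "__main__":
    # A package holding only trajectories fails the check.
    draft = {"trajectories": ["run-001"]}
    for name in missing_evidence(draft):
        print(f"missing: {name}")
```

An empty result means every component is present. It says nothing about quality, which still needs the review described in step 2.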