
Introduction

Grog can record execution traces for every build, test, and run invocation. Traces capture per-target phase-level timing data.

Traces are stored as Parquet files using the same remote caching backends (local filesystem, S3, or GCS) and can be queried from the terminal or exported for use in dashboards like Grafana, Datadog, or Jaeger.

Add the following to your grog.toml:

[traces]
enabled = true

By default, traces are stored alongside your build cache. To use a separate storage backend (e.g. for different retention or access controls):

[traces]
enabled = true
backend = "s3"
[traces.s3]
bucket = "my-traces-bucket"
prefix = "grog-traces"
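A GCS backend is configured the same way; by symmetry with the S3 example above and the option reference at the end of this page, the section name becomes traces.gcs:

```toml
[traces]
enabled = true
backend = "gcs"

[traces.gcs]
bucket = "my-traces-bucket"
prefix = "grog-traces"
```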

Each trace records a complete picture of a single build invocation:

Build-level metadata:

  • Trace ID, timestamp, and total duration
  • Git commit and branch
  • Grog version and platform
  • Target patterns requested
  • Aggregate counts (targets, cache hits, failures)
  • Critical path timing

Per-target span data:

  • Label, status (success/failure/cancelled), and cache result (hit/miss/skip)
  • 8 phase-level timing breakdowns for bottleneck analysis

Every target’s execution is broken down into granular phases:

Cache miss: |-- queue --|-- hash --|-- cache_check --|-- command --|-- output_write --|-- cache_write --|
Cache hit: |-- queue --|-- hash --|-- cache_check --|-- output_load --|
Phase         What it measures
queue         Time waiting in the worker pool before starting
hash          Time computing the target’s ChangeHash from inputs
cache_check   Time checking target cache, output checks, and taint status
command       Time running the actual shell command
output_write  Time writing outputs and computing the OutputHash
output_load   Time loading cached outputs (cache hit path)
cache_write   Time persisting the TargetResult to cache
dep_load      Time loading dependency outputs (load_outputs=minimal only)

This granularity lets you answer questions like:

  • Which targets are the slowest to build?
  • Is my worker pool saturated (high queue wait)?
  • Are cache operations a bottleneck (high output_load or cache_write)?
  • Do I have targets with many inputs causing slow hashing?
grog traces list
TRACE ID  DATE        CMD    TARGETS  HITS  FAILS  DURATION  COMMIT
a1b2c3d4  2026-03-30  build  42       38    0      12.3s     abc1234
e5f6g7h8  2026-03-30  test   18       12    2      45.1s     abc1234

Filter with --limit, --since, --command, or --failures-only:

grog traces list --since 2026-03-01 --command build --limit 50
grog traces list --failures-only
grog traces show a1b2c3d4

This displays the build summary and a per-target breakdown sorted by duration:

TARGET                STATUS  CACHE  TOTAL  CMD    HASH  I/O   QUEUE
//services/api:build  ok      miss   45.2s  42.1s  0.8s  2.1s  0.2s
//libs/proto:codegen  ok      miss   12.4s  11.1s  0.2s  1.0s  0.1s
//services/api:test   FAIL    miss   8.7s   8.5s   0.1s  0.0s  0.1s
//libs/common:build   ok      hit    0.3s   0.0s   0.0s  0.3s  0.0s

Sort by different columns to surface specific bottlenecks:

grog traces show a1b2c3d4 --sort-by command # slowest commands
grog traces show a1b2c3d4 --sort-by queue # most queue contention
grog traces show a1b2c3d4 --top 10 # top 10 slowest only
grog traces stats

Shows average build duration, cache hit rate, and failure count over recent traces. Add --detailed to load full traces and show per-target breakdowns (slowest targets, highest queue wait, most frequent failures):

grog traces stats --command-type build
grog traces stats --ci true
grog traces stats --detailed --limit 50
grog traces stats --detailed --command-type test

Traces can be exported in two formats for use with external analytics and observability tools.

Each trace is serialized as a single JSON line, suitable for log aggregation tools and data warehouses:

grog traces export --format=jsonl > traces.jsonl
grog traces export --format=jsonl --since 2026-03-01 --output traces.jsonl

JSONL files can be:

  • Uploaded to S3 and queried with AWS Athena
  • Loaded into BigQuery for analysis
  • Ingested by Grafana Loki
  • Piped to jq for ad-hoc queries
# Find the ten slowest targets across all recent builds
grog traces export --format=jsonl | jq -r '.spans[] | "\(.command_duration_millis)ms \(.label)"' | sort -rn | head -10
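As a sketch of the kind of ad-hoc analysis this enables, the pipeline below counts cache hits among a trace’s spans. The inline sample record stands in for one line of exported JSONL, and the field names (spans, label, cache) are illustrative; check them against your actual export schema:

```shell
# Sample record standing in for one line of `grog traces export --format=jsonl`;
# field names are illustrative, not the exact export schema.
printf '%s\n' \
  '{"spans":[{"label":"//libs/common:build","cache":"hit"},{"label":"//services/api:build","cache":"miss"}]}' |
jq '.spans | map(select(.cache == "hit")) | length'
# prints 1
```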

Traces can be exported as OTLP-compatible JSON for distributed tracing backends:

grog traces export --format=otel > traces-otel.json
grog traces export --format=otel --output traces-otel.json

This maps each build to an OTLP trace with the build as the root span and each target as a child span, including all phase timing as span attributes. Compatible with:

  • Grafana Tempo — Waterfall visualization of parallel target execution
  • Jaeger — Trace search and comparison
  • Datadog APM — Build performance monitoring

A typical CI integration pipes traces to your observability stack after each build:

# Run the build
grog build //...
# Export trace to JSONL and ship to your log aggregator
grog traces export --format=jsonl --limit 1 >> /shared/grog-traces.jsonl
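In a hosted CI system, the same two steps can be expressed as pipeline config. The sketch below uses GitHub Actions syntax; the step names and the shared path are hypothetical placeholders, and the export step runs even on failure so failed builds are traced too:

```yaml
# Hypothetical CI steps; adjust the target pattern, output path,
# and log-shipping step to match your stack.
- name: Build with tracing
  run: grog build //...
- name: Ship latest trace
  if: always()
  run: grog traces export --format=jsonl --limit 1 >> /shared/grog-traces.jsonl
```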

For detailed setup instructions with specific backends, see the Integrations guides. The examples/tracing directory contains a Docker Compose setup that runs Jaeger, Grafana Tempo, and Loki locally.

When using a remote cache backend (S3, GCS, Azure), traces written by other machines (e.g. CI) are stored remotely but not automatically available for local querying. Use pull to download them:

grog traces pull

Traces accumulate over time. Use prune to delete traces older than a given duration:

grog traces prune --older-than 30d
grog traces prune --older-than 72h

Traces are stored as Parquet files under a traces/ prefix with date-based partitioning — two tables (builds + spans), one file per trace:

traces/
  builds/
    2026-03-30/
      <trace-id>.parquet   # 1 row with build metadata
  spans/
    2026-03-30/
      <trace-id>.parquet   # 1 row per target with phase timing

There is no index file. DuckDB queries Parquet files directly via glob patterns. Each trace is typically a few KB.

Option             Type    Default               Description
traces.enabled     bool    false                 Enable trace collection
traces.backend     string  (uses cache backend)  Override storage backend ("s3" or "gcs")
traces.gcs.bucket  string                        GCS bucket for trace storage
traces.gcs.prefix  string  /                     Prefix within the GCS bucket
traces.s3.bucket   string                        S3 bucket for trace storage
traces.s3.prefix   string  /                     Prefix within the S3 bucket