
Introduction

Grog can record execution traces for every build, test, and run invocation. Traces capture per-target phase-level timing data.

Traces are stored as Parquet files using the same remote caching backends (local filesystem, S3, or GCS) and can be queried from the terminal or exported for use in dashboards like Grafana, Datadog, or Jaeger.

Add the following to your grog.toml:

[traces]
enabled = true

By default, traces are stored alongside your build cache. To use a separate storage backend (e.g. for different retention or access controls):

[traces]
enabled = true
backend = "s3"
[traces.s3]
bucket = "my-traces-bucket"
prefix = "grog-traces"
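A GCS backend is configured the same way; by symmetry with the S3 example above and the option reference at the end of this page, the section name becomes traces.gcs:

```toml
[traces]
enabled = true
backend = "gcs"

[traces.gcs]
bucket = "my-traces-bucket"
prefix = "grog-traces"
```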

Each trace records a complete picture of a single build invocation:

Build-level metadata:

  • Trace ID, timestamp, and total duration
  • Git commit and branch
  • Grog version and platform
  • Target patterns requested
  • Aggregate counts (targets, cache hits, failures)
  • Critical path timing

Per-target span data:

  • Label, status (success/failure/cancelled), and cache result (hit/miss/skip)
  • 8 phase-level timing breakdowns for bottleneck analysis

Every target’s execution is broken down into granular phases:

Cache miss: |-- queue --|-- hash --|-- cache_check --|-- command --|-- output_write --|-- cache_write --|
Cache hit: |-- queue --|-- hash --|-- cache_check --|-- output_load --|
Phase         What it measures
queue         Time waiting in the worker pool before starting
hash          Time computing the target’s ChangeHash from inputs
cache_check   Time checking target cache, output checks, and taint status
command       Time running the actual shell command
output_write  Time writing outputs and computing the OutputHash
output_load   Time loading cached outputs (cache hit path)
cache_write   Time persisting the TargetResult to cache
dep_load      Time loading dependency outputs (load_outputs=minimal only)

This granularity lets you answer questions like:

  • Which targets are the slowest to build?
  • Is my worker pool saturated (high queue wait)?
  • Are cache operations a bottleneck (high output_load or cache_write)?
  • Do I have targets with many inputs causing slow hashing?
grog traces list
TRACE ID  DATE        CMD    TARGETS  HITS  FAILS  DURATION  COMMIT
a1b2c3d4  2026-03-30  build  42       38    0      12.3s     abc1234
e5f6g7h8  2026-03-30  test   18       12    2      45.1s     abc1234

Filter with --limit, --since, --command, or --failures-only:

grog traces list --since 2026-03-01 --command build --limit 50
grog traces list --failures-only
grog traces show a1b2c3d4

This displays the build summary and a per-target breakdown sorted by duration:

TARGET                STATUS  CACHE  TOTAL  CMD    HASH  I/O   QUEUE
//services/api:build  ok      miss   45.2s  42.1s  0.8s  2.1s  0.2s
//libs/proto:codegen  ok      miss   12.4s  11.1s  0.2s  1.0s  0.1s
//services/api:test   FAIL    miss   8.7s   8.5s   0.1s  0.0s  0.1s
//libs/common:build   ok      hit    0.3s   0.0s   0.0s  0.3s  0.0s

Sort by different columns to surface specific bottlenecks:

grog traces show a1b2c3d4 --sort-by command # slowest commands
grog traces show a1b2c3d4 --sort-by queue # most queue contention
grog traces show a1b2c3d4 --top 10 # top 10 slowest only
grog traces stats

Shows average build duration, cache hit rate, and failure count over recent traces. Add --detailed to load full traces and show per-target breakdowns (slowest targets, highest queue wait, most frequent failures):

grog traces stats --command-type build
grog traces stats --ci true
grog traces stats --detailed --limit 50
grog traces stats --detailed --command-type test

Traces can be exported in two formats for use with external analytics and observability tools.

Each trace is serialized as a single JSON line, suitable for log aggregation tools and data warehouses:

grog traces export --format=jsonl > traces.jsonl
grog traces export --format=jsonl --since 2026-03-01 --output traces.jsonl

JSONL files can be:

  • Uploaded to S3 and queried with AWS Athena
  • Loaded into BigQuery for analysis
  • Ingested by Grafana Loki
  • Piped to jq for ad-hoc queries
# Find the ten slowest targets across all recent builds
grog traces export --format=jsonl | jq -r '.spans[] | "\(.command_duration_millis)ms \(.label)"' | sort -rn | head -10
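As a sketch of the kind of ad-hoc analysis this enables, the pipeline below counts cache hits among a trace’s spans. The inline sample record stands in for one line of exported JSONL, and the field names (spans, label, cache) are illustrative; check them against your actual export schema:

```shell
# Sample record standing in for one line of `grog traces export --format=jsonl`;
# field names are illustrative, not the exact export schema.
printf '%s\n' \
  '{"spans":[{"label":"//libs/common:build","cache":"hit"},{"label":"//services/api:build","cache":"miss"}]}' |
jq '.spans | map(select(.cache == "hit")) | length'
# prints 1
```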

Traces can be exported as OTLP-compatible JSON for distributed tracing backends:

grog traces export --format=otel > traces-otel.json
grog traces export --format=otel --output traces-otel.json

This maps each build to an OTLP trace with the build as the root span and each target as a child span, including all phase timing as span attributes. Compatible with:

  • Grafana Tempo — Waterfall visualization of parallel target execution
  • Jaeger — Trace search and comparison
  • Datadog APM — Build performance monitoring

A typical CI integration pipes traces to your observability stack after each build:

# Run the build
grog build //...
# Export trace to JSONL and ship to your log aggregator
grog traces export --format=jsonl --limit 1 >> /shared/grog-traces.jsonl
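In a hosted CI system, the same two steps can be expressed as pipeline config. The sketch below uses GitHub Actions syntax; the step names and the shared path are hypothetical placeholders, and the export step runs even on failure so failed builds are traced too:

```yaml
# Hypothetical CI steps; adjust the target pattern, output path,
# and log-shipping step to match your stack.
- name: Build with tracing
  run: grog build //...
- name: Ship latest trace
  if: always()
  run: grog traces export --format=jsonl --limit 1 >> /shared/grog-traces.jsonl
```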

For detailed setup instructions with specific backends, see the Integrations guides. The examples/tracing directory contains a Docker Compose setup that runs Jaeger, Grafana Tempo, and Loki locally.

When using a remote cache backend (S3, GCS, Azure), traces written by other machines (e.g. CI) are stored remotely but not automatically available for local querying. Use pull to download them:

grog traces pull

Traces accumulate over time. Use prune to delete traces older than a given duration:

grog traces prune --older-than 30d
grog traces prune --older-than 72h

Traces are stored as Parquet files under a traces/ prefix with date-based partitioning — two tables (builds + spans), one file per trace:

traces/
  builds/
    2026-03-30/
      <trace-id>.parquet   # 1 row with build metadata
  spans/
    2026-03-30/
      <trace-id>.parquet   # 1 row per target with phase timing

There is no index file. DuckDB queries Parquet files directly via glob patterns. Each trace is typically a few KB.

Option             Type    Default               Description
traces.enabled     bool    false                 Enable trace collection
traces.backend     string  (uses cache backend)  Override storage backend ("s3" or "gcs")
traces.gcs.bucket  string                        GCS bucket for trace storage
traces.gcs.prefix  string  /                     Prefix within the GCS bucket
traces.s3.bucket   string                        S3 bucket for trace storage
traces.s3.prefix   string  /                     Prefix within the S3 bucket