
Telemetry capturing is the process of recording the logs and traces emitted while the binary engine handles a request, and returning them in the response.

The interaction diagram below shows the different roles at play during telemetry capturing. A textual explanation follows it. For the sake of example, a server environment (the query-engine crate) is assumed.

Interaction diagram (participants and numbered messages)

Participants, left to right: Server, s: Settings, Capturer::Enabled, PROCESSOR <<SpanProcessor, Sync>>, Sender, Storage, TRACER, the code that processes the query, res: PrismaResponse. The PROCESSOR, Sender, Storage and TRACER boxes are drawn with a double border.

 [1]  ───────▶ Server: POST (body, headers)
      Server ───────▶ s: Settings: new(headers)
 [2]  Server ───────▶ Capturer: new(trace_id, s)
 [3]  Server ───────▶ Capturer: start_capturing()
      Capturer ───────▶ PROCESSOR: start_capturing(trace_id, s)
 [4]  PROCESSOR ── ── ─▶ Sender: send(StartCapturing, trace_id)
      Sender ───────▶ Storage: insert(trace_id, s)
 [5]  Server ───────▶ query processing code: process_query
 [6]  query processing code ───────▶ TRACER: log! / span!
 [7]  TRACER ───────▶ PROCESSOR: on_end(span_data)
 [8]  PROCESSOR ── ── ─▶ Sender: send(SpanDataProcessed, trace_id)
      Sender ───────▶ Storage: append(trace_id, logs, traces)
 [9]  query processing code ───────▶ res: PrismaResponse: new
      query processing code ─ ─ ─ ─▷ Server: return res: PrismaResponse
[10]  Server ───────▶ Capturer: fetch_captures()
      Capturer ───────▶ PROCESSOR: fetch_captures(trace_id)
[11]  PROCESSOR ── ── ─▶ Sender: send(FetchCaptures, trace_id)
      Sender ───────▶ Storage: get logs/traces
      Storage ─ ─ ─ ─▷ Server: logs, traces
[12]  Server ───────▶ res: res.set_extension(logs)
      Server ───────▶ res: res.set_extension(traces)
[13]  Server ─ ─ ─ ─▷ json!(res)

◀─────── call (pseudo-signatures)

◀─ ── ── async message passing (channels)

◁─ ─ ─ ─ return

The diagram shows objects whose lifetime is static; their boxes are drawn with a double border. These are:

  • The server itself
  • The global TRACER, which handles log! and span! and uses the global PROCESSOR to process the data that makes up trace Spans and log Events
  • The global PROCESSOR, which manages the Storage set of data structures, holding logs, traces (and capture settings) per request.

Then, through the request lifecycle, different objects are created and dropped:

  • When a request comes in, its headers are processed and a Settings object is built. This object determines how logging and tracing are captured for the request: traces only, logs only, or both, and which log levels are captured.
  • Based on the settings, a new Capturer is created; a capturer is simply a wrapper around an exporter that starts capturing and fetches the captures for this particular request.
  • An asynchronous task is spawned to own the storage of telemetry data without needing to share memory across threads. Communication with this task is done through channels. The Sender part of the channel is kept in a global, so it can be cloned and used a) by the Capturer (to start capturing / fetch the captures) or b) by the tracer’s SpanProcessor, to extract tracing and logging information that is eventually displayed to the user.
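
As an illustration of how a Settings object might be derived from a header value, here is a minimal sketch. The comma-separated format, the "tracing" token, the field names and the from_header / is_enabled functions are all assumptions made for this example; they are not taken from the crate.

```rust
use std::collections::HashSet;

/// Hypothetical per-request capture settings; field names are assumptions.
#[derive(Debug, Default, PartialEq)]
struct Settings {
    /// Whether trace spans should be captured.
    traces: bool,
    /// Log levels to capture, for example "info" or "debug".
    log_levels: HashSet<String>,
}

impl Settings {
    /// Parses an assumed comma-separated header value such as "tracing, info".
    fn from_header(value: &str) -> Self {
        let mut settings = Settings::default();
        for token in value.split(',').map(str::trim).filter(|t| !t.is_empty()) {
            match token.to_ascii_lowercase().as_str() {
                "tracing" => settings.traces = true,
                level @ ("error" | "warn" | "info" | "debug" | "trace") => {
                    settings.log_levels.insert(level.to_string());
                }
                // Unknown tokens are silently ignored in this sketch.
                _ => {}
            }
        }
        settings
    }

    /// Capturing is on when traces or at least one log level was requested.
    fn is_enabled(&self) -> bool {
        self.traces || !self.log_levels.is_empty()
    }
}

fn main() {
    let settings = Settings::from_header("tracing, info");
    println!("{settings:?}, enabled: {}", settings.is_enabled());
}
```

With this shape, an empty or missing header yields settings for which nothing is captured, so the request pays no capturing cost.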

Then the capturing process works in this way:

  • The server receives a query [1].
  • It grabs the HTTP headers and builds a Capturer object [2], which is configured with the settings denoted by the X-capture-telemetry header.
  • Now the server tells the Capturer to start capturing all the logs and traces occurring during the request, which is identified by a trace_id [3]. The trace_id is either carried in the traceparent header or implicitly created on the first span of the request.
  • The Capturer sends a message to the task owning the storage to start capturing [4]. The task creates a new entry in the storage for the given trace_id. Spans without a corresponding trace_id in the storage are ignored.
  • The server dispatches the request, which is processed somewhere else in the code [5].
  • There, the code logs events and emits traces asynchronously as part of the processing [6].
  • Traces and logs arrive at the TRACER, and get hydrated as SpanData in the PROCESSOR [7].
  • This SpanData is sent through a channel to the task running in parallel [8]. The task transforms the SpanData into TraceSpans and LogEvents, depending on the capture settings, and stores those spans and events in the storage.
  • When the code that dispatches the request is done, it returns a PrismaResponse to the server [9].
  • Then the server asks the PROCESSOR to fetch the captures [10].
  • As before, the PROCESSOR sends a message to the task running in parallel, to fetch the captures from the Storage [11]. At that time, although this is not represented in the diagram, the captures are deleted from the storage, freeing any memory used for capturing during the request.
  • Finally, the server sets the logs and traces extensions in the PrismaResponse [12], serializes the extended response in JSON format and returns it as an HTTP response blob [13].
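
The message passing in steps [4], [8] and [11] can be sketched as a task that exclusively owns the storage and is driven through a channel. This is a simplified stand-in, not the crate's implementation: it uses a standard-library thread and std::sync::mpsc instead of an asynchronous task, stores captures as plain strings, and leaves the settings out. The variant names mirror the diagram labels; spawn_storage_task and fetch_captures are hypothetical helpers.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{self, Sender};
use std::thread;

/// Messages understood by the storage task; names mirror the diagram labels.
enum Message {
    StartCapturing(String),
    SpanDataProcessed(String, String),
    FetchCaptures(String, Sender<Vec<String>>),
}

/// Spawns the task that exclusively owns the storage and returns its sender.
fn spawn_storage_task() -> Sender<Message> {
    let (tx, rx) = mpsc::channel::<Message>();
    thread::spawn(move || {
        let mut storage: HashMap<String, Vec<String>> = HashMap::new();
        for msg in rx {
            match msg {
                Message::StartCapturing(trace_id) => {
                    storage.insert(trace_id, Vec::new());
                }
                Message::SpanDataProcessed(trace_id, item) => {
                    // Data for trace ids that are not being captured is ignored.
                    if let Some(items) = storage.get_mut(&trace_id) {
                        items.push(item);
                    }
                }
                Message::FetchCaptures(trace_id, reply) => {
                    // Fetching removes the entry, freeing the memory it used.
                    let captures = storage.remove(&trace_id).unwrap_or_default();
                    let _ = reply.send(captures);
                }
            }
        }
    });
    tx
}

/// Fetches (and thereby deletes) the captures recorded for a trace id.
fn fetch_captures(tx: &Sender<Message>, trace_id: &str) -> Vec<String> {
    let (reply_tx, reply_rx) = mpsc::channel();
    tx.send(Message::FetchCaptures(trace_id.to_string(), reply_tx))
        .expect("storage task is running");
    reply_rx.recv().expect("storage task replies")
}

fn main() {
    let tx = spawn_storage_task();
    tx.send(Message::StartCapturing("trace-1".into())).unwrap();
    tx.send(Message::SpanDataProcessed("trace-1".into(), "span: query".into())).unwrap();
    tx.send(Message::SpanDataProcessed("other".into(), "dropped".into())).unwrap();
    println!("{:?}", fetch_captures(&tx, "trace-1"));
}
```

Because only the spawned task touches the map, no lock is needed; every other participant holds nothing more than a cloned Sender.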

Modules

Structs

Enums

  • Capturer determines, based on a set of settings and a trace id, how capturing is going to be handled. Generally, both the trace id and the settings will be derived from request headers. Thus, a new value of this enum is created per request.
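
A rough sketch of what such a per-request enum could look like follows. Only the Capturer::Enabled variant and the new(trace_id, settings) call appear in the diagram; the Disabled variant, the fields, the simplified Settings struct and the helper methods are assumptions for illustration.

```rust
/// Hypothetical, simplified settings; the real type is not shown here.
#[derive(Debug, Clone, PartialEq)]
struct Settings {
    traces: bool,
    logs: bool,
}

/// One value is built per request from its trace id and settings.
#[derive(Debug, PartialEq)]
enum Capturer {
    Enabled { trace_id: String, settings: Settings },
    /// Assumed variant for requests that asked for no telemetry.
    Disabled,
}

impl Capturer {
    fn new(trace_id: &str, settings: Settings) -> Self {
        if settings.traces || settings.logs {
            Capturer::Enabled { trace_id: trace_id.to_string(), settings }
        } else {
            Capturer::Disabled
        }
    }

    fn is_enabled(&self) -> bool {
        matches!(self, Capturer::Enabled { .. })
    }

    /// Returns the trace id being captured, if capturing is enabled.
    fn trace_id(&self) -> Option<&str> {
        match self {
            Capturer::Enabled { trace_id, .. } => Some(trace_id.as_str()),
            Capturer::Disabled => None,
        }
    }
}

fn main() {
    let capturer = Capturer::new("trace-1", Settings { traces: true, logs: false });
    if let Capturer::Enabled { trace_id, settings } = &capturer {
        println!("capturing {trace_id} with {settings:?}");
    }
}
```

Modelling the disabled case as its own variant lets callers invoke the same methods on every request and have them do nothing when capturing was not requested.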

Traits

Functions

  • Creates a new capturer, which is configured to export traces and log events happening during a particular request
  • Adds a capturing layer to the given subscriber and installs the transformed subscriber as the global, default subscriber