Dynamic Telemetry is a PROPOSAL : please provide feedback! :-)

Dynamic Telemetry is not an implementation; it is a request for collaboration that will lead to a shared understanding and, hopefully, one or more implementations.

Your feedback and suggestions on this document are highly encouraged!

Please:

  1. Join us by providing comments or feedback on our Discussions page

  2. Submit a PR with changes to this file (docs/PositionPaper.DeliveryGuarantees.document.md)

Direct Sharing URL

http://microsoft.github.io/DynamicTelemetry/docs/PositionPaper.DeliveryGuarantees.document/

Delivery Guarantees of Dynamic Telemetry (and OpenTelemetry)

  1. Telemetry must always be allowed to be lossy when push comes to shove
  2. Telemetry should never be lost unless push has come to shove

In simpler terms:

  1. It's never okay to lose telemetry on machines that have surplus memory
  2. It's usually not a good idea to store telemetry on disk, except in a dire emergency
  3. A delivery guarantee cannot be given to the users of telemetry - telemetry is not a replacement for transaction processing

Reason:

  1. Assume a service is operating nominally, servicing business needs
  2. Assume the telemetry backend locks up or becomes unreachable, perhaps because of a network outage
  3. A good telemetry system should queue, first to RAM, and maybe to disk
  4. As telemetry accumulates, at some point a decision must be made: either (a) start dropping telemetry, or (b) stop servicing customer workloads

The right answer is to continue servicing customer workloads and to start dropping telemetry.
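The reasoning above can be illustrated with a minimal sketch of a bounded in-memory queue. This is not part of Dynamic Telemetry or OpenTelemetry; the class name `BoundedTelemetryBuffer`, the `max_events` limit, and the `send` callback are hypothetical names chosen for illustration. The choice to discard the oldest event first is also an assumption, since the paper does not say which events to shed.

```python
from collections import deque


class BoundedTelemetryBuffer:
    """Hypothetical RAM queue that sheds telemetry instead of blocking the caller."""

    def __init__(self, max_events):
        if max_events <= 0:
            raise ValueError("max_events must be positive")
        self._events = deque()
        self._max_events = max_events
        self.dropped = 0  # count of events discarded under memory pressure

    def enqueue(self, event):
        """Never blocks: the customer workload always proceeds."""
        if len(self._events) >= self._max_events:
            # Push has come to shove: drop the oldest event (assumed policy).
            self._events.popleft()
            self.dropped += 1
        self._events.append(event)

    def drain(self, send):
        """Flush queued events with send(event) -> bool; stop at the first failure."""
        sent = 0
        while self._events:
            if not send(self._events[0]):
                break  # backend still unavailable; keep the remainder queued
            self._events.popleft()
            sent += 1
        return sent


# Usage: the backend is down while five events arrive at a buffer of size three.
buf = BoundedTelemetryBuffer(max_events=3)
for i in range(5):
    buf.enqueue({"seq": i})
print(buf.dropped)  # 2 events were shed; the workload was never blocked
```

While memory is available nothing is lost, and once the limit is reached the buffer loses telemetry rather than stalling the service. Tracking the `dropped` count lets the loss itself be reported once the backend recovers.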