Logging in a Distributed System
A distributed system is composed of several applications calling each other to complete one operation. Each of these applications emits its own logs and often stores them in different locations. With distributed systems, logging is the easy part. What's much harder is to make sense of this ocean of logs from a logical point of view.
If you need to implement logging for a distributed system, the first step is to select a single location to store the logs. Dozens of solutions are available and PostSharp Logging is agnostic about this choice.
The next challenge is to correlate the log records coming from different applications to get a logical view of all records relevant to the processing of a specific request. Typically, this means adding some HTTP headers on client side and read them on server side. PostSharp Logging implements the HTTP Correlation Protocol specification for HttpClient and ASP.NET. It also exposes APIs you can use to implement distributed logging with other technologies or implement different correlation protocols.
This topic contains the following sections:
TODO: point to the example and blog post.
To prepare a set of applications for distributed logging:
If you have any application that emit outgoing HTTP requests or process incoming HTTP requests using ASP.NET or ASP.NET core, instrument tham as described in Instrumentation of ASP.NET and HttpClient and make sure to pass an implementation of the ICorrelationProtocol interface.
This will result in creating a property named RequestId on the client and the server part of the request. Additionally, the SyntheticId property on client and server side will be synchronized.
If an application emits or accepts requests using another technology than HttpClient or ASP.NET, for instance WCF or MSMQ, you have to implement correlation manually:
On the client side, add a header to your outgoing request with the value of the current SyntheticId, which you can get from LoggingServices.DefaultBackend.CurrentContext.SyntheticId.
On the server side, you need to read the parent id from the incoming headers, open a new activity (see Working with Custom Activities), and set the OpenActivityOptions.SyntheticParentId property. You may take this oppportunity for implementing support for per-request logging. See Defining Your Own Logging Transactions for details.
Choose a logging back-end that supports logging properties (Serilog is a good choice) and configure it to write to a central log server such as Elasticsearch. You can correlate items using the RequestId property but it's more convenient to use the EventId property, generated from SyntheticId.
Choose a generation strategy for SyntheticId as described below.
Enabling request correlation on a publically available service is a potential security risk. Logging is often performed before the request is fully authenticated, so the content of the additional correlation HTTP headers cannot be trusted. However, they can be used to inject values to your logging system. These values may then be processed by your application, and will be stored in your log server.
If you need to enable correlation of distributed logging on a public service, consider wrapping a default implementation of ICorrelationProtocol with your own logic that verifies the request security.
To correlate traces across processes, it is necessary to assign a proper identifier to each context and record. PostSharp Logging offers a suitable context identifier in the SyntheticId property. You can read this identifier from logSource.CurrentContext.SyntheticId or activity.Context.SyntheticId.
The most important benefit of this specification is that it is possible to select all children records of a logical distributed operation only by filtering by the beginning of the SyntheticId, e.g. in pseudo-SQL:
SELECT * FROM Records WHERE SyntheticId LIKE '|45ed51e.1.da4e9679%';
Additionally, our implementation generates sortables identifiers, so that ordering records by SyntheticId makes sense:
SELECT * FROM Records WHERE SyntheticId LIKE '|45ed51e.1.da4e9679%' ORDER BY SyntheticId ASC;
The filtering and ordering characteristics of the identifiers depend on the generation strategy, as explained below.
Comparing generation strategies
ContextIdGenerationStrategy is the most important configuration option. It determines how the LoggingContext.Id property (a 64-bit integer) is generated. There are basically two stategies: global and hierarchical. They are compared in the next table.
This strategy uses a single AppDomain-wide static 64-bit counter to generate the identifier. The SyntheticId is therefore composed of the RootSyntheticId (by default, a random 64-bit integer) followed by the 64-bit Id itself.
The Id property is generated using a counter in the scope of the parent context. If there is no parent context, the global strategy is used. Therefore, the SyntheticId property is composed of the SyntheticId of the parent context, plus the Id of the current context.
Length of generated identifiers
Potentially very long.
May cause thread contention on highly loaded systems because several threads may be trying to get exclusive access to the global counter at the same time.
No thread contention issue, but more CPU time required to render the id.
Preserving time, but not causality (parallel async calls are mixed).
Preserving causality, but not time (parallel async calls are separated from each other).
Only based on cross-process boundaries (see discussion on SyntheticParentId below).
Based on any context (external activity, method, custom in-process activity).