prometheus_client, a Prometheus Implementation for Dart

We recently published the stable 1.0.0 release of our prometheus_client package — a way to expose Prometheus metrics in your Dart application. It’s inspired by similar packages like the official client_java for Java or the community package prom-client for Node.js. We have several Microservices written in Dart in which we use the package to provide observability. I want to share the experience we had while developing the package.

We have a rather large application that has been under development for more than 10 years. There is a lot of code to share between front-end and back-end, so the desire to write everything in a single language, in this case Dart, is high. This is especially true because the application architecture originated in a monolithic design but is nowadays split into multiple Microservices. While we take advantage of the fact that Microservices can be built with heterogeneous frameworks and languages — following the principle of using the right tool for the job — we still have parts where we need to run the same logic in both front-end and back-end.

State of the Dart ecosystem for backends

But this isn’t that easy. Back when we started with Dart 10 years ago, it was a new, rising language designed for front-end and back-end work. It came with the DartVM for running backend loads, Dart2JS for compiling to JavaScript code that runs in any browser, and there was even the goal of establishing Dart as a natively supported language in browsers, as an alternative to JavaScript. But since then, the Dart team has changed its mission multiple times. Today it’s: “Dart is a client-optimized language for fast apps on any platform”. Notice the emphasis on client — the Dart team doesn’t have the goal of building a language that can be used for everything.

This is something you also notice when looking through the packages in the Dart ecosystem: there aren’t a lot of backend-related packages available on Pub, Dart’s package manager. And if you find the right package for your backend project, there is still a high chance that it isn’t maintained anymore. Sure, there are packages for Postgres or MongoDB (even though support is limited), but client libraries for common services like Kafka aren’t available. We even had issues with the lack of well-maintained JWT implementations in the past. There aren’t any big, full-featured backend frameworks available (like NestJS for Node.js), and the previously existing ones have been discontinued. Summing this up, I could write a story about the danger of using tools outside of the use cases they are designed for — but I will keep that for another day 😉.

It’s not as bad as it looks: the Dart team itself still uses Dart for writing servers and tools. One example is pub.dev, the website behind the Pub package manager, which is completely written in Dart and hosted on AppEngine. A central package for this is shelf. shelf is the HTTP server of the Dart ecosystem, similar to Express for Node.js. A small ecosystem of middlewares and tools has grown around shelf, so it’s a good starting point for building your own backend service. Today, Dart code can be compiled to native code, which makes it interesting for creating very small distroless Docker containers.
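To give an impression of what that looks like, here is a minimal shelf server sketched from the package’s basic building blocks (the greeting and port are made up):

import 'package:shelf/shelf.dart';
import 'package:shelf/shelf_io.dart' as shelf_io;

Future<void> main() async {
  // Chain a logging middleware and a simple handler into a pipeline.
  final handler = const Pipeline()
      .addMiddleware(logRequests())
      .addHandler((request) => Response.ok('Hello from Dart!'));

  // Start the HTTP server on localhost:8080.
  await shelf_io.serve(handler, 'localhost', 8080);
}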

Why you need Observability

As mentioned before, we split our monolithic application into a Microservice architecture. While that gives us benefits, like developing, deploying, and scaling parts of the system independently, it comes at the cost of a less reliable and much more complex system. This is especially an issue when we want to analyze problems, as we first have to figure out in which part of the system a problem originates.

This is where observability comes into play. Observability is the extension of monitoring: it tries to collect as much data about the system as is required for answering questions about its state. The three pillars of observability are logs, metrics, and traces. We already collect logs, but would like to have a more aggregated overview of the state of the system. This is something metrics can deliver, providing observations of values in a time series. Examples of metrics are memory usage over time, processed requests per second, or the number of errors per second. It’s also important that these metrics are correctly labeled, so that you can interpret them and correlate them with parts of your system.

Common tools for collecting and visualizing metrics are Prometheus and Grafana. Prometheus is a platform to collect metrics from your services and to aggregate, store, and query them. In combination with Grafana, a technology-independent data visualization tool, it gives you the ability to visualize your metrics on dashboards and to notify you of unexpected deviations via alerts. Exporting metrics to Prometheus is supported for a broad range of programming languages. But back then, there wasn’t any package for that available on Pub!

Welcome prometheus_client!

Based on this need, we created prometheus_client. The Prometheus documentation provides a wide range of specifications and guidance for that. For example, the exposition formats are well described, and there are even guidelines for writing client libraries. The latter makes it easy to use the libraries: they all work the same and are easy to learn, independent of the language you are using.
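For illustration, the plain-text exposition format (version 0.0.4) that a client library exposes looks roughly like this (metric name and values are made up):

# HELP http_requests_total The total number of HTTP requests.
# TYPE http_requests_total counter
http_requests_total{method="post",code="200"} 1027
http_requests_total{method="get",code="200"} 8754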

We integrated with shelf early on, making it possible to instrument a server and to expose metrics as an endpoint. However, we got feedback from the community that such a tight integration isn’t desired and were asked to remove the hard dependency on shelf. We did so and split the package into two parts: prometheus_client and prometheus_client_shelf. The latter provides a middleware for collecting metrics in a shelf server and a handler for exposing them at a metrics endpoint.

Lately, before performing the stable release, we introduced a collectCallback, inspired by a similar feature in prom-client. The callback allows you to update the value of a metric right before it is collected, which is handy if you need to observe something at a specific point in time or run some aggregation logic on demand. For example, you might want to perform a database query to count the users of your service. In such a case it’s important that you can run asynchronous code inside the callback by returning a Future. Implementing this required a breaking change to the whole architecture, as the collection process is now completely asynchronous. Before the stable release we also took some time to polish the documentation, which hopefully now provides everything you need to know.
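Jumping ahead a bit, a rough sketch of such a callback on a gauge could look like this. The exact callback signature may differ slightly, and countUsersInDatabase is a made-up placeholder for your own query:

import 'package:prometheus_client/prometheus_client.dart';

// Hypothetical placeholder for your own asynchronous database query.
Future<int> countUsersInDatabase() async => 42;

// Sketch only: assumes collectCallback receives the metric instance and may
// return a Future that is awaited during collection.
final userCount = Gauge(
  name: 'app_users_total',
  help: 'The current number of registered users.',
  collectCallback: (gauge) async {
    gauge.value = (await countUsersInDatabase()).toDouble();
  },
);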

Let’s have a look at how you can use the package. First you have to create a metric and update its value.

// Create a counter metric that counts requests
final requestsCounter = Counter(
  name: 'metric_requests_total',
  help: 'The total amount of requests.',
);

// Count a request, increase the counter by one
requestsCounter.inc();

Besides counters, we also provide gauges (a point-in-time value), histograms (sorting values into buckets), and summaries (providing percentiles). Each metric is registered at a central registry. Once you want to expose the metrics, for example at a metrics endpoint, collect all metric samples and convert them into the text representation:

// Collect all metrics in the default registry
final metrics = await CollectorRegistry.defaultRegistry
  .collectMetricFamilySamples();
// Convert the metrics into the text representation and
// write them into the request output.
format.write004(request.response, metrics);
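To tie this back to shelf, the following sketch wires the collection code from above into a plain shelf handler that serves the metrics endpoint. In practice, prometheus_client_shelf ships a ready-made handler and middleware for this, so you normally don’t write it yourself; the import paths here are assumptions based on the format prefix used above:

import 'package:prometheus_client/format.dart' as format;
import 'package:prometheus_client/prometheus_client.dart';
import 'package:shelf/shelf.dart';
import 'package:shelf/shelf_io.dart' as shelf_io;

// A shelf handler that collects all registered metrics and writes them
// in the Prometheus 0.0.4 text format.
Future<Response> metricsHandler(Request request) async {
  final metrics = await CollectorRegistry.defaultRegistry
      .collectMetricFamilySamples();
  final buffer = StringBuffer();
  format.write004(buffer, metrics);
  return Response.ok(
    buffer.toString(),
    headers: {'content-type': 'text/plain; version=0.0.4; charset=utf-8'},
  );
}

Future<void> main() async {
  await shelf_io.serve(metricsHandler, 'localhost', 8080);
}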

For more usage examples, have a look at the package README.

The current state of the package fulfils all our needs, but it could still be extended in the future. One possible feature is support for isolates, Dart’s concurrency mechanism. For now, metrics are only collected in the current isolate, but we could support aggregating the metrics from multiple isolates. prom-client already has support for this, as Node.js has a concurrency model similar to Dart’s.

Another topic is the default metrics suggested by the guidelines for writing client libraries. While we have limited support for some of them (namely dart_info, process_start_time_seconds, and process_resident_memory_bytes), it would be desirable to implement more. We would also be interested in metrics related to the Dart VM itself, for example garbage collection statistics. However, the Dart runtime doesn’t make it easy to expose any of them. We could fall back to platform-specific APIs via dart:ffi or the Dart VM service protocol, but that would make it harder to keep the package portable.

For now, we only support the shelf ecosystem by providing prometheus_client_shelf, but we could integrate with more backend frameworks if there were any. The language has grown a lot lately due to its usage in Flutter, and I guess the desire to use it as a back-end language will grow even further in the future. If it’s not the main goal of the Dart team, maybe it’s something the community can establish?

After all, it was a fun project, and I hope that making it open source helps others use Dart as a backend language.