Prometheus group by substring of label


Solution 1

While it'd be best to fix the metrics, the next best thing is to extract the project name at scrape time with metric_relabel_configs, using the same technique as this blog post:

  metric_relabel_configs:
  - source_labels: [index]
    regex: 'project\.([^.]*)\..*'
    replacement: '${1}'
    target_label: project

You will then have a project label that you can use as usual.
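
With the relabeling in place, the grouping from the question reduces to an ordinary aggregation. A minimal sketch, assuming the metric is named metric (the placeholder used in the question) and that the project label has been populated by the rule above:

```promql
sum by (project) (metric)
```

For an index value such as project.sample-x.ad19f880-2f16-11e7-8a64-jkzdfaskdfjk.2018.03.12, the capture group yields project="sample-x". Relabeling only applies to samples scraped after the rule is deployed, so older data will not carry the new label.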

Solution 2

It looks as if the index label naming convention is wrong. These constituents should clearly be separate metric labels instead. So, instead of project.<projectname>.<uniqueid>.<date> you should be storing timeseries with labels as in:

project{projectname="xxx", uniqueid="yyy"}

As for the date part, I assume this is already covered by the sample timestamps themselves? There are plenty of functions, such as timestamp(), month(), and day_of_month(), for working with the timestamp.
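
As a rough illustration of those timestamp functions, again assuming a placeholder metric named metric:

```promql
# Month (1-12, in UTC) of the most recent sample of each series
month(timestamp(metric))
```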

So you have two options:

  • fix the metrics exporter to separate the labels instead of concatenating this information into a single label value,
  • OR, if you can't alter the exporter, use metric relabeling to convert the index label into new time series with the label set I described above. You can then use standard Prometheus functions as you described. See this article for an example of how you could do this.

Author: Lars Milland

Updated on June 04, 2022

Comments

  • Lars Milland
    Lars Milland almost 2 years

    I am trying to write a sum and group by query in Prometheus on a metric whose label values are too unique for my sum and group by requirements.

    I have a metric that samples the sizes of Elasticsearch indices, where the index names are attached to the metric as a label. The indices are named like this and are placed in the label "index":

    project.<projectname>.<uniqueid>.<date>

    with concrete value that would look like this:

    project.sample-x.ad19f880-2f16-11e7-8a64-jkzdfaskdfjk.2018.03.12

    project.sample-y.jkcjdjdk-1234-11e7-kdjd-005056bf2fbf.2018.03.12

    project.sample-x.ueruwuhd-dsfg-11e7-8a64-kdfjkjdjdjkk.2018.03.11

    project.sample-y.jksdjkfs-2f16-11e7-3454-005056bf2fbf.2018.03.11

    so if I had the short version of values in the "index" label I would just do:

    sum(metric) by (index)

    but what I am trying to do is something like this:

    sum(metric) by ("project.<projectname>")

    where I can group by a substring of the "index" label. How can this be done with a Prometheus query? I assume this could maybe be solved using label_replace as part of the grouping, but I just can't see how to "truncate" the label value to achieve this.

    Best regards

    Lars Milland

  • Lars Milland
    Lars Milland about 6 years
    I am not in control of the metric exporter, and I would also prefer not to do this as a relabeling step when Prometheus scrapes the metrics. I am looking for a solution using a Prometheus query. But as stated in the question, I simply can't figure out how to drop parts of the "index" label so that I can group by only the first parts rather than the full label value. So if anyone knows how to trim or truncate the label as part of a query and then use the trimmed value in a grouping, please advise.
  • ekarak
    ekarak about 6 years
    Try using the hidden __name__ label. Something along the lines of: sum({__name__=~"project.[a|b|c]*"})
  • Lars Milland
    Lars Milland about 6 years
    Maybe I am explaining my problem wrong. It is not the name of the metric that I want to truncate parts of and group by. It is a label on the metric.
  • Lars Milland
    Lars Milland about 6 years
    So what I thought would work is something like: sum(metric) by (label_replace(....)) I just don't know how to use the label_replace function inside the grouping part and truncate the "index" label using some smart regular expression.
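
The label_replace idea from the last comment can be made to work at query time by applying it to the vector before aggregating, rather than inside the by clause. A minimal sketch, assuming the metric is named metric (the question's placeholder) and reusing the regular expression from Solution 1:

```promql
# Copy the <projectname> part of "index" into a new "project" label,
# then aggregate on that new label.
sum by (project) (
  label_replace(metric, "project", "$1", "index", "project\\.([^.]*)\\..*")
)
```

The backslashes are doubled because the pattern sits inside a double-quoted PromQL string. Series whose index value does not match the pattern are passed through without a project label, so they end up together in a group with an empty project value. To group by the literal project.<projectname> prefix instead, the opening parenthesis of the capture group can be moved to the start of the pattern.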