Django aggregate count of records per day

10,785

Solution 1

The reason your listed query always gives 1 is because you're not grouping by date. Basically, you've asked the database to take the MessageLog rows that fall on a given day of the week. For each such row, count how many ids it has (always 1). Then take the average of all those counts, which is of course also 1.

Normally, you would need to use a values clause to group your MessageLog rows prior to your annotate and aggregate parts. However, since your logtime field is a datetime rather than just a date, I am not sure you can express that directly with Django's ORM. You can definitely do it with an extra clause, as shown here. Or if you felt like it you could declare a view in your SQL with as much of the aggregating and average math as you liked and declare an unmanaged model for it, then just use the ORM normally.

So an extra field works to get the total number of records per actual day, but doesn't handle aggregating the average of the computed annotation. I think this may be sufficiently abstracted from the model that you'd have to use a raw SQL query, or at least I can't find anything that makes it work in one call.

That said, you already know how you can get the total number of records per weekday in a simple query as shown in your question.

And this query will tell you how many distinct date records there are on a given weekday:

MessageLog.objects.filter(logtime__week_day=i).dates('logtime', day').count()

So you could do the averaging math in Python instead, which might be simpler than trying get the SQL right.

Alternately, this query will get you the raw number of messages for all weekdays in one query rather than a for loop:

MessageLog.objects.extra({'weekday': "dayofweek(logtime)"}).values('weekday').annotate(Count('id'))

But I haven't been able to get a nice query to give you the count of distinct dates for each weekday annotated to that - dates querysets lose the ability to handle annotate calls, and annotating over an extra value doesn't seem to work either.

This has been surprisingly tricky, given that it's not that hard a SQL expression.

Solution 2

I do something similar with a datetime field, but annotating over extra values does work for me. I have a Record model with a datetime field "created_at" and a "my_value" field I want to get the average for.

from django.db.models import Avg

qs = Record.objects.extra({'created_day':"date(created_at)"}).\
    values('created_day').\
    annotate(count=Avg('my_value'))

The above will group by the day of the datetime value in "created_at" field.

Solution 3

queryset.extra(select={'day': 'date(logtime)'}).values('day').order_by('-day').annotate(Count('id'))
Share:
10,785

Related videos on Youtube

bb89
Author by

bb89

Updated on June 04, 2022

Comments

  • bb89
    bb89 almost 2 years

    I've got a django app that is doing some logging. My model looks like this:

    class MessageLog(models.Model):
        logtime = models.DateTimeField(auto_now_add=True)
        user = models.CharField(max_length=50)
        message = models.CharField(max_length=512)
    

    What a want to do is get the average number of messages logged per day of the week so that I can see which days are the most active. I've managed to write a query that pulls the total number of messages per day which is:

    for i in range(1, 8):
        MessageLog.objects.filter(logtime__week_day=i).count()
    

    But I'm having trouble calculating the average in a query. What I have right now is:

    for i in range(1, 8):
        MessageLog.objects.filter(logtime__week_day=i).annotate(num_msgs=Count('id')).aggregate(Avg('num_msgs'))
    

    For some reason this is returning 1.0 for every day though. I looked at the SQL it is generating and it is:

    SELECT AVG(num_msgs) FROM (
    SELECT 
    `myapp_messagelog`.`id` AS `id`, `myapp_messagelog`.`logtime` AS `logtime`, 
    `myapp_messagelog`.`user` AS `user`, `myapp_messagelog`.`message` AS `message`, 
    COUNT(`myapp_messagelog`.`id`) AS `num_msgs` 
    FROM `myapp_messagelog` 
    WHERE DAYOFWEEK(`myapp_messagelog`.`logtime`) = 1 
    GROUP BY `myapp_messagelog`.`id` ORDER BY NULL
    ) subquery
    

    I think the problem might be coming from the GROUP BY id but I'm not really sure. Anyone have any ideas or suggestions? Thanks in advance!

  • bb89
    bb89 almost 11 years
    Thanks for the explanation! That makes a lot of sense. I have tried using the extra clause, however now I'm getting a SQL error. I modified my query to be: MessageLog.objects.filter(logtime__week_day=i) .extra({'date_logged': "date(logtime)"}).values('date_logged') .annotate(num_msgs=Count('id')) .aggregate(Avg('num_msgs')) And the error I get is "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'FROM (SELECT (date(logtime)) AS date_logged, COUNT(myapp_messagelog.`id' at line 1")
  • Peter DeGlopper
    Peter DeGlopper almost 11 years
    Yeah, that's not quite working, is it? The aggregate clause throws it off - it works up to the annotate. You can get a count of the total number of messages per weekday couple of different ways, but I can't find a way to make the Django ORM to do this in one call. I'll edit some of what I came up with into my answer.
  • bb89
    bb89 almost 11 years
    I actually ended up creating a view (actually needed 2 for mysql) and then an unmanaged model as you suggested and that worked perfectly. I'll add another post explaining what I did later for those that might be interested. Thanks again for all your help!
  • Arun Vinoth - MVP
    Arun Vinoth - MVP over 4 years
    Can you explain how/why the answer will solve the problem in question, that will add value to your answer rather than posting the code-only answer.