Showing posts with label django. Show all posts
Showing posts with label django. Show all posts

Saturday, 12 August 2017

Django tips #3 - Subquery expressions

This post describes a common quirk observed when performing joins against multiple tables in an SQL query, and a solution to it using a new feature (Subquery expressions) of the Django ORM.

A common problem


Consider the following models in Django.

from django.db import models


class Account(models.Model):
    name = models.CharField(max_length=32, unique=True)


class Credit(models.Model):
    account = models.ForeignKey(Account)
    amount = models.DecimalField(max_digits=12, decimal_places=2)


class Debit(models.Model):
    account = models.ForeignKey(Account)
    amount = models.DecimalField(max_digits=12, decimal_places=2)


Let's add some accounts, and also a few credit and debit entries against them. This is what the tables look like now:

# Accounts
+------+--------+
|   id | name   |
|------+--------|
|    1 | FOX    |
|    2 | SNAKE  |
|    3 | DOG    |
+------+--------+


# Credits
+----------------+-----------------+
| account_name   |   credit_amount |
|----------------+-----------------|
| FOX            |           100.0 |
| SNAKE          |            50.0 |
| SNAKE          |            20.0 |
| DOG            |           300.0 |
+----------------+-----------------+

# Debits
+----------------+----------------+
| account_name   |   dedit_amount |
|----------------+----------------|
| FOX            |           40.0 |
| SNAKE          |           30.0 |
| DOG            |           12.0 |
| DOG            |           23.0 |
+----------------+----------------+


Now, we want to answer a simple question: "What is the current balance of each account?". The balance of an account is defined as the sum of all credits minus the sum of all debits.

Seems pretty straightforward, right?

This is how one would do it using the Django ORM (A simple group-by and aggregate, which I described in one of the previous posts: https://mixedquantum.blogspot.in/2015/01/python-django-tips.html)

Account.objects.values('id').annotate(
    credit_sum=Sum('credit__amount'),
    debit_sum=Sum('debit__amount'),
    balance=F('credit_sum') - F('debit_sum')
).values_list('name', 'balance')

"""
[('FOX', Decimal('60.00')),
 ('SNAKE', Decimal('10.00')),
 ('DOG', Decimal('565.00'))]
"""


The balance for FOX looks okay (100 - 40). But what's happening with SNAKE, and DOG? Their balance amount isn't correct.
Printing the individual sums of credits and debits might help us figure out what's happening.

+--------+--------------+-------------+-----------+
| name   |   credit_sum |   debit_sum |   balance |
|--------+--------------+-------------+-----------|
| FOX    |        100.0 |        40.0 |      60.0 |
| SNAKE  |         70.0 |        60.0 |      10.0 |
| DOG    |        600.0 |        35.0 |     565.0 |
+--------+--------------+-------------+-----------+

The sum of debits is double what is should be for SNAKE, and the sum of credits is twice of what it should be for DOG.
The "twice" comes from the fact that there are two entries of credit for SNAKE, and two entries of debit for DOG.

An intermediate representation might help explain this better. This is the data on which the aggregates (SUM) are applied.

+--------+-------------+-----------------+------------+----------------+
| name   |   credit_id |   credit_amount |   debit_id |   debit_amount |
|--------+-------------+-----------------+------------+----------------|
| FOX    |           1 |           100.0 |          1 |           40.0 |
| SNAKE  |           4 |            20.0 |          2 |           30.0 |
| SNAKE  |           3 |            50.0 |          2 |           30.0 |
| DOG    |           5 |           300.0 |          3 |           12.0 |
| DOG    |           5 |           300.0 |          4 |           23.0 |
+--------+-------------+-----------------+------------+----------------+

The INNER JOIN is applied before the aggregate (SUM in this case). We need it the other way round.

The solution


In Postgres, this can be done using a WITH query (Common Table Expression):

WITH credits AS (
    SELECT account_id, sum(amount) AS credit_sum
        FROM core_credit
    GROUP BY account_id
), debits AS (
    SELECT account_id, sum(amount) AS debit_sum
        FROM core_debit
    GROUP BY account_id
)

SELECT a.name, credit_sum - debit_sum AS balance
    FROM core_account a
        INNER JOIN credits c
            ON c.account_id = a.id
        INNER JOIN debits d
            ON d.account_id = a.id;

Which gives us the correct balance of each account:

+--------+-----------+
| name   |   balance |
|--------+-----------|
| FOX    |      60.0 |
| SNAKE  |      40.0 |
| DOG    |     265.0 |
+--------+-----------+


Common Table Expressions don't seem to be supported in Django, but subqueries, which can be used to achieve the same output, were recently added in Django 1.11.

This is how we can use Subquery expressions to get the same result:

from django.db.models.expressions import Subquery, OuterRef
from django.db.models import F, Sum

from core.models import Account, Credit, Debit

credits = Credit.objects.filter(
    account=OuterRef('pk')).values('account_id').annotate(sum_credits=Sum('amount'))
debits = Debit.objects.filter(
    account=OuterRef('pk')).values('account_id').annotate(sum_debits=Sum('amount'))

Account.objects.annotate(
    credit_sum=Subquery(credits.values('sum_credits')),
    debit_sum=Subquery(debits.values('sum_debits')),
    balance=F('credit_sum') - F('debit_sum')
).values_list('name', 'balance')

"""
[('FOX', Decimal('60.00')),
 ('SNAKE', Decimal('40.00')),
 ('DOG', Decimal('265.00'))]
"""


It's not the most intuitive of code, but to someone who spends a lot of time using the ORM, it's just a matter of getting used to.

You can read more about Subquery expressions here: https://docs.djangoproject.com/en/1.11/ref/models/expressions/#subquery-expressions

The Django ORM has come a long way in terms of things it makes possible, reducing the need to resort to raw SQL queries even for some not-so-straightforward use cases.

Friday, 1 April 2016

Better Django choices

A typical Django model looks something like this:

from django.db import models


class MyModel(models.Model):
    STATUS_CHOICES = (
        (1, "Active"),
        (2, "In-Progress"),
        (-1, "Inactive"),
    )
    status = models.SmallIntegerField(choices=STATUS_CHOICES, default=1)

    def __str__(self):
        return '{}'.format(self.pk)

To filter on "status", one would do this

MyModel.objects.filter(status=1)

Not only is hardcoding non-informative in this case, anyone reading or writing a filter has to refer to the Model every time. In the two fairly large Django codebases that I have worked on, there was a mapping system that would let you do this.

MyModel.objects.filter(status=STATUS_MAP["Active"])

This gives better context to someone reading the code. Authors, on the other hand, still need to hardcode a string.

"There must be a better way"

Sure there is, a quick search will lead you to this.

One can pip install django-choices and call it a day. To me, this problem seems too trivial to require a dependency.

More searching will lead you to this. What I suggest is a mix of both, with some restructuring.

Create a directory named "choices" in the app's folder. For each model in your app, add a file in this directory (for "MyModel", you can add "mymodel.py").

For each field with choices, you define a class (with the same name capitalised).

class Status:
    ACTIVE = 1
    IN_PROGRESS = 2
    INACTIVE = -1

    choices = (
        (ACTIVE, "Active"),
        (IN_PROGRESS, "In-Progress"),
        (INACTIVE, "Inactive"),
    )

MyModel will be updated likewise. An extra property of the form <model_name><capitalised_field_name> is added. You can, and probably should, experiment with the naming conventions here.


from django.db import models

from .choices import mymodel


class MyModel(models.Model):
    MyModelStatus = mymodel.Status
    status = models.SmallIntegerField(choices=MyModelStatus.choices,
                                      default=MyModelStatus.ACTIVE)

    def __str__(self):
        return '{}'.format(self.pk)

Filtering will be thus:

MyModel.objects.filter(status=MyModel.MyModelStatus.ACTIVE)

There's several enhancements you can make to the "Status" class. Such as creating a base that auto-defines "choices", gives default values, etc. This does the job without adding much overhead. It is slightly verbose, but I'd take that any day for improved readability. Authors will also benefit from autocomplete in supporting editors and IDEs.

For those who might consider refactoring an existing codebase, you can run "makemigrations" to test it out. If you did everything correctly, no changes will be detected when you run "python manage.py makemigrations".

Making changes to production systems however, based on the latest blog posts that you read, is a risk that is yours to take :)

Sunday, 25 January 2015

Django tips #2

More Django tips. Check out the previous post in the series here: http://mixedquantum.blogspot.com/2015/01/python-django-tips.html

Exclude Inlines when creating an object.


This is actually well-documented but including here nonetheless.

(This is a method of admin.ModelAdmin, override in your custom admin class)

inlines = [MyInline]

def get_formsets_with_inlines(self, request, obj=None):
    for inline in self.get_inline_instances(request, obj):
        if isinstance(inline, MyInline) and obj is None:
            continue
        yield inline.get_formset(request, obj), inline

Original docs: https://docs.djangoproject.com/en/1.7/ref/contrib/admin/#django.contrib.admin.ModelAdmin.get_formsets_with_inlines


More control in Django admin.


When you want to limit the model instances that are seen by users depending on some property in the request object (or always). Override get_queryset. We have used this before but that was to annotate the QuerySet.

class MyModelAdmin(admin.ModelAdmin):

    def get_queryset(self, request):
        if not request.user.is_superuser:
            return MyModel.objects.filter(some_property=10)
        return super(MyModelAdmin, self).get_queryset(request)

Modifying Fieldsets


When you need to change fieldsets (choose which fields to display / how to display), override get_fieldsets.

class MyModelAdmin(admin.ModelAdmin):
    fieldsets = [
        (None, {
            'fields': ['a', 'b', 'c', 'd']
        })
    ]
    
    def _get_add_fieldsets(self):
        return [(None,{'fields':['a', 'b', 'c']})]

    def get_fieldsets(self, request, obj=None):
        if not request.user.is_superuser:
            self.exclude = ('a')
        if not obj:
            return self._get_add_fieldsets()
        return super(MyModelAdmin, self).get_fieldsets(request, obj=obj)

Modifying QuerySet for individual fields (foreign keys)


We override get_form

def get_form(self, request, obj=None, **kwargs):
    f = super(MyModelAdmin, self).get_form(request, obj=obj, **kwargs)
    if not request.user.is_superuser:
        # a is a foreign key here, whose model is A
        f.base_fields['a'].queryset = A.objects.filter(some_property=1)
    return f

For changing the QuerySet of a field in an Inline admin class


We will override formfield_for_foreignkey and / or formfield_for_manytomany:

class InlineAdmin(admin.StackedInline):
    model = MyModel

    def formfield_for_manytomany(self, db_field, request, **kwargs):
        if db_field.name == 'a':
            kwargs['queryset'] = A.objects.filter(some_property=2)
        return super(InlineAdmin, self).formfield_for_manytomany(
            db_field, request, **kwargs)

Trickery to hide models in Admin 


(When you have to register the model but not show it - for example, in case of django-polymorphic, you have to register child models, but not necessarily show them in admin).

def get_model_perms(self, request):
    return {}



Model functions:


Do last minute changes when a model is saved: save_model (this is also available in admin)


def save_model(self, request, obj, form, change):
    if not request.user.is_superuser:
        obj.some_property = 1337
    obj.save()

The advantage of doing this in admin is that you get access to the request / form / change objects, based on which you may do modifications before saving the model instance.

Thursday, 1 January 2015

Python & Django tips

Some code snippets that I've found useful. (Mostly from stackoverflow.com, sorry for not pointing to the actual answers, my bookmarks are a mess)

Ordering by custom column in Django admin:


You have to annotate your QuerySet, add a method for your custom column and enable ordering on it.

Here we want to display how many books each Publisher has. And display the Publisher's name and the book count in the admin interface.

We override get_queryset and annotate the QuerySet.

class PublisherAdmin(admin.ModelAdmin):
    def get_queryset(self, request):
        return Publisher.objects.annotate(book_count=Count('book'))

    # Return the annotated column's name
    def total_books(self, publisher_object):
        return publisher_object.book_count
    # Enable ordering
    total_books.admin_order_field = 'book_count'

    list_display = ('name', 'total_books')

Serializing datetime (and possibly other) objects when passed to json.dumps


def _unix_time(dt):
    epoch = datetime.date.fromtimestamp(0)
    delta = dt - epoch
    return delta.total_seconds()

def serialize_datetime(obj):
    if hasattr(obj, 'isoformat'):
        return obj.isoformat()
        # Or if you want the date in another format
        # return obj.strftime('%d-%m-%Y')
        
        # To get ms since epoch
        # return _unix_time(obj)
    return None

j = json.dumps(list_of_objects_with_datetime, default=serialize_datetime)


Ordering by multiple fields, aggregating and ordering QuerySet.


This seems difficult to achieve at first, but it couldn't be easier.

MyModel.objects.values('A', 'B') \
    .annotate(count=Sum('C')) \
    .order_by('A')

In the above example, you group by columns A and B. Then you can Count or Sum another column (the aggregation), and finally we order by one of the columns we grouped by (A or B) or the annotated column (count).

Wednesday, 15 October 2014

Overriding forms in Django admin.

First up, find out where your python packages are stored. This will differ from your system packages if you are in a virtualenv.

echo `which python`

You will get a path like this

/some/path/bin/python

Navigate up a couple of directories and find the django package. Then browse to the following directory. Here you will find all the original templates for django admin.

/some/path/lib/python2.7/site-packages/django/contrib/admin/templates/admin/

My use case required overriding change forms for particular models of some applications.

This is the form that you get once an object of a particular model is saved and you need to change some values / fields in that model.

Copy the change_form.html from the directory and paste it in your projects templates folder. The path would depend on which model you are overriding. Suppose you have model Foo in application ABC.

Copy change_form.html to

templates/admin/ABC/foo/change_form.html

(creating directories where needed). This is your projects' templates folder.

Application name reserves the casing but the model name should be converted to lowercase.

Inside this template you can use {{ original }} along with {{ block.super}} to access the particular object of the model whose form you have overridden. A typical override would look like this, removing the unnecessary stuff.


{% extends "admin/change_form.html" %}
{% load i18n admin_urls %}

{% load static %}

{% block extrahead %}{{ block.super }}
	<link rel="stylesheet" href="//maxcdn.bootstrapcdn.com/bootstrap/3.2.0/css/bootstrap.min.css">
	<script src="//code.jquery.com/jquery-1.11.1.min.js"></script>
	<script src="//maxcdn.bootstrapcdn.com/bootstrap/3.2.0/js/bootstrap.min.js"></script>
	<script src="{% static "js/custom.js" %}"></script>
{% endblock %}


{% block after_field_sets %}{{ block.super }}
    <!-- override template here -->
{% endblock %}


You can also override inlines. Suppose you are using ModelAdmin and need to override an inline.
You must add an additional attribute named template to the inline, and giving it the path of the form you have overridden.


We maintain the same directory path convention as found in the site-packages/django/.. directory. But this is not necessary, stacked.html can reside anywhere under templates. Copy it under your templates from the same directory that you copied change_form.html from.


class FooInline(admin.StackedInline):
	model = Foo
	fields = (('a'),('b'),('c'),)
	extra = 0
        template = 'admin/ABC/bar/edit_inline/stacked.html'

class BarAdmin(admin.ModelAdmin):
	list_display = ['b', 'a', 'r']
	inlines = [FooInline]

admin.site.register(Bar, BarAdmin)

admin.site.register(Foo)


Extra headers can be added under the extrahead block.


{% block extrahead %}
    <!-- extra css/js resources defined here -->
{% endblock %}


Custom JavaScript can be linked in the extrahead block or you can have it on the page at the end of the template. Note that if you intend on adding some forms, and have input or button with type="submit", it conflicts with the model's form (you should be able to stop event propagation by handling it, I suppose, this remains to be tried).

If you need to process any input separately, have a button tag with an event listener on it.