Friday, May 6, 2011

Django models - how to filter number of ForeignKey objects

I have a models A and B, that are like this:

class A(models.Model):
  title = models.CharField(max_length=20)
  (...)

class B(models.Model):
  date = models.DateTimeField(auto_now_add=True)
  (...)
  a = models.ForeignKey(A)

Now I have some A and B objects, and I'd like to get a query that selects all A objects that have less then 2 B pointing at them.

A is something like a pool thing, and users (the B) join pool. if there's only 1 or 0 joined, the pool shouldn't be displayed at all.

Is it possible with such model design? Or should I modify that a bit?

From stackoverflow
  • Sounds like a job for extra.

    A.objects.extra(
        select={
            'b_count': 'SELECT COUNT(*) FROM yourapp_b WHERE yourapp_b.a_id = yourapp_a.id',
        },
        where=['b_count < 2']
    )
    

    If the B count is something you often need as a filtering or ordering criterion, or needs to be displayed on list views, you could consider denormalisation by adding a b_count field to your A model and using signals to update it when a B is added or deleted:

    from django.db import connection, transaction
    from django.db.models.signals import post_delete, post_save
    
    def update_b_count(instance, **kwargs):
        """
        Updates the B count for the A related to the given B.
        """
        if not kwargs.get('created', True) or kwargs.get('raw', False):
            return
        cursor = connection.cursor()
        cursor.execute(
            'UPDATE yourapp_a SET b_count = ('
                'SELECT COUNT(*) FROM yourapp_b '
                'WHERE yourapp_b.a_id = yourapp_a.id'
            ') '
            'WHERE id = %s', [instance.a_id])
        transaction.commit_unless_managed()
    
    post_save.connect(update_b_count, sender=B)
    post_delete.connect(update_b_count, sender=B)
    

    Another solution would be to manage a status flag on the A object when you're adding or removing a related B.

    B.objects.create(a=some_a)
    if some_a.hidden and some_a.b_set.count() > 1:
        A.objects.filter(id=some_a.id).update(hidden=False)
    
    ...
    
    some_a = b.a
    some_b.delete()
    if not some_a.hidden and some_a.b_set.count() < 2:
        A.objects.filter(id=some_a.id).update(hidden=True)
    
  • I'd recommend modifying your design to include some status field on A.

    The issue is one of "why?" Why does A have < 2 B's and why does A have >= 2 B's. Is it because user's didn't enter something? Or is because they tried and their input had errors. Or is it because the < 2 rule doesn't apply in this case.

    Using presence or absence of a Foreign Key limits the meaning to -- well -- present or absent. You don't have any way to represent "why?"

    Also, you have the following option

    [ a for a in A.objects.all() if a.b_set.count() < 2 ]
    

    This can be pricey because it does fetch all the A's rather than force the database to do the work.


    Edit: From the comment "would require me to watch for user join / user leaving the pool events".

    You don't "watch" anything -- you provide an API which does what you need. That's the central benefit of the Django model. Here's one way, with explict methods in the A class.

    class A( models.Model ):
        ....
        def addB( self, b ):
            self.b_set.add( b )
            self.changeFlags()
        def removeB( self, b ):
            self.b_set.remove( b )
            self.changeFlags()
        def changeFlags( self ):
            if self.b_set.count() < 2: self.show= NotYet
            else: self.show= ShowNow
    

    You can also define a special Manager for this, and replace the default b_set Manager with your manager that counts references and updates A.

    kender : A is something like a pool thing, and users (the B) join pool. if there's only 1 or 0 joined, the pool shouldnt be displayed at all... That's why I wanted to not include such status - would require me to watch for user join / user leaving the pool events. But, maybe that's the way...
  • I assume that joining or leaving the pool may not happen as often as listing (showing) the pools. I also believe that it would be more efficient for the users join/leave actions to update the pool display status. This way, listing & showing the pools would require less time as you would just run a single query for SHOW_STATUS of the pool objects.

0 comments:

Post a Comment