Patching django sessions to control user sessions

14 Feb 2015

###Introduction

HackerEarth uses the Django framework at its heart. We use two third-party Django packages for user authentication and session management:

  • django-allauth: Provides pre-built modules for email-based as well as all popular social authentication mechanisms.
  • django-redis-sessions: Allows storage of user session data in Redis (an in-memory data store that can also persist to disk) for fast retrieval. We used MySQL for this earlier, but retrieval became very slow as the number of users grew.

A Django session object, viewed as a dictionary of its attributes, looks something like this:

{
    '_session_cache': {
        '_auth_user_id': 2L,
        '_auth_user_backend': 'allauth.account.auth_backends.AuthenticationBackend',
    },
    '_session_key': '44617f83e234b6aa7e632abb8b44b906',
    'modified': False,
    'accessed': True
}

The _session_cache holds information about the logged-in user and the backend used to authenticate them (since we do not use Django’s default authentication backend, the value here differs from the default). Any other data you set on the session also lives inside the _session_cache dictionary. The _session_key is generated randomly by the session backend.

All sessions are stored in Redis as key-value pairs, where the key is the _session_key and the value is the encoded _session_cache.

###The problem

As can be seen, there is no way to tell which key belongs to which user other than fetching that key’s data from the data store and checking the user id stored in it.

Now consider a scenario where you want to find all the sessions associated with a given user. One use case: when a user changes their password, we want to delete all their existing sessions. To do that, you would have to iterate over every stored session, decode it, and check whether it belongs to that user. This might work when there are a few hundred users on your site, but it does not scale to a large number of users.
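
To make the cost concrete, here is a toy sketch of that brute-force lookup. The plain dict standing in for the data store and the base64/JSON encoding are assumptions for illustration only; they are not Django's actual storage or encoding.

```python
import base64
import json


def encode(session_data):
    """Toy stand-in for the session backend's encoding step."""
    return base64.b64encode(json.dumps(session_data).encode("utf-8"))


def decode(blob):
    """Inverse of encode()."""
    return json.loads(base64.b64decode(blob).decode("utf-8"))


def find_sessions_for_user(store, user_id):
    """Return every session key owned by user_id by decoding ALL values."""
    matches = []
    for session_key, blob in store.items():
        if decode(blob).get("_auth_user_id") == user_id:
            matches.append(session_key)
    return matches


# A dict standing in for the data store: opaque key -> encoded session data.
store = {
    "44617f83e234b6aa7e632abb8b44b906": encode({"_auth_user_id": 2}),
    "0f3c9a1d5b7e4c2a8d6f1e0b9a7c5d3e": encode({"_auth_user_id": 7}),
    "9b1e6d4c2a8f0e7d5c3b1a9f8e6d4c2b": encode({"_auth_user_id": 2}),
}

user_2_sessions = find_sessions_for_user(store, 2)
```

Every lookup fetches and decodes every session in the store, no matter how few of them belong to the user in question.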

###The solution

Redis lets you fetch the keys that match a glob-style pattern. If every session key of a user contains a constant string, we can find all of that user’s sessions by searching for that string.

We realized that inserting a constant string inside the session key was all we needed to do to solve our problem.
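
The idea can be tried out without a Redis server. The sketch below uses Python's fnmatch module as a rough stand-in for Redis's glob-style matching; the `u2`, `u7` and `u20` prefixes are made-up examples, not our real encoded user ids.

```python
from fnmatch import fnmatchcase


def keys_matching(all_keys, pattern):
    """Return the keys that match a glob-style pattern such as 'u2:*'."""
    return [key for key in all_keys if fnmatchcase(key, pattern)]


# Session keys of the form '<per-user constant>:<random part>'.
all_keys = [
    "u2:44617f83e234b6aa7e632abb8b44b906",
    "u7:0f3c9a1d5b7e4c2a8d6f1e0b9a7c5d3e",
    "u2:9b1e6d4c2a8f0e7d5c3b1a9f8e6d4c2b",
    "u20:5d3e1f0a9c7b5e3d1f0a8c6e4b2d0f9a",
]

user_2_keys = keys_matching(all_keys, "u2:*")
```

Including the ':' delimiter in the pattern is what keeps `u2:*` from also matching the keys of the `u20` prefix, and no session value has to be fetched or decoded to find the match.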

###The implementation

The implementation is divided into two steps:

  • Change the key creation logic in the SessionStore class:

Session objects in Django are abstracted by a class called SessionStore. This class has a method, _get_new_session_key, which is responsible for generating session keys. We define our own CustomSessionStore, which overrides only this method.

from django.contrib.sessions.backends.db import SessionStore


class CustomSessionStore(SessionStore):

    def _get_new_session_key(self):
        session_key = super(CustomSessionStore, self)._get_new_session_key()

        # If the user's information is present in the session, get it and
        # inject it inside the session key, else inject a random string to
        # keep the session key pattern consistent.
        # user_encoder_function and some_random_string are not defined in
        # this post; supply your own encoder and random-prefix generator.
        if '_auth_user_id' in self._session:
            user_id = self._session.get('_auth_user_id')
            encoded_user_id = user_encoder_function(user_id)
            session_key = '%s:%s' % (encoded_user_id, session_key)
        else:
            session_key = '%s:%s' % (some_random_string, session_key)
        return session_key
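
The snippet above leaves user_encoder_function and the random prefix undefined. One possible shape for them is sketched below in Python 3. The HMAC-based encoder, the `SECRET` value, the 16-character prefix length and the `random_prefix` helper are all assumptions made for illustration, not what we run in production.

```python
import hashlib
import hmac
import secrets

# Assumption: a server-side secret, so prefixes cannot be derived from a
# user id by someone who does not know it.
SECRET = b"replace-with-a-real-secret"
PREFIX_LENGTH = 16  # hex characters; an arbitrary choice for this sketch


def user_encoder_function(user_id):
    """Map a user id to a stable prefix that is the same on every call."""
    digest = hmac.new(SECRET, str(user_id).encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:PREFIX_LENGTH]


def random_prefix():
    """Prefix for anonymous sessions, same length and alphabet as above."""
    return secrets.token_hex(PREFIX_LENGTH // 2)


example_key = "%s:%s" % (user_encoder_function(2), secrets.token_hex(16))
```

Both helpers return hex strings, so a prefix never contains the ':' delimiter or any glob metacharacter, and anonymous keys have the same shape as authenticated ones.
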

  • Override Django’s session middleware:

Django has a SessionMiddleware which is responsible for initializing the session object on the request as well as setting the session cookie on the response object. We only need to override the process_request method so that the newly defined CustomSessionStore class is used.

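from importlib import import_module

from django.conf import settings
# The old-style store, used below to read sessions created before this change.
from django.contrib.sessions.backends.db import SessionStore
from django.contrib.sessions.middleware import SessionMiddleware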

class CustomSessionMiddleware(SessionMiddleware):

    def process_request(self, request):

        # settings.SESSION_ENGINE is the path to your session store class.
        # It will be the path to CustomSessionStore in this case.
        engine = import_module(settings.SESSION_ENGINE)

        # session_key name is defined in the settings file
        session_key = request.COOKIES.get(settings.SESSION_COOKIE_NAME, None)

        # Earlier the session_key was a single string with no delimiters in
        # between. We inserted the ':' delimiter in between for easy
        # segregation of the two components of the session_key. If an old
        # session is found we copy its data to the new style session class and cycle
        # its key. The cycle_key method internally calls the
        # _get_new_session_key which now will generate a session key in
        # the new format but the old data will remain intact. All this
        # hassle is for preserving user authentication state when we deploy this code.
        # If we change the keys directly, users' existing sessions will get lost and
        # they will get logged out resulting in an unpleasant experience.
        if session_key is not None and len(session_key.split(':')) != 2:
            # Load the old-style session's data once and carry it over.
            old_session = SessionStore(session_key=session_key)
            old_data = old_session.load()
            request.session = engine.CustomSessionStore(session_key=session_key)
            request.session._session_cache = old_data
            request.session.cycle_key()
        else:
            request.session = engine.CustomSessionStore(session_key=session_key)
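
The migration step can be illustrated without Django. In the Python 3 sketch below, a dict stands in for the session store, and `migrate` and `make_new_key` are hypothetical helpers that mimic what cycle_key does for us: generate a new-format key, keep the data, and drop the old key.

```python
import secrets


def is_old_style_key(session_key):
    """True for a cookie value that predates the 'prefix:key' format."""
    return session_key is not None and len(session_key.split(":")) != 2


def make_new_key(session_data):
    """Hypothetical new-format key: user prefix if logged in, else random."""
    user_id = session_data.get("_auth_user_id")
    prefix = "u%s" % user_id if user_id is not None else secrets.token_hex(8)
    return "%s:%s" % (prefix, secrets.token_hex(16))


def migrate(store, old_key):
    """Move the data under old_key to a new-format key and return that key."""
    data = store.pop(old_key)
    new_key = make_new_key(data)
    store[new_key] = data
    return new_key


store = {"44617f83e234b6aa7e632abb8b44b906": {"_auth_user_id": 2}}
new_key = migrate(store, "44617f83e234b6aa7e632abb8b44b906")
```

After the call, the user's data is reachable only under the new key, which is why they stay logged in across the deploy.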

###Conclusion

All the sessions for a given user_id can be fetched along the following lines (user_encoder_function is the same encoder used in CustomSessionStore, and the connection parameters are placeholders):

import redis

redis_conn = redis.StrictRedis(host='localhost', port=6379, db=0)
encoded_user_id = user_encoder_function(user_id)
# This pattern represents any key starting with encoded_user_id followed by
# a ':' and any string after that, which is how our sessions are stored in
# redis.
key_pattern = encoded_user_id + ':*'
keys = redis_conn.keys(key_pattern)
for key in keys:
    session = redis_conn.get(key)
    # Do something with the session
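
On a large keyspace, a single KEYS call walks every key in one blocking operation, so an incremental scan may be gentler on the server. The sketch below shows one way to delete all of a user's sessions, for example on password change. It assumes `conn` is a redis-py client (only its scan_iter and delete methods are used); `delete_user_sessions` and `batch_size` are names made up for this example.

```python
def delete_user_sessions(conn, encoded_user_id, batch_size=500):
    """Delete every key of the form '<encoded_user_id>:*'.

    Returns the number of keys that were actually removed.
    """
    pattern = encoded_user_id + ":*"
    deleted = 0
    batch = []
    # scan_iter walks the keyspace incrementally instead of in one call.
    for key in conn.scan_iter(match=pattern, count=batch_size):
        batch.append(key)
        if len(batch) >= batch_size:
            deleted += conn.delete(*batch)
            batch = []
    if batch:
        deleted += conn.delete(*batch)
    return deleted
```

With a real connection this would be called as `delete_user_sessions(redis_conn, user_encoder_function(user_id))`. It relies on the encoded user id containing no glob metacharacters such as `*`, `?` or `[`.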

This approach helped us solve several problems, such as deleting all of a user’s sessions on password change and keeping track of active user sessions.

There might be multiple ways of implementing this, but we preferred this approach because it did not involve any change to Django’s source and only a couple of existing methods were overridden, which makes it less prone to bugs.

Feel free to comment below or reach us at support@hackerearth.com for any suggestions, queries or bugs.

Posted by Virendra Jain
Follow me @virendra2334

