Testing Django Pages Containing the csrf_token Tag

If you’ve created any forms using the Django web framework, you should already be familiar with Django’s CSRF middleware and the protection it provides websites against cross-site request forgery attacks. When the middleware is active, and unless the view has this protection overridden, any POSTed form is expected to contain a hidden field named csrfmiddlewaretoken, whose value is checked against the token in the CSRF cookie attached to the user. Because this value is specific to a user and changes constantly, comparing the output of pages containing forms against expected output is difficult. What follows is the solution I am using in Django 1.10.

If you are doing Test Driven Development (TDD) or less formal testing of your Django project, then one of the things you would likely want to do is to test the output of a view function against the output you expect.

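If the CSRF protection wasn’t an issue, you’d probably have a unit test that creates an HttpRequest, passes it to the view function, renders the same template with render_to_string using that request, and asserts that the two html strings are equal.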

For the purpose of this example, form_page is the view function that is located in the views subdirectory of the Django app myapp, and form.html is the template used to generate the response. The template contains the form with the {% csrf_token %} tag, which is the template tag that inserts the csrfmiddlewaretoken hidden field into the form. In the rendered page, this field is a hidden input element whose name is csrfmiddlewaretoken and whose value is the token.

Prior to Django 1.10, the test above would have succeeded. This is because in earlier versions of Django, the CSRF token remained static during the user’s session. In order to protect against BREACH attacks, this behavior changed in Django 1.10. Now the CSRF token changes after every user request.

The test code above fails because although a single request object is generated in the line request = HttpRequest(), the CSRF token changes after the call to the view, response = form_page(request). So even though we send the same request as an argument to the render_to_string function, because the CSRF token has changed, the html it generates will have a different value for the csrfmiddlewaretoken hidden field than the one in the response from the view.
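The effect can be mimicked without Django. In the sketch below, render_form is a hypothetical stand-in for rendering form.html, and the random token is only a placeholder for Django’s real token handling, which this does not reproduce:

```python
import secrets


def render_form():
    # Hypothetical stand-in for rendering form.html: each render embeds a
    # freshly generated token, mimicking a token that changes per render.
    token = secrets.token_hex(16)
    return (
        "<form method='post'>"
        f"<input type='hidden' name='csrfmiddlewaretoken' value='{token}' />"
        "</form>"
    )


from_view = render_form()      # plays the role of the view's response
from_template = render_form()  # plays the role of render_to_string
print(from_view == from_template)  # False: the two token values differ
```

Everything except the token value is identical, which is why a plain equality check on the two strings cannot pass.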

Since we can’t get the CSRF token to remain unchanged for our testing, and because we can’t ever predict its value, the solution is to remove the hidden field entirely from the html we are comparing. To do this, we need a function that takes an html string as an argument and strips out the csrfmiddlewaretoken hidden field, returning the rest of the html untouched.

Both this solution and the following code were inspired by this StackOverflow answer: http://stackoverflow.com/a/39859042. Instead of an object method like the code that inspired it, I have chosen to implement it as a top level function in the test module.

The function to strip out the csrfmiddlewaretoken hidden field is:
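A minimal sketch of such a function follows. The exact pattern is an assumption, modeled on the StackOverflow answer linked above: it matches any input tag whose attributes mention csrfmiddlewaretoken.

```python
import re

# Assumed pattern: any <input ...> tag whose attributes mention
# csrfmiddlewaretoken, regardless of quoting style or attribute order.
CSRF_INPUT_RE = re.compile(r"<input[^>]+csrfmiddlewaretoken[^>]+>")


def remove_csrf(html_code):
    """Return html_code with the csrfmiddlewaretoken hidden field removed."""
    return CSRF_INPUT_RE.sub("", html_code)
```

Because [^>] cannot cross the end of a tag, other input elements in the same form are left alone.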

In short, we are building a regular expression that matches the csrfmiddlewaretoken hidden field and using the sub function from Python’s re (regular expression) module to remove it.

We now apply the remove_csrf function to both of the html strings we are comparing:
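As a rough illustration, the sketch below uses two hypothetical html strings as stand-ins for the view response and the render_to_string output. The markup and token values are made up, and the pattern inside remove_csrf is the same assumed one: any input tag that mentions csrfmiddlewaretoken.

```python
import re


def remove_csrf(html_code):
    # Assumed pattern: drop any <input> tag mentioning csrfmiddlewaretoken.
    return re.sub(r"<input[^>]+csrfmiddlewaretoken[^>]+>", "", html_code)


# Hypothetical stand-ins for response.content.decode() and expected_html;
# they differ only in the (made-up) token value.
response_html = (
    "<form method='post'>"
    "<input type='hidden' name='csrfmiddlewaretoken' value='tokenA' />"
    "<input type='submit' /></form>"
)
expected_html = (
    "<form method='post'>"
    "<input type='hidden' name='csrfmiddlewaretoken' value='tokenB' />"
    "<input type='submit' /></form>"
)

raw_equal = response_html == expected_html
stripped_equal = remove_csrf(response_html) == remove_csrf(expected_html)
print(raw_equal, stripped_equal)  # prints: False True
```

In the Django test itself, the equivalent is an assertEqual call with remove_csrf wrapped around both response.content.decode() and the render_to_string result.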

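For completeness, our original example test module now defines remove_csrf as a top level function and passes both the decoded response content and the render_to_string output through it before comparing them.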

After searching far and wide, this is the solution I found, but if you’ve got a better one, please share it in the comments below.

1 comment

    • Thiago Melo on October 4, 2018 at 6:41 pm

    Thanks for your post, really enjoyed your solution 🙂

    I would just replace your regex pattern by:

    csrf_regex = r'<input[^>]+csrfmiddlewaretoken[^>]+>'

    At least in my tests, your regex was not able to match the pattern inside response.content.decode(), probably due to some str/unicode/raw/binary problem.
