Response to June 2012 article: http://warp.byu.edu/site/content/1100

The article is coming up on its first birthday, but it’s on the web, and I found it by mistake while searching for an unrelated topic.

The argument

The argument of the author, a teacher of Java, was that the Django urlconf system is too verbose and unwieldy when dozens, or even hundreds of urls exist in a single app.

I can deal with [“large and unwieldy”] through good practices, but who wants to write a line in the urls.py file each time a page is added? Certainly not me.

The solution, he states, is to make one master url pattern to rule them all, where the hypothetical “.dj” extension stands in place of “.html” for the sake of demonstration:

    urlpatterns = patterns('',
        url(r'^account/(?P<path>.*)\.dj(?P<urlparams>/.*)?$', 'account.views.route_request'),
    )

And a generic view function to rule them all:

import sys
from django.http import HttpResponse, HttpResponseRedirect, Http404
from django.shortcuts import render_to_response

def route_request(request, path, urlparams):
    parameters = urlparams and urlparams[1:].split('/') or []
    funcname = 'process_request__%s' % path
    try:
        function = getattr(sys.modules[__name__], funcname)
    except AttributeError:
        return render_to_response('account/%s.dj' % path, { 'parameters': parameters })
    return function(request, parameters)

This way, the effort of designing urls can be shipped off to a single top-level view, which dynamically looks up an existing module function in the pattern "process_request__urlpathname". If the function lookup fails, it looks for a template named "account/urlpathname.dj" and renders it.

In the case of the “process_request__%s” secondary view functions, these all exist in the views.py module.

The problems

Philosophy

From The Zen of Python:

Explicit is better than implicit

A proposal for a master view that executes arbitrary functions from an app is effectively contrary to the entire design philosophy. In order for a developer to know what his app can serve to a client, he has to tediously sift through two separate file locations:

  1. an unspecified number of sub-folders of template files
  2. the views.py module

Django’s url system is designed with a couple of very specific features:

  • urlpatterns explicitly allow access to executable code.
  • urlpatterns can be mounted via the provided include() utility, allowing fancy features such as url namespacing and a plural number of instances of a reusable third-party app.
  • urls have explicit names which can be reversed by the django.core.urlresolvers.reverse function, allowing developers to refer to urls by name, not by ephemeral hardcoded paths.
  • The developer doesn’t automatically publish every function in their views.py namespace. This includes imported names, ranging from utilities to Django base classes. Clients could gain direct execution access to arbitrary Django framework callables, which will certainly raise 500 errors. Building in safety mechanisms to prevent such executions is completely counterproductive, a symptom of the sickness just released into wild.

The author’s function views would ultimately look a lot like normal Django function views (although even at the time of his writing, class-based views were already available in production-ready Django releases.) Class-based views solve a number of common problems, providing common workflows for form submission, object listing, creation, modification, deletion, etc. Class-based views offer a potentially simplified scaffolding for almost every type of view.

The fallback “automatic view” templates are basically views in their own right, which complicates matters further. This provides the developer with a convoluted way to avoid writing a one-line function (or a one-line urlpatterns entry grace à class-based generics), but that’s the end of benefits.

App design

The foundation of the author’s argument is that writing urls is tedious and painful. Done incorrectly, I’m sure it is. But then you’ve got nobody to blame but yourself.

Done correctly, views do simple straight-forward tasks that are easy to discern. Placing them in the app’s urls should be a trivial matter. Best of all, you get to assign a name to the view that can be referenced in templates and other apps without needing to know anything about url structure or template names. The site’s literal url regex could be completely redesigned to be more beautiful, and the url names themselves don’t have to change. Not a single template file needs to suffer a giant find-and-replace.

Url names are really quite awesome. You can assign the same name to multiple views, where an argument list disambiguates which specific view should be selected, almost reminiscent of the way compiled languages provide function overloading.

Since the author’s proxy-dispatch view swallows up the entire url path by liberally matching everything (and even tries to match GET parameters, the sure sign of a PHP-influenced workflow), url names become impossible. You’ve just given up one of the most beautiful features of a framework that operates at a higher level than the server’s ruddy file system.

Especially if someone ends up with an unholy number of urls in one app, I’m not sure what advantages a person thinks they’re gaining by throwing away url names. Now you have no choice but to hard code filesystem-like url paths into every template and call for a redirect.

I suspect that the author hasn’t had the pleasure of breaking up apps into smaller units of work. One thing I’ve learned, and after several years of Django development I keep on learning, that smaller apps ease the mind. You don’t want to go crazy, but creating a handful of smaller apps that rely on one another is a good way to keep code modular and flexible. No app should have 50 urls right in a line in its urlpatterns. Something’s wrong in that case.

Some of the redundancy can even be removed by nesting include() calls right in the urlconf:

    urlpatterns('',
        url(r'^account/', include(patterns('',
            url(r'^$', my_profile),
            url(r'^password-reset/$', reset_password),
            url(r'^(?P<username>\w+)/$', include(patterns('',
                url(r'^$', view_profile),
                url(r'^message/$', message_user),
                # ... and on
            ))),
        ))),

Of course, The Zen of Python also politely states that Flat is better than nested, so if the nesting gets too crazy, you can assign multiple module-level url patterns and graft them into place at the end.

Security

Although the author makes concessions to security by prefixing function lookups with the rather verbose "process_request__" prefix, security hasn’t magically found the front door and left you alone. A really bizarre amount of effort would need to go into normalizing client url requests. Even if we’re talking about the fallback template “flatpages” (there was an app for that already, although I never liked it much), the users can literally request any template in your entire project, whether it was meant to be served as a direct-to-response style view or not.

In the author’s example, he uses a “.dj” suffix for his magic fallbacks, but implies that he would rather use “.html” in production:

In my site, I simply use .html because every page in my site is dynamic; both color-coding editors and users like seeing .html in the filename.

Neverminding for a moment that Django template files themselves previous to being dynamically rendered already use the .html suffix, and diametrically opposing the practice of fake url suffixes, the Django tutorials petition the users:

Because the URL patterns are regular expressions, there really is no limit on what you can do with them. And there’s no need to add URL cruft such as .php — unless you have a sick sense of humor, in which case you can do something like this:

(r'^polls/latest\.php$', 'polls.views.index'),

But, don’t do that. It’s silly.

Just imagine what stupidity the a client can get away with, requesting templates that weren’t meant to be accessed directly:

/account/includes/_sidebar.html
/account/../../../../../???
/account/admin_view.dj
/account/password_reset.html

The point is: who on earth even knows what the system would do in each of these cases? And I include the original developer in that question. Suddenly no template is safe, especially if “.html” is used in the url pattern. Because the purpose of direct-to-template fallback views was to omit the step of assigning a view function or explicit url, there’s no opportunity for the system to verify that the requesting user is even logged in.

The author’s advice: templates that shouldn’t be directly accessed by an anonymous client should have a separate file suffix, such as “.hdj” for a “hidden” django template. Great. The layers of complexity are starting to lay it on thick here.

With the normal Django urlpatterns system, it’s abundantly clear what each of those url requests would do: nothing. Not unless the urlpatterns say so.

That’s security.

Don’t make this more like Java

Instead of looking in a views.py file, my controller looks for view functions in a views/ directory. For example, the URL /account/test.dj in my system goes to /account/views/test.py and looks for a standard method called process_request() within the test.py module. This allows me to split my view functions across many files rather than a huge views.py file.

Not only does the url router actually complicate the url-resolving process for the human mind, it just turned every view into its own module with one function per file. Nice try, Java, but I see what you’re doing there.

Nevermind that every view is probably using the same imports, or that a nicely designed app that requires a lot of view functions would use a views/ directory with just a few sub-modules such as “views/management.py”, “views/messaging.py”, etc. One measly function per view is a bit much, isn’t it?

After all, if a view is hulkingly large enough, maybe it does too much? Maybe the complicated logic it contains should be extracted to a cleaner set of utility functions available to the view?

A point emphasized in every version of Two Scoops of Django: Best Practices that I’ve seen is to keep really heavy logic out of the view functions, whether you use function-based or class-based views. The connection from view function to template file should be thin. If the view is handling complex deletion mechanics for a model, maybe the deletion logic belongs on the model class itself?

I suspect the author was a big fan of code in views, because he seems to be a big fan of code in template files, too, having replaced the Django template renderer with the Mako one. There are just way too many places for code to live in a project like that. It’s the sign of someone that believes they’ve already found enlightenment, and they’ll die trying to unmake core mechanisms. They’re free to do so, but man. I’d hate to live and breathe in a modified workflow like that. It sort of screams “one man team” to me. No team of individuals wants there to be a bunch of non-standard homebrew mechanisms living in their Django.

It just causes more problems than it solves. No, really. It does.

I’ve been there

I’ve been there. I’ve made these mistakes. I’ve written url helpers that automatically assigned function names as url names, to simplify my url burden. I gave the helper an automatic class-based handler that automatically called MyView.as_view() for me. I couldn’t be bothered with calling .as_view() on every class. I gave the helper an automatic template name finder, so that the function name translated directly to an .html file that I could find. Then I devised some crackpot method of translating a double underscore in function names to directory separators, turning the view function name account__profile to the template account/profile.html.

Then someone looked at my code and kind of grunted and sighed a bit. Finally they said, “Dude I don’t know what you’ve done to Django, but I don’t know what’s going on here.” I explained the mechanism, defending the simplicity it introduced.

Eventually I lost that debate. Not because somebody else beat me up for believing differently than they, but because I recalled an important lesson I’ve learned time and time again:

There was this time I had a friendly competition with a buddy of mine to write a database driver for some software we were writing at work. I spent all night developing my awesome strategy and class hierarchies and interfaces and abstract base classes. By midnight, I had code that just wouldn’t function anymore. The compiler was screaming at me about unknown generic types K and L and all kinds of unfixable problems. Nothing I did, including the wonky work-arounds on the Internet, ultimately fixed the problem.

I went to work the next day and moaned the whole afternoon about how dumb the language was, while watching my friend run tests with his driver on some example data. I knew that his code was less beautiful than the code I had written, even though mine was broken beyond help, but it didn’t matter, because he turned and said, half joking yet sincere:

I just try to not write code that won’t work.

It might sound rigid and show too much affinity for tradition, but if you misunderstand your tools and subvert their strengths just to find some PHP or Java philosophical comforts, you’re going to have a bad time.

Tagged with:
 

One Response to Why Django doesn’t need better urls.py

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>