Things I Wish I Had Known About Django Development Before Starting My Company (medium.com/cs-math)
378 points by misiti3780 on April 17, 2013 | 154 comments


12. Use `pip freeze` to keep a list of your requirements, and keep that requirements list in your repo. Use virtualenv and virtualenvwrapper to keep your environment clean from other environments. Virtualenvburrito (https://github.com/brainsik/virtualenv-burrito) will help you set it all up.

13. You almost never need to write bash scripts. Use django management commands to write scripts that interact with your application. Use fabric and a fabfile to do deploys. Use something like chef or puppet to do machine configuration. Your bin/ folder will turn into an unmaintainable mess very quickly.

14. Use south. Yes, it's obtuse, but it gets the job done.

15. Use a pre_save hook on all of your models to do full_clean for validation before anything goes to the database. This will save you from cleaning up your data later.

    from django.db.models.signals import pre_save

    def validate_model(sender, **kwargs):
        # skip validation for raw saves (e.g. loading fixtures)
        if 'raw' in kwargs and not kwargs['raw']:
            kwargs['instance'].full_clean()

    pre_save.connect(validate_model, dispatch_uid='validate_models')

16. If you're writing a javascript-heavy application, consider all of your options for static asset management before you get too deep. I've used django-compressor, django-pipeline, webassets, and django-gears. None of them are perfect solutions (there still isn't a sprockets-esque one-catch-all solution for django like in the rails world), so consider the pros and cons before you make a choice.

edited for formatting


Regarding 15 - if you want that behavior, you'd be better off with your own model subclass:

  from django.db.models import Model

  class ValidatingModel(Model):
    class Meta:
      abstract = True  # base class only, no table of its own

    def save(self, *args, **kwargs):
      self.full_clean()
      super(ValidatingModel, self).save(*args, **kwargs)

Then inherit from that everywhere.

Signals have a not-insignificant overhead. They are most useful for tying together code that comes from different places. (If you want to do full validations on 3rd-party app models, then you might want this signal approach, but it might also be a bug in the app that the hook isn't just there for you.)


> (there still isn't a sprockets-esque one-catch-all solution for django like in the rails world)

This is definitely a little frustrating. I'd kill for a solution that handles minification, versioning & can upload to S3/etc. You can pass things through boto to get to s3, but the rest requires a bit of fiddling.

I've seen a few solutions that come close, but then they have extremely odd versioning systems. I should just fork something so I can add an ISO8601 timestamp to my CSS & JS, which I like because having granular dates helps when debugging front-end issues.


> I'd kill for a solution

Would you code for one? I, too, am sorry that we don't have a thing like asset pipeline. I'm pretty sure it would be welcome...


What have you tried? Because staticfiles + django-storages + django-compressor sounds like it can handle that use case.


I'm using exactly that setup and it does work, it just took me many hours and over 100 lines of custom storages + settings code to get it working.

I had to solve issues with custom domains, HTTPS, gzip, separate media/static files buckets, and storages/compressor not playing nicely together in general.


I have been developing with rails for a couple years, and though I am comfortable with python I've never touched django because I haven't had to. One thing I will say is that the rails asset pipeline documentation is something I constantly return to. The asset pipeline provides a lot of functionality but not a lot of simplicity or management of complexity. If what you wish you had from rails is the asset pipeline, you are either an expert with the rails api or you are seeing things greener on the other side of the fence.

I don't mean to insult the asset pipeline, it provides a lot. But it definitely doesn't save you hours. Out of the box the asset pipeline is great for all the things that come for free but if you are doing a lot of development in the framework you probably return to the asset pipeline documentation on a regular basis. And I consider time spent in documentation a negative compared to time finding your own solution if the API is not intuitive and you find yourself consistently returning to the docs about similar problems. And let me tell you, plenty of my fellow Rails devs have said to me they also regularly return to the asset pipeline docs.


#16 is one of the biggest pains we had to deal with, but I think we have a good solution for now.

We use brunch.io, a node.js build tool. It handles concatenation and minification of CoffeeScript, Stylus and Handlebars templates for us, but it can work with almost any front end technology out there. A cool thing is that we can use require within our CoffeeScript files, which allows us to better organize code into modules.

Brunch even provides file watching capability so when we save a stylus or CoffeeScript file it will update the browser without a reload.

In our setup we compile everything to our django static folder so when we collectstatic, files get uploaded to s3.

One thing that was a bit tricky, and something that we have to make better, is cache expiration of files. For now we append a few characters of the current commit hash to static file URLs (app.css?e2232, something like that); while this is pretty effective, we've found some cache systems that ignore it. The next step is to rename the files themselves before uploading to s3 (app-e2232.css).
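
A rough sketch of that rename step in Python (the directory layout and hashing choice are assumptions, not our actual build code):

    import os
    import subprocess

    # short hash of the current commit, e.g. 'e2232ab'
    rev = subprocess.check_output(['git', 'rev-parse', '--short', 'HEAD']).strip()

    build_dir = 'static/build'  # wherever the build tool writes its output
    for name in os.listdir(build_dir):
        base, ext = os.path.splitext(name)
        os.rename(os.path.join(build_dir, name),
                  os.path.join(build_dir, '%s-%s%s' % (base, rev, ext)))

Templates then have to reference the fingerprinted names, which usually means the build tool writes out a manifest mapping old names to new ones.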

I agree that this might be a bit too complicated for most apps but in our case (getblimp.com) we have a pretty heavy JS app and we really need to take advantage of all the "better" front end tools available.

I would be willing to contribute to a FOSS project to solve these issues once and for all on the django/python side. BTW I think that file watching is better than compiling on every reload during development. In our experience (many js files), requests during development took seconds and became very frustrating.


> Virtualenvburrito will help you set it all up.

Is it just me, or is it a little absurd to have a wrapper for a wrapper to a Python module?


I think the author recognizes that as well, hence the name.


Just responding to #16: it may be best to manage your static/client assets using NodeJS with grunt... It's not in the django chain, but may be worth calling out to the external tool. Node seems to be at the leading edge of JS & CSS tooling.

It's worth noting that I'm a pretty big NodeJS fan and am even using grunt via node for building the client bits in my latest .Net project at work. It's pretty sweet, though some addons are broken since the new grunt version.


I've switched to this method as well and recently removed django-compressor. Better to have a tool that is used and improved by many, not just django people.

In the top level of my project I have a makefile, so `make assets` runs grunt and then collectstatic.

My grunt also runs jslint so it will halt if there are errors.
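
A minimal sketch of that makefile (target name and flags are assumptions; recipe lines must be tab-indented):

    assets:
    	grunt                                    # Gruntfile runs jslint first, halts on errors
    	python manage.py collectstatic --noinput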


Would you mind giving a little more of an overview of how you set this up? I think this would be incredibly useful.


I'm going to try to throw together a blog post.

basically though, I've used requirejs for the js both for single page apps and for general js like plugins and galleries.

compass/sass for the css which does minification and concatenation already, and also ensures there is no illegal css

js is processed by requirejs into STATIC_DIR/r/

css is compiled into myproject/static/myproject/css and then uses django's staticfiles system to collect and deploy

grunt is what calls requirejs. it could also concat and minimize css but compass has already done it. it also has watch and does live reload so that my browser will reload the css and even the js when either of them are edited.

on my pages I use a tag:

{% vcss "nestseekers/css/front.css" %}

which renders a <link> tag with ?v=HASH appended

and

{% r_url 'nestseekers/js/libs/dist/html5shiv.js' %}

for the js
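
For anyone curious, a vcss-style tag can be written as a simple_tag in a few lines. This is a minimal sketch, not the poster's actual code; the names and the md5-query-string scheme are assumptions, and in production you'd want to cache the digest rather than hash on every render:

    # myapp/templatetags/assets.py
    import hashlib

    from django import template
    from django.contrib.staticfiles.storage import staticfiles_storage
    from django.utils.safestring import mark_safe

    register = template.Library()

    @register.simple_tag
    def vcss(path):
        # hash the file contents so the URL changes whenever the file does
        with staticfiles_storage.open(path) as f:
            version = hashlib.md5(f.read()).hexdigest()[:8]
        url = staticfiles_storage.url(path)
        return mark_safe('<link rel="stylesheet" href="%s?v=%s">' % (url, version))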


Would you mind expanding on the issues you have with pipeline? I use pipeline at the moment and beyond a couple of small things (mainly working with vendor apps) I'm really happy with it, but I'm open to the idea that it's because of what I don't know than because it solves every problem.


Mostly with setup and configuration. Configuring it to work the way I expected required a lot of digging through documentation/code/blogs/fudging around. I don't like it when I have to do that, for a number of reasons -- a. I'm lazy and I don't like spending a lot of time doing upfront configuration, b. when something doesn't come with sensible defaults pre-configured, I assume I'll do something wrong and it'll have non-obvious but bad consequences.

I had this problem with django-compressor as well (but more with configuring 3rd party asset compilation). I usually recommend people use webassets via django-assets. It's easy to configure and very feature-ful.


I'd maybe also suggest pythonbrew since you win by getting the exact version of Python you want and virtualenv is packaged up nice with it.


pip freeze has always been horrible for me. I've been better off managing my requirements manually.


At least managing multiple files is a pain.

I've had to resort to

    $ vimdiff <(pip freeze) requirements/prod.txt
You can't just do "pip freeze > requirements.txt" like you see on every tutorial. I wish there was a better way to do this.

And I would also like a way to list only top-level packages, without their dependencies. Say I "pip install X", which depends on Y and Z as well. Then later I want to delete X, so I just "pip uninstall X", but I will still have Y and Z.


> Django does not have a built in JSON HTTP response, so you are going to have to either man up and roll your own (good luck)

Am I missing something? What's wrong with:

   return HttpResponse(json.dumps(data), mimetype='application/json')
Wrap it up in a convenience function and you're done.
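
Something like this minimal sketch (the function name is ours; note that `mimetype` became `content_type` in later Django versions):

    import json

    from django.http import HttpResponse

    def json_response(data, status=200):
        # 'data' should be a dict/list of plain types, not model instances
        return HttpResponse(json.dumps(data),
                            mimetype='application/json', status=status)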

The JSONResponse class suggested automatically implements JSONP, which is extremely dangerous. Consider a view on /accounts/info which returns some information about the currently logged in user. A malicious site could embed

  <script src="http://example.com/accounts/info?callback=someFunction">
and access the account information of any user logged into your site. JSONP is a technique to bypass the same-origin policy in appropriate cases; don't just blindly apply it everywhere or you're giving up the protection of the policy.


might also be worth noting that django-tastypie is the de facto standard for REST apis, and sends and returns json (among many other serialization formats) very easily. This obviously doesn't work for all ajax cases, but it's extremely useful nonetheless.


I would recommend Django Rest Framework. It gives you more fine grained control. We just use the serialisers for example.


json.dumps() can be dangerous if used on your raw domain data. You should specify the exact schema being sent down to the client so you don't accidentally leak something (this can happen very easily in Python)


Well, not dangerous so much as it will fail with a "Model instance is not JSON serializable" message. So of course you'll need to construct the list/dictionary representation of your data manually. A good framework can help with that, but this isn't something that's solvable in the general case with just a response subclass without risking data leaks as you stated. (The other option in the original post makes this mistake, making both suggested options insecure.)


yep, I build response objects (my own term, not great, but it describes what they are) that are basic subsets of the object that I want to serialize to json. That way I'm sure only the fields that I really want to send are making it out.


Cool, I think a good JSONResponse implementation would bake that into the framework such that it's difficult to make the mistake you didn't make :)


I would guess complex objects - containers, or strange databasey stuff.

The way to deal with it is a __complex__ method on the object, recursing through by asking the complex method to return nested, simpler python types.


__complex__ is for converting to a complex number.


yeah, danellis is right. don't use __complex__ for that. it's for complex numbers:

http://docs.python.org/2/library/functions.html#complex --and-- http://docs.python.org/2/library/cmath.html


> Use Gunicorn instead of Apache for your webserver

This is strange advice; while you can use gunicorn as a front-facing webserver, the gunicorn docs strongly recommend against doing so. In a typical deployment scenario, then, gunicorn and Apache would occupy different levels of your stack, with one running your WSGI app and the other exposing it to the world. The advice ought probably to be "Use gunicorn+nginx instead of Apache+mod_wsgi," and indeed, lots of people do make that recommendation about Django deployment.


sorry - i did forget to mention nginx should be in front of everything managing all requests + serving static content.


updated article


nginx+uWSGI emperor mode is such a nice setup.


Did you ever try using Passenger (nginx) to run your WSGI app instead of Gunicorn+nginx? Seems like it would be even less of a hassle to run.

(I work at Phusion, and am not a Django guy, just curious)


if this question was for the author, my answer is no.


1. package your python code as a sdist (except setuptools is too hairy so nevermind), keep your deployment scripts & configs in a separate repo or orphan branch

3. use nginx to reverse-proxy to Gunicorn over a unix socket (see the sketch after this list)

6. don't have per-environment config in your app, use the same config in all environments and configure each host with aliases/proxies to consume the appropriate resources (use local_settings.py for local development on your workstation & exclude it from your package)

7. use native operating system service management (systemd, upstart, launchd)

10. use munin if your app can withstand a CPU spike every 5 minutes, use ganglia/graphite/collectd otherwise
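
For item 3, a minimal nginx sketch (paths, upstream name, and the static location are assumptions):

    upstream app {
        server unix:/run/myproject/gunicorn.sock;
    }

    server {
        listen 80;
        location /static/ { alias /opt/myproject/static/; }
        location / {
            proxy_set_header Host $host;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_pass http://app;
        }
    }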


I have to disagree strongly with your point 6.

I'm working on a large app at the moment where all hostnames are the same regardless of environment, and the only way to switch envs is via a proxy. It's a massive pain.

Keep environments mostly similar, but always make sure you can configure locations of the app itself and its various external resources separately on staging, test and development platforms.


> I'm working on a large app at the moment where all hostnames are the same regardless of environment, and the only way to switch envs is via a proxy. It's a massive pain.

One of our projects works like this, it sucks.

For per-host app location, we set X-HTTP-Script-Name in the nginx location block and extracted it in WSGI middleware like http://flask.pocoo.org/snippets/35/

Everything else is hardcoded to use http://localhost/service_name/foo and nginx proxies to the appropriate resource. Obviously, this doesn't work for non-HTTP services!


> 1. package your python code as a sdist (except setuptools is too hairy so nevermind), keep your deployment scripts & configs in a separate repo or orphan branch

This. I'm tired of Django projects not being packaged/released correctly, and deploy scripts that simply git clone in production (yikes).


I'd take it one step further. Why do they have to be packaged as sdists at all? Why not bdist_egg?


bdist works too, but it adds one more level of annoyance (you need the same environment as the target). sdist is a good balance because you get a pip-installable package, so your deploy can just consist of creating a virtualenv in /opt/<your project> and installing a determined version of your project there.


I was under the impression that Django projects couldn't be deployed when they were bdists (because Django doesn't use pkg_resources).

I agree about the annoyance wrt needing the same environment as the target. We tend to have two platforms that are supported: CentOS 5.x and MacOS. Keeping the build machine on the same platform as the deployment machines is simple. Creating the eggs for MacOS developers is more difficult, but still not too bad. That might seem odd, since we could just use pypi.python.org, but we have an internal PyPI server so that we can easily share internal libraries. Adding a line to a project's setup.cfg makes this trivial for the application developers.

There's another annoyance with sdists. I don't want to compile during deployments. So, I build everything that can possibly be built as an egg as one, and fall back to sdist for everything else. I push those to the internal PyPi server. At deploy time, I create a virtualenv and easy_install the appropriate artifacts. I know the correct artifacts because I `pip freeze` the requirements at build time.

We're also extra paranoid, so our stage and production VPCs are on different AWS accounts. We have one PyPi server per VPC and flow artifacts forward as needed.


Could you go into some more detail about what being 'packaged correctly' means? What's wrong with a git clone deployment, why is deploying as a package better, and how does one implement it correctly?


I love how I got downvoted for mentioning git clone deploys.

> What's wrong with a git clone deployment, why is deploying as a package better, (...)

The advantage is that you can have a release process, where you update the package's __version__, compile/minify the files you need, build documentation, and so on, until you have a deployable branch that you can tag and upload to the repository. This way you have a reproducible history of releases, it's easier to inspect which version is deployed, you have hooks for installation (for instance, you can abort installation if the tests fail on production), etc. Mainly this:

Crank out code -> Run a makefile/fabfile to update version/compile/minify/build whatever -> Export a tag -> Build an sdist/bdist -> Install on production

I believe too many things grew inside Django (e.g. collectstatic) that really shouldn't be part of the framework at all. Another thing that bothers me is South: you need to push a release to production, then run a migration, because the migration is part of the codebase. Well, the migration really should be part of the installation process. There are corner cases where this is an issue - for instance, worker processes reloading before your migration is complete would use the new, wrong model definitions, and suddenly you have a broken release on production.

For all intents and purposes, your Django project should just be a valid Python package that you can pip install in a virtualenv inside your server (let's say, /opt/<myproject>) and you're done with it. This way you can freeze the environment on production, pip can handle upgrade/downgrade, you don't have to care about *.pyc hell, etc.

> and how does one implement it correctly?

I should probably upload a project template with a workable setup.py to github.
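
In the meantime, a minimal sketch of such a setup.py (name, version pin, and packages are placeholders):

    from setuptools import setup, find_packages

    setup(
        name='myproject',
        version='1.0.0',
        packages=find_packages(),
        include_package_data=True,  # picks up templates/static via MANIFEST.in
        install_requires=['Django>=1.5,<1.6'],
    )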



Solid list! For most of those bullets I definitely had the 'Oh wow, should have done this sooner' moments. I'd add:

- use South (right away!)

- Class Based Views

- (not django specific) use virtualenvs!


Are class based views really any good for anything outside of CRUD? I've found the documentation (and, more importantly, rationale) lacking. Generally, when I try and use them, I spend more time figuring out how to customise than when I write "normal" views.


I should be specific. I never use the built in Django views. I'm sure they're great, but I don't like that much magic, and I moved to class based views late in a project... they seem easier in a clean-room build.

I do, however use class based views that I've built myself. We extend them and add mixins and I much prefer all of this to the decorator soup that is the alternative.


Ah, fair enough, that makes more sense. Last time I tried to use the built-in ones, I just scrapped the project after two days and redid it with functions. It's not magic if it doesn't work!


There's definitely a learning curve, but CBVs will give you much DRYer views. Being able to use mixins is a godsend.
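
For illustration, a minimal sketch of the mixin pattern (the mixin and model names are ours; Django didn't ship a LoginRequiredMixin at the time):

    from django.contrib.auth.decorators import login_required
    from django.utils.decorators import method_decorator
    from django.views.generic import ListView

    from myapp.models import Article  # hypothetical app and model

    class LoginRequiredMixin(object):
        # applies login_required to any class-based view
        @method_decorator(login_required)
        def dispatch(self, request, *args, **kwargs):
            return super(LoginRequiredMixin, self).dispatch(request, *args, **kwargs)

    class ArticleListView(LoginRequiredMixin, ListView):
        model = Article
        paginate_by = 20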


Am I the only one left who finds south more trouble than it is worth?


The only one? No, I'm sure not. In the vast minority? I'd suspect so :)

South is pretty obtuse to learn and definitely has its warts and issues, which Andrew seems well aware of and wants to fix, so check out the kickstarter link below!

That said, our project has over three hundred tables, a couple thousand migrations, and I've been very happy with it compared to the systems I've rolled myself in the past. I'm certainly unaware of an alternative that holds a candle to it, assuming Django's ORM.


South can be troublesome, but what's your alternative? never make changes to your schema? do it painstakingly by hand from the psql command line? just suck it up and use South, warts and all. it's still really good, even though it's not perfect.


...what's your alternative?

Write SQL and interact with your RDBMS directly?


You still need a system for tracking which changes have been applied to your database and for distributing the changes to the rest of your team.

Even if you don't use South's automatic migration generation, its database-independent schema modification api [1], dependency tracking, and tracking of which migrations have been run, are useful and necessary. In fact, the automatic generation of migrations is an optional feature of South that was not in it when first released.

When you need to write SQL and interact with your database directly, using `db.execute` inside a South migration is a nice way to do it.

[1]: http://south.readthedocs.org/en/latest/databaseapi.html
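
A minimal sketch of that pattern (the table and SQL are assumptions):

    from south.db import db
    from south.v2 import DataMigration

    class Migration(DataMigration):

        def forwards(self, orm):
            # raw SQL, but still tracked and ordered by South like any migration
            db.execute("UPDATE myapp_article SET status = 'draft' "
                       "WHERE status IS NULL")

        def backwards(self, orm):
            raise RuntimeError("Cannot reverse this migration.")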


that is not a viable practice over the long run. the entire point of using a migrations tool is so you can migrate the schema and the data forwards and backwards automatically and consistently every time. attempting to do that with raw SQL scripts is a disaster waiting to happen. it's so bad that the entire idea of a migration tool was invented to solve this problem.

so I ask again: what's your alternative (given that avoiding raw SQL scripts is the problem you're trying to solve)?


I'm getting sucked into an argument I don't really care about because I actually like South and use it in production for some things.

But I mean idempotent SQL takes care of the vast majority of all this stuff you mentioned. Write your SQL properly and it doesn't matter whether or not it's been applied before.


I don't know about postgres offhand, but there aren't idempotent solutions to some DDL operations in mysql. (alter table add column, off the top of my head)


Yeah in Postgres you can check for the existence of a column in a table. If it exists/doesn't exist you can drop/add the column. Obviously "ALTER TABLE x ADD COLUMN y" isn't idempotent, but the whole SQL statement around it can be idempotent.
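
Roughly, a sketch of that guard (table/column names are made up; Postgres of that era had no ADD COLUMN IF NOT EXISTS):

    DO $$
    BEGIN
        IF NOT EXISTS (SELECT 1 FROM information_schema.columns
                       WHERE table_name = 'users'
                         AND column_name = 'email') THEN
            ALTER TABLE users ADD COLUMN email text;
        END IF;
    END
    $$;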


I've been on a project that was doing exactly this. It's a giant PITA and we found ourselves slowly implementing something that looks a lot like South in order to mitigate all the problems we encountered. So we dropped this approach entirely and just used South. With a half decent migration strategy it's so much more robust than messing around with SQL. Unless there are specific edge cases that a migration tool can't deal with, use one.


Maybe if you're working on your own. If you're in a team (where each member has her own local DB) and multiple environments for testing, staging, preproduction etc. I'd say you're going to want some sort of scripted migration.


I came to Django & South after using ActiveRecord and Rails migrations for years, and it drove me nuts. Not trying to cause a big argument, but why the big push in the Python community to model database tables via object properties/fields? I seem to recall SqlAlchemy does something similar.


From 'The Zen of Python' - Explicit is better than implicit.

Automatically generating properties after querying the database is seen as unpythonic. PyLint and some IDEs/editors will not be happy if you try.


DataMapper 1 does the same. It does make sense if you're worried about changes to your database schema spontaneously breaking previously working code.


No, I get that part. I'm talking more about building a database from models. It seems like these tools are designed to build the database schema from the object models, which to me is just as bad as building the objects from the schema.

Ideally, we should use the declarative style of Django models and SqlAlchemy on the object, along with the schema generation tools of Rails migrations.


My workflow is to only add south to the project when required. To begin with I use a bash script which trashes and rebuilds the DB each time I wish to. Data which needs to survive this gets dumped to fixtures (using another bash script).


I totally agree with you. I just export the SQL and see what tables need to change.


I'm working on a tool for better schema migrations as well. Currently it only supports MySQL but more RDBMS's coming soon: http://devjoist.com


Good luck, but you may want to reconsider. Andrew Godwin (author of South and Django contributor) is building a new version directly into Django core.

http://www.kickstarter.com/projects/andrewgodwin/schema-migr...


DevJoist is language agnostic so I think there will be a need for it regardless of what happens with Python. I'm actually a backer of Andrew's kickstarter :)


don't be discouraged. a solid, open-source solution that is language/platform agnostic is valuable.


When I started using South, I was much in awe and loved its ease. But as time passed, migrations became complicated and south was soon brought to its knees.


I find that generally, when this happens it's due to a bad / non-existent migration policy (it's just as important as your branching strategy to keep migrations solid). Migrations should be able to rebuild an entire database from scratch. If they can't, then the migrations have been screwed up.


i think it is the best solution out there right now ... someone had a kickstarter recently to replace it, but I can't seem to find that info via google right now ...


Andrew Godwin had that Kickstarter and he is the creator of South. Here is the link http://www.kickstarter.com/projects/andrewgodwin/schema-migr...


I believe you're thinking of Andrew Godwin's schema migration kickstarter: http://www.kickstarter.com/projects/andrewgodwin/schema-migr...


yep - that is what i was looking for - thanks


The article recommends using MongoDB as a primary data store, so South wouldn't be applicable (and that's specifically addressed).


Actually, I use South to manage data migrations (as opposed to schema migrations) in MongoDB. That way I can ensure that I've run the same data migrations on all of my dev and prod databases. Just a `./manage.py migrate` and every developer is caught up.


I don't know about class based views - each time I've tried to use them they've started out nice and convenient then slowly turned into an over-engineered mess as my requirements got more complicated. I think I prefer the control and simplicity of function-based views even if there's a bit more duplication.


I wish there was something like this for rails - well written, concise and keeps the newbies away from things that'll hurt them. In particular, some major pain points:

1. Use rbenv or rvm. Half the problems I notice with beginners stem from installing rails with apt-get or something.

2. Your controller is meant to be simple glue between model and view - if you're putting tons of code in there, you're doing it wrong.

3. Unit test. A lot. This is especially good for beginners because you often make little errors when you're starting out.

4. Unit testing isn't enough. Functional testing is important too.

5. Learn about your hosting provider. Specific to Heroku, if you're on the free plan, every dyno spin-up takes absolutely forever if you don't have a steady stream of visitors.


> 2. Your controller is meant to be simple glue between model and view - if you're putting tons of code in there, you're doing it wrong.

Yes, fat models are the way to go. This is just a good rule of thumb for any MVC framework. I feel like no framework docs actually explain this. Newbies end up shooting themselves in the foot because they don't understand the reason for separation or understand where logic should be implemented.


That's very debatable, and pretty rails-specific, IMHO. Conventional wisdom in other ecosystems is instead to have a business logic layer between your controller and your persistence layer. I've always found this to be good advice, as it makes the code more testable and makes reasoning about manipulating different models in the same operation easier.


The Rails models ARE the "business logic layer" between the controller and the persistence layer (the database itself).

I don't want to get into exactly what MVC should mean and whether any particular framework does it right though.


That's how they're traditionally used, but this does not mean doing something cleaner is impossible.


I started reading about Django recently so this is a noob question, but what's the controller in Django? All I've seen with a lot of code is models, views, and templates. It's kind of annoying how tutorials talk about MVC but it doesn't seem like an MVC framework.

Also, if I want a user to input numbers to do calculations, where does the computationally heavy code go?


For most intents and purposes, "controllers" from so called MVC frameworks are referred to as "views" in Django.

The article that edavis mentions does make one good point, which is that the term "view" was chosen to emphasize the notion that the python callback function (or object) sets up a view of the data represented by the models. I think there's something good in this choice of words. I find the terms "view" and "template" more intuitively meaningful than the corresponding "controller" and "view".

One important, substantive difference that I'm aware of between controllers in many MVC frameworks (probably not all) and views in Django is that a lot of MVC frameworks map requests to controller actions through URL traversal, whereas Django offers a more hands-on approach with URL confs.

Also, whereas controllers are often implemented through classes in MVC frameworks, views in Django may be implemented through any callable object which accepts a request and returns a response.

As far as where your code should go, it probably depends on what kind of data the user is entering. If the data must be persisted in the database somehow, such as with transactions in a bank account, then the code that crunches the numbers would probably be in a method on a "BankAccount" or "Transaction" model class. If it's data that only really needs to persist for the user's individual browsing session, then it's probably safe to put the code that works on it in a view. Django limits what kinds of things you can do in its templates (thankfully), so you probably won't have much luck putting your computation code in a template.


Django is close to an MVC framework but not exactly an MVC framework. This link (https://docs.djangoproject.com/en/1.5/faq/general/#django-ap...) describes the difference better than I ever could.


I'm somewhat new to Django, and I'm just curious if Heroku is the best solution out there for Django? I've heard some good things about Gondor, and apparently it's more tailored for Django apps, but I'd be curious to know what the HN consensus is on Django hosting that won't collapse on you if you get HNed/Reddited.


Watch out for these Celery gotchas:

1. Tracking tasks status in SQL will result in a lot of queries! (Even when using Redis/RabbitMQ as brokers.)

2. Crontab style tasks are great... until they take a long time to complete and Celery kills them. And then you're back to regular crons (and there is nothing wrong with that).

3. Use UTC everywhere! New projects get this by default; don't make the mistake of changing it.


My recommendation after a lot of heavy use: RabbitMQ for the queue, Redis for the result store.

RabbitMQ for the result store is madness.
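
In Django settings that split looks roughly like this (the URLs are placeholders):

    # Celery 3.x-era settings
    BROKER_URL = 'amqp://guest:guest@localhost:5672//'   # RabbitMQ as the queue
    CELERY_RESULT_BACKEND = 'redis://localhost:6379/0'   # Redis as the result store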


why not just use redis for both - just curious?


Redis does OK as a queue, but not as well as rabbit.

For a start, rabbit handles OOM better. It has lower per-message overhead. It has a more flexible queueing model in general so that you can use it for both work queue-y stuff and other queuing needs. It's interesting how often queues seem handy once they are easy to use.

Then there's the fact that redis is good for lots of things other than queues, and you'll be tempted to use it for that, and that will crowd the queue use. If you hold a transaction on redis or do a large set intersection or run a lua script, you block everything else, including your fanned-out celery worker pool.


This is indeed a solid list.

The most comprehensive best-practices resource for Django I've come across by far is "Two Scoops of Django"[1] by "pydanny"[2]. It's absolutely worth the $17.

[1]: https://django.2scoops.org/ [2]: http://pydanny.com/


Correction: It's not by "pydanny", it's by "pydanny" and "audreyr". She just doesn't do the whole Twitter/HN/blog thing as much as me. Seriously, do you think I could write a section including code that discusses "the impossible condition of too much chocolate". ;-)

Seriously, between the two of us she's the better coder, taught me LaTeX, and knows Strunk and White. I just write what I think and let the grammar experts fix it. :P


Seconded. My investment in "Two Scoops of Django" has repaid itself many, many times over. Concrete "best practices" book.


An item I'd like to add to Python development in general is to use virtualenv. If you're doing multiple Django projects this is a must have.

One of those tools that you start off thinking isn't useful but quickly turns into a must have.


> Use Gunicorn instead of Apache for your webserver [...] This assumes NGINX is managing all incoming requests and serving static content.

Why not just go nginx + django the uwsgi way? Now you have friendly tutorials and docs for this, like https://uwsgi.readthedocs.org/en/latest/tutorials/Django_and... . Maybe I'm weird or I've never worked on large enough apps, but I fail to get what the shiny unicorn and gunicorn bring to the table...


> 10gen has added the aggregation framework, full-text search, collection-level locking, etc.

Mongo has collection-level locking? I thought there was some preliminary work to support it in 2.2 (It just does database-level locking for now), but it's still unsupported. Source: https://jira.mongodb.org/browse/SERVER-1240

Also, I wouldn't recommend using any of the aggregation stuff unless it's for infrequent ad-hoc stuff or scheduled tasks during slow periods.


I use aggregation queries regularly and it's amazing. My MongoDB database has 4k ops/sec and I run an aggregation query to count averages, sums, and splits on a name (to get averages and sums for each of two dozen groups) for about 10K rows every 5 minutes. Not a huge amount, but this is on a db that is performing a ton of work. Also, I should mention that you should probably run your aggregation queries on your secondaries.


i agree - the aggregation framework seems solid - much better performance than map-reduce


You're right, it doesn't yet support collection-level locking (https://jira.mongodb.org/browse/SERVER-1240).


I don't see a monitoring system there? Something like Supervisor is great, but who watches the watcher? I prefer Nagios, but that's just because I'm used to it. Even it is better than nothing.

If you want something a bit more distributed (read less prone to single points of failure) than Supervisor, check out pacemaker. Very powerful and useful for keeping resources alive in any sized cluster.



I've been expecting to use Celery in a project I'm working on, could you explain why?


Celery is brittle. If you stop a celery service, odds are, it will either not stop at all or leave stray child processes running.

It is not transactional, at least not with a MongoDB back-end. If you stop a celery service, any tasks that were in progress may or may not complete, but none of them will be picked up again when you resume.

No prioritization of queues or tasks. It is possible to set up separate services that handle different queues, but that's a PITA and there's still no guaranteed order of processing.


Celery is a PITA and over-complicated for the problem it solves.


Do you know why pyres relies on itty? Seems like kind of an odd dependency...


The pyres web interface (where you can view job status) is built using itty.

edit: looks like they're using flask now, I don't see itty being used anywhere. They also split the web interface (resweb) into a separate project, so pyres doesn't depend on a web framework now.


Ah ok, thanks. I guess the documentation is a little outdated then. Thanks for the suggestion, I'll check pyres out when I get to that point in my project.


considering the infrastructure that has built up around celery, and particularly its use with django, i don't know why you wouldn't - care to explain?


Pinterest replaced Celery and RabbitMQ [1] with their fork of pyres [2], presumably because it's 1/100th the codebase but does what they need to.

[1] http://highscalability.com/blog/2013/4/15/scaling-pinterest-...

[2] https://github.com/pinterest/pyres


This may sound strange, but when I first started doing python stuff with Django, I spent WAY too long trying to figure out the best project directory layout. To me, Django should generate a "pretty-good" directory layout, much like `rails new`.


My supreme thanks to the author for writing this, and to all those who have submitted additional items to the list in the comments here. I'm writing my first few Django apps in series and have found these bits of advice to be very valuable.


I haven't used Jammit before, but I highly recommend checking out django-pipeline for the same purpose: https://github.com/cyberdelia/django-pipeline


Specifically for me, pipeline fits into django's collectstatic way of doing things. And because of this it works fantastically with django-storages as well. My projects are set up so that `python manage.py collectstatic` trawls all my apps static folders, compiles everything if required (r.js, sass) and uploads any changes to an S3 bucket. And in development it compiles over the wire (except r.js, we let requirejs do the async loading thing in development).
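
For reference, wiring pipeline and django-storages together is roughly this (a sketch assuming the s3boto backend; the class and module names are ours):

    # myproject/storage.py
    from pipeline.storage import PipelineMixin
    from storages.backends.s3boto import S3BotoStorage

    class S3PipelineStorage(PipelineMixin, S3BotoStorage):
        pass

    # settings.py
    # STATICFILES_STORAGE = 'myproject.storage.S3PipelineStorage'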


Ditto regarding everything up until your last sentence - it is a pretty great setup.

What do you mean it "compiles over the wire" in development? In development, pipeline just renders individual js/css tags for each of your static files, un-compiled...


It means that when we're developing, you edit a sass file, and when you request the CSS file it compiles the SASS to CSS before serving it. It has been a little while since I set it all up, but for the SASS example I've made a gist[1] showing the settings you need to make it work. If I recall correctly, the key bit is using the .scss as the source file. Then, in production, make sure to set `PIPELINE = True` to stop compiling per request.

[1]: https://gist.github.com/Bockit/5408958


Gotcha


your django app doesn't look like a django app.

the basic premise of the django defaults is that app modules are pluggable and self-contained. all of the static assets, templates, model code, view code, migration files, management commands, etc. related to a particular app should go in that app's folder (which is treated as a Python module with its own __init__.py file). conventionally the app's folder is a first child descendant of the top level project directory (i.e. it's in the same directory as manage.py)


That's true about Django apps, but this post doesn't describe a Django app. It describes a Django project.


The main thing I wish I had known is that the ORM is rather limited (or, at any rate, it was too limited for my app), and while you can use raw SQL, it won't let you get model objects out of your own queries, which makes it impossible to integrate with the rest of your app that uses the ORM [1]. The first thing I'd do if I had to use Django again would be to eschew its ORM and use SQLAlchemy instead.

[1] This was a few versions of Django ago; if they fixed this since then, I welcome corrections.


Not sure when it was added, but is this what you were after? https://docs.djangoproject.com/en/1.5/topics/db/sql/#perform...


Yes, that looks like it should do the job. Thanks!


Thing #1 for me, by a long shot:

Deploy to Heroku (or similar). Saves a ton of headaches, saves boatloads of time/money, and for most startups in the early stages, is well worth the tradeoff.


I decided to dive into bootstrap.py/buildout for my current project. So far, it has done a reasonable job of keeping the directory structure clean as well as keeping dependencies under control. The nice thing is that there are recipes for most important things, and adding other new recipes (oh, I need to install and have access to coffeescript) are very straightforward.

I see it as a build step before fabric or puppet would take the output and deploy it.


I've used buildout for years but I'm now moving away from it. I've enjoyed it though, and I wouldn't recommend against it.

Several issues:

the buildout process is long and monolithic and not conducive to minor adjustments. I'm using ansible now and it's much better for just changing one setting on an nginx config file or settings file and reloading. I chatted with the author of buildout and he said he was building to an rpm and then mounting that as his means to do a live deploy.

I often wished I had virtualenv, so many things work well with it. for instance python-vim and sublime lint / rope like to have a virtualenv to get the python paths. also ctags is happier if I can just enter the virtualenv and run it.

there may be a recipe for that but I never found one that worked nicely.


This list is somewhat dependent on the size and complexity of the project. For a small or simple project, many of these points might not matter.

For example, if Apache can handle your traffic just fine, why spend time replacing it with gunicorn? Or if you never really migrate your database ever, why waste time fiddling with South?

Just a gentle reminder to take into consideration the present and future needs of your project to avoid needlessly adding complexity to it.


I think he's actually saying gunicorn was simpler than Apache. I've also found that to be true (anecdotally).


I like the ideas on directory structure. I've been working in Flask and I spent way too much time thinking about my directory structure.


You're also using Flask? I wrote a short piece about roughly the same subject as the author of the original article uses for Django. Running Flask behind gunicorn and nginx, monitored by supervisord:

http://www.michielovertoom.com/freebsd/flask-gunicorn-nginx-...


Yes, using Flask. I need to write up why we chose it over Django... but mostly because we're using Mongo, and once you're down the path of not using the Django ORM, what's the point? Also, it seems like there are dozens of articles on "how to structure" your large Django projects, so you don't even get good guidance about it out of the box.

I'm running mine under NGINX->uWSGI....but might be switching to Gunicorn. (I'm still researching our path to production).


I had to double check the title because I thought he said Django, yet all the advice seemed to match my Rails experience and not my Django one. Also, this may be a bit nitpicky, but Django isn't really an MVC framework, nor is it a CMS. You can only really call it a web framework; it has elements of these things but isn't the same:

1. The right directory structure: the default use for the media folder is supposed to be for stuff uploaded through your app, not the files you develop; your static files should go in static, and your compressed files should go in media (as they are dynamically generated) or in your global "static" folder that is pointed to from your web server. You should also have different settings.py files on a per-situation basis. Your development servers, staging, and production environments will probably all have different settings, because at the very least development will have `DEBUG = True` and production `DEBUG = False`.

2. This is fine even though it seems wonky to use it for cron jobs to me.

3. This is fine too but I personally use nginx+uwsgi w/ emperor mode.

4. Up to you on this one, there are some nice Django branches that support key-value store databases but don't use the standard Django with key-value databases because it is 100% built around RDBMS.

5. Only piece of advice I agree with completely.

6. As I partially discussed in #1. I don't like his override method. I think an import of a base settings file into production.py and development.py is cleaner (a sketch follows at the end of this comment).

7. Supervisor is good here but also if you use uwsgi in emperor mode instead of the suggested gunicorn it can handle the same task saving you an extra install and configuration.

8. Django very clearly has a nice Mixin for JSON responses in the docs or you can build a nice easy API using Tastypie: https://docs.djangoproject.com/en/1.5/topics/class-based-vie...

9. It's up to you if you want to use Redis, I don't personally need it for all the suggested things and I like how well memcached works with Django out of the box for caching.

10. Munin is great! I'm often too lazy to set it up and am fine reading log files.

11. This drove me to write this long comment... I actually get really annoyed when people try to mix in items from other stacks when there are many solid solutions already that don't force you to install another entire stack. Django has django_compressor which works great in this situation (https://github.com/jezdez/django_compressor) and a quick Google search will find many other similar solutions that won't require you to install Ruby to work with your Python web app.

Source: 6 years of developing Django apps and doing everything that was suggested here and more.
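
Re item 6, a minimal sketch of the base-import approach (module and host names are assumptions):

    # settings/production.py
    from settings.base import *  # noqa

    DEBUG = False
    TEMPLATE_DEBUG = DEBUG
    ALLOWED_HOSTS = ['example.com']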


The guy who wrote django-compressor fairly recently released a library to address the settings.py problem, django-configurations https://django-configurations.readthedocs.org/en/latest/

It uses classes to define your settings, and you can use inheritance to override and mixins to augment your bits of configuration. I usually have a "CommonSettings" class, with a few others extending it (Production, Development, etc.). At runtime the correct class is set based on an environment variable (you have to add a line or two to manage.py and wsgi.py to start the magic).

Anyway, it doesn't seem that popular but I've enjoyed using it (and done so without problem).


How is Django not an MVC framework? I'm not being snarky, just curious.


MVC means a ton of different things. Django itself does not use "MVC" to describe itself: https://docs.djangoproject.com/en/dev/faq/general/#django-ap...

> If you’re hungry for acronyms, you might say that Django is a “MTV” framework – that is, “model”, “template”, and “view.” That breakdown makes much more sense.


models -> models.py, controllers -> views.py, views -> templates,

right?


It's all semantics. As a long time django developer (4 years) I agree.


Just want to point out that MongoDB doesn't have full text search per se, but it can do nested array indexes... If you want stemming/phoneticization you need to do it as part of your input, and part of your search logic... if you front your queries with a service then it is easy enough to do.


Actually, as of 2.4 there is an experimental text search feature built-in: http://docs.mongodb.org/manual/core/text-search/. While it is no Lucene (and is still considered experimental) it does provide simple tokenizing, stop-words and stemming.


I tried various asset management tools but prefer having grunt manage all frontend related things. Node projects seem to be better tailored for this. Especially stuff like linting and watching for code changes, plus having a js based config file for people that don't speak python.


nginx with uwsgi is also an excellent alternative imho


using a settings module, requirements directory and environment variables makes a lot of sense (outlined in Two Scoops of Django at https://django.2scoops.org/, a good book)


"Use named URLs, reverse, and the url template tag": For Javascript URL handling use sth. like that https://github.com/version2/django-js-reverse


That is an impressively long list of tools to pick up and get running professionally, on top of being a web newbie. Hat tip to that man.


~ 2 years and a lot of pinot noir ....


solid list, I would also add that it is extremely useful (I would argue necessary) to use vagrant (http://www.vagrantup.com/) to manage your development environment.


this is actually on my todo list, i saw this video by zach holman a while back:

http://zachholman.com/screencast/vagranception/

and have been meaning to mess around with it ever since


Why does this page break "space" for paging down (in Firefox 20.0.1, Windows 7)?


Replace all comments with: Use web2py instead of django


Is mongodb a good choice for CRUD apps also?


depends on the actual app use. Simple in-and-out, low-traffic crud? It's probably not worth the extra effort. MySQL is still a great product that has and will continue to serve billions of web requests, and just because NoSQL is the new kid on the block doesn't mean that everything should use it.

That said, if you are storing data that is really just documents, Mongo or another document db is likely a good choice. For example, if you're storing interrelated performance measurements (like in a factory setting) Mongo would likely not be the best choice, but in a setting like an app for applying for a job or registering for events (natural document data segments) a document db would work well for modeling the domain, I'd think.


It really depends on the nature of your application, but roughly speaking... you'll probably regret using mongodb as your primary database at some point.


off-topic, this _Medium_ platform seems to mangle all outgoing urls…



