Dear Flask, Please Fix Your “Secure” Cookies (stacksmashing.net)
39 points by mxey on Aug 11, 2012 | 47 comments


Wait a second, this is 2012. Why are we still storing session data in cookies, whether encrypted or not?

Unless the data is really simple and you don't care if it's stolen, session data should always be stored on the server side, and only a meaningless identifier sent to the client.

Managing sessions on the server side used to be a PITA, to be sure, especially if you had multiple servers. But nowadays there are lightning-fast distributed key-value stores that are ideal for storing session data. No need to encrypt, base64-encode, base64-decode, and decrypt multiple kilobytes of data with every request. Think of all the bandwidth and latency you could save by not sending those kilobytes back and forth!


> But nowadays there are lightning-fast distributed key-value stores that are ideal for storing session data.

"Lightning fast" is still a lot slower than no lookup at all.

A few bytes of cookie cost next to nothing in terms of bandwidth or latency and can save a lot of hardware on the backend for large sites.


Security is more important. Storing anything in the cookie apart from a simple identifying token increases the attack surface.

If you use encryption and an HMAC to harden up said cookie, you need to a) not bugger up your scheme and b) pay the overhead of the calculations. For the HMAC it's trivial, but ciphers are a bit more involved.


I agree there's a small risk of buggering up. But it's a one-time effort to get it right and reviewed; encrypting a string is not exactly rocket science.

AES overhead is a non-issue for all but the very largest sites. A modern CPU can pipe north of 100MB/sec through AES - it's very unlikely to become the bottleneck in a python web-app.


Sure it costs nothing until you are compromised and any valuable data you have is taken. Remote code exploits are extremely dangerous. And mitigating them with a small amount of i/o overhead seems like a fair trade off.


That's unrelated.

You can store data in cookies without using a potentially unsafe serialization scheme. Baby != Bathwater.


> Why are we still storing session data in cookies, whether encrypted or not?

First of all, they are not encrypted; they have a MAC. Secondly, because you can verify the basic information of the request without hitting a database, which is very convenient. In fact, if it were not for signed cookies, Flask would not have any sessions, because it does not dictate a data store. However, you can easily change that: http://flask.pocoo.org/snippets/75/

> Think of all the bandwidth and latency you could save by not sending those kilobytes back and forth!

You are not saving anything there. A session should really only contain user id.


> You are not saving anything there. A session should really only contain user id.

So it looks like Flask is using something that just deliberately runs pickled strings. Even if I know my own secret key, I don't exactly need to be able to send myself arbitrary code. So JSON seems to make sense to me for my use cases.

    pickle.loads("cos\nsystem\n(S'whoami'\ntR.")

Right.. so anyway, does anyone know which session engine I want to use with Flask if I want to generate a token for a user's session, which I can then later revoke if I end up hating that user's session but not two other sessions?


> JSON seems to make sense to me for my use cases

It should be noted that the maintainer knows this, but has commented that he cannot change the default to json because folks are still storing non-json-safe data.

This is the right decision (for now). Otherwise, we'd see a "flask just broke everyone's apps" story on the front page.


Why do people keep doing these things in new frameworks? It has been known that it is not secure to receive pickles from clients for many years and yet the same mistake keeps being repeated.


Flask doesn't have a user system and as such stores nothing on its own in the session.


That's clearly not the case in all situations though. If it's simply a userid, there would be no reason to use pickle. JSON or simple string storage would be sufficient. Clearly people are storing far more than just a userid in their client-side sessions.


>You are not saving anything there. A session should really only contain user id.

This is all ours do. I'm not totally clear on where the actual security issue is here. Adding a level of indirection to a backend store doesn't seem like it would change much.


I don't have access to Flask or the ability to answer this question in any reasonable amount of time, so I'm asking. If your session has the user id, what format is the user id in? Is it the email address, or goodness forbid, an autoincremented numeric ID? Or anything else even remotely guessable? If so, you've still got a problem.

Obscuring through an opaque token of suitable entropy means that it is effectively impossible to guess a valid token. Accepting the client's idea of who the user is is playing with fire.
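
As a rough illustration of the opaque-token approach described above (and of per-session revocation, which was asked about earlier in the thread), here is a minimal sketch. The in-memory dict and the function names are assumptions made up for this example; they are not part of Flask, and a real deployment would use a shared key-value store.

```python
import secrets

# Hypothetical in-memory store; stands in for a shared key-value store.
_sessions = {}

def create_session(user_id):
    """Issue a random, unguessable token and remember what it maps to."""
    token = secrets.token_urlsafe(32)  # 32 random bytes, URL-safe encoded
    _sessions[token] = {"user_id": user_id}
    return token

def load_session(token):
    """Return the stored session data, or None for unknown tokens."""
    return _sessions.get(token)

def revoke_session(token):
    """Server-side logout: forget one token without touching the others."""
    _sessions.pop(token, None)
```

The client only ever sees the token, so guessing a user id gets an attacker nothing, and revoking one token leaves the same user's other sessions alone.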


Flask doesn't have its own user system, in the wild one sees stuff like user-ids and so on. It makes it very easy to make bad mistakes.


It's a MongoDB objectid. It's guessable. Still not seeing the problem yet, as you haven't bothered to explain it.


So what stops me from changing the user ID cookie to another user and impersonating them?


The HMAC authenticating the cookie.
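
For readers unsure what that means in practice, here is a minimal sketch of an HMAC-signed cookie value using only the standard library. The cookie layout, `SECRET_KEY` placeholder, and function names are assumptions for illustration; this is not Flask's actual wire format.

```python
import base64
import hashlib
import hmac

SECRET_KEY = b"change-me"  # placeholder; must stay private on the server

def sign(value: bytes, key: bytes = SECRET_KEY) -> bytes:
    """Return base64(value) + "." + base64(HMAC-SHA256(key, value))."""
    mac = hmac.new(key, value, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(value) + b"." + base64.urlsafe_b64encode(mac)

def unsign(cookie: bytes, key: bytes = SECRET_KEY):
    """Return the original value, or None if the cookie was altered."""
    try:
        payload_b64, mac_b64 = cookie.split(b".")
        value = base64.urlsafe_b64decode(payload_b64)
        mac = base64.urlsafe_b64decode(mac_b64)
    except ValueError:  # wrong number of parts or invalid base64
        return None
    expected = hmac.new(key, value, hashlib.sha256).digest()
    # compare_digest avoids leaking the match position through timing
    if not hmac.compare_digest(mac, expected):
        return None
    return value
```

Changing the user id in the payload invalidates the MAC, and producing a new MAC requires the secret key. The value itself is still readable by the client; signing provides integrity, not confidentiality.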


So it is a secure cookie, that wasn't clear.


So going back to my original question, does this mean I have nothing to worry about?


I won't dare give security advice with all the security professionals lurking around here.


K, falling back to my "not giving a fuck" mode I was in before and assuming any compromises of the session would involve access to my backend servers which is worse than losing control of the sessions.


On this topic, Beaker cookie sessions are encrypted and are great; however, neither Ben nor I have much time/interest in maintaining Beaker, as the caching is now modernized in dogpile.cache and neither of us agrees with Beaker's server-side session approach.

Might someone adapt Beaker's secure cookie session into its own pypi package and include a flask adapter with it? It's time we all got behind a best practice for sessions (which IMHO should be cookie based, encrypted, and very small/only basic identification with more significant data stored in the DB).


In information security, people used to think that web-based vulnerabilities were "lame" or "easy."

To some extent, they may have been correct: if you look at the relative complexity of HTTP header fuzzing or a SQL injection attack compared to finding 0day in, say, the Linux kernel, it's generally not even a contest.

That said, exploits like this demonstrate the carelessness that can go into many applications and frameworks on the web. In my years of application security, I've seen some "interesting" vulnerabilities like this, but I've seen thousands of people with standard, well-known security problems all over their applications.

"Broken Authentication & Session Management" is an extremely common finding for my team to write up. People think that, oh, we don't need this cookie to be marked Secure or HTTPOnly. "If the browser's owned, we're screwed anyway."

The problem with this uninformed stance on application security is that exploits do not always need to stand on their own to introduce a significant level of risk. For example, should a cookie not be marked HTTPOnly, the browser itself does not need to be compromised. Simple cross-site scripting (another extremely common issue in webapps) can easily access these cookies and throw them at malicious receivers.
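
As a small illustration of the flags mentioned above, this sketch builds a `Set-Cookie` value with both `HttpOnly` and `Secure` set, using Python's standard `http.cookies` module. The cookie name and value are placeholders; most frameworks expose the same flags through their own response API.

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["session"] = "opaque-token-value"  # placeholder value
cookie["session"]["httponly"] = True  # not readable from page JavaScript
cookie["session"]["secure"] = True    # only sent over HTTPS
cookie["session"]["path"] = "/"

# The string that would follow "Set-Cookie: " in the response headers.
header_value = cookie["session"].OutputString()
```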

This is just one example, but like the article referenced in the original post, it seems to me that some people just don't take security seriously.

That said, the research referenced here is pretty cool -- for those of you that missed it, you can read the Black Hat 2011 paper on this issue here: http://media.blackhat.com/bh-us-11/Slaviero/BH_US_11_Slavier...


I would love to switch the default to JSON but people store too much stuff in their sessions that does not work on top of JSON (datetime objects and token objects are the most common offender).

In fact the python-openid extension is the most common offender. Also I can't just switch the default because it would invalidate everyone's currently issued sessions.

The issue is not new, changing it without breaking things is the hard part.


It's a security issue. You're allowed to break things (or at least deprecate dangerous features)!


It's a security issue if people leak their secret key. At that point you have a huge problem anyways. Also breaking things assumes that afterwards you are left with an equally functioning system which I currently don't think I can guarantee.


No, I don't think "you have a huge problem anyways" is true, as there are a lot of other scenarios.


Like what?


One thing that you could and should add is AES-encryption.

It does nothing for the pickle issue but it's quite a shame to send cookies in plain-text in this day and age.


Why can't a custom JSON encoder be used with support for serializing/deserializing datetime objects?


It's still inline signalling and might conflict with what people have. Keep in mind that sessions are schemaless. It's not a huge problem changing that and I have a somewhat working implementation but it will break stuff, that's just how it works unfortunately.
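
To make the trade-off concrete, here is a minimal sketch of the tagged-JSON idea from the question, assuming a hypothetical `__datetime__` marker key. It is not the implementation mentioned above, only an illustration of how such an encoder could work and where it can conflict with existing data.

```python
import json
from datetime import datetime

TAG = "__datetime__"  # hypothetical marker key chosen for this example

def encode_default(obj):
    """Called by json.dumps for objects it cannot serialize natively."""
    if isinstance(obj, datetime):
        return {TAG: obj.isoformat()}
    raise TypeError(f"not JSON serializable: {type(obj).__name__}")

def decode_hook(d):
    """Called by json.loads for every decoded JSON object."""
    if set(d) == {TAG}:
        return datetime.fromisoformat(d[TAG])
    return d

def dumps(data):
    return json.dumps(data, default=encode_default)

def loads(text):
    return json.loads(text, object_hook=decode_hook)
```

A session such as `{"login": datetime(2012, 8, 11, 12, 30)}` round-trips unchanged. The catch is the inline signalling: an application that already stores a plain dict of the form `{"__datetime__": "..."}` would get a `datetime` back instead of its dict.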


For what it's worth, I implemented JSON based sessions for testing purposes: https://github.com/mitsuhiko/flask/commit/4df3bf2058954624f9...

They are currently on a separate branch and they are known to break python-openid (and with that Flask-OpenID). I don't have a solution for that yet. If someone is really concerned about security, they can copy/paste the code into their own project. The session interface in Flask has been pluggable for a while already.


Is there a reason why session storage isn't local, with clients only given the index into that storage?

That allows for data of any size or type, and the worst-case scenario, without code execution, is that an attacker hijacks an existing but valid session.


I should add that this is by no means the final implementation for Flask 0.10, in case I want to change that. If the implementation does change, I will give itsdangerous (a separate Python module) a way to serialize the custom objects properly and then add it as a dependency.


Thank you for the new implementation and your fast reaction!


No Python program should ever try to unpickle data from an untrusted source.
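
A harmless sketch of why: a pickle can name any importable callable and the arguments to call it with, and `pickle.loads` performs that call while loading. The `Payload` class here is made up for the demonstration and only evaluates arithmetic; an attacker would pick something like `os.system` instead.

```python
import pickle

class Payload:
    """Tells pickle to 'rebuild' this object by calling eval on a string."""
    def __reduce__(self):
        # (callable, args): the unpickler will run eval("6 * 7")
        return (eval, ("6 * 7",))

data = pickle.dumps(Payload())

# Loading does not recreate a Payload; it runs the callable and
# returns whatever that call produced.
result = pickle.loads(data)
print(result)  # 42
```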


There is a way to harden pickle to protect it against the most basic exploits: http://docs.python.org/py3k/library/pickle.html#restricting-...

It is not a complete solution, as an attacker could still DoS your service by making pickle allocate a huge amount of memory, but at least that's better than allowing arbitrary code execution.
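
The hardening described in that documentation section boils down to overriding `Unpickler.find_class` so that only an explicit allowlist of globals can be loaded. A sketch along those lines follows; the allowlist contents are an assumption for illustration and would need to match what an application really stores.

```python
import builtins
import io
import pickle

# Only these builtins may be referenced by a pickle; everything else fails.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")

def restricted_loads(data):
    """Like pickle.loads, but refuses globals outside the allowlist."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

With this in place, a payload that references `os.system` raises `pickle.UnpicklingError` instead of running a command, while plain lists, numbers, and allowlisted types still load. As noted above, it does nothing against memory-exhaustion attacks.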


It's signed, so the integrity is ensured as long as the secret key stays secret.


There was never any good reason to use pickle for this, though..


The rule of thumb is that each distinct kind of operation should have its own secret key.

For example I developed a protocol with hash-chaining and HMACs for messages. I have two keys: one for performing the hash-chain calculation, and a different key for the HMAC of the one time message.

Could I have used a single key? Absolutely, yes. But the time it takes to perform both is trivial. And I've bought myself a little extra protection from any attacks that reveal a few bits of a key, because gathering a bit here and there of different keys won't help my attacker as much as gathering bits for a universal key being used for everything.


Er, if someone has gotten access to your secret key, aren't they already into your server? Or is this just CYA for developer incompetence?


Not necessarily. They may have gotten access to an old backup of your server. This misfeature would let them escalate that to a compromise of your live server.

Attacks often involve multiple rounds of escalation as people keep on increasing their level of access. You want that escalation to be as hard as possible. Therefore you really don't want there to be a fact that could be floating around your organization in various ways that can be used to go straight to executing arbitrary code on your production site.


That was my thinking so far in regards to this issue. I do however understand the concerns so I am probably going to evaluate a few alternatives.


Local file inclusions are quite often very hard or impossible to exploit. Also, as linked in the blogpost, some may have just pushed them online by accident, someone may have lost a notebook, or whatever. So no, filesystem access is only one way.


Another problem we saw (but didn't mention): You can't log out sessions on the server side. If you forgot to log out somewhere, your session will stay valid. And an expiration date for the sessions is optional. Neither changing your password nor contacting the site admins will help you.


That's not really true. You could easily add another piece of information to your session that is derived from something stored with the user (for instance a running counter). So if someone runs away with your session you add a button "logout everything" which increments the counter. When the session is accessed for the next time the counter values don't match and the user is logged out.
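
A minimal sketch of that counter idea, with a plain dict standing in for the user table and made-up names (`session_epoch`, `logout_everywhere`); none of this is Flask API. Note that it costs one user lookup per request and invalidates all of a user's sessions at once, not individual ones.

```python
# Hypothetical user records; in practice this lives in the user database.
users = {"alice": {"session_epoch": 0}}

def new_session(user_id):
    """Session data to place in the signed cookie at login."""
    return {"user_id": user_id, "epoch": users[user_id]["session_epoch"]}

def is_valid(session):
    """A session is valid only if its counter matches the stored one."""
    user = users.get(session.get("user_id"))
    return user is not None and session.get("epoch") == user["session_epoch"]

def logout_everywhere(user_id):
    """Bump the counter so every previously issued session stops matching."""
    users[user_id]["session_epoch"] += 1
```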



