Cookies are filled with weird gotchas and uncomfortable behavior that works 99.95% of the time. My favorite cookie minefield is cookie shadowing - if you set cookies with the same name but different key properties (domain, path, etc.) you can get multiple near-identical cookies set at once - with no ability for the backend or JS to tell which is which.
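To make the shadowing problem concrete, here is a minimal sketch of how a browser assembles the Cookie header, assuming the path-match and ordering rules of RFC 6265 and ignoring the Domain attribute entirely. `StoredCookie`, `cookie_header`, and the `theme` cookie are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredCookie:
    name: str
    value: str
    path: str


def path_matches(request_path: str, cookie_path: str) -> bool:
    """Path-match rule from RFC 6265 section 5.1.4."""
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        # Prefix must end on a "/" boundary: "/account" must not match "/accounting".
        return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"
    return False


def cookie_header(jar: list[StoredCookie], request_path: str) -> str:
    """Build the Cookie header value a browser would plausibly send."""
    sent = [c for c in jar if path_matches(request_path, c.path)]
    # Cookies with longer paths are listed first (RFC 6265 section 5.4).
    sent.sort(key=lambda c: len(c.path), reverse=True)
    # Only name=value pairs go on the wire; Path and Domain are dropped.
    return "; ".join(f"{c.name}={c.value}" for c in sent)


jar = [
    StoredCookie("theme", "light", "/"),
    StoredCookie("theme", "dark", "/account"),
]
header = cookie_header(jar, "/account/settings")
print(header)  # theme=dark; theme=light
```

The receiver sees two `theme` values and nothing that says which attributes each one was set with.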
Yep. Even within the prod environment it's ideal to have a separate domain (as defined by the Public Suffix List) for sketchy stuff like files uploaded by users. Eliminates a whole class of security issues and general fuckery
Buy a second domain, ideally using the same TLD as your production domain (some firewalls and filters will be prejudiced against specific TLDs). Mimic the subdomains exactly as they are in production for staging/dev.
That only works if you (and any third party code that might run on such a domain) are completely consistent about always specifying the domain as one of your subdomains whenever you set a cookie.
And if your marketing/SEO/business people are ok with having something like "prod" as a subdomain for all your production web pages.
Usually it's mainsite.com for the marketing site, and then app.mainsite.com for actual production, or if you have multiple it'll have the product name, like coolproduct.mainsite.com
We then have app-stg and app-canary subdomains for our test envs which can only be accessed by us (enforced via zero trust). No reason for marketing or SEO teams to care in any case.
This works fine and is what I’ve done. But if you’re sending email from those domains or working with enterprise customers, using the same TLD will be helpful.
I had the option to re-use the prod domain for non-prod a few years ago (the company's other two projects use the prod domain for all non-prod environments).
I didn't really think about cookies back then but it just felt like a generally bad idea because disastrously messing up a URL in some config or related service would be much easier.
Nah dev should probably be a separate tld so the cookies are completely isolated.
Stage, it depends - if you want stage to have production data with newer code, and are fine with the session / cookies being shared - host it on the same domain and switch whether users get stage or prod based on IP, who is logged in, and/or a cookie. That way your code doesn't have to do anything different for stage vs prod every time it looks at the request domain (or wants to set cookies).
If you want an isolated stage environment, why not just use a separate top level domain? Otherwise you are likely setting yourself up for the two interfering with each other via cookies on the TLD.
I'm sure this will be replicated in future projects because it's much easier to argue "we're already following this pattern so let's be consistent" than "this pattern is bad and let's not have two ruined projects"
If you are on /somepath I'd expect to get C, as it is the most specific value out of all three. All the values are still returned, ordered, which to me is the best of both worlds (path-specific values + knowing the globals)
The only thing I don't like is the magic `document.cookie` setter, but alas that's nearly 30 years old.
... this came up recently after I tightened the validation in jshttp/cookie https://github.com/jshttp/cookie/pull/167 - since that PR the validation has been loosened again a bit, similar to the browser code mentioned in the article.
My changes were prompted by finding a bug in our code (not jshttp) where a cookie header was constructed by mashing the strings together without encoding; every so often a value would have a space and break requests. I was going to suggest using jshttp/cookie's serialize() to devs to avoid this but then realized that it didn't validate well enough to catch the bug we'd seen. I proposed a fix, and someone else then spotted that the validation was loose enough you could slip js into the _name_ field of the cookie which would be interpreted elsewhere as the _value_, providing an unusual vector for code injection.
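For illustration, a strict serializer along those lines might look like the sketch below. It is not jshttp/cookie's actual validation; it assumes the narrowest reading of the RFC 6265 grammar (a `token` for the name, unquoted `cookie-octet` characters for the value), and `strict_cookie_pair` is a hypothetical name.

```python
import re

# "token" characters: visible ASCII minus separators (used for cookie names).
_NAME_RE = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")
# "cookie-octet" from RFC 6265 section 4.1.1: visible ASCII except
# DQUOTE, comma, semicolon and backslash (and no whitespace).
_VALUE_RE = re.compile(r"[\x21\x23-\x2B\x2D-\x3A\x3C-\x5B\x5D-\x7E]*")


def strict_cookie_pair(name: str, value: str) -> str:
    """Return "name=value", or raise if either part strays from the grammar."""
    if not _NAME_RE.fullmatch(name):
        raise ValueError(f"invalid cookie name: {name!r}")
    if not _VALUE_RE.fullmatch(value):
        raise ValueError(f"invalid cookie value: {value!r}")
    return f"{name}={value}"


print(strict_cookie_pair("sid", "abc123"))  # sid=abc123
```

With this, a value containing a space or a name containing `<` fails loudly at the point where the header is built, instead of breaking a request later.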
This is one of those things where specs are still hard to parse.
It is considered invalid syntax to lead with a dot by the rules. But it also must be ignored if present. It's lacking a “MUST NOT” because the spec is defining valid syntax, while also defining behavior for back compat.
It would break too many things to throw here or serialize while ignoring the leading dot. Leading dots are discouraged, but shouldn't break anyone following the spec. Maybe a warn log in dev mode if serializing a domain with a dot, to try and educate users. Dunno if it's worth it though.
The point of jshttp IMO is to smooth over these kinds of nuances from spec updates. So devs can get output which is valid in as many browsers as possible without sacrificing functionality or time studying the tomes.
I do sympathise somewhat with that view, but I disagree. To be valid in as many browsers as possible, and as many back-end systems too, serialize() would have to take the _narrowest_ view of the spec possible. If you make cookies that stray from the spec, you cannot know if they will work as intended when they are read, you've baked in undefined behaviour. It's not just browsers; in our systems we have myriad backends that read the cookies that are set by other microservices, that could be reading them strictly and dropping the non-conformant values.
If you want to set invalid cookie headers, it's very easy to do so, I just don't think you should expect a method that says it will validate the values to do that.
The dot I can go along with because the behaviour is defined, but I'm less comfortable that a bunch of other characters got re-added a couple of days ago.
As for smoothing over nuances from spec updates...the RFC has been out there for 13 years, and jshttp/cookie has only been around for 12; there have been no updates to smooth, it has just never validated to the spec.
It means that you are setting cookies on whatever page you're on, without considering whether the cookie will be consistently accessible on other pages.
For example, you set the currency to EUR in /product/123, but when you navigate to /cart and refresh, it's back to USD. You change it again to EUR, only to realize in /cart/checkout that the USD pricing is actually better. So you try to set it back to USD, but now the cookie at /cart conflicts with the one at /cart/checkout because each page has its own cookie.
If you want cookies to be global, set them to / or leave out the path. If you want more fine-grained cookies, use a specific path. What's the problem? Currency is—in your example—clearly a site-wide setting. I think sites should make more of their hierarchical structure, not less.
If you leave out the path, it will default to the directory of the current URL, not /.
If not for this default behavior, it would have been much easier to manage global settings such as currency. Right now, all it takes is one cookie without a path to introduce inconsistency, only on some pages, in a way that's hard to reproduce.
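The default-path rule being discussed is short enough to write down. This is a sketch of the algorithm in RFC 6265 section 5.1.4; the function name is made up for illustration.

```python
def default_cookie_path(uri_path: str) -> str:
    """Path a cookie gets when Set-Cookie carries no Path attribute."""
    # An empty path, or one not starting with "/", defaults to "/".
    if not uri_path or not uri_path.startswith("/"):
        return "/"
    # A path with a single "/" (e.g. "/cart") also defaults to "/".
    if uri_path.count("/") <= 1:
        return "/"
    # Otherwise: everything up to, but not including, the right-most "/".
    return uri_path[: uri_path.rfind("/")]


print(default_cookie_path("/product/123"))    # /product
print(default_cookie_path("/cart"))           # /
print(default_cookie_path("/cart/checkout"))  # /cart
```

So a currency cookie set without a Path on /product/123 is scoped to /product and is not sent to /cart, which matches the behavior in the example above.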
Isn't that just the feature working as intended? Of course it is possible to introduce a bug by setting or not setting a cookie somewhere where it should/shouldn't be set.
I've never found a use for path-based cookies personally, but I'm not sure this is a particularly compelling example.
The typical example of a path-based cookie is the "remember my login name" feature, where you want the cookie with the user name only available on the login page. (And you cannot use session storage because you want it to work whilst logged out.)
That would include the cookie with each request, which is inefficient. And potentially it also can get sent with requests to other subdomains, which may not be desirable from a security point of view (it could be cdn.example.com, owned by someone else)
Server side session state for more than authentication is way worse than "code smell."
It requires a ping to a shared data source on every request. And, the same one for all of them. No sharding, No split domains... That gets expensive fast!
> In computer programming, a code smell is any characteristic in the source code of a program that possibly indicates a deeper problem. Determining what is and is not a code smell is subjective, and varies by language, developer, and development methodology.
I am using path to wire my http only cookies to be sent only to /api not in assets/html requests. The cookie will eventually contain a JWT token I do use as an access token. Consequently I will probably wire my refresh cookie only to be sent to /api/refresh-token and not in other requests.
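A rough sketch of that setup using Python's standard `http.cookies` module. The cookie names `access_token` and `refresh_token` and the token values are placeholders, not anything taken from the setup described above.

```python
from http.cookies import SimpleCookie


def scoped_cookie(name: str, value: str, path: str) -> str:
    """Build a Set-Cookie value that is HttpOnly, Secure and limited to `path`."""
    jar = SimpleCookie()
    jar[name] = value
    jar[name]["path"] = path
    jar[name]["httponly"] = True
    jar[name]["secure"] = True
    return jar[name].OutputString()


# Sent with requests under /api, not with asset or HTML requests elsewhere.
access = scoped_cookie("access_token", "abc123", "/api")
# Sent only to the refresh endpoint (and anything below it).
refresh = scoped_cookie("refresh_token", "def456", "/api/refresh-token")

print("Set-Cookie:", access)
print("Set-Cookie:", refresh)
```

Note that Path narrows where the browser sends the cookie; it is not a security boundary on its own, which is why HttpOnly and Secure are set as well.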
The client won't get to decide which cookie to send where.
But if the attributes are exactly the same then the cookies replace each other. So this isn't a general mechanism for representing a list.
Not to mention that the way to delete a cookie is sending a replacement cookie that expires in the past. How are you supposed to delete the right cookie here?
And the worst is that you need to exactly match the domain and path semantics in order to delete the cookie! Domain is easy enough because there are only two options - available to subdomain and not available to subdomain. But if you have a cookie with the `/path` set and you don't know what value was used, you literally cannot delete that cookie from JS or the backend. You need to either pop open devtools and look at the path or ask the end user to clear all cookies.
Is there a way for JS to see the attributes for each value?
Because presumably setting an expire time in the past and iterating over every used set of attributes would get the job done to delete the cookie.
Iterating over all possible (plausible?) attributes may also work, but knowing the specific attributes set would narrow that list of erasing writes to issue.
The article mentions Rust's approach, but note that (unlike the other mentioned languages) Rust doesn't ship any cookie handling facilities in the standard library, so it's actually looking at the behavior of the third-party "cookie" crate (which includes the option to percent-encode as Ruby does): https://docs.rs/cookie/0.18.1/cookie/
Thanks for pointing that out -- I've updated the article and given you credit down at the bottom. Let me know if you'd prefer something other than "kibwen."
Not really. A lot of essential third party Rust crates and projects have "weird" names, eg. "nom", "tokio", etc. You can see that from the list of most downloaded crates [1].
This one just happens to have been owned and maintained by core Rust folks and used in a lot of larger libraries. This is more the exception than the rule.
It's a given that you should do due diligence on crates and not just use the first name that matches your use case. There's a lot of crate name squatting and abandonware.
Rust crates need namespacing to avoid this and similar problems going forward.
A sibling comment talked about “UwU names”. Not sure exactly if they are referring to “tokio” or something else. But if it’s tokio, they might find this informative:
> I enjoyed visiting Tokio (Tokyo) the city and I liked the "io" suffix and how it plays w/ Mio as well. I don't know... naming is hard so I didn't spend too much time thinking about it.
The name of the city is 東京 -- anything in Latin characters is a rough transliteration. Tokio was the common spelling in European texts until some time last century, and is still used regularly in continental Europe.
> Rust crates need namespacing to avoid this and similar problems going forward.
It hasn't been implemented despite crowd demanding it on HN for years because it won't solve the problem (namespace squatting is going to replace name squatting and tada! you're back to square one with an extra step).
I do agree that people will assume xyz/xyz is more authoritative than some-org/xyz, but I think there is benefit to knowing that everything under xyz/* has a single owner. The current approach is to name companion crates like xyz_abc but someone else could come along with xyz_def and it's not immediately obvious that xyz_abc has the same owner as xyz but xyz_def does not.
This is a completely different topic though, and I think there's interest in shipping something like that.
That's the main problem with “just add namespace FFS” discussions that come up every other week: everyone has their own vision of what namespaces should look like and what they are meant for, but nobody has ever taken the time to write an RFC with supporting arguments. In fact, people bring this up mostly in ways that are related to name squatting (like right here) even though that's not a problem namespaces can solve in the first place. It's magical thinking at its finest.
Exactly, this isn't about “default namespace”, this is the other feature which I said had support (didn't know the RFC had been merged though, thanks for pointing that out).
This isn't the kind of namespace people say they want to prevent squatting.
Solved the problem almost completely in npm. Sure you can't search for a name of a company or a project and expect it to be related to the company or project.
But there's no way to solve that.
But once you know a namespace is owned by a company or project, you can know that everything under it is legit.
Which solves the vast majority of squatting and impersonation problems.
Also you know that everything under "node" for example is part of the language.
> Sure you can't search for a name of a company or a project and expect it to be related to the company or project. But there's no way to solve that.
There's a way to solve it partially: you can have a special part of your namespace tied to domains and require that eg com.google.some-package be signed by a certificate that can also sign some-package.google.com
Of course, there's no guarantee that https://company.com belongs to the company, but the public has already developed ways of coping with that.
(I specifically suggest doing that only to part of your namespace, because you still want people to be able to upload packages without having to register a domain first.)
That just makes package names harder to remember and type (and actually less secure, as they're more prone to typosquatting and backdoors in seemingly harmless pull requests) for no benefit.
Keep in mind that the majority of packages by far don't come from companies in the first place, and requiring individual developers to have a domain of their own isn't particularly welcoming.
It's going to be tons of complexity for zero actual benefit.
One wonders if Bluesky's approach to usernames might one day inspire a future package manager in this direction: a GUID that is then aliased to a friendly (sub)domain through proof of ownership, with a default fallback domain for those without a domain (i.e. mypkg.crates.io vs mypkg.philpax.me)
There are problems it does solve though. It’s incomprehensible that we get so many new package managers that fail to learn from the bajillion that came before.
It actually learned and that's what makes cargo as good as it is (arguably the best of all that came before, and a source of inspiration for the ones that came after).
But its authors rightly concluded that it's useless to expect to prevent name squatting by any technical means!
I recall in the Elm community there was a lot of hooplah around the package system aligning too much with a single repo provider (github) so that might be one disincentive there.
php deals with this by using the username/organization name of a repository as the namespace name of packages. At least then you're having to squat something further up the food chain.
Did anyone else notice that the HTTP protocol embeds within it ten-thousand different protocols? Browsers and web servers both "add-on" a ton of functionality, which all have specifications and de-facto specifications, and all of it is delivered through the umbrella of basically one generic "HTTP" protocol. You can't have the client specify what version of these ten-thousand non-specifications it is compatible with, and the server can't either. We can't upgrade the "specs" because none of the rest of the clients will understand, and there won't be backwards-compatibility. So we just have this morass of random shit that nobody can agree on and can't fix. And there is no planned obsolescence, so we have to carry forward whatever bad decisions we made in the past.
This is also the fault of shit-tastic middleware boxes which block any protocol they don't understand-- because, hey, it's "more secure" to default-fail, right?-- so every new type of application traffic until the end of time has to be tunneled over HTTP if it wants to work over the real Internet.
> middleware boxes which block any protocol they don't understand-- because, hey, it's "more secure" to default-fail, right?
If the intent is to secure something then failing-open will indeed be at odds with that goal. I suspect you’re not implying otherwise, but rather expressing frustration that such providers simply can’t be bothered to put in the work and use security as an excuse.
> a monopoly dictate a nice clean spec which they can force-deprecate whenever they want
We already have that at times. Apple forced a change to cert expiration that nobody else wanted, but everyone had to pick up as a result. Google regularly forces new specs, and then decides "actually we don't like it now" and deprecates them, which others then have to do as well. Virtually all of the web today is defined by the 3 major browser vendors.
If all these "specs" actually had versions, and our clients and servers had ways to just serve features as requested, then we could have 20 million features and versions, but nothing would break.
Example with cookies: if the cookie spec itself was versioned, then the server could advertise the old version spec, clients could ask for it, and the server could serve it. Then later if there's a new version of the spec, again, new clients could ask for it, and the server could serve it. So both old and new clients get the latest feature they support. You don't have to worry about backwards compatibility because both client and server can pin versions of specs and pick and choose. You can make radical changes to specs to fix persistent issues (without worrying about backwards compatibility) while simultaneously not breaking old stuff.
But we can't do that, because "cookies" aren't their own spec with their own versions, and clients and servers have no way of requesting or advertising versions of specs/sub-specs.
You could actually implement this on top of HTTP/1.1:
1. Make cookies their own spec, and version it
2. Add a new request header: "Cookie-Spec-Ver-Req: [range]"
3. If there's no request header, the spec is version 1.0
4. If Server sees a request header, it determines if it supports it
5. Server replies with "Cookie-Spec-Ver: <version>" of what it supports, based on the request
6. Client receives the spec it requested and handles it accordingly
Do that for every weird "feature" delivered over HTTP, and suddenly we can both have backwards-compatibility and new features, and everything is supported, and we can move forward without breaking or obsoleting things.
This actually makes sense from a programmatic standpoint, because "Cookies" are implemented as their own "Spec" in a program anyway, as a class that has to handle every version of how cookies work. So you might as well make it explicit, and have 5 different classes (one per version), and make a new object from the class matching the version you want. This way you have less convoluted logic and don't have regressions when you change things for a new version.
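The server side of the steps above could be sketched roughly as follows. Everything here is hypothetical: the `Cookie-Spec-Ver-Req` and `Cookie-Spec-Ver` headers do not exist, the "min-max" range syntax is an assumption (the proposal only says "[range]"), and versions are treated as whole numbers for simplicity.

```python
from typing import Optional

# Versions of the imaginary cookie spec this server implements.
SUPPORTED_VERSIONS = {1, 2, 3}


def negotiate_cookie_spec(request_headers: dict[str, str]) -> Optional[int]:
    """Pick the highest supported version inside the client's requested range."""
    requested = request_headers.get("Cookie-Spec-Ver-Req")
    if requested is None:
        return 1  # step 3: no header means the original spec
    low_text, _, high_text = requested.partition("-")
    low = int(low_text)
    high = int(high_text or low_text)  # "2" is treated as the range 2-2
    usable = [v for v in SUPPORTED_VERSIONS if low <= v <= high]
    return max(usable) if usable else None


def response_headers(request_headers: dict[str, str]) -> dict[str, str]:
    """Step 5: advertise the version the server will actually use."""
    version = negotiate_cookie_spec(request_headers)
    return {} if version is None else {"Cookie-Spec-Ver": str(version)}


print(response_headers({"Cookie-Spec-Ver-Req": "2-5"}))  # {'Cookie-Spec-Ver': '3'}
```

What the server should do when no version overlaps is left open in the proposal; this sketch simply omits the response header.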
About 10 years ago I implemented cookie based sessions for a project I was working on. I had a terrible time debugging why auth was working in Safari but not Chrome (or vice-versa, can't remember). Turned out that one of the browsers just wouldn't set cookies if they didn't have the right format, and I wasn't doing anything particularly weird, it was a difference of '-' vs '_' if I recall correctly.
IIRC there is (or was?) a difference in case-sensitivity between Safari and Chrome, maybe with the Set-Cookie header? I've run into something before which stopped me from using camelCase as cookie keys.
Can't seem to find the exact issue from googling it.
I got the impression that almost as soon as they were introduced people thought the only sensible use of cookies is to set an opaque token so the server can recognize the client when it sees it again, and store everything else server side.
I don't understand why it's a problem that the client (in principle) can handle values that the server will never send. Just don't send them, and you don't have to worry about perplexing riddles like "but what would happen if I did?"
Cookies are an antiquated technology. One of the first introduced while the web was still young in the 90s, and they have had a few iterations of bad ideas.
They are the only place to store opaque tokens, so you gotta use them for auth.
They are not the only place to store tokens. You can store tokens with localStorage for JS-heavy websites, in fact plenty of websites do that. It's not as secure, but acceptable. Another alternative is to "store" the token in the URL; it was widely used in Java for some reason (jsessionid parameter).
To expand on the "not as secure" comment: local storage is accessible to every JS that runs in the context of the page. This includes anything loaded into the page via <script src=""/> like tracking or cookie consent services.
And I feel like it's important to expand on the fact that Cookies are visible to JS by default as well, except if the Cookie has the `HttpOnly` attribute set. Obviously, for auth, you absolutely want the session cookie to have both the `Secure` and `HttpOnly` attributes.
Cookie header parsing is a shitshow. The "standards" don't represent what actually exists in the wild, each back-end server and/or library and/or framework accepts something different, and browsers do something else yet.
If you are in complete control of front-end and back-end it's not a big problem, but as soon as you have to get different stuff to interoperate it gets very stupid very fast.
Cookies seem to be a big complicated mess, and meanwhile are almost impossible to change for backwards-compatibility reasons. Is this a case to create a new separate mechanism? For example a NewCookie mechanism could be specified instead, and redesigned from the ground-up to work consistently. It could have all the modern security measures built-in, a stricter specification, proper support for unicode, etc.
Imagine pwning a frontend server or proxy, spawning an http/s server on another port, and being able to intercept all cookies and sessions of all users, even when you couldn't pwn the (fortified) database.
This could have a huge advantage, because if you leave the original service untouched on port 80/443, there is no alert popping up on the defending blueteam side.
I think one important use case we have for cookies is "Secure; HttpOnly" cookies. Making a token totally inaccessible from JS, but still letting the client handle the session is a use case that localStorage can't help with. (Even if there's a lot of JWTs in localStorage out there.)
However, potentially a localStorage (and sessionStorage!) compatible cookie-replacement api might allow for annotating keys with secure and/or HttpOnly bits? Keeping cookies and localStorage in sync is a hassle anyhow when necessary, so having the apis align a little better would be nice. Not to mention that that would have the advantage of partially heading off an inevitable criticism - that users don't want yet another tracking mechanism. After all, we already have localStorage and sessionStorage, and they're server-readable too now, just indirectly.
On the other hand; the size constraints on storage will be less severe than those on tags in each http request, so perhaps this is being overly clever with risks of accidentally huge payloads suddenly being sent along with each request.
I think if I were implementing a webapp from scratch today I'd use one single Session ID cookie, store sessions in Redis (etc) indefinitely (they really aren't that big), and for things meant to be stored/accessed on the frontend (e.g. "has dismissed some dumb popup") just use local storage. Dealing with anything to do with cookies is indeed incredibly painful.
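A bare-bones sketch of that design: the cookie carries only an opaque, fixed-length ID and everything else lives server side. The in-memory dict stands in for Redis purely for illustration, and `SessionStore` is a hypothetical name.

```python
import secrets
from typing import Optional


class SessionStore:
    """Maps opaque session IDs to server-side session data."""

    def __init__(self) -> None:
        self._sessions: dict[str, dict] = {}  # stand-in for Redis

    def create(self, data: dict) -> str:
        # 32 random bytes as 64 hex characters: always cookie-safe ASCII.
        session_id = secrets.token_hex(32)
        self._sessions[session_id] = dict(data)
        return session_id

    def load(self, session_id: str) -> Optional[dict]:
        # Unknown or forged IDs simply yield no session.
        return self._sessions.get(session_id)


store = SessionStore()
sid = store.create({"user_id": 42})
print(f"Set-Cookie: sid={sid}; Path=/; Secure; HttpOnly")
print(store.load(sid))  # {'user_id': 42}
```

Because the value is plain hex, none of the encoding or parsing quirks around cookie values come into play.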
I think they mean that you can always send back the content of a localstorage property with javascript grabbing the value and sending another request back with it in the body. Since the front end is going to run any javascript the server sends it (disregarding adblockers at least), it's sort of a more indirect version of Set-Cookie.
I think the main problem there is that cookies are so intractably tied up with tracking, any attempt to create better cookies now will get shut down by privacy advocates who simply don't want the whole concept to exist.
Every privacy advocate I know hands over exquisitely detailed private and personal information to Google and/or Apple. It seems unfair to generalize as “privacy advocates” so much as it is people who are anti-ads.
Being anti-ads is a valid opinion. It has less intellectual cover than pro “privacy” though.
The DOM & URL are the safest places to store client-side state. This doesn't cover all use cases, but it does cover the space of clicking pre-authorized links in emails, etc.
I spent a solid month chasing ghosts around iOS Safari arbitrarily eating cookies from domains controlled by our customers. I've never seen Google/Twitter/Facebook/etc domains lose session state like this.
Safari is a lot more strict about cookies than Chromium or Firefox, it will straight up drop or ignore (or, occasionally, truncate) cookies that the other two will happily accept.
I had hoped when writing this article that Google would look at Safari, see that it was always strict, and feel comfortable about changing to be the same. But doing so now would unfortunately break too many things for too many users.
If I open a second window or tab I expect when I go to 'myemail.com' that it knows who I am and shows me my account even though the url in the 2nd tab doesn't have any extra info in the URL
Cookies need to die. Their only legitimate use is auth, for which we have the Authorization header. Having a standard way to authenticate into a website in a browser would be amazing, just too bad that Basic and Digest auth weren’t good enough at the time.
As a bonus we could get Persona-style passwordless future.
They are not bad, they are just unnecessary. If your application uses local state, use local storage. If you store session data on the server, identify the user using the Authorization header. Why send arbitrary strings back and forth, often with requests that don’t need them? Plus the technology is clearly rotten. They never got namespacing and expiration right so you can just do weird stuff with them. Also, CSRF wouldn’t be a thing if cookies weren’t. This is like saying “why is finger/gopher/etc. bad?” They are not exactly bad but they are obsolete.
Take a look at how basic auth is implemented in browsers today. Now imagine expanding it to (a) provide a much nicer and somewhat customizable UI for entering your credentials and (b) using proper encryption.
What about redirects from other sites, should Authorization behave like cookies? My point is cookies are ok for auth, and you basically should invent same things with another header.
That header was invented for this exact purpose before cookies were invented. It has wide browser support and semantics that make sense. Moreover, the design specifically includes provisions for additional auth mechanisms (basic and digest being the two most widely used). The downside was that the UI for setting that header was ugly.
Your comments remind me of the people who didn’t get HTTP verbs and wanted to use POST for everything before rediscovering REST.
Not a web dev. So do I understand it correctly that it's not so much the server side of this that's the issue, after all the Authorization header contains a nice token, but rather how to safely store the token client side without using cookies?
Identifier in local storage could be stolen by 3rd party JavaScript. Anybody who wants to use local storage for sensitive information should read why there is a httpOnly cookie attribute.
If you are running third party JS on your site they can just make requests to your server now. Once JS is loaded it is running in the context of your domain. No they can’t do it once the user closes the browser but third party JS is XSS in action.
And I am not suggesting using local storage for it. I am suggesting adding browser support for standard/generic login UI. Basically think basic auth, just not so basic.
Author started with throwing the results of JSON.stringify into a cookie, and I was surprised that his issue wasn't just that someone had thrown a semicolon into the JSON that was being stringified.
Most of the headaches around cookies seem to be around people trying to get them to work with arbitrary user input. Don't do that. Stick with fixed-length alphanumeric ASCII strings (the kind you use for auth tokens) and you'll be fine.
The way around this, as a developer, is URL-safe-base64 encode the value. Then you have a bytes primitive & you can use whatever inner representation your heart desires. But the article does also note that you're not 100% in control, either. (Nor should you be, it is a user agent, after all.)
I do wish more UAs opted for "obey the standard" over "bytes and a prayer on the wire". Those 400 responses in the screenshots … they're a conforming response. This would have been better if headers had been either UTF-8 from the start (but there are causality problems with that) or ASCII and then permitted to be UTF-8 later (but that could still cause issues since you're making values that were illegal, legal).
And make sure to specify what exactly you mean by that. base64url-encoding is incompatible with base64+urlencoding in ~3% of cases, which is easily missed during development, but will surely happen in production.
… yeah. I assume they're getting that from doing 3/64, but for uniform bytes, you're rolling that 3/64 chance every base64-output-character. (And bytes are hardly uniform, either … TFA's example input of JSON is going to skew towards that format's character set.)
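The difference between the two encodings being compared is easy to see with the standard library. A small sketch, assuming "base64url" means the URL-safe alphabet with padding stripped and "base64+urlencoding" means standard base64 followed by percent-encoding; the helper names are made up.

```python
import base64
from urllib.parse import quote


def base64url(data: bytes) -> str:
    """URL-safe alphabet ("-" and "_"), padding removed."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def base64_then_percent(data: bytes) -> str:
    """Standard base64, then percent-encode "+", "/" and "="."""
    return quote(base64.b64encode(data).decode("ascii"), safe="")


# Identical when the output happens to contain no "+", "/" or padding...
print(base64url(b"abc"), base64_then_percent(b"abc"))            # YWJj YWJj
# ...and different as soon as it does.
print(base64url(b"\xfb\xff"), base64_then_percent(b"\xfb\xff"))  # -_8 %2B%2F8%3D
```

A decoder written for one form will not accept the other, so both sides need to agree on which one "URL-safe base64" refers to.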
The article mocks Postel's law, but if the setter of the cookie had been conservative in what they sent, there would have been no need for the article...
As they should. Postel's Law was a terrible idea and has created minefields all over the place.
Sometimes, those mines aren't just bugs, but create gaping security holes.
If your client is sending data that doesn't conform to spec, you have a bug, and you need to fix it. It should never be up to the server to figure out what you meant and accept it.
Following Postel's law does not mean to accept anything. The received data should still be unambiguous.
You can see that in the case where ASN.1 data need to be exchanged. You could decide to always send them in the DER form (conservative) but accept BER (liberal). BER is still an unambiguous encoding for ASN.1 data but allow several representations for the same data.
The problem with BER mainly lies with cryptographic signatures, as the signature will only match a specific encoding, so that's why DER is used in certificates. But you can still apply Postel's law: you may still accept BER fields when parsing a file. If the field has been incorrectly encoded in a varied form which is incompatible with the signature, you will just reject it as you would reject it because it is not standard with DER. But still, you lessen the burden to make sure all parts follow exactly the standards the same way and things tend to work more reliably across server/client combinations.
You could split the difference with a 397 TOLERATING response, which lets you say "okay I'll handle that for now, but here's what you were supposed to do, and I'll expect that in the future". (j/k it's an April Fool's parody)
And yet the html5 syntax variation survived (with all its weird now-codified quirks), and the simpler, stricter xhtml died out. I'm not disagreeing with you; it's just that being flexible, even if it's bad for the ecosystem, is good for surviving in the ecosystem.
There was a lot of pain and suffering along the way to html5, and html5 is the logical end state of postel's law: every possible sequence of bytes is a valid html5 document with a well-defined parsing, so there is no longer any room to be more liberal in what you accept than what the standard permits (at least so far as parsing the document).
Getting slightly off topic, but I think it's hard to find the right terminology to talk about html's complexities. As you point out, it isn't really a syntax anymore now that literally every sequence is valid. Yet the parsing rules are obviously not as simple as a .* regex. It's syntactically simple, but structurally complex? What's the right term for the complexity represented by how the stack of open elements interacts with self-closing or otherwise special elements?
Anyhow, I can't say I'm thrilled that some deeply nested subtree of divs for instance might be closed by an open-button tag just because they were themselves part of a button, except when... well, lots of exceptions. It's what we have, I guess.
It's also not a (fully) solved problem; just earlier this year I had to work around an issue in the chromium html parser that caused IIRC quadratic parsing behavior in select items with many options. That's probably the most widely used parser in the world, and a really inanely simple repro. I wonder whether stuff like that would slip through as often were the parsing rules at all sane. And of course encapsulation of a document-fragment is tricky due to the context-sensitivity of the parsing rules; many valid DOM trees don't have an HTML serialization.
I agree that being liberal in what you accept can leave technical debt. But my comment was about the place in the code where they set a cookie with JSON content instead of keeping to a format that is known to pass easily through HTTP header parsing, like base64. They should have been conservative in what they sent.
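A minimal Python sketch of what "conservative in what they sent" could look like for a JSON payload. The helper names are made up for illustration, and stripping the `=` padding is just one possible choice (the character is legal in a cookie value, but some hand-rolled parsers split on it).

```python
import base64
import json


def encode_cookie_value(obj) -> str:
    """Serialize obj to JSON, then to unpadded base64url (letters, digits, '-', '_')."""
    raw = json.dumps(obj, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")


def decode_cookie_value(value: str):
    """Inverse of encode_cookie_value: restore the padding, then decode and parse."""
    padded = value + "=" * (-len(value) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


prefs = {"city": "Røros", "items": [1, 2, 3]}
cookie_value = encode_cookie_value(prefs)
print(cookie_value)
print(decode_cookie_value(cookie_value) == prefs)
```

The cost is roughly a third more bytes on every request, which matters given how small cookie size limits are.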
So, just be as conservative as possible when you produce data and as liberal as possible when you receive something. Your code will then require the least cooperation from *any* other code to be compatible with.
Doing otherwise will require cooperation to adjust to the specifics clients expect, and you fall into the trap of the prisoner's dilemma.
You changed the problem. Postel's law is not about writing the protocol but implementing it.
Sure, protocols should be designed to be as specific as possible, but unfortunately they are not always defined to that level, for good or bad reasons, and we are generally at best on the implementation side and cannot influence the writing of the protocol, so Postel's law is the best we can apply to avoid having to cooperate with the rest of the planet.
I came across a similar issue when experimenting with the Crystal language. I thought it would be fun to build a simple web scraper to test it out, only to find the default HTTP client fails to parse many cookies set by the response and aborts.
In both cases (cookie vs localStorage) you're really just storing your data as a string value, not truly a JSON object, so whether you use a cookie or localStorage is more dependent on the use case.
If you only ever need the stored data on the client, localStorage is your pick. If you need to pass it back to the server with each request, cookies.
Right, I meant it's not a JavaScript object. It's serialized into a string in any case, no matter which API you're stuffing it into. So it's a bit of a non-sequitur for the parent to suggest that it's somehow weird to store JSON in a cookie, but not in localStorage. It's all just strings.
I find it weird too. I’ve always considered cookies like very stupid key value stores.
It would never occur to me to put something more than a simple token in a cookie. A username, and email address, some opaque thing.
The idea of trying to use it for arbitrary strings just seems weird to my intuition, but I don’t really know why. Maybe just because when I was learning about them long ago I don’t remember seeing that in any of the examples.
Httponly cookie is the way, but then you just don't use json as a cookie value that is sent on every request.
Csrf is no problem as the data from service worker is only active on the site itself.
If you speak about csrf with a website where you can't trust js, your site is broken as xhr/fetch use the same httponly cookies and are affected as well.
I mean, anyone can open devtools and change the code to do whatever ... or install an extension that does it. So, since when can you guarantee that a browser client will actually do what you program it to do? In my experience, you can't guarantee anything on the client -- since forever. I was asking when/if that changed. I don't see why you would make that a personal attack?
Well you see when a front end developer and a backend developer hate each other very much, they do a special hug and nine days later a 400 request header or cookie too large error is born.
(Seriously though, someone trying to implement breadcrumbs fe-only)
And the article isn't even about the proliferation of attributes cookies have, that browsers honor, and in some cases are just mandatory. I was trying to explain SameSite to a coworker, and scrolled down a bit... https://developer.mozilla.org/en-US/docs/Web/HTTP/Cookies#co... wait, cookie prefixes? What the heck are those? The draft appears to date to 2016, but I've been trying to write secure cookie code for longer than that, hadn't heard of it until recently, and I can't really find when they went in to browsers (because there's a lot more drafts than there are implemented drafts and the date doesn't mean much necessarily), replies explaining that welcome.
Seems like every time I look at cookies they've grown a new wrinkle. They're just a nightmare to keep up with.
Well, prefixes are opt-in. You don't have to keep-up with them.
The only recent large problem with cookies was the changes to avoid CSRF; those were opt-out, but they were also extremely overdue.
All of the web standards are always gaining new random features. You don't have to keep-up with most of them. They do look like bad abstractions, but maybe it's just the problem that is hard.
I was answering your question about when they went into browsers with a link, and summarizing it in a parenthetical. So much for “replies explaining that welcome”, I guess.
It's the first part of your reply they're responding to, where it looks like you've answered their rhetorical question with the exact link they used to illustrate it.
I'd guess you just screwed up your copy paste and didn't notice.
Go and failing to parse http headers correctly should become a meme at some point.
One issue we had was the reverse proxy inserting headers about the origin of the request to the server behind. Like ip, ip city lookup etc. And those were parsed by a service written in Go that just crashed whenever the city had a Norwegian letter in it. It took ages to understand why some of our (luckily only internal) services didn't work for coworkers working from Røros, for instance. And that was again not the fault of the Go software, but how the stdlib handled it.
That is true, but in that case they are part of the value itself, they're not doing anything special:
> Per the grammar above, the cookie-value MAY be wrapped in DQUOTE characters. Note that in this case, the initial and trailing DQUOTE characters are not stripped. They are part of the cookie-value, and will be included in Cookie header fields sent to the server.
The "law" is: "Be liberal in what you accept, and conservative in what you send."
But here the problem is caused by being liberal in what is sent while being more conservative in what is accepted. It's using invalid characters in the cookie value, which not everything can handle.
Following Postel's law would have avoided the problem.
Postel's law is the main reason why there are so many cases where something is being liberal in what it sends. It's a natural approach when trying to enter into an existing ecosystem, but when the whole ecosystem follows it you get a gigantic ball of slightly different interpretations of the protocol, because something that is non-compliant but happens to work with some portion of the ecosystem won't get discovered until it's already prevalent enough that it now needs to be accounted for by everyone, complexifying the 'real' spec and increasing the likelihood someone else messes up what they send.
I don't think you can blame postel's law for people not following it.
> when the whole ecosystem follows it you get a gigantic ball of slightly different interpretations
You're describing the properties of a long-lived, well-used, well-supported, living system. We'd all like the ecosystems we have to interact with to be consistent and well-defined. But even more importantly, we'd like them to exist in the first place. Postel's law lets that happen.
If your app is a leaf node in the ecosystem, and it's simple enough that you have direct control over all the parts of your app (such that you can develop, test, and release updates to them on a unified plan/timeline), then, yes, fail-early pickiness helps, because the failures happen in development. Outside of that you end up with a brittle system where the first place you see many failures is in production.
One of the things I’ve always found frustrating about cookies is that you have to do your own encoding instead of the API doing it for you. I’m sure someone somewhere does but too often I’m doing my own urlencode calls.
Encoding is at least solvable, but every browser having their own cookie length versus some standard value makes that some nonsense. Kong actually has a plugin to split (and, of course, recombine) cookies just to work around this
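For illustration only, splitting and recombining could look something like the Python sketch below. The `name.0`, `name.1` naming scheme and the 4000-character chunk size are assumptions of mine, not what the Kong plugin actually does.

```python
def split_cookie(name: str, value: str, max_len: int = 4000) -> dict[str, str]:
    """Split value into cookies named name.0, name.1, ... of at most max_len chars."""
    chunks = [value[i:i + max_len] for i in range(0, len(value), max_len)] or [""]
    return {f"{name}.{i}": chunk for i, chunk in enumerate(chunks)}


def join_cookie(name: str, cookies: dict[str, str]) -> str:
    """Reassemble chunks in index order, stopping at the first missing index."""
    parts = []
    i = 0
    while f"{name}.{i}" in cookies:
        parts.append(cookies[f"{name}.{i}"])
        i += 1
    return "".join(parts)


session = "x" * 9001
chunks = split_cookie("sess", session)
print(sorted(chunks))
print(join_cookie("sess", chunks) == session)
```

A real implementation would also have to expire leftover chunks when a value shrinks, otherwise a stale `sess.2` gets glued onto a newer, shorter value.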
> Many languages, such as PHP, don't have native functions for parsing cookies, which makes it somewhat difficult to definitively say what it allows and does not allow.
That reminds me of the Frog and Toad story about willpower vs eating cookies. Yes, handling cookies is a mine field!
I read the collected stories with my two year old, though I made sure we skipped the scary ones with the Dark Frog. I think the cookies ending was a little over his head, but we had fun taking turns acting out Toad pulling his blankets over his head when Frog tells him it's spring.
Literally everything in IT runs on decades old principles and technologies. The world simply refuses to fix things because of the "if it ain't broke, don't fix it" philosophy. Look at TCP, HTML, JSON, SMTP... all good tech but insanely old, outdated, and overtaxed for what it was invented for. When people joke that the entire banking industry runs on excel sheets, they are really not far from the truth. Things will be shitty until they completely break down and people are forced to fix them. Look at JavaScript, this horribly stinking steaming pile of green diarrhea that rules over the entire front-end is still being worked on and developed and billions of money and countless work-hours have been wasted in order to make it somewhat usable, instead of just coming up with entirely new tech suitable for the 21st century. This is the entire internet and tech in general.
Wait til you have a legacy system and a newer system and need to, among other things:
- Implement redirects from the old login screen to the new one
- Keep sessions in sync
- Make sure all internal and external users know how to clear cookies
- Remind everyone to update bookmarks on all devices
- Troubleshoot edge cases
Are we sure the website wasn't just broken normally? I kid, a bit, but good lord does Apple _suck_ at websites. Apple Developer and, more often, App Store Connect is broken for no good reason with zero or a confusing error message.
Note: I'm typing this on a M3 Max MBP (via a Magic Keyboard and Magic Mouse) with an iPhone 16 Pro and iPad Mini (N-1 version) on the desk next to me with an Apple Watch Series 10 on my wrist and AirPods Pro in my pocket. I'm a huge Apple fanboy, but their websites are hot garbage.
Cookies are a bit of a mess, but if you're going to use them, you can follow the standard and all will be well. Not so much a minefield, but a hammer; you just need to take some care not to hit yourself on the thumb.
I guess the confusion here is that the browser is taking on the role of the server in setting the cookie value. In doing so it should follow the same rules any server should in setting a cookie value, which don't generally allow for raw JSON (no double-quote! no comma!).
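As a sketch of "follow the rules": the check below uses the `cookie-octet` grammar from RFC 6265 section 4.1.1 (printable US-ASCII minus whitespace, double quote, comma, semicolon, and backslash). The function name is mine, and real servers and browsers are often more lenient than this.

```python
import re

# cookie-octet = %x21 / %x23-2B / %x2D-3A / %x3C-5B / %x5D-7E  (RFC 6265, 4.1.1)
_COOKIE_OCTETS = re.compile(r"[\x21\x23-\x2B\x2D-\x3A\x3C-\x5B\x5D-\x7E]*")


def is_valid_cookie_value(value: str) -> bool:
    """True if value matches the RFC 6265 cookie-value grammar."""
    # The grammar allows the whole value to be wrapped in one pair of DQUOTEs.
    if len(value) >= 2 and value[0] == value[-1] == '"':
        value = value[1:-1]
    return _COOKIE_OCTETS.fullmatch(value) is not None


print(is_valid_cookie_value('{"a":1,"b":2}'))  # raw JSON: quotes and a comma
print(is_valid_cookie_value("eyJhIjoxfQ"))     # base64url text
```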
Either use a decent higher-level API for something like this (which will take care of any necessary encoding/escaping), or learn exactly what low-level encoding/escaping is needed. Pretty much the same thing you face in nearly anything to do with information communication.
Well, we’re getting into how to choose metaphors here. Not being literal, there’s always room to stretch. Still, you try to choose a metaphor with characteristics congruent with the topic.
With a minefield, you can be doing something perfectly reasonable, with eyes open and even paying attention yet nevertheless it can blow up on you.
Here, though, there’s no special peril. If you just follow the standard everything will be fine.
If this is a minefield, then practically everything in software development is equally a minefield and the metaphor loses its power.
(Later in the article they touch on something that is a minefield — updating dependencies. There’s probably a good article about that to be written.)
It sure doesn't, that was a comment for a completely different post. I have no idea why HN posted this comment on this article instead of the PHP 8.4 article I thought I was commenting on O_o
Cookies are filled with weird gotchas and uncomfortable behavior that works 99.95% of the time. My favorite cookie minefield is cookie shadowing - if you set cookies with the same name but different key properties (domain, path, etc.) you can get multiple near-identical cookies set at once - with no ability for the backend or JS to tell which is which.
Try going to https://example.com/somepath and entering the following into the browser console:
I get
At work, whoever designed our setup put the staging and dev environments on the same domain and the entire massive company has adopted this pattern.
What a colossal mistake.
Yep. Even within the prod environment it's ideal to have a separate domain (as defined by the Public Suffix List) for sketchy stuff like files uploaded by users. Eliminates a whole class of security issues and general fuckery
For the juniors reading this, here's what you do:
Buy a second domain, ideally using the same TLD as your production domain (some firewalls and filters will be prejudiced against specific TLDs). Mimic the subdomains exactly as they are in production for staging/dev.
Just use subdomains such as *.dev.example.com, *.test.example.com, *.prod.example.com, etc., no?
The reason not to do that is that dev.example.com can set cookies on example.com and other envs can see them.
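A rough Python rendering of the domain-matching rule from RFC 6265 section 5.1.3 shows why. The function name is mine, and this leaves out the Public Suffix List check that browsers apply before accepting a `Domain` attribute.

```python
import ipaddress


def domain_match(host: str, cookie_domain: str) -> bool:
    """True if a cookie with this Domain attribute would be sent to host."""
    host = host.lower()
    # A single leading dot on the Domain attribute is ignored (RFC 6265, 5.2.3).
    cookie_domain = cookie_domain.lower().removeprefix(".")
    if host == cookie_domain:
        return True
    try:
        ipaddress.ip_address(host)
        return False  # IP addresses only match when identical
    except ValueError:
        pass
    # Otherwise the Domain must be a suffix of host at a label boundary.
    return host.endswith("." + cookie_domain)


# A cookie set from dev.example.com with Domain=example.com:
print(domain_match("dev.example.com", "example.com"))
print(domain_match("prod.example.com", "example.com"))
# A cookie scoped to one environment stays there:
print(domain_match("dev.example.com", "prod.example.com"))
```

So the isolation only holds as long as nothing ever sets `Domain=example.com`, which is the consistency requirement raised in the next reply.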
That only works if you (and any third party code that might run on such a domain) are completely consistent about always specifying the domain as one of your subdomains whenever you set a cookie.
And if your marketing/SEO/business people are ok with having something like "prod" as a subdomain for all your production web pages.
Usually it's mainsite.com for the marketing site, and then app.mainsite.com for actual production, or if you have multiple it'll have the product name, like coolproduct.mainsite.com
We then have app-stg and app-canary subdomains for our test envs which can only be accessed by us (enforced via zero trust). No reason for marketing or SEO teams to care in any case.
When was the last time you saw a public website like that? prod.companyname.com websites are extremely rare especially outside tech.
The production site could be www. or something else that makes sense.
We have *.example.dev, *.example.qa, *.example.com for development, staging/qa and production. Works well and we haven't had any issues with cookies.
This works fine and is what I’ve done. But if you’re sending email from those domains or working with enterprise customers using the same TLD will be helpful.
Ah yes if you use a CNAME that would work. You know better than me.
I had the option to re-use the prod domain for non-prod a few years ago (the company's other two projects use the prod domain for all non-prod environments).
I didn't really think about cookies back then but it just felt like a generally bad idea because disastrously messing up a URL in some config or related service would be much easier.
Nah dev should probably be a separate tld so the cookies are completely isolated.
Stage, it depends - if you want stage to have production data with newer code, and are fine with the session / cookies being shared - host it on the same domain and switch whether users get stage or prod based on IP, who is logged in, and/or a cookie. That way your code doesn't have to do anything different for stage vs prod every time it looks at the request domain (or wants to set cookies).
If you want an isolated stage environment, why not just use a separate top level domain? Otherwise you are likely setting yourself up for the two interfering with each other via cookies on the TLD.
I'm sure this will be replicated in future projects because it's much easier to argue "we're already following this pattern so let's be consistent" than "this pattern is bad and let's not have two ruined projects"
Seems perfectly reasonable to me?
If you are on /somepath I'd expect to get C as is the most specific value out of all three. All the values are still returned, ordered, which to me is the best of both worlds (path-specific values + knowing the globals)
The only thing I don't like is the magic `document.cookie` setter, but alas that's nearly 30 years old.
I wonder if this explains a lot of the unusual behaviour that happens when you use multiple accounts on a website in the same browser.
btw, technically that leading dot in the domain isn't allowed and will be ignored; https://www.rfc-editor.org/rfc/rfc6265#section-4.1.2.3
... this came up recently after I tightened the validation in jshttp/cookie https://github.com/jshttp/cookie/pull/167 - since that PR the validation has been loosened again a bit, similar to the browser code mentioned in the article.
My changes were prompted by finding a bug in our code (not jshttp) where a cookie header was constructed by mashing the strings together without encoding; every so often a value would have a space and break requests. I was going to suggest using jshttp/cookie's serialize() to devs to avoid this but then realized that that didn't validate well enough to catch the bug we'd seen. I proposed a fix, and someone else then spotted that the validation was loose enough you could slip js into the _name_ field of the cookie which would be interpreted elsewhere as the _value_, providing an unusual vector for code injection.
This is one of those things where specs are still hard to parse.
It is considered invalid syntax to lead with a dot by the rules. But it also must be ignored if present. It's lacking a “MUST NOT” because the spec is defining valid syntax, while also defining behavior for back compat.
It would break too many things to throw here or serialize while ignoring the leading dot. Leading dots are discouraged, but shouldn't break anyone following the spec. Maybe a warn log in dev mode if serializing a domain with a dot, to try and educate users. Dunno if it's worth it though.
The point of jshttp IMO is to smooth over these kinds of nuances from spec updates. So devs can get output which is valid in as many browsers as possible without sacrificing functionality or time studying the tomes.
I do sympathise somewhat with that view, but I disagree. To be valid in as many browsers as possible, and as many back-end systems too, serialize() would have to take the _narrowest_ view of the spec possible. If you make cookies that stray from the spec, you cannot know if they will work as intended when they are read, you've baked in undefined behaviour. It's not just browsers; in our systems we have myriad backends that read the cookies that are set by other microservices, that could be reading them strictly and dropping the non-conformant values.
If you want to set invalid cookie headers, it's very easy to do so, I just don't think you should expect a method that says it will validate the values to do that.
The dot I can go along with because the behaviour is defined, but I'm less comfortable that a bunch of other characters got re-added a couple of days ago. As for smoothing over nuances from spec updates...the RFC has been out there for 13 years, and jshttp/cookie has only been around for 12; there have been no updates to smooth, it has just never validated to the spec.
Yep it's hella fraught. https://www.usenix.org/conference/usenixsecurity15/technical... goes into detail about this problem and related headaches
Using the path field is a code smell
Can you elaborate? I'm having a tough time finding references to that. (Disclaimer: I'm not an avid JS developer)
It means that you are setting cookies on whatever page you're on, without considering whether the cookie will be consistently accessible on other pages.
For example, you set the currency to EUR in /product/123, but when you navigate to /cart and refresh, it's back to USD. You change it again to EUR, only to realize in /cart/checkout that the USD pricing is actually better. So you try to set it back to USD, but now the cookie at /cart conflicts with the one at /cart/checkout because each page has its own cookie.
If you want cookies to be global, set them to / or leave out the path. If you want more fine-grained cookies, use a specific path. What's the problem? Currency is—in your example—clearly a site-wide setting. I think sites should make more of their hierarchical structure, not less.
If you leave out the path, it will default to the directory of the current URL, not /.
If not for this default behavior, it would have been much easier to manage global settings such as currency. Right now, all it takes is one cookie without a path to introduce inconsistency, only on some pages, in a way that's hard to reproduce.
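The default-path rule in question is short enough to write out. This is a Python transcription of the algorithm in RFC 6265 section 5.1.4, with a function name of my own choosing.

```python
def default_path(uri_path: str) -> str:
    """Path a cookie gets when Set-Cookie has no Path attribute (RFC 6265, 5.1.4)."""
    if not uri_path.startswith("/"):
        return "/"
    if uri_path.count("/") == 1:
        return "/"
    # Everything up to, but not including, the right-most "/".
    return uri_path[:uri_path.rfind("/")]


for page in ("/product/123", "/cart", "/cart/checkout"):
    print(page, "->", default_path(page))
```

So a cookie set without a path on /product/123 is scoped to /product and never reaches /cart, while the same code running on /cart produces a site-wide cookie.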
Isn't that just the feature working as intended? Of course it is possible to introduce a bug by setting or not setting a cookie somewhere where it should/shouldn't be set.
I've never found a use for path-based cookies personally, but I'm not sure this is a particularly compelling example.
The typical example of a path-based cookie is the "remember my login name" feature, where you want the cookie with the user name only available on the login page. (And you cannot use session storage because you want it to work whilst logged out.)
You don't need to store multiple login names for separate pages though, so why can't this just be a site wide cookie?
That would include the cookie with each request, which is inefficient. And potentially it also can get sent with requests to other subdomains, which may not be desirable from a security point of view (it could be cdn.example.com, owned by someone else)
For modern applications you’ll have better ways to maintain state. As shown they cause trouble in practice. Cookies should be used sparingly.
If you want to maintain state across navigations and share that state with a server it’s the best we’ve got.
Server can store session state
Server side session state for more than authentication is way worse than "code smell."
It requires a ping to a shared data source on every request. And, the same one for all of them. No sharding, No split domains... That gets expensive fast!
You just described how the whole web operates. It works just fine.
Even if you want client side, we have better ways now than cookies.
We do, but only cookies are universally available. Plenty of unusual user-agents in the world, or people like me that browse with JS off by default.
I add some products in phone. Then I login to desktop later for modification and order. Cart is empty. That's engineering smell. A really bad one.
That's nothing more than UX/UI.
> In computer programming, a code smell is any characteristic in the source code of a program that possibly indicates a deeper problem. Determining what is and is not a code smell is subjective, and varies by language, developer, and development methodology.
- https://en.wikipedia.org/wiki/Code_smell
I am using path to wire my http only cookies to be sent only to /api not in assets/html requests. The cookie will eventually contain a JWT token I do use as an access token. Consequently I will probably wire my refresh cookie only to be sent to /api/refresh-token and not in other requests.
The client won't get to decide which cookie to send where.
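For reference, the rule the browser applies when deciding whether a path-scoped cookie goes along with a request is the path-match algorithm from RFC 6265 section 5.1.4. A Python sketch, where the function name and the example paths other than /api and /api/refresh-token are mine:

```python
def path_match(request_path: str, cookie_path: str) -> bool:
    """True if a cookie with cookie_path is sent on a request for request_path."""
    if request_path == cookie_path:
        return True
    if not request_path.startswith(cookie_path):
        return False
    # A prefix only counts at a "/" boundary, so /api does not match /apiary.
    return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"


print(path_match("/api/users", "/api"))               # access-token cookie is sent
print(path_match("/assets/app.js", "/api"))           # not sent with asset requests
print(path_match("/api/users", "/api/refresh-token")) # refresh cookie stays home
```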
Looks like a good pattern to me.
Yeah, isn’t that how you represent a list of values? (Or maybe better to say a collection, not sure if ordering is preserved)
But if the attributes are exactly the same then the cookies replace each other. So this isn't a general mechanism for representing a list.
Not to mention that the way to delete a cookie is sending a replacement cookie that expires in the past. How are you supposed to delete the right cookie here?
And the worst is that you need to exactly match the domain and path semantics in order to delete the cookie! Domain is easy enough because there are only two options - available to subdomain and not available to subdomain. But if you have a cookie with the `/path` set and you don't know what value was used, you literally cannot delete that cookie from JS or the backend. You need to either pop open devtools and look at the path or ask the end user to clear all cookies.
Is there a way for JS to see the attributes for each value? Because presumably setting an expire time in the past and iterating over every used set of attributes would get the job done to delete the cookie. Iterating over all possible (plausible?) attributes may also work, but knowing the specific attributes set would narrow that list of erasing writes to issue.
No, there isn't. All you get is a list of values that are valid for the current page. Same on the server side.
If you're ever in a situation where you need to invalidate all possible instances of a cookie, it's easier to just use a different name.
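If renaming isn't an option, the brute-force approach floated above might look like the Python sketch below: emit one expiring `Set-Cookie` value per candidate path. The function name is mine, and this is deliberately incomplete: it only tries segment prefixes of the current request path, and it ignores `Domain` variants and trailing-slash paths such as `/cart/`, which are distinct cookies.

```python
def expiry_headers(name: str, request_path: str) -> list[str]:
    """Set-Cookie values that expire `name` at every segment prefix of request_path."""
    segments = [s for s in request_path.split("/") if s]
    paths = ["/"]
    for i in range(1, len(segments) + 1):
        paths.append("/" + "/".join(segments[:i]))
    return [
        f"{name}=; Path={path}; Expires=Thu, 01 Jan 1970 00:00:00 GMT"
        for path in paths
    ]


for header in expiry_headers("currency", "/cart/checkout"):
    print(header)
```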
The article mentions Rust's approach, but note that (unlike the other mentioned languages) Rust doesn't ship any cookie handling facilities in the standard library, so it's actually looking at the behavior of the third-party "cookie" crate (which includes the option to percent-encode as Ruby does): https://docs.rs/cookie/0.18.1/cookie/
Thanks for pointing that out -- I've updated the article and given you credit down at the bottom. Let me know if you'd prefer something other than "kibwen."
De facto standardization by snapping up good names early!
Not really. A lot of essential third party Rust crates and projects have "weird" names, eg. "nom", "tokio", etc. You can see that from the list of most downloaded crates [1].
This one just happens to have been owned and maintained by core Rust folks and used in a lot of larger libraries. This is more the exception than the rule.
It's a given that you should do due diligence on crates and not just use the first name that matches your use case. There's a lot of crate name squatting and abandonware.
Rust crates need namespacing to avoid this and similar problems going forward.
[1] https://crates.io/crates?sort=downloads
A sibling comment talked about “UwU names”. Not sure exactly if they are referring to “tokio” or something else. But if it’s tokio, they might find this informative:
> I enjoyed visiting Tokio (Tokyo) the city and I liked the "io" suffix and how it plays w/ Mio as well. I don't know... naming is hard so I didn't spend too much time thinking about it.
https://www.reddit.com/r/rust/comments/d3ld9z/comment/f03lnm...
From the original release announcement of tokio on r/rust on Reddit.
And also to the sibling commenter, if tokio is a problematic name to you:
Would either of the following names be equally problematic or not?
- Chicago. Code name for Windows 95, and also the name of a city in the USA. https://en.wikipedia.org/wiki/Development_of_Windows_95 https://en.wikipedia.org/wiki/Chicago
- Oslo. Name of a team working on OpenStack, and also appears in their package names. Oslo is the capital of Norway. https://wiki.openstack.org/wiki/Oslo https://en.wikipedia.org/wiki/Oslo
If yes, why? If no, also why?
Just want to point out that location names are used for codenames because they cannot be trademarked
Big tech uses them instead of wasting legal time and money having to clear a new name that's temporary or non-public.
Changing the name to Tokio removes this benefit and still leaves it disconnected from its purpose.
The name of the city is 東京 -- anything in Latin characters is a rough transliteration. Tokio was the common spelling in European texts until some time last century, and is still used regularly in continental Europe.
see also, e.g. Tokio Hotel
A reference to Tokio Hotel was not on my HN bingo card
This is the first time Tokio Hotel has been mentioned on HN in over ten years.
https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...
That has me thinking of Neutral Milk Hotel. Totally different vibes.
Thank you for doing the background work. Wild they've never been mentioned before. And this time, not in relation to their music...
Tokio is a different (masculine) name in Japanese, pronounced quite differently. /tokʲio/ vs. /to̞ːkʲo̞ː/.
https://en.m.wikipedia.org/wiki/Tokio_(given_name)
We are talking about the spelling centuries ago, when the romanisation was less standardised
> location names are used for codenames because they cannot be trademarked
I don't think that's the case. Amazon, Nokia as some counterexamples.
> Rust crates need namespacing to avoid this and similar problems going forward.
It hasn't been implemented despite crowd demanding it on HN for years because it won't solve the problem (namespace squatting is going to replace name squatting and tada! you're back to square one with an extra step).
I do agree that people will assume xyz/xyz is more authoritative than some-org/xyz, but I think there is benefit to knowing that everything under xyz/* has a single owner. The current approach is to name companion crates like xyz_abc but someone else could come along with xyz_def and it's not immediately obvious that xyz_abc has the same owner as xyz but xyz_def does not.
This is a completely different topic though, and I think there's interest in shipping something like that.
That's the main problem with “just add namespace FFS” discussions that come every other week: everyone has their own vision of what namespaces should look like and what they are meant for, but nobody has ever taken the time to write an RFC with supporting arguments. In fact, people bring this up mostly in ways that are related to name squatting (like right here) even though that's not a problem namespaces can solve in the first place. It's magical thinking at its finest.
> nobody has ever taken the time to write an RFC with supporting arguments.
https://rust-lang.github.io/rfcs/3243-packages-as-optional-n...
https://github.com/rust-lang/rfcs/pull/3243
Exactly, this isn't about “default namespace”, this is the other feature which I said had support (didn't know the RFC had been merged though, thanks for pointing that out).
This isn't the kind of namespace people say they want to prevent squatting.
Solved the problem almost completely in npm. Sure you can't search for a name of a company or a project and expect it to be related to the company or project. But there's no way to solve that.
But once you know a namespace is owned by a company or project, you can know that everything under it is legit. Which solves the vast majority of squatting and impersonation problems.
Also you know that everything under "node" for example is part of the language.
> Sure you can't search for a name of a company or a project and expect it to be related to the company or project. But there's no way to solve that.
There's a way to solve it partially: you can have a special part of your namespace tied to domains and require that eg com.google.some-package be signed by a certificate that can also sign some-package.google.com
Of course, there's no guarantee that https://company.com belongs to the company, but the public has already developed ways of coping with that.
(I specifically suggest doing that only to part of your namespace, because you still want people to be able to upload packages without having to register a domain first.)
That just makes package names harder to remember and type (and actually less secure, as they're more prone to typosquatting and backdoors in seemingly harmless pull requests) for no benefit.
Keep in mind that the majority of packages by far don't come from companies in the first place, and requiring individual developers to have a domain of their own isn't particularly welcoming.
It's going to be tons of complexity for zero actual benefit.
> [...], and requiring individual developers to have a domain of their own isn't particularly welcoming.
Try reading my comment.
I specifically said that this shouldn't be required, and would only apply to one part of the namespace.
One wonders if Bluesky's approach to usernames might one day inspire a future package manager in this direction: a GUID that is then aliased to a friendly (sub)domain through proof of ownership, with a default fallback domain for those without a domain (i.e. mypkg.crates.io vs mypkg.philpax.me)
There are problems it does solve though. It’s incomprehensible that we get so many new package managers that fail to learn from the bajillion that came before.
It actually learned and that's what makes cargo as good as it is (arguably the best of all that came before, and a source of inspiration for the ones that came after).
But its authors rightly concluded that it's useless to expect to prevent name squatting by any technical means!
Why not do it like go does and use the git hosting domain as a prefix (like github.com/org/project)?
It doesn't have to be git either - a few version control systems are supported. See https://go.dev/ref/mod#vcs
And it doesn't have to be the direct domain+path of the repository, it can be some URL where you put a metadata file that points to the source repo.
I recall in the Elm community there was a lot of hoopla around the package system aligning too much with a single repo provider (github) so that might be one disincentive there.
How does it prevent squatting in any way?
At least it makes it easy to see the difference between std / official packages (not prefixed) and others.
It doesn't apply to Rust then because std doesn't need to appear in the Cargo.toml file in the first place.
php deals with this by using the username/organization name of a repository as the namespace name of packages. At least then you're having to squat something further up the food chain.
Would “rookie” be the obvious name in that case?
Did anyone else notice that the HTTP protocol embeds within it ten-thousand different protocols? Browsers and web servers both "add-on" a ton of functionality, which all have specifications and de-facto specifications, and all of it is delivered through the umbrella of basically one generic "HTTP" protocol. You can't have the client specify what version of these ten-thousand non-specifications it is compatible with, and the server can't either. We can't upgrade the "specs" because none of the rest of the clients will understand, and there won't be backwards-compatibility. So we just have this morass of random shit that nobody can agree on and can't fix. And there is no planned obsolescence, so we have to carry forward whatever bad decisions we made in the past.
This is also the fault of shit-tastic middleware boxes which block any protocol they don't understand-- because, hey, it's "more secure" to default-fail, right?-- so every new type of application traffic until the end of time has to be tunneled over HTTP if it wants to work over the real Internet.
> middleware boxes which block any protocol they don't understand-- because, hey, it's "more secure" to default-fail, right?
If the intent is to secure something then failing-open will indeed be at odds with that goal. I suspect you’re not implying otherwise, but rather expressing frustration that such providers simply can’t be bothered to put in the work and use security as an excuse.
Tbh I’ve made peace with this world and I might even enjoy it more than the planned obsolescence one.
That was the model that Microsoft used at the height of their power and dominance in the 1990s and 2000s.
Anarchy is the price to pay for not having a monopoly dictate a nice clean spec which they can force-deprecate whenever they want.
> a monopoly dictate a nice clean spec which they can force-deprecate whenever they want
We already have that at times. Apple forced a change to cert expiration that nobody else wanted, but everyone had to pick up as a result. Google regularly forces new specs, and then decides "actually we don't like it now" and deprecates them, which others then have to do as well. Virtually all of the web today is defined by the 3 major browser vendors.
If all these "specs" actually had versions, and our clients and servers had ways to just serve features as requested, then we could have 20 million features and versions, but nothing would break.
Example with cookies: if the cookie spec itself was versioned, then the server could advertise the old version spec, clients could ask for it, and the server could serve it. Then later if there's a new version of the spec, again, new clients could ask for it, and the server could serve it. So both old and new clients get the latest feature they support. You don't have to worry about backwards compatibility because both client and server can pin versions of specs and pick and choose. You can make radical changes to specs to fix persistent issues (without worrying about backwards compatibility) while simultaneously not breaking old stuff.
But we can't do that, because "cookies" aren't their own spec with their own versions, and clients and servers have no way of requesting or advertising versions of specs/sub-specs.
You could actually implement this on top of HTTP/1.1:
Do that for every weird "feature" delivered over HTTP, and suddenly we can both have backwards-compatibility and new features, and everything is supported, and we can move forward without breaking or obsoleting things.
This actually makes sense from a programmatic standpoint, because "Cookies" are implemented as their own "Spec" in a program anyway, as a class that has to handle every version of how cookies work. So you might as well make it explicit, and have 5 different classes (one per version), and make a new object from the class matching the version you want. This way you have less convoluted logic and don't have regressions when you change things for a new version.
There's no difference between a monopoly and an open standard when it comes to breaking users. They both would rather not, for the same reasons.
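To make the "one class per version" idea concrete, here is a rough sketch. Everything in it is hypothetical: there is no versioned cookie spec, the version numbers and parsing rules are invented for illustration, and how the client would advertise its versions (some header) is left out entirely.

```python
# Hypothetical sketch: one parser class per imaginary cookie "spec version".
# None of these versions or rules exist in any real standard.

class CookiesV1:
    """Imaginary lenient version: split on ';' and '=' and accept whatever is there."""
    version = 1

    def parse(self, header: str) -> dict[str, str]:
        pairs = {}
        for part in header.split(";"):
            name, sep, value = part.strip().partition("=")
            if sep:
                pairs[name] = value
        return pairs


class CookiesV2:
    """Imaginary strict version: reject anything outside a small safe alphabet."""
    version = 2
    _allowed = set(
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-_"
    )

    def parse(self, header: str) -> dict[str, str]:
        pairs = {}
        for part in header.split(";"):
            name, sep, value = part.strip().partition("=")
            if not sep or not name or not set(name + value) <= self._allowed:
                raise ValueError(f"malformed cookie pair: {part!r}")
            pairs[name] = value
        return pairs


# Versions this (imaginary) server can speak.
SUPPORTED = {cls.version: cls for cls in (CookiesV1, CookiesV2)}


def negotiate(client_versions: list[int]):
    """Pick the newest version both sides support, or None if there is no overlap."""
    common = set(client_versions) & set(SUPPORTED)
    return SUPPORTED[max(common)]() if common else None
```

An old client that only advertises version 1 keeps getting the lenient parser, while a newer client gets the strict one, and neither class needs to know about the other's quirks.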
About 10 years ago I implemented cookie based sessions for a project I was working on. I had a terrible time debugging why auth was working in Safari but not Chrome (or vice-versa, can't remember). Turned out that one of the browsers just wouldn't set cookies if they didn't have the right format, and I wasn't doing anything particularly weird, it was a difference of '-' vs '_' if I recall correctly.
IIRC there is (or was?) a difference in case-sensitivity between Safari and Chrome, maybe with the Set-Cookie header? I've run into something before which stopped me from using camelCase as cookie keys.
Can't seem to find the exact issue from googling it.
I got the impression that almost as soon as they were introduced people thought the only sensible use of cookies is to set an opaque token so the server can recognize the client when it sees it again, and store everything else server side.
I don't understand why it's a problem that the client (in principle) can handle values that the server will never send. Just don't send them, and you don't have to worry about perplexing riddles like "but what would happen if I did?"
Cookies are an antiquated technology. One of the first introduced while the web was still young in the 90s, and they have had a few iterations of bad ideas.
They are the only place to store opaque tokens, so you gotta use them for auth.
They are not the only place to store tokens. You can store tokens with localStorage for a JS-heavy website; in fact plenty of websites do that. It's not as secure, but acceptable. Another alternative is to "store" the token in the URL, which was widely used in Java for some reason (the jsessionid parameter).
To expand on the "not as secure" comment: local storage is accessible to every JS that runs in the context of the page. This includes anything loaded into the page via <script src=""/> like tracking or cookie consent services.
And I feel like it's important to expand on the fact that Cookies are visible to JS by default as well, except if the Cookie has the `HttpOnly` attribute set. Obviously, for auth, you absolutely want the session cookie to have both the `Secure` and `HttpOnly` attributes.
See https://developer.mozilla.org/en-US/docs/Web/HTTP/Cookies#bl...
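For what it's worth, here is a minimal sketch of building such a header with Python's standard `http.cookies` module. The cookie name `session` and the helper `session_cookie_header` are just illustrative choices, not anything from the thread.

```python
from http.cookies import SimpleCookie
import secrets


def session_cookie_header(token: str) -> str:
    """Build the value of a Set-Cookie header for a session token."""
    cookie = SimpleCookie()
    cookie["session"] = token               # "session" is an arbitrary name
    cookie["session"]["secure"] = True      # only sent over HTTPS
    cookie["session"]["httponly"] = True    # hidden from document.cookie
    cookie["session"]["path"] = "/"
    return cookie["session"].OutputString()


# Example: an opaque random token as the cookie value.
header_value = session_cookie_header(secrets.token_urlsafe(32))
```

The resulting string is what goes after `Set-Cookie:` in the response; page scripts cannot read the cookie, but the browser still attaches it to requests.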
Cookie header parsing is a shitshow. The "standards" don't represent what actually exists in the wild, each back-end server and/or library and/or framework accepts something different, and browsers do something else yet.
If you are in complete control of front-end and back-end it's not a big problem, but as soon as you have to get different stuff to interoperate it gets very stupid very fast.
Cookies seem to be a big complicated mess, and meanwhile are almost impossible to change for backwards-compatibility reasons. Is this a case to create a new separate mechanism? For example a NewCookie mechanism could be specified instead, and redesigned from the ground-up to work consistently. It could have all the modern security measures built-in, a stricter specification, proper support for unicode, etc.
It's funny that you mention NewCookie, there is actually a deprecated Set-Cookie2 header already: https://stackoverflow.com/q/9462180/3474615
Imagine pwning a frontend server or proxy, spawning an http/s server on another port, and being able to intercept all cookies and sessions of all users, even when you couldn't pwn the (fortified) database.
This could have a huge advantage, because if you leave the original service untouched on port 80/443, there is no alert popping up on the defending blueteam side.
This gives me an idea for a project...
NewCookie is, roughly, what browser Local Storage is.
At least for some use cases. Of course, it doesn't directly integrate with headers.
I think one important use case we have for cookies is "Secure; HttpOnly" cookies. Making a token totally inaccessible from JS, but still letting the client handle the session is a use case that localStorage can't help with. (Even if there's a lot of JWTs in localStorage out there.)
However, potentially a localStorage (and sessionStorage!) compatible cookie-replacement api might allow for annotating keys with secure and/or HttpOnly bits? Keeping cookies and localStorage in sync is a hassle anyhow when necessary, so having the apis align a little better would be nice. Not to mention that that would have the advantage of partially heading off an inevitable criticism - that users don't want yet another tracking mechanism. After all, we already have localStorage and sessionStorage, and they're server-readable too now, just indirectly.
On the other hand; the size constraints on storage will be less severe than those on tags in each http request, so perhaps this is being overly clever with risks of accidentally huge payloads suddenly being sent along with each request.
I think if I were implementing a webapp from scratch today I'd use one single Session ID cookie, store sessions in Redis (etc) indefinitely (they really aren't that big), and for things meant to be stored/accessed on the frontend (e.g. "has dismissed some dumb popup") just use local storage. Dealing with anything to do with cookies is indeed incredibly painful.
> and they're server-readable too now, just indirectly.
Could you point me to more reading about this? It's the first time I've heard of it
I think they mean that you can always send back the content of a localstorage property with javascript grabbing the value and sending another request back with it in the body. Since the front end is going to run any javascript the server sends it (disregarding adblockers at least), it's sort of a more indirect version of Set-Cookie.
Yeah, that's what I meant. There's no built in support; but it's indirectly readable since client-side JS can read it.
This misses the "HttpOnly" part, which prevents javascript (think script injection vulnerability) from touching this part of the storage
i think the main problem there is that cookies are so intractably tied up with tracking, any attempt to create better cookies now will get shut down by privacy advocates who simply don't want the whole concept to exist.
we're stuck with cookies because they exist.
Every privacy advocate I know hands over exquisitely detailed private and personal information to Google and/or Apple. It seems unfair to generalize as “privacy advocates” so much as it is people who are anti-ads.
Being anti-ads is a valid opinion. It has less intellectual cover than pro “privacy” though.
The DOM & URL are the safest places to store client-side state. This doesn't cover all use cases, but it does cover the space of clicking pre-authorized links in emails, etc.
I spent a solid month chasing ghosts around iOS Safari arbitrarily eating cookies from domains controlled by our customers. I've never seen Google/Twitter/Facebook/etc domains lose session state like this.
Safari is a lot more strict about cookies than Chromium or Firefox, it will straight up drop or ignore (or, occasionally, truncate) cookies that the other two will happily accept.
I had hoped when writing this article that Google would look at Safari, see that it was always strict about this, and feel comfortable about changing to be the same. But doing so now would unfortunately break too many things for too many users.
If I open a second window or tab I expect when I go to 'myemail.com' that it knows who I am and shows me my account even though the url in the 2nd tab doesn't have any extra info in the URL
Needs a better name than NewCookie though. Suggestions include SuperCookie, UltraCookie or BetterCookie
Or to be slightly more serious avoid calling it a cookie and call it something else. Too much baggage surrounding the word cookie.
Definitely don't use "SuperCookie" as that's a thing: https://en.wikipedia.org/wiki/HTTP_cookie#Supercookie
His Majesty's English might suggest "biscuit".
Limp Biscuit it is then.
Everyone will surely be rushing to be the first one to disseminate this new technology!
What do the British call biscuits?
https://chefjar.com/wp-content/uploads/2021/05/popeyes-biscu...
The closest thing is https://en.wikipedia.org/wiki/Scone.
I've had something (in the US) that was called a "scone", and it was rigid, which disqualifies it from being similar to a biscuit in my mind.
Is that generally true of scones?
The thing the US call scones is different from the thing the UK calls "scone".
A Dookie is a digested Cookie.
Muffin? Cake?
You graduate from consuming cookies to eating...
TrickOrTreat would seem appropriate.
the new thing should be called "cupcakes" or "candies" or "snacks" or "munchies"
We already have Macaroons
https://en.wikipedia.org/wiki/Macaroons_(computer_science)
https://en.wikipedia.org/wiki/Macaroon
That feels like that XKCD comic about now there being 15 standards.
https://xkcd.com/927/
Cookies need to die. Their only legitimate use is auth, for which we have the Authorization header. Having a standard way to authenticate into a website in a browser would be amazing, just too bad that Basic and Digest auth weren't good enough at the time.
As a bonus we could get Persona-style passwordless future.
How about user preferences without logging in? Are you suggesting creating a trillion throwaway accounts?
What about things like local storage?
If you want to store language preferences then that means you only know client side and you can't serve html in their language
...example.com/en/ or example.com/es/
The url can store state just fine...
Why are first-party cookies bad?
They are not bad, they are just unnecessary. If your application uses local state, use local storage. If you store session data on the server, identify the user using the Authorization header. Why send arbitrary strings back and forth, often with requests that don't need them? Plus the technology is clearly rotten. They never got namespacing and expiration right so you can just do weird stuff with them. Also, CSRF wouldn't be a thing if cookies weren't. This is like saying “why is finger/gopher/etc. bad?” They are not exactly bad but they are obsolete.
> if you store session data on the server, identify the user using the Authorization header.
And by what miracle would the browser send the Authorization header? Who sets it? For which domain could it be set?
Take a look at how basic auth is implemented in browsers today. Now imagine expanding it to (a) provide a much nicer and somewhat customizable UI for entering your credentials and (b) using proper encryption.
What about redirects from other sites, should Authorization behave like cookies? My point is cookies are ok for auth, and you would basically have to reinvent the same things with another header.
That header was invented for this exact purpose before cookies were invented. It has wide browser support and semantics that make sense. Moreover, the design specifically includes provisions for additional auth mechanisms (basic and digest being the two most widely used). The downside was that the UI for setting that header was ugly.
Your comments remind me of the people who didn’t get HTTP verbs and wanted to use POST for everything before rediscovering REST.
> and semantics that make sense
A paradise for CSRF.
> Your comments remind me of the people who didn’t get HTTP verbs and wanted to use POST for everything before rediscovering REST.
REST is not about HTTP methods if you read the paper. It's curious you have a direct map between HTTP methods and REST verbs as your mental model.
How would you use the Authorization header to implement server side session data?
Not a web dev. So do I understand it correctly that it's not so much the server side of this that's the issue, after all the Authorization header contains a nice token, but rather how to safely store the token client side without using cookies?
I think they mean storing an identifier in local or session storage and then sending it in the header.
An identifier in local storage could be stolen by 3rd party JavaScript. Anybody who wants to use local storage for sensitive information should read why there is an HttpOnly cookie attribute.
If you are running third party JS on your site they can just make requests to your server now. Once JS is loaded it is running in the context of your domain. No they can’t do it once the user closes the browser but third party JS is XSS in action.
And I am not suggesting using local storage for it. I am suggesting adding browser support for standard/generic login UI. Basically think basic auth, just not so basic.
> Basically think basic auth, just not so basic
It's like technobros trying to invent an inferior train with each pod iteration.
It doesn't work with basic multi page sites though.
Oh right, strictly for spas.
Author started with throwing the results of JSON.stringify into a cookie, and I was surprised that his issue wasn't just that someone had thrown a semicolon into the JSON that was being stringified.
Most of the headaches around cookies seem to be around people trying to get them to work with arbitrary user input. Don't do that. Stick with fixed-length alphanumeric ASCII strings (the kind you use for auth tokens) and you'll be fine.
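A minimal sketch of that advice, assuming a 32-character token (the length, the names `new_token` and `is_safe_cookie_value`, and the exact alphabet are arbitrary choices made for this example):

```python
import re
import secrets
import string

ALPHABET = string.ascii_letters + string.digits
TOKEN_LENGTH = 32  # arbitrary length chosen for this sketch
_TOKEN_RE = re.compile(rf"[A-Za-z0-9]{{{TOKEN_LENGTH}}}")


def new_token() -> str:
    """Generate a random fixed-length alphanumeric ASCII token."""
    return "".join(secrets.choice(ALPHABET) for _ in range(TOKEN_LENGTH))


def is_safe_cookie_value(value: str) -> bool:
    """Accept only values that look exactly like one of our tokens."""
    return _TOKEN_RE.fullmatch(value) is not None
```

Anything that fails `is_safe_cookie_value` (raw JSON, semicolons, non-ASCII text) never gets near a Set-Cookie header, and the actual data lives server-side keyed by the token.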
Re Safari’s networking code being closed source, a good substitute might be the Swift port of Foundation. You can see checks for control and delete characters here: https://github.com/swiftlang/swift-corelibs-foundation/blob/...
That is a bit of a minefield, I agree…
The way around this, as a developer, is URL-safe-base64 encode the value. Then you have a bytes primitive & you can use whatever inner representation your heart desires. But the article does also note that you're not 100% in control, either. (Nor should you be, it is a user agent, after all.)
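Roughly like this in Python, as a sketch. The helper names are made up, and stripping the `=` padding is one option among several, since `=` is tolerated in cookie values by most parsers anyway.

```python
import base64
import json


def encode_cookie_value(data) -> str:
    """Serialize to JSON, then to unpadded base64 with the URL-safe alphabet."""
    raw = json.dumps(data, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def decode_cookie_value(value: str):
    """Reverse encode_cookie_value, restoring the stripped padding first."""
    padded = value + "=" * (-len(value) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


# Semicolons, commas, quotes and non-ASCII all survive the round trip.
example = encode_cookie_value({"name": "a;b, c", "city": "Zürich"})
```

The encoded value only ever contains letters, digits, `-` and `_`, so none of the header-parsing edge cases from the article come into play. It does grow the payload by about a third, which matters given header size limits.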
I do wish more UAs opted for "obey the standard" over "bytes and a prayer on the wire". Those 400 responses in the screenshots … they're a conforming response. This would have been better if headers had been either UTF-8 from the start (but there are causality problems with that) or ASCII and then permitted to be UTF-8 later (but that could still cause issues since you're making values that were illegal, legal).
> URL-safe-base64
And make sure to specify what exactly you mean by that. base64url-encoding is incompatible with base64+urlencoding in ~3% of cases, which is easily missed during development, but will surely happen in production.
Isn't it a lot more than 3%? I don't think I've heard anyone say url-safe-base64 and actually mean urlencode(base64(x))
… yeah. I assume they're getting that from doing 3/64, but for uniform bytes, you're rolling that 3/64 chance every base64-output-character. (And bytes are hardly uniform, either … TFA's example input of JSON is going to skew towards that format's character set.)
oh, geez. No, just base64, using the URL safe alphabet. (The obvious 62 characters, and "-_" for the last two.)
It's called "urlsafe base64", or some variant, in the languages I work in.
> This encoding may be referred to as "base64url".
https://datatracker.ietf.org/doc/html/rfc4648#section-5
But yeah, it's not base64 followed by a urlencode. It's "just" base64-with-a-different-alphabet.
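A quick way to see the difference in Python, using three bytes picked specifically so that every output character lands on index 62 or 63 of the alphabet:

```python
import base64
from urllib.parse import quote

# 0xFB 0xFF 0xFE splits into the 6-bit groups 62, 63, 63, 62.
raw = bytes([0xFB, 0xFF, 0xFE])

standard = base64.b64encode(raw).decode("ascii")           # "+//+"
base64url = base64.urlsafe_b64encode(raw).decode("ascii")  # "-__-"
percent_encoded = quote(standard, safe="")                 # "%2B%2F%2F%2B"
```

Same bytes, three different strings. A decoder expecting one form will either reject the others or, worse, decode them to something else, so it is worth writing down which one you mean.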
Cookie value can contain `=`, `/` and `+` characters so standard base64 encoding can be used as well :)
The article mocks Postel's law, but if the setter of the cookie had been conservative in what they sent, there would have been no need for the article...
> The article mocks Postel's law
As they should. Postel's Law was a terrible idea and has created minefields all over the place.
Sometimes, those mines aren't just bugs, but create gaping security holes.
If your client is sending data that doesn't conform to spec, you have a bug, and you need to fix it. It should never be up to the server to figure out what you meant and accept it.
Following Postel's law does not mean to accept anything. The received data should still be unambiguous.
You can see that in the case where ASN.1 data needs to be exchanged. You could decide to always send it in the DER form (conservative) but accept BER (liberal). BER is still an unambiguous encoding for ASN.1 data but allows several representations for the same data.
The problem with BER mainly lies with cryptographic signatures, as a signature will only match one specific encoding, which is why DER is used in certificates. But you can still apply Postel's law: you may still accept BER fields when parsing a file. If a field has been encoded in a variant form that is incompatible with the signature, you will just reject it, as you would reject it for not being standard DER. But still, you lessen the burden of making sure all parts follow the standards in exactly the same way, and things tend to work more reliably across server/client combinations.
You could split the difference with a 397 TOLERATING response, which lets you say "okay I'll handle that for now, but here's what you were supposed to do, and I'll expect that in the future". (j/k it's an April Fool's parody)
https://pastebin.com/TPj9RwuZ
And yet the html5 syntax variation survived (with all its weird now-codified quirks), and the simpler, stricter xhtml died out. I'm not disagreeing with you; it's just that being flexible, even if it's bad for the ecosystem, is good for surviving in the ecosystem.
There was a lot of pain and suffering along the way to html5, and html5 is the logical end state of postel's law: every possible sequence of bytes is a valid html5 document with a well-defined parsing, so there is no longer any room to be more liberal in what you accept than what the standard permits (at least so far as parsing the document).
Getting slightly off topic, but I think it's hard to find the right terminology to talk about html's complexities. As you point out, it isn't really a syntax anymore now that literally every sequence is valid. Yet the parsing rules are obviously not as simple as a .* regex. It's syntactically simple, but structurally complex? What's the right term for the complexity represented by how the stack of open elements interacts with self-closing or otherwise special elements?
Anyhow, I can't say I'm thrilled that some deeply nested subtree of divs for instance might be closed by an open-button tag just because they were themselves part of a button, except when... well, lots of exceptions. It's what we have, I guess.
It's also not a (fully) solved problem; just earlier this year I had to work around an issue in the chromium html parser that caused IIRC quadratic parsing behavior in select items with many options. That's probably the most widely used parser in the world, and a really inanely simple repro. I wonder whether stuff like that would slip through as often were the parsing rules at all sane. And of course encapsulation of a document-fragment is tricky due to the context-sensitivity of the parsing rules; many valid DOM trees don't have an HTML serialization.
I agree that being liberal in what you accept can leave technical debt. But my comment was about the place in the code where they set a cookie with JSON content instead of keeping to a format that is known to pass easily through HTTP header parsing, like base64. They should have been conservative in what they sent.
The problem with Postel's law is exactly that the sender is never conservative, and will tend to use any detail that most receivers accept.
So the problem with Postel's law is that people don't follow Postel's law?
The problem is that it's a prisoner's dilemma. And you can't cooperate on a prisoner's dilemma against the entire world.
So, just be as conservative as possible when you produce data and as liberal as possible when you receive something. Your code will then require the least cooperation from *any* other code to be compatible with.
Doing otherwise will require cooperation to adjust to the specifics clients expect, and you fall into the trap of the prisoner's dilemma.
No, when you create a protocol, define exactly what you need, what is optional, and what is an error, and stick to that.
Being liberal on what you accept is a path for disaster.
You changed the problem. Postel's law is not about writing the protocol but implementing it.
Sure, a protocol should be designed to be as specific as possible, but unfortunately they are not always defined to that point, for good or bad reasons. We are generally at best on the implementation side and cannot influence the writing of the protocol, so Postel's law is the best we can apply to avoid having to cooperate with the rest of the planet.
From wikipedia:
> The principle is also known as Postel's law, after Jon Postel, who used the wording in an early specification of TCP.
I came across a similar issue when experimenting with the Crystal language. I thought it would be fun to build a simple web scraper to test it out, only to find the default HTTP client fails to parse many cookies set by the response and aborts.
IT IS a mess, but I never saw json inside a cookie. For json I use local storage or indexeddb.
In both cases (cookie vs localStorage) you're really just storing your data as a string value, not truly a JSON object, so whether you use a cookie or localStorage is more dependent on the use case.
If you only ever need the stored data on the client, localStorage is your pick. If you need to pass it back to the server with each request, cookies.
JSON is explicitly a string serialization format.
Right, I meant it's not a JavaScript object. It's serialized into a string in any case, no matter which API you're stuffing it into. So it's a bit of a non-sequitur for the parent to suggest that it's somehow weird to store JSON in a cookie, but not in localStorage. It's all just strings.
I find it weird too. I’ve always considered cookies like very stupid key value stores.
It would never occur to me to put something more than a simple token in a cookie. A username, an email address, some opaque thing.
The idea of trying to use it for arbitrary strings just seems weird to my intuition, but I don’t really know why. Maybe just because when I was learning about them long ago I don’t remember seeing that in any of the examples.
My point is that there really is no such thing as "truly a JSON object".
Combine local storage with service worker, so you pass the data to the server if needed. Completely without setting cookies.
And if I don't want any javascript to see my values, ever? Or how do you handle CSRF?
An HttpOnly cookie is the way, but then you just don't use JSON as a cookie value that is sent on every request.
CSRF is no problem as the data from the service worker is only active on the site itself. If you're talking about CSRF on a website where you can't trust js, your site is broken, as xhr/fetch use the same HttpOnly cookies and are affected as well.
Since when can you trust js?
Big websites use js and if they leak most of the time it's not a js issue.
I think the distrust of js is a personal issue.
I mean, anyone can open devtools and change the code to do whatever ... or install an extension that does it. So, since when can you guarantee that a browser client will actually do what you program it to do? In my experience, you can't guarantee anything on the client -- since forever. I was asking when/if that changed. I don't see why you would make that a personal attack?
But that's a general problem; having an HTML-only page with a form is the same problem. Only transfer what the user should see.
You need server verification for data that's important. Native programs can be changed with other programs or a hex editor.
The talk was about data that is stored in cookies as JSON, and about CSRF. (Cookies can be changed with devtools or an extension.)
CSRF is always an attack from a third party against the user; if the user extracts the data themselves, that's not a CSRF problem.
Because of this I thought you distrusted js that can get attacked by a third party, but yes, js is as easy to change as .NET or Java programs.
You're really going to hate it when you learn about JSON Web Tokens, which exist exactly to hack past this sort of problem.
Jwt is encoded and it is used for data without a server session.
I'm not a fan of JWT and it's used more often than it should be, but sometimes it makes sense.
But at least they’re base 64 encoded so you don’t have to worry about the special characters
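To illustrate the encoding point: the header and payload segments of a JWT are base64url-encoded JSON, so the token itself carries no quotes or commas, but anyone holding it can still read it. A minimal Python sketch, where the claims and the placeholder signature are made up for illustration and no real signing or verification happens:

```python
import base64
import json


def b64url_encode(raw: bytes) -> str:
    # base64url without padding, as used for JWT segments
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    # Restore the padding that was stripped, then decode
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def read_jwt_payload(token: str) -> dict:
    # Decodes the payload WITHOUT verifying the signature; inspection only
    _header, payload, _signature = token.split(".")
    return json.loads(b64url_decode(payload))


# Hypothetical token: the third segment is a placeholder, not a real signature
header = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
payload = b64url_encode(
    json.dumps({"sub": "user-1", "theme": "dark, high-contrast"}).encode()
)
token = f"{header}.{payload}.placeholder-signature"

claims = read_jwt_payload(token)
```

The comma and quotes inside the claims never appear in `token`, which is why the special-character problem goes away. The length problem does not.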
Good way to hit max header length issues. Ask me how I know.
How?
Well you see when a front end developer and a backend developer hate each other very much, they do a special hug and nine days later a 400 request header or cookie too large error is born.
(Seriously though, someone trying to implement breadcrumbs fe-only)
I'm not them, but that 419 pattern in the logs is burned into my adrenaline response: https://duckduckgo.com/?t=ffab&q=nginx+419+cookie+header&ia=...
Years ago I used chromelogger, which over time often produced an HTTP header that was too big: https://craig.is/writing/chrome-logger
Are they ubiquitous? I'm no client side guru, I know I could look at makeuseof etc, but why not ask some professionals instead.
At the very least localstorage is supported across the board
No. It is disabled in many browsers when opened in private mode, where you can still have session cookies.
>everything behaves differently, and it's a miracle that [it] work at all.
The web in a nutshell.
Browsers: what it would look like if Postel's Law were somehow made manifest in C++ and also essential to modern life
And the article isn't even about the proliferation of attributes cookies have, which browsers honor and which in some cases are simply mandatory. I was trying to explain SameSite to a coworker, and scrolled down a bit... https://developer.mozilla.org/en-US/docs/Web/HTTP/Cookies#co... wait, cookie prefixes? What the heck are those? The draft appears to date to 2016, but I've been trying to write secure cookie code for longer than that, hadn't heard of it until recently, and I can't really find when they went into browsers (because there are a lot more drafts than there are implemented drafts, and the date doesn't necessarily mean much). Replies explaining that are welcome.
Seems like every time I look at cookies they've grown a new wrinkle. They're just a nightmare to keep up with.
Well, prefixes are opt-in. You don't have to keep up with them.
The only recent large problem with cookies was the changes to avoid CSRF; those were opt-out, but they were also extremely overdue.
All of the web standards are always gaining new random features. You don't have to keep up with most of them. They do look like bad abstractions, but maybe it's just that the problem is hard.
> https://developer.mozilla.org/en-US/docs/Web/HTTP/Cookies#co... wait, cookie prefixes? What the heck are those?
https://developer.mozilla.org/en-US/docs/Web/HTTP/Cookies#co...
> For more information about cookie prefixes and the current state of browser support, see the Prefixes section of the Set-Cookie reference article.
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Se...
(Cookie prefixes have been widely supported since 2016 and more or less globally supported since 2019.)
They’re backwards-compatible, so if your cookie need meets the requirements for the `__Host-` prefix, you should use `__Host-`.
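For anyone wondering what "meets the requirements" means in practice: a `__Host-` cookie has to be set with `Secure`, with `Path=/`, and without a `Domain` attribute. A rough Python sketch of building and sanity-checking such a header value; the function names are hypothetical, and `HttpOnly`, `SameSite=Lax`, and the default `Max-Age` are my own choices here, not something the prefix requires:

```python
def host_prefixed_cookie(name: str, value: str, max_age: int = 3600) -> str:
    """Build a Set-Cookie header value that satisfies the __Host- rules."""
    parts = [
        f"__Host-{name}={value}",
        "Path=/",        # required by the prefix
        "Secure",        # required by the prefix
        "HttpOnly",      # not required, just a sensible default here
        "SameSite=Lax",  # not required, just a sensible default here
        f"Max-Age={max_age}",
    ]
    # Deliberately no Domain attribute: the prefix forbids it
    return "; ".join(parts)


def meets_host_prefix_rules(set_cookie: str) -> bool:
    """Rough check of a Set-Cookie value against the __Host- requirements."""
    first, *attrs = [part.strip() for part in set_cookie.split(";")]
    if not first.startswith("__Host-"):
        return False
    lowered = [attr.lower() for attr in attrs]
    has_secure = "secure" in lowered
    has_root_path = "path=/" in lowered
    has_domain = any(attr.startswith("domain=") for attr in lowered)
    return has_secure and has_root_path and not has_domain


header_value = host_prefixed_cookie("session", "abc123")
```

The browser also requires that the cookie is set from a secure origin, which a string check like this can't see.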
[flagged]
I was answering your question about when they went into browsers with a link, and summarizing it in a parenthetical. So much for “replies explaining that welcome”, I guess.
It's the first part of your reply they're responding to, where it looks like you've answered their rhetorical question with the exact link they used to illustrate it.
I'd guess you just screwed up your copy paste and didn't notice.
Go failing to parse HTTP headers correctly should become a meme at some point.
One issue we had was the reverse proxy inserting headers about the origin of the request for the server behind it: IP, IP city lookup, etc. Those were parsed by a service written in Go that just crashed whenever the city had a Norwegian letter in it. It took ages to understand why some of our (luckily only internal) services didn't work for coworkers working from Røros, for instance. And that was again not the fault of the Go software, but of how the stdlib handled it.
Quotes in the value when quotes delimit the value? Yeah that seems dangerous to me.
Quotes don't delimit the value.
Per the section 4.1.1 rules quoted in the article, cookie values can be optionally quoted:
> cookie-value = cookie-octet / ( DQUOTE cookie-octet DQUOTE )
That is true, but in that case they are part of the value itself, they're not doing anything special:
> Per the grammar above, the cookie-value MAY be wrapped in DQUOTE characters. Note that in this case, the initial and trailing DQUOTE characters are not stripped. They are part of the cookie-value, and will be included in Cookie header fields sent to the server.
Why does the specification specifically mention them, then?
To clarify that by the spec, double quotes are allowed in the cookie value, but only at the beginning and end.
As for why that is, I have no idea.
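The grammar quoted above is small enough to turn into a checker. A Python sketch, assuming the `cookie-octet` ranges from RFC 6265 section 4.1.1 (printable US-ASCII minus whitespace, double quote, comma, semicolon, and backslash); the function name is made up:

```python
import re

# cookie-octet = %x21 / %x23-2B / %x2D-3A / %x3C-5B / %x5D-7E
_COOKIE_OCTET = r"[\x21\x23-\x2B\x2D-\x3A\x3C-\x5B\x5D-\x7E]"

# cookie-value = *cookie-octet / ( DQUOTE *cookie-octet DQUOTE )
_COOKIE_VALUE = re.compile(rf'(?:{_COOKIE_OCTET}*|"{_COOKIE_OCTET}*")')


def is_valid_cookie_value(value: str) -> bool:
    """True if the value matches the RFC 6265 cookie-value grammar."""
    return _COOKIE_VALUE.fullmatch(value) is not None


checks = {
    "abc123": is_valid_cookie_value("abc123"),
    '"abc123"': is_valid_cookie_value('"abc123"'),  # wrapped in quotes: allowed
    'a"b': is_valid_cookie_value('a"b'),            # quote in the middle: not allowed
    '{"a":1}': is_valid_cookie_value('{"a":1}'),    # raw JSON: not allowed
}
```

Note that this only says what a server is supposed to send. What browsers and parsers actually accept is a different and much messier question.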
Ah, thanks for the clarification!
> ...tragedy of following Postel's Law.
The "law" is: "Be liberal in what you accept, and conservative in what you send."
But here the problem is caused by being liberal in what is sent while being more conservative in what is accepted. It's using invalid characters in the cookie value, which not everything can handle.
Following Postel's law would have avoided the problem.
Postel's law is the main reason why there are so many cases where something is being liberal in what it sends. It's a natural approach when trying to enter an existing ecosystem, but when the whole ecosystem follows it you get a gigantic ball of slightly different interpretations of the protocol. Something that is non-compliant but happens to work with some portion of the ecosystem won't get discovered until it's already so prevalent that everyone has to account for it, which complicates the 'real' spec and increases the likelihood that someone else messes up what they send.
I don't think you can blame Postel's law for people not following it.
> when the whole ecosystem follows it you get a gigantic ball of slightly different interpretations
You're describing the properties of a long-lived, well-used, well-supported, living system. We'd all like the ecosystems we have to interact with to be consistent and well-defined. But even more importantly, we'd like them to exist in the first place. Postel's law lets that happen.
If your app is a leaf node in the ecosystem, and it's simple enough that you have direct control over all the parts of your app (such that you can develop, test, and release updates to them on a unified plan/timeline), then, yes, fail-early pickiness helps, because the failures happen in development. Outside of that you end up with a brittle system where the first place you see many failures is in production.
One of the things I’ve always found frustrating about cookies is that you have to do your own encoding instead of the API doing it for you. I’m sure someone somewhere does but too often I’m doing my own urlencode calls.
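For reference, the do-it-yourself version usually looks something like this in Python: serialize, percent-encode everything that isn't an unreserved character, and reverse both steps on the way back. The helper names and the sample data are only illustrative:

```python
import json
from urllib.parse import quote, unquote


def encode_cookie_value(data: dict) -> str:
    # Compact JSON, then percent-encode everything outside A-Z a-z 0-9 _ . - ~
    raw = json.dumps(data, separators=(",", ":"))
    return quote(raw, safe="")


def decode_cookie_value(value: str) -> dict:
    # Reverse the two steps: percent-decode, then parse the JSON
    return json.loads(unquote(value))


prefs = {"theme": "dark", "city": "Røros", "tabs": [1, 2]}
encoded = encode_cookie_value(prefs)
decoded = decode_cookie_value(encoded)
```

The encoded value contains no quotes, commas, semicolons, spaces, or backslashes, at the cost of being noticeably longer than the raw JSON.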
Encoding is at least solvable, but every browser having their own cookie length versus some standard value makes that some nonsense. Kong actually has a plugin to split (and, of course, recombine) cookies just to work around this
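The split-and-recombine idea itself is simple. This is not how the Kong plugin is implemented, just a Python sketch of the general technique; the `name.N` / `name.count` naming scheme and the 3000-character chunk size are assumptions picked to stay under the commonly cited limit of roughly 4096 bytes per cookie:

```python
from typing import Dict, Optional


def split_cookie(name: str, value: str, chunk_size: int = 3000) -> Dict[str, str]:
    """Split one long value into numbered cookies: name.0, name.1, ..."""
    chunks = [value[i:i + chunk_size] for i in range(0, len(value), chunk_size)] or [""]
    cookies = {f"{name}.{i}": chunk for i, chunk in enumerate(chunks)}
    # Record how many pieces there are so the reader knows when one is missing
    cookies[f"{name}.count"] = str(len(chunks))
    return cookies


def join_cookie(name: str, cookies: Dict[str, str]) -> Optional[str]:
    """Recombine the chunks; return None if any piece is missing or malformed."""
    try:
        count = int(cookies[f"{name}.count"])
        return "".join(cookies[f"{name}.{i}"] for i in range(count))
    except (KeyError, ValueError):
        return None


long_value = "x" * 7000
pieces = split_cookie("sess", long_value)
restored = join_cookie("sess", pieces)
```

It works around the per-cookie limit only. The total size of the `Cookie` header still counts against whatever header limit the server or proxy enforces.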
But it's so solvable that I shouldn't have to solve it
There’s a nasty bug in the Python cookie parser: cookies after a cookie with quotes will be dropped. https://github.com/python/cpython/pull/113663
Zoom or some other website our customers use was writing a cookie with quotes that would break the site. Amazingly hard to reproduce and debug.
I got a trick.
Just don't make them, and don't accept them.
> Many languages, such as PHP, don't have native functions for parsing cookies, which makes it somewhat difficult to definitively say what it allows and does not allow.
What?
Parse a cookie with http://php.adamharvey.name/manual/en/function.http-parse-coo...
Send a cookie with https://www.php.net/manual/en/function.setcookie.php or https://www.php.net/manual/en/function.setrawcookie.php
Or if you have to check how PHP populates the $_COOKIE superglobal, I think it is somewhere in this file: https://github.com/php/php-src/blob/master/main/php_variable...
Need to alternate background color-code (61e272 / e26161 + 63b754 / b75454) tables because reading a sea of Yes and No requires too much effort.
Almost nobody uses SimpleCookie.load in Python. Flask, FastAPI, and Django have their own, more relaxed parsers, which don't break on an invalid byte.
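To give an idea of what "more relaxed" can mean, here is a sketch of a forgiving `Cookie` header parser in Python. It is not the code any of those frameworks ship, just the general approach: split on `;`, split each pair on the first `=`, and never give up on the rest of the header because one value looks odd. The function name and the first-occurrence-wins rule are my own choices:

```python
from typing import Dict


def parse_cookie_header(header: str) -> Dict[str, str]:
    """Leniently parse a Cookie request header into a name -> value dict."""
    cookies: Dict[str, str] = {}
    for pair in header.split(";"):
        name, sep, value = pair.strip().partition("=")
        name = name.strip()
        if not sep or not name:
            # Skip fragments without a name or "=" instead of failing
            continue
        # Keep the first occurrence of a name; values are taken as-is,
        # including any quotes, commas, or extra "=" characters
        cookies.setdefault(name, value.strip())
    return cookies


parsed = parse_cookie_header('a=1; prefs={"x":1,"y":2}; quoted="hi"; junk; b=2')
```

With this approach the cookie `b` after the quoted and JSON-valued ones still comes through, and the quotes stay part of the value.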
Everything about the web is a minefield. It's an exercise in "how many unnecessary layers can we put between users and their content"?
I have a solution! I just made one more framework!
What are you implicitly comparing it against?
Native desktop development.
Aka unindexable, unsearchable, violation access error mess.
Indexing doesn't really work well if cookies are involved. And if indexing doesn't work, then searching doesn't.
Web pages fail for me (javascript or whatever) far more often than I get a "violation access error" from my desktop programs.
Good luck having HN as a native app and getting a huge, diverse community around it :P
Electron: Eyes Emoji
That reminds me of the Frog and Toad story about willpower vs eating cookies. Yes, handling cookies is a mine field!
I read the collected stories with my two year old, though I made sure we skipped the scary ones with the Dark Frog. I think the cookies ending was a little over his head, but we had fun taking turns acting out Toad pulling his blankets over his head when Frog tells him it's spring.
Literally everything in IT runs on decades-old principles and technologies. The world simply refuses to fix things because of the "if it ain't broke, don't fix it" philosophy. Look at TCP, HTML, JSON, SMTP... all good tech, but insanely old, outdated, and overtaxed for what it was invented for. When people joke that the entire banking industry runs on Excel sheets, they are really not far from the truth. Things will be shitty until they completely break down and people are forced to fix them. Look at JavaScript: this horribly stinking steaming pile of green diarrhea that rules over the entire front-end is still being worked on and developed, and billions in money and countless work-hours have been wasted in order to make it somewhat usable, instead of just coming up with entirely new tech suitable for the 21st century. This is the entire internet and tech in general.
Wait til you have a legacy system and a newer system and need to, among other things:
- Implement redirects from the old login screen to the new one
- Keep sessions in sync
- Make sure all internal and external users know how to clear cookies
- Remind everyone to update bookmarks on all devices
- Troubleshoot edge cases
> Apple Support
Are we sure the website wasn't just broken normally? I kid, a bit, but good lord does Apple _suck_ at websites. Apple Developer and, more often, App Store Connect is broken for no good reason with zero or a confusing error message.
Note: I'm typing this on an M3 Max MBP (via a Magic Keyboard and Magic Mouse) with an iPhone 16 Pro and iPad Mini (N-1 version) on the desk next to me, an Apple Watch Series 10 on my wrist, and AirPods Pro in my pocket. I'm a huge Apple fanboy, but their websites are hot garbage.
But why wouldn't web pages written in ObjC be just awesome and easy to manage?!
https://en.wikipedia.org/wiki/WebObjects
I can still remember when they'd purposefully take down their store page for some godforsaken reason. The mind reels
They still do take the store page offline in the hours leading up to, and during, a keynote.
> minefield
Cookies are a bit of a mess, but if you're going to use them, you can follow the standard and all will be well. Not so much a minefield, but a hammer; you just need to take some care not to hit yourself on the thumb.
I guess the confusion here is that the browser is taking on the role of the server in setting the cookie value. In doing so it should follow the same rules any server should in setting a cookie value, which don't generally allow for raw JSON (no double-quote! no comma!).
Either use a decent higher-level API for something like this (which will take care of any necessary encoding/escaping), or learn exactly what low-level encoding/escaping is needed. Pretty much the same thing you face in nearly anything to do with information communication.
I don’t understand how that’s not a minefield, it’s easy to go astray?
Well, we’re getting into how to choose metaphors here. Not being literal, there’s always room to stretch. Still, you try to choose a metaphor with characteristics congruent with the topic.
With a minefield, you can be doing something perfectly reasonable, with eyes open and even paying attention yet nevertheless it can blow up on you.
Here, though, there’s no special peril. If you just follow the standard everything will be fine.
If this is a minefield, then practically everything in software development is equally a minefield and the metaphor loses its power.
(Later in the article they touch on something that is a minefield — updating dependencies. There’s probably a good article about that to be written.)
Probably just semantics.
> Handling cookies is a minefield
I know! You gotta let them cool down first. Learned this the hard way.
[comment intended for a different post, but too old to delete]
None of this explicitly has anything specifically to do with HTML.
It sure doesn't, that was a comment for a completely different post. I have no idea why HN posted this comment on this article instead of the PHP 8.4 article I thought I was commenting on O_o
It’s happened enough that I suspect there’s a rarely-seen race condition somewhere in the Arc code that runs HN.
Hack 99999