The Stack Exchange API used to revoke API keys sent over HTTP (and return an error message), which is my favorite way to handle this.
Or better: actually provide the API on HTTP and HTTPS if your use case allows it (i.e., non-commercial/institutional, just something for human people).
I don't think this is ever a good idea. Even for non-enterprise use cases, you wouldn't want some public hotspot to be able to inject random garbage into responses, even if not done with malicious intent.
• It allows retro computers to connect.
• It allows very low power embedded devices to connect without extra overhead.
• It's not a real security concern if you're on a private network.
> It's not a real security concern if you're on a private network.
I'm not convinced that private networks should be assumed secure by default.
It's definitely not improving security when, for a website to interact with an API that's also hosted on my private network, possibly even on the same machine, I need to set up publicly accessible DNS entries and/or host my own resolver. That and CORS make local-first anything a huge PITA.
I set up a photo-viewing web service running locally on my home computer, and now I can access my photos over HTTP from my phone when I'm out and about. Both devices are on the same Tailscale network. If I can't trust in the security of that, HTTPS isn't going to help me, and the security of viewing my photos is the least of my concerns. But sure, in other contexts (like an enterprise company), some thought should be given to what's possible when the attacker is inside the corporate VPN, since that's all too easy.
Perhaps, but the other realistic option is a self-signed cert. Since browsers refuse to implement any kind of TOFU or otherwise 'trust history', a self-signed cert is pretty much exactly equivalent to no TLS at all.
> Even for non-enterprise use cases
I don't think non-enterprise, non-machine use cases should be automatically handled, though. Attempting a client upgrade is better than not, but we should be clearer about whether our devices are acting safely, i.e. calling out the change, and in the case of local HTTP usage, reminding users to use visible, out-of-band verification methods.
Of course this only works if the default is secure, but I am glad that browsers still let me go unencrypted when I really need to; I prefer the giant warning banners...
That is an awful idea - in a post-Snowden world you encrypt all traffic, period.
Then you have the post-Jia Tan world - if there is even the slightest remote possibility, you just don't want to be exposed.
Just like washing hands after peeing, just do HTTPS and don't argue.
If that's your threat model then CA TLS is going to make things even worse for you because now the nation state pressure can be centralized and directed through the CA.
There are trade-offs, but HTTP has its place. HTTP is far easier to set up, more robust, far more decentralized, supports far more other software, and has a longer lifetime. For humans who aren't dealing with nation-state threat models, those attributes make it very attractive and useful.
We should not cargo cult the requirements of a multi-national business that does monetary transactions and sends private information onto a website run by a human person for recreation and for other humans. There is more to the web than commerce, and we should design for human persons as well as corporate persons.
Were you familiar with the Snowden revelations?
How the NSA was just putting gear in ISPs, or at internet gateway nodes, to pass all traffic through their automated tooling?
It was not about monetary or business data, and it is not a “threat model” thing. Having all traffic encrypted is a must for basic human freedom.
It is a human freedom thing. Having to get continued permission to host a visit-able website from a CA is very un-free. That's why HTTP+HTTPS is so good. The best of both worlds with the downsides of neither.
Passive massive surveillance is by definition not doing active MITM targeting.
Dogma is how religion works, not engineering. If someone doesn't believe the benefit is worth the cost, they can and should question the practice. Blind obedience to a dogma of "just do HTTPS" is not a reasonable approach.
You cannot call something dogma if you don't understand the reasons.
ISPs injecting ads into HTTP is documented and known - they can inject anything into HTTP in transit, automatically and at basically zero cost. ISPs are one side of it; all kinds of free WiFi are the other.
It is not only “NSA will get me” or only financial transactions. New exploits in browsers and systems are found on a daily basis.
So the reasoning that someone has to be of interest is not true, because scooping is cheap and automated; TLS raises the cost of simply scooping stuff up.
“Rules of thumb” form many of the fundamental tenets of engineering practice.
Using HTTPS everywhere is one such rule of thumb.
It’s just not worth expending the mental energy considering every single edge case (while likely forgetting about some) in order to try and work out whether you can cut the corner to use HTTP rather than HTTPS, when using HTTPS is so easy.
No, HTTP would expose any sensitive information.
It's just clear text.
Does HTTPS also hide the requested URL in most logging systems? You can always see the domain (api.example.com), but not the rest of the URL? The benefit being that it hides an API key if included in the URL?
Yes, it hides the URL, although sadly not the domain.
It hides the domain too, in the literal HTTP request.
What it doesn't hide is the DNS lookup for that domain. You still have to translate a hostname into an IP address.
This might be a concern for certain uses. But at least it's on another port and protocol and not directly related to the HTTP request itself.
No, HTTPS has the domain in plaintext. There is a plan to fix this (Encrypted Client Hello), but AFAIK it's not widely used yet.
Ah yes, apologies. Again, it's not strictly part of the HTTP request, but of the TLS handshake around it. And it's only in the TLS handshake as part of SNI, if supported (which it is by default).
> "Server Name Indication payload is not encrypted, thus the hostname of the server the client tries to connect to is visible to a passive eavesdropper."
https://en.wikipedia.org/wiki/Server_Name_Indication
So you're right, this is more closely tied to the HTTP request than to the DNS resolution of the hostname that I mentioned. Strictly speaking, it's not part of HTTP per se (it's part of TLS), but still, it's in the same request under the most common definition, as you say.
The benefit is that it:
1. hides any private information anywhere in the request, URL or otherwise, API key or otherwise. Maybe you're fine if someone knows you used Bing (revealed through DNS lookups), but not what query you entered (encrypted to be decryptable only by Bing servers). An API key is obviously secret but something as oft-innocuous as search queries can also be private.
2. disallows someone on the network path from injecting extra content into the page. This can be an ISP inserting ads or tracking (mobile carriers have been playing with extra HTTP headers containing an identifier for you for advertising reasons iirc) or a local Machine-in-the-Middle attack where someone is trying to attack another website you've visited that used https.
You’ve already been shouted down, but thank you for daring to suggest this. I maintain APIs and proxies for APIs for legacy devices, and will continue to suggest that some kinds of APIs remain appropriate for HTTP access. Never do your banking this way, obviously, but where is the harm in allowing older devices to access content in a read-only fashion?
Hypothetically speaking, plain HTTP transport, even for "read only" content, can be a problem if it can be manipulated in transit.
Let's take a weather service. Seems like weather information is a read-only immutable fact and should not be something that needs protection from MITM attacks. You want to reach the largest audience possible and your authoritative weather information is used throughout the world.
One day, an intermediary system is hijacked which carries your traffic, and your weather information can be rewritten in transit. Your credibility for providing outstanding data is compromised when you start serving up weather information that predicts sunny skies when a tornado watch is in effect.
Additionally, you have now leaked information related to the traffic of your users. Even if the request is just vanilla HTTP-only, an adversary can see that your users from one region are interested in the weather and can start building a map of that traffic. They also inject a javascript payload into your traffic that starts computing bitcoin hashes and you are blamed for spreading malware.
In general, HTTPS protects both your interests and those of your users, even for benign data that doesn't necessarily need to sit behind "an account" or a "web login".
> One day, an intermediary system is hijacked which carries your traffic, and your weather information can be rewritten in transit. Your credibility for providing outstanding data is compromised when you start serving up weather information that predicts sunny skies when a tornado watch is in effect.
Why would they want to do that? Is your weatherman always right?
> Additionally, you have now leaked information related to the traffic of your users. Even if the request is just vanilla HTTP-only, an adversary can see that your users from one region are interested in the weather and can start building a map of that traffic.
Ah, yes, people are interested in the weather. Wow!
Of course, they could get the same info from observing that users are connecting to the IP address of a weather API provider.
> They also inject a javascript payload into your traffic that starts computing bitcoin hashes and you are blamed for spreading malware.
Got there eventually. Crappy ISPs.
I mean, weather was just an arbitrary and silly made up example. You're reading it a bit too literally there.
> an adversary can see that your users from one region are interested in the weather and can start building a map of that traffic
I think this is the most convincing argument, but I think some data doesn't need to be confidential. The weather is perhaps more pointed, but I think for large protected binaries (either executable or inscrutable, e.g. encrypted or sig-protected archives), it's a bit moot and possibly only worse performing.
However, also remember that https does not protect all data, just the application portion - adversaries can still see, map, and measure traffic to bobthebaker.com and sallyswidgets.biz. To truly protect that information, https is the wrong protocol; you need something like Tor or similar bit mixing.
> Additionally, you have now leaked information related to the traffic of your users. Even if the request is just vanilla HTTP-only, an adversary can see that your users from one region are interested in the weather and can start building a map of that traffic.
One thing to note is that nothing about HTTPS protects against this type of attack. Assuming your API doesn't have much else going on (most services, probably), an adversary can easily see that you visited mycoolweatherapi.example regardless of whether HTTPS is being used.
What TLS protects is higher up the network layer cake.
How is this better in literally any way other than it makes things (in 2024, only very slightly) easier from an ops perspective, and panders to some nerdy fetish for simple ‘read it over wireshark’ protocols?
HTTPS-only should be the default. Plain-text information delivery protocols that can easily be MITMd are unsuitable for almost all uses.
This just feels like contrarianism.
I guess I have to respond with the same thing over and over because there are so many people saying the same thing without reading the replies.
HTTP is better because: it lasts forever without maintenance, it's easier to learn and set up with no third parties required, all software can access an HTTP website, it has low resource requirements, and HTTP+HTTPS is perfectly fine.
Whereas CA TLS only lasts a year or two without maintenance, is so complex to set up and keep running that a literal ecosystem of programs (acme, acme2, etc.) exists to hide that complexity, only software from the last few years can access CA TLS websites because of TLS version sunsetting and root cert expirations and the like, and everyone centralizing on the same CA makes it extremely easy to apply political/social pressure for censorship in a CA-TLS-only browser world. Additionally, requiring third-party participation to set up a website makes it harder for people to learn how to run their own, and it requires more compute resources.
CA TLS only feels like it should be "required" when the person can't imagine a web browser that doesn't automatically run all untrusted code sent to it, when the only types of websites the person can imagine are commercial or institutional rather than personal, and the person believes all the web should cargo cult the requirements of commerce. Personal websites involving no monetary or private-information transactions don't actually need to worry about targeted MITM, and there's no such thing as open wireless access points anymore.
I'll disagree on these grounds:
1) HTTP can be modified by a man in the middle
2) It's better to default to requests and responses being private, even if you're only using a non-commercial/institutional service.
You could say "The person chose to send requests to HTTP instead of HTTPS" and assume that the consumer of the API didn't care about privacy but, as the article points out, it's easy to typo http instead of https.
Great article! We've updated the OpenAI API to 403 on HTTP requests instead of redirecting.
$ curl http://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer 123" \
  -d '{}'
{
  "error": {
    "type": "invalid_request_error",
    "code": "http_unsupported",
    "message": "The OpenAI API is only accessible over HTTPS. Ensure the URL starts with 'https://' and not 'http://'.",
    "param": null
  }
}
If anybody is looking to copy this in a public API, please return 400 and don't misuse a standard code.
400 is usually for a malformed request. It seems like in this case the request is well formed, it's just not allowed. 403 seems reasonable if the user isn't authorized to make a request to the URL, which they aren't. Some APIs return redirects which also seems pretty reasonable.
But that also implies that some user would be authorized to make a request to the HTTP port (or that the resource does exist, which in this case it doesn’t).
IMO, 400 is more accurate, but really either could be acceptable, so long as the client is notified of the error. But, I wouldn’t automatically redirect the client. That’s what we are trying to avoid.
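Whichever status you pick, the important part is failing loudly with an explanation instead of redirecting. A minimal sketch of such a plain-HTTP listener in Node/TypeScript (the hostname and message text are illustrative, not any particular provider's implementation):

import http from "node:http";

// Plain-HTTP listener whose only job is to refuse service with a clear explanation.
// The real API is served elsewhere over HTTPS; nothing here reads request bodies.
http
  .createServer((req, res) => {
    // 400 used here; swap in 403 or 426 per your taste.
    res.writeHead(400, { "Content-Type": "application/json" });
    res.end(
      JSON.stringify({
        error: {
          code: "http_unsupported",
          message:
            "This API is only available over HTTPS. Re-send the request to " +
            `https://api.example.com${req.url ?? "/"} and rotate any credentials ` +
            "included in this request.",
        },
      })
    );
  })
  .listen(80);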
Good point.
I guess this might depend a little on the implementation. In some cases the http endpoint may exist but may only be accessible to a sidecar container via localhost. For example, if the sidecar terminates https.
404 would also work, since the resource does not exist at the http: address.
True, but 404 has trained us to look hard at the URL part, not the protocol part.
Whereas 403 or 400 are less likely to have automated built-in handling on the client side.
Well in that case really anything is a 403.
Why do you think 403 is the wrong error code? Based on the spec it seems entirely appropriate to me:
> HTTP 403 provides a distinct error case from HTTP 401; while HTTP 401 is returned when the client has not authenticated, and implies that a successful response may be returned following valid authentication, HTTP 403 is returned when the client is not permitted access to the resource despite providing authentication such as insufficient permissions of the authenticated account.
Error 403: "The server understood the request, but is refusing to authorize it." (RFC 7231)
I thought the best response from the article was the 426 Upgrade Required. That way you can throw in an Upgrade: https field in the response. It makes it immediately clear that something weird is going on. 403s are common, my first thought would be a badly-configured API key which lacks the expected permissions.
Because it's not an authorization error.
You do not throw 403 if the client is authorized to access whatever resource it's trying to.
The 426 on the sibling comment is great, though. But if you don't find an error code for your case, you don't go and redefined a well defined one. You use 400 (or 500).
Not sure how I feel about this (extremely arbitrary) distinction. 400 Bad Request maybe implies that no matter how many times you retry the request, it will never succeed. 403 Forbidden maybe implies that some external change in the authentication system could perhaps allow the same request to succeed in the future? So I guess in that lens I can see the logic, but again, seems extremely arbitrary.
Doesn’t returning a 403 on HTTP break HSTS?
https://security.stackexchange.com/questions/122441/should-h...
Doesn’t HSTS require only responding to a user via HTTPS (even for error codes)?
What about this then? When the request is made over insecure HTTP, revoke the API key used, but then send the usual redirect response for HSTS. Then, when/if the request gets repeated over HTTPS, notice the key is revoked and so respond to that one with 403.
If your goal is to waste people's time, cause them to question their sanity, and guarantee that when they finally figure out what happened they're so pissed off that, instead of teaching others how important it is to use HTTPS from the start, they talk about how awful your API is and how much they hate your company for a terrible design, then yes, this sounds like a good plan.
If it were a generic 403, sure. But if the 403 message said something to the effect of "this API key is no longer valid because it has previously been observed over insecure HTTP", then wouldn't that be fine?
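A sketch of that "revoke on sight, and say why" behaviour (the in-memory key store and helper names are hypothetical; a real system would also notify the key's owner):

import http from "node:http";

// Hypothetical key store; in practice this would be your database.
const revokedKeys = new Set<string>();
const revokeKey = (key: string) => revokedKeys.add(key);

// Any bearer token seen on the plain-HTTP listener has already travelled in
// cleartext, so revoke it and tell the caller exactly why.
http
  .createServer((req, res) => {
    const auth = req.headers.authorization ?? "";
    const key = auth.replace(/^Bearer\s+/i, "").trim();
    if (key) revokeKey(key);
    res.writeHead(403, { "Content-Type": "application/json" });
    res.end(
      JSON.stringify({
        error: {
          code: "key_revoked_insecure_transport",
          message:
            "This API key was observed over unencrypted HTTP and has been revoked. " +
            "Issue a new key and use https:// only.",
        },
      })
    );
  })
  .listen(80);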
HSTS is intended for browsers. For API clients the correct behavior (following curl's lead) is probably to never follow/make any redirects by default.
HSTS is a note to the browser to insist on TLS when hitting a website. It is sent as a header, with a timescale, regardless of http/https.
The HSTS header is only effective when it's received over HTTPS. And if it has taken effect, the client won't try to access HTTP anymore, so it won't even know what response it would have gotten from HTTP.
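Concretely, the header only means anything when it arrives over TLS, so it belongs on the HTTPS server (a sketch; cert paths and max-age are placeholders):

import https from "node:https";
import fs from "node:fs";

// Browsers that see this header over HTTPS will refuse to talk plain HTTP to the
// host until max-age expires; the same header received over HTTP is ignored.
https
  .createServer(
    {
      key: fs.readFileSync("/etc/ssl/private/api.example.com.key"),
      cert: fs.readFileSync("/etc/ssl/certs/api.example.com.pem"),
    },
    (req, res) => {
      res.setHeader("Strict-Transport-Security", "max-age=31536000; includeSubDomains");
      res.writeHead(200, { "Content-Type": "text/plain" });
      res.end("ok\n");
    }
  )
  .listen(443);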
Why not just stop listening on port 80, period?
It’s a good option, but you can’t give users a reason for the failure. They might even assume your service is broken.
I think it's fair to assume j. random user isn't typing "http://api.example.net" into their web browser.
leading www perhaps, leading api no.
You'd be surprised... generally if there's a dedicated hostname for the API, I would expect / to either display or redirect to API docs.
Also, doesn't help when you're reverse proxying /api to $API/api
I stopped listening on port 80 for everything… nobody’s complained yet! Maybe because they can’t find the service though.
Because the whole point is a mitm can compromise it, and the mitm can listen on 80 regardless if you turn yours off.
I've done that myself and have consumed many APIs from others who have done it, and I don't think it's better. Much better to get a response that tells you to use https for the API. (For browsers a redirect is also a must for UX, though our context here is APIs.)
This is better, as it allows you to immediately notice that there's an issue. However, it still allows API key exposure on the initial request.
How would the endpoint prevent that?
Not listening on port 80, such that the user gets a connection refused, would result in the client not sending the api key over the wire at all.
I personally think listening, accepting that user mistakes can expose API keys to MITMs, and returning the user-facing error is better than a "connection refused" error, but it is a tradeoff.
Thank you for sharing! I think this sort of thing is what makes HN great.
Have you rolled this out to prod yet? Did you check how many users this might affect? I can imagine some (probably amateur) apps are going to break when this hits, so some notice might be nice.
I'm not asking those questions critically, mainly wanting to facilitate a full discussion around the pros and cons (I think the pros are much stronger personally).
You may want to disable path resolution as well.
http://api.openai.com/v1/chat/completions/../bar responds with error messages about http://api.openai.com/v1/chat/bar which might suggest some path traversal vulnerability that could be exploited.
Generally an API client is not going to need .. to be resolved in a path. It should return 400 - Bad Request (deceptive routing).
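A sketch of that check, applied to the raw request target before any normalization (percent-decoding rules vary by framework, so treat this as illustrative):

import http from "node:http";

// Reject any request whose raw path contains dot segments instead of resolving them.
const hasDotSegment = (rawUrl: string): boolean =>
  rawUrl
    .split("?")[0]
    .split("/")
    .some((segment) => segment === ".." || segment === ".");

http
  .createServer((req, res) => {
    if (hasDotSegment(req.url ?? "/")) {
      res.writeHead(400, { "Content-Type": "text/plain" });
      res.end("Bad Request (deceptive routing)\n");
      return;
    }
    // ...normal routing would go here...
    res.writeHead(404);
    res.end();
  })
  .listen(8080);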
My personal website (darigo.su) doesn't have HTTPS. I just deployed it a few months ago and haven't really done much with it yet. I guess I'll have to get around to it eventually, but I find charm in small old sites that haven't implemented modern protocol stuff. My site also uses <font> and <center> tags all over the place.
Maybe I'll do some more quirky anachronisms, like only serve the site via HTTP 1.0 or something. Who knows. Since my site has very little functionality, it doesn't really matter, it's just for fun.
I see this a lot. You've just made a small non-interactive site, and there's nothing secret. So why bother with https?
Well, firstly, I'd say, make it https for fun. It's pretty simple to do, can be made automatic, and costs no money. Just exploring this route can be very illuminating.
Secondly, it prevents your site from being altered by the outside. There's a lot of equipment between your site and the client. Being HTTP allows it to be altered along the way.
For example, and the least serious: your ISP can inject adverts onto your page. Most ISPs don't do this, but quite a few do.
Second, a malicious user could inject additional text onto your pages, proclaiming your love for a political party, some racial or misogynistic slur, or whatever.
Third, you're sending a signal to all that you either don't understand the security risks, or don't care about them. This can have consequences (reputationally) if you are in, or going into, a computing career.
Making it HTTPS can be fun too!
Honestly, hard disagree. I get the push for HTTPS, but setting up HTTPS nowadays isn't fun. It's more or less 4 steps:
1. Install and run certbot (+the DNS plugin for your DNS provider).
2. Find an nginx config file on the internet that only includes the necessary ciphers to work on modern devices. Then include that config file + lines to your cert paths that certbot spits out into your nginx config.
3. Set up a generic redirect server block to send everything on port 80 to port 443.
4. Reboot nginx.
It's at least better than fiddling with openssl directly, but this isn't fun, it's busywork.
I'm not sure if you're being sarcastic here or not? I mean you listed "reboot" as a whole step... which suggests sarcasm (at least to me)...
Anyway, all the steps together take what - all of 5 minutes? - that's if you do DNS Challenges. If you do HTTP challenges it doesn't even take that.
So you're saying this 5 minutes isn't worth the effort? That it's somehow too hard?
It's worth the effort, I just wouldn't use the descriptor of it being "fun". It's busywork equivalent to pointing nginx at your PHP socket. It's not fun, you're just wrangling a config file until it does what you want and the modern browsing landscape disproportionately demands HTTPS everywhere even for sites that arguably aren't really helped by it.
The bigger issue is that a malicious interceptor could inject javascript. The site may not use javascript, but the users almost certainly have it turned on, and accessing a non-https site means that any man in the middle can inject malicious javascript.
HTTPS is about protecting the client.
E.g. China's great firewall sometimes inserts malicious Javascript that causes the client to launch a DDoS attack against Github or other sites China doesn't like.
When a user visits your site with a modern browser in default configuration they'll get an error page along the lines of:
Secure site not available
Most likely, the web site simply does not support HTTPS.
However, it’s also possible that an attacker is involved. If you continue to the web site, you should not enter any sensitive info. If you continue, HTTPS-Only mode will be turned off temporarily for the site.
[Continue to HTTP site]
(Copied from Firefox on Android.) Personally, I think that's reason enough to do it; for the price of a free Let's Encrypt cert you can deliver much better UX. You presumably want people to visit your site.
> When a user visits your site with a modern browser in default configuration they'll get an error page along the lines of:
Neither desktop firefox nor chrome seem to do this by default, at least on my Mac (actually I think I'm wrong about Firefox on desktop as well, thanks to a guestbook signer!). Maybe it's a Firefox mobile thing, rather than a modernity thing?
> for the price of a free Let's Encrypt cert you can deliver much better UX
I'm going to get around to it, I promise haha.
Btw if anyone does visit the site, please do sign my guestbook: http://darigo.su/33chan (but be warned, a MITM might intercept and alter your comment!!)
> Neither desktop firefox nor chrome seem to do this by default
they do, you probably just checked the "i'm sure this is safe" button
posted a pic on the imageboard hehe
Thanks for visiting! Receiving guestbook comments is such a delight. Does Chrome also give you this warning on desktop?
nope
An HTTP website presents an opportunity for an attacker to MITM a payload that is ultimately executed in a user’s browser. Beyond ‘getting a moustache tattoo on your finger’ quirkiness, HTTP-only websites are really inexcusable beyond some very niche cases.
not my problem!
> Beyond ‘getting a moustache tattoo on your finger’ quirkiness
In that case, seems totally worth it. I, like moustache finger tattoos, am aggressively opposed to worrying about being perceived as cool. I will just have to live with being inexcusable.
That’s fine, but rather unrelated to the article, which is about the situation that you have an API served via HTTPS, and the question of whether you should also have a redirect from HTTP to HTTPS in that case, or rather return an HTTP error.
Yeah, I should have clarified. Nothing to do with the article really, just a random thought. Sorry if too off-topic!
To me, HTTPS is worth it alone to eliminate the possibility of the ISPs of people reading my site from injecting shit (ads, trackers, etc.) into the responses I send to them.
It’s completely trivial to set up, there’s really no downside at this point.
This is the somewhat depressing but accurate answer. HTTPS doesn't mean you are communicating sensitive or private data, it means you want the client to see what you send them, and not something else.
Honestly, for public-facing read-only websites, it's perfectly fine to redirect HTTP to HTTPS. There are just too many cases where you aren't going to get everyone to put "https://" on the front of URIs when they put them in docs, flyers, etc. You're lucky if you get "http://"!
The API security thing, yes, that makes sense. Personally, I run a number of servers for small groups where the sensitive stuff is SSL only - you won't even get an error going to port 80, other than the eventual timeout. But for reasons above, I cannot just turn port 80 off, and it's perfectly safe redirecting to 443.
I can appreciate this and also run a service that is neither meant to be commonly visited (imagine a tor exit node landing page explaining what this IP address is doing, but for a different service) nor will produce or ingest sensitive information
For a personal website that people might commonly want to visit, though, consider the second point made in this other comment: https://news.ycombinator.com/item?id=40505294 (someone else mentioned this in the thread slightly sooner than me but I don't see it anymore)
The author includes a surprising response from "Provider B" to the HackerOne report.
Provider B: Reported on 2024-05-21 through their HackerOne program. Got a prompt triage response, stating that attacks requiring MITM (or physical access to a user's device) are outside the scope of the program. Sent back a response explaining that MITM or physical access was not required for sniffing. Awaiting response.
I think Provider B morally should require HTTPS, but it really surprises me that the author would say "MITM or physical access is not required for sniffing."
Is that true? Isn't HTTP sniffing an example of a MITM attack, by definition? Am I using the words "MITM" or "sniffing" differently from the author?
I'm familiar with the following attacks, all of which I'd call "MITM":
1. Public unencrypted (or weakly WEP encrypted) wifi, with clients connecting to HTTP websites. Other clients on the same wifi network can read the unencrypted HTTP packets over the air.
2. Public encrypted wifi, where the attacker controls the wifi network (or runs a proxy wifi with the same/similar SSID,) tricking the client into connecting to the attacker over non-TLS HTTP.
3. ISP-level attacks where the ISP reads the packets between you and the HTTP website.
Aren't all of these MITM attacks, or at the very least "physical access" attacks? How could anyone possibly perform HTTP sniffing without MITM or physical access??
MITM generally refers to someone who can intercept and modify traffic, i.e. they sit “in the middle”, between you and your recipient, reading/modifying/relaying traffic.
“Passive eavesdropper” is often used to describe what you talk about. Someone on an unencrypted WiFi network sniffing your traffic isn’t really “in the middle” at all, after all.
I disagree. So does Wikipedia ("where the attacker secretly relays and possibly alters the communications between two parties who believe that they are directly communicating with each other, as the attacker has inserted themselves between the two parties ... for example, an attacker within range of an Wi-Fi access point hosting a network without encryption could insert themselves as a man in the middle") and so I believe do most people.
"Active MITM" would be how you describe someone who does modify traffic.
And an attacker in each of the scenarios GP mentioned can modify traffic. (For ISP/attacker-controlled networks it's trivial; for other networks you just need to ARP spoof)
There's no "relaying" when the attacker just captures unencrypted WiFi packets from the air, or more traditionally, splits some light out of the fiber line.
> for example, an attacker within range of an Wi-Fi access point hosting a network without encryption
The monkey in the middle doesn't get to "relay" anything either, but he can sure see it going over his head.
I hate to agree but they are right. Endpoint-spoofing and relaying between two spoofed endpoints is just one of the possible forms of mitm attack, one that just happens to be required if you need to open and re-pack encryption in order to eavesdrop, or if you need to modify the data.
Spoofing the two endpoints to decrypt and re-encrypt, just so that you can eavesdrop without modifying the data (other than the encryption), is certainly still "mitm". Yet all the man in the middle did was eavesdrop. Becoming two endpoints in the middle was only an implementation detail required because of the encryption.
If you are admin of one of the mail servers along the way between sender and recipient and can read all the plain smtp messages that pass through your hands like postcards without having to decrypt or spoof endpoints, that is still mitm.
So listening to wifi is no less. There is nothing substantive that makes it any different.
For endpoint-spoofing to be required for mitm, you would have to say that mitm only applies to modifying the data, which I don't think is so. Several purely eavesdropping applications are still called mitm.
It's just semantics... but I'll throw my hat into the ring nevertheless:
The "eavesdropping" attack happens when you capture unecrypted packets. From there, you could either try to hijack the session by inserting yourself into the local conversation (effectively launching a "MITM" attack) or completely independently of the local conversation attempt to impersonate the login session (effectively launching an "impersonation" attack).
I agree that's not a case of MITM, but I do think it's fair to call sitting in range of the same Wi-Fi access point "physical access".
Eve and Mallory are both MITM, in my opinion.
> Public unencrypted (or weakly WEP encrypted) wifi, with clients connecting to HTTP websites. Other clients on the same wifi network can read the unencrypted HTTP packets over the air.
That's sniffing. The other two are MITM. The sniffer isn't in the middle of anything; you never speak to him.
You have to mitm in order to eavesdrop on an encrypted channel. If you do nothing but eavesdrop, isn't it still mitm? You had to actively replace & spoof the two endpoints, but that is just a technicality required by the encryption; in the end you still only eavesdropped, yet it was still called mitm.
So, mere eavesdropping is mitm. So, how is eavesdropping on unencrypted traffic or wifi traffic meaningfully different than eavesdropping by decrypt & re-encrypt?
I don't think the term mitm includes a requirement that you aren't talking to who you thought you were talking to, because generally you do still talk to who you thought you were, just not directly.
The traffic may be modified along the way rather than merely copied, or one or the other endpoint may be wholly faked rather than merely relayed, but they also may not be; the attacker may simply relay everything verbatim in both directions, i.e. pure eavesdropping, and the attack would still be called mitm.
Then again, I guess a keylogger is eavesdropping and not called mitm.
> You have to mitm in order to eavesdrop on an encrypted channel.
OK but we're talking about eavesdropping on HTTP, which is unencrypted.
So what? My point was the encryption doesn't matter; it's just the reason you have to impersonate in order to merely eavesdrop sometimes.
The point is that you don't need to be 'in the middle' to eavesdrop on unencrypted data - you can sniff traffic without either end being any the wiser, but if the data is TLS encrypted then you can only decrypt it if you interpose yourself.
I said that already. What are you trying to contest or expand?
One of us has not understood the other's purpose, because I don't understand the point of your response to my first comment. Nothing is making sense after that.
My initial point was to show that mere eavesdropping on an encrypted link is still mitm, and so the full interposition is merely an implementation detail, required by the encryption.
If mere eavesdropping is still mitm, then as far as I'm concerned any other eavesdropping is also mitm.
But then I also add that a keylogger is an eavesdropper and I wouldn't call that mitm, so maybe the argument is missing something.
Maybe the way I should think of it is: "Yeah. It's an implementation detail. An implementation detail called man in the middle. Eavesdropping on an encrypted link requires mitm, but not all eavesdropping is mitm, the way all beef is meat but not all meat is beef."
I.e. the fact that you chose not to do anything with the mitm but merely eavesdrop isn't really significant the way I argued at first. That particular example of "mere eavesdropping" is still called mitm not because "therefore eavesdropping is a form of mitm", but because that instance of eavesdropping required mitm to do it.
Allll that said, I now actually think all those other examples of eavesdropping, like even a keylogger, should be considered mitm. Because they are all examples of not talking to who you thought you were talking to. In the case of a passive observer like a wifi sniffer or keylogger or phone tap, you thought you were talking to a certain listener, but in fact you were talking to them plus other listeners.
It's perfectly logically arguable both ways.
> because generally you do still talk to who you thought you were, just not directly.
...that's the requirement. The indirection of someone interposing themselves between you and the party you're trying to speak to is what is referred to by the phrase "man in the middle".
There is no phrase "man to the side". If you don't represent yourself as being the party they want to talk to, you aren't performing a man-in-the-middle attack.
I don't think people would consider 1 or 3 to be MITM. MITM requires someone who is, well, in the middle: you connect to the MITM, and they connect to your destination. 2 is clearly a MITM.
Also, while 1 is arguably a case of "physical access", I don't think 3 is. If you have a tap in an ISP, you don't have "physical access" to any of the endpoints of the HTTP connection. Otherwise, you could say you have "physical access" to literally every machine on the internet, since there is some physical path between you and that machine.
Consider: 4. Someone else logs unencrypted traffic for whatever reason and the attacker gets access to the log later.
MITM means that you need to be in the middle at the time of the attack and so is more limited than an attack that works on logs.
At the end of the day it doesn't matter how you define "MITM". HTTPS means that even if I control a network hop in between the two endpoints, I can't get the key. Provider B says "we are OK with any hop on your route gaining access to your keys" (e.g. via owning a wifi router and giving customers access, or via being a customer and sniffing other customer's traffic, or the industrial-level equivalents of this).
Definition question. I can see your reasoning and I can see the author's, where they define MITM as requiring an active component being in the middle to actually tamper with it and not just some far-off device receiving backscatter with a high-gain antenna, or a read-only mirror port on a switch or whatever is technically not "in the middle" but in a cul-de-sac. I'm not sure I've got a strong opinion, claiming one or the other is the only correct definition might just be nitpicking
They may have chosen this wording ("it's not MITM") to get the team into action rather than dismissing the risk
Edit: another legitimate-sounding question downvoted in this thread without further comment (since I'm the only comment still after it got downvoted). Can people maybe just explain what's wrong with a post when it's not a personal attack, not off topic, not answered in the article, or any other obvious downvote reason? Everyone would appreciate the author learning from the problem if there is one
Completely agree, and arguably why stop at API servers?
Depending on server-side HTTP -> HTTPS redirects for security reinforces/rewards bad practices (linking to HTTP, users directly entering HTTP etc.), in a way that makes users vulnerable to one of the few remaining attack vectors of "scary public Wi-Fis".
The push for "TLS all the things" was already a massive overreach that actively made security worse overall, because it further ingrained the user tendency to click through scary browser warnings (all for the sake of encrypting things that were fine in plaintext). And you want to go even further? No thank you.
What scary browser warnings do you regularly get when visiting HTTPS sites?
It’s also not just about avoiding transmitting HTML etc. in plaintext; somebody being able to inject arbitrary scripts into sites you otherwise trust is bad as well.
But as I've said above, I think the HTTP -> HTTPS redirect should have never happened at the HTTP level. If we'd done it in DNS or at least as a new HTTP header ("optional TLS available" or whatnot), we could have avoided locking out legacy clients.
Scripts other than those that the user has specifically allowed can also be a problem, whether or not TLS is used. With TLS, only the server operator can add malicious scripts; without TLS, spies can also do so; either way, it can happen.
Specifying availability of TLS (and, perhaps, which ciphers are usable, in order to avoid the use of insecure ciphers) by DNS would do, if you can (at the client's option) acquire DNS records securely and can know that they have not been tampered with (including by removing parts of them). (This is independent of whether it is HTTP or other protocols that can optionally use TLS.)
(Actually, I think that using DNS in this way, would also solve "Gopher with TLS"; the format of gopher menus makes it difficult to use TLS, but knowing if TLS is available by looking at DNS records would make it work. Gopher servers can still accept non-TLS requests without a problem, if none of the valid selector strings on that server begin with character code 0x16 (which is always the first byte of any TLS connection, and is very unlikely to be a part of any Gopher selector string).)
It would also help to make cookies unable to cross between secure and insecure connections in either direction (always, rather than needing a "secure cookies" flag).
If you are saying that people click through certificate warnings then people definitely would just permit whatever script. The number of people who will say "yeah its okay that this is a self-signed cert" and also say "no, I have a strict allowlist of verified scripts from this server that I allow to run" is miniscule.
> all for the sake of encrypting things that were fine in plaintext
What was fine in plaintext?
Who cares how many of the n-dozen routers and switches between you and your favourite blog got to inject a few <script> tags for your convenience?
Encryption is for more than secrecy, folks.
> because it further ingrained the user tendency to click through scary browser warnings (all for the sake of encrypting things that were fine in plaintext).
Why should there be more scary warnings when more websites use TLS? Sure, you get more scary warnings if you set your browser to "warn if it's http", but then you're asking for it.
> Sure, you get more scary warnings if you set your browser to "warn if it's http", but then you're asking for it.
Defaults. They matter.
We are. Slowly, due to lots of legacy, but surely getting there.
See the small steps over the years where it was first an add-on to force https-only mode (HttpsEverywhere, 2011), then browsers started showing insecure symbols for http connections (e.g. in 2019: https://blog.mozilla.org/security/2019/10/15/improved-securi...), and more recently I think browsers are starting to try https before http when you don't specify the protocol. I've also seen a mention of strict https mode or something, not sure if that's a private navigation feature or something yet to come, but warning screens equivalent to insecure certificate pages are getting there
Chrome's version of trying https first sure is annoying though.
If a site is down entirely, when chrome can't connect to port 443 it confidently declares that "the connection is not secure because this site does not support https" and gives a "continue" button. Then when you click "continue" nothing happens for a while before it finally admits there's nothing responding at all.
So it gives a misleading error and takes longer to figure out if a site is actually down.
I'm highly surprised by this. It seems very dumb and I have never seen anything like this, though I never use Chrome and very rarely fire up Chromium for testing something.
Is there something to read about this, like a dev ticket?
There might be but I'm not aware of any tickets. But if you open chrome and navigate to 192.168.20.20 you should see it. Or any domain that resolves to a non-responsive IP, if you have one in mind.
Just tried on Chromium, I get ERR_ADDRESS_UNREACHABLE as I would expect.
I have never seen this happen in Chrome. I just tried it to make sure I wasn't crazy and it did indeed go straight to telling me that the connection timed out without showing a security error first.
Firefox has a similar bug, but for DNS rather than connection.
This is only enabled for a small percent of people at random, otherwise just folks in the advanced protection program.
> arguably why stop at API servers?
I think this is pretty convincingly argued in TFA, honestly: modern browsers understand and respect HSTS headers, maintain enough local state that such headers are meaningful, and HSTS preloading is easy enough to set up that it should be achievable by most website operators.
Furthermore, it is actually quite hard to concoct a scenario where a user clicking an HTTP link and getting immediately redirected constitutes a danger to their security: unlike with API endpoints, people clicking links (and in particular, links which were typed out by hand, which is how you get the HTTP protocol in the first place) are generally not making requests that contain sensitive information (with the exception of cookies, but I would argue that getting someone to have a sane configuration for their cookies and HSTS headers is a far easier ask than telling them to stop responding to all port 80 traffic).
> Don't have HTTP available at all
That's literally what the article suggests:
> A great solution for failing fast would be to disable the API server's HTTP interface altogether and not even answer to connections attempts to port 80.
It also says
> We didn't have the guts to disable the HTTP interface for that domain altogether, so we picked next best option: all unencrypted HTTP requests made under /api now return a descriptive error message along with the HTTP status code 403.
So close and yet … their misconfigured clients will still be sending keys over unencrypted streams. Doh
> So close and yet … their misconfigured clients will still be sending keys over unencrypted streams. Doh
And how does disabling the HTTP interface altogether prevent that? In that case, any sensitive credentials are still already sent by the client before the server can do anything.
TCP handshake failure prevents the client from being able to send any data, no?
Often, but not always: https://en.wikipedia.org/wiki/TCP_Fast_Open
> TFO has been difficult to deploy due to protocol ossification; in 2020, no Web browsers used it by default.
It's in use by Android for DNS over TLS. Ossification issues are exaggerated.
Not if the server refuses a connection on port 80.
addendum: How quickly can a server write a 403 response and close the connection and can it be done before the client is able to write the entire HTTP request? My guess is not fast enough.
I'd be surprised if most HTTP clients even looked at the response code before writing at least the entire HTTP request (including authentication headers) out to the TCP send buffer.
And if they do that, even if they immediately close the TCP socket upon seeing the 403, or even shut down their process, I believe most socket implementations would still write out the queued send buffer (unless there's unread inbound data queued up at the time of unclean socket close, in which case at least Linux would send an RST).
And with "TCP fast open", it would definitely not work.
If the client can't open a TCP connection to port 80, there's no unencrypted path to send the API keys down
Would disabling HTTP change that? Would TCP figure out the connection isn't working before the api keys are sent?
Yes disabling HTTP would prevent keys being sent if the remote end isn't listening, unless TCP Fast Open is in use, which it is not by default.
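From the client's point of view the difference is visible: with nothing listening on port 80, the TCP handshake fails and fetch rejects before a single request byte (API key included) is written. A sketch, assuming Node 18+'s global fetch and a hypothetical host with port 80 closed:

// The connect() fails during the TCP handshake, so the Authorization header
// below never leaves this machine.
try {
  await fetch("http://api.example.com/v1/things", {
    headers: { Authorization: "Bearer not-actually-sent" },
  });
} catch (err) {
  // Surfaces as "TypeError: fetch failed" with cause ECONNREFUSED
  // (or a timeout, if packets to port 80 are silently dropped).
  console.error(err);
}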
This is better, and the only way to really prevent the primary problem. Why would an API be on HTTP at all?
Why is it better?
I can't think of large differences. What comes to mind are two human factors that both speak against it:
- Having an HTTP error page informs the developer they did something wrong and they can immediately know what to fix, instead of blindly wondering what the problem is or if your API is down/unreliable
- That page, or a config comment, will also inform the sysadmin that gets hired after you retire comfortably that HTTP being disabled is intentional. Turning something off that is commonly on might be a time bomb waiting for someone who doesn't know this to turn it back on
Edit:
Just saw this reason, that's fair (by u/piperswe) https://news.ycombinator.com/item?id=40505545
> If the client can't open a TCP connection to port 80, there's no unencrypted path to send the API keys down
If that immediately errors out on the first try, though, what is the risk of the key being intercepted? The dev would never put that in production, so I'm not sure how the pros and cons stack up. I also loved this suggestion from u/zepton which resolves that concern: https://news.ycombinator.com/item?id=40505525 invalidate API keys submitted insecurely
Why spend effort on this at all? Just don't run anything on HTTP. Devs don't need hand holding. Should you run your API on `localhos` in case they typo the hostname?
Why does everything have to be so complicated?
The argument I've heard against not having HTTP at all is that potentially someone might be able to run something malicious on port 80 for that address that the admin is not aware of.
People can make up their own minds if that's a good argument or not.
That’s a terrible argument. If someone can run something on port 80, they have root on the machine, at which point can do whatever they want (including getting rid of whatever existing process was listening on port 80 and replacing it with their own).
> Servers can now send HSTS along with the initial HTTP-to-HTTPS redirection response
> Node.js's built-in fetch happily and quietly followed those redirects to the HTTPS endpoint.
Okay.. does nodejs fetch respect HSTS?
I'm not even sure I'd find it desirable for nodejs fetch() to quietly store state somewhere on my server without asking me: I wouldn't know to back that file up, it may be trashed regularly depending on the infrastructure, it could mess with version control by creating a local working directory change, or it might run into an error condition if it is on a read-only filesystem (either crashing or being unable to use this security feature, neither is great).
Config file writes are to be expected for user-facing software, but for developers it should error out so they can just choose what it needs to do
E.g., "Error: HSTS response received after a redirect from HTTP, but no storage location specified for security states. Use the https:// protocol, specify a file location, or set insecure_ignore_hsts=true."
Edit: but it's a very legitimate question?! Just saw you got downvoted, I have no idea why
I think the original question "Okay.. does nodejs fetch respect HSTS?" goes into the "not even wrong" bucket, for the reasons you point out.
HSTS really only makes sense from a browser perspective (or, rather, a "permanently installed, stateful client" perspective). For an API like fetch it doesn't even make sense as a question IMO.
> For an API like fetch it doesn't even make sense as a question IMO.
Why not?
Because the way HSTS works fundamentally implies a stateful client, and because it was fundamentally created to solve a human problem (i.e. humans entering in URLs directly in the address bar). It really doesn't make sense with a non-stateful client, and it isn't intended for cases where someone isn't manually entering in the URL to hit.
E.g. fetch is often loaded in transient situations - it really shouldn't be updating its own config because of responses it gets. Also, based on the other comment I was originally thinking it would be good if fetch would just error out if it gets a redirect from http -> https AND also a Strict-Transport-Security header, but in that case it would still mean it was dependent on the server setting up the STS header, and if they could do that they should just go ahead and get rid of the http -> https redirect in the first place.
I agree with you on things like CORS, but HSTS would actually solve the problem stated in this thread quite gracefully.
Client fetches http, notices the header, retries https then ... ok no, lol. But I guess it's still of some use to let clients know "this should always happen through https" and make them fail if that's not the case.
Edit: yeah I got it, client fetches http, notices the header and then explicitly fails because HSTS is there.
I would say it does make sense as a nice upgrade path for services that don't yet support https but you'd like to switch as soon as they enable it, or if you're forgetful or lazy and didn't type https:// in front of the link you were given or so
Whether that's still common enough to warrant the extra complexity in the fetch function is not something I'm qualified to judge
> Just saw you got downvoted, I have no idea why
I've noticed a gradual increase in this behavior during the past ... year maybe? I think for a lot of new people, downvoting equates disagreeing, which is not ideal.
Although, I also have no idea why someone would disagree with a neutral question like GPs, lol.
Hopefully, it's not the beginning of the end for HN as it is a great website.
These types of downvotes also discourage discussions. I’ll upvote a comment when it has been downvoted if it has a constructive discussion thread.
How would that even work? It's up to the developer to consider the response and act correctly on it.
So if you occasionally forget and use http when you meant https and are worried about the consequences of that, you should just implement your own HSTS checking layer?
Why not just implement your own fetch wrapper that throws if it's not an https connection?
> So if you occasionally forget and use http when you meant https and are worried about the consequences of that, you should just implement your own HSTS checking layer?
Or use a library to do it. The core fetch functionality shouldn't have to deal with HSTS. There may be legitimate reasons to fetch over HTTP even after you received an HSTS header - for testing purposes, for example.
> Why not just implement your own fetch wrapper that throws if it's not an https connection?
That's the developer dealing with HSTS.
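For what it's worth, such a wrapper is only a few lines (a sketch assuming Node 18+'s global fetch; the allowInsecure escape hatch is made up):

// Thin wrapper that refuses to speak plain HTTP unless explicitly told otherwise.
type SecureFetchInit = RequestInit & { allowInsecure?: boolean };

export async function secureFetch(
  input: string | URL,
  init: SecureFetchInit = {}
): Promise<Response> {
  const url = new URL(input);
  if (url.protocol !== "https:" && !init.allowInsecure) {
    throw new Error(
      `Refusing to send a request over ${url.protocol.slice(0, -1)}; ` +
        "use https:// or pass allowInsecure: true for local testing."
    );
  }
  const { allowInsecure, ...rest } = init;
  return fetch(url, rest);
}

// Usage: await secureFetch("https://api.example.com/v1/things");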
Just do it like a web browser - when you install Chrome it comes with a list of many tens of thousands of domains that had HSTS set when GoogleBot visited it.
Same way as TLS session resumption can be handled by libraries without you having to touch it, or perhaps requiring you to specify a storage file and taking it from there
I'm not aware of any general programming language http clients that honor HSTS.
libcurl supports HSTS, but the client has to specify a file where the state should be stored: https://curl.se/docs/hsts.html
Many languages/libraries use libcurl in some shape or form, but whether they set up an on-disk HSTS store or not - I don't know either.
Revoking an API key upon a single http request assumes you have a competent team. Have worked with people that have committed sensitive credentials into public repositories, used PRODUCTION secrets for testing, and of course sharing secrets in plain text over IM and group chats.
The number of times I have had to deal with password or secret key resets because of this is way too high. I remember working with a guy that thought sharing a demo of an internal application on YouTube was okay. Of course it had snippets of company secrets and development API keys clearly visible in the demo.
It assumes nothing like that. It provides you and your team with a safe opportunity to learn and grow.
Revoking a key like this is a problem whose solution is at worst a dozen clicks away. The alternative, leaking keys, can be much worse.
So happily go and reset those creds for the guy who made a mistake, and be grateful. He had an opportunity to learn.
There’s the guy that made _a_ mistake. Then there’s the guy who continues to make the same mistake over and over again.
That second person is way too common in companies that hire from the bottom of the barrel.
The thing with these kinds of mistakes is that if the service doesn't revoke the key and "cause problems" immediately, then there's no feedback to learn from. It's a fail-fast situation that's usually a better outcome anyway.
It might be optimistic to say that rolling over to a new key is 'at most a dozen clicks'.
Hardcoded keys are a big one. But even just hunting down all config where the key needs to change can be a major hassle. Is there some ci yaml that expects a runner to have the key in a .env file that only runs on major release? Good chance you won't realize that key exists.
Still a great idea to revoke those keys. But it will damage some customers in the short term. And they will be angry, as people often are when you demonstrate their mistake.
I appreciate the author calling this out because creating an HTTP-redirect-to-HTTPS is something I'll do almost without thinking about it. "If it has HTTPS, I'll set up an HTTP redirect." Now I know that I need to think about it before setting that up.
It also made me realize that cURL's default to not redirect automatically is probably intentional and is a good default. Praise be to Daniel Stenberg for this choice when implementing cURL.
Using CloudFront, the redirect was the only built-in option for a long time. They only added push-button HSTS recently. But I'd say the author is correct that if you're hosting an API there's no reason to support http at all. Just send a 400 on all requests and let the client developers use common sense.
You can still return a response...
400 - HTTP is unsupported, use HTTPS.
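As a rough sketch of what that plaintext listener could look like (Node's http module; revokeApiKey is a hypothetical stand-in for whatever key store you actually have, combining the 400 response with the revocation idea discussed elsewhere in the thread):

    import { createServer } from "node:http";

    // Hypothetical helper: mark a leaked key as revoked in your key store.
    async function revokeApiKey(key: string): Promise<void> {
      /* look the key up and disable it */
    }

    // Plaintext listener: never serve data, never redirect; burn the key and say why.
    createServer((req, res) => {
      const key = req.headers.authorization; // or wherever your clients put the key
      if (key) void revokeApiKey(key);
      res.statusCode = 400;
      res.setHeader("Content-Type", "text/plain");
      res.end("HTTP is unsupported, use HTTPS. Any API key sent over HTTP has been revoked.\n");
    }).listen(80);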
I tend to use Caddy as a reverse proxy for personal projects... The default behavior is to redirect to https. May have to make a special rule for API instances.
Yeah I completely missed this as a security flaw. Time to go deploy a fix...
Or just add your domain to the hsts preload list and never have to worry about this.
Is the HSTS preload list used by anything other than browsers? I'd expect it to be minimally useful for an API.
That works for browsers but I doubt any non-browser HTTP clients (e.g. curl and wget) or HTTP library (e.g. Python requests lib) will check the HSTS preload list.
In fact if they do follow HSTS headers, a simple `Strict-Transport-Security: ...; preload` would have fixed the issues mentioned in the article.
Did you happen to RTFA, in which the author specifically mentions HSTS preloading--helpfully styled as a bold, underlined, bright blue link--in the second paragraph? If you manage to then get to the third paragraph, a concise and compelling reason is given for why it's not applicable in the scenario the author is examining.
Please look into the myriad scenarios in which HSTS is not honoured.
As usual, any comment stating that people should “just” do x, is wrong.
npm is misusing 426 Upgrade Required.
https://httpwg.org/specs/rfc9110.html#status.426:
The server MUST send an Upgrade header field in a 426 response to indicate the required protocol(s) (Section 7.8).
https://httpwg.org/specs/rfc9110.html#field.upgrade:
The Upgrade header field only applies to switching protocols on top of the existing connection; it cannot be used to switch the underlying connection (transport) protocol, nor to switch the existing communication to a different connection. For those purposes, it is more appropriate to use a 3xx (Redirection) response (Section 15.4).
If you’re going to talk cleartext HTTP and issue a client error rather than redirecting, 403 Forbidden or 410 Gone are the two most clearly correct codes to use.
Ignoring the mandated semantics and requirements of status codes is sadly not as rare as it should be. A few I’ve encountered more than once or twice: 401 Unauthorized without using WWW-Authenticate and Authorization; 405 Method Not Allowed without providing Allow; 412 Precondition Failed for business logic preconditions rather than HTTP preconditions; 417 Expectation Failed for something other than an Expect header. I think it only ever really happens with 4xx client errors.
TLS is in fact a valid protocol to use in an Upgrade header:
HTTP/1.1 426 Upgrade Required
Upgrade: TLS/1.0, HTTP/1.1
Connection: Upgrade
So you can use 426 Upgrade Required here, and I'd argue it's the most correct code to use in such a case. npm doesn't send the Upgrade header though, so that's a mistake. https://www.iana.org/assignments/http-upgrade-tokens/http-up...
That’s something different: that’s for upgrading to TLS within the same connection. As in, approximately http://example.com/ → https://example.com:80/ (but without the URL’s scheme actually being allowed to change), whereas https://example.com/ is https://example.com:443/. I was only a child when RFC 2817 was published, but I’ve never heard of any software that supported it, other than the Internet Printing Protocol which can use it for ipp: URLs, kinda like SMTP has STARTTLS. As for the motivations of RFC 2817, they’re long obsolete: encryption should no longer be optional on these sorts of things so that the parallel secure port problem is gone (not sure when this became actual IETF policy, but I’m going to guess towards ten years ago), and the virtual hosting problem is solved by SNI (supported by everything that matters for well over a decade).
How is this not more upvoted? HTTP Code 426 sounds like the best code to send?
I think that it would be better to allow both as much as possible. One way to handle authentication is to use HMAC; you can do that even without TLS, and HMAC won't leak the keys if one of the systems (e.g. a reverse proxy or something else) is compromised.
If you do not want to do that, then don't accept connections on port 80, at least when version 6 internet is being used. (For version 4 internet, it is possible that you might use the same IP address for domain names that do want to accept connections on port 80, so you cannot easily block them in such a case.)
And, if you want to avoid compromising authentication data, then TLS is not good enough anyways. The client will need to know that the server certificates have not been compromised. HMAC will avoid that problem, even without TLS.
There is also the payload data. TLS will encrypt that, but the URL will be unencrypted if TLS is not used, whether or not the API key is revoked; the URL and headers may contain stuff other than API keys.
Some APIs also might not need keys; e.g. read-only functions for public data often should not need any kind of API keys (and should not have mandatory TLS either, although allowing optional TLS is helpful, since it does provide some security, even though it doesn't solve everything).
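For anyone curious what "use HMAC instead of sending the key" can look like in practice, a rough client-side sketch (the header names and signing string are invented for illustration, not any standard scheme):

    import { createHmac } from "node:crypto";

    // Sign the request with the shared secret; the secret itself never goes over the wire.
    function signRequest(keyId: string, secret: string, method: string, path: string, body = "") {
      const timestamp = Date.now().toString(); // lets the server reject stale requests
      const payload = [method, path, timestamp, body].join("\n");
      const signature = createHmac("sha256", secret).update(payload).digest("hex");
      return { "X-Key-Id": keyId, "X-Timestamp": timestamp, "X-Signature": signature };
    }

    // Usage (illustrative): attach the returned headers to the request.
    // fetch("https://api.example.com/v1/things", { headers: signRequest(id, secret, "GET", "/v1/things") });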
HSTS is even worse.
TLS prevents spies from reading and tampering with your messages, but does not prevent the server from doing so (although in this case it might be unimportant, depending on the specific file being accessed). It also is complicated and wastes energy, and there are sometimes security vulnerabilities in some implementations so does not necessarily improve security.
Of course, these ideas will not, by themselves, improve security, and neither will TLS; their combination won't either. You will have to be more careful to actually improve security properly. Some people think that if you add TLS and HTTPS, and insist on using it, then it is secure. Well, that is very wrong! TLS will improve security in some ways as described in the previous paragraph, but it does not solve everything.
It is also problematic if a client has no option to disable TLS (or to use an unencrypted proxy), since if you deliberately want to MITM traffic on your own computer, you then have to decrypt and re-encrypt the data twice. (If the client merely uses TLS by default, that would work, although when the scheme comes from a configurable URL, the URL does not always come from the configuration file; and even if it does, checking for "https" does not protect against typographical errors elsewhere, e.g. in the domain name. Specifying which certificates to expect can sometimes help, and the program could display a warning message.) Another problem is when the client and server require different versions of TLS and it is difficult to change the software (there are reasons you might want to change only some parts of it, and that can be difficult); a local unencrypted proxy which connects to the server using TLS can also avoid problems like this.
HMAC is a fine suggestion but let's be practical. These days you can tell a junior engineer to go use TLS, but you can't tell a junior engineer to implement HMAC to sign API requests.
The client will need to know that the server certificates have not been compromised. HMAC will avoid that problem, even without TLS.
HMAC doesn't solve the problem: the client still doesn't know that the shared key isn't compromised. What does it even mean for either a client or server to know something is compromised? If Alice and Bob have a shared key, what can they do to ensure Mallory doesn't also have the shared key?
the client still doesn't know that the shared key isn't compromised. What does it even mean for either a client or server to know something is compromised? If Alice and Bob have a shared key, what can they do to ensure Mallory doesn't also have the shared key?
Of course, that is true, but TLS doesn't help with that, either. However, if a reverse proxy (especially one run by a third party) or something else like that is compromised, HMAC won't let the attacker recover the shared key, as long as the reverse proxy never knows that key (unless they can somehow trick the server into revealing it, but that is a separate issue).
An additional issue arises if the channel is compromised even before the client receives the key for the first time, if the key is sent over the same communication channel; in that case neither TLS nor HMAC will help (although certificate revocation may help in this case, but some other method will then be needed to correctly trust the new certificate).
However, different services may require different levels of security (sometimes, hiding the key isn't enough; and, sometimes, encrypting the data isn't enough). How you will handle that depends on what security you require.
Sort of off-topic. What is a recommended way to sell access to a one-off data API? Low code method to control access and facilitate payment?
As in, selling API keys? Not sure what you're asking for. Are you looking for a webshop that has API key sales as a default item type or something?
Yes, more like a SaaS. Maybe a solution that is tailored to selling API access. Generates a unique URL to the user, or API key, after they sign up for the API service.
Maybe shttp would have been better than https to reduce typo errors or maybe even something completely different from http.
Ah... you know, that idea sounds not too shtty to me.
I'm sure it was added to the end following a pattern of other protocols doing the same for their ssl-wrapped version. Right now, shttp reads like http over ssh to me.
You can’t make this decision until you know how many customers are on http and how lucrative they are.
Breaking an API because you read a blog post is a bad idea
You can only do it for new customers, though. Just save the list of API keys used over HTTP, then whitelist those who earn more than $xxx/year.
Then you can work with those customers to help them upgrade their clients to HTTPS.
HTTPS and SVCB DNS records will hopefully make it more feasible over time to drop the traditional HTTP server-side redirect. The client agent will be able to read the DNS record and upgrade to the highest available protocol prior to sending the first request.
Seeing how web browsers haven't added support for SRV in decades, I'm not holding my breath.
I'm guessing there's some (?) advantage of SVCB and HTTPS records that can't be achieved with SRV[1] and TXT records, but I haven't read the RFC yet to know what that advantage is.
[1]: `_https._tcp` or `_http3._udp` or whatever.
I fully support this and have always pushed for it. One, because it becomes a huge mess to maintain over time, but also because in the long term it will lower traffic through the LB.
Unfortunately what i see happen all the time is quick fixes are pushed to the infra. For example they deploy and typo the URL. Now we have a prod outage and infra is pulled in to fix this asap. No time to wait for that 10 minute deploy pipeline that requires all the tests to run and a deploy to dev.
This happens once and then infra is asked why we don’t already redirect all URLs. Management doesn’t care about security and they just lost money. Guess what you are doing now. This is the world we live in.
Indeed. It’s probably why so many APIs accept the api key in the URL.
After all this chatter, I am considering blocking all outgoing traffic to port 80 in my local firewall
This would prevent my fat fingers from ever even making the mistake.
BLOCK outgoing port 80
How bad would that be? Would I be shooting myself in the foot somehow?
Perhaps I would do it on the egress rule for where my requesting service is running like in ECS.
It would block requests to OCSP responders, for one.
I hope that providers whose APIs responded and interacted fully over unencrypted HTTP would go back to their historical access logs and check how widespread plaintext HTTP usage is. If they don't have access logs for their API, they could just sample the next 24 hours of API accesses.
Popular providers have so many API users today that even a rare mistake could expose quite a few users in absolute numbers. I would rather have providers check this now than have this poor practice abused by the next DNS-hijacking malware affecting home routers.
I wouldn’t hold my breath. As I recall a similar article appeared a few years ago, and the author called out a major SaaS provider as having this issue. The provider ultimately decided not to do anything about it, because it would break too many clients.
If you make a breaking API change like this, some portion of clients are just never going to update. If you’re a usage-based billing SaaS provider, that means lost revenue.
Likely the only way this issue is fixed widely is if it ends up on a security audit checklist.
Now that I think about it, interfaces such as JS fetch should make it an error to use http without explicitly allowing it in an option.
It seems too easy to make a mistake and end up in a situation like the post explains.
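Something along those lines can already be bolted on by wrapping the global fetch; the allowInsecureHttp option below is invented purely to illustrate the opt-in:

    type GuardedInit = RequestInit & { allowInsecureHttp?: boolean }; // hypothetical opt-in flag

    const realFetch = globalThis.fetch;
    globalThis.fetch = (async (input: RequestInfo | URL, init?: GuardedInit) => {
      const url = new URL(typeof input === "string" || input instanceof URL ? input : input.url);
      if (url.protocol === "http:" && !init?.allowInsecureHttp) {
        throw new Error(`plain HTTP to ${url.host} blocked; pass allowInsecureHttp: true to permit it`);
      }
      return realFetch(input, init);
    }) as typeof fetch;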
Hmm. I think, perhaps, release versions should need this, without a flag.
For testing/prototyping, it is invaluable to turn off all the security to rule out security misconfiguration instead of application error.
If your API is non-sensitive/relies on out of band security (like large files with checksums), you may still not want https, so there should be some configuration to turn it off. And for "integrations" like jsdelivr, perhaps https libraries should follow this rule, while http ones can have the flag off...
Then, if you mix the two (http and https), perhaps they can provide a noticeable alert to the user rather than failing silently...
As a non-developer, ordinary computer user "providing service" for one user (yours truly) it's easy for me to configure the TLS forward proxy listening on the loopback to send _all_ HTTP requests, from _any_ application, including ones sent to port 80, via HTTPS. This I find preferable to letting a browser try to convert HTTP to HTTPS, e.g., "HTTPS Everywhere", or letting a developer do it with a redirect. Personally, I compile clients without linking to a TLS library. They are smaller without it and I do not have to worry about every author having correctly added TLS support. When SSL/TLS changes, applications using it often need to be updated, and some authors have made mistakes, socat being one example that comes to mind. I do 100% of TLS negotiation using a single program: the proxy. Every HTTP request on the home network goes to the proxy.
There’s absolutely nothing ordinary or easy about that setup, but I admire it. I’ve only seen that level of paranoia at a three letter agency.
Do applications with pinned certificates break? A bunch of mobile apps do that to get in the way of Wireshark.
Hard to argue against this.
That's an excellent point, and definitely something I've done without thinking about it. I'm going to stop, and disable those. Thanks!
Makes sense.
and revoke API keys sent over the unencrypted connection.
Excuse me, short question:
If I am not offering a non-TLS endpoint in the first place, and the client, for some reason, decides to take something that is literally called "SECRET", and decides to shout it across the open internet unencrypted...
...how is that my problem again?
Why should my setup be more complex than it needs to be, to make up for an obvious mistake made by the client?
If you want to implement this in a AWS/ALB (Load Balancer) setup, you have to:
- Remove the ALB's HTTP listener
- Remove port 80 from the ALB's security group
If you are paying the cost of a TLS handshake, which you will have to anyway, why not just use client certificates to authenticate (mTLS) instead of hand-rolled and rudimentary auth tokens on top? gRPC has built-in support for mTLS and it could be a good time to modernize your endpoints if you are looking to invest time to improve your API security.
Headline reads like a hot take. Actual recommendation is rather useful. Click-bait used for good.
"This unencrypted part of the communication flow has its flaws. Third parties in shared networks, as well as network intermediaries, could sniff passwords and other secrets from the initial HTTP traffic or even impersonate the web server with a MITM attack."
A strawman fallacy.
I think a good approach would be for developers to always use a custom HTTPClient class which throws an error if HTTP is used. I.e. you MUST opt in to use HTTP.
Nothing should redirect ever. However, it happens.
This makes sense, and I wondered for a moment why I hadn't noticed the issue in any of our projects. I realized it's because we just don't expose HTTP in the first place, whether for API or UI purposes. However, that doesn't mean there's no danger.
I believe it's still possible for a client to leak its credentials if it makes a bare HTTP/1 call to an actual HTTPS endpoint. The server only gets a chance to reject the invalid TLS handshake after the client has already sent headers that may contain sensitive information. After all, the TCP negotiation did go through, and the first layer 7 packet is the bare HTTP headers.
Of course the port numbers should help here, but that's not guaranteed in an environment where port numbers are assigned dynamically and rely on service discovery.
Serving exclusively HTTP/3 should close this gap but requires all clients to be ready for that. I know many internal deployments do this, but it's not a universal solution yet.
Yes, great article. And now can we convince folks with http+https websites to shut down http access and only offer https? I've seen simple mistakes like only partial redirects happening: large numbers of internal links that still go to the http site, and some of those not redirected (you would think they'd be simple to find and just clean up). And it is frustrating when sites like some online forums make interesting targets for password theft.
https is great, and I'm glad most web traffic uses it. However, sometimes you want http.
Example: when teaching low-level socket programming, plain http is the easiest place to start (see the sketch after the next example).
Example: when providing simple, nonsensitive, read-only info via API, the overhead of https and certificate management seems unnecessary.
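On the teaching example: the whole request/response cycle fits in a few lines of raw socket code, which is exactly what makes plain http a good classroom starting point. A sketch using Node's net module:

    import { connect } from "node:net";

    // One TCP connection, one handwritten request, and the raw response printed to stdout.
    const socket = connect(80, "example.com", () => {
      socket.write("GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n");
    });
    socket.on("data", (chunk) => process.stdout.write(chunk));
    socket.on("end", () => console.log("\n-- connection closed --"));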
I agree that APIs shouldn't automatically redirect HTTP to HTTPS, but I also think that client libraries shouldn't follow redirects by default.
I've stopped opening port 80 at all for some of my web services. The parent domain is in the HSTS preload list, so no modern browser should ever be trying to connect to port 80. And (as the fine article intimates) API calls shouldn't be available on port 80 anyway.
Interesting - I hadn't considered this before, but makes perfect sense. Feels like it's something that's easy to miss, as lots of APIs are hosted behind generic web application firewalls that often have automatic HTTPS redirection as a base rule.
There is a limited number of cases where unencrypted http is still applicable, e.g. verifiable packages. In general, though, it feels wrong even to think about leaving such a glaring, useless hole open.
I do redirect APIs to HTTPS, but I'd prefer not to. There's a simple reason - My APIs are hosted on the same IP as the public website, behind the same load balancer, so something has to be on the HTTP port. I would prefer to separate them, and my larger customers do - But for smaller customers, it's an unnecessary added expense and complication that doesn't make sense.
that's more secure, but still not bulletproof:
A MITM (e.g. a router along a multi-hop route between the victim client and StackExchange) could silently drop the unsafe HTTP request and maliciously repackage it as an HTTPS request, thereby circumventing the revocation.
Also: even if an insecure HTTP request isn't dropped and makes it through to StackExchange's endpoint eventually (thereby triggering the API key revocation), a MITM with a shorter trip time to SE's servers could race to wreak havoc until the revocation happens.
Nevertheless, SE's revocation tactic contributes positively to a defense in depth strategy.
There's absolutely nothing you can do to prevent an active MITM attack over HTTP
Have the authentication header effectively be a signed hash of relevant headers and the full URL, rather than a simple bearer token?
What's stopping the MITM just copying that header?
There's complicated authentication schemes around hmac that tries to do this, but if you're putting that much effort into it you might as well give up and use https.
Some of these include a nonce and/or are deployed over TLS to prevent replay attacks and avoid sending bearer tokens over the wire. AWS sig v4 and RFC7616 come to mind.
Even if they copy the header, they can only perform a replay attack, which is an improvement over leaking an API key. Also, you could include a timestamp in the signature to limit the amount of time it could be replayed.
Sign a nonce.
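A rough sketch of what the verifying side of such a scheme might look like (the skew window, field layout, and nonce store are all illustrative):

    import { createHmac, timingSafeEqual } from "node:crypto";

    const seenNonces = new Set<string>(); // in production: a shared, expiring store
    const MAX_SKEW_MS = 5 * 60 * 1000;    // reject anything older than five minutes

    function verify(secret: string, method: string, path: string,
                    timestamp: string, nonce: string, signature: string): boolean {
      if (Math.abs(Date.now() - Number(timestamp)) > MAX_SKEW_MS) return false; // stale
      if (seenNonces.has(nonce)) return false;                                  // replayed
      const expected = createHmac("sha256", secret)
        .update([method, path, timestamp, nonce].join("\n")).digest();
      const given = Buffer.from(signature, "hex");
      if (given.length !== expected.length || !timingSafeEqual(given, expected)) return false;
      seenNonces.add(nonce);
      return true;
    }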
I think the CONNECT proxy protocol carrying TLS over HTTP/1 is a counterexample.
ref: https://developer.mozilla.org/en-US/docs/Web/HTTP/Methods/CO...
I'd argue your reasoning is incorrect. By the time your service is developed you would have already switched to https, because during development every API key you sent via http got disabled. So an in-the-wild MITM would never get to see your http request.
I agree from a developer point of view, but the people configuring and deploying the application aren't always the same people developing it.
As a developer I like to make many options available for debugging in various situations, including disabling TLS. This isn't controversial, every Go and Rust library I've ever seen defaults to no TLS, preferring to make it easy rather than required, so reflecting those defaults in the service's configuration is natural and intuitive.
I make sure my example configurations are as close to what will be needed in production as possible, including not just TLS but at least one "mutual TLS" validation. I even sync these back from production if it turns out something had to be changed, so the examples in the repository and built artifact are in line with production.
Yet I routinely find at least some of these disabled in at least some production deployments, presumably because the operator saw a shortcut and took it.
Let's rework Murphy's original law: if there are multiple ways to deploy something and one of those will be a security disaster, someone will do it that way.
Won't those people have the same experience? The app won't work until they configure it securely
There is a non-zero number of developers out there who would sooner deploy a proxy that upgrades http to https because the thought of changing the application code wouldn’t spring to mind
It appears the trick they've found is to disable TLS on both ends :)
That's a very good point, I agree. You're always going to run a service at least once.
The MITM can be between the developer's machine and Stack Overflow, e.g: the classic Evil Cafe Wifi.
The thing about bullet proof is that nothing is bulletproof when you have a big or fast enough bullet
Yes, but what is meant is that it provides protection against common calibers. So you can have bulletproof security in IT. That does not mean it is blast proof, or acid resistant, or prevents someone using a backup key on the second entrance. It is just a metaphor saying this security is very solid. Might be true, or not, but that nothing is 100% secure is quite well known.
You tend to only hear about the systems where security was successfully broken, not the systems nobody managed to penetrate.
Nothing could possibly be bulletproof. You sent a key over the wire unencrypted. You were in trouble before the data even got to the server to do anything about it.
This approach is a practical choice based on the reality that the bulk of unencrypted traffic is not being actively mitmed and is at most being passively collected. Outside of actually developing cryptosystems, security tends to be a practical affair where we are happy building systems that improve security posture even if they don't fix everything.
as an old-school reader of the cypherpunks email list from before HTTPS existed, I'm still mad about this part:
Outside of actually developing cryptosystems, security tends to be a practical affair where we are happy building systems that improve security posture even if they don't fix everything.
there was a time in the 1990s when cryptography geeks were blind to this reality and thought we'd build a very different internet. it sure didn't happen, but it would have been better.
we had (and still have today) all the technology required to build genuinely secure systems. we all knew passwords were a shitty alternative. but the people with power were the corrupt, useless "series of tubes!" political class and the VCs, who obviously are always going to favor onboarding and growth over technical validity. it's basically an entire online economy founded on security theater at this point.
True, but if we actually did that, it would make those systems very unpleasant to use. The age-old tradeoff of security vs convenience is still as much a force today as is always has been.
Having technically the tightest possible security is not always the right answer. The right answer is whatever balance people are willing to work with for a particular use case. There's a reason that most people don't secure their houses by removing all windows and installing vault doors.
I've been thinking for about 5 minutes about this comment and what to write, but I've come to the conclusion that this is really not just the best thing to do, but the correct thing to do.
It's not different levels of good or bad... everything else is wrong.
One of the approaches mentioned in the article is to just not listen on port 80. Supposedly that’s equally good because the connection should get aborted before the client has the chance to actually send any API keys.
But is that actually true? With TCP Fast Open, a client can send initial TCP data before actually learning whether the port is open. It needs a cookie previously received from the server to do so, but the cookie is not port-specific, so – assuming the server supports Fast Open – the client could have obtained the cookie from a prior connection over HTTPS or any other valid port. That’s the impression I get from reading the RFC, anyway. The RFC does mention that clients should distinguish between different server ports when caching refusals by the server to support Fast Open, but by that point it’s too late; the data may have already been leaked.
If someone is in your path they can just fake listen to 80 and intercept, then forward your call to 443.
Probably best to listen on 80 and trash the token right then, as the majority of the time there won't be a MITM, and breaking the application will force the developer to change to https.
They can do that whether or not you are listening on port 80 though.
That was OPs point. Not listening on port 80 won't help against an active MitM.
But listening on port 80 and revoking the key also won’t help either as the active MitM would have been smart enough to internally proxy to port 443 or return some other fake response.
The real point is to break the application during development before the first MitM. Either approach does that equally well.
But not listening on port 80 will also usually break the application. Though I suppose the same API key may be used by multiple applications, or multiple copies of an application configured differently.
edit: and even if there's only one application, yet for whatever reason it doesn't get taken down despite being broken, revoking the key now still prevents against a MITM later.
If you're serving web traffic and API traffic on the same domain, which many services are, then not listening on port 80 may not be possible. Even if you do use a different domain, if you're behind a CDN then you probably can't avoid an open port 80. I do keep port 80 closed for those of my services I can do so for, but I don't have anything else that needs port 80 to be open on those IPs.
I think Stack Exchange's solution is probably the right one in that case -- and hopefully anyone who hits it will do so with dev keys rather than in production.
I always thought it was bad practice to use the same domain for API and non-API traffic. In the browser there'll be a ton of wasted context (cookies) attached to the API request that isn't needed.
So it's better to have "api.example.com" and "www.example.com" kept separate, rather than using "www.example.com/api/", where API requests will have inflated headers.
What matters is that there is nothing listening on port 80 on the same IP address. That may be hard to control if you are using an environment with shared ingress.
The point in the article is that APIs are used by developers, not end users.
Returning an error (and/or blocking the port entirely) allows the developer to understand he is using the wrong protocol and fix it.
In this scenario, the end user never actually performs an http request, because the protocol was fixed by the service developer.
Well, nothing you do on the server side will protect a client that is willing to use http: when a MITM is present, the client can still connect to the MITM, give away its credentials, and your server won't know.
Still, I agree that this is a very good way to teach your users not to start with http:! And that this is what one should do.
Wouldn't this open the door to revoking random API keys sent maliciously?
If a malicious party has access to the API key, it should be revoked regardless
Of course. But I think the poster above was referring to just posting random keys to the server.
In other words I don't have your key, or any key, but I have "all of them".
The correct response to this though is that "there are lots of keys, and valid keys are sparse."
In other words the number of valid keys that could be invalidated in this way is massively smaller than the list of invalid keys. Think trillions of trillions to 1.
Which, like, if posting random keys has any realistic plausibility of collision, malicious revoking of keys is the least of your concerns.
People could just hit important data fetch endpoints with random keys, until they find one that’s good, and then have a compromised account.
Good point. Presented that way I am seeing more positives to their policy; in particular, if a vulnerability were unearthed by the invalidation quirk, that's a far better way to find out than any other.
It's wrong that clients are authenticated with just a randomly generated username. But it's also what everyone does.
This is just a run-of-the-mill DoS attack, with the astronomically unlikely jackpot of additionally invalidating a random unknown user's API key when you get a hit.
Astronomically is an understatement. If they made 1000 requests per second they might have a 1% chance of revoking a key before the heat death of the universe.
Cracking hashes requires massive parallel processing, something you can't do if you're rate-limited by the API.
If the API key is a UUID or similar in complexity, they'd have to send 5.3 undecillion API keys to make sure all of them were invalidated.
So yes, it would open the door to revoking random API keys, but that's not a bad thing; when using an API key, you should be ready to rotate it at any point for any reason.
This sounds like a great way to cause Denial-of-Service attacks.
You need to actually send them the API key, so not really.
To do that you need to guess or steal the API keys.
Denial of service by blocking API keys is really your happy case when someone malevolent has your API keys.
Careful, someone might use that as an API! https://xkcd.com/1172/
I mean... it is a lot easier to do than to program a procedure to revoke an API key.
Exactly. Easier but actually terrible for security since a MitM can intercept and use the key (and never actually revoke it)
The client-side library should disable HTTP by default to ensure that raw data never leaves the local environment, thereby avoiding any leakage.
What about things like unencrypted websockets? Or raw TCP/UDP connections?
It should, but additional server-side mitigations are good for defense in depth. There may be people using a different client-side library, maybe because they use a different programming language.
Technically correct.
Best kind of correct.
I love this.
Used to? Did they stop? Did they give a reason why?