Very clever way of doing this (though I have a feeling you could probably enforce pinning even in sandboxed mode). I remember trying to MitM Snapchat back in college and couldn't figure it out as they were also using cert pinning.
I'm really glad this is possible, because it's important for dispelling conspiracy theories.
Plenty of people are convinced that Facebook's apps spy on them through their microphone and use that to show them targeted ads.
The easiest way to disprove this is to monitor the traffic between the apps and Facebook's servers... but certificate pinning prevents this!
(Not that anyone who believes this can ever be talked out of it, see https://simonwillison.net/2023/Dec/14/ai-trust-crisis/#faceb... - but it's nice to know that we can keep tabs on this kind of thing anyway)
Unfortunately while this thing helps it doesn't actually conclusively stop any speculation. If I wanted to spy on you via app, I would encrypt the data inside the HTTPS stream and only decrypt it on my server.
Pretty sure anything you encrypt client side can be decrypted client side, as long as you have control over the binary and OS/hardware. It's just a matter of effort.
Not the case with asymmetric encryption, you could encrypt with a public key and only the server's private key would be able to decrypt it. Not even the client could.
Asymmetric encryption is very computationally expensive - there's a reason that it's typically only feasible to use for signing a hash or as part of a key exchange to agree upon a shared symmetric key.
Envelope encryption works for that - client generates a random symmetric key, encrypts the data symmetrically, then asymmetrically encrypts just the key (which is then thrown away on the client). Both the symmetrically encrypted body and asymmetrically encrypted key are sent.
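To make the envelope idea concrete, here is a rough sketch using only the Python standard library. Everything in it is an assumption for illustration: the function names `client_seal` and `server_open` are made up, the "RSA" is unpadded textbook RSA over two known Mersenne primes, and the symmetric cipher is a SHA-256 keystream XOR. None of that is secure; a real app would use a vetted crypto library. The structure is the point: bulk data is encrypted symmetrically, and only the small key goes through the asymmetric step.

```python
import hashlib
import secrets

# Toy RSA key pair built from two known Mersenne primes. Illustration only:
# unpadded RSA with publicly known primes is NOT secure.
P = 2**521 - 1
Q = 2**607 - 1
N = P * Q                               # public modulus
E = 65537                               # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))       # private exponent, server side only


def keystream_xor(key, data):
    """Toy symmetric cipher: XOR data with a SHA-256 counter-mode keystream."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        block = hashlib.sha256(key + counter).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


def client_seal(plaintext):
    """Client side: needs only the public key (N, E)."""
    data_key = secrets.token_bytes(32)              # fresh random symmetric key
    body = keystream_xor(data_key, plaintext)       # bulk data, symmetric
    wrapped_key = pow(int.from_bytes(data_key, "big"), E, N)  # key only, asymmetric
    return wrapped_key, body                        # data_key is discarded on return


def server_open(wrapped_key, body):
    """Server side: needs the private exponent D to unwrap the data key."""
    data_key = pow(wrapped_key, D, N).to_bytes(32, "big")
    return keystream_xor(data_key, body)
```

Once `client_seal` returns, the client holds nothing that can decrypt `body` again, which is the property being described. It does nothing about someone reading `plaintext` before the call.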
You just modify the client to leak the data before it's encrypted symmetrically. Keys don't matter at that point.
I think the person you're replying to perhaps meant that if you have total control of the hardware and the binary you can pull the value prior to being sent to the encrypt function.
Fair! In that case, yes, you totally have access to the payload before it's encrypted.
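A minimal sketch of that "grab it before the encrypt call" idea, in plain Python rather than against a real binary. The `encrypt` function here is a hypothetical stand-in for whatever routine the app uses, and rebinding the name plays the role that a Frida hook or a binary patch would play on a native app.

```python
import functools

captured = []  # plaintexts observed by the hook


def encrypt(payload):
    """Hypothetical stand-in for the app's real encryption routine."""
    return bytes(b ^ 0x5A for b in payload)


def log_plaintext(func):
    """Wrap func so its argument is recorded before encryption runs."""
    @functools.wraps(func)
    def wrapper(payload):
        captured.append(payload)    # the leak happens here; no key is involved
        return func(payload)
    return wrapper


# "Modifying the client": rebind the name the rest of the app calls.
encrypt = log_plaintext(encrypt)

ciphertext = encrypt(b"telemetry=example")
```

After the call, `captured` holds the original payload regardless of which keys or algorithms the wrapped function uses.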
They only need the server's public key to encrypt it client side. But if all you want is to see if they're spying on you, you could go one step above and see if they're calling system APIs to your mic/camera/keyboard, instead of observing the network activities.
And if spying works without using the microphone or whatever, the alternative is almost worse - it means Meta et al. have such a good virtual “mind reading” Skinner model of you that they have a good hunch of what you will talk and think about. If we are not there yet, it’s only a matter of time with enough machine learning…
This is always what has screwed with me the most about this AdTech thought experiment: Both likelihoods (listening-in vs astute prediction models) are equally bad; and whoever downplays either as "business as usual" or "humans are predictable", respectively, ought to be called out for it.
It's NOT good when you listen to conversations without explicit (or implied, for that matter) consent, just as it's equally NOT good to exploit human predictive models to such a precise degree for profit. You SHOULDN'T be complicit in these practices, and saying that it's "just what it is" makes you one more person in the arena throwing their hands up to allow it. Attempting change is ALWAYS better than apathy or complacency - before every interaction exists to become a transaction.
My hypothesis is that it's not listening nor is it predicting based on the individual, instead it's reacting to web surfing behaviors of your associates.
For example, you and your partner use the same wifi at home a lot, and you both visit a close friend's house and use their wifi every time you're there. Services that you use in both places (e.g. Facebook, Google) now have a graph where there's a very strong link between you and your partner, and a weaker but still important link between the two residential IP addresses.
Now you're at home, just had dinner with your partner, and you say you are considering buying a guitar. An hour later you open your phone and see ads for guitars. "Honey, did you search for guitars already?" "No, why?" "Oh no! It heard me! Or it knows me too well! Uninstall all your apps!"
No, what happened is that last night you were at that friend's house, told your friend about your guitar desires, and all morning that friend has been doing a bit of market research themselves, perhaps to see what you're on about and maybe consider getting it for you. The graph connects the dots, and advertisers suspect that perhaps guitar ads should go not only to your friend, but also to you (by that weak association) just in case you might be the one who buys the guitar.
The uncanniness is a function of you having no idea that your friend was building up this slight likelihood that you're about to buy a guitar, combined with even a very weak signal poking out above the noise given no other recent signals.
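A toy sketch of that hypothesis, purely to show the mechanics. The link weights, the one-hop propagation rule, and all names are invented for illustration; this is not a description of how any real ad platform scores people.

```python
from collections import defaultdict

# Hypothetical association strengths inferred from shared networks (0..1).
links = {
    ("you", "partner"): 0.9,   # same home wifi, all the time
    ("you", "friend"): 0.3,    # occasional visits
    ("partner", "friend"): 0.3,
}


def neighbours(person):
    """Yield (other_person, weight) for every link touching person."""
    for (a, b), weight in links.items():
        if a == person:
            yield b, weight
        elif b == person:
            yield a, weight


def propagate(signals):
    """Spread each person's interest signals one hop across the graph."""
    scores = defaultdict(lambda: defaultdict(float))
    for person, topics in signals.items():
        for topic, strength in topics.items():
            scores[person][topic] += strength
            for other, weight in neighbours(person):
                scores[other][topic] += strength * weight
    return scores


# Only the friend searched for guitars; you never did.
scores = propagate({"friend": {"guitars": 1.0}})
```

Here `scores["you"]["guitars"]` ends up at 0.3 with no microphone involved: a weak signal, but with nothing else recent competing against it, it can still be the top candidate.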
My theory is that we're all just WAY less interesting than we think we are.
Male, 40+? A bit more likely than the average human to have a mini mid-life crisis and decide to buy an electric guitar.
These platforms suggest SO many ads to us that even if 99% of the suggestions are total junk that we ignore without even registering, the 1% that represent a lucky roll of the dice still really stick in our memories.
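The arithmetic behind that is worth a quick look. Assuming, for the sake of argument, that each ad has an independent 1% chance of matching something you recently talked about (both the 1% and the independence are assumptions), the chance of at least one "uncanny" hit grows quickly with the number of ads seen:

```python
def chance_of_uncanny_hit(p_hit, ads_seen):
    """Probability that at least one of ads_seen independent ads lands."""
    return 1 - (1 - p_hit) ** ads_seen


print(round(chance_of_uncanny_hit(0.01, 100), 3))  # prints 0.634
```

So under those assumptions, roughly 100 ads are enough to make a coincidence more likely than not.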
If you can come up with this heuristic, you can bet your ass that some ML model can come up with something much better.
That's still dystopian and STILL exists for the sole purpose of interactions to finalize as a transaction. It's not a good thing.
How come applications from such big players are not completely obfuscated and have all kinds of other protections in them to e.g. deny modified binaries from running?
Obfuscation has costs, and certificate pinning is more to make it more difficult for user-adversarial MITM than to prevent reverse engineering. Although the impact on reverse engineering is more than a happy accident.
At the end of the day, your code runs on user machines, and they can observe what the code does, so it's always possible to deobfuscate, and if one person does it and shares their results, it becomes very easy to replicate. That doesn't mean obfuscation is useless, but you shouldn't put too much time into it.
Some app builders turn it into an art though. Like TikTok. They're infamous for it.
I wonder if this is a cultural line of defense against server security...
Because they don't care.
Because doing so is pointless for a mobile/front-end app. The attacker has physical access to the device; there's no way to stop them at this point. The only thing you can do is make the process more annoying in hopes that they will get frustrated and give up.
It's probably a matter of priorities, as well as cost v. benefit.
Obfuscation would've had very little effect on the outcome of this experiment, but might've changed the approach to involve dynamic instrumentation a little more. The most effective obfuscation I've seen is VM obfuscation, but that presents a significant performance impact. Obfuscation would also make legitimate debugging harder.
Preventing modified binaries from running is done at the system level. It could feasibly be implemented at the application level too, and commonly is, but the check itself can be bypassed, or the modifications can simply be applied after the security checks have completed (once again, through dynamic instrumentation libraries like Frida).
Engaging in a cat-and-mouse game with reverse engineers probably isn't in Meta's best interest.
It's probably there just to prevent malware or a company proxy from easily intercepting user messages, etc. Anything else is a happy accident.
How come every bank isn't secured like Fort Knox?
As the person who made this call originally at Facebook for the apps--it's not worth it. Any sufficiently advanced or motivated person/group/government would break through eventually...such is the nature of shipping client binaries. You can spend a ton of time and money trying to prevent it (for example Pinterest once was trying to ship their own custom language + vm, which I advised against) OR...just accept that your client code is compromised by default, put logic on the server, and move on with your life.
Cert pinning is basically free and is sort of a "you must be this tall to ride the ride" thing--not secure, but keeps the riff raff out.
Ha, I found myself going down a similar route and threw in the towel once I was trying to decompile/edit/recompile. This is dedication, would love to know the hours involved. I set myself a cutoff and stuck to it.
This was initially an internal post at Texts.com that we decided to share, and I scrapped mention of the fact I had tried the exact same approach a few weeks prior and reached my time-box as well.
I initially spent two hours trying to modify different instructions, and then gave up. I saw another blog post written by a reverse engineer by the name of "Hassan Mostafa" (aka cyclon3) that previously succeeded in the same approach (taking Hopper Disassembler to Instagram on iOS) and I was inspired to try again that night, but I had no luck. I even found and attempted to modify the same instructions.
I decided to call it quits, and then a few weeks later with a bit of a grudge, I spontaneously tried again and I had it done in about 30 minutes after finding the sandbox function.
Ok, that makes sense! Sometimes when you read a blog post that is well written and cogent it makes it feel like the author did it in 20 min!
If I end up in the same arena I think I’ll look for debugging code next. I love certificate pinning as a user, but as a forensic analyst I fucking loathe it.
Even as a user I don’t think there’s a good reason to love cert pinning. If you’re going up against adversaries that can compromise web PKI, they also probably have some other exploits up their sleeve to pwn you.
Cert pinning pretty much serves to protect companies from people reversing their protocols and little else imo.
It prevents attack vectors that involve attacker-owned certificate authorities as well as compromised certificate authorities from exposing user-data.
https://sslmate.com/resources/certificate_authority_failures
I agree, it feels sort of like "we have a walled garden, don't anybody else use it because our stuff is super secret and secure, trust us(tm)". It's a layer of obscurity for their "security". In reality, it's the app on a user's PC that holds both this "secrecy" and the "handshake" to open it.
As a westerner I can only speak for others a little bit, but this is a very western perspective. Even Kazakhstan has been caught doing sketchy stuff with their CA.
If it’s managed well, certificate pinning takes the web PKI out of the implicit trust envelope for your app.
From a pure security perspective, why trust someone you don’t have to trust? The web PKI CA bundle is great for cases where it’s hard to have a unique trust root for your application - like you’re running in a browser with no privileges - but if you’re distributing code then you’ve already solved that problem.
Managed well, it should be completely transparent to users as well. Managed poorly, it can be catastrophic (your app is dead until users upgrade it).
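For anyone who hasn't seen pinning spelled out, here is a minimal sketch with Python's standard `ssl` module. It pins the SHA-256 of the whole leaf certificate, which is an assumption made for brevity (many apps pin the public key instead), and the function names are made up. Accepting a set of pins rather than a single one is what lets you ship a backup pin and rotate certificates without bricking old builds.

```python
import hashlib
import socket
import ssl


def fingerprint(der_cert):
    """SHA-256 fingerprint of a DER-encoded certificate, as lowercase hex."""
    return hashlib.sha256(der_cert).hexdigest()


def is_pinned(der_cert, pins):
    """True if the certificate's fingerprint is one of the pinned values."""
    return fingerprint(der_cert) in {p.lower() for p in pins}


def connect_pinned(host, pins, port=443):
    """Open a TLS connection and reject it unless the leaf cert is pinned."""
    ctx = ssl.create_default_context()      # normal web PKI checks still run
    sock = ctx.wrap_socket(
        socket.create_connection((host, port)), server_hostname=host
    )
    der = sock.getpeercert(binary_form=True)
    if not is_pinned(der, pins):
        sock.close()
        raise ssl.SSLError("server certificate does not match any pinned fingerprint")
    return sock
```

With this in place, a certificate issued by any other CA, including one the user installed for a proxy, fails the `is_pinned` check even though it passes ordinary validation.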
Useless trivia: I believe Hassan Mostafa is the referee from the Quidditch World Cup in the fourth Harry Potter book. Obscure characters are my favorite SNs and dummy data.
There's no point in implementing cert pinning if you don't also have integrity checking... Being able to alter bytes in the physical file and running it should not be possible (without another bypass).
Eh, clearly it raises the barrier to entry significantly. You’re never safe from a truly determined adversary, but you can keep out the riff raff.
Perhaps I'm a bit harsh... but my suggestion to Fortune 500 tech companies remains: implement integrity validation as well, otherwise all it takes is editing 2 bytes to bypass your SSL pinning.
Right, but the threat model of SSL pinning is an attacker that has compromised the CA certificate store. The user editing a binary on disk is not a security problem.
Cool idea! Now it takes 2 bytes to bypass integrity validation, and 2 bytes to bypass cert pinning. (4 bytes in total)
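For reference, the kind of integrity check being argued about can be as small as this sketch (the function names and the idea of comparing against a shipped SHA-256 are assumptions, not anything from Meta's apps):

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """SHA-256 of the file's current contents, as hex."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def integrity_ok(path, expected_hex):
    """True if the file on disk still matches the digest recorded at build time."""
    return file_digest(path) == expected_hex
```

Any single-byte patch changes the digest, so the check does catch the edit. But the `if not integrity_ok(...)` in the app compiles down to one more conditional branch sitting in the same patchable binary, which is the point of the "4 bytes in total" reply.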
Cert pinning protects against compromised certificate authorities. There are hundreds of trusted root certificates in most operating system stores, so one of them gets breached every once in a while.
Integrity checking is user-hostile, but certificate pinning can be good for users.
I don't know which users integrity checking the executable would be hostile against. But, I see your point that perhaps their reason for cert pinning is to defend against compromised CAs. It does fit the narrative better with their lack of obfuscation and other layers of defense on their app.
Stopping users from modifying the software they run is user hostile.
I remember the first time I ever cracked an app, I was so convinced I would fail, but it turns out that finding these sorts of easy-to-modify JE/JNE spots is easier than it seems. Even if you pick wrong you can just revert to the original file and try a different spot.
I imagine this would be something that AI will be able to do easily in an automated fashion. You can literally just try flipping the JE/JNE in a bunch of candidate spots, launching the app, and seeing if the nag screen comes up.
Not really an AI problem though: that's just fuzzing. If the fail case is well defined then really all you need to do is prune the candidates down.
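The candidate-generation half of that is simple enough to sketch. On x86 the short-form JE/JZ and JNE/JNZ opcodes are `0x74` and `0x75`, so inverting one is a single-bit flip. The scan below is deliberately naive and the function names are made up: it looks at raw bytes without disassembling, so it will also match `0x74`/`0x75` bytes that are operands or data, and it ignores the near-jump forms (`0F 84` / `0F 85`). A real tool would work from a disassembly.

```python
JE_SHORT, JNE_SHORT = 0x74, 0x75  # x86 short-form JE/JZ and JNE/JNZ opcodes


def candidate_offsets(code):
    """Offsets of bytes that look like short conditional jumps (naive scan)."""
    return [i for i, b in enumerate(code) if b in (JE_SHORT, JNE_SHORT)]


def flip_jump(code, offset):
    """Return a patched copy of code with the jump at offset inverted."""
    if code[offset] not in (JE_SHORT, JNE_SHORT):
        raise ValueError("no short JE/JNE opcode at offset %d" % offset)
    patched = bytearray(code)
    patched[offset] ^= 0x01          # 0x74 <-> 0x75
    return bytes(patched)


def candidates(code):
    """Yield (offset, patched_copy) pairs; the original bytes are never modified."""
    for offset in candidate_offsets(code):
        yield offset, flip_jump(code, offset)


# test eax, eax ; je +5 ; mov eax, 1 ; ret
sample = bytes.fromhex("85c07405b801000000c3")
patches = list(candidates(sample))
```

Each patched copy would then be written out, launched, and checked against the well-defined fail case (say, whether the nag screen appears), which is the pruning loop described above.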
Now if AI could crack something like Denuvo in a 0-shot way...
I will say that ChatGPT did a decent job of explaining non-documented instructions in prior attempts of binary patching.
Now if I could feed an AI a binary and have it tell me, in very broad strokes, what is happening where, that'd be a game changer, and I'd say that's quite attainable with a high-context window LLM as they seemingly understand hex-formatted byte-code quite well.
Not really an AI problem though: that's just fuzzing
everything is AI these days apparently... even LLMs
Had tools like that already in the 90s, no AI, just brute force.
Seems Meta’s (or at least Messenger’s) RE defense is quite lenient here. Should be trivial for them to drop IsUsingSandbox() from prod builds entirely, that’s before we get into advanced obfuscation techniques.
Meta's apps come with entire debug menus in production builds. The string that author found is likely part of such a menu.
Their Android application in particular allows the participation in a developer program which allows access to one of these menus. Not available on macOS and iOS unfortunately!
Years ago I did manage to get into the impressively huge debug menu in the iOS Messenger app on a jailbroken device. So they do exist there, or at least did back then.
At least when I worked there, protecting against reverse engineering was never a goal. Cert pinning is to make it harder for an adversary to tamper, not to make it harder for the user to.
You don't really need to do that if you want to intercept Meta apps traffic.
https://www.facebook.com/whitehat/bugbounty-education/261571...
This only works on Android, we had no interest in intercepting the Android application.
Out of interest, why not? I've needed to reverse engineer APIs in the past and using Android apps was always much easier so we always did that when the APIs were available.
Because the Messenger application on desktop is much closer to the usage model of the Texts.com client, and we want to replicate the desktop client as closely as possible. It can be assumed there are going to be properties that are unique to the desktop client and vice versa.
I am curious about the legality of this. I guess I assumed that doing this type of thing would technically be a DMCA-type breach? So this makes me wonder if my assumption is wrong. How does this work legally?
What does copyright have to do with this?
I think the implication is that bypassing cert pinning could be considered a violation of the anti-circumvention provisions in the DMCA and WIPO Copyright Treaty, because it results in decryption of copyrighted content without the permission of the copyright owner.
IANAL, but in the US, at least, I think the exemptions for good-faith security research[1] would apply. Maybe even the reverse-engineering for interoperability language in the DMCA itself[2].
[1] https://www.federalregister.gov/documents/2015/10/28/2015-27...
[2] https://www.govinfo.gov/content/pkg/PLAW-105publ304/pdf/PLAW...
It's typically against terms of service to decompile or reverse engineer applications you download in this way, but it's also typically against terms of service to use their services from unofficial clients, so I think they're already way past T&Cs.
Does anyone know WHERE the HELL Facebook stores tracking data on iOS?
It shows my previous account even after I delete the app, clear the cache and Keychain, disable iCloud Drive, AND sign out of iCloud??
Why can't I see where this data is stored? Same for TikTok.
WHY does Apple, parading around as a pompous paragon of privacy, even allow this shit?
Developers can store items in the keychain on your device/iCloud account that are only visible to the apps made by that developer (and not you). It is a feature that it works this way, and this whole concept is fucking insane to me.
So how can the user delete it without going through the app or wiping the entire phone?
What else is being stored that we aren’t even aware of?
On iPhones, there are three ways to get rid of an app's keychain data: 1. wipe the phone (and you must not restore backups); 2. jailbreak the phone; 3. the app can wipe its own keychain (but apps generally don’t expose this feature).
Would a runtime binary checksum have helped to complicate such modification? Isn’t this SOP for mobile apps? Do the iOS or Android SDKs provide such facilities? Presumably associated with the official release process and enforced on their respective non-jailbroken platforms?
Basic questions, admittedly. Just noticed that the final solution was to simply modify a few bytes of the binary, which seemed preventable.
macOS (desktop), not iOS (mobile).
Thanks for the correction. Same inquiry for macOS for signed apps.
You have to re-sign the binary when you modify it anyway, which achieves the same thing.
On non-jailbroken platforms you generally do this with a developer certificate.
Okay, why not just use the web interface and intercept that?
Often the web interface will be using a different API or be missing characteristics that are being investigated.
With Meta’s Messenger application for macOS being so close to the Texts.com model—that being a standalone desktop application—Batuhan İçöz who is leading the Meta platform project at Texts.com thought we could gain some valuable insight by analyzing it.
It seems that with eBPF you can read data before TLS encryption: Debugging with eBPF Part 3: Tracing SSL/TLS connections https://blog.px.dev/ebpf-openssl-tracing/
Side note: this wouldn't work with Rust programs that statically link to `rustls`, the most popular Rust TLS library.
That's handy, and you can almost certainly hook the TLS send/receive functions in other ways, like with Frida, but being able to bypass pinning instead means that the researcher can route the traffic through existing tools like Burp Suite or mitmproxy.
Routing real app traffic through an intercepting proxy can be a real time-saver depending on what the researcher is trying to do. E.g. if they want to automatically tamper with a parameter in a request that doesn't happen until after some kind of authentication/session setup, it's much faster to let the app do all of that and configure the proxy to just make the one change, versus having to write a whole client that does all of the initial steps and then makes the modified request, or writing an eBPF filter that makes the changes the researcher is interested in.
What proxy tool are you using in that write up? Does it route all application traffic through it when running? Sorry if these are dumb questions.
Good question, Proxyman is the one I'm using in the writeup. It does route all application traffic through it on macOS, and you can proxy iOS devices as well by installing a self-signed certificate on the device and connecting it through the proxy.
I can see an argument that software's communication over the network must be inspectable by the owner of the hardware.
I don't know why but your comment reminded me of learning about $SSLKEYLOGFILE and the ability to retroactively decrypt traffic captured in Wireshark: https://everything.curl.dev/usingcurl/tls/sslkeylogfile (I was expecting there to be an entry on MDN since my first contact with that env-var was from Mozilla's TLS library but no luck)
I guess it's the fact that in my mental model any supporting library doesn't have to be modified to allow viewing the traffic, and no cert-pinning breaks are required.
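For a concrete taste of the key-log approach, Python's own `ssl` module exposes it directly (Python 3.8+ built against OpenSSL 1.1.1 or newer). This is only a sketch of the mechanism on a client you control; the file name is arbitrary, and it obviously does nothing for a third-party app that doesn't offer such a switch.

```python
import os
import ssl
import tempfile

# Arbitrary location for the key log; Wireshark is later pointed at this file.
keylog_path = os.path.join(tempfile.mkdtemp(), "tls-keys.log")

ctx = ssl.create_default_context()
ctx.keylog_filename = keylog_path   # TLS secrets for connections using ctx are appended here
```

Connections made with `ctx` then log their session secrets in the NSS key log format, and Wireshark can use that file to decrypt a capture of the same traffic after the fact. As far as I know, `ssl.create_default_context()` also picks up the `SSLKEYLOGFILE` environment variable on its own when it is set.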
This made me think back of the days of +Orc [1]. I believe a lot of knowledge common back then, like how to find and nop out an undesired branch, has been lost. Which is fair, there’s way more other tech to learn nowadays.
I always feel nostalgic when I see references to +Orc, or Fravia (RIP).
But I think there's still a lot of people doing the NOP-patching thing, albeit with more complexity. There continue to be people breaking DRM, and investigating random [mobile] apps with hex-editors & etc.
It's harder to get started these days as programs are more complex, but at the same time the knowledge required is more accessible.
Good reminder that no app is ever truly "closed source"; after all, there is still the compiled machine code. People used to hand code in this language.
Though I'm personally glad we no longer have to :) it's still way more difficult and compilers can really obfuscate the code (if it isn't already by design)
I did something similar for Instagram on Android a few years ago. The usual methods for bypassing certificate pinning didn't work on Instagram; they were statically linking OpenSSL into a shared library called libcoldstart.so. I spent some time reading the OpenSSL documentation and ended up installing a hook on the function that configured the certificate verification.
In case you are curious. I used Frida for installing hooks to native functions.
Fundamentally, it’s hard to enforce certificate pinning if the user can modify the binary. Even if sandbox mode used certificate pinning, there would likely be some other way of removing the pinned cert checks.
This is a large part of Apple's control/Secure Enclave decisions. These decisions can seem arbitrary and anti-competitive from the outside.
This seems unrelated?
I saw it as related through an entity's ability to control certificates on platforms with zero trust.
Apple designs the platform, though. Seems like a different model to me?
Exactly, I was pointing out why some may choose certain models. I would say that building a platform takes many considerations and the choices made led to this outcome. Apple made different choices and is often vilified for trying to maintain these protections.
Apple has been slowly making progress on opening up their platform. The next 3 years will introduce a new landscape for apps. People will still be complaining.
I wouldn't call it anti-competitive. Treacherous is a more apt description.
https://www.gnu.org/philosophy/can-you-trust.html
Apple is a west coast company and they like their west coast BSD licenses.
Yes, but it's significantly harder than flipping a bit. There's also clever ways of countering this (e.g. checksumming the public key). Of course, even this is technically hackable, but extremely time-consuming in practice. Imagine getting the public key and adding a bunch (and by a bunch, I mean like 16k) of random ops throughout the control flow that crash the app if any random byte of the key is wrong. For extra fun, offset the byte by the instruction pointer. Good luck debugging that.
It prevents very basic RE / MITM.
I tried the same thing, and while I managed to patch the application and intercept the requests, I gave up when trying to RE the shared object responsible for request signing. I couldn't even find the entry point. For a relatively small social media app they had insane security already back in 2015.
Snapchat and TikTok both boast pretty gnarly RE-prevention measures.
For the uninitiated: TikTok is known to send and receive telemetry packages through headers in other requests (IIRC), and employs the use of a virtual machine(!) to execute encrypted client code.
Any source for that? What does "other" requests mean? Other than what? I doubt it could modify the headers of other apps.
Snapchat’s founding principle and only differentiator from day one has been untrusted client security. There were way too many years where the general public believed that a Snapchat could not be saved. I give huge credit to Snapchat for accidentally teaching the public that if human eyeballs can see something, it can be recorded forever. Now that is taken for granted, even last week’s Saturday Night Live TV sketch referenced what a fundamentally flawed security model Snapchat has.
What? That wasn't a principle of theirs. They explicitly exclude "screenshot detection avoidance" from their bug bounty policy: https://hackerone.com/snapchat?type=team . They always have. As far as they're concerned, that's not a security issue.
BBP policies don’t align with anything except “we cba paying for that”
Snap has always had pretty beefy client security. Since, of course, a hacked client breaks the entire premise of their app.
Ha, same. I think I was eventually trying to hook into kernel-level functions and do it that way (I was using the Android client) but couldn't get far there either, though I think it's technically doable. IIRC, they were using some kind of vtable patching protection around kernel functions to ensure integrity.
I built anti-cheat software (and hardware) before, and it felt like anti-cheat level security. I had an axe to grind with Snapchat, as they rejected me after the first interview round :P
amusing how many of us have tried this with mobile apps to be thwarted
….until now
You're right, it probably could have been implemented by only assigning the output of the sandbox flag function in the consumer function to true, but in this case it worked fine. :)