> leaked 1.1 TiB (1.2 TB)
I don't know why but I find this funny.
Again highlighting the unrecognized liability companies are taking on by logging every scrap of internal communication, no matter how informal or ill-conceived it may be.
It may be a requirement or law depending on where the company does business.
For example, the financial companies I used to work for had a “standard practice” of archiving all e-mails and internal chats for 7-8 years. Not sure if phone calls on company equipment were recorded or retained though (may be a YMMV case).
This is why I separate work and personal assets. I never do work on personal devices nor do I use work devices for personal activities (e.g., social media, e-commerce, shit posting). Also if I’m shit talking the boss’s boss. It’s never using work devices.
Have been asked a few times to use personal devices for work but absolutely refused. I would be asked to install their invasive spyware and root kits so they can abide by their draconian corporate policies. So far, they haven’t forced me; otherwise I would have quit those companies long ago.
Requirement to log doesn't mean the record has to be in online storage though. It could easily get rotated into cold storage every month with only a unique offline password granting you access.
I think the word "easily" is carrying a lot of weight here -- for a company the size of Disney, keeping all internal communication records in secure offline storage sounds pretty hard from both a technical and operational standpoint. Certainly doable, but I doubt it'd ever happen unless it were required by law.
There are various levels of offline. For example you can have an S3 bucket with write-only access. No, it's not perfectly offline. But it's isolated from both vulnerabilities and from hacked employees, which covers most common types of breaches. You can get 99% of the benefits of offline storage without having an actual physical location with tapes.
what about hacked employees' aws accounts?
Employees shouldn't have default access to those credentials. This applies to audit/backup/account management/billing privileges. You can have very dedicated roles with lots of restrictions for those specific things.
Unless they're highly privileged enough to turn on read access to the bucket, you're fine. Thus, you can contain most breaches of credentials.
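For anyone wondering what "write-only" could look like in practice, here is a minimal sketch of an IAM-style policy document built in Python. The bucket name is made up for illustration, and this is only one way to express the idea, not a description of anyone's actual setup. Note that `s3:PutObject` alone can still overwrite existing keys, so a real archive would probably also want versioning or Object Lock.

```python
import json

# Hypothetical bucket, used only for illustration.
BUCKET_ARN = "arn:aws:s3:::example-chat-archive"


def write_only_policy(bucket_arn: str) -> dict:
    """Build a policy document that can add objects but not read or delete them."""
    objects = f"{bucket_arn}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # The archiving role may upload new records.
                "Sid": "AllowWrite",
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": objects,
            },
            {
                # An explicit deny overrides any allow granted elsewhere.
                "Sid": "DenyReadAndDelete",
                "Effect": "Deny",
                "Action": ["s3:GetObject", "s3:DeleteObject"],
                "Resource": objects,
            },
        ],
    }


policy = write_only_policy(BUCKET_ARN)
print(json.dumps(policy, indent=2))
```

Anyone who can edit this policy can of course undo it, which is the "highly privileged enough to turn on read access" caveat.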
If the organisation doesn't use SSO coupled with MFA and the enforcement of the least amount of privileges principle on a cloud platform, then they have no right to complain about security breaches.
I guarantee you that large-cap, highly scrutinized public companies comply with much harder regulations and internal controls than this.
Have you ever worked for a company larger than a 3 person startup? When you work at a company with hundreds of thousands of people, how can you “easily” move data into cold storage without destroying employee productivity? Don’t be hand wavy. Be specific about your solution.
Do you move all old files, emails, internal knowledge articles, code repos, chat messages, etc etc to cold storage? You can’t search for anything you’re looking for in OneDrive, Outlook or Slack if the data is in cold storage so are you going to develop custom applications to support the same search filters as each original source application, but which searches your cold storage?
Once a user finds something in cold storage, you have to load it back into the SaaS app to display it. Why? A variation of the following applies to all enterprise data: A Slack message has tons of important metadata tied to it besides the text content. For example Slack user Id of the message sender, channel ID it was in, file ids of any file attachments in the message, and so on. How do you load that back into Slack from your cold storage?
You can put any data in cold storage. In the real world, that often makes the data almost useless for employees.
> Requirement to log doesn't mean the record has to be in online
Banks are (or were) required, in America, to use write-once offline media for records [1].
Does Cisco AnyConnect fall into the invasive spyware root kit category?
A corporate VPN on a personal device definitely does.
or you buy a shitty phone for the company crapware and leave it at your desk when you go home.
The company should pay for that phone
> for 7-8 years. Not sure if phone calls on company equipment were recorded or retained though
For a front office role: archiving of emails, internal chats, external chats, phone calls / squawk lines, browser history, PC screenshots every X interval, same for work laptop at home. Office CCTV, room microphones, mobile phone calls + text messages + device monitoring to ensure no other apps.
> I never do work on personal devices nor do I use work devices for personal activities
This is now the way.
> Also if I’m shit talking the boss’s boss. It’s never using work devices.
It's wild that some don't get this. A couple years back some employees at my company were fired for doing just this over the same IM system we used to coordinate everything else.
In a very large company like Disney there are often legal data retention requirements from ongoing litigation, which means Corporate Slack might be more complicated than the AT&T customer data breach.
Retention doesn't require it to be online.
A tape sitting in Iron Mountain would have a smaller attack surface and be compliant.
Potentially this breach will allow litigation that was financially infeasible for some people.
As a former WDIG employee I am not even suggesting anything concrete or that I have any knowledge of unlawful activity.
But as someone who also worked in the electronic evidence discovery field, the cost of blind discovery has a chilling effect on lawsuits.
Now that targeted discovery is possible, it will be within the budgets of more potential cases.
The forever retention was a marketing differentiator for Slack, so this type of event was a risk you have to accept.
But all about convenience and not compliance.
With how many people in this thread don't see the problem with keeping all data always hot... we are fucked.
Hot data is not such a problem with hot security. It’s when your security freezes that it becomes an issue.
But why take the risk? Cold storage is cheaper, more secure, more friendly to a “we have the data but it will take time” defense.
Proper logging for retention would surely involve a point where you encrypt the data with a temporary key and then encrypt that key with the public key and only your top brass would have access to the HSM that could decrypt that blob..
Why keep it hot? Physical security is a mostly solved issue for backups unless you are targeted by nation-state-level actors.
Like Sony..
With how big and aggressive Disney is I'd expect it to be under ongoing litigation 24/7/365.
Setting (formally or informally) corporate policies which destroy or even prevent the creation of a record of internal communications, regardless of how formal those communications may be, may very well be illegal depending on a variety of factors.
The shining poster boy for this would be Google, who told staff to disable logging when discussing sensitive topics:
https://www.techspot.com/news/102874-doj-alleges-google-dest...
They also told employees to never use certain keywords, so that records of conversations would not be found by legal teams using search tools, but also they wouldn't be shown talking like monopolists:
https://arstechnica.com/tech-policy/2023/09/google-hid-evide...
Right, but not using Slack at all would not violate any laws. At least today, conversations at the water cooler or in the lunchroom are not required to be recorded.
Exactly. How preposterous that speaking to someone aloud is fine but using an electronic tool with logging set to OFF is somehow bad. So completely inane.
So I guess whispering is a crime now?
> the unrecognized liability companies are taking on by logging every scrap of internal communication
Do any large companies not delete everything at the first opportunity?
no, of course not. Especially not disney. they need every shred of everything, for liability's sake. If someone brings something to HR, they need to be able to tamp it down. They keep receipts of everything, all the time.
I know a lot of these types of entertainment companies employ things like keyloggers or remote screen viewers in case an employee is working on a writing project or drawing/painting a picture during their lunch hour, because if they are, everything they make, write, sketch or even jot down belongs to disney exclusively... and if they, say, bring that script to prospective publishers outside the company a year later, or try to sell a print of the artwork they created, they can intervene and stop you.
if you take a shit in their staff bathrooms, that turd belongs to them too.
> if you take a shit in their staff bathrooms, that turd belongs to them too.
Yeah. That’s how we got Cars 4.
Google used to have all 1:1 chats autodeleted in 24hrs, unless the employee explicitly disabled it for a specific chat in question. But then they got in trouble with DOJ for that last year[0].
0. https://news.bloomberglaw.com/antitrust/google-chat-deletion...
Interestingly Disney has done this since inception which is why the (IMO, excellent) biography Walt Disney: The Triumph of the American Imagination can be so detailed.
Is it even legal to view that data?
Why would it not be? What data is it illegal to view? Other than perhaps CSAM, which I would strongly hope Disney don't host on their Slack.
Viewing the data necessarily copies it. The data is of course, all Disney IP in the sense that all employee output is the employer's intellectual property. Copying Disney IP hasn't historically worked out for folks.
I never liked that interpretation of copyright much. Clearly the person publishing the data ought to be the one liable, if someone obtains the data they should be allowed to do whatever they want with it in private.
I think at least some legal systems agree with my interpretation, but the U.S. is insane.
It's not publishright, it's copyright.
Well over here the most literal translation is 'author's rights', so arguing it ought to be about copying just because that's in the name doesn't carry much weight in my view. If anything it's a great example of how framing can manipulate people's views.
I did mention that I considered the U.S. interpretation insane, didn't I?
The broad contours of modern copyright were formulated at a time when only for-profit enterprises (and perhaps sufficiently well-funded religious groups, but I repeat myself) could actually afford to copy things at scale. It may as well have been called "publishright".
> The data is of course, all Disney IP in the sense that all employee output is the employer's intellectual property.
IANAL, but I think this description is overbroad. There is a "work for hire" doctrine in copyright law that assigns copyright to the employer, but I believe by default that only applies to works of authorship within the scope of an employee/contractor's assigned duties, with any broader scope needing to be explicitly assigned by contract. I would expect internal communications in general to be covered by an NDA or some concept of privacy rights, depending on the context.
I don’t know why it would be illegal, but it feels skeevy. Besides, Disney has a good legal team. I wouldn’t be surprised if they could find a reason.
I'm not a lawyer, but I would assume that good lawyers would advise their clients against suing random people on the internet.
There is a shortage of good lawyers.
Disney doesn’t care about you looking at this data dump.
every piece of writing, every sketch and illustration, and a lot of discussions about the process or development of shows/films/books/games/etc are copyrighted and under NDA.
Have you ever worked in entertainment?
The NDA is with the employees, not everyone else.
I have not worked in entertainment, so maybe that's the reason why I don't understand how an NDA could affect my ability to read something someone else wrote.
lol. Typical of someone that “works in entertainment” to incorrectly condescendingly explain something that everyone else already knows about and deals with at their jobs.
You just gave an example. There are loads of classified information out there. Though Disney will probably not be able to sue you for just reading the data necessarily, doing so is a big liability if you're a competitor or work for a competitor.
Train an LLM on it...
If someone did that, would it be copyright infringement? As the consumer of the LLM, would I be breaching copyright?
Well openai/microsoft claims the output wouldn't violate any copyright.
Can someone explain why hackers dump the files publicly rather than just tell the victim they got access? What's the point?
https://www.csoonline.com/article/565048/what-hackers-do-the...
Financial motivations
Nation-state sponsored/cyberwarfare
Corporate espionage
Hacktivists
Resource theft
Gamer issues
Financial theft and nation-state attacks are easily the largest portion of cybercrime. Decades ago, the lone, solitary youth hacker powered by junk food was an adequate representation of the average hacker. They were interested in showing themselves and others that they could hack something or create interesting malware. Rarely did they do real harm.
Today, most hackers belong to professional groups, which are motivated by taking something of value, and often causing significant harm. The malware they use is designed to be as covert as possible and to take as much of something of value as is possible before discovery.
You missed non-state political motivations. There are plenty of revolutionaries and so-called terrorists around.
It appears that this is actually the case here, as it's supposedly about artists' rights.
> non-state political motivations
"hacktivists"?
Some people just like to watch the world burn?
Disney is a cultural slash-and-burn enterprise, themselves.
Disney pissed off a lot of fans of Marvel, Star Wars and most IPs they recently bought. The hackers dump the files publicly to be mined for ways to shame Disney. Maybe...
they themselves said why: Disney is responsible for cancelling, burying, and destroying the records of a lot of shows that fans seem to really love. TAG, the largest animation union, has been raising awareness of extremely unfair treatment of the creative side of Disney and other companies for a while now, how years of their hard creative work has been squatted on without ever seeing the light of day.
apparently, the hackers don't really seem to consult the artists they're purportedly trying to help by doing this, because none of them want to see their work leaked like this. I think these hackers just selfishly want to see the materials behind cancelled shows they were looking forward to.
Maybe they shorted the stock.
Bad move unless they're actively day trading. This won't have any serious effect on Disney stock.
The motto of the group that leaked it is "A hacktivist group protecting artists' rights and ensuring fair compensation for their work", so my guess is that they're trying to give the press and researchers a look at what goes into the sausage, because the industry abuses the hell out of creative talent.
One question is whether there is one massive slack, or multiple different ones. I'd certainly hope that sensitive stuff is limited to a separate slack, for extra insulation.
They mention a name, and a google search shows that person works in Disney IT, so maybe their credentials were leaked and they had admin access to the slack. In that case, relying on slack permissions to limit the scope of a breach isn't really going to work.
There isn't a single slack instance, nor is there a single IT organization. Disney is a huge company with many mostly autonomous divisions. I highly doubt they took everything as it would not be all in one place. I presume given the message it was likely the folks making movies or Disney+, not Parks. Of course, it could also be something much more benign.
If you read the article, the answer to your question is in the fourth sentence. There's even an entire section under the headline "Who, Why, and How" that goes into motives.
Interesting logic... what's the point of "just telling them"?
techno-anarchism
This is going to be an anti-DEI treasure trove. The unsaid things will be shown to have very much been said.
Will it?
Well it seems like a huge leak. And if there's a place where there are illegal, woke, decisions taken it's Disney. For example there's a Disney exec who's been recently caught by an undercover journalist, on video, saying "we don't hire white men as such positions".
It is discrimination and it's totally illegal. I do personally find the "wokeness" of Disney also shows in their cartoons (it's just an opinion but several have that same opinion).
So for such things to happen, there may be a company culture of wokism in place at Disney.
If there's such a company culture, and given the size of the leak, it's very possible that they've been so dumb as to openly discuss how to be anti-white (e.g. both in hiring practices and in picking fictional characters) on their slack channels.
To paraphrase a well-known meme: "You never go full woke: Disney went full woke".
What do you mean by “woke”, exactly?
DEI is just the new n-word for people who are offended when they see someone who doesn't look like them.
Given the contexts I most commonly see it used in, I think it means “having empathy for someone outside my identity group”.
Is woke in the room with you now?
I believe in the state of California, for entertainment roles, it’s perfectly legal to state you only hire a certain race, sex, etc. for role. If you need an Asian male for a role, it’s ok to say so.
Maybe go read the stuff and find this damning evidence instead of just running your mouth about Disney with a bunch of accusations?
My YouTube recommendation page is going to be full of grifters from just reading this comment.
They should learn opsec from the Disney Vault.
As someone who literally used to own the digital version of the Disney vault, I find it highly unlikely that this leak is what it is claimed to be.
Disney doesn’t just use one Slack instance across the whole company and everyone knows to not put pre-release content on any public platforms.
Maybe they compromised an instance owned by DTSS (Disney's centralized IT entity), but this would have little to do with Disney Studios like they imply.
It's pretty standard in the industry to only store pre-release content on airgapped systems.
I don't know, the Sony hacks were pretty comprehensive. Why not Disney? Because of some aspirations about imagineers and giant corporations or whatever? They aren't software specialists. The software they sold to other people, like their games business, was kind of a disaster. They don't compete on software.
> It's pretty standard in the industry to only store pre-release content on airgapped systems.
Unreleased narrative content isn't actually valuable, so nobody actually cares. I mean of course they say it's valuable. But there are aspects of value that are objective, and I am saying objectively, not in some aspirational sense, it's not valuable. And anyway, surely, how did such pre-release content get on such airgapped systems? They have tens of thousands of vendors, and those people talk, and they have ordinary desktop computers. They make mistakes all the time. It doesn't really matter.
Their business communications are valuable. So people hacked that.
I understand there are a lot of gestural, performative security measures in the industry; I belong to it. At the end of the day, Disney (Hollywood) asks too much from IT for too little money, does not attract talent comparable to a middle-of-the-road Series A startup in San Francisco, and is led by people who don't value technology (on average).
The Sony hacks were definitely not comprehensive. They affected the core company but all subsidiaries were left alone.
Source: I was an employee at the time and my only data leaked was HR data sent to the primary company.
> does not attract talent comparable to a middle-of-the-road Series A startup in San Francisco
LOL.
If you think that this should be the yard stick, you certainly think too highly of the tech industry, to the point of delusion.
SV VC startup land is the exception that proves the rule. It unfortunately gives everyone here a massively overinflated sense of worth.
“Everyone gets paid exactly what they’re worth”, “market forces”, blah blah blah. But the reality is that you just won’t find nearly as much easy, dumb money anywhere else. Hearing kids that’ve been spoiled by the SV startup scene whinge here about things that are completely typical of even the most cushy tech jobs ANYWHERE ELSE is so telling.
No, they should learn opsec from whoever runs our elections--the most secure elections in the world.
Lol, only in the USA do people not mention which country "our" refers to and blindly claim they are the best in the world.
https://www.electoralintegrityproject.com/eip-blog/2022/6/13...
https://www.pewresearch.org/short-reads/2016/10/31/u-s-elect...
https://citizen-network.org/library/global-ranking-of-electo...
Anecdotally it feels like there has been an uptick in these high-profile hacks recently, maybe a result of more security people being laid off as a result of companies thinking they would replace everyone with AI?
Probably not. We continue to see attacks for a few reasons:
1) There are very few consequences. At worst, a hacker will get 5-7 years, and the chance of getting caught is low.
2) Security is very very very hard. The defender must get everything right. The attacker only needs to find one flaw.
3) Security does not just depend on security staff. It depends on every software engineer, operations (or devops) engineer, every software dependency, every piece of hardware, etc. If one of these people or dependencies has a problem, the whole system can be cracked. Examples of problems include writing insecure code, getting hacked, not removing old employees from an ACL or group, installing a tool with a back door, etc.
The point is security is hard and it depends on people doing the right thing. It's very hard to get people to do the right thing.
The reason for this attack is political (in the general sense of the term).
> The reason for this attack is political (in the general sense of the term).
The opening salvo in the article is:
> A self-proclaimed hacktivist group named NullBulge, aiming to “protect artists’ rights and ensure fair compensation for their work,” claims to have breached Disney and leaked 1.1 TiB (1.2 TB) of the company’s internal Slack infrastructure
Which seems more financial.
Perhaps even hybrid warfare
More security people laid off, but also layoffs in general put strain on the remaining workers, who are supposed to do more work to make up the difference and are hence more likely to cut corners to deliver products.
If AI is a factor at all, then more likely on the hackers’ side.
Seems like slack has a problem
Maybe a dumping tool that uses a stolen api key? Rate limiting and monitoring on slack’s part could help…
All their APIs are rate limited. Disney would have a Grid and with Grids you get data dumps. The feature is normally used for litigation and you need pretty high admin access to get a dump. They either found an exploit or they compromised an admin's account.
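For readers who haven't seen how API rate limiting tends to work, a token bucket is one common scheme. The sketch below is generic and is not a claim about how Slack implements its limits; the class name, parameters and the injectable clock are all assumptions made for illustration.

```python
import time


class TokenBucket:
    """Allow `rate` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock          # injectable so the example is deterministic
        self.tokens = capacity      # start with a full bucket
        self.last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Demo with a fake clock: a burst of two is allowed, the third call is rejected.
fake_now = [0.0]
bucket = TokenBucket(rate=1.0, capacity=2.0, clock=lambda: fake_now[0])
burst = [bucket.allow() for _ in range(3)]
fake_now[0] = 1.0                   # one second later, one token has been refilled
after_wait = bucket.allow()
print(burst, after_wait)
```

A limiter like this slows down scraping through ordinary endpoints, but it does nothing against a legitimate bulk export triggered from a compromised admin account.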
Whether you’re talking about enterprise file storage, email, or chat messaging software, they all have APIs and/or admin user interface to allow retrieving any and all data to support eDiscovery.
Hardly Slack's fault. With so many clients and so much money behind that, there's such a big target on their back that shoring up defenses is fundamentally impossible. It’s probably best to just consider such services from such large providers as already compromised, and keep sensitive data off them entirely.
Dark side of API-based access to everything on SaaS where companies have no control.
I can’t guard the front door effectively.
Nor can I easily guard the back doors.
Will data breaches like these (AT&T, Ticketmaster, and now Disney) be a nail in the coffin for SaaS security?
lol. no. besides which all of these hacks would have been prevented by simple, well established controls (eg. MFA everywhere, not hoarding every scrap of customer data and internal comms).
so all of those basics are going to magically happen when you move your data on-prem?
Also the above-mentioned SaaS customers will face no negative consequences from investors or otherwise, just some bad press that will be forgotten quickly and amount to nothing. It's great if your SaaS vendor gets hacked and not you: it spreads blame around in the eyes of the public, and makes it harder for legislators, regulators, and plucky DAs to come after you.
It's clear that 1 TB is a lot of data, but I would have expected more from Disney's slack?
According to the hackers they lost access so it's possible there's more data that wasn't gathered in time
I don't understand the situation with the insider (Matthew J Van Andel). Is the implication that he was originally collaborating with the hackers to give them access, then regretted doing so and decided to cut off their access, and the hackers retaliated by doxxing him?
this video alleges that it might've been because he downloaded an infected mod for a game: https://youtu.be/ZGScvWIyw2E
Not sure why they would dox him, maybe to throw him under the bus after he found out he got pwned and cut them off?
I’d like to know if that’s really how Kathleen Kennedy eats her Linguini.
Could you please explain this reference? I know who she is but I don’t get it.
where can i actually read it
i would like this too, i don't think it's going to be easily found on some website though, Disney will sue the owner out of existence
I wonder why there are so few articles considering this happened last night. Also, it's sad how the "insider" (who probably was hacked/RATed) had his SSN and other info leaked :/
After the bell on Friday is an infamous time for releasing news you don’t want to be covered.
That's for ruining the MCU!
The spin from Disney is going to be entertaining.
Considering the social and political controversies that Disney is involved in, I would expect a lot of scrutiny of the contents of this link.
Any news on the contents in terms of unreleased films?
I can’t stop giggling at this group’s name.
Disney seems to be just shooting themselves in the foot over and over again recently.
It will be interesting to see what happens here. Information that leaks could actually impact share price.
This is the same group that put malware in ComfyUI_LLMVISION and said they were against crypto but then extorted people for crypto.
(ComfyUI_LLMVISION is probably what caused this breach)
Perhaps one day, we can return to the days when a KB was a KB and a MB was a MB. Those grand old days, when we all accepted that kilo and mega stretch a little more for computers. Because in binary, base-10 metric is a wee bit of a shoehorn. Just a bit.
It all changed when "normal people" started using computers. 1 KB = 1024 bytes makes perfect sense except to 98% of the world.
I know about 1 KB = 1024 bytes, sometimes. I'm a computer nerd, grew up playing on computers and hacking on them, and I'm a programmer now.
But, if someone asks me for a good explanation why 1 KB != 1000 bytes, I don't have a good answer. I know about powers of 2, but why are powers of 2 more important than "kilo" meaning 1000 like it does in every other context?
It's like if a kilometer wasn't 1000 meters, because of the way car odometers worked, or the shape of the tires or something. Why would technical details about a car change the meaning of "kilometer"?
Addressing, at some point, always ends up with physical wires representing bits, so chips are manufactured with power-of-two sizes. It's like asking why we measure crude oil in barrels.
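To make the "wires representing bits" point concrete: each extra address line doubles the number of locations that can be selected, so capacities naturally land on powers of two. A tiny sketch (the function name is just illustrative):

```python
def addressable_bytes(address_lines: int) -> int:
    """Number of distinct byte addresses selectable with this many binary address lines."""
    return 2 ** address_lines


for lines in (10, 20, 30):
    print(f"{lines} address lines -> {addressable_bytes(lines):,} bytes")
```

Ten lines give 1,024 addresses, which is where the "kilo means 1024" habit comes from.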
Yes. I know. I've taken an architecture course in university, and I've completed the nand2tetris course and have conceptually built a computer from nand gates up. I ask again:
Why are oil barrels more important than the SI units of volume we use in every other context?
In this analogy, it would be more like if "barrel" was a standardized unit of volume that everyone understood and used, but then in the oil industry specifically they used a slightly different volume and still just referred to it as a "barrel" because it's what they're used to.
And, whenever pressed for clarification, the oil people admitted "yes, technically our unit should be noted as 'oil barrels' which are different from the normal kind, but we like to just say 'barrels' because it's easier".
That is indeed why I made the analogy. https://news.ycombinator.com/item?id=40956618
Real-world example: What weighs more, a pound of feathers or a pound of gold?
Reflexive answer: gold (well obviously gold is heavier than feathers)
Logical answer: neither (1 pound = 1 pound)
Actual trick answer: feathers (precious metals used troy weights instead of the one just about everything else used, and 1 pound in the troy system weighs less than 1 pound in the other one)
https://en.wikipedia.org/wiki/Troy_weight
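The trick answer checks out if you convert both pounds to grams. Both are defined in grains (5,760 for a troy pound, 7,000 for an avoirdupois pound), and a grain is exactly 0.06479891 g. A quick sketch:

```python
GRAMS_PER_GRAIN = 0.06479891  # exact by definition

TROY_POUND_GRAINS = 5760         # used for precious metals
AVOIRDUPOIS_POUND_GRAINS = 7000  # the everyday pound

troy_pound_g = TROY_POUND_GRAINS * GRAMS_PER_GRAIN
avoirdupois_pound_g = AVOIRDUPOIS_POUND_GRAINS * GRAMS_PER_GRAIN

print(f"pound of gold (troy):            {troy_pound_g:.2f} g")
print(f"pound of feathers (avoirdupois): {avoirdupois_pound_g:.2f} g")
print("feathers heavier:", avoirdupois_pound_g > troy_pound_g)
```

Same word, two units, roughly an 18% difference, which is the same kind of ambiguity as KB vs KiB.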
I thought we were talking about SI units, their general meaning, and the technical details of computers. Barrels seem completely unrelated to those things, being neither an SI unit, nor having to do with computers.
Like a lot of arguments, we're arguing over the definition of a word here ("kilobyte"), nothing more. I'm asking why technical details about a computer are so important they can override the generally understood (and well defined) meaning of that word.
Because the technical details about a computer are important when describing its technical characteristics.
In short, context matters, and we adapt the meaning of words by the context they're used in all the time. It's ordinary.
In fact, it's so ordinary in this particular case, that all we humans did it for decades, before a weird group not representing the existing organic consensus came along and decided the terms absolutely must be changed, and presented us with extremely silly-sounding ones to replace the existing ones, that of course few adopted, leading to the situation we have today where the existing terms are used interchangeably to mean both things, and there is now a greater ambiguity around them than existed before.
It wasn't perfect before, but the "solution" made it worse.
Therefore, it sucks in practice at meeting its goal, no matter how much sense it may make to the minority that thinks "gibibyte" is something anyone would ever want to say in public, other than in a funny voice to a dog or a baby.
[Insert American flag emoji here]
It's not that powers of 2 are more important. It's that there will never be, for example, a RAM chip that has 32GB of RAM. They will have 34.36GB, which is an ugly number. But, they happen to have a very nice, round number of bytes if you look at them otherwise - they have 32GiB. And since these two numbers are pretty close, and the clean power of two one is far more natural for humans than the SI one in this context, it was natural to just call it GB.
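The 32 GiB vs 34.36 GB figure is easy to check, and the same arithmetic explains the "1.1 TiB (1.2 TB)" in the article. A small sketch (helper names are made up here):

```python
GIB = 2 ** 30   # gibibyte, binary prefix
GB = 10 ** 9    # gigabyte, SI prefix
TIB = 2 ** 40   # tebibyte
TB = 10 ** 12   # terabyte


def gib_to_gb(gib: float) -> float:
    """Convert a size in GiB to SI gigabytes."""
    return gib * GIB / GB


def tib_to_tb(tib: float) -> float:
    """Convert a size in TiB to SI terabytes."""
    return tib * TIB / TB


print(f"32 GiB  = {gib_to_gb(32):.2f} GB")   # 34.36 GB
print(f"1.1 TiB = {tib_to_tb(1.1):.2f} TB")  # 1.21 TB
```

The gap grows with each prefix: about 2.4% at kilo, 7.4% at giga, and 10% at tera.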
Does that hold up in practice though? Last I checked my USB drives and RAM bytes were not perfect powers of 2. One clear example that comes to mind is my GPU with approximately 12 GB of RAM. That's no power of 2.
These numbers being a power of two seems pretty important, important enough that we redefine words to match powers of 2. Then, when we look at the exact number of bytes, it's not a power of 2.
Flash gets fussier especially when there are reserved sections.
But come on, are you really saying that 1100 0000000000 0000000000 0000000000 bytes of RAM isn't close enough to being a power of two to prove the same point?
"Close enough" isn't good enough apparently.
Our starting point is that a kilobyte is 1000 bytes, but then people say "that's not close enough to 1024, which is a power of 2", and so we redefine the word "kilobyte" to mean 1024, etc. Then I buy a device with a gigabyte and it doesn't have 1,000,000,000 bytes, and it doesn't have exactly 1,073,741,824 (2^30) bytes either, it has some other random number.
So we started with Système International units and a common understanding of what they mean. Computer people said, "that's not close enough, let's redefine standardized words so they will be exact", and then they use those redefined words in an inexact way.
And for the sane normie people, a kilobyte is still 1000 bytes.
Cute.
But no, being a few percent off is very different from saying "it's not a pure factor of two, it's a very small number multiplied by a very large power of two".
Your GPU has an exact multiple of 2^30 bytes of memory.
If you want to talk about a USB drive, then to do that properly we need the size and count of chips inside a real model.
The important point is that they are multiples of powers of two, instead of multiples of powers of ten. Your RAM has 12GiB of RAM, but in SI GB it has 12.884 GB of RAM.
Look under the heatsink. You're probably going to find 6x 2GB chips in parallel. The individual chips have a power of 2 capacity.
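To make the "small number times a large power of two" point concrete, here is a rough Python sketch that splits a byte count into its odd part and its power-of-two part. The helper name is hypothetical, and the 12 GiB figure is simply the GPU example from this thread.

```python
def split_power_of_two(n: int) -> tuple[int, int]:
    """Return (odd, k) such that n == odd * 2**k."""
    if n <= 0:
        raise ValueError("n must be positive")
    k = (n & -n).bit_length() - 1  # index of the lowest set bit
    return n >> k, k

# 12 GiB: a tiny odd factor times a huge power of two.
print(split_power_of_two(12 * 2**30))      # (3, 32)

# 12 SI gigabytes, for comparison: a large odd factor, small power of two.
print(split_power_of_two(12_000_000_000))  # (5859375, 11)
```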
That's actually a better point than you realize because crude oil is another special case! Typically, the steel drum barrel that we're all familiar with is a 55-gallon (208L) drum, except that crude oil barrels are 47 gallons (159 L).
So clearly the right thing to do here to clear up any confusion is to introduce the concept of computer-sized bytes, and metric bytes. Metric bytes would be 0.9765625 of a regular computer byte, so 1000 MB would be 1000 Metric Bytes, or 1024 * 0.9765625 = 1024 Bytes.
Thus hard drives could be rated at 1,000 GMB, for 1,000 giga metric bytes, which would really be a 1 TMB drive or 1 tera metric bytes, which is the same as 1024 giga regular-computer-sized-bytes, or 1024 GRCSB.
Totally straightforwards and not confusing to anybody.
I think you meant 42 gallons.
Gigamegabytes, perfectly reasonable.
They don't have to be.
At one point in history some machines used BCD, even for addressing, and there are magnetic core memory assemblies which have power of 10 sizes.
Because everything (except SSDs nowadays) in a computer is, on a fundamental level, either 0 or 1. So when you want something that maps to that, 2 to the power of 10 is exactly 1024 bits. Somewhere along the line, someone decided that accuracy of that mapping was more important than adherence to the exact meaning of kilo.
The alternative would have been to use something other than kilo, mega, etc., that represented the base 2 magnitudes. It would be awkward to say you have 8.306.688 bytes of ram if you need to be exact.
We have that alternative. KiB, MiB, etc.
We have that now. We did not for the formative years of the field.
I think it’s also the SI standards pedants who can’t imagine a kilogram might be a different context than a kilobyte.
The civilized world is also using kilometers for example. Kilo has its roots in Greek and literally means thousand.
Two kinds of countries out there. Those that use metric and those that have gone to the moon.
I get your point, but let's not forget where most of the US rocket technology is coming from.
NASA might be the agency who most bitterly regrets not having paid more attention to units and gone metric earlier.
Quick, how many gibibyte are 1234567890 byte?
Quick, how many blocks will 4096 bytes use on my storage device?
The argument is that the base10 interval makes no sense with computers, because they're physically base2.
You can't really have 10 without wasting 2, and that's why it made sense to use 1024 instead of 1000.
Personally I feel the pushback against gibi/mebi/kibi is overblown. It's ultimately better to be coherent everywhere and always specify everything in decimal, rounded, than to make random context-dependent decisions. But still, the original argument for 1024 made sense too.
1 or 8, depending.
Are we doing our own ECC or are we relying on the controller to do it? If the controller is doing it, how big is that block actually?
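The "1 or 8" answer falls out of a ceiling division, assuming the two block sizes in question are 4096 and 512 bytes (my reading of the reply, not something it states). A small Python sketch, with a hypothetical helper name:

```python
def blocks_needed(n_bytes: int, block_size: int) -> int:
    """Number of whole blocks required to hold n_bytes (ceiling division)."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return (n_bytes + block_size - 1) // block_size

# Assumed block sizes: 4096-byte and 512-byte blocks.
print(blocks_needed(4096, 4096))  # 1
print(blocks_needed(4096, 512))   # 8

# With power-of-ten sizes the fit is no longer exact:
print(blocks_needed(4000, 512))   # 8 (the last block is only partly used)
```

This ignores ECC and any controller-level overhead, which is exactly what the follow-up question is getting at.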
Let's compromise and go with kibigrams
No, it all really changed when storage device manufacturers realized that they could market 1,000,000,000 bytes as "1 gigabyte" to people who then saw their computer tell them that there was about 7% less than a gigabyte in there.
I think that started before the gigabyte.
Can't say when they started using it, but gigabyte external hard drives would be about when the gap got large enough normal people started to notice it.
Yup, this is when I started to really notice it too.
They did it in the 70s.
98% of the world doesn't even know that details like this exist.
They never have the opportunity to question the sensibility of one or the other.
It's a conflict between communications and storage. If you are doing data communications, you are probably dealing with phenomena measured in hertz. Those use SI prefixes, so it's natural to use them with bits as well.
But if you are doing data storage, there are many natural power-of-two structures. Using 1024-based prefixes with them often leads to more convenient numbers.
Even if we can't, can we think of better names?
"kibibyte" sounds like a dog treat not a unit of measurement.
I agree. I don't care how technically correct they are if I sound like an idiot when I'm saying it.
The best I've seen is just to have the base as a subscript, like `kB_2` (2 is subscript) or `kB_10`. Though in practice I have yet to come across a situation where the difference a) matters and b) isn't clear from the context.
You're just used to the common prefixes. Kibi is not any weirder than yotta, pico, or deci. They all sound silly if you think about it - so we just don't.
No it definitely is silly. Mebi is even worse.
No, it's definitely not silly. Why make comments like this? I'll take the repercussions, but passing judgement on language by how things sound is doodoo behavior.
Another route might be inspiration from exponential math notation. Traditional kilo/mega/giga/tera-bytes are just 2 to the power of 10, 20, 30, 40, etc.
So perhaps a terabyte could be a "bin forty", or a "two-to-forty", etc. (Although as it linguistically relaxes into Tootafortie, it'll sound goofy too.)
Doesn't work for non-English languages.
What a vague and bizarre complaint.
You're saying that units-of-10 in English (and using Arabic numerals) will "not work" for other languages, when the international status-quo we're complaining about is already powers-of-1000 in Greek which are then mutated with Latin?
Why do you think there's a (new) problem?
I always wanted to use Knuth's proposal of prefixing the base 2 variety with "long", analogous to tons.
eg. Long Kilobytes, LKB or KKB
I remember in the 90's we used the prefix case to differentiate between SI (kB) and powers of 1024 (KB). Not sure how widespread it was though; no Internet to poll at the time :)
I've honestly never been in a situation where I actually cared about the difference. Just nerd pedantry.
The bigger the storage gets, the bigger the discrepancy. 1 pebibyte isn't 10^15 bytes but more than 10% more.
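A quick Python sketch of how that gap grows with each prefix step. The names are made up for illustration; the only inputs are 1000 and 1024.

```python
PREFIXES = ["kilo/kibi", "mega/mebi", "giga/gibi", "tera/tebi", "peta/pebi"]

def binary_excess(n: int) -> float:
    """Fraction by which 1024**n exceeds 1000**n."""
    return 1024**n / 1000**n - 1

for n, name in enumerate(PREFIXES, start=1):
    print(f"{name}: {binary_excess(n):.1%}")
# kilo/kibi: 2.4%
# mega/mebi: 4.9%
# giga/gibi: 7.4%
# tera/tebi: 10.0%
# peta/pebi: 12.6%
```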
Would you rather have three fingers and a thumb or seven fingers and a thumb?
70 million year old evolutionary technical debt rearing its head, yet again.
(humans have a number of fingers that isn't a power of two)
Except a subset of humans who have been involved in industrial accidents.
Don't forget about birth defects. I went to school with a kid that was born without pinkies, but otherwise completely normal looking hands. You could still tell something was off when you looked at his hands, but it usually took a few seconds for it to register with most people.
He was an exemption, not an exception.
You can count to 1023 on them. Although some numbers might be a bit awkward to use.
you can double that with every physical property you care to add.
fingers aren't innately binary, you can curl them and point them.
wrist orientation relative to the hand and at that point you can count up to 2048 without resorting to too much more than a set of two rules (past instinctual counting)
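The finger-counting numbers are just binary place values. A minimal Python sketch, treating each finger (or any extra two-state property, such as wrist orientation) as one bit; the function names are hypothetical:

```python
def max_count(bits: int) -> int:
    """Largest number representable with the given number of two-state digits."""
    return 2**bits - 1

def fingers_to_int(raised: list[bool]) -> int:
    """Read a list of raised/lowered fingers as binary, least significant first."""
    return sum(1 << i for i, up in enumerate(raised) if up)

print(max_count(10))                # 1023 with ten fingers
print(max_count(11))                # 2047 with one extra two-state property
print(fingers_to_int([True] * 10))  # 1023, all ten fingers raised
```

Eleven bits give 2048 distinct values, counting zero, so the highest number reached is 2047.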
The built in video player on Reddit will say 2:21 in the preview, then the video will be 2:22 long
Yeah, the innumeracy is strong in that one.
Alex has us covered: https://www.youtube.com/watch?v=TCTWyNstpD0&t=55s