The "National Emergency Library" was obviously a huge mistake, and I'm surprised that IA continues to defend it. The problem is, their online book lending is far from the most important part of the Internet Archive, and by continuing to fight for it, they risk losing everything, including the entire rest of the archive which seems to me to be far more important.
The Internet Archive has become the de-facto default location to upload anything rare, important, or valuable, and a terrifyingly large amount of history would suddenly blink from existence if it were brought down.
While I agree that the book lending is only a small portion of the value provided by the IA, it makes sense that they're making a stand here. Losing this battle would establish precedent that format-shifting legally acquired copies of media is not protected under fair use, which would be disastrous to preservation efforts of all kinds going forward.
This has nothing to do with format shifting at all. The IA's normal book-lending service works that way, but the "emergency library" allowed more copies of a book to be "borrowed" than they actually physically owned.
This is straight-up piracy, and there's a 0% chance of it being legally justified.
This might kill the Internet Archive as a legal entity in the US if the outcome is particularly unfavorable, but LibGen, Scihub, Z-Library, Anna's Archive etc aren't going away.
Agree to disagree. Copyright has worn out its welcome when it is locking up culture for life + 70 years at a time [1] [2]. The Internet Archive could archive and maintain these works in cold storage (both physical and digital), only to make them public again 100-200 years from now, but that would not be in keeping with "Universal Access to All Knowledge." Disk and bandwidth are cheap, and the planet is big.
[1] https://www.copyright.gov/history/copyright-exhibit/lifecycl...
[2] https://www.princeton.edu/news/2003/02/21/lessig-were-copyri...
(no affiliation, but a fan and a supporter, and believe in defending public goods)
This isn't a "disagreement" with the GP. Piracy is a legal concept, and they were speaking in legal terms. Whether or not copyright has "worn out its welcome", it continues to be a legal reality in the US.
Then why should internet archive, which fills a number of niches that aren't just book or document piracy, be killed off?
When one has the means and opportunity, unjust laws should be challenged. One should also have a plan if they fail.
Except the IA did not establish the NEL as an act of civil disobedience or call for changes to the unjust existing copyright laws when they did it. They instead told themselves and their patrons that there was a "national emergency" exception to copyright law, when no such exception existed. Should one have existed? Of course. But it did not at the time. The IA continued to assert that the NEL was legal within existing copyright law, not that it was a principled act of civil disobedience against an unjust law. They were trying to Jedi mind trick a "national emergency" exception to existing copyright law into existence.
If you want to challenge an unjust law in a democracy, you can work within the system to change it, or you can engage in civil disobedience, publicly accept the consequences, and put your freedom on the line as a way to bring popular attention to the unjust law. That is the mechanism of civil disobedience against unjust laws. The whole point is that you must risk something in order to go around the system and challenge an unjust law. Everyone in America has at least one law they consider to be unjust, including some laws that I'm sure you believe are very just and would be very upset if others started to violate them in the name of justice.
Birmingham banned Martin Luther King, Jr. from participating in public protests and a Circuit court judge signed off on it. When MLK said he would protest anyway, the whole point of the civil disobedience was that MLK would be arrested for it. He was right that the public was outraged at the enforcement of this unjust legal order and called for change. What the IA did was like if MLK put on a mask to go protest, and when the police tried to arrest him, he said that he wasn't MLK.
A global pandemic is a national emergency (I'm unsure how this cannot be argued considering the efforts both the US government and the Federal Reserve engaged in). The Internet Archive is at risk because of their actions, which can be argued to be in good faith during extraordinary times based on the events that occurred at that time [1]. You may not agree with regards to intent and the definition of good faith, but that is for the judicial process to resolve. If the Internet Archive legal entity is forced to dissolve, are legal participants on the other side of the civil suit prepared for the fallout from such an outcome (the "public outrage" you mention)?
I would argue the NEL is an example of fair use when the entire world is locked down [2] [3] [4], but I'm not the one who is defending the case. You're upset they are seeking a favorable interpretation of law through the judicial process. That isn't a Jedi mind trick, and to call it as such is silly ("The fair use right is a general exception that applies to all different kinds of uses with all types of works. In the U.S., fair use right/exception is based on a flexible proportionality test that examines the purpose of the use, the amount used, and the impact on the market of the original work.").
Power concedes nothing without a demand. Better to ask for forgiveness than permission.
[1] https://www.cdc.gov/museum/timeline/covid19.html
[2] https://en.wikipedia.org/wiki/Fair_use
[3] https://en.wikipedia.org/wiki/Fair_use#Internet_publication
[4] https://en.wikipedia.org/wiki/Fair_use#Policy_arguments_abou...
I am not upset they are seeking a favorable interpretation of law through the judicial process. I am upset that they literally bet the entire organization on a questionable and novel legal theory, without acknowledging that they were putting the rest of their mission at such risk. You quoted some Wikipedia to me, but fair use is notoriously a minefield. You, me, the IA, and whoever edited that Wikipedia article all probably agree on how we think copyright law and fair use ought to be interpreted. But it seems like I am the only one of us who accepts that Big IP has a lot of influence over copyright law and that most judges don't think like us.
It would be one thing if the IA said, "We know we will be sued for this, and while we believe we will win this case and that the law is on our side, there is a very real possibility that we will not. If we lose, this may bankrupt our organization. However, we have a strong moral imperative to serve the people....." Or if they wanted to do a legal challenge to settle the law, they make one book from the most litigious publisher available for two people, record the entire thing, and send it to the publisher's lawyers. It goes to court.
But they didn't. They responded to any criticism that this is risky in our current judicial system by saying that you can only believe it is risky because you don't share our views on what copyright ought to be.
That is maybe decent advice for dealing with parents or a boss, but not with the legal system. There is no forgiveness in copyright law. You don't escape liability just because you thought you were acting in good faith.
As much as I love the core of the IA's mission, there will be no public fallout from this if the IA has to dissolve to pay its debts. I wish there would be, but I seriously doubt this would break through in our current political climate. This is one of the reasons I was so upset, because the IA does not have the political capital to pull off a civil disobedience project. And they didn't even try! Where is Brewster calling on Congress or the President to get a digital library exception added to any one of those bills or executive orders that were being passed around the national emergency?
Sure, but do they? They're a nonprofit, and as such depend on donations. Their donors might or might not be aligned on these two relatively orthogonal issues.
I'm even sympathetic to their desire to challenge the quite absurd status quo of controlled digital lending, with bizarre skeuomorphisms such as simulating "books wearing out" after a couple of lending cycles, while at the same time being more restricted than physical books (even though I don't necessarily agree with their means of challenging it).
But even for me, I think the risk is too big, and I'd feel much more comfortable with a different (maybe related/affiliated, but ultimately separated as legal entities) non-profit organization for each concern.
To say nothing of the fully legal digital services (and physical checkouts with inter-library loans) that many public libraries in the US and I assume elsewhere offer.
The services they cannot afford?
https://apnews.com/article/libraries-ebooks-publishers-expen... ("Libraries struggle to afford the demand for e-books and seek new state laws in fight with publishers")
Consider the title above in the context of this post. It is libraries against publishers.
That may be, but it exists today for the most part. And librarians at research libraries were grumbling at licensing costs for digital a couple of decades ago. Not that they were happy with physical subscriptions either.
Unless you're a billion-dollar corporation who needs to feed your LLM.
Piracy is a PR/Marketing concept. Unless you are talking about commandeering or ransacking ships on the high seas.
It may indeed kill the project and it's a tragic example of play stupid games win stupid prizes. There have been enough decades of lost lawsuits now that any thinking person ought to know better than to go sticking a fork in the outlet that is distributing copyrighted materials.
I think you've forgotten what 2020 was like. Libraries were closed. Schools were closed. Book stores were closed. Shipping was majorly slowed. Basically all public services were inaccessible to most people. Now imagine you're the kind of person who founds and runs The Internet Archive for 25 years, not just another HN commenter bored at work. You're sitting in front of the button to help millions of people regain access to something the pandemic took away from them. Do you push the button, or do you cower away because some rich prick might sue you later?
I don't think the IA exists without the kind of person who pushes that button.
I don't actually care what 2020 was like because it isn't germane. No new legislation was passed meaningfully altering copyright law during that timeframe so this is absolutely a case of fuck around and find out, regardless of the optics.
Well, I expect that rigid respect for the law is why you didn't start the Internet Archive, and also why you weren't in a position to help millions of people during one of the biggest social upheavals in the last century :)
The law is the law. I don't have to respect it to grasp the concept of cause and effect. There wouldn't be much to talk about here if there wasn't a contingent whinging and wringing their hands as if any aspect of this situation was surprising or novel. Broadcast copyrighted material online without the benefits of significant opsec and you will get dragged in court, period. Hell, it didn't take the recording industry 12 months to get a lock on Napster. Y'all need to quit playing like there's a victim here.
Yes, I think we are in agreement. If the law were respected, then the IA, the Wayback Machine, Abandonware Archive, and all of the benefits we get from them would not exist. It sounds like you think the law should be respected, so you think it's good for the IA to be destroyed by this lawsuit. Yes?
Clearly you aren't paying attention so allow me to reiterate:
Regardless of one's opinion on the IA or their mission, they objectively did a stupid when they decided to fuck around with copyright law. Publishers (and judges) have demonstrated repeatedly over the last two and a half decades that from a business perspective this is the equivalent of standing on railroad tracks giving an oncoming train the bird. 10/10 for balls, 2/10 for judgement.
I think it's terrible that the Internet Archive will very likely be destroyed or, at best, that we will have to donate a large sum of money to pay off Hachette.
But I think it was reckless to engage in uncontrolled ebook lending. Controlled lending (one copy lent for one copy on the shelf) is not a legal right (in the US), but it's got a much better chance of avoiding a lawsuit.
Uncontrolled lending was foolish. It was inviting a lawsuit, and it has far less chance of popular support than the intuitively more reasonable model of controlled lending.
I agree with the sentiment behind the NEL program: it's a lovely gesture. But to invite destruction of the Archive like that was a terrible mistake, in my view.
If it turns out to have been a suicide button, then what the IA needed most was the kind of person who would not have pushed that button.
What was wrong with all the works that are in the public domain?
Seriously, I don't understand how people can defend copyright at this point. Maybe at the beginning it was implemented properly. Maybe in theory it's a great idea. But - surprise, surprise! - it's been broken by corporations. Life + 70 years? gtfo.
One can agree that 70+ years is way too long without disagreeing with the basic principle of copyright.
I don't necessarily disagree with the basic principle of copyright. I just am unconvinced it's an attainable long-term goal. It seems to me that copyright in a capitalist society is bound to end up like this.
That's true, but I actually disagree with the basic principle of copyright. I don't think that telling people they can't engage in certain creative acts encourages people to engage in creative acts.
I don't believe in copyright at all, even in principle. The closest I can get is that it should be within the power of the state to grant temporary monopolies on things that it wants to encourage. But there is a basic conflict between freedom of expression and the fact that expressions can be copyrighted.
However, the idea that IA is going to defeat a copyright industry worth hundreds of billions of dollars is laughable. If you're going to fork between government choosing to end copyright and government choosing to end the IA, the IA is 100% dead.
Characterizing people who don't want the IA to risk all against copyright as copyright-supporters is not fair at all. It's like calling people Assadists who didn't want to invade Syria.
Have you seen the other 97% of the archive?
The lawsuit is not limited to seeking a ruling against the unlimited lending; it targets any lending whatsoever (even 1-to-1).
Given the nebulous nature of "fair use" it seems quite excessive to say there is a 0% chance of it being legally justified.
The emergency library was the trigger for the lawsuit, but the lawsuit contends that even controlled digital lending is illegal.
There seems to be continued confusion on this basic issue of what is at stake in the lawsuit.
[0]: https://www.library.upenn.edu/news/hachette-v-internet-archi...
I think it's a serious mistake that they allowed unlimited borrowers at a time, because that shifts from fair-use format shifting to effectively making multiple copies of the format-shifted document.
That said, IANAL and I don't know what actual legal conclusions were arrived at from the trial or appeals.
I think it's a serious mistake that they used DRM at all. That positions them right along the publisher narrative which is exactly where they should not be.
I think they're intentionally tying it all together because they know that the "emergency library" was a heap of shit (you can't argue you can digitally lend the same document multiple times simultaneously, it makes no sense).
They're basically holding the Internet Archive hostage to try to force this thing through.
You are directing your ire at the wrong party. Hachette has nothing to gain from continuing to pursue this lawsuit, the only possible outcome (as you correctly state) is the world becomes a worse place. The behavior they object to has already stopped, and they've got a judgment to prevent it happening again. Hachette could drop enforcement of the judgment, both parties can dismiss the appeal, and no one loses anything.
Hachette's owners see an opportunity here to destroy a public good, and they are taking it. Hachette are the bad actors trying to destroy what you find valuable, not the IA.
Hachette obviously benefits from teaching would-be unlimited "lenders" a lesson. Even anti-DRM, "buy my books only if you can afford" authors were against this hare-brained lending scheme because the IA didn't even bother to buy a single copy of the books they were "lending".
The blame falls squarely at the IA's feet; being an idealist doesn't give you the right to delve into illegal behavior, regardless of the righteousness of your cause or the depth of your conviction. If the world is better with an org in it, and it jeopardizes its own ability to remain a going concern, it's clear to me who is culpable. "Too good to die" doesn't exist.
This way of thinking is the reason why we are losing so many great things. Laws are created by people to support a society we want to live in. When laws no longer serve the society, then the society must rise up and change them. Like with any bug, fixing it early is cheaper than fixing it later.
"Look what you made me do. If you hadn't acted up I wouldn't have had to destroy you."
I don't want to live in a society where authors don't get paid, so the laws are just fine.
The vast, VAST majority of money that an author makes is from their advance. It is exceedingly rare for a book to sell even enough to cover that advance, and even rarer for it to have sales strong enough that the author sees meaningful, life changing residuals.
This is how author advances would work in a world without copyright: authors would self-publish their books, and there would be no advances. If the book proved to be popular and successful, all the major distribution platforms would "pirate" it and pay them nothing. No conceivable DRM would save the author's income, because the platforms can afford to pay people to manually key in the work.
Think about it. If there is no copyright, why would anyone pay an advance?
And why should I even offer you an advance if I’m not going to make any money off your book. Heck, why should I invest any money in editing your book for that matter.
I’m not sure it’s exceedingly rare for an author to not make some beer money on top of single dollar advances but it’s not a full-time job for many authors. It mostly works to support the day job or as a hobby.
But many authors might as well self-publish today. I mostly have.
You already live in one. Publishing house shareholders get most of the money, even for online books. If you pirated the book and donated to the author, they'd actually get more money.
Doesn't this apply to every mass-market creative endeavor - software engineering included? There's a whole lot of machinery sitting between {code|book} author and the paying consumers, leveraging efficiency of scale and demanding a pound of flesh in return. Agents, editors, lawyers, proofreaders, marketers, book cover artists, salespeople, typesetters, and requisite admin support staff: all of them are necessary to publish and distribute books at scale. If you think authors don't need an entire industry behind them, try sifting through the self-published dreck on Amazon.
And it is pretty clear who would be first to exploit a system where authors don't have copyrights. That is Amazon to start with, followed by all the other big companies who can effectively distribute the works.
Let me put it bluntly: the IA went about pursuing that change in a stupid and impulsive way, and their actions may very well accomplish nothing while causing us to lose more "great things."
"I'm going to pretend the laws I don't like don't exist in order to try to change them," is an activity for people with little other responsibility and little to lose.
In hindsight, if the IA wanted to try something like the "National Emergency Library," they should have set up an independent entity to take the fall and contain the damage if it didn't work out. And since they didn't do that, they should probably have tried really hard to settle and fight another day than go down in a blaze of glory.
Just to be clear, it is your position that the IA's Wayback Machine and abandonware archives should also not exist, right?
No, I fully support these missions. Both have defensible fair use protections and do not try to break new legal ground with flimsy justifications. I wish the IA were a little more aggressive about not retroactively applying robots.txt rules on archived content.
It's hard to reconcile how overly careful they are with the Wayback Machine compared to the carelessness of unlimited lending. I am livid they risked their priceless archives for book piracy - that's not a great hill to die on.
Indeed. It's maybe worth reflecting on the apparent conflict there. What info are you missing, that could explain the conflict? The IA folks aren't crazy, but they are opinionated and willing to take action where others might not, and the world was in a very crazy state at the time the decision was made. Consider some sympathy for the people leading the project you feel so passionately about.
But they could issue a mea culpa, and move on. Admit defeat, pay a token fine or settlement, and keep their donations to preserve the rest of their archive. Why don't they cut out the rot, to preserve the rest of the archives? I know they had a mission, or some goal, or whatever, but it failed - it failed a long time ago.
It can be right or wrong, I don't know. I want organizations to fight the battles that gain us new rights and freedoms. I know that they have a lot to lose here though, and they shouldn't risk it.
Concretely, I was a big individual donor to the IA until this lawsuit. I support their mission, I love their work, I help (technically and financially) other organizations like local museums and non-profits handle their archival work. This is something important to me, and I really want their archives to persist.
I stopped donating to the IA - and won't resume - until this lawsuit is resolved. I don't want to donate to the book publishers, and it looks like that's where all of their funds are going to end up.
They've stopped doing that. They now ignore robots.txt completely and you have to email them to stop them.
Book piracy or not, IA seems to be the only source for many programming books from the 2000s. (Everyone’s go-to pirate library has a much less comprehensive collection of those.)
Is it really? The cynical side of me wonders if it just might be intentional. What if this is a nonprofit analogue to VC monetization? Do you dislike an existing law? Create a similar but legal service you know other people will appreciate, use donations to undercut competitors and become the de facto monopoly, ride the network effect to a large crowd that basically relies on you, then rugpull by tying their narrow, legal use to your crusade for a different legal system by infecting their data with illegal material and declaring the whole thing must sink or swim together. Now your users have to pay you to fight your policy crusade or they lose their already legal resource they value much more, and you can use your legal half as a moral shield to get approval from anyone who only had the time to read the headline when the prosecution inevitably shows up at your door. All you need to hold the almost-grift together is to lie by omission about who instigated it all.
It’s a fair point. Most of what the IA does is probably technically copyright violation, but I’d argue there’s a qualitative difference between making copies of public websites, software that publishers have abandoned, and other things in that vein (especially given they historically bent over backwards to stop sharing copyrighted material if someone got upset) and sharing digitized copyrighted books willy-nilly, given there was already precedent that you just can’t do that.
As I understand it, they're pretty good about taking your site out if you ask them to. That's not quite the letter of the law on copyright, and potentially leaves them open to lawsuits, but few web site publishers are really going to pursue them once they've taken the material down.
If they'd applied that same level to the books, they might have avoided this mess.
Stupid laws that hurt society shouldn't be defended, and those that defend such laws simply because "it's the law" are not worthy of being taken seriously.
If the law were stupid, I'd expect a more robust argument about how it's stupid in the IA's appeal. They are not arguing that.
I have plenty of issues with copyright law as it's currently written and wholeheartedly support copyright reform. That's very different from any one party unilaterally suspending copyright "because of COVID"
... Illegal as defined by the highly paid lobbyists of the trillion dollar copyright monopolies? It boggles my mind that such "laws" are even considered legitimate.
Tell that to Wikipedia and several other organizations which put a stop to stuff like SOPA/PIPA with a single day of blackout. I want to see them try to destroy Wikipedia over copyright nonsense.
Thank you for spelling this correctly!
Everything you allege here is misinformed as to the current state of the lawsuit and stakes.
You demonize Hachette et al (4 major publishers) as seeking to destroy a public good. In fact, they've already settled with the IA, as of last August 2023, in a manner that caps costs to IA at a survivable level and sets clear mutually-acceptable rules for future activity.
You imply IA would dismiss the appeal if the plaintiffs "could drop enforcement of the judgement". In fact, there were never any assessed damages, the parties have already reached a mutually-acceptable settlement per above, and despite that – in fact, as part of the settlement! – the IA has retained the right to appeal regarding the fair-use principles that are important to them.
Per https://en.wikipedia.org/wiki/Hachette_v._Internet_Archive#F...
>On August 11, 2023, the parties reached a negotiated judgment. The agreement prescribes a permanent injunction against the Internet Archive preventing it from distributing the plaintiffs' books, except those for which no e-book is currently available,[3] as well as an undisclosed payment to the plaintiffs.[25][26] The agreement also preserves the right for the Internet Archive to appeal the previous ruling.[25][26]
IA's August 2023 statement on how much will continue despite the injunction & settlement limits: https://blog.archive.org/2023/08/17/what-the-hachette-v-inte...
>Because this case was limited to our book lending program, the injunction does not significantly impact our other library services. The Internet Archive may still digitize books for preservation purposes, and may still provide access to our digital collections in a number of ways, including through interlibrary loan and by making accessible formats available to people with qualified print disabilities. We may continue to display “short portions” of books as is consistent with fair use—for example, Wikipedia references (as shown in the image above). The injunction does not affect lending of out-of-print books. And of course, the Internet Archive will still make millions of public domain texts available to the public without restriction.
Hhhhuh. Thanks a lot for this info, this isn't anywhere near as bad as all the commentary around the lawsuit I've seen led me to believe. I'm now confused by the apocalyptic tone of the article, I read it as being an existential threat to the IA ("last-ditch effort to save itself," "things aren't looking good for the Internet's archivist," "A extremely noble, and valuable, endeavor. Which makes the likelihood of this legal defeat all the more unfortunate."). I still don't think Hachette has much to gain here, but you're absolutely right that I was way off the mark.
This informative take deserves to be higher up the comment chain!
It's perfectly possible to think Hachette is being cruel and doing harm and also think IA has acted stupidly and/or recklessly.
Yeah, this is largely my position, with an added dose of understanding (2020 was a weird year; many mistakes were made by many people). But Hachette & Co. are the ones acting with intent to harm now so I think they deserve the ire now.
Quite possibly on behalf of some other entities in their same bed that don't like information to be free and preserved.
They're probably just making sure that the next place that wants to pull something funny, doesn't.
Ye olde "let's make an example out of them" thing.
We reasoned. We begged. We screamed at IA not to do what they did, that this would be the inevitable outcome. This is like someone left a bear trap on my front lawn and because I should have the right to do whatever I want on my property I refuse to listen to the crowd of people telling me not to stick my dick in the bear trap. The controlled digital lending was already completely illegal but the IA was making headway on making it a culturally accepted practice. Then they burned it to the ground. The loss of the IA would be one of if not the greatest Internet-tragedy in history, but the fools in charge of it don't deserve anyone's sympathy.
They do: Legal precedent.
IA made themselves into an easy, prominent target by doing something everyone else agreed not to do via Gentlemen's Agreements(tm) and precedents set by lesser transgressions, so now they're reaping what they sowed.
The book publishers stand to gain legal precedent that doing what IA did can and will result in legal consequences severe enough to ruin you.
If the IA is a single point of failure, better we learn that now instead of later.
That doesn't make sense. Of course it's a "single point of failure". I don't think anyone could have ever thought otherwise. A competing archive service would undoubtedly be great for humanity.
https://archive.ph/
This is NOT anywhere near what the IA provides.
Yes, it's an alternative, not a copy. For example, it doesn't provide "emergency library" style borrowing, and maybe this missing feature is a pro, not a con.
On the other hand, it's doing something similarly dubious with paywalled news articles in that it bypasses many news sites' paywalls and supposedly injects its own ads next to the content.
There are even many comment threads detailing their strategy to avoid legal takedown requests by serving content via an "anti-CDN" (i.e. always serving content from abroad whenever possible, to make legal actions more difficult).
It archives a much smaller part of the internet; it's not an IA replacement.
I tried to hit archive.is, archive.ph and archive.org yesterday. They were all down, and archive.is seems to still be down.
Any chance you're using Cloudflare DNS (directly or via e.g. iCloud Private Relay)? The people (person?) running archive have a somewhat complicated history with them.
Not using Cloudflare DNS, I use my own recursive resolver. And by "archive", I assume you mean archive.org; archive.is and archive.ph are different people.
No, I meant .is, .today etc; archive.org does not do weird DNS things to my knowledge.
This one is even more frightening, as it's (as far as I know) a one-man show that could disappear at any time.
Yes, it could, but IA, which - up to now - looked more professional and reliable could, too. And, unexpectedly, that's what's happening now.
Maybe archive.ph will also surprise us, by being more resilient.
Anyway, it's an alternative, for now.
I think people don't really stop to consider - the Internet Archive could be gone tomorrow; either through legal action, malice, accident, mismanagement of the non-profit, etc. It's just considered "an Internet tool that exists".
I would argue that their attempt to even try this emergency library thing indicates that the people in charge are a bit too cavalier with what they have, for whatever (likely well-intentioned) reasons they may have.
And while individuals can mirror/datahoard the publicly facing parts of the IA, that's not all they have access to.
So, we need a distributed planet-size, multi-redundant, partly-encrypted, archive that can absorb and host the entirety of IA multiple times (for redundancy), INCLUDING the borrowing library.
"why build one when you can have two at twice the price"
They have a duplicate headquarters in Vancouver now, a similarly grand building to their SF headquarters. Years ago there was a collab with the Library of Alexandria in Egypt to host an offsite backup, but I don't think it panned out.
This would be the poster child for durable backups such as Microsoft Silica (if Microsoft dared to piss off those plaintiffs).
At 99 Petabytes, an offline copy would take about 2,000 LTO-9 tapes. I'm not familiar with other vendors, but a single IBM TS4500 tape library offers about 400 PB of near-line storage and I don't think IBM would be making the largest ones in existence.
Also, CERN could host multiple copies on unused blocks of their storage farm.
edit: just found a StorageTek (now Oracle) that can do "57.6 EB of uncompressed data". That's just surreal. HPE sells a much more modest unit that can store 2.5 EB.
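For what it's worth, the tape count above depends on which LTO-9 capacity figure you use. A rough Python sketch, assuming LTO-9's published ratings of 18 TB native and 45 TB compressed (2.5:1), decimal units, and the 99 PB figure from the comment above; the function name is made up for illustration and no redundancy is included:

```python
PB = 10**15  # decimal petabyte, in bytes
TB = 10**12  # decimal terabyte, in bytes

def tapes_needed(archive_bytes, tape_bytes):
    """Smallest whole number of tapes that holds archive_bytes (single copy)."""
    return (archive_bytes + tape_bytes - 1) // tape_bytes  # ceiling division

archive = 99 * PB                            # size quoted in the comment above
native = tapes_needed(archive, 18 * TB)      # LTO-9 native rating (assumed)
compressed = tapes_needed(archive, 45 * TB)  # LTO-9 rating at 2.5:1 compression (assumed)

print(native, compressed)
```

Under those assumptions this gives 5,500 tapes at native capacity and 2,200 at the compressed rating, so "about 2,000" only holds if the data compresses at roughly 2.5:1.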
This comment reminded me of the famous quote: "When someone says 'I want a programming language in which I need only say what I wish done,' give him a lollipop."
I didn't say it'd be easy. In fact, I was being sarcastic.
Sadly, my doctor also said I can't have the lollipop.
Yes. If you have the pockets and logistics behind you, I'm sure Brewster would invite you for lunch to discuss. You'll need people for infra ops, people to coordinate hosting racks across the globe (preferably on every reasonable continent), and whatever a few exabytes of cost-efficient storage hardware currently costs.
There was supposed to be a second archive at the Bibliotheca Alexandrina in Egypt.[1] It worked for a few years. But it seems to be down now.
[1] https://www.bibalex.org/en/Project/Details?DocumentID=283&Ke...
[2] http://web.archive.bibalex.org/
One response to that is the "The Offline Internet Archive" [0], which includes software to crawl Internet Archive collections and store them to a local server [1].
[0] https://archive.org/about/offline-archive
[1] https://github.com/internetarchive/dweb-mirror
How much data does the IA store? If only storage were cheaper, I'd be way more of a digital hoarder.
According to their about page, 145+ Petabytes.
So a few racks full of Storinators, and a fat internet pipe I guess
I almost can't express how happy this Phineas and Ferb reference makes me.
I don't know what that is
I always thought the physical IA locations like the one in San Francisco should offer a kind of "internet cafe" setup so you can plug into an ethernet port and get gigabit+ downloads without going over the internet at all.
So apparently a 20 TB HDD can be purchased at around $250, plus perhaps $30 per HDD slot with a computer (I just recall a number in this range from calculating years ago how much it costs to have storage at home), making it about 2 million dollars for the whole 145 PB, without any redundancy, so maybe twice that.
On the other hand you can already have 24 20TB HDDs for maybe $7000 (with required other hardware), and that's almost 0.5 petabytes. I imagine it would be able to archive all things a single person cares about. Now if only there were a way to interconnect these smaller storage pods to each other..
For the redundancy at this scale, perhaps tape storage is interesting. Though the prices for high-density tape seem not to be low enough to be public, and IBM scam their customers by advertising capacities that assume compression.
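The back-of-the-envelope numbers above can be checked with a few lines of Python. This is only a sketch using the figures quoted in this thread (145 PB total, 20 TB drives at $250, $30 per drive slot, a 24-drive pod); the function name and defaults are hypothetical, and power, networking, and redundancy are all ignored:

```python
def raw_storage_cost(total_tb, drive_tb=20, drive_usd=250, slot_usd=30):
    """Return (drive_count, total_usd) for a single unreplicated copy."""
    drives = (total_tb + drive_tb - 1) // drive_tb  # ceiling division
    return drives, drives * (drive_usd + slot_usd)

drives, cost = raw_storage_cost(145_000)  # 145 PB expressed in TB
pod_tb = 24 * 20                          # capacity of one 24-drive pod, in TB

print(drives, cost, pod_tb)
```

With those inputs it comes out to 7,250 drives and $2,030,000 for one copy, and 480 TB per pod, which matches the "2 million dollars" and "almost 0.5 petabytes" estimates.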
I'm gonna need a bigger NAS...
To be somewhat diplomatic, how much does your data hoarding actually benefit anyone but you and maybe some friends? Maybe it will help some leak into the world but the contents of your hard drive are pretty ephemeral.
Oh for sure. I don't think of my digital hoarding as anything I'd endeavor to share publicly. I didn't mean to phrase my question as if I wanted cheaper storage so I could share what I hoard with others. It was sort of two separate thoughts that, yeah, came out sounding like they were attached together.
Yes, but ...
Counterexamples exist: the NASA moon landing tapes were lost, but a copy was found in a hoard in Australia. Dr Who episodes, lost by the BBC, have been collected from fans' recordings, etc.
There's some difficulty in connecting those who want the info with those who have it, but if the search gets enough publicity, it can work. This seems like a problem that could be solved with software.
My problem with the "rest of the archive" is that it's arguably 98% unchecked mass piracy that their own people seem to be ok with, in the form of "we're not copyright police, so just upload whatever you want and let companies deal with it later", which is exactly what Jason Scott has said.
Could it be that the alternative is losing stuff forever?
Sadly, yes. It's more of a reason why you should download the stuff you want and need before it's too late: https://news.ycombinator.com/item?id=39908676
Yes. If there is something you think you might personally want in the future and you can download it today you probably should. But it’s probably not terabytes.
The vast bulk of content is lost forever and given how much is created I’m not even sure that’s a bad thing. And if literally everything were magically siphoned up I’m also not sure that’s a good thing.
Of course it is piracy. This coloring of "bad" piracy with "good" things like libraries and preservation is an important step in dispelling the broadly pro-copyright culture we live in. It draws a clear picture of what copyright robs us of by presenting a case where violating it has a sense of moral righteousness.
The lawsuit is not about the “National Emergency Library”.
It’s about, as the IA calls it, “Controlled Digital Lending” - which the IA was doing before and after the “emergency”, and is still doing now. The idea that, if they have a physical copy of a book, they can lend, one-for-one, a digital copy.
The “National Emergency Library” was basically uncontrolled - they were ‘lending’ digital copies of books they did not have physical copies of. But there are no lawsuits about that - presumably because they stopped and it would be bad publicity for book companies to pursue them about it now. I do wonder if it’s what precipitated the book companies’ ire - but I also think a lawsuit about “Controlled Digital Lending” was coming sooner or later anyway.
No, the publishers' lawsuit also took issue with the "NEL".
It's just that the lower court's judgement didn't really focus much on that issue, because they found the broader/simpler and more-foundational "can you loan a format-shifted ebook 1:1 from your physical book" issue against IA, which then makes an anti-NEL conclusion almost automatic.
So NEL-related arguments haven't yet been the focus of the rulings, despite being part of the original lawsuit.
Thus, also, IA's appeal is mostly about that foundational finding – though section II of the IA appeal brief says that if the appeal succeeds on "ordinary controlled digital lending", the NEL ruling should also be rejudged:
>II. THIS COURT SHOULD REMAND FOR RECONSIDERATION OF THE NATIONAL EMERGENCY LIBRARY. Publishers do not deny that the district court’s NEL ruling depends entirely on its analysis of ordinary controlled digital lending. Resp.Br. 61-62. If the Court reverses the latter, then it should also reverse and remand the former. It need not address Publishers’ arguments about the justifications for NEL (Resp.Br. 62), which should be left for the district court in the first instance.
You’re right - I remembered the judgement, which, to heavily paraphrase, basically said “the NEL doesn’t matter if CDL is not allowed, so we only really need to consider that”.
Context?
Context in here: https://lunduke.locals.com/post/5556650/the-internet-archive... (see also the discussion thread https://news.ycombinator.com/item?id=40201053 "259 points by rbanffy 4 hours ago")
I disagree. Archiving information for future researchers is valuable, but giving access to information to people right now is also very valuable. Most people's access to texts for research is very shallow, unless they are part of a research university. Google Books hosts many works that are out of copyright, but there is a century-long dead zone that is inaccessible.
I write a history of technology blog and the Internet Archive lending service has saved me thousands of dollars and many hours that it would have taken to track down the same research materials on eBay. Realistically, I just wouldn't have bothered, and the material I write would be of lower quality or not get written at all.
(That said I do agree that the emergency library was a strategic and legal mistake.)
Great, now you get access, but then the year this free access is established is the year NEW information stops being created and/or the quality goes downhill (funny how making something free stops a lot of people from doing it for a living). Now everyone's content is 'lower quality or not get written at all' because the content that makes yours not 'lower quality' is no longer produced (most people/businesses don't work for free).
Here's the related discussion: Stop using the Internet Archive as the sole host for preservation projects | 87 points by yours truly | 27 days ago | 27 comments https://news.ycombinator.com/item?id=39908676
Well said