I feel obligated to say my usual "IP is at this point doing more harm than good" spiel here but don't have the time budget to argue it with people today
At this point, we need a service that "offers" an 8-bay (with 12TB? 14TB? drives) filled with the whole ~80TB Anna's Archive. It's essentially all of human knowledge and, to be frank, it belongs to no one, or rather, to everyone.
People can store this at their house, keep it offline. Just to have these seeds of knowledge everywhere.
...I suppose LLMs trained on this data, or essentially their model weights and tokenization, are a much more efficient way of storing and condensing this 80TB archive?
You dropped a 0. Anna's Archive is currently 862.4 TB
Yes, too much for one person, but collectively it is possible to keep it alive.
If anyone wishes to help, you can generate a chunk in 1TB units and seed via BitTorrent here:
Honestly, if I can't have the whole thing, I'm not going to bother mirroring a 1TB fragment that's worthless by itself to everybody except copyright attorneys.
As ndriscoll points out, the only feasible way to distribute an archive of this size is with physical hard drives. I sure wish they would find a reasonably-trustworthy way to offer that.
You are never going to have a physical copy of the archive. It's nearly a petabyte in size.
I know several datahoarders that have at least 1PB, also archive.org grows by that much at least every day
I assumed that GP was an average person who doesn't have a storage array sitting at home. I'm not really sure why the IA is relevant here
1 PB of disk space would cost about $10K at this point in time. Not exactly unattainable. Looks like it would fit in a volume of space about the size of a standard refrigerator.
I'd be OK with both requirements.
It doesn't seem reasonable to me to suggest that an average person would spend $10,000+ (and the time to maintain it) on a pirate archive, hence my comment.
On the other hand, contributing a TB or two to a torrent swarm is much more feasible for most people.
In any case, if you're okay with that, you should do it. Please report back in 6 months with how it's going.
> In any case, if you're okay with that, you should do it. Please report back in 6 months with how it's going.
Point being, if I tried to torrent the whole thing, it probably would take 6 months, and would likely get me booted from my ISP and/or sued. I would much rather buy a set of hard drives with the contents already loaded. Or tapes, as userbinator suggests.
(And as for the hypothetical "average person" you keep citing, I don't see anyone meeting that description around here.)
> I would much rather buy a set of hard drives with the contents already loaded. Or tapes, as userbinator suggests.
And my point is that this is an absurd suggestion. I shouldn't have to explain why a shadow library shouldn't be selling (tens of) thousands of dollars worth of hard drives containing pirated content. Beyond that, and what I was getting at earlier, is that maintaining a 1PB storage array at home isn't exactly easy, or cheap.
> I shouldn't have to explain why a shadow library shouldn't be selling (tens of) thousands of dollars worth of hard drives containing pirated content.
Depends on what their goal is. I shouldn't have to explain why a "library" that's operating illegally in virtually every jurisdiction, with few or no complete mirrors, is vulnerable to being shut down by a small number of governmental or judicial entities.
If I were running the archive, not being a single point of interdiction would be high on my list of priorities. Especially when any number of people are indeed willing and able to keep 1 PB+ of content in circulation, samizdat-style. I would work to find these people, put them in touch with each other, and help them.
> Beyond that, and what I was getting at earlier, is that maintaining a 1PB storage array at home isn't exactly easy, or cheap.
Not everything that's worth doing is easy or cheap, or otherwise suited to "average people." Again, I don't know where you're coming from here. What's your interest in the subject, exactly?
1PB is well beyond the point at which a tape drive and a bunch of tapes will be cheaper than hard drives, and likely more reliable.
If you only care about non-fiction and science journals it is more like 250TB I think? Still several thousands in 22TB drives with ZFS though.
22 TB drives are around $230 on ebay, so if you used 15 of them in raidz2, that'd be around $3500 (so maybe a little over $4k with the rest of the server), which is around the cost of a new mirrorless camera and a decent lens, so certainly within the realm of a hobbyist. You probably couldn't get away with downloading 250 TB in any reasonable timeframe with most US ISPs (or at least Comcast) though. That'd be over 2.5 months of 300 Mb/s non-stop. Even copying it from a friend using 2.5 Gbit/s Ethernet would take over a week.
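For anyone who wants to rerun those numbers, here is a rough sketch of the arithmetic. It assumes decimal units (1 TB = 10^12 bytes), a link that is fully saturated the whole time, and no filesystem overhead; the function names are made up for illustration.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(size_tb: float, link_mbps: float) -> float:
    """Days needed to move size_tb terabytes over a link of link_mbps
    megabits per second, assuming the link is saturated non-stop."""
    bits = size_tb * 1e12 * 8
    seconds = bits / (link_mbps * 1e6)
    return seconds / SECONDS_PER_DAY


def raidz2_usable_tb(n_drives: int, drive_tb: float) -> float:
    """Approximate usable capacity of one raidz2 vdev: two drives' worth
    of space goes to parity (ignores metadata and padding overhead)."""
    return (n_drives - 2) * drive_tb


def drive_cost(n_drives: int, price_each: float) -> float:
    """Total price of the drives alone, in the same currency as price_each."""
    return n_drives * price_each


print(f"15 x 22 TB at $230 each: ${drive_cost(15, 230):,.0f}")
print(f"usable in raidz2: about {raidz2_usable_tb(15, 22):.0f} TB")
print(f"250 TB at 300 Mb/s: {transfer_days(250, 300):.1f} days")
print(f"250 TB at 2.5 Gb/s: {transfer_days(250, 2500):.1f} days")
```

Real-world throughput will be lower than the line rate, so treat the day counts as lower bounds.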
Tape might be a better choice with that much data.
That is true. However, it also has a staggering amount of duplicate data. I have _heard_ that if you search for most any particular book, you often get a dozen results of varying sizes and quality. Even for the same filetype. It's a hard problem to solve, but if we had something that could somehow pick the "best" copy of a particular title, for every title in the library, Anna could likely drop the zero herself.
The above number is excluding duplicates.
If the content is to be trustworthy then using LLMs to compress it makes no sense.
It's possible to do lossless compression with LLMs, basically using the LLM as a predictor and then storing differences when the LLM would have predicted incorrectly. The incredible Fabrice Bellard actually implemented this idea: https://bellard.org/ts_zip/
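To make the "predictor plus stored differences" idea concrete, here is a toy sketch. It is not how ts_zip works internally: the `ToyPredictor` below is a hypothetical stand-in (a tiny adaptive character model, not an LLM), and real systems usually feed the model's probabilities into an arithmetic coder rather than storing raw misses. The point is only that encoder and decoder run the same deterministic predictor, so correct guesses need no literal data and the round trip is exact.

```python
from collections import Counter, defaultdict

ORDER = 2  # context length in characters; an arbitrary choice for this toy


class ToyPredictor:
    """Adaptive order-ORDER character model standing in for the LLM."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def predict(self, context):
        seen = self.counts.get(context)
        if not seen:
            return None  # no guess yet for this context
        return seen.most_common(1)[0][0]

    def update(self, context, symbol):
        self.counts[context][symbol] += 1


def encode(text):
    """Emit None where the model guessed right, the literal character otherwise."""
    model = ToyPredictor()
    out = []
    for i, ch in enumerate(text):
        ctx = text[max(0, i - ORDER):i]
        out.append(None if model.predict(ctx) == ch else ch)
        model.update(ctx, ch)
    return out


def decode(stream):
    """Replay the same predictor; fill in its guess wherever the stream has None."""
    model = ToyPredictor()
    chars = []
    for item in stream:
        ctx = "".join(chars[-ORDER:])
        ch = model.predict(ctx) if item is None else item
        model.update(ctx, ch)
        chars.append(ch)
    return "".join(chars)


sample = "the cat sat on the mat and the cat sat on the hat"
stream = encode(sample)
assert decode(stream) == sample  # lossless round trip
print(f"{stream.count(None)} of {len(sample)} characters were predicted")
```

The better the predictor, the more entries collapse to "predicted", which is where a large language model earns its keep.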
Can we do this in physics?
Use a universal function approximator to approximate the universe, seek Erf(x)>threshold, interrogate universe for fresh data, retrain new universal approximator, ... loop previous ... , universe in a bottle.
You can do that sort of thing with a toy universe -- in fact Stephen Wolfram has a number of ongoing projects along broadly similar lines -- but you can't do it in our physical universe. Among other reasons, the universe is to all appearances infinite and simultaneously very complex, therefore it is incompressible and cannot be described by anything smaller than itself, nor can it be encapsulated in any encoding. You can make statistical statements about it -- with, e.g., Ramsey Theory -- but you can never capture its totality in a way that would enable its use in computation. For another thing, toy model universes tend to be straightforwardly deterministic, which is not clearly the case with our physical universe. (It is likely deterministic in ways that are not straightforward from our frame of reference.)
When it comes to the evils (and goods) of copyright, it is hard to go wrong with Thomas Babington Macaulay's address to the House of Commons in 1841[1]:
"At present the holder of copyright has the public feeling on his side. Those who invade copyright are regarded as knaves who take the bread out of the mouths of deserving men. Everybody is well pleased to see them restrained by the law, and compelled to refund their ill-gotten gains. No tradesman of good repute will have anything to do with such disgraceful transactions. Pass this law: and that feeling is at an end. Men very different from the present race of piratical booksellers will soon infringe this intolerable monopoly. Great masses of capital will be constantly employed in the violation of the law. Every art will be employed to evade legal pursuit; and the whole nation will be in the plot. On which side indeed should the public sympathy be when the question is whether some book as popular as “Robinson Crusoe” or the “Pilgrim’s Progress” shall be in every cottage, or whether it shall be confined to the libraries of the rich for the advantage of the great-grandson of a bookseller who, a hundred years before, drove a hard bargain for the copyright with the author when in great distress? Remember too that, when once it ceases to be considered as wrong and discreditable to invade literary property, no person can say where the invasion will stop. The public seldom makes nice distinctions. The wholesome copyright which now exists will share in the disgrace and danger of the new copyright which you are about to create. And you will find that, in attempting to impose unreasonable restraints on the reprinting of the works of the dead, you have, to a great extent, annulled those restraints which now prevent men from pillaging and defrauding the living."
He was decrying the proposed increase in the term of copyright to the life of the author + 60 years.
[1] https://www.thepublicdomain.org/2014/07/24/macaulay-on-copyr...
That is a powerful address, indeed. Good thing to know even in 1841 people saw what copyright would become today: an intolerable monopoly granted by the government, functionally infinite.
Seriously, this text is so great. I read the entire thing. It's nearly two hundred years old and contains everything one needs to know about copyright in 2024. Thank you for posting it.
I’d rather copy the whole thing down by hand, than rely on a bullshit generator for my access to knowledge.
People didn't want to buy Apple's new headset because it was too expensive at 3500 dollars.
You think anyone would spend 3000 dollars on such a thing? I doubt it.
The problem with the latter is reliability; or rather, it's efficient but unreliable. I'd rather overdo my offline storage and figure out some way to script/code my way into searching it in a convenient way.
Mixed feelings about the case.
Sharing is the best thing that can happen to knowledge. It is great that gatekeepers lose money over this.
However, the burden of the loss might fall on OCLC, which might have been doing a positive job.
What is OCLC's added value? They didn't create the data.
Useless in my country because there are no libraries. But I can look up libraries "near my country" and beyond.
I suppose some libraries will allow ebook loans through WorldCat. They seem to be more about sharing within US law without directly charging people.
That's why I said positive. Torrent sharing is better, but I don't know if that will be sustainable.
What country has no libraries?
Good question. The internet is full of lies, but it suggests that Papua New Guinea ranks at the bottom since it only has one and that was a gift from Australia.
Suing Anna's Archive and similar projects, i.e. being the lapdogs of big publishers, it seems. Why else would they care if they had no shared interests with them?
Maybe I am missing a way to use their database that makes sense, but for me WorldCat is Pinterest-level SEO pollution in my search results when I need to find some real information. I do not care if this book is in some library 1000km away. If I need a book I will search in my local ones. I never understood the point of what WorldCat does, but maybe others use it in some way useful to them.
Organizing data is valuable.
Not as valuable as the actual data, but it's not nothing either.
https://en.wikipedia.org/wiki/OCLC
In November 2008, the Board of Directors of OCLC unilaterally issued a new Policy for Use and Transfer of WorldCat Records[66] that would have required member libraries to include an OCLC policy note on their bibliographic records; the policy caused an uproar among librarian bloggers.[67][68] Among those who protested the policy was the non-librarian activist Aaron Swartz, who believed the policy would threaten projects such as the Open Library, Zotero, and Wikipedia, and who started a petition to "Stop the OCLC powergrab".
and
OCLC acquired NetLibrary, a provider of electronic books and textbooks, in 2002 and sold it in 2010 to EBSCO Industries.[54] OCLC owns 100% of the shares of OCLC PICA, a library automation systems and services company which has its headquarters in Leiden in the Netherlands and which was renamed "OCLC" at the end of 2007.[55] In July 2006, the Research Libraries Group (RLG) merged with OCLC.
My theory is that OCLC expanded outside of Ohio, and then the bureaucracy expanded to the point where it became self sustaining. It accidentally merged with monopolistic strains from The Netherlands and is now no different from the other knowledge ransoming entities, also in The Netherlands.
Oh man, the current president and CEO of OCLC.
https://en.wikipedia.org/wiki/Skip_Prichard
Prichard held executive positions with LexisNexis from 1995 to 2003. As vice president, he focused on business information and risk management solutions for corporations, libraries, and other organizations.[4]
Prichard was general manager and senior vice president of sales and marketing at ProQuest Information and Learning, a global publisher and information provider, from April 2003 to October 2005.[4] From October 2005 to April 2007, he served as president and CEO of ProQuest.
I think we found our villain here: Mr. Prichard.
Anyone care to drop him an email to ask why he is messing up humanity's knowledge for profit and greed?
OCLC's slogan is literally "Because what is known must be shared". Them being a litigious gatekeeper is mighty hypocritical.
I don't really get their dilemma. They claim that a publicly available copy of the data on Anna’s Archive is a direct threat to their business. But the same data is freely available by going to their own worldcat.org. Any library that was satisfied with pure read access to the data was already not going to pay them money.
They allege that scraping the 2.2TB of data cost them $5 million over 2 years. That's $2 per megabyte. If the cost of providing this was the only issue surely within those two years someone would have gotten the idea to just put up an XML dump for download, or to shoot Anna's Archive an email with an offer to just send them the data as soon as it became clear that it was them.
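The per-megabyte figure is easy to check. A quick sketch, assuming decimal units (1 TB = 1,000,000 MB) and using the round $5 million as well as the exact damages figure quoted from the complaint; the function name is just for illustration.

```python
def cost_per_megabyte(total_usd: float, size_tb: float) -> float:
    """Dollars per megabyte, taking 1 TB as 1,000,000 MB (decimal units)."""
    return total_usd / (size_tb * 1_000_000)


print(f"${cost_per_megabyte(5_000_000, 2.2):.2f} per MB")  # round figure
print(f"${cost_per_megabyte(5_333_064, 2.2):.2f} per MB")  # figure from the complaint
```

Either way it comes out a little over $2 per megabyte.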
They're just piling on anything they can regardless of reason. A real damage count would be lost sales due to web scraping, but they don't sell anything.
For example, the organization spent $1,548,693 on upgrades for its hardware infrastructure, and an additional $608,069 for a two-year Cloudflare contract [..] Other costs include the salaries of 34 full-time employees, who were tasked with mitigating the harm caused by the attacks, as well as various other investigation, security, and hardware-related costs.
“OCLC has incurred damages of $5,333,064 as a direct result of Anna’s Archive’s cyberattacks, but that amount does not fully compensate OCLC for the harm from Anna’s Archive’s wrongful actions. OCLC continues to suffer from harms that cannot be remedied by monetary damages.”
Is web scraping now considered a cyberattack? Was it eating their bandwidth even if it was served through Cloudflare? LOL.
> Is web scraping now considered a cyberattack? Was it eating their bandwidth even if it was served through Cloudflare? LOL.
We've been tobogganing down the slippery slope of "cyber" damages for a long time now.
Dang, surely this will put an end to AI training by scraping the web! Unless... perhaps such standard might not get evenly applied?
Case in point https://en.wikipedia.org/wiki/HiQ_Labs_v._LinkedIn
The original "damages" in a lawsuit are almost always made up. They're supposed to get adjusted downward during and after the lawsuit.
This is also why having a default judgement delivered against you for failing to show up generally isn't great.
They're going to be adjusted down to zero
> and an additional $608,069 for a two-year Cloudflare contract
> Was it eating their bandwidth even if it was served through Cloudflare? LOL.
On the paid plans, I think Cloudflare charges for bandwidth. Also, you still need to pay for bandwidth from the origin to Cloudflare. And if it's scraping, it's likely to be a lot of cache misses that need to go to the origin.
> Is web scraping now considered a cyberattack?
I've run sites that attract scraping... Aggressive scrapers can effectively DoS a server; and if they're trying to get around rate limits, that can look like a DDoS.
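For context on what "rate limits" usually means here, below is a minimal token-bucket sketch. It is a generic illustration, not what any particular site or CDN runs; the class name and parameters are hypothetical. A site would typically keep one bucket per client (say, per IP), which is exactly why a scraper that rotates through many addresses to dodge the limit ends up looking like a distributed attack.

```python
import time


class TokenBucket:
    """Allow short bursts up to `capacity` requests, refilling at `rate`
    tokens per second. The clock is injectable so the logic can be tested."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity
        self.last = clock()

    def allow(self) -> bool:
        """Return True and spend a token if the request may proceed."""
        now = self.clock()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(rate=5, capacity=10)  # 5 requests/second, bursts of 10
results = [bucket.allow() for _ in range(12)]
print(results.count(True), "allowed,", results.count(False), "rejected")
```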
Does OCLC get revenue from advertising or anything like that? I always notice they have google beacons on their site.
Ask Aaron Swartz
What is the deal with this Njalla hosting service? Is it really so hard to take sites from there down?
It's made by a former piratebay founder. When you buy a domain from them, they act as a middleman and keep legal ownership of the domain thus shielding your PI somewhat from legal enforcement. They claim their hosting is "In secret locations in Sweden". Their LLC is located in Nevis (some tax evasion island).
There's also https://1984.hosting/ which runs a similar operation out of iceland.
Considering I haven't managed to run into malware or phishing sites on either of those providers they seem to be doing something right. Why are these so-called "evil" hosting companies less of a problem for me than namecheap, godaddy, and google?
maybe you haven't but they exist
Because their primary driver is something other than maximising profits.
Sweet! Thanks for the links guys! :^)
Saint Kitts and Nevis is a sovereign state and part of the Commonwealth. Its primary business is tourism.
I used to use one of these Swedish hosts for a very large TV torrent tracker. They successfully batted off every attempt to shut it down.
The most successful attacks were against our donation payment providers, knocking our (considerable) income offline for a few days at a time.
What a ridiculous claim. They call computer hardware and salaries damage.
I want a competent judge to make sure that these are not damages, and I wish that Anna’s Archive continues to operate in a sensible jurisdiction for the foreseeable future.
If I had the means I would donate to them.
Having the means normally means the extra finances to give away vs knowing the link to their donation form
It's an easily misunderstood idiom since the difference between the two meanings is just the definite vs indefinite article: 'the means' vs 'a means'.
Still kind of nice to have that link thrown out here for those with the means to support their work.
Article says nobody is responding as defendant except one individual who has filed a motion to dismiss based on being misidentified. I don’t know the status of that motion though.
They mean "damages" in a legal sense, not in a "broken" sense. Damages in tort law are the amount intended to make the claimant whole -- that is, to reimburse expenses incurred to protect legal rights, compensate for lost revenue, or to restore the status quo ante.
RECAP link to the PACER docket copy:
https://www.courtlistener.com/docket/68157923/oclc-online-co...
The "evidence" they present for naming that woman as the operator of Anna's Archive seems... flimsy.
Yeah, I dug into it, seems like they fucked up and just caused her tens of thousands of dollars in legal fees she may never recover.
OTOH maybe it is her and she's accidentally done a Ross Ulbricht?
(Also, I note Google lists Ulbricht as an "American enterpriser", whatever that is lol)
Looks like the case is defaulted against the "Anna's Archive" entity, but still ongoing against other defendants, including Maria who has responded with some pretty damned decent filings asserting (fairly correctly as I can see) that the Complaint fails to properly state a claim against them. I don't see a ruling granting Maria's Motion to Dismiss, so that's still in the future.
Anyone know the particulars of (federal) legal service by email?
I've only ever used the Sheriff for service, by hand to the person or their agent.
Why does cloudflare cost $600k for 2 years?
What exactly are they paying for? Surely
High-transfer users can easily pay tens of thousands a month to a CDN. And not even that high an amount of transfer, even. Not like Steam or someone like that.
Cloudflare also offers other services and likes to bundle them with their enterprise accounts. They don’t really compete on transfer costs vs other decent CDNs. They prefer you’re using lots of their stuff than just watching the meter every month. They may have been using other services, too.
Why not rent servers from hetzner or ovh for a lot less and setup a CDN of their own? Or is it difficult to do that?
TL;DR actually delivering serious quantities of bits to clients globally is kinda hard and costs real money.
The just-rent-servers scheme presents the following issues:
1) Won’t get you anywhere near as good a network of globally-distributed servers. [edit] I mean, maybe, but at that point you’re looking at a fair amount of cost in vendor management—you’ll probably need several hosting providers to even approach it.
2) You need minimum three good full-time ops people to build and support such a thing. What’s the salary on that over two years? Even with Polish salaries or what have you, it’s cheaper, but not cheap. Let’s say $300,000 fully-loaded cost over two years, just for a nice round number, for three ops/sysadmin sorts. Even in cheaper markets, that may be underestimating what it’d take. Halfway there already, and you haven’t rented servers or rack space yet. I’m not gonna claim that your infra will also work far less well and break more often than Cloudflare, because sometimes these sorts of set-ups do end up being rather stable because they can be far simpler and smaller-scale than an as-a-service product, but that is a risk if it’s not done quite well.
3) Server providers may drop you if you get too expensive. Get DDOS’d too much, just use their bandwidth too heavily but within your nominal limits. It’s a risk.
4) Those rented servers do cost money, and add to the operations salaries costs above. Buy servers? Also money, and now you need to rent rack space and bandwidth.
5) Peering agreements for server hosts are sometimes bad. It’s hard to know whether yours is bad until you do something at-scale on it and see support tickets roll in for consistently dial-up speeds for clients who really ought to be seeing better than that. This info doesn’t make service comparison spec sheets.
6) CDNs have usually put in some effort to solve e.g. China deliverability. You’re starting from scratch as far as political difficulties go.
This is the most ridiculous lawsuit ever. How is scraping publicly available data "hacking"? How does less than 2TB of total bandwidth amount to millions of dollars of damage? Something is off here. Maybe there is some other motivation behind this, like drawing some legal representative of Anna's Archive out? It makes no sense otherwise. Considering that high-ranking officers of OCLC (taking a look over their executives in [0]) seem deeply intertwined with the book industry, it makes sense that it is used as a proxy for other types of interests.
[0] https://www.oclc.org/en/about/leadership.html?cmpcat=md_ab&c...
It’s great. Maybe we can turn it against OpenAI as established precedent if it happens
Careful, there may be 2nd order effects there.
And that's also a "maybe" -- ruling could go the other way and get even worse.
"Whether we’re supporting advancements on the leading edge of science or helping children build a strong learning foundation, shared knowledge is the common thread". (source: https://www.oclc.org/en/about.html)
Well, now it's shared on a torrent, but I guess for them that was "over shared" lol.
Also, they are a library but spent 5 million on cyber defense... seriously??
More from OCLC's About page:
"Breakthroughs depend on access to knowledge. Together, member institutions, individual librarians, partners, and staff believe in that mission to share knowledge. And we believe that, together, we can do more.
Because what is known must be shared."
Yeah why would a library of all places be upset about Anna's Archive? Their whole mission is sharing and preserving knowledge.
Are they going to sue Microsoft next since the database is available publicly on the web?
The mistake was calling it Anna's Archive, and not Anna's AI startup.
Anna's Archive provides a free mirror of nonprofit WorldCat data.
OCLC should be sending Anna flowers.
This smells like an embezzlement scam from an insider at OCLC.
Cry some more, Anna's Archive did nothing wrong.
IP-intensive industries contribute 41% of the US GDP and employ 44% of the US workforce [1]. If you abolished IP all of that would go away. How would you replace that? I'm not a fan of IP either but I think it's pretty hard to escape that reality. Big companies like NVIDIA (3T market cap) are almost 100% IP.
IP is more-or-less central to the US's economic and security strategy. Without it, the country loses a huge amount of power and influence in the world.
[1] https://www.uspto.gov/ip-policy/economic-research/intellectu...
Pay them from general taxation to dig holes and fill them in again with spoons?
If you can only succeed because your government is holding back others at the barrel of a gun, do you really deserve to?
Business in physical property/goods also requires the government to hold others back with a barrel of a gun.
One of the key differences being that physical theft deprives the rightful owner of the property. It's not an analogous crime.
Of course I agree that there are some differences, but they are the same with respect to the stated role of government.
Physical theft deprives the owner of physical property (where this right is respected by law). IP theft deprives the owner of intellectual property (where this right is respected by law).
People can and do make arguments against both IP and physical property, but the role of government is the same in both cases.
as hug correctly points out, you have accidentally made a circular argument; without the regulatory regime, the thing that 'ip theft' deprives the owner of doesn't exist in the first place
so the role of government is quite different in the two cases: it creates 'intellectual property', in the sense of 'what the owner is deprived of by so-called ip theft', while physical property exists in its own right
if we instead consider 'intellectual property' in the sense of 'literature, knowledge, or designs,' which the owner is not deprived of by 'ip theft', the role of the government becomes precisely opposite to its role with respect to physical property
taking a car as a paradigmatic example of physical property, private ownership rights (whether protected by the state, by moral suasion, by mob violence, or by any other means) protect your car from being stolen so that you can use it; without any security of ownership, people would just get in the nearest car and drive off. you could never be sure a car would be available for you to use, and this would eliminate any private incentive to build cars or to repair or maintain them, resulting in rapid material impoverishment; soon you would have no cars except perhaps for taxpayer-funded public transit. so private ownership rights serve to enable access to cars and similar material goods
literature, knowledge, and designs do not need to be repaired and maintained, especially now that we can use bittorrent instead of linotypes to reproduce them, and if someone else gets in my novel and drives off with it, why, i can still read it as easily as before. the involvement of the state in this case only serves to endanger access to literature, knowledge, and designs—precisely the opposite of the case with physical property
ultimately, intellectual property is completely incompatible with the security of physical property: the trade-secret code of your car's ecu deprives you of some degree of security in your use of that car
people do sometimes argue that an analogous situation obtains with respect to literature, knowledge, and designs: intellectual enclosure through so-called 'intellectual property law' enables creators to require payment from consumers, creating an incentive to write literature, discover knowledge, and create designs. perhaps there is some truth to this, but that is not the only incentive, and evidently it is not a necessary one, given that academic authors (who discover most fundamental knowledge) generally do not receive royalties, and free software reliably leads the software industry in innovation, having almost entirely displaced proprietary software as the basis of the world information infrastructure over the last 30 years
Physical property rights, which I strongly support, are not a law of nature. Lions and tigers don't have property rights. They are a concept, but one is no more natural than another. Matter exists, knowledge exists. Exclusive monopoly to one or the other is no different.
RE Cars:
I think the analogy is apt, but you ignore the time and effort that goes into creating one. Who would build or buy a car if someone could just get in and drive off?
The same is true for literature. Why spend years writing a book, play, or song, if the first person that hears or sees it reproduces it for everyone and you receive nothing?
It is just like spending time building a car for someone to drive off with it.
Your argument focuses on the user, not the creator.
That's all well and good for the consumer. The car thief doesn't care either, as long as there are cars to steal and idiots buying cars.
You might argue that people will create literature out of innate desire, as an argument for how screwing them over won't impact incentives, but how is that different from physical property?
You think someone has a deep drive to write the next great American novel, but not grow food, so it is OK to steal one, but not the other. What if people want to grow food? Does that then justify stealing it?
I just think it is extremely hypocritical to dismiss IP creators while protecting the car makers or food growers.
If someone wants to create IP for free, grow food for free, or build cars for free, they CAN!
Maybe it does justify, if stealing doesn't mean depriving the owner of the food.
I want to create IP for free, but I don't have a surplus of time to create it. IP holders deprive me of the surplus, because it's going to reduce the value of their “property” if I create IP for free.
They have less food than before you took it, how is that not depriving them?
Who authors a book so that they can read it themselves? Is that a reasonable model of the world?
How are they taking your surplus time? Nobody is forcing you to buy IP?
That's my point. You keep conflating copyright infringement and stealing. Conceptually and legally, they are different.
In order to earn a living, I have to give away my rights to the IP that I produce. I don't have any time left to produce IP that I could give away freely. My point is that it's not as simple as claiming that people can produce IP for free, given the status quo. IP law makes it more difficult for people to give away IP for free.
given that about 150 words of my comment, more than a third of the total, was about this and its analogues in the world of intellectual work, i can only conclude that you didn't even spend the minute and a half required to read my comment, much less take the time to understand the ideas i was expressing. consequently there is no point in replying further to you
People have been writing books and creating art before intellectual property rights were enforced or even existed.
The highest selling book in the world, the bible, is free of copyright.
What incentive does an author have to write a book if they can't benefit from intellectual property? Perhaps self expression?
Unlike physical property, "intellectual property" doesn't exist by itself. The thing that exists, the product that you ostensibly "own" and that is protected by the government, is the copyright, a sort of reified extension of ownership, which only has any value inasmuch as the government protects it.
In other words, intellectual property is only worth what the government says it is. It doesn't just hold the gun, in this case, it contrived the whole scenario in which a gun was necessary, and the presence of the gun is the only thing that prevents the "intellectual property" from spreading naturally, as information is wont to do.
You can argue whether the creation of this market is for the greater good, but the fact is that it's not in any way the same kind of market as evolves around physical goods, and is not regulated or enforced in the same kind of way.
I don't think it is any different for physical property. What is a pound of gold worth when you are defenseless and someone else has armed men willing to kill you?
Might makes right and owns all property is the default. Every situation that deviates from this is imposed by governments.
A market around physical goods is no different.
Why did you choose gold as an example? How about a sword? Claiming that physical and intellectual property are the same is reductionistic at best. The fact that both properties are protected by power doesn't mean that they are the same. Physical property is not copyable. Intellectual property is.
I'm not saying they are the same, and stated as much above.
I am saying their relation to the government is the same. Government maintains both, and creates a market for them by doing so.
No government > no private property > no goods for sale.
I don't see how copyable is relevant at all.
Then I don't have anything to object to. But I suspect that the above points were clear in your comments. I don't think anybody here would object to the idea that physical property law and IP law have the same legal standing. What people object to are the principles of the IP law.
EDIT: Following up with more analysis of the parent's comments... Indeed the following was clearly stated [0]:
The following example to clarify the above statement muddies the water, though:
Physical property theft deprives the owner absolutely. Whether IP “theft” deprives owner of anything is questionable, even in the legal sense. Regardless, government is “right” to pursue enforcing both laws, because they are laws after all.
[0] https://news.ycombinator.com/item?id=40911265
There is no deviation. It is still the might of governments (through police/military) imposed on all with less might, in service of those keeping them in power.
I hope you don't mind when I come and take your computer.
this specious argument relies on conflating 'intellectual property' in the sense of 'useful knowledge and designs' with 'intellectual property' in the sense of 'legal restrictions on using knowledge and designs'. to avoid getting caught up in these word games, let's use different terms for these two concepts; for this comment, i'll use the terms 'knowledge' and 'intellectual enclosure'
when someone says 'ip is doing more harm than good' what they mean is 'intellectual enclosure is doing more harm than good'. when someone says 'ip-intensive industries contribute (...)% of the (...) gdp' what they mean is 'knowledge-intensive industries contribute (...)% of the (...) gdp'
specifically the thing that intellectual enclosure is doing harm to is those knowledge-intensive industries, who are obliged to spend large fractions of their revenues on unproductive lawsuits instead of creating and sharing knowledge. in numerous cases it has destroyed major parts of those industries; two memorable examples are digital, which created the minicomputer and much of the internet, and diamond, which created the mp3 player
but the greatest casualties are not the productive activities that are terminated by intellectual enclosure, but the productive activities that are never born. do you know why linux didn't get a crashproof filesystem with snapshots 25 years ago? it's because of netapp patents. all the damage done by accidental file deletion and crashes on linux in that time could have been avoided. do you know why today there's still no simple way for regular people to send a ten-gigabyte file across the internet? mgm vs. grokster. and for every well-known catastrophe like this, there are ten thousand that never grow big enough for us to even guess what might have been
unsurprisingly the businesses that are most profitable in the current market are using strategies that fit well with the current regulatory regime. but that does not constitute argument that the current regulatory regime is good in any way, except perhaps by the minimal criterion of not completely cratering the entire economy yet
my intellect is not your property
How does a company like NVIDIA operate without IP law? They are fabless. Everything they produce is digital, either in the form of software or in the form of chip designs. As far as I can tell, without protection from IP law (patents, copyrights, trade secrets) NVIDIA could not function as a company. They would have no means whatsoever of preventing a fab like TSMC from manufacturing clones of their devices and selling them, cutting NVIDIA out of the loop.
it's an interesting question, and hopefully after a few decades of repealing intellectual property laws, we can find out. 40 years ago it wasn't obvious that anything like nvidia could exist in the first place. possibilities include:
- no company like nvidia could exist, and for chip design and fabrication we'd be stuck with companies like intel, digital, micron, samsung, and texas instruments; but many other kinds of companies could exist that can't exist currently
- fabs like tsmc would hire design firms like nvidia to produce designs to fabricate; the division of labor would be the same as at present, but banks and investors would send their money to tsmc to pay to nvidia, rather than to nvidia to pay to tsmc
- fabs like tsmc would provide open-source pdks like the skywater pdk to anyone who was interested in designing chips. different open-source gpu designs would proliferate, and jen-hsun huang would be the head of a nonprofit foundation in oregon, spending his days coordinating the contributions of a worldwide network of volunteer electrical engineers and raising his children
- microelectronics fabrication machinery research would be focused on small job-shop equipment using electron beams rather than multibillion-dollar euv fabs, so you could get the chips of your choice fabricated in any downtown with five-day turnaround, much like printed-circuit boards. as before, different open-source gpu designs would proliferate, and huang would be the head of a nonprofit foundation in oregon
- gpu development and fabrication would be internally funded by companies that wanted to use large numbers of gpus, such as amazon and the nsa
of course, companies like nvidia don't depend on the particular 'intellectual property' law being weaponized against anna's archive, and things like anna's archive benefit companies like nvidia rather than threatening them
"fabs like tsmc would hire design firms like nvidia to produce designs to fabricate; the division of labor would be the same as at present, but banks and investors would send their money to tsmc to pay to nvidia, rather than to nvidia to pay to tsmc"
There’s a bit of a snag with this. Chip designs are rarely ever done from scratch. Instead they’re iterated over many years, similar to how browsers, operating systems, and other critical software are developed. It took NVIDIA decades to get where they are now. If anyone else can just take their designs as a starting point then NVIDIA’s whole investment (billions of dollars in R&D over decades) ceases to be a competitive advantage.
I think what would actually end up happening is that chip design as NVIDIA is doing would cease to exist as a business. Perhaps we’d end up with something more akin to an open source model like Linux. But then cost of manufacturing (paying for the masks and order startup costs) would still run into the millions, and TSMC would hold all the cards.
The reason I brought all this up though may have been missed by all the commenters to my original post: the U.S. government and their strategic interests. Having American companies like NVIDIA (and Apple as well, really) lose power and marketshare is not in the interest of the government. The last thing the US wants to see is for China to close the technology gap on this stuff.
in the case you mention, the geopolitical considerations run strongly counter to what you suggest
chinese policymakers can loosen domestic restrictions on innovation such as copyright and patent laws; then the laws in the us will only restrict us companies like nvidia. in large part this has happened, which is a major reason chinese companies (in both prc and roc) have become the leading organizations in a wide variety of high-tech fields, including solar panels, cell phones, electric cars, nuclear power, and microelectronics
your nvidia analogy predicts that gcc engineers and linux kernel engineers would have terrible job security, since anyone who needs a gcc backend or device driver written can hire literally any programmer; there are no legal restrictions. but in fact this seems to make the barriers to entry higher rather than lower. they're just in the form of 'human capital', knowhow, rather than in the form of the assets of a company
also, you may not be aware of this, but tsmc is a chinese company, and it's already left the us behind. sentences like 'The last thing the US wants to see is for China to close the technology gap on this stuff.' reflect wishful thinking that the technology gap is the other way around from how it actually is
In sentences like "The last thing the US wants to see is for China to close the technology gap on this stuff", China is referring to the PRC. In sentences like "you may not be aware of this, but tsmc is a chinese company", China is referring to Taiwan. It is disingenuous to conflate these.
conflating them has been the official policy of both prc and taiwan since the prc was founded, as well as of the un, and it shows little sign of changing. in day-to-day life, there is an enormous flow of money, technical information, hardware, and skilled workers back and forth between taipei and shenzhen. taipei is a 20-minute flight away from fuzhou. the idea of maintaining a 'technology gap' between the prc and the roc is more wishful thinking, like the idea of maintaining a technology gap between california and washington, or between france and germany. i mean, at least france and germany speak different languages
Being ambiguous as to what "China" means is both vitally important for international relations and also generally unhelpful for the purpose of clear communication. Corporate (and other) espionage notwithstanding.
I assure you that 40 years ago chip companies were suing each other over IP just like today, if not worse.
If this were possible it would be happening now.
that's an odd thing to say. if it's possible now it should have been happening 10 years ago? when does the infinite regress stop?
Don't bother us with such complexities. We developed our ideas about IP after being outraged by attempts to stop our piracy of music and movies, and carefully reviewing lists of all the cons of IP (after completely ignoring and throwing out the list of the pros). The only righteous path is for the law to be reformed to reflect our views.
Since the current copyright system is very extreme, it makes sense that some opposite extreme opinions have developed as a reaction.
Making the copyright system more reasonable would increase its perceived social worth.
maybe you should listen to people like me who do think about such complexities then
Good?
One loss is another's gain.
The numerous countries submerged in misery due to US intervention certainly would welcome a lessening of its power.
I don't know about "submerged in misery" but yes, of course countries that perceive themselves oppressed by the US would welcome its decline, obviously.
How would you replace that?
Do what the other 59% and 56% are doing?
Yes, please.
I mean, have you read the USTR's glorified naughty list of countries and their utter contempt for the business models of american corporations? "Our stakeholders" this, "our stakeholders" that. These corporations literally leverage the military might of the USA to extract profit worldwide. There are countries out there where people do not have basic sanitation, the last thing they care about is policing the imaginary property of americans. But Wall Street won't have it so.
Do you have any evidence for that? It astonishes me that humanity has progressed for thousands of years without any IP protection, but for some unclear reason it wouldn't today.
That's right. Change the rules and 44% of people would instantly — instantly — lose their jobs. The US Economy would tank like a torpedoed ship overnight! Gone in a flash!
Or ... maybe what you wrote was a little overhyped, and perhaps when the rules changed the Market, as they say, would decide.
I've been trying to convince people of this for years. The problem, from my view, is that people have the idea of IP being some sort of American Dream so ingrained in their heads that they can't even reason about anything else.
What's a suitable replacement?
A copyright term of 30 years after first publication, or 30 years after creation if no publication happens in this period. Ideally add some provision that ensures works are actually available after that period (similar to how many countries require all printed books to be submitted to the national library, but extend it to all media that achieves some benchmark of significance). Patent duration adjusted on a per-industry basis. Trademarks are fine as is
The best model I've seen for patents is escalating renewal fees for periodic terms. For example, you get the first 5 years for very limited fees, and every 5 years after that (up to some max such as 30) it costs 2-5x the previous fee to renew or the patent expires. The exponentially increasing fees limit the hoarding of patents that have no direct economic benefit, and the high cost provides offsetting social benefits while still providing incentives for innovation.
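To get a feel for how fast a schedule like this grows, here is a rough sketch. The $1,000 base fee and the names `renewal_schedule` and `total_cost` are made up for illustration; only the 5-year periods, the 30-year cap, and the 2-5x multiplier range come from the proposal above.

```python
def renewal_schedule(base_fee, multiplier, period_years=5, max_years=30):
    """Return a list of (start_year, fee) pairs, one per renewal period.

    The first period costs `base_fee` (a hypothetical number); each later
    period costs `multiplier` times the previous period's fee.
    """
    schedule = []
    fee = base_fee
    for start_year in range(0, max_years, period_years):
        schedule.append((start_year, fee))
        fee *= multiplier
    return schedule


def total_cost(schedule):
    """Total paid to hold the patent through every period in the schedule."""
    return sum(fee for _, fee in schedule)


if __name__ == "__main__":
    # Compare the low and high ends of the suggested 2-5x range.
    for multiplier in (2, 5):
        schedule = renewal_schedule(base_fee=1_000, multiplier=multiplier)
        print(f"multiplier {multiplier}x:")
        for start_year, fee in schedule:
            print(f"  years {start_year}-{start_year + 5}: ${fee:,}")
        print(f"  total over 30 years: ${total_cost(schedule):,}")
```

With these assumed numbers the final 5-year period costs $32,000 at 2x and $3,125,000 at 5x, so how hard the schedule bites depends heavily on the base fee and multiplier you pick.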
So, basically, long-term patents should be a privilege reserved for the rich (people and corporations). Everyone else should GTFO? I'm sure FAANG would still be totally able to massively hoard patents with whatever fee structure you propose.
FFS, Amazon developed literally dozens of commercial properties in my area, and then let them sit vacant for literal years because they had some change in strategy. They're perfectly happy to burn money, since they have so much.
So, your idea doesn't solve the problem you're trying to solve, and it would make things worse besides.
If the scale is exponential, it becomes billions of dollars to keep a patent after a couple decades. Then trillions, at some point after which they’ve decided to give up the patent. Many companies are rich, but they aren’t infinitely rich.
Patents already have a limited term, why would we want this in a policy?
Current term is something like 70 years after the death of the author.
It's simply too long for a world that moves this fast.
You are thinking about copyright. The fact that you can't distinguish between the two should probably be a sign to bow out of dictating IP policy.
The term is too long. This changes it from a fixed term, to a use-it-or-lose-it oriented policy.
Patent term is 20 years from the date of filing, and the enforceable period doesn't begin until the patent issues, which is usually years after it was filed. It's incomprehensible to me how you can seriously argue that the US patent term is too long but then should be replaced with a system that allows for an indefinite term.
Do you know what a trade secret is? Trade secret already has an indefinite term. There's no reason why patent terms should be indefinite, it's antagonistic to the concept of disclosure that is required to obtain a patent in the first place.
Copyright should probably grant some benefit to an author, but just the bare minimum necessary to incentivize people to actually submit their work and file for copyright (also, we should resume requiring that you actually file for copyright, as was the case before 1978). This probably means some period of exclusive monetization rights. Anyone should be able to search for and read any filed works for free from the moment they are filed, or possibly after this exclusive monopoly period.
Any additional benefits to copyright holders beyond what is needed to make sure we don't lose the works are essentially graft, no different in principle to the medieval church selling lucrative offices.
If these things are valuable only because of scarcity, then we are incentivizing scarcity by granting monopoly, so we should do as little of that as we can manage. If they are inherently valuable, they should be as widely disseminated as possible (a cost that government can easily afford given modern technology). If they are worthless, there is no harm in the government keeping a copy anyway.
"but just the bare minimum necessary to incentivize people to actually submit their work and file for copyright"
Oh, okay!
"The problem, from my view, is that people have the idea of IP being some sort of American Dream so ingrained in their heads that they can't even reason about anything else."
Maybe it depends on which people you ask, but I don't think the American Dream is about IP at all --- but mostly freedom and independence.
Deeply entrenched in that freedom is the fantasy of inventing some new miracle product, or supremely popular song or book. It’s protecting the idea that someday I might be the wealthy beneficiary of these practices and rules. We see people vote against their interests all the time because of “The American Dream”, and the hope that they might achieve it.
At the heart of the American Dream (and indeed of most Western culture) is the meritocracy - or more specifically the lie of it.
https://en.wikipedia.org/wiki/Myth_of_meritocracy
Freedom and independence are propaganda in the same way that high school football coaches tell their players they can get into the NFL if they work hard enough
perhaps my formulation in https://news.ycombinator.com/item?id=40911782 will be helpful to you. or in https://news.ycombinator.com/item?id=40911903
What would we use instead?
Copyright should expire after like 15 years.
Academic publishers should not exist, research - especially publicly funded research - should just go straight to the public domain.
5 year, renewable twice to a maximum of 15 years, by registering and paying a nominal renewal fee (let's say $200), which is multiplied on the second renewal (let's say 3x or $600).
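Sketching that fee structure out, using the example numbers above ($200 for the first renewal, 3x that for the second). The function name `copyright_cost` is made up, and treating the initial 5-year registration as free is an assumption, since the proposal doesn't say what the first term costs.

```python
TERM_YEARS = 5            # length of each term, from the proposal
FIRST_RENEWAL_FEE = 200   # "let's say $200"
SECOND_MULTIPLIER = 3     # "let's say 3x or $600"
INITIAL_FEE = 0           # assumption: the proposal does not price the first term


def copyright_cost(renewals):
    """Return (years_of_protection, total_fees) for 0, 1, or 2 renewals."""
    if renewals not in (0, 1, 2):
        raise ValueError("the proposal allows at most two renewals")
    renewal_fees = [FIRST_RENEWAL_FEE, FIRST_RENEWAL_FEE * SECOND_MULTIPLIER]
    years = TERM_YEARS * (1 + renewals)
    total = INITIAL_FEE + sum(renewal_fees[:renewals])
    return years, total


if __name__ == "__main__":
    for renewals in range(3):
        years, total = copyright_cost(renewals)
        print(f"{renewals} renewal(s): {years} years for ${total}")
```

Under those assumptions the full 15 years costs $800 in renewal fees, which is trivial for anything still earning money and just enough friction that abandoned works lapse after 5 or 10 years.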
So after 15 years, I can start selling other people's movies and books? That seems odd.
Some research is funded through private donations to a specific school, or because the team shopped their project around to find private funding directly. This goes beyond public funding, which muddies the question of public access to the research, especially when compared to something like data produced by NASA.
14 years was considered good enough back when it was prohibitively expensive to publish anything and worldwide distribution was basically impossible. Today, when publishing is essentially free and worldwide distribution happens at close to light speed you think we should expand copyright for another year?
I lean more towards 7-10 years, with required registration involving a DRM-free copy of the work submitted to the US copyright office (where possible), which will automatically host that file for free once the copyright term has expired. There should be an RSS feed from copyright.gov with download links to the latest works entering the public domain. That'd also make it dead simple to find who you need to contact if you want to negotiate rights to use a work still under copyright's protection.
I agree that anything getting public funding should be public domain on day one (normal exceptions for national security etc)
I can respect that. I've been commenting on the insanity of intellectual property and calling for its abolishment for years. Now I'm thinking about writing an article on it with my opinions so I can just point to it instead of arguing the same points over and over again.
Expressing views on HN doesn't directly change anything but there are some benefits. For example, over the years it's become clear to me that I'm not alone in thinking this system is screwed up. Every time I expressed some "unhinged" opinions, as some people here called them, it felt like going against the status quo, against all odds. Inevitably though, somebody else would show up and show me that I'm not insane, in fact I'm not even being radical enough.
Definitely recommend everyone interested to listen to rms's talk "Copyright vs Community". It changed the way I thought about it some 15 years ago. It's only got worse since, but it seems more people are coming to the same conclusion. Maybe we can do something about it.
rms suggests dialling back copyright rather than completely abolishing it: 10 years from date of publication. Of course he doesn't believe in copyright for software at all, but that's another matter.
The funny thing is the way these greedy assholes in the copyright industry are behaving is just making it worse for them. It's driving people to places like z-library because essentially everything is in copyright. A child has just been born who won't ever see a work that was published 50 years ago go out of copyright. It's insane. With sensible copyright lengths we wouldn't need z-library.
It's good that you still said it!