It looks like a lot of this stuff is already in the public domain and so they are just letting people use the scans/photos of these works that they took and are hosting. It's a shame that we don't already have this being done by public libraries.
It seems like these are all already in the public domain. They were already free to use however we liked and neither the Getty nor anyone else could use copyright to prevent that. It is nice that the Getty is providing some scanning and hosting to make them accessible, though.
The article specifically mentions Irises by Van Gogh, but the link (https://www.getty.edu/art/collection/object/103JNH) takes you to a page where it seems like you have to ask nicely and agree to terms to use this public domain image.
The artwork is in the public domain. The photo of it is not. You can take your own photo, but if you want to use a photo someone else took, then you need their permission.
This was a nice idea that was tested and failed in court: https://en.wikipedia.org/wiki/Bridgeman_Art_Library_v._Corel.... Making a reproduction with a camera does not create a new copyright because copyright protects creative expression, not skill with a camera.
Maybe if you have money to burn taking it to court. In practice if you want to publish it you get permission. The publisher is not going to go to court for you.
People can sue for anything they like, and yet we still manage to get up in the morning and do things. Bridgeman was 25 years ago. I would be curious whether, in that time, you can point to any cases where someone has challenged the use of a photographic reproduction of a public domain work. My hunch is no, not because people don't want to make money licensing things or enjoy going to court, but because most people actually agree with the premise of the decision.
Well, then your wife and her publisher are partially responsible for the continual erosion of our rights. The public domain is important and worth defending.
Just because someone has an image that is in the public domain doesn't mean they're obligated to share it with you. They can simply say no. Your options are:
1. Hire a photographer to take a photo of the artwork that you want to use. You have to pay the photographer, and you still need the museum's permission to access the work in a setting where you can take a good photo suitable for publication. You probably can't just snap a photo while touring the museum and use that.
2. Use a photo that the museum has and pay them and/or agree to their terms. Maybe you could take that photo and then share it because it's technically public domain, but guess what's going to happen the next time you want to use one of their photos then?
Well, they can't say no if you are already in possession of the image.
Certainly, they are under no obligation to take the photo, deliver the photo, or host the photo on a server, but once you have the photo, there is no legal mechanism preventing you from using it. Your arguments in this thread have gone: copyright protects photos of public domain works -> well people can still sue you -> well people don't have to give you access. You have arrived at the truth: museums don't have to take photos or share them with you. They own the physical artwork and can control physical access to it. I don't think that was in dispute.
It’s a moot point that you can use it without permission if you can’t obtain it without permission. This is a distinction without a difference.
That’s why it’s actually a big deal and a Good Thing that museums are making these images available online. It’s not as simple as “they were public domain anyway”.
Well, once a copy of a public domain work is published, you can reproduce it from that publication without permission or license.
My wife is an art historian, she's currently publishing a book and collecting permissions for all the images she wants to use. Why? Because if she doesn't the publisher won't publish her book and she won't get tenure. Do they care that there is a 25 year old legal precedent about how they could use the image for free? No, they do not. If it's an image the museum hasn't made publicly available on their website, then they still have to send it to you in order for you to use it and they're going to attach whatever terms and payment requirements to it that they want before they do that, no matter how much you tell them that it's a public domain image.
That sounds like it is entirely the publisher’s internal policy, not at all related to whether something is in the public domain or not.
A publisher can certainly impose their own restrictions before publishing something to lower liability, but that does not mean the photographs themselves are not public domain, as GP comment explicitly showed.
If you have any case law since then that proves otherwise, great, but an anecdote is not proof enough.
Publishers, documentary makers, etc. tend to go out of their way to cover their butts with written permissions. It's easier than having to deal with someone who thinks they're owed something after the fact.
I imagine court gets a lot cheaper when there is a precedent covering the exact situation in question.
And probably cheaper still if you have written permission to use the image in question.
This is great, and I will repeat something here I have mentioned in the past about AI image generator datasets.
Instead of building image generators off of images scraped from people's art without their consent, we could use openly licensed images, and intentionally push the public towards licensing more of their images for open datasets by encouraging twitter and instagram/meta to add image license options to image uploads, and running some public service campaigns on these platforms to encourage use of open licenses to help build better datasets.
At the same time, the smaller image sets that would be available would encourage additional research into sample efficiency, which I regularly hear would be a generally useful area for further research.
This approach would ensure several things: Public datasets would be available to all, not just the major institutions (assuming enough people cared about open licensed models to push the major players to use them), sample efficiency would be improved, more people would get used to licensing their images for public use, and artists who did not want their style copied by these technologies would have their rights respected.
That last point would have a HUGE effect on public opinion about AI, building trust between the public and the AI research community.
Ultimately I am in favor of a world where intellectual property restrictions are sharply curtailed, but to do that ethically we would need to put in place other systems to provide for those whose livelihood depends upon their intellectual works. And if someone created a work before AI existed, and they reserved their legal rights for their work at the time of publication, I think that should be respected such that the work would be left out of machine learning datasets.
This approach would take more work, but setting this expectation would encourage the tech industry to clarify image licenses on their platforms, ultimately promoting open culture and open image licenses.
What's the benefit to artists of licensing works in a way that's favorable for AI models? Most artists hate the concept of AI art regardless of whether or not it's a threat to their livelihoods.
The benefit is that each individual artist loses very little from their particular piece of art being licensed, but they can make money for that.
The AI art models are still going to be trained regardless. One artist opting out, individually doesn't stop any of that.
Artists lose little by licensing their art to AI models???
Correct, they lose little.
Because those art models don't need one individual artist.
There are existing art models already out there, and new models aren't going to be stopped because one artist didn't license their art.
Yeah, I don't see how they lose anything.
I could see it slowly approaching the open source model - you do it because you care, and maybe some people will donate to you because they want to see your work continue. You can also say you've done it, so if you're a prolific artist it would probably look good.
Also, I feel like the main reason artists have a beef with AI right now is mostly because lots of their published works were used without permission to train models. I think if instead SD/Midjourney et al had used open datasets curated in the way TaylorAlexander described, there would be a lot less pushback, because everyone would know the models were trained with consent of the underlying artists responsible for the training data's existence.
There is still the concern of automation eliminating demand for work done by humans, but I have a hunch that in the long term, artists will embrace these tools in the same way that's been done with Photoshop and every other digital tool. It still might be very different, i.e. AI is much more powerful/enabling than Photoshop, but I'm not sure that'll change the outcome.
> I have a hunch that in the long term, artists will embrace these tools in the same way that's been done with Photoshop and every other digital tool.
I think artists will too, but so will everyone else, including many who wouldn't have the skills to create the art they want without AI and who would have had to hire an artist. Photoshop did this to a small extent, but it is possible that AI will meet most people's needs in most situations. As someone with little ability to draw or paint I'm excited for that future personally, but I can't blame professional artists for being nervous. That their own artwork is being used to train their replacement is just rubbing salt into their wounds.
Right now, AI is putting out a lot of substandard work, and artists may find themselves employed just to fix the quirks of AI output, but I doubt they'll find that fulfilling. Eventually AI art may become so homogenized, derivative, and censored that it won't satisfy clients and their customers and if that happens demand for real artists will improve, but I think things could get really difficult for many professional artists in the meantime and I don't think many will be willing to offer up their art for AI the same way most people wouldn't offer to weave rope or sharpen axes for their executioner.
I personally find this incredibly hard to believe. The further you go down the path of learning an art (say painting, photography, or sculpting), the more you realize it's not just technique - it's a way of thinking, seeing, and organizing an image in a way that viewers can understand it.
AI images can copy "composition rules," but without an understanding of what the "artist" is trying to achieve, it can only guess. And the artist themself, if untrained in art, does not know what they are trying to achieve.
If you haven't read "Drawing on the Right Side of the Brain," it may help to get across exactly what I mean. Most non-artists cannot pre-visualize the image they want to create, even if it's the scene directly in front of them.
> we could use openly licensed images
There simply aren't enough of them. Stable Diffusion is trained on 5 billion images. That kind of scale doesn't exist in public domain artwork. This dataset of 88k images is 0.0017% of that.
Also, it's worth noting that intellectual property is a weird construct. Human artists are trained by looking at copyrighted works their entire lifetime. If you asked me to draw a cartoonish bear, I could not guarantee you that it wouldn't vaguely look like Winnie the Pooh or Baloo. I've seen those things and can't erase them from my head. And if you prevented me from ever seeing copyrighted works for my whole life, I might not be able to draw anything.
So why are we holding AI to a different standard?
I don't think we are on the training side. I think the artists would like us to.
> Stable Diffusion is trained on 5 billion images. That kind of scale doesn't exist in public domain artwork.
Right, which is why work on sample-efficiency would be so valuable.
> So why are we holding AI to a different standard?
Because it is an automated computer system, not a human being. It can be held to a different standard because it is an entirely different system.
So an image uploaded by a random person (not necessarily the copyright owner) to twitter and it automatically is added to this public set? Not sure legally it would be any different.
The understanding (or level of giving a crap) the average social media user has of copyright is pathetic. How often do you see "DM for credit" or "no copyright intended"?
This is pretty much what Adobe Firefly does, being trained on stock images that they have rights to.
It seems downloads of the highest-resolution images (over 10k pixels on the longer side) constantly fail for me.
For example I can't download the 11k version of this [1].
Is anyone else experiencing this?
Worked for me
I can't either. wget reports "Read error at byte 7843894" every time.
It's saying "Read error at byte 9604429" for me. Feels like the Load Balancers might be overloaded.
Also, the server (or the load balancer) doesn't support range download, so I cannot resume where it failed :(
Getting the 11k works for me.
Edit: When I tried to get some others, the error showed up: "Read error at byte 7127040". It seems they are probably either rate limiting, running overloaded servers, or having some more serious issues.
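For anyone hitting the same read errors: since the server reportedly does not support range requests, a failed download cannot be resumed and has to restart from byte 0. Below is a minimal Python sketch of that retry loop. The function names `supports_ranges` and `download_with_retries`, the backoff values, and the injectable `opener` parameter are all illustrative assumptions, not anything the Getty site documents.

```python
import http.client
import time
import urllib.request


def supports_ranges(headers) -> bool:
    """Return True if the response headers advertise byte-range support."""
    return (headers.get("Accept-Ranges") or "").strip().lower() == "bytes"


def download_with_retries(url, dest, max_attempts=5, backoff_seconds=2.0,
                          opener=urllib.request.urlopen):
    """Download url to dest, restarting from byte 0 after each failed attempt.

    Returns the number of attempts used. `opener` is a hypothetical hook so
    the loop can be exercised without touching the network.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            # Opening dest in "wb" mode discards any partial data from
            # the previous failed attempt.
            with opener(url) as response, open(dest, "wb") as out:
                # Stream in chunks so a large image is never held in memory.
                while chunk := response.read(64 * 1024):
                    out.write(chunk)
            return attempt
        except (OSError, http.client.HTTPException):
            if attempt == max_attempts:
                raise
            # Linear backoff gives an overloaded server time to recover.
            time.sleep(backoff_seconds * attempt)
```

This sketch does not verify the final file size, so a connection that closes early without raising an error could still leave a truncated file. If `supports_ranges` ever returns True for these files, resuming with a `Range` header would be the better approach.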
Nice! I love using high quality images of art for some of my personal projects. Might build something similar with these like I did for the Rijksmuseum in Amsterdam[1].
Very cool. How are you extracting the colors from the image? Trying to do something similar but without yielding anything I'm happy with
Thanks! Some of the artworks have the colors as part of their metadata, so I directly use those.
For a different project I’m looking into using k-means to determine the dominant colors.
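As a rough illustration of the k-means idea, here is a minimal pure-Python sketch that clusters RGB tuples and returns the cluster centers, largest cluster first. The function name `dominant_colors`, the fixed iteration count, and seeding the centers from distinct colors are assumptions made for this example. A real project would more likely load pixels with an imaging library and use an optimized k-means implementation.

```python
import random


def dominant_colors(pixels, k=3, iterations=10, seed=0):
    """Cluster RGB tuples with plain k-means.

    Returns k centroids as integer RGB tuples, ordered from the largest
    cluster to the smallest. Requires at least k distinct colors.
    """
    rng = random.Random(seed)
    # Start from k distinct colors so no two centroids begin identical.
    centroids = rng.sample(sorted(set(pixels)), k)
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each pixel to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for pixel in pixels:
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2
                                  for a, b in zip(pixel, centroids[i])),
            )
            clusters[nearest].append(pixel)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = tuple(
                    sum(channel) / len(members) for channel in zip(*members)
                )
    ranked = sorted(zip(clusters, centroids),
                    key=lambda pair: len(pair[0]), reverse=True)
    return [tuple(round(c) for c in centroid) for _, centroid in ranked]
```

For a full-size scan it would make sense to downscale or subsample the pixels first, since this loop visits every pixel on every iteration.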
Yeah, I have a similar-ish project that I should see about adding these images to.[1] Will have to see if they have a nice API.
I think those are called dark-sky preserves? https://en.wikipedia.org/wiki/Dark-sky_preserve
Think ya replied to the wrong thread, mate.
A lost explorer from https://news.ycombinator.com/item?id=39709981
You're right... somehow I jumped to the previous thread.
"J. Paul Getty was not a billionaire known for his generosity. "
this is a rather disrespectful way to refer to the benefactor.
To the contrary, it is a very mild way of describing one of the most famously callous, cartoonishly vicious misers in modern history. This man was right up there with Mr. Burns.
https://brightside.me/articles/the-story-of-the-richest-man-...
this speaks more about your character than his
Not impressive. I run a meme image content farm and I have scraped many times more than that. Available for all of you to right-click-save.
In ~5 years, only one of them will still be around.
Now add detailed metadata and categorise every image.
Hey, about these double ones, are they like old 3D photography?
Yes, it's a stereograph: https://www.americanantiquarian.org/stereographs.htm
Pretty cool! Thank you for the link!
Great. Now Samsung can put these on my Frame TV and charge me $5.99 a month for the privilege of accessing them.
Getty-Images* will barefacedly "license" you public domain images at significant cost (i.e. thousands per image.)
Worse, they've attempted to extract licensing fees from people who use such public domain images, in one unfortunate case: Carol M. Highsmith, a photographer who had donated her works to the public domain received a letter of demand from Getty Images for using her own public domain images.
https://petapixel.com/2016/11/22/1-billion-getty-images-laws...
* While Getty Images and Getty Museum are born of the same "Getty" family, they aren't connected entities, and one does not reflect upon the other.
sounds like PR
Yes, I was disappointed to find only one or two images from each famous artist.
Not to be confused with Getty Images Holdings Inc who would likely fall quite on the opposite side of this move.
I suggest a title update to “The J. Paul Getty Museum …” to the OP if they read this
Those images have only a little commercial use IMO.
When I was publishing a typography magazine in the 90s, most of my covers were non-typographic images. I used an image of a statue of Venus from the Getty for one cover. They not only provided the transparency free of charge, but offered to take a picture from a different angle if I needed a different image of the statue (also for no charge). The Getty does a lot to share their collections. (In contrast, I paid a couple hundred dollars to LACMA for the use of a transparency of a 17th century painting in their collection.)
If you're ever in LA, I recommend checking out the Getty. I'm by no means an art buff in any capacity, and honestly I thought going might be boring because I did the Getty Villa in Malibu when I was in elementary school and wasn't appreciative, but it completely changed my mind. Even if you don't like the art or the exhibits, the views of Los Angeles are amazing, probably better than Griffith in my personal opinion.
Even if the artwork is in the public domain, the photo of the artwork that someone takes still has a copyright and you need permission to use it.
This is not true in the US.
Under U.S. copyright law, the person who creates a work is the copyright owner. So if a photographer takes a picture of an artwork, they own the rights to their image.
Read more here: https://www.copyright.gov/engage/visual-artists/
Copyright only protects creative works, where the author's artistic intent can be distinguished.
Faithful reproductions of non-copyrighted two-dimensional work are considered non-copyrightable, because nothing of artistic value is added in the process. (There is a lot of mechanical work around photos, but mechanical work doesn't enjoy copyright.)
Right. Just because something is a lot of work ("sweat of the brow") doesn't make it copyrightable in the US. Feist is one of the main cases in this area as I understand it.
You might find it helpful to refer to the Copyright Office's Compendium, which covers copyright law specifics in more detail. It's big, but quite approachable.
Chapter 300 covers copyrightability in general: https://copyright.gov/comp3/chap300/ch300-copyrightable-auth...
Chapter 900 covers visual art specifically and goes into more detail on the copyrightability of photographs: https://www.copyright.gov/comp3/chap900/ch900-visual-art.pdf
But the photo needs to be taken at a weird angle and it helps to place a live squirrel somewhere on the artwork.
A photo of a public domain work doesn't generally count as "creating a work". Especially not when there's nothing artistic added or any additional context to it. If I just take a photo of an artwork that's already in the public domain it is essentially little more than a reproduction.
This is not true in the US. The most relevant case is Bridgeman v. Corel [1], ruling that photographic reproductions of public domain paintings could not be copyrighted.
Museums like to pretend that they hold copyrights on these photos. They do not.
(In the US).
[1] https://en.wikipedia.org/wiki/Bridgeman_Art_Library_v._Corel....
In practice, you still need the museum's cooperation to do anything commercial with it, for most people who don't have the resources to defend a lawsuit.
More importantly, even though you can't copyright a photo/scan of a public domain work you also don't have to share those images with others at all, you could still paywall them off and (arguably) impose additional restrictions as a condition of getting access, and you certainly don't have to host the files online for everyone like this.
Yes, you could physically and contractually prohibit others from copying the original work.
Could you though? Once you have a good image, if it isn't copyrighted it sounds like any licensing agreement is void. You could just distribute it, not under the terms of the license, but as a public domain image.
Wikipedia has assumed, since Bridgeman vs. Corel, that images of public domain works are public domain. Once some portrait museum in the UK threatened to sue, but backed down quickly.
That’s jurisdiction dependent. For example it’s no longer the case in the UK. https://www.theartnewspaper.com/2023/12/29/court-of-appeal-r...
> It's a shame that we don't already have this being done by public libraries.
What makes you say that? It's already done by so many institutions within the GLAM sector. They usually don't have the same marketing budget as Getty though. Here's[1] a glimpse into the digital heritage of the EU. That's a good starting point for some exploration.
[1] https://www.europeana.eu/en
Download link is a dead dropbox account. And this is the first thing I tried.
Bad luck to get something like that in 50,000,000+ chances. Which item was it?
I tried 5 random things and every single one was a dead link. Taxpayer money well spent here
Spending 6 million EUR per year to get an index of blurry pictures in ElasticSearch (that they just crawled from somewhere). Well, it's only been here for 15 years... so do the math.
You are clearly misrepresenting the value here. They also put a watermark on each image. /s
In all fairness, they don't all have watermarks. I don't know why. Perhaps they collected images from a variety of sources before downscaling and those sources included watermarks. I kind of wish that sort of thing had been part of the QA process though. Decent resolution and clean images would make this so much more valuable.
Where and what exactly?
I entered "dog" as search term and each and every item I clicked on, that had a Download option, worked fantastically well: https://www.europeana.eu/en/search?page=1&view=grid&query=do...
Taxpayer money well spent indeed!
Still very low resolution though :/
the number of chances is not the problem, it's their outcome ;)
This is the first item I tried:
https://www.europeana.eu/en/item/08547/Museu_ProvidedCHO_Sta...
The image is only 600x768 pixels. Way too small to be useful. You can't even read the text in the image. The original work is 30x42 cm.
A starting point for exploration, as I said… You usually have to follow links to the online collections of the originating institution to find high resolution files. Some have them, some do not.
It would be cool if it actually worked like that (as a low resolution preview of things you can get on the actual museum website) - in this case the one offered on the museum website appears to be identical. Same low resolution, same watermark. Perhaps they could save people time by tagging entries that they had already received in low or watermark damaged quality so they could be filtered out of the search.
Ouch. And with a 300 pixel wide watermark at that...
don't dismiss the value in simply making them accessible. they're providing a platform to access these works, which is great.
I'm surprised to find myself defending Getty Images at all, but here I am. Before I dealt with huge collections as a developer, I didn't consider how much intellectual work went into managing collections as entities. Things like ontology, or even maintaining consistent terminology across that much metadata, are not trivial, and older materials cause the most problems. Even if these were exclusively public domain images, it would still be a praiseworthy effort. I'm sure some Getty archivist or the like fought hard for this.
It’s 88k images… I think you are massively overestimating the complexity
Which necessary steps and how much time would you estimate per image?
"Before I dealt with huge collections as a developer" says the comment I responded to. 88k is not a huge collection of anything from a software development perspective.
The compilation of that collection is obviously more considerable. Archivists at Getty have already said it is about 10 years work for ~20 people.
I guess that depends on what you consider "development" to be?
Yeah, I suppose that's a fair point.
They probably think they can code something in 5 minutes
You're thinking about this from a technical perspective, but the tech is the easy part. The content, and therefore structure and logical chunking of the metadata, is far more consequential than the number of images. I've seen people in academic environments pour tens of thousands of post-doc-hours into cataloging a few thousand images:
https://nuremberg.law.harvard.edu/
The J. Paul Getty Museum, not to be confused with Getty Images. (I made the same mistake reading the headline.) Don't worry, you can continue to hate on Getty Images at your leisure.
Haha... d'oh! That's a fail.
This isn't a Getty Images initiative, but rather being run by the Getty Museum which is under the Getty Trust. My partner actually works on some of these systems at the Getty and you're right, it's daunting. The sad part is that the Getty gets to do these kinds of things only because it's one of the best endowed institutions on the planet with funding to pay researchers/archivists/developers to do this work. There is so much more that is locked away at large, medium, and small institutions that would be available for the public, but just isn't because of lack of funding.
ADHD fail on my part! I even skimmed the article and totally didn't even parse that basic bit of context. Considering that Getty Images (originally photodisc) didn't appear until the early 90s, I'd love to see this turn into a World Wildlife Fund/World Wrestling Federation thing.
It being a museum makes it a lot more compelling though. Great job Getty!
And even within those institutions there's still more work than could feasibly be done. I worked in one of the best-funded, if not the best-funded non-governmental library systems on the planet and they've got an order of magnitude more things that need to be digitized than what they've already done, and they do a LOT already.
Some do. A lot depends on the library’s resources, but the Library of Congress has a lot of free imagery available, and I think a number of other large public libraries with art collections also do the same.
Hopefully they're taking advantage of this release and adding these images and scans into their own collections.
I'd be shocked if the IA wasn't all over this already.
Many AI models were trained without regard to whether the images were free to use, including many from Getty, watermark and all, and that resulted in a lawsuit that is still ongoing.
And 88k images is tiny when we consider that LAION has billions of entries.
IA = the Internet Archive, https://archive.org.
Yeah, sorry. Theory of mind failure on my part. Not everybody has my list of acronyms in their head, apparently.
Except we do.
https://digitalcollections.nypl.org/
https://www.flickr.com/photos/britishlibrary