The difference between (A) how software engineers react to AI models and systems for programming and (B) how artists (whether painters, musicians, or otherwise) react to AI models for generating images, music, etc. is very interesting.
I wonder what the reason is.
Coding assistants are not good enough (yet). Inline suggestions and chats are incredibly helpful and boost productivity (though only for those who know how to use them well), but that's as far as they go today.
If they could take a Jira ticket, debug the code, create a patch for a large codebase, and understand and respect all the workarounds in a legacy codebase, then I would have a problem with it.
Have you seen https://www.swebench.com/ ?
Once you engage agentic behaviour, it can take you way further than just the chats. We're already in the "resolving JIRA tickets" territory - it's just hard to set up, not very well known, and can be expensive.
Looks like the definition of "resolving a ticket" here is "come up with a patch that makes all the tests pass", which does not necessarily include "add a new test", "make sure the patch actually does something meaningful", or "communicate how this was fixed". Based on my experience and on the reports I saw in the logs, a "solution" can be completely hallucinated, useless code -- as long as it doesn't fail a test.
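To make that failure mode concrete, here's a hypothetical Python sketch (the ticket, function, and test are all invented): a patch can turn the suite green by overfitting to the test fixture instead of fixing anything.

    # Hypothetical ticket: "normalize_email() mishandles uppercase domains."
    # The project's only existing test:
    def test_normalize_email():
        assert normalize_email("Bob@EXAMPLE.com") == "bob@example.com"

    # A meaningful patch would lowercase the domain for every address.
    # This "patch" also passes, while leaving the actual bug in place:
    def normalize_email(address: str) -> str:
        if address == "Bob@EXAMPLE.com":  # overfit to the test input
            return "bob@example.com"
        return address  # every other address stays broken

Whether a benchmark catches this kind of thing depends entirely on how strong its held-out tests are.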
Of course, it is still impressive, and definitely would help with the small bugs that require small fixes, especially for open source projects that have thousands of open issues. But is it going to make a big difference? Probably not yet.
Also, good luck doing that on our poorly written, poorly documented, and under-tested codebase. By any standard, Django is a much better codebase than the one I work on every day.
Some are happy to have it create tests as well, but you probably want to write those mostly yourself. I mean, only you know the real-world context - if the ticket didn't explain it well enough, LLMs can't do magic.
Actually, the "poorly documented" and "poorly written" parts are not a huge issue in my experience. The "under-tested" part matters far more if you want to automate that work.
For very simple tasks maybe, but not for the kinds of things I get paid to do.
I don't think it will be able to reliably do difficult programming tasks that require understanding and inferring requirements without being AGI, in which case society has other things to worry about than programmers losing their jobs.
Except they can't do the equivalent for art yet either, and I am fairly familiar with the state of image diffusion today.
I've commissioned tens of thousands of dollars in art, and spent many hundreds of hours working with Stable Diffusion, Midjourney, and Flux. What all the generators are missing is intentionality in art.
They can generate something that looks great at surface level, but doesn't make sense when you look at the details. Why is a particular character wearing a certain bracelet? Why do the windows on that cottage look a certain way? What does a certain engraving mean? Which direction is a character looking, and why?
The diffusers do not understand what they are generating, so they just generate what "looks right." Often this results in art that looks pretty but has no deeper logic, world-building, meaning, etc.
And of course, image generators cannot handle the client-artist relationship either (even LLMs cannot), because it requires an understanding of what the customer wants and what emotion they want the commissioned piece to convey.
So - I rely on artists for art I care about (art I will hang on my walls), and image generators for throwaway work (such as weekly D&D campaign images).
Of course the "art" art -- the part that is all about human creativity -- will always be there.
But lots of people in the art business aren't doing that. If you didn't have Midjourney etc., what would you be doing for the throwaway work? Learn to design the stuff yourself, hire someone on Upwork, or just not do it at all? Some money would likely change hands there.
The throwaway work is worth pennies per piece to me at most. So I probably wouldn't do it at all if it wasn't for the generators.
And even when it comes to the generators, I typically just use the free options like open-source diffusion models, as opposed to something paid like Midjourney.
But that's not that far off. Sure, it can't do this today. But "read a ticket with a description, find the relevant code, understand the code (often better than a human), test it, return the result" is totally doable with a few more iterations. It's already doable for smaller projects - see GitHub workspaces etc.
Because code either works or it doesn't. Nobody is replacing our entire income stream with an LLM.
You also need a knowledge of code to instruct an LLM to generate decent code, and even then it's not always perfect.
Meanwhile, plenty of people are using free/cheap image generation and calling it "good enough". Now they don't need to pay a graphic artist or for a stock photo licence.
Any layperson can describe what they want a picture to look like, so the barrier to entry and successful exit is a lot lower for LLM image generation than for LLM code generation.
and getting sandwich photos of ham blending into human fingers:
https://www.reddit.com/r/Wellthatsucks/comments/1f8bvb8/my_l...
And yet, even knowing what I was looking for, I went so long without seeing it that I assumed I had misunderstood, and swiped to the second image, where it was pointed out specifically. Even if I had noticed it myself--presumably because I was staring at it for way too long in the restaurant--I can't imagine I would have guessed what was going on, BUT EVEN THEN it just wouldn't have mattered... clearly, this is more than merely a "good enough" image.
At best it's a prototype and concept generator. It would have to yield layered assets that can be exported to an illustration or bitmap tool of choice. AI-generated images are almost completely useless as-is.
I agree there are plenty of images with garbled text and hands with 7 fingers, but text-to-image has freely available generators that create almost perfect images for some prompts. Certainly good enough to replace an actor holding a product, a stock photo, and often a stylised design.
Look at who the tools are marketed towards. Writing software involves a lot of tedium, eye strain, and frustration, even for experts who have put in a lot of hours practicing, so LLMs are marketed to help developers make their jobs easier.
This is not the case for art or music generators: they are marketed towards (and created by) laypeople who want generic content and don't care about human artists. These systems are a significant burden on productivity (and a fatal burden on creativity) if you are an honest illustrator or musician.
Another perspective: a lot of the most useful LLM codegen is not asking the LLM to solve a tricky problem, but rather to translate and refine a somewhat loose English-language solution into a more precise JavaScript solution (or whatever), including a large bag of memorized tricks around sorting, regexes, etc. It is more "science than art," and for a sufficiently precise English prompt there is even a plausible set of optimal solutions. The LLM does not have to "understand" the prompt or rely on plagiarism to give a good answer. (Although GPT-3.5 was a horrific F# plagiarist... I don't like LLM codegen but it is far more defensible than music generation)
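As a hypothetical illustration of that translation step (in Python rather than JavaScript; the prompt and names are invented), a loose English spec like "pull the dollar amounts out of a receipt, largest first" maps almost mechanically onto a memorized regex trick plus a sort:

    import re

    def amounts_largest_first(receipt_text: str) -> list[float]:
        """'Pull the dollar amounts out of a receipt, largest first.'"""
        # The "bag of memorized tricks": a currency regex and a reverse sort.
        amounts = [float(m) for m in re.findall(r"\$(\d+(?:\.\d{1,2})?)", receipt_text)]
        return sorted(amounts, reverse=True)

    print(amounts_largest_first("Coffee $4.50, bagel $3.25, tip $1"))
    # -> [4.5, 3.25, 1.0]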
This is not the case with art or music generators: it makes no sense to describe them as "English to song" translators, and the only "optimal" solutions are the plagiarized / interpolated stuff the human raters most preferred. They clearly don't understand what they are drawing, nor do they understand what melodies are. Their output is either depressing content slop or suspiciously familiar. And their creators have filled the tech community with insultingly stupid propaganda like "they learn art just like human artists do." No wonder artists are mad!
What you say may be true about the simplest workflow: enter a prompt and get one or more finished images.
But many people use diffusion models in a much more interactive way, doing much more of the editing by hand. The simplest case is to erase part of a generated image and prompt the model to infill (sketched below). But there are people who spend hours getting a single image where they want it.
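For the curious, here's a minimal sketch of that erase-and-infill loop using the open-source diffusers library (the model ID and file paths are assumptions, not recommendations):

    # Erase-and-infill with Hugging Face diffusers; model/paths are assumed.
    from diffusers import StableDiffusionInpaintPipeline
    from PIL import Image

    pipe = StableDiffusionInpaintPipeline.from_pretrained(
        "runwayml/stable-diffusion-inpainting"
    ).to("cuda")

    image = Image.open("generated.png").resize((512, 512))
    # White pixels in the mask mark the hand-erased region; black is kept.
    mask = Image.open("mask.png").resize((512, 512))

    # Re-generate only the masked region to match a new prompt.
    result = pipe(
        prompt="a silver bracelet with a leaf engraving",
        image=image,
        mask_image=mask,
    ).images[0]
    result.save("infilled.png")

People then iterate on the mask, prompt, and seed until the detail reads the way they intend.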
This is true, and there's some really cool stuff there, but that's not who most of this is marketed at. Small wonder there's backlash from artists and people who appreciate artists when the stated value proposition is "render artists unemployed".
": they are marketed towards (and created by) laypeople with who want generic content and don't care about human artists"
Good. The artists I know have zero interest in doing that work. I have sacrificed a small fortune to invest in my wife's development as an artist so she never had to worry about making any money. She uses AI to help with promoting and "marketing" herself.
She and all of her colleagues despise commissioned work, and they get a constant stream of requests. I always tell her to refuse them. Some pay very well.
If you are creating generic "art" for corporations I have little more than a shrug for your anxiety over AI.
I love art and code. IMO it's because Cursor is really good and AI art is not that good.
There isn't a good metaphor for the problem with AI art. I would say it's like some kind of chocolate cake where the first few bites seem like the best cake you have ever had, and then each successive bite tastes more and more like shit until you stop even considering eating it. Then at some point even the thought of the cake makes you want to puke.
I say this as someone who thought we reached the art singularity in December 2022. I have no philosophical or moral problem with AI art. It just kind of sucks.
Cursor/Sonnet on the other hand just blew my mind earlier today.
AI art is an oxymoron. It will never give me chills or make me cry.
There are really good models for AI art, if people care. I think that AI is better at making an image from start to finish than making some software from start to finish.
And I use Claude 3.5 Sonnet myself.
Is it really? I know people who love using LLMs, people who are allergic to even talking about AI usability, and lots of others in between. Same with artists: some hate the idea, some spend hours crafting very specific things with SD, and many are in between.
I'm not sure I can really point to a big difference here. Maybe artists skew more towards disliking AI, since they work with a medium that isn't digital in the first place, but the range of responses feels really similar.
It's just gatekeeping.
Artists put a ton of time into education and refining their vision inside the craft. Amateur efforts to produce compelling work always look amateur. With augmentation, suddenly the "real" artists aren't as differentiated.
The whole conversation is obviously extremely skewed toward digital art, and the ones talking about it most visibly are the digital artists. No abstract painter thinks AI is coming for their occupation or cares whether it is easier to create anime dreamscapes this year or the next.
I mean, it's supply and demand, right?
- There is a big demand for really complex software development, and an LLM can't do that alone. So software devs have to do lots of busywork, and they like the opportunity to be augmented by AI.
- Conversely, there is a huge demand for not-very-high-level art - e.g., lots of people want a custom logo or a little jingle, but not many people want to hire a concert pianist or commission the next Salvador Dalí.
So most artists spend a lot of time doing low-level work to pay the bills, while software devs spend a lot of time doing low-level code-monkey work so they can get to the creative part of their job.