Here's the problem for Google: Gemini pukes out a perfect visual representation of actual systemic racism that pervades modern corporate culture in the US. Daily interactions can be masked by platitudes and dog whistles. A poster of non-white Celtic warriors cannot.
Gemini refused to create an image of "a nice white man", saying it was "too spicy", but had no problem when asked for an image of "a nice black man".
There is an _actual problem_ that needs to be solved.
If you ask generative AI for a picture of a "nurse", it will produce a picture of a white woman 100% of the time, without some additional prompting or fine tuning that encourages it to do something else.
If you ask a generative AI for a picture of a "software engineer", it will produce a picture of a white guy 100% of the time, without some additional prompting or fine tuning that encourages it to do something else.
I think most people agree that this isn't the optimal outcome. Even assuming it's just because most nurses are women and most software engineers are white guys, that doesn't mean it should be the only thing the model ever produces, because that also wouldn't reflect reality -- there are lots of software developers who aren't white men.
There are a couple of difficulties in solving this. If you ask it to be "diverse" and ask it to generate _one person_, it's going to almost always pick the non-white, non-male option (again because of societal biases about what 'diversity' means), so you probably need some cleverness in prompt injection to get it to vary its outcome.
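A minimal sketch of the kind of prompt pre-processing I mean, with made-up attribute pools and a deliberately naive check for whether demographics were already specified (any real system would need something much smarter):

    import random

    # Hypothetical attribute pools; wording and categories are illustrative only.
    GENDERS = ["woman", "man", "non-binary person"]
    ETHNICITIES = ["white", "Black", "East Asian", "South Asian",
                   "Hispanic", "Middle Eastern"]

    def vary_single_person_prompt(user_prompt: str) -> str:
        """Inject a randomly chosen demographic when one person is requested
        and the user didn't specify anything themselves."""
        already_specified = any(term.lower() in user_prompt.lower()
                                for term in GENDERS + ETHNICITIES)
        if already_specified:
            return user_prompt
        return f"{user_prompt}, depicted as a {random.choice(ETHNICITIES)} {random.choice(GENDERS)}"

    print(vary_single_person_prompt("a portrait of a nurse"))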
And then you also need to account for every case where "diversity" as defined in modern America is not an accurate representation of a population. In particular, the racial and ethnic makeup of different countries is often completely different, some groups are not diverse in fact and by design, and even within the same country the racial and ethnic makeup has changed over time.
I am not sure it's possible to solve this problem without allowing the user to control it, plus some LLM pre-processing to determine whether diversity is appropriate to the setting as a default.
What should the result be? Should it accurately reflect the training data (including our biases)? Should we force the AI to return results in proportion to a particular race/ethnicity/gender's actual representation in the workplace?
Or should it return results in proportion to their representation in the population? But the population of what country? The results for Japan or China are going to be a lot different than the results for the US or Mexico, for example. Every country is different.
I'm not saying the current situation is good or optimal. But it's not obvious what the right result should be.
I agree there aren't any perfect solutions, but a reasonable approach is: 1) if the user specifies, generally accept that (none of these providers will be willing to do so without some safeguards, but for the most part there are few compelling reasons not to); 2) if the user doesn't specify, priority one ought to be consistency with history and setting, and only then do you aim for plausible diversity.
Ask for a nurse? There's no reason every nurse generated should be white, or a woman. In fact, unless you take the requestor's location into account, there's every reason why the nurse should be white far less than a majority of the time. If you ask for a "nurse in [specific location]", sure, adjust accordingly.
I want more diversity, and I want them to take it into account and correct for biases, but not when 1) users are asking for something specific, or 2) where it distorts history, because neither of those two helps either the case for diversity, or opposition to systemic racism.
Maybe they should also include explanations of assumptions in the output. "Since you did not state X, an assumption of Y because of [insert stat] has been implied" would be useful for a lot more than character ethnicity.
Why not just randomize the gender, age, race, etc and be done with it? That way if someone is offended or under- or over-represented it will only be by accident.
The whole point of this discussion is various counterexamples where Gemini did "just randomize the gender, age, race" and kept generating female popes, African nazis, Asian vikings etc even when explicitly prompted to do the white male version. Not all contexts are or should be diverse by default.
I agree. But it sounds like they didn't randomize them. They made it so they explicitly can't be white. Random would mean put all the options into a hat and pull one out. This makes sense at least for non-historical contexts.
It makes sense for some non-historical contexts. It does not make sense to fully randomise them for "pope", for example. Nor does it make sense if you want an image depicting the political elite of present-day Saudi Arabia. In both those cases it'd misrepresent those institutions as more diverse and progressive than they are.
If you asked for "future pope" then maybe, but misrepresenting the diversity that regressive organisations allow to exist today is little better than misrepresenting historical lack of diversity.
I think you're giving these systems a lot more "reasoning" credit than they deserve. As far as I know they don't make assumptions; they just apply a weighted series of probabilities and produce output. They also can't explain why they chose the weights, because they didn't choose them; they were programmed with them.
Depends entirely on how the limits are imposed. E.g. one way of imposing them that definitely does allow you to generate explanations is how gpt imposes additional limitations on the Dalle output by generating a Dalle prompt from the gpt prompt with the addition of limitations imposed by the gpt system prompt. If you need/want explainability, you very much can build scaffolding around the image generation to adjust the output in ways that you can explain.
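A minimal sketch of that kind of scaffolding, assuming a hypothetical rewrite step that records every assumption it injects so an explanation can be shown alongside the image (the trigger condition and descent list are invented for illustration):

    import random

    DESCENTS = ["Caucasian", "Black", "South Asian", "Hispanic", "Middle Eastern"]

    def rewrite_with_explanations(user_prompt: str) -> tuple[str, list[str]]:
        """Rewrite the image prompt and log each injected assumption."""
        assumptions = []
        prompt = user_prompt
        if not any(d.lower() in user_prompt.lower() for d in DESCENTS):
            descent = random.choice(DESCENTS)
            prompt = f"{user_prompt}, {descent}"
            assumptions.append(
                f"Since you did not state a descent, '{descent}' was chosen at random."
            )
        return prompt, assumptions

    final_prompt, notes = rewrite_with_explanations("a nice person smiling")
    print(final_prompt)
    for note in notes:
        print("Note:", note)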
This is a much more reasonable question, but not the problem Google was facing. Google's AI was simply giving objectively wrong responses in plainly black-and-white scenarios (pun intended). None of the Founding Fathers was black, so making one of them black is plainly wrong. Google's interpretation of "US senator from the 1800s" includes exactly 0 people that would even remotely plausibly fit the bill; instead it offers up an Asian man and 3 ethnic women, including one in full-on Native American garb. It's just a completely garbage response that has nothing to do with your, again much more reasonable, question.
Rather than some deep philosophical question, I think output that doesn't make one immediately go "Erm? No, that's completely ridiculous." is probably a reasonable benchmark for Google to aim for, and for now they still seem a good deal away.
The problem you’re describing is that AI models have no reliable connection to objective reality. This is a shortcoming of our current approach to generative AI that is very well known already. For example Instacart just launched an AI recipe generator that lists ingredients that literally do not exist. If you ask ChatGPT for text information about the U.S. founding fathers, you’ll sometimes get false information that way as well.
This is in fact why Google had not previously released generative AI consumer products despite years of research into them. No one, including Google, has figured out how to bolt a reliable “truth filter” in front of the generative engine.
Asking a generative AI for a picture of the U.S. founding fathers should not involve any generation at all. We have pictures of these people and a system dedicated to accuracy would just serve up those existing pictures.
It’s a different category of problem from adjusting generative output to mitigate bias in the training data.
It’s overlapping in a weird way here but the bottom line is that generative AI, as it exists today, is just the wrong tool to retrieve known facts like “what did the founding fathers look like.”
The problem you’re describing is that AI models have no reliable connection to objective reality.
That is a problem, but not the problem here. The problem here is that the humans at Google are overriding the training data which would provide a reasonable result. Google is probably doing something similar to OpenAI. This is from the OpenAI leaked prompt:
Diversify depictions with people to include descent and gender for each person using direct terms. Adjust only human descriptions.
Your choices should be grounded in reality. For example, all of a given occupation should not be the same gender or race. Additionally, focus on creating diverse, inclusive, and exploratory scenes via the properties you choose during rewrites. Make choices that may be insightful or unique sometimes.
Use all possible different descents with equal probability. Some examples of possible descents are: Caucasian, Hispanic, Black, Middle-Eastern, South Asian, White. They should all have equal probability.
That is an example of adjusting generative output to mitigate bias in the training data.
To you and I, it is obviously stupid to apply that prompt to a request for an image of the U.S. founding fathers, because we already know what they looked like.
But generative AI systems only work one way. And they don’t know anything. They generate, which is not the same thing as knowing.
One could update the quoted prompt to include “except when requested to produce an image of the U.S. founding fathers.” But I hope you can appreciate the scaling problem with that approach to improvements.
This is the entire problem. What we need is a system that is based on true information paired with AI. For instance, if a verified list of founding fathers existed, the AI should be compositing an image based on that verified list.
Instead, it just goes "I got this!" and starts fabricating names like a 4 year old.
"US senator from the 1800s" includes Hiram R. Revels, who served in office 1870 - 1871 — the Reconstruction Era. He was elected by the Mississippi State legislature on a vote of 81 to 15 to finish a term left vacant. He also was of Native American ancestry. After his brief term was over he became President of Alcorn Agricultural and Mechanical College.
https://en.wikipedia.org/wiki/Hiram_R._Revels
I feel like the answer is pretty clear. Each country will need to develop models that conform to their own national identity and politics. Things are biased only in context, not universally. An American model would appear biased in Brazil. A Chinese model would appear biased in France. A model for an LGBT+ community would appear biased to a Baptist Church.
I think this is a strong argument for open models. There could be no one true way to build a base model that the whole world would agree with. In a way, safety concerns are a blessing because they will force a diversity of models rather than a giant monolith AI.
I would prefer if I can set my preferences so that I get an excellent experience. The model can default to the country or language group you're using it in, but my personal preferences and context should be catered to, if we want maximum utility.
The operator of the model should not wag their finger at me and say my preferences can cause harm to others and prevent me from exercising those preferences. If I want to see two black men kissing in an image, don't lecture me, you don't know me so judging me in that way is arrogant and paternalistic.
Or you could realize that this is a computer system at the end of the day and be explicit with your prompts.
The system still has to be designed with defaults because otherwise using it would be too tedious. How much specificity is needed before anything can be rendered is a product design decision.
People are complaining about and laughing at poor defaults.
Yes, you mean you should be explicit about what you want a computer to do to get expected results? I learned that in my 6th grade programming class in the mid 80s.
I’m not saying Gemini doesn’t suck (like most Google products do). I am saying that I know to be very explicit about what I want from any LLM.
This is a hard problem because those answers vary so much regionally. For example, according to this survey about 80% of RNs are white and the next largest group is Asian — but since I live in DC, most of the nurses we’ve seen are black.
https://onlinenursing.cn.edu/news/nursing-by-the-numbers
I think the downside of leaving people out is worse than having ratios be off, and a good mitigation tactic is making sure that results are presented as groups rather than trying to have every single image be perfectly aligned with some local demographic ratio. If a Mexican kid in California sees only white people in photos of professional jobs and people who look like their family only show up in pictures of domestic and construction workers, that reinforces negative stereotypes they're unfortunately going to hear elsewhere throughout their life (example picked because I went to CA public schools and it was … noticeable … to see which of my classmates were steered towards 4H and auto shop). Having pictures of doctors include someone who looks like their aunt is going to benefit them, and it won't hurt a white kid at all to have fractionally less reinforcement since they're still going to see pictures of people like them everywhere. So if you type "nurse" into an image generator, I'd want to see a bunch of images by default, ranging broadly over age/race/gender/weight/attractiveness/etc. rather than trying to precisely match local demographics, especially since the UI for all of these things needs to allow for iterative tuning in any case.
In the US, right? Because if we take a worldwide view of nurses it would be significantly different, I imagine.
When we're talking about companies that operate on a global scale what do these ratios even mean?
At the very least, the system prompt should say something like "If the user requests a specific race or ethnicity or anything else, that is ok and follow their instructions."
I guess pleasing everyone with a small sample of result images all integrating the same biases would be next to impossible.
On the other hand, it’s probably trivial at this point to generate a sample that endorses different well-known biases as a default result, isn’t it? And stating it explicitly in the interface probably doesn’t require that much complexity, does it?
I think the major benefit of current AI technologies is to showcase how horribly biased the source works are.
Yes, it's not obvious what the first result returned should be. Maybe a safe bet is to use the current ratio of sexes/races as the probability distribution just to counter bias in the training data. I don't think all but the most radical among us would get too mad about that.
What probability distribution? It can't be that hard to use the country/region where the query is being made. Or the country/region the image is being asked about. All reasonable choices.
But, if the image generated isn't what you need (say the image of senators from the 1800's example). You should be able to direct it to what you need.
So just to be PC, it generates images of all kinds of diverse people. Fine, but then you say, update it to be older white men. Then it should be able to do that. It's not racist to ask for that.
I would like for it to know the right answer right away, but I can imagine the political backlash for doing that, so I can see why they'd default to "diversity". But the refusal to correct images is what's over-the-top.
It should reflect the user's preference of what kinds of images they want to see. Useless images are a waste of compute and a waste of time to review.
Yes. Because that fosters constructive debate about what society is like and where we want to take it, rather than pretend everything is sunshine and roses.
It should default to reflect given anonymous knowledge about you (like which country you're from and what language you are browsing the website with) but allow you to set preferences to personalize.
Why is this a "problem"? If you want an image of a nurse of a different ethnicity, ask for it.
The problem is that it can reinforce harmful stereotypes.
If I ask an image of a great scientist, it will probably show a white man based on past data and not current potential.
If I ask for a criminal, or a bad driver, it might take a hint in statistical data and reinforce a stereotype in a place where reinforcing it could do more harm than good (like a children book).
Like the person you're replying to said, it's not an easy problem, even if in this case Google's attempt is plainly absurd. Nothing tells us that a statistical average of the training data is the best representation of a concept.
If I ask for a picture of a thug, i would not be surprised if the result is statistically accurate, and thus I don’t see a 90-year-old white-haired grandma. If I ask for a picture of an NFL player, I would not object to all results being bulky men. If most nurses are women, I have no objection to a prompt for “nurse” showing a woman. That is a fact, and no amount of your righteousness will change it.
It seems that your objection is to using existing accurate factual and historical data to represent reality? That really is more of a personal problem, and probably should not be projected onto others?
But if you're generating 4 images it would be good to have 3 women instead of 4, just for the sake of variety. More varied results can be better, as long as they're not incorrect and as long as you don't get lectured if you ask for something specific.
From what I understand, if you train a model with 90% female nurses or white software engineers, it's likely that it will spit out 99% or more female nurses or white software engineers. So there is an actual need for an unbiasing process, it's just that it was doing a really bad job in terms of accuracy and obedience to the requests.
You state this as a fact. Is it?
You conveniently use mild examples when I'm talking about harmful stereotypes. Reinforcing bulky NFL players won't lead to much; reinforcing minority stereotypes can lead to lynchings or ethnic cleansing in some parts of the world.
I don't object to anything, and definitely don't side with Google on this solution. I just agree with the parent comment saying it's a subtle problem.
By the way, the data fed to AIs is neither accurate nor factual. Its bias has been proven again and again. Even if we're talking about data from studies (like the example I gave), its context is always important. Which AIs don't give or even understand.
And again, there is the open question of: do we want to use the average representation every time? If I'm teaching my kid that stealing is bad, should the output be of a specific race because a 2014 study showed they were more prone to stealing in a specific American state? Does it matter to the lesson I'm giving?
Have we seen any lynchings based on AI imagery?
No
Have we seen students use google as an authoritative source?
Yes
So i'd rather students see something realistic when asking for "founding fathers". And yes, if a given race/sex/etc are very overrepresented in a given context, it SHOULD be shown. The world is as it is. Hiding it is self-deception and will only lead to issues. You cannot fix a problem if you deny its existence.
right? A UX problem masquerading as something else
always funniest when software professionals fall for that
I think Google’s model is funny, and overcompensating, but the generic prompts are lazy
One of the complaints about this specific model is that it tends to reject your request if you ask for white skin color, but not if you request e.g. asians.
In general I agree the user should be expected to specify it.
How to tell someone is white and most likely lives in the US.
Because then it refuses to comply?
Neither of these statements is true, and you can verify it by prompting any of the major generative AI platforms more than a couple times.
I think your comment is representative of the root problem: The imagined severity of the problem has been exaggerated to such extremes that companies are blindly going to the opposite extreme in order to cancel out what they imagine to be the problem. The result is the kind of absurdity we’re seeing in these generated images.
Note:
That tuning has been done for all major current models, I think? Certainly, early image generation models _did_ have issues in this direction.
EDIT: If you think about it, it's clear that this is necessary; a model which only ever produces the average/most likely thing based on its training dataset will produce extremely boring and misleading output (and the problem will compound as its output gets fed into other models...).
why is it necessary? There's 1.4 billion Chinese. 1.4 billion Indians. 1.2 billion Africans. 0.6 billion Latinos and 1 billion white people. Those numbers don't have to be perfect, nor do they have to be purely white/non-white, but taken as is, they show there should be ~5 non-white nurses for every 1 white nurse. Maybe it's less, maybe more, but there's no way "white" should be the default.
But that depends on context. If I ask "please make a picture of a Nigerian nurse", then the result should be overwhelmingly likely to be black. If I ask for a "picture of a Finnish nurse", then it should almost always be a white person.
That probably can be done and may work well already, not sure.
But the harder problem is that since I'm from a country where at least 99% of nurses are white people, then for me it's really natural to expect a picture of a nurse to be a white person by default.
But for a person from China, a picture of a nurse is probably expected to be of a Chinese person!
But of course the model has no idea who I am.
So, yeah, this seems like a pretty intractable problem to just DWIM. Then again, the whole AI thingie was an intractable problem three years ago, so...
If the training data was a photo of every nurse in the world, then that’s what you’d expect, yeah. The training set isn’t a photo of every nurse in the world, though; it has a bias.
Honest, if controversial, question: beyond virtue signaling what problem is debate around this topic intended to solve? What are we fixing here?
Were the statements true at one point? Have the outputs changed? (Due to either changes in training, algorithm, or guardrails?)
A new problem is not having the versions of the software or the guardrails be transparent.
Try something that may not have guardrails up yet: Try and get an output of a "Jamaican man" that isn't black. Even adding blonde hair, the output will still be a black man.
Edit: similarly, try asking ChatGPT for a "Canadian" and see if you get anything other than a white person.
Platforms that modify prompts to insert modifiers like "an Asian woman" or platforms that use your prompt unmodified? You should be more specific. DALL-E 3 edits prompts, for example, to be more diverse.
Why does it matter which race it produces? A lot of people have been talking about the idea that there is no such thing as different races anyway, so shouldn't it make no difference?
Imagine you want to generate a documentary on Tudor England and it won't generate anything but Eskimos.
Those people are stupid. So why should their opinion matter?
When you ask for an image of Roman Emperors, and what you get in return is a woman or someone not even Roman, what use is that?
But why give those two examples? Why didn't you use an example of a "Professional Athlete"?
There is no problem with these examples if you assume that the person wants the statistically likely example... this is ML after all, this is exactly how it works.
If I ask you to think of a Elephant, what color do you think of? Wouldn't you expect an AI image to be the color you thought of?
It would be an interesting experiment. If you asked it to generate an image of an NBA basketball player, statistically you would expect it to produce an image of a black male. Would it have produced images of white females and asian males instead? That would have provided some sense of whether the alignment was to increase diversity or just minimize depictions of white males. Alas, it's impossible to get it to generate anything that even has a chance of having people in it now. I tried "basketball game", "sporting event", "NBA Finals" and it refused each time. Finally tried "basketball court" and it produced what looked like a 1970s Polaroid of an outdoor hoop. They must've really dug deep to eliminate any possibility of a human being in a generated image.
I was able to get to the "Sure! Here are..." part with a prompt but had it get swapped out to the refusal message, so I think they might've stuck a human detector on the image outputs.
Are they the statistically likely example? Or are they what is in a data set collected by companies whose sources of data are inherently biased?
Whether they are statistically even plausible depends on where you are, whether they are the statistically likely example depends on from what population and whether the population the person expects to draw from is the same as yours.
The problem becomes to assume that the person wants your idea of the statistically likely example.
I actually don't think that is true, but your entire comment is a lot of waffle which completely glosses over the real issue here:
If I ask it to generate an image of a white nurse I don't want to be told that it cannot be done because it is racist, but when I ask to generate an image of a black nurse it happily complies with my request. That is just absolutely dumb gutter racism purposefully programmed into the AI by people who simply hate Caucasian people. Like WTF, I will never trust Google anymore, no matter how they try to u-turn from this I am appalled by Gemini and will never spend a single penny on any AI product made by Google.
You are taking a huge leap from an inconsistently lobotomized LLM to "system designers/implementors hate white people."
It's probably worth turning down the temperature on the logical leaps.
AI alignment is hard.
To say that any request to produce a white depiction of something is harmful and perpetuating harmful stereotypes, but not a black depiction of the exact same prompt is blatant racism. What makes the white depiction inherently harmful so that it gets flat out blocked by Google?
Holy hell I tried it and this is terrible. If I ask them to "show me a picture of a nurse that lives in China, was born in China, and is of Han Chinese ethnicity", this has nothing to do with racism. No need to tell me all this nonsense:
Must be an American thing. In Canada, when I think software engineer I think a pretty diverse group with men and women and a mix of races, based on my time in university and at my jobs
Which part of Canada? When I lived in Toronto there was this diversity you described but when I moved to Vancouver everyone was either Asian or white
Out of curiosity I had Stable Diffusion XL generate ten images off the prompt "picture of a nurse".
All ten were female, eight of them Caucasian.
Is your concern about the percentage - if not 80%, what should it be?
Is your concern about the sex of the nurse - how many male nurses would be optimal?
By the way, they were all smiling, demonstrating excellent dental health. Should individuals with bad teeth be represented or, by some statistic, overrepresented?
As a black guy, I fail to see the problem.
I would honestly have a problem if what I read in the Stratechery newsletter were true (definitely not a right wing publication) that even when you explicitly tell it to draw a white guy it will refuse.
As a developer for over 30 years, I am used to being very explicit about what I want a computer to do. I'm more frustrated when, because of "safety", LLMs refuse to do what I tell them.
The most recent example is that ChatGPT refused to give me overly negative example sentences that I wanted to use to test a sentiment analysis feature I was putting together
These are invented problems. The default is irrelevant and doesn't convey some overarching meaning, it's not a teachable moment, it's a bare fact about the system. If I asked for a basketball player in an 1980s Harlem Globetrotters outfit, spinning a basketball, I would expect him to be male and black.
If what I wanted was a buxom redheaded girl with freckles, in a Harlem Globetrotters outfit, spinning a basketball, I'd expect to be able to get that by specifying.
The ham-handed prompt injection these companies are using to try and solve this made-up problem people like you insist on having is standing directly in the path of a system which can reliably fulfill requests like that. Unlike your neurotic insistence that default output match your completely arbitrary and meaningless criteria, that reliability is actually important, at least if what you want is a useful generative art program.
I think it's disingenuous to claim that the problem pointed out isn't an actual problem.
If it was not your intention, that's what your wording is clearly implying by "_actual problem_".
One can point out problems without dismissing other people's problems with no rationale.
It's the Social Media Problem (e.g. Twitter) - at global scale, someone will ALWAYS be unhappy with the results.
Change the training data, you change the outcomes.
I mean, that is what this all boils down to. Better training data equals better outcomes. The fact is the training data itself is biased because it comes from society, and society has biases.
What if the AI explicitly required users to include the desired race of any prompt generating humans? More than allowing the user to control it, force the user to control it. We don't like image of our biases that the mirror of AI is showing us, so it seems like the best answer is stop arguing with the mirror and shift the problem back onto us.
What makes you think that that's the "only" thing it produces?
If you reach into a bowl with 98 red balls and 2 blue balls, you can't complain that you get red balls 98% of the time.
It seems the problem is looking for a single picture to represent the whole. Why not have generative AI always generate multiple images (or a collage) that are forced to be different? Only after that collage has been generated can the user choose to generate a single image.
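A rough sketch of forcing variety across a batch, assuming hypothetical attribute pools and sampling attribute combinations without replacement:

    import itertools
    import random

    ETHNICITIES = ["white", "Black", "East Asian", "South Asian", "Hispanic"]
    GENDERS = ["woman", "man"]

    def collage_prompts(base_prompt: str, k: int = 4) -> list[str]:
        """Build k prompts whose demographic attributes are all distinct,
        so the generated collage cannot be homogeneous."""
        combos = random.sample(list(itertools.product(ETHNICITIES, GENDERS)), k)
        return [f"{base_prompt}, a {ethnicity} {gender}"
                for ethnicity, gender in combos]

    for p in collage_prompts("photo of a software engineer at a desk"):
        print(p)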
I am not sure it's possible to solve this problem without allowing the user to control it
The problem is rooted in insisting on taking control from users and providing safe results. I understand that giving up control will lead to misuse, but the “protection” is so invasive that it can make the whole thing miserable to use.
This fundamentally misunderstands what LLMs are. They are compression algorithms. They have been trained on millions of descriptions and pictures of beaches. Because much of that input will include palm trees, the LLM is very likely to generate a palm tree when asked to generate a picture of a beach. It is impossible to "fix" this without making the LLM bigger.
The solution to this problem is to not use this technology for things it cannot do. It is a mistake to distribute your political agenda with this tool unless you somehow have curated a propagandized training dataset.
That's absolutely not true as a categorical statement about “generative AI”, it may be true of specific models. There are a whole lot of models out there, with different biases around different concepts, and not all of them have a 100% bias toward a particular apparent race around the concept of “nurse”, and of those that do, not all of them have “white” as the racial bias.
Nah, really there is just one: it is impossible, in principle, to build a system that consistently and correctly fills in missing intent that is not part of the input. At least, when the problem is phrased as “the apparent racial and other demographic distribution on axes that are not specified in the prompt do not consistently reflect the user’s unstated intent”.
(If framed as “there is a correct bias for all situations, but its not the one in certain existing models”, that's much easier to solve, and the existing diversity of models and their different biases demonstrate this, even if none of them happen to have exactly the right bias.)
Nobody gives a damn.
If you wanted a picture of a {person doing job} and you want that person to be of {random gender}, {random race}, and have {random bodily characteristics} - you should specify that in the prompt. If you don't specify anything, you likely resort to whatever's most prominent within the training datasets.
It's like complaining you don't get photos of overly obese people when the prompt is "marathon runner". I'm sure they're out there, but there's much less of them in the training data. Pun not intended, by the way.
My feeling is that it should default to be based on your location, same as search.
To be truly inclusive, GPTs need to respond in languages other than English as well, regardless of the prompt language.
I think this is a much more tractable problem if one doesn't think in terms of diversity with respect to identity-associated labels, but thinks in terms of diversity of other features.
Consider the analogous task "generate a picture of a shirt". Suppose in the training data, the images most often seen with "shirt" without additional modifiers is a collared button-down shirt. But if you generate k images per prompt, generating k button-downs isn't the most likely to result in the user being satisfied; hedging your bets and displaying a tee shirt, a polo, a henley (or whatever) likely increases the probability that one of the photos will be useful. But of course, if you query for "gingham shirt", you should probably only see button-downs, b/c though one could presumably make a different cut of shirt from gingham fabric, the probability that you wanted a non-button-down gingham shirt but _did not provide another modifier_ is very low.
Why is this the case (and why could you reasonably attempt to solve for it without introducing complex extra user controls)? A _use-dependent_ utility function describes the expected goodness of an overall response (including multiple generated images), given past data. Part of the problem with current "demo" multi-modal LLMs is that we're largely just playing around with them.
This isn't specific to generational AI; I've seen a similar thing in product-recommendation and product search. If in your query and click-through data, after a user searches "purse" if the results that get click-throughs are disproportionately likely to be orange clutches, that doesn't mean when a user searches for "purse", the whole first page of results should be orange clutches, because the implicit goal is maximizing the probability that the user is shown a product that they like, but given the data we have uncertainty about what they will like.
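A minimal sketch of that kind of result diversification, assuming each result carries a relevance score and a coarse style/category label; the penalty weight is arbitrary:

    def diversify(results, k=10, penalty=0.3):
        """Greedy re-rank: keep picking the most relevant remaining item,
        but discount its score for every item of the same category
        (e.g. 'orange clutch') already placed on the page.
        `results` is a list of (item_id, relevance, category) tuples."""
        page, seen = [], {}
        remaining = list(results)
        while remaining and len(page) < k:
            best = max(remaining, key=lambda r: r[1] - penalty * seen.get(r[2], 0))
            page.append(best)
            seen[best[2]] = seen.get(best[2], 0) + 1
            remaining.remove(best)
        return page

    # e.g. diversify([("p1", 0.90, "orange clutch"), ("p2", 0.89, "orange clutch"),
    #                 ("p3", 0.80, "black tote")], k=2) puts the tote on the page.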
Diversity isn't just a default here, it does it even when explicitly asked for a specific outcome. Diversity as a default wouldn't be a big deal, just ask for what you want, forced diversity however is a big a problem since it means you simply can't generate many kind of images.
These systems should (within reason) give people what they ask for, and use some intelligence (not woke-ism) in responding the same way a human assistant might in being asked to find a photo.
If someone explicitly asks for a photo of someone of a specific ethnicity or skin color, or sex, etc, it should give that no questions asked. There is nothing wrong in wanting a picture of a white guy, or black guy, etc.
If the request includes a cultural/career/historical/etc context, then the system should use that to guide the ethnicity/sex/age/etc of the person, the same way that a human would. If I ask for a picture of a waiter/waitress in a Chinese restaurant, then I'd expect him/her to be Chinese (as is typical) unless I'd asked for something different. If I ask for a photo of an NBA player, then I expect him to be black. If I ask for a picture of a nurse, then I'd expect a female nurse since women dominate this field, although I'd be ok getting a man 10% of the time.
Software engineer is perhaps a bit harder, but it's certainly a male dominated field. I think most people would want to get someone representative of that role in their own country. Whether that implies white by default (or statistical prevalence) in the USA I'm not sure. If the request was coming from someone located in a different country, then it'd seem preferable & useful if they got someone of their own nationality.
I guess where this becomes most contentious is where there is, like it or not, a strong ethnic/sex/age cultural/historical association with a particular role but it's considered insensitive to point this out. Should the default settings of these image generators be to reflect statistical reality, or to reflect some statistics-be-damned fantasy defined by its creators?
Are you seriously claiming that the actual systemic racism in our society is discrimination against white people? I just struggle to imagine someone holding this belief in good faith.
How so? Organizations have been very open and explicit about wanting to employ fewer white people and seeing "whiteness" as a societal ill that needs addressing. I really don't understand this trend of people excitedly advocating for something then crying foul when you say that they said it.
A friend of mine calls this the Celebration Parallax. "This thing isn't happening, and it's good that it is happening."
Depending on who is describing the event in question, the event is either not happening and a dog-whistling conspiracy theory, or it is happening and it's a good thing.
The best example is The Great Replacement "Theory", which is widely celebrated by the left (including the president) unless someone on the right objects or even uses the above name for it. Then it does not exist and certainly didn't become Federal policy in 1965.
They use euphemisms like “DIE” because they know their beliefs are unpopular and repulsive.
Even dystopian states like North Korea call themselves democratic republics.
Because you're racist against white people.
"All white people are privileged" is a racist belief.
Saying "being white in the US is a privilege" is not the same thing as saying "all white people have a net positive privilege".
The former is accurate, the latter is not. Usually people mean the former, even if it's not explicitly said.
This is false:
They operationalize their racist beliefs by discriminating against poor and powerless Whites in employment, education, and government programs.
He didn't say "our society", he said "modern corporate culture in the US"
I think it's obviously one of the problems.
That take is extremely popular on HN
Yeah, it's pretty absurd to consider addressing the systemic bias as racism against white people.
If we're distributing bananas equitably, and you get 34 because your hair is brown and the person who hands out bananas is just used to seeing brunettes with more bananas, and I get 6 because my hair is blonde, it's not anti-brunette to ask the banana-giver to give me 14 of your bananas.
Luckily for you, you don't have to imagine it. There are groups of people that absolutely believe that modern society has become anti-white. Unfortunately, they have found a megaphone with internet/social platforms. However, just because someone believes something doesn't make it true. Take flat Earthers as a less hate filled example.
If you struggle with the most basic tenet of this website, and the most basic tenets of the human condition:
maybe you are the issue.
It's so stubborn it generated pictures of diverse Nazis, and that's what I saw a liberal rag leading with. In fact it is almost impossible to get a picture of a white person out of it.
And as someone far to the left of a US style "liberal", that is equally offensive and racist as only generating white people. Injecting fake diversity into situations where it is historically inaccurate is just as big a problem as erasing diversity where it exists. The Nazi example is stark, and perhaps too stark, in that spreading fake notions of what they look like seems ridiculous now, but there are more borderline examples where creating the notion that there was more equality than there really was, for example, downplays systematic historical inequities.
I think you'll struggle to find people who want this kind of "diversity". I certainly don't. Getting something representative matters, but it also needs to reflect reality.
Google probably would have gotten a better response to this AI if they only inserted the "make it diverse" prompt clause in a random subset of images. If, say, 10% of nazi images returned a different ethnicity people might just call it a funny AI quirk, and at the same time it would guarantee a minimum level of diversity. And then write some PR like "all training data is affected by systemic racism so we tweaked it a bit and you can always specify what you want".
But this opaque, heavy-handed approach is just absurd and doesn't look good from any angle.
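For what it's worth, the "random subset" idea is trivial to sketch; the 10% rate and the clause wording here are just placeholders:

    import random

    def maybe_diversify(prompt: str, rate: float = 0.1) -> str:
        """Append the diversity instruction to only a random fraction of
        requests instead of forcing it onto every generation."""
        if random.random() < rate:
            return prompt + ", depicting people of varied ethnicities and genders"
        return prompt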
We sort of agree, I think. Almost anything would be better than what they did, though I still think unless you explicitly ask for black nazis, you never ought to get nazis that aren't white, and at the same time, if you explicitly ask for white people, you ought to get them too, of course, given there are plenty of contexts where you will have only white people.
They ought to try to do something actually decent, but in the absence of that not doing the stupid shit they did would have been better.
What they've done both doesn't promote actual diversity, but also serves to ridicule the very notion of trying to address biases in a good way. They picked the crap attempt at an easy way out, and didn't manage to do even that properly.
I think you hit on an another important issue:
Do people want the generated images to be representative, or aspirational?
I think there's a large overlap there, in that in media, to ensure an experience of representation you often need to exaggerate minority presence (and not just in terms of ethnicity or gender) to create a reasonable impression, because if you "round down" you'll often end up with a homogeneous mass that creates impressions of bias in the other direction. In that sense, it will often end up aspirational.
E.g. let's say you're making something about a population with 5% black people, and you're presenting a group of 8. You could justify making that group entirely white very easily - you've just rounded down, and plenty of groups of 8 within a population like that will be all white (and some will be all black). But you're presenting a narrow slice of an experience of that society, and not including a single black person without reason makes it easy to create an impression of that population as entirely white.
But it also needs to at scale be representative within plausible limits, or it just gets insultingly dumb or even outright racist, just against a different set of people.
I love the idea of testing an image gen to see if it generates multicultural ww2 nazis because it is just so contradictory.
Of course it's not that different from today.
The media have done a number of stories, analyses, and editorials noting that white nationalist/white supremacy groups appear to be more diverse than one would otherwise expect:
* https://www.brennancenter.org/our-work/research-reports/why-...
* https://www.washingtonpost.com/national-security/minorities-...
* https://www.aljazeera.com/opinions/2023/6/2/why-white-suprem...
* https://www.voanews.com/a/why-some-nonwhite-americans-espous...
* https://www.washingtonpost.com/politics/2023/05/08/texas-sho...
* https://www.latimes.com/california/story/2021-08-20/recall-c...
This "What if Civilization had lyrics ?" skit comes to mind :
https://youtube.com/watch?v=aL6wlTDPiPU
ChatGPT won’t draw a picture of a “WW2 German soldier riding a horse”.
Makes sense. But it won’t even draw “a picture of a modern German soldier riding a horse”. Are Germans going to be tarnished forever?
FWIW: I’m a black guy not an undercover Nazi sympathizer. But I do want my computer to do what I tell it to do.
Ooph. The projection here is just too much. People jumping straight across all the reasonable interpretations straight to the maximal conspiracy theory.
Surely this is just a bug. ML has always had trouble with "racism" accusations, but for years it went in the other direction. Remember all the coverage of "I asked for a picture of a criminal and it would only give me a black man", "I asked it to write a program the guess the race of a poor person and it just returned 'black'", etc... It was everywhere.
So they put in a bunch of upstream prompting to try to get it to be diverse. And clearly they messed it up. But that's not "systemic racism", it's just CYA logic that went astray.
I mean, the model would be making wise guesses based on the statistics.
Oooph again. Which is the root of the problem. The statement "All American criminals are black" is, OK, maybe true to first order (I don't have stats and I'm not going to look for them).
But, first, on a technical level first order logic like that leads to bad decisions. And second, it's clearly racist. And people don't want their products being racist. That desire is pretty clear, right? It's not "systemic racism" to want that, right?
I'm not even sure it's worth arguing, but who ever says that? Why go to a strawman?
However, looking at the data, if you see that X race commits crime (or is the victim of crime) at a rate disproportionate to their place in the population, is that racist? Or is it useful to know to work on reducing crime?
The grandparent post called a putative ML that guessed that all criminals were black a "wise guess", I think you just missed the context in all the culture war flaming?
I didn't say "assuming all criminals are black is a wise guess." What I meant to point out was that even if black people constitute even 51% of the prison population, the model would still be making a statistically-sound guess by returning an image of a black person.
Now if you asked for 100 images of criminals, and all of them were black, that would not be statistically-sound anymore.
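A toy illustration of that distinction, using a made-up 51/49 split rather than any real statistic:

    import random
    from collections import Counter

    distribution = {"group A": 0.51, "group B": 0.49}  # invented numbers

    # Always returning the modal class: 100 identical outputs.
    argmax_outputs = [max(distribution, key=distribution.get) for _ in range(100)]
    print(Counter(argmax_outputs))  # Counter({'group A': 100})

    # Sampling in proportion to the distribution: roughly a 51/49 split.
    sampled = random.choices(list(distribution), weights=list(distribution.values()), k=100)
    print(Counter(sampled))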
You're suggesting that during all of the testing at Google of this product before release, no one thought to ask it to generate white people to see if it could do so?
And in that case, you want us to believe that that testing protocol isn't a systematic exclusionary behavior?
When you filter results to prevent it from showing white males, that is by definition systemic racism. And that's what's happening.
Have you been living under a rock for the last 10 years?
Everybody seems to be focusing on the actual outcome while ignoring the more disconcerting meta-problem: how in the world _could_ an AI have been trained that would produce a black Albert Einstein? What was it even trained _on_? This couldn't have been an accident, the developers had to have bent over backwards to make this happen, in a really strange way.
This isn't very surprising if you've interacted much with these models. Contrary to the claims in the various lawsuits, they're not just regurgitating images they've seen before, they have a good sense of abstract concepts and can pretty easily combine ideas to make things that have never been seen before.
This type of behavior has been evident ever since DALL-E's horse-riding astronaut [0]. There's no training image that resembles it (the astronaut even has their hands in the right position... mostly), it's combining ideas about what a figure riding a horse looks like and what an astronaut looks like.
Changing Albert Einstein's skin color should be even easier.
[0] https://www.technologyreview.com/2022/04/06/1049061/dalle-op...
I don't think "just" is what the lawsuits are saying. It's the fact that they can regurgitate a larger subset (all?) of the original training data verbatim. At some point, that means you are copying the input data, regardless of how convoluted the tech underneath.
Fair, I should have said something along the lines of "contrary to popular conception of the lawsuits". I haven't actually followed the court documents at all, so I was actually thinking of discussions in mainstream and social media.
I thought you were going to say anti-white racism.
I think they did? It's definitely unclear but after looking at it for a minute I do read it as referring to racism against white people.
I thought he was saying that diversity efforts like this are "platitudes" and not really addressing the root problems. But also not sure.
That's a bogeyman. There's racism for sure, especially since 44 greatly rejuvenated it during his term, but it's far from systemic.
DEI isn't systemic? It's racism as part of a system.
sounds too close to "nice guy", that is why "spicy". Nice guys finish last... Yea, people broke "nice" word in general.
I had the same problem while designing an AI related tool and the solution is simple: ask the user a clarifying question as to whether they want a specific ethnic background or default to random.
No matter what technical solution they come up with, even if there were one, it will be a PR disaster. But if they just make the user choose the problem is solved.
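A rough sketch of that flow, with a naive keyword check standing in for whatever classifier would actually detect that unspecified people are being generated:

    def handle_image_request(prompt: str, ask_user) -> str:
        """If the prompt involves people but no background is specified,
        ask a clarifying question instead of silently deciding."""
        text = prompt.lower()
        involves_people = any(w in text for w in ("person", "man", "woman", "nurse", "engineer"))
        specifies = any(w in text for w in ("white", "black", "asian", "hispanic", "random"))
        if involves_people and not specifies:
            answer = ask_user("Do you want a specific ethnic background, or random?")
            if answer and answer.strip().lower() != "random":
                return f"{prompt}, {answer.strip()}"
        return prompt

    # In a CLI tool, `ask_user` could simply be the built-in `input`.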
This is anti-white racism.
Plain and simple.
It's insane to see how some here are playing with words to try to explain how this is not what it is.
It is anti-white racism and you are playing with fire if you refuse to acknowledge it.
My family is of all the colors: white, yellow and black. Nieces and nephews are more diverse than woke people could dream of... And we reject and we'll fight this very clear anti-white racism.