
Animate Anyone: Image-to-video synthesis for character animation

EwanG
54 replies
1d19h

I'm just waiting for the tool or toolchain where I can take a manga that I like that doesn't have an anime, and get a season or two out to watch when I feel like it rather than wait for it to get an official release.

Bonus points if I can let the tool ingest season 1 or an OVA of said material where a season 2 is never going to come (looking at you "No Game, No Life")

__loam
50 replies
1d17h

This is so bleak. I hope artists get enough legal protection in the future to stop ghouls from doing this to their work.

matheusmoreira
43 replies
1d15h

I hope not. I want AI to hammer in the final nails into the coffin of intellectual property so we can finally bury it. I want this technology to be so ubiquitous it cannot be controlled. I want them to give up on controlling things with such nonsense "legal protections", society should change permanently instead.

__loam
40 replies
1d15h

In the absence of a legal framework, I hope enough people recognize that scooping someone else's work into some kind of grotesque anime sausage machine is super shitty to do without the input of the author, and the people doing it get ruthlessly mocked. Most anime production involves the input of and express permission from the author.

Imagine someone doing this to show us "their" ending of berserk, for example. Just utterly disrespectful to Miura's legacy. The man dedicated his life to his art and you reinterpreted it like an asshole because you lack patience and demand more content at the expense of all artistic agency. Complete trash.

matheusmoreira
35 replies
1d14h

someone else's work

Just a number really. That's all intellectual work is: a number. All numbers already exist, humans just discover them.

I just can't take it seriously, this notion that intellectual work is somehow "special" and deserving of "protections". To me it's as delusional as trying to own numbers.

input of the author

Why do you need somebody's permission to do math with numbers? Makes no sense whatsoever.

Imagine someone doing this to show us "their" ending of berserk

No need to imagine it when fanfiction.net exists. AI will make it even easier.

DeIlliad
14 replies
1d14h

This "just math" argument is so reductionist and unhelpful.

matheusmoreira
13 replies
1d4h

And yet it's true. Literally cannot be refuted. It's a fact that all information is a number.

catapart
12 replies
1d2h

Bruh. I'm with you on IP being useless, but insisting on something being a "fact" is meaningless.

It's a fact that everything can be represented in binary. You can identify every knowable concept as either "cat" or "not cat". That's a fact.

Now... who cares? What does that being a fact mean? It means nothing. It's not useful. It doesn't prove a point any more than what you're saying does. All information is a number, that's a fact. Okay. Now... so WHAT?

The charge here is that some public information deserves protection. You disagree with that but, again, who cares? Prove why your opinion on that is relevant to any other human being, or engage with your irrelevance all by yourself. On the "protect public info" side, we can say that things like your social security number need to be necessarily public, but we need to be able to stop people from using it however they want. There are indexing reasons for that. There are medical reasons for that. There are (yes, far too many) financial reasons for that.

And if you don't like that example, let's try one more directly applicable with a creative example: trade marks are public info. But if we let every person do whatever they wanted with a trade mark, they lose the real, tangible value that they provide (iconography is foundational to how human brains efficiently interpret information).

So pick your poison, but then engage with the actual poison; don't just wink behind smug tautologies. Tell us how, in your view, we can have a framework that provides the values that we get from IP, when we don't protect IP, as a concept. Because, buddy, I've been longing for, and working towards, that kind of framework since I figured out how pathetically inept IP is at actually doing what it does.

matheusmoreira
11 replies
1d1h

Now... who cares?

I care. So should you, and so should everyone here on Hacker News.

What does that being a fact mean? It means nothing.

It means everything. It means what they really want is to more or less control what arguments you can pass to memcpy. The only way they can possibly do that is to destroy computing as you and I know it.

Think about it, about what they're really doing. All information is bits and copying bits is a fundamental computer operation. In order to restrict that, computers need to be made so that they will run all the programs, except the ones that invalidate intellectual property. Most likely they'll be made so they can run only the programs they approve. Government signed binaries, programming licenses.

Surely everyone here agrees that such a future is dystopian.

catapart
10 replies
1d1h

Neat.

Now, on to that actual engagement with the problem that I asked you to do...?

How do you guarantee the same value we get out of IP, without IP? OR, how do you propose we get along without the value that we get out of IP? What does that world look like, in a practical sense.

You can whinge about your pet libertarianism all you want but it doesn't actually get you any closer to being relevant. No matter how much you think anybody "should" care.

matheusmoreira
9 replies
1d

How do you guarantee the same value we get out of IP, without IP?

I guarantee you nothing. There are alternative business models that don't require artificial scarcity. Returns are not guaranteed.

OR, how do you propose we get along without the value that we get out of IP?

I propose we don't.

You can whinge about your pet libertarianism all you want but it doesn't actually get you any closer to being relevant.

It's just information, numbers. Therefore you can just input it all into stable diffusion models and LLMs and generate an infinite number of similar enough numbers at negligible costs. And there's nothing anyone can do about it short of tyranny the likes of which this world has never seen. And even then it's doubtful it's gonna work.

It's just information, numbers. Therefore you can copy and transmit it with no limits and with complete impunity. The only success they had at stopping it was making it easier not to infringe copyright but their greed destroyed even that. Doesn't matter how many billions they invest into DRM, copyright infringement still happens at scale literally every day. Copyright infringement is natural, it's how people think. It happens every single time someone right clicks and saves a picture off a website, every single time someone makes a little meme. And there's virtually nothing they can do about it short of destroying the internet as we know it.

How's that for relevance?

The only ones who are "whinging" about anything here are these dinosaurs. They're the ones who are irrelevant. They are so absolutely and thoroughly irrelevant they need government intervention, literal government granted monopolies, in order to not be completely and utterly wiped out by technology. They're sacrificing computing, everything we love, so that they can continue their rent seeking with centuries-long monopolies on bits. All I'm saying is we should sacrifice them instead. And let the chips fall where they may.

catapart
8 replies
23h13m

Right. Because the only answer this ideology proposes is to "burn it all down and..."

Which you might note is not an "answer" so much as it is succumbing to the chaos that you don't feel is important enough to control.

I do appreciate you making that explicit though! It's often hard to get people to admit to such naked dismissal of functional societal mechanisms that actual people get practical, tangible value from. Definitely contextualizes the framing more vividly.

matheusmoreira
7 replies
22h50m

Because the only answer this ideology proposes is to "burn it all down and..."

Nothing wrong with that. Intellectual property proponents will "burn down" the entire technology field to get their way. Every day it's some new DRM bullshit. They couldn't care less about hacking and values like computing freedom. Then you have governments who want to do things like stop citizens from using strong cryptography.

The fact people currently get value from it doesn't justify its continued existence.

succumbing to the chaos that you don't feel is important enough to control.

Controlling it requires destroying everything I care about. I'd sacrifice the entire copyright industry to keep it.

I do appreciate you making that explicit though!

It's what I strived to do from the beginning. I want to see things clearly, as they are. If the conclusion is the current status quo is a lie and should be abolished, so be it.

catapart
6 replies
20h55m

Must be nice to be so privileged! I hope every damaging thing you are apathetic about happening to other people happens to you so that you are able to test your principles in the most comprehensive way.

matheusmoreira
5 replies
20h27m

No need to hope, I generally try to put my money where my mouth is. I give away my software under copyleft licenses. My website has no advertising, no tracking, nothing. I have no intention of ever suing anyone for license violations either. I have a sponsors thing which I think is a perfectly ethical way to make money since it doesn't depend on artificial scarcity. It's currently sitting at zero dollars per month and I'm not complaining about it. I don't even advertise it because I think advertising is unethical. I try not to post my projects unless people ask me about them. Whatever wealth I have was not earned by means of intellectual property.

If you find instances where I'm contradicting my beliefs, I give you free license to call me out on it in public. I will either try to justify myself or try to change immediately if I can't.

catapart
4 replies
20h6m

I didn't say anything about hypocrisy; I don't care how you comport yourself or whatever personal restraints or actions you use to justify how you eschew empathy for ideological purity. You think you're unique? Or particularly virtuous? All of my software and licenses (and contributions) are CC0. How is that relevant? It's not.

What I hope for you is that when you are reliant on something, it is stripped from you without regard for how you need it, and hopefully using a particular kind of logic that you find repugnant. That would be an appropriate reversal of the ideology you are preaching. I hope that you lose something you NEED and that the people that could help you simply won't. Not out of inability but out of ideology. I hope that it happens to you not because it would be distressing to you, but because I am confident that such experiences can make one understand how ideological purity (in any specific aspect, logic included) is not a particularly useful measure of the validity or utility of an ideology as it pertains to applicable function. And if it doesn't, then your argument is only enriched by having experienced it. It's a win-win, if you're interested in the type of rigidly structured nonsense you're expressing.

matheusmoreira
3 replies
19h35m

Then you'll be happy to know it's already happened.

I relied on my country being a democracy but it's not. I also relied on socialists not being in power but they are. I also relied on my country growing economically but things are looking pretty bad, and they seem to get worse every day. There's a never ending list of these things. There are people out there who could have prevented it and they did nothing due to self-interest rather than ideology. Those who couldn't have stopped it but protested anyway are currently rotting away and dying in prison, having received worse punishments than murderers and rapists.

If you check my submissions you'll find threads about it. I'm sure you'll get a kick out of reading my comments. Enjoy.

catapart
2 replies
19h22m

Thank you, that is good to hear. I hope it continues to happen to you in more dire circumstances such that your education continues to enrich. I hope that you are so marginalized by ideology that you can no longer be, just like you advocate for those reliant on the values of IP. I hope that your education continues until you can be an example of the perfect version of what it means to have those engaging in ideological purity decide that you are an 'unfortunate consequence' of rearranging society in their image.

I'll take your silence as a realization of said pinnacle, and any future responses as indicative of your need for further education. And, in those cases, please understand that I am continually hoping for your success in that. Bonne soirée!

matheusmoreira
1 replies
17h45m

Be well.

catapart
0 replies
17h11m

Y tu!

raincole
4 replies
1d13h

Yeah, if you follow this logic... then humans are just a bunch of molecules. Mostly water. Some protein and fat.

Why do humans have any right at all? I can't take this seriously. Molecules don't have rights.

matheusmoreira
3 replies
1d4h

Let's not compare the copying of bits to the killing of human beings.

thfuran
2 replies
1d2h

Why not? Humans are, after all, just large quantum state vectors, numbers like any other.

matheusmoreira
1 replies
1d

Because information and human beings exist in completely separate realms of reality and morality.

thfuran
0 replies
21h23m

Everything is information. Humans are no less representable as a number than the subject of the average patent.

apersona
3 replies
1d13h

This is so reductionist, it's not even funny.

Imagine debugging your own code (your code being your intellectual work) and this guy barges into your room and says "Why are you wasting your time? Why didn't you just pick a better 'number'?"

DeIlliad
2 replies
1d13h

Listen, I'm on your side of the argument and this example makes no sense at all.

__loam
1 replies
18h3m

It's got some sense to it. It's like, there are "objectively right" numbers, so why commit yourself to any form of craft if a computer can just find better numbers?

There's a reason you can't copyright songs constructed through combinatorial algorithms. Human authorship plays a role in creative pursuit, by law.

matheusmoreira
0 replies
16h34m

there are "objectively right" numbers

No one said that.

so why commit yourself to any form of craft

Because you enjoy it.

creata
2 replies
1d12h

Why do you need somebody's permission to do math with numbers? Makes no sense whatsoever.

It's part of a very sensible legal framework to incentivize the creation and publication of hard-to-find numbers that you value very much.

I hate intellectual property as much as the next guy, but it's not nonsense.

Just a number really. That's all intellectual work is: a number.

You saw that everything could be encoded in a number, and instead of expanding your view of what a number could be, it diminished your view of everything else. That's depressing.

matheusmoreira
1 replies
1d4h

It's part of a very sensible legal framework to incentivize the creation and publication of hard-to-find numbers that you value very much.

I disagree about "sensible".

It's the 21st century, the age of ubiquitous globally networked computers. The copying, moving and transmission of numbers is literally a fundamental operation of these machines.

The tyranny required to maintain this "sensible" numeric ownership system grows every year. The world changing potential of computers is squandered because of this "sensible" system. Look at how much technology has been held back by the copyright industry. Are LLMs going to be yet another victim? I'm sick of it. Society needs to find another way to incentivize creation.

That's depressing.

Absolutely. This state of affairs depresses me very much. To maintain this "sensible" system, the computer freedom we enjoy today must be destroyed. It's the antithesis of everything the word "hacker" means.

xtracto
0 replies
1d3h

Just wanted to chip in saying that I completely agree with you.

Intellectual property is an idea from last century or older. It doesn't have a place in this new century.

Those "sensible" systems will have to change. Instead of incentivising people to find high value numbers by this artificial scarcity, number searchers will have to find other ways to get paid. That's all.

People that pick up the trash in public parks don't get paid over and over for the beautiful work they did. They have to do it again and again as most of us, once the product of our time is out.

progman32
1 replies
1d14h

To play devil's advocate:

The idea being that if we prevent people from doing specific math on specific numbers, the more proficient number-discoverers will have greater incentive to discover more of those pleasing numbers, and we all benefit.

... Which to me sounds wishful, given the reality of the system as implemented today.

Perhaps someone can help me understand the appeals to creative sanctity. Does making my own fan fiction cheapen the original work? Why is it immoral to change a fiction to suit my own preferences, as long as original authorship is not implied? I mod my single player video games all the time to enjoy them more. I have no qualms about patching out a tech tree if the grinding isn't to my taste; is this an affront? How about covering my favorite song? Making a custom cover for a special book in my collection? Skipping the scariest part of a horror movie? Singing a new jazz song imitating Armstrong's style? Help me understand.

matheusmoreira
0 replies
1d4h

if we prevent

That's a pretty big if. Think about what must happen for such violations to be prevented. You need computers that only execute legal software. It's literally the end of computing as we know it, the end of everything the word "hacker" ever stood for.

Creators are good but I certainly don't support protecting them at all costs. Certainly not at the cost of computing freedom. I love computers and I hate to see them limited to enable obsolete business models.

erdaniels
1 replies
1d13h

I'm thinking of a number, but I'm definitely not going to tell you it now. It's all mine.

matheusmoreira
0 replies
1d4h

That's all right. Secrecy is the only possible way to maintain control over information. I don't publish everything I create either.

Intellectual property is all about controlling public information. They deliberately publish it out there and then expect to control its flow and what it's used for. Makes no sense.

dclowd9901
1 replies
1d14h

You sound like a peach.

I’d love to know if you’ve actually ever created something novel that was good. Not code. Not math. Something uniquely original with no foundational basis.

matheusmoreira
0 replies
1d5h

Why not code? Why exclude that?

If you want to see the stuff I've created, the links to my website and GitHub are in my profile. It's all free and open source software. I'm not a fan of advertising either so I try very hard not to post my stuff here unless it's directly relevant to the discussion. I don't know about "good", that's for others to judge.

__loam
1 replies
23h22m

The idea that everything is just numbers is cynical, delusional, and dehumanizing. You should think more seriously about this before proclaiming everything is "just math". Math is just an approximation humans made up to explain things. It's very good at that but it is provably flawed.

I think it's extremely disrespectful to make a TV show out of a work of fiction without the permission of the author. That author would be well within their rights to sue you.

matheusmoreira
0 replies
23h2m

The idea that everything is just numbers is cynical, delusional, and dehumanizing.

Delusion is thinking you can own numbers.

What does it matter what these silly laws say? They're borderline unenforceable anyway. Imaginary ownership rules are constantly violated world wide at massive scales. Ever saved a photo from a website? Ever shared a picture with someone else? You violated these rules.

And what are they going to do about it? Are they gonna sue everyone? Throw everybody in jail for "disrespecting" their imaginary numeric ownership. What a bunch of bullshit.

It's the 21st century. Better to just accept it and let go.

hau
2 replies
1d7h

disrespectful to Miura's legacy

You mean the highest form of flattery? Literally participating and reflecting, trying to recreate and reinterpret is how humans integrate, show acceptance, make it part of themselves and culture. His art is not sacred, nor is it absolutely original, nor is it made in isolation from the world. Miura's legacy is sharing his reinterpretation of whatever. For it to exist there must be fewer elitist underground bunkers full of sacred IP which we are unable to think about, discuss and share our interpretations of.

__loam
1 replies
23h30m

Yeah this opinion sucks dude. Have some respect.

stale2002
0 replies
1d

Fair use and transforming other people's work is completely accepted in the creative industry.

But, of course, when techies do the same thing that creatives have done for decades (i.e., make transformative fan art), well now that's not fair apparently.

you reinterpreted it

Reinterpretations, fan art, and "what if" scenarios are completely common in creative industries. People build whole careers on transforming others' work.

an_aparallel
1 replies
1d11h

stem splitting makes me feel that way - i feel like legal music protection is just going to be turned on its head...when people can now sample JUST Jimmy's guitar, just Cobham's drums...JUST Elton's piano...i wonder if it's already happening under our noses?

ChatGTP
0 replies
20h4m

No one is listening to Jimmy's guitar, they're listening to rubbish on TikTok.

owenpalmer
1 replies
1d14h

There is no amount of legal protection that can prevent 1 individual from generating billions of movies based on an artist's style. It's time to embrace what the act of creating art will mean in the future. You can't just regulate away the driving force of automation.

__loam
0 replies
22h6m

Whenever someone says something like this, my immediate reaction is to think they don't understand the economics of media or the attention economy. Productivity alone isn't a good metric.

boppo1
1 replies
1d14h

Artist here. I can hardly protect my work from ghouls as it is. If I want to work for a company as an illustrator, it's like pulling teeth to get my personal work separated from their IP. Many contracts I've seen were 'you get salary, we get everything you make'. Hell I've seen a contract try to claim my knowledge of niche animation tools was company property.

__loam
0 replies
18h0m

Not saying existing frameworks are the ideal, I just think there should be a reasonable middle ground between a corporate intellectual property hellscape and a corporate "we steal all your work to feed our AI sausage machine" hellscape.

dclowd9901
0 replies
1d14h

It is. People who say AI won’t kick people to the curb permanently are either completely ignorant or selling it.

ChatGTP
0 replies
1d17h

They won’t. I wonder if there will still be demand for original content though because it creates a connection. A communication of what others have to say ?

sgbeal
0 replies
1d8h

Bonus points if I can let the tool ingest season 1 or an OVA of said material where a season 2 is never going to come (looking at you "No Game, No Life")

At long last, _Firefly_ Season 2 is within our grasp!

all2
0 replies
1d19h

To be honest, all the pieces are there to create a pipeline. There's still a lot of work on the human side for shot composition, camera movement, etc., but the pieces all exist right now to make this a reality.

Pxtl
0 replies
1d19h

"Hey Bing, can you make me a live action version of the Scouring of the Shire as if it were part of the Peter Jackson Lord of the Rings movies?".

tobr
23 replies
1d19h

That’s quite astonishing. In just a few years this might even be generalized to work for characters other than conventionally attractive young women.

nwoli
11 replies
1d19h

You really want that to be a comment you make on revolutionary new tech? Think of what you’d think of finding dismissive comments like that about the invention of the telephone

simonw
6 replies
1d17h

I agree with OP: having a demo video that's 90% pretty young ladies is a bad look.

I want to show this demo to people, but I'm honestly slightly embarrassed to do so because it's such an obvious example of the male gaze. It's kind of tacky, and it distracts from the extremely impressive technology that's being demonstrated.

nmfisher
2 replies
1d15h

These shitty dances are all over TikTok, more than 90% of which are women (in fact, I don’t think I’ve ever seen a man doing one).

spiderice
1 replies
1d15h

There are a ton of men doing dances on TikTok. They just don’t get shown to you because the algorithm knows what keeps your attention.

all2
0 replies
1d8h

I'd love to see how this imagen works for some of the complex footwork or acrobatics videos that trend. Testing the limits with extreme poses would be quite interesting.

tobr
1 replies
1d5h

More specifically, if you claim to be able to “animate anyone”, why wouldn’t you demonstrate a diverse set of characters? A mix of ages, genders, skin colors, body shapes, and abilities, as well as a mix of photos and various illustration styles.

bigfudge
0 replies
1d5h

This exactly. And in reply to GP -- I enjoy showing tech demos like this to my teenage boys and did also feel slightly embarrassed sending them this one. At least it's the opportunity this evening for a conversation about it.

JSavageOne
0 replies
1d12h

Yea they definitely should've thrown in some ugly ones in there

redleggedfrog
1 replies
1d18h

I'll go one further. The uncomfortable fixation on attractive women for the models is only an inkling of what is to be the primary application of such technology, which will be porn. No matter how amazing the tech underlying these animated stills may be, the race for the lowest common denominator is already shining through in their examples. Don't think for one moment they didn't choose those on purpose. They know where it's heading.

TeMPOraL
0 replies
1d17h

Porn isn't the lowest common denominator. Advertising is. Which they more than hint at through the explicit mentions of fashion applications and comparisons to other models in this space.

People who think porn is anywhere near the worst application are, IMO, letting their sensibilities cloud their judgement.

dvngnt_
0 replies
1d18h

For ML questioning the data seems okay

ChatGTP
0 replies
1d17h

You’re arguing that this moment is important in history and the parent is arguing that, if true, then a less superficial subject should be presented to demonstrate it.

When I see all this generative AI stuff, most people are excited about generating manga, computer games and porn.

I know it’s exciting if you’re into consuming media. But I think the average person would look at this “breathtaking moment” and be surprised computers couldn’t do this already, honestly.

Probiotic6081
5 replies
1d18h

Hey, they're demonstrating it's working for overly sexualised cartoon characters too!

kaliqt
4 replies
1d8h

There's no such thing as "oversexualized", stop shaming people for being normal. Your harassment is not welcome here.

lopis
1 replies
1d6h

There's no such thing as "oversexualized"

There is, when we're talking about characters made to look like children.

gumballindie
0 replies
17h49m

Right? And the audacity to claim calling them out is harassment is just uhm … creepy to say the least.

bigfudge
0 replies
1d5h

Oversexualised used this way refers as much to an industry/society as individual people. For example, while individual sex workers might have autonomy and we can respect their choices, we can still notice that our culture creates a situation where women are primarily given value for their physical attributes, above others, and that this applies less to men.

Probiotic6081
0 replies
23h45m

keep jerking off to cartoons weirdo

rcbdev
2 replies
1d6h

This comment thread is peak derangement. They chose anime girls because that's where we have the most comprehensive close-to-homogenous visual data available. Nothing more, nothing less.

bigfudge
1 replies
1d5h

To call this comment deranged is to misrepresent the context entirely, and also be extremely naive about the selection of videos for the paper. Yes it's possibly a bit snarky, but quite funny too. It's certainly what I was thinking when I looked at the results: this paper is written in a field which doesn't have many women in it! I think it's highly likely this will get more exposure because of the choice of examples. Yes they have a robot too — but there aren't any "normal looking" people doing dances.

GaggiX
0 replies
1d4h

What do you mean there are not any normal looking people doing dances? The individuals shown on the page might be considered attractive by some but they are not abnormal for that.

gumballindie
0 replies
17h51m

My thoughts exactly. I am not sure why so many working in ai are obsessed with child looking anime characters, or in some cases, sexualised human looking animals. I have a few suspicions and i find the trend at least disturbing.

Terretta
0 replies
1d17h

There are a couple males (under Human, click 4th dot) and an ironman robot too (first under Humanoid).

mbo
18 replies
1d12h

The choice of test images here feels super inappropriate. Surely there's a more diverse and standardized dataset for benchmarking this task than what was chosen here.

I quote similar criticism from Dianne P. O'Leary:

Suggestive pictures used in lectures on image processing ... convey the message that the lecturer caters to the males only. For example, it is amazing that the "Lena" pin-up image is still used as an example in courses and published as a test image in journals today.

newfriend
8 replies
1d10h

This appears to be published by a group of folks from China where political correctness hasn't infected every aspect of society. It's ok to not be offended by everything.

hiddencost
7 replies
1d10h

Diversity makes stuff better, and even if it didn't, it would be worth while.

That folks like you think advocating for diversity is about "political correctness" rather than building a world that is better to live in is very sad.

vunderba
1 replies
1d9h

More diversity is fine (give me some manly men aka the burly Simpson towel man) but you ignored the fact that OP also very much handed down a virtue signaling level of high handed "judgment" of the images in question, claiming that their inclusion was "super inappropriate"...

I mean, give me a break. They're completely tame other than being mostly women, but apparently that alone is enough to send people into a fit of hysterics.

mbo
0 replies
19h16m

Okay, clearly the feminist angle has fallen flat here as I, an anonymous user, am now being accused of virtue signaling. I'll try a more lowbrow angle:

This paper is what I'd characterize as "horny on main" (https://knowyourmeme.com/memes/horny-on-main). It's embarrassing. They've spent all this time doing impressive, cutting edge research just to make women (and girls!) dance for them. They could have chosen a dataset that at least made them look cool, rather than like weirdos. It weirded me out! Very likely weirded out others.

unsupp0rted
1 replies
1d8h

Diversity makes stuff better

Citation needed

autoexec
0 replies
1d8h

newfriend
0 replies
1d9h

That's just like.. your opinion man

Diversity is fine. Forced diversity is not. Having different opinions and things that are not forcefully diverse is its own form of diversity. Making everyone conform to your ideas is not diversity.

kaliqt
0 replies
1d8h

No, it does not.

KHRZ
0 replies
1d7h

This is also China's justification for being a dictatorship. "We are just different and it works best for us, not every country must be a democrazy."

IshKebab
4 replies
1d9h

I wonder what your comment would say if the examples were all men.

Vinnl
1 replies
1d9h

I imagine it might be much the same if those men were mostly underwear models flexing their muscles?

Though of course, that would've mostly been super uncomfortable if the field was also dominated by women.

stale2002
0 replies
1d

Did you comment on the right thread?

None of the examples have people in their underwear.

Instead, it is regular people, and animated characters, doing tiktok dances.

Not sure why you are equating this to sexual content. It's simply trendy/viral content. Nothing sexual about it.

Dancing, and animated characters, are a great way of showing off what the technology is capable of.

mbo
0 replies
19h26m

I would be okay with it because men are not as systematically objectified in our society and do not face the same barriers to entry to computer science as women.

Peritract
0 replies
1d9h

The point is that they aren't.

numpad0
1 replies
1d8h

I don't get this "suggestive pictures cater to male" logic. The reality don't support it. It's almost like saying only men are humans or only men have certain functions, so pictures of certain objects are for men only. It just isn't how anything works.

aqme28
0 replies
1d5h

What do you mean "The reality don't support it"? Literally every example in the header video is a young woman or anime girl

kaliqt
1 replies
1d8h

There's absolutely nothing wrong with it. Most researchers are male, and this is what helps people get through the work day. Stop trying to harass people.

spondylosaurus
0 replies
1d8h

What part of their comment is harassment?

crazygringo
12 replies
1d19h

Just wow. This is the first time I've seen AI generate convincing human movement. (And the folds of fabric in dresses seem to move realistically too!)

Of course, the actual movement skeleton is coming from presumably real motion-capture, but still.

I'm curious what the current state is of generating the movement skeletons, which obviously has a ton of relevance to video games. Where's the progress in models where you can type "a burly man stands with erect posture, then crouches down and sneaks five steps forward, then freezes in fear" and output an appropriate movement skeleton?

lukew3
3 replies
1d18h

I don't think that text to movement will be desired. There is a lot of room for interpretation there, and if a creator already has a vision for movement, they would probably choose to have the model mimic a video of them acting rather than trying to get lucky and correctly describe a set of movements.

crazygringo
1 replies
1d17h

To the contrary, I think text-to-movement is going to be huge for videogames especially.

I don't see any other way to smoothly link 1,000 possible movements to and from each other, including when there are various fixed distances between you and a ladder or ledge etc.

I think models will learn "movement personalities" the same way they learn a particular celebrity's voice -- everybody moves with different rhythms. So your big burly action-hero character will move with a totally different rhythm from your waif-thin ethereal elf.

But there will still be a textual vocabulary that generates the motion -- "stealthily creeps to the door 2.2 meters away, taking 6.3 seconds, and then suddenly and dramatically opens it with a flourish".

spookie
0 replies
1d14h

Inverse kinematics have been solving most of the animation blending you're describing here for years. Not to dismiss the potential of this completely, but there are already pretty good, reliable ways to solve these issues. I believe the person you replied to is correct in most cases; there is a lot of nuance lost if done by text.

ygjb
0 replies
21h11m

Text to movement is highly desirable. There is a long way to go before tools like this are used in AAA games, and blockbuster films, but as much money as is tied up in those areas, even more is tied up in advertising and lower tier content production pipelines.

Look at tools like Adobe Animate and Character Animator, these tools are absolutely capable of doing interpolation between frames while giving the person operating the tools a fine degree of control. When you can use text to movement to quickly create tens or hundreds of sample scenes quickly, then manually iterate on and edit them, instead of hand drawing and compositing, the value proposition is pretty crazy for time and budget constrained production where tradeoffs on quality and fidelity are acceptable.

anonylizard
3 replies
1d18h

This is already highly, highly relevant for 2d animation.

Many complex moves (especially dancing) are filmed in video first, then the movement is traced over by hand. This is called Rotoscoping.

This is basically auto-rotoscoping, and I expect it to see commercial usage within popular high budget projects within 2 years. Previously, even the highest budget anime couldn't really afford 2d dance scenes due to the insane drawings required.

sircastor
0 replies
1d16h

As I recall, Ralph Bakshi used this to mixed effect back in the 70s with The Lord of The Rings animated films.

hondo77
0 replies
22h44m

Many complex moves (especially dancing) are filmed in video first, then the movement is traced over by hand.

If the "animator" is a hack.

Tanoc
0 replies
1d15h

Rotoscoping is directly tracing from film frame by frame. Most of the time animators instead use what are called reference key poses. For extremely stylized characters it can be difficult if not impossible to use rotoscoping for complex sequences because the proportions make the motions impossible. Instead specific frames where there's resting motion in the sequence are pulled, and the animators translate the positions of the joints to the unusual silhouette of the character for the key frames. The Castlevania and Masters Of The Universe: Revelation series from Netflix use this quite a bit for realistic characters -- it was heavily used during the Trevor and Death fight and Skelegod versus Savage He-Man fight -- and Rise Of The Teenage Mutant Ninja Turtles and High Guardian Spice used it for their stylized characters for perspective and fight scenes.

If this technology can instead be used to not only find the key poses but also roughly determine the inbetweens for a heavily stylized character like Rocko Wallaby or Blossom, Bubbles, and Buttercup then it is extremely useful.

COAGULOPATH
1 replies
1d17h

It still doesn't look ready for primetime to me. When the characters' hands move, they either gain a billion extra fingers or morph into fleshy shapeless clubs. It's just hard to tell because the video samples are small and poor quality. At 1080p, it would look extremely bad.

sp0rk
0 replies
1d15h

The individual example videos are nearly 1080p (1024 height) and they look pretty good to me. There is some glitchiness on the hands but it's very minor and could definitely be fixed easily in post-production or through further refinements. In fact, if you play the high resolution videos and pause at random points, it's hard to even find a frame where there are extra fingers, though there are a few.

the8472
0 replies
1d18h

Of course, the actual movement skeleton is coming from presumably real motion-capture, but still.

I recall seeing from-scratch learned realistic motion somewhere. I think it was meant for character animation that would more realistically interact with the terrain and cross obstacles in video games. So that should be synthesizable too.

netruk44
0 replies
1d19h

The input poses appear to be generated from OpenPose [0], which uses regular images as input. With the creation of stable diffusion video, you could theoretically prompt it to make a video of what you wanted, then run it through OpenPose.

But I think the more realistic approach is to take a video of yourself doing the motions you want the AI to generate, and run that through OpenPose instead.

Using just words leaves a lot to the model’s interpretation. I feel like you might wind up spending a lot of time manually fixing little things, similar to how you might infill the “wrong” parts of an AI generated image. It might be easier to just take a 15 second video to get the exact skeleton animation you want.

[0] https://github.com/CMU-Perceptual-Computing-Lab/openpose

elpocko
8 replies
1d19h

Why would you publish your findings on Github of all places, but not release any code? I think this trend is really weird.

Kiro
1 replies
1d18h

What's the alternative? I haven't found anything that's easier to deploy and manage than GitHub Pages.

elpocko
0 replies
1d18h

I always thought of Github as a place to host software projects, but it does work as an image hosting service as well I guess.

ryanackley
0 replies
1d17h

I'm fine with them hosting on Github but they have a link that specifically says "Code" that takes you to a relatively empty github repo with no code. Hopefully, putting the actual code on github is a work in progress.

runeblaze
0 replies
1d11h

When researchers finish papers often they are too exhausted to choose anything beyond the "easiest" path, in this case maybe using an existing template and a gh pages website to attract publicity.

(I know people are often too exhausted to even upload the preprint... They take a break and upload later)

octagons
0 replies
1d19h

Just guessing based on the authors’ names and their affiliation with Alibaba Group, but I think this research was published by exclusively Chinese citizens.

In my experience, it’s difficult to operate a small, personal website from within China because of their regulations in regards to non-government websites. Because of this, you will often find that Chinese citizens will use approved (or at least unrestricted) services like GitHub pages.

Having worked closely with many businesses based in China due to my hobbies, I have noted that services like Google Docs and Drive are favored for this reason.

I would guess there are ways to host content like this more easily on platforms that are only accessible within China or are not navigable without the user understanding Chinese language.

I would also guess that this is part of the reason why services that target customers in China tend to become “super apps” that combine several services that non-Chinese users would expect to find on disparate sites. For example, services may combine a social media style newsfeed/interaction API, banking, email, shopping, and food delivery into a single platform.

errnoh
0 replies
1d7h

At least based on my observations it's been common practice in ML papers for some years already. Usually releasing Github hosted project page and a repository with the same information, then releasing the code on that repo afterwards at some point.

I don't feel that's an issue. A lot more people are able to see what's happening on the bleeding edge than if they'd just release the paper without accompanying demo page, and faster than if they'd wait for the code to be ready for release? Of course one can argue that "they should just release whatever code they have instantly", but that's their choice if they want to clean it up, remove secrets etc.

crazygringo
0 replies
1d19h

Because it's basically free webhosting and you don't need to manage registering a domain?

I don't know for sure, but that's my guess. You could achieve something similar with S3, but you need a credit card attached, and then you need to worry about whose credit card, and what if it gets unexpected traffic and who will pay...

You could use Google Sites as well, but then you need to buy a domain, which again means requiring a credit card, and whose responsibility is it to pay and for how many years?

I don't think it's mostly about the cost, I think it's mostly about just not having to link a credit card?

aqme28
0 replies
1d5h

It is a little bit funny that there is a link to "Code" that takes you to github, but it's a repo with just a readme and a video.

esotericsean
6 replies
1d19h

Pretty huge breakthrough. Hopefully we'll be able to access this soon. Between this, SVD, and others, 2024 is going to be the year of AI Video.

esafak
4 replies
1d18h

I suppose SVD means something other than singular value decomposition, because that is not new?

ilaksh
1 replies
1d18h

The tendency to instantly and liberally invent and use acronyms like this is very annoying and a poor communication style.

I think they mean Stable Video Diffusion.

esafak
0 replies
1d17h

Good luck to them if they choose to name it SVD; they're going to get buried in search engines.

nmfisher
0 replies
1d15h

Stable Video Diffusion.

matheusmoreira
0 replies
1d15h

SVD could also mean this badass sniper rifle.

https://en.wikipedia.org/wiki/SVD_(rifle)

ChatGTP
0 replies
1d17h

I don’t know why, but who cares? People who like consuming TikTok videos?

thyrox
5 replies
1d16h

Just imagine in a few years there will be a site like YouTube aka videogpt where all the videos are created on the fly, like chatgpt does for text.

From step by step repairing my electronics to learning about science it will all be customised according to my learning level and focus on things I want to know more. And hopefully without that a word from our sponsors section.

chii
4 replies
1d16h

And hopefully without that a word from our sponsors section.

it won't have this, because the product placement would be seamlessly integrated into the content so naturally that you don't notice being subliminally persuaded to think or buy what they want you to!

matheusmoreira
2 replies
1d15h

Looks like we're gonna need AI ad blockers soon.

erybodyknows
1 replies
1d1h

Eyelids, the original ad blockers.

matheusmoreira
0 replies
1d

They'd put ads under our eyelids if they could. They'd put ads in our dreams.

corobo
0 replies
1d14h

And don't even think of pausing a TV show now that the paused characters can animate and sell you products!

Actually that tech would be kinda cool. The reality would be bleak, but the tech would be cool.

If nothing else have the characters look out of the TV and tap their foot impatiently like Sonic used to in the videogame when you stopped playing for a moment

xrd
3 replies
1d16h

Great, this looks perfect as a way to sell more fast fashion, the 2nd worst polluting industry in the world. So many better applications of this technology and instead they showcase dancing boring body stereotypes.

nathan_compton
1 replies
1d6h

Thanks for making me feel like I'm not the only one who has a problem with this. This is a pretty unprofessional way to publish results. Like just think for 2 seconds about what women scientists might think about this particular presentation.

I'm sure there are women out there who don't have a problem with this. And I am not some kind of weirdo who has a problem with people enjoying looking at attractive people. But I think in a professional context there really isn't any good reason to lead with stuff that might alienate people. It is, at the very least, tacky.

xrd
0 replies
1d3h

I was downvoted heavily on that comment. I've got two daughters and this is not the way I want them to think about usage of AI. My 8 year old is already worried about her body and her weight and it feels like it is appropriate to at least discuss how this lands on all people who see it. I appreciate your comment because the downvotes make me think I'm alone in this way of thinking.

IshKebab
0 replies
20h32m

This looks like it would be useful for any online clothes shopping. Why fixate on negative uses?

sys32768
3 replies
1d18h

Just imagine when this merges with 3D modeling and VR.

The VR pr0n, the video games with dynamic AI characters. Dead actors and historical figures resurrected into movies and education.

I'm not so scared about my future nursing home now.

BizarroLand
1 replies
1d16h

Live people will probably make hella money with VR teledildonics in the space between when the products first become available and when AI learns the process well enough to outperform any human at a fraction of the cost.

numpad0
0 replies
1d8h

Humans have always outperformed generators in terms of discriminating live captures against generated data. There is no guarantee that that continues, but there is no sign of that balance of power changing.

civilitty
0 replies
1d6h

Finally, I’ll be able to act out my Abe Lincoln sex fantasy.

justanotherjoe
3 replies
1d13h

why is everything from this space so horny. I'm not sure if I like it or not. Obviously it's a bit problematic but on the other hand i welcome when people are honest about their intentions like this.

siddbudd
0 replies
1d13h

They made the model learn by watching TikTok videos. WYSIWYG.

numpad0
0 replies
1d8h

Because trying to create porn gives a strong and instantest gratification. Things work out, you get more energy to the brain. Things don't work, and the brain goes head first into debug logs. When culturally not inhibited, it grants you superpowers.

IshKebab
0 replies
20h34m

The world is horny. This is just not hiding that.

Seriously all the people complaining about it being used for attractive women dancing... no shit. Have you seen life?

all2
3 replies
1d19h

I'll mention Corridor Crew's Rock, Paper, Scissors [0] as the previous state of the art in terms of character animation/style transfer/etc. using AI tooling.

I imagine this will make the barrier to entry for animated stuff very, very low. You literally only need a character sheet.

Also, the creep factor for AI girlfriends has ratcheted up another notch.

[0] https://www.youtube.com/watch?v=7QAGEvt-btI

autoexec
1 replies
1d8h

I think quality animation will still require a lot of skill. I do think that AI will be the end of animators who do nothing but inbetweens though. Considering the huge amount of skilled work involved and the low pay maybe that's a good thing, but it could also make it harder to develop and discover amazing animators.

kamranjon
0 replies
1d2h

Are there really that many animators left that actually do in-betweens? I would think most animation houses are using software these days for that. I think old-school shops like Studio Ghibli maybe still do some hand-drawn stuff, but I’d guess most animation these days uses tweening in software.

Handprint4469
0 replies
1d5h

Also, the creep factor for AI girlfriends has ratcheted up another notch.

I don't know if I agree with the framing here ("creep"), but for sure this will make AI girlfriend/boyfriend products even more addicting.

radarsat1
1 replies
1d9h

I'm as interested as anyone in the methods behind this and find deep learning just continues to amaze and I really enjoy working in the AI space. Having said that, seeing that we now essentially have the capability to synthesize videos of people doing things they haven't done and saying things they haven't said, I really do worry about the potential for abuse of this technology once it's really perfected. What does it mean for society when we can no longer enter recordings into evidence?

unsupp0rted
0 replies
1d8h

Net benefit. Before we should have questioned the veracity and applicability of all evidence. Now we have no choice.

huytersd
1 replies
1d18h

How do you generate the movement data?

netruk44
0 replies
1d18h

It looks like they’re using OpenPose [0] images fed to a special “pose guider” model. You can make them from just regular video.

[0] https://github.com/CMU-Perceptual-Computing-Lab/openpose

hombre_fatal
1 replies
1d19h

This is absolutely insane. The DreamPose output they compare themselves to is less than one year old.

It's funny to go back to the first avocado chair or deep dream images that wowed me just a couple years ago.

I can't help but feel lost in the pace of tech.

johnyzee
0 replies
1d18h

It is a massive seismic shift. Almost any project in the works right now, that involves visual content, looks like it will be antiquated in a very short time. Including some that I am working on :(. The new possibilities though... Breathtaking to think about.

bitwize
1 replies
1d16h

This is why actors are striking.

yreg
0 replies
1d14h

The strike ended on Nov 9th and this wasn't the main cause.

Anyway, they will have to adapt. The diffusion models are not replacing e.g. live theater.

achatham
1 replies
1d15h

What's the trigger for all the diffusion video projects this week? Are they all at the same conference, or did one trigger everyone to rush to publish? I'm grateful but curious. Animating still pictures from my kids' books is my goal and it seems close to plausible just from this week's advances.

riotnrrd
0 replies
21h1m

CVPR submissions were two weeks ago.

simonmysun
0 replies
1d5h

Is this considered abusing the GitHub CDN? The project is not open source, but is using GitHub for hosting so many videos.

rvz
0 replies
1d15h

I predict that Meta will release another open source version of this, given their advanced research in pose estimation.

modeless
0 replies
1d12h

This looks impossibly good. I think these samples are cherry-picked and I also think the system is essentially overfit on these datasets and would not generalize to anything even slightly different. I want to see their failure cases! Lack of failure cases is a red flag.

Still, though, it could be useful in its current form, and making a more general system might mostly be a matter of collecting appropriate training data. Impressive work, just needs a more realistic presentation.

maxglute
0 replies
8h54m

I love how Chinese image synthesis studies fixate on anime. I'm sure others do too, but they're particularly weeby.

kamikaz1k
0 replies
1d16h

Page is crashing my mobile chrome tab…

dartos
0 replies
1d12h

Why even have a GitHub?

classicalhabits
0 replies
1d

Wow this is uncanny

brunorsini
0 replies
1d13h

Apologies for the lazy question... But with so many incredible models being released, how are you all keeping up? Are you mostly trying them all out on the web or are you installing each of them locally?

allanrbo
0 replies
1d19h

Very impressive quality.

EGreg
0 replies
1d8h

How did they make the girl smile as she dances in the last ones?