This seems relevant for discussion:
Nvidia bans using translation layers for CUDA software to run on other chips [1]
____
It's absolutely absurd that AMD stopped funding this, since right as it got released as open source it started producing value for AMD users. You'd think this exact thing would be their top priority, but instead they've been faffing around for years with two (are they up to three now?) alternate APIs with minimal support so far.
If it ever became a reliable option Nvidia would just send a cease and desist and then sue. It's a blind alley as a serious solution.
It makes sense in that context.
Why would software that lets me use hardware that I own, installed in my machine, be subject to a cease and desist?
Welcome to the US, where you can patent protocols and APIs. (AFAIK.)
In the EU you could have done it, but because of US risks they killed it anyway.
If it is totally fine in the EU, why not just host it there? Spain (or whoever) could start up a cottage industry of ignore-local-ip-law-as-a-service. The Uber of IP law.
Because the US enforces its rules worldwide.
That's not legal, but who's gonna stop them?
No, the US can't.
Also, Wine has done the same forever for Windows binaries. Or NetBSD with its compat_* libraries for tons of Unix-like OSes.
Try selling Cuban goods in Europe to other European citizens, in Europe, and let them pay with PayPal (Europe) S.à r.l. et Cie, S.C.A.
Spain does it fine with hotel chains. Maybe not Paypal, but for sure it does commerce with Cuba.
The US uses trade agreements to enforce the rule in the EU. Spain used to be quite lenient with copyright, but the US threatened to block all sales to Spain of movies and music. Then a minister basically implemented new restrictions a week before their term was up.
It's still lenient. You can still legally share movies and music without profit.
Yet I can download VLC
Because the software you use in turn uses other software that is licensed.
Just like it is not legal to commit copyright infringement, no matter how much you own the hardware you do it on.
This is an LD_LIBRARY_PATH emulator. No CUDA installation required.
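The mechanism is roughly this (a hand-waved C++ sketch of the idea only; ZLUDA itself is written in Rust and covers the full driver API, and rocmBackendInit is a made-up placeholder):

    // Export the same C symbols as NVIDIA's libcuda.so; an unmodified
    // binary then binds to this library at load time via LD_LIBRARY_PATH.
    int rocmBackendInit(unsigned int flags);  // hypothetical ROCm-backed impl

    extern "C" int cuInit(unsigned int flags) {
        // cuInit is the real CUDA driver API entry point; instead of
        // touching NVIDIA hardware, forward to the AMD backend.
        return rocmBackendInit(flags);
    }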
You probably still want to use things like cublas if you want to run existing CUDA software.
I would want an equivalent of cublas optimized for my specific GPU model and implementing the same API.
AFAIK cuBLAS and the other first-party libraries are hand-optimized by nVidia for different generations of their hardware, with dynamic dispatch at runtime for optimal performance. Pretty sure none of these versions would run optimally on AMD GPUs, because AMD GPUs ideally run 64 threads per wavefront while nVidia GPUs run 32.
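For a concrete sense of how that width leaks into code, here's a minimal sketch (not actual cuBLAS source) of a warp-level reduction that hardcodes nVidia's 32-lane warp:

    // Sums a value across one 32-thread warp using shuffle intrinsics.
    // The starting offset 16 (= 32/2) and the full 32-bit lane mask
    // bake the warp width into the algorithm.
    __device__ float warpReduceSum(float val) {
        for (int offset = 16; offset > 0; offset >>= 1)
            val += __shfl_down_sync(0xffffffffu, val, offset);
        return val;
    }

On a 64-wide AMD wavefront a naive translation of this either sums only half the lanes or leaves half of them idle, so the tuned library has to be re-derived per architecture, not just recompiled.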
Just like it is not legal to commit copyright infringement, no matter how much you own the hardware you do it on.
It is legal for me to make a copy of any copyright protected media using hardware that I own. It is not legal for me to share this copy with others.
Emulator developers should incorporate in the Netherlands, it seems.
Nvidia can't copyright an API. Sure they can sue, but that would be a SLAPP.
I don't really think this was in AMD's best interest when you think about it strategically. Unless it were production-grade and legally tested, it's basically a tool that would enable developers to build applications using AMD and then deploy on NV. Perhaps a short term win on the consumer card side, but a long term foot-gun that would only serve to continually entrench NV in the datacenter.
That's a problem indeed, since it would further entrench CUDA. If you want people to develop for your platform, it could be counter-productive.
That said, what _is_ AMD's platform? OpenCL? Vulkan compute? If they don't have an alternative, then the strategy doesn't make sense.
What is AMD's platform?
That is a very good question that apparently AMD has no good answer for. I have lost track of the number of half-baked implementations for GPGPU that AMD has attempted and then left to rot. Even if they told me they had the answer now, and that they were going to put all their focus on it, I would not trust them to deliver. Their best bet is to create implementations for popular libraries like torch that actually stand a chance of working as a drop-in replacement.
I have lost track of the number of half-baked implementations for GPGPU that AMD has attempted and then left to rot.
They heard that it was successful for Google?
That's exactly what they are doing https://rocm.docs.amd.com/projects/install-on-linux/en/devel...
A problem is probably that there are just a ton more Nvidia cards out there, and CUDA is hard to keep up with, so nobody is going to invest in an alternative, more open language.
A possible solution (that doesn't involve being better than Nvidia at the things they are good at, which seems to be impossible) is to create frameworks that spit out CUDA or, whatever, OpenCL. Nobody actually wants to use the languages themselves, right? Everybody loves cuBLAS and cuDNN, not CUDA, so make a GPUOpenBLAS. Maybe they could get Intel to come along with them.
That's literally what they're doing. HIP supports Nvidia as a backend, and AMD are making replacements like MIOpen, which is intended as a quick-port replacement for cuDNN.
Vulkan Compute is not AMD-specific in any way, it should "just work" on any Vulkan-capable GPU including integrated ones.
HIP/ROCm is the direct equivalent to CUDA (with MiOpen as the cuDNN equivalent). It's actually Not Bad, although it is pretty AMD-typically buggy.
What is the alternative? Build their own API and expect developers who have been burned by AMD again and again to provide dedicated support for it? Good luck with that. CUDA is already entrenched, NVIDIA hardware is already deployed, so as a developer I can't not support it. Why would I then go out of my way for the single-digit percentage of users that need AMD support?
I think they're betting that as NVIDIA hardware gets more expensive and harder to get, especially in the datacenter, there's now an opportunity to entice developers to take on the pain of porting to HIP/ROCm.
Nobody cares about "users" in this case, it's bespoke applications running on bespoke infrastructure at scale.
AMD is just as fabless as NVIDIA, both compete for production capacity, and eventually TSMC etc will get around to producing more chips than the market will want to buy.
Instead customers buy and develop on Nvidia and then deploy on Nvidia leaving AMD with nothing at all.
"Production grade" feels pretty ambiguous. It either works or it doesn't for any particular developer's use case.
Is it perhaps because they want people to use HIP?
HIP is very thin and has little or no performance impact over coding directly in CUDA mode.
The HIPIFY tools automatically convert source from CUDA to HIP.
Isn't it just a translation tool set? Can it translate actual CUDA code at runtime?
Not at runtime; it translates the source, so HIPIFY can only be used before compile time.
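The output is mostly a mechanical rename. A hypothetical before/after (saxpy, d_buf etc. are made-up names):

    // CUDA original:
    cudaMalloc(&d_buf, n * sizeof(float));
    cudaMemcpy(d_buf, h_buf, n * sizeof(float), cudaMemcpyHostToDevice);
    saxpy<<<blocks, threads>>>(d_buf, n);

    // After hipify-perl / hipify-clang:
    hipMalloc(&d_buf, n * sizeof(float));
    hipMemcpy(d_buf, h_buf, n * sizeof(float), hipMemcpyHostToDevice);
    saxpy<<<blocks, threads>>>(d_buf, n);  // kernel launch syntax is unchanged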
These help developers, ZLUDA could end up helping users.
I am sure they got the heads up about NVIDIA's announcement and cut this contractor loose. As per the contract agreement, the project would become open source.
There was no heads up to give, the news was false, the restriction has been in place for 2 years.
But maybe they didn't notice until now.
Inner Taiwanese dealmaking in Zhubei.
The assumption here is that they chose to abandon it, what if they have something better they're building?
This confirms what everyone who ever touched AMD GPGPUs knows -- that the only thing holding back AMD from becoming a $2 trillion company is their absolutely atrocious software. I remember finding a bug in their OpenCL compiler [1], but crashing their OpenCL compiler via segfault was also a piece of cake (that one was never fixed; I gave up on reporting it).
AMD not developing a competitor to CUDA was the most short-sighted thing I have ever seen. I have no idea why their board hasn't been sacked and replaced with people who understand that you can make the best hardware out there, but if your SW to use it is -- to be very mild -- atrocious, nobody is gonna buy it or use it.
We, the customers, are left to buy the overpriced NVidia cards because AMD's board is too rich to give a damn about a trillion or so of value left on the table. Just... weird. Whoever owns AMD stock, I hope, is asking questions, because that board needs to go down the nearest drain.
[1] https://github.com/msoos/amdmiscompile -- they eventually fixed this
Can someone explain like I'm javascript, what's the deal with GPGPU?
My naive understanding is that a graphics card is just a funny computer on which you can upload opcodes and data and let it cook itself.
Why is CUDA such a big deal? Can't AMD just give direct access to its GPU as if it was an array of 4096 Arduino boards?
The kind of GPGPU code where the language is a thin architecture-agnostic layer over the opcodes is how most GPU code (e.g. shaders used in graphics applications) is implemented. This is also how OpenCL, Vulkan Compute, etc. do their thing. This approach requires a lot of boilerplate and babysitting, but works well for relatively short bits of code.
CUDA is much higher level. It's roughly on par with "C++ is C with classes" in terms of language capability. This makes it much easier to develop complex applications. The C compatibility means that you can reuse the exact same code between CPU and GPU in many cases. It eliminates a lot of boilerplate, since you don't need to manage your data in as much detail (e.g., while you still have to make sure your pointers are valid for the GPU, the code for uploading the function arguments is generated by the CUDA compiler).
The value add that makes CUDA especially strong is all the first party libraries which have been carefully optimized and have widespread and proven long term support.
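To make the boilerplate point concrete, a complete CUDA program can be this short (a minimal sketch using only the standard runtime API):

    #include <cstdio>

    // Kernel and host code live in one C++ file; nvcc generates all of
    // the argument-marshalling and kernel-registration machinery.
    __global__ void scale(float *x, float a, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) x[i] *= a;
    }

    int main() {
        const int n = 1024;
        float *x;
        cudaMallocManaged(&x, n * sizeof(float));  // unified memory: no manual copies
        for (int i = 0; i < n; i++) x[i] = 1.0f;
        scale<<<(n + 255) / 256, 256>>>(x, 2.0f, n);
        cudaDeviceSynchronize();
        printf("%f\n", x[0]);  // prints 2.000000
        cudaFree(x);
        return 0;
    }

The OpenCL equivalent needs platform/device/context/queue setup, runtime kernel compilation, and explicit clSetKernelArg calls before it does any work at all.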
Why is it so hard to develop a CUDA-like to AMD-opcode compiler?
If it wasn't taking the tech community so long I would imagine it would not be harder than porting GCC to a new architecture.
I think the issues are mostly relating to getting the optimizer right. After all, not much of a point to GPU acceleration if it isn't meaningfully faster than CPU. A lot of these compatibility layers have this issue, DirectML, ZLUDA, etc. GPUs tend to expect the compiler to bake in things like instruction reordering/out of order execution for optimal performance.
The other challenge is that there isn't an "AMD-opcode", each generation tends to change around the opcodes a bit, so you want to compile to an intermediate representation which the driver would ingest and compile down to what the GPU uses. NVIDIA uses PTX for this, it works very well.
AMD's ROCm doesn't use an IR, it compiles the code for each architecture, which means they have a very limited support window for consumer GPUs (to limit binary size). OpenCL and Vulkan supports SPIR-V as an IR, but IIRC, the OpenCL SPIR-V on AMD is very buggy, and Vulkan SPIR-V is very different.
This has other side effects, like BLAS library support range. Hard for an open source community to justify putting in tons of effort into optimizing an entire BLAS library for each generation, when it'll only be supported for ~4 years (so, by the time you're finishing up with driver stability and library optimization, you're probably already at least a quarter of the way through the support period).
Neat, that makes sense.
Still, giant missed opportunity for AMD not to focus on this when NVIDIA has an almost monopoly on massive GPGPUs.
Ah, I see. Thanks to everyone in this subthread for explanations!
CUDA is a big deal because it works on every NV card, and they have fine-tuned software that utilizes the hardware to the max.
AMD doesn't. Go watch https://www.youtube.com/watch?v=NPinFkavsrk or https://www.youtube.com/watch?v=AqPIOtUkxNo
Those streams are an attempt to avoid the buggy AMD implementation and go closer to the hardware. Everything crashes (even the kernel), their own demos lock up the card, and so on.
At first I thought "not gonna watch a 6+ hour stream", but decided to give it a go anyway. Those who are interested and have at least ~some assembly background may find all the funny things in the first 10 minutes of the first video, and that isn't even a highlight compilation. I understand the problem much deeper now :) thanks!
Yeah, at 5:50 in the video: oh no, the AMD driver just dereferenced a null pointer.
It has been a while since I watched these streams, but that is about the theme of all six hours (and a few other streams). It's just a complete minefield, where he is trying to walk a very narrow path to success.
Combine it with things like
this (they now actually have improved documentation; significant progress): https://github.com/ROCm/ROCm/issues/1714
or this (HIP doesn't support L2 cache coherency, so disable GPU L2 cache) https://docs.amd.com/projects/HIP/en/latest/user_guide/hip_p...
and it's just FUBAR.
Haha null deref is funny. Not surprised they are so bad still. They were terrible back then. Didn't improve over time.
CUDA is like the TypeScript compiler which takes your nice code and turns it into something the browser (NVIDIA GPU) can run.
AMD only has CoffeeScript, and it sucks compared to the TypeScript from NVIDIA.
This confirms what everyone who ever touched AMD GPGPUs knows -- that the only thing holding back AMD from becoming a $2 trillion company is their absolutely atrocious software.
Indeed. On the flip side, they are much more friendly to open source, in general, compared to NVidia, which is actively hostile (and has been for a while; see the Linus "F* you!" video).
Companies that develop hardware generally suck at software. There are exceptions, but they aren't numerous (and indeed have been rewarded in their stock price). I do not know anything about AMD's company culture in their software business units, but fixing that generally requires pretty large changes.
I have no idea why their board hasn't been sacked and replaced with people who understand that you can make the best hardware out there, but if your SW to use it is -- to be very mild -- atrocious, nobody is gonna buy it or use it.
You probably can't just replace the board (unless C-level mandates are the only thing dragging down the company). You need to replace many more management levels, including a sizable portion of middle management. Sometimes even ICs if software hiring hasn't been properly handled.
Indeed. On the flip side, they are much more friendly to open source, in general,
those are "community-friendly" segfaults I guess, and it's really only a demonstration of how user-hostile NVIDIA is, what with their working HDMI 2.1 support and compilers and runtimes that actually build and run properly... /s
the "open-source so good!" stuff only really only matters when the open-source stack at least gets you across the starting line. When you are having to debug AMD's openCL runtime or compiler and submit patches because it segfaults on the demo code, that is not "engaging the community as a force-multiplier", it's shipping a defective product and fobbing it off on the community to do your job for you.
It's incumbent on AMD to at least get the platform to the starting line, and their failure to do that has been a problem for over a decade at this point.
Also, honestly, even if you submit a patch how long until they break something else? If demo projects don’t even run… they aren’t exactly doing their technical diligence. People seem to love love love the idea of being an ongoing, unpaid employee working on AMD’s tech stack for them, and nvidia is just sitting there with a working product that people don’t like for ideological reasons…
To wit: the segfault/deadlock issues geohot ran into aren’t just a one-off, they’re a whole class of bug that AMD has been fighting for years in ROCm, and they keep coming back. Are you willing to keep fixing that bug every couple months for the rest of your projects life? After they drop support for your hardware in 6 months, are you willing to debug the next AMD platform for them too?
Companies that develop hardware generally suck at software. There are exceptions, but they aren't numerous (and indeed have been rewarded in their stock price)
True, and you can even have the highest market cap as a hardware manufacturer that sucks at software.
I don't understand why AMD doesn't cooperate with Intel to push SYCL as the standard GPGPU and heterogeneous programming method. Intel is good with software, SYCL is an open standard so both companies would benefit from the same code, and customers could run SYCL code on Threadrippers if they wanted (some of them are as fast as some GPUs now).
Is AMD trying to create their own proprietary lock in eco system? Why aren't they committing to cross platform open standards?
This confirms what everyone who ever touched AMD GPGPUs knows -- that the only thing holding back AMD from becoming a $2 trillion company is their absolutely atrocious software.
I actually rather enjoyed the AMD Software in particular, since it made it very easy to tweak graphics (limit framerates to 60 when I don't want the GPU maxing out and games/software don't support it by default), set up instant replays with a hotkey press (like Shadowplay, with a constant recording buffer of the last X minutes), and both power-limit the GPU (when my UPS wasn't very good) and overclock it automatically (since I still want to squeeze like a year out of my RX 580).
Except that any version of the software/drivers after around 2020 crashes VR titles after less than an hour. And that there is no software package for Linux and CoreCtrl isn't as good. And that sometimes the instant replay thing just doesn't work. And that I haven't been able to get ROCm working with any of the local LLMs even once across both Windows and Linux (DKMS sure loved to do a whole bunch of pointless compiling upon each apt upgrade).
I'm honestly considering either going for Intel Arc as my next GPU because I'm curious, or just going back to something from Nvidia, so it's probably a split between: A580, RX 6600, RTX 3050. Or maybe I can hold out until other parts drop in price, time will tell.
Intel eventually decided there was “no business case for running CUDA applications on Intel GPUs”,
Oh, boy.
One simple way to put things is that at a certain size and age, every company is an aspiring monopolist, not an aspiring competitor.
That would make more sense in the alternate history where Intel and AMD never try to make GPUs anymore.
Not necessarily. They're making GPUs, but they're avoiding any head-to-head competition with NVidia and instead trying for a smaller share of the market, but one where they have some unique advantage or other.
Intel's graphics wing is so bad they had to stop calling it Intel HD because of the taste it left in people's mouths.
Is there a programming language that compiles into any of the various kernel languages like Metal, CUDA, whatever AMD has, etc? If not, why not? We have C compilers that compile to various CPU architectures. Shouldn’t there be a compiler to GPU architectures? Perhaps it’s just that no one has created it yet?
Do you count OpenCL?
I think so, yes! Even more so because it works with CPU and other things too.
OpenMP 5 specified GPU support. A quick search suggests that some compilers at least partially support it by now.
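A minimal sketch of what that looks like (assuming a compiler built with offload support, e.g. recent clang or gcc with the right target plugins):

    #include <cstdio>

    int main() {
        const int n = 1 << 20;
        float *x = new float[n];
        for (int i = 0; i < n; i++) x[i] = 1.0f;

        // map() stages the array to the device; the loop body becomes
        // the GPU kernel. Without offload support it just runs on the CPU.
        #pragma omp target teams distribute parallel for map(tofrom: x[0:n])
        for (int i = 0; i < n; i++)
            x[i] *= 2.0f;

        printf("%f\n", x[0]);  // 2.000000
        delete[] x;
        return 0;
    }

The same source can target NVIDIA, AMD, or Intel GPUs depending on which offload backend the compiler was built with.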
If AMD couldn’t do it by now they either have no intention or process. The fact that they are selling investors on this notion that they can compete with Nvidia in AI space is borderline fraud.
The fact that they are selling investors on this notion that they can compete with Nvidia in AI space is borderline fraud.
Idk, running 7B language models on my 6650 XT with ROCm has been pretty slick. Doesn't seem like fraud to me.
Are you in this business or student/hobbyist? No one’s running anything on 6650XT. Gimme a break.
Hobbyist. Not meeting your needs doesn't make it fraud.
Does this work on an AMD 7950X?
No, the 7950X is a CPU. CUDA is an API for computing on GPUs.
Yeah, I figured, though it has an iGPU ... would be nice to just test out whether I could run CUDA code on it, even if slow.
To be fair AMD has a graphics card with the model name 7900XT so the names aren’t that far apart in Levenshtein distance.
Previous discussion 22 days ago: AMD funded a drop-in CUDA implementation built on ROCm: It's now open-source [0], 400 comments.
Noteworthy top comment in that thread:
This release is, however, a result of AMD stopping the funding, per "After two years of development and some deliberation, AMD decided that there is no business case for running CUDA applications on AMD GPUs. One of the terms of my contract with AMD was that if AMD did not find it fit for further development, I could release it. Which brings us to today." from https://github.com/vosen/ZLUDA?tab=readme-ov-file#faq
One of the terms of my contract with AMD was that if AMD did not find it fit for further development, I could release it.
I need to put this in all contracts for all jobs in the future.
Thanks! Macroexpanded:
AMD funded a drop-in CUDA implementation built on ROCm: It's now open-source - https://news.ycombinator.com/item?id=39344815 - Feb 2024 (410 comments)
Zluda: Run CUDA code on Intel GPUs, unmodified - https://news.ycombinator.com/item?id=36341211 - June 2023 (90 comments)
Zluda: CUDA on Intel GPUs - https://news.ycombinator.com/item?id=26262038 - Feb 2021 (77 comments)
Also recent and related:
Nvidia bans using translation layers for CUDA software to run on other chips - https://news.ycombinator.com/item?id=39592689 - March 2024 (155 comments)
This is almost identical to Oracle vs Google re: using JVM bytecode.
I don't really think so; what's in dispute isn't the bytecode translation, it's locking the higher-level library IP to hardware. This would be like Google saying "you can only run our Android applications on a Google-approved phone," which my understanding is, they do when it comes to their Play frameworks and things like Maps.
Google kind of does that with Pixel phones. There are features that are only available on their hardware, which you can't use otherwise even if you sideload the APK, etc. One is Hold for Me.
AMD evaluated ZLUDA for two years, but also decided not to go further with the project – at which point, Janik open-sourced the updated code.
Such a dick move from AMD.
Their legal team probably said the fees from the resulting Nvidia war would be out of budget.
Has anyone managed to get it working for an AMD iGPU (APU) yet? I've got the Vega architecture; still no luck running LLMs with either the ZLUDA or ROCm backend.
Someone got ZLUDA running llama.cpp a while back (search the ZLUDA/llama.cpp issues). If I recall, it ran about half the speed of the existing ROCm implementation.
I tried ROCm on my iGPU last year and you do get a bit of a benefit for prompt processing (5x faster) but inference is basically bottlenecked by the memory bandwidth whether you're on CPU or GPU. Here were my results: https://docs.google.com/spreadsheets/d/1kT4or6b0Fedd-W_jMwYp...
Note, only GART memory, not GTT is accessible in most inferencing options, so you will only basically be able to load 7B quantized models into "VRAM" (BIOS controlled, but usually max out at 6-8GB). I have some notes on this here: https://llm-tracker.info/howto/AMD-GPUs#amd-apu
If you plan on running a 7B-13B model locally, getting something like an RTX 3060 12GB (or if you're on Linux, the 7600 XT 16GB might be an option w/ HSA_OVERRIDE) is probably your cheapest realistic option and will give you about 300 GB/s of memory bandwidth and enough memory to run quantizes of 13/14B class models. If you're buying a card specifically for GenAI and not going to dedicate time to fighting driver/software issues, I'd recommend going w/ Nvidia options first (they typically give you more Tensor TFLOPS/$ as well).
He's dead, Jim
Also relevant is geohot's persistent struggles with (expensive) AMD GPUs: https://twitter.com/__tinygrad__/status/1764734675002810622
I heard an interesting rumor recently that the person responsible for CUDA at NVIDIA spent YEARS fighting for resources and trying to convince the company to take the project seriously.
Without CUDA, there's absolutely no way NVIDIA would be a nearly trillion-dollar company today.
Said it before and I'll say it again: the issue with AMD GPUs is not individual kernels, which are easy to translate, but the libraries. From the release notes saying 'Add minimal support of cuDNN, cuBLAS, cuSPARSE, cuFFT, NCCL, NVML' it looks like this project was heading that way, which is great. Whether it'll have momentum after AMD stops funding it... who knows.
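To illustrate: one call like the sketch below (names illustrative) fronts a whole family of per-architecture, hand-tuned GEMM kernels, and a reimplementation has to match their performance, not just the signature:

    #include <cublas_v2.h>

    // Column-major C = A * B with cuBLAS; lda/ldb/ldc assume tightly
    // packed matrices. Behind this one entry point, cuBLAS dispatches
    // to kernels tuned per GPU generation.
    void sgemm(cublasHandle_t h, int m, int n, int k,
               const float *A, const float *B, float *C) {
        const float alpha = 1.0f, beta = 0.0f;
        cublasSgemm(h, CUBLAS_OP_N, CUBLAS_OP_N, m, n, k,
                    &alpha, A, m, B, k, &beta, C, m);
    }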
Does anyone know whether there would be support for Apple Silicon GPUs/Metal?
Has anyone tried this to run OSS photogrammetry tools like Meshroom? They mention a few proprietary ones in the article but my needs are pretty small.
Someone should have an LLM start generating random valid CUDA programs.
Compile each one to get a binary.
Train a language model with the source and output binary.
Hey presto, clean room compiler.
Edit: Oh wait.. duh.. just train it on the equivalent target source.
Presumably you can do this for other targets as well.
Can anyone confirm if this actually works in practice for GenAI? In general, CUDA translation layers are usually broken for SOTA ML applications.
If I'm not using Nvidia hardware, and I don't use Nvidia drivers, and I haven't agreed to their EULA then why would I care?
Emulation is legally protected, both explicitly and through precedent. The replication of APIs for compatibility purposes has been argued up to the US Supreme Court and found not to be copyrightable. At least within some pretty broad scope.
IANAL, but I fail to see what legal basis Nvidia is relying on. For a single user or company who owns no Nvidia hardware this feels moot. For a company with existing Nvidia hardware I could see them having an argument, kinda. But wouldn't that be squarely in the anti-competitive behavior wheelhouse?
So in your entire life, you have never downloaded an Nvidia driver and clicked through the EULA? Once you agree, you've agreed.
If you don't own Nvidia hardware, why would you download an Nvidia driver?
Because Nvidia's libraries, whether acquired through them or otherwise, are (currently) required for this trick to work.
You can get the libraries without installer, so no installer, no EULA, no acceptance of it.
That is not how contract law works...
I don't think agreeing or not to an EULA has any value in the EU. At least in France, where consumer rights are codified, an EULA cannot limit those legal rights.
But does that matter? If someone tried software or a service and then terminated or quit it, does that end-user agreement still apply in perpetuity? Say I cancel a cable TV subscription or quit MySpace: am I still really bound by their EULA?
Even if they had, they agreed to a specific version, not all future versions. That's why the EULA comes up again and again if it changes.
Or am I totally wrong here?
Hah, I wonder if they're big enough now for the European Commission to fine them over this.
Europe, the fine continent :-)
Isn't Asia part of Europe?
No, however Asia and Europe would be parts of Eurasia, or of Afro-Eurasia.
My joke would be more accurate referring to the EU :-)
$50+ billion in expected annual operating income going forward ($13b in the latest quarter with rapid growth).
More EU budget money coming right up.
If the CUDA software you want to run on ZLUDA contains first-party Nvidia libraries, which it usually does, you have to care about how those dependencies are licensed.
Yeah, I commented below at someone who mentioned that. I wasn't aware that people were distributing Nvidia binary libraries with their apps.
The application developer has to agree to the terms of the CUDA toolkit. If ZLUDA or other mechanisms require the developer to opt in, that could cause a problem. Perhaps someone more familiar can let us know if that's how it works?
EULA or not, it would fail anti-competitive laws. In the US they also avoided the Copyright question by saying it was fair use.
Precisely why they are making that statement. The goal is to threaten people who attempt to avoid CUDA
What's the difference between this and Wine/Proton? I guess Microsoft EULA has similar conditions, if they were enforceable wouldn't Microsoft do the same and send a C&D to Wine devs?
This isn't NV saying "you can't make ZLUDA," it's NV saying "you can't run our libraries like cuDNN, cuBLAS, cuBN on non-NV hardware."
While this kind of restriction still isn't legally cut and dry, it does come with tons of precedent, from IBM banning the use of their mainframe OSes in emulation to Apple only licensing OSX for use on Apple hardware.
I think this is a more nuanced point than just running "CUDA software", which is what's been commonly discussed. Nvidia is licensing their shared libraries in such a way that you can only run them on their hardware.
It's notable, however, that both of the examples you give are for operating systems rather than a library that is part of a larger work. Do you know of an example of a single application or library being hardware-locked? The only instance I can think of off the top of my head is the old DOS versions of AutoCAD, which had a physical dongle. But even that was DRM and not just an EULA restriction.
Actually, that might be an interesting direction for them to go. Include some key in Nvidia hardware which the library validates. Then they'd get DMCA protection for their DRM.
Those didn't go away in the industry, though AutoCAD moved away from them. Resolume (professional VJ software) and Avid (professional video editing software) still have hardware dongles. Arguably so does Davinci, theirs are just much bigger ;). progeCAD (AutoCAD compatible CAD program) also has USB protection dongles available as a license option.
Hardware-dongle DRM was the default licensing model in the 90s for any kind of enterprise-type software.
Pretty much all audio and video production and editing software for many years, and even today. C compilers, for many years, as well. PhysX for awhile. Native Instruments stuff. Saleae logic analyzer software.
Another message in this thread reminded me, too, of the Google Play frameworks on Android, which are also a very good analogy - Google ship these libraries licensed for use only on approved phones.
Except the hackintosh community exists. Clearly there is no precedent to actually enforce anything and shut down these community tools.
You can't shut down the tools themselves, but you can shut down their use.
https://en.wikipedia.org/wiki/Psystar_Corporation
Besides the copyright violation, it is very important to note that the court also considered that circumventing the hardware checks was a violation of the DMCA and illegal in and of itself.
Apple doesn't do anything about the hackintosh "community" because they simply don't care about a bunch of random nerds in their basements running macOS, but the moment a corporation starts using it to replace their Macs, you can bet they're going to be sued to oblivion. Not that it would ever happen; Hackintoshes are going to prove a complete dead end once Apple drops support for x86.
We live in a post-DMCA world. This isn't the era that allowed Bleem to win against Sony; this is the era that saw the Switch emulator developers shit their pants and promise millions to Nintendo in a settlement because they were very unconfident in the possibility of winning at trial. NVIDIA, for better or worse, has a strong legal standing to clamp down on people who think it would be funny to run their libraries on non-NVIDIA hardware. Do it in your basement if you will, but don't try to push this in a data center.
It doesn't change your point (which is that it appears de facto legally established that IBM can do this), but IBM doesn't completely ban the use of their mainframe OSes in emulation. They are totally okay with people running them in their own emulators (zPDT and ZDT); the thing they won't authorise is people running them on the open source Hercules emulator, since their own emulators cost $$$$, and Hercules is free, and it appears they view the $0 of Hercules as a threat to the $$$$ of their mainframe ecosystem.
In the past they've even authorised third party commercial emulators, such as FLEX-ES. At some point they stopped allowing FLEX-ES for new customers, although I believe some customers who bought licenses when it was allowed are still licensed to use it. But, it isn't impossible they might authorise a third party commercial emulator again – make it expensive, non-open source, and make it only run on IBM hardware (such as POWER Systems), and there's a chance IBM might go along with it.
It's actually NVIDIA saying you can't reverse engineer anything you build using CUDA SDK in order to run it on another platform. If someone else built it with the SDK and you have never downloaded the SDK yourself then you would not be bound by this agreement. I don't think you could get the cublas etc libraries without agreeing to the EULA so it includes what you are saying, but also includes apps or libraries you build yourself using the SDK.
"You may not reverse engineer, decompile or disassemble any portion of the output generated using SDK elements fo the purpose of translating such output artifacts to target a non-NVIDIA platform."
It seems like it should be pretty easy for cuDNN, cuBLAS to authenticate the hardware.
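A naive version would be a few lines (purely hypothetical; not anything Nvidia actually ships):

    #include <cuda_runtime.h>
    #include <cstring>
    #include <cstdlib>

    // Refuse to run if device 0 doesn't look like NVIDIA hardware.
    // Trivial to spoof in a translation layer, which is why a real
    // scheme would want a cryptographic handshake with the silicon.
    static void requireNvidiaDevice() {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess ||
            strstr(prop.name, "NVIDIA") == nullptr)
            abort();
    }

Of course a translation layer controls what cudaGetDeviceProperties returns, so a check like this only raises the bar slightly.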
I don’t think they would have tried that in the 1970s, when there was an antitrust suit against them for disallowing running their software on plug compatible (https://en.wikipedia.org/wiki/Plug_compatible) mainframes.
That (I think) made IBM offer reasonable licensing terms for their software (https://en.wikipedia.org/wiki/Amdahl_Corporation#Company_ori...: “Amdahl owed some of its success to antitrust settlements between IBM and the U.S. Department of Justice, which ensured that Amdahl's customers could license IBM's mainframe software under reasonable terms.”)
That case eventually got dropped in 1982, so it didn’t lead to any jurisprudence as to if/when such restrictions are permitted.
(Aside: for a case that ran for over a decade and produced over 30 million pages (https://www.historyofinformation.com/detail.php?id=923), I find it strange this case doesn’t seem to have made it to Wikipedia yet, and how little there’s elsewhere. Nice example of how bad the public digital record is)
An EULA gives companies grounds to take someone to court. It's their choice whether to prosecute; it's a reserved right more than one they use 100% of the time.
In this case translating CUDA can allow AMD to chip away at NVidia's market share.
Not if the person being sued never “signed” the EULA.
Nvidia cares about the people reverse-engineering CUDA, who don't have anything to reverse-engineer unless they actually download the SDK. Of course, when you download the SDK it has an associated license.
It's also possible to do a clean-room reverse where one group tests the hardware and software and writes specifications and another group who has never seen the hardware or docs then does the implementation. This has been legally tested and protected going back to at least the early 1980s.
It is possible, but a) it is expensive as hell to get enough engineers who you can prove in court have had no exposure to the original software (most SWEs would get some exposure just naturally in college nowadays, let alone at any sort of job), and b) CUDA is constantly changing and updating, so you need to have this expensive clean-room process going constantly or else you will fall behind.
The most famous case of clean-room reverse engineering is the original BIOS chips back in the early 1980s, where the chips themselves couldn't change -- they were hardware! It's going to be orders of magnitude more difficult to do that for software packages that change regularly.
I disagree on CUDA changing constantly. The hardware is stable once sold, and new devices are usually backwards compatible by at least a version or two. The API is also locked down for already-deployed versions; can't go pulling the rug from paying customers. However, new versions of both the hardware and CUDA do introduce new things that'll need to be addressed. I don't think it's much of a moving target though.
Probably, but Nvidia's market cap suggests there's more than $2 trillion in reasons to front that expense.
Does that even matter? It's not like you need someone's permission to implement a system with a compatible interface to another. It violates the EULA but you don't need to accept the EULA unless you download the CUDA software, which I guess the authors of ZLUDA could avoid doing
The complication is that most CUDA apps you would want to run on ZLUDA contain first-party libraries provided by Nvidia (e.g. cuDNN) which may have restrictive license terms saying you're not allowed to run them on a third-party runtime. ZLUDA itself may be legally in the clear as a cleanroom reimplementation free of Nvidia code, but it's not so clear-cut for the users of ZLUDA.
Aha, so we'd need clean-room re-implementations of those libraries too, in principle.
Should be emphasized again that contrary to the article's claim, the clause in question has been in CUDA's EULA, even in the downloads (contrary to the updated statement in the article), since Jan 2022.
Means Nvidia has been anti-competitive even longer than we thought.
Wait, you thought they started after 2022?
Fsck you, Nvidia, even more. Just going the same evil ways as Nintendo, I see. Good thing I don't waste my money on your products.
I find myself more upset with AMD for completely dropping the ball on firmware and software. Almost intentional levels of incompetence.
My hardware - my rules.
NVIDIA doesn’t have the authority to do that. There’s no NVIDIA SDK involved here.