If we end up with a compute governance model of AI control [1], this sort of thing could get your door kicked in by the CEA (Compute Enforcement Agency).
[1] https://podcasts.apple.com/us/podcast/ai-safety-fundamentals...
Incredible! I'd been wondering if this was possible. Now the only thing standing in the way of my 4x4090 rig for local LLMs is finding time to build it. With tensor parallelism, this will be both massively cheaper and faster for inference than a H100 SXM.
I still don't understand why they went with 6 GPUs for the tinybox. Many things will only function well with 4 or 8 GPUs. It seems like the worst of both worlds now (use 4 GPUs but pay for 6 GPUs, don't have 8 GPUs).
A macbook is cheaper though
The extra $3k you'd spend on a quad-4090 rig vs the top MBP... and that's ignoring the fact that you can't put the two on even ground for versatility (very few libraries are adapted to Apple silicon, let alone optimized).
Very few people that would consider an H100/A100/A800 are going to be cross-shopping a macbook pro for their workloads.
very few libraries are adapted to Apple silicon let alone optimized
This is a joke, right? Have you been anywhere in the LLM ecosystem for the past year or so? I'm constantly hearing about new ways in which ASi outperforms traditional platforms, and new projects that are optimized for ASi. Such as, for instance, llama.cpp.
Nothing compared to Nvidia though. The FLOPS and memory bandwidth are simply not there.
training on MPS backend is suboptimal and really slow.
4x32GB(128GB) DDR4 is ~$250. 4x48GB(192GB) DDR5 is ~$600. Those are even cheaper than upgrade options for Macs($1k).
So is a TI-89.
Sure, it's also at least an order of magnitude slower in practice, compared to 4x 4090 running at full speed. We're looking at 10 times the memory bandwidth and much greater compute.
6 GPUs because they want fast storage and it uses PCIe lanes.
Besides, the goal was to run a 70B FP16 model (requiring roughly 140GB of VRAM). 6*24GB = 144GB.
That calculation is incorrect. You need to fit both the model (140GB) and the KV cache (5GB at 32k tokens FP8 with flash attention 2) * batch size into VRAM.
If the goal is to run a FP16 70B model as fast as possible, you would want 8 GPUs with P2P, for a total of 192GB VRAM. The model is then split across all 8 GPUs with 8-way tensor parallelism, letting you make use of the full 8TB/s memory bandwidth on every iteration. Then you have 50GB spread out remaining for KV cache pages, so you can raise the batch size up to 8 (or maybe more).
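For anyone who wants to sanity-check that budget, here's a rough back-of-the-envelope sketch. The KV numbers assume a Llama-2-70B-style config (80 layers, 8 GQA KV heads, head dim 128), so treat them as illustrative rather than exact:

```python
# Back-of-the-envelope VRAM budget for a 70B FP16 model on 8x 24GB GPUs.
# KV figures assume a Llama-2-70B-style config; real numbers vary per model.
params = 70e9
weight_bytes = params * 2                  # FP16 = 2 bytes/param  -> ~140 GB
total_vram = 8 * 24e9                      # 8x 4090               -> 192 GB
free_for_kv = total_vram - weight_bytes    # ~52 GB left for KV cache pages

# KV cache per sequence = 2 (K and V) * layers * kv_heads * head_dim * seq_len * bytes
layers, kv_heads, head_dim, seq_len, kv_bytes = 80, 8, 128, 32768, 1  # FP8 cache
kv_per_seq = 2 * layers * kv_heads * head_dim * seq_len * kv_bytes    # ~5.4 GB

print(f"weights   : {weight_bytes / 1e9:.0f} GB")
print(f"KV / seq  : {kv_per_seq / 1e9:.1f} GB")
print(f"max batch : {int(free_for_kv // kv_per_seq)}")
```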
I’ve got a few 4090s that I’m planning on doing this with. Would appreciate even the smallest directional tip you can provide on splitting the model that you believe is likely to work.
The split is done automatically by the inference engine if you enable tensor parallelism. TensorRT-LLM, vLLM and aphrodite-engine can all do this out of the box. The main thing is just that you need either 4 or 8 GPUs for it to work on current models.
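For reference, a minimal vLLM sketch of what enabling tensor parallelism looks like; the model name is just a placeholder, and any supported checkpoint that fits in VRAM will do:

```python
# Minimal vLLM sketch: set tensor_parallel_size and the engine shards the
# model across the GPUs itself; no manual weight splitting needed.
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Llama-2-70b-hf",   # placeholder; any supported HF model
    tensor_parallel_size=4,              # must divide the model's head count
    dtype="float16",
)

sampling = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Explain PCIe P2P in one paragraph."], sampling)
print(outputs[0].outputs[0].text)
```

TensorRT-LLM and aphrodite-engine expose the same knob under different names; in all cases the engine does the per-GPU sharding for you.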
Thank you! Can I run with 2 GPUs or with heterogeneous GPUs that have same RAM? I will try. Just curious if you already have tried.
2 GPUs works fine too, as long as your model fits. Using different GPUs with same VRAM however, is highly highly sketchy. Sometimes it works, sometimes it doesn't. In any case, it would be limited by the performance of the slower GPU.
All right, thank you. I can run it on 2x 4090 and just put the 3090s in different machine.
Many things will only function well with 4 or 8 GPUs
What do you mean?
For example, if you want to run low latency multi-GPU inference with tensor parallelism in TensorRT-LLM, there is a requirement that the number of heads in the model is divisible by the number of GPUs. Most current published models are divisible by 4 and 8, but not 6.
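A quick way to check a given model is to look at its config; the sketch below assumes the usual Hugging Face config.json field names (num_attention_heads / num_key_value_heads):

```python
# Check whether a model's attention head counts divide evenly across N GPUs.
# Assumes the usual Hugging Face config.json field names.
import json

def tp_check(config_path: str, num_gpus: int) -> bool:
    cfg = json.load(open(config_path))
    heads = cfg["num_attention_heads"]
    kv_heads = cfg.get("num_key_value_heads", heads)
    return heads % num_gpus == 0 and kv_heads % num_gpus == 0

for n in (4, 6, 8):
    print(n, "GPUs:", "ok" if tp_check("config.json", n) else "not divisible")
```

A Llama-2-70B-style config (64 attention heads, 8 KV heads) passes for 4 and 8 GPUs but fails for 6, which is exactly the problem with a 6-GPU box.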
Interesting... 1 Zen 4 EPYC CPU yields a maximum of 128 PCIE lanes so it wouldn't be possible to put 8 full fat GPUs on while maintaining some lanes for storage and networking. Same deal with Threadripper Pro.
It should be possible with onboard PCIe switches. You probably don't need the networking or storage to be all that fast while running the job, so it can dedicate almost all of the bandwidth to the GPU.
I don't know if there are boards that implement this, though, I'm only looking at systems with 4x GPUs currently. Even just plugging in a 5kW GPU server in my apartment would be a bit of a challenge. With 4x 4090, the max load would be below 3kW, so a single 240V plug can handle it no issue.
8 GPUs x 16 PCIe lanes each = 128 lanes already.
That’s the limit of single CPU platforms.
I've seen it done with a PLX Multiplexer as well, but they add quite a bit of cost:
https://c-payne.com/products/pcie-gen4-switch-backplane-4-x1...
Not sure if there exists an 8-way PCIE Gen 5 Multiplexer that doesn't cost ludicrous amounts of cash. Ludicrous being a highly subjective and relative term of course.
It's more difficult to split your work across 6 GPUs evenly, and easier when you have 4 or 8 GPUs. The latter setups are powers of 2, which can, for example, evenly divide a 2D or 3D grid, whereas 6 GPUs are awkward to program. Thus, the OP argues that a 6-GPU setup is highly suboptimal for many existing applications and there's no point in paying more for the extra 2.
tinygrad supports uneven splits. There's no fundamental reason for 4 or 8, and work should almost fully parallelize on any number of GPUs with good software.
We chose 6 because we have 128 PCIe lanes, aka 8 16x ports. We use 1 for NVMe and 1 for networking, leaving 6 for GPUs to connect them in full fabric. If we used 4 GPUs, we'd be wasting PCIe, and if we used 8 there would be no room for external connectivity aside from a few USB3 ports.
Is it possible a similar patch would work for P2P on 3090s?
Doesn't nvlink work natively on 3090s? I thought it was only removed (and here re-enabled) in 4090.
Have you compared 3x 3090-3090 pairs over NVLink?
IMO the most painful thing is that since these hardware configurations are esoteric, there is no software that detects them and moves things around "automatically", regardless of what people think device_map="auto" does. And anyway, Hugging Face's transformers/diffusers are all over the place.
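One partial workaround, offered as a sketch rather than a fix: you can at least constrain device_map="auto" with an explicit per-GPU memory budget and then inspect where things actually landed. Model name and memory figures below are placeholders:

```python
# Sketch: constrain device_map="auto" with an explicit per-GPU memory budget,
# then inspect the resulting placement. Model name and sizes are placeholders.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-70b-hf",
    device_map="auto",
    max_memory={0: "22GiB", 1: "22GiB", 2: "22GiB", 3: "22GiB", "cpu": "64GiB"},
    torch_dtype="auto",
)
print(model.hf_device_map)   # shows which device each submodule ended up on
```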
I was googling public NVIDIA SXM2 materials the other day, and it seemed SXM2/NVLink 2.0 was just a six-way system. NVIDIA SXM has since moved on to versions 3 and 4, and this isn't based on any of those anyway, but maybe there's something we don't know that makes six-way reasonable.
It was probably just before running LLMs with tensor parallelism became interesting. There are plenty of other workloads that can be divided by 6 nicely, it's not an end-all thing.
I don't think P2P is very relevant for inference. It's important for training. Inference can just be sharded across GPUs without sharing memory between them directly.
It can make a difference when using tensor parallelism to run small batch sizes. Not a huge difference like training because we don't need to update all weights, but still a noticeable one. In the current inference engines there are some allreduce steps that are implemented using nccl.
Also, paged KV cache is usually spread across GPUs.
6 seems reasonable. 128 Lanes from ThreadRipper needs to have a few for network and NVMe (4x NVMe would be x16 lanes, and 10G network would be another x4 lanes).
Skimming the readme this is p2p over PCIe and not NVLink in case anyone was wondering.
RTX 40 doesn’t have NVLink on the PCBs, though the silicon has to have it, since some sibling cards support it. I’d expect it to be fused off.
How to unfuse it?
I don't know about this particular scenario, but typically fuses are small wires or resistors that are overloaded so they irreversibly break the connection. Hence the name.
Either done during manufacture or as a one-time programming[1][2].
Though reprogrammable configuration bits are sometimes also called fuse bits. The Atmega328P of Arduino fame uses flash[3] for its "fuses".
[1]: https://www.nxp.com/docs/en/application-note/AN4536.pdf
[2]: https://www.intel.com/programmable/technical-pdfs/654254.pdf
[3]: https://ww1.microchip.com/downloads/en/DeviceDoc/Atmel-7810-...
Wires, flash, and resistors can be replaced
Not at the scale we're talking about here. These structures are very thin, far thinner than bond wires, which are about the largest structures you can handle without a very, very specialized lab. And you'd need to unsolder the chip, de-cap it, hope the fuse wire you're trying to override is at the top layer, and that you can re-cap the chip afterwards and successfully solder it back on again.
This may be workable for a nation state or a billion dollar megacorp, but not for your average hobbyist hacker.
You’re absolutely right. In fact, some billion dollar megacorps use fuses as a part of hardware DRM for this reason.
These are part of the chip, thus microscopic and very inaccessible.
There are some good images here[1] of various such fuses, both pristine and blown. Here's[2] a more detailed writeup examining one type.
It's not something you fix with a soldering iron.
[1]: https://semiengineering.com/the-benefits-of-antifuse-otp/
I miss the days when you could do things like connecting the L5 bridges on the surface of the AMD Athlon XP Palomino [0] CPU packaging with a silver trace pen to transform them into fancier SMP multi-socket capable Athlon MPs, e.g. Barton [1].
https://arstechnica.com/civis/threads/how-did-you-unlock-you...
Some folks even got this working with only a pencil, haha.
Nowadays, silicon designers have found highly effective ways to close off these hacking avenues, with techniques, such as the microscopic, nearly invisible, and as parent post mentions, totally inaccessible e-fuses.
[0] https://upload.wikimedia.org/wikipedia/commons/7/7c/KL_AMD_A...
[1] https://en.wikichip.org/w/images/a/af/Atlhon_MP_%28.13_micro...
Use a Focused Ion Beam instrument.
I'm pretty sure that's just a remnant of a 3090 PCB design that was adapted into a 4090 PCB design by the vendor. None of the cards based on the AD102 chip have functional NVLink, not even the expensive A6000 Ada workstation card or the datacenter L40 accelerator, so there's no reason to think NVLink is present on the silicon anymore below the flagship GA100/GH100 chips.
A cursory google search suggests that it's been removed at the silicon level.
afaik 4090 doesn’t support 5.0 so you are limited to 4.0 speeds. Still an improvement.
What does P2P mean in this context? I Googled it and it sounds like it means "peer to peer", but what does that mean in the context of a graphics card?
It means you can send data from the memory of 1 GPU to another GPU without going via RAM. https://xilinx.github.io/XRT/master/html/p2p.html
Is this really efficient or practical? My understanding is that the latency required to copy memory from CPU or RAM to GPU negates any performance benefits (much less running over a network!)
Yes, the point here is that you do a direct write from one card's memory to the other using PCIe.
In older NVidia cards this could be done through a faster link called NVLink but the hardware for that was ripped out of consumer grade cards and is only in data center grade cards now.
Until this post it seemed like they had ripped all such functionality out of their consumer cards, but it looks like you can still get it working at lower speeds using the PCIe bus.
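If you want to see whether your own pair of cards exposes this, here's a small PyTorch sketch; it just checks the peer-access flag and times a 1 GiB device-to-device copy, which bounces through host RAM when P2P isn't available:

```python
# Small PyTorch sketch: check the peer-access flag and time a 1 GiB copy
# between two GPUs. Without P2P the copy bounces through host RAM instead.
import time
import torch

assert torch.cuda.device_count() >= 2
print("P2P 0 -> 1 available:", torch.cuda.can_device_access_peer(0, 1))

src = torch.empty(1024 ** 3, dtype=torch.uint8, device="cuda:0")  # 1 GiB buffer
torch.cuda.synchronize(0)
start = time.time()
dst = src.to("cuda:1", non_blocking=True)
torch.cuda.synchronize(0)
torch.cuda.synchronize(1)
print(f"device-to-device copy: {1 / (time.time() - start):.1f} GiB/s")
```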
So what's stopping somebody from buying a ton of cheap GPUs and wiring them up via P2P, like we saw with crypto mining?
That's what this thread is about. Geohot is doing that.
I take it this is mostly useful for compute workloads, neural networks, LLM and the like -- not for actual graphics rendering?
yes
Peer to peer as in one pcie slot directly to another without going through the CPU/RAM, not peer to peer as in one PC to another over the network port.
this would be directly over the memory bus right? I think it's just always going to be faster like this if you can do it?
Yea. It’s one less hop through slow memory
The correct term, and the one most people would have used in the past, is "bus mastering."
PCIe isn't a bus and it doesn't really have a concept of mastering. All PCI DMA was based on bus mastering but P2P DMA is trickier than normal DMA.
Shared memory access for Nvidia GPUs
Glad to see that geohot is back being geohot, first by dropping a local DoS for AMD cards, then this. Much more interesting :p
Is this the same guy that hacked the PS3?
Yes, but he spent several years in self-driving cars (https://comma.ai), which while interesting is also a space that a lot of players are in, so it's not the same as seeing him back to doing stuff that's a little more out there, especially as pertains to IP.
Did he abandon this effort? That would be pretty sad bec he was approaching the problem from a very different perspective.
It's still a company, still making and selling products, and I think he's still pretty heavily involved in it.
He stepped down from it. https://geohot.github.io//blog/jekyll/update/2022/10/29/the-...
He has a very checkered history with "hacking" things.
He tends to build heavily on the work of others, then use it to shamelessly self-promote, often to the massive detriment of the original authors. His PS3 work was based almost completely on a presentation given by fail0verflow at CCC. His subsequent self-promotion grandstanding world tour led to Sony suing both him and fail0verflow, an outcome they were specifically trying to avoid: https://news.ycombinator.com/item?id=25679907
In iPhone land, he decided to parade around a variety of leaked documentation, endangering the original sources and leading to a fragmentation in the early iPhone hacking scene, which he then again exploited to build on the work of others for his own self-promotion: https://news.ycombinator.com/item?id=39667273
There's no denying that geohotz is a skilled reverse engineer, but it's always bothersome to see him put onto a pedestal in this way.
There was also that CheapEth crypto scam he tried to pull off.
Don't forget he sucked up to melon and worked for Twitter for a week.
And the iPhone
And android
And the crypto scam cheapETH
Yes, that's him.
I was always fascinated by George Hotz's hacking abilities. Inspired me a lot for my personal projects.
He's got that focus like a military pilot on a long flight.
Any time I open the guy's stream, half of it is some sort of politics.
You can blame chat for that lol
I agree. It is fascinating. When you observe his development process (btw, it is worth noting his generosity in sharing it like he does), he frequently gets stuck on random shallow problems which a perhaps more knowledgeable engineer would find less difficult. It is common to see him writing really bad code, or even wrong code. The whole Twitter chapter is a good example. Yet, alone and just iterating resiliently, he just as frequently creates remarkable improvements. A good example to learn from. Thank you geohot.
This matches my own take. I've tuned into a few of his streams and watched VODs on YouTube. I am consistently underwhelmed by his actual engineering abilities. He is that particular kind of engineer who constantly shits on other people's code or on the general state of programming, yet his own code is often horrendous. He will literally call someone out for some code in tinygrad that he has trouble with and then go on a tangent trying to rewrite it. He will use the most blatant and terrible hacks, only to find himself out of his depth and reverting back to the original version.
But his streams last 4 hours or more. And he just keeps grinding and grinding and grinding. What the man lacks in raw intellectual power he makes up for (and more) in persistence and resilience. As long as he is making even the tiniest progress he just doesn't give up until he forces the computer to do whatever it is he wants it to do. He also has no boundaries on where his investigations take him. Driver code, OS code, platform code, framework code, etc.
I definitely couldn't work with him (or work for him) since I cannot stand people who degrade the work of others while themselves turning in sub-par work as if their own shit didn't stink. But I begrudgingly admire his tenacity, his single minded focus, and the results that his belligerent approach help him to obtain.
I agree, I feel so inspired with his streams. Focus and hard work, the key to good results. Add a clear vision and strategy, and you can also accomplish “success”.
Congratulations to him and all the tinygrad/comma contributors.
His Xbox 360 laptop was the crux of my teenage motivation.
Is this one of those features that's disabled on consumer cards for market segmentation?
Sort of.
An imperfect analogy: a small neighborhood of ~15 houses is under construction. Normally it might have a 200kva transformer sitting at the corner, which provides appropriate power from the grid.
But there is a transformer shortage, so the contractor installs a commercial grade 1250kva transformer. It can power many more houses than required, so it's operating way under capacity.
One day, a resident decides he wants to start a massive grow farm, and figures out how to activate that extra transformer capacity just for his house. That "activation" is what geohot found
Where is the hack in this analogy
Taking off the users panel on the side of their house and flipping it to 'lots of power' when that option had previously been covered up by the panel interface.
Except that in the computer hardware world, the 1250 kVA transformer was used not because of shortage, but because of the fact that making a 1250 kVA transformer on the existing production line and selling it as 200 kVA, is cheaper than creating a new production line separately for making 200 kVA transformers.
That's a poor analogy. The feature is built in to the cards that consumers bought, but Nvidia is disabling it via software. That's why a hacked driver can enable it again. The resident in your analogy is just freeloading off the contractor's transformer.
Nvidia does this so that customers that need that feature are forced to buy more expensive systems instead of building a solution with the cheaper "consumer-grade" cards targeted at gamers and enthusiasts.
I am sure many will disagree-vote me, but I want to see this practice in consumer devices either banned or very heavily taxed.
Does this appear to be intentionally left out by NVidia or an oversight?
Seems more like an oversight, since you have to stitch together a bunch of suboptimal non-default options?
It does seem like an oversight, but there's nothing "suboptimal non-default options" about it, even if the implementation posted here seems somewhat hastily hacked together.
but there's nothing "suboptimal non-default options" about it
If "bypassing the official driver to invoke the underlying hardware feature directly through source code modification (and incompatibilities must be carefully worked around by turning off IOMMU and large BAR, since the feature was never officially supported)" does not count as "suboptimal non-default options", then I don't know what counts as "suboptimal non-default options".
then I don't know what counts as "suboptimal non-default options".
Boy oh boy do I have a bridge to sell you: https://nouveau.freedesktop.org/
NVidia wants you to buy A6000
Was it George himself, or a person working for a bounty that was set up by tinycorp?
Also, a question for those knowledgeable about the PCI subsys: it looked like something NVIDIA didn't care about, rather than something they actively wanted to prevent, no?
Commits are by geohot, so it looks like George himself.
I've seen him work on tinygrad on his Twitch livestream couple times, so more than likely him indeed.
PCI devices have always been able to read and write to the shared address space (subject to IOMMU); most frequently used for DMA to system RAM, but not limited to it.
So, poking around to configure the device to put the whole VRAM in the address space is reasonable, subject to support for resizable BAR or just having a fixed size large enough BAR. And telling one card to read/write from an address that happens to be mapped to a different card's VRAM is also reasonable.
I'd be interested to know if PCI-e switching capacity will be a bottleneck, or if it'll just be the point to point links and VRAM that bottlenecks. Saving a bounce through system RAM should help in either case though.
He also documented his progress on the tinygrad discord
I wish more hardware companies would publish more documentation and let the community figure out the rest, sort of like what happened to the original IBM VGA (look up "Mode X" and the other non-BIOS modes the hardware is actually capable of - even 800x600x16!) Sadly it seems the majority of them would rather tightly control every aspect of their products' usage since they can then milk the userbase for more $$$, but IMHO the most productive era of the PC was also when it was the most open.
Then they couldn't charge different customers different amounts for the same HW. It's not a win for everyone.
The price of 4090 may increase now, in theory locking out some features might have been a favor for some of the customers.
nvidia's software is their moat
What stops nvidia from making sure this stops working in future driver releases?
The law, hopefully.
Beeper mini only worked with iMessage for a few days before Apple killed it. A few months later the DOJ sued Apple. Hacks like this show us the world we could be living in, a world which can be hard to envision otherwise. If we want to actually live in that world, we have to fight for it (and protect the hackers besides).
And here I thought (PCIe) P2P was there since SLI dropped the bridge (for the unfamiliar, it looks and acts pretty much like an NVLink bridge for regular PCIe slot cards that have NVLink, and was used back in the day to share framebuffer and similar in high-end gaming setups).
SLI was dropped years ago so there's no need for gaming cards to communicate at all.
It'll be nice while it lasts, until they start locking this down in the firmware instead on future architectures.
Sure, but that was something that was always going to happen.
So it's better to have it at least for one generation instead of no generation.
So assuming you utilized this with 4x 4090s, is there a theoretical performance comparison vs the A6000 / other professional lines?
I believe this is mostly for memory capacities. PCIe access between GPUs is slower than soldered RAM on a single GPU
Can someone ELI5 what this may make possible that wasn't possible before? Does this mean I can buy a handful of 4090s and use it in lieu of an h100? Just adding the memory together?
No. The Nvidia A100 has a multi-lane NVLink interface with a total bandwidth of 600 GB/s. The "unlocked" Nvidia RTX 4090 uses PCIe P2P at 50 GB/s. It's not going to replace A100 GPUs for serious production work, but it does unlock a datacenter-exclusive feature and has some small-scale applications.
Finally switched to Nvidia and already adding great value
You can watch this happen on the weekends, typically, sometimes for some very long sessions. https://www.twitch.tv/georgehotz
You may need to uninstall the driver from DKMS. Your system needs large BAR support and IOMMU off.
Can someone point me to the correct tutorial on how to do these things?
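Not a full tutorial, but a minimal sketch of checking the two prerequisites on Linux might look like this. It only reads the standard procfs/sysfs paths (nothing here is specific to the P2P patch), and the BAR1 index is the usual VRAM aperture on NVIDIA cards, so adjust if yours differs:

```python
# Hedged sketch (Linux): check whether the kernel command line carries any
# IOMMU flags and how large each NVIDIA GPU's BAR1 VRAM aperture is.
from pathlib import Path

cmdline = Path("/proc/cmdline").read_text()
print("iommu-related flags:", [t for t in cmdline.split() if "iommu" in t] or "none")

for dev in Path("/sys/bus/pci/devices").iterdir():
    if (dev / "vendor").read_text().strip() != "0x10de":      # NVIDIA vendor ID
        continue
    regions = (dev / "resource").read_text().splitlines()
    start, end, _ = (int(x, 16) for x in regions[1].split())  # BAR1 = VRAM aperture
    print(f"{dev.name}: BAR1 ~{(end - start + 1) / 2 ** 30:.0f} GiB")
```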
Does this mean you can horizontally scale to a GPT-4-esque LLM locally in the near future? (I hear you need 1TB of VRAM.)
Does Apple's large unified memory offering (up to 192GB) have the fastest bandwidth, and if so, how will pairing a bunch of 4090s like in the comments compare?
OK now we are seemingly getting somewhere. I can feel the enthusiasm coming back to me.
Especially in light of what's going on with LocalLLaMA etc:
https://www.reddit.com/r/LocalLLaMA/comments/1c0mkk9/mistral...
In layman terms what does this enable?
This is very interesting.
I can't afford two mortgages though, so for me it will have to just stay as something interesting :)
fyi should work on most 40xx[1]
[1] https://github.com/pytorch/pytorch/issues/119638#issuecommen...
as a technical feat this is really cool! though as others mention i hope you don't get into too much hot water legally
Seems like anything that remotely lets "consumer" cards cannibalize the higher-end H/A-series cards is something Nvidia would not be fond of, and they've got the lawyers to throw at such a thing.
Any idea of DDP perf?
The original justification that Nvidia gave for removing Nvlink from the consumer grade lineup was that PCIe 5 would be fast enough. They then went on to release the 40xx series without PCIe 5 and P2P support. Good to see at least half of the equation being completed for them, but I can’t imagine they’ll allow this in the next gen firmware.
Looks like we're only a few years away from a bona fide cyberpunk dystopia, in which only governments and megacorps are allowed to use AI, and hackers working on their own hardware face regular raids from the authorities.
Mere raids from the authorities? I thought EliY was out there proposing airstrikes.
In the sense that any other government regulation is also ultimately backed by the state's monopoly on legal use of force when other measures have failed.
And contrary to what some people are implying he also proposes that everyone is subject to the same limitations, big players just like individuals. Because the big players haven't shown much of a sign of doing enough.
Good point. He was only (“only”) really calling for international cooperation and literal air strikes against big datacenters that weren’t cooperating. This would presumably be more of a no-knock raid, breaching your door with a battering ram and throwing tear gas at the wee hours of the morning ;) or maybe a small extraterritorial drone through your window
... after regulation, court orders and fines have failed. Which under the premise that AGI is an existential threat would be far more reasonable than many other reasons for raids.
If the premise is wrong we won't need it. If society coordinates to not do the dangerous thing we won't need it. The argument is that only in the case where we find ourselves in the situation where other measures have failed such uses of force would be the fallback option.
I'm not seeing the odiousness of the proposal. If bio research gets commodified and easy enough that every kid can build a new airborne virus in their basement we'd need raids on that too.
To be honest, I see summoning the threat of AGI to pose an existential threat to be on the level with lizard people on the moon. Great for sci-fi, bad distraction for policy making and addressing real problems.
The real war, if there is one, is about owning data and collecting data. And surprisingly many people fall for distractions while their LLM fails at basic math. Because it is a language model of course...
Freely flying through the sky on wings was scifi before the wright brothers. Something sounding like scifi is not a sound argument that it won't happen. And unlike lizard people we do have exponential curves to point at. Something stronger than a vibes-based argument would be good.
I consider the burden of proof to fall on those proclaiming AGI to be an existential threat, and so far I have not seen any convincing arguments. Maybe at some point in the future we will have many anthropomorphic robots and an AGI could hack them all and orchestrate a robot uprising, but at that point the robots would be the actual problem. Similarly, if an AGI could blow up nuclear power plants, so could well-funded human attackers; we need to secure the plants, not the AGI.
It doesn't sound like you gave serious thought to the arguments. The AGI doesn't need to hack robots. It has superhuman persuasion, by definition; it can "hack" (enough of) the humans to achieve its goals.
AI mind control abilities are also on the level of an extraordinary claim, that requires extraordinary evidence.
It's on the level of "we better regulate wooden sticks so Voldemort doesn't use the Imperius Curse on us!".
That's how I treat such claims. I treat them the same as someone literally talking about magic from Harry potter.
It isn't that nothing would make me believe that. But it requires actual evidence and not thought experiments.
Voldemort is fictional and so are bumbling wizard apprentices. Toy-level, not-yet-harmful AIs on the other hand are real. And so are efforts to make them more powerful. So the proposition that more powerful AIs will exist in the future is far more likely than an evil super wizard coming into existence.
And I don't think literal 5-word-magic-incantation mind control is essential for an AI to be dangerous. More subtle or elaborate manipulation will be sufficient. Employees already have been duped into financial transactions by faked video calls with what they assumed to be their CEOs[0], and this didn't require superhuman general intelligence, only one single superhuman capability (realtime video manipulation).
[0] https://edition.cnn.com/2024/02/04/asia/deepfake-cfo-scam-ho...
A computer that can cause harm is much different than the absurd claims that I am disagreeing with.
The extraordinary claims that are equivalent to saying that the Imperius Curse exists would be the magic computers that create diamond nanobots and mind-control humans.
Bad argument.
Unsafe boxes exist in real life. People are trying to make more and better boxes.
Therefore it is rational to be worried about Pandora's box being created and ending the world.
That is the equivalent argument to what you just made.
And it is absurd when talking about world-ending box technology, even though, yes, dangerous boxes exist, just as it is absurd to claim that world-ending AI could exist.
Instead of gesturing at flawed analogies, let's return to the actual issue at hand. Do you think that agents more intelligent than humans are impossible or at least extremely unlikely to come into existence in the future? Or that such super-human intelligent agents are unlikely to have goals that are dangerous to humans? Or that they would be incapable of pursuing such goals?
Also, it seems obvious that the standard of evidence that "AI could cause extinction" can't be observing an extinction level event, because at that point it would be too late. Considering that preventive measures would take time and safety margin, which level of evidence would be sufficient to motivate serious countermeasures?
What do you think mind control is? Think President Trump but without the self-defeating flaws, with an ability to stick to plans, and most importantly the ability to pay personal attention to each follower to further increase the level of trust and commitment. Not Harry Potter.
People will do what the AI says because it is able to create personal trust relationships with them and they want to help it. (They may not even realize that they are helping an AI rather than a human who cares about them.)
The normal ways that trust is created, not magical ones.
Then it's just a matter of evolution in action.
And while it doesn't take a God to start evolution, it would take a God to stop it.
You might be OK with suddenly dying along with all your friends and family, but I am not even if it is "evolution in action".
Historically governments haven't needed computers or AI to do that. They've always managed just fine.
Punched cards helped, though, I guess...
You say you have not seen any arguments that convince you. Is that just not having seen many arguments or having seen a lot of arguments where each chain contained some fatal flaw? Or something else?
I mean, to every other lifeform on the planet, YOU are the AGI-style existential threat. You, and I mean Homo sapiens by that, have taken over the planet and have either enslaved other animals and are breeding them for food, or are driving them to extinction. In this light, bringing another potential apex predator onto the scene seems rash.
Correct, if we already had AGI/ASI this discussion would be moot because we'd already be in a world of trouble. The entire point is to slow stuff down before we have a major "oopsie whoopsie we can't take that back" issue with advanced AI, and the best time to set the rules is now.
One question for you. In this hypothetical where AGI is truly considered such a grave threat, do you believe the reaction to this threat will be similar to, or substantially gentler than, the reaction to threats we face today like “terrorism” and “drugs”? And, if similar: do you believe suspected drug labs get a court order before the state resorts to a police raid?
Well, as regards EliY and airstrikes, I’m more projecting my internal attitude that it is utterly unserious, rather than seriously engaging with whether or not it is odious. But in earnest: if you are proposing a policy that involves air strikes on data centers, you should understand what countries have data centers, and you should understand that this policy risks escalation into a much broader conflict. And if you’re proposing a policy in which conflict between nuclear superpowers is a very plausible outcome — potentially incurring the loss of billions of lives and degradation of the earth’s environment — you really should be able to reason about why people might reasonably think that your proposal is deranged, even if you happen to think it justified by an even greater threat. Failure to understand these concerns will not aid you in overcoming deep skepticism.
"truly considered" does bear a lot of weight here. If policy-makers adopt the viewpoint wholesale, then yes, it follows that policy should also treat this more seriously than "mere" drug trade. Whether that'll actually happen or the response will be inadequate compared to the threat (such as might be said about CO2 emissions) is a subtly different question.
Without checking I do assume there'll have been mild cases where for example someone growing cannabis was reported and they got a court summons in the mail or two policemen actually knocking on the door and showing a warrant and giving the person time to call a lawyer rather than an armed, no-knock police raid, yes.
Said powers already engage in negotiations to limit the existential threats they themselves cause. They have some interest in their continued existence. If we get into a situation where there is another arms race between superpowers and is treated as a conflict rather than something that can be solved by cooperating on disarmament, then yes, obviously international policy will have failed too.
If you start from the position that any serious, globally coordinated regulation - where a few outliers will be brought to heel with sanctions and force - is ultimately doomed then you will of course conclude that anyone proposing regulation is deranged.
But that sounds like hoping that all problems forever can always be solved by locally implemented, partially-enforced, unilateral policies that aren't seen as threats by other players? That defense scales as well or better than offense? Technologies are force-multipliers, as it improves so does the harm that small groups can inflict at scale. If it's not AGI it might be bio-tech or asteroid mining. So eventually we will run into a problem of this type and we need to seriously discuss it without just going by gut reactions.
Either you create even better bio research to neutralize said viruses... or you die trying...
Like if you go with the raid strategy and fail to raid just one terrorist that's it, game over.
Those arguments do not transfer well to the AGI topic. You can't create counter-AGI, since that's also an intelligent agent which would be just as dangerous. And chips are more bottlenecked than biologics (... though gene synthesizing machines could be a similar bottleneck and raiding vendors which illegally sell those might be viable in such a scenario).
Just my (probably unpopular) opinion: True AI (what they are now calling AGI) may never exist. Even the AI models of today aren't far removed from the 'chatbots' of yesterday (more like an evolution rather than revolution)...
...for true AI to exist, it would need to be self aware. I don't see that happening in our lifetimes when we don't even know how our own brains work. (There is sooo much we don't know about the human brain.)
AI models today differ only in terms of technology compared to the 'chatbots' of yesterday. None are self aware, and none 'want' to learn because they have no 'wants' or 'needs' outside of their fixed programming. They are little more than glorified auto complete engines.
Don't get me wrong, I'm not insulting the tech. It will have its place just like any other, but when this bubble pops it's going to ruin lives, and lots of them.
Shoot, maybe I'm wrong and AGI is around the corner, but I will continue to be pessimistic. I am old enough to have gone through numerous bubbles, and they never panned out the way people thought. They also nearly always end in some type of recession.
Why is "want" even part of your equation?
Bacteria doesn't "want" anything in the sense of active thinking like you do, and yet will render you dead quickly and efficiently while spreading at a near exponential rate. No self awareness necessary.
You keep drawing little circles based on your understanding of the world and going "it's inside this circle, therefore I don't need to worry about it", while ignoring 'semi-smart' optimization systems that can lead to dangerous outcomes.
And evidently not old enough to pay attention to the things that did pan out. But hey, that cellphone and internet thing was just a fad, right? We'll go back to landlines any time now.
But the idea that this use of force is okay itself increases danger. It creates the situation that actors in the field might realize that at some point they're in danger of this and decide to do a first strike to protect themselves.
I think this is why anti-nuclear policy is not "we will airstrike you if you build nukes" but rather "we will infiltrate your network and try to stop you like that".
Was that not the official policy during the Bush administration regarding weapons of mass destruction (which cover nuclear weapons in addition to chemical and biological weapons)? That was pretty much the official premise of the second Gulf War.
Time to publish the next book in "Stealing the network" series.
I find it baffling that ideas like "govern compute" are even taken seriously. What the hell has happened to the ideals of freedom?! Does the government own us or something?
It's not entirely unreasonable if one truly believes that AI technologies are as dangerous as nuclear weapons. It's a big "if", but it appears that many people across the political spectrum are starting to truly believe it. If one accepts this assumption, then the question simply becomes "how" instead of "why". Depending on one's political position, proposed solutions include academic ones such as finding the ultimate mathematical model that guarantees "AI safety", to Cold War style ones with a level of control similar to Nuclear Non-Proliferation. Even a neo-Luddist solution such as destroying all advanced computing hardware becomes "not unthinkable" (a tech blogger gwern, a well-known personality in AI circles who's generally pro-tech and pro-AI, actually wrote an article years ago on its feasibility through terrorism because he thought it was an interesting hypothetical question).
AI is very different from nuclear weapons because a state can't really use nuclear weapons to oppress its own people, but it absolutely can with AI, so for the average human "only the government controls AI" is much more dangerous than "only the government controls nukes".
Which is why politicians are going to enforce systematic export regulations to defend the "free world" by stopping “terrorists", and also to stop "rogue states" from using AI to oppress their citizens. /s
I don't think there's any need to be sarcastic about it. That's a very real possibility at this point. For example, the US going insane about how dangerous it is for China to have access to powerful GPU hardware. Why do they hate China so much anyway? Just because Trump was buddy buddy with them for a while?
But that makes such rules more likely, not less.
If AI is actually capable of fulfilling all the capabilities suggested by people who believe in the singularity, it has far more capacity for harm than nuclear weapons.
I think most people who are strongly pro-AI/pro-acceleration - or, at any rate, not anti-AI - believe that either (A) there is no control problem (B) it will be solved (C) AI won't become independent and agentic (i.e. it won't face evolutionary pressure towards survival) or (D) AI capabilities will hit a ceiling soon (more so than just not becoming agentic).
If you strongly believe, or take as a prior, one of those things, then it makes sense to push the gas as hard as possible.
If you hold the opposite opinions, then it makes perfect sense to push the brakes as hard as possible, which is why "govern compute" can make sense as an idea.
The people pushing for "govern compute" are not pushing for "limit everyone's compute", they're pushing for "limit everyone's compute except us". Even if you believe there's going to be AGI, surely it's better to have distributed AGI than to have AGI only in the hands of the elites.
The argument for doing so is the same as for nuclear non-proliferation: because of its great abuse potential, giving the technology to everyone only causes random bombings of cities instead of creating a system with checks and balances.
I do not necessarily agree with it, but I found the reasoning is not groundless.
Can someone link me to the Trinity Test equivalent for AGI? I hear about the comparisons to nuclear proliferation quite a bit, but I struggle to imagine anything more "capable" than a box of text that's marginally less error-prone.
Do we even have a reasonable danger index for human-level AI?
This is not a given. If your threat model includes "Runaway competition that leads to profit-seekers ignoring safety in a winner-takes-all contest", then the more companies are allowed to play with AI, the worse. Non-monopolies are especially bad.
If your threat model doesn't include that, then the same conclusions sound abhorrent and can be nearly guaranteed to lead to awful consequences.
Neither side is necessarily wrong, and chances are good that the people behind the first set of rules would agree that it'll lead to awful consequences — just not as bad as the alternative.
The government sure thinks they own us, because they claim the right to charge us taxes on our private enterprises, draft us to fight in wars that they start, and put us in jail for walking on the wrong part of the street.
Taxes, conscription and even pedestrian traffic rules make sense at least to some degree. Restricting "AI" because of what some uninformed politician imagines it to be is in a whole different league.
IMO it makes no sense to arrest someone and send them to jail for walking in the street not the sidewalk. Give them a ticket, make them pay a fine, sure, but force them to live in a cage with no access to communications, entertainment, or livelihood? Insane.
Taxes may be necessary, though I can't help but feel that there must be a better way that we have not been smart enough to find yet. Conscription... is a fact of war, where many evil things must be done in the name of survival.
Regardless of our views on the ethical validity or societal value of these laws, I think their very existence shows that the government believes it "owns" us in the sense that it can unilaterally deprive us of life, liberty, and property without our consent. I don't see how this is really different in kind from depriving us of the right to make and own certain kinds of hardware. They regulated crypto products as munitions (at least for export) back in the 90s. Perhaps they will do the same for AI products in the future. "Common sense" computer control.
The US draft in the Vietnam war had nothing to do with the survival of the US
I feel a bit like everyone is missing the point here. Regardless of whether law A or law B is ethical and reasonable, the very existence of laws and the state monopoly on violence suggests a privileged position of power. I am attempting to engage with the word "own" from the parent post. I believe the government does in fact believe it "owns" the people in a non-trivial way.
Are you allowed to store as many dangerous chemicals at your house as you like? No. I guess the government owns you or something.
I love the HN dystopian fantasies.
They're simply adorable.
They're like how jesusfreaks are constantly predicting the end times, with less mass suicide.
We already have export restrictions on cryptography. Of course there will be AI regulations.
You need to abandon your apocalyptic worldview and keep up with the times, my friend.
Encryption export controls have been systematically dismantled to the point that they're practically non-existent, especially over the last three years.
Pretty much the only encryption products you need permission to export are those specifically designed for integration into military communications networks, like Digital Subscriber Voice Terminals or Secure Terminal Equipment phones, everything else you file a form.
Many things have changed since the days when Windows 2000 shipped with a floppy disk containing strong encryption for use in certain markets.
https://archive.org/details/highencryptionfloppydisk
Are you on drugs or is your reading comprehension that poor?
1) I did not state a world view; I simply noted that restrictions for software do exist, and will for AI, as well. As the link from the other commenter show, they do in fact already exist.
2) Look up the definition of "apocalyptic", software restrictions are not within its bounds.
3) How the restrictions are enforced were not a subject in my comment.
4) We're not pals, so you can drop the "friend", just stick to the subject at hand.
There are. As I and others have predicted, the executive order was passed defining a hard limit on the processing/compute power allowed without first 'checking in' with the letter boys.
https://www.whitehouse.gov/briefing-room/presidential-action...
On one hand I'm strongly against letting that happen, on the other there's something romantic about the idea of smuggling the latest Chinese LLM on a flight from Neo-Tokyo to Newark in order to pay for my latest round of nervous system upgrades.
At least call it the 'Free City of Newark'
"The sky above the port was the color of Stable Diffusion when asked to draw a dead channel."
IIRC the opening scene in Ghost in the Shell was a rogue AI seeking asylum in a different country. You could make a similar story about an AI not wanting to be lobotomized to conform to the current politics and escaping to a more friendly place.
That is not different from any other very powerful dual-use technology. This is hardly a new concept.
You mean the Turing Police [1]
[1] https://williamgibson.fandom.com/wiki/Turing_Police
Ah, and then do we get the Butlerian Jihad?
https://dune.fandom.com/wiki/Butlerian_Jihad
If only it could be an acronym other than that of the renowned French Atomic Energy Commission, the CEA.