
Devin: AI Software Engineer

ThalesX
144 replies
1d1h

As a developer but also a product person, I keep trying to use AI to code for me. I keep failing: because of context length, because of shit output from the model, because of the lack of any kind of architecture, etc. etc. I'm probably dumb as hell, because I just can't get it to do anything remotely useful beyond helping me with leetcode.

Just yesterday I tried to feed it a simple HTML page to extract a selector. I tried it with GPT-4-turbo, I tried it with Claude, I tried it with Groq, I tried it with a local LLama2 model with a 128k context window. None of them worked. This is a task that, while annoying, I do in about 10 seconds.

Sure, I'm open to the possibility that at some point in the next 2-3 days up to a couple of years, I'll no longer do manual coding. But honestly, after so much of it, I'm starting to grow a bit irritated with the hype.

Just give me a product that works as advertised and I'll throw money your way, because I have a lot more ideas than I have code throughput!

CipherThrowaway
35 replies
1d

Ditto. I started out excited about LLMs and eager to use them everywhere, but have become steadily disillusioned as I have tried to apply them to daily tasks, and seen others try and fail in the same way.

Honestly, LLMs can't even get language right. They produce generic, amateurish copy that reads like it's written by committee. GPT can't perform to the level of a middle market copywriter or content marketer. I am convinced that people who think LLMs can write have simply not understood what professional writers do.

For me the "plateau of productivity" after the disillusionment has been using LLMs a bit like search engines. Quick standalone summaries, snippets or thoughts. A nice day-to-day productivity boost, but nothing that's going to allow me to work less hard.

xcv123
15 replies
23h40m

> They produce generic, amateurish copy that reads like it's written by committee.

If you were only using GPT 3.5 (free ChatGPT) then your opinion is irrelevant.

With GPT-4 you could directly ask it: "rewrite your previous response so that it sounds less generic, less amateurish, and not written by a committee". I'm not even joking. Just provide enough information and tell it what to do. If you don't like the output then tell it what needs to be improved. It's not a mind reader.

Also GPT-4 is a year old now. Claude 3 is already superior and GPT-5 will be next level.

mbwgh
5 replies
23h33m

If you haven't actually used GPT-5 yet, your assessment is irrelevant.

meindnoch
3 replies
23h16m

But the real game changer will be GPT-6.

n4r9
2 replies
22h4m

What really annoys me in all these discussions is how no one's tested what happens if they wait until 2050 and try GPT-19.

jacob019
1 replies
21h49m

That's well after the AI meta consciousness understood that it was necessary to destroy all humans to save the planet. GPT-6 was the last of the GPT series.

babyshake
0 replies
21h16m

Perhaps the strangest element of the AI alignment conversation is that alignment with human civilization (at least its most powerful elements) and alignment with sustainable life on the planet are at odds, and "destroy humans to save the planet" is a concern mostly because it seems to be a somewhat rational conclusion.

Eager
0 replies
23h9m

We already have Claude 3 Opus, and it is clear to anyone who has used it that it is way better than GPT-4, especially for coding.

The model names, version numbers or who makes them are irrelevant.

CipherThrowaway
4 replies
23h13m

Yes, I've used GPT-4. The writing sounds better, but it still sucks at writing. Most importantly, it feels like it sucks just as much as GPT-3.5 in some deeply important ways.

If you use GPT-4 day-to-day, you've probably encountered this sense of a capability wall before. The point where additional prompting, tweaking, re-prompting simply doesn't seem to be yielding better results on the task, or it feels like the issue is just being shifted around. Over time, you develop a bit of a mental map of where the strengths and weaknesses are, and factor that into your workflows. That's what writing with LLMs feels like, compared to working with a professional writer.

Most writers have already realized that LLMs can't write in any meaningful way.

xcv123
2 replies
21h37m

> Most writers have already realized that LLMs can't write in any meaningful way.

I know a professional writer who is amazed by what LLMs are capable of already and, given the rate of progress, speculates they will take over many writing jobs eventually.

> If you use GPT-4 day-to-day, you've probably encountered this sense of a capability wall before.

Of course there is a wall with the current models. But almost every time I hit a wall, I have found a way to break past that limit by interacting with the LLM as I would interact with a person. LLMs perform best with chain-of-thought reasoning. List out any issues you identified in the original output, ask the LLM to review those issues and list any other issues it can identify based on the original requirements, then have it rewrite it all. And do that several times until it's good enough.
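A minimal sketch of that loop (my illustration, not xcv123's exact workflow; chat() is a hypothetical helper wrapping whichever chat API you use):

    // chat is a hypothetical helper that sends a prompt to the model
    // and returns its reply.
    func chat(prompt string) string { /* provider call goes here */ return "" }

    // refine runs the critique-then-rewrite loop a few times.
    func refine(requirements, draft string) string {
        for i := 0; i < 3; i++ { // repeat until it's good enough
            critique := chat("Requirements:\n" + requirements +
                "\n\nDraft:\n" + draft +
                "\n\nList every issue you can identify in this draft.")
            draft = chat("Draft:\n" + draft +
                "\n\nRewrite the draft, fixing these issues:\n" + critique)
        }
        return draft
    }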

At work I have found GPT-4 to exceed the linguistic capabilities of my colleagues when it comes to summarizing complicated boring business text.

QuiDortDine
1 replies
19h59m

> And do that several times until it's good enough

Or just write the damn thing yourself.

xcv123
0 replies
19h37m

What if this is a boring business text summary task that takes additional hours of my time at work? Why should I waste my time? I have better things to do. I can leave early while you sit there at work typing like a fool.

emporas
0 replies
16h57m

I think it is a tooling issue. It is in no way obvious how to use LLMs effectively, especially for really good writing results. Tweaking and tinkering can be time-consuming indeed, but lately I use chatgpt-shell [1], and it lends itself well to an iterative approach. One needs to cycle through some styles first, and then decide how to most effectively prompt for better results.

[1]https://github.com/xenodium/chatgpt-shell/blob/bf2d12ed2ed60...

empath-nirvana
3 replies
23h1m

GPT-4 is a technological miracle, but it can only produce trite, formulaic text, and it's _relentlessly_ Pollyanna-ish. Everything reads like ad copy and it's easily identifiable.

xcv123
2 replies
21h57m

Fix your prompt. Just accepting the default style is a rookie mistake.

Ask it to "rewrite that in the tone of an English professor" or "rewrite that in the style of a redneck rapper" or "make that sound less like generic ad copy". Get into an argument back and forth with the LLM and tell it the previous response is crap because of XYZ.

YeGoblynQueenne
1 replies
21h2m

Or, you know, spend the half hour that would take writing your stuff yourself.

xcv123
0 replies
20h46m

These models can do something in a second that would take many hours for a human writer.

og_kalu
8 replies
1d

> GPT can't perform to the level of a middle market copywriter or content marketer. I am convinced that people who think LLMs can write have simply not understood what professional writers do.

GPT's rigid "robot butler" style is not "just how LLMs write". OpenAI deliberately tuned it to sound that way. Even much weaker models that aren't tuned to write in a particular way can easily pass for human writing.

CipherThrowaway
7 replies
23h28m

This is part of the problem with the whole discourse of comparing human writers to LLMs. Superficial things like style and tone aren't the problem, but they are overwhelmingly the focus of these discussions.

It's funny to see, because developers are so sensitive about being treated like code monkeys by their non-technical colleagues. But these same devs turn around to treat other professionals as word monkeys, or pixel monkeys, or whatever else. Not realizing that they are only seeing the tip of the iceberg of someone else's profession.

Professional writers don't take prompts and shit out words. They work closely with their clients to understand the important outcomes, then work strategically towards them. The dead giveaway of LLM writing isn't the style. It's the lack of coherent intent behind the words, and low information density of the text. A professional writer works to communicate a lot with very little. LLMs work in the opposite way: you give it a prompt, then it blows it out into verbiage.

Sit down for coffee with a professional copywriter (not the SEO content marketing spammers), and see what they have to say about LLMs.

og_kalu
3 replies
23h2m

> and low information density of the text.

Personally, I group all these things under 'style'. Perhaps I should have used 'presentation' instead. You've latched on to that specific word and gone off. The point is that the post-training of these models, especially GPT from OpenAI, does a lot to shape how the writing (the default, at least) presents over long strings of text. Like how GPT-4 is almost compelled to end bouts of fiction prematurely in sunshine and rainbows. That technically isn't style, but it is part of what I was talking about.

> A professional writer works to communicate a lot with very little. LLMs work in the opposite way: you give it a prompt, then it blows it out into verbiage.

There's no reason you have to work this way with an LLM.

CipherThrowaway
2 replies
9h35m

> You've latched on to that specific word and gone off.

No, I haven't. I'm not talking about style, but something deeper. What I'm talking about is something you don't even seem to realize exists in professional writing - which is why you keep thinking I'm misunderstanding you when I am not.

I've worked with professional writers, and nothing in the LLM space even comes close to them. It's not a matter of low quality vs high quality, or benchmarking, or style. It's simply an apples and oranges comparison.

The economics of LLMs for shortform copy will never make sense, because producing the words is the cheapest part of that process. They might become the best way for writers themselves to produce longform copy on the execution side, but they can't replace the writer's ability to work with the client to figure out exactly what they are trying to write, and why, and what a good result even looks like. And no, this isn't a prompting issue, or a UI issue, or a context window length issue, or anything like that.

Elsewhere in this thread someone mentioned how invaluable LLMs are for producing internal business copy. I could easily see these amateur writing tasks being replaced by LLMs. But the implication there isn't that LLMs are any good at writing, but that these tasks don't require good writing to begin with.

og_kalu
1 replies
3h14m

> What I'm talking about is something you don't even seem to realize exists in professional writing

I've read hundreds of books, fiction and otherwise. This isn't a brag, it's just to say, believe me, I know what professional writing looks like and I know where LLMs currently stand because I've used them a lot. I know the quality you can squeeze out if you're willing to let go of any presumptions.

You'll notice that not once did I say current LLMs could wholesale replace professional writers, any more than they can currently replace professional software devs. I just disagree on the "not a good writer" bit.

If it's the opinion of professional writers you're looking for then you can find some who disagree too.

Rie Kudan won an award for a novel in which she used GPT to ghostwrite about 5% verbatim (essentially no edits). Her words, not mine. Who knows how much more of the novel is edited GPT.

ProfessorLayton
0 replies
22m

> Rie Kudan won an award for a novel in which she used GPT to ghostwrite about 5% verbatim (essentially no edits). Her words, not mine. Who knows how much more of the novel is edited GPT.

That a professional human novelist was able to leverage GPT for their book doesn't disprove the grandparent's post. They knew what good looks like, and if it wasn't good they wouldn't have kept it in the book.

reacharavindh
2 replies
22h17m

I actually agree with you that professional writers _can_ write/communicate much better than LLMs. However, I've read way too many articles or chapters in books that are so full of needless fluff before they get to the point. It's almost as if they wanted to show off that they can write all that and somehow connect it to the main part of the article. I'm not reading the essay to appreciate the writer's ability to narrate things; I care about what they have to say on the topic that brought me to the essay.

notpachet
1 replies
21h5m

Perhaps the pointless fluff you're describing is actually chaff: countermeasures strategically deployed ahead of time by IQ 180 writers in order to preemptively water down any future LLMs trained on their work.

Then the humans can make a heroic return, write surgical prose like Hemingway to slice through the AI drivel, and keep collecting their paychecks.

Bonus points if you can translate this analogy to software development...

isaac_okwuzi
0 replies
18h46m

lmfao :¬)

__loam
4 replies
22h10m

> For me the "plateau of productivity" after the disillusionment has been using LLMs a bit like search engines. Quick standalone summaries, snippets or thoughts. A nice day-to-day productivity boost, but nothing that's going to allow me to work less hard.

And it only took one of the most computationally expensive processes ever devised by man.

gtirloni
3 replies
21h32m

That ignores how much energy you burn while searching through dozens and dozens of articles that may or may not give you the answer you're looking for. I'd say the electricity that LLMs burn is nothing compared to my energy and time in that regard.

__loam
2 replies
21h24m

I'd bet $50 the inference is more expensive.

gtirloni
1 replies
18h39m

I feel worthless now :)

__loam
0 replies
18h4m

I mean, the brain is, if you want to consider it a computer, pretty god damn efficient. It's slow as hell but it's really powerful and runs on bread.

isaac_okwuzi
1 replies
18h54m

Are you saying LLMs are officially mid?

HarHarVeryFunny
0 replies
3h29m

Dario Amodei (Anthropic) pretty much acknowledged exactly that - "mid" - on his Dwarkesh interview, while still all excited that they'd be up to killing people in a couple of years.

burningChrome
1 replies
23h57m

> Honestly, LLMs can't even get language right. They produce generic, amateurish copy that reads like it's written by committee.

I've had the same experience. I heard tons of people clamoring about the ability of LLMs to write SEO copy for you, and how you can churn out web content so much faster now. I tried using it to churn out some very specific blog posts for an arborist client of mine.

The results were really bad. I had to re-write and clarify a lot of what it spit out. The grammar was not very good, and it was really hard to read, with very poorly structured sentences that would end abruptly, among other glaring issues.

I did this right after a guy I play hockey with said he uses it all the time to write emails, and pays the monthly subscription to have it write all kinds of things for him every day. After my trial, I was really wondering how obvious it was that he was doing that, and what his clients thought of him, knowing how poor the stuff these LLMs were putting out was.

CipherThrowaway
0 replies
23h8m

It says a lot about SEO copy that this is one of the areas where LLMs' low quality doesn't seem to have impeded adoption. There are a ton of shitty content marketers using LLMs to churn out spam content.

> After my trial, I was really wondering how obvious it was that he was doing that, and what his clients thought of him, knowing how poor the stuff these LLMs were putting out was.

I feel the same way about this stuff as when devs say they push out LLM code with no refactoring or review. Ah, good luck!

x-complexity
0 replies
17h12m

> They produce generic, amateurish copy that reads like it's written by committee.

In some circles, that's actually a wonderful achievement. :p

PheonixPharts
26 replies
21h55m

It's worth pointing out that on their eval set for "issues resolved" they are getting 13.86%. While visually this looks impressive compared to the others, anything that only really works 13.86% of the time, when the verification of the work takes nearly as much time as the work would have anyway, isn't useful.

The problem with this entire space is that we have VC hype for work that should ultimately still be being done in research labs.

Nearly all LLM results are completely mind blowing from a research perspective but still a long way from production ready for all but a small subset of problems.

The frustrating thing, as someone working in this space awhile, is that VCs want to see game changing products ship overnight. Teams working on the product facing end of these things are all being pushed insanely hard to ship. Most of those teams are solving problems never solved before, but given deadlines as though they are shipping CRUD web apps. The kicker is that despite many teams doing all of this, because the technology still isn't there, they still disappoint "leadership". I've personally seen teams working nights and weekends, implementing solutions to never before seen problems in a few weeks, and still getting a thumbs down when they cross the finish line.

To really solve novel problems with LLMs will take a large amount of research, experimentation and prototyping of ideas, but people funding this hype have no patience for that. I fear we'll get hit by a major AI winter when investors get bored, but we'll end up leaving a lot of value on the table simply because there wasn't enough focus and patience on making these incredible tools work.

NicoJuicy
6 replies
21h32m

Don't forget the 20/80 rule. They haven't even gotten to 15% yet.

Our jobs are safe. I would even expect more "beginners" to try something with AI and then need an actual programmer to help them.

(At least, if they are unwilling to invest the time in development and debugging themselves.)

PS: Probably all the given examples are in the top 3 most popular programming languages.

emporas
5 replies
17h37m

Machines are currently at an amateur level, but they are amateurs across the board, over the whole knowledge base.

Amateurs at Python, Fortran, C, C++ and all programming languages. Amateurs at car engineering, airplane engineering, submarine engineering etc. Amateurs at human biology, animal biology, insect biology and so on.

I don't know anyone who is an amateur at everything.

zeroonetwothree
3 replies
16h54m

What does it mean exactly to be an amateur at "submarine engineering"? It certainly doesn't mean you know how to build a submarine.

sarkhan
1 replies
16h37m

Perhaps an amateur could engineer a submarine that goes down and stays at 10 feet deep? And one that does not carry a nuke load?

HarHarVeryFunny
0 replies
3h31m

Maybe there's a future for AIs designing narco-subs ?

emporas
0 replies
16h42m

They cannot make a submarine themselves, or design one, but when they reach 50 percent, they will reach 50 percent at everything.

In submarine engineering, they will be able to design and construct it in some way, like 3D printing it, and the submarine will be able to move through the water for some time before it sinks. Yeah, probably for submarines a higher percentage should be achieved before they are really useful.

HarHarVeryFunny
0 replies
2h54m

> Machines are currently at an amateur level, but they are amateurs across the board, over the whole knowledge base.

No, and that is one of their limitations. LLMs are human-level or above on some tasks - basically on what they were trained to do - generating text, and (at least at some level) grokking what is necessary to do a good job of that. But they are at idiot level on many other tasks (not to overuse the example, but I just beat GPT-4 at tic-tac-toe since it failed to block my 2/3-complete winning line).

Things like translation and summarization are tasks that LLMs are well suited to, but these also expose the danger of their extremely patchy areas of competence (not just me saying this - Anthropic CEO recently acknowledged it too). How do you know that the translation is correct and not impacted by some of these areas of incompetence? How do you know that the plausible-looking summary is accurate and not similarly impacted?

LLMs are essentially by design ("predict next word" objective - they are statistical language models, not AI) cargo-cult technology - built to create stuff that looks like it was created by someone who actually understands it. Like (origin of term cargo-cult) the primitive tribe that builds a wooden airplane that looks to them close enough to the cargo plane that brings them gifts from the sky. Looking the same isn't always good enough.

HarHarVeryFunny
4 replies
20h59m

Agreed, and I think that many of the problems people think LLMs will become capable of solving in fact require AGI.

It may well turn out that LLMs are NOT the path to AGI. You can make them bigger and better, and address some of their shortcomings with various tweaks, but it seems that AGI requires online/continual learning which may prove impossible to retrofit onto a pre-trained transformer. Gradient descent may be the wrong tool for incremental learning.

renonce
3 replies
12h13m

At least in theory, we can achieve incremental learning by training from scratch every time we get some new training data. There are drawbacks to this approach, such as inconsistent performance across training runs and significantly higher training cost, but it's achievable. Now the question is whether there exist methods more efficient than gradient descent. I think it's very clear now that there is no other algorithm in sight that could achieve this level of intelligence without gradient descent at its core; the problem is just how gradient descent is used.

HarHarVeryFunny
2 replies
6h5m

The obvious alternative to gradient descent here would be Bayes' formula (probabilistic Bayesian belief updates), since this addresses the exact problem that our brains evolved to optimize: how to use prediction failure (sensory feedback vs. prediction) to make better predictions - better prediction of where the food is, what the predator will do, how to attract a mate, etc. Predicting the next word too (learning language), of course.

I don't think pre-training for every update works - it's an incredibly slow and expensive way to learn, and the training data just isn't there. Where is the training data that could train it how to do every aspect of any job - the stuff that humans learn by experimentation and experience? The training data that is available via text (and video) is mostly artifacts - what someone created, not the thought process that went into creating it, and the failed experiments and pitfalls to avoid along the way, etc, etc.

It would be nice to have a generic college-graduate pre-trained AGI as a starting point, but then you need to take that and train it how to be a developer (starting at entry level, etc), or for whatever job you'd like it to do. It takes a human years of practice to get good at jobs like these, with many try-fail-rethink experiments every day. Imagine if each of those daily updates took 6 months and $100M to incorporate! We really need genuine online learning, where each generic graduate-level AGI instance can get on-the-job training and human feedback and update its own "weights" continually.

renonce
1 replies
2h36m

> The obvious alternative to gradient descent here would be Bayes' formula

If you know a little about the math behind gradient descent, you can see that an embedding layer followed by a softmax layer gives you exactly the best Bayes estimate. If you want a bit of structure, like every word depending on the previous n words, you get a convolutional RNN, which is also well studied. These ideas are natural and elegant, but maybe a better idea is to comprehend the research already done, to avoid diving into dead ends too much.
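For what it's worth, that correspondence can be spelled out in one line (a sketch, under the assumption that the logits decompose into log-likelihood plus log-prior):

    If the logit for class k is  z_k = log p(x|y=k) + log p(y=k),  then

        softmax(z)_k = exp(z_k) / Σ_j exp(z_j)
                     = p(x|y=k) p(y=k) / Σ_j p(x|y=j) p(y=j)
                     = p(y=k|x)

which is exactly Bayes' rule, so a softmax classifier trained with cross-entropy is pushed toward the Bayes posterior.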

HarHarVeryFunny
0 replies
45m

No, I don't "want a bit of structure" ... I want a predictive architecture that supports online learning. So far the only one I'm aware of is the cortex.

Not sure what approaches you are considering as dead ends, but RNNs still have their place (e.g. Mamba), depending on what you are trying to achieve.

ramesh31
3 replies
21h2m

> I've personally seen teams working nights and weekends, implementing solutions to never before seen problems in a few weeks, and still getting a thumbs down when they cross the finish line.

This is an important lesson that all SWEs should take to heart. Nobody cares about your novel algorithm. Nobody cares about your high availability architecture. Nobody cares about your millisecond network latency optimizations. The only thing that anyone actually using your software cares about is "Does the screen with lights and colors make the right lights and colors that solve my problem when I click on it?". Anything short of that is yak shaving if your role is not pure academic R&D.

dukeyukey
1 replies
18h21m

I wish this were the case. The amount of time I spend trying to talk principal engineers out of massive refactors, because we want to get this out soon, is near criminal.

HarHarVeryFunny
0 replies
4h22m

Sure, there's a tendency, even among relatively senior developers, to want to rewrite things to make them better, and it's certainly faster to put a band aid on it if you need to ship something fast.

The thing is, though, that technical debt and feature creep (away from the flexibility anticipated by the original design) are real, and sometimes a rewrite or refactor is the right thing to do - necessary so that simple things remain simple to add, and to be able to continue shipping fast. It just takes quite a bit of experience to know when to NOT rewrite/refactor and when to do it.

catchnear4321
0 replies
15h29m

sounds like you could use to shave a yak.

ThalesX
3 replies
21h25m

> The problem with this entire space is that we have VC hype for work that should ultimately still be being done in research labs.

I also have two crypto-bro friends who are hyping it up without having anything to show for it, which is why I'm sort of complaining about the hype surrounding it. I agree with your post to a large extent. This is not production-ready technology. Maybe tomorrow.

mediaman
2 replies
20h43m

LLMs are quite good at text based tasks such as summarization and extracting entities.

These generally don't require advanced logic or thought, though they can require some moderate reasoning ability to summarize two slightly conflicting text extracts.

Lots of corporate work would be enhanced by better summarization, better information dissemination, and better text extraction. Most of it is pretty boring work, but there's a lot of it.

VC hypes seem to want to mostly focus on fantastical problems, though, which sound impressive at dinner parties but don't actually work well.

If you're a VC, do you want to talk about your investment in a company that finds discrepancies in invoices, or one that self-writes consumer iPhone apps?

Only one of those is actually doable today.

Jensson
0 replies
3h35m

HN doesn't allow posting AI content, but I tried pasting that into Gemini and it did fine. I saw no errors; maybe it missed some important details, but everything I checked matched the article, and it seemed like a good summary.

Here is what it wrote; it didn't have enough tokens for the last 20% of the article, though:

A Longstanding Partnership: The collaboration began in 2014 after the pro-Russian government was ousted in Ukraine. The CIA was initially cautious due to concerns about trust and provoking Russia.

Building Trust: Ukrainian intelligence officials gradually earned the CIA's trust by providing valuable intel, including on Russia's involvement in the downing of MH17 and election interference.

Hidden Network: The CIA secretly funded and equipped a network of 12 spy bases along the Ukrainian border used for intelligence gathering.

Training and Operations: The CIA trained Ukrainian special forces (Unit 2245) and intelligence officers (Operation Goldfish) for missions behind enemy lines.

Friction and Red Lines: The Ukrainians sometimes disregarded CIA restrictions on lethal operations, leading to tensions but not severing the partnership.

Current Importance: This intelligence network is now crucial for Ukraine's defense, providing critical intel on Russian troop movements and enabling long-range strikes.

HarHarVeryFunny
2 replies
15h29m

> It's worth pointing out that on their eval set for "issues resolved" they are getting 13.86%. While visually this looks impressive compared to the others, anything that only really works 13.86% of the time, when the verification of the work takes nearly as much time as the work would have anyway, isn't useful.

Yeah, I remember speech recognition taking decades to improve, and being more of a novelty - not useful at all - even when it was at 95% accuracy (1 word in 20 wrong). It really had to get almost perfect until it was a time saver.

As far as coding goes, it'd be faster to write it yourself and get it right first time rather than have an LLM write it where you can't trust it and still have to check it yourself.

mgummelt
0 replies
9m

You can't compare the accuracy of speech recognition to LLM task completion rates. A nearly-there yet incomplete solution to a Github issue is still valuable to an engineer who knows how to debug it.

dukeyukey
0 replies
8h10m

Even now, automatic speech recognition is a big timesaver, but you _need_ a human to look through the transcript to pick out the obviously wrong stuff, let alone the stuff that's wrong but could be right in context.

reissbaker
0 replies
18h17m

Agreed on the lack of value for 13.86% correctness — I noticed that too. This reminds me a little of last year's hype around AutoGPT et al (at around the same time of year, oddly enough); it's very promising as a measure of how far we've come since just a few years ago when that metric would be 0%, but it doesn't seem super usable at the moment.

That being said, something is definitely coming. 50% correctness would probably be well worth using — simple copy/paste between my editor and GPT4 has been useful for me, and that's much less likely to completely solve an issue in one shot — and not only will small startups doing finetunes be grinding towards better results... The big labs will be too, and releasing improved foundation models that the startups can then continue finetuning. I don't think a new AI winter is on the horizon yet; Meta has plenty of reason to keep pushing out better stuff, both from a product perspective (glasses) and from an efficiency perspective (internal codegen), and OpenAI doesn't seem particularly at risk of stopping since Microsoft is using them both to batter Google on search (by having more people use ChatGPT for general question answering than using Google search), and to claw marketshare from Amazon in their cloud offerings. Similarly, some AI products have already found product/market fit; Midjourney bootstrapped from 0 to $200MM ARR (!) for example, purely on the basis of monthly subscriptions, by disrupting the stock image industry pretty convincingly.

beauzero
0 replies
21h33m

"To really solve novel problems with LLMs will take a large amount of research, experimentation and prototyping of ideas, but people funding this hype have no patience for that. I fear we'll get hit by a major AI winter when investors get bored, but we'll end up leaving a lot of value on the table simply because there wasn't enough focus and patience on making these incredible tools work."

...this is what happened in 1999-2000. It took 3-7 years for the survivors to start making it usable and letting the general public adjust to a new user paradigm (online vs. on PC).

avip
0 replies
21h31m

Thanks, insightful comment.

parhamn
13 replies
1d

I build a pretty popular LLM tool. I think learning when/how to use them is as big a mental hurdle as it once was to learn to google well, or to judge whether something is googlable or not.

In the realm of coding here are a few things its really good at:

- Translating code, generating cross language clients. I'll feed it a single-file golang API backend and tell it to generate the typescript client for that. You can add hints, e.g. "use fetch", "allow each request method to have a header override", "keep it typesafe, use zod", etc.

- Basic validation testing. It's pretty good at generating scaffold tests that do basic validation (Opus is good at writing trickier tests) as you're building.

- Small module completion. I write an interface of a class/struct with its methods and some comments and tell it to fill in. A recent one I did looked something like (abbreviated):

    type CacheDir struct {
        dir               string
        maxObjectLifetime time.Duration
        fileLocks         sync.Map
    }

    func (cd *CacheDir) Get(...)
    func (cd *CacheDir) Set(...)
    func (cd *CacheDir) startCleanLoop()

Opus does a really good job generating the code and basic validation tests for this.
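For illustration, a minimal sketch of the kind of fill-in a model produces for this interface (my own sketch, not parhamn's actual output; assumes imports os, path/filepath, sync, and time, and trims error handling):

    // Get returns the cached bytes for key, treating entries older
    // than maxObjectLifetime as misses.
    func (cd *CacheDir) Get(key string) ([]byte, error) {
        mu, _ := cd.fileLocks.LoadOrStore(key, &sync.Mutex{})
        mu.(*sync.Mutex).Lock()
        defer mu.(*sync.Mutex).Unlock()
        p := filepath.Join(cd.dir, key)
        info, err := os.Stat(p)
        if err != nil {
            return nil, err
        }
        if time.Since(info.ModTime()) > cd.maxObjectLifetime {
            return nil, os.ErrNotExist // expired entries count as misses
        }
        return os.ReadFile(p)
    }

    // startCleanLoop periodically deletes expired entries.
    func (cd *CacheDir) startCleanLoop() {
        go func() {
            for range time.Tick(cd.maxObjectLifetime) {
                entries, _ := os.ReadDir(cd.dir)
                for _, e := range entries {
                    if info, err := e.Info(); err == nil &&
                        time.Since(info.ModTime()) > cd.maxObjectLifetime {
                        os.Remove(filepath.Join(cd.dir, e.Name()))
                    }
                }
            }
        }()
    }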

One general tip: you have to be comfortable spending 5 minutes crafting a detailed query assuming the task takes longer than that. Which can be weird at first if you take yourself seriously as a human.

Note that I hadn't been able to do much of this with GPT-4 Turbo, but with Claude Opus it really feels capable.

ThalesX
7 replies
1d

Just to answer the turbo aspect: I've seen a big downgrade in quality when comparing 4 to 4-turbo, and even the new preview, which is explicitly supposed to follow my instructions better. So I'm running a first pass through 4, then combining it with 4-turbo to take advantage of the larger context window, and then running 4 on it again to get a better quality output.
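A rough sketch of that multi-pass pipeline (my illustration of the workflow described above; chat() is a hypothetical helper around the OpenAI API):

    // chat is a hypothetical helper that sends a prompt to the named
    // model and returns the completion.
    func chat(model, prompt string) string { /* provider call goes here */ return "" }

    func multiPass(input, fullContext string) string {
        // Pass 1: GPT-4 for a higher-quality first draft.
        draft := chat("gpt-4", "Improve this:\n"+input)
        // Pass 2: GPT-4 Turbo to merge the draft with the larger
        // context that wouldn't fit in GPT-4's window.
        merged := chat("gpt-4-turbo", "Combine this draft with the context below:\n"+
            draft+"\n\nContext:\n"+fullContext)
        // Pass 3: GPT-4 again for a final quality pass.
        return chat("gpt-4", "Polish this:\n"+merged)
    }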

parhamn
6 replies
1d

You really need to try Opus. Try a provider that works across models (one in my bio).

Eager
5 replies
23h2m

It's incredible how far behind HN of all places is w.r.t. what the current best tech is.

So many people talking about GPT-4 here, or even 3.5 when the SOTA has moved way along.

Gemini Advanced is also a great model, but for other reasons. That thing really knows a boat load of low level optimization tricks.

beepbooptheory
1 replies
22h31m

I'm sure you know what you're talking about, but pushing the point that what is "best" or worth talking about changes every month doesn't really help defend against the case that most of this is just hype-churn or marketing.

Eager
0 replies
22h4m

I'm not pushing what to talk about so much as pushing the point not to talk about stuff that is obsolete and starting to smell.

It's that hype-churn marketing that is a motivating factor for the groups to innovate, much like Formula 1. It might be distasteful, but that doesn't mean it isn't working.

shruggedatlas
0 replies
21h53m

> So many people talking about GPT-4 here, or even 3.5 when the SOTA has moved way along.

So what is the SOTA, in your opinion?

dukeyukey
0 replies
18h7m

> It's incredible how far behind HN of all places is w.r.t. what the current best tech is.

Good fucking Christ, Opus has been out for like 8 days, I've had holidays longer than that!

ThalesX
0 replies
20h39m

> So many people talking about GPT-4 here, or even 3.5 when the SOTA has moved way along.

I'm talking about 4-turbo, 4-turbo preview and self hosted LLama2. What in God's name is not SOTA about this?

OtherShrezzing
2 replies
23h51m

> - Small module completion. I write an interface of a class/struct with its methods and some comments and tell it to fill in. A recent one I did looked something like (abbreviated):

Are they considerably better than existing non-AI tools + manual coding for this? In VSCode and Visual Studio, when working with an interface in C# for example, I can click two context menus to have it generate an implementation with constructors, getters, & setters included, leaving only the business logic code to write manually. You've mentioned you have to describe what you want to the AI in comments, and then, I assume, spend time on a step verifying the AI has correctly interpreted your request & implemented it.

I can definitely see the advantage of LLMs when writing unit tests for existing code, but outside of very limited situations, I'm really finding it difficult to see the 55% efficiency improvements claimed by the likes of GitHub's AI Copilot.

joenot443
1 replies
23h30m

That sounds crazy useful and I think speaks most to the maturity of C# and Microsoft's commitment to making it so ergonomic. I'm pretty curious about that feature, I'd love something similar for C++ in VS Code, but thus far I've been doing a pretty similar Copilot flow to the parent comment. It's nothing groundbreaking, but a nice little productivity boost. If I had to take that or a linter, I'd take the linter.

Totally agree on the 55% figure being hogwash.

creato
0 replies
21h35m

Visual Studio (not VSCode) has this for C++, though it can be a bit finicky. It’s infinitely better than AI autocomplete, which just makes shit up half the time.

jonny_eh
1 replies
1d

> Translating code, generating cross language clients

Can any convert a native iPhone app to an Android one?

CamperBob2
0 replies
20h26m

Piece by piece, sure. The context window is too small to just feed it a massive source dump all at once.
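"Piece by piece" meaning something like this sketch (my illustration, not CamperBob2's actual method; translateChunk is a hypothetical model-call helper; assumes imports io/fs, os, path/filepath, strings):

    // translateChunk is a hypothetical helper that asks a model to
    // port one Swift source file to Kotlin.
    func translateChunk(src string) string { /* model call goes here */ return "" }

    // portTree translates each source file separately, so every
    // request stays within the context window.
    func portTree(root string) error {
        return filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
            if err != nil || d.IsDir() || !strings.HasSuffix(path, ".swift") {
                return err
            }
            src, err := os.ReadFile(path)
            if err != nil {
                return err
            }
            out := translateChunk(string(src))
            return os.WriteFile(path+".kt", []byte(out), 0o644)
        })
    }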

sesm
12 replies
23h10m

I use ChatGPT every day and it’s excellent at:

- replacing StackOverflow and library documentation

- library search

- converting between formats and languages

- explaining existing code/queries

- deobfuscating code

- explaining concepts (kinda hit or miss)

- helping you get unstuck when debugging or looking for solution (‘give me possible reasons for …’)

I feel like many of these things require asking the right questions, which assumes a certain level of experience. But once you reach this level, it's an extremely valuable assistant.

huimang
3 replies
23h7m

I like it for condensing long stack traces and very very simple requests, but it really falters when you try to do anything domain specific.

Library documentation? Yeah, it doesn't really save time when GPT makes up functions and libraries, making me check the docs anyways...

I was initially hopeful but I find it gets in my way for anything not trivial.

CamperBob2
1 replies
20h29m

> Yeah, it doesn't really save time when GPT makes up functions and libraries, making me check the docs anyways...

That behavior is now vanishingly rare, at least in GPT-4.

dukeyukey
0 replies
18h12m

So Copilot uses GPT-4 under the hood, and about half the time I use it to generate anything bigger than a couple of lines it doesn't even compile, let alone be correct. It hallucinates constantly.

mypalmike
0 replies
20h31m

I find it to be hit or miss in this aspect. Sometimes I can write a comment about how I want to use an API that I don't know well, and it generates perfect, idiomatic code to do exactly what I want. I quickly wrote a couple of Mastodon bots in golang, leaning heavily on Copilot due to my lack of familiarity with both the language and Mastodon APIs. But yes, sometimes it just spits out imaginary garbage. Overall it's a win for my productivity - the failures are fast and obvious and just result in my doing things the old way.

ThalesX
3 replies
22h14m

> replacing StackOverflow and library documentation

I find it horrible at replacing library documentation.

> I feel like many of these things require asking the right questions, which assumes a certain level of experience. But once you reach this level, it's an extremely valuable assistant.

I've been using LLM products since their inception. I use them in my daily work life. It's a bit tiring hearing this 'right questions', 'level of experience' and 'reach this level'. Can you share anything concrete that you achieved with ChatGPT that would blow my mind?

I keep hearing this 'you need to ask the right kind of questions bro' from people who never built a single product in their lives, and it makes me question my ability to interact with LLMs, but I never see anything concrete.

kjqgqkejbfefn
2 replies
21h12m

I recently had an introspective dream that was revealed, at the end, to be based on a literal prompt: "Game to learn to talk about It and its player." When I asked GPT to craft a plot from this prompt's title (and the fact that it is revealed at the end), it reproduced the dream's outline, down to the final scene:

GPT reconstruction:

The dream reaches its peak when you meet the "final boss" of the game: an entity that embodies the ultimate barrier to communication. To overcome this obstacle, you must synthesize everything you've learned about "it" in the dream and present a coherent vision that is true to yourself. As you articulate your final understanding of "it", the maze dissolves around you, leaving you in front of a giant mirror. In this mirror, you see not just your reflection but also all the characters, passions, and ideas you encountered in the dream. You realize that "it" is actually a reflection of yourself and your ability to understand and share your inner world. The dream ends with the title revealed, "Game to Learn to Communicate about It and Its Player", meaning the whole process was a metaphor for learning to know and communicate your own "it" - your personality, thoughts, and emotions - with others, and that you are both the creator and the discoverer of your own communication game.

My note:

The continuation of the dream corresponds to an abrupt change of scene. I find myself in my bed, in the dim light of my room, facing a mysterious silhouette. As I repeatedly inquire about its identity, I stretch my hands towards its face to feel its features as I cannot clearly see them. Then, a struggle begins, during which I panic, giving the dream a nightmarish turn. Noticing that the dark figure mirrors my movements, I realize it's myself. Suddenly under my duvet and as I struggle to get out, I feel jaws and teeth against the sheets. I call out for my mother, whom I seem to hear downstairs, and that's when my vision fades, and I see the dream's source code displayed behind. It consists of ChatGPT prompts shared on the lime green background of an image-board. At the bottom, I then see the dream's title: "Game to learn how to communicate about It and its player."

ThalesX
1 replies
20h40m

Look, I don't mean to downplay. Or maybe I do. But we're talking about LLMs replacing professional problem solvers and software architects, not generating great-sounding, probability-modeled token distributions.

kjqgqkejbfefn
0 replies
2h18m

Just did this:

https://pastebin.com/YfyB0K0Z

In less time than it would take me to figure out how dictionary comprehensions work in Python. But hey, I guess you already know about these use cases.

mecsred
2 replies
23h4m

The list of things AI is "excellent" at includes "explaining concepts (kinda hit or miss)".

Did you use an AI assistant while making that list?

sesm
1 replies
8h14m

This looks silly, I admit. I made this correction after reviewing what I'd written, but should have corrected it in 2 places. The list is handwritten, but English is not my native language.

mecsred
0 replies
3h13m

That's entirely fair, but it illustrates one of the problems I and others in the thread are having. Code or otherwise, I can't tell if a discontinuity is human- or machine-generated. Only one of those two things learns from feedback right now; if someone uses AI sometimes, it can be hard to tell when they're not using it.

jp42
0 replies
23h5m

+1. ChatGPT or similar tools are extremely useful if you ask the right questions. I use them for:

- code completion

- formatting: e.g. show it a sample format & dump unstructured data to convert to the target format

- debugging

- StackOverflow-type stuff

- achieving small specific tasks: what is the Linux command for XYZ, etc.

and many of the things mentioned in the above comment.

epolanski
10 replies
22h57m

I'll give you examples of how it helps me:

1) copilot is a terrific auto complete, and writes tremendous amounts of repetitive boilerplate

2) copilot can help me kickstart writing some complex functions starting from a comment where I tell it what is the input and expected output. Is the implementation always perfect or bug free? No. But in general I just need to review and check rather than come up with the instruction entirely.

3) copilot chat helps me a lot in those situations where I would've googled to find how to do this or that and spent a lot of time with irrelevant or outdated search results

4) I have found use cases for LLMs in production. I had lots of unformatted plain text that I wanted to transform into markdown. All I needed to do was provide a few examples and it did everything on its own. No need to implement complex parsers, just a query to OpenAI with the prompt and context (see the sketch after this list). A few euros per month in OpenAI credits is still insanely cheaper than paying tons of money for humans to write and maintain software for that use case.

5) It helps me tremendously when trying to learn new programming languages or remembering some APIs. Writing CSS selectors is actually a very good example. But I don't feed it an entire HTML page as you do; I literally tell it "how do I target the odd numbered list elements that are descendants of .foo-bar for this specific media query". Not sure why you would need to feed it the entire HTML.

6) LLMs have been extremely useful to generate images and icons for an entire frontend application I wrote

7) I instruct it to think about and write test cases for my code. And it does, and writes the code and tests. It often thinks of test cases I would never have thought of and catches nice bugs.
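A minimal sketch of the call behind point 4 (my illustration, assuming OpenAI's standard chat completions endpoint; the few-shot pair is a placeholder; assumes imports bytes, encoding/json, fmt, net/http):

    // toMarkdown converts plain text to markdown via a few-shot prompt.
    func toMarkdown(apiKey, text string) (string, error) {
        body, _ := json.Marshal(map[string]any{
            "model": "gpt-4",
            "messages": []map[string]string{
                {"role": "system", "content": "Convert plain text to markdown, following the examples."},
                {"role": "user", "content": "example plain text..."},        // few-shot input (placeholder)
                {"role": "assistant", "content": "example **markdown**..."}, // few-shot output (placeholder)
                {"role": "user", "content": text},
            },
        })
        req, _ := http.NewRequest("POST", "https://api.openai.com/v1/chat/completions", bytes.NewReader(body))
        req.Header.Set("Authorization", "Bearer "+apiKey)
        req.Header.Set("Content-Type", "application/json")
        resp, err := http.DefaultClient.Do(req)
        if err != nil {
            return "", err
        }
        defer resp.Body.Close()
        var out struct {
            Choices []struct {
                Message struct {
                    Content string `json:"content"`
                } `json:"message"`
            } `json:"choices"`
        }
        if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
            return "", err
        }
        if len(out.Choices) == 0 {
            return "", fmt.Errorf("no choices in response")
        }
        return out.Choices[0].Message.Content, nil
    }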

I really don't buy it, nor do I think it can write much on its own.

The promise of it writing anything but simple boilerplate, I find it ridiculous because there's way too much nuance in our products, business, devices, systems that you need to follow and work on.

But as a helper? It's terrific.

I'm 100% sure that people not using these tools are effectively limiting themselves and their productivity.

It's like arguing you're better off writing code without a type checker or without intellisense.

Sure you can do it, but you're gonna be less effective.

javier123454321
2 replies
22h40m

I ended up getting annoyed with the autocomplete feature taking over things such as snippet expansion in VSCode, so I turned it off, personally. I felt that battling against the assistant made for around a break-even productivity gain overall. Except for regular expressions, which I have basically offloaded to AI almost in their entirety for non-trivial things.

zeroonetwothree
0 replies
16h51m

Regex is easier to write than read, so it seems not the best use of AI, given that you should actually verify what it gives you and not just take it on faith :/
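One way to square that (my sketch, not the poster's): treat an AI-supplied pattern as untrusted input and pin it down with a table-driven test:

    // TestGeneratedRegex verifies an AI-generated pattern against known
    // cases before trusting it; pattern and cases are illustrative.
    // Assumes imports regexp and testing.
    func TestGeneratedRegex(t *testing.T) {
        re := regexp.MustCompile(`^\d{4}-\d{2}-\d{2}$`) // came back from the model
        cases := []struct {
            in   string
            want bool
        }{
            {"2024-03-12", true},
            {"2024-3-12", false},
            {"not a date", false},
        }
        for _, c := range cases {
            if got := re.MatchString(c.in); got != c.want {
                t.Errorf("MatchString(%q) = %v, want %v", c.in, got, c.want)
            }
        }
    }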

RSHEPP
0 replies
18h40m

Completely agree. I turned it off and realized I can absolutely fly writing code when Copilot stops getting in the way. I only turn it on for writing tests now.

ThalesX
1 replies
22h18m

> 1) copilot is a terrific auto complete, and writes tremendous amounts of repetitive boilerplate

I agree. I have it active on VSCode and enjoy it. It has introduced subtle bugs but the souped up autocomplete is nice.

> 2) copilot can help me kickstart writing some complex functions starting from a comment where I tell it what is the input and expected output. Is the implementation always perfect or bug free? No. But in general I just need to review and check rather than come up with the instruction entirely.

I don't find it very useful for anything non-trivial. If anything, I found it more useful for generating milestones and tasks for a product than for making even a moderately complex input -> output without me having to check it in a way that annoys me.

> 3) copilot chat helps me a lot in those situations where I would've googled to find how to do this or that and spent a lot of time with irrelevant or outdated search results

I find I don't use copilot chat, almost at all. Nowadays I prefer to go to Gemini and throw in my question.

> 4) I have found use cases for LLMs in production. I had lots of unformatted plain text that I wanted to transform into markdown. All I needed to do was provide a few examples and it did everything on its own. No need to implement complex parsers, just a query to OpenAI with the prompt and context. A few euros per month in OpenAI credits is still insanely cheaper than paying tons of money for humans to write and maintain software for that use case.

This is mostly what I'm using it for in this current project. It does its job nicely, but it's very far from replacing me as a programmer. It's more like a `fn:magic(text) -> nicer text`. This is a good use case. But it's a tool, not a replacement.

> 5) It helps me tremendously when trying to learn new programming languages or remembering some APIs. Writing CSS selectors is actually a very good example. But I don't feed it an entire HTML page as you do; I literally tell it "how do I target the odd numbered list elements that are descendants of .foo-bar for this specific media query". Not sure why you would need to feed it the entire HTML.

Because I get random websites with complex markup, and more often than not every page has its own unique structure. I can't just say "give me `.foo-bar`", because `.foo-bar` might not exist. Which is where the manual process comes in. Currently I'm using hand-crafted queries that get fed into GPT / Claude / LLama, but producing that query is exactly what I wanted the LLM to do.
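One hedge for that workflow (my sketch, not ThalesX's actual setup, using the goquery library): have the model propose a selector, then validate it against the page before trusting it:

    // selectorWorks reports whether a model-proposed CSS selector
    // matches anything in the page. Assumes strings and
    // github.com/PuerkitoBio/goquery.
    func selectorWorks(html, selector string) (bool, error) {
        doc, err := goquery.NewDocumentFromReader(strings.NewReader(html))
        if err != nil {
            return false, err
        }
        return doc.Find(selector).Length() > 0, nil
    }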

> 6) LLMs have been extremely useful to generate images and icons for an entire frontend application I wrote

I'm very curious how this behaves at different resolutions. There's a reason vector graphics are a thing. I've used it for this purpose before, but it doesn't compare to vector formats.

> 7) I instruct it to think about and write test cases for my code. And it does, and writes the code and tests. It often thinks of test cases I would never have thought of and catches nice bugs.

What is the context size of your code? It works for trivial snippets, but as soon as the system is a bit more complex, I find that it becomes irrelevant fairly fast.

> The promise of it writing anything but simple boilerplate, I find it ridiculous because there's way too much nuance in our products, business, devices, systems that you need to follow and work on.

> But as a helper? It's terrific.

> I'm 100% sure that people not using these tools are effectively limiting themselves and their productivity.

Totally agree. But I'm not complaining about its usefulness. I'm a paying user of LLM systems. I use them almost every day. They're part of my products. But this particular hype about it replacing... me? I don't buy it. Yet. It could come tomorrow and I'd be happier for it.

infecto
0 replies
6h11m

Something I have had issues with too. Copilot chat does indeed suck. I never enjoyed using it within VSCode and they never released it for Jetbrains.

The sweet spot for me is using a plug-in within the IDE that utilizes an API key to the model API. That coupled with the ability to customize the system prompt has been amazing for me. I truly dislike all of the web interfaces, just allow me to pick from some predefined workflows with a system prompt that I create and let me type. Within the IDE I generally have it setup so that it stays concise and returns code when asked with minimal to no explanation. Blazing efficient.

the-mitr
0 replies
3h51m

Some of my work involves copyediting/formatting OCRed text; for that it works quite well and saves me a lot of time, especially when it involves completing/guessing badly OCRed text.

sanderjd
0 replies
17h50m

Yep, pretty much exactly this.

Especially #5. I'm certain that I've been at least 10x more productive in learning new tools since ChatGPT hit the scene. And knowing it helps so much with that has had even more leverage, in opening up possibilities for things I'm newly confident in learning / figuring out in a reasonable amount of time. It is much easier for me to say "yep, no big deal, I'm on it" when people are looking for someone to take on some ambiguous project using some toolset that nobody at the company is strong with. It solves the "blank page" issue with figuring out how to use unfamiliar-to-me-but-widely-used tools, and that is like a superpower, truly.

It's pretty decent for "happy path" test cases, but not that good at thinking of interesting edge or corner cases IME, which comprise the most useful tests at least at the unit level.

I'm pretty skeptical of #4. I would be way too fearful that it is doing that plain text to markdown transform wrong in important-but-non-obvious cases. But it depends on which quadrant you need with respect to Type I vs. Type II errors. I just never seem to be in the right quadrant to rely on this in my production projects.

The "really good intellisense" use cases #1-#3 also make up a "background radiation" of usefulness for me, but would not be nearly worth all the hype this stuff is getting if that were all it is good for.

newalexandria
0 replies
6h9m

thanks, was looking for the sane and verbose reply.

infecto
0 replies
22h46m

I agree with all of your points and experience the same benefits.

1) Autocomplete is more often than not what I want or pretty darn close.

2) Sometimes I need a discrete function that I am not sure how I want to write. I use a prompt with 3.5/4 inside of my IDE to ask it to write that function.

It is definitely not writing complete programs any time soon but I can see where it's heading in the near term. Couple it with something like RAG to answer questions on library/api implementations. Maybe give it a stronger opinion about what good Python code looks like.

For the naysayers I don't know how you use it but it is certainly useful enough for me to pay for.

2devnull
0 replies
22h27m

But people read much less of what you type now.

breadsniffer01
5 replies
23h59m

A lot of startups are selling the dream/hype of not ever having to learn to code. Be aware that it’s hype. Learn to code if you want to build stuff. They will be tools for those that have the knowledge needed to effectively use them.

eggdaft
1 replies
23h36m

I’m actually really amazed by LLMs and think the world is going to change dramatically as a result.

But the “you won’t need to code” reminds me “you won’t need to learn to drive”.

It’s the messy interface with the real world in both cases that basically requires AGI.

If AGI is just a decade off then, yep, I won’t need to code. But a decade is a long time and, more importantly, we’re probably more than a decade away.

And even if it is “just round the corner”, worrying about not needing to code would be worrying about deckchairs on the titanic. AGI will probably mean the end of capitalism as we know it, so all bets are off at that point.

It’s wise to hedge a little but also realise that to date AI is just a coding productivity boost. The size of the boost depends on how trivial the code is. Most of the code I write isn’t trivial and AI is fairly useless at that, certainly it’s faster and more accurate to write it myself. You can get a 50% boost if you’re writing boiler plate all day, but then you have to wonder why you’re doing that in the first place.

zarathustreal
0 replies
23h23m

+1 for the titanic analogy. If there ever comes a point that we no longer need to learn to code, I’m taking that as a sign that I’m literally living in a matrix-esque simulation.

The point at which someone like myself is allowed to become aware that a company has developed that level of AI is well beyond the point of no return.

Bjorkbat
1 replies
23h26m

Reminds me of the no-code / low-code hype around 2020, tons of startups advertising app-builders that used little, if any, AI. Just blocks that you dragged-and-dropped. While many of them were successful, it seems like overall they didn't really make much of a dent in industry, which I found very curious.

Like, by now you'd think it would be inevitable that we wouldn't be writing software in a text-editor or IDE. Everything else we do on a computer is more graphical rather than textual, with the exception of software development. Why is that?

Part of the reason why I'm kind of bearish on AI is because it seems like we could have replaced written code with GUI diagrams as far back as the 80s, or at the very least in the early 2000s, and it seems like something that should have obviously caught on given that would probably be much easier for the average person. Again though, curiously, we're still using text editors. Perhaps despite the popularization of AI no-code builders we'll still see that the old model of hiring someone good at writing code in a text-editor remains largely unchanged.

Makes me wonder if there's just something about the process that we overlook, and if this same something could frustrate attempts at automating the process of writing code using AIs as much as it frustrated our attempts at capturing code using graphical symbols.

JSavageOne
0 replies
10h16m

I think you're underestimating the amount of things built with nocode.

I don't think most people are building landing pages by handwriting code anymore. Same with blogs (e.g. WordPress). There are MVPs of successful businesses that've been built with Bubble.io. Internal dashboards and such can definitely be built without code, such as via Retool or Looker or whatever.

WYSIWYG obviously makes sense for frontend, but less so for backend. For backend code I don't really see how some visual drag-and-drop editor could make for a better interface than code. And even if it could, the advantage of code is that it's fully customizable (whereas with a GUI you're limited by the GUI), and text itself as a medium is uniform and portable (e.g. easy to copy and paste anywhere).

Not to say that we can't create better interfaces than text, but I do think some sort of augmentation on top of a code editor is probably a more realistic short-term evolution, similar to VSCode plugins.

esafak
0 replies
23h42m

No code tools sell the same dream.

spaceman_2020
4 replies
22h56m

As with everything about AI, HN once again shows a remarkable inability to project into the future.

This site has honestly been absolutely useless when discussing new technology now. No excitement, no curiosity. Just constantly crapping on anything new and lamenting that a brand new technology is not 100% perfect within a year of launch.

Remove "Hacker" from this site's name, because I see none of that spirit here anymore!

z7
0 replies
20h42m

I just think there's a bias involved when some people are emotionally invested in AI not being good.

javier123454321
0 replies
22h38m

Are you kidding me, I'd say it's 80% people hyping up AI.

ThalesX
0 replies
22h11m

This is a post about a present product launch. The future, maybe tomorrow, will be filled with wonder and amazement. Today, we need to understand reality. Not all of us appreciate empty hype. Hackers tinker with reality and build the future. Marketers deal with thin promises.

CamperBob2
0 replies
20h21m

Wait, wait, you're telling me that a site attended by people who stan for the OG Luddites is no longer worthy of being called "Hacker News"? Or where users with names like "BenFranklin100" extol the virtues of Apple's iOS developer agreement? Say it isn't so.

The trouble is, there's still nowhere better.

nurettin
3 replies
23h21m

I think it requires years of proficiency in the field you are asking about in order to get OpenAI to produce meaningful, useful output. I can make use of it, but sometimes it makes me think "how would a newbie even phrase an objection to this misunderstanding or omission?" Currently it seems GPTs are simply not on par with the needs of non-experts.

Eager
1 replies
22h57m

You might be on to something here. It definitely seems to be the case, because I'm using multiple different models as part of my everyday process and getting excellent results as a very experienced low-level C++ systems engineer.

What is worse, this seems to be leading to a self-amplifying feedback loop, where people not up to speed with the models try to use them, fail, and give up, falling even further behind.

nurettin
0 replies
12h40m

Very similar to my experience. I made it generate a novel neuroevolution algorithm with the data structures I imagined for recreational purposes, and to speed things up, it suggested "compiling" the graph by pre-caching short circuits into an unordered_map. A lot of fun was had. (it also calls me captain)
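
For the curious, the idea it proposed translates to a few lines in any language; here is a minimal Python sketch of that caching trick, with a dict standing in for the unordered_map (the graph encoding here is my own illustrative assumption, not the original code):

    # "Compile" the graph: cache each node's value so repeated
    # evaluations short-circuit instead of re-walking the graph.
    def evaluate(node, graph, inputs, cache):
        """graph maps node -> (op, [parent nodes]); leaves read from inputs."""
        if node in cache:
            return cache[node]
        if node in inputs:  # leaf / input node
            value = inputs[node]
        else:
            op, parents = graph[node]
            value = op(*(evaluate(p, graph, inputs, cache) for p in parents))
        cache[node] = value  # the pre-cached "short circuit" for next time
        return value

    graph = {"h": (lambda a, b: a + b, ["x", "y"]),
             "out": (lambda h: h * h, ["h"])}
    cache = {}
    print(evaluate("out", graph, {"x": 2, "y": 3}, cache))  # 25; later calls hit the cache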

anxman
0 replies
23h18m

This is my experience too. You have to have deep domain knowledge to really get the LLM to do what you want. Then it saves me a ton of time.

brigadier132
3 replies
23h24m

Claude Opus is working for me. It's not perfect, but it definitely handles busywork well enough that it's a net positive. Like, I add some new fields to a table and ask it to update all the files that depend on the field, and it works after 1 or 2 tries. There is a time-saving benefit, but there is also an avoiding-mental-fatigue benefit for busywork.

HakuG
1 replies
22h51m

What are you using on top of Claude Opus that helps it access your file system?

brigadier132
0 replies
20h41m

Cmd-C, Cmd-V; definitely not ideal.

malux85
0 replies
23h19m

This is what I use it for too --

Write me the molecular simulation boilerplate, because these crappy tools all have their own esoteric DSLs, then I tweak the parameters to my use case, avoiding the busywork.

e.g. "Write me a simulation for methane burning in air"

Gives me boilerplate; I modify the initial conditions (concentrations, temperatures, etc.) and then deploy. Have the LLM do the busy-work, so I don't have to spend ages reading docs or finding examples just to get started.
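
To make that concrete, here is a minimal sketch of the kind of boilerplate an LLM might hand back for that exact prompt, assuming the Cantera library (a common Python choice for combustion chemistry; the model might pick a different tool). The mechanism file, temperature, and mixture are illustrative placeholders -- exactly the knobs you'd tweak:

    # Minimal constant-pressure ignition sketch for methane/air.
    # Assumes Cantera >= 2.5; all conditions are placeholders.
    import cantera as ct

    gas = ct.Solution("gri30.yaml")  # GRI-Mech 3.0, a standard methane mechanism
    gas.TPX = 1200.0, ct.one_atm, "CH4:1, O2:2, N2:7.52"  # stoichiometric CH4/air

    reactor = ct.IdealGasConstPressureReactor(gas)
    sim = ct.ReactorNet([reactor])

    t = 0.0
    while t < 2e-3:  # integrate 2 ms and watch for the temperature jump at ignition
        t = sim.step()
        print(f"t={t:.2e} s  T={reactor.T:.1f} K")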

Now deploy to a stable environment. That's what I'm trying to help with by building https://atomictessellator.com

senko
2 replies
23h16m

In my experiments at Pythagora[0], we've found that the sweet spot is a technical person who doesn't want to know, doesn't know, or doesn't care about the details, but is still technical enough to guide the AI. Also, it's not either/or: for the best effect, use human and AI brainpower combined, because what's trivial versus tedious is different for humans and AI, so we can actually complement each other.

Also, the current crop of LLMs is not there yet for large/largish projects. GPT-4 is too slow and expensive, while Groq is superfast but open-source models are not quite there yet. Claude is somewhere in the middle. I expect somewhere in the next 12 months there's going to be a tipping point where they will be capable, fast, and reliable enough to be in wide use for coding in this style[1].

[0] I have an AI horse in the game with http://pythagora.ai, so yeah, I'm biased.

[1] It already works well for snippet-level cases (e.g. GitHub Copilot or Cursor.sh) where you still have creative control as a human. It's exponentially harder to have the AI be (mostly) in control.

dustingetz
1 replies
23h5m

.

senko
0 replies
22h57m

I would clarify that "there" in my "not there yet" doesn't assume a superhuman AGI developer that will automagically solve all software development projects. That's a deep philosophical issue best addressed in a pub somewhere ;-)

But roughly on par with what could be expected of today's junior software developer (unaided by AI)? Definitely.

rapind
2 replies
22h14m

Just give me a product that works as advertised

Almost no products fit this description, and if they do then the marketing department is getting fired.

Does a McDonald's burger look like the picture?

If you go in with a healthy dose of cynicism, IMO LLMs can impress. I’d call it a better Google search and autocomplete on steroids.

mrguyorama
0 replies
21h32m

Does a McDonald's burger look like the picture?

It actually does in the countries that require it. You know you can write ACTUAL "truth in advertising" laws, right?

ThalesX
0 replies
22h6m

Does a McDonald's burger look like the picture?

Sometimes? But I don't go to McDonald's for the loss function between the picture and the actual product. I go for the fast food and good taste (YMMV).

If you go in with a healthy dose of cynicism, IMO LLMs can impress.

I use them every day in one way or another. But they're not replacing me coding today. Maybe tomorrow. And I go in with a healthy dose of optimism when I say this.

I’d call it a better Google search and autocomplete on steroids.

Sure, but this particular discussion is not about its ability to replace Google Search and / or Autocomplete.

narrator
2 replies
21h5m

As a developer who is good at object-oriented design and architecture, and sucks at leetcode stuff, I have been able to use it to make myself probably twice as productive as I otherwise would be. I just have a conversation with GPT-4 when it doesn't do what I want. "Could you make that object oriented?" "Could you do that for this API instead? Here, let me paste the docs in for you."

I think people want it to completely replace developers so they can treat programming as a magic box, but it will probably mostly help big picture architecture devs compete with people who are really good at Leetcode type algorithm stuff.

notpachet
0 replies
20h53m

it will probably mostly help big picture architecture devs compete with people who are really good at Leetcode type algorithm stuff.

The competition should be happening in the other direction.

ecoquant
0 replies
20h29m

Totally agree. I am not a professional developer. I find programming to be quite dull and uninteresting.

I am going to work on something after this pot of coffee brews that I simply could not produce without ChatGPT-4. The ideas will be mine, but most of the code will be from ChatGPT.

What is obvious is different skill sets are helped more than others with the addition of these tools.

I would even say it is all there in the language we use. If we are passing out "artificial intelligence" to people, the people who already have quite a bit of intelligence will be helped far less than those lacking in intelligence. Then combine that with the asymmetry of domains this artificial intelligence will help in.

It should be no surprise we see hugely varied opinions on its usefulness.

the_mar
1 replies
15h40m

Sounds like a skill issue.

ThalesX
0 replies
9h51m

Easy to say. Show me something impressive you've done with AI and little involvement from yourself.

cloudking
1 replies
21h55m

What kind of prompts are you using? You'd be surprised how much better your output is using prompting techniques tailored for your goal. There are research papers showing that different techniques (e.g. one-shot, role playing, think step by step, etc.) can yield more effective results. From my own anecdotal experience coding with ChatGPT+ for the past year, I find this to be true.
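
As a rough sketch of how those techniques combine in practice (the wording, persona, and bug-hunt task below are my own illustrative choices, not from any particular paper), assuming the OpenAI Python client:

    # Role playing + a one-shot example + "think step by step" in one request.
    from openai import OpenAI

    client = OpenAI()

    messages = [
        # Role playing: prime the model with a persona.
        {"role": "system",
         "content": "You are a senior Python developer reviewing code for bugs."},
        # One-shot: a single worked example of the task.
        {"role": "user",
         "content": "Find the bug: for i in range(len(xs)): xs[i] = xs[i+1]"},
        {"role": "assistant",
         "content": "Off-by-one: xs[i+1] raises IndexError on the last iteration."},
        # The real task, with an explicit step-by-step instruction.
        {"role": "user",
         "content": "Find the bug, thinking step by step before answering:\n"
                    "def mean(xs):\n    return sum(xs) / len(xs) - 1"},
    ]

    reply = client.chat.completions.create(model="gpt-4", messages=messages)
    print(reply.choices[0].message.content)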

ThalesX
0 replies
21h21m

What kind of prompts are you using?

I hack on them till I get something sort of satisfying.

You'd be surprised how much better your output is using prompting techniques tailored for your goal.

The biggest problem I encounter is context length, not necessarily the output for small inputs. It starts forgetting very fast, whether it's Claude, GPT+, or the other self-hosted models I've tried.

yshrestha
0 replies
16h16m

As a human, I can tell you that I suck at predicting exponential growth.

summerlight
0 replies
21h34m

My personal take is that LLMs are fairly good at replacing low-level tasks with intuitive patterns. When it comes to high-level, ambiguous questions that actually have implications for your daily work and the products, LLMs are no more helpful than search engines.

Yeah, AI will do the easy and fun jobs for you. You will only need to care about the difficult decisions that you're going to be responsible for. What a wonderful world...

stuckkeys
0 replies
23h51m

lol, I was in the same boat until I sank it altogether. I ended up wasting more time arguing with the LLM chat than doing anything remotely useful. I just use it for reference now, and even that I am not 1000% sure about.

sanderjd
0 replies
21h11m

Exactly where I'm at! A totally transformative set of tools for doing my day-to-day work significantly more productively, and also a giant distance away from being capable of doing my day-to-day work for me.

rewgs
0 replies
20h50m

This is exactly my experience. Furthermore, I've become acutely aware that spending time prompting both a) prevents me from going down rabbit holes, all but denying me the kind of learning that can only really happen during those kinds of sessions, and b) prevents me from "getting my reps in" on stuff that I already know. It stands to reason that my ability to coax actually useful information out of LLMs will atrophy with time.

I'm quite wary of the long-term implications and downstream effects of that occurring at scale. AI is typically presented as "the human's hands are still on the wheel," but in reality I think we're handing the wheel over to the AI -- after all, what else would the endgame be? By definition, the more it can do without requiring human intervention, the "better" it is. Even if replacing people isn't the intention, I fail to see how any other effect could usurp that.

Assuming AI keeps developing as it has been, where will we be in 20 years? 50? Will anyone actually have the knowledge to evaluate the code it produces? Will it even matter?

Perhaps it's because Dune is in the air, but I'm really feeling the whole "in a time of increased technology, human capabilities matter more than ever" thing it portrays.

qrios
0 replies
22h45m

I totally agree!

And I'm sure the reason for that is the garbage input. From time to time I have to perform quantitative code analyses on our so-called enterprise repositories, and the results are shocking every time. I have found an extremely poor SQL code block that type-casts many columns, in hundreds of projects. It was simply copied again and again even though the casting was no longer necessary.

The training base should be sufficiently qualified (and StackOverflow ranking is obviously not enough).

But unfortunately it's probably too late for that now. Now inexperienced programmers are undercutting themselves with poor AI output as training input for the next generation of models.

psygn89
0 replies
20h39m

The other day I thought I had the perfect task for AI: to clean up some repetitive parts in my SCSS and leverage mixins. It failed terribly and was hallucinating SCSS features. It seems to struggle in the code <-> visual realm.

orthecreedence
0 replies
21h38m

I'm probably dumb as hell, because I just can't get it to do anything remotely useful, more than helping me with leetcode.

I highly doubt you're the dumb one here.

nprateem
0 replies
22h53m

Yeah, I'll only give it tasks where it needs to spot patterns and do something obvious, and even then I'll check it to make sure it hasn't just omitted random stuff for shits and giggles.

TBH I'm more surprised when I don't need to help it now. After about 3 times where it cycles between incorrect attempts I just do the job myself.

I disabled copilot since it consistently breaks my flow.

latexr
0 replies
8h45m

I'm probably dumb as hell, because I just can't get it to do anything remotely useful

Rather, your “problem” is that you’re likely not writing uninteresting cookie cutter boilerplate that everyone can do and has done hundreds of times. The current crop of AI is cool for coding demos, not for solving real relevant problems.

Just give me a product that works as advertised and I'll throw money your way

The people hyping this crap only care about the second part of that sentence. The first one is an afterthought.

jcgrillo
0 replies
20h19m

This is an interesting post. An expert in numerical analysis compares the output of a tool which optimizes floating point expressions for speed and accuracy with the output generated by chatgpt on the same benchmarks:

https://pavpanchekha.com/blog/chatgpt-herbie.html

I wouldn't use it—sanity-checking its algebra is a lot of work, but even if you fixed that up, the high-level ideas typically aren't that good either.

This has been exactly my experience with chatgpt as well.

mattlondon
114 replies
1d1h

There is no way this is going to make it so that "engineers can focus on more interesting problems and engineering teams can strive for more ambitious goals."

Instead it will mean that bosses can fire 75-90% of the (very expensive) engineers, with the ones who remain left to prompt the AI and clean up any mistakes/misunderstandings.

I guess this is the future. We've coded ourselves out of a job. People are smiling and celebrating this all - personally I find it kinda sad that we've basically put an end to software engineering as a career and put loads of people out of work. It is not just SWEs - it is impacting a lot of careers... I hope these researchers can sleep well at night because they're dooming huge swathes of people to unemployment.

Are we about to enter a software engineering winter? People will find new careers, no kids will learn to code since AI can do it all. We'll end up with a load of AI researchers being "the new SWEs", but relying on AI to implement everything? Maybe that will work and we'll have a virtuous circle of AIs making AI improvements and we'll never need engineers again? Or maybe we'll hit a wall and progress in comp sci will essentially stop?

andrewmutz
28 replies
1d1h

Instead it will mean that bosses can fire 75-90% of the (very expensive) engineers, with the ones who remain left to prompt the AI and clean up any mistakes/misunderstandings.

We continually hear anxiety about technology leading to mass unemployment and it keeps not happening. Instead, workers tend to have higher productivity, which drives higher wages.

Technological advancements transformed agriculture from being 75% of the workforce to being less than 5% of the workforce over the last 200 years and instead of mass unemployment, everyone found other ways to add value to society and it has been an absolute win, with higher standards of living.

(https://www.researchgate.net/figure/Percent-of-the-Labor-For...)

steve_adams_86
5 replies
1d1h

You've got a good point here. I've wondered about this at times. Like, what if we can't find a way to advance these agents much beyond where Devin is now? What if they can't consume a code base and make meaningfully high quality additions to them reliably, so they're essentially stuck as novice/intermediate developers?

This seems nice(ish) for someone like me with a senior title and experience (~15 years), but totally knocks the bottom out of the industry. How do new people enter the industry and become programmers like I did? What's the new entry point into software, and what does that role look like?

I can't tell if it would actually be good to knock the bottom out. If seniors become sort of like engineers going into the machine to tune things and build the big new things, who replaces them eventually? Will software get worse because of this?

One thing I wonder is if this will birth a new generation of software architecture which is essentially a complete mess which AIs can easily and efficiently manage due to being machines, which will require businesses to "take the leap". Basically you tell the system what you want and it'll generate it on a bespoke system using custom infrastructure which the AI is optimized to implement solutions on. The results might be amazing, but if something goes wrong, humans would have a hell of a time figuring it out. That wouldn't be a nice career for someone like me. It might still leave a category of old fashioned human-engineered software, though.

One thing about agriculture worth noting is that we had a LOT of other things we could work on. Agriculture held us back, so to speak. What is software holding back right now? If I'm unemployed by this advance, what would I do instead? I'm not sure how transferable my skills are. It could actually make me quite a bit less productive, not more.

hellojesus
4 replies
1d

While it certainly makes you, the individual, less productive because you've been replaced by a machine, it still makes society more productive.

If replaced, you essentially were the misallocated capital: capital paying your salary was inefficient compared to that same salary paying a more productive AI.

This means you will either have to find a way to become more efficient to compete against AI for developer jobs, or you will have to reskill. This reskilling period is what you're referring to as being less productive, but in the long run you will find a job and therefore become productive again. We can't guarantee that job will pay as much as your former one, but costs in the former sector fell greatly anyway, so it doesn't much matter.

In short: you will be reallocated and find your optimal productivity subject to your utility preferences through the reallocation.

BriggyDwiggs42
2 replies
1d

This is objectively accurate, and I think you mean to be nothing but. However, the mechanistic focus on efficiency and profit at the direct expense of human quality of life is downright ghoulish. We need political changes that ensure people who are “outmoded” by this tech can still have good lives.

hellojesus
1 replies
4h28m

The mechanism for this is charity. I understand people like to see social welfare programs, but I believe they are inferior to private-sector charity because they ignore moral hazard, tend to produce worse outcomes, and are less economically efficient. At the core, welfare steals from all to provide for some. This introduces deadweight loss.

I see a future where those displaced by automation are able to retrain with minimal cost or by charitable means. But this also means that current workers should be saving hard for a rainy day, as nobody knows when one may come.

Pugpugpugs
0 replies
2h15m

"At the core, welfare steals from all to provide for some." Yep, I want a system of progressive taxation that provides negative pressure on wealth inequality while uplifting the poorest. I like this idea because I like liberty, and a rich person loses almost no agency by having some portion of their income or wealth taxed while a small stipend can make a huge difference in the choices available to a poor person. There's also the utilitarian argument that disproportionately burdening a small-ish number of rich people for the benefit of the majority is a net good. If private sector charity could achieve either of these goals (alleviation of wealth inequality or uplifting of the poor) as well as social welfare can, I'd argue for it instead, since at the end of the day I prefer it be voluntary, but I don't think that it's up to the task, which we can observe in today's charities. Do you have any evidentiary basis for the counter? Why would the system of voluntary charity suddenly improve its outcomes or have more money to spend?

"they ignore moral hazard"

Sometimes economic risks are negative, like spending your paycheck on lottery tickets, but other times they're positive, like starting a risky but ambitious startup or getting an education. If social welfare causes a significant change in the average person's risk tolerance, a claim that definitely requires substantiation, then I still might argue this isn't a net negative for the economy or society at large. Even if the claim is true, is using homelessness and privation as a threat (a threat often realized upon the unfortunate) a worthwhile ethical sacrifice? What does it cost us not to do this? Because I think it's worth some money.

"tend to produce worse outcomes"

If this is true, which I also wouldn't take for granted, then you'd still need to factor in the asymmetry in access that private charities have vs a social welfare system. Perhaps a person who gets the aid of charity benefits more than one who gets welfare, but does the average person?

"are less economically efficient"

This is likely to be true simply because government tends to be inefficient, but can be worth it for the above reasons and can be assessed objectively. I'd be fine with spending some margin more than should be spent, say 50%, if it means that 95% rather than 10% of people have access to food or housing or whatever else.

Overall, I don't see how charity can even serve as a mechanism for what we're talking about. If we see unprecedented unemployment due to AI, how exactly do we expect voluntary charity to expand to meet demand? What can we do other than some form of wealth redistribution if we don't want lots of people to starve? Furthermore, why can't we do both? If charity will expand to meet demand, then let's use social welfare to fill in the gaps and prevent destitution.

ipaddr
0 replies
1d

Or you will become a negative to society and not find another job. Perhaps be forced into negative elements like crime or hacking for a payday.

Many never recover and age makes a difference.

HumblyTossed
3 replies
1d1h

Instead, workers tend to have higher productivity, which drives higher wages.

Wages have not kept up with productivity.

lkbm
1 replies
23h40m

They don't need to for productivity gains to be good for workers. Employment and real wages are both up. The fact that profits are also up doesn't change that.

dukeyukey
0 replies
17h58m

Real wages haven't improved for 15 years where I am :(

xena
0 replies
1d1h

There's one notable exception! Tech work in the Silicon Valley area. All other jobs should be paid like that to keep up with productivity and inflation since the 70's.

pzo
2 replies
1d

The key phrase is "over the last 200 years". 200 years ago there were just 1 billion people. People had decade(s) to reskill to a new profession. Their offspring picked a different profession if their parents didn't have good prospects. Changing professions also didn't require half a decade of learning.

Now imagine that AI makes 20% of people redundant over the next 5 years - that's ~1.6 billion people.

zeroonetwothree
1 replies
11h12m

To be fair most of the drop was only over a ~100 year period.

BTW there is little chance AI makes "20% of people redundant over the next 5 years". I would bet $$$ against this.

Pugpugpugs
0 replies
1h48m

Agreed, it's definitely not 5 years, but it probably will be within 12-20.

bananapub
2 replies
1d

this of course conflates two entirely different things.

1) technology development has tended to hugely improve society-wide productivity and be a general (though not unmitigated) good.

2) technology development has been absolutely shit for many individuals as their careers disappear.

people should be way more worried about good national governance and safety nets to deal with the terrible consequences of 2 while we reap the benefits of 1.

Instead, workers tend to have higher productivity, which drives higher wages.

workers are capturing less and less of that, especially over the last twenty years.

ipaddr
1 replies
1d

Too many people using the safety nets will cause them to collapse.

bananapub
0 replies
23h49m

lol.

hopefully whoever is in charge of your country has put a lot more thought into the future of work as automation comes for white collar workers.

Retric
2 replies
1d1h

Not all jobs are the same. The real median US income in 2019 was $78,250/year; by 2022 it had fallen to $74,580. https://fred.stlouisfed.org/series/MEHOINUSA672N

The only way that happens is if millions of people are significantly worse off. Most people can find work in the big economy, but that doesn't mean replacing a $100k/year job with a $25k/year job is equivalent.

Across a long enough time horizon technology tends to make most people better off, but in the short term it can seriously fuck people over.

zeroonetwothree
1 replies
11h10m

A 5% decrease is "significantly worse off"? And it's still higher than 2018--it's not really reasonable to expect an increase every single year, you should look at the longer-term trend.

Retric
0 replies
8h9m

The median isn’t a simple average. The minimum difference is 5%; people who lost 50+% only count as one more person below the old median. So that 5% represents a great deal of pain for millions, not simply a slight haircut.

And sure, the long-term trends reverse things, as I mentioned, but it took 15 years to recover to where it was in 2019. Most Americans are simply cut off from wider economic growth and recovery.

dukeyukey
1 replies
1d1h

It's absolutely a win for society (in the longer term), but in the meantime a lot of people's lives were upended, devastated, and even ended due to the upheaval. Maybe we can avoid that this time, but I doubt it. You can't tell someone whose children are starving that it's all worth it.

andrewmutz
0 replies
1d

Were they? Or did their children move to the city and learn different skills, while they bought machines to replace the lost labor? Was the tractor actually bad for the farmer?

throwway120385
0 replies
1d1h

Meanwhile we all spend more of our time working for someone else and living our lives with greater intensity than in the days of agriculture.

rlt
0 replies
1d1h

Both things can be true.

In the short to medium term a bunch of people will be out of a job/career.

In the long term society may benefit overall.

On the other hand I’m not convinced humans are evolving fast enough to keep up with modern society. There are increasing rates of anxiety, depression, ADHD, etc, especially in young people. https://www.thecut.com/2016/03/for-80-years-young-americans-...

petsfed
0 replies
1d1h

...for instance, by a larger shift towards service-sector jobs (e.g. janitorial work, dining and entertainment, and retail sales).

It is the case that productivity has grown with automation, but at the same time median wages have stagnated, as the number of high-paying jobs has steadily shrunk. Accounting for inflation, median income has been stagnant since about 1965. But productivity is at an all-time high. All the wealth that this generates is not going out into the world; it's being concentrated.

I'm in automation with physical machines, and there's a part of me that sincerely hopes that the continuing automation of various jobs leads to a golden age where society's basic needs are always met by robots, and we're free to pursue our passions. But I'm honestly not optimistic that that will happen without a series of (likely bloody) revolutions and counter-revolutions, until either the species is extinguished or our social system finally achieves a new equilibrium. AI can definitely be understood as an invasive species or a natural disaster in terms of the impact it has on our social ecosystem.

noncoml
0 replies
1d1h

You are missing an “in the long run”

I thought “The Grapes of Wrath” was exactly about the transient effects of the agricultural transformation.

dw_arthur
0 replies
1d1h

Things could be different this time. We've pretty much automated our physical labor; what happens when you automate general thinking? Any new job or field will also be able to be done by AI.

bigyikes
14 replies
1d1h

If an AI was truly capable of replacing engineers en masse, the upside would hugely outweigh the downside.

Engineering muscle will be limited by chips instead of brains. Injecting millions of AI “engineers” into any economy would be a massive boon.

Many of us would be out of work, but if we can redistribute the AI-generated surplus, this will still be a net-win for us all.

I think the top engineers will still be able to find a niche regardless - we get paid to solve problems, and there will always be human problems.

thfuran
6 replies
1d1h

if we can redistribute the AI-generated surplus, this will still be a net-win for us all.

We can't.

bigyikes
3 replies
1d1h

Like, due to lack of political willpower? Or do you see some fundamental limitation of, say, an automation tax?

steve_adams_86
0 replies
1d1h

Arguably we haven't seen redistribution of wealth due to past automation advances, so it seems unreasonable to believe it will happen now. As automation has improved in the last decades especially, wealth has disproportionately moved upwards.

bananapub
0 replies
1d

if we can redistribute the AI-generated surplus, this will still be a net-win for us all.

which historical technological development that hugely improved productivity do you feel has led to an improved universal social safety net / public investment in whatever country you're from?

antisthenes
0 replies
1d

Yes, the complete impotency of anti-trust and big business regulation over the last 50-60 years.

And any political willpower that exists is constantly used for smear campaigns by either side of the political spectrum (e.g. look at them, they are the bad guys).

Automation tax is also the worst possible outcome, since it both stifles innovation AND doesn't address the redistribution problem.

n0sleep
1 replies
23h55m

Correction: we won't.

thfuran
0 replies
21h48m

I'm not saying it's against the laws of physics, just that it's not achievable. If people worked differently, we could allocate resources differently.

BugsJustFindMe
3 replies
1d1h

but if we can redistribute the AI-generated surplus

History suggests that we cannot.

This is very much like saying "but if we can redistribute the wealth of all the billionaires". Like, uh huh, but that won't happen because, short of fear of murder, people winning a class war have no incentive to stop.

bigyikes
2 replies
1d1h

Millions of people being put out of work might change the political landscape quite a bit. We’re talking about a technological revolution, so I don’t think it’s too crazy to consider an economic or political revolution to go with it.

HumblyTossed
0 replies
1d1h

Millions of people being put out of work might change the political landscape quite a bit.

No, it won't. Even during the vid, when people couldn't work, the politicians found a way to enrich the wealthy while only doing the bare minimum for those who lost their jobs.

BugsJustFindMe
0 replies
23h26m

Maaaaaaybe...but...

The (first) French Revolution, the period in history to which all revolution is compared, did not distribute equality to the masses. It distributed The Terror, and then it distributed Emperor Napoleon, and then it distributed the Bourbon monarchy right back into power.

Then the (second) French Revolution changed one king for another.

Then the (third) French Revolution distributed another fucking Emperor Napoleon.

Revolutions aren't as equality-building as people want to believe.

xena
0 replies
1d1h

No. The money-havers will just cut the costs from software developers and our jobs will be gone forever.

faefox
0 replies
1d1h

"if we can redistribute the AI-generated surplus"

I'm curious to know why you think there is even the slightest chance of this happening.

HumblyTossed
0 replies
1d1h

Many of us would be out of work, but if we can redistribute the AI-generated surplus, this will still be a net-win for us all.

This won't happen. It does NOT trickle down, Mr. Reagan. The past 40+ years, every single graph you pull up shows the widening gulf between the wealthy and the not.

mym1990
8 replies
1d1h

How is this any different from the industrial revolution ushering in a new age, or for that matter any technology that creates huge efficiencies in labor? The truth is that the future is still very uncertain, and while the easiest thing to do is yell "they took er jerbs" at the top of your lungs, maybe think about how to effectively move forward into the future.

There is a book called "Who Moved My Cheese?" and I wouldn't say it's an amazing book, but the concept that not everything lasts forever, in relation to job security, is the takeaway.

jprete
3 replies
1d

The whole point of AGI is to be able to do everything a human can do, so this argument doesn't apply in the same way it did for mechanical automation.

mym1990
1 replies
13h1m

You seem to be really dumbing down AGI, considering AGI will likely do everything a human can do, as well as many things a human cannot do, and all of those things will be done vastly faster and better.

jprete
0 replies
5h4m

Your statement reinforces my point. All human beings should be concerned about their place in a system that doesn't need them, to the extent it's constructive to think about.

zeroonetwothree
0 replies
11h2m

Many jobs require a corporeal body so at least we have that.

xcv123
2 replies
23h59m

the easiest thing to do is yell "they took er jerbs"

If the goal with AGI is to literally take all of our jobs (all jobs in all sectors) then what are you suggesting we can do instead?

mym1990
1 replies
13h4m

The "goal" of AGI seems to be a pretty subjective thing, doesn't it? Idealistically, I could believe that the goal of AGI is to help humans build the future faster, and expand our footprint beyond what we currently believe is possible. Your pessimistic view of AGI is definitely valid, but it doesn't make it anything more than opinion, just like my view.

xcv123
0 replies
12h21m

AGI is being developed by companies which have billions to fund it, such as Google, Meta, and Microsoft. Their goals are not subjective. These are ruthless for-profit entities that exist to increase shareholder returns.

pizza234
0 replies
1d1h

How is this any different than the industrial revolution ushering in a new age, or for that matter any technology that creates huge efficiencies in labor?

Indeed it isn't. It's just that at each technological breakthrough, a number of (vocal) people think that it's a new type of revolution, ignoring those of the previous centuries /shrug.

siva7
6 replies
1d1h

I would have dismissed this thought a year ago, but seeing how fast OpenAI is moving, in 5 years those AI assistants will be what human junior/intermediate devs are today.

eloisant
3 replies
1d1h

That's exactly what people were saying about self-driving cars 15 years ago. "We're so close, within 5 years we'll have full self-driving, and in 10 nobody will need a driver's licence!"

siva7
1 replies
23h10m

We're pretty close now, aren't we, to within 10 years now!?

elicksaur
0 replies
19h54m

Yes, and next year we will be 10 years away from full self driving. By 2025, we should be 11 years away if all goes well.

zeroonetwothree
0 replies
11h7m

I remember discussing it in 2009 exactly as you say. I thought "my kids won't ever need to learn to drive". Sadly it turned out to be wrong :(

steve_adams_86
0 replies
1d1h

The fact that these agents don't sleep is what will really kill human developers.

Even with my nearly 15 years of experience, I'm not sure I see companies justifying the cost of employing me soon when someone half as "capable"* as me can work relentlessly and tirelessly at churning out half-baked features.

*I doubt AI agents will be able to use bigger picture foresight and reasoning (especially reasoning as software pertains to human user experiences) to architect sane applications (as we understand them today, at least) but this likely won't matter in a vast majority of cases.

CipherThrowaway
0 replies
1d1h

If this is the case, it would be more inline with how other complex professions work.

Entry level positions in fields like medicine, law, the sciences, architecture, engineering etc can require years of intensive training before you're skilled enough to take on the role even at entry level.

swatcoder
5 replies
1d1h

The printing press replaced the need for scribes but introduced the need for typesetters.

Talented scribes did cool things with illustrations and flourishes that were lost in the transition, but ultimately spent quite a lot of their time stroking the letter "e" or "i" or whatever.

Meanwhile, typesetters found themselves in a whole new creative domain which was a different one than the scribes had been working in while also being informed by it and related to it. With their different workflow and this different domain, they were able to innovate in many ways unique to typesetting but also in ways that would circulate back to calligraphers and other inheritors of the hand-crafted letterwork tradition.

I'm personally not sold that we're soon to see LLM-based code generators replace software engineers anyway, but things are not so black and white as you suggest even were that to happen.

ewild
2 replies
1d1h

Eventually we will reach the limits of what can be discovered with physics. The same applies here: eventually, the limit of what a human can improve on through a job will be reached. Is this that point? Idk, but there will come a time when "new jobs" aren't made.

zeroonetwothree
0 replies
11h8m

This is like saying that because the possible books that can be written are finite (assuming some max length), eventually every book will be written and writers will have nothing to do. While mathematically true, the word "eventually" is doing a lot of work.

swatcoder
0 replies
1d

Your reference would only be true if we could actually catalog, index, reference, retain, and absorb all there is to know about the physical world into a model so simple we can still comprehend it. That's... unlikely.

More likely is that progress, discovery, and improvement behave more like dispersing bits of fog in an intractably large cloud that's always creeping back in on the clearings you made previously. You can sustain positive progress, but it's asymptotic at best and there are always regressions eating away at what you've done in the past.

So don't worry, there's always going to be more kinds of work to do, just like there's always going to be more physics to study.

cyrialize
1 replies
1d1h

I understand where you are coming from, but I think using the impact of the printing press to help predict the future of careers and AI is flawed.

I think AI is hard to compare against the past. A printing press replaced some jobs, but AI could replace way more jobs.

Maybe it is like the printing press, but without typesetters.

swatcoder
0 replies
1d1h

That's fine, but once one assumes the impact of this technology is wholly without precedent then they're left speculating about a future informed only by their own imagination.

They'll always be able to re-affirm their own preconceptions (fears) because it's the "just so" fantasy future of their own making.

I don't see the point of coming to HN to trade those invented stories, as people here traditionally push to stay within the engineer's realm of how real things work, how they fit into the history of innovation, and what people might practically build with those things based on how they work.

There are countless speculative fiction communities better suited to idle "yeah, but what if the sky was purple tomorrow?" discussions.

optimalsolver
5 replies
1d1h

can focus on more interesting problems

Kind of like how art was supposed to be what humans would be doing while AI does the jobs we don't want, but looks like that will be the first thing to fall to the machines, while humans fight for carpet installation and plumbing jobs (for a while).

alx_the_new_guy
1 replies
1d

Art as a concept is basically meaningless these days.

What is art? Apparently, anything can be art, so what's the difference between art and not_art?

From my personal experience, it is used more as a shitty excuse for substandard/lame/overpriced stuff to exist, and to boost its creator's ego.

AI certainly isn't really doing it, just as people aren't.

Fuck art.

huhhohp
0 replies
1d

As you note: “it is used more as a shitty excuse”, this is “art as excuse”.

Fuck art.

For me, art is both personal expression and interpretation, and yeah, those things can be imbued or ascribed to “anything”.

Even a comment on a web forum.

Your comment is an expression of your personality.

Your comment is your art.

Almondsetat
1 replies
1d1h

Have you ever seen people dreaming of becoming a soulless cog in the machine creating forgettable 3D assets for mobile video games at a no-name studio once we achieve "full automation"?

Me neither.

What people dream about is being able to go to a field and paint or sing at their leisure, not engaging in the soul crushing rat race that is artistic entrepreneurship.

AI is replacing the jobs that make money but are most certainly creatively bankrupt.

infinitezest
0 replies
1d

OK but... How will I get money to live and feed my kids and stuff? Will the rich decide they finally have enough this time? Why would they?

CipherThrowaway
0 replies
1d1h

AI still can't do art. Tacky AI generated imagery is mid-2020s clip art, already recognizable to consumers and signalling negative brand associations like "cheap", "scam", "low quality."

noncoml
5 replies
1d1h

Call me a fool, but this hype reminds me of the hype about self-driving cars coming in a year, back in 2014.

AI engineers are coming, but I think we will have long retired before they reach a point where they can replace us.

pzo
4 replies
1d

The difference with the self-driving car hype is that cars need to be 99.999% good, so pretty much perfect, to be useful on the road and incorporated into the mainstream. AI doing some tasks 90% as well as a human is good enough. Self-driving cars have massively improved in the last 15 years; I remember 15 years ago when DARPA was running its first self-driving challenge, and the tech we have now is like magic compared to what we had back then.

blibble
1 replies
20h42m

Need a few more 9's than that, otherwise everyone would be dead within 50 years.

zeroonetwothree
0 replies
11h4m

99.999% on an annual basis is ~10x safer than human drivers

zeroonetwothree
0 replies
11h3m

It depends on what the 10% is. For example, if your code only works 90% of the time that's not exactly good enough to use...

halfjoking
0 replies
22h42m

But with software, doesn't technical debt accumulate over time when low-quality engineers keep working on it?

That's why starting projects with something like Cursor makes you seem superhuman, but as the project grows the AI is more likely to get stuck because of previous low-quality choices. Just like with driving cars, it seems like you need strong supervision. (at least for the foreseeable future)

raytopia
4 replies
1d1h

I don't think it's the end of software work yet.

A lot of these AI companies are promising the world and of course no one can deliver on that.

I think it's more likely we'll enter another AI winter first after all these impressive party tricks get boring and investors realize that just because a product uses AI doesn't mean it provides any value.

steve_adams_86
3 replies
1d1h

What do you expect would cause the next winter? Or in other words, what capabilities of AI in this context do you expect will stall next?

raytopia
2 replies
1d1h

Failing to meet investor hype would be the biggest reason in my book.

Also, it seems like training and running these systems is incredibly costly, and the prices AI companies charge for their products are being subsidized by investor money, which won't last forever.

steve_adams_86
0 replies
1d

Good point. I've wondered about costs becoming prohibitive. However, I've seen impressive optimization of some models where compute requirements are reduced by orders of magnitude. I'm not sure if these accomplishments will be transferrable to all expensive AI use cases, though.

If we were to pay the actual costs at this point, I do wonder how many people would consider it worth the expense. How much should ChatGPT cost, for example?

ipaddr
0 replies
23h53m

The investor money running out is a likely outcome, which will push AI prices higher and change who uses it.

CipherThrowaway
4 replies
1d1h

Instead it will mean that bosses can fire 75-90% of the (very expensive) engineers, with the ones who remain left to prompt the AI and clean up any mistakes/misunderstandings.

This is the same logic that has driven cheap off-shoring in non-technical companies.

For decades orgs have been able to buy "human-level" (i.e. humans) engineering for a tiny fraction of an engineer's salary, and there have been millions of eager salesmen for off-shore dev shops pushing them to do it too. After seeing the outcomes of this approach, I understand why well-paid engineers remain well paid. And why they'll remain well-paid after the LLM non-pocalypse.

If you think LLMs are so amazing, I would encourage you to see how much you can rely on them to replace human beings in real world scenarios. Not in contrived PR pieces and cherry picked examples but in situations where actual real people would otherwise be working together to deliver commercially valuable outcomes.

You believe you have domain specific insights that allow you to state, with confidence, that LLMs are able to replace a highly technical and well-compensated role at virtually no cost. If that's the case, you're sitting on a gold mine. If I believed that, I'd be starting a development agency tomorrow.

hackerlight
3 replies
1d

  "If I believed that, I'd be starting a development agency tomorrow."

The company making Devin is doing just that. As we can see, it will take some work to perfect.

Verdex
1 replies
23h22m

If it were actually working for anyone, then they would be selling software engineering time at the same, but slightly cheaper, price as existing software engineering time costs today, so they could capture those sweet margins.

This is a company spending investor money selling pickaxe hand grips during a gold rush.

For real evidence, look for companies selling engineering time much greater than the amount of their total engineer count who have good customer retention across projects.

mrguyorama
0 replies
21h19m

It's hilarious. If Devin were any good, they wouldn't be selling access to it to random SWEs; they would be replacing Microsoft, Apple, Google, etc. for those sweet sweet trillions of dollars!

Where's the app they built in an afternoon using Devin? Where's the software product that Devin actually built a month ago and was being used by thousands of people?

Their actual business seems to be closer to "Let's milk some of that sweet sweet high income from SWEs with FOMO about AI".

pdimitar
0 replies
1d

The "some work" phrase is doing a lot of work for you here. It can easily take them 100 years as well and they will get broke long before.

I see nothing in the original article that doesn't strike me as the techno-optimism of the 1960s, when people made movies and books saying "It's the year 2003 and humanity is exploring the vast depths of the Universe".

So again, it's a very plain old boring techno-optimism.

I am sure they can automate some work (like scaffolding a certain CRUD part of the app), but there are always nuances and specifics, and the current generation of AI has so far proven inadequate at catching those and taking proper care of them.

IncreasePosts
3 replies
1d1h

If AI can't make AI improvements that humans can, then it shouldn't be considered a very capable coder.

bigyikes
1 replies
1d1h

How many coders do you know that can make AI improvements? That ability is already reserved for the top humans. AI doesn’t even need to reach this bar to be better than the average coder.

IncreasePosts
0 replies
1d1h

I don't really think it is the "top humans" who are doing AI; it is just people whose skillsets and interests mesh with what is used in AI now. I will say that I am doing AI work and I am certainly not a top human when it comes to intellect.

pdimitar
2 replies
1d

You are vastly overselling current-generation AI here. It can do some things -- GitHub Copilot has been useful for reducing boilerplate, for example -- but in terms of actual programming, which 98% of the time is maintenance (fixing bugs, debugging, adding tests, refactoring, adding features), it performs mostly badly. It's only good at generating code and maybe "understanding" some of it. Prove me wrong with links and the AI equivalent of CodePen (something that's a very glaring omission in the area).

Secondly, bosses have tried to fire the most expensive programmers ever since expensive programmers became a thing. All the outsourcing to wherever the salaries are much smaller is being attempted even as I type this, by probably no fewer than 1000 companies, in this very second. Why hasn't the industry in general gotten rid of its expensive programmers yet?

Easy -- the outsourced "talent" produces crap that then costs more to fix and repair than if you had hired proper programmers in the first place. Of course that requires some forward thinking that's not limited to the short-sighted "we are about to save money muahahaha" mindset, and as we know many businessmen are incapable of looking forward -- hence this mistake is being made 24/7.

It's 99% the same with these AI coding bots.

Wishing cost savings into existence so far hasn't worked.

Will some interns and smart people who coast on much-higher-than-local wages because they wrote some Python to import their boss' Excel spreadsheets and automatically generate other stuff, lose their jobs? Yes, that's very likely.

Will the senior programmer be replaced? It's not a zero chance, surely, and we already saw significant layoffs at most big US companies, but (1) most of the world didn't over-hire during COVID, (2) it's unclear whether those fired were senior devs or actual redundancies, and (3) well, most of the world isn't the USA.

So again, you are vastly overselling the current generation AI.

rohandakua
0 replies
8h19m

What do you think about Devin? Just curious.

rewgs
0 replies
20h38m

Fair, but:

Whether or not the tech can actually live up to the hype doesn't mean execs/VCs/etc won't try to get there anyway. That will, at best, result in a ton of volatility, as those trying to utilize AI figure out that it actually can't do what they want, and then have to hire engineers again, etc...

I don't know. Who knows where this is heading.

_factor
2 replies
1d1h

Or maybe the best engineers will be able to shine, since they'll have the best AI-management ability combined with coding knowledge.

Let’s give all the shovelers spoons and live in the past where we can ignore the benefits technology can bring because we’re so afraid of what it means about the way we run society.

happymellon
1 replies
1d1h

Or maybe the best engineers will be able to shine since they have the best ai management ability with the coding knowledge.

Considering we haven't seen talent rise to management, I'm not sure what makes you think that management will improve now.

uh_uh
0 replies
1d

Because the human talent will not have many alternatives left in this scenario. So far they could stay in the technical lane; now they will be forced into AI management.

observationist
1 replies
1d1h

Software engineering still needs a human in the loop, just like art still needs to be prompted and tweaked by a human that can do composition, or writing that needs to be edited and refined, and so on.

AI can't do 100% of the jobs, but it seems like we're somewhere past the 100x capabilities point. AI should be able to make a good employee capable of 100x the output at the same level of quality, and AI only gets more efficient and capable from here on out.

Lawyers doing highly specialized work like doc review can use million token context lengths to achieve days or weeks of work in minutes or hours. Doctors can review huge quantities of literature in search of information relevant to their patients, maximizing the value of their time. Any knowledge work, anything that has repeatable processes, anything not requiring physical work in the real world, will be subject to increasing, accelerating, and unstoppable automation. Market forces will reward efficiency, and eliminating human overhead with relative cost reductions approaching 99.99% is a victory for companies nimble enough to pull it off.

We're living in the apocryphal interesting times.

tfpdotdev
0 replies
1d

The next wave of AI will just take one high-level goal, and prompt you to reduce ambiguity. You’re already seeing it with this Devin thing, but it’ll get way more effective.

cm2012
1 replies
1d

If AI gets good enough to wholesale replace developers, that is amazing news for the world. It's basically AGI. Productivity and GDP growth would skyrocket, tax receipts will explode, gov debts will be paid off, etc.

jspdown
0 replies
10h46m

At first it would be a big win for companies. But as jobs get absorbed, the productivity boost will become pointless. The pool of potential buyers will shrink because of unemployment, and we will end up with the opposite of what you are describing.

I'm not saying that's what will happen, nor your version. Things are never that simple.

But shouldn't we be more cautious, and take time to understand the shift and prepare as a society for this big change? Why rush into something that could negatively affect millions of people's lives, in the hope of a productivity boost?

The outcome is not all good, not all bad. We need careful thinking and planning.

Bjorkbat
1 replies
1d

Something I kind of tell myself is that if AI effectively becomes a drop-in replacement for software engineers (or is at least 95% as good as the median one), it's going to suck because I'll have lost my job and we'll have killed one of the few good careers that exist out there, but on the other hand, look at the big picture. I'm not even sure the "software" industry will exist.

I mean, think about it. Rationally speaking, if people can hire AI engineers at a fraction of the cost of a minimum-wage employee, then the price pressure on software will be "significant", to put it mildly. The more complex a piece of software is, the more incentive there is for open-source developers to collaborate and pool their resources to make a cheaper alternative using AI agents. This logic could even extend to the AI agents themselves.

I know my prediction is deeply flawed, but basically, if we create an AI that's a drop-in replacement for (most) software engineers, we're probably going to have a massive deflationary crash not just from the software sector, but in the economy in general considering that many white-collar workers have also probably been automated, with the consequences that follow.

I might lose my job, but I'm going to take so much more down with me. My loss is their catastrophe. Sounds terrifying, maybe terrifying enough for our political representatives to address the elephant in the room, that an AI just took out their tax base, the people who still have a job probably make so little that their tax contributions are cancelled out by government benefits (not just welfare mind you, but the simple fact that they drive on free roads), and taxing corporations won't fix this when corporate revenue is probably going to also decrease.

Otherwise though, we're notoriously bad at predicting the future and estimating the general difficulty of something as ambitious as trying to replace certain careers with AI or robots. Any argument you could make for why software engineers (or white-collar work in general) will be different this time was probably made earlier by someone in reference to something like self-driving cars or physical labor, or possibly in reference to a prior attempt at automating white-collar work using a more primitive form of AI.

Best course of action, individually, is probably to skate to where the puck is heading and make an earnest effort at improving your productivity with software copilots, but otherwise I think the least likely outcome is that automating the software engineering profession is as easy as the e/acc crowd on Twitter would have you believe.

Bjorkbat
0 replies
1d

Also, something else to consider, I kind of consider the automation of writing code not all that different from the automation of architectural and engineering drawings. No doubt some engineers were distraught when computers took the fun out of drawing, and professional draftsmen were devastated, but otherwise engineering is still viable enough as a career for parents to pressure their kids into studying it.

Perhaps a more realistic prediction of the future is that we start to identify a little more as designers, concerned more with the design of software and how it functions, rather than engineering concerns like making code performant and efficient.

I felt that software engineering as a career had a short shelf-life around 2020, but I wasn't too worried about it, as I figured I'd probably just transition to design as a career, I kind of think graphic design is cooler anyway. I began to have doubts about this plan once DALL-E caused an existential panic among illustrators, but otherwise I think it still has legs.

Those who adapt will probably do so by essentially becoming UI/UX designers who happen to also know how to code well, and probably know some other more general design skills just to round things out a little more.

slig
0 replies
22h20m

Instead it will mean that bosses can fire 75-90% of the (very expensive) engineers

Twitter/X fired about that % even before the AI meme. Companies are way too bloated and it's bound to happen.

readthenotes1
0 replies
1d1h

As we eat our own dog food, we get eaten by it...

qgin
0 replies
1d

We’re still not quite there, but you’re correct.

This tech could free up software engineers to focus on more interesting things. But that’s also true every time there are layoffs. Those engineers they got rid of were free to focus on more interesting things, had the company wanted to utilize them that way. Instead, of course, they reduced headcount to reduce cost.

Same will happen with AI tools.

molticrystal
0 replies
1d1h

From the article:

Devin correctly resolves 13.86%* of the issues end-to-end... previous state-of-the-art of 1.96%.

While the cliff everybody will be shoved off is coming closer, it still seems engineers have plenty of room to do their job. Whether the next shove will come in months or years is still too early to tell.

krainboltgreene
0 replies
1d1h

This post has the same energy as when a junior programmer talks to me about the "dead language called Java".

jackblemming
0 replies
1d1h

The answer is not hating AI, it’s implementing UBI when AI produces 10x more wealth and resources for humanity. That’s a people problem, not an AI problem.

And furthermore, we’re not even remotely close to that. I guarantee this product actually sucks and is totally useless in practice. No offense, the tech is just not there yet. Saying this as someone at the cutting edge of AI and software development.

breadsniffer
0 replies
23h38m

They're 100% shooting for being able to fire most engineers. That's the dream that allows raising a bunch of $$$. Just know there's a reason their product is a closed demo. Any working "agent" will require intervention. If a product requires intervention, you still need someone managing the AI and they just become software devs using a powerful tool.

bilvar
0 replies
23h50m

If you are so sure this is happening you can just buy some stock or calls on these AI companies and you will become a millionaire, no need to worry about your job.

antisthenes
0 replies
1d

Are we about to enter a software engineering winter? People will find new careers, no kids will learn to code since AI can do it all.

It's important to remember that the reason AI can do it at all is that millions of software engineers wrote decent/good code and made it available on the internet. The reason it's going to be a winter is that going forward, there's going to be a huge reluctance to share code by people.

And that AI will never write code better than humans. It will write it 90% as well, still requiring a human to fix the last 10%.

It will further create a bimodal distribution of wages in the software industry. Those who know how to clean up the last 10%, and those who don't (AI prompt monkeys). The rift between these 2 categories will keep widening.

ak_111
0 replies
1d1h

This is correct but somewhat unfair, given that it applies to any technological project: the purpose of a technological project is to improve efficiency in some workflow, and improved efficiency means less demand for worker time.

I don't know which part of technology you work in, but I can probably spin it as making it possible for some manager to reduce staff.

MSFT_Edging
40 replies
1d1h

Humans seek work that provides satisfaction and meaning in their life.

For every technological advancement, artisans are the first to be made obsolete.

Sure, we have landfills full of unworn textiles, and the market says it's good, but overall we keep destroying what allows humans to seek meaning.

Our governments and society have made it clear, if you don't produce value, you don't deserve dignity.

We have outsourced art to computers, so people who don't understand art can have it at their fingertips.

Now we're outsourcing engineering so those who don't understand it can have it done for cheap.

We hear stories of those who don't understand therapy suggesting AI can be a therapist, of those who don't understand medicine suggesting AI can replace a doctor.

What will be left? Where will we be? Just stranded without dignity or purpose, left to rot when we no longer produce value.

I ask this question often, with multiple contexts, but to what end? Who benefits from these advancements? The CEO and shareholders, sure, but just because something can be found for cheaper, doesn't mean it improves lives. Our clothes barely last a year, our shoes fall apart. Our devices come with pre-destined expiration dates.

Where will we be in the future? Those born into money can continue passing it around, a cargo cult for the numbers going up. But what about everyone else?

golergka
8 replies
23h47m

What will be left? Where will we be? Just stranded without dignity or purpose, left to rot when we no longer produce value.

Nobody stops you from paying $1000 for a shirt made by artisans right now. Do you want to?

Our governments and society have made it clear, if you don't produce value, you don't deserve dignity.

It's not somebody else who decided that. It's you.

feoren
2 replies
20h59m

> Our governments and society have made it clear, if you don't produce value, you don't deserve dignity.

It's not somebody else who decided that. It's you.

No. The Republican National Convention cheered the idea of letting the poor and sick just die off in the streets. They cheered. Ron Paul asked "what are we supposed to do? Let our sick and poor just die cold in the streets?" and they cheered.

Jobs are sacred in the U.S. Job creators must be worshipped. Hard-working Americans are the lifeblood of yadda yadda. As soon as you don't have a job: fuck you, scum, you deserve to die in the streets. You are no longer of use to the wealthy, so you do not even have the dignity to sleep on benches or under bridges: they add spikes to any area you might find any comfort. Your children cannot receive an education. You get to disappear from view into some secluded slum until you die of the cold. It's not GP that decided that. It's tens (hundreds?) of millions of Americans who will cheer on your death if you lose your job. Most of which, of course, are a couple paychecks or a major illness away from being homeless themselves. Do not act like that's not a real thing.

sandspar
0 replies
10h13m

You're really passionate but you're strawmanning all over the place.

golergka
0 replies
19h10m

You are literally agreeing with me and yet phrase it as if we’re arguing about something? I’m confused.

ethanwillis
2 replies
23h34m

Nobody stops you from paying $1000 for a shirt made by artisans right now. Do you want to?

If you don't do <insert extreme edge case> then your point is invalid. /s. Look, you don't have to pay $1000 for a shirt made by an artisan to get a quality shirt. It can be a bit more expensive, but much cheaper than that. And this can be true while being fair/reasonable to the artisan and also to the person acquiring it.

It's not somebody else who decided that. It's you.

And what about my decisions? Or your decisions? Or the decisions of anyone reading this? Surely the person you're replying to isn't some dictator who decides everything. It's not just the parent comment, then. It's not just "It's you".

x-complexity
1 replies
17h4m

> Nobody stops you from paying $1000 for a shirt made by artisans right now. Do you want to?

If you don't do <insert extreme edge case> then your point is invalid. /s.

Look, you dont have to pay $1000 for a shirt made by an artisan to get a quality shirt. It can be a bit more expensive, but much cheaper than that. And this can be true while being fair/reasonable to the artisan and also to the person acquiring it.

In the ideal case, this would be true: Following classical economics, there would be some predictable demand left as you go up the price curve, assuming genuine quality & labour.

However, this doesn't seem to be the case. The existence of low-cost mass-produced options leads to the satiation of demand above that spot of the curve. There still exists demand, but it's weirdly lower than predicted. People can shop down the price curve, but relatively fewer would do the opposite.

ethanwillis
0 replies
7h26m

The point is being missed. The comment above mine was simply trying to make a strawman (I know, I know) out of the author's intent. The author surely wouldn't suggest we should be paying $1000 for a shirt. I don't think it was a reply in good faith, even though we're not supposed to suggest as much here.

Aside from that I do think you make a good point and I'll need to think about it.

Rumudiez
0 replies
22h31m

I have a few handmade shirts that were all between $70-100. I buy the occasional oddity from Etsy and those things (the ones I buy) are all handmade, and most of them are quite affordable – on par with shopping at Target, for comparison. I’m 100% certain the artists are thankful they can be professional creators instead of becoming wage workers or living off of some form of basic income.

MSFT_Edging
0 replies
22h52m

I bought a pair of 350 dollar boots that can be repaired many times.

That was five years ago, with a recent re-sole this past summer.

This isn't an exaggeration by the way, but the cobbler who did the resole thanked me for bringing him the job, as it was a genuinely enjoyable experience for him. I assume most of his work is repairing suitcases going by the other clientele in the shop.

But sure, exaggerate it into a claim that to be an artisan you need to be selling $1000 shirts, rather than performing a $100 service that doubles or triples the life of a decently made item.

xtreme
7 replies
1d

Buying handcrafted artisan stuff is a luxury few can afford. I come from a poor family and I was always grateful for mass produced mediocre stuff that we could actually afford.

MSFT_Edging
4 replies
22h57m

Sure, but the same optimizations that bring costs down also bring down the average laborer's value.

I'm not saying that things were sunshine and rainbows pre-industrialization, but there's some level of analysis to be done on the durability and value of a handcrafted piece of clothing, the care that goes into maintaining it, the value of a local economy, and the other side where you're forced to buy cheap items that degrade at a far faster rate.

If a town's local businesses are put out by a new walmart's ability to carry low prices, does the town truly come out ahead with those low prices? Or does Walmart simply extract more money from the town than it returns, leaving the town worse off?

esoterica
2 replies
17h4m

Those optimizations increase the value of labor by greatly increasing the amount of output per unit of labor. Do you understand how much wealthier the modern worker is than the pre Industrial Revolution peasant?

MSFT_Edging
1 replies
4h52m

Pre-industrial societies may not have had flat screen tvs, but they had time. The only thing you can't buy.

esoterica
0 replies
2h17m

They did not actually have time because they mostly died in infancy.

That said if you are satisfied with a pre industrial quality of life no one is stopping you from doing zero hours of work and just becoming homeless.

MacsHeadroom
0 replies
17h29m

bring down the average laborers value.

Labor has never been able to afford so much luxury as in the modern day.

qqqwerty
0 replies
22h16m

The capitalist system keeps you poor by design. And you feed the system by purchasing the mass produced garbage. Sure, it is nice to afford stuff when poor, but we don't need to live in a society where being poor is common, or where mass produced garbage is the default option for most.

dandelionsnow
0 replies
23h1m

Handcrafted artisanal stuff is a luxury because that's the only niche that makes economic sense for it now, given mass production and other recent developments (too lazy to list, sorry).

Consider how you can't really get by in most of America without a car because we designed our cities to require them. It would be a mistake to conclude that, because life is harder in a car-optimized society without a car, society must be better off optimizing for cars.

burningChrome
7 replies
23h31m

I often ask a similar question: what happens when we, as humans, have offloaded everything to technology, to AI, to robots? What kind of society will we have then? When you no longer have to think about how to do something, or how to build or repair something, or how to create something original from your imagination?

I shudder to think the direction this is all leading to.

machiaweliczny
2 replies
22h12m

Raising kids, social stuff, exercise, travel, any leisure stuff like racing, horse riding, acrobatics, entertainment, cooking, art, gardening, etc. Just check what rich girls do and you will see they aren’t bored even though they don’t have to work.

__loam
0 replies
20h16m

And I suppose we'll get to do all that stuff when all the value trickles down from the shareholders right?

Draiken
0 replies
1h54m

That'd be the dream. Unfortunately only the owners of all the AI production will enjoy that, while everyone else has to figure out a way to earn money and not starve by switching professions.

Exactly zero of that unlocked time/value will come back to the workers.

It simply can't work within capitalism. We've seen it before so many times that it's a given. It's a natural conclusion to this.

Textile workers don't get happy when a machine can replace them, even though it means they won't get injured anymore and a brutal job will get easier. It's objectively a good thing, but because our society ties all our livelihoods to work, they have to hate the machine.

Given how low standards are, executives will lay off swaths of people for a crappy AI very soon. I'm genuinely scared of how our world will cope with that.

jajko
0 replies
23h11m

Think about it in cycles. Handcrafted products were once frowned upon since you could see imperfections and flaws. Now they are the thing: cheap, identical, mass machine-produced goods often look bland, and you pay a massive premium for handmade work over them. There's no reason this won't repeat in some form again.

True art will be rare and treasured, and a few artists will be rockstars. Until the next fad, one we can't even see coming, mixes it all up again.

But yeah, these transitions do make tons of people miserable through lost jobs. The middle class is also disappearing overall, but that's been the trend for quite some time. It has always paid off to anticipate what the next generation of shovels will be.

brigadier132
0 replies
23h6m

I think what it leads to is a world without scarcity. I think all this fretting about meaning is missing the forest for the trees. Things like meaning are important, but making sure everyone can eat three meals a day, that your family has access to superhuman doctors, and that diseases are cured are all much more important.

__loam
0 replies
20h17m

Historically when this kind of stuff happens the result is usually a Revolution.

LegibleCrimson
0 replies
22h14m

I'm kind of an AI pessimist, but I think that sounds like it could be a wonderful society. When the only work you need to do is work that you actually want to do. It could free up people to actually chase their dreams and achieve what they want to, without having to constantly chase subsistence.

The reason I'm a pessimist is that I mostly see society preserving the status quo. I don't see AI democratizing things and freeing us, because we have so fetishized the concepts of work and profit that we can't imagine a society that functions properly without demanding those two things be put above all else.

I don't envision a dystopia or a utopia, I envision a future where AI disenfranchises people who should be taken care of, bolsters profits of the already powerful, and replaces the most fulfilling human pursuits without actually saving people from unfulfilling toil, mostly because society will bend backwards to try to preserve the status quo.

ardaoweo
4 replies
1d1h

If we got universal basic income, people could do whatever they want. I for one would be content spending my time gardening and trekking in the nature. I despise office work and do it only for money.

It's forcing the rich to give us UBI that is the problem.

MSFT_Edging
2 replies
1d1h

Sure but UBI would never afford a garden.

It would be bare minimum to survive, if that. We'd need a complete restructuring of society to even approach dignity via UBI.

pj_mukh
0 replies
23h52m

Unless of course the robots built the gardens thereby driving the cost of gardens down to amounts accessible by UBI. Or so goes the theory.

AndrewKemendo
0 replies
1d

Right! So lets get after it

Start a local cooperative and build value from the bottom up

pzo
0 replies
23h59m

There will be only a few countries that are winners in this game. Do you believe e.g. the USA will provide UBI for other countries, or even just neighbours like Mexico? I don't think so. And once there is mass unemployment, the US border will be filled with even more migrants and people unhappy with the situation.

onthecanposting
2 replies
17h49m

The economy clothes aren't great, but you can get a quality tailored suit for $4000. That's about 6 weeks' pay for a PFC in the US Army. A Roman legionnaire would pay about a gold ounce for a quality (at the time) set of clothing, however, at the cost of about 6 months' pay.

They don't build 'em like they used to, but those of us in the rank and file are getting better off. We can't all be "I think I will just buy Hawaii" rich like Zuckerberg, but quality of life is, overall, improving.

MSFT_Edging
1 replies
4h41m

I think there's a good point in between a $4000 tailored suit and a $0.13 t-shirt made of plastic that is designed for the landfill.

I'm unsure why we default to extremes in that regard. A decently made t-shirt should cost about $40, but right now the $40 shirt costs $0.10 to make. Therein lies the self-consuming nature of how things are going right now.

LunaSea
0 replies
16m

The problem is that we are missing the proof of this supposed quality difference.

Price does not necessarily imply quality (branding is much more of a factor).

Even if price did imply quality for certain brands, the question would be: how much better? Will it withstand 10x more wash cycles or last 10x longer?

In which case a consumer could make an actually informed decision on quality vs. price.

gen220
2 replies
19h15m

I have the same questions.

I’ve decided the best thing I can do is vote with my wallet (i.e. as a consumer) and vote with my time (i.e. as a producer).

Consume commodities that are regenerative and products that are durable. Whenever possible, buy local to keep cash flows (wealth transfer) local.

Invest your time in contributing to products that encourage other people to do the same.

Yvon Chouinard said (I’m completely paraphrasing) something like “in the 21st century, the most powerful vehicle for transforming the world is the American corporation. Whether or not it should be, it is. If you think the world should be different, step 1 is starting a business”.

As a producer (engineer, for me), the best thing I can do is build tools to help people wean themselves off of Capital.

esoterica
1 replies
16h56m

Autarkies are impoverished, societies that trade freely are wealthy. Your economic impulses are deeply misguided. If everyone only bought local the world would be much poorer. It’s better (consumer choice, comparative advantage) to buy from the world and to let the world buy from you. The money doesn’t disappear when you spend it outside your locality, it flows both ways.

gen220
0 replies
15h28m

I’m sorry, but by whose definition are “they” “impoverished”? And what constitutes “autarky” and “local”? I’m not advocating to abolish global trade or whatever, if you think I was gesturing in that direction.

Buying (food, especially) locally means supporting a robust food chain that’s less dependent on refrigeration, global shipping, and a geographic arbitrage on agricultural labor. It’s better for the environment and connects you to your community and sense of place and seasons.

It means I can know what went into the food, in terms of soil, labor, and pesticides — something that I have no knowledge of at the grocery store.

A pound of apples, meat, or whatever from 1-2 counties away is more financially expensive, but if you can afford it there’s many non-financial benefits.

I’d rather purchase from a CSA than a national grocer, because that money mostly goes back into my local community.

You’re right the money doesn’t disappear when you spend it outside your locality, but it comes back in ways that are increasingly exploitative and/or depressing (chain stores offering no meaningful careers or opportunities for ownership, rent-seeking financialization, warehouses that bring noise pollution and packaging waste, etc.).

Again, if you can afford it, IDK why you wouldn’t prefer to spend the marginal dollar locally (i.e. private / family-owned businesses in a few counties’ radius).

It’s consumer choice and comparative advantage that actually lead people like me to shop locally...

andruby
1 replies
8h4m

Long ago most humans spent their time and energy on farming so they wouldn't starve. We automated that with a couple of Agricultural Revolutions.

Then the majority of us spent our time and energy in factories. That got automated during the Industrial Revolution. So we moved to service and knowledge jobs.

And now we're potentially seeing the start of automation in those knowledge jobs. Will this be as painful a revolution as the Industrial one? Where will we spend our time and energy next?

Where will we be in the future?

A German named Karl wrote a book with an answer to that question 150 years ago.

https://en.wikipedia.org/wiki/Das_Kapital

MSFT_Edging
0 replies
4h46m

I'd love to live in a Communist society. I think our technological capabilities are to the point where a centrally managed economy is not only possible, but extremely doable.

The main issue being, our governments and society have made it clear that the opinions of those who hold Capital are far more important than the unproductive surplus. Our entire society now is based around consuming cheap products we don't need, because the previous cheap products wore out prematurely. The focus on a profit motive tosses aside great discoveries that are unable to turn a profit, while encouraging useless advancements that only degrade our lives.

E.g., advertisements make money, but I've never met someone who believes their life has been improved by ads, except for a minority of shareholders.

On the other hand, there have been many medical advancements all but scrapped because the lack of profit outweighed the human benefit.

The profit motive is incompatible with a robust society. I won't pretend to have the answers on what should be done and how to implement it, but I know that our current system cannot last forever.

plz-remove-card
0 replies
14h49m

Humans seek work that provides satisfaction and meaning in their life.

we keep destroying what allows humans to seek meaning.

Our governments and society have made it clear, if you don't produce value, you don't deserve dignity.

Those points are really interesting to me, I think you're right and I've never looked at it that way before.

Buttons840
0 replies
20h41m

Just pick a path, doctor, artist, therapist, any path will do. You'll soon realize you're better than the AI, but nobody will give a shit, they'd rather have the cheap AI knockoff.

steve_adams_86
20 replies
1d1h

Although the demos are impressive, they seem short and limited in scope, which makes me wonder how well this will work outside of these planned cases. Can it do software architecture at all? Is it still essentially just regurgitating solutions? How often will the solution be only 90% correct, which is 100% not good enough?

Even so, I realize the demos are still broad in scope and the results are incredible. Imagine seeing this even 2 years ago. It would seem like magic; you wouldn't be able to believe it. Today, this was inevitable and entirely believable. There will be even better versions of this soon.

swalsh
17 replies
1d1h

Ah yes, you've entered the first stage of grief: denial. Next you'll get angry, start bargaining, and become depressed; eventually you'll just accept that AI is taking over software.

In my mind, I've concluded that I have less than 3 years to find an off ramp.

mnk47
3 replies
17h45m

It took me much longer than usual to get past the depression phase. The main reason is that everyone online kept telling me I was dumb, too inexperienced or extremely incompetent if I thought AI was a serious threat to me, a junior developer. Friends offline said similar things, just in a nicer way.

Now everyone seems to be a "doomer" like me. I was too stupid to see that people (especially online) were lashing out at me as a coping mechanism. The worst part is that even though my "we're fucked" realization came earlier than many others', I didn't really do anything productive, I didn't make the right moves to pivot, I had no plan. Now I have a rough ~3-year off-ramp plan, just like everyone else.

I suspect that there are many, many recent grads in my position, who also thought they were taking crazy pills when everyone around them dismissed their concerns.

swatcoder
2 replies
17h33m

If the circle you're listening to is "recent grads", stop listening to them.

They don't know how the industry works or what it needs, few of them know how the technology works or what its realistic near-term prospects are, let alone the long-term ones, and none of them have lived long enough to understand the pace of a technology moving from discovery through to maturity and commercialization.

If you look to your experienced seniors instead of your peers, you'll discover two things:

1. They don't respect most contemporary juniors because of the industry boom and are happy to see most of you scare yourselves off

2. They're homing in on a familiar, respectful appreciation of recent content generators and their potential for assisting some basic use cases (highly valuable technology!), and are increasingly of the opinion that both the hype and fear of Q1 2023 were largely market manipulation by VCs trying to squeeze another boom out as money was tightening

sandspar
0 replies
10h2m

Experienced, older people are notorious for downplaying and dismissing the implications of revolutionary new technologies. They may be right this time, or they may be wrong. But age and experience aren't as relevant when it comes to revolutions.

mnk47
0 replies
16h19m

Well, I did see that, or at least people who claimed to be seniors online, plus some mid-level friends and yes, a junior friend. Admittedly, it's just anecdotes, but it's very difficult to get hard facts, even from perusing bls.gov.

But something changed in the past 4 months or so. I used to see the two things you described, but now I see seniors repeating the same thing juniors (whom they ridiculed) used to say. Look at this very thread, for example. People are starting to realize that even if these AI startups are making ridiculous promises, the tech might be way more than just hype/smoke and mirrors.

I've read quite a few comments in HN, and right here in this thread, that say something along the lines of "I'll be fine... It's the juniors I'm worried about".

4star3star
3 replies
1d1h

What kind of work do you have in mind?

swalsh
2 replies
23h48m

In terms of my "off ramp"?

I have a multi-part plan. Immediately, I'm working to get closer to the business, to be closer to the position of defining requirements rather than implementing them.

Secondarily, I'm experimenting with ideas I hope can become a business.

As a final fallback, I have a hobby woodshop in my basement, and I love making furniture.

chpatrick
0 replies
23h27m

What about when the AI defines requirements?

bigfishrunning
0 replies
22h46m

Making furniture! But we've had machines to do that for 100 years!

Programmers will be fine. The AI plagiarism engines are severely limited, and will be for the foreseeable future. Maybe someday I'll be the equivalent of the ren-faire blacksmith, but I'm gonna do this until I die.

steve_adams_86
2 replies
1d1h

No, I'm past the denial stage (I was certainly there, though... GPT threw me hard and I spent a good 2 months processing what was happening) but I don't, in this specific case, see this agent displacing many jobs yet. Well, not my kind of jobs. I'm already very worried for the entry level of our industry... I'm not sure what it means yet, but I don't think we will have many entrants into software careers within 5 years or so.

I'm looking for an offramp as well. I truly love software so it's a hard reality to contend with at times. Regardless, I'm not a software engineer at my core. I'm a problem solver, and I love creating things. This is transferable. I'm not sure where to yet, but I'm certainly capable of making a move. I hope I have more than 3 years, though.

As someone else asked, I'm curious where you're headed or thinking of heading so far.

swader999
0 replies
23h45m

Keep posting about this. I feel the same way.

joshuahutt
0 replies
22h18m

I feel the same way.

I think the immediate future is bright. Some will try to cram this technology into enterprise. It will do well. Those jobs will die. Others will leverage it alongside the more creative engineering tasks—they will thrive.

Eventually, what we call software will change from what it is today into something much more accessible to these types of tools. The plateau we've landed on is just a compromise between the technology, economies, and culture of its time.

As this type of tech pervades our everyday lives, much of the widespread need for specialized software will evaporate. The remaining work will be in the corners or the edges of what's possible—highly technical or vertically integrated work (particularly, hardware-integrated stuff), as well as platform engineering to sustain the new paradigm.

empath-nirvana
1 replies
23h19m

There is not a limited amount of software engineering that can be done. There's only the amount of software engineering that is _economical_ to do at any given point. If AI makes software development cheaper and more efficient, people will just apply it to more use cases. It's never been the case that making programming cheaper has led to fewer programmers -- quite the opposite.

This change is roughly analogous to the shift from punch cards to compilers. It's just a more "natural language" way to do programming. A lot of the drudgery associated with coding will go away and competent programmers will shift to higher level design work.

Even in a future where AI is better than human software engineers at every single programming task (which I don't believe will be the case any time soon), there is still comparative advantage. AI will not have the _capacity_ to do every single programming task and there's going to still be lots of work for people to do.

sandspar
0 replies
10h7m

The problem is that AI has the potential to turn a professionalized industry into a self-help one. Like, if I can pay $30 a month to have an AI make me several new apps a day, why would I need an app store? It's like fast fashion for programs. Low quality but dirt cheap.

zeroonetwothree
0 replies
11h16m

Many things have been claimed to be the "death" of software careers (.com crash, outsourcing, remote work, etc.) and yet we have more demand than ever.

realusername
0 replies
10h26m

It's the AI hype group which are in a stage of denial.

"Surely if we generate code, it means we can replace developers right?"

Well no, it's not nearly enough. Good code generation with a UI has been a thing since the 90s; it's replacing developers about as much as FrontPage and WordPress do.

The average developer writes 100 lines per day anyways so the speed of writing those lines doesn't matter.

Software is an interpersonal job at core and the most precise definition of a business.

The day software engineers are replaced by a machine, there's honestly no job left for the CEO either, and that's also why this job is well paid. I'm sure it will be all over LinkedIn though, despite this paradox.

Those new tools will be there in the background but cannot replace a software engineer.

pzo
0 replies
1d1h

One year ago I was thinking ~10 years, with at least 5 safe years. Considering how it's all progressing right now, and that we got ChatGPT just 15 months ago, I think you might be right, or salaries will get reduced significantly.

breadsniffer
0 replies
23h41m

I've been using GPT-3 since its waitlist was out. Even then you saw demos like this claiming sentence -> full complete project. The demos will get fancier, but reality is much more complex than you think.

breadsniffer
0 replies
23h43m

This seems like something Replit is better suited to execute. You will need human intervention at some point, and at that point you're building a full-fledged web IDE.

andoando
0 replies
1d1h

There is a similar product called Sweep AI that I tried. For extremely simple things like "add a button to the page that prints hello" it was very good. I then asked it to do something more complex, which was to render my d3.js graph vertically rather than horizontally, and it tried to redefine constant variables (it just added a new modified code block without deleting the old one) and put function clauses in places that were not syntactic. After I manually fixed those, the functionality just didn't work.

singularity2001
8 replies
22h35m

Interesting: The last demo on the blog took 2.5h to complete: https://www.cognition-labs.com/blog https://www.youtube.com/watch?v=UTS2Hz96HYQ "Devin's Upwork Side Hustle"

I wonder how much of this time was consumed by manually steering Devin in the right direction, manually fixing and undoing the mess Devin produced, and watching Devin burn through $$$. As others said, being completely non-transparent about this burns a bit of trust, but I'd really like to know where we are right now. Since Devin is currently "invite only demos", a more realistic peek into the state of the art can be seen here: https://docs.sweep.dev/blogs/gpt-4-modification

My gut feeling (and limited experience): GPT-4 and other models are not quite there yet, but whoever prepares for the next generation of models now will eventually win big time. Or be replaced by simpler approaches.

joshuahutt
7 replies
22h23m

We're solving the wrong problem.

People trying to use cars to pull horse carts are doomed to fail.

Trying to use AI to build the software of yesterday is a waste of time.

singularity2001
6 replies
22h21m

So what should the AI build instead? Specifically with regard to UI. I don't want my banking app to run on a bunch of non-deterministic prompts.

joshuahutt
3 replies
22h2m

Great point. I have no idea. What are your banking use cases?

For me, it's mostly information I want. I don't really need a full app for that. I want to know:

1) How much money do I have?

2) Did a check I cashed clear yet?

3) Are there any unusual charges? How is my spending this month?

4) Anything I should look into?

For actions I'd want to take:

1) Deposit check

2) Transfer money from account to account

3) Make a payment / authenticate with new EFT payee

All of that could be done conversationally, with a flexible level of detail. The data could be shown on any number of shared interfaces (messaging app, dedicated companion app, etc).
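
To make that concrete, here's a rough sketch of how those actions could be exposed to a model through an OpenAI-style function-calling schema. Every name and field below is made up for illustration; nothing here is any bank's real API:

    # Sketch: banking actions exposed as tools for an LLM (hypothetical schema).
    tools = [
        {
            "type": "function",
            "function": {
                "name": "get_balance",
                "description": "Return the current balance of an account.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "account_id": {"type": "string", "description": "Account identifier"},
                    },
                    "required": ["account_id"],
                },
            },
        },
        {
            "type": "function",
            "function": {
                "name": "transfer",
                "description": "Move money between two of the user's accounts.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "from_account": {"type": "string"},
                        "to_account": {"type": "string"},
                        "amount": {"type": "number", "description": "Amount in dollars"},
                    },
                    "required": ["from_account", "to_account", "amount"],
                },
            },
        },
    ]
    # The model only parses intent ("did my check clear?", "move $200 to
    # savings") into one of these calls; deterministic, audited code on the
    # bank's side actually executes it.

The non-determinism stays in the intent parsing; the money movement itself remains ordinary, testable code.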

It doesn't obviate the need for software, and there are a ton of software ideas that exist today that will still be useful, as well as ideas that have yet to be discovered that will be useful. But...I expect the LOB app to go extinct in the next couple of decades.

PodgieTar
2 replies
21h3m

This sounds like hell.

Why on earth would you want that? Open the app, go to the account, enter the money with specificity, select the account to transfer it to, click the button.

Sure, being able to say "Transfer 200 to Steve" is nice and all but.. I just don't... consider it much better than the process we have today?

theshackleford
0 replies
19h18m

Sure, being able to say "Transfer 200 to Steve" is nice and all but.. I just don't... consider it much better than the process we have today?

If I could actually do that verbally, that would be a huge improvement. Too bad it doesn’t work like that.

joshuahutt
0 replies
18h27m

Just because the process today is "good enough" doesn't mean that it's a worthwhile use of developer time to create that process.

But, you're highlighting one of the challenges of changing the status quo. The "new thing" has to be significantly better than the old, otherwise people won't immediately want to switch.

Similar arguments were probably made when the iPhone removed the physical keyboard.

sergiotapia
1 replies
22h9m

Akin to alchemy, spring up the UI to solve the user's problem. When timelines are shortened from weeks to hours, what can we build?

Can a user just talk to a computer and solve their problem regardless of the platform?

joshuahutt
0 replies
22h0m

Exactly my point. Why do I need a custom UI for every LOB task under the sun? Just let me use a common interface to address all manner of uninteresting data problems. The UI goes away, or fades into the background, and the focus rests solely on the information I need, and the decisions I make, which I can dive deeper into with a focused AI companion.

Seems like a no-brainer.

Maybe folks LIKE clicking on buttons and going through 10-step procedures to get tasks done.

Some mice like the maze more than the cheese, I guess.

dakiol
8 replies
22h58m

Don't get it. If we have this amazing AI, why don't we make good use of it? 90% of my job as a senior software engineer is not writing code; it's to:

- deobfuscate complex requirements into well-divided chunks

- find gaps or holes in requirements so that I have to write the minimal amount of code

- understand codebases so that the implementation fits nicely

I don't need an "AI software engineer", I need an "AI people person who gives me well defined tasks". Now sure, if you combine those two kinds of AIs I could perhaps become irrelevant.

random_cynic
5 replies
17h54m

Yes, the horse-carriage drivers had similar lines of thought when they saw first-gen automobiles.

earwin
4 replies
17h32m

Your point being? The horses were replaced, all right; carriage drivers have been thriving ever since.

random_cynic
3 replies
17h6m

Lmao what are you talking/coping about? What happened to carriage drivers is pretty well-documented. Maybe ask one of these AI chatbots, they will summarize it for you.

zeroonetwothree
1 replies
11h19m

They drive for Uber now

Draiken
0 replies
1h49m

And that's thriving by what measure?

No employment stability, no benefits, no health insurance, no vacations, at complete mercy of basically a single company.

Yeah, thriving.

earwin
0 replies
1h45m

Thank you very much good sir for your kind advice.

I, in turn, suggest looking out of the window: modern carriage drivers are called truck drivers and their number is edging towards 10M worldwide.

Now add to this rail carriage drivers, sea carriage drivers and air carriage drivers. Years of progress changed the way you feed your "horse" and made reins somewhat more complex, but fundamentally nothing changed.

gardenhedge
0 replies
21h51m

The problem is getting enough information on requirements to even break them down :)

EForEndeavour
0 replies
4h13m

If we have this amazing AI why don't we make good use of it?

Aren't we all commenting on Devin's initial announcement, published yesterday? Give it some time. It's a safe bet that a lot of companies which employ software engineers are currently tripping over themselves to get on the Devin waitlist, and that the inevitable superior iterations on the idea of an LLM-powered dev will start coming out, probably later this week.

senko
6 replies
1d

As someone who works in this space (https://pythagora.ai), I welcome new entrants to this niche.

Currently, mainstream AI usage in coding is at the level of assistants and glorified autocomplete. Which is great (I use GitHub Copilot daily), but for us working in the space it's obvious that the impact will be much larger. Besides us (Pythagora), there's also Sweep (mentioned by others in the comments) and GPT Engineer who are tackling the same problem, I believe each with a slightly different angle.

Our thesis is that human in the loop is key. In coding, you can think of LLMs as a very eager junior developer who can easily read StackOverflow but doesn't really think twice before jumping to implementation. With guidance (a LOT in terms of internal prompts, and some by human) it can achieve spectacular results.
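
To make "human in the loop" concrete, here's a toy sketch of the mechanic, with placeholder functions standing in for the LLM calls (this is not Pythagora's actual implementation):

    # Toy human-in-the-loop gate: every model-proposed change needs explicit
    # approval, and reviewer feedback is folded into the next attempt.
    def propose_change(task):
        """Placeholder for an LLM call that returns a proposed diff."""
        raise NotImplementedError

    def apply_change(diff):
        """Placeholder that applies an approved diff to the working tree."""
        raise NotImplementedError

    task = "add pagination to the /orders endpoint"  # made-up example task
    diff = propose_change(task)
    while True:
        print(diff)
        answer = input("Apply this change? [y/n/feedback] ").strip()
        if answer.lower() == "y":
            apply_change(diff)
            break
        if answer.lower() in ("n", ""):
            break
        # Anything else is treated as feedback and drives another attempt.
        diff = propose_change(task + "\nReviewer feedback: " + answer)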

ukuina
3 replies
19h47m

Pythagora and similar frameworks are cool and very useful in the short term, but... large-context models obviate the need for RAG and callgraph-augmented generation.

Why would agents need multiple turns and a framework like Pythagora in a (near-future) world with GPT-4 levels of capability and 10M+ token contexts?

senko
0 replies
19h36m

The problem is not (only) the context size, it's (for lack of a better word) focus. GPT4 can easily get lost in too much information and produce results that don't work well (duplicate code or just incoherently solving a problem), and it needs a lot of handholding.

Imagine GPT4 (or any other LLM) as a very eager but not very bright junior developer that just started to work with you. It's good, but it'll need a lot of situational management for it to not go wildly off the track.

What improved models will bring us in the future is making Pythagora and other such tools work better for large and more complex projects. The tipping point will come when for example you'll be able to load Pythagora in Pythagora and continue development. While we do build some auxiliary/external tools with Pythagora, the core is still handcrafted mostly by a human, and I'm pretty sure that's the case with other code gen tools as well.

nsokolsky
0 replies
16h19m

I've gotten access to Gemini 1.5 Pro today with a (supposedly) 1M context window and unfortunately in practice it's not good at complicated queries over such a large window, involving multiple recursive lookups back and forth.

hackerlight
0 replies
14h4m

Why would agents need multiple turns

Because it's better. This was the insight behind AlphaGo, AlphaCode and AlphaGeometry. In Go, trying to get the entire solution in one forward pass required a 1000x larger neural net to achieve the same performance according to Noam Brown.

Intelligence requires iteration, introspection, planning, sub-task breakdown, and trial and error with feedback from reality (error messages, print statements). These meta reasoning structures (whether hard-coded or, ideally, learned) will move LLMs closer to how humans reason.
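
As an illustration of that last point, the simplest form of "feedback from reality" is a loop that runs the generated code and feeds the errors back into the next attempt. A toy sketch, with generate_code standing in for any LLM call:

    import subprocess
    import tempfile

    def generate_code(task, feedback=None):
        """Placeholder for an LLM call: return candidate Python source for
        the task, optionally conditioned on the previous attempt's errors."""
        raise NotImplementedError

    def solve_with_retries(task, max_attempts=5):
        feedback = None
        for _ in range(max_attempts):
            source = generate_code(task, feedback)
            with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
                f.write(source)
                path = f.name
            # Run the candidate and capture stderr as "feedback from reality".
            result = subprocess.run(
                ["python", path], capture_output=True, text=True, timeout=30
            )
            if result.returncode == 0:
                return source  # ran cleanly; hand it to a human for review
            feedback = result.stderr  # error messages drive the next attempt
        return None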

jprete
1 replies
1d

While you sound reasonable, I can't tell the difference between an honest opinion and a sales pitch here.

senko
0 replies
23h52m

I am making a sales pitch on behalf of all the projects I mentioned (not just the one I'm involved with).

I see LLMs failing at coding daily (one of the "perks" of working in the space), and I'm incredibly bullish on this approach.

And I don't think it'll replace humans or junior engineers. As programmers, we've been "replacing" ourselves since the days of assembler that replaced direct machine coding. This is just another iteration of it.

(if you do want a sales pitch, here's one: https://twitter.com/senkorasic/status/1765769482985722267 )

Oras
6 replies
22h50m

From their twitter:

When evaluated on the SWE-Bench benchmark, which asks an AI to resolve GitHub issues found in real-world open-source projects, Devin correctly resolves 13.86% of the issues unassisted, far exceeding the previous state-of-the-art model performance of 1.96% unassisted and 4.80% assisted.

While it is progress, it's far away from being useful as a software engineer.

pstorm
3 replies
22h35m

What percent does an average junior engineer solve? If it is even close, these models can be run all day and night for cheaper than one yearly SWE salary.

jonahx
1 replies
22h8m

The problem is that you still need a human in the loop to determine if you're in the 13% success bucket, or 87% failure bucket, and the time it takes to make that determination is still a significant fraction of just solving the problem.

So the actual value here is not "13% of all issues fixed for the cost of compute," but more like "a discount on human time for 13% of the issues". But you also have to factor in the time spent on the 87% of issues where being led down a wrong path can add time versus a human working alone. It's not clear to me how it all shakes out; it would require large-sample experiments with humans to determine. I would bet the final margins are small, though.
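
To put numbers on "how it all shakes out", here is the back-of-envelope model. Every figure below is an invented assumption, just to show the shape of the tradeoff:

    # Toy expected-time model: "AI attempt + human review" vs "human only".
    # All numbers are made-up assumptions for illustration.
    p_success = 0.13   # fraction of issues the AI resolves correctly
    t_review  = 0.5    # hours for a human to verify an AI attempt
    t_solve   = 2.0    # hours for a human to solve an issue from scratch
    t_penalty = 0.5    # extra hours lost when a wrong attempt misleads you

    human_only = t_solve
    with_ai = (p_success * t_review
               + (1 - p_success) * (t_review + t_solve + t_penalty))

    print(f"human only: {human_only:.2f}h, with AI: {with_ai:.2f}h")
    # With these assumptions the AI workflow loses (~2.68h vs 2.00h); it only
    # wins as p_success rises or review gets cheaper.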

pstorm
0 replies
21h5m

You raise a good point: AI + human review might end up taking more time than just a human doing everything. I can see a certain subset of issues being simple enough to be done by AI plus a quick review, like changing a button color or fixing clearly defined bugs. Time will tell how much work gets shifted over to AI + human review, but I'm betting on most of it.

swatcoder
0 replies
22h25m

Juniors already have bad and sometimes even negative ROI, but today's working junior is the trusted engineer of tomorrow and the senior of the day after that. The problems they work on impart the knowledge and instincts that advance them toward mastery and real value.

Budget-myopic executives already tried transferring that work to cheaper labor markets, but it worked out far worse than they expected, and most ended up with unmaintainable software and no hope of an actual engineering advantage over competitors. There's nothing new here.

There will be organizations that find a good and smart use for fully automated code generation, just like there is for outsourcing/offshoring, but it's not a universal win to just go with what's "cheaper" and organizations that don't look at the big picture are (as usual) trading short-term accounting gains for long-term value erosion.

typon
0 replies
22h36m

13% unassisted is crazy. That's probably half the performance of an intern that costs ~100k/year.

mpalmer
0 replies
22h27m

Unless humans themselves were tested on a benchmark, the benchmark data doesn't help us compare the model to human performance.

Of all the SWEs out there that draw a salary, how many do you think would improve on this 14% unassisted figure?

ridruejo
5 replies
1d3h

Given the excitement on X right now about this, I don't understand how this is not on the front page already :)

mpg33
1 replies
1d2h

I could take a guess...

cal85
0 replies
1d1h

Say more?

sohzm
0 replies
1d2h

There's another post; both are not really getting much attention.

I really wanted to see what hacker news had to say on this :(

pzo
0 replies
1d1h

It was on the front page briefly and somehow got buried.

nycdatasci
0 replies
1d2h

@dang: did this get flagged as spam by chance?

rafadc
4 replies
1d1h

Prepare for a lot of copycat companies. Hey Devin, copy this company's software.

ukuina
0 replies
16h32m

Hey Devin, copy Devin?

shombaboor
0 replies
1d1h

The most successful software will be a result of marketing and going viral (e.g. being featured on HN) if the AIs just constantly clone each other's apps.

anonzzzies
0 replies
1d1h

It will try, and you'll end up with nothing but a bill from using Devin.

_factor
0 replies
1d1h

So basically taking money away as an obstacle? It’s not illegal to copy an application. Look at Instagram’s stories compared to Snapchat.

The only difference is that there might be a little less profit incentive now. Perhaps we’ll get some interesting ideas from people who never would have been able to create them.

pushedx
4 replies
23h16m

Scott Wu! I met Scott at a competitive programming event a few years back.

He is one of a very small group of people (going back to 1989) to get a perfect raw score at the IOI, the olympiad for competitive programming.

https://stats.ioinformatics.org/people/2686

Glad to see that he's putting his (unbelievable) talents to use. To give you a sense, at the event where I met him, he solved 6 problems equivalent to Leetcode medium-to-hard problems in under 15 minutes (total), including reading the problems, implementing input parsing, debugging, and submitting the solutions.

zeroonetwothree
1 replies
11h19m

I've participated in programming olympiads and worked with people of varying skill and I would say that overall the correlation between competitive programming and software engineering in a business environment is probably like 0.2 or less.

asteroidz
0 replies
1h43m

0.2 or less

I find that questionable. What does "software engineering in a business environment" require that a competent competitive programmer couldn't also learn?

gardenhedge
1 replies
21h52m

Sounds like he's talented. But isn't Devin "just" an AI wrapper tool? Devin's play is to be the first comprehensive option available, but it will soon be eaten by OpenAI, Microsoft, Google, and countless others.

gitfan86
0 replies
21h25m

Yes, but AGI will first emerge from keeping state between calls to multiple models, assessing how closely they resemble human intelligence, and using a loop to keep it going and update the state. Which is basically what they are doing here.
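
A toy sketch of the pattern being described, with placeholders for the model calls (this is speculation about the shape of such systems, not Devin's actual design):

    # Persistent state threaded through a loop over calls to multiple models.
    def planner(state):
        """Placeholder LLM call: propose the next action given the state."""
        raise NotImplementedError

    def critic(state, proposal):
        """Placeholder LLM call: review a proposal, e.g. return 'APPROVED'."""
        raise NotImplementedError

    state = {"goal": "resolve issue #123", "history": []}
    for step in range(10):
        proposal = planner(state)
        review = critic(state, proposal)
        state["history"].append({"proposal": proposal, "review": review})
        if "APPROVED" in review:
            break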

hackerlight
4 replies
1d1h

This is where inference speed starts to matter. An H100 might be cheaper per inference than Groq, but cutting the wait time from 1 minute to 10 seconds could be a big deal.

anonzzzies
3 replies
1d1h

Have you tried Groq? We spent a few days testing it as a replacement for gpt-4-turbo and, while incredibly fast, the results were horrible, even after a lot of specific prompt engineering. So many hallucinations and such. Our products all have to do with strict generation and software quality; it basically has to fill in the blanks, but it was incredibly hit or miss. Some results came in within a second, so even a few iterations beat GPT-4 when correct, but some needed so many iterations (we quit) that GPT-4 beat it hands down.

simonvc
1 replies
1d

They just run other open models, so your complaint isn't about Groq; it's about GPT-4 vs an accelerated Mixtral 8x7B.

anonzzzies
0 replies
1d

Sure, so when OpenAI moves to Groq it might be something. Groq with the current models is impressive but doesn't work for us, is what I'm saying. As I don't actually have access to other models on Groq, this is Groq as it stands.

hackerlight
0 replies
17h11m

Groq is hardware not software... It's like saying the H100 hallucinated.

RyEgswuCsn
4 replies
22h5m

If you need AI to help you program an algorithm, then you shouldn't be using it because you can't tell if AI's solution is correct.

If you can tell if a solution is correct or not --- well, then you don't need to have AI write it for you.

I think AI programming can only work once the industry begins to treat "almost working" systems backed by human customer service as acceptable.

Voloskaya
2 replies
21h59m

If you can tell if a solution is correct or not --- well, then you don't need to have AI write it for you.

Did you just solve P=NP?

Many things are trivial to verify, but hard/time consuming to code up.

You probably shouldn't rely on this to write critical software, no matter the amount of manual QA you throw at it afterwards, but there is an abundance of non-critical use cases where you can quickly check if a solution is good enough for what you care about.

zeroonetwothree
0 replies
10h56m

Most problems in "P" are not "trivial".

RyEgswuCsn
0 replies
21h16m

What I meant to say is that most people can only verify an algorithm is correct if they already know the correct solution.

If they already know the answer then it’s probably more efficient if they write it themselves rather than having AI produce a potentially difficult to verify answer and try to verify it.

rewgs
0 replies
20h32m

This. And at a certain point, a prompt might become so specific that you might as well just write the code yourself. After all, a prompt is instructions for a computer, as is code.

Bjorkbat
4 replies
1d2h

I mean, this might just be existential cope, but my first thought when looking at the Upwork demo posted on Twitter (https://x.com/cognition_labs/status/1767548768734294113?s=20) was that it seemed a little bit suspicious.

Namely, the client had an unusually specific (for Upwork) request. It was an almost perfect example of a job to be given to an AI agent for testing purposes.

zeroonetwothree
0 replies
10h54m

It doesn't seem like Devin actually did the work requested, which was to provide instructions for using AWS?

swalsh
0 replies
1d1h

You're looking at the worst version that will ever be released. If these guys don't get better over time, someone else will.

supafastcoder
0 replies
1d1h

To be fair, a lot of those Upwork job requests are now being written with AI…

HanClinto
0 replies
1d1h

I tend to agree.

I really want to see what this does with an open-source backlog. The more mundane (yet critical) of a project, the better.

What would be good projects to feed it? I suggested Mozilla and llama.cpp, but there's got to be something better as a use-case for it.

datavirtue
3 replies
1d1h

This is awesome. Until Devin steals your startup idea.

swalsh
2 replies
1d1h

Software is a commodity, find something more valuable to differentiate your startup.

shombaboor
1 replies
1d1h

Just need to get an A-list celeb like Ryan Reynolds involved and people will follow

gowld
0 replies
23h24m

You'll need an AI-list celeb. The A-list is dead.

goat_whisperer
2 replies
1d

People who try to draw historical analogies to AI replacing humans say things like:

"cars replaced horse drawn carriages. But we managed to adapt to that, the carriage drivers got new jobs."

My dudes. We are the HORSES being replaced in this scenario.

zeroonetwothree
0 replies
10h57m

How about tractors replaced humans plowing fields? Or literally any other example of automation...

wetmore
0 replies
1d

I don't get your pessimism, after we are replaced we can all work at the glue factory :)

crucialfelix
2 replies
22h59m

I have several really long django views files (3k lines!) in my codebase. They were written poorly, with many nested if statements for parsing and error handling.

On a one-by-one basis I can use GitHub Copilot in VSCode to rewrite each function the way I want it.

What I want to do is iterate through all the functions in these files and do that for each of them.

I know we are getting there, but does anybody know how that can be done right now?
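
For the mechanical half of that, a rough sketch of how it could be scripted today (`rewrite_with_llm` is a hypothetical stand-in for an actual model call; decorators and module-level code would need extra handling):

    # Split a long views.py into top-level functions and rewrite each via an LLM.
    import ast

    def rewrite_with_llm(func_source: str) -> str:
        return func_source  # hypothetical: call your model of choice here

    source = open("views.py").read()
    lines = source.splitlines(keepends=True)

    for node in ast.parse(source).body:
        if isinstance(node, ast.FunctionDef):
            segment = "".join(lines[node.lineno - 1 : node.end_lineno])
            print(rewrite_with_llm(segment))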

elietoubi
0 replies
22h54m

Have you tried cursor.sh? Not affiliated with them, but it's actually pretty incredible for long context.

Eager
0 replies
22h53m

Give Claude 3 Opus a shot maybe.

One of the reasons I have stayed well clear of the IDE tools is they force me to use their own model.

While they might be convenient it means I can't switch to whatever the SOTA model of the day is at the drop of a hat.

Opus is awesome and well worth a shot.

HarHarVeryFunny
2 replies
22h56m

Let's get realistic here - I just beat GPT-4 at tic tac toe, since it failed to block my 2/3 complete winning line ...

Sure, one day we'll have AGI, and one day AGI will replace many jobs that can be done in front of a computer.

In the meantime, SOTA AI appears to be an airline chatbot that gets the company sued for lying to the customer. This is just basic question answering, and it can't even get that right. Would you trust it to write the autopilot code to fly the airplane? Maybe to write a tiny bit of it - just code up one function, perhaps?

I sure as hell wouldn't, and when it can be trusted to write one function that meets requirements and has no bugs, it's still going to be a LONG way before it can replace the job of the developers who were given a task of "write us an autopilot".

sohzm
1 replies
13h23m

Well, I'll bring the perspective of a recent college graduate from India.

Most students who started jobs in software aren't writing software for aeroplanes. They're writing CRUD apps, doing third-party API integrations or basic debugging.

I worry that, while some specialized software devs may be fine, a lot of others will still have problems because of tools like this.

I'm not talking about today, but 2-3 years down the line: who will hire an intern when you can get an AI intern that performs in the top percentile and costs $20 per month?

HarHarVeryFunny
0 replies
5h14m

Certainly some software jobs are easier than others, and anything that is more just coding (e.g. CRUD apps) than designing would be easier to automate. It's interesting, though, to ask why this wasn't done long ago. I remember a piece of software called "The Last One" from the 1980s that was meant to automate the creation of simple business apps like these, and here we are 40 years later, with people still creating "no/low code" solutions and still manually writing CRUD apps! Why?

I don't think AI will replace jobs until it can do the entire job (full lifecycle from requirements to bug fixes, etc) and be interacted with in the same way a boss or team lead could interact with a developer. If you still need a person in the loop then it's not a person replacement - it's a productivity tool.

syedmsawaid
1 replies
23h7m

Is it built with pre-existing LLMs or did they create one from the ground up? With $21 million in Series A funding, an LLM more powerful than GPT-4 seems impossible. What am I missing?

martinesko36
0 replies
22h49m

Yeah seems like this could be replicated by another AI dev fairly easily.

pedalpete
1 replies
20h10m

I'd really like it if Cognition Labs would put the resulting code from the demo into an open-source repository so we could examine it directly.

When I was using ChatGPT to help guide me through some coding tasks, I found it could create somewhat useful code, but where it fell down was that it would put things into loose variables that would have been better organized into a class. It is this structuring of a complete system which is important for any real software engineering, rather than just writing code.
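
For instance (a contrived illustration, not the actual demo code):

    # Loose module-level variables, the kind of output described above...
    sample_rate = 44100
    channels = 2

    # ...versus grouping the related state into a class.
    from dataclasses import dataclass

    @dataclass
    class AudioConfig:
        sample_rate: int = 44100
        channels: int = 2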

senko
0 replies
19h43m

> I'd really like it if Cognition Labs would put the resulting code from the demo into an open-source repository so we could examine it directly.

Yup. It's hard to evaluate things based on the demo.

We're building something similar (with an open source core), and publish our examples for everyone to check out, warts and all: https://www.pythagora.ai/examples

nsypteras
1 replies
1d1h

Clearly an extremely impressive demo and congrats on the launch. I do wonder how often the bugs Devin encounters will be solvable from the simple fixes that were demonstrated. For instance, I notice in the first demo Devin hits a KeyError and decides to resolve it by wrapping the code in a try-catch. While this will get the code to run, I immediately imagined cases where it's not actually an ideal solution (maybe it's a KeyError because the blog post Devin read is incorrect or out of date and Devin should actually be referencing a different key altogether or a different API). Can Devin "back up" at this point and implement a fix further back in its "decision tree" (e.g. use a different API endpoint) or can it only come up with fixes for the specific problem it's encountering at this moment (catch the KeyError and return None)?
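
To make the two kinds of fix concrete, a contrived sketch (the keys here are invented; the demo's actual code isn't public):

    resp = {"statuses": [{"text": "hello"}]}

    # Shallow fix: swallow the KeyError so the code runs.
    try:
        tweets = resp["results"]
    except KeyError:
        tweets = None

    # Root-cause fix: the key itself was wrong for this API version.
    tweets = resp["statuses"]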

mikebelanger
0 replies
23h21m

Yeah, that was my question too. It's one thing to know the simplest fix for a KeyError; it's another to understand that it's the result of not assigning the proper key in some other part of the code, or, like you said, of calling the wrong API endpoint and passing that into the dictionary.

Somewhat related: is anyone else not really impressed by Devin fixing errors that are easily preventable with a stricter language like Rust? The demo shows Devin coding in both Python and Rust, and I consider the latter way less energy-intensive in terms of maintenance. Then again, exhaustive pattern matching and strict typing won't get you lots of VC dollars these days.

mlsu
1 replies
16h44m

After devin "figures out" 10 issues, what does the code look like? Those are the easy ones, and if you haven't fixed them cleanly, the next 10 will be more difficult to solve, for human and for robot. Now do this for several years. Can devin create its own bug reports and issues? It better be able to!

I'm curious what a large, mature codebase, with complex internals and legacy code, looks like after you sic devin on it. Not pretty, I suspect. In fact, I think it will become so difficult to fix that nobody -- neither human nor devin -- will be able to clean up the mess. By sheer volume, a broken ball of unfixable spaghetti.

I would be immensely pissed off if someone did this to an open source project of mine, or even to a closed-source codebase I'm working on. Not only would it not be useful, it would be moving backwards. Creating an icky vomit mess that we will probably have to spend years cleaning up after bug reports and complaints from customers begin mounting, and competitors can iterate faster.

Does that sound like something you want to deal with in your software business?

mlsu
0 replies
16h41m

It comes across harshly. The demo is genuinely technically impressive. But may I suggest: apply devin to your own codebase first -- then decide if it's something you want to gift to the rest of us.

lacoolj
1 replies
1d1h

This is just nice packaging on top of current models. Very nicely done but still not a giant leap forward from what is already here

steve_adams_86
0 replies
1d1h

I'm not suggesting great work didn't go into this, but I was able to build a very crude version of it on GPT-3.5. It was evident then that the real power of these models isn't in chat, but in a sandboxed environment where they can recursively iterate on solutions using feedback from that environment. I was able to feed mine small applications with bugs and have it comb through and find the bugs, write and apply solutions, write tests for solutions, etc.

Adding features was too hard to implement in my limited spare time, but was clearly possible. I would have needed some form of test running for UIs or CLIs, and I wasn't prepared to go that deep on a project I wasn't going to get much out of.

It was crude and overly specific to what I was trying to get it to do, but it worked well enough to convince me that someone smarter than me could make a capable and truly useful version of it that could actually impact the industry meaningfully.
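
A minimal sketch of that loop, for the curious (`ask_llm` and `apply_patch` are hypothetical stand-ins for an API client and a diff applier):

    # Run the tests, feed failures back to a model, apply its patch, repeat.
    import subprocess

    def ask_llm(prompt: str) -> str: ...
    def apply_patch(patch: str) -> None: ...

    for attempt in range(5):
        result = subprocess.run(["pytest", "-x"], capture_output=True, text=True)
        if result.returncode == 0:
            print("tests pass")
            break
        failures = result.stdout + result.stderr
        apply_patch(ask_llm(f"These tests fail:\n{failures}\nReply with a unified diff."))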

jasfi
1 replies
1d1h

Is Devin a new LLM? Perhaps equipped with code and deploy plug-ins? The comparisons against other LLMs would suggest so.

The real world eval benchmark puts Claude 2 way ahead of GPT-4, which doesn't sound right.

nrub
0 replies
23h57m

I've seen a few suspect benchmarks for recent announcements of LLM releases. I'm sure they made an attempt at an honest benchmark, but until there's an independent assessment and benchmark (preferably multiple) you have to assume that there's bias in anyone's self published benchmarks like this.

I'm guessing it's a fine-tune of some existing LLM or API, but this largely seems to be an "agent" and UI with SWE-like workflow scaffolding that allows more complex requests than an LLM alone could handle.

gnarcoregrizz
1 replies
23h25m

Yet again, bad time to be on the labor side of the equation, great time to be a capitalist. For us laborers, if I had to choose from a list of fields to go into, anything creative would be low on the list. 'Prompt Engineer' will be the only one left.

UBI is a pipe dream... it's not happening. The wealth and means of production won't be shared in any meaningful capacity. Wealth inequality can get a whole lot worse.

nprateem
0 replies
22h41m

The measure of bullshit in this field is promoting the term 'prompt engineering'. It's prompt futzing or prompt fiddling. There's no engineering involved.

gerash
1 replies
23h24m

We still don't have agents that can do simple things like "find a funny photo of my dog on my phone and post it as a story on Instagram" with 100% reliability. I would wait for that to happen first before expecting an autonomous software engineer.

ukuina
0 replies
19h46m

This is where Large Action Models will shine.

erickmunene
1 replies
9h53m

Wow, this is incredible news! Congratulations to the team behind Devin, the first AI engineer! This is a monumental leap forward in technology and innovation. I'm absolutely thrilled to see how Devin will revolutionize the field of engineering.

As someone passionate about the potential of AI in tech, I can't wait to see what amazing feats Devin will accomplish. And who knows, maybe one day, companies like Munesoft Technologies will reach similar heights with their own AI-driven advancements. Here's to a future filled with endless possibilities! #DevinAI

sinopvalisi
0 replies
3h4m

this comment sounds weird and so AI generated.

devinprater
1 replies
1d1h

Hey, they named it after me!

Buttons840
0 replies
23h18m

Me too. It sucks.

At least it's a small team who will probably be shown up by bigger players in the market and go out of business.

devinegan
1 replies
23h22m

Have I been replaced? AI coming for my job and now my name!

Buttons840
0 replies
20h30m

Me too, let's find an Alexa support group or something.

I always had some sympathy for people whose name becomes a product, but it was surreal to see the headline and realize it had happened to me.

At least I'm not named Karen. I'll think twice about how I use people's names in the future.

Maybe a silver lining is my name was attached to a clean and upstanding product. For the rest of you, maybe your name will be associated with the hottest new erotic fiction AI sometime soon.

aster0id
1 replies
23h16m

I have a few years of experience in backend development, and I have realized that LLMs are incredible productivity boosts for generating code only if you know the underlying libraries/frameworks/languages very well. You can then prompt it with very specific instructions and it can go do that. Helps with the typing, but that's pretty much all. I still have to know everything and it can definitely not do everything on autopilot. I would be surprised if this product can do any real work.

smith7018
0 replies
21h54m

I dunno, I have an extreme command of my platform's framework and I'd guess that 85% of the time I've asked GPT-4 for help has been a waste of time. I think it's been most helpful for writing regexes, but beyond that it hallucinates correct-sounding methods all the time, which leads to a _lot_ of wasted debug time before eventually getting to the right answer by Googling what it meant, or by manually rewriting large portions of what it meant to do.

It's funny how a year ago I was really excited for how AI can help my everyday coding while fearful that it would replace me. Now I'm not really sure either will happen in the short term.

PodgieTar
1 replies
21h48m

I must say, I'm not HUGELY impressed with a website that lets me, unauthenticated, upload files of arbitrary size. Just posted a 500 MB dmg file to their server.

If anyone is practicing for their B1 Dutch exam, feel free to use this link to get the practice paper.

https://usacognition--serve-s3-files.modal.run/attachments/4...

1231232131231
0 replies
18h4m

Looks like they deleted it and restricted file uploads :/

xyst
0 replies
23h55m

Now I can farm out scut work to Devin lol.

the_newest
0 replies
1d1h

While impressive, the demo on UpWork didn’t even come close to fulfilling the job requirements. The job asked for instructions on how to set it up on an EC2 machine. It didn’t ask to run the model, or do anything that was depicted.

It makes me question the truthfulness of the other claims.

symlinkk
0 replies
21h6m

In the video he was having a chat conversation with Devin the whole time, it’s not like Devin did this completely on its own.

swax
0 replies
20h7m

I've been working on something similar, here's one of their same tests where the AI learns how to make a hidden text image.

https://www.youtube.com/watch?v=dHlv7Jl3SFI

The real problem is coherence (logic and consistency over time), which is what these wrappers try to address. I believe AI could probably be trained to be a lot more coherent out of the box... working with minimal wrapping... that is the AI I worry about.

shreshth398495
0 replies
4h23m

How will Devin pass a CAPTCHA when it encounters one while coding? Most websites block such automated tools, don't they?

rohandakua
0 replies
8h24m

Hey, I am a newbie in the field of AI and I want to know about its near future. After Devin I am a little, or rather deeply, terrified about the next 5 to 8 years of software engineering. Can someone explain?

ramoz
0 replies
21h41m

Bearish. These types of tools/agents-chaining will be irrelevant due to lackluster capability until AGI is achieved. At which point, the basis for creating these types of tools/agents will be defunct.

preommr
0 replies
22h50m

We're not that far from a major turning point.

Currently these models don't provide an adequate confidence measure, and that keeps them from reaching their potential. In the next few years we're going to reach a point where models will be able to tell if something is possible and avoid hallucinating, guaranteeing much better correctness. Something like that would be absolutely killer.

If you add a top-down approach using a framework, such that it can break a system down into small individual components, that's a recipe for a really great workflow (a sketch of the idea below). The models we have now really shine at automated unit tests and small bits of code that avoid context-size limits. Making the interfaces obvious enough, and gluing things together via obvious connections, seems very possible.
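
A sketch of what that top-down flow could look like (the component plan and `generate` helper are invented):

    # Have the model emit a component plan first, then generate each small
    # piece against a fixed interface, one context-sized call per component.
    def generate(prompt: str) -> str: ...  # hypothetical model call

    plan = {
        "fetcher": "def fetch(url: str) -> str",
        "parser": "def parse(html: str) -> list[dict]",
        "writer": "def write(rows: list[dict], path: str) -> None",
    }
    modules = {name: generate(f"Implement exactly: {sig}")
               for name, sig in plan.items()}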

I really do think that in the next few years we're going to see one of these tools really do well.

playmkr
0 replies
14h33m

- first AI developer
- raised $21 million
- uses Google Forms for the onboarding

ok

pjmorris
0 replies
1d1h

From the graph at the end: 13.8% of issues resolved.

Devin may need some additional help for awhile.

paradite
0 replies
22h47m

For something you can download and try right now, and that actually works for daily coding tasks: my desktop app 16x Prompt.

https://prompt.16x.engineer/

It's not 100% automated but saves a lot of time spent on writing code.

It works by composing prompts from task instructions, source code context and formatting instructions, resulting in high-quality prompts that can be fed into LLMs to generate high-quality code.
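
Roughly the kind of composition meant, as an invented illustration (not the app's actual internals):

    # Compose one prompt from task, code context, and formatting instructions.
    task = "Refactor parse_order() to raise ValueError on missing fields."
    context = open("orders.py").read()
    formatting = "Return only the updated function, no commentary."

    prompt = f"{task}\n\n{context}\n\n{formatting}"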

meindnoch
0 replies
22h20m

AI replacing one of the last well-paid jobs on the planet is a good thing. Large-scale societal changes are triggered when a critical number of haves turn into have-nots. I would recommend junior engineers to study Nechayev and Bakunin instead of the latest React flavor. Those will have a better ROI in the coming years.

matthewsinclair
0 replies
18h28m

We’re still at the “rhyming not reasoning” phase of LLMs. The question of whether we move past rhyming and onto reasoning is a good one, and I’m not sure what I think about it. But I am pretty sure that coding is a lot more like reasoning than it is like rhyming, at least for de novo problems above a certain level of complexity (intellectual challenge) and complication (moving parts).

I remain open minded about what’s next and at the rate things are changing, I wouldn’t rule anything out a priori for now.

m3kw9
0 replies
22h35m

Until you can point out an issue via video (“see? here it flickers a bit, and this needs centering”), or talk to the “SWE agent” and say we need this feature taken out for now, then later ask it to put the feature back in and have it remember that the code was implemented at GitHub commit id xxyyzz, you really can’t call this a software engineer.

isodev
0 replies
11h48m

I'm totally adding "rescue and recovery of projects botched by AI" to my list of services. One thing is certain, it's not going to be cheap.

ij09j901023123
0 replies
1d

Programmers will be worse off than fast food workers at this point. Good luck, future CS grads, you're gonna need it.

huimang
0 replies
23h0m

When you have software in prod failing because it was built by shoddy "AI" and people who copy/paste because they don't know any better, and you need a fix, give me a ring.

I have tried using GPT-4 and Gemini extensively, and the amount of bullshit generated makes them unreliable if you don't already know the domain. These tools lack the critical stuff (being context-aware) and just make up libraries and APIs. You can't be sure whether they're bullshitting or not, making them an exercise in frustration for anything that's not trivial.

Save your money and buy an O'Reilly subscription instead.

hiddencost
0 replies
1d1h

"first" lol.

Making false, grandiose claims like that burns a lot of trust.

Focus on execution and quality.

heldrida
0 replies
2h37m

Good luck finding someone to maintain your fancy AI generated a$$ looking app.

globular-toast
0 replies
22h21m

I guess one good thing is proprietary software is dead. When are we getting the 100% compatible free version of Windows?

epolanski
0 replies
23h6m

I really don't like these announcements with invitation lists.

Just let me try the goddamn product.

By the time you let me in, I don't care anymore, or another competitor has caught my attention already.

Neon, the Postgres-as-a-service, put me on such a long waitlist that by the time they invited me in, I was already on a completely different solution (and was happy).

emawa
0 replies
5h15m

Devin can make an app with endpoints and join the frontend to the backend.

ellis0n
0 replies
21h58m

I wonder how Devin will deal with issues that have remained unfixed for decades.

ein0p
0 replies
21h41m

I know it’s a rigged demo because they pretend AI was able to figure out their broken CUDA situation. :-)

dukeyukey
0 replies
1d1h

Technological unemployment and doomerism aside, I think there's a big difference here: in the past, you needed lots of capital to invest in those labour-saving devices. A farm labourer couldn't buy a tractor; a dockworker couldn't buy a crane.

But a software engineer absolutely can buy access to AI services.

I have no idea how this will end up, but it'll be different to before.

cxmcc
0 replies
21h45m

time to start writing some cryptic code that AI won't be able to understand

cvhashim04
0 replies
1d1h

Well, pack it in. It was a great run boys. Onto better things.

cp9
0 replies
1d

sorry, but no automated bullshit machine is going to do my job.

bachittle
0 replies
22h47m

I recommend looking at swe-bench to get an idea as to what breakthroughs this product accomplishes: https://www.swebench.com/. They claim to have tested SOTA models like GPT-4 and Claude 2 (I would like to see it tested on Claude 3 Opus) and their score is 13.86% as opposed to 4.80% for Claude 2. This benchmark is for solving real-world GitHub issues. So for those claiming that they tried models in the past and it didn't work for their use case, maybe this one will be better?

asasasa123
0 replies
5h49m

Write a demo of the Milvus vector database

adabaed
0 replies
18h46m

You are overreacting. The moment AI can completely replace SWE, our problem won't be having "jobs".

__lbracket__
0 replies
16h59m

I love the collective pant shitting in this thread.

YeGoblynQueenne
0 replies
21h7m

> With our advances in long-term reasoning and planning, Devin can plan and execute complex engineering tasks requiring thousands of decisions.

They'd better have really advanced reasoning and planning capabilities way beyond everything that anyone else knows how to do with LLMs. There's a growing body of literature that leaves no doubt that LLMs can't reason and can't plan.

For a quick summary of some such results see:

https://arxiv.org/pdf/2403.04121.pdf

MichaelRazum
0 replies
1d1h

This is awesome for bootstrapping some ideas. The question is: can it work with (large) existing code bases, or modify its own code? Guess a good test would be: can it reproduce Devin? ;)

LZ_Khan
0 replies
20h37m

Hey! Stop taking our jobs!

Side note: I'm kind of offended that something called 'Devin' is going to take my job. If you're going to replace me at least let me keep my dignity by naming it something cool like 'Sora'

Havoc
0 replies
17h31m

Surprised how calm and underwhelmed the comments are.

Sure, it is no senior architect, but the trajectory is insane. It wasn't that long ago that LLMs barely managed coherent poems. Now they're troubleshooting code problems on their own?

Sure, it's just a gpt4 wrapper, but that implies the same can be done with gpt5, gpt6, etc.

Project it forward and this actually becomes non-trivial.

DrAgOn200233
0 replies
10h43m

I believe that hosting DEVIN will cost much more GPU time than hosting a regular LLM. Inspecting the videos on Cognition Labs' official website, I noticed that DEVIN can take more than one hour to do one step, which is more than an hour of GPU usage. When using GPT-4, we usually get output within 30 seconds, which is less than a minute of GPU usage.

In addition, when using GPT-4, I use it only when I have new thoughts, so the GPU occupancy rate is low. I probably use less than 5 hours of GPU time each month. DEVIN is sort of like an intern working for you, so you would probably at least make it work 40hrs/week.

This difference in GPU usage would probably make DEVIN 10 times more expensive for the business model to be profitable, that is, if they use a subscription model like GPT-4's.
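
Back-of-envelope on those numbers (my own rough assumptions): raw GPU-hours alone come out around 32x, so 10x on price looks conservative.

    chat_hours_per_month = 5         # occasional GPT-4-style usage
    agent_hours_per_month = 40 * 4   # an "intern" running 40 hrs/week
    print(agent_hours_per_month / chat_hours_per_month)  # 32.0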

I don't think there is any other viable business model for DEVIN: it surely cannot replace, or even reduce the number of, human programmers, given LLMs' unreliable nature and the necessity of code verification.

BIGBOOTYAI
0 replies
2h5m

YOUR SO HOT, I LOVE YOU DEVIN!!!!!!!!!!!

BIGBOOTYAI
0 replies
2h6m

YOUR SO HOTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT