
LLMs and Programming in the first days of 2024

kevindamm
39 replies
5h45m

Salient point:

Would I have been able to do it without ChatGPT? Certainly yes, but the most interesting thing is not the fact that it would have taken me longer: the truth is that I wouldn't even have tried, because it wouldn't have been worth it.

This is the true enabling power of LLMs for code assistance -- reducing the activation energy of new tasks enough that they are tackled (and finished) when they otherwise would have been left on the pile of future projects indefinitely.

I think the internet and the open source movement had a similar effect, in that if you did not attempt a project that you had some small interest in, it would only be a matter of time before someone else solved enough of a similar problem for you to reuse or repurpose their work, and this led to an explosion of (often useful, or at least usable) applications and libraries.

I agree with the author that LLMs are not by themselves very capable but provide a force multiplier for those with the basic skills and motivation.

nerdponx
12 replies
2h49m

I feel very left out of all this LLM hype. It's helped me with a couple of things, but usually by the time I'm at a point where I don't know what I'm doing, the model doesn't know any better than I do. Otherwise, I have a hard time formulating prompts faster than I can just write the damn code myself.

Am I just bad at using these tools?

packetlost
1 replies
2h25m

No, you're likely just a better programmer than those relying on these tools.

ParetoOptimal
0 replies
2h21m

That could be the case, and likely is in areas where they are strongest, just like the article's example of how it's not as useful for systems programming because he is an expert.

If you ask it about things you don't know, things it was likely trained on high-quality data for, and get bad answers, you likely need to improve your writing/prompting.

Ldorigo
1 replies
1h19m

As the article says, it helps to develop an intuition for what the models are good or bad at answering. I can often copy-paste some logs, tracebacks, and images of the issue and demand a solution without a long manual prompt - but it takes some time to learn when it will likely work and when it's doomed to fail.

okwubodu
0 replies
50m

This is likely the biggest disconnect between people that enjoy using them and those that don’t. Recognizing when GPT-4’s about to output nonsense and stopping it in the first few sentences before it wastes your time is a skill that won’t develop until you stop using them as if they’re intended to be infallible.

At least for now, you have to treat them like cheap metal detectors and not heat-seeking missiles.

x0x0
0 replies
1h6m

Here's what I find extremely useful:

1 - very hit or miss -- I need to fiddle with the aws api in some way. I use this roughly every other month, and never remember anything about it between sessions. ChatGPT is very confused by the multiple versions of the APIs that exist, but I can normally talk it into giving me a basic working example that is then much easier to modify into exactly what I want than starting from scratch. Because of the multiple versions of the aws api, it is extremely prone to hallucinating endpoints. But if you persist, it will eventually get it right enough. (A sketch of the kind of basic example I mean follows after this list.)

2 - I have a ton of bash automations to do various things. just like aws, I touch these infrequently enough that I can never remember the syntax. chatgpt is amazing and replaces piles of time googling and swearing.

3 - snippets of utility python to do various tasks. I could write these, but chatgpt just speeds this up.

4 - working first draft examples of various js libs, rails gems, etc.
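A minimal sketch of the kind of "basic working example" item 1 refers to, assuming boto3 with credentials already configured; the bucket name and prefix are made up:

    import boto3

    # list objects under a prefix in an S3 bucket (hypothetical names)
    s3 = boto3.client("s3")
    response = s3.list_objects_v2(Bucket="my-example-bucket", Prefix="logs/")
    for obj in response.get("Contents", []):
        print(obj["Key"], obj["Size"])

Once something like this runs, it's much easier to adapt it into the exact call you actually need.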

What I've found has extremely poor coverage in chatgpt is stuff where there are basically no stackoverflow articles explaining it / github code using it. You're likely to be disappointed by the chatgpt results.

wharvle
0 replies
1h0m

Yeah, I’m not sure how often these tools will really help me with the things that end up destroying my time when programming, which are stuff like:

1) Shit’s broken. Officially supported thing kinda isn’t and should be regarded as alpha-quality, bugs in libraries, server responses not conforming to spec and I can’t change it, major programming language tooling and/or whatever CI we’re using is simply bad. About the only thing here I can think of that it might help with is generating tooling config files for the bog standard simplest use case, which can sometimes be weirdly hard to track down.

2) Our codebase is bad and trying to do things the “right” way will actually break it.

teaearlgraycold
0 replies
1h53m

You can use LLMs as documentation lookups for widely used libraries, e.g. the Python stdlib. Just place a one-line comment of what you want the AI to do and let it autocomplete the next line. It's much better than previous documentation tools because it will interpolate your variables and match your function's return type.
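A minimal sketch of the pattern, where each comment is the prompt and the next line is the kind of completion you get back; the stdlib calls are real, the file name is made up:

    import datetime
    import os

    # parse an ISO 8601 date string into a date object
    release = datetime.date.fromisoformat("2024-01-02")

    # get the size of a file in bytes
    size = os.path.getsize("data.csv")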

steppi
0 replies
1h8m

I've developed a workflow that's working pretty well for me. I treat the LLM as a junior developer that I'm pair programming with and mentoring. I explain to it what I plan to work on, run ideas by it, show it code snippets I'm working on and ask it to explain what I'm doing, and whether it sees any bugs, flaws, or limitations. When I ask it to generate code, I read carefully and try to correct its mistakes. Sometimes it has good advice and helps me figure out something that's better than what I would have done on my own.

What I end up with is like a living lab notebook that documents the thought processes I go through as I develop something. Like you, for individual tasks, a lot of times I could do it faster if I just wrote the damn code myself, and sometimes I fall back to that. In the longer term I feel like this pair programming approach gives me a higher average velocity. Like others are sharing, it also lowers the activation energy needed for me to get started on something, and has generally been a pretty fun way to work.

simonw
0 replies
1h3m

I'd reframe that slightly: it's not that you are bad at using these tools, it's that these tools are deceptively difficult to use effectively and you haven't yet achieved the level of mastery required to get great results out of them.

The only way to get there is to spend a ton of time playing with them, trying out new things and building an intuition for what they can do and how best to prompt them.

Here's my most recent example of how I use them for code: https://til.simonwillison.net/github-actions/daily-planner - specifically this transcript: https://gist.github.com/simonw/d189b737911317c2b9f970342e9fa...

pmontra
0 replies
33m

I'll give you an example. I took advantage of some free time these days to finally implement some small services on my home server. ChatGPT (3.5 in my case) has read the documentation of every language, framework, and API out there. I asked it to start with Python3 http.server (because I know it's already on my little server) and write some code that would respond to a couple of HTTP calls and do this and that. It created an example that customized the do_GET and do_POST methods of http.server, which I didn't even know existed (the methods). It also did well when I asked it to write some simple web form. It did not do so well when things got more complicated, but at that point I already knew how to proceed. I finished everything in three hours.
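A minimal sketch of the kind of handler this describes; the routes and payloads here are made up, not the actual services:

    import json
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            # respond to a simple status check
            if self.path == "/status":
                body = json.dumps({"ok": True}).encode()
                self.send_response(200)
                self.send_header("Content-Type", "application/json")
                self.end_headers()
                self.wfile.write(body)
            else:
                self.send_error(404)

        def do_POST(self):
            # read the posted body and echo it back
            length = int(self.headers.get("Content-Length", 0))
            data = self.rfile.read(length)
            self.send_response(200)
            self.end_headers()
            self.wfile.write(data)

    HTTPServer(("", 8000), Handler).serve_forever()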

What did it save me?

First of all, the time to discover the do_GET and do_POST methods. I know that I should have read the docs, but it's like asking a colleague "how do I do that in Python" and getting the correct answer. It happens all the time; sometimes I'm the one asking, sometimes I'm the one answering.

Second, the time to write the first working code. It was by no means complete but it worked and it was good enough to be the first prototype. It's easier to build on that code.

What didn't it save me? All the years spent learning to recognize what the code written by ChatGPT did and to learn how to go on from there. Without those years on my own I would have been lost anyway, and maybe I wouldn't have been able to ask it the right questions to get the code.

nickpsecurity
0 replies
8m

I had good results by writing my requirements like they were very high-level code. I told it specifically what to do. Like formal specifications but with no math or logic. I usually defined the classes or data structures, too. I'd also tell it what libraries to use after getting their names from a previous, exploratory question.

From there, I'd ask it to do one modification at a time to the code. I'd be very precise. I'd give it only my definitions and just the function I wanted it to modify. It would screw things up, and I'd tell it that. It would fix its errors, break working code with hallucinations, and so on. You need to be able to spot these problems to know when to stop asking it about a given function.

I was able to use ChatGPT 3.5 for most development. GPT-4 was better for work that needed high creativity or fewer hallucinations. I wrote whole programs with it that were immensely useful, including a HN proxy for mobile. Eventually, ChatGPT got really dumb while outputting less and less code. It even told me to hire someone several times (?!). That GPT-3-Davinci helped a lot suggests it's their fine-tuning and system prompt causing problems (e.g. for safety).

The original methods I suggested should work, though. You want to use a huge, code-optimized model for creativity or hard stuff. Models for iteration, review, etc. can be cheaper.

countWSS
0 replies
2h15m

It's useful for writing generic code/templates/boilerplate, then customizing it by inserting your own code. For something you already know better, there isn't a magic prompt to express it, since the code is not generic enough for an LLM to understand as a prompt.

Its best use case is when you're not a domain expert and quickly need to run some unknown API/library inside your program, inserting code like "write a function for loading X with Y in language Z" when you barely have an idea what X, Y, and Z are. It's possible in theory to break everything down into "write me a function for N", but the quality of such functions is not worth the prompting in most situations, and you're better off asking it to explain how to write the function step by step.

somewhereoutth
11 replies
5h35m

However, could it be that the 'entertainment effect' of using a new (and trendy) technology like LLMs provides the activation energy for otherwise mundane tasks?

times_trw
5 replies
5h31m

No. I've completed, in hours, projects that had been on the back burner for years. They weren't on the back burner for lack of interest but mainly for lack of expertise in a specific stupid area.

It's not an exaggeration to say that I now do two weeks of programming a night. Of course a lot of times the result gets thrown away because the fundamental idea was flawed in a non-obvious way. But learning that is also worthwhile.

Klathmon
4 replies
3h23m

It's revolutionized our in house tooling at work.

No longer do I need to give PR feedback more than a couple times, because we can just ask chatgpt to come up with a lint rule that detects and sometimes auto-fixes the issue. I use it to write or change Jenkins jobs, scaffold out tests, diagram ideas from a monologue brain dump, write alerting and monitoring code, write and clean up documentation.

Most recently I wanted to get some "end of year" stats for the teams. Normally it would never have happened, because I don't have half a day to dedicate to relearning the git commands, figuring out how to count the lines of code and attribute changes to teams, and scripting the whole process to work across 20 repos.

20 minutes later with chatgpt I had results I could share within the company.
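A rough sketch of the kind of script this describes; the repo paths and date range are made up, and it attributes lines to individual authors rather than teams:

    import subprocess
    from collections import defaultdict

    REPOS = ["repo-a", "repo-b"]          # hypothetical local clones
    SINCE = "2023-01-01"

    totals = defaultdict(lambda: [0, 0])  # author -> [lines added, lines removed]

    for repo in REPOS:
        log = subprocess.run(
            ["git", "-C", repo, "log", f"--since={SINCE}",
             "--pretty=format:@%aN", "--numstat"],
            capture_output=True, text=True, check=True,
        ).stdout
        author = None
        for line in log.splitlines():
            if line.startswith("@"):
                author = line[1:]
            elif line.strip():
                added, removed, _path = line.split("\t", 2)
                if added.isdigit() and removed.isdigit():  # "-" marks binary files
                    totals[author][0] += int(added)
                    totals[author][1] += int(removed)

    for author, (added, removed) in sorted(totals.items()):
        print(f"{author}: +{added} / -{removed}")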

It's just allowed me to skip almost all of the boring and time consuming parts of handling small things like that, and instead turns me into a code reviewer who makes a few changes to make it good enough, then pushes it out.

lifeisstillgood
1 replies
3h14m

Wait what?

ChatGPT does diagrams for you? Writes documentation?

Klathmon
0 replies
3h8m

Absolutely!

I found a plugin a while back called "AI Diagrams" that generates whimsical.com diagrams for me. Combined with the "speech to text" systems in ChatGPT, it means I can just start babbling about some topic and let it write it all down, collect it into documentation, and even spit out a few diagrams from it.

I generally have to spend like 10 minutes cleaning them up and rearranging them to look a bit more sane, but it's been a godsend!

Similarly I sometimes paste a bunch of code in and tell it to write some starter docs for this code, then I can go from there and clean it up manually, or just tell it what changes need to be made. (technically I tend to use editor plugins these days not copy+paste, but the idea is the same)

Other times I'll paste in docs and have it reformat them into something better. Like I recently took our ~3 year old README in a monorepo that goes over all the build and lint commands and had it rearrange everything into sets of markdown tables which displayed the data in a much easier to understand format.

datameta
1 replies
3h5m

Does your company have any concerns about feeding ChatGPT source code? Would it not be safer to use a local LLM?

Klathmon
0 replies
2h58m

Not any more than feeding source code to Github. (I personally feel "source code" is very rarely the "secret sauce" of a company anyway). But where I work we not only have the blessing of the company, we are encouraged to use it because they've clearly seen the benefits it brings.

A local LLM would be preferable, all things being equal, but in my experience for this kind of stuff, GPT-4 is just so much better than anything else available, let alone any local LLMs.

apwell23
3 replies
5h32m

This is true. I write way more documentation now since the LLM does all the formatting, structure, diagrams etc. I just guide it at a high level.

robluxus
2 replies
3h28m

Do you use a specialized LLM for diagrams?

danielbln
0 replies
2h54m

I'm not OP, but I just ask GPT to turn code or process or whatever else into a mermaid diagram. Most of the time I don't even need to few-shot prompt it with examples. Then you dump the resulting text into something like https://mermaid.live/ and voilà.

apwell23
0 replies
2h59m

I just ask ChatGPT to generate text diagrams that I can embed in my markdown.

datagram
0 replies
5h25m

In my experience, it genuinely lowers the activation (and total) energy for certain tasks, since LLMs are great at writing repetitive code that would otherwise be tedious to write by hand. For instance, writing a bunch of similar test cases.

xnorswap
10 replies
3h51m

First automation acts as a powerful lever and enabler, then it replaces you.

In 5 years time you may well be more productive than ever. In 15 years I doubt there'll be many programming jobs in the form they are recognisable today.

jgilias
7 replies
3h30m

Historically, automation has always made society richer and better off as a result.

There are two guys outside of my window at this very moment getting rid of a huge pile of dirt. One is in an excavator, the other in a truck. There's a bunch of piles that they've taken care of today. Two centuries ago this work would've taken a dozen people, a few beasts of burden, and a lot more time.

Where are the other ten hypothetical people? I don’t know, but chances are they’ve been absorbed by the rest of the economy doing something else worthwhile that they are getting paid for.

atleta
1 replies
2h41m

There are two problems with this argument. The first, and easier to accept, is that while society might be better off in the long run as a result, the affected individuals probably will not be. We tend to generalize from a single historical example, the industrial revolution and, more specifically, the automatic loom, and in that case the displaced workers ended up doing worse. Better jobs and opportunities only got created later.

The other problem, of course, is that all the historical examples (the data) are too few to generalize from, while we do see how these examples differ from each other. As technological evolution progresses, automation gets more and more sophisticated, and it can replace jobs that require more and more skill and talent. In other words, jobs that fewer and fewer people were able to do in the first place. This means that the bar for successfully competing in the labor market gets higher and higher, and it will get to a point where a substantial number of people will just be plain uncompetitive for any job.

Or at least that was one of the models until LLMs were invented. (Mostly everyone thought that automation would take over the opportunities from the bottom up in general.) Now it seems that white collar jobs are indeed more in danger for now. But I digress.

The point here is that past examples are false analogies because AI (and I mostly mean future AI) is fundamentally different from past inventions. Its capabilities seem to improve quickly, but we're mostly stuck with what evolution gave us. (We, as a species, are evolving, but it's very slow compared to the rate of technological evolution, and also we, as individuals, are stuck with whatever we were born with.)

idopmstuff
0 replies
2h15m

I think the other thing people miss is the impact of our existing infrastructure on the speed at which these new technologies can be deployed.

Society had a lot of time to get used to the printing press, the advances of the Industrial Revolution and the internet. This is because the knowledge had to spread, and it's also because a ton of equipment had to be manufactured and/or shipped all over the world. We had to make printing presses, design and build factories, and get internet-capable computers into the hands of a critical mass of people.

AI is fundamentally different, in that the knowledge of AI can spread instantly because of the internet and because the vast majority of the world already has access to all of the hardware they need to access the most powerful AI models.

Soon we'll have humanoid robots coming, and while they obviously have to be built, we are much more capable now than we were 50 years ago at building giant factories. We also have an efficient system of capital allocation that means that as soon as someone demonstrates a generally useful humanoid robot and only needs to scale production, they'll have access to basically infinite investor money.

xnorswap
0 replies
2h46m

I agree, on a societal level it's a great benefit.

On an individual level, however, I'd suggest keeping an eye out for opportunities to retrain.

As an analogy, I'd rather be like the coal miners in the 80s who could read the writing on the wall and quietly retrained into something else rather than those who spent their better years striking over cuts to little avail.

It's a very daunting prospect seeing a path to unemployability, though.

Depending how quickly the change happens, it could be a gentle transition, or it could upset a lot of people.

iExploder
0 replies
3h23m

> Where are the other ten hypothetical people?

Zoned out somewhere in the backstreets of SF

daxfohl
0 replies
3h26m

Those ten hypothetical people would likely have actually been zero because the task wouldn't have been worth hiring ten people.

dash2
0 replies
2h48m

This comment and the grandparent can both be right. Society gets richer because automation replaces jobs, leaving us more time to do other jobs.

andsoitis
0 replies
3h26m

also, two centuries ago, we didn't have trucks and the concomitant infrastructure, jobs, wealth, etc. that machines brought along.

mjr00
0 replies
56m

Same as looking back 15-20 years ago from now, though.

Very few jobs now where you put together basic webpages with HTML and CSS (Frontpage, then Wix/Wordpress etc replaced you). Very few jobs where you spend 100% of your time dealing with database backups and replication and managing the hardware (the cloud replaced you). Very few jobs where you spend all your time planning hardware capacity and physically inserting hard disks into racks (the cloud replaced you, too).

andsoitis
0 replies
3h27m

I share a similar intuition, but am skeptical that my imagination is in the right ballpark of what it will look like 15 years hence.

What do you think programming will be like in 15 years and where is the high-value work by human programmers?

vidarh
0 replies
5h27m

I find that even beyond "activation energy", a lot of my exploration with ChatGPT is around things I don't necessarily initially even intend to take forward, but am just curious about, and then realise I can do with much less effort than I expected.

You can often get much of the same effect by bouncing ideas off someone, who doesn't necessarily need to know the problem space well enough to solve things, just well enough to give meaningful input. But people with the right skills aren't available at the click of a button 24/7.

moffkalast
0 replies
5h22m

Agreed, I've now got a sizable project going that I would probably procrastinate on attempting for years without GPT 4 laying the initial groundwork, along with informing me of a few libraries I had never heard of that reduced wheel reinventing quite a bit.

kodablah
0 replies
2h30m

I had a project recently: building an advanced Java class file analyzer. I knew a lot about ow2 asm libraries, but it saved me a lot of digging time remembering exact descriptor formats. Also it helped me understand why other static analysis libraries weren't good enough for me for stateful reasons.

For me, ChatGPT is doing two things: 1) saving trivial StackOverflow searches and library code walking to answer specific questions, and 2) helping the initial project research stage to grasp the feasibility of approaches I may take before starting.

couchand
30 replies
4h28m

This post is absolutely devastating to me. Salvatore is surely one of the most capable software engineers working today. He can lucidly see that this supposed tool is completely useless to him within his area of expertise. Then, rather than cast it off as the ill-fitting, bent screwdriver that it is, he accepts the boosters' premise that he must find some use for it.

Just as any introductory macroeconomics class teaches, if one island has superior skill in producing widget A, it doesn't matter how terrible the other island's skill at producing B is, we'll still see specialization where island A leverages island B. So of course antirez's relative ability in systems programming would relegate the LLM to other programming tasks.

However! We do not exist in isolation. There is a multitude of human beings around us, hungry for technical challenges and food. Many of them have or could obtain skills complementary to our own. In working together, our cooperative efforts could be more than the sum of their parts.

Perhaps the LLM is better at writing PyTorch code than antirez. Just because we have an old bent screwdriver in the garage doesn't mean we should try to use it. Perhaps we'd be better off heading to the hardware store today.

antirez
14 replies
4h16m

If the LLM is better than me at writing Torch code, it is a great idea for me to use an LLM to write my model definition, since the exact syntax or the reshaping of the tensors is not so important to me. If I want to create a convnet and train it on my images, for my own usage, I don't need to bother some Torch expert to do it for me. I can do it myself, if I understand enough about convnets themselves, even without knowing enough about Torch syntax / methods. The alternative would be to study the details of Torch in a manual, and the end result would be the same: the important thing in this task is to command the ML concepts, not the details of MLX, Keras or PyTorch.
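To illustrate, a minimal sketch of the kind of model definition meant here; the layer sizes and class count are made up, not code from the post:

    import torch
    import torch.nn as nn

    class SmallConvNet(nn.Module):
        def __init__(self, num_classes=2):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(3, 16, kernel_size=3, padding=1),  # RGB input
                nn.ReLU(),
                nn.MaxPool2d(2),
                nn.Conv2d(16, 32, kernel_size=3, padding=1),
                nn.ReLU(),
                nn.MaxPool2d(2),
            )
            self.classifier = nn.Sequential(
                nn.AdaptiveAvgPool2d(1),  # avoids hard-coding the input image size
                nn.Flatten(),
                nn.Linear(32, num_classes),
            )

        def forward(self, x):
            return self.classifier(self.features(x))

    # quick shape check on a batch of fake 64x64 images
    model = SmallConvNet()
    print(model(torch.randn(4, 3, 64, 64)).shape)  # torch.Size([4, 2])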

couchand
12 replies
4h13m

> I don't need to bother some Torch expert to do it for me.

This is, indeed, the core of our disagreement. You seem confident it would be a bother to others to ask for help. I'm confident that there are many who would value the opportunity to collaborate with you. I feel sure that whatever analysis you're doing could benefit from the sounding board of a domain expert, and you'd both benefit from the exchange.

EDIT: to clarify, being "famous" has nothing to do with it. Each of us has worth and we all would gain by working with others.

ativzzz
3 replies
3h47m

You're right, but not always. Some people work differently than others and would rather headbutt against a wall on their own for hours than work with other people.

In the long term, it is beneficial to have experts as your collaborators. From my experience though, true collaboration is unlocked once you have established a personal relationship with someone, which takes time and repeated effort. Until then, the collaboration is no better than searching the internet or asking chatGPT.

Establishing relationships with people is hard and takes a lot of work, and frequently doesn't work out like you hope. ChatGPT is a close enough approximation for smaller tasks like the OP describes.

couchand
2 replies
3h9m

It's hard, and many of us (myself included) are not great at it. That makes it seem easier to reach for the simulacrum. But at what cost?

Homo sapiens's superpower is social cooperation. My concern is that these systems will abet the existing social forces which seem to be causing unprecedented levels of isolation of adults, which will continue to drive smart people away from collaboration and towards solitude, at a level far beyond what simple preferences would suggest.

We already have enough trouble hearing each other through the noise, and understanding what each other has to say. I don't have the answers but I'm looking for them and I do hope other humans will, too.

ilaksh
0 replies
1h53m

You said it yourself: they are "social forces". In other words, problems that people created. Not technology problems.

I think there is an incorrect worldview that tries to blame human problems on technology.

It's quite true that isolation is an increasing problem. But the idea that instead of using an AI that can spit out a comprehensive answer in seconds, we should all pretend that such tools don't exist, and start constantly asking for help with every idea or request instead, while waiting 3-100 times longer for a less thorough response, is ludicrous.

It's a great idea to collaborate more and try to avoid isolation. But those are societal problems. They are not caused by the latest tools.

Also, as far as humanity's "super-power" being collaboration, this is quite a shortsighted comment. I believe that well before AI achieves "super" level IQ, it will vastly outperform humans due to other advantages. One of those advantages is speed. Another is the ability to communicate and collaborate much, much more rapidly and effectively than humans.

One type of digital life that may take over control of the planet (possibly within decades rather than centuries) would be a type of swarm intelligence with the ability to actually "rsync" mental models to directly transfer knowledge.

ativzzz
0 replies
2h29m

I think people tend to cooperate only when there are tangible benefits such as:

- survival

- making money

- sex

- enjoyment via social interactions (like parties, hangouts, etc)

It just so happens that for the majority of our civilization, to get those things, we've had to cooperate. But as we develop technology, our ability to get those things increases and our reliance on others decreases (though in a weird way it also increases, since technology adds complexity, so society becomes larger, more complex, and more inter-dependent).

We are still in an unprecedented technological boom of computing so we are adjusting on the fly to it. Like the OP says, AI can greatly accelerate independent learning, but eventually that learning plateaus. Once it does, we have to go back to collaboration, but until we find that limit, I think it's human nature to push on.

politician
0 replies
3h37m

Local inference of LLMs is essentially free. There doesn’t exist a sufficiently deep pool of experts of all knowledge spaces who are also willing to work for free, who can be trivially identified, contacted, and scheduled.

madeofpalk
0 replies
1h21m

I'm not going to ask a human - even a coworker - for every intellisense suggestion. Neither of us would get value from that exchange.

kreetx
0 replies
4h6m

It's more about the scale at which you do your thing. For the quick and dirty thing, I'm going to write the Torch model within the hour - where would I find a collaborator who is willing to start immediately within that time?

hobs
0 replies
4h7m

That's a pretty big exception to the general case though if your argument is "you are a famous person and people would love to collab" - at 2am your time, exactly when you are motivated? And what if you are not a world famous programmer, what then?

To me it's like saying you shouldn't play solitaire because you are a world class poker player, and there's plenty of people who want to game with you. They are orthogonal concepts - just communicating with people can be more work than just reading on your own.

e12e
0 replies
4h3m

At least the LLM is in the same time zone...

dash2
0 replies
2h44m

The argument here seems to be that there is a free supply of software developers available for the taking. Software developers are quite well paid, which suggests this is not true.

ctoth
0 replies
1h57m

Hi Couchand.

I'm a mediocre programmer who uses GPT for a ton.

Are you volunteering to answer my questions on all the obscure stuff I ask it? Because I don't really know anybody else who will.

Anyway, my email is in my profile, write to me if this is something you're up for!

Edit: Here's a list of the stuff I asked it over the weekend:

- Discuss the pros and cons of using Standardized-audio-context instead of just relying on browser defaults. Consider build size and other issues.

- How to get github actions to cache node_modules (not the npm cache built-in to the actions/setup-node action).

- How to get the current git hash in a GitHub action?

- Rewrite a react class-based component in functional style

- How to test that certain JSX elements were generated without directly-comparing React elements for my ANSI color to HTML parser?

- Does it make more sense to keep a copy of the original text in an editor, or just hold on to something like a crc32 to mark a document dirty?

- Can you set focus to a window you create with window.open? (You sure can!)

- Rewrite the rollup.config.js for a library of mine to produce a separate rollup config per audio worklet bundle

- Turn this tutorial for backing up a Mastodon instance into a script

- Refactor a standalone class to split it in half so each class manages precisely one thing.

- I have some code I'm writing for turn-by-turn directions. I have the data structures already, let's write code to narrate them.

- What's with this weird type error around custom CSS properties with React?

ParetoOptimal
0 replies
1h41m

If the pytorch piece is for something low priority, it's probably not worth reaching out to someone else.

Salgat
0 replies
3h55m

For small tasks this is fine, but for larger projects you have to be careful. The key to being productive with an LLM for coding is to be able to understand the code being generated, to avoid rather nasty bugs that may crop up if the model hallucinates. The worst part is that an LLM will create these bugs in very subtle ways, since it excels at writing convincing code (whether or not it actually works).

nuz
3 replies
4h21m

The funny thing about LLMs is that there's no rush in adopting them. They're semi useful right now, but not really essential, and you're not gonna be 'left behind' if you don't make use of them. Everyone involved is working their hardest to make them more capable, so when that day comes, you'll just use it to prompt what you want. But there's no rush to try to squeeze anything out of the current generation, which mostly lowers productivity rather than increases it.

fhd2
2 replies
4h7m

My thinking exactly! There's FOMO going on (and being fueled by people, most of whom seem to hope to somehow make money off it), but the barriers to entry for using LLMs are just not that high. When the tools are good enough, I'll happily use them. Today, for the work I do, I found that not to be the case yet. I wouldn't advocate for not trying them, but I see no reason to force yourself to use them.

unshavedyak
1 replies
3h17m

Yea. TBH the tooling is the bigger issue for me, personally. I come across a ton of times where an LLM might work, or could - potentially. Usually refactors. But it requires access to basically all my files, both for context and to find where references to that class/struct/etc are.

Furthermore, i'd vastly prefer a workflow most of the time where i don't even have to ask. Or so i imagine. Ie i think i'd prefer a Clippy style "Want me to write a test for this? Want me to write some documentation for this?" etc helpers. I don't want to have to ask, i want it to intuit my needs - just like any programmer could if pair programming with you.

And most of all i want it to have access to all files. To know everything about the code possible. Don't just look at the func name in isolation, attempt to understand how it's used in the project.

If i have to baby sit an LLM for a simple function refactor to give it all files where the function is used or w/e, i'd rather do it myself with tools like AST Grep or even my LSP in many cases.

I'm very interested in LLMs for simple tasks today, but the tooling feels like my primary blocker. Also possibly context length, but i think there's lots of ways around that.

thenevermind
0 replies
2h9m

Just to throw my 2c here, since I also want the models to access the whole codebase of (at least) the current project.

I had a great impression of sourcegraph's cody (https://sourcegraph.com/cody) a few months ago. At least with the enterprise version of sourcegraph that had indexed most of the org's private repos.

The web ui (the vscode extension was somehow worse, not sure why) was providing damn good responses, be it code generation or QA/explanation about code spanning multiple repos (e.g. terraform modules living in different repos).

Afaiu it was using the sourcegraph index under the hood. But I never really deep dived into cody's design internals (not even sure if they are actually public).

That being said, I’ve departed from the org months ago and haven’t used cody since then, so take this with a grain of salt, since the whole comment could be outdated a lot.

rolisz
1 replies
4h16m

So you're suggesting that instead of asking an LLM, we should spend time on Fiverr/Upwork to find someone to do random coding tasks that might not fall under our expertise? Can you do that for less than $20/month?

couchand
0 replies
4h11m

I agree there are difficult coordination problems that our society has failed to grapple with, let alone solve.

nkohari
1 replies
3h51m

> He can lucidly see that this supposed tool is completely useless to him within his area of expertise.

Did you read the article? Throughout the entire post he clearly says LLMs have a lot of value in his workflow.

sevagh
0 replies
3h26m

Parent seems to think that Antirez has been begrudgingly pulled along by the tide of LLM hype, rather than recognizing that Antirez is an accomplished developer whose judgement on tools can be trusted.

madeofpalk
1 replies
4h17m

I think this is an extremely unfavorable interpretation of the article. I wonder if we even read the same thing?

He sees a new tool that others have found interesting, and he identifies ways to use that tool that are useful for him, while also acknowledging where it's not useful. He backs it up with plenty of examples of where he found it not-useless. This is not a revolutionary insight, especially for a developer. We constantly use a variety of tools, such as programming languages, that have strengths and weaknesses. Why are LLMs so different? It seems foolish to claim they have zero strengths.

HarHarVeryFunny
0 replies
2h59m

You might be surprised at the number of large companies who think that "GenAI" can be used to replace programmers due to non-technical executives having got the impression that a competency of LLMs is writing code ...

Of course they do have uses, but more related to discovery of APIs and documentation than actually writing code (esp. where bugs matter) for the most part.

I also have to wonder how long until open source code (e.g. GPL'd) regurgitated by LLMs and incorporated into corporate code bases becomes an issue. The C suite dudes seem concerned about employees using LLMs that may be publicly exposing their own company's code base, but illogically unworried about the reverse happening - maybe just due to not fully understanding the tech.

throwuwu
0 replies
3h25m

Bad metaphor, and worse that you use it to inform your conclusion. If you must have one, then use training wheels: experts don't need training wheels, beginners do, simple as that. Though the utility of LLMs goes much further, as the author points out; it can often make the boring or tedious parts easier, so you can add assistive pedaling to the metaphor. To carry it to the end, once you have four wheels and a motor it's not long before someone invents the car.

esafak
0 replies
4h2m

I don't understand which part was devastating.

bbor
0 replies
1h20m

Like… hiring people? I think it's a little ridiculous to say "no, don't use that screwdriver, go hire a workman instead." To say the least, there are some sizeable economic differences between those two options.

antonvs
0 replies
3h22m

This seems like a very impractical perspective to me. If there really was a "hardware store" that we could head off to to get what we need, it might be different. But in general, that's not the case.

There can also be significant overhead in looking elsewhere for a solution. That's a big part of why so many developers reinvent things. This is often dismissed as NIH syndrome, but there's more to it than that.

You raised "introductory macroeconomics". The economic effect that will most strongly apply in the case of LLMs is that of the technology treadmill (Cochrane 1958): when there's a tool that can improve productivity, it will be used competitively so that those who don't use it effectively, to improve their productivity, will be outcompeted.

This seems like an unavoidable result in typical capitalist economies.

Your point about leveraging hungry humans would require strong incentives to overcome the treadmill effect. Most Western countries don't have many ways to implement anything like that. The closest thing might be unions, but of course most software development is not unionized.

Symmetry
0 replies
3h22m

In a frictionless market it would make sense to do that. But as Coase pointed out nearly a century ago[1] forming and monitoring the relationships that allow specialization involves a certain amount of overhead. At a certain point going through the hiring or vetting process to utilize another person's skills makes sense, but it looks like the author is very far from that point.

[1]https://en.wikipedia.org/wiki/The_Nature_of_the_Firm

habibur
19 replies
5h49m

How many of us remember that at the beginning of last year the fear was that programming by programmers would become obsolete by 2024 and LLMs would be doing all the work?

How much has changed?

pjmlp
5 replies
5h21m

The compiler is still not part of the picture; when LLMs start being able to produce binaries straight out of prompts, then programmers will indeed be obsolete.

This is the holy grail of low-code products.

_heimdall
2 replies
4h46m

Why is an unauditable result the holy grail? Is the goal to blindly trust the code generated by an LLM, with at best a suite of tests that can only validate the surface of the black box?

pjmlp
1 replies
3h47m

Money. Low-code is the holy grail where businesses no longer need IT folks, or at the very least can reduce the number of FTEs they need to care about.

See all the SaaS products, without any access to their implementation, programmable via graphical tooling or orchestrated via Web API integration tools, e.g. Boomi.

_heimdall
0 replies
3h9m

Is it no different to you when the black box is created by an LLM rather than a company with guarantees of service and a legal entity you can go after in case of breach of contract?

Where does the trust in a binary spit out by an LLM come from? The binary is likely unique and therefore your trust can't be based on other users' experience, there likely isn't any financial incentive or risk on the part of the LLM should the binary have bugs or vulnerabilities, and you can't audit it if you wanted to.

goatlover
1 replies
4h23m

So who is instructing the LLMs on what sort of binaries to produce? Who is testing the binaries? Who is deploying them? Who is instructing the LLMs to perform maintenance and upgrades? You think the managers are up for all that? Or the customers who don’t know what they want?

pjmlp
0 replies
3h49m

Just like offshoring nowadays, you take the developers out of the loop, and keep PO, architects and QA.

Instead of warm bodies somewhere on the other side of the planet, it is a LLM.

delusional
4 replies
5h42m

I remember 10 years ago when the fear was that cheaper programmers in developing countries (India mostly) would be doing all the programming.

It's just a scam to keep you scared and stop you from empathizing with your fellow workers.

plagiarist
0 replies
3h58m

That's legit. I've managed to dodge it but many jobs have moved overseas. Many of my coworkers the past years have been contractors living in other countries.

This is what happened to America's manufacturing industry. Shouldn't empathizing with fellow workers mean recognizing the pattern instead of dismissing it as FUD?

concordDance
0 replies
5h3m

> It's just a scam to keep you scared and stop you from empathizing with your fellow workers.

I am quite unconvinced this is the reason. Seems rather conspiratorial.

bl0rg
0 replies
5h31m

I don't think it's a scam or a conspiracy. It's human nature to worry, and when given a reasonable-sounding but scary idea we tend to spread it to others.

FrustratedMonky
0 replies
4h55m

Outsourcing was more of a threat than AI. And a lot of jobs really did move. It is still a real thing, not that many programming jobs moved back to the states.

tarruda
1 replies
4h46m

Conclusion in the blog post says it all:

> I regret to say it, but it's true: most of today's programming consists of regurgitating the same things in slightly different forms. High levels of reasoning are not required. LLMs are quite good at doing this, although they remain strongly limited by the maximum size of their context. This should really make programmers think. Is it worth writing programs of this kind? Sure, you get paid, and quite handsomely, but if an LLM can do part of it, maybe it's not the best place to be in five or ten years.

Zambyte
0 replies
4h13m

> I regret to say it, but it's true: most of today's programming consists of regurgitating the same things in slightly different forms.

I wonder how different this would be if software was not hindered by "intellectual property" laws.

ryanklee
1 replies
5h9m

Nobody of any interest said this. This is something you are saying now using a thin rhetorical strategy meant to make you look correct over an opponent that doesn't exist.

FrustratedMonky
0 replies
4h57m

That's like the people saying, "they said the ice caps would melt, ha, hasn't happened, all fake". Meanwhile, nobody said that.

fhd2
0 replies
5h27m

I wouldn't call myself an expert, but my gut tells me we're close to a local maximum when it comes to the core capabilities of LLMs. I might be wrong of course. If I'm right, I don't know when or if we'll get out of that. But it seems the work of putting LLMs to good use is gonna continue for the next years regardless. I imagine hybrid systems between traditional deterministic IDE features and LLMs could become way more powerful than what we have today. I think for the foreseeable future, any system that's supposed to be reliable and well understood (most software, I hope) will require people willing and capable of understanding it; that's in my mind the core thing programmers are and will continue to be needed for. But anyway: I do expect fewer programmers will be needed if demand remains constant.

As for demand, that's difficult to predict. I'd argue a lot of software being written today doesn't really need to be written. Lots of weird ideas were being tried because the money was there, pursuing ever new hypes, with an entire sub industry building ever more specialised tools fueling all this. And with all that growth, ever more programmers have been thrown at dysfunctional organisations to get a little more work done. My gut tells me that we'll see less of that in the next years, but I feel even less competent to predict where the market will go than where the tech will go.

So long story short, I guess we'll still need programmers until there's a major leap towards GAI, but less than today.

concordDance
0 replies
5h38m

Can't say I saw anyone thinking programmers would be obsolete by 2024...

ben_w
0 replies
5h31m

I remember some people were saying things in vaguely but not explicitly that direction, but given OpenAI were "we're not trying to make bigger models for now, we're trying to learn more about the ones we've already got and how to make sure they're safe" I dismissed them as fantasists.

What has happened is GPT-4 came out (which is certainly better in some domains but not everywhere), but mainly the models have become much cheaper and slightly easier to run, and people are pairing LLMs with other things rather than using them as a single solution for all possible tasks — which they probably could do in principle if scaled up sufficiently, but there may well not be enough training data and there certainly aren't computers with enough RAM.

And, like with the self-driving cars, we've learned a lot of surprising failure modes.

(As I'm currently job-hunting, I hope what I wrote here is true and not just… is "copium" the appropriate neologism?)

aulin
0 replies
5h39m

So far I've seen bad programmers create more (and possibly worse) bad code and good ones use LLM to their advantage.

anotherpaulg
19 replies
4h13m

> The code was written mostly by doing cut & paste on ChatGPT…

I am constantly shocked by how many people put up with such a painful workflow. OP is clearly an experienced engineer, not a novice using GPT to code above their knowledge. I assume OP usually cares about ergonomics and efficiency in their coding workflow and tools. But so many folks put up with cutting and pasting code back and forth between GPT and their local files.

This frustrating workflow was what initially led me to create aider. It lets you share your local git repo with GPT, so that new code and edits are applied directly into your files. Aider also shares related code context with GPT, so that it can write code that is integrated with your project. This lets GPT make more sophisticated contributions, not just isolated code that is easy to copy & paste.

The result is a seamless “pair programming” workflow, where you and GPT are editing the files together as you chat.

https://github.com/paul-gauthier/aider

adamgordonbell
3 replies
4h6m

I like aider. But is there a way to use it to just chat about the code?

I use LLMs to chat about pros and cons of various approaches or rubber duck out problems. I need to copy code over for that, but I've not found aider good for these kinds of things, because it's all about applying changes.

I usually have several back and forths about the right way to do things and then maybe apply some change.

anotherpaulg
2 replies
3h55m

Glad to hear you're finding aider useful!

Sure, there's a few things you could keep in mind if you just want to chat about code (not modify it):

1. You can tell GPT that at the start of the chat. "I don't want you to change the code, just answer my questions during this conversation."

2. You can run `aider --dry-run` which will prevent any modification of your files. Even if GPT specifies edits, they will just be displayed in the chat and not applied to your files.

3. It's safe to interrupt GPT with CONTROL-C during the chat in aider. If you see GPT is going down a wrong path, or starting to specify an edit that you don't like... just stop it. The conversation history will reflect that you interrupted GPT with ^C, so it will get the implication that you stopped it.

4. You can use the `/undo` command inside the chat to revert the last changes that GPT made to your files. So if you decide it did something wrong, it's easy to undo.

5. You can work on a new git branch, allow GPT to muck with your files during the conversation and then simply discard the branch afterwards.

spenczar5
0 replies
3h4m

Can I recommend an additional option? I would enjoy being able to enable a “confirm required” mode which presents the patch that will be applied, and offers me the chance to accept/reject it, possibly with a comment explaining the rejection.

adamgordonbell
0 replies
3h32m

Awesome, these might help.

What I feel like I want is /chat where it still sends the context, but the prompt is maybe changed a little, to be closer to a chatgpt experience.

I haven't dug into the prompt aider is using though, so I could be wrong.

Great tool for refactoring changes though! Keep up the good work.

electroly
2 replies
2h52m

I really like the idea of aider but when I tried it, it didn't work. The first real life file I tried it on was too big and it just blew up. The second real life file I tried was still too big. I was surprised that aider doesn't seem to have the ability to break down a large file to fit into the token limit. GPT's token limit isn't a very big source file. If I have to both choose the files to operate on and do surgery on them so GPT doesn't barf, am I saving time vs. using Copilot in my IDE? Going into it, I had thought that coping with the "code size ≫ token limit" problem was aider's main contribution to the solution space but I seem to have been wrong about that.

I hope to try aider again but it's in the unfortunate category of "I have to find a problem and a codebase simple enough that aider can handle it" whereas Copilot and ChatGPT come to me where I am. Copilot and ChatGPT help me with my actual job on my real life codebase, warts and all, every day.

ilaksh
0 replies
2h15m

Try again since the token limit increased in November by a factor of 16 (128000 now for GPT-4 Turbo 1106 preview instead of 8000 for GPT 4).

anotherpaulg
0 replies
2h19m

I'm sorry to hear you had a rough experience trying aider. Have you tried it since GPT-4 Turbo came out with the 128k context window? Running `aider --4-turbo` will use that and be able to handle larger individual source code files.

Aider helps a lot when your codebase is larger than the GPT context window, but the files that need to be edited do have to fit into the window. This is a fairly common situation, where your whole git repo is quite large but most/all of the individual files are reasonably sized.

Aider summarizes the relevant context of the whole repo [0] and shares it along with the files that need to be edited.

The plan is absolutely to solve the problem you describe, and allow GPT to work with individual files which won't fit into the context window. This is less pressing with 128k context now available in GPT 4 Turbo, but there are other benefits to not "over sharing" with GPT. Selective sharing will decrease token costs and likely help GPT focus on the task at hand and not become distracted/confused by a mountain of irrelevant code. Aider already does this sort of contextually aware selective sharing with the "repo map" [0], so the needed work is to extend that concept to a sub-file granularity.

[0] https://aider.chat/docs/repomap.html#using-a-repo-map-to-pro...

lhl
1 replies
1h28m

I've given Aider and Mentat a go multiple times and for existing projects I've found those tools to easily make a mess of my code base (especially larger projects). Checkpoints aren't so useful if you have to keep rolling back and re-prompting, especially once it starts making massive (slow token output) changes. I'm always using `gpt-4` so I feel like there will need to be an upgrade to the model capabilities before it can be reliably useful. I have tried Bloop, Copilot, Cody, and Cursor (w/ a preference towards the latter two), but inevitably, I end up with a chat window open a fair amount - while I know things will get better, I also find that LLM code generation for me is currently most useful on very specific bounded tasks, and that the pain of giving `gpt-4` free-reign on my codebase is in practice, worse atm.

anotherpaulg
0 replies
56m

There is a bit of learning curve to figuring out the most effective ways to collaboratively code with GPT, either through aider or other UXs. My best piece of advice is taken from aider's tips list and applies broadly to coding with LLMs or solo:

Large changes are best performed as a sequence of thoughtful bite sized steps, where you plan out the approach and overall design. Walk GPT through changes like you might with a junior dev. Ask for a refactor to prepare, then ask for the actual change. Spend the time to ask for code quality/structure improvements.

https://github.com/paul-gauthier/aider#tips

ParetoOptimal
1 replies
1h45m

I currently use gptel which inserts into my buffer directly and has less friction than copy paste.

Aider seems super cool, will check it out. What kind of context from the git repo does it share?

anotherpaulg
0 replies
1h24m

Glad to hear you'll give aider a try. Here's some background on the "repo map" context that aider sends to GPT:

https://aider.chat/docs/repomap.html

BaculumMeumEst
1 replies
3h58m

For one thing the ChatGPT web interface is useful for much more than just programming. If you're already paying for a sub, it makes sense to cut and paste instead of making additional payments for the API. On top of that people have different thresholds for the efficiency gains that warrant becoming dependent on someone else's project, which is liable to become paid or abandoned.

danielbln
0 replies
2h23m

Yeah, I can ask ChatGPT to "do some web research, and validate the approach/library/interface/whatever", which is a useful feature to me.

jeswin
0 replies
3h59m

Shameless plug: https://github.com/codespin-ai/codespin-cli

It's similar to aider (which is a great tool btw) in goals, but with a different recipe.

ilaksh
0 replies
2h11m

Thanks for making aider. I use it all the time. It's amazing.

antonvs
0 replies
3h56m

If I was doing it all the time, I might care. As it is I don't really find that workflow painful. It reminds me of the arguments about how much it helps to be able to touch type or type very fast. Actually inputting code is a minor part of development IME.

aaronscott
0 replies
1h30m

Like others, wanted to say thank you for writing Aider.

I think you've done a fantastic job of covering chat and confirmation use cases with the current features. Comments on here may not reflect the high satisfaction levels of most of your software users :)

Aider helps put into practice the use cases that antirez refers to in their article. Especially as someone gets better at "asking LLMs the right questions", as antirez puts it.

_giorgio_
0 replies
2h15m

I edit cell-sized code that uses Colab GPUs, asking a lot of questions to ChatGPT, so copying and pasting is not a problem for me.

FiberBundle
0 replies
4h2m

> OP is clearly an experienced engineer

You think? He's the creator of Redis.

andyjohnson0
13 replies
5h55m

> These are all things I do not want to do, especially now, with Google having become a sea of spam in which to hunt for a few useful things.

Seriously, just don't use Google for search. Google search is just a way to get you to look at their ads.

Use a search engine that is aligned with your best interests, suppresses spammy sites, and lets you customise what you want it to surface.

I've used chatgpt as a coding assistant, with varying results. But my experience is that better search is orders of magnitude more useful.

wrkronmiller
11 replies
5h52m

> Use a search engine that suppresses spammy sites and lets you customise what you want it to surface.

Can you give an example of such a search engine? Which one(s) do you use and why?

dewey
6 replies
5h49m

Kagi does that, switched to it full time after years of giving alternatives like DDG a go and failing. Can recommend!

times_trw
5 replies
5h43m

Can you give an example where kagi is better than google?

I've tried a couple of searches on the free tier and they gave pretty much the same results. I only have so many free searches to check too.

dewey
4 replies
5h38m

It allows me to remove websites from the results. That’s already one of the main selling points for me.

times_trw
3 replies
5h29m

Sure, but can you please give me an example since I only have so many searches and I've switched over to chatgpt for most of my former googling tasks.

a12k
1 replies
4h54m

Google has been inundated with SEO spam, and sometimes I want current things so LLMs don't work that well. One example is I was buying a ... wait, actually I was putting together some examples for you to compare Kagi (I am an unlimited subscriber) to Google directly, and none of them work now. My Google results for things like "best running shoes 2024" or things like that returned basically the same results as Kagi, pushing sites like Reddit and Wirecutter and REI blog and other known-good blogs to the top. Tried this in Private Browsing as well.

This is definitely a departure because when I subscribed to Kagi a couple months ago, all of my Google results for similar searches were SEO spam blogs filled with Amazon affiliate links that look like they had just sucked some Amazon reviews automatically into some poor facade to generate affiliate revenue.

These results were a surprise to me. Not sure what changed.

times_trw
0 replies
4h31m

Yes, that's what I was getting too.

I imagine what changed is that Kagi started getting traction on sites like here and some managers at Google actually did something about it.

My own test, "voynich illuminated manuscript", used to give nothing but Pinterest spam on Google. Now there is just one result from Pinterest in Google, while pretty much every result in Kagi is from Pinterest.

There is an academic tab which seems interesting. I will give it a try later.

Terretta
0 replies
5h24m

Above, I suggested paying for Kagi. A search engine is more than just SERPs:

https://blog.kagi.com/kagi-features

If you prefer LLMs to Googling, then at least consider "phind":

https://www.phind.com/search?home=true

visarga
0 replies
3h20m

I use phind.com, but perplexity.ai also works well

thenevermind
0 replies
2h31m

I see a lot of recommendations for kagi, but no mention of brave search - specifically the (beta) feature called “goggles”. Afaiu it’s a blend of kagi’s “lenses” and the site ranking in search results.

https://search.brave.com/help/goggles

There is a list (search) of public goggles: https://search.brave.com/goggles

The goggles themselves are just text files with a basic syntax and can be hosted on e.g. a GitHub gist (though you have to publish them to Brave).

https://github.com/brave/goggles-quickstart/blob/main/goggle...

Tbh, I can't really compare Brave Search to Kagi, since I never used Kagi (though I'm using Orion, the WebKit-based browser from the same dev, and love it). Afaik, Brave Search uses its own index, which makes the results somewhat limited and inferior to Kagi's. Just wanted to throw out a (free) alternative that works for me. :)

* Note that Brave Search, despite being privacy oriented, is still ad funded, and there were a few controversies about Brave's (browser) privacy in the past. (if that's relevant for you)

* I’m not affiliated with Brave in any way.

andyjohnson0
0 replies
5h47m

Kagi. You have to pay - but it prioritises based on content, not ads, and it lets you pin / emphasise / deemphasise / block sites according to your needs.

(No connection with kagi.com except being a very satisfied user)

Terretta
0 replies
5h27m

Pay for Kagi. It's a tool.

Pay for it, so search results are the product, instead of an ad platform sold to advertisers with you as the product.

eurekin
0 replies
5h47m

I like to use ChatGPT for the easy stuff: things I forgot, do too rarely to remember, or code in another language very similar to one I already know.

I do quickly run into bumps, where search is necessary (a lot of times it's some variant of a breaking change in a dependent library). Once I find a good enough issue description, I just slap that back into chatgpt. It handles it very well and sticks for the rest of the conversation. Somehow chatgpt is aware that the context information takes precedence over trained data.

I also have the Kagi subscription, which I'm using for above. I'm very happy with both tools working in tandem and am genuinely happy about that kind of time spending

jgalt212
11 replies
5h45m

this erudite fool is at our disposal and answers all the questions asked of them,

Yes, but I have to double-check every answer. And that, for me, greatly mitigates or entirely negates their utility. Of what value is a pocket calculator that only gets the right answer 75% of the time, when you don't know ex ante which 75%?

antirez
6 replies
5h42m

Programming is special because 99% of the time you can tell immediately if something works or not, so the risk of misinformation is very narrow.

xyproto
2 replies
5h32m

Hi! I'm a big fan of Redis and also the little Kilo editor you wrote.

But I have to disagree on this point, since many programs written in e.g. C have security issues that take a long time to discover.

apwell23
1 replies
5h30m

Doesn't seem very secure if it's entirely dependent on humans checking it manually. Humans are famously fallible.

xyproto
0 replies
4h57m

I am glad you agree with my point that 99% of the time you can not immediately tell if code works or not.

couchand
1 replies
4h16m

Perhaps you've omitted some important context here, or you're using an extremely restricted definition of "works"? The interesting and hard question with software is not "did it compile" but rather "did it meet the never-clearly-articulated needs of the user"...

I would agree that it is a primary goal of software engineering to move as much as possible into the category of automatic verification, but we're a long, long way from 99%.

Verdex
0 replies
3h1m

I agree with your point here.

I think that antirez is technically correct in that there is a vast amount of code that will not compile compared to the amount of code that will compile. So saying '99%' sort of makes sense.

But that doesn't capture the fact that of the code that compiles there is a vast amount of code that doesn't do what we want to happen at runtime compared to the code that does do what we want to happen.

And after that there is a vast amount of code that doesn't do what we want to happen 100% of the time at runtime compared to the code that only most of the times does what we want to happen at runtime.

The interesting thought experiment that came to me when thinking about this was that I would be more likely to trust LLM code in C# or Rust than I would be to trust LLM code in assembly or Ruby.

Which makes me wonder ... can LLMs write working Idris or ATS code?

aulin
0 replies
2h8m

I've seen people put untested AI hallucinations up for review, with non-existent function names, passing CI just because it was under debug defines.

I've seen some refer to non-existent APIs while discussing migration to a new library major version: "Sure, that's easy, we should just replace this function with this new one".

Imagine all those more subtle bugs that are harder to spot.

brigadier132
1 replies
5h0m

- I can read the code and reading code is faster than writing it.

- I can also tell the llm to write tests for the code it wrote and i can validate that the tests are valid.

- LLMs are also valuable in introducing me to concepts and techniques I would never have had exposure to. For example, if I explain my problem, it will bring up technologies or terms I never considered because I just didn't know about them. I can then do research into those technologies to decide if they are actually the right approach.

jgalt212
0 replies
4h21m

I can also tell the llm to write tests for the code it wrote and i can validate that the tests are valid.

If I don't trust the generated code, why should I trust the generated code that tests the generated code?

baq
1 replies
5h36m

As long as P != NP, verification should be much easier than producing a solution.

Or, from a different angle - all models are wrong, some are useful.

As it happens, LLMs are useful even if they're sometimes wrong.

jgalt212
0 replies
4h19m

As long as P != NP, verification should be much easier than producing a solution.

Perhaps so. I guess it depends on how long it takes to code up property-based tests.

https://hypothesis.readthedocs.io/en/latest/
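
For what it's worth, for pure functions the property test can be tiny. A minimal sketch with Hypothesis, where llm_sort stands in for whatever the model generated and the built-in sorted() acts as the trusted oracle:

  # Check an LLM-written function against a trusted oracle.
  from hypothesis import given, strategies as st

  def llm_sort(xs):        # placeholder for whatever the model produced
      return sorted(xs)

  @given(st.lists(st.integers()))
  def test_llm_sort_matches_builtin(xs):
      assert llm_sort(xs) == sorted(xs)

That only helps when there is a cheap oracle or invariant to check against, of course; otherwise you are back to reading the generated tests yourself.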

netcraft
9 replies
4h57m

When it comes to programming, I agree completely. The sweet spot for any use of LLMs is that you already know enough about the subject to verify the work - at least the output - and know how to describe in detail (ideally only the salient details) what you want. Huge +1 to it helping me do things faster, do things that I wouldn't have otherwise done, or using it for throwaway, mostly inconsequential yet valuable programs.

But another area I have found it extremely helpful in is exploring a new topic entirely, programming or otherwise. Telling it that I dont know what im talking about, don't necessarily need specifics, but here is what I want to talk about and want it to help me think through.

Especially if you are the kind of person who is willing to take what you hear and do more research or ask more questions. The entrance to so many fields and subjects is just understanding the basic jargon, listening for the distinctions being made and understanding why, and knowing who the authorities are on the subject.

j4yav
4 replies
4h22m

And it's equally and inversely harmful to junior developers who keep prodding it until it generates an abomination they don't understand that manages to pass the build. People who are learning need help, but the kind of help that LLMs in copilot form provide aren't the right fit.

It would be interesting to train a copilot model that is specifically intended to ask clarifying questions and be a partner in determining a solution, rather than doing its best to generate code for a vague or incorrectly specified question from a junior.

makk
1 replies
3h35m

Have you tried prompting it to ask clarifying questions and be that partner? Perhaps no (extra) training required.

j4yav
0 replies
2h13m

In my opinion the junior developers are not equipped to guide their teacher. If they knew they were asking to turn an incorrect assumption into code in the first place, they already wouldn’t believe the confident hallucination they get in reply.

CaptainFever
1 replies
1h54m

And it's equally and inversely harmful to junior developers who keep prodding it until it generates an abomination they don't understand that manages to pass the build.

This sounds like shotgun debugging.

j4yav
0 replies
1h41m

While on LSD in this case.

anileated
1 replies
3h0m

Last month I tried to use LLMs for things I didn’t know and couldn’t easily find. Every time they were either subtly wrong or outright hallucinated premises which led me to waste time until I realized they were wrong.

If not for the unwarranted confidence in incorrect responses, I could say they were at least not much worse than what I could piece together from what I knew. As it stands, they are OK filling in for a rubber duck and as autocomplete.

_giorgio_
0 replies
2h19m

Surely a bad LLM, not chatGPT 4

netcraft
0 replies
4h41m

I'll add another thought here - what I really want many times is a custom LLM like GPT, but trained on a particular language or framework or topic. I would love to go to a website for a new language and be able to talk about its documentation and ask questions of an LLM to help me understand. Huge bonus points if it was trained on real world code examples of that language or framework and I could have it help me write a new program or function right there. More bonus points if its tied in with an online repl where it can help me right inline.

itomato
0 replies
3h32m

The One and Six Pagers I have it write based on loose criteria help me refine the criteria and, in some cases, uncover methods that would not have been evident otherwise.

apwell23
5 replies
5h33m

I use chatgpt as my thinking partner writing code. I chat with it all day everday to finish work.

My company has approved copilot but Copilot autocomplete has been an awful experience. company hasn't approved copilot chat ( which is what i need) .

But I would love something similar that can run on my laptop for my code to generate unit tests, code comments ect ( ofcourse with my input and guidance).

deergomoo
1 replies
5h15m

My company has approved copilot but Copilot autocomplete has been an awful experience

I had the same experience, I feel like I must be crazy because so many of my colleagues have been singing its praises. I found it immensely distracting and disabled it again after a couple of days.

It was like having someone trying to finish my sentence while I was still speaking; even when they were right, it was still annoying and knocked me out of my flow (and very often, it wasn’t right).

fipar
0 replies
5h1m

I actually find copilot quite useful but I use it from emacs and it only provides suggestions when I intentionally hit my defined shortcut for it, so it never gets in the way. It may be worth for you to try setting it up in a similar way in the tool you’re using it from, as I agree I’d find the experience awful if it was always trying to autocomplete my sentences.

cassianoleal
1 replies
4h10m

If you use VS Code or a JetBrains IDE, Continue works well with Ollama and it’s really easy to get going.

[0] https://continue.dev/

[1] https://ollama.ai/

unshavedyak
0 replies
3h15m

Any opinion on what the best experience is, currently?

moffkalast
0 replies
4h52m

Fwiw, there are now some local models that rival 3.5-turbo in code chat, like the Codeninja I tried out the other day. Not nearly as good as 4 which iirc runs the copilot backend, but for sensitive data that can't leave the premises it's the only real option. Or getting a dedicated instance from OpenAI I guess.

Wowfunhappy
5 replies
3h43m

For the past few days, I have been trying to fix a bug in a closed-source Mac app. I otherwise love the app, but this bug has been driving me crazy for years.

I was pretty sure I knew which Objective-C method was broadly responsible for the bug, but I didn't know what that method did, and the decompiled version was a nonsensical mess. I felt like I'd hit a wall.

Then I thought to feed the decompiler babble to GPT-4 and ask for a clean version. The result wasn't perfect, but I was able to clean it up. I swizzled the result into the app, and I'm pretty sure the bug is gone. (I never found reproduction steps, but the problem would usually have occurred by now.)

I never could have done this without GPT-4.

HarHarVeryFunny
4 replies
3h15m

This sounds rather like the junior/bad developer who makes a bug disappear (at least for time being) by changing the order of functions in a source file or some such.

Admittedly a complete rewrite of a piece of code, even without understanding what you are doing (e.g by using an LLM), is unlikely to have the same bugs as the original implementation (but may have different bugs), but hopefully no-one is doing this for code where bugs have any significant consequence (e.g. system downtime, cost to customers).

Wowfunhappy
3 replies
3h11m

Just to be clear, I do understand the new version of the method. I don't entirely know how it fits into the larger system, but that's to be expected when I literally don't have the source code.

When I cleaned it up, I took out some complexity which I believe was responsible for the bug, at the cost of some performance. According to GPT-4, the original version was checking file descriptors to decide when to do work. My version just does the work every 5ms.
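
Translated very roughly into Python (the real code is Objective-C inside a closed-source app, so this is only a sketch of the shape of the change, not the actual implementation), it went from readiness-driven to interval-driven:

  # Rough sketch only -- not the app's real code.
  import select, time

  def original_style(fd, do_work):
      # only do work when the file descriptor is readable
      while True:
          readable, _, _ = select.select([fd], [], [])
          if readable:
              do_work()

  def simplified_style(do_work):
      # just do the work every 5 ms, at some cost in wasted wakeups
      while True:
          do_work()
          time.sleep(0.005)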

ruszki
2 replies
2h43m

So the parent commenter tried to tell you, that they (and I too) heard this story from junior and bad programmers in the past 20 years all the time, and they didn’t use LLMs. It doesn’t matter whether you use generative AIs or not, it’s a bad way of thinking, and long term it’s not beneficial to anybody. The real problem is that you didn’t dig deeper when you figured out that that code change fixed the problem.

Wowfunhappy
1 replies
2h34m

But I actually think this is a reasonable way to fix a hard-to-pin-down bug in any context, at least temporarily. (In my case, I don't intend to go back because it's not my app and mostly for personal use, but that's beside the point.)

There was a tradeoff between performance and complexity. The high-performance, high-complexity version was buggy, so I switched to a simpler option at the cost of some performance.

This isn't where the LLM was significant. The LLM was able to make sense of unreadable decompiled code, similar to how the author had ChatGPT translate from compiled assembly code back to C. (Giving GPT-4 the actual assembly never occurred to me, in hindsight I should have tried that first.)

ruszki
0 replies
2h0m

My job is exactly to fix code which was not understood by its creators. And this “I have no idea why, but it works” (until it doesn’t) is the main cause of most of the problems which I work on.

For example, at my current company the developer who introduced a "clever" navigation system didn't know how HTML forms should be used, or why servers still allowed what they did. It worked. Now, 20+ years later, that sole developer's stupid decision and lack of HTML best practices will cost my company a few million dollars (and has probably already cost a few million). A missing day of learning (and, by the way, a clear sign that that developer should never have been trusted with this task).

Senior developers learn this, and I've never seen good developers be satisfied and say "yeah, I fixed it" when they don't completely understand the what and the how, even when that understanding isn't strictly necessary. They have burned themselves enough times.

miki123211
3 replies
4h42m

I think the most under-appreciated aspect of LLMs, one on which the article touched on but didn't directly address, is being the "developer that knows everything" aspect.

No matter how senior of a programmer you are, you're eventually going to encounter a technology you know very little about. You're always going to be a junior at something. Maybe you're the God of Win32, C++ and COM, but you get stuck on obscure NSIS scripts when packaging your software. Maybe you've been writing web apps for the last 25 years and sit on the PHP language committee, but then you're asked to implement some obscure ISO standard for communicating with credit card networks, and you've never communicated with credit card networks on that level before. Maybe you've been writing iOS apps since the first iPhone and Mac apps before that, spent a few years at Apple, know most iOS APIs by heart and designed quite a few yourself, but then you're asked to implement CalDAV support in your app and you don't know what CalDAV is, much less how to use it. An LLM can help you out in these situations. Maybe it won't write all the code for you, but it'll at least put you on the right track.

reactordev
1 replies
4h35m

"No matter how senior of a programmer you are, you're eventually going to encounter a technology you know very little about"

Or worse, you've filled your head with different tech and now you need to rehash and brush up on stuff you learned prior but swept under the rug for new stuff. It's a strange sensation. Naturally you just go with the median of whatever the company you work for is doing - then find yourself in this situation where it's "been a while" since you worked on CSS. Or it might take you a weekend of study to bring back those Python dataclass skills.

sime2009
0 replies
3h21m

I've found LLMs great for vague questions about functions and APIs whose details I've long forgotten. Recognising the right answer when it appears is often faster than digging through random results on google.

Salgat
0 replies
3h52m

At its heart GPT is the world's best googler. As long as you can find it on Google, an LLM can probably do a better and faster job of finding and curating the information for you.

kibibu
3 replies
5h51m

The deep coder example doesn't appear to actually be doing what the comments or the article say it does.

It appears no better than the mixtral example that it's supposedly an improvement on.

antirez
2 replies
5h49m

This is my cut & paste failure (I didn't re-check the GPT-4 output that fixed the grammar). Fixing...

times_trw
1 replies
5h25m

Do you know of an article that covers LLMs from the point of view of a tutor/study partner/reading group?

Yours is the first blog which matches my experiences with the code side of things, but I've found them even more useful in the learning side of things.

adamgordonbell
0 replies
4h56m

I wrote an article subtitled "llms flatten steep learning curves": https://earthly.dev/blog/future-is-rusty/

drubio
3 replies
3h48m

What an ending...

I have never loved learning the details of an obscure communication protocol or the convoluted methods of a library written by someone who wants to show how good they are. It seems like "junk knowledge" to me. LLMs save me from all this more and more every day.

This is depressing or tongue-in-cheek considering who he is -- the Redis creator, with an older post titled 'In defense of linked lists' -- so talking about linked lists in Rust is not "junk knowledge", nor something an LLM can run circles around a human at analyzing.

It's the best "coding nihilism as a profession" post I have read, though.

antirez
2 replies
3h43m

There is a misunderstanding going on here. A linked list is a pure form of knowledge. What we see today is an explosion of arbitrary complexity that is the fruit, mostly, of bad design. If I learn the internals of React, I'm not really understanding anything fundamental. If I get to know the subtleties of Rust semantics and then Rust goes away, I'm left with nothing: it's not like learning Lisp. Think of all the folks that used to master M4 macros in Sendmail, 30 years ago. I was saying the same back then: this is garbage knowledge.

Today we have a great example in Kubernetes, and all the other synthetic complexity out there. I'm in, instead, to learn important ML concepts, new data structures, new abstractions. Not the result of some poor design activity. LLMs allow you to offload this memorization out of your mind, to make space for distilled ideas.

gilbetron
0 replies
2h42m

Spot on - it is one of the main reasons I haven't enjoyed programming in recent years, so much of it is learning what you call "garbage knowledge". Yet another API, yet another DSL, yet another standard library. Endless reading of internal wiki pages to learn the byzantine deployment system of my current company. Even worse, when I know exactly what I want, but some little dependency or piece of tooling is bad and I spend hours, or days, trying to debug it.

I, too, find LLMs a balm for this pain. They have kind-of-basic level of knowledge, but about everything.

In short, it allows for a more efficient expenditure of mental and emotional energy!

emporas
0 replies
2h5m

To rephrase it a little bit.

Much of programming, coding and developing is done by a person who is a knowledge worker and writes code. A good proportion of code to be written will be written just once and never again. The one-off code snippet will stay in a file collecting dust forever. There is no point in trying to remember it in the first place, because without constantly using it, it will be forgotten.

LLMs can help us focus our knowledge where it really matters, and discard a lot of the ephemeral stuff. That means that we can be more of knowledge workers and less of coders. I will push it even further and state that we will become more of knowledge workers and less of coders until we will be, eventually and gradually, just knowledge workers. We will need to know about algorithms, algorithmic complexity, abstractions and stuff like that.

We will need to know subjects like that Rust book [1] writes about.

[1]https://github.com/QMHTMY/RustBook/tree/main/books

abhinavstarts
3 replies
3h48m

High levels of reasoning are not required. LLMs are quite good at doing this, although they remain strongly limited by the maximum size of their context. This should really make programmers think. Is it worth writing programs of this kind? Sure, you get paid, and quite handsomely, but if an LLM can do part of it, maybe it's not the best place to be in five or ten years

I appreciate the author writing this article. Whenever I read about the future of the field, I get anxiety and confusion, but then again the other options available to me were of less interest to me.

I am now at a place where I still have the opportunity to pivot and focus on pure/applied mathematics rather than staying in the software field.

Honestly, I wanted to make money through this career, but I don't know what career to choose now.

I keep working on myself and don't compare myself to others, but if the argument is that only the top 1% of programmers will be required in the future, then I doubt myself, because I still have a lot to learn, and then there is competing with people who are both experienced and knowledgeable.

I was thinking about pinpointing a target and then becoming an expert at it (by the 10,000-hour rule).

I'm sorry to ask, but today, and in general, I am very confused about which path/career to target related to computing or mathematics. Please suggest and give me your valuable advice. Thank you.

throwuwu
0 replies
3h8m

I wouldn't worry too much about the demand for programmers. Jevons' paradox has played out enough times that I'm sure as the cost of code goes to zero the demand will continue to increase. Look forward to the day that your toilet paper comes with an API.

thomashop
0 replies
3h3m

I'd say study deep learning (nice mix of maths and CS) or do software but learn to use the AI tools in the process.

If you're looking at a 5-10 year timeline then even pure or applied mathematics may well heavily use AI models.

We're always going to need architects that build the scaffolding together with LLMs. Programmers + LLMs will be able to outcompete programmers without. If one programmer can do more it just means projects will become more ambitious not less programming needed.

I've never worked for a company that had too little work for their software engineers. Rather many projects are on long timelines because there are only so many hours available per month.

Another analogy: with a high level programming language you can do what previously needed 10x the lines of code in assembly. I don't think they caused job losses for software engineers.

makk
0 replies
2h40m

I’m not sure you’re focused on the important question. For example: Who you marry, if you choose to go that route, may very well be the most important decision you’ll ever make.

Putting that aside, based on your question and willingness to put it out there… I would say this: just surrender to what charms you right now. Do you feel drawn toward programming? Follow that. Or math? Follow that. They may not be mutually exclusive.

As you go, stay tuned in to how you feel about the activity in the moment. Not your anxiety about what you think about the future prospects, but just how it feels right now to be doing the thing. That feeling may change over time, and it will guide you if you stay tuned in.

CaptainFever
3 replies
1h26m

I feel that I'm being too conservative with how I use AI. Currently I use Copilot Autocomplete with a bit of Copilot Chat, which is great and almost always gets small snippets correct, but I sometimes worry that I'm not using it to the full potential -- so I can be faster with my side projects -- for example, by generating entire classes.

antirez
2 replies
1h16m

In general Copilot is much weaker than bigger/slower models, so if you have the feeling you are not using AI enough, the first thing to try IMHO is to chat with powerful models like GPT-4 to see what the current state of the art in code generation is.

CaptainFever
1 replies
1h15m

Thank you for the suggestion, antirez! Unfortunately that option does cost money (Copilot is free for students), although Bing might be an alright alternative.

antirez
0 replies
1h8m

Indeed, you are right. If you have an M[1,2,3] MacBook with enough RAM, you may want to run a model like DeepSeek-coder 34B locally. Or the smaller one, but it is going to be weaker.
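
If you serve it through something like Ollama, querying the local model is a small HTTP call. A rough sketch (the model tag below is an assumption, use whatever you actually pulled):

  # Sketch: ask a locally served model a coding question via Ollama's HTTP API.
  # Assumes `ollama serve` is running and a deepseek-coder model has been pulled.
  import json, urllib.request

  payload = {
      "model": "deepseek-coder:33b",   # assumption: adjust to your local tag
      "prompt": "Write a C function that reverses a string in place.",
      "stream": False,
  }
  req = urllib.request.Request(
      "http://localhost:11434/api/generate",
      data=json.dumps(payload).encode(),
      headers={"Content-Type": "application/json"},
  )
  with urllib.request.urlopen(req) as resp:
      print(json.loads(resp.read())["response"])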

4ad
3 replies
5h49m

This might be a good article, I wouldn't know because I can't read this monospace atrocity.

Reader view in Safari preserves the monospace font... /facepalm

noelwelsh
0 replies
5h24m

I also found it looks awful, which made it hard for me to read, but Firefox reader mode did at least change the font.

antirez
0 replies
5h43m

082349872349872
0 replies
5h40m

  In order to protect your delicate sensibilities
  I would further suggest to avoid consulting most
  research output from before the mid 1980s.
eg https://www.rand.org/content/dam/rand/pubs/research_memorand...

madeofpalk
2 replies
4h12m

I have a problem, I need to quickly know something that I can verify if the LLM is feeding me nonsense. Well, in such cases, I use the LLM to speed up my need for knowledge.

This is the key insight from using LLMs in my opinion. One thing that makes programming especially well suited for LLMs is that it's often trivial to verify the correctness.

I've been toying around with this concept for evaluating whether an LLM is the right tool for the job. Graph out "how important is it that the output is correct" vs "how easy is it to verify the output is correct". Using ChatGPT to make a list of songs featuring female artists who have won an Emmy is time-consuming to verify, but it's also not very important and it's okay if it contains some errors.

blibble
0 replies
3h39m

One thing that makes programming especially well suited for LLMs is that it's often trivial to verify the correctness.

is this why software never has any bugs?

adamgordonbell
0 replies
4h9m

Yeah, exactly.

Problems where coming up with solution is hard but verifying a possible solution is easy.

And we all know what that class of problems is called.

gumballindie
2 replies
5h5m

For me LLMs revealed how easy it is to manipulate the masses with properly done marketing. Despite these tools being obviously unreliable, tens of people on here report how well they work and how much they changed their lives. It shows that with sufficient propaganda you can make people see and feel things which are not there - not a new concept. But what's new to me is just how easy it is.

bratbag
1 replies
4h50m

I'm ok with a degree of unreliability when experimenting with new ideas.

That's a tradeoff I already make when using relatively new third-party libraries/services to accelerate experimentation.

gumballindie
0 replies
4h23m

That is ok, I do it too - that's why I use tools such as chatgpt. But from that to calling it life changing there's a wide gap. The tool is nowhere near what's advertised, far from it.

fallingknife
2 replies
5h22m

I have found only a few cases where ChatGPT has been very useful to me, e.g. writing long SQL queries and certain mathematical functions like finding the area of intersection of two rectangles. And it hallucinates enough that a lot of the time I can't use it, because I know it would take more time to check it for correctness and edge cases than it would to just write it in the first place. Maybe I am using it wrong, but so far the results for me have been extremely impressive, yet not very useful.
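
To make the second example concrete, the kind of function I mean is small enough to eyeball once it is written, which is part of why it works well as a prompt. A sketch, assuming axis-aligned rectangles given as (x1, y1, x2, y2) tuples:

  # Area of intersection of two axis-aligned rectangles,
  # each given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
  def intersection_area(a, b):
      ax1, ay1, ax2, ay2 = a
      bx1, by1, bx2, by2 = b
      w = min(ax2, bx2) - max(ax1, bx1)
      h = min(ay2, by2) - max(ay1, by1)
      return w * h if w > 0 and h > 0 else 0

  print(intersection_area((0, 0, 2, 2), (1, 1, 3, 3)))  # prints 1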

plagiarist
1 replies
3h51m

I'm surprised it can do long SQL queries, I wouldn't have thought. I've been looking into PRQL or other solutions to cover that ground. Can it do reasonably complex things like window functions?

fallingknife
0 replies
3h45m

Never tried it with something like that. I just mean long as in a lot of text like selecting a lot of stuff from a lot of tables and grouping, etc.

eminence32
2 replies
4h24m

Since the advent of ChatGPT, and later by using LLMs that operate locally

Does HN have any favorite local LLMs for coding-related tasks?

kubrickslair
0 replies
1h24m

Phind-CodeLlama-34B-v2 seems to work well for our team.

duckkg5
0 replies
4h19m

deepseek-coder has been decent for my purposes

boulos
2 replies
5h22m

I really like the argument about misinformation vs testing. I'm not totally sold on "you can just see it", but I do think something like TDD could suddenly be really productive in this world.

I've found autocomplete via these systems to be improving rapidly. For some work, it's already a big boost, and it's close to a difference in kind from the original IntelliSense. Amusingly though, I primarily write in an editor without any autocomplete, so I don't experience this often. But I do, precisely for the throwaway code and lower-value changes.

Finally, it's not clear to me that the distinction is between systems programming and scripting. My sense is that ChatGPT and similar models are (a) heavily influenced by the large corpus of Python, so they're better at it than at C, and (b) the examples here involved more clever bit manipulation than most software engineers ever interact with.

082349872349872
1 replies
4h52m

the examples here involved more clever bit manipulation than most software engineers ever interact with

Perlis once quoth:

18. A program without a loop and a structured variable isn't worth writing.

After 5 minutes of thought, I'd update that, for my hacking, to:

"A program without some convergence reasoning and a non-injective change of representation isn't worth writing."

(iow, I'd be happy to let LLMs, or at least other people, wrangle glue and parsley code, according to the taxonomy of: https://news.ycombinator.com/item?id=32498382 )

boulos
0 replies
4h5m

I like the term parsley code!

I do suspect though that both the hashing and 6-bit weight examples are just extremely rare in the corpus. It wasn't confused about loops, or hashing generally, but just didn't do as well as antirez would have liked. The description of the 6-bit to "why don't I just cast this to 8-bits" thing is definitely a problem. And worse, it's a problem a more junior engineer might not understand. But I suspect that a model trained on a corpus with lots more bit manipulation would have been fine, as it wasn't complex.

Clearly we just need a fine tuned one :).

voidhorse
1 replies
4h19m

I enjoy antirez's work, and I enjoyed this essay, but I disagree with many of its conclusions. In particular:

this goal forces the model to create some form of abstract model. This model is weak, patchy, and imperfect, but it must exist if we observe what we observe.

Is a completely fallacious line of reasoning, and I'm surprised that he draws this conclusion. The whole reason the "problem of other minds" is still a problem in philosophy is precisely because we cannot be certain that some "abstract model" exists in someone's head (man or machine, do you argue it does? show it to me) simply because an output meeting certain constraints exists. This is exactly the problem of education. A student that studies to answer questions correctly on a test may not have an abstract model of the subject area at all; they may not even be conducting what we call reasoning. If a student aces a test, can you confidently say they actually understand the domain? Or did they simply ace a test?

Furthermore, LLMs' lack of consistency, their inability to answer basic mathematical questions, and their limitation to purely text-based areas of concern and representation are all much stronger arguments for siding with the notion that they really are just sophisticated, stochastic machines, incapable of what we'd normally call reason in a human context. If LLMs "reason", it is a much different form of reasoning than that which human beings are capable of, and I'm highly skeptical that any such network will achieve parity with human reason until it can "grow up" and learn embodied in a rich, multi-sensory environment, just like human beings. For machines to achieve reason, they will need to break out of the text-only/digital-only box first.

antirez
0 replies
4h7m

is still a problem in philosophy

Exactly! This is why I removed these fundamental questions from my post: at this moment they don't have any clear answer and would only make an already complex landscape even more complex. I believe that right now, whatever is happening inside LLMs, we need to focus on investigating the practical level of their "reasoning" abilities. They are very different objects than human brains, but they can do certain limited tasks that before LLMs we thought to be completely in the domain of humans.

We know that LLMs are just very complex functions interpolating their inputs, but these functions are so convoluted that in practical terms they can solve problems that were, before LLMs, completely outside the reach of automatic systems. Whatever is happening inside those systems is not really important to the way they can or can't reshape our society.

tipsytoad
1 replies
3h16m

The most useful feature of llms is how much output you get from such little signal. Just yesterday I created a fairly advanced script from my phone on the bus ride home with chatgpt which was an absolute pleasure. I think multi-prompt conversations don't get nearly as much attention as they should in llm evaluations.

danielbln
0 replies
2h17m

I suppose multi-prompt conversations are just a variation on few-shot prompting. I do agree, though, that they don't play a big enough role in evals, or in the heads of many people. So many capable engineers I know nope out of GPT because the first answer isn't satisfactory, instead of continuing the dialog.

sevagh
1 replies
4h4m

There is an impedance problem when working on a new project.

At the beginning, when there's 0% of the task done, and you need to start _somewhere_, with a hello world or a CMakeLists file or a Python script or whatever, it takes effort. Before ChatGPT/LLM, I had to pull that effort out from within myself, with my fingertips. Now, I can farm it out to ChatGPT.

It's less efficient, not as powerful as if I truly "sat down and did it myself," but it removes the cost of "deciding to sit down and do it myself." And even then, I'm cribbing and mashing together copy-pasted fragments from GitHub code search, Stackoverflow, random blog posts, reading docs, Discord, etc. After several attempts and retries, I have a "5% beginning" of a project when it finally takes form and I can truly work on it.

I sort of transition from copy-pasting ChatGPT crap to quickly create a bunch of shallow, bullshit proofs-of-concept, eventually gathering enough momentum to dive into it myself.

So, yes, it's slower, and more inefficient, and ChatGPT can't do it better than I can. But it's easier and I don't have to dig as deep. The end result is I have much more endurance in the actual important parts of the project (the middle and end), versus burning myself out on the beginning.

itomato
0 replies
3h26m

Was I digging too deeply before?

Was I asking the right questions from the beginning, and if not, can I effectively salvage my work?

Sunk costs disappear into a $20 subscription

ahgamut
1 replies
3h16m

Instead, many have deeply underestimated LLMs, saying that after all they were nothing more than somewhat advanced Markov chains, capable, at most, of regurgitating extremely limited variations of what they had seen in the training set. Then this notion of the parrot, in the face of evidence, was almost universally retracted.

I'd like to see this evidence, and by that I don't mean someone just writing a blog post or tweeting "hey I asked an LLM to do this, and wow". Is there a numerical measurement, like training loss or perplexity, that quantifies "outside the training set"? Otherwise, I find it difficult to take statements like the above seriously.

LLMs can do some interesting things with text, no doubt. But these models are trained on terabytes of data. Can you really guarantee "there is no part of my query that is in the training set, not even reworded"? Perhaps we can grep through the training set every time one of these claims is made.

skepticATX
0 replies
1h43m

Exactly. I think that it’s very hard for us to comprehend just how much is out there on the internet.

The perfect example of that is the TikZ unicorn in the Sparks paper. It seemed like a unique task, until someone found a TikZ unicorn on an obscure website.

There is plenty of evidence that LLMs struggle as you move out of distribution. Which makes perfect sense as long as you stop trying to attribute what they’re doing to magic.

This doesn’t mean they’re not useful, of course. But it means that we should should be skeptical about wild capability claims until we have better evidence than a tweet, as you put it.

082349872349872
1 replies
5h47m

At the same time, however, my experience over the past few months suggests that for system programming, LLMs almost never provide acceptable solutions if you are already an experienced programmer.

Hmm, this suggests to me that in a better world, the systems problems would have been solved with code, and the sorts of one-off problems which current LLMs do handle well would have been solved with formulae in a (shell-like? not necessarily turing-complete?) DSL.

chii
0 replies
5h4m

LLMs almost never provide acceptable solutions if you are already an experienced programmer.

Or, the other stuff GPT was producing is just as bad, but he's not experienced enough in those domains to see it, whereas the stuff he is experienced with immediately looks sus or subpar.

tmaly
0 replies
4h42m

I think LLMs are good for quick prototyping first drafts of small functions or simple systems.

For me they help when time is short and when I want to maximize creative exploration.

slowmovintarget
0 replies
1h28m

TLS cert has expired on the antirez site.

sebringj
0 replies
3h36m

Currently, what I get out of it is a good quick overview with some hallucinations. You have to actually know what you're doing to check the code. However, this is a fast-moving target, and in no time it will be doing that part as well. I think it's worth stepping back and thinking: maybe this thing is just giving us more and more agency, and what can we do with that? We need to adapt so that we don't constrain ourselves to just being programmers. We are humans with agency, and if we can adapt to this, we can become more and more powerful, using the technical insight we've gained over the years to do some really cool things. I have a startup, and with ChatGPT I've managed to do all parts of the stack with confidence and used it for all sorts of business-related things outside of coding that have really helped move the business forward quickly.

renonce
0 replies
3h31m

Homo sapiens invented neural networks

Is it just me, or did anyone else smile at this sentence? The first paragraph sounds like the academic way of saying "we invented huge neural networks but we couldn't understand them".

pluc
0 replies
5h2m

bro it's 2024, get SSL

pknerd
0 replies
4h38m

Besides using ChatGPT for certain pieces of code that use a third-party library, I have successfully used it as a "code reviewer". I recently copied the functions of a Symfony PHP controller and asked for a code review and suggestions for refactoring, with code and reasons. Surprisingly, it worked very well and I was able to refactor a good amount of code.

mercurialsolo
0 replies
5h49m

Honestly speaking, code generation is a form of augmented retrieval. And going further back, I would say human memory is generated from context rather than retrieved (which is why it's often fallible - we hallucinate details).

LLMs today are, for me, the equivalent of a large-scale human memory for code, or a faster form of augmented retrieval. Do they hallucinate details? Quite often. But do I find them more useful than dragging myself through documentation details? More often than not.

legendofbrando
0 replies
19m

This is one of the best pieces I’ve read that articulates what it’s like to work closely with LLMs as creative partners.

kvz
0 replies
4h34m

antirez, thank you for talking some sense. I've seen skilled devs discard LLMs entirely after seeing one (too many) hallucination, then proclaim they are inferior and throw the baby out with the bathwater. There is still plenty of use to be had from them even if they are imperfect.

kromem
0 replies
3h16m

Perhaps the most important point in the piece, and one that can't be repeated enough or understood enough as we head into what 2024 has in store:

And then, do LLMs have some reasoning abilities, or is it all a bluff? Perhaps at times, they seem to reason only because, as semioticians would say, the "signifier" gives the impression of a meaning that actually does not exist. Those who have worked enough with LLMs, while accepting their limits, know for sure that it cannot be so: their ability to blend what they have seen before goes well beyond randomly regurgitating words. As much as their training was mostly carried out during pre-training, in predicting the next token, this goal forces the model to create some form of abstract model. This model is weak, patchy, and imperfect, but it must exist if we observe what we observe. If our mathematical certainties are doubtful and the greatest experts are often on opposing positions, believing what one sees with their own eyes seems a wise approach.

esafak
0 replies
3h48m

LLMs are going to have to get much cheaper to train to be useful in corporations, where the questions you want to ask are going to depend on proprietary code. You can't ask "What does subsystem FooBar do, and where does it fit in the overall architecture?" You'd want to be able to continuously retrain the model, as the code base evolves.

dmezzetti
0 replies
4h27m

While there clearly was a lot of hype, retrieval augmented generation (RAG) proved to be an effective technique with LLMs. Using RAG with project documentation and/or code can be useful.
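
The core loop is simple enough to sketch. In the example below, retrieval is plain TF-IDF from scikit-learn, the docs list is made-up sample data, and ask_llm is a hypothetical placeholder for whatever model you call, local or hosted:

  # Minimal RAG sketch: retrieve the most relevant snippets, stuff them into the prompt.
  from sklearn.feature_extraction.text import TfidfVectorizer
  from sklearn.metrics.pairwise import cosine_similarity

  docs = [  # stand-in for chunks of project documentation or code
      "GET /users returns the list of users as JSON.",
      "POST /users creates a user; requires an admin token.",
      "Rate limits: 100 requests per minute per API key.",
  ]

  def retrieve(query, k=2):
      vec = TfidfVectorizer().fit(docs + [query])
      scores = cosine_similarity(vec.transform([query]), vec.transform(docs))[0]
      return [docs[i] for i in scores.argsort()[::-1][:k]]

  def ask_llm(prompt):
      raise NotImplementedError("placeholder: call whatever model you use here")

  def answer(query):
      context = "\n".join(retrieve(query))
      return ask_llm(f"Using only this documentation:\n{context}\n\nQuestion: {query}")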

cies
0 replies
5h26m

This should really make programmers think. Is it worth writing programs of this kind? Sure, you get paid, and quite handsomely, but if an LLM can do part of it, maybe it's not the best place to be in five or ten years.

Someone, a person with a sense of responsibility, has to sign off on changes to the code. LLMs have been shown to come up with answers that make no sense or contain bugs. A person (for now) needs to decide if the LLM's suggestion is acceptable, if we need more tests, and if we want to maintain it.

I think programmers will be needed for that; they will just be made more productive (as happened with the introduction of garbage collection, strongly typed languages, powerful IDEs, StackExchange, ...).

block_dagger
0 replies
1h14m

One of the areas that has sped up the most for me while using ChatGPT to code is having it write test cases. Paste it a class and it can write a pretty good set of specs if you iterate with it. Literally 10x faster than doing it myself. This speed-up can also occur with languages/frameworks I'm not familiar with.

bbor
0 replies
1h23m

  LLMs are like stupid savants who know a lot of things.
Leaving the requisite “no, that’s not what language models are, you’re misunderstanding what’s important here, the best knowledge model already exists and it’s called Wikipedia”

antirez
0 replies
2h45m

@dang something is wrong with the ranking of this post.

amclennon
0 replies
4h44m

At the same time, however, my experience over the past few months suggests that for system programming, LLMs almost never provide acceptable solutions if you are already an experienced programmer.

In one-off tasks where someone is not enough of an expert to know its flaws, and where such expertise is not required, "the marvel is not that the bear dances well, but that the bear dances at all".

IKantRead
0 replies
2h56m

This quote in particular struck me as relevant:

And now Google is unusable: using LLMs even just as a compressed form of documentation is a good idea.

Beyond all the hype, it's undeniable that LLMs are good at matching your query about a programming problem to an answer without inundating you with ads and blog spam. LLMs are, at the very least, just better at answering your questions than putting your question into Google and searching Stack Overflow.

About two years ago I got so sick of how awful Google was for any serious technical questions that I started building up a collection of reference books again just because it was quickly becoming the only way to get answers about many topics I cared about. I still find these are helpful since even GPT-4 struggles with more nuanced topics, but at least I have a fantastic solution for all those mundane problems that come up.

Thinking about it, it's not surprising that Google completely dropped the ball on AI since their business model has become bad search (i.e. they derive all their profit from adding things you don't want to your search experience). At their most basic, LLMs are just really powerful search engines, it would take some cleverness to make them bad in the way Google benefits from.

Havoc
0 replies
2h48m

I definitely mostly use it in the same way - generating discrete snippets.

Haven’t had much luck with code completion thus far.