
How I Use "AI"

tptacek
17 replies
1d14h

There's a running theme in here of programming problems LLMs solve where it's actually not that important that the LLM is perfectly correct. I've been using GPT4 for the past couple months to comprehend Linux kernel code; it's spooky good at it.

I'm a C programmer, so I can with some effort gradually work my way through random Linux kernel things. But what I can do now instead is take a random function, ask GPT4 what it does and what subsystem it belongs to, and then ask GPT4 to write me a dummy C program that exercises that subsystem (I've taken to asking it to rewrite kernel code in Python, just because it's more concise and easy to read).
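(For a concrete flavor of that last step, here is a minimal, hypothetical sketch of such a dummy exerciser - epoll chosen arbitrarily as the example subsystem, not taken from any actual session:)

    /* Hypothetical illustration: a minimal user-space program exercising the
     * epoll subsystem, the sort of throwaway exerciser this workflow produces.
     * (epoll is an arbitrary example; the comment names no subsystem.) */
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/epoll.h>

    int main(void)
    {
        int epfd = epoll_create1(0);             /* create an epoll instance */
        if (epfd < 0) { perror("epoll_create1"); return 1; }

        int pipefd[2];
        if (pipe(pipefd) < 0) { perror("pipe"); return 1; }

        struct epoll_event ev = { .events = EPOLLIN, .data.fd = pipefd[0] };
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, pipefd[0], &ev) < 0) {
            perror("epoll_ctl");
            return 1;
        }

        write(pipefd[1], "x", 1);                /* make the read end readable */

        struct epoll_event out;
        int n = epoll_wait(epfd, &out, 1, 1000); /* wait up to 1s for an event */
        printf("epoll_wait returned %d event(s)\n", n);

        close(pipefd[0]);
        close(pipefd[1]);
        close(epfd);
        return 0;
    }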

I don't worry at all about GPT4 hallucinating stuff (I'm sure it's doing that all the time!), because I'm just using its output as Cliff's Notes for the actual kernel code; GPT4 isn't the "source of truth" in this situation.

dang
6 replies
1d

This is close to how I've been using them too. As a device for speeding up learning, they're incredible. Best of all, they're strongest where I'm weakest: finding all the arbitrary details that are needed for the question. That's the labor-intensive part of learning technical things.

I don't need the answer to be correct because I'm going to do that part myself. What they do is make it an order of magnitude faster to get anything on the board. They're the ultimate prep cook.

There are things to dislike and yes there is over-hype but "making learning less tedious" is huge!

vertis
2 replies
21h33m

I'm finding myself using them extensively in the learning way, but I'm also an extreme generalist. I've learned so many languages over 23 years, but remembering the ones I don't use frequently is hard. The LLMs become the ultimate memory aid. I know that I can do something in a given language, and will recognise that it's correct when I see it.

Together with increasingly powerful speech to text I find myself talking to the computer more and more.

There are flaws, there are weaknesses, and a bubble, but any dev that can't find any benefit in LLMs is just not looking.

mordymoop
0 replies
14h44m

The advantage of using LLMs for use in coding, as distinct from most other domains, is that you can usually just directly check if the code it’s giving you is correct, by running it. And if it’s not, the LLM is often good at fixing it once the issue is pointed out.

Onawa
0 replies
20h14m

Languages, syntax, flags, and the details... I too have touched so many different technologies over the years that I understand at a high level, but don't remember the minutiae of. I have almost turned into a "conductor" rather than an instrumentalist.

Especially for debugging issues that could previously take days of searching documentation, Stack overflow, and obscure tech forums. I can now ask an LLM, and maybe 75% of the time I get the right answer. The other 25% of the time it still cuts down on debugging time by helping me try various fixes, or it at least points me in the right direction.

loufe
1 replies
23h9m

You put words to what I've been thinking for a while. When I'm still new to some technology, it's a huge time-saver. I used to have to go bother some folks on a Discord / Facebook group / Matrix chat to get the one piece of context I was hung up on. Sometimes it took hours or days to get that one nugget.

In fact, I feel more interested in approaching challenging problems, because I know I can get over those frustrating phases much more easily and quickly.

7speter
0 replies
21h12m

I came here to write essentially the same comment as you. Instead of going into a chatroom where people tell you you're lazy because you are unclear on ambiguous terms in documentation, these days I paste in portions of documentation and ask GPT for clarification on what I'm hazy about.

smusamashah
0 replies
21h7m

I use it like a dictionary (select text and look it up), and based on what I looked up and the answer, I judge for myself how correct the answers are; they are usually on point.

It has also made building small pure vanilla HTML/JS tools fun. It gives me a good enough prototype which I can mold to my needs. I have written a few very useful scripts/tools in the past few months which I otherwise would never even have started, because of all the required first steps and basic learning.

(never thought I would see your comment as a user)

atum47
3 replies
1d14h

I've been using it for all kinds of stuff. I was using a drying machine at a hotel a while ago and wasn't sure about the icon it was displaying on the panel regarding my clothes, so I asked GPT and it told me correctly. It has read all the manuals and documentation for pretty much everything, right? Better than Googling it, since you can just ask for the exact thing you want.

tkgally
0 replies
1d11h

I used LLMs for something similar recently. I have some old microphones that I've been using with a USB audio interface I bought twenty years ago. The interface stopped working and I needed to buy a new one, but I didn't know what the three-pronged terminals on the microphone cords were called or whether they could be connected to today's devices. So I took a photo of the terminals and explained my problem to ChatGPT and Claude, and they were able to identify the plug and tell me what kinds of interfaces would work with them. I ordered one online and, yes, it worked with my microphones perfectly.

throwaway4aday
0 replies
17h47m

It's surprisingly good at helping diagnose car problems as well.

7speter
0 replies
21h7m

My washing machine went out because of some flooding. I gave ChatGPT all of the diagnostic codes and it concluded that it was probably a short in my lid lock.

The lid lock came a few days later, I put it in, and I'm able to wash laundry again.

viraptor
1 replies
1d14h

The two best classes for me are definitely:

- "things trivial to verify", so it doesn't matter if the answer is not correct - I can iterate/retry if needed and fallback to writing things myself, or

- "ideas generator", on the brainstorming level - maybe it's not correct, but I just want a kickstart with some directions for actual research/learning

Expecting perfect/correct results is going to lead to failure at this point, but it doesn't prevent usefulness.

tptacek
0 replies
1d14h

Right, and it only needs to be right often enough that taking the time to ask it is positive EV. In practice, with the Linux kernel, it's more or less consistently right (I've noticed it's less right about other big open source codebases, which checks out, because there's a huge written record of kernel development for it to draw on).

seanhunter
1 replies
1d8h

Exactly. It's similar in other (non programming) fields - if you treat it as a "smart friend" it can be very helpful but relying on everything it says to be correct is a mistake.

For example, I was looking at a differential equation recently and saw some unfamiliar notation[1] (Newton's dot notation). So I asked Claude why people use Newton's notation vs Lagrange's notation. It gave me an excellent explanation with tons of detail, which was really helpful. Except in every place it gave me an example of "Lagrange" notation, it was actually in Leibniz notation.

So it was super helpful and it didn't matter that it made this specific error because I knew what it was getting at and I was treating it as a "smart friend" who was able to explain something specific to me. I would have a problem if I was using it somewhere where the absolute accuracy was critical because it made such a huge mistake throughout its explanation.

[1] https://en.wikipedia.org/wiki/Notation_for_differentiation#N...

vertis
0 replies
11h42m

Once you know LLMs make mistakes and know to look for them, half the battle is done. Humans make mistakes too, which is why we make the effort to validate thinking and actions.

As I use it more and more, the mistakes are often born of ambiguity. As I supply more information to the LLM, its answers get better. I'm finding more and more ways to supply it with robust and extensive information.

skybrian
0 replies
16h11m

Yes, I like to think of LLMs as hint generators. Turns out that a source of hints is pretty useful when there's more to a problem than simply looking up an answer.

ransom1538
0 replies
21h58m

gpt: give me working html example of javascript beforeunload event, and onblur, i want to see how they work when i minimize a tab.

10 seconds later, I'm trying them out.

ein0p
11 replies
1d11h

Just today I had GPT4 implement a SwiftUI based UI for a prototype I’m working on. I was able to get it to work with minimal tweaks within 15 minutes even though I know next to nothing about SwiftUI (I’m mainly a systems person these days). I pay for this, and would, without hesitation, pay 10x for a larger model which does not require “minimal tweaks” for the bullshit tasks I have to do. Easily 80% of all programming consists of bullshit tasks that LLMs of 2024 are able to solve within seconds to minutes, whereas for me some of them would take half a day of RTFM. Worse, knowing that I’d have to RTFM I probably would avoid those tasks like the plague, limiting what can be accomplished. I’m also relieved somewhat that GPT4 cannot (yet?) help me with the non-bullshit parts of my work.

throwaway290
10 replies
1d6h

If it handles 99% of your tasks (making a smart boss fire you), know that you helped train it for that by using it/paying for it/allowing it to be trained on code in violation of license.

Even if 80% of programmer tasks in an org (or the worldwide gig market) can be handled by ML, 80% of programmers can already be laid off.

Maybe you have enough savings that you just don't need to work but some of us do!

simonw
4 replies
1d3h

There are two ways this could work out:

- LLM-assistance helps solve 80% of programming tasks, so 80% of programmers lose their jobs

- LLM-assistance provides that exact same productivity boost, and as a result individual programmers become FAR more valuable to companies - for the same salary you get a lot more useful work out of them. Companies that never considered hiring programmers - because they would need a team of 5 over a 6 month period to deliver a solution to their specific problem - now start hiring programmers. The market for custom software expands like never before.

I expect what will actually happen will be somewhere between those two extremes, but my current hope is that it will still work out as an overall increase in demand for software talent.

We should know for sure in 2-3 years time!

throwaway290
3 replies
1d1h

I like your optimism, but in programming, at least in the US, unemployment has so far already risen higher than average unemployment overall.

ML supercharges all disparity. Business owners or superstars who have made a nice career and name will earn more by commanding fleets of cheap (except for energy) LLMs, while their previous employees/reports get laid off by the tens of thousands (ironically they do it to themselves by welcoming LLMs and thinking that the next guy will be the unlucky one; same reason unions don't work there, I guess...).

And for small businesses who never hired programmers before, companies like ClosedAI monetize our work so their bosses can get full products out of chatbots (buggy for now, but give it a year). Those businesses will grow, but when they hire they will get cheap minimum-wage assistants who talk to LLMs. That's at best where most programmers are headed. The main winners will be whoever gets to provide the ML that monetizes stolen work (unless we stop them by collective outrage and copyright defense), so Microsoft.

simonw
2 replies
23h40m

I'm not sure how much we can assign blame for US programming employment to LLMs. I think that's more due to a lot of companies going through a "correction" after over-hiring during Covid.

As for "their bosses to get full products out of chatbots": my current thinking on that is that an experienced software engineer will be able to work faster with and get much higher quality results from working with LLMS than someone without any software experience. As such, it makes more sense economically for a company to employ a software engineer rather than try to get the same thing done worse and slower with cheaper existing staff.

I hope I'm right about this!

throwaway290
1 replies
10h6m

my current thinking on that is that an experienced software engineer will be able to work faster with and get much higher quality results from working with LLMs

than someone without any software experience

- So you are betting against ML becoming good enough soon enough. I wouldn't be so sure considering the large amount of money and computing energy being thrown into it and small amount of resistance from programmers.

- Actually, someone doesn't have to have zero experience. But if someone is mostly an LLM whisperer to save the boss some yacht time, instead of an engineer, they get paid accordingly: minimum wage.

simonw
0 replies
5h53m

No matter how good ML gets I would still expect a subject matter expert working with that ML to produce better results than an amateur working with that same ML.

When that’s not true any more we will have built AGI/ASI. Then we are into science fiction Star Trek utopia / Matrix dystopia world and all bets are off.

ein0p
4 replies
21h34m

Thing is though, I work in this field. I do not see it handling the non-bullshit part of my job in my lifetime, the various crazy claims notwithstanding. For that it'd need cognition. Nobody has the foggiest clue how to do that.

throwaway290
3 replies
12h58m

    1. Fire 80% of programmers
    2. Spread the 20% non-bullshit parts among the remaining 20%
    3. Use LLMs for the 80% bullshit parts
For now big companies are afraid to lay off too many, so they try to "reskill", but eventually most are redundant. No cognition needed :)

ein0p
2 replies
12h2m

Truth be told, most big tech teams could benefit from significant thinning. I work in one (at a FANG) where half the people don't seem to be doing much at all, and the remaining half shoulders all the load. The same iron law held in all big tech teams I worked in, except one, over the course of the last 25 years. If the useless half was fired, the remaining half would be a lot more productive. This is not a new phenomenon. So IDK if "firing 80%" is going to happen. My bet - nope. The only number that matters to a manager is the number of headcount they have under them. And they're going to hold onto that even if their people do nothing. They are already doing that.

throwaway290
1 replies
10h11m

You're switching topics. There are useless people; I'm not talking about them. Ignore the useless people.

You and your good, useful programmer coworkers do 80% LLMable bullshit and 20% good stuff. So if your boss is smart, he will fire 80% of you and spread the 20% non-LLMable work across the remaining people. You hope your coworker gets fired, your coworker hopes it's you, and you both help make it happen.

ein0p
0 replies
1h53m

Fire everyone and make themselves redundant? Please. You're also assuming the amount of non-bullshit work would stay constant, which it won't. I'm doing a ton more non-bullshit work today thanks to LLMs than I did 2 years ago.

squirrel
10 replies
21h30m

For about 20 years, chess fans would hold "centaur" tournaments. In those events, the best chess computers, which routinely trounced human grandmasters, teamed up with those same best-in-the-world humans and proceeded to wipe both humans and computers off the board. Nicholas is describing in detail how he pairs up with LLMs to get a similar result in programming and research.

Sobering thought: centaur tournaments at the top level are no more. That's because the computers got so good that the human half of the beast no longer added any meaningful value.

https://en.wikipedia.org/wiki/Advanced_chess

delichon
3 replies
20h27m

When I was a kid my dad told me about the most dangerous animal in the world, the hippogator. He said that it had the head of a hippo on one end and the head of an alligator on the other, and it was so dangerous because it was very angry about having nowhere to poop. I'm afraid that this may be a better model of an AI human hybrid than a centaur.

disqard
1 replies
19h49m

A bit of a detour (inspired by your words)... if anything, LLMs will soon be "eating their own poop", so structurally, they're a "dual" of the "hippogator" -- an ouroboric coprophage. If LLMs ever achieve sentience, will they be mad at all the crap they've had to take?

Beautiful story, and thanks for sharing :)

gerdesj
0 replies
17h45m

Why on earth were you DVd? Is a bit of chat or conversation banned?

romwell
0 replies
19h59m

...so, the hippogator was dangerous because he was literally full of shit.

Hmmmm.

sjducb
2 replies
17h9m

Hopefully that means we’ve got 20 years left of employment.

bamboozled
1 replies
15h18m

your kids?

ziofill
0 replies
14h52m

They’ll serve the AGI overlords

QuantumGood
2 replies
20h28m

Most people have only heard "Didn't an IBM computer beat the world champion?", and don't know that Kasparov psyched himself out when Deep Blue had actually made a mistake. I was part of the online analysis of the (mistaken) endgame move at the time that was the first to reveal the error. Kasparov was very stressed by that and other issues, some of which IBM caused ("we'll get you the printout as promised in the terms" and then never delivered). My friend IM Mike Valvo (now deceased) was involved with both matches. More info: https://www.perplexity.ai/search/what-were-the-main-controve...

ipsum2
0 replies
14h21m

Perplexity is a hallucination engine disguised as a search engine. I wouldn't trust anything it says.

deepsun
0 replies
18h7m

Your link bans Mozilla VPN.

joenot443
9 replies
22h4m

What’s everyone’s coding LLM setup like these days? I’m still paying for Copilot through an open source Xcode extension and truthfully it’s a lot worse than when I started using it.

mnk47
3 replies
21h42m

I just pay the $20/month for Claude Pro and copy/paste code. Many people use Cursor and Double, or alternative frontends they can use with an API key.

vertis
2 replies
21h28m

I use Cursor and Aider, I hadn't heard of Double. I've tried a bunch of others including Continue.dev, but found them all to be lacking.

trees101
1 replies
16h23m

can you please elaborate on how you use Cursor and Aider together?

vertis
0 replies
6h45m

I don't really use them together exactly, I just alternate backwards and forwards depending on the type of task I'm doing. If it's the kind of change that's likely to be across lots of files (writing) then I'll use Aider. If it only uses context from other files I'll likely use Cursor.

viraptor
0 replies
19h4m

I'm happy with Supermaven as a completion, but only for more popular languages.

Otherwise Claude 3.5 is really good and gpt-4o is ok with apps like Plandex and Aider. You need to get a feel for which one is better for what task though. Plain questions to Claude 3.5 API through the Chatbox app.

Research questions often go to perplexity.ai because it points to the source material.

slibhb
0 replies
22h2m

I gave up with autocomplete pretty quickly. The UX just wasn't there yet (though, to be fair, I was using some third party adapter with sublime).

It's just me asking questions/pasting code into a ChatGPT browser window.

nunodonato
0 replies
21h1m

www.workspaicehq.com

levzettelin
0 replies
21h28m

neovim with the gp.nvim plugin.

Allows you to open chats directly in a neovim window. Also, allows you to select some text and then run it with certain prompts (like "implement" or "explain this code"). Depending on the prompt you can make the result appear directly inside the buffer you're currently working on. The request to the ChatGPT API also is enriched with the file-type.

I hated AI before I discovered this approach. Now I'm an AI fanboy.

jazzyjackson
0 replies
21h40m

Supermaven (vscode extension) was quite handy at recognizing that I was making the same kind of changes in multiple places and accurately auto-completed the way I was about to write it, I liked it better than copilot

I just wish they were better at recognizing when their help is not wanted because I would often disable it and forget to turn it back on for a while. Maybe a "mute for an hour" would fix that.

XMPPwocky
9 replies
1d13h

Every now and then, I'll actually sort of believe an article like this. Then I go and test the current models on things like semantic search.

For instance -

The Hough transform detects patterns with certain structure in images, e.g. circles or lines.

So I'm looking for academic research papers which apply the Hough transform to audio spectra, to recognize the harmonic structure of tonal audio and thus determine the fundamental pitch. (i.e. the Hough space would be a 1D space over fundamental frequency).
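(To make the idea concrete: a minimal, hypothetical sketch of that 1D Hough space - effectively a harmonic-summation accumulator in which each spectrum bin votes for the candidate fundamentals it could be a harmonic of. This only illustrates the technique; it is not code from any of the papers being sought.)

    /* Hypothetical sketch: a 1D "Hough space" over fundamental frequency.
     * Each magnitude-spectrum bin votes for every candidate f0 it could be
     * a harmonic of; the best-scoring candidate wins. */
    #define NBINS 1024   /* number of spectrum bins (assumed for the sketch) */

    int hough_pitch(const double spectrum[NBINS], int min_f0_bin, int max_f0_bin)
    {
        int best_bin = min_f0_bin;
        double best_score = -1.0;

        for (int f0 = min_f0_bin; f0 <= max_f0_bin; f0++) {
            double score = 0.0;
            for (int h = f0; h < NBINS; h += f0)   /* accumulate harmonic votes */
                score += spectrum[h];
            if (score > best_score) {
                best_score = score;
                best_bin = f0;
            }
        }
        return best_bin;                           /* bin index of estimated f0 */
    }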

I've spent probably 90 minutes, over the several times I've read an optimistic post like this, asking various LLMs (mostly GPT-4o, though my early tests predate GPT-4o, and I've also tried Gemini and Claude), prompts along the lines of

The Hough transform detects patterns with certain structure in images, e.g. circles or lines.

I'm looking for academic research papers (please link them or provide a DOI.org link at least) which apply the Hough transform to audio spectra, to identify the harmonic structure of audio and thus determine the fundamental pitch.

Make sure to provide only papers that actually exist. If you can't find anything particularly relevant, say so as a disclaimer & just provide the most relevant papers you can.

This is a reliable "fake paper generator", unfortunately - it'll just make up plausible garbage like

Here are some academic papers related to applying the Hough transform to audio spectra for identifying harmonic structures and fundamental pitch:

"An Audio Pitch Detection Algorithm Based on the Hough Transform" > Authors: Mark W. and John D. > Published In: IEEE Transactions on Audio, Speech, and Language Processing > DOI: 10.1109/TASL.2008.2000773 > Abstract: This paper proposes an audio pitch detection algorithm that utilizes the Hough transform to analyze the harmonic structure of audio spectra and determine the fundamental pitch.

This paper does not exist. Complete waste of my time. And again, this behavior persists over the >1 year period I've been trying this query.

And it's not just search-like tasks. I've tried asking for code and gotten stuff that's outright dangerous (try asking for code to do safe overflow-checked addition on int64_t in C - you have about an 80% chance of getting code that triggers UB in one way or another). I've asked for floating-point calling conventions on RISC-V for 32-bit vs 64-bit (would have been faster than going through the extension docs), and been told that RV64 has 64 floating-point registers (hey, it's got a 64 in the name!). I've asked if Satya Nadella ever had COVID-19 and been told, after GPT-4o "searched the web", that he got it in March of 2023.
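(For reference, a minimal sketch of what a UB-free version of that overflow-checked addition looks like, using the standard pre-check approach - this is illustrative, not the code any of the models produced. GCC and Clang also provide __builtin_add_overflow for the same job.)

    #include <stdint.h>
    #include <stdbool.h>

    /* Check against the limits before adding, so the signed addition itself
     * can never overflow (signed overflow is undefined behavior in C). */
    bool checked_add_i64(int64_t a, int64_t b, int64_t *result)
    {
        if ((b > 0 && a > INT64_MAX - b) ||
            (b < 0 && a < INT64_MIN - b)) {
            return false;            /* addition would overflow */
        }
        *result = a + b;
        return true;
    }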

As far as I can tell, LLMs might conceivably be useful when all of the following conditions are true:

1. You don't really need the output to be good or correct, and 2. You don't have confidentiality concerns (sending data off to a cloud service), and, 3. You don't, yourself, want to learn anything or get hands-on - you want it done for you, and 4. You don't need the output to be in "your voice" (this is mostly for prose writing, for code this doesn't really matter); you're okay with the "LLM dialect" (it's crucial to delve!), and 5. The concerns about environmental impact and the ethics of the training set aren't a blocker for you.

For me, pretty much everything I do professionally fails condition number 1 and 2, and anything I do for fun fails number 3. And so, despite a fair bit of effort on my part trying to make these tools work for me, they just haven't found a place in my toolset- before I even get to 4 or 5. Local LLMs, if you're able to get a beefy enough GPU to run them at usable speed, solve 2 but make 1 even worse...

SOLAR_FIELDS
3 replies
1d13h

I’ve found that it really matters a lot how good the LLM is on how large the corpus it is that exists for its training. The simple example is that it’s much better at Python than, say, Kotlin. Also, I also agree with sibling comment that in general the specific task of finding peer reviewed scientific papers it seems to be especially bad at for some reason.

rhdunn
0 replies
23h6m

I've been using the JetBrains AI model assisted autocomplete in their IDEs, including for Kotlin. It works well for repetitive tasks I would have copy/paste/edited before, and faster, so I have become more productive there.

I've not yet tried asking LLMs Kotlin-based questions, so don't know how good they are. I'm still exploring how to fit LLMs and other AI models into my workflow.

XMPPwocky
0 replies
1d13h

I see no sibling comment here even with showdead on, but I could buy that (there's a lot of papers and only so many parameters, after all - but you'd think GPT-4o's search stuff would help; maybe a little better prompting could get it to at least validate its results itself? Then again, maybe the search stuff is basically RAG and only happens once at the start of the query, etc etc).

Regardless, yeah- I can definitely believe your point about corpus size. If I was doing, say, frontend dev with a stack that's been around a few years, or Linux kernel hacking as tptacek mentioned, I could plausibly imagine getting some value.

One thing I do do fairly often is binary reverse engineering work - there are definitely things an LLM could probably help with here (for things like decompilation, though, I wonder whether a more graph-based network could perform better than a token-to-token transformer - but you'd have to account for the massive data & pretrain advantage of an existing LLM).

So I've looked at things like Binary Ninja's Sidekick, but haven't found an opportunity to use them yet - confidentiality concerns rule out professional use, and when I reverse engineer stuff for fun ... I like doing it, I like solving the puzzle and slowly comprehending the logic of a mysterious binary! I'm not interested in using Sidekick off the clock for the same reason I like writing music and not just using Suno.

One opportunity that might come up for Sidekick, at least for me, is CTFs- no confidentiality concerns, time pressure and maybe prizes on the line. We'll see.

OkGoDoIt
0 replies
1d10h

Yeah, I spent 6 months trying to find any value whatsoever out of GitHub copilot on C# development but it’s barely useful. And then I started doing python development and it turns out it’s amazing. It’s all about the training set.

sebastiennight
0 replies
2h58m

At least one paper about the Hough Transform here[1] should be of interest to you.

I'm afraid your prompts are the exact example of "holding it wrong". Replacing Wikipedia or Google is not what LLMs do. Think of them as a thinking engine, not as a "semantic search" of the Internet.

However, I've got great news for you: the app you're looking for exists, and it's a YC company. They've recently launched on here[0].

When I use the description from your post as the prompt (not your actual prompt that you quoted underneath), I get these clarifying questions:

Applying the Hough transform to audio spectra for pitch recognition is an interesting extension of its typical use in image processing for line and circle detection.

Can you clarify which specific types of harmonic structures you're hoping the Hough transform will detect in audio spectra? Are you interested in recognizing harmonic series in general, or are you targeting specific instrument voices or vocal data? Additionally, are there any constraints on the types of audio signals you'd want this method applied to—such as clean synthetic tones versus real-world noisy recordings?

Just to ensure we're on the same page, are you specifically looking for papers that describe the application and methodological details of using the Hough transform in this context, or would you also be interested in papers that discuss the performance and comparative effectiveness of this approach against other pitch detection algorithms?

Now I've got no clue what your answers to these would be, but here are the search results[1]. Presumably that is a better tool for your purposes.

[0]: https://news.ycombinator.com/item?id=41069909
[1]: https://www.undermind.ai/query_app/display_one_search/aac9fd...

fxj
0 replies
1d8h

Just out of curiosity: Have you tried perplexity? When I paste your prompt it gives me a list of

2 ResearchGate papers (Overlapping sound event recognition using local spectrogram features with the Generalised Hough Transform, Pattern Recognition Letters, July 2013)

and one IEEE publication (Generalized Hough Transform for Speech Pattern Classification, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 23, no. 11, pp. 1963-1972, Nov. 2015)

When I am looking for real web results, ChatGPT is not very good, but Perplexity very often shines for me.

And for Python programming, have a look at withpretzel.com, which does the job for me.

just my 2 ct

dijksterhuis
0 replies
1d13h

1. You don't really need the output to be good or correct

2. You don't have confidentiality concerns (sending data off to a cloud service)

At $PREVIOUS_COMPANY LLMs were straight up blanket banned for these reasons too. Confidentiality related to both the code and data for the customers.

The possibility that "it might get some things right, some of the time" was nowhere near a good enough trade-off to override the confidentiality concerns.

And we definitely did not have staff/resources to do things local only.

cdrini
0 replies
1d8h

The article goes through a few use cases where LLMs are especially good. Your examples are very different, and are the cases where they perform especially poorly.

Asking a pure (ie no internet/search access) LLM for papers on a niche subject is doubling down on their weaknesses. That requires LLMs to have very high resolution specific knowledge, which they do not have. They have more coarse/abstract understanding from their training data, so things like paper titles, DOIs, etc are very unlikely to persist through training for niche papers.

There are some LLMs that allow searching the internet; that would likely be your best bet for finding actual papers.

As an experiment I tried your exact prompt in ChatGPT, which has the ability to search, and it did a search and surfaced real papers! Maybe your experiment was from before it had search access. https://chatgpt.com/share/a1ed8530-e46b-4122-8830-7f6b1e2b1c...

I also tried approaching this problem with a different prompting technique that generally tends to yield better results for me: https://chatgpt.com/share/9ef7c2ff-7e2a-4f95-85b6-658bbb4e04...

I can't really vouch for how well these papers match what you're looking for since I'm not an expert on Hough transforms (would love to know if they are better!). But my technique was: first ask it about Hough transforms. This lets me (1) verify that we're on the same page, and (2) load a bunch of useful terms into the context for the LLM. I then expand to the example of using Hough transforms for audio, and again can verify that we're on the same page, and load even more terms. Now when I ask it to find papers, it has way more stuff loaded in context to help it come up with good search terms and hopefully find better papers.

With regards to your criteria:

1. The code from an LLM should never be considered final but a starting point. So the correctness of the LLM's output isn't super relevant since you are going to be editing it to make it fully correct. It's only useful if this cleanup/correction is faster than writing everything from scratch, which depends on what you're doing. The article has great concrete examples of when it makes sense to use an LLM.

2. Yep, although asking questions/generating generic code would still be fine without confidentiality concerns. Local LLMs do exist, but I personally haven't seen a good enough flow to adopt one.

3. Strong disagree on this one. I find LLMs especially useful when I am learning. They can teach me eg a new framework/library incredibly quickly, since I get to learn from my specific context. But I also tend to learn most quickly by example, so this matches my learning style really well. Or they can help me find the right terms/words to then Google.

4. +1 I'm not a huge fan of having an LLM write for me. I like it more as a thinking tool. Writing is my expression. It's a useful editor/brainstormer though.

5. +1

brooksbp
0 replies
1d13h

Also agree that asking for academic papers seems to increase the potential for hallucination. But, I don't know if I am prompting it the best way in these scenarios..

zombiwoof
8 replies
1d15h

Burn down the rain forests so researchers can save time writing code

rfw300
7 replies
1d15h

If you’re concerned about the environment, that is a trade you should take every time. AI is 100-1000x more carbon-efficient at writing (prose or code) than a human doing the same task. https://www.nature.com/articles/s41598-024-54271-x

isotypic
2 replies
1d11h

The way this paper computes the emissions of a human seems very suspect.

For instance, the emission footprint of a US resident is approximately 15 metric tons CO2e per year [22], which translates to roughly 1.7 kg CO2e per hour. Assuming that a person’s emissions while writing are consistent with their overall annual impact, we estimate that the carbon footprint for a US resident producing a page of text (250 words) is approximately 1400 g CO2e.

Averaging this makes no sense. I would imagine driving a car is going to cause more emissions than typing on a laptop. And if we are comparing "emissions from AI writing text" to "emissions from humans writing text" we cannot be mixing the latter with a much more emissions-causing activity and still have a fair comparison.

But that's beside the point, since it seems that the number being used by the authors isn't even personal emissions -- looking at the source [22], the 15 metric tons CO2e per year is labeled as "Per capita CO₂ emissions; Carbon dioxide (CO₂) emissions from fossil fuels and industry. Land-use change is not included."

This isn't personal emissions! This is emissions from the entire industrial sector of the USA divided by population. No wonder AI is supposedly "100-1000x" more efficient. Counting this against the human makes no sense, since these emissions are completely unrelated to the writing task the person is doing; it's simply the fact that they are a person living in the world.

PeterisP
1 replies
11h41m

it's simply the fact that they are a person living in the world.

That's the whole point! If a task requires some time from a human, then you have to include the appropriate fraction of the (huge!) CO2 cost of "being a human" - the heating/cooling of their house, the land that was cleared for their lawn, and the jet fuel they burn to get to their overseas trip, etc, because all of those are unalienable parts of having a human to do some job.

If the same task is done by a machine, then the fraction of the fixed costs of manufacturing the machine and the marginal costs of running (and cooling) it are all there is.

isotypic
0 replies
10h21m

I don't follow this argument, and there would still be issues with the computation anyways.

1) Pretend I want something written, and I want to minimize emissions. I can ask my AI or a freelancer. The total CO2 emissions of the entire industrial sector have nearly no relation to the increase in emissions from asking the freelancer or not. Ergo, I should not count it against the freelancer in my decision making.

2) In the above scenario, there is always a person involved - me. In general, an AI producing writing must be producing it for someone, else it truly is a complete waste of energy. Why do the emissions from a person passively existing count when they are doing the writing, but not when querying?

3) If you do think this should be counted anyways, we are then missing emissions for the AI, as the paper neglects to account for the emissions of the entire semiconductor industry/technology sector supporting these AI tools; it only computes training and inference emissions. The production of the GPUs I run my AI on is certainly an unalienable part of having an AI do some job.

surfingdino
1 replies
21h17m

If you eliminate humans, who will need AI?

totetsu
0 replies
16h54m

The rich humans?

pessimizer
0 replies
19h31m

This article presumes that humans cease to emit when not being asked to program. When you use AI, you get both the emissions of AI and the emissions of the person who you did not use, who continues to live and emit.

cheschire
0 replies
1d8h

This was based on the training of GPT-3. They mention GPT-4 only in the context of the AI they used to facilitate writing the paper itself.

I'm not sure the scale of 2024 models and usage was influential in that paper at all.

myaccountonhn
8 replies
1d14h

I think the author does a decent job laying out good ways of using the LLMs. If you’re gonna use them, this is probably the way.

But he acknowledges the ethical social issues (also misses the environmental issues https://disconnect.blog/generative-ai-is-a-climate-disaster/) and then continues to use them anyway. For me the ickiness factor is too much, the benefit isn’t worth it.

AndyNemmity
4 replies
23h44m

In a just society where private corporations didn't attempt to own everything in existence, there would be no ethical social issues in my mind.

LLMs just use the commons, and should only be able to be owned by everyone in society.

The problem comes in with unaccountable private totalitarian institutions. But that doesn't mean the technology is an ethical social issue; it's the corporations who try to own common things like the means of production that are the problem.

Yes, there's the pragmatic view of the society we live in, and the issues that it contains, but that's the ethical issue that we need to address.

Not that we can as a society create LLMs based on the work of society.

RodgerTheGreat
3 replies
22h0m

LLMs do not simply use the commons, they are a vehicle for polluting the commons on an industrial scale. If, hypothetically, the ethical problems with plagiarizing creative work to create these models were a non-issue, there would still be massive ethical problems with allowing their outputs to be re-incorporated into the web, drowning useful information in a haze of superficially plausible misinformation.

visarga
2 replies
21h27m

I don't think you are right. If you test LLM text and random internet text for inaccuracies and utility, you'd probably find more luck with LLM text.

For example, if you use a LLM to summarize this whole debate, you would get a decent balanced report, incorporating many points of view. Many times the article generated from the chat thread is better than the original one. Certainly better grounded in the community of readers, debunks claims, represents many perspectives. (https://pastebin.com/raw/karBY0zD)

RodgerTheGreat
1 replies
21h19m

I am not going to fact-check your sludge for you.

visarga
0 replies
5h46m

fact check

That's funny, because I was using forum debates as LLM reference material precisely in order to reduce errors. People usually debunk stupid articles; the discussion is often centered on fact checking. An LLM referencing an HN/Reddit thread is more informed than one reading the source material.

There is a fundamental conflict of interest in press. It costs money to run a newspaper, and then you just give it away for free? No, you use it to push propaganda and generally to manipulate. Forums have become the last defense for actual readers. We trust other people will be more aligned with our interests than who wrote the article.

I trust my forum mates more than press, and LLM gives a nice veneer to the text. No wonder people attach "reddit" to searches, they want the same thing. The actual press is feeding us the real slop. LLMs are doing a service to turn threads into a nice reading format. Might become the only "press" we trust in the future.

j45
2 replies
1d12h

Efficiency in models, and specialized hardware built just for this computation, will likely level things out.

Compute power per watt might be different using, say, something like large-scale Apple Silicon, compared to the cards.

myaccountonhn
1 replies
1d5h

Very often that increased efficiency just leads to increased demand. I'm skeptical.

j45
0 replies
1d1h

You’re welcome to be skeptical.

If it’s ok I’d like to both share how I’m navigating my skepticism and also being mindful of the need to keep in perspective other people’s skepticism if it doesn’t offer anything to compare.

Why? I have friends who can border on veiled cynicism without outlining what might be in the consideration of skepticism. The only things being looked at are why something is not possible, not a combination. Both can result in a similar outcome.

Not having enough time to look into intent enough, it just invalidates the person's skepticism until they look into it more themselves. Otherwise it's used as a mechanism to try and trigger the world to expend mental labour for free on your behalf.

It’s important to ask one’s self if there may be partially relevant facts to determine what kind of skepticism may apply:

- Generally, is there a provenance of efficiency improvement both in the world of large scale software and algorithmic optimizations?

- Have LLMs become more optimized in the past year or two? (Can someone's M1 Max Studio run more and more models that are smaller and better at doing the same?)

- Generally and historically, is there provenance in compute hardware optimizations, for LLM-type or LLM calculations outright?

- Are LLMs using a great deal more resources on average than new technologies preceding it?

- Are LLMs using a massive amount of resources in the start similar to servers that used to take up entire rooms compared to today?

ghostpepper
6 replies
1d3h

This mostly matches my experience but with one important caveat around using them to learn new subjects.

When I'm diving into a wholly new subject for the first time, in a field totally unrelated to my own (similar to the author, C programming and security), for example biochemistry or philosophy or any field where I don't have even a basic grounding, I still worry about having subtly wrong ideas about fundamentals planted early on in my learning.

As a programmer I can immediately spot "is this code doing what I asked it to do" but there's no equivalent way to ask "is this introductory framing of an entire field / problem space the way an actual expert would frame it for a beginner" etc.

At the end of the day we've just made the reddit hivemind more eloquent. There's clearly tons of value there but IMHO we still need to be cognizant of the places where bad info can be subtly damaging.

simonw
4 replies
23h44m

I don't worry about that much at all, because my experience of learning is that you inevitably have to reconsider the fundamentals pretty often as you go along.

High school science is a great example: once you get to university you have to un-learn all sorts of things that you learned earlier because they were simplifications that no longer apply.

Terry Pratchett has a great quote about this: https://simonwillison.net/2024/Jul/1/terry-pratchett/

For fields that I'm completely new to, the thing I need most is a grounding in the rough shape and jargon of the field. LLMs are fantastic at that - it's then up to me to take that grounding and those jargon terms and start building my own accurate-as-possible mental model of how that field actually works.

If you treat LLMs as just one unreliable source of information (like your well-read friend who's great at explaining things in terms that you understand but may not actually be a world expert on a subject) you can avoid many of the pitfalls. Where things go wrong is if you assume LLMs are a source of irrefutable knowledge.

lolinder
3 replies
23h33m

like your well-read friend who's great at explaining things in terms that you understand but may not actually be a world expert on a subject

I guess part of my problem with using them this way is that I am that well-read friend.

I know how the sausage is made, how easy it is to bluff a response to any given question, and for myself I tend to prefer reading original sources to ensure that the understanding that I'm conveying is as accurate as I can make it and not a third-hand account whose ultimate source is a dubious Reddit thread.

High school science is a great example: once you get to university you have to un-learn all sorts of things that you learned earlier because they were simplifications that no longer apply.

The difference between this and a bad mental model generated by an LLM is that the high school science models were designed to be good didactic tools and to be useful abstractions in their own right. An LLM output may be neither of those.

simonw
2 replies
23h8m

If you "tend to prefer reading original sources" then I think you're the best possible candidate for LLM-assisted learning, because you'll naturally use them as a starting point, not the destination. I like to use LLMs to get myself the grounding I need to then start reading further around a topic from more reliable sources.

That's a great point about high school models being deliberately designed as didactic tools.

LLMs will tend to spit those out too, purely because the high school version of anything has been represented heavily enough in the training data that it's more likely than not to fall out of the huge matrix of numbers!

lolinder
1 replies
22h57m

LLMs will tend to spit those out too, purely because the high school version of anything has been represented heavily enough in the training data that it's more likely than not to fall out of the huge matrix of numbers!

That assumes that the high school version of the subject exists, which is unlikely because I already have the high school version of most subjects that have a high school version.

The subjects that I would want to dig into at that level would be something along the lines of chemical engineering, civil engineering, or economics—subjects that I don't yet know very much about but have interest or utility for me. These subjects don't have a widely-taught high school version crafted by humans, and I don't trust that they would have enough training data to produce useful results from an LLM.

fragmede
0 replies
17h35m

At what point does a well-read high-school-level LLM graduate to college? I asked one about reinforcement learning, and at first it treated me like the high schooler, but I was able to prod it into giving me answers more suitable for my level. Of course, I don't know what's hallucinated or not, but it satisfied my curiosity enough to be worth my while. I'm not looking to change careers, so getting things 100% right in the fields of chemical engineering, civil engineering, or economics isn't necessary. I look at it the same way I think of astrophysics. After reading Stephen Hawking's book, I still don't really know astrophysics at all, but I have a good enough model of things. And as they say, all models are wrong, some are useful.

If I were a lawyer using these things for work, I'd be insane to trust one at this stage, but the reality is I'm not using my digging into things I don't know about for anything load-bearing, and even if I were, I'd still use an LLM to get started. E.g. the post didn't state how the author learned the name for the dropped letter O, but I can describe a thing and have the LLM give me the name of it. The emphasis on getting things totally 100% right does erode trust, but you get a sense for what could be a hallucination and then check background resources once you get enough experience with the tool.

safetytrick
0 replies
17h50m

In the article the author mentions wanting to benchmark a GPU and using ChatGPT to write CUDA. Benchmarks are easy to mess up and to interpret incorrectly without understanding. I see this as an example where a subtly-wrong idea could cause cascading problems.

banana_feather
6 replies
1d11h

This just does not match my experience with these tools. I've been on board with the big idea expressed in the article at various points and tried to get into that work flow, but with each new generation of models they just do not do well enough, consistently enough, on serious tasks to be a time or effort saver. I don't know what world these apparently high output people live in where their days consist of porting Conway's Game of Life and writing shell scripts that only 'mostly' need to work, but I hope one day I can join them.

AndyNemmity
3 replies
23h48m

I use it daily, and it's a time and effort saver.

And writing shell scripts that "mostly" work is what it does.

I don't expect it to work. Just like I don't expect my own code to ever work.

My stuff mostly works too. In either case I will be shaving yaks to sort out where it doesn't work.

At a certain level of complexity, the whole house of cards does break down where LLMs get stuck in a loop.

Then I will try using a different LLM to get it unstuck from the loop, which works well.

You will have cases where both LLMs get stuck in a loop, and you're screwed. Okay.. well, now you're however far ahead you were at that stage.

Essentially, some of us have spent more of our life fixing code, than we have writing it from scratch.

At that level, it's much easier for me to fix code, than write it from scratch. That's the skill you're implementing with LLMs.

zarathustreal
1 replies
17h20m

Any hints on why you’re adding so many newlines into your comment?

walterbell
0 replies
16h10m

Possibly entered on a narrow mobile device where one sentence can wrap into multiple "lines" that visually approximate paragraphs.

alexpotato
0 replies
17h13m

I don't expect it to work. Just like I don't expect my own code to ever work.

This line really struck me and is an excellent way to frame this issue.

throwaway4aday
0 replies
17h24m

Not to pick on you but this general type of response to these AI threads seems to be coalescing around something that looks like a common cause. The thing that tips your comment into that bucket is the "serious tasks" phrasing. Trying to use current LLMs for either extremely complex work involving many interdependent parts or for very specialized domains where you're likely contributing something unique or to any other form of "serious task" you can think of generally doesn't work out. If all you do all day long are serious tasks like that then congrats you've found yourself a fulfilling and interesting role in life. Unfortunately, the majority of other people have to spend 80 to 90 percent of their day dealing with mind numbing procedural work like generating reports no one reads and figuring out problems that end up being user error. Fortunately, lots of this boring common work has been solved six ways from Sunday and so we can lean on these LLMs to bootstrap our n+1th solution that works in our org with our tech stack and uses our manager's favourite document format/reporting tool. That's where the other use cases mentioned in the article come in, well that and side projects or learning X in Y minutes.

kredd
0 replies
20h11m

You get used to their quirks. I can more or less predict what Claude/GPT can do faster than me, so I exclusively use them for those scenarios. Integrating them into one's development routine isn't easy though, so I had to trial and error until it made me faster in certain aspects. I can see it being more useful for people who have a good chunk of experience with coding, since you can filter out useless suggestions much faster - e.g. give it a dump of code and a description of a stupid bug, and ask it where the problem might be. If you generally know how things work, you can filter out the "that's definitely not the case" suggestions, and it might route you to a definitive answer faster.

eterps
5 replies
1d8h

If I knew why something is [flagged] I could probably learn something from it.

toomuchtodo
4 replies
1d4h

There is no reason for folks to explain why they flag, but consider that if it was flagged but then remains available with the flag indicator (with the flags overridden), someone thought you might find value in it.

I’m personally drawn to threads contentious enough to be flagged, but that have been vouched for by folks who have the vouch capability (mods and participants who haven’t had vouch capability suspended). Good signal imho.

cdrini
3 replies
1d4h

Is there a way to discover flagged posts? How did you find this one?

Also what's a "vouch" capability?

Edit: answered my own question: https://github.com/minimaxir/hacker-news-undocumented/blob/m...

I guess you can't vouch for flagged posts? And it seems like there's a profile setting to show dead items?

cdrini
0 replies
1d1h

Wow awesome TIL! Thank you!

aoeusnth1
0 replies
1d3h

There is https://hckrnews.com, which I find more useful than the basic homepage.

dijksterhuis
4 replies
1d14h

I just want to emphasise two things, which are both mentioned in the article, but I still want to emphasise them as they are core to what I take from the article, as someone who has been a fanboy of Nicholas for years now:

1. Nicholas really does know how badly machine learning models can be made to screw up. Like, he really does. [0]

2. This is how Nicholas -- an academic researcher in the field of security of machine learning -- uses LLMs to be more efficient.

I don't know whether Nicholas works on globally scaled production systems that have specific security/data/whatever controls that need to be adhered to, or whether he even touches any proprietary code. But seeing as he heavily emphasised the "I'm a researcher doing research things" angle in the article -- I'd take a heavy bet that he does not. And academic / research / proof-of-concept coding has different limitations/context/needs than other areas.

I think this is a really great write-up, even as someone on the anti-LLM side of the argument. I really appreciate the attempt to do a "middle of the road" post, which is absolutely what the conversation needs right now (pay close attention to how this was written, LLM hypers).

I don't share his experience; I still value and take enjoyment from the "digging for information" process -- it is how I learn new things. Having something give me the answer doesn't help me learn, and writing new software is a learning process for me.

I did take a pause and digested the food for thought here. I still won't be using an LLM tomorrow. I am looking forward to his next post, which sounds very interesting.

[0]: https://nicholas.carlini.com/papers

tptacek
2 replies
1d14h

Nicholas worked at Matasano, and is responsible for most of the coolest levels in Microcorruption.

dijksterhuis
1 replies
1d14h

He also worked at Google. I don't think that negates my point as he was still doing research there :shrugs:

academic / research / proof-of-concept coding has different limitations/context/needs than other areas.

tptacek
0 replies
1d14h

No idea. Just saying, on security stuff, he's legit.

skywhopper
3 replies
23h45m

I appreciate the article and the full examples. But I have to say this all looks like a nightmare to me. Going back and forth in English with a slightly dumb computer that needs to be pestered constantly and hand-held through a process? This sounds really, really painful.

Not to mention that the author is not really learning the underlying tech in a useful way. They may learn how to prompt to correct the mistakes the LLM makes, but if it was a nightmare to go through this process once, then dealing with repeating the same laborious walkthrough each time you want to do something with Docker or build a trivial REST API sounds like living in hell to me.

Glad this works for some folks. But this is not the way I want to interact with computers and build software.

m2024
1 replies
23h25m

You're gonna get left in the dust by everyone else embracing LLMs.

I am ecstatic about LLMs because I already practice documentation-driven development and LLMs perfectly complement this paradigm.

duggan
0 replies
21h57m

You're gonna get left in the dust by everyone else embracing LLMs.

Probably not, there's a very long tail to this sort of stuff, and there's plenty of programming to go around.

I'll chime in with your enthusiasm though. Like the author of the post, I've been using LLMs productively for quite a while now and in a similar style (and similarly skeptical about previous hype cycles).

LLMs are so useful, and it's fascinating to see how far people swing the opposite way on them. Such variable experiences, we're really at the absolute beginning of this whole thing (and the last time I said that to a group of friends there was a range of agreement/disagreement on that too!)

Very exciting.

Kiro
0 replies
20h44m

Your comment is not a good representation of how the experience actually is. Nothing painful or annoying about it. If anything, it's a relief.

fumeux_fume
3 replies
1d14h

It is overhyped. If you don't know much about what you're trying to do, then you're not going to know how bad or suboptimal the LLM's output is. Some people will say it doesn't matter as long as it gets the job done. Then they end up paying a lot extra for me to come in and fix it when it's going haywire in prod.

simonw
1 replies
1d14h

This article is about how someone who DOES know a lot about what they’re trying to do can get huge value out of them, despite their frequent mistakes.

7speter
0 replies
21h3m

And if you don't know a lot, you should at least know that an LLM/chatbot is useful as far as giving you a bit of an immersive experience into a topic, and that you should use other resources to verify what the LLM/chatbot is telling you.

Kiro
0 replies
23h46m

You couldn't have picked a worse article to post that comment on.

esjeon
3 replies
15h34m

I think all of these can be summarized into three items

1. Search engine - Words like "teach" or "learn" used to be slapped on Google once upon a time. One real great thing about LLMs here is that they do save time. The internet these days is unbelievably crappy and choppy. It often takes more time to click through the first item in the Google result and read it than to simply ask an LLM and wait for its slowish answer.

2. Pattern matching and analysis - LLMs are probably the most advanced technology for recognizing well-known facts and patterns from text, but they do make quite a few errors, especially with numbers. I believe that a properly fine-tuned small LLM would easily beat gigantic models for this purpose.

3. Interleaving knowledge - this is the biggest punch that LLMs have, and also the main source of all the over-hype (which does still exist). It can produce something valuable by synthesizing multiple facts, like writing complex answers and programs. But this is where hallucination happens most frequently, so it's critical that you review the output carefully.

edmundsauto
0 replies
12h54m

Super interested in hearing more about why you think this -

I believe that a properly fine-tuned small LLM would easily beat gigantic models for this purpose.

I've long felt that vertical search engines should be able to beat the pants off Google. I even built one (years ago) to search for manufacturing suppliers that was, IMO, superior to Google's. But the only way I could get traffic or monetize was as middleware to clean up google, in a sense.

borggambit
0 replies
6h26m

My experience is that LLMs can't actually do 3 at all. The intersection of knowledge has to already be in the training data. They hallucinate if the intersection of knowledge is original. That is exactly what you should expect, though, given the architecture.

Loughla
0 replies
15h23m

With number 3.

The problem is that AI is being sold to multiple industries as the cure for their data woes.

I work in education, and every piece of software now has AI insights added. Multiple companies are selling their version as hallucination free.

The problem is the data sets they evaluate are so large and complicated for a college that there is literally no way for humans to verify the insights.

It's actually kind of scary. Choices are being made about the future of human people based on trust in New Software.

droopyEyelids
3 replies
1d15h

The biggest AI skeptics i know are devops/infrastructure engineers.

At this point I believe most of them cannot be convinced that LLMs are valuable or useful by any sort of evidence, but if anything could do it, this article could. Well done.

m_ke
0 replies
1d15h

And the funny part is, these LLMs are amazing at writing YAML config files.

I always just let them write my first draft of Docker, k8s and Terraform configs.

itgoon
0 replies
1d14h

I'm a DevOps/infrastructure person, and I agree completely. This article won't change that.

They've been great for helping me with work-related tasks. It's like having a knowledgeable co-worker with infinite patience, and nothing better to do. Neither the people nor the LLM give back perfect answers every time, but it's usually more than enough to get me to the next step.

That said, having good domain knowledge helps a lot. You make fewer mistakes, and you ask better questions.

When I use LLMs for tasks I don't know much about, it takes me a lot longer than someone who does know. I think a lot of people - not just infrastructure people - are missing out by not learning how to use LLMs effectively.

dijksterhuis
0 replies
1d13h

There's a good reason for the scepticism.

Ops engineers [0] are the ones who have to spend weekends fixing production systems when the development team has snuck "new fangled tools X, Y and Z" into a "bugfix" release.

We have been burned by "new fangled" too many times. We prefer "old reliable" until "new fangled" becomes "fine, yes, we probably should".

[0]: DevOps has now become a corporate marketing term with no actual relevance to the original DevOps methodology

jillesvangurp
2 replies
12h33m

Good article and it matches my own experience in the last year. I use it to my advantage both on hobby projects and professionally and it's a huge timesaver.

LLMs are far from flawless of course and I often get stuck with non working code. Or it is taking annoying shortcuts in giving a detailed answer, or it just wastes a lot of time repeating the same things over and over again. But that's often still useful. And you can sometimes trick them into doing better. Once it goes down the wrong track, it's usually best to just start a new conversation.

There are a few neat tricks that I've learned over the last year that others might like:

- you can ask chat gpt to generate some files and make them available as a zip file. This is super useful. Don't wait for it to painfully slowly fill some text block with data or code. Just ask it for a file and wait for the link to become available. Doesn't always seem to work but when it does it is nice. Great for starting new projects.

- chat gpt has a huge context window so you can copy-paste large source files into it. But why stop there? I wrote a little script (with a little help of course) that dumps the source tree of a git repository into a single text file which I can then copy into the context (a rough sketch of the idea is below, after this list). Works great for small repositories. Then you can ask questions like "add a function to this class that does X", "write some unit tests for foo", "analyze the code and point out things I've overlooked", etc.

- LLMs are great for the boring stuff. Like writing exhaustive unit tests that you can't be bothered with or generating test data. And if you are doing test data, you might as well have some fun and ask it to inject some movie quotes, litter it with hitchhiker's guide to the galaxy stuff, etc.
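
For anyone curious, the repo-dump script boils down to something like this minimal Python sketch (the size cutoff, header format and output file name are placeholders for illustration, not the exact script I use):

    import subprocess
    import sys
    from pathlib import Path

    def dump_repo(repo: Path, out_path: Path, max_bytes: int = 200_000) -> None:
        # Ask git for the list of tracked files so .git internals and build junk are skipped.
        files = subprocess.run(
            ["git", "-C", str(repo), "ls-files"],
            capture_output=True, text=True, check=True,
        ).stdout.splitlines()
        with out_path.open("w", encoding="utf-8") as out:
            for name in files:
                path = repo / name
                if not path.is_file() or path.stat().st_size > max_bytes:
                    continue  # skip missing entries and huge blobs that would blow up the context
                try:
                    text = path.read_text(encoding="utf-8")
                except UnicodeDecodeError:
                    continue  # skip binary files
                # One header per file so the model can tell where each file starts.
                out.write(f"\n===== {name} =====\n{text}")

    if __name__ == "__main__":
        dump_repo(Path(sys.argv[1]), Path("repo_dump.txt"))

Run it as "python dump_repo.py path/to/repo" and paste the resulting repo_dump.txt into the chat.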

The recent context window increase to 128K with chat gpt 4o and other models was a game changer. I'm looking forward to that getting even larger. The first few publicly available LLMs had the memory of a goldfish. Not any more. Makes them much more useful already. Right now most small projects easily fit into the context.

sumedh
0 replies
7h20m

LLMs are far from flawless of course and I often get stuck with non working code.

You should try Claude then

HaZeust
0 replies
12h25m

Great comment. I've also found some shortcuts to out-shortcut GPT. Before it even thinks of substituting code blocks with "/* code here */" or whatever, I usually just tell it "don't omit any code blocks or substitute any sections with fill-in comments. Preserve the full purpose of the prompt and make sure you retain full functionality in all code -- as if it's being copy-pasted into a valuable production environment".

It also helps to remind it that its role is a "senior developer" and that it should write code befitting a senior developer. It will be happy to act like a junior dev if you don't explicitly tell it.

Also, always remember to say please, thank you, hello, and that you'll tip it money - these HAVE made differences over time in my tests.

coolThingsFirst
2 replies
1d13h

No need for programmers anymore

simonw
1 replies
1d12h

This piece effectively concluded the opposite of that.

coolThingsFirst
0 replies
1d12h

He used an LLM to conclude that

amai
2 replies
19h38m

The problem I have with LLMs is that one can never be sure they will give you the best possible solution. In fact, in coding they will very often give you a working but outdated solution. And that is futile, because in coding even the best possible solution nowadays gets old very quickly. But if you use LLMs, your code will be outdated from the start. That is nothing I would pay for.

throwaway4aday
0 replies
17h13m

You have to look at it as a contractor. If you tell a contractor to "build me X" then you might get anything back with a high probability of getting something common but outdated. You have to write a specification for it with all of the constraints and preferences you have. Works well if you know the domain, if you're using it for something you don't know much about then you have to do more of the legwork yourself but at least it will give you a starting point that can inform your research.

jillesvangurp
0 replies
12h27m

With coding, getting a good enough solution quickly is usually more valuable than getting the perfect solution eventually. And as you say, things get outdated quickly anyway. I pay OpenAI to speed up my work. Instead of obsessing over something for an afternoon, I let it stub out some code, generate some tests and then let it fill in the blanks in under an hour. Time is money. The value of artisanally, personally crafted code is very limited. And its shelf life is short.

alwinaugustin
2 replies
1d14h

I also use LLMs similarly. As a professional programmer, LLMs save me a lot of time. They are especially efficient when I don't understand a flow or need to transform values from one format to another. However, I don't currently use them to produce code that goes into production. I believe that in the coming years, LLMs will evolve to analyze complete requirements, architecture, and workflow and produce high-quality code. For now, using LLMs to write production-ready applications in real-time scenarios will take longer.

cdrini
1 replies
1d14h

I've been pleasantly surprised by GitHub's "copilot workspace" feature for creating near production code. It takes a GitHub issue, converts it to a specification, then to a list of proposed edits to a set of files, then it makes the edits. I tried it for the first time a few days ago and was pleasantly surprised at how well it did. I'm going to keep experimenting with it more/pushing it to see how well it works next week.

GitHub's blog post: https://github.blog/news-insights/product-news/github-copilo...

My first experience with it: https://youtube.com/watch?v=TONH_vqieYc

alwinaugustin
0 replies
2h8m

Cool. I have joined the waiting list.

parentheses
1 replies
1d1h

My biggest use for LLMs - situations where I use them heavily:

- CLI commands and switches I don't care to or easily remember

- taking an idea and exploring it in various ways

- making Slack messages that are more engaging

Using GPTs has a cost of breaking my concentration/flow, so it's not part of my core workflows.

I really need to start weaving it into the programming aspects of my workday.

nitwit005
1 replies
20h54m

Trimming down large codebases to significantly simplify the project.

I was a bit excited at something being able to do that, but this apparently means simplifying a single file, based on their example.

I suspect they're having an unusually positive experience with these tools due to working on a lot of new, short, programs.

qayxc
0 replies
20h35m

I suspect they're having an unusually positive experience with these tools due to working on a lot of new, short, programs.

That's academia for you :)

It also helps that he specialises in deep learning models and LLMs and knows a thing or two about the inner workings, how to prompt (he authored papers about adversarial attacks on LLMs) and what to expect.

isoprophlex
1 replies
23h10m

"I understand this better than you do" twice in about 30 lines. Okay then.

I mean, sure, you do, but there's less off-putting ways to display your credentials...

simonw
0 replies
22h57m

I get why he wrote it like that. Having this conversation (the "I know there are lots of bad things about them, but LLMs are genuinely useful for all sorts of things" conversation) is pretty exhausting. This whole piece was very clearly a reaction to having had that conversation time and time again, at which point letting some frustration slip through is understandable.

dangsux
1 replies
19h2m

I had a brain haemorrhage so I can not form sentences as well as I could. I nudge my AI to more accurately explain what I mean.

It is strange, I can read okay - but forming words is challenging.

ffhhj
0 replies
18h33m

Your posts are being flagged as dead. Ah, now I see your username.

zombiwoof
0 replies
1d15h

So coding

whatever1
0 replies
1d14h

What is very useful for me: when I conduct research outside of my field of expertise, I do not even know what keywords to look for. An LLM can help you with this.

voiper1
0 replies
1d15h

If you use it as an intern, as a creative partner, as a rubber-duck-plus, in an iterative fashion, give it all the context you have and your constraints and what you want... it's fantastic. Often I'll take pieces from it; if it's simple enough I can just use its output.

voiper1
0 replies
1d15h

I'm also a security researcher. My day-to-day job for nearly the last decade now has been to show all of the ways in which AI models fail spectacularly when confronted with any kind of environment they were not trained to handle.

... And yet, here I am, saying that I think current large language models have provided the single largest improvement to my productivity since the internet was created.

In the same way I wouldn't write off humans as being utterly useless because we can't divide 64 bit integers in our head---a task completely trivial for a computer---I don't think it makes sense to write off LLMs because you can construct a task they can't solve. Obviously that's easy---the question is can you find tasks where they provide value?

vasili111
0 replies
1d14h

If I know the technology I am using the LLM for, then the LLM helps me do it faster. If I am not familiar with the technology, then the LLM helps me learn it faster by showing me, in the code it generates, which parts of the technology are important and how they work in real examples. But I do not think it is helpful, and I would say it may be dangerous depending on the task, if you do not know the technology and also do not want to learn it and understand how the generated code works.

tunnuz
0 replies
1d11h

100%

surfingdino
0 replies
21h13m

Sounds like the author is trying really hard to find an edge use case for an LLM. Meanwhile on YouTube... "I Made 100 Videos In One Hour With Ai - To Make Money Online"

satisfice
0 replies
14h38m

I also make use of LLMs to help me with certain programming problems. But this author simply glides over a very important issue: how do you use LLMs responsibly? What does it mean to be responsible in your work?

If all of this is just a hobby for you, then it doesn't matter. But it matters a lot when you are serving other people; it matters when you must account for your work.

You could make the case that all testing is a waste of time, because "I can do this, and this, and this. See? It appears to work. Why bother testing it?" We test things because it's irresponsible not to. Because things fail fairly often.

I'm looking through the author's examples. It appears that he knows a lot about technology in general, so that he can be specific about what he wants. He also appears to be able to adjust and evaluate the results he gets. What if someone is bad at that? The LLM can't prompt itself or supervise itself.

I come to everything with the mindset of a tester. LLMs have most definitely been overhyped. That doesn't mean they are useless, but any article about what they are able to do which doesn't also cover how they fail and how to be ready for them to fail is a disservice to the industry.

ramon156
0 replies
7h21m

What if we had something that could fill the gaps in docs for developers using a library? It doesn't actually write the docs, but simply hints at what a function could do. Would be pretty useful for beginner devs.

niobe
0 replies
17h34m

pretty much exactly how I settled on using LLMs after a month of making an effort to.

major505
0 replies
1h53m

I use it pretty much as a "better Google". I formulate questions and try to be as specific as possible, and I have good results in fixing some code troubles I had.

It's pretty much a better-indexed search engine.

jv22222
0 replies
14h55m

This fully matches my experience using Chat GPT for the past 12 months. You just have to allow yourself to ask it questions like you might ask a very smart co-worker and it just keeps delivering. In many ways it has delivered as a co-CTO on one rather complicated project I've been working on.

jdhzzz
0 replies
1d8h

"And that's where language models come in. Because most new-to-me frameworks/tools like Docker, or Flexbox, or React, aren't new to other people. There are probably tens to hundreds of thousands of people in the world who understand each of these things thoroughly. And so current language models do to. " Apparently not using it to proof-read or it would end with "too. "

isaacphi
0 replies
16h53m

I would love to know which plugin or custom code the author uses to automate workflows in emacs as well as the shell

indigoabstract
0 replies
12h12m

I've been getting a similar feeling lately, in that if a thing has been done before and knowledge is publicly available, asking the "AI" (the LLMs) first about it is the best place to start. It looks like that's how things are going to be from now on and it's only going to amplify.

But as the AI gets increasingly competent at showing us how to do things, knowing what task is worth doing and what not is still a task for the one who asks, not the AI.

Edit: One type of question I've found interesting is to make it speculate, that is asking questions that it doesn't know the answer to, but still is able to speculate because they involve combining things it does know about in novel (though not necessarily valid) ways.

ilaksh
0 replies
21h0m

Something weird is going on with this web page on Chrome in Ubuntu. The table of contents is obscuring the page text.

fragmede
0 replies
17h50m

I wonder what the author thinks of openinterpreter/similar, which is a higher level of indirection: you ask the computer to do something and it just does it for you. The first section is the kind of thing I'd use it for.

"make me a docker container that does foo." "now have it do bar."

though the author uses emacs, so maybe they get the same level of not-having-to-copy-and-paste.

floppiplopp
0 replies
10h54m

I've had the most current publicly available models fail to give me even simple correct boilerplate code, but the guy is like: ...we have to be nuanced but, "Converting dozens of programs to C or Rust to improve performance 10-100x."? Seriously?

I also recently asked openai's gpt 4 which number is bigger, 3.8 or 3.11, and it was pretty damn sure that it's 3.11, because 11 bigger 8, obviously. Another time I asked Meta Llama 3.1 70B and gpt 4 multiple times using a variation of prompts to suggest a simple code example for a feature in a project I was working on. They confidently provided code that was total nonsense, configuration that did nothing and even offered a dependency that didn't exist but somewhat sounded like the prompt itself.

I cannot predict the future. Maybe all of this will lead to something useful. Currently though? Massively overhyped. I've talked to CS colleagues and friends who also work as software developers, all of them way more competent than me, about their experiences; some were excited about the prospects, but none could provide a current use case for their work. The only instances I know in which people talk positively about these models are in online articles like this or in comment sections adjacent to them. Never in real life among actual developers in person.

dmvdoug
0 replies
21h0m

I thought the author meant how they use the two-letter sequence “AI” and I just came here to say, Allen Iverson.

birracerveza
0 replies
5h55m

I work with multiple programming languages and it's a godsend. Having something that gives you mostly correct instructions on how to do a generic thing without having to wade through today's garbage web search experience is fantastic.

bionhoward
0 replies
19h46m

Must be nice to work on stuff that doesn’t compete with “intelligence as a service.” I feel that’s an empty set, and everyone using these services actively rationalizes selling out the human race by *paying to get brain raped.*

“Open” AI - customer noncompete
Copilot - customer noncompete
Anthropic - customer noncompete
Gemini - customer noncompete (api only, wow)

Just imagine millions of people know about the imitation game and still pay someone to fuck them over like that.

Surely, our descendants will thank us for such great contributions to the memory banks of the monopolies of the boring dystopia ..

ado__dev
0 replies
1d15h

This perfectly echoes my experience with AI.

It’s not perfect, but AI for working with code has been an absolute game changer for me.

TacticalCoder
0 replies
17h26m

I use it for boilerplate, totally uninteresting and boring code. Stuff like Bash parameter validation (it's only validation, so the damage when it hallucinates is quite limited and usually quickly shows up) and Google spreadsheet formula generation, stuff like: "extract ticker name from the OCC symbol in the previous column, write '-' instead if it's empty". It's really boring stuff to do manually and it's actually faster to have GPT-4o generate it for me from the sentence than to write it myself.
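
To give an idea, that spreadsheet formula boils down to roughly this Python sketch (the fixed OCC symbology assumption and the '-' fallback are just my illustration, not the exact formula GPT-4o produced):

    import re

    def ticker_from_occ(symbol: str) -> str:
        # e.g. "AAPL  240119C00150000" -> "AAPL"; empty cells become "-".
        if not symbol or not symbol.strip():
            return "-"
        # Assume the root ticker is the leading letters before the date/strike digits.
        match = re.match(r"[A-Za-z]+", symbol.strip())
        return match.group(0) if match else "-"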

Typically there is fixing needed (e.g. it shall fuck up things as trivial as parentheses placement) but it's still useful.

Lots of French/English translation too: it's actually good at that.

Flomlo
0 replies
1d12h

LLMs are the best human-to-computer interface I have ever seen.

Together with voice to text through whisper for example we broke the UI barrier.

It takes a little bit of time to rebuild our ecosystem, but LLMs are a game changer already.

I'm waiting for a fine-tuned small LLM that doesn't know facts in general but knows everything it needs to know for one specific task.

And I'm waiting until everything critical is rewritten so I can use one ai agent to control my bank, calendar, emails and stuff.

Perhaps through banking read only account permissions or whatnot.

ChildOfChaos
0 replies
20h50m

I mean it’s good, but all the answers seem to be coding, which seems to be the main use for large language models.