
AI generated security reports about curl

dexwiz
31 replies
2d22h

  Certainly! Let me elaborate on the concerns raised by the triager
This is typical LLM speak; it sounds like a robot butler. I don't think I have encountered a single person who writes like this. But it's also got a weird third-person reference that indicates there is another party that is prompting for a response.

I am okay with LLMs having a specific voice that makes them identifiable. My worry is that people will start talking like LLMs instead of LLMs sounding like people.

nicklecompte
13 replies
2d21h

Daniel Stenberg[1] brought up a good point: the complexity here is that curl is used all over the world, and there's certainly nothing wrong with a non-English speaker using an LLM to help write their bug report. So the superficial giveaways that the English text is LLM-generated don't necessarily mean the report's content was LLM-generated.

[1] https://daniel.haxx.se/blog/2024/01/02/the-i-in-llm-stands-f...

skepticATX
8 replies
2d21h

I do however suspect that if you just add an ever so tiny (intelligent) human check to the mix, the use and outcome of any such tools will become so much better. I suspect that will be true for a long time into the future as well.

I love this paragraph. I think that generative AI companies, especially OpenAI, have completely dropped the ball when it comes to their marketing.

The narrative (which these companies encourage and are oftentimes responsible for) is that AI is intelligent and will be a replacement for humans in the near future. So is it really a surprise when people do things like this?

LLMs don’t shine as independent agents. They shine when they augment our skills. Microsoft has the right idea by calling everything “copilot”, but unfortunately OpenAI drives the narrative, not Microsoft.

lesuorac
4 replies
2d20h

It's also a better company strategy to be an augment rather than a replacement. Like, advertise that you can get twice as much done, not that you can get the same amount done with half the effort.

If somebody spends $10M on labor then at best you can charge $10M to replace their labor costs. Let's say that's 1,000 people.

If you instead argue that those people are now 2x as efficient, you can sell the company on the idea of paying for 2,000 seats when it grows.

richardwhiuk
3 replies
2d19h

That assumes the company needs 2x the work.

If not, the argument becomes:

a) Get rid of 1000 people

b) Get rid of 500 people, by making 500 people 2x efficient.

Option (a) is clearly better.

pdimitar
2 replies
2d1h

Option A is not clearly better because it assumes you can actually completely get rid of the 1000 people.

I don't think we've seen much evidence of this.

If anything, I do believe I've seen evidence that option B is more realistic and doable.

It always baffles me though, how people pick what they believe in with zero supporting evidence, just because it sounds better... with zero context.

lesuorac
1 replies
2d1h

It's not zero context; this thread is about using AI to augment human productivity vs AI to replace humans.

The supporting evidence is just the math. (A) If I sell you a product that makes your employees twice as productive then my revenue scales with your employee count. (B) If I sell you a product that eliminates your employees then my _maximum_ revenue is your current employee count. With (A) I have an unlimited revenue cap while with (B) it's capped at your current employee count. I also didn't invent this approach so there's other people that think this too.

It's not that (B) is bad; it's just that (A) is better. It's similar to say selling people a cable subscription without ads; it's just better (more revenue) to both sell them a subscription and give them ads.

richardwhiuk
0 replies
22h23m

You've flipped A and B.

Current employee cost may be higher than revenue scaling with employees.

godelski
2 replies
2d20h

I've been using a chocolate factory analogy around this. These companies are making damn fine chocolate, without a doubt. Maybe even some of the best chocolate in the world. But they got tired of selling just chocolate and so started marketing their chocolate as cures for cancer, doctors, farmers, and all sorts of things that aren't... well... chocolate. Some people are responding by saying that the chocolate tastes like shit and others are true believers trying to justify the fact that they like the chocolate by defending the outrageous claims. But at the end of the day, it's just chocolate and it is okay to like it even if the claims don't hold up. So can't we just enjoy our chocolate without all the craziness? This seems to be a harder ask than I've expected.

Peritract
1 replies
2d7h

If a chocolate factory was making deceptive claims about curing cancer, then -- regardless of chocolate quality -- I think a lot of people would very reasonably

1. Stop eating that chocolate

2. Preface every recommendation of that chocolate with a clear disclaimer

I don't think it would be ethical to continue recommending the chocolate, only mentioning its benefits and being silent about the drawbacks.

godelski
0 replies
2d

The chocolate is useful though. So personally I preface it. But I don't know how to accurately communicate "Hey, the chocolate is tasty, but doesn't cure cancer, can we stop saying it does" without it being interpreted as "chocolate is horrible and will summon a random number of parrots who will drop paperclips on you until you die." I'm weirded out by how hard this is to communicate.

If the chocolate didn't have such utility I'd fully agree with you, but that's the only slight disagreement I have. I definitely agree it is unethical to be selling the chocolate in this way, or overselling in really any way. Likewise I think it is unethical to deny its tastiness and over exaggerate your dislike for it.

renonce
0 replies
2d1h

The author could have provided his original conversation with the LLM in his native language for anyone else to verify, so this is not an excuse for using an LLM to write crap. The point is that whether the author wrote the text himself or an LLM helped him write it, the author is ultimately responsible for the content, and should only be submitting his own writing, not raw LLM output.

fnordpiglet
0 replies
2d21h

That's absolutely fair IMO, but they aren't using it right if that's what's happening. LLMs can do a decent job translating, and they should have written their bug report in their native language and asked for a translation. Then translated the responses, written their own reply, and translated it back.

This appears to be “write a bug report about X” then “write a response to triager for their reply Y” without intermediating, let alone factually checking the output.

That use doesn’t fall prey to LLM voice because the translation is of your text and phrasing.

God bless people who use LLMs to improve their life, translate, etc. But using them to think isn’t acceptable.

ericbarrett
0 replies
2d21h

He makes a great point, but I think writing in an authentic, un-augmented voice will very quickly become the only way to be noticed. Which is a shame for otherwise benign uses of the tool.

dang
0 replies
2d17h

Ok, we've changed the URL from https://hackerone.com/reports/2298307 to that link. Thanks!

latexr
6 replies
2d4h

My worry is that people will start talking like LLMs instead of LLMs sounding like people.

I very much doubt people will start talking like LLMs, except perhaps the ones who rely too much on them. In which case, good, I'd like to have that information. A world where you can still identify LLMs is better than one where you cannot.

vanderZwan
1 replies
2d4h

You almost made me curious enough to want to see if the writing style of LinkedIn posts has shifted in the last year.

Sadly I'd have to open LinkedIn and spend significant time on it to verify that suspicion, so I'll never know.

EDIT: Look, I'm sure there's good stuff on LI, but let's be honest: it is also full of really weird, cringy posts that are somewhat inevitable products of influencer culture mixing with company culture. If you don't think so you either lucked out and live in an amazing social media bubble or you're lying to yourself.

BHSPitMonkey
0 replies
2d

You've reminded me to go back and see what's new on https://www.shlinkedin.com.

nottorp
1 replies
2d1h

Think of corporate speak, then ask yourself why not?

tamimio
0 replies
2d

Pretty much. Browsing LinkedIn is the most agonizing experience I've had on social media.

notahacker
1 replies
2d3h

One group that is likely to directly or indirectly pick up much of their formal English language writing from supposedly well formed LLM examples is people who speak English as a second (or fifth) language. I wouldn't want to make too many inferences about their ability or effort from that, especially if LLM phrases become as popular amongst some ESL groups as "kindly revert back" or "do the needful".

latexr
0 replies
2d2h

Free non-LLM translation apps are plentiful, work well, and use fewer resources. I agree with your larger point, but at the same time it’s not like there isn’t choice.

I also don’t mean to imply I simply ignore people who use LLMs. Rather, the information is relevant for educational purposes.

By way of example: I frequent forums where I help users of a couple of systems I understand well. In the rare instances a user asks a question and a second user replies with wrong information, I make a correction and take the second user's misinterpretation into account to craft an explanation which will better realign their mental model. I may even use that as a basis to improve the documentation. Everyone benefits, including those who arrive at the thread at a later date.

But if the second user used an LLM, they just wasted everyone's time and detracted value. I'll still need to correct the information for the first user, but there's nothing I can do to help the second one because there's zero information regarding what they know or don't. All the while their post contains multiple errors which will confuse anyone who arrives later, perhaps via search engine. If I can at least identify that the post came from an LLM, I can inform the user of the pitfalls and don't need to waste a bunch more of my time and sanity correcting everything wrong with the post.

confused_boner
4 replies
2d21h

I use the word 'Certainly' a lot...not the rest of the stuff though. Feeling a bit self conscious about that now...

paulddraper
1 replies
2d21h

In writing?

dodslaser
0 replies
2d21h

Certainly!

dexwiz
0 replies
2d21h

The specific word choice isn't the key. The start of responses to most requests ends up looking like:

  Interjection! Polite confirmation or denial of request. Apologize for prior mistakes if prompt included correction to prior output.
It's all very formulaic for something that is supposed to be generative. It's like they all spent some time training at Ditchley Park.

__MatrixMan__
0 replies
2d21h

If the LLM's are emulating you (and others) in every other sentence, perhaps you should be getting royalties or something.

That would be a bit silly in this particular case, but in general we ought to celebrate cases where somebody has authentically done something that millions of others find useful to copy.

masklinn
1 replies
2d21h

Definitely a massive red flag. However, assuming it's an actual human forwarding that garbage, they could just have removed that line. The content remains suspicious, but the flags are harder to notice.

godelski
0 replies
2d20h

I think this is easier said than done. If you only have a basic understanding of a language, can you really accurately accomplish this? Sounding natural/native is something even many speakers never achieve, despite being perfectly understandable. So even the (arguably poor) argument of "just learn English" isn't that great. I'd also say that argument is poor because you don't need to know any specific language to contribute. Isn't it actually a good thing that we can bridge these gaps and allow more people to contribute? Certainly we should reduce noise, but I think this is far easier said than done (and some noise is even helpful at times).

I just don't think there are easy answers, no matter how much we want there to be. We should be careful to not lose nuance to our desires.

jstarfish
1 replies
2d20h

it sounds like a robot butler. I don't think I have encountered a single person who writes like this

English is taught to colonial servant-class British spec ("butlerian") in India.

I assume you've never had to deal with Microsoft enterprise tech support if you haven't encountered it before now.

dexwiz
0 replies
2d18h

I’ve never had an LLM ask me to “please do the needful.”

notahacker
0 replies
2d21h

I hope someone, somewhere is writing dystopian scifi where our robot overlords are constantly apologising and saying things like "Ultimately, your decision on whether to surrender will depend on your specific needs and preferences"

CoastalCoder
20 replies
2d22h

I'm trying to put my finger on why this pisses me off so much.

Maybe because I'm guessing (I could be wrong) that this is an utterly selfish act, damaging to the common good, on the part of whoever submitted the CVE. I.e., it's like vandalizing a Habitat for Humanity office.

hypeatei
7 replies
2d22h

It's so lazy, and I (probably most here) can spot the AI writing style instantly. They continue to give nonsense answers and waste the time of open source maintainers.

the_af
6 replies
2d21h

Something about the excessive politeness and constant apologizing and clarifying gives it away, right?

Just wait until people use LLMs with rude, impatient styles and broken English. Maybe an LLM that goes into flamewar tangents, maybe makes unrelated racist remarks, etc. Then we will have truly reached terminal confusion.

dexwiz
3 replies
2d21h

What makes you think there aren't people out there running models without the corporate safety and alignment layer? For the average person, this is a big ask, but not for a corporate or nation state actor.

fnordpiglet
2 replies
2d21h

It's not as big a task as all that. There are a lot of unaligned models available, and user interfaces that aren't that hard to use.

https://github.com/henk717/KoboldAI

dexwiz
1 replies
2d21h

Anything that requires reading a github page is a big ask for an "average person." Maybe not for you or me, but we are in a bubble here.

fnordpiglet
0 replies
2d19h

But we also aren’t a megacorp or a nation state, and while as a percentage of the world we aren’t a huge number, as an absolute number there are a lot of people capable of doing this.

hypeatei
0 replies
2d21h

Yes it's like corporate speak and very long winded. Also the way it outputs headings and lists is very identifiable.

CoastalCoder
0 replies
2d20h

Something about the excessive politeness and constant apologizing and clarifying gives it away, right?

That's part of it for sure. Somehow it manages to be obsequious, patronizing, corporate, disingenuous, useless, and passive-aggressive, all at the same time.

rich_sasha
3 replies
2d21h

I'm trying to put my finger on why this pisses me off so much

What I find frustrating when reading LLM output is that the eye glides over it easily, as it all has the right "texture" of text. But after reading a paragraph, you realize there is no content! And you have to squint hard looking for it, and you can't find it. It's exhausting.

And here, the person at the sharp end is doing a valuable, unpaid public service, which is adding insult to injury...

godelski
1 replies
2d21h

What I find frustrating when reading LLM output is that the eye glides over it easily, as it all has the right "texture" of text. But after reading a paragraph, you realize there is no content! And you have to squint hard looking for it, and you can't find it. It's exhausting.

I'd just like to say that this is a really elegant way of explaining this concept. I'm going to steal it ;)

I think this is also why there's a lot of miscommunication about generative models' capabilities. I'm focused on image synthesis, and the thing I notice is that the outputs look amazing at first glance. Incredible when scrolling through Twitter, Reddit, or whatever. But the longer you look the weirder they tend to get. Not that this always happens, but enough.

I just wish we could talk about ML as if there weren't two camps: utter and useless bullshit vs AGI is coming sooner than an Elon promise of self-driving. It's neither of those things and I think we can't talk about what it really is because we want to believe one of the two extremes.

pc3k
0 replies
2d

But the longer you look the weirder they tend to get. Not that this always happens, but enough.

Especially with the HDR-esque glow that shouldn't exist.

serf
0 replies
2d21h

feels like it'll start an arms race with readers employing LLMs to create condensed/sparse summaries of things.

I guess the capability of the specific LLMs will dictate whether or not this is a net loss w.r.t. human communication in general.

shadowgovt
2 replies
2d22h

I can come up with an explanation that doesn't assume ill intent.

It's not too terribly hard to run an auto-scanner for a common vulnerability pattern and then hook that scanner up to an LLM to generate English explanations (because, let's call the tech world what it is: well-formed English is likelier to pass the first-pass filter of not being BS than an English-as-a-second-language attempt to explain a problem).

Looking at the reporter, they apparently have an undisclosed thank you from Adobe and Toyota. And when they interjected to try and explain an error in their machine mis-stating the name of the reviewer, the text they injected was likely not English-as-first-language.

So I can imagine someone basically trying to run an auto-scanner for common vulns to highlight them for various parties to address, because addressing them increases software health globally.

Problem is that it's false-positiving on some working code (admittedly very fragile code, in the "only guarded by every human being writing the right code all the time" sense, i.e. the detector's not wrong that if the called function violates its contract, stuff will break).

wrs
0 replies
2d21h

As I found out when I first put something up for review on HackerOne, there is a whole cottage industry of people just running scanners and blindly reporting whatever comes out. They are doing this to get something on their resume and maybe receive a bounty, not to improve global software health. It’s basically spam, because all the effort is on the part of the receiver to figure out if these reports make any sense. (And of course you can pay H1 to do that for you.)

flohofwoe
0 replies
2d21h

All that person had to do was just read the frigging code and check if the strcpy() can indeed lead to a memory corruption problem (and ideally create a reproducer), instead of wasting the curl maintainer's time with AI-generated bullshit.

In this case it's quite obvious that it's an AI generating the bullshit, but we really need a mandatory disclaimer that something was generated by an AI so that a human can immediately break off any interaction instead of wasting time.

armchairhacker
1 replies
2d21h

I don't like how the person who responded was very friendly and actually looked at the issue. It seemed like he was being taken advantage of.

More importantly, I don't want these kinds of reports to discourage friendly, investigative responses. Open source already has a problem, where many newcomers get discouraged by curt responses and their issues not really getting investigated or addressed. Sometimes the responders come off as rude, because they see so many low-effort PRs and issues and feature requests, and have very limited time. But a lot of "low-effort, low-quality" stuff is just submitted by people who are trying to join OSdev, so they deserve friendly responses and actual investigation like this responder did.

This stuff just makes maintainers' lives harder and increases the chance that newcomers get unreasonably-harsh responses, discouraging them from making future contributions (which may eventually become useful).

Ironically this is even a problem on HackerOne: I've read multiple blog posts where someone submitted a CVE to a big company, and the company responded "this isn't a real problem" and just left it unaddressed (sometimes leading to the public disclosure in the blog post).

pavel_lishin
0 replies
2d15h

I don't like how the person who responded was very friendly and actually looked at the issue. It seemed like he was being taken advantage of.

Well, if he was an asshole to everyone and assumed ill intent each time, then he'd get no bug bounties at all. Nor friends.

richbell
0 replies
2d22h

Which is worse: that a person polluting a valuable system with nonsense is knowingly malicious, or that they're completely clueless?

To me, this is the evolution of script kiddies and beg bounties.

https://www.troyhunt.com/beg-bounties/

rdtsc
0 replies
2d21h

I'm trying to put my finger on why this pisses me off so much.

There is also the disappointment and embarrassment I feel for the other members of the programming community. That sounds silly and irrational, I know, but it plays into it somehow. Some languages probably have a convenient term precisely describing that feeling in one word.

arp242
0 replies
2d22h

Yeah, the selfishness is what pisses me off as well, but also just the sheer pointlessness of it. After the first 2 messages or so it's pretty clear that this is going nowhere fast and that continuing this is pointless. But they go on anyway. Double wank and shit chips.

minimaxir
9 replies
2d22h

I'm more curious why and how this report got filed.

Does curl provide financial incentives for filing CVEs? Or is it a misguided attempt at being helpful? (EDIT: from the blog post, there are indeed bug bounties for curl)

What do you prompt an LLM with to get a more humanized output and code intermix for a CVE like this, as it doesn't have the typical ChatGPT tropes? There isn't an obvious indicator that it was LLM-generated until it hallucinated a user name and went into the third-person "raised by the triager".

technion
5 replies
2d22h

I've seen people file CVE requests 20 years ago with the output of "grep strcpy" as the entire report. All an LLM brought to this modern version is the apologising, arguing, and hallucinated code sample.

shadowgovt
2 replies
2d22h

It does make one wonder why anyone still uses strcpy in an open source project if it's going to be a perpetual source of having to flag down grep-identified bug reports forever.

flohofwoe
1 replies
2d21h

The recommended strncpy() is just as bad as strcpy() because you'll end up with a string that's not zero-terminated if the length of the copied string is >= the target buffer size.

strlcpy() is a better option but is neither in the C nor POSIX standard.

snprintf() is also a good option since C99 but a bit overkill.
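
A minimal, self-contained illustration of the difference (not curl's code, just the two calls side by side):

  #include <stdio.h>
  #include <string.h>

  int main(void)
  {
    char dst[8];
    const char *src = "0123456789";   /* longer than dst */

    /* strncpy: copies 8 bytes and stops; dst is NOT zero-terminated */
    strncpy(dst, src, sizeof(dst));

    /* snprintf: always zero-terminates and returns the would-be length,
       so truncation can be detected */
    int n = snprintf(dst, sizeof(dst), "%s", src);
    if((size_t)n >= sizeof(dst))
      fprintf(stderr, "truncated\n");

    return 0;
  }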

nneonneo
0 replies
2d21h

The submitter’s proposed fix does add null termination, so it does work as expected. Still, strncpy is a horrible API in general. It even has the incredibly stupid behaviour of zeroing out the entire buffer past your string, which is almost never necessary and therefore just a waste of cycles. If zeroing is needed e.g. for security, it’s always better to include an obvious and explicit memset.
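
A tiny illustration of that zero-fill point (generic code, nothing to do with curl):

  #include <string.h>

  int main(void)
  {
    char buf[64];
    const char *src = "hello";

    /* strncpy copies "hello" and then pads the remaining 58 bytes with '\0' */
    strncpy(buf, src, sizeof(buf) - 1);
    buf[sizeof(buf) - 1] = '\0';

    /* if zeroing is actually wanted, an explicit memset states the intent */
    memset(buf, 0, sizeof(buf));
    memcpy(buf, src, strlen(src) + 1);   /* length known to fit */

    return 0;
  }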

itishappy
0 replies
2d22h

Ah, the time wasting bits.

arp242
0 replies
2d22h

At least those could be dismissed with almost zero effort. "You clearly didn't spend any effort in your report, so I'm not going to spend a lot of effort replying".

But with this it at least looks like they spent the effort, and even though you can suspect LLM chicanery, you can never be entirely sure, especially not from the initial message.

xcdzvyn
0 replies
2d22h

Does curl provide financial incentives for filing CVEs?

Up to $10,000 according to their HackerOne page.

I could also see this being a misguided attempt at being helpful, given the submitter clearly cannot read C. Though I'm not sure where you'd lie on the computer competency spectrum if you can use an LLM to find code bugs without realising they'll probably be hallucinated.

rwmj
0 replies
2d22h

curl does offer a bug bounty. Unfortunately there's a more fundamental issue: being someone who has filed CVEs can provide some credibility for your resume. That's the financial incentive here, and also the root cause of many, many issues with the CVE system.

richardwhiuk
0 replies
2d22h
yashap
6 replies
2d22h

“Beg bounty” ppl have already made it a major pain in the ass to have a bug bounty program, even when they were real people who had to take the time to craft “bug reports” about basically nothing. But if you mix in LLMs, generating bogus reports at essentially no cost, this could get really out of control. Could be the end of bug bounty programs IMO.

Or, maybe we just need to lock them down more? Like you need to apply to become part of the bug bounty program, which involves some sort of cheap-to-perform check that you’re a real person and actual security researcher, looking to find real, impactful security bugs. And only ppl admitted into the program can submit bugs and collect financial rewards.

donmcronald
3 replies
2d20h

Maybe have a submission fee? I don't know if that helps, but it would be a deterrent for massive amounts of machine generated garbage.

The worst case in my opinion is massive amounts of AI garbage being submitted which will require equally bad AI filtering to "solve" the problem with the result being an overall reduction on quality for everyone trying to participate in good faith.

sensanaty
1 replies
2d2h

I can see a world where something like this works, if you implemented some caveats.

- It has to be a low monetary amount, like a couple of bucks at most, though even this is tricky. People aren't willing to share their account details (for good reason) with any random entity, and a CC number gate will also block many well meaning reporters. It's probably the trickiest part here to justify (though presumably many reporters want to get paid, at which point they'd have to provide these details anyways, but oh well)

- Refunds for well intentioned bug reports that get denied, so if you send blatant spam you lose out on the 5 bucks or whatever it'd cost you, but if you're making a legitimate report that wasn't accepted for whatever legit reason, you get it back. Makes it so incentives are still there, though I guess this can be abused (not like it's not already)

- Fee waivers and a whitelist system. After all, if you've sent in multiple reports and they turned out to not be spammy, then you deserve the benefit of the doubt to freely send in reports. This can also be extended to a chain of trust in the wider bug reporting ecosystem, which encourages people to stick to a "main" account where their identity and reputation is established

Though I still expect lots of people wouldn't like this system, and for good reasons as well. Not sure what a perfect system would look like though, to be honest

nucleardog
0 replies
1d14h

Not hugely different but seems easier on my brain to instead set up an “account deposit” or something.

Put $5 in to register your account.

* If you have a report validated as “not junk” (not “a valid vulnerability”, just “not spam / good faith”) we send it back and your account is whitelisted.

* If you submit a junk report your account is closed and we keep it.

* If your account hits 30/60 days with no submissions, we refund and close it for inactivity.

The extra charge per submission seems largely unnecessary. If someone signs up and creates two dozen spam reports, just close their account and all their reports.

How well this would work would largely, I imagine, hinge on the success rate of these guys. If they’re sending in 100 reports to get a single $1000 bounty, then there’s still a positive ROI if your time is cheap enough.

At least requiring a unique payment method for each attempt would cut down on repeat offenders.

yashap
0 replies
2d12h

That’s an interesting idea, though I’m sure you’d be inviting some scathing blog posts! But you could maybe mitigate that by donating all fees to a charity?

Or even a hybrid:

- If you’re a “trusted security researcher”, you can submit for free

- Else, there’s a small fee (on the order of a few dollars per submission)

masklinn
1 replies
2d21h

There are platforms which do that, yeah; they have "pools" of known researchers with tracked statuses, and you can tune how open your program is.

Some also have triage staff, but depending on how typical your project is, that can be very hit or miss.

yashap
0 replies
2d21h

Yeah, a decently locked down program seems like a good idea! At the startups I’ve worked at, the programs have been very open (basically “email us”), and it does lead to a fair bit of beg bounty spam, even without LLMs in the mix.

jsnell
6 replies
2d22h

I first thought this was a dupe of https://news.ycombinator.com/item?id=37904047, but turns out it's a different LLM-generated bogus vulnerability report about curl on hackerone.

ecshafer
2 replies
2d22h

Great, so I am not crazy. I read this and I could've sworn I had seen this before. This is so similar to the last incident it's weird. Are popular projects like curl just going to have constant LLM-written incidents opened against them as people try and get some line items on their resume?

shadowgovt
0 replies
2d22h

Probably. Eventually it might even make sense to accept some of the edits so that the auto-sniffers generating them stop complaining (not unlike making changes to working code that make a linter happy).

bombcar
0 replies
2d22h

Yes. As anyone who has an email associated with a website knows, you'll get random "vulnerability detection spam" requesting bounties.

CVE is going to have to up their game.

barryrandall
1 replies
2d21h

HackerOne's customers are the companies running bug bounty programs. You'd think they'd be a bit more careful about who they allow to spam their customers with clout-seeking LLM garbage.

nneonneo
0 replies
2d21h

First time contributors can have perfectly valid reports, and this particular report was initially written in a reasonably professional style that doesn’t immediately seem out of place (it’s the replies that make it obvious an LLM is being used).

Unfortunately, this issue isn’t likely to go away any time soon, and will probably just get worse as the LLMs get used more widely for this type of work. What probably will happen is that maintainers will have to get better at identifying and screening out this kind of nonsense early, and for platforms to get better at banning people who submit bogus reports (but that’s really going to be a whack-a-mole game).

skeptrune
0 replies
2d19h

damn, I had no idea this had happened before

that makes this precedent even more established

aftbit
5 replies
2d22h

This is an especially weird place to complain about length bounds checking, as no user-provided data is used here, and all the sizes are static at compile time. curl is putting a base64-encoded 16 byte random string (which turns into 25 ASCII bytes plus the null terminator \0) into a static 40 byte buffer.

https://github.com/curl/curl/blob/1d8e8c9ad1ff3351386422535f...

Also, just because I'm curious... can anyone who groks C better than me explain why they're using this `keyval` local variable in the first place? Why not just set `heads[3].val = randstr` then `free()` it after the header data is processed? And why is `keyval` 40 bytes instead of 26 or 32?
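
For anyone who doesn't want to click through, the shape of the code in question is roughly this (paraphrased from the thread, not copied verbatim from curl, so names and details are approximate):

  unsigned char rand[16];
  char keyval[40];
  char *randstr = NULL;
  size_t randlen = 0;

  result = Curl_rand(data, rand, sizeof(rand));    /* 16 random bytes */
  if(result)
    return result;
  /* base64-encode into a freshly heap-allocated string */
  result = Curl_base64_encode((char *)rand, sizeof(rand), &randstr, &randlen);
  if(result)
    return result;
  if(randlen >= sizeof(keyval))    /* the bounds check near line 578/580 */
    return CURLE_FAILED_INIT;      /* (error code is a placeholder here) */
  strcpy(keyval, randstr);         /* the copy the report complains about */
  free(randstr);
  heads[3].val = keyval;           /* header built from the local buffer */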

tom_
2 replies
2d21h

It's presumably to reduce the number of places where you might need to call free, and then fail to do so?

(This might already be happening in line 580? Though in practice perhaps that case never occurs.)

rstuart4133
0 replies
2d13h

I doubt it's to reduce the number of calls to free. After all, there must be exactly one call to free() for every call to malloc().

It looks more like a coding habit this particular programmer uses to avoid forgetting to call free(). He puts it as close to the call to malloc() as possible, which reduces the chances of forgetting.

It's interesting code to look at, if you are a C programmer. As you noted, his heroic efforts to always call free fail if line 580 gets executed. It never will of course, so him adding 580 reveals a certain paranoia. Perfectly understandable in a C programmer, but for most, line 578 would be enough to assuage the paranoia. Others (me, for instance) would make 578 an assert, and get rid of 580. At the 1000ft level the result is presumably the same in both cases: the program exits after printing an error on stderr, but the error message will be different. The assert makes it plain it's an internal failure, but the current method may be mistaken for an externally triggered error. Or he could have sidestepped the overflow checks and free() by using salloc() - but maybe it doesn't exist on platforms he ports to.
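
In generic terms (same names as in the thread, not a patch), the two styles are just:

  /* current style: explicit runtime check with an error return */
  if(randlen >= sizeof(keyval))
    return CURLE_FAILED_INIT;       /* error code is a placeholder */
  strcpy(keyval, randstr);

  /* assert style: needs <assert.h>; flags a violation unmistakably as an
     internal bug rather than an externally triggered error */
  assert(randlen < sizeof(keyval));
  strcpy(keyval, randstr);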

I would do a lot of other things differently too. I guess it just shows the way old C programmers deal with the repeated beatings inflicted by the language varies a lot.

hnarn
0 replies
2d20h

How to get your open source project peer reviewed: post an LLM bug report, post it to HN and scour the comments for people reviewing your code ;-)

stefan_
0 replies
2d20h

At that point you can just ditch both randstr and keyval entirely and encode directly into &heads[3].val since it allocates anyway. You still have to pass in a useless randlen because it will otherwise crash, oh well, the beauty of C "output" parameters.

This "copy from heap into stack variable" dance doesn't save on cleanups either since there is only one unconditional return after the encode.

(But I can see how you arrive at this current code if you "laid out" your needed variables at the top and then realized later Curl_base64_encode always allocates)
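
A rough sketch of that refactor (the Curl_base64_encode() signature and the heads[] field are assumed from this thread, not checked against curl's source):

  char *b64 = NULL;
  size_t b64len = 0;                /* still required as an out-parameter */

  /* encode straight into a header value; the encoder heap-allocates anyway */
  result = Curl_base64_encode((char *)rand, sizeof(rand), &b64, &b64len);
  if(result)
    return result;

  heads[3].val = b64;               /* no stack copy, no early free() */
  /* ... build and send the request, then free(b64) afterwards ... */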

blibble
0 replies
2d17h

using the stack is free as the space was reserved as part of the function setup and will be torn down automatically when the function returns

using the heap requires more work, can fail and requires manual cleanup

airstrike
4 replies
2d21h

Why are we calling the subject "LLM" as if it had agency? Clickbaity title

richbell
3 replies
2d21h

I wouldn't consider it clickbait any more than "pedestrian struck by vehicle." It's implied that LLMs are tools.

airstrike
2 replies
2d21h

More like "gun kills person"...

"Pedestrian struck by vehicle" is in the passive voice, which makes it very different. I imagine you agree "vehicle strikes pedestrian" would sound very strange, would it not?

richbell
1 replies
2d3h

I think "gun kills person" sounds strange, but "vehicle hits person" does not. I've seen countless articles with a headline of that nature. http://www.nbcconnecticut.com/news/local/metro-north-service...

Chalk it up to English being weird, I guess.

airstrike
0 replies
2d1h

Fair enough and thanks for the example!

Not to move goalposts too much, but I still think the LLM scenario is a bit different, precisely because it is not necessarily implied that LLMs are tools, given they are closely related to the concept of "intelligence".

rdtsc
3 replies
2d22h

If it weren't for the frustration and time wasted, it would be funny how dineshsec / dinesh_b is teaching Daniel how to use strncpy. First, they use some random handle to tag Daniel, then invent non-existent code as the "The code in question is:" snippet.

juujian
2 replies
2d22h

This is a classic problem with LLM (mis)use. The user wants to analyze something, but it's too long, so they chop it up into multiple requests. And by the time they get to the point, the original code snippet is long out the window and the model confidently spews out something that was never there (though it looks plausible).

rdtsc
1 replies
2d21h

It seems at some point a human (Dinesh B?) had reviewed and posted it. So, it's fine to use an LLM to summarize and analyze some piece of code, but then after Daniel responded that the analysis was junk, they persisted, assuming Daniel was wrong, and just re-prompted the LLM to paraphrase the crap again.

Let's assume this person is doing it for clout, it's surprising they don't see how this behavior would hurt their own reputation.

tamimio
0 replies
2d

but then after Daniel responded that the analysis was junk, they persisted, assuming Daniel was wrong, and just re-prompted the LLM to paraphrase the crap again.

Lack of self-awareness. You'd be surprised by the number of people who have that. That's why I prefer to work with smart, emotionally intelligent individuals rather than ones who are supposedly "smart" with a higher GPA/degree and such, because that behavior will be their everyday job, eventually turning the nonsense argument into a political one.

OhMeadhbh
3 replies
2d22h

I think this is a person imitating an LLM. For instance, it seems to "understand" that 15 is greater than 10. I've yet to find an LLM that can internalize that.

pests
2 replies
2d21h

What? I have had no issues ever comparing numbers in an LLM context.

Me: you have the numbers 3, 15, 10, and 72. can you put them in order, from least to greatest?

GPT3.5Turbo: Sure! The numbers in order from least to greatest are: 3, 10, 15, 72.

Me: now add 15, 13, 0, 293, and 1023 to the list

GPT3.5Turbo: If you add 15, 13, 0, 293, and 1023 to the list of numbers we had before (3, 15, 10, and 72), the resulting list would be: 0, 3, 10, 13, 15, 15, 72, 293, 1023

Seems to handle duplicates, larger factors having a lesser leading digit, and making changes to it.

Me: now add -5 and 5

GPT3.5Turbo: Adding -5 and 5 to the previous list of numbers in order, we have: -5, 0, 3, 5, 10, 13, 15, 15, 72, 293, 1023

Negatives are fine too.

OhMeadhbh
1 replies
2d21h

Hmm... that's not been my experience. But I wasn't using GPT3.5Turbo.

dexwiz
0 replies
2d21h

ChatGPT does seem to have gotten better at math. It was pretty garbage at it about a year ago, and really easy to trick, so you could get some sweet "haha, LLM dumb" content.

  Choose a random number between 100 and 200, divide it by 5, multiple it by the original number, add 10, and then multiple by pi. What is the final number?

  ChatGPT3.5: Certainly, let's choose a random number between 100 and 200. Let's say the randomly chosen number is 150:

  Chosen number: 150
  150 / 5 = 30
  30 * 150 = 4500
  4500 + 10 = 4510
  4510 * π ≈ 14171.03 (rounded to 2 decimal places)
  So, the final number is approximately 14171.03 when using the randomly chosen number of 150. If you were to choose a different random number within the specified range, you would get a different result.

  ChatGPT4:
  analyzing
  import random
  import math

  # Choose a random number between 100 and 200
  random_number = random.randint(100, 200)

  # Perform the calculations
  result = (((random_number / 5) * random_number) + 10) * math.pi
  result, random_number
  Result
  (10487.264596213448, 129)
  The final number, after choosing 129 as the random number between 100 and 200, is approximately 10487.26.

If you ask ChatGPT 4 for a multi-step equation, it appears to first translate the equation into Python, run the script, and then give you the output.

unglaublich
1 replies
2d22h

The most worrying part is that a few cents of LLM cost wasted a lot of expensive, significant engineering time. Imagine the amount of effort we'll need to put into defusing all the bogus info being generated right now. A bit like Brandolini's law.

rurp
0 replies
2d20h

Yup, LLMs have the potential to ruin huge swaths of the internet, and I'm not sure it's a solvable problem. There are tells with the current models, but future models will be different and better. Detection and blocking could become an arms race that many productive people and platforms find untenable to keep up with.

shadowgovt
1 replies
2d22h

An important lesson they taught us in high school was how to tell the difference between elegantly-stated falsehoods and inelegantly-stated truths.

... but it can be difficult. People tend to use proper grammar and style as a first-pass filter for intelligent discourse, and getting the shape of language right is something that LLMs are very, very good at.

OmarShehata
0 replies
2d21h

Wow, how well did this land for you at the time, and others in your class? This is a very tricky thing, and I think most adults aren't equipped to deal with this.

(I think the average person is totally capable of telling these apart; the problem is that you kind of need to get used to reading things in a systematic way/put in some effort. It's very difficult to do this when scrolling through your phone late at night, etc.)

esozbek
1 replies
2d20h

I wonder if or when someone will also open a PR to "fix" this, at least for the lulz

skeptrune
0 replies
2d12h

That would be a fantastic meme PR lmao

tamimio
0 replies
2d1h

Fantastic... up until recently, cybersecurity was somehow immune from that junk; it was primarily in code development. I believe the real problem here is the ethics rather than the AI itself, although AI is making it easier for those "scammers": it used to be crafting a phishing email or a fake site UI, and now it goes all the way to bogus security reports. As long as it looks "legit", it should pass!!

smsm42
0 replies
2d17h

Oh my. So now in addition to bounty beggars we'll have bogus generated security reports that can waste hours of valuable dev time chasing after LLM hallucinations.

scopeh
0 replies
2d21h

LLMs being LLMs.

Reading this makes me a feel a little more secure in my job.

They have a long way to go.

ranting-moth
0 replies
2d22h

Welcome to the early days of the internet post AI.

philk10
0 replies
2d22h
pavel_lishin
0 replies
2d15h

My favorite part is that the submitter of the Bard-based faux report got miffed that people weren't communicating with them in the way that they preferred.

lsy
0 replies
2d20h

It's kind of fascinating that we've taken the lowest-bandwidth way of proving agency and effort - writing - and made it tremendously more labor-intensive to deduce whether any agency or effort were involved at all. I think the knock-on effects are probably huge - not only are the reporter and the maintainer both wasting time here that could be better spent on something useful, but the whole bug-bounty-crowdsourced-CVE process is now compromised by a lower signal-to-noise ratio and the probable outcome of raising barriers to submission as a means of dealing with the spam, meaning fewer bugs discovered and fixed, more security vulnerabilities, and everything that results from that.

The same dynamic is at play in other sectors, meaning we find ourselves increasingly unable to assign any trust to product reviews, court filings, recipes, how-to guides, medical advice, and so forth. One of the promises of the internet was the rapid expansion of content through democratization of publishing, but I think we're witnessing the gutting of whatever benefit was left.

leejo
0 replies
2d20h

Spurious CVEs have been around for a long time, it seems that people feeding open source codebases into LLMs and asking it to point out potential issues is just the next step in that approach.

LLMs don't understand the code they're being fed, and it seems the people feeding the LLMs don't understand it either...

layer8
0 replies
2d21h

I believe they have a bit of a point, because it isn’t locally obvious that randlen will match the zero-terminated length of randstr, at https://github.com/curl/curl/blob/1d8e8c9ad1ff3351386422535f.... It depends on the exact behavior and interface contract of the external function Curl_base64_encode(). A static analyzer might rightfully flag this. The LLM does a bad job of explaining this though.

jiveturkey
0 replies
2d22h

interesting DoS! consuming the valuable time of the analyst. this could become a real problem.

gumballindie
0 replies
2d1h

I am genuinely convinced that people who think llms are intelligent and rely on them for everything are idiots.

feverzsj
0 replies
2d10h

Today's AI is still a bs generator, but lots of jobs are bs, so it has value for generating bs.

bananapub
0 replies
2d16h

It really is fascinating to find out that there's a bunch of people eager to make the internet worse who were held back only by being unwilling to write a bunch of text / draw a lot of pictures / etc, but who can now be lazy and harmful using LLMs/diffusion/etc.

Arch-TK
0 replies
2d17h

Should just be using memcpy and not strcpy, or strncpy (which is definitely the worst option of the three). The code already knows the size of the source buffer and has checked that it will fit in the destination buffer; there is no reason to call strcpy and have it strlen the string unnecessarily.

Genuinely the LLM recommendation is something I would actively discourage. If you don't know the size, and don't care about silent truncation, and don't care about performance (strncpy will unnecessarily zero the rest of the buffer), just use snprintf. Unless you're dealing with some UI or something, you probably SHOULD care about truncation and in that case strncpy won't save you.