I've trained as a neuroscientist and written a book about consciousness. I've worked in machine learning and built products for over 20 years and now use AI a fair bit in the ed-tech work we do.
So I've seen how the field has progressed and have also been able to look at it from a perspective most AI/engineering people don't -- what does this artificial intelligence look like when compared to biological intelligence. And I must say I am absolutely astonished people don't see this as opening the floodgates to staggeringly powerful artificial intelligence. We've run the 4-minute mile. There are hundreds of billions of dollars going into figuring out how to get to the next level, and it's clear we are close. Forget what the current models are doing; what matters is what the next big leap (most likely with some new architecture change) will bring.
In focusing on intelligence we forget that it's most likely a much easier challenge than decentralized cheap autonomy, which is what took the planet 4 billion years to figure out. Once that was done, intelligence as we recognize it took an eye-blink. Just like with powered flight, we don't need biological intelligence to transform the world. Artificial intelligence that guzzles electricity, is brittle, has blind spots, but is still capable of 1000 times more than the best among us is going to be here within the next decade. It's not here yet, no doubt, but I have yet to see any reasoned argument for why it is far more difficult and will take far longer. We are in for radical non-linear change.
I've worked in AI for the past 30 years and have seen enthusiasm as robust as yours go bust before. Just because some kinds of narrow AI have done extraordinarily well -- namely those tasks that recognize patterns using connections between FSMs -- does not mean those same mechanisms will continue to scale up to human-level cognition, much less exceed it any time soon.
The breakthroughs where deep AI has excelled -- object recognition in images, voice recognition and generation, and text-based info embedding and retrieval -- require none of the multilevel abstraction that characterizes higher cognition (Kahneman's System 2 thinking). Only when we see steady progress on such frontiers can a plausible case be made that the essentials of AGI are indeed within our grasp. Until then, plateauing at a higher level of pattern matching than we had expected -- which is what we have seen many times before from narrow AI -- is not sufficient evidence that the other requisite skills needed for AGI are surely just around the corner.
So I am a neophyte in this area, but my thesis for why "this time is different" compared to previous AI bubbles is that this time there are a bunch of clear products (or paths to products) that work and only require what is currently available in terms of technology.
Coding assistants today are useful, image generation is useful, speech recognition/generation is useful.
All of these can support businesses, even in their current (early) state. Those businesses see value in funding even 1% improvements in engineering/science.
I think this is different from before, where even in the 80s there were fewer clearly defined products, and most everything was a prototype that needed just a bit more research to be commercially viable.
Whereas in the past hopes for the technology waned and funding for research dropped off a cliff, today's stuff is useful now, and so companies will continue to spend some amount on the research side.
Really? I work in AI and my biggest concern is that I don't see any real products coming out of this space. I work closer to the models, and people in this specific area are making progress, but when I look at what's being done downstream I see nothing, save demos that don't scale beyond a few examples.
This is literally all I see right now. There's some really fun hobbyist stuff happening in the image gen area that I think is here to stay, but LLMs haven't broken out of the "autocomplete on steroids" use cases.
Can you give me five examples of profitable, non-coding-assistant use cases for LLMs that aren't still in the "needed just a bit more research to be commercially viable" stage?
I love working in AI, think the technology is amazing, and do think there are some under-exploited (though less exciting) use cases, but all I see is big promises with under-delivery. I would love to be proven wrong.
1. Content Generation:
LLMs can be used to generate high-quality, human-like content such as articles, blog posts, social media posts, and even short stories. Businesses can leverage this capability to save time and resources on content creation, and improve the consistency and quality of their online presence.
2. Customer Service and Support:
LLMs can be integrated into chatbots and virtual assistants to provide fast, accurate, and personalized responses to customer inquiries. This can help businesses improve their customer experience, reduce the workload on human customer service representatives, and provide 24/7 support.
3. Summarization and Insights:
LLMs can be used to analyze large volumes of text data, such as reports, research papers, or customer feedback, and generate concise summaries and insights. This can be valuable for businesses in fields like market research, financial analysis, or strategic planning.
4. HR Candidate Screening:
Use case: Using LLMs to assess job applicant resumes, cover letters, and interview responses to identify the most qualified candidates. Example: A large retailer integrating an LLM-based recruiting assistant to help sift through hundreds of applications for entry-level roles.
5. Legal Document Review:
Use case: Employing LLMs to rapidly scan through large volumes of legal contracts, case files, and regulatory documents to identify key terms, risks, and relevant information. Example: A corporate law firm deploying an LLM tool to streamline the due diligence process for mergers and acquisitions.
Spam isn't a feature. See also, this whole message that could just have been the headlines.
So… less clear than the website and not empowered to do anything (beyond ruining your reputation) because even you don't trust it?
See 1, spam isn't a feature. This is just trying to undo the damage from that (and failing).
If it's worth doing, it's worth doing well.
This seems unnecessarily negative to me.
I'm working on AI tools for teachers and I can confidently say that GPT is just unbelievably good at generating explanations, exercises, quizzes etc. The onus to review the output is on the teacher obviously, but given they're the subject matter experts, a review is quick and takes a fraction of the time that it would take to otherwise create this content from scratch.
It is negative. Because the rest of us are still forced to wade through the endless worthless sludge your ilk produces.
reducing the load on overworked teachers by using GPT to generate exercises, quizzes and explanations for students is "endless worthless sludge"?
I have teachers in my family; their lives have been basically ruined by people using ChatGPT-4 to cheat on their assignments. They spend their weekends trying to work out whether someone has "actually written" this or not.
So sorry, we're back to spam generator. Even if it's "good spam".
a bit dramatic. there has to be an adjustment of teaching/assessing, but nothing that would "ruin" anyone's life.
is it spam if it's useful and solves a problem? I don't agree it fits the definition any more.
Teachers are under immense pressure, GPT allows a teacher to generate extension questions for gifted students or differentiate for less capable students, all on the fly. It can create CBT material tailored to a class or even an individual student. It's an extremely useful tool for capable teachers.
If you don't have the power to just change your mind about what the entire curriculum and/or assessment context is, it can be a workload increase of dozens of hours per week or more. If you do have the power, and do want to change your entire curriculum, it's hundreds of hours one-time. "Lives basically ruined" is an exaggeration, but you're preposterously understating the negative impact.
Whether or not it's useful has nothing to do with whether or not it's spam. I'm not claiming that your product is spam -- I'll get back to that -- but your reply to the spam accusation is completely wrong.
As for your hypothesis, I've had interactions where it did a good job of generating alternative activities/exercises, and interactions where it strenuously and lengthily kept suggesting absolute garbage. There's already garbage on the internet, we don't need LLMs to generate more. But yes, I've had situations where I got a good suggestion or two or three, in a list of ten or twenty, and although that's kind of blech, it's still better than not having the good suggestions.
I think it has a lot to do with it. I can't see how generating educational content for the purpose of enhancing student outcomes with content reviewed by expert teachers can fall under the category of spam.
I'd like to present concrete examples of what I would consider to be useful content for a K-12 teacher.
Here's a very quick example that I whipped up
https://chatgpt.com/share/ec0927bc-0407-478b-b8e5-47aabb52d2...
This would align with Year 9 Maths for the Australian Curriculum.
This is an extremely valuable tool for
- A graduate teacher struggling to keep up with creating resources for new classes
- An experienced teacher moving to a new subject area or year level
Bear in mind that the GPT output is not necessarily intended to be used verbatim. A qualified specialist teacher, often with 6 years of study (4-year undergrad + 2-year Masters), is the expert in the room who presumably will review the output, adjust, elaborate etc.
As a launching pad for tailored content for a gifted student, or lower level, differentiated content for a struggling student the GPT response is absolutely phenomenal. Unbelievably good.
I've used Maths as an example; however, it's also very good at giving topic overviews across the Australian Curriculum.
Here's one for: elements of poetry: structure and forms
https://chatgpt.com/share/979a33e5-0d2d-4213-af14-408385ed39...
Again, an amazing introduction to the topic (I can't remember the exact curriculum outcome it's aligned to) which gives the teacher a structured intro which can then be spun off into exercises, activities or deep dives into the sub topics.
This is a result of poor prompting. I'm working with very structured, detailed curriculum documents and the output across subject areas is just unbelievably good.
This is all for a K-12 context.
There are countless existing, human-vetted, purpose-designed bodies of work full of material like the stuff your ChatGPT just "created". Why not use those?
Also, each of your examples had at least one error, did you not see them?
I didn't; could you point them out?
As a classroom teacher I can tell you that piecing together existing resources is hard work and sometimes impossible, because resource A is in this textbook (which might not be digital), resource B is on that website and quiz C is on another site. Sometimes it's impossible or very difficult to put all these pieces together in a cohesive manner. GPT can do all that and more.
The point is not to replace all existing resources with GPT, this is all or nothing logic. It's another tool in the tool belt which can save time and provide new ways of doing things.
is it spam if it's useful and solves a problem? I don't agree it fits the definition any more.
Who said generating an essay is useful, sorry? What problem does that solve?
Your comments come across as overly optimistic and dismissive, like you have something to gain personally and aren't interested in listening to others' feedback.
I'm developing tools to help teachers generate learning material, exercises and quizzes tailored to student needs.
Useful learning materials aligned with curriculum outcomes, taking into account learner needs and current level of understanding is literally the bread and butter of teaching.
I think those kinds of resources are both useful and solve a very real problem.
Fair point. I do have something to gain here. I've given a number of example prompts that are extremely useful for a working teacher in my replies to this thread. I don't think I'm being overly optimistic here. I'm not talking vague hypotheticals here, the tools that I'm building are already showing great usefulness.
One potential fix, or at least a partial mitigation, could be to weight homework 50% and exams 50%, and if a student's exam grades differ from their homework grades by a significant amount (e.g. 2 standard deviations) then the lower grade gets 100% weight. It's a crude instrument, but it might do the job.
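In rough pseudo-code, the rule I have in mind looks something like this (just a sketch; the 2-standard-deviation cutoff is the example figure above, and how you estimate the spread of homework/exam gaps is up to the school):

    # Sketch of the proposed weighting rule (illustrative only).
    def final_grade(homework_avg, exam_avg, gap_std):
        # homework_avg, exam_avg: the student's averages on the same scale.
        # gap_std: standard deviation of (homework - exam) gaps across the class,
        # a rough estimate of what a "normal" mismatch looks like.
        if abs(homework_avg - exam_avg) > 2 * gap_std:
            # Suspiciously large mismatch: the lower score gets 100% weight.
            return min(homework_avg, exam_avg)
        # Otherwise the ordinary 50/50 split applies.
        return 0.5 * homework_avg + 0.5 * exam_avg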
Why haven’t they just gone back to basics and force students to write out long essays on paper by hand and in class?
Also have teachers in my family. Most of the time is spent adjusting the syllabus schedule and guiding (orally) the stragglers. Exercises, quizzes and explanations are routine enough that good teachers I know can generate them on the spot.
Every year there are thousands of graduate teachers looking for tools to help them teach better.
Even the best teacher can't create an interactive multiple choice quiz with automatic marking, tailored to a specific class (or even a specific student) on the spot.
I've been teaching for 20+ years, I have a solid grasp of the pain points.
Neither can "AI" though, so what's the point here?
I'm creating tools on top of AI that can, which is my point.
Can you post a question and answer example if it doesn’t violate NDA because I have very little faith this is good for students.
sure
here's an example of a question and explanation which aligns to Australian Curriculum elaboration AC9M9A01_E4, explaining why 3^4/3^4 = 1 and 3^(4-4) = 3^0
https://chatgpt.com/share/89c26d4f-2d8f-4043-acd7-f1c2be48c2...
to further elaborate why 3^0=1 https://chatgpt.com/share/9ca34c7f-49df-40ba-a9ef-cd21286392...
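For anyone who doesn't want to click through, the underlying index-law argument the explanations walk through is essentially this (my paraphrase, not the exact GPT output):

    \frac{3^4}{3^4} = \frac{81}{81} = 1
    \quad\text{and}\quad
    \frac{3^4}{3^4} = 3^{4-4} = 3^0,
    \quad\text{so}\quad 3^0 = 1.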
This is a relatively high level explanation. With proper prompting (which, sorry I don't have on hand right now) the explanation can be tailored to the target year level (Year 9 in this case) with exercises, additional examples and a quiz to test knowledge.
This is just the first example I have on hand and is just barely scratching the surface of what can be done.
The tools I'm building are aligned to the Australian Curriculum, and as someone with a lot of classroom experience I can tell you that this kind of tailored content, explanations, exercises etc. is a literal godsend for teachers regardless of experience level.
Bear in mind that the teacher with a 4 year undergrad in their specialist area and a Masters in teaching can use these initial explanations as a launching pad for generating tailored content for their class and even tailored content for individual students (either higher or lower level depending on student needs). The reason I mention this is because there is a lot of hand-wringing about hallucinations. To which my response is:
- After spending a lot of effort vetting the correctness of responses for a K-12 context, hallucinations are not an issue. The training corpus is so saturated with correct data that this is not an issue in practice.
- In the unlikely scenario of hallucination, the response is vetted by a trained teacher who can quickly edit and adjust responses to suit their needs
Let's call it what it is: taking poorly organized existing information and making it organized and interactive.
“Here are some sharepoint locations, site Maps, and wikis. Now regurgitate this info to me as if you are a friendly call center agent.”
Pretty cool but not much more than pushing existing data around. True AI I think is being able to learn some baseline of skills and then through experience and feedback adapt and be able to formulate new thoughts that eventually become part of the learned information. That is what humans excel at and so far something LLMs can’t do. Given the inherent difficulty of the task I think we aren’t much closer to that than before as the problems seem algorithmic and not merely hardware constrained.
Which is extremely valuable!
Don't underestimate how valuable it is for teachers to do exactly that. Taking existing information, making it digestible, presenting it in new and interesting ways is a teacher's bread and butter.
It’s valuable for use cases where the problem is “I don’t know the answer to this question and don’t know where to find it.” That’s not in and of itself a multibillion dollar business when the alternative doesn’t cost that much in the grand scheme of things (asking someone for help or looking for the answer).
Are you suggesting a chatbot is a suitable replacement for a teacher?
As a teacher - I have no shortage of exercises, quizzes etc. The internet is full of this kind of stuff and I have no trouble finding more than I ever need. 95% of my time and mental capacity in this situation goes to deciding: what makes sense in my particular pedagogical context? What wording works best for my particular students? Explanations are even harder. I find out almost daily that explanations which worked fine last year don't work any more, and I have to find a new way, because the prior knowledge, the words they use and know, etc. of new students are different again.
Which all takes valuable time us teachers are extremely short on.
I've been a classroom teacher for more than 20 years; I know how painful it is to piece together a hodgepodge of resources to put together lessons. Yes, the information is out there, but a one-click option to gather it into a cohesive unit saves me valuable time.
Which is exactly what GPT is amazing at. Brainstorming, rewriting, suggesting new angles of approach is GPT's main strength!
Prompting GPT to give useful answers is part of the art of using these new tools. Ask GPT to speak in a different voice, take on a persona or target a different age group and you'll be amazed at what it can output.
Exactly! Reframing your own point of view is hard work, GPT can be an invaluable assistant in this area.
I’ve rarely if ever seen a model fully explain mathematical answers outside of simple geometry and algebra to what I would call an adequate level. It gets the answer right more often than explaining why that is the correct answer. For example, it finds a minimal case to optimization, but can’t explain why that is the minimal result among all possibilities.
Dear lord, if someone started relying on LLMs for legal documents, their clients would be royally screwed…
They're currently already relying on overworked, underpaid interns who draft those documents. The lawyer is checking it anyway. Now the lawyer and his intern have time to check it.
I have no idea what type of law you're talking about here, but (given the context of the thread) I can guarantee you major firms working on M&As are most definitely not using underpaid interns to draft those documents. They are overpaid qualified solicitors.
Apologies I mean candidate attorneys when I say interns. Those overpaid qualified attorneys, read it and sign off on it.
I suggest we do not repeat the myth and urban legend that LLMs are good for legal document review. I had a couple of real use cases, for real clients who were hyped about using LLMs for document review to save on salaries, for English-language documents. We found Kira, Luminance and similar due diligence project management tools useful as time savers if done right. But not LLMs.

Due to longer context windows, it is possible to ask LLMs the usual hazy questions that people ask in a due diligence review (many of which can be answered dozens of different ways by human lawyers): is there a most favoured nation provision in the contract, is there a financial cap limiting the liability of the seller or the buyer, what is the governing law, etc. Considering the risks of uploading such documents into ChatGPT, you are stuck with Copilot M365 etc. or some outrageously expensive "legal specific" LLMs that I cannot test.

Just out of curiosity, with Copilot I asked five rather simple questions about three different agreements (where we had the golden answer), and the results were quite unequal, but mostly useless. In one contract it incorrectly reported for all questions that they could not be answered based on the contract (while the answers were clearly included in the document). In another, two questions were answered correctly, two were not answered precisely (just "governing law is the US" instead of the correct answer, Michigan, even after reprompting for the state-level answer rather than "USA"), and one answer was hallucinated incorrectly. In the third, three answers were hallucinated incorrectly, one was answered correctly and one provision was not found.

Of course, it would be better to have a legal-specific benchmark for this, but 75% hallucination on complex questions is not something that helps your workflow (https://hai.stanford.edu/news/hallucinating-law-legal-mistak...). I don't recommend LLMs to anyone for legal document review, even for English-language documents.
I'm not talking about reviewing, only drafting. Every word should be checked. A terrible idea relying on the advice of an LLM.
Except for number 3, the rest are more often disastrous or insulting to users and those depending on the end products/services of these things. Your reasoning is so bad that I'm almost tempted to think you're spooning out PR-babble astroturf for some part of the industry. Here's a quick breakdown:
1. content: Nope, except for barrel-bottom content sludge of the kind formerly done by third world spam spinning companies, most decent content creation stays well away from AI except for generating basic content layout templates. I work as a writer and even now, most companies stay well away from using GPT et al for anything they want to be respected as content. Please..
2. Customer service: You've just written a string of PR corporate-speak AI seller bullshit that barely corresponds to reality. People WANT to speak to humans, and except for very basic inquiries, they feel insulted if they're forced into interaction with some idiotic stochastic parrot of an AI for any serious customer support problems. Just imagine some guy trying to handle a major problem with his family's insurance claim or urgently access money that's been frozen in his bank account, and then forced to do these things via the half-baked bullshit funnel that is an AI. If you run a company that forces that upon me for anything serious in customer service, I would get you the fuck out of my life and recommend any friend willing to listen does the same.
3. This is the one area where I'd grant LLMs some major forward space, but even then with a very keen eye to reviewing anything they output for "hallucinations" and outright errors unless you flat out don't care about data or concept accuracy.
4. For reasons related to the above (especially #2) what a categorically terrible, rigid way to screen human beings with possible human qualities that aren't easily visible when examined by some piece of machine learning and its checkbox criteria.
5. Just, Fuck No... I'd run as fast and far as possible from anyone using LLMs to deal with complex legal issues that could involve my eventual imprisonment or lawsuit-induced bankruptcy.
2. I think you overestimate the caliber of query received in most call centres. Even when it comes to private banks (for those who've been successful in life), the query is most often something small like holding their hand and telling them to press the "login" button.
Also these all tend to have an option where you simply ask it and it will redirect you to a person.
Those agents deal with the same queries all day, despite what you think your problem likely isn't special, in most cases may as well start calling the agents "stochastic parrots" too while you're at it.
I’ve been doing RLHF and adjacent work for 6 months. The model responses across a wide array of subject matter are surface level. Logical reasoning, mathematics, step by step, summarization, extraction, generation. It’s the kind of output the average C student is doing.
We specifically don't do programming prompts/responses nor advanced college- to PhD-level stuff, but it's really mediocre at this level and in these subject areas. Programming might be another story, I can't speak to that.
All I can go off is my experience but it’s not been great. I’m willing to be wrong.
Is the output of average C students not commercially valuable in the listed fields? If AI is competing reliably with students then we've already hit AGI.
IMO the unreasonable uselessness of LLMs is because for most tasks involving language the accuracy needs to be unbelievably high to have any real value at all.
We just don't have that.
We have autocomplete on steroids and many people are fooling themselves that if you just take more steroids you will get better and better results. The metaphor is perfect because if you take more and more steroids you get less and less results.
It is why in reality we have had almost no progress since April 2023 and ChatGPT-4.
Calling LLMs autocompleters is an insult to autocompleters.
I don't find coding assistants to be very useful. Image generation was fun for a few weeks. Speech recognition is useful.
Anyway, considering all these things can be done on device, where is the long term business prospect of which you speak?
I've come to notice a correlation between contemporary AI optimism and having effectively made the jump to coding with AI assistants.
I think this depends heavily on what type of coding you're doing. The more your job could be replaced by copy/pasting from Stack Overflow, the more useful you find coding assistants.
For the past few years most of the code I've written has been solving fairly niche quantitative problems with novel approaches, and I've found AI coding assistants to range from useless to harmful.
But on a recent webdev project, they were much more useful. The vast majority of problems in webdev are fundamentally not unique so a searchable pattern library (which is what an LLM coding assistant basically is) should be pretty effective.
For other areas of software, they're not nearly as useful.
I think this is true and also why you see some "older devs just don't like AI" type comments. AI assistants seem to be great at simple webdev tasks, which also happens to be the type of work that more junior developers do day to day.
I have also found them useful with that and I keep one active for those types of projects because of the speed up, although I still have to keep a close eye on what it wants to inject. They also seem to excel at generating tests if you have already developed the functions.
Then there are more difficult (usually not webdev) projects. In those cases, it really only shines if I need to ask it a question that I would previously have searched on SO or some obscure board for an answer. And even then, it really has to be scrutinized, because if it was simple, I wouldn't be asking the question.
There is def. something there for specific types of development, but it has not "changed my life" or anything like that. It might have if I was just starting out or if I only did webdev type projects.
As an "older dev" who doesn't like AI, the thing that annoys me most is the UX is horrible. It's like arguing in chat with an extremely overconfident junior dev who isn't capable of learning or improving with time and experience. That's just a miserable way to spend time. I'd rather spend that time thinking clearly about the problem, and then writing down the solution (clearly).
If this thing also conferred an actual productivity advantage that would be one thing, and it might motivate me to get past the horrible UX, but I haven't seen any evidence yet.
I fear the approach that maximises productivity is a literal one-shot approach: give the LLM one or two shots at generating a somewhat passable first attempt (including all or at least most of the boilerplate) and then strictly fix up stuff yourself. I recently spent a day attempting to build a relatively simple GUI for a project which _maybe_ contains a couple of days of programming work. It got the gist of the GUI basically in one. And the next two or three prompts then added the buttons I wanted. Most of it even worked.
But after that we ran into a kind of loop, where you put my feelings into much better words than I could. If I had stopped after iteration 3, I probably would have finished what I wanted to do in half a day
Now try to mute a video on youtube and understand what's being said from the automatic subtitles.
If you do it in english, be aware that it's the best performing language and all others are even worse.
For some reason, YouTube is not using a very good STT system now. The lack of sentence punctuation is particularly annoying. Transcriptions by Whisper and Gemini 1.5 Pro are much better. From a couple of weeks ago:
https://news.ycombinator.com/item?id=41199567#41201773
I expect that YouTube will up their transcription game soon, too.
I've tried whisper too. I made this: https://codeberg.org/ltworf/srtgen
Basically it's kinda useful to put time tags, but I need to manually fix each and every sentence. Sometimes I need to fix the time tags as well.
I just spoke about youtube because it's more popular and easy to test.
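For anyone curious what the Whisper route looks like, here's a rough sketch of the core of such a tool (not the actual srtgen code, just the general idea, assuming the open-source whisper Python package; as noted above, the segment text and time tags still need manual fixing):

    import whisper  # the open-source openai-whisper package

    def to_srt_time(seconds):
        # SRT timestamps look like HH:MM:SS,mmm
        ms = int(round(seconds * 1000))
        h, rem = divmod(ms, 3_600_000)
        m, rem = divmod(rem, 60_000)
        s, ms = divmod(rem, 1000)
        return f"{h:02}:{m:02}:{s:02},{ms:03}"

    def transcribe_to_srt(path):
        model = whisper.load_model("base")
        result = model.transcribe(path)
        blocks = []
        # Each segment carries start/end times and the recognized text.
        for i, seg in enumerate(result["segments"], start=1):
            blocks.append(f"{i}\n{to_srt_time(seg['start'])} --> {to_srt_time(seg['end'])}\n{seg['text'].strip()}\n")
        return "\n".join(blocks)

    print(transcribe_to_srt("video.mp4"))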
Sometimes speech-to-text machine learning models give very good results, however I think the key is that:
1. It's overwhelmingly more useful than the [no text] it was replacing, particularly for the deaf or if you want to search for keywords in a video.
2. When it fails, it tends to do so in ways that trigger human suspicion and oversight.
Those aren't necessarily true of some of the things people are shoehorning LLMs into these days, which is why I'm a lot more pessimistic about that technology.
Just today, I received a note from a gas technician, part handwritten, for the life of me I couldn't make out what he was saying, I asked ChatGPT and it surprisingly understood, rereading the original note I'm very sure it was correct.
“This time is different” in one fundamental, methodological, epistemological way: we test on the training set now.
This has follow-on consequences for a shattering phase transition between “persuasive demo” and “useful product”.
We can now make arbitrarily convincing demos that will crash airplanes (“with no survivors!”) on the first try in production.
This is institutionalized by the market capitalizations of 7 companies being so inflated that if they were priced accurately the US economy would collapse.
There was really only one AI winter, which is a sample size of 1. Saying this time is different is justified based on the exponential improvement over the AI of 10 years back.
See this is exactly what is wrong with “this time it’s different” here. AI has been useful and used for decades (but under a different name because the term was tainted by previous bubbles). Look at the section “AI behind the scenes” here https://en.wikipedia.org/wiki/History_of_artificial_intellig...
There were products and "path to products" too. Once the hype died down nobody wanted them. It is the same this time.
"AGI" is a nonsense term anyway. Humans don't have "general" intelligence either: our intelligence is specialized to our environment.
Humans are the most general intelligence we know about; that is why we call it general intelligence. Because we have made so many intelligences that are specialized for a specific domain, like calculators or chess engines, we need a word for something that is as general as humans, because being able to replace humans is a very important goal.
Yes, humans are the most general intelligence we know about. That doesn't say much about how general it is, just highlights our limitations.
This is a bit like saying Earth isn't big, because there are far larger planets etc. out there. For the average conversation, Earth is "big".
"AGI" means many different things to many different people: to me any AI which is general is an AGI so GPT-3.5 counts; to OpenAI it has to be economically transformative to count; to some commentators here it has to be superhuman to count.
I think that none of the initials are boolean; things can be degrees of artificial, degrees of general, and degrees of intelligent.
I think most would assert that humans count as a "general" intelligence, even if they disagree about most of the other things I've put in this comment.
I have been using ChatGPT as a full-time expert and I can unequivocally tell you that it's a transformative piece of technology. The technology isn't hyped.
I agree, as this is also my personal experience. But I also see the usage of ChatGPT falling fast, from 1.8 billion visitors to 260 million last month [1].
I am probably through some ETF an investor in MS, so I do hope the openai API usage is showing a more stable and upward trend.
[1]: https://explodingtopics.com/blog/chatgpt-users
Well ChatGPT is no longer the top dog and there's quite a bit of competition in the space. Including Llama 3.1 which is free. In general I think most of the moat that OpenAI had has evaporated in the last few months, but also for other LLM companies.
Not sure how they plan on making money in the long term; eventually the investors and shareholders will start asking when they will see the returns on their investment.
It is very nice as "Markov chains on steroids", but people believing that LLMs are anything but a distracting local maximum on the path to AGI are 200% in kool-aid drinking mode.
Kahneman's System 2 thinking is just slow, deliberate thinking. And these models do indeed have this characteristic to an extent, as evidenced by chain-of-thought reasoning.
You can see this in action with multiplication. Much like humans, when asked to guess the answer they'll get it wrong, unless they know the answer from rote-learned multiplication tables; this is System 1 thinking. In many cases, when asked, they can reason further and solve it by breaking it down and working through it step by step, much like a human; this is System 2 thinking.
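A concrete illustration of that decomposition (my own made-up example, not a model transcript): asked for an instant guess at 47 × 38 you'd likely be off, but broken into steps it falls out:

    47 \times 38 = 47 \times 40 - 47 \times 2 = 1880 - 94 = 1786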
In my opinion, it seems nearly everything is there for it to take the next leap in intelligence; it's just a matter of putting it all together.
Agreed. System 2 strategizing may simply be the recursive application of symbolic System 1 tooling. An LLM is entirely capable of reading a problem, determining the immediate facts and tokens (fast and intuitive system 1) and determining the ideal algorithm to resolve them (logical analytical system 2). The execution to do all those steps at once is lacking in current LLMs (debatably - they get better every month) - but any basic architecture breaking things down into component sub-questions clearly works.
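A minimal sketch of the kind of "break it into component sub-questions" loop meant here; ask_llm is a hypothetical stand-in for whatever chat-completion call you use (not a real API), and the prompts are purely illustrative:

    def ask_llm(prompt):
        # Hypothetical stand-in for a chat-completion call.
        raise NotImplementedError

    def answer_with_decomposition(problem):
        # System-2-style pass: elicit sub-questions (planning), answer each one
        # (fast, System-1-style lookups), then synthesize a final answer.
        subs = [q for q in ask_llm(f"List the sub-questions needed to solve: {problem}").splitlines() if q.strip()]
        answers = [ask_llm(f"Answer concisely: {q}") for q in subs]
        worked = "\n".join(f"{q} -> {a}" for q, a in zip(subs, answers))
        return ask_llm(f"Using these intermediate results:\n{worked}\nGive the final answer to: {problem}")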
If you look at AI history, there is often fairly steady progress in a given skill area. For example, chess programs improved steadily in Elo rating, and you could project the future pretty well by drawing a line on a graph. Similarly, large language models seem to be progressing from toddler-like to high-school-student-like (now) to PhD-like, shortly. There are skills AI is still fairly bad at, like the higher-level reasoning you mention, and, in robot form, being able to pop to the shops to get some groceries, but I get the impression those are also improving steadily and it won't be so long.
For readers' edification, would you mind making a strong hypothetical argument for why this time it actually is different, from an expert's perspective?
I have yet to see any reasoned argument for why it is easy to build real AI and why it will come fast.
As you said, AI has been there for decades and stagnated for pretty much the whole time. We've just had a big leap, but nothing says (except BS hype) that we're not in for a long plateau again.
We have "real ai" already.
As for future progress, have you tried just simple interpolation of the progress so far? Human level intelligence is very near. (Though of course artificial intelligence will never exactly match human intelligence: it will be ahead/behind in certain aspects...)
- We don't have a "real AI" at all. Where's Skynet, where's HAL-9000? Where are the cute robotic butlers from the "I, Robot" movie?
- Simple interpolation of the progress is exactly the problem here. Look at the historical graphs of AI funding and tell me with a straight face that we absolutely must use simple interpolation.
- Nope, human-level intelligence is not even close. It remains as nebulous and out of reach as ever. ChatGPT's imitation of intelligent speech falls apart very quickly when you chat with it for more than a few questions.
You shouldn’t use science fiction as your reference point. It’s like saying “where is my flying car?” (Helicopters exist)
Why shouldn't I? Humans developed this intelligence naturally (as far as we know). Many people claim we're intelligent enough to repeat the process with artificial organisms, and guide it, and perfect it. I want to see it.
And btw in the Terminator novelizations it was clearly stated that Skynet was a very good optimization machine but lacked creativity. So it's actually a good benchmark: can we create an intelligent machine that needs no supervision but still has limitations (i.e. it cannot dramatically reformulate its strategy in case it cannot win, which is exactly what happened in the books)?
Just because someone can tell a convincing story, doesn’t mean reality (once technology catches up) will resemble the devices of that story. Science fiction is fiction, and unconstrained by the annoying restrictions of physical reality.
That’s the point of my flying car comparison. We HAVE flying cars: they’re called helicopters. Because as it turns out there is just no physical way to make a vehicle in the form factor of a car fly, except by rotary wing. But people will still say “where’s my flying car?” because they are hung up on reality resembling science fantasy, as you are.
We have AI. We even already have AGI. It just doesn’t resemble the Terminator, because The Terminator is a made up story disconnected from reality.
And this is why, I feel, I can never discuss with the AI fans. They are happy to invent their own fiction while berating popular fiction in the same breath.
No, we really don't have AGI. Feel free to point out some of humanity's pressing problems being trivially solved today with it, please. I'll start: elderly people care, and fully automated logistics.
I’m not an “AI fan.” But anyway.
Artificial. General. Intelligence.
The term, as originally defined, is for programs which are man-made (Artificial), able to efficiently solve problems (Intelligence), including novel problem domains outside those considered in its creation (General). Artificial General Intelligence, or AGI. That’s literally all AGI means, and ChatGPT absolutely fits the bill.
What you describe is ASI, or artificial super intelligence. In the late 90’s, 00’s, and early 10’s, a weird subgroup of AI nerds got it into their head that merely making an AGI (even a primitive one) would cause a self-recursion improvement loop and create ASI in short order. They then started saying “achieve AGI” as a stand in for “emergence of ASI” as the two were intricately linked in their mind.
In reality the whole notion of AGI->ASI auto-FOOM has been experimentally discredited, but the confusion over terminology remains.
Furthermore, the very idea of ASI can’t be taken for granted. A machine that trivially solves humanity’s pressing problems makes nice sci-fi, but there is absolutely no evidence to presume such a machine could actually exist.
You are addressing the wrong person if you think I give two hoots how many more acronyms will the AI area invent to conceal the fact that the only thing they actually achieved is remove a lot of artists from the job market.
I don't care how it's called. We don't have it. I am not "confused over terminology", I want to see results and yet again they don't exist. Let's focus on results.
Sure. Because we actually have this super-intelligence already and we can compare with it, right? Oh wait, no we don't. So what's your point? Some people gave up and proclaimed that it can't be done? Like we haven't seen historical examples of this meaning exactly nothing, hundreds of times already.
Look, we'll never be able to talk about it before you stop confusing industry gate-keepers who learned how to talk to get VC money and obfuscate reality with, you know, the actual reality in front of us. You got duped by the investor talk and by the scientists never wanting to admit their funding might have been misplaced by being given to them, I am afraid.
Finally, nope, again and again, we don't have AGI even if I accept your definition. Show me a bot that can play chess, play StarCraft 2, organize an Amazon warehouse item movements and shipping, and coordinate a flight's touch-down with the same algorithms / core / whatever-you-want-to-call it. Same one, not different ones. One and the same.
No? No AGI then either.
The people in the bronze age could have easily said "there is no evidence we would be able to haul goods while only pressing pedals and rotating a wheel". That's not an argument for anything at all, it's a short-sighted assertion that we might never progress that's only taking the present and the very near future into account. Well, cool, you don't believe it will happen. And? That's not an interesting thing to say.
Other people didn't believe we could go to the Moon. We still did. I wonder how quickly the naysayers hid under the bed after that so nobody could confront them about it. :D
But anyway. I got nothing more to say to people who believe VC talk and are hell-bent on inventing many acronyms to make sure they are never held accountable.
I for one want machines that solve humanity's problems. I know they can exist. I know nearly nobody wants to work on them because everybody is focused on the next quarter's results. All this is visible and well-understood yet people like you seem to think that this super narrow view is the best humanity can achieve.
Well, maybe it's the best you can achieve. I know people who can do more.
You are being unnecessarily aggressive. I won’t be continuing this debate.
That is likely true. We can't agree on basic premises so there's no point pursuing a discussion regardless of the tone.
Inventing a fricken time machine wasn't creative?
I know right? :D That always bothered me as well.
In the novelizations it was written that Skynet could not adapt to humans not running away or not vacating territories after they have been defeated. One of the quotes was: "Apparently it underestimated something that it kept analyzing even now: the human willpower." I've read this as Skynet not being able to adapt against guerilla warfare -- the hit-and-run/hide tactics.
But the TL;DR was that Skynet was basically playing something like StarCraft as if it played against another bot, and ultimately lost because it played against humans. That was the "Skynet was not creative" angle in the novelizations.
This is a complete tangent but:
In Terminator 1 Skynet loses because John Connor taught people how to fight the machines, but John Connor only knows this because Kyle Reese taught Sarah Connor how to fight the machines and she taught John Connor. But Kyle Reese only knows this because he was taught by John Connor, so there's no actual source of the information on how to fight the machines; it's a loop with no beginning or end.
I had a philosophy teacher who said this is evidence of divine intervention to destroy Skynet: essentially God told people, through John Connor, how to win. But a cut scene in Terminator 1 implies Skynet was also created by reverse engineering the chip in the destroyed Terminator, implying there's also no origin of the information on how to create Skynet and it's also an infinite loop.
Yeah, these discussions are fascinating but I'd still think it's not very hard to learn how to blow stuff up and sabotage assembly lines, given enough tries.
So it's not exactly an infinite loop IMO, it's more like that the first iteration was more crude and the machines were difficult to kill but then people learned and passed the information along back in time, eventually forming the infinite loop -- it still had a first step though, it didn't come out of nothing.
To be fair, I’ve talked to a lot of people who cannot consistently perform at the mistral-12b level.
I think we expect AGI to be much smarter than the average joe, and free of occasional stupidity.
What we’ve got is an 85IQ generalist with unreliable savant capabilities, that can also talk to a million people at the same time without getting distracted. I don’t see how that isn’t absolutely a fundamental shift in capability.
It’s just that we expect it to be spectacularly useful. Not like homeless joe, who lives down by the river. Unfortunately, nobody wants a 40 acre call center of homeless joes, but it’s hard to argue that HJ isn’t an intelligent entity.
Obviously LLMs don’t yet have a control and supervision loop that gives them goal directed behaviour, but they also don’t have a drinking problem and debilitating PTSD with a little TBI thrown in from the last war.
It’s not that we aren’t on the cusp of general intelligence, it’s that we have a distorted idea of how useful that should be.
Very shallow assessment. First of all, it's not a generalist at all; it has zero concept of what it's talking about. Secondly, it gets confused easily unless you order it to keep context in memory. And thirdly, it can't perform if it does not regularly swallow petabytes of human text.
I get your optimism but it's uninformed.
I can find you an old-school bot that performs better than uneducated members of marginalized and super poor communities, what is your example even supposed to prove?
What's HJ? If it's not a human then it's extremely easy to argue that it's not an intelligent entity. We don't have intelligent machine entities, we have stochastic parrots, and it's weird to pretend otherwise when the algorithms are well known and it's very visible there's no self-optimization in there. There's no actual learning, only adjusting weights (and this is not what our actual neurons do, btw); there's no motivation or self-drive to continue learning. There's barely anything there; it has been "taught" to combine segments of human speech and somehow that's a huge achievement. Sure.
Nah, we are not on the cusp of AGI at all. We're not even at 1%. Don't know about you, but I have a very clear idea of what AGI would look like and LLMs are nowhere near it. Not even in the same ballpark.
It helps that I am not in the area and I don't feel the need to pat myself on the back that I have managed to achieve the next AI plateau which the area will not soon recover from.
Bookmark this comment and tell me I am wrong in 10 years, I dare you.
What is intelligence? We must have very different definitions!
Nobody knows what intelligence actually is. But asking this philosophical question when your discussion opponent has no clear answer is a very obvious trap and a discussion shut-down, and it does NOT immediately follow that your claim -- "we have AI / AGI" -- becomes automatically true. It does not.
And I am pretty sure my own intelligence goes much farther than regurgitating text that I have no clue about (like ChatGPT does not have symbol logic that links words with objects it can "feel" physically or otherwise).
HJ is Homeless Joe, an inference that a 12b stochastic text generator would not have missed lol. But sure, I'll reflect in 10 years.
TBH I hope I'm wrong, and that there is magic in HJ that makes him special in the universe in a way that GPT26 can never be. But increasingly, I doubt this premise. Not because of the "amazing capabilities of LLMs", which I think are frequently overstated and largely misunderstood, but more because of the dumbfounding shortcomings of intelligent creatures. We keep moving the bar for AGI, and now AGI is assumed to be what any rational accounting would classify as ASI.
Where we are really going to see the bloom of AI is in goal directed systems, and I think those will come naturally with robotics. I predict we are in for a very abrupt 2nd industrial revolution, and you and I will be able to have this discussion either over a 55 gallon barrel of burning trash, or in our robot manicured botanical gardens sometime in the near future lol.
good times, maybe. Interesting times, for sure.
We have found common ground.
Yes, a lot of us utilize defective judgments, myself included, fairly often. My point was that LLMs, for all their praise, can't even reach 10% of an average semi-intelligent organic being.
I don't know who is "we" (and I wish people stopped pretending that "we" are all a homogenous mass) but I've known what an AGI should be ever since I've watched movies about Skynet and HAL-9000. ¯\_(ツ)_/¯
Secondly, it's the so-called "AI practitioners" who constantly move the goal posts (now there's "ASI"? -- you know what, I actually don't want to know) because they're periodically being called out and can't hide the fact that they have nearly nothing again. So what's better than obfuscating that fact by having 100+ acronyms? It's a nice cover and apparently there are still investors who are buying it. I get it, we have to learn to say the right things to get funding.
I agree. Physical feedback is needed if we want an electronic entity to "evolve" similarly to us.
I agree this is 100% inevitable but I don't think it's coming as soon as you say. The LLMs are hopelessly stuck even today and the whole AI area will suffer for it for a while after the bubble bursts... which is the event that I am certain is coming soon.
I don't think they'll use LLMs for customer service.
But it's a building block. And when used well it may be possible to get to zero hallucinations and good accuracy in question answering for limited domains - like the call center.
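To sketch what I mean by "building block" for a limited domain like a call centre: only answer from retrieved, vetted documents, and hand off otherwise. search_kb and ask_llm below are hypothetical placeholders, not real APIs:

    def search_kb(question):
        # Hypothetical lookup into the call centre's vetted knowledge base.
        raise NotImplementedError

    def ask_llm(prompt):
        # Hypothetical stand-in for a chat-completion call.
        raise NotImplementedError

    def answer(question):
        passages = search_kb(question)
        if not passages:
            # No grounding material: escalate to a human instead of guessing.
            return "Let me connect you with an agent."
        context = "\n".join(passages)
        return ask_llm(
            "Answer ONLY from the passages below; if they don't contain the answer, say you don't know.\n\n"
            f"Passages:\n{context}\n\nQuestion: {question}"
        )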
If the current LLMs manage to achieve even only that it would be an enormous win. Alas they still have not.
This is honestly one of the most gpt-2 things I’ve ever read.
Well he gave you a list of credentials of why you should believe him. Isn’t that enough ?
This has to do with ML and numerical computing, how?
Well I was being sarcastic. I really dislike it when people have to first convince you that you should trust them.
Either make a good argument or don’t.
Argument from authority is a pernicious fallacy, and typically effective too. You were right to call it out. I must admit I overlooked the sarcasm, however.
Don't feel too bad; until I read the next response I was in two minds about whether sarcasm was intended or not.
It's bloody hard to tell, sometimes :-/
it can be for some
https://en.m.wikipedia.org/wiki/Poe%27s_law
Human beings can't evaluate the truth of things based only on the argument. Persuasive liars, cons, and incompetents are a well-known phenomenon. For most of human history we misunderstood nature and many other things because we relied on 'good arguments'. Not that we need it, but research shows that human intuition about the truth of something isn't good without expertise.
When I need medical advice, I get it from someone who has convinced me that they have expertise; I don't look for 'good arguments' that persuade me, because I don't know what I'm talking about.
I have expertise in other things. In those fields, I could easily persuade people without it of just about anything. (I don't; I'm not a sociopath.) I imagine anyone with professional expertise who reads this can do the same.
Oh, I was bamboozled.
If people say "believe me" at the end of every second sentence you should doubt them. Not thinking of anyone in particular.
No. Stating credentials proves nothing at all. Even less so on the internet.
edit: oh sorry I didn't get that it was sarcasm
Let’s poll RLHF workers since they actually see the tech the most.
The thing that's different this time is the hardware capacity in TFLOPs and the like passing human brain equivalence.
There's a massive difference between much-worse-than-human AI (a bit meh) and better-than-human AI (changes everything).
It probably won't be easy but the huge value of better than human AI will ensure loads of the best and brightest working on it.
You sound like you don't actually understand anything about LLMs and are buying into the hype. They are not cognizant let alone conscious. They don't understand anything. The tokens could be patterns of colored shapes with no actual meaning, only statistical distributions and nothing about how the LLMs work would change.
I can put your brain in a vat and stimulate your sensory neurons with a statistical distribution with no actual meaning, and nothing about how your brain works would change either.
The LLM and your brain would attempt to interpret meaning with referent from training, and both would be confused at the information-free stimuli. Because during "training" in both cases, the stimuli received from the environment is structured and meaningful.
So what's your point?
By the way, pretty sure a neuroscientist with 20 years of ML experience has a deeper understanding of what "meaning" is than you do. Not to mention, your response reveals a significant ignorance of unresolved philosophical problems (hard problem of consciousness, what even is meaning) which you then use to incorrectly assume a foregone conclusion that whatever consciousness/meaning/reasoning is, LLMs must not have it.
I'm partial to doubting that LLMs as they are now have the magic sauce, but it's more that we don't actually know enough to say otherwise, so why state that we do know?
We can't even say we know our own brains.
Your response is nonsense. We don't know how consciousness arises from matter but we do have significant understandings about knowledge, reasoning, modeling, visualization, object permanence, etc. etc. that are significant parts of how the human mind works. And we know LLMs have none of these.
The point of my colored shape example is that it is an illusion that there is anything resembling a mind inside an LLM. I thought that was obvious enough I didn't need to explain it further than I did.
As far as the original commenter's credentials; there's lots of people who should know better but buy into hype and nonsense.
Go ahead and cite your sources. For every study claiming that LLMs lack these qualities, there are others that support and reinforce the connectionist model of how knowledge is encoded, and with other parallels to the human brain. So... it's inconclusive. It's bizarre why you so strongly insist otherwise when it's clear you are not informed.
And my example with subjecting a human brain through your procedure is to illustrate what a garbage experiment design it is. You wouldn't be able to tell there's a mind inside either. Both LLM and human brain mind would be confused. Both would "continue working" in the same way, trying to interpret meaning from meaningless stimulation.
So you don't have a point to make, got it.
We know LLMs don't have those things prima facie because they fail at all of them constantly. We also know how they work: they are token predictors. That is all they are and all they can do. What can be accomplished with that is pretty cool, but humans love to imagine there is more going on when there isn't. Just like with Eliza.
If you don't understand how your attempt to apply my colored shape analogy to a human brain is nonsensical I am not going to waste my time explaining it to you. I had a point, I made it, and apparently it escaped you.
And if you don't see how the example with the human brain throws a wrench in your analogy, it would explain why you'd think it as nonsensical, as it's exactly of relevance.
Ah there it is, you've betrayed a deep lack of understanding in all relevant disciplines (neuroscience, cognition, information theory) required to even appreciate the many errors you've made here.
You sure understand the subject matter and have nothing possibly to learn. Enjoy.
https://en.wikipedia.org/wiki/Predictive_coding
I'm aware of the theories that LLM maximalists love to point to over and over that tries to make it seem like LLMs are more like human brains than they are. These theories are interesting in the context of actual minds but you far over extend their usefulness and application by trying to apply them to LLMs.
We know as a hard fact that LLMs do not understand anything. They have no capacity to "understand". The constant, intractible failure modes that they continuously exhibit are clear byproducts of this fact. By continuing to cling to the absurd idea that there is more going on than token prediction you make yourself look like the people who kept insisting there was more going on with past generation chat bots even after being shown the source code.
I have understood all along why you attempt to extend my colored shape example to the brain, but your basis for doing so is complete nonsense, because a) we do not have the understanding of the brain needed to do this, and b) it's completely beside the point, because we know that minds do arise from brains. My whole point is that an LLM is an illusion of a mind, an illusion that works because it outputs words, which we are hard-wired to associate with other minds, especially when they seem to "make sense" to us. If instead of words you use something nonsensical, like colored shapes with no underlying meaning, this illusion of a mind goes away and you can see an LLM for what it is.
What an absurd response. Yes, you'd probably cause the human brain to start malfunctioning terribly at the form of consciousness it's well accustomed to managing within its normal physical substrate and environment. You'd be doing that (and thus degrading it badly) because you removed it from that ancient context whose workings we still don't understand well.
Your LLM, on the other hand, has no context in which it shows the level of cognitive capacity, higher-order reasoning, self-direction, and self-awareness that we see humans exhibit daily.
Really? An appeal to authority? Many smart, educated people still fall for utter nonsense and emotional attachment to bad ideas.
Yes, it will be confused as well, and by all outward observable signs will fail to make sense of the stimuli, yet it will be "aware" of its inability to understand, much like a human brain would.
If you doubt that, open a new session and type some random tokens; you will get a reply saying that it's confused.
Any other statement as to "consciousness" verges into the philosophical and unanswerable via empirical means.
And ah, to frame it as an appeal to authority when the topic is precisely the subject of a neuroscientist's study.
Sounds like you know a thing or two about nonsense and emotional attachment to bad ideas.
You persist in talking nonsense.
There is no empirical evidence of any awareness whatsoever in any LLM, at all. Even their most immersed creators don't make such a claim. An LLM itself saying anything about awareness doesn't mean a thing; it's literally designed to mimic in exactly that way. And you speak of discussions of consciousness being philosophical and unanswerable?
At least when talking about human awareness, we apply these ideas to minds that we as humans perceive, from our own experience (flawed as it is), to be aware and self-directed. You're applying the same notion to something that shows no evidence of awareness, while criticizing assumptions of consciousness in a human brain?
Such a sloppy argument indeed does make appeals to authority necessary I suppose.
It seems you've lost the train of your own argument.
You claim LLMs have no context at all in which they show a similar level of cognitive capacity.
Yet that claim is contradicted by the fact that an LLM can indeed evince this, much as a human brain would: by attesting to its own confusion. That is ostensibly empirical and evidential to a nonzero degree.
Thus your claim is too strong and therefore quite simply wrong. Claim mimicry? Then prove that human consciousness does not derive from some form of mimicry. You can't. In fact, the free energy principle, a leading neuroscientific theory of brain function, argues the opposite: that prediction and mimicry encompass the entirety of what brains actually do. https://en.wikipedia.org/wiki/Predictive_coding
And no such claim was made--"awareness" was quoted for a reason.
Yes, because that was the parent's claim: "They are not cognizant let alone conscious."
And talking about sloppy arguments--it may well turn out that something can be "designed to mimic" yet still be conscious. I'll leave it to you to puzzle out how on earth that might be possible. The exercise might help you form less sloppy arguments in the future.
No. But I suppose you lost the plot a few inferential steps prior to this, so your confusion is not surprising.
Protip: instead of dismissing everything that goes against your sensibilities as nonsense, perhaps entertain the possibility that you might just not be as well informed as you thought.
When Weizenbaum demonstrated Eliza to his colleagues, some thought there was an intelligent consciousness at the heart of it. A few even continued to believe this after they were shown the source code, which they were able to read and understand. Human consciousness is full of biases, and the most advanced AI cannot reliably determine which of two floats is bigger or even solve really simple logic puzzles aimed at little kids. But I can see how these things mesmerize true believers.
At this point bringing up the ELIZA argument is basically bad faith gaslighting…
Finding bugs in some models doesn't mean you have a point about intelligence. If a similar argument could be used to dismiss human intelligence, then you don't have a point. And here it goes: the most advanced human intelligence can't reliably multiply large numbers or recall digits of Pi. Obviously humans are dumber than pocket calculators.
Your counterargument is invalid. The most advanced human intelligence invented (or discovered) concepts like multiplication, pi, etc., and created tools to work around the ways in which these concepts aren't well handled by their biological substrate. When machine intelligences start inventing biological tools to overcome the limits of their silicon existence, you'll have a point.
Designing biological tools is not a commonly accepted bar for AGI.
Isn't the comment you are responding to an example of: "When machine intelligences start inventing biological tools to overcome the limits of their silicon existence, you'll have a point"?
Especially if you remember that the change needed for the first "breakthrough" (GPT4) was RLHF. That is, a model that was specifically trained to mesmerize.
Some of the most advanced AI systems are tool users and can both write and, crucially, execute Python, embedding the output in their responses.
As given in a recent discussion: https://chatgpt.com/share/ee013797-a55c-4685-8f2b-87f1b455b4...
(Custom instructions, in case you're surprised by the opening of the response).
While it is true that LLMs lack agency and have many weaknesses, they supply a critical piece that machine learning had lacked until transformers became all the rage.
The things that LLMs are bad at are largely solved problems using much simpler technology. There is no reason that LLMs have to be the only component in an intelligent agent. Biological brains have specialized structures for specialized tasks like arithmetic. The solution is probably to integrate LLMs as one part of a composite system that includes database storage, a code execution environment, and multiple agents forming a goal-directed posit-evaluate loop.
I’ve had pretty remarkable success with this architecture running on 12b models and I’m a nobody with no resources.
LLMs by themselves just come up with the first thing that crosses their "mind". It shouldn't be surprising that the very first unfiltered guess at a solution might be suboptimal.
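To make the shape of that loop concrete, here is a minimal Python sketch. Everything in it is hypothetical: llm() is a stand-in for whatever local model you call, the prompts are invented, and the evaluator could just as easily be a unit test or a sandboxed code run instead of a second model pass. It's an illustration of the pattern, not anyone's actual system.

    # Hypothetical posit-evaluate loop wrapped around an LLM.
    # llm() is a placeholder; wire it to your own inference endpoint before use.

    def llm(prompt: str) -> str:
        raise NotImplementedError("connect this to a model of your choice")

    def evaluate(goal: str, candidate: str) -> bool:
        # The judge could be a unit test, a code-execution check, or another model.
        verdict = llm(f"Goal: {goal}\nCandidate: {candidate}\n"
                      "Does the candidate satisfy the goal? Answer yes or no.")
        return verdict.strip().lower().startswith("yes")

    def solve(goal: str, max_rounds: int = 5) -> str:
        candidate = llm(f"Goal: {goal}\nPropose a solution.")     # posit
        for _ in range(max_rounds):
            if evaluate(goal, candidate):                         # evaluate
                return candidate
            critique = llm(f"Goal: {goal}\nAttempt: {candidate}\n"
                           "List the flaws in this attempt.")
            candidate = llm(f"Goal: {goal}\nPrevious attempt: {candidate}\n"
                            f"Critique: {critique}\nPropose an improved solution.")
        return candidate

The point is only that the model's first unfiltered guess is never the final answer: each posit gets checked and revised before it leaves the loop.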
There is a vast amount of knowledge embedded in our cultural matrix, and a lot of that is captured in the Common Crawl and other datasets. LLMs are like a search engine for that data, one based on meaning rather than keywords.
On the current state of AI - do you believe it has "intelligence" or is the underlying system a "prediction machine"?
What signs do you see that make you believe that the next level (biological intelligence) is on the horizon?
We are but prediction machines https://www.psy.ox.ac.uk/news/the-brain-is-a-prediction-mach...
Wrong. As per the article - a part of our brain is a prediction machine. A human body is more than the sum of its parts.
What does this mean, precisely? How is a human body (or a plant, insect, or reptile/bird/mammal body) ever "more than" its constituent parts? Wouldn't that violate a conservation law?
https://en.wikipedia.org/wiki/Emergence
Is this a good thing? Because apparently we're supposed to be building god. So it sounds like we're on the wrong track, am I wrong?
If we’ve just copied our feeble abilities, is that supposed to be exciting?
Is god-like intelligence just a prediction machine too?
Well, if we intend "building god" perhaps we're merely making a copy of a copy of God:
The Sixth Day
"Then God said, 'Let Us make man in Our image, after Our likeness, to rule over the fish of the sea and the birds of the air, over the livestock, and over all the earth itself and every creature that crawls upon it.' So God created man in His own image; in the image of God He created him; male and female He created them."
Genesis 1:26-27 Berean Standard Bible
We do predictions, but much more importantly, we are able to create new states. Prediction, in the classical view, assigns probabilities to existing states. What's unique to us and a lot of other biological intelligence is the ability to create new states when needed. This is not implicit in the narrow view of prediction machines.
What's the next big step? What will it do? Why do we need or want it? Surely you have the answer.
This means you are sure we are close to automated driving, engineering and hospitality?
We already have "automated driving" in some sense. Some cities have fully autonomous taxi services that have operated for a year or more, iirc.
Nah. We're still not that close. Think of it this way: you turn on an appliance at home and there's what, a 0.0001% chance it will explode in your face? With automated driving, hospitality, etc., it's still more like a 0.1+% chance that something goes wrong. Huge difference.
I don't really take those taxis as a form of solved automation. It's a nice step though.
Why don't robotaxis count?
Because large, money-rich companies and tightly restricted usage are not a proving ground. Amazon "trialled" stores that were supposedly automated but turned out to rely on humans. Even without the human factor, they weren't proof of successful automation of retail.
We aren't running miles much quicker than 4 minutes, though. The current record is 3:43, set by Hicham El Guerrouj in 1999.
The 1-minute mile must be right around the corner, and when that inevitably gets broken, the 1-second mile will follow swiftly.
In fact, humans will be running at relativistic speeds within this century, risking the total destruction of the Earth if they should ever trip over something and fall down.
Scary stuff. And it's not science fiction: it's based on real, observed trends and scaling laws. Seems impossible? Well, they said the four-minute mile was impossible, too.
While this is true, I think you’re not appreciating the metaphor.
Humankind tried to break the 4-minute mile for hundreds of years - since measuring distance and time became accurate enough to be sure of both, in the mid-18th century at least - and failed.
In May 1954, Roger Bannister managed it. By late June it was done again by a different runner. Within a few decades the record was under 3:45, and today there are runners who have achieved it more than 100 times and nearly 1800 runners who have done it at all.
Impossible for hundreds of years, and then somebody did it, people stopped thinking it was impossible, and they started doing it themselves. That's the metaphor: sometimes barriers are really mental, not real.
I’m not sure that applies here either, but the point is not that progress is continuously exponential, but that once a barrier is conquered, we take on a perspective as if the barrier were never real in the first place. Powered flight went through this. Computing hardware too. It’s not an entirely foolish notion.
For language models specifically, they are trained on data and have historically been improved by increasing the size of the model (by number of parameters) and by the amount and/or quality of training data.
We are basically out of new, non-synthetic text to train models on and it’s extremely hard work to come up with novel architecture that performs well against transformers.
Those are some simple reasons why it will be far more difficult to improve general language models.
There are also papers showing that training models on synthetic data causes “model collapse” and greatly reduces output quality by magnifying errors already present in the model, so it’s not a problem we can easily sidestep.
It's an easy mistake to see something like ChatGPT not exist and then suddenly exist, and assume a major breakthrough happened. But behind the scenes there were roughly 50 years of R&D that led to it; it's not as though there was a sudden breakthrough and now the gates are open.
A general intelligence for CS is like the elixir of life for medicine.
This is not even remotely true.
There is an astronomical amount of data siloed by publishers, professional journals etc. that is yet to be tapped.
OpenAI is making inroads by making deals with these content owners for access to all that juicy data.
You seem to think these models haven't already been trained on pirated versions of this content, for some reason.
Are we really now?
The smart people I've spoken to on the subject seem to agree that the current technology based on LLMs is at the end of the road and that there is no breakthrough in sight.
So what is your take on the next level?
Define breakthrough: there's plenty of room to scale and optimize without any need for a breakthrough (by my definition of breakthrough, anyway). Emergent properties so far have been obtained purely from scaling.
There will be no more progress via scaling. All the available training data has already been exploited.
I don't know anything about neuroscience, but is there anything in the brain even remotely like the transformer architecture? It can do a lot of things, but I don't think that it's capable of emulating human intelligence.
I don't know anything about biology, but is there anything in birds even remotely like the airplane? It can do a lot of things, but I don't think that it's capable of emulating bird flight.
Awesome comment. I've written a piece here about the relationship between AI and human consciousness. Would love some feedback if you're able. Thanks! https://peterholmes.medium.com/the-conscious-computer-af5037...
PS. I'm buying your book right now.
There is no guarantee that we will not get stuck with these probabilistic parrots for 50 more years. Definitely useful, definitely not AI.
And by the way I can copy your post character by character, without hallucinating. So I am definitely better than this crop of "AI" in at least one dimension.
Agreed. This is something I didn't think I'd see in my lifetime, let alone be poised to be able to run locally. The alignment is fortuitous and staggering.
People focused on the products are missing out on the dawn of an epoch. It's a failure of perspective and creativity that's thankfully not universal.
What are you talking about? What autonomy? Try the latest Gemini Pro 1.5 and ask it for a list of ten places to visit in Spain. Then ask it for the Google Maps URLs for those places. It will make up URLs that point to nowhere. This is of zero value for personal or business use. I have dozens of examples of such crappy outcomes from all the "latest", "most powerful" products. AI is smoke and mirrors. It is being sold as a very expensive solution to a non-existent problem, and it is not going to get any better in the future. Some wish AI had someone like Steve Jobs to properly market it, but even Steve Jobs could not make a crappy product sell. The whole premise of AI goes against what generations of users were told: computers always give correct answers, and given the same input parameters they return the same output. By extension, we were also taught that GIGO (Garbage In, Garbage Out) is what to blame when we are not happy with the results computers generate. AI peddlers want us to believe in and pay for VIGO (Value In, Garbage Out), and I'm sorry, but there is no valid business model where such tools are required.
From a neuroscience perspective, current AI has not helped explain much about real brains. It did, however, validate the connectionist model of intelligence and memory, to the point that alternative theories are much less believable nowadays. It is interesting to watch the deep learning field evolve, hoping that at some point it will intersect with brain anatomy.
The potential rewards are so great that you might be overestimating the odds this will come about. Even lottery skeptics might buy a lottery ticket if the prize is a billion dollars.
Perhaps it's confirmation bias?
This looks like cognitive dissonance, and that is addressed by revisiting your assumptions.
No flood-gates have been opened. ChatGPT definitely found uses in a few areas but the number is very far from what many people claimed. A few things are really good and people are using them successfully.
...But that's it. Absolutely nothing even resembling the beginnings of AGI is on the horizon and your assumption that the rate of progress will remain the same -- or even accelerate -- is a very classic mistake of the people who are enthusiasts in their fields.
This is not clear at all. If you know something that nobody else does, please let us know as well.
It's obvious there's potential. It's also obvious it requires at least one other major breakthrough. But no one knows how far away that is.
Why are you throwing in 'consciousness' in a comment regarding mechanical intelligence?
We don’t know what the next big leap will bring and when it will happen. The occurrence of a singular previous big leap cannot serve as any reliable predictor.
The argument that we are going to see massive progress soon is weak in my view. It seems to be:
- we had some big breakthroughs recently
- some AI “godfathers” are “really worried”
Fascinating background - would love to pick your brain on how you see current LLMs/ML comparing to neuroscience. What do you see that's missing still, if anything?
If I had to bet, I would start with:
- Error-correcting specialized architectures for increasing the signal-to-noise ratio (as far as I can tell these are what everyone is racing to build this year, and they should be doable with conventional programming systems wrapping LLMs)
- Improved energy efficiency (yes, human brains are currently much more efficient! But there are also straightforward architecture improvements, in both software and hardware, that look set to save 100x. Specialized ternary ASICs built on 1999-era process tech should be here quite soon, and be a lot more efficient in price and energy.)
- No backward-propagation. (Yes, the brain does seem to do it all with forward propagation only. This is possible and promising in artificial neural networks too, e.g. the Forward-Forward algorithm, but such networks haven't been trained to the same scales as backprop-heavy transformers and likely have lower performance in terms of noise/accuracy. If I'm not mistaken, the brain does have forward-backward loops, but the signals go through separate neurons for each direction rather than reusing one; if so, that's close to backprop by itself, though it probably imposes a tradeoff, since the same signal can't be perfectly reproduced backwards - then again, perhaps the separate specialized neuron enhances it to carry just the most relevant information. I'm mostly ignorant of the neuroscience here but halfway knowledgeable on the ML theory. A toy sketch of the Forward-Forward update follows this list.)
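For what it's worth, here is a toy numpy sketch of the Forward-Forward idea mentioned above, under my own reading of it: each layer is nudged locally to produce high "goodness" (the sum of squared activations) on positive data and low goodness on negative data, with no gradients flowing back through other layers. The layer sizes, threshold, and learning rate are all made up, and this is an illustration rather than a faithful reimplementation of Hinton's proposal.

    import numpy as np

    rng = np.random.default_rng(0)
    W = rng.normal(scale=0.01, size=(784, 256))   # a single layer's weights
    theta, lr = 2.0, 0.03                         # goodness threshold, step size

    def forward(x):
        return np.maximum(x @ W, 0.0)             # ReLU activations

    def goodness(h):
        return (h ** 2).sum(axis=1)               # per-example "goodness"

    def ff_step(x_pos, x_neg):
        # Purely local update: raise goodness on positive data, lower it on negative.
        global W
        for x, sign in ((x_pos, +1.0), (x_neg, -1.0)):
            h = forward(x)
            g = goodness(h)
            # Logistic-loss gradient: sign * sigmoid(-sign * (g - theta)).
            coef = sign / (1.0 + np.exp(sign * (g - theta)))
            grad_h = coef[:, None] * 2.0 * h      # d(objective)/d(activations)
            W += lr * x.T @ (grad_h * (h > 0)) / len(x)

    # Stand-ins for real "positive" data and corrupted "negative" data.
    x_pos = rng.normal(size=(32, 784))
    x_neg = rng.normal(size=(32, 784))
    ff_step(x_pos, x_neg)

Each layer trains itself from its own inputs plus a positive/negative label for the example; nothing like a transformer's end-to-end backward pass is needed, which is part of the appeal as a more brain-plausible learning rule.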
But yes, I completely agree - the flood gates are already open. This is a few architecture quibbles away from an absolute deluge of artificial intelligence that will dwarf (drown?) anything we've known. Good point on decentralized cheap autonomy - the real accomplishment of life. Intelligence, as it appears, is just a fairly generous phenomenon where any autonomous process continually improves its signal-to-noise ratio... many ways to accomplish that one! Looking forward to seeing LLMs powered by ant colonies and slime molds, though I suspect by then there will be far more interesting and terrifying realities unlocked.
I'd like to believe it more than you do. Unfortunately, in spite of these millions of dollars, the progress on LLMs has stalled.
Can you explain what this means? Do you have a degree in neuroscience?