I cancelled my subscription after 2 months because I was spending way too much mental effort going over all of the code vomit fixing all of the mistakes. And it was basically useless when trying to deal with anything non-trivial or anything to do with SQL (even when I frontloaded it with my entire schema).
It was much less effort to just write everything myself because I actually know what I want to write and fixing my own mistakes was easier than fixing the bot’s.
I weep for the juniors that will be absolutely crushed by this garbage.
Good to know, that means I'm still economically useful.
I'm using ChatGPT rather than Copilot, and I'm surprised by how much it can do, but even so I wouldn't call it "good code" — I use it for JavaScript, because while I can (mostly) read JS code, I've spent the last 14 years doing iOS professionally and therefore don't know what's considered best practice in browser-land. Nevertheless, even though (usually) I get working code, I can also spot it producing bad choices and (what seems like) oddities.
Indeed.
You avoid the two usual mistakes I see with current AI, either thinking it's already game over for us or that it's a nothing-burger.
For the latter, I normally have to roll out a quote I can't remember well enough to google, that's something along the lines of "your dog is juggling, filing taxes, and baking a cake, and rather than be impressed it can do any of those things, you're complaining it drops some balls, misses some figures, and the cake recipe leaves a lot to be desired".
It's always really surprising to me that there appear to be these two camps. Though what frustrates me is that if you suggest something in the middle, people usually assume you're in the opposite camp from theirs. It reminds me a lot of politics, and I'm not sure why we're so resistant to nuance when our whole job is typically composed of nuance.
Though I'll point out, I think it is natural to complain about your juggling dog dropping balls or making mistakes on your taxes. That doesn't mean you aren't impressed. I think this response is increasingly common considering these dogs are sold as if they are super-human at these tasks. That's quite disappointing and our satisfaction is generally relative to expectations, not actual utility. If you think something is shit and it turns out to be just okay, you're happy and feel like you got a bargain. If you're expecting something to be great but it turns out to be just okay, you're upset and feel cheated. But these are different than saying the juggling dog is a useless pile of crap and will never be useful. I just want to make that clear, so we avoid my first paragraph.
My theory is the increase in available information is overwhelming everyone's cognitive abilities, and so jumping to reductive conclusions is just a natural defense mechanism, leading to increased polarization and reduced listening/tolerance skills. Certainly software engineers have an above-average tolerance for nuance, but even we have to pick our battles or risk drowning in the flood.
I agree and hold a similar theory. The human mind prefers to put things in boxes. And the simpler the box, the better.
There also seems to be a phenomenon where the more complex the system, the simpler the box must be. I'm kinda surprised by this, as understanding complexities (especially around dynamic environments and future planning) seems to be one of the things that distinguishes humans from other animals. And opposable thumbs. I mean sure, there are other animals with those too, but we stand out, literally.
If the box is simple, then it's easier to cast each individual complexity of the system, as needed, into an interpretation that fits and reinforces the box.
But don't you find it odd that my box is complex and that your box is simple? And that this holds for whoever you and I are?
I buy that one, but I think there's a coupling with metric availability.
Where we have measurements for pretty much everything but people don't know measurements are proxies and not always aligned with goals. Like how a ruler doesn't measure meters, but meters according to the ruler and only at the ruler's precision level. I can totally get how people who don't work with these tools don't understand the nuances, but it confuses me with experts. Isn't expertise, by definition, contingent upon understanding nuance?
It seems that the more metrics we have available to us, the less we care about understanding how those metrics work and what their limitations are. That they just become black boxes.
I mean there seems to be a very common belief that you can measure visual image quality by measuring some difference between an intermediate representation of a classification model. Or a belief that entropy measures quality of speech.
I'm really concerned that Goodhart's Law might be one of the Great Filters.
History says that the things in the middle, usually ones with the clearest computational model beneath them, get a fancy name and we stop calling it AI and we just call it algorithms.
I could see someone making a case for this being that 'middle' group but there's a sour note to this process that I don't know from one week to the next whether I find it sneaky or delightful.
Someone else can make that argument, because I'm so sick of silver bullets and gold rushes that I just. don't. care. Sturgeon's Law applies (90% of everything is crap) and I'll listen just enough to see if anyone is proposing which bits are the 10% to keep a finger on the scale if I think it'll matter. But let everyone else bleed over this, because in another ten years they'll be laughing about how silly they were to think this was going to solve all of our problems or end our profession.
Software is eating most things. If something eats software, your employability will be the least of your existential crises.
I've heard this before, but do we? Which are those things from the past that we "stopped calling AI and just call algorithms"?
I see three categories:
(1) very complex algorithms that we never did call "AI".
(2) stuff we did call AI, and we still do - things like expert systems, or IBM's Watson, or game AI. We knew, and still know, that those weren't AGI.
(3) some stuff promoted as AI (like the "AI assistant" Clippy), but which, marketing materials aside, nobody really considered AI or called them that.
But I don't remember this demoting/relabelling of stuff from AI to "just algorithms". It might have happened with some stuff, but I doubt it was as "classic" a development as it's portrayed.
Offhand, things that used to just be "AI": Expert systems, markov chains, genetic algorithms, data mining, and identification/classification.
People in the field would probably say they fall under the AI umbrella, but it's not the common viewpoint. Either someone can conceptualize how they'd work (expert systems) or they've been watered down and turned commonplace (Markov chains in software keyboards, identification in Facebook images), and either way it disassociates the technology from the "intelligence" part of "artificial intelligence", so it's no longer thought of as part of it.
Also Dynamic Programming, and soon, I predict, Monte Carlo simulation.
Polarisation in general can be well-explained by the business models of social media. You’re shown content that you react to strongest, and that is usually one end of a spectrum of views.
Not only are you shown content that you react to the strongest; people also create content that they react strongly to, continuing the cycle.
A user with a balanced interpretation and somewhat neutral feelings about a topic generally won't feel like they want to add something to a discussion. A user with a strong opinion is more likely to engage with posts of the opposite viewpoint or the same viewpoint.
HN is a bit of an exception because the community is reasonably high quality. But major platforms? The people who bother to write out long and neutral posts learned there is nothing to gain from doing that years ago.
Even here, depending on the felt "hotness" of the topic in the community, you might get a lot of negative sentiment for trying to find a middle ground or daring to look at generalized claims in more detail.
I think, in general, people now seem to require you to signal that you share their identification with a certain thought before an open discussion might become possible.
One important aspect seems to be that the higher your educational level, the more likely you are to be conditioned to identify with your own thoughts. This amplifies polarization on an intellectual level. There is something in it for the individual thinker taking on a new polarized belief. It adds to their identity.
The ultimate catch then is to take the position that I just outlined as an identity contrary to "all" others who are polarized. This is yet another trap.
Hence, the exercise is to practice not getting polarized while being compassionate to those who are. It's just a tendency of the human mind and nobody should be judged for falling for these traps. It's too easy to fall for it given our current conditioning.
I think that's mostly spillover from social media, combined with the very opinionated nature of us nerds :-)
Interesting, usually when I express an opinion in the middle, people agree. I've only talked to people about ChatGPT, AI, etc. in real life though, not on the internet.
I think the real life aspect changes things a lot.
Zealots on either side communicate at 100 times the rate of people who aren't so heavily invested.
What makes a man turn neutral? What makes a man turn neutral? Lust for gold? Power? Or were you just born with a heart full of neutrality?
-- Zapp Brannigan
I think it pays to just point out good use cases. For me, it's a superpowered thesaurus, a more concise and relevant Wikipedia, and an excellent listicle generator.
AI is not all-powerful, but those things alone help me a lot when I'm brainstorming.
You can tell it the code is bad and how, and a lot of the time it will correct it. For some BS code that you have to write, it is a great time saver.
For the code questions that I ask, it is sometimes quite non-trivial to check whether the code is correct or not.
In my tests it never happened that it could correct incorrect code that it had generated. Typically, the bot then generated code that was wrong again, sometimes for a different reason and sometimes for a similar one.
Again I disagree: the common case where you have to write BS code is when the abstraction is wrong. Implementing a proper abstraction that strongly reduces the BS code to write is the way to go.
Writing that abstraction can also feel BS.
The entire industry is BS all the way down
If you get to a capacitor, you've gone too far - you may be in a power supply.
Yes, I've even done that to see how well it works (my experience is that half the time it fixes the code, the other half it claims to but doesn't) — but I can do that because I can recognise that it's bad code.
I don't expect a junior to notice bad code, and therefore I don't expect them to ask for a fix. (I've also spent 6 months in a recent job just fixing code written by work experience students).
That is true. I've actually seen instances of juniors struggling with code that doesn't work and frankly doesn't make sense, but they claim they wrote it :)
The criticism from the second camp stems from the fact that the WHOLE job is to not drop anything.
A fence with a hole is useless even if it's 99% intact.
A lot of human jobs, especially white collar, are about providing reassurance about the correctness of the results. A system that cannot provide that may be worse than useless since it creates noise, false sense of security and information load.
Adding nuance to your fence analogy: most fences are decorative and/or suggestive of a border, but can be overcome somewhat easily, and are hence not useless just because they have a hole somewhere.
The spirit of the analogy holds, as there are plenty of clear alternatives that map verbatim:
- drinking glass that is 99% hole-free
- car that doesn't explode 99% of the time
- bag of candy where 99% of the pieces are not poisonous
In all of these cases, it's more optimal to start from scratch and build something that you know is 100% reliable than to start with whatever already exists and try to fix it after-the-fact.
Personally, I use AI to assist development, especially in unfamiliar stacks, but in the form of a discussion rather than code-vomit. It's primarily synthesizing documentation into more-specific whole answers and providing options and suggestions.
I beg to differ. True, many fences are easily overcome by adult humans, but most fences are not designed with adult humans in mind. Most fences are intended to keep animals in or out of an area. In more urban areas that may extend to small children. In high security areas, fencing may be just one layer of security, but it is certainly more than just "suggestive". In any of these cases, the fence is useless if it has a gap that can be exploited.
wanting this is probably the worst possible use case for LLM code vomit
I feel like if I ever try something less trivial than generating a wedding speech in the style of HP Lovecraft some AI evangelist tells me I've chosen the wrong use case for an LLM.
That is because you have already found the ideal use case for one.
For a sense of scale, the last time my JS knowledge was close to the best practices of the day was 1-1.5 years before the release of jQuery.
I was trying to say that ChatGPT is good in relative terms, not absolute ones.
Something can be very impressive without actually being useful, but that still doesn’t make it useful. There’s no market for a working dog that does a bad job of baking cakes and filing taxes, while dogs that can retrieve game birds or tackle fleeing suspects are in high demand.
I'm certainly impressed by a man that can dig a tunnel with a spoon. But you're definitely right that it's not that useful.
But I disagree that half assed work is not useful. It's just lower usefulness. My laundry app isn't even half assed. The programmers couldn't even sort the room list (literal random order) or cache your most recent room. It's still better than the BS system they had before where I had to load a prepaid card and that machine was locked in a building that isn't open on weekends or after 6pm. I'm still immensely frustrated, but I don't want to go back to the old system.
Though I'll mention that my fear is that because so many people see LLMs as having far more utility than they offer, we'll get more shit like the above instead of higher quality stuff. Most issues are solved for me to be comfortable in my life, so I definitely value quality a lot more. Plus, reduces a lot of mental stress as I'm not thinking "how can the person that made this be so dumb? How do you learn how to program and not know what sort is?"
This is the crux of most of modern AI. Nobody debates it's cool. Since 2014 or so there has been no shortage of amazing demos of computers doing stuff many thought wasn't possible or required human level intelligence. But there's not automatically a bridge from those demos to commercial relevance, no matter how amazing.
As a senior frontend/javascript guy, I’m afraid that relying on ChatGPT/copilot for _current best practices_ is probably where it works the worst.
Oftentimes it will produce code that’s outdated. Or, it will output code that seems great, unless you have an advanced understanding of the browser APIs and behaviors or you thoroughly test it and realize it doesn’t work as you hoped.
But it’s pretty good at getting a jumpstart on things. Refining down to best practices is where the engineer comes in, which is what makes it so dicey in the hands of a jr dev.
This matches my experience. When ChatGPT started going viral, I started getting a lot of PRs from juniors who were trying it out. Pretty much every single one was using deprecated API calls or best practices from 5-10 years ago. I'd ask why they chose to use an API that is scheduled to be removed in the next release of whatever library or system we are using.
ChatGPT does have its place. But you need to understand the tools you're using. It can be great for a first spike or for just getting something working. But then you have to go and look at what it's doing and make sure you understand it.
>For the latter, I normally have to roll out a quote I can't remember well enough to google, that's something along the lines of "your dog is juggling, filing taxes, and baking a cake, and rather than be impressed it can do any of those things, you're complaining it drops some balls, misses some figures, and the cake recipe leaves a lot to be desired".
Not the quote, but there was a Far Side cartoon along those lines where the dog was being berated for not doing a very good job mowing the lawn:
https://i.pinimg.com/originals/22/22/79/222279ceaa98f293e76e...
Oh no, a really sad Far Side cartoon! Which is very closely related to a shaggy dog joke you can spin out for ages, "$5 talking dog for sale", which ends with the setup / punchline, "why so cheap?" / "because he's a goddamn liar!"
I myself also don't know what's considered best practice in JavaScript generally (browser or server-side), even though I also have to write it occasionally -- but I wouldn't feel safe trusting that ChatGPT suggestions were likely to model current best practices either.
On what are you basing your thinking that ChatGPT is more likely than not to be suggesting best practices? (Real question, I'm curious!)
If there are patterns that are good, idiomatic, and mostly repeatable, we should be putting those into the standard library, not an AI tool.
What we have right now is a system to collect information about the sorts of problems developers want existing code to solve for them. We should embrace it.
Right now, for me, it’s fancy Auto Complete at best.
I get a Free subscription to it by using my kids EDU email accounts. Which is handy :)
But I absolutely would not pay for it.
I recall the last time I tried using the chat feature to do something, the code it produced wasn’t very useful and it referenced chapters from a book for further information.
It was clearly just regurgitating code from a book on the subject, and that just feels wrong to me.
At least give credit to the Authors and reference the book so I can go read the suggested chapters LOL
The value I found in my short trial of GPT-3 was as a bidirectional path finder.
Don't have me read 20 pages of docs just to integrate into a browser or a framework.. cutting out the legwork, essentially, so I can keep my motivation and inspiration going.
A lot of the latter is caused by the former. It is a nothing burger compared to the shocking amount of hysteria on HN about AI putting programmers out of a job. You'd expect a programmer to know what his job is, but alas, apparently even programmers think of themselves as glorified typewriters.
Oh man this is the opposite of my experience!
Copilot has replaced almost all of the annoying tedious stuff, especially stuff like writing (simple) SQL queries.
“Parse this json and put the fields into the database where they belong” is a fantastic use case for copilot writing SQL.
(Yes I’m sure there’s an ORM plugin or some middleware I could write, but in an MVP, or a mock-up, that’s too much pre optimization)
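For what it's worth, the kind of glue code meant here is roughly this. A minimal sketch, assuming a hypothetical payload shape and users table, with sqlite3 standing in for whatever database the MVP actually uses:

```python
import json
import sqlite3

payload = '{"id": 7, "name": "Ada", "email": "ada@example.com"}'

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")

record = json.loads(payload)
# Named parameters keep the field-to-column mapping explicit and avoid
# string-concatenated SQL.
conn.execute(
    "INSERT INTO users (id, name, email) VALUES (:id, :name, :email)",
    record,
)
conn.commit()
print(conn.execute("SELECT * FROM users").fetchall())
```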
An ORM is not too much pre optimisation…
Yeah, MVPs and mockups are exactly what ORMs are for, since they get you the first 80% for very little effort. Maybe it depends on the language, but this is definitely the way Rails and Laravel style framework ORMs are designed.
To me SQL is better for MVPs. It is transparent and clear, easy to debug, and can easily do whatever joins and filtering you want.
An ORM is more for the future, because it abstracts away the database implementation, so you could in theory change the db or change some behaviour more simply.
E.g. if you add something like a deleted_at column, an ORM can have a single place where you configure the system to use it, but if you have raw SQL queries lying around you may need to find all the spots where that needs adding to your WHERE clauses.
But otherwise SQL is easier to work with, in my view.
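To make the deleted_at point concrete, here's a minimal sketch of the "single place to configure it" idea, using SQLAlchemy as one example ORM (the Room model and column names are hypothetical); with raw SQL, the equivalent `deleted_at IS NULL` clause has to be repeated in every query by hand:

```python
from datetime import datetime

from sqlalchemy import Column, DateTime, Integer, String, create_engine, event
from sqlalchemy.orm import Session, declarative_base, with_loader_criteria

Base = declarative_base()

class Room(Base):
    __tablename__ = "rooms"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    deleted_at = Column(DateTime, nullable=True)  # soft-delete marker

@event.listens_for(Session, "do_orm_execute")
def _skip_soft_deleted(execute_state):
    # One hook covers every ORM SELECT; with raw SQL the equivalent
    # "deleted_at IS NULL" clause must be repeated in each query by hand.
    if execute_state.is_select:
        execute_state.statement = execute_state.statement.options(
            with_loader_criteria(
                Room, Room.deleted_at.is_(None), include_aliases=True
            )
        )

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
with Session(engine) as session:
    session.add_all([Room(name="A"), Room(name="B", deleted_at=datetime.now())])
    session.commit()
    print([room.name for room in session.query(Room).all()])  # ['A']
```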
What ORMs/languages do you usually work with? Dynamic language ORMs usually let you very easily add filters, work with relationships, etc.
I started out with Laravel 10 years ago, but I've been in the NodeJS ecosystem for the last 5 years, so I've been working with different things, but lately mostly with Supabase, which has its own REST API abstraction, though now I've been starting to prefer just having a direct bare postgres client against it. This is for my own side projects/MVPs.
Yeah, I've found both Eloquent (Laravel's ORM) and Entity Framework to be more than flexible enough and to produce surprisingly good SQL. You do need to keep an eye on the queries it produces though, because innocent changes can make a huge difference in performance.
ORMs can be annoying to use in my experience, especially with more complex queries, joins, counts, filtering by joins etc. And they make debugging harder usually.
Since I found out that ChatGPT does a really nice job of "Show me what ActiveRecord query would generate this SQL query", and vice versa, it's gotten quite a bit easier. It handles some surprisingly complex stuff.
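For a flavour of the kind of translation meant, here is a comparable SQL-to-ORM pairing, sketched with SQLAlchemy rather than ActiveRecord, and with hypothetical tables:

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, func, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String)

class Order(Base):
    __tablename__ = "orders"
    id = Column(Integer, primary_key=True)
    user_id = Column(Integer, ForeignKey("users.id"))

# Target SQL:
#   SELECT users.id, COUNT(orders.id) AS order_count
#   FROM users LEFT OUTER JOIN orders ON orders.user_id = users.id
#   GROUP BY users.id
stmt = (
    select(User.id, func.count(Order.id).label("order_count"))
    .join(Order, Order.user_id == User.id, isouter=True)
    .group_by(User.id)
)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
with Session(engine) as session:
    print(stmt)                          # the SQL the ORM expression compiles to
    print(session.execute(stmt).all())   # [] on the empty in-memory database
```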
There is no avoiding object-relational mapping, though, unless you intend to carry relations (sets of tuples) from start to finish. Something you almost certainly cannot do if you interface with any third-party code – which will expect language-native structures.
It's just a question of whether you want to use a toolkit to help the transformations, or if you want to 'do it by hand'.
I don't know, I don't think I've ever come across something I wasn't able to express in SQLAlchemy... And I prefer all the goodies that come with it to writing raw SQL.
Really? I've been doing web dev as a hobby for 20 years and professionally for 6 or 7 years. It's super helpful for me given how much boilerplate there is to write and how much of the training set is for websites. Any time I try to get it to write non-trivial SQL or TypeScript types it fails hard. But it's still a nice time saver for writing tests, request handling, React boilerplate, etc.
This is the problem.
As programmers we should be focusing effort on reducing boilerplate, so that it’s never needed again. Instead, we’ve created a boilerplate generator.
Nope. The wrong abstraction is too risky vs. having no abstraction. AI can let us move towards an abstraction-free mode of coding. In the future you’ll tell the AI “add a 60 second cache to every source of X data” and it’ll scour your source code for matching code and you’ll review the diffs. C programmers would review LLVM IR diffs. Web devs will review whatever compilation target they’re using.
We use high level languages because they improve reading comprehension and save us time when writing. Having a copilot take on a major role allows us to fundamentally rethink programming.
Which completely contradicts your earlier point.
Why not just get copilot write assembly for you? Or, spit out raw machine code? Oh, that’s right, because you need to check that it hasn’t fucked something up. Which, when there’s a ton of boilerplate, is hard.
It’s arguable that programming language evolution stopped around the time Java was released (barring a few sprinkles here and there, like async/await, affine/linear types, etc.)
We haven’t had a major leap in language power for decades (not like the leap from assembly to procedural languages) - I believe it’s because Java popularised evolution through libraries - and for a long time that was fine, even if it did lead to language evolution stagnating.
But now we’ve hit a complexity threshold that demands an abstraction leap, but instead of looking for that abstraction leap we’re getting a computer to generate boilerplate for us, hoping it will dig us out of the complexity hole.
We’re still ways off having a computer maintain a complex code base over many years. So humans still have to do that. It’d be much easier if we remove the incidental complexity.
“The purpose of abstraction is not to be vague, but to create a new semantic level in which one can be absolutely precise.”
— Edsger W. Dijkstra
I call bullshit. Haskell, Rust, and Zig (and others) are revolutionary. Also: Rust and Zig are facilitated by LLVM.
Haskell is over 30 years old and I don't know if it really was a major leap over SML.
Rust is basically an ML with a borrow checker, it's cool but not really a major leap in power over C++ (maybe a leap in freedom from bugs).
What does Zig do that's so revolutionary? Custom allocators are nothing new. And compile-time evaluation has been around so long some of us forgot we had it. Also Zig is working to step off LLVM.
I find it unlikely that we would be asking the ai for a very specific caching design. If the optimists are right, the ai should be able to do its own analysis and automatically design an architecture that is superior to what a human would be able to do.
A job spent mostly reviewing ai generated diffs sounds like a level of hell beyond even Dante's imagination.
The challenge there is that reducing boilerplate comes with opinionated magic, which rubs a lot of people the wrong way. So the creators add configs and code level config, and pretty soon you're back in boilerplate hell.
I'm starting to lean in the direction that trading some boilerplate for flexibility is a good deal.
As with most things in life, moderation is key.
I find co-pilot primarily useful as an auto-complete tool to save keystrokes when writing predictable context driven code.
Writing an enum class in one window? Co-pilot can use that context to auto complete usage in other windows. Writing a unit test suite? Co-pilot can scaffold your next test case for you with a simple tab keystroke.
Especially in the case of dynamic languages, co-pilot nicely complements your IntelliSense.
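As a toy illustration of the kind of context-driven completion meant here (the enum and the lookup function are hypothetical): once the enum exists somewhere in the project, the dispatch code elsewhere is exactly the sort of thing Copilot tends to fill in from a tab keystroke.

```python
from enum import Enum

class OrderStatus(Enum):
    PENDING = "pending"
    SHIPPED = "shipped"
    CANCELLED = "cancelled"

# With the enum above in context, a mapping like this is the kind of
# predictable, context-driven code Copilot scaffolds well.
def describe(status: OrderStatus) -> str:
    return {
        OrderStatus.PENDING: "Order received, not yet shipped",
        OrderStatus.SHIPPED: "Order is on its way",
        OrderStatus.CANCELLED: "Order was cancelled",
    }[status]

print(describe(OrderStatus.SHIPPED))
```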
Yep, treat copilot like a really good autocomplete system and it's wonderful. Saves a lot of time and typing, even in languages that aren't known for having a lot of boilerplate.
Ask copilot to solve actual problems (y'know, the kind of stuff you are presumably paid to solve), and it falls completely flat.
I've found its autocomplete really bad. It often hallucinates import paths and other illegal operations. It's not clear to me why it can't leverage the LSP better.
It's also nice for solving small things like 'Euclidean distance'.
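i.e. the sort of small, well-specified utility it reliably gets right (a throwaway sketch):

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

print(euclidean_distance((0, 0), (3, 4)))  # 5.0
```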
My experience is similar. This week I created a few DBT models and Co-Pilot saved a *ton* of keystrokes on the YML portion.
It needed some hand-holding in the early parts, but it was so satisfying to tab autocomplete entire blocks of descriptions and tests once it picked up context with my preferences.
This is the problem with using AI for generating SQL statements; it doesn't know the semantics of your database schema. If you are still open to a solution, I recently deployed one[1] that combines AI, your db schema, and a simple way to train the AI on your database schema.
Essentially, you "like" correct (or manually corrected) generations, and a vectorized version is stored and used in future similar generations. An example could be telling it which table or foreign key is preferred for a specific query, or that it should wrap columns in quotes.
From my preliminary tests it works well. I was able to consistently make it use the correct tables, foreign keys, and quotes on table/column names for case-sensitivity using only a couple of trainings. Will open a public API for that soon too.
[1]: https://www.sqlai.ai/
It's "a public API", not "an public API", because of the consonant rule.
I really worry that there are people out there who will anxiously mangle their company's data thinking that what is being called AI (which doesn't exist yet) will save the day.
Use it as a tool, not a replacement. However, it does do things well even without much additional information, like fixing an SQL statement[1]. That being said, it is consistently improving; GPT-3.5 to GPT-4 was a major upgrade.
[1]: https://www.sqlai.ai/snippets/clroq0qn9001xqzqeidtm4jgx
Just copy-paste the db/table schema in a comment at the start of your file. Nothing else needed.
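A sketch of what that looks like in practice; the orders table and the query are hypothetical, and the point is just that Copilot picks up the schema comment as context for its SQL suggestions:

```python
# -- Hypothetical schema pasted as context for Copilot --
# CREATE TABLE orders (
#     id          INTEGER PRIMARY KEY,
#     user_id     INTEGER NOT NULL REFERENCES users(id),
#     total_cents INTEGER NOT NULL,
#     created_at  TIMESTAMP NOT NULL
# );

import sqlite3

def monthly_revenue(conn: sqlite3.Connection):
    # With the schema comment above in the file, a completion along these
    # lines is what Copilot tends to suggest after the function signature.
    return conn.execute(
        """
        SELECT strftime('%Y-%m', created_at) AS month,
               SUM(total_cents) / 100.0      AS revenue
        FROM orders
        GROUP BY month
        ORDER BY month
        """
    ).fetchall()
```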
If you used copilot in the beginning (and I think still with some plans), it was only GPT 3.5.
Likely you’d get much better results with GPT-4.
The difference in output code quality in 3.5 vs 4 is staggering - just with using regular old ChatGPT.
I feel like OpenAI is really shooting themselves in the foot by not having some sort of trial of 4. The difference in quality of everything is huge and most people don’t know it.
This is the real danger of this sort of thing: when your Copilot or whatever is good enough that it replaces what is vastly superior, for purely economic reasons.
I wrote about this trend applied to the unfortunately inevitable doom of the voice acting industry in favour of text-to-speech models a couple of months ago, using my favourite examples of typesetting, book binding and music engraving: https://news.ycombinator.com/item?id=38491203.
But when it’s development itself that gets hollowed out like this, I’m not sure what the end state is, because it’s the developers who led past instances of supplanting. Some form of societal decline and fall doesn’t feel implausible. (That sentence really warrants expansion into multiple paragraphs, but I’m not going to. It’s a big topic.)
Yes, like desktop publishing compared to traditional printing. Its output is comparatively shitty but average people will settle for it on account of its democratization. My context was feature factory web dev versus low / no code rather than autocomplete versus solo author.
By democratization, I mean that it enables one-one where previously there was only one-many: instead of the inversion of experience where the same unique app experience is shared by millions, a technology allows the interface to be tailored for an audience of one or dozens: missing toes and fingers, color blindness, particularly difficult and unique operating conditions, etc. Given those unique constraints the mediocrity provides at least some preferable solution. The downside of this is that it sets the floor a lot lower, and people who would never have even tried or contemplated trying typesetting will dabble with desktop publishing to achieve their ends.
Somebody on here gifted me with the word "procrustean" and I've taken it and put it in my Minsky fish tank. There are many reasons to eschew the trusted-experts model: somebody who has made heads for pins for twenty years is incontrovertibly an expert, but who cares? Our uncanny valley appears to be only a local minimum.
[PS, nobody understood my point but I thank them for the honest feedback.]
On your P.S.: I contemplated your point being “I can cope with this stuff and know when to correct it or give up on it, but what chance does a beginner have?” which I think is what you actually meant, but ended up deciding to write what I wrote anyway, because it ends up feeling fairly connected to me, though from a slightly different angle.
I’m in India at present, and a number of 18–25 year-olds ask me about learning to code (commonly because their college or University is teaching C/C++ and they have no idea). Somehow, they have more access to a computer than ever before, because most of them carry one on their person all day, and more information about this task than ever before, yet their computer, a phone, has been dumbed-down and locked-up in such a way that they can’t really use it to learn to code, because that’s something you do on Computers, and those are just on the campus or your laptop, Chris, or things like that.
Easier for the simple tasks, so that the more complex tasks, the ones enough people in a previous generation used to work their way up to, just don't get reached, leaving over time perhaps a chasm, and fewer really skilled people in a society that may depend on them more than previously. The lower floor helps some who wouldn't have got started before, but discourages others by making things too easy, so they never find the challenge they seek.
I'm not using Copilot to write code, I use it for autocomplete. For that it's great.
Copilot probably saved my career. I was getting occasional wrist pain due to keyboard overuse, but the autocomplete is so good that my keyboard use is a tiny fraction of what it was previously. Wrist pain gone.
Using Copilot is a skill though, you have to live with it and learn its limits and idiosyncrasies to get the most out of it.
100% agree. It usually autocompletes better than the language's LSP since it takes a lot of context into consideration. It's a godsend for repetitive tasks or trivial stuff.
I've been using Bing's GPT-4 to learn Fortran for about a week. Well, I'm mainly reading a book which is project-oriented so I'm often researching topics which aren't covered in enough detail or covered later. I think this mixed-mode approach is great for learning. Since Fortran's documentation is sparse and Google's results are garbage the output of GPT4 helps cut through a lot of the cruft. Half the time it teaches me, the rest I'm teaching it and correcting its mistakes. I certainly won't trust it for anything half complicated but I think it does a good job linking to trustworthy supporting sources, which is how I learned how to cast assumed-rank arrays using c_loc/c_f_pointer and sequence-association using (*) or (1). It's great for learning new concepts and I imagine it would be great for pair-coding in which it suggests oversights to your code. However I can't imagine depending on it to generate anything from scratch. What's surprising is how little help the compiler is - about as bad a resource as SEO junk. I'm used to "-Wall -Werror" from C, but so many gfortran warnings are incorrect.
Please be sure to report poor warnings to the gfortran developers -- it's generally a great compiler, and you can help keep it great.
A problem with Fortran compiler error and warning messages is that Fortran is largely a legacy language at this point, and most Fortran code hitting the compilers has already had its errors shaken out. New code, and especially new code from new Fortran users, is somewhat more rare -- so those error and warning checks are a part of the compiler that doesn't get as much exercise as one would like.
This aligns with my observations. I don't use Copilot etc. but the other devs on my small team do. I've observed that I'm generally a faster and more confident typist and coder - not knocking their skills, I'm just more experienced, and also spent my teens reading and writing a lot.
I've seen that it helps them in cases where they're less certain what they're doing, but also when they know what they're doing and it's quicker about it.
For me I know what I want to write and it seems that Copilot also knows what I want to write, so as an auto complete it just works out for me. Most of the time code I want to write is in my head, I just need to be able to quickly vomit it out and typing speed is the bottleneck.
I am also able to intuitively predict that it is going to vomit out exactly what I want.
E.g. I know ahead of time what the 10 lines it will give me are.
You cancelled the tool because you didn’t know how to use it?
Depends on how you use it.
I use a similar VS Code assistant, but only for shorter code. I am able to complete code faster than an instructor on video.
As others said, I use Copilot or similar for scaffolding/boilerplate code. But you are right, reading almost-correct code that you are unfamiliar with is much more demanding than fixing stuff I got wrong to begin with.
it's crazy how amazing it is sometimes for certain things, including comments (and including my somewhat pithy style of comment prose), and how incredible and thorough the wastes of time when it autosuggests some function that is plausible, defined, but yet has some weird semantics such that it subtly ruins my life until i audit every line of code i've written (well, "written", i suppose). i finally disabled it, but doing so _did_ make me kind of sad-- it is nice for the easy things to be made 25x easier, but, for programming, not at the expense of making the hard stuff 5x harder (note i didn't say 10x, nor 100x. 5x). it's not that it's that far away from being truly transformative, it's just that the edge cases are really, really rough because for it to be truly useful you have to trust it pretty much completely & implicitly, and i've just gotten snakebitten in the most devious ways a handful of times. an absolute monster of / victim of the pareto principle, except it makes the "90%" stuff 1.5x easier and the "10%" stuff 5x harder (yes, i know i haven't been using my "k%"s rigorously) (and for those keeping score at home, that adds up to making work 10% harder net, which i'd say is about right in my personal experience).
highlights: "the ai" and i collaboratively came up with a new programming language involving defining a new tag type in YAML that lets one copy/paste from other (named) fragments of the same document (as in: `!ref /path/to/thing to copy`) (the turing completeness comes from self-referential / self-semi-overlapping references (e.g. "!ref /name/array[0:10]`) where one of the elements thus referred-to is, itself, a "!ref" to said array).
lowlights: as already alluded to, using very plausible, semi-deprecated API functions that either don't do what you think they do, or simply don't work the way one would think they do. this problem is magnified by googling for said API functions only to find cached / old versions of API docs from a century ago that further convince you that things are ok. nowadays, every time i get any google result for a doc page i do a little ritual to ensure they are for the most recent version of the library, because it is absolutely insane how many times i've been bitten by this, and how hard.
I think there is a sweet spot where if you're junior on the cusp of intermediate it can help you because you know enough to reject the nonsense, but it can point you in the right direction. Similar to if you need to implement a small feature in a language you don't know, but basically know what needs to get done.
I've definitely seen juniors just keep refining the garbage until it manages to pass a build and then try to merge it, though, and using it that way just sort of makes you a worse programmer because you don't learn anything and it just makes you more dependent on the bot. Companies without good code reviews are just going to pile this garbage on top of garbage.
I have a similar sentiment, and looking at how mixed the takes are, I think it depends on what you do. I write a lot of research code, so I think it's unsurprising that GPT isn't too good here. But people I see who write code more in the "copy and paste from Stack Overflow" style get huge utility out of this. (This isn't a dis on that style; lots of work is repetitive and redundant.)
So I changed how I use GPT (which I do through the API; much cheaper, btw). I use it a lot like how I would use SO in the first place: get outlines, understand how certain lines might work (noisy process here), generate generic chunks of code, especially from modules I'm unfamiliar with. A lot of this can just be seen as cutting down time spent searching.
So, the most useful one: using it as a fuzzy search to figure out how to Google. This is the most common pattern. Since everything on Google is so SEO-optimized and Google clearly doesn't give a shit, I can ask GPT a question, get a noisy response that contains useful vernacular or keywords, and then use those to refine a Google search and actually filter out a decent amount of shit.

I think people might read this comment and think that you should just build an LLM into Google, but no, what's going on is more complicated and requires the symbiosis. GPT is dumb and doesn't have context, but is good at being a lossy compression system. The whole reason this works is because I'm intelligent and __context aware__, and importantly, critical of relying on GPT's accuracy[0]. Much of this can't be easily conveyed to GPT and isn't just a matter of token length.

So that said, the best way to actually improve this system is for Google to just get its shit together, or for some other search engine to replace them. Google, if you're listening, the best way you can make Google search better with LLMs is to: 1) stop enabling SEO bullshit, 2) throw Bard in on the side and have the LLM talk to you to help you refine a search. Hell, you can use an RL agent for 1: just look at how many times I back out from the links you send me, or at which links I actually use. Going to page 2 is a strong signal that you served shit.
[0] accuracy is going to highly depend on frequency of content. While they dedupe data for training, they don't do great semantic deduping (still an unsolved problem. Even in vision). So accuracy still depends on frequency and you can think of well known high frequency knowledge as having many different versions, or that augmentation is built in. You get lower augmentation rates with specific or niche expert knowledge as there's little baked in augmentation and your "test set" is much further from the distribution of training data.
Echoing this: it takes longer to read code than to write it, so generally, if you know what you want to write and it's non-trivial, you'll spend more time grokking AI-written code for correctness than writing it from scratch.
I've been using it since before the beta and I still do not understand why people have ever used it for multi-line suggestions. I only ever use it as a one-line autocomplete and it has done wonders for my productivity
It sounds to me like you were getting it to do too much at once.
So it's good that my card that was auto-paying for the Copilot subscription expired.
I never even started. On a vacation, I tried for two hours to get an LLM to write me an interpolation function. I had a test data set and checks laid out. Not a single one of the resulting algorithms passed all the checks, most didn't even do what I asked for, and a good chunk didn't even run.
LLMs give you plausible text. That does not mean it is logically coherent or says what it should.
For me it's useful for new languages that I'm not familiar with, saves a lot of googling time and looking up the docs.
When I've tried Copilot and similar tools, I've found them rather unimpressive. I assumed it was because I hadn't put in the time to learn how to make the best use of them, but maybe it's just that they're not very good.
On the other hand, I use ChatGPT (via the API) quite often, and it's very handy. For example, I wrote a SQL update that needed to touch millions of rows. I asked ChatGPT to alter the statement to batch the updates, and then asked it to log status updates after each batch.
As another example, I was getting a 401 accessing a nuget feed from Azure DevOps - I asked ChatGPT what it could be and it not only told me, but gave me the yaml to fix it.
In both cases, this is stuff I could have done myself after a bit of research, but it's really nice to not have to.
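A rough sketch of the batching idea from the SQL example above, assuming a DB-API style connection and a hypothetical orders table (the real statement isn't shown in the comment):

```python
import sqlite3
import time

BATCH_SIZE = 10_000

def batched_archive(conn: sqlite3.Connection) -> None:
    """Touch millions of rows in small chunks, logging progress after each batch."""
    batch = 0
    while True:
        cur = conn.execute(
            """
            UPDATE orders
            SET status = 'archived'
            WHERE id IN (
                SELECT id FROM orders
                WHERE status = 'stale'
                LIMIT ?
            )
            """,
            (BATCH_SIZE,),
        )
        conn.commit()
        batch += 1
        print(f"batch {batch}: {cur.rowcount} rows updated")
        if cur.rowcount < BATCH_SIZE:
            break
        time.sleep(0.1)  # brief pause so the batches don't starve other writers
```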