I cancelled my subscription after 2 months because I was spending way too much mental effort going over all of the code vomit fixing all of the mistakes. And it was basically useless when trying to deal with anything non-trivial or anything to do with SQL (even when I frontloaded it with my entire schema).
It was much less effort to just write everything myself because I actually know what I want to write and fixing my own mistakes was easier than fixing the bot’s.
I weep for the juniors that will be absolutely crushed by this garbage.
Good to know, that means I'm still economically useful.
I'm using ChatGPT rather than Copilot, and I'm surprised by how much it can do, but even so I wouldn't call it "good code" — I use it for JavaScript, because while I can (mostly) read JS code, I've spent the last 14 years doing iOS professionally and therefore don't know what's considered best practice in browser-land. Nevertheless, even though (usually) I get working code, I can also spot it producing bad choices and (what seems like) oddities.
Indeed.
You avoid the two usual mistakes I see with current AI, either thinking it's already game over for us or that it's a nothing-burger.
For the latter, I normally have to roll out a quote I can't remember well enough to google, that's something along the lines of "your dog is juggling, filing taxes, and baking a cake, and rather than be impressed it can do any of those things, you're complaining it drops some balls, misses some figures, and the cake recipe leaves a lot to be desired".
It's always really surprising to me that there appear to be these two camps. Though what frustrates me is that if you suggest something in the middle, people usually assume you're in the opposite camp from theirs. It reminds me a lot of politics, and I'm not sure why we're so resistant to nuance when our whole job is typically composed of nuance.
Though I'll point out, I think it is natural to complain about your juggling dog dropping balls or making mistakes on your taxes. That doesn't mean you aren't impressed. I think this response is increasingly common considering these dogs are sold as if they are super-human at these tasks. That's quite disappointing and our satisfaction is generally relative to expectations, not actual utility. If you think something is shit and it turns out to be just okay, you're happy and feel like you got a bargain. If you're expecting something to be great but it turns out to be just okay, you're upset and feel cheated. But these are different than saying the juggling dog is a useless pile of crap and will never be useful. I just want to make that clear, so we avoid my first paragraph.
My theory is the increase in available information is overwhelming everyone's cognitive abilities, and so jumping to reductive conclusions is just a natural defense mechanism, leading to increased polarization and reduced listening/tolerance skills. Certainly software engineers have an above-average tolerance for nuance, but even we have to pick our battles or risk drowning in the flood.
I agree and hold a similar theory. The human mind prefers to put things in boxes. And the simpler the box, the better.
There also seems to be a phenomenon where the more complex the system, the simpler the box must be. I'm kinda surprised by this, as understanding complexities (especially around dynamic environments and future planning) seems to be one of the things that distinguishes humans from other animals. And opposable thumbs. I mean sure, there are other animals with those too, but we stand out, literally.
If the box is simple, then it's easier to cast each individual complexity of the system, as needed, into an interpretation that fits and reinforces the box.
But don't you find it odd that my box is complex and that your box is simple? And that this holds for whoever you and I are?
I buy that one, but I think there's a coupling with metric availability.
Where we have measurements for pretty much everything but people don't know measurements are proxies and not always aligned with goals. Like how a ruler doesn't measure meters, but meters according to the ruler and only at the ruler's precision level. I can totally get how people who don't work with these tools don't understand the nuances, but it confuses me with experts. Isn't expertise, by definition, contingent upon understanding nuance?
It seems that the more metrics we have available to us, the less we care about understanding how those metrics work and what their limitations are. That they just become black boxes.
I mean there seems to be a very common belief that you can measure visual image quality by measuring some difference between an intermediate representation of a classification model. Or a belief that entropy measures quality of speech.
I'm really concerned that Goodhart's Law might be one of the Great Filters.
History says that the things in the middle, usually ones with the clearest computational model beneath them, get a fancy name and we stop calling it AI and we just call it algorithms.
I could see someone making a case for this being that 'middle' group but there's a sour note to this process that I don't know from one week to the next whether I find it sneaky or delightful.
Someone else can make that argument, because I'm so sick of silver bullets and gold rushes that I just. don't. care. Sturgeon's Law applies (90% of everything is crap) and I'll listen just enough to see if anyone is proposing which bits are the 10% to keep a finger on the scale if I think it'll matter. But let everyone else bleed over this, because in another ten years they'll be laughing about how silly they were to think this was going to solve all of our problems or end our profession.
Software is eating most things. If something eats software, your employability will be the least of your existential crises.
I've heard this before, but do we? Which are those things from the past that we "stopped calling AI and just call algorithms"?
I see three categories:
(1) very complex algorithms that we never did call "AI".
(2) stuff we did call AI, and we still do - things like expert systems, or IBM's Watson, or game AI. We knew, and still know, that those weren't AGI.
(3) some stuff promoted as AI (like the "AI assistant" Clippy), but which, marketing materials aside, nobody really considered AI or called them that.
But I don't remember this demoting/relabelling of stuff from AI to "just algorithms". It might have happened with some stuff, but I doubt it was as "classic" a development as it's portrayed.
Offhand, things that used to just be "AI": Expert systems, markov chains, genetic algorithms, data mining, and identification/classification.
People in the field would probably say they fall under the AI umbrella, but it's not the common viewpoint. Either someone can conceptualize how they'd work (expert systems) or they've been watered down and turned commonplace (Markov chains in software keyboards, identification in Facebook images), and either way it disassociates the technology from the "intelligence" part of "artificial intelligence", so it's no longer thought of as part of it.
Also Dynamic Programming, and soon, I predict, Monte Carlo simulation.
Polarisation in general can be well-explained by the business models of social media. You’re shown content that you react to strongest, and that is usually one end of a spectrum of views.
Not only are you shown content that you react to the strongest; people also create content that they react strongly to, continuing the cycle.
A user with a balanced interpretation and somewhat neutral feelings about a topic generally won't feel like they want to add something to a discussion. A user with a strong opinion is more likely to engage with posts of the opposite viewpoint or the same viewpoint.
HN is a bit of an exception because the community is reasonably high quality. But major platforms? The people who bother to write out long and neutral posts learned there is nothing to gain from doing that years ago.
Even here, depending on the felt "hotness" of the topic in the community, you might get a lot of negative sentiment for trying to find a middle ground or daring to look at generalized claims in more detail.
I think, in general, people now seem to require you to signal that you share their identification with a certain thought before an open discussion might become possible.
One important aspect seems to be that the higher your educational level, the more likely you are to be conditioned to identify with your own thoughts. This amplifies polarization on an intellectual level. There is something in it for the individual thinker taking on a new polarized belief. It adds to their identity.
The ultimate catch then is to take the position that I just outlined as an identity contrary to "all" others who are polarized. This is yet another trap.
Hence, the exercise is to practice not getting polarized while being compassionate to those who are. It's just a tendency of the human mind and nobody should be judged for falling for these traps. It's too easy to fall for it given our current conditioning.
I think that's mostly spillover from social media, combined with the very opinionated nature of us nerds :-)
Interesting, usually when I express an opinion in the middle, people agree. I've only talked to people about ChatGPT, AI, etc. in real life though, not on the internet.
I think the real life aspect changes things a lot.
Zealots on either side communicate at 100 times the rate of people who aren't so heavily invested.
What makes a man turn neutral? What makes a man turn neutral? Lust for gold? Power? Or were you just born with a heart full of neutrality?
-- Zapp Brannigan
I think it pays to just point out good use cases. For me, it's a superpowered thesaurus, a more concise and relevant Wikipedia, and an excellent listicle generator.
AI is not all-powerful, but those things alone help me a lot when I'm brainstorming.
You can tell it the code is bad and how, and a lot of the time it will correct it. For some BS code that you have to write, it is a great time saver.
For the code questions that I ask, it is sometimes quite non-trivial to check whether the code is correct or not.
In my tests it never happened that it could correct incorrect code that it had generated. Typically, the bot then generated code that was wrong again, sometimes for a different reason and sometimes for a similar one.
Again I disagree: the common case where you have to write BS code is when the abstraction is wrong. Implementing a proper abstraction that strongly reduces the BS code to write is the way to go.
Writing that abstraction can also feel BS.
The entire industry is BS all the way down
If you get to a capacitor, you've gone too far - you may be in a power supply.
Yes, I've even done that to see how well it works (my experience is that half the time it fixes the code, the other half it claims to but doesn't) — but I can do that because I can recognise that it's bad code.
I don't expect a junior to notice bad code, and therefore I don't expect them to ask for a fix. (I've also spent 6 months in a recent job just fixing code written by work experience students).
That is true. I've actually seen instances of juniors struggling with code that doesn't work and frankly doesn't make sense, but they claim they wrote it :)
The criticism from the second camp stems from the fact that the WHOLE job is to not drop anything.
A fence with a hole is useless even if it's 99% intact.
A lot of human jobs, especially white collar, are about providing reassurance about the correctness of the results. A system that cannot provide that may be worse than useless since it creates noise, false sense of security and information load.
Adding nuance to your fence analogy: most fences are decorative and/or suggestive of a border, but can be overcome somewhat easily, and are hence not useless just because they have a hole somewhere.
The spirit of the analogy holds, as there are plenty of clear alternatives that map verbatim:
- drinking glass that is 99% hole-free
- car that doesn't explode 99% of the time
- bag of candy where 99% of the pieces are not poisonous
In all of these cases, it's more optimal to start from scratch and build something that you know is 100% reliable than to start with whatever already exists and try to fix it after-the-fact.
Personally, I use AI to assist development, especially in unfamiliar stacks, but in the form of a discussion rather than code-vomit. It's primarily synthesizing documentation into more-specific whole answers and providing options and suggestions.
I beg to differ. True, many fences are easily overcome by adult humans, but most fences are not designed with adult humans in mind. Most fences are intended to keep animals in or out of an area. In more urban areas that may extend to small children. In high security areas, fencing may be just one layer of security, but it is certainly more than just "suggestive". In any of these cases, the fence is useless if it has a gap that can be exploited.
wanting this is probably the worst possible use case for LLM code vomit
I feel like if I ever try something less trivial than generating a wedding speech in the style of HP Lovecraft some AI evangelist tells me I've chosen the wrong use case for an LLM.
That is because you have already found the ideal use case for one.
For a sense of scale, the last time my JS knowledge was close to the best practices of the day was 1-1.5 years before the release of jQuery.
I was trying to say that ChatGPT is good in relative terms, not absolute ones.
Something can be very impressive without actually being useful, but that still doesn’t make it useful. There’s no market for a working dog that does a bad job of baking cakes and filing taxes, while dogs that can retrieve game birds or tackle fleeing suspects are in high demand.
I'm certainly impressed by a man that can dig a tunnel with a spoon. But you're definitely right that it's not that useful.
But I disagree that half assed work is not useful. It's just lower usefulness. My laundry app isn't even half assed. The programmers couldn't even sort the room list (literal random order) or cache your most recent room. It's still better than the BS system they had before where I had to load a prepaid card and that machine was locked in a building that isn't open on weekends or after 6pm. I'm still immensely frustrated, but I don't want to go back to the old system.
Though I'll mention that my fear is that because so many people see LLMs as having far more utility than they offer, we'll get more shit like the above instead of higher quality stuff. Most issues are solved for me to be comfortable in my life, so I definitely value quality a lot more. Plus, reduces a lot of mental stress as I'm not thinking "how can the person that made this be so dumb? How do you learn how to program and not know what sort is?"
This is the crux of most of modern AI. Nobody debates it's cool. Since 2014 or so there has been no shortage of amazing demos of computers doing stuff many thought wasn't possible or required human level intelligence. But there's not automatically a bridge from those demos to commercial relevance, no matter how amazing.
As a senior frontend/javascript guy, I’m afraid that relying on ChatGPT/copilot for _current best practices_ is probably where it works the worst.
Oftentimes it will produce code that’s outdated. Or, it will output code that seems great, unless you have an advanced understanding of the browser APIs and behaviors or you thoroughly test it and realize it doesn’t work as you hoped.
But it’s pretty good at getting a jumpstart on things. Refining down to best practices is where the engineer comes in, which is what makes it so dicey in the hands of a jr dev.
This matches my experience. When ChatGPT started going viral, I started getting a lot of PRs from juniors who were trying it out. Pretty much every single one was using deprecated API calls or best practices from 5-10 years ago. I'd ask why they chose to use an API that is scheduled to be removed in the next release of whatever library or system we are using.
ChatGPT does have its place. But you need to understand the tools you're using. It can be great for a first spike or for just getting something working. But then you have to go and look at what it's doing and make sure you understand it.
>For the latter, I normally have to roll out a quote I can't remember well enough to google, that's something along the lines of "your dog is juggling, filing taxes, and baking a cake, and rather than be impressed it can do any of those things, you're complaining it drops some balls, misses some figures, and the cake recipe leaves a lot to be desired".
Not the quote, but there was a Far Side cartoon along those lines where the dog was being berated for not doing a very good job mowing the lawn:
https://i.pinimg.com/originals/22/22/79/222279ceaa98f293e76e...
Oh no, a really sad Far Side cartoon! Which is very closely related to a shaggy dog joke you can spin out for ages, "$5 talking dog for sale", which ends with the setup / punchline, "why so cheap?" / "because he's a goddamn liar!"
I myself also don't know what's considered best practice in JavaScript generally (browser or server-side), even though I also have to write it occasionally -- but I wouldn't feel safe trusting that ChatGPT suggestions were likely to model current best practices either.
On what are you basing your thinking that ChatGPT is more likely than not to be suggesting best practices? (Real question, I'm curious!)
If there are patterns that are good, idiomatic, and mostly repeatable, we should be putting those into the standard library, not an AI tool.
What we have right now is a system to collect information about the sorts of problems developers want existing code to solve for them. We should embrace it.
Right now, for me, it’s fancy Auto Complete at best.
I get a Free subscription to it by using my kids EDU email accounts. Which is handy :)
But I absolutely would not pay for it.
I recall the last time I tried using the chat feature to do something, the code it produced wasn’t very useful and it referenced chapters from a book for further information.
It was clearly just regurgitating code from a book on the subject, and that just feels wrong to me.
At least give credit to the Authors and reference the book so I can go read the suggested chapters LOL
The value I found in my short trial of GPT-3 was as a bidirectional path finder.
Don't have me read 20 pages of docs just to integrate into a browser or a framework.. cutting out the legwork, essentially, so I can keep my motivation and inspiration going.
A lot of the latter is caused by the former. It is a nothing burger compared to the shocking amount of hysteria on HN about AI putting programmers out of a job. You'd expect a programmer to know what his job is, but alas, apparently even programmers think of themselves as glorified typewriters.
Oh man this is the opposite of my experience!
Copilot has replaced almost all of the annoying tedious stuff, especially stuff like writing (simple) SQL queries.
“Parse this json and put the fields into the database where they belong” is a fantastic use case for copilot writing SQL.
(Yes I’m sure there’s an ORM plugin or some middleware I could write, but in an MVP, or a mock-up, that’s too much pre optimization)
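For what it's worth, the kind of glue code meant here is roughly this. A minimal sketch, assuming a hypothetical payload shape and users table, with sqlite3 standing in for whatever database the MVP actually uses:

```python
import json
import sqlite3

payload = '{"id": 7, "name": "Ada", "email": "ada@example.com"}'

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")

record = json.loads(payload)
# Named parameters keep the field-to-column mapping explicit and avoid
# string-concatenated SQL.
conn.execute(
    "INSERT INTO users (id, name, email) VALUES (:id, :name, :email)",
    record,
)
conn.commit()
print(conn.execute("SELECT * FROM users").fetchall())
```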
An ORM is not too much pre optimisation…
Yeah, MVPs and mockups are exactly what ORMs are for, since they get you the first 80% for very little effort. Maybe it depends on the language, but this is definitely the way Rails and Laravel style framework ORMs are designed.
To me SQL is better for MVPs. It is transparent and clear, easy to debug, and can easily do whatever joins and filtering you want.
An ORM is more for the future, because it abstracts away the database implementation, so you could in theory change the db or change some behaviour more simply.
E.g. if you add something like a deleted_at column, an ORM can have a single place where you configure the system to use it, but if you have raw SQL queries lying around you may need to find all the spots where that needs adding to your WHERE clauses.
But otherwise SQL is easier to work with, in my view.
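To make the deleted_at point concrete, here's a minimal sketch of the "single place to configure it" idea, using SQLAlchemy as one example ORM (the Room model and column names are hypothetical); with raw SQL, the equivalent `deleted_at IS NULL` clause has to be repeated in every query by hand:

```python
from datetime import datetime

from sqlalchemy import Column, DateTime, Integer, String, create_engine, event
from sqlalchemy.orm import Session, declarative_base, with_loader_criteria

Base = declarative_base()

class Room(Base):
    __tablename__ = "rooms"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    deleted_at = Column(DateTime, nullable=True)  # soft-delete marker

@event.listens_for(Session, "do_orm_execute")
def _skip_soft_deleted(execute_state):
    # One hook covers every ORM SELECT; with raw SQL the equivalent
    # "deleted_at IS NULL" clause must be repeated in each query by hand.
    if execute_state.is_select:
        execute_state.statement = execute_state.statement.options(
            with_loader_criteria(
                Room, Room.deleted_at.is_(None), include_aliases=True
            )
        )

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
with Session(engine) as session:
    session.add_all([Room(name="A"), Room(name="B", deleted_at=datetime.now())])
    session.commit()
    print([room.name for room in session.query(Room).all()])  # ['A']
```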
What ORMs/languages do you usually work with? Dynamic language ORMs usually let you very easily add filters, work with relationships, etc.
I started out with Laravel 10 years ago, but I've been in the NodeJS ecosystem for the last 5 years, so I've been working with different things, but lately mostly with Supabase, which has its own REST API abstraction, though now I've been starting to prefer just having a direct bare postgres client against it. This is for my own side projects/MVPs.
Yeah, I've found both Eloquent (Laravel's ORM) and Entity Framework to be more than flexible enough and to produce surprisingly good SQL. You do need to keep an eye on the queries it produces though, because innocent changes can make a huge difference in performance.
ORMs can be annoying to use in my experience, especially with more complex queries, joins, counts, filtering by joins etc. And they make debugging harder usually.
Since I found out that ChatGPT does a really nice job of "Show me what ActiveRecord query would generate this SQL query", and vice versa, it's gotten quite a bit easier. It handles some surprisingly complex stuff.
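For a flavour of the kind of translation meant, here is a comparable SQL-to-ORM pairing, sketched with SQLAlchemy rather than ActiveRecord, and with hypothetical tables:

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, func, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String)

class Order(Base):
    __tablename__ = "orders"
    id = Column(Integer, primary_key=True)
    user_id = Column(Integer, ForeignKey("users.id"))

# Target SQL:
#   SELECT users.id, COUNT(orders.id) AS order_count
#   FROM users LEFT OUTER JOIN orders ON orders.user_id = users.id
#   GROUP BY users.id
stmt = (
    select(User.id, func.count(Order.id).label("order_count"))
    .join(Order, Order.user_id == User.id, isouter=True)
    .group_by(User.id)
)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
with Session(engine) as session:
    print(stmt)                          # the SQL the ORM expression compiles to
    print(session.execute(stmt).all())   # [] on the empty in-memory database
```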
There is no avoiding object-relational mapping, though, unless you intend to carry relations (sets of tuples) from start to finish. Something you almost certainly cannot do if you interface with any third-party code – which will expect language-native structures.
It's just a question of whether you want to use a toolkit to help the transformations, or if you want to 'do it by hand'.
I don't know, I don't think I've ever come across something I wasn't able to express in SQLAlchemy... And I prefer all the goodies that come with it to writing raw SQL.
Really? I've been doing web dev as a hobby for 20 years and professionally for 6 or 7 years. It's super helpful for me given how much boilerplate there is to write and how much of the training set is for websites. Any time I try to get it to write non-trivial SQL or TypeScript types it fails hard. But it's still a nice time saver for writing tests, request handling, React boilerplate, etc.
This is the problem.
As programmers we should be focusing effort on reducing boilerplate, so that it’s never needed again. Instead, we’ve created a boilerplate generator.
Nope. The wrong abstraction is too risky vs. having no abstraction. AI can let us move towards an abstraction-free mode of coding. In the future you’ll tell the AI “add a 60 second cache to every source of X data” and it’ll scour your source code for matching code and you’ll review the diffs. C programmers would review LLVM IR diffs. Web devs will review whatever compilation target they’re using.
We use high level languages because they improve reading comprehension and save us time when writing. Having a copilot take on a major role allows us to fundamentally rethink programming.
Which completely contradicts your earlier point.
Why not just get copilot write assembly for you? Or, spit out raw machine code? Oh, that’s right, because you need to check that it hasn’t fucked something up. Which, when there’s a ton of boilerplate, is hard.
It’s arguable that programming language evolution stopped around the time Java was released (barring a few sprinkles here and there, like async/await, affine/linear types, etc.)
We haven’t had a major leap in language power for decades (not like the leap from assembly to procedural languages) - I believe it’s because Java popularised evolution through libraries - and for a long time that was fine, even if it did lead to language evolution stagnating.
But now we’ve hit a complexity threshold that demands an abstraction leap, but instead of looking for that abstraction leap we’re getting a computer to generate boilerplate for us, hoping it will dig us out of the complexity hole.
We’re still ways off having a computer maintain a complex code base over many years. So humans still have to do that. It’d be much easier if we remove the incidental complexity.
“The purpose of abstraction is not to be vague, but to create a new semantic level in which one can be absolutely precise.”
— Edsger W. Dijkstra
I call bullshit. Haskell, Rust, and Zig (and others) are revolutionary. Also: Rust and Zig are facilitated by LLVM.
Haskell is over 30 years old and I don't know if it really was a major leap over SML.
Rust is basically an ML with a borrow checker, it's cool but not really a major leap in power over C++ (maybe a leap in freedom from bugs).
What does Zig do that's so revolutionary? Custom allocators are nothing new. And compile-time evaluation has been around so long some of us forgot we had it. Also Zig is working to step off LLVM.
I find it unlikely that we would be asking the ai for a very specific caching design. If the optimists are right, the ai should be able to do its own analysis and automatically design an architecture that is superior to what a human would be able to do.
A job spent mostly reviewing ai generated diffs sounds like a level of hell beyond even Dante's imagination.
The challenge there is that reducing boilerplate comes with opinionated magic, which rubs a lot of people the wrong way. So the creators add configs and code level config, and pretty soon you're back in boilerplate hell.
I'm starting to lean in the direction that trading some boilerplate for flexibility is a good deal.
As with most things in life, moderation is key.
I find co-pilot primarily useful as an auto-complete tool to save keystrokes when writing predictable context driven code.
Writing an enum class in one window? Co-pilot can use that context to auto complete usage in other windows. Writing a unit test suite? Co-pilot can scaffold your next test case for you with a simple tab keystroke.
Especially in the case of dynamic languages, co-pilot nicely complements your IntelliSense.
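As a toy illustration of the kind of context-driven completion meant here (the enum and the lookup function are hypothetical): once the enum exists somewhere in the project, the dispatch code elsewhere is exactly the sort of thing Copilot tends to fill in from a tab keystroke.

```python
from enum import Enum

class OrderStatus(Enum):
    PENDING = "pending"
    SHIPPED = "shipped"
    CANCELLED = "cancelled"

# With the enum above in context, a mapping like this is the kind of
# predictable, context-driven code Copilot scaffolds well.
def describe(status: OrderStatus) -> str:
    return {
        OrderStatus.PENDING: "Order received, not yet shipped",
        OrderStatus.SHIPPED: "Order is on its way",
        OrderStatus.CANCELLED: "Order was cancelled",
    }[status]

print(describe(OrderStatus.SHIPPED))
```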
Yep, treat copilot like a really good autocomplete system and it's wonderful. Saves a lot of time and typing, even in languages that aren't known for having a lot of boilerplate.
Ask copilot to solve actual problems (y'know, the kind of stuff you are presumably paid to solve), and it falls completely flat.
I've found its autocomplete really bad. It often hallucinates import paths and other illegal operations. It's not clear to me why it can't leverage the LSP better.
It's also nice for solving small things like 'Euclidean distance'.
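i.e. the sort of small, well-specified utility it reliably gets right (a throwaway sketch):

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

print(euclidean_distance((0, 0), (3, 4)))  # 5.0
```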
My experience is similar. This week I created a few DBT models and Co-Pilot saved a *ton* of keystrokes on the YML portion.
It needed some hand-holding in the early parts, but it was so satisfying to tab autocomplete entire blocks of descriptions and tests once it picked up context with my preferences.
This is the problem with using AI for generating SQL statements; it doesn't know the semantics of your database schema. If you are still open to a solution, I recently deployed one[1] that combines AI, your db schema, and a simple way to train the AI on your database schema.
Essentially, you "like" correct (or manually corrected) generations, and a vectorized version is stored and used in future similar generations. An example could be telling it which table or foreign key is preferred for a specific query, or that it should wrap columns in quotes.
From my preliminary tests it works well. I was able to consistently make it use the correct tables, foreign keys, and quotes on table/column names for case-sensitivity using only a couple of trainings. Will open a public API for that soon too.
[1]: https://www.sqlai.ai/
It's "a public API", not "an public API", because of the consonant rule.
I really worry that there are people out there who will anxiously mangle their company's data thinking that what is being called AI (which doesn't exist yet) will save the day.
Use it as a tool, not a replacement. However, it does do things well even without much additional information, like fixing an SQL statement[1]. That being said, it is consistently improving; GPT-3.5 to GPT-4 was a major upgrade.
[1]: https://www.sqlai.ai/snippets/clroq0qn9001xqzqeidtm4jgx
Just copy-paste the db/table schema in a comment at the start of your file. Nothing else needed.
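A sketch of what that looks like in practice; the orders table and the query are hypothetical, and the point is just that Copilot picks up the schema comment as context for its SQL suggestions:

```python
# -- Hypothetical schema pasted as context for Copilot --
# CREATE TABLE orders (
#     id          INTEGER PRIMARY KEY,
#     user_id     INTEGER NOT NULL REFERENCES users(id),
#     total_cents INTEGER NOT NULL,
#     created_at  TIMESTAMP NOT NULL
# );

import sqlite3

def monthly_revenue(conn: sqlite3.Connection):
    # With the schema comment above in the file, a completion along these
    # lines is what Copilot tends to suggest after the function signature.
    return conn.execute(
        """
        SELECT strftime('%Y-%m', created_at) AS month,
               SUM(total_cents) / 100.0      AS revenue
        FROM orders
        GROUP BY month
        ORDER BY month
        """
    ).fetchall()
```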
If you used copilot in the beginning (and I think still with some plans), it was only GPT 3.5.
Likely you’d get much better results with GPT-4.
The difference in output code quality in 3.5 vs 4 is staggering - just with using regular old ChatGPT.
I feel like OpenAI is really shooting themselves in the foot by not having some sort of trial of 4. The difference in quality of everything is huge and most people don’t know it.
This is the real danger of this sort of thing: when your Copilot or whatever is good enough that it replaces what is vastly superior, for purely economic reasons.
I wrote about this trend applied to the unfortunately inevitable doom of the voice acting industry in favour of text-to-speech models a couple of months ago, using my favourite examples of typesetting, book binding and music engraving: https://news.ycombinator.com/item?id=38491203.
But when it’s development itself that gets hollowed out like this, I’m not sure what the end state is, because it’s the developers who led past instances of supplanting. Some form of societal decline and fall doesn’t feel implausible. (That sentence really warrants expansion into multiple paragraphs, but I’m not going to. It’s a big topic.)
Yes, like desktop publishing compared to traditional printing. Its output is comparatively shitty but average people will settle for it on account of its democratization. My context was feature factory web dev versus low / no code rather than autocomplete versus solo author.
By democratization, I mean that it enables one-one where previously there was only one-many: instead of the inversion of experience where the same unique app experience is shared by millions, a technology allows the interface to be tailored for an audience of one or dozens: missing toes and fingers, color blindness, particularly difficult and unique operating conditions, etc. Given those unique constraints the mediocrity provides at least some preferable solution. The downside of this is that it sets the floor a lot lower, and people who would never have even tried or contemplated trying typesetting will dabble with desktop publishing to achieve their ends.
Somebody on here gifted me with the word "procrustean" and I've taken it and put it in my Minsky fish tank. There are many reasons to eschew the trusted-experts model: somebody who has made heads for pins for twenty years is incontrovertibly an expert, but who cares? Our uncanny valley appears to be only a local minimum.
[PS, nobody understood my point but I thank them for the honest feedback.]
On your P.S.: I contemplated your point being “I can cope with this stuff and know when to correct it or give up on it, but what chance does a beginner have?” which I think is what you actually meant, but ended up deciding to write what I wrote anyway, because it ends up feeling fairly connected to me, though from a slightly different angle.
I’m in India at present, and a number of 18–25 year-olds ask me about learning to code (commonly because their college or University is teaching C/C++ and they have no idea). Somehow, they have more access to a computer than ever before, because most of them carry one on their person all day, and more information about this task than ever before, yet their computer, a phone, has been dumbed-down and locked-up in such a way that they can’t really use it to learn to code, because that’s something you do on Computers, and those are just on the campus or your laptop, Chris, or things like that.
Easier for the simple tasks, so that the more complex tasks, the ones enough people in a previous generation used to work their way up to, just don't get reached, leaving over time perhaps a chasm, and fewer really skilled people in a society that may depend on them more than previously. The lower floor helps some who wouldn't have got started before, but discourages others by making things too easy, so they never find the challenge they seek.
I'm not using Copilot to write code, I use it for autocomplete. For that it's great.
Copilot probably saved my career. I was getting occasional wrist pain due to keyboard overuse, but the autocomplete is so good that my keyboard use is a tiny fraction of what it was previously. Wrist pain gone.
Using Copilot is a skill though, you have to live with it and learn its limits and idiosyncrasies to get the most out of it.
100% agree. It usually autocompletes better than the language's LSP since it takes a lot of context into consideration. It's a godsend for repetitive tasks or trivial stuff.
I've been using Bing's GPT-4 to learn Fortran for about a week. Well, I'm mainly reading a book which is project-oriented so I'm often researching topics which aren't covered in enough detail or covered later. I think this mixed-mode approach is great for learning. Since Fortran's documentation is sparse and Google's results are garbage the output of GPT4 helps cut through a lot of the cruft. Half the time it teaches me, the rest I'm teaching it and correcting its mistakes. I certainly won't trust it for anything half complicated but I think it does a good job linking to trustworthy supporting sources, which is how I learned how to cast assumed-rank arrays using c_loc/c_f_pointer and sequence-association using (*) or (1). It's great for learning new concepts and I imagine it would be great for pair-coding in which it suggests oversights to your code. However I can't imagine depending on it to generate anything from scratch. What's surprising is how little help the compiler is - about as bad a resource as SEO junk. I'm used to "-Wall -Werror" from C, but so many gfortran warnings are incorrect.
Please be sure to report poor warnings to the gfortran developers -- it's generally a great compiler, and you can help keep it great.
A problem with Fortran compiler error and warning messages is that Fortran is largely a legacy language at this point, and most Fortran code hitting the compilers has already had its errors shaken out. New code, and especially new code from new Fortran users, is somewhat more rare -- so those error and warning checks are a part of the compiler that doesn't get as much exercise as one would like.
This aligns with my observations. I don't use Copilot etc. but the other devs on my small team do. I've observed that I'm generally a faster and more confident typist and coder - not knocking their skills, I'm just more experienced, and also spent my teens reading and writing a lot.
I've seen that it helps them in cases where they're less certain what they're doing, but also when they know what they're doing and it's quicker about it.
For me I know what I want to write and it seems that Copilot also knows what I want to write, so as an auto complete it just works out for me. Most of the time code I want to write is in my head, I just need to be able to quickly vomit it out and typing speed is the bottleneck.
I am also able to intuitively predict that it is going to vomit out exactly what I want.
E.g. I know ahead of time what the 10 lines it will give me are.
You cancelled the tool because you didn’t know how to use it?
Depends on how you use it.
I use a similar VS Code assistant, but only for shorter code. I am able to complete code faster than an instructor on video.
As others said, I use Copilot or similar for scaffolding/boilerplate code. But you are right, reading almost-correct code that you are unfamiliar with is much more demanding than fixing stuff I got wrong to begin with.
it's crazy how amazing it is sometimes for certain things, including comments (and including my somewhat pithy style of comment prose), and how incredible and thorough the wastes of time when it autosuggests some function that is plausible, defined, but yet has some weird semantics such that it subtly ruins my life until i audit every line of code i've written (well, "written", i suppose). i finally disabled it, but doing so _did_ make me kind of sad-- it is nice for the easy things to be made 25x easier, but, for programming, not at the expense of making the hard stuff 5x harder (note i didn't say 10x, nor 100x. 5x). it's not that it's that far away from being truly transformative, it's just that the edge cases are really, really rough because for it to be truly useful you have to trust it pretty much completely & implicitly, and i've just gotten snakebitten in the most devious ways a handful of times. an absolute monster of / victim of the pareto principle, except it makes the "90%" stuff 1.5x easier and the "10%" stuff 5x harder (yes, i know i haven't been using my "k%"s rigorously) (and for those keeping score at home, that adds up to making work 10% harder net, which i'd say is about right in my personal experience).
highlights: "the ai" and i collaboratively came up with a new programming language involving defining a new tag type in YAML that lets one copy/paste from other (named) fragments of the same document (as in: `!ref /path/to/thing to copy`) (the turing completeness comes from self-referential / self-semi-overlapping references (e.g. "!ref /name/array[0:10]`) where one of the elements thus referred-to is, itself, a "!ref" to said array).
lowlights: as already alluded to, using very plausible, semi-deprecated API functions that either don't do what you think they do, or simply don't work the way one would think they do. this problem is magnified by googling for said API functions only to find cached / old versions of API docs from a century ago that further convince you that things are ok. nowadays, every time i get any google result for a doc page i do a little ritual to ensure they are for the most recent version of the library, because it is absolutely insane how many times i've been bitten by this, and how hard.
I think there is a sweet spot where if you're junior on the cusp of intermediate it can help you because you know enough to reject the nonsense, but it can point you in the right direction. Similar to if you need to implement a small feature in a language you don't know, but basically know what needs to get done.
I've definitely seen juniors just keep refining the garbage until it manages to pass a build and then try to merge it, though, and using it that way just sort of makes you a worse programmer because you don't learn anything and it just makes you more dependent on the bot. Companies without good code reviews are just going to pile this garbage on top of garbage.
I have a similar sentiment, and looking at how mixed the takes are, I think it depends on what you do. I write a lot of research code, so I think it's unsurprising that GPT isn't too good here. But people I see who write code more in the "copy and paste from Stack Overflow" style get huge utility out of this. (This isn't a dis on that style; lots of work is repetitive and redundant.)
So I changed how I use GPT (which I do through the API; much cheaper, btw). I use it a lot like how I would use SO in the first place: get outlines, understand how certain lines might work (noisy process here), generate generic chunks of code, especially from modules I'm unfamiliar with. A lot of this can just be seen as cutting down time spent searching.
So, the most useful one: using it as a fuzzy search to figure out how to Google. This is the most common pattern. Since everything on Google is so SEO-optimized and Google clearly doesn't give a shit, I can ask GPT a question, get a noisy response that contains useful vernacular or keywords, and then use those to refine a Google search and actually filter out a decent amount of shit.

I think people might read this comment and think that you should just build an LLM into Google, but no, what's going on is more complicated and requires the symbiosis. GPT is dumb and doesn't have context, but is good at being a lossy compression system. The whole reason this works is because I'm intelligent and __context aware__, and importantly, critical of relying on GPT's accuracy[0]. Much of this can't be easily conveyed to GPT and isn't just a matter of token length.

So that said, the best way to actually improve this system is for Google to just get its shit together, or for some other search engine to replace them. Google, if you're listening, the best way you can make Google search better with LLMs is to: 1) stop enabling SEO bullshit, 2) throw Bard in on the side and have the LLM talk to you to help you refine a search. Hell, you can use an RL agent for 1: just look at how many times I back out from the links you send me, or at which links I actually use. Going to page 2 is a strong signal that you served shit.
[0] accuracy is going to highly depend on frequency of content. While they dedupe data for training, they don't do great semantic deduping (still an unsolved problem. Even in vision). So accuracy still depends on frequency and you can think of well known high frequency knowledge as having many different versions, or that augmentation is built in. You get lower augmentation rates with specific or niche expert knowledge as there's little baked in augmentation and your "test set" is much further from the distribution of training data.
Echoing this: it takes longer to read code than to write it, so generally, if you know what you want to write and it's non-trivial, you'll spend more time grokking AI-written code for correctness than writing it from scratch.
I've been using it since before the beta and I still do not understand why people have ever used it for multi-line suggestions. I only ever use it as a one-line autocomplete and it has done wonders for my productivity
It sounds to me like you were getting it to do too much at once.
So it's good that my card that was auto-paying for the Copilot subscription expired.
I never even started. On a vacation, I tried for two hours to get an LLM to write me an interpolation function. I had a test data set and checks laid out. Not a single one of the resulting algorithms passed all the checks, most didn't even do what I asked for, and a good chunk didn't even run.
LLMs give you plausible text. That does not mean it is logically coherent or says what it should.
For me it's useful for new languages that I'm not familiar with, saves a lot of googling time and looking up the docs.
When I've tried Copilot and similar tools, I've found them rather unimpressive. I assumed it was because I hadn't put in the time to learn how to make the best use of them, but maybe it's just that they're not very good.
On the other hand, I use ChatGPT (via the API) quite often, and it's very handy. For example, I wrote a SQL update that needed to touch millions of rows. I asked ChatGPT to alter the statement to batch the updates, and then asked it to log status updates after each batch.
As another example, I was getting a 401 accessing a nuget feed from Azure DevOps - I asked ChatGPT what it could be and it not only told me, but gave me the yaml to fix it.
In both cases, this is stuff I could have done myself after a bit of research, but it's really nice to not have to.
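A rough sketch of the batching idea from the SQL example above, assuming a DB-API style connection and a hypothetical orders table (the real statement isn't shown in the comment):

```python
import sqlite3
import time

BATCH_SIZE = 10_000

def batched_archive(conn: sqlite3.Connection) -> None:
    """Touch millions of rows in small chunks, logging progress after each batch."""
    batch = 0
    while True:
        cur = conn.execute(
            """
            UPDATE orders
            SET status = 'archived'
            WHERE id IN (
                SELECT id FROM orders
                WHERE status = 'stale'
                LIMIT ?
            )
            """,
            (BATCH_SIZE,),
        )
        conn.commit()
        batch += 1
        print(f"batch {batch}: {cur.rowcount} rows updated")
        if cur.rowcount < BATCH_SIZE:
            break
        time.sleep(0.1)  # brief pause so the batches don't starve other writers
```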