It's pretty annoying that every project like this lately is just a wrapper for OpenAI API calls.
This approach works. I just built a SPA in 3 days with GPT-4, and about 50% of the code was generated. My only tooling was a bash script that lists all the files in the repo (with some exceptions), prepends a README.md planning the project and a file list, and at the end I type my task.
I run about 10-15 rounds with it. At the beginning I was using GPT more heavily, but in the middle I found it easier to just fix the code myself. The context got as big as 10k tokens, but was not a problem. At some point I might need to filter the files more aggressively.
But surprisingly, all that's needed for a bare-bones repo-level coding assistant is a script to list all the files so I could easily copy-paste the whole thing into the ChatGPT window.
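Roughly something like this, if it helps (a minimal sketch, not my exact script; the exclusions and the README/task framing are just placeholders):

    #!/usr/bin/env bash
    # Dump the repo into a single prompt: README, file list, file contents, then the task.
    cat README.md
    echo "== FILE LIST =="
    git ls-files | grep -vE 'node_modules|dist|\.lock$'
    echo "== FILES =="
    git ls-files | grep -vE 'node_modules|dist|\.lock$' | while read -r f; do
        echo "--- $f ---"
        cat "$f"
    done
    echo "== TASK =="
    echo "$1"    # the task is passed as the first argument

Run it with the task as an argument and paste the output into the chat window.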
Yes, well said. Doing exactly this kind of thing for months with ChatGPT is what convinced me the idea could work in the first place. I knew the underlying intelligence was there--the challenge is giving it the right prompts and supporting infra.
Do you have any of the issues where ChatGPT tends to forget the first parts of its context window? It could have the information explicitly spelled out, but if it wasn’t in the last 2K tokens or so, it’d just start to hallucinate stuff for me.
Plandex uses gradual summarization as the conversation gets longer (the exact cutoff point in terms of tokens is configurable via `plandex set-model`). So eventually, with a long enough plan, you can start to lose some resolution. That said, assuming you use the default gpt-4-turbo model with a 128k context window, you'd need to go far beyond 2k tokens before you'd start seeing anything like that.
We don't know what ChatGPT's summarization strategy is since it's closed source, but it does seem to be quite a bit more aggressive than Plandex's.
What’s your experience with API cost? I've also tried something similar, but I often end up using up my balance too quickly.
I can generally have these tools solve a simple issue in about 0.1 USD, or "complex" issues in 1-2 USD (complex generally just means that I'm spending time prompt engineering to get the model to do the right thing).
Do you have any boilerplate part of your prompt you can share?
a script to list all the files so I could easily copy-paste the whole thing
Just in case you are using a Mac, you can pipe the output of your script to pbcopy so that it goes directly into your clipboard
script.sh | pbcopy
Show me one of these things do something more complex than a front-end intern project.
I agree, these things seem to do OK-ish on trivial web projects. I've never seen them do anything more than that.
I still use ChatGPT for some coding tasks, e.g. I asked it to write C code to do some annoying fork/execve stuff (can't remember the details) and it did a decentish job, but it's like 90% right. Great for figuring out a rough shape and what functions to search for, but you definitely can't just take the code and expect it to work.
Same when I asked it to write a device driver for some simple peripheral. It had the shape of an answer but with random hallucinated numbers.
I've also noticed that because there is a ton of noob-level code on the internet it will tend to do noob-level things too, like for the device driver it inserted fixed delays to wait for the device to perform an operation rather than monitoring for when it had actually finished.
I wonder if coding AIs would benefit from fine tuning on programming best practices so they don't copy beginner mistakes.
I used a web project in the demo because I figured it would be familiar to a wide range of developers, but actually many nontrivial pieces of Plandex have been built with the help of Plandex itself.
That's not to say it's perfect or will never make "noob-level" mistakes. That can definitely happen and is ultimately a function of the underlying model's intelligence. But I can at least assure you that it's quite capable of going far beyond a trivial web project.
It's also on me to show more in-depth examples, so thanks for calling it out. I'd love it if you would try some of the projects you mention and let me know how it goes.
So basically you don't have any non-trivial examples. What else is to be expected?
Check out some of the test prompts here for examples of larger tasks: https://github.com/plandex-ai/plandex/blob/main/test/test_pr...
Here's a prompt I used to build the AWS infrastructure for Plandex Cloud with Plandex: https://github.com/plandex-ai/plandex/blob/main/test/test_pr...
It's not something I would consider a complex job. A simple prompt to ChatGPT could even produce a working CDK template.
Here's another one, for the backend of a Stripe billing system: https://github.com/plandex-ai/plandex/blob/main/test/test_pr...
It seems like more examples demonstrating relatively complex tasks would be helpful, so I'll work on those.
I'm certainly not trying to claim that it can handle any task. The underlying model's intelligence and context size do place limits on what it can do. And it can definitely struggle with code that uses a lot of abstraction or indirection. But I've also been amazed by what it can accomplish on many occasions.
Love the idea of this, and very excited to see how it pans out. That said: I hate the code review UI. Just dump the changes as `git diff` does and let me review them using all the code review tools I use every day, then provide revision instructions. Building a high-quality TUI for side-by-side diffs should not be the thing you are spending time on, and there already exist great tools for viewing diffs in the terminal.
Thanks for the feedback! I actually had a ‘plandex diff’ command working at one point, but dropped it in favor of the changes TUI. I could definitely bring it back for people who prefer that format.
You could have a mode for people "who know what they are doing" that just auto-approves all the changes Plandex makes and lets users handle the changes themselves. I would actually prefer that, because I could keep using my IDE to look at diffs and decide what to keep.
Thanks, I'll consider this. It would be easy enough to add flags that will allow this.
Agreed! I, for example, would prefer to use difftastic.
Providing diff output allows people to self select their approach to merging the changes.
Yeah, that makes sense. I'm going to add this soon.
Congrats on the launch. Can you please compare and contrast Plandex's features with another similar solution like aider[1], which also helps solve a similar problem?
Thanks for mentioning aider! I haven't had a chance to look closely at plandex, but have read the author's description of differences wrt aider. I'd add a few comments:
I think the plandex UX is novel and interesting. The idea of a git-like CLI with various stateful commands is a new one in this space of ai coding tools. In contrast, aider uses a chat based "pair programming" UX, where you collaborate with the AI and ask for a sequence of changes to your local git repo.
The plandex author highlights that it makes changes in a "version-controlled sandbox" and can "rewind" unwanted changes.
These capabilities are all available "for free" in aider, because it is tightly integrated with git. Each AI change is automatically git committed with a sensible commit message. You can type "/diff" to check the diff, or "/undo" to undo any AI commit that you don't like. Or you can use "/git checkout -b <branch-name>" to start working on a branch to explore a longer sequence of changes, etc.
All your favorite git workflows are supported by invoking familiar git commands with "/git ..." inside the aider chat, or using any external git tooling that you prefer. Aider notices any changes in the underlying repo, however they occur.
These capabilities are all available "for free" in aider, because it is tightly integrated with git.
Sounds like the right approach to me. Some quick questions:
1. Is it easy to customize the system prompt with aider?
2. Does aider save a record of all OpenAI API calls? I’m thinking I may e.g. want to experiment with fine tuning an open source model using these one day.
3. What would you say are aider’s closest “competitors”?
Just to note, Plandex also has integration with git on the client-side and can automatically commit its changes (or not--you can decide when applying changes).
One of the reasons I think it's good to have the plan version-controlled separately from the repo is it avoids intermingling your changes and the model's changes in a way that's difficult to disentangle. It's also resilient to a "dirty" git state where you have a mix of staged, unstaged, and untracked changes.
One more benefit is that Plandex can be used in directories that aren't git repos, while still retaining version control for the plan itself. This can be useful for more one-off tasks where you're not working in an established project.
Thanks! Sure, I posted this comment in a Reddit thread a couple days ago to a user who asked the same question (and I added one additional point):
First I should say that it’s been a few months at least since I’ve used aider, so it’s possible my impression of it is a bit outdated. Also I’m a big fan of it and drew a lot of inspiration from it. That said:
Plandex is more focused on building larger and more complex functionality that involves multiple steps, whereas aider is more geared toward making a single change at a time.
Plandex has an isolated, version-controlled sandbox where tentative changes are accumulated. I believe with aider you have to apply or discard each change individually?
Plandex has a diff review TUI where changes can be viewed side-by-side and optionally rejected, a bit like GitHub’s PR review UI.
Plandex has branches that allow for exploring multiple approaches.
aider has cool voice input features that Plandex lacks.
aider’s maintainer Paul has done a lot of benchmarking of file update strategies. While I think Plandex’s approach is better suited to larger and more complex functionality, aider’s unified diff approach may have higher accuracy for a single change. I hope to do benchmarking work on this in the future.
aider requires Python and is installed via pip, while Plandex runs from a single binary with no dependencies, so Plandex installation is arguably easier overall, especially if you aren't a Python dev.
I’m sure I’m missing some other differences but those are the main ones that come to mind.
Thank you. Branches to explore different approaches is a really good idea, since LLMs are most powerful when they are used as a rubber duck to generate boilerplate templates and this can help get multiple perspectives. Going to test it soon.
What's the deal with Plandex Cloud and $10/$20-mo? The GitHub repo README devolves into a cloud pitch halfway through. I thought this was a local binary talking to OpenAI? I thought this was open source?
Hi, it’s open source and it also has a cloud option. You can either self-host or use cloud—it’s up to you.
The CLI talks to the Plandex server and the server talks to OpenAI.
But I still don't get what the cloud option would be doing that's worth $20/mo if it's talking to OpenAI. Does the Plandex server have large resource requirements?
The server does quite a bit. Most of the features are covered here: https://github.com/plandex-ai/plandex/blob/main/guides/USAGE...
I actually did start out with just the CLI running locally, but it reached a point I needed a database and thus a client-server model to get it all working smoothly. I also want to add sharing and collaboration features in the future, and those require a client-server model.
Congrats on the launch, I'm excited to give it a try. I'm curious how you're having it edit files in place - having built a similar project last summer, I had trouble getting it to reliably patch files with correct line numbers. It was especially a problem in React files with nested divs.
Thanks! I tried many different ways of doing it before settling on the current approach. It's still not perfect and can make mistakes (which is why the `plandex changes` diff review TUI is essential), but it's pretty good now overall.
I was able to improve reliability of line numbers by using a chain-of-thought approach where, for each change, the model first summarizes what's changing, then outputs code that starts and ends the section in the original file, and then finally identifies the line numbers from there.
The relevant prompts are here: https://github.com/plandex-ai/plandex/blob/main/app/server/m...
Amazing work. Loved the video and looking forward to trying it
Can a user ask Plandex to modify a commit? Maybe the commit just needs a small change and doesn’t need to be entirely rewritten. Can the scope be reduced on the spot to focus only on a commit?
Thanks! There isn't anything built-in to specifically modify a commit, but you could make the modification to the file with Plandex and then `git commit --amend` for basically the same effect.
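For example, something like this should work (a rough sketch of the flow; it assumes the commit you want to adjust is the most recent one and that you apply the plan's changes to your working tree first):

    plandex tell "tighten the validation in the signup handler"   # hypothetical prompt
    plandex apply                    # write the accepted changes to your working tree
    git add -u
    git commit --amend --no-edit     # fold the change into the previous commit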
Not for this project specifically, but I realize that I've seen a lot of AI agents, yet I've never seen anything interesting built with them. Some simple website, maybe even some very simple old games like Snake or Pong, but nothing better. Am I missing something?
I'd say LLM agents, or multi-agent systems, are still in the very early research/prototype stage.
You can tell because there are papers from Microsoft on this but no product: https://www.microsoft.com/en-us/research/project/autogen/
I also wrote about the L1 to L5 of AI coding here: https://prompt.16x.engineer/blog/ai-coding-l1-l5
I brainstormed a text game engine powered by an LLM, but relying on a non-local LLM was off-putting. Local LLMs are getting more and more viable though. A general problem I ran into was that thinking in terms of LLM queries is a very new way of designing computation, and adapting takes a lot of effort. Then again, my central idea was a bit ambitious too: every game character would have a unique interpretation of what was happening.
Try to build something interesting with Plandex! Perhaps you will be pleasantly surprised. Either way, please let me know how it goes.
To support many other models you should look at ollama - it provides a REST API on your machine for local inference that works just like OpenAI's.
Thanks, I'm aware of ollama and the open source model ecosystem, but I haven't done a deep dive yet, so all the info in this thread has been quite helpful.
In theory, all you have to do is redirect the API gateway to localhost and all your existing integrations should just work!
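For example (a hedged sketch, assuming Ollama is running locally and the client honors the standard OpenAI environment variables):

    # Ollama exposes an OpenAI-compatible API at this address by default
    export OPENAI_BASE_URL=http://localhost:11434/v1
    export OPENAI_API_KEY=ollama    # Ollama ignores the key, but most clients require one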
There's an issue here to keep track of this: https://github.com/plandex-ai/plandex/issues/20
It seems that while ollama does have partial OpenAI API compatibility, it's missing function calling, so that's a blocker for now.
This seems very interesting, but I think the interface choice is not good. There would have been much less friction if this was purely a GitHub/GitLab/etc bot.
I see where you're coming from and I do plan to add a web UI and plugin/integration options in the future.
I personally wanted something with a tighter feedback loop that felt more akin to git. I also thought that simplifying the UI side would help me stay focused on getting the data structures and basic mechanics right in the initial version. But now that the core functionality is in place, I think it will work well as a base for additional frontends.
I haven't tried it yet, but I think making it fast iteration and simple initially is the right way to go. Nice one sharing this as open source!
I disagree, having used Sweep extensively, I've found the GitHub Issue -> PR flow to be incredibly clunky with a lack of ability to see what is happening and what has gone wrong.
In the demo it modified UI components. Is there any model that can look at the rendered page to see if it looks right? Right now, all these wrappers just blindly edit the code.
Plandex can't do this yet, but soon I want to add GPT4-vision (and other multi-modal models) as model options, which will enable this kind of workflow.
Well, I have built a similar project that lives in a GitHub Action, communicates via issues, and sends a PR when done.
GPT-4 Vision isn't there yet. It can mostly OCR or pattern-recognize an image if it's popular or has some known object in it. It cannot detect pixel differences or CSS/alignment issues.
I paired mine with VSCode and used the live view addon for that folder. So far so good.
Looks interesting. Can you go into more detail about why you like this better for large/complex tasks compared to GH Copilot?
Not the author, but I'm in a discord with him, I believe the main selling point here is that it allows you to manage your updates and conversations in a branching pattern that's saved. So if you can't get the AI to do something you can always revert to a prior state and try a different method.
Also, it doesn't work on a "small view of the world" like Copilot does. When I was using Copilot, it could only insert code around your cursor (I understand that Copilot pulls in a lot of context from all the files you have open, but the area it can modify is really small). Plandex can add/remove/update code in multiple files at once. It'll also show you a diff first before it applies anything, and you can select some or all of the changes it made.
Yes, couldn't have said it better myself!
Curious to know how you built this. Is it GPT-4 or a fine-tuned model? How much does it cost?
It's written in Go. The models that it uses are configurable, but it mostly uses gpt-4-turbo by default currently. It calls the OpenAI API on your behalf with your own API key. No fine-tuning yet, though I'm interested in trying that in the future.
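Concretely, that just means setting your key in the environment before running it (assuming the standard variable name; the value below is a placeholder):

    export OPENAI_API_KEY=sk-your-key-here    # Plandex sends requests to OpenAI with your own key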
Appreciate the response. Really cool work!
Congrats! Looks great, and I can't wait to try it.
Do you support Azure OpenAI with custom endpoints?
Are any special settings necessary to disable telemetry or non-core network requests?
Thanks! It doesn't yet support custom endpoints, but it will soon. I'd recommend either joining the Discord (https://discord.gg/plandex-ai) or watching the repo for updates if you want to find out when this gets released.
If you self-host the server, there is no telemetry and no data is sent anywhere except to your self-hosted server and OpenAI.
This is really cool. I tried it and ran into a few syntax errors - it kept missing closing braces in PHP for some reason.
It seems it might be useful if it could actually try to execute the code, or somehow check for syntax errors/unimplemented functions before accepting the response from the LLM.
Thanks! Was this on cloud or self-hosted? If cloud and you created an account, feel free to ping me on Discord (https://discord.gg/plandex-ai) or by email (dane@plandex.ai) and let me know your account email so I can investigate. If you have an anonymous trial account on cloud, please still ping me--I can track it down based on file names. There is definitely some work to do in ironing out these kinds of edge cases.
"It seems it might be useful if it could actually try to execute the code, or somehow check for syntax errors/unimplemented functions before accepting the response from the LLM."
Indeed, I do have some ideas on how to add this.
This is something I've been thinking a lot about (a way to set context for an LLM against my own code), thank you for putting this out. Looks really polished.
Thanks! Please let me know how it goes for you if you try it :)
Very small nit: it'd be nice to be able to specify an OpenAI org in case multiple orgs exist.
Ok, I made a note to add that. Thanks for the feedback!
This looks so damn good! Can't wait to try it in the morning!
Thanks! Please let me know how it goes for you.
Looks really interesting. Is it wrapping git for the rollback and diffing stuff? If I were a user I'd probably opt to use git directly for that sort of thing.
Yes, it does use git underneath, with the idea of exposing a very simple subset of git functionality to the user. There's also some locking and transaction logic involved to ensure integrity and thread safety, so it wouldn't really be straightforward to expose the repo directly.
I tried to build the backend so that postgres, the file system, and git would combine to form effectively a single transactional database.
I appreciate in the copy here that you are not claiming plandex to be a super dev or some such nonsense.
I really dislike the hype marketing in some other solutions.
Thanks! I agree. I think the key to working effectively with LLMs is to understand and embrace their limitations, using them for tasks they're good at while not spinning your wheels on tasks (or parts of tasks) that they aren't yet well-suited for.
As someone who is trying to build a bootstrapped startup in spare time (read: coding while tired), this is amazing. Thank you so much for creating it.
Thanks! I agree it's great for coding while tired. I also like it when I'm procrastinating or feeling lazy. I find it helps to reduce the activation energy of getting started.
What is the cost of planning and working through, let's say, a manageable issue in a repo? Does it make sense to use 3.5/Sonnet or some lower cost endpoint for these tasks?
It's hard to put a precise number on it because it depends on exactly how much context is loaded, how many model responses the task needs to finish, and how much iteration you need to do in order to get the results you're looking for.
That said, you can do quite a meaty task for well under $1. If you're using it heavily it can start to add up over time, so you'd just need to weigh that cost against how you value your time I suppose. In the future I do hope to incorporate fine tuned models that should bring the cost down, as well as other model options like I mentioned in the post.
You can try different models and model settings with `plandex set-model` and see how you go. But in my experience gpt-4 is really the minimum bar for getting usable results.
Congrats on the launch.
Thank you!
Are you using plandex to write improvements to plandex?
Yes, quite often! Some of the most complex bits involving stream handling and concurrency were easier to do myself, but it’s been very helpful for http handlers, CLI commands, formatted output, TUIs, AWS infrastructure, and a lot more. I’ve also used it to track down bugs.
Hi! Is it possible to tell Plandex that the code should pass all tests in, e.g., `tests.py`?
Hey! Not in an automated way (yet). But you can get pretty close by building your plan, applying it, and then piping the output of your tests back into Plandex:
pytest tests.py | plandex load
plandex tell "update the plan files to fix the failing tests from the included pytest output"
Love this. Super excited about AI SWEs, will give it a try.
Awesome, thank you!
This looks neat, I can't wait to try it out.
Thanks! Let me know how it goes :)
Wow, this is phenomenal! I can't wait to dig in. This is almost exactly the application I've been envisioning for my own workflow. I'm excited to contribute!
Thank you! Awesome, I'm glad to hear that! Looking forward to your thoughts, and your contributions :)
I wanted to get a better idea of how it worked, so I asked Claude to write up an overview. https://gist.github.com/CGamesPlay/8c2a2882c441821e76bbe9680...
This is really cool! And quite accurate.
If this thing really worked, why wouldn't you just point it at AWS documentation and ask it to implement the exact same APIs and come up with designs for the datacenters in extreme detail? Implementing APIs is completely legal.
Supporting more models, including Claude, Gemini, and open source models is definitely at the top of the roadmap. Would that make it less annoying? :)
Not affiliated with the project but you could use something like OpenRouter to give users a massive list of models to choose from with fairly minimal effort
https://openrouter.ai/
Thanks, I need to spend some time digging into OpenRouter. The main requirement would be reliable function calling and JSON, since Plandex relies heavily on that. I'm also expecting to need some model-specific prompts, considering how much prompt iteration was needed to get things behaving how I wanted on OpenAI.
I've also looked at Together (https://www.together.ai/) for this purpose. Can anyone speak to the differences between OpenRouter and Together?
I can't speak to the differences between OpenRouter and Together, but the OpenRouter endpoint should work as a drop-in replacement for OpenAI API calls after replacing the endpoint URL and the value of $OPENAI_API_KEY. The model names may differ from other APIs, but everything else should work the same.
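For example (a quick sketch, assuming the client reads the usual OpenAI environment variables):

    export OPENAI_BASE_URL=https://openrouter.ai/api/v1
    export OPENAI_API_KEY=$OPENROUTER_API_KEY    # your OpenRouter key
    # then request models by their OpenRouter names, e.g. "anthropic/claude-3-opus"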
Awesome, looking forward to trying it out.
Would love to hear any feedback from people who have gotten to know OpenRouter, as well as any similar tools.
I think Mistral-2-Pro would work really well for this, judging by the great results I've had with it on another project that's heavy on tool calling [1].
[1] https://github.com/radareorg/r2ai
Thanks, I'll give it a try. Plandex's model settings are version-controlled like everything else and play well with branches, so it will be fun to start comparing how all different kinds of models do vs. each other on longer coding tasks using a branch for each one.
For challenging tasks, I typically get code outputs from all three top models (GPT-4, Opus, and Ultra) and pick the best one. It would be nice if your tool could simplify this for me: run all three models and perhaps even facilitate some type of model interaction to produce a better outcome.
Definitely, I'm very interested in doing something along these lines.
https://github.com/ollama/ollama
I think OpenAI is still the best of the bunch. I kind of feel like the others are just there to make people realize OpenAI works best. Maybe that will change when Gemini 1.5 is released?
I’m moving an inordinate amount of data between the ChatGPT browser window and my IDE (a lot through copying and pasting), and this demonstrates two things: 1) ChatGPT is incredibly useful to me, and 2) the workflow UX is still terrible. I think there is room for building innovative UXs with OpenAI, and so far what I’ve seen in JetBrains and VSCode isn’t it…
That was also my experience and thought process.
Every program is a wrapper around a CPU, so annoying.
But the open source models have OpenAI-compatible APIs, so as long as you can set the API endpoint you can use whatever you want.
OpenAI API is simply a utility. The question is given this utility, how does one find the right use case, structure the correct context, and build the right UX.
OP has certainly built something interesting here and added significant value on top of the base utility of the OpenAI API.