What are the steps required to get this running in VS Code?
If they had linked to the instructions in their post (or better yet, linked to a one-click install of a VS Code extension), it would help a lot with adoption.
(BTW, I consider it malpractice that they are at the top of Hacker News with a model that is of great interest to a large portion of the users, and they do not have a monetizable call to action on the featured page.)
"All you need is users" doesn't seem optimal IMHO, Stability.ai providing an object lesson in that.
They just released weights and, being a for-profit, need to optimize for making money, not eyeballs. It seems wise to guide people to the API offering.
On top of Hacker News (exactly the target demographic of coders) without an effective monetizable call to action? What a missed opportunity.
GitHub Copilot makes $100M+/year, if not way, way more.
A VS Code extension for Mistral would be a revenue stream if it were one-click and better or cheaper than GitHub Copilot. In my mind it is malpractice not to be doing this if you are investing in creating coding models.
How the hell does Copilot make $100M/yr? That seems an order of magnitude higher than I would expect at the high end.
???
If we're talking individual subscriptions at ~$10/month, that's ~1M paying subscribers; honestly that number would not totally shock me.
plus they’ve got some kinda enterprise/team offering; assuming they charge extra there, I could easily see $100M ARR
but that’s pure conjecture, and generous at that; I don’t think we have any hard numbers
I was thinking the opposite... remember there are enterprise subscriptions and multi-million-dollar contracts with single companies.
Yeah, exactly: only $100M/yr? That barely covers expenses.
I see, that makes sense: make an extension and charge for it.
I assumed they meant free and local. It doesn't seem rational to make this one paid: it's significantly smaller than their better model, and even more so than Copilot's.
But they also signal competence in the space, which means M&A. Or big nation-states could hire them in the future to produce country models once the space matures, as was Emad's vision.
What does a "country model" mean? Optimized for that country's specific language, or with state propaganda or something else?
More or less; it was about as serious as your median Elon product tweet over the last decade, or median coin nonsense.
It was a half-baked idea: obviously the models would need to be tuned for different languages and country-specific knowledge, therefore countries would pay to do that.
There were many ideas like that, and none of them panned out, hence the defenestration. All love for the guy, he did a very, very good thing. It's just meaningless to invoke it here: not only is it completely off-topic (if anything, that's already Mistral's play as the EU champion), but the Stability gentleman was just thinking out loud, nothing more.
If you believe LLMs are going to end up built into everything and doing everything, from moderating social media to writing novels and history books, making such a model will be the most political thing that has ever happened.
If your country believes guns=bad nipples=good war=hell but you get your novels and history books written by an LLM trained by people who believe guns=good nipples=bad war=heroic it would be naive to expect the output to reflect your values and not theirs.
Even close allies of the US would be nervous to have such power in the hands of American multinational corporations alone - so the French state could be very eager for Mistral to produce a competitive product.
Did Emad's vision ever manifest? E.g., did a nation-state end up paying Stability for a country model?
Would it help signal competency? They're a small team focused on making models, not VS Code extensions.
Would they do M&A? The founding team is ex-Googlers and has attracted significant attention in the M&A world by being an EU champion.
If you can run this using ollama, then you should be able to use https://www.continue.dev/ with both IntelliJ and VSCode. Haven’t tried this model yet - but overall this plugin works well.
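For reference, pointing Continue at a local Ollama model is just a config entry in ~/.continue/config.json. Here's a minimal sketch as a Python script that adds the entry; "provider": "ollama" is Continue's real Ollama provider, but the "codestral-mamba" tag is a placeholder until Ollama actually ships this model:

    import json, os

    # Assumes Continue has already created its config file.
    CONFIG = os.path.expanduser("~/.continue/config.json")

    with open(CONFIG) as f:
        config = json.load(f)

    # Register a local Ollama-served model with Continue.
    # NOTE: "codestral-mamba" is a hypothetical tag; use whatever
    # tag Ollama publishes once the model is supported.
    config.setdefault("models", []).append({
        "title": "Codestral Mamba (local)",
        "provider": "ollama",
        "model": "codestral-mamba",
    })

    with open(CONFIG, "w") as f:
        json.dump(config, f, indent=2)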
They say no llama.cpp support yet, so no ollama yet (which uses llama.cpp)
Ollama is supported: https://docs.continue.dev/setup/select-provider
They meant that there is no support for Codestral Mamba for llama.cpp yet.
Correct. The only back-end that Ollama uses is llama.cpp, and llama.cpp does not yet have Mamba2 support. The issues to track Mamba2 and Codestral Mamba support are here:
https://github.com/ggerganov/llama.cpp/issues/8519
https://github.com/ggerganov/llama.cpp/issues/7727
Mamba support was added in March of this year:
https://github.com/ggerganov/llama.cpp/pull/5328
I have not yet seen a PR to address Mamba2.
Unrelated: all my devices freeze when accessing this page (desktop Firefox and Chrome, mobile Firefox and Brave). Is this the best alternative for code AI helpers in VS Code besides GitHub Copilot and Google Gemini?
I've been using it for a few months (with StarCoder 2 for code and GPT-4o for chat). I find the code completion actually better than GitHub Copilot's.
My main complaint is that the chat sometimes fails to correctly render some GPT-4o output (e.g. LaTeX expressions), but that's mostly fixed with a custom system prompt. It also significantly reduces the battery life of my MacBook M1, but that's expected.
I'm quite happy with Cody from Sourcegraph https://marketplace.visualstudio.com/items?itemName=sourcegr...
Looking through the Quickstart docs, they have an API that can generate code. However, I don't think they have a way to do "Day 2" code editing.
Also, it doesn't seem to have a freemium tier... you need to start paying even before trying it out?
"Our API is currently available through La Plateforme. You need to activate payments on your account to enable your API keys."
I signed up when Codestral was first available and put my payment details in. I've been using it daily since then with continue.dev, but my usage dashboard shows 0 tokens, and so far I have not been billed for anything... It's definitely not clear anywhere, but it seems to be free for now? Or there's some sort of free limit that I'm not hitting.
Through codestral.mistral.ai? It's free until August 1st: https://docs.mistral.ai/capabilities/code_generation/
I feel like local models could be an amazing coding experience because you could disconnect from the internet. Usually I need to open ChatGPT or Google every so often to solve some issue or generate some function, but this also introduces so many distractions. Imagine being able to turn off the internet completely and only have a chat assistant that runs locally. I fear, though, that it is just going to be a bit too slow at generating tokens on CPU to not be annoying.
I don't have a gut feel for how much difference the Mamba arch makes to inference speed, nor how much quantisation is likely to ruin things, but as a rough comparison Mistral-7B at 4 bits per param is very usable on CPU.
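Back-of-envelope, if you assume token generation is memory-bandwidth-bound (every generated token streams all the weights through memory once), decode speed is roughly bandwidth divided by weight size. The bandwidth figures below are ballpark numbers, not measurements:

    # Rough decode-speed estimate for a 7B model quantized to 4 bits/param,
    # assuming generation is memory-bandwidth-bound.
    params = 7e9
    bytes_per_param = 0.5                        # 4-bit quantization
    weights_gb = params * bytes_per_param / 1e9  # ~3.5 GB of weights

    # Ballpark memory bandwidths in GB/s.
    for name, bw in [("dual-channel DDR4 laptop", 40),
                     ("Apple M1", 68),
                     ("dual-channel DDR5 desktop", 80)]:
        print(f"{name}: ~{bw / weights_gb:.0f} tokens/s")

Mamba's advantage shows up elsewhere: per-token cost stays flat as the context grows, because there's a fixed-size recurrent state instead of an ever-growing KV cache, so long prompts shouldn't slow decoding down the way they do with a transformer.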
The issue with using any local model for code generation comes up in a professional context: you lose whatever infrastructure the provider has for avoiding regurgitation of copyrighted code, so there's a legal risk there. That might not be a barrier in your context, but in my day-to-day it certainly is.
The website codegpt.co also has a plugin for both VS Code and IntelliJ. When the model becomes available in Ollama, you can connect the plugin in VS Code to a local Ollama instance.
Currently the best (most user-friendly) way to run models locally is to use Ollama with Continue.dev. This one is not available yet, though: https://github.com/ggerganov/llama.cpp/issues/8519
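Once a model is pulled, Ollama exposes a local HTTP API (port 11434 by default) that's handy for a sanity check before wiring up an editor plugin. A minimal sketch; as above, the model tag is a placeholder until llama.cpp/Ollama actually support this model:

    import requests

    # Query the local Ollama server directly.
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "codestral-mamba",  # placeholder tag; not published yet
            "prompt": "Write a Python function that reverses a string.",
            "stream": False,             # return one JSON object, not a stream
        },
    )
    resp.raise_for_status()
    print(resp.json()["response"])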