Are these libraries for connecting to an ollama service that the user has already installed or do they work without the user installing anything? Sorry for not checking the code but maybe someone has the same question here.
I looked at using ollama when I started making FreeChat [0] but couldn't figure out a way to make it work without asking the user to install it first (think I asked in your discord at the time). I wanted FreeChat to be 1-click install from the mac app store so I ended up bundling the llama.cpp server instead which it runs on localhost for inference. At some point I'd love to swap it out for ollama and take advantage of all the cool model pulling stuff you guys have done, I just need it to be embeddable.
My ideal setup would be importing an ollama package in Swift which would start the server if the user doesn't already have it running. I know it's just JS and Python to start, but a dev can dream :)
Either way, congrats on the release!
On the subject of installing Ollama, I found it to be a frustrating and user-hostile experience. I instead recommend the much more user-friendly LLM[0] by Simon Willison.
The problems with Ollama include:
* Ollama silently adds a login item with no way to opt out: <https://github.com/jmorganca/ollama/issues/162>
* Ollama spawns at least four processes, some persistently in the background: 1 x Ollama application, 1 x `ollama` server component, 2 x Ollama Helper
* Ollama provides no information at install time about what directories will be created or where models will be downloaded.
* Ollama prompts users to install the `ollama` CLI tool, with admin access required, with no way to cancel, and with no way to even quit the application at that point. Ollama provides no clarity about what is actually happening during this step: all it is doing is symlinking `/Applications/Ollama.app/Contents/Resources/ollama` to `/usr/local/bin/`
The worst part is that not only is none of this explained at install time, but the project README doesn’t tell you any of this information either. Potential users deserve to know what will happen on first launch, but when a PR arrived to at least provide that clarification in the README, Ollama maintainers summarily closed that PR and still have not rectified the aforementioned UX problems.
As an open source maintainer myself, I understand and appreciate that Ollama developers volunteer their time and energy into the project, and they can run it as they see fit. So I intend no disrespect. But these problems, and a seeming unwillingness to prioritize their resolution, caused me to delete Ollama from my system entirely.
As I said above, I think LLM[0] by Simon Willison is an excellent and user-friendly alternative.
[0]: https://llm.datasette.io/
"User hostile experience" is complete hyperbole and disrespectful to the efforts of the maintainers of this excellent library.
It's very, very, very annoying how much some people are tripping over themselves to pretend a llama.cpp wrapper is some gift of love from saints to the hoi polloi. Y'all need to chill. It's good work, and it's good. It's not great, or the best thing ever, or particularly high on either simple user-friendliness or power-user friendliness. It's young. Let it breathe. Let people speak.
What troubles me is how many projects are using ollama. I can't stand that I have to create a Modelfile for every model when using ollama. I have a terabyte of models that are mostly GGUF, which is somewhere around 70 models of various sizes. I rotate in and out of new versions constantly. GGUF is a container that already has most of the information needed to run the models! I felt like I was taking crazy pills when so many projects started using it for their backend.
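For anyone who hasn't hit this: every GGUF already sitting on disk has to be registered with its own Modelfile before ollama will use it, roughly like the sketch below (done here through the ollama-python client; the GGUF path, model name, and parameter are placeholders, and I'm assuming `create()` accepts a Modelfile string):

```python
import ollama

# One of these per GGUF you want to use; path and parameter are placeholders.
modelfile = """
FROM ./models/mistral-7b-instruct.Q4_K_M.gguf
PARAMETER temperature 0.7
"""

# Register the GGUF under a name that ollama will recognise from then on.
ollama.create(model="mistral-7b-instruct", modelfile=modelfile)
```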
Text-generation-webui is leagues ahead in terms of plug and play. Just load the model and it will get you within 98% of what you need to run any model from HF. Making adjustments to generation settings, prompt and more is done with a nice GUI that is easily saved for future use.
Using llama.cpp is also very easy. It takes seconds to build on my Windows computer with CMake. Compiling llama.cpp with different parameters for older/newer/non-existent GPUs is very, very simple... even on Windows, even for a guy who codes in Python 97% of the time and doesn't really know a thing about C++. The examples folder in llama.cpp is a gold mine of cool things to run, and they get packaged up into *.exe files for dead simple use.
It's not hyperbole when he listed multiple examples and issues which clearly highlight why he calls it that.
I don't think there was anything hyperbolic or disrespectful in that post at all. If I was a maintainer there and someone put in the effort to list out the specific issues like that I would be very happy for the feedback.
People need to stop seeing negative feedback as some sort of slight against them. It's not. Any feedback should be seen as a gift, negative or positive alike. We live in a world of massive competition for attention, so getting anyone to spend the time to use, test, and go out of their way to write out detailed feedback on something you provide is free information. Not just free information, but free analysis.
I really wish we could all understand and empathize that frustration with software has nothing to do with the maintainers or devs unless it's directly targeted at them.
You could possibly say that the overall tone of the post was "disrespectful" because of its negativity, but I think receiving that kind of post, which doesn't just tie the issues together in some bland objective manner but appropriately highlights the biggest pain points and how they're pain points in the context of a workflow, is incredibly useful.
I am constantly pushing and begging for this feedback on my work, so to get this for free is a gift.
Indeed, I thought the user experience was great. Simple way to download, install and start: everything just worked.
What I said is an utterly factual statement: I found the experience to be user-hostile. You might have a different experience, and I will not deny you your experience even in the face of your clearly-stated intention to deny me mine.
Moreover, I already conveyed my understanding of and appreciation for the work open-source maintainers do, and I outright said above that I intend no disrespect.
nix-shell makes most of this go away, except the ollama files will still be in `~/.ollama` which you can delete at any time.
Run `nix-shell -p ollama` in two tmux windows, then `ollama serve` in one and `ollama run <model>` in the other. Exit and all the users, processes, etc. go away.
https://search.nixos.org/packages?channel=23.11&show=ollama&...
Is this any different from just installing the Linux binary?
The Linux binary (pre-built or packaged by your distro) is just a CLI. The Mac binary instead also contains a desktop app.
I agree with OP that this is very confusing. The fact that the macOS installation comes with a desktop app is not documented anywhere at all! The only way you can discover this is by downloading the Mac binary.
I got the same feeling. I think it's generally bad practice to ask a user for their admin password without a good rationale as to why you're asking, particularly if it's non-obvious. It's the 'trust me bro' approach to security: even if this is a trustworthy app, it encourages the behaviour of just going ahead and entering your password without asking too many questions.
The install on Linux is the same. You're essentially encouraged to just pipe the install script from their site straight into `sh`, which is generally a terrible idea. Of course you can read the script first, but that misses the point: that's clearly not the intended behaviour. As other commenters have said, it is convenient. Sure.
https://github.com/ollama/ollama/blob/main/docs/linux.md
They have manual install instructions if you are so inclined.
You don’t sound like the kind of user ollama was meant to serve. What you are describing is pretty typical of macOS applications. You were looking for more of a traditional Linux style command line process or a Python library. Looks like you found what you were after, but I would imagine that your definition of user friendly is not really what most people understand it to mean.
Respectfully, I disagree. Not OP, but this "installer" isn't a standard macOS installer. With a standard installer I can pick the "show files" menu option and see what's being installed and where. This is home-rolled and uses what could arguably be considered shady dark patterns. When Zoom and Dropbox did similar things, they were rightly called out, as should this be.
I agree that alternative is good, but if you want to try ollama without the user experience drawbacks, install via homebrew.
There's also a docker container (that I can recommend): https://hub.docker.com/r/ollama/ollama
Big fan of Simon Willison's `llm`[1] client. We did something similar recently with our multi-modal inference server, which can be called directly from the `llm` CLI (cf. "Serving LLMs on a budget" [2]). There's also `ospeak` [3], which we'll probably try to integrate so you can talk to your LLM from the console. Great to see tools that radically simplify the developer experience for local LLMs/foundation models.
[1] https://github.com/simonw/llm
[2] https://docs.nos.run/docs/blog/serving-llms-on-a-budget.html...
[3] https://github.com/simonw/ospeak
From the points you raised about Ollama, I think it boils down to a level of oblivious disrespect for the user. I'm sure it's completely unintentional on the devs' part, simply not prioritising the important parts, which might be a little boring for them to spend time on, but to be taken seriously as a professional product I would expect more. Just because other apps may not have the same standards of full disclosure either, it shouldn't be normalised if you want to be fully respected by other devs as well as the general public - after all, other devs who appreciate good standards will also be likely to promote a product for free (as you did for LLM[0]), so why waste the promotion opportunity when it results in even better code and disclosure?
Just for connecting to an existing service: https://github.com/ollama/ollama-python/blob/main/ollama/_cl...
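Something like this, assuming a server is already running locally (a minimal sketch; the model name is a placeholder and the host is just the usual default):

```python
from ollama import Client

# Talk to an Ollama server that is already running on the default port.
client = Client(host="http://localhost:11434")

response = client.chat(
    model="llama2",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)
print(response["message"]["content"])
```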
For the client API it's pretty clear: you pass the host when constructing a `Client`. But I don't quite get how the example in "Usage" can work, since there is no parameter for host and/or port. Once you have a custom `client` you can use it in place of `ollama`.
Thanks. I don't have the service installed on my computer RN, but I assume the former works because it by default uses a host (localhost) and port number that is also the default for the ollama service?
Exactly that. The client's host option has a default: https://github.com/ollama/ollama-python/blob/main/ollama/_cl...
It's also overridable with the OLLAMA_HOST env var. The default imported functions are then based off a no-arg constructed client: https://github.com/ollama/ollama-python/blob/main/ollama/__i...
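In other words, roughly this pattern (a sketch of the idea, not the library's actual source):

```python
from ollama import Client

# Sketch of the pattern: a module-level client built with no arguments, whose
# host falls back to OLLAMA_HOST if set, otherwise http://localhost:11434.
_default_client = Client()

def chat(*args, **kwargs):
    # Bare `ollama.chat(...)` calls just forward to the default client.
    return _default_client.chat(*args, **kwargs)
```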