
I’m writing a new vector search SQLite Extension

alexgarcia-xyz
25 replies
14h6m

Author here — happy to answer any questions! This is more of an "I'm working on a new project" post than an official release; the extension itself is still a work in progress. A link to the project: https://github.com/asg017/sqlite-vec

I have a pretty specific vision of what v0.1.0 of this extension will look like, but it'll take a few more weeks to get there. This blog post was more for letting users of sqlite-vss (a previous vector search SQLite extension I wrote) know what will be coming next. There will be a much bigger release when that is ready.

But in general, I'm super excited to have an easy embeddable vector search alternative! Especially one that runs on all operating systems, in WASM, on mobile devices, on Raspberry Pis, etc. I'm personally trying to run a lil' semantic search app on my Beepy[0], which is a ton of fun to play with.

[0] https://beepy.sqfmi.com/

lsb
5 replies
13h47m

Would this implement indexing strategies like HNSW? Linear scan is obviously great to start out with, and can definitely be performant, especially if your data is in a reasonable order and under (say) 10MB, so this shouldn't block a beta release.

Also, do you build with sql.js-httpvfs? These would go together great: https://github.com/phiresky/sql.js-httpvfs

alexgarcia-xyz
4 replies
13h38m

The initial v0.1.0 release will only have linear scans, but I want to support ANN indexes like IVF/HNSW in the future! Wanted to focus on fullscans first to make things easier.

In my experiments you can get pretty far with linear scans using sqlite-vec. It depends on the number of dimensions of course, but I'd expect it to handle searching hundreds of thousands of vectors, maybe up to a million, with sub-second latency, especially if you tweak some settings (page_size, mmap_size, etc.). And if you quantize your vectors (int8 reduces size 4x, binary 32x) you can get even faster, at the expense of quality. Or use Matryoshka embeddings[0] to shave down dimensions even more.
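To make that concrete, here's roughly the shape a brute-force KNN query takes against a vec0 virtual table (a sketch: I modeled the API on FTS5-style MATCH queries, and the exact syntax may still change before v0.1.0):

  CREATE VIRTUAL TABLE vec_movies USING vec0(
    synopsis_embedding float[768]
  );

  -- full scan: compare the query against every stored vector,
  -- keep the 20 nearest
  SELECT rowid, distance
  FROM vec_movies
  WHERE synopsis_embedding MATCH :query_embedding
  ORDER BY distance
  LIMIT 20;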

Building sql.js-httpvfs would be cool! Though I'm using the "official" SQLite WASM builds which came out after sql.js, so I don't think it'll be compatible out of the box. Would be very curious to see how much effort it would be to make a new HTTP VFS for SQLite's new WASM builds

[0] https://huggingface.co/blog/matryoshka

sroussey
2 replies
13h23m

Might have a look at this library:

https://github.com/unum-cloud/usearch

It does HNSW and there is a SQLite related project, though not quite the same thing.

alexgarcia-xyz
1 replies
13h15m

Thanks for sharing! I've looked into usearch before, it's really sleek, especially all their language bindings. Though I want sqlite-vec to have total control over what stays in-memory vs on-disk during searches, and most vector search libraries like usearch/hnswlib/faiss/annoy either always store in-memory or don't offer hooks for other storage systems.

Additionally, sqlite-vec takes advantage of some SQLite-specific APIs, like BLOB I/O [0], which I hope will speed things up a ton. It's a ton more work, coming up with new storage solutions that are backed by SQLite shadow tables, but I think it'll be worth it!

And I also like how sqlite-vec is just a single sqlite-vec.c file. It makes linking + cross-compiling super easy, and since I got burned relying on heavy C++ dependencies with sqlite-vss, a no-dependency rule feels good. Mostly inspired by SQLite's single-file sqlite3.c amalgamation, and Josh Baker's single-file C projects like tg[1].

[0] https://www.sqlite.org/c3ref/blob_open.html

[1] https://github.com/tidwall/tg

ashvardanian
0 replies
3h54m

USearch author here :)

You are right. I don't yet support pluggable storage systems, but you can check out Lantern's fork of USearch. It may have the right capabilities, as they wanted the same level of integration for Postgres.

Our current SQLite extension brings "search APIs" to SQLite but not the index itself. Think of it as a bundle of SIMD-vectorized Cosine/Dot/L2/Hamming/Jaccard... (for vectors) and Levenshtein/Hamming/NW (for text) distances coming from SimSIMD and StringZilla. Unlike USearch, the latter are pure C99 header-only libraries, in case you need smth similar.

Best of luck with your project!

jeanloolz
3 replies
11h4m

I originally added sqlite-vss (your original vector search implementation) on Langchain as a vectorstore. Do you think this one is mature enough to add on Langchain, or should I wait a bit?

Love your work by the way, I have been using sqlite-vss on a few projects already.

alexgarcia-xyz
1 replies
3h53m

Hey, really cool to see the sqlite-vss + langchain integration! You could try a langchain integration with sqlite-vec now; there's an (undocumented) sqlite-vec pypi package you can install that's similar to the sqlite-vss one. I'd only use it for dev stuff right now (or stick to alpha releases); things will be much more stable when v0.1.0 comes out. Though I doubt the main SQL API (the vec0 virtual table) syntax will change much between now and then.

jeanloolz
0 replies
3h38m

Cool beans! I'll look into it soon then

davedx
0 replies
7h18m

He says in the blog post and here that this isn't finished yet

xyc
2 replies
13h27m

This is awesome! We used the Qdrant vector DB for a local AI RAG app https://recurse.chat/blog/posts/local-docs#vector-database, but were eyeing sqlite-vss as a lightweight embedded solution. Excited to see you are making a successor. Would be interested to learn about latency and scalability benchmarks.

alexgarcia-xyz
1 replies
13h13m

That's great to hear, thanks for sharing! Will definitely have benchmarks to share later

Would love to hear about the scale of the vector data you're working with in your local app. Do you actually find yourself with > 1 million vectors? Do you get away with just storing it all in-memory?

xyc
0 replies
12h54m

I don't think we need > 1 million vectors yet. But we plan to target folders of local documents (such as an Obsidian vault) or to store web search results, which could rack up a large number of vectors. Persistence on disk was also a desired feature when looking at the choices, because we don't want to reindex existing files.

Benchmarks also aren't everything; ease of embedding and integrating with existing SQLite apps could make sqlite-vec stand out and help with adoption.

tipsytoad
2 replies
10h34m

Looks really nice, but the only concern I had — how does the perf compare to more mature libraries like faiss?

alexgarcia-xyz
1 replies
3h47m

Benchmarking for this project is a bit weird, since 1) only linear scans are supported, and 2) it's an "embeddable" vector search tool, so it doesn't make a lot of sense to benchmark against "server" vector databases like qdrant or pinecone.

That being said, ~generally~ I'd say it's faster than using numpy and tools like txtai/chromadb. Faiss and hnswlib (bruteforce) are faster because they store everything in memory and use multiple threads. But for smaller vector indexes, I don't think you'd notice much of a difference. sqlite-vec has some support for SIMD operations, which speeds things up quite a bit, but Faiss still takes the cake.
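If you want rough numbers for your own data, the sqlite3 CLI's built-in timer is a decent start (the vec_movies table and query shape here are just a sketch of the vec0 API, which may still change):

  .timer on
  SELECT rowid, distance
  FROM vec_movies
  WHERE synopsis_embedding MATCH :query_embedding
  ORDER BY distance LIMIT 20;
  -- the CLI then prints a "Run Time: real ... user ... sys ..." line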

dmezzetti
0 replies
1h30m

Author of txtai here - great work with this extension.

I wouldn't consider it a "this or that" decision. While txtai does combine Faiss and SQLite, it could also utilize this extension. The same task was just done for Postgres + pgvector. txtai is not tied to any particular backend components.

nnx
1 replies
12h19m

Interesting idea.

Do you intend to compress the vector storage in any way and do you intend to implement your own vector search algorithms or reuse some already optimized libraries like usearch?

alexgarcia-xyz
0 replies
11h43m

I don't have plans for compressed vector storage. There is support for int8/binary quantization, which can reduce the size of stored vectors drastically, but impacts search quality quite a lot. I'd like to support something like product quantization[0] in the future, though!
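At the schema level, quantization is mostly just a narrower column type. A sketch (the int8[] type is assumed here by analogy with the bit[] example elsewhere in this thread; syntax may change before v0.1.0):

  -- float32: 1536 dims * 4 bytes = 6144 bytes per vector
  CREATE VIRTUAL TABLE vec_full USING vec0(embedding float[1536]);

  -- int8 (assumed): 1536 * 1 byte = 1536 bytes per vector, 4x smaller
  CREATE VIRTUAL TABLE vec_int8 USING vec0(embedding int8[1536]);

  -- binary: 1536 / 8 = 192 bytes per vector, 32x smaller
  CREATE VIRTUAL TABLE vec_bits USING vec0(embedding bit[1536]);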

No plans for using usearch or another vector search library, either. I want to keep dependencies low to make compiling easy, and I want full control over how vectors are stored on-disk and in-memory.

[0] https://www.pinecone.io/learn/series/faiss/product-quantizat...

alexgarcia-xyz
0 replies
3h37m

It will! I've neglected that project a bit, but in some of the alpha builds I was able to get WASM to work, I just haven't documented or blogged about it yet. SQLite extensions in WASM written with sqlite-loadable-rs are a bit large, mostly because of Rust. But it's possible!

TachyonicBytes
1 replies
12h7m

Really nice to see Wasm there; vector search in SQLite usually hasn't been available in the browser.

Did you ever consider making it syntax compatible with pgvector, in order to have a common SQL vector DSL? I am sure the benefits are much smaller than the disadvantages, but I'm curious if it's possible.

alexgarcia-xyz
0 replies
11h47m

It's not possible to match pgvector's syntax with SQLite. SQLite doesn't have custom indexes, and since I want to have full control of how vectors are stored on disk, the only real way to do that in SQLite is with virtual tables[0]. And there's a few other smaller details that don't match (postgres operators like <->/<=>, SQLite's limited virtual table constraints, etc.), so matching pgvector wasn't a priority.

I instead tried to match SQLite's FTS5 full text search[1] extension where possible. It solves a similar problem: storing data in its own optimized format with a virtual table, so offering a similar API seemed appropriate.

Plus, there's a few quirks with the pgvector API that I don't quite like (like vector column definitions), so starting from scratch is nice!
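For a concrete contrast, here's the same top-5 query in both styles (the first statement is standard pgvector syntax; the second is a sketch of the FTS5-style shape described above, and its details may shift before v0.1.0):

  -- Postgres + pgvector: a distance operator on a regular column
  SELECT id FROM items ORDER BY embedding <-> '[0.1, 0.2, 0.3]' LIMIT 5;

  -- SQLite, FTS5-style: a virtual table queried through MATCH
  SELECT rowid, distance FROM vec_items
  WHERE embedding MATCH :query_embedding
  ORDER BY distance LIMIT 5;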

[0] https://www.sqlite.org/vtab.html

[1] https://www.sqlite.org/fts5.html

CGamesPlay
1 replies
9h46m

Very exciting and I can’t wait to try it. Dependency issues are the reason I am currently not using sqlite-vss (i.e. not using an index for my vector searches), but man do I wish for better performance.

At the risk of fulfilling a HN stereotype, may I ask why you ditched Rust for this extension?

alexgarcia-xyz
0 replies
3h31m

Went back and forth a lot, but the short version:

1. Having direct access to SQLite's C APIs is really useful, especially for BLOB I/O and some rarely-used APIs that sqlite-vec uses that aren't available in some SQLite/Rust bindings

2. Writing in Rust for this project would have meant adding a few dependencies which would make some builds a bit more complicated

3. Pure C means easier to compile for some targets I wanted to support, like WASM/mobile devices/raspberry pis

4. A single sqlite-vec.c/h file means you can drag+drop a single file into your C projects and "it just works"

5. I wanted full control over what stays in memory and what doesn't, and it's a bit easier to do so in C (at least in my brain)

6. Most vector search operations aren't too complicated, so I feel comfortable manually handling memory. You work with fixed-length vectors, which are pretty easy to write manual checks for. If it required a lot of string parsing/validation or other dicey work, then Rust would have been a godsend, but vector search specifically fits C pretty well.

7. Ton of C/C++ vector search examples I could reference from Faiss/hnswlib/annoy

HanClinto
0 replies
1h33m

Awesome work!!

* Which distance functions does this support? Looks like it supports binary vectors already -- is Hamming distance supported?

* How does the performance compare with sqlite-vss? I'm curious about the profiling numbers -- both in terms of query speed, as well as memory usage.

Overall, this looks absolutely fantastic, and I love the direction you're heading with all of this.

> Though initially, sqlite-vec will only support exhaustive full-scan vector search. There will be no "approximate nearest neighbors" (ANN) options. But I hope to add IVF + HNSW in the future!

I think this is 1000% the correct approach -- kudos for not over-complicating things initially! I've shipped on-device vector search (128-bit binary vectors, Hamming distance) and even with a database size of 200k+ entries, it was still fast enough to do full brute-force distance search on every camera frame -- even running on crappy phones it was fast enough to get 10+ fps, and nicer phones were buttery-smooth. It's amazing how frequently brute-force is good enough.

That said, for implementing ANN algorithms like HNSW and whatnot, my first thought is that it would be slick if these could be accomplished with a table index paradigm -- so that switching from brute-force to ANN would be as simple as creating an index on your table. Experimenting with different ANN algorithms and parameters would be accomplished by adjusting the index creation parameters, and that would let developers smoothly evaluate and iterate between the various options. Maybe that's where your mind is going with it already, but I figured I would mention it just in case.
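For what it's worth, pgvector already works this way: queries are exact scans by default, and creating or dropping an index toggles ANN on and off. The statements below are real Postgres/pgvector syntax; SQLite has no pluggable index types, so here they're purely illustrative of the paradigm:

  -- adding the index switches matching queries over to HNSW
  CREATE INDEX items_embedding_hnsw ON items USING hnsw (embedding vector_l2_ops);

  -- dropping it falls back to exact brute-force scans
  DROP INDEX items_embedding_hnsw;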

Overall, awesome writeup, and fantastic project!!

barakm
6 replies
14h23m

> They are binary vectors with 768 dimensions, which takes up 96 bytes (768 / 8 = 96).

I guess I’m confused. This is honestly the problem that most vector storage faces (“curse of dimensionality”) let alone the indexing.

I assume that you meant 768 dimensions * 8 bytes (for an f64), which is 6144 bytes. Usually these get shrunk with some (hopefully minor) loss, to something like an f32 or f16 (or smaller!).

If you can post how you fit 768 dimensions in 96 bytes, even with compression or trie-equivalent amortization, or whatever… I’d love to hear more about that for another post.

Ninja edit: Unless you’re treating each dimension as one-bit? But then I still have questions around retrieval quality

lsb
2 replies
14h21m

Binary, quantize each dimension to +1 or -1

You can try out binary vectors, compare them against quantizing every pair of dimensions to one of four values, and a lot more, by using a FAISS index on your data with Product Quantization (like PQ768x1 for binary features in this case): https://github.com/facebookresearch/faiss/wiki/The-index-fac...

barakm
1 replies
14h15m

Appreciate the link; but would still like to know how well it works.

If you have a link for that, I’d be much obliged

alexgarcia-xyz
1 replies
14h15m

Author here - ya, "binary vectors" means quantizing to one bit per dimension. Normally it would be 4 * dimensions bytes of space per vector (where 4 = sizeof(float)). Some embedding models, like nomic v1.5[0] and mixedbread's new model[1], are specifically trained to retain quality after binary quantization. Not all models do tho, so results may vary. I think in general for really large vectors, like OpenAI's large embedding model with 3072 dimensions, it kind of works, even if they didn't specifically train for it.
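The byte math, spelled out (vec_quantize_binary is the builtin quantizer that appears in an example later in this thread):

  -- float32: 768 dims * 4 bytes = 3072 bytes per vector
  -- binary:  768 dims * 1 bit   =   96 bytes per vector (32x smaller)
  SELECT vec_quantize_binary(:float768_embedding);  -- one bit per dimension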

[0] https://twitter.com/nomic_ai/status/1769837800793243687

[1] https://www.mixedbread.ai/blog/binary-mrl

barakm
0 replies
14h10m

Thank you! As you keep posting your progress, and I hope you do, adding these references would probably help warding off crusty fuddy-duddys like me (or at least give them more to research either way) ;)

queuebert
0 replies
2h47m

BTW, the "curse of dimensionality" technically refers to the relative sparsity of high-dimensional space and the need for geometrically increasing data to fill it. It has nothing to do with storage. And typically in vector databases the data are compressed/projected into a lower dimensionality space before storage, which actually improves the situation.

ncruces
4 replies
5h29m

Do you plan to only use public SQLite APIs, or do you expect to be appended to the amalgamation?

I'm definitely interested in something like this, but need to think through how I could distribute this separate from SQLite in my Wasm based Go bindings. So far everything C is bundled, because it's a lot simpler than Wasm "dynamic linking".

Also, you've mentioned incremental BLOB I/O, and you probably know this already, but keep in mind that BLOB I/O is never random access, as large BLOBs are stored as a linked list of pages.

alexgarcia-xyz
2 replies
3h40m

Only public SQLite APIs! So no need to append to amalgamation.

I'm a big fan of your wazero SQLite bindings! I actually plan on providing 1) CGO bindings to sqlite-vec and 2) a custom WASI build of sqlite-vec that can be used in go-sqlite3 directly. My plan was to build a sqlite3.wasm file using the build scripts in your repo. If you did want to support it in your project directly, I think you could just drop the sqlite-vec.c/h files into go-sqlite3/sqlite3 and be good to go.

Re incremental Blob I/O: I learned that the hard way! It's definitely the limiting factor on query speed for sqlite-vec. I've found that keeping the chunks relatively low in size (low MB's) and increasing the page_size strikes a good balance, but page_size in particular has consequences. PRAGMA mmap_size also helps out a ton, since it seems to keep pages in memory and makes overflow lookups faster, but that of course means a ton more memory usage. It's a difficult balance!
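For reference, those knobs are plain PRAGMAs (the values below are illustrative, not recommendations):

  PRAGMA page_size = 8192;       -- larger pages = fewer overflow pages per BLOB
  VACUUM;                        -- page_size only takes effect after a rebuild
  PRAGMA mmap_size = 268435456;  -- 256MB memory-mapped: faster overflow lookups,
                                 -- at the cost of more memory usage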

ncruces
1 replies
3h27m

OK, when you feel it's time, feel free to ping me on my repo, and I'll look into it.

A custom Wasm blob definitely works: it's an explicit design goal of my bindings that one can bring their own Wasm (e.g. because one wants to configure the build differently, or bought the SEE extension, or something). And if your extension turns out much like FTS5 that would work.

Still, "one big Wasm blob" is less flexible than I'd like, because all connections in an app (currently? maybe this could be lifted) need to use the same Wasm blob. Then (and if you want to import different extensions...) it becomes a fight over who got to initialize the global variable last.

So, I've been on-and-off looking into what it would take to "dynamically link" a C extension as a second Wasm, but... I'm not even sure it's possible.

alexgarcia-xyz
0 replies
3h13m

Ya dynamically linking extensions in WASM would be amazing, but really hard to see how it would work. DuckDB was able to figure it out[0], but I struggle to see how it would work for SQLite. Though if anything, I think your library would make it easier to solve, since you don't have to fight with the browser/JavaScript

[0] https://duckdb.org/2023/12/18/duckdb-extensions-in-wasm.html

koeng
0 replies
1h35m

Definitely interested in this for your wasm Go bindings!

goprams
3 replies
11h13m

> KNN style query goes brrrr

Why do devs write comments like this? What does "goes brrrr" add to anyone's understanding of the code? It's so annoying. Is it supposed to be amusing or cute? It's not.

The rest of the article is great but it's so jarring to see this sort of nonsense in amongst a solid piece of work.

It's like when you go to a conference and the presentations are peppered with unfunny "meme" pictures. Grow up, people.

timmy777
0 replies
10h47m

Keep in mind, a presentation (text or otherwise) is an art. Culture, background, experiences, and ideologies influence art.

... and to address your tone: did you know you can give feedback with empathy?

efdee
0 replies
9h57m

Why do people write comments like this? What does "grow up, people" add to anyone's understanding of the topic? It's so annoying. Is it supposed to make you look better? It doesn't.

dayjaby
0 replies
11h11m

Language evolves. Grow up yourself.

xrd
2 replies
12h52m

Very excited about this.

When running inside the browser, is sqlite-vec able to persist data into the browser-native IndexedDB? Or is this part left to the user? If you can share the thinking there I would appreciate it, even if that is, "no thinking yet!"

alexgarcia-xyz
1 replies
12h38m

It could! It's based on the official SQLite WASM build, so you can use the same persistence options[0] that are offered there. Not sure if IndexedDB specifically is supported, but localStorage/OPFS VFSs are available.

[0] https://sqlite.org/wasm/doc/trunk/persistence.md#kvvfs

samwillis
0 replies
11h40m

OP is correct, the official WASM SQLite build is a sync-only build and doesn't support async VFSs (which an IndexedDB VFS needs to be, as it's an async API, unless you load the whole file into memory).

The best option for IndexedDB-backed WASM SQLite is wa-sqlite; it offers both sync and async builds and a bunch of different VFSs, including an IndexedDB one. Note however that the async builds add significant overhead and reduce performance.

The most performant VFS is the "OPFS access handle pool" VFS that Roy developed for wa-sqlite, and which the official build also adopted as an option. That would be my recommendation for now.

codazoda
2 replies
14h58m

This is a lot like what I imagine "readme driven development" looks like. I'm curious if the author started with docs first.

alexgarcia-xyz
1 replies
14h10m

Thanks for the kind words!

I started with code first — the extension itself is already mostly written[0]. But it's one of those "20% of the effort gets you 80% of the work" situations, where the last 20% (error handling, fuzz tests, correctness testing) will take 80% of the time. But people already have questions about the current status of `sqlite-vss`, so I figured this "work-in-progress" blog post could answer some of them.

Though I do like the idea of starting with docs first! Especially with SQLite extensions, where what really matters is what the SQL API looks like (scalar functions, virtual tables, etc.). I definitely did a lot of sketching of what the SQL part of sqlite-vec should look like before writing most of the code.

[0] https://github.com/asg017/sqlite-vec/blob/main/sqlite-vec.c

peter_l_downs
0 replies
13h45m

You’ve done a great job communicating the project, in every aspect. Nice work. I’m excited to try this out!

alexgarcia-xyz
1 replies
11h37m

Yes it is! Sorry for not following up there. Actually, when I first read that ticket, it started me down the rabbit-hole of "how can I make sqlite-vss better," which eventually turned into "I should make sqlite-vec." So thanks for helping me go down this path!

With sqlite-vec's builtin binary quantization, you should be able to do something like:

  CREATE VIRTUAL TABLE vec_files USING vec0 (
    contents_embedding bit[1536]
  );

  INSERT INTO vec_files(rowid, contents_embedding)
    VALUES (1, vec_quantize_binary( /* 1536-dimension float vector here */ ));
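And a search against it would look something like this (a sketch; the MATCH/LIMIT shape follows the FTS5-style API I mentioned elsewhere in this thread, and may change before v0.1.0):

  SELECT rowid, distance
  FROM vec_files
  WHERE contents_embedding MATCH vec_quantize_binary(:query_embedding)
  ORDER BY distance
  LIMIT 10;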

ComputerGuru
0 replies
33m

Hey, you’re welcome! Looking forward to trying this at some point. Sooner if rust bindings are available, later if I have to contribute them.

My embedding is already binary, presumably I can bind a blob instead of the call to vec_quantize_binary?

yard2010
1 replies
7h34m

Thank you for creating sqlite-vss. It helped me learn how RAG works and implement one in my toy project. It was a bit hard to debug but it worked FLAWLESSLY on ubuntu when done right. I'm still using it. I'm glad you're making a new better version with no limiting deps! You are awesome. Thank you and good luck!

alexgarcia-xyz
0 replies
3h52m

Thanks for the kind words! Really cool to hear that people were able to use sqlite-vss, lol.

usgroup
1 replies
11h4m

Shallow insight, but separating the algorithmic/representational and SQLite-operational parts into their own C projects is possibly a good idea.

I'd expect the rate of evolution to be significantly different between the two parts, and if you are using an algorithm library adopted by others, you may get progress "for free".

alexgarcia-xyz
0 replies
3h56m

Oooh, I hadn't thought of separating the SQLite stuff and the vector search stuff. Good idea, will keep that in mind!

blizzardman
1 replies
10h15m

Are you looking for contributors by any chance?

alexgarcia-xyz
0 replies
3h55m

Not really for this beta release, but after v0.1.0 is released, I'll definitely have contributing guides + issues that people can tackle!

auraham
1 replies
13h32m

Awesome work! It would be great if you could share a post about how you developed this extension. I'm not familiar with SQLite extensions, so I'm not sure how to dig into the GitHub repo.

alexgarcia-xyz
0 replies
13h28m

I definitely plan to! I have a much larger list of SQLite extensions I've built here: https://github.com/asg017/sqlite-ecosystem

Here's a few other references you may enjoy if you wanna learn more about SQLite extensions:

- The single source file for sqlite-vec: https://github.com/asg017/sqlite-vec/blob/main/sqlite-vec.c

- sqlean, a project from Anton Zhiyanov which is a good base of great SQLite extensions: https://github.com/nalgeon/sqlean

- The official SQLite docs: https://www.sqlite.org/loadext.html

- The "hello world" SQLite extension example: https://www.sqlite.org/src/file/ext/misc/rot13.c

TheAnkurTyagi
1 replies
10h48m

This is great. We used the Qdrant vector DB for end-to-end automation in our AI RAG app at https://github.com/rnadigital/agentcloud, but I'm very excited to see you are making a successor. Any ETA on when it will be ready to use, and is there a quickstart guide? Maybe I can help by writing a blog post as well.

alexgarcia-xyz
0 replies
3h15m

I plan to have v0.1.0 in about a month or so! Definitely will include a ton of docs and a quickstart guide. There's an undocumented pip package "sqlite-vec" that you might be able to use now, if you wanted to call it from your Python "Agent Backend" directly.

zaeger
0 replies
11h5m

very good

rcarmo
0 replies
5h46m

Great news. Looking forward to it, as I'm constantly looking for simple solutions that work in constrained environments, and this looks like it's going to be in one of them.

fitzn
0 replies
1h36m

Cool stuff. I'm probably missing this, but where in the code are you ensuring that all feature vectors have the same number of dimensions (i.e., length)? From what I can tell, for a text value from sqlite, the code converts each char to a float and stores those bits in the vector. This could work if the hamming distance accounts for different length vectors, but that function appears to assume they are the same. Thanks for the clarification.

elitan
0 replies
10h50m

Vector search in SQLite is what's keeping me from switching from Postgres.

babox
0 replies
9h30m

very cool, i like sqlite so much

Trufa
0 replies
14h35m

I love this style of project: an OSS project for a very particular issue.

I keep thinking about what I can do in the TypeScript/Next.js/React ecosystem that's very useful to a technical niche, but I haven't had the inspiration yet.