hey hn, supabase ceo here
For background: we have a storage product for large files (like photos, videos, etc). The storage paths are mapped into your Postgres database so that you can create per-user access rules (using Postgres RLS)
This update adds S3 compatibility, which means that you can use it with thousands of tools that already support the protocol.
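For example, you can point the standard AWS SDK straight at your project's Storage endpoint. A rough sketch (treat the endpoint shape, region and credential names below as placeholders - the docs have your project's exact values):

```ts
// rough sketch: talking to Supabase Storage over the S3 protocol with the plain AWS SDK.
// the endpoint shape, region and env var names are placeholders - check the docs for your project.
import { S3Client, ListObjectsV2Command } from "@aws-sdk/client-s3";

const client = new S3Client({
  forcePathStyle: true, // buckets are addressed by path rather than subdomain
  region: "us-east-1", // your project's region
  endpoint: "https://<project-ref>.supabase.co/storage/v1/s3", // assumed endpoint shape
  credentials: {
    accessKeyId: process.env.SUPABASE_S3_ACCESS_KEY_ID!,
    secretAccessKey: process.env.SUPABASE_S3_SECRET_ACCESS_KEY!,
  },
});

// then use it exactly like you would against AWS S3
const { Contents } = await client.send(
  new ListObjectsV2Command({ Bucket: "avatars", Prefix: "public/" })
);
console.log(Contents?.map((o) => o.Key));
```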
I'm also pretty excited about the possibilities for data scientists/engineers. We can do neat things like dump postgres tables into Storage (as Parquet) and you can connect DuckDB/ClickHouse directly to them. We have a few ideas that we'll experiment with to make this easy
Let us know if you have any questions - the engineers will also monitor the discussion
Dear supabase. Please don’t get bought out by anyone and ruined. I’ve built too many websites with a supabase backend now to go back.
This is my biggest reservation towards Supabase. Google bought Firebase in 2014. I've seen Vercel run Next.js into the ground and fuck up their pricing for some short-term gains. And Figma almost got bought by Adobe. I have a hard time trusting products with heavy VC backing.
I think Vercel and Next.js are built by the same group of people: the team behind Now.sh, who started the company as Zeit and later renamed both the company and the product to Vercel.
Yes. That doesn't mean that they haven't run it into the ground.
Tell me about it.
My simple SSG Next.js static site loads much slower than my Jekyll site on GitHub pages.
And I can't figure out how to improve its speed or disable the "ISG" feature that I believe is to blame for the poor performance.
Not defending NextJS, I'm pretty out on it myself, but ISG requires a server to run. It pregenerates static content for defined pages and serves that until revalidating. If you've built a fully static bundle, nothing exists that would handle that incremental/revalidating logic.
I understand that ISG requires a server (Node.js runtime) to run it, that's why I want to disable it for my SSG Next.js static site, but I can't figure out how. I just want static hosting like S3+Cloudfront which is much faster.
You need to use static export:
https://nextjs.org/docs/app/building-your-application/deploy...
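That's a one-line config change (a sketch assuming a Next.js version new enough to read next.config.ts; older versions take the same option in next.config.mjs):

```ts
// next.config.ts - sketch of a fully static export (no Node.js runtime, no ISR/ISG)
import type { NextConfig } from "next";

const nextConfig: NextConfig = {
  // `next build` now emits plain HTML/CSS/JS into ./out,
  // deployable to S3+CloudFront, GitHub Pages, etc.
  output: "export",
};

export default nextConfig;
```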
Cool, so that's what I was missing all along!
Unfortunately looks like I can't use it now since I am using `next-plausible` which does require a Node.js proxy...
big reason why I decided on Flutter
What's the actual complaint here, other than company A buys company B?
the only reason i am using supabase is that it's cheap and i can export it to postgres, that's it.
i know that one day these founders will exit and it will be sold to AWS, Google or Microsoft cloud.
so it's a bit of a gamble, but that risk is offset by its super cheap cost and the ability to export the data and swap out the pieces (more work, but at that point you should be cashflow positive enough to pay engineers to do it for you).
Buy & Kill
https://insights.som.yale.edu/insights/wave-of-acquisitions-...
I’m not defending Vercel or VC backed companies per se, but I don’t understand your comments towards Vercel. They still offer a generous hobby plan and Next.js is still actively maintained open source software that supports self-hosting.
Heroku might be a better example of a company that was acquired and then shut down their free plan.
Supabase and Next.js are not the same. Supabase can be self-hosted but it's not easy to do so and there are a lot of moving parts. Most of my Next.js apps are not even on Vercel. They can easily be deployed to Netlify, Railway, Render, etc.
Supabase self-hosting is not difficult, but it makes no sense since some important features are missing, like reports and multiple projects.
You know the whole point of YC companies is to flip their equity on the public market, right, and then move on to the next one?
We know, but it screws over your existing customers when a very helpful tool is turned over to a greedy investment firm that’s gonna gut the product seeking the highest return
Firebase was such a terrible loss. I had been following it quite closely on its mailing list until the Google takeover, then it seemed like progress slowed to a halt. Also having big brother watching a common bootstrap framework's data like that, used by countless MVP apps, doesn't exactly inspire confidence, but that's of course why they bought it.
At the time, the most requested feature was a push notification mechanism, because implementing that on iOS had a steep learning curve and was not cross-platform. Then probably some more advanced rules to be able to do more functional-style permissions, possibly with variables, although they had just rolled out an upgraded rules syntax. And also having a symlink metaphor for nodes might have been nice, so that subtrees could reflect changes to others like a spreadsheet, for auto-normalization without duplicate logic. And they hadn't yet implemented an incremental/diff mechanism to only download what's needed at app startup, so larger databases could be slow to load. I don't remember if writes were durable enough to survive driving through a tunnel and relaunching the app while disconnected from the internet either. I'm going from memory and am surely forgetting something.
Does anyone know if any/all of the issues have been implemented/fixed yet? I'd bet money that the more obvious ones from a first-principles approach have not, because ensh!ttification. Nobody's at the wheel to implement these things, and of course there's no budget for them anyway, because the trillions of dollars go to bowing to advertisers or training AI or whatnot.
IMHO the one true mature web database will be distributed via something like Raft, have rich access rules, be log based with (at least) SQL/HTTP/JSON interfaces to the last-known state and access to the underlying set selection/filtering/aggregation logic/language, support nested transactions or have all equivalent use-cases provided by atomic operations with examples, be fully indexed by default with no penalty for row or column based queries (to support both table and document-oriented patterns and even software transactional memories - STMs), have column and possibly even row views (not just table views), use a copy-on-write mechanism internally like Clojure's STM for mostly O(1) speed, be evented with smart merge conflicts to avoid excessive callbacks, preferably with a synchronized clock timestamp ordered lexicographically:
https://firebase.blog/posts/2015/02/the-2120-ways-to-ensure-...
I'm not even sure that the newest UUID formats get that right:
https://uuid6.github.io/uuid6-ietf-draft/
Loosely this next-gen web database would be ACID enough for business and realtime enough for gaming, probably through an immediate event callback for dead reckoning, with an optional "final" argument to know when the data has reached consensus and was committed, with visibility based on the rules system. Basically as fast as Redis but durable.
A runner up was the now (possibly) defunct RethinkDB. Also PouchDB/PouchBase, a web interface for CouchDB.
I haven't had time to play with Supabase yet, so any insights into whether it can do these things would be much appreciated!
Wasn't there Parse from Firebase?
we don't have any plans to get bought.
we only have plans to keep pushing open standards/tools - hopefully we have enough of a track record here that it doesn't feel like lip service
Is it even up to you? Isn't it up to your Board of Directors (i.e investors) in the end?
we (the founders) control the board
Who controls the board, controls the company
Absolutely. I am so impressed with SB. It’s like you read my mind and then make what I’ll need before I realise.
(This is not a paid promotion)
we receive a lot of community feedback and ultimately there are only a few "primitives" developers need to solve 90% of their problems
I think we're inching closer to the complete set of primitives, which can be combined into second-order primitives (eg: queues/search/maps can all be "wrappers" around the primitives we already provide)
That's a neat way of thinking about it.
Thanks for an awesome product. Please also never get bought or make plans to in the future, or if you really, really, really have to then please not by google.
I can believe there are no plans, right now. But having raised over $100mm, the VCs will want to liquidate their holdings eventually. They have to answer to their LPs after all, and be able to raise their next funds.
The primary options I can think of that are not full acquisition are:
- company buys back stock
- VC sells on secondary market
- IPO
The much more common and more likely way for these VCs to make a multiple or home run on their return is to 10x+ their money by having a first- or second-tier cloud provider buy you.
I think there's a 10% chance that a deal with Google is done in the future, so their portfolio has Firebase for NoSQL and Firebase for SQL.
well before there was supabase I would use Firebase
so it would serve Google well if they matched what supabase is doing or bought them out
Having founded a database company that IPO'd (Couchbase) and seeing the kinds of customer relationships Supabase is making, an IPO seems a reasonable outcome.
plans*
*Subject to change
As long as you make it so if you do get bought a team of you can always fork and move on, it's about the best anyone can hope for.
Never used Supabase before but I'm very much comfortable with their underlying stack. I use a combination of postgres, PostgREST, PLv8 and Auth0 to achieve nearly the same thing.
I was unfamiliar with PLv8, found this:
“”” PLV8 is a trusted Javascript language extension for PostgreSQL. It can be used for stored procedures, triggers, etc.
PLV8 works with most versions of Postgres, but works best with 13 and above, including 14, [15], and 16. “””
https://plv8.github.io/
I'm a bit terrified of this as well. I have built a profitable product on the platform, and if it were to drastically change or go away, I'd be hosed.
A question about implementation, is the data really stored in a Postgres database? Do you support transactional updates like atomically updating two files at once?
Is there a Postgres storage backend optimized for storing large files?
We do not store the files in Postgres, the files are stored in a managed S3 bucket.
We store the metadata of the objects and buckets in Postgres so that you can easily query it with SQL. You can also implement access control with RLS to allow access to certain resources.
It is not currently possible to guarantee atomicity across 2 different file uploads since each file is uploaded in a single request; this seems like higher-level functionality that could be implemented at the application level
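As a sketch, one application-level approach with supabase-js is to treat a failed second upload as a signal to remove the first (bucket and paths here are hypothetical, error handling simplified):

```ts
// sketch: best-effort rollback across two uploads, since each upload is its own request.
// bucket and paths are hypothetical; error handling is simplified.
import { createClient } from "@supabase/supabase-js";

const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_ANON_KEY!);
const bucket = supabase.storage.from("documents");

const q1File = new Blob(["...report bytes..."]);
const appendixFile = new Blob(["...appendix bytes..."]);

const first = await bucket.upload("reports/2024/q1.pdf", q1File);
if (first.error) throw first.error;

const second = await bucket.upload("reports/2024/q1-appendix.pdf", appendixFile);
if (second.error) {
  // the second upload failed: remove the first so readers never see a half-written pair
  await bucket.remove(["reports/2024/q1.pdf"]);
  throw second.error;
}
```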
Oh.
So this is like, S3 on top of S3? That's interesting.
Yes indeed! I would call it S3 on steroids!
Currently, it happens to be S3 to S3, but you could write an adapter, let's say GoogleCloudStorage, and it would become S3 -> GoogleCloudStorage, or any other type of underlying storage.
Additionally, we provide a special way of authenticating to Supabase S3 using the SessionToken, which allows you to scope S3 operations to your users' specific access control
https://supabase.com/docs/guides/storage/s3/authentication#s...
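Roughly, the idea is that the user's JWT rides along as the S3 session token so RLS applies to every S3 call. A sketch is below, but note that the exact mapping of project ref / keys / JWT onto the three credential fields is an assumption here - the authentication doc above is the source of truth:

```ts
// sketch: user-scoped S3 access via the session token. which value goes in which
// credential field is an assumption here - the authentication doc above is the source of truth.
import { S3Client } from "@aws-sdk/client-s3";

const userAccessToken = "<user-jwt-from-supabase-auth>"; // placeholder for the signed-in user's JWT

const userScopedClient = new S3Client({
  forcePathStyle: true,
  region: "us-east-1",
  endpoint: "https://<project-ref>.supabase.co/storage/v1/s3", // assumed endpoint shape
  credentials: {
    accessKeyId: "<project-ref>",   // assumed
    secretAccessKey: "<anon-key>",  // assumed
    sessionToken: userAccessToken,  // RLS policies then apply to every S3 operation
  },
});
```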
What about second-tier cloud providers like Linode, Vultr or UpCloud? They all offer S3-compatible object storage; will I need to write an adaptor for these, or will it work just fine given their S3 compatibility?
Our S3 Driver is compatible with any S3 Compatible object Storage so you don’t have to write one :)
I’m confused about what directions this goes.
The announcement is that Supabase now supports (user) —s3 protocol—> (Supabase)
Above you say that (Supabase) —Supabase S3 Driver—> (AWS S3)
Are you further saying that that (Supabase) —Supabase S3 Driver—> (any S3 compatible storage provider) ? If so, how does the user configure that?
It seems more likely that you mean that for any application with the architecture (user) —s3 protocol—> (any S3 compatible storage provider), Supabase can now be swapped in as that storage target.
Gentle reminder here that S3 compatibility is a sliding window and without further couching of the term it's more of a marketing term than anything for vendors. What do I mean by this statement? I mean that you can go to cloud vendor Foo and they can tell you they offer S3-compatible APIs or clients, but then you find out they only support the most basic of operations, like 30% of the API. Vendor Bar might support 50% of the API and Baz 80%.
In a lot of cases, if your use case is simple, 30% is enough if you’re doing the most common GET and PUT operations etc. But all it takes is one unsupported call in your desired workflow to rule out that vendor as an option until such time that said API is supported. My main beef with this is that there’s no easy way to tell usually unless the vendor provides a support matrix that you have to map to the operations you need, like this: https://docs.storj.io/dcs/api/s3/s3-compatibility. If no such matrix is provided on both the client side and server side you have no easy way to tell if it will even work without wiring things in and attempting to actually execute the code.
One thing to note is that it's quite unrealistic for vendors to strive for 100% compat - there's some AWS-specific stuff in the API that will basically never be relevant for anyone other than AWS. But the current Wild West situation could stand some significant improvement
in case it's not clear why this is required, some of the things the storage engine handles are:
image transformations, caching, automatic cache-busting, multiple protocols, metadata management, postgres compatibility, multipart uploads, compatibility across storage backends, etc
The article does not mention: do you support pre-signed URLs?
https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-...
Thanks for asking this, we do not support signed URLs just yet, but it will be added in the next iteration
Presigned URLs are useful because the client app can upload/download directly from S3, saving the app server from this traffic. Does Row-Level Security achieve the same benefit?
Yes, I agree! Although I should have specified that we do support signed URLs https://supabase.com/docs/reference/javascript/storage-from-... just not in the S3 protocol yet :)
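For reference, the existing flow looks roughly like this (a sketch with supabase-js; bucket, path and expiry are made up):

```ts
// sketch: signed URLs via the standard Storage API (not the S3 protocol).
// bucket name, object path and expiry are hypothetical.
import { createClient } from "@supabase/supabase-js";

const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_ANON_KEY!);

// a time-limited URL the client can fetch directly, bypassing your app server
const { data, error } = await supabase.storage
  .from("invoices")
  .createSignedUrl("2024/april.pdf", 60 * 60); // valid for one hour
if (error) throw error;
console.log(data.signedUrl);
```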
Thanks for confirming! Could you maybe update https://supabase.com/docs/guides/storage/s3/compatibility to include s3-esque presigning as a feature to track?
Can you give an indication of what "next iteration" means in terms of timeline (even just ballpark)?
You specifically say "for large files". What's your bandwidth and latency like for small files (e.g. 20-20480 bytes), and how does it compare to raw S3's bandwidth and latency for small files?
You can think of the Storage product as an upload server that sits in front of S3.
Generally, you would want an upload server in front to accept uploads from your customers, because you want to do some sort of file validation, access control or other processing once the file is uploaded. The nice thing is that we run Storage within the same AWS network, so the upload latency is as small as it can be.
In terms of serving files, we provide a CDN out-of-the-box for any files that you upload to Storage, minimising latencies geographically
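A minimal sketch of that flow with supabase-js (assuming a public bucket named "assets"; names and paths are hypothetical):

```ts
// sketch: upload through the Storage upload server, then serve via the built-in CDN.
// assumes a public bucket named "assets"; names and paths are hypothetical.
import { createClient } from "@supabase/supabase-js";

const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_ANON_KEY!);

const file = new Blob(["hello"], { type: "text/plain" });
const { error } = await supabase.storage.from("assets").upload("docs/hello.txt", file);
if (error) throw error;

// public objects are served through the CDN at a stable URL
const { data } = supabase.storage.from("assets").getPublicUrl("docs/hello.txt");
console.log(data.publicUrl);
```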
A common pattern on AWS is to not handle the upload on your own servers. Checks are made ahead of time, conditions baked into the signed URL, and processing is handled after the fact via bucket events.
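For anyone unfamiliar with that pattern, a minimal sketch with the AWS SDK (bucket, key, size limit and expiry here are made up):

```ts
// sketch of the "presigned POST with conditions baked in" pattern described above.
// bucket, key, size limit and expiry are hypothetical; this runs on your server.
import { S3Client } from "@aws-sdk/client-s3";
import { createPresignedPost } from "@aws-sdk/s3-presigned-post";

const s3 = new S3Client({ region: "us-east-1" });

const { url, fields } = await createPresignedPost(s3, {
  Bucket: "user-uploads",
  Key: "avatars/user-123.png",
  Conditions: [
    ["content-length-range", 0, 5 * 1024 * 1024], // S3 itself rejects anything over 5 MB
    ["starts-with", "$Content-Type", "image/"],   // only images
  ],
  Expires: 300, // the form is valid for 5 minutes
});

// hand { url, fields } to the browser; it POSTs the file straight to S3,
// and a bucket event notification can trigger any post-processing.
console.log(url, fields);
```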
That is also a common pattern I agree, both ways are fine if the upload server is optimised accordingly :)
hey, supabase engineer here; we didn’t check that out with files that small, but thanks for the idea, i will try it out
the only thing i can say related to the topic is that s3 multipart significantly outperforms other methods for files larger than 50mb, but tends to have similar or slightly slower speeds compared to a regular s3 upload via supabase or the simplest supabase storage upload for files around 50mb and below.
s3 multipart is indeed the fastest way to upload files to supabase, with speeds up to 100mb/s (115 even) for files >500mb. but for files of about 5mb or less you don't need to change anything in your upload logic just for performance, cause you probably won't notice any difference
everything mentioned here is for upload only
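for reference, the multipart path looks roughly like this with the aws sdk against the same assumed endpoint (a sketch; bucket, key, part size and concurrency are arbitrary):

```ts
// sketch: s3 multipart upload to supabase storage via @aws-sdk/lib-storage.
// endpoint/credentials are the same assumed setup as earlier; part size and concurrency are arbitrary.
import { createReadStream } from "node:fs";
import { S3Client } from "@aws-sdk/client-s3";
import { Upload } from "@aws-sdk/lib-storage";

const client = new S3Client({
  forcePathStyle: true,
  region: "us-east-1",
  endpoint: "https://<project-ref>.supabase.co/storage/v1/s3", // assumed endpoint shape
  credentials: {
    accessKeyId: process.env.SUPABASE_S3_ACCESS_KEY_ID!,
    secretAccessKey: process.env.SUPABASE_S3_SECRET_ACCESS_KEY!,
  },
});

const upload = new Upload({
  client,
  params: { Bucket: "videos", Key: "raw/session-01.mp4", Body: createReadStream("session-01.mp4") },
  partSize: 8 * 1024 * 1024, // 8 MB parts
  queueSize: 4,              // up to 4 parts in flight at once
});
upload.on("httpUploadProgress", (p) => console.log(p.loaded, "/", p.total));
await upload.done();
```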
Hi, a question, but first some background. I've been looking at solutions to store columnar data with versioning, essentially Parquet. But I'd also like to store PDFs, CSVs, images, and such for our ML workflows. I wonder, now that Supabase is getting better for the data science / DuckDB crowd, could Supabase be that one solution for all of this?
yes, you can store all of these in Supabase Storage and it will probably "just work" with the tools that you already use (since most tools are s3-compatible)
Here is an example of one of our Data Engineers querying parquet with DuckDB: https://www.youtube.com/watch?v=diL00ZZ-q50
We're very open to feedback here - if you find any rough edges let us know and we can work on it (github issues are easiest)
Well, this is great news. I'll take "just works" guarantee any day ;)
We have yet to make a commitment to any one product. Having Postgres there is a big plus for me. I'll have to see about doing a test or two.
Shouldn't it be API rather than protocol?
Also my sympathies for having to support the so-called "S3 standard/protocol".
I think that protocol is appropriate here since S3 resources are often represented by an s3:// URL, where the scheme part of the URL is often used to represent the protocol.
Yes, both can be fine :) after all, a Protocol can be interpreted as a Standardised API which the client and server interact with, it can be low-level or high-level.
I hope you like the addition - the implementation is fully open source in the Supabase Storage server
Do you think Supabase Storage (now or in the future) could be an attractive standalone S3 provider as an alternative to e.g MinIO?
It's more of an "accessibility layer" on top of S3 or any other S3-compatible backend (which means that it also works with MinIO out of the box [0])
I don't think we'll ever build the underlying storage layer. I'm a big fan of what the Tigris[1] team have built if you're looking for other good s3 alternatives
[0] https://github.com/supabase/storage/blob/master/docker-compo...
[1] Tigris: https://tigrisdata.com
I'm still confused. Who is paying Amazon? Is it
1. You buy the storage from Amazon so I pay you and don't interact with Amazon at all or
2. I buy storage from Amazon and you provide managed access to it?
This is great news, and I agree with everyone in the thread - Supabase is a great product.
Does this mean that Supabase (via S3 protocol) supports file download streaming using an API now?
As far as I know, it was not achievable before and the only solution was to create a signed URL and stream using HTTP.
Yes, absolutely! You can download files as streams and make use of Range requests too.
The good news is that the standard API also supports streams!
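Roughly, a ranged streaming download over the S3 protocol looks like this (a sketch; the endpoint shape, credentials, bucket and byte range are assumptions):

```ts
// sketch: ranged, streaming download over the S3 protocol.
// endpoint/credentials mirror the assumed setup earlier; bucket, key and range are made up.
import { createWriteStream } from "node:fs";
import { pipeline } from "node:stream/promises";
import { Readable } from "node:stream";
import { S3Client, GetObjectCommand } from "@aws-sdk/client-s3";

const client = new S3Client({
  forcePathStyle: true,
  region: "us-east-1",
  endpoint: "https://<project-ref>.supabase.co/storage/v1/s3", // assumed endpoint shape
  credentials: {
    accessKeyId: process.env.SUPABASE_S3_ACCESS_KEY_ID!,
    secretAccessKey: process.env.SUPABASE_S3_SECRET_ACCESS_KEY!,
  },
});

// fetch only the first megabyte of a large file and stream it to disk
const { Body } = await client.send(
  new GetObjectCommand({ Bucket: "videos", Key: "raw/session-01.mp4", Range: "bytes=0-1048575" })
);
await pipeline(Body as Readable, createWriteStream("first-meg.bin"));
```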
I tried to migrate from Firebase once and it wasn't really straightforward, so I decided against doing it. I think you guys (if you haven't already) should make migration plugins a first-class priority that "just works", as the number of real revenue-generating production projects on Firebase and similar platforms is much higher. It's a no-brainer that many of them would want to switch if it were safe and simple to do so.
we have some guides (eg: https://supabase.com/docs/guides/resources/migrating-to-supa...)
also some community tools: https://github.com/supabase-community/firebase-to-supabase
we often help companies migrating from firebase to supabase - usually they want to take advantage of Postgres with similar tooling.
Here is the example of the DuckDB querying parquet files directly from Storage because it supports the S3 protocol now - https://github.com/TylerHillery/supabase-storage-duckdb-demo
https://www.youtube.com/watch?v=diL00ZZ-q50
Yes. Duckdb works very well with parquet scans on s3 right now.
no feedback on this in particular, but I love supabase. I use it for several projects and it's been great.
I was hesitant to use triggers and PG functions initially, but after I got my migrations sorted out, it's been pretty awesome.
Do you manage your functions and triggers through source code? What framework do you use to do that? I like Supabase, but its desire to default to native pg stuff for a lot of that has kind of steered me away from using it for more complex projects where you need to use sprocs to retrieve data and pgTAP to test them, because hiding business logic away in the db like that is viewed as an anti-pattern in a lot of organizations. I love it for simple CRUD apps though, the kind where the default PostgREST functionality is mostly enough and having to drop into a sproc or build a view is rarely necessary.
I think if there was a tightly integrated framework for managing the state of all of these various triggers, views, functions and sprocs through source and integrating them into the normal SDLC, it would be a more appealing sell for complex projects
Is iOS support a priority for supabase?
Yes, we made it an official library this week: https://supabase.com/blog/supabase-swift
One of the big wins we get from AWS is that you can do things like load structured data files (csv, parquet) from S3 directly in Redshift using SQL queries.
https://docs.aws.amazon.com/redshift/latest/dg/t_loading-tab...
Just commenting to say I really appreciate your business model. Whereas most businesses actively seek to build moats to maintain competitive advantage and lock people in, actions like this paint a different picture of Supabase. I'll be swapping the API my app uses for Supabase Storage over to the S3 API this weekend, in case I ever need to switch.
My only real qualm at this point is that mapping JS entities using the JS DB API makes it hard to use camelCase field names, due to PG reasons I can't recall. I'm not sure what a fix for that would look like.
Keep up the good work.