Hey, HN. We’re thrilled to release Observable Framework today — a new open-source tool for developing data apps. I highly recommend viewing this example report adapted from our internal dashboard analyzing web logs:
https://observablehq.com/framework/examples/api/
This technique of “just plot everything” (7.6M requests as a scatterplot) has revealed surprising insights we’ve used to optimize our servers and better control traffic. We’re also sharing a more traditional dashboard that visualizes the adoption of our open-source visualization library (and in some ways the successor to D3), Observable Plot:
https://observablehq.com/framework/examples/plot/
In addition to releasing Observable Framework, we’ve also made Observable free again for individuals (including private notebooks and database connectors). Let me know if you have any questions!
This seems nice and the plots look great, but I have a hard time imagining switching to Observable from Plotly since there doesn't seem to be a way to make any plots interactive. By which I mean zoom and pan. The nearest-point highlight feature is nice, but what if I want to zoom in? None of the examples here seem to be able to do that, and a quick Google search doesn't make it seem like that's straightforward. That's not even additional code when I use Plotly, it's just built-in.
There's also the issue of convincing staff to use JS instead of Python which is still just a tough sell. I think everyone on my team (of data scientists) would look at me like I've got two heads if I were to suggest that. Maybe we're not the target demographic though.
I do like the idea of shipping the data straight to the client, but I don't have a lot of confidence in our corporate network doing well and not slowing stuff down. Perhaps the graphics are all sent pre-rendered over the wire though? I'm not sure, but it would be cool if Observable figured out a way to sidestep that issue.
We’re working on zooming and panning for Observable Plot (https://github.com/observablehq/plot/pull/1738) and other interactions such as brushing (https://github.com/observablehq/plot/pull/721) — all of this is already possible, we just haven’t packaged it up in a convenient way yet (https://github.com/observablehq/plot/pull/1871). And as skybrian pointed out, you can also get interactivity “for free” with Observable’s reactivity and re-rendering.
We’ve been focused primarily on the static display of visualizations because that’s what viewers see first, and often that’s the only thing they see. Relying too heavily on interaction places an onus on the user to find the insights; a good display of data should be opinionated about what it shows and guide the user to what is interesting.
We’re not trying to convince you to switch to JavaScript here — a main value prop of Observable Framework is that you can write data loaders in any language (Python, R, Go, Julia, etc.). So do all your data preparation and analysis in whatever language you like, and then do your front-end in JavaScript to leverage the graphics and interactive compute capabilities of modern browsers. It’s pipes and child_process.spawn under the hood. And you still get instant reactivity when you save changes to your data loaders (when you edit Python) because Framework watches files and pushes new data to the client with reactive hot data & module replacement.
And you can compress (aggregate or filter) the data as much as you like, so it’s up to you how much data you send to the client. For example your data loader could be a minimal CSV file that’s just the numbers you need for a bar chart. Or it could be a Parquet file and you use DuckDB (https://observablehq.com/framework/lib/duckdb) on the client to generate dynamic visualizations.
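To make the "minimal CSV" case concrete, here is a rough sketch of what a Python data loader could look like. It assumes the loader is an ordinary script whose standard output is captured as the data file; the file name (something like `status-counts.csv.py`), the sample log lines, and the helper names are all made up for illustration.

```python
# Hypothetical data loader, e.g. saved as status-counts.csv.py (the name is an assumption).
import csv
import sys
from collections import Counter

# Stand-in for raw web logs; a real loader would query a database or an API here.
RAW_LOG_LINES = [
    "GET /api/users 200",
    "GET /api/users 200",
    "POST /api/login 401",
    "GET /api/plots 500",
    "GET /api/plots 200",
]


def count_by_status(lines):
    """Aggregate raw log lines down to one row per HTTP status code."""
    counts = Counter(line.split()[-1] for line in lines)
    return sorted(counts.items())


def write_csv(rows, out):
    """Write the aggregated rows as CSV to the given text stream."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["status", "requests"])
    writer.writerows(rows)


if __name__ == "__main__":
    # Only the aggregated rows leave the loader, not the raw logs.
    write_csv(count_by_status(RAW_LOG_LINES), sys.stdout)
```

The point of the sketch is the shape, not the specifics: the heavy lifting stays in Python, and the client only ever receives the handful of numbers the chart needs.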
I also love how Observable Plot looks, but agree with top poster, the things that keep me from switching are: - Python wrapper - Out of the box interactivity
We’re working on the interactivity, but we’re not going to do a Python wrapper, as the goal of Observable Framework (and Plot) is to leverage web technologies with JavaScript in the front-end for interactive graphics, while using whatever language you like, including Python, on the back-end for data analysis. There is an R wrapper for Observable Plot (https://github.com/juba/obsplot) and so I imagine someone could write one for Python, but ultimately we don’t think you’ll need it with Framework’s polyglot data loaders since you can seamlessly move between languages (front-end and back-end).
FYI: Same dev building obsplot is also building a Python version at https://github.com/juba/pyobsplot
Ah, that’s what I thought at first, but I googled for “obsplot” and found the R one and thought I misremembered. Thank you for the correction.
Interesting!
What are your thoughts on PRQL (integrated with DuckDB)?
Also: sure, I get your point on opinionated viz, but this means someone already "played with the data" to figure out what to viz "cleanly", and then coded that
now what if we want to build a viz for this first "data playground" step, one that lets us rummage through the data? we would need some sort of interactivity
Observable is much more than its library, plot. You mean to compare plot to plotly.
There are a number of reasons to choose Observable’s plot over plotly, but to address your point, there is no lock-in here with using plot for the view - you can seemingly use any JS library, including plotly, vega, D3, etc., so I don’t think that’s a huge issue.
I agree with your point regarding convincing other scientists to use JavaScript - that was the biggest point of failure for Observable notebook adoption that I saw. (As an anecdote, rather than adopt Observable, my science team @bigtech decided to write a Jupyter -> interactive static site transpiler, so the scientists could continue their work in python). Observable 2.0 seems built on recognizing that friction, and making it much easier for non-js users to collaborate. But the npm dependency will still scare many data folks away.
To anyone from observable reading: I think getting mass adoption involves making this as seamless for python users as possible. (E.g. something similar to marimo notebooks or evidence). Also: great work!
I don’t think the `npm install` will scare people away (Evidence uses that, too), and we’ve definitely tried to make the onboarding process as guided as possible (shout-out to clack.cc for a great CLI prompt library):
https://observablehq.com/framework/getting-started
And plus you can import libraries directly from a CDN rather than needing to use npm or yarn to manage dependencies. (Though we plan on supporting the latter in the future, too.)
https://observablehq.com/framework/javascript/imports
See this example for getting started with Python:
https://github.com/observablehq/framework/tree/main/examples...
But of course we’d love to add more affordances and documentation for other languages. We’re naturally biased towards JavaScript as our focus has historically been on visualization, but I like to think we’re making progress on the polyglot dream.
Importing libs from a cdn is a big no-no for almost any system I work on - they are just another way of surveilling users on the Internet, and building information about the insides of organisations.
We plan on downloading the imported libraries during build so that they are self-hosted, so it’s effectively another way to install them without having to run `npm install`. We also plan on supporting importing libraries from node_modules. And you can already import local modules so you can install libraries by manually downloading them, too.
Though it’s not designed for animation, Observable Plot is just a JavaScript library and it renders fast enough that you can do things like that just by re-rendering. Here are some old notebooks with experiments with audio data hooked up to UI controls:
https://observablehq.com/collection/@skybrian/observable-plo...
Hi! Some background first: I'm putting together a blog right now using Hugo and D3. I'm a huge fan of D3's infinite flexibility, as seen in some famous scrollytellers [0-1], and I've spent some time experimenting with that format myself [2].
My question is: what does Observable Framework offer for data storytellers who want to blog? Is this meant to go up against Hugo/Jekyll in terms of full-fledged max-efficiency site generation? If not, are there plans to add integrations with other blogging frameworks?
[0]: http://r2d3.us/ [1]: https://algorithms-tour.stitchfix.com/ [2]: https://vermarish.github.io/big-brother-barometer/
We’re not expressly targeting the blogging use case; we primarily want to support data apps, dashboards, and reports. But Observable Framework is quite flexible and you can use it for a lot of things; the Framework documentation is itself written using Framework, for example. So I would say that if you are working with data and you want an automated process to keep your data up-to-date, or to work with multiple languages (e.g., Python and JavaScript), or if you want to do a lot of interactive visualizations then you should give Framework a go. But we don’t have many built-in affordances for blogging, so you might find some things missing. Feel free to file feature requests! We’d love to hear your ideas, though we’re primarily focused on reporting and data app development for work.
I’m not sure what better integration with other blogging frameworks would look like — like, part of the page is rendered by Framework, but the site as a whole is handled by the blogging framework? Perhaps we could develop Framework’s API further so it could function like a plugin. But this is speculative and not a priority for us currently. If you explore the possibilities here please let us know!
How do dashboards work if data is computed at build time? Does that mean every time you want to update the data you need another build? I'm interested in live dashboards, is Observable Framework the wrong tool for the job?
Yes, we use continuous deployment (cron) to rebuild as needed. You can also get realtime data on the client if you need to (via fetch or WebSocket to your own servers — it’s “just” JavaScript), but generally we find building static data snapshots a useful constraint because it forces you to think about exactly what data is needed, and as a result the dashboard loads instantly.
My use case is monitoring machine learning models as they train; static snapshots don't seem like the right approach for me.
If you’re developing the models locally, perhaps Framework’s preview server could work: it watches local files and automatically pushes updates to the browser when files change. This enables reactive updates for data loaders, but also works with static files. So you could visualize the models as they are being generated — meaning as some external process writes local files.
But in general the use case we’re targeting is a shared data app, dashboard, or report. Not something just for you individually, or something ephemeral (that you look at in real-time during training). For example, Framework would work well for sharing a report or dashboard evaluating the performance of the latest models you’ve built with your team.
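For the local-training idea above, the "external process writes local files" half could look roughly like the sketch below. It assumes the page being previewed loads a metrics file that the training script keeps rewriting; the file name, function names, and loss values are placeholders, and nothing here is Framework API. The write goes through a temp file plus rename so that a file watcher should never pick up a half-written file.

```python
import json
import os
import tempfile


def write_metrics_atomically(path, metrics):
    """Write metrics as JSON via a temp file and rename, so readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(metrics, tmp)
        # os.replace is atomic when source and target are on the same filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise


def train(path, epochs):
    """Stand-in training loop; the loss values are placeholders, not real results."""
    history = []
    for epoch in range(1, epochs + 1):
        loss = 1.0 / epoch  # placeholder for a real training step
        history.append({"epoch": epoch, "loss": loss})
        write_metrics_atomically(path, history)
    return history
```

Calling something like `train("metrics.json", 10)` from the training script would then rewrite the file once per epoch, and a preview that watches that file should refresh as it changes.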
Row64 dashboards are pretty instant. And interactive.
Edit: link: https://row64.com/
One question I have is if there's a way to integrate an observable framework project into an existing static site? I see how I could easily add a project as a subdomain, but what if I wanted to interleave a project I make with observable framework into my existing domain and that static site generator I already use for that domain?
By the way, thank you for making this. I've been reading the documentation and enjoying it very much. It looks like it has huge potential.
Thank you. At a minimum, you could iframe pages built with Framework, or have them live alongside your other pages and link to them. Maybe it would be possible to use Framework’s internal API to generate HTML that could be embedded within another static site generator page but we haven’t explored that idea yet.
Thank you for answering my question. I'm sure after more people use Framework an elegant design will become clearer. The decision to make data loaders agnostic to specific technology was a welcome approach, and so I have no doubt a similar result will be achieved with integrating Observable Framework into existing static sites, however that may look. Thank you!
Also curious if it can be worked into my jekyll sites
That is my point too. Now that I have tried Observable Framework (and before it D3, Plot, Observable Notebook) I do not think I can propose to change our statically generated site to just use Observable Framework. I will explore ways to migrate parts of the existing stuff and how to integrate new pages generated by Framework... A Big Bang is not an option for us... (or for anybody I guess, so it looks like quite a common need... but I understand that it doesn't go in the right business direction for Observable the company)
My question as well. If I had say a Hugo blog, how much effort would it be to embed the output to its own page?
Super cool! Especially for low cardinality, low interactivity dashboards this approach makes a ton of sense.
How is Observable going to make money off of the framework?
Hosting & compute — operationalizing/productionizing data apps. Observable Framework is open-source, but our hope is that we offer a compelling complementary paid service for you to host your (typically private) data apps on Observable. We make it easy for you to share data apps securely with your team or customers or clients or whoever, and manage the complexities of keeping your app & data up-to-date with continuous deployment, scheduled builds, access control, collaboration, monitoring, analytics, etc.
In case you ever forget the difference between complementary and complimentary, that's it right there.
Haha, my compliments.
Thank you Mike for pushing the visualisation envelope for so many years.
Is the new Framework going to support virtualized data access for data sets too large to be sent over the network (think of a pivot table that lets you browse a huge data warehouse)? It is impossible to prepare the entire file upfront, so data queries must happen incrementally in response to user actions. Or is that completely the opposite direction from your vision for Framework?
If you generate Apache Parquet files you can use DuckDB to make range requests and not download everything to the client. This is pretty magical and allows you to have surprisingly large datasets still queryable at interactive speeds.
But the general idea is to not send everything to the client: to be more deliberate and restrictive in what you send, and also what you show. So you probably shouldn’t use this for a general-purpose pivot table that’s trying to show “everything” in your data warehouse and enable ad hoc exploration. You’d instead design more specific, opinionated views, and then craft corresponding data loaders that generate specific pre-aggregated datasets.
It's not always clear which pushdowns are available in DuckDB. For instance, while x = y has been available for a while, x in (y, z, ...) hasn't. The DuckDB team seems very eager and motivated to get all the pushdown functionality working though, so hopefully this becomes a non-issue soon (perhaps already in 0.10.0).
Another way to use DuckDB, if your warehouse supports it, would be e2e Arrow (no col->row->col overhead).
Of course this differs in that you would be reading from the warehouse directly, but in my experience fully pre-aggregating data and then staging it (keeping parquet files on S3 up to date) might solve one issue, but results in unimaginable issues. Perhaps the sweet spot might be something like Iceberg in the middle?
> craft corresponding data loaders
Do you have an example of this using DuckDB? I'm very interested in seeing an actual implementation of Observable's data loaders combined with DuckDB (or any other SQL DB).
edit: nm, found it. https://observablehq.com/framework/lib/duckdb
edit2: eh, I didn't even realize who i was responding to, lol. The more I read into this, the more I see how this is all heavily based on static files. So the static parquet files thing makes more sense, the solution I added makes little. Although I guess you could add a static iceberg table and interact with its manifest with the duckdb iceberg extension.
Very impressive and will definitely serve many use-cases. However, it's a static site with data refresh at build time. Does this mean there cannot be user-based row-level security (i.e. selective access)?
One of the main selling points of the clunky, general purpose drag-and-drop BI tools (Power BI, Tableau etc.) is selective access. This is especially important in larger enterprises and for customer-facing dashboards.
For example, you're an enterprise manufacturing and selling IoT devices and have many different corporate customers. When you build a dashboard you want to make sure that each customer can see only the data that belongs to their account and, potentially, apply further user-based restrictions. Obviously this goes against the idea of creating pre-aggregated datasets and instant loads, but it's a massive multi-billion gap that is currently being filled by tools inferior to D3/Plot/Framework. This is something that Observable could develop in the future given what I'm seeing now and considering how close you already are to this. Framework could serve both types of needs - static sites and dynamic, user-based, more fully-featured sites for Enterprise needs.
Route per authorization scope?
Right, conceptually it’s static files, but we could develop a hybrid approach where the server does additional data processing on-demand. We already offer access control, but we could also serve different data snapshots to different users, or even filter the data snapshots based on the user. It still has to be fast, though.
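As a purely speculative sketch of the "different data snapshots for different users" idea (not something Framework does today, as far as this thread says), a build step could split one dataset into one snapshot per authorization scope, leaving it to the server to hand each user only their own file. The column names, sample readings, and function names below are invented for illustration.

```python
import csv
import io
from collections import defaultdict

# Made-up device readings, for illustration only.
READINGS = [
    {"customer": "acme", "device": "a-1", "temp_c": "21.5"},
    {"customer": "acme", "device": "a-2", "temp_c": "19.0"},
    {"customer": "globex", "device": "g-1", "temp_c": "23.1"},
]


def split_by_scope(rows, scope_key):
    """Group rows by an authorization scope (here, the customer column)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[scope_key]].append(row)
    return dict(groups)


def to_csv(rows):
    """Serialize a non-empty list of dict rows to CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


def build_snapshots(rows, scope_key="customer"):
    """Return {scope: csv_text}; access control decides who may fetch which snapshot."""
    groups = split_by_scope(rows, scope_key)
    return {scope: to_csv(group) for scope, group in groups.items()}
```

This keeps the "static files" property (each snapshot is still precomputed), at the cost of one file per scope; filtering on demand would trade that storage for server-side work per request.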
Is this meant to be a competitor to tools like Tableau or Metabase? Something more dev-friendly and maybe git-versioned as opposed to a configurable SaaS tool?
More developer-focused, and yes, you can use git for version control and develop locally, set up continuous deployment, and self-host apps anywhere.
While the docs look great, I'm having trouble getting over the hump of starting. It would be great if you had a repo with a starter app we could fork and play around with to help us understand everything before diving in from scratch.
Did you try running `npm init @observablehq`? It’ll create a starter app for you with everything you need to get started, as described in the Getting started tutorial.
https://observablehq.com/framework/getting-started
If you want more starter apps to look at, you can browse our examples on GitHub:
https://github.com/observablehq/framework/tree/main/examples
Interesting to see ObservableHQ making strides towards dashboards, similar to what Quarto and Evidence are doing.
Observable Notebooks' reactivity feels intuitive, much like spreadsheets, but the lack of self-hosting options is a no-go in my work context.
This looks like a dream!
Congratulations on this release! Your writing at bost.ocks.org, D3, and Observable have been big sources of inspiration over the years, and it’s always exciting to see new ideas from this team.
At last!
Time to call it quits for https://www.jigdev.com :-D
Godspeed Observable, hope you guys make it big
You've sponsored some very cool, state of the art tools. I've had friends work at Observable. I want you to succeed.
I tried to get our team to use Observable Notebooks a few years back. The researchers I work with are more comfortable in Python. Clearly that's one of the things you're trying to solve in this release. The other half of that uphill battle was discomfort posting code externally. In some ways you've also mitigated that in this release, but I wonder how sustainable it is.
Small teams eat for free by virtue of being small. Large organizations with trepidation or bureaucracy about using SaaS hosting will self-host. That leaves the people in the middle: big enough to need to pay, but small enough to not have institutional problems with external hosting. Moreover, if the Observable bill ever gets much higher than the equivalent on Firebase et al., the medium guys can self-host too.
How do you anticipate the paid side of the new business to work out? What's the hook (beyond thinking you guys are cool and trying to keep you in business) that gets someone to pay for Observable?
I was looking for a way to integrate Observable Inputs to VitePress and this came as a big surprise. Love what you are doing.