
Observable 2.0, a static site generator for data apps

mbostock
45 replies
1d1h

Hey, HN. We’re thrilled to release Observable Framework today — a new open-source tool for developing data apps. I highly recommend viewing this example report adapted from our internal dashboard analyzing web logs:

https://observablehq.com/framework/examples/api/

This technique of “just plot everything” (7.6M requests as a scatterplot) has revealed surprising insights we’ve used to optimize our servers and better control traffic. We’re also sharing a more traditional dashboard that visualizes the adoption of our open-source visualization library (and in some ways the successor to D3), Observable Plot:

https://observablehq.com/framework/examples/plot/

In addition to releasing Observable Framework, we’ve also made Observable free again for individuals (including private notebooks and database connectors). Let me know if you have any questions!

ZeroCool2u
11 replies
1d

This seems nice and the plots look great, but I have a hard time imagining switching to Observable from Plotly since there doesn't seem to be a way to make any plots interactive, by which I mean zoom and pan. The nearest-point highlight feature is nice, but what if I want to zoom in? None of the examples here seem to be able to do that, and a quick Google search doesn't make it seem like that's straightforward. That's not even additional code when I use Plotly; it's just built in.

There's also the issue of convincing staff to use JS instead of Python which is still just a tough sell. I think everyone on my team (of data scientists) would look at me like I've got two heads if I were to suggest that. Maybe we're not the target demographic though.

I do like the idea of shipping the data straight to the client, but I don't have a lot of confidence in our corporate network doing well and not slowing stuff down. Perhaps the graphics all are sent pre-rendered over the wire though? I'm not sure, but would be cool if Observable figured out a way to side step that issue.

mbostock
5 replies
1d

We’re working on zooming and panning for Observable Plot (https://github.com/observablehq/plot/pull/1738) and other interactions such as brushing (https://github.com/observablehq/plot/pull/721) — all of this is already possible, we just haven’t packaged it up in a convenient way yet (https://github.com/observablehq/plot/pull/1871). And as skybrian pointed out, you can also get interactivity “for free” with Observable’s reactivity and re-rendering.

We’ve been focused primarily on the static display of visualizations because that’s what viewers see first, and often that’s the only thing they see. Relying too heavily on interaction places an onus on the user to find the insights; a good display of data should be opinionated about what it shows and guide the user to what is interesting.

We’re not trying to convince you to switch to JavaScript here — a main value prop of Observable Framework is that you can write data loaders in any language (Python, R, Go, Julia, etc.). So do all your data preparation and analysis in whatever language you like, and then do your front-end in JavaScript to leverage the graphics and interactive compute capabilities of modern browsers. It’s pipes and child_process.spawn under the hood. And you still get instant reactivity when you save changes to your data loaders (when you edit Python) because Framework watches files and pushes new data to the client with reactive hot data & module replacement.

And you can compress (aggregate or filter) the data as much as you like, so it’s up to you how much data you send to the client. For example your data loader could be a minimal CSV file that’s just the numbers you need for a bar chart. Or it could be a Parquet file and you use DuckDB (https://observablehq.com/framework/lib/duckdb) on the client to generate dynamic visualizations.
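
To make the "minimal CSV for a bar chart" idea concrete, here is a minimal sketch of a data loader in Python. It assumes the convention described above, where Framework runs a script and keeps whatever the script writes to standard output. The file name `requests-by-route.csv.py`, the sample records, and the column names are made up for illustration.

```python
# Hypothetical data loader, e.g. saved as data/requests-by-route.csv.py
# (the file name and the sample records are illustrative assumptions).
import csv
import sys
from collections import Counter

# Stand-in for raw data that would normally come from a log file or database.
RAW_REQUESTS = [
    {"route": "/", "ms": 12},
    {"route": "/docs", "ms": 40},
    {"route": "/", "ms": 9},
    {"route": "/api", "ms": 120},
    {"route": "/docs", "ms": 35},
]


def aggregate(requests):
    """Reduce raw requests to one (route, count) pair per route, busiest first."""
    counts = Counter(r["route"] for r in requests)
    # Sort by count descending, then by route name, so the output is deterministic.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def write_csv(rows, out):
    """Write the aggregated rows as CSV to the given text stream."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["route", "requests"])
    writer.writerows(rows)


if __name__ == "__main__":
    # Only the aggregated numbers reach stdout, so only they reach the client.
    write_csv(aggregate(RAW_REQUESTS), sys.stdout)
```

The raw records never leave the build machine in this sketch; the page would only ever see one row per route.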

zurfer
3 replies
22h58m

I also love how Observable Plot looks, but agree with top poster, the things that keep me from switching are:

- Python wrapper
- Out of the box interactivity

mbostock
2 replies
22h28m

We’re working on the interactivity, but we’re not going to do a Python wrapper, as the goal of Observable Framework (and Plot) is to leverage web technologies with JavaScript in the front-end for interactive graphics, while using whatever language you like, including Python, on the back-end for data analysis. There is an R wrapper for Observable Plot (https://github.com/juba/obsplot) and so I imagine someone could write one for Python, but ultimately we don’t think you’ll need it with Framework’s polyglot data loaders since you can seamlessly move between languages (front-end and back-end).

ddanieltan
1 replies
16h8m

FYI: Same dev building obsplot is also building a Python version at https://github.com/juba/pyobsplot

mbostock
0 replies
14h26m

Ah, that’s what I thought at first, but I googled for “obsplot” and found the R one and thought I misremembered. Thank you for the correction.

oulipo
0 replies
20h19m

Interesting!

What are your thoughts on PRQL (integrated with DuckDB)?

Also: sure, I get your point on opinionated viz, but this means someone already "played with the data" to figure out what to viz "cleanly", and then coded that

now what if we want to build a viz to do this first "data playground" allowing to ruffle with the data? we would need some sort of interactivity

jwilber
3 replies
1d

Observable is much more than its library, plot. You mean to compare plot to plotly.

There are a number of reasons to choose Observable’s plot over plotly, but to address your point, there is no lock-in here with using plot for the view - you can seemingly use any JS library, including plotly, vega, D3, etc., so I don’t think that’s a huge issue.

I agree with your point regarding convincing other scientists to use JavaScript - that was the biggest point of failure for Observable notebook adoption that I saw. (As an anecdote, rather than adopt Observable, my science team @bigtech decided to write a Jupyter -> interactive static site transpiler, so the scientists could continue their work in Python.) Observable 2.0 seems built on recognizing that friction, and making it much easier for non-JS users to collaborate. But the npm dependency will still scare many data folks away.

To anyone from observable reading: I think getting mass adoption involves making this as seamless for python users as possible. (E.g. something similar to marimo notebooks or evidence). Also: great work!

mbostock
2 replies
23h37m

I don’t think the `npm install` will scare people away (Evidence uses that, too), and we’ve definitely tried to make the onboarding process as guided as possible (shout-out to clack.cc for a great CLI prompt library):

https://observablehq.com/framework/getting-started

And plus you can import libraries directly from a CDN rather than needing to use npm or yarn to manage dependencies. (Though we plan on supporting the latter in the future, too.)

https://observablehq.com/framework/javascript/imports

See this example for getting started with Python:

https://github.com/observablehq/framework/tree/main/examples...

But of course we’d love to add more affordances and documentation for other languages. We’re naturally biased towards JavaScript as our focus has historically been on visualization, but I like to think we’re making progress on the polyglot dream.

time4tea
1 replies
10h26m

Importing libs from a cdn is a big no-no for almost any system I work on - they are just another way of surveilling users on the Internet, and building information about the insides of organisations.

mbostock
0 replies
3h55m

We plan on downloading the imported libraries during build so that they are self-hosted, so it’s effectively another way to install them without having to run `npm install`. We also plan on supporting importing libraries from node_modules. And you can already import local modules so you can install libraries by manually downloading them, too.

skybrian
0 replies
1d

Though it’s not designed for animation, Observable Plot is just a JavaScript library and it renders fast enough that you can do things like that just by re-rendering. Here are some old notebooks with experiments with audio data hooked up to UI controls:

https://observablehq.com/collection/@skybrian/observable-plo...

vermarish
6 replies
23h44m

Hi! Some background first: I'm putting together a blog right now using Hugo and D3. I'm a huge fan of D3's infinite flexibility, as seen in some famous scrollytellers [0-1], and I've spent some time experimenting with that format myself [2].

My question is: what does Observable Framework offer for data storytellers who want to blog? Is this meant to go up against Hugo/Jekyll in terms of full-fledged max-efficiency site generation? If not, are there plans to add integrations with other blogging frameworks?

[0]: http://r2d3.us/
[1]: https://algorithms-tour.stitchfix.com/
[2]: https://vermarish.github.io/big-brother-barometer/

mbostock
5 replies
20h19m

We’re not expressly targeting the blogging use case; we primarily want to support data apps, dashboards, and reports. But Observable Framework is quite flexible and you can use it for a lot of things; the Framework documentation is itself written using Framework, for example. So I would say that if you are working with data and you want an automated process to keep your data up-to-date, or to work with multiple languages (e.g., Python and JavaScript), or if you want to do a lot of interactive visualizations, then you should give Framework a go. But we don’t have many built-in affordances for blogging, so you might find some things missing. Feel free to file feature requests! We’d love to hear your ideas, though we’re primarily focused on reporting and data app development for work.

I’m not sure what better integration with other blogging frameworks would look like — like, part of the page is rendered by Framework, but the site as a whole is handled by the blogging framework? Perhaps we could develop Framework’s API further so it could function like a plugin. But this is speculative and not a priority for us currently. If you explore the possibilities here please let us know!

ipsum2
4 replies
19h58m

How do dashboards work if data is computed at build time? Does that mean every time you want to update the data you need another build? I'm interested in live dashboards; is Observable Framework the wrong tool for the job?

mbostock
3 replies
19h14m

Yes, we use continuous deployment (cron) to rebuild as needed. You can also get realtime data on the client if you need to (via fetch or WebSocket to your own servers — it’s “just” JavaScript), but generally we find building static data snapshots a useful constraint because it forces you to think about exactly what data is needed, and as a result the dashboard loads instantly.

ipsum2
1 replies
10h12m

My use case is monitoring machine learning models as they train; static snapshots don't seem like the right approach for me.

mbostock
0 replies
3h18m

If you’re developing the models locally, perhaps Framework’s preview server could work: it watches local files and automatically pushes updates to the browser when files change. This enables reactive updates for data loaders, but also works with static files. So you could visualize the models as they are being generated — meaning as some external process writes local files.

But in general the use case we’re targeting is a shared data app, dashboard, or report. Not something just for you individually, or something ephemeral (that you look at in real-time during training). For example, Framework would work well for sharing a report or dashboard evaluating the performance of the latest models you’ve built with your team.

sroussey
0 replies
17h13m

Row64 dashboards are pretty instant. And interactive.

Edit: link: https://row64.com/

asimpletune
5 replies
22h11m

One question I have is whether there's a way to integrate an Observable Framework project into an existing static site. I see how I could easily add a project as a subdomain, but what if I wanted to interleave a project I make with Observable Framework into my existing domain and the static site generator I already use for that domain?

By the way, thank you for making this. I've been reading and very much enjoying the documentation. It looks like it has huge potential.

mbostock
1 replies
20h15m

Thank you. At a minimum, you could iframe pages built with Framework, or have them live alongside your other pages and link to them. Maybe it would be possible to use Framework’s internal API to generate HTML that could be embedded within another static site generator page but we haven’t explored that idea yet.

asimpletune
0 replies
9h47m

Thank you for answering my question. I'm sure after more people use framework an elegant design will make itself more clear. The decision to make data loaders agnostic to specific technology was a welcomed approach, and so I have no doubt a similar result will be achieved with integrating observable framework into existing static sites, however that may look. Thank you!

hanniabu
0 replies
20h59m

Also curious if it can be worked into my jekyll sites

espinielli
0 replies
3h37m

That is my point too. Now that I have tried Observable Framework (and before it D3, Plot, Observable Notebook) I do not think I can propose to change our statically generated site to just use Observable Framework. I will explore ways to migrate parts of the existing stuff and how to integrate new pages generated by Framework... A Big Bang is not an option for us... (for anybody I guess, so it looks like quite a need... but I understand that it doesn't go in the right business direction for Observable the company)

0cf8612b2e1e
0 replies
21h35m

My question as well. If I had say a Hugo blog, how much effort would it be to embed the output to its own page?

xixixao
3 replies
1d

Super cool! Especially for low cardinality, low interactivity dashboards this approach makes a ton of sense.

How is Observable going to make money off of the framework?

mbostock
2 replies
1d

Hosting & compute — operationalizing/productionizing data apps. Observable Framework is open-source, but our hope is that we offer a compelling complementary paid service for you to host your (typically private) data apps on Observable. We make it easy for you to share data apps securely with your team or customers or clients or whoever, and manage the complexities of keeping your app & data up-to-date with continuous deployment, scheduled builds, access control, collaboration, monitoring, analytics, etc.

JackFr
1 replies
23h48m

we offer a compelling complementary paid service

In case you ever forget the difference between complementary and complimentary, that's it right there.

mbostock
0 replies
23h44m

Haha, my compliments.

polskibus
2 replies
22h46m

Thank you Mike for pushing the visualisation envelope for so many years.

Is the new Framework going to support virtualized data access for data sets too large to be sent over the network (think of a pivot table that lets you browse a huge data warehouse)? It is impossible to prepare the entire file upfront, so data queries must happen incrementally with user actions. Or is that the complete opposite direction from your vision for Framework?

mbostock
1 replies
22h33m

If you generate Apache Parquet files you can use DuckDB to make range requests and not download everything to the client. This is pretty magical and allows you to have surprisingly large datasets still queryable at interactive speeds.

But the general idea is to not send everything to the client; be more deliberate and restrictive in what you send, and also in what you show. So you probably shouldn’t use this for a general-purpose pivot table that’s trying to show “everything” in your data warehouse and enable ad hoc exploration. You’d instead design more specific, opinionated views, and then craft corresponding data loaders that generate specific pre-aggregated datasets.
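
The range-request behavior mentioned above depends on how Parquet files are laid out: the file metadata sits in a footer at the end, and the last 8 bytes are a 4-byte little-endian footer length followed by the magic bytes `PAR1`. The following is a rough sketch of that lookup in Python, not DuckDB's actual code. `read_range` is a hypothetical stand-in for an HTTP `Range` request, and the fake file only mimics the envelope, not real column data or metadata.

```python
# Sketch of why Parquet suits HTTP range requests: the metadata lives in a
# footer at the end of the file, so a reader can locate it with two small reads.
# read_range is a hypothetical stand-in for an HTTP "Range: bytes=..." request.
import struct

MAGIC = b"PAR1"


def read_range(blob, start, end):
    """Return bytes [start, end) of blob, as a ranged GET would."""
    return blob[start:end]


def locate_footer(blob_size, read):
    """Return (start, end) byte offsets of the Parquet footer metadata."""
    # Read 1: the last 8 bytes hold a 4-byte little-endian footer length
    # followed by the 4-byte magic number.
    tail = read(blob_size - 8, blob_size)
    if tail[4:] != MAGIC:
        raise ValueError("not a Parquet file")
    (footer_len,) = struct.unpack("<I", tail[:4])
    end = blob_size - 8
    return end - footer_len, end


# Fake file with the Parquet envelope only (the payload bytes are placeholders,
# not real column data or Thrift metadata).
column_data = b"x" * 1000
footer = b"fake-metadata"
fake_file = MAGIC + column_data + footer + struct.pack("<I", len(footer)) + MAGIC

start, end = locate_footer(len(fake_file), lambda s, e: read_range(fake_file, s, e))
footer_bytes = read_range(fake_file, start, end)  # read 2: just the footer
```

Once a reader has the footer it knows where each column chunk lives, so it can request only the byte ranges a query needs and skip the rest of the file.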

chrisjc
0 replies
16h47m

It's not always clear which pushdowns are available in DuckDB. For instance, while x = y has been available for a while, x in (y, z, ...) hasn't. The DuckDB team seems very eager and motivated to get all the pushdown functionality working though, so hopefully becomes a non-issue soon (perhaps already in 0.10.0).

Another way to use DuckDB if your warehouse supports it would be e2e Arrow (no col->row->col overhead).

    warehouse --> ADBC --> arrow --> DuckDB

Of course this differs in that you would be reading from the warehouse directly, but in my experience fully pre-aggregating data and then staging it (keeping parquet files on S3 up to date) might solve one issue, but results in unimaginable issues. Perhaps the sweet spot might be something like Iceberg in the middle?

    warehouse --> iceberg table (parquet) --> DuckDB

> craft corresponding data loaders

Do you have an example of this using DuckDB? I'm very interested in seeing an actual implementation of Observable's data loaders combined with DuckDB (or any other SQL DB)

edit: nm, found it. https://observablehq.com/framework/lib/duckdb

edit2: eh, I didn't even realize who i was responding to, lol. The more I read into this, the more I see how this is all heavily based on static files. So the static parquet files thing makes more sense, the solution I added makes little. Although I guess you could add a static iceberg table and interact with its manifest with the duckdb iceberg extension.

laurels-marts
2 replies
3h58m

Very impressive and will definitely serve many use cases. However, it's a static site with data refresh at build time. Does this mean there cannot be user-based row-level security (i.e. selective access)?

One of the main selling points of the clunky, general purpose drag-and-drop BI tools (Power BI, Tableau etc.) is selective access. This is especially important in larger enterprises and for customer-facing dashboards.

For example, say you're an enterprise manufacturing and selling IoT devices with many different corporate customers. When you build a dashboard you want to make sure that each customer can see only the data that belongs to their account and, potentially, apply further user-based restrictions. Obviously this goes against the idea of creating pre-aggregated datasets and instant loads, but it's a massive multi-billion gap that is currently being filled by tools inferior to D3/Plot/Framework. This is something that Observable could develop in the future, given what I'm seeing now and considering how close you already are to this. Framework could serve both types of needs: static sites and dynamic, user-based, more fully-featured sites for enterprise needs.

politician
0 replies
3h45m

Route per authorization scope?

mbostock
0 replies
3h28m

Right, conceptually it’s static files, but we could develop a hybrid approach where the server does additional data processing on-demand. We already offer access control, but we could also serve different data snapshots to different users, or even filter the data snapshots based on the user. It still has to be fast, though.
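
As a rough illustration of the "different data snapshots for different users" idea, here is a minimal Python sketch that splits one dataset into per-customer files at build time. This is not something Framework provides; the field names, the `write_snapshots` helper, and the one-file-per-scope layout are assumptions, and deciding who may fetch which file would still be the job of whatever serves the files.

```python
# Sketch: split one dataset into per-customer snapshot files at build time.
# Field names and the output layout are illustrative assumptions; access
# control over the resulting files must be enforced by the hosting layer.
import json
from collections import defaultdict
from pathlib import Path


def write_snapshots(rows, out_dir, scope_key="customer_id"):
    """Write one JSON file per scope value and return the paths written."""
    by_scope = defaultdict(list)
    for row in rows:
        by_scope[row[scope_key]].append(row)

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    # Scope values are assumed to be safe file-name slugs in this sketch.
    for scope, scoped_rows in sorted(by_scope.items()):
        path = out_dir / f"{scope}.json"
        path.write_text(json.dumps(scoped_rows))
        paths.append(path)
    return paths
```

Each file then holds only one customer's rows, which keeps the static-file model while giving the server something simple to gate, for example one route per authorization scope as suggested above.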

tootie
1 replies
1d

Is this meant to be a competitor to tools like Tableau or Metabase? Something more dev-friendly and maybe git-versioned as opposed to a configurable SaaS tool?

mbostock
0 replies
23h51m

More developer-focused, and yes, you can use git for version control and develop locally, set up continuous deployment, and self-host apps anywhere.

hanniabu
1 replies
20h35m

While the docs look great, I'm having trouble getting over the hump of starting. It would be great if you had a repo with a starter app we could fork and play around with to help us understand everything before diving in from scratch.

mbostock
0 replies
20h28m

Did you try running `npm init @observablehq`? It’ll create a starter app for you with everything you need to get started, as described in the Getting started tutorial.

https://observablehq.com/framework/getting-started

If you want more starter apps to look at, you can browse our examples on GitHub:

https://github.com/observablehq/framework/tree/main/examples

fredguth
0 replies
16h42m

Interesting to see ObservableHQ making strides towards dashboards, similar to what Quarto and Evidence are doing.

Observable Notebooks' reactivity feels intuitive, much like spreadsheets, but the lack of self-hosting options is a no-go in my work context.

espinielli
0 replies
23h12m

This looks like a dream!

daniel_grady
0 replies
21h4m

Congratulations on this release! Your writing at bost.ocks.org, D3, and Observable have been big sources of inspiration over the years, and it’s always exciting to see new ideas from this team.

d--b
0 replies
22h39m

At last!

Time to call it quits for https://www.jigdev.com :-D

Godspeed Observable, hope you guys make it big

bsimpson
0 replies
12h48m

You've sponsored some very cool, state of the art tools. I've had friends work at Observable. I want you to succeed.

I tried to get our team to use Observable Notebooks a few years back. The researchers I work with are more comfortable in Python. Clearly that's one of the things you're trying to solve in this release. The other half of that uphill battle was discomfort posting code externally. In some ways you've also mitigated that in this release, but I wonder how sustainable it is.

Small teams eat for free by virtue of being small. Large organizations with trepidation or bureaucracy about using SaaS hosting will self-host. That leaves the people in the middle: big enough to need to pay, but small enough to not have institutional problems with external hosting. Moreover, if the Observable bill ever gets much higher than the equivalent on Firebase et al., the medium guys can self-host too.

How do you anticipate the paid side of the new business to work out? What's the hook (beyond thinking you guys are cool and trying to keep you in business) that gets someone to pay for Observable?

ayhanfuat
0 replies
1d

I was looking for a way to integrate Observable Inputs to VitePress and this came as a big surprise. Love what you are doing.

ddanieltan
10 replies
1d

I'm super excited to try this out! Couple of questions since I see @mbostock active in the comments.

1. Is the flexibility of languages used in data loaders/backend going to eventually come to the front end/ui? Or will the paradigm always be bring-your-own-language for the data loading but build your dashboard with observablejs/observable plot?

2. Considering ObservableJS is supported by Quarto, can we look forward to Observable Framework integrated with Quarto too? Or is the fact that the latest Quarto version also featured Dashboards more of a competitor to Framework?

3. Saw some comparison to Evidence.dev in the comments. I saw some shades of similarity with the markdown focused dev experience too but I recall Evidence chose Apache Echarts for their main charting library. Any thoughts of the pros/cons of Echarts vs ObservableJS/Plot?

kuatroka
3 replies
8h28m

3. Apache echarts are much more interactive out of the box. The API is indeed clunky, but they’ve got all the chart types and all the interactions you might need. IMHO, Plot in comparison is very limited in interactivity and even chart types (there are no heat maps or donuts).

echarts have a huge example library with clear examples and though Plot has one too, the library is not thought out well. You might look at an example in the Plot library only to realize later that it’s a D3 example. On the good side, the API in Plot is much cleaner and easier to work with.

mbostock
2 replies
3h44m

There are lots of ways to do heatmaps with Observable Plot. See the raster, contour, and cell marks.

https://observablehq.com/plot/marks/raster
https://observablehq.com/plot/marks/contour
https://observablehq.com/plot/marks/cell

We generally recommend stacked bar charts over pie and donut charts, so we haven’t prioritized those. But you can already implement them using custom marks, and there’s even a hacky way of doing them using Plot’s map projection system.

https://observablehq.com/@observablehq/pie-to-donut-chart

I don’t understand your comment about the “D3 example.” If you’re looking for Plot examples, you can find them linked from the Plot documentation and the gallery:

https://observablehq.com/@observablehq/plot-gallery

Plot is designed to be extended with JavaScript (rather than a non-JavaScript DSL such as Vega-Lite), such as for custom marks and data transforms. So you might occasionally see other libraries being used together with Plot.

kuatroka
1 replies
3h25m

I'll check the raster, contour, and cell marks. Thanks.

"I don’t understand your comment about the “D3 example.”..." 1. When I visit the Plot Gallery https://observablehq.com/@observablehq/plot-gallery 2. Go down the page to "More from Observable creators" 3. Select an example I like, for example - https://observablehq.com/d/3ea4b4458fed9242?page=2&collectio...

It turns out it's D3, not Plot. I think you just have all possible viz in this section, but for me as a user coming from the Observable Plot page and clicking on "See more..." my expectation is to see only examples of what could be done with Plot, not both D3 and Plot. I need to explicitly click on each link and check if it's Plot-based or not. It gets tiresome and the curiosity just wanes. Thanks.

mbostock
0 replies
3h10m

That “More from Observable creators” is just a standard footer we put across the site for signed-out users to showcase community content across Observable. It’s not part of the notebook. You can ignore everything below the “Appendix”.

mbostock
2 replies
23h53m

1. We don’t have immediate plans to bring other languages to the front-end — maybe TypeScript, but that’s just stripping annotations; maybe some WebAssembly. Our idea is to have a clear serializable “membrane” separating your back-end (in any language, running on build on your servers) from your front-end (in JavaScript, running on load in the client). Data loaders produce data during build, which gets handed-off to the client to render. Trying to do data processing on the client is often a frustrating and poor user experience. Likewise trying to render great interactive charts without web technologies is quite limiting!

2. I can’t speak to Quarto’s plans. Observable Framework is open-source so they might pick up some of this stuff. I look at Framework more as an alternative to Quarto than a complement.

3. As the creator of Observable Plot (and D3 before that), I’m a huge fan of visualization grammars! Apache Echarts is a chart typology, and while it’s got a lot of chart types in it, it has no overarching conceptual model of how to represent a visualization. And so it’s not very interesting. But “the proof of the pudding is in the eating” as I say in the post, so I encourage you to look at Observable Plot and decide for yourself if you like both the syntax and the resulting plots. I certainly do!

Leland Wilkinson said it best: “If we endeavor to develop a charting instead of a graphing program, we will accomplish two things. First, we inevitably will offer fewer charts than people want. Second, our package will have no deep structure. Our computer program will be unnecessarily complex, because we will fail to reuse objects or routines that function similarly in different charts. And we will have no way to add new charts to our system without generating complex new code. Elegant design requires us to think about a theory of graphics, not charts.”

apitman
1 replies
13h33m

That's an interesting quote. What is the difference between charting and graphing in this context?

mbostock
0 replies
3h13m

See Leland Wilkinson’s The Grammar of Graphics. He describes the difference between a chart typology (a fixed set of chart types with a fixed set of configuration options) and a grammar of graphics (a set of orthogonal primitives that can be composed in arbitrary ways).

cscheid
2 replies
22h6m

(disclosure: Quarto dev here). I'm a huge Observable fan.

Speaking entirely for myself, this space is so important that I'm thrilled to have more activity rather than less. Quarto's great and Observable's great. I hope folks pick the tool that's best for their use case!

an1sotropy
1 replies
20h26m

I'm looking forward to learning more about which one makes it easier to see how various possible changes in the data are mapped to legible changes in the visualization.

cscheid
0 replies
20h10m

... sorry about the weird question, but do I know you in person? (there's a tiny chance your comment is specifically an obscure inside joke from a past life of mine, and I can't stop myself from taking the bait if so)

kuatroka
7 replies
20h0m

A couple of questions:

1. Let's say I have a SQLite/DuckDB database file on my server. It has multiple tables, some with 100M to 150M records. I want to create a plot/table with a slider/filter that fetches and shows only a slice of the data at a time. Since it's statically generated data, how is this interactivity achieved? Will all the possible facets of the data, filtered whichever way, be generated? Won't that be huge, and how long will it take to generate this static data? Or is there an actual callback to the server to the DuckDB file (I assume it works with a .duckdb file too)?

2. If Observable Framework provides the front-end, does it mean I can use any auth library if I want to create a web site with a log in and subscription options?

3. If it's a static web page, does it mean that at any time a user views a chart, they will also be able to go to the Dev Tools and download the file with data that's behind the viz?

4. When (if you can share of course) is the planned release of Plot's interactions: zoom, pan, interactive legend, brush?

5. Deployment - with big Parquet, SQLite, and CSV files, it's impossible to do CI/CD through GitHub or Vercel and such. Will your hosting services offer an option to host those files and the runtimes to generate them?

Thanks

chrisjc
3 replies
17h4m

Came here with similar questions and Cmd-F "DuckDB". See the comment about "data loaders". Seems like a "data loader" would provide most of what you're asking about.

I'm also thinking that a "data loader" combined with duckdb-wasm and arrow would be a pretty nice combination. I imagine that it might not be too difficult to switch between two implementations of the "data loader" as needed: one reading from a remote system (in your case DuckDB on a server) and one using DuckDB running locally in the browser (which can interact with its own remote or local data sources).

edit: welp https://observablehq.com/framework/lib/duckdb

recifs
2 replies
10h19m

See the example at https://huggingface.co/spaces/observablehq/fpdn where DuckDB is used both as a data loader (to download and digest 200GB worth of source data into a small 8MB parquet file) and on the client-side to allow the user to do live search queries on the minimized data. Server-side, we're using duckdb-the-binary, and client-side we're using duckdb-wasm.

kuatroka
1 replies
8h19m

So the 200GB loading and digesting part is totally separate from Observable Framework, right? You just do it with standard (non-wasm) DuckDB as part of ETL, and later you just direct Observable Framework to read and plot the 8MB file? Thanks

severo_bo
0 replies
5h48m

nope, Observable Framework data loader accesses the 200GB dataset. The code is here: https://huggingface.co/spaces/observablehq/fpdn/blob/main/do...

tophtucker
2 replies
15h22m

Good questions.

1. It’s just JavaScript so you can fetch stuff dynamically too (see https://observablehq.com/framework/lib/duckdb). But yeah, only client-side. (Though see https://github.com/observablehq/framework/issues/234.)

2. Sure, it’s all open source, I bet you could make that work. Or `yarn deploy` to Observable and configure sharing there (though it wouldn’t let you charge others).

3. Yup. Which is part of the appeal of the model of running data loaders at build time: you can query some private data and viewers would only be able to see the final result set. (The lack of something like this has always been a huge problem for Observable notebooks. You’d make some great query-driven charts and then couldn’t make it public without some awkward manual dance of downloading and re-uploading a file to a fork of the notebook.)

4. I wish I knew! It’s being tracked here https://github.com/observablehq/plot/issues/1711. Lately there’s been a lot more work on Framework naturally but now that that’s out…

5. Another good question. We’re definitely interested in tailoring it more to this sort of use case but lots is TBD!

kuatroka
1 replies
8h17m

Thank you. I wonder if for #3 there is a way to keep the data hidden and only let people see the chart, so they can't hack their way to the underlying data.

mythmon_
0 replies
2h50m

During our early exploration, someone made a data loader that returned an entire svg of a chart, instead of the data for a chart. I think it was a headless browser running Observable Plot, but I imagine there are lots of ways to generate charts in a data loader.
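To make the "chart instead of data" idea concrete, here is a minimal sketch of a loader that prints a finished SVG to stdout. It is not the headless-browser setup described above: the SVG is built by hand, and the file name, the `bar_chart_svg` helper, and the sample values are all made up for illustration. Note that the bar heights still reveal the relative proportions of the values, so this hides the raw rows, not the shape of the data.

```python
# Hypothetical loader file name (an assumption): src/data/chart.svg.py
import sys
from xml.sax.saxutils import escape


def bar_chart_svg(values, width=300, height=100, title="Requests per day"):
    """Return a complete SVG bar chart as a string (assumes non-negative values)."""
    if not values:
        raise ValueError("values must not be empty")
    peak = max(values) or 1  # avoid dividing by zero when every value is 0
    bar_width = width / len(values)
    bars = []
    for i, value in enumerate(values):
        bar_height = height * value / peak
        bars.append(
            f'<rect x="{i * bar_width:.1f}" y="{height - bar_height:.1f}" '
            f'width="{bar_width * 0.9:.1f}" height="{bar_height:.1f}" />'
        )
    body = "".join(bars)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 {width} {height}">'
        f"<title>{escape(title)}</title>{body}</svg>"
    )


if __name__ == "__main__":
    # Made-up numbers; a real loader would query the private source here.
    sys.stdout.write(bar_chart_svg([12, 30, 18, 25]))
```

Only the rendered markup would be shipped to the client in this setup; the query and the source rows stay on the build machine.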

simonw
5 replies
23h26m

There's an almost bewildering amount of interesting ideas buried in this.

Things like data loaders which are ANY script that can output data (as JSON or something else) to standard output. Markdown files with ```js blocks in that get executed. The reinvention of the core Observable notebook to avoid custom syntax.

This is really big.
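For anyone who wants to see how small "any script that writes to stdout" can be, here is a rough sketch in Python. The file name follows what I understand to be the naming convention from the data loader docs, but treat it, the `to_json` helper, and the sample rows as assumptions made for illustration.

```python
# Hypothetical loader file name (an assumption): src/data/routes.json.py
import json
import sys


def to_json(rows):
    """Serialize rows to a JSON string; the loader's only job is to emit this."""
    return json.dumps(rows)


if __name__ == "__main__":
    # Made-up sample rows; a real loader would read an API, a file, or a database.
    rows = [
        {"route": "/", "requests": 120},
        {"route": "/docs", "requests": 80},
    ]
    # Whatever reaches stdout becomes the contents of the generated file.
    sys.stdout.write(to_json(rows))
```

Nothing in the script is specific to Framework: it could be run and inspected from a shell like any other program.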

tophtucker
4 replies
22h29m

Yeah data loaders are like a UNIX pipe to a reactive notebook cell (?). There needn’t be any question of “do the data loaders support this or that”; it doesn’t even have a concept of “supporting” beyond supporting stdout… Still thinking through how to understand it myself!!

skybrian
2 replies
20h4m

Data loaders seem like an interesting way to define a multi-language build system without having to write a makefile. Lots of build systems do this, but the boundaries between build steps often aren't as clean and uniform as having a single output per build step and relying on a file naming convention.

It's not truly reactive if you have to do a build to make anything happen. But maybe that doesn't matter, as long as it's reactive during development?

tophtucker
1 replies
15h37m

Yeah there are some open issues about more granular rebuilds and chaining data loaders. https://github.com/observablehq/framework/issues/638, https://github.com/observablehq/framework/issues/332

It’s kinda cool to think about the shearing layers of reactivity. Reactivity is what originally drew me to Observable. But the way notebooks have to be recomputed live for every viewer every time makes them feel silly for, like, a BI dashboard that changes daily at most. Like they only have one pace layer. Like they’re trying so hard to be _live_ that they can’t be _fast_! Idk. I guess even a chalkboard is reactive on the timescale of “someone noticing some information and telling it to someone who writes it down” lol.

skybrian
0 replies
13h50m

Another interesting example: since Go packages use minimum version selection, publishing a new version of a package doesn't do anything right away. Someone has to notice it (perhaps reading a release announcement) and bump the version number on a dependency. Then, after testing, they might release a new version, which again, doesn't do anything until projects downstream from them decide to upgrade.

That's deliberate, since they don't want packages to update their dependencies without testing them, and builds are supposed to be deterministic.

So it seems like for cross-project data dependencies, there's a tradeoff between deterministic results and getting the latest data? If one project depends on a JSON file from another project, and the JSON changes, when do you want or expect to see the change? There needs to be a version history for changes to the external JSON file to get a choice in the matter. (Perhaps it's cached locally.)

dleeftink
0 replies
15h55m

Scripting with hot-reload?

mbostock
5 replies
1d

Another tidbit buried in this announcement is that Observable Framework is 100% vanilla JavaScript syntax — so you get Observable’s reactive runtime without the quirky Observable JavaScript syntax (as in Observable notebooks). And you can use static ES imports from npm or local modules, declare multiple top-level variables in a code block (not just a single named variable per cell), call the built-in display(…) function to put things on the page, etc. It’s a huge relief to have vanilla syntax and greatly improves interoperability. And we’re figuring out how to port these improvements back to Observable notebooks in the near future.

skybrian
1 replies
23h32m

With regard to code edits (rather than UI reactivity), this looks similar to how many web development environments watch the file system for changes and then rebuild and reload the page. Is there more to it?

How are syntax errors reported? Is there support for TypeScript syntax and type-checking? Can a page partially run that has errors in some JavaScript snippets, like a notebook with errors in some cells?

In the examples, there is a “view source” link that goes to GitHub. Understanding the code involves finding the Markdown file and then going back and forth between the published page and the Markdown file, which hopefully correspond to the same version.

It seems like the thing that’s lost compared to notebooks is letting the user see and edit the code in the browser. But I suppose that’s a niche use case for coding tutorials. Not everything needs to be a notebook.

Even so, better built-in “view source” support might be nice, even if it doesn’t allow editing. It doesn’t have to be as prominent as it is in a notebook to be useful.

mbostock
0 replies
23h11m

You can read about our reactive runtime here (it’s the same as Observable notebooks even though Framework uses vanilla JavaScript syntax):

https://observablehq.com/@observablehq/how-observable-runs

And the source is here:

https://github.com/observablehq/runtime

The “trick” is to structure all code as reactive variables (or nodes, defined as pure functions) within a dataflow graph. So if you replace one variable (by replacing a function with a new import), you then have to recompute any downstream variables that depend on the replaced variable, while cleaning up old variables and updating the display.
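As a toy illustration of that idea (not the actual Observable Runtime, which is JavaScript and also handles things like scheduling, cleanup, and display), here is a sketch in Python. The `Graph` class and its method names are invented for this example, it assumes the graph has no cycles, and it recomputes naively rather than in topological order.

```python
class Graph:
    """Toy dataflow graph: each variable is a pure function of named inputs."""

    def __init__(self):
        self.defs = {}    # name -> (input names, function)
        self.values = {}  # name -> last computed value

    def define(self, name, inputs, fn):
        """Define or replace a variable, then recompute it and its dependents."""
        self.defs[name] = (tuple(inputs), fn)
        self._recompute(name)

    def _recompute(self, name):
        inputs, fn = self.defs[name]
        if all(i in self.values for i in inputs):
            self.values[name] = fn(*(self.values[i] for i in inputs))
        else:
            # An input is undefined, so this variable is undefined too.
            self.values.pop(name, None)
        # Propagate to every variable that reads this one. A shared downstream
        # variable may be recomputed more than once; the final value is correct.
        for other, (other_inputs, _) in list(self.defs.items()):
            if name in other_inputs:
                self._recompute(other)


if __name__ == "__main__":
    graph = Graph()
    graph.define("price", [], lambda: 10)
    graph.define("qty", [], lambda: 3)
    graph.define("total", ["price", "qty"], lambda price, qty: price * qty)
    print(graph.values["total"])  # 30
    graph.define("price", [], lambda: 12)  # replace one variable
    print(graph.values["total"])  # 36, recomputed because it depends on price
```

Replacing `price` never touches `qty`; only the variables downstream of the change are recomputed.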

Invalid syntax doesn’t prevent other code blocks from running (though if it means a variable is then undefined, that might cause downstream errors). Syntax errors are displayed in the page, and also in the console for the running preview server. We’d like to improve the error display in the console to show more context around where the error occurred, since unlike notebooks the code isn’t immediately adjacent to the output.

We don’t support TypeScript yet, but there’s a PR (https://github.com/observablehq/framework/pull/129) and we are interested in stronger validation at build time to catch more errors.

And yes, we’re making different tradeoffs, optimizing for data apps and dashboards (more polished presentation) rather than ad hoc exploration in notebooks. So it’s more work to find and edit the code, but conversely it’s a more careful, deliberate process that allows code review, unit tests, continuous integration, etc. And we think that’s appropriate for data apps that are depended on by many people.

But still, a view source link back to your source control would be nice, yes!

rmbyrro
0 replies
23h26m

If more developers understood the value of using vanilla JS when appropriate, we would all be much happier.

alanbernstein
0 replies
20h53m

Thanks for highlighting this. Realistically, this is what will make me want to try it out.

EasyMark
0 replies
21h21m

I too appreciate this. It is so easy to turn javascript into a DSL that doesn't much look like vanilla javascript. Give me a great API anytime over a DSL except in some very specific cases.

lf-non
5 replies
1d1h

The new direction seems very similar to what evidence has been doing for a while

https://evidence.dev

mbostock
3 replies
1d

Yep, Evidence is doing good work. We were most directly inspired by VitePress; we spent months rewriting both D3’s docs (https://d3js.org) and Observable Plot’s docs (https://observablehq.com/plot) in VitePress, and absolutely loved the experience. But we wanted a tool focused on data apps, dashboards, reports — observability and business intelligence use cases rather than documentation. Compared to Evidence, I’d say we’re trying to target data app developers more than data analysts; we offer a lot of power and expressiveness, and emphasize custom visualizations and interaction (leaning on Observable Plot or D3), as well as polyglot programming with data loaders written in any language (Python, R, not just SQL).

amcaskill
2 replies
1d

One of the founders of Evidence here. Thanks for the kind words, Mike - that means a lot coming from you.

I think that distinction is right -- we are focused on making a framework that is easy to use with a data analyst skill set, which generally means as little javascript as possible.

As an example, the way you program client-side interactions in Evidence is by templating SQL which we run in duckDB web assembly, rather than by writing javascript.

Evidence is also open source, for anyone who's interested.

Repo: https://github.com/evidence-dev/evidence

Previous discussions on HN:

https://news.ycombinator.com/item?id=28304781 - 91 comments

https://news.ycombinator.com/item?id=35645464 - 97 comments

anentropic
1 replies
7h16m

This looks very interesting to me, I'm building a BI reporting tool in my company at the moment, but browsing the docs I felt what I was missing was a clear overview of the architecture.

e.g. you say above that Evidence takes templated SQL and runs it in DuckDB WASM

and then in the docs there's various https://docs.evidence.dev/core-concepts/data-sources/#suppor... like Snowflake, MySQL etc

I guess I am wondering where and when the queries are happening

If I set up a Snowflake data source is it doing a build-time import (like in the new Observable, from this thread) into DuckDB? or DuckDB is connecting to the sources via extensions?

Where does the data live?

My question is really just "how does it work?" and the "What is Evidence? > How does Evidence work?" section on the docs homepage doesn't really answer that at all, it's just a list of things that it does.

amcaskill
0 replies
1h3m

Thanks for the kind words.

That’s good feedback on the docs. The tool has evolved pretty dramatically from where it started and we should revisit those diagrams.

Evidence is a static site generator.

Queries against your sources happen at build time and save to parquet.

Queries against the built in DuckDB web assembly instance happen at runtime.

Sources (snowflake, Postgres, csv files etc.) run at build time.

Pages in evidence are defined as markdown files. You write markdown, components, and code fences.

SQL code fences in pages run in the built in duck db wasm instance which can query across the results from all of your sources. These queries run in the client. We call this feature universal SQL, and it’s quite new.

You can read about universal SQL here if it’s of interest. https://evidence.dev/blog/why-we-built-usql/

You can template those SQL queries to accept input from input components. This enables you to build extremely performant client side interactions.

Under the hood, Evidence is built on svelte and compiles to a svelte kit application, and you can extend your project with custom svelte components.

Hope that’s helpful — we’re very active in our slack if you ever want to say hi!

vramana
0 replies
1d

Thanks for sharing. Evidence looks pretty great! Git controlled BI reporting.

0cf8612b2e1e
4 replies
1d1h

Sounds a bit like the “baked data” pattern. Which I think is a really good idea. I have long been toying with how to use Datasette to make a deliverable dashboard, so this is interesting.

rmnclmnt
1 replies
1d

Me too, and that led to developing the « datasette-dashboards » plugin[0]. I use this for my company where all the data is gathered by connectors scheduled in CI, storing data in Git, and triggering a SQLite db build and Datasette deployment. « BI as Code » if you will

[0] https://github.com/rclement/datasette-dashboards

pzmarzly
0 replies
23h10m

How often do you refresh the data (how often your CI runs)?

mbostock
1 replies
1d

Yes! It first felt counterintuitive and constraining to prepare data ahead of time, rather than just loading whatever you want on the fly. But we’ve found this to be a great discipline in practice because it forces you to think about what data you actually need to show, and to shift as much of the compute to build time to reduce file sizes (through aggregation and filtering) and accelerate page load. And once you become accustomed to instant dashboards it becomes unthinkable to go back to slow queries.
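A small sketch of what shifting compute to build time can look like: the loader below collapses per-request rows into hourly counts, so the page would receive one row per hour instead of one row per request. The file name, the `requests_per_hour` helper, and the sample rows are assumptions made up for illustration.

```python
# Hypothetical loader file name (an assumption): src/data/hourly.csv.py
import csv
import sys
from collections import Counter


def requests_per_hour(rows):
    """Collapse per-request rows into sorted (hour, count) pairs."""
    # Assumes ISO 8601 timestamps; the first 13 characters are "YYYY-MM-DDTHH".
    counts = Counter(row["timestamp"][:13] for row in rows)
    return sorted(counts.items())


if __name__ == "__main__":
    # Made-up sample; a real loader would read the full log at build time.
    rows = [
        {"timestamp": "2024-02-15T09:01:12Z", "route": "/"},
        {"timestamp": "2024-02-15T09:47:03Z", "route": "/docs"},
        {"timestamp": "2024-02-15T10:05:40Z", "route": "/"},
    ]
    writer = csv.writer(sys.stdout)
    writer.writerow(["hour", "requests"])
    writer.writerows(requests_per_hour(rows))
```

The output size now depends on the number of hours covered, not the number of requests, which is where the page-load savings would come from.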

shiandow
0 replies
6h32m

Realistically what's the biggest you can make the dataset if you need to prebuild everything? Aggregation and filtering help, but quickly become impractical if you want people to change the filters dynamically.

yodon
3 replies
1d1h

Lots of good stuff here, but are you really naming your framework "Framework" (as in "With Framework, you can build...")?

You do realize that naming only makes sense inside your own company, right? To everyone who uses it, it's "a" framework or "Observable's framework". No consumer of the framework is going to refer to it as "Framework" without ridiculous amounts of confusion resulting.

renewiltord
0 replies
1d1h

It's called Observable Framework one sentence before and on the product page. It's normal to write like this in a blog post. For instance, the Kubernetes deployments page uses the word "Deployment"; it doesn't say Kubernetes Deployment everywhere. It just says Deployment.

I think introducing it like that is fine. You don't have to say Observable® Framework™ every sentence. That would be strange.

prakis
0 replies
1d

Yes, it is confusing. It took me some time to realize their product name itself is Framework.

alexgarcia-xyz
0 replies
1d

Observable has other open source libraries with similar "generic" names: Plot, Runtime, Inputs. When speaking generically, most people say "Observable Plot" or "Observable Runtime." In projects where people already know about them, then I say "Plot" or "Inputs" without much fuss.

I imagine most people will say "Observable Framework" when talking out in the open, and "Framework" on established projects.

one_buggy_boi
3 replies
15h23m

What would be a good design pattern to put these dashboards behind auth? I suppose since they're static files you could just serve them with something like FastAPI or Spring Boot and have your CI/CD refresh the static files throughout the day on shared storage?

mbostock
0 replies
14h48m

For one, you can deploy them to Observable (with `observable deploy`) and we’ll provide access control.

apitman
0 replies
13h27m

I'd recommend a reverse proxy and login server using "forward auth". I made a list of such login servers here: https://github.com/lastlogin-io/obligator?tab=readme-ov-file...

a-ve
0 replies
15h17m

If you're putting these behind a reverse proxy (nginx, etc.) you can just set up client certificate authentication by using your own locally generated CA or by using something like Vault for UI-based certificate generation. When you visit the site with a certificate installed on your device, it will authenticate successfully; anyone without a valid certificate installed will see a "No certificate presented" error.

It's fairly easy to set up and there are multiple guides available for it. Here's one: https://fardog.io/blog/2017/12/30/client-side-certificate-au...

mrtimo
3 replies
1d

It would be really cool if you guys supported Malloy. Maybe you already do? https://www.malloydata.dev/

mythmon_
2 replies
1d

Observable engineer here. I haven't looked into Malloy much, but Framework's data loaders are very flexible. If you can write a script or binary that uses Malloy and writes to stdout, it can be a data loader. For example, although we use SQL a lot in our internal usage of Framework, Framework doesn't actually have any specific SQL support. We just use our database's normal bindings.

chrisjc
1 replies
17h9m

Can you link to the "data loader" API or perhaps even a "data loader" example implementation for something similar to malloy, duckdb, or any other DB/SQL data source/provider?

Would love to see it, thanks in advance!

edit: found it https://observablehq.com/framework/lib/duckdb

mythmon_
0 replies
16h31m

The docs for data loaders are here: https://observablehq.com/framework/loaders. The simple version is they are simply programs that write their output to standard out. Very Unixey. When those programs are referenced in client side parts of the JS, they are reactively run when in development, and prebuilt for deployment.

I don't think we have any full examples of using a database yet, but we have written a bit about using DuckDB via its Node bindings here: https://observablehq.com/framework/lib/duckdb

I imagine that either Malloy's CLI or its Python bindings would fit very well here.
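In the meantime, here is a rough sketch of the "use your database's normal bindings" pattern. It uses Python's built-in `sqlite3` rather than DuckDB or Malloy purely so it runs without extra installs; the file name, table, and `query_to_csv` helper are made up for illustration.

```python
# Hypothetical loader file name (an assumption): src/data/downloads.csv.py
import csv
import io
import sqlite3
import sys


def query_to_csv(connection, sql):
    """Run a query and return the result set as CSV text with a header row."""
    cursor = connection.execute(sql)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(column[0] for column in cursor.description)
    writer.writerows(cursor)
    return out.getvalue()


if __name__ == "__main__":
    # An in-memory database with made-up rows stands in for a real source.
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE downloads (day TEXT, count INTEGER)")
    connection.executemany(
        "INSERT INTO downloads VALUES (?, ?)",
        [("2024-02-14", 10), ("2024-02-15", 25)],
    )
    sys.stdout.write(
        query_to_csv(connection, "SELECT day, count FROM downloads ORDER BY day")
    )
```

Swapping in another database should mostly mean changing the connection line, since the rest only relies on the standard Python DB-API cursor.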

tetris11
2 replies
11h38m

I miss source loading D3 into a simple HTML page, and having D3 tutorials that were up to spec with the latest D3 release.

Yeah Observable data pages look cool, but it really feels like excessive JS bloat for the features.

I think I miss throwing a D3 viz together without having to load an entire framework library.

recifs
0 replies
10h30m

You can still use D3 the good old-fashioned way, as a standalone script. See https://d3js.org/getting-started#d3-in-vanilla-html for examples.

meowtimemania
0 replies
11h23m

@observablehq/plot is what you're looking for if you want to have the same charts loaded via a script tag.

https://observablehq.com/plot/getting-started#plot-in-vanill...

kepano
2 replies
22h55m

I appreciate the nod to "File over app"[1] in the announcement. It's so cool that a Markdown file with code blocks can be the source for complex data visualizations and dashboards. Interoperability of this kind makes me giddy. I played around with editing an Observable site from Obsidian and it works great[2].

[1]: https://stephango.com/file-over-app

[2]: https://twitter.com/kepano/status/1758202572446581025

tophtucker
0 replies
22h12m

We’ve talked about and shared your “File over app” manifesto so many times internally over the last few months. It’s one of those tweets that gets immortalized as the perfect crystallization of an ethos that we might otherwise have only been able to gesture at vaguely. It gives the ethos weight and clarity and credibility, and it’s such a relief to be able to point to it! I’m very grateful. —an Observable employee

dleeftink
0 replies
15h58m

My mind immediately went to how these dashboards could be integrated in Obsidian, and seeing the `import` dependency graph reflected in Obsidian's graph view.

j-pb
2 replies
20h33m

If you play the history of Observable backwards you start with a company creating a static site generator for dashboards, which then struggles to find a market fit as it tries to bring datascience to middle management, to finally reach a focused, simple and elegant tool for exploratory programming, data visualisation, and interactive documentation in javascript.

nextworddev
1 replies
13h31m

Ok but is anyone paying for it?

j-pb
0 replies
11h33m

We used to pay for 4 licenses, until they switched to the weird pricing schemes and the new editor targeted towards people with no programming experience.

I really wish they would open source Observable 0.5, Pluto is currently the only other notebook left that has the flexible data-flow model at reasonable simplicity.

kuatroka
1 replies
8h15m

Are you planning to add new UI components like a Data Table in the future, or is it purely plotting and data ingestion, with the UI done through Tailwind or CSS? Would it be possible to add UI libraries like shadcn or DaisyUI to make it a full-fledged web site? Thanks

recifs
0 replies
3h31m

It’s only the first public release, rest assured we have plans to develop it beyond that point :-) Data tables are high on the list, but it’s going to be a lot of work and I can’t say when we’ll have something to share. In the meantime almost any library available on npm should work out of the box—not just the ones that we added explicit support for (even though of course some might need more work than others).

jwilber
1 replies
1d

This is really a game changer for creating data apps. I loved observable, but convincing scientists with no js knowledge to try the platform was almost impossible. Markdown + language agnostic loaders seems like the perfect way to collaborate.

Are there any plans to allow existing observable notebooks to be deployed? For example, a one-click “deploy” button?

mbostock
0 replies
23h45m

The Observable Framework CLI supports a `convert` command for downloading an Observable notebook and converting it to Markdown. E.g., `observable convert @d3/bar-chart` will download the notebook and save a bar-chart.md and alphabet.csv to your current working directory. (We did it from the command line so it’s easy for you to automate or batch-convert or refresh and keep things in sync.) You may have to make some tweaks to the code due to Framework’s vanilla JavaScript syntax, but we’ll work on making that more seamless over time.

And you can `observable deploy` to deploy your app to Observable for sharing — though most often you’ll want to set up continuous deployment to keep your app up-to-date automatically.

farhanhubble
1 replies
13h29m

I love how much is possible with the Observable framework and support for libraries like d3.js. However many data apps cannot precompute their outputs. For example, a pipeline that extracts text from documents based on what a user queries, cannot precompute the results and any visualizations must be updated every time. The best hack to accomplish this seems to be rebuilding the app on each update. Or is there another solution?

recifs
0 replies
10h23m

The code in a Framework can do whatever you want it to do—it can load data on demand, call an external API, etc. Precomputing data is only an option, not an obligation.

But even when you want things to be very interactive it is a good idea to minimize the data. Expose only the "rows and columns" that you need, and compress it as much as possible. This can be done in a data loader. For example, see the data app we deployed yesterday on Hugging Face: its data loaders ingest a large source database (320 files totaling 200GB) and digest it into a single 8MB parquet file that we can then use on the page to "live query" 3 million newspaper titles and dates. https://huggingface.co/spaces/observablehq/fpdn

beefman
1 replies
1d

No mention of math formatting but from the docs it looks like there is TeX support!

https://observablehq.com/framework/lib/tex

mbostock
0 replies
23h36m

And Graphviz (dot) and Mermaid, too!

xrd
0 replies
1d

I love Observable. And, this is a phenomenal approach, untethering Observable from observablehq.com. I'm so excited.

It probably goes without saying that EVERYONE should have a blog, and this approach from Observable means journalists everywhere can now easily create a dynamic and information-driven blog. It isn't a coincidence that Observable came from a guy that did amazing data visualizations at the NYTimes. We are on the precipice of a major power shift back to journalists and away from dubious corporations, and tools like this enable that transition.

(Shameless plug: Svekyll is going towards the same goal. Svekyll is a static site generator inspired by Jekyll. If you want to use Svelte in your blog, check it out. Svekyll bundles the incredible Apache ECharts so you can use echarts in your blog with a few lines of code (and no complicated build process). https://extrastatic.dev/svekyll/svekyll-cli. These are ideas I've been thinking about too.)

vwkd
0 replies
2h6m

How does this compare to a traditional reactive framework like React, Vue, or Svelte with Observable Plot?

terpimost
0 replies
1d1h

Wow guys. That is so cool! Thank you for what you do. This is a great example of high quality product!

rogue7
0 replies
1d1h

Interesting pivot from the Observable team.

I loved observable and wrote a couple of notebooks, it worked great ! I'm gonna try Framework asap !

rad_gruchalski
0 replies
22h11m

Getting started guide: https://observablehq.com/framework/getting-started.

Looks very nice!

omneity
0 replies
20h5m

This is super cool! I’ve been looking for ways to integrate Observable with my blog posts and make them more interactive and engaging. This might just be it.

Thank you and congrats on the release!

nrjames
0 replies
21h15m

I love this and hope that a django-observable package comes along that makes it very easy to integrate and serve these static apps through a larger Django site.

nalgeon
0 replies
1d

And if you want a simpler tool for creating interactive docs, maybe try Codapi:

https://codapi.org/

lloydatkinson
0 replies
23h55m

There are simply too many tools and sites called Observable

johnmorrison
0 replies
19h25m

This is very cool to see! We've been building something very similar (in some regards, very different in others) in https://rysana.com/bundown

Somehow ideas like this emerge in waves seemingly with no coordination, kind of like what happened when Calculus was first invented

(Of course there's a lot of prior art here but it's interesting to see a specific jump towards polyglot single-file Markdown 'apps' and so on happen around the same time)

jeffbee
0 replies
1d1h

I am totally psyched up to try this. Observable (1.0) has been a very effective outlet for my research. There is no other platform that could have hosted it and offered me the quick and easy tools to make persuasive visualizations. I was a little concerned when I heard some people got laid off from the company, but it seems like it is still going.

janice1999
0 replies
22h11m

Are there any similar projects that allow you to plot thousands or tens of thousands of datapoints and also allow the viewer to zoom into time series plots? I'd love to have an in-browser matplotlib replacement. So far I haven't found one. Observable plots look static.

greenie_beans
0 replies
23h36m

this is exciting. love observable.

foolswisdom
0 replies
1d1h

I really like this idea.

fburnaby
0 replies
6h19m

I can't see from the docs what this gives me over a Makefile, Asciidoctor (or pandoc, Jekyll etc), and D3?

dleeftink
0 replies
9h16m

I've set up a codespace starter template that pulls in the required dependencies to start building dashboards with Observable Framework, right inside your browser. You can access the starter template here:

[0]: https://github.com/dleeftink/observable-codespace

coolca
0 replies
19h22m

It is so beautiful

cheptsov
0 replies
10h25m

Congrats on the launch! Excited to hear that you go fully open-source with this! There is a certain need for great visualization tools that enable building apps.

austinpena
0 replies
1d

I'm curious where this fits in relative to streamlit which I use heavily

RobinL
0 replies
1d

Observable is such an incredible, powerful and enjoyable tool. I use it heavily, including to power my blog. I love it, but I've always had a slight concern about needing to rely on the Observable notebook website. So this has really made my day.

To give a sense of the kind of performant, statically hosted interactive content that has only really been within my reach since using Observable, here are some examples:

Highly interactive vis: https://www.robinlinacre.com/visualising_fellegi_sunter/

Editable computations that flow through the document https://www.robinlinacre.com/computing_fellegi_sunter/

Multiple synced representations of data: https://www.robinlinacre.com/prob_bf_mw/ https://www.robinlinacre.com/partial_match_weights/ (halfway down, under 'Understanding the partial match weight chart and waterfall chart')

Of course, there are a huge number of additional examples: https://observablehq.com/trending but I think the whole thing makes much more sense to the end-user when embedded (sadly at which point they don't even know it's Observable!)

CJefferson
0 replies
22h26m

Can anyone with knowledge of both systems compare this to quarto for me?

77ko
0 replies
19h8m

This looks amazing! I really like the clear separation of loading/prepping data, and presenting it.

Some requests:

Add simple examples and more clarity to the publish docs.

I assume most ppl would prefer to deploy via github actions[1], for which the docs currently just link to a complex deploy file - can you please add some more documentation on this, or add an example of the simplest possible deploy file?

Suggestion: is it possible to use an interface (like in vercel) to connect to a github repo and build/publish on changes?

[1]: https://observablehq.com/framework/getting-started#deploying...