HN comments for: Storing knowledge in a single long plain text file

kjksf

25 replies

1d21h

2024-05-21 20:48:49 UTC

I've built a web-based tool for myself that has similar philosophy: https://edna.arslexis.io/

It does support multiple pages but you can use just one.

It has a nifty feature in that you can divide the single file into virtual parts. They just have alternate backgrounds to tell them apart. And each virtual part can have a type for syntax highlighting (plain text, markdown or a programming language).
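As a rough illustration of the "virtual parts" idea, here is a minimal Python sketch that splits one plain text note into typed blocks. The `@@@type` delimiter line and the `split_blocks` name are assumptions made up for this example; they are not necessarily the format Edna or Heynote use on disk.

```python
import re
from typing import List, Tuple

# Hypothetical delimiter: a line consisting of "@@@" plus a block type,
# e.g. "@@@markdown". This is an assumption for illustration only.
DELIMITER = re.compile(r"^@@@(\w+)[ \t]*$", re.MULTILINE)


def split_blocks(note: str) -> List[Tuple[str, str]]:
    """Split a single note into (block_type, content) pairs.

    Text before the first delimiter line is ignored in this sketch.
    """
    blocks = []
    matches = list(DELIMITER.finditer(note))
    for i, match in enumerate(matches):
        start = match.end() + 1  # skip the newline that ends the delimiter line
        end = matches[i + 1].start() if i + 1 < len(matches) else len(note)
        blocks.append((match.group(1), note[start:end].rstrip("\n")))
    return blocks
```

Each block keeps its own type, so an editor could pick a syntax highlighter per block while the file itself stays plain text.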

I've been using it for a few months now and it's my primary note taking / knowledge recording thing.

Even though it's web based, on Chrome you can save notes on disk so it works like a desktop app.

Each note is a plain text file so you can edit them in any text editor.

If you put notes on a shared drive (Dropbox, OneDrive, Google Drive etc.) you can work on notes on multiple computers.

It's also open-source: https://github.com/kjk/edna

porridgeraisin

5 replies

1d21h

2024-05-21 21:18:10 UTC

Heynote is similar

kjksf

4 replies

1d20h

2024-05-21 21:37:52 UTC

Edna is a fork of Heynote with a bunch of changes.

Mostly it supports multiple notes and it's a web app, not a desktop app.

I could build a desktop app but it would offer almost no advantages given that Edna can also save notes on disk (that's how I use it).

You can use Chrome's "Install" feature to make it look and act like a native app (it opens in its own window and acts independently of the browser).

jonatanheyman

3 replies

1d10h

2024-05-22 07:41:52 UTC

Heynote also exists as a web app: https://app.heynote.com/

BOOSTERHIDROGEN

2 replies

1d7h

2024-05-22 10:47:03 UTC

Can I self-host this?

sphars

1 replies

1d5h

2024-05-22 12:57:09 UTC

Looking at the GitHub repo[0], I don't see why you wouldn't be able to host it yourself (extra configuration may be required). In the package.json, there is a script for running the web app `npm run webapp:build`, so I'd assume you could do that and then host the built web app in ./webapp/dist however you'd like.

[0]: https://github.com/heyman/heynote

jonatanheyman

0 replies

21h13m

2024-05-22 21:14:39 UTC

Yep, that should work!

smusamashah

4 replies

1d20h

2024-05-21 22:03:09 UTC

This is great. Any plans to add image support (for screenshots, in my case)? I use OneNote extensively because it's free-form like a whiteboard and allows pasting images (which I often do while debugging).

ralgozino

1 replies

1d2h

2024-05-22 15:38:54 UTC

sounds like Obsidian's canvas: https://obsidian.md/canvas

eMPee584

0 replies

21h37m

2024-05-22 20:50:44 UTC

inkscape and prezi should already do some of this

pcblues

0 replies

1d7h

2024-05-22 10:45:35 UTC

A thing I used OneNote for was easy OCR.

kjksf

0 replies

1d19h

2024-05-21 22:28:35 UTC

Probably not to Edna. It's focused on being fast and lightweight.

I've been thinking about a more featureful markdown note taker that would support images and more.

I've started on such a thing but stalled. It's way more work. The good thing about Edna is that I spent less than a month adding the features I wanted to the Heynote fork.

The current version is at https://notedapp.dev/ but don't use it for actual notes.

porridgeraisin

2 replies

1d21h

2024-05-21 21:19:51 UTC

Just found out this is a fork of heynote! Was looking for one of these with web support

kjksf

0 replies

1d20h

2024-05-21 21:45:47 UTC

Yeah, I loved the simplicity and speed of Heynote and math mode.

I wanted multiple notes and I didn't get why it was made as a desktop app first given that all functionality to implement it is available in a browser (well, Chrome).

So I forked it and added those features.

Been using it daily so it was worth it.

jonatanheyman

0 replies

1d10h

2024-05-22 07:42:50 UTC

Heynote exists as a web app as well :)

https://app.heynote.com/

gitinit

2 replies

1d19h

2024-05-21 23:22:10 UTC

EDIT: Originally I just looked at the website. Looking at the GitHub repo, I see it's a fork, which makes sense (I also didn't notice the other replies!) Either way, it's cool. I'll probably end up using this myself. I was unable to find a way to store notes in a folder or in encrypted Gists though.

This seems nearly identical to Heynote[0], which was also on HN[1]. Even the example blocks share some content with that used as an example in the screenshot on the Heynote homepage (and I think in the app too)

[0]: https://heynote.com/

[1]: https://news.ycombinator.com/item?id=38733968

kjksf

0 replies

1d17h

2024-05-22 01:08:14 UTC

To save on disk you must use Chrome or Edge because only they support the necessary APIs.

Initial note storage is in localStorage. To switch to disk: right-click for context menu, `Notes storage` / `Move notes from browser to directory`.

Then choose a directory on disk and we will do a one-time migration from localStorage => disk.
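The one-time migration described above can be sketched in Python, with a plain dict standing in for the browser's localStorage. The `migrate_notes` name and the `.txt` suffix are assumptions for illustration; Edna itself does this in JavaScript through the browser's file APIs.

```python
from pathlib import Path
from typing import Dict, List


def migrate_notes(store: Dict[str, str], directory: Path) -> List[str]:
    """Copy every note from a key/value store into `directory`.

    One plain text file is written per note. Files that already exist
    are left untouched, so running the migration twice is harmless.
    Returns the sorted names of the files written on this run.
    """
    directory.mkdir(parents=True, exist_ok=True)
    written = []
    for name, content in store.items():
        path = directory / f"{name}.txt"  # hypothetical naming scheme
        if path.exists():
            continue  # never overwrite a note that is already on disk
        path.write_text(content, encoding="utf-8")
        written.append(path.name)
    return sorted(written)
```

Because each note ends up as an ordinary file, any text editor or sync tool can work with the directory afterwards.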

You can then switch to another directory (some apps call it a "workspace"). Because why not.

Encryption is probably the next feature I'll add because I want to store secrets in my notes and I'll feel better if those notes are encrypted.

More docs: https://edna.arslexis.io/help

Multiple notes is a pretty big addition. I loved the concept and implementation of blocks in Heynote but a single note was a deal breaker for me.

I've also added some UI, such as a right-click context menu for discoverability and the ability to enable spell checking.

And I'm really trying to optimize for speed of use, including speed of switching between notes.

For example you can assign Alt + 0 .. Alt + 9 as note quick access shortcuts.

By default I create 3 notes: scratchpad, daily journal and inbox and they get Alt + 1, Alt + 2, Alt + 3 quick access shortcuts but you can assign them to any page you want.

Brajeshwar

0 replies

1d15h

2024-05-22 02:45:17 UTC

Jonatan Heyman produces some pretty awesome and useful apps/tools. One should check out his work - https://heyman.info

canadiantim

2 replies

1d21h

2024-05-21 21:25:05 UTC

How does the saving notes on disk work? You mean just downloading it? Or is the content synced? If so how does that work?

kjksf

1 replies

1d20h

2024-05-21 21:34:02 UTC

Chrome implements APIs that allow accessing files on the disk.

So Edna either stores notes in localStorage or in a directory of your choosing on disk.

In Edna you can right-click for context menu to switch between localStorage and disk.

If you ask: "how do the browser APIs work", you can look at https://github.com/kjk/edna/blob/main/src/fileutil.js

Basically, there's `window.showDirectoryPicker()` to ask the user for permission to access a directory (either read-only or read-write). Then, using that directory handle, you can list files, read and write files, or create new files.

wernsey

0 replies

1d10h

2024-05-22 08:05:20 UTC

Oh, man, many years ago I used Tiddlywiki (and later Wiki-On-A-Stick) as a browser-based note taking app, but stopped using it because the API they used to save the file to disk got deprecated and removed.

History not repeating but rhyming, I suppose...

Anyway, thanks for this. I've just added it to my bookmarks.

desio

1 replies

1d10h

2024-05-22 07:39:25 UTC

Looks like that's built on the CodeMirror framework? Any good resources you could share on wiring up a custom language and view? I've managed to kinda get something working with lezer but the docs aren't great and I want to set up some pretty specific behaviour in the view with folding and validation etc.

kjksf

0 replies

1d7h

2024-05-22 10:48:43 UTC

Yes, Codemirror.

What I know about Codemirror I mostly learned by reading other people's code so I suggest that.

Specifically code of silverbullet: https://github.com/silverbulletmd/silverbullet/tree/main/web... (and a few other directories there).

It implements a very advanced Markdown mode, with lots of code to learn from.

FredPret

1 replies

1d21h

2024-05-21 21:04:09 UTC

Very cool!

I love the math block. Is there a way to reference a variable elsewhere, or fetch data online? Then you could build a little personal dashboard with it.

kjksf

0 replies

1d20h

2024-05-21 21:41:56 UTC

Not at the moment.

I was thinking about making math more like a mode, i.e. making it available in every block type, as opposed to its own block type.

Then it would be active in plain text, markdown and even code blocks.

As to data fetching - falls a bit outside of scope.

zcw100

0 replies

1d7h

2024-05-22 10:59:57 UTC

Looks like a CLI version of a tiddlywiki

rakoo

12 replies

1d19h

2024-05-21 22:47:51 UTC

Is this a serious article ? The state of the art of knowledge ?

There are at least two (2) existing prior art implementations that have done this for years, only better (as in, with better tools):

- recutils: https://www.gnu.org/software/recutils/

- ndb: https://9fans.github.io/plan9port/man/man7/ndb.html

Please, developers of everywhere, I beg you: please learn what came before you before reinventing the wheel, only triangular this time. Please take the time to appreciate that if it's so obvious maybe it's because of your ignorance and not your genius.

stavros

7 replies

1d19h

2024-05-21 23:28:01 UTC

Maybe the author just independently came up with this, thought it was cool, and wanted to share?

I don't know if "please do a thorough literature review before showing me things" is the right sentiment here.

vineyardmike

2 replies

1d18h

2024-05-22 00:04:13 UTC

Considering the article has a “prior art” section, I assume a literature review would be appropriate.

My confidence is shaken considering the sparse “prior art” section links to Apple M1 as an example of “fast file systems”.

tambourine_man

0 replies

1d18h

2024-05-22 00:15:30 UTC

Yeah, this is really weird.

breck

0 replies

1d17h

2024-05-22 01:03:55 UTC

It wasn't clear why I mentioned M1. I updated that. Thank you.

https://github.com/breck7/breckyunits.com/commit/61792237c0b...

A number of things have to be fast for this system to be enjoyable to use (at scale) and before the M1 no personal machine I ever tried came close.

rakoo

1 replies

1d18h

2024-05-21 23:59:44 UTC

I'm not sure the paper-like presentation of the article shows that the author was in pure discovery mode, eager to share something new and interesting.

My message is an echo to earlier comments of earlier posts that talked about a similar point: nothing is ever new, everything has already been done before. If we tell ourselves we're engineers, we should be studying what came before in order to prove that the new thing is indeed better.

That being said, recutils is the standard method of recording data in GNU, and ndb is the standard method of configuring stuff in Plan 9, a system that any proponent of UNIX mindset should know about. I'm not exactly talking about obscure stuff here.

chipdart

0 replies

1d13h

2024-05-22 05:28:12 UTC

> I'm not sure the paper-like presentation of the article shows that the author was in pure discovery mode, eager to share something new and interesting.

If the author was following a paper-like presentation, the author somehow skipped the section listing relevant prior work. This is something every single journal enforces, as researching prior work is the very first step any author does when they come up with something.

rkangel

0 replies

1d5h

2024-05-22 12:34:01 UTC

> Maybe the author just independently came up with this, thought it was cool, and wanted to share?

Except that the title ("A New Way to Store Knowledge") is leaning heavily on NEW.

chipdart

0 replies

1d13h

2024-05-22 05:24:37 UTC

> Maybe the author just independently came up with this, thought it was cool, and wanted to share?

That's perfectly fine, but that's beside the point.

The point is that between coming up with something and implementing it, there should be a step to check if anyone already did something similar.

The whole point of researching prior work is to a) avoid wasting time reinventing the wheel, b) leverage prior work to improve your own ideas, and c) make better use of your time by making meaningful contributions instead of taking a risk on whether you're ripping off someone else's work.

That's the absolute basic standard on scientific publishing, for example. If you pick up any paper at all, you'll notice that right after the introduction and summary you get a bibliographical review listing any relevant work that your peers already contributed. When anyone submits a paper, the reviewers can and outright do reject your submission if it fails to adequately contextualize the paper with regards to prior art and related work. One of the points is to ensure the author is not wasting everyone's time with a novel approach to the wheel.

More importantly, if an author fails to know what's already there, how can they tell their idea is any good?

breck

3 replies

1d19h

2024-05-21 23:21:56 UTC

Thank you for bringing up recutils and ndb. I had seen them years ago but didn't make the connection when writing this paper. But there are some great connections, and I will definitely be updating the post with a section and links to them.

I am reading through the source and will have more to say soon. If anyone has any links to massive plain text datasets based on these (or other similar tools), I would appreciate more pointers.

I can tell you now (subject to change), based on my preliminary read through the source, that the two systems you mentioned missed some highly important details that I have presented in my paper, with order-of-magnitude impacts. Not to discredit them at all; rather, I think my work gives them credit, in that they were on the right track, and we just have the benefit of some recent innovations, and get to stand on their shoulders (and the shoulders of others).

Edit:

I have updated the paper with a reference to Recutils. Thanks rakoo! https://github.com/breck7/breckyunits.com/commit/71b706d296e...

The added text:

    GNU Recutils^recutils deserves credit as the closest precursor to our system. If Recutils were to adopt some designs from our system it would be capable of supporting larger databases.
     https://www.gnu.org/software/recutils/

    ^recutils: GNU Recutils: Jose E. Marchesi
     https://www.gnu.org/software/recutils/
    - Recutils and our system have debatable syntactic differences, but our system solves a few clear problems described in the Recutils docs:
     - "difficult to manage hierarchies". Hierarchies are painless in our system through nested parsers, parser inheritance, parser mixins, and nested measurements.
     - "tedious to manually encode...several lines". No encoding is needed in our system thanks to the indentation trick.
     - In Recutils comments are "completely ignored by processing tools and can only be seen by looking at the recfile itself". Our system supports first class comments which are bound to measurements using the indentation trick.
     - "It is difficult to manually maintain the integrity of data stored in the data base." In our system advanced parsers provide unlimited capabilities for maintaining data integrity.
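For readers who have not seen the format being compared against, here is a minimal Python sketch of reading recfile-style records. It covers only a simplified subset of the Recutils syntax (`Name: value` fields, blank lines between records, `#` comment lines, `+` continuation lines) and is not a replacement for the real tools; the `parse_recfile` name is made up for this example.

```python
from typing import Dict, List


def parse_recfile(text: str) -> List[Dict[str, List[str]]]:
    """Parse a simplified subset of the recfile format.

    Each record maps a field name to the list of its values, because a
    field may appear more than once in the same record.
    """
    records: List[Dict[str, List[str]]] = []
    current: Dict[str, List[str]] = {}
    last_field = None
    for line in text.splitlines():
        if line.startswith("#"):
            continue  # comment lines are skipped, as the quoted text notes
        if not line.strip():
            if current:  # a blank line ends the current record
                records.append(current)
            current, last_field = {}, None
            continue
        if line.startswith("+") and last_field is not None:
            continuation = line[1:]
            if continuation.startswith(" "):
                continuation = continuation[1:]
            current[last_field][-1] += "\n" + continuation
            continue
        name, _, value = line.partition(":")
        last_field = name.strip()
        current.setdefault(last_field, []).append(value.strip())
    if current:
        records.append(current)
    return records
```

The continuation-line handling is the "several lines" encoding the quoted comparison refers to, and the comment branch shows why comments never reach processing tools.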

rakoo

2 replies

1d18h

2024-05-22 00:00:41 UTC

If your research has added value over the existing state of the art I would love to read more about it !

breck

1 replies

1d17h

2024-05-22 01:07:45 UTC

Thanks to your pointer, I was able to explain a bit more about the advances over the SOTA. Thank you! This is the speed at which peer review should happen.

If I'm lucky, I'll wake up tomorrow to someone else pointing out another precursor I overlooked.

rakoo

0 replies

1d7h

2024-05-22 11:14:34 UTC

Thanks for the comparison! I see there are some advances compared to recutils; those would benefit from being put up front!

mushufasa

12 replies

1d22h

2024-05-21 20:00:35 UTC

The title, "A New Way to Store Knowledge", indicates this is a joke.

happytoexplain

6 replies

1d22h

2024-05-21 20:19:08 UTC

Is there some context you're leaving unsaid?

mushufasa

5 replies

1d21h

2024-05-21 20:56:07 UTC

a plain text file is the oldest idea for storing knowledge. see unix philosophy: "Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface."

sprobertson

4 replies

1d21h

2024-05-21 21:26:04 UTC

Did you read past the title? The main point of the article is a syntax for knowledge bases - plain text is just an implementation detail.

chipdart

3 replies

1d12h

2024-05-22 05:36:13 UTC

If you take out plain text from this presentation, what's left? The tree structure? The log aspect? In order to claim any of this is remotely novel, you have to first ignore the whole body of work built around information systems.

breck

2 replies

1d6h

2024-05-22 11:38:14 UTC

Maybe you missed the link in the "Evidence" section to a 7 year open source project containing 172,162 lines of code, and a compiler compiler.

;)

chipdart

1 replies

1d6h

2024-05-22 12:02:36 UTC

That doesn't answer my question.

breck

0 replies

1d4h

2024-05-22 13:37:52 UTC

> If you take out plain text from this presentation, what's left? The tree structure? The log aspect? In order to claim any of this is remotely novel, you have to first ignore the whole body of work built around information systems.

Thank you for the feedback. I've updated the paper with some more links.

The language in which the measures are written (currently called Grammar; I will likely rename it to something like Parsers) is quite advanced.

The improvements over Recutils, the closest precursor I am aware of, have now been added.

The PLDB ScrollSet is now about 500,000 cells of information. Each cell is strongly typed and fully auditable by git. There is a high amount of signal in that dataset. It is an intelligent set of weights, and continually getting more intelligent. And it is read at runtime as a single plain text file and compiled to a single CSV (or tsv, json, etc).
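To make the "single plain text file compiled to a single CSV" step concrete, here is a minimal Python sketch. It assumes a heavily simplified syntax (concepts separated by blank lines, one `measure value` pair per line) and made-up sample rows; the function names are hypothetical and this is not the actual ScrollSet implementation, which also handles typing and nesting.

```python
import csv
import io
from typing import Dict, List


def parse_concepts(text: str) -> List[Dict[str, str]]:
    """Read blank-line-separated concepts of 'measure value' lines."""
    concepts = []
    for chunk in text.strip().split("\n\n"):
        concept = {}
        for line in chunk.splitlines():
            if not line.strip():
                continue
            # The first space separates the measure name from its value.
            measure, _, value = line.partition(" ")
            concept[measure] = value
        if concept:
            concepts.append(concept)
    return concepts


def to_csv(concepts: List[Dict[str, str]]) -> str:
    """Compile concepts to CSV, one column per measure seen anywhere."""
    columns: List[str] = []
    for concept in concepts:
        for measure in concept:
            if measure not in columns:
                columns.append(measure)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(concepts)  # missing measures become empty cells
    return buffer.getvalue()
```

Concepts that lack a measure simply produce an empty cell, which is how a sparse plain text file can still map onto a rectangular table.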

All from using the system documented in the paper (and the advanced language for Parsers).

If you can point me to a similar database of similar scale anywhere in the world (plain text based, >10e5 size, git backed, strongly typed, hierarchical and graphical), I would be grateful as I might learn something.

andrepd

1 replies

1d20h

2024-05-21 21:54:31 UTC

It must be, right?? The whole thing reads like a satire of the exact kind of thing HN would fawn over. Just look at the current comments!

SrslyJosh

0 replies

1d16h

2024-05-22 01:51:34 UTC

I'm not sure myself. I didn't want this to be the second comment on the submission so I'll say it now: I got TimeCube vibes from this.

robertclaus

0 replies

1d14h

2024-05-22 04:18:00 UTC

I hope so...

m463

0 replies

1d21h

2024-05-21 21:05:32 UTC

I suspect it refers to Wolfram's "A New Kind of Science".

I don't see it as a this-is-all-a-joke thing though, more tongue in cheek.

also I think one-big-text-file has a certain simplicity, like everything-is-a-file on unix (or more properly plan9)

knighthack

0 replies

1d19h

2024-05-21 23:17:07 UTC

The moment I read the text I knew the title was satirical.

You know it is when it starts like this: "...All tabular knowledge can be stored in a single long plain text file. The only syntax characters needed are spaces and newlines."

That's fundamentally the simplest way of storing text. And it's nothing new, yet people have long ignored that simplicity for much more complicated ways of storing text.

boomlinde

7 replies

1d8h

2024-05-22 10:27:40 UTC

It feels like you left a chapter or two out. You mention in the citations that "Hierarchies are painless in our system through nested parsers, parser inheritance, parser mixins, and nested measurements." Nothing else in the article gives any hint as to what those things are or how your system implements them except nested measurements. It's not at all clear what a parser is in your system. It is however clear that what you call "parsers" aren't parsers. Is the list of "parsers" a schema definition?

Overall it seems like your ideas would make more sense if you used more widely adopted language to describe them. "Concepts" are records, "measurements" are fields.

breck

6 replies

1d7h

2024-05-22 11:23:52 UTC

> It feels like you left a chapter or two out.

I agree with you. More details will come out over time but I wanted to keep yesterday's paper a single page.

> You mention in the citations that "Hierarchies are painless in our system through nested parsers, parser inheritance, parser mixins, and nested measurements." Nothing else in the article gives any hint as to what those things are or how your system implements them except nested measurements. It's not at all clear what a parser is in your system.

Below is a link to a web IDE we built. You can see parsers (on the left), and concepts (on the right). Nested parsers and parser inheritance are demonstrated. Mixins are not in that branch yet. Ignore the "cells" stuff at top (that turned out to be an unneeded division between line parsers and word parsers).

https://jtree.treenotation.org/designer#url%20https%3A%2F%2F...

> Overall it seems like your ideas would make more sense if you used more widely adopted language to describe them. "Concepts" are records, "measurements" are fields.

Yes, concepts often map to records or rows. Measures to fields or columns. Measurements to the cells in a spreadsheet.

There are reasons for my terminology, that should become clearer over time.

the_duke

5 replies

1d5h

2024-05-22 13:06:56 UTC

From a quick scan, it sounds like you re-invented a lot of the concepts of semantic data, just with different terminology and a different text format. (RDF, triples, ...)

0x445442

2 replies

1d4h

2024-05-22 13:41:02 UTC

I wish there was a de facto/canonical site that housed free papers that people could go search before embarking on these types of efforts. Perhaps there is, but when I attempt these types of searches I get directed to paywalled ACM-type links or GitHub "Papers We Love" type links.

mynotaccount

0 replies

1h56m

2024-05-23 16:31:44 UTC

sci-hub

breck

0 replies

1d4h

2024-05-22 13:43:18 UTC

You might enjoy https://pldb.io/, which is a paywall free, open source, public domain, database you can browse completely locally, with information on all of these kinds of prior languages.

breck

1 replies

1d4h

2024-05-22 13:41:45 UTC

It would certainly be fair to add RDF/triples/semantic web, to prior work. I spent many years exploring that stuff.

We are aiming at roughly the same problem. Our implementation has solved some important details.

FabHK

0 replies

1d2h

2024-05-22 15:59:12 UTC

Might be worth highlighting some of the problems solved (particularly those that earlier ideas haven't).

thorncorona

3 replies

1d20h

2024-05-21 22:24:50 UTC

this is solved better by obsidian

breck

1 replies

1d19h

2024-05-21 23:24:24 UTC

Can you explain why?

atrus

0 replies

1d16h

2024-05-22 01:58:11 UTC

Obsidian is a bit closer to an unstructured TreeBase than this single file TreeBase imo.

Sarky

0 replies

1d12h

2024-05-22 05:36:28 UTC

Indeed. Markdown files separated into folders. I organize them into topics. Easy to search with a lot of possible customizations. And even a setup without customizations is visually pleasing and functional.

samatman

3 replies

1d2h

2024-05-22 16:06:13 UTC

In case you would like to be less (or more) confused, this is an application of Tree Notation, by the same author https://treenotation.org/

I suffer from the same flaw as the author, a tendency towards grandiosity and fervor in describing my good ideas. So I'm in a good position to advise that he knock it off: people don't like that, and it will keep them from using your stuff even if it's good.

Which it might be, actually. The extreme simplicity of the foundation is laudable.

breck

2 replies

1d1h

2024-05-22 17:03:26 UTC

The brevity and grandiosity is not for marketing the idea, it is so the idea can be attacked. I don't want to waste my working hours building a factory out of the wrong materials. If I've made a mistake, I want to know.

If the idea is truly good, the products built on the idea should do just fine.

samatman

1 replies

1d1h

2024-05-22 17:27:47 UTC

It's your project to run as you please, of course.

My guess is that the attacks you draw will skip any basis in technical merit and land directly on the tone, proceeding on an emotional basis. We have an n=1 here with plenty of that behavior on display.

You'd like to believe that someone proposing Tree Notation for a project wouldn't be dismissed with "isn't that, like, the YAML for TimeCube guy?". But this is, in large part, how the world actually functions.

breck

0 replies

2024-05-22 18:27:03 UTC

It's been a slog, but I'm very happy with how the ideas in Scroll (for all intents and purposes, Tree Notation and Grammar are Scroll; 99% of usage is Scroll) and PLDB have evolved.

I don't mind the pushback.

If it wasn't for the pushback against Tree Notation, I never would have started PLDB. ("Learn to research properly", one commenter once said. And he was right. I think PLDB is the proper way to do research).

It's much nicer to get pushback than crickets. That means people are generously giving their time to consider the ideas.

Crickets is the worst. I should know, I mostly get crickets.

TZubiri

3 replies

1d12h

2024-05-22 05:56:37 UTC

XML is too bulky, let's do CSV. CSV is too limited, too strongly typed, let's do JSON. JSON is too heavily punctuated, let's do YAML. YAML is too yamly, let's do this instead.

wruza

1 replies

1d10h

2024-05-22 07:40:58 UTC

Let’s just do json5 after json. https://json5.org/

corn13read2

0 replies

1d9h

2024-05-22 09:22:31 UTC

I'll wait for version 6

Quothling

0 replies

1d3h

2024-05-22 15:11:26 UTC

Eventually everything becomes toml.

All jokes aside, I think the equivalent for this would be markdown not xml/csv/json/yaml.

runjake

2 replies

1d21h

2024-05-21 20:55:02 UTC

Caveat from article:

  > For pragmatic reasons, it is best to split your data into 1 file per concept and combine concept files at runtime.

pizzafeelsright

1 replies

1d20h

2024-05-21 22:15:51 UTC

All separate things should be in different files.

And files are just key/values anyway.
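The caveat quoted from the article (one file per concept, combined at runtime) can be sketched in a few lines of Python. The `.txt` suffix, the blank-line separator between concepts, and the `combine_concepts` name are assumptions for illustration, not details taken from the article.

```python
from pathlib import Path


def combine_concepts(directory: Path, suffix: str = ".txt") -> str:
    """Join every concept file in `directory` into one long text.

    Files are read in sorted name order so the result is deterministic,
    and concepts are separated by a single blank line.
    """
    parts = []
    for path in sorted(directory.glob(f"*{suffix}")):
        body = path.read_text(encoding="utf-8").strip()
        if body:  # skip empty files rather than emit empty concepts
            parts.append(body)
    return "\n\n".join(parts) + "\n"
```

Keeping one file per concept makes version control diffs small, while the combined string is what a parser would actually consume.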

egeozcan

0 replies

1d9h

2024-05-22 08:29:01 UTC

I wouldn't say "should" but I agree.

A file is a very abstracted concept and it technically can mean a lot of different things depending on the file system.

However, it is a very good abstraction that's nearly universal and practically there is little to no reason not to use them to organize things.

fwip

2 replies

1d20h

2024-05-21 22:14:57 UTC

Not sure why the "fast filesystem" links to the M1 processor.

breck

1 replies

1d17h

2024-05-22 01:01:53 UTC

You are right, that is not clear. I've added a note and link (https://github.com/breck7/breckyunits.com/commit/61792237c0b...)

"The M1 laptop was the first consumer machine I tried where the performance of this system wasn't abysmal." - https://breckyunits.com/building-a-treebase-with-6.5-million...

Thank you!

fwip

0 replies

1d2h

2024-05-22 15:35:28 UTC

Thanks for the explanation. :)

cpr

2 replies

1d22h

2024-05-21 20:12:23 UTC

I did this for decades (using Emacs) but finally gave up and am using Notes.

ukuina

0 replies

1d14h

2024-05-22 03:51:43 UTC

Interestingly, I am moving more towards plain text notes because they are easier to ingest for LLMs.

breck

0 replies

1d1h

2024-05-22 17:04:59 UTC

Any useful tricks or techniques you picked up along the way?

AtlasBarfed

2 replies

1d21h

2024-05-21 21:17:33 UTC

There's two hard problems in computer science: name spacing and caching.

This ... Is namespace hell, and if you squint at the caching problem, it's actually an indexing problem, which is also related to this.

adtac

1 replies

1d20h

2024-05-21 21:45:15 UTC

The aphorism typically says cache invalidation is hard. Not because you don't know what index to invalidate but because it's hard to invalidate the thing at the right time.

Caching itself is quite easy, just ask the designers of speculative execution at Intel :)

samatman

0 replies

1d2h

2024-05-22 15:47:47 UTC

If a cache doesn't have cache invalidation, it isn't a cache, it's a database.

zaik

1 replies

1d8h

2024-05-22 10:11:03 UTC

Too many people don't know about Wikidata.

breck

0 replies

1d6h

2024-05-22 11:32:36 UTC

Can you elaborate?

wodenokoto

1 replies

1d12h

2024-05-22 06:25:32 UTC

I don’t get it. How do I know that something is a data definition and not just more data?

Is “>” a special character together with space and new lines? He calls it a trick, why?

How do I add data with spaces and new lines?

Is “Parser” a keyword that you postfix to names of values? He writes “idParser” and then has a value in each observation that is named “id”

breck

0 replies

1d6h

2024-05-22 11:30:49 UTC

> I don’t get it. How do I know that something is a data definition and not just more data?

In our ScrollSet implementation, a measure definition (what you call a "data definition") is a subset of a parser. You will know something is a measure definition when you see a line starting with a word with a "Parser" postfix, and nested inside that definition is a line like "extends abstractMeasureParser".
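The rule described above can be approximated with a short Python sketch: a top-level line whose first word ends in "Parser", with an indented `extends abstractMeasureParser` line beneath it. This is a simplification made up for illustration (for example, it ignores parsers that extend another measure parser indirectly), and `abstractCommentParser` in the usage test is a hypothetical name.

```python
from typing import List


def find_measure_definitions(text: str) -> List[str]:
    """Return the names of parsers that directly extend abstractMeasureParser."""
    measures = []
    current = None  # name of the top-level parser we are currently inside
    for raw in text.splitlines():
        if not raw.strip():
            continue
        if not raw.startswith(" "):
            # A top-level line: it opens a definition only if its first
            # word carries the "Parser" postfix.
            word = raw.split(" ")[0]
            current = word if word.endswith("Parser") else None
        elif current and raw.strip() == "extends abstractMeasureParser":
            measures.append(current)
            current = None  # record each definition only once
    return measures
```

Lines such as `id python` are top-level but lack the postfix, so they are treated as data rather than definitions.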

Below is a link to a web IDE we built. You can see all of the measure definitions currently powering PLDB on the left. On the right, you can see a concept ("more data", in your terms).

https://jtree.treenotation.org/designer#url%20https%3A%2F%2F...

> He calls it a trick, why?

The current term of art is "off-side rule" (https://en.wikipedia.org/wiki/Off-side_rule). I never liked that term. I call it the indentation trick. But I am referring to the off-side rule.
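A minimal Python sketch of the off-side rule, assuming one space per indentation level and no tabs: each line becomes a child of the nearest preceding line with smaller indentation. The `Node` class and `parse_tree` function are names invented for this example, not part of the author's code.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    line: str
    children: List["Node"] = field(default_factory=list)


def parse_tree(text: str) -> Node:
    """Build a tree from indentation alone (spaces only, blank lines skipped)."""
    root = Node("")
    stack = [(-1, root)]  # (indent, node) pairs for the currently open ancestors
    for raw in text.splitlines():
        if not raw.strip():
            continue
        indent = len(raw) - len(raw.lstrip(" "))
        node = Node(raw.strip())
        # Close every open node that is at the same depth or deeper.
        while stack[-1][0] >= indent:
            stack.pop()
        stack[-1][1].children.append(node)
        stack.append((indent, node))
    return root
```

No brackets, quotes, or escape characters are needed; nesting is carried entirely by leading spaces and newlines.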

h2odragon

1 replies

1d22h

2024-05-21 19:44:51 UTC

plain text has so many advantages.

then you need some syntax for the strictures of your use case.

and /etc is reborn.

Terr_

0 replies

1d10h

2024-05-22 07:43:24 UTC

Then you want to be able to access old copies, and RCS [0] is reborn...

[0] https://en.wikipedia.org/wiki/Revision_Control_System

fellowniusmonk

1 replies

1d22h

2024-05-21 20:05:32 UTC

I'm so excited for this kind of work. I think there is an alternate history where EMACS or an EMACS equivalent became the dominant OS but the onboarding process was too onerous, and the community has been focused on technical integrations instead of integrating a larger less technical community of people into a sane but simpler default.

With AI I think interfaces will further bifurcate between "users" and "creators" and pretty much all of our "desktop" ui paradigms will be consigned to history in favor of structured collaborative text interfaces.

pama

0 replies

1d13h

2024-05-22 05:17:27 UTC

I thought I wasn't alone, but perhaps I live in a sparsely populated alternative history where Emacs gets simpler over time. Once you know the basics they don't change. Some more advanced tools gradually simplify or improve but it takes years. Various ideas are explored by users around the globe and the simplest and best ones survive: we now have magit and eglot and treesitter support. And org, but also markup. The shells are true shells with unlimited context and full access to the OS. Similarly for the REPLs. The only thing I miss is changing tools all the time and losing history, which felt like a refreshing excuse to start over when I was younger; these days I don't have the patience and time.

dflock

1 replies

1d21h

2024-05-21 20:42:42 UTC

This is _so much more_ than the title suggests - this is not about making notes in text files.

breck

0 replies

1d17h

2024-05-22 01:04:56 UTC

You get it ;).

bitwize

1 replies

23h4m

2024-05-22 19:23:41 UTC

Reminds me of the Canon Cat. You put a disk in and it would store everything you typed as a single, long document on the disk. You could put dividers in the document to separate sections. Parsers in the Cat's system software allowed for specific actions to be taken on parts of the document; for example, tabular numeric data could be identified and spreadsheet-like functionality could be enabled over that data. The whole document was searchable via a pair of LEAP keys which, when held down while typing, would search for what was typed. Jef Raskin of Macintosh fame was responsible for this UI.

https://en.wikipedia.org/wiki/Canon_Cat

breck

0 replies

22h56m

2024-05-22 19:31:28 UTC

This is fascinating. I don’t think I’ve seen this before. Thank you.

akasakahakada

1 replies

1d10h

2024-05-22 08:01:33 UTC

Educate me if this is not reinventing the wheel of YAML, TOML, XML, etc.

enriquto

0 replies

1d7h

2024-05-22 10:41:42 UTC

One would rather say that YAML, TOML, XML, etc. are wheel reinventions of plain text files.

TZubiri

1 replies

1d12h

2024-05-22 05:52:59 UTC

When you still haven't emerged from the COVID pandemic and your shutdown project has started to take root deep in your mind.

It reminds me of that scene from The Shining where the character writes the same sentence over and over again.

breck

0 replies

1d6h

2024-05-22 11:41:20 UTC

All text and no syntax makes Breck a dull boy. All text and no syntax makes Breck a dull boy. All text and no syntax makes Breck a dull boy.

082349872349872

1 replies

1d7h

2024-05-22 11:03:13 UTC

tangent: http://www-formal.stanford.edu/jmc/elephant/elephant.html

breck

0 replies

1d6h

2024-05-22 11:32:23 UTC

I know what my rabbit hole of the day will be now.

Thanks! (https://github.com/breck7/pldb/commit/83ba14454ed80fa682c85d...)

zitterbewegung

0 replies

1d1h

2024-05-22 16:37:53 UTC

This reminds me of Org Mode[1]

[1] https://orgmode.org

throwthrowuknow

0 replies

1d20h

2024-05-21 21:57:16 UTC

Facts

smarm52

0 replies

20h2m

2024-05-22 22:25:32 UTC

(Excuse me if this is obvious, I have limited time, but this article grabbed my attention; Fascinating).

How do you handle writes? It seems like an interrupted write process could corrupt a section of text, which could be difficult to recover from.
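One common guard against an interrupted write, sketched here on the assumption that the whole file is rewritten on each save (the article does not say how it saves): write to a temporary file in the same directory, flush it to disk, then rename it over the original. The function name `atomic_write_text` is made up for illustration.

```python
import os
import tempfile


def atomic_write_text(path: str, text: str) -> None:
    """Replace `path` with `text` so readers see the old or the new file, never a partial one."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-", suffix=".txt")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(text)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before the rename
        os.replace(tmp_path, path)  # swaps the new file in, overwriting the old one
    except BaseException:
        # Remove the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If the process dies before `os.replace`, the original file is untouched and only a stray temp file is left behind.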

Given the example at "breckyunits.com", I don't see hashing information associated with each item.

Are you depending on git to prevent such errors from corrupting individual items? If so, then I would be concerned about git's propensity for data corruption [1, 2, 3, 4].

I wonder if adding some ZFS-like hashing and integrity checks would be helpful. Then, as it's one big file, it seems to act like a TAR archive [5], where you append to the end, but have to scan through the previous content to find what you want. If that's the case, then it may be viable to do copy-on-write [6], where information is never modified, but instead referenced with a key, and later modifications supersede older versions.

(Again apologies if this is redundant, I just had the thought and had to get it down. XD)

[1] https://superuser.com/questions/1253830/does-git-prevent-dat... [2] https://superuser.com/questions/1635797/what-if-git-reposito... [3] https://stackoverflow.com/questions/tagged/corruption?tab=Fr... [4] https://www.reddit.com/r/git/comments/oq9wph/power_outage_in... [5] https://en.wikipedia.org/wiki/Tar_(computing) [6] https://en.wikipedia.org/wiki/Copy-on-write
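To make the append-only idea above concrete, here is a minimal sketch, not how Scroll or the linked site actually stores anything. It assumes one JSON record per line (a format chosen only for the example), each carrying a SHA-256 of its own key and text. The names `append_record` and `load_latest` are hypothetical. Records are never modified; a later record with the same key supersedes the earlier one, and a torn or tampered line is skipped on read.

```python
import hashlib
import json


def append_record(path: str, key: str, text: str) -> None:
    """Append one record as a JSON line carrying a SHA-256 of its own content."""
    digest = hashlib.sha256(f"{key}\n{text}".encode("utf-8")).hexdigest()
    line = json.dumps({"key": key, "text": text, "sha256": digest})
    with open(path, "a", encoding="utf-8") as f:
        f.write(line + "\n")


def load_latest(path: str) -> dict[str, str]:
    """Scan the whole file; later valid records supersede earlier ones with the same key."""
    latest: dict[str, str] = {}
    with open(path, encoding="utf-8") as f:
        for raw in f:
            try:
                rec = json.loads(raw)
                key, text, stored = rec["key"], rec["text"], rec["sha256"]
            except (json.JSONDecodeError, KeyError, TypeError):
                continue  # torn or malformed line, e.g. from an interrupted write
            digest = hashlib.sha256(f"{key}\n{text}".encode("utf-8")).hexdigest()
            if digest == stored:  # ignore records whose content no longer matches
                latest[key] = text
    return latest
```

As with a TAR archive, finding the current version of an item means scanning everything before it, so this trades read speed for simple, append-only writes.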

rifty

0 replies

12h52m

2024-05-23 05:35:56 UTC

I feel like the focus on language appearance is taking too much precedence over covering other aspects like parser composition. For example mentioning the 'indention trick' feels like a deviation and distraction from the actual point you're trying to convey. The idea here isn't actually dependent on the exact presentation style of the format...

To comment on the appearance though, since it seems a focus nonetheless... I appreciate the ideal of syntax sparseness, but in this case I feel like it loses visual salience in plaintext after looking at some of the .scroll files. It's difficult to recognize the shape and proportion of what the content will be when rendered. The applied meta content lacks visual differentiation in plaintext from the content itself. I don't think total sparseness should be the sole goal here; Markdown for example isn't strong in plaintext just because it is syntactically sparse, but because it is sparse in tandem with not supporting applying extensible meta content to content - but this does.

racional

0 replies

1d5h

2024-05-22 12:33:01 UTC

See also - Ask HN: How do you store your knowledge? - https://news.ycombinator.com/item?id=40131689

pimlottc

0 replies

1d4h

2024-05-22 14:11:41 UTC

I have no idea what the visualization means.

kriro

0 replies

1d4h

2024-05-22 14:19:35 UTC

It is surprisingly common for very good bug bounty hunters to rely on stuff.txt as their major "knowledge base". At least I've heard this from a couple of high earning guys in interviews. They usually just grep through it or roughly remember where things are. I was quite surprised to hear that.
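For anyone who wants the "just grep through it" workflow from a script, a rough equivalent of `grep -i -n -C 1` might look like the sketch below. The function name `grep_notes` and the `stuff.txt` path are placeholders, not anything the interviewees described.

```python
import re
from pathlib import Path


def grep_notes(path: str, pattern: str, context: int = 1) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs for matches plus `context` lines around each."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    regex = re.compile(pattern, re.IGNORECASE)
    keep: set[int] = set()
    for i, line in enumerate(lines):
        if regex.search(line):
            lo = max(0, i - context)
            hi = min(len(lines), i + context + 1)
            keep.update(range(lo, hi))  # overlapping context windows merge naturally
    return [(i + 1, lines[i]) for i in sorted(keep)]
```

Calling `grep_notes("stuff.txt", "ssrf")` would return every line mentioning SSRF, each with one line of context on either side and 1-based line numbers.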

jhoechtl

0 replies

1d2h

2024-05-22 15:46:25 UTC

Glad recutils got a mention. I find it superior to the proposed concept. Sadly recutils never caught on.

igtztorrero

0 replies

1d4h

2024-05-22 14:06:18 UTC

I like the blog site, it's using the Scroll language, cool!

dSebastien

0 replies

21h27m

2024-05-22 21:01:15 UTC

One big text file (OBTF) for the win? I'm curious about the pros and cons people think of.

https://notes.dsebastien.net/30+Areas/33+Permanent+notes/33....

csomar

0 replies

2024-05-22 17:41:32 UTC

I understand the author's point, but I think this is over-complicating a database table while losing most of the features a database can give you.

This is not some new concept, however. I stumbled upon it two years ago with some dude promoting a "Vault" architecture, where you use a single "notion.so" table to store all your data. You create views from this data to separate topics. You'll then be able to "centralize" all your Notion stuff in a single file, all while being able to link any two or more topics together.

What hit me is that I can export the Notion table to CSV and then feed it into an AI pipeline that might be able to predict my tasks better (like code). The only problem was that a couple of months into this, the Notion interface became completely unusable.

This can be done with a regular database, though the views/interfaces to interact with it are not that easy to create. I didn't find an alternative (I tried Airtable too).
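As a rough illustration of the single-table-plus-views idea with a regular database, here is a sketch using SQLite from Python's standard library. The table name, columns, and sample rows are all invented for the example; it only covers the storage side, not the interface problem mentioned above.

```python
import sqlite3

# One table holds everything; views slice it by topic.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE items (
        id    INTEGER PRIMARY KEY,
        topic TEXT NOT NULL,
        title TEXT NOT NULL,
        done  INTEGER NOT NULL DEFAULT 0
    );
    CREATE VIEW open_tasks AS
        SELECT id, title FROM items WHERE topic = 'task' AND done = 0;
    CREATE VIEW reading_list AS
        SELECT id, title FROM items WHERE topic = 'reading';
    """
)
conn.executemany(
    "INSERT INTO items (topic, title, done) VALUES (?, ?, ?)",
    [
        ("task", "Renew domain", 0),
        ("task", "File taxes", 1),
        ("reading", "Canon Cat manual", 0),
    ],
)

# Each view behaves like its own table but reads from the shared one.
open_tasks = [row[1] for row in conn.execute("SELECT id, title FROM open_tasks ORDER BY id")]
reading = [row[1] for row in conn.execute("SELECT id, title FROM reading_list ORDER BY id")]
```

Because every row lives in `items`, exporting the whole thing to CSV for some downstream pipeline is a single `SELECT * FROM items`.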

benatkin

0 replies

1d11h

2024-05-22 06:36:33 UTC

Nested Markdown with code fences is plain text.

Alas, vscode will choke on it.

I have a project where a thin wrapper loads from a giant markdown file into a sandboxed iframe. That way you could paste code from an unknown source into it and play with the output and paste private data into it and it wouldn’t be encoded into a URL sent to a server, as making network requests and following links are blocked.

https://codeberg.org/ristretto/pages

notebook.md is huge, output in the project website, link to source in the README.

a_c

0 replies

1d12h

2024-05-22 06:16:26 UTC

If we forego human read-write-ability to gain some interactivity, we get https://tiddlywiki.com/, a single long HTML file.

EGreg

0 replies

1d4h

2024-05-22 14:06:54 UTC

What I want to know is, what is the maximum size of a PHP file that can be loaded?

I guess I can TIAS, but is it documented anywhere?

Biganon

0 replies

1d18h

2024-05-21 23:43:02 UTC

This gives off a TimeCube vibe.