I made an open source Mac app to rewind and search everything on screen

hu3
30 replies
1d2h

It would be cool to:

1) Make a pair of Meta smart glasses take a photo every 2 seconds.

2) Send images to some server in the cloud.

3) Run OCR and object detection/labeling on these images.

4) Then present an app that allows searching (and chatting with) your past.

I could then ask the LLM things like:

1) Where did I leave my wallet?

2) Did I get my credit card back after paying the restaurant yesterday? (ADHD things, don't ask)

3) What was written on my daughter's new T-shirt today?

Bonus point if the app also records and transcribes audio so you could ask the LLM things like:

1) In the last meeting, what was the deadline that we settled on?

2) What was the phone number of that person I met in the park earlier today?

3) What was the name of the investor I met today?

Bonus bonus point if it has access to your phone calls too, so it can transcribe and index what others said.

See Black Mirror episode "The Entire History of You": https://en.wikipedia.org/wiki/The_Entire_History_of_You

bee_rider
16 replies
1d2h

Generally Black Mirror episodes are not intended to be aspirational.

drdaeman
8 replies
1d1h

Black Mirror is fiction, a lot of it works well thanks to our suspension of disbelief.

Remove the cloud from the equation and replace it with properly safeguarded, fully owned hardware located on your own premises - basically make it privacy-respecting, minimize abuse potential, do a proper audit - and it becomes a super desirable system for people with neurological differences (or just stressed folks - attention and memory suffer under stress too): glasses that enhance memory rather than vision. Sure, not without caveats and gotchas (don’t bring it to a poker night lol), but not that cyberpunky-bad either.

Cyph0n
4 replies
1d1h

Have you watched the Black Mirror episode in question (S1E3)? Because the point of it was the tech itself and had nothing to do with sharing or the cloud.

somenameforme
2 replies
1d

An extremely similar take on this, one that predates Black Mirror, was in some sci-fi media whose name, appropriately enough, currently escapes me. The premise was that there was a species that evolved to develop a flawless memory. It's a subset of the Black Mirror episode (and this dubious idea) because there, at least, memory is a purely private thing.

Yet the end, too, was quite predictable and logical: it frequently destroyed members of this species, simply because your own life and mind end up becoming more tempting and destructive than even the most enticing of drugs. One could simply lose oneself in one's own memories. What need is there for the rest of your life when you can simply endlessly relive, in perfect clarity, the best moments of your life - ones that you, in many cases, will likely never surpass?

karencarits
0 replies
19h55m

I am curious; ChatGPT suggests

- the science fiction novella "The Rememberers" by A.E. van Vogt, originally published as "The Book of Ptath" in 1943.

- the species known as "Trills" from the "Star Trek" universe, particularly in the series "Star Trek: Deep Space Nine."

Edit: the first suggestion seems to be a hallucination

LoganDark
0 replies
17h21m

What need is there for the rest of your life when you can simply endlessly relive, in perfect clarity, the best moments of your life - ones that you, in many cases, will likely never surpass?

Even without perfect memory, this can still happen, especially if something like PTSD is involved. I myself have a dissociative disorder and sometimes it can be difficult not to just relive the past over and over. (It's not like this is a daily struggle, but during low points it comes up.)

novok
0 replies
1d

IMO that is a more fundamental argument about removing information from the anxious and jealous who cannot handle the truth. Similar to medical paternalism, where the doctor has to decide which test results to tell patients about. I understand that, but the same argument can also be used to go full ham into a China-style totalitarian censorship regime for the 'people's own good'.

Also you might not realize how much the hard of hearing, deaf, and other neurodiverse ADHD types would find a memory prosthetic like that very useful unless you are in their very own shoes, for possibly years.

karaterobot
2 replies
1d

I have a coworker who was tracking Black Mirror premises that were being taken as inspiration by people on the internet. This may be one of those. A lot of SF authors have expressed this idea: "I meant it as a warning, not a road map". I think I read either Neal Stephenson or William Gibson expressing this about their respective dystopias and the Silicon Valley entrepreneurs who were trying to implement them.

Anyway, posting this comment so I remember to send it to her tomorrow.

hu3
0 replies
1d

I'd rather not draw conclusions about a technology's impact based on a 44-minute episode that was optimizing for maximum entertainment.

So there's nothing fundamentally wrong in taking inspiration from sci-fi.

Sure the presented perspective is negative but that is because the producer explored local maximums to the benefit of the audience. They are obliged to paint technology in a certain way to captivate viewers.

Does that mean that the presented reality is the only possible one? Certainly not, that would be reductive.

Many popular technologies today are used to scam and defraud people. Does that mean said technologies are fundamentally bad? Not at all.

I use technology as a force multiplier.

RUnconcerned
1 replies
4h29m

Reminded me of the "Torment Nexus" tweet:

https://twitter.com/AlexBlechman/status/1457842724128833538

@AlexBlechman

Sci-Fi Author: In my book I invented the Torment Nexus as a cautionary tale

Tech Company: At long last, we have created the Torment Nexus from classic sci-fi novel Don't Create The Torment Nexus

bee_rider
0 replies
3h28m

I thought about mentioning the Torment Nexus but I figured it is already in the back of everybody’s heads, in this type of conversation, haha.

lannisterstark
0 replies
19h36m

I don't see it as "Black Mirror" but rather as a feature. I pretty much record everything about myself anyway (locally, selfhosted) - so this alone wouldn't add any other obstacles than storage.

a local model that can analyze the videostream/files and answer questions would be great.

htrp
0 replies
1d1h

And yet we are already rushing towards this.

dwhit
0 replies
1d1h

for a less misanthropic take on the same idea, I recommend Ted Chiang’s “The Truth of Fact, the Truth of Feeling”

dbish
0 replies
17h58m

really? i tend to look to them for ideas

ametrau
0 replies
22h7m

Lol!

shiroiushi
9 replies
13h38m

2) Did I get my credit card back after paying the restaurant yesterday?

A restaurant should never be taking your credit card out of your view, ever. Really, they should not even be handling it: you should be swiping it yourself at the terminal at the cash register when you pay before leaving.

bongodongobob
7 replies
4h55m

I would say 95% of the restaurants I go to take your card and run it themselves.

justsid
6 replies
4h31m

That’s insane. Here in Canada the waiter/waitress brings a portable pos terminal to the table and you pay through that. I’d be incredibly uncomfortable if they ran away with my credit card.

bongodongobob
2 replies
4h5m

Why is that insane? I think a more appropriate description would be "perfectly normal".

justsid
1 replies
3h15m

Because normally you aren’t supposed to just let strangers wander off with your credit cards.

bongodongobob
0 replies
1h10m

I offer up 40 years of my life as a counter example. Maybe it's a big city thing, idk, but this is not the norm where I'm from. You get the bill, put your card in it, they take it and run it, bring it back, and you sign the receipt.

JoeAltmaier
2 replies
4h29m

We had that locally. They're abandoning it, citywide. Nobody liked it, it was cumbersome and invasive, it involved your credit card anyway but was a worse experience for everybody.

justsid
1 replies
3h16m

How is that a worse experience? I don’t have to mess around the bill to add a tip, I just punch it into the machine and then tap my card or phone. It also makes it super easy to split a bill if you are in a group.

cal85
0 replies
2h39m

You don’t mind the waiter/waitress standing over you while you decide how much to tip them?

kaashif
0 replies
3h55m

Yes, but in the US, the waiter taking the card out of your view, doing something with it, then bringing it back is the norm.

I'd never seen a restaurant do that in the UK.

I think it's a relic of the times before handheld payment terminals existed.

Also, having to decide on a tip percentage directly in front of the waiter is awkward sometimes, doing it the US way avoids that. Tipping is very common in the US.

namanyayg
1 replies
1d2h

That is exactly what the (highly criticised) Humane AI Pin does

dbish
0 replies
1d1h

It really doesn’t. It has no history or proactive understanding, it only runs a query when you ask it

alchemist1e9
0 replies
20h37m

Frame from Brilliant Labs is getting close to providing hardware that can realistically provide the data in a user friendly and fairly incognito style.

I’ve preordered them.

karencarits
16 replies
1d2h

For Windows, there is also TimeSnapper [1], not open source, but the developer is sometimes here on Hacker News

[1] https://timesnapper.com/

LeonB
11 replies
1d2h

Cheers, yeh, I am here a bit.

doodlebugging
4 replies
1d

Thanks for TimeSnapper Pro. I started using it in 2010 I think after cross-testing the field of options available. I especially like that I was able to purchase a permanent license as opposed to being locked in to a SaaS model where my time data ended up in someone else's control. Everything right on my machine where if it is lost I get all the blame so I never have to stand up from my desk and shake my fist at the cloud.

Excellent product for a consultant like myself to have on hand to document everything happening on their machine. The ability to change the screenshot interval and add notes along the timeline are real power tools for someone involved in multiple projects every day. You can secure your time record snapshots with a password. Playback at multiple speeds so you can jog your memory by browsing the timeline to the exact time when you flipped to something else. Classify by project, client, etc.

Really a great bit of software.

I once found one of my invoiced time periods being challenged during a regular audit of one of my client's operations. I frequently billed long hours since I worked long hours on 24-hour call (remote oil and gas data processing and support of international operations). The auditor called and requested anything I had that could support a particular time period where I billed "excessive" hours (>20 consecutive). I told them I would get them everything that I had and asked whether they were interested in any other dates or time periods or work for specific projects. They narrowed it to one single invoice entry so that made it simple for me.

Thanks to TimeSnapper I was able to produce a GIF movie of the screenshots (I always used a 1 minute snap interval to allow fine-grained operations review) during the requested interval that documented everything happening on screen and I provided a PDF of the notes dumped from that interval documenting phone calls taken, etc. I also sent copies of my handwritten notes I always maintain as primary memory-jogging tool. In the return note to the auditor I asked if they were interested in any other dates or time periods and offered to send an archive on their request.

The auditor called me up and we talked for a while and they thanked me for the detailed records and told me they wouldn't need anything else.

It really helps that my wife is a CPA/Auditor/Fraud Examiner. Tools like TimeSnapper Pro helped me in my business to maintain excellent records about day-to-day operations that made it simple for me to invoice clients for all the time that I spent on their behalf.

Thanks /u/LeonB

LeonB
3 replies
20h53m

Wow! Thanks for sharing this!

doodlebugging
2 replies
20h12m

No, thank you!

Your hard work and thoughtful programming not only made it possible for me to document my efforts no matter what I was doing on my machines, but it made it easy.

That is the true measure of success with any programming effort.

You created a product that 1) I could install on my machines, read through the documentation and understand how to make it work for me without any friction since it is well-documented, 2) I could parameterize my record-keeping (even allowing changing things on the fly) to fit my own unique situation so that I could get compensated fairly for all of my time that I spent dealing with my client's problems, 3) I could archive and export in common formats to form the basis for reports that I needed to satisfy client requirements and help them determine how much value as a consultant I added to their projects.

Bravo! It's a great piece of work.

Before choosing it I ran side by side comparisons with other similar apps including some that stored all my usage data in a cloud or in a proprietary format with hoops to jump through to be able to prepare invoices and reports. The pure simplicity of using TimeSnapper Pro and the fact that I had all the usage data locally stored where I didn't have to depend on someone else's cloud availability when an invoice was due and all the other well-designed features including the ability to lock it all behind a password made my choice simple.

For most of the shitty projects I worked on I billed by the minute hoping that some asshole would challenge that, knowing that I had everything I needed to support that granularity. I really wanted them to come back and challenge some of the other invoiced time. Those ducks are still in a row if I ever need them thanks to my usage of your software.

I frequently worked really stupid hours, sometimes billing more than 40 hours in a 48 hour period when they hit me with a time-sensitive task that had to be completed now. I remember too many times getting a call late on Friday telling me about new data that needed to be processed for a meeting on Monday morning. I got it done and documented all my time and got paid.

You're the bestest!

LeonB
1 replies
18h34m

Thanks again. Btw, I certainly didn't code it and maintain it single-handed -- my co-founder Atli Björgvin Oddsson deserves the credit for all the good bits.

I have to say -- I'd rather you worked sane hours, and pushed back a bit. You only get one life, and giving too much of your time to (for example -- as I've spent years in that industry too) the oil and gas industry isn't everything in life.

"You work sixteen hours and what do you get? Another day older and deeper in debt" --some old blues legend probably

wartijn_
3 replies
1d2h

That’s fast :) Do you get notified when someone mentions your product, or are you _that_ active?

LeonB
2 replies
1d2h

Neither, lol. Just happened across the comment within five minutes of it being posted. I think it was one of the first times I visited HN today.

pests
0 replies
16h16m

The first first time, or the second first time, maybe the third?

agons
0 replies
1d

one of the first times

I don't think you can say it's "neither"! :)

justinlloyd
1 replies
14h49m

Have been running TimeSnapper non-stop on my Windows laptop and Windows workstation since... January 1st, 2006 at 11:16AM. That's almost 18 years of desktop capture. Crazy to think about that. I may have been running it prior to that, but that's the first recorded date on my saved captures. 50% resolution. 60% quality. JPEG. The entire screen and any attached monitors. Hundreds and hundreds of gigabytes of my daily professional and personal life. Captured once every 10 seconds. The movies I've watched. The games I've played. The thoughts I've had. The code I wrote. All of it captured.

Prior to TimeSnapper for Mac I ran my own little script, and a companion one for Linux. Now I have TimeSnapper on my Macbook Pro too. And if you ever create a Linux version (or want someone to create a Linux version), let me know. I'd be happy to beta test.

I've lost count of the number of times that TimeSnapper has saved my arse, either with accidentally deleted data, finding an obscure web page, proving I did something on a specific date/time, or that something happened in the way I say it did, or recovering the keys to a small amount of Ethereum and Dogecoin stored in a wallet.

Thank you for TimeSnapper. It has been interesting to use it for that length of time.

"Four weeks work in 9 minutes" captured by TimeSnapper where I create a video game (or three) in October of 2012 http://www.otakunozoku.com/video/working.flv

Sorry, it's an flv, along with an onion on your belt, it was the fashion back in the day

fmos
3 replies
23h31m

For time tracking with screenshots and advanced tagging based on window titles (and sometimes open documents), there is also ManicTime [1]. I don’t think it has OCR though.

[1] https://www.manictime.com/

ametrau
1 replies
22h14m

cloud or license

The free tier is bought by being a firehose of data for them.

jabroni_salad
0 replies
19h43m

The opposite, actually. Nothing leaves the computer you install it on, unless you specifically pay for their cloud service or stand up your own server. This is one of the reasons I chose it for myself over using timely or toggl.

chrnola
0 replies
22h27m

ManicTime was a lifesaver when I was working as a consultant and had to attribute every hour of my week to one of several clients.

compsciphd
13 replies
1d2h

We built this almost 2 decades ago now (including the ability to scrub to a point in the past and resume execution from there)

http://www.cs.columbia.edu/~orenl/papers/sosp07-dejaview.pdf

Abstract: As users interact with the world and their peers through their computers, it is becoming important to archive and later search the information that they have viewed. We present DejaView, a personal virtual computer recorder that provides a complete record of a desktop computing experience that a user can playback, browse, search, and revive seamlessly. DejaView records visual output, checkpoints corresponding application and file system state, and captures displayed text with contextual information to index the record. A user can then browse and search the record for any visual information that has been displayed on the desktop, and revive and interact with the desktop computing state corresponding to any point in the record. DejaView combines display, operating system, and file system virtualization to provide its functionality transparently without any modifications to applications, window systems, or operating system kernels. We have implemented DejaView and evaluated its performance on real-world desktop applications. Our results demonstrate that DejaView can provide continuous low-overhead recording without any user noticeable performance degradation, and allows browsing, search and playback of records fast enough for interactive use.

beeboobaa3
5 replies
1d1h

Did you actually build it or just write a paper? Where can I download it?

Intralexical
4 replies
23h50m

Previously commented with some other details.

TL;DR: It's an old PhD project that requires custom patches to an ancient kernel version. So I guess you can't download it, and even if you could it wouldn't work on any system you'd want to use today:

compsciphd on Oct 11, 2022 | parent | context | favorite | on: Linux NILFS file system: automatic continuous snap...

we used NILFS 15 years ago in dejaview - https://www.cs.columbia.edu/~nieh/pubs/sosp2007_dejaview.pdf

We combined nilfs + our process snapshotting tech (we tried to mainline it, but it didn't go, but many of the concepts ended up in CRIU though) + our remote display + screen reading tech (i.e. normal APIs) to create an environment that could record everything you ever saw visually and textually. enable you to search it and enable you to recreate the state as it was at that time with non noticeable interruption to the user (processes downtime was like 0.02s).

https://news.ycombinator.com/item?id=33165519

compsciphd on Oct 12, 2022 | parent | next [–]

sadly (as with much work from phd students like I was), the closest one could get to it today is trying to duplicate it. i.e. combining criu with nilfs (but a lot of the work that we did to get process downtime to minimal numbers requires being in kernel, as described in paper) and unsure criu can do it.

In addition our screenrecording mechanism was our own "proprietary" (not really proprietary as fully described in research papers, but also not a standard) and something that was built as an X display driver 15 years ago (so not directly usable today even if code is available). Could probably duplicate it with vnc based screencasting. vnc didn't work for us as we needed better performance (i.e. it was built to demonstrate remote display of video and games and there was no real remote audio setup back then so we had to create our own).

the "text" search just used gnome's accessible API much like a screenreader would do (with a bit of per application optimizations as can filter out things like menus and the like, primarily was to dump text out of terminals, firefox and perhaps open office and maybe even a pdf reader if memory serves me correctly, but a long time ago).

I've been looking into it myself though, mostly for forensic concerns (capturing state/evidence in a way that's harder to forge than screenshots). Hopefully, running the desktop environment through VNC and then running `criu dump --leave-stopped --prev-images-dir …` immediately followed by `mkcp --snapshot …` (or `btrfs subvolume snapshot …` would also work, I guess) would be enough for basic functionality.

yarg
1 replies
20h12m

vnc didn't work for us as we needed better performance

15 years is a long time, performance wise VNC would be adequate and we're at the point now where OCRing a render is probably fast enough for most use cases (it's actually the most reasonable choice a lot of the time, e.g.: PDF parsing).

(But how's VNC support looking on Wayland based window managers?)

zamadatix
0 replies
18h0m

VNC is available for pretty much any Wayland window manager but your distro may opt to default build the more modern RDP instead (e.g. Ubuntu Gnome).

hn72774
0 replies
11h10m

I've been looking into it myself though, mostly for forensic concerns (capturing state/evidence in a way that's harder to forge than screenshots).

For what kinds of surveillance? Why is the forgery-proofing important for the playbacks?

compsciphd
0 replies
20h3m

one thing I'd note: we didn't patch the kernel source, everything we did was through the module interface, though we did abuse it a bit, but a lot of that abuse was to provide our home grown cgroup/namespace like functionality that wasn't around when our checkpoint/restart work started. But it is fair to say because of that abuse, it was fairly tied to a specific set of kernels)

another project I created on the forensic side (steve bellovin asked the Q and I was like, yeah, I know exactly how to build that) that might then interest you was something we called ISE-T (I See Everything Twice - Catch 22).

https://academiccommons.columbia.edu/doi/10.7916/D8HQ45MK

Two-Person Control Administration: Preventing Administration Faults through Duplication

Modern computing systems are complex and difficult to administer, making them more prone to system administration faults. Faults can occur simply due to mistakes in the process of administering a complex system. These mistakes can make the system insecure or unavailable. Faults can also occur due to a malicious act of the system administrator. Systems provide little protection against system administrators who install a backdoor or otherwise hide their actions. To prevent these types of system administration faults, we created ISE-T (I See Everything Twice), a system that applies the two-person control model to system administration. ISE-T requires two separate system administrators to perform each administration task. ISE-T then compares the results of the two administrators’ actions for equivalence. ISE-T only applies the results of the actions to the real system if they are equivalent. This provides a higher level of assurance that administration tasks are completed in a manner that will not introduce faults into the system. While the two-person control model is expensive, it is a natural fit for many financial, government, and military systems that require higher levels of assurance. We implemented a prototype ISE-T system for Linux using virtual machines and a unioning file system. Using this system, we conducted a real user study to test its ability to capture changes performed by separate system administrators and compare them for equivalence. Our results show that ISE-T is effective at determining equivalence for many common administration tasks, even when administrators perform those tasks in different ways.

I should note that the paper also discusses that 2 people might be expensive, so the same mechanism can be used by a single admin but in a manner that maintains an audit trail.

The above project wouldn't require any kernel modifications as the work was all about using unionfs (using normal vfs loadable module interface hooks) to capture changes and user spaces to log and compare them.
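
A rough illustration of the compare-before-apply idea, in Python. This is not the ISE-T implementation: the change sets here are plain dicts mapping a path to its new content (or `None` for a deletion), the function names are made up for this sketch, and "equivalence" is reduced to byte-for-byte equality, which is much stricter than what the paper describes.

```python
import hashlib


def digest(changes):
    """Map each changed path to a SHA-256 of its new content (None = deleted)."""
    return {
        path: None if content is None else hashlib.sha256(content).hexdigest()
        for path, content in changes.items()
    }


def compare_changes(admin_a, admin_b):
    """Return the sorted paths on which the two change sets disagree."""
    da, db = digest(admin_a), digest(admin_b)
    missing = object()  # sentinel: path untouched by that administrator
    return sorted(
        p for p in set(da) | set(db)
        if da.get(p, missing) != db.get(p, missing)
    )


def apply_if_equivalent(admin_a, admin_b, real_system):
    """Apply the changes to `real_system` (a dict) only if both admins agree.

    Returns the list of conflicting paths; an empty list means applied.
    """
    conflicts = compare_changes(admin_a, admin_b)
    if conflicts:
        return conflicts
    for path, content in admin_a.items():
        if content is None:
            real_system.pop(path, None)
        else:
            real_system[path] = content
    return []
```

Keeping the returned conflict list (or the two change sets themselves) is also what would give a single administrator an audit trail.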

All this work led to what can be viewed as a proto-docker - https://www.usenix.org/legacy/events/atc10/tech/full_papers/... and https://www.usenix.org/legacy/events/lisa11/tech/full_papers...

hysan
3 replies
1d2h

Is the url correct? I get a file not found when I try to open it.

water-data-dude
1 replies
1d2h

Ah, gotta love link rot.

This one works : https://www.cs.columbia.edu/~nieh/pubs/sosp2007_dejaview.pdf

This one’s broken: https://www.cs.columbia.edu/~orenl/papers/sosp07-dejaview.pd...

When I search for “dejaview” in the main index, I get the same broken link in the search results[0]. At first I thought they’d changed the URL structure (/pubs/ to /papers/), but if you visit [1] it works, but [2] doesn’t work. I guess “orenl” isn’t a member of the faculty anymore, so they tore down their page and removed all the associated resources.

[0] https://www.cs.columbia.edu/g-search/?q=dejaview#gsc.tab=0&g...

[1] https://www.cs.columbia.edu/~nieh/

[2] https://www.cs.columbia.edu/~orenl/

wizzwizz4
0 replies
22h26m

Dr Oren Laadan's site hasn't been updated since 2011, but it's still up: it's just HTTP-only.

wizzwizz4
0 replies
1d2h

HTTPS isn't a drop-in replacement for HTTP. Try visiting the HTTP site.

IshKebab
2 replies
23h23m

Great project name!

eps
0 replies
7h22m

Back in the early '00s there was a Canadian cable channel called exactly the same. It still exists apparently - https://www.dejaviewtv.ca

compsciphd
0 replies
20h15m

thanks, I came up with it :) (the original title, which i also came up with, but had to be changed to keep double blindness due to a snafu on a previous submission, was ThincBack, because Thinc was our home grown remote display protocol, which became the basis for VESA's net2display standard, though unsure anything really ever happened with that in practice after it was published)

yogorenapan
4 replies
1d4h

Is there anyone here that has used it for an extended period of time? Would be interested to see if it’s actually helpful.

aspenmayer
1 replies
1d4h

I used to use Google Desktop back when it existed for its viewed web page search feature mostly. It was pretty handy.

xhevahir
0 replies
1d3h

Google Desktop was the first thing I thought of when I saw the link. One difference with Desktop though is that nowadays people are doing stuff on more than one device. Synching the data somehow or other would presumably involve the sort of cloud services that this developer is avoiding for privacy reasons.

wingerlang
0 replies
1d1h

I can't speak for the mentioned app (rem), but I built my own app with a similar feature set called ScreenMemory (https://screenmemory.app). Which I have obviously used for an extended period of time (coming up on 7 months I believe).

My main and daily use case is to look back at the previous day - this is helpful for standups, retros, and so on. I skim through my days (sometimes weeks) to pick up on what I was working on - it's incredible how much "untracked" work is performed that you pick up on. Sometimes I forget who exactly I talked to about something, but just knowing the rough date I can usually find something to jog my memory.

Obviously I am biased, but still!

jasonjmcghee
0 replies
1d1h

I build a lot of stuff and forget why I did things the way I did.

You can do things like find the point in history where you fixed a bug and go watch yourself debug and put in the fix.

Pretty wild. It makes a lot more sense once you experience getting the value.

Personally I hope Apple adds the feature natively to the OS at some point. They're plenty experienced to build it themselves over there, but maybe having a reference implementation will encourage them to give it a shot.

jasonjmcghee
2 replies
1d1h

Author of rem here.

Come join in on development!

It's MIT Licensed.

I also kicked off a cross-platform version in rust https://github.com/jasonjmcghee/xrem which is earlier in development and could use even more help.

RamblingCTO
1 replies
9h26m

Awesome! But how about private browser windows?

jasonjmcghee
0 replies
2h53m

I'm guessing you're asking if there's a way to prevent recording private windows?

It's possible to not record specific window IDs (at the code level) - so you'd need to be able to detect whether the window is "private". I am not familiar with such a flag.

Alternatively you might be able to just have a lookup or regex of all the most popular private browsing / applications...

Or let a user specify a pattern to look for and skip.

But at the same time, it's your private data and it's not going anywhere. There's no telemetry or network access of any kind.

Anyway- doesn't exist as a feature today. Maybe someone (you?) will build it!
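
A minimal sketch of the pattern-based idea above, in Python rather than Swift, and not part of rem. The default patterns and the `(window_id, title)` input shape are assumptions for illustration; browsers label private windows differently across versions and locales, so a real list would need checking.

```python
import re

# Hypothetical defaults; treat these as placeholders, not a verified list.
DEFAULT_SKIP_PATTERNS = [
    r"\bPrivate Browsing\b",
    r"\bIncognito\b",
    r"\bInPrivate\b",
]


def compile_patterns(patterns):
    """Compile regex strings once, case-insensitively."""
    return [re.compile(p, re.IGNORECASE) for p in patterns]


def should_skip(window_title, compiled_patterns):
    """Return True if the window title matches any skip pattern."""
    return any(p.search(window_title) for p in compiled_patterns)


def filter_windows(windows, compiled_patterns):
    """Keep only the (window_id, title) pairs that are safe to record."""
    return [
        (window_id, title)
        for window_id, title in windows
        if not should_skip(title, compiled_patterns)
    ]
```

A user-specified pattern is just one more entry in the list, e.g. `compile_patterns(DEFAULT_SKIP_PATTERNS + [r"1Password"])`. Title matching is only a heuristic: a private window whose title carries no such marker would still be recorded.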

fragmede
2 replies
8h13m

also http://rewind.ai which is a company doing the same thing

eps
1 replies
7h21m

Why is it "AI" though.

world2vec
0 replies
5h49m

Seems it's using LLMs (GPT4 in this case) to transcribe and summarise documents and draft notes, emails, etc.

interstice
0 replies
15h27m

I spent ages looking for something like this for Mac before giving up and writing a script that takes screenshots every 10 seconds and another script to compile that up into video using ffmpeg!

I'd love to contribute but I know nothing about Swift, mine was all in bash scripts with launchd to run them.
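
For anyone who wants to try the same approach, here is a rough Python equivalent of that kind of script (a sketch, not the commenter's bash). It assumes macOS's `screencapture` and an `ffmpeg` build with libx264 on the PATH; the function names, directory layout and frame rate are made up for the example.

```python
import subprocess
import time
from datetime import datetime
from pathlib import Path


def screenshot_command(out_dir, now):
    """Build the macOS `screencapture` call for one timestamped PNG (-x = no sound)."""
    name = now.strftime("%Y%m%d-%H%M%S") + ".png"
    return ["screencapture", "-x", str(Path(out_dir) / name)]


def ffmpeg_command(out_dir, video_path, fps=10):
    """Build an ffmpeg call that stitches the PNGs, in name order, into a video."""
    return [
        "ffmpeg",
        "-framerate", str(fps),
        "-pattern_type", "glob",            # expand the *.png pattern inside ffmpeg
        "-i", str(Path(out_dir) / "*.png"),
        "-c:v", "libx264",
        "-pix_fmt", "yuv420p",              # widely playable pixel format
        str(video_path),
    ]


def capture_loop(out_dir, interval_seconds=10, shots=6):
    """Take `shots` screenshots, sleeping `interval_seconds` after each one."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for _ in range(shots):
        subprocess.run(screenshot_command(out_dir, datetime.now()), check=True)
        time.sleep(interval_seconds)
```

Running `capture_loop("shots")` and then `subprocess.run(ffmpeg_command("shots", "day.mp4"), check=True)` would produce the video. The timestamped file names sort chronologically, which is what the glob input relies on. Note that libx264 with `yuv420p` expects even frame dimensions.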

modeless
8 replies
1d

Has anyone built something like this using accessibility APIs instead of (or in addition to) OCR? It seems like a waste to OCR everything when you could just get the text directly from the accessibility APIs. Also seems like potentially a good way to connect LLMs to UIs, and something like this would be the way to collect the training data.

dbish
2 replies
18h1m

We've done a bit of both for our screen-searchable, Loom-like screen recorder. The problem is that the accessibility APIs differ greatly between Mac and Windows if you want to be OS agnostic, and even on Windows all the apps tend to do things a little differently, making it hard to say what you actually "saw", with some apps missing key data or implementing it incorrectly. OCR ends up being easier many times, despite thinking accessibility would be.

Sephr
1 replies
17h27m

OCR is easier for the developer, but worse for the user in terms of battery drain / energy use.

dbish
0 replies
16h30m

For sure, we made a privacy tradeoff to do it server side (given some screen change delta) because of this. Accessibility is a good "in addition to" but there are just so many apps that don't handle it well
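
One simple way to gate OCR on a "screen change delta" is sketched below in Python. This is an assumption about how such a gate could look, not a description of the commenter's product: frames are plain lists of grayscale rows, and the 5% threshold is arbitrary.

```python
def changed_fraction(prev, curr):
    """Fraction of pixels that differ between two equally sized grayscale frames.

    Each frame is a list of rows; each row is a list of ints (0-255).
    """
    if len(prev) != len(curr) or any(len(a) != len(b) for a, b in zip(prev, curr)):
        raise ValueError("frames must have the same dimensions")
    total = sum(len(row) for row in curr)
    if total == 0:
        return 0.0
    changed = sum(
        1
        for row_a, row_b in zip(prev, curr)
        for a, b in zip(row_a, row_b)
        if a != b
    )
    return changed / total


def frames_worth_ocr(frames, threshold=0.05):
    """Return indexes of frames that differ enough from the last OCR'd frame."""
    selected = []
    last = None
    for i, frame in enumerate(frames):
        if last is None or changed_fraction(last, frame) >= threshold:
            selected.append(i)
            last = frame  # compare later frames against what was actually OCR'd
    return selected
```

Comparing against the last OCR'd frame, rather than the immediately preceding one, means slow drift (a page scrolling a few pixels at a time) still triggers a new OCR pass eventually.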

JoBrad
2 replies
1d

It would also be great to have the foreground apps as metadata.

wingerlang
0 replies
13h15m

This is what I added to my macOS app recently, foreground app metadata. It is displayed on the timeline if you look at the pictures on my website (screenmemory.app). For my use case it was night and day in UX.

ehsankia
0 replies
1d

Also more general structured data. Like being able to search only the window titles, or text within a certain windows (Discord, Chrome).
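A rough sketch of what that kind of structured search could look like, assuming each capture is stored with its foreground app and window title alongside the OCR text. The `Capture` record and `search` function are hypothetical names for this example and are not taken from the app.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capture:
    timestamp: str      # ISO 8601 time of the screenshot
    app: str            # foreground application, e.g. "Chrome"
    window_title: str
    ocr_text: str


def search(captures, query, field="ocr_text", app=None):
    """Case-insensitive substring search over one field, optionally limited to one app."""
    if field not in ("ocr_text", "window_title"):
        raise ValueError("field must be 'ocr_text' or 'window_title'")
    needle = query.casefold()
    return [
        c for c in captures
        if (app is None or c.app.casefold() == app.casefold())
        and needle in getattr(c, field).casefold()
    ]
```

For example, `search(captures, "invoice", field="window_title")` would look only at window titles, while `search(captures, "deadline", app="Discord")` would look only at text seen inside Discord windows.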

janpmz
0 replies
6h39m

I've built a workflow recorder with a screen history (MVP).

I concluded that if this is a viable approach, Microsoft or Apple will build it into their OS natively as part of a copilot that remembers everything and assists the user with that knowledge.

My screen history was not as advanced as the app mentioned here though. And I didn't use it myself.

bitwize
0 replies
23h10m

Dragon NaturallySpeaking supports voice commands like "click OK" and responds accordingly. Its solution to the problem of Microsoft Office doing its own custom widget rendering was to OCR the text on widgets and buttons to determine their labels. You need something like this far, far more often than you think you do. Developers will flummox you, they will NOT use the provided APIs.

poisonborz
6 replies
1d4h

Congrats on the great execution of this novel idea! Inspiring for everyone with a "why isn't there an app that does x" idea.

novok
4 replies
1d1h

rewind.ai is another example of this, but their recent pivot to cloud-only storage and their renaming to limitless.ai make me glad that open source stuff like this is popping up, so you are not forced into a cloud storage situation. And I say this as a paying customer who will probably stop being one.

dbish
2 replies
18h0m

they also seem to have shifted to hardware/"AI" wearables (lavalier mics?). hardware is not a promising shift imho

fragmede
1 replies
8h4m

the rise of open source competitors makes it harder to compete as a software-only product

dbish
0 replies
1h38m

I don’t think hardware is the right savior though. Better to just move faster with software and focus on differentiation (ex: better understanding of the data for the task at hand rather than purely using OpenAI APIs). Hardware is the perfect way to kill your margins even with a successful product; look at the Jawbone story.

jejeyyy77
0 replies
1d

oh man, didnt know they completely pivoted. RIP

nerdile
0 replies
1d2h

The easiest way to find an app that does X, is to build a new one and then post it to HN and check the comments.

jstanley
5 replies
1d4h

Whoa, very cool project, I would be keen to try it out if it worked on Linux.

alchemist1e9
0 replies
18h52m

I’m surprised there isn’t more momentum on this. It would be neat to modify this to support a local LLM and maybe, with txtai, try to build semantic knowledge graphs on screenshots to auto-detect “topics” of your desktop activity.

INTPenis
1 replies
1d2h

I wonder if there is a sensible API where you can track everything without storing a video file. Something like seeing Window positions, classes, names, and maybe even GTK/QT cache?

chabad360
0 replies
1d

Same, I've been looking for something like this ever since Rewind came out.

ppqqrr
3 replies
1d

Anyone know if there’s an audio equivalent of this software?

fastestjet
0 replies
24m

https://flyonthewall-3499fa132151.herokuapp.com/

I made one called flyonthewall. You can upload an m4a file and it gives back a transcription from OpenAI's Whisper model. It's a basic setup (no accounts right now) that I use to transcribe and summarize my audio notes from my dictaphone.

dbish
0 replies
17h54m

Not "always on" but if you enable a recording with us, it's all searchable and even attempts to write a doc of what you were talking about to scan or answer questions about later: https://augmend.com. We do screen + audio, but audio alone does work.

oezi
3 replies
1d

I would be curious if anybody feels that such a tool wouldn't be too much a danger as spyware. In particular from a Chinese developer.

ametrau
1 replies
22h5m

Any Chinese developer can call themselves John Smith. Even the spies.

earleybird
0 replies
20h4m

I prefer to be also known as John Bigbooté.

rullelito
0 replies
23h2m

Any tool you download can be used as spyware.

maxloh
3 replies
1d4h

Download ffmpeg (the download file name is: ffmpeg-master-latest-win64-gpl-shared.zip), extract all files in bin directory(excluding the bin directory itself) to C:\Windows\System32 (or other directories located in PATH)

Copying ffmpeg to C:\Windows\System32 doesn't seem to be the correct way to install it.

adzm
2 replies
1d3h

Yeah just add the bin directory to the %path%. A lot of things have installed it to their own directory in appdata or program files.
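A small sketch of what "add the bin directory to the path" amounts to, shown as a pure string operation so it can be checked without touching the system. The function name and the `C:\tools\ffmpeg\bin` location are assumptions for illustration; the comparison is exact-match, although Windows itself treats paths case-insensitively.

```python
import os


def prepend_to_path(path_value: str, directory: str, sep: str = os.pathsep) -> str:
    """Return a PATH string with `directory` first, without adding it twice."""
    entries = [e for e in path_value.split(sep) if e]  # drop empty segments
    if directory in entries:
        return sep.join(entries)
    return sep.join([directory, *entries])
```

Applied to the current process only, this would be `os.environ["PATH"] = prepend_to_path(os.environ["PATH"], r"C:\tools\ffmpeg\bin")`, after which `shutil.which("ffmpeg")` should find the executable if it is in that folder. Nothing is copied into `C:\Windows\System32`.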

wongarsu
1 replies
1d3h

A lot of things have installed it to their own directory in appdata or program files

This is the windows way. In the early days (Windows 95/98) it was indeed common practice to put shared dlls and exes in C:\Windows or C:\Windows\System32, but without central oversight this was a nightmare: version conflicts were rampant, and uninstallers didn't know what could be safely removed. Everyone switched to installing everything into the program's install dir, and the world was much better off.

Some poweruser tooling gets fancy by checking if ffmpeg is installed systemwide, and if not, downloads it to the program's appdata folder. But that exposes you to versioning issues if the systemwide version is too old or too new.

nurple
0 replies
1d2h

This is what's always confused me about the Unix FHS. A unified hierarchy makes (un)install complex and error-prone indeed. One of my favorite things about Nix is that it threw FHS away.

lannisterstark
3 replies
19h38m

There was this app, I don't remember the name, that was similar I think for MacOS. It would record the entire timeline, and you could just "scroll back in time - in real time btw-" to a specific date/time and see the open windows you had and what not.

I distinctly remember seeing it somewhere (I think it was here), but I can't find what the app was called, despite all the googling.

hfe
1 replies
19h33m

I made this app on Windows. The major inspiration came from the early concept of the Mac app Rewind and Black Mirror S1E3, "The Entire History of You".

lannisterstark
0 replies
19h31m

That'll show me to actually read the post than just gloss over it. Thanks lol.

FL410
0 replies
19h35m

Probably rewind.ai

ksec
3 replies
1d4h

I have wanted the same thing for the browser for a very long time, since that is where 99% of all my information is consumed. But maybe it does seem to fit in the OS from a broader perspective.

asdefghyk
1 replies
22h27m

I use singlefile extension to save automatically a copy of every webpage I view. Also use BetterHistory extension on Chrome to record browsing history. Have been using both for a few years.

alchemist1e9
0 replies
20h29m

I use singlefile also but not the auto-save.

I’ve long thought that ideally an auto-archiving transparent proxy would be a more elegant approach.

aspenmayer
0 replies
1d4h

That’s the feature I miss most about Google Desktop.

shiftpgdn
2 replies
1d3h

This could easily be sold for $5/seat to giant corporations who want to stick spyware all over their employee workstations.

cableshaft
1 replies
1d1h

If I knew companies were recording my screen constantly I would just quit and save everyone the trouble of them getting on my case for not coding 100% of the time.

I know my screen can be checked at my current client, but I also know the product owners don't really care as long as the work is getting done on time (or at least they haven't bugged me about it since I started working with them a year ago, and they have given very positive reviews about me to my company).

cj
0 replies
1d

Screen recording probably won’t take off in this context, luckily.

I think employers would be more interested in basic things like “did this person login their computer today, between what times were they active, how many hours in the day did they type at least 100 words?” … or things like “here’s a list of people on PTO today, which people not on this list haven’t typed on their keyboard today?”

That sort of data would be easy to collect - keyloggers have been around forever, doesn’t trigger MacOS to show the “recording screen” notification, and the data is easier to aggregate and view across a large number of computers.

I think the fact that keylogging didn’t explode as a method of tracking productivity during Covid WFH probably means we’re safe from screen recording for the foreseeable future.

msephton
2 replies
20h30m

I remember a few projects like this. The first I saw was called Savant Recall, in 2014, but it failed to be selected for YC, so it was set free as open source. Napster co-founder Ritter picked it up, renamed it Atlas Recall (2016), gave it a new UI, and secured $20M in funding. A year later it was suddenly shut down. LinkedIn says "acquired by Xinova". Another one I'd heard of was called Apse (2019).

cynicalsecurity
2 replies
22h45m

I can't think of any use for this project other than spying on employees. Just slightly modify the saving mechanism to send it to the corporate cloud and voila: AI profiling. Perfect for harassment at work and layoffs.

lannisterstark
1 replies
19h32m

Despite your username, sometimes it's nice not to be cynical all the time.

This is tremendously helpful for people like me who forget stuff often.

Ylpertnodi
0 replies
19h20m

sometimes it's nice not to be cynical all the time.

I would propose you are both absolutely correct.

bluelightning2k
2 replies
1d1h

Amazing that this was done by a non-professional dev. Huge kudos. (Tbh why not become a professional dev? You clearly have the skill and at least some interest.)

There's a huge miss in this implementation though: 95% of the time what is on screen is web browsing. So going from nicely formatted markup with title tags -> video -> OCR clearly misses a more obvious path (either a proxy or a browser extension).

pmichaud
0 replies
1d1h

I'm not so sure. This generalizes, plus "nicely formatted markup" is a big assumption in the brave new world of modern web apps. I guess it's probably pretty reliable to look for visible text strings, but there's something cool about only caring about what is directly visible.

I've often thought that screen readers might be better off now by attempting this visual approach. It used to not be possible, now it may be easier than relying on accessible markup.

haruharuha
0 replies
7h20m

Thank you! However, coding feels quite daunting to me, as my talents lean more towards design. The reason I embarked on this project is that I strongly wanted to address a particular need, but had been unable to find an appropriate alternative or programming friends willing to help. Due to my limited development capabilities, many aspects of this app are probably not implemented in the best or most fundamental way. Potentially, an open-source approach may inspire others with the same needs to collaborate or provide better solutions.

And yes, most of my screen time is also web browsing. Currently, I can use the web page's window title as a reference, which provides quite a bit of information. To enhance this feature, I might consider using something like ChromeDriver to gather more details, such as web links and page text, similar to what 'Rewind' offers.

Cilvic
2 replies
1d4h

Great, ever since rewind.ai came out I was longing for something similar (MacOS only for now)

wingerlang
0 replies
1d1h

One alternative is ScreenMemory (https://screenmemory.app)

I made it, for full disclosure.

layer8
1 replies
1d3h

Per the GitHub readme, the video should be around 100-200 GB per year, not too bad.

m3kw9
0 replies
1d1h

Avg how many hours a day? Does it delete scenes with no text, like movies and games?
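The hours-per-day question matters because it determines what average video bitrate the 100-200 GB per year figure implies. A back-of-the-envelope sketch, assuming decimal gigabytes (10^9 bytes), recording on all 365 days, and a hypothetical 8 hours of screen time per day (none of these come from the readme):

```python
def implied_bitrate_kbps(gb_per_year: float, hours_per_day: float, days: int = 365) -> float:
    """Average video bitrate in kbit/s implied by a yearly storage figure."""
    total_bits = gb_per_year * 1e9 * 8              # decimal GB -> bits
    recorded_seconds = hours_per_day * 3600 * days  # total recorded time
    return total_bits / recorded_seconds / 1000


low = implied_bitrate_kbps(100, 8)    # roughly 76 kbit/s
high = implied_bitrate_kbps(200, 8)   # roughly 152 kbit/s
```

Under those assumptions the recording averages on the order of 75 to 150 kbit/s, which is plausible only for low-frame-rate, mostly static screen content. Halving the daily hours doubles the implied bitrate.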

jackvalentine
1 replies
18h29m

I have had this thought for ages and I'm stoked to see someone having built it.

It seems to create a huge risk for your data though. Any ideas about how you might go about securing against the application itself doing something untoward?

akoboldfrying
0 replies
16h26m

It runs locally, so you could just block outgoing network connections for the app.

helpfulContrib
1 replies
1d1h

A long time ago, someone got "cryofreeze" working on Linux processes, where you could freeze a process to disk and unfreeze it later .. but this was purely experimental and doesn't seem to have persisted.

Anyway, in my lab environment where I had this running .. over a decade ago now I guess .. I set up scripts to automatically cryofreeze the daemon processes I was writing, so I could indeed reset to the state at a point in time of interest during testing. This lab got re-commissioned for other things, and a needed kernel upgrade broke cryofreeze, and I never got back to it - but I have often wondered about it, as a temporal interface to some things would make for a whole new world of useful UI and other interaction paradigms to explore.

I remember there was something like this possible on WANG and Tandem systems back in the 80's, and it always has kind of piqued my interest as to why this isn't still considered a thing. Well, I guess syscall complexity has vastly changed since WANG and Tandem were around, lol.

Anyway, I'd love to be able to do this on Linux again - I keep thinking to catch up with the state of the art of process freezing/unfreezing, so this has motivated me to do this - although I do confess that, as principally a Lua/C developer in my chosen market (embedded), I attain this re-playability in other ways (state sync'ing) which also allows me to move processes to other systems, relatively smoothly, also.

Still, would love to have this for normal Linux processes, out of the box. Am I ignoring an obvious way to do this?

unnouinceput
0 replies
6h11m

The site is not usable on Firefox even after I temporarily allowed JS to run in my NoScript setup. Perhaps it wants me to disable uBlock Origin as well? Not gonna happen, so there is that. For me this is a no go area regardless of how cool the app is.

tivert
0 replies
4h10m

JavaScript must be enabled in order to use Notion.

Please enable JavaScript to continue.

I run NoScript, and the way this block is constructed makes it impossible to actually enable JavaScript to run the site.

Please don't redirect to a static page, because that page doesn't have any of the JavaScript present to enable on a case-by-case basis, and the redirect happens so fast I have no chance to enable JavaScript on the main page.

Just enabling for notion.so itself is not sufficient to bypass the block.

okokwhatever
0 replies
10h33m

Nice idea, really, but it has the smell of a visual keylogger.

maxglute
0 replies
8h11m

This is a great start. I've been wanting something like this for a while; 10-20gb per month feels like a lot though. I feel like archiving at quarter resolution, in black and white, with 32bit rate sound, and eventually having AI upscale it would be a good compromise.

lazylion2
0 replies
23h13m

TIL you can screen record with ffmpeg

gnutrino
0 replies
23h38m

Looks very similar to https://apse.io/, which uses OCR to build a searchable index of everything you've seen on screen. I like the open source aspect of windrecorder.

fnp84
0 replies
3h36m

this is ingenious, and most likely will open a thought process for the implementation of similar but extremely lightweight mechanisms in RATs and other malwares (like Rootkits)

djmips
0 replies
13h11m

This will eventually be an AR app to do the same thing for your daily life.

anon115
0 replies
22h6m

how do i download it? i wanna try the msi not the command line

albert_e
0 replies
1d3h

I hope this becomes a hardware feature soon on laptops -- where we don't need to sacrifice the main CPU/memory/storage to have this functionality.

account42
0 replies
7h58m

Oops! Something went wrong.

Please refresh and try again or.