
Csvlens: Command line CSV file viewer. Like less but made for CSV

jwiz
17 replies
1d7h
zoogeny
7 replies
1d1h

I've been thinking about a terminal spreadsheet editor for about a year now. I haven't deeply investigated this tool while writing this comment, but I'd love something like this for the terminal that supported `=A1+B2`-style formulas.

runjake
4 replies
1d1h

Back in the day, we called this Lotus 123 and VisiCalc.

https://en.m.wikipedia.org/wiki/Lotus_1-2-3

PaulHoule
2 replies
1d1h

The spreadsheet data model lives a double life.

(1) On one hand you can use it for data that is really tabular,

(2) But it also has an engine that can compute the dependency relationships between cells and recalculate the cells affected by a change. This is quite different from conventional programming languages, where you are required to specify an order to put operations in. Of course this is a problem for exploiting parallelism, but I'd charge it is one more bit of cognitive load that makes it harder for beginners and non-professional programmers. (Professional programmers are just used to it and only run into problems in unusual cases where circularity is involved, but I think it's one more thing that beginners struggle with.)
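
That dependency-and-recalculate engine can be sketched in a few lines. This is a toy for illustration only, not any real spreadsheet's implementation: each cell stores its dependencies plus a formula, and a topological sort decides evaluation order.

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# cell -> (dependencies, formula over those dependencies' values)
cells = {
    "A1": ([], lambda: 2),
    "B1": ([], lambda: 3),
    "C1": (["A1", "B1"], lambda a, b: a + b),  # =A1+B1
    "D1": (["C1"], lambda c: c * 10),          # =C1*10
}

def recalculate(cells):
    # Evaluate dependencies before dependents, as a spreadsheet engine does.
    order = TopologicalSorter({name: deps for name, (deps, _) in cells.items()})
    values = {}
    for name in order.static_order():
        deps, formula = cells[name]
        values[name] = formula(*(values[d] for d in deps))
    return values

print(recalculate(cells))  # C1 is 5, D1 is 50
```

A real engine would also track which cells changed and recalculate only the affected subgraph, and `TopologicalSorter` raises `CycleError` on circular references, which is exactly the circularity case mentioned above.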

The worst problem is a lack of separation between code and data. If you are doing an analysis you might put something like

  =SUM(A1:A28)
in A30. Excel will change that to

  =SUM(A1:A29)
if you insert a row in there, which helps, but they lock you into the mindset of "I'm making the December sales report" as opposed to "I'm making the monthly sales report". That is, the data and the analysis should be two separate things: just as you can put the December data or the January data into a Python script.
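
That separation can be sketched as follows, with an invented `amount` column: the analysis is fixed code, and December or January is just the input handed to it.

```python
import csv, io

def monthly_total(csv_text):
    # The analysis: sum the "amount" column, whatever month we're handed.
    reader = csv.DictReader(io.StringIO(csv_text))
    return sum(float(row["amount"]) for row in reader)

# Two months' data; in practice these would come from files, not strings.
december = "amount\n100.0\n250.5\n"
january = "amount\n80.0\n20.0\n"
print(monthly_total(december))  # 350.5
print(monthly_total(january))   # 100.0
```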

(Note some of the same problems still exist with "notebooks" and "workspaces" where you wind up with a file that has both code and data in it, which can be problematic to check into git, particularly when you are working for people who would like to have a beautiful notebook with an analysis in it to view in GitHub but will then struggle to version control it. Many "data scientists" fail to rise above the December sales report even though there's a clear path to turn a Jupyter notebook into a Python script.)

---

As much as that is a rant I think there's a huge untapped market for things that are like spreadsheets but different. That is, something that looks like Excel but is specialized for editing tabular data (no formulas, CSV import 'just works' all the time, ...) or something that has formulas like Excel but not on a grid or not just on a grid. For the latter there was this product

https://en.wikipedia.org/wiki/TK_Solver

which is still around. People were amazed with TK Solver when it came out and I'm surprised to this day that it hasn't had a lot of competition.

wlindley
0 replies
22h38m

We had that once, in an MSDOS-era program called Javelin ( https://en.wikipedia.org/wiki/Javelin_Software ):

"Unlike models in a spreadsheet, Javelin models are built on objects called variables, not on data in cells of a report. For example, a time series, or any variable, is an object in itself, not a collection of cells which happen to appear in a row or column... Calculations are performed on these objects, as opposed to a range of cells, so adding two time series automatically aligns them in calendar time, or in a user-defined time frame. Data are independent of worksheets..."
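
The quoted behaviour can be sketched roughly (dates and values invented for illustration): adding two series aligns them on the calendar key, not on cell positions.

```python
def add_series(a, b):
    # Align on the calendar key; treat a missing observation as 0.
    return {d: a.get(d, 0) + b.get(d, 0) for d in sorted(set(a) | set(b))}

sales = {"2024-01": 10, "2024-02": 20}
returns = {"2024-02": -5, "2024-03": -2}
print(add_series(sales, returns))
# {'2024-01': 10, '2024-02': 15, '2024-03': -2}
```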

That was forty years ago and as far as I know there hasn't been a program that works the same way since. Perhaps someone could reverse-engineer it (the jav.exe file is only 56,448 bytes!)

bee_rider
0 replies
20h19m

Spreadsheets are kind of interesting tech. Because the user is more-or-less directly building the data dependency graph, I’d expect them to map to actual hardware (massively parallel in so many dimensions) better than a sequential procedural language like C.

But, they are not very fashionable.

It would be fun to have a pipeline to compile a spreadsheet, that produces an executable. Potential outputs: cpu, gpu, fpga. Haha.

zoogeny
0 replies
1d1h

Yes, exactly - imagine a modern version written in Rust, with vi key-bindings, etc. Please someone be inspired by this idea and build this. TUIs are making a comeback and the spreadsheet is still one of the most useful and effective tools in all information science.

genericacct
0 replies
1d

I wrote an SC fork that does this ,will send a link when im home

franek
0 replies
1d

While not built around CSV, two terminal spreadsheet tools I have successfully used in the past are sc-im and the (neo)vim plugin vim-table-mode:

https://github.com/andmarti1424/sc-im/

https://github.com/dhruvasagar/vim-table-mode

Back then I stopped using sc-im because it could not import/export XLSX, if I remember correctly. Apparently it can today!

vim-table-mode always felt a little fragile and I don't want to be bound to vim anymore. That said, it still feels like a small miracle to me to have functional spreadsheet formulas inside markdown documents – calculation and typesetting all in one place.

jellyfishbeaver
3 replies
1d3h

Strongly considering donating to the creator of Visidata; it's one of the terminal tools I can't live without. I use it every day.

letmeinhere
0 replies
1d1h

Huh, something (maybe its similarity to the VisiCalc name) had me dismissing it as commercial software.

jwiz
0 replies
1d

Yep, I am doing the $5/mo Patreon level because I love it so much.

bbkane
0 replies
1d1h

If it's that important and you can afford a few bucks a month, I strongly urge you to stop considering your donation and (right now) follow the links on the GitHub page to actually start supporting the project.

It's really easy to wallow in the "considering" phase of a decision indefinitely (or maybe that's just me :D)

standing_user
0 replies
1d7h

Didn't know about it, thank you

somat
0 replies
23h16m

Can confirm, I consider visidata to be the gold standard for data exploration.

I have largely given up on spreadsheets, replacing them with relational databases. And on top of being a world class tool for understanding the data for import, visidata makes for a pretty great query pager.

rrr_oh_man
0 replies
1d7h

Scatterplots in the terminal

Wow!

retrochameleon
0 replies
20h20m

I've been using Visidata and love it for viewing and for extracting rows or columns to a different file, but I know it is capable of much more that I haven't figured out yet.

mistaken
0 replies
1d7h

visidata is awesome. It also supports a bunch of other file formats such as excel sheets.

nickjj
11 replies
1d

One thing I often do with CSVs is sum up all or specific rows in a column.

For example maybe you're doing end of year taxes and now you have this large CSV export from your bank or payment provider with multiple categories and you want to get the totals for certain things.

In a GUI tool it's really easy to sort by a column and drag your mouse to select what you want and see it summed in real time.

Oftentimes things aren't clean enough to have 100% confidence that you can solve this with an automated script, because maybe something is spelled slightly differently but is really the same thing. This feels like one of those things where spending a legit 10-15 minutes once a year to do it manually is better than trying to account for every known and unknown edge case you could think of. The stakes are too high if you get it wrong since it's related to taxes.

Has anyone found a really good standalone basic spreadsheet app that "just works" on Windows or Linux and isn't Microsoft Excel? I don't know why, but LibreOffice and OpenOffice both struggle to parse columns out of certain types of CSVs, and their sorting behavior is typically a lot worse than Google's spreadsheet app. Still, I'd like to remove some dependence on using Google.

e12e
2 replies
22h24m

Have you tried gnumeric?

nickjj
1 replies
22h0m

Nope, but I just tried. I went to their site and noticed there are no binaries for Windows. The site is also served over HTTP (not HTTPS), and the default experience of apt-installing it on Ubuntu 22.04 didn't work due to a bunch of packages no longer existing:

    E: Failed to fetch http://archive.ubuntu.com/ubuntu/pool/main/e/evince/evince-common_42.3-0ubuntu3_all.deb  404  Not Found [IP: 91.189.91.83 80]
    E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/p/poppler/libpoppler118_22.02.0-2ubuntu0.2_amd64.deb  404  Not Found [IP: 91.189.91.83 80]
    E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/p/poppler/libpoppler-glib8_22.02.0-2ubuntu0.2_amd64.deb  404  Not Found [IP: 91.189.91.83 80]
    E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/g/ghostscript/libgs9-common_9.55.0%7edfsg1-0ubuntu5.5_all.deb  404  Not Found [IP: 91.189.91.83 80]
    E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/g/ghostscript/libgs9_9.55.0%7edfsg1-0ubuntu5.5_amd64.deb  404  Not Found [IP: 91.189.91.83 80]
    E: Failed to fetch http://archive.ubuntu.com/ubuntu/pool/main/e/evince/libevdocument3-4_42.3-0ubuntu3_amd64.deb  404  Not Found [IP: 91.189.91.83 80]
    E: Failed to fetch http://archive.ubuntu.com/ubuntu/pool/main/e/evince/libevview3-3_42.3-0ubuntu3_amd64.deb  404  Not Found [IP: 91.189.91.83 80]
    E: Failed to fetch http://archive.ubuntu.com/ubuntu/pool/main/e/evince/evince_42.3-0ubuntu3_amd64.deb  404  Not Found [IP: 91.189.91.83 80]
    E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/w/webkit2gtk/libjavascriptcoregtk-4.0-18_2.42.1-0ubuntu0.22.04.1_amd64.deb  404  Not Found [IP: 91.189.91.83 80]
    E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/w/webkit2gtk/libwebkit2gtk-4.0-37_2.42.1-0ubuntu0.22.04.1_amd64.deb  404  Not Found [IP: 91.189.91.83 80]
Going to go with nope on this one.

e12e
0 replies
21h28m

Gnumeric isn't in main on Debian/Ubuntu anymore?

Ed: looks like it is in universe on Ubuntu - you have universe enabled?

https://packages.ubuntu.com/search?keywords=gnumeric

Ed: Did you apt update? Core pieces of evince missing sounds very strange?

AB1908
2 replies
1d

I've never tried this myself, but you could try importing the CSV into SQLite and then running aggregate queries over it. Does that sound useful?
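
A minimal sketch of that approach using Python's stdlib `sqlite3`, with invented column names standing in for a real bank export:

```python
import csv, io, sqlite3

# Stand-in for a real export; a real file would be read with open(...).
csv_text = "category,amount\nfood,10.5\ntravel,99.0\nfood,4.5\n"
rows = [(r["category"], float(r["amount"]))
        for r in csv.DictReader(io.StringIO(csv_text))]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tx (category TEXT, amount REAL)")
con.executemany("INSERT INTO tx VALUES (?, ?)", rows)

# Aggregate query: total per category.
totals = dict(con.execute(
    "SELECT category, SUM(amount) FROM tx GROUP BY category ORDER BY category"))
print(totals)  # {'food': 15.0, 'travel': 99.0}
```

The sqlite3 CLI can do the import directly with `.import --csv`, as mentioned elsewhere in this thread.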

nickjj
0 replies
1d

You can but it's back to depending on code to get the totals. This is one spot where IMO being able to visualize the data by seeing the rows and have immediate feedback on the sum is useful. You can continue dragging or use CTRL + SHIFT clicking to select more stuff as needed while letting your brain decide what should be grouped together.

With a SQL / code approach you have to account for these things without being able to see them and then adjust the code afterwards to include your custom groupings. It ends up taking more time. If the categories didn't change every year it would for sure be worth it to code up a solution since you'll know the edge cases by looking at the existing CSV but it's a moving target because it could change next year.

mongol
0 replies
1d

Was about to suggest the same

wobblykiwi
0 replies
1d

I haven't dug too much into the tool, but could you use something like Datasette for that?

snidane
0 replies
23h36m

There is sc-im, which is the closest you'll get to a full spreadsheet app in terminal with vi controls.

https://github.com/andmarti1424/sc-im

lowbloodsugar
0 replies
23h40m

Jupyter?

hermitcrab
0 replies
4h36m

Have you looked at data wrangling software like Easy Data Transform, Knime or Alteryx? They are specifically written for cleaning, merging, summing and reshaping tabular data, and they can all handle CSV and Excel files (plus a lot more).

hectormalot
0 replies
23h54m

If you want to use the CLI, Visidata might be what you want. It does have a bit of a learning curve. Beyond that, I've found it quite handy to do quick data explorations. e.g. there are shortcuts for histograms, filtering, x-y plots, etc.

[1]: https://www.visidata.org

drog
9 replies
1d5h

One of the things that greatly improved my csv workflow is duckdb. It’s a small binary that allows querying csv with sql.

lyjackal
4 replies
1d3h

You can do the same with SQLite, which is usually already installed in most environments

    sqlite> .import test.csv foo --csv

llimllib
1 replies
1d2h

A poor man's version of csvlens is something like:

    sqlite -column :memory: '.import --csv file.csv tmp' 'select * from tmp;' | bat
which imports the csv into sqlite and outputs it to bat, my favorite pager - use `less` or whatever else you desire.

1vuio0pswjnm7
0 replies
13h7m

In addition to using a pager with sqlite3's fantastic text-only output .modes, if the CSV contains hyperlinks I use a custom UNIX filter I wrote that outputs simple, minimalist HTML. Then I view with text-only browser.

For example, this is how I use YouTube. I never use the YouTube website, with its gigantic pages and its "Javascript player", not to mention all of the telemetry. All the search results and information about videos are stored in SQL or CSV, viewed with a text-only sqlite3 output .mode, and optionally converted to simple HTML.

For me, this is better than a "modern" web browser that's too large for me to compile.

xcdzvyn
0 replies
13h21m

Wow, this is awesome. Thanks a lot.

dbreunig
0 replies
1d1h

duckdb is a single file with no dependencies and it's fast. Still blows my mind how quickly it can query GB sized gzipped CSVs.

wenc
1 replies
22h57m

I use DuckDB for queries and Visidata for quick inspections.

Between those two, I can work with not only CSVs, but also JSON and Parquet files (which are blazing fast -- CSVs are good for human readability and editability, but they're horrendous for queries).

CLI CSV tools pop up every now and then, but there's too many of them and I feel that my use cases are sufficiently addressed with only 2 tools.

paddy_m
0 replies
53m

If you use Jupyter, check out what I'm building with Buckaroo. The aim is to have a table viewer that does the right/obvious thing for most data (formatting, sorting, downsampling large datasets, summary statistics, histograms), all customizable. It supports pandas and polars.

https://github.com/paddymul/buckaroo

ok_computer
0 replies
15h29m

If using python, I’ll also recommend polars sql context manager to run queries on csv.

https://docs.pola.rs/py-polars/html/reference/api/polars.SQL...

dkga
0 replies
21h49m

Long live duckDB! Big fan here.

musha68k
5 replies
1d7h

Not at a “computer” right now, but some such thing in your pipeline should be good enough more often than not?

    awk -v OFS='\t' '{$1=$1; print}' | column -t

cowsandmilk
4 replies
1d4h

Many CSV data sets cannot be handled by awk, because CSV supports things like commas inside fields, newlines inside fields, etc. Your solution would also fall over for any tabs within fields. Plenty of tools out there support RFC 4180 CSV files as well as common CSV variants. These tools have been around for a decade and have suites of CLI commands that can be piped together, speaking "CSV" as a format rather than pure text streams.
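
To illustrate the point, Python's stdlib csv module is one such RFC-4180-aware parser; a naive split on commas (or a tab substitution) would mangle this record, which has both a quoted comma and an embedded newline:

```python
import csv, io

# One record whose field contains both a comma and a newline.
tricky = 'name,notes\n"Smith, Jane","line one\nline two"\n'
rows = list(csv.reader(io.StringIO(tricky)))
print(rows)
# [['name', 'notes'], ['Smith, Jane', 'line one\nline two']]
```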

prudentpomelo
2 replies
1d1h

Awk now supports a `--csv` flag for processing CSVs. https://github.com/onetrueawk/awk/blob/master/README.md

snidane
0 replies
23h34m

How does one output properly CSV-quoted rows? It seems the csv flag works only for parsing inputs.

musha68k
0 replies
1d1h

oh nice TIL

musha68k
0 replies
1d2h

Good points on using proper CSV tooling, 100% - that's why the disclaimer “more often than not”.

BTW I just realized that I was actually omitting the field separator ‘-F,’.

For those pesky nested commas, if you're still on top of simplistic text tools, I'd go back to sed for fun and potential profit:

    sed 's/","/"\t"/g; s/^"/"\t/; s/"$//; s/,,/, ,/g' | column -t

Not elegant nor hands-off, and it might be missing something else; it definitely needs the right data/volume and some fiddling - as we know, YMMV; it can get unwieldy quickly, etc.

TL;DR: use column -t when applicable

pradeepchhetri
3 replies
1d5h

I enjoy using clickhouse-local for parsing CSV files. I generally hit situations where I need a custom delimiter and custom parsing rules, and I find it handles all of these edge cases very well. Recently I found that if my CSV files are compressed, I don't even need to uncompress them: it auto-magically figures out the compression format and processes them for me.

magundu
1 replies
14h35m

Which one is better for querying CSV from both the command line and programmatically?

1. SQLite
2. DuckDB
3. clickhouse-local

pradeepchhetri
0 replies
11h55m

Definitely clickhouse ecosystem.

- For querying csv data from command line, I use clickhouse-local.

- For querying csv data programmatically using a library, I use chdb (embedded version of clickhouse)

- For querying large amount of csv data programmatically, I offload it to clickhouse cluster which can do processing in distributed fashion.

If you are looking from query performance perspective, this blog is useful: https://www.vantage.sh/blog/clickhouse-local-vs-duckdb

tambourine_man
0 replies
1d2h

Didn’t know about them, seems interesting.

fao_
3 replies
23h44m

See, we wouldn't need this if everyone had standardized on TSV instead.

JohnKemeny
2 replies
22h37m

I don't understand if this is meant as a joke or not. Do you mean that we then would have Tsvlens? Tabs don't automatically make columns aligned, nor do they eliminate the problem of quoting. I don't get it, but then again, I'm exceptionally dim-witted.

m2f2
0 replies
20h31m

TSVs are far superior to CSVs, as the separator is a tab and not commas, semicolons, or other weird stuff depending on your locale. Same for single and double quotes. The only thing to be aware of are TSVs created by Excel with newlines embedded in text fields, which are a huge PITA for almost all parsers.

bee_rider
0 replies
20h32m

Because tabs are flexible space, you can render them really wide in most text editors, wide enough that all the columns line up. But it is a hacky solution!

RhysU
3 replies
1d5h

ngrid is quite usable, stable, and sensible for streaming work: https://github.com/twosigma/ngrid

ranger_danger
2 replies
1d3h

a TUI program without a screenshot might as well not exist... nobody is going to use it if they can't see what it looks like.

RhysU
1 replies
1d3h

Heh, and to think I downloaded dozens of Slackware floppy images over a 28.8 modem without any screenshots! What a fool I was!

(I have used this utility weekly for almost 10 years. It isn't pretty. It's effective.)

bbkane
0 replies
1d

Maybe they'd accept a PR to the README with a screenshot.

wodenokoto
2 replies
1d2h

Is there an open source thing similar to this that has a windows gui?

pama
0 replies
1d2h

Emacs has an excellent csv mode.

olavgg
0 replies
1d2h

Libreoffice is a great tool for opening csv files

edu_guitar
2 replies
1d1h

Nice! Once or twice I've used tad[1] as a GUI to view CSV files, but I usually use vi with nowrap or read the file in R. Now csvlens will be my default for CSV files.

[1]: https://www.tadviewer.com/

bee_rider
1 replies
20h34m

I like viewing most CSVs in vim. It can be a little annoying when the columns don’t line up well. Often I’ll replace all the commas with a tab, and then set the tab width to some very high value to make it all line up.

Of course, there exist plenty of CSV files that don’t follow any particular standard, so the trick doesn’t always work (if there are tabs in the data, if there’s some very wide field, or if there’s a comma in the data). Works good enough for the files I have, though.

quesera
0 replies
13h8m

There are a few vim plugins that make CSVs more convenient.

adius
2 replies
23h4m

I found the perfect solution for me with https://www.moderncsv.com. Starts fast, focused GUI, fully featured, no bullshit!

minroot
1 replies
21h59m

Really cool; I wish it were open source.

hermitcrab
0 replies
4h39m

Some people are trying to make a living from the software they write. Open source is not very conducive to that.

tinkertamper
1 replies
1d3h

Love this!

Call me lame, but are there any open-source projects that accomplish this type of GUI in TypeScript? I have an idea, but it uses something written in JavaScript.

chuckadams
0 replies
1d2h

You can use any javascript library in typescript, and TS is all JS at runtime anyway. You can even add types without touching the JS library, just write a .d.ts file and stick it anywhere that TS looks for sources.

kkoncevicius
1 replies
1d7h

Does it allow specifying columns when filtering for rows?

For instance, the shown example filters for "Bug" but seems to filter for rows that contain "Bug" in any of the columns. Can I specify to filter for "Bug" only looking at a certain column?

At the same time - is filtering by numbers implemented, like filtering for rows that have a numeric value above X in a certain column?

If not, these should be on the TODO list - those are very common operations for .csv-type data.
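
For reference, both requested operations are one-liners once the file is parsed column-aware; a sketch with invented column names:

```python
import csv, io

data = "title,label,priority\nCrash,Bug,9\nDocs typo,Bug,2\nNew theme,Feature,5\n"
rows = list(csv.DictReader(io.StringIO(data)))

bugs = [r for r in rows if r["label"] == "Bug"]         # match one column only
urgent = [r for r in rows if float(r["priority"]) > 4]  # numeric threshold
print([r["title"] for r in bugs])    # ['Crash', 'Docs typo']
print([r["title"] for r in urgent])  # ['Crash', 'New theme']
```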

zeckalpha
0 replies
1d2h

I'd love something like this in csvkit, xsv, or qsv. The refragmentation of CSV CLI tools is counter to the long term trend.

yzzyx
0 replies
20h17m

I have been looking for a viewer exactly like this for so long! Visidata is nice, but way more complex than what I was after. This will fit perfectly into my workflow, thank you for sharing!

trackofalljades
0 replies
1d2h

This is glorious, someone put it in homebrew!

flusteredBias
0 replies
18h24m

See also: (tv) Tidy Viewer. A cross-platform CLI csv pretty printer.

https://github.com/alexhallam/tv

bishfish
0 replies
1d3h

This would be nice to have with lesspipe.sh viewer.

account-5
0 replies
1d6h

I love these sorts of programs. Is there a list somewhere of data programs like this for the command line? I'm sure there will be.

RicoElectrico
0 replies
1d1h

I tried some TUI tool that was pretty advanced at viewing tables (sorting, filtering etc.), but I don't remember the name. It was on a WSL instance on my work laptop which I nuked when they laid me off.

I remember that it was tab-based. Any idea what that could have been?

BasilPH
0 replies
21h25m

That looks like a great CSV viewer.

I've enjoyed using csvkit[^0] in the past. The viewer isn't as good as csvlens seems to be, but it comes with the ability to grep, cut and pipe CSV data, which has come in handy.

csvlens + csvkit might be a great combination.

[^0]: https://csvkit.readthedocs.io/en/latest/