1. It's amazing that they're doing this as a gradual C++ to Rust rewrite, while keeping it working end-to-end, if I understand correctly.
2. It's amazing how quickly this is going.
If you haven't tried fish yet, make sure to do so! It's a really ergonomic shell and overall very pleasant to use, with good defaults (you don't have to customize it, even though you can, for a great experience). I switched from bash a couple of years ago and haven't looked back since.
Bonus: you won't have to google how to write a for loop in bash ever again (which I, writing them rarely and finding the syntax unintuitive, had to do every single time)!
Tip: some easy-to-remember Bash loop constructs:
Incremental index:
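(one common form, for example:)

    # C-style counter loop
    for ((i = 0; i < 10; i++)); do
        echo "$i"
    done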
For iterating over an array, see the sketch below.

Please please please always quote variable expansion. Just do it everywhere, every time.
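For example (the array contents are just for illustration):

    fruits=("apple" "blood orange" "banana")
    for f in "${fruits[@]}"; do   # quote the expansion so "blood orange" stays one item
        echo "$f"
    done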
Or use Fish.
Or python…
python is unbelievably awkward for shell scripting though
I think with the cmd package it’s not actually that bad and quite ergonomic. Ymmv
Do you mean this cmd package?
https://docs.python.org/3/library/cmd.html
If so, that's entirely orthogonal to the problem. cmd is for writing the interactive "front end" to a command-line interpreter; it doesn't help at all in writing Python scripts to replace shell scripts (which typically run with no interactive interface at all).
The sort of problem I think fragmede is alluding to is that of writing Python code to emulate a shell script construct like:
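say, counting the unique lines in a file (the file name is made up):

    sort access.log | uniq | wc -l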
e.g. constructing pipelines of external commands. It's certainly possible to do this in Python, but I'm not aware of any way that'd be anywhere near as concise as the shell syntax.

I really don't find using subprocess all that tedious personally. I usually have one helper function around `subprocess.run` called `runx` that adds a little sugar.
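A minimal sketch of what such a wrapper might look like (the details here are guesswork, not the commenter's actual code):

    import shlex
    import subprocess

    def runx(cmd, **kwargs):
        """Run a command, echo it, raise on failure, capture text output."""
        if isinstance(cmd, str):
            cmd = shlex.split(cmd)
        print("+", " ".join(cmd))
        return subprocess.run(cmd, check=True, text=True,
                              capture_output=True, **kwargs)

    print(runx("uname -a").stdout, end="")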
But if you really want something more ergonomic, there's sh:
https://pypi.org/project/sh/
subprocess doesn't make it easy to construct pipelines -- it's possible, but involves a lot of subprocess.Popen(..., stdin=subprocess.PIPE, stdout=subprocess.PIPE) and cursing. The "sh" module doesn't support it at all; each command runs synchronously.
Incorrect. https://sh.readthedocs.io/en/latest/sections/piping.html
Just add `_piped=True` to the launch arguments and it'll work as expected, in that it won't fully buffer the output & wait for the command to complete.
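Roughly (following the sh docs' ls/wc piping example):

    import sh

    # equivalent of: ls -1 /etc | wc -l
    # _piped=True streams ls's output into wc instead of buffering it all first
    print(sh.wc(sh.ls("-1", "/etc", _piped=True), "-l"))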
YMMV but I found sh to be a step function better ergonomically, especially if you want to do anything remotely complex. I just wish that they would standardize it & clean it up to be a bit more pythonic (like command-line arguments as an array & then positional arguments with normal names instead of magic leading `_` positional arguments).
Shoot. I mixed up the module name. It's the sh module https://sh.readthedocs.io/en/latest/
It's not as natural in some ways for most people because you have to write it right to left instead of left to right as with the pipe syntax. If you split it over multiple lines it's better (first sketch below). There's also all sorts of helpful stuff you can do like invoking a callback per line or chunk of output, with contexts for running a sequence of commands as sudo, etc etc.

And of course, you don't actually need to shell out to sort/uniq either (second sketch below):
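For instance (file name invented):

    import sh

    # split across lines, the right-to-left nesting is easier to follow:
    # equivalent of: sort access.log | uniq | wc -l
    sorted_lines = sh.sort("access.log", _piped=True)
    unique_lines = sh.uniq(sorted_lines, _piped=True)
    print(sh.wc(unique_lines, "-l"))

    # and plain Python needs neither sort nor uniq to count unique lines:
    with open("access.log") as f:
        print(len(set(f)))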
This is also cheaper because it avoids the sort, which isn't strictly necessary for determining the number of unique lines (sorting is typically going to be an expensive way to do that for large files compared to a hash set, because of the O(n log n) string comparisons vs O(n) hashes).

It's really quite amazing and way less error prone too when maintaining anything more complicated. Of course, I've found it not as easy to develop muscle memory with it, but that's my general experience with libraries.
Oh neat, I guess I missed the "_piped" arg when I looked. That does make it a lot better.
Yeah, it's a contrived example. Imagine something more important happening there. :)
yeah exactly. E.g.
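(a made-up illustration; the pattern and path are arbitrary:)

    find . -type f -name '*.log' | xargs grep -l ERROR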
is possible to do with os.walk() in python, but getting there is so unergonomic.

You can also pipe the output of find to grep if you want to run grep for some reason instead of writing that processing in python (maybe it's faster or just more convenient).
For writing on the command-line, yes. For anything over a screenful of code and/or using non-trivial math or non-scalar variables, I’ve found the opposite to be true. Python forces multi-line code more but the functionality is so much richer that it’s easy to end up replacing a hundred line shell script with half as many lines of easier-to-read code. Literally every time I’ve done that with a mature shell script I’ve also found at least one bug in the process, usually related to error handling or escaping, which the original author knew about but also knew would be painful to handle in shell.
^ this
To illustrate why, consider a file name with spaces.
Additionally, it's better to glob with ./* than * (see the sketch below).

Also for sanitizing user input.
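A couple of illustrations (file names invented):

    # a file name with spaces breaks unquoted expansion:
    f="my file.txt"
    cat $f       # word-splits into two arguments: "my" and "file.txt"
    cat "$f"     # passes the single name "my file.txt"

    # a file whose name starts with "-" gets parsed as an option:
    touch -- '-n' 'real.txt'
    ls *         # the glob expands to -n real.txt, so ls treats -n as a flag
    ls ./*       # ./-n is unambiguously a file name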
True, true. I don't worry about it for explicitly simple indexes because too many quotes are ugly. But in general it's completely right.
Just to illustrate for others, in fish that would be a for loop over (seq ...), usually written with an explicit range, and that is already based on iteration, so for iterating over files you'll similarly loop over (ls), and it works the same for arrays (sketches below).

You don't even need the (ls) for file iteration:
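Roughly (names are just for illustration):

    for i in (seq 1 10); echo $i; end      # counter loop
    for f in (ls); echo $f; end            # iterate over files via ls
    for f in *; echo $f; end               # or glob directly, no ls needed
    set names alice bob carol
    for n in $names; echo $n; end          # lists/arrays work the same way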
Fish also has globstar out of the box, so you can glob recursively with ** (sketch below).
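For example:

    for f in **; echo $f; end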
> You don't even need the (ls) for file iteration:

You don't even need the for for that (specific example):

    echo *
at least in sh and bash.
echo * does not have the newline after each file
True, echo * will print all the filenames on one line, with a space between pairs, but for data processing purposes, the two snippets are equivalent, because as per shell behavior, newline, tab and space are all treated as space, unless quoted. Like the definition of whitespace in the K&R C book. After all, Unix is written in C.
You wouldn’t iterate over `echo *` anyway. This works just fine in bash/sh:
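i.e. presumably something like:

    for f in *; do
        echo "$f"
    done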
Also, IIRC, and less commonly seen, you can do:
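(something along these lines:)

    for i; do
        echo "$i"
    done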
which will iterate over all the command line arguments to the script containing that for loop. It is a shortcut for `for i in "$@"`. This is mentioned in either the shell man page or in the Kernighan & Pike book The Unix Programming Environment.

Your example in fish, and directly translated to bash (sketches below). No need for seq :)
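Presumably along the lines of:

    # fish
    for i in (seq 1 10); echo $i; end

    # bash, with brace expansion instead of seq
    for i in {1..10}; do echo "$i"; done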
If, for whatever reason, you want leading zeroes - BASH will respect that. Do {01..10}.

seq would work in bash too, but it's not part of the language, it's an external tool. C-like syntax is more expressive I'd say for simple numerical increments.
“Just remember how to do it” isn’t really the problem.
It’s that the finer details of the syntax is sufficiently different from other things I do, and I don’t write shell scripts frequently enough to remember it.
exactly. i predict some kind of 'chat gpt shell' in the near future.
GitHub has already released this a couple months ago: https://docs.github.com/en/copilot/github-copilot-in-the-cli...
zsh makes this so much easier. This doesn't capture $i, but covers most use cases of "I want to run something more than once":
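(something like:)

    repeat 5 echo hello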
If you do need $i you can use a similar construct as in bash, but more convenient (first sketch below). Short loops are hugely helpful in interactive use because it's much less muckery for short one-liners (whether you should use them in scripts is probably a bit more controversial).

Also looping over arrays "just works" as you would expect without requiring special incantations (second sketch below):
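Roughly (assuming short loops are enabled, which is the default):

    # C-style loop when you need $i, no do/done required
    for ((i = 1; i <= 3; i++)) echo $i

    # arrays just work: one iteration per element, spaces preserved
    arr=(one two "three four")
    for x ($arr) echo $x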
Whether zsh or fish is better is a matter of taste, and arguably zsh has too many features, but IMHO bash is absolutely stuck as a "1989 ksh clone" (in more ways than one; it uses K&R C all over the place, still has asm hacks to make it run on 1980s versions of Xenix, and things like that).
Any reason to eschew short_loops in general that you're aware of? I ask because I'd probably use `for i ({0..9}) echo $i` in your for loop example. I've never managed to get my head around the necessity for a narrow short_repeat option when there is - for example - no short_select.
All of the zsh alternate forms feel far superior to me, in both interactive use and within scripts.
JNRowe notes the many off-by-one translations in other examples and tips hat to arp242
Oh yeah, I forgot about the {n..m} syntax; I almost always use repeat these days or the C-style loop if I need $i, as that's more "in the fingers", so to speak, from other languages, even though {n..m} is easier.
I don't know why you would want to avoid short loops, other than subjective stylistic reasons (which is completely valid of course). The patch that added short_repeat just asserts that "SHORT_LOOPS is bad"[1], but I don't know why the author thinks that.
Also: my previous comment was wrong; you don't need "setopt short_repeat"; you only need it if you explicitly turned off short loops with "setopt no_short_loops".
[1]: https://www.zsh.org/mla/workers/2019/msg01174.html
Bash array syntax is error prone and hard to read, and the expansion of arrays still depends on the field separator and must fit in the environment.
Most of the time you should just rely on that field separator and do it the simple way:
Much more obvious and more shell-like. That's how all bash functions process their parameters. Set IFS only if you really need to. But in that case also consider something like xargs, where you are not limited by the environment, have null-terminated fields, and get parallelism.
Arrays are only useful when you need multidimensionality, at which point you should probably look at using data files and process with other tools such as sed. Or start looking at something like Perl or Python.
Your example proves that not using arrays is worse. That loop will only run once and just print every element on one line. The array equivalent works as expected, _isn't_ affected by IFS, and can handle spaces in individual elements.
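To make the difference concrete (values invented):

    items="one two three four"
    for i in $items; do echo "$i"; done        # splits on whitespace: "three four" becomes two items
    arr=(one two "three four")
    for i in "${arr[@]}"; do echo "$i"; done   # three iterations, "three four" stays intact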
Can't speak for others, but one issue is the "; do".
I can generally cobble together python, and it'll be syntactically correct. I just need to check libs.
With shell I need to stop and make sure my semicolons in a for loop are correct.
If you're a heavy user it probably isn't a problem but all the little warts just make it difficult for an occasional user to keep a good enough model in their head.
You don't need the semicolon. Just use a newline instead; it's also easier to remember that way.
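i.e.:

    for f in *.txt
    do
        echo "$f"
    done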
zsh short loop, and powershell (both sketched below).

For the first, I prefer range syntax:
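Perhaps something like (loop bodies invented):

    # zsh, short loop
    repeat 3 echo hello

    # powershell
    foreach ($i in 1..3) { Write-Output $i }

    # zsh again, with range syntax
    for i ({1..3}) echo $i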
Seems to me they're not doing it gradually at all.
https://github.com/fish-shell/fish-shell/discussions/10123#d...
The price of C++ compatibility is that it doesn't use Rust strings internally. It's all "wchar" and "WString", which are Microsoft C++ "wide character strings". This may be more of a Microsoft backwards compatibility issue than a C++ issue.
and yet they may be breaking Cygwin support?
I think the problem is more like, there are valid files that aren't representable as rust strings?
Like? I don't use Rust but I assume it's capable of UTF-8?
The siblings are correct but also not precise, in a way that could be misleading.
Rust has one built-in string type: str, commonly written &str. This is a sequence of UTF-8 encoded bytes.
Rust also has several standard library string types:
* String is an owned version of &str. Also UTF-8.
* CString/CStr: like String and &str, but null terminated, with no specific encoding
* OsString/OsStr: an "os native" encoded string, in practice, this is the WTF-8 spec.
And of course, all sorts of user library-created string types.
The issue at hand isn't that there's some sort of special string type that Rust can't represent, it's that they have an existing codebase that uses a specific string type, so they're going to keep using that type for compatibility reasons, rather than one of the more usual Rust types. This means they won't have to transcode between the two types.
<3 WTF-8
If we're competing to be pedantic here, the problem is that Windows encodes paths as UTF-16 and Linux can support just about any random jumble of bytes in a path, regardless of whether those bytes are valid UTF-8 or not. Neither of these plays nicely with Rust's "if it can be represented, it's valid" approach to code safety, so OsStr(ing) exists as a more permissive but less powerful analogue for such cases.
Rust is capable of UTF-8. It is not capable of not UTF-8. Any sequence of bytes that is not a valid UTF-8 string cannot be represented by the String type.
Rust strings enforce utf-8 encoding, yes. However, it seems Windows (which uses utf-16) allows ill-formed UTF-16, in particular it tolerates unpaired surrogates.
You can read more here http://simonsapin.github.io/wtf-8/
That's why OsString exists.
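A tiny illustration of that split (the byte values are arbitrary; the OsString part is Unix-specific):

    use std::ffi::OsString;
    #[cfg(unix)]
    use std::os::unix::ffi::OsStringExt;

    fn main() {
        // valid UTF-8 is fine as a String ...
        let euro = String::from_utf8(vec![0xE2, 0x82, 0xAC]).unwrap(); // "€"
        println!("{euro}");

        // ... but invalid UTF-8 is rejected by String,
        assert!(String::from_utf8(vec![0xFF, 0xFE]).is_err());

        // while an OsString (e.g. a file name on Unix) can still carry those bytes
        #[cfg(unix)]
        {
            let raw = OsString::from_vec(vec![0xFF, 0xFE]);
            println!("{raw:?}");
        }
    }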
The Cygwin issue isn't strings (well, that could be another issue) but that Rust doesn't support Cygwin in the first place, at least according to the comments in the linked thread.
Well, for whatever reason, it doesn't meet their requirements... See the String section of https://github.com/fish-shell/fish-shell/blob/master/doc_int...
edit: here is a more specific rationale: https://github.com/fish-shell/fish-shell/pull/9512#discussio...
It has nothing to do with Windows. fish doesn't support Windows. Their use of wchar_t is the glibc wchar_t (wchar_t is not Microsoft-specific) which is a 32-bit type and stores UTF-32-encoded codepoints. The Rust type they're using is also the same ( https://github.com/fish-shell/fish-shell/blob/master/doc_int... ).
They are doing the release all-at-once, but developing gradually. From the original PR[1]:
[1] https://github.com/fish-shell/fish-shell/pull/9512
It's a bit of both. They are not doing any mixed C++/Rust releases, but you can check out the source and build it in the current mixed state and it will produce a binary that contains parts from both languages and works as a Fish shell.
I'd still rather write Bash scripts or just bust out Python or Perl, but I definitely prefer Fish for terminal usage.