
Shellcheck finds bugs in your shell scripts

lolc
50 replies
2d21h

My take is that bugs in sh scripts are best avoided by not programming in sh. So my preferred tool for sh linting is git-rm. It's not always possible, but driving down the sh line count sure helps against bugs from this language and its weird expansion rules.

Most people don't even know the language has expansion rules and write stuff that accidentally works after the fourth try. This lang wants to become obsolete.
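
A minimal illustration of the expansion rules in question (the filename here is hypothetical):

  f="my report.txt"   # one file whose name contains a space
  rm $f               # unquoted: word-splits into "my" and "report.txt"
  rm -- "$f"          # quoted: removes exactly the intended file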

synergy20
17 replies
2d20h

there are many cases where you have to use sh scripts, e.g. on many IoT devices, embedded devices etc. where Python and friends are just huge and slow.

uxp8u61q
7 replies
2d20h

If you wouldn't program it in Python, you wouldn't program it in sh either. You don't run shell scripts on embedded devices! Typically, an embedded device runs one program that basically acts as the whole OS for the device. There's no kernel or userspace to speak of. You're directly interfacing with the hardware, and that's it.

pletnes
6 replies
2d20h

This just isn’t true. Smart TVs run some Linux/Android, so does car infotainment; the list goes on.

Old tumble dryers, vacuum cleaners - sure.

uxp8u61q
4 replies
2d20h

These appliances can run Python scripts just fine, then. Try to read the whole context instead of focusing on one part of the comment. I'm writing in the context of an appliance that can't run Python scripts, which means the resources are heavily constrained.

treis
3 replies
2d19h

Eh, there's like a 10x difference in speed between Python and Java/Go and probably like a 100x between Python and shell stuff. Definitely some devices in that range that can do shell stuff but not Python.

cjaybo
1 replies
2d19h

Are you implying that Bash is 10x faster than Java or Go?

treis
0 replies
2d19h

Not Bash but the libraries they call out to.

pletnes
0 replies
2d6h

Yep, and size wise, busybox is <1MB and python is much larger.

lachlan_gray
0 replies
2d18h

This probably explains why Mercedes-Benz is on the list of sponsors

diego_sandoval
4 replies
2d20h

Possibly dumb question: Why not write it in a compiled language like C or Go?

synergy20
0 replies
2d20h

because it's a script; we have bash and C on Linux and we need both. Same for embedded devices.

paulddraper
0 replies
2d20h

Because then you need a computer and build process.

Or, pre compile it across all target platforms and have an install process.

kkfx
0 replies
2d19h

Personally, because it's quick. Sometimes I just need to automate some set of CLI commands... Of course, sometimes things evolve and it's time to replace the script with something easier to handle at the new scale, but for simple stuff, meaning something that fits on a single page or two, scripts are far quicker.

BTW, on a broader topic: a classic system with a user-programmable language as the topmost human-computer automation/interface is obviously better, but we have had such systems in the past and commercial reasons have pushed them into oblivion, so...

justapassenger
0 replies
2d19h

Bash is perfect for interacting with console tools.

If that’s what you need to do, C/Go won’t only be much longer code, but also much harder to write and more error-prone. Command-line tools are complex to deal with and there’s no magic-bullet language for that.

IshKebab
3 replies
2d20h

Properly designed IoT devices wouldn't have Bash at all in my book.

kkfx
1 replies
2d19h

Do you know of properly designed IoT devices on sale? Personally I have some IoT at home to automate the home itself, especially for p.v. self-consumption, and the best I was able to find and integrate can be described as crap... I failed to find anything else...

A simple example: I like to have some electricity switch/breaker automation. The best I've found are from the Shelly Pro series, which have a not-really-useful webserver built in and not-really-useful APIs; the rest are even worse, having no wired versions at all. Why the hell not offer a manual breaker + two wires for Modbus, so I do not need to fit ethernet wires and power in the same place?

Why is just finding classic Modbus-TCP/MQTT wired devices so hard?

Things meant to be integrated do not need shiny UIs; they need effective ways to integrate them, simple and reliable comms. My VMC, which is not a dirt-cheap device, has Modbus support; unfortunately even the vendor does not know the full list of registers, and many of them work only "sometimes", like "write a 1 to switch from heating to passive ventilation", which "sometimes works". So to integrate it I need to check whether the command was "accepted" after 30 seconds, then re-check after 40 because sometimes it flips back for unknown reasons... And the list is long...

IshKebab
0 replies
2d4h

There aren't very many unfortunately, but hopefully Matter will change that (I haven't looked into the protocol in detail tbh - been out of the IoT world for a while).

MQTT is not a good solution. It's a very ad-hoc protocol that really only works if you are a programmer setting up a completely custom system. Obviously manufacturers are not going to cater to that minuscule market.

morelisp
0 replies
2d19h

Not strident enough. Properly designed Ts wouldn't have I at all in my book.

fooker
9 replies
2d20h

There are two kinds of languages, ones that everyone complains about, and those that nobody uses.

devnullbrain
4 replies
2d18h

Everyone uses Python

theonemind
1 replies
2d17h

I complain about Python all of the time.

1. It seems to frequently get used for large products where people should really use strong static typing.

2. Python 2 to 3 fiasco

3. Packaging and venvs and dependencies are a mess.

   3a. Versioning of dependencies, culturally handled in a slapdash kind of manner.

   3b. Managing multiple versions of Python on my system, and multiple venvs per version.

4. Not a functional language.

5. Insanely, ridiculously slow compared to something like C. Slow even by the standards of scripting languages.

6. Python culture doesn't seem to acknowledge that regular users don't care at all what language you used and want an executable, not a hot-glue build-your-own-environment hobby kit.

7. Pervasive culture of weak backward compat in libraries.

8. The pip devs are opinionated, and frankly user-hostile in my opinion, breaking backward UX convention, deprecating, general high-handed ivory-tower dev behavior.

9. I'd complain about the GIL, and a lot of people do, but I actually personally don't care. Maybe I covered it with the slow runtime and care indirectly.

Every time I encounter something written in Python I just mentally groan and roll my eyes. I genuinely don't like Python. Some of what I enumerated obviously bleeds into culture and ecosystem, but in practical terms for practical use, that's not so cleanly separable from the language in a lot of cases. Perl was always great about backwards compat and documentation, Ruby had a strong web dev bent because of rails, blahblah.

devnullbrain
0 replies
2d17h

These are mostly complaints about Python the development language, not Python the sh alternative.

spoiler
1 replies
2d18h

People complain about various python and its ecosystem's quirks all the time, though!

devnullbrain
0 replies
2d17h

You don't need to use its ecosystem at all to use it as a replacement for sh

chriswarbo
3 replies
2d18h

Urgh, I hate this thought-terminating cliche. It makes no distinction between minor annoyances and major design flaws.

About a decade ago I worked with someone who'd say this whenever a PHP project hit some ridiculous bug and I'd suggest we maybe consider something else for future projects. Like, sure, people complain that mixing tabs with spaces causes an invisible syntax error in Python; but at least it (a) has a dictionary data structure, and (b) doesn't unavoidably convert some string keys into numbers, such that CSS colours like "112e33" become 112×10³³. (I thankfully don't use PHP anymore; inb4 "ackshually PHP7 is great"...)

fooker
2 replies
2d14h

Your solution to hitting a bug in a tool is switching ecosystems and spending years building up your expertise?

What happens when you hit another bug in the new ecosystem?

lolc
0 replies
2d5h

I may evaluate whether a platform is full of accidental complexity and decide to switch. This evaluation may be triggered by a failure caused by weird conversion rules yes.

chriswarbo
0 replies
1d17h

> Your solution to hitting a bug in a tool

If only it were "a bug". Rather, there was a steady trickle of post-mortems for production issues which couldn't be explained away as a lack of testing; which secure coding practices would not have prevented; which took multiple experienced developers a while to figure out, and whose behaviour surprised all of them; and which were unavoidably baked into the design decisions of the underlying technology. This was also in a team that purposefully focused on building small, task-specific, loosely-coupled systems.

> is switching ecosystems and spending years building up your expertise?

Meh, it's easy enough to jump between PHP, Perl, Python, Ruby, Javascript, etc. since their differences are pretty minimal. Indeed, their language roadmaps seem to be "copy each other's homework" (generators, async, etc.; JS even added a `class` keyword FFS!). Even for someone who's only experienced with one particular approach taken by some specific framework, there's probably a work-alike clone to be found in all of those languages. Heck, the current PHP advocates keep going on about how different PHP7/8 are from PHP4; in which case, it may be easier to jump ship to another interpreted, untyped, imperative, procedural/OO scripting language (which may also be more battle-hardened and stable than that new version of PHP!) It's certainly not like switching from, say, Mercury to Zig!

Anecdatally: My first job was PHP programming, which I landed despite having never used it before; I was hired since I knew Python, I did fine. I got another PHP job a year later, whose interview included a written test. A few years later I learned that (a) its purpose was to check if candidates were lying about knowing PHP, by asking questions that exposed PHP's unique quirks and pitfalls; and (b) I'd apparently scored higher than anyone they'd interviewed before or since, despite only having used it for a year. In other words: some programming languages are so similar that switching between them may be easier than learning a new framework.

> What happens when you hit another bug in the new ecosystem?

My whole point is that this question is not answerable, precisely because "bug" is such a broad category. If the bug is a minor annoyance, we'd work around or avoid it; if it was more serious, but small, we'd maybe upstream a patch; if it was more difficult we'd maybe sponsor a core maintainer to work on it; etc. Presumably we wouldn't be hitting fundamental design problems, since our switching process would not land upon a notoriously flawed technology, cobbled-together by someone who "hates programming" and is "really, really bad at writing parsers"[1] (yet nevertheless proceeds to bake much of their language's semantics into the parser![2])

[1] https://en.wikiquote.org/wiki/Rasmus_Lerdorf

[2] For example, all the syntax errors and ambiguities I listed at https://stackoverflow.com/a/24463174/884682

paulddraper
5 replies
2d20h

Okay, let's say that you want a script that counts the number of lines in files versioned by git.

You'd write a Python3 script I assume? With subprocess?
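
For comparison, a shell sketch, null-delimited so odd filenames don't break it:

  # count lines in every git-tracked file; with very many files, xargs
  # will invoke wc more than once, giving several "total" lines
  git ls-files -z | xargs -0 wc -l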

jzwinck
4 replies
2d20h

Toy examples should not guide larger decisions. And even if that trivial script you describe is really what goes into production today, tomorrow someone will modify it and introduce a quoting bug or a poorly-done command line option facility or whatever.

paulddraper
2 replies
2d20h

Huh?

What makes this a plaything?

Did you respond to the right comment?

jzwinck
1 replies
2d20h

You asked about:

> a script that counts the number of lines in files versioned by git

I'm saying that is not a realistic production program that most of us would need.

If you want it as a personal utility to use in your own shell, absolutely you can use bash. I'm responding to the idea that such a trivial script would have long term use in production.

paulddraper
0 replies
2d15h

Ah I didn't know the original comment was about "production"

jenscow
0 replies
2d19h

To mitigate that, recently there was something posted on HN that checks your shell script for those types of bugs.

However, let's not produce any software at all, in case someone introduces a bug in it later.

dimitar
5 replies
2d20h

I agree, but some bash can be unavoidable. I've found that even trivial-looking bash can be helped by ShellCheck; this is a testament to the issues in bash more than anything.
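
For instance, this innocent-looking loop trips two of its best-known checks:

  for f in $(ls *.txt); do   # SC2045: iterating over ls output is fragile
    wc -l $f                 # SC2086: unquoted variable word-splits
  done

and ShellCheck nudges you toward the robust form:

  for f in *.txt; do
    wc -l "$f"
  done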

jzwinck
2 replies
2d20h

What bash is unavoidable? The aliases and functions you define in your personal shell, sure. But what else?

dimitar
0 replies
2d17h

Some examples of possible exceptions:

* Installing or starting your general-purpose runtime, passing the right configuration options

* Project glue, where introducing a general-purpose language makes things more complicated because of tooling or startup

* Simple cloud-init scripts

* A command-line tool is simply the best library for the job and you don't want to pull in a general-purpose language just to immediately shell out

auselen
0 replies
2d18h

Piping stuff, job control?

jjgreen
1 replies
2d5h

Easy to avoid: use POSIX sh. It is a truth universally acknowledged that 95% of bash scripts don't use any non-Bourne features (or can trivially be modified not to).
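
For example, the most common bashism has a direct POSIX equivalent (a sketch):

  x=foobar
  # bashism:
  if [[ "$x" == foo* ]]; then echo match; fi
  # POSIX sh:
  case $x in foo*) echo match ;; esac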

BaculumMeumEst
0 replies
2d1h

What on earth? POSIX sh is far more annoying to deal with than Bash. The only reason anyone puts up with it is to target the lowest common denominator. Horrible advice.

bluGill
2 replies
2d20h

Bash is useful for 100-line scripts that do little logic and mostly chain together various commands: set up the right CC variables and call make.
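
Something like this, say (a sketch; the toolchain prefix is hypothetical):

  #!/bin/sh
  # glue: pick the toolchain, then hand off to make
  CC=arm-linux-gnueabihf-gcc
  CFLAGS="-O2 -Wall"
  export CC CFLAGS
  exec make "$@"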

leosarev
1 replies
2d20h

I say ten. Ten lines maximum

jjgreen
0 replies
2d5h

You'd better specify a maximum line-length too :-|

tgv
1 replies
2d20h

Don't use bash, don't use C, don't use C++, don't use Python, don't use Javascript, don't use Ruby, ...

paulddraper
0 replies
2d20h

Don't use computers.

Only winning move

Alupis
1 replies
2d18h

One does not program in bash. It is a scripting language - there's a difference, even if subtle.

Just like any other tool, commit the time to learn it instead of just complaining it's hard.

Developers tend to think they can write amazing things with minimal effort and then curse the tool/lang when things turn out differently.

The world runs on C and bash scripts... and it's just fine.

lolc
0 replies
2d16h

> commit the time to learn it

I did. Turns out I don't like it. What I dislike the most is that it brings out the clever type who likes to show off their tricks. That includes me. I've written atrocious Bourne scripts, and I've watched other people defend line noise in code reviews.

> instead of just complaining it's hard

So what's good about it that I should tolerate the weird expansions and arcane scoping?

> The world runs on C and bash scripts...

Less and less, thankfully.

> and it's just fine.

It may have been fine in 1980.

sigwinch28
0 replies
2d7h

I strongly agree with this.

I find that I'm okay with shell scripts which invoke repetitive individual commands, run complex piped commands, or run a sequence of commands with little-to-no control flow.

However, then I find that people use shell scripts for more advanced programming "because it's there" and "it's ubiquitous". What follows is a lot of stringly-typed programming using more and more bash-specific features and advanced control flow. External commands and builtins gather exotic flags to handle whitespace (or preferably null bytes). Then people want complex data types so they resort to things like `jq` and JSON stored in variables. But hey, at least they didn't use Python or Ruby!

Some people argue that by using a programming language like Python or Ruby or Go instead of "just a shell script" that I then bring about dependency management hell because I might use non-stdlib libraries and I require a minimum version of the language. But the people writing shell scripts rely a lot on their ambient system environment that they take for granted. They will use bash instead of sh and assume a recent version of it (good luck macOS users!). They will assume that the coreutils are BSD or GNU and they will get it wrong. They will want advanced data structures and assume you have some random utility for manipulating JSON, CSV, or TSV installed. Now you have a dependency management problem just like you would in other programming languages.

So then people say in a slack channel "hey, please install this util on your system", or they share Brewfiles and dockerfiles and container images, or they write yet another script to check your system for the right dependencies. You end up with a library of scripts and functions which require certain env vars to be set so they can all find and `source` each other. People writing and committing clever and complicated shell scripts usually end up with the same problems they would have with a different programming language that has package managers. But it's weeks or months down the road and it's easy not to think about when committing a 100 line bash script to the repo or adding new functionality to a script or depending on "just one more" installed utility.

It devolves to the same problem of "I'm writing Python/Ruby/Haskell/C/C++/... and I'm relying on the operating system's package manager to provide the libraries I depend on", but without having a powerful general purpose programming language with an import system and featureful standard library. Some teams do that and shell scripts which rely heavily on the system environment might be a great fit for them.

The best use case I can think of for shell scripts are FreeBSD init scripts, but then I consider how tightly coupled those are to the operating system. So basically... part of the system.

I don't think I have the discipline for all of that, so my flowchart for writing shell scripts is: is it a one-liner? If yes, do it. If not, use the same programming language as the rest of the project(s) in your team.

rollcat
0 replies
2d17h

Came here to say this exactly.

My toy OS has the most spartan shell imaginable (no pipes, no redirection, no variables, no flow control, just run a list of commands). It's enough to mount filesystems, configure network interfaces, spawn some services, and do the reverse for shutdown.

Complex tasks are delegated to Lua scripts or actual binaries instead.

justapassenger
0 replies
2d19h

If you have to interact with console tools, nothing beats bash.

If you don’t have to, you should never use it.

Dries007
16 replies
2d21h

Shameless self-plug:

I very much live by the "fix all warnings before you commit" (or at least before you merge), so I have Shellcheck and a bunch of other linters set up in my pre-commit configurations. But the majority of the shell in most of my projects ends up embedded in .gitlab-ci.yml files, where it's hard to check. So I made a wrapper that does that automatically: https://pypi.org/project/glscpc/.

It uses the Shellcheck project and some magic to give you the Shellcheck remarks with (mostly) accurate line-numbers.

brirec
12 replies
2d21h

I’d love to see a project that would do this, but more generally.

I don’t use GitLab CI, but I do use a good handful of other file types that essentially inline shell scripts: Dockerfiles, GitHub Actions, and Justfiles, just to name a few.

Usually, and almost exclusively for the sake of ShellCheck, I make a point of putting anything more complex than a couple of commands into their own shell scripts that I call from the inlined script in my Dockerfile.

(This pattern also helps me keep my CI from being too locked into GitHub Actions)

vinnymac
3 replies
2d20h

I’d rather see CI services add a mode that would enforce all scripts must live in separate files, rather than inline.

It’s really not necessary to support inline except for single lines that are very short (under 30 chars).

plorkyeran
0 replies
2d19h

I am not generally a fan of Xcode Cloud's design, but this is one thing I think it gets very right. Rather than let you specify the build actions in the CI settings, it invokes scripts with fixed names in the `ci_scripts` directory of your repo if they exist. There just isn't a mechanism for creating jobs which aren't something that you can build locally and test independently of Xcode Cloud.

Piraty
0 replies
2d19h

so much this. it will help CI not become a holy, super-hard-to-debug, unreproducible mammoth. write scripts, call them from CI

Dries007
0 replies
2d19h

I would say: stick to straight command sequences with variables, avoid if/for/while/functions etc. Subshells and pipes only for trivial things.

Setting something like a 30 char limit will inevitably lead to code-golfing to fit a mess into the 30 char limit.

But that becomes harder to automatically check; that's why you should still have good peer reviews based on written standards somewhere.

marcosnils
2 replies
2d20h

Hi there, Dagger contributor here. We're solving this exact problem by allowing you to encode your CI/CD pipelines in your favorite programming language and run them the same way locally and/or on any CI provider (Gitlab, Github Actions, Jenkins, etc.).

We're very active in our Discord server if you have any questions! https://discord.gg/invite/dagger-io

Dries007
1 replies
2d19h

We've been experimenting with Dagger here, as part of the alternative to writing glscpc actually, but I'm not convinced it's ready to replace Gitlab CI.

Dagger has one very big downside IMO: It does not have native integration with Gitlab, so you end up having to use Docker-in-Docker and just running dagger as a job in your pipeline.

This causes a number of issues:

It clumps all your previously separated steps into a single step in the Gitlab pipeline. That doesn't matter too much for console output (although it does when your different steps should run on different runners), but is very annoying if you use Gitlab CI's built-in parsing of junit/coverage/... files, since you now have extra layers of context to dig through when tests fail etc. Plus not all of these allow for multiple input files, so now you have to add extra merging steps.

If your job already uses Docker-in-Docker for something, you have to be careful not to end up with Docker-in-Docker-in-Docker situations, or container name conflicts if you just pass through the DOCKER_HOST variable.

The one thing that would make this worth it is being able to run the pipelines locally to debug, but I've just written quick-and-dirty scripts to do that every time I've needed it. For example, running the test job in our pipeline on every Python version: https://gitlab.com/Qteal/oss/gitlabci-shellcheck-precommit/-...

marcosnils
0 replies
2d14h

> but I'm not convinced it's ready to replace Gitlab CI.

The purpose of Dagger is not to replace your entire CI (Gitlab in your case). As you can see from our website (https://dagger.io/engine), it works and integrates with all the current CI providers. Where Dagger really shines is in helping you and your teams move all the artisanal scripts encoded in YAML into actual code and run them in containers through a fluent SDK which can be written in your language of choice. This unlocks a lot of benefits which are detailed in our docs (https://docs.dagger.io/).

> Dagger has one very big downside IMO: It does not have native integration with Gitlab, so you end up having to use Docker-in-Docker and just running dagger as a job in your pipeline.

Dagger doesn't depend on Docker. We're just conveniently using Docker (and other container runtimes) as it's generally available pretty much everywhere by default as a way to bootstrap the Dagger Engine. You can read more about the Dagger architecture here: https://github.com/dagger/dagger/blob/main/core/docs/d7yxc-o...

As you can see from our docs (https://docs.dagger.io/759201/gitlab-google-cloud/#step-5-cr...), we're leveraging the *default* Gitlab CI `docker` service to bootstrap the engine. There's no `docker-in-docker` happening there.

> It clumps all your previously separated steps into a single step in the Gitlab pipeline.

It's not generally how we recommend starting; we should definitely improve our docs to reflect that. You can organize your dagger pipelines in multiple functions and call them in separate Gitlab jobs as you're currently doing. For example, you can do the following:

  # .gitlab-ci.yml
  build:
    script:
      # no funky untestable shell scripts here, but calls to *real* code
      - dagger run go run ci/main.go build
  test:
    script:
      # no funky untestable shell scripts here, but calls to *real* code
      - dagger run go run ci/main.go test
This way, if your pipeline currently has a `build` and `test` job, you still keep using the same structure.

> but is very annoying if you use Gitlab CI's built-in parsing of junit/coverage/... files, since you now have extra layers of context to dig through when tests fail etc

You can also still keep using these. The only thing you need to be aware of is to export the required test / coverage output files from your Dagger pipelines so Gitlab can use them to do what it needs.

> but I've just written quick-and-dirty scripts to do that every time I've needed it.

This is what we're trying to improve. Those quick-and-dirty scripts generally start very simple, but they become brittle and very difficult for other engineers to test. Yes, you could use docker or any container-like thing to enable portability, but you'll probably have to write more scripts to glue all that together.

Quoting one of our founders:

"Our mission is to help your teams to keep the CI configuration as light and "dumb" as possible, by moving as much logic as possible into portable scripts. This minimizes "push and pray", where any pipeline change requires committing, pushing, and waiting for the proprietary CI black box to give you a green or red light. Ideally those scripts use containers for maximum reproduceability. Our goal at Dagger is to help democratize this way of creating pipelines, and making it a standard, so that an actual software ecosystem can appear, where devops engineers can actually reuse each other's code, the same way application developers can."

iamjackg
1 replies
2d20h

Absolutely agree. The main downside of that pattern is that it doesn't work with jobs included from other projects in GitLab CI, since the job runs in the context of the project that imported it and therefore can't find the script in its original repo. Huge bummer.

vinnymac
0 replies
2d20h

Bundle up configuration and scripts, and now we have containerized CI infrastructure.

verdverm
0 replies
2d20h

You might like Dagger: building images with code, using the same buildkit engine under the hood.

No more linear Dockerfile, use the powers of your preferred language

stouset
0 replies
2d20h

This is the way

Dries007
0 replies
2d19h

Generalizing this is non-trivial (I tried initially) but I'm sure others can build in the same principles.

I think this comes close for Dockerfiles: https://hadolint.github.io/hadolint/ Just have to write a pre-commit hook for it.

iamjackg
1 replies
2d20h

Oh, that's really cool. I was trying to solve this from a different perspective a while ago: I wanted to add some pre-processing that would take a "normal" shell script and render it to the `script` part of the corresponding job at build time, the advantage being that you still have everything self-contained in the gitlab-ci job.

I stopped working on it because dealing with shell shenanigans in the GitLab CI runner environment is such a miserable experience that we're in the process of moving all our jobs to python scripts.

Dries007
0 replies
2d20h

Yea, Pythonifying the scripts is also generally my preference the moment they become somewhat complex. But even then it's nice that you can be reasonably sure you're not forgetting quotes around variables or using bash constructs where you only have sh.

l0b0
0 replies
2d15h

High five - I did something similar very recently[1]; usage[2]:

  - repo: https://gitlab.com/engmark/shellcheck-gitlab-ci-scripts-hook
    rev: 174fda8f384db229aca5b40380e713d0c25a1cb9 # frozen: v1
    hooks:
      - id: shellcheck-gitlab-ci-scripts
        files: ^\.gitlab-ci\.yml$
[1] https://gitlab.com/engmark/shellcheck-gitlab-ci-scripts-hook

[2] https://gitlab.com/engmark/root/-/blob/9f7d9b93c2297d0b170e5...

mr-wendel
9 replies
2d20h

Some tips of my own:

- It's almost always preferable to put `-u` (nounset) in your shebang so that using an undeclared variable is an error. The only exception I typically run across is expansion of arrays using "${arr[@]}" syntax -- if the array is empty, this is considered unbound.

- You can use `-n` (noexec) as a poor-man's attempt at a dry-run, as this will prevent execution of commands.

- Also handy is `-e` (errexit), but take care: essentially, only "naked" commands that fail cause an exit. Personally, I prefer to avoid it and append `|| fail "..."` to commands liberally (sketched below).
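
Put together, a minimal sketch of that style (`fail` is a hypothetical helper, not a builtin):

  #!/usr/bin/env bash
  set -u   # nounset: referencing undeclared variables is an error

  fail() { echo "error: $*" >&2; exit 1; }

  src=${1:?usage: copy.sh SRC DST}   # ${1:?...} also guards missing args
  dst=${2:?usage: copy.sh SRC DST}
  cp -- "$src" "$dst" || fail "could not copy $src to $dst"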

augusto-moura
3 replies
2d19h

The problem with "${arr[@]}" only exists on bash 3 and before; since bash 4, [@] will never throw unbound variables, even in cases where the variable is truly undefined. This is still a problem, however, because macOS, to this day, still installs bash v3 by default and doesn't update it automatically (absolute madness; the last release of bash 3 is from 20 years ago!).

In any case, you can work around expanding empty arrays throwing unbound by using the ${var+alter} expansion:

  echo "${arr+${arr[@]}}"

mmsc
1 replies
2d19h

> The problem with "${arr[@]}" only exists on bash 3 and before; since bash 4

4.4 fixed it:

  $ bash --version
  GNU bash, version 4.3.48(1)-release (x86_64-pc-linux-gnu)
  $ declare -A arr
  $ set -u
  $ "${arr[@]}"
  -bash: arr[@]: unbound variable

augusto-moura
0 replies
2d18h

Ah, that's true, I couldn't recall which version fixed it. I usually assume v4 because every other distro automatically updates to the latest v4 release, or to the latest version overall.

macOS is the only one out there missing out on this. The other big feature that was only added after bash 3 and is missing on macOS is associative arrays.

extraduder_ire
0 replies
2d5h

> macOS, to this day, still installs bash v3 by default and doesn't update it automatically (absolute madness; the last release of bash 3 is from 20 years ago!)

GPL3, buddy. You can use brew to install a newer version and somewhat hide the old one.

Calzifer
2 replies
2d19h

> to put `-u` (nounset) in your shebang

Any particular reason why in the shebang instead of set -u?

> The only exception I typically run across is expansion of arrays using "${arr[@]}" syntax

In Bash? Works for me. Edit: another comment mentions it as well; it seems to behave better in newer versions of Bash and to be problematic only in <= 4.3: https://news.ycombinator.com/item?id=38397241

  $ bash -uc 'unset x; echo "=> ${x[@]}"'
  =>
  $ bash -uc 'x=(); echo "=> ${x[@]}"'
  =>
  $ bash -uc 'x=(); echo "=> ${x[0]}"'
  bash: x[0]: unbound variable
Zsh does not like the first example but both should support:

  $ bash -uc 'unset x; echo "=> ${x[@]:-null}"'
  => null

> Also handy is `-e` (errexit),

It is unfortunately very confusing with functions. That made me like it less over the years.
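
The classic confusion, for illustration:

  set -e
  f() { false; echo "reached anyway"; }
  if f; then      # errexit is suspended while f runs as a condition, so
    echo "ok"     # `false` doesn't abort, "reached anyway" prints, and
  fi              # f's status is that of the final echo: success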

mr-wendel
1 replies
2d19h

Using the shebang just helps highlight the fact that the rule is in use globally, but otherwise has no advantage over using `set -u`.

The clarifications on `-u` and arrays are useful. I'm definitely used to assuming newer (... non-ancient?) versions of Bash are what is available.

Xophmeister
0 replies
2d18h

Using `set -u` is more portable. If your shebang is `/usr/bin/env bash`, which it probably should be, then you can't add additional command line arguments in Linux with older coreutils. macOS supports additional arguments, regardless, and in Linux, coreutils 8.30 added the `-S` option to `env` to get around this problem.

extraduder_ire
0 replies
2d4h

I find that I usually have to use traps when I use options that terminate the script early, mostly when I have some files to clean up.

"trap" is a great scripting feature that doesn't get talked about enough, IME.

anamexis
0 replies
2d17h

Regarding the `-e` issue, this is well handled by `-o pipefail`, which as of last year is part of POSIX.
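
For illustration:

  set -e
  false | true        # without pipefail, the pipeline's status is true's: 0
  set -o pipefail
  false | true        # now the status is 1, so set -e aborts here
  echo "never reached"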

ulrischa
7 replies
2d21h

Saves a lot of time for tons of legacy shell scripts

hapulala89
6 replies
2d21h

I have a colleague who writes a lot of shell scripts, and there is an ongoing discussion about whether shell scripts or scripting languages like Python are better.

sneed_chucker
1 replies
2d20h

Python's subprocess/shell-out story is just bad enough that I still find myself writing a lot of shell scripts if the task at hand warrants more than 2-3 subprocess calls.

Realistically, Perl or Ruby would fill this role fine, but I hate adding another language to a project just for that purpose.

pletnes
0 replies
2d20h

Agree, and also you have to write a few lines of shell to get your Python going (and the same for node or ruby or whatever).

pram
0 replies
2d20h

It’s perfectly fine for glue type stuff in a CI pipeline imo. There’s frankly no easier way to work with files.

ovex
0 replies
2d19h

From a security point of view, Python is better because it is less of a footgun. So if you expose an interface to untrusted users, you should use Python because its behavior is more intuitive. An arithmetic expansion or missing quotes do not easily become a vulnerability in Python.

agumonkey
0 replies
2d21h

I remember trying to rewrite a fragile scraping bash script in Python, and even with some effort to use nice libs and create some cool helpers, it ended up as long and not much more solid.

Bash is infectious, in the bad sense :D

Schnitz
0 replies
2d19h

We ported all shell scripts to Python at a company I’ve worked for, as scripts just kept getting longer and more complex. As a language Python is great: super fast and easy to code in, with very little inherent complexity. The reason I wouldn’t choose Python again is distribution. It’s easy enough in Docker, where you can just bite the bullet and vendor the same Python in all containers. Mac was a pain though: pyenv etc. all had their own issues and collisions with homebrew, and dependency management with pip is as big a hassle as npm. A real bummer given how well Python works as a language for scripting.

ovex
6 replies
2d20h

Recently, I found a privilege escalation vulnerability in a shell script as a result of arithmetic expansion (similar to the one described at https://research.nccgroup.com/2020/05/12/shell-arithmetic-ex...). For example, $((1 + ENV_VAR)) allows you to inject code if you can control $ENV_VAR.

Unfortunately, shellcheck did not catch that. At least not with the default settings. But if you are implementing anything remotely security-critical, you should not be using shell anyway.
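
A sketch of the injection class from the linked write-up (names hypothetical):

  # If an attacker controls ENV_VAR, an array subscript embedded in its
  # value is expanded during arithmetic evaluation, so the command
  # substitution inside it executes:
  export ENV_VAR='a[$(touch /tmp/pwned)0]'
  echo "$((1 + ENV_VAR))"   # creates /tmp/pwned, then prints 1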

reidjs
5 replies
2d16h

What should we use for more security?

l0b0
4 replies
2d15h

Basically anything where it's difficult to treat variable values as code. Python, Ruby, Java, and even PHP are much better at this.

zimmund
1 replies
1d13h

I've seen `eval()` in production code of several applications. The biggest vulnerability is more often than not the programmer :)

ovex
0 replies
21h13m

But `eval()` does not violate a programmer's intuition as easily as an arithmetic expression resulting in code execution.

stevekemp
0 replies
1d23h

Perl is probably worthy of a mention there: with its taint mode you're explicitly forced to test externally-influenced variables before using them.

LelouBil
0 replies
2d4h

I have flashbacks of when my PHP teacher showed us how to turn query parameters into their own variables by using PHP's dynamic variables feature.

He waited a bit, and promptly said to never do that and started to explain the security risks.

seb1204
4 replies
2d21h

The page is thanking Mercedes-Benz? That was unexpected.

popcalc
3 replies
2d21h

https://github.com/orgs/mercedes-benz/sponsoring

They're sponsoring quite a few devs. Caddy, curl, and SeaweedFS notably.

belval
1 replies
2d20h

That gives me new-found respect for Mercedes-Benz.

oriettaxx
0 replies
2d9h

totally agree (well, it was about time)

oars
0 replies
2d4h

Great to know Mercedes Benz is willing to do this! Even supporting OpenSSL, one of the cornerstones of modern web security.

pseudo_meta
3 replies
2d9h

zsh is not officially supported, but you can force shellcheck to check zsh regardless via `--shell=bash`.

This makes shellcheck treat the file as a bash script, so there are some false positives for zsh-specific syntax, but 99% of the rules work pretty much the same.
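
Either per invocation or as a file-level directive (the script name is hypothetical):

  shellcheck --shell=bash deploy.zsh

  # or, at the top of the script itself:
  # shellcheck shell=bash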

arp242
2 replies
2d5h

Certainly for my own scripts it's a lot less than "99%". It won't even run on most of them, as it will just exit on some fatal error:

- Special variables like $argv and $status

- Most array things (e.g. ${(o)arr} for an ordered array)

- (( .. )) conditionals

- "short loops" without "do"

- repeat loops

You could rewrite these things, but what errors remain where shellcheck is helpful in zsh?

pseudo_meta
1 replies
2d5h

Well, for me it's 99%. But true, I don't use short loops, and I also don't use arrays a lot (when I feel the need to use arrays, I switch to a different language anyway).

"What errors remain" is hard for me to say – I get a wide variety of helpful errors, regularly.

arp242
0 replies
2d4h

Even in simple scripts I found arrays to be very helpful once I let go of the "POSIX sh mindset" and got used to them. The way word-splitting and $IFS work is basically an "implicit array", as are $@, $1, etc. (actually, in zsh $@ is an array: $1, $@[1], $*[1], and $argv[1] are all the same), so everyone already uses arrays.

It replaces a lot of uses of cut and "awk '{print $2}'", for starters.
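
In bash terms, the same idea looks something like:

  line="alpha beta gamma"
  read -r -a fields <<< "$line"   # word-splitting as an explicit array
  echo "${fields[1]}"             # "beta" -- replaces awk '{print $2}'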

goombacloud
3 replies
2d20h

To spot more common problems I recommend:

  alias shellcheck='shellcheck -o all -e SC2292 -e SC2250'

throw0101a
2 replies
2d20h

SC2292: Prefer [[ ]] over [ ] for tests in Bash/Ksh.

* https://www.shellcheck.net/wiki/SC2292

SC2250: Prefer putting braces around variable references (${my_var}) even when not strictly required.

* https://www.shellcheck.net/wiki/SC2250
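
With both enabled, the preferred form looks like:

  my_var="hello"
  if [[ -n "${my_var}" ]]; then   # [[ ]] per SC2292, braces per SC2250
    echo "${my_var}"
  fi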

mgdlbp
1 replies
2d16h

Opportunistic interjection: unnecessary ${} is the most bothersome style choice in any language I know of:

- It obscures actual uses of modifiers, particularly ${foo-} when set -u is in effect,

- It's obvious when a name runs into subsequent text, even if one has somehow avoided syntax highlighting,

- And expansions followed by identifier chars don't actually occur in practice. Cases where the quotes cannot be moved to surround the variable are often interpolation of an argument to echo, whose behaviour is such a mess (not even portable between bash and dash) that shellcheck ought to demand printf at all times instead (see the sketch below)!
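
For reference, the portable form:

  var='a\nb'
  echo "$var"            # bash prints a\nb; dash expands the \n
  printf '%s\n' "$var"   # prints a\nb everywhere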

arp242
0 replies
2d5h

Related pet peeve: always writing variables as $UPPER_CASE in shell scripts.

Useful: $UPPER_CASE for exported variables ("super globals"), $lower_case for anything else. Can also use $lower_case for function locals and $UPPER_CASE for exported and script global variables (stylistic preference; both are reasonable).

Not useful or reasonable: $ALWAYS_UPPER_CASE_NO_MATTER_WHAT.

I suppose people started doing it because they saw $EXPORTED_VARIABLE and thought "oh, I need to always upper-case it", not realizing what that meant. And then more copy-paste of this "style" after that.

jamespwilliams
2 replies
2d19h

Shellcheck is a godsend

I wrote a little wrapper a while back, https://github.com/jamespwilliams/strictbash, that you can use as a shebang for scripts. It runs shellcheck for you before the script executes, so it’s not possible to run the script at all if there are failures. It also sets all the bash “strict mode” [0] flags.

[0] http://redsymbol.net/articles/unofficial-bash-strict-mode/
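
For reference, the strict-mode flags from [0] amount to:

  #!/usr/bin/env bash
  set -euo pipefail   # errexit, nounset, pipefail
  IFS=$'\n\t'         # word-split only on newlines and tabs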

lakpan
1 replies
2d15h

That’s nice, but I suppose it only works if you run your own scripts, otherwise you’d be debugging and fixing every single script you have to run.

malf
0 replies
2d1h

Some people may prefer that to “oh no, Steam did rm -rf $typo/“

hyllos
2 replies
2d20h

Some time ago I turned some bash scripts, a build-and-deploy script for a single production server, into Haskell using Turtle [1]. What I enjoyed was the ability to reduce redundancies significantly; the code was significantly shorter afterwards.

[1] https://hackage.haskell.org/package/turtle

mrkeen
1 replies
2d19h

I recently tried Turtle but ended up throwing it out in favour of typed-process.

Afaik a Turtle program has a single current directory, which makes it hard when you want to run concurrent jobs that need to be executed from particular directories. I partially solved the problem by using locks/queues/workers. But it got too much for me when Turtle started failing due to its current directory being deleted.

In contrast, typed-process lets you spawn separate processes, and execute within a working dir (rather than needing to cd there), so it works great for big, complicated workflows.

And it also has good support for OverloadedStrings, which means you can generally copy & paste what you would have typed into bash, and it just works.

I also use the interpolate package (with QuasiQuotes) to make the raw strings nicer in the source code, but it's not compatible with hlint, so I'm thinking of looking for a different package for string-handling.

themk
0 replies
2d18h

Author of Shh [0] here. I've replaced a lot of Bash with Haskell (it's great!). I wrote Shh to help me out with that.

I ended up liking PyF [1] for quasiquoting. A friend of mine worked with the author a while ago to strip down its dependencies, so it's easy to justify. It's what I recommend in my cargo-culting guide [2].

[0] https://hackage.haskell.org/package/shh

[1] https://hackage.haskell.org/package/PyF

[2] https://github.com/luke-clifton/shh/blob/master/docs/porting...

wittekm
1 replies
2d21h

Shellcheck is great, but dealing with source/imports is suuuch a pain. Not their fault sh is a nightmare.

johnchristopher
0 replies
2d20h

Well, it's possible to do this:

    # shellcheck source=./deployment/deployment-example.env
    . "${1}"
But I see how it's a pain point when you have multiple subshell scripts and files to source.

throw0101a
1 replies
2d20h

Lots of mentions of this:

* https://news.ycombinator.com/from?site=shellcheck.net

with the last large-scale discussion (301 points; 54 comments) being in 2021:

* https://news.ycombinator.com/item?id=27030504

pvg
0 replies
2d18h

It's actually a 'follow-up dupe' of https://news.ycombinator.com/item?id=38387464 where it comes up repeatedly.

frizlab
1 replies
2d21h

But not zsh scripts, sadly

cglong
0 replies
2d20h

zsh was originally supported, but unceremoniously removed: https://github.com/koalaman/shellcheck/issues/298

I've had great experiences with this tool, but, for some reason, this issue always makes me question taking too great a dependency on it.

w10-1
0 replies
2d18h

Shellcheck is great, but requires some investment to tailor to your style

Disable default checks or enable optional ones using directives: https://www.shellcheck.net/wiki/Directive

The error checks can be pretty arcane: https://github.com/koalaman/shellcheck/wiki/Checks

I appreciate that the text for each check is brief and usually includes a suggestion. I end up disabling 26xx's a lot (for unquoted variables to be interpreted as multiple values).
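
A per-instance directive goes on the line above the command; e.g. disabling the unquoted-variable check (SC2086) where the splitting is deliberate:

  flags="-l -a"
  # shellcheck disable=SC2086
  ls $flags   # splitting $flags into two options is intentional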

Python is probably the best alternative to bash, but Swift is getting surprisingly good.

With shwift[1] you get NIO/async APIs, operator overloading for shell-like locutions, and trivial access to existing executables:

    /// Piping between two executables
    try await echo("Foo", "Bar") | sed("s/Bar/Baz/")

    /// Piping to a builtin
    try await echo("Foo", "Bar") 
       | map { $0.replacingOccurrences(of: "Bar", with: "Baz") }
Scripts can easily be configured with libraries and run pre-compiled by using clutch[2].

For cross-platform use, be sure to use only libraries available on all platforms (i.e., not Foundation). It's a pain, but at least the error typically shows up at compile time instead of run time.

[1] - [shwift](https://github.com/GeorgeLyon/Shwift)

[2] - [clutch - any Swift scripts in a common nest](https://github.com/swift-nest/clutch)

rascul
0 replies
2d21h

There is also a bash language server.

https://github.com/bash-lsp/bash-language-server/

oriettaxx
0 replies
2d9h

Very good, the Mercedes-Benz sponsorship :)

mmsc
0 replies
2d20h

Shellcheck is great. Unfortunately, its checks pale in the face of per-version bashism idiosyncrasies.

For example:

  set -u
  ignored_users=()
  for i in "${ignored_users[@]}"; do
    echo "$i"
  done
passes shellcheck's checks, however bash <= 4.3 will crash with "bash: ignored_users[@]: unbound variable". Therefore, set -u isn't available to use in this (valid) use-case.

Shellcheck also doesn't catch the expansion of variables as key names in testing assoc arrays:

  declare -A my_array
  un='$anything'

  [[ -v my_array["$un"] ]] && return 1
will fail as "my_array: bad array subscript" because "$un" gets expanded to "$anything", which, on a second pass, gets expanded to "", making the check [[ -v my_array[] ]]. Even worse, a value of

  un='$(huh)'
actually gets executed:

  [[ -v my_array["$un"] ]] && return 1
  -bash: huh: command not found
Here's another one: in versions older than 4.3 (maybe?) these -v checks don't even work:

  $ declare -A my_array
  $ my_array["key"]=1
  $ [[ -v 'my_array["key"]' ]] && echo exists
  $ [[ -v my_array["key"] ]] && echo exists
  $ [[ -v $my_array["key"] ]] && echo exists
  $ [[ -v "$my_array["key"]" ]] && echo exists
  $ bash --version
  GNU bash, version 4.2.46(1)-release (x86_64-redhat-linux-gnu)
I've recently been documenting some of this on my website: https://joshua.hu/more-fun-with-bash-ssh-and-ssh-keygen-vers...

eddtries
0 replies
2d21h

I also recommend https://github.com/bach-sh/bach when you have to use Bash for things long enough that it probably shouldn't be!

Diti
0 replies
2d16h

But it still doesn’t find bugs in my Zsh scripts. :<

DamonHD
0 replies
2d21h

Nice: I have learnt some things from this on the very first production /bin/sh script that I pointed it at, and I've been hacking such scripts since the 80s!

1vuio0pswjnm7
0 replies
2d18h

What about finding bugs in other people's shell scripts?