>Some Unix victims turn this filename-as-switch bug into a “feature” by keeping a file named “-i” in their directories. Type “rm *” and the shell will expand this to “rm -i filenamelist” which will, presumably, ask for confirmation before deleting each file.
I think Unix Wildcards Gone Wild (2014)[0] demonstrates and explains this rather well (posted 6 and 4 years ago[1][2] - while HN supports reposting, it's probably been posted enough)
I once saw a friend nearly delete all their files using tip#3 from the thread:
> never use only * as a wildcard. A ./* would save you in most cases
My friend typoed that as /* in an rm -rf command but caught it just in time. The extra ./ seemed like unnecessary characters to me if it does the same thing as *, and I don't think my friend uses it anymore, but now I'm not sure which is worse...
I suppose good backups just go a long way whatever you do.
This has nothing to do with the OP, but I always rehearse my deletions with "ls -d", and after seeing the output hit the up arrow in my command history and replace it with the rm command I wanted to attempt. I also never use -r without intention - a lot of people use it habitually even when not deleting a directory. Lastly, I never -f, I just chmod first.
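For example, with a hypothetical path just to illustrate the habit:
ls -d ./logs/*.log.old     # rehearse: see exactly which names the pattern matches
rm ./logs/*.log.old        # then recall the line and swap ls -d for rm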
This is why I never use rm when I'm tired or working on important system anymore. I'll just mv the files somewhere for review later. Worst case is I just moved a wrong file and can simply restore it back.
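Roughly this, with made-up names:
mkdir -p ~/review/$(date +%F)            # a holding area instead of deleting outright
mv ./old-build-output ~/review/$(date +%F)/
# look it over later; worst case, mv it back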
Funnily enough, rsync is one of the commands for which you might be more tempted to use * instead of going up a directory, because the relative path names go over the wire.
There's a lot of operations from the command line that are too easy to screw up. As much as I love having the power available and I'd certainly not give it up, I can't deny that GUIs (or even just TUIs) are much safer since you can visually validate the selection you've made and there's nothing to make a typo on. They also tend to ask for confirmation on destructive actions or provide an undo mechanism.
Maybe we need transactional file systems. At work, on our databases, when I'm running an ad-hoc update or delete on something that's not trivial to recover then in my DB tool I will often have something like:
BEGIN TRANSACTION
UPDATE ...
SELECT ...
-- COMMIT TRANSACTION
If the select validates that the update was good (and the update doesn't say "1000000 rows updated"), I'll just highlight COMMIT TRANSACTION and hit run. This isn't a perfect solution since I either need to hold the transaction open or run the update twice (first time immediately followed by a rollback), but a blip in accessibility is better than having to restore the latest nightly backup and run any processes that updated the table since it was taken.
If you're using postgres (and maybe others?) it can be handy to use the "returning" keyword -- e.g. "update foo set x = y where z returning *" -- to get an immediate look at the updated rows. Maybe not so good though if touching more than a few dozen rows intentionally.
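A sketch of what that looks like in Postgres (table and column names invented):
BEGIN;
UPDATE orders SET status = 'cancelled' WHERE id = 42 RETURNING *;
-- eyeball the returned row(s), then either:
COMMIT;
-- or ROLLBACK;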
Microsoft Windows actually provided a transactional file system called Transactional NTFS (TxF) in Windows Vista. However, due to its complexity and lack of widespread use, it has since been deprecated.
I mixed up dd's "of" and "if" args once when trying to image a disk. And of course, it failed somewhere in the middle after borking something important.
> At one point, Linus had implemented device files in /dev, and wanted to dial up the university computer and debug his terminal emulation code again. So he starts his terminal emulator program and tells it to use /dev/hda. That should have been /dev/ttyS1. Oops. Now his master boot record started with "ATDT" and the university modem pool phone number. I think he implemented permission checking the following day.
Actually that command seems accurate, because the instructions do the reverse of what you googled - they tell people "how to make an image from a disk device", but in this case the image file just happens to have the ".iso" extension.
But I agree about "copy paste instructions". I remember an intern at work asking me how to switch between virtual terminals on Linux. I told her Ctrl-Alt-Fx and wanted to explain what virtual terminals are and how the shortcuts are probably configurable, but at that point she had already stopped listening...
dd is one of the most dangerous ones I was thinking of. I hadn't even thought of swapping if and of though, just mistyping the target drive. I would feel so much safer choosing from a list of the drives shown with good identifying info like their manufacturer, size, serial number, and known partitions.
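Short of a proper picker, something like this at least puts that info in front of you before the dd (image name and sdX are placeholders):
lsblk -o NAME,SIZE,MODEL,SERIAL          # identify the target drive first
sudo dd if=disk.img of=/dev/sdX bs=4M status=progress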
hah, i had a tool where i configured ~/something as a target directory in its config. except that the tool did not understand ~ and literally created a directory named '~'
guess what i typed to remove it?
i lost a lot of work that day.
since then i developed the habit of only ever using rmdir for directories, and going into the directory first to remove its files with rm.
i also avoid rm * by finding some common parts like an extension that most files share.
i don't want rm * to be in my history where i might accidentally call it.
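For the record, quoting (or a ./ prefix) stops the shell from expanding the tilde, so the literal directory can be removed safely:
rmdir '~'      # quoted: the shell passes the name through literally
rmdir ./~      # tilde expansion only happens at the start of a word, so this stays literal too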
Blender tried this on me a few months ago. I used mv to rename the directory before deleting it though, the fear caught me as soon as I realized I had a directory named ~ sitting in my home directory.
mv is a clever idea. rename to something safe, which can be undone if you get it wrong.
i use a similar approach when deleting a lot but not all files from a directory. i first move the files into a new directory, double check that every file is in the right place and then delete the new directory safely.
I accidentally did the same on my Mac via a shell script where I assumed mktemp worked the same as GNU mktemp; it does not. IIRC my /Applications directory was the first to be hit; most of it was unaffected due to file permissions, but it did delete some files despite me catching the error and sending SIGKILL within seconds.
While it takes significant extra effort, it’s more robust to carefully decompose patterns into exact lists of files that can be reviewed (or use "find"), and this has saved me a couple of times.
For example, expand wildcards into a separate list of files that can become controlled commands ("rm -f A/file1", "rm -f A/file2", "rmdir A", ... instead of "rm -Rf ..."). This way, if directory "A" contains anything you didn’t expect, the "rmdir" fails at the end; and, someone can review the list of proposed files to be deleted before you run. Oh, and instead of having that sinking feeling of "rm" taking “too long” to run, your command is merely in the process of constructing a list of 1000 unexpected files instead of blowing away half your disk with shocking efficiency.
Also, file lists are pretty useful when you need to make minor edits (e.g. sometimes it’s a lot easier to find and exclude a few files from a list that you don’t want to touch, as opposed to describing those in a wildcard or search).
Depending on the task (and assuming no filesystem caching) it can be faster to gather up a list once and pass the composed list in to a whole series of commands. This is also good if you technically want the list to remain frozen even if the filesystem is changing underneath you, e.g. new files being added somewhere in the tree.
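A rough sketch of the workflow described above, using find and the directory A from the example (assumes filenames without embedded newlines):
find A -type f | sort > to-delete.txt
# review to-delete.txt by hand, then:
while IFS= read -r f; do rm -f -- "$f"; done < to-delete.txt
rmdir A      # fails if anything unexpected is still left in A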
An `rm -rf` always has the potential for disaster, especially if you are calling it from bash history via `Ctrl + r` and hit return too fast without editing the argument after `-rf`. Actually, the rm command should move files into a `~/.Trash` folder instead of immediately removing them.
> Actually the rm command should move files into a `~/.Trash` folder instead of immediately removing them.
I remember thinking that way when I started using the command line, but I now respectfully disagree. The `mv` command already does that, and the way `rm` works is actually necessary sometimes (eg. I couldn't live without it now that I'm deleting huge files all day long).
Anecdote: when Windows was my daily driver, emptying the trash had become a mechanical task just after deleting a bunch of files anyway, so it didn't provide more safety than these useless “are you sure you want to X” dialogs you click without thinking about the question. I have lost a pair of files just because of that. When I have to use Windows these days, I mostly use shift+del to delete (ie. bypass the trash).
The nicest answers to data loss I've found so far are filesystem snapshots and proper backups. They not only protect from the occasional `rm` mistype, but from the mistyped shell redirections and bad programs as well.
PSA: There exists a tool called trash-cli that can be installed on most linuxes.
Then you can `trash -r ./*` to send files to the trash.
It won’t work for all situations but it’s fairly useful.
At least ctrl-r shows you what command you're about to execute. I often see advice about character sequences in bash that expand to the last command or arguments in the last command, but I have never felt comfortable using them because I can't double-check the command.
I have absolutely no clue what I've done to make it work this way (if anything), but in zsh, when I run a command with !$ (the last word of the previous command), I get the opportunity to edit the expanded version first.
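If anyone wants that double-check behaviour deliberately, both shells have an option for it (bash and zsh respectively):
shopt -s histverify      # bash: history expansions are loaded into the edit buffer instead of running
setopt HIST_VERIFY       # zsh: same idea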
“Because the share contains a file named exactly "--delete", and since it gets sorted first, rsync thinks it to be an argument and deletes everything.”
Also because Unix made the mistake (edit: calling this a mistake may be unfair) of having the shell expand wildcards. If it had provided a library for doing that, the “rsync thinks it to be an argument” part wouldn’t happen.
Alternatively, a file system could sort hyphens last when reading directory entries, but I’m not sure that would be a good idea. It would make problems rarer, but that also might mean fewer users would know when and how to avoid them. It certainly wouldn’t help with other file systems.
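For what it's worth, the usual defenses against option-looking filenames, sketched with rm and the "-i" file from the quote at the top (whether a given tool honors -- varies, but the ./ prefix always works):
rm -- -i     # "--" marks the end of options for most GNU-style tools
rm ./-i      # a path starting with ./ can never be parsed as an option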
Right, DOS expects wildcards to be handled by programs. Besides avoiding problems from the shell expanding wildcards in problematic ways, this also enables some neat semantics like some tools accepting wildcards in multiple locations either for recursive path mapping or for a purpose similar to regex capture groups. It also substantially reduces (but does not eliminate) the need to be careful when including literal wildcard characters in commands.
The problem is of course that wildcard handling ends up being inconsistent between tools or entirely missing, although in practice this became less common as various Win32 calls related to file operations had internal wildcard handling, so applications got a basic form "for free".
As with many things in computing it's hard to say that either approach is superior to the other. PowerShell carries on the DOS tradition of not expanding wildcards in the shell, but provides a more complete and standardized wildcard implementation as part of the API to improve consistency.
In DOS/Windows the shell does not expand wildcards, as the sibling comment says (which puts some burden on the programmer - I had to use a Windows DLL in a perl script once to replicate Windows-style argument handling), but as most people know they also have paths with \ and options with /, and / is not a valid filename character.
This use of / for options goes way back to VMS apparently[0] and changing the path separator to avoid ambiguity does seem sensible. Of course, Windows has its own quirks like CON.
It's a shame that 'glob is a generator' did not become the standard way of doing things. It would save so much headache (and make writing programs easier) if programs could consume matches one at a time instead of receiving the whole expansion as one huge argument list.
This is possible if you use "xargs" (which runs a command multiple times with different subsets of the arguments), although it is most useful if commands are aware of null-separation (e.g. old-style "find ... -print0 | xargs -0", before "find" added the "+" option).
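For example, with an invented pattern:
find . -name '*.tmp' -print0 | xargs -0 rm -f --
# or review first and let find do the deletion itself:
find . -name '*.tmp' -print
find . -name '*.tmp' -exec rm -f -- {} +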
Yes, 'find -print0' / 'xargs -0' is the workaround that guided my thinking. (Which was essentially "how can I make 'find -print0 |' easy enough to use all the time, and flexible enough for more than one glob at a time?")
Because rsync semantics around the trailing slash are a little confusing and unusual, it's pretty common IME for people to intentionally explicitly define paths to the file with rsync, in order to avoid having to look up which way they need to write the target path.
Use a dot after the trailing slash and there is less confusion. "Blabla/." means "(everything) in the Blabla directory". Fairly self-explanatory/un-confusing and no need for "*"
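As far as I remember the three forms behave like this (made-up paths; the /. form is the one suggested above):
rsync -a src dest/       # creates dest/src/...
rsync -a src/ dest/      # copies the contents of src into dest/
rsync -a src/. dest/     # same effect as the trailing slash, but harder to misread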
Another example of why shell languages are insane interfaces. Only someone who works on writing shell scripts all day could keep in their head all the things like this that could go wrong and how to defend against them.
Sadly this is one of the sanest answers. ./* or always using -- are coping mechanisms. We've internalized the UNIX warts so hard that we no longer perceive them.
I've moved to writing scripts in python instead. It's horribly verbose, but at least it's predictable and doesn't require the use of noisy disclaimers on every call to defend against rare cases.
There's definitely a need for a terser sane scripting language, but I haven't found one yet.
http://web.mit.edu/~simsong/www/ugh.pdf page 28 (68)