How to delete all your files (reddit.com)
121 points by maple3142 on Aug 23, 2020 | 64 comments


Unix Haters Handbook is still surprisingly up to date.

http://web.mit.edu/~simsong/www/ugh.pdf page 28 (68)

>Some Unix victims turn this filename-as-switch bug into a “feature” by keeping a file named “-i” in their directories. Type “rm *” and the shell will expand this to “rm -i filenamelist” which will, presumably, ask for confirmation before deleting each file.
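
For illustration, here's roughly how that plays out (a sketch; the confirmation prompt text varies between rm implementations):

  $ touch -- -i precious.txt scratch.txt   # "--" so touch doesn't treat -i as an option
  $ rm *                                   # the shell expands this to: rm -i precious.txt scratch.txt
  rm: remove regular file 'precious.txt'?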


Oh God that's like something I'd find on Cthulhu's computer


I think Unix Wildcards Gone Wild (2014)[0] demonstrates and explains this rather well (posted 6 and 4 years ago[1][2] - while HN supports reposting, it's probably been posted enough)

[0]: https://www.defensecode.com/public/DefenseCode_Unix_WildCard... [1]: https://news.ycombinator.com/item?id=8189968 [2]: https://news.ycombinator.com/item?id=17376895


Also posted 3 days ago [0].

[0] https://news.ycombinator.com/item?id=24220503


I once saw a friend nearly delete all their files using tip#3 from the thread:

> never use only * as a wildcard. A ./* would save you in most cases

My friend typoed that ./* as /* in an rm -rf command but caught it just in time. The extra ./ seemed like unnecessary characters to me, since it matches the same files as *, and I don't think my friend uses it anymore, but now I'm not sure which habit is worse...

I suppose good backups just go a long way whatever you do.


I just did that last weekend!

After an all-nighter, which isn't easy for me anymore, I typed:

``rm -rf. /*``

with the dot and the space reversed.

When the shell threw an error in my face, I thought, "oh, an extra dot," so I deleted the dot and re-ran the command.

And there went my configs and most of the dotfiles in my home dir. Luckily, I had backups for some of those, so it wasn't a complete disaster.

I don't trust myself with ``rm`` on the command line anymore.


This has nothing to do with the OP, but I always rehearse my deletions with "ls -d", and after seeing the output hit the up arrow in my command history and replace it with the rm command I wanted to attempt. I also never use -r without intention - a lot of people use it habitually even when not deleting a directory. Lastly, I never -f, I just chmod first.
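
Roughly, the rehearsal looks like this (a sketch; the paths are made up):

  $ ls -d ./build/*.o      # dry run: see exactly what the glob matches
  ./build/bar.o  ./build/foo.o
  $ rm ./build/*.o         # up-arrow, replace "ls -d" with "rm"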


I tend to use the command-line utility trash-cli instead for these reasons. It's so easy to screw up with a wildcard.


It doesn't hurt to do some variation of pointing and calling when doing destructive commands like rm https://en.wikipedia.org/wiki/Pointing_and_calling


This is why I no longer use rm when I'm tired or working on an important system. I'll just mv the files somewhere for review later. Worst case, I've moved the wrong file and can simply move it back.


You almost never need to use "*". Use ".", or go up a dir and type the name of the dir.


Yes thanks. I thought running rsync with a "*" looked odd, but didn't completely understand why until now.


Funnily enough, rsync is one of the commands for which you might be more tempted to use * instead of going up a directory, because the relative path names go over the wire.


besides ./* i also like to use *.ext or even a* b* c* depending on the contents of the directory.

rsync also uses the local/only/path/./local/and/remote/path convention, where the path before the /./ is not sent to the remote side
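
that's the --relative (-R) marker; something like this (a sketch, host and paths made up):

  # with -R/--relative, only the part after /./ is recreated on the destination
  $ rsync -avR /home/me/project/./src/main.c backup:/mnt/copy/
  # result on the remote: /mnt/copy/src/main.c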


There's a lot of operations from the command line that are too easy to screw up. As much as I love having the power available and I'd certainly not give it up, I can't deny that GUIs (or even just TUIs) are much safer since you can visually validate the selection you've made and there's nothing to make a typo on. They also tend to ask for confirmation on destructive actions or provide an undo mechanism.

Maybe we need transactional file systems. At work, on our databases, when I'm running an ad-hoc update or delete on something that's not trivial to recover then in my DB tool I will often have something like:

  BEGIN TRANSACTION
  UPDATE ...
  SELECT ...
  -- COMMIT TRANSACTION
If the select validates that the update was good (and the update doesn't say "1000000 rows updated"), I'll just highlight COMMIT TRANSACTION and hit run. This isn't a perfect solution since I either need to hold the transaction open or run the update twice (first time immediately followed by a rollback), but a blip in accessibility is better than having to restore the latest nightly backup and run any processes that updated the table since it was taken.


If you're using postgres (and maybe others?) it can be handy to use the "returning" keyword -- e.g. "update foo set x = y where z returning *" -- to get an immediate look at the updated rows. Maybe not so good though if touching more than a few dozen rows intentionally.
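
Combined with the transaction pattern above, that could look like this (a sketch with made-up table and column names):

  BEGIN;
  UPDATE orders SET status = 'cancelled' WHERE id = 42 RETURNING *;
  -- inspect the returned rows, then either
  COMMIT;   -- or ROLLBACK;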


Microsoft Windows actually provided a transactional file system called Transactional NTFS (TxF) in Windows Vista. However, due to its complexity and lack of widespread use, it has since been deprecated.


Just get a snapshotting filesystem.


I mixed up dd's "of" and "if" args once when trying to image a disk. And of course, it failed somewhere in the middle after borking something important.


> At one point, Linus had implemented device files in /dev, and wanted to dial up the university computer and debug his terminal emulation code again. So he starts his terminal emulator program and tells it to use /dev/hda. That should have been /dev/ttyS1. Oops. Now his master boot record started with "ATDT" and the university modem pool phone number. I think he implemented permission checking the following day.

From https://liw.fi/linux-anecdotes/

But if and of are "easy", since they stand for "input file" and "output file"?


You'd think, but a lot of people blindly follow tutorials that Google gives them.

As an example, here's my #2 result for, "Linux command line create bootable usb drive from iso":

https://www.tecmint.com/create-an-iso-from-a-bootable-usb-in...

It instructs you to run this command, despite explaining what "if" and "of" mean in the following paragraph:

>sudo dd if=/dev/sdb1 of=/home/tecmint/Documents/Linux_Mint_19_XFCE.iso

Proofreading is harder and less profitable than SEO.
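
For the task that was actually searched for (writing an ISO onto the stick), the direction would be the other way around; roughly, assuming the stick really is /dev/sdb and GNU dd:

  $ lsblk                                   # double-check which device is the stick
  $ sudo dd if=Linux_Mint_19_XFCE.iso of=/dev/sdb bs=4M status=progress conv=fsync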


Actually that command seems accurate, because the instructions are for the reverse of what you googled - the article is telling people how to make an image from a disk device; it's just that in this case the image file has the ".iso" extension.

But I agree about "copy paste instructions". I remember an intern at work asking me how to switch between virtual terminals on Linux. I told her Ctrl-Alt-Fx and wanted to explain what virtual terminals are and how the shortcuts are probably configurable, but at that point she had already stopped listening...


In my case, it was simpler: I just mixed up, mentally, which disk device was the source and which was the destination.


>But if and of are "easy", since they stand for "input file" and "output file"?

Yes, but which hard drive device is which isn't as straightforward. My fault, of course...just got them mixed up.


dd is one of the most dangerous ones I was thinking of. I hadn't even thought of swapping if and of though, just mistyping the target drive. I would feel so much safer choosing from a list of the drives shown with good identifying info like their manufacturer, size, serial number, and known partitions.


There’s a reason they call it “disk destroyer”.


I had a close call one time while trying to delete a directory `~/foo` and its contents. Here's what I typed — can you spot the error?

    rm -rf ~ /foo
I realized after a second or so and hit CONTROL-C. Nothing seemed to have been deleted, so I think it may have been still winding up.


hah, i had a tool where i configured ~/something as a target directory in its config. except that the tool did not understand ~ and literally created a directory named '~'

guess what i typed to remove it?

i lost a lot of work that day.

since then i developed the habit of only ever using rmdir for directories, and going into a directory first to remove its files with rm.

i also avoid rm * by finding some common parts like an extension that most files share.

i don't want rm * to be in my history where i might accidentally call it.
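
for anyone who hits the same thing: the trick is to stop the shell from expanding the tilde, e.g. (with -i out of paranoia):

  $ rm -ri './~'    # quoted, so the shell doesn't expand ~ to $HOME
  $ rm -ri ./~      # tilde expansion only applies when ~ starts the word
  $ rmdir -- '~'    # or this, if the directory is already empty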


Blender tried this on me a few months ago. I used mv to rename the directory before deleting it though, the fear caught me as soon as I realized I had a directory named ~ sitting in my home directory.


it tried to blend in ;-)

mv is a clever idea. rename to something safe, which can be undone if you get it wrong.

i use a similar approach when deleting a lot but not all files from a directory. i first move the files into a new directory, double check that every file is in the right place and then delete the new directory safely.


rm is not like explorer.exe and is rather fast with destruction, so I guess you must have deleted some dotfiles.


Could be. This was on a Mac, around 2013, so it should have been pretty snappy.

Whatever it got I never noticed. Maybe it was trawling through a bunch of build detritus in `~/.cpan` or something.


I accidentally did the same on my Mac via a shell script where I assumed mktemp worked the same as GNU mktemp; it does not. IIRC my /Applications directory was the first to be hit; most of the directory was unaffected due to file permissions, but it did delete some files despite me catching the error and sending sigkill within seconds.


rm -rf on root will query for confirmation. Not sure if this happens on all systems or even all top level directories.


My friend did a "rm * -i" and got back a '-i not found'


While it takes significant extra effort, it’s more robust to carefully decompose patterns into exact lists of files that can be reviewed (or use "find"), and this has saved me a couple of times.

For example, expand wildcards into a separate list of files that can become controlled commands ("rm -f A/file1", "rm -f A/file2", "rmdir A", ... instead of "rm -Rf ..."). This way, if directory "A" contains anything you didn’t expect, the "rmdir" fails at the end; and, someone can review the list of proposed files to be deleted before you run. Oh, and instead of having that sinking feeling of "rm" taking “too long” to run, your command is merely in the process of constructing a list of 1000 unexpected files instead of blowing away half your disk with shocking efficiency.

Also, file lists are pretty useful when you need to make minor edits (e.g. sometimes it’s a lot easier to find and exclude a few files from a list that you don’t want to touch, as opposed to describing those in a wildcard or search).

Depending on the task (and assuming no filesystem caching) it can be faster to gather up a list once and pass the composed list in to a whole series of commands. This is also good if you technically want the list to remain frozen even if the filesystem is changing underneath you, e.g. new files being added somewhere in the tree.
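
A minimal version of that workflow, sketched with GNU find (and assuming filenames without spaces or shell metacharacters):

  # build explicit, reviewable commands instead of one big rm -Rf
  $ find A -type f -printf 'rm -f %p\n'        > cleanup.sh
  $ find A -depth -type d -printf 'rmdir %p\n' >> cleanup.sh
  $ less cleanup.sh     # review the exact list, then run: sh -e cleanup.sh
  # sh -e stops at the first failure, e.g. rmdir on a directory that wasn't empty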


An `rm -rf` always has the potential for disaster, especially if you are calling it from bash history via `Ctrl + r` and hit return too fast without editing the argument after `-rf`. Actually the rm command should move files into a `~/.Trash` folder instead of immediately removing them.


> Actually the rm command should move files into a `~/.Trash` folder instead of immediately removing them.

I remember thinking that way when I started using the command line, but I now respectfully disagree. The `mv` command already does that, and the way `rm` works is actually necessary sometimes (e.g. I couldn't live without it now that I'm deleting huge files all day long).

Anecdote: when Windows was my daily driver, emptying the trash had become a mechanical task just after deleting a bunch of files anyway, so it didn't provide more safety than these useless “are you sure you want to X” dialogs you click without thinking about the question. I have lost a pair of files just because of that. When I have to use Windows these days, I mostly use shift+del to delete (ie. bypass the trash).

The nicest answers to data loss I've found so far are filesystem snapshots and proper backups. They not only protect from the occasional `rm` mistype, but from the mistyped shell redirections and bad programs as well.


PSA: There exists a tool called trash-cli that can be installed on most linuxes. Then you can `trash -r ./*` to send files to the trash. It won’t work for all situations but it’s fairly useful.
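
If memory serves, the package provides a small family of commands, roughly:

  $ trash-put ./*     # move the matches to the trash instead of unlinking them
  $ trash-list        # see what's in the trash
  $ trash-restore     # interactively put something back
  $ trash-empty 30    # purge trashed items older than 30 days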


At least ctrl-r shows you what command you're about to execute. I often see advice about character sequences in bash that expand to the last command or arguments in the last command, but I have never felt comfortable using them because I can't double-check the command.


I have absolutely no clue what I've done to make it work this way (if anything), but in zsh running a command with !$ (last word from the previous command) I get the opportunity to edit the interpolated version first.
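
That sounds like zsh's HIST_VERIFY option (I'm not certain it's on by default; some frameworks set it):

  # in ~/.zshrc: reload history expansions (like !$) into the editing buffer
  # for review instead of executing them immediately
  setopt hist_verify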


I feel like rm should default to limiting within the working directory.


“Because the share contains a file named exactly "--delete", and since it gets sorted first, rsync thinks it to be an argument and deletes everything.”

Also because Unix made the mistake (edit: calling this a mistake may be unfair) of having the shell expand wildcards. If it had provided a library for doing that, the “rsync thinks it to be an argument” part wouldn’t happen.

Alternatively, a file system could sort hyphens last when reading directory entries, but I’m not sure that would be a good idea. It would make problems rarer, but that also might mean fewer users would know when and how to avoid them. It certainly wouldn’t help with other file systems.
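
Even with the shell doing the expansion, most utilities (including, as far as I know, rsync) honor the POSIX end-of-options marker, which defuses this particular case (host made up):

  $ rsync -av -- ./* user@host:backup/   # after "--", a file named "--delete" is just a file
  $ rm -rf -- *                          # same idea for rm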


Right, DOS expects wildcards to be handled by programs. Besides avoiding problems from the shell expanding wildcards in problematic ways, this also enables some neat semantics like some tools accepting wildcards in multiple locations either for recursive path mapping or for a purpose similar to regex capture groups. It also substantially reduces (but does not eliminate) the need to be careful when including literal wildcard characters in commands.

The problem is of course that wildcard handling ends up being inconsistent between tools or entirely missing, although in practice this became less common as various Win32 calls related to file operations had internal wildcard handling, so applications got a basic form "for free".

As with many things in computing it's hard to say that either approach is superior to the other. PowerShell carries on the DOS tradition of not expanding wildcards in the shell, but provides a more complete and standardized wildcard implementation as part of the API to improve consistency.


In DOS/Windows the shell does not expand as the sibling comment says (which puts some burden on the programmer - I had to use a Windows DLL in a perl script once to replicate Windows-style argument handling) but as most people know they also have paths with \ and options with /, and / is not a valid filename character.

This use of / for options goes way back to VMS apparently[0] and changing the path separator to avoid ambiguity does seem sensible. Of course, Windows has its own quirks like CON.

0: https://superuser.com/a/176395/206248


It's a shame that 'glob is a generator' did not become the standard way of doing things. It would save so much headache (and make writing programs easier) if

  command *
(or similar syntax) expanded to something like

  command /some/fd
where calls to

  read(/some/fd)
produced

  "expansion_one␀expansion_two␀expansion_three␀...␀"
Though exactly how one would plumb this is its own question.


This is possible if you use "xargs" (which runs a command multiple times with different subsets of the arguments), although it is most useful if commands are aware of null-separation (e.g. old-style "find ... -print0 | xargs -0", before "find" added the "+" option).
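
Concretely, something like this (GNU find/xargs assumed):

  # null-separated names survive spaces, newlines and leading dashes
  $ find . -maxdepth 1 -type f -name '*.log' -print0 | xargs -0 rm -f --
  # newer find can batch the arguments itself with "+"
  $ find . -maxdepth 1 -type f -name '*.log' -exec rm -f -- {} +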


Yes, 'find -print0' / 'xargs -0' is the workaround that guided my thinking. (Which was essentially "how can I make 'find -print0 |' easy enough to use all the time, and flexible enough for more than one glob at a time?")


I was once doing a lot of stuff to a disk with gnu parted. I popped out and popped back in like this:

  $ parted
  <do a lot of work>
whoops, all the partitions I erased and recreated were... on my root disk!

I couldn't figure out the partition table of the running disk, but I was able to rsync all the data elsewhere and recover it.

IMHO if you invoke parted - WITH NO ARGUMENTS - it should not make a choice for you.


The canonical document about this sort of option injection attack:

https://www.defensecode.com/public/DefenseCode_Unix_WildCard...

BTW, shellcheck detects this sort of thing:

https://www.shellcheck.net/
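
For example, it warns on a bare glob passed to rm and suggests ./* or a -- separator (paraphrasing its message from memory):

  #!/bin/sh
  rm -rf *    # shellcheck flags this glob; ./* or "rm -rf -- *" silences it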


  rsync -r ./ ~/Documents/
Why would I even need to use *, if I wanted all files from a directory?


Because rsync's semantics around the trailing slash are a little confusing and unusual, it's pretty common IME for people to intentionally spell out explicit file paths with rsync, in order to avoid having to look up which way they need to write the target path.


Use a dot after the trailing slash and there is less confusion. "Blabla/." means "(everything) in the Blabla directory". Fairly self-explanatory/un-confusing and no need for "*"


I would suggest people use the rsync --dry-run feature when using wildcards or working on something sensitive.
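
Something along these lines (destination made up); -n is the short form of --dry-run:

  $ rsync -avn --delete ./ user@host:backup/   # show what would be transferred/deleted
  $ rsync -av  --delete ./ user@host:backup/   # rerun for real once the output looks right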


Or create a file _named_ --dry-run and save it everywhere...


Shell globbing + filenames which happen to be switches is one of the first things mentioned in the Unix Hater's Handbook IIRC

I guess every generation of Unix users will have to independently rediscover this the hard way in perpetuity.


Another example of why shell languages are insane interfaces. Only someone who works on writing shell scripts all day could keep in their head all the things like this that could go wrong and how to defend against them.


Sadly this is one of the sanest answers. ./* or always using -- are coping mechanisms. We've internalized the UNIX warts so hard that we no longer perceive them.

I've moved to writing scripts in python instead. It's horribly verbose, but at least it's predictable and doesn't require the use of noisy disclaimers on every call to defend against rare cases.

There's definitely a need for a terser sane scripting language, but I haven't found one yet.


I've started writing in Rust, and have a small function to pass a simple string to "/bin/sh -c"

I can avoid most of the pitfalls of writing in sh, while still being able to glue external programs together.

For the record, I'm only writing it in Rust because I enjoy it, not because "Rust is the one true way".


Another classic is something like "rm -rf /$BUILD_DIR" when the variable is undefined (maybe from a typo or from an argument not passed in).


Nowadays avoided by GNU tools, though. (requires --no-preserve-root)
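
The shell can also refuse to expand an unset variable, which covers cases --preserve-root doesn't (when the result isn't exactly "/"); a common guard, sketched:

  # abort with an error if BUILD_DIR is unset or empty, instead of expanding to "/"
  rm -rf "/${BUILD_DIR:?BUILD_DIR is not set}"
  # in scripts, "set -u" makes any reference to an unset variable fatal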


Nobody wants to talk about msdos?


How do you provision hundreds of boxes using a UI?

...In a way that can be checked in source control and reviewed?

CLIs are not bad.


I apologize for saying CLIs are bad.



