More than one way

Story: Using 'ls' and 'xargs' to manage large numbers of files
Total Replies: 13
chofhead

Feb 15, 2007
4:33 AM EDT
Like all good OSes, there is more than one way to skin a cat.

Here is another way to do the same without all the files streaming by on your screen.

find . -exec rm -f '{}' \;

I timed both with the time command, and the way shown in the article is the faster of the two by ~20 sec. On my machine the article's way took about 2 min 34 sec and the find way took 2 min 53 sec.
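For anyone who wants to repeat the comparison, here is a rough sketch that runs both methods against a scratch directory. The directory comes from mktemp, and the file count and file_N names are made up for illustration; the timings will depend entirely on the machine.

```shell
# Scratch directory so nothing real is touched.
scratch=$(mktemp -d)
count=200   # hypothetical size; far fewer files than in the thread

populate() {
    # Create file_1 .. file_$count with a plain counter loop.
    i=1
    while [ "$i" -le "$count" ]; do
        : > "$scratch/file_$i"
        i=$((i + 1))
    done
}

# Method 1: the ls | xargs form, without -t so nothing is echoed.
populate
start=$(date +%s)
(cd "$scratch" && ls -tr | xargs -I{} rm -f {})
xargs_secs=$(( $(date +%s) - start ))
left_after_xargs=$(find "$scratch" -type f | wc -l)

# Method 2: find with -exec, limited to regular files.
populate
start=$(date +%s)
find "$scratch" -type f -exec rm -f '{}' \;
find_secs=$(( $(date +%s) - start ))
left_after_find=$(find "$scratch" -type f | wc -l)

echo "xargs: ${xargs_secs}s  find: ${find_secs}s"
rmdir "$scratch"
```

Both forms start one rm per file, so it is not surprising that they land close to each other.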

DarrenR114

Feb 15, 2007
5:22 AM EDT
A lot of people don't know much about the potential of the find command, or the xargs command.

The thing I like about xargs is that you can produce the argument list from within a single file, by piping 'cat' into xargs.
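A small sketch of that idea, assuming a file with one name per line. The names list.txt, a.log, b.log and keep.txt are invented for the example.

```shell
# Scratch directory with three files; only two of them are listed for removal.
scratch=$(mktemp -d)
: > "$scratch/a.log"
: > "$scratch/b.log"
: > "$scratch/keep.txt"

# The argument list lives in a file, one name per line.
printf '%s\n' a.log b.log > "$scratch/list.txt"

# cat feeds the list to xargs, which builds the rm command line from it.
(cd "$scratch" && cat list.txt | xargs rm -f)

remaining=$(ls "$scratch" | sort | tr '\n' ' ')
echo "left over: $remaining"
rm -rf "$scratch"
```

This only works cleanly when the listed names contain no spaces or quotes, since plain xargs splits its input on whitespace.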

The thing I like about the find command is, as you demonstrated, that its 'exec' feature makes it a very powerful tool to do things like converting MS-Word (.doc) documents into different formats en masse.

I get the feeling that you, chofhead, are a Perl enthusiast with Mr. Wall's motto being "There's More Than One Way To Do It".

And if you want to not see the individual commands whiz by with xargs, you can skip the '-t' option: >$ ls -tr |xargs -I{} rm -f {}

That way it'll run silent but just as deadly.
swbrown

Feb 15, 2007
8:10 AM EDT
I generally use while loops, as they're more general and can be expanded to more complex things. E.g., to de-caps files in a directory (Windows disease), something like find . -type f | while read a; do mv "$a" "`echo "$a" | tr '[A-Z]' '[a-z]'`"; done.
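Here is a slightly more defensive sketch of the same while-loop idea. It lowercases only the file name and not the directories leading to it, and it reads lines with IFS= read -r so spaces and backslashes survive. The sample file names are made up, and names containing newlines would still break it.

```shell
scratch=$(mktemp -d)
: > "$scratch/README.TXT"
: > "$scratch/My Photo.JPG"

find "$scratch" -type f | while IFS= read -r path; do
    dir=$(dirname "$path")
    base=$(basename "$path")
    # Lowercase the last path component only, so parent directories are untouched.
    lower=$(printf '%s' "$base" | tr '[:upper:]' '[:lower:]')
    # Skip names that are already lowercase.
    [ "$base" = "$lower" ] || mv "$path" "$dir/$lower"
done

renamed=$(ls "$scratch" | sort | tr '\n' ',')
echo "$renamed"
rm -rf "$scratch"
```

Splitting the path matters here because the scratch directory itself may contain capital letters, and lowercasing the whole path would point mv at a directory that does not exist.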

However, do note the irony in the article of his fight against command line length, yet he's using a "for i in `seq 1 30000`" loop to set up the example. :) Old habits die hard.
DarrenR114

Feb 15, 2007
8:24 AM EDT
I'm not so much against command line length per se, as I am against executing the same command over and over and over again.

And the reason I was so averse to that in this sort of situation is that in the original problem there were "datestamps" embedded in the file names. The solution I showed the operations guy was something he could write down in a procedural operations book and use without modification the next year.

But you're right, sometimes the long command lines are the most expedient.

I do find myself using find with the -exec parameter quite often; especially when I want to grep for some expression in a few hundred JSP files (and only JSP files) spread out over a dozen subdirectories. Mix that with 'cut -d: -f1 |uniq' and I can quickly open all the files I need with gvim.
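A sketch of that kind of search, using grep -l to print each matching file once instead of cutting the names out of the output. The directory layout and the search string sessionBean are invented for the example.

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/web/admin" "$scratch/web/shop"
printf 'uses sessionBean here\n' > "$scratch/web/admin/users.jsp"
printf 'sessionBean\nsessionBean again\n' > "$scratch/web/shop/cart.jsp"
printf 'nothing relevant\n' > "$scratch/web/shop/help.jsp"
printf 'sessionBean in a non-JSP file\n' > "$scratch/web/shop/notes.txt"

# Only *.jsp files are searched; grep -l lists each matching file once.
matches=$(cd "$scratch" && find . -name '*.jsp' -exec grep -l 'sessionBean' {} + | sort | tr '\n' ' ')
echo "$matches"
rm -rf "$scratch"
```

The resulting list can then be handed to an editor of choice.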

joshbaptiste

Feb 15, 2007
8:41 AM EDT
Using find -exec rm {} \; will execute the rm binary once per file, and using ls | xargs will break if file names contain spaces, newlines, etc. Newer versions of GNU find support +: find . -exec rm {} + passes a set of files to each rm execution. find . -print0 | xargs -0 rm is another way to do essentially the same thing.
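To make the difference concrete, here is a small sketch with one plain name and one name containing a space. The .tmp names are invented, and -print0 / -0 are assumed to be available, as they are in GNU and BSD find and xargs.

```shell
scratch=$(mktemp -d)
mk() {
    : > "$scratch/plain.tmp"
    : > "$scratch/with space.tmp"
}

# find -exec ... + hands rm a batch of names, each as its own argument.
mk
find "$scratch" -type f -name '*.tmp' -exec rm -f {} +
left_plus=$(find "$scratch" -type f | wc -l)

# NUL-separated names survive the pipe intact.
mk
find "$scratch" -type f -name '*.tmp' -print0 | xargs -0 rm -f
left_print0=$(find "$scratch" -type f | wc -l)

# For contrast: plain ls | xargs splits "with space.tmp" into two names,
# so that file is left behind.
mk
(cd "$scratch" && ls | xargs rm -f)
left_plain=$(find "$scratch" -type f | wc -l)

echo "+: $left_plus  print0: $left_print0  plain: $left_plain"
rm -rf "$scratch"
```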
DarrenR114

Feb 15, 2007
9:01 AM EDT
joshbaptiste,

ack!! ... spaces in file names - another "MS-Windows" travesty.

Your solution is nicely elegant, but if you want to use xargs, you can do so by putting quotes around the {} thusly: ls |xargs -I{} rm -f "{}"

I'm beginning to think with the feedback from you guys that a short tutorial on find would be in order, and maybe an expanded one on xargs.

Then there's sed ...

mmmmmm
chofhead

Feb 15, 2007
10:47 AM EDT
DarrenR114

When you start one of these types of examples, you must expect many different ways to do it and this is a good thing. Some of us either haven't come across the different way of doing it or, like me, have forgotten it and need a little spark to get the old gray cells working again.

I want to thank you for doing this, it gets us thinking again.
DarrenR114

Feb 15, 2007
10:51 AM EDT
chofhead -

I absolutely agree - and thanks for your feedback. It helps to give others a more complete picture and perhaps even more ideas.

theduke459

Feb 15, 2007
6:32 PM EDT
The only issue with using the find command is that it starts a new rm process for each file deleted. The use of xargs can delete maybe 100 files per process started. I've seen xargs do mass file deletions in seconds that just never seemed to stop running via find.

DISCLAIMER: I'm speaking here as a big fan of find, but mass file deletion is definitely where xargs really shines!
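The process-count difference can be seen without deleting anything. In this sketch echo stands in for rm, and each line of output corresponds to one process started. The batch size of 100 is just the figure mentioned above, set explicitly with -n; left alone, xargs picks its own batch size based on the command-line length limit.

```shell
total=500

# One echo process per input line, as with find -exec ... \; or xargs -I{}.
per_item=$(seq 1 "$total" | xargs -I{} echo {} | wc -l)

# Up to 100 arguments per echo process.
batched=$(seq 1 "$total" | xargs -n 100 echo | wc -l)

echo "one per item: $per_item processes, batched: $batched processes"
```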
dcparris

Feb 15, 2007
7:06 PM EDT
This is a pretty cool discussion. Period. I'm learning something new! I rarely have need for these myself. But it's great to see some new "faces" around here.
Aladdin_Sane

Feb 15, 2007
7:17 PM EDT
> But it's great to see some new "faces" around here.

Not that they haven't been lurking for years. :-)

Really, any upsurge in posting here or in other fora is due to behavioral changes brought about by the Groklaw PJ Hiatus. Speaking only for myself, of course...
dcparris

Feb 15, 2007
7:25 PM EDT
Glad you're out of the shadows. Please do make yourself at home. :-)
swbrown

Feb 15, 2007
8:15 PM EDT
> I'm not so much against command line length per se, as I am against executing the same command over and over and over again.

No, I mean his whole problem was the command line length limit causing the evaluation of '*' to hit the cap, but in his setup to describe the solution, he uses a 'for `seq..`' loop with a large value which has the same kind of problems as using '*'. :) Kinda funny to read it like "I ran into problem X, let me show you how to avoid problem X, first we'll start by setting ourselves up for another problem X to occur, ...".

It's really bad programming practice to get used to using 'for `seq..`' loops, as code often gets reused even if not originally intended, there'll often be portability problems due to different command line length limits on various systems, and it's a maintenance issue as anything changing the number of iterations could unexpectedly cause the whole thing to fail.
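One possible alternative, sketched here with a made-up count and made-up file_N names, is to stream the numbers through a pipe so the loop reads them one line at a time and the full list never has to be expanded in one go.

```shell
scratch=$(mktemp -d)
count=300   # hypothetical; raise it as far as needed

# seq writes one number per line and the loop consumes them as they arrive.
seq 1 "$count" | while read -r i; do
    : > "$scratch/file_$i"
done

created=$(find "$scratch" -type f | wc -l)
echo "created $created files"
rm -rf "$scratch"
```

Changing the number of iterations only changes how long the pipe runs.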
DarrenR114

Feb 16, 2007
3:27 AM EDT
I considered using 'while' but didn't want to do the whole separate counter-increment thing - so I used the for loop that would accomplish what I needed. Here is a page with the sample of all three - for, while, and until: http://tldp.org/HOWTO/Bash-Prog-Intro-HOWTO-7.html

I don't use for loops a whole lot. Heck, when iterating through result sets I use 'while (resultSet.next())' (a Java construct). But sometimes for loops do the job a bit more to my liking. For small purposes, it's all a matter of personal preference.
