On Apr 16, 2007, at 5:35 AM, Jordan K. Hubbard wrote:

On Apr 16, 2007, at 2:09 AM, Kevin Ballard wrote:

I'm also confused as to your complaint about it executing the body every time. With your original implementation, it executed the match every time, and every time it succeeded (most of the time, as it defaulted to "expr 1") it would then execute the body (which defaulted to printing the file, though I'm not sure why as any practical usage of the command doesn't want to print the result).

The fact that { 1 } and { print $_filename } are the defaults shouldn't be taken as an indication that we don't need a flexible API.  You can type "find ." and get useful results at the shell, too, but no one argues that find(1)'s other options should be deprecated since that's all one would ever want to do with it.   I do a lot of things with find(1) that don't print the result - they manipulate the filenames found silently and move on.

I'm also hardly defending the original find API as perfect - never said it was.   However, like xinstall, I think it's a worthwhile goal to make it follow in the footsteps of its command-line analog so as to ease the process of people coming up to speed with it.   It's not going to be identical, naturally, since Tcl is a "real language" unlike /bin/sh and there are more powerful things you can do with it, but it should at least follow the POLA as much as it can.  I'm still waiting for some usage examples which demonstrate how this is a marked improvement.

Unlike xinstall/install, find is a command that's far more likely to be used in the MacPorts base code than in a Portfile. Making it match the find command-line usage simply makes it more complicated to use, since it doesn't actually match the command-line usage. And I'm not sure why you're arguing against my changes in this context, since your implementation didn't match command-line usage either.

And yes, you can type "find ." and get useful results at the shell, and that's because you're either doing that to tell yourself about files, or you're piping the results elsewhere. Inside a tcl script you're not examining the filesystem by hand with find, nor are you piping the results.

In any case, the standard usage for something like this is to traverse a filesystem and perform some action on each file that matches the criteria. The new find command is intended specifically for this. It's basically a foreach on a recursive glob. A theoretical usage might look as follows:

find src file {
    switch -glob $file {
        .svn -
        CVS  { continue }
        *.o  { file delete $file }
    }
}

And maybe it's just me, but I would find this to be far more readable (and concise):

find file src {string match *.o $file} {file delete $file}

Except, of course, that what you just typed wouldn't work. It would find all the object files in the src directory, but not in any of its subdirectories because it wouldn't match any of its subdirectories. And, of course, it doesn't match the behaviour of my example - it will recurse into .svn and CVS directories. To match the same behaviour you'd have to do the following:

find file src {expr {[string match *.o $file] || ([file type $file] eq "directory" && ![string match */.svn $file] && ![string match */CVS $file])}} { if {[file type $file] ne "directory"} { file delete $file } }

Pretty ugly, no? Sure, you could try cleaning it up like

find file src {
switch -glob $file {
*/.svn  -
*/CVS   -
*.o     { return 1 }
default { return 0 }
} } { if {[string match *.o $file]} { file delete $file } }

But that's hardly any cleaner.

Incidentally, it just occurred to me that my example won't quite work right, as $file contains the full path. Either the switch should be on [file tail $file] or the .svn and CVS patterns should be */.svn and */CVS.
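
For illustration, the "foreach on a recursive glob" idea, with the [file tail $file] correction applied, might be sketched in pure Tcl roughly as follows. The proc name "traverse" and its continue/break behavior are assumptions made for this sketch; it is not the MacPorts implementation, just one way the described semantics could work.

```tcl
# Hypothetical helper: "traverse" is an illustrative name, not a MacPorts command.
# Visits root and everything beneath it, depth first, running body once per
# path with the path stored in the caller's variable varName.
# Assumed semantics: "continue" in body skips descending into the current
# directory; "break" stops the whole walk.
proc traverse {root varName body} {
    upvar 1 $varName path
    set pending [list $root]
    while {[llength $pending] > 0} {
        # Pop the next path to visit.
        set path [lindex $pending 0]
        set pending [lrange $pending 1 end]

        # Run the body in the caller's scope and capture its return code:
        # 0 = ok, 3 = break, 4 = continue, anything else is passed through.
        set code [catch {uplevel 1 $body} result]
        switch -exact -- $code {
            0       {}
            3       { return }
            4       { continue }
            default { return -code $code $result }
        }

        # "file type" does not follow symlinks, so linked directories are
        # not entered; the catch covers paths the body just deleted.
        if {[catch {file type $path} type] || $type ne "directory"} {
            continue
        }

        # "*" skips dotfiles on Unix, so ask for ".*" as well and drop
        # the "." and ".." entries.
        set children {}
        foreach child [lsort [glob -nocomplain -directory $path -- * .*]] {
            if {[lsearch -exact {. ..} [file tail $child]] == -1} {
                lappend children $child
            }
        }
        set pending [concat $children $pending]
    }
}

# The earlier example, switching on the last path component so that the
# .svn and CVS patterns match even though $file holds the full path.
traverse src file {
    switch -glob -- [file tail $file] {
        .svn -
        CVS  { continue }
        *.o  { file delete $file }
    }
}
```

Because the body runs before the directory is listed, a "continue" on .svn or CVS keeps the walk from ever looking inside those directories.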

Note that I'm also talking about "general case" usage here, where the user is really trying to accomplish a simple and straightforward job in one line without too much knowledge of the internals of find's recursion behavior, something I think you'll find to be true in the majority of usage scenarios rather than the converse. Is that one line equivalent to your script? No, because as you note, we don't short-circuit recursion in the .svn or CVS cases, but I also think your example is just a wee bit contrived given that most folks will be using this primitive to do operations in $destroot and not checked-out trees of source where this kind of short-circuit behavior is typical or even important. Still, if you were to allow multiple expr/body pairs, I suppose you could always do it like this:

find file src {regexp .svn|CVS $file} {return 1} {string match *.o $file} {file delete $file}

Where "return 1" could have the special case meaning of "don't recurse."   Again, however, I don't see this as the typical usage case, just as people don't typically use find(1) with 11 predicates and parenthesized sub-expressions even though that's technically possible.  They use maybe one predicate and a fairly typical action, like -exec rm {} or print.   That's the usage case I was trying to optimize for in my version and while I don't think I necessarily did the best job, I think you've gotten further away, not closer, to the goal of making it really simple for the majority of intended usage scenarios.

I think the biggest problem here is that you're optimizing for the Portfile use case while I'm optimizing for the MacPorts base code use case. Why am I doing this? Because I know right off the bat two places where this is useful in base code - the delete proc in portutil and pipping's new +universal build stuff for ports like openssl that need two single-architecture builds.

I can understand what you're saying about making it work similarly to find(1). I can see about 16 uses of system "find ..." in Portfiles right now, but trying to replace them with your old command would be difficult. If you really want to work towards acting like find(1), maybe we should implement a command that actually does act like find(1)? And then we can rename my version of find to fs-traverse.

How does that sound?

-Kevin Ballard

-- 
Kevin Ballard
http://kevin.sb.org
eridius@macports.org
http://www.tildesoft.com