find & xargs
Part of the reason why the Linux command line is so POWERFUL!
Finding files can be a daunting task, given the vast number of files on your
average Linux filesystem. The number of system files in an average Linux install
is well into the tens of thousands, or even hundreds of thousands of files.
That's not counting user files!
The `find` command is, as its name implies, used to help you find the files
you are looking for, anywhere in your filesystems, plus much more (see below).
find can use a large number of criteria to find a file.
- The first argument to "find" is the directory (or directories) to perform
the search.
- Example: find (and display) every file in your home directory:
- "-name" the name of a file, or a partial name (a shell wildcard pattern, not
a regex).
- Example: find the file named "bookmarks.html" in your home directory:
- find $HOME -name bookmarks.html
- Example: find all files starting with the name "bookmarks" in your home
directory:
- find $HOME -name bookmarks\*
- Characters that mean something special to the shell, like the asterisk,
must be escaped with a backslash or put in single quotes to avoid problems.
- "-atime/-ctime/-mtime" a file's last "access time", "file status change
time" and "modification time", measured in days ("-amin/-cmin/-mmin" measure
in minutes). The time can also be compared to another file with
"-newer/-anewer/-cnewer".
- Example: find everything in your home directory modified in the last 24
hours:
- Example: find everything in your home directory modified in the last 7
days:
- Example: find everything in your home directory that has NOT been
modified in the last year:
- Example: find everything in your home directory that has been modified more
recently than "klug/find.html":
- find $HOME -newer klug/find.html
- "-type x" files of a certain type (file, directory, symlink, socket, pipe,
block, character) (fdlspbc)
- Example: find all directories under /tmp
- "-user" files owned by a certain user.
- Example: find all files owned by user "bruce" under /var
- "-group" files which are a member of a certain group.
- Example: find all files in group "users" under /var
- "-size" files of a certain size.
- Size can be specified in blocks, bytes, words, Kilobytes, Megabytes or
Gigabytes (bcwkMG).
- Example: find all files in your home directory exactly 100 bytes long:
- Example: find all files in your home directory smaller than 100 bytes:
- Example: find all files in your home directory larger than 100MB:
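- "-perm" files that have certain permissions, or have individual bits set or
not set.
- Example: find all files on your root filesystem that are SUID:
- find / -xdev -type f -perm -4000
- Example: find all files on your root filesystem that are SUID-root:
- find / -xdev -type f -user root -perm -4000
- Note: older versions of GNU find also accepted "-perm +4000". Current
versions reject that form, so use "-perm -4000" (all listed bits set) or
"-perm /4000" (any listed bit set).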
- "-links" files that have a certain number of hard links.
- Example: find all files in your home directory with a hard link
count of two:
- find $HOME -type f -links 2
- Example: find all files in your home directory with more than one
hard link:
- find $HOME -type f -links +1
- "-inum" a file with a certain inode number, useful in filesystem debugging
and locating all hard links to the same file.
- Example: find file with inum=114300 in the /home partition:
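Several of the examples above are described without a command. The sketch below shows one way they could be written with GNU find. It runs against a scratch directory instead of $HOME so the results are predictable. The file names, sizes and ages are made up for the demonstration, and "touch -d" is a GNU extension.

```shell
# Scratch tree (hypothetical names) so the results are predictable.
demo=$(mktemp -d)
mkdir "$demo/sub"
printf 'x' > "$demo/small.txt"                  # 1 byte
head -c 100 /dev/zero > "$demo/hundred.bin"     # exactly 100 bytes
head -c 2048 /dev/zero > "$demo/sub/big.bin"    # 2 KiB
touch -d '10 days ago' "$demo/old.log"          # empty, mtime 10 days back (GNU touch)
ln "$demo/small.txt" "$demo/sub/small.link"     # second hard link to small.txt

# No tests at all: every file and directory under the starting point.
all=$(find "$demo" | wc -l)

# -mtime -1: less than one full day old; -mtime +7: more than seven full days
# old (fractions of a day are dropped, so +7 means at least 8 days).
recent=$(find "$demo" -type f -mtime -1 | wc -l)
stale=$(find "$demo" -type f -mtime +7)

# -type d: directories only (the starting point counts too).
dirs=$(find "$demo" -type d | wc -l)

# -user also accepts a numeric ID, which avoids guessing a login name here.
mine=$(find "$demo" -user "$(id -u)" | wc -l)

# -size: "c" is bytes; a bare number is exact, "-" means less, "+" means more.
exact=$(find "$demo" -type f -size 100c)
small=$(find "$demo" -type f -size -100c | wc -l)
large=$(find "$demo" -type f -size +1k)

# -inum: look up every name that shares small.txt's inode number.
inode=$(ls -i "$demo/small.txt" | awk '{print $1}')
names=$(find "$demo" -xdev -inum "$inode" | wc -l)

rm -rf "$demo"
```

For the real searches, the starting point would be $HOME, /tmp or /var and the values those from the examples, for instance "-mtime -7" for the last 7 days, "-mtime +365" for not modified in the last year, or "-size +100M" for larger than 100MB.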
find can perform a number of actions on the file(s) it finds.
- "-print" prints the names of the files it finds. This is the default if no
other actions are specified. These two commands are identical on recent
Linux systems:
- find $HOME -name bookmarks.html
- find $HOME -name bookmarks.html -print
- Variations include:
- "-ls" to display detailed output instead of just filename ("ls -dils"
format).
- "-fprint" to send the output to a file instead of stdout.
- "-printf" to format the output in a specific way.
- "-fprintf" a combination of the above two.
- "-print0" Same as -print, except it separates files by a null character
(ASCII 0) instead of a newline. Although the usefulness of this may not be
immediately obvious, it is extremely useful! See examples below. (The
argument ends in the number ZERO, not the letter O.)
- Variations include:
- "-fprint0" to send output to a file instead of stdout.
- "-delete" will delete all files it finds. Use with care! :-)
- Example: delete all files named "core" in the /tmp directory:
- find /tmp -type f -name core -delete
- "-exec" will execute any command on the files found.
- Use "{}" to specify the filename found in the command.
- End the command with a ";" (escape it!) to execute the command every
time a file is found.
- End the command with a "+" to pass multiple files to the command (like
xargs).
- Variations include:
- "-execdir" to execute the command in the file's directory (instead of the
current directory)
- "-ok" to ask the user for each file found if the command should be
executed.
- "-okdir" ask the user and execute in the file's directory.
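The difference between the two ways of ending "-exec", and a couple of the other actions, can be seen in a scratch directory. This is a minimal sketch assuming GNU find ("-printf" and "-delete" are GNU extensions). The file names are invented for the demonstration.

```shell
# Scratch directory with three empty files (hypothetical names).
demo=$(mktemp -d)
touch "$demo/a.txt" "$demo/b.txt" "$demo/core"

# "\;" runs the command once per file: two echo calls, two output lines.
per_file=$(find "$demo" -name '*.txt' -exec echo found {} \; | wc -l)

# "+" packs as many names as fit onto one command line: one echo, one line.
batched=$(find "$demo" -name '*.txt' -exec echo found {} + | wc -l)

# -printf formats each match: %f is the base name, %s the size in bytes.
listing=$(find "$demo" -type f -printf '%f %s\n' | sort)

# Preview with -print, then repeat the same expression with -delete.
find "$demo" -type f -name core -print
find "$demo" -type f -name core -delete
left=$(find "$demo" -type f | wc -l)

rm -rf "$demo"
```

Running the expression with "-print" first and only then swapping in "-delete" is a cheap way to confirm what is about to be removed.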
Other useful find parameters:
- "-xdev" Don't descend directories on other filesystems.
- Useful for searching a single hard drive partition and omitting other
HDD partitions, /proc, CD-ROMs, network mounts, etc. (network drives and
CDs can be really slow to search)
- "-maxdepth n" Descend at most n directory levels (n cannot be negative).
- "-mindepth n" Do not apply tests or actions at levels less than n
(n cannot be negative).
- "-daystart" perform time tests from beginning of today, instead of current
date/time.
- "-L" follow symbolic links (find does not follow symlinks by default).
Unlike the other parameters here, it goes before the starting directory.
- "-fstype x" only find files on filesystems of type x.
- Useful for searching hard drive partitions and omitting CD-ROMs, network
mounts, etc.
- "-regex pattern" use full regular expressions (matched against the whole
path, not just the file name).
- Variations include:
- "-iregex" case insensitive regex.
- "-depth" Process each directory's contents before the directory itself
- Useful for removing, since the directory has to be empty before it can
be removed
- "-noleaf" Do not optimize by assuming that directories contain 2 fewer
subdirectories than their hard link count.
- The default optimization improves speed significantly on Unix
filesystems. However, it doesn't work so well on other filesystems (DOS,
CDROM, etc.), hence this option.
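The depth and regex parameters above are easy to try on a small tree. The sketch below assumes GNU find and uses made-up directory and file names. "-xdev" and "-fstype" are left out because their effect depends on what is mounted on a given system.

```shell
# Scratch tree four levels deep (hypothetical names).
demo=$(mktemp -d)
mkdir -p "$demo/a/b/c"
touch "$demo/top.txt" "$demo/a/one.txt" "$demo/a/b/two.txt" "$demo/a/b/c/three.TXT"

# -maxdepth 2: the starting point is depth 0, so this sees top.txt and a/one.txt.
shallow=$(find "$demo" -maxdepth 2 -type f | wc -l)

# -mindepth 3: skip anything fewer than three levels down.
deep=$(find "$demo" -mindepth 3 -type f | wc -l)

# -depth: contents are processed before their directory, so the starting
# directory is the last line of output.
last=$(find "$demo" -depth | tail -n 1)

# -iregex must match the WHOLE path; this picks names starting with "t"
# and ending in ".txt", ignoring case.
tfiles=$(find "$demo" -iregex '.*/t[a-z]*\.txt' | wc -l)

rm -rf "$demo"
```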
OPERATORS
- "! expr" True if expr is false. (logical NOT)
- "( expr )" Force precedence.
- "expr1 -a expr2" Logical AND (the default operation, so "-a" is not
necessary).
- "expr1 -o expr2" Logical OR.
- "expr1 , expr2" For different searches while traversing the filesystem
hierarchy only once. Must be used with parentheses and -fprint to save
separate outputs.
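A short sketch of how the operators combine, assuming GNU find ("-fprint" and the comma operator are GNU extensions). The file names and the two output files are invented for the demonstration.

```shell
# Scratch files plus a directory whose name also ends in .txt (hypothetical).
demo=$(mktemp -d)
out=$(mktemp -d)
touch "$demo/a.txt" "$demo/b.html" "$demo/c.jpg"
mkdir "$demo/d.txt"

# Implicit AND: name matches and it is a regular file.
both=$(find "$demo" -name '*.txt' -type f)

# OR inside escaped parentheses, then AND with -type f.
either=$(find "$demo" \( -name '*.txt' -o -name '*.html' \) -type f | wc -l)

# Without parentheses AND binds tighter than OR, so -type f only applies to
# the second -name and the directory d.txt slips through.
loose=$(find "$demo" -name '*.txt' -o -name '*.html' -type f | wc -l)

# NOT: regular files whose name does not end in .txt.
negated=$(find "$demo" -type f ! -name '*.txt' | wc -l)

# Comma: one traversal, two independent lists saved with -fprint.
find "$demo" \( -name '*.jpg' -fprint "$out/jpg.list" \) , \
             \( -type d -fprint "$out/dir.list" \)
jpgs=$(wc -l < "$out/jpg.list")
dirs=$(wc -l < "$out/dir.list")

rm -rf "$demo" "$out"
```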
Examples:
- Display all jpg files in the top two levels of your home directory:
- find $HOME -maxdepth 2 -name \*jpg -print -exec xv {} \;
- find $HOME -maxdepth 2 -name '*jpg' -print -exec xv {} +
- find $HOME -maxdepth 2 -name '*jpg' -print0 | xargs -0 xv
- cron job to make all files & directories world readable/writable in
common area:
- find /somedir/common -type f -exec chmod a+wr {} \;
- find /somedir/common -type d -exec chmod 777 {} \;
- cron job to force correct owner/group/permissions on certain files:
- find $BSE/lib/user \( -name '[p,u]*' -a -type f -a ! -perm 664 \) -exec chmod 664 {} \;
- find $BSE/lib/user \( -name 'd*' -a -type f -a ! -perm 666 \) -exec chmod 666 {} \;
- find $BSE/lib/user \( -type f -a ! -user bsp \) -exec chown bsp {} \;
- find $BSE/lib/user \( -type f -a ! -group programs \) -exec chgrp programs {} \;
- cron job to delete some old log files and keep record of files removed:
- find /var/opt/hparray/log -mtime +30 -print -exec rm -f {} \; >> $logf 2> /dev/null
- cron job to delete some old temp files and keep record of files removed:
- find / -name core -type f -fstype xfs -print -exec rm -f {} \; >> $logf 2> /dev/null
- find /var/tmp -mtime +1 -name '*aaa*' -print -exec rm -f {} \; >> $logf 2> /dev/null
- find /var/tmp -mtime +1 -name 'srt*' -print -exec rm -f {} \; >> $logf 2> /dev/null
- find /var/tmp -mtime +7 -print -exec rm -f {} \; >> $logf 2> /dev/null
- Traverse /var only once, listing setuid files and directories into
/root/suid.txt and large files into /root/big.txt (example taken from the
find man page):
- find /var \( -perm -4000 -fprintf /root/suid.txt '%#m %u %p\n' \) , \
\( -size +100M -fprintf /root/big.txt '%-10s %p\n' \)
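The log-cleanup cron jobs above pair "-print" with "-exec rm" so that the names of the removed files end up in a log. The sketch below reproduces that pattern in a scratch directory, with a dry run first. The directory, the file names and the ages are hypothetical, and "touch -d" is a GNU extension.

```shell
# Scratch log directory and log file (hypothetical names and ages).
logdir=$(mktemp -d)
logf=$(mktemp)
touch -d '45 days ago' "$logdir/old.log"
touch "$logdir/new.log"

# Dry run: echo the rm command instead of running it.
preview=$(find "$logdir" -type f -mtime +30 -exec echo rm -f {} \;)

# Real run: -print writes each name to stdout (appended to the log),
# then -exec removes the file.
find "$logdir" -type f -mtime +30 -print -exec rm -f {} \; >> "$logf" 2> /dev/null

removed=$(cat "$logf")
kept=$(find "$logdir" -type f)

rm -rf "$logdir" "$logf"
```

Because "-print" comes before "-exec", a file is logged even if the rm then fails. Swapping the order would log only the files for which rm succeeded.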
xargs
- Why do we need this "xargs" thing? It's in the presentation title! :-)
Answer: Speed and efficiency.
- The second line runs much faster than the first for a large number of
files:
- find / -name core -exec rm -f {} \;
- rm -f $(find / -name core -print)
In other words, running "rm" once, with all the filenames on the command
line, is much faster than running "rm" multiple times, once for each file.
- However, the second line could fail if the number of files is very large
and exceeds the maximum number of characters allowed in a single command.
- "xargs" will read the file names that find prints and run the command with
multiple arguments, multiple times if necessary, to stay under the maximum
command line length.
- find / -name core -print | xargs rm -f
- The simplest way to see what xargs does, is to run some simple commands:
- find $HOME -maxdepth 2 -name \*.jpg -exec echo {} \;
- find $HOME -maxdepth 2 -name \*.jpg | xargs echo
- Enter the power of ZERO!
- The 2nd command will fail if any of the file names contain a space or other
special character:
- find $HOME -maxdepth 2 -name \*.jpg -exec ls {} \;
- find $HOME -maxdepth 2 -name \*.jpg | xargs ls
- Delimiting the file names with NULL fixes the problem:
- find $HOME -maxdepth 2 -name \*.jpg -print0 | xargs -0 ls
- Real world example of a very useful set of commands (this happens to me
all the time):
- Our "webmaster" comes to me and asks if I can "find" all the web pages
that contain the graphic file "ArmstrongISS.jpg" so they can edit those
pages.
- find /home/httpd \( -name \*.html -o -name \*.php -o -name \*.php3 \) -print0 \
| xargs -0 grep -l "ArmstrongISS.jpg"
Note: add a "-i" parameter to "grep" for a case insensitive search on the
string.
- The above example alone is worth more than double the price of
admission! :-) Not only does it find files by name, it only displays the
names of files containing a certain string! When combining "find" with other
Linux commands (like grep), and given its potential use in shell scripts,
the power is only limited by your imagination (and your command line
skills). :-)
- Similar examples to demo on my local system:
- find $HOME \( -name \*txt -o -name \*html \) -print0 | xargs -0 grep -li vpn
- find $HOME \( -name \*txt -o -name \*html \) -exec grep -li vpn {} +
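The points made in this section (whitespace breaking plain xargs, NUL delimiters fixing it, xargs splitting work into batches, and the grep search) can be checked with a few throwaway files. This is a sketch only. The file names and page contents are invented, and "-print0" and "xargs -0" are assumed to be available, as they are in GNU findutils.

```shell
# Throwaway files, two of them with a space in the name (hypothetical).
demo=$(mktemp -d)
touch "$demo/plain.jpg" "$demo/with space.jpg"
echo '<img src="ArmstrongISS.jpg">' > "$demo/page one.html"
echo '<p>no image here</p>' > "$demo/other.html"

# Whitespace-delimited input: xargs splits "with space.jpg" into two names,
# ls cannot find either of them, and the pipeline reports failure.
if find "$demo" -name '*.jpg' | xargs ls > /dev/null 2>&1
then plain=ok; else plain=failed; fi

# NUL-delimited input: each name arrives intact.
if find "$demo" -name '*.jpg' -print0 | xargs -0 ls > /dev/null 2>&1
then nul=ok; else nul=failed; fi

# -n 2 caps each command at two arguments, making the batching visible:
# five inputs become three echo invocations, one output line each.
batches=$(printf '%s\n' 1 2 3 4 5 | xargs -n 2 echo | wc -l)

# The webmaster search: list only the pages that mention the image.
hits=$(find "$demo" -name '*.html' -print0 | xargs -0 grep -l 'ArmstrongISS.jpg')

rm -rf "$demo"
```

In normal use xargs picks the batch size itself from the system's command line limit. "-n" is only used here to force small batches.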
Miscellaneous
- Always read the man page for "find" & "xargs" on the system where you
plan on using it. "find" has been around for a long time, but is still
evolving at a rapid rate.
- Some arguments are fairly new, and may not exist on some older systems
and some commercial Unix systems.
- Some parameters, like "-delete" and -exec followed by a plus sign ("+"),
are REALLY NEW. (Neither existed in the SuSE 9.2 "find"!)
- Other parameters may be named something completely different on
commercial Unix systems (e.g. "-fstype" == "-fsonly" on HPUX)
- Older versions of "find" did NOT have "-print" as the default action.
In fact there was no default, so running find without any action did
nothing noticeable (except use CPU time and grind the hard drive)!
- Always test your commands using "-exec echo" before invoking real
commands, especially destructive commands like removing files! :-)
- Honorable mention: "locate".
- The "locate" package comes with most distributions.
- Installs by default on Redhat/Fedora systems.
- Comes with SuSE, but does not install by default.
- "Locate" consists of a cron job that runs nightly (by default), and
stores all filenames on your system in a searchable database.
- Simply type "locate filename" to find the location (directory) of a file.
- Is MUCH faster than using "find" on your entire directory structure, but
only tells you where files live. It has none of the extra functionality
of "find".
- Use the right tool for the job: "locate" is faster than "find" when you are
only looking for the locations of files (or where they lived as of
yesterday).
- Any [more] questions?
- Note to self: if this went too quickly, read the "find" man page
out loud very slowly! :-)
- Fin.