ron
Member
I am reworking the scripts that back up to tape. No problem with the PROBKUP side of things, but I hit a problem with tar (we copy quite a large number of Unix files to tape too).
The old script included a long list of directory names in the tar command. That worked just fine, but I wanted to have a file list as well, so I changed the logic to build the list with find first, like this:
cd $C_Home
$FIND . -print >${FINDLIST}
and then:
tar cvfb /dev/rmt/0 20 -I ${FINDLIST} >>$FILELOG 2>&1
I found that the tape archive was HUGE. Investigation showed that (as we all know) find, by default, writes out the names of directories as well as files to ${FINDLIST}, and tar archives every directory it is given recursively. So if file cccc is in the path ./aaaa/bbbb/cccc, it gets written to the tape FOUR times: once because "." is an entry in ${FINDLIST}, again because "./aaaa" is an entry, once more because "./aaaa/bbbb" is an entry, and finally because "./aaaa/bbbb/cccc" is an entry itself!
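To make the multiplication concrete, here is a minimal sketch that rebuilds the ./aaaa/bbbb/cccc example in a scratch directory and counts how often cccc would be picked up. It does not run tar at all: it assumes tar recurses into every directory named in the list and imitates that with find. The variable names (demo, list, copies) are made up for the illustration.

```shell
#!/bin/sh
# Hypothetical scratch tree mirroring the ./aaaa/bbbb/cccc example.
demo=$(mktemp -d)
list=$(mktemp)            # kept outside the tree so it does not list itself
mkdir -p "$demo/aaaa/bbbb"
echo data > "$demo/aaaa/bbbb/cccc"

cd "$demo" || exit 1
find . -print > "$list"   # same form as the original find: directories AND files
cat "$list"

# Assumption: tar recurses into each directory entry. Imitate that by
# asking, for every entry in the list, how many copies of cccc lie at or below it.
copies=0
while IFS= read -r entry; do
    n=$(find "$entry" -type f -name cccc | wc -l)
    copies=$((copies + n))
done < "$list"

echo "cccc would be archived $copies times"
```

The deeper a file sits in the tree, the more directory entries sit above it in the list, so the blow-up grows with directory depth.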
I have dealt with the output of find many times, and worked with tar many times, and I SHOULD have seen this coming ... but I didn't.
I spent a very long time searching the web to see what others have done about this problem, but found nothing. However, I found MANY references to people recommending using tar like this:
tar cvfz archive.tgz `find /home -ctime -1 -depth -print`
and - as far as I can see - that will cause exactly the same kind of problem as I've had.
I solved the problem by restricting the list to regular files only, adding -type f to the find, like this:
$FIND . -type f -print >${FINDLIST}
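For comparison, a minimal sketch of the -type f fix against a scratch tree, writing to an ordinary archive file rather than the tape. It assumes a tar that accepts -T for the list file and --no-recursion (GNU tar and bsdtar do; the -I used above appears to be the Solaris equivalent of -T). The second variant is not what I did, just a possible alternative: -type f leaves empty directories and symbolic links out of the list, whereas keeping the full list and telling tar not to descend into directories retains them. All paths and names here are hypothetical.

```shell
#!/bin/sh
# Hypothetical scratch tree, for illustration only.
work=$(mktemp -d)
mkdir -p "$work/src/aaaa/bbbb" "$work/src/empty"
echo data > "$work/src/aaaa/bbbb/cccc"
cd "$work/src" || exit 1

# Variant 1: regular files only, as in the post.
find . -type f -print > "$work/files.list"
tar -cf "$work/files.tar" -T "$work/files.list"

# Variant 2 (assumed alternative): list everything, but stop tar from
# descending into the directories it is handed.
find . -print > "$work/all.list"
tar -cf "$work/all.tar" --no-recursion -T "$work/all.list"

# Count how often cccc appears in each archive, and whether the
# empty directory survived in the second one.
files_copies=$(tar -tf "$work/files.tar" | grep -c 'cccc$')
all_copies=$(tar -tf "$work/all.tar" | grep -c 'cccc$')
empty_kept=$(tar -tf "$work/all.tar" | grep -c 'empty')

echo "files-only list: cccc stored $files_copies time(s)"
echo "no-recursion:    cccc stored $all_copies time(s), empty dir entries: $empty_kept"
```

Either way each file should land in the archive once; which variant suits depends on whether directory entries and links matter for the restore.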
Has anyone else hit this problem? If so what did you do about it?
Thanks,
Ron.