[kwlug-disc] So why not tar -cf tarball.tar a.xz b.xz c.xz, instead of tar -cJf tarball.tar.xz a b c ?
B.S.
bs27975.2 at gmail.com
Sat Nov 5 00:33:19 EDT 2016
On 11/04/2016 01:59 PM, Chris Frey wrote:
> In your original example, the comparison was between tarring pre-compressed
> files in a plain .tar file vs. compressing a tar of uncompressed files.
>
> I was saying that if you can recover from both compressed and uncompressed
> tar files, the advantage of pre-compressing may not be so stark.
>
> In my brief reading and tests, it seemed possible to recover from both.
> In both cases, it appeared that I would need some special tools: ft for
> fixing tar, and gzrt for fixing gzip. (I haven't tested gzrt yet).
>
> So for pre-compressed plain tarballs, the recovery would be:
>
> 1) use ft to fix the tarball
> 2) use gzrt to try to recover something from the file
>
> For tar.gz, it would be:
>
> 1) use gzrt to recover the plain tarball
> 2) use ft to fix the tarball and extract good files
Ah, OK, I follow now.
But the two paths giving identical results is premised on recovery
actually working in both cases. Agreed: A then B, or B then A, same
difference. From what little I've seen so far, though, that premise is
far from a given. With a glitched tar you skip to the next file header
and carry on getting what you can out of it. A glitched compressed file
is more likely to be completely unrecoverable, because decoding each
part of the stream depends on what came before it. So a hole in the
middle of a large tar.gz likely kills the entire tar.gz, while a hole in
the middle of a plain tar (or a hole in a single gzipped file within it)
only affects that one file.
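A rough way to try this claim without ft or gzrt is to build both layouts in memory, damage each one in the middle, and count how many files still come back byte-for-byte. The sketch below uses only Python's standard tarfile and gzip modules. The file names, sizes, hole position and hole length are all made up for illustration, and a real recovery tool may well salvage more than a plain reader does.

```python
import gzip
import io
import tarfile


def build_tar(files, mode):
    """Pack {name: bytes} into an in-memory archive ("w" = plain, "w:gz" = gzipped)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode=mode) as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def punch_hole(blob, offset, length=32):
    """Simulate damage by inverting `length` bytes starting at `offset`."""
    damaged = bytearray(blob)
    for i in range(offset, offset + length):
        damaged[i] ^= 0xFF
    return bytes(damaged)


def count_intact(archive, mode, expected, unpack=lambda raw: raw):
    """Count members that still unpack to exactly their original contents."""
    intact = 0
    try:
        with tarfile.open(fileobj=io.BytesIO(archive), mode=mode) as tar:
            for member in tar:
                try:
                    data = unpack(tar.extractfile(member).read())
                except Exception:
                    continue  # this member is lost; try the next one
                if data == expected.get(member.name):
                    intact += 1
    except Exception:
        pass  # the archive stream itself gave out; keep what was counted
    return intact


# Hypothetical sample data: six small, compressible text files.
files = {
    f"file{i}.txt": "".join(f"file{i} row {j}\n" for j in range(400)).encode()
    for i in range(6)
}

# Layout A: compress each file first, then tar the results.
gz_members = {name + ".gz": gzip.compress(data, mtime=0) for name, data in files.items()}
expected_a = {name + ".gz": data for name, data in files.items()}
tar_of_gz = build_tar(gz_members, "w")

# Layout B: tar the raw files and compress the whole stream.
tar_gz = build_tar(files, "w:gz")

# Damage A in the middle of one member's payload, B in the middle of the stream.
with tarfile.open(fileobj=io.BytesIO(tar_of_gz), mode="r:") as tar:
    victim = tar.getmembers()[3]
hole_a = victim.offset_data + victim.size // 2
hole_b = len(tar_gz) // 2

intact_a = count_intact(punch_hole(tar_of_gz, hole_a), "r:", expected_a, gzip.decompress)
intact_b = count_intact(punch_hole(tar_gz, hole_b), "r:gz", files)
print(f"tar of .gz files: {intact_a} of {len(files)} intact")
print(f"tar.gz:           {intact_b} of {len(files)} intact")
```

In layout A only the damaged member should be lost. How much of layout B survives depends on where the hole lands and on how the reader buffers the stream, so treat the second number as indicative rather than definitive.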
I will be most interested to hear any opinion you arrive at should you
ever conclude your tests.
The other premise of this conversation, though, is being able to have
confidence in a file within a tar at any point in time. The compression
formats carry their own integrity check, so getting that check for free
on every file would be an advantage of tarring compressed files over
compressing tars.
As said prior, you can get to the same place with md5sums (or sha1sums),
though, and avoid compression entirely (along with any potential broken
compression / recovery issues). Which arguably makes sense in the
presence of a compressing filesystem such as btrfs. Except, as also
noted prior, I am seeing significantly better compression from something
like xz than from btrfs alone.
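The checksum route can be sketched in a few lines: record one digest per member when the tar is created, keep that record next to the archive, and compare again later. This is only an illustration with Python's standard hashlib and tarfile modules. The member names and contents are invented, and SHA-256 is an assumed default here; "md5" or "sha1" can be passed instead to match the tools mentioned above.

```python
import hashlib
import io
import tarfile


def build_tar(files):
    """Pack {name: bytes} into an uncompressed in-memory tar."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def checksums(archive, algorithm="sha256"):
    """Return {member name: hex digest} for every regular file in the tar."""
    sums = {}
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:") as tar:
        for member in tar:
            if member.isfile():
                payload = tar.extractfile(member).read()
                sums[member.name] = hashlib.new(algorithm, payload).hexdigest()
    return sums


def changed_members(archive, recorded, algorithm="sha256"):
    """List the members whose digest no longer matches the recorded one."""
    current = checksums(archive, algorithm)
    return sorted(name for name, digest in recorded.items() if current.get(name) != digest)


# Hypothetical archive contents.
archive = build_tar({"a": b"alpha\n", "b": b"bravo\n", "c": b"charlie\n"})
recorded = checksums(archive)  # what you would save next to the tar

print(changed_members(archive, recorded))   # []

# Simulate bit rot inside member "b" without touching any tar header.
bitrot = archive.replace(b"bravo\n", b"brav0\n", 1)
print(changed_members(bitrot, recorded))    # ['b']
```

The record only tells you which members changed. It does not help recover them, which is the same limitation as the integrity check built into a compressed file.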
At least one article I read noted you could change the contents of a tar
file (e.g. with a hex editor), and as long as you stayed within the data
of any one file, tar would never know. Bummer.
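That behaviour follows from the tar format: the checksum stored in each 512-byte header covers only the header, not the file data after it. A small sketch with Python's tarfile module shows both halves. The file name and contents are hypothetical, and tarfile stands in for the tar command here, so other implementations may report errors differently.

```python
import io
import tarfile


def build_tar(files):
    """Pack {name: bytes} into an uncompressed in-memory tar."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


archive = build_tar({"notes.txt": b"original contents\n"})

# "Hex edit" inside the file's data area, keeping the length the same.
edited = archive.replace(b"original", b"tampered", 1)
with tarfile.open(fileobj=io.BytesIO(edited), mode="r:") as tar:
    data = tar.extractfile("notes.txt").read()
print(data)  # the altered bytes come back with no complaint

# Flip one bit in the header instead (first byte of the member name).
broken = bytearray(archive)
broken[0] ^= 0x01
try:
    tarfile.open(fileobj=io.BytesIO(bytes(broken)), mode="r:").close()
    header_rejected = False
except tarfile.ReadError:
    header_rejected = True
print(header_rejected)  # the header checksum no longer matches
```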
So, one way or the other, compressing or md5summing tars seems prudent.