[MacPorts] #36560: Use hfsCompression
#36560: Use hfsCompression -------------------------+-------------------------------- Reporter: mfeiri@… | Owner: macports-tickets@… Type: enhancement | Status: new Priority: Normal | Milestone: Component: base | Version: 2.1.2 Keywords: haspatch | Port: -------------------------+-------------------------------- The attached patch enables the port activation phase in registry2.0/portimage.tcl to take advantage of hfsCompression on Mac OS X >= 10.6. Posting here for reference before committing to trunk.
-- Ticket URL: <https://trac.macports.org/ticket/36560> MacPorts <http://www.macports.org/> Ports system for Mac OS
#36560: Use hfsCompression --------------------------+-------------------------------- Reporter: mfeiri@… | Owner: macports-tickets@… Type: enhancement | Status: new Priority: Normal | Milestone: Component: base | Version: 2.1.2 Resolution: | Keywords: haspatch Port: | --------------------------+-------------------------------- Comment (by egall@…): Could we get some benchmarks showing how much of a difference it makes? -- Ticket URL: <https://trac.macports.org/ticket/36560#comment:1> MacPorts <http://www.macports.org/> Ports system for Mac OS
#36560: Use hfsCompression --------------------------+-------------------------------- Reporter: mfeiri@… | Owner: macports-tickets@… Type: enhancement | Status: new Priority: Normal | Milestone: Component: base | Version: 2.1.2 Resolution: | Keywords: haspatch Port: | --------------------------+-------------------------------- Comment (by ryandesign@…): Any link explaining what hfsCompression is? -- Ticket URL: <https://trac.macports.org/ticket/36560#comment:2> MacPorts <http://www.macports.org/> Ports system for Mac OS
#36560: Use hfsCompression --------------------------+-------------------------------- Reporter: mfeiri@… | Owner: macports-tickets@… Type: enhancement | Status: new Priority: Normal | Milestone: Component: base | Version: 2.1.2 Resolution: | Keywords: haspatch Port: | --------------------------+-------------------------------- Comment (by mfeiri@…):

HFS compression is a fully transparent way to save disk space and reduce disk I/O. Apple uses it for all of /usr, /bin, /sbin, and /System. You can easily verify this using "ls -lO". The man page of ditto describes this feature as "intended to be used in installation and backup scenarios that involve system files". Should be appropriate enough for macports. Ditto will only actually compress files "if appropriate".

Here are two links describing some technical details and benefits of transparent compression in HFS:

 * http://arstechnica.com/apple/2009/08/mac-os-x-10-6/3/
 * http://developercoach.com/2009/file-system-compression-in-hfs-space-savings-and-performance-gain/

And here are some numbers to show the effect of HFS compression on my system using 24 recently updated ports:

 * 787,056K (sum of relevant .tbz2 archives in /opt/local/var/macports/software/)
 * 2,958,600K (sum of relevant unarchived files not using HFS compression)
 * 1,407,148K (sum of relevant installed files using HFS compression)
 * 1,551,452K (sum of space saved on disk)
 * 52% space saved on disk

Finally, the list of ports I used to calculate the above values:

 * libpng-1.5.13_0+universal.darwin_12.i386-x86_64
 * gcc45-4.5.4_6.darwin_12.x86_64
 * gcc46-4.6.3_9.darwin_12.x86_64
 * llvm-2.9-2.9_12.darwin_12.x86_64
 * serf1-1.1.1_0.darwin_12.x86_64
 * clang-2.9-2.9_12+analyzer+python27.darwin_12.x86_64
 * swig-perl-2.0.8_2.darwin_12.noarch
 * swig-python-2.0.8_2.darwin_12.noarch
 * swig-2.0.8_2.darwin_12.x86_64
 * openmpi-1.6.2_0+gcc45.darwin_12.x86_64
 * llvm-3.0-3.0_11.darwin_12.x86_64
 * llvm-3.1-3.1_4.darwin_12.x86_64
 * clang-3.1-3.1_4+analyzer+python27.darwin_12.x86_64
 * db46-4.6.21_7+java+universal.darwin_12.i386-x86_64
 * sqlite3-3.7.14.1_0+universal.darwin_12.i386-x86_64
 * giflib-4.2.1_0+x11.darwin_12.x86_64
 * netpbm-10.60.01_0.darwin_12.x86_64
 * graphviz-2.28.0_8.darwin_12.x86_64
 * wireshark-1.8.3_0+no_python.darwin_12.x86_64
 * bind9-9.9.2_0.darwin_12.x86_64
 * flex-2.5.37_1.darwin_12.x86_64
 * subversion-1.7.7_0.darwin_12.x86_64
 * libusb-1.0.9_0.darwin_12.x86_64
 * mpfr-3.1.1-p2_0.darwin_12.x86_64

-- Ticket URL: <https://trac.macports.org/ticket/36560#comment:4> MacPorts <http://www.macports.org/> Ports system for Mac OS
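The savings figures in comment:4 can be rechecked with a few lines of shell. This is only a sketch of the arithmetic: the two input sizes are copied from the comment, and the function names `space_saved` and `percent_saved` are made up for this illustration.

```shell
#!/bin/sh
# Sizes in K, copied from comment:4.
uncompressed=2958600   # unarchived files without HFS compression
compressed=1407148     # installed files with HFS compression

# Space saved on disk: uncompressed size minus compressed size.
space_saved() { echo $(( $1 - $2 )); }

# Saved space as a whole-number percentage of the uncompressed size
# (shell integer division, so the result is rounded down).
percent_saved() { echo $(( ($1 - $2) * 100 / $1 )); }

echo "saved: $(space_saved "$uncompressed" "$compressed")K" \
     "($(percent_saved "$uncompressed" "$compressed")%)"
```

Both results agree with the 1,551,452K and 52% reported in the comment.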
#36560: Use hfsCompression --------------------------+-------------------------------- Reporter: mfeiri@… | Owner: macports-tickets@… Type: enhancement | Status: closed Priority: Normal | Milestone: Component: base | Version: 2.1.2 Resolution: fixed | Keywords: haspatch Port: | --------------------------+-------------------------------- Changes (by mfeiri@…): * status: new => closed * resolution: => fixed Comment: Committed in r98734 -- Ticket URL: <https://trac.macports.org/ticket/36560#comment:5> MacPorts <http://www.macports.org/> Ports system for Mac OS
#36560: Use hfsCompression --------------------------+-------------------------------- Reporter: mfeiri@… | Owner: macports-tickets@… Type: enhancement | Status: closed Priority: Normal | Milestone: Component: base | Version: 2.1.2 Resolution: fixed | Keywords: haspatch Port: | --------------------------+-------------------------------- Comment (by jmr@…): Doesn't this break hard links? And how much does it affect activation time, particularly on slower systems? -- Ticket URL: <https://trac.macports.org/ticket/36560#comment:6> MacPorts <http://www.macports.org/> Ports system for Mac OS
#36560: Use hfsCompression --------------------------+-------------------------------- Reporter: mfeiri@… | Owner: macports-tickets@… Type: enhancement | Status: closed Priority: Normal | Milestone: Component: base | Version: 2.1.2 Resolution: fixed | Keywords: haspatch Port: | --------------------------+-------------------------------- Comment (by jmr@…): Replying to [comment:6 jmr@…]:
> Doesn't this break hard links?
Yes, it does. Reverted in r98737 because of this regression.
> And how much does it affect activation time, particularly on slower systems?
It makes it slower by more than a factor of 10, apparently. One example:

file move:
{{{
sudo port -v activate git-core  1.98s user 0.68s system 80% cpu 3.305 total
}}}

ditto:
{{{
sudo port -v activate git-core  21.35s user 15.62s system 91% cpu 40.362 total
}}}

-- Ticket URL: <https://trac.macports.org/ticket/36560#comment:7> MacPorts <http://www.macports.org/> Ports system for Mac OS
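To put a number on "more than a factor of 10", the two timings in comment:7 can be divided directly. A small sketch, assuming only the figures quoted there; the helper name `ratio` is invented for this example.

```shell
#!/bin/sh
# Print a / b with one decimal place; awk is used because POSIX shell
# arithmetic is integer-only.
ratio() { awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'; }

echo "wall-clock slowdown: $(ratio 40.362 3.305)x"   # ditto total vs. file-move total
echo "user CPU slowdown:   $(ratio 21.35 1.98)x"     # ditto user vs. file-move user
```

For this single git-core activation the wall-clock ratio comes out at roughly 12x.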
#36560: Use hfsCompression --------------------------+-------------------------------- Reporter: mfeiri@… | Owner: macports-tickets@… Type: enhancement | Status: reopened Priority: Normal | Milestone: Component: base | Version: 2.1.2 Resolution: | Keywords: haspatch Port: | --------------------------+-------------------------------- Changes (by jmr@…): * status: closed => reopened * resolution: fixed => -- Ticket URL: <https://trac.macports.org/ticket/36560#comment:8> MacPorts <http://www.macports.org/> Ports system for Mac OS
#36560: Use hfsCompression --------------------------+---------------------- Reporter: mfeiri@… | Owner: mfeiri@… Type: enhancement | Status: new Priority: Normal | Milestone: Component: base | Version: 2.1.2 Resolution: | Keywords: haspatch Port: | --------------------------+---------------------- Changes (by mfeiri@…): * status: reopened => new * owner: macports-tickets@… => mfeiri@… Comment: Thanks for pointing out this regression. I've updated the patch to apply to ''extract_archive_to_tmpdir'' instead of ''_activate_file''. This way ditto can compress the entire directory tree of a port, which is a lot faster and seems to preserve hard links. I've also tried to hook the HFS compression directly into the unarchiving pipe, e.g. {{{bsdtar -cpf - --format cpio @${location} | ditto -xV --hfsCompress - $extractdir}}}, but it turned out that the conversion from tar to cpio does not preserve hard links https://github.com/libarchive/libarchive/wiki/Hardlinks. -- Ticket URL: <https://trac.macports.org/ticket/36560#comment:10> MacPorts <http://www.macports.org/> Ports system for Mac OS
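The difference between the two patch iterations can be illustrated without ditto at all. The sketch below assumes the regression came from copying each path independently, which turns a hard-linked pair into two unrelated files, whereas a whole-tree copy can recreate the link. It uses plain `cp` and `tar` as stand-ins; the helper names and file names are hypothetical and this is not the code from the patch.

```shell
#!/bin/sh
# Inode number of a path, used to tell whether two names are hard links.
inode() { ls -i "$1" | awk '{ print $1 }'; }
same_file() { [ "$(inode "$1")" = "$(inode "$2")" ]; }

# Copy every entry on its own: each copy gets a fresh inode.
copy_per_file() {   # $1 = source dir, $2 = destination dir
    mkdir -p "$2"
    for f in "$1"/*; do
        cp "$f" "$2/"
    done
}

# Copy the tree in one pass: tar records the second name as a hard link.
copy_tree() {       # $1 = source dir, $2 = destination dir
    mkdir -p "$2"
    (cd "$1" && tar -cf - .) | (cd "$2" && tar -xf -)
}

work=$(mktemp -d)
mkdir "$work/src"
echo data > "$work/src/tool"
ln "$work/src/tool" "$work/src/tool-alias"   # hard-linked pair

copy_per_file "$work/src" "$work/per_file"
copy_tree "$work/src" "$work/tree"

same_file "$work/per_file/tool" "$work/per_file/tool-alias" \
    && echo "per-file copy: still linked" || echo "per-file copy: link broken"
same_file "$work/tree/tool" "$work/tree/tool-alias" \
    && echo "tree copy: still linked" || echo "tree copy: link broken"
```

A check of this kind, run against an activated port, is one way to confirm that a change to the extraction step keeps hard links intact.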
#36560: Use hfsCompression --------------------------+---------------------- Reporter: mfeiri@… | Owner: mfeiri@… Type: enhancement | Status: new Priority: Normal | Milestone: Component: base | Version: 2.1.2 Resolution: | Keywords: haspatch Port: | --------------------------+---------------------- Comment (by jmr@…): Hmm, I can't say I'm thrilled with writing out the files twice. It might be best to implement the relevant bits in C (using libarchive). -- Ticket URL: <https://trac.macports.org/ticket/36560#comment:11> MacPorts <http://www.macports.org/> Ports system for Mac OS
#36560: Use hfsCompression --------------------------+---------------------- Reporter: mfeiri@… | Owner: mfeiri@… Type: enhancement | Status: new Priority: Normal | Milestone: Component: base | Version: 2.1.2 Resolution: | Keywords: haspatch Port: | --------------------------+---------------------- Comment (by mfeiri@…): AFAICT the only way for us to avoid writing files twice with {{{ditto}}} is to require the use of cpio instead of tar for archives. Not sure if this is a good idea. I'm also not sure if libarchive would accept patches for direct HFS compression, because AFAIK there is no official API for HFS compression and it is a bit of a hack anyway. Maybe one day we will get something like ZFS and we can simply configure truly transparent filesystem compression per directory... But for now {{{ditto}}} is the only available tool and filesystem compression is a desirable feature. -- Ticket URL: <https://trac.macports.org/ticket/36560#comment:12> MacPorts <http://www.macports.org/> Ports system for Mac OS
#36560: Use hfsCompression --------------------------+---------------------- Reporter: mfeiri@… | Owner: mfeiri@… Type: enhancement | Status: new Priority: Normal | Milestone: Component: base | Version: 2.1.2 Resolution: | Keywords: haspatch Port: | --------------------------+---------------------- Comment (by mfeiri@…): *bump* I've been happily using the second iteration of this patch for quite a while now. Hard links work fine and I guess adding one additional pass of I/O is as good as it gets (requiring cpio instead of tar or forking libarchive doesn't sound very attractive). I know some users are eager to see this patch in a released version of macports to save precious space on their SSD-equipped MacBooks. I'll commit to trunk again during the weekend. -- Ticket URL: <https://trac.macports.org/ticket/36560#comment:13> MacPorts <http://www.macports.org/> Ports system for Mac OS
#36560: Use hfsCompression --------------------------+---------------------- Reporter: mfeiri@… | Owner: mfeiri@… Type: enhancement | Status: new Priority: Normal | Milestone: Component: base | Version: 2.1.99 Resolution: | Keywords: haspatch Port: | --------------------------+---------------------- Changes (by jmr@…): * version: 2.1.2 => 2.1.99 Comment: Why would we need to fork libarchive? There are at least two implementations out there that turn on compression for individual files, and if you have that, you can do it as you extract with libarchive. There is a fundamental space/speed tradeoff here, and you don't get to decide that everyone wants to save space. At minimum, it needs to be a conf option. -- Ticket URL: <https://trac.macports.org/ticket/36560#comment:14> MacPorts <http://www.macports.org/> Ports system for Mac OS
#36560: Use hfsCompression --------------------------+---------------------- Reporter: mfeiri@… | Owner: mfeiri@… Type: enhancement | Status: new Priority: Normal | Milestone: Component: base | Version: 2.1.99 Resolution: | Keywords: haspatch Port: | --------------------------+---------------------- Comment (by egall@…): Replying to [comment:14 jmr@…]:
> Why would we need to fork libarchive? There are at least two implementations out there that turn on compression for individual files, and if you have that, you can do it as you extract with libarchive.
> There is a fundamental space/speed tradeoff here, and you don't get to decide that everyone wants to save space. At minimum, it needs to be a conf option.
How would one go about making this into a conf option? -- Ticket URL: <https://trac.macports.org/ticket/36560#comment:15> MacPorts <http://www.macports.org/> Ports system for Mac OS
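One possible shape for such a conf option, purely as an illustration: a yes/no key in a macports.conf-style file that the extraction step consults before asking for compression. The key name `hfscompression`, the helper `conf_bool`, and the sample file are all assumptions made for this sketch; MacPorts base is written in Tcl and would read its configuration through its own code, not through a shell function like this.

```shell
#!/bin/sh
# Look up a "key value" option in a conf file; print the default when the
# key is missing, commented out, or the file cannot be read.
conf_bool() {   # $1 = conf file, $2 = option name, $3 = default (yes|no)
    value=$(awk -v key="$2" '$1 == key { v = $2 } END { if (v != "") print v }' \
        "$1" 2>/dev/null) || true
    echo "${value:-$3}"
}

# Hypothetical conf file in which the user has opted out of compression.
conf=$(mktemp)
printf '%s\n' '# Trade activation speed for disk space' 'hfscompression no' > "$conf"

if [ "$(conf_bool "$conf" hfscompression yes)" = "yes" ]; then
    echo "activation would request HFS compression"
else
    echo "activation would extract without HFS compression"
fi
```

Which default to ship is exactly the space/speed question raised above; the sketch takes the default as its third argument and leaves that choice to the caller.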
#36560: Use hfsCompression --------------------------+---------------------- Reporter: mfeiri@… | Owner: mfeiri@… Type: enhancement | Status: new Priority: Normal | Milestone: Component: base | Version: 2.1.99 Resolution: | Keywords: haspatch Port: | --------------------------+---------------------- Comment (by mfeiri@…): Oh, when I created this patch there was no support for hfsCompression in libarchive. I just noticed that a couple of weeks later "experimental support for HFS+ Compression" was added to libarchive. Yay! I guess this means we can get rid of the additional round of i/o. Would you suggest to somehow depend on our port of libarchive or to import libarchive/bsdtar into the base macports distribution? I would have assumed that for the kinds of files installed by macports (text and executables) the cost/benefit tradeoff is clearly in favor of unconditionally enabling compression. Apple also uses hfsCompression in /usr/bin and similar locations. Once I have some spare time again I will look into extending the patch to allow to opt out of hfsCompression. -- Ticket URL: <https://trac.macports.org/ticket/36560#comment:16> MacPorts <http://www.macports.org/> Ports system for Mac OS
#36560: Use hfsCompression --------------------------+---------------------- Reporter: mfeiri@… | Owner: mfeiri@… Type: enhancement | Status: new Priority: Normal | Milestone: Component: base | Version: 2.1.99 Resolution: | Keywords: haspatch Port: | --------------------------+---------------------- Comment (by jmr@…): Replying to [comment:16 mfeiri@…]:
> Oh, when I created this patch there was no support for hfsCompression in libarchive. I just noticed that a couple of weeks later "experimental support for HFS+ Compression" was added to libarchive. Yay! I guess this means we can get rid of the additional round of i/o. Would you suggest to somehow depend on our port of libarchive or to import libarchive/bsdtar into the base macports distribution?
Well, I guess using `${prefix}/bin/bsdtar --hfsCompression ...` would be simplest, but the disadvantage would be it could only be used after the port is installed.
-- Ticket URL: <https://trac.macports.org/ticket/36560#comment:17> MacPorts <http://www.macports.org/> Ports system for Mac OS
#36560: Use hfsCompression --------------------------+---------------------- Reporter: mfeiri@… | Owner: mfeiri@… Type: enhancement | Status: new Priority: Normal | Milestone: Component: base | Version: 2.1.99 Resolution: | Keywords: haspatch Port: | --------------------------+---------------------- Comment (by ryandesign@…): As of MacPorts 2.3.0 we now have the infrastructure in place to bundle other software packages with MacPorts base... -- Ticket URL: <https://trac.macports.org/ticket/36560#comment:18> MacPorts <http://www.macports.org/> Ports system for OS X