[MacPorts] #16373: svn fetch type should maintain a persistent working copy
#16373: svn fetch type should maintain a persistent working copy
-------------------------------------+--------------------------------------
 Reporter:  ryandesign@macports.org  |       Owner:  macports-tickets@lists.macosforge.org
     Type:  enhancement              |      Status:  new
 Priority:  Normal                   |   Milestone:  MacPorts base enhancements
Component:  base                     |     Version:  1.7.0
 Keywords:                           |
-------------------------------------+--------------------------------------

"`fetch.type svn`" is inefficient in that it checks out a new working copy every time, directly to the work area. That would be like a normal port downloading the distfile every time. Instead, we should check out a working copy to that port's distpath, and then in the extract phase we should `svn export` it to the work area.

Some checks will be needed in the fetch phase to ensure that an existing working copy:

 * has no modifications: check `svn status`. Ideally we would try to clean up the working copy, for example by `svn revert`ing modified or added or deleted files, and then in a second `svn status` run, delete any unversioned files. But it's already an improvement if we just discard the working copy if `svn status --ignore-externals` produces any output.
 * is from the right URL: check `svn info`: check if the "URL" is the one we want. If not, check that the "Repository Root" is a substring of the repository we want. If yes, try to `svn switch` to the URL and revision we want; if not, discard the working copy.

So the fetch phase would go something like...
{{{
if {working copy exists} {
    if {working copy has modifications} {
        delete working copy
    }
}
if {working copy exists} {
    if {working copy url is the one we want} {
        svn update to the desired revision
    } else {
        if {working copy repository root matches beginning of desired url} {
            try to svn switch to the desired url and revision
            if {an error occurred} {
                delete working copy
            }
        } else {
            delete working copy
        }
    }
}
if {working copy doesn't exist} {
    check out working copy
}
}}}

And the extract phase is simply to `svn export` the working copy from the distpath to the worksrcpath.

(There is [http://subversion.tigris.org/issues/show_bug.cgi?id=2429 one problem] if the working copy has externals and the user is using Subversion earlier than 1.5, for example Subversion 1.4.whatever which is included with Leopard. But rather than spend time working around this in base, I think this is a case where the port should depend on MacPorts subversion.)

--
Ticket URL: <http://trac.macports.org/ticket/16373>
MacPorts <http://www.macports.org/>
Ports system for Mac OS
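The fetch-phase pseudocode in the ticket description can be read as a small decision procedure. The following is only a sketch of that logic in Python, not MacPorts code (a real implementation would presumably be Tcl in base). The function name `plan_svn_fetch`, its parameters, and the action strings are hypothetical; the inputs stand in for what `svn status` and `svn info` would report.

```python
from typing import List, Optional


def plan_svn_fetch(wc_exists: bool,
                   has_modifications: bool,
                   wc_url: Optional[str],
                   wc_root: Optional[str],
                   want_url: str,
                   switch_ok: bool = True) -> List[str]:
    """Return the ordered actions the proposed fetch phase would take.

    All names here are hypothetical. `switch_ok` simulates whether
    `svn switch` would succeed, so the error path can be exercised.
    """
    actions: List[str] = []

    # A modified working copy is simply discarded.
    if wc_exists and has_modifications:
        actions.append("delete")
        wc_exists = False

    if wc_exists:
        if wc_url == want_url:
            # Same URL: just move to the desired revision.
            actions.append("update")
        elif wc_root is not None and want_url.startswith(wc_root):
            # Same repository, different path: try to switch.
            actions.append("switch")
            if not switch_ok:
                actions.append("delete")
                wc_exists = False
        else:
            # Different repository: the working copy is useless.
            actions.append("delete")
            wc_exists = False

    # Anything that ended without a working copy needs a fresh checkout.
    if not wc_exists:
        actions.append("checkout")
    return actions
```

For example, a clean working copy of `https://example.org/repo/trunk` asked to serve `https://example.org/repo/tags/1.0` yields `["switch"]`, while one from an unrelated repository yields `["delete", "checkout"]`.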
#16373: svn git and hg fetch type should maintain a persistent working copy
-------------------------------------+--------------------------------------
 Reporter:  ryandesign@…             |       Owner:  macports-tickets@…
     Type:  enhancement              |      Status:  new
 Priority:  Normal                   |   Milestone:  MacPorts Future
Component:  base                     |     Version:  1.7.0
 Keywords:  performance fetch        |        Port:
-------------------------------------+--------------------------------------
Changes (by jeremyhu@…):

 * keywords:  => performance fetch

Comment:

This should be done for mercurial and git as well. It's quite annoying to have to redownload sources every time through my debug iteration even though they haven't changed.

--
Ticket URL: <https://trac.macports.org/ticket/16373#comment:1>
#16373: svn git and hg fetch type should maintain a persistent working copy

Comment (by mojca@…):

I'm looking at options for git. The following commands result in stable checksums:

{{{
git archive {shasum_or_branch} > /path/to/name_version.tar
gzip < /path/to/name_version.tar > /path/to/name_version.tar.gz
}}}

{{{
git archive {shasum_or_branch} > /path/to/name_version.tar
gzip -n /path/to/name_version.tar
}}}

{{{
git archive {shasum_or_branch} | gzip -n > /path/to/name_version.tar.gz
}}}

{{{
git archive {shasum_or_branch} | xz > /path/to/name_version.tar.xz
}}}

The first option results in a different checksum than the other two. I didn't try to understand the difference in the approaches, but in either case that would allow users to store the resulting compressed file, verify the checksums and store the file on MacPorts' server. (Optionally the resulting file could be touched to get the same timestamp as the contents, but that's not a strict requirement.)

--
Ticket URL: <https://trac.macports.org/ticket/16373#comment:8>
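The comment above does not explain why the gzip variants differ. One possible factor, offered here as an assumption rather than a diagnosis of those exact commands, is that the gzip header carries a modification-time field (and optionally a file name), which `gzip -n` leaves out. The sketch below uses Python's standard `gzip` module to show the effect of that header field on the output bytes; the payload and the helper name `gzip_bytes` are made up for illustration.

```python
import gzip


def gzip_bytes(data: bytes, mtime: int) -> bytes:
    """Compress data, writing the given timestamp into the gzip header."""
    return gzip.compress(data, mtime=mtime)


# Stand-in for the tar stream that `git archive` would produce.
payload = b"pretend this is the output of git archive"

# A fixed timestamp gives byte-identical output on every run.
same_a = gzip_bytes(payload, 0)
same_b = gzip_bytes(payload, 0)

# Different timestamps give different bytes for the same content.
diff_a = gzip_bytes(payload, 1000000000)
diff_b = gzip_bytes(payload, 1000000001)
```

Both `diff_a` and `diff_b` decompress to the same payload, so only the checksum of the compressed file is affected, not the extracted sources.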
#16373: svn git and hg fetch type should maintain a persistent working copy

Comment (by ryandesign@…):

I'm not sure how this relates to this ticket. The solution I'm envisioning for this issue (in Subversion parlance, though I'm sure git and hg have equivalent concepts) is maintaining a persistent working copy which would be updated and switched as needed, or in extreme cases deleted and recreated, not creating any tarball, keeping any checksums, or uploading any file to a MacPorts server.

--
Ticket URL: <https://trac.macports.org/ticket/16373#comment:9>
#16373: svn git and hg fetch type should maintain a persistent working copy

Comment (by snc@…):

For git, we could store the downloaded repository in the distfiles directory. If the local repo doesn't exist, `git clone`; if it does exist, `git reset --hard && git pull`. This repo can then be locally cloned or checked out to the working directory.

--
Ticket URL: <https://trac.macports.org/ticket/16373#comment:10>
#16373: svn git and hg fetch type should maintain a persistent working copy

Comment (by mojca@…):

Sure, keeping the whole repository (and cleaning it in case it turns out to be "broken" or changed in unexpected ways) would be the optimal solution, but the solution I was talking about would probably be a lot faster to implement: it would be similar to what the `GitHub PortGroup` does, for example. It fetches a `.tar.gz` file from GitHub (even though it could clone the git repo) and calculates the checksums. If the checksum matches, all is well and a copy of that file gets mirrored on one of the MacPorts servers.

The solution I suggest would:

 * check if `${distfile}` exists
 * if not, clone the git repository and create a `${name.version}.tar.gz`/`${name.version}.tar.xz` of the desired branch/tag/version in `${distpath}`, delete the temporary git clone
 * verify the checksums, extract the contents as usual
 ...

So something similar to what the GitHub and BitBucket PortGroups already do (except that those fetch the distfiles from the server already). I mentioned this because I believe it would be relatively easy to implement and it would allow keeping a mirror of a particular version on the server.

--
Ticket URL: <https://trac.macports.org/ticket/16373#comment:11>
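The first and last steps of the tarball proposal above (does the distfile exist, and does its checksum match) can be sketched as follows. This is an illustration in Python only, not how MacPorts base verifies checksums; the function names and the choice of SHA-256 are assumptions, and the clone-and-archive step in between is deliberately left out.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large tarballs need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def distfile_status(distfile: Path, expected_sha256: str) -> str:
    """Classify a generated snapshot tarball (hypothetical helper).

    "missing"  -> clone the repository and create the tarball first
    "ok"       -> checksum matches, extraction can proceed as usual
    "mismatch" -> the stored tarball is not the one the port expects
    """
    if not distfile.exists():
        return "missing"
    if sha256_of(distfile) == expected_sha256:
        return "ok"
    return "mismatch"
```

This only works as a verification step if the tarball is generated reproducibly, which is the point of the stable-checksum experiments earlier in the thread.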
#16373: svn git and hg fetch type should maintain a persistent working copy

Comment (by mojca@…):

The problem is that I'm now trying to push some projects into making GitHub clones just for the sake of being able to avoid constant re-fetching of the sources from a random git repository. I would be really really grateful if MacPorts would get the ability to store the old repository and/or to mirror snapshots in the form of .tar.[gz|bz2|xz] files.

I suspect that solution would need to be implemented for each system separately anyway (different commands for svn, git and hg). I wanted to push the issue to start with git, which is probably most widely used.

I would like to add a new port and I'm trying to figure out whether I should:

 * make an unofficial mirror on GitHub (in my user account)
 * deal with the pain of re-fetching from the original repository
 * or make sure that the issue gets fixed in MacPorts

I would prefer the last one.

--
Ticket URL: <https://trac.macports.org/ticket/16373#comment:12>
#16373: svn git and hg fetch type should maintain a persistent working copy

Comment (by cal@…):

Keeping a (bare repo) clone of the whole thing would speed up fetching even after a port is updated, though. Packaging tarballs wouldn't.

Also we can't easily avoid the git dependency because by the time the fetch phase is started we wouldn't know whether our mirrors already had a generated tarball or we'd have to fetch from git.

I guess getting this implemented using bare clones wouldn't be so hard after all. For git, you'd have to

 - generate a unique identifier from the repository URL (e.g. using a hash function)
 - test whether $cachedir/$identifier is a valid git repository
 - create a bare clone if it isn't, run git fetch if it is
 - export the version/revision/tag you need from $cachedir/$identifier into $worksrcdir.

I think that's actually easier to implement than getting the mirroring stuff you propose into the scripts that update our distfile mirrors.

--
Ticket URL: <https://trac.macports.org/ticket/16373#comment:13>
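The four bare-clone steps listed above can be sketched as a few pure functions that only build the git command lines, without running them. This is a Python illustration under assumptions, not MacPorts code: the function names are hypothetical, SHA-256 is just one possible "hash function", the validity check is passed in as a boolean, and the explicit refspec on `git fetch` is one way to make a bare clone's branches follow the remote.

```python
import hashlib
from pathlib import Path
from typing import List


def cache_dir_for(cachedir: Path, url: str) -> Path:
    """Step 1: derive a stable identifier from the repository URL."""
    identifier = hashlib.sha256(url.encode("utf-8")).hexdigest()
    return cachedir / identifier


def fetch_command(url: str, repo: Path, is_valid_repo: bool) -> List[str]:
    """Steps 2 and 3: fetch into an existing bare clone, else create one.

    `is_valid_repo` is supplied by the caller; how to test validity is
    left open here.
    """
    if is_valid_repo:
        return ["git", f"--git-dir={repo}", "fetch", "--tags", url,
                "+refs/heads/*:refs/heads/*"]
    return ["git", "clone", "--bare", url, str(repo)]


def export_command(repo: Path, revision: str) -> List[str]:
    """Step 4: emit a tar stream of one revision from the cached clone.

    The output would still have to be unpacked into the work directory.
    """
    return ["git", f"--git-dir={repo}", "archive", "--format=tar", revision]
```

Because the identifier depends only on the URL, two ports that fetch from the same repository would share one cached clone in this scheme.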
#16373: svn git and hg fetch type should maintain a persistent working copy

Comment (by mojca@…):

Replying to [comment:13 cal@…]:
> Keeping a (bare repo) clone of the whole thing would speed up fetching even after a port is updated, though. Packaging tarballs wouldn't.

Yes, that would be a huge benefit over tarballs.

> Also we can't easily avoid the git dependency because by the time the fetch phase is started we wouldn't know whether our mirrors already had a generated tarball or we'd have to fetch from git.

I don't think that getting rid of the dependency on git would be of any substantial benefit.

> I guess getting this implemented using bare clones wouldn't be so hard after all. For git, you'd have to
>  - generate a unique identifier from the repository URL (e.g. using a hash function)
>  - test whether $cachedir/$identifier is a valid git repository
>  - create a bare clone if it isn't, run git fetch if it is
>  - export the version/revision/tag you need from $cachedir/$identifier into $worksrcdir.

I would also suggest to add/check the SHA sum of the commit (even when dealing with tags) just to be on the safe side.

> I think that's actually easier to implement than getting the mirroring stuff you propose into the scripts that update our distfile mirrors.

I'm too clumsy when it comes to Tcl (I've learnt to handle the `Portfiles`, but changing anything in base is still too complex for me). I would be thrilled if someone would be willing and able to implement this.

Once that gets implemented – how would you handle GitHub and BitBucket from that point on? And how would you handle situations when the servers go offline? Would you mirror the bare repository on one of MacPorts servers? (This is of course less important.)

--
Ticket URL: <https://trac.macports.org/ticket/16373#comment:14>
#16373: svn git and hg fetch type should maintain a persistent working copy

Comment (by snc@…):

Replying to [comment:14 mojca@…]:
> I would also suggest to add/check the SHA sum of the commit (even when dealing with tags) just to be on the safe side.

Using commitish over tags is helpful and uniform.

> Once that gets implemented – how would you handle GitHub and BitBucket from that point on? And how would you handle situations when the servers go offline? Would you mirror the bare repository on one of MacPorts servers? (This is of course less important.)

There's no need to mirror their repositories. The authors can easily host it elsewhere and we simply update the portfile.

--
Ticket URL: <https://trac.macports.org/ticket/16373#comment:15>
#16373: svn git and hg fetch type should maintain a persistent working copy

Comment (by mojca@…):

I wasn't talking about moving the git repositories. I meant situations when the server is not accessible for several days. Or when the sources disappear completely (there are certain tar.gz files that are only present on MacPorts mirrors and can still be installed, but are otherwise long gone from the web).

--
Ticket URL: <https://trac.macports.org/ticket/16373#comment:16>
#16373: svn git and hg fetch type should maintain a persistent working copy

Comment (by snc@…):

Replying to [comment:16 mojca@…]:
> I believe that both the SHA sum and the tag should be present. Tag doesn't always represent the exact version number (sometimes the version needs to be set separately for a github project anyway, but is often clear and helpful, often even for livecheck).

And sometimes tags are never used.

> I wasn't talking about moving the git repositories. I meant situations when the server is not accessible for several days. Or when the sources disappear completely (there are certain tar.gz files that are only present on MacPorts mirrors and can still be installed, but are otherwise long gone from web).

So we have two issues here: it's not a distfile, and keeping the whole repo would mean we have to manage history rewrites on our servers.

--
Ticket URL: <https://trac.macports.org/ticket/16373#comment:17>
#16373: svn git and hg fetch type should maintain a persistent working copy

Comment (by ryandesign@…):

Replying to [comment:16 mojca@…]:
> I wasn't talking about moving the git repositories. I meant situations when the server is not accessible for several days. Or when the sources disappear completely (there are certain tar.gz files that are only present on MacPorts mirrors and can still be installed, but are otherwise long gone from web).

I consider that scenario to be outside the scope of this ticket.

If I get around to working on this issue, I would begin with the Subversion portion, since that's the version control system I'm most familiar with.

--
Ticket URL: <https://trac.macports.org/ticket/16373#comment:18>
#16373: svn git and hg fetch type should maintain a persistent working copy

Comment (by mojca@…):

OK, it could be a mandatory SHA sum and an optional tag (or maybe this needs a bit of rethinking).

One thing that I would also like to see supported out of the box (but is otherwise completely independent and also outside of scope of this ticket) is creating a version string like `3.14-beta-20140314-{short_SHA}`. I mean: provided a full SHA string, I would like to be able to extract both date (just for "sorting" the increasing version) and a shortened version of the SHA sum.

But keeping a copy on MacPorts mirrors is definitely a lower priority than getting this functionality to work in the first place.

--
Ticket URL: <https://trac.macports.org/ticket/16373#comment:19>
#16373: svn git and hg fetch type should maintain a persistent working copy

Comment (by snc@…):

Replying to [comment:18 ryandesign@…]:
> If I get around to working on this issue

Could you give further guidance on this so that others who aren't as familiar with base might try to help out?

--
Ticket URL: <https://trac.macports.org/ticket/16373#comment:20>
#16373: svn git and hg fetch type should maintain a persistent working copy

Comment (by cal@…):

I'm not sure we really need a mandatory SHA sum. We currently trust git (or any other version control system) to do the right thing automatically when specifying tags (and not using github or setting `fetch.type git`). I'm also not sure how to implement a SHA sum of a complete source tree.

As for the version string, try `git describe`, it might generate what you want.

This needs to be implemented in browser:trunk/base/src/port1.0/portfetch.tcl; there are a couple of procs named portfetch::${vcs}fetch where this would have to be implemented.

--
Ticket URL: <https://trac.macports.org/ticket/16373#comment:21>
#16373: svn git and hg fetch type should maintain a persistent working copy

Comment (by ryandesign@…):

Replying to [comment:21 cal@…]:
> This needs to be implemented in browser:trunk/base/src/port1.0/portfetch.tcl; there are a couple of procs named portfetch::${vcs}fetch where this would have to be implemented.

Currently, when using a non-distfile fetch.type, those procs fetch directly into workpath, and the extract phase does nothing; the extract phase would also have to be updated to do something.

--
Ticket URL: <https://trac.macports.org/ticket/16373#comment:22>
#16373: svn git and hg fetch type should maintain a persistent working copy

Comment (by mojca@…):

Replying to [comment:21 cal@…]:
> I'm also not sure how to implement a SHA sum of a complete source tree.

One option is to generate a `.tar` or a `.tar.[gz|bz2|xz]` and calculate the checksum of that. There are other options for sure.

> As for the version string, try `git describe`, it might generate what you want.

I meant something that would easily be accessible in Tcl, so that I could specify something like

{{{
git.branch      ...sha...
version         "3.14-beta-${git.commitdate}-${git.shortsha}"
}}}

I would need to learn how to interface git and Tcl first to implement that.

--
Ticket URL: <https://trac.macports.org/ticket/16373#comment:23>
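The version string requested above combines a base version, the commit date, and a shortened SHA. As a rough sketch of the string assembly only, here is a Python function; `snapshot_version` and its parameters are hypothetical, as are the `git.commitdate` and `git.shortsha` options in the comment, which do not exist in MacPorts as far as this thread shows. It assumes the commit timestamp has already been obtained from git (for instance with `git show -s --format=%ct <sha>`).

```python
from datetime import datetime, timezone


def snapshot_version(base: str, commit_epoch: int, full_sha: str,
                     sha_len: int = 7) -> str:
    """Build a sortable snapshot version such as 3.14-beta-20140314-0123456.

    commit_epoch is the commit time in seconds since the Unix epoch; the
    date is rendered in UTC so the result does not depend on the local
    time zone of the machine running the port.
    """
    date = datetime.fromtimestamp(commit_epoch, tz=timezone.utc).strftime("%Y%m%d")
    return f"{base}-{date}-{full_sha[:sha_len]}"
```

Putting the date before the short SHA is what makes successive snapshots sort in increasing order, since SHA prefixes themselves carry no ordering.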
#16373: svn git and hg fetch type should maintain a persistent working copy

Changes (by larryv@…):

 * owner:  macports-tickets@… => larryv@…
 * cc: larryv@… (removed)

--
Ticket URL: <https://trac.macports.org/ticket/16373#comment:25>
#16373: svn git and hg fetch type should maintain a persistent working copy

Changes (by larryv@…):

 * status:  new => assigned

--
Ticket URL: <https://trac.macports.org/ticket/16373#comment:26>
#16373: svn git and hg fetch type should maintain a persistent working copy

Comment (by devans@…):

Note that this is also an issue for ports that use bzr fetches, such as inkscape-devel.

--
Ticket URL: <https://trac.macports.org/ticket/16373#comment:27>
#16373: base should maintain a persistent working copy for all supported VCS fetches

--
Ticket URL: <https://trac.macports.org/ticket/16373#comment:28>