fetch.type git & GitHub submodules (was: [133168] trunk/dports/sysutils)

Rainer Müller raimue at macports.org
Sat Mar 7 08:50:08 PST 2015


On 2015-03-04 23:41, Mojca Miklavec wrote:
>> b) tarballs are generated automatically on the server after the Portfile
>> was committed
> 
> I was thinking of that option. (Tarballs would also be automatically
> generated on the user's machine straight from VCS if the file wouldn't
> exist on the mirror.)
> 
>> I would prefer b) for the simple fact that this ensures that the
>> maintainer did not modify any of the files and would be closer to our
>> existing distfiles mirroring. The infrastructure changes would be small
>> if it can be integrated into what the existing 'port mirror' does.
>> However, checksums for the generated tarball are definitely not known at
>> the time the Portfile is committed.
> 
> Why not? At least for GIT I can show you a trivial way to create a
> compressed file in a repeatable way. That way anyone would get the
> same checksums and the maintainer can easily add the checksums to the
> Portfile *before* committing.

The problem is getting it right for every VCS. For example, the mtime
of files is sometimes set to the time of checkout rather than the time
of commit. Built-in commands like 'git archive' solve that, but not
every VCS offers one.

For example, a 'svn export' sets the commit time and date for files, but
directories get the current time and date. However, I assume we can
determine an appropriate value for the mtime of directories and apply
that after the export, but we need to do it manually.
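
As a rough illustration of that manual step, here is a minimal Python
sketch. It assumes that "an appropriate value" means the newest mtime
found among a directory's entries, which is only one possible choice.
The function name and the fallback for empty directories are made up
for this example.

```python
import os


def normalize_dir_mtimes(root, fallback_ns=0):
    """Set every directory's mtime to the newest mtime among its entries.

    Assumption: "newest entry" is the rule we want. Empty directories
    get fallback_ns (nanoseconds since the epoch).
    """
    # Walk bottom-up so subdirectories are normalized before their parents.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        entries = [os.path.join(dirpath, name) for name in dirnames + filenames]
        # lstat: use a symlink's own timestamp, not that of its target.
        newest = max((os.lstat(p).st_mtime_ns for p in entries),
                     default=fallback_ns)
        os.utime(dirpath, ns=(newest, newest))
```

Running normalize_dir_mtimes("foo") after 'svn export' should leave the
directory timestamps dependent only on the exported files, not on when
the export was run.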

Possible solutions I can think of:

 1) with BSD tar, reproducible ownership and mtimes could be achieved
    by generating an appropriate mtree file. Quick example:

    $ tar --format mtree -cf foo.mtree foo
    $ sed -i '' -e 's/[[:<:]]uname=[^ ]*/uname=macports/g' foo.mtree
      # ... also fix gname, uid, gid, mtime, mode, ...
    $ tar -cJf foo.tar.xz @foo.mtree

 2) use libarchive and inject the metadata we want directly.

 3) clone using git-svn and then create the tarballs with 'git archive'
    from this repository.
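
To make the idea behind 2) more concrete, here is a hedged sketch that
uses Python's tarfile module instead of libarchive. It overwrites
owner, group, mode and mtime on every entry before writing it. The
function names, the 'macports' owner (borrowed from the sed example
above) and the fixed xz preset are assumptions for illustration. The
mtime would have to come from the VCS, for example the commit time.
Whether xz output is byte-identical across liblzma versions is also an
assumption that would need checking.

```python
import hashlib
import io
import lzma
import os
import tarfile


def _normalize(info, mtime):
    """Replace per-machine metadata on one archive entry with fixed values."""
    info.uid = info.gid = 0
    info.uname = info.gname = "macports"  # assumed owner name
    info.mtime = mtime
    # Keep only "directory/executable or not"; drop all other mode bits.
    info.mode = 0o755 if (info.isdir() or info.mode & 0o100) else 0o644
    return info


def reproducible_tar_xz(src_dir, out_path, mtime):
    """Write src_dir to out_path as .tar.xz and return its SHA-256 digest."""
    arcname = os.path.basename(os.path.normpath(src_dir))
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
        # Since Python 3.7, recursive add() visits entries in sorted order.
        tar.add(src_dir, arcname=arcname,
                filter=lambda info: _normalize(info, mtime))
    data = lzma.compress(buf.getvalue(), preset=6)
    with open(out_path, "wb") as out:
        out.write(data)
    return hashlib.sha256(data).hexdigest()
```

Calling reproducible_tar_xz("foo", "foo.tar.xz", mtime=...) twice on the
same tree with the same mtime should return the same digest, regardless
of who checked the files out or when.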

> For other VCS it might be just slightly more complicated (I'm not so
> familiar with them), but probably not much. One just needs to make
> sure that all the files have reproducible/stable timestamps, including
> the tar file.

That's exactly the point. Git has it built in, but other systems do not,
so it needs to be implemented for them. Or we only do it for systems
for which we have already verified that the result is reproducible.

>> One solution for this would be to add an additional file in the port
>> directory after tarball generation that holds the checksums. Or, the
>> generated tarballs are also signed by the job that generated them. With
>> the signature it is possible to verify that this is the intended file
>> without distributing any additional checksum through other channels.
> 
> That sounds way too complex.

Why? Signing tarballs is the approach we use for binary packages already.

> So what exactly did I do wrong when I managed to get the same
> checksums on three machines?
> 
> Here's what I did:
>     git clone <some project> && cd <project>
>     git archive <shasum> | xz > ../test.tar.xz
>     md5 ../test.tar.xz

Nothing wrong here, git just has this builtin. If we can find a
reproducible way for every VCS, then this approach with a checksum in
the Portfile and tarballs generated on the server will be feasible.
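
On the client side, the check would then be the same no matter whether
the tarball came from the mirror or was generated locally. A minimal
sketch, assuming a SHA-256 checksum is recorded in the Portfile; the
function names are hypothetical and this is not how MacPorts base
implements it:

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large tarballs need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_portfile_checksum(path, expected_sha256):
    """Return True if the tarball at path has the expected SHA-256 digest."""
    return sha256_of(path) == expected_sha256.strip().lower()
```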

Rainer

