fetch.type git & GitHub submodules (was: [133168] trunk/dports/sysutils)

Mojca Miklavec mojca at macports.org
Wed Mar 4 14:41:54 PST 2015


On Wed, Mar 4, 2015 at 10:53 PM, Rainer Müller wrote:
> On 2015-03-04 22:27, Mojca Miklavec wrote:
>>> I agree with you, creating the distfiles from VCS would be possible.
>>>
>>> There could be a target to be run on 'port mirror' that downloads and
>>> creates a tarball if a non-default fetch.type is used. That alone would
>>> reduce multiple downloads and even makes port development faster.
>>>
>>> However, for end-users, there is the problem that we would need to
>>> distribute checksums for these tarballs (or rely on signatures only?).
>>
>> Of course we would have to distribute the checksums in that case, like
>> for any other port. What exactly is considered a "problem" here?
>
> There are two options:
>
> a) the maintainer generates the tarball locally, uploads it to the main
> mirror and also adds an additional checksum to the Portfile before
> committing it

No, this is certainly not what I had in mind.

(But now that you mention it, that reminds me that we might sometimes
want other complex strategies to fetch files, not just from VCS. Take
wxPython, where we only extract 400 kB out of a 50 MB file: it would
be a waste to have to store and fetch all 50 MB.)

> b) tarballs are generated automatically on the server after the Portfile
> was committed

I was thinking of that option. (Tarballs would also be automatically
generated on the user's machine straight from VCS if the file didn't
exist on the mirror.)

> I would prefer b) for the simple fact that this ensures that the
> maintainer did not modify any of the files and would be closer to our
> existing distfiles mirroring. The infrastructure changes would be small
> if it can be integrated into what the existing 'port mirror' does.
> However, checksums for the generated tarball are definitely not known at
> the time the Portfile is committed.

Why not? At least for Git I can show you a trivial way to create a
compressed file in a repeatable way. That way anyone would get the
same checksums and the maintainer can easily add the checksums to the
Portfile *before* committing.

For other VCS it might be just slightly more complicated (I'm not so
familiar with them), but probably not much. One just needs to make
sure that all the files have reproducible/stable timestamps, including
the tar file.
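
For a VCS without an equivalent of "git archive", one possible way to
normalize the metadata is sketched below. It is only an illustration
and assumes GNU tar (the --sort option needs version 1.28 or newer;
the bsdtar shipped with OS X does not accept these flags). The
directory name "export", the variable names, and the fixed timestamp
are all made up for the example.

```shell
#!/bin/sh
set -eu

scratch=$(mktemp -d)
mkdir "$scratch/export"            # hypothetical exported source tree
echo 'hello' > "$scratch/export/README"

fixed_epoch=1425427200             # arbitrary fixed timestamp (assumption)

# Pack the tree with normalized metadata: sorted entry order, one fixed
# mtime for every entry, and numeric owner/group 0 instead of the local user.
make_tarball() {
    tar --sort=name --mtime="@$fixed_epoch" \
        --owner=0 --group=0 --numeric-owner \
        -C "$scratch" -cf - export
}

tar_result=skipped
if tar --version 2>/dev/null | grep -q 'GNU tar'; then
    make_tarball > "$scratch/a.tar"
    touch "$scratch/export/README"  # changes the on-disk mtime only
    make_tarball > "$scratch/b.tar"
    if cmp -s "$scratch/a.tar" "$scratch/b.tar"; then
        tar_result=identical
    else
        tar_result=different
    fi
fi
echo "$tar_result"
rm -rf "$scratch"
```

The sketch only compares the uncompressed tar streams; compressing
them afterwards would be a separate step.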

> One solution for this would be to add an additional file in the port
> directory after tarball generation that holds the checksums. Or, the
> generated tarballs are also signed by the job that generated them. With
> the signature it is possible to verify that this is the intended file
> without distributing any additional checksum through other channels.

That sounds way too complex.

> In general, note that generating a tarball might include timestamps,
> usernames, and other metadata.

I just checked. I tried to generate a .tar.xz file on three different
machines: one Mac OS X, two Linux boxes. A different username and a
different userid/groupid on every machine. And I got exactly the same
checksum on all three machines.

> Generating it multiple times, locally by
> the maintainer and once again on the server, will not always give the
> same results. Although that would be the closest to what we do for
> distfiles at the moment, combining the checksum in Portfile from a)
> *and* the automatic generation on the server from b) is not possible.

So what exactly did I do wrong when I managed to get the same
checksums on three machines?

Here's what I did:
    git clone <some project> && cd <project>
    git archive <shasum> | xz > ../test.tar.xz
    md5 ../test.tar.xz    # on Linux: md5sum ../test.tar.xz

I admit that I don't have enough experience to claim that this would
generate the same checksums in all the possible scenarios that one can
think of and all hardware that one can imagine, but I would be
comfortable enough to speculate that in most cases there shouldn't be
any difference on the user's Mac and the server.
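
A self-contained way to repeat the check without a second machine is
sketched below. It is an approximation, not the exact test described
above: a throwaway repository stands in for <some project>, a fresh
clone plays the role of the other machine, and the xz step is left out
so that only git and cmp are needed. The names "project", "clone",
"a.tar" and "b.tar" are made up for the example.

```shell
#!/bin/sh
set -eu

scratch=$(mktemp -d)

# Hypothetical one-file repository standing in for <some project>.
git init -q "$scratch/project"
(
    cd "$scratch/project"
    echo 'hello' > README
    git add README
    git -c user.name=example -c user.email=example@example.invalid \
        commit -q -m 'initial commit'
)
commit=$(git -C "$scratch/project" rev-parse HEAD)

# A fresh clone plays the role of the second machine.
git clone -q "$scratch/project" "$scratch/clone"

# Archive the same commit from both copies. When given a commit ID,
# git archive uses the commit time as the mtime of every entry.
git -C "$scratch/project" archive --format=tar "$commit" > "$scratch/a.tar"
git -C "$scratch/clone" archive --format=tar "$commit" > "$scratch/b.tar"

# Byte-for-byte comparison instead of a checksum tool, since md5 and
# md5sum are not available under the same name everywhere.
if cmp -s "$scratch/a.tar" "$scratch/b.tar"; then
    archive_result=identical
else
    archive_result=different
fi
echo "$archive_result"
rm -rf "$scratch"
```

Note that this says nothing about differences between git or xz
versions on different machines; it only shows that the archive does
not depend on the location of the clone.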

(I didn't test on servers in different timezones, but I would imagine
that there must be a cure for that if it resulted in some
discrepancies. I also didn't think of the zillions of other possible
scenarios that could potentially spoil the game. I could imagine
potential problems with repositories with "Makefile" and "makefile" in
the same dir, resulting in different tarballs depending on whether or
not the operation was performed on a case-sensitive partition. But we
should ask for a "bugfix" if we come across such cases. And I know for
a fact that CVS has some "dementia symptoms" and the strategy wouldn't
work for CVS as it keeps losing files and folders. But given how old
and broken CVS is, I really wouldn't care.)

Mojca

