[MacPorts] #18736: distname is not percent-encoded before gluing it into the URL
#18736: distname is not percent-encoded before gluing it into the URL -------------------------------------+-------------------------------------- Reporter: ryandesign@… | Owner: macports-tickets@… Type: enhancement | Status: new Priority: Normal | Milestone: MacPorts 1.8.0 Component: base | Version: 1.7.0 Keywords: | Port: -------------------------------------+-------------------------------------- MacPorts base does not percent-encode the distname before putting it in the final URL. This causes distnames containing spaces to be completely unavailable (causes a Tcl error), causes distnames containing a percent character to probably not download at all (probably not a common need), and causes distnames containing a plus character to work sometimes but not others, at the whim of the web server (since a plus should really be encoded as %2b the web server is completely within its right to not know what we're asking for). I first noted this in [comment:ticket:18733:3 #18733] in response to r47662. -- Ticket URL: <http://trac.macports.org/ticket/18736> MacPorts <http://www.macports.org/> Ports system for Mac OS
#18736: distname is not percent-encoded before gluing it into the URL -------------------------------------+-------------------------------------- Reporter: ryandesign@… | Owner: macports-tickets@… Type: enhancement | Status: new Priority: Normal | Milestone: MacPorts 1.8.0 Component: base | Version: 1.7.0 Keywords: | Port: -------------------------------------+-------------------------------------- Comment(by ryandesign@…): I searched high and low to see how to do percent-encoding in Tcl. I found a function http::mapReply which does it. It's part of the http module included with Tcl. It seems like I should just be able to do something like this: {{{ package require http return [http::mapReply ${distname}] }}} Unfortunately while I can get this to work with MacPorts tcl, I can't get it to work with the OS-provided tcl; it can't find the http::mapReply procedure. And even if we find out why that is and how to fix it, it won't work because http::mapReply [http://sourceforge.net/tracker/?func=detail&atid=110894&aid=1182373&group_id... was seriously broken] until tcl 8.4.12; Mac OS X 10.4.11 has tcl 8.4.7 so it would not include the fix. The http module appears to have originally been written in tcl (http.tcl) but is now [http://wiki.tcl.tk/14144 apparently written in C]. I copied the tcl code from http.tcl and modified it as suggested in [http://sourceforge.net/tracker/?func=detail&atid=110894&aid=1182373&group_id... the bug report] so that it is not broken. I'll attach the patch I came up with. -- Ticket URL: <http://trac.macports.org/ticket/18736#comment:1> MacPorts <http://www.macports.org/> Ports system for Mac OS
#18736: distname is not percent-encoded before gluing it into the URL -------------------------------------+-------------------------------------- Reporter: ryandesign@… | Owner: macports-tickets@… Type: enhancement | Status: new Priority: Normal | Milestone: MacPorts 1.8.0 Component: base | Version: 1.7.0 Keywords: | Port: -------------------------------------+-------------------------------------- Changes (by raimue@…): * cc: raimue@… (added) Comment: I also can't find mapReply in the [http://www.tcl.tk/man/tcl8.4/TclCmd/http.htm Tcl docs]. So this seems to have been an internal proc. Now you already made the effort of writing this in Tcl, but as we are already using and linking with libcurl, we could use [http://curl.haxx.se/libcurl/c/curl_easy_escape.html curl_easy_escape()] for this purpose. -- Ticket URL: <http://trac.macports.org/ticket/18736#comment:2> MacPorts <http://www.macports.org/> Ports system for Mac OS
#18736: distname is not percent-encoded before gluing it into the URL -------------------------------------+-------------------------------------- Reporter: ryandesign@… | Owner: macports-tickets@… Type: enhancement | Status: new Priority: Normal | Milestone: MacPorts 1.8.0 Component: base | Version: 1.7.0 Keywords: | Port: -------------------------------------+-------------------------------------- Comment(by ryandesign@…): Replying to [comment:2 raimue@…]:
Now you already made the effort of writing this in Tcl, but as we are already using and linking with libcurl, we could use [http://curl.haxx.se/libcurl/c/curl_easy_escape.html curl_easy_escape()] for this purpose. That would be fine with me, except if it does what the docs say: "All input characters that are not a-z, A-Z or 0-9 are converted to their 'URL escaped' version" -- if that's what it really does then that's broken just like tcl < 8.4.12 was broken: it needs to convert all characters that are not a-z, A-Z, 0-9, -, ., _ or ~.
-- Ticket URL: <http://trac.macports.org/ticket/18736#comment:3> MacPorts <http://www.macports.org/> Ports system for Mac OS
#18736: distname is not percent-encoded before gluing it into the URL -------------------------------------+-------------------------------------- Reporter: ryandesign@… | Owner: macports-tickets@… Type: enhancement | Status: new Priority: Normal | Milestone: MacPorts 1.8.0 Component: base | Version: 1.7.0 Keywords: | Port: -------------------------------------+-------------------------------------- Comment(by raimue@…): According to the man page, libcurl implements [http://www.ietf.org/rfc/rfc2396.txt RFC2396], Tcl goes with [http://www.ietf.org/rfc/rfc3986.txt RFC3986]. It is not really broken, just outdated. Which is not a serious issue, as decoders are advised to decode these characters normally. There is also a simple approach using the public http API, which is so simple that we could drop the wrapper as well. Although calling ::http::formatQuery with an odd number of arguments seems not to be documented (could also call with an additional empty string and strip the last char). {{{ package require http proc urlencode {str} { return [::http::formatQuery $str] } }}} But if we are going to duplicate this, we should probably copy from the [http://tcl.cvs.sourceforge.net/viewvc/tcl/tcl/library/http/http.tcl?view=mar... latest version] which is not using regsub for performance reasons. It is not implemented in C as stated on the Tclers wiki. -- Ticket URL: <http://trac.macports.org/ticket/18736#comment:4> MacPorts <http://www.macports.org/> Ports system for Mac OS
#18736: distname is not percent-encoded before gluing it into the URL -------------------------------------+-------------------------------------- Reporter: ryandesign@… | Owner: macports-tickets@… Type: enhancement | Status: new Priority: Normal | Milestone: MacPorts 1.8.0 Component: base | Version: 1.7.0 Keywords: | Port: -------------------------------------+-------------------------------------- Comment(by ryandesign@…): Replying to [comment:4 raimue@…]:
According to the man page, libcurl implements [http://www.ietf.org/rfc/rfc2396.txt RFC2396], Tcl goes with [http://www.ietf.org/rfc/rfc3986.txt RFC3986]. It is not really broken, just outdated. Which is not a serious issue, as decoders are advised to decode these characters normally.
It seems silly if you want to download foo-1.0.tar.gz to request foo%2d1%2e0%2etar%2egz but I agree it should still work.
There is also a simple approach using the public http API, which is so simple that we could drop the wrapper as well.
That would be fine if it works. I had not tried [::http::something] I had tried [http::something] which was how the example I saw used it. I'm not so familiar with tcl syntax when it comes to packages.
Although calling ::http::formatQuery with an odd number of arguments seems not to be documented (could also call with an additional empty string and strip the last char).
formatQuery is for formatting a query string (e.g. "?a=b&c=d"), but that's not what we're doing, so I don't feel we should be using that function. It seems mapReply is the correct function to use.
But if we are going to duplicate this, we should probably copy from the [http://tcl.cvs.sourceforge.net/viewvc/tcl/tcl/library/http/http.tcl?view=mar... latest version] which is not using regsub for performance reasons. It is not implemented in C as stated on the Tclers wiki.
Ok. -- Ticket URL: <http://trac.macports.org/ticket/18736#comment:5> MacPorts <http://www.macports.org/> Ports system for Mac OS
#18736: distname is not percent-encoded before gluing it into the URL -------------------------------------+-------------------------------------- Reporter: ryandesign@… | Owner: macports-tickets@… Type: enhancement | Status: new Priority: Normal | Milestone: MacPorts 1.8.0 Component: base | Version: 1.7.0 Keywords: | Port: -------------------------------------+-------------------------------------- Comment(by raimue@…): FYI ::http::* starts from the top level namespace, http::* looks for a http namespace in the current namespace first. Replying to [comment:5 ryandesign@…]:
formatQuery is for formatting a query string (e.g. "?a=b&c=d"), but that's not what we're doing, so I don't feel we should be using that function. It seems mapReply is the correct function to use.
http::mapReply is private API (at least in 8.4 and 8.5, seems to have changed for 8.6), we '''should''' not use it. But we cannot even use it before we call another http::* function, as it is not listed in pkgIndex.tcl. http::formatQuery is a wrapper around http::mapReply which is public. Actually we want to encode a string to be used as value in a query string, so it is not absolutely wrong. -- Ticket URL: <http://trac.macports.org/ticket/18736#comment:6> MacPorts <http://www.macports.org/> Ports system for Mac OS
#18736: distname is not percent-encoded before gluing it into the URL -------------------------------------+-------------------------------------- Reporter: ryandesign@… | Owner: macports-tickets@… Type: enhancement | Status: new Priority: Normal | Milestone: MacPorts 1.8.0 Component: base | Version: 1.7.0 Keywords: | Port: -------------------------------------+-------------------------------------- Comment(by ryandesign@…): Replying to [comment:6 raimue@…]:
http::mapReply is private API (at least in 8.4 and 8.5, seems to have changed for 8.6), we '''should''' not use it. But we cannot even use it before we call another http::* function, as it is not listed in pkgIndex.tcl. That's a shame.
http::formatQuery is a wrapper around http::mapReply which is public. Actually we want to encode a string to be used as value in a query string, so it is not absolutely wrong. The [http://en.wikipedia.org/wiki/Query_string query string] is the part of a URL after the question mark. We are constructing a distfile URL which does not contain a question mark, meaning it does not have a query string component.
So what about using curl_easy_escape() as you suggested? -- Ticket URL: <http://trac.macports.org/ticket/18736#comment:7> MacPorts <http://www.macports.org/> Ports system for Mac OS
#18736: distname is not percent-encoded before gluing it into the URL -------------------------------------+-------------------------------------- Reporter: ryandesign@… | Owner: macports-tickets@… Type: enhancement | Status: new Priority: Normal | Milestone: MacPorts 1.8.0 Component: base | Version: 1.7.0 Keywords: | Port: -------------------------------------+-------------------------------------- Comment(by raimue@…): `patch-macports-curl-escape.diff` is a patch to add a `curl escape` command, using curl_easy_escape() internally. {{{ $ rlwrap /usr/bin/tclsh % source /Library/Tcl/macports1.0/macports_fastload.tcl 0 % package require Pextlib 1.0 % curl escape foo~bar+baz! foo%7Ebar%2Bbaz%21 }}} -- Ticket URL: <http://trac.macports.org/ticket/18736#comment:8> MacPorts <http://www.macports.org/> Ports system for Mac OS
#18736: distname is not percent-encoded before gluing it into the URL -------------------------------------+-------------------------------------- Reporter: ryandesign@… | Owner: macports-tickets@… Type: enhancement | Status: new Priority: Normal | Milestone: MacPorts 1.8.0 Component: base | Version: 1.7.0 Keywords: | Port: -------------------------------------+-------------------------------------- Comment(by ryandesign@…): Thanks. When trying to compile MacPorts with this change, I got: {{{ ld: Undefined symbols: _curl_easy_escape }}} I found that on this Intel Tiger Mac, I somehow had both library version 3 and 4 of curl, with all the symlinks pointing to version 3, and only version 4 has the curl_easy_escape function: {{{ $ ls -l /usr/lib/libcurl* lrwxr-xr-x 1 root wheel 15 Sep 5 2008 /usr/lib/libcurl.2.dylib -> libcurl.3.dylib lrwxr-xr-x 1 root wheel 15 Sep 5 2008 /usr/lib/libcurl.3.0.0.dylib -> libcurl.3.dylib -rwxr-xr-x 1 root wheel 384692 Oct 24 02:02 /usr/lib/libcurl.3.dylib -rwxr-xr-x 1 root wheel 460136 Nov 6 15:53 /usr/lib/libcurl.4.dylib lrwxr-xr-x 1 root wheel 15 Sep 5 2008 /usr/lib/libcurl.dylib -> libcurl.3.dylib $ grep curl_easy_escape /usr/lib/libcurl* Binary file /usr/lib/libcurl.4.dylib matches $ }}} I checked two other Tiger Macs (one Intel and one PowerPC) and they only have library version 4, and everything else is just symlinks to it: {{{ $ ls -l libcurl* lrwxr-xr-x 1 root wheel 15 Mar 26 2008 libcurl.2.dylib -> libcurl.4.dylib lrwxr-xr-x 1 root wheel 15 Mar 16 2007 libcurl.3.0.0.dylib -> libcurl.3.dylib lrwxr-xr-x 1 root wheel 15 Mar 26 2008 libcurl.3.dylib -> libcurl.4.dylib lrwxr-xr-x 1 root wheel 15 Mar 26 2008 libcurl.4.0.0.dylib -> libcurl.4.dylib -rwxr-xr-x 1 root wheel 460136 Nov 6 15:53 libcurl.4.dylib lrwxr-xr-x 1 root wheel 15 Mar 26 2008 libcurl.dylib -> libcurl.4.dylib $ }}} And I checked two Leopard Macs (one Intel, one PowerPC) which also look the same (except that the libcurl.3.0.0.dylib symlink is missing). Once I fixed the symlinks on the first Mac I could build it. -- Ticket URL: <http://trac.macports.org/ticket/18736#comment:9> MacPorts <http://www.macports.org/> Ports system for Mac OS
#18736: distname is not percent-encoded before gluing it into the URL -------------------------------------+-------------------------------------- Reporter: ryandesign@… | Owner: macports-tickets@… Type: enhancement | Status: new Priority: Normal | Milestone: MacPorts 1.8.0 Component: base | Version: 1.7.0 Keywords: | Port: -------------------------------------+-------------------------------------- Comment(by ryandesign@…): Replying to [comment:5 ryandesign@…]:
It seems silly if you want to download foo-1.0.tar.gz to request foo%2d1%2e0%2etar%2egz but I agree it should still work.
I've had this patch in place for awhile locally, and it seems to work with most servers, but [http://lists.macosforge.org/pipermail/macports- dev/2009-June/009025.html not with downloads.sourceforge.net]. -- Ticket URL: <http://trac.macports.org/ticket/18736#comment:10> MacPorts <http://www.macports.org/> Ports system for Mac OS
#18736: distname is not percent-encoded before gluing it into the URL -------------------------------------+-------------------------------------- Reporter: ryandesign@… | Owner: macports-tickets@… Type: enhancement | Status: new Priority: Normal | Milestone: MacPorts 1.8.0 Component: base | Version: 1.7.0 Keywords: | Port: -------------------------------------+-------------------------------------- Changes (by snc@…): * cc: snc@… (added) Comment: Replying to [comment:3 ryandesign@…]:
That would be fine with me, except if it does what the docs say: "All input characters that are not a-z, A-Z or 0-9 are converted to their 'URL escaped' version" -- if that's what it really does then that's broken just like tcl < 8.4.12 was broken: it needs to convert all characters that are not a-z, A-Z, 0-9, -, ., _ or ~.
We can probably summarize that as "Returns a string in which all non- alphanumeric characters except -_. have been replaced with a percent (%) sign followed by two hex digits." (from [http://php.net/url_encode PHP]). -- Ticket URL: <http://trac.macports.org/ticket/18736#comment:11> MacPorts <http://www.macports.org/> Ports system for Mac OS
#18736: distname is not percent-encoded before gluing it into the URL -------------------------------------+-------------------------------------- Reporter: ryandesign@… | Owner: macports-tickets@… Type: enhancement | Status: new Priority: Normal | Milestone: MacPorts 1.8.1 Component: base | Version: 1.7.0 Keywords: | Port: -------------------------------------+-------------------------------------- Changes (by jmr@…): * milestone: MacPorts 1.8.0 => MacPorts 1.8.1 Comment: Pushing this back a little since it's not a regression and will apparently be nontrivial to fix. -- Ticket URL: <http://trac.macports.org/ticket/18736#comment:12> MacPorts <http://www.macports.org/> Ports system for Mac OS
#18736: distname is not percent-encoded before gluing it into the URL -------------------------------------+-------------------------------------- Reporter: ryandesign@… | Owner: macports-tickets@… Type: enhancement | Status: new Priority: Normal | Milestone: MacPorts Future Component: base | Version: 1.7.0 Keywords: | Port: -------------------------------------+-------------------------------------- Changes (by jmr@…): * milestone: MacPorts 1.8.2 => MacPorts Future -- Ticket URL: <http://trac.macports.org/ticket/18736#comment:15> MacPorts <http://www.macports.org/> Ports system for Mac OS
#18736: distname is not percent-encoded before gluing it into the URL --------------------------------------+------------------------------------- Reporter: ryandesign@… | Owner: macports-tickets@… Type: enhancement | Status: closed Priority: Normal | Milestone: MacPorts 1.9.0 Component: base | Version: 1.7.0 Resolution: fixed | Keywords: Port: | --------------------------------------+------------------------------------- Changes (by jmr@…): * status: new => closed * resolution: => fixed * milestone: MacPorts Future => MacPorts 1.9.0 Comment: r66794 -- Ticket URL: <http://trac.macports.org/ticket/18736#comment:16> MacPorts <http://www.macports.org/> Ports system for Mac OS
participants (1)
-
MacPorts