email address anti-spam encoding in portfiles
I thought I'd run by everybody the novel concept of anti-spam encoding email addresses in portfiles. For all other cases we could obfuscate the address, but with Portfiles we're sort of stuck, since we need to be able to make these available in raw form in various places (svn, mpwa, etc).
What if we adopt the convention in the maintainer field of using user/domain instead of user@domain? I have a feeling the spambots won't find that, and it's pretty easy for a user to recognize (or for a machine to reconstitute).
So to be completely clear, I propose that we would encode my email address (jberry@macports.org) as jberry/macports.org
Feedback?
James
On May 15, 2007, at 20:25, James Berry wrote:
I thought I'd run by everybody the novel concept of anti-spam encoding email addresses in portfiles. For all other cases we could obfuscate the address, but with Portfiles we're sort of stuck, since we need to be able to make these available in raw form in various places (svn, mpwa, etc).
What if we adopt the convention in the maintainer field of using user/domain instead of user@domain. I have a feeling the spambots won't find that, and it's pretty easy to recognize as a user (or to reconstitute as a machine).
So to be completely clear, I propose that we would encode my email address (jberry@macports.org) as jberry/macports.org
Feedback?
I'd love to reduce the amount of spam I receive. But obfuscating the maintainers in the portfile may not be sufficient. I'm also concerned about the login to the Subversion server, which is also my email address. This appears in the $Id$ tag at the top of portfiles I've modified, and there are also several sites tracking the Subversion commits and making this available on the web, without obfuscation, of course. For example, the first Google hit for searching for my email address is currently: http://cia.vc/stats/author/ryandesign@macports.org
I support this, and I'd go one further. A format I use internally is macports.org:jberry. A slash would work equally well. In either case, a basic regular expression couldn't catch it.
-- Sal
smile.
--------------
Salvatore Domenick Desiano
Doctoral Candidate
Robotics Institute
Carnegie Mellon University
On Tue, 15 May 2007, James Berry wrote:
I thought I'd run by everybody the novel concept of anti-spam encoding email addresses in portfiles. For all other cases we could obfuscate the address, but with Portfiles we're sort of stuck, since we need to be able to make these available in raw form in various places (svn, mpwa, etc).
What if we adopt the convention in the maintainer field of using user/domain instead of user@domain. I have a feeling the spambots won't find that, and it's pretty easy to recognize as a user (or to reconstitute as a machine).
So to be completely clear, I propose that we would encode my email address (jberry@macports.org) as jberry/macports.org
Feedback?
James
_______________________________________________
macports-dev mailing list
macports-dev@lists.macosforge.org
http://lists.macosforge.org/mailman/listinfo/macports-dev
Following discussion with several of you, and more thought, my thinking is now:
(1) Obfuscate plain text email addresses by using the form:
- tld/domain/username user@bar.com ==> com/bar/user
- if there are multiple components in the hostname, only the dot before the tld is turned into a slash: user@foo.bar.com ==> com/foo.bar/user
- If the domain/tld is macports.org, then it may be dropped: user@macports.org ==> user
Note that this is machine reversible, and also fairly easy for a user to produce manually, both of which are important considerations.
(2) If a Portfile is submitted with a maintainer email address containing an @, we will accept it as such (this is up to the submitter/maintainer). We're providing a means by which port maintainers may obfuscate their address, but not mandating that they do so.
Note that this is also a machine detectable situation.
(3) There are a number of other cases in which email addresses may show up. This doesn't attempt to deal with all of them yet. Small steps.
Among these are:
- CIA commit pages
- Trac commits and perhaps bug reports too
- Mailing list archives
- irc logs
If I don't hear any contradictory pleas soon, I'm going to move ahead with this, perhaps including auto fixing all the portfiles.
James
On May 15, 2007, at 6:25 PM, James Berry wrote:
I thought I'd run by everybody the novel concept of anti-spam encoding email addresses in portfiles. For all other cases we could obfuscate the address, but with Portfiles we're sort of stuck, since we need to be able to make these available in raw form in various places (svn, mpwa, etc).
What if we adopt the convention in the maintainer field of using user/domain instead of user@domain. I have a feeling the spambots won't find that, and it's pretty easy to recognize as a user (or to reconstitute as a machine).
So to be completely clear, I propose that we would encode my email address (jberry@macports.org) as jberry/macports.org
Feedback?
James
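The revised scheme in James's message (tld/domain/username, with a bare user name for macports.org) can be sketched roughly as follows. This is only an illustration of the rules as stated in the thread, not code from MacPorts; the function names `obfuscate` and `deobfuscate` and the constant `MACPORTS_DOMAIN` are made up for the example.

```python
MACPORTS_DOMAIN = "macports.org"  # assumption: the only domain that may be dropped


def obfuscate(address: str) -> str:
    """Encode user@host as tld/rest-of-host/user, following rule (1) of the proposal."""
    user, _, host = address.rpartition("@")
    if not user or not host:
        raise ValueError(f"not an email address: {address!r}")
    if host.lower() == MACPORTS_DOMAIN:
        # macports.org addresses may be reduced to the bare user name.
        return user
    rest, dot, tld = host.rpartition(".")
    if not dot:
        raise ValueError(f"host has no top-level domain: {host!r}")
    # Only the dot before the tld becomes a slash; other dots are kept.
    return f"{tld}/{rest}/{user}"


def deobfuscate(token: str) -> str:
    """Reverse obfuscate(); a token containing '@' is accepted unchanged (rule 2)."""
    if "@" in token:
        return token
    if "/" not in token:
        return f"{token}@{MACPORTS_DOMAIN}"
    # Split on the first two slashes only, so the remainder is the user name.
    tld, rest, user = token.split("/", 2)
    return f"{user}@{rest}.{tld}"
```

For instance, `obfuscate("user@foo.bar.com")` yields `com/foo.bar/user`, and `deobfuscate("user")` yields `user@macports.org`. The check for `@` in `deobfuscate` is what makes the unobfuscated case "machine detectable".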
We'd also have to remove the id keyword from all Portfiles, since user names are email addresses.
On May 16, 2007, at 8:12 PM, James Berry wrote:
Following discussion with several of you, and more thought, my thinking is now:
(1) Obfuscate plain text email addresses by using the form:
- tld/domain/username user@bar.com ==> com/bar/user
- if there are multiple components in the hostname, only the dot before the tld is turned into a slash: user@foo.bar.com ==> com/foo.bar/user
- If the domain/tld is macports.org, then it may be dropped: user@macports.org ==> user
Note that this is machine reversible, and also fairly easy for a user to produce manually, both of which are important considerations.
(2) If a Portfile is submitted with a maintainer email address containing an @, we will accept it as such (this is up to the submitter/maintainer). We're providing a means by which port maintainers may obfuscate their address, but not mandating that they do so.
Note that this is also a machine detectable situation.
(3) There are a number of other cases in which email addresses may show up. This doesn't attempt to deal with all of them yet. Small steps.
Among these are:
- CIA commit pages
- Trac commits and perhaps bug reports too
- Mailing list archives
- irc logs
If I don't hear any contradictory pleas soon, I'm going to move ahead with this, perhaps including auto fixing all the portfiles.
-- Kevin Ballard http://kevin.sb.org eridius@macports.org http://www.tildesoft.com
On May 16, 2007, at 8:32 PM, Kevin Ballard wrote:
We'd also have to remove the id keyword from all Portfiles, since user names are email addresses.
Yes. A couple of possibilities there:
(1) convince macosforge to move away from name@domain for their user names, relying on name only, or
(2) find a way to let svn see (and report) only the user end of that, or
(3) move from $Id$ to $Revision$ or remove the keyword altogether.
Note that (1) or (2) would also solve a number of other related issues, such as on trac, cia.
James.
On 17/05/2007, at 13:54, James Berry wrote:
Yes. A couple of possibilities there: (1) convince macosforge to move away from name@domain for their user names, relying on name only, or (2) find a way to let svn see (and report) only the user end of that, or (3) move from $Id$ to $Revision$ or remove the keyword altogether. Note that (1) or (2) would also solve a number of other related issues, such as on trac, cia.
I would definitely vote for (1) and then (2), but if we have to go to (3) I'd ask for both $Revision$ and $Date$ to be used; I find it useful to know when things changed.
Kind regards,
Maun Suang
--
Boey Maun Suang (Boey is my surname)
Email: boeyms at macports dot org
--On 16 May 2007 17:12:10 -0700 James Berry <jberry@macports.org> wrote:
Following discussion with several of you, and more thought, my thinking is now:
(1) Obfuscate plain text email addresses by using the form:
- tld/domain/username user@bar.com ==> com/bar/user
This won't always work (at least, not with simple implementations), since a slash is legal in an email local part (though often banned by local policy). However, if you used "subdomain.tld/localpart" you'd be OK, since the first slash would always be the separator.
- if there are multiple components in the hostname, only the dot before the tld is turned into a slash: user@foo.bar.com ==> com/foo.bar/user
- If the domain/tld is macports.org, then it may be dropped: user@macports.org ==> user
Note that this is machine reversible, and also fairly easy for a user to produce manually, both of which are important considerations.
(2) If a Portfile is submitted with a maintainer email address containing an @, we will accept it as such (this is up to the submitter/maintainer). We're providing a means by which port maintainers may obfuscate their address, but not mandating that they do so.
Note that this is also a machine detectable situation.
(3) There are a number of other cases in which email addresses may show up. This doesn't attempt to deal with all of them yet. Small steps.
Among these are:
- CIA commit pages
- Trac commits and perhaps bug reports too
- Mailing list archives
- irc logs
If I don't hear any contradictory pleas soon, I'm going to move ahead with this, perhaps including auto fixing all the portfiles.
James
-- Ian Eiloart IT Services, University of Sussex x3148
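Ian's objection can be made concrete with a small sketch. It assumes a "simple implementation" means a decoder that splits on every slash, and it contrasts that with the domain-first form he suggests, where the first slash is always the separator. All function names here are hypothetical, and `example.com` is a placeholder domain.

```python
def decode_tld_first_naive(token: str) -> str:
    """Naive decoder for tld/domain/user that splits on every slash."""
    # Raises ValueError when the local part itself contains a slash,
    # because the split then produces more than three pieces.
    tld, domain, user = token.split("/")
    return f"{user}@{domain}.{tld}"


def encode_domain_first(address: str) -> str:
    """Encode user@host as host/user, leaving the host untouched."""
    user, _, host = address.rpartition("@")
    return f"{host}/{user}"


def decode_domain_first(token: str) -> str:
    """Decode host/user; a host name cannot contain '/', so the first slash separates."""
    host, _, user = token.partition("/")
    return f"{user}@{host}"
```

With a local part such as `a/b`, the tld-first token `com/example/a/b` trips the naive decoder, while `decode_domain_first(encode_domain_first("a/b@example.com"))` returns the original address.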
On May 16, 2007, at 22:32, Kevin Ballard wrote:
We'd also have to remove the id keyword from all Portfiles, since user names are email addresses.
I like the information the Id keyword provides, including the user who last changed the file. I would prefer to find a way to make usernames not be email addresses.
On May 17, 2007, at 06:00, Ian Eiloart wrote:
On 16 May 2007 17:12:10 -0700 James Berry wrote:
Following discussion with several of you, and more thought, my thinking is now:
(1) Obfuscate plain text email addresses by using the form:
- tld/domain/username user@bar.com ==> com/bar/user
- if there are multiple components in the hostname, only the dot before the tld is turned into a slash: user@foo.bar.com ==> com/foo.bar/user
This won't always work (at least, not with simple implementations), since a slash is legal in an email local part (though often banned by local policy). However, if you used "subdomain.tld/localpart" you'd be OK, since the first slash would always be the separator.
I'm also concerned that this looks silly/weird for people whose email addresses are at CCTLDs where the last two components of the email address usually "go together". For example, if your email address is at "mail.example.edu", then it's somewhat ok to encode this as "edu/mail.example", since "mail.example" is the part that the school has control over while "edu" is the part they don't control. However, in the case of Ian, encoding "sussex.ac.uk" as "uk/sussex.ac" makes less sense/looks more strange, since "ac.uk" goes together (is the UK equivalent of "edu").
So I would be in favor of changing user.name@mail.example.com into mail.example.com/user.name. Or com.example.mail%user.name. Or something. But in any case making it clear (to both people and machines) which part is the local part and which part is the domain part (which as Ian said is not possible when you use multiple encoding characters to split the domain part into multiple parts).
On May 16, 2007, at 5:12 PM, James Berry wrote:
Following discussion with several of you, and more thought, my thinking is now:
(1) Obfuscate plain text email addresses by using the form:
- tld/domain/username user@bar.com ==> com/bar/user
- if there are multiple components in the hostname, only the dot before the tld is turned into a slash: user@foo.bar.com ==> com/foo.bar/user
- If the domain/tld is macports.org, then it may be dropped: user@macports.org ==> user
This kind of begs the question: What's the point of these fields again?
Is it to report bugs? I suspect not, since we have a bug tracking system for that already and if you're going to contact the maintainer directly then you're short-circuiting the bug reporting mechanism and will probably just get a nice reply saying "please file a bug report" if the maintainer is as busy as most folks anyway.
Is it for port maintainers to talk to other port maintainers? If so, we could just as easily keep this information in a side database and have the maintainer field in the portfile contain some sort of identifier that's purely unique to the macports project and doesn't even need to look like an email address - it could be a hash of someone's account record.
Needless to say, in either case we could also have a port command which did the right thing with the information to preserve ease of use. "port bug" to automatically open and jump into a new bug report, "port feedback" to talk to the maintainer. Since you've now front-ended the process, there's no need to make the Portfile fields even vaguely comprehensible to a spammer.
Thoughts?
- Jordan
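Jordan's side-database idea could look something like the sketch below. Everything in it is an assumption made for illustration: the thread does not say what an "account record" contains, which hash would be used, or where the lookup table would live. Here the record is a plain string, the hash is SHA-256 truncated to 12 hex characters, and the database is an in-memory dict.

```python
import hashlib

# Hypothetical side database mapping opaque identifiers to real addresses.
# In practice this would live somewhere spambots cannot crawl.
side_database: dict[str, str] = {}


def maintainer_id(account_record: str) -> str:
    """Derive an opaque identifier from an account record (assumed to be a string)."""
    digest = hashlib.sha256(account_record.encode("utf-8")).hexdigest()
    return digest[:12]  # truncated for readability; the length is an arbitrary choice


def register(account_record: str, email: str) -> str:
    """Store the address under its identifier and return the identifier for the Portfile."""
    ident = maintainer_id(account_record)
    side_database[ident] = email
    return ident


def lookup(ident: str) -> str:
    """Resolve an identifier back to an address, as a front-end command might."""
    return side_database[ident]
```

The identifier carries no `@` and no domain, so nothing in the Portfile resembles an address; a front-end along the lines of the "port feedback" command Jordan mentions would be the only thing calling `lookup`.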
On May 17, 2007, at 5:47 PM, Jordan K. Hubbard wrote:
On May 16, 2007, at 5:12 PM, James Berry wrote:
Following discussion with several of you, and more thought, my thinking is now:
(1) Obfuscate plain text email addresses by using the form:
- tld/domain/username user@bar.com ==> com/bar/user
- if there are multiple components in the hostname, only the dot before the tld is turned into a slash: user@foo.bar.com ==> com/foo.bar/user
- If the domain/tld is macports.org, then it may be dropped: user@macports.org ==> user
This kind of begs the question: What's the point of these fields again?
Is it to report bugs? I suspect not, since we have a bug tracking system for that already and if you're going to contact the maintainer directly then you're short-circuiting the bug reporting mechanism and will probably just get a nice reply saying "please file a bug report" if the maintainer is as busy as most folks anyway.
Is it for port maintainers to talk to other port maintainers? If so, we could just as easily keep this information in a side database and have the maintainer field in the portfile contain some sort of identifier that's purely unique to the macports project and doesn't even need to look like an email address - it could be a hash of someone's account record.
Needless to say, in either case we could also have a port command which did the right thing with the information to preserve ease of use. "port bug" to automatically open and jump into a new bug report, "port feedback" to talk to the maintainer. Since you've now front-ended the process, there's no need to make the Portfile fields even vaguely comprehensible to a spammer.
Thoughts?
Hi Jordan,
Sure, I agree in principle that we're heading in a direction that makes what you suggest possible. And this proposal does cover part of that: if somebody has a macports.org account, they can just use the user name portion of their email address. But we need to bridge the gap between the now (when we don't have any good way for people to get accounts if they're not committers) and the fact that most of the maintainer email addresses in those files don't belong to committers... and we'd like to find a quick and easy way to mangle them out of sight of some of the spammers.
So can we do more with a more elaborate system? Sure. But rather than spend a lot of time on this issue _right_now_, especially when those systems are still taking shape, I'm looking for a quick fix, to keep more addresses out of the hands of spammers, and some sort of mangling seems like an easy thing to do without dwelling too much on this problem while there are other more critical things to solve.
James
- Jordan
On May 17, 2007, at 5:59 PM, James Berry wrote:
Sure, I agree in principle that we're heading in a direction that makes what you suggest possible. And this proposal does cover part of that: if somebody has a macports.org account, they can just use the user name portion of their email address. But we need to bridge the gap between the now (when we don't have any good way for people to get accounts if they're not committers) and the fact that most of the maintainer email addresses in those files don't belong to committers....and we'd like to find a quick and easy way to mangle them out of sight of some of the spammers.
Fair enough. Though you've just said something which really surprises me - we have maintainers that don't have commit access? What's the point of that? - Jordan
On May 17, 2007, at 6:16 PM, Jordan K. Hubbard wrote:
On May 17, 2007, at 5:59 PM, James Berry wrote:
Sure, I agree in principle that we're heading in a direction that makes what you suggest possible. And this proposal does cover part of that: if somebody has a macports.org account, they can just use the user name portion of their email address. But we need to bridge the gap between the now (when we don't have any good way for people to get accounts if they're not committers) and the fact that most of the maintainer email addresses in those files don't belong to committers....and we'd like to find a quick and easy way to mangle them out of sight of some of the spammers.
Fair enough. Though you've just said something which really surprises me - we have maintainers that don't have commit access? What's the point of that?
A good and valid question. I'd love to hear feedback on that issue, particularly as I know there are bug reports sitting in trac that nobody has time to attend to. But the answer dates firmly back to before I ever got to darwinports, and probably stretches back into the reaches of your own mind somewhere... ;) I've worked hard in the last 8 months or so to give commit access to basically everybody who has asked for it, but that doesn't begin to account for all the people who have submitted ports and put their email into the maintainer key. (I'll point out too that one of the things I'm trying to get to with mpwa is to make svn access not a requirement to be able to submit and maintain ports for macports). James
On May 17, 2007, at 10:41 AM, Ryan Schmidt wrote:
On May 17, 2007, at 06:00, Ian Eiloart wrote:
On 16 May 2007 17:12:10 -0700 James Berry wrote:
Following discussion with several of you, and more thought, my thinking is now:
(1) Obfuscate plain text email addresses by using the form:
- tld/domain/username user@bar.com ==> com/bar/user
- if there are multiple components in the hostname, only the dot before the tld is turned into a slash: user@foo.bar.com ==> com/foo.bar/user
This won't always work (at least, not with simple implementations), since a slash is legal in an email local part (though often banned by local policy). However, if you used "subdomain.tld/localpart" you'd be OK, since the first slash would always be the separator.
I'm also concerned that this looks silly/weird for people whose email addresses are at CCTLDs where the last two components of the email address usually "go together". For example, if your email address is at "mail.example.edu", then it's somewhat ok to encode this as "edu/mail.example", since "mail.example" is the part that the school has control over while "edu" is the part they don't control. However, in the case of Ian, encoding "sussex.ac.uk" as "uk/sussex.ac" makes less sense/looks more strange, since "ac.uk" goes together (is the UK equivalent of "edu").
Good point.
So I would be in favor of changing user.name@mail.example.com into mail.example.com/user.name. Or com.example.mail%user.name. Or something. But in any case making it clear (to both people and machines) which part is the local part and which part is the domain part (which as Ian said is not possible when you use multiple encoding characters to split the domain part into multiple parts).
As Ian pointed out, / isn't really good as it's a valid atext character from rfc 2822. So what if we head back to Salvatore's suggestion of:
suxxex.ac.uk:iane
That's more readable, and unlikely to be recognized by the spambots.
James
On May 17, 2007, at 20:16, Jordan K. Hubbard wrote:
we have maintainers that don't have commit access? What's the point of that?
We have many. We have around 41 maintainers [1] with @macports.org email addresses (all of these are committers) handling about 1013 ports [2], and another approx. 282 maintainers [3] at other domains (most of which are not committers) handling about 882 ports [4]. About 2116 ports [5] are unmaintained. There is some overlap in these categories as some ports have multiple maintainers.
[1] cat */*/Portfile | sed -n -E 's/^maintainers[[:space:]]+//p' | xargs -n 1 echo | grep @macports.org | grep -v -E '(no|open)maintainer@macports.org' | sort -f | uniq -c | wc -l
[2] cat */*/Portfile | sed -n -E 's/^maintainers[[:space:]]+//p' | xargs -n 1 echo | grep @macports.org | grep -v -E '(no|open)maintainer@macports.org' | wc -l
[3] cat */*/Portfile | sed -n -E 's/^maintainers[[:space:]]+//p' | xargs -n 1 echo | grep -v ^$ | grep -v @macports.org | sort -f | uniq -c | wc -l
[4] cat */*/Portfile | sed -n -E 's/^maintainers[[:space:]]+//p' | xargs -n 1 echo | grep -v ^$ | grep -v @macports.org | wc -l
[5] port echo maintainer:nomaintainer@macports.org | wc -l
These counts are also not entirely accurate because some maintainers have not decided on a single email address to use. These people (arsptr, bfulgham, denis.defreyne, deric, erickt, fenner, landonf, marius, mas, pkern, ramercer, warp-darwinports/warp-opendarwin) should verify whether they are using the same email address in all of their ports.
I see one maintainer has already taken measures to obfuscate his email address in python/py-biggles/Portfile (though not, I see, in the other ports he maintains).
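The counting done by the pipelines above can also be sketched in Python, which may be easier to adapt once addresses start appearing in obfuscated forms. This is a rough equivalent under stated assumptions, not a replacement for the commands Ryan ran: it only looks at single-line `maintainers` entries (as the `sed` expression does), it lowercases addresses before counting, and it treats `nomaintainer@macports.org` and `openmaintainer@macports.org` as placeholders. The names `maintainer_entries` and `summarize` are invented for the example.

```python
import re
from collections import Counter

# Matches a "maintainers" line and captures everything after the keyword.
MAINTAINERS_RE = re.compile(r"^maintainers[ \t]+(.*)$", re.MULTILINE)
PLACEHOLDERS = {"nomaintainer@macports.org", "openmaintainer@macports.org"}


def maintainer_entries(portfile_text: str) -> list[str]:
    """Return the whitespace-separated entries of every maintainers line."""
    entries: list[str] = []
    for match in MAINTAINERS_RE.finditer(portfile_text):
        entries.extend(match.group(1).split())
    return entries


def summarize(portfile_texts: list[str]) -> tuple[Counter, Counter]:
    """Count entries per address, split into macports.org and other domains."""
    committers: Counter = Counter()
    others: Counter = Counter()
    for text in portfile_texts:
        for entry in maintainer_entries(text):
            key = entry.lower()
            if key in PLACEHOLDERS:
                continue
            if key.endswith("@macports.org"):
                committers[key] += 1
            else:
                others[key] += 1
    return committers, others
```

`len(committers)` then corresponds to the number of distinct macports.org maintainers, and `sum(committers.values())` to the number of maintainer entries they account for; the same holds for `others`.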
--On 17 May 2007 21:11:03 -0700 James Berry <jberry@macports.org> wrote:
As Ian pointed out, / isn't really good as it's a valid atext character from rfc 2822. So what if we head back to Salvatore's suggestion of:
Actually, you could use any character that isn't valid in an email domain, as long as you aren't munging the domain at all, but it's likely to be easier to implement with a colon.
suxxex.ac.uk:iane
yes, but it'd better be "sussex"!
That's more readable, and unlikely to be recognized by the spambots.
-- Ian Eiloart IT Services, University of Sussex x3148
On May 18, 2007, at 4:39 AM, Ian Eiloart wrote:
--On 17 May 2007 21:11:03 -0700 James Berry <jberry@macports.org> wrote:
As Ian pointed out, / isn't really good as it's a valid atext character from rfc 2822. So what if we head back to Salvatore's suggestion of:
Actually, you could use any character that isn't valid in an email domain, as long as you aren't munging the domain at all, but it's likely to be easier to implement with a colon.
well, but given that I've proposed we also allow raw user names (within the macports.org domain), we really can't use a character that's legal in the user name, as we'd then be inclined to split the user name up, mistaking it for a domain ;)
suxxex.ac.uk:iane
yes, but it'd better be "sussex"!
Indeed. I guess I was just subconsciously trying to keep you safe from the spambots! james
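The colon form discussed above, combined with bare user names for macports.org, might be handled along these lines. This is a sketch of the idea as it stands at this point in the thread, not a settled format; `encode_maintainer`, `decode_maintainer` and `DEFAULT_DOMAIN` are hypothetical names. It relies on the colon not being an RFC 2822 atext character, so an unquoted local part cannot contain one.

```python
DEFAULT_DOMAIN = "macports.org"  # assumption: bare user names belong to this domain


def encode_maintainer(address: str) -> str:
    """Encode user@domain as domain:user, or as bare user for the default domain."""
    user, _, domain = address.rpartition("@")
    if not user or not domain:
        raise ValueError(f"not an email address: {address!r}")
    if ":" in user:
        # Only quoted local parts can contain a colon; refuse rather than guess.
        raise ValueError(f"local part contains a colon: {user!r}")
    if domain.lower() == DEFAULT_DOMAIN:
        return user
    return f"{domain}:{user}"


def decode_maintainer(token: str) -> str:
    """Decode domain:user, a bare user name, or a raw address containing '@'."""
    if "@" in token:
        return token
    domain, sep, user = token.partition(":")
    if not sep:
        # No separator: a bare user name in the default domain.
        return f"{token}@{DEFAULT_DOMAIN}"
    return f"{user}@{domain}"
```

Because the separator cannot occur in a bare user name, a token such as `jberry` is never mistaken for a domain, which is the concern James raises about separators that are legal in the user name.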
--On 18 May 2007 06:58:53 -0700 James Berry <jberry@macports.org> wrote:
On May 18, 2007, at 4:39 AM, Ian Eiloart wrote:
--On 17 May 2007 21:11:03 -0700 James Berry <jberry@macports.org> wrote:
As Ian pointed out, / isn't really good as it's a valid atext character from rfc 2822. So what if we head back to Salvatore's suggestion of:
Actually, you could use any character that isn't valid in an email domain, as long as you aren't munging the domain at all, but it's likely to be easier to implement with a colon.
well, but given that I've proposed we also allow raw user names (within the macports.org domain), we really can't use a character that's legal in the user name, as we'd then be inclined to split the user name up, mistaking it for a domain ;)
Good point!
suxxex.ac.uk:iane
yes, but it'd better be "sussex"!
Indeed. I guess I was just subconsciously trying to keep you safe from the spambots!
thx :)
james
-- Ian Eiloart IT Services, University of Sussex x3148
participants (7)
- Boey Maun Suang
- Ian Eiloart
- James Berry
- Jordan K. Hubbard
- Kevin Ballard
- Ryan Schmidt
- Salvatore Domenick Desiano