Re: Track who installed what ports on what OS and on what
On May 13, 2007, at 22:06, Norman MacIntyre wrote:
That is a bad idea. What purpose would it serve other than collecting statistics and curiosity?
Let me know when you do this so I can immediately use only fink.
I see! Well, that's two people vehemently opposed so far. :) Interesting.

For comparison, it's far from unprecedented for programs to send statistical information to their developers when they check for updates. The popular text editors BBEdit and TextWrangler from Bare Bones Software do this, for example, as does the venerable dock utility DragThing. As I recall, they all ask for the user's permission before doing so, and explain exactly what data is sent; we could do that too. Even Software Update sends Apple a list of the Apple software you have installed, so that it can deliver to you a list of applicable updates. And it does that without your permission, and it's on by default, too.

The purpose for collecting this data is obvious: it allows the developers to learn what kinds of machines and OS versions their users are using, which can help the developers decide where to focus their resources. In the case of MacPorts, it's fairly obvious that we will have PowerPC users running 10.3.x and 10.4.x and Intel users on 10.4.x. Maybe some 10.5.x testers too.

What's less obvious is what ports people are using. And by collecting that information, we would not only help the port authors learn which of their ports are actually getting used, but also help users gauge a port's reliability. If a user is interested in a port and sees that hundreds of others have successfully installed it, there's a good chance it works. If nobody has installed it, or nobody with the user's processor architecture or OS version has installed it, there may be problems.

If I wasn't clear before, I should also clarify that no personal information about the user would be stored -- username, email address, IP address: none of that. And it is not the intention to display a list of any individual user's active ports, only to sum up totals of active ports. However, I have to admit the system would need to store the lists of the users' active ports in order to compute the totals. And maybe that's a concern, since the information could be obtained by determined individuals with access.

Perhaps I should explain what led to my initial email. In working on a redesign for the MacPorts web site, I thought it would be nice to include a box of popular ports in a sidebar, like so:

My data here is clearly made up, but it illustrates the idea. If we tracked who installed what ports and on what processor, we could generate an accurate graph of the above, and similarly for OS versions.

What I didn't do in the example above, but what I would want to do, is show only "interesting" ports. I would say ports like pkg-config and apr-util aren't "interesting" because they're merely dependencies of other ports. You wouldn't install apr-util just by itself; apr-util isn't the end goal. Other larger ports like apache2 are the end goal, and that's the kind of port I would show in the graph. If I could find a good way to identify the "interesting" ports.

But perhaps this entire graph isn't so interesting, ultimately. Presumably if you're looking at MacPorts, you have a specific software package in mind that you're wanting to install, and don't need a graph of what everyone else is installing. I guess I was thinking of something like Apple's Dashboard widget download site, where they show you a list of today's top widget downloads. Same idea.

If we don't do this graph, I would still like some extra info on the MacPorts home page, something to show new users what they can do with MacPorts. Something to draw them in. Maybe just a list of the most recently updated ports and their versions would do. That would be easier, too, since I can already pull that information from the repository.
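
As a rough illustration of the "interesting" ports question raised above, here is one possible heuristic, sketched in Python. Everything in it is an assumption made for the example: the function name, the shape of the reports (a processor architecture plus a set of active port names, with no user-identifying fields), and the dependency table are all hypothetical, and nothing here is an existing MacPorts facility. The idea is to count a port only when no other active port in the same report depends on it, and to keep only summed totals.

```python
from collections import Counter


def interesting_port_totals(reports, dependencies):
    """Sum installs of 'interesting' ports per processor architecture.

    reports: iterable of (arch, active_ports) pairs, one per anonymous
        report, where active_ports is a set of port names (hypothetical).
    dependencies: dict mapping a port name to the set of port names it
        depends on (hypothetical data, supplied by the caller).
    """
    totals = Counter()
    for arch, active in reports:
        # Collect ports in this report that another active port depends on.
        needed = set()
        for port in active:
            needed |= dependencies.get(port, set()) & active
        # Count only the ports nothing else in the report pulled in.
        for port in active - needed:
            totals[(port, arch)] += 1
    return totals


reports = [
    ("ppc", {"apache2", "apr-util", "pkg-config"}),
    ("i386", {"apache2", "apr-util"}),
]
dependencies = {"apache2": {"apr-util", "pkg-config"}}
totals = interesting_port_totals(reports, dependencies)
```

With the made-up data above, only apache2 is counted, once for each architecture. The heuristic has an obvious blind spot: a port the user installed on purpose is still skipped if something else in the same report happens to depend on it.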
I see! Well, that's two people vehemently opposed so far. :) Interesting.
Count me as "vehemently opposed" as well. Thank you.
For comparison, it's far from unprecedented for programs to send statistical information to their developers when they check for updates.
Is that reason enough to do it as well?
The purpose for collecting this data is obvious: it allows the developers to learn what kinds of machines and OS versions their users are using, which can help the developers decide where to focus their resources.
Is there somebody who decides, or who is even entitled to decide, how the resources of port developers are being focused? I doubt it. In practice, it seems to me that Portfiles are being developed and updated by people who actually use the software, not by some abstract pool of developers who can be told to do this or that at the whim of some statistics. IOW, although we probably all know that a lot of people use perl, the Portfile for perl5.8 is still owned by nomaintainer. And that's a Good Thing. ;)

I think your hypothesis that those statistics are somehow "needed" requires more support. As it stands, it is not well defended.

Be that as it may, my real argument is that you are not entitled to collect this kind of statistics, and certainly not to "turn it on by default because we wouldn't get enough people to allow us to do it otherwise". I am firmly convinced that it's absolutely none of your business what software I use or don't use. And if I recommend MacPorts to a friend, or install it for him, I don't want to do it knowing that Ryan Schmidt Design is going to get detailed reports on what that person decides to do with MacPorts.

I want to be able to continue to support MacPorts. I'm strictly opposed to collecting more and more data about your users. Please don't do this.

If you feel you absolutely need to do it, create an optional package that collects your statistics for you, so that people who think you should get it can install and activate it for you. Explicitly, manually, consciously agreeing to having their data collected.

Regards,
Marc
On May 14, 2007, at 2:28 AM, Ryan Schmidt wrote:
I see! Well, that's two people vehemently opposed so far. :) Interesting.
For comparison, it's far from unprecedented for programs to send statistical information to their developers when they check for updates. The popular text editors BBEdit and TextWrangler from Bare Bones Software do this, for example, as does the venerable dock utility DragThing. As I recall, they all ask for the user's permission before doing so, and explain exactly what data is sent; we could do that too. Even Software Update sends Apple a list of the Apple software you have installed, so that it can deliver to you a list of applicable updates. And it does that without your permission, and it's on by default, too.
Yeah, but still, this is a calculated risk for an open source project to take. People have a hard enough time with software phoning home that's also presumably commercially vetted (and released by people who can also presumably get sued if it does something really egregious), but the wild and wooly open source stuff doesn't get much benefit of the doubt there at all.

I think a far better idea, and one which kills a lot more avians with a single projectile, is to focus on creating packaged software for MacPorts and a nice GUI interface for downloading it, then simply track the download stats (which everybody and their dog does) and that will give you a REAL notion of what's truly popular and what's not.

Right now, all your stats would be measuring is the activity of a relatively small collection of geeks who know enough to install the dev tools, run Terminal.app and use the port(1) command line tool. This may seem like a relatively large and meaningful number of people if you're judging solely based on a focused sampling of macports traffic, but in reality it's just a wee tiny drop in the bucket when compared to the overall, pent-up demand for add-on software for the Mac. No offense, but your stats would literally be about as meaningful as taking an election poll in Joplin, Missouri and announcing the results on CNN.

- Jordan
participants (3)
- Jordan K. Hubbard
- Marc André Selig
- Ryan Schmidt