When you want to announce changes in your project to several Web sites, you go to the first one and fill in a form with your new info. Then you go to the next one and do it again. And again with the next one. And again. And again. Doc O'Leary thinks he has a solution. Please read to the bottom for my comments and a couple of questions I'd like to ask everyone.

Copyright notice: All reader-contributed material on freshmeat.net is the property and responsibility of its author; for reprint rights, please contact the author directly.

Introduction

In the (good ol') days when FTP and archie were king, it was fairly simple for developers to spread their offerings far and wide. I had scripts set up to drop the right files in the right locations, and it didn't much matter if there were two or twenty archives. Enter the Web, and the focus shifts from pushing software out to archives to pulling people into Web sites. I think that's a good thing, because it puts more information into the users' hands, but in the process, developers have lost the ability to easily push anything out. Instead, we have to manually go to a number of tracking sites (the more the better, usually), set up accounts, and edit essentially the same information on all those sites.

My long-winded question is essentially this: Is there any interest in automating this process? I currently have a property list (easily made available in plist or XML format, and simple to convert to other formats if necessary) that I use to build the dynamic pages of my site; it contains all or nearly all the information that is gathered at various software tracking sites. If a general software description file format can be agreed on, simply making that file available would give sites all the information they need to update their database entries. No fuss, no muss. Minimizing the administrative effort will really lower the barrier to entry for all sites.
Reference Files

BetterConsole is my newest piece of software, recently released, which has brought this issue to the surface. You can find my SPIF files for it in two formats: The plist format might not be particularly easy to parse on non-NeXT/Apple systems, so I would be willing to write a converter that puts out a format that is easier to parse.

Discussion of SPIF format

Keep in mind that this is a work in progress and represents only a first pass effort at a format that contains sufficient information to satisfy most tracking software. Most tracking sites seem to consider a basic piece of software to have eight attributes: author, description, version, system, license, price, category, and package.

author

The author attribute identifies who owns the software. It does this via contact information by assigning values to the sub-attributes name (e.g., Joe Programmer) and location (e.g., http://www.someisp.com/~joe). Other sub-attributes such as email could also be added, though most tracking sites currently only seem to ask for the name and URL of the author.

description

The description attribute identifies the software in increasing levels of detail. Currently that is done with 3 sub-attributes: name, short, and long.

version

The version attribute identifies the version of the packaged software. This is done with 2 sub-attributes: revision, and status.

system

The system attribute identifies what operating systems the software is for. This is done with 2 sub-attributes: name, and version. It may be a good idea to make this an array instead, as some software may run on many different systems (the workaround would be to make a SPIF file for each system). It may also be a good idea to add further system requirements (RAM, HD, etc.), but that does not seem to be a major consideration from a tracking point of view.

license

The license attribute identifies the license the software is distributed under. This is done with 2 sub-attributes: name, and location.
purchase

The purchase attribute gives information on purchasing the software. This is done with 2 sub-attributes: price, and location.

category

The category attribute identifies how the software should be organized. In looking at the various tracking sites, there was really no consistency in the arrangement or naming of software categories. Additionally, most sites had additional fields for keyword descriptions of software (for search purposes). I'm hoping these features can be subsumed by this one category attribute. It should be considered a prioritized list of organization and search keywords. The software that scans the file should be able to look at this list and determine where it fits with all the other software that is being tracked. If it fails, I suppose it would be up to the tracker to either adjust the scanning software to be more robust or inform the developer of the error.

package

Leaving the most complicated for last, the package attribute identifies all the files that are associated with this software. Example sub-attributes are info, binary, and source. Each of these identifies a document that is related to this particular piece of software. That is done by further breaking that file information down to location, size, and checksum. I also included contact information, just in case the contact for, say, the binary might be different from the contact for the source, but I'm not sure that's really necessary.

[You can watch the original version of this document at http://www.subsume.com/spif/ for updates about the file format proposal. -- Ed.]
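To make the attribute list above concrete, here is a rough sketch of how one such record might look once it has been read into a program. It is only an illustration, not the actual SPIF layout: the nesting, the `REQUIRED` tuple, the `missing_attributes` helper, and every value are assumptions made for this example. Only the attribute and sub-attribute names come from the discussion above.

```python
# Hypothetical in-memory form of a software description record.
# Attribute and sub-attribute names follow the discussion above;
# all values are placeholders invented for illustration.
# Note: the overview lists the sixth attribute as "price", while its
# own section calls it "purchase"; this sketch uses "purchase".
REQUIRED = ("author", "description", "version", "system",
            "license", "purchase", "category", "package")

record = {
    "author": {"name": "Joe Programmer",
               "location": "http://www.someisp.com/~joe"},
    "description": {"name": "ExampleApp",
                    "short": "One-line summary.",
                    "long": "ExampleApp is a longer description."},
    "version": {"revision": "1.0", "status": "release"},
    "system": {"name": "ExampleOS", "version": "1.0"},
    "license": {"name": "Example License",
                "location": "http://www.someisp.com/~joe/LICENSE"},
    "purchase": {"price": "0",
                 "location": "http://www.someisp.com/~joe/buy"},
    # A prioritized list of organization and search keywords.
    "category": ["utility", "console"],
    "package": {
        "binary": {"location": "http://www.someisp.com/~joe/app.tar.gz",
                   "size": 12345,
                   "checksum": "placeholder"},
    },
}


def missing_attributes(rec):
    """Return the top-level attributes a tracking site would find absent."""
    return [name for name in REQUIRED if name not in rec]
```

A tracking site's scanner could call `missing_attributes(record)` before doing anything else and tell the developer which attributes still need to be filled in.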
Editor's Comments

Doc makes a good point -- hackers who would rather be coding have to take the time to submit announcements about their work both to general announcement locations (like freshmeat and comp.os.linux.announce) and to locations that match their specific area of interest (gtk.org or linuxgames.com). This is redundant and time-consuming work, and it would be good to reduce it as much as possible. The question is, how?

A pull proposal

As Doc points out, there are two ways of getting your information through -- you can push it, or you can let people pull it. His proposal for letting sites pull the information from you makes it possible that you would only need to keep one piece of information up-to-date at all the sites on which you want your project listed -- the URL of your project info file. If you want your site listed on freshmeat, gnu.org, and kde.org, you could just give them the URL, and, at a regular interval, each site would have a bot check if the file has changed. If it has, it would compare the file with the information in the site's database and submit any differences as change requests to be reviewed by the site's staff. If the regular interval is taking too long, you could click a button on the site to ask it to check your info file immediately.

You could even have an "info file URL" field in the info file, and you wouldn't have to go to the sites to keep the URL up-to-date. When you changed servers, you would put your files up at the new location and change the field in the file at the old location. The URL to the new info file would be submitted as a change request like any other. After allowing enough time for everyone to catch up, you could just remove the old files.

One big thing missing from his first draft is an "announcement" field. When you release a new version, you want it to show up on freshmeat's front page.
Using Doc's scheme, you should simply be able to change the necessary fields in your info file, including one that lists what's new in this version, ready to appear in an announcement on freshmeat's front page, in the newsletter, on the newsgroups, etc.

This brings us to a problem I see in doing this. Ask any of the freshmeat staff, and they'll tell you that the number of items that are submitted and approved without any changes is quite small. Sometimes, there are errors in spelling and grammar. Sometimes, it's not clear what the contributor is trying to say, and we have to work it out with him or her. Sometimes, there are just changes that have to be made to make the submission match our editorial policies -- for example, we don't allow the name of the project to appear in the short description, we insist that it appear in the long description (preferably in the first few words), and we don't allow HTML in descriptions. Now, let's say you make changes to your info file, and we pick them up as change requests. What do we do?
Even when we get past that point and your info file is acceptable to us, you'll check your mail in an hour and see messages from two other sites saying that they need you to change x and y. When you change x, site number 3 will be unhappy, and when you change y, site number 1 will be unhappy. At this point, you'll wish you were just going to Web pages and filling out forms again. The issues of what options to include in the file format can be overcome. Everyone who wants to take part in the system can get together and flame each other until they work it out. Dealing with policy issues and the editorial needs of all the sites is not as easy. You might end up having tag attributes to accommodate different sites:
<announcement site="freshmeat.net">
(text acceptable to freshmeat.)
</announcement>
<announcement site="linuxdoc.org">
(text acceptable to the LDP.)
</announcement>
etc. Whatever the solution, the problem would have to be dealt with. One size is not going to fit all.

A push proposal

Another idea that comes up from time to time is that of letting people submit information by email. Again, you would have a standard format for the information, only now it would be sent to the sites, where a script would parse it and submit the parts of it as change requests as needed. The advantage to this is that you no longer have multiple sites trying to get you to change your info file to match their needs. They each receive your request and can contact you with any problems they have. You could have your XML info file in your build directory and have a rule in your makefile with a list of the addresses to which it should be sent and a command that will send it. Then:
make submit

I like that just for the coolness factor. :)

Your proposals

I have two questions for everyone:
Doc O'Leary (droleary@subsume.com) is a COG in the machine of Subsume Technologies, Inc. (http://www.subsume.com/). He is lazy, and has thus been an advocate of free software since 1996 and of object-oriented development for nearly a decade.
T-Shirts and Fame!

We're eager to find people interested in writing editorials on software-related topics. We're flexible on length, style, and topic, so long as you know what you're talking about and back up your opinions with facts. Anyone who writes an editorial gets a freshmeat t-shirt from ThinkGeek in addition to 15 minutes of fame. If you think you'd like to try your hand at it, let jeff.covey@freshmeat.net know what you'd like to write about.
[»]
NO ! NO POLLING !! This is better IMHO ... Polling is of course an evil thing!
[»]
Trying to fix the mess in those packages... I see one other piece of information which may be interesting.
[»]
poll rates Someone mentioned that if a project wasn't updated in a while it would eventually be abandoned. Another idea was to specify how often to check back in the info file itself. I would suggest making the server figure it out itself. Each time it checks and finds no update, it can increase the wait interval. This would make it adapt to the update rate and would be nice because often when a project is started it has frequent changes and after it becomes more stable the updates are less frequent.
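A minimal sketch of that adaptive idea, for anyone who wants to see it spelled out. The function name, the doubling factor, and the one-day and thirty-day bounds are all assumptions chosen for the example; dropping back to the shortest interval when a change is seen is also an assumption, since the comment above only describes lengthening the wait.

```python
def next_interval(current_hours, changed,
                  minimum=24, maximum=24 * 30, factor=2):
    """Return how long to wait (in hours) before the next check.

    current_hours: the wait that was used before this check.
    changed: True if the info file differed from the stored copy.
    minimum, maximum, factor: hypothetical tuning knobs.
    """
    if changed:
        # Assumed behavior: an active project goes back to frequent checks.
        return minimum
    # No update found: back off, but never beyond the maximum wait.
    return min(current_hours * factor, maximum)
```

With these defaults a project that stops changing is checked after 1, 2, 4, 8, and 16 days, and then every 30 days, while any detected change brings it back to daily checks.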
[»]
KISS - keep it simple, stupid I don't think it's necessary to discuss a standard or interchangeability of the information. What is a nuisance instead is the login process, after which I am presented with the fields I have to change. "make announce" sounds cool anyway - to make it work I do want freshmeat to send back a form sheet of the latest announcement. The next time I am around to make a re-announcement, I can simply adapt the fields as long as the syntax is intuitive. And as long as it is intuitive, I don't care about a specific format or conversion tools - just edit the thing for each site as they want it, as long as I am able to cut and paste on my local computer. It would even be sufficient if I were allowed to paste the form information online into a single field. Still I have to login - an authentication matter. The authentication would be a bit complex with e-mail however (I still don't have a pgp-key). Well, no need for that if you go with the url+pull+trigger idea - this means that I just set up a website once (which would need authentication there), and then I push the announcement information in there - just a file anyway - and simply send a trigger impulse to freshmeat so they look up the url they have been made to know. And the trigger impulse is as simple as a "make announcement" call. And please, no need for micro-emails here - just make it a special cgi-url at freshmeat to trigger the pull, there are enough commandline utils that can do http-get.
[»]
Copy the Media! They copy one another. Web sites should also talk to each other...
[»]
Pulling content Why not go half-way at least? The announcements could be a new gnu (heh) standard file (ANNOUNCE) in the project directory with the same concept as ChangeLog, except the more human-centered version. This way, an author could _either_ enter an announcement _or_ point their freshmeat.net entry to their web/ftp-uploaded announcement file. Freshmeat would grab the most recent announcement and treat it the same way they currently are (as though the author had typed it in).
[»]
make submit make submit should be ``make petition'' or ``make announcement'' (harder to type) as it's closer to real-world language and scans better. You are petitioning the announce sites to publish your announcement/update.
[»]
re: scoring i think (registered fm) users should be able to vote on three things: documentation, stability, features (many or few could be good or bad). then the users could revise their score for a program if another release is better or worse. this would help me as a developer also so i could fix things because i never really get any feedback at all. i think most users get the impression that _all_ developers are very busy and are insulted to get mail from users. if someone doesn't like something i wrote, i want them emailing me why so i can do something.
[»]
Push VS Pull and making software harder to announce It would be much easier for maintainers of such databases (I keep such a
thing for Mac OS X/WebObjects stuff.. Doc approached me about this a week
or so ago) if this was a push situation rather than pull.
[»]
We should be making software HARDER to announce... All too often, writing free software seems like a form of masturbation.
Sites like freshmeat, sourceforge and advogato speak for the authors of
free software, but nobody speaks for the users of free software. Until
someone does so, free software will remain a niche market.
[»]
Responding to some of the issues raised
Push vs. Pull
Missing attributes But I do agree that it would be handy to have in the SPIF file itself. I like the suggested changelog/change suggestion, and I'll add it to the SPIF document as a proposal. Dependency information probably needs more discussion as to what you actually refer to as the dependency (I'd be inclined to go with a SPIF URL).
Site requirements/corrections, invalid submissions, etc.
Conflicting site requirements
File format
author/name Subsume Technologies, Inc. What's important is that it contains sufficient information. I think we're getting there, so it comes down to finding a base format that is expressive enough and easy enough to work with. I have no particular preference of my own. So my question to tracker sites like freshmeat and the developers who submit software is what kind of format are you willing to work with?
[»]
Not that problematic Well, perhaps it is. One thing, though, only the initial release of a
software package needs updates of most information. The announcements of
new versions would only include:
--
[»]
Previous nearly-sufficient attempts With some allusion to "Why doesn't everybody just build debians", there
have been a few similar attempts to define the metainfo of a package so
that it can be listed. I think this attempt is the closest. Why doesn't
everyone just build RPMs? Why doesn't everyone just build Unix PKG files?
SSOs? The list goes on.
--
[»]
deb files? It seems to me that if everyone would simply make debian packages we would have all the information about a package and also the ability to update easily.. or am i missing something?
[»]
How long to poll... The pull method should probably include a field to indicate how often to poll. A slow moving project may be happy with once a week. A fast moving project, once a day. Also there should be a rule to determine when to mothball a project. If no updates occur in 6 months, email the developer. If no updates occur in 7 months, turn off automatic polling.
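The mothball rule could be sketched roughly like this. The function name, the returned labels, and the 30-day approximation of a month are assumptions made for the example; only the 6-month and 7-month thresholds come from the comment above.

```python
from datetime import date


def poll_action(last_update, today, days_per_month=30):
    """Decide what a tracking site does with a project on this check.

    last_update, today: datetime.date values.
    days_per_month: hypothetical approximation of a month.
    Returns "poll", "email_developer", or "stop_polling".
    """
    idle_months = (today - last_update).days // days_per_month
    if idle_months >= 7:
        return "stop_polling"      # 7 months idle: turn off automatic polling
    if idle_months >= 6:
        return "email_developer"   # 6 months idle: ask whether it is alive
    return "poll"                  # otherwise keep polling on schedule
```

The per-project polling frequency (once a day, once a week) would be a separate field read from the info file; it is left out here to keep the sketch short.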
[»]
About the incompatibility between sites This shouldn't be a problem. I mean, we are always writing code that works on different platforms! I see two different solutions here:
Freshmeat is like the main site for Open Source software, so I would like to see a place on the site where developers can ask for help with a certain part of their project.
[»]
Dependencies! (mandatory and optional) It continuously amazes me that there are so many projects putting up the instructions on how to decompress a tarball source, but without any information on dependencies (or buried deep within documentation, or scattered among several places). What I'd like: a tag specifying build dependencies (i.e. on libraries) meant to be read by human beings. Example: Building foo app requires:
Also, why limit to one category? mpg123 could qualify under "audio" (it plays mp3) as well as "console" (it's a console app) and "streaming" (because it can play audio streams from the net). This would eliminate the need for a deep tree such as Freshmeat currently has. Think of them as metatags for software search engines: unlike pr0n spammers, I am confident that developers can make intelligent choices on categories. Finally, if you are searching for good examples of what information you should include, have a look at the GNU Free Software Directory.
[»]
Changelog missing A common piece of information distributed with announcements (e.g. on the
GNOME announcements mailing list) is a list of changes in 'bullet-point'
fashion.