January 20, 2007

Delivering Personally Scheduled Web Feeds

Chatting the other day with Nick Heap, Chair of the OU's Web Applications Development Certificate courses, about 'feedcycles' (like the OpenLearn_daily feeds) - 'static', pre-written web feeds whose content is delivered according to a schedule determined by the user when they subscribe - he pointed out yet another obvious and efficient way of doing part of the job if we're to deliver content at scale using feedcycles:

Simply run a job on each feed to create offline files/records that contain either a single item (one for each delivery item - so e.g. delivery_item(n) would contain the nth item) or the list of items to date plus the current one (so, for example, the delivery_to_date(3) record would contain the first three delivery items plus the current one, indexing from 0...).
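A minimal sketch of that offline job - in Python rather than the scrappy PHP I actually have, so treat it as illustrative; the delivery_item/delivery_to_date file names just follow the description above and aren't fixed:

```python
# Sketch of the offline "carve up the feed" job: given the ordered items of
# a feedcycle, precompute (a) a record per single item and (b) a record of
# all items to date plus the current one. File naming is illustrative only.

def carve_feed(items):
    """Return {filename: item list} records for a feedcycle's static files."""
    records = {}
    for n, item in enumerate(items):  # indexing from 0, as above
        records[f"delivery_item_{n}.xml"] = [item]
        # delivery_to_date(n) holds the first n items plus the current one
        records[f"delivery_to_date_{n}.xml"] = items[: n + 1]
    return records

records = carve_feed(["Intro", "Week 1", "Week 2", "Week 3"])
```

In a real deployment each record would be written out as a static XML file once, at publish time, rather than rebuilt on every poll.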

I've also started wondering what other options we might offer to allow users to schedule deliveries within their own feeds.

The options I've come up with include:

- the delivery time (i.e. the time of day each new item appears);
- the delivery start date (so for example I may want content to start next Tuesday; a 'header' item describing the feed would be made available immediately);
- the delivery period (i.e. how frequently new items should appear);
- the number of items in the first delivery (a user may want to start with several items immediately - hmm, so I need an 'Immediate start' button somewhere...);
- the number of items per delivery.
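To make that concrete, here's one way those options might map onto URL arguments - every parameter name here is a placeholder I've made up, nothing is settled:

```python
# Hypothetical subscription URL carrying the scheduling options; none of
# these parameter names are defined anywhere yet.
from urllib.parse import urlparse, parse_qs

url = ("http://example.com/feedcycle?feed=openlearn_daily"
       "&start=2007-01-23&time=07:00&period=1d&first=3&per=1")

params = {k: v[0] for k, v in parse_qs(urlparse(url).query).items()}
# start  - delivery start date; time - time of day each new item appears
# period - how frequently new items appear; first - items in first delivery
# per    - items per subsequent delivery
```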

Here's a demo: schedule my feed [URL subject to change]

I need to define some URL arguments to carry this information, which may be something the folks at FeedCycle would be interested in chatting about/would have an opinion on?

I also need a way for subscribers to a feed to - somehow - demand additional items. Hmm - maybe a getahead argument could do this? Users would add ?getahead=2 to the subscription URL to get the next 2 items over and above what they are already being fed. getahead would have to be cumulative relative to the original schedule - so to get the next two items over and above the additional two that have already been demanded would require the argument to change to ?getahead=4.

This suggests a model for updating feeds in a feedreader. A subscription option to 'get ahead in this feedcycle' would automatically take care of the getahead numbering. The reader would also have to not treat this as a completely new feed - you would want to preserve the read/unread state of items that have already been delivered via the feedcycle.

getahead could also take a negative offset - in effect a 'halt delivery' option that would delay the delivery of new items by a particular number of delivery slots.
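A rough sketch of how a cumulative getahead offset - positive to demand extra items, negative to delay them - might combine with the elapsed schedule. The function name and the day-based arithmetic are my assumptions, not a spec:

```python
# Sketch: combine elapsed time with a cumulative getahead offset to decide
# how many feedcycle items a subscriber should currently see.

def items_due(elapsed_days, period_days, getahead=0):
    """Number of items a subscriber should have received so far."""
    scheduled = elapsed_days // period_days + 1  # 'header' item counts as the first
    return max(0, scheduled + getahead)  # negative getahead delays delivery
```

So a week into a daily feed, ?getahead=2 pulls two items forward, and ?getahead=-3 holds the cycle back three slots.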

I suppose for completeness, you'd also need to support a 'pause' option, that would halt delivery of items until the user was ready to restart the cycle, at which point a relative negative getahead offset could be automatically generated.

And finally, a 'fast forward to end' or 'all' option that would deliver the complete contents of the feedcycle.

Hmm - this is possibly richer than I thought - and achievable too?

PS in reply to Stephen Downes on this post (whose commenting system doesn't work for me at the mo, which is why I'm posting here):

I've learned from experience that you don't want to "run a job on each feed" - or, more accurately, that you want such a job to involve as few CPU cycles as possible. After all, if each feed represents several database lookups, some processing, and some file writing (and in my case, emailing) then customized feeds begin to generate some serious overhead.

I agree with what you're saying about server overhead for feed delivery, Stephen.

What I was thinking was that when you register a feed *as a publisher*, it's processed to produce N (or 2N) flat files for an N item feed (newfeed_firstMitems.xml and newfeed_Mth_item.xml), each flat file containing either the first M items or the Mth item.

When a feedreader polls the feedserver, the current timestamp is checked against the one in the feed URL, offset items (one ahead, etc) accounted for, and the corresponding number of items to be delivered calculated. The corresponding thisfeed_firstMitems.xml (or thisfeed_Mth_item.xml) is then returned.
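In Python rather than PHP, that poll-time check might look something like this - the thisfeed_firstMitems.xml naming follows the scheme above, and the once-a-day period is just an example:

```python
# Sketch of the poll-time logic: compare the current time with the
# subscription timestamp carried in the feed URL, work out how many items
# are due, and return the corresponding precomputed flat file.

def file_to_serve(sub_timestamp, now, period_secs, total_items):
    """Map elapsed time since subscription to a precomputed flat file name."""
    elapsed = max(0, now - sub_timestamp)
    m = min(elapsed // period_secs + 1, total_items)  # first item is immediate
    return f"thisfeed_first{m}items.xml"

# e.g. three days and a bit into a daily, ten-item feedcycle:
name = file_to_serve(sub_timestamp=0, now=3 * 86400 + 10,
                     period_secs=86400, total_items=10)
```

Offset arguments like getahead would just be added to m before the clamp; the key point is that the poll itself does no database work at all.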

I see the carving up of the submitted feed as an offline, run-it-once service executed to generate static files that can be returned as required. Judicious use of metadata to tell feedreaders when it would be sensible to ping the feedserver should also be part of the mix. The path to a fully working, scalable system is likely to be a twisty one, but the payoff in terms of utility to users is worth it, I think?

As to the comment - "I refused to call them 'serialized web feeds' because RSS is not a 'web' technology" - ok, slack use of language on my part, but I'm trying to get away from saying RSS feeds... ;-)

Posted by ajh59 at January 20, 2007 12:51 PM

PLEASE - Can anyone help ? (RSS FEED ...using Feeder)
I've built our website and before launching into the podder sphere I wanted to check out the RSS Feed and have bumped into some problems...
We're using Feeder (Reinvented Software) to create the .xml files which it seems to produce okay - publishing is causing me a problem and yet the ftp set-up is correct.
I've been trying to sort this out for ages and don't know anyone who could talk me through this program - do you know anyone who could help ?
Paul Weller, Photo-Journalist - D/Line +44 (0) 1386 841 490
E-Mail: paul.weller@itinerantpodcaster.com
Internet: www.itinerantpodcaster.com

Posted by: Paul Weller at January 23, 2007 06:31 PM

I've not used that programme [http://reinventedsoftware.com/feeder/] (I still haven't got into the podcasting groove...)

What's the problem? Have you managed to ftp something else to your server successfully? Are you sure a firewall isn't blocking the ftp process?


Posted by: Tony at January 23, 2007 07:01 PM

I'm looking for something similar. Let's say a faculty has materials prepared for class that can be distributed via RSS - audio files, PDFs, etc. We want to update the feed weekly, such that each RSS update contains the content for that week. So we could enter the start date of the term, and somehow automagically generate the new updates on a weekly basis. It wouldn't be dependent on when the user subscribes; rather, it would be a relative date update based on a set interval (weekly) and start date. Make sense? Do you know of any tool out there for this sort of thing?

Posted by: todd at March 4, 2007 03:08 PM

The way that feedcycle.co.uk works, and the approach I was going to take, is to include a timestamp of when the subscription was made in the subscription URL.
Every time the feed is polled, a comparison is made between the current time and the subscription timestamp, to work out how many items to include in the feed.

For your purposes - a fixed start date with items released every week - you would just hard-wire the subscription timestamp in the subscription URL to the date of the start of the course.
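For the fixed-start, weekly case the count is just date arithmetic - a sketch with made-up dates and a hypothetical function name:

```python
# Sketch: with the course start date hard-wired as the subscription
# timestamp, the number of weekly updates visible is pure date arithmetic.
from datetime import date

def weeks_released(course_start, today):
    """How many weekly updates should be visible by `today` (week 1 on day 0)."""
    if today < course_start:
        return 0
    return (today - course_start).days // 7 + 1

n = weeks_released(date(2007, 1, 8), date(2007, 1, 30))
```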

I have fragments of scrappy PHP that I keep meaning to tinker with that will serialise an arbitrary feed according to subscription time etc.

You're welcome to it, such as it is (little more than PHP time() function examples, really...).


Posted by: Tony Hirst at March 4, 2007 06:16 PM