Aggregated publishing system

Posted on May 23, 2007. Filed under: Atom, RSS, binserver, cms, content aggregation, media CMS, open source content managment system, podcast directory, publishing, video cms |

We’re designing another topic-based publishing system for a client who wanted to know how news items are moderated. In other words, how does the system pick and choose what the readers see?

Will the news pick up articles like this? Do we have to “crawl” the [omitted] site for this stuff?

http://www.blahblahba.com/article/email/idUSN22467289xyz

ANSWER:

Generally, ‘crawling’ is where you have bots or your machine agent go from one site to the next and follow links based on predefined criteria or, as Google calls it, “an algorithm.” It’s just like wandering the net and taking notes on everything.

So while we can do it in theory, we aren’t going to ‘crawl’ anything. You don’t want to do that, as it is resource intensive and ineffective unless you have serious resources to put behind it.

The answer to your question, however, is ‘yes’ and ‘no’. The following is the only coherent answer I can give you that I know to be true right now. I know a lot about this, but not as much as some others.

First, it will aggregate EVERYTHING we specifically tell it to: any feed we put in the system will be aggregated. So if blah (all of blah) is being aggregated, then of course the feed item you’re pointing to will be one of them.

However, as I said, there is a filter system. You do not want every feed published even if it is aggregated and in the database. That’s pointless, as you know. You only want feeds published that are topic based and that you subscribe to on purpose because they are written specifically for your subject, OR you want individual feed items [ITEMS ARE NOT FEEDS] published and promoted based on filters. The items are scanned by a filter system that you/we/your designers are going to have to set up and maintain all the time. The filter system is only good if you work it and refine the results, which is not all that much different from what Google does. Actually, it’s exactly what Google does.

It works like this. The filter system looks at a pool of x feed ITEMS and scans for keywords or phrases. Then, when it finds a hit, it can do one of three things: 1. publish the item, 2. publish it and put it in the moderation queue for a human editor to approve or not, or 3. publish AND PROMOTE the item to the front.
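To make the three outcomes concrete, here is a minimal sketch of that kind of keyword filter. It is not the code of the actual system; the names `Action`, `FilterRule` and `match_item`, and the idea that the first matching rule wins, are assumptions made up for this illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    """The three things the filter can do on a hit (names are hypothetical)."""
    PUBLISH = "publish"
    MODERATE = "publish and queue for an editor"
    PROMOTE = "publish and promote to the front"


@dataclass
class FilterRule:
    phrase: str      # keyword or phrase to scan for
    action: Action   # what to do when the phrase is found


def match_item(title, body, rules):
    """Return the action of the first rule whose phrase appears in the item.

    Matching is a plain case-insensitive substring test. Returns None when
    no rule hits, meaning the item stays aggregated in the database only.
    """
    text = f"{title} {body}".lower()
    for rule in rules:
        if rule.phrase.lower() in text:
            return rule.action
    return None


# Usage: rules are checked in order, so put the most specific ones first.
rules = [
    FilterRule("video cms", Action.PROMOTE),
    FilterRule("podcast", Action.MODERATE),
]
print(match_item("New video CMS released", "", rules))  # Action.PROMOTE
print(match_item("Weather report", "Rain", rules))      # None
```

A real filter would probably want word-boundary matching and some scoring rather than a bare substring test, which is part of what "working it" means in practice.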

So, to recap: you can set up certain feeds where all the feed ITEMS are published and promoted. I would call these your featured newsfeeds. You can also subscribe to non-topic or general feeds and apply the filter, which will help you pick out the items you want to promote to the front. The filter can also just promote them automatically. It can also be set to promote the items from some feeds (trusted feeds) and queue others for moderation, and so on.
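The recap can be sketched as a small routing function. This is only an illustration under my own assumptions: the `FeedKind` categories, the `route_item` function and the "promote" / "moderate" / "hold" labels are hypothetical names, not part of the real system.

```python
from enum import Enum


class FeedKind(Enum):
    FEATURED = "featured"  # topic feed: every item is published and promoted
    TRUSTED = "trusted"    # filtered feed whose hits are promoted directly
    GENERAL = "general"    # filtered feed whose hits wait for a human editor


def route_item(feed_kind, text, keywords):
    """Return "promote", "moderate" or "hold" for one feed item."""
    if feed_kind is FeedKind.FEATURED:
        return "promote"  # featured feeds skip the filter entirely
    lowered = text.lower()
    hit = any(keyword.lower() in lowered for keyword in keywords)
    if not hit:
        return "hold"  # aggregated and stored, but not promoted
    return "promote" if feed_kind is FeedKind.TRUSTED else "moderate"


# Usage: the same item is treated differently depending on its feed.
keywords = ["podcast"]
print(route_item(FeedKind.TRUSTED, "New podcast directory", keywords))  # promote
print(route_item(FeedKind.GENERAL, "New podcast directory", keywords))  # moderate
```

Keeping the decision per feed means the human editor only sees hits from feeds you do not fully trust yet.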

FYI, it works well. I’ve set up sites as experiments that don’t use the human editor part (I don’t advise this, btw) and it more or less works, but you have to pick out feeds that are relevant. Adding the human is the key here, and one person working 30 minutes every other day can do all that’s needed to make the site contain tons of very relevant, non-spam content. You should read that blog post where I answered the charge that I was a splogger. I’m the one who wrote it, and the so-called review of the system’s effectiveness was not something I solicited, but under the circumstances I decided to go with it. FYI, this was before we had any kind of filter system designed.

I hope this helps you understand how it works.

Now, there is stuff I can’t really explain right now. As I’ve said, I need to get someone to begin the work on the podcasting and feed aggregation systems, to make the first and refine the second. I also want you to know we’re continuing to refine the look and feel based on your original design, but we are going to be adding panels and other theme/content systems to create a series of feed blocks.
