Acts of Volition

Comments

Dave -

I guess I'd better get our blog platform running on PostgreSQL.

Steven Garrity -

There goes my facade that I actually do this stuff myself.

mpt -

1. Does “our weblogging system” have a name?
2. What’s the proprietary code you’re relying on?
3. Well done with your URIs! I hummed and haahed about archives/, and eventually decided against it, because it makes posts look old even if they were posted five minutes ago. (The flip side of that decision would be that root-level categories couldn’t have a wholly numeric name, because it might clash with the number of a year.)
4. I did indeed mention “support for the old URLs of imported content”: “Keep URLs the same for legacy entries”.

Steven Garrity -

mpt, the weblog system we're referring to is one developed with my friends and co-workers at silverorange. It was developed without any intention of commercializing it. The main goal was to set up a weblogging system that we could use for ourselves (mainly Acts of Volition and a few other sites).

The "proprietary code" that is used is a web-application development platform that we use at silverorange.

We've got five weblogs running on the system (see the list). We'll probably have a page up soon with a description and feature list.

Lou Quillio -

MPT's all-in-one-place thinking on this subject is a boon, collecting some of my thoughts on a durable CMS schema, adding others and saving me some brain cells. I'll be referring to it plenty.

What I'm still not seeing clearly is category faceting, with an eye toward multiple category assignment _plus_ hierarchy ... plus future category reassignment, plus avoiding architecture astronautics.

Any thoughts here? I might, for instance, start out with the top-level categories WebDev and CSS2, and rightly assign a post to both. Later I might decide that CSS2 is subordinate to WebDev. I don't want to manually reassign posts, rather just rearrange my hierarchy, so category assignment at the post-level can't be slave to the changing hierarchy.

I suppose this can be solved by assigning categories to posts as a delimited string of equivalent category names, without regard to hierarchy. The string is parsed, and archives are served by considering the current state of the hierarchy -- finest to coarsest granularity. Where once WebDev and CSS2 were peers, CSS2 has become a subordinate (more fine-grained than WebDev by a factor of 1) -- but that fact is transparent to the post records themselves.

This would allow rich, free-form assignment of multiple categories *and* free-form hierarchy tweaking (really alternate "views") without welding categories as posts understand them to categories as the hierarchy does.
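
The scheme above could be sketched roughly like this in Python. This is only an illustration under assumed names: the `|` delimiter, the `posts` and `hierarchy` dictionaries, and both function names are hypothetical, not part of any system mentioned in this thread.

```python
# Hypothetical data: each post carries a flat, delimited category string
# that knows nothing about hierarchy.
posts = {
    1: "WebDev|CSS2",
    2: "CSS2",
    3: "WebDev",
}

# The hierarchy lives apart from the posts: child -> parent (None = top level).
# At first, WebDev and CSS2 are peers.
hierarchy = {"WebDev": None, "CSS2": None}


def ancestors(category, hierarchy):
    """Return the category followed by its ancestors, finest to coarsest."""
    chain = []
    seen = set()  # guards against an accidental cycle in the hierarchy
    while category is not None and category not in seen:
        seen.add(category)
        chain.append(category)
        category = hierarchy.get(category)
    return chain


def posts_in(category, posts, hierarchy):
    """Ids of posts tagged with the category or with any of its descendants."""
    matched = []
    for post_id, raw in sorted(posts.items()):
        names = [name for name in raw.split("|") if name]
        if any(category in ancestors(name, hierarchy) for name in names):
            matched.append(post_id)
    return matched
```

Making CSS2 subordinate to WebDev is then a single change, `hierarchy["CSS2"] = "WebDev"`, after which `posts_in("WebDev", posts, hierarchy)` also picks up posts tagged only CSS2, while the post records themselves stay untouched.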

Hmmm ... okay, never mind. I think I figured it out. I think. Thanks.

LQ

Steven Garrity -

Right on Lou.

Epinions does this very nicely. When you browse something like "Laptops", you are presented with "sub-categories" like Brand, Screen Size, and Price Range. If you click "Brand" you'll see "Screen Size" and "Price Range" as "sub-categories", but if you had clicked "Screen Size", you'd see "Brand" and "Price Range" as "sub-categories".

They are really more like "attributes" than "categories". We had an internal debate about what to call them (though we aren't nesting, yet). We settled on "categories" because it is a recognized convention in weblog archives. However, "attributes" is more accurate.

It's not flat. Taking Epinions for example, the category of laptops with 15" screens is obviously a child of the "laptops" group, but you can narrow down sibling categories in a way that makes them appear as though they were a tree-structure.
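
A rough sketch of that sibling-narrowing behaviour, in Python. The laptop records, the `FACETS` tuple and both function names are made up for illustration; nothing here reflects how Epinions or silverorange actually implement it.

```python
# Hypothetical catalogue: every item carries independent attributes (facets).
FACETS = ("Brand", "Screen Size", "Price Range")

laptops = [
    {"Name": "Laptop A", "Brand": "Acme", "Screen Size": '15"', "Price Range": "Under $1500"},
    {"Name": "Laptop B", "Brand": "Acme", "Screen Size": '12"', "Price Range": "Over $1500"},
    {"Name": "Laptop C", "Brand": "Globex", "Screen Size": '15"', "Price Range": "Over $1500"},
]


def narrow(items, chosen):
    """Keep only the items that match every attribute value chosen so far."""
    return [
        item for item in items
        if all(item.get(facet) == value for facet, value in chosen.items())
    ]


def remaining_facets(items, chosen):
    """Facets not yet chosen, each with the values still available after narrowing."""
    narrowed = narrow(items, chosen)
    return {
        facet: sorted({item[facet] for item in narrowed})
        for facet in FACETS
        if facet not in chosen
    }
```

Whichever facet is picked first, the other two show up as the apparent "sub-categories", which is what makes a flat set of attributes look like a tree.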

For examples of this in practice see the Epinions Laptop section or my friend Nick's photo gallery.

silverorange will be using this technique in some of our future e-commerce sites as well.

Make sense?

Daniel Burka -

Good post Steve. Perhaps some sort of spell-checking tool (I'm aware how difficult this is with current technologies) should be involved in any good web blogging tool. Of note, you made about 5 typing errors in this post alone.
- end-users of you system
- and will be implementing thanks for Thomas’ suggestion
- the robot’ suggestion does’ work out (2 errors here)
- Something Thomas didn#8217;t mention, support for the old URLs of imported content - we#8217;re handling

Steven Garrity -

Thanks Daniel. Perhaps we should implement the Distributed-Remote-Daniel-Burka-Weblog-Proofreading-Protocol (DRDBWPP).

Lou Quillio -

SG: What I'm thinking of, though, is a case where post (item) records contain a string of delimited category names (better "attributes," as you say) that category-interpreting code applies a set of heuristics to. Posts don't belong to categories, categories belong to them. And categories don't own posts.

I'm sure we're talking the same language, but what I'm chewing on now are the programmatic nuts and bolts, and database schemas. The category-interpreting module (and its UI, and its very identity) is the real challenge. Instructions to the user are to slap as many categories on an item as are germane, and to tier categories if desired. Some will make a real mess.

And the category-interpreting module must impose order. It must control category creation and insist that they are distinct. It must perform all global name changes. It must be air-tight against anomalies. Hardest, it must make sense. I've never seen a UI like this done well in GUI, far less in browser-limited controls. Jesus, that'd be a fun challenge.
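
To make those duties concrete, here is a minimal Python sketch of such a module. It covers only controlled creation, distinct names and global renames; the class name, its methods and the `|` delimiter are all assumptions for illustration, and the hard part (the UI) is not addressed.

```python
class CategoryRegistry:
    """Hypothetical gatekeeper for category names stored as delimited strings."""

    def __init__(self, delimiter="|"):
        self.delimiter = delimiter
        self.names = set()
        self.posts = {}  # post id -> delimited category string

    def create(self, name):
        """Register a new category, insisting that it is distinct."""
        key = name.strip()
        if not key or self.delimiter in key:
            raise ValueError(f"invalid category name: {name!r}")
        if key.lower() in {existing.lower() for existing in self.names}:
            raise ValueError(f"category already exists: {key}")
        self.names.add(key)
        return key

    def tag(self, post_id, *names):
        """Attach only known categories to a post."""
        unknown = [name for name in names if name not in self.names]
        if unknown:
            raise KeyError(f"unknown categories: {unknown}")
        self.posts[post_id] = self.delimiter.join(names)

    def rename(self, old, new):
        """Rename a category everywhere, so posts never hold a stale name."""
        if old not in self.names:
            raise KeyError(old)
        self.names.remove(old)
        try:
            new = self.create(new)
        except ValueError:
            self.names.add(old)  # roll back if the new name is rejected
            raise
        for post_id, raw in self.posts.items():
            parts = [new if part == old else part for part in raw.split(self.delimiter)]
            self.posts[post_id] = self.delimiter.join(parts)
```

Because posts store names rather than ids in this sketch, every rename has to touch every affected post; storing category ids instead would avoid that, at the cost of a lookup when displaying them.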

Yeah, so that's it: Items are dumb and have moods and topical concerns -- and dates and titles, let's don't forget. But they also don't need to think about their "recent posts" browsability. A separate control structure interprets and serves categories to the category-minded user.

Hold it.

Am I the only one thinking that this whole business of CMS item categorization will mean nothing once the average user learns to construct a focused site-search string? Never mind. Different subject.

How is this done:

http://gallery.whitelands.com/photos/galleryinfo

I'm interested to know if the rich hierarchy is embedded in the database schema or the code.

LQ

Alan -

I don't want spell checking in a blog. A blog is often not formal writing, and if there are spelling errors they are most usually of the dashed-off-note sort of thing. Spell check also introduces errors based on using an improper but properly spelled word. It also fails on the standardization of linguistic variation - i.e. Canadian English is not recognized. It would also take away all the "fun" for people who count alleged spelling errors in other people's posts.

Alan -

Also, it falsely places spelling errors above poor grammar and lack of clarity, both of which are more important obstacles to conveying meaning.

Steven Garrity -

Lou, I think this type of categorization can be relatively simple. Also, having the capability to nest attributes doesn't require you to. You can always just have one flat set instead.

Here's a screenshot of the administration system for the photo gallery that might help. On the left, you see a hierarchy of the "categories". On the right, you see an "Add New Photos Page" scrolled down to the area to select attributes (called "galleries" in this system).

In the example in the screenshot, I've selected "Halifax Trip" under "Trips & Travel" and "Dan", "Isaac", and "Steven" under "People". This would indicate that this photo was from our Halifax trip and includes Dan, Isaac, and Steven.

Lou Quillio -

Steven: Screenshot's worth 1,064 words. Thanks.

abhi -

My blogging system is in ASP and uses an Access database as of now.
Supports - Archives, RSS, Calendar, etc.

I'll be soon making it public as right now I am busy working on a Generic Database Editor.

Lou Quillio -

abhi: I'll be soon making it public as right now I am busy working on a Generic Database Editor.

I quit ASP a while back, but check James Shaw's work on an ASP CMS at CoverYourASP.com. Mature code, full source, well documented, and (bonus) a generic table editor.

LQ

abhi -

heh.

Nice. But it feels better to use your own script/program. And then I can just put in a scaled down version of the same for other scripts/projects that I make public. :)

Regards,
Abhi

Stephen DesRoches -

spell checking makes more sense at the browser level rather than in the blogging system

Steven Garrity -

Re: Stephen DesRoches - I totally agree that spell checking makes far more sense at the browser level. However, it makes even more sense at the operating system level, as illustrated by the use of OS X's system-wide spell check in Safari. One interface and dictionary across all programs.

Alan -

Just make sure the dictionary and the rules of usage are good - unlike MS Word now. You either spend time now correcting the errors in MS Word's dictionary or, worse, you come to believe it to be a superior source of good usage and spelling.

Keir -

Having recently finished reading We blog: Publishing online with weblogs (http://www.amazon.co.uk/exec/obidos/ASIN/0764549626/ref=sr_aps_books_1_1/202-3430118-2471861), I went about setting one up for myself. Having read the book, Movable Type seemed the best bet, but my limited experience of PHP and MySQL put me off attempting to install it. I may revisit this in time, as their feature list is impressive. Eventually I decided to write my own basic system using ASP. Although you are able to replicate common features of popular weblogs, the one thing I really struggled with is URLs.

I believe Apache has a built-in feature (mod_rewrite?) enabling you to pass parameters through the URL without the use of the ?. This feature is available in IIS, but as a purchasable add-on which needs installing on your live server or, in my case, external web host. As a result I have to make do with ?postID=XYZ, which is a major compromise. If anyone has used ASP and solved this problem, any pointers would be gladly received.

In terms of scripting languages, I noticed that this blog used to appear in CFM but is now in PHP. As ASP is on its way out, I wonder whether to begin learning ASP.NET, which probably has less of a learning curve, or move over to PHP, which is more open source. I like the idea of ASP.NET separating the code from the HTML a lot, but PHP seems to be on the rise. Steven - Out of interest, what made you move to PHP?

Finally - Any chance of the feature list promised above?

abhi -

hey Keir,
that ?id=1239 in place of /2003/10/13 got me bugged for some time too.

if your server admin gives you a custom 404 page, you can modify that to suit you.

I'm currently doing a makeover of my CMS and I'll be using querystrings like:

?d=2003/05/22 - for the posts and
?d=2003/05 - for the monthly archives

then use RegExps to break it up and do the stuff. This obviously looks better than ?id=50.
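
A minimal sketch of that regular-expression split, written in Python rather than ASP. The function name `parse_d` and the tuple shapes it returns are assumptions for illustration; only the `?d=2003/05/22` and `?d=2003/05` formats come from the comment above.

```python
import re
from urllib.parse import parse_qs

# Year, month, and an optional day, separated by slashes.
D_PATTERN = re.compile(r"(\d{4})/(\d{2})(?:/(\d{2}))?")


def parse_d(query_string):
    """Split '?d=YYYY/MM[/DD]' into its parts, or return None if it does not fit."""
    values = parse_qs(query_string.lstrip("?")).get("d")
    if not values:
        return None
    match = D_PATTERN.fullmatch(values[0])
    if match is None:
        return None
    year, month, day = match.groups()
    if day is None:
        return ("month", int(year), int(month))  # monthly archive
    return ("post", int(year), int(month), int(day))  # posts for a single day
```

For example, `parse_d("?d=2003/05/22")` yields a day-level request and `parse_d("?d=2003/05")` a monthly archive request; anything else falls through to `None` so the caller can show an error page.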

Good luck.

Lou Quillio -

Keir: If anyone has used ASP and solved [the messy querystring] problem any pointers would be gladly received.

The way to do it on a hosted IIS domain is with a custom 404 script.

You decide on a virtual directory structure that describes the hierarchy of your data. These "clean URLs" don't actually exist on your site, so requests for them are kicked to the 404 script. But before displaying the 404 message, your script grabs the originally requested URL and examines the virtual path/filename. If it can convert the path/filename requested into a meaningful querystring-style URL, it redirects to that page; if not, it goes ahead and throws the 404.

It's actually pretty easy to do (though there will be performance issues), just takes some planning.

Conceive the virtual path/filename scheme such that they include everything you'd need to construct a query from the parts. If you get a request for, say

http://base/2003/05/02/oneMeaningfulWordFromTheTitle

your 404 handler could field-strip that into

SELECT id FROM tablename WHERE yr='2003' AND mo='5' AND dy='2' AND title LIKE '%oneMeaningfulWordFromTheTitle%'

Empty result set? Throw the 404. Found it? Redirect to the querystring-style URL that your CMS understands.
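
Here is a rough sketch of that lookup step in Python, with SQLite standing in for whatever database the CMS really uses. The `posts` table, the function name `resolve_clean_url` and the `/index.asp?id=` target are assumptions; the `yr`, `mo`, `dy` and `title` columns follow the query above. How the handler obtains the requested path from the server is left out.

```python
import re
import sqlite3

# /YYYY/MM/DD/word, where word is a single alphanumeric token from the title.
CLEAN_URL = re.compile(r"/(\d{4})/(\d{2})/(\d{2})/([A-Za-z0-9]+)")


def resolve_clean_url(path, conn):
    """Map a clean path to a querystring URL, or return None to signal a real 404."""
    match = CLEAN_URL.fullmatch(path)
    if match is None:
        return None
    year, month, day, word = match.groups()
    # Parameterized query: the URL is user input and must never be pasted into SQL.
    row = conn.execute(
        "SELECT id FROM posts WHERE yr = ? AND mo = ? AND dy = ? AND title LIKE ?",
        (int(year), int(month), int(day), f"%{word}%"),
    ).fetchone()
    if row is None:
        return None
    return f"/index.asp?id={row[0]}"
```

The caller would issue a redirect when a URL comes back and send the ordinary 404 response when it gets `None`. Restricting the last path segment to letters and digits also keeps LIKE wildcards such as `%` and `_` out of the pattern.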

The other side of the coin is that your CMS should reverse the process when it forms internal links. So when it's outputting the permalink for an item, say, it shouldn't set an href of

http://base/index.asp?id=7

but rather it should do a lookup on the id #7 record and explode the date into a virtual path and the title (perhaps) into the OneMeaningfulWord/filename. This way your messy querystring URLs aren't released into the wild.
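
The reverse direction might look something like this in Python. The stop-word list, the "longest remaining word" rule and the function names are purely illustrative assumptions; any rule works as long as the 404 handler can map the result back to one record.

```python
import re
from datetime import date

# Hypothetical list of words too common to identify a post.
STOP_WORDS = {"a", "an", "the", "of", "and", "to", "in", "on", "for"}


def slug_word(title):
    """Pick one meaningful word from the title: here, the longest non-stop word."""
    words = re.findall(r"[A-Za-z0-9]+", title)
    candidates = [word for word in words if word.lower() not in STOP_WORDS]
    if not candidates:
        return "post"  # fallback for titles with no usable word
    return max(candidates, key=len).lower()


def permalink(base, posted, title):
    """Explode the date into a virtual path and append the chosen title word."""
    return (
        f"{base}/{posted.year:04d}/{posted.month:02d}/{posted.day:02d}/"
        f"{slug_word(title)}"
    )
```

So a record dated 2 May 2003 and titled "Notes on the Weblogging System" would be linked as `http://base/2003/05/02/weblogging` instead of `http://base/index.asp?id=7`.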

Did I mention there are performance issues? Obviously there are, but if you think things through this method will work, and nearly every IIS host allows custom 404 scripts.

Brad Pineau -

I'm standing behind MySQL, all the way. Steven, I say you should consider making the blogging system work with MySQL - as this is the only database supported by a lot of cheap web hosting companies. You'd be surprised what it can handle.

Keir -

Lou - Thanks a lot for your entry relating to the custom 404 error scripts. I had never thought of using the 404 for that reason but it seems so obvious after reading your outline. I will need to redo the way I handle dates but that is not a major problem with so few entries.