Google catalogues

I know it’s unusual for me to make posts about Google or any kind of web thing at all, but I can’t remain silent any longer. I won’t attempt to use the word “semantics” because I don’t know what it means (I have since looked it up). With that being said, I will begin.

Maybe a month or so ago, maybe more, I noticed that on Google’s front page they spoke of catalogs.google.com that apparently would let me browse mail-order catalogues they have scanned. I was fascinated, and amused. Tease me if you like if you’ve seen Fight Club, but since my girlfriend and I are getting an apartment next year, I went to the Ikea Catalogue and began perusing. It’s really quick, and offers a bunch of different page views. It will never replace the feeling I had at age ten, looking at Lego in the WishBook, but it’s still a fun service.

The ability to do a keyword search was quite interesting, and remarkably accurate considering they’re dealing with tiny-scanned-smudgy magazine font. I did get a few red herrings, but it made leafing more efficient.

Semantics.

 

BlogML: The Weblog Markup Language

A simple thought experiment in weblog semantics

1:37 PM – Joe makes a post on his blog about his golf vacation on Prince Edward Island

3:24 PM – Joe makes a post on his blog about hang gliding in Newfoundland

Two weeks pass.

Sam searches Google for hang gliding in Prince Edward Island. Joe’s blog is the first result.

The trouble with this picture is that Joe never wrote anything about hang gliding on Prince Edward Island. The thought never even crossed his mind.

This is a fundamental problem with searching by keywords – the page is not necessarily the finest unit of web content. Often, especially on weblogs, any given page will have dozens of completely independent posts – made by different authors, on different days, about different topics. Google has no way to tell one post from another and can only link to general archive pages rather than to individual posts.

With all the talk about separating design from content and encoding semantics, it occurred to me that this current level of separation isn’t particularly useful to most of us. It can help with accessibility, which is important; but the cruel fact is, that for truly semantic code to become universal, people are going to have to see concrete results (ie. cool stuff happening).

BlogML (weblog markup language) doesn’t exist yet (as far as I know), but I think it could save us.

The average weblog has a relatively simple set of fields for each post: title, author, date/time, permanent URL, # of replies, URL of replies, and main content (I’m sure I’m missing some, but you get the idea).

If we could somehow code our weblogs with this structure, Google and other services would be able to see the content as it really is: a loose collection of independent posts. When Google indexes weblog archives, search results could include individual blog posts with the appropriate links (rather than linking to an archive page with 30 posts).

This is not a new idea – it’s the semantic web and it’s been coming for a while. Here’s the key: the weblog community is in a unique position to effect significant change through nimble and collective action.

Think about it – if some kind of markup could be defined for this, it wouldn’t take years to be adopted (like most standards). Rather, it would take the cooperation of a few key weblog players. If Blogger, Movable Type, GreyMatter, and the UserLand suite all started pumping out BlogML-enhanced HTML, it would be instant critical mass. The majority of weblogs would be on board in a matter of days and it wouldn’t take long for the rest of us to jump on the bandwagon. Search tools like Google, BlogDex, and DayPop would be able to offer better search results to their customers.

Webloggers are in a unique position to take collective leaps and bounds towards the semantic web.

Ok, so how do you actually do this? What is BlogML? What does it look like? I’m not sure – I’m shooting from the hip here. Perhaps a simple markup could be hidden in HTML comment tags. Or perhaps a set of reserved keyword DIV and SPAN titles could be established (e.g. <div id="BlogMLtitle">), as this would be an appropriate use of the ID attribute according to the W3C specs.
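As a purely hypothetical sketch (the names here are my own illustration, not any real specification), a BlogML-flavoured post might look like this:

```html
<!-- Hypothetical BlogML markup: reserved names let an indexer
     see each post as an independent unit with its own fields. -->
<div class="BlogMLpost">
  <span class="BlogMLtitle">Hang gliding in Newfoundland</span>
  <span class="BlogMLauthor">Joe</span>
  <span class="BlogMLdate">2002-03-14 15:24</span>
  <div class="BlogMLcontent">Today I went hang gliding...</div>
  <a class="BlogMLpermalink" href="archives.html#post42">permanent link</a>
</div>
```

One wrinkle: since an archive page holds many posts, CLASS attributes (which may repeat) would actually be a safer choice than ID attributes (which must be unique on a page).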

Implementation can be dealt with. What I want to know first is: does this make sense? Why hasn’t it already been done? Has it already been done? What am I overlooking? I look forward to your feedback.

 

Simple for sale.

37signals.com
The web interface specialists at 37signals have turned their experience on themselves and served up a nice redesign of 37signals.com. The old site lives on as the 37signals manifesto (worth reading for any web professional).

With the new site comes a White Paper on Contingency Design (‘design for when things go wrong’) that builds on the examples of contingency design, good and bad, at the previously linked Design Not Found.

The guys at 37signals also maintain a fine blog, Signal vs. Noise.

 

when I grow up I’m gonna be a physicist

Brian Greene's The Elegant Universe
While aov rages with debate about web standards and development techniques, a regular reader speaks up for those who aren’t enthralled by cascading style sheets and asks us to throw them a conversational bone. Fair enough; how’s general relativity for a Sunday conversation topic?

Brian Greene’s The Elegant Universe is an introduction to string theory, the latest stepping stone in the scientific quest for the ‘unified theory of everything’. The book is intended for a general audience without an academic background in physics (that describes me pretty well – Physics 101 was at 8:30AM and my hair was always frozen from a quick shower and cold walk to the campus).

Greene’s book has been worth the price if only for the simple and graceful explanation of Einstein’s Theory of General Relativity in the first few chapters. Taking the leaps from Newton to Einstein as if they were steps in a walk through the garden, Greene gives a sort of history of modern physics for dummies. His ability to simplify and explain the abstract concepts of relativity reminds me of how C.S. Lewis can simplify theology without sacrificing the integrity of the topic.

I remember torturing my high school physics teacher when learning about the laws of gravity. ‘What is gravity?’ I would ask. The answer would come straight from the textbook: ‘Gravity is the attraction between any two massive bodies, which is directly proportional to the…’

Of course, this is a descriptive definition. ‘Yeah, I know what it does, but what is it?’ I was genuinely curious, though my primary motivation was to illustrate that the teacher didn’t know – nobody knew.

Looking back, I give my teacher credit for not screaming that gravity was the warping of space-time – my puny 16-year-old brain was incapable of grasping the concept. She was a good teacher. Conceding to our weak collective attention span, she read to us from a book called Einstein’s Dreams.

I feel, having begun The Elegant Universe, that I’m getting the answer to questions like ‘what is gravity?’ as well as they can be answered by modern physics to someone with no academic background.

Greene steers clear of the arrogance (however justified) of Stephen Hawking’s A Brief History of Time. Hawking’s writing leaves the reader feeling as though the author had to be persuaded by editors that it was worth writing for the average reader. Greene seems to relish the potential of the much broader impact of simple and accessible writing.

Most fascinating of the many revelations in these first few chapters is the compounding of human achievement. Einstein found out what was wrong with Newton’s ideas (a profound discovery, of course) just as today’s physicists are trying to find out what’s wrong with Einstein’s ideas.

I am constantly amazed by this compounding effect of scientific and technological progress. In 1997, our beautiful Island province was connected to the rest of Canada with the construction of the 12.9 km Confederation Bridge. Every time I drive across the bridge, I marvel at how anyone was able to build such a fantastic structure.

I was wondering about this aloud at work one day (as I often do) and a co-worker drew an apt analogy to our work with software.

Each individual engineer and construction worker did their part, building upon each other’s work and expertise. The person who drives the crane (there was a very cool crane involved) doesn’t have to understand the effects of cold on the strength of concrete. The person who paints the lines doesn’t have to understand the aerodynamics of the structure in high winds.

Progress in software works in a similar way. When I call on a simple PHP function to show the date on a website, I don’t need to understand how PHP is interpreted. I don’t need to understand the network protocols used to transmit the processed page to your computer. Basically, I don’t need to understand how computers work because someone else has done it for me.

All of this nonsense brought to mind the recently asked question: Is evolution finally over? Could it be that this growing external knowledge base (as opposed to knowledge stored in the memory of individual human beings) is the new evolution?

Today’s average civil engineer isn’t any smarter than a scribe of 3000 years ago. The difference is that the civil engineer has Newtonian physics and centuries of progress in chemistry at his disposal.

Futuretastic guru Ray Kurzweil has drawn this parallel between natural evolution and the progress of technology. He proposes that evolution and eventually technological progress fall on the same grand timeline. He goes so far as to suggest that this scale of progress, both natural and technological, has been accelerating according to an expanded version of Moore’s Law.

We are not standing on the shoulders of giants. Rather, we are standing on the shoulders of millions of average people.

 

An open letter to the web design community: Why should I redesign my site with Cascading Style Sheets?

Update – May 31, 2003: It’s been over a year since I wrote this and a lot has changed; this post about web standards better reflects my current view.

This article assumes a basic knowledge of web standards and languages, including HTML, XHTML, and most importantly Cascading Style Sheets. You don’t have to be an expert, but I’m assuming you at least know what they are (if you don’t, start here).

I’m in the early stage of planning a redesign of actsofvolition.com. Plans include a series of minor refinements and, most significantly, a move from ColdFusion to PHP for the programming. What I haven’t yet decided for the redesign is what formatting standards to use. Should I stay with the tried and true HTML tables, or should I take the leap into the emerging world of Cascading Style Sheets?

The purpose of this article is to seek feedback, suggestions, answers, and ultimately a better understanding of the state of web design practices in the face of emerging standards. To immediately deflate any would-be critics, let me clearly establish that I welcome critical feedback. I’m going to put forward complaints and grievances about designing with CSS and I want you to tell me why I’m wrong. Notice the italics: telling me that I’m wrong without telling me why won’t do any good.

Zeldman Rules, Netscape sucks
I do not criticize or question the work of Jeffrey Zeldman and The Web Standards Project in promoting the adoption and proper implementation of web standards. Over the past few years, I’ve grown to hate Netscape (version 4.x, that is) as much as any web developer. The non-standard nature of the Netscape 4.x rendering engine has cost my company and many others like it loads of time, sleep, and money. Like many web developers, I’ve had to become an expert in browser quirks. It is enormously important that someone is lobbying Macromedia, Microsoft, and Netscape to ship products that produce and interpret standard-compliant code. This is important work that I don’t have time to do – and neither does Jeffrey Zeldman, but he’s doing it anyhow.

Let it be understood that standards are inherently good. If every browser rendered HTML as defined in the W3C (World Wide Web Consortium) specifications, the world would be a better place. This cannot be overstated. Broken browsers are a significant drain on productivity for web developers. Everyone should close their <p>’s and nest tags properly. I’m not questioning web standards. Rather, I’m asking why I should use CSS to design my site. These two issues are often confused, to the detriment of the dialog surrounding both topics.

Writing standard-compliant code and designing with CSS are not the same thing. However, confusion between the two is understandable. CSS is the latest method of formatting web documents recommended by the W3C. This recommendation supplants the old HTML 4.x recommendations. Therefore, it is not unreasonable to suggest that in order to comply with the latest standards, one must use CSS. However, the latest standards are not the only standards. The HTML 4.x specifications have only recently gained real and complete support across a significant user base with the release of browsers like Microsoft Internet Explorer 5 & 6, and Netscape 6 / Mozilla. Web developers do have the option to design to the HTML 4.x specifications – these standards have not been deprecated.

Yes, I know, FONT tags suck
FONT tags are like my car – they suck, but they work, and they were respectable in their day.

HTML 4.x does have serious problems. The best example I’ve come across was during the development of an ecommerce site with FONT and TABLE tags. The design included a long list of product categories that was displayed on every page. The best possible formatting solution in this particular case involved each item being in its own cell in a table, each with its own FONT tag. In some sections of the site, there were as many as forty list items, meaning forty iterations of this tag <font face="verdana, sans-serif" size="-2"></font> in the code.

At the time of the development of the site I can remember thinking, there has to be a better way. Well, CSS is the better way, and it was just emerging at the time – the trouble is, CSS is still just emerging.
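For example (the selector name here is my own illustration), a single rule in a central stylesheet could replace all forty of those FONT tags:

```css
/* One rule replaces forty repeated FONT tags: every cell in the
   category list inherits its typeface and size from here. */
.categorylist td {
  font-family: verdana, sans-serif;
  font-size: x-small; /* roughly equivalent to size="-2" */
}
```

Change the rule once and every page on the site follows.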

Watch out for the small print
In the hands of a knowledgeable developer those cursed FONT tags could be very powerful. Like many web developers who cut their teeth developing for Netscape 3, Netscape 4, and eventually Internet Explorer 4, I know the quirks of each browser and can get them to render text exactly the same way. By defining relative and absolute font sizes (relative being “-1” and absolute being “2”) developers could control whether fonts were resizable by the user.

If you’ll pardon the pun, user-resizable fonts are huge. The benefits of being able to increase the font size to users with accessibility issues are obvious – but there is more to it than making the text easier to read. User-resizable fonts were part of the Zen of web design – don’t struggle for control of every pixel, don’t use images where you can use text, build flexible liquid layouts, and make fonts user-resizable.

CSS as currently supported in widely-used browsers (the highest courts, in my opinion) kills user-resizable fonts. While CSS dictates that size can be defined by several methods, including keywords (small, x-small, etc.) and percentages, it is generally understood that the only safe method to define font size in today’s browsers is to use pixels.

The W3C is getting out of the font-resizing game altogether. In CSS2 they basically leave resizing up to the user agent (the browser). Opera and Mozilla comply with this recommendation, allowing the user to resize all fonts, even those defined in pixels by CSS. However, Internet Explorer 4, 5, and 6 (together comprising a huge majority of users) lock the size of pixel-defined fonts, regardless of the user’s settings. Internet Explorer does allow the user to resize fonts defined by keyword and by percentage.

For many new sites, developers have given up on the idea of user-resizable fonts in favour of the benefits of defining fonts with CSS. I’m not content with this compromise – user-resizable fonts are simply too important. The best solution I’ve come across, and one I’ve used on several client sites, is messy but works. We’re using server-side browser sniffing to show some browsers percentage font sizes (thereby allowing IE 4, 5, & 6 – the majority of web users – to resize their fonts), while showing other (dumber) browsers pixel font sizes.
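A minimal sketch of that sniffing approach in PHP (the function name, stylesheet filenames, and the simple ‘MSIE’ check are my own illustration, not the actual client code):

```php
<?php
// Sketch: pick a font-sizing stylesheet based on the browser.
// IE 4/5/6 let users resize fonts defined in percentages but not
// in pixels, so IE gets percentages; other browsers get pixels.
function fontStylesheet($userAgent)
{
    if (strpos($userAgent, 'MSIE') !== false) {
        return 'fonts-percent.css'; // user-resizable in IE
    }
    return 'fonts-pixel.css'; // predictable rendering elsewhere
}

$ua = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
echo '<link rel="stylesheet" type="text/css" href="' . fontStylesheet($ua) . '">';
?>
```

Messy, like I said, but it gets resizable fonts to the people who would otherwise lose them.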

This solution stems from a rule of thumb I use when deciding on technologies for clients: It has to work perfectly for most people (preferably 95%+) and it has to work for everyone else.

Separating style from content – Isn’t that what databases are for?
The strongest case for the adoption of CSS is the ability to separate style from content. A series of documents formatted with CSS can all be styled centrally with one CSS file. Emphasis and structure can be coded with HTML, while the style and design of how that emphasis and structure are visually represented can be controlled separately. This makes sense.

Separation of style from semantic content is, in theory, necessary for content to be accessible via alternative means (screen readers, for example). Whenever someone says “in theory” and it turns out not to be true, the theory must be examined. The self-proclaimed accessibility curmudgeon Joe Clark has stated that “the accessibility defects of tables are overblown or outright false.” Screen readers have gotten smarter at handling table-based layouts.

In response to skepticism towards CSS by web writer Dave Winer, it has been argued that CSS will allow for better forward compatibility. Winer boasts a significant archive of web writing compiled over the last five years. What happens five years from now, when desktop web browsing may be a historical anomaly and Dave’s pile of HTML archives will be dated and potentially inaccessible? If Dave’s archives were a pile of HTML files, this would be a legitimate concern.

However, Dave knows better than to store precious data in text files – he’s using a database. Everyone’s using a database. I think this simple fact has been missing from the dialog surrounding the separation of style from content. If Dave has his archives in a database, he can pull the content into whatever display format is standard in 2007 – probably with 5 minutes of work.

I’ve been involved in ecommerce sites with thousands of products (each with their own ‘page’) that are based on a handful of simple HTML templates. If I want to redesign all product pages, I redesign the product template. If the site started with HTML 3, I can update the entire site to XHTML 1.0 in a matter of minutes. That is separation of design from content, in practice.
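As a sketch of that workflow (the function and field names are hypothetical, not from any real client site), one PHP template renders every product page from the database:

```php
<?php
// Hypothetical product template: redesign this one function and
// thousands of database-driven product pages are redesigned at once.
function renderProductPage($product)
{
    return '<h1>' . htmlspecialchars($product['name']) . '</h1>'
         . '<p>' . htmlspecialchars($product['description']) . '</p>'
         . '<p>Price: $' . number_format($product['price'], 2) . '</p>';
}

// In the real site, $product would come from a database query.
echo renderProductPage(array(
    'name' => 'Widget',
    'description' => 'A fine widget.',
    'price' => 9.5,
));
?>
```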

The power of databases and templates doesn’t necessarily negate the benefits of smart formatting standards, but the discussion of separating style from content is incomplete without discussing databases.

actsofvolition.com currently comes in three flavours: Classic (the normal HTML version), Handheld-Friendly (a simple text-only version intended for small screens), and RSS (an XML format for syndication of headlines). After the HTML version of the site was set up, the alternative versions took literally minutes to build – and I am not an expert programmer (I’m really not a programmer at all).

Separation of content and design is, in some ways, a myth. Content cannot be thoughtlessly and automatically repurposed for a different medium (for example, aov for handheld computers strips out images, but occasionally a post loses its meaning without the image – it doesn’t make any sense out of context).

The semantic web is going to happen on the server side, in the hands of the content creator, not the client side.

For example, when Wired News wanted to deliver tech news headlines to the handheld-equipped startup-happy tech masses, they had two options. First, they could have designed a website that used the latest standards and used medium-specific style sheets for all layout and formatting. Then, they could just have a simple ‘handheld device’ style sheet that would make the site readable on a handheld. To be fair, the standards still aren’t implemented well enough to pull this off today, let alone three years ago, but it’s the ideal goal, so bear with me. The second option was to pull content out of their database, have good editors re-work it for the ridiculous reading-style of a tiny handheld screen, and create a completely separate handheld news site.

In a move that went on to make them one of the most popular handheld news services on the web, Wired went beyond building a separate handheld-friendly layout. The editors at Wired actually created handheld-friendly writing. References to external links, images, and the general length of stories, headlines, and story-teasers were all edited with the handheld reader in mind. The success of the services that Wired built on these principles is a positive example of how content needs more than just new templates to make sense in a new medium.

So, do I use CSS or not?
I reserve the final decision on whether or not to use CSS in my pending redesign until I’ve had a chance to absorb the feedback I get from this article. However, I am leaning towards a solution that is intended to back up the points made here about the importance of the database in the separation of style from content.

I’m going to do both. I will build a solid XHTML/Tables-based layout that will work on the popular browsers, and I will build a CSS-only/table-free layout. Why do both? It’s certainly not to entertain readers. I’ve never been one for user-customizable layouts. Rather, I’m planning to do both to show how easy it is to customize presentation with a database.

Zeldman points out that on the web, “tutorials are misread as manifestos, and wee improvements are heralded as breakthroughs”. Absolutely true. This article is neither a tutorial nor a manifesto. Rather, it is the thoughts of a web developer who spends his days building web systems, wondering out loud as to the best coding practices. As I’ve stated at the beginning of this article, I welcome feedback, but don’t bite my head off – explain your position in relation to the points I’ve made here.


 

rumblings of a version 2.0

The low-level brain rumblings of an Acts of Volition re-development are underway. I call it a re-development because it’s not really going to be a re-design – I’m pretty happy with the way aov looks right now. However, it’s going to be completely overhauled from the back-end (snicker).

Here’s your chance to be ignored: what features would you like to see on Acts of Volition?

To get you on the right track, here are a few vague ideas I have in mind:

  • improving the ‘recent conversations’ to highlight new items since your last visit
  • maybe a spell checker on the replies form (if I can’t find a simple one that works well and doesn’t get in the way)
  • improved searching including searchable reader replies
  • general smoothing – simplified permanent links, simplified URLs, etc.
  • more grandmother sex (hey, according to our search engine referrals, that’s what the people want!)

And for any who may be interested in the technical side of aov:

  • I’m considering a CSS (Cascading Style Sheets) layout – I don’t think the browsers support CSS well enough for complete CSS sites for my clients, but aov is a good place to experiment with it (Netscape 4 users – it’s only gonna get worse).
  • One thing is for sure – I’m going to be moving from ColdFusion to PHP as my development platform. The move follows a switch made in early 2001 by our development team at silverorange. I’ve got to keep up with them in order to take advantage of them.

Your feedback will be read mostly for our own entertainment purposes.

 

it’s a designer eat designer world

AIGA, the American Institute of Graphic Arts, has a site on User Experience called Gain. Ironically, it takes quite a bit of patience to navigate. If you do manage to find your way around, you might come across an interesting graphic illustrating the distilling of the many pioneering web firms into the few struggling behemoths we have today.

Because their site is based in Flash and pop-up windows, I can’t link to it. How’s that for #$!@ing user experience? They do have a PDF version of the chart, but it obscures the interesting data in its own totally X-treme way (note to graphic designers, most of you suck).

Since I can’t link to it and even if I could, you’d be annoyed by the flash interface, I’ll sum up the most interesting info with words (they can be so handy sometimes).

Razorfish + Avalance + CHBi + Spray Network + Sunbather + I-Cube + TSDesign + Lee Hunt Assoc. + Medialab AG = Razorfish

Studio Archetype + Sapient + Adjacency + E-Lab = Sapient

Rare Medium + i/o 360 + Circumstance = Rare Medium

Oddly, Sapient also designed the website for AIGA. Hmmm… They should read some Edward Tufte.

 

aov complies with the w3c

Warning: what follows may be considered boring by some readers – especially non-geeks (if there are any).

Valid HTML 4.01!
Valid CSS!

Ladies and gentlemen, actsofvolition.com is now compliant with the World Wide Web Consortium’s (W3C) HTML 4.01 Transitional recommendation*.

The good people at The Web Standards Project can explain why adhering to standards matters.

Those who work in web development will know that adhering to these standards is not always easy since some popular browsers don’t interpret them correctly. Netscape 4.x, for example, will have problems rendering a perfectly standards compliant page. Web developers must choose between breaking the standards or having glitches for Netscape 4.x users.

New browsers are much better at interpreting the standards. Internet Explorer 4, 5, 5.5, and 6 all do a decent job and have gotten better with each new version (although there are still problems). The new Mozilla browser (aka Netscape 6.x) does an excellent job, as do most other alternative browsers such as Opera, iCab, and Konqueror.

The trouble is that loads of people still use Netscape 4.x. While global stats put its usage at below 10% of web users, this site and others I help produce have up to 30% of users running Netscape 4.x.

Here on aov, I can decide (and have decided) to screw Netscape 4.x users in favour of the standards. There are better browsers that are easy to get. However, as a professional web developer (which I am, apparently) I cannot make that decision on behalf of my clients. Company XYZ is interested in selling their widget, not ensuring web accessibility and cross-platform interoperability. Ignoring users is not good business. That said, it is possible to make sites that adhere to the standards (for the most part) and still work relatively well in NS4.x.

Exciting, isn’t it?


* If you actually run our site through the W3C’s Validator, you’ll notice that it doesn’t actually validate, but that’s only because it’s a little confused about some of our URLs and some non-standard characters, which I’m working on.