J. R. Boynton


Notes on taking keywords seriously in html content.

The keywords metadata tag was one of the original metadata fields for html.

<meta name="keywords" content="various keywords, still more keywords" />

Internal versus external use

The original idea was that keywords would accurately portray the content of the page, and search engines would be able to use them to rank results. Alas, the pornographers and scamsters of the world quickly abused badly engineered search engines by stuffing the keywords field with words that had nothing to do with the content.

Google came along and essentially ignored keywords when ranking results. Instead, Google uses the number of links to a page, and the text describing those links, as the main determinant of rank.

But that's because Google is a jungle beast – working in the untrustworthy wilds.

If you have your own search engine under your control, you can and should use keywords to influence the ranking of search results.

How to use keywords

First, keywords should include the structure of the site. That is, if you have some Yahoo-like drill-down for your site, the keywords for any page should indicate its location in the structure.

Second, you should really have a knowledge categorization scheme for all the information related to your site. "Information architects" have a fancy word for this, that I usually forget: "facets".

Any given html page might fit into several places in different categorizations. For example, a document might be categorized as reference - technical - "how to", as well as software - spreadsheets - Excel. The entire hierarchy for a page should be listed for every categorization scheme.
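As a rough sketch of what "list the entire hierarchy for every categorization scheme" could look like in practice, the snippet below flattens several category paths into one keyword list. The function name `keywords_from_facets` and the list-of-lists input are my own assumptions for illustration, not part of any particular tool.

```python
def keywords_from_facets(facet_paths):
    """Flatten every level of every categorization path into one keyword list.

    facet_paths is a hypothetical structure: one list per categorization
    scheme, ordered from the top of the hierarchy down to the page's category.
    Duplicates (compared case-insensitively) are kept only once.
    """
    seen = set()
    keywords = []
    for path in facet_paths:
        for term in path:
            key = term.lower()
            if key not in seen:
                seen.add(key)
                keywords.append(term)
    return keywords


paths = [
    ["reference", "technical", "how to"],
    ["software", "spreadsheets", "Excel"],
]
print(", ".join(keywords_from_facets(paths)))
# prints: reference, technical, how to, software, spreadsheets, Excel
```

The resulting string is what would go into the content attribute of the keywords tag.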

Obviously, the content of the page should be described in the keywords.

A further opportunity is to add keywords that potential readers may recognize in addition to or instead of the terminology used in the document. One of the problems in finding information is that the user may not have the same vocabulary as the authors and editors. Keywords can help to make up for such differences. For example, if you watch the queries people make to your search engine, you can add common search terms to the keywords of the pages that should rank higher in the results.
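One way to watch the queries, sketched under some assumptions: suppose an editor has collected the queries that ought to lead to a given page. Counting the words in those queries that the page's keywords do not already cover gives a short list of candidates. The name `suggest_terms` and the plain-list query log are hypothetical; a real log would need its own parsing.

```python
from collections import Counter


def suggest_terms(queries, page_keywords, top_n=3):
    """Return the most frequent query words not already among the page's keywords."""
    known = {k.lower() for k in page_keywords}
    counts = Counter(
        word
        for query in queries
        for word in query.lower().split()
        if word not in known
    )
    return [word for word, _ in counts.most_common(top_n)]


# Queries an editor has judged should lead to this page (made-up data).
queries = ["vlookup error", "vlookup not working", "lookup formula error"]
page_keywords = ["Excel", "spreadsheets", "formula"]
print(suggest_terms(queries, page_keywords, top_n=2))
```

The output is only a proposal. A person still decides which of the suggested terms actually belong on the page.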

Finally, I suggest using "anti-keywords" – a list of words that appear on a page but should not contribute to that page's ranking. For example, a "switch" is a networking device, but "switch" is also a relatively common verb. If your website is about networking devices, and an article uses the word "switch" in a way that isn't related to a device or to network switching, make "switch" an anti-keyword, so you don't waste your customers' time by sending them to an article that isn't about switches at all.
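To make the idea concrete, here is a minimal scoring sketch for a search engine you control. It assumes a very simple ranking: each occurrence of a query term in the body counts once, a match in the keywords adds a bonus, and an anti-keyword contributes nothing. The function `score` and the weight of 5 are illustrative assumptions, not a description of any real engine.

```python
def score(query_terms, body_words, keywords, anti_keywords, keyword_weight=5):
    """Score one page for a query.

    Body hits count once each, a keyword hit adds keyword_weight, and a term
    listed as an anti-keyword is skipped even if it appears in the body.
    """
    keywords = {k.lower() for k in keywords}
    anti = {a.lower() for a in anti_keywords}
    body = [w.lower() for w in body_words]
    total = 0
    for term in (t.lower() for t in query_terms):
        if term in anti:
            continue  # this page asked not to be ranked for this word
        total += body.count(term)
        if term in keywords:
            total += keyword_weight
    return total


device_page = "the switch forwards frames between ports".split()
billing_page = "switch to the new billing plan".split()

print(score(["switch"], device_page, ["switch", "networking"], []))  # prints 6
print(score(["switch"], billing_page, ["billing"], ["switch"]))      # prints 0
```

Without the anti-keyword, the billing page would score 1 for "switch" and show up in the results.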

Maintaining keywords

If you take this approach, you will want to maintain keywords explicitly in a database, and automate the process of associating keywords with content.

For example, as soon as the location of the article in the site is identified, there should be no further effort required for the related keywords to be associated with the article. And if you change a structure-related keyword, you will want to be sure it is automatically changed for all articles that used it.
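A small sketch of why a database helps here, using SQLite from the Python standard library. If articles refer to keywords by id rather than storing their own copies of the text, renaming a structure keyword in one place changes it for every article. The table and column names are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE keyword (id INTEGER PRIMARY KEY, term TEXT UNIQUE NOT NULL);
    CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE article_keyword (
        article_id INTEGER NOT NULL REFERENCES article(id),
        keyword_id INTEGER NOT NULL REFERENCES keyword(id),
        PRIMARY KEY (article_id, keyword_id)
    );
""")
conn.execute("INSERT INTO keyword (id, term) VALUES (1, 'spreadsheets')")
conn.executemany(
    "INSERT INTO article (id, title) VALUES (?, ?)",
    [(1, "Pivot tables"), (2, "Chart basics")],
)
# Both articles sit under the same structure keyword.
conn.executemany("INSERT INTO article_keyword VALUES (?, ?)", [(1, 1), (2, 1)])


def keywords_for(article_id):
    """Return the current keyword terms for one article, sorted alphabetically."""
    rows = conn.execute(
        "SELECT k.term FROM keyword k "
        "JOIN article_keyword ak ON ak.keyword_id = k.id "
        "WHERE ak.article_id = ? ORDER BY k.term",
        (article_id,),
    )
    return [term for (term,) in rows]


# Renaming the structure keyword once changes it for every article that uses it.
conn.execute("UPDATE keyword SET term = 'spreadsheet software' WHERE id = 1")
print(keywords_for(1), keywords_for(2))
# prints: ['spreadsheet software'] ['spreadsheet software']
```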

When an article is first submitted, software should scan it for common terms and propose a list of words to be included as keywords (or anti-keywords).
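The scan does not need to be clever to be useful. The sketch below simply counts words, drops a small stop-word list, and proposes anything that appears at least twice. The stop-word list, the thresholds, and the name `propose_keywords` are all assumptions for illustration, and an editor would still accept or reject each proposal.

```python
import re
from collections import Counter

# A deliberately tiny stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is",
              "it", "for", "on", "that", "with"}


def propose_keywords(text, limit=5, min_count=2):
    """Propose up to `limit` frequent words that occur at least `min_count` times."""
    words = re.findall(r"[a-z][a-z'-]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return [w for w, n in counts.most_common(limit) if n >= min_count]


article = (
    "The switch forwards frames. A managed switch lets you configure "
    "every port. Any port on the switch can join a VLAN."
)
print(propose_keywords(article))
# prints: ['switch', 'port']
```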

You will also want to make sure that keywords are not inadvertently lost or mangled when editing content. The right way to handle this is to separate metadata from content in the creation/maintenance phase, and use the appropriate technology for maintaining each part. A WYSIWYG editor is appropriate for content, but a form-based database interface is appropriate for keywords.
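If the keywords live apart from the content, the tag can be generated when the page is published, so nobody ever edits it by hand. A minimal sketch, assuming the keywords arrive as a list of strings from wherever they are stored; `keywords_meta_tag` is a hypothetical helper name.

```python
from html import escape


def keywords_meta_tag(keywords):
    """Build the keywords meta tag from stored keywords, escaping the attribute value."""
    content = ", ".join(k.strip() for k in keywords if k.strip())
    return '<meta name="keywords" content="{}" />'.format(escape(content, quote=True))


print(keywords_meta_tag(["spreadsheets", "Excel", 'how to "pivot"']))
# prints: <meta name="keywords" content="spreadsheets, Excel, how to &quot;pivot&quot;" />
```

Escaping matters because a stray quotation mark in a keyword would otherwise end the attribute early, which is exactly the kind of mangling this separation is meant to prevent.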

Copyright © 1998-2011 J. R. Boynton