Do we need a CMS?
Content Management System [CMS]
- As is common with TLAs, the acronym CMS is ambiguous.
- Of or referring to a "mid-tier web content management system".
- Any of the baffling array of products and services purporting to provide this capability, their diversity belying the lack of a coherent definition for the phrase.
- As a Wikipedia contributor explains, regarding Content Management Systems:
Readers should be cautioned, however, that terminology here can be very confusing and overlapping.
What is a CMS and why does the W&M website need one?
As Susan Evans (chair of re.web) discussed in her recent podcast, we've been looking at products, talking to vendors and staring at spreadsheets in windowless conference rooms. Why? We're trying to get our heads around what CMS products are available, what features they offer, and which of those features provide us the greatest impact for a dynamic and sustainable web environment. With the help of our friends at mStoner, we've been making progress.
In an attempt to broaden the discussion, and to establish a consistent vocabulary within the College community, let's discuss some of the features we're finding in these products, and how they might make things "better" within the W&M web.
In this discussion I will quote liberally from Wikipedia, since it is a reference work with which many of us are comfortable. But even the editors of Wikipedia are having a hard time with the term CMS, having rewritten the entry several times in the last few months. The link provided above, for instance, points to the entry as it existed in late August 2007, because much of the text I found useful to quote does not survive in later versions of that page.
What can the CMS do for me?
Having not yet decided upon a CMS product to run the new W&M web, what can we say about the capabilities of this unknown CMS and how it will benefit us?
At this time, we have not ruled out any particular back-end technologies, nor have we made a commitment to either commercial or open source software. But our research to date indicates that there are a relatively small number of mid-tier CMS products, with a relatively common set of core features with which the re.web and mStoner project teams are comfortable.
Without leaning toward any specific product, let's see if we can unwrap some of the features that we have identified as being requisite, or as having the greatest potential impact on a dynamic and sustainable web presence.
Let's discuss some of the specific features (or categories of features) offered by most of the CMS products we are vetting, with the goal of explaining what each is, and how it is intended to support a dynamic, sustainable web presence.
The list so far:
- Processing Engine and Data Repository
- Users and Permissions
- WYSIWYG Editing
- Templates and Themes
- Link Management
- Content Reuse
- Content Scheduling
- Compliance with Web Standards and Accessibility
- Handling Media
- Information Architecture
As you look over these features and their potential impact, if you think of something not touched on here, let us know. Our checksheet for CMS evaluations is much more granular and technical than the high-level explanations here—so maybe we've got it covered; but it's also possible that your perspective will bring out a question we have not yet asked.
The re.web blog is a good place to direct comments in response to a specific topic (such as the CMS).
Another approach to CMS building uses databases such as PostgreSQL, MySQL or MS SQL, and scripting languages or tools such as ColdFusion, PHP, JSP or ASP to interact with the data to parse them into visual content. Data stored in a database are queried and compiled into html pages or other documents and transformed using cascading style sheets [Wikipedia].
This Wikipedia note points to certain properties of a CMS which we will call the "Processing Engine" and the "Data Repository". Before explaining further, let me point out that the above list of technologies is not only incomplete, but fails to capture some of the platforms most represented among mid-tier CMS products.
Specifically, several of the systems in this tier will integrate with an Oracle database, while a few others use a more XML-centric repository. Likewise, there are additional languages represented in mid-tier CMS products, such as Java and .Net. Finally, some of these CMS products provide transformation of content not only with CSS, but with XSLT or other technologies; and some go further to allow transformation of content into additional formats, from XHTML and RSS, to PDF, Flash, or mobile-friendly formats (WAP/WML), although this is an area where there is quite a bit of variance between competing products.
In terms of potential impact, the Processing Engine and Data Repository are the most difficult of the features to encapsulate or evaluate. In fact, they are not so much features of the CMS as the back-end, heavy-lifting dynamo of the CMS. A well constructed and efficient data repository in which to house the content, and a powerful and dynamic processing engine with which to manage, render and transform the content are more accurately the prerequisites to all of the other features we're discussing.
We might also note that these two elements anchor both the target audience's experience with our public site and the management interface through which content owners create and maintain the web site. As we discuss the other features, notice that each requires a strong presence in both the management and presentation aspects of the CMS for maximum impact.
A Content Management System (CMS) is a software system...deployed primarily for interactive use by a potentially large number of contributors [Wikipedia].
Identification of all key users and their content management roles...[and]...[t]he ability to assign roles and responsibilities to different content categories or types [Wikipedia].
The W&M web presence consists of a very large number of component "sites" that ideally work together to form the whole. These sub-sites might include academic departments, administrative offices, student services, or official news and announcements. The days when a single webmaster could maintain an entire site are gone (if there ever really was such a time). Nor is it practical to assign all members of the community equal access to all of the content on the web server.
Currently, over 200 individuals have permissions to create and maintain content on the W&M website. Any new CMS solution must be able to handle a complex arrangement of users with overlapping permissions in disparate portions of the website.
If the visitor to your site can determine your organizational structure based on your navigational structure, something is seriously wrong.
Truth is, this is exactly how our current site is constructed, partly for historical reasons, but mainly due to the simplicity of creating a folder called "admission" and assigning all the folks under the Dean of Admission permissions to that folder.
The fact that some of their content could be of use elsewhere, or that some content generated elsewhere on the College org chart might be useful to prospective students viewing the admission page, is undermined by the difficulty (within our current system) of assigning permissions granularly to individual elements of content. Nor are we readily able to share and reuse content across these departmental boundaries—but that is another feature we shall discuss.
For now, let us say that central to disassociating the navigational structure of the website from the organizational structure of the institution is the ability to make complex assignments of user and group permissions to content in various sections of the website whenever necessary.
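To make the idea of granular permissions a little more concrete, here is a minimal sketch in Python. It is only an illustration, not a description of how any CMS on our list works; the group names, user names, section paths and the `can_edit` function are all hypothetical.

```python
# Hypothetical sketch: permissions attached to content sections rather than
# to org-chart folders. All names and paths below are invented.

GROUPS = {
    "admission-staff": {"alice", "bob"},
    "biology-web": {"carol"},
}

# Each content section lists the groups allowed to edit it.
# More than one group can be assigned to a section, so permissions may overlap.
SECTION_EDITORS = {
    "/prospective-students": {"admission-staff"},
    "/academics/biology": {"biology-web"},
    "/academics/biology/student-profiles": {"biology-web", "admission-staff"},
}


def can_edit(user: str, path: str) -> bool:
    """Return True if the user belongs to a group assigned to the path,
    or to the nearest ancestor section that has an assignment."""
    while path:
        groups = SECTION_EDITORS.get(path)
        if groups is not None:
            return any(user in GROUPS[g] for g in groups)
        # Walk up one level: "/a/b/c" -> "/a/b"
        path = path.rsplit("/", 1)[0]
    return False
```

In this sketch the Admission staff can edit the student profiles that live under Biology without gaining access to the rest of the Biology section, which is the kind of assignment our current folder-based setup makes difficult.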
WYSIWYG is an acronym for What You See Is What You Get, used in computing to describe a system in which content during editing appears very similar to the final product. It is commonly used for word processors, but has other applications, such as web (HTML) authoring. The phrase was originally popularized by comedian Flip Wilson, whose character "Geraldine" would often say this to excuse her quirky behavior [Wikipedia].
Can I reminisce for a moment? In 1984, (the company formerly known as) Apple Computer released the Macintosh. Those of us who remember word processors prior to this event were using DOS and Apple II computers with WordPerfect, MS Word, AppleWriter and AppleWorks. For those unfamiliar with this era, or trying to suppress the memory, word processing was done by typing text, and adding commands (bold, underline, etc.) around the text. Then came the mouse, the graphics-enabled operating systems (Mac vs. Windows) and the next generation of word processors, where highlighting some text and clicking the bold button actually made the text appear bold on the screen! Not only that, but adding an image to a document caused the image to be displayed on the screen just as it appeared when printed. (Laugh if you must, but before this we had text markers that indicated an image would be present, as well as text markers indicating where each page would break.)
Fast forward to 1994. The web browser reinvented the internet. Mosaic, Netscape and Internet Explorer utilized the latest in graphics-enabled operating systems to show content from across the world, complete with images on the same page as the text! Again, younger readers may laugh, but those of us using the internet, gophers and CompuServe knew about getting to information from all over the world. Even drill-down links (or whatever we called them in those systems) allowed navigation of the information in a logical, if linear, fashion. But it was all (plain) text, like AppleWriter had been on my old Apple IIe. So in came HTML, the World Wide Web and the web browser. Now, by creating a text document in Notepad, and adding some commands around the text, I could make a webpage visible to the world, complete with text embellishments and images. Does this process sound familiar? Writing HTML was nothing more than swapping out the old AppleWriter tags for HTML tags within my brain. Simple.
Over time, the WYSIWYG editor for HTML arrived. Netscape Composer, HomeSite, GoLive, FrontPage and Dreamweaver are examples of software applications that brought the modern word processing interface to the web. I suspect most of our readers have heard of, and maybe use, one or more of these products. For the last 10 years or so, this has been the dominant means of content creation on the web.
Over the last few years, another shift has been taking place on the web. This shift, often denoted as Web 2.0, uses the web itself as the tool for creating content on the web. Go to Wikipedia, Blogger.com, del.icio.us, blogmarks or netvibes and build content for the web using the web. Early Wiki and Blog tools required learning a special syntax of commands (HTML, Wiki markup, Markdown) to wrap around your text (déjà vu?) to get the appropriate formatting. In fact, Wikipedia still requires this type of formatting. More and more, however, we are seeing the same move from text marked with commands to more of a word processor style WYSIWYG interface for creating content for the web using the web.
But here's the point that we want to make. With re.web, WYSIWYG has come to the W&M web! One of the requirements on our CMS features list is a strong cross-platform WYSIWYG editor. How does this interface look? It depends on the CMS. Some CMS products have a more complex set of formatting options, while others allow the CMS administrators to customize the tools to fit the site. In all cases, content contributors should find the interface similar to modern word processors—which means a low barrier to entry, and minimal (re-)training.
Some content management systems allow the textual aspect of content to be separated to some extent from formatting. For example the CMS may automatically set default color, fonts, or layout [Wikipedia].
Using a template to layout elements usually involves less graphic design skill than that which was required to design the template. Templates are used for minimal modification of background elements and frequent modification (or swapping) of foreground content [Wikipedia].
The current idiom for web development is the separation of presentation and content. The use of templates allows those most knowledgeable about the content to manage it, without becoming an expert graphic artist or web designer. Likewise, the graphic artist and web designer can make changes to the layout and design independent of the specific content on each page.
The mid-tier CMS products we are considering all implement this paradigm. Recall from earlier feature discussions that CMS content is stored in a data repository that the processing engine transforms and renders into webpages. The Template is the blueprint for merging that content into a consistent design, or look and feel.
This has direct implications on WYSIWYG Editing. Professional quality templates (such as the ones mStoner is working on for us) include not only the layout of the textual and graphical elements of the page, but also a stylesheet that defines how those various textual and graphical elements should look. Therefore the content creator needs a relatively small number of features in the WYSIWYG Editor to set off certain text, for instance, as a blockquote (the quotes from Wikipedia above are blockquotes). The template designer is responsible for deciding how the blockquote behaves. And the application of the template to many content pages causes blockquotes to render consistently throughout the site.
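For the curious, the separation of content from presentation can be sketched in a few lines of Python. This is a toy illustration under our own assumptions, not the templating system of any product we are vetting; the template markup, the `site.css` filename and the `render` function are hypothetical, and the body is assumed to be trusted HTML from the editor.

```python
from string import Template

# Hypothetical page template: the designer owns this markup and the
# stylesheet it references. Content owners never touch it.
PAGE_TEMPLATE = Template(
    "<html><head><title>$title</title>"
    '<link rel="stylesheet" href="site.css"></head>'
    "<body><h1>$title</h1>$body</body></html>"
)


def render(title: str, body_html: str) -> str:
    """Merge one content item into the shared template."""
    return PAGE_TEMPLATE.substitute(title=title, body=body_html)


# The content owner only marks text as a blockquote; how a blockquote
# looks is decided once, in the stylesheet, for every page.
page = render("Biology", "<blockquote>Quoted text</blockquote>")
```

Changing the template or the stylesheet would change every rendered page at once, without anyone editing the content itself.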
How many templates will we have? This is a trick question. The terminology from one system to the next is inconsistent. Some systems call each look a "template," and each layout (text with an image, text with two images, etc.) a "sub-template" or "stationery." Others call the look a "style" or "theme," and each layout a "template."
Here is the best answer we have to this question. The College will have a new look that will be consistent, although not identical, throughout the administrative and undergraduate portions of the College's main web server (www.wm.edu). This look will consist of a unique top-level page, and some variations on that theme for second-level and deeper content pages. A few high-traffic public pages (like the Admission top page) may get a special variation on this look. And there may be elements, such as images or other media, that can be applied (turned on or off) within these looks. The exact count and how they are implemented depends on the actual CMS product we select.
Additionally, the re.web project has invited the College's graduate and professional schools to participate in our new CMS. As readers of the re.web blog will remember, the Law School has already had both the intake and strategy report meetings with mStoner. Once the College's main site design has been presented and approved, another top-level page will be designed with a familial look to the main site, but emphasizing the distinct character of the Law School and its constituency. Again, within the Law site there will be secondary pages, standard content pages and the like.
While we cannot provide an exact numerical answer to the question of templates, hopefully this explanation helps to set the stage for what we expect.
This page (the one you are currently reading) contains many links—links to content within this page, links to content within this site and links to content on other sites. The web is a dynamic system, with content being created, modified and removed at a mind-boggling rate. Over time, some of the links included here may become stale or dead—nice ways to say you won't see what we intended when you click on the link, and will possibly get a nasty error.
What's to be done? Shall we check every page on our server every day, clicking on every link just to be sure that the links continue to work? Well, that's one way to keep things working. But maybe there are better methods.
Most of the mid-tier CMS products under consideration have some form of built-in Link Management. This generally takes one of two forms. The first, which most of the CMS products on our list can handle, is internal Link Management. This is when one page within the CMS provides a link to another page within the system. The exact means of linking varies with the CMS, but the basic notion is that the CMS knows that the link exists, and to what content it connects. When a content item is renamed, moved or deleted, the system will find all of the links affected and correct (or remove) them.
Some of our departmental webmasters may recognize this form of Link Management from their use of Dreamweaver. Define a "Site" within Dreamweaver, rename a page from within the Dreamweaver file manager, and Dreamweaver will check inside every file in the site to correct links to that file. The time required grows with content, becoming painfully slow on large sites. In a CMS, the Data Repository houses all of the content and included links, and may have these links pre-indexed for quick access. Where Dreamweaver needs to open every file and search each link to see if it is affected by each change, the CMS can perform the equivalent action across many more pages much more quickly.
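Why would a pre-built index be quicker than opening every file? Here is a rough sketch in Python of the general idea, assuming a simple in-memory index. The `LinkIndex` class and the page paths are hypothetical, and a real CMS repository will certainly differ in its details.

```python
from collections import defaultdict


class LinkIndex:
    """Tracks which pages link to which targets, so a rename touches
    only the affected pages instead of scanning every file."""

    def __init__(self):
        self.links_from = defaultdict(set)  # page -> targets it links to
        self.links_to = defaultdict(set)    # target -> pages linking to it

    def add_link(self, page: str, target: str) -> None:
        self.links_from[page].add(target)
        self.links_to[target].add(page)

    def rename(self, old: str, new: str) -> set:
        """Repoint every link to `old` at `new`; return the updated pages."""
        affected = self.links_to.pop(old, set())
        for page in affected:
            self.links_from[page].discard(old)
            self.links_from[page].add(new)
        self.links_to[new] |= affected
        # Carry over the renamed page's own outgoing links as well.
        if old in self.links_from:
            outgoing = self.links_from.pop(old)
            self.links_from[new] |= outgoing
            for target in outgoing:
                self.links_to[target].discard(old)
                self.links_to[target].add(new)
        return affected
```

Because the index already knows which pages point at the renamed item, the work grows with the number of affected links rather than with the size of the whole site.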
The second form of link management involves an automation of our facetious process of visiting every link on every page every day. Not all CMS products provide this, although there are third-party tools that can be used in conjunction with any CMS. The notion here is that a script may run every day (or night, or other prescribed interval) and check each external link to see if attempting to download the page results in an error instead of an actual page. This can't guarantee that the particular information quoted hasn't been updated (all too often the case on Wikipedia, a blog, or the top page of a news site). But at least dead links can be detected. What happens after the detection varies, but generally involves a report being generated, and a real person making the needed changes. Since the content is not managed internally, the CMS itself generally doesn't attempt to decide on appropriate changes to the link.
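A nightly external link check along these lines might look something like the following Python sketch. It is an assumption-laden illustration, not a tool we have selected: `find_dead_links` and `LinkExtractor` are hypothetical names, and the `get_status` argument stands in for a real HTTP request so that the example runs without network access.

```python
from html.parser import HTMLParser
from typing import Callable, Dict, List


class LinkExtractor(HTMLParser):
    """Collects absolute http(s) links found in anchor tags."""

    def __init__(self):
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href and href.startswith(("http://", "https://")):
                self.links.append(href)


def find_dead_links(pages: Dict[str, str],
                    get_status: Callable[[str], int]) -> List[tuple]:
    """Return (page, url, status) for each external link whose HTTP
    status is 400 or above. `get_status` is injected by the caller."""
    report = []
    for page, html in pages.items():
        parser = LinkExtractor()
        parser.feed(html)
        for url in parser.links:
            status = get_status(url)
            if status >= 400:
                report.append((page, url, status))
    return report
```

The function only builds a report. As described above, deciding what to do about each dead link is left to a real person.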
Whatever CMS the College implements, the re.web team is working with mStoner to provide some means for Link Management as a part of our long-term smart, sustainable solution.
The ability to track and manage multiple versions of a single instance of content [Wikipedia].
The reference to Wikipedia provided here is from a historical version of the "Content management system" article within Wikipedia. The nature of Wikipedia, and the web as a whole, is change. Change happens over time. One of the distinguishing features of a "mid-tier" web CMS over some of the smaller systems is the capability for storing and retrieving previous versions of a page.
In our current web environment, I have had occasion to work with users to restore content that has been inadvertently deleted, or altered to the extent that restoring it manually would be time consuming. The file system on which our web files exist provides limited recovery options. But, as often as not, the version of the file that needs to be restored is only available on our tape backup. Without covering the specific steps involved in the process, it is accurate to say that several individuals within IT must be involved in the process of restoring a single file.
With the College's new CMS-based web environment, this will not be the case. We must warn once more, as we have with previous features, that the exact implementation of versioning varies from one CMS to another. But we can say that the new W&M website will incorporate the ability for departmental site "owners" to retrieve and restore content that may have been modified or deleted over time.
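As a rough picture of what versioning means in practice, consider this small Python sketch. It assumes the simplest possible design, a list of saved revisions per page; the `VersionedPage` class is hypothetical and real products store and present history in their own ways.

```python
class VersionedPage:
    """Keeps every saved revision so earlier content can be restored."""

    def __init__(self, content: str):
        self.history = [content]

    @property
    def current(self) -> str:
        return self.history[-1]

    def save(self, content: str) -> int:
        """Store a new revision and return its version number (0-based)."""
        self.history.append(content)
        return len(self.history) - 1

    def restore(self, version: int) -> int:
        """Restore an earlier revision by saving a copy of it as the
        newest one, so the restore itself is recorded in the history."""
        return self.save(self.history[version])
```

With something like this, a site owner who accidentally blanks a page could bring back an earlier revision without anyone reaching for the tape backup.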
[Workflow] is the idea of moving an electronic document along for either approval, or for adding content. Some Content Management Systems will easily facilitate this process with email notification, and automated routing. This is ideally a collaborative creation of documents [Wikipedia].
An approval-centered workflow tracks the approval chain of command on a piece of content, no matter what the form...[CMSWatch Glossary].
A content management system may support...content workflow tasks, often coupled with event messaging so that content managers are alerted to changes in content [Wikipedia].
[Workflow is] [t]he automation of a business process, in whole or part, during which documents, information or tasks are passed from one participant to another for action, according to a set of procedural rules [e-workflow.org].
When it comes to the web, content is king. The data we're seeing, particularly with regard to current and prospective students, shows that our website is the place where key constituents expect to find the most engaging, current and timely information about the College. Whatever we do to empower content owners to create and update their pages will add value to the site.
One feature that a web content management system brings to this effort is workflow. The definitions above describe what workflow is, and how it might be implemented within a CMS. But how does this add value to our web presence? Doesn't this just add another layer of bureaucracy to the process, slowing down the creation of content?
First of all, we've been careful to ask each of the CMS vendors with whom we've talked whether workflows can be different from one part of our site to another, and even turned off altogether where not needed. Content creators who are responsible for a portion of the website can approve their own updates, as well as those submitted by other members of the department or even student assistants. Our current web server doesn't have such capabilities—either a person has permissions to create and edit content, or he or she sends update info to a person who does.
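One way to picture a workflow that differs from section to section is the Python sketch below. It is a simplified model built on our own assumptions; the section paths, the `REQUIRES_APPROVAL` table and the `ContentItem` class are hypothetical and do not reflect any vendor's implementation.

```python
# Hypothetical per-section workflow rules; section names are invented.
REQUIRES_APPROVAL = {
    "/news": True,
    "/academics/biology": False,  # workflow turned off for this section
}


class ContentItem:
    def __init__(self, section: str, text: str):
        self.section = section
        self.text = text
        self.state = "draft"

    def submit(self) -> str:
        """Publish directly where approval is off; otherwise wait for
        an approver. Sections with no rule default to requiring approval."""
        if self.state != "draft":
            raise ValueError(f"cannot submit from state {self.state!r}")
        needs_review = REQUIRES_APPROVAL.get(self.section, True)
        self.state = "pending" if needs_review else "published"
        return self.state

    def approve(self) -> str:
        """Move a pending item to published."""
        if self.state != "pending":
            raise ValueError(f"cannot approve from state {self.state!r}")
        self.state = "published"
        return self.state
```

A fuller model would also check who is allowed to approve, which is where workflow meets the users and permissions discussed earlier.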
As a consequence of workflow and permissions, those members of our community closest to (and most knowledgeable about) a story, topic, or event have the opportunity to create the content. Others, tasked with authority over specific areas within our website, may choose to feature this content in the most appropriate venue.
'Content reuse' refers to any situation where a single piece of source content is written once, and then used in multiple locations or contexts [Step Two].
Using a mid-tier web content management system, one office would be able to "reuse" a content item without having permission to modify it, because the item is owned by another department.
Here's a full solution to a hypothetical example between the Biology department and the Admission Office:
- A writer in the Biology department creates a student profile within the CMS.
- The web person for the Biology department approves the story to display within the Biology site.
- Someone tips off the Admission Office to this great student profile.
- The web person in the Admission Office places this story in rotation with the other student profiles on the Prospective Students page in the CMS.
What if every department and office could not only generate their own content, but could make use of relevant content that already exists? A story in Ideation could be featured by the Chemistry department; an upcoming event posted by the Government department could be featured on the College's homepage; and a University Relations story about faculty achievement could be featured by the English department.
Hopefully we have uncovered some of the benefits of content reuse, as well as the synergy with permissions and workflow. And maybe you can begin to see why re.web gets us so excited about the future of the W&M web!
Web syndication is a form of syndication in which a section of a website is made available for other sites to use... [In] general, web syndication refers to making web feeds available from a site in order to provide other people with a summary of the website's recently added content (for example, the latest news or forum posts).
Syndication benefits both the websites providing information and the websites displaying it. For the receiving site, content syndication is an effective way of adding greater depth and immediacy of information to its pages, making it more attractive to users. For the transmitting site, syndication drives exposure across numerous online platforms. This generates new traffic for the transmitting site... [Wikipedia].
The primary format for web syndication at W&M is RSS. Those familiar with myWM can see a tangible use of RSS in action. Many of the news and information channels provided are actually RSS feeds that other servers are sharing through web syndication, including some from our own community.
In the context of a web content management system, support for web syndication (specifically RSS) is a two-way street.
- News and information within the CMS should be available for subscription by others (e.g., myWM, myYahoo!).
- News and information that exists external to the CMS (maybe from the FlatHat or from the College Board) should be available to the CMS where it is appropriate.
Said another way, syndication extends content reuse by removing the wall of separation between external content and content residing in the CMS.
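To illustrate the two-way street, here is a minimal Python sketch that both produces and reads a bare-bones RSS 2.0 feed using the standard library. It is a simplified example, not the feed format of any particular CMS; the function names, feed title and example.org URLs are placeholders.

```python
import xml.etree.ElementTree as ET


def build_rss(title: str, link: str, description: str, items: list) -> str:
    """Build a minimal RSS 2.0 document from (title, link) pairs."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for item_title, item_link in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = item_title
        ET.SubElement(item, "link").text = item_link
    return ET.tostring(rss, encoding="unicode")


def read_rss_titles(xml_text: str) -> list:
    """Read the item titles from a feed, as a receiving site might."""
    root = ET.fromstring(xml_text)
    return [item.findtext("title") for item in root.iter("item")]
```

The first function plays the part of the CMS publishing its news; the second plays the part of a site (or the CMS itself) pulling headlines in from elsewhere.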
In managing web content over time, pages will change:
- information will be updated
- events will be anticipated, attended and remembered
- accomplishments will be heavily promoted, and later archived for reference
Remember our discussion of versioning?
The ability to track and manage multiple versions of a single instance of content [Wikipedia].
Versioning provides us the opportunity to make edits in a safe environment—where we can build the changes for review prior to publishing, and where older versions remain accessible for reference or roll-back.
Content Scheduling adds a chronology to this process. What if we already know next year's tuition rates, but we are not allowed to post them until after the Board makes an official announcement (scheduled for 2:15pm on the day of the next board meeting)? By associating a date and time, often called a "sunrise," with the publication of this content, the person responsible for posting tuition to the website is much less frantic and stressed, having updated the page a week earlier.
Okay, let's be honest... That person may be hitting "Reload" every few seconds around 2:15pm just to be sure it happens on schedule! But what about a nice "Happy New Year" on the homepage just after midnight? I may decide to go to bed before midnight, or hang out with friends instead of on the web, and look at it the next morning to make sure the message appeared. After using the CMS for some time, I might just start to trust it will happen...
If we can have a "sunrise," why not a "sunset" as well? There are actually several different forms a "sunset" feature might take:
- Content is retired (removed) from the web
- Content is archived (moved to a new/permanent location)
- The content owner is notified that the content may be stale and in need of revision (aka, "freshness reminders")
In our previous examples (with tuition, or with a message on New Year's Day), we would likely opt for a "freshness reminder," since the page will continue to exist indefinitely, but the content will continue to be updated.
Current News stories might be a place where scheduled archival to a News Archive site might make sense.
For the sake of an example where we might remove content altogether, a landing page for registering for an upcoming event might be removed after the deadline passes.
Of course, each content owner and each section or page of our website might have different content scheduling rules applied. Maybe your site will decide to set a 1-year "freshness reminder" as your default. Or maybe you don't think you need this feature... But content scheduling is a feature we're exploring as we evaluate CMS products to run our beautiful new web.
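The logic behind "sunrise" and "sunset" is simple enough to sketch in Python. This is our own illustration under the assumption that each content item carries two optional timestamps; the `is_live` function is hypothetical, and archiving or freshness reminders would need more than this.

```python
from datetime import datetime
from typing import Optional


def is_live(now: datetime,
            sunrise: Optional[datetime] = None,
            sunset: Optional[datetime] = None) -> bool:
    """A content item is visible from its sunrise (inclusive) until
    its sunset (exclusive). A missing value means no limit on that side."""
    if sunrise is not None and now < sunrise:
        return False
    if sunset is not None and now >= sunset:
        return False
    return True
```

For the New Year's greeting, the sunrise would be midnight on January 1 and the sunset perhaps a week later; the page editor sets both ahead of time and goes to bed.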
What does the CMS do for us in the area of compliance? At the very least, it should allow us to have accessible, standards compliant pages and sites within the CMS.
Fortunately, the CMS products contending in our final selection process go well beyond this. Although they differ in implementation, each provides an array of features to address this issue.
First, the CMS uses Templates and Themes throughout. So page content gets displayed inside our professionally designed standards-compliant wrappers. Page content enters the CMS by way of an editor designed to produce good code.
Where the products have some differences is enforcement of, and reporting on, specific standards and guidelines. For instance, one system will require "alt tags" on images, while another might just suggest it.
Of course, each system not only permits, but encourages the author to adhere to the standards and guidelines. So the differences are in the degree to which compliance can be mandated by the system. In any case, the CMS will be a great place to build a compliant site.
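As an example of the kind of check a system could either enforce or merely report, here is a small Python sketch that flags images lacking an alt attribute. It is a hedged illustration of one accessibility rule, not a full compliance checker; `MissingAltFinder` and `images_missing_alt` are hypothetical names.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Records the src of every <img> that has no alt attribute at all.
    An empty alt="" is accepted, since it is valid for decorative images."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attr_map = dict(attrs)
            if "alt" not in attr_map:
                self.missing.append(attr_map.get("src", "(no src)"))


def images_missing_alt(html: str) -> list:
    """Return the src values of images that lack an alt attribute."""
    finder = MissingAltFinder()
    finder.feed(html)
    return finder.missing
```

A system that "requires" alt text might refuse to save the page while this list is non-empty; one that "suggests" it might just show the list to the author.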
Aren't we missing something? You may be thinking that content equates to text. In fact, I may be guilty of fostering that notion in portions of this discussion.
Fortunately, the CMS systems we're discussing can handle much more than simple text! Each offers some sort of solution for storing, positioning and reusing non-text content—although, you may have guessed, the implementation varies by product...
What is non-text content? That's a simple way of lumping together a bunch of otherwise-unrelated types of data. Some examples might include images, audio, video, PDFs (only where necessary!), and even some Flash or AJAX interactivity. Basically, all of the stuff that makes the web so exciting!
There are a variety of media types, and differences in methods for incorporating them in the CMS. In fact, we're looking forward to seeing how the new web presence will take advantage of these elements to reflect the interesting and exciting personality of the College.
For the moment we'll leave it with: the CMS will be capable of storing, positioning and reusing various media elements in the ways we will need.
"Information Architecture is Key"), which referenced a lengthier discussion of IA.
For those who want the condensed version:
IA is the blueprint and foundation for the site. It is essential to the site map, navigation system and search, as well as to a consistent strategic message. But merely building a site with these features will not produce strong IA.
Deeper in these previous discussions we added more factors to the list—menus, taxonomies, meta-data and more. None of these provides a solid IA. But an absence of these elements often shows a weakness in the IA. A case of the whole being more than the sum of the parts. Said differently, these elements must all exist and work in harmony with one another as they play from the same score of music.
Does the CMS provide IA?
A piano does not provide music. A quality piano which has been tuned may be used by a pianist (often reading the work of a composer) to create music.
Likewise, the CMS does not provide us with IA. But a quality CMS can be used by professionals (web designers, programmers, content writers) to implement our IA—and become a solid foundation for maintaining solid IA as we grow.
But just as a pianist has a score to be played and interpreted using the piano, our professionals need an IA to implement within the CMS. This is where our consultants, mStoner, again come to our aid. Working with the re.web team, deep in the weeds of IA, we are making a plan before we jump into the details of building sites within the CMS.
Does the CMS do anything for IA?
Actually, the CMS will help a great deal in the implementation of our IA. The CMS will provide us a framework for the navigation system and site map, a vehicle for consistent messaging, a storehouse for the meta-data and taxonomies we develop, and a structure to enhance "searchability" and "findability" across our site.
The abilities to make use of non-text content and to reuse content help us avoid the devolution to chaos we see in our current site. Maybe that's what is most exciting about the CMS: not the individual features themselves, but the way in which they work together to enable us to implement the IA.
No CMS has every feature we would like it to have. If it did, we'd probably dream up something new to add to the list.
Our current web design was unveiled eight years ago. How many folks then had heard of podcasting, wikis or blogs? What about Facebook, YouTube or Web 2.0?
We certainly hope that our new design, however beautiful, will be updated more often as we move forward. In fact, some of the CMS features we've already discussed will enable this to happen more easily in the future.
Why play nice with others?
The point is, we cannot predict, much less require of the CMS, every feature we may desire over the course of its lifetime. But we don't have to!
Instead, we want a CMS that allows us to extend its capabilities. We want code (modules, applications) external to the CMS to be usable within the CMS. Interfaces and programming languages vary. But the products we're reviewing allow this type of extension.
Just as Firefox and Google Desktop allow the creation and incorporation of plug-ins, top CMS vendors have accepted the reality that they cannot develop quickly enough to keep pace with the web.
How about some tangible examples of features that might require some extension with external technologies?
- Web Forms and Data Collection
- Statistics/Web analytics
- Calendars/Event Registration
- Site Search
Is this a definitive list? Absolutely not! Some of the CMS products we are evaluating provide some of the above features. And there will undoubtedly be more features on the list as time goes by.
Which is precisely the point of extensibility—and why this capability may be the last we'll discuss, but is by no means the least important on our list.
The End of the Road?
This concludes the list of features that we plan to describe. (But only the beginning of the list of features we may implement.)
If you are interested in broader lists of features, as well as CMS product comparisons, you might be interested in the CMSMatrix.
For those who have braved this oversized monograph, you are to be commended. Thank you for your interest in and support of the re.web project, and your continued involvement with the web at William & Mary.