Monday, September 01, 2014

Are You Up to Speed on Microdata?

Today's post is reblogged (with permission) from Author-Zone.com.

Do you have a web site? Do you care about Google search ranking? Do you tag blog posts with keywords? Practice safe SEO?

Well, everything you know is about to change. But first a bit of backstory. Not long ago, Google announced the official end of its Authorship markup program. best known as the scheme that allowed people to appear with their G+ picture off to the left in a highly structured "rich snippet" search-result. Google talked about the death of the program here.

The even bigger news, though (already well known to hard-core SEO geeks but still making its way out to the hinterlands), is that Google will increasingly rely on semantic hints in HTML markup (hints that you need to be sure are there) when indexing web pages. The hints can be RDFa, or microformat-based, or microdata-based (using the schema.org schema). The latter is "recommended."

According to Google, failure to use any of these hinting schemes will not adversely affect page rank, nor will using them affect rank. But let's be honest here. It's pretty clear that your pages are unlikely as hell to show up in cosmetically pleasing rich-snippet format, on any Google page, if you've neglected to use microdata in your markup. It doesn't take a rocket surgeon to understand that eventually, if your site isn't using microdata (and everyone else's is), you're going to be at a disadvantage in certain situations. Extrapolate that however you wish.

What do you have to do to get microdata into your content? Basically stand on your goddam head and count backward to infinity. No rich text HTML editor I know of has built-in support for microdata (and yet, this support is clearly needed in all rich text editors, going forward). Who'll get there first? I have no idea. It's wide open. I expect the tooling situation will change, quickly. But in the meantime, you have only a couple of options for making your site's pages microdata-savvy (read: super-Google/Bing-friendly).

The first option is really a workaround, not a solution; and that's to head over to Google Webmaster Tools, where you can manually train Google to know where the key itemtypes and itemprops are (or should be) located on your site's pages. Once you've registered your site with GWT, flip open the Search Appearance item in the command tree on the left edge of the page, then start clicking around.

That's good enough as a right-this-minute workaround, but it obviously doesn't leave your site with different markup than it had before. And that's where Option No. 2 (the stand-on-your-head option) comes into play: Insert microdata yourself. By hand.

Mind you, this is not something a human being should have to do without proper tooling, and frankly, until tooling arrives I think you'd be crazy to invest a lot of time in marking up your stuff by hand. But to give you a quick visual idea of what kind of markup I'm talking about, here's what needs to happen to, say, a blog post. First, the entire post needs to be wrapped in an article tag with microdata inside it, like so:
<article itemscope itemtype="http://schema.org/Article" id="post-410" class="post-410 post type-post status-publish format-standard hentry category-uncategorized tag-font tag-legibility tag-readability tag-reading-comprehension tag-serif tag-typeface"> <!-- CONTENT GOES HERE --> </article>
I've highlighted microdata in peach so you can get a taste of the deliciousness. (Blech!) The headline for your peach piece needs to be wrapped something like this:
<h2 class="entry-title"> <a href="http://author-zone.com/serif-readability-myth/" title="Permalink to The Serif Readability Myth" rel="bookmark"> <span itemprop="name">The Serif Readability Myth</span></a> </h2>
If you've got a byline, it needs to look something like:
<span class="date-time"><a href="http://author-zone.com/serif-readability-myth/" title="1:15 am" rel="bookmark"><time itemprop="datePublished" content="2014-08-29" class="entry-date" datetime="2014-08-29T01:15:55+00:00">August 29, 2014</time></a></span> / <span class="author vcard"><a class="url fn n" href="http://author-zone.com/author/kasmanethomas/" title="View all posts by Kas Thomas" rel="author"> <span itemprop="author" itemscope itemtype="http://schema.org/Person"> <span itemprop="name">Kas Thomas</span></span>
And the content itself should begin with:
<span itemprop="articleBody"><p>I’ve been involved in publishing all my life, and like many others I’ve always accepted as axiomatic the notion that typefaces with serifs (such as Times-Roman) are, in general, are more readable than [ . . . ]
Don't use curly-quotes; those are a WordPress artifact. So as you can see, this stuff is butt-fugly, not at all intuitive (unless you're a taxonomy geek), and clearly not something an unaided human being should be doing by hand. It's a situation that very obviously calls for better tools, and right now we have [crickets chirping].

WordPress developers? HTML rich-text tooling geeks? Open source community? Content management vendors? We need your help. This situation calls for immediate action. Lots of people could use lots of great tools. Please help us out. I can't stand on my head and count backward much longer.    

Want great tips, news, and resource links in your inbox? Sign up for the Author-Zone Newsletter!