Hatena ID Discovery Lite

Living Standard 20 June 2013

Latest Version
<http://wakaba.github.com/packages/hatenaid-discovery/docs/spec>
Version History
<https://github.com/wakaba/hatenaid-discovery/commits/master/docs/spec-src.html>
Author
<>

Abstract

This document describes how to embed the Hatena ID of the author in HTML pages or HTTP headers and how to extract the Hatena ID from such annotations.

Status of this document

This section describes the status of this document at the time of its publication. Other documents might supersede this document.

This document might be updated, replaced, or obsoleted by other documents at any time.

Comments on this document are welcome and may be sent to the author.

Translations of this document might be available. The English version of the document is the only normative version.

Table of contents

  1 Introduction
  2 Conformance
  3 Hatena ID
  4 HTML link element
    4.1 Authoring requirements
    4.2 Implementation requirements
  5 HTTP X-Hatena-Author: header
    5.1 Authoring requirements
    5.2 Implementation requirements
  6 Examples
  7 Tests
  References
    Normative references

1 Introduction

This section is non‐normative.

This document defines Hatena ID Discovery Lite, a lightweight syntax for embedding the Hatena ID of the author in HTML pages, and describes how to extract the Hatena ID from such annotations.

Although there is another technique for embedding the Hatena ID of the author, namely Account Auto-Discovery, its terrible syntax makes it difficult to embed the Hatena ID in Web pages or to extract it from them, and it is rarely used these days.

Unlike Account Auto-Discovery, Hatena ID Discovery Lite provides a simple syntax for Hatena ID annotation that is both easy to write and easy to parse, by defining the usage of standard HTML elements and attributes for this particular purpose. In addition, it defines a simple HTTP header field that conveys the author's Hatena ID, for dynamically-generated Web pages and non-HTML documents.

This document deprecates the use of Account Auto-Discovery in favor of Hatena ID Discovery Lite.

2 Conformance

The keywords "MUST", "MUST NOT", "SHOULD", and "MAY" in the normative parts of this document are to be interpreted as described in RFC 2119 [RFC2119].

Requirements phrased in the imperative as part of algorithms (such as "strip any leading space characters" or "return false and abort these steps") are to be interpreted with the meaning of the key word ("MUST", "MAY", etc) used in introducing the algorithm.

Conformance requirements phrased as algorithms or specific steps MAY be implemented in any manner, so long as the end result is equivalent. (In particular, the algorithms defined in this specification are intended to be easy to follow, and not intended to be performant.)

User agents MAY impose implementation-specific limits on otherwise unconstrained inputs, e.g. to prevent denial of service attacks, to guard against running out of memory, or to work around platform-specific limitations.

For example, a user agent can choose to stop parsing an HTML document after the first 4096 bytes have been parsed, to work around its memory limitations.

Some conformance requirements are phrased as requirements on elements or attributes. Such requirements fall into two categories: those describing content model restrictions, and those describing implementation behavior. Those in the former category are requirements on documents and authoring tools. Those in the latter category are requirements on user agents. Similarly, some conformance requirements are phrased as requirements on authors; such requirements are to be interpreted as conformance requirements on the documents that authors produce. (In other words, this specification does not distinguish between conformance criteria on authors and conformance criteria on documents.)

3 Hatena ID

A Hatena ID is a short string identifying some kind of object (such as a user or a group) in Hatena services. As it is often embedded in URLs, it is sometimes referred to as url_name in Hatena systems. A Hatena ID consists of one or more characters in the following ranges: 0-9, A-Z, a-z, -, _, and @. Hatena IDs are case-sensitive.

Hatena IDs are sometimes preceded by id:. However, the id: prefix is not part of Hatena IDs.

Any semantically valid Hatena ID consists of three or more characters. An application can ignore Hatena IDs whose length is less than three (3), if desired.

The maximum length of a Hatena ID is not defined. An application SHOULD accept at least 128 characters for a Hatena ID.
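The character-set and length rules above can be sketched in Python as follows. This is only an illustrative sketch, not part of this specification: the function names `is_hatena_id` and `is_plausible_hatena_id` are hypothetical, and the three-character minimum is applied as the optional check an application may choose to perform.

```python
import re

# One or more characters from the ranges 0-9, A-Z, a-z, -, _, and @.
# The "id:" prefix is not part of a Hatena ID, so it never matches here.
_HATENA_ID_RE = re.compile(r"[0-9A-Za-z_@-]+")


def is_hatena_id(value: str) -> bool:
    """Return True if value is syntactically a Hatena ID (hypothetical helper)."""
    return _HATENA_ID_RE.fullmatch(value) is not None


def is_plausible_hatena_id(value: str) -> bool:
    """Additionally apply the optional rule that semantically valid
    Hatena IDs have three or more characters (hypothetical helper)."""
    return is_hatena_id(value) and len(value) >= 3
```

No upper bound on length is enforced in this sketch, since the maximum length of a Hatena ID is not defined.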

4 HTML link element

4.1 Authoring requirements

A Hatena ID link is an HTML element representing the Hatena ID of the author of the content described by the element.

A Hatena ID link MUST be an a, area, or link element [HTML].

The rel attribute of a Hatena ID link MUST contain at least one of the following link types:

The href attribute of a Hatena ID link MUST have one of the following values:

... where hatena-id is the Hatena ID of the author. Any @ character in hatena-id MAY be percent-encoded. The other characters in hatena-id MUST NOT be percent-encoded.

According to the HTML Standard, an a or area element with link type author indicates the author of the nearest article element, if there is one, or of the page as a whole, otherwise.

4.2 Implementation requirements

How to find elements indicating the author of the document, or the author of an element, and how to parse rel attribute values, are specified by the HTML Standard. Please note that rel attribute values are ASCII case-insensitive and might contain more than one value.

According to the HTML Standard, a rev attribute with the value made must be treated as having the author value specified in the rel attribute. The rel attribute can contain more than one link type, separated by white space characters.

The Hatena ID of the author MUST be extracted from an element by the following steps:

  1. If the element does not have the href attribute, return nothing and abort these steps.
  2. Let hatena-id be the value of the href attribute of the element.
  3. If hatena-id prefix-matches http://www.hatena.ne.jp/ literally, delete it from hatena-id.
  4. Otherwise, if hatena-id prefix-matches http://www.hatena.com/ literally, delete it from hatena-id.
  5. Otherwise, if hatena-id prefix-matches http://profile.hatena.ne.jp/ literally, delete it from hatena-id.
  6. Otherwise, if hatena-id prefix-matches http://profile.hatena.com/ literally, delete it from hatena-id.
  7. Otherwise, return nothing and abort these steps.
  8. If hatena-id ends with a / character, delete that character from hatena-id.
  9. Otherwise, return nothing and abort these steps.
  10. Replace any occurrence of %40 in hatena-id by @.
  11. If hatena-id contains a character not in the ranges 0-9, A-Z, a-z, -, _, and @, return nothing and abort these steps.
  12. If hatena-id is the empty string, return nothing and abort these steps.
  13. Return hatena-id.
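The steps above can be sketched in Python as follows. This is an illustrative sketch rather than a normative implementation; the function name `extract_hatena_id_from_href` is hypothetical, a missing href attribute is represented by `None`, and "return nothing" is modelled as returning `None`.

```python
import string
from typing import Optional

# URL prefixes tested literally, in the order given by steps 3 to 6.
_PREFIXES = (
    "http://www.hatena.ne.jp/",
    "http://www.hatena.com/",
    "http://profile.hatena.ne.jp/",
    "http://profile.hatena.com/",
)
_ALLOWED = frozenset(string.ascii_letters + string.digits + "-_@")


def extract_hatena_id_from_href(href: Optional[str]) -> Optional[str]:
    # Step 1: no href attribute, nothing to extract.
    if href is None:
        return None
    # Step 2.
    hatena_id = href
    # Steps 3 to 7: strip the first matching prefix, or give up.
    for prefix in _PREFIXES:
        if hatena_id.startswith(prefix):
            hatena_id = hatena_id[len(prefix):]
            break
    else:
        return None
    # Steps 8 and 9: a trailing "/" is required and removed.
    if not hatena_id.endswith("/"):
        return None
    hatena_id = hatena_id[:-1]
    # Step 10: decode percent-encoded "@".
    hatena_id = hatena_id.replace("%40", "@")
    # Step 11: reject any character outside the allowed ranges.
    if any(ch not in _ALLOWED for ch in hatena_id):
        return None
    # Step 12: reject the empty string.
    if hatena_id == "":
        return None
    # Step 13.
    return hatena_id
```

Because the prefixes are matched literally, an `https:` URL or a URL with a differently-cased host name yields nothing in this sketch.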

Please note that the Hatena ID returned by these steps might or might not be a valid ID. Please also note that an author can mark up anyone else's Hatena ID as the author of the document. There is no formal way to test whether a Hatena ID link is correct or not.

This document does not define how to process a document containing multiple Hatena ID links. A future version of this document might define the processing model for documents with multiple authors.

Until such a processing model is defined, implementors are encouraged to use the first Hatena ID link in tree order if there are multiple Hatena ID links.
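One possible way to pick the first Hatena ID link in tree order is sketched below using Python's standard `html.parser` module. The sketch rests on several assumptions that are not requirements of this document: only the `author` link type (or a rev value of `made`) is treated as qualifying, elements whose href yields no Hatena ID are skipped, splitting rel on whitespace only approximates the HTML Standard's token parsing, and all names (`first_hatena_id`, `_FirstLink`, `_extract`) are hypothetical.

```python
import string
from html.parser import HTMLParser
from typing import Optional

_PREFIXES = (
    "http://www.hatena.ne.jp/",
    "http://www.hatena.com/",
    "http://profile.hatena.ne.jp/",
    "http://profile.hatena.com/",
)
_ALLOWED = frozenset(string.ascii_letters + string.digits + "-_@")


def _extract(href: Optional[str]) -> Optional[str]:
    """Compact form of the href extraction steps in this section."""
    if href is None:
        return None
    for prefix in _PREFIXES:
        if href.startswith(prefix):
            rest = href[len(prefix):]
            break
    else:
        return None
    if not rest.endswith("/"):
        return None
    rest = rest[:-1].replace("%40", "@")
    if rest == "" or any(ch not in _ALLOWED for ch in rest):
        return None
    return rest


class _FirstLink(HTMLParser):
    def __init__(self) -> None:
        super().__init__()
        self.hatena_id: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Start tags arrive in source order, which matches tree order.
        if self.hatena_id is not None or tag not in ("a", "area", "link"):
            return
        attr = {}
        for name, value in attrs:
            attr.setdefault(name, value)  # first duplicate attribute wins
        rel = (attr.get("rel") or "").lower().split()
        rev = (attr.get("rev") or "").lower().split()
        if "author" in rel or "made" in rev:
            self.hatena_id = _extract(attr.get("href"))


def first_hatena_id(html: str) -> Optional[str]:
    parser = _FirstLink()
    parser.feed(html)
    parser.close()
    return parser.hatena_id
```

A caller would pass the decoded document text, for example `first_hatena_id(page_source)`, and treat a `None` result as "no Hatena ID link found".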

5 HTTP X-Hatena-Author: header

5.1 Authoring requirements

An HTTP message MAY contain an X-Hatena-Author header field. The header field MUST conform to the following ABNF [ABNF] production rule:

x-hatena-author-header = "X-Hatena-Author:" SP hatena-id
... where SP is defined in STD 68.

5.2 Implementation requirements

The Hatena ID of the author MUST be extracted from the HTTP header by the following steps:

  1. If the HTTP message does not contain the X-Hatena-Author: header field, return nothing and abort these steps.
  2. Otherwise, let hatena-id be the value of the first X-Hatena-Author: header field in the HTTP message.
  3. If hatena-id contains a 0x2C comma character (,), delete the character and any following characters from hatena-id.
  4. Delete any leading or trailing 0x0A, 0x09, 0x0D, and/or 0x20 characters from hatena-id.
  5. If hatena-id contains %40 (string 0x25, 0x34, 0x30), replace it by a 0x40 commercial at character (@).
  6. If the first three characters of hatena-id, if any, are id:, ID:, Id:, or iD:, delete them from hatena-id.
  7. If hatena-id contains a character not in the ranges 0-9, A-Z, a-z, -, _, and @, return nothing and abort these steps.
  8. If hatena-id is the empty string, return nothing and abort these steps.
  9. Return hatena-id.
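The header-processing steps above can be sketched in Python as follows. This is an illustrative sketch, not a normative implementation: the function name `extract_hatena_id_from_headers` is hypothetical, the headers are assumed to be available as (name, value) pairs in message order, header field names are compared case-insensitively as is usual for HTTP, and step 5 is interpreted as replacing every occurrence of %40.

```python
import string
from typing import Iterable, Optional, Tuple

_ALLOWED = frozenset(string.ascii_letters + string.digits + "-_@")


def extract_hatena_id_from_headers(
    headers: Iterable[Tuple[str, str]]
) -> Optional[str]:
    # Steps 1 and 2: take the value of the first X-Hatena-Author field.
    value = next(
        (v for name, v in headers if name.lower() == "x-hatena-author"),
        None,
    )
    if value is None:
        return None
    # Step 3: drop the first comma and everything after it.
    hatena_id = value.split(",", 1)[0]
    # Step 4: trim leading and trailing LF, TAB, CR, and SPACE.
    hatena_id = hatena_id.strip("\n\t\r ")
    # Step 5: decode percent-encoded "@" (all occurrences, by assumption).
    hatena_id = hatena_id.replace("%40", "@")
    # Step 6: remove a leading "id:" prefix in any letter case.
    if hatena_id[:3] in ("id:", "ID:", "Id:", "iD:"):
        hatena_id = hatena_id[3:]
    # Step 7: reject any character outside the allowed ranges.
    if any(ch not in _ALLOWED for ch in hatena_id):
        return None
    # Step 8: reject the empty string.
    if hatena_id == "":
        return None
    # Step 9.
    return hatena_id
```

For instance, a header list containing `("X-Hatena-Author", "hatenastar")` yields `hatenastar`, while a value with an embedded space yields nothing.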

This document does not define how to process a document containing multiple X-Hatena-Author: header fields. A future version of this document might define the processing model for documents with multiple authors.

Likewise, this document does not define how to process a document containing both an X-Hatena-Author: header field and a Hatena ID link.

6 Examples

An HTML fragment:

<link rel=author href="http://www.hatena.ne.jp/ugomemohatena/">
... describes that the author of the document has Hatena ID ugomemohatena.

The following HTTP header describes that the author of the document contained in the HTTP message has Hatena ID hatenastar:

X-Hatena-Author: hatenastar

7 Tests

There is a test suite.

References

Normative references

ABNF
Augmented BNF for Syntax Specifications: ABNF, Dave Crocker, Paul Overell, IETF Internet Standard (STD 68), RFC 5234.
HTML
HTML Standard, Ian Hickson, WHATWG Living Standard.
RFC2119
Key words for use in RFCs to Indicate Requirement Levels, Scott Bradner, RFC 2119, IETF BCP 14, March 1997.
XFN
XFN 1.1 relationships meta data profile, Tantek Çelik, Matthew Mullenweg, Eric Meyer, GMPG.