Content Arbitration: MultiViews and *.var files

The HTTP standard allows clients (i.e., browsers like Mosaic or Netscape) to specify what data formats they are prepared to accept. The intention is that when information is available in multiple variants (e.g., in different data formats), servers can use this information to decide which variant to send. The CERN server has supported this feature for a while; the NCSA server does not yet support it, but it is likely to assume a new importance in light of the emergence of HTML3-capable browsers.

The Apache module mod_negotiation handles content negotiation in two different ways: special treatment for the pseudo-MIME type application/x-type-map, and the MultiViews per-directory Option (which can be set in srm.conf, or in .htaccess files, as usual). These features are alternate user interfaces to what amounts to the same piece of code (in the new file http_mime_db.c) which implements the content negotiation portion of the HTTP protocol.

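Each of these features allows one of several files to satisfy a request, based on what the client says it's willing to accept; the differences are in the way the files are identified. A type map explicitly lists the files containing the variants, while a MultiViews search finds them by looking in the directory for files whose names match the request.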

Apache also supports a new pseudo-MIME type, text/x-server-parsed-html3, which is treated as text/html;level=3 for purposes of content negotiation, and as server-side-included HTML elsewhere.

Type maps (*.var files)

A type map is a document which is typed by the server (using its normal suffix-based mechanisms) as application/x-type-map. Note that to use this feature, you've got to have an AddType some place which defines a file suffix as application/x-type-map; the easiest thing may be to stick a

  AddType application/x-type-map var

in srm.conf. See comments in the sample config files for details.

Type map files have an entry for each available variant; these entries consist of contiguous RFC822-format header lines. Entries for different variants are separated by blank lines. Blank lines are illegal within an entry. It is conventional to begin a map file with an entry for the combined entity as a whole, e.g.,


  URI: foo; vary="type,language"

  URI: foo.en.html
  Content-type: text/html; level=2
  Content-language: en

  URI: foo.fr.html
  Content-type: text/html; level=2
  Content-language: fr

If the variants have different qualities, that may be indicated by the "qs" parameter, as in this picture (available as jpeg, gif, or ASCII-art):

  URI: foo; vary="type,language"

  URI: foo.jpeg
  Content-type: image/jpeg; qs=0.8

  URI: foo.gif
  Content-type: image/gif; qs=0.5

  URI: foo.txt
  Content-type: text/plain; qs=0.01
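
As a rough illustration of how "qs" might enter the choice, here is a minimal sketch in Python. It assumes the server multiplies each variant's qs by the q value the client gave for that media type in its Accept header, and sends the variant with the highest product; that rule, the simplified Accept parsing, and all function names are assumptions made for the example, not the server's actual code.

```python
def parse_accept(header):
    """Parse an Accept header into {media_type: q}; q defaults to 1.0."""
    prefs = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                q = float(value)
        prefs[parts[0].lower()] = q
    return prefs


def client_q(prefs, media_type):
    """Look up q for a type, falling back to type/* and then */*."""
    major = media_type.split("/")[0]
    for key in (media_type, major + "/*", "*/*"):
        if key in prefs:
            return prefs[key]
    return 0.0  # the client did not list this type at all


def best_variant(variants, accept_header):
    """variants: list of (uri, media_type, qs). Returns the best URI or None."""
    prefs = parse_accept(accept_header)
    scored = [(client_q(prefs, mtype) * qs, uri) for uri, mtype, qs in variants]
    score, uri = max(scored, default=(0.0, None))
    return uri if score > 0 else None


# The picture from the map above.
PICTURE = [
    ("foo.jpeg", "image/jpeg", 0.8),
    ("foo.gif", "image/gif", 0.5),
    ("foo.txt", "text/plain", 0.01),
]
```

With this sketch, a client sending "Accept: image/gif, text/plain" would get foo.gif, and one that accepts only text/plain would get the ASCII-art version.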

The full list of headers recognized is:

URI:
The URI of the file containing the variant (of the given media type, encoded with the given content encoding). These are interpreted as URLs relative to the map file; they must be on the same server (!), and they must refer to files to which the client would be granted access if they were to be requested directly.

Content-type:
The media type; level may be specified, along with "qs". These are often referred to as MIME types; typical media types are image/gif, text/plain, or text/html; level=3.

Content-language:
The language of the variant, specified as an internet standard language code (e.g., en for English, ko for Korean, etc.).

Content-encoding:
If the file is compressed, or otherwise encoded, rather than containing the actual raw data, this says how that was done. For compressed files (the only case where this generally comes up), content encoding should be x-compress, or gzip, as appropriate.

Content-length:
The size of the file. Clients can ask to receive a given media type only if the variant isn't too big; specifying a content length in the map allows the server to compare against these thresholds without checking the actual file.
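
To make the file layout concrete, here is a small Python sketch that splits a type map into its entries, using blank lines as separators as described above. The function name is made up for the example, and the sketch deliberately ignores RFC822 continuation lines and the parameters inside header values (such as qs or level), which a real parser would also have to handle.

```python
def parse_type_map(text):
    """Split a type map into entries: one dict of lowercased header names per entry."""
    entries, current = [], {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current entry (if one has been started).
            if current:
                entries.append(current)
                current = {}
            continue
        name, _, value = line.partition(":")
        current[name.strip().lower()] = value.strip()
    if current:
        entries.append(current)
    return entries


EXAMPLE_MAP = """\
URI: foo; vary="type,language"

URI: foo.en.html
Content-type: text/html; level=2
Content-language: en

URI: foo.fr.html
Content-type: text/html; level=2
Content-language: fr
"""
```

Calling parse_type_map(EXAMPLE_MAP) gives three dictionaries: one for the combined entity, and one for each of the two language variants.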

MultiViews

This is a per-directory option, meaning it can be set with an Options directive within a <Directory> section in access.conf, or (if AllowOverride is properly set) in .htaccess files. Note that Options All does not set MultiViews; you have to ask for it by name. (Fixing this is a one-line change to httpd.h).

The effect of MultiViews is as follows: if the server receives a request for /some/dir/foo, /some/dir has MultiViews enabled, and /some/dir/foo does *not* exist, then the server reads the directory looking for files named foo.*, and effectively fakes up a type map which names all those files, assigning them the same media types and content-encodings they would have had if the client had asked for one of them by name. It then chooses the best match to the client's requirements, and forwards that along.
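
The search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the directory listing is passed in as a plain list of names, Python's mimetypes module stands in for the server's own suffix-based typing, and the function and dictionary key names are invented for the example.

```python
import mimetypes


def synthesize_type_map(filenames, name):
    """Build a synthetic type map for `name` from a directory listing.

    Returns None when a file called exactly `name` exists, since the
    search only happens when it does not.
    """
    if name in filenames:
        return None
    variants = []
    for filename in sorted(filenames):
        if filename.startswith(name + "."):
            # Same suffix-based typing the file would get if requested by name.
            media_type, encoding = mimetypes.guess_type(filename)
            variants.append({
                "uri": filename,
                "content-type": media_type,
                "content-encoding": encoding,
                "quality": 1.0,
            })
    return variants
```

For a directory holding foo.html, foo.gif and bar.html, a request for foo would yield entries for foo.gif and foo.html only; the server would then pick among them just as it does for a hand-written map.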

This applies to searches for the file named by the DirectoryIndex directive, if the server is trying to index a directory; if the configuration files specify


  DirectoryIndex index

then the server will arbitrate between index.html and index.html3 if both are present. If neither is present, and index.cgi is there, the server will run it.

If one of the files found by the globbing is a CGI script, it's not obvious what should happen. My code gives that case special treatment: if the request was a POST, or a GET with QUERY_ARGS or PATH_INFO, the script is given an extremely high quality rating, and generally invoked; otherwise it is given an extremely low quality rating, which generally causes one of the other views (if any) to be retrieved. This is the only jiggering of quality ratings done by the MultiViews code; aside from that, all qualities in the synthesized type maps are 1.0.
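
The CGI rule above amounts to a small decision function. A hedged Python sketch follows; the two constants are hypothetical stand-ins, since the text says only "extremely high" and "extremely low", and the function and parameter names are likewise invented for the example.

```python
# Hypothetical values; the actual ratings are not given in the text.
SCRIPT_HIGH = 1000000.0
SCRIPT_LOW = 0.000001


def script_quality(method, query_string="", path_info=""):
    """Quality assigned to a CGI script found by the MultiViews search."""
    has_args = bool(query_string) or bool(path_info)
    if method == "POST" or (method == "GET" and has_args):
        return SCRIPT_HIGH  # the request looks like it is meant for the script
    return SCRIPT_LOW       # a plain fetch: prefer one of the other views
```

Every other variant keeps a quality of 1.0, so the script wins whenever it gets the high rating and loses whenever some other view is acceptable to the client.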

New as of 0.8: Documents in multiple languages can also be resolved through the use of the AddLanguage and LanguagePriority directives:

  AddLanguage en .en
  AddLanguage fr .fr
  AddLanguage de .de
  AddLanguage da .da
  AddLanguage el .el
  AddLanguage it .it

  # LanguagePriority allows you to give precedence to some languages
  # in case of a tie during content negotiation.
  # Just list the languages in decreasing order of preference.

  LanguagePriority en fr de

Here, a request for "foo.html" matched against "foo.html.en" and "foo.html.fr" would return a French document to a browser that indicated a preference for French, or an English document otherwise. In fact, a request for "foo" matched against "foo.html.en", "foo.html.fr", "foo.ps.en", "foo.pdf.de", and "foo.txt.it" would do just what you expect: treat those suffixes as a database and compare the request to it, returning the best match. The languages and data types share the same suffix name space.
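
One way to picture the suffixes as a database is the following Python sketch. The two tables mirror the AddLanguage and LanguagePriority lines above; the TYPE_SUFFIXES table is a hypothetical stand-in for whatever AddType mappings the server has, and the selection rule (the client's languages first, then LanguagePriority order) is a simplification assumed for the example.

```python
ADD_LANGUAGE = {"en": "en", "fr": "fr", "de": "de",
                "da": "da", "el": "el", "it": "it"}   # suffix -> language
LANGUAGE_PRIORITY = ["en", "fr", "de"]

# Hypothetical stand-in for the server's suffix-to-media-type table.
TYPE_SUFFIXES = {"html": "text/html", "ps": "application/postscript",
                 "pdf": "application/pdf", "txt": "text/plain"}


def describe(filename):
    """Split 'foo.html.fr' into (base, media_type, language)."""
    base, *suffixes = filename.split(".")
    media_type = language = None
    for suffix in suffixes:
        # Languages and types share one suffix name space, so each
        # suffix is simply looked up in both tables.
        if suffix in ADD_LANGUAGE:
            language = ADD_LANGUAGE[suffix]
        elif suffix in TYPE_SUFFIXES:
            media_type = TYPE_SUFFIXES[suffix]
    return base, media_type, language


def pick_language(filenames, accepted):
    """Prefer the client's languages in order, then fall back to LanguagePriority."""
    by_language = {describe(f)[2]: f for f in filenames}
    for language in list(accepted) + LANGUAGE_PRIORITY:
        if language in by_language:
            return by_language[language]
    return None
```

With foo.html.en and foo.html.fr available, a client preferring French gets foo.html.fr, and a client stating no usable preference gets foo.html.en because en comes first in LanguagePriority.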

Note that this machinery only comes into play if the file which the user attempted to retrieve does not exist by that name; if it does, it is simply retrieved as usual. (So, someone who actually asks for foo.jpeg, as opposed to foo, never gets foo.gif).

