OCR and the Vestigial Aesthetics of Machine Vision

03 Jan in conference, mla, mla13, ocra

Today, I presented a paper at the 2013 MLA Convention as part of a special session on Reading the Invisible and Unwanted in Old and New Media along with Lori Emerson, Paul Benzon and Mark Sample.

My talk was about the font OCR-A, and it’s part of some research I’m doing as a bridge between my dissertation and a current fellowship project that should turn into a book.

My slides are a PDF, which I’ll embed first below. Again, this is ongoing research and thinking, so I certainly appreciate comments, questions, suggestions.

My paper today is about a typeface and font, OCR-A, [slide - block] and the link between its contemporary uses in design and the context from which it originated in the 1960s. The occasion of this panel, a media archaeological approach to “the invisible and unwanted”, provides two convenient frames from which to explore this linkage. After considering how these frames apply to the aesthetic vestiges of machine reading, I’ll provide a brief history of OCR-A’s engineering and then conclude by exploring the significance of some examples of its use in popular culture.

First, from the “invisible and unwanted” theme, invisibility is an ironically appropriate concept when discussing typography, since type operates within that particular sub-visibility of graphic design whereby it can be said to work best when it is noticed least. As Beatrice Warde puts it in her exquisitely Modernist essay, “The Crystal Goblet, or Printing Should Be Invisible”: [slide - quote plus cartoony image]

The book typographer has the job of erecting a window between the reader inside the room and that landscape which is the author’s words.

OCR-A is, in this sense, a rather “opaque” font, but its uses outside of actual optical character recognition eclipse the specific circumstances constraining each of its letters, working toward the obfuscation of Warde’s romanticized segregation of form and content.

A second useful frame that this paper benefits from is that of media archaeology, which is, as I understand it, a way of doing media theory with a historical perspective toward the technical and social conditions that make media culture possible in specific moments of history. For OCR-A, those conditions and necessities are one answer I’m proposing to the question that began my interest in the font: [slide - AoVG example] why do I, someone who studies video games, see it so frequently, especially in situations meant to create an association with older games? In fact, since OCR-A’s purpose is to make alphanumeric characters printed in ink more reliably readable by data processing machines, it was not designed for screen use as in a videogame, and it was rarely, if ever, used in videogame packaging or arcade cabinet design until at least the 1990s. In other words, its retrogame association is an anamorphic and anachronistic look backward at a history that never was. One possible explanation for this link is that the history of OCR-A’s development, and the way its design emblematizes that history, may reveal a patterning toward obsolescence characteristic of how we tell stories about the futures of technology’s past.

The basic processes and applications of Optical Character Recognition technology are familiar: a digital scan of a page image is examined by software that identifies specific graphemes and converts them into some digital encoding such as ASCII. This is the workflow by which, for example, Google Books is converting “all of the world’s knowledge” into an indexical format, ready for searching, collecting and n-gramming. [slide - recaptcha]
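The glyph-matching step at the heart of this workflow can be sketched in a few lines. The toy recognizer below is a deliberate simplification (period OCR hardware used optical masks and feature logic, not Python, and the bitmap templates here are invented for illustration), but it shows the basic principle of comparing a scanned glyph against stored reference shapes and keeping the best fit, and it hints at why a font whose characters are maximally distinct from one another makes the machine’s job easier:

```python
# A toy sketch of template-matching character recognition. This is a deliberate
# simplification (real OCR systems use optical masks, feature extraction, or
# statistical models), but it illustrates the basic principle: compare a
# scanned glyph against stored reference shapes and keep the best fit.

# Hypothetical 5x3 pixel templates for three easily confused characters,
# the same trio the OCR-A standard works hard to keep distinct.
TEMPLATES = {
    "1": ["010", "110", "010", "010", "111"],
    "I": ["111", "010", "010", "010", "111"],
    "l": ["100", "100", "100", "100", "110"],
}

def score(scan, template):
    """Count how many pixels of the scanned glyph agree with the template."""
    return sum(s == t
               for scan_row, tmpl_row in zip(scan, template)
               for s, t in zip(scan_row, tmpl_row))

def recognize(scan):
    """Return the character whose template best matches the scanned bitmap."""
    return max(TEMPLATES, key=lambda ch: score(scan, TEMPLATES[ch]))
```

The narrower the pixel difference between two templates, the less noise it takes for a smudged scan to be misread as the wrong character. Distorting letterforms so that every pair of characters differs in as many pixels as possible is, in effect, what the standards committee discussed below was doing by hand.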

This all generally works well enough that, from a readerly point of view, we only get to contemplate the process at all by way of its failures, its excess. This is the point at which its byproducts have the potential to become aesthetic artifacts, as in the example of ReCAPTCHA, the OCR-based shibboleth service that crowdsources OCR correction. [slide - perhaps pizzas] In ReCAPTCHA, the juxtaposition of two barely-readable words can be provocative or amusing, enough to provide punchlines for webcomics or an entire saga of Inglip that interprets reCAPTCHA word pairs as commands issuing forth from some dark master.

[slide - art of google books] Extending into a similarly liminal subjectivity, a tumblr blog curated by Krissy Wilson showcases “The Art of Google Books,” appropriating the errata of Google’s OCR with a new aesthetic sensibility that shows the traces of unnamed owners, donors, readers, or archivists, those who preside over the twilight of the book’s material specificity. Like the phenomenon of captcha art, this “adversaria” or miscellany of Google’s 20 million or so books is interesting only when the otherwise silent process of text encoding becomes too noisy to remain invisible. Error, or some other mitigation against the one-to-one conversion of text to data, intervenes, and that error is usually human. These artifacts and others succeed by exploiting the oddness that lies at the threshold between machine and human subjectivity, positioning the latter as the Other of the former.

[slide?] A 1971 handbook on business applications of Optical Character Recognition technology acknowledges that, “Each day it becomes more evident that our large and complex computers do not operate in a void but in a social environment where they communicate with human beings as well as with other data processing equipment.” The book elaborates that within the data-processing domain there is a human-computer problem where, of the two parties involved, the problem really lies with the human, who is slow and prone to error. OCR-A was the product of an attempt to solve that problem through a standardized typeface.

The earliest demonstrations of optical typographic character recognition date back to the late 19th century, [slide - photoprinting apparatus] and a 1914 patent for a “Photoprinting Apparatus” is the first practical demonstration of something like modern OCR in an industrial context. Large-scale data processing emerged in scientific and military domains with the development of mainframe computing through the 1940s, with much larger datasets then becoming available with business implementations. By the early 1960s, OCR advocates and engineers like Jacob Rabinow, who had done some work with Vannevar Bush’s Rapid Selector microfilm reader, were calling for an industry standard to support the development of the technology and the emerging secondary industry in providing turnkey OCR solutions. By this time, computers were being used to process, calculate, and track more and more information, and standardized OCR leveraged the durability of ink and paper to replace more fragile systems like Hollerith punch cards.

The typeface we now know as OCR-A took shape in the mid-1960s, with implementations as early as 1966 at Reader’s Digest magazine, a felicitous irony for this essay about legibility, culture and human subjectivity. [slide - scan of x3.17] The full type standard was released in USASI document X3.17-1966, wherein a committee of the United States of America Standards Institute (later, the American National Standards Institute [ANSI]) specifies the font in minute detail, down to relative stroke widths and ink smudge tolerance. Because USASI’s province is technical, it is clear throughout the document that, although a human reader is implied, the primary reader intended for text printed in this font is the OCR reading device itself. [slide - detailed 3] Thus, the shapes of each character become constrained and distorted by the affordances of the reading machine’s resolution and tolerance for error. [slide - comparisons] Great care is taken, for example, to distinguish between otherwise similar characters such as capital I, lowercase l and the numeral 1, or between O, 0 and D.

The result is a font that mixes serif with sans-serif characters, shifts its axes and internal symmetry arbitrarily, and is easy to recognize by its acute angularity. In these ways, OCR-A is an ugly font, but of course that ugliness, a human consideration, makes it less “invisible” than, say, Helvetica, in a way that links its technical function with its use in graphic design. Note how this inverts the situation recognized in The Art of Google Books, where the human element creates the error that provides an aesthetic remainder. Here it is precisely the characteristics of OCR-A that are NOT intended for humans that comprise its secondary aesthetic function.

[slide - frutiger] Seeking a middle ground, type designer Adrian Frutiger received a commission from the European Computer Manufacturers Association (ECMA) to create an alternative to OCR-A. Writing in a typography journal in 1967, Frutiger declared that OCR-A was “barely readable” and “offensive to human taste.” [slide - ocr-b sample] His solution, the OCR-B font, while still a monospaced font shaped by technical constraint, is much gentler and more internally consistent, mainly because it replaces acute angles with curved lines. Like OCR-A, OCR-B is still attuned to the affordances of machine resolution, with a tolerance for smudging and other vagaries, but its design documentation and advocacy make the strong case, often in somewhat strident tones, that human readership is a much higher priority than was the case for OCR-A.

Perhaps it is Frutiger’s stridency on behalf of OCR-B that led Jacob Rabinow, writing for the industry journal Datamation in 1969, to speak out in defense of the A standard. He first notes that, so far as he knew, no one had yet been injured by OCR-A, nor had anyone required psychiatric consultation. Curiously, Rabinow then sidesteps a more technical description of OCR-B’s shortcomings and instead calls into question the supposed aesthetic superiority of OCR-B. [slide - quote - illustrate with serifs/chisel theory]

The esthetics of characters vary with time and place in history. The serifs which we know today are based on something that happened in the Roman times due, some believe, to the problems of chiseling into stone.

[slide - camera-type - car] Turning to a broader consideration of how shapes enter folklore, Rabinow explains how the shutter mechanism of Graflex-type cameras, with its line-by-line exposure of fast-moving subjects like automobiles, created the photographic impression that they were leaning forward. [slide - quote]

This distortion has become so ingrained in our conscience that all cartoonists draw cars leaning forward when they want to indicate speed and the windows in our buses have their vertical lines tilting forward, and for the same reason. It is interesting to think what would have happened if the Graflex camera shutters moved up instead of down.

Tying up the implications of this reflection, Rabinow concludes, [slide - streamlined toaster and washers]

Suppose the aerodynamic drag were lower for blunt, square objects — what would have happened to the streamlining of our toasters and washers?

In other words, as technical functions in one domain become vestiges for aesthetic paradigms in other industrial design domains, what we see here in 1969 as a defense of OCR-A’s viability becomes an articulation of the specific patterns William Gibson would later give the name “Raygun Gothic” in his 1981 story, “The Gernsback Continuum”, which is anthologized in his Burning Chrome collection [slide - cover].

OCR-A and OCR-B appear frequently on book covers, implying a thematic continuity that, among the titles here, could owe something to the coincidence of availability vis-à-vis my bookshelf [slide - covers] (it’s difficult, after all, to query a database of books for which typefaces are used in cover art). More likely, there is a recurring association created by these paratexts, [slide - gibson covers] particularly in the case of William Gibson’s cyberpunk works, which tend to express a range of ideas dealing with personal memory, the durability of storage media, synthetic subjectivity, and the obsolescence of design, all of which could also correlate to the specific cultural, corporate, and commercial situations that created the need OCR-A met in the 1960s.

To continue the cyberpunk theme, OCR-A actually plays a role in a significant plot point in The Matrix. Its presence could have something to do with the obvious influence of Gibson’s fiction on the general look of the film, or it could be a bit of subtle foreshadowing when, in the first act, Agent Smith sits across a table to interrogate Neo. [slide - matrix] If you’ll recall, this is the first scene in which viewers might suspect that something is well and truly not right with the world, and as a barely noticeable hint of what that might actually be, the file Agent Smith reads from is printed entirely in OCR-A. That is, a font created to be read by machine eyes first and human eyes second is here being perused by a machine that has been masquerading as a human inside of a machine-created simulation.

Each of these applications of OCR-A, from the functional to the aesthetic, accumulates associations and resonances that tie together the cultural, material, and economic conditions that brought it into being as well as those that justify its inclusion as a thematic design element. These contexts include the increasing accessibility of minicomputers like the PDP-1 used by MIT students to program Spacewar! in 1962, just a couple of years before Reader’s Digest began rolling out its OCR system. The relationship between videogame history and OCR-A reflects this historical correlation as well, as demonstrated in The Art of Video Games exhibit at the Smithsonian American Art Museum.

Already by the late 1960s and early 1970s, OCR-A had joined other systems like barcodes and Magnetic Ink Character Recognition (MICR) in adding an informational layer to day-to-day reality, particularly in retail environments. But unlike ubiquitous barcodes and the MICR font E-13B that still appears on paper checks, OCR-A drifted much more readily out of the functional domain and into graphic design, as scanning hardware improved image resolution and software obviated the necessity of a strictly stylized alphabet. Though there are still applications for OCR-A, one is much more likely to encounter it in cultural contexts, where its stylization signifies the emergence of contemporaneous media technology. Thus, OCR-A provides a clear example where the aesthetic vestiges of function recall the historical situation that made it both possible and necessary.

Comments

zack

Fascinating

This is really riveting stuff. Slide 12 especially elucidates the absolute brilliance and clarity of the font in a way I had never considered. Thanks.

zach


Thanks for the comment! I agree, OCR-A is interesting, and it’s the kind of thing that’s so unmistakable once you know what it is, you start seeing it everywhere. There’s a commercial for Windows 8 (I think) which includes some on-screen OCR-A in the opening shot, for example.

Mary M.

human error?

Hi Zach
I enjoyed this piece but I’m confused by your references to OCR errors as human: e.g., “Error, or some other mitigations against the one-to-one conversion of text to data intervenes, and that error is usually human”; as well as your reference to the Art of Google Books as “appropriating the errata of Google’s OCR.” There are two types of errors in play here: capture errors (an obstruction or failure in the photographic or capture process); and OCR errors. If a human hand gets in the way of the camera as it photographs a book page, that’s a capture error (and the OCR of course will produce garbage from it). The errors that most significantly bedevil mass digitization, however, are not those from a bad photograph (Google scans books twice for this reason), but those created by OCR software itself, which is imperfect and results in often quite error ridden (and largely unseen) text files.

So, I just wanted to question those two phrases above by pointing out that OCR errors are “machine” errors and not “human”; and that the Art of Google Books blog showcases capture errors but not OCR errors (at least as far as I know from my occasional visits to it).

Mary Murrell
murrell3@wisc.edu

zach


Mary,

Thanks for reading and thanks for your comment. I suppose one could make the semantic argument that an OCR error ultimately owes its failures to some of the properties of a letter that make it more legible to humans, thus the software could blame the human for what it can’t recognize … but these are just semantics.

The distinction you point out is important, and I think in my paper here I’m glossing over something useful, as you rightfully point out.

I agree that Art of Google Books’ artifacts are typically related to capture error, and there’s a whole other category too of things that just look odd, like bad moire effects or image bleedthrough on illustrations. These aren’t really anyone’s “fault”, nor are they tied specifically to a question of encoding or readability, so I wasn’t sure how to fit them into this paper.

Do you know much about reCAPTCHA? Do you think those artifacts would mostly be from software error?

Mary M.

recaptcha

I know a bit about reCaptcha because it started with the Internet Archive OCR (I did fieldwork at the Archive) as well as the NYT. OCR on older books (and newspapers) is much more “errorful” than more recent books; the older the material being captured, the more OCR errors. OCR on modern books is considered “very good” but even at 99% accurate, the errors exceed modern publishing standards (ie. errorlessness). At any rate, when “solving” the recaptcha, you are seeing the image that the machine “saw” and “recapturing” it by reading it as a human (which means here “correctly”). And, in so doing, you are proving, as was also the purpose of the original Captcha, that you’re a human and not a bot.