A chapter from my 1997 master's thesis Interface; or, The Modern Pygmalion.

Ontology of the senses in interface

In the womb, the child feels and hears the beats from its mother's chest, from its own developing heart. The daily activities of the mother, such as walking, eating, exercising, and talking, vibrate through the amniotic environment into the child's ears and body. With eyes closed, and fluid supporting them, infants in utero perceive existence as the bodily incorporation of life's transmitted energies. At the very inception of our time on earth, our universe is vibration and sound; only later are we thrust into the world's bright cacophony.

Our relationship to our senses is shaped by more than the basic utility each holds for navigation, interpretation and manipulation of the physical world. In addition to these functional attributes, there are affective qualities to our modes of perception which exceed the basic functions of these modes. These differences between the various perceptive abilities shape the development and use of computer interfaces. Therefore we must attempt to understand these differences, not only for the sheerly practical advantages of doing so--such as enabling more individuals to have access to materials embedded in machines--but also for ensuring that the complete variety of human experiences and relationships can be explored in this new medium from near the outset of its development[1].

Throughout the preceding sections, I have alluded frequently to the senses of sight and hearing. Unlike taste, touch and smell, which are enabled by direct contact with molecules of the entity sensed, these two senses depend on the transmission of energy through a medium, a method which best facilitates remote communication. As such, hearing and sight are the primary senses for navigation of the physical world, and are the most commonly used modes of human/computer communication. But while these two sense channels are most widely used, they are not equal to one another in cultural value, nor are they used toward ethically comparable goals. An illustration from the earliest days of computer interface development can serve to point our reading toward the distinct privileges accorded to one sensory channel over another, and show how this difference is solidly mapped along lines of sexuality and gender.

In his 1960 book Art and Illusion: A Study in the Psychology of Pictorial Representation, E. H. Gombrich names the artist's romance with the image (and a chapter of his book) "Pygmalion's Power." As he points out, the Greek philosophers described the artistic function as "imitation of nature" yet "their own mythology tells of an earlier and more awe-inspiring function of art when the artist did not aim at making a 'likeness' but at rivaling creation itself." Artistic creation, specifically visual representation, occurs within the tension of this difference. To create a visual representation is in some sense to create the thing itself, yet the desires accompanying the looking-desire cannot be met.

The story of Pygmalion bridges this tension by portraying a man who succeeds in taking possession of his creation, the object of desire. "Without the underlying promise of this myth," Gombrich asserts, "the secret hopes and fears that accompany the act of creation, there might be no art as we know it." Gombrich supports his claim with examples from art history; Daumier, Donatello, even da Vinci have fallen under a Galatea's spell. Artistic creation, in Gombrich's terms, is fully a function of masculinist heterosexual desire. If a different myth inspires the work of artists other than the heterosexual male, it is (quite understandably) never addressed.

In 1993, when software designer David Canfield Smith writes in Watch What I Do: Programming by Demonstration about his 1975 development of the first graphical user interface, he cites the work of Gombrich, and through him of Lucian Freud:

[The] urge to create something living is common among artists. Michelangelo is said to have struck with his mallet the knee of perhaps the most beautiful statue ever made, the Pieta, when it would not speak to him. And then there's the story of Frankenstein. Artists have consistently reported an exhilaration during the act of creation, followed by depression when the work is completed. "For it is then that the painter realizes that it is only a picture he is painting. Until then he had almost dared to hope that the picture might spring to life." This is also the lure of programming, except that unlike other forms of art, computer programs do "come to life" in a sense.

Perhaps this lure is why Smith named his software, the first to employ graphic elements such as icons and menus, Pygmalion. His revolutionary software is a visual programming tool, "designed to stimulate creative thinking;" what better metaphor for the stimulating process of creation than that of Man molding his perfect object of desire?

Smith's choice of illustration for the article is equally representative of the link between the masculine science-making of interface art and the nature/ground of woman upon which it works.

 

While again found in Gombrich's book, the image is not from that chapter, which included two artists' renderings of Pygmalion's triumphant moment. Instead, Smith chose an image from Gombrich's later chapter, "The Analysis of Vision in Art." This section of the book details the history of artists' attempts to paint "what they see," whether that vision leads them to emulate "the real" or conversely to present a highly individualized style. Within this context, the Dürer illustration is a depiction of an extreme example of the former:

[The] art student[...]must find means of battling down his knowledge of the familiar meaning of things and look only at shapes and tones projected onto an imaginary plane. We have seen that he can break down the constancies only if he ceases to attend to the meanings of things. The need for the artist to become detached, to introduce an entirely different set of meanings, could scarcely be more drastically illustrated than in Dürer's woodcut of the painter and his frame.

Whether or not it is true that the successful artist learns to remove his libido from his process of visual creation, or even needs to, this clinical distance from the viewed object is the hallmark of the scientific process. Smith's software is, not coincidentally, built for the Pygmalions of this century. For Smith, the new scientist/artist is the programmer, and the visual object is not the end in itself but part of a process wherein creative scientists can more easily communicate:

If you put two scientists together in a room, there had better be a blackboard in it or they will have trouble communicating. If there is one, they will immediately go to it and begin sketching ideas. Their sketches often contribute as much to the conversation as their words and gestures. Why can't people communicate with computers in the same way?

Both scientists and traditional artists rely on the visual sense to transmit ideas because the basis of their respective crafts is observation, analysis, and control. This reliance points to a subtext that holds true for both, which is obscured by the insistence of one upon "expression" and the other on "rationality." The subtext is the source of my inquiry in this section: If visual production is simply the most utilitarian, common-sense means for accessing and expressing the content of an interaction, why then is the recurring subtext of interface design (which rests at the intersection of visual design and computation) situated around the process of desire in visual creation, and of woman-to-be-looked-at as the proper end for the masculine creative/scientific drive?

 


Vision more than any other sense defines the parameters of interaction between subjects of unequal status[2]. With vision, an empowered actant can perceive another being, a disempowered entity, without acknowledging its status as an actant or potential actant. Even more to the point, vision, coupled with higher social status (money, race, gender), grants the power of creation to the seer, enabling them to change an actant into an object of attention without the complicity of that actant. This outlook is not confined to the world of aesthetics. Vision is the privileged sense of science, ennobling the rational mandate for control over the natural world, which, like an Ingres odalisque, is assumed passive and presumed feminine[3]. As demonstrated above, the privileging of vision as both an interface element and as the primary metaphor of human-computer interaction is tightly linked with the gendering of computer interface.

Historically, it has been the masculine prerogative to see, and the female to be seen[4]. Yet this dualism is neither natural nor necessary. In fact this split is so repeatedly threatened by the ever-recurring agency of actual females that there have been constant attempts to stabilize this relationship through cultural narratives[5]. One such story is Pygmalion: Western culture's reification of the masculine prerogative to see, desire, and thus create, and the feminine "right" to be beautiful, desired, and cherished. By framing the ability to "be seen", a passive ability, as a privilege, those who see, and thus create, divert the critique that might have been directed at themselves onto others--who can only hope to gain power by being an object of attention[6]. Meanwhile, the status of the seer is elevated to that of creator, of God.

The roles that these senses play go even deeper than their historically-acquired meanings would suggest. The metaphorical foundation of the senses lies in their implied relationship between an individual consciousness and the world. The diagram below imparts what I believe to be the basic relationships implied by sight and hearing, to the Western mind.

To be seen, an entity need merely exist; but to be heard, an entity must make sound, must move. Sound is intimately a result of physicality and of motion, and in the case of many entities, of life[7]. As a sense directly linking an individual to its body, in relation to other objects, sound interaction implicates vulnerability. Our body creates sounds over which we have no control and of which we are often painfully aware. The creation of speech, music, and dance is in every case a performative, embodied behavior.

For sound to occur, something must vibrate. As such, sound is a sensory experience based on motion and energy on a human scale. To hear a sound, one feels it; the deaf can still dance. And as we perceive sound phenomena, the motion and energy creating it always have a specific, [earth-bound] source. To hear something is to become aware of the creator of the sound. One cannot exchange sounds with another and have either remain simply an object. Sound reminds us far too much of the other's sentience, the other's will, for any such imbalance of agency to remain in place.

In short, vision identifies existence, while hearing identifies action. Because it does not require the recognition of the object of attention as an entity with subjecthood, vision can be "objective," can distinctly separate the ontologies of the empowered and the abject; for the reverse reasons, sound cannot. If communicants hear and are heard, both parties have positive acknowledgment of one another's agency. If everything that is heard is understood to have agency akin to one's own, then moral choices will be based on this presumption. Nowhere are these choices more pertinent than in media defining themselves as "interactive."


Interface thinkers in recent years have emphasized the need for more attention to sound[8]. But in the implementation of most interface projects, even experimental ones, sound is given second (or later) billing to the visual. Feedback enhancement, which is a reinforcement of primarily visual cues, is how sound is most often used in standard applications. An example of this in the Macintosh interface is "WindowShade," in which a double-click on a document's title bar not only "retracts" the document but issues a satisfying "schwup" which signals that the window is being "sucked into" its title bar. Such use attempts to recreate the physicality of the "real world" by modeling commonsensical physical behaviors of conceptually similar objects. Why is sound used mostly as reinforcement for behaviors largely understood through visual metaphors?

The development of computers has so far consigned them to a narrow band of functionality; to date the primary purpose of computer software is to facilitate capital-driven production. In our culture, productivity--especially of the corporate or scientific sort--is largely manifest in visual media, such as a typed document, a three-dimensional rendered space station, or a fourth quarter spreadsheet. Visual media are the most privileged channels for extending human instrumentality into the world. Computers, as a particularly rarefied manifestation of instrumentality (whether of nations warring against one another or of science's manipulation of "natural resources"), continue and indeed intensify this motivation as they intensify all behaviors which they incorporate.

The underlying philosophy behind instrumentality, productionism, tends to reduce social actants along tightly binary structures of actor and acted-upon. In such cases, it is definitive of the structure that the acted-upon is of lesser social status and has less effect upon the outcome of the interaction. The actor is the defining entity in the interaction and treats the acted-upon as if it had no agency, or as if its agency were only that which enables it to serve as efficient means to achieving the actor's ends[9]. In a cultural mapping which saw its first Western manifestation with the ideal forms of Plato, and which picked up speed with Descartes' mind/body problem, the inequity is gendered in these highly divided dualistic structures. The actor, possessor of the mind, is coded masculine and the acted-upon, subject relegated to the body, is feminine.


Sound represents an alternative modality, metaphor, and ethic to sight. Unlike vision, the chosen sensory modality of the utilitarian ethic, sound does not generally extend human agency monodirectionally into the world. Instead, sound implies bidirectional affect. What can hear can both be affected and create sounds which will affect in return. For interaction theory, the metaphor of sound buttresses an ethic which recognizes co-participants in an action rather than an active subject and passive object.

Sound also connects us to sensory and cultural phenomena which can counter the rational. In his essay "Myth and Music," Claude Lévi-Strauss suggests that the rise of formalized music in the West parallels the diminishing acceptance of the narrative effects of myths in those cultures. While "rational" systems of thought, like science, took over the origins-seeking place of mythology in the modern consciousness, the continued subconscious need for deep narrative structures was filled by the musical composer. While it is clear today that many different narrative traditions besides music have taken the place of religious mythology in our secular culture, music remains integral to our emotional comprehension of a narrative (witness a movie with the sound turned off). By its connection with emotion, as contrasted with vision's ruse of logic, sound falls on the silenced side of the dualistic divide. Yet a privileged pathway to emotion holds significant interest; as Barbara McClintock's development of "a feeling for the organism" suggests, the practice of science and the possession and expression of emotion, or empathy, need not be incompatible. Indeed, their compatibility is the only route for cyborgs who engage in earthly survival.


It is clear why research in audio interface might be valuable. But we must not overlook sound as an alternative interface metaphor. Taking up that metaphor could allow us to break out of the binary logic of Cartesian vision, a logic which is not simply about sight but rather has significant implications for the relationship of Western culture to the natural world. In the "real world" of contemporary digital practice, vision cannot be abandoned as an interface element, nor productionism as an interaction ethic. Forgoing both temporarily, however, will allow us to detect the inherent bias of human-computer relations that we normally mightn't notice.

Because the history of production-oriented computer interface creation has been skewed largely toward visual interaction, examples of audio interactions and metaphors are rare. Aside from increasingly sophisticated dictation programs which translate the auditory into the visual, most sound-based software is music editing software. When alternatives appear, they are generally in the form of "art," "education" and "game," fields in which outcome or goal is often secondary to experience itself. Therefore our exploration of alternatives immediately leaves the world of mainstream task-oriented software in an effort to echo-locate the nodes of resistance, the promise of monstrous noisemaking.


Footnotes

[1] Some, such as Weizenbaum, would argue, contrarily, that certain behaviors can never be undertaken by machines, and should not be attempted. In such cases, it is inevitably the more "feminized" behaviors such as intimacy and sympathy that are considered not only beyond the capability of computers, but beyond their ken as well. Yet computers and software will inevitably take on roles in human life that were previously filled by human individuals--they already have. If machines are developed to explore only one portion of the human experience, because it "isn't right" that they exceed the behaviors that we find most convenient, then their ability to interact with us will be limited to this abbreviated understanding of the nature of human life.

[2] This notion is particularly illustrated by Jeremy Bentham's "Panopticon," as elaborated by Michel Foucault in Discipline and Punish. The prerogative to observe someone while they cannot detect your observation, the ability to observe momentarily yet, through the threat of observation, maintain control constantly, is a foundation of the modern penal system.

[3] Evelyn Fox Keller's Reflections on Gender and Science is one important source in this line of research.

[4] For an elaborate examination of this issue, see John Berger's Ways of Seeing. Berger analyzes the history of art, particularly the painterly tradition in oil and contemporary advertising, to show how the stance of the visual subject, particularly of female nudes, is representative of the power of ownership of the purchaser/owner of the painting over the artwork as object and over the subjects depicted as objects of possession. His thesis is that this painterly tradition, beginning with the Renaissance and the rise of capitalism and colonialism, developed out of the need on the part of the wealthy art owner to solidify and elevate their way of life as proper, and more importantly, as beautiful.

[5] Mulvey 

[6] The grandmother of Frankenstein, Mary Wollstonecraft, commented upon this cycle in her treatise "A Vindication of the Rights of Woman":

The conduct and manners of women, in fact, evidently prove that their minds are not in a healthy state; for, like the flowers which are planted in too rich a soil, strength and usefulness are sacrificed to beauty; and the flaunting leaves, after having pleased a fastidious eye, fade, disregarded on the stalk, long before the season when they ought to have arrived at maturity. [...]

[Not] content with [strength, their] natural preeminence, men endeavour to sink us still lower, merely to render us alluring objects for a moment; and women, intoxicated by the adoration which men, under the influence of their senses, pay them, do not seek to obtain a durable interest in their hearts, or to become the friends of the fellow-creatures who find amusement in their society.

[7] Jean Piaget, the Swiss psychologist, studied the development of children, and as part of that inquiry, examined their understanding of the meaning of "alive." As told in Sherry Turkle's book The Second Self: Computers and the Human Spirit, children develop an increasingly sophisticated view of what is alive.

[8] Joy Mountford.

[9] It is notable here that even the language supports the divergence of agency; while "actor" is of common usage, the potential parallel form "actee" is not. The "object" of attention is linguistically erased, nullifying her potential for possession of independent will.

contact: moboid (at) moboid (dot) com