User Interface Ideas

Note: None of the below concepts has been coded at all, and I don't plan to start coding it for at least a few months (maybe a few years). I am, however, always ready to discuss these concepts and others with whomever's interested.

A recurring theme in Smalltalk software development, regardless of which Smalltalk dialect, has been the separation of model and view. Many refinements of this are possible: Sometimes the input and output aspects of the view are separated into two distinct roles, fulfilled by distinct objects. Sometimes there are multiple layers of abstraction surrounding the models (application model vs. domain model), both of which are useful but have different levels of coupling to a particular set of presentation options.

The proposed experimental paradigm for Avail is a bit more extreme than these "traditional" schemes. The separation of these layers of abstraction lends great flexibility to a system in which one wishes to construct a user interface. Unfortunately, the choice of medium for a user interface always seems to overconstrain the interface. For example, a touch-tone telephone interface makes it difficult to do anything but push buttons on a physical phone. Similarly, when talking to a human operator on the telephone, it's often excruciatingly awkward to have to spell out text that could just as easily be transmitted by some other means to the operator (e.g., a separate data channel, or even a fax-during-voice protocol). The fact that none of these protocols is designed with the others in mind should not necessarily make them incompatible with each other. It's only the assumptions of completeness of knowledge in these protocols that prevent their interoperability. For example, a digital phone should be able to receive non-voice data, even if it can't do anything useful with it. A windowing system should be able to deal with arbitrary pointing devices (3 physical dimensions should suffice, but 6 degrees of freedom per pointer is more appropriate). The current 2-dimensional, at most 3-button interfaces seem chronically underdesigned, in my opinion. Nonetheless, this layer will widen as more unorthodox pointing and positioning systems hit the mass market (e.g., Virtual Reality).

Ah, but getting back to Avail. In order to not be confined by the technologies of the day, I require the presentation environment to be a layer that is (virtually) independent of the interfaces that are expected to be attached to it. Current interfaces are tightly unidirectionally coupled to their media - a touch-tone phone UI will always be a touch-tone phone UI. Any attempt at presenting this UI through another medium will either require a major redesign of the interface, or worse, will look like a clunky emulation layer. When was the last time you tried to navigate a touch-tone interface with a pulse-dial phone? There aren't that many out there, but you'd think it wouldn't be particularly hard. But you'd be surprised how hard it is to make an interface work using only digits, with neither asterisk nor octothorpe. A tough problem, to be sure.

But let's restrict our attention to media that are a bit more expressive than digits with no punctuation. Like Turing machines, a reasonable Avail medium should be "Medium Complete", or a "Complete Medium"; that is, able to emulate all other "Complete" media. Even a pulse dial telephone medium can emulate all other media, but it's probably impractically inconvenient to have to enter all numbers in base 8 (or some other encoding/delimiting scheme). Let's call a medium that's this hard to use a "Marginal Medium". This is a subjective boundary, of course, but let's be fairly generous with what is still considered a Complete medium, at least in theory.
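
To make the cost of a Marginal Medium concrete, here is one possible digits-only scheme, sketched in Python. The function names and the choice of 9 as a delimiter are assumptions made purely for illustration; nothing here is part of an Avail design. Each character is sent as its code point in base 8, followed by a 9 to mark the end of the character.

```python
def encode_digits(text):
    """Encode text as a string of digits that a pulse-dial phone could send.

    Each character's code point is written in base 8 (digits 0-7 only),
    and the digit 9 is used as a delimiter that ends each character.
    """
    return "".join(format(ord(ch), "o") + "9" for ch in text)


def decode_digits(digits):
    """Invert encode_digits: split on the 9 delimiter, parse each octal group."""
    groups = digits.split("9")[:-1]  # the piece after the final 9 is empty
    return "".join(chr(int(group, 8)) for group in groups)
```

Under this scheme even the two-letter message "Hi" takes eight dialed digits, which is the kind of inconvenience that makes such a medium complete in theory but marginal in practice.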

So my primary idea for Avail is a scheme under which Media are very much decoupled from Interfaces. They are attached during a session or interaction, but this is associative and temporary. This is a far cry from today's "manually painted" window layouts. I believe the widespread use of these window painters has prevented alternative graphical presentation schemes from being explored commercially. So far it hasn't mattered much, but with all the new Media of interaction appearing in recent years (Virtual Reality, Graffiti and handwriting recognition, speech recognition, collaborative groupware, integrated telephony/data, rich internet media streams, etc.), we should re-examine the current situation.

So how do we avoid the manually painted screens that dominate current presentation methodologies? By abstraction, of course. Not meaning "creating something unimplementable" as some people may mean abstraction, but the more precise sense, "removing the common features into something that can be shared and replaced". The concepts we abstract in this case are the presentation rules. When we're building an interface we may know that the decisions made in one part influence the choices available in another part. This usually leads to a "flow" from top to bottom, left to right (depending on language and cultural directionality, of course, which I have never seen modelled correctly in a user interface). This "flow" is really the result of extracting a presentable spanning tree from the directed graph of dependence information, most of which is left implicit in the specification of the interface. If instead of capturing the spanning tree and throwing out the graph, we do the opposite, we will end up with more information. This additional information should be enough to allow the computer to apply automated layout algorithms. Let's try an example:

Suppose we are writing an application for an insurance company. At some point the application must provide a means for the user to enter information to search for a particular policy. The user can search by policyholder name, policy number, or assigned agent (insurance agent). Let's say we want the user to be explicit about what type of search they want performed. Then we have something like:

Choose search method:
   * by policyholder name ->
      * allow entry of policyholder name
   * by policy number ->
      * allow entry of policy number
   * by assigned agent ->
      * allow selection of assigned agent
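
The outline above can be held as a directed dependence graph, with the indented "flow" derived from it on demand rather than stored. The following Python sketch shows one way to do that; the node names, the `DEPENDS` table, and both function names are hypothetical, chosen only for this example.

```python
from collections import deque

# Hypothetical dependence graph: each edge points from a decision
# to the parts of the interface that the decision enables.
DEPENDS = {
    "search_method": ["by_name", "by_number", "by_agent"],
    "by_name": ["policyholder_name_entry"],
    "by_number": ["policy_number_entry"],
    "by_agent": ["agent_selection"],
    "policyholder_name_entry": [],
    "policy_number_entry": [],
    "agent_selection": [],
}


def spanning_tree(graph, root):
    """Breadth-first spanning tree: maps each reachable node to its parent."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in graph[node]:
            if child not in parent:  # keep only the first edge that reaches a node
                parent[child] = node
                queue.append(child)
    return parent


def outline(graph, root):
    """Render the spanning tree as an indented outline, one line per node."""
    parent = spanning_tree(graph, root)
    children = {node: [] for node in parent}
    for node, p in parent.items():
        if p is not None:
            children[p].append(node)
    lines = []

    def walk(node, depth):
        lines.append("   " * depth + node)
        for child in children[node]:
            walk(child, depth + 1)

    walk(root, 0)
    return lines
```

Calling `outline(DEPENDS, "search_method")` reproduces the top-to-bottom flow, while the graph itself stays available to whatever layout algorithm a given Medium prefers.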

There are several ways this could be presented to the user. Maybe all the data entry areas are visible at once, with some of them disabled based on 3 radio buttons. That would take a lot of space on the screen, but there may be plenty on the particular Medium instance on which the Interface is running. Note that this is not known precisely at implementation time, and may even vary widely among the intended targets. If these choices (and their immediate consequences, the available data entry areas) are not made visible simultaneously, the user may have to "navigate" too much to find out what options are available. If, on the other hand, these three options are all presented, there may not be enough space on the presentation medium (the screen or window) to see the options without some awkward scrolling (or worse, there may be unscrollable clipping). Since the parameters of the final deployment should strongly influence this choice of presentation, it should be deferred until final deployment (runtime). The medium (or the combination of medium and current user) should have a set of preferences or rules associated with it. These rules control how an Interface is deployed on a Medium. Although this may seem complex, these rules are the same ones an interface designer would have to apply manually during design. The difference is that the interface designer would have to make unwarranted assumptions about deployment during implementation. This might work if the system is only intended to be deployed on completely uniform Media, but as soon as someone plugs a monochrome monitor (or a colorblind eye) into the interface, all the rules have to change.

Besides color, one also has to take into account language and cultural directionality, available screen resolution, pointing mechanisms (why should a trackpad always emulate a mere mouse?), availability of auditory channels (beep), availability of alternative input mechanisms (Dvorak keyboard, handwriting, voice), speed of the output (HTML page over a slow link vs. a single fast machine), etc. Clearly, making all these decisions at design time or implementation time creates a brittle, overspecified interface.
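
As a rough illustration of deferring the choice to runtime, the Python sketch below picks a presentation for the search example from properties of the Medium it lands on. Everything in it is an assumption made for the example: the `Medium` class, its two attributes, the three presentation names, and the row arithmetic. A real rule set would have to weigh many more of the factors listed above.

```python
class Medium:
    """Hypothetical description of a deployment target, known only at runtime."""

    def __init__(self, rows, has_color):
        self.rows = rows            # lines of text visible without scrolling
        self.has_color = has_color  # False for a monochrome display


def choose_presentation(medium, option_count, rows_per_entry=2):
    """Pick a presentation for a one-of-N choice where each option has an entry area."""
    rows_for_everything = option_count * (1 + rows_per_entry)
    if medium.rows >= rows_for_everything:
        # Plenty of room: show every option with its entry area,
        # disabling the areas that belong to unselected options.
        return "all_visible"
    if medium.rows >= option_count + rows_per_entry:
        # Show all options, but only the entry area of the selected one.
        return "options_with_single_entry"
    # Very little room: ask for the method first, then the entry in a second step.
    return "stepwise"


def emphasis_style(medium):
    """Decide how the selected option is marked on this medium."""
    return "color_highlight" if medium.has_color else "text_marker"
```

With three search methods, a medium offering 24 rows gets `"all_visible"`, one offering 6 rows gets `"options_with_single_entry"`, and one offering 4 rows falls back to `"stepwise"`. The Interface description is the same in all three cases; only the rules attached to the Medium differ.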

So the goal (in Avail) is to delay as many of these decisions as possible, to the point of final execution. So far I've only described the rationale (and hinted at the scope), but I would like to hear other people's ideas on this subject. Please fax me a barcode dump of a UUEncode of a WAV audio file recording your comments (I was just kidding, but see if you can figure out a simpler way of sending a voice message over a telephone wire). Anyhow, I want to focus on Media types and the various kinds of Presentation rules. Also, the representation of an Interface is a good thing to start nailing down early on.



(This page was last updated March 26, 2000)
