From the earliest cave paintings to the modern flat screen, we have become used to interpreting the reality of our three-dimensional (3-D) environment through flattened 2-D visual representations. Our brains automatically interpret the mathematical constructs of perspective to create the illusion of the third dimension: depth (usually designated as the z-axis, alongside the vertical y-axis and horizontal x-axis). Whereas early technologies could render scenes only in monochrome, we can now deliver millions of colours, yet support for the third dimension remains confined to a few niche areas. However, there are clear signs that this is about to change.

Game developers have increasingly borrowed cinematographic skills from the film industry to create ever more compelling, realistic and enticing environments in which game play unfolds. In parallel, the film industry has turned to the technology sector to support increasingly complex and realistic special effects, creating scenes that would otherwise be impossible, too expensive or too dangerous for our entertainment pleasure. These industries have shown us what can be created, leaving audiences increasingly indifferent to simpler 2-D rendering.

The de facto windows-and-mouse desktop metaphor is finally being challenged. Alternative desktops such as BumpTop are introducing 'piles' of icons and cubicle-style 'walls' on which documents and images can be posted, recreating the real-world 3-D working environment.

Multi-touch interfaces such as the Apple iPhone offer pinch and squeeze, swoosh and flick, and have set the expectations for many consumer devices. Other examples include gesture interfaces that do not rely on touch at all. Simple menu selection by gesturing is already appearing on consumer entertainment devices, such as those built on Canesta's 3-D sensors. At the high end, Oblong's 'g-speak' spatial operating environment shows what can be achieved, offering pixel accuracy in 3-D by interpreting hand movements.
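To make the pinch gesture concrete, here is a minimal sketch of the arithmetic beneath pinch-to-zoom, assuming the touch layer reports two tracked finger positions; the function names are illustrative, not any vendor's API.

```python
import math

def distance(p, q):
    """Euclidean distance between two touch points (x, y)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def pinch_zoom_factor(start_touches, current_touches):
    """Scale factor implied by a two-finger pinch/squeeze.

    >1.0 means the fingers moved apart (zoom in);
    <1.0 means they moved together (zoom out).
    """
    d0 = distance(*start_touches)
    d1 = distance(*current_touches)
    return d1 / d0 if d0 else 1.0

# Fingers start 100 px apart and end 150 px apart: zoom in by 1.5x.
print(pinch_zoom_factor(((0, 0), (100, 0)), ((0, 0), (150, 0))))
```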

The challenge is in translating natural body movements in 3-D into meaningful and intuitive commands to the application. In future we expect to see combinations of multiple inputs being used to build a richer and more powerful interface. For example, the recently announced Microsoft gaming interface (Project Natal) combines camera-based gestural input with facial recognition, (limited) emotion detection and voice recognition in a single system.
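Even a crude recogniser illustrates the translation problem: a noisy stream of 3-D positions must be reduced to a discrete command. The sketch below assumes hand positions arrive as metre-scale (x, y, z) samples and simply picks the dominant axis of motion; the threshold and command names are invented for illustration.

```python
def classify_gesture(samples, threshold=0.2):
    """Map a sequence of (x, y, z) hand positions to a discrete command.

    Measures net displacement along each axis and picks the dominant one;
    any movement below `threshold` (metres, say) counts as no gesture.
    """
    dx = samples[-1][0] - samples[0][0]
    dy = samples[-1][1] - samples[0][1]
    dz = samples[-1][2] - samples[0][2]
    axis, value = max(enumerate((dx, dy, dz)), key=lambda a: abs(a[1]))
    if abs(value) < threshold:
        return "none"
    names = (("swipe-left", "swipe-right"),
             ("swipe-down", "swipe-up"),
             ("pull", "push"))
    return names[axis][value > 0]

# A hand drifting 0.5 m to the right reads as a swipe-right.
print(classify_gesture([(0.0, 0.0, 0.0), (0.2, 0.01, 0.0), (0.5, 0.02, 0.01)]))
```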

For input devices, falling hardware costs, the space constraints of mobile devices and the desire for a less 'technology-oriented' interface in gaming have driven widespread adoption of six-axis accelerometers and inertial sensors in diverse applications. Examples include Nintendo's 'Wii-mote' game controller, the 'shake' interface on Nokia (and other) mobile phones, and even the use of the hard disk accelerometer on most notebooks that allows a sideways tap to change screens. Movement is a natural human function, and allowing it to be used to control technology brings an ease of use and intuitive command capability.
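A 'shake' interface of this kind can be sketched in a few lines, assuming raw (x, y, z) accelerometer readings in units of g; the window, threshold and peak count here are placeholder values, not any phone's actual tuning.

```python
import math

def is_shake(readings, threshold_g=2.5, min_peaks=3):
    """Detect a shake from a window of (x, y, z) accelerometer samples.

    Counts samples whose acceleration magnitude (minus the constant
    1 g of gravity) exceeds the threshold; several such peaks in one
    window are treated as a deliberate shake.
    """
    peaks = 0
    for x, y, z in readings:
        magnitude = math.sqrt(x * x + y * y + z * z)
        if abs(magnitude - 1.0) > threshold_g:
            peaks += 1
    return peaks >= min_peaks

window = [(0.0, 0.0, 1.0), (3.5, 0.2, 1.1), (-3.8, 0.1, 0.9),
          (3.6, 0.0, 1.0), (0.1, 0.0, 1.0)]
print(is_shake(window))  # True: three violent swings in one window
```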

Other input technology includes advances in 3-D imaging devices: visual processing derived from consumer digital cameras enables facial recognition and hand positions to be determined, providing the basis for simple gestural interfaces. Recent developments include the software recreation of 3-D models from analysis of multiple 2-D images taken from different locations (such as Microsoft Photosynth). In parallel, 3-D scanning is used to create a point cloud of geometric samples of an object's surface, from which the shape of the subject can be extrapolated. Applications include mapping and architectural work.
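The core of recreating 3-D structure from multiple 2-D images is triangulation: the same point seen from two known camera positions pins down its depth. Below is a minimal numpy sketch using the standard direct linear transform (DLT); the camera matrices are toy values rather than a real calibration, and this is only one ingredient of systems such as Photosynth.

```python
import numpy as np

def triangulate(P1, P2, pt1, pt2):
    """Recover a 3-D point from its 2-D projections in two views.

    P1, P2 are 3x4 camera projection matrices; pt1, pt2 are the (u, v)
    pixel coordinates of the same point in each image. Solves the
    standard DLT system via SVD.
    """
    A = np.array([
        pt1[0] * P1[2] - P1[0],
        pt1[1] * P1[2] - P1[1],
        pt2[0] * P2[2] - P2[0],
        pt2[1] * P2[2] - P2[1],
    ])
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]          # back from homogeneous coordinates

# Two toy cameras: one at the origin, one shifted 1 unit along x.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.5, 0.3, 4.0, 1.0])
pt1 = (P1 @ X_true)[:2] / (P1 @ X_true)[2]
pt2 = (P2 @ X_true)[:2] / (P2 @ X_true)[2]
print(triangulate(P1, P2, pt1, pt2))  # ~[0.5, 0.3, 4.0]
```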

It is the display of the 3-D world that poses greater challenges. 3-D printers (which build 3-D objects by depositing material layer by layer rather than by cutting away material from a solid block) open the door to a host of new applications and services for just-in-time and made-to-order 'one-off' creation of small items (from the likes of Z Corporation and Stratasys).
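Layer-by-layer building is easy to picture as slicing: the printer reduces a solid to a stack of 2-D cross-sections and deposits them in order. A toy sketch for a sphere follows; the radius and layer height are arbitrary assumptions.

```python
import math

def slice_sphere(radius, layer_height):
    """Cross-section radius of each horizontal layer of a sphere.

    Returns (z, r) pairs: the printer would deposit a disc of radius r
    at height z, building the solid one layer at a time.
    """
    layers = []
    z = -radius
    while z <= radius:
        r = math.sqrt(max(radius * radius - z * z, 0.0))
        layers.append((round(z, 3), round(r, 3)))
        z += layer_height
    return layers

for z, r in slice_sphere(radius=10.0, layer_height=2.5):
    print(f"layer at z={z:+.1f} mm -> disc radius {r:.1f} mm")
```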

3-D displays are often constrained by the need to wear special glasses or to maintain a specific viewing position. Although some holographic displays are beginning to emerge (from Holografika and Zebra Imaging, for example), true volumetric 3-D images created by high-speed rotating mirrors or charged particles pose serious health risks. We are a long way from matching R2-D2's ability to project the image of Princess Leia, as seen back in 1977 in Star Wars! More controlled settings offer the promise of 3-D data visualisation in the office, with potential applications in data analytics and vertical sectors such as geophysical and medical imaging.

As data volumes continue to increase, extending data visualisation into 3-D radically increases the volume of data that can be displayed. Medical imaging (for example, MRI scans), robotic surgery, nanomachines and remote handling and intervention all rely on 3-D positioning, and all represent huge commercial opportunities.
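As a small illustration of the extra capacity a third axis brings, the matplotlib sketch below plots a coloured 3-D point cloud, effectively four dimensions on one chart; the data are random placeholders standing in for, say, voxel intensities from a scan.

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder data: 1,000 random points with a scalar value each.
rng = np.random.default_rng(0)
xs, ys, zs = rng.random((3, 1000))
values = xs + ys + zs

fig = plt.figure()
ax = fig.add_subplot(projection="3d")   # the z-axis is the whole point
scatter = ax.scatter(xs, ys, zs, c=values, s=5)
fig.colorbar(scatter, label="sample value")
ax.set_xlabel("x"); ax.set_ylabel("y"); ax.set_zlabel("z")
plt.show()
```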

In a cost-conscious, environmentally aware world, the ability to project ourselves into another place requires accurate data collection, intuitive input and the ability to recreate a realistic view in 3-D. Extending our technology capabilities beyond the x and y planes, opening up the z factor, will lead to productivity enhancements, efficiencies and commercial opportunities on a massive scale.

The author is vice-president & fellow, Gartner
