X-Ray Vision on your mobile phone

Consider a world with a perfect data connection. You can get mobile broadband wherever you go, at speeds high enough to stream HD video without a hitch. Now throw in flawlessly designed hardware, seamlessly integrated into your everyday clothing and your pocket devices. Sounds good, doesn’t it?

Much of augmented reality hinges on infrastructure approaching this level before it can become a meaningful fixture in our everyday lives, but there is another area to consider, and it’s one we’ve yet to discuss on AR Week on Pocket-lint. It’s all very well having the data and the wearable computers to present it to us but, if we can’t tell these machines how best to display it to our senses, then it’s not going to be much use in the real-time way we all want AR to be.

The good news is that there’s a team at the University of South Australia working on this very problem, inside what’s known as the Magic Vision Lab. Their mission to “enhance human vision with computer-generated graphics in order to amplify human intelligence on a world-wide scale” puts them right at the heart of what most people are trying to get out of augmented reality and at the very forefront of the technology as it stands. With one of our goals on AR Week being to speak to the very top minds in the field, it wasn’t a tough decision to pick up the phone to the director at Magic Vision, Christian Sandor, and find out what’s going on at the cutting edge.

“There’s been a big research community in AR for 10 years and there’s lots of knowledge but there’s a large gap between the hype and what is actually possible at the moment,” Dr Sandor tells us as we brace ourselves for the technical detail to come.

“The current mobile phone applications that people use are things we were doing a decade ago in the laboratory. Apps like Layar are pretty much the extent of what’s around at the moment and there’s still a level of delay and inaccuracy in what the user gets.”

One of the most popular Android applications, Layar is an AR browser that lets you choose an information overlay to sit on top of the camera view of your surroundings on your phone’s screen. So, if you want to find somewhere to eat, you hit the restaurants layer and see what pops up on your real-time view but, as Sandor points out, that’s pretty much the limit of consumer augmented reality as it stands.

“Most AR on mobiles at the moment only really uses your compass and actually not the camera at all. It just overlays information on what’s already there. What we’ve been looking at for a while now is a way to manipulate the camera image as well.”

What Sandor is referring to is a trio of technologies on the benches in Magic Vision, all of which make Pocket-lint’s jaw drop a few inches lower each time as he sends over the links to the demo videos to describe the research.

Each of the techniques is based on the problem of how best to display AR information that’s either outside the frame of wherever you happen to have your mobile phone pointed, or occluded by objects in the foreground.

The first is known as Radial DistortVision and answers the former of the two problems. The idea is that instead of having to swing your phone about at arm’s length to get a picture of what’s going on to each side of you, the camera bends the image - in a similar way to a very wide or fish-eye DSLR camera lens - so that you can effectively see it all in the one frame.

In the example in the video, the distortion allows you to see down the street in both directions simultaneously by making their horizons go into the picture and not out of either side. For an app like Layar, what Radial Distort would offer is all the restaurants on a single street in one quick shot. (The video below shows an in-depth demo but head to the MVL site for succinct thumbnail footage.)
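To make the idea concrete, here is a minimal sketch in Python of one way a wide sweep of bearings could be squeezed into a single frame. It is not Magic Vision’s algorithm. The function name `compress_bearing`, the 60-degree camera field of view, the 180-degree total sweep and the 60 per cent of screen width reserved for the undistorted centre are all assumptions chosen purely for illustration.

```python
def compress_bearing(theta_deg, fov_deg=60.0, total_deg=180.0, focus_fraction=0.6):
    """Map a bearing relative to the camera heading to a screen x in [-1, 1].

    Bearings inside the camera's normal field of view keep a linear mapping
    and fill the central `focus_fraction` of the screen. Bearings beyond it,
    out to half of `total_deg` on either side, are squeezed into the margins.
    All defaults are illustrative assumptions, not measured values, and
    `total_deg` is assumed to be larger than `fov_deg`.
    """
    half_fov = fov_deg / 2.0
    half_total = total_deg / 2.0
    sign = -1.0 if theta_deg < 0 else 1.0
    # Clamp anything beyond the supported sweep to the screen edge.
    magnitude = min(abs(theta_deg), half_total)

    if magnitude <= half_fov:
        # Central region: plain linear mapping.
        return sign * (magnitude / half_fov) * focus_fraction

    # Outer region: the remaining angles share the leftover screen margin.
    overflow = (magnitude - half_fov) / (half_total - half_fov)
    return sign * (focus_fraction + overflow * (1.0 - focus_fraction))


if __name__ == "__main__":
    for bearing in (-90, -45, 0, 30, 90):
        print(bearing, round(compress_bearing(bearing), 3))
```

With these assumed defaults, a restaurant at 90 degrees to your left would land on the left-hand edge of the screen rather than off it, while anything straight ahead stays undistorted.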

If instead you want to look at points of interest that lie behind objects in your mobile camera’s field of view, then a better picture-distortion option is MeltVision. Here, the software creates an image on your display of the same scene that you’d see with the naked eye, but with the added bonus of melting away the buildings in the foreground in quite dramatic style to reveal what you were actually interested in all along. It’s the third of the bunch, though, that is probably the most impressive of all.

“MeltVision is good if you want to focus on the background but the foreground is lost,” explains Sandor. “Now, that’s fine for some uses of AR but, for others, you need the foreground as well as what’s behind it.

“Imagine a construction worker using AR to follow the building plans on top of a real life background. It’s no good if the front surface disappears when you’re trying to drill it.”

For these more commercial and, probably ultimately, more useful kinds of AR applications, Magic Vision has come up with something it calls X-Ray Vision. Using a pre-modelled environment, where what’s behind each building in a given area is known, the X-Ray Vision software fades away the foreground to reveal what’s behind, leaving just the overlaid edges of the front surface. The human mind can then work out the difference between the two.

“X-Ray Vision worked well for us to begin with but the problem with leaving just the edges is that some foreground details were lost, so we’ve since improved the project and it now keeps some colour elements of the front surface as well.”
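As a rough illustration of the two stages Sandor describes, the Python sketch below composites a foreground over a background, first keeping only the foreground’s edges and then also retaining a little of the front surface. It is a toy under stated assumptions, not the lab’s software: the names `edge_mask` and `xray_composite`, the brightness threshold and the `keep` parameter are hypothetical, the images are tiny greyscale grids rather than colour frames, and the two images are assumed to be perfectly aligned already.

```python
def edge_mask(image, threshold=50):
    """Flag pixels whose brightness jumps sharply to the right or below."""
    height, width = len(image), len(image[0])
    mask = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            here = image[y][x]
            right = image[y][x + 1] if x + 1 < width else here
            below = image[y + 1][x] if y + 1 < height else here
            if abs(here - right) > threshold or abs(here - below) > threshold:
                mask[y][x] = True
    return mask


def xray_composite(foreground, background, keep=0.0, threshold=50):
    """Show the background through the foreground, keeping foreground edges.

    `keep` is the share of the foreground's brightness retained away from
    the edges: 0.0 leaves edges only, while a small positive value lets a
    hint of the front surface remain visible.
    """
    mask = edge_mask(foreground, threshold)
    result = []
    for y, row in enumerate(foreground):
        out_row = []
        for x, fg in enumerate(row):
            if mask[y][x]:
                out_row.append(fg)  # edge pixel: foreground stays opaque
            else:
                bg = background[y][x]
                out_row.append(round(keep * fg + (1.0 - keep) * bg))
        result.append(out_row)
    return result


if __name__ == "__main__":
    wall = [[0, 0, 200, 200]] * 3        # foreground with one vertical edge
    hidden = [[100, 100, 100, 100]] * 3  # what lies behind it
    print(xray_composite(wall, hidden))             # edges only
    print(xray_composite(wall, hidden, keep=0.25))  # edges plus a hint of wall
```

In the edges-only call the wall vanishes except for the line where it changes brightness. With `keep=0.25` the light and dark halves of the wall stay faintly distinguishable, which is the kind of foreground detail the improved version is said to preserve.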

A second problem for the technology has been that it’s not always best to keep all the details of all the edges of the foreground. Sometimes that can prove to be so much information that it confuses what’s going on behind it. Sandor outlines the solution the team’s currently working on and the obvious problems therein.

“There is a new system that mimics the human eye and the way it selectively processes the information. It was originally developed by the University of Southern California to work with a two-hour computation but that’s just not fast enough to work for live mobile AR.”

All the same, the video above shows the progress that’s being made but, even if the more intangible machinations of the human visual system can be mastered, there are still other obstacles to remove.

“Google Street View isn’t ideal to use to know what's behind each building. It’s fairly low res and it’s not live. Surveillance camera footage would be fantastic but then there’s privacy issues to consider.”

Another limitation is that the foreground and background have to be superimposed on one another to absolute pixel perfection for X-Ray Vision to be of any commercial use. As a result, the team has been using laptops for the live computations and it’s only relatively recently that they have ported this Nokia-sponsored project to a collection of N900 handsets. While X-Ray Vision isn’t demanding as far as GPUs go - making it a good choice for mobiles - there’s still plenty of number crunching required from the CPU, which is another limiting factor for any kind of AR system on phones and one reason why Sandor describes most mobile AR experiences at the moment as a bit “hello world”.


Of course, there’s no reason why augmented reality has to be limited to the mobile space. The second major project of the Magic Vision Lab is in the area of what Sandor describes as “Visual Haptics”.

“It’s about trying to create a true Holodeck experience. It’s the search for the ultimate display.”

So far, most AR has been concerned with just one mode of input stream, vision, but according to Sandor, there’s no reason why the field of augmented reality should be limiting itself in that way.

“Three quarters of what we perceive every second is visual,” he explains, “and, of the remainder, again about three quarters, is the sense of touch.”
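Taking Sandor’s rounded figures at face value, the arithmetic can be checked with a few lines of Python. Treating “about three quarters” as exactly three quarters is an assumption made here for the sake of the sum, and the variable names are illustrative only.

```python
from fractions import Fraction

visual = Fraction(3, 4)             # "three quarters of what we perceive"
remainder = 1 - visual              # everything that is not visual
touch = Fraction(3, 4) * remainder  # "about three quarters" of the remainder
other = remainder - touch           # all remaining senses combined

for name, share in (("visual", visual), ("touch", touch), ("other", other)):
    print(f"{name}: {share} = {float(share):.2%}")
```

On those numbers, touch accounts for 3/16 of perception, just under 19 per cent, and sight and touch together cover 15/16 of it, which is why pairing the two is such an attractive target.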

Naturally, the more multi-modal we can make an interface, the more real it can seem and the more possible it is to bring computer-generated overlays to life. Using VH-7000 head-mounted displays provided by Canon - the likes of which are only available in three laboratories in the world and cost in excess of $100,000 - and a robot pen known as the Phantom, Magic Vision has been working on a way to deliver both the touch and the sight of the same computerised object in a virtual space to a user at the same time, adding to the reality of the experience.

At its simplest, the team has experimented by using a 3D virtual car which can be seen from all angles through the HMD and felt with the tip of the pen as you move the Phantom around the edges of the computer generated car shape.

To put that in a commercial context, the Magic Vision team has carried out demonstrations in which you take a virtual object such as a trainer (from a sports shoe company) and use the robot pen not only to feel around the virtual object but also to draw designs on it and see how they look, instead of going through the costly and time-consuming process of making up prototypes.

You could have a marketing team sitting round a table wearing HMDs, all looking at this virtual product in true 3D space and each able to interact with it, making suggestions for and modifying the object without constantly having to go back to the drawing board and call a string of follow-up meetings as each new prototype is put together at great time and expense.

It’s even possible to mix the real and virtual further by introducing a real object, such as the Tetra Pak carton in the video demonstration below, to which you can then add extra lengths and dimensions as well as changing the patterns on the surface.

Of course, as far as haptic feedback goes, the Phantom robot pen only effectively works through a single point, and via an extension of your body as opposed to an actual digit of your hand. The next step for Sandor and his team has been to work on improving that physical contact and, for that, they’ve been collaborating with the University of Tokyo, which has itself come up with an interface device known as the SPIDAR-8, as seen in the video below.

It’s a two-handed, multi-finger haptic device consisting of a web of wires connected to a frame. It connects to eight of your fingers via small caps for your fingertips and provides resistance wherever the computer tells it to. The effect is that it can describe a perfect three-dimensional virtual object in real space, allowing your fingers to explore multiple surfaces and edges at the same time. Paired with the Canon VH-7000 head-mounted displays, it creates an astounding AR effect.

Of course, all of the AR interface developments at the Magic Vision Lab are far from ready. While the technology is there, there are obviously significant barriers to it reaching the users of the world, even at the level of pure cost. But scaling it back from the SPIDAR-8 and visual haptics to simply a better mobile AR experience, how long will it be before we get more than just the parlour tricks of current smartphone apps? According to Sandor, it can’t be long now.

“We’ve all been waiting for something to happen for quite some time. Apple and Google have been quiet on AR for a long time. It feels like they’re holding something back but no doubt we’ll soon see some really good AR applications from them embedded into their mobile platforms. The wait is most likely so that these releases can be absolutely bulletproof and really nice when they arrive to wow us all.

“When that’s all there, when these applications are on every iPhone and every Android smartphone out there and when my grandmother’s using it, then AR will have really arrived.”
