Debugging the car: An in-depth look at Google’s Street View
August 5, 2007
Posted by timothyliu in Technology.
I recently attended an Engineering Intern Open House hosted by Google where Luc Vincent, the last speaker, piqued my interest with an amazing new technology:
Google Street View:
For the first time, I was rewarded with an in-depth look at the architecture and process behind a new technology, instead of having to try it out myself and guess at the inner workings. Monsieur Vincent (he graduated from École Polytechnique in France) gave the following presentation (I have paraphrased and drawn my own conclusions because there is no hard copy of his PowerPoint presentation):
Google Street View started out as an idea straight from Larry Page, who went around Stanford recording streets and buildings with his own video camcorder. The idea grew with the creation of Google Maps, since Street View would provide many benefits, such as showing the environment around a new apartment or highlighting beautiful parts of a city to visit. However, since this is Google, Luc Vincent went straight to explaining the technology behind what we see as a streamlined ‘video’ of streets across America.
When deciding how to ‘capture’ an image of a street while driving, there are several things to consider.
1. Pinhole Perspective vs. Pushbroom Linear Imaging
A normal image plane from a camera produces a rectangular picture:
This is all well and good; however, with a moving camcorder, you need to reconcile the individual frames into one continuous picture. Introducing: Pushbroom. What some people, i.e. Google, term the ‘Pushbroom’ technique is to take a vertical slice of the rectangular perspective, directly in front of the camera, and crop it from the entire image plane. As you move along with the camera, the vertical slice always differs, so you can ‘stitch’ the slices together to produce a picture, and it can go on indefinitely as long as your camera keeps going. An example of this:
However, there are problems with depth, because each vertical slice is captured as if you were looking straight at it. When you stitch the slices together, you lose the depth cues you would get from a single viewpoint. Here is an example up close: Link. The cars are squished together and alleyways, for example, have no depth.
Google solves this by combining the pushbroom and pinhole perspectives: multiple pinhole perspectives repeat in a linear array, like pushbroom slices. This adds perspective to an infinite camera stream of streets.
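To make the slice-and-stitch idea concrete, here is a minimal sketch in Python. It is only my own illustration of the technique as I understood it, not Google's code: the function name `pushbroom_stitch`, the `slice_width` parameter, and the toy frames (nested lists of numbers standing in for pixels) are all assumptions.

```python
def pushbroom_stitch(frames, slice_width=1):
    """Stitch the centre vertical slice of each frame into one long strip.

    frames: list of frames; each frame is a list of rows; each row is a
    list of pixel values. All frames are assumed to be the same size.
    slice_width: number of centre columns taken from each frame
    (hypothetical knob; wider slices keep more pinhole perspective).
    """
    if not frames:
        return []
    height = len(frames[0])
    width = len(frames[0][0])
    start = (width - slice_width) // 2  # left edge of the centre slice
    panorama = [[] for _ in range(height)]
    for frame in frames:
        for y in range(height):
            panorama[y].extend(frame[y][start:start + slice_width])
    return panorama


# Toy data: 3 frames, each 2 rows by 5 columns.
# Pixel value = frame_index * 10 + column, so the source of each pixel is visible.
frames = [[[f * 10 + x for x in range(5)] for _ in range(2)] for f in range(3)]

strip = pushbroom_stitch(frames)                # one centre column per frame
wide = pushbroom_stitch(frames, slice_width=3)  # three centre columns per frame
```

With a one-pixel slice every column comes from a different camera position, which is exactly why depth disappears. Taking wider slices keeps a little ordinary perspective inside each slice, which is roughly the spirit of the combined approach described above.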
2. San Francisco hills, pixelated cars, and a shady (in the shade) cafe
Okay, so now you can have an infinite stream of imaging that actually looks 3D, with depth and everything. But wait a sec: you travel to Lombard Street and the hills sit there mocking you.
To make sure the camera can take footage on bumpy roads, Google uses a gyroscope or some similar technology to orient the camera, so that on bumps or hills the footage keeps the same angle, as if you were walking down the hilly San Francisco roads.
Furthermore, Google prides itself on the extremely fast frame rate of its high-tech camcorder; without it, images captured while driving at 20-30 mph would look pixelated. An extension of this is the extremely high resolution that Street View provides, so that you can even zoom in and see people’s faces.
Lastly, what happens when your shop around the corner sits in the shade of a gigantic tree, even though it’s only 3:00 in the afternoon? Google tries to take this into account and applies an obscene number of filters to make the picture viewable by all.
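The talk described this as the camera being physically oriented, but the same correction can be pictured in software. The sketch below is an assumption on my part, not something from the presentation: given a roll angle reported by a gyroscope, it rotates an image point about the image centre by the opposite angle. The function name `level_point` and the sign convention are hypothetical.

```python
import math


def level_point(x, y, cx, cy, roll_deg):
    """Rotate image point (x, y) about the centre (cx, cy) by -roll_deg.

    roll_deg is the camera roll as measured by a gyroscope (assumed sign
    convention: a positive roll tilts the horizon by +roll_deg in image
    coordinates). Rotating by the negative angle cancels the tilt.
    """
    a = math.radians(-roll_deg)
    dx, dy = x - cx, y - cy
    return (cx + dx * math.cos(a) - dy * math.sin(a),
            cy + dx * math.sin(a) + dy * math.cos(a))


# A point 100 px from the centre on a horizon tilted by 10 degrees.
cx, cy = 320.0, 240.0
tilted_x = cx + 100 * math.cos(math.radians(10))
tilted_y = cy + 100 * math.sin(math.radians(10))

# After levelling, the point lies on the horizontal line through the centre.
lx, ly = level_point(tilted_x, tilted_y, cx, cy, roll_deg=10)
```

Applying the same rotation to every pixel would level the whole frame. A real rig would also have to handle pitch and yaw, which this sketch ignores.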
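Some back-of-the-envelope arithmetic shows why capture speed matters at driving speeds. The frame rate and exposure time below are made-up illustrative numbers, not figures from the presentation. The helper names are hypothetical as well.

```python
MPH_TO_MPS = 0.44704  # exact conversion: 1 mile per hour in metres per second


def travel_during_exposure_m(speed_mph, exposure_s):
    """Distance the car moves while the shutter is open (source of motion blur)."""
    return speed_mph * MPH_TO_MPS * exposure_s


def spacing_between_frames_m(speed_mph, fps):
    """Distance the car moves between two consecutive frames."""
    return speed_mph * MPH_TO_MPS / fps


# Assumed example values: 30 mph, 1/1000 s exposure, 30 frames per second.
blur_m = travel_during_exposure_m(30, 0.001)
gap_m = spacing_between_frames_m(30, 30)
```

Under these assumptions the car moves a bit over a centimetre during each exposure and a bit under half a metre between frames. A slower shutter or a lower frame rate stretches both numbers proportionally, which is where smeared or gappy imagery would come from.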
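I don't know which filters Google actually applies, so the following is only a guess at the simplest member of that family: a gamma curve that lifts dark pixels much more than bright ones. The function name `brighten_shadows` and the default `gamma` value are my own choices.

```python
def brighten_shadows(pixels, gamma=0.5):
    """Apply a gamma curve to 8-bit grey values (0-255).

    With gamma < 1, dark values are lifted strongly while 0 and 255
    stay fixed, so a storefront in shade becomes easier to see.
    """
    return [round(255 * (p / 255) ** gamma) for p in pixels]


# A shaded region (dark values) next to a sunlit one (bright values).
shaded_and_sunlit = [0, 16, 64, 200, 255]
corrected = brighten_shadows(shaded_and_sunlit)
```

The endpoints stay where they are and everything in between gets brighter, with the darkest values gaining the most in relative terms. A real pipeline would presumably adapt the curve per region instead of using one global setting.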
3. 3D Panoramic view
Despite his presentation being less than 30 minutes long, Monsieur Vincent spoke of numerous problems: dealing with traffic routes (left turns, right turns, or going straight), debugging the car (the Street View car has to handle all of the details of adjusting the camera, filtering, accounting for pointing at the sun, etc.), and even powering all of these things, which requires several batteries on top of or in the trunk of the car.
I would, however, like to speak about the 3D panoramic technology because of what I researched after attending the open house. What’s the best way to capture a 360-degree view of everything while driving? Google experimented with various solutions, including a triangular layout of 3 camcorders, each most likely having a field of view of more than 120 degrees to accommodate a panorama. Wait, but there’s more! Check this out:
Although you can see 360 degrees around you, you need to convert the footage to a rectangular format. Google does this with pose optimization and has kindly released the code to do this: Link. Imagine a globe and take a triangular cut out of the crust. To map this to rectangular coordinates with depth, Google used rendering in Flash to translate something that is warped around a circle into something that can be seen on a computer screen. Pretty neat!
So that basically concludes the presentation on how Google’s Street View came about.
Now, one image on this page is from http://www.immersivemedia.com. I later found out that Google had launched Street View with some aid or insight from immersivemedia.com’s technology. What amazes me is that despite having the easy option of just using their technology, Google went ahead, tried to do it their own way, and succeeded. Now of course, the idea is all Larry Page’s, but you have to remember that Google Street View, along with Google Maps, is a free service. Yes, it helps grow their user base virally, and it spreads their popularity all across the world, but I think this is a testament to the Google spirit, motivation, lunacy? I really value the fact that despite having readily available technology, Google, at least in this instance, went above and beyond to refine a technology and provide it free to the general public. I am far from an avid Google supporter, but this is one thing that I just can’t dislike.
As a personal side note, I found this presentation extremely stimulating, and it is a very good reason why Monsieur Vincent was chosen to speak about Google’s amazing new service, Street View.
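Two small pieces of this can be sketched in Python. Neither is taken from Google's released code. The first checks how much neighbouring cameras overlap in an evenly spaced ring. The second maps a viewing direction onto a rectangular image using the standard equirectangular layout, which I am assuming here as the simplest sphere-to-rectangle mapping. Both function names are hypothetical.

```python
def overlap_per_seam_deg(num_cameras, fov_deg):
    """Angular overlap between neighbouring cameras in an evenly spaced ring.

    A positive result means adjacent views overlap (good for stitching);
    a negative result means there is a gap in the 360-degree coverage.
    """
    return fov_deg - 360.0 / num_cameras


def direction_to_pixel(yaw_deg, pitch_deg, width, height):
    """Map a viewing direction to equirectangular pixel coordinates.

    yaw_deg: heading, 0-360, wraps around horizontally.
    pitch_deg: +90 is straight up, -90 is straight down.
    Returns (u, v) with u across the image width and v down its height.
    """
    u = (yaw_deg % 360.0) / 360.0 * width
    v = (90.0 - pitch_deg) / 180.0 * height
    return u, v


# Three cameras with an assumed 130-degree field of view each.
seam_overlap = overlap_per_seam_deg(3, 130)

# Looking at the horizon, directly behind the car, in a 3600x1800 panorama.
u, v = direction_to_pixel(180, 0, 3600, 1800)
```

Three cameras need more than 120 degrees each just to close the ring, and any extra becomes overlap that stitching can use. The second function shows why the rectangular image looks stretched near the top and bottom: every pitch gets a full row of pixels, even though the circle of directions at that pitch shrinks toward the poles.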