AT&T Labs applies analytics to improve California traffic

Anyone who's spent time in traffic in California can probably appreciate the work that AT&T Labs (NYSE: T) is doing with the California Department of Transportation and the University of California at Berkeley to create smarter traffic designs and ease congestion.

Drivers in Los Angeles spend some 80 hours a year sitting in traffic, whereas the average American commuter spends 42 hours a year. Rather than waiting for flying cars or some other science fiction-sounding solution, AT&T researchers are looking at using insights gathered from aggregate and anonymous cell phone data to ease transportation problems -- and they say they are protecting consumer privacy while doing so.

Two projects are in the works: One is the Connected Corridors project, which involves the use of anonymous data to forecast traffic patterns in Los Angeles. The other is the Smart Bay project in northern California, where analysis can help determine the best place to build ride share parking lots, for example.

By analyzing the data, AT&T can better understand at an aggregate level how populations move about, according to Chris Volinsky, AVP at AT&T Labs. "Understanding this better can help us build a smarter, more efficient infrastructure," Volinsky told FierceWirelessTech. For instance, the data could help traffic managers create "play books" that chart traffic patterns and volume. If a traffic accident occurred, managers could then consult the play book and adjust traffic lights to ease the flow.

"Our focus is at the macro level to understand broadly how populations move about," he said. "We observe the cellular tower that a device is connected to when they make a call, text or data transaction, whether on the road or not. This gives us a coarse view of the location of the device, through time." The data is then aggregated by traffic analysis zone on AT&T servers. 

Of course, any time information is gathered from cell phones, the question of privacy arises. AT&T says customer privacy is a "fundamental commitment," and that's why the data is aggregate and anonymous. As an additional precaution to preserve anonymity, its researchers also obfuscate the data before it goes into a simulator.

Researchers also inject "noise," or random information, into the dataset. "This helps to further strengthen our privacy efforts. It's important to understand anonymized CDRs [call detail records] do not contain names or real phone numbers. The telephone number is replaced by a randomly generated number, and we are also obfuscating the time stamp," he said.
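
As a rough illustration of the two steps Volinsky describes, the Python sketch below swaps each phone number for a randomly generated number and shifts each time stamp by a random offset. The record layout, the function name anonymize_cdrs and the 15-minute jitter window are assumptions made for this example; the article does not describe AT&T's actual implementation.

```python
import random
import secrets
from datetime import timedelta

def anonymize_cdrs(records, max_jitter_minutes=15, rng=None):
    """Return copies of the records with random IDs and jittered time stamps.

    `records` is a list of dicts with the hypothetical keys
    "phone", "time" (a datetime) and "tower".
    """
    rng = rng or random.Random()
    pseudonyms = {}  # real number -> random stand-in, held only in memory
    anonymized = []
    for rec in records:
        phone = rec["phone"]
        if phone not in pseudonyms:
            # A random 64-bit number that carries no trace of the real one.
            pseudonyms[phone] = secrets.randbits(64)
        # Shift the time stamp by a random amount inside the jitter window.
        offset = rng.uniform(-max_jitter_minutes, max_jitter_minutes)
        anonymized.append({
            "id": pseudonyms[phone],
            "time": rec["time"] + timedelta(minutes=offset),
            "tower": rec["tower"],
        })
    return anonymized
```

In this sketch the same caller keeps the same random number across records, so movement can still be studied, but the output no longer holds the real phone number or the exact time.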

Before aggregation, "we move CDRs' location information to a random location within a traffic analysis zone (TAZ) close to the original cell tower where the call occurred," he said. This adds another layer of anonymization and makes it impossible to associate any individual call with a specific person within the original zone.
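
The displacement step might look something like the Python sketch below, which moves a record from its cell tower to a uniformly random point inside the nearest zone. Everything here is a simplifying assumption for illustration: real TAZs are irregular polygons rather than bounding boxes, coordinates are treated as flat, and the names nearest_zone and displace_to_zone are hypothetical.

```python
import random

def nearest_zone(tower, zones):
    """Return the ID of the zone whose box centre is closest to the tower.

    `tower` is a (lon, lat) pair; `zones` maps a zone ID to a bounding box
    (min_lon, min_lat, max_lon, max_lat).
    """
    def squared_distance(box):
        min_lon, min_lat, max_lon, max_lat = box
        centre_lon = (min_lon + max_lon) / 2
        centre_lat = (min_lat + max_lat) / 2
        return (centre_lon - tower[0]) ** 2 + (centre_lat - tower[1]) ** 2
    return min(zones, key=lambda zone_id: squared_distance(zones[zone_id]))

def displace_to_zone(tower, zones, rng=None):
    """Replace a tower location with a random point in the nearest zone."""
    rng = rng or random.Random()
    zone_id = nearest_zone(tower, zones)
    min_lon, min_lat, max_lon, max_lat = zones[zone_id]
    point = (rng.uniform(min_lon, max_lon), rng.uniform(min_lat, max_lat))
    return zone_id, point
```

After this step a record carries only a zone and an arbitrary point inside it, not the tower that handled the call.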

That's not all, however. Next, the randomly displaced calls are aggregated and fed into a simulator, which creates a synthetic dataset. This output represents an aggregation of many simulated -- or virtual -- CDRs. "In essence, each arrow in the simulation is not associated with a single CDR," Volinsky explained. "Instead these are random samples that are created from a probability distribution that is constructed from a set of synthetic CDRs in a given area."
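
To make the idea of sampling from a probability distribution concrete, here is a minimal Python sketch. It assumes the aggregated input is a list of zone-to-zone movements, counts how often each pair occurs, and then draws synthetic trips in proportion to those counts. The trip representation and the function name sample_synthetic_trips are assumptions for illustration; the article does not say how AT&T's simulator builds or samples its distribution.

```python
import random
from collections import Counter

def sample_synthetic_trips(observed_trips, n, rng=None):
    """Draw n synthetic (origin_zone, destination_zone) trips.

    `observed_trips` is a list of (origin_zone, destination_zone) pairs
    derived from already-anonymized, displaced records.
    """
    rng = rng or random.Random()
    # Aggregate: how many times was each zone-to-zone movement seen?
    counts = Counter(observed_trips)
    pairs = list(counts)
    weights = [counts[pair] for pair in pairs]
    # Each synthetic trip is a random draw from the aggregate distribution,
    # so it does not correspond to any single input record.
    return rng.choices(pairs, weights=weights, k=n)
```

Only the aggregate counts influence the output, which is the property Volinsky points to when he says each arrow in the simulation is not tied to a single CDR.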
