Google voice search: faster and more accurate

Posted by Haşim Sak, Andrew Senior, Kanishka Rao, Françoise Beaufays and Johan Schalkwyk – Google Speech Team. Back in 2012, we announced that Google voice search had taken a new turn by adopting Deep Neural Networks (DNNs) as the core technology used to model the sounds of a language. These replaced the 30-year-old standard in the industry: the Gaussian Mixture…
Original post: Google voice search: faster and more accurate
Source: Google Research

Causal attribution in an era of big time-series data

by KAY BRODERSEN. For the first time in the history of statistics, recent innovations in big data might allow us to estimate fine-grained causal effects, automatically and at scale. But the analytical challenges are substantial. Every idea at Google begins with a simple question. How can we predict the benefits the idea’s realization would create for users, publishers, developers, or advertisers? How…
Original post: Causal attribution in an era of big time-series data
Source: Unofficial Google Data Science

Parameter Tuning with Hyperopt

This post will cover a few things needed to quickly implement a fast, principled method for machine learning model parameter tuning. There are two common methods of parameter tuning: grid search and random search. Each has its pros and cons. Grid search is slow but effective at searching the whole search space, while random search is fast, but could miss…
Original post: Parameter Tuning with Hyperopt
Source: District Data Labs

ECML-PKDD 2015 Review

ECML-PKDD was a delight this year. Porto is definitely on the short list of the best European cities in which to have a conference. The organizers did a wonderful job injecting local charm into the schedule, e.g., the banquet at Taylor’s was a delight. It’s a wine city, and fittingly wine was served throughout the conference. During the day I…
Original post: ECML-PKDD 2015 Review
Source: Machined Learnings

Information sharing for more efficient network utilization and management

Andreas Terzis, Software Engineer. As Internet traffic has grown and changed, Google and other content and application providers have worked cooperatively with Internet service providers (ISPs) so that services can be delivered quickly, efficiently and cost-effectively. For example, rather than content having to traverse a long distance and many different networks to reach an Internet access provider’s network, a content provider…
Original post: Information sharing for more efficient network utilization and management
Source: Google Research

VLDB 2015 and Database Research at Google

Posted by Corinna Cortes, Head of Google Research NY and Cong Yu, Research Scientist. This week, Kohala, Hawaii hosts the 41st International Conference of Very Large Databases (VLDB 2015), a premier annual international forum for data management and database researchers, vendors, practitioners, application developers and users. As a leader in Database research, Google will have a strong presence at VLDB 2015…
Original post: VLDB 2015 and Database Research at Google
Source: Google Research

On Procedural and Declarative Programming in MapReduce

by SEAN GERRISH and AMIR NAJMI. To deliver the services our users have come to rely upon, Googlers have to process a lot of data, often at web-scale. For doing analyses quickly, it helps to abstract away as much of the repeated work as possible. In this post, we’ll describe some things we have learned about mixing declarative and…
Original post: On Procedural and Declarative Programming in MapReduce
Source: Unofficial Google Data Science

Time Maps: Visualizing Discrete Events Across Many Timescales

Discrete events pervade our daily lives. These include phone calls, online transactions, and heartbeats. Despite the simplicity of discrete event data, it’s hard to visualize many events over a long time period without hiding details about shorter timescales. The plot below illustrates this problem. It shows the number of website visits made by a certain IP address over the course…
Original post: Time Maps: Visualizing Discrete Events Across Many Timescales
Source: District Data Labs