Facebook's plans to roll out a Google-like capability with its new Graph Search offering started off with one big advantage: director of engineering Lars Rasmussen is a former Google star, having led the teams that built Google Maps and Google Wave in Sydney. But what technology is driving the new service, and how hard was it to build?
In a post on the Facebook Engineering blog, Rasmussen details some of the challenges which the just-launched project faced. Even getting to the point of the limited beta has been a slow process. Attempts to rebuild Facebook's often cumbersome and unreliable search systems have been in the planning stages since 2011, and the engineering team has been actively working on the project for more than a year.
Rasmussen's previous experience with traditional search wasn't necessarily going to be useful. "Compared to large document collections like the Web, the data in our databases have significantly more explicit structure than free-flowing text. Therefore, a traditional keyword-based search product might not be the answer," he wrote.
The infrastructure challenge was daunting for several reasons. Facebook's existing backend was a hodgepodge of code, including three separate search systems. While that approach had helped speed up deployment, it had reached the familiar point where maintaining those systems restricted the engineering team's ability to develop new services.
One of the existing systems, Unicorn, had been developed to help identify content which wasn't directly linked (used by Facebook for identifying 'friends of friends', amongst other functions). Rasmussen and his team elected to use this as the basis for Facebook's overall search experience, first making it the central system and then adding Graph Search functionality.
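To make the 'friends of friends' idea concrete, here is a minimal sketch of a two-hop lookup over a friendship graph. It is purely illustrative and is not Unicorn's implementation, which the post does not detail; the `friends` dictionary, the names in it, and the `friends_of_friends` function are all hypothetical.

```python
def friends_of_friends(friends, user):
    """Return people exactly two hops from `user` in a friendship graph.

    `friends` is an assumed adjacency mapping: person -> set of direct friends.
    """
    direct = friends.get(user, set())
    two_hops = set()
    for friend in direct:
        # Collect everyone each direct friend is connected to.
        two_hops |= friends.get(friend, set())
    # Exclude the user and anyone who is already a direct friend.
    return two_hops - direct - {user}


# Hypothetical sample data for illustration only.
friends = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave"},
    "carol": {"alice", "dave", "erin"},
    "dave": {"bob", "carol"},
    "erin": {"carol"},
}

print(sorted(friends_of_friends(friends, "alice")))  # ['dave', 'erin']
```

A production system would need an index that answers this kind of query without walking the graph on every request, which is the sort of problem a dedicated system like Unicorn exists to solve.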
Scale was a challenge. "Every day, people share billions of pieces of new content, and Graph Search needs those indexed within seconds of their creation," Rasmussen wrote. But arguably the biggest issue was ensuring that privacy settings were respected:
Consider the relatively simple Graph Search query, "Photos of Facebook employees." For starters, we make sure that only photos that the owner has shared with the person conducting the search can be seen on the photo results page. But we have also to make sure that each photo features at least one person who has shared with the searcher that they work at Facebook! Otherwise we would implicitly be revealing content that the searcher does not have access to. The more complex the Graph Search query, the more work we need to do to ensure the system returns only content the searcher already has access to.
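The two checks Rasmussen describes for "Photos of Facebook employees" can be sketched as a simple filter. This is a toy model under stated assumptions, not Facebook's code: the `Photo` class, the `employer_visible_to` mapping, and the `photos_of_employees` function are all hypothetical, and real privacy rules are far richer than a set of permitted viewers.

```python
from dataclasses import dataclass


@dataclass
class Photo:
    photo_id: str
    visible_to: set  # assumed: users the owner has shared the photo with
    tagged: set      # assumed: people who appear in the photo


def photos_of_employees(photos, employer_visible_to, searcher):
    """Return IDs of photos the searcher may see as 'photos of employees'.

    `employer_visible_to` is an assumed mapping: employee -> set of users
    who are allowed to see that this person works at the company.
    """
    results = []
    for photo in photos:
        # Check 1: the owner must have shared the photo with the searcher.
        if searcher not in photo.visible_to:
            continue
        # Check 2: at least one tagged person must have shared their
        # employer with the searcher; otherwise the result would leak it.
        if any(searcher in employer_visible_to.get(person, set())
               for person in photo.tagged):
            results.append(photo.photo_id)
    return results


# Hypothetical sample data for illustration only.
photos = [
    Photo("p1", {"sam", "kim"}, {"ann"}),
    Photo("p2", {"kim"}, {"ann"}),  # not shared with sam
    Photo("p3", {"sam"}, {"raj"}),  # raj hides his employer from sam
]
employer_visible_to = {"ann": {"sam", "kim"}, "raj": {"kim"}}

print(photos_of_employees(photos, employer_visible_to, "sam"))  # ['p1']
print(photos_of_employees(photos, employer_visible_to, "kim"))  # ['p1', 'p2']
```

Note that "p3" is withheld from sam even though sam can see the photo itself: returning it for this query would reveal that raj works at the company. Each extra condition in a query adds another check of this kind, which is why more complex queries require more work.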
Like many online projects, Graph Search has launched in incomplete form. "Today, we are far enough along now to launch Graph Search as a beta, but [what] we're still missing is the ability to index all of the posts and comments people have shared on Facebook," Rasmussen wrote, noting that these "make up by far the biggest dataset we have for Graph Search and Unicorn." Future expansions will include support for additional languages and a mobile interface.
Under the Hood: Building Graph Search Beta [Facebook Engineering]