The US federal judge presiding over a trial in which the government seeks to break up Google quizzed an expert today on the risks of making the tech giant share user data with rivals who could then reverse engineer Google's search results. US District Judge Amit Mehta is being asked by the government to not just split the Chrome browser off from Google but also force the company to share user search data and mandate ad syndication after he found the tech giant illegally monopolized the markets for Internet search and search text ads.
"Would you agree that more user data is a beneficial thing to have to build a better search engine?" Mehta asked James Allan.
Allan, an expert witness put forth by Google, is a professor of computer sciences at the University of Massachusetts Amherst.
"I think that it's clear that you need some user data," the witness said.
In a follow-up question, Mehta asked about the kind of data sharing and forced disclosure proposed by the US Department of Justice and a multistate coalition: "Do you think there's any type or volume data that could be disclosed that would impose the reverse engineering and mimicking risks that you described?"
Allan said it's very clear that the more data that is disclosed, the easier it is to reverse-engineer Google's search results.
Earlier in his testimony, Allan said the DOJ's proposed remedies would allow rivals to reverse engineer not only Google's results but also other Google technologies. Using the disclosed data or the syndication remedies, competitors could train a system to produce results that are effectively identical to what Google produces, he said.
"I would assume that the converse is true, which is less data than what you're describing becomes less useful in terms of developing a quality search engine?" Mehta asked.
"Yes, I think that if I gave you 10 sample queries that have been issued in the last year, you couldn't do anything with it. So I agree that there's a point at which there's no use to it to anyone," Allan replied.
The expert explained how a rival could mimic Google if the DOJ's remedies are imposed. If a search engine is not doing well with certain classes of queries, or if the rival wants to do better with certain classes such as health or sports queries, the competitor can find or generate queries associated with those topics and send them to Google. It would then get back the ranked list in response, which would give the competitor a good result and a bad result. This would allow the competitor to train its system to do a better job or to mimic Google on those types of queries, Allan said.
During cross-examination by DOJ counsel Diana Aguilar, Allan acknowledged that he investigated whether Google's scale gives users a greater incentive to contribute content to Google than to other search engines.
For instance, in 2018, users added approximately half of all new places added to Google Maps, Aguilar said while presenting an internal Google slide on the level of user-generated content that powered the tech giant's search engine and local services.
Aguilar asked the expert whether he was aware that the DOJ is not asking for the technology used to extract information from Google's Knowledge Graph but is instead asking for the database that contains the information, such as a company's business hours. Such information would allow competitors to build their own Knowledge Graph.
Google's Knowledge Graph is a massive database of interconnected facts about the world. It is designed to help Google understand the context and meaning of search queries, and it enables the tech giant to go beyond keyword-based search and provide more relevant, contextual and informative results. The Knowledge Graph powers features such as knowledge panels, which display information about entities such as people, places and things on search-results pages.
Allan acknowledged his understanding was that the DOJ's proposal sought the data necessary to construct the Knowledge Graph, not the underlying technology needed to construct it.
Aguilar then asked Allan whether he had presumed that the DOJ was asking for ranking signals and query-based salient terms, among other things. "Do you understand that plaintiffs have not requested that information?"
"I understand you asserting that now. It's not always clear what is or isn't user-side data, what is derived partially from user-side data and what is clearly user-side data has not been specified," Allan replied.
The expert said he chose to take a broad interpretation that is consistent with using examples drawn from things that the DOJ explicitly called out.
"So, maybe some of those things that I list would not be counted as user-side data, but the plaintiffs provided no guidance to help me figure that out," Allan said.
Google is expected to call Eric Muhlheim, Mozilla's chief financial officer, as a witness tomorrow. Two more expert witnesses may be called to testify.