Google has found that some search queries with the word “not” are still challenging to answer. One example of this would be: "Tell me the easiest roses to grow, but 'not' pink double knock out."
That’s up from “hopeless” in the very recent past, said Google Director and Product Manager Elizabeth Tucker, who spoke during a recent podcast episode of Google’s Search Off The Record.
During the podcast, which focused on measuring search quality and data, Tucker discussed the challenges Google continues to face despite advancements in artificial intelligence (AI).
It’s confusing because the search engine needs to understand when “not” means the person searching doesn’t want the word included, or when it has a different semantic meaning.
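A minimal sketch of why this is hard, assuming a deliberately crude rule that is not Google's method: treat the word right after "not" as a term to exclude. The function name `naive_parse` and the sample queries are hypothetical, chosen only for illustration.

```python
def naive_parse(query: str) -> tuple[list[str], list[str]]:
    """Split a query into (wanted, excluded) terms.

    Crude assumption: the single word following "not" is always
    something the searcher wants left out.
    """
    wanted: list[str] = []
    excluded: list[str] = []
    exclude_next = False
    for word in query.lower().split():
        if word == "not":
            exclude_next = True       # flag the next word for exclusion
        elif exclude_next:
            excluded.append(word)
            exclude_next = False
        else:
            wanted.append(word)
    return wanted, excluded


# Here "not" really does mean "leave this out".
print(naive_parse("easiest roses to grow not pink"))

# Here the searcher wants pages about roses that are not blooming,
# but the rule excludes "blooming" anyway.
print(naive_parse("why is my rose not blooming"))
```

The same rule gives a sensible reading of the first query and the wrong reading of the second, which is the ambiguity Tucker describes.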
Complex linguistic searches are difficult. In general, prepositions are also a challenge, she said. One of the “big breakthroughs” was the BERT paper and transformer-based machine learning (ML) models.
Google describes BERT as Bidirectional Encoder Representations from Transformers, an ML framework and open-source natural language processing (NLP) model developed by the company in 2018.
“I would not say this is a solved problem,” Tucker said. “We are still working on it.”
Users need to rephrase searches to find the information they seek. Marketers need to focus on clarity in content and communicate the relationships between key concepts and phrases, paying attention to those nuances when creating keyword matches and tags.
When does Google know searches have improved? Google is seeing more complicated queries, Tucker said. The bar is raised higher with greater success in search.
Measurements also tell Google when more queries are being answered correctly. Google sometimes surveys searchers, such as when it asks “how helpful are these results?”
The company also relies on metrics for which it samples queries and has human evaluators rate the results. Search behavior is monitored to determine whether people find what they seek.
“If we just stood still, search would get worse,” Tucker said.
Sometimes data is misleading. Not everything important is measurable. Not everything measurable is important.
In the early years of Google, the company had few measurements because it was easy to see from looking at a handful of searches whether an algorithmic measurement was successful.
Many searches are easy to get right, such as when someone comes to the search engine and queries Facebook. Then there are searches for information on pharmaceuticals, for example, where a wrong answer could put someone’s health at risk.
That risk is one reason that Google has banned the distribution of Adderall and stimulants by telehealth companies on its search engine. The Wall Street Journal last week reported Google banned Done Global from running ads on its platform amid a widening federal crackdown.
TikTok, which is trying to capture more of Google’s search audience, did the same.
Tucker said segmentation in the Thai language was the most difficult to tackle, because people do not typically put spaces between words.
People did not put spaces in their searches, and documents on webpages didn’t have spaces. Matching is much more difficult, especially keyword matching, when there are no spaces between words.
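To illustrate the segmentation problem, here is a minimal sketch of greedy longest-match segmentation against a word list. It is an assumption for illustration, not a description of how Google segments Thai: the function `segment`, the tiny vocabulary, and the use of unspaced English as a stand-in for Thai are all hypothetical.

```python
def segment(text: str, vocabulary: set[str]) -> list[str]:
    """Greedily split unspaced text into the longest known words."""
    longest = max(map(len, vocabulary), default=1)
    tokens: list[str] = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a word matches.
        for size in range(min(longest, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in vocabulary:
                break
        else:
            candidate = text[i]  # no known word starts here: emit one character
        tokens.append(candidate)
        i += len(candidate)
    return tokens


# Hypothetical vocabulary; unspaced English stands in for Thai text.
vocabulary = {"easy", "rose", "roses", "to", "grow"}
print(segment("easyrosestogrow", vocabulary))
# → ['easy', 'roses', 'to', 'grow']
```

Keyword matching can only run once both the query and the page have been split this way, and a greedy split can choose the wrong boundary when several words share a prefix, which is part of why unspaced text is harder to match.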