Inside Story: Unveiling the Genesis of Glean - A Leading AI-powered Enterprise Search Solution

Welcome to our new Inside Story series, where we’re going behind the scenes of some of the biggest companies in AI to find out how they were founded. Today, the spotlight is on Glean, the trailblazing AI-driven search and knowledge discovery platform catering to enterprises.

With a valuation surpassing $1 billion and backing from investors such as Kleiner Perkins, Lightspeed, and Sequoia, Glean has positioned itself as a frontrunner in the AI landscape. Its customers include Databricks, Duolingo, and Grammarly.

The company is transforming the way contemporary teams discover internal information by establishing a centralized repository for a company's data. This enables employees to quickly access any necessary document, information, or individual required for their tasks.

Humble beginnings

Like many successful tech companies, Glean grew out of a genuine workplace problem. Arvind Jain, Glean's co-founder and CEO, ran into knowledge search and discovery issues at his previous startup, Rubrik, and that experience inspired Glean.

“Rubrik grew extremely fast, reaching over 1,000 employees within four years. Despite the growth, we noticed declining productivity metrics across different departments. Through internal pulse surveys, we discovered that our team was struggling to find information they needed to be effective. Employees complained about not knowing where to find information or who to approach for help.”

Jain sought an existing solution for Rubrik but encountered a hurdle: the need for integration with over 300 applications used by Rubrik's employees and a requirement for high adoption rates.

To Jain's surprise and frustration, he found no such solution available, highlighting an industry-wide challenge. Enterprises worldwide faced the dilemma of scattered information across multiple sources, impeding employees' ability to retrieve it.

When you can’t find it, build it.

Unable to find a suitable product, Jain recognized a gap in the market and decided to build the solution himself. He appointed someone to take over his R&D responsibilities at Rubrik and began developing the concept that would eventually become Glean.

Solving a common problem

Finding information is a near-universal problem: a McKinsey survey indicates that more than a quarter of a typical knowledge worker's time is spent searching for information. Another study reveals that only 16 percent of content is accessible to other workers.

The goal is clear: companies want their employees to focus on work that is crucial to the business rather than spend time hunting for information. Building a solution, however, is difficult. Each enterprise is unique, with its own information, applications, tech stack, and personnel to consider. In addition, the lack of widespread API support over the past decade made integration with many apps, such as messaging platforms, exceptionally challenging.

The evolving technological landscape and advancements in AI have now made solutions like Glean possible. The company's success reflects the anticipation for such a solution in the market.

The Google standard

Glean’s founding team all worked at Google previously. There, they had the luxury of using Moma—a custom intranet that indexes everything used inside Google.

Chatting on the Latent Space podcast, Glean’s founding engineer, Debarghya (Deedy) Das, said:

[Moma is] one of those things where when you're at Google, you sort of take it for granted. But when you leave and go anywhere else, you think: oh my God, how do I function without being able to find things that I've worked on? I remember this guy had a presentation that he made three meetings ago and I don't remember where he shared it.

Having spent over a decade working on Google Search himself before founding Rubrik, Jain knew he wanted to build a product of similar quality—both in terms of UX and search ability.

From Idea to Product

The mission for Glean was straightforward: develop a search system akin to Google's within a company, proficient in integrating information from numerous SaaS applications.

To accomplish this, Arvind Jain assembled his founding team, enlisting three co-founding engineers: TR Vishwanath, Piyush Prahladka, and Tony Gentilcore. Persuading them to leave Google was not difficult:

“It was quite easy for me to convince people that we should go solve this problem together. It’s such a pervasive issue felt by everyone, whether they’re an engineer, product manager, salesperson, marketer, or IT person. Even up to today, there’s never been a question amongst the team on whether we’re solving an important problem; we know we are.”

Under Jain's leadership as CEO, the four co-founders commenced the development process. Within six months, the team had a functional product. Shortly thereafter, they placed it in the hands of early customers.

The backend development involved addressing several intricate components, given the inherent difficulty of the problem:

  1. Data Assembly: To build a search system, the data to be searched must first be gathered in one place by a centralized crawling system. This meant building integrations with each customer's applications to aggregate all content into Glean’s platform.

  2. Search Index: The team constructed a conventional search index by mapping search terms to relevant documents/information.

  3. Ranking: A ranking system was implemented to enable the platform to determine the best match for any given query.
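The three components above can be illustrated with a minimal sketch. It is not Glean's implementation: the document IDs, the sample texts, and the simple TF-IDF scoring are assumptions chosen only to show how assembled content, a term-to-document index, and a ranking step fit together.

```python
import math
from collections import Counter, defaultdict

# Hypothetical documents, standing in for content crawled from workplace apps.
DOCUMENTS = {
    "wiki-1": "onboarding guide for new engineers",
    "drive-7": "product manual for the billing service",
    "chat-3": "billing service outage notes",
}


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def build_index(documents):
    """Search index: map each term to the documents containing it, with counts."""
    index = defaultdict(dict)
    for doc_id, text in documents.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][doc_id] = count
    return index


def search(index, documents, query):
    """Ranking: order documents by a simple TF-IDF score summed over query terms."""
    n_docs = len(documents)
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue
        # Terms that appear in fewer documents carry more weight.
        idf = math.log(1 + n_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    # Highest score first; ties broken by document ID for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


INDEX = build_index(DOCUMENTS)
results = search(INDEX, DOCUMENTS, "billing manual")
```

Here `results` places `drive-7` ahead of `chat-3`, because the manual matches both query terms while the outage notes match only one. A production system would add many more ranking signals than term frequency.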

However, what set Glean apart wasn't just these components, crucial as they were. Using the transformer-based language models that Google had released openly, the Glean team was able to generate embeddings and build semantic search. In 2019, this approach was revolutionary. Within the product, it translated into a significantly better user experience and more precise search results. For instance, if a worker typed "show me the product manual for X" into Glean, the technology would surface user guides, team manuals, and product playbooks for X based on meaning rather than exact wording, surpassing the limitations of the traditional keyword-based matching and QR-based search employed by many competitors at the time.
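To make the contrast with keyword matching concrete, here is a toy sketch of embedding-based search. The three-dimensional word vectors are hand-written assumptions standing in for the output of a real transformer model, and the document IDs are hypothetical; only the mechanics (embed the query, embed the documents, rank by cosine similarity) reflect the idea described above.

```python
import math

# Hand-written vectors standing in for a transformer model's embeddings.
# Hypothetical dimensions: [documentation, finance, social].
WORD_VECTORS = {
    "manual": [1.0, 0.0, 0.0],
    "guide": [0.9, 0.0, 0.1],
    "playbook": [0.8, 0.1, 0.1],
    "invoice": [0.0, 1.0, 0.0],
    "budget": [0.1, 0.9, 0.0],
    "party": [0.0, 0.0, 1.0],
}

DOCS = {
    "doc-a": "user guide",
    "doc-b": "quarterly budget",
    "doc-c": "office party",
}


def embed(text):
    """Average the vectors of known words; unknown words are ignored."""
    vectors = [WORD_VECTORS[w] for w in text.lower().split() if w in WORD_VECTORS]
    if not vectors:
        return [0.0, 0.0, 0.0]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_search(query, docs):
    """Return document IDs ordered by similarity to the query embedding."""
    query_vector = embed(query)
    return sorted(docs, key=lambda doc_id: -cosine(query_vector, embed(docs[doc_id])))


ranked = semantic_search("product manual", DOCS)
```

The query "product manual" shares no words with "user guide", so a pure keyword index would miss it, yet `doc-a` ranks first because the two phrases sit close together in the embedding space.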

When creating a new product, SaaS startups often prioritize getting their Minimum Viable Product (MVP) in front of customers early and iterating based on feedback. Search products, however, rarely have that luxury.

“The first impression that a user has with your search product matters. They’re going to come and ask a question, and if they don't find the right answer, they’ll be turned off. They might never come back. So you have to think carefully about how you deliver that first experience to them. Because you don't get a second chance.”

To avoid such a fate, Arvind Jain and the team took a different approach. Instead of quickly acquiring customers, they invested a significant amount of time—around two years—meticulously building a robust product. This involved gaining insights into how enterprises operate, understanding the critical company knowledge, and determining how to provide users with the most relevant, up-to-date, and useful documents based on a given search. The team collected numerous signals from clients, fostering a deep understanding of every individual at companies using Glean.

In the initial stages, Jain allowed his earliest customers to use Glean for free:

“Initially, we let people we already knew play around with the product without paying for it. We did a lot of that. But friends will always support you. I wanted real feedback. So I then spent a lot of time cold-connecting with people to build up a database of folks who had a genuine interest in the problem we were solving. I was our only salesperson. We built our early pipeline that way.”

Beyond these initial customers and friends, however, persuading users to pay for Glean was hard. First, Jain was selling a product that few purchasing managers had bought before, and enterprises had no budget allocated for "search software."

Second, the value of a search and knowledge discovery platform, meaning how it increases revenue or reduces costs, was harder to prove than for categories like customer service software, where metrics such as ticket reduction or customer satisfaction scores are easy to track.

Third, there was the hurdle of onboarding:

“Building a great product and selling it is only half the challenge. How do you actually get people to adopt it? Using Glean felt easy to me; it worked just like Google. So I couldn’t understand why people weren’t using the product as much as I expected.

“We did surveys, and half the respondents said they didn’t even know Glean was available at their company. And of those who did see the initial announcement, many said, 'Oh, yeah, I tried it. I like it. But then I forgot about it.' Onboarding customers right so they feel motivated to use your product is a major challenge.”

Addressing Enterprise-level Security Concerns

When operating at the enterprise level, robust security is an absolute necessity. Arvind Jain and his team understand this well: to work as intended, Glean requires access to all of an enterprise's data. The cost of a compromise that exposes customer data to attackers is exceptionally high, so security measures must be airtight.

Jain and his co-founders faced the additional challenge of convincing customers that Glean adhered to the highest standards of security. As a small startup running its product in the cloud, how could they persuade enterprise companies to entrust all their data to a relatively unknown provider?

One way they addressed this concern was by designing their product using a single-tenant architecture. This approach ensures that each customer runs their own instance of Glean in their individual environment. Single tenancy not only grants customers more control and customization of their instance but, critically, provides them with the highest level of security.

Today, some of the world’s largest companies, including fintechs and hedge funds, rely on Glean. They place their trust in the platform because Jain and his team have been unwavering in providing uncompromising security measures.

Customization for Every Customer

Glean is an advanced enterprise platform designed to centralize and provide access to company knowledge through AI-driven applications. The architecture is built on two primary engines:

  1. The Knowledge Engine: This serves as a storage mechanism for all of a company's data and knowledge. Operating similarly to a search engine, it incorporates components beyond typical vector databases to enhance data retrieval capabilities. The engine retrieves knowledge based on user queries, playing a pivotal role in the system.

  2. The Language Engine: Unlike the knowledge storage system, this engine focuses on reasoning, understanding user intentions, and interpreting knowledge. It deciphers user inputs and interacts with the knowledge engine to fetch relevant information.
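The division of labor between the two engines can be sketched as follows. This is an illustrative assumption, not Glean's API: the class names, the stop-word list, and the keyword-overlap retrieval are hypothetical stand-ins, and the step where a real system would call a language model is replaced by a stub that quotes the top document.

```python
from dataclasses import dataclass


@dataclass
class Document:
    doc_id: str
    text: str


class KnowledgeEngine:
    """Hypothetical stand-in: stores documents and retrieves by keyword overlap."""

    def __init__(self, documents):
        self.documents = list(documents)

    def retrieve(self, query, limit=2):
        terms = set(query.lower().split())
        scored = []
        for doc in self.documents:
            overlap = len(terms & set(doc.text.lower().split()))
            if overlap:
                scored.append((overlap, doc))
        # Most overlapping terms first; ties broken by document ID.
        scored.sort(key=lambda pair: (-pair[0], pair[1].doc_id))
        return [doc for _, doc in scored[:limit]]


class LanguageEngine:
    """Hypothetical stand-in: interprets the question and composes an answer."""

    STOP_WORDS = {"what", "is", "the", "our", "a", "for", "how", "do", "i"}

    def __init__(self, knowledge_engine):
        self.knowledge_engine = knowledge_engine

    def interpret(self, question):
        """Reduce a natural-language question to its content words."""
        words = question.lower().strip("?").split()
        return " ".join(w for w in words if w not in self.STOP_WORDS)

    def answer(self, question):
        query = self.interpret(question)
        docs = self.knowledge_engine.retrieve(query)
        if not docs:
            return "No relevant documents found."
        # A real system would pass the documents to an LLM here; this stub quotes the top hit.
        sources = ", ".join(doc.doc_id for doc in docs)
        return f"Based on {sources}: {docs[0].text}"


knowledge = KnowledgeEngine([
    Document("hr-1", "the vacation policy allows 25 days"),
    Document("it-4", "vpn setup steps for laptops"),
])
assistant = LanguageEngine(knowledge)
reply = assistant.answer("What is our vacation policy?")
```

Keeping retrieval behind its own interface means the answer-composition step can change (for example, swapping one language model for another) without touching how company knowledge is stored or fetched.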

Glean's foundation is rooted in the search product developed over the last 4.5 years. This product powers Glean Chat and any generative AI applications an enterprise may want to create. As a result, Glean has unintentionally become the standard enterprise generative AI platform, seamlessly consolidating company-specific knowledge for various AI applications.

Technologically, Glean's approach is unique. It uses smaller open-domain models, like the BERT family, and customizes them for each client. Customization involves training the models on the specific enterprise's data so that the system understands that company's terminology, concepts, code names, and acronyms. These fine-tuned models are then used for functions such as semantic similarity and synonym detection.

For user-facing interactions, Glean employs very large language models (LLMs) such as GPT-4, GPT-3.5, PaLM, or Llama 2. These LLMs are integrated via API and are primarily responsible for generating the AI-driven answers displayed to end users. Because they are used for summarization and synthesis, these larger models don't require further training.

In summary, Glean's robust system ensures the convergence of extensive company knowledge with sophisticated AI, providing tailored and insightful interactions for enterprise users.

Introducing Glean Chat

Fast-forward to today, and Glean is actively staying ahead of the AI curve. Earlier this year, they introduced Glean Chat, a workplace chatbot that enables employees to find information through conversation.

“We wanted to take our product to the next level. So instead of Glean just surfacing all relevant documents relating to your search query, now it can actually read those documents and synthesise an answer for you. It’s more conversational. It’s also smarter in terms of how it understands user questions and composes responses. And because users are chatting with the product, we get to have a much better insight into how they interact with the tool, what they want, and how the results they generate compare to that.”

This offering empowers Glean users to choose whether they want to find specific information, discover a list of relevant documents, or access an entire piece of content that directly addresses their question. Importantly, the responses are always tailored to the unique context of their workplace, drawing from thousands of pieces of information specific to that company.

Key Takeaways

  • In the dynamic landscape of artificial intelligence, Glean has carved out a distinctive niche as an AI-powered search and knowledge discovery platform finely tuned for enterprises.

  • Rooted in the real-world challenges faced by its founder, Arvind Jain, during his tenure at a previous startup, Glean squarely addresses the pervasive issue of information retrieval within organizations.

  • By employing innovative techniques like semantic search and leveraging large language models, Glean provides a centralized system adept at effortlessly locating and presenting internal information, thereby revolutionizing workplace efficiency.

  • The recent introduction of Glean Chat represents a significant stride forward, offering a conversational interface that transforms data into actionable insights. This underscores Glean's unwavering commitment to innovation and a user-centric approach.

Glean's revolutionary approach to harnessing AI for knowledge discovery positions it at the forefront of transforming workplace efficiency. In a world increasingly driven by AI, Glean not only sets standards but envisions the next significant leap in enterprise knowledge management.

Special thanks to Arvind Jain for his valuable contribution to this article.