Monday, August 20, 2007

Why Don't We Just Use Google?

I get this question about once a week. The reasons for not using Google inside the company I work for* are so numerous that I tend to shrug off the question nowadays. But it probably deserves answering fully at least once.

I understand why they ask: it seems so easy to find information out on the internet. Why is it so complicated and difficult inside the corporate firewall, where there are so many people (including myself) working to make it accessible?

Why Not?

The argument goes something like this:

  • Much of our information inside the firewall is critical business data and is not made available to every Tom, Dick, and Harry (even within the company), so it cannot be crawled by a generic search engine.
  • We can't have our employees wasting their time searching through page after page of search results. We need to provide a "better search" tuned to their needs. This better search means:

    1. Only indexing "quality" content -- that which is deemed part of the official corporate intranet -- not cluttering the results with unstructured information such as discussions, forums, blogs, personal sites, etc.
    2. Qualifying certain content as "best bets" (or whatever you like to call them) -- so the right answer shows up first and highlighted.
    3. Providing custom search interfaces for specific types of data -- such as customer testimonials, employee records, market data, sales collateral, etc.
  • Much of the important business information is in special databases and applications, such as SAP, Documentum, SharePoint, Lotus Notes... name your favorite business app. Therefore, you must use that application's UI and custom search (see above) to find and access it.
  • We've already spent significant resources in both time and money creating the high quality search environment we have (see above). We can't afford to throw away more money and start over.
  • Finally, we're not the group responsible for the corporate intranet and search, so we couldn't do anything about it even if we wanted to. Besides, we'd get our wrists slapped if we went out and bought and installed a competitor to the corporate solution.

OK. So that's the argument. Is it valid? Well, the part about restricted access and the Google Appliance** having trouble crawling content it cannot reach is true enough. It is a technical limitation that creates a problem for any proposed search solution (more on this later).

As for the "better search" argument, this is wrong on at least two points. The first is so obvious it barely needs stating, but perhaps it is its obviousness that makes it hard for the proponents of intranet search solutions to see. That is, if your custom search is so much better, why do people keep suggesting alternatives out of frustration?

Sure, there's always a certain number of naysayers to any decision, change, or technology within a company. But this isn't just naysaying. The people asking the question are honestly saying "I can do this better somewhere else -- better, faster, and easier. Why is it so difficult inside the firewall?" Answering "but we're better" may convince management, but it doesn't win over the users.

The second reason this is wrong is more complex. It involves the rationale for "better" and the assumptions that underlie it. They claim they are better because they get the users to the right answer more directly. Of course, if that were true we wouldn't hear so many complaints (point #1), but more importantly, there are two key assumptions here that deserve attention.

  • The first assumption for this to be true is that they (the designers of the search interface) know what the right answers are. That in itself is questionable.
  • The second assumption -- a priori -- is that, if they know the right answers, they must also know what the questions will be!

These assumptions justify decisions like restricting what content is indexed and excluding "noise" such as forums and blogs. But from a knowledge management perspective, these "noisy" channels are where the true nuggets of wisdom and experience are shared! This is one of the longstanding dilemmas of KM: whether to focus on refined knowledge (often referred to as "explicit knowledge" or "best practices") or to support the messier knowledge-in-action of forums and distribution lists where tacit knowledge (the hard-won tidbits of wisdom from experience) becomes explicit through the interaction of practitioners.

By eliminating the "clutter" of unstructured knowledge, the enterprise search reshapes itself as a qualified but sterile channel for "approved" knowledge, not what people need to answer the specific questions that arise in their day-to-day work. And not what they have come to expect from a global search engine.

I say it is a dilemma because it is not a matter of picking one over the other: explicit vs. implicit, structured vs. unstructured. Both have their place and need to be supported -- and findable. And when the "corporate" search solution excludes one or the other, it is very difficult for KM or IA to recover the necessary balance.

So, what about the rest of the argument, custom searches and special applications? Yes, it is true these exist. But why are these separate from, and mutually exclusive with, global search? Even if these custom searches and UIs exist, why can't the standard corporate search also return appropriate results from these databases?

The answer to that question is two-fold. The first part goes back to argument #1: often these custom applications and databases have restrictive access permissions that don't allow them to be crawled. The second part is a conceit similar to the argument that the current search is "better"; the owners and developers of these applications feel their interfaces are better and see no need to expose their data for generalized crawling.

The problem with this attitude is that it puts the onus on the user to know that the data exists and to go find it. As a KM professional, I am familiar with many of the resources within the company, but that's my job. The average employee is pretty much in the dark. Why should they be expected to know the contents and whereabouts of every website and database in the company?

The severity of this problem was brought home to me recently. I am not responsible for corporate search, or for the content in many of these special repositories. But I am responsible for the knowledge architecture for my division and was aware of the problem our employees were having finding knowledge.

We couldn't "just use Google" and we didn't have the resources to do a federated search (which would be another alternative). Instead, I built a simple JavaScript-based search interface: just a text box and a pull-down menu asking which repository you want to search. The results are displayed in a frame so the search interface stays visible, in case you want to switch to a different resource.
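The core of such a tool is small: a table mapping each repository to the URL pattern of its own search page, plus a function that drops the user's query into the chosen pattern. Here is a minimal sketch of that logic, written in Python rather than the JavaScript of the original tool. The repository names, hostnames, and URL patterns are hypothetical placeholders, not the real ones.

```python
from urllib.parse import quote_plus

# Hypothetical repositories and search-URL patterns (assumptions for
# illustration). "{q}" marks where the encoded query text goes.
REPOSITORIES = {
    "Corporate intranet": "http://intranet.example.com/search?q={q}",
    "Discussion forums": "http://forums.example.com/find?text={q}",
    "Sales collateral": "http://sales.example.com/docs/search?keywords={q}",
}


def build_search_url(repository: str, query: str) -> str:
    """Return the search URL for the chosen repository and query text."""
    try:
        pattern = REPOSITORIES[repository]
    except KeyError:
        raise ValueError(f"Unknown repository: {repository!r}") from None
    # quote_plus encodes spaces as "+" and escapes characters such as "&".
    return pattern.format(q=quote_plus(query))


if __name__ == "__main__":
    # The pull-down menu would list the keys of REPOSITORIES, and the
    # results frame would load the URL returned here.
    print(build_search_url("Discussion forums", "printer setup & drivers"))
```

Nothing is consolidated or ranked here. Each repository still runs its own search; the tool only saves the user from having to know where each search page lives.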

Very simple. Crude even. But amazingly popular. It doesn't consolidate results; it doesn't do anything more than many of our KM websites, which already list all these resources in one way or another. Except that it is small and concise, and it lets the user take action. I was surprised how enthusiastic our users were for this tool. Which goes to show how little would be needed to help them...

What if?

So, back to the point. If you hadn't guessed, I think many corporate intranets would be significantly improved if they used a commercial search engine like Google. But... it is not as simple as just installing hardware and software.

The primary argument against commercial crawlers is a small part technical and a large part cultural. That is argument #1: the attitude towards information as "intellectual property" that has to be protected against misuse by the company's own employees (or contractors, or partners, or customers...). This attitude leads to an intranet that operates like a building full of locked rooms with no signs on the doors. (Funnily enough, not unlike the physical offices of many corporations I have visited...) It confounds the ability to use the technology, like Google, that makes the public internet the amazing resource it is.

To use Google, or any other commercial search engine, the company as a whole -- starting from the very top and going through all levels of management -- has to believe that information only has value when it is used, not when it is locked up. There is no inherent value in information, only in what you can do with it. And providing the mechanisms to make it accessible proportionally increases its value to the company. This includes:

  • Making information read-accessible to all intranet users
  • Making RSS feeds, XML representations and other open interfaces for databases and business applications as important as their own custom UIs, so the content can be crawled and reused on a broader scale
  • Crawling all content, including content created by individuals through discussions, forums, personal blogs, etc.

Are these strictures a panacea? No. Other, more focused activities are also needed. (Notice that I didn't say abandon special applications and custom UIs, just open them up.) But search is such a fundamental, rudimentary activity for knowledge management that crippling it, as is so often done, sets you off on the wrong foot and puts far more pressure on the other solutions to "get it right". (Which, to be truthful, they are not likely to do on their own.)
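To make "open them up" concrete, here is a minimal sketch of exposing records from a business database as an RSS 2.0 feed that a crawler or feed reader could pick up. It is in Python and uses only the standard library. The function name, the record fields (title, url, summary), and the example URLs are assumptions made for illustration; a real application would map its own fields.

```python
import xml.etree.ElementTree as ET


def records_to_rss(title: str, link: str, description: str, records: list) -> str:
    """Render a list of record dicts as a minimal RSS 2.0 feed string."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires a title, link, and description for the channel.
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for record in records:
        # Hypothetical record fields: "title", "url", "summary".
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = record["title"]
        ET.SubElement(item, "link").text = record["url"]
        ET.SubElement(item, "description").text = record["summary"]
    # ElementTree escapes characters such as "&" in the text content.
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    sample = [
        {
            "title": "Customer testimonial: Q&A with a long-time client",
            "url": "http://testimonials.example.com/items/1",
            "summary": "Why the client renewed their contract.",
        }
    ]
    print(records_to_rss(
        "Customer testimonials",
        "http://testimonials.example.com/",
        "Latest entries from the testimonials database",
        sample,
    ))
```

The custom UI stays exactly as it is. The feed is simply a second, open view of the same content, one that a crawler can reach without knowing anything about the application behind it.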

And on a pragmatic level, is this approach even feasible? Perhaps not. Changing an entire corporate culture is an unlikely event. So this ideal situation is more likely to happen in small companies where top management has a clear vision, or in new startups.

Or Else...

So, what are the alternatives? If you can't change corporate culture (and that is a pretty tall order for anyone, including a CEO), what can you do to improve the situation?

Even small steps can have a significant impact, just as my crude JavaScript-based search consolidation did for the employees I work with. Some things you can do are:

  • Start by getting the "noisy" stuff indexed. If they won't include it in the primary index, look into including it as a separate scope within the same interface.
  • If your corporate search isn't working for you, notify your users of the alternatives. Make a list of all of the best resources you know of within the corporation for searching. As messy as the list might be, it will help somebody.
  • Don't create any new silos! If you manage content, make sure it is accessible and work to get it indexed by the corporate search engine (and any other search engines you know of internally).


*Footnote: I am using the company I work for as an example, but I know for a fact from talking to others that the same questions plague large corporations both nationally and globally.

**Footnote: Just as my company is a placeholder for pretty much any large corporation, Google is a placeholder for any good, commercial crawler-based search engine. Fill in the blank with your favorite...
