Tuesday, June 12, 2007

Final Results

I thought I would update everyone on the final results of this study, from my perspective. For the month where I went Google free, the results were less than stellar. In general, none of the other possible approaches to search created a significant improvement over Google. Most were not as good.

As long as the first page of results contains good information, the additional retrieval tools are not helpful. When I compared the results head to head, Google had the better result each time.

The other tools were better if I was doing a known item search for a specific, unpopular page. This is an unlikely scenario on the Internet, but quite likely on an intranet.

From this, my main focus in enterprise search is going to be improving the first page of results, rather than the more interesting - and sexy - information retrieval tools. When I hit the point of diminishing returns with relevance tuning, then I will turn to the more interesting IR items.

I'd like to see how the Google appliance would do on our content, given these results. The issues are: the Internet is homogeneous content, while our content is heterogeneous; there are no strong cross-linking patterns within our content; and the breadth of our content is much smaller. People will tend to be looking for a known item, not general information on a particular topic.

Wednesday, May 23, 2007

Interesting Article on Enterprise Search

Enterprise search: Why it’s a crisis and why Googzilla will strike by ZDNet's Larry Dignan -- Enterprise search is a mess and technology managers–as well as the vendors selling them stuff–are to blame. In the end, Google will take over the enterprise. Those were just some of the takeaways from Stephen Arnold, managing director at ArnoldIT.com. Arnold spoke at the Enterprise Search Summit in New York. Here’s his list of why [...]

Read this article. It is an interesting take on the future of enterprise search. The main point seems to be that people implementing enterprise search are lazy and don't do the work they need to do for a good implementation.

I think Steve Arnold said as much last year, and the year before . . .

Wednesday, April 11, 2007

Comparing and Contrasting

I've modified the experiment somewhat, based on new questions the experience has raised. I am now running the search first in one or more of our tools and then in Google, to see if the results are significantly different, or if they are about the same. As some of you may know, I need a great deal of work done on my car. This work is expensive, so I have been looking for options. I did searches on "Porsche Exhaust", "911 SC exhaust" and what exhaust options are available on a 911 SC. I used Hakia, Kartoo and Google for these searches. Of the group, only Google came up with a good first page of results.

Friday, March 30, 2007

Just had a GREAT experience . . .

I've been doing some research on information retrieval and improvement of intranet search engines. As part of this project, I have been trying to understand what is a good, or good enough, precision, recall and f-measure.

--- Technically, for those who care, I am using an F1 measure with equal balance between precision and recall because I am not sure which the user population prefers at this time. I am also measuring precision and recall across 25 and 200 results. My ideal sets are sets of 25 documents, culled from a possible 1.5 million documents using queries generated by the "experts". ---
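For anyone who wants to see the arithmetic behind that note, here is a minimal sketch in Python of precision, recall and F1 at a cutoff, scored against an ideal set of 25 documents. It is an illustration only, not the tooling used in this project: the function name, the document IDs and the made-up ranking are all assumptions, and it assumes each document appears at most once in the ranked list.

```python
def precision_recall_f1(ranked_results, ideal_set, k):
    """Score the top-k results of one query against an ideal set.

    ranked_results: document IDs in ranked order (assumed unique).
    ideal_set: set of document IDs the experts judged relevant.
    k: cutoff, e.g. 25 or 200.
    """
    top_k = ranked_results[:k]
    hits = sum(1 for doc in top_k if doc in ideal_set)

    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(ideal_set) if ideal_set else 0.0

    # F1 is the harmonic mean, weighting precision and recall equally.
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical data: an ideal set of 25 documents and a 200-item ranking
# that returns 10 of them in the top 25 and 10 more just after.
ideal = {f"doc{i}" for i in range(25)}
ranked = (
    [f"doc{i}" for i in range(10)]
    + [f"other{i}" for i in range(15)]
    + [f"doc{i}" for i in range(10, 20)]
    + [f"other{i}" for i in range(15, 180)]
)

for cutoff in (25, 200):
    p, r, f = precision_recall_f1(ranked, ideal, cutoff)
    print(f"top {cutoff}: precision={p:.3f} recall={r:.3f} F1={f:.3f}")
```

With an ideal set of only 25 documents, precision at 200 can never exceed 25/200, so F1 at the deeper cutoff is mostly a statement about recall.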

So anyway, I've been using all of our tools to try and find good articles on f-measure. Generally, I have found lots of web sites with f and measure near each other, but no good hits on the first page of results. The search engine Hakia did significantly better. It brought back only documents about the statistical tool known as F-measure. ONLY documents that were about the topic, no documents that are not about the topic! Do you know how rare that is? OutSTANDING!

Thursday, March 29, 2007

Ask.com and Quintura issue

I started to use Ask.com a little more regularly. I noticed that it does seem to pull back a pretty different set of data than Google (naturally, they are different). But the interesting part is that I did a search on White Whale (the band). On Google, it was right near the top, but Ask brought back a lot more hits on the animal first. I may use Ask for some work-related issues and see what comes about.

Quintura has an annoying usability issue. If you accidentally mouse over an item on the left that you are not interested in, too bad. You're getting new results. Couldn't they make it take effect with an onclick rather than hover?

Wednesday, March 28, 2007

Google Labs

Okay, I had another failure today. I was trying to get to the Google Labs site, and could not remember the URL. I tried searching on Exalead for Google Labs. I found lots and lots of references to offices they opened, but no links to http://labs.google.com. AltaVista and Quintura did find the right location fairly quickly.

Monday, March 19, 2007


First day of the great search experiment. A couple of notes on Quintura. This engine uses the Yahoo results to create a search term navigation cloud. The presentation of this is fairly dynamic, updating the results list quickly as you navigate along the cloud. For the most part, I have not been using the cloud much. This is because the first page of results generally has a "good hit". When I have needed this functionality, it has been less than useful.

I was writing an email to a friend. When last we spoke, he had mentioned possibly buying a small plane for his personal use. I wanted to ask about the plane, but I could not remember the name of the plane he was considering. I thought the cloud tool might let me "berry pick" my way to the right company. I started with airplane, no luck. I then tried sport aviation, no luck. No matter which starting keywords I used, the cloud never seemed to lead to names of companies. I eventually gave up. I went to Wikipedia and delved into the general aviation section until I found a list of manufacturers.

It is a good thing I committed to a month, otherwise I might not use this tool again.

Thursday, March 15, 2007

Other Search Engines

We'll try thirty days starting 3/19 to stay Google free (Yahoo and MSN as well, but Google Free sounds pretty interesting). This isn't a knock on these engines by any means... they obviously are great at what they do. However, thinking back about 7 or 8 years ago, when someone asked you a question, what was your first reaction? To check Alta Vista, WebCrawler or Go? Today it's practically second nature to hop on Google and get what you need. It's now a tool similar to Excel, Word or Photoshop. So, can we eliminate these tools and still "function" at work and at home? One search engine we can start with is http://www.quintura.com/. As these other search engines are developed, are these new types of searching any good? Or are people content with a set of links, granted probably what they need, returned to them and nothing else? If you know of any other search engines that could be used, please let us know. We'll be watching our search habits to see what changes, if anything. Stay tuned.


Why would anyone eliminate the three best tools for finding things from their lives? We were looking at some of the more interesting search products out there, the ones that are not one of the top three: Google, Yahoo and MSN. Looking at them, my thought was "would I ever use this, really?". As long as I can use Google and get what I want in the first page or two, why would I look at something with clouds, faceted navigation, and so on? To try and force ourselves to use these new tools, we are eliminating the top three from our tools for finding stuff on the web.