Tuesday, June 12, 2007
As long as the first page of results contains good information, the additional retrieval tools are not helpful. In every head-to-head comparison, Google had the better result.
The other tools were better when I was doing a known-item search for a specific, unpopular page. This is an unlikely scenario on the Internet, but quite likely on an intranet.
From this, my main focus in enterprise search will be improving the first page of results, rather than the more interesting - and sexy - information retrieval tools. When I hit the point of diminishing returns with relevance tuning, then I will turn to the more interesting IR items.
I'd like to see how the Google appliance would do on our content, given these results. The issues are: the Internet is homogeneous content, while our content is heterogeneous; there are no strong cross-linking patterns within our content; and the breadth of our content is much smaller. Our users will also tend to be looking for a known item, not general information on a particular topic.
Wednesday, May 23, 2007
Enterprise search: Why it’s a crisis and why Googzilla will strike by ZDNet's Larry Dignan -- Enterprise search is a mess and technology managers–as well as the vendors selling them stuff–are to blame. In the end, Google will take over the enterprise. Those were just some of the takeaways from Stephen Arnold, managing director at ArnoldIT.com. Arnold spoke at the Enterprise Search Summit in New York. Here’s his list of why [...]
Read this article; it is an interesting take on the future of enterprise search. The main point seems to be that people implementing enterprise search are lazy and don't do the work they need to do for a good implementation.
I think Steve Arnold said as much last year, and the year before . . .
Wednesday, April 11, 2007
Friday, March 30, 2007
--- Technically, for those who care, I am using an F1 measure with equal balance between precision and recall, because at this time I am not sure which the user population prefers. I am also measuring precision and recall across the top 25 and top 200 results. My ideal sets are sets of 25 documents, culled from a possible 1.5 million documents using queries generated by the "experts". ---
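For anyone who wants to see the arithmetic, here is a minimal sketch of that kind of measurement in Python. It is not the actual tooling I use; the function name, the document IDs, and the tiny example sets are all made up for illustration. It assumes precision is computed over the results actually returned within the cutoff, and recall over the full ideal set.

```python
def precision_recall_f1(retrieved, ideal, k):
    """Score the top-k results of one query against an ideal set.

    retrieved: ranked list of document IDs returned by the search engine
    ideal:     set of document IDs the experts judged relevant
    k:         cutoff (e.g. 25 or 200)
    Returns (precision, recall, f1). Hypothetical helper for illustration.
    """
    top_k = retrieved[:k]
    # Count each relevant document once, even if the engine repeats it.
    hits = len(set(top_k) & set(ideal))
    # Assumption: precision is taken over the results actually returned.
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(ideal) if ideal else 0.0
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    # F1 is the harmonic mean, weighting precision and recall equally.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up example: 4 ideal documents, 6 results returned.
ideal = {"d1", "d2", "d3", "d4"}
retrieved = ["d1", "x1", "d2", "x2", "x3", "d3"]

for k in (4, 6):
    p, r, f = precision_recall_f1(retrieved, ideal, k)
    print(f"k={k}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

With a real ideal set of 25 documents, the same function would be called with k=25 and k=200. Note that with only 25 relevant documents, precision at 200 can never exceed 25/200, so the two cutoffs are not directly comparable.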
So anyway, I've been using all of our tools to try and find good articles on f-measure. Generally, I have found lots of web sites with f and measure near each other, but no good hits on the first page of results. The search engine Hakia did significantly better. It brought back only documents about the statistical tool known as F-measure. ONLY documents that were about the topic, no documents that are not about the topic! Do you know how rare that is? OutSTANDING!
Thursday, March 29, 2007
Quintara has an annoying usability issue. If you accidentally mouse over an item on the left that you are not interested in, too bad. You're getting new results. Couldn't they make it take effect with an onclick rather than a hover?
Wednesday, March 28, 2007
Monday, March 19, 2007
I was writing an email to a friend. When last we spoke, he had mentioned possibly buying a small plane for his personal use. I wanted to ask about the plane, but I could not remember the name of the plane he was considering. I thought the cloud tool might let me "berry pick" my way to the right company. I started with airplane, no luck. I then tried sport aviation, no luck. No matter which starting keywords I used, the cloud never seemed to lead to names of companies. I eventually gave up. I went to Wikipedia and delved into the general aviation section until I found a list of manufacturers.
It is a good thing I committed to a month, otherwise I might not use this tool again.