Wednesday, November 17, 2010

11/17/2010 Readings for 11/22/2010

The article "Current Developments and Future Trends for the OAI Protocol for Metadata Harvesting" was interesting to me, since I think that the Open Archives Initiative is such a cool idea.  I thought it was interesting that the OAI protocol can provide some access to parts of the deep web.  I didn't know that was even possible!  I also appreciated how the protocol can be applied to areas other than the one for which it was designed.

"Web Search Engines:  Part 1" was a little hard to find, but I was able to locate it for free after some effort on my part.  I wasn't about to pay 19 cents for it.  Anyway, I thought it was interesting that the number of servers for the largest search engines is estimated to be in the hundreds of thousands.  Wow!  I guess that makes sense; I'd just never really thought about everything that would be required for such a gargantuan effort.  I liked how the process of the web crawlers was explained.

"Web Search Engines:  Part 2" reminded me of the project we did for LIS 2005, so yeah...I shuddered.  I thought it was spelled out pretty clearly.  I felt that both articles presented a pretty clear view of how web search engines operate.

2 comments:

  1. And...I completely forgot to discuss the Deep Web paper. Oops! I thought the part explaining how direct queries are needed in order to access the information in the Deep Web was really interesting. I already knew that the Deep Web was huge, but I would have had no idea how it could be properly accessed.

  2. Tim, 18 cents is my absolute limit.

    The Deep Web is certainly interesting. So much depends on the query itself. About searching in general, the number of servers involved is truly impressive. I was most surprised to learn that major search engines do not pull up much of the relevant material out there. I've always found it interesting that different search engines often return wildly varying results, at least within the first page or two. I don't usually make it any further than that.
