A Method for Discovering and Downloading Hidden Web Content
Automatic generation of queries for web databases with no human interaction.
Generated queries are more effective in discovering content within the web databases.
Robust system design can handle partial list returns from the web databases.
Maximizes amount of downloadable web content and uses resources efficiently.
Internet information searching
Researchers at UCLA have designed a system for searching the internet that is able to interact with web-based databases by automatically generating queries for their search pages.
Patent Number: US7685112B2
Application Number: US2006570330A
Inventor: Ntoulas, Alexandros | Cho, Junghoo | Zerfos, Petros
Priority Date: 17 Jun 2004
Priority Number: US7685112B2
Application Date: 8 Dec 2006
Publication Date: 23 Mar 2010
IPC Current: G06F001730 | G06F0015173
US Class: 707715 | 707003 | 707741 | 709238 | 715255
Assignee Applicant: The Regents of the University of California
Title: Method and apparatus for retrieving and indexing hidden web pages | Method and apparatus for retrieving and indexing hidden pages
Usefulness: Method and apparatus for retrieving and indexing hidden web pages | Method and apparatus for retrieving and indexing hidden pages
Summary: For downloading hidden web pages from web sites having site-specific search interfaces.
Novelty: Method for downloading hidden web pages over a wide area network that estimates the efficiency of each potential query term identified from already-downloaded hidden web pages and issues the next query using the term with the greatest efficiency.
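The query-selection loop described under Novelty can be illustrated with a minimal sketch. It is not the patented implementation. The `HiddenSite` class, the `crawl` and `pick_next_term` functions, the result cap, the unit cost per query, and the efficiency estimate (document frequency among pages downloaded so far, divided by cost) are all assumptions made for illustration only.

```python
from collections import Counter


class HiddenSite:
    """Toy stand-in (hypothetical) for a site reachable only through a search form."""

    def __init__(self, docs, max_results=2):
        self.docs = docs                # doc_id -> page text
        self.max_results = max_results  # cap that mimics partial result lists

    def search(self, term):
        # Return ids of pages containing the term, truncated to the cap.
        hits = [d for d, text in sorted(self.docs.items()) if term in text.split()]
        return hits[: self.max_results]

    def fetch(self, doc_id):
        return self.docs[doc_id]


def pick_next_term(downloaded, issued, cost_per_query=1.0):
    """Pick the unissued term with the highest estimated efficiency.

    Assumption: efficiency is approximated as the number of downloaded pages
    containing the term, divided by a fixed per-query cost.
    """
    doc_freq = Counter()
    for text in downloaded.values():
        doc_freq.update(set(text.split()))
    candidates = {t: n for t, n in doc_freq.items() if t not in issued}
    if not candidates:
        return None
    # Iterate in sorted order so ties are broken alphabetically.
    return max(sorted(candidates), key=lambda t: candidates[t] / cost_per_query)


def crawl(site, seed, max_queries=10):
    """Issue queries one at a time, choosing each new term from pages seen so far."""
    downloaded = {}  # doc_id -> page text
    issued = []
    term = seed
    while term is not None and len(issued) < max_queries:
        issued.append(term)
        for doc_id in site.search(term):
            if doc_id not in downloaded:
                downloaded[doc_id] = site.fetch(doc_id)
        term = pick_next_term(downloaded, issued)
    return downloaded, issued


if __name__ == "__main__":
    site = HiddenSite({
        1: "apple banana",
        2: "banana cherry",
        3: "cherry date",
        4: "date elder",
    })
    pages, queries = crawl(site, seed="apple")
    print(len(pages), queries)
```

Because each query only returns a capped list, the sketch keeps harvesting new candidate terms from whatever pages it did receive, so a truncated result still advances the crawl. A fuller estimator would also discount pages a term is likely to return that have already been downloaded.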
ICT/Telecom
Software/Application
7685112
State of Development: The invention has been prototyped, implemented and tested successfully.
Background: Current internet search engines are limited in their ability to search through web-based databases that are only accessible through front-end search pages. Although this information is hidden from the typical user, it is usually of high value.
Related Materials:
Tech ID/UC Case: 20188/2004-656-0
Related Cases: 2004-656-0
USA
