A Method for Discovering and Downloading Hidden Web Content
- Technology Benefits
- Automatic generation of queries for web databases with no human interaction
- Generated queries are more effective in discovering content within the web databases
- Robust system design can handle partial list returns from the web databases
- Maximizes the amount of downloadable web content and uses resources efficiently
- Technology Application
- Internet information searching
- Detailed Technology Description
- Researchers at UCLA have designed a system for searching the internet that is able to interact with web-based databases by automatically generating queries for their search pages.
- Supplementary Information
- Patent Number: US7685112B2
Application Number: US2006570330A
Inventor: Ntoulas, Alexandros | Cho, Junghoo | Zerfos, Petros
Priority Date: 17 Jun 2004
Priority Number: US7685112B2
Application Date: 8 Dec 2006
Publication Date: 23 Mar 2010
IPC Current: G06F001730 | G06F0015173
US Class: 707715 | 707003 | 707741 | 709238 | 715255
Assignee Applicant: The Regents of the University of California
Title: Method and apparatus for retrieving and indexing hidden web pages | Method and apparatus for retrieving and indexing hidden pages
Usefulness: Method and apparatus for retrieving and indexing hidden web pages | Method and apparatus for retrieving and indexing hidden pages
Summary: For downloading hidden web pages from web sites having site-specific search interfaces.
Novelty: Method for downloading hidden web pages over a wide area network that estimates the efficiency of each potential query term identified from the downloaded hidden web pages and issues a query using the term with the greatest estimated efficiency.
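The query-selection loop described under Novelty can be illustrated with a minimal sketch. It is not the patented implementation: the in-memory `SAMPLE_DB`, the `search` stand-in for a site-specific search interface, the `limit` parameter that imitates partial result lists, and the scoring rule in `estimate_efficiency` (share of downloaded pages containing a term, divided by an assumed constant cost) are all hypothetical choices made for illustration.

```python
# Hypothetical toy "hidden" database: document id -> page text.
SAMPLE_DB = {
    1: "apple banana",
    2: "banana cherry",
    3: "cherry date",
    4: "date elder",
    5: "fig",
}


def search(database, term, limit=None):
    """Stand-in for a site search page: ids of documents containing `term`.

    `limit` imitates a site that returns only a partial result list.
    """
    hits = [doc_id for doc_id, text in sorted(database.items())
            if term in text.split()]
    return hits if limit is None else hits[:limit]


def estimate_efficiency(term, downloaded, cost=1.0):
    """Assumed heuristic: fraction of downloaded pages containing `term`,
    per unit of query cost. A real estimator would be more careful."""
    if not downloaded:
        return 0.0
    matches = sum(1 for text in downloaded.values() if term in text.split())
    return matches / len(downloaded) / cost


def crawl(database, seed, max_queries=10, limit=None):
    """Greedy loop: issue a query, harvest candidate terms from the
    downloaded pages, then issue the unused term with the best score."""
    downloaded = {}
    issued = []
    term = seed
    while term is not None and len(issued) < max_queries:
        issued.append(term)
        for doc_id in search(database, term, limit):
            downloaded[doc_id] = database[doc_id]
        # Candidate terms are words seen in downloaded pages, minus used ones.
        candidates = {w for text in downloaded.values() for w in text.split()}
        candidates -= set(issued)
        # Sorting first makes tie-breaking deterministic.
        term = max(sorted(candidates),
                   key=lambda t: estimate_efficiency(t, downloaded),
                   default=None)
    return downloaded, issued


if __name__ == "__main__":
    pages, queries = crawl(SAMPLE_DB, seed="apple")
    print(sorted(pages), queries)
```

In this sketch, a page whose words never appear in any downloaded page (document 5 in the toy data) is never reached, which shows why the choice of query terms determines how much of the database a crawler can cover.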
- Industry
- ICT/Telecom
- Sub Category
- Software/Application
- Application No.
- 7685112
State of Development
Invention has been prototyped, implemented and tested successfully.
Background
Current internet search engines are limited in their ability to search web-based databases that are accessible only through front-end search pages. Although this information is hidden from the typical user, it is usually of high value.
Related Materials
Tech ID/UC Case
20188/2004-656-0
Related Cases
2004-656-0
- *Abstract
- Researchers in the Computer Science Department at UCLA have developed a method for searching hidden web content that was previously difficult for end users to reach.
- *IP Issue Date
- Mar 23, 2010
- *Principal Investigator
- Name: Junghoo Cho
- Name: Alexandros Ntoulas
- Name: Petros Zerfos
- Country/Region
- USA
