An Associative Internet/Intranet Search Engine
- Summary
- A computer-based method and system for establishing topic words to represent a document, the topic words being suitable for use in document retrieval. The method includes determining document keywords from the document; classifying each of the document keywords into one of a plurality of pre-established keyword classes; and selecting words as the topic words, each selected word coming from a different pre-established keyword class, so as to minimize a cost function on the proposed topic words. The cost function may be a measure of dissimilarity, such as cross-entropy, between two distributions of the likelihood that the document keywords appear in a typical document: a first distribution, and a second distribution that is approximated using the proposed topic words. The cost function can also serve as a basis for ranking documents by priority.
- Application No.
- 97/ENG/021
- Inventor
- Professor Wing Shing WONG, Department of Information Engineering
- Patent Status
- US Patent no. 6,128,613; Chinese Patent no. ZL 98 102672.9; HK Standard Patent no. HK1020096
- Country/Region
- Hong Kong
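
The selection step described in the summary can be illustrated with a minimal sketch. It is not the patented implementation: the keyword classes, the probability tables, and the way the second distribution is approximated (a uniform mixture of hypothetical per-topic-word keyword distributions) are all assumptions made for illustration, as are the function names. The sketch picks one candidate word per keyword class and keeps the combination with the lowest cross-entropy.

```python
import itertools
import math


def cross_entropy(p, q, eps=1e-9):
    """H(p, q) = -sum_k p[k] * log(q[k]); eps guards against log(0)."""
    return -sum(pk * math.log(q.get(k, 0.0) + eps) for k, pk in p.items() if pk > 0)


def approximate(topic_words, cond):
    """Assumed approximation: uniform mixture of P(keyword | topic word)."""
    q = {}
    for t in topic_words:
        for k, prob in cond[t].items():
            q[k] = q.get(k, 0.0) + prob / len(topic_words)
    return q


def select_topic_words(p, classes, cond):
    """Try one word per keyword class; return the lowest-cost combination."""
    best, best_cost = None, math.inf
    for combo in itertools.product(*classes.values()):
        cost = cross_entropy(p, approximate(combo, cond))
        if cost < best_cost:
            best, best_cost = combo, cost
    return best, best_cost


# Hypothetical data: likelihood of each document keyword (first distribution).
p = {"search": 0.4, "index": 0.1, "web": 0.3, "intranet": 0.2}

# Hypothetical keyword classes; each document keyword belongs to exactly one.
classes = {"function": ["search", "index"], "scope": ["web", "intranet"]}

# Hypothetical keyword distributions implied by each candidate topic word.
cond = {
    "search": {"search": 0.7, "index": 0.3},
    "index": {"search": 0.2, "index": 0.8},
    "web": {"web": 0.6, "intranet": 0.4},
    "intranet": {"web": 0.1, "intranet": 0.9},
}

best, cost = select_topic_words(p, classes, cond)
print(best, round(cost, 4))
```

With this toy data the search selects "search" and "web". The exhaustive search over all combinations is only practical for a handful of classes and candidates; a larger system would need a cheaper search strategy. The same cost value could also be used as a sort key when ranking documents, as the summary suggests.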