Efficient Translation from Edit Distance to Hamming Distance
- Technology Application
- More efficient data and string searching. Potentially faster music or video search engines. Improved data archiving routines. Shorter algorithm running times for proteomic and genomic searches. Potentially faster and more accurate matching of web pages to relevant advertisements displayed alongside them. Other biocomputing applications that require fast approximate matching. National security applications in which approximate patterns need to be discovered.
- Detailed Technology Description
- UCLA researchers have developed a novel method for searching and comparing sequences more efficiently. The innovation lies in a method for mapping the sequences to a new set of strings whose similarity can be compared using Hamming distances instead of edit distances. The Hamming distance between these new strings is proportional to the similarity of the original sequences and, more importantly, far more efficient algorithms already exist for searching and indexing strings under Hamming distance.
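To make the target metric concrete, here is a minimal Python sketch of Hamming distance. It is only an illustration of the measure itself, not of the patented mapping; the function name `hamming_distance` is a hypothetical choice made for this example.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        # Hamming distance is only defined for strings of the same length.
        raise ValueError("Hamming distance requires equal-length strings")
    # One pass over aligned positions: linear time, no alignment search.
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    print(hamming_distance("karolin", "kathrin"))  # 3
```

Because each position is compared only with the same position in the other string, the cost is a single linear pass, which is part of why indexing and search under this metric tend to be simpler than under edit distance.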
- Supplementary Information
- Patent Number: US8060808B2
Application Number: US2007816890A
Inventor: Ostrovsky, Rafail | Rabani, Yuval
Priority Date: 28 Feb 2005
Priority Number: US8060808B2
Application Date: 22 Aug 2007
Publication Date: 15 Nov 2011
IPC Current: H03M001300
US Class: 714777 | 714755 | 714786
Assignee Applicant: The Regents of the University of California | The TRDF Research & Development Foundation Ltd
Title: Method for low distortion embedding of edit distance to Hamming distance
Usefulness: Method for low distortion embedding of edit distance to Hamming distance
Summary: For mapping input string into output string in computational biology field.
Novelty: Input string mapping method for computational biology field, involves concatenating z-string for each sample of mapped string, and converting concatenated z-string to bit string using unary notation to form output string
- Industry
- ICT/Telecom
- Sub Category
- Software/Application
- Application No.
- 8060808
- Others
-
State of Development
The mathematical method has been proven and was accepted at the Symposium on Theory of Computing (STOC) 2005. A provisional patent has been filed.
About the Lab
This innovation was created by researchers associated with the Center for Information and Computation Security (CICS) at UCLA, which is involved in a wide range of research in cryptography and computer security. The web site for the lab is http://www.cs.ucla.edu/security/
Inventors
Dr. Rafail Ostrovsky is a Professor in the Department of Computer Science at UCLA's Henry Samueli School of Engineering and Applied Science and is the director of the Center for Information and Computation Security (CICS) at UCLA.
Background
Many data-intensive applications require computationally intensive algorithms for approximate string matching. Examples include text editors, database archiving, internet search engines, and bioinformatics applications. For example, DNA or protein sequences are routinely searched against one another to determine biological similarity. The edit distance between two strings, defined as the minimum number of character substitutions, insertions, and deletions needed to transform one string into the other, is usually regarded as one of the most accurate measures of similarity. Unfortunately, calculating edit distances across hundreds of sequences, as is often required, is extremely inefficient. Many heuristic algorithms, such as BLAST and FASTA, have been developed to overcome this inefficiency. However, the innovation disclosed here provides a faster way to handle edit distance by transforming it into a much simpler form, potentially speeding up a host of applications that need approximate matching under edit distance.
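For reference, the edit distance defined above can be computed with the textbook dynamic-programming recurrence. The Python sketch below is that standard baseline, not the method disclosed here; the function name `edit_distance` is a hypothetical choice for this example.

```python
def edit_distance(s: str, t: str) -> int:
    """Minimum number of substitutions, insertions, and deletions turning s into t."""
    # prev[j] holds the distance between the processed prefix of s and t[:j].
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]  # turning s[:i] into the empty string takes i deletions
        for j, ct in enumerate(t, start=1):
            substitution = prev[j - 1] + (0 if cs == ct else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))  # 3
    print(edit_distance("abcdef", "bcdefg"))   # 2
```

This recurrence takes time proportional to the product of the two string lengths for every pair compared, which is the cost that becomes prohibitive when many sequences must be searched against one another.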
Tech ID/UC Case
20150/2005-431-0
Related Cases
2005-431-0
- *Abstract
-
UCLA researchers have developed a new approach to more efficiently search and compare strings for approximate matching according to so-called edit distance, allowing for much more efficient search for applications such as bioinformatics, music and video search, and data backup and retrieval.
- *IP Issue Date
- Nov 15, 2011
- *Principal Investigator
-
Name: Rafail Ostrovsky
Department:
Name: Yuval Rabani
Department:
- Country/Region
- USA
