
Automated Debugging for Data-Intensive Scalable Computing

Technical Advantages
BIGSIFT improves the accuracy of fault localization by several orders of magnitude (10³–10⁷×) compared to Titian data provenance
Improves performance by up to 66× compared to Delta Debugging
Able to localize fault-inducing data within 62% of the original job running time for each faulty output
Technical Applications
Debugging for Data-Intensive Scalable Computing (DISC) systems
Detailed Technology Description
Researchers at UCLA have developed a new faulty data localization approach called BIGSIFT, which combines insights from automated fault isolation in software engineering and data provenance in database systems to find a minimum set of failure-inducing inputs. BIGSIFT redefines data provenance for the purpose of debugging using a test oracle function and implements several unique optimizations, specifically geared towards the iterative nature of automated debugging workloads.
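To illustrate the kind of test-oracle-driven input minimization that BIGSIFT builds on, the sketch below implements a simplified delta-debugging (ddmin-style) loop in Python. This is not BIGSIFT's actual API; the oracle `test_fails` and the example data are hypothetical stand-ins for a user-supplied predicate over a job's output.

```python
# Minimal sketch of test-oracle-driven input minimization in the style of
# delta debugging (ddmin). Hypothetical illustration only -- not BIGSIFT code.

def ddmin(records, test_fails):
    """Shrink `records` to a smaller subset that still makes the oracle fail."""
    n = 2  # number of chunks to split the input into
    while len(records) >= 2:
        chunk_size = len(records) // n
        reduced = False
        for i in range(n):
            # Try removing one chunk, i.e. keep the complement.
            complement = records[:i * chunk_size] + records[(i + 1) * chunk_size:]
            if test_fails(complement):
                records = complement       # failure persists on the smaller input
                n = max(n - 1, 2)
                reduced = True
                break
        if not reduced:
            if n >= len(records):
                break                      # cannot split any further
            n = min(n * 2, len(records))   # increase granularity and retry

    return records

# Hypothetical oracle: the "job" fails whenever a negative record is present.
failing_input = ddmin(list(range(-1, 10)), lambda rs: any(r < 0 for r in rs))
# `failing_input` shrinks to just the single failure-inducing record, [-1].
```

Each iteration re-runs the job through the oracle, which is why a naive search is expensive on DISC-scale data and why BIGSIFT's optimizations target this iterative workload.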
*Abstract
UCLA researchers in the Department of Computer Science have developed BIGSIFT, a new faulty data localization approach that combines insights from automated fault isolation in software engineering and data provenance in database systems to find a minimum set of failure-inducing inputs.
*Principal Investigators

Name: Muhammad Ali Gulzar

Department:


Name: Miryung Kim

Department:

Other

State Of Development

BIGSIFT is ready to be used with DISC systems.


Background

Data-Intensive Scalable Computing (DISC) systems draw valuable insights from massive data sets to help make business decisions and scientific discoveries. As on other software development platforms, developers often deal with program errors and incorrect inputs that require debugging. When errors (e.g., program crashes, outlier results) arise, developers often have to go through a lengthy and expensive process of manual trial-and-error debugging to identify a subset of the input data that reproduces the problem.

Current approaches such as Data Provenance (DP) and Delta Debugging (DD) are not suitable for debugging DISC workloads because 1) DD does not consider the semantics of data-flow operators and thus cannot prune input records known to be irrelevant; 2) DD’s search strategy is iterative, which is prohibitively expensive for the large datasets processed by DISC systems; 3) DP over-approximates the scope of failure-inducing inputs by assuming that all intermediate inputs mapping to the same key contribute to the erroneous output.
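The over-approximation in point 3 can be made concrete with a toy example: when an aggregate output is faulty, key-based provenance traces it back to every input record sharing that key, even if only one record is actually to blame. The sketch below is an illustration under assumed data, not BIGSIFT or Titian code.

```python
# Toy illustration (hypothetical data, not BIGSIFT code) of why key-based
# data provenance over-approximates failure-inducing inputs.
from collections import defaultdict

records = [("sensor_a", 3), ("sensor_a", 4), ("sensor_a", -999),  # one outlier
           ("sensor_b", 5), ("sensor_b", 6)]

# Group-by-key aggregation, as a data-flow operator would compute it.
groups = defaultdict(list)
for key, value in records:
    groups[key].append(value)
sums = {key: sum(vals) for key, vals in groups.items()}

# Test oracle applied to the output: a negative total is erroneous.
faulty_keys = {key for key, total in sums.items() if total < 0}

# Provenance-style trace-back: every input mapping to a faulty key is blamed.
blamed = [rec for rec in records if rec[0] in faulty_keys]
# `blamed` contains all three "sensor_a" records, although only the
# (-999) outlier actually caused the erroneous output.
```

BIGSIFT's use of the test oracle during trace-back is what lets it narrow this set down toward the single offending record instead of the whole key group.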

For complex DISC systems, it is therefore crucial to equip developers with toolkits that can better pinpoint the root cause of an error.


Related Materials



Tech ID/UC Case

29154/2018-151-0


Related Cases

2018-151-0

Country/Region
United States
