Automated Debugging for Data-Intensive Scalable Computing

Technology Advantages
BIGSIFT improves the accuracy of fault localization by several orders of magnitude (10^3 to 10^7 times) compared to Titian data provenance.
Improves performance by up to 66x compared to Delta Debugging.
Able to localize fault-inducing data within 62% of the original job running time for each faulty output.
Technology Applications
Debugging for Data-Intensive Scalable Computing (DISC) systems
Detailed Technology Description
Researchers at UCLA have developed a new faulty data localization approach called BIGSIFT, which combines insights from automated fault isolation in software engineering and data provenance in database systems to find a minimum set of failure-inducing inputs. BIGSIFT redefines data provenance for the purpose of debugging using a test oracle function and implements several unique optimizations, specifically geared towards the iterative nature of automated debugging workloads.
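
The idea of pairing provenance-based pruning with delta debugging under a test oracle can be illustrated with a small sketch. This is not BIGSIFT's implementation: the toy job (`sum_by_key`), the oracle (`is_faulty`, with an arbitrary threshold of 100), and the simplified `ddmin` helper are all hypothetical, and everything runs in plain Python rather than on a DISC engine.

```python
from collections import defaultdict

def sum_by_key(records):
    """Toy stand-in for a DISC job: sum the values of each key."""
    totals = defaultdict(int)
    for key, value in records:
        totals[key] += value
    return dict(totals)

def is_faulty(value):
    """Hypothetical test oracle: an output above 100 counts as a failure."""
    return value > 100

def job_fails(records):
    """Re-run the job on a candidate subset and apply the oracle to each output."""
    return any(is_faulty(v) for v in sum_by_key(records).values())

def ddmin(records, fails):
    """Simplified delta debugging; assumes fails(records) is True on entry."""
    n = 2  # current number of partitions
    while len(records) >= 2:
        chunk = max(1, len(records) // n)
        subsets = [records[i:i + chunk] for i in range(0, len(records), chunk)]
        reduced = False
        for i, subset in enumerate(subsets):
            complement = [r for j, s in enumerate(subsets) if j != i for r in s]
            if fails(subset):          # failure reproduced by one partition
                records, n, reduced = subset, 2, True
                break
            if fails(complement):      # failure survives removing one partition
                records, n, reduced = complement, max(n - 1, 2), True
                break
        if not reduced:
            if n >= len(records):      # already at single-record granularity
                break
            n = min(n * 2, len(records))
    return records

records = [("a", 1), ("b", 2), ("a", 500), ("a", 3), ("c", 4), ("a", 2)]

# Step 1: provenance-style pruning keeps only inputs that feed a faulty output key.
faulty_keys = {k for k, v in sum_by_key(records).items() if is_faulty(v)}
candidates = [r for r in records if r[0] in faulty_keys]

# Step 2: delta debugging narrows the candidates to a small failing subset.
culprit = ddmin(candidates, job_fails)
print(culprit)
```

On this toy input, pruning leaves the four records with key "a", and the search ends with the single record ("a", 500). Each call to `job_fails` re-runs the job, which is why shrinking the candidate set before the iterative search matters.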
Abstract
UCLA researchers in the Department of Computer Science have developed BIGSIFT, a new faulty data localization approach that combines insights from automated fault isolation in software engineering and data provenance in database systems to find a minimum set of failure-inducing inputs.
Principal Investigators

Name: Muhammad Ali Gulzar

Department:


Name: Miryung Kim

Department:

Other

State Of Development

BIGSIFT is ready for use with DISC systems.


Background

Data-Intensive Scalable Computing (DISC) systems draw valuable insights from massive data sets to help make business decisions and scientific discoveries. As on other software development platforms, developers often deal with program errors and incorrect inputs that require debugging. When errors (e.g., program crashes or outlier results) arise, developers often have to go through a lengthy and expensive process of manual trial-and-error debugging to identify a subset of the input data that reproduces the problem.

Current approaches such as Data Provenance (DP) and Delta Debugging (DD) are not suitable for debugging DISC workloads because 1) DD does not consider the semantics of data-flow operators and thus cannot prune input records known to be irrelevant; 2) DD’s search strategy is iterative, which is prohibitively expensive for the large datasets that DISC systems process; 3) DP over-approximates the scope of failure-inducing inputs by treating all intermediate inputs that map to the same key as contributing to the erroneous output.
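
The over-approximation described in point 3 can be seen in a toy aggregation. The sketch below is an illustration under stated assumptions, not Titian or any real provenance API: `sum_with_lineage`, the key `sensor-7`, and the data are made up, and lineage is tracked as a plain list of input indices per key.

```python
from collections import defaultdict

def sum_with_lineage(records):
    """Sum values per key and remember which input indices fed each key."""
    totals, lineage = defaultdict(int), defaultdict(list)
    for idx, (key, value) in enumerate(records):
        totals[key] += value
        lineage[key].append(idx)
    return dict(totals), dict(lineage)

# 1,000 readings for one key, one of which is corrupted (hypothetical data).
records = [("sensor-7", 1)] * 1000
bad_index = 417
records[bad_index] = ("sensor-7", 10**6)

totals, lineage = sum_with_lineage(records)

# Backward trace from the suspicious output to its contributing inputs.
traced = lineage["sensor-7"]

print(len(traced))          # number of inputs the trace blames
print(bad_index in traced)  # the corrupted record is among them
```

The trace returns every record that shares the key, so the developer still has to search all of them for the one corrupted reading.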

For complex DISC systems, it is therefore crucial to equip developers with toolkits that can better pinpoint the root cause of an error.


Tech ID/UC Case

29154/2018-151-0


Related Cases

2018-151-0

Country/Region
United States
