Causal Network Algorithm for Big Data Analytics and Prediction
- Detailed Technical Description
- The invention is an algorithm that constructs non-parametric models of directional causal dependence between data streams, for use in big data analytics and prediction of entire systems such as global earthquake activity or financial markets.
- *Abstract
-
An algorithm was developed at Cornell University for big data analytics and prediction that constructs non-parametric models of the existence and degree of causal dependence between data streams. While correlation measures are used to discern statistical relationships in nearly all branches of data-driven scientific inquiry and across many industries, people are generally most interested in the existence of causal dependence. The ability to computationally infer statistical evidence of causal dependence can be a far better tool for data analysis than simple correlations.
The new algorithm precisely computes the degree of causal dependence between multiple time-synchronized streams of data, without making any restrictive assumptions and with no prior knowledge required. Further, the algorithm computes the coefficient of causality for each data stream to capture the directional flow of causal influence among the different input streams. The network models of causal cross dependence constructed from the algorithm (i.e., the Causality Network) can be used for prediction, rather than a simple analysis of past data.
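To make the idea of a directional dependence score between two time-synchronized streams concrete, here is a minimal Python sketch. It is not the Cornell algorithm, whose details this listing does not give. It substitutes a much simpler, well-known measure: plug-in transfer entropy with a history length of one. The function names `conditional_entropy` and `transfer_entropy` and the toy streams `a` and `b` are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter
from math import log2


def conditional_entropy(pairs):
    """Estimate H(Y | X) in bits from a list of (x, y) observations."""
    joint = Counter(pairs)
    context = Counter(x for x, _ in pairs)
    n = len(pairs)
    return -sum((c / n) * log2(c / context[x]) for (x, _), c in joint.items())


def transfer_entropy(source, target):
    """Plug-in transfer entropy source -> target, history length 1, in bits.

    Stand-in measure for illustration only (assumption): it asks how much
    knowing source[t] reduces uncertainty about target[t + 1] beyond what
    target[t] already tells us.
    """
    steps = range(len(target) - 1)
    own_past = [(target[t], target[t + 1]) for t in steps]
    joint_past = [((target[t], source[t]), target[t + 1]) for t in steps]
    return conditional_entropy(own_past) - conditional_entropy(joint_past)


# Toy data (assumption): stream b is stream a delayed by one step,
# so influence should appear in the direction a -> b but not b -> a.
rng = random.Random(0)
a = [rng.randint(0, 1) for _ in range(2000)]
b = [0] + a[:-1]

te_a_to_b = transfer_entropy(a, b)
te_b_to_a = transfer_entropy(b, a)
print(f"a -> b: {te_a_to_b:.3f} bits, b -> a: {te_b_to_a:.3f} bits")
```

In this toy setup the score from `a` to `b` should come out near one bit and the reverse score near zero. That asymmetry is the kind of direction-specific signal the description refers to. Computing such a score for every ordered pair of streams would give a directed, weighted graph loosely analogous to a causality network.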
Potential Applications
- Predicting global seismic events with greater accuracy than currently available earthquake prediction methods
- Predicting and modeling financial time series, e.g., devising strategies for high-frequency trading and portfolio management
- Use in network security, e.g. finding unauthorized communications between computers, detecting cyber intrusions and modeling social activity networks
- Useful in medical applications, e.g., modeling molecular evolution of retroviral genomes such as the mutating HIV genome, modeling the epidemiology of different diseases, and brain mapping, including predicting how neurons connect
Advantages
- Powerful capability for analyzing data and predicting effects on entire systems
- Quantifies the degree of causal influence between observed data streams without presupposing any particular model structure
- Requires no prior knowledge or assumptions
- Efficient, optimized software is easy to run
Publications
· Chattopadhyay, I. (2014). Causality Networks, arXiv:1406.6651
· Chattopadhyay, I. & Lipson, H. (2014). Distilling Evidence of Long-Range Direction-Specific Causal Cross-Talk in Molecular Evolution of Retro-Viral Genomes. Twenty-Eighth AAAI Conference on Artificial Intelligence.
- *Licensing
- Carolyn A. Theodore, cat42@cornell.edu, 607 254 4514
- Other
-
- Country/Region
- United States
