Multi-Relational Data Characterization by Tensors: Perturbation Analysis
IEEE Transactions on Knowledge and Data Engineering
Data perturbation is a common problem in data processing. Noisy or misleading data, which may arise from real-world collection or model imprecision, is often unavoidable. Moreover, where data privacy is a concern, data perturbation is a prevalent data-protection approach: it alters individual records in such a way that the summary statistics remain more or less the same. Since many data-mining problems can be formulated as tensor equations that characterize multi-relational data, the main focus of this work is a new perturbation analysis of tensor equations. Our recent study on tensor inversion proposed a new mathematical framework for inverting an arbitrary tensor, which the existing iterative algorithms cannot always do. In this work, we establish a theoretical tensor-perturbation analysis that quantifies query performance, in terms of the normalized error norm, with respect to the perturbation degree and the condition number. The condition number can be taken as a new measure of how the solution of a tensor equation varies as its entries are perturbed. Finally, information-retrieval experiments are carried out on both artificial and real data to study the perturbation of the solutions to tensor equations.
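The role of the condition number described in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's framework or its bounds: it assumes a small, randomly generated fourth-order tensor A (hypothetical data), unfolds it into a square matrix, and checks only the classical linear-algebra inequality that the relative error in the solution is at most the condition number times the relative perturbation of the right-hand side. The shapes, the seed, and the perturbation size are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 4th-order tensor A that maps a (2, 3) array X to a (2, 3) array B.
shape = (2, 3)
n = int(np.prod(shape))
identity = np.eye(n).reshape(shape + shape)
A = rng.standard_normal(shape + shape) + 3.0 * identity

# Build a right-hand side from a known solution: B = A * X (contract over X's axes).
X_true = rng.standard_normal(shape)
B = np.tensordot(A, X_true, axes=X_true.ndim)

# Condition number of the unfolded (matricized) operator in the spectral norm.
kappa = np.linalg.cond(A.reshape(n, n), 2)

# Solve the unperturbed and the perturbed tensor equations.
dB = 1e-6 * rng.standard_normal(shape)
X = np.linalg.tensorsolve(A, B)
X_pert = np.linalg.tensorsolve(A, B + dB)

# Normalized error norm versus perturbation degree (Frobenius norms).
rel_err = np.linalg.norm(X_pert - X) / np.linalg.norm(X)
rel_pert = np.linalg.norm(dB) / np.linalg.norm(B)
bound = kappa * rel_pert

print(f"condition number        : {kappa:.3e}")
print(f"relative perturbation   : {rel_pert:.3e}")
print(f"relative solution error : {rel_err:.3e}")
print(f"classical upper bound   : {bound:.3e}")
```

In this sketch only the right-hand side is perturbed; perturbing the entries of A as well would call for a different bound, which is the kind of question the paper's analysis addresses.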
Analytical models, Data analysis, Data models, data perturbation, Mathematical model, multi-relational data, perturbation analysis, Perturbation methods, Task analysis, Tensor equations, tensor inverse, Tensors
Applied Data Science
Shih Yu Chang and Hsiao Chun Wu. "Multi-Relational Data Characterization by Tensors: Perturbation Analysis" IEEE Transactions on Knowledge and Data Engineering (2021). https://doi.org/10.1109/TKDE.2021.3087671