Introduction à l’approche symbolique en analyse des données. (Introduction to data analysis by symbolic approach). (French) Zbl 0673.62003

Summary: The aim of this paper is to define the symbolic approach in data analysis and to show that it extends data analysis to more complex data, which may be closer to multidimensional reality. We introduce several kinds of symbolic objects (“events”, “assertions”, and also “hordes” and “synthesis” objects), which are defined by a logical conjunction of properties concerning the variables. They can, for instance, take several values for the same variable, and they are well adapted to the case of missing and nonsensical values. Background knowledge may be represented by “pyramidal taxonomies” and “affinities”. In clustering, the problem remains to find inter-class structures such as partitions, hierarchies and pyramids, but on symbolic objects instead of classical ones.
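
As a rough illustration of an assertion as a conjunction of events, the following minimal Python sketch may help. The encoding (a dictionary mapping each variable to a set of admissible values), the names `satisfies` and `extension`, and the toy data are all assumptions made here for illustration; the summary does not give the paper's formal definitions, and the handling of missing values shown is only one possible convention.

```python
from typing import Dict, FrozenSet, List

# Hypothetical encoding (not taken from the paper): an "event" pairs one
# variable with a set of admissible values; an "assertion" is a conjunction
# of events, stored as {variable: admissible values}.
Assertion = Dict[str, FrozenSet[str]]
Individual = Dict[str, str]


def satisfies(individual: Individual, assertion: Assertion) -> bool:
    """Return True when every event of the assertion holds for the individual.

    Assumption: a variable missing from the individual fails its event.
    """
    return all(individual.get(var) in allowed for var, allowed in assertion.items())


def extension(individuals: Dict[str, Individual], assertion: Assertion) -> List[str]:
    """Names of the individuals that satisfy the assertion."""
    return [name for name, desc in individuals.items() if satisfies(desc, assertion)]


# Illustrative data only.
individuals = {
    "a": {"colour": "red", "size": "small"},
    "b": {"colour": "blue", "size": "large"},
    "c": {"colour": "green", "size": "small"},
}

# One assertion taking several values on the same variable.
assertion = {
    "colour": frozenset({"red", "blue"}),
    "size": frozenset({"small", "large"}),
}

print(extension(individuals, assertion))  # ['a', 'b']
```

Storing a set of values per variable is what lets a single object describe several values for the same variable, as the summary mentions.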
Symbolic data analysis rests on several principles: accuracy of the representation, coherence between the kinds of objects used as input and output, predominance of knowledge in driving the algorithms, and self-explanation of the results. We define the notions of order, union and intersection between symbolic objects, and we show that they are organised according to an inheritance lattice. We study several kinds of qualities of symbolic objects, of classes, and of classifications of symbolic objects. Finally, we propose several kinds of data analysis relating to the symbolic approach.
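
To make the idea of order, union and intersection more concrete, here is a minimal Python sketch under stated assumptions: each symbolic object is encoded as a dictionary from variables to sets of admissible values, all objects share the same variables, and the three operations are taken componentwise. These definitions and the names `is_more_specific`, `union` and `intersection` are hypothetical choices for illustration; the paper's own definitions may differ.

```python
from typing import Dict, FrozenSet

# Hypothetical encoding (not taken from the paper):
# {variable: set of admissible values}, same variables for every object.
Assertion = Dict[str, FrozenSet[str]]


def is_more_specific(a: Assertion, b: Assertion) -> bool:
    """Order a <= b: every value set of a is contained in the matching set of b."""
    return a.keys() == b.keys() and all(a[v] <= b[v] for v in a)


def union(a: Assertion, b: Assertion) -> Assertion:
    """Componentwise union: the least object above both a and b in this order."""
    return {v: a[v] | b[v] for v in a}


def intersection(a: Assertion, b: Assertion) -> Assertion:
    """Componentwise intersection: the greatest object below both a and b."""
    return {v: a[v] & b[v] for v in a}


# Illustrative objects only.
a = {"colour": frozenset({"red"}), "size": frozenset({"small", "large"})}
b = {"colour": frozenset({"red", "blue"}), "size": frozenset({"small"})}

join = union(a, b)
meet = intersection(a, b)

print(is_more_specific(a, b))     # False: a and b are not comparable
print(is_more_specific(a, join))  # True
print(is_more_specific(meet, b))  # True
```

With this encoding, the objects over a fixed set of variables form a product of power-set lattices, which is one simple way to picture a lattice in which more specific objects sit below more general ones.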


62-07 Data analysis (statistics) (MSC2010)
62H30 Classification and discrimination; cluster analysis (statistical aspects)