Exploratory Data Analysis (EDA) is a critical step in analyzing the data from an experiment, as it helps in detecting mistakes, checking the validity of assumptions, selecting appropriate models, and so on. It uses statistical tools to investigate data sets and understand their characteristics.
The analysis starts by collecting the data into a rectangular array in which each row represents an experimental subject and each column represents an identifier or an outcome variable. Exploratory data analysis also involves focusing on certain parts of the data while setting the others aside, so that decisions are easier to make. It is classified in two ways:
Non-graphical and graphical
Univariate and multi-variate
Non-graphical methods calculate summary statistics from the data, while graphical methods summarize the data in the form of diagrams or charts. Univariate methods look at one variable at a time, while multi-variate methods consider more than one variable simultaneously. Combining the two classifications gives the four types of exploratory data analysis: univariate non-graphical, multi-variate non-graphical, univariate graphical and multi-variate graphical.
The general techniques used in EDA include the box plot, histogram, multi-variate chart, run chart, Pareto chart and scatter plot among the graphical methods, and median polish, the trimean and ordination among the non-graphical methods.
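
Two of the non-graphical techniques named above, the trimean and median polish, can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not a reference implementation. The function names `trimean` and `median_polish` and the parameters `max_iter` and `tol` are hypothetical names chosen for this example. The quartiles come from the standard library's `statistics.quantiles` with the "inclusive" method, which is one of several common quartile conventions, so other tools may give slightly different values.

```python
from statistics import median, quantiles


def trimean(values):
    """Tukey's trimean: (Q1 + 2 * median + Q3) / 4.

    Assumption: quartiles use the 'inclusive' convention of
    statistics.quantiles; other conventions shift the result slightly.
    """
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return (q1 + 2 * q2 + q3) / 4


def median_polish(table, max_iter=10, tol=1e-9):
    """Split a two-way table into overall + row + column + residual.

    Rows and columns are swept alternately, subtracting their medians,
    until the medians removed in one pass are all (nearly) zero.
    At every stage:
        table[i][j] == overall + row_eff[i] + col_eff[j] + resid[i][j]
    """
    n_rows, n_cols = len(table), len(table[0])
    resid = [[float(x) for x in row] for row in table]
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols

    for _ in range(max_iter):
        # Row sweep: move each row's median into the row effect.
        row_med = [median(row) for row in resid]
        for i in range(n_rows):
            for j in range(n_cols):
                resid[i][j] -= row_med[i]
            row_eff[i] += row_med[i]
        # Re-centre the column effects, pushing their median into overall.
        shift = median(col_eff)
        col_eff = [c - shift for c in col_eff]
        overall += shift

        # Column sweep: move each column's median into the column effect.
        col_med = [median(resid[i][j] for i in range(n_rows))
                   for j in range(n_cols)]
        for j in range(n_cols):
            for i in range(n_rows):
                resid[i][j] -= col_med[j]
            col_eff[j] += col_med[j]
        # Re-centre the row effects in the same way.
        shift = median(row_eff)
        row_eff = [r - shift for r in row_eff]
        overall += shift

        # Stop once a full pass removed nothing of note.
        if all(abs(m) < tol for m in row_med + col_med):
            break

    return overall, row_eff, col_eff, resid
```

For example, `trimean([1, 2, 3, 4, 5, 6, 7, 8, 100])` gives 5.0 while the arithmetic mean of the same values is about 15.1, which shows why the trimean is useful as a summary that resists outliers. For a purely additive table such as `[[7, 9, 11], [8, 10, 12], [9, 11, 13]]`, `median_polish` recovers an overall value of 10, row effects of -1, 0, 1 and column effects of -2, 0, 2, and leaves all residuals at zero.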