Making Data Speak

'Making Data Speak', that is, analyzing data to extract the information it contains, is an important competency for business organizations. Some of the key questions that emerge when 'Making Data Speak' are:

What data to collect?

This also includes related questions like how to collect, when to collect, who will collect, how to store, etc.

Data collection is the foundation for identifying meaningful and effective actions from data analysis. Inappropriate or inaccurate data will still speak, but not truthfully, and the result is generally misleading. Data collection must also account for human factors, such as people's attitude towards reporting data correctly and on time, and for the error introduced into measurements by the measuring devices.
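
One simple way to put a number on the measurement error of a device is to take repeated readings of the same quantity and look at their spread. The sketch below is only an illustration: the readings are made-up values and the function name summarize_repeated_readings is hypothetical. It assumes the readings are independent and that the device has no systematic bias, which repeated readings alone cannot reveal.

```python
import math
import statistics


def summarize_repeated_readings(readings):
    """Return (mean, sample standard deviation, standard error of the mean)."""
    n = len(readings)
    if n < 2:
        raise ValueError("need at least two readings to estimate spread")
    mean = statistics.mean(readings)
    # Sample standard deviation (divides by n - 1): spread of a single reading.
    stdev = statistics.stdev(readings)
    # Standard error: uncertainty of the mean of n readings.
    sem = stdev / math.sqrt(n)
    return mean, stdev, sem


# Hypothetical data: five readings of the same part with the same gauge.
readings = [10.1, 9.9, 10.0, 10.2, 9.8]
mean, stdev, sem = summarize_repeated_readings(readings)
print(f"mean={mean:.3f}  stdev={stdev:.3f}  standard error={sem:.3f}")
```

The standard deviation describes how much a single reading can be off, while the standard error describes how well the average of all the readings is known.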

What analysis to perform?

This also includes related questions like how to clean data before subjecting it to analysis, which analysis tool to apply, when to perform analysis, who will perform analysis, how to store analysis results, etc.

The purpose of data analysis is to make the data speak by applying the right kind of analysis method. The most crucial step is selecting an appropriate method; applying the selected method is generally easy, given the many analysis software tools available on the market. At times, the data may need to be collated or organized differently from how they are stored before the analysis can be performed.
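
As an illustration of cleaning and re-collating data before analysis, the sketch below takes records stored one row per plant and month, drops rows with a missing value, and groups the remaining values by plant so that a per-plant average can be computed. The records, field names and the helper collate_by_key are all hypothetical. Dropping missing rows is only one possible cleaning rule and may not suit every data set.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records, stored one row per (plant, month).
records = [
    {"plant": "A", "month": "Jan", "defects": 12},
    {"plant": "B", "month": "Jan", "defects": 7},
    {"plant": "A", "month": "Feb", "defects": None},  # value not reported
    {"plant": "B", "month": "Feb", "defects": 9},
    {"plant": "A", "month": "Mar", "defects": 10},
]


def collate_by_key(rows, key, value):
    """Group the `value` field by the `key` field, skipping missing values."""
    grouped = defaultdict(list)
    for row in rows:
        if row[value] is None:
            continue  # cleaning step: ignore rows with no reported value
        grouped[row[key]].append(row[value])
    return dict(grouped)


defects_by_plant = collate_by_key(records, "plant", "defects")
average_defects = {plant: mean(values) for plant, values in defects_by_plant.items()}
print(average_defects)
```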

How to interpret the analysis results for decision making?

This also includes related questions like what other non-data factors to consider, how to check validity of the inference drawn, how to identify what actions to take going forward, etc.

Data interpretation is the real skill. Correctly inferring what the data say is essential for guiding the right actions. Drawing inferences from analysis results requires a solid understanding of the actual underlying physical processes, which is where domain understanding becomes essential. Even so, there are times when an expert in data analysis can provide useful insight, because the underlying processes across various domains often follow known statistical distributions (the Normal distribution is the most common; others include the Poisson, Exponential, Geometric, Weibull and Hypergeometric distributions).
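
To make the point about distributions concrete, the sketch below compares observed counts with the counts a Poisson distribution would predict when its rate is estimated from the data. The counts are invented for illustration and the function name poisson_pmf is hypothetical. The side-by-side comparison is only a rough visual check, not a formal goodness-of-fit test.

```python
import math
from collections import Counter


def poisson_pmf(k, rate):
    """Probability of observing exactly k events when the average is `rate`."""
    return math.exp(-rate) * rate**k / math.factorial(k)


# Hypothetical data: number of defects found in each of 12 batches.
counts = [0, 1, 2, 1, 0, 3, 1, 2, 1, 0, 2, 1]

# For a Poisson distribution the rate is estimated by the sample mean.
rate = sum(counts) / len(counts)
observed = Counter(counts)

print(f"estimated rate = {rate:.3f}")
for k in range(max(counts) + 1):
    expected = len(counts) * poisson_pmf(k, rate)
    print(f"k={k}: observed={observed[k]}, expected={expected:.2f}")
```

If the observed and expected counts differ widely, the Poisson assumption is doubtful, and domain knowledge is needed to decide which distribution, if any, is appropriate.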
