Learners will be able to:
1. Identify the most important differences between gathered and designed data, and summarize the key dimensions of the Total Data Quality (TDQ) Framework.
2. Define the three measurement dimensions of the Total Data Quality Framework and discuss potential threats to data quality along these dimensions, for both gathered data and designed data.
3. Define the three representation dimensions of the Total Data Quality Framework and discuss potential threats to data quality along these dimensions, for both gathered data and designed data.
4. Define data analysis as an important dimension in the Total Data Quality framework and discuss potential threats to the overall quality and effectiveness of an analysis plan for designed or gathered data.
This specialization aims to provide more information on the Total Data Quality framework and to help learners understand the details of data quality before the data is used for analysis. Its goal is to help learners incorporate data quality evaluations into their projects as an essential component. We are eager to share knowledge about data quality with all learners, including data scientists and quantitative analysts who have not received sufficient training in the first steps of the data science process, which focus on data collection and evaluation. If the data collected or gathered is not of sufficient quality, then even a thorough knowledge of statistical analysis and data science techniques will not benefit a quantitative research project.
This specialization concentrates on the first steps of any scientific investigation that uses data: generating and gathering data, understanding the source of the data, evaluating its quality, and taking steps to maximize that quality before performing any statistical analysis or applying data science techniques to answer research questions. As a result, there will not be much material on data analysis itself, which is covered in many other Coursera specializations; the focus here is on understanding and maximizing data quality before analysis.