This Statistics for Data Science course introduces basic statistical methods and procedures for data analysis, and gives you a practical understanding of key topics in statistics. It covers data gathering, summarizing data with descriptive statistics, visualizing and displaying data, analyzing relationships between variables, probability distributions, expected value, hypothesis testing, an introduction to ANOVA (analysis of variance), regression, and correlation analysis. You will work in Python and Jupyter notebooks, the preferred tools of Data Analysts and Data Scientists.
You will be required to complete a project that applies these concepts to a Data Science problem based on a real-life scenario. A solid understanding of the different types of data is important here, and you will learn to make intuitive assessments and appropriate decisions about which methods to use. Finally, you will learn how to use Python to analyze the data and interpret the results accurately.
This course is appropriate for students and professionals who are interested in starting a career in a data-driven role such as Data Scientist, Data Analyst, Business Analyst, or Statistician. It does not require any prior statistics or computer science knowledge. To get comfortable with Python, Jupyter notebooks, and libraries, we recommend that you take the Python for Data Science course before you start this one. Optional refresher courses in Python are also available.
By the end of this course, you will be able to calculate and apply measures of central tendency and dispersion for ungrouped and grouped data; summarize, present, and visualize data; identify the most appropriate hypothesis tests for common data sets; perform regression analysis, correlation tests, and hypothesis tests; and demonstrate proficiency in statistical analysis using Python and Jupyter Notebooks.