
Google Data Analytics Professional Certificate - Part 5
This is part of a series of posts on the Google Data Analytics course from Coursera. It is not meant to be a review of the course, nor by any means an extensive overview of its content. It is intended to be short and to incorporate only the main concepts and learnings I gathered from each module. My purpose for these blog posts is mainly to consolidate what I learned from the course, and also to help anyone who might be interested in reading a little about these subjects or this course.
In this post you will find a mix of direct content from the course, my own personal notes and also some extrapolations and additions I made wherever I felt the need to add information.
Analyse Data to Answer Questions
Data analysis basics
→ The four phases of analysis:
Organise data in a way that’s easy to reference;
Format and adjust data, which streamlines things and saves time;
Get input from others, which gives us a viewpoint we may not otherwise understand or have access to;
Transform data by observing relationships between data points and making calculations.
→ Sorting and filtering are two ways to keep things organised when we format and adjust data to work with it:
Filtering — showing data that meet specific criteria and hiding the rest.
Sorting — arranging data into a meaningful order.
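To make the difference concrete, here is a minimal sketch in Python (my own addition, not from the course). The sample rows and the helper names `filter_rows` and `sort_rows` are made up for illustration. In a spreadsheet you would normally use the built-in filter and sort tools instead.

```python
# Hypothetical sample data: each dict is one row of a small sales table.
rows = [
    {"region": "North", "product": "Tea", "units": 12},
    {"region": "South", "product": "Coffee", "units": 30},
    {"region": "North", "product": "Coffee", "units": 7},
    {"region": "East", "product": "Tea", "units": 21},
]


def filter_rows(rows, predicate):
    """Keep only the rows that meet the criteria; the original list is not changed."""
    return [row for row in rows if predicate(row)]


def sort_rows(rows, key, descending=False):
    """Return the rows arranged by one column, leaving the input untouched."""
    return sorted(rows, key=lambda row: row[key], reverse=descending)


# Filtering: show only the rows for the North region.
north_only = filter_rows(rows, lambda row: row["region"] == "North")

# Sorting: arrange all rows from most to fewest units sold.
by_units = sort_rows(rows, "units", descending=True)

for row in by_units:
    print(row["region"], row["product"], row["units"])
```

Note that neither helper modifies `rows`. That mirrors how a spreadsheet filter only hides rows rather than deleting them.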
→ Data validation allows us to control what can and can’t be entered in a worksheet. Some examples of how to use it are adding a dropdown list with predetermined options, creating custom checkboxes or protecting structured data and formulas. Here are some types of data validation and their purposes:
Data type validation — check that the data matches the data type defined for the field;
Data range validation — check that the data falls within an acceptable range of values defined for the field;
Data constraints — check that the data meets certain conditions or criteria for a field;
Data consistency — check that the data makes sense in the context of other related data;
Data structure — check that the data follows or conforms to a set structure;
Code validation — check that the application code systematically performs any of the previous validations during user data input.
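The validation types above can be sketched in Python as follows (again my own addition, not course material). The order record, its field names, the 1 to 100 quantity range and the `A-1234` ID format are all assumptions chosen for the example. Running a function like this every time a record is entered is roughly what code validation refers to.

```python
import re
from datetime import date


def validate_order(order):
    """Return a list of problems found in one hypothetical order record."""
    problems = []

    quantity = order.get("quantity")
    # Data type: quantity must be a whole number (bool is excluded on purpose).
    if not isinstance(quantity, int) or isinstance(quantity, bool):
        problems.append("quantity must be an integer")
    # Data range: only checked once the type is known to be right.
    elif not 1 <= quantity <= 100:
        problems.append("quantity must be between 1 and 100")

    order_id = order.get("order_id")
    # Data constraint: the field is required and may not be blank.
    if not isinstance(order_id, str) or not order_id.strip():
        problems.append("order_id is required")
    # Data structure: assumed format is one capital letter, a dash, four digits.
    elif not re.fullmatch(r"[A-Z]-\d{4}", order_id):
        problems.append("order_id must look like A-1234")

    ordered, shipped = order.get("order_date"), order.get("ship_date")
    # Data consistency: an order cannot ship before it was placed.
    if isinstance(ordered, date) and isinstance(shipped, date) and shipped < ordered:
        problems.append("ship_date is earlier than order_date")

    return problems


good = {"order_id": "A-1042", "quantity": 3,
        "order_date": date(2023, 5, 1), "ship_date": date(2023, 5, 4)}
bad = {"order_id": "1042", "quantity": 250,
       "order_date": date(2023, 5, 4), "ship_date": date(2023, 5, 1)}

print(validate_order(good))
print(validate_order(bad))
```

Collecting every problem in a list, rather than stopping at the first one, lets whoever entered the data fix everything in one pass.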
→ Most of the content from this part of the course was about spreadsheets and SQL. The spreadsheet content covered mostly very basic functions and workarounds, so I didn't feel the need to write any notes. For the SQL parts, as I mentioned in previous posts, I didn’t write many notes because I have already completed Udacity’s course ‘SQL for Data Analysis’, which is a very good course on the topic, and I took extensive notes from it. I plan to write some blog posts covering that course soon (how many will depend on how long the full content gets).
That’s it for the fifth part of the Google Data Analytics course from Coursera. You can also read the previous parts of the course (first, second, third and fourth), and I’ll be posting the remaining parts soon. I also intend to write more posts on other courses I took (SQL and Python so far), some detailed notes I took (and continue to take) on subjects like data visualisation, and probably some short summaries of my favourite books, with the best quotes and key concepts.
As I mentioned at the beginning, my main goal is to consolidate the topics I’m interested in learning and to have it all well structured and put together in one place (this website). So if you find this kind of content useful and wish to read more, you can follow me on Medium to know whenever I post more stuff.