2.1.3 - Common Data Wrangling Steps
Practice Questions
Test your understanding with targeted questions
Define what is meant by 'removing duplicates' in a dataset.
💡 Hint: Think about how multiple entries of the same data could impact your results.
What is imputation in data wrangling?
💡 Hint: Remember the different methods of handling missing values that were discussed.
Interactive Quizzes
Quick quizzes to reinforce your learning
What is the purpose of removing duplicates in data wrangling?
💡 Hint: Think about how duplicated rows can mislead results.
True or False: Imputation can only be done by removing rows with missing data.
💡 Hint: What are the various options available for handling missing values?
Challenge Problems
Push your limits with advanced challenges
You have a dataset with the columns Name, Age, Weight, and Height, in which some rows are missing values for Age. Describe the steps you would take to prepare this dataset for modeling.
💡 Hint: Work through the wrangling steps covered in this section, in order.
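One possible way to approach this problem is sketched below with pandas. It is an illustration under stated assumptions, not a model answer: the sample rows are invented, the helper name `prepare` is hypothetical, and median imputation is only one of several reasonable choices for filling Age.

```python
import pandas as pd

# Hypothetical sample data; all values are invented for illustration.
df = pd.DataFrame({
    "Name": ["Ana", "Ben", "Ben", "Cara", "Dev"],
    "Age": [34.0, None, None, 29.0, 41.0],
    "Weight": [61.0, 82.5, 82.5, 55.2, 90.1],
    "Height": [165.0, 180.0, 180.0, 158.0, 185.0],
})


def prepare(frame: pd.DataFrame) -> pd.DataFrame:
    """Drop exact duplicate rows, then fill missing Age with the median."""
    # Step 1: remove rows that repeat an earlier row in every column.
    out = frame.drop_duplicates().reset_index(drop=True)
    # Step 2: impute missing Age values (assumption: median imputation).
    out["Age"] = out["Age"].fillna(out["Age"].median())
    return out


clean = prepare(df)
print(clean)
```

Duplicates are removed before imputing so that repeated rows do not skew the median. Whether the median, the mean, or dropping the affected rows is the better choice depends on how many Age values are missing and on how Age is distributed.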
In a dataset of test scores, one score stands out as an outlier: it is 25% higher than the next closest score. Discuss how you would evaluate and treat this outlier.
💡 Hint: Recall the techniques for outlier treatment discussed.
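As a rough sketch of how such a score could be evaluated, the example below applies the common 1.5 × IQR rule and then caps flagged values at the fence. The scores are invented (the top score of 100 is 25% above the next closest, 80), and the IQR rule and capping are assumptions chosen for illustration; the course may have a different technique in mind.

```python
import pandas as pd

# Hypothetical test scores; 100 is 25% higher than the next closest score (80).
scores = pd.Series([62, 68, 70, 71, 74, 75, 78, 80, 100], dtype="float64")

# Evaluate: compute the interquartile range and the 1.5 * IQR fences.
q1, q3 = scores.quantile(0.25), scores.quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Flag any score that falls outside the fences.
outliers = scores[(scores < lower) | (scores > upper)]

# Treat (one option): cap values at the fences instead of dropping them.
capped = scores.clip(lower=lower, upper=upper)

print("fences:", lower, upper)
print("flagged:", outliers.tolist())
```

A statistical flag alone does not settle the question. If the score turns out to be a data-entry error, correcting or removing it is reasonable; if it is a genuine result, keeping it or capping it may be more appropriate.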