This course focuses on preparing datasets for AI applications and data analysis. Students will learn to clean, filter, and reshape data; identify and correct missing or inconsistent entries; and apply techniques such as dimensionality reduction to produce high-quality datasets for modeling. Structured Query Language (SQL) will be taught and used for data selection, summarization, filtering, merging, and subsetting. Tools and technologies include spreadsheet software as well as R and Python libraries for data manipulation and analysis.
Prerequisite Courses