General Information #
This is the course website for SCI 2000: Introduction to Data Science. The course introduces students to data science, focusing on the tools and hands-on experience needed to analyse data. By the end of the course, students will:
- Become proficient in R, to the level that they can analyse data using the tools from this class.
- Be able to describe and analyze data through visualization and simple statistical procedures.
- Be introduced to statistical thinking and be able to think critically about variation and biases.
Course Details #
- Instructor: Max Turgeon
- Email: max.turgeon@umanitoba.ca
- Office: 373 Machray Hall
- Website: https://maxturgeon.ca/f20-stat3150/
- Lectures: TR 11:30 AM–12:45 PM, via Webex
- Office Hours:
- By appointment only
The course outline can be downloaded here.
Prerequisites #
Instructor approval.
Textbook #
There is no textbook for this course. Notes will be provided to students through UM Learn, along with additional resources.
Assessments #
The assessments for this course include:
- Four (4) assignments.
- Three (3) data analysis summaries.
- One (1) final project.
Outline of Topics #
The course is expected to cover the following topics:
- Data visualization
- Data wrangling
- Relational data
- Web scraping
- Introduction to regular expressions
- (If time permits) Automation and version control
Throughout the course, the applied topics above will be complemented with an introduction to statistical thinking: how to think about variability, what biases can occur in the data, and how to perform simple statistical procedures (e.g. comparing means, proportions, linear regression).
Statistical Software #
The course requires you to make extensive use of the R statistical software for your assignments and final data project. Sample code will be provided to students.
You can download R for free (for Windows, Mac, Linux, and Solaris) from the Comprehensive R Archive Network at: https://cran.r-project.org/
For additional resources on R, see here.