CS 307 - Modeling and Learning in Data Science (Spring 2022)

Instructors
Bo Li, lbo@illinois.edu, 4310 Siebel Center
David Forsyth, daf@illinois.edu, 3310 Siebel Center

Lectures
Class Time: Wednesday, Friday 11:00 AM - 12:15 PM, Transportation Building: Room 112
Lab Time: Tuesday 02:00 PM - 02:50 PM, Siebel Center for Comp Sci: Room 1105
Zoom

Office Hours
Bo Li: After class each day
David Forsyth: After class each day
TA: Thursday 1-2 pm CDT, Zoom

Forums
Canvas link

Course Overview

This course introduces students to the use of classical approaches in data modeling and machine learning in the context of solving data-centric problems. A broad coverage of fundamental models is presented, including linear models, unsupervised learning, supervised learning, and deep learning. A significant emphasis is placed on the application of the models in Python and the interpretability of the results.

Prerequisites: STAT 207 Data Science Exploration; One of MATH 225, MATH 257, MATH 415, MATH 416, ASRM 406.

Please contact the instructor if you have questions regarding the material or concerns about whether your background is suitable for the course.

Course Schedule

The following table outlines the schedule for the course. We will update it as the semester progresses.

Date | Lecture | Content | Readings (Chapters) | Labs | Material & Slides | Recording
1/19 & 1/21 Python and simple classification KNN and cross-validation PS 11.1, 11.2, 13.5 Survey Slides; Code; Datasets: breast cancer, emotion in Turkish music, seeds, yacht hydrodynamics [1/19] [1/21]
1/26 & 1/28 Classification I Linear SVM with SGD, L2 regularizer, and choice of regularization weight using cross-validation PS 11.4 Lab1 Slides; Code [1/26] [1/28]
2/2 & 2/4 Classification II Random forests PS 11.5 PS ex 11.4 (a, b, c) Slides [2/2] [2/4]
2/9 & 2/11 Regression I Linear regression; R-squared, robustness, ridge; Know how to debug and when you are in trouble PS 13.1, 13.2, 13.3 PS ex 11.7 with your own choice for stumps in random forest [2/9] [2/11]
2/16 & 2/18 Regression II Lasso regression; Bias and variance; Adding features and feature selection AML 11.1, AML 11.4 PS ex 13.1, 13.10 [2/16] [2/18]
2/23 & 2/25 High Dimensions PCA; Getting PCA with an SVD; Getting a few principal components with NIPALS PS ch 10; AML 5.1.5, 5.1.6, 5.1.7 PS ex 10.4, 10.9 [2/23] [2/25]
3/2 & 3/4 Clustering K-means; K-means with soft weights; Vector quantization PS ch 12 PS ex 12.14 (a) [3/2] [3/4]
3/9 & 3/11 Using probabilistic ideas Naive Bayes; Logistic Regression; Simple filtering (1D Kalman filter) Slides-1; Slides-2 [3/9] [3/11]
3/12 - 3/20 Spring Break
3/23 & 3/25 Deep Neural Networks I Neural network; Backpropagation; Feature construction (e.g., CNN) AML ch16 AML ex 16.5 [3/23] [3/25]
3/30 & 4/1 Deep Neural Networks II Dropout; SGD tricks; Batch normalization; Simple image classifiers; Adversarial examples and how to make them AML ch17 AML ex 17.2 [3/30] [4/1]
4/6 & 4/8 Other kinds of networks (Perhaps) simple feature learning and linear decoding; Autoencoders, cross encoders; GAN and adversarial smoothing; Self-supervised learning AML ch19.2 AML 19.2 Slides-1; Slides-2 [4/6] [4/8]
4/13 & 4/15 Learning embeddings Simple word embeddings; Simple image embeddings from triplet loss AML worked example 19.2 Slides-1; Slides-2 [4/13] [4/15]
4/20 & 4/22 Sequential modeling Markov chains; Hidden Markov Models and DP for inference AML 13.1, 13.2, 13.31 AML 13.4 Slides [4/20] [4/22]
4/27 & 4/29 Reinforcement Learning ideas Markov decision process (Reward, q function, loss, policy); Policy gradient descent Slides [4/27] [4/29]
5/4 Real-world Applications of DNNs Transformers, Vision Transformers, Masked Autoencoder; Face recognition, Autonomous driving, Question-answering systems Slides
5/11 Final Exam Final Project

Textbook

The textbooks for this course, "Probability and Statistics for Computer Science" (PS) and "Applied Machine Learning" (AML), are available as free downloads from within the University network.

Grading

TBD

Course Expectations

Students are expected to attend every class, complete the assigned readings, and participate actively and constructively in class discussions. Class participation is measured by contributions to the discourse both in class, through discussion and questions, and outside of class, through posting and responding on the Canvas forum.

More information about course requirements will be made available leading up to the start of classes.

Ethics Statement

This course will include topics related to computer security and privacy. As part of this investigation we may cover technologies whose abuse could infringe on the rights of others. As computer scientists, we rely on the ethical use of these technologies. Unethical use includes circumvention of existing security or privacy mechanisms for any purpose, or the dissemination, promotion, or exploitation of vulnerabilities of these services. Any activity outside the letter or spirit of these guidelines will be reported to the proper authorities and may result in dismissal from the class and possibly more severe academic and legal sanctions.

Academic Integrity Policy

The University of Illinois at Urbana-Champaign Student Code should also be considered as a part of this syllabus. Students should pay particular attention to Article 1, Part 4: Academic Integrity. Read the Code at the following URL: http://studentcode.illinois.edu/.

Academic dishonesty may result in a failing grade. Every student is expected to review and abide by the Academic Integrity Policy: http://studentcode.illinois.edu/. Ignorance is not an excuse for any academic dishonesty. It is your responsibility to read this policy to avoid any misunderstanding. Do not hesitate to ask the instructor(s) if you are ever in doubt about what constitutes plagiarism, cheating, or any other breach of academic integrity.

Students with Disabilities

To obtain disability-related academic adjustments and/or auxiliary aids, students with disabilities must contact the course instructor and Disability Resources and Educational Services (DRES) as soon as possible. To ensure that disability-related concerns are properly addressed from the beginning, students who require assistance to participate in this class should see the instructor after class, during office hours, or by appointment. DRES provides students with academic accommodations, access, and support services. To contact DRES, you may visit 1207 S. Oak St., Champaign, call 333-4603 (V/TDD), or e-mail disability@uiuc.edu. Please refer to http://www.disability.illinois.edu/.

Emergency Response Recommendations

Emergency response recommendations can be found at the following website: http://police.illinois.edu/emergency-preparedness/. I encourage you to review this website and the campus building floor plans website within the first 10 days of class: http://police.illinois.edu/emergency-preparedness/building-emergency-action-plans/.

Family Educational Rights and Privacy Act (FERPA)

Any student who has suppressed their directory information pursuant to the Family Educational Rights and Privacy Act (FERPA) should self-identify to the instructor to ensure protection of the privacy of their attendance in this course. See http://registrar.illinois.edu/ferpa for more information on FERPA.