CSE 5523: Machine Learning and Statistical Pattern Recognition

Introduction to basic concepts of machine learning and statistical pattern recognition; techniques for classification, clustering and data representation and their theoretical analysis.

Details

Time: Tuesday and Thursday, 12:45-2:05pm
Place: Dreese 480
Instructor: Alan Ritter (ritter.1492@osu.edu)
Office Hours: Tuesday 4:30-5:30pm, Dreese 595
TA: Chaoyue Liu (liu.2656@buckeyemail.osu.edu)
Office Hours: By Appointment, DL 586

Textbook:

Kevin Murphy, Machine Learning: A Probabilistic Perspective
(There will be other readings as well)

Grading

Grading will be based on:

Participation (10%)

You will receive credit for asking and answering questions related to the homework on Piazza and engaging in class discussion.

Homeworks (50%)

The homeworks will include both written and programming assignments. Homework should be submitted to the Dropbox folder in Carmen by 11:59pm on the day it is due (unless otherwise instructed). Each student will have 3 flexible days to turn in late homework throughout the semester. As an example, you could turn in the first homework 2 days late and the second homework 1 day late without any penalty. After that you will lose 20% for each day the homework is late. If there are any technical issues with submission, please email your homework to the instructor.
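The late policy above can be sketched in a few lines of Python. This is only an illustration, not an official grade calculator: the function name apply_late_policy is hypothetical, and it assumes that flexible days are used up in submission order, that the 20% penalty is measured in percentage points of full credit, and that credit never drops below zero.

```python
def apply_late_policy(days_late_per_hw, flexible_days=3, penalty_percent_per_day=20):
    """Return the percent of credit kept for each homework, in submission order.

    Assumptions (not stated in the syllabus): flexible days are consumed
    in the order homeworks are submitted, and credit is floored at 0.
    """
    remaining = flexible_days
    kept = []
    for days_late in days_late_per_hw:
        # Cover as many late days as possible with the remaining flexible days.
        free = min(days_late, remaining)
        remaining -= free
        # Any late days not covered cost 20 percentage points each.
        penalized_days = days_late - free
        kept.append(max(0, 100 - penalty_percent_per_day * penalized_days))
    return kept


# The syllabus example: 2 days late, then 1 day late, uses all 3 flexible days.
# A third homework turned in 1 day late then loses 20%.
print(apply_late_policy([2, 1, 1]))
```

With the inputs shown, the first two homeworks keep full credit and the third keeps 80%.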

Midterm (20%)

There will be an in-class midterm on March 7.

Final Projects (20%)

The final project is an open-ended assignment, with the goal of gaining experience applying the techniques presented in class to real-world datasets. Students should work in groups of 3-4. It is a good idea to discuss your planned project with the instructor to get feedback. The final project report should be 4 pages and is due on April 30. The report should describe the problem you are solving, the data you are using, the technique you are applying, and the baseline you compare against.
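As a rough illustration of how the four grading components combine, the sketch below computes a weighted course percentage. The weights come from the percentages listed in this section; the function name course_percentage and the assumption that each component is summarized as a single score out of 100 are hypothetical, and the actual grade computation may differ.

```python
# Component weights in percent, as listed in the Grading section.
WEIGHTS = {
    "participation": 10,
    "homeworks": 50,
    "midterm": 20,
    "final_project": 20,
}


def course_percentage(scores):
    """Combine per-component scores (each 0-100) into one weighted percentage.

    Assumes every component has already been reduced to a single score
    out of 100; how individual homeworks are averaged is not specified here.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


example = {"participation": 90, "homeworks": 80, "midterm": 70, "final_project": 100}
print(course_percentage(example))
```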

Resources

Piazza (discussion, announcements and restricted resources). https://piazza.com/osu/spring2017/5523/home

Carmen (homework submission + grades). https://osu.instructure.com/courses/14167

Academic Integrity

Any assignment or exam that you hand in must be your own work (with the exception of group projects). However, talking with others to better understand the material is strongly encouraged. Copying a solution or letting someone copy your solution is cheating. Everything you hand in must be your own words. Code you hand in must be written by you, with the exception of any code provided as part of the assignment. Any collaboration during an exam is considered cheating. Any student who is caught cheating will be reported to the Committee on Academic Misconduct. Please don't take a chance - if you are having trouble understanding the material, let us know and we will be happy to help.

Homeworks

Homework 1 (Due 1/17, hand in a paper copy at the beginning of class)

Homework 2 Starter Code (Due 2/2, turn in to the dropbox on Carmen)

Homework 3 (Due 2/14, hand in a paper copy at the beginning of class)

Homework 4 Starter Code (Due 3/21, turn in to the dropbox on Carmen)

Homework 5 Starter Code (Due 4/21, turn in to the dropbox on Carmen)

Anonymous Feedback

https://goo.gl/forms/ajvb9dJaTHsWh7sS2

Tentative Schedule:

https://docs.google.com/spreadsheets/d/1wDjARgTT-vvlsKtm02c9Hk8VAr1lkjDJ-wAOKTebGRI/edit?usp=sharing

Reading Assignments

Date	Topic	Required Reading	Suggested Reading
1/10	Course Overview	Murphy Chapter 1	Probability Primer (2.1-2.5, 3.1-3.3), Linear Algebra, Calculus
1/12	Decision Trees	Murphy 16.2	CIML Chapter 1
1/17	Decision Trees (cont) and Probability Review	Murphy Chapter 2.1-2.5, 2.8	Mackay Book 2.1-2.3
1/19	Statistical Estimation	Murphy Chapter 3.1-3.4
1/22	Dirichlet-Multinomial + Naive Bayes	Murphy Chapter 3.4-3.5
1/24	Linear Regression	Murphy Chapter 7.1-7.3, 7.5	7.4, 7.6
1/31	Logistic Regression	Murphy 8.1, 8.2, 8.3.1, 8.3.2	Tom Mitchell's Notes on NB + LR
2/2	Logistic Regression	Murphy 8.3.7, 8.5, 8.6.1	Ng & Jordan 2002
2/7	Perceptron	CIML Chapter 4
2/9	Instance-Based Learning	CIML Chapter 3
2/14	Kernel Methods	CIML Chapter 11
2/16	SVMs	CIML Chapter 7	Murphy 14.1, 14.2, 14.3, 14.4, 14.5
2/21	Neural Networks	Murphy 16.5	CIML Chapter 10, Deep Learning Book Chapter 6
2/23	Neural Networks	Murphy 16.5	CIML Chapter 10, Deep Learning Book Chapter 6
2/28	Boosting	Murphy 16.4	CIML Chapter 13
3/2	Midterm Review
3/7	Midterm
3/9	Course Project + Boosting (cont)	Murphy 16.4	CIML Chapter 13
3/21	Boosting	Murphy 16.4
3/23	Expectation Maximization / Unsupervised Learning	Murphy Chapter 11	CIML Chapter 16
3/28	No Class
3/30	Guest Lecture (Wei Xu)		SemEval-2015 Task 1: Paraphrase and Semantic Similarity in Twitter (PIT)
4/4	Expectation Maximization / Unsupervised Learning (cont)	Murphy Chapter 11	CIML Chapter 16
4/6	Directed Graphical Models	Murphy 10.1, 10.2, 10.3, 10.4, 10.5
4/11	Directed Graphical Models (cont)
4/13	Structure Learning	Murphy 26.4
4/18	Convolutional Neural Networks	Deep Learning Book Chapter 9
4/20	Recurrent Neural Networks	Deep Learning Book Chapter 10
4/27 (2pm)	Final Project Presentations