**Click on the text like “Week 1: Jan 20 – 24” to expand or collapse the items we covered in that week.**

I will fill in more detail and provide links to lecture notes and labs as we go along. Items for future dates are tentative and subject to change.

**In class**, we will work on:

**After class**, please:
- **Register for GitHub** here if you haven’t already; I will ask you to provide your GitHub user name in the questionnaire below.
- **Fill out** a brief questionnaire (if you are taking two classes with me, you only need to fill out this questionnaire once).
- **Fill out** this brief poll about when my office hours should be held (if you are taking two classes with me, you only need to fill out this poll once).
- **Sign up** for our class at Piazza (anonymous question and answer forum): https://piazza.com/mtholyoke/spring2020/stat344ne

**Reading**
- Chollet:
- Skim sections 1.1 – 1.3. This is pretty fluffy and we’ll mostly either skip it or talk about it in much more depth later, but it might be nice to see now for a little context.
- Read sections 2.1 – 2.3 more carefully. We will talk about this over the next day or two.

- Goodfellow et al.:
- Read the intro to section 5.5 (but don’t worry about KL divergence), read section 5.5.1, lightly skim section 6.1, read the first 3 paragraphs of section 6.2.1.1, and skim sections 6.2.2.1, 6.2.2.2, and 6.2.2.3. We will talk about this over the next day or two.

**Videos**
- I moved the videos that were here to later days.

**Homework 1**
- Written part due 5pm Wed, Jan 29
- Coding part due 5pm Fri, Jan 31

**In class**, we will work on:
- Maximum likelihood and output activations for binary classification. I don’t have any lecture notes for this.
- Matrix formulation of calculations for logistic regression across multiple observations. I don’t have any lecture notes for this, but it’s written up in Lab 01.
- Highlights of NumPy: https://github.com/mhc-stat344ne-s2020/Python_NumPy_foundations/blob/master/Python.ipynb
- Lab 1: you do some calculations for logistic regression in NumPy. You should receive an email from GitHub about this.
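
As a rough illustration of the matrix formulation mentioned above (this is not the Lab 01 solution), here is a minimal NumPy sketch. It assumes observations are stored in the columns of `X`; the variable names, toy data, and parameter values are all made up for illustration.

```python
import numpy as np

def sigmoid(z):
    """Logistic (sigmoid) activation, applied elementwise."""
    return 1.0 / (1.0 + np.exp(-z))

# Toy data (made up): p = 2 features, m = 3 observations stored in columns.
X = np.array([[0.5, -1.0, 2.0],
              [1.5,  0.0, -0.5]])   # shape (p, m)
y = np.array([[1, 0, 1]])           # shape (1, m)

# Hypothetical parameter values.
w = np.array([[0.2], [-0.4]])       # shape (p, 1)
b = 0.1

# One matrix product computes z for all m observations at once.
z = b + w.T @ X                     # shape (1, m)
a = sigmoid(z)                      # estimated P(y = 1 | x) for each observation

# Negative log-likelihood of the Bernoulli model, averaged over observations.
# This is the binary cross-entropy loss.
J = -np.mean(y * np.log(a) + (1 - y) * np.log(1 - a))
print(z.shape, a.shape, J)
```

The single product `w.T @ X` replaces a loop over observations, which is the main point of writing things in matrix form.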

**After class**, please:

**Reading**
- Continue/finish readings listed for Wed, Jan 22.
- Take a look at the NumPy document listed above. You can’t run it directly on GitHub, but if you want you could sign into colab.research.google.com and try out some of the code there. Also cross-reference this with the NumPy section in Chollet.

**Videos**: Here are some videos of Andrew Ng talking about logistic regression and stuff we did today; you don’t need to watch these, but feel free if you want a review:
- Logistic regression set up: youtube
- Loss function just thrown out there without justification: youtube
- Discussing set up for loss function via maximum likelihood: youtube
- Start thinking about “vectorization”, i.e. writing things in terms of matrix operations: youtube
- More on vectorization, but I think this video is more complicated than necessary and you might skip it: youtube
- Vectorizing logistic regression. Note that Andrew does this in the variant where your observations are in columns of the X matrix. I want us to understand that you can also take the transpose of that and get an equally valid way of doing the computations, just sideways. This is worth understanding because different sources and different software packages organize things in different ways, and you want mental flexibility. We talked about this in class, but I don’t know of a video that explains things with the other orientation. youtube
- Broadcasting in NumPy – This is among my least favorite examples, apologies!
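
To make the point about orientation concrete, here is a small sketch that checks that the observations-in-columns and observations-in-rows layouts give the same numbers, just transposed. The data are random and the variable names are my own, not taken from the videos or the lab.

```python
import numpy as np

rng = np.random.default_rng(0)
p, m = 3, 5                       # number of features, number of observations

X_cols = rng.normal(size=(p, m))  # observations in columns (the convention in the videos)
X_rows = X_cols.T                 # observations in rows, shape (m, p)
w = rng.normal(size=(p, 1))       # weights as a column vector
b = 0.5

z_cols = b + w.T @ X_cols         # shape (1, m)
z_rows = b + X_rows @ w           # shape (m, 1)

# Same numbers, just sideways.
same = np.allclose(z_cols.T, z_rows)
print(z_cols.shape, z_rows.shape, same)
```

The check works because the transpose of `w.T @ X_cols` is `X_cols.T @ w`.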

**Homework 1**
- Written part due 5pm Wed, Jan 29
- Coding part due 5pm Fri, Jan 31

**In class**, we will work on:
- Maximum likelihood and output activations for regression. Lecture notes: pdf
- Lab 1 about NumPy calculations relevant to logistic regression – complete and turn in by Friday, Jan 31.

**After class**, please:

**Homework 1**
- Written part due 5pm Wed, Jan 29
- Coding part due 5pm Fri, Jan 31

**Lab 1**
- Due 5pm Fri, Jan 31

**In class**, we will work on:
- Overview/details of maximum likelihood for logistic regression. Demo visualization here.
- Overview/details of maximum likelihood for linear regression. Demo visualization here.
- Maximum likelihood and output activations for multi-class classification. No lecture notes, but see the video linked below.
- Summary of models, activation functions, and losses so far: pdf

**After class**, please:

**Videos**
- Some of you may find these videos helpful:
- Softmax regression is another name for multinomial logistic regression, which is the analogue of logistic regression for multi-class classification.

**Homework 1**
- Written part due 5pm **today**, Wed, Jan 29
- Coding part due 5pm Fri, Jan 31

**Lab 1**
- Due 5pm Fri, Jan 31

**In class**, we will work on:
- **Quiz** on logistic regression
- Highlights from last class (see also first page of lecture notes below)
- More concrete example of calculations for multinomial logistic regression
- Lecture notes: pdf
- We also wrote out that for \(m\) observations, \[\begin{aligned} z &= \begin{bmatrix} z^{(1)} & \cdots & z^{(m)} \end{bmatrix} = \begin{bmatrix} z^{(1)}_1 & \cdots & z^{(m)}_1 \\ z^{(1)}_2 & \cdots & z^{(m)}_2 \\ \vdots & \ddots & \vdots \\ z^{(1)}_K & \cdots & z^{(m)}_K \end{bmatrix} \\ &= \begin{bmatrix} b_1 + w_1^T x^{(1)} & \cdots & b_1 + w_1^T x^{(m)} \\ b_2 + w_2^T x^{(1)} & \cdots & b_2 + w_2^T x^{(m)} \\ \vdots & \ddots & \vdots \\ b_K + w_K^T x^{(1)} & \cdots & b_K + w_K^T x^{(m)} \end{bmatrix} \\ &= \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_K \end{bmatrix} + \begin{bmatrix} w_1^T \\ w_2^T \\ \vdots \\ w_K^T \end{bmatrix} \begin{bmatrix} x^{(1)} & x^{(2)} & \cdots & x^{(m)} \end{bmatrix} \end{aligned}\] where the last equality relies on NumPy-style broadcasting to pull out the vector \(b\): the column vector \(b\) is added to every column of the \(K \times m\) matrix that follows it.
- Handout doing the calculations: pdf

- General Python stuff (also posted on resources page)
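
The broadcasting step in the matrix expression for \(z\) above can be checked numerically. This is a minimal sketch with random numbers; the dimensions and variable names are chosen for illustration only, and the softmax step at the end assumes the usual softmax output activation for multinomial logistic regression.

```python
import numpy as np

rng = np.random.default_rng(1)
K, p, m = 3, 4, 5                  # classes, features, observations

X = rng.normal(size=(p, m))        # columns are x^(1), ..., x^(m)
W = rng.normal(size=(K, p))        # rows are w_1^T, ..., w_K^T
b = rng.normal(size=(K, 1))        # column vector of intercepts b_1, ..., b_K

# Broadcasting stretches b from shape (K, 1) to (K, m),
# so b_k is added to every entry of row k.
z = b + W @ X                      # shape (K, m)

# Entry-by-entry check against z_k^(i) = b_k + w_k^T x^(i).
z_loop = np.empty((K, m))
for k in range(K):
    for i in range(m):
        z_loop[k, i] = b[k, 0] + W[k, :] @ X[:, i]
print(np.allclose(z, z_loop))

# Softmax over classes (down each column) turns z into class probabilities.
z_shifted = z - z.max(axis=0, keepdims=True)   # subtract column max for numerical stability
a = np.exp(z_shifted) / np.exp(z_shifted).sum(axis=0, keepdims=True)
print(a.sum(axis=0))               # each column sums to 1, up to rounding
```

Keeping `b` with shape `(K, 1)` rather than `(K,)` is what makes the broadcast go down the rows as intended.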

**After class**, please:

**Homework 1**
- Coding part due 5pm **today**, Fri, Jan 31

**Lab 1**
- Due 5pm Fri, Jan 31

**Homework 2**
- Coding part due 5pm Fri, Feb 7

**Videos**
- Some of you may find this sequence of three videos introducing Keras helpful (note that I will follow the code in Chollet, which differs slightly from the code used in these videos):
- Video 1: data set up
- Video 2: defining the network
- Video 3: training models

**In class**, we will work on:
- Hidden layers and forward propagation. Lecture notes: pdf. Note there is an error at the bottom of page 1: for multinomial regression, you use a softmax activation, not a sigmoid activation.
- Lab 02
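
For reference, forward propagation through one hidden layer can be sketched in a few lines of NumPy. This is an illustration under my own assumptions, not the lecture notes or Lab 02: a ReLU hidden layer, a single sigmoid output unit for binary classification, observations in columns, and made-up layer sizes.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    return np.maximum(0.0, z)

def forward(X, W1, b1, W2, b2):
    """Forward pass: input -> hidden layer (ReLU) -> output unit (sigmoid)."""
    z1 = b1 + W1 @ X      # shape (n_hidden, m)
    a1 = relu(z1)         # hidden-layer activations
    z2 = b2 + W2 @ a1     # shape (1, m)
    a2 = sigmoid(z2)      # estimated P(y = 1 | x) for each observation
    return a1, a2

rng = np.random.default_rng(2)
p, n_hidden, m = 3, 4, 6            # features, hidden units, observations
X = rng.normal(size=(p, m))
W1 = rng.normal(size=(n_hidden, p))
b1 = np.zeros((n_hidden, 1))
W2 = rng.normal(size=(1, n_hidden))
b2 = np.zeros((1, 1))

a1, a2 = forward(X, W1, b1, W2, b2)
print(a1.shape, a2.shape)
```

Each layer repeats the same pattern as logistic regression: a linear step with broadcasting of the bias, followed by an activation.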

**After class**, please:

**In class**, we will work on:
- Gradient descent, start on backpropagation for logistic regression. Lecture notes: pdf. Note there is an error on page 6 that we did not make in class: \(\frac{\partial J}{\partial z} = a - y\), not \(y - a\).

**After class**, please:

**Videos**
- Andrew Ng discusses Gradient Descent
- Andrew Ng discusses Derivatives in Computation Graphs. This video feels unnecessarily complicated to me, but you might find it helpful to see things worked out with actual numbers.
- Andrew Ng discusses Gradient Descent for Logistic Regression with 1 observation
- Andrew Ng discusses Gradient Descent for Logistic Regression with m observations. But he uses a for loop that we really don’t want. He gets rid of the for loop later, but I feel this takes extra mental energy compared to what we did.
- Andrew Ng discusses Vectorizing Gradient Descent for Logistic Regression with m observations. This is the matrix formulation that we want. Note that where we would write \(\frac{\partial J(b, w)}{\partial z}\), Andrew writes `dz`.
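
Here is a minimal sketch of vectorized gradient descent for logistic regression, using `dz = a - y` as in the videos. The simulated data, the "true" parameter values, the learning rate, and the number of iterations are all arbitrary choices for illustration, not values from class.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(X, y, w, b):
    """Average negative log-likelihood (binary cross-entropy)."""
    a = sigmoid(b + w.T @ X)
    return -np.mean(y * np.log(a) + (1 - y) * np.log(1 - a))

rng = np.random.default_rng(3)
p, m = 2, 200
X = rng.normal(size=(p, m))                  # observations in columns
true_w = np.array([[1.5], [-2.0]])           # made-up parameters used to simulate labels
y = (rng.uniform(size=(1, m)) < sigmoid(0.5 + true_w.T @ X)).astype(float)

w = np.zeros((p, 1))
b = 0.0
learning_rate = 0.1                          # arbitrary choice
initial_cost = cost(X, y, w, b)

for _ in range(500):
    a = sigmoid(b + w.T @ X)                 # shape (1, m)
    dz = a - y                               # derivative of each observation's loss with respect to its z
    dw = (X @ dz.T) / m                      # shape (p, 1), averaged over observations
    db = np.mean(dz)
    w = w - learning_rate * dw
    b = b - learning_rate * db

final_cost = cost(X, y, w, b)
print(initial_cost, final_cost)
```

There is no loop over observations: the products `w.T @ X` and `X @ dz.T` handle all \(m\) observations at once.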

**In class**, we will work on:

**After class**, please:

**Videos**
- Andrew Ng discusses Gradient descent for neural networks, but he is inconsistent in his use of transposes for \(W\).
- Andrew Ng discusses “intuition” for backpropagation

**In class**, we will work on:
- Continue on backpropagation with hidden layers – notes posted Monday
- Start lab on backpropagation

**After class**, please:

**In class**, we will work on:
- Issues with gradient descent, learning rates, and stochastic gradient descent: pdf
- More time for lab

**After class**, please:

**Videos:**
- Andrew Ng discusses feature normalization
- Andrew Ng discusses stochastic gradient descent (which he calls mini-batch gradient descent)
- Andrew Ng discusses more about stochastic gradient descent
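
The two ideas in these videos, feature normalization and mini-batches, can be sketched as follows. The helper names `standardize` and `minibatches` are hypothetical (they are not from Keras or from the lecture notes), observations are assumed to be in columns, and the data are random.

```python
import numpy as np

def standardize(X):
    """Center and scale each feature (row of X) to mean 0 and standard deviation 1."""
    mean = X.mean(axis=1, keepdims=True)
    std = X.std(axis=1, keepdims=True)
    return (X - mean) / std

def minibatches(X, y, batch_size, rng):
    """Yield shuffled mini-batches for one pass through the data (one epoch)."""
    m = X.shape[1]
    order = rng.permutation(m)
    for start in range(0, m, batch_size):
        idx = order[start:start + batch_size]
        yield X[:, idx], y[:, idx]

rng = np.random.default_rng(4)
X = rng.normal(loc=10.0, scale=3.0, size=(2, 10))   # 2 features, 10 observations
y = rng.integers(0, 2, size=(1, 10))

X_std = standardize(X)
batch_shapes = [Xb.shape for Xb, yb in minibatches(X_std, y, batch_size=4, rng=rng)]
print(batch_shapes)   # the last batch is smaller when m is not a multiple of batch_size
```

In a real analysis the means and standard deviations would be computed on the training set and then reused for validation and test data. A gradient descent update would be applied once per mini-batch instead of once per full pass.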

**In class**, we will work on:

**After class**, please:
- **Homework 3** due 5pm Fri, Feb. 21

**Videos:**
- Andrew Ng discusses Regularization
- Andrew Ng discusses intuition for why regularization works using ideas that are fairly different from my motivation.

**Reading:**
- Chapter 4 of Chollet
- Section 7.1.1 of Goodfellow et al. up through Equation 7.5.

**In class**, we will work on:

**After class**, please:
- **Homework 3** due 5pm Fri, Feb. 21

**Videos:**
- Andrew Ng discusses dropout regularization
- Andrew Ng discusses more about dropout
- Andrew Ng discusses vanishing and exploding gradients
- Andrew Ng discusses weight initialization
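
As a small illustration of dropout, here is a sketch of the "inverted dropout" variant, where kept activations are rescaled during training so that nothing needs to change at test time. The function name `dropout_forward` and the value `keep_prob=0.8` are my own choices for illustration.

```python
import numpy as np

def dropout_forward(a, keep_prob, rng):
    """Apply inverted dropout to a matrix of activations during training."""
    mask = rng.uniform(size=a.shape) < keep_prob   # keep each unit with probability keep_prob
    return a * mask / keep_prob                    # rescale so the expected activation is unchanged

rng = np.random.default_rng(5)
a = np.ones((4, 10000))                            # made-up activations, all equal to 1
a_dropped = dropout_forward(a, keep_prob=0.8, rng=rng)
print(a_dropped.mean())                            # close to 1.0 on average
```

Roughly 20% of the entries are set to zero and the rest are scaled up by `1 / keep_prob`, so the average activation stays about the same.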

**In class**, we will work on:
- Start on convolutional neural networks: pdf

**After class**, please:
- **Homework 3** due 5pm **today**, Fri, Feb. 21

**In class**, we will work on:

**After class**, please:
- **Homework 4** due 5pm Fri, Feb. 28

**In class**, we will work on:
- Generators in Python, Data Augmentation for image data: pdf
- Lab on CNNs
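
To connect generators and data augmentation, here is a plain NumPy sketch of a Python generator that yields randomly flipped image batches. It is not the Keras data augmentation code from Chollet; the function name `augmented_batches`, the array shapes, and the random "images" are assumptions made for illustration.

```python
import numpy as np

def augmented_batches(images, labels, batch_size, rng):
    """Endlessly yield batches of randomly flipped images.

    images has shape (n, height, width, channels).
    """
    n = images.shape[0]
    while True:
        idx = rng.choice(n, size=batch_size, replace=False)
        batch = images[idx].copy()
        flip = rng.uniform(size=batch_size) < 0.5   # flip about half of the images
        batch[flip] = batch[flip][:, :, ::-1, :]    # horizontal flip: reverse the width axis
        yield batch, labels[idx]

rng = np.random.default_rng(6)
images = rng.uniform(size=(20, 8, 8, 3))            # made-up "images"
labels = rng.integers(0, 2, size=20)

gen = augmented_batches(images, labels, batch_size=4, rng=rng)
batch_images, batch_labels = next(gen)              # the generator runs only until its next yield
print(batch_images.shape, batch_labels.shape)
```

Because of `yield`, batches are produced one at a time on demand, so the augmented data set never has to be stored in memory all at once.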

**After class**, please:
- **Homework 4** due 5pm Fri, Feb. 28

**In class**, we will work on:
- Overview of common architectures, transfer learning

**After class**, please:
- **Homework 4** due 5pm Fri, Feb. 28

**In class**, we will work on:

**After class**, please:

**In class**, we will work on:
- Maybe we’ll do **Midterm 1**, which covers material up through Wed, Feb 19

**After class**, please:

**In class**, we will work on:
- Or else possibly we’ll do **Midterm 1**, which covers material up through Wed, Feb 19

**After class**, please:

**In class**, we will work on:

**After class**, please:

**In class**, we will work on:

**After class**, please:

**In class**, we will work on:

**After class**, please:

**No Class**: Midsemester Break. Safe travels!

**No Class**: Midsemester Break. Safe travels!

**No Class**: Midsemester Break. Safe travels!

**In class**, we will work on:

**After class**, please:

**In class**, we will work on:

**After class**, please:

**In class**, we will work on:

**After class**, please:

**In class**, we will work on:

**After class**, please:

**In class**, we will work on:

**After class**, please:

**In class**, we will work on:

**After class**, please:

**In class**, we will work on:

**After class**, please:

**In class**, we will work on:

**After class**, please:

**In class**, we will work on:

**After class**, please:

**In class**, we will work on:

**After class**, please:

**In class**, we will work on:

**After class**, please:

**In class**, we will work on:

**After class**, please:

**In class**, we will work on:

**After class**, please:

**In class**, we will work on:

**After class**, please:

**In class**, we will work on:

**After class**, please:

**In class**, we will work on:

**After class**, please:

We will not have a final exam in this class.