Logistic Regression

Parameter Estimation

The parameters are estimated by maximum likelihood.

  • When we assume $p$ is constant across observations, the likelihood is $L = \prod_{i=1}^{n} p^{y_{i}}(1-p)^{1-y_{i}}$, and the MLE is the sample proportion $\hat{p}=n^{-1} \sum_{i=1}^{n} y_{i}$.
  • When we assume each trial has its own success probability $p_i$, so that $L = \prod_{i=1}^{n} p_{i}^{y_{i}}\left(1-p_{i}\right)^{1-y_i}$, the model has as many parameters as observations and is not estimable (the MLE degenerates to $\hat{p}_i = y_i$).
  • As a remedy, we model $p_i$ through a lower-dimensional parameter $\theta$, namely $p_i = p(x_i, \theta)$. Observations with the same $x_i$ then share the same $p_i$, and the model becomes estimable because $\theta$ has lower dimension than the sample size $n$.
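The two estimable cases above can be checked numerically. The sketch below (in Python rather than R; all names and the toy data are illustrative, not from the post) uses the logistic choice $p(x,\theta) = 1/(1+e^{-\theta_0 - \theta_1 x})$ and maximizes the log-likelihood by plain gradient ascent, whereas R's glm() uses IRLS:

```python
import numpy as np

# Toy data: true model p_i = sigmoid(theta0 + theta1 * x_i).
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
true_theta = np.array([-0.5, 1.5])          # (intercept, slope)
p = 1 / (1 + np.exp(-(true_theta[0] + true_theta[1] * x)))
y = rng.binomial(1, p)

# Constant-p model: the MLE is just the sample mean of y.
p_hat_const = y.mean()

# Logistic model: maximize
#   l(theta) = sum_i [ y_i log p_i + (1 - y_i) log(1 - p_i) ]
# by gradient ascent on the score X'(y - p).
X = np.column_stack([np.ones(n), x])
theta = np.zeros(2)
for _ in range(5000):
    p_i = 1 / (1 + np.exp(-X @ theta))
    theta += (1.0 / n) * (X.T @ (y - p_i))  # small step = 1/n

print(p_hat_const)   # the overall event rate
print(theta)         # should approach true_theta as n grows
```

The step size $1/n$ keeps the ascent stable here; a production fit would use Newton's method (IRLS), which is what glm(family = binomial) does in R.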


Smoothing of Binary variables


  • Cook, R. Dennis, and Sanford Weisberg. “Graphics for Assessing the Adequacy of Regression Models.” Journal of the American Statistical Association, vol. 92, no. 438, 1997, pp. 490–499. JSTOR, Accessed 28 Feb. 2020.

After reading Cook and Weisberg’s paper, I would like to try their idea on a classification problem. They propose using lowess to smooth the data, so I will follow what they did.

For a numerical response there is no issue: we just need to use stats::lowess() in R. But for a categorical (e.g. binary 0/1) response, things are trickier.
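One useful fact is that smoothing the raw 0/1 values directly estimates $P(y=1 \mid x)$, since the local average of a binary variable is a local proportion. Below is a minimal sketch of that idea (in Python rather than R; a stripped-down local-linear smoother with tricube weights and no robustness iterations, so it only approximates what stats::lowess() computes, and the toy data are my own):

```python
import numpy as np

def lowess_simple(x, y, frac=0.3):
    """Minimal local-linear smoother with tricube weights.

    A sketch of the core of lowess; omits the robustness
    iterations and interpolation tricks of R's stats::lowess().
    """
    n = len(x)
    k = max(2, int(np.ceil(frac * n)))      # neighborhood size
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    fitted = np.empty(n)
    for i in range(n):
        d = np.abs(xs - xs[i])
        h = np.sort(d)[k - 1]               # bandwidth = k-th distance
        w = np.clip(1 - (d / max(h, 1e-12)) ** 3, 0, 1) ** 3  # tricube
        # weighted least-squares line, evaluated at xs[i]
        sw, swx = w.sum(), (w * xs).sum()
        swy = (w * ys).sum()
        swxx, swxy = (w * xs ** 2).sum(), (w * xs * ys).sum()
        denom = sw * swxx - swx ** 2
        if abs(denom) < 1e-12:
            fitted[i] = swy / sw            # fall back to local mean
        else:
            b = (sw * swxy - swx * swy) / denom
            a = (swy - b * swx) / sw
            fitted[i] = a + b * xs[i]
    return xs, fitted

# Binary response: the smooth of the 0/1 values tracks P(y = 1 | x).
rng = np.random.default_rng(1)
x = rng.uniform(-3, 3, 400)
p = 1 / (1 + np.exp(-2 * x))
y = rng.binomial(1, p).astype(float)
xs, fit = lowess_simple(x, y, frac=0.3)
```

Note that, unlike a logistic fit, the local-linear smooth is not constrained to $[0,1]$ and can overshoot slightly near the boundaries of the data.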

This post is licensed under CC BY 4.0 by the author.
