What is the Search/Prediction Time Complexity of Logistic Regression?

久未见 submitted on 2019-12-22 18:36:33

Question


I am looking into the time complexities of machine learning algorithms, and I cannot find the time complexity of logistic regression for predicting a new input. I have read that for classification it is O(c*d), where c is the number of classes and d is the number of dimensions, and I know that for linear regression the search/prediction time complexity is O(d). Could you explain what the search/prediction time complexity of logistic regression is? Thank you in advance.

Examples for other machine learning algorithms: https://www.thekerneltrip.com/machine/learning/computational-complexity-learning-algorithms/


Answer 1:


Complexity of training for logistic regression methods with gradient-based optimization: O((f+1)csE), where:

  • f - number of features (+1 because of the bias). Multiplying each feature by its weight takes f operations, +1 for the bias. Summing all of them to obtain the prediction takes another f + 1 operations. Using a gradient method to improve the weights costs the same number of operations, so in total we get 4 * (f+1) (two for the forward pass, two for the backward pass), which is simply O(f+1).
  • c - number of classes (possible outputs) in your logistic regression. For binary classification it is one, so this term drops out. Each class has its own corresponding set of weights.
  • s - number of samples in your dataset; this one is quite intuitive.
  • E - number of epochs you are willing to run gradient descent for (whole passes through the dataset).

Note: this complexity can change based on things like regularization (an L2 penalty, for example, adds work proportional to the number of weights), but this is the general idea behind it.
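To make the O((f+1)csE) count concrete, here is a minimal pure-Python sketch of per-sample gradient descent for softmax logistic regression. The function name `train_logreg`, the learning rate, and the `visits` counter are assumptions added for illustration; they are not part of the original answer. The sketch keeps one weight vector per class, so it is meant for c >= 2 (the binary case would normally use a single sigmoid output instead).

```python
import math

def train_logreg(X, y, num_classes, epochs, lr=0.1):
    """Per-sample gradient descent for softmax logistic regression.

    Returns the weights and `visits`, the number of times a weight
    (or bias) is touched in the forward and backward passes.
    """
    f = len(X[0])
    # One row per class: f weights followed by the bias.
    W = [[0.0] * (f + 1) for _ in range(num_classes)]
    visits = 0
    for _ in range(epochs):                       # E epochs
        for x, label in zip(X, y):                # s samples
            # Forward pass: c scores, each touching f weights + 1 bias.
            scores = []
            for k in range(num_classes):
                z = W[k][f]                       # bias
                for j in range(f):
                    z += W[k][j] * x[j]
                visits += f + 1
                scores.append(z)
            # Softmax over the c scores (shifted by the max for stability).
            m = max(scores)
            exps = [math.exp(z - m) for z in scores]
            total = sum(exps)
            probs = [e / total for e in exps]
            # Backward pass: update all c * (f + 1) parameters.
            for k in range(num_classes):
                err = probs[k] - (1.0 if k == label else 0.0)
                for j in range(f):
                    W[k][j] -= lr * err * x[j]
                W[k][f] -= lr * err
                visits += f + 1
    return W, visits

# Toy data: s = 4 samples, f = 2 features, c = 3 classes, E = 5 epochs.
X = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.0, 0.0]]
y = [0, 1, 2, 0]
W, visits = train_logreg(X, y, num_classes=3, epochs=5)
print(visits)  # 2 * E * s * c * (f + 1) = 2 * 5 * 4 * 3 * 3 = 360
```

The counter grows as 2 * E * s * c * (f + 1), which is the O((f+1)csE) bound up to a constant factor.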

Complexity of predictions for one sample: O((f+1)c)

  • f + 1 - you simply multiply each weight by the value of its feature, add the bias, and sum everything together at the end.
  • c - you do this for every class (c = 1 for binary predictions).
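The two loops described above can be sketched as follows. This is only an illustration: `predict_one` and the weight layout (one row per class, bias stored last) are assumptions, not something the original answer specifies. Because softmax preserves the ordering of the scores, picking the largest raw score gives the same class as picking the largest probability.

```python
def predict_one(W, x):
    """Return the index of the highest-scoring class for one sample.

    W is a list of c rows, each holding f weights followed by a bias.
    """
    best_class, best_score = 0, float("-inf")
    for k, row in enumerate(W):          # c classes
        score = row[-1]                  # start from the bias
        for w, v in zip(row, x):         # f multiply-adds (zip stops at len(x))
            score += w * v
        if score > best_score:
            best_class, best_score = k, score
    return best_class

# Hypothetical weights for c = 3 classes and f = 2 features.
W = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 0.5],
]
print(predict_one(W, [2.0, 3.0]))  # scores are 2.0, 3.0, 0.5 -> class 1
```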

Complexity of predictions for many samples: O((f+1)cs)

  • (f+1)c - see complexity for one sample
  • s - number of samples
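For a batch, the same per-sample work is simply repeated s times. The sketch below counts the multiply-add steps so the (f+1)cs growth can be checked directly; `predict_many` and the `ops` counter are hypothetical names used only for this illustration.

```python
def predict_many(W, X):
    """Predict a class index for every sample and count the operations.

    W is a list of c rows (f weights followed by a bias); X holds s samples.
    """
    preds = []
    ops = 0
    for x in X:                              # s samples
        scores = []
        for row in W:                        # c classes
            score = row[-1]                  # bias
            for w, v in zip(row, x):         # f multiply-adds
                score += w * v
            ops += len(x) + 1                # f features plus the bias
            scores.append(score)
        preds.append(scores.index(max(scores)))
    return preds, ops

# Hypothetical weights (c = 3, f = 2) and s = 4 samples.
W = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 0.5],
]
X = [[2.0, 3.0], [0.1, 0.2], [5.0, 1.0], [0.0, 4.0]]
preds, ops = predict_many(W, X)
print(preds)  # [1, 2, 0, 1]
print(ops)    # s * c * (f + 1) = 4 * 3 * 3 = 36
```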

Difference between logistic and linear regression in terms of complexity: the activation function.

For multiclass logistic regression it is softmax (sigmoid in the binary case), while linear regression, as the name suggests, has a linear activation (effectively no activation). This does not change the complexity in big-O notation, but it adds on the order of c extra operations per sample during training (left out above to avoid cluttering the picture), multiplied by 2 for backprop.
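As a rough illustration of why the activation does not affect the big-O result, here is a minimal softmax sketch. It operates only on the c class scores that the linear part has already produced, so it never touches the f features again. The max-subtraction is a common numerical-stability trick added here as an assumption; it is not mentioned in the original answer.

```python
import math

def softmax(scores):
    """Turn c raw class scores into probabilities that sum to 1."""
    m = max(scores)                              # shift to avoid overflow in exp
    exps = [math.exp(z - m) for z in scores]     # c exponentials
    total = sum(exps)                            # c - 1 additions
    return [e / total for e in exps]             # c divisions

probs = softmax([2.0, 3.0, 0.5])
# The largest score still gets the largest probability,
# so the predicted class is unchanged by the activation.
print(probs.index(max(probs)))  # 1
```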



Source: https://stackoverflow.com/questions/54238493/what-is-the-search-prediction-time-complexity-of-logistic-regression
