Model Complexity in Machine Learning

What exactly is model complexity? How do we measure it? And how complex should a machine learning model be?

Ishan | Virginia Tech & IIT Delhi
8 min read · Jun 12, 2022

Model Complexity is relative.

Even an inherently complicated algorithm can produce models of different complexity. For instance, a Random Forest is a complicated method, but the complexity of the resulting model depends on the number of trees used: the more trees, the higher the complexity. Hence, model complexity is relative.

On the other hand, linear regression is a simpler method; however, its complexity also increases as the feature space of the data grows.

In general, if we compare a Random Forest Regressor with a Linear Regressor (OLS), we can say the Random Forest Regressor has higher model complexity.
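
To make that concrete, here is a minimal sketch, assuming scikit-learn and a synthetic dataset (both my own choices for illustration). It compares rough complexity proxies: the handful of OLS coefficients versus the total node count of forests with different numbers of trees.

```python
# Minimal sketch: complexity is relative. The same Random Forest algorithm yields
# models of different complexity depending on the number of trees, and both are
# richer than plain OLS. (scikit-learn and synthetic data are assumptions here.)
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

X, y = make_regression(n_samples=500, n_features=5, noise=10.0, random_state=0)

ols = LinearRegression().fit(X, y)
forest_small = RandomForestRegressor(n_estimators=10, random_state=0).fit(X, y)
forest_large = RandomForestRegressor(n_estimators=500, random_state=0).fit(X, y)

# Rough complexity proxies: number of coefficients vs. total number of tree nodes.
print("OLS parameters:", ols.coef_.size + 1)
print("Forest, 10 trees:", sum(t.tree_.node_count for t in forest_small.estimators_), "nodes")
print("Forest, 500 trees:", sum(t.tree_.node_count for t in forest_large.estimators_), "nodes")
```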

Does Model Complexity depend on the data?

Theoretically speaking, an algorithm's complexity doesn't depend on the data. However, the data (or its size) at hand can determine whether a given model turns out to be complex or simple.

For instance, linear regression is a simpler method; however, the complexity of a model built with it increases as the feature space grows, because more weight parameters need to be estimated.

The same algorithm could be complex, simple, or just balanced depending on the data and the model structure.

Let’s consider two training datasets to understand this better:

  1. Sample 1: has a linear trend between target Y and feature X.
  2. Sample 2: has a cubic polynomial relationship between target Y and feature X.

In this case, simple linear regression (OLS) will be just the right fit for Sample 1. On the other hand, it will not be complex enough for Sample 2.

Now let’s consider polynomial regression of order 2 (quadratic). It would be overly complex for Sample 1 and still not complex enough for Sample 2.

Finally, let’s look at polynomial regression of order 3 (cubic). It would be overly complex for Sample 1 and just the right fit for Sample 2.
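
Here is a minimal sketch of that progression, assuming scikit-learn and two synthetic samples standing in for Sample 1 and Sample 2; the training R² simply shows how well each polynomial order can track each dataset.

```python
# Minimal sketch: the same algorithm family can be under-, over-, or just-right
# fitted depending on the data. (Synthetic data and scikit-learn are assumptions.)
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 100).reshape(-1, 1)
y_sample1 = 2 * x.ravel() + rng.normal(scale=0.5, size=100)                   # linear trend
y_sample2 = x.ravel() ** 3 - 2 * x.ravel() + rng.normal(scale=0.5, size=100)  # cubic trend

for degree in (1, 2, 3):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    r2_1 = model.fit(x, y_sample1).score(x, y_sample1)
    r2_2 = model.fit(x, y_sample2).score(x, y_sample2)
    print(f"degree={degree}: R^2 on Sample 1 = {r2_1:.3f}, on Sample 2 = {r2_2:.3f}")
```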

In general, model complexity often depends on the number of features or terms included, as well as on whether the chosen model is linear, nonlinear, and so on.

Is Model Complexity the same as Computational Complexity (Big O)?

Not really. Not exactly.

Model complexity is not the time complexity or the memory complexity of the corresponding algorithm.

Let’s consider the example of the linear regression (OLS) algorithm:

Big O for Linear Regression (OLS) with a dataset of size n x p
(n = observations, p = predictors or features)
- Train-time complexity = O((p^2) * (n + p))
- Run-time complexity = O(p)
- Space complexity (at run time) = O(p)

Now take two training samples:

  1. Sample 1: size (100 x 2) with a linear relationship.
  2. Sample 2: size (100 x 2) with a cubic relationship.

In this case, the linear regression model has the same time and space complexity on both samples, yet it is just the right fit for Sample 1 and not the right fit for Sample 2 (i.e., its model complexity would need to increase).
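
A small sketch of the same point, using two synthetic samples of my own as stand-ins: the OLS fit costs essentially the same on both, yet the adequacy of the fit differs sharply.

```python
# Minimal sketch: identical computational cost, very different adequacy of fit.
# (Synthetic samples and scikit-learn are assumptions for illustration.)
import time
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=(100, 1))
y1 = 2 * x.ravel() + rng.normal(scale=0.5, size=100)                 # Sample 1: linear
y2 = x.ravel() ** 3 - 3 * x.ravel() + rng.normal(scale=0.5, size=100)  # Sample 2: cubic

for name, y in [("Sample 1", y1), ("Sample 2", y2)]:
    start = time.perf_counter()
    model = LinearRegression().fit(x, y)
    elapsed = time.perf_counter() - start
    print(f"{name}: fit time ~{elapsed:.4f}s, training R^2 = {model.score(x, y):.3f}")
```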

Also, theoretically speaking, two different models might have the same time/space complexity yet their model complexity could be very different.

Model Complexity = The richness of the model space

Model complexity refers to how intricate a machine learning model is in terms of its structure and the number of parameters it possesses. In simpler terms, it relates to how well a model can fit the training data and potentially generalize to new, unseen data. A model that is too simple might underfit the data and fail to capture important patterns, while a model that is overly complex might overfit the data and capture noise instead of true relationships.

ML model complexity is often measured by the number of parameters the model possesses. A model is said to become more complex as parameters are added and, conversely, simpler (less complex) as parameters are removed.

This does not imply a one-to-one relationship between model complexity and the number of parameters, since not every parameter is equally important. It is not necessarily correct to say that, of two models, the one with more parameters is more complex (although this is very often the case).

Does that mean two models with the same number of parameters have equal model complexity? Not necessarily. Not all parameters contribute to complexity equally, and some parameters can be tightly coupled or dependent on one another; such dependence among parameters will typically make a model more complex.
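
As a rough illustration (the model choices and sizes below are my own assumptions), parameter counts can be read off fitted scikit-learn models and compared, with the caveat above that the count is only a proxy for complexity:

```python
# Minimal sketch: counting fitted parameters as a rough proxy for complexity.
# (Models, layer sizes, and dataset here are illustrative assumptions.)
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=200, n_features=10, noise=5.0, random_state=0)

ols = LinearRegression().fit(X, y)
mlp = MLPRegressor(hidden_layer_sizes=(20, 20), max_iter=3000, random_state=0).fit(X, y)

n_ols = ols.coef_.size + 1                                                    # weights + intercept
n_mlp = sum(w.size for w in mlp.coefs_) + sum(b.size for b in mlp.intercepts_)
print(f"OLS parameters: {n_ols}")   # 11
print(f"MLP parameters: {n_mlp}")   # 10*20 + 20 + 20*20 + 20 + 20*1 + 1 = 661
```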

Model Complexity may be considered synonymous with model flexibility.

A flexible model has the capacity to capture a complex underlying process.

For instance, linear regressors are very restrictive and hence inflexible, since they impose a linear structure on what the true model looks like. Spline regression models, on the other hand, allow non-linearity and are hence more flexible. In this sense, model complexity can be considered in line (or synonymous) with model flexibility.
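
A minimal sketch of that contrast, assuming scikit-learn 1.0 or later (which provides SplineTransformer) and a synthetic nonlinear dataset:

```python
# Minimal sketch: a rigid linear fit vs. a flexible spline fit on the same
# nonlinear data. (SplineTransformer requires scikit-learn >= 1.0; data is synthetic.)
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import SplineTransformer

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 200).reshape(-1, 1)
y = np.sin(x).ravel() + rng.normal(scale=0.2, size=200)

linear = LinearRegression().fit(x, y)
spline = make_pipeline(SplineTransformer(degree=3, n_knots=8), LinearRegression()).fit(x, y)

print("Linear R^2:", round(linear.score(x, y), 3))   # rigid: cannot follow the sine wave
print("Spline R^2:", round(spline.score(x, y), 3))   # flexible: tracks the curve closely
```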

In the case of clustering with the k-means algorithm, a lower value of k means less flexibility and less model complexity; the higher the k, the more flexible and complex the model.
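
A tiny sketch of the k-means point, assuming scikit-learn and synthetic blobs: as k grows, the within-cluster error (inertia) keeps dropping, reflecting the extra flexibility.

```python
# Minimal sketch: higher k gives k-means more flexibility, so within-cluster
# error (inertia) keeps dropping as k grows. (Synthetic blobs are an assumption.)
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=4, random_state=0)
for k in (2, 4, 8):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    print(f"k={k}: inertia = {km.inertia_:.1f}")
```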

Why do we care about Model Complexity?

We care about model complexity because it directly impacts the performance and generalization ability of machine learning models.

Finding the right balance of model complexity is essential to creating models that can effectively learn from data and make accurate predictions on new, unseen data.

Here’s why model complexity matters:

  1. Underfitting and Overfitting: Model complexity is closely linked to the concepts of underfitting and overfitting. An overly simple model (low complexity) may underfit the data by failing to capture underlying patterns and relationships. On the other hand, an overly complex model (high complexity) can overfit the data by memorizing noise and outliers, leading to poor generalization to new data.
  2. Generalization: The ultimate goal of machine learning is to create models that generalize well to new, unseen data. A model with appropriate complexity strikes a balance between fitting the training data well and being able to make accurate predictions on new data.
  3. Bias-Variance Trade-off: Model complexity is tied to the bias-variance trade-off. A simple model has high bias and low variance, while a complex model has low bias and high variance. Balancing these two factors is crucial to achieving optimal model performance.
  4. Computational Efficiency: More complex models generally require more computational resources (memory, processing power, time) to train and make predictions. Choosing an overly complex model can lead to inefficiencies during both training and deployment.
  5. Model Interpretability: Simpler models are often more interpretable, meaning their decisions and predictions are easier to understand and explain. Complex models can be harder to interpret due to their intricate structures.
  6. Data Requirements: Complex models might require larger datasets to effectively learn patterns without overfitting. Simpler models can sometimes work well with smaller datasets.
  7. Hyperparameter Tuning: Model complexity is influenced by hyperparameters, and finding the right hyperparameters becomes crucial to achieving the desired balance between bias and variance. This process requires experimentation and tuning.
  8. Robustness: Simpler models are less prone to capturing noise and outliers in the data, making them more robust in the presence of noisy data.
  9. Domain Constraints: Some domains have limitations on model complexity due to regulatory, safety, or practical considerations. For instance, in medical applications, interpretability and explainability of models are often critical.

It’s important to note that in some cases, model complexity might not be expressed as a single formula but rather as a combination of hyperparameters that control the model’s behavior. Hyperparameters are settings that are decided before training the model and can impact its complexity, regularization, and learning dynamics. These hyperparameters are tuned through techniques like cross-validation to find the optimal balance between underfitting and overfitting.

Selecting the appropriate level of model complexity is often a crucial task in machine learning. It requires a good understanding of the problem, the data, and the algorithm being used. Techniques like cross-validation and regularization can help in finding the right balance between model simplicity and performance on both the training and validation data.
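
For instance, a minimal cross-validation sketch (scikit-learn, ridge regression, and the alpha grid are all assumptions chosen for illustration) might look like this:

```python
# Minimal sketch: tuning a complexity-controlling hyperparameter (the ridge
# penalty alpha) with cross-validation. Larger alpha means stronger shrinkage,
# i.e. an effectively simpler model. (Dataset and grid are illustrative.)
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

X, y = make_regression(n_samples=200, n_features=30, noise=10.0, random_state=0)

search = GridSearchCV(Ridge(), {"alpha": [0.01, 0.1, 1.0, 10.0, 100.0]}, cv=5)
search.fit(X, y)
print("Best alpha found by 5-fold CV:", search.best_params_["alpha"])
```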

A model should not be too simple, because then it would not be capable of capturing the process behavior with a reasonable degree of accuracy. On the other hand, a model should not be too complex because then it would possess too many parameters to be estimated with the available finite data set. Thus, it is clear that somewhere in between there must exist something like an optimal model complexity.

Intuitive Examples to understand Model Complexity

Case 1: The sample size is large and the number of predictors is small.

In general, a model with relatively higher complexity may perform better. Because of the large sample size, we're less likely to overfit even when using a more flexible model. For instance, a Random Forest could give better results than Linear Regression.

Case 2: The number of predictors is large and the sample size is small.

In general, a model with less complexity may perform better. A flexible model (high model complexity) may cause overfitting because of the small sample size.
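
A minimal sketch of this case (synthetic data and scikit-learn assumed; the specific models are my own picks): with many predictors and few samples, a regularized linear model tends to cross-validate better than a more flexible forest.

```python
# Minimal sketch of Case 2: many predictors, few samples. A flexible model tends
# to overfit, while a simpler, regularized model generalizes better.
# (Synthetic data and the chosen models are illustrative assumptions.)
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=40, n_features=100, n_informative=5,
                       noise=5.0, random_state=0)

for name, model in [("Ridge (less complex)", Ridge(alpha=10.0)),
                    ("Random Forest (more complex)", RandomForestRegressor(random_state=0))]:
    score = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: mean CV R^2 = {score:.3f}")
```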

Case 3: The relationship between the predictors and response is highly non-linear.

A model with relatively higher complexity (for instance, an SVM with an RBF kernel) will generally perform better, because a flexible model is needed to capture the non-linear effect.

Case 4: The variance of the errors is large.

An inflexible model will generally perform better, because a flexible model would capture too much of the noise in the data given the large error variance.

Model Complexity and Formula

1. Linear Regression:

  • Model Complexity: Determined by the number of features and the degree of polynomial expansion.
  • Formula: y = β₀ + β₁x₁ + β₂x₂ + … + βₚxₚ + ε

2. Decision Trees:

  • Model Complexity: Determined by the depth of the tree and the number of leaf nodes.
  • Formula: Not exactly a formula, but it’s a hierarchical structure of if-else conditions.

3. Random Forests:

  • Model Complexity: Determined by the number of trees and their individual depth.
  • Formula: Combination of multiple decision trees with a voting mechanism.

4. Support Vector Machines (SVM):

  • Model Complexity: Determined by the choice of kernel function and the regularization parameter.
  • Formula: y = w·x + b, where w is the weight vector and b is the bias term.

5. k-Nearest Neighbors (k-NN):

  • Model Complexity: Determined by the choice of k (number of neighbors) and the distance metric.
  • Formula: Classification is based on majority class among k-nearest neighbors.

6. Naive Bayes:

  • Model Complexity: Generally simple, determined by the number of features.
  • Formula: Relies on Bayes’ theorem to compute conditional probabilities.

7. Neural Networks:

  • Model Complexity: Determined by the number of layers, number of neurons per layer, and connectivity pattern.
  • Formula: In a simple feedforward neural network, it’s a composition of weighted sums and activation functions across layers.
  • Model complexity in a neural network is usually measured in terms of the number of hidden units and layers, their connectivity, the form of the activation functions, and the parameters of the learning algorithm itself.
  • Increasing the number and size of the layers used in a neural network increases model complexity; training for more epochs than the optimum can also be seen as increasing the effective complexity. (A quick parameter-count sketch follows this list.)
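
As a small arithmetic illustration of the layer-size point above (plain Python, layer sizes chosen purely for the example), the weights and biases of a dense feedforward network can be counted directly:

```python
# Minimal sketch: counting weights and biases in a fully connected feedforward
# network from its layer sizes. (The layer sizes are an illustrative assumption.)
def count_parameters(layer_sizes):
    """Number of weights + biases for a dense feedforward net."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

# 4 inputs -> two hidden layers of 8 -> 1 output: (4*8+8) + (8*8+8) + (8*1+1) = 121
print(count_parameters([4, 8, 8, 1]))   # 121
```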

Summary

Overly complex models are harder to interpret, at greater risk of overfitting, and likely more computationally expensive.

So what methods can be used to reduce a model's complexity?

These methods include (two of them are sketched in code below):

  • Linear models and subset selection
  • Shrinkage methods (including regularisation)
  • Dimensionality reduction (for example, PCA)
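
A minimal sketch of the last two items, assuming scikit-learn and synthetic data: the lasso shrinks many coefficients toward (often exactly) zero, and PCA shrinks the feature space itself.

```python
# Minimal sketch: two ways to reduce model complexity, shrinkage via the lasso
# and dimensionality reduction via PCA. (Synthetic data is an assumption.)
import numpy as np
from sklearn.datasets import make_regression
from sklearn.decomposition import PCA
from sklearn.linear_model import Lasso

X, y = make_regression(n_samples=200, n_features=50, n_informative=5,
                       noise=5.0, random_state=0)

lasso = Lasso(alpha=5.0).fit(X, y)
print("Non-zero lasso coefficients:", np.count_nonzero(lasso.coef_), "of", X.shape[1])

X_reduced = PCA(n_components=10).fit_transform(X)
print("Feature space after PCA:", X_reduced.shape[1], "components")
```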

Can we measure Model Complexity in the absolute sense?

There is no absolute measure of a model's complexity. However, there are many theories (and measures built around them) aimed at explaining or quantifying model complexity, or the richness of the model space, such as the VC dimension, Rademacher complexity, and information criteria like AIC and BIC.
