【图文】A collection of machine learning algorithms: from Bayes to deep learning and their respective advantages and disadvantages（part one）

篱笆资讯

A collection of machine learning algorithms: from Bayes to deep learning and their respective advantages and disadvantages（part one）

In the applications that we use in our daily lives, such as recommendation systems, smart image beautification applications and chatbots, a variety of machine learning and data processing algorithms are doing their part. In this article, we filter and briefly introduce some of the most common algorithm categories, and for each category, we list some actual algorithms and briefly introduce their advantages and disadvantages.

https://static.coggle.it/diagra

catalogue

• Regularization Algorithm
• Ensemble Algorithms
• Decision Tree Algorithm
• Regression
• Artificial Neural Network
• Deep Learning
• Support Vector Machine
• Dimensionality Reduction Algorithms
• Clustering Algorithms
• Instance-based Algorithms
• Bayesian Algorithms
• Association Rule Learning Algorithms
• Graphical Models

Regularization Algorithms

It is an extension of another method (usually a regression method) that penalizes models based on their complexity, preferring models that are relatively simple and can be better generalized.

Example：
• Ridge Regression
• LASSO
• GLASSO
• Elastic Net
• Least-Angle Regression

Advantages：

• Its penalty will reduce overfitting
• There will always be a solution

Disadvantages：
• Penalties will cause underfitting
• Hard to calibrate

Ensemble algorithms

The Ensemble Algorithms is an integration model, in which multiple weaker models are integrated into a model set, where the models can be trained individually and their predictions can be combined in some way to make an overall prediction.

The main problem of the algorithm is to find out which of the weaker models can be combined and the way of the combination. This is a very powerful set of techniques and is enjoy great popularity.

• Boosting
• Bootstrapped Aggregation（Bagging）
• AdaBoost
• Stacked Generalization（blending）
• Gradient Boosting Machines，GBM
• Gradient Boosted Regression Trees，GBRT
• Random Forest

Advantages：
• Almost all of the most advanced predictions available today use algorithm integration. It is much more accurate than predictions with individual model

Disadvantages：
• Requires a lot of maintenance work

Decision Tree Algorithm

Decision trees learn and use a decision tree as a predictive model that maps observations on an item (characterized on branches) into conclusions about the target value of that item (characterized on leaves).

The target in a tree model is variable and can take a finite set of values, and is called a classification tree; in these tree structures, the leaves represent class labels and the branches represent features that characterize the connections of these class labels.

Examples:
• Classification and Regression Tree，CART
• Iterative Dichotomiser 3（ID3）

• C4.5 and C5.0 (two different versions of a powerful method)

Advantages:

• Easy to interpret

• Non-parametric type

Disadvantages:

• Tends to be overfitting

• May or may not be caught in local minimums

• No online learning

Regression Algorithms

Regression is a statistical process used to estimate the relationship between two variables. The algorithm provides many techniques for modeling and analyzing multiple variables when it is used to analyze the relationship between the dependent variable and one or more independent variables. To be more specific, regression analysis can help us understand when any of the independent variables change and the other independent variable remains constant, the typical value of the change in the dependent variable. Most commonly, regression analysis can be used to evaluate the conditional expectation of the dependent variable given the conditions of the independent variable.

Regression algorithms are the main algorithms in statistics, which have been incorporated into statistical machine learning.

Examples:
• Ordinary Least Squares Regression，OLSR
• Linear Regression
• Logistic Regression
• Stepwise Regression
• Multivariate Adaptive Regression Splines，MARS
• Locally Estimated Scatterplot Smoothing，LOESS

Advantages：

• Straightforward and fast

• Well-known

Disadvantages：
• Requires strict assumptions

• Requires handling of abnormal values

Artificial Neural Network

Artificial neural network is an algorithm model inspired by biological neural networks.

It is a pattern matching, often used for regression and classification problems, but has a huge subdomain consisting of hundreds of algorithms and variants for various types of problems.

Example：

• Perceptron

• Counterpropagation

• Hopfield networks

• Radial Basis Function Network (RBFN)

Advantages:

• Performs extremely well in tasks for voice, semantics, vision, and various games (e.g., Go-chess).

• Algorithms can be quickly adapted to new problems.

Disadvantages:

Requires large amounts of data for training

Training requires high hardware configuration

The model is in a "black box state" and it is difficult to understand the internal mechanism