Guest blog | Compiled by vitamin k | Source: Analytics Vidhya

Introduction

In a machine learning project, you follow a series of steps until you reach your goal, and one of the steps you must perform is hyperparameter optimization of the model you have chosen. This task always comes after model selection (choosing the best-performing model among the candidates).

What is hyperparameter optimization?

Before defining hyperparameter optimization, you need to understand what a hyperparameter is. In short, hyperparameters are the values of configuration parameters that control the learning process, and they have a significant impact on the performance of machine learning models.

Examples of hyperparameters in the random forest algorithm are the number of estimators (n_estimators), the maximum depth (max_depth), and the split criterion (criterion). These parameters are tunable and directly affect the quality of the trained model.

Hyperparameter optimization means finding the combination of hyperparameter values that achieves maximum performance on your data in a reasonable amount of time. It plays a major role in the prediction accuracy of a machine learning algorithm, which is why it is often considered the most difficult part of building a machine learning model.

Most machine learning algorithms come with default hyperparameter values. The defaults do not always perform well across different types of machine learning projects, which is why you need to optimize them to find the combination that yields the best performance.
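To make this concrete, here is a minimal sketch (using scikit-learn, which this article also uses later) that inspects an estimator's default hyperparameters via its `get_params` method:

```python
from sklearn.ensemble import RandomForestClassifier

# get_params() returns every hyperparameter the estimator exposes,
# together with its current (here: default) value
defaults = RandomForestClassifier().get_params()
print(defaults["n_estimators"], defaults["max_depth"], defaults["criterion"])
# In recent scikit-learn versions: 100 None gini
```

These defaults are a reasonable starting point, but as the article argues, they are rarely the best choice for a specific dataset.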

Good hyperparameters can make an algorithm shine.

There are some common strategies for optimizing hyperparameters:

(a) Grid search

This is a widely used traditional approach to hyperparameter tuning: it determines the best values for a given model by trying every possible combination of the supplied parameter values. This exhaustiveness means that a full search takes a lot of time and can be computationally very expensive.
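As a minimal sketch of the idea (scikit-learn's GridSearchCV on the iris toy dataset; the dataset and the grid below are illustrative, not taken from the linked notebook):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)

# Grid search evaluates every combination: 3 x 2 = 6 candidates,
# each cross-validated, which is why cost grows quickly with grid size
param_grid = {
    "n_estimators": [50, 100, 200],
    "criterion": ["gini", "entropy"],
}
grid = GridSearchCV(RandomForestClassifier(random_state=0),
                    param_grid, cv=3, scoring="accuracy")
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)
```

With larger grids and slower models, the number of fits (grid size times CV folds) is exactly what makes grid search expensive.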

Note: You can learn how to implement grid search here: github.com/Davisy/Hype…

(b) Random search

This approach works differently: random combinations of hyperparameter values are tried to find the best solution for the model being built. The drawback of random search is that it can miss important points (values) in the search space.
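A comparable sketch with scikit-learn's RandomizedSearchCV (again on the iris toy dataset, with illustrative distributions): only `n_iter` randomly sampled combinations are evaluated, so some regions of the space are never visited, which is both its speed advantage and its risk:

```python
from scipy.stats import randint
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = load_iris(return_X_y=True)

# Only n_iter=5 random combinations are tried, not the full space
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions={"n_estimators": randint(50, 300),
                         "max_depth": randint(1, 15)},
    n_iter=5, cv=3, scoring="accuracy", random_state=0)
search.fit(X, y)
print(search.best_params_)
```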

Note: You can learn more about implementing random search here: github.com/Davisy/Hype…

Hyperparameter optimization techniques

In this series of articles, I will introduce different advanced hyperparameter optimization techniques/methods that can help you obtain the best parameters for a given model. We will look at the following techniques.

  • Hyperopt
  • Scikit Optimize
  • Optuna

In this article, I will focus on the implementation of Hyperopt.

What is Hyperopt?

Hyperopt is a powerful Python library for hyperparameter optimization developed by James Bergstra. Hyperopt uses a form of Bayesian optimization for parameter tuning that allows you to get the best parameters for a given model. It can optimize models with hundreds of parameters on a large scale.

Features of Hyperopt

Hyperopt has four important features you need to know in order to run your first optimization.

(a) Search space

Hyperopt provides different functions to specify ranges for input parameters; these are stochastic search spaces. The most commonly used search options are:

  • hp.choice(label, options)- Used for categorical parameters; it returns one of the options, which should be a list or tuple. Example: hp.choice("criterion", ["gini", "entropy"])
  • hp.randint(label, upper)- Used for integer parameters; it returns a random integer in the range [0, upper). Example: hp.randint("max_features", 50)
  • hp.uniform(label, low, high)- Returns a value uniformly distributed between low and high. Example: hp.uniform("max_leaf_nodes", 1, 10)

Other options you can use include:

  • hp.normal(label, mu, sigma)- Returns a real value drawn from a normal distribution with mean mu and standard deviation sigma
  • hp.qnormal(label, mu, sigma, q)- Returns a value like round(normal(mu, sigma) / q) * q
  • hp.lognormal(label, mu, sigma)- Returns exp(normal(mu, sigma))
  • hp.qlognormal(label, mu, sigma, q)- Returns a value like round(exp(normal(mu, sigma)) / q) * q

You can learn more about search space options here: github.com/hyperopt/hy…

Note: Every optimizable stochastic expression has a label (for example, n_estimators) as its first argument. These labels are used to return the parameter choices to the caller during optimization.

(b) Objective function

This is the function to minimize. It takes hyperparameter values from the search space as input and returns a loss. During optimization, we train the model with the selected hyperparameter values, predict the target feature, evaluate the prediction error, and return it to the optimizer. The optimizer decides which values to check next and iterates again. You will learn how to create an objective function in the practical example.

(c) fmin

The fmin function is the optimization function that iterates over different sets of algorithms and their hyperparameters and then minimizes the objective function. fmin takes five inputs:

  • The objective function to minimize

  • The defined search space

  • The search algorithm to use, such as random search, Tree of Parzen Estimators (TPE), and Adaptive TPE.

    Note: hyperopt.rand.suggest and hyperopt.tpe.suggest provide the logic for sequential search of the hyperparameter space.

  • The maximum number of evaluations

  • A Trials object (optional)

Example:

from hyperopt import fmin, tpe, hp, Trials

trials = Trials()

best = fmin(fn=lambda x: x ** 2,
            space=hp.uniform('x', -10, 10),
            algo=tpe.suggest,
            max_evals=50,
            trials=trials)

print(best)
(d) Trials object

The Trials object stores all of the hyperparameters, losses, and other information, which means you can access them after running the optimization. Trials also lets you save and load this important information and then continue the optimization process. You'll learn more about this in the practical example.

from hyperopt import Trials 

trials = Trials()

Now that we have covered Hyperopt's important features, the overall workflow for using Hyperopt is as follows.

  • Initialize the space to search.

  • Define the objective function.

  • Select the search algorithm to use.

  • Run the Hyperopt function.

  • Analyze the evaluation outputs stored in the trials object.

Hyperopt in practice

Now that you know the important features of Hyperopt, let's put them into practice. We will use the mobile price dataset; the task is to create a model that predicts whether the price of a mobile device is 0 (low cost), 1 (medium cost), 2 (high cost), or 3 (very high cost).

Install Hyperopt

You can install Hyperopt from PyPI.

pip install hyperopt

Then import the important packages

# import packages
import numpy as np 
import pandas as pd 
from sklearn.ensemble import RandomForestClassifier 
from sklearn import metrics
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import StandardScaler 
from hyperopt import tpe, hp, fmin, STATUS_OK, Trials
from hyperopt.pyll.base import scope

import warnings
warnings.filterwarnings("ignore")

The data set

Let's load the dataset from the data directory. For more information about this dataset, see: www.kaggle.com/iabhishekof…

# load data

data = pd.read_csv("data/mobile_price_data.csv")

Check the first five rows of the dataset.

# fetch data

data.head()

As you can see, our dataset contains a variety of numerical features.

Let’s look at the shape of the data set.

# display shape

data.shape

(2000, 21)

In this data set, we have 2000 rows and 21 columns. Now let’s look at the list of features in this dataset.

# display list

list(data.columns)
['battery_power', 'blue', 'clock_speed', 'dual_sim', 'fc', 'four_g', 'int_memory', 'm_dep', 'mobile_wt', 'n_cores', 'pc', 'px_height', 'px_width', 'ram', 'sc_h', 'sc_w', 'talk_time', 'three_g', 'touch_screen', 'wifi', 'price_range']

You can find the meaning of each column name here: www.kaggle.com/iabhishekof…

Split the dataset into target and independent features

This is a classification problem, so we will separate the target feature from the independent features in the dataset. Our target is price_range.

# Split the data into features and targets

X = data.drop("price_range", axis=1).values 
y = data.price_range.values

Preprocessing data set

Next, the independent features are standardized using scikit-learn's StandardScaler.

# Standardize characteristic variables

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

Define the parameter space for optimization

We will tune three hyperparameters of the random forest algorithm: n_estimators, max_depth, and criterion.

space = {
    "n_estimators": hp.choice("n_estimators", [100, 200, 300, 400, 500, 600]),
    "max_depth": hp.quniform("max_depth", 1, 15, 1),
    "criterion": hp.choice("criterion", ["gini", "entropy"]),
}

We have set different candidate values for the selected hyperparameters. Next, define the objective function.

Define the minimization function (objective function)

Our minimization function is called hyperparameter_tuning, and the classification algorithm whose hyperparameters we optimize is random forest. I use cross-validation to avoid overfitting; the function then returns a loss value and a status.

# Define the objective function

def hyperparameter_tuning(params):
    # hp.quniform returns a float; scikit-learn expects an integer max_depth
    params = {**params, "max_depth": int(params["max_depth"])}
    clf = RandomForestClassifier(**params, n_jobs=-1)
    acc = cross_val_score(clf, X_scaled, y, scoring="accuracy").mean()
    return {"loss": -acc, "status": STATUS_OK}

Note: Remember that hyperopt minimizes the function, which is why I add a minus sign to acc.

Fine-tuning model

Finally, instantiate the Trials object, fine-tune the model, and then print the best loss together with its hyperparameter values.

# Initialize the Trials object
trials = Trials()

best = fmin(
    fn=hyperparameter_tuning,
    space = space, 
    algo=tpe.suggest, 
    max_evals=100, 
    trials=trials
)

print("Best: {}".format(best))
100%|██████████| 100/100 [10:30<00:00, 6.30s/trial, best loss: -0.8915]
Best: {'criterion': 1, 'max_depth': 11.0, 'n_estimators': 2}

After hyperparameter optimization, the best loss is -0.8915: using n_estimators=300, max_depth=11, and criterion="entropy" in the random forest classifier, the model reaches an accuracy of 89.15%.

Analyzing the results with the Trials object

The Trials object helps us examine all the return values computed during the experiment.

(a) trials.results

This shows the list of dictionaries returned by the objective function during the search.

trials.results
[{'loss': -0.8790000000000001, 'status': 'ok'}, {'loss': -0.877, 'status': 'ok'}, {'loss': -0.768, 'status': 'ok'}, {'loss': -0.8205, 'status': 'ok'}, {'loss': -0.8720000000000001, 'status': 'ok'}, {'loss': -0.883, 'status': 'ok'}, {'loss': -0.8554999999999999, 'status': 'ok'}, {'loss': -0.8789999999999999, 'status': 'ok'}, {'loss': -0.595, 'status': 'ok'}, ...]
(b) trials.losses()

This shows a list of the losses.

trials.losses()
[-0.8790000000000001, -0.877, -0.768, -0.8205, -0.8720000000000001, -0.883, -0.8554999999999999, -0.8789999999999999, -0.595, -0.8765000000000001, -0.877, ...]
(c) trials.statuses()

This displays a list of status strings.

trials.statuses()
Copy the code
['ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', ...]

Note: The trials object can be saved, passed to built-in plotting routines, or analyzed with your own custom code.

Conclusion

Congratulations, you have reached the end of this article!

You can download the dataset and notebook used in this article here: github.com/Davisy/Hype…

Original link: www.analyticsvidhya.com/blog/2020/0…

Welcome to panchuangai blog: panchuang.net/

Sklearn123.com/

Welcome to docs.panchuang.net/