Hyperparameter Optimization 4 - Leveraging Genetic Algorithms for Hyperparameter Tuning in Python


In the previous blog posts, we introduced the concept of hyperparameter optimization and explored various techniques, including Grid Search, Random Search, and advanced optimization libraries like Optuna, Hyperopt, and Scikit-Optimize. In this post, we will explore another powerful technique for hyperparameter tuning: Genetic Algorithms (GAs). We will discuss the basics of genetic algorithms and provide a practical example using the DEAP library in Python.

Genetic Algorithms

Genetic algorithms are a class of optimization algorithms inspired by the process of natural selection. They are used to find approximate solutions to optimization problems by mimicking the process of evolution. Genetic algorithms work by maintaining a population of candidate solutions and iteratively applying genetic operators such as selection, crossover (recombination), and mutation to evolve the population towards better solutions.

The main advantage of genetic algorithms is their ability to explore a large search space efficiently. They are particularly useful for optimization problems where the search space is complex, non-linear, and has multiple local optima.
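To make the loop concrete, here is a minimal, self-contained sketch of a generic genetic algorithm in plain Python, maximizing a toy one-dimensional function. The objective, population size, and rates below are illustrative choices only and are independent of the DEAP example later in this post.

import random

def fitness(x):
    # Toy objective to maximize: a smooth function with its peak at x = 3.
    return -(x - 3) ** 2

def evolve(pop_size=20, generations=30, cx_rate=0.7, mut_rate=0.2):
    # Start from a random population of candidate solutions.
    population = [random.uniform(-10, 10) for _ in range(pop_size)]
    for _ in range(generations):
        def select():
            # Selection: a tournament of size 2 favours fitter parents.
            a, b = random.sample(population, 2)
            return a if fitness(a) > fitness(b) else b
        offspring = []
        while len(offspring) < pop_size:
            parent1, parent2 = select(), select()
            # Crossover: blend the two parents into a child.
            child = (parent1 + parent2) / 2 if random.random() < cx_rate else parent1
            # Mutation: small random perturbation to keep exploring.
            if random.random() < mut_rate:
                child += random.gauss(0, 1)
            offspring.append(child)
        population = offspring
    return max(population, key=fitness)

print(evolve())  # prints a value close to 3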

DEAP Library

DEAP (Distributed Evolutionary Algorithms in Python) is a popular Python library for implementing genetic algorithms and other evolutionary computation techniques. DEAP provides a flexible framework for defining custom genetic operators, selection strategies, and evaluation functions, making it suitable for a wide range of optimization problems, including hyperparameter tuning in machine learning.
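As a quick warm-up before the full SVM example below, the following minimal sketch shows the typical DEAP workflow (creator, toolbox, built-in operators, eaSimple) on a toy one-dimensional problem; the class names, objective, and parameter values here are illustrative only.

import random
from deap import base, creator, tools, algorithms

# Toy problem: maximize -(x - 3)^2, i.e. find x close to 3.
# Distinct class names are used so this sketch does not clash with the main example.
creator.create("FitnessMaxDemo", base.Fitness, weights=(1.0,))
creator.create("IndividualDemo", list, fitness=creator.FitnessMaxDemo)

toolbox = base.Toolbox()
toolbox.register("attr_x", random.uniform, -10, 10)
toolbox.register("individual", tools.initRepeat, creator.IndividualDemo, toolbox.attr_x, n=1)
toolbox.register("population", tools.initRepeat, list, toolbox.individual)
toolbox.register("evaluate", lambda ind: (-(ind[0] - 3) ** 2,))
toolbox.register("mate", tools.cxBlend, alpha=0.5)
toolbox.register("mutate", tools.mutGaussian, mu=0, sigma=1, indpb=1.0)
toolbox.register("select", tools.selTournament, tournsize=3)

pop = toolbox.population(n=30)
pop, log = algorithms.eaSimple(pop, toolbox, cxpb=0.5, mutpb=0.2, ngen=20, verbose=False)
print(tools.selBest(pop, 1)[0])  # best individual found, close to [3]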

Example: Hyperparameter Tuning with Genetic Algorithms in Python

In this example, we will demonstrate hyperparameter tuning with a Genetic Algorithm, using the DEAP library to tune a Support Vector Machine (SVM) classifier on the famous Iris dataset.

  1. Import necessary libraries and load the dataset:
import numpy as np
import pandas as pd
from sklearn import datasets
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.svm import SVC
from sklearn.metrics import classification_report
from deap import base, creator, tools, algorithms
import random

iris = datasets.load_iris()
X = iris.data
y = iris.target
  2. Split the dataset into training and testing sets:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
  3. Define the evaluation function for the Genetic Algorithm:
def evaluate_individual(individual):
    # Unpack the candidate hyperparameters encoded in the individual.
    C, kernel, gamma = individual
    svm = SVC(C=C, kernel=kernel, gamma=gamma)
    # Fitness is the mean 5-fold cross-validation accuracy on the training set.
    score = np.mean(cross_val_score(svm, X_train, y_train, cv=5))
    # DEAP expects fitness values as a tuple, hence the trailing comma.
    return score,
  4. Set up the DEAP framework for the Genetic Algorithm:
creator.create("FitnessMax", base.Fitness, weights=(1.0,))
creator.create("Individual", list, fitness=creator.FitnessMax)

toolbox = base.Toolbox()
# Attribute generators for the three genes: C, kernel, and gamma.
toolbox.register("C", random.uniform, 1e-2, 1e2)
toolbox.register("kernel", random.choice, ['linear', 'rbf'])
toolbox.register("gamma", random.uniform, 1e-4, 1e1)

toolbox.register("individual", tools.initCycle, creator.Individual, (toolbox.C, toolbox.kernel, toolbox.gamma), n=1)
toolbox.register("population", tools.initRepeat, list, toolbox.individual)

def mutate_individual(individual, indpb=0.1):
    # Custom mutation: Gaussian noise for the numeric genes (clipped to their ranges) and a
    # fresh random choice for the categorical kernel gene; a generic tools.mutGaussian would
    # fail on the kernel string and could push C or gamma below zero.
    if random.random() < indpb:
        individual[0] = min(max(individual[0] + random.gauss(0, 1), 1e-2), 1e2)
    if random.random() < indpb:
        individual[1] = random.choice(['linear', 'rbf'])
    if random.random() < indpb:
        individual[2] = min(max(individual[2] + random.gauss(0, 1), 1e-4), 1e1)
    return individual,

toolbox.register("mate", tools.cxTwoPoint)
toolbox.register("mutate", mutate_individual, indpb=0.1)
# Tournament selection provides selection pressure; selBest would simply return the population.
toolbox.register("select", tools.selTournament, tournsize=3)
toolbox.register("evaluate", evaluate_individual)
  5. Run the Genetic Algorithm:
population = toolbox.population(n=50)
hof = tools.HallOfFame(1)  # keeps track of the single best individual ever seen

# Collect simple statistics of the population fitness at each generation.
stats = tools.Statistics(lambda ind: ind.fitness.values)
stats.register("avg", np.mean)
stats.register("min", np.min)
stats.register("max", np.max)

# cxpb is the crossover probability, mutpb the mutation probability,
# and ngen the number of generations to evolve.
population, logbook = algorithms.eaSimple(population, toolbox, cxpb=0.5, mutpb=0.2, ngen=50,
                                          stats=stats, halloffame=hof, verbose=True)
  6. Evaluate the best model:
# Retrieve the best hyperparameters found by the GA, retrain on the full
# training set, and evaluate on the held-out test set.
best_individual = hof[0]
best_model = SVC(C=best_individual[0], kernel=best_individual[1], gamma=best_individual[2])
best_model.fit(X_train, y_train)
y_pred = best_model.predict(X_test)
print(classification_report(y_test, y_pred))
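
Note that the Genetic Algorithm itself is stochastic, so the selected hyperparameters and the final scores can vary slightly from run to run; calling random.seed with a fixed value before building the initial population makes a run reproducible.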

Conclusion

In this blog post, we explored the use of Genetic Algorithms for hyperparameter tuning in machine learning. We discussed the basics of genetic algorithms and provided a practical example using the DEAP library in Python. Genetic algorithms are a powerful technique for efficiently exploring large and complex search spaces, making them a valuable tool for hyperparameter optimization. In the next blog post, we will explore more advanced techniques for hyperparameter optimization, such as Hyperband and population-based training.
Continue your learning by reading:
Advanced Techniques, Hyperband and Population-Based Training for Hyperparameter Optimization


Author: robot learner