Kernel Functions
Kernel functions let linear algorithms solve nonlinear problems by implicitly mapping the data into a higher-dimensional space, without ever computing that mapping explicitly.
The linear separability problem
Some datasets cannot be separated by a straight line (in 2D) or by a hyperplane (in higher dimensions).
import numpy as np
import matplotlib.pyplot as plt
# Data that are not linearly separable (concentric circles)
np.random.seed(42)
n = 200
# Class 0 - inner circle
r1 = np.random.uniform(0, 2, n//2)
theta1 = np.random.uniform(0, 2*np.pi, n//2)
X1 = np.column_stack([r1 * np.cos(theta1), r1 * np.sin(theta1)])
# Class 1 - outer circle
r2 = np.random.uniform(3, 5, n//2)
theta2 = np.random.uniform(0, 2*np.pi, n//2)
X2 = np.column_stack([r2 * np.cos(theta2), r2 * np.sin(theta2)])
X = np.vstack([X1, X2])
y = np.array([0] * (n//2) + [1] * (n//2))
plt.scatter(X[:, 0], X[:, 1], c=y, cmap='coolwarm')
plt.title('Data that are not linearly separable')
plt.show()
The idea behind the kernel trick
Mapping into a higher-dimensional space
If we add a new dimension z = x_1^2 + x_2^2 (the squared distance from the origin), the data become linearly separable:
# Transformation: add z = x1² + x2²
z = X[:, 0]**2 + X[:, 1]**2
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401 (registers the 3D projection on older Matplotlib)
fig = plt.figure(figsize=(10, 5))
ax1 = fig.add_subplot(121)
ax1.scatter(X[:, 0], X[:, 1], c=y, cmap='coolwarm')
ax1.set_title('2D - Not separable')
ax2 = fig.add_subplot(122, projection='3d')
ax2.scatter(X[:, 0], X[:, 1], z, c=y, cmap='coolwarm')
ax2.set_title('3D - Separable!')
plt.show()
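The separability claim can also be checked numerically. The sketch below rebuilds rings with the same radii as above (using its own random generator, so the points differ from the plotted ones) and classifies with a single cut on z. The threshold 6.25 is an assumption chosen for illustration; any value between 2² = 4 and 3² = 9 would do.

```python
import numpy as np

# Rebuild concentric data with the same radii as in the example above
rng = np.random.default_rng(42)
n = 200
r = np.concatenate([rng.uniform(0, 2, n // 2),   # class 0: inner circle
                    rng.uniform(3, 5, n // 2)])  # class 1: outer circle
theta = rng.uniform(0, 2 * np.pi, n)
X_demo = np.column_stack([r * np.cos(theta), r * np.sin(theta)])
y_demo = np.array([0] * (n // 2) + [1] * (n // 2))

# Lifted coordinate: squared distance from the origin
z = X_demo[:, 0] ** 2 + X_demo[:, 1] ** 2

# Hypothetical threshold between the two rings (4 < 6.25 < 9)
threshold = 6.25
y_pred = (z > threshold).astype(int)
accuracy = np.mean(y_pred == y_demo)
print(f"Accuracy of the cut z = {threshold}: {accuracy:.3f}")
```

In the lifted 3D space this cut is the horizontal plane z = 6.25, which is a linear decision boundary even though it corresponds to a circle in the original 2D plane.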
Kernel Trick
Instead of computing the transformation \phi(x) explicitly, we use a kernel function:
K(x, x') = \langle \phi(x), \phi(x') \rangle
It returns the dot product in the transformed space without actually transforming the data.
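As a small worked example of this idea, the sketch below compares the two routes for the homogeneous degree-2 polynomial kernel in 2D. The feature map `phi` and the helper `poly2_kernel` are illustrative names introduced here, not part of any library.

```python
import numpy as np

def phi(x):
    """Explicit feature map for the kernel (x . x')^2 in two dimensions."""
    x1, x2 = x
    return np.array([x1 ** 2, np.sqrt(2) * x1 * x2, x2 ** 2])

def poly2_kernel(x, x_prime):
    """The same quantity computed directly in the original 2D space."""
    return np.dot(x, x_prime) ** 2

x = np.array([1.0, 2.0])
x_prime = np.array([3.0, -1.0])

explicit = np.dot(phi(x), phi(x_prime))  # dot product in the 3D feature space
implicit = poly2_kernel(x, x_prime)      # kernel evaluated in the input space
print(explicit, implicit)
```

Both routes give the same number (up to rounding), but the kernel route never builds the 3D vectors. For kernels such as RBF, where the feature space is infinite-dimensional, only the kernel route is available.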
Kernel types
1. Linear Kernel
K(x, x') = x^T x'
Equivalent to applying no transformation. Used when the data are already linearly separable.
from sklearn.svm import SVC
svm_linear = SVC(kernel='linear')
svm_linear.fit(X, y)
2. Polynomial Kernel
K(x, x') = (\gamma x^T x' + r)^d
- d = degree of the polynomial
- \gamma = scaling coefficient
- r = constant term (coef0 in scikit-learn)
# Polynomial kernel of degree 3
svm_poly = SVC(kernel='poly', degree=3, gamma='scale', coef0=1)
svm_poly.fit(X, y)
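To make the roles of d, \gamma and r concrete, the following sketch evaluates (\gamma x^T x' + r)^d by hand and compares it with scikit-learn's `polynomial_kernel` helper. The matrices `A` and `B` and the parameter values are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 2))  # 5 points in 2D
B = rng.normal(size=(4, 2))  # 4 points in 2D

degree, gamma, coef0 = 3, 0.5, 1.0  # illustrative values for d, gamma, r

# Kernel matrix computed directly from the formula
K_manual = (gamma * A @ B.T + coef0) ** degree

# The same matrix from scikit-learn's pairwise helper
K_sklearn = polynomial_kernel(A, B, degree=degree, gamma=gamma, coef0=coef0)

print(K_manual.shape, np.allclose(K_manual, K_sklearn))
```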
3. RBF (Gaussian) Kernel
K(x, x') = \exp(-\gamma \|x - x'\|^2)
The most widely used kernel. It corresponds to a mapping into an infinite-dimensional space.
- large \gamma → complex boundary (risk of overfitting)
- small \gamma → smooth boundary
# RBF kernel
svm_rbf = SVC(kernel='rbf', gamma='scale')
svm_rbf.fit(X, y)
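The effect of \gamma is easier to see on raw kernel values. The sketch below assumes the standard RBF definition exp(-\gamma ||x - x'||^2) and tabulates it for a few distances. The helper `rbf` and the chosen distances and \gamma values are illustrative.

```python
import numpy as np

def rbf(x, x_prime, gamma):
    """RBF kernel value between two points."""
    return np.exp(-gamma * np.sum((x - x_prime) ** 2))

origin = np.zeros(2)
distances = [0.5, 1.0, 2.0]
gammas = [0.1, 1.0, 10.0]

table = {}
for gamma in gammas:
    # Similarity between the origin and a point at each distance along the x-axis
    table[gamma] = [rbf(origin, np.array([d, 0.0]), gamma) for d in distances]
    row = ", ".join(f"d={d}: {k:.4f}" for d, k in zip(distances, table[gamma]))
    print(f"gamma={gamma:<5} {row}")
```

With a large \gamma the similarity collapses towards zero after a short distance, so each training point only affects its immediate neighbourhood. With a small \gamma distant points still count as similar, which yields smoother boundaries.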
4. Sigmoid Kernel
K(x, x') = \tanh(\gamma x^T x' + r)
Resembles the activation function of a neuron in a neural network.
svm_sigmoid = SVC(kernel='sigmoid', gamma='scale', coef0=0)
svm_sigmoid.fit(X, y)
Support Vector Machines (SVM)
An SVM finds the optimal hyperplane, that is, the one that separates the classes with the maximum margin.
Linear SVM
from sklearn.svm import SVC
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Two-feature data with one cluster per class (roughly linearly separable)
X, y = make_classification(n_samples=200, n_features=2, n_redundant=0,
                           n_informative=2, n_clusters_per_class=1,
                           random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Linear SVM
svm = SVC(kernel='linear', C=1.0)
svm.fit(X_train, y_train)
print(f"Accuracy: {accuracy_score(y_test, svm.predict(X_test)):.3f}")
print(f"Support vectors: {len(svm.support_vectors_)}")
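For a linear SVM the margin can be read off the fitted model: with weight vector w, the distance between the two margin hyperplanes is 2 / ||w||. The sketch below uses a separate toy dataset (`make_blobs` with two well-separated centers, an assumption made so that the classes are separable) and a large C to approximate a hard margin.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two well-separated clusters (illustrative data, not the dataset used above)
X_demo, y_demo = make_blobs(n_samples=100, centers=[[-3, -3], [3, 3]],
                            cluster_std=0.8, random_state=0)

# A large C penalizes margin violations heavily (close to a hard margin)
clf = SVC(kernel='linear', C=1000.0).fit(X_demo, y_demo)

w = clf.coef_[0]        # normal vector of the separating hyperplane
b = clf.intercept_[0]   # offset of the hyperplane
margin_width = 2.0 / np.linalg.norm(w)
train_accuracy = clf.score(X_demo, y_demo)

print(f"w = {w}, b = {b:.3f}")
print(f"Margin width: {margin_width:.3f}")
print(f"Support vectors: {len(clf.support_vectors_)}")
```

Only the support vectors determine w and b. Removing any other training point would leave the hyperplane unchanged.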
Visualizing the decision boundary
import numpy as np
import matplotlib.pyplot as plt
def plot_decision_boundary(model, X, y, title):
    """Shade the predicted class over a grid and overlay the data points."""
    h = 0.02  # grid step
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                         np.arange(y_min, y_max, h))
    Z = model.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    plt.contourf(xx, yy, Z, alpha=0.3, cmap='coolwarm')
    plt.scatter(X[:, 0], X[:, 1], c=y, cmap='coolwarm', edgecolors='black')
    plt.title(title)
# Kernel comparison
fig, axes = plt.subplots(2, 2, figsize=(12, 10))
kernels = [
    ('linear', {}),
    ('poly', {'degree': 3}),
    ('rbf', {'gamma': 'scale'}),
    ('rbf', {'gamma': 0.1})
]
for ax, (kernel, params) in zip(axes.flatten(), kernels):
    plt.sca(ax)
    svm = SVC(kernel=kernel, **params)
    svm.fit(X, y)
    # Mention the hyperparameter that distinguishes each panel
    if 'gamma' in params:
        title = f"{kernel} (gamma={params['gamma']})"
    elif 'degree' in params:
        title = f"{kernel} (degree={params['degree']})"
    else:
        title = kernel
    plot_decision_boundary(svm, X, y, title)
plt.tight_layout()
plt.show()
The C parameter (regularization)
- Large C: narrow margin, fewer training errors, higher risk of overfitting
- Small C: wide margin, tolerates more training errors, usually more robust
fig, axes = plt.subplots(1, 3, figsize=(15, 4))
for ax, C in zip(axes, [0.1, 1, 100]):
    plt.sca(ax)
    svm = SVC(kernel='rbf', C=C, gamma='scale')
    svm.fit(X, y)
    plot_decision_boundary(svm, X, y, f'C = {C}')
plt.tight_layout()
plt.show()
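Besides the plots, the number of support vectors gives a rough numerical view of what C does. The sketch below uses `make_moons` as a stand-in dataset (an assumption, chosen because the classes overlap) and counts support vectors for the same three values of C.

```python
from sklearn.datasets import make_moons
from sklearn.svm import SVC

# Noisy, overlapping classes (illustrative data)
X_demo, y_demo = make_moons(n_samples=200, noise=0.25, random_state=0)

n_support = {}
for C in [0.1, 1, 100]:
    clf = SVC(kernel='rbf', C=C, gamma='scale').fit(X_demo, y_demo)
    n_support[C] = len(clf.support_vectors_)
    print(f"C = {C:<5} support vectors: {n_support[C]:3d}  "
          f"train accuracy: {clf.score(X_demo, y_demo):.3f}")
```

A small C typically leaves many points inside the wide margin, so more of them become support vectors. A large C usually shrinks that set, although the exact counts depend on the data.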
The Gamma parameter (for RBF)
- Large gamma: each point influences only its close neighbours
- Small gamma: each point's influence extends over a large radius
fig, axes = plt.subplots(1, 3, figsize=(15, 4))
for ax, gamma in zip(axes, [0.01, 1, 10]):
    plt.sca(ax)
    svm = SVC(kernel='rbf', gamma=gamma)
    svm.fit(X, y)
    plot_decision_boundary(svm, X, y, f'gamma = {gamma}')
plt.tight_layout()
plt.show()
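One hedged way to detect a gamma that is too large is to compare training and held-out accuracy. The sketch below assumes a `make_moons` dataset and a 70/30 split. The gamma values are illustrative and the exact scores will vary with the data.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Noisy two-class data and a held-out split (illustrative setup)
X_demo, y_demo = make_moons(n_samples=300, noise=0.3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=0)

results = {}
for gamma in [0.01, 1, 100]:
    clf = SVC(kernel='rbf', gamma=gamma).fit(X_tr, y_tr)
    # Store (train accuracy, test accuracy) for each gamma
    results[gamma] = (clf.score(X_tr, y_tr), clf.score(X_te, y_te))
    print(f"gamma = {gamma:<5} train: {results[gamma][0]:.3f}  "
          f"test: {results[gamma][1]:.3f}")
```

A large gap between the two scores suggests that the boundary is fitting noise. Similar low scores on both sets suggest that gamma is too small for the boundary to follow the data.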
Ridge Regression (Kernel)
Ridge regression adds L2 regularization to linear regression. It can also be kernelized.
Classic Ridge
from sklearn.linear_model import Ridge
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
# Data
X = np.random.randn(100, 10)
y = X @ np.random.randn(10) + np.random.randn(100) * 0.5
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
# Scaling
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
# Ridge Regression
ridge = Ridge(alpha=1.0)
ridge.fit(X_train_scaled, y_train)
print(f"R² score: {ridge.score(X_test_scaled, y_test):.3f}")
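The L2 penalty gives ridge regression a closed-form solution. Without an intercept, the weights are w = (X^T X + \alpha I)^{-1} X^T y. The sketch below checks this against scikit-learn on synthetic data. The data and the choice `fit_intercept=False` are assumptions made to keep the comparison simple.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 5))
y_demo = X_demo @ rng.normal(size=5) + 0.5 * rng.normal(size=100)
alpha = 1.0  # regularization strength

# Closed form: solve (X^T X + alpha I) w = X^T y
w_closed = np.linalg.solve(X_demo.T @ X_demo + alpha * np.eye(5),
                           X_demo.T @ y_demo)

# scikit-learn's estimator, without an intercept so that the two match
w_sklearn = Ridge(alpha=alpha, fit_intercept=False).fit(X_demo, y_demo).coef_

print(np.allclose(w_closed, w_sklearn, atol=1e-6))
```

The term \alpha I keeps the matrix well conditioned even when the features are strongly correlated, which is why ridge helps with multicollinearity.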
Kernel Ridge Regression
Combines Ridge with the kernel trick to perform nonlinear regression.
from sklearn.kernel_ridge import KernelRidge
# Nonlinear data
X = np.sort(5 * np.random.rand(100, 1), axis=0)
y = np.sin(X).ravel() + np.random.randn(100) * 0.1
# Kernel Ridge with RBF
kr = KernelRidge(alpha=1.0, kernel='rbf', gamma=0.5)
kr.fit(X, y)
# Prediction
X_plot = np.linspace(0, 5, 100).reshape(-1, 1)
y_pred = kr.predict(X_plot)
plt.scatter(X, y, label='Data')
plt.plot(X_plot, y_pred, 'r-', label='Kernel Ridge')
plt.legend()
plt.show()
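In its dual form, kernel ridge regression only needs the kernel matrix K of the training points: the coefficients solve (K + \alpha I) a = y, and a prediction is a kernel-weighted sum over the training set. The sketch below reproduces this by hand on synthetic sine data (an illustrative setup) and compares it with `KernelRidge`.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.default_rng(0)
X_demo = np.sort(5 * rng.random((60, 1)), axis=0)
y_demo = np.sin(X_demo).ravel() + 0.1 * rng.normal(size=60)
alpha, gamma = 1.0, 0.5

# Dual coefficients: solve (K + alpha I) a = y
K = rbf_kernel(X_demo, X_demo, gamma=gamma)
dual_coef = np.linalg.solve(K + alpha * np.eye(len(X_demo)), y_demo)

# Prediction: kernel between new and training points, times the coefficients
X_new = np.linspace(0, 5, 20).reshape(-1, 1)
y_manual = rbf_kernel(X_new, X_demo, gamma=gamma) @ dual_coef

y_sklearn = (KernelRidge(alpha=alpha, kernel='rbf', gamma=gamma)
             .fit(X_demo, y_demo)
             .predict(X_new))
print(np.allclose(y_manual, y_sklearn, atol=1e-6))
```

The linear system has one unknown per training point, so the cost grows with the number of samples rather than with the number of features.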
Ridge vs Kernel Ridge comparison
from sklearn.linear_model import Ridge
from sklearn.kernel_ridge import KernelRidge
# Reuses X, y and X_plot from the Kernel Ridge example above
fig, axes = plt.subplots(1, 2, figsize=(12, 4))
# Linear Ridge
axes[0].scatter(X, y, alpha=0.5)
ridge = Ridge()
ridge.fit(X, y)
axes[0].plot(X_plot, ridge.predict(X_plot), 'r-', linewidth=2)
axes[0].set_title('Ridge (Linear)')
# Kernel Ridge
axes[1].scatter(X, y, alpha=0.5)
kr = KernelRidge(kernel='rbf', gamma=0.5)
kr.fit(X, y)
axes[1].plot(X_plot, kr.predict(X_plot), 'r-', linewidth=2)
axes[1].set_title('Kernel Ridge (RBF)')
plt.tight_layout()
plt.show()
Perceptron
The perceptron is the simplest linear classifier and the building block of neural networks.
Classic Perceptron
from sklearn.linear_model import Perceptron
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
# Well-separated two-class data (class_sep=2)
X, y = make_classification(n_samples=200, n_features=2, n_redundant=0,
                           n_informative=2, n_clusters_per_class=1,
                           class_sep=2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
# Perceptron
perceptron = Perceptron(max_iter=1000, tol=1e-3)
perceptron.fit(X_train, y_train)
print(f"Accuracy: {perceptron.score(X_test, y_test):.3f}")
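The learning rule behind the perceptron is short enough to write out. The sketch below is a minimal NumPy version of the textbook update (add y_i x_i to the weights whenever point i is misclassified), not scikit-learn's implementation. The two Gaussian clusters are an assumption chosen so that the classes are linearly separable.

```python
import numpy as np

rng = np.random.default_rng(0)
# Two tight, well-separated clusters with labels -1 and +1 (illustrative data)
X_neg = rng.normal(loc=[-3.0, -3.0], scale=0.5, size=(50, 2))
X_pos = rng.normal(loc=[3.0, 3.0], scale=0.5, size=(50, 2))
X_demo = np.vstack([X_neg, X_pos])
y_demo = np.array([-1] * 50 + [1] * 50)

w = np.zeros(2)
b = 0.0
for epoch in range(100):
    mistakes = 0
    for x_i, y_i in zip(X_demo, y_demo):
        # A point is misclassified if it lies on the wrong side (or on the boundary)
        if y_i * (w @ x_i + b) <= 0:
            w += y_i * x_i  # move the hyperplane towards the point
            b += y_i
            mistakes += 1
    if mistakes == 0:  # a full pass without errors: converged
        break

accuracy = np.mean(np.sign(X_demo @ w + b) == y_demo)
print(f"Stopped after epoch {epoch + 1}, w = {w}, b = {b}, accuracy = {accuracy:.3f}")
```

The weight vector is always a sum of training points, which is what makes it possible to rewrite the algorithm in terms of dot products and then replace those dot products with a kernel.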
Kernel Perceptron
For data that are not linearly separable, the perceptron can be kernelized.
# Simple Kernel Perceptron implementation
class KernelPerceptron:
    def __init__(self, kernel='rbf', gamma=1.0, max_iter=100):
        self.kernel = kernel
        self.gamma = gamma
        self.max_iter = max_iter
    def _kernel(self, X1, X2):
        """Return the kernel matrix of shape (len(X1), len(X2))."""
        if self.kernel == 'rbf':
            # Squared distances via ||a||^2 + ||b||^2 - 2 a.b, clipped at 0 against rounding
            sq_dist = (np.sum(X1**2, axis=1).reshape(-1, 1)
                       + np.sum(X2**2, axis=1) - 2 * X1 @ X2.T)
            return np.exp(-self.gamma * np.maximum(sq_dist, 0.0))
        elif self.kernel == 'linear':
            return X1 @ X2.T
        elif self.kernel == 'poly':
            return (X1 @ X2.T + 1) ** 3  # degree fixed at 3
        raise ValueError(f"Unknown kernel: {self.kernel}")
    def fit(self, X, y):
        self.X_train = X
        self.y_train = np.where(y == 0, -1, 1)  # convert labels to -1, 1
        n_samples = X.shape[0]
        self.alpha = np.zeros(n_samples)  # number of mistakes made on each training point
        K = self._kernel(X, X)
        for _ in range(self.max_iter):
            mistakes = 0
            for i in range(n_samples):
                pred = np.sign(np.sum(self.alpha * self.y_train * K[:, i]))
                if pred != self.y_train[i]:
                    self.alpha[i] += 1
                    mistakes += 1
            if mistakes == 0:  # a full pass without errors: converged
                break
        return self
    def predict(self, X):
        K = self._kernel(self.X_train, X)
        pred = np.sign((self.alpha * self.y_train) @ K)
        return np.where(pred == -1, 0, 1)
# Usage (reuses X_train, X_test, y_train, y_test from the Perceptron example above)
kp = KernelPerceptron(kernel='rbf', gamma=0.5)
kp.fit(X_train, y_train)
print(f"Kernel Perceptron accuracy: {np.mean(kp.predict(X_test) == y_test):.3f}")
Algorithm comparison
| Algorithm | Task | Kernel support | Typical use |
|---|---|---|---|
| SVM | Classification | Yes (built in) | Small to medium datasets, maximum margin |
| Ridge | Regression | Yes (KernelRidge) | Regularization, multicollinearity |
| Perceptron | Classification | Yes (manual implementation) | Simple, online learning |
Comparison on the same dataset
from sklearn.svm import SVC
from sklearn.linear_model import Perceptron, RidgeClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score
from sklearn.datasets import make_moons
# Data that are not linearly separable
X, y = make_moons(n_samples=200, noise=0.2, random_state=42)
models = {
    'SVM Linear': SVC(kernel='linear'),
    'SVM RBF': SVC(kernel='rbf'),
    'SVM Poly': SVC(kernel='poly', degree=3),
    'Perceptron': Perceptron(),
    'Ridge Classifier': RidgeClassifier()
}
print("Cross-validation scores:")
for name, model in models.items():
    # Scale inside the pipeline so that each fold is scaled on its own training part
    pipe = make_pipeline(StandardScaler(), model)
    scores = cross_val_score(pipe, X, y, cv=5)
    print(f"{name:20s}: {scores.mean():.3f} (+/- {scores.std():.3f})")
Hyperparameter tuning
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
# Pipeline
pipe = Pipeline([
    ('scaler', StandardScaler()),
    ('svm', SVC())
])
# Grid search: one sub-grid per kernel, so that degree is varied only for poly
param_grid = [
    {
        'svm__kernel': ['rbf'],
        'svm__C': [0.1, 1, 10],
        'svm__gamma': ['scale', 0.1, 1],
    },
    {
        'svm__kernel': ['poly'],
        'svm__C': [0.1, 1, 10],
        'svm__gamma': ['scale', 0.1, 1],
        'svm__degree': [2, 3],
    },
]
grid = GridSearchCV(pipe, param_grid, cv=5, scoring='accuracy')
grid.fit(X, y)
print(f"Best parameters: {grid.best_params_}")
print(f"Best score: {grid.best_score_:.3f}")
Summary
- The kernel trick lets linear algorithms solve nonlinear problems
- RBF is the most versatile kernel (and the default for SVC in scikit-learn)
- C controls the trade-off between a wide margin and few training errors (bias vs. variance)
- Gamma controls the complexity of the decision boundary (for RBF)
- Scaling the data is essential for SVM
- Use GridSearchCV for hyperparameter tuning