
SVMs with Julia

We’re almost at the end of another month of Julia here on MachineLearningGeek, and we’ll be quickly covering Support Vector Machines today.

Today we’ll also be showcasing another machine learning library for Julia. Much like in Python, where you’d use scikit-learn for classical algorithms such as an SVM and PyTorch or the like for neural networks, Julia has MLJ.jl for the classical algorithms, while Flux.jl is reserved for neural networks. (There’s also MLJFlux.jl for when you want to put the two together.) So let’s get started.
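
If you’re starting from a fresh Julia environment, you’ll need to install the packages first. Here’s a minimal sketch; note that MLJLIBSVMInterface.jl is the glue package that makes the LIBSVM models available to MLJ later on:

using Pkg
# MLJLIBSVMInterface supplies the LIBSVM models that MLJ loads further down.
Pkg.add(["MLJ", "MLJLIBSVMInterface", "RDatasets", "Gadfly"])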

As is the custom, we first import the packages we’ll need later: MLJ will be our workhorse, and Gadfly we’ll use for plotting.

using MLJ
import RDatasets: dataset
using Gadfly

I prefer to use random values while learning, so I define a function to initialize my variables:

function init()
    X = randn(40, 2)
    y = vcat(-ones(20), ones(20))
    return X, y
end

X, y = init()
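
One caveat: randn draws fresh values on every run, so your plots and scores will differ from mine. If you want reproducible runs, you can seed Julia’s global RNG before calling init(); a small sketch (the seed value is arbitrary):

using Random
Random.seed!(42)   # arbitrary seed; makes the randn draws repeatable
X, y = init()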

Today we’re going to handle our data in a slightly different way, using y as a mask:

ym1 = y .== -1
ym2 = .!ym1
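
Each of these is a BitVector picking out the rows of one class, so we can sanity-check that the split is even:

sum(ym1), sum(ym2)   # (20, 20): twenty points per class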

And then we plot our data:

plot(
    layer(x=X[ym1, 1], y=X[ym1, 2]),
    layer(x=X[ym2, 1], y=X[ym2, 2], Theme(default_color=colorant"red")),
    Guide.manual_color_key("", ["+ve", "-ve"], ["deepskyblue", "red"]),
    Theme(point_shapes=[Shape.circle, Shape.star1]))

The reason we constructed our data in such a convoluted way is that MLJ wants its inputs in a particular format. So we now convert the features into a table and the labels into a categorical vector:

tX = MLJ.table(X)
ty = categorical(y);
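
If you want to double-check what MLJ will see, you can inspect the scientific types of the converted data; a quick optional check:

schema(tX)    # per-column scitypes of the feature table (Continuous here)
scitype(ty)   # scitype of the target, a two-level categorical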

Finally, we get to the modelling. You should know that, just like Plots.jl, MLJ.jl doesn’t implement everything from scratch; instead it relies on pre-existing libraries. For SVMs, that pre-existing library is LIBSVM, which is pretty standard.
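
As an aside, if you’re ever unsure which models can handle a given dataset, MLJ can query its model registry for you:

# List every registered model compatible with our table and target.
models(matching(tX, ty))

Since we already know we want an SVM, we load the SVC model with the @load macro provided by MLJ: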

SVC = @load SVC pkg=LIBSVM

Then we instantiate the model and bind it to the data in a machine:

svc_mdl = SVC()
svc = machine(svc_mdl, tX, ty)

Fitting the model is then as simple as:

fit!(svc);
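
If you’re curious about what was actually learned, MLJ exposes the results of training through the machine:

fitted_params(svc)   # the learned parameters, here wrapping the underlying LIBSVM model
report(svc)          # any extra information the model recorded during training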

And we can check the performance of our model using the misclassification_rate function provided by MLJ:

ypred = predict(svc, tX)
misclassification_rate(ypred, ty)

And you can see that even on random data we get a misclassification rate of about 0.1, that is, roughly 90% accuracy on the training data. That’s pretty good!
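
Keep in mind, though, that we predicted on the very data we trained on, so that number mostly reflects the model’s capacity to memorize. For a more honest figure, MLJ’s evaluate! can cross-validate the machine; a minimal sketch using the default hyperparameters:

# 5-fold cross-validation with shuffled folds, scored by misclassification rate.
evaluate!(svc, resampling=CV(nfolds=5, shuffle=true),
          measure=misclassification_rate)

On pure noise like ours, the cross-validated misclassification rate should hover around 0.5, exactly what you’d expect from labels that carry no signal. You can check out the Pluto notebook at the following link: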

https://github.com/mathmetal/Misc/blob/master/MLG/SimpleSVM.jl