ScikitLearn.jl
ScikitLearn.jl lets you use many statistics tools and machine learning models from Python's scikit-learn library directly in Julia. It helps you do things like prediction and classification using beginner-friendly tools.
With ScikitLearn.jl, you can:
Train and evaluate machine learning models
Use toy datasets to explore machine learning models
Installation & Setup
First, make sure you have Julia installed. On Oscar, you can just enter the command module load julia in the terminal. Otherwise, refer to the Julia downloads page to install the appropriate version of Julia for your computer.
Once Julia is installed, open the Julia interactive window by entering the command julia.
Once in the interactive window enter the following command to download the appropriate packages:
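The exact command is not shown here. As a minimal sketch, assuming Julia's Python bridge uses the private environment managed by Conda.jl (an assumption, not something this page confirms), scikit-learn can be installed like this:

```julia
# Assumption: Python packages for Julia live in the Conda.jl-managed environment.
using Pkg
Pkg.add("Conda")           # Julia wrapper around the conda package manager

using Conda
Conda.add("scikit-learn")  # install Python's scikit-learn into that environment
```

If your setup points Julia at a different Python installation, install scikit-learn into that Python instead.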
This command installs Python's scikit-learn package into your conda environment. Now, run the following commands in Julia one at a time (they might take a while, so be patient):
If you are using ScikitLearn for the first time, you might need to install it. Julia should automatically give you some installation prompts.
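The commands themselves are not shown here. A typical sequence, sketched under the assumption that you want the registered PyCall and ScikitLearn packages from Julia's general registry, would be:

```julia
using Pkg
Pkg.add("PyCall")       # bridge between Julia and Python
Pkg.add("ScikitLearn")  # Julia interface to scikit-learn

using ScikitLearn       # the first load precompiles the package, which can be slow
```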
Example 1: Logistic Regression
ScikitLearn has several 'toy' datasets that can be used for experimentation and development (see the scikit-learn documentation on toy datasets). We'll use the well-known Iris flower dataset to train a model that predicts a flower's species from four quantitative measurements. We will start with a basic logistic regression model (see the scikit-learn documentation for LogisticRegression).
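The following is a minimal sketch of this workflow rather than the original example. It assumes scikit-learn's `load_iris`, `train_test_split`, and `LogisticRegression` are imported through `@sk_import`. The 80/20 split, `random_state=42`, and `max_iter=200` are illustrative choices.

```julia
using ScikitLearn

# Import the Python pieces we need from scikit-learn
@sk_import datasets: load_iris
@sk_import model_selection: train_test_split
@sk_import linear_model: LogisticRegression

# Load the Iris data: 150 flowers, 4 measurements each, 3 species
iris = load_iris()
X = iris["data"]    # features
y = iris["target"]  # labels (0, 1, or 2)

# Hold out 20% of the flowers for testing (illustrative split)
X_train, X_test, y_train, y_test =
    train_test_split(X, y, test_size=0.2, random_state=42)

# Train the model; max_iter is raised so the solver has room to converge
model = LogisticRegression(max_iter=200)
fit!(model, X_train, y_train)

# Fraction of test flowers classified correctly
accuracy = score(model, X_test, y_test)
println("Logistic regression accuracy: ", accuracy)
```

The printed accuracy depends on the random split, so expect it to vary if you change `random_state`.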
Example 2: Decision Tree
Now let’s try using a decision tree to classify the same flowers.
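A comparable sketch for the decision tree, again assuming the scikit-learn classes are imported through `@sk_import`. The `max_depth=3` and `random_state=42` settings are illustrative assumptions, not values from the original example.

```julia
using ScikitLearn

@sk_import datasets: load_iris
@sk_import model_selection: train_test_split
@sk_import tree: DecisionTreeClassifier

# Same data and same split as a logistic regression run would use
iris = load_iris()
X = iris["data"]
y = iris["target"]
X_train, X_test, y_train, y_test =
    train_test_split(X, y, test_size=0.2, random_state=42)

# A shallow tree; limiting the depth keeps the model easy to interpret
tree_model = DecisionTreeClassifier(max_depth=3, random_state=42)
fit!(tree_model, X_train, y_train)

tree_accuracy = score(tree_model, X_test, y_test)
println("Decision tree accuracy: ", tree_accuracy)
```

Using the same split for both models makes their accuracies directly comparable.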
Note that the 'simpler' logistic regression model may actually match or outperform the more complex decision tree. This is likely because the Iris dataset is small and simple, so the extra flexibility of a decision tree offers little benefit.
Key Terms to Know
fit!: Teach (train) the model using your data
predict: Ask the model to guess the labels for new data
score: Measure how good the model is; for classifiers this is accuracy (1.0 = every prediction correct, 0.0 = none correct)
X: The input data (features)
y: The correct answers (labels)
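To tie these terms together, here is a small sketch, assuming the same `@sk_import` setup and training on the full Iris data for brevity. It predicts the species of one hypothetical flower and shows that `score` for a classifier is the fraction of correct predictions.

```julia
using ScikitLearn
using Statistics: mean

@sk_import datasets: load_iris
@sk_import linear_model: LogisticRegression

iris = load_iris()
X = iris["data"]    # features
y = iris["target"]  # labels

model = LogisticRegression(max_iter=200)
fit!(model, X, y)   # teach the model

# A hypothetical new flower: one row with the four measurements
new_flower = [5.1 3.5 1.4 0.2]
guess = predict(model, new_flower)
println("Predicted class: ", guess[1])

# score is the share of predictions that match the correct answers
manual_accuracy = mean(predict(model, X) .== y)
println("score: ", score(model, X, y), "  manual: ", manual_accuracy)
```

Scoring on the training data, as done here, overstates real-world performance; a held-out test set gives a fairer number.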
Resources