Demystifying machine learning for beginners

This content originally appeared on DEV Community and was authored by Code_Jedi

If you're a confused beginner like I was when I first started out with machine learning in Python, then stick around, because today I'll do my best to demystify and simplify machine learning for you!

To start off, I presume that you would like to learn machine learning for the following reasons:

  1. Working with datasets
  2. Visualizing data
  3. Predicting data
  4. Classifying data

In this tutorial, we're going to write a Python script that will:

  • Load a dataset
  • Visualize the dataset
  • Classify a new piece of data given the dataset

Let's get started!

First, let's import the required libraries:

import pandas
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.neighbors import KNeighborsClassifier
from sklearn import preprocessing

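If you don't have some of these installed, you can install them with pip (or pip3), for example: pip install pandas matplotlib seaborn scikit-learn (note that the package providing sklearn is called scikit-learn).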

Next, we're going to load the dataset that we'll be using for this project:

import pandas
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.neighbors import KNeighborsClassifier
from sklearn import preprocessing

df = pandas.read_csv('IRIS.csv')

For this project, we're going to be using the classic Iris dataset, which you can download here
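Before moving on, it can help to check that the file loaded with the column names the rest of this tutorial relies on. Here's a minimal sketch of that check. To keep it runnable on its own, it reads a few rows typed inline instead of IRIS.csv; the `sample_csv`, `required` and `missing` names are just for illustration, and in your own script you would run the same checks on the `df` you loaded from the file.

```python
import io
import pandas

# Stand-in for IRIS.csv: a few rows typed inline so this sketch runs without the file.
sample_csv = io.StringIO(
    "sepal_length,sepal_width,petal_length,petal_width,species\n"
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "7.0,3.2,4.7,1.4,Iris-versicolor\n"
    "6.3,3.3,6.0,2.5,Iris-virginica\n"
)
df = pandas.read_csv(sample_csv)

# Columns that the rest of the tutorial uses
required = ["sepal_length", "sepal_width", "species"]
missing = [col for col in required if col not in df.columns]

print(df.head())                     # first few rows of the dataset
print(df["species"].value_counts())  # number of rows per species
print("Missing columns:", missing)   # an empty list means we're good to go
```

If "Missing columns" isn't empty, your copy of the dataset probably uses different column names, and you'd need to adjust the column names used later on to match.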

Now comes the tricky bit...

Add these lines of code to your Python script:

model = KNeighborsClassifier(n_neighbors=3)

features = list(zip(df["sepal_length"], df["sepal_width"]))

model.fit(features,df["species"])

Let me explain...

  • First, we define our model and tell it to look at the 3 nearest neighbors (n_neighbors=3) when classifying a new piece of data. Note that this is the number of neighboring data points that get a vote, not the number of classes.
  • We then define the "features" variable, which pairs up the "sepal_length" and "sepal_width" columns as the characteristics that we're going to compare in order to classify new pieces of data.
  • Finally, we fit our model on those features together with the corresponding "species" labels, so that it knows which species each (sepal_length, sepal_width) pair belongs to.
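To get a feel for what n_neighbors=3 means, here's a rough sketch of the idea behind a k-nearest-neighbors classifier in plain Python: find the 3 closest known flowers and let them vote. This is a simplified illustration, not how scikit-learn implements it internally; the `knn_predict` function and the handful of sample points are made up for this example.

```python
from collections import Counter
import math

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point, paired with its label
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep only the k closest points
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # The most common label among those k neighbors wins
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# A few (sepal_length, sepal_width) pairs, used here as illustrative sample data
points = [(5.1, 3.5), (4.9, 3.0), (4.7, 3.2), (7.0, 3.2), (6.4, 3.2), (6.9, 3.1)]
labels = ["Iris-setosa"] * 3 + ["Iris-versicolor"] * 3

print(knn_predict(points, labels, (5.0, 3.4)))
```

A small k makes the prediction depend on just a few nearby flowers, while a larger k smooths things out by letting more neighbors vote.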

Before we start predicting new pieces of data, let's plot our dataset using a scatter graph. In our plot, the X axis will represent the sepal_length and the Y axis will represent the sepal_width. We're also going to color code the different species of Iris flowers by adding hue='species'. Finally, we'll tell seaborn to plot our Iris dataset by adding data=df to the end:

sns.scatterplot(x='sepal_length', y='sepal_width',
                hue='species', data=df)

# Pin the legend's upper-right corner (loc=1) to the upper-right corner of the axes
plt.legend(bbox_to_anchor=(1, 1), loc=1)

plt.show()

Here's what the scatter graph should look like:
[Image: scatter graph of sepal_length against sepal_width, colored by species]

To start classifying new pieces of data, first comment out the last code snippet like so:

import pandas
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.neighbors import KNeighborsClassifier
from sklearn import preprocessing


df = pandas.read_csv('IRIS.csv')
model = KNeighborsClassifier(n_neighbors=3)

features = list(zip(df["sepal_length"], df["sepal_width"]))

model.fit(features,df["species"])

"""sns.scatterplot(x='sepal_length', y='sepal_width',
                hue='species', data=df)

# Pin the legend's upper-right corner (loc=1) to the upper-right corner of the axes
plt.legend(bbox_to_anchor=(1, 1), loc=1)

plt.show()
"""

Then add these 2 lines of code to the end of your script:

predicted = model.predict([[4.6,5.8]]) 
print(predicted) 

This predicts the species of an Iris flower that has a sepal_length of 4.6 and a sepal_width of 5.8.

Now if you run your code, your output should look like this:

['Iris-setosa']

This means that our new mystery Iris flower has been classified as an "Iris-setosa".
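If you're curious why the model picked that species, you can ask it which neighbors it looked at. The sketch below classifies two flowers at once and uses the model's kneighbors method to list the species of each flower's 3 nearest neighbors. To keep it self-contained, it trains on a few sample rows typed inline rather than the full IRIS.csv, so treat the data and the `new_flowers` values as illustrative assumptions.

```python
from sklearn.neighbors import KNeighborsClassifier

# A few (sepal_length, sepal_width) rows, used as a small stand-in for IRIS.csv
features = [
    (5.1, 3.5), (4.9, 3.0), (4.7, 3.2),   # Iris-setosa
    (7.0, 3.2), (6.4, 3.2), (6.9, 3.1),   # Iris-versicolor
    (6.3, 3.3), (5.8, 2.7), (7.1, 3.0),   # Iris-virginica
]
species = ["Iris-setosa"] * 3 + ["Iris-versicolor"] * 3 + ["Iris-virginica"] * 3

model = KNeighborsClassifier(n_neighbors=3)
model.fit(features, species)

# predict() accepts several flowers at once: one [sepal_length, sepal_width] pair per row
new_flowers = [[4.6, 5.8], [6.9, 3.1]]
predicted = model.predict(new_flowers)

# kneighbors() returns the distances to, and row indices of, each flower's nearest neighbors
distances, indices = model.kneighbors(new_flowers)

for flower, label, neighbor_rows in zip(new_flowers, predicted, indices):
    neighbor_species = [species[i] for i in neighbor_rows]
    print(flower, "->", label, "| neighbors:", neighbor_species)
```

When all 3 neighbors agree, the prediction is clear-cut; when they disagree, the species with the most votes wins.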

Congratulations!

You've made your first machine learning project!

You can now experiment with this code and try it out on some new datasets (you can find lots of great ones on https://www.kaggle.com/).

If you're a beginner who likes discovering new things about Python, try my weekly Python newsletter

Byeeeee!


Code_Jedi | Sciencx (2021-07-31T08:24:38+00:00) Demystifying machine learning for beginners. Retrieved from https://www.scien.cx/2021/07/31/demystifying-machine-learning-for-beginners/
