A NumPy array is a grid of values of the same type. You can use these arrays to store values that are needed for data analysis or machine learning.
You can normalize a NumPy array to a unit vector using the sklearn.preprocessing.normalize() function.
Many machine learning algorithms often perform better when you pass them normalized values. A unit vector is a vector that has a magnitude of 1.
In this tutorial, you’ll learn how to normalize a NumPy array to a unit vector using the sklearn.preprocessing.normalize() function and the np.linalg.norm() function.
If You’re in a Hurry
You can use the below code snippet to normalize a NumPy array to a unit vector.
The np.linalg.norm() function returns one of eight different matrix norms or one of an infinite number of vector norms, depending on the value of the ord parameter. If you do not pass the ord parameter, it uses the 2-norm (Euclidean norm) for a vector and the Frobenius norm for a matrix.
When you divide the array by this norm, you’ll get the normalized data as shown below.
Snippet
import numpy as np
x = np.random.rand(10)*10
normalized_x = x / np.linalg.norm(x)
print(normalized_x)
Output
[0.46925769 0.12092959 0.37642505 0.09316824 0.38277321 0.07894217
0.36265182 0.28934431 0.49484541 0.04406218]
This is how you can get a unit vector of a NumPy array.
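To check that the result really is a unit vector, you can compute its magnitude again. The sketch below uses fixed values (an assumption for illustration, instead of the random data above) so the arithmetic is easy to follow by hand.

```python
import numpy as np

# Fixed example values (an assumption for illustration) instead of random data.
x = np.array([3.0, 4.0])

# The Euclidean norm of [3, 4] is sqrt(3**2 + 4**2) = 5.
norm = np.linalg.norm(x)
unit_x = x / norm

print(norm)                    # 5.0
print(unit_x)                  # [0.6 0.8]
print(np.linalg.norm(unit_x))  # magnitude of the result, approximately 1.0
```

Because of floating-point rounding, the final magnitude may differ from 1 in the last decimal places, so compare it with np.isclose() rather than with ==.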
If You Want to Understand Details, Read on…
In this tutorial, you’ll learn how to get the unit vector from a NumPy array using different methods.
Sample NumPy Array
First, let’s create a sample NumPy array with 10 random values. You can use this in the later steps to learn how to normalize the data.
Snippet
import numpy as np
from sklearn.preprocessing import normalize
x = np.random.rand(10)*10
x
Output
array([4.59743528, 2.49994446, 5.45313476, 2.22769086, 3.19143523,
8.56257209, 7.01471989, 6.23370745, 7.21487837, 8.86694182])
Using sklearn normalize()
In this section, you’ll learn how to normalize a NumPy array using the sklearn normalize() function.
The normalize() function scales each input vector individually to unit norm.
Parameters
It accepts one mandatory parameter.
X – Array-like input of shape (n_samples, n_features). You can pass the data to be normalized in this parameter.
It also accepts four optional parameters.
norm – {‘l1’, ‘l2’, ‘max’}, default=’l2’ – The norm to be used for normalizing the data.
axis – {0, 1}, default=1 – The axis along which to normalize the data. If 1, each sample (row) is normalized individually. If 0, each feature (column) is normalized.
copy – bool, default=True – If False, the function tries to normalize the same instance of the array in place. Otherwise, a new copy of the array is created and normalized.
return_norm – bool, default=False – Whether the computed norms should be returned along with the normalized data.
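The following is a minimal sketch of how the norm and return_norm parameters change the result. The input values are an assumption chosen so that each norm is easy to verify by hand.

```python
import numpy as np
from sklearn.preprocessing import normalize

# A single sample (row) with three features; values chosen for illustration.
X = np.array([[1.0, 2.0, 2.0]])

l2_scaled = normalize(X, norm="l2")    # divide by sqrt(1 + 4 + 4) = 3
l1_scaled = normalize(X, norm="l1")    # divide by |1| + |2| + |2| = 5
max_scaled = normalize(X, norm="max")  # divide by the largest absolute value, 2

# With return_norm=True, the function returns a (normalized array, norms) tuple.
scaled, norms = normalize(X, norm="l2", return_norm=True)

print(l2_scaled)
print(l1_scaled)
print(max_scaled)
print(norms)
```

Only the default ‘l2’ norm produces a unit vector in the Euclidean sense. With ‘l1’ the absolute values sum to 1, and with ‘max’ the largest absolute value becomes 1.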
normalize(x[:, np.newaxis], axis=0) is used to normalize the data in the variable x.
Where,
np.newaxis – Adds a new axis to the NumPy array. Using it on the one-dimensional array x turns it into a two-dimensional array with a single column, which is the input shape that normalize() expects.
x[:, np.newaxis] – Returns all values of the array as the rows of a single column, with shape (10, 1).
axis=0 – Normalizes each feature (column) in the array.
Snippet
import numpy as np
from sklearn.preprocessing import normalize
x = np.random.rand(10)*10
normalized_x = normalize(x[:,np.newaxis], axis=0)
print(normalized_x)
When you print the array, you’ll see the array is in a normalized form.
Output
[[0.05341832]
[0.42901918]
[0.34359858]
[0.00150131]
[0.48057246]
[0.3178608 ]
[0.27146542]
[0.27559803]
[0.37805814]
[0.26545377]]
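The output above is a column rather than a flat array because of np.newaxis. A short sketch of the shape change, using a three-value array as an assumed example:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])   # one-dimensional, shape (3,)
column = x[:, np.newaxis]       # two-dimensional, shape (3, 1)

print(x.shape)       # (3,)
print(column.shape)  # (3, 1)

# reshape(-1, 1) is an equivalent way to build the same column.
print(np.array_equal(column, x.reshape(-1, 1)))  # True
```

If you need a flat array again after normalizing, you can call ravel() on the result.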
Using np.linalg.norm()
You can also use the np.linalg.norm() function from the NumPy library to normalize the NumPy array into a unit vector.
The np.linalg.norm() function returns one of eight different matrix norms or one of an infinite number of vector norms, depending on the value of the ord parameter. If you do not pass the ord parameter, it uses the 2-norm (Euclidean norm) for a vector and the Frobenius norm for a matrix.
You can divide the data by the returned norm to get the unit vector of the NumPy array.
Snippet
import numpy as np
x = np.random.rand(10)*10
normalized_x = x / np.linalg.norm(x)
print(normalized_x)
When you print the normalized vector, you’ll see the normalized value as shown below.
Output
[0.46925769 0.12092959 0.37642505 0.09316824 0.38277321 0.07894217
0.36265182 0.28934431 0.49484541 0.04406218]
This is how you can use the np.linalg.norm() function to normalize the NumPy array to a unit vector.
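The sketch below shows how the ord parameter changes the norm that is returned. It also includes to_unit_vector(), a hypothetical helper (not part of NumPy) that guards against an all-zero array, where dividing by the norm would otherwise produce nan values.

```python
import numpy as np

x = np.array([3.0, -4.0])

print(np.linalg.norm(x))              # default: 2-norm, 5.0
print(np.linalg.norm(x, ord=1))       # sum of absolute values, 7.0
print(np.linalg.norm(x, ord=np.inf))  # largest absolute value, 4.0


def to_unit_vector(v):
    """Hypothetical helper: return v / ||v||, or v unchanged if its norm is 0."""
    norm = np.linalg.norm(v)
    if norm == 0:
        return v
    return v / norm


print(to_unit_vector(x))            # [ 0.6 -0.8]
print(to_unit_vector(np.zeros(3)))  # [0. 0. 0.]
```

Returning the zero vector unchanged is one possible design choice. Depending on your use case, raising an error may be more appropriate.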
Using Maths Formula
In this section, you’ll use a maths formula to normalize the NumPy array to a unit vector.
You’ll compute the vector norm by taking the square root of the sum of the squared values in the array. Then, by dividing the array by this norm, you get the normalized form of the data.
Use the below snippet to normalize the NumPy array using the mathematical formula.
Snippet
import numpy as np
x = np.random.rand(10)*10
normalized_x = x / np.sqrt(np.sum(x**2))
print(normalized_x)
Output
[0.12280124 0.36840538 0.05669781 0.27392538 0.43742201 0.45143303
0.20542178 0.03980713 0.13138495 0.5610464 ]
This is how you can normalize a NumPy array into a unit vector by using the mathematical formula.
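As a quick check, the formula should give the same norm as np.linalg.norm() with its default settings. The sketch below uses fixed values (an assumption for illustration) so that the norm works out to a whole number.

```python
import numpy as np

x = np.array([1.0, 2.0, 2.0])

# Norm from the formula: sqrt(1**2 + 2**2 + 2**2) = sqrt(9) = 3
formula_norm = np.sqrt(np.sum(x**2))
library_norm = np.linalg.norm(x)

print(formula_norm, library_norm)
print(np.allclose(x / formula_norm, x / library_norm))  # True
```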
Normalize NumPy Array Along Axis
In this section, you’ll learn how to normalize the NumPy array into a unit vector along different axes, namely the row axis and the column axis.
Normalize NumPy Array By Columns
You can use axis=0 in the normalize function to normalize the NumPy array into a unit vector by columns. When you use this, each feature of the dataset will be normalized.
Snippet
import numpy as np
from sklearn.preprocessing import normalize
x = np.random.rand(10)*10
normalized_x = normalize(x[:,np.newaxis], axis=0)
print(normalized_x)
This array has only one feature. Hence, when you print the normalized array, you’ll see the below values.
Output
[[0.23542553]
[0.38018535]
[0.05725614]
[0.01711471]
[0.59367405]
[0.58159005]
[0.04489816]
[0.09942305]
[0.1961091 ]
[0.23538758]]
Normalize NumPy Array By Rows
You can use axis=1 in the normalize function to normalize the NumPy array into a unit vector by rows. When you use this, each sample of the dataset will be normalized individually.
Snippet
import numpy as np
from sklearn.preprocessing import normalize
x = np.random.rand(10)*10
normalized_x = normalize(x[:,np.newaxis], axis=1)
print(normalized_x)
The array has only one column, so each row (sample) contains a single value. When you normalize by rows, each value is divided by its own magnitude, so every positive value becomes 1 and you’ll see the output as below.
Output
[[1.]
[1.]
[1.]
[1.]
[1.]
[1.]
[1.]
[1.]
[1.]
[1.]]
This is how you can normalize the NumPy array by rows. Each sample will be normalized individually.
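The difference between the two axes is easier to see with more than one column. The following sketch uses a small two-by-two array (assumed values, not data from the examples above) and checks the norms along each axis afterwards.

```python
import numpy as np
from sklearn.preprocessing import normalize

# Two samples (rows) and two features (columns); illustrative values.
X = np.array([[3.0, 4.0],
              [6.0, 8.0]])

by_columns = normalize(X, axis=0)  # each column becomes a unit vector
by_rows = normalize(X, axis=1)     # each row becomes a unit vector

print(by_columns)
print(by_rows)

# Check the norms along the matching axis; each should be close to 1.
print(np.linalg.norm(by_columns, axis=0))
print(np.linalg.norm(by_rows, axis=1))
```

Here both rows point in the same direction, so normalizing by rows gives the same unit vector for each sample, while normalizing by columns keeps the relative size of the two samples within each feature.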
Conclusion
To summarize, you’ve learned how to normalize a NumPy array into a unit vector so that you can use it for various data analysis purposes.
You’ve also learned how to get the unit vector from a NumPy array using the maths formula, the NumPy np.linalg.norm() function, and the sklearn normalize() function.
If you have any questions, comment below.