Libraries in Python
Knowing the main Python libraries is one of the most valuable things you can learn in the language.
If you know how to use these libraries and the functions they include,
you will be able to handle many of the
problems that come up across different projects.
1. numpy
Background
Numpy is one of the most used packages for data handling, either directly or indirectly.
It is the backbone that, for example, Pandas (a very popular data science library) is built on.
Features
Even if you can perform a lot of data science-related operations in Python with Pandas,
it is important to understand what you can do in the numpy package as well.
Documentation
Documentation about Numpy can be found here
Blog post
Find my Blog post about Numpy here (TBA)
Import
You can import numpy as below:
import numpy as np
(as np is not necessary, but abbreviating the name to np makes it easier to work with later)
Below are the numpy functions that I use most:
Create an Array
Function Name: np.array
Use case: Creates an Array
Example: np.array([1,2,3,4,5,6])
Blog post here: TBA
Link to array documentation
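As a minimal sketch of np.array in use (the variable names here are my own, chosen only for illustration), the snippet below builds a one-dimensional and a two-dimensional array and checks their shapes:

```python
import numpy as np

# Build a one-dimensional array from a plain Python list.
values = np.array([1, 2, 3, 4, 5, 6])

# A nested list gives a two-dimensional array (2 rows, 3 columns).
grid = np.array([[1, 2, 3], [4, 5, 6]])

print(values.shape)  # (6,)
print(grid.shape)    # (2, 3)

# Arithmetic is applied element by element.
doubled = values * 2
print(doubled.tolist())  # [2, 4, 6, 8, 10, 12]
```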
Return unique elements
Function Name: np.unique
Use case: Returns unique elements from array
Example: np.unique(array)
Blog post here: TBA
Link to unique documentation
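The example above assumes that a variable called array already exists. A small self-contained sketch, with made-up input values, could look like this:

```python
import numpy as np

# Hypothetical input with repeated values.
array = np.array([3, 1, 2, 3, 1, 3])

# np.unique returns the distinct values in sorted order.
distinct = np.unique(array)

# return_counts=True also reports how often each value occurs.
distinct_again, counts = np.unique(array, return_counts=True)

print(distinct.tolist())  # [1, 2, 3]
print(counts.tolist())    # [2, 1, 3]
```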
Create incremental array
Function Name: np.arange
Use case: Create an array with increment
Example: np.arange(0,100,5)
Blog post here: TBA
Link to arange documentation
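One detail worth seeing in practice is that np.arange stops before the end value. A short sketch (the name steps is only illustrative):

```python
import numpy as np

# Start at 0, stop before 100, step by 5.
steps = np.arange(0, 100, 5)

print(steps[:4].tolist())  # [0, 5, 10, 15]
print(int(steps[-1]))      # 95, because the stop value itself is excluded
print(len(steps))          # 20
```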
Random Numbers I
Function Name: np.random.random
Use case: Returns random floats in array
Example: np.random.random(size=10)
Blog post here: TBA
Link to random documentation
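np.random.random draws floats from the half-open interval [0.0, 1.0). The sketch below assumes you want a repeatable run, so it sets a seed first; the seed value 42 is arbitrary and not part of the original example.

```python
import numpy as np

# Arbitrary seed, used only so that repeated runs give the same numbers.
np.random.seed(42)

# Ten random floats, each at least 0.0 and strictly less than 1.0.
samples = np.random.random(size=10)

print(samples.shape)  # (10,)
print(bool(samples.min() >= 0.0), bool(samples.max() < 1.0))  # True True
```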
Create a Sample
Function Name: np.linspace
Use case: Generate evenly spaced samples
Example: np.linspace(0,100,101)
Blog post here: TBA
Link to linspace documentation
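Unlike np.arange, np.linspace takes the number of points rather than the step size, and it includes the end value by default. A minimal sketch, with illustrative variable names:

```python
import numpy as np

# 101 evenly spaced points from 0 to 100, both ends included.
samples = np.linspace(0, 100, 101)
print(samples[:3].tolist())  # [0.0, 1.0, 2.0]
print(float(samples[-1]))    # 100.0

# Same end points, but only five samples.
coarse = np.linspace(0, 100, 5)
print(coarse.tolist())  # [0.0, 25.0, 50.0, 75.0, 100.0]
```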
Repeat elements of array
Function Name: np.repeat
Use case: Repeat elements of an array
Example: np.repeat(1,7)
Blog post here: TBA
Link to repeat documentation
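np.repeat also accepts an array as input, and the repeat count can differ per element. A small sketch with made-up inputs:

```python
import numpy as np

# Repeat the scalar 1 seven times.
ones = np.repeat(1, 7)
print(ones.tolist())  # [1, 1, 1, 1, 1, 1, 1]

# Repeat every element of an array twice.
each_twice = np.repeat(np.array([1, 2, 3]), 2)
print(each_twice.tolist())  # [1, 1, 2, 2, 3, 3]

# Give each element its own repeat count.
varying = np.repeat(np.array([1, 2, 3]), [1, 2, 3])
print(varying.tolist())  # [1, 2, 2, 3, 3, 3]
```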
Random Numbers II
Function Name: np.random.randint
Use case: Returns random integers in array
Example: np.random.randint(1,7)
Blog post here: TBA
Link to randint documentation
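Called with only a low and a high value, np.random.randint returns a single integer; the upper bound is exclusive, so (1, 7) behaves like a six-sided die. Passing size gives an array instead. The seed below is an arbitrary choice for repeatability.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed so the run is repeatable

# One integer from 1 to 6 (7 is excluded).
single_roll = np.random.randint(1, 7)

# An array of ten such integers.
rolls = np.random.randint(1, 7, size=10)

print(single_roll)
print(rolls)
```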
Random Numbers III
Function Name: np.random.randn
Use case: Returns samples from the standard normal distribution
Example: np.random.randn(100)
Blog post here: TBA
Link to randn documentation
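Because np.random.randn samples a distribution with mean 0 and standard deviation 1, you can shift and scale the result to get other normal distributions. In this sketch the target mean of 10 and standard deviation of 2 are made-up values, and the seed is arbitrary.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed so the run is repeatable

# 100 draws from the standard normal distribution.
draws = np.random.randn(100)
print(draws.mean(), draws.std())  # close to 0 and 1, but not exact

# Shift and scale to get mean 10 and standard deviation 2.
scaled = 10 + 2 * np.random.randn(100)
print(scaled.mean(), scaled.std())
```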
2. Pandas
Background
Pandas is a library in Python built on another library called Numpy.
If you work with anything related to data, Pandas is very likely to be your most used package.
Features
In my experience, the package makes it possible to run high-performance operations, especially for ETL (extract, transform, load) work.
Documentation
Documentation about Pandas can be found here
Blog post
Find my Blog post about Pandas here (TBA)
Import
You can import pandas with the calls below (Pandas is built on Numpy, so Numpy must be installed; importing it yourself is optional, but you will often want it alongside Pandas).
import numpy as np
import pandas as pd
Below are the functions in Pandas that I like most.
Read CSV File
Function Name: pd.read_csv
Usecase: Read csv-files to pandas dataframes
Example: df = pd.read_csv("test.csv")
Link to read_csv documentation
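To try pd.read_csv without a file on disk, you can pass it a file-like object. This is only a sketch: the CSV text below is made up and stands in for the contents of a file such as test.csv.

```python
import io

import pandas as pd

# Made-up CSV content standing in for a real file.
csv_text = "name,height,length\na,1.5,10\nb,2.0,20\nc,2.5,30\n"

# read_csv accepts a path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)          # (3, 3)
print(list(df.columns))  # ['name', 'height', 'length']
```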
See Datatypes
Function Name: dtypes
Usecase: See the data type of each column in a dataframe
Example: df.dtypes
Link to pandas documentation
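A small sketch of what dtypes reports, using a made-up dataframe (the column names are hypothetical). The dtype shown for text columns can differ between Pandas versions, so only the numeric columns are commented here.

```python
import pandas as pd

# Hypothetical dataframe with one text, one float and one integer column.
df = pd.DataFrame(
    {"name": ["a", "b"], "height": [1.5, 2.0], "quantity": [3, 4]}
)

# dtypes is an attribute (no parentheses) and returns one dtype per column.
print(df.dtypes)

height_type = str(df.dtypes["height"])      # 'float64'
quantity_type = str(df.dtypes["quantity"])  # 'int64'
```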
See first n rows
Function Name: head()
Usecase: To see the first n rows of a dataframe
Example: df.head()
Link to pandas documentation
Create subset by index
Function Name: iloc[]
Usecase: Create a subset of a dataset by integer position
Example: df.iloc[:5, 10:20]
Link to pandas documentation
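iloc selects purely by position, and the end of each slice is excluded. The sketch below uses a small made-up dataframe so the result is easy to check by eye:

```python
import numpy as np
import pandas as pd

# 6 rows and 5 columns filled with the numbers 0 to 29.
df = pd.DataFrame(np.arange(30).reshape(6, 5), columns=list("abcde"))

# First two rows, and the columns at positions 1 and 2 (position 3 is excluded).
subset = df.iloc[:2, 1:3]

print(list(subset.columns))    # ['b', 'c']
print(subset.values.tolist())  # [[1, 2], [6, 7]]
```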
See column names
Function Name: columns
Usecase: See column names of a dataframe
Example: df.columns
Link to pandas documentation
Create Subset by Name
Function Name: loc[]
Usecase: Create a subset of a dataset by row and column labels
Example: df.loc[[5,10],["height","length"]]
Link to pandas documentation
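loc works with labels, so the 5 and 10 in the example refer to index labels, not row positions. A sketch with a made-up dataframe whose index labels are 5, 10 and 15:

```python
import pandas as pd

# Hypothetical data; the index labels are deliberately not 0, 1, 2.
df = pd.DataFrame(
    {"height": [1.0, 2.0, 3.0], "length": [10, 20, 30], "width": [5, 6, 7]},
    index=[5, 10, 15],
)

# Select rows labelled 5 and 10, and the two named columns.
subset = df.loc[[5, 10], ["height", "length"]]
print(subset.shape)  # (2, 2)

# loc also accepts a boolean condition for the rows.
tall = df.loc[df["height"] > 1.5, ["height"]]
print(list(tall.index))  # [10, 15]
```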
Convert to dataFrame
Function Name: DataFrame()
Usecase: Convert data (for example a dict, list or array) to a dataframe
Example: pd.DataFrame(data)
Link to pandas documentation
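pd.DataFrame accepts several kinds of input. The sketch below shows three common ones; all data and column names are made up for illustration.

```python
import numpy as np
import pandas as pd

# From a dict of columns.
from_dict = pd.DataFrame({"height": [1.5, 2.0], "length": [10, 20]})

# From a two-dimensional numpy array, with column names supplied separately.
from_array = pd.DataFrame(np.array([[1, 2], [3, 4]]), columns=["a", "b"])

# From a list of row dicts.
from_records = pd.DataFrame(
    [{"height": 1.5, "length": 10}, {"height": 2.0, "length": 20}]
)

print(from_dict.shape)                   # (2, 2)
print(list(from_array.columns))          # ['a', 'b']
print(from_records["length"].tolist())   # [10, 20]
```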
Drop columns
Function Name: drop()
Usecase: Drops specified columns
Example: df = df.drop(columns = ['height'])
Link to pandas documentation
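drop returns a new dataframe and leaves the original untouched, which is why the example assigns the result back to df. A short sketch with made-up columns:

```python
import pandas as pd

df = pd.DataFrame({"height": [1.5, 2.0], "length": [10, 20], "width": [5, 6]})

# The result is a new dataframe without the 'height' column.
trimmed = df.drop(columns=["height"])

print(list(trimmed.columns))  # ['length', 'width']
print(list(df.columns))       # ['height', 'length', 'width']
```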
3. Matplotlib
Background
Matplotlib is one of the main visualization libraries in Python.
As mentioned before, it is the backend for the Seaborn package.
Features
Just as with Numpy and Pandas, a lot of things can be done in the higher-level package, here Seaborn.
But it is also good to know how to create different plots directly in matplotlib.
Documentation
Find documentation about Matplotlib here
Blog post
Find my Blog post about Matplotlib here (TBA)
Import
You can import matplotlib with the calls below:
import matplotlib as mpl
import matplotlib.pyplot as plt
import warnings; warnings.filterwarnings(action='once')
Below are the functions in Matplotlib that I like the most
Create a Scatterplot
Function Name: plt.scatter()
Usecase: Creates a scatterplot of the data
Example:
plt.scatter(xAxis, yAxis)
plt.title("title name")
plt.xlabel("xAxis label")
plt.ylabel("yAxis label")
plt.show()
Link to MatPlotLib Documentation
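The scatterplot example assumes xAxis and yAxis are already defined. Here is a self-contained sketch with made-up data. To make it run anywhere, including machines without a display, I assume the non-interactive Agg backend and write the figure to an in-memory buffer; when working interactively you would drop those parts and call plt.show() instead.

```python
import io

import matplotlib

matplotlib.use("Agg")  # assumption: non-interactive backend, no window is opened
import matplotlib.pyplot as plt

# Made-up data points.
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 5, 4, 6]

plt.scatter(x_values, y_values)
plt.title("title name")
plt.xlabel("xAxis label")
plt.ylabel("yAxis label")

# Render the figure as PNG into memory instead of showing it.
buffer = io.BytesIO()
plt.savefig(buffer, format="png")
plt.close()

png_bytes = buffer.getvalue()
print(len(png_bytes) > 0)  # True
```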
Create a Correlogram
TBA
Create a Barchart
Function Name: plt.bar(xAxis, yAxis)
Usecase: Creates a bar chart of the data
Example:
plt.bar(xAxis, yAxis)
plt.title("title name")
plt.xlabel("xAxis label")
plt.ylabel("yAxis label")
plt.show()
Link to MatPlotLib Documentation
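plt.bar returns a container holding one rectangle per bar, which is handy if you want to inspect or restyle the bars afterwards. A sketch with made-up categories; the Agg backend is my assumption so the code runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as plt

# Made-up categories and totals.
categories = ["a", "b", "c"]
totals = [3, 7, 5]

bars = plt.bar(categories, totals)
plt.title("title name")
plt.xlabel("xAxis label")
plt.ylabel("yAxis label")

# Each bar is a rectangle whose height is the plotted value.
heights = [float(bar.get_height()) for bar in bars]
print(heights)  # [3.0, 7.0, 5.0]

plt.close()
```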
Create a linechart
Function Name: plt.plot(xAxis, yAxis)
Usecase: Creates a line chart of the data
Example:
plt.plot(xAxis, yAxis)
plt.title("title name")
plt.xlabel("xAxis label")
plt.ylabel("yAxis label")
plt.show()
Link to MatPlotLib Documentation
Create a Histogram
Function Name: plt.hist(xAxis, bins)
Usecase: Creates a histogram of the data
Example:
plt.hist(x, bins=20)
Link to MatPlotLib Documentation
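plt.hist returns the bin counts and bin edges along with the drawn patches, so you can check how the data was binned. In this sketch the data is 1000 made-up draws from a normal distribution, and the seed and the Agg backend are my own assumptions.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(1)  # arbitrary seed so the run is repeatable
x = np.random.randn(1000)

counts, bin_edges, patches = plt.hist(x, bins=20)
plt.title("title name")

print(len(bin_edges))     # 21, one more edge than there are bins
print(int(counts.sum()))  # 1000, every value falls in some bin

plt.close()
```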