Libraries in Python

Knowing the main Python libraries is one of the most useful skills you can build.
If you know how to use the core libraries and the functions they include,
you will be able to handle many of the
problems that come up when working on different projects.

 
 
 

1. numpy

Background
Numpy is one of the most used packages in data handling, either directly or indirectly.
It is the backbone that Pandas, a very popular Data Science library, is built on.

Features
Even though you can perform many Data Science-related operations in Python with Pandas,
it is important to understand what you can do in the numpy package as well.

Documentation
Documentation about Numpy can be found here

Blog post
Find my Blog post about Numpy here (TBA)

Import
You can import numpy as below:
import numpy as np
(as np is not necessary, but abbreviating numpy to np makes it easier to work with later)
Below are the numpy functions that I use most:

 
 

Create an Array

Function Name: np.array
Use case: Creates an Array
Example: np.array([1,2,3,4,5,6])
Blog post here: TBA
Link to array documentation
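
As a small sketch of np.array in use (the variable names and values are my own, picked only for illustration), the snippet below builds a one-dimensional and a two-dimensional array and checks their shapes:

```python
import numpy as np

# A one-dimensional array built from a plain Python list
values = np.array([1, 2, 3, 4, 5, 6])

# A nested list gives a two-dimensional array (2 rows, 3 columns)
matrix = np.array([[1, 2, 3], [4, 5, 6]])

print(values.shape)  # (6,)
print(matrix.shape)  # (2, 3)
print(values * 2)    # arithmetic is applied element by element
```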

 

Return unique elements

Function Name: np.unique
Use case: Returns unique elements from array
Example: np.unique(array)
Blog post here: TBA
Link to unique documentation
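
A short sketch of np.unique on made-up data. The return_counts argument is an optional extra that I find handy; it is not required for the basic call:

```python
import numpy as np

array = np.array([3, 1, 2, 3, 1, 3])

# The unique values come back sorted
unique_values = np.unique(array)
print(unique_values)  # [1 2 3]

# return_counts=True also reports how often each value occurs
unique_values, counts = np.unique(array, return_counts=True)
print(counts)  # [2 1 3]
```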

Create incremental array

Function Name: np.arange
Use case: Creates an array of values with a fixed increment (the stop value is excluded)
Example: np.arange(0,100,5)
Blog post here: TBA
Link to arange documentation
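
To make the np.arange call above concrete, here is a minimal sketch (the variable name steps is just an illustration). Note that the stop value itself is not part of the result:

```python
import numpy as np

# Start at 0, stop before 100, step by 5
steps = np.arange(0, 100, 5)

print(steps[:4])   # [ 0  5 10 15]
print(steps[-1])   # 95, because the stop value 100 is excluded
print(len(steps))  # 20
```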

 

Random Numbers I

Function Name: np.random.random
Use case: Returns an array of random floats in the interval [0, 1)
Example: np.random.random(size=10)
Blog post here: TBA
Link to random documentation
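
A minimal sketch of np.random.random. The seed call is my own addition so that a run can be repeated; it is optional:

```python
import numpy as np

np.random.seed(0)  # optional: makes the run repeatable

# Ten random floats drawn uniformly from the interval [0, 1)
floats = np.random.random(size=10)

print(floats.shape)         # (10,)
print(floats.min() >= 0.0)  # True
print(floats.max() < 1.0)   # True
```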

Create a Sample

Function Name: np.linspace
Use case: Generate evenly spaced samples
Example: np.linspace(0,100,101)
Blog post here: TBA
Link to linspace documentation
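
Here is a small sketch of np.linspace with illustrative variable names. Unlike np.arange, you pass the number of samples rather than the step, and the end value is included by default:

```python
import numpy as np

# 101 evenly spaced samples from 0 to 100, both ends included
samples = np.linspace(0, 100, 101)
print(samples[:3])  # [0. 1. 2.]
print(samples[-1])  # 100.0

# Fewer samples over the same range give a larger spacing
coarse = np.linspace(0, 100, 5)
print(coarse)       # [  0.  25.  50.  75. 100.]
```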

Repeat elements of array

Function Name: np.repeat
Use case: Repeat elements of an array
Example: np.repeat(1,7)
Blog post here: TBA
Link to repeat documentation
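
A quick sketch of np.repeat on a single value and on an array (the second call is my own illustrative addition):

```python
import numpy as np

# Repeat a single value seven times
ones = np.repeat(1, 7)
print(ones)   # [1 1 1 1 1 1 1]

# With an array, each element is repeated in turn
pairs = np.repeat(np.array([1, 2, 3]), 2)
print(pairs)  # [1 1 2 2 3 3]
```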

 

Random Numbers II

Function Name: np.random.randint
Use case: Returns random integers (the upper bound is excluded); pass size to get an array
Example: np.random.randint(1,7,size=10)
Blog post here: TBA
Link to randint documentation
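
As a sketch, np.random.randint can simulate rolls of a six-sided die. The die framing and the variable names are my own illustration; the point to note is that the upper bound is excluded:

```python
import numpy as np

np.random.seed(0)  # optional: makes the run repeatable

# A single integer from 1 to 6 (the upper bound 7 is excluded)
single_roll = np.random.randint(1, 7)

# Ten such integers at once, returned as an array
rolls = np.random.randint(1, 7, size=10)

print(rolls.shape)  # (10,)
```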

Random Numbers III

Function Name: np.random.randn
Use case: Returns samples from the standard normal distribution
Example: np.random.randn(100)
Blog post here: TBA
Link to randn documentation
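
A rough way to convince yourself that np.random.randn draws from the standard normal distribution is to check the mean and standard deviation of a large sample. This is only a sketch: the sample size and seed are arbitrary choices of mine.

```python
import numpy as np

np.random.seed(0)  # optional: makes the run repeatable

# A large sample, so the summary statistics settle down
sample = np.random.randn(100_000)

print(sample.mean())  # close to 0
print(sample.std())   # close to 1
```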

 
 

2. Pandas

Background
Pandas is a library in Python built on another library called Numpy.
If you work with anything related to data, Pandas will very likely be your most used package.

Features
The package makes it possible to run high-performance operations on data; in my usage, these are mostly ETL-related tasks.

Documentation
Documentation about Pandas can be found here

Blog post
Find my Blog post about Pandas here (TBA)


Import
You can import pandas with the calls below (Numpy must be installed since Pandas is built on it; importing it alongside is optional but common).

import numpy as np
import pandas as pd

Below are the functions in Pandas that I like most.

 
 

Read CSV File

Function Name: pd.read_csv
Usecase: Read csv-files to pandas dataframes
Example: df = pd.read_csv("test.csv")
Link to read_csv documentation
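
To try pd.read_csv without a file on disk, the sketch below feeds it an in-memory string through io.StringIO. The CSV content is made up for illustration; with a real file you would pass the path instead, as in the example above.

```python
import io

import pandas as pd

# Stand-in for a file on disk; pd.read_csv also accepts a path such as "test.csv"
csv_text = "name,height,length\na,1.5,10\nb,2.0,20\n"

df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)          # (2, 3)
print(list(df.columns))  # ['name', 'height', 'length']
```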

 

See Datatypes

Function Name: dtypes
Usecase: See the data types of the columns in a dataframe
Example: df.dtypes
Link to pandas documentation
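
A small sketch of df.dtypes on a made-up dataframe. The exact dtype shown for the text column can differ between pandas versions, so I only comment on the numeric ones:

```python
import pandas as pd

df = pd.DataFrame({"name": ["a", "b"], "height": [1.5, 2.0], "length": [10, 20]})

# One dtype per column; dtypes is an attribute, so there are no parentheses
print(df.dtypes)

print(df.dtypes["height"])  # float64
print(df.dtypes["length"])  # int64
```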

See first n rows

Function Name: head()
Usecase: To see the first n rows of a dataframe
Example: df.head()
Link to pandas documentation
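
As a quick sketch on a made-up dataframe, head() returns five rows unless you pass another number:

```python
import pandas as pd

df = pd.DataFrame({"value": range(100)})

print(len(df.head()))    # 5, the default
print(len(df.head(10)))  # 10
```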

 

Create subset by index

Function Name: iloc[]
Usecase: Create a subset of the dataset by integer position
Example: df.iloc[:5, 10:20]
Link to pandas documentation
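
The iloc example above needs a dataframe with at least 20 columns, so this sketch builds a dummy one from a range of numbers (the data is purely illustrative). Slices in iloc work like ordinary Python slices, so the end position is excluded:

```python
import numpy as np
import pandas as pd

# 8 rows and 25 columns filled with the numbers 0..199
df = pd.DataFrame(np.arange(200).reshape(8, 25))

# First 5 rows, columns at positions 10 up to (not including) 20
subset = df.iloc[:5, 10:20]

print(subset.shape)       # (5, 10)
print(subset.iloc[0, 0])  # 10
```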

See column names

Function Name: columns
Usecase: See column names of a dataframe
Example: df.columns
Link to pandas documentation

 

Create Subset by Name

Function Name: loc[]
Usecase: Create a subset of the dataset by row and column labels
Example: df.loc[[5,10],["height","length"]]
Link to pandas documentation
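
In contrast to iloc, loc selects by label. In the sketch below the row labels 5, 10 and 15 are index values that I set by hand for illustration, not positions:

```python
import pandas as pd

df = pd.DataFrame(
    {"height": [1.5, 2.0, 2.5], "length": [10, 20, 30], "width": [4, 5, 6]},
    index=[5, 10, 15],  # row labels, not positions
)

# Rows labelled 5 and 10, columns named "height" and "length"
subset = df.loc[[5, 10], ["height", "length"]]

print(subset.shape)              # (2, 2)
print(subset.loc[10, "length"])  # 20
```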

Convert to dataFrame

Function Name: DataFrame()
Usecase: Convert other data types (e.g. a dict or a list) to a dataframe
Example: pd.DataFrame(data)
Link to pandas documentation
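
Two common inputs to pd.DataFrame are a dict of columns and a list of rows. A minimal sketch with made-up values:

```python
import pandas as pd

# A dict maps column names to column values
data = {"height": [1.5, 2.0], "length": [10, 20]}
df = pd.DataFrame(data)
print(df.shape)  # (2, 2)

# A list of rows works too when the column names are passed separately
rows = [[1.5, 10], [2.0, 20]]
df_rows = pd.DataFrame(rows, columns=["height", "length"])
print(list(df_rows.columns))  # ['height', 'length']
```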

 

Drop columns

Function Name: drop()
Usecase: Drops specified columns
Example: df = df.drop(columns = ['height'])
Link to pandas documentation
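
One detail worth a sketch: drop returns a new dataframe and leaves the original untouched, which is why the example assigns the result back to df. The data here is made up:

```python
import pandas as pd

original = pd.DataFrame({"height": [1.5, 2.0], "length": [10, 20]})

# drop returns a new dataframe; the original keeps all its columns
df = original.drop(columns=["height"])

print(list(df.columns))        # ['length']
print(list(original.columns))  # ['height', 'length']
```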

 
 

3. Matplotlib

Background
Matplotlib is one of the main visualization libraries in Python.
It is also the backend for the Seaborn package.

Features
Just as Pandas covers much of what numpy does, a lot of plotting can be done in Seaborn.
But it is also good to know how to create different plots directly in matplotlib.

Documentation
Find documentation about Matplotlib here

Blog post
Find my Blog post about Matplotlib here (TBA)

Import
You can import matplotlib with the calls below:

import matplotlib as mpl
import matplotlib.pyplot as plt
import warnings; warnings.filterwarnings(action='once')

Below are the functions in Matplotlib that I like the most

 
 

Create a Scatterplot

Function Name: plt.scatter()
Usecase: Creates a scatterplot of the data
Example:
plt.scatter(xAxis, yAxis)
plt.title("title name")
plt.xlabel('xAxis label')
plt.ylabel('yAxis label')
plt.show()
Link to MatPlotLib Documentation
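
For a version of the scatterplot that runs as-is, the sketch below fills xAxis and yAxis with a few made-up points. Keeping the return value of plt.scatter is optional; I do it here only to show that it holds the plotted points.

```python
import matplotlib.pyplot as plt

# Made-up data: one x value and one y value per point
xAxis = [1, 2, 3, 4, 5]
yAxis = [2, 4, 5, 4, 6]

points = plt.scatter(xAxis, yAxis)
plt.title("title name")
plt.xlabel("xAxis label")
plt.ylabel("yAxis label")
plt.show()

print(len(points.get_offsets()))  # 5, one entry per plotted point
```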

 

Create a Correlogram

TBA

Create a Barchart

Function Name: plt.bar(xAxis, yAxis)
Usecase: Creates a bar chart of the data
Example:
plt.bar(xAxis, yAxis)
plt.title("title name")
plt.xlabel('xAxis label')
plt.ylabel('yAxis label')
plt.show()
Link to MatPlotLib Documentation
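
A runnable sketch of the bar chart with made-up categories and heights. The x values can be plain strings, and plt.bar hands back one bar object per category:

```python
import matplotlib.pyplot as plt

# Made-up data: category names on the x axis, bar heights on the y axis
xAxis = ["a", "b", "c"]
yAxis = [3, 7, 5]

bars = plt.bar(xAxis, yAxis)
plt.title("title name")
plt.xlabel("xAxis label")
plt.ylabel("yAxis label")
plt.show()

print(len(bars))  # 3, one bar per category
```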

 

Create a linechart

Function Name: plt.plot(xAxis, yAxis)
Usecase: Creates a line chart of the data
Example:
plt.plot(xAxis, yAxis)
plt.title("title name")
plt.xlabel('xAxis label')
plt.ylabel('yAxis label')
plt.show()
Link to MatPlotLib Documentation
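
The same pattern works for a line chart. In this sketch the data points are made up, and plt.plot returns a list with one line object per plotted line:

```python
import matplotlib.pyplot as plt

# Made-up data: the points are connected in the order given
xAxis = [0, 1, 2, 3, 4]
yAxis = [0, 1, 4, 9, 16]

lines = plt.plot(xAxis, yAxis)
plt.title("title name")
plt.xlabel("xAxis label")
plt.ylabel("yAxis label")
plt.show()

print(len(lines))  # 1, a single line was drawn
```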

 

Create a Histogram

Function Name: plt.hist(xAxis, bins)
Usecase: Creates a histogram of the data
Example:
plt.hist(xAxis, bins=20)
plt.show()
Link to MatPlotLib Documentation
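
A histogram needs only one series of values. As a sketch, the snippet below draws 1000 made-up values from np.random.randn and sorts them into 20 bins. plt.hist also returns the counts and bin edges, which is useful when you want the numbers behind the plot.

```python
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(0)  # optional: makes the run repeatable
xAxis = np.random.randn(1000)  # made-up data to bin

counts, bin_edges, patches = plt.hist(xAxis, bins=20)
plt.title("title name")
plt.xlabel("xAxis label")
plt.ylabel("yAxis label")
plt.show()

print(len(bin_edges))     # 21, since 20 bins have 21 edges
print(int(counts.sum()))  # 1000, every value lands in a bin
```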
