Python lists allow you to store multiple items in a single object.
You can convert multiple lists into a pandas dataframe using the built-in zip() function.
Basic Example
To convert multiple lists into a pandas dataframe,
- Create multiple lists, then create a list of tuples with one value from each list
- Pass the list of tuples to pd.DataFrame() to create a dataframe out of the lists
import pandas as pd
designation = ['Data Scientist', 'Developer', 'Sr. Developer', 'Product Manager']
avg_salary = [200000, 175000, 190000, 250000]
salary_lists = list(zip(designation, avg_salary))
df = pd.DataFrame(salary_lists, columns = ['designation', 'avg_salary'])
df.head()
DataFrame Will Look Like
The list values are converted into a DataFrame.
| | designation | avg_salary |
---|---|---|
0 | Data Scientist | 200000 |
1 | Developer | 175000 |
2 | Sr. Developer | 190000 |
3 | Product Manager | 250000 |
Different methods are available to convert multiple lists into Pandas Dataframe. Let us learn each method in detail and see when it is appropriate to use them.
Creating Multiple Lists
- Create multiple lists of the same size.
- To calculate the size of a list, read How to count the number of elements in the list.
designation = ['Data Scientist', 'Developer', 'Sr. Developer', 'Product Manager']
avg_salary = [200000, 175000, 190000, 250000]
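A quick length check before building the dataframe can catch mismatched lists early. The sketch below uses the built-in len() function; the variable name same_size is only illustrative and does not appear elsewhere in this tutorial.

```python
designation = ['Data Scientist', 'Developer', 'Sr. Developer', 'Product Manager']
avg_salary = [200000, 175000, 190000, 250000]

# len() returns the number of items in a list
same_size = len(designation) == len(avg_salary)
print(len(designation), len(avg_salary), same_size)
```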
Convert Lists to Dataframe Using a Zip
In this section, you’ll use the zip() function to create a list of tuples.
- Each tuple will contain one item from each list.
- When the lists are of different sizes, the shortest list determines the number of tuples created. The remaining items in the longer lists are ignored.
Use this method when you want to create a dataframe from more than two lists, or when the lists are of different sizes and truncating to the shortest one is acceptable.
Code
- Use the zip() function to pair up the items, and wrap the result in list() to get a list of tuples
- Pass the list to pd.DataFrame() to create a pandas dataframe out of the tuples list
- Add a header to the dataframe using the columns parameter
import pandas as pd
salary_lists = list(zip(designation, avg_salary))
df = pd.DataFrame(salary_lists, columns = ['designation', 'avg_salary'])
df.head()
DataFrame Will Look Like
When you print the dataframe using df.head(), you’ll see its first five rows, which here is every value from the lists.
| | designation | avg_salary |
---|---|---|
0 | Data Scientist | 200000 |
1 | Developer | 175000 |
2 | Sr. Developer | 190000 |
3 | Product Manager | 250000 |
This is how you can convert multiple lists into a pandas dataframe using the zip() function.
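To illustrate both points made about zip(), here is a minimal sketch with three lists where one is shorter. The location list and its values are hypothetical and added only for this example; the resulting dataframe has as many rows as the shortest list.

```python
import pandas as pd

designation = ['Data Scientist', 'Developer', 'Sr. Developer', 'Product Manager']
avg_salary = [200000, 175000, 190000, 250000]
# Hypothetical third list, deliberately one item shorter than the others
location = ['Remote', 'Onsite', 'Hybrid']

# zip() stops at the shortest list, so 'Product Manager' is dropped
rows = list(zip(designation, avg_salary, location))
df_three = pd.DataFrame(rows, columns=['designation', 'avg_salary', 'location'])
print(df_three)
```

Because the extra items are dropped silently, this approach is best suited to cases where losing the trailing values is acceptable.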
Convert Lists to Dataframe Using a Dictionary
This section teaches you how to convert multiple lists to a pandas dataframe using a dictionary and the from_dict() method.
- Create a dictionary with the column names as keys and the lists as values
- Use the from_dict() method to create a pandas dataframe from the dictionary
Use this method when you want each list to become a named column, because the dictionary keys become the column headers and you do not need to pass them separately.
Code
import pandas as pd
designation = ['Data Scientist', 'Developer', 'Sr. Developer', 'Product Manager']
avg_salary = [200000, 175000, 190000, 250000]
salary_dict = dict(designation=designation, avg_salary=avg_salary)
df = pd.DataFrame.from_dict(salary_dict)
df.head()
DataFrame Will Look Like
| | designation | avg_salary |
---|---|---|
0 | Data Scientist | 200000 |
1 | Developer | 175000 |
2 | Sr. Developer | 190000 |
3 | Product Manager | 250000 |
This is how you can use a dictionary to convert two lists to a pandas dataframe.
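As a side note, with the default orientation the same dictionary can also be passed straight to the pd.DataFrame() constructor. The short sketch below compares the two results; the names df_from_dict and df_direct are illustrative only.

```python
import pandas as pd

designation = ['Data Scientist', 'Developer', 'Sr. Developer', 'Product Manager']
avg_salary = [200000, 175000, 190000, 250000]
salary_dict = {'designation': designation, 'avg_salary': avg_salary}

# Build the dataframe both ways
df_from_dict = pd.DataFrame.from_dict(salary_dict)
df_direct = pd.DataFrame(salary_dict)

# equals() checks that both dataframes hold the same data and labels
print(df_from_dict.equals(df_direct))
```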
Create Dataframe from Lists of Different Length
When you have multiple lists, they may be of different sizes.
If you build a dataframe from a dictionary of lists with different sizes, you’ll see a ValueError saying that all the arrays must be of the same length.
ValueError: All arrays must be of the same length
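The error can be reproduced with a short sketch like the one below, which assumes the dictionary approach from the previous section. The error_raised flag is an illustrative name, and the exact wording of the message may vary between pandas versions.

```python
import pandas as pd

# Five designations but only four salaries
designation = ['Data Scientist', 'Developer', 'Sr. Developer', 'Product Manager', 'Project Manager']
avg_salary = [200000, 175000, 190000, 250000]
salary_dict = dict(designation=designation, avg_salary=avg_salary)

try:
    pd.DataFrame.from_dict(salary_dict)
    error_raised = False
except ValueError as err:
    # pandas refuses to build columns of unequal length
    error_raised = True
    print(f"ValueError: {err}")
```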
There are two ways to create a dataframe from lists of different sizes.
- Use the zip() approach explained above. This creates a dataframe with as many rows as the shortest list.
- Use a dictionary and the from_dict() method with the orient='index' parameter. This creates a dataframe with the lists as rows. Then use the transpose() method to turn the rows into columns.
Code
designation = ['Data Scientist', 'Developer', 'Sr. Developer', 'Product Manager', 'Project Manager']
avg_salary = [200000, 175000, 190000, 250000]
salary_dict = dict(designation=designation, avg_salary=avg_salary)
df = pd.DataFrame.from_dict(salary_dict, orient='index').transpose()
df
DataFrame Will Look Like
The missing values will be denoted with None.
| | designation | avg_salary |
---|---|---|
0 | Data Scientist | 200000 |
1 | Developer | 175000 |
2 | Sr. Developer | 190000 |
3 | Product Manager | 250000 |
4 | Project Manager | None |
This is how you can create a dataframe from lists of different lengths.
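An alternative that this tutorial does not cover is to wrap each list in a pd.Series before building the dataframe. The sketch below assumes that approach: pandas aligns the Series on their index and fills the gap with NaN, which also turns the avg_salary column into floats. The name df_series is illustrative only.

```python
import pandas as pd

designation = ['Data Scientist', 'Developer', 'Sr. Developer', 'Product Manager', 'Project Manager']
avg_salary = [200000, 175000, 190000, 250000]

# Each Series gets its own 0..n-1 index; pandas aligns them and
# fills positions missing from the shorter Series with NaN
df_series = pd.DataFrame({
    'designation': pd.Series(designation),
    'avg_salary': pd.Series(avg_salary),
})
print(df_series)
```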
Create Dataframe From Lists as Rows
This section teaches you how to create a dataframe in which each list becomes a row instead of a column.
- Create a dataframe from lists as rows by passing the orient='index' parameter to from_dict(). The dictionary keys become the row labels.
Code
designation = ['Data Scientist', 'Developer', 'Sr. Developer', 'Product Manager']
avg_salary = [200000, 175000, 190000, 250000]
salary_dict = dict(designation=designation, avg_salary=avg_salary)
df = pd.DataFrame.from_dict(salary_dict, orient='index')
df
DataFrame Will Look Like
| | 0 | 1 | 2 | 3 |
---|---|---|---|---|
designation | Data Scientist | Developer | Sr. Developer | Product Manager |
avg_salary | 200000 | 175000 | 190000 | 250000 |
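With orient='index', from_dict() also accepts a columns parameter, so the numeric headers 0 to 3 can be replaced with labels. The following is a minimal sketch; the labels role_1 to role_4 are hypothetical names chosen for this example.

```python
import pandas as pd

designation = ['Data Scientist', 'Developer', 'Sr. Developer', 'Product Manager']
avg_salary = [200000, 175000, 190000, 250000]
salary_dict = dict(designation=designation, avg_salary=avg_salary)

# The columns parameter is only supported together with orient='index'
df_rows = pd.DataFrame.from_dict(
    salary_dict,
    orient='index',
    columns=['role_1', 'role_2', 'role_3', 'role_4'],  # hypothetical labels
)
print(df_rows)
```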