A pandas DataFrame is a two-dimensional data structure that stores data in rows and columns. DataFrames are widely used in data science and machine learning.
You can create an empty dataframe in pandas using the pd.DataFrame() constructor.
In this tutorial, you’ll learn how to create an empty dataframe in Pandas.
If You’re in a Hurry
You can use the below code snippet to create an empty dataframe in pandas.
import pandas as pd
# create an empty dataframe
df = pd.DataFrame()
df
Dataframe Looks Like
Empty DataFrame
Columns: []
Index: []
If You Want to Understand Details, Read on…
In this tutorial, you’ll learn the different methods available to create an empty dataframe in pandas and additional options available while creating an empty dataframe. Read on…
Create Empty Dataframe
First, you’ll learn how to create an empty dataframe using the DataFrame() class available in the pandas library. Calling the class acts as a constructor that builds a new dataframe object.
Snippet
# import pandas library
import pandas as pd
# create an empty dataframe
df = pd.DataFrame()
df
Empty Dataframe Looks Like
Empty DataFrame
Columns: []
Index: []
The DataFrame() class supports the parameters below. All of them are optional. If you don’t pass any parameter, a simple empty dataframe object is created.
data – Used to pass the initial values to the dataframe
index – Used to create the index of the resulting dataframe
columns – Column labels to be used in the resulting dataframe
dtype – Used to specify a single data type to apply to the whole dataframe
copy – Used to specify whether the data should be copied from the inputs. The default depends on the pandas version and the input type (False in older versions, None in newer ones).
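The parameters listed above can be combined in a single call. The following is a minimal sketch; the row labels, column labels, and values are made up for illustration and are not part of the original example.

```python
import pandas as pd

# Illustrative values only; labels and numbers are assumptions for this sketch.
df_params = pd.DataFrame(
    data=[[1, 2], [3, 4]],      # initial values, one inner list per row
    index=["row_1", "row_2"],   # row labels
    columns=["A", "B"],         # column labels
    dtype="float64",            # one data type applied to every column
)

print(df_params)
print(df_params.dtypes)
```

Leaving out every argument, as in pd.DataFrame(), gives the plain empty dataframe shown earlier.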
This is how you can create an empty dataframe.
Next, you’ll learn about creating a dataframe with just column names.
Create Empty Dataframe With column names
In this section, you’ll learn how to create an empty dataframe with column names.
You can define the column names as a list and pass the list to the columns parameter when calling DataFrame(), as shown below.
column_names = ['Column_1', 'Column_2', 'Column_3']
df = pd.DataFrame(columns = column_names)
df
An empty dataframe will be created with headers as shown below.
Dataframe Looks Like
| Column_1 | Column_2 | Column_3 |
|---|---|---|
This is how you can create an empty dataframe with the defined column names as headers.
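To confirm that a dataframe created this way has headers but no rows, you can inspect a few of its attributes. This is a small sketch; the variable name df_cols is chosen here for illustration.

```python
import pandas as pd

column_names = ['Column_1', 'Column_2', 'Column_3']
df_cols = pd.DataFrame(columns=column_names)

print(df_cols.empty)          # True, because there are no rows
print(df_cols.shape)          # (0, 3): zero rows, three columns
print(list(df_cols.columns))  # ['Column_1', 'Column_2', 'Column_3']
```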
Next, you’ll create an empty dataframe with column names and data types.
Create Empty Dataframe With column names And Datatypes
In this section, you’ll learn how to create an empty dataframe with column names and data types defined for each column.
You’ll need to create an empty pandas series for each column and specify the data type for that column using the dtype parameter.
Creating a series
pd.Series([], dtype='int')
You can create any number of series with the different data types that pandas supports. You can assign a series to each column while creating the dataframe, as shown below.
Snippet
You can use the below snippet to create an empty dataframe with column headers and data types defined for it.
df = pd.DataFrame({'Column_1': pd.Series([], dtype='int'),
'Column_2': pd.Series([], dtype='str'),
'Column_3': pd.Series([], dtype='float')})
df.dtypes
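When you print the dataframe column types using df.dtypes, you’ll see output similar to the one below. The exact types depend on your platform and on the pandas and NumPy versions; for example, 'int' often maps to int64 instead of int32.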
Output
Column_1 int32
Column_2 object
Column_3 float64
dtype: object
This is how you can create an empty dataframe with column headers and data types defined for each column.
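As an alternative to building one series per column, you could create the dataframe with column names first and then convert the types with astype(). This is a sketch of that variation, not the approach used above; the explicit dtype names int64, object, and float64 are chosen here to avoid platform-dependent defaults.

```python
import pandas as pd

# Create the headers first, then assign a data type to each column.
df_typed = pd.DataFrame(columns=['Column_1', 'Column_2', 'Column_3']).astype(
    {'Column_1': 'int64', 'Column_2': 'object', 'Column_3': 'float64'}
)

print(df_typed.dtypes)
```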
Next, you’ll learn how to create an empty dataframe with size.
Create Empty Dataframe With Size
In this section, you’ll learn how to create an empty dataframe with size.
You can create a dataframe with a specified size for both columns and rows.
Use the range() function to create a sequence of numbers and pass it to the index parameter, the columns parameter, or both to specify the row and column sizes.
To specify the number of rows, use the index parameter with range(). For example, index=range(no_of_Rows).
To specify the number of columns, use the columns parameter with range(). For example, columns=range(no_of_Cols).
Snippet
Use the below snippet to create an empty dataframe with 2 rows and 5 columns.
no_of_Rows = 2
no_of_Cols = 5
df = pd.DataFrame(index=range(no_of_Rows),columns=range(no_of_Cols))
df
You’ll see the empty dataframe created with 2 rows and 5 columns. Every cell holds the value NaN, which denotes missing data.
Dataframe Looks Like
| | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | NaN | NaN | NaN | NaN | NaN |
| 1 | NaN | NaN | NaN | NaN | NaN |
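The size and the missing values of such a dataframe can be verified programmatically. A short sketch, with df_sized as an illustrative variable name:

```python
import pandas as pd

df_sized = pd.DataFrame(index=range(2), columns=range(5))

print(df_sized.shape)               # (2, 5): two rows, five columns
print(df_sized.size)                # 10 cells in total
print(df_sized.isna().all().all())  # True: every cell is missing
```

Note that df_sized.empty is False here, because the dataframe has rows and columns even though every cell is NaN.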
To create an empty Dataframe only with a specified number of rows, use the below snippet.
nRows= 2
df = pd.DataFrame(index=range(nRows))
df
Dataframe Looks Like
| |
|---|
| 0 |
| 1 |
To create a dataframe with only a specified number of columns, use the below snippet.
nCols = 5
df = pd.DataFrame(columns=range(nCols))
df
Dataframe Looks Like
| 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
This is how you can create an empty dataframe with size.
Next, you’ll learn about appending columns to empty dataframe.
Create Empty Dataframe and Append Columns
In this section, you’ll learn how to create an empty dataframe and append columns to the empty dataframe.
First, create an empty dataframe using pd.DataFrame().
Next, you can append a column to the created dataframe using the insert() method. To know more about other methods available to add columns to the dataframe, refer to the add column to dataframe tutorial.
The dataframe’s insert() method accepts the following parameters.
loc – Index position where the new column is to be inserted
column – Name of the new column to be inserted
value – Values for the new column. It can be a scalar, a series, or an array-like object such as a list.
allow_duplicates – Used to specify whether duplicate column names are allowed. By default, it’s False. If there is a column already available in the dataframe with the same name, then an error will be raised. If this parameter is True, then the error will not be raised and a duplicate column will be created.
Snippet
Use the below code to insert a column at the 0th position of the dataframe.
df = pd.DataFrame()
# Using DataFrame.insert() to add a column
df.insert(0, "Column_1", [5,10,10,5,10], True)
df
Where,
0 – Index position
Column_1 – Name for the new column
[5,10,10,5,10] – List of values to pass to the dataframe
True – To allow duplicate column headers
Column_1 will be inserted into the dataframe as shown below.
Dataframe Looks Like
| | Column_1 |
|---|---|
| 0 | 5 |
| 1 | 10 |
| 2 | 10 |
| 3 | 5 |
| 4 | 10 |
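The effect of the allow_duplicates parameter described above can be seen in a small sketch. The dataframe df_dup and its values are made up for illustration.

```python
import pandas as pd

df_dup = pd.DataFrame({'Column_1': [1, 2, 3]})

# With the default setting, reusing an existing column name raises ValueError.
try:
    df_dup.insert(0, 'Column_1', [4, 5, 6])
    raised = False
except ValueError as err:
    raised = True
    print(f"insert refused: {err}")

# With allow_duplicates=True, a second column with the same name is created.
df_dup.insert(0, 'Column_1', [4, 5, 6], allow_duplicates=True)
print(list(df_dup.columns))
```

Duplicate column names make label-based selection ambiguous, so leaving the default in place is usually the safer choice.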
To append multiple columns to the dataframe, you can use the below code.
Snippet
df['Column_2'], df['Column_3'] = [pd.NaT, 3]
df
Then Column_2 and Column_3 will be inserted into the dataframe.
Dataframe Looks Like
| | Column_1 | Column_2 | Column_3 |
|---|---|---|---|
| 0 | 5 | NaT | 3 |
| 1 | 10 | NaT | 3 |
| 2 | 10 | NaT | 3 |
| 3 | 5 | NaT | 3 |
| 4 | 10 | NaT | 3 |
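Another way to add several constant-valued columns at once is the assign() method, which returns a new dataframe instead of modifying the existing one. This is a sketch of that alternative, with df_base and df_more as illustrative names.

```python
import pandas as pd

df_base = pd.DataFrame({'Column_1': [5, 10, 10, 5, 10]})

# Each keyword becomes a new column; scalar values are repeated for every row.
df_more = df_base.assign(Column_2=pd.NaT, Column_3=3)

print(df_more)
```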
This is how you can create an empty dataframe and add columns to it.
Next, you’ll learn about adding rows.
Create Empty Dataframe and Append Rows
In this section, you’ll learn how to create an empty dataframe and append rows to it.
First, create an empty dataframe with headers by using pd.DataFrame() with the columns parameter.
Next, append rows to it by using a dictionary. Each row needs to be created as a dictionary.
The dictionary’s keys should be the column names and its values should be the cell values. Create a dictionary with values for all the columns available in the dataframe, wrap it in a one-row dataframe, and use pd.concat() to append it as a row. The df.append() method that was previously used for this was deprecated in pandas 1.4 and removed in pandas 2.0.
For example, a dictionary for each row should look like {'Name' : 'CPU', 'Quantity' : 5, 'Price' : 20000} for the dataframe with columns Name, Quantity and Price.
df = pd.DataFrame(columns = ['Name', 'Quantity', 'Price'])
print(df)
# append rows to an empty DataFrame
# (DataFrame.append() was removed in pandas 2.0, so pd.concat() is used instead)
df = pd.concat([df, pd.DataFrame([{'Name' : 'CPU', 'Quantity' : 5, 'Price' : 20000}])],
               ignore_index = True)
df = pd.concat([df, pd.DataFrame([{'Name' : 'Monitor', 'Quantity' : 10, 'Price' : 10000}])],
               ignore_index = True)
df = pd.concat([df, pd.DataFrame([{'Name' : 'Keyboard', 'Quantity' : 10, 'Price' : 550}])],
               ignore_index = True)
df
Where
pd.concat() – Joins the existing dataframe and the new one-row dataframe
pd.DataFrame([{'Name' : 'CPU', 'Quantity' : 5, 'Price' : 20000}]) – One-row dataframe built from a dictionary with values for each column
ignore_index = True – To label the rows as 0, 1, and so on up to n. In other words, the dictionary doesn’t contain values for the index, so the default index values will be used.
Output
Empty DataFrame
Columns: [Name, Quantity, Price]
Index: []
Dataframe Looks Like
| | Name | Quantity | Price |
|---|---|---|---|
| 0 | CPU | 5 | 20000 |
| 1 | Monitor | 10 | 10000 |
| 2 | Keyboard | 10 | 550 |
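Adding rows one at a time creates a new dataframe on every step. When all the rows are known before the dataframe is needed, a common alternative is to collect the row dictionaries in a plain Python list and build the dataframe once. The sketch below assumes the same three rows as above; rows and df_rows are illustrative names.

```python
import pandas as pd

# Collect each row as a dictionary in an ordinary list.
rows = []
rows.append({'Name': 'CPU', 'Quantity': 5, 'Price': 20000})
rows.append({'Name': 'Monitor', 'Quantity': 10, 'Price': 10000})
rows.append({'Name': 'Keyboard', 'Quantity': 10, 'Price': 550})

# Build the dataframe in one step from the collected rows.
df_rows = pd.DataFrame(rows, columns=['Name', 'Quantity', 'Price'])

print(df_rows)
```

Building the dataframe once also lets pandas infer numeric types for Quantity and Price directly from the values.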
This is how you can create an empty dataframe and append rows to it.
Next, you’ll learn about creating a dataframe from another dataframe.
Create Empty Dataframe from Another Dataframe
In this section, you’ll create an empty dataframe from another dataframe which is already existing.
For example, assume an existing dataframe df with the following columns and data.
Dataframe Looks Like
| | Name | Quantity | Price | Column_2 | Column_3 |
|---|---|---|---|---|---|
| 0 | CPU | 5 | 20000 | NaT | 3 |
| 1 | Monitor | 10 | 10000 | NaT | 3 |
| 2 | Keyboard | 10 | 550 | NaT | 3 |
Now, you’ll create a dataframe df2 using the dataframe df and its columns, but without copying the data.
First, you need to get the list of columns from the dataframe df using df.columns.
Then, you can create an empty dataframe by passing this column list to the columns parameter.
Use the below snippet to create an empty dataframe from other dataframe columns.
columns_list = df.columns
df2 = pd.DataFrame(columns = columns_list)
print(df2)
Printing the new dataframe df2 will show the output below, where you can see that the columns from the dataframe df are used to create the new dataframe.
Output
Empty DataFrame
Columns: [Name, Quantity, Price, Column_2, Column_3]
Index: []
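Passing only the column labels does not carry over the column data types of the original dataframe. If you also want to keep them, one option is to take a zero-row slice of the existing dataframe. This is a sketch of that variation; df_src and its values are made up for illustration.

```python
import pandas as pd

# A small source dataframe with mixed column types (illustrative values).
df_src = pd.DataFrame({
    'Name': ['CPU', 'Monitor'],
    'Quantity': [5, 10],
    'Price': [20000.0, 10000.0],
})

# Zero-row slice: same columns and dtypes, no data.
df_empty_typed = df_src.iloc[0:0].copy()

print(df_empty_typed.dtypes)
print(df_empty_typed.empty)
```

The copy() call keeps the new dataframe independent of df_src.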
This is how you can create a dataframe using other dataframe columns.
Conclusion
To summarize, you’ve learned how to create an empty dataframe and also learned the various options available in the create dataframe operation. You’ve also appended columns and rows to the newly created dataframe.
If you have any questions, comment below.