pandas: Remove NaN (missing values) with dropna() | note.nkmk.me (2024)

You can remove NaN from pandas.DataFrame and pandas.Series with the dropna() method.

Contents

  • Remove rows/columns where all elements are NaN: how='all'
  • Remove rows/columns that contain at least one NaN: how='any' (default)
  • Remove rows/columns according to the number of non-missing values: thresh
  • Remove based on specific rows/columns: subset
  • Update the original object: inplace
  • For pandas.Series

While this article primarily deals with NaN (Not a Number), it's important to note that in pandas, None is also treated as a missing value.

  • Missing values in pandas (nan, None, pd.NA)
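As a quick illustration of the point above, the following minimal sketch builds a small, hypothetical frame (the name `sample` and its values are assumptions, not part of the article's data) that mixes None and NaN, and shows that dropna() treats both as missing.

```python
import numpy as np
import pandas as pd

# Hypothetical data: column 'x' is numeric, column 'y' holds strings.
sample = pd.DataFrame({
    'x': [1.0, None, 3.0],     # None becomes NaN in a float column
    'y': ['a', None, np.nan],  # None stays None in an object column
})

# isna() flags None and NaN alike.
print(sample.isna())

# Only the first row has no missing value, so only it survives.
cleaned = sample.dropna()
print(cleaned)
```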

See the following articles on extracting, replacing, and counting missing values.

  • pandas: Find rows/columns with NaN (missing values)
  • pandas: Replace NaN (missing values) with fillna()
  • pandas: Detect and count NaN (missing values) with isnull(), isna()

The sample code in this article uses pandas version 2.0.3. As an example, read a CSV file with missing values.

    import pandas as pd

    print(pd.__version__)
    # 2.0.3

    df = pd.read_csv('data/src/sample_pandas_normal_nan.csv')
    print(df)
    #       name   age state  point  other
    # 0    Alice  24.0    NY    NaN    NaN
    # 1      NaN   NaN   NaN    NaN    NaN
    # 2  Charlie   NaN    CA    NaN    NaN
    # 3     Dave  68.0    TX   70.0    NaN
    # 4    Ellen   NaN    CA   88.0    NaN
    # 5    Frank  30.0   NaN    NaN    NaN
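If the CSV file is not at hand, an equivalent frame can be built in memory. This is a minimal sketch with the values copied from the printed output above; constructing the frame by hand is an assumption made for illustration, not how the article loads its data.

```python
import numpy as np
import pandas as pd

# Same six rows and five columns as the printed sample data.
df = pd.DataFrame({
    'name': ['Alice', np.nan, 'Charlie', 'Dave', 'Ellen', 'Frank'],
    'age': [24.0, np.nan, np.nan, 68.0, np.nan, 30.0],
    'state': ['NY', np.nan, 'CA', 'TX', 'CA', np.nan],
    'point': [np.nan, np.nan, np.nan, 70.0, 88.0, np.nan],
    'other': [np.nan] * 6,
})

# Number of missing values per column, useful before deciding what to drop.
missing_per_column = df.isna().sum()
print(missing_per_column)
```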

Remove rows/columns where all elements are NaN: how='all'

By setting how='all', rows where all elements are NaN are removed.

    print(df.dropna(how='all'))
    #       name   age state  point  other
    # 0    Alice  24.0    NY    NaN    NaN
    # 2  Charlie   NaN    CA    NaN    NaN
    # 3     Dave  68.0    TX   70.0    NaN
    # 4    Ellen   NaN    CA   88.0    NaN
    # 5    Frank  30.0   NaN    NaN    NaN

If axis is set to 1 or 'columns', columns where all elements are NaN are removed.

    print(df.dropna(how='all', axis=1))
    #       name   age state  point
    # 0    Alice  24.0    NY    NaN
    # 1      NaN   NaN   NaN    NaN
    # 2  Charlie   NaN    CA    NaN
    # 3     Dave  68.0    TX   70.0
    # 4    Ellen   NaN    CA   88.0
    # 5    Frank  30.0   NaN    NaN

Note that if axis is set to 0 or 'index', rows are removed. Since the default value of axis is 0, rows are removed if omitted, as shown in the first example.

In earlier versions, axis=[0, 1] removed both rows and columns, but since version 1.0.0, axis can no longer be specified with a list or tuple. If you want to remove both rows and columns, chain two dropna() calls, one per axis.

    # print(df.dropna(how='all', axis=[0, 1]))
    # TypeError: supplying multiple axes to axis is no longer supported.

    print(df.dropna(how='all').dropna(how='all', axis=1))
    #       name   age state  point
    # 0    Alice  24.0    NY    NaN
    # 2  Charlie   NaN    CA    NaN
    # 3     Dave  68.0    TX   70.0
    # 4    Ellen   NaN    CA   88.0
    # 5    Frank  30.0   NaN    NaN

Remove rows/columns that contain at least one NaN: how='any' (default)

For the examples in this section, first remove the rows and columns where all values are NaN.

    df2 = df.dropna(how='all').dropna(how='all', axis=1)
    print(df2)
    #       name   age state  point
    # 0    Alice  24.0    NY    NaN
    # 2  Charlie   NaN    CA    NaN
    # 3     Dave  68.0    TX   70.0
    # 4    Ellen   NaN    CA   88.0
    # 5    Frank  30.0   NaN    NaN

By setting how='any', rows that contain at least one NaN are removed. Since the default value of how is 'any', the result is the same even if omitted.

    print(df2.dropna(how='any'))
    #    name   age state  point
    # 3  Dave  68.0    TX   70.0

    print(df2.dropna())
    #    name   age state  point
    # 3  Dave  68.0    TX   70.0

If axis is set to 1 or 'columns', columns that contain at least one NaN are removed.

    print(df2.dropna(axis=1))
    #       name
    # 0    Alice
    # 2  Charlie
    # 3     Dave
    # 4    Ellen
    # 5    Frank

Remove rows/columns according to the number of non-missing values: thresh

With the thresh argument, you can remove rows and columns according to the number of non-missing values.

For example, if thresh=3, the rows that contain three or more non-missing values remain, and the other rows are removed.

    print(df.dropna(thresh=3))
    #     name   age state  point  other
    # 0  Alice  24.0    NY    NaN    NaN
    # 3   Dave  68.0    TX   70.0    NaN
    # 4  Ellen   NaN    CA   88.0    NaN

If axis is set to 1 or 'columns', columns are removed.

    print(df.dropna(thresh=3, axis=1))
    #       name   age state
    # 0    Alice  24.0    NY
    # 1      NaN   NaN   NaN
    # 2  Charlie   NaN    CA
    # 3     Dave  68.0    TX
    # 4    Ellen   NaN    CA
    # 5    Frank  30.0   NaN
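To make the thresh rule concrete, the sketch below counts the non-missing values per row and per column and keeps those with a count of at least 3, then checks that this matches dropna(thresh=3). The frame is rebuilt in memory from the printed sample data (an assumption, since the CSV file is not reproduced here), and the mask-based version is only an illustration of the rule, not how pandas implements it internally.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt by hand from the printed output.
df = pd.DataFrame({
    'name': ['Alice', np.nan, 'Charlie', 'Dave', 'Ellen', 'Frank'],
    'age': [24.0, np.nan, np.nan, 68.0, np.nan, 30.0],
    'state': ['NY', np.nan, 'CA', 'TX', 'CA', np.nan],
    'point': [np.nan, np.nan, np.nan, 70.0, 88.0, np.nan],
    'other': [np.nan] * 6,
})

# Non-missing values in each row and in each column.
row_counts = df.notna().sum(axis=1)
col_counts = df.notna().sum(axis=0)
print(row_counts.tolist())  # [3, 0, 2, 4, 3, 2]
print(col_counts.tolist())  # [5, 3, 4, 2, 0]

# Keep rows/columns whose count is at least the threshold.
kept_rows = df[row_counts >= 3]
kept_cols = df.loc[:, col_counts >= 3]

# Both selections agree with dropna(thresh=3).
print(kept_rows.equals(df.dropna(thresh=3)))          # True
print(kept_cols.equals(df.dropna(thresh=3, axis=1)))  # True
```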

Remove based on specific rows/columns: subset

To decide what to remove by looking only at specific columns or rows, pass a list of their labels (names) to the subset argument of dropna(). A list such as subset=['name'] works for a single label in any version; recent pandas versions also accept a bare label such as subset='name'.

Since the default is how='any' and axis=0, rows with NaN in the columns specified by subset are removed.

    print(df.dropna(subset=['age']))
    #     name   age state  point  other
    # 0  Alice  24.0    NY    NaN    NaN
    # 3   Dave  68.0    TX   70.0    NaN
    # 5  Frank  30.0   NaN    NaN    NaN

    print(df.dropna(subset=['age', 'state']))
    #     name   age state  point  other
    # 0  Alice  24.0    NY    NaN    NaN
    # 3   Dave  68.0    TX   70.0    NaN

If how is set to 'all', rows with NaN in all specified columns are removed.

    print(df.dropna(subset=['age', 'state'], how='all'))
    #       name   age state  point  other
    # 0    Alice  24.0    NY    NaN    NaN
    # 2  Charlie   NaN    CA    NaN    NaN
    # 3     Dave  68.0    TX   70.0    NaN
    # 4    Ellen   NaN    CA   88.0    NaN
    # 5    Frank  30.0   NaN    NaN    NaN
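The difference between how='any' and how='all' with subset can also be read as two boolean masks over the chosen columns. The sketch below is an illustration under the assumption that the sample frame is rebuilt in memory; the names `cols`, `mask_any`, and `mask_all` are hypothetical.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt by hand from the printed output.
df = pd.DataFrame({
    'name': ['Alice', np.nan, 'Charlie', 'Dave', 'Ellen', 'Frank'],
    'age': [24.0, np.nan, np.nan, 68.0, np.nan, 30.0],
    'state': ['NY', np.nan, 'CA', 'TX', 'CA', np.nan],
    'point': [np.nan, np.nan, np.nan, 70.0, 88.0, np.nan],
    'other': [np.nan] * 6,
})

cols = ['age', 'state']

# how='any': a row survives only if every column in cols is present.
mask_any = df[cols].notna().all(axis=1)

# how='all': a row survives if at least one column in cols is present.
mask_all = df[cols].notna().any(axis=1)

print(df[mask_any].index.tolist())  # [0, 3]
print(df[mask_all].index.tolist())  # [0, 2, 3, 4, 5]

# The masks select the same rows as dropna() with subset.
print(df[mask_any].equals(df.dropna(subset=cols)))             # True
print(df[mask_all].equals(df.dropna(subset=cols, how='all')))  # True
```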

If axis is set to 1 or 'columns', columns are removed, and subset must then list row labels.

    print(df.dropna(subset=[0, 4], axis=1))
    #       name state
    # 0    Alice    NY
    # 1      NaN   NaN
    # 2  Charlie    CA
    # 3     Dave    TX
    # 4    Ellen    CA
    # 5    Frank   NaN

    print(df.dropna(subset=[0, 4], axis=1, how='all'))
    #       name   age state  point
    # 0    Alice  24.0    NY    NaN
    # 1      NaN   NaN   NaN    NaN
    # 2  Charlie   NaN    CA    NaN
    # 3     Dave  68.0    TX   70.0
    # 4    Ellen   NaN    CA   88.0
    # 5    Frank  30.0   NaN    NaN

An error is raised if a non-existent row or column name is specified. The labels in subset must also match the axis: with axis=0 (default) they must be column names, and with axis=1 they must be row names; otherwise an error is raised.

    # print(df.dropna(subset=['age', 'state', 'xxx']))
    # KeyError: ['xxx']

    # print(df.dropna(subset=['age', 'state'], axis=1))
    # KeyError: ['age', 'state']
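When the labels come from user input or configuration, one way to avoid the KeyError is to filter them against the frame first. The following is a minimal sketch of that idea; the names `wanted` and `present` are hypothetical, and silently ignoring unknown labels is a design choice that may not suit every case.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt by hand from the printed output.
df = pd.DataFrame({
    'name': ['Alice', np.nan, 'Charlie', 'Dave', 'Ellen', 'Frank'],
    'age': [24.0, np.nan, np.nan, 68.0, np.nan, 30.0],
    'state': ['NY', np.nan, 'CA', 'TX', 'CA', np.nan],
    'point': [np.nan, np.nan, np.nan, 70.0, 88.0, np.nan],
    'other': [np.nan] * 6,
})

wanted = ['age', 'state', 'xxx']  # 'xxx' does not exist in df

# Passing an unknown label raises KeyError.
raised = False
try:
    df.dropna(subset=wanted)
except KeyError as e:
    raised = True
    print('KeyError:', e)

# Keep only the labels that are actual columns, then drop as usual.
present = [c for c in wanted if c in df.columns]
result = df.dropna(subset=present)
print(result)
```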

Update the original object: inplace

As shown in the examples above, dropna() returns a new object by default and leaves the original unchanged. With inplace=True, the original object itself is updated.

    df.dropna(subset=['age'], inplace=True)
    print(df)
    #     name   age state  point  other
    # 0  Alice  24.0    NY    NaN    NaN
    # 3   Dave  68.0    TX   70.0    NaN
    # 5  Frank  30.0   NaN    NaN    NaN
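One detail worth keeping in mind is that a call with inplace=True returns None, so assigning its result back to the variable loses the data. The sketch below uses a small hypothetical frame (`df_work`, with only two of the sample columns) to show the return value next to the reassignment style.

```python
import numpy as np
import pandas as pd

# Hypothetical two-column cut of the sample data.
df_work = pd.DataFrame({
    'name': ['Alice', np.nan, 'Charlie', 'Dave', 'Ellen', 'Frank'],
    'age': [24.0, np.nan, np.nan, 68.0, np.nan, 30.0],
})
original = df_work.copy()

# inplace=True modifies df_work and returns None.
returned = df_work.dropna(subset=['age'], inplace=True)
print(returned)                # None
print(df_work.index.tolist())  # [0, 3, 5]

# Reassignment gives the same rows without mutating the source frame.
df_new = original.dropna(subset=['age'])
print(df_new.equals(df_work))  # True
print(len(original))           # 6
```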

For pandas.Series

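Since a pandas.Series is one-dimensional, dropna() simply removes the missing elements. The thresh and subset arguments are not available, and how and axis are accepted only for compatibility with DataFrame and have no effect, so inplace is the main argument of practical use.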

    s = pd.read_csv('data/src/sample_pandas_normal_nan.csv')['age']
    print(s)
    # 0    24.0
    # 1     NaN
    # 2     NaN
    # 3    68.0
    # 4     NaN
    # 5    30.0
    # Name: age, dtype: float64

    print(s.dropna())
    # 0    24.0
    # 3    68.0
    # 5    30.0
    # Name: age, dtype: float64

    s.dropna(inplace=True)
    print(s)
    # 0    24.0
    # 3    68.0
    # 5    30.0
    # Name: age, dtype: float64
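Note that dropna() keeps the original index labels (0, 3, 5 above). If consecutive labels are wanted afterwards, one option is to follow it with reset_index(drop=True). The sketch below builds the age column in memory (an assumption, since the CSV file is not reproduced here) and also shows that boolean indexing with notna() selects the same elements.

```python
import numpy as np
import pandas as pd

# The 'age' column of the sample data, rebuilt by hand.
s = pd.Series([24.0, np.nan, np.nan, 68.0, np.nan, 30.0], name='age')

dropped = s.dropna()
print(dropped.index.tolist())  # [0, 3, 5]

# Boolean indexing with notna() selects the same elements.
print(dropped.equals(s[s.notna()]))  # True

# Renumber the index from 0 after dropping.
renumbered = dropped.reset_index(drop=True)
print(renumbered.index.tolist())  # [0, 1, 2]
print(renumbered.tolist())        # [24.0, 68.0, 30.0]
```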