Alexander Hagmann

Knowing how to export Pandas DataFrames to a CSV file is an essential skill in every data scientist’s toolkit. Pandas is a Python-based data manipulation tool that is popular in data science. Data specialists use DataFrames, a common Pandas object that represents a table, to merge, manipulate, and analyze tabular data.

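At the end of a Pandas coding session, you need to save your data and progress. The most common way to do so is to write the DataFrame to a CSV file, which is nothing more than a simple text file. This is the most common and simplest way to store and exchange tabular data. The CSV file format is so popular because it is widely supported by other applications, including Excel, Open Office, and Tableau.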

Some typical use cases for exporting DataFrames to CSV include:


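The basics of exporting Pandas DataFrames to CSV files

Let’s get to know the DataFrame df. As a first step, we have to import Pandas:

import pandas as pd

With pd.DataFrame() we can create a simple DataFrame object.

df = pd.DataFrame(data = {"Name": ["Lionel Messi", "Cristiano Ronaldo", "Neymar Junior", "Kylian Mbappe", "Manuel Neuer"], "Country": ["Argentina", "Portugal", "Brazil", "France", "Germany"], "Height_m": [1.70, 1.87, 1.75, 1.78, 1.93]})
df

A DataFrame is a two-dimensional labeled data structure. In our example, df has five rows and three columns. Each row represents a soccer player, and each column contains information about the player. The “column” on the left is not a column. It is the index of the DataFrame. The index labels the rows. If not specified, a DataFrame has a RangeIndex with ascending integers. At the top of the DataFrame are the column headers.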
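To make the description above concrete, here is a minimal sketch that rebuilds the example DataFrame and inspects its shape, index, and column headers. The variable names n_rows, n_cols, and column_labels are only illustrative.

```python
import pandas as pd

# Rebuild the example DataFrame from the text.
df = pd.DataFrame(data={
    "Name": ["Lionel Messi", "Cristiano Ronaldo", "Neymar Junior",
             "Kylian Mbappe", "Manuel Neuer"],
    "Country": ["Argentina", "Portugal", "Brazil", "France", "Germany"],
    "Height_m": [1.70, 1.87, 1.75, 1.78, 1.93],
})

# shape is (number of rows, number of columns).
n_rows, n_cols = df.shape

# No index was specified, so Pandas created a RangeIndex.
index_type = type(df.index).__name__

# The column headers are stored separately from the index.
column_labels = list(df.columns)

print(n_rows, n_cols, index_type, column_labels)
```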

To write a DataFrame to a CSV file, we can use the DataFrame method to_csv(). A simple example is:

df.to_csv("players.csv")

This creates the CSV file players.csv. When we open the file, we can see the following structure:

,Name,Country,Height_m
0,Lionel Messi,Argentina,1.7
1,Cristiano Ronaldo,Portugal,1.87
2,Neymar Junior,Brazil,1.75
3,Kylian Mbappe,France,1.78
4,Manuel Neuer,Germany,1.93

A CSV file is a delimited text file that uses a comma to separate values. You can still see the tabular data structure. Each line of the file is a data record – the soccer player. Each record consists of one or more values — player information — separated by commas.
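If you want to look at this structure without opening a file, one option is to call to_csv() without a path, in which case it returns the CSV content as a string. The sketch below uses a shortened two-row version of the example DataFrame.

```python
import pandas as pd

# Shortened version of the example DataFrame (two players only).
df = pd.DataFrame(data={
    "Name": ["Lionel Messi", "Cristiano Ronaldo"],
    "Country": ["Argentina", "Portugal"],
    "Height_m": [1.70, 1.87],
})

# Without a path, to_csv() returns the CSV text instead of writing a file.
csv_text = df.to_csv()

# Each line is one record; the first line holds the column headers.
lines = csv_text.splitlines()
for line in lines:
    print(line)
```

The first header field is empty because the index has no name.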

Depending on the use case, we can customize the export. The method to_csv() provides several options (parameters) to fine-tune the final output.

5 ways to customize Pandas to CSV

  1. Define file name and location

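The first and most important parameter is path_or_buf. Here you can define the file name, the file type, and the location of the file.

players is a suitable file name. You can choose another file name, but don’t use any spaces (football players) or special characters. Use underscores if your file name contains two or more words (football_players).

Use the CSV file type (.csv) unless otherwise specified. Alternatively, you can use the .txt extension.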

Saving in current working directory

If you don’t specify a location with a full path, Pandas saves the file in your current working directory (CWD):

df.to_csv(path_or_buf = "players.csv")

This saves players.csv in your CWD. Note that you can omit “path_or_buf =”.
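If you are unsure where the CWD is, a small sketch like the following can help. It uses os.getcwd() and os.path.abspath() from the standard library to show where a relative file name ends up. The one-row DataFrame is just a placeholder.

```python
import os

import pandas as pd

# Placeholder DataFrame; any DataFrame works here.
df = pd.DataFrame(data={"Name": ["Lionel Messi"], "Country": ["Argentina"]})

# A bare file name is resolved relative to the current working directory.
df.to_csv("players.csv")

# Show the CWD and the absolute path of the file that was just written.
cwd = os.getcwd()
saved_path = os.path.abspath("players.csv")
print(cwd)
print(saved_path)
```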

Saving in a specified location

The CWD can vary, depending on your system and your Python installation. Therefore, you may define a specified location by adding the full file path. To save players.csv on a Windows desktop, you prepend the path C:\Users\Alex\Desktop\ to the file name players.csv.

The full filename on Windows is: C:\Users\Alex\Desktop\players.csv

The full filename on macOS and Linux is: /Users/Alex/Desktop/players.csv

Please note that Windows employs the backslash (“\”) instead of the slash (“/”). Since the backslash is a special character in Python (it starts escape sequences such as \U), the following code raises a SyntaxError:

df.to_csv("C:\Users\Alex\Desktop\players.csv")

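There are two ways to solve this problem. You can use forward slashes, or you can mark the string as a raw string with the prefix r:

df.to_csv("C:/Users/Alex/Desktop/players.csv")
df.to_csv(r"C:\Users\Alex\Desktop\players.csv")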
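The following sketch checks that both spellings refer to the same Windows location. It uses PureWindowsPath from the standard library, which only compares paths and never touches the file system, so it runs on any operating system. The third variant with doubled backslashes is an addition that is not part of the two solutions above.

```python
from pathlib import PureWindowsPath

# Solution 1: forward slashes.
forward = "C:/Users/Alex/Desktop/players.csv"

# Solution 2: a raw string, in which backslashes are taken literally.
raw = r"C:\Users\Alex\Desktop\players.csv"

# Additional variant (not from the text): escape each backslash.
escaped = "C:\\Users\\Alex\\Desktop\\players.csv"

# The raw and the escaped string contain exactly the same characters.
same_text = raw == escaped

# Windows path semantics treat "/" and "\" as equivalent separators.
same_location = PureWindowsPath(forward) == PureWindowsPath(raw)

print(same_text, same_location)
```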

On macOS and Linux, the best solution is:

df.to_csv("/Users/Alex/Desktop/players.csv")
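If the same script has to run on several operating systems, one possible approach is to build the path with pathlib and let Python pick the right separator. This is a sketch, not the only way to do it. The temporary directory below is a stand-in for a real target folder such as a desktop folder.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Placeholder DataFrame; any DataFrame works here.
df = pd.DataFrame(data={"Name": ["Lionel Messi"], "Country": ["Argentina"]})

# Stand-in for a real folder: a fresh temporary directory.
target_dir = Path(tempfile.mkdtemp())

# The "/" operator joins path parts with the separator of the current OS.
target_file = target_dir / "players.csv"

# to_csv() accepts path objects as well as plain strings.
df.to_csv(target_file)

print(target_file)
print(target_file.exists())
```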

  2. Exporting the Index

The to_csv() method exports the index by default. You can drop the index by adding index = False:

df.to_csv("players.csv", index = False)

Let’s take a look at the CSV file:

Name,Country,Height_m
Lionel Messi,Argentina,1.7
Cristiano Ronaldo,Portugal,1.87
Neymar Junior,Brazil,1.75
Kylian Mbappe,France,1.78
Manuel Neuer,Germany,1.93

A simple rule: If your DataFrame has a default RangeIndex, don’t export the index as it doesn’t contain any valuable information. If you reimport the dataset from CSV with pd.read_csv(), the exported index comes back as an extra column (labeled Unnamed: 0) next to the new RangeIndex, so the index is effectively listed twice in your DataFrame.
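The effect of this rule can be checked with a short round trip. The sketch below keeps everything in memory with io.StringIO instead of writing files, and it uses a shortened version of the example DataFrame.

```python
import io

import pandas as pd

# Shortened example DataFrame with a default RangeIndex.
df = pd.DataFrame(data={
    "Name": ["Lionel Messi", "Cristiano Ronaldo"],
    "Height_m": [1.70, 1.87],
})

# Export once with the index (default) and once without it.
with_index = df.to_csv()
without_index = df.to_csv(index=False)

# Reimport both versions.
reimported_with = pd.read_csv(io.StringIO(with_index))
reimported_without = pd.read_csv(io.StringIO(without_index))

# The exported index shows up as an extra, unnamed column.
print(list(reimported_with.columns))
print(list(reimported_without.columns))
```

The first reimport carries the old index as a column called Unnamed: 0, while the second one has only the original columns.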

When should you export the index? In cases where you have important information in the index. The following DataFrame stocks contains the stock prices of Microsoft (MSFT) and Apple (AAPL):

This DataFrame has an index with datetime information, which is a DatetimeIndex. In this example, you shouldn’t drop the index.

stocks.to_csv("stocks.csv")

The CSV file stocks.csv still contains the datetime information:

Date,AAPL,MSFT
2020-05-04,293.16,178.84
2020-05-05,297.56,180.76
2020-05-06,300.63,182.54
2020-05-07,303.74,183.60
2020-05-08,310.13,184.68
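The sketch below rebuilds a shortened stocks DataFrame from the values shown above and reads the exported CSV back in. How the original stocks DataFrame was created is not shown in the text, so the construction here is an assumption. The read_csv() arguments index_col and parse_dates are one way to get the DatetimeIndex back.

```python
import io

import pandas as pd

# Assumed construction of the stocks DataFrame (first three rows only).
stocks = pd.DataFrame(
    data={"AAPL": [293.16, 297.56, 300.63], "MSFT": [178.84, 180.76, 182.54]},
    index=pd.to_datetime(["2020-05-04", "2020-05-05", "2020-05-06"]),
)
stocks.index.name = "Date"

# Export with the index (default), kept in memory for this sketch.
csv_text = stocks.to_csv()
lines = csv_text.splitlines()

# Reimport: use the Date column as index and parse it as datetime.
restored = pd.read_csv(io.StringIO(csv_text), index_col="Date", parse_dates=True)

print(lines[0])
print(type(restored.index).__name__)
```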
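  3. Selecting columns

If not specified, to_csv() writes all columns of a DataFrame to CSV. You may select one or many columns and omit all other columns.

Create a list (my_list) with those columns you wish to export (e.g. Name and Country).

my_list = ["Name", "Country"]

Pass my_list to columns =

df.to_csv(..., columns = my_list)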
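As a runnable illustration of the column selection, the following sketch exports only Name and Country from a shortened example DataFrame and returns the result as a string. The index is dropped here to keep the output compact.

```python
import pandas as pd

# Shortened example DataFrame.
df = pd.DataFrame(data={
    "Name": ["Lionel Messi", "Cristiano Ronaldo"],
    "Country": ["Argentina", "Portugal"],
    "Height_m": [1.70, 1.87],
})

# Columns to export; Height_m is left out on purpose.
my_list = ["Name", "Country"]

# Only the listed columns are written.
csv_text = df.to_csv(columns=my_list, index=False)
lines = csv_text.splitlines()
print(csv_text)
```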
  4. Exporting column headers

By default, the to_csv() method writes the column headers (e.g. Country) to CSV. You can drop these column labels by adding header = False

df.to_csv(..., header = False)
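Here is a short sketch of an export without column headers, again on a shortened example DataFrame and with the index dropped. The first line of the output is then already a data record.

```python
import pandas as pd

# Shortened example DataFrame.
df = pd.DataFrame(data={
    "Name": ["Lionel Messi", "Cristiano Ronaldo"],
    "Country": ["Argentina", "Portugal"],
    "Height_m": [1.70, 1.87],
})

# header=False drops the line with the column labels.
csv_text = df.to_csv(index=False, header=False)
lines = csv_text.splitlines()
print(csv_text)
```

Keep in mind that a file without headers no longer says what each column means, so whoever reads it has to know the column order.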
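  5. Be careful with all other options

There are more than a dozen other parameters to further customize the export with to_csv(). It’s best to use the default settings here.

In rare cases, alternative settings may be appropriate. Let’s consider two more options:

Changing the delimiter (not recommended)

In a CSV file, values are separated by commas. You can change the delimiter and use, for example, a semicolon (“;”). Pass the desired delimiter to sep =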

df.to_csv(..., sep = ";")
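The sketch below shows what a semicolon-delimited export looks like and why the setting needs care: the same delimiter has to be passed to pd.read_csv() when reading the data back.

```python
import io

import pandas as pd

# Shortened example DataFrame.
df = pd.DataFrame(data={
    "Name": ["Lionel Messi", "Cristiano Ronaldo"],
    "Height_m": [1.70, 1.87],
})

# Export with a semicolon instead of a comma.
csv_text = df.to_csv(sep=";", index=False)
lines = csv_text.splitlines()

# The reader must be told about the non-default delimiter.
restored = pd.read_csv(io.StringIO(csv_text), sep=";")

print(csv_text)
print(list(restored.columns))
```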

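Defining an alternative representation for missing data (not recommended)

When writing a DataFrame to CSV, missing data is represented by an empty string (“”). You can change this by passing a replacement string to na_rep =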

df.to_csv(..., na_rep = "None")
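To compare both representations, the following sketch exports a small DataFrame with one missing height, once with the default setting and once with na_rep. The player name "Unknown Player" is made up for illustration.

```python
import pandas as pd

# Small DataFrame with one missing value (NaN) in Height_m.
df = pd.DataFrame(data={
    "Name": ["Lionel Messi", "Unknown Player"],
    "Height_m": [1.70, float("nan")],
})

# Default: the missing value becomes an empty field.
default_lines = df.to_csv(index=False).splitlines()

# With na_rep, the missing value is written as the given string.
custom_lines = df.to_csv(index=False, na_rep="None").splitlines()

print(default_lines[2])
print(custom_lines[2])
```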

Data scientists frequently write Pandas DataFrames to CSV. The to_csv() method provides many options to customize the export. If you want to save your data until the next coding session, do the following:

df.to_csv("file_name.csv", index = False)

This allows you to reimport the data into Pandas with simple code:

pd.read_csv("file_name.csv", ...)
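Put together, a save-and-reload cycle could look like the sketch below. It writes into a temporary directory so that it does not leave files behind in your project. The directory and the file name file_name.csv are placeholders.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Shortened example DataFrame.
df = pd.DataFrame(data={
    "Name": ["Lionel Messi", "Cristiano Ronaldo"],
    "Country": ["Argentina", "Portugal"],
    "Height_m": [1.70, 1.87],
})

# Placeholder location for this sketch.
file_path = Path(tempfile.mkdtemp()) / "file_name.csv"

# Save at the end of the session, without the default RangeIndex.
df.to_csv(file_path, index=False)

# Reload at the start of the next session.
restored = pd.read_csv(file_path)

print(restored)
```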

In all other cases, you can customize the export to your needs.
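Now that you are equipped to perform this important Pandas task, you can learn more about Pandas in the documentation or by starting a Pandas bootcamp.

Page Last Updated: August 2020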
