  • Python: merging multiple CSV files

    2021-07-20 14:39:45

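    pandas provides the concat function for merging two or more CSV files.
    1. Row-wise merge

    import pandas as pd

    f1 = pd.read_csv('file1.csv')
    f2 = pd.read_csv('file2.csv')
    file = [f1, f2]
    train = pd.concat(file)  # stack the rows of f2 below the rows of f1
    train.to_csv("file3" + ".csv", index=False, sep=',')

    2. Column-wise merge: set axis=1 in concat

    import pandas as pd

    f1 = pd.read_csv('file1.csv')
    f2 = pd.read_csv('file2.csv')
    file = [f1, f2]
    train = pd.concat(file, axis=1)  # place the columns of f2 next to those of f1
    train.to_csv("file3" + ".csv", index=False, sep=',')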
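    To make the difference between the two modes concrete, here is a minimal sketch that uses small in-memory tables instead of files. The column names and values are made up for illustration, and ignore_index=True is an optional extra that is not part of the snippets above.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-ins for file1.csv and file2.csv.
f1 = pd.read_csv(io.StringIO("id,x\n1,10\n2,20\n"))
f2 = pd.read_csv(io.StringIO("id,x\n3,30\n4,40\n"))

# Row-wise: stack the frames; ignore_index=True renumbers the rows 0..n-1.
rows = pd.concat([f1, f2], ignore_index=True)

# Column-wise: place the frames side by side, aligned on the row index.
cols = pd.concat([f1, f2], axis=1)

print(rows.shape)  # (4, 2)
print(cols.shape)  # (2, 4)
```

    Note that axis=1 aligns rows by index position, not by a key column, so it only suits files whose rows are already in matching order.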
  • OK I have read several threads here on Stack Overflow... I thought this would be fairly easy for me to do but I find that I still do not have a very good grasp of Python. I tried the example located a...

    OK I have read several threads here on Stack Overflow. I thought this would be fairly easy for me to do but I find that I still do not have a very good grasp of Python. I tried the example located at How to combine 2 csv files with common column value, but both files have different number of lines and that was helpful but I still do not have the results that I was hoping to achieve.

    Essentially I have 2 csv files with a common first column. I would like to merge the 2. i.e.

    filea.csv

    title,stage,jan,feb
    darn,3.001,0.421,0.532
    ok,2.829,1.036,0.751
    three,1.115,1.146,2.921

    fileb.csv

    title,mar,apr,may,jun,
    darn,0.631,1.321,0.951,1.751
    ok,1.001,0.247,2.456,0.3216
    three,0.285,1.283,0.924,956

    output.csv (not the one I am getting but what I want)

    title,stage,jan,feb,mar,apr,may,jun
    darn,3.001,0.421,0.532,0.631,1.321,0.951,1.751
    ok,2.829,1.036,0.751,1.001,0.247,2.456,0.3216
    three,1.115,1.146,2.921,0.285,1.283,0.924,956

    output.csv (the output that I actually got)

    title,feb,may
    ok,0.751,2.456
    three,2.921,0.924
    darn,0.532,0.951

    The code I was trying:

    '''
    testing merging of 2 csv files
    '''
    import csv
    import array
    import os

    with open('Z:\\Desktop\\test\\filea.csv') as f:
        r = csv.reader(f, delimiter=',')
        dict1 = {row[0]: row[3] for row in r}

    with open('Z:\\Desktop\\test\\fileb.csv') as f:
        r = csv.reader(f, delimiter=',')
        #dict2 = {row[0]: row[3] for row in r}
        dict2 = {row[0:3] for row in r}

    print str(dict1)
    print str(dict2)

    keys = set(dict1.keys() + dict2.keys())

    with open('Z:\\Desktop\\test\\output.csv', 'wb') as f:
        w = csv.writer(f, delimiter=',')
        w.writerows([[key, dict1.get(key, "''"), dict2.get(key, "''")] for key in keys])

    Any help is greatly appreciated.

    Solution

    When I'm working with csv files, I often use the pandas library. It makes things like this very easy. For example:

    import pandas as pd

    a = pd.read_csv("filea.csv")
    b = pd.read_csv("fileb.csv")
    b = b.dropna(axis=1)
    merged = a.merge(b, on='title')
    merged.to_csv("output.csv", index=False)

    Some explanation follows. First, we read in the csv files:

    >>> a = pd.read_csv("filea.csv")
    >>> b = pd.read_csv("fileb.csv")
    >>> a
       title  stage    jan    feb
    0   darn  3.001  0.421  0.532
    1     ok  2.829  1.036  0.751
    2  three  1.115  1.146  2.921
    >>> b
       title    mar    apr    may       jun  Unnamed: 5
    0   darn  0.631  1.321  0.951    1.7510         NaN
    1     ok  1.001  0.247  2.456    0.3216         NaN
    2  three  0.285  1.283  0.924  956.0000         NaN

    and we see there's an extra column of data (note that the first line of fileb.csv -- title,mar,apr,may,jun, -- has an extra comma at the end). We can get rid of that easily enough:

    >>> b = b.dropna(axis=1)
    >>> b
       title    mar    apr    may       jun
    0   darn  0.631  1.321  0.951    1.7510
    1     ok  1.001  0.247  2.456    0.3216
    2  three  0.285  1.283  0.924  956.0000
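    One caveat worth illustrating: dropna(axis=1) drops every column that contains at least one NaN, so a real column with a single missing cell would disappear too. The sketch below is a variation on the answer, not part of it. It assumes a modified copy of fileb.csv in which one apr value is blank (that gap is invented for the demonstration) and uses how="all" to drop only columns that are entirely empty.

```python
import io

import pandas as pd

# Stand-in for fileb.csv with the trailing comma in the header.
# The blank apr value in the "ok" row is a made-up gap for this demo.
raw = (
    "title,mar,apr,may,jun,\n"
    "darn,0.631,1.321,0.951,1.751\n"
    "ok,1.001,,2.456,0.3216\n"
    "three,0.285,1.283,0.924,956\n"
)
b = pd.read_csv(io.StringIO(raw))

# Drops any column containing a NaN: removes "Unnamed: 5" but also "apr".
strict = b.dropna(axis=1)

# Drops only columns where every value is NaN: removes just "Unnamed: 5".
cleaned = b.dropna(axis=1, how="all")

print(list(strict.columns))
print(list(cleaned.columns))
```

    With the original fileb.csv, which has no missing cells, both calls give the same result.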

    Now we can merge a and b on the title column:

    >>> merged = a.merge(b, on='title')
    >>> merged
       title  stage    jan    feb    mar    apr    may       jun
    0   darn  3.001  0.421  0.532  0.631  1.321  0.951    1.7510
    1     ok  2.829  1.036  0.751  1.001  0.247  2.456    0.3216
    2  three  1.115  1.146  2.921  0.285  1.283  0.924  956.0000

    and finally write this out:

    >>> merged.to_csv("output.csv", index=False)

    producing:

    title,stage,jan,feb,mar,apr,may,jun
    darn,3.001,0.421,0.532,0.631,1.321,0.951,1.751
    ok,2.829,1.036,0.751,1.001,0.247,2.456,0.3216
    three,1.115,1.146,2.921,0.285,1.283,0.924,956.0
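    The question also mentions files with different numbers of lines. By default merge performs an inner join, so titles present in only one file are dropped. A minimal sketch of the alternative, assuming made-up data in which each table has one title the other lacks, is to pass how="outer"; indicator=True adds a _merge column showing where each row came from.

```python
import io

import pandas as pd

# Hypothetical tables: "darn" exists only in a, "three" only in b.
a = pd.read_csv(io.StringIO("title,stage,jan\ndarn,3.001,0.421\nok,2.829,1.036\n"))
b = pd.read_csv(io.StringIO("title,mar\nok,1.001\nthree,0.285\n"))

# Default how="inner": keeps only titles found in both tables.
inner = a.merge(b, on="title")

# how="outer": keeps every title and fills the missing cells with NaN.
outer = a.merge(b, on="title", how="outer", indicator=True)

print(inner)
print(outer)
```

    how="left" or how="right" would instead keep all rows of just one of the two tables.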

  • Merging multiple CSV files with Python

    2021-07-16 13:28:22

    I am new to Python and have got to the point where I have created multiple CSV files from large text files. My CSV files look like this:

    CSV1

    ABC, 1
    DEF, 2
    GHI, 3

    CSV2

    ABC, 4
    DEF, 5
    GHI, 6

    and so on, for up to 15 CSV files.

    I would like to create a combined CSV file that looks something like this:

    ABC, 1, 4
    DEF, 2, 5
    GHI, 3, 6

    Any pointers on how to do this are appreciated.

    Solution

    Assuming all the CSV files are of the same length and contain the same first column in the same order, something like this might work for you:

    list_of_files = ['csv1.csv', 'csv2.csv', 'csv3.csv']

    # Use the first file as a template
    with open(list_of_files[0], 'r') as f:
        output_text = [line.strip() for line in f]

    # Append the values to the end of the lines
    for fn in list_of_files[1:]:
        with open(fn, 'r') as f:
            for i, line in enumerate(f):
                key, value = line.strip().split(",")
                output_text[i] += "," + value

    # Dump result to new csv
    with open("result.csv", 'w') as f:
        f.write("\n".join(output_text))
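    If the files cannot be assumed to list the keys in the same order, the values can be matched by key instead of by line position. The following is a sketch under that assumption, not part of the answer above: merge_by_key is a hypothetical helper, and the in-memory inputs stand in for the real files.

```python
import csv
import io


def merge_by_key(sources):
    """Merge two-column CSV sources on their first column.

    `sources` is a sequence of open text files (or other iterables of lines).
    Returns rows of the form [key, value_from_source_1, value_from_source_2, ...];
    a key missing from a source gets an empty string in that position.
    """
    sources = list(sources)
    merged = {}  # key -> list with one slot per source
    for position, source in enumerate(sources):
        # skipinitialspace=True drops the blank after each comma ("ABC, 1").
        for row in csv.reader(source, skipinitialspace=True):
            if not row:
                continue  # ignore empty lines
            key, value = row[0], row[1]
            merged.setdefault(key, [""] * len(sources))[position] = value
    return [[key, *values] for key, values in merged.items()]


# Made-up inputs; the second one lists its keys in a different order.
csv1 = io.StringIO("ABC, 1\nDEF, 2\nGHI, 3\n")
csv2 = io.StringIO("GHI, 6\nABC, 4\nDEF, 5\n")
rows = merge_by_key([csv1, csv2])
print(rows)  # [['ABC', '1', '4'], ['DEF', '2', '5'], ['GHI', '3', '6']]
```

    The rows come out in the order in which each key is first seen, and can be written to disk with csv.writer(...).writerows(rows).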

  • Merging multiple CSV files with Python

    2018-08-24 19:12:54

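    Import the required packages

    import os
    import pandas as pd
    import glob

    Merge multiple CSV files

    # CSV files in the current folder, excluding the output file from an earlier run
    csv_list = [p for p in glob.glob('*.csv') if p != 'result.csv']
    print('Found %s CSV files' % len(csv_list))
    print('Processing............')
    for i in csv_list:  # read each CSV file in the folder in turn
        fr = open(i,'rb').read()
        with open('result.csv','ab') as f:  # append the contents to result.csv
            f.write(fr)
    print('Merge complete!')
    Found 9 CSV files
    Processing............
    Merge complete!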
    

    Removing duplicates

    This step removes duplicated rows, mainly the repeated header rows.

    df = pd.read_csv("result.csv",header=0)
    df.info()
    <class 'pandas.core.frame.DataFrame'>
    RangeIndex: 659867 entries, 0 to 659866
    Data columns (total 3 columns):
    UrbanRuralCode    659867 non-null object
    code              659867 non-null object
    name              659867 non-null object
    dtypes: object(3)
    memory usage: 15.1+ MB

    IsDuplicated = df.duplicated()
    # .any() checks the values; `True in IsDuplicated` would test the index labels instead
    IsDuplicated.any()
    True
    

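    This shows that the DataFrame contains duplicate rows.

    Using DataFrame.drop_duplicates

    DataFrame.drop_duplicates(subset=None, keep='first', inplace=False)
    • subset : column label or sequence of labels, optional
      Columns to consider when identifying duplicates; all columns by default
    • keep : {'first', 'last', False}, default 'first'
      Which duplicates to keep: 'first' keeps the first occurrence, 'last' keeps the last, and False drops all of them
    • inplace : boolean, default False
      Whether to modify the DataFrame in place or return a new copy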
    datalist = df.drop_duplicates(keep = False)
    datalist.info()
    <class 'pandas.core.frame.DataFrame'>
    Int64Index: 659859 entries, 0 to 659866
    Data columns (total 3 columns):
    UrbanRuralCode    659859 non-null object
    code              659859 non-null object
    name              659859 non-null object
    dtypes: object(3)
    memory usage: 20.1+ MB
    

    Sorting

    datalist_sorted = datalist.sort_values(by = ['code'])  # sort ascending by the code column

    Writing the result to a CSV file

    datalist_sorted.to_csv("village_all.csv", sep = ',', header = True,index = False)

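    Problems

    File-reading error in Python

    Error message

    "UnicodeDecodeError: 'gbk' codec can't decode byte 0x80 in position 205: illegal multibyte sequence"

    Solution

    Change fr = open(i,'r').read() to fr = open(i,'rb').read()
    Change with open('result.csv','a') as f: to with open('result.csv','ab') as f:

    Duplicate values

    I merged 9 CSV files here, and the final result still contained a column-name row. Of the 9 files, the header of the first became the DataFrame's column names, while the headers of the other 8 were read as data rows. During deduplication with the default keep='first', one of those 8 identical rows was kept, so the final CSV file contained an extra header row.

    Solution

    Change IsDuplicated = df.duplicated() to IsDuplicated = df.duplicated(keep = False)  # remove every copy of a duplicated row; genuinely repeated data rows are removed as well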
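    An alternative that avoids the stray header rows in the first place is to write the header once and skip it in every later file, so no deduplication pass is needed. The following is a minimal sketch using only the standard library. concat_csv is a hypothetical helper, the demo files are throwaway examples, and the utf-8 default is an assumption: files that trigger the 'gbk' error above would need their actual encoding passed in.

```python
import csv
import glob
import os
import tempfile


def concat_csv(paths, out_path, encoding="utf-8"):
    """Append the rows of every CSV in `paths` to `out_path`, writing the header once."""
    header_written = False
    with open(out_path, "w", newline="", encoding=encoding) as out:
        writer = csv.writer(out)
        for path in paths:
            with open(path, newline="", encoding=encoding) as src:
                reader = csv.reader(src)
                header = next(reader, None)
                if header is None:
                    continue  # skip empty files
                if not header_written:
                    writer.writerow(header)
                    header_written = True
                writer.writerows(reader)  # remaining lines are data rows


# Demo with two throwaway files in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    for name, body in [("a.csv", "code,name\n1,x\n"), ("b.csv", "code,name\n2,y\n")]:
        with open(os.path.join(tmp, name), "w", newline="", encoding="utf-8") as f:
            f.write(body)
    inputs = sorted(glob.glob(os.path.join(tmp, "*.csv")))
    out_file = os.path.join(tmp, "merged.out")
    concat_csv(inputs, out_file)
    with open(out_file, newline="", encoding="utf-8") as f:
        merged_rows = list(csv.reader(f))

print(merged_rows)  # [['code', 'name'], ['1', 'x'], ['2', 'y']]
```

    This sketch assumes every input file has the same header; it does not check that the columns match.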
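  • This article shares a Python method for processing multiple CSV files one by one; it may serve as a useful reference.
  • For a project's data extraction, two different CSV files need to be merged into one file, with rows joined on common columns (that is, one or more columns whose values should match across the two files). The code is as follows: import pandas as pd # read the data r1= pd.read_...
  • Python 3: this code automatically merges all CSV files in a directory and removes the repeated headers.
  • readList = [] # holds the contents of the merged files ...############# read the contents of multiple CSV files ################### with open('{}.csv'.format(i),'r',newline="",encoding="GB18030") as read_csvfile: readcsv_.
  • Merging multiple CSV files csv_list = glob.glob('*.csv') # list the CSV files in the current folder print('Found %s CSV files' % len(csv_list)) print('Processing............') for i in csv_list: # read each CSV file in the folder in turn fr...
  • Merging multiple CSV / TXT files into one file with Python. During data processing I needed to join several files one-to-one into a single file. For example, there is a group of people, each with different features, and each feature is stored in its own CSV file; the features ultimately need to be joined horizontally,...
  • import glob import os import pandas as pd def mkdir(path): folder = os.path.exists(path) if not folder: # create the folder if it does not exist...def merge_csv(csv_path, save_path, save_name): inputfile
  • [Using Python 3] I am new to (Python) programming, but I am writing a script that scans a folder for certain CSV files, which I then want to read and append to another CSV file. In between, data should be returned only when the value in a particular column matches a set condition....
  • Merging multiple CSV files with Python and removing duplicates

    2018-07-06 10:26:43
    #coding=utf-8 import os import pandas as pd import glob def hebing(): csv_list = glob.glob('*.csv') print('Found %s CSV files' % len(csv_list)) print('Processing............') for i in csv_list: fr...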
  • I am a beginner with Python. I have multiple CSV files (more than 10), and all of them have same number of columns. I would like to merge all of them into a single CSV file, where I will not have head...
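  • I have two CSV files, as shown below. CSV1 data13 data23 d main_data1;main_data2 data13 data23 data12 data22 d main_data1;main_data2 data12 data22 data11 data21 ...
  • glob is a file-handling module that ships with Python; it can be used to find files that match a given pattern. Code: import os import pandas as pd import glob os.chdir(r'C:\Users\Administrator\Desktop\XJTU-SY_Bearing_Datasets\XJTU-SY_...
  • Merging two CSV files is actually fairly simple, but I ran into quite a few problems along the way, so I am summarizing the pitfalls here in the hope that it helps. First, the approach I used initially: I wanted to merge by appending after the first newline '\n', so that...
  • Using Python 3.5, # coding=utf-8 import glob import pandas as pd def mergeCSV(): byte = b'\r\n' # line terminator csv_list = glob.glob('D:\\360极速... print('Found %s CSV files' % len(csv_list)) print(u'Processing
  • Merging (concatenating) multiple CSV files with Python

    2017-07-31 19:33:29
    I have been doing data analysis and mining recently and often need to merge CSV files. As Python practice, I used the pandas library to concatenate them, and I am recording the approach here to share. If you have a better method, comments are welcome.
  • Background: the files in folder 2019 are CSV files, and the files in folder 2020...Steps: (a) iterate over all files in the excel2020 folder and merge them into one CSV table in the 2019 folder; (b) iterate over all file names in the 2019 folder and merge them into the final CSV table. # merge the data into the Alldata file under the Alldata folder i...
  • Python: batch-merging .dat files into a single CSV file, with code and demo data that can be run directly. Goal: merge all .dat files under the folder ZW (including subfolders) and save them to results.csv for later processing and analysis in Excel.
  • Merging multiple CSV files with Python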

    2021-08-24 10:28:42
    url = "D:\\" new_url = "D:\\" file = [] for path in os.listdir(url): file.append(pd.read_csv(url+path)) ...all_data.to_csv(new_url + "all_data_result2" + ".csv", encoding="gb18030",index=0)
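  • For all files with the .csv suffix in a folder: import glob import os filelist = [] os.chdir("folderwithcsvs/") for counter, files in enumerate(glob.glob("*.csv")): filelist.append(files) print "do stuff with file:", files...
  • I recently tried processing CSV data with Python. Because there were several CSV files that had to be concatenated before processing, I ran into the problem of merging files with Python. I tried two approaches: calling a CMD command, and writing a Python program. After trying both, I found that the...
  • Batch-merging the .csv data in a folder with Python. Goal: a folder contains n .csv data files, and each file whose name ends in ts.csv should be merged with the .csv file that has the matching number. This figure shows the data format in the wf.csv file; the ts.csv file is a single column of data with the same number of rows as that file,...
  • Python: merging CSV files into a multi-sheet workbook 2. CSV merge method import pandas as pd analysis_file = outDir+delimiter+'analysis_result.xlsx' writer = pd.ExcelWriter(analysis_file) csv_file1 = pd.read_csv...
  • I have two .csv files with the same initial column headers: NAME RA DEC Mean_I1 Mean_I2 alpha_K24 class alpha_K8 class.1 Av avgAvMon-000101 100.27242 9.608597 11.082 10.034 0.39 I 0.39 I...
  • I am trying to merge multiple CSV files into one Excel file, where each file becomes its own worksheet in the xls file. Below is a Python script that converts all CSV files in a folder into their own Excel files. import os import glob import csv from ...
  • However, when I tried to apply the same logic to merge two files on a common column, it failed. The problem is that the second loop runs to completion before the top loop runs (I don't know why this happens). I tried using NumPy: buys = np.genfromtxt('buys_dtsep.dat',delimiter=",",dtype=...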
