A Collection of Python Programs for Comparing Two Files and Two Folders [Recommended]

Updated: 2019-09-11  Source: python


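This Python program compares two files. If they differ, it reports the row number and column number of the first difference.

The Python source code for the comparison follows: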

# Compare two files with Python
# Returns 0 if they are the same

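def cmpstr(str1, str2):
    # Return the 1-based column of the first difference, or 0 if the strings are equal
    col = 0
    for c1, c2 in zip(str1, str2):
        if c1 != c2:
            break
        col += 1

    # The loop ends either at a mismatch or when the shorter string runs out,
    # so a difference in length also counts as a difference
    if col < len(str1) or col < len(str2):
        return col+1
    else :
        return 0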

file1 = open("a.txt","r")
file2 = open("b.txt","r")

fa = file1.readlines()
fb = file2.readlines()
file1.close()
file2.close()

# Decode as GBK so that Chinese characters are handled correctly
fa = [ line.decode("gbk") for line in fa]
fb = [ line.decode("gbk") for line in fb]

row = 0
col = 0

# Start comparing the contents
for str1, str2 in zip(fa, fb):
    col = cmpstr(str1,str2)
    # col == 0 means the two lines are equal
    if col == 0 :
        row += 1
        continue
    else:
        break

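# A line differs, or the two files have different numbers of lines
if col != 0 or len(fa) != len(fb):
    # Print the row and column of the difference, then the previous line and the
    # start of the current line; the last two characters shown mark the difference.
    # Slices are used so that a missing line prints as empty text instead of raising an error.
    print "row:", row+1, "col:", col
    print "file a is:\n", "".join(fa[max(row-1, 0):row]), "".join(fa[row:row+1])[:col+1], "\n"
    print "file b is:\n", "".join(fb[max(row-1, 0):row]), "".join(fb[row:row+1])[:col+1], "\n"
else :
    print "All are same!"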

# Wait for user input.
raw_input("Press Enter to exit.")
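
The listing above targets Python 2 (print statement, `str.decode`, `raw_input`). As a rough illustration of the same idea in Python 3, the sketch below finds the first differing row and column. The names `first_difference` and `compare_files` are hypothetical and do not come from the original program, and the GBK default is only an assumption carried over from the listing above.

```python
from itertools import zip_longest


def first_difference(lines_a, lines_b):
    """Return (row, col) of the first difference, both 1-based, or None if equal."""
    for row, (a, b) in enumerate(zip_longest(lines_a, lines_b), start=1):
        if a == b:
            continue
        # A missing line (None) is treated as an empty string, so an extra
        # line in one file is reported at column 1 of that line.
        a, b = a or "", b or ""
        for col, (c1, c2) in enumerate(zip_longest(a, b), start=1):
            if c1 != c2:
                return row, col
        # Only reached when one side is missing and the other is an empty string.
        return row, 1
    return None


def compare_files(path_a, path_b, encoding="gbk"):
    """Read both files as text and report the first difference (assumed helper)."""
    with open(path_a, encoding=encoding) as fa, open(path_b, encoding=encoding) as fb:
        return first_difference(fa.readlines(), fb.readlines())
```

Using `zip_longest` instead of `zip` means a difference in length (of a line or of the whole file) shows up as an ordinary mismatch, so no separate length check is needed after the loop.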

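Comparing the Differences Between Two Files in Python

The Python code below compares two files and writes the lines that are unique to each file to two output files.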

#version 0
import sys

f1 = open(sys.argv[1], "r")
f2 = open(sys.argv[2], "r")

fileOne = f1.readlines()
fileTwo = f2.readlines()

f1.close()
f2.close()

outFile1 = open(sys.argv[3], "w")
outFile2 = open(sys.argv[4], "w")

for i in fileOne:
        if not i in fileTwo:
                outFile1.write(i)

for i in fileTwo:
        if not i in fileOne:
                outFile2.write(i)

outFile1.close()
outFile2.close()

#first refactoring
import sys
from operator import attrgetter,itemgetter

#verify inputs
USAGE="""
%s file1 file2 output1 output2
"""% __file__

if len(sys.argv)<5:
        print USAGE
        sys.exit(2)

#open files with try
try:
        f1 = open(sys.argv[1], "r")
        f2 = open(sys.argv[2], "r")
except Exception,e:
        print "encounter issues %s, while opening in files: %s %s" % (str(e),itemgetter(1)(sys.argv),itemgetter(2)(sys.argv))
        sys.exit(1)

fileOne = f1.readlines()
fileTwo = f2.readlines()

f1.close()
f2.close()

#open files with try
try:
        outFile1 = open(sys.argv[3], "w")
        outFile2 = open(sys.argv[4], "w")
except Exception,e:
        print "encounter issues %s, while opening out files: %s %s" % (str(e),itemgetter(3)(sys.argv),itemgetter(4)(sys.argv))
        sys.exit(1)

l_minus=lambda x,y:list(set(x)-set(y))

#lines read by readlines() keep their trailing newline, so join them with an empty string
outFile1.write("".join(l_minus(fileOne,fileTwo)))
outFile2.write("".join(l_minus(fileTwo,fileOne)))

outFile1.close()
outFile2.close()

#second refactoring
import sys
from operator import attrgetter,itemgetter

#verify inputs
USAGE="""
%s file1 file2 output1 output2
"""% __file__

if len(sys.argv)<5:
        print USAGE
        sys.exit(2)

#open input files using a with statement
with open(itemgetter(1)(sys.argv), "r") as f1, open(itemgetter(2)(sys.argv), "r") as f2:
        fileOne = f1.readlines()
        fileTwo = f2.readlines()

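#list difference: elements of x that are not in y
l_minus=lambda x,y:list(set(x)-set(y))
#open output files using a with statement
with open(itemgetter(3)(sys.argv), "w") as outFile1, open(itemgetter(4)(sys.argv), "w") as outFile2:
        #lines read by readlines() keep their trailing newline, so join them with an empty string
        outFile1.write("".join(l_minus(fileOne,fileTwo)))
        outFile2.write("".join(l_minus(fileTwo,fileOne)))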
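
One side effect of the set-based refactorings is that `set(x)-set(y)` drops duplicate lines and returns them in arbitrary order, whereas version 0 kept both. The Python 3 sketch below is one possible way to keep the original order and still get fast lookups. The function names `lines_only_in` and `write_differences` are hypothetical and are not part of the original scripts.

```python
def lines_only_in(first, second):
    """Return the lines of `first` that do not occur in `second`, in original order."""
    seen = set(second)  # set membership tests take constant time on average
    return [line for line in first if line not in seen]


def write_differences(path_a, path_b, out_a, out_b):
    """Write the lines unique to each input file to its own output file."""
    with open(path_a) as f1, open(path_b) as f2:
        lines_a = f1.readlines()
        lines_b = f2.readlines()
    with open(out_a, "w") as o1, open(out_b, "w") as o2:
        # The lines still carry their newline characters, so writelines is enough.
        o1.writelines(lines_only_in(lines_a, lines_b))
        o2.writelines(lines_only_in(lines_b, lines_a))
```

Only the lookup side is turned into a set, so the output follows the order of the input file, and repeated lines are written as often as they occur.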

A Python Program That Finds and Extracts the Differences Between Two Folders

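While setting up a database at a client's site, I had to import several hundred gigabytes of data, spread across hundreds of thousands of files, into Oracle. Only after the import finally finished did I discover that several large chunks of data were missing, roughly a quarter of the total, which was very frustrating.

The client had bought the data from a third party. According to the client's analysis, part of the data was lost when it was copied over, but the missing data followed no pattern, and the folders were nested many levels deep, so tracking it down was difficult.

The client then obtained a complete copy of the data from the third party. There were two ways to handle the problem: re-import everything, which would take more than a week, or find the differences between the two data sets and append only the missing part.

The project schedule was tight, so only the second option was viable. But how could the differing data be found? I tried several folder comparison tools; they struggled even with a few hundred megabytes, let alone this volume of data and files.

After some thought I decided that Python would be a convenient way to solve this, so I looked into Python's file and directory operations and quickly finished the script below, which solves the problem well.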

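The script below runs well under Python 2.4. It has not been tested on other versions, but it uses only basic features, so it should work on them too (it relies on os.path.walk and the print statement, so it targets Python 2).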

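In the code, PathA is the folder with the complete data, PathB is the folder with the incomplete data, and PathC is a new, empty directory. When the script finishes, every file and directory that exists in PathA but not in PathB has been written to PathC, with the original directory structure preserved. Both the speed and the correctness were very satisfactory.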

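Because the code was needed urgently, it is neither very concise nor very tidy. I am recording it here partly for my own future reference and partly for others who need to do file and directory operations in Python.

The code is as follows:

# coding: GB2312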

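# System modules
import sys
import os
import shutil
# Functions used for directory traversal
from os.path import walk, join, normpath

# The folder with the complete data
PathA = "F:\\FullData"
# The folder with missing files
PathB = "F:\\IncomplData"

# The destination folder
PathC = "F:\\DiffData"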

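#============================================================
# This function recursively processes PathA: for every file and folder in PathA,
# it checks whether PathB contains a corresponding file or folder.
# If not, it creates the directory in PathC and copies the file.
# Files are copied with shutil.copy2 to preserve their original timestamps
def visit(arg, dirname, names):
    # Print the directory to monitor progress
    print dirname

    # Strip the leading main path (and any leading separator, so that
    # os.path.join below does not discard PathB or PathC)
    dir=dirname.replace(PathA,"").lstrip("\\/")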

    dirnameB = os.path.join(PathB,dir)
    dirnameC = os.path.join(PathC,dir)

    if os.path.isdir(dirnameB):
        # If PathB contains the corresponding folder, check each file in turn
        for file in names:
            if os.path.isfile(os.path.join(dirname
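
The approach described above (walk PathA, look for each entry under PathB, copy whatever is missing to PathC) can be sketched in Python 3 as follows. This is an illustrative sketch under stated assumptions, not the author's script: the function name `copy_missing` is hypothetical, `os.walk` stands in for `os.path.walk` (which was removed in Python 3), and `shutil.copytree` with `dirs_exist_ok` assumes Python 3.8 or later.

```python
import os
import shutil


def copy_missing(path_a, path_b, path_c):
    """Copy everything under path_a that is absent from path_b into path_c.

    Returns the relative paths of the files and directories that were copied.
    """
    copied = []
    for dirpath, dirnames, filenames in os.walk(path_a):
        # Path of the current directory relative to the root of path_a.
        rel = os.path.relpath(dirpath, path_a)
        dir_b = os.path.normpath(os.path.join(path_b, rel))
        dir_c = os.path.normpath(os.path.join(path_c, rel))

        if not os.path.isdir(dir_b):
            # The whole directory is missing from path_b: copy the subtree
            # in one go and do not descend into it again.
            shutil.copytree(dirpath, dir_c, dirs_exist_ok=True)  # Python 3.8+
            copied.append(rel)
            dirnames[:] = []
            continue

        for name in filenames:
            if not os.path.isfile(os.path.join(dir_b, name)):
                os.makedirs(dir_c, exist_ok=True)
                # copy2 also copies file metadata such as the modification time.
                shutil.copy2(os.path.join(dirpath, name), os.path.join(dir_c, name))
                copied.append(os.path.normpath(os.path.join(rel, name)))
    return copied
```

A call such as `copy_missing(PathA, PathB, PathC)` would then fill PathC with the missing files while keeping the directory layout. Clearing `dirnames` in place is what stops `os.walk` from visiting a subtree that has already been copied wholesale.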

Source: http://www.bbyears.com/jiaocheng/67419.html
