Author: 五速梦信息网
Date: April 20, 2026, 11:32


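Chapter 4: Strings and Regular Expressions

4.1 Strings

The earliest string encoding was ASCII (American Standard Code for Information Interchange), which covers only the 10 digits, the 26 uppercase and 26 lowercase English letters, and some other symbols. ASCII stores each character in 1 byte: standard ASCII defines 128 characters, and a single byte can distinguish at most 256 symbols. As information technology developed, the scripts of every country needed to be encoded, and different application areas placed different demands on string encoding, so many encoding formats were designed. Common ones include UTF-8, UTF-16, UTF-32, GB2312, GBK, CP936, base64, and CP437.

GB2312 is the Chinese national standard for encoding Chinese text. It uses 1 byte for an English character and 2 bytes for a Chinese character. GBK is an extension of GB2312, and CP936 is the encoding Microsoft developed on the basis of GBK. GB2312, GBK, and CP936 all use 2 bytes per Chinese character.

UTF-8 encodes the characters needed by all the world's languages. It uses 1 byte for English characters (compatible with ASCII) and 3 bytes for most Chinese characters. Some scripts, such as Russian and Greek, use 2 bytes per character, and other characters take 4 bytes.

Encoding formats differ greatly from one another. Choosing a different encoding means a different representation and storage form, so the bytes written to a file for the same character may differ. To interpret the content you must know the encoding rules and decode correctly. With the wrong decoding method the information cannot be restored, and in that loose sense string encoding has an effect similar to encryption.

Python 3.x fully supports Chinese characters and uses UTF-8 by default. A digit, an English letter, and a Chinese character are each treated as one character.

    >>> s = '中国山东烟台'
    >>> len(s)    # string length, i.e. the number of characters
    6
    >>> s = '中国山东烟台ABCDE'    # Chinese and English characters alike count as one character each
    >>> len(s)
    11
    >>> 姓名 = '张三'    # a Chinese variable name
    >>> print(姓名)    # print the variable's value
    张三

In Python a string is an immutable, ordered sequence. Besides the operations common to all sequences (including slicing), it supports its own string methods.

    >>> testString = 'good'
    >>> id(testString)
    2377223672624
    >>> testString[0] = 'b'    # immutable: an element cannot be changed through its index
    Traceback (most recent call last):
      File "<pyshell#8>", line 1, in <module>
        testString[0] = 'b'
    TypeError: 'str' object does not support item assignment
    >>> testString = 'well'    # rebinding the name creates a new object
    >>> id(testString)
    2377223676080

Tuples and strings do not allow their elements to be changed.

Python string interning: when the same short string is assigned to several names, only one copy is kept in memory and all the names share it. Long strings are not interned. This is an implementation detail of CPython, and the exact rules vary between versions.

4.1.1 String formatting

Common format characters:

    >>> x = 1235
    >>> so = "%o" % x    # %o: octal
    >>> so
    '2323'
    >>> sh = "%x" % x    # %x: hexadecimal
    >>> sh
    '4d3'
    >>> se = "%e" % x    # %e: scientific notation
    >>> se
    '1.235000e+03'
    >>> # ord returns the Unicode code point of a single character (its ASCII value for ASCII characters)
    >>> # chr takes an integer and returns the corresponding character
    >>> chr(ord("3") + 1)
    '4'
    >>> "%s" % 65    # %s: convert to a string
    '65'
    >>> "%s" % 65333
    '65333'
    >>> "%d" % "555"    # %d expects a number, not a string
    Traceback (most recent call last):
      File "<pyshell#35>", line 1, in <module>
        "%d" % "555"
    TypeError: %d format: a real number is required, not str
    >>> int("555")    # eval("555") gives the same result
    555
    >>> "%s" % [1, 2, 3]
    '[1, 2, 3]'
    >>> str((1, 2, 3))
    '(1, 2, 3)'
    >>> str([1, 2, 3])
    '[1, 2, 3]'

Formatting with the format method:

    >>> print("The number {0:,} in hex is:{0:#x},the number {1} in oct is {1:#o}".format(5555, 55))
    The number 5,555 in hex is:0x15b3,the number 55 in oct is 0o67
    >>> print("The number {1:,} in hex is:{1:#x},the number {0} in oct is {0:#o}".format(5555, 55))
    The number 55 in hex is:0x37,the number 5555 in oct is 0o12663
    >>> print("my name is {name},my age is {age},and my QQ is {qq}".format(name="Dong Fuguo", age=37, qq="306467355"))
    my name is Dong Fuguo,my age is 37,and my QQ is 306467355
    >>> position = (5, 8, 13)
    >>> print("X:{0[0]};Y:{0[1]};Z:{0[2]}".format(position))
    X:5;Y:8;Z:13
    >>> weather = [("Monday", "rain"), ("Tuesday", "sunny"), ("Wednesday", "sunny"), ("Thursday", "rain"), ("Friday", "Cloudy")]
    >>> formatter = "Weather of {0[0]} is {0[1]}".format
    >>> for item in map(formatter, weather):    # map applies a function to every element of a sequence
    ...     print(item)
    >>> for item in weather:    # a second way to produce the same output
    ...     print(formatter(item))

Since Python 3.6 there is a newer way to format strings, officially called Formatted String Literals (f-strings). Its meaning is similar to the format method of string objects, but the syntax is more concise.

    >>> name = 'Dong'
    >>> age = 39
    >>> f'My name is {name},and I am {age} years old.'
    'My name is Dong,and I am 39 years old.'
    >>> width = 10        # field width
    >>> precision = 4     # precision
    >>> value = 11 / 3    # the value to format
    >>> f'result:{value:{width}.{precision}}'
    'result:     3.667'

4.1.2 Common string methods

find, rfind

find and rfind return the position of the first and the last occurrence of one string within a specified range of another string (the whole string by default). If it does not occur, they return -1.

index, rindex

index and rindex return the position of the first and the last occurrence of one string within a specified range of another string. If it does not occur, they raise an exception.

count

count returns the number of times one string occurs in another.

    >>> s = "apple,peach,banana,peach,pear"
    >>> s.find("peach")    # index where "peach" first occurs in s
    6
    >>> s.find("peach", 7)    # start searching at index 7
    19
    >>> s.find("peach", 7, 20)    # search only between index 7 and index 20
    -1
    >>> s.rfind('p')    # search from right to left
    25
    >>> s.index('p')    # position of the first occurrence of 'p'
    1
    >>> s.index('pe')
    6
    >>> s.index('pear')
    25
    >>> s.index('ppp')
    Traceback (most recent call last):
      File "<pyshell#8>", line 1, in <module>
        s.index('ppp')
    ValueError: substring not found
    >>> s.count('p')
    5
    >>> s.count('pp')
    1
    >>> s.count('ppp')
    0

split, rsplit

split and rsplit use a given string as the separator and split a string into several pieces, starting from the left end and the right end respectively. They return a list containing the pieces.

partition, rpartition

partition and rpartition use a given string as the separator and split the original string into 3 parts: the text before the separator, the separator itself, and the text after the separator. If the separator does not occur in the original string, the result is the original string and two empty strings.

    >>> s = "apple,peach,banana,pear"
    >>> li = s.split(",")
    >>> li    # the resulting list
    ['apple', 'peach', 'banana', 'pear']
    >>> s.partition(',')
    ('apple', ',', 'peach,banana,pear')
    >>> s.rpartition(',')
    ('apple,peach,banana', ',', 'pear')
    >>> s.partition('banana')
    ('apple,peach,', 'banana', ',pear')
    >>> s = "2014-10-31"
    >>> t = s.split("-")    # split at every '-'
    >>> print(t)
    ['2014', '10', '31']
    >>> print(list(map(int, t)))
    [2014, 10, 31]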
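The split and partition behaviour described above can be combined into small parsing helpers. The following is only a sketch: the function names parse_date and parse_setting and the 'key=value' input format are assumptions made for illustration, not part of the original notes.

```python
def parse_date(text):
    """Split a 'YYYY-MM-DD' string on '-' and convert each field to int."""
    year, month, day = map(int, text.split("-"))
    return year, month, day


def parse_setting(line):
    """Split a hypothetical 'key=value' line at the first '=' with partition.

    partition always returns three parts. When the separator is missing,
    the separator part is an empty string, which is used here to detect
    malformed input.
    """
    key, sep, value = line.partition("=")
    if not sep:
        raise ValueError(f"no '=' found in {line!r}")
    return key.strip(), value.strip()


print(parse_date("2014-10-31"))       # (2014, 10, 31)
print(parse_setting("path = a=b"))    # ('path', 'a=b')
```

partition is used for the key and value because it splits only at the first separator, so a value that itself contains '=' stays intact.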
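For split and rsplit, if no separator is given, any whitespace in the string (spaces, newlines, tabs, and so on, with consecutive whitespace counted as one) is treated as the separator, and a list of the resulting pieces is returned.

    >>> s = 'hello world \n\n My name is Dong '
    >>> s.split()
    ['hello', 'world', 'My', 'name', 'is', 'Dong']
    >>> s = '\n\nhello world \n\n\n My name is Dong '
    >>> s.split()
    ['hello', 'world', 'My', 'name', 'is', 'Dong']
    >>> s = '\n\nhello\t\t world \n\n\n my name\t is Dong '
    >>> s.split()
    ['hello', 'world', 'my', 'name', 'is', 'Dong']

split and rsplit also accept a maximum number of splits:

    >>> s = '\n\nhello\t\t world \n\n\n My name is Dong '
    >>> s.split(None, 1)    # None: no explicit separator, split on any whitespace; at most one split
    ['hello', 'world \n\n\n My name is Dong ']
    >>> s.rsplit(None, 1)    # one split, from the right
    ['\n\nhello\t\t world \n\n\n My name is', 'Dong']
    >>> s.split(None, 2)    # two splits, from the left
    ['hello', 'world', 'My name is Dong ']
    >>> s.rsplit(None, 2)    # two splits, from the right
    ['\n\nhello\t\t world \n\n\n My name', 'is', 'Dong']
    >>> s.split(maxsplit=6)    # at most 6 splits
    ['hello', 'world', 'My', 'name', 'is', 'Dong']
    >>> s.split(maxsplit=100)    # at most 100 splits
    ['hello', 'world', 'My', 'name', 'is', 'Dong']

When split is called with no arguments, it splits on whitespace and treats a run of whitespace as a single separator. When a separator is passed explicitly, the behaviour is slightly different: every occurrence splits, so adjacent separators produce empty strings.

    >>> 'a,,,bb,,cc'.split(',')
    ['a', '', '', 'bb', '', 'cc']
    >>> 'a\t\t\tbb\t\tccc'.split('\t')
    ['a', '', '', 'bb', '', 'ccc']
    >>> 'a\t\tbb\t\tccc'.split()
    ['a', 'bb', 'ccc']

partition and rpartition split at the first and the last occurrence of the separator respectively:

    >>> 'abababab'.partition('a')
    ('', 'a', 'bababab')
    >>> 'abababab'.rpartition('a')
    ('ababab', 'a', 'b')

join

String concatenation with join:

    >>> li = ["apple", "peach", "banana", "pear"]
    >>> sep = ","    # the separator
    >>> s = sep.join(li)    # join the strings in the list, placing the separator between them
    >>> s
    'apple,peach,banana,pear'

Using the + operator to concatenate many strings is not recommended. Prefer the join method. The following script compares the two:

    import timeit

    strlist = ['This is a long string that will not keep in memory.' for n in range(10000)]

    def use_join():
        return ''.join(strlist)

    def use_plus():
        result = ''
        for strtemp in strlist:
            result = result + strtemp
        return result

    # timing code
    if __name__ == '__main__':
        times = 1000
        # import use_join from __main__ and call it
        jointimer = timeit.Timer('use_join()', 'from __main__ import use_join')
        print('time for join:', jointimer.timeit(number=times))
        plustimer = timeit.Timer('use_plus()', 'from __main__ import use_plus')
        print('time for plus:', plustimer.timeit(number=times))

The timeit module also supports the usage shown below. In these runs the built-in function map was the fastest of the three variants, which suggests it can be quite efficient when many values need a type conversion.

    >>> # arguments: the statement to run, the number of runs
    >>> timeit.timeit('"-".join(str(n) for n in range(100))', number=10000)
    0.09980920003727078
    >>> timeit.timeit('"-".join([str(n) for n in range(100)])', number=10000)
    0.07888470002217218
    >>> timeit.timeit('"-".join(map(str,range(100)))', number=10000)
    0.06950240000151098

lower, upper, capitalize, title, swapcase

All of these return a new string. None of them modifies the original string.

    >>> s = "What is Your Name?"
    >>> s.lower()    # lowercase copy
    'what is your name?'
    >>> s.upper()    # uppercase copy
    'WHAT IS YOUR NAME?'
    >>> s.capitalize()    # first character uppercase, the rest lowercase
    'What is your name?'
    >>> s.title()    # first letter of every word uppercase
    'What Is Your Name?'
    >>> s.swapcase()    # swap upper and lower case
    'wHAT IS yOUR nAME?'

replace

replace works like a "find and replace" function. It returns a new string.

    >>> s = "中国中国"
    >>> s
    '中国中国'
    >>> s2 = s.replace("中国", "中华人民共和国")    # find every "中国" in s and replace them all
    >>> s2
    '中华人民共和国中华人民共和国'

Sensitive-word replacement: check whether the user's input contains sensitive words, and if so replace each of them with 3 asterisks.

    >>> words = ('测试', '非法', '暴力', '话')
    >>> text = '这句话里含有非法内容'
    >>> for word in words:
    ...     if word in text:
    ...         text = text.replace(word, '***')    # replace every occurrence of this word
    ...
    >>> text
    '这句***里含有***内容'

maketrans, translate

The maketrans method of string objects builds a character mapping table, and translate converts a string according to that table, replacing characters as it goes. Together they can handle several different characters in one pass, which a single call to replace cannot do.

    >>> # build a mapping table that maps the characters "abcdef123" one-to-one to "uvwxyz@#$"
    >>> table = ''.maketrans('abcdef123', 'uvwxyz@#$')
    >>> s = 'Python is a greate programming language. I like it!'
    >>> s.translate(table)    # replace according to the mapping table
    'Python is u gryuty progrumming lunguugy. I liky it!'

Caesar cipher:

    >>> import string
    >>> def kaisa(s, k):
    ...     lower = string.ascii_lowercase    # all lowercase letters
    ...     upper = string.ascii_uppercase    # all uppercase letters
    ...     before = string.ascii_letters     # all English letters
    ...     # lowercase and uppercase letters each form a ring, cut at position k
    ...     after = lower[k:] + lower[:k] + upper[k:] + upper[:k]
    ...     table = ''.maketrans(before, after)
    ...     return s.translate(table)
    ...
    >>> s = 'Python is a greate programming language. I like it!'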
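    >>> kaisa(s, 3)
    'Sbwkrq lv d juhdwh surjudpplqj odqjxdjh. L olnh lw!'

strip, rstrip, lstrip

strip removes whitespace or the specified characters from both ends of a string.
rstrip removes whitespace or the specified characters from the right end.
lstrip removes whitespace or the specified characters from the left end.

    >>> s = " abc "
    >>> s2 = s.strip()    # with no argument, strip removes whitespace
    >>> s2
    'abc'
    >>> '\n\nhello world \n\n'.strip()    # remove whitespace
    'hello world'
    >>> "aaaassddf".strip("a")    # remove the specified character
    'ssddf'
    >>> "aaaassddf".strip("af")    # remove every a and f at both ends
    'ssdd'
    >>> "aaaassaaddf".strip('a')
    'ssaaddf'
    >>> "aaaassddfaaa".rstrip("a")    # remove the specified character from the right end
    'aaaassddf'
    >>> "aaaassddfaaa".lstrip("a")    # remove the specified character from the left end
    'ssddfaaa'

The argument of these three methods is not treated as a whole string. Instead, every character contained in the argument is removed from both ends, the right end, or the left end of the original string, peeling inward from the outside until a character not in the argument is reached. The original string is not modified, because strings are immutable. A new string is returned.

    >>> 'aabbccddeeeffg'.strip('af')    # the f characters are not at either end, so they stay
    'bbccddeeeffg'
    >>> 'aabbccddeeeffg'.strip('gaf')
    'bbccddeee'
    >>> 'aabbccddeeeffg'.strip('gbaef')
    'ccdd'
    >>> 'aabbccddeeeffg'.strip('gbaefcd')
    ''

eval

The built-in function eval:

    >>> eval("3+4")    # evaluate a string
    7
    >>> a = 3
    >>> b = 5
    >>> eval('a+b')    # evaluate an expression
    8
    >>> import math
    >>> eval('help(math.sqrt)')    # equivalent to help(math.sqrt)
    Help on built-in function sqrt in module math:

    sqrt(x, /)
        Return the square root of x.

    >>> eval('math.sqrt(3)')
    1.7320508075688772
    >>> eval('aa')
    Traceback (most recent call last):
      File "<pyshell#57>", line 1, in <module>
        eval('aa')
      File "<string>", line 1, in <module>
    NameError: name 'aa' is not defined. Did you mean: 'a'?

eval is very dangerous, because it can execute arbitrary expressions:

    >>> a = input("Please input:")
    Please input:__import__('os').startfile(r'C:\Windows\notepad.exe')
    >>> eval(a)    # imports the os module and launches the given program
    >>> eval(__import__('os').system('md testtest'))    # md creates a directory
    Traceback (most recent call last):
      File "<pyshell#60>", line 1, in <module>
        eval(__import__('os').system('md testtest'))
    TypeError: eval() arg 1 must be a string, bytes or code object

In the last call the md command has already run by the time the error appears: os.system returns an integer exit status, and it is that integer which eval rejects.

in

The membership test operator in: for lists, tuples, strings, and map objects the test is slow, because its time complexity is linear and the object has to be scanned from start to end. Dictionaries and sets do not have this problem: their membership test takes constant time on average. A range object is a special case, since for integers it computes membership arithmetically instead of scanning.

    >>> "a" in "abcde"    # membership test
    True
    >>> 'ab' in 'abcde'
    True
    >>> 'ac' in 'abcde'
    False
    >>> 'j' in 'abcde'
    False

Sequence repetition with *

Python strings support multiplication by an integer, which repeats the sequence, that is, repeats the string's content. Dictionaries and sets do not support this.

    >>> 'abcd' * 3
    'abcdabcdabcd'

startswith, endswith

s.startswith(t) and s.endswith(t) test whether a string starts or ends with the specified string.

    >>> s = 'Beautiful is better than ugly.'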
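    >>> s.startswith('Be')    # test the whole string
    True
    >>> s.startswith('Be', 5)    # start testing at index 5
    False
    >>> s.startswith('Be', 0, 5)    # test between index 0 and index 5
    True

os is a Python standard library module. Its listdir function returns the names in a given directory, and because endswith also accepts a tuple of suffixes, the two can be combined to list all picture files in a given path (here the root of drive C) whose names end with any of three extensions.

center, ljust, rjust

These return a new string of the specified width in which the original string is centered, left-aligned, or right-aligned. If the specified width is greater than the length of the string, the specified character (a space by default) is used for padding.

    >>> 'Hello world!'.center(20)    # a string 20 characters wide with the original centered
    '    Hello world!    '
    >>> 'Hello world!'.center(20, '=')    # centered, padded with the given character
    '====Hello world!===='
    >>> 'Hello world!'.ljust(20, '=')    # left-aligned
    'Hello world!========'
    >>> 'Hello world!'.rjust(20, '=')    # right-aligned
    '========Hello world!'

zfill

zfill returns a string of the specified width, padded on the left with the character 0.

    >>> 'abc'.zfill(5)    # a new 5-character string, padded on the left with the digit 0
    '00abc'
    >>> 'abc'.zfill(2)    # width smaller than the string length: the string itself is returned
    'abc'
    >>> 'abc'.zfill(20)
    '00000000000000000abc'

isalnum, isalpha, isdigit

isalnum, isalpha, isdigit, isdecimal, isnumeric, isspace, isupper, and islower test whether a string consists of letters or digits, of letters, of digit characters, of whitespace, of uppercase letters, or of lowercase letters.

    >>> '1234abcd'.isalnum()    # only letters or digits?
    True
    >>> '1234abcd'.isalpha()    # only letters? True only when every character is a letter
    False
    >>> '1234abcd'.isdigit()    # only digits?
    False
    >>> 'abcd'.isalpha()
    True
    >>> '1234.0'.isdigit()    # '.' is not a digit, so only unsigned integers pass
    False

isdigit, isdecimal, and isnumeric all test whether a string is a number, but they accept different sets of characters:

    >>> '1234'.isdigit()
    True
    >>> '九'.isnumeric()    # isnumeric accepts Chinese numerals
    True
    >>> '九'.isdigit()
    False
    >>> '九'.isdecimal()
    False
    >>> 'ⅣⅢⅩ'.isdecimal()
    False
    >>> 'ⅣⅢⅩ'.isdigit()
    False
    >>> 'ⅣⅢⅩ'.isnumeric()    # isnumeric accepts Roman numeral characters
    True

Besides the methods of string objects, many Python built-in functions also work on strings, for example:

    >>> x = 'Hello world.'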
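    >>> len(x)    # string length
    12
    >>> max(x)    # largest character
    'w'
    >>> min(x)    # smallest character
    ' '
    >>> list(zip(x, x))    # zip also works on strings
    [('H', 'H'), ('e', 'e'), ('l', 'l'), ('l', 'l'), ('o', 'o'), (' ', ' '), ('w', 'w'), ('o', 'o'), ('r', 'r'), ('l', 'l'), ('d', 'd'), ('.', '.')]

isspace

Tests whether a string consists of whitespace characters (spaces, newlines, tabs).

isupper, islower

Test whether the letters of a string are uppercase or lowercase.

Slicing

Dictionaries and sets do not support positional indexing or slicing. Lazily evaluated objects such as map and zip objects do not support indexing either. Lists (mutable), tuples, and strings (both immutable) support indexing and random access. Slicing therefore also applies to strings, but only for reading elements. It cannot be used to modify a string.

    >>> 'Explicit is better than implicit.'[:8]
    'Explicit'
    >>> 'Explicit is better than implicit.'[9:23]
    'is better than'

compress, decompress

The compress and decompress functions in the Python standard library module zlib compress and decompress data. A string must be encoded to bytes before it can be compressed.

    >>> import zlib
    >>> x = 'Python程序设计系列图书，董付国编著，清华大学出版社'.encode()    # str to bytes, UTF-8 by default
    >>> len(x)    # in UTF-8 an English letter takes 1 byte and a Chinese character takes 3
    72
    >>> y = zlib.compress(x)    # compress
    >>> len(y)    # longer than the input: the text contains little repetition
    83
    >>> x = ('Python系列图书' * 3).encode()
    >>> len(x)
    54
    >>> y = zlib.compress(x)    # the more repetition, the better the compression ratio
    >>> len(y)
    30
    >>> z = zlib.decompress(y)    # decompress
    >>> len(z)
    54
    >>> z.decode()    # decode
    'Python系列图书Python系列图书Python系列图书'
    >>> x = ['董付国'] * 8    # x is a list and cannot be compressed directly: convert to str, then encode
    >>> y = str(x).encode()
    >>> len(y)
    104
    >>> z = zlib.compress(y)
    >>> len(z)
    26
    >>> zlib.decompress(z).decode()
    "['董付国', '董付国', '董付国', '董付国', '董付国', '董付国', '董付国', '董付国']"

4.1.3 String constants

The Python standard library module string defines constants for digit characters, punctuation, English letters, uppercase letters, lowercase letters, and so on.

    >>> import string
    >>> string.digits    # all digit characters
    '0123456789'
    >>> string.punctuation    # punctuation characters
    '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
    >>> string.ascii_letters    # all lowercase followed by all uppercase English letters
    'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
    >>> string.ascii_lowercase    # all lowercase letters
    'abcdefghijklmnopqrstuvwxyz'
    >>> string.ascii_uppercase    # all uppercase letters
    'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

How random password generation works:

    >>> import string
    >>> x = string.digits + string.ascii_letters + string.punctuation    # digits + letters + punctuation
    >>> x
    '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
    >>> import random
    >>> ''.join([random.choice(x) for i in range(8)])    # generate a password of length 8
    ']\\aXjIRL'

Each call returns a different random 8-character string.

4.1.4 Mutable strings

In Python a string is an immutable object and cannot be modified in place. To change its value you can only create a new string object. If you really need a Unicode data object that supports in-place modification, you can use an io.StringIO object or an array from the array module.

    >>> import io
    >>> s = "Hello,world"
    >>> sio = io.StringIO(s)    # create a StringIO object from the string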
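    >>> sio.getvalue()    # getvalue() shows the current content of the object
    'Hello,world'
    >>> sio.seek(7)    # move to index 7
    7
    >>> sio.write("there!")    # write "there!" starting at index 7
    6
    >>> # the return value is the number of characters written
    >>> sio.getvalue()
    'Hello,wthere!'
    >>> import array    # the array module provides an array class
    >>> a = array.array('u', s)    # create an array: type code, then the initial data
    >>> print(a)    # a is an array object
    array('u', 'Hello,world')
    >>> a[0] = 'y'    # arrays are mutable, so an element can be assigned directly
    >>> print(a)
    array('u', 'yello,world')
    >>> a.tounicode()    # produce a Unicode string
    'yello,world'

Note that the 'u' type code has been deprecated since Python 3.3.

4.1.5 Selected string application examples

Example 4-1: Write a function that encrypts and decrypts a string by cycling through a given key and applying a simple XOR operation.

    # XOR properties: A ^ B ^ B == A; 1^1 = 0, 0^0 = 0, 1^0 = 1, 0^1 = 1
    def crypt(source, key):    # parameters: the plaintext and the key
        # the itertools standard library module provides the cycle class
        from itertools import cycle
        result = ''          # start with an empty string
        temp = cycle(key)    # an endless iterator that repeats the key from the start
        for ch in source:
            # ord(ch): code point of the current plaintext character
            # ord(next(temp)): code point of the next key character
            # XOR works on two integers, not on two characters
            # chr: convert the resulting integer back to a character
            result = result + chr(ord(ch) ^ ord(next(temp)))
        return result

    source = 'Shandong Institute of Business and Technology'
    key = 'Dong FuGuo'
    print('Before Encrypted:' + source)
    encrypted = crypt(source, key)
    print('After Encrypted:' + encrypted)
    decrypted = crypt(encrypted, key)
    print('After Decrypted:' + decrypted)

Example 4-2: Write a program that generates a large amount of random information. This is very useful when a lot of data is needed to test or demonstrate software: it shows the software or algorithm working realistically while avoiding leaks of real data and unnecessary disputes.

Source videos (Bilibili), parts P32, 33, 34, 35, 36:
Python Strings and Regular Expressions 1: String Encoding and Formatting
Python Strings and Regular Expressions 2: String Methods 1
Python Strings and Regular Expressions 3: String Methods 2
Python Strings and Regular Expressions 4: String Methods 3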
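Example 4-2 above is described only in words. One possible way to approach it is sketched below with the random and string modules already used in this chapter. Everything specific here is an assumption made for illustration: the function names, the record fields (name, age, email), the value ranges, and the use of the reserved documentation domain example.com. It is not the program from the original course.

```python
import random
import string

# All field names and value ranges below are hypothetical choices for illustration.


def random_name(rng, length=6):
    """Return a capitalized pseudo-name made of random lowercase letters."""
    letters = [rng.choice(string.ascii_lowercase) for _ in range(length)]
    return "".join(letters).capitalize()


def random_email(rng, name):
    """Build an address under example.com, a domain reserved for documentation."""
    digits = "".join(rng.choice(string.digits) for _ in range(3))
    return f"{name.lower()}{digits}@example.com"


def random_record(rng):
    """Create one fake record as a dict."""
    name = random_name(rng)
    return {
        "name": name,
        "age": rng.randint(18, 60),    # randint includes both endpoints
        "email": random_email(rng, name),
    }


def generate_records(count, seed=None):
    """Generate `count` records; a fixed seed makes the output repeatable."""
    rng = random.Random(seed)
    return [random_record(rng) for _ in range(count)]


if __name__ == "__main__":
    for record in generate_records(3, seed=42):
        print(record)
```

Passing a seed gives the same records on every run, which is convenient for repeatable tests. The random module is suitable for test data like this, but for real passwords or tokens the standard library's secrets module is the safer choice.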