Python 3 re module documentation: the re module for regular expressions

Regular expressions are a widely used part of almost every programming language. They help you quickly get at the information you need and are used above all when text has to be processed. Python ships with a dedicated module, re, that handles regular expressions.

Today we will look in detail at what regular expressions are, how to work with them, and how the re module helps.

Regular expressions: an introduction

Where are regular expressions used? Almost everywhere. For example:

  1. Web applications that need to validate text. A typical example is online email clients.
  2. Any other project that deals with text, databases and the like.

Before we break down the syntax, we should understand the basic principles behind the re library and what makes it useful. We will also give examples from real practice and explain how they are used. You will then be able to build patterns of your own for a variety of text-processing tasks.

What is a pattern in the re library?

With a pattern you can search for information of different kinds and retrieve the text that matches it, which makes other functions more flexible. And, of course, you can then process that data.

For example, take the pattern \s+. The \s part matches any single whitespace character, and the plus sign after it means "one or more", so the whole pattern matches a run of one or more whitespace characters. That includes tab characters, which can also be matched on their own with \t+.

Before using patterns, you need to import the re library. After that, a dedicated function compiles the pattern. This takes two steps.

>>> import re
>>> regex = re.compile(r'\s+')

This code compiles a pattern that can then be reused, for example, to search for whitespace (one or more characters).
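As a quick illustration of these two steps, the sketch below compiles \s+ and uses it to list the whitespace runs in a string. The sample string and the variable names are made up for this example.

```python
import re

# Compile the pattern once; the raw string keeps the backslash intact.
whitespace = re.compile(r'\s+')

# Hypothetical sample string mixing spaces, a tab and a newline.
sample = "one  two\tthree\nfour"

# Each element is one run of whitespace characters found in the sample.
runs = whitespace.findall(sample)
print(runs)  # ['  ', '\t', '\n']
```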

Extracting separate pieces of information from a string using regular expressions

Suppose we have a variable that holds the following information.

>>> text = """100 INF  Informatics
... 213 MAT   Mathematics
... 156 ENG English"""

It contains three courses. Each one has three parts: a number, a code and a name. Notice that the spacing between these parts varies. How do we split this string into separate numbers and words? There are two ways to do it:

  1. call the re.split function.
  2. call the split method of the compiled regex.

Here is the syntax of each approach applied to our variable.

>>> re.split(r'\s+', text)

# or

>>> regex.split(text)

Output: ['100', 'INF', 'Informatics', '213', 'MAT', 'Mathematics', '156', 'ENG', 'English']

Both approaches work. In practice, though, when the same pattern is needed many times it is more convenient to compile the regular expression once than to call re.split over and over.
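To make the "compile once, reuse many times" idea concrete, here is a minimal sketch. The list of lines and the names splitter and rows are invented for the illustration.

```python
import re

# Compile once, then reuse the same pattern object for every line.
splitter = re.compile(r'\s+')

# Hypothetical course lines with uneven spacing.
lines = ["100 INF   Informatics", "213  MAT Mathematics"]

rows = [splitter.split(line) for line in lines]
print(rows)  # [['100', 'INF', 'Informatics'], ['213', 'MAT', 'Mathematics']]
```

The re module also caches recently used patterns internally, so the benefit here is mostly readability: the pattern gets a name and is written in one place.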

Finding matches with three functions

Suppose we need to extract only the numbers from the string. What needs to be done?

re.findall()

Here is a use case for findall(), which, combined with a regular expression, lets you extract every occurrence of one or more digits from the text variable.

>>> print(text)
100 INF  Informatics
213 MAT   Mathematics
156 ENG English

>>> regex_num = re.compile(r'\d+')
>>> regex_num.findall(text)
['100', '213', '156']

With \d we used a pattern that matches any digit found in the variable or text. And because we added a + after it, at least one digit has to be present.

You can also use the * symbol to say that a digit does not have to be present for a match to be found.

In our case, since we used +, findall() extracted the course numbers, each made of one or more digits, from the text. So here the regular expression acts as the setting that tells the function what to look for.
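The difference between + and * is easier to see side by side. The following is a small sketch on a made-up input string; it only prints the * result without claiming an exact value, because the number of empty matches depends on where digits are absent.

```python
import re

sample = "a12b"  # hypothetical input

# '+' requires at least one digit, so only real numbers are returned.
plus_matches = re.findall(r'\d+', sample)
print(plus_matches)  # ['12']

# '*' also accepts zero digits, so empty strings show up wherever
# no digit is present.
star_matches = re.findall(r'\d*', sample)
print(star_matches)
```

This is why + is usually the better choice when you want to extract numbers.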

re.search() vs re.match()

As the name suggests, re.search() looks for a match in the text. How does it differ from findall()? It returns a single match object for the first occurrence of the pattern, not a list of every result found, as the previous function does.

The re.match function, in turn, works the same way, with one difference: the pattern has to match at the very beginning of the text.

Let's take an example that shows this.

>>> # create a variable with text
>>> text2 = """INF Informatics
... 213 MAT Mathematics 156"""

>>> # compile the regex and look for the pattern
>>> regex_num = re.compile(r'\d+')
>>> s = regex_num.search(text2)

>>> print('Start index:', s.start())
>>> print('End index:', s.end())
>>> print(text2[s.start():s.end()])
Start index: 16
End index: 19
213

If you want to get the same result in a different way, you can call the group() method of the match object.
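The following sketch puts match(), search() and group() next to each other. The sample string is an assumption chosen so that it starts with letters rather than digits.

```python
import re

regex_num = re.compile(r'\d+')
sample = "INF Informatics\n213 MAT"  # hypothetical text that starts with letters

# match() only tries the very beginning of the string, so it finds nothing here.
at_start = regex_num.match(sample)
print(at_start)  # None

# search() scans forward and returns a match object for the first hit.
first = regex_num.search(sample)
print(first.group())  # 213
```

Because match() and search() return None when nothing is found, it is worth checking the result before calling group() on it.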

Replacing part of the text with the re library

To replace text, use the re.sub() function. Suppose our course list has changed slightly: there is now an extra tab after each course code. Our task is to collapse the whole sequence into a single line. To do that, we need to replace every match of \s+ with a single space.

The original text was:

# create a variable with text

>>> text = """100 INF \t Informatics
... 213 MAT \t Mathematics
... 156 ENG \t English"""

>>> print(text)
100 INF          Informatics
213 MAT          Mathematics
156 ENG          English

To do what we want, we used the following lines of code.

# replace one or more whitespace characters with a single space

>>> regex = re.compile(r'\s+')
>>> print(regex.sub(' ', text))

As a result, we have a single line.

100 INF Informatics 213 MAT Mathematics 156 ENG English

Now consider a different problem. We no longer want everything on one line: it matters that each course still starts on a new line. For that we need an expression that leaves the newline out of the match. What kind of expression is it?

The re library supports a feature called negative lookahead. It differs from a positive lookahead in that it is written with an exclamation mark. So to keep the newline character out of the match, we write (?!\n) in front of the pattern instead of a plain \n.

We get the following code.

# collapse whitespace runs into one space, but leave the newlines alone

>>> regex = re.compile(r'((?!\n)\s+)')
>>> print(regex.sub(' ', text))
100 INF Informatics
213 MAT Mathematics
156 ENG English
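One detail of this pattern is worth checking on your own data: the lookahead only inspects the first character of each whitespace run. The sketch below uses a hypothetical input with trailing spaces before the line break to show the effect, and compares it with a plain character class. The [ \t]+ alternative is a suggestion that assumes only spaces and tabs need collapsing; it is not part of the original example.

```python
import re

# Hypothetical input with trailing spaces before the line break.
sample = "100 INF \t Informatics  \n213 MAT \t Mathematics"

# The lookahead only guards the first character of each whitespace run,
# so a run that starts with a space may still swallow the newline.
lookahead = re.compile(r'(?!\n)\s+')
with_lookahead = lookahead.sub(' ', sample)
print(repr(with_lookahead))

# Assumption: only spaces and tabs need collapsing. A character class
# that lists them explicitly never touches the newline.
spaces_tabs = re.compile(r'[ \t]+')
with_class = spaces_tabs.sub(' ', sample)
print(repr(with_class))
```

When no line has trailing whitespace, as in the course list above, both patterns give the same result.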

What are regular expression groups?

With regular expression groups, we can get the items we are looking for as separate elements rather than as one string.

Suppose we need the course number, code and name not as one string but as separate items. Done by hand, this would take a lot of unnecessary code.

In fact, the task can be made much simpler. You can write one pattern for the whole entry and simply put the parts you want to extract in parentheses.

The result takes far fewer lines.

# define groups in the pattern for a course entry and extract them

>>> course_pattern = r'([0-9]+)\s*([A-Z]{3})\s*([A-Za-z]{4,})'
>>> re.findall(course_pattern, text)
[('100', 'INF', 'Informatics'), ('213', 'MAT', 'Mathematics'), ('156', 'ENG', 'English')]
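Since findall() returns one tuple per entry when the pattern has several groups, the tuples can be unpacked directly. Here is a minimal sketch that turns them into dictionaries; the key names number, code and name are chosen for this illustration only.

```python
import re

text = "100 INF \t Informatics\n213 MAT \t Mathematics\n156 ENG \t English"
course_pattern = r'([0-9]+)\s*([A-Z]{3})\s*([A-Za-z]{4,})'

# Each tuple holds the three captured groups for one course.
courses = [
    {"number": number, "code": code, "name": name}
    for number, code, name in re.findall(course_pattern, text)
]
print(courses[0])  # {'number': '100', 'code': 'INF', 'name': 'Informatics'}
```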

The concept of "greedy" matching

By default, regular expressions are greedy: they match as much text as they can, even when you need much less.

Let's look at a sample of HTML code from which we need to extract a tag.

>>> text = "<body>Regex Greedy Matching Example </body>"
>>> re.findall('<.*>', text)
['<body>Regex Greedy Matching Example </body>']

Instead of extracting a single tag, Python returned the whole string. That is why this behavior is called greedy.

So what do you do to get only the tag? In that case you need lazy matching. To make an expression lazy, add a question mark after the quantifier.

You get the following code and interpreter output.

>>> re.findall('<.*?>', text)
['<body>', '</body>']

If you only need the first occurrence, use the search() method.

>>> re.search('<.*?>', text).group()
'<body>'

This returns only the opening tag.
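The same greedy and lazy behavior shows up outside HTML. As a further sketch, here is the difference on quoted fragments in an invented sentence.

```python
import re

sample = 'say "yes" or "no"'  # hypothetical input

# Greedy: .* runs to the last quote it can reach.
greedy = re.findall(r'".*"', sample)
print(greedy)  # ['"yes" or "no"']

# Lazy: .*? stops at the first closing quote.
lazy = re.findall(r'".*?"', sample)
print(lazy)  # ['"yes"', '"no"']
```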

Common regular expression patterns

Here is a table with the most commonly used patterns.

[Image: table of commonly used regular expression patterns]
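The table itself is not reproduced here. As a partial stand-in, the sketch below exercises a few standard metacharacters from Python's re syntax. The sample string is invented, and the selection is not meant to mirror the original table.

```python
import re

sample = "Room 42, floor 3"  # hypothetical input

digits = re.findall(r'\d', sample)         # single digits
numbers = re.findall(r'\d+', sample)       # runs of digits
words = re.findall(r'\w+', sample)         # runs of word characters
first_word = re.findall(r'^\w+', sample)   # word at the start of the string
last_word = re.findall(r'\w+$', sample)    # word at the end of the string
lowercase = re.findall(r'[a-z]+', sample)  # runs of lowercase letters

print(digits)      # ['4', '2', '3']
print(numbers)     # ['42', '3']
print(words)       # ['Room', '42', 'floor', '3']
print(first_word)  # ['Room']
print(last_word)   # ['3']
print(lowercase)   # ['oom', 'floor']
```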

Conclusion

We have covered only the basic ways of working with regular expressions. Even so, you have seen how important they are. It makes no difference whether you need to parse a whole text or individual fragments of it, or whether you are analyzing posts on a social network or collecting data for later processing. Regular expressions are a reliable helper in all of these cases.

They let you perform tasks such as:

  1. Checking a data format, such as an email address or a phone number.
  2. Taking a string and splitting it into several smaller strings.
  3. Performing various operations on text, such as searching, extracting the information you need, or replacing part of the characters.
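As an illustration of the first item, here is a rough sketch of a format check. The pattern is a deliberately simplified assumption about what an email address looks like, and the helper name looks_like_email is hypothetical; real-world validation is considerably more involved.

```python
import re

# Assumption: a deliberately simplified e-mail shape (name@domain.tld).
# Real-world address validation is considerably more involved.
EMAIL = re.compile(r'[\w.+-]+@[\w-]+(\.[\w-]+)+')

def looks_like_email(value: str) -> bool:
    """Return True when the whole string fits the simplified pattern."""
    return EMAIL.fullmatch(value) is not None

print(looks_like_email("user@example.com"))  # True
print(looks_like_email("not an email"))      # False
```

fullmatch() is used here instead of search() so that the entire string has to fit the pattern, not just a part of it.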

Regular expressions also let you perform non-trivial operations. At first glance this is not an easy subject to master. In practice, though, everything is standardized, so it is enough to learn it once, after which the tool can be used not only in Python but in almost any other programming language. Even Excel uses regular expressions for data processing. So it would be a shame not to use this tool.
