&cV~M SrSSKrSSKrSSKrSSKJr SSKJr SSKJ r SSKJ r SSKJ r \ "S5r\\ "S 5-r\ "S 5r\\-r\\ "S 5- r\\ "S 5- r\\ "S 5-\ "S 5- r\\-r\\ "S5-r\\-r\\ "S5- rSS1r\\-rSrSr\R:"S\R<\R>-5r "SS\!5r""SS\"5r#"SS\"5r$"SS\"5r%"SS\"5r&"SS \#5r'"S!S"\"5r("S#S$\"5r)"S%S&\"5r*"S'S(\"5r+"S)S*\+5r,"S+S,\#5r-"S-S.\"5r."S/S0\"5r/"S1S2\"5r0"S3S4\"5r1"S5S6\"5r2"S7S8\"5r3"S9S:\"5r4"S;S<\"5r5"S=S>\"5r6"S?S@\"5r7"SASB\"5r8"SCSD\"5r9"SESF\"5r:"SGSH\"5r;"SISJ\"5r<"SKSL\"5r="SMSN\%5r>"SOSP\"5r?"SQSR\"5r@"SSST\"5rA"SUSV\"5rB"SWSX\B5rC"SYSZ\"5rD"S[S\\"5rE"S]S^\"5rF"S_S`\"5rG"SaSb\"5rH"ScSd\H5rI"SeSf\H5rJ"SgSh\"5rK"SiSj\"5rL"SkSl\"5rM"SmSn\M5rN"SoSp\N5rO"SqSr\"5rP"SsSt\Q5rR"SuSv\R5rS"SwSx\R5rT"SySz\S5rU"S{S|\ R5rW\T"S S}5rX\T"S~S5rYS\YlZS\Yl[\T"SS5r\\R:"SRSR\555Rr`\R:"SR\R"SR\5555Rrc\R:"S5Rre\R:"SR\R"SR\5555Rrf\R:"SR\R"SR\5555Rrg\R:"SR\R"SR\5555RrhSriSrjSrkSSjrlSrmSrnSroSrpSrqSrrSrsSrtSruSrvSrwSrxSrySrzSr{Sr|Sr}Sr~SrSrSrSrSrSrSrSrSrSrSrSrSrSrSrSrSrSrSrSrSrSrSrSrSrSrSrSrSrSrSrSrSrSrSrSrg)alHeader value parser implementing various email-related RFC parsing rules. The parsing methods defined in this module implement various email related parsing rules. Principal among them is RFC 5322, which is the followon to RFC 2822 and primarily a clarification of the former. It also implements RFC 2047 encoded word decoding. RFC 5322 goes to considerable trouble to maintain backward compatibility with RFC 822 in the parse phase, while cleaning up the structure on the generation phase. This parser supports correct RFC 5322 generation by tagging white space as folding white space only when folding is allowed in the non-obsolete rule sets. Actually, the parser is even more generous when accepting input than RFC 5322 mandates, following the spirit of Postel's Law, which RFC 5322 encourages. Where possible deviations from the standard are annotated on the 'defects' attribute of tokens that deviate. The general structure of the parser follows RFC 5322, and uses its terminology where there is a direct correspondence. Where the implementation requires a somewhat different structure than that used by the formal grammar, new terms that mimic the closest existing terms are used. Thus, it really helps to have a copy of RFC 5322 handy when studying this code. Input to the parser is a string that has already been unfolded according to RFC 5322 rules. According to the RFC this unfolding is the very first step, and this parser leaves the unfolding step to a higher level message parser, which will have already detected the line breaks that need unfolding while determining the beginning and end of each header. The output of the parser is a TokenList object, which is a list subclass. A TokenList is a recursive data structure. The terminal nodes of the structure are Terminal objects, which are subclasses of str. These do not correspond directly to terminal objects in the formal grammar, but are instead more practical higher level combinations of true terminals. All TokenList and Terminal objects have a 'value' attribute, which produces the semantically meaningful value of that part of the parse subtree. The value of all whitespace tokens (no matter how many sub-tokens they may contain) is a single space, as per the RFC rules. This includes 'CFWS', which is herein included in the general class of whitespace tokens. There is one exception to the rule that whitespace tokens are collapsed into single spaces in values: in the value of a 'bare-quoted-string' (a quoted-string with no leading or trailing whitespace), any whitespace that appeared between the quotation marks is preserved in the returned value. Note that in all Terminal strings quoted pairs are turned into their unquoted values. All TokenList and Terminal objects also have a string value, which attempts to be a "canonical" representation of the RFC-compliant form of the substring that produced the parsed subtree, including minimal use of quoted pair quoting. Whitespace runs are not collapsed. Comment tokens also have a 'content' attribute providing the string found between the parens (including any nested comments) with whitespace preserved. All TokenList and Terminal objects have a 'defects' attribute which is a possibly empty list all of the defects found while creating the token. Defects may appear on any token in the tree, and a composite list of all defects in the subtree is available through the 'all_defects' attribute of any node. (For Terminal notes x.defects == x.all_defects.) Each object in a parse tree is called a 'token', and each has a 'token_type' attribute that gives the name from the RFC 5322 grammar that it represents. Not all RFC 5322 nodes are produced, and there is one non-RFC 5322 node that may be produced: 'ptext'. A 'ptext' is a string of printable ascii characters. It is returned in place of lists of (ctext/quoted-pair) and (qtext/quoted-pair). XXX: provide complete list of token types. N) hexdigits) itemgetter)_encoded_words)errors)utilsz (z ()<>@,:;.\"[].z."(z/?=z*'%%  cX[U5RSS5RSS5$)z;Escape dquote and backslash for use within a quoted-string.\\\"z\")strreplacevalues A/opt/alt/python313/lib64/python3.13/email/_header_value_parser.pymake_quoted_pairsrcs& u:  dF + 3 3C ??c$[U5nSUS3$)Nr)r)rescapeds r quote_stringrhs&G wiq>rz =\? # literal =? [^?]* # charset \? # literal ? [qQbB] # literal 'q' or 'b', case insensitive \? # literal ? .*? # encoded word \?= # literal ?= c^\rSrSrSrSrSrU4SjrSrU4Sjr \ S5r \ S5r S r \ S 5r\ S 5rS rSS jrSSjrSSjrSrU=r$) TokenList}NTc4>[TU]"U0UD6 /UlgN)super__init__defects)selfargskw __class__s rr!TokenList.__init__s $%"% rc2SRSU55$)Nc38# UHn[U5v M g7frr.0xs r $TokenList.__str__..,t!s1vvtjoinr#s r__str__TokenList.__str__sww,t,,,rch>SRURR[TU]55$Nz{}({})formatr&__name__r __repr__r#r&s rr=TokenList.__repr__s+t~~66"W-/1 1rc2SRSU55$)Nr)c3^# UH#oR(dMURv M% g7frrr,s rr/"TokenList.value..s81wqwws--r3r5s rrTokenList.valuesww8888rc<[SU5UR5$)Nc38# UHoRv M g7fr) all_defectsr,s rr/(TokenList.all_defects..s04aMM4r2)sumr"r5s rrFTokenList.all_defectss040$,,??rc(USR5$Nr)startswith_fwsr5s rrLTokenList.startswith_fwssAw%%''rc&[SU55$)zATrue if all top level tokens of this part may be RFC2047 encoded.c38# UHoRv M g7fr) as_ew_allowed)r-parts rr/*TokenList.as_ew_allowed..s7$$%%$r2)allr5s rrPTokenList.as_ew_alloweds7$777rcR/nUHnURUR5 M U$r)extendcomments)r#rWtokens rrWTokenList.commentss&E OOENN +rc[XS9$)Npolicy)_refold_parse_treer#r\s rfoldTokenList.folds !$66rc4[URUS95 g)Nindent)printppstrr#rcs rpprintTokenList.pprints djjj'(rc>SRURUS95$)Nr rb)r4_pprfs rreTokenList.ppstrsyy011rc## SRUURRUR5v UHHn[ US5(dUSRU5-v M,UR US-5ShvN MJ UR (aSRUR 5nOSnSRX5v gNK7f)Nz{}{}/{}(rjz* !! invalid element in token list: {!r}z z Defects: {}r)z{}){})r;r&r< token_typehasattrrjr")r#rcrXextras rrj TokenList._pps  NN # # OO E5%((!55;VE]CD!99VF]333  <<"))$,,7EEnnV++ 4sA9C ;C. s#9DqCFFDr2)rr4r5s rr6BareQuotedString.__str__ sBGG#9D#99::rc2SRSU55$)Nr)c38# UHn[U5v M g7frr+r,s rr/)BareQuotedString.value..r1r2r3r5s rrBareQuotedString.value ww,t,,,rrN) r<rrrsrtrmr6rwrrxrrrrrs %J;--rrcD\rSrSrSrSrSr\S5r\S5r Sr g) Commentirc SR[S/UVs/sHoRU5PM snS///55$s snf)Nr)r))r4rHquoters rr6Comment.__str__sIwws E489DqZZ]D9 E " #$ $9sAcURS:Xa [U5$[U5RSS5RSS5RSS5$)Nrrrrz\(rz\))rmrr)r#rs rr Comment.quotesR   y (u: 5z!!$/77"%u..5g"%u/. .rc2SRSU55$)Nr)c38# UHn[U5v M g7frr+r,s rr/"Comment.content..%r1r2r3r5s rrComment.content#rrcUR/$r)rr5s rrWComment.comments's ~rrN) r<rrrsrtrmr6rrwrrWrxrrrrrs9J$.--rrcH\rSrSrSr\S5r\S5r\S5rSr g) AddressListi+z address-listcTUVs/sHoRS:XdMUPM sn$s snf)Naddressrmrs r addressesAddressList.addresses/#;4a<<#:4;;;%%c([SU5/5$)Nc3\# UH"oRS:XdMURv M$ g7frNrm mailboxesr,s rr/(AddressList.mailboxes..5s&>!\\9%< AKK!,,rHr5s rrAddressList.mailboxes3!>!>?AC Crc([SU5/5$)Nc3\# UH"oRS:XdMURv M$ g7frrm all_mailboxesr,s rr/,AddressList.all_mailboxes..:s&>!\\9%<$AOO!rrr5s rrAddressList.all_mailboxes8rrrN) r<rrrsrtrmrwrrrrxrrrrr+sEJ <<CCCCrrcH\rSrSrSr\S5r\S5r\S5rSr g)Addressi>rcHUSRS:XaUSR$g)Nrgrouprm display_namer5s rrAddress.display_nameBs) 7   (7'' ' )rc|USRS:XaUS/$USRS:Xa/$USR$Nrmailboxinvalid-mailboxrr5s rrAddress.mailboxesGsH 7   *G9  !W  #4 4IAw   rcUSRS:XaUS/$USRS:XaUS/$USR$rrr5s rrAddress.all_mailboxesOsO 7   *G9  !W  #4 4G9 Aw$$$rrN) r<rrrsrtrmrwrrrrxrrrrr>sAJ ((!!%%rrc8\rSrSrSr\S5r\S5rSrg) MailboxListiW mailbox-listcTUVs/sHoRS:XdMUPM sn$s snf)Nrrrs rrMailboxList.mailboxes[rrcVUVs/sHnURS;dMUPM sn$s snf)N)rrrrs rrMailboxList.all_mailboxes_s2?4a||==4? ??s&&rN r<rrrsrtrmrwrrrxrrrrrWs-J <<??rrc8\rSrSrSr\S5r\S5rSrg) GroupListie group-listcXU(aUSRS:wa/$USR$Nrrrr5s rrGroupList.mailboxesis+tAw))^;IAw   rcXU(aUSRS:wa/$USR$r rr5s rrGroupList.all_mailboxesos+tAw))^;IAw$$$rrNrrrrrres-J !! %%rrcH\rSrSrSr\S5r\S5r\S5rSr g)GroupivrcJUSRS:wa/$USR$Nrrr5s rrGroup.mailboxeszs) 7   -IAw   rcJUSRS:wa/$USR$rrr5s rrGroup.all_mailboxess) 7   -IAw$$$rc USR$rK)rr5s rrGroup.display_namesAw###rrN) r<rrrsrtrmrwrrrrxrrrrrvsAJ !! %% $$rrch\rSrSrSr\S5r\S5r\S5r\S5r \S5r Sr g ) NameAddri name-addrc@[U5S:XagUSR$Nr)lenrr5s rrNameAddr.display_names t9>Aw###rc USR$N local_partr5s rr$NameAddr.local_partsBx"""rc USR$r!domainr5s rr(NameAddr.domainsBxrc USR$r!)router5s rr+NameAddr.routesBx~~rc USR$r! addr_specr5s rr/NameAddr.addr_specsBx!!!rrN r<rrrsrtrmrwrr$r(r+r/rxrrrrrsiJ $$ ##""rrcX\rSrSrSr\S5r\S5r\S5r\S5r Sr g) AngleAddriz angle-addrcRUH!nURS:XdMURs $ gN addr-spec)rmr$rs rr$AngleAddr.local_parts"A||{*||#rcRUH!nURS:XdMURs $ gr5rmr(rs rr(AngleAddr.domains!A||{*xxrcRUH!nURS:XdMURs $ g)N obs-route)rmdomainsrs rr+AngleAddr.routes"A||{*yy rcUHVnURS:XdMUR(aURs $[UR5UR-s $ g)Nr6z<>)rmr$r/rrs rr/AngleAddr.addr_specsFA||{*<<;;&' 5 CC rrN) r<rrrsrtrmrwr$r(r+r/rxrrrr3r3sUJ $$   !! rr3c(\rSrSrSr\S5rSrg)ObsRouteir<chUVs/sH oRS:XdMURPM" sn$s snf)Nr(r9rs rr=ObsRoute.domainss)"&C$Q,,(*B$CCCrrN)r<rrrsrtrmrwr=rxrrrrBrBsJ DDrrBch\rSrSrSr\S5r\S5r\S5r\S5r \S5r Sr g ) MailboxircHUSRS:XaUSR$gNrrrr5s rrMailbox.display_names) 7   ,7'' ' -rc USR$rKr#r5s rr$Mailbox.local_partAw!!!rc USR$rKr'r5s rr(Mailbox.domainsAw~~rcHUSRS:XaUSR$grH)rmr+r5s rr+ Mailbox.routes' 7   ,7==  -rc USR$rKr.r5s rr/Mailbox.addr_specsAw   rrNr1rrrrFrFsiJ ((""!!!!rrFc8\rSrSrSr\S5r\=r=r=r r Sr g)InvalidMailboxircgrrr5s rrInvalidMailbox.display_namerrNr1rrrrTrTs/"J /;:J::%)rrTc:^\rSrSrSrSr\U4Sj5rSrU=r $)Domainir(FcR>SR[TU]R55$Nr)r4r rsplitr>s rr( Domain.domainwwuw}**,--rr) r<rrrsrtrmrPrwr(rxryrzs@rrYrYsJM ..rrYc\rSrSrSrSrg)DotAtomidot-atomrNrrrrrarasJrrac\rSrSrSrSrSrg) DotAtomTextiz dot-atom-textTrNr<rrrsrtrmrPrxrrrrdrds  JMrrdc\rSrSrSrSrSrg) NoFoldLiterali zno-fold-literalFrNrerrrrgrg s "JMrrgc\\rSrSrSrSr\S5r\S5r\S5r \S5r Sr g ) AddrSpecir6Fc USR$rKr#r5s rr$AddrSpec.local_partrLrc@[U5S:agUSR$)Nr")rr(r5s rr(AddrSpec.domains t9q=Bxrc[U5S:aUSR$USRR5USR-USRR5-$)Nrmrrr)rrrstriplstripr5s rrAddrSpec.valuesU t9q=7== Aw}}##%d1gmm3DGMM4H4H4JJJrc[UR5n[U5[U[- 5:a[ UR5nO URnUR bUS-UR -$U$)N@)setr$r DOT_ATOM_ENDSrr()r#namesetlps rr/AddrSpec.addr_spec$s_doo& w<#gm34 4doo.BB ;; "8dkk) ) rrN) r<rrrsrtrmrPrwr$r(rr/rxrrrriris\JM "" KK rric\rSrSrSrSrSrg) ObsLocalParti0zobs-local-partFrNrerrrr{r{0s !JMrr{cJ^\rSrSrSrSr\S5r\U4Sj5rSr U=r $) DisplayNamei6z display-nameFc[U5n[U5S:Xa UR$USRS:XaUR S5 OB[ US[5(a*USSRS:Xa[USSS5US'USRS:XaUR 5 UR$[ US[5(a*USSRS:Xa[USSS5US'UR$)Nrrrr")rrrrmpop isinstance)r#rs rrDisplayName.display_name;so s8q=99  q6   & GGAJ3q69--F1I((F2"3q6!":.A r7   ' GGI yy3r7I..GBK**f4#CGCRL1Byyrc>SnUR(aSnOUHnURS:XdMSnM [U5S:waU(aS=p4USRS:Xd.[US[5(aUSSRS:XaSnUSRS:Xd.[US[5(aUSSRS:XaSnU[ UR 5-U-$[TU] $) NFTrrr)rrr") r"rmrrrrrr r)r#rr.prepostr&s rrDisplayName.valueNs <<E<eOCQ""f,47I..Q %%/R##v-48Y//R ''61|D$5$566t; ;7= rr) r<rrrsrtrmrvrwrrrxryrzs@rr}r}6s4J $!!rr}c<\rSrSrSrSr\S5r\S5rSr g) LocalPartifz local-partFcdUSRS:XaUSR$USR$)Nrr)rmrrr5s rrLocalPart.valueks2 7   07'' '7== rc[/n[nSnUS[/-HnURS:XaMU(a4URS:Xa$USRS:Xa[USS5US'[U[5nU(aAURS:Xa1USRS:XaUR [USS55 OUR U5 USnUnM [USS5nUR $)NFrrdotr"r)DOTrmrrrr)r#rlast last_is_tltokis_tls rr$LocalPart.local_partrse 7cU?C~~'s~~6H''61#D"I.BsI.E$//U2F%%/ 9SW-. 3r7DJ#Ab "yyrrN) r<rrrsrtrmrPrwrr$rxrrrrrfs2JM !! rrcJ^\rSrSrSrSr\U4Sj5r\S5rSr U=r $) DomainLiteralizdomain-literalFcR>SR[TU]R55$r[r\r>s rr(DomainLiteral.domainr_rcRUH!nURS:XdMURs $ g)Nptextrrs ripDomainLiteral.ips!A||w&wwrr) r<rrrsrtrmrPrwr(rrxryrzs@rrrs3!JM ..rrc \rSrSrSrSrSrSrg) MIMEVersioniz mime-versionNr)r<rrrsrtrmmajorminorrxrrrrrsJ E ErrcD\rSrSrSrSrSrSr\S5r \S5r Sr g) Parameteri parameterFus-asciicFUR(aUSR$S$r) sectionednumberr5s rsection_numberParameter.section_numbers"&tAw~~6Q6rcUHynURS:XaURs $URS:XdM3UH@nURS:XdMUH%nURS:XdMURs s s $ MB M{ g)Nrrrr))rmrrs r param_valueParameter.param_valuesxE7*+++?2"E''+??%*E$//7:',';'; ;&+# rrN) r<rrrsrtrmrextendedrrwrrrxrrrrrs<JIHG 77   rrc\rSrSrSrSrg)InvalidParameteriinvalid-parameterrNrrrrrrs$Jrrc(\rSrSrSr\S5rSrg) Attributei attributecrUH1nURRS5(dM%URs $ g)Nattrtext)rmendswithrrs rrAttribute.stripped_values-E((44{{"rrNr<rrrsrtrmrwrrxrrrrrsJ ##rrc\rSrSrSrSrSrg)SectionisectionNr)r<rrrsrtrmrrxrrrrrs J Frrc(\rSrSrSr\S5rSrg)ValueircUSnURS:XaUSnURRS5(a UR$UR$)Nrrr)rrzextended-attribute)rmrrrrs rrValue.stripped_valuesVQ   v %GE    $ $D F F'' 'zzrrNrrrrrrsJ rrc2\rSrSrSrSr\S5rSrSr g)MimeParametersimime-parametersFc## 0nUHnURRS5(dM%USRS:waM:USRR5nX1;a/X'XR UR U45 M UR 5GHup4[U[S5S9nUSSnURnUR(dU[U5S:aFUSSS:Xa:USSRR [R"S55 USSn/nSnUGH+upX:waqU R(d1U RR [R"S55 MMU RR [R"S55 US- nU Rn U R(a|[ R"R%U 5n U R'US 5n [,R."U 5(a.U RR [R0"55 UR U 5 GM. S R5U5n X;4v GM g![([*4a U R'S S 5n Nf=f![*a! [ R"R3U S S 9n Nf=f7f)Nrrr)keyrz.duplicate parameter name; duplicate(s) ignoredz+duplicate parameter name; duplicate ignoredz(inconsistent RFC2231 parameter numberingsurrogateescaperzlatin-1)encodingr))rmrrstriprritemssortedrrrrr"rInvalidHeaderDefectrurllibparseunquote_to_bytesdecode LookupErrorUnicodeEncodeErrorr_has_surrogatesUndecodableBytesDefectunquoter4) r#paramsrXnameparts first_paramr value_partsirparamrs rrMimeParameters.paramssE##,,[99Qx""k18>>'')D!! L  !5!5u = >"<<>KD5jm4E(1+K!))G''CJN8A;!#!HQK''..v/I/IH0JK!"1IEKA).%!&!>> ,,V-G-GI.KL  ,,V-G-GF.HIQ))>>R & = =e DP$)LL:K$LE!0077!MM001N1N1PQ""5)C*/DGGK(E+ g*R!,-?@P %*LL=N$OE P.P!' 4 4UY 4 O PsIGKJ'2I?A;K?"J$!K#J$$K'(KKKKc /nURHIup#U(a,URSRU[U555 M8URU5 MK SR U5nU(aSU-$S$)N{}={}z; rr))rrr;rr4)r#rrrs rr6MimeParameters.__str__,sc;;KD gnnT<3FGH d# ' 6"%sV|-2-rrN) r<rrrsrtrmrurwrr6rxrrrrrs&"JO CCJ.rrc(\rSrSrSr\S5rSrg)ParameterizedHeaderValuei7Fcf[U5H!nURS:XdMURs $ 0$)Nr)reversedrmrrs rrParameterizedHeaderValue.params=s0d^E#44||#$ rrN)r<rrrsrtrurwrrxrrrrr7sO rrc$\rSrSrSrSrSrSrSrg) ContentTypeiEz content-typeFtextplainrN) r<rrrsrtrmrPmaintypesubtyperxrrrrrEsJMHGrrc \rSrSrSrSrSrSrg)ContentDispositioniLzcontent-dispositionFNr)r<rrrsrtrmrPcontent_dispositionrxrrrrrLs&JMrrc \rSrSrSrSrSrSrg)ContentTransferEncodingiRzcontent-transfer-encodingF7bitrN)r<rrrsrtrmrPrrxrrrrrRs,JM Crrc\rSrSrSrSrSrg) HeaderLabeliXz header-labelFrNrerrrrrXs JMrrc"\rSrSrSrSrSrSrg)MsgIDi]zmsg-idFc2[U5UR-$r)rlinesepr^s rr_ MsgID.foldas4y6>>))rrN)r<rrrsrtrmrPr_rxrrrrr]sJM*rrc\rSrSrSrSrg) MessageIDifz message-idrNrrrrrrfsJrrc\rSrSrSrSrg)InvalidMessageIDijzinvalid-message-idrNrrrrrrjs%Jrrc\rSrSrSrSrg)HeaderinheaderrNrrrrrrnrrrc^\rSrSrSrSrSrU4SjrU4SjrSr \ S5r S U4Sjjr Sr \ S 5rS rS rU=r$) TerminalivTc@>[TU]X5nX#l/UlU$r)r __new__rmr")clsrrmr#r&s rrTerminal.__new__|s"ws*$  rch>SRURR[TU]55$r9r:r>s rr=Terminal.__repr__s&t~~668H8JKKrcb[URRS-UR-5 g)N/)rdr&r<rmr5s rrgTerminal.pprints" dnn%%+doo=>rc,[UR5$r)listr"r5s rrFTerminal.all_defectssDLL!!rc >SRUURRUR[TU]5UR (dS5/$SRUR 55/$)Nz {}{}/{}({}){}r)z {})r;r&r<rmr r=r")r#rcr&s rrj Terminal._ppsg&&  NN # # OO G  llB   ). T\\(B  rcgrrr5s rpop_trailing_wsTerminal.pop_trailing_wsrWrc/$rrr5s rrWTerminal.commentss rc0[U5UR4$r)rrmr5s r__getnewargs__Terminal.__getnewargs__s4y$//**r)r"rmrq)r<rrrsrtrPrvrurr=rgrwrFrjr rWrrxryrzs@rrrvs_MO L?""++rrc*\rSrSr\S5rSrSrg)WhiteSpaceTerminalicgrrr5s rrWhiteSpaceTerminal.valuerrcg)NTrr5s rrL!WhiteSpaceTerminal.startswith_fwssrrNr<rrrsrtrwrrLrxrrrrrs rrc*\rSrSr\S5rSrSrg) ValueTerminalicU$rrr5s rrValueTerminal.values rcg)NFrr5s rrLValueTerminal.startswith_fwssrrNrrrrrrs rrc*\rSrSr\S5rSrSrg)EWWhiteSpaceTerminalicgr[rr5s rrEWWhiteSpaceTerminal.valuesrcgr[rr5s rr6EWWhiteSpaceTerminal.__str__srrN)r<rrrsrtrwrr6rxrrrr!r!s rr!c\rSrSrSrSrg)_InvalidEwErroriz1Invalid encoded word found while parsing headers.rN)r<rrrsrt__doc__rxrrrr'r's;rr'r,zlist-separatorFrtzroute-component-markerz([{}]+)r)z[^{}]+z[\x00-\x20\x7F]c[U5nU(a/URR[R"U55 [ R "U5(a0URR[R"S55 gg)z@If input token contains ASCII non-printables, register a defect.z*Non-ASCII characters found in header tokenN)_non_printable_finderr"rrNonPrintableDefectrrr)xtextnon_printabless r_validate_xtextr/sf+51N V66~FG U## V:: 8: ;$rcHU(dg[US5tp#/nSnSn[[U55HBnX'S:XaU(aSnSnOSnMU(aSnO X'U;a OURX'5 MD WS-nSR U5SR X'S/U-5U4$)aWScan printables/quoted-pairs until endchars and return unquoted ptext. This function turns a run of qcontent, ccontent-without-comments, or dtext-with-quoted-printables into a single string by unquoting any quoted printables. It returns the string, the remaining value, and a flag that is True iff there were any quoted printables decoded. )r)r)FrFrTr)N) _wsp_splitterrangerrr4)rendcharsfragment remaindervcharsescapehad_qpposs r_get_ptext_to_endcharsr:s (2H F F FS]# =D  F ]h &  hm$$Ag 776?BGGXd^$4y$@A6 IIrcpUR5n[US[U5[U5- S5nX!4$)zFWS = 1*WSP This isn't the RFC definition. We're using fws to represent tokens where folding can be done, but when we are parsing the *un*folding has already been done so we don't need to watch out for CRLF. Nfws)rqrr)rnewvaluer<s rget_fwsr>s8||~H U#r1rr/)r terminal_typeewrr5remstrrestrrrr"rXcharsvtexts rget_encoded_wordrO"sM B   D ! !%% 0 7 7 >@ @ABioodA.OC ABi%% 0 7 7 >@ @ WWY F F aq Yq Y #!<<a0Dj4 399;! &44 ,. / F GGI E@'*zz$*t2C'D$tJGJJg  7c>!$-KE IIe  )$2e3 %wwy! $ q$ &44 <> ? 9)  !@ / 6 6rvv >@ @@s I%%5Jcl[5nU(GazUS[;a [U5up URU5 M5SnUR S5(a[ US5up Sn[ U5S:aDUSRS:wa1URR[R"S55 SnU(a4[ U5S :a%US RS :Xa[USS5US'URU5 M[US 5tpVU(a,[R!U5(aUR#S5tpV[%US5n['U5 URU5 S R)U5nU(aGMzU$![a SnN[Ra Nf=f) aunstructured = (*([FWS] vchar) *WSP) / obs-unstruct obs-unstruct = *((*LF *CR *(obs-utext) *LF *CR)) / FWS) obs-utext = %d0 / obs-NO-WS-CTL / LF / CR obs-NO-WS-CTL is control characters except WSP/CR/LF. So, basically, we have printable runs, plus control characters or nulls in the obsolete syntax, separated by whitespace. Since RFC 2047 uses the obsolete syntax in its specification, but requires whitespace on either side of the encoded words, I can see no reason to need to separate the non-printable-non-whitespace from the printable runs if they occur, so we parse this into xtext tokens separated by WSP tokens. Because an 'unstructured' value must by definition constitute the entire value, this 'get' routine does not return a remaining value, only the parsed TokenList. rTr@utextr"r<z&missing whitespace before encoded wordFrrr))rrHr>rrBrOrrmr"rrr!r'rCr1rfc2047_matchersearch partitionrr/r4)rrrXvalid_ewhave_wsrr5rNs rget_unstructuredrXSs.)*L  8s?"5>LE    &    D ! ! /w? |$q(#B'22e;$,,33F4N4ND5FG"'s<014#B'22nD+?(,e,5 R(##E*'q1 ..s33#ood3OCc7+E" "Q %R A# ! **  sF F3F32F3cT[US5upn[US5n[U5 X4$)actext = This is not the RFC ctext, since we are handling nested comments in comment and unquoting quoted-pairs here. We allow anything except the '()' characters, but if we find any ASCII other than the RFC defined printable ASCII, a NonPrintableDefect is added to the token's defects list. Since quoted pairs are converted to their unquoted values, what is returned is a 'ptext' token. In this case it is a WhiteSpaceTerminal, so it's value is ' '. z()r)r:rr/rr_s r get_qp_ctextr\s0-UD9OE! ug .EE <rcT[US5upn[US5n[U5 X4$)aWqcontent = qtext / quoted-pair We allow anything except the DQUOTE character, but if we find any ASCII other than the RFC defined printable ASCII, a NonPrintableDefect is added to the token's defects list. Any quoted pairs are converted to their unquoted values, so what is returned is a 'ptext' token. In this case it is a ValueTerminal. rr)r:rr/rZs r get_qcontentr^s0-UC8OE! % )EE <rc[U5nU(d%[R"SRU55eUR 5nU[ U5Sn[ US5n[U5 X 4$)zatext = We allow any non-ATOM_ENDS in atext, but add an InvalidATextDefect to the token's defects list if we find non-atext characters. zexpected atext but found '{}'Natext)_non_atom_end_matcherrrCr;rrrr/)rmr`s r get_atextrcsi e$A %% + 2 25 9; ; GGIE #e*+ E % )EE <rcU(a USS:wa%[R"SRU55e[5nUSSnU(a'USS:Xa[ U5up UR U5 U(aUSS:waUS[ ;a[U5up OUSSS:XaSn[U5up URR [R"S 55 S nU(aG[U5S:a8US RS :Xa%US RS:Xa[US S 5US 'O [ U5up UR U5 U(a USS:waMU(d2URR [R"S55 X4$XSS4$![Ra [ U5up Nf=f)zbare-quoted-string = DQUOTE *([FWS] qcontent) [FWS] DQUOTE A quoted-string without the leading or trailing white space. Its value is the text between the quote marks, with whitespace preserved and quoted pairs decoded. rrzexpected '"' but found '{}'rNrr@Fz!encoded word inside quoted stringTr"r<rRrz"end of header inside quoted string)rrCr;rr^rrHr>rOr"rrrmr!)rbare_quoted_stringrXrVs rget_bare_quoted_stringrfs E!HO%% * 1 1% 8: :)+ !"IE qS#E* !!%( E!HO 8s?"5>LE5 2AY$ H 3/6 "**11&2L2L739: C 23a7&r*55>*2.99^K-A*2..7&r*(.LE!!%(+ E!HO, ""))&*D*D 0+2 3!(( QRy ((!** 3+E2 u 3s*>F&&!G  G cU(a.USS:wa%[R"SRU55e[5nUSSnU(akUSS:wabUS[;a[ U5up O$USS:Xa[ U5up O [U5up URU5 U(a USS:waMbU(d2URR[R"S55 X4$XSS4$)zcomment = "(" *([FWS] ccontent) [FWS] ")" ccontent = ctext / quoted-pair / comment We handle nested comments here, and quoted-pair in our qp-ctext routine. rrzexpected '(' but found '{}'rNrzend of header inside comment) rrCr;rrHr> get_commentr\rr"r)rrrXs rrhrhs  qS%% ) 0 0 79 9iG !"IE E!HO 8s?"5>LE5 1X_&u-LE5'.LEu E!HO v99 * , -~ !"I rc[5nU(a\US[;aOUS[;a[U5up O [ U5up UR U5 U(aUS[;aMOX4$)z,CFWS = (1*([FWS] comment) [FWS]) / FWS r)r CFWS_LEADERrHr>rhr)rrrXs rget_cfwsrk sc :D E!H + 8s?"5>LE5&u-LE E E!H + ;rc [5nU(a+US[;a[U5up URU5 [ U5up URU5 U(a+US[;a[U5up URU5 X4$)zquoted-string = [CFWS] [CFWS] 'bare-quoted-string' is an intermediate class defined by this parser and not by the RFC grammar. It is the quoted string without any attached CFWS. r)rrjrkrrf)r quoted_stringrXs rget_quoted_stringrns|!NM q[( U#)%0LE q[( U#  rc*[5nU(a+US[;a[U5up URU5 U(a2US[;a%[ R "SRU55eURS5(a[U5up O [U5up URU5 U(a+US[;a[U5up URU5 X4$![ R a [U5up Nif=f)zHatom = [CFWS] 1*atext [CFWS] An atom could be an rfc2047 encoded word. rzexpected atom but found '{}'r@) rrjrkr ATOM_ENDSrrCr;rBrOrc)rrrXs rget_atomrq)s 6D q[(  E qY&%% * 1 1% 8: :  ,+E2LE5 !' KK q[(  E ;&& ,%U+LE5 ,s C..!DDc[5nU(a US[;a%[R"SR U55eU(akUS[;a^[ U5up UR U5 U(a#USS:XaUR [5 USSnU(aUS[;aM^US[La([R"SR SU-55eX4$)z'dot-text = 1*atext *("." 1*atext) rz8expected atom at a start of dot-atom-text but found '{}'r rNr"z4expected atom at end of dot-atom-text but found '{}')rdrprrCr;rcrr)r dot_atom_textrXs rget_dot_atom_textrtDs MM E!H )%%'++16%=: : E!HI- ' U# U1X_   %!"IE E!HI- RC%%'#VCI.0 0  rc[5nUS[;a[U5up URU5 UR S5(a[ U5up O [U5up URU5 U(a+US[;a[U5up URU5 X4$![ Ra [U5up Nif=f)zydot-atom = [CFWS] dot-atom-text [CFWS] Any place we can have a dot atom, we could instead have an rfc2047 encoded word. rr@) rarjrkrrBrOrrCrt)rdot_atomrXs r get_dot_atomrwWs yH Qx;   4+E2LE5 )/  OOE q[(  ?&& 4-U3LE5 4s B..!CCc:US[;a[U5upOSnU(d[R"S5eUSS:Xa[ U5up O?US[ ;a%[R"SR U55e[U5up UbU/USS&X 4$)a`word = atom / quoted-string Either atom or quoted-string may start with CFWS. We have to peel off this CFWS first to determine which type of word to parse. Afterward we splice the leading CFWS, if any, into the parsed sub-token. If neither an atom or a quoted-string is found before the next special, a HeaderParseError is raised. The token returned is either an Atom or a QuotedString, as appropriate. This means the 'word' level of the formal grammar is not represented in the parse tree; this is because having that extra layer when manipulating the parse tree is more confusing than it is helpful. rNz5Expected 'atom' or 'quoted-string' but found nothing.rz1Expected 'atom' or 'quoted-string' but found '{}')rjrkrrCrnSPECIALSr;rq)rleaderrXs rget_wordr{ps  Qx;   %% CE E Qx}(/ u qX %%'77=ve}F F   Hbq <rc[5n[U5up URU5 U(aUS[;aUSS:XaJUR[5 UR R[R"S55 USSnO[U5up URU5 U(aUS[;aMX4$![Ra2 UR R[R "S55 Nf=f![RaM US[;a>[U5up UR R[R"S55 Nef=f)aphrase = 1*word / obs-phrase obs-phrase = word *(word / "." / CFWS) This means a phrase can be a sequence of words, periods, and CFWS in any order as long as it starts with at least one word. If anything other than words is detected, an ObsoleteHeaderDefect is added to the token's defect list. We also accept a phrase that starts with CFWS followed by a dot; this is registered as an InvalidHeaderDefect, since it is not supported by even the obsolete grammar. zphrase does not start with wordrr zperiod in 'phrase'rNzcomment found without atom) rr{rrrCr"r PHRASE_ENDSrObsoleteHeaderDefectrjrk)rrrXs r get_phrasers@XF0  e E!HK/ 8S= MM#  NN ! !&"="=$#& '!"IE '  MM% ! E!HK/" =)  " "0f88 -/ 00** 8{*#+E?LENN))&*E*E4+67  s%C  D ADDAE31E3c[5nSnU(aUS[;a [U5up U(d%[R"SR U55e[ U5up0UbU/USS&URU5 U(aUSS:Xd US[;a[[U5U-5up@URS:Xa0URR[R"S55 O/URR[R "S55 XAS'UR"R%S5 X4$![RaO [U5up0GN![Ra& USS:waUS[;ae[5nGN@f=ff=f![&a4 URR[R("S 55 X4$f=f) z@ @ #E* Hbq e %(D.E!HK$? 23z?U3J K  $ $(@ @    % %f&@&@N'P Q    % %f&A&A>'@ A&1 >(  1  " "  #E?LE5&& Qx4E!H $;KE  * >!!&"@"@;#= >  >s< E ,F0 F- E//5F)$F-(F))F-09G.-G.c[5nSnU(Ga`USS:XdUS[;GaIUSS:XaTU(a/URR[R "S55 UR[ 5 SnUSSnM|USS:XaVUR[USS 55 USSnURR[R "S 55 SnMU(aBUS RS :wa/URR[R "S 55 [U5up0SnURU5 U(aUSS:XaGM9US[;aGMIU(d%[R"SRU55eUSRS :Xd5USRS:XaQ[U5S:aBUSRS :Xa/URR[R "S55 US RS :Xd5US RS:XaQ[U5S:aBUSRS :Xa/URR[R "S55 UR(aSUlX4$![Ra US[;ae[U5up0GNf=f)z&obs-local-part = word *("." word) Frrr zinvalid repeated '.'TrNmisplaced-specialz/'\' character outside of quoted-string/ccontentr"rzmissing '.' between wordsz&expected obs-local-part but found '{}'rz!Invalid leading '.' in local partrRz"Invalid trailing '.' in local partr)r{r}r"rrrrrrmr{rCrjrkr;r)rrlast_non_ws_was_dotrXs rrrs"^N U1Xt^uQx{'B 8s?"&&--f.H.H*/,-  ! !# &"& !"IE  1Xt^  ! !-a0C#E F!"IE  " " ) )&*D*DB+D E"'   nR0;;uD  " " ) )&*D*D++- . +#E?LE"'  e$7 U1Xt^uQx{'B8 %% 4 ; ;E BD Dq$$- 1  ( (& 0  ! # 1  ( (% /%%f&@&@ /'1 2r%%. 2  ) )6 1  ! # 2  ) )5 0%%f&@&@ 0'2 3$<!   -&& +Qx{*#E?LE5 +s2J**/KKc[US5upn[US5nU(a/URR[R "S55 [ U5 X4$)adtext = / obs-dtext obs-dtext = obs-NO-WS-CTL / quoted-pair We allow anything except the excluded characters, but if we find any ASCII other than the RFC defined printable ASCII, a NonPrintableDefect is added to the token's defects list. Quoted pairs are converted to their unquoted values, so what is returned is a ptext token, in this case a ValueTerminal. If there were quoted-printables, an ObsoleteHeaderDefect is added to the returned token's defect list. z[]rz(quoted printable found in domain-literal)r:rr"rrr~r/)rrr8s r get_dtextrsV2%>E& % )E  V88 68 9E <rcU(agURR[R"S55 UR[ SS55 g)NFz"end of input inside domain-literal]domain-literal-endT)r"rrrr)rdomain_literals r_check_for_early_dl_endr'sE !!&"<"<,#./--ABC rc[5nUS[;a[U5up URU5 U(d[R "S5eUSS:wa%[R "SR U55eUSSnUR[SS55 [X5(aX4$US[;a[U5up URU5 [U5up URU5 [X5(aX4$US[;a[U5up URU5 [X5(aX4$USS:wa%[R "S R U55eUR[SS 55 USSnU(a+US[;a[U5up URU5 X4$) zAdomain-literal = [CFWS] "[" *([FWS] dtext) [FWS] "]" [CFWS] rzexpected domain-literal[z6expected '[' at start of domain-literal but found '{}'rNzdomain-literal-startrz4expected ']' at end of domain-literal but found '{}'r) rrjrkrrrCr;rrrHr>r)rrrXs rget_domain_literalr/s#_N Qx; e$ %%&?@@ Qx3%%'!!'0 0 !"IE--CDEu55$$ Qx3u~ e$U#LE% u55$$ Qx3u~ e$u55$$ Qx3%%'!!'0 0--ABC !"IE q[( e$   rcp[5nSnU(aUS[;a [U5up U(d%[R"SR U55eUSS:Xa*[ U5up0UbU/USS&URU5 X4$[U5up0U(aUSS:Xa[R"S5eUbU/USS&URU5 U(aUSS:XaURR[R"S55 USRS :XaUSUSS&U(aQUSS:XaHUR[5 [US S5up0URU5 U(a USS:XaMHX4$![Ra [U5up0GNf=f) zPdomain = dot-atom / domain-literal / obs-domain obs-domain = atom *("." atom)) Nrzexpected domain but found '{}'rrtzInvalid Domainr z(domain is not a dot-atom (contains CFWS)rbr)rYrjrkrrCr;rrrwrqr"r~rmr)rr(rzrXs r get_domainrVs XF F q[(   %% , 3 3E :< < Qx3)%0   E"1I e}'#E*  qS%%&677 Hbq  MM% qSf99 68 9 !9  : -q F1IaC MM# #E!"I.LE MM% aC =!  " "' u's F!F54F5cT[5n[U5up URU5 U(a USS:wa2URR[R "S55 X4$UR[ SS55 [USS5up URU5 X4$)z'addr-spec = local-part "@" domain rrtz#addr-spec local part with no domainaddress-at-symbolrN)rirrr"rrrr)rr/rXs r get_addr_specr|s I!%(LE U E!HO  !;!; 1"3 4 ]3(;<=eABi(LE U  rcJ[5nU(aUSS:Xd US[;apUS[;a[U5up URU5 O#USS:XaUR[5 USSnU(aUSS:XaMaUS[;aMpU(a USS:wa%[ R "SRU55eUR[5 [USS5up URU5 U(aUSS:XaUR[5 USSnU(dOUS[;a[U5up URU5 U(dOQUSS:Xa6UR[5 [USS5up URU5 U(a USS:XaMU(d[ R "S5eUSS:wa%[ R "S RU55eUR[SS 55 XSS4$) zobs-route = obs-domain-list ":" obs-domain-list = *(CFWS / ",") "@" domain *("," [CFWS] ["@" domain]) Returns an obs-route token with the appropriate sub-tokens (that is, there is no obs-domain-list in the parse tree). rr)rNrtz(expected obs-route domain but found '{}'z%end of header while parsing obs-route:z4expected ':' marking end of obs-route but found '{}'zend-of-obs-route-marker) rBrjrkr ListSeparatorrrCr;RouteComponentMarkerrr)r obs_routerXs r get_obs_routers I U1Xs]eAh+&= 8{ "#E?LE   U # 1X_   ] +!"IE U1Xs]eAh+&= E!HO%% 6 = =e DF F )*eABi(LE U E!HcM'ab   8{ "#E?LE   U #  8s?   1 2%eABi0LE   U # E!HcM %%&MNN Qx3%%(''-ve}6 6 ]3(ABC ABi rc[5nU(a+US[;a[U5up URU5 U(a USS:wa%[R "SR U55eUR[SS55 USSnU(a[USS:XaRUR[SS55 URR[R"S 55 USSnX4$[U5up URU5 U(aUSS:XaUSSnO/URR[R"S 55 UR[SS55 U(a+US[;a[U5up URU5 X4$![R a [U5up URR[R"S 55 O=![R a& [R "S R U55ef=fURU5 [U5up GNVf=f) zzangle-addr = [CFWS] "<" addr-spec ">" [CFWS] / obs-angle-addr obs-angle-addr = [CFWS] "<" obs-route addr-spec ">" [CFWS] rzangle-addr-endznull addr-spec in angle-addrz*obsolete route specification in angle-addrz.expected addr-spec or obs-route but found '{}'z"missing trailing '>' on angle-addr) r3rjrkrrrCr;rr"rrrr~)r angle_addrrXs rget_angle_addrrs( J q[( % E!HO%% 0 7 7 >@ @mC);<= !"IE qS--=>?!!&"<"< *#, -ab    ,$U+ e qSab !!&"<"< 0#2 3mC)9:; q[( %  )  " " , P(/LE    % %f&A&A<'> ?&& P))@GGNP P P % $U+ u ,s*5 F((I=.'s' 3 11 a33 4 4 1')r) rFrrrCrr;anyrFrmr)rrrXs r get_mailboxrs iGA$U+   3 % 1 1 333. NN5 >  " "A A(/LE5&& A))188?A A AAs AB7* A99:B33B7c[5nU(ajUSU;aaUS[;a$UR[USS55 USSnO[ U5up0URU5 U(a USU;aMaX 4$)zRead everything up to one of the chars in endchars. This is outside the formal grammar. The InvalidMailbox TokenList that is returned acts like a Mailbox, but the data attributes are None. rrrN)rTr}rrr)rr3invalid_mailboxrXs rget_invalid_mailboxr-s%&O E!HH, 8{ "  " "=q1D$F G!"IE%e,LE  " "5 ) E!HH,  !!rc[5nU(aUSS:wa[U5up URU5 U(acUSS;aZUSnS Ul [US5up URU5 URR[R"S55 U(a#USS:XaUR[5 US SnU(a USS:waMX4$![RGa_ SnUS[ ;a[ U5up0U(a USS;aCURU5 URR[R"S55 GN-[US5up UbU/USS&URU5 URR[R"S55 GNUSS:Xa2URR[R"S55 GN[US5up UbU/USS&URU5 URR[R"S55 GNf=f) a)mailbox-list = (mailbox *("," mailbox)) / obs-mbox-list obs-mbox-list = *([CFWS] ",") mailbox *("," [mailbox / CFWS]) For this routine we go outside the formal grammar in order to improve error handling. We recognize the end of the mailbox list only at the end of the value or at a ';' (the group terminator). This is so that we can turn invalid mailboxes into InvalidMailbox tokens and continue parsing any remaining valid mailboxes. We also allow all mailbox entries to be null, and this condition is handled appropriately at a higher level. r;Nz,;zempty element in mailbox-listzinvalid mailbox in mailbox-listr)r"rr)rrrrrCrjrkr"r~rrrmrVr)r mailbox_listrXrzrs rget_mailbox_listr?s(=L E!HO 8&u-LE    &4 U1XT)#2&G!2G .ud;LE NN5 !  ' '(B(B1)3 4 U1X_    .!"IEQ E!HOR  K&& 8FQx;& ( aD 0 ''/ ((//0K0K719:$7ud#CLE)%+Hbq  ''. ((//0J0J91;<qS$$++F,G,G3-56 35$? %!'E"1I##E*$$++F,F,F5-78/ 8s&C##BI'AI8I >u EG G LLs$678 !"IE q[(  U <rc$[5n[U5up UR U5 X4$![RaN [ U5up N8![Ra& [R"SR U55ef=ff=f)aaddress = mailbox / group Note that counter-intuitively, an address can be either a single address or a list of addresses (a group). This is why the returned Address object has a 'mailboxes' attribute which treats a single address as a list of length one. When you need to differentiate between to two cases, extract the single element, which is either a mailbox or a group token. zexpected address but found '{}')rrrrCrr;r)rrrXs r get_addressrs"iGA '  NN5 >  " "A A&u-LE5&& A))188?A A AAs -B A:B  Bc[5nU(a[U5up URU5 U(afUSS:wa]USSnSUl [US5up URU5 URR[R"S55 U(aUR[5 US SnU(aMX4$![RGas SnUS[ ;a[ U5up0U(a USS:XaCURU5 URR[R"S55 GN[US5up UbU/USS&UR[U/55 URR[R"S55 GNUSS:Xa2URR[R"S55 GN[US5up UbU/USS&UR[U/55 URR[R"S55 GN!f=f) aaddress_list = (address *("," address)) / obs-addr-list obs-addr-list = *([CFWS] ",") address *("," [address / CFWS]) We depart from the formal grammar here by continuing to parse until the end of the input, assuming the input to be entirely composed of an address-list. This is always true in email parsing, and allows us to skip invalid addresses to parse additional valid ones. Nrr)z"address-list entry with no contentzinvalid address in address-listzempty element in address-listr"rr)rrrrrCrjrkr"r~rrrrmrVr)r address_listrXrzrs rget_address_listrs=L  8&u-LE    &4 U1X_#2&q)G!2G .uc:LE NN5 !  ' '(B(B1)3 4     .!"IEQ %R  K&& 8FQx;& ( aC ''/ ((//0K0K<1>?$7uc#BLE)%+Hbq  ''(89 ((//0J0J91;<qS$$++F,G,G3-56 35#> %!'E"1I##GUG$45$$++F,F,F5-78/ 8s&C BIA!I38I.A!IIc[5nU(d%[R"SRU55eUSS:wa%[R"SRU55eUR [ SS55 USSn[ U5up UR U5 U(a USS:wa%[R"S RU55eUR [ SS 55 XSS4$) z%no-fold-literal = "[" *dtext "]" z'expected no-fold-literal but found '{}'rrz;expected '[' at the start of no-fold-literal but found '{}'zno-fold-literal-startrNrz9expected ']' at the end of no-fold-literal but found '{}'zno-fold-literal-end)rgrrCr;rrr)rno_fold_literalrXs rget_no_fold_literalrs$oO %% 5 < " [CFWS] id-left = dot-atom-text / obs-id-left id-right = dot-atom-text / no-fold-literal / obs-id-right no-fold-literal = "[" *dtext "]" rrzexpected msg-id but found '{}'z msg-id-startrNzobsolete id-left in msg-idz4expected dot-atom-text or obs-id-left but found '{}'rtzmsg-id with no id-rightrz msg-id-endrzobsolete id-right in msg-idzFexpected dot-atom-text, no-fold-literal or obs-id-right but found '{}'zmissing trailing '>' on msg-id)rrjrkrrrCr;rrtrr"r~rrr)rmsg_idrXs r get_msg_idr%s WF q[(  e E!HO%% , 3 3E :< < MM-^45 !"IE 1(/  MM% E!HOf88 %' ( U1X_ MM-\: ;!"IE} MM-%89: !"IE 5(/  MM% qSab f88 ,. / MM-\23 q[(  e =a  " " 1 1-e4LE NN ! !&"="=,#. /&& 1))""(&-1 1 1 14  " " 5 5.u5LE5&& 5 5)%0 %%f&A&A1'344** 5--&&,fUm55 5  5 5sT G7 J 7J  URR[R "SR U555 U$![Ra_n[U5n[U5nURR[R "SR U555 SnAU$SnAff=f)z2message-id = "Message-ID:" msg-id CRLF zUnexpected {!r}zInvalid msg-id: {!r}N) rrrr"rrr;rCrXr)r message_idrXexs rparse_message_idrjsJ 2!%( %      % %f&@&@!((/'1 2   " "K '%e, !!  & &'='D'DR'H I K K KsA11C$ACC$c0[5nU(d1URR[R"S55 U$US[ ;aT[ U5up URU5 U(d/URR[R"S55 SnU(aAUSS:wa8US[ ;a+X0S- nUSSnU(aUSS:waUS[ ;aM+UR5(dZURR[R"SRU555 UR[US 55 O+[U5Ul UR[US 55 U(a+US[ ;a[ U5up URU5 U(a USS:wa`URb/URR[R"S 55 U(aUR[US 55 U$UR[SS 55 USSnU(a+US[ ;a[ U5up URU5 U(d>URb/URR[R"S 55 U$SnU(a/US[ ;a"X0S- nUSSnU(aUS[ ;aM"UR5(dZURR[R"S RU555 UR[US 55 O+[U5Ul UR[US 55 U(a+US[ ;a[ U5up URU5 U(aJURR[R"S55 UR[US 55 U$)zDmime-version = [CFWS] 1*digit [CFWS] "." [CFWS] 1*digit [CFWS] z%Missing MIME version number (eg: 1.0)rz0Expected MIME version number but found only CFWSr)r rNz1Expected MIME major version number but found {!r}r-digitsz0Incomplete MIME version; found only major numberzversion-separatorz1Expected MIME minor version number but found {!r}z'Excess non-CFWS text after MIME version)rr"rrHeaderMissingRequiredValuerjrkisdigitrr;rintrr)r mime_versionrXrs rparse_mime_versionrsj =L ##F$E$E 3%5 6 Qx; E"  ' '(I(IB)D E F E!HOa (C(ab  E!HOa (C >>  ##F$>$> ? F Fv N%P QM&':; [ M&(;< q[( E" E!HO    )  ' '(B(BB)D E     eW = > c+>?@ !"IE q[( E"     )  ' '(B(BB)D E F E!HK/(ab  E!HK/ >>  ##F$>$> ? F Fv N%P QM&':; [ M&(;< q[( E" ##F$>$> 5%7 8M%9: rc[5nU(ajUSS:waaUS[;a$UR[USS55 USSnO[ U5up URU5 U(a USS:waMaX4$)zRead everything up to the next ';'. This is outside the formal grammar. The InvalidParameter TokenList that is returned acts like a Parameter, but the data attributes are None. rrrrN)rr}rrr)rinvalid_parameterrXs rget_invalid_parameterrs)* E!HO 8{ "  $ $]583F&H I!"IE%e,LE  $ $U + E!HO  ##rc[U5nU(d%[R"SRU55eUR 5nU[ U5Sn[ US5n[U5 X 4$)a$ttext = We allow any non-TOKEN_ENDS in ttext, but add defects to the token's defects list if we find non-ttext characters. We also register defects for *any* non-printables even though the RFC doesn't exclude all of them, because we follow the spirit of RFC 5322. zexpected ttext but found '{}'Nttext)_non_token_end_matcherrrCr;rrrr/)rrbrs r get_ttextrsi u%A %% + 2 25 9; ; GGIE #e*+ E % )EE <rc[5nU(a+US[;a[U5up URU5 U(a2US[;a%[ R "SRU55e[U5up URU5 U(a+US[;a[U5up URU5 X4$)ztoken = [CFWS] 1*ttext [CFWS] The RFC equivalent of ttext is any US-ASCII chars except space, ctls, or tspecials. We also exclude tabs even though the RFC doesn't. The RFC implies the CFWS but is not explicit about it in the BNF. rexpected token but found '{}') rrjrkr TOKEN_ENDSrrCr;r)rmtokenrXs r get_tokenrsWF q[(  e qZ'%% + 2 25 9; ;U#LE MM% q[(  e =rc[U5nU(d%[R"SRU55eUR 5nU[ U5Sn[ US5n[U5 X 4$)a=attrtext = 1*(any non-ATTRIBUTE_ENDS character) We allow any non-ATTRIBUTE_ENDS in attrtext, but add defects to the token's defects list if we find non-attrtext characters. We also register defects for *any* non-printables even though the RFC doesn't exclude all of them, because we follow the spirit of RFC 5322. z expected attrtext but found {!r}Nr)_non_attribute_end_matcherrrCr;rrrr/rrbrs r get_attrtextr si #5)A %% . 5 5e <> >wwyH #h-. !EXz2HH ?rc[5nU(a+US[;a[U5up URU5 U(a2US[;a%[ R "SRU55e[U5up URU5 U(a+US[;a[U5up URU5 X4$)a3[CFWS] 1*attrtext [CFWS] This version of the BNF makes the CFWS explicit, and as usual we use a value terminal for the actual run of characters. The RFC equivalent of attrtext is the token characters, with the subtraction of '*', "'", and '%'. We include tab in the excluded set just as we do for token. rr) rrjrkrATTRIBUTE_ENDSrrCr;rrrrXs r get_attributer s I q[(  q^+%% + 2 25 9; ;&LE U q[(   rc[U5nU(d%[R"SRU55eUR 5nU[ U5Sn[ US5n[U5 X 4$)zattrtext = 1*(any non-ATTRIBUTE_ENDS character plus '%') This is a special parsing routine so that we get a value that includes % escapes as a single string (which we decode as a single string later). z)expected extended attrtext but found {!r}Nextended-attrtext)#_non_extended_attribute_end_matcherrrCr;rrrr/rs rget_extended_attrtextr0 sl ,E2A %% 7 > >u EG GwwyH #h-. !EX':;HH ?rc[5nU(a+US[;a[U5up URU5 U(a2US[;a%[ R "SRU55e[U5up URU5 U(a+US[;a[U5up URU5 X4$)z[CFWS] 1*extended_attrtext [CFWS] This is like the non-extended version except we allow % characters, so that we can pick up an encoded value as a single string. rr) rrjrkrEXTENDED_ATTRIBUTE_ENDSrrCr;rrs rget_extended_attributerB s I q[(  q44%% + 2 25 9; ;(/LE U q[(   rc[5nU(a USS:wa%[R"SRU55eUR [ SS55 USSnU(aUSR 5(d%[R"SRU55eSnU(aEUSR 5(a-X S- nUSSnU(aUSR 5(aM-USS :Xa5US :wa/URR [R"S 55 [U5Ul UR [ US 55 X4$) a!'*' digits The formal BNF is more complicated because leading 0s are not allowed. We check for that and add a defect. We also assume no CFWS is allowed between the '*' and the digits, though the RFC is not crystal clear on that. The caller should already have dealt with leading CFWS. r*zExpected section but found {}zsection-markerrNz$Expected section number but found {}r)0z'section number has an invalid leading 0r) rrrCr;rrrr"rrr)rrrs r get_sectionrX s7iG E!HO%%&E&L&L(-'/0 0 NN=&678 !"IE a((**%%'117@ @ F E!H$$&&(ab  E!H$$&&ayCFcMv999 ; <[GN NN=23 >rcb[5nU(d[R"S5eSnUS[;a [ U5up U(d%[R"SR U55eUSS:Xa[ U5up0O [U5up0UbU/USS&URU5 X4$)zquoted-string / attribute z&Expected value but found end of stringNrz Expected value but found only {}r) rrrCrjrkr;rnrr)rvrzrXs r get_valuerv s A %%&NOO F Qx;   %%'006v@ @ Qx3(/ u-e4  Hbq HHUO 8Orc& [5n[U5up URU5 U(a USS:XaAURR[R "SR U555 X4$USS:Xas[U5up SUlURU5 U(d[R"S5eUSS:Xa'UR[SS55 USS nSUl USS :wa[R"S 5eUR[S S 55 USS nU(a+US[;a[U5up URU5 S nUnUR(Ga$U(GaUSS :XGa[U5upSURnSnUR S:Xa3U(a USS:XaSnO7[#U5up(U(a USS:XaSnO[%U5up(U(dSnU(aeURR[R "S55 URU5 UHn U R&S:XdM/U S S &U n O UnO1S nURR[R "S55 U(a USS:XaS nO [)U5up UR(aUR S:afU(a USS:wa'URU5 UbU(aU5eUnX4$URR[R "S55 U(dHURR[R "S55 URU5 UcX4$GO1UbLUHn U R&S:XdM O W R&S:H URU 5 U R*UlUSS:wa%[R"SR U55eUR[SS55 USS nU(amUSS:wad[#U5up URU5 UR*UlU(a USS:wa%[R"SR U55eUR[SS55 USS nUbq[15n U(a]US[2;a[5U5up O(USS :Xa[S S5nUSS nO [7U5up U RU5 U(aM]U nO [)U5up URU5 UbU(aU5eUnX4$![Ra GNf=f! GN=f)aDattribute [section] ["*"] [CFWS] "=" value The CFWS is implied by the RFC but not made explicit in the BNF. This simplified form of the BNF from the RFC is made to conform with the RFC BNF through some extra checks. We do it this way because it makes both error recovery and working with the resulting parse tree easier. rrz)Parameter contains name ({}) but no valuerTzIncomplete parameterzextended-parameter-markerrN=zParameter not followed by '='parameter-separatorrF'z5Quoted string value for extended parameter is invalidrzZParameter marked as extended but appears to have a quoted string value that is non-encodedzcApparent initial-extended-value but attribute was not marked as extended or was not initial sectionz(Missing required charset/lang delimitersrrz=Expected RFC2231 char/lang encoding delimiter, but found {!r}zRFC2231-delimiterz;Expected RFC2231 char/lang encoding delimiter, but found {}DQUOTE)rrrr"rrr;rrrCrrrjrkrnrrrrrmrrrrrrHr>r^) rrrXr5appendtoqstring inner_value semi_validrLtrs r get_parameterr s KE 'LE LL E!HO V779%%+VE]4 5| Qx3 &u-LE"EO LL ))*@A A 8s? LLs,GH I!"IE!EN Qx3%%&EFF LLs$9:; !"IE q[(  UIH ~~~%E!HO/u5,,    1 ${1~4! *;7 DGsN!%J &3K@ !%J  MM !;!;G"I J LL !<<#77AaD H   EI MM !;!;:"; < qS '  >>U11A5aC OOE "$ '%'y!<  V77 DE F  V77 68 9  <    <<#66 LLJ & OOA GGEM 8s?))+FFLfUmU U c+>?@ab  U1X_'.LE OOE "EJE!HO--/<?@ab  GQx3&u~ uqS%c84ab +E2  HHUOe '  OOE%y <i&&   D s%U0. V 0VV Vc[5nU(a[U5up URU5 U(aqUSS:wahUSnSUl [U5up URU5 URR[R"SRU555 U(a UR[SS 55 US SnU(aMU$![Ra SnUS[ ;a [ U5up0U(dURU5 Us$USS:XaFUbURU5 URR[R"S55 GNE[U5up U(aU/USS&URU5 URR[R"SRU555 GNf=f) aparameter *( ";" parameter ) That BNF is meant to indicate this routine should only be called after finding and handling the leading ';'. There is no corresponding rule in the formal RFC grammar, but it is more convenient for us for the set of parameters to be treated as its own TokenList. This is 'parse' routine because it consumes the remaining value, but it would never be called to parse a full header. Instead it is called to parse everything after the non-parameter value of a specific MIME header. Nrrzparameter entry with no contentzinvalid parameter {!r}r"rz)parameter with invalid trailing text {!r}rr)rrrrrCrjrkr"rrr;rmrVr)rmime_parametersrXrzrs rparse_mime_parametersr s%&O  =(/LE  " "5 )( U1X_$B'E2E 07LE LL   # # * *6+E+E;BB5I,K L   " "=6K#L M!"IEG %H A&& =FQx;& ( &&v.&&Qx3%#**62''..v/I/I5078 5U; !'E"1I&&u-''..v/I/I,33E:0<=# =s CA G#'A G#6A)G#"G#cfU(ajUSS:waaUS[;a$UR[USS55 USSnO[U5up!URU5 U(a USS:waMaU(dgUR[SS55 UR[ USS55 g)zBDo our best to find the parameters in an invalid MIME header rrrrNr)r}rrrr) tokenlistrrXs r_find_mime_parametersrM s E!HO 8{ "   ]585HI J!"IE%e,LE   U # E!HO  ]3(=>? *5956rcf[5nU(d1URR[R"S55 U$[ U5up URU5 U(a USS:waCURR[R"S55 U(a [X5 U$URR5R5Ul UR[SS55 USSn[ U5up URU5 URR5R5UlU(dU$USS :waOURR[R"S RU555 U? U?[X5 U$UR[S S 55 UR[!USS55 U$![R aN URR[R"SRU555 [X5 Us$f=f![R aN URR[R"S RU555 [X5 Us$f=f) zmaintype "/" subtype *( ";" parameter ) The maintype and substype are tokens. Theoretically they could be checked against the official IANA list + x-token, but we don't do that. z"Missing content type specificationz(Expected content maintype but found {!r}rrzInvalid content typezcontent-type-separatorrNz'Expected content subtype but found {!r}rz> 02 3  '   LL E!HO V77 "$ %  !% / [[&&(..0EN LLs$<=> !"IE '   LLKK%%'--/EM   Qx3 V77 ( ) NEMe+  LLs$9:; LL&uQRy12 LQ  " " V77 6 = =e DF Ge+ &  " " V77 5 < . s'.,qa!>!>??,r unknown-8bitTrr[r"rrrr)!max_line_lengthsysmaxsizeutf8rrrrrm SPECIALSNL isdisjointNLSETrrrrF_fold_mime_parametersrPrur_rrrrrn _fold_as_ewrvrLrHr4rrinsert) parse_treer\maxlenrrleading_whitespacelast_ew last_charsetr want_encodingend_ew_not_allowedrrQtstrr encoded_partnewlinewhitespace_accumulatorcharnewpartsps rr]r] sV  # # 2s{{F ++w:H DEGLM!"&:;  E yy|  % ! #  4y"44$.$9$9$$? ? %*$4$4T$: :  ! KK !G ??/ / !$v @  !3%% % ''#'99F9#;Q=Q#RL~~\9|,vE"I/FF&CE&JG!LL1b \1  4**T U* % ##'+!^3!W,J1F"G%d6&*&=&=w\&("& %  %  t9U2Y/ / "I I     D A '3E:G$--//  Wt^,)+&!"ID3*11$7&&(WW-C%D"tX&&DzH"66 #301&(&##4Q#7A&((#3012 %%"a'" 23u$E    &8 LLD ! M /6 d))++ LL4 ( "I IK %N >>  u % 66o" !. ,,...(" M !B(sQ R 2RRcjUb0U(a)[[USUSU-55nUSSUUS'OPUS[;aCUSnUSSn[US5U:XaUR [ U55 US==U- ss'SnUS[;a USnUSSnUc[US5OUn US:XaSOUn [U 5S-n U S-U:a[ R"S 5eU(Ga.U[US5- n X- [U5- n U S::aUR S 5 MB[U5S:a<[US5S:Xa*U(a#[R"XjS 9nUS==U- ss'SnUSU n[R"XS 9n[U5U - nUS:a/USSn[R"XS 9n[U5U - nUS:aM/US==U- ss'U[U5SnSnU(aUR S 5 [US5n U(aGM.US==U- ss'U(aU $S$) aFold string to_encode into lines as encoded word, combining if allowed. Return the new value for last_ew, or None if ew_combine_allowed is False. If there is already an encoded word in the last line of lines (indicated by a non-None value for last_ew) and ew_combine_allowed is true, decode the existing ew, combine it with to_encode, and re-encode. Otherwise, encode to_encode. In either case, split to_encode as necessary so that the encoded segments fit within maxlen. Nr"rrr)rrz3max_line_length is too small to fit an encoded wordr)r) rrXrHrrrrrCrEr) to_encoderr$r&rvrr% leading_wsp trailing_wsp new_last_ew encode_as chrome_lenremaining_space text_space encoded_wordto_encode_wordexcesss rr!r! s^1 U2Ywx09< =? "Ihw'b 1  l abM b Nf $ LL6u= > b [ L} } crN $+O#eBi.K"j0gIY!#JQ6!%% AC C  3uRy>1$1C8J4KK ? LL   u:>c%)n16H::&8LL "I %I!# ";J/zz.D \"_4qj,CR0N::nHL&8F qj b \! c.123   LL eBi.K? )@ "II,;6$6rc @URGHupEUSR5RS5(d US==S- ss'UnSnURU5 SnU(a2[RRUS US 9n S RXFU 5n OS RU[U55n [US5[U 5-S -U:aUSS-U -US'M[U 5S-U::aURSU -5 MSn US-n U(dGM[U5[[U 55-S-[U 5-n X-S-::aSnX-- S- =pUSUn[RRUS US 9n [U 5U::aOUS -nM;URSRXKX55 S n U S - n X^SnU(a US==S- ss'U(aMGM g![a* Sn[ R "U5(aSnSnGNSnGNf=f)a*Fold TokenList 'part' into the 'lines' list as mime parameters. Using the decoded list of parameters and values, format them according to the RFC rules, including using RFC2231 encoding if the value cannot be expressed in 'encoding' and/or the parameter+value is too long to fit within 'maxlen'. r"rstrictFTrrrr))saferz {}*={}''{}rrrrrz''rmNNz {}*{}*={}{})rrprrrrrrrrr;rrrr)rQrr$rrrr error_handlerencoding_required encoded_valuer*r extra_chromer8 splitpointmaxcharspartials rr r  s:{{  Ry!**3// "I I  " LL " %  "LL..B}/6M&&tmDD>>$ U(;CI % )F 2b C$.E"I  Y]f $ LLt $ ~ eTSW%66:S=NNJa' $*$7!$; ;J , & 2 2"]!3!< }%1a  LL..|< =L qLG+&Eb S -eI#" " $ $$U++( 1 !  "s G))+HHH)rN)r(rerrstringroperatorremailrrErrrurHrjryrprvr} TSPECIALSr ASPECIALSrrrrrrcompileVERBOSE MULTILINErSrrr|rrrrrrrrrrrrrrrrr3rBrFrTrYrardrgrir{r}rrrrrrrrrrrrrrrrrrrrrrr!rCr'rrrPrurr;r4r]r1r7matchrafindallr+rrrr/r:r>rOrXr\r^rcrfrhrkrnrqrtrwr{rrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr rrr]r!r rrrrTs CJ '  %jCHn   sN CH$ U# E "c#h . _ E " S(3s83 t    @  **ZZ",, @,@,FD)D I Y9"9I )#9#6 -| -!4C)C&%i%2 ?) ?% %"$I$*"y"6 DDyD!i!6;Y;.Y.i) I yB9 -!&-!`! !H I  ) 8%y% # #i I S.YS.l y *1 i ) *I*&y&Y(+s(+VH->I$4$4IIbggj!"%$%%*UZZ (8(8IIbggn%&)()).&(jj1A1AIIbgg-./21'227%$;J@ /bAF"  ))V2  $6 &2 D$L%N2!h(%!N$L ) V,\ "H*"$6r#J<:4n&,BJ8BH$$&.&.$,<,KZ2h7 6p<^Y7vJ7XI!r