The last config version has a regression: secondary/tertiary body tags are removed even though they contain crucial attributes (onload, style, ...). This also breaks the FOX TV community sites (http://community.myfoxny.com/ , ...).
This only affects config setups with "2.1 Never alter Page/Link Styles" turned off (default: on) *and* "Remove duped Body Tags" turned on (default).
Fix below.
Code:
[Patterns]
Name = "<html><body>: Mark First - Remove Dupes 9.03.19 (multi) [sd] (d.r)"
Active = TRUE
Multi = TRUE
URL = "$TYPE(htm)(^$TST(spBounds=*))"
Limit = 4096
Match = "<html(\s[^>]+|)>"
"(^$TST(script=*)|$TST(comment=1)|$TST(tAnc=j)|$TST(tFrameset=2))"
"("
"$TST(mHtml=([#1:*])\0)"
"$TST((\0+)=$LST(Count)|*)$SET(mHtml=$GET(i))"
" ($TST(volat=*.log:2*)$ADDLST(Log-Main,[$DTM(d T)]\tWEB Multi_html $GET(mHtml) \t\u)|)"
"|$SET(mHtml=1)PrxFail$TST()"
")"
"|"
"<body(\s$INEST(<body,>)>|>)\3"
"(^$TST(script=*)|$TST(comment=1)|$TST(tAnc=j)|$TST(tNoframes=1)|$TST(tNoembed=1)|$TST(tFrameset=2))"
"("
"$TST(mBody=([#1:*])\0)(^$TST(cType=xhtml))"
"("
"$TST((\0+)=$LST(Count)|*)$SET(mBody=$GET(i))"
" ($TST(volat=*.log:2*)$ADDLST(Log-Main,[$DTM(d T)]\tWEB Multi_body $GET(mBody) \3 \t\u)|)"
"&$TST(keyword=*.i_mbody:[12].*)(^$TST(\0=[#*:3])$TST(\3="
"*\s(("
"(onload)\3=$AV( (^(mm_|)preload)?*)|(b(ackground|gcolor)|class|id|style)\3="
")$SET(eBodyT=$GET(eBodyT) \3))\3*>"
"))"
")"
"|(^$TST(\0=[#1:*]))$SET(mBody=1)PrxFail$TST()"
")"
"|"
"</(body|html)\2 >( |)(^"
"$TST(script=*)|$TST(comment=1)|$TST(tAnc=j)"
"|$TST(tFrameset=2)|$TST(preBlock=prx-*)|$TST(cType=xhtml)"
")$SET(1=</old-\2>)"
"|"
"</prox(body|html)\2>$SET(1=</\2>)"
Replace = "\1"
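For anyone who doesn't read Proxomitron matching syntax, here is a rough Python sketch of the behaviour the fix is meant to restore: the first body tag is kept, and a later body tag is dropped only if it carries none of the crucial attributes. This is not a translation of the filter above. The function name and the exact attribute list are assumptions made for illustration, and the real filter also handles html tags, closing tags, scripts, comments, framesets and logging, none of which is modelled here.

```python
import re

# Attribute names treated as "crucial" here. This list is an assumption,
# loosely based on the attributes named in the filter above.
CRUCIAL_ATTRS = ("onload", "style", "class", "id", "background", "bgcolor")

# Matches an opening <body> tag, capturing its attribute string (if any).
BODY_TAG = re.compile(r"<body(\s[^>]*)?>", re.IGNORECASE)

# Matches any crucial attribute name followed by "=" inside the attribute string.
CRUCIAL = re.compile(r"\s(?:%s)\s*=" % "|".join(CRUCIAL_ATTRS), re.IGNORECASE)


def drop_duplicate_body_tags(html: str) -> str:
    """Keep the first <body> tag; drop later ones unless they carry crucial attributes.

    Hypothetical helper for illustration only.
    """
    seen = 0

    def repl(match: re.Match) -> str:
        nonlocal seen
        seen += 1
        if seen == 1:
            return match.group(0)  # always keep the first body tag
        attrs = match.group(1) or ""
        # Later body tags survive only if they have a crucial attribute.
        return match.group(0) if CRUCIAL.search(attrs) else ""

    return BODY_TAG.sub(repl, html)
```

With this sketch, a bare second body tag disappears, while a second body tag with an onload or style attribute is left alone, which is the case the regression broke.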