hangnum corr
hangnumber and dot combinations
search:
/<p rend=[^\w]hangnum[^\w]>([^<>]+?)<hi rend=[^\w]dot[^\w]>.<\/hi><\/p>/
replace:
<div hangnum><span para #$1>[$1]</span></div>

Only in one file in the four sections:
search:
/<p rend=[^\w]hangnum[^\w]><hi rend=[^\w]dot[^\w]>.<\/hi>([^<>]+?)<\/p>/
replace:
<div hangnum><span para #5>[$1]</span></div>

"<p rend="hangnum">(ка)</p>" cases
search:
/<p rend=[^\w]hangnum[^\w]>\(([^><]+?)\)<\/p>/
replace:
<div hangnum>($1)</div>

hangnum+paranum cases
search:
<p rend=[^\w]hangnum[^\w] n=[^\w]([^<>]*?)[^\w]><hi rend=[^\w]paranum[^\w]>([^<>]*?)<\/hi>[\. ,]*([^\n]*)<\/p>[\s]*
replace:
<div hangnum><span para #para_$1>[$2]</span></div> $3\n\n
(Unwanted side effects here include: ...
<hi rend="bold"></hi>.</p> -> ...
<hi rend="bold"></hi>.)
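The hangnum+paranum rule above can be tried outside the editor with a small Python sketch. It assumes Python's `re` module behaves like the (unnamed) search tool used for these notes, and it rewrites `$1` style references as `\1`. The names `HANGNUM_PARANUM`, `REPLACEMENT`, and `convert_hangnum_paranum` are hypothetical, and the sample line is made up for illustration.

```python
import re

# Pattern copied from the search line of the hangnum+paranum rule.
# [^\w] stands in for whichever quote character surrounds the attribute value.
HANGNUM_PARANUM = re.compile(
    r'<p rend=[^\w]hangnum[^\w] n=[^\w]([^<>]*?)[^\w]>'
    r'<hi rend=[^\w]paranum[^\w]>([^<>]*?)<\/hi>'
    r'[\. ,]*([^\n]*)<\/p>[\s]*'
)

# Same replacement as in the notes, with $n rewritten as \n group references.
REPLACEMENT = r'<div hangnum><span para #para_\1>[\2]</span></div> \3\n\n'


def convert_hangnum_paranum(text: str) -> str:
    """Apply the hangnum+paranum rule to a whole page of markup."""
    return HANGNUM_PARANUM.sub(REPLACEMENT, text)


# Made-up sample paragraph, not taken from the source files.
sample = '<p rend="hangnum" n="12"><hi rend="paranum">12</hi>. Some text</p>\n'
print(convert_hangnum_paranum(sample))
```

Running the rule on a copy of a page first makes it easier to spot the unwanted side effects mentioned above before they reach the real files.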

dot bold corr
(Exception: not in :tipitaka:sut.kn.thi.05, where it is a dot at the end of a sentence.)
search:
<hi rend=[^\w]bold[^\w]><\/hi>\.|<hi rend=[^\w]bold[^\w]>\.<\/hi>
replace with nothing, and for :tipitaka:sut.kn.thi.05 with
.

Special cases of "lost" hangnum
<div gatha3>បញ្ញាធិតិសីលគុណោឃវិន្ទំ,</div>
<p rend="hangnum">វន្ទេ មុនិមន្តិមជាតិយុត្តំ។</p>
<p rend="gathalast"></p>
search:
<div gatha([0-9]+)>(.+?)<\/div>\n\n<p rend=[^\w]hangnum[^\w]>(.+?)<\/p>\n\n<p rend=[^\w]gathalast[^\w]>[\s]*<\/p>
replace:
<div gatha$1>$2</div>\n\n<div gathalast>$3</div>

strange + in tika:abh.pa.31_tik#10299 and :tipitaka:sut.dn.20#26891
<p rend="hangnum">д̇ам̣ гаммапассад̣̇ваарзхи, д̣̇увид̇хам̣ самбавад̇д̇ад̇и.+</p>
search:
\+<\/div>
replace:
</div>

strange () and lost hangnum in gatha
<div gatha3>г̇анд̇аб̣б̣ам̣ во суг̇ахид̇амид̣̇ам̣ бхаасид̇ам̣ бхигкунод̇и</div>
<p rend="hangnum"> </p>
<div gathalast> чад̣д̣зд̇аб̣б̣ам̣ гаважанамид̇арам̣ д̣̇уг̇г̇ахийд̇анд̇и но жз;()</div>
search:
<div gatha([0-9]+)>(.+?)<\/div>\n\n<p rend=[^\w]hangnum[^\w]>[\s]+?<\/p>\n\n<div gathalast>(.*?)<\/div>
replace:
<div gatha$1>$2</div>\n\n<div gathalast>$3</div>

strange () in :anya:siha-gs.jiva_any: 1798 matches x 4! (worth checking whether the source had been touched)
search:
\(\)<\/div>
replace:
</div>

The last 19 matches of "hangnum" on 12 pages were corrected manually, matching single similar cases with exceptions. Hopefully done well; to be listed for further language implementations.
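The gatha and stray-character fixes above can be sketched as an ordered list of rules in Python. This is only an illustration: it assumes Python's `re` semantics match the tool used for these notes, keeps the order in which the rules are listed here, and uses `re.subn` so that match counts (such as the 1798 reported above) can be compared. `RULES` and `apply_rules` are hypothetical names, and the sample text is made up.

```python
import re

# (label, search, replacement) in the order the rules appear in these notes.
RULES = [
    ("lost hangnum before empty gathalast",
     r'<div gatha([0-9]+)>(.+?)<\/div>\n\n<p rend=[^\w]hangnum[^\w]>(.+?)<\/p>'
     r'\n\n<p rend=[^\w]gathalast[^\w]>[\s]*<\/p>',
     r'<div gatha\1>\2</div>\n\n<div gathalast>\3</div>'),
    ("strange +", r'\+<\/div>', '</div>'),
    ("blank hangnum inside gatha",
     r'<div gatha([0-9]+)>(.+?)<\/div>\n\n<p rend=[^\w]hangnum[^\w]>[\s]+?<\/p>'
     r'\n\n<div gathalast>(.*?)<\/div>',
     r'<div gatha\1>\2</div>\n\n<div gathalast>\3</div>'),
    ("strange ()", r'\(\)<\/div>', '</div>'),
]


def apply_rules(text: str):
    """Run every rule in order; return the new text and per-rule match counts."""
    counts = {}
    for label, search, replacement in RULES:
        text, counts[label] = re.subn(search, replacement, text)
    return text, counts


# Made-up sample in the shape of the "strange () and lost hangnum" case.
sample = ('<div gatha3>line a</div>\n\n<p rend="hangnum"> </p>\n\n'
          '<div gathalast> line b;()</div>')
fixed, counts = apply_rules(sample)
print(fixed)
print(counts)
```

Keeping the counts per rule gives a quick check that a rerun on an already corrected page reports zero matches.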
Exceptions (requiring further work later):
.:anya:niti-gs.kav05_any missing paragraph after 225
.:tika:vin.vviu.05_vviu_tik misses hangnum for paragraphs
.:tipitaka:sut.kn.jat.v22 some hangnums are headers
Pages that have lost content (the "strange ()" issue):
siha-gs.jiva_any (e1208n.nrf.xml)
Suspected files (also indexing errors):
abh.pa.70_tik ?
cs-th:atthakatha:sut.kn.pat.v1.01_att
cs-th:tika:sut.mn.v01_tik