
Topic Summary

Posted by: Johann
« on: April 11, 2019, 12:38:23 PM »

Currently working on the "single-sutta release" files. With about 40,000 headers this can take some time, but it would then finally also provide the values for the {no..} replacements (for links to them) in the source files.

Since making single files for the Atthakatha and Tika as well would cause a huge number of files, unless they are skipped, Atma thought of including the related commentaries directly in the Sutta (Mula) files.
Posted by: Johann
« on: April 02, 2019, 08:18:28 AM »

{lang} and {ns-section} have now been replaced on all pages except the 416 pages in cs-th (Thai: 268 pages in Atthakatha and 148 pages in Tika).

The further replacements ({file}, {path-source}...) could be made according to the list above, either page by page or with a script using the list. {file+}, {file-} etc., however, may need further processing later; the same goes for {no}...

Sadhu for the great work and assistance of many in bringing the first four languages here and making them available for the Sangha and those with Nissaya.

Atma will look after the last XML conversions into ati-syntax in the Khmer pages and then after the CSS for "good" layouts.

An Excel file that helps with creating the release files, also for languages to come, can be used: renaming_list.xlsx. To extract the lists into directories and files for an upload, the tool "Converting lists into txt-files - Tools for Ati.eu" can be used.
Posted by: Johann
« on: April 01, 2019, 12:23:19 PM »

One or more posts have been cut out of this topic. A new topic based on them has been created as "[ati.eu] Indexing, search engine ", or they have been attached there.
Posted by: Johann
« on: March 23, 2019, 11:45:38 AM »

Main indexes in the four scripts should be fine and complete now:

Tipiṭaka (Roman)
តិបិដក (បាឡិ​ខ្មែរ) ติปิฎก (Thai) д̇ибидага (кириллица)
My person is currently trying to rebuild the index with the update option, which actually seems to be twice as slow as building it anew, but would possibly not end with no index at all when stopped in between (about 3,000 of 20,500 pages indexed since this morning).
Posted by: Johann
« on: March 17, 2019, 11:41:59 AM »

List of renaming of the index files (toc.xml): renaming_files#index-files_toc
Posted by: Johann
« on: March 16, 2019, 04:25:30 PM »

All files have been uploaded anew so far. The Khmer files still need some remaining replacements of XML codes. Renamed files have been deleted.

Once the index is rebuilt, the last replacements can be made.

As for the replacements of the placeholders {file}, {ns-section}..., it may be good to run similar scripts on the server.

In regard to {no}: no overall idea for now, so maybe good to leave it as before.
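
One possible way to fill {no}, given only as a sketch: it assumes (this is not stated anywhere in the thread) that {no} should simply count the header lines of one file from 1, and that all {no} occurrences on one "sang_id" line belong to the same header. The function name and the start value are made up for the illustration.

```python
def number_placeholders(text, placeholder="{no}", start=1):
    """Replace the placeholder line by line with a running number.

    Assumption: every line that contains the placeholder belongs to exactly
    one header, so all occurrences on that line get the same number.
    """
    counter = start
    out = []
    for line in text.split("\n"):
        if placeholder in line:
            line = line.replace(placeholder, str(counter))
            counter += 1
        out.append(line)
    return "\n".join(out)
```

Whether the numbers have to follow the sutta numbering instead of a plain count would need to be checked against the sources before using anything like this.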

Attached is an Excel list containing all particular replacements for each single file.
Posted by: Johann
« on: March 16, 2019, 01:02:54 AM »

Atma will upload the renamed files with their original content and try again to make the replacements online with batchedit, having come across that Notepad++ sometimes loses found matches and gives nothing back when replacing.
In this way, at least, the originals would be stored on ati as well. Let's see whether web space and sun allow it over the next days.
Posted by: Johann
« on: March 15, 2019, 10:58:27 AM »

And using PowerShell, such as ((Get-Content vin.par.ve.txt -Raw) -replace '{lang}','cs-km') | Set-Content vin.par.ve.txt, destroyed the files, possibly a UTF-8 issue... (and no backup had been made...)

all once again  ^-^ :)
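
A possible cause, assuming Windows PowerShell 5.1 was used: without an -Encoding parameter, Get-Content reads a UTF-8 file that has no BOM in the system's ANSI code page, and Set-Content writes it back in that code page, which garbles Khmer or Thai text. A minimal sketch of the same replacement in Python, with the encoding stated explicitly and a backup taken first (the function name and the ".bak" suffix are only illustrative):

```python
from pathlib import Path
import shutil

def replace_placeholder(path, placeholder, value):
    """Replace a literal placeholder in a UTF-8 file, keeping a .bak copy."""
    path = Path(path)
    # Back up first, so a failed run cannot destroy the only copy.
    shutil.copy2(path, path.with_name(path.name + ".bak"))
    # Read and write explicitly as UTF-8; newline="" leaves line endings untouched.
    with open(path, encoding="utf-8", newline="") as f:
        text = f.read()
    # str.replace treats "{lang}" literally, so no regex escaping is needed.
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(text.replace(placeholder, value))
```

For example, replace_placeholder("vin.par.ve.txt", "{lang}", "cs-km") would do what the PowerShell line above was meant to do.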
Posted by: Johann
« on: March 15, 2019, 10:39:40 AM »

Further edits:

###DOT

Code: [Select]
<hi rend="dot">\.</hi>

.

###HANGNUM INTO HEADER (JAT only) multiline

Code: [Select]
===== ([^<>]*?) =====\n<span sang_id #([^\n]*?)</span>(.*?)<p rend=[^\w]hangnum[^\w]>[\s]*<\/p>[\r\n]+ ([១២៣៤៥៦៧៨៩០1234567890๑๒๓๔๕๖๗๘๙๐\-]+)\. ([^\n]*?)\n

===== $1 =====\n<span sang_id #{file+}>[[{path-release}:{file+}|{file+}]] | [[{path-source}:{file}#{file+}|source]]</span>$3==== $4. $5 ====\n<span sang_id #{file-}.{no}>[[{path-release}:{file-}.{no}|{file-}.{no}]] | [[{path-source}:{file}#{file-}.{no}|source]]</span>\n

###HANGNUM INTO HEADER (JAT only) multiline [..] X.

Code: [Select]
===== ([^<>]*?) =====\n<span sang_id #([^\n]*?)</span>(.*?)<p rend=[^\w]hangnum[^\w]>[\s]*<\/p>[\r\n]+ \[([១២៣៤៥៦៧៨៩០1234567890๑๒๓๔๕๖๗๘๙๐\-]+)\] ([១២៣៤៥៦៧៨៩០1234567890๑๒๓๔๕๖๗๘๙๐\-]+)\. ([^\n]*?)\n

===== $1 =====\n<span sang_id #{file+}>[[{path-release}:{file+}|{file+}]] | [[{path-source}:{file}#{file+}|source]]</span>$3==== [$4] $5. $6 ====\n<span sang_id #{file-}.{no}>[[{path-release}:{file-}.{no}|{file-}.{no}]] | [[{path-source}:{file}#{file-}.{no}|source]]</span>\n

###HANGNUM INTO HEADER HH (JAT only) multiline

Code: [Select]
======= ([^<>]*?) =======[\r\n]+<span sang_id #\{file\}>\[\[\{path-release\}:\{file\}\|\{file\}\]\] \| \[\[\{path-source\}:\{file\}#\{file\}\|source\]\]<\/span>(.*?)<p rend=[^\w]hangnum[^\w]>[\s]*<\/p>[\r\n]+ ([១២៣៤៥៦៧៨៩០1234567890๑๒๓๔๕๖๗๘๙๐\-]+)\. ([^\n]*?)[\r\n]+

======= $1 =======\n<span sang_id #{file}>[[{path-release}:{file}|{file}]] | [[{path-source}:{file}#{file}|source]]</span>$2==== $3. $4 ====\n<span sang_id #{file-}.{no}>[[{path-release}:{file-}.{no}|{file-}.{no}]] | [[{path-source}:{file}#{file-}.{no}|source]]</span>\n\n

###HANGNUM INTO HEADER HH (JAT only) no NO.

Code: [Select]
<p rend=[^\w]hangnum[^\w]>[\s]*<\/p>[\r\n]+ ([^\n]*?)[\r\n]+

==== $1 ====\n<span sang_id #{file-}.{no}>[[{path-release}:{file-}.{no}|{file-}.{no}]] | [[{path-source}:{file}#{file-}.{no}|source]]</span>\n\n

###HANGNUM CORR (exception in bud-vgs.nk.2_any.txt and sut.sn.01.txt!!)

Code: [Select]
<p rend=[^\w]hangnum[^\w]>([^<>]+?)\.<\/p>

<div hangnum>$1.</div>

###Search "<p rend=[^\w]hangnum[^\w]>": a further 47 hits in 39 files, best handled one by one since there are many exceptions.

###BOLD corrections

without regex:

Code: [Select]
]</span> .

]</span>

then the earlier correction, ###HANGNUM CORR, again

###P HANGNUM HI PARANUM BOLD

Code: [Select]
<p rend=[^\w]hangnum[^\w] n=[^\w]([^<>]*?)[^\w]>[\s]*<hi rend=[^\w]paranum[^\w]>([^<>]*?)<\/hi>[\s]*<hi rend=[^\w]bold[^\w]>\.<\/hi>([^\n]*?)<\/p>[\s]*

<span para #para_$1>[$2]</span>$3\n\n

###further <hi rend="bold"> corr. are made on the single pages

###P HANGNUM HI PARANUM

Code: [Select]
<p rend=[^\w]hangnum[^\w] n=[^\w]([^<>]*?)[^\w]>[\s]*<hi rend=[^\w]paranum[^\w]>([^<>]*?)<\/hi>[\s]*<\/p>[\s]*

<span para #para_$1>[$2]</span>\n\n

###P INDENT PARANUM

Code: [Select]
<p rend=[^\w]indent[^\w] n=[^\w]([^<>]*?)[^\w]>[\s]*<hi rend=[^\w]paranum[^\w]>([^<>]*?)<\/hi>\. ([^\n]*?)<\/p>[\s]*

<span para #para_$1>[$2]</span> $3\n\n

###P HANGNUM HI PARANUM content

Code: [Select]
<p rend=[^\w]hangnum[^\w] n=[^\w]([^<>]*?)[^\w]>[\s]*<hi rend=[^\w]paranum[^\w]>([^<>]*?)<\/hi>[\. ]([^\n]*?)<\/p>[\s]*

<span para #para_$1>[$2]</span>$3\n\n

###GATHA PARANUM

Code: [Select]
<div gatha1[^\w] n=[^\w]([0-9]*?)><hi rend=[^\w]paranum[^\w]>([^<>]*?)<\/hi>[\. ]*([^\n]*?)</div>

<span para #para_$1>[$2]</span>\n\n<div gatha1>$3</div>

###Correction

Code: [Select]
<div gatha2" n="-><hi rend="paranum">-</hi>

<div gatha2>

###Manual corrections for all matches of "<p rend"

###Cleanings

Code: [Select]
\r\n

\n

There might be further XML tags left and small edits needed, but those can be made online.

Atma will now replace the placeholders (except {no} and {no+}, where he has no idea of how to process them correctly and effectively for now) and then upload all files anew.

(Note: processing the replacements online with batchedit is much faster than locally with Notepad++ (about 3-4 days). Of course, cleaning the cache and deleting the history online also takes a good while.)
Posted by: Johann
« on: March 12, 2019, 01:23:18 PM »

Status (local)

Some files have been re-renamed. Current list: Renaming of source files, renaming files.

Regex list for converting XML to the ati standard, as done for "cs-rm", "cs-km", "cs-th" and "cs-ru" at once.

Note that {...} strings will be replaced in a later, selective session. Replacements are done "single-line" unless mentioned otherwise.

##Starting with the header and footer, which replaces "content".

###HEADER multiline (10792 replacements)

Code: [Select]
[\s]*<\?xml(.+?)<body>[\s]*

<span hide>sources: cs-file name {cs file} path ati{lang}:{ns-section}:{file}</span>\n{{section>en:tech:template_includes#{lang}_header&nouser&nodate&noheader&noeditbutton&firstsectiononly}}\n<div {lang}>\n\n
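
To try the HEADER rule outside the editor, a small sketch in Python can help. It assumes that "multiline" here means that "." may also match line breaks (re.DOTALL in Python); the names HEADER, HEADER_RE and replace_header are only illustrative.

```python
import re

# Replacement text of the HEADER rule, with its \n written as real newlines.
HEADER = (
    "<span hide>sources: cs-file name {cs file} path ati{lang}:{ns-section}:{file}</span>\n"
    "{{section>en:tech:template_includes#{lang}_header&nouser&nodate&noheader&noeditbutton&firstsectiononly}}\n"
    "<div {lang}>\n\n"
)

# re.DOTALL lets "." cross line breaks, so everything from <?xml up to <body> is matched.
HEADER_RE = re.compile(r"[\s]*<\?xml(.+?)<body>[\s]*", re.DOTALL)

def replace_header(text):
    # A function as replacement keeps the braces in HEADER literal.
    return HEADER_RE.sub(lambda m: HEADER, text, count=1)
```

The lazy (.+?) stops at the first <body>, so nothing after the opening body tag is touched.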

###FOOTER multiline

Code: [Select]
[\s]*<\/body>(.*)<\/([^p]*?)>[\s]*

\n\n</div>\n{{section>en:tech:template_includes#{lang}_footer&nouser&nodate&noheader&noeditbutton&firstsectiononly}}

###CS-CD ANCHORS

Code: [Select]
<pb ed=[^\w]([^<>]*?)[^\w] n=[^\w]([^<>]*?)[^\w][\s]*\/>

<span anchor #$1_$2></span>
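
A quick check of the anchor rule on one sample tag, as a sketch in Python. The sample input is an assumption about how a page-break tag looks in the source files; note that Python writes back-references as \1 and \2 where the list uses $1 and $2.

```python
import re

PB_RE = re.compile(r'<pb ed=[^\w]([^<>]*?)[^\w] n=[^\w]([^<>]*?)[^\w][\s]*\/>')

def pb_to_anchor(text):
    # Edition ($1) and page number ($2) are joined with "_" to form the anchor id.
    return PB_RE.sub(r'<span anchor #\1_\2></span>', text)

print(pb_to_anchor('<pb ed="V" n="1.0001" />'))
# prints: <span anchor #V_1.0001></span>
```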

###BOLD

Code: [Select]
<hi rend=[^\w]bold[^\w]>([^\n]+?)<\/hi>

**$1**

###P CENTRE

Code: [Select]
<p rend=[^\w]centre[^\w]>(.*?)<\/p>

<div centeralign>$1</div>

###NOTE

Code: [Select]
<note>([^\n]+?)<\/note>

<span note>$1</span>

###P HI PARANUM DOT

Code: [Select]
<p rend=[^\w]bodytext[^\w] n=[^\w]([^<>]*?)[^\w]>[\s]*<hi rend=[^\w]paranum[^\w]>([^<>]*?)<\/hi>[\s]*<hi rend=[^\w]dot[^\w]>\.<\/hi>([^\n]*?)<\/p>[\s]*

<span para #para_$1>[$2]</span>$3\n\n
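
For testing rules from this list in sequence before running them on all files, a small runner might look like this. It is only a sketch: it takes the search and replace strings as written here, converts $1 to Python's \g<1> form, and applies the rules in the given order. RULES holds just the NOTE rule and the rule directly above as examples.

```python
import re

# (search, replace) pairs copied from the list, with "$1" style back-references.
RULES = [
    (r'<note>([^\n]+?)<\/note>',
     r'<span note>$1</span>'),
    (r'<p rend=[^\w]bodytext[^\w] n=[^\w]([^<>]*?)[^\w]>[\s]*<hi rend=[^\w]paranum[^\w]>([^<>]*?)<\/hi>[\s]*<hi rend=[^\w]dot[^\w]>\.<\/hi>([^\n]*?)<\/p>[\s]*',
     r'<span para #para_$1>[$2]</span>$3\n\n'),
]

def to_python_template(replacement):
    # Turn $1 into \g<1>, the unambiguous Python form of a back-reference.
    return re.sub(r'\$(\d+)', r'\\g<\1>', replacement)

def apply_rules(text, rules=RULES):
    # Apply every rule to the whole text, in list order.
    for search, replace in rules:
        text = re.sub(search, to_python_template(replace), text)
    return text
```

Running the same sample paragraph through Notepad++, batchedit and such a script would show quickly where the regex flavours differ.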

###P HI PARANUM DOT []

Code: [Select]
<p rend=[^\w]bodytext[^\w] n=[^\w]([^<>]*?)[^\w]><hi rend=[^\w]paranum[^\w]>([^<>]*?)[\. ]*?<\/hi>[\. ]*?([^\n]*?)<\/p>[\s]*

<span para #para_$1>[$2]</span> $3\n\n

###P

Code: [Select]
<p rend=[^\w]bodytext[^\w]>([^\n]+?)<\/p>[\s]*

$1\n\n

###P HANGNUM HI PARANUM DOT

Code: [Select]
<p rend=[^\w]hangnum[^\w] n=[^\w]([^<>]*?)[^\w]><hi rend=[^\w]paranum[^\w]>([^<>]*?)<\/hi>[\. ]*<hi rend=[^\w]dot[^\w]>[\. ]*<\/hi>[\. ]*([^\n]*)<\/p>[\s]*

<div hangnum><span para #para_$1>[$2]</span></div> $3\n\n

###GATHA

Code: [Select]
<p rend=[^\w]gatha([^<>]*?)[^\w]>([^\n]+)<\/p>[\s]*

<div gatha$1>$2</div>\n\n

###INDENT|UNINDENTED

Code: [Select]
<p rend=[^\w](indent|unindented)[^\w]>([^\n]+)<\/p>[\s]*

<div $1>$2</div>\n\n

###NIKAYA

Code: [Select]
<p rend=[^\w]nikaya[^\w]>([^<>]*?)<\/p>

<div centeralign #nikaya>**$1**</div>\n<span sang_id #{file--}>[[{path-release}:{file--}|{file--}]] | [[{path-source}:{file}#{file--}|source]]</span>

###BOOK 868

Code: [Select]
<p rend=[^\w]book[^\w]>([^<>]*?)<\/p>

======== $1 ========\n<span sang_id #{file-}>[[{path-release}:{file-}|{file-}]] | [[{path-source}:{file}#{file-}|source]]</span>

###CHAPTER

Code: [Select]
<p rend=[^\w]chapter[^\w]>([^<>]*?)<\/p>

======= $1 =======\n<span sang_id #{file}>[[{path-release}:{file}|{file}]] | [[{path-source}:{file}#{file}|source]]</span>

###TITLE

Code: [Select]
<p rend=[^\w]title[^\w]>([^<>]*?)<\/p>

===== $1 =====\n<span sang_id #{file+}>[[{path-release}:{file+}|{file+}]] | [[{path-source}:{file}#{file+}|source]]</span>

###SUBHEAD

Code: [Select]
<p rend=[^\w]subhead[^\w]>([^<>]*?)<\/p>

==== $1 ====\n<span sang_id #{file-}.{no}>[[{path-release}:{file-}.{no}|{file-}.{no}]] | [[{path-source}:{file}#{file-}.{no}|source]]</span>

###SUBSUBHEAD

Code: [Select]
<p rend=[^\w]subsubhead[^\w]>([^<>]*?)<\/p>

=== $1 ===\n<span sang_id #{file-}.{no+}>[[{path-release}:{file-}.{no+}|{file-}.{no+}]] | [[{path-source}:{file}#{file-}.{no+}|source]]</span>

###SUBHEAD NOTE

Code: [Select]
<p rend=[^\w]subhead[^\w]>([^<>]*?)<span note>([^<>]*?)<\/span>([^<>]*?)<\/p>

==== $1$3 ====\n<div centeralign>**$1<span note>$2</span>$3**</div>\n<span sang_id #{file-}.{no}>[[{path-release}:{file-}.{no}|{file-}.{no}]] | [[{path-source}:{file}#{file-}.{no}|source]]</span>

###CHAPTER NOTE

Code: [Select]
<p rend=[^\w]chapter[^\w]>([^<>]*?)<span note>([^<>]*?)<\/span>([^<>]*?)<\/p>

======= $1$3 =======\n<div centeralign>**$1<span note>$2</span>$3**</div>\n<span sang_id #{file}>[[{path-release}:{file}|{file}]] | [[{path-source}:{file}#{file}|source]]</span>

###TITLE NOTE

Code: [Select]
<p rend=[^\w]title[^\w]>([^<>]*?)<span note>([^<>]*?)<\/span>([^<>]*?)<\/p>

===== $1$3 =====\n<div centeralign>**$1<span note>$2</span>$3**</div>\n<span sang_id #{file+}>[[{path-release}:{file+}|{file+}]] | [[{path-source}:{file}#{file+}|source]]</span>

###SUBHEAD ANCHOR

Code: [Select]
<p rend=[^\w]subhead[^\w]>([^<>]*?)<span anchor #([^\n]*?)<\/span>([^<>]*?)<\/p>

==== $1$3 ====\n<span sang_id #{file-}.{no}>[[{path-release}:{file-}.{no}|{file-}.{no}]] | [[{path-source}:{file}#{file-}.{no}|source]]</span>\n<span anchor #$2</span>

###CHAPTER ANCHOR

Code: [Select]
<p rend=[^\w]chapter[^\w]>([^<>]*?)<span anchor #([^\n]*?)<\/span>([^<>]*?)<\/p>

======= $1$3 =======\n<span sang_id #{file}>[[{path-release}:{file}|{file}]] | [[{path-source}:{file}#{file}|source]]</span>\n<span anchor #$2</span>

###TITLE ANCHOR

Code: [Select]
<p rend=[^\w]title[^\w]>([^<>]*?)<span anchor #([^\n]*?)<\/span>([^<>]*?)<\/p>

===== $1$3 =====\n<span sang_id #{file+}>[[{path-release}:{file+}|{file+}]] | [[{path-source}:{file}#{file+}|source]]</span>\n<span anchor #$2</span>

###BOOK ANCHOR

Code: [Select]
<p rend=[^\w]book[^\w]>([^<>]*?)<span anchor #([^\n]*?)<\/span>([^<>]*?)<\/p>

======== $1$3 ========\n<span sang_id #{file-}>[[{path-release}:{file-}|{file-}]] | [[{path-source}:{file}#{file-}|source]]</span>\n<span anchor #$2</span>

Posted by: Johann
« on: March 11, 2019, 07:23:36 AM »

Already found one potential content eater, caused by having forgotten to escape the dot:

search:

<p rend=[^\w]hangnum[^\w] n=[^\w]([^<>]*?)[^\w]><hi rend=[^\w]paranum[^\w]>([^<>]*?)<\/hi>[. ]*<hi rend=[^\w]dot[^\w]>[. ]*<\/hi>[. ]*([^\n]*)<\/p>[\s]*

replace:

<div hangnum><span para #para_$1>[$2]</span></div> $3\n\n
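
One way to catch such content eaters before they do harm is a dry run that only lists unusually long matches. The following is a sketch under the assumption that a match which swallows content is much longer than a normal paragraph; the function name and the limit of 2000 characters are arbitrary choices.

```python
import re

def suspicious_matches(pattern, text, max_len=2000, flags=0):
    """Return (start, length) for every match longer than max_len characters."""
    return [
        (m.start(), len(m.group(0)))
        for m in re.finditer(pattern, text, flags)
        if len(m.group(0)) > max_len
    ]
```

An empty result does not prove that a rule is safe, but a non-empty one is a good reason to look at the pattern again before replacing anything.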
Posted by: Johann
« on: March 09, 2019, 08:50:23 AM »

Sadhu and

Noo...  :) Nyom does the things he needs to do and brings no burden of duties with him, just a released kusala mind; no IT needed.

The original files are on the Sangha laptop here, as well as in the uploads by Nyom here on Sangham.net.

Having processed again late at night, Atma came to see that Notepad++ seems to work differently with regard to regex: it has eaten away content again, and it is also very non-transparent with regard to multi-file replacements, and slow.

So, thinking that the server and the tools there actually work better, Atma thinks of uploading all the originals, correctly and "finally" renamed, and doing the work online again. It is actually not that much and possible to do in one day if done smartly.

Command line, PowerShell and other tools might be great, but Atma, sustaining on alms, is not willing to learn other and more IT stuff than required, just using what is left and known from the past.

For a good and final replacement of the placeholders like {file}..., Atma has started to make a list that provides the replacements for each file. That is something Atma is not able to do online.

So here Atma tries to prepare offline and then inform about the actions, so that another could always help here and there if rejoicing in such.

Quote from: Upasaka Moritz
backups

...oh yes. Relying only on good actions might be risky when maintaining a "monastery", of course. Whatever does not become a burden or hurtful for oneself or others.
Posted by: Moritz
« on: March 09, 2019, 07:47:15 AM »

Vandami Bhante,

Quote from: Johann
Possibly good to make all replacements anew locally and then upload them again, overwriting the files on the server. Too many different standards arose while processing, and even if just some files are wrongly rendered, it is difficult to find them without overlooking one. At least it saves server resources.

If I know from which point to start and edit again, I could also do the same replacements and upload from here, maybe with a better internet connection.

But in this last week before the journey, I am really quite busy.

I still have a backup of the files in the cs-rm, cs-km, cs-th and cs-ru directories from just before the {file} etc. replacements on my computer.

Downloading backups from Greensta should also be working now. I think daily automatic backups are deleted after 4-5 days. So one always should have one for the last three or four days. But just looking now: Daily accesstoinsight.eu backups have the "name" "Error: accesstoinsight.eu". Not sure if they are usable.

The last manual backup for accesstoinsight.eu, which seems to be okay and can still be downloaded, containing all the huge old attic archive, was from 17th of February.

I could bring it and what I have on my computer on a USB flash drive on the journey. I could also bring a laptop with me if useful.

_/\_
Posted by: Johann
« on: March 08, 2019, 12:15:32 PM »

Possibly good to make all replacements anew locally and then upload them again, overwriting the files on the server. Too many different standards arose while processing, and even if just some files are wrongly rendered, it is difficult to find them without overlooking one. At least it saves server resources.

Regex in Notepad++ (local) and in batchedit (on the server) also have slightly different syntax, which may also have had its (even huge) impact where this was forgotten here and there.

"Knowing", possibly remembering now some of the many exceptions, it is possible to continue without a long break.

Very commonly, traditional monks encourage their disciples to learn to recite the texts in this way: "You say you have a book, hmm? When you have the texts in the books, then they are still just in the books..."  :)

Yet one should not forget that memory is not for sure either and is gone with the breaking up of one's body, or even before through sickness or accident. So what, besides practicing to see that reality of anicca clearly, could help at least?

Or, more matching, in the Buddha's words, holding on to a good Nimitta (object of "hobby"), not having as object something that is for trade and gain, taught by "a teacher":

Namo tassa bhagavato arahato sammā-sambuddhassa

The non-doing of any evil,
the performance of what's skillful,
the cleansing of one's own mind:
    this is the teaching
    of the Awakened.

...and having just learned about commands in PowerShell: such simple ways as creating a list in Excel and then command strings like ((Get-Content filename.txt -Raw) -replace '{file}','filename.txt') | Set-Content filename.txt might make the file, path and other {...} strings possible for Atma to do, without even needing hours for one replacement in 12,000 files with Notepad++. Oh wait! Creating or giving possibilities and hints to make merits? What does one like? Rebirth as a "compassionate intelligent wiki bot"? Late already.
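
The idea of driving the replacements from a list could be sketched as follows, here in Python instead of PowerShell. Everything about the list layout is an assumption: a CSV export of the Excel list with a "filename" column, and one further column per placeholder whose header is the placeholder name without braces (for example "file" or "path-source").

```python
import csv
from pathlib import Path

def apply_replacement_list(list_path, root):
    """Apply per-file placeholder values from a CSV list; return the number of files changed."""
    changed = 0
    with open(list_path, encoding="utf-8", newline="") as f:
        for row in csv.DictReader(f):
            target = Path(root) / row.pop("filename")
            with open(target, encoding="utf-8", newline="") as src:
                text = src.read()
            new_text = text
            # Every remaining column is one placeholder: header "file" fills "{file}".
            for name, value in row.items():
                new_text = new_text.replace("{" + name + "}", value)
            if new_text != text:
                with open(target, "w", encoding="utf-8", newline="") as dst:
                    dst.write(new_text)
                changed += 1
    return changed
```

Reading and writing explicitly as UTF-8 matters here, since the files contain Khmer, Thai and Cyrillic text; working on a copy of the directory first would be prudent.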
Posted by: Johann
« on: March 08, 2019, 10:43:38 AM »

"Beautifications" such as no more than two line breaks, no double white spaces... and other things possibly disturbing a standard.

More than two line breaks: done (for each cs-.. language namespace)

search:

[\n][\n]+

replace:

\n\n

Double white spaces: done (for each cs-.. language namespace):

search:

[ ][ ]+(?!\*)

replace: (one white space)
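
Both clean-ups can be tried on a sample text with a few lines of Python. This is only a sketch using the two patterns exactly as given above; the function name is made up.

```python
import re

def tidy_whitespace(text):
    # Collapse runs of two or more line breaks into exactly two.
    text = re.sub(r'[\n][\n]+', '\n\n', text)
    # Collapse runs of spaces into one space, keeping the (?!\*) look-ahead from above.
    return re.sub(r'[ ][ ]+(?!\*)', ' ', text)
```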



Changing the id to a class for the overall lang-div, since the id breaks the section edit.

search:

<div #(cs-km|cs-rm|cs-th|cs-ru)>

replace:

<div $1>

Just seeing that many files in Anya have lost content; it is possibly good to check that first now...