Topic Summary

Posted by: Johann
« on: February 21, 2019, 02:29:51 PM »

These are the replacements done so far, including the placeholders for the two kinds of paths, {path-source} and {path-release}, the file-name part {file-}, the file name {file} and the number {no}.

"Chapter" should always correspond to the file name. (but just 10500+ replacements, 2698x4 => 300?)

{path-source}: {lang}:{section}:{file}
{lang}: cs-rm or cs-km or cs-th or cs-ru
{section}: tipitaka or anya or atthakatha or tika; according to the file-name-end *_any, *_att, *_tik, or * (no "_" for "tipitaka")

{path-release}: {lang}:{section}:{pitaka}:{nikaya}:({sub-nikaya}:){chapter}:({title}:){no}:

{file}: {pitaka}.{nikaya}.({subnikaya}.){book}.({title}.){chapter}.{no}_{section}

{file-}: id/file-name-release reduced by one namespace

{file--}: id/file-name-release reduced by two namespaces

{file+}: id/file-name-release increased by one namespace

Note that "subhead" sometimes has {file} (dn), {file-} (an, sn, sometimes kn) or {file--} (mn). Att, Any, Tik may vary even more.
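
The placeholder rules above can be sketched in Python. This is only an illustration under assumptions: the function names are made up, "reduced by one namespace" is read as dropping the last dot-separated part of the id, and the sample file names are examples, not taken from the real file list.

```python
# Sketch only: suffix-to-section mapping as described in the post above.
# Function names and the reading of "reduced by one namespace" are assumptions.
SUFFIX_TO_SECTION = {"any": "anya", "att": "atthakatha", "tik": "tika"}


def section_of(file_name: str) -> str:
    """Return the section for a file name; no known "_" suffix means tipitaka."""
    for suffix, section in SUFFIX_TO_SECTION.items():
        if file_name.endswith("_" + suffix):
            return section
    return "tipitaka"


def reduce_id(file_id: str, levels: int = 1) -> str:
    """Drop the last `levels` dot-separated parts ({file-} is 1, {file--} is 2)."""
    parts = file_id.split(".")
    if levels >= len(parts):
        raise ValueError("cannot reduce below the first namespace")
    return ".".join(parts[:-levels])


def source_path(lang: str, file_name: str) -> str:
    """Build {path-source} as {lang}:{section}:{file}."""
    return f"{lang}:{section_of(file_name)}:{file_name}"
```

For example, `reduce_id("sut.kn.iti.v1")` gives the book-level id and `reduce_id("sut.kn.iti.v1", 2)` the nikaya-level id.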


Code:
<p rend=[^\w]centre[^\w]>(.*?)<\/p>	<div centeralign>$1</div>

<p rend=[^\w]bodytext[^\w] n=[^\w]([^<>]*?)[^\w]><hi rend=[^\w]paranum[^\w]>([^<>]*?)<\/hi><hi rend=[^\w]dot[^\w]>\.<\/hi>([^\n]*?)<\/p>[\s]* <span para #para_$1>[$2]</span>$3\n\n

<pb ed=[^\w]([^<>]*?)[^\w] n=[^\w]([^<>]*?)[^\w] \/> <span anchor #$1_$2></span>

<p rend=[^\w]bodytext[^\w]>([^\n]+)<\/p>[\s]* $1\n\n

<note>([^<>]+?)<\/note> <span note>$1<\/span>

<p rend=[^\w]gatha([^<>]*?)[^\w]>([^\n]+)<\/p>[\s]* <div gatha$1>$2</div>\n\n

<p rend=[^\w]hangnum[^\w] n=[^\w]([^<>]*?)[^\w]><hi rend=[^\w]paranum[^\w]>([^<>]*?)<\/hi><hi rend=[^\w]dot[^\w]>\.<\/hi>([^\n]+)<\/p>[\s]* <div hangnum><span para #para_$1>[$2]</span></div>$3\n\n

<hi rend=[^\w]bold[^\w]>([^\n]+?)<\/hi> **$1**

<p rend=[^\w]nikaya[^\w]>([^<>]*?)<\/p> <div centeralign #nikaya>**$1**</div>\n<span sang_id #{file--}>[[{path-release}:{file--}|{file--}]] | [[{path-source}:{file}#{file--}|source]]</span>

<p rend=[^\w]book[^\w]>([^<>]*?)<\/p> ======== $1 ========\n<span sang_id #{file-}>[[{path-release}:{file-}|{file-}]] | [[{path-source}:{file}#{file-}|source]]</span>

<p rend=[^\w]chapter[^\w]>([^<>]*?)<\/p> ======= $1 =======\n<span sang_id #{file}>[[{path-release}:{file}|{file}]] | [[{path-source}:{file}#{file}|source]]</span>

<p rend=[^\w]title[^\w]>([^<>]*?)<\/p> ===== $1 =====\n<span sang_id #{file+}>[[{path-release}:{file+}|{file+}]] | [[{path-source}:{file}#{file+}|source]]</span>

<p rend=[^\w]subhead[^\w]>([^<>]*?)<\/p> ==== $1 ====\n<span sang_id #{file-}.{no}>[[{path-release}:{file-}.{no}|{file-}.{no}]] | [[{path-source}:{file}#{file-}.{no}|source]]</span>
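
For anyone who wants to run such pairs outside Notepad++, here is a minimal Python sketch using three of the pairs above. The only changes are that Python writes group references as \1 instead of $1 and that "/" needs no escaping. The function name `apply_rules` is made up for illustration, and whether the whole list behaves identically under Python's `re` has not been checked here.

```python
import re

# Three of the search/replace pairs from the list above, rewritten for
# Python's re module (\1 instead of $1). Sketch only, not the full rule set.
RULES = [
    (r"<p rend=[^\w]centre[^\w]>(.*?)</p>", r"<div centeralign>\1</div>"),
    (r"<pb ed=[^\w]([^<>]*?)[^\w] n=[^\w]([^<>]*?)[^\w] />",
     r"<span anchor #\1_\2></span>"),
    (r"<note>([^<>]+?)</note>", r"<span note>\1</span>"),
]


def apply_rules(text: str) -> str:
    """Apply every (pattern, replacement) pair in order and return the result."""
    for pattern, replacement in RULES:
        text = re.sub(pattern, replacement, text)
    return text
```

The order of the pairs matters as soon as one replacement produces text that a later pattern could match, so the list is applied strictly top to bottom.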


Replacements for each language/script (cs-rm, cs-km, cs-th, cs-ru): header and footer

Code:
^(.*?)<body>(.*?)<\/body>(.*)$	<div #cs-rm>\n{{section>en:tech:template_includes#cs-rm_header&nouser&nodate&noheader&noeditbutton&firstsectiononly}}\n<span hide>{file}</span>$2{{section>en:tech:template_includes#cs-rm_footer&nouser&nodate&noheader&noeditbutton&firstsectiononly}}\n</div>

Script-specific replacements:

cs-km: replacement of ព្ព with ព្វ (according to Khmer spelling tradition in Pali)

(cs-rm: replacement of "m dot below" with "m dot above", still not finally decided)
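
A minimal Python sketch of these two script-specific replacements, assuming plain string replacement is enough. The function names are made up, the code points are written as escapes so that nothing depends on how the characters were typed, and the cs-rm variant is included only as an option since it is not yet decided.

```python
import unicodedata

# Sketch only. Khmer: po + coeng + po  ->  po + coeng + vo
KHMER_OLD = "\u1796\u17d2\u1796"
KHMER_NEW = "\u1796\u17d2\u179c"

# Roman: m with dot below (U+1E43) -> m with dot above (U+1E41)
M_DOT_BELOW = "\u1e43"
M_DOT_ABOVE = "\u1e41"


def fix_khmer(text: str) -> str:
    """Apply the cs-km spelling replacement."""
    return text.replace(KHMER_OLD, KHMER_NEW)


def dot_above(text: str) -> str:
    """Apply the (undecided) cs-rm replacement; NFC first so that a plain
    'm' followed by a combining dot below is caught as well."""
    text = unicodedata.normalize("NFC", text)
    return text.replace(M_DOT_BELOW, M_DOT_ABOVE)
```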

If no additional kinds of headers or other tags to be replaced are found, the files will be uploaded, rendered with these replacements, starting tomorrow.

* Johann, comment on Friday: cloudy; upload may not be possible until Saturday noon.
Posted by: Johann
« on: February 21, 2019, 12:50:44 PM »

Sadhu!

Atma will finish the replacements as far as possible and then upload the files. The file list should be fine. The files are renamed and the syntax replacements are 90% done, aside from the headers.

Similar to the usage for the Khmer Tipitaka, each header should get its anchor, back-link and forward-link (to the released file in the "public area", via use of the "include" plugin). It will not always match exactly, but often in this way:

File-name:{pitaka}.{nikaya}.{book}.({sometimes chapter, like sn,an}.){subhead serial no.}

anchor for sutta: sang_id #sut.kn.iti.001
back path to the source file: (cs-rm|cs-km|cs-th|cs-ru):tipitaka:{file name}#sut.kn.iti.001
path to the released single sutta/vagga: (cs-rm|cs-km|cs-th|cs-ru):tipitaka:sut:kn:iti:sut.kn.iti.001|sut.kn.iti.001

======= pitaka =======
<span sang_id #sut>[[km:tipitaka:sut:index|sut]] | [[km:tipitaka:book_053#sut|book_053]]</span>

======= nikaya =======
<span sang_id #sut.kn>[[km:tipitaka:sut:kn:index|sut.kn]] | [[km:tipitaka:book_053#sut.kn|book_053]]</span>

======= book =======
<span sang_id #sut.kn.iti>[[km:tipitaka:sut:kn:iti:sut.kn.iti|sut.kn.iti]] | [[km:tipitaka:book_053#sut.kn.iti|book_053]]</span>

====== chapter ======
<span sang_id #sut.kn.iti.v1>[[km:tipitaka:sut:kn:iti:sut.kn.iti.v1|sut.kn.iti.v1]] | [[km:tipitaka:book_053#sut.kn.iti.v1|book_053]]</span>

===== title =====
<span sang_id #sut.kn.iti.v1.1>[[km:tipitaka:sut:kn:iti:sut.kn.iti#sut.kn.iti.v1.1|sut.kn.iti.v1.1]] | [[km:tipitaka:book_053#sut.kn.iti.v1.1|book_053]]</span>

==== subhead ====
<span sang_id #sut.kn.iti.001>[[km:tipitaka:sut:kn:iti:sut.kn.iti.001|sut.kn.iti.001]] | [[km:tipitaka:book_053#sut.kn.iti.001|book_053]]</span>
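
A minimal Python sketch of how one such header line could be generated from its parts. The function and parameter names are made up for illustration. It covers only the plain form used for pitaka, nikaya, book, chapter and subhead above, not the "title" variant, which links to an anchor inside the book page.

```python
def header_span(anchor_id: str, release_ns: str,
                source_page: str, source_label: str) -> str:
    """Build the anchor line with forward link (release) and back link (source).

    Sketch only: parameter names are assumptions, the output format follows
    the example lines above.
    """
    return (
        f"<span sang_id #{anchor_id}>"
        f"[[{release_ns}:{anchor_id}|{anchor_id}]] | "
        f"[[{source_page}#{anchor_id}|{source_label}]]</span>"
    )
```

Called as `header_span("sut.kn.iti.001", "km:tipitaka:sut:kn:iti", "km:tipitaka:book_053", "book_053")`, it reproduces the subhead line shown above.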

...possibly too complicated, with a lot of exceptions, since everything is structured differently.

Maybe it is easier to put the file name under the header, plus a serial number for headers of the same kind. Preparing it like this:

====== chapter ======
<span sang_id #{file(-_.-_.)}>

===== title =====
<span sang_id #{file(-_.)}.v{no 1}>

==== subhead ====
<span sang_id #{file}.{no 1,2...}>


Once the last is present, it should be no problem to make the rest with normal regex.

Atma will see whether he can upload them today or tomorrow, depending on how long the battery lasts (and hopefully not having used up all the web space with other revisions by then), and will leave the replacement anchors {file} and {no}, in a not yet clear order, in those files.

May Nyom not invest too much time in too complicated solutions; as told, there are too many inconsistencies to match it without rendering it another time anyway.

Putting the file name into the files and making an incrementing replacement: for these two things Atma lacks the tools (or the skill).
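
Both steps can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names are made up, the pages are assumed to be `.txt` files whose name (without extension) is the file id, and the three-digit padding follows the `sut.kn.iti.001` style of the examples.

```python
import re
from pathlib import Path


def fill_placeholders(text: str, file_id: str, width: int = 3) -> str:
    """Replace {file} with the file id and each {no} with 001, 002, 003, ..."""
    counter = 0

    def next_number(match: re.Match) -> str:
        nonlocal counter
        counter += 1
        return str(counter).zfill(width)

    text = text.replace("{file}", file_id)
    return re.sub(r"\{no\}", next_number, text)


def process_folder(folder: str) -> None:
    """Fill the placeholders in every .txt page of a folder, in place."""
    for path in sorted(Path(folder).glob("*.txt")):
        original = path.read_text(encoding="utf-8")
        filled = fill_placeholders(original, path.stem)
        path.write_text(filled, encoding="utf-8")
```

The counter restarts at 1 for every file, so the numbers are serial within one page only. It is best tried on a copy of the folder first, since `process_folder` overwrites the files.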

(Atma is just trying to keep a "how-to guide", doku, for additional languages/scripts like Burmese, Sri Lankan...)
Posted by: Moritz
« on: February 21, 2019, 09:55:02 AM »

Quote
The name of each file in its content as part of the text, Nyom. File XY may get its file name as text content on the first line.

With Notepad++'s replacements this can then be used to render certain links and anchors that do not yet exist in the files.

I see. With a complete list of the files I could write a php script to let it be done on the server directly.

So, as I understand, this is the complete list of files for which this should happen?

Or do still some need to be renamed? This could also be done with a script on the server. I could write it on the weekend.

_/\_
Posted by: Johann
« on: February 20, 2019, 11:20:09 AM »

Replacements done so far, step by step (one command takes about an hour); later to go into an ati page, for the following scripts and the same standard:

Code:
<p rend=[^\w]centre[^\w]>(.*?)<\/p>	<div centeralign>$1</div>

<p rend=[^\w]bodytext[^\w] n=[^\w]([^<>]*?)[^\w]><hi rend=[^\w]paranum[^\w]>([^<>]*?)<\/hi><hi rend=[^\w]dot[^\w]>\.<\/hi>([^\n]*?)<\/p>[\s]* <span para #para_$1>[$2]</span>$3\n\n

<pb ed=[^\w]([^<>]*?)[^\w] n=[^\w]([^<>]*?)[^\w] \/> <span anchor #$1_$2></span>
Posted by: Johann
« on: February 20, 2019, 10:11:16 AM »

The name of each file in its content as part of the text, Nyom. File XY may get its file name as text content on the first line.

With Notepad++'s replacements this can then be used to render certain links and anchors that do not yet exist in the files.
Posted by: Moritz
« on: February 20, 2019, 09:25:08 AM »

_/\_

Quote
Atma thought it would be good to make a flat structure for the source files, divided only into Mula, Atthak., Tika and Anya, with the other structure similar to the one started for the Khmer Tipitaka, using the "include" tags.

If a trick for bringing the file name into the text of each page is known, that would make the conversion to the ati standard easier and faster, with 2698 files per script (4 at this time).

Not sure exactly what is required. Bring which file name into which text?

_/\_
Posted by: Johann
« on: February 19, 2019, 12:31:42 PM »

Having the possibility to use a laptop now, Atma could find a way to rename the files: by Commander, with "ren {file} {file new} & ren ...".

Here is the list of the renamings: renaming_files.
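
The same kind of batch renaming could also be scripted, for example in Python. This is a sketch under assumptions: the function name is made up, and the renaming list is assumed to be available as (old name, new name) pairs, since its actual format is not shown here.

```python
from pathlib import Path


def rename_from_list(folder: str, pairs: list) -> list:
    """Rename old -> new inside `folder`.

    Entries whose source is missing or whose target already exists are
    skipped, so nothing is overwritten. Returns the pairs actually renamed.
    """
    root = Path(folder)
    renamed = []
    for old_name, new_name in pairs:
        source = root / old_name
        target = root / new_name
        if source.exists() and not target.exists():
            source.rename(target)
            renamed.append((old_name, new_name))
    return renamed
```

Comparing the returned list with the input list shows at once which entries did not match a file.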

Only the main directory index is up to date. The sub-directories have to be rebuilt.

Atma thought it would be good to make a flat structure for the source files, divided only into Mula, Atthak., Tika and Anya, with the other structure similar to the one started for the Khmer Tipitaka, using the "include" tags.

If a trick for bringing the file name into the text of each page is known, that would make the conversion to the ati standard easier and faster, with 2698 files per script (4 at this time).
Posted by: Johann
« on: September 16, 2018, 05:58:21 PM »

Configuration Setting: fnencode + Configuration Setting: deaccent + 5 Pali attendances + old ati + dictionary authors + 100 IT surprises + weather/body + 2 years old: battery and small tablet... + no education at all in languages, incl. IT + giant "bird" ... + :-\ = totally crazy

So, and now onward from where things stand; after doing it another 30 times it will fit, provided that, besides sañña, saṅkhāra is not uncertain as well, involved alongside the other aggregates.

... or sleep on it for another night... and hope for helping machines and science (scientists). But some sleep is good. Just not too long (sati then decays completely, one has forgotten everything and only wonders why)  :)

Theoretically ati.eu will already have about 500,000 to 1 million pages in the coming months... and spiders and search engines off, so that some forest and wilderness remains.



kamma-vipaka? This fits well as a follow-up:

"simply goosebumps"



kataññū + saṁvega + pāsāda = "ver-rückt" (crazy, literally "shifted")
Posted by: Johann
« on: September 16, 2018, 05:10:02 PM »

  • Atma has changed "ṃ" to "ṁ" in the cs-rm index.
  • The file names should get the same treatment, and so should the texts in the files.
  • Likewise: all "pali" suffixes in the folders.
  • He will also switch the letter-conversion option in the settings from "partially romanized" to "do not romanize", which of course means possibly redoing 1500 dictionary file names as well...  .  ohh, attention

    Complex, if one wants to make it complex... that is still work and opportunity (the PTS dictionary in between as well, another such sañña-matching challenge, with masses of other languages and characters...). When one thinks how the gentlemen mastered this just 100 years ago. What a (de-)generation has already been reached. "All is one", with 10,000+ Google programmers who do all the work for the others: free!  :)

    Documenting all these things will also be a good piece of work.
Posted by: Johann
« on: July 13, 2018, 04:28:32 PM »

Aha... that solves a lot (each page has its heading for friendly display): Configuration Setting: useheading, and the IndexMenu Plugin so that the indexes need not be edited (possible, it just would not give a corresponding access link to the public tipitaka.org pages). Atma is installing it for testing, assuming it is welcome and given.

A sample of this index is now put on http://accesstoinsight.eu/doku.php?id=cs-rm:index#index

The useheading option now nicely displays the title names, while file names (in code) are still matched as a fine selection when typed into the search box.

Still, a combination of both for display would be fine.
Posted by: Johann
« on: July 13, 2018, 08:43:38 AM »

Usually my person does things 2, 3, 4 times from the beginning again. Vision in mind, then still lacking this or that, not yet perfect.

The second question is partly reflected in the first. And if later having 2100 files and wishing to put them in their places, Moritz would see why.

And that is also for the brain. Some might know mn001, that it is in the Suttapitaka, in the first vagga. But if having a file called bhu001 one would have problems.

Now one could give them a real name only, e.g. karanayametta sutta. So, knowing the name, would you know which pitaka, nikaya, vagga and subvagga it belongs to?

Therefore both systems are useful: that of the "modern" codes from ATI (the Western focus is on suttas and ends there) and the tree by names from the elders.

That is why {pitaka}.{nikaya}.{vagga}.({subvagga}).{sutta no.} came into being as the file name.
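
As a small illustration of this naming scheme, a Python sketch that assembles such a file name from its parts. The function name, the three-digit padding and the optional section suffix are assumptions, chosen to match names like sut.kn.iti.001 used elsewhere in this topic.

```python
from typing import Optional


def build_file_name(pitaka: str, nikaya: str, vagga: str, sutta_no: int,
                    subvagga: Optional[str] = None,
                    section: Optional[str] = None) -> str:
    """Join the parts with '.', pad the number, append _att/_tik/_any if given."""
    parts = [pitaka, nikaya, vagga]
    if subvagga:
        parts.append(subvagga)
    parts.append(f"{sutta_no:03d}")
    name = ".".join(parts)
    return f"{name}_{section}" if section else name
```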

If searching for an01.001, the suffix _{att/tik/any} lets one match them well in the preview when typing the letters into the search box. On the other hand, searching via the sitemap works fine as well.

This all works fine up to the atthakatha of the Abhidhamma and parts of the tika. When coming to anya it is no longer so clearly executable, and Anya itself already contains double and triple naming. A certain collection has its first book and first chapter under the same name, containing things that are not clearly a counterpart of the tipitaka.

Till today, having actually spent hundreds of hours trying to sort it in well, my person has come as far as about the jataka in making sure that the system would not turn out ugly or be built on an unsuitable structure.

Now this here, my person guesses, since not even the abhidhamma (horrible structure) has been sorted well in the West, is the first time after tipitaka.org (which used a simple but not guessable code and index system for stable, non-dynamic storage, yet hard to find anything in if not a little familiar) that the whole heritage of the Sanghayana gets sorted.

It all looked simple to my person as well. Then, after you have developed a structure for the first and second level, after the 10th file you meet a new vagga/subvagga structure... Since from the jataka on there has long been not much broad interest, Anya is like the bookshelf in a student's room and not like a chemist's register.

Practical Anya:

Caturārakkhadīpanī is a book under the collection Nīti-gantha-saṅgaho in the Anya commentary and contains a series of books. Within it is the chapter Kāyapaccavekkhaṇā, which is the actual file (page name).

To come to it one follows the indexes (page names called index under the namespace tree) one after another, or more directly, since the first index already contains the whole structure. That's right, sub-indexes are not really necessary if the first already contains the whole.

So it has more practical reasons. For example think of an, mn, iti. If one knows the system of the Sanghayana, one knows that iti is a subvagga of kn. The same danger applies to mn: there is no mn123 in the Sanghayana edition. It came from many people focusing on certain levels ("gemeinsames Vielfaches", a common multiple).

Now, for example, if one works out the first book of the Visuddhimagga, he might expand the chapter index and, once finished, copy it into the index of the Visuddhimagga, even into the anya index. Another might work from another level...

It means it has been the result of practical work in the worst situation, knowing only particular parts of the whole. Since it will stay dynamic, the indexes of the further levels have not been deleted (as in cscd) but serve two purposes: to be easy to access in both directions, at whichever level one might enter, and to let one focus on a scale suitable to one's concentration and memory and then put it together upward and downward.

That's why this system of pitaka, nikaya, vagga has been kept here as well, and the structure is both flat by name (for all files aside from anya) and physical in levels, not only presented by a digital tree like the xml in cscd. Looking at the flat system of cscd one will find att file codes in the tipitaka and so on, meaning that even this simple system turned out ugly after finding out details from the elders.

Further, the indexes placed in the middle are meant to be enriched with the single chapters within one file (anchor content #v1...v5), and the deepest would later also contain the links to the single suttas (anchor #s001...s057) of the files. That means getting a zoom level by level. For one index that would become too large.

But as told, it is open to others as well. Just know that it can serve for "headache" for weeks and months, yet the next day finding out... "ok, again from the beginning". It's like doing/training Jhana, mastering the worlds.  :) That is why doing = sacrifice has its benefit = having learned a skill.

So know that Nyom has to structure something that probably nobody knows as a whole in its various details and structure. It needs to be open to that extent and comprehensible ("nachvollziehbar") for others, as well as accessible for people coming from different learning systems. The West does not know the way of the elders and the elders do not know the code thinking of Westerners.

For example look at where ATI ends and where suttacentral begins to go astray, having trouble with vinaya and abhidhamma and possibly having no logical way to ever add the commentaries, yet with references to Brahmic texts from Nepal.

But as told, knowing that even some parts of the suttapitaka in the tipitaka have to be changed, it cannot be expected to be perfect, or to be worked out to perfection, before being put on the shelf.

If particular names have to be changed later on, and there is no double naming on the page-name level, this can be done in steps online, provided the whole structure has a certain consistency as a whole.

Things open to do at this point if wishing to do it on a large scale:

- proofing and, where necessary, correcting indexes and names of files
- renaming of files
- converting into wiki/wrap standard
- implementing anchors to the single suttas, chapters
- incl. data table for each file (title, url, date, origin...)
- upload into the folders (or incl. folders) in the single lang-namespaces

So it's really open how one would like to do it, but it's not really a quick job to develop such a thing, at least within my person's limits.
Posted by: Moritz
« on: July 13, 2018, 04:52:09 AM »

Sadhu. Thanks for the hints and explanations.

Quote
(At this point: anya file names have no _any at the end, and files other than those with tipitaka codes do not include the path in their name when it comes to tika and anya, but simply have new names. Maybe this is something my person should change, since it is difficult to put them into the right folders without such a sort/search possibility in an explorer.)

It seems some deeper nesting of indexes would be good. Some things are confusing.

For example, the index/TOC:
cs-rm:anya:niti-gantha-sangaho:index
contains this, which is also an index/TOC:
cs-rm:anya:niti-gantha-sangaho:caturarakkhadipani
within the same directory/namespace (cs-rm:anya:niti-gantha-sangaho ).
And from there, there are links to actual texts, like cs-rm:anya:niti-gantha-sangaho:kayapaccavekkhana , also in the same directory.
I think it would be good if for each TOC there would be another level/directory.

If including the path in the final page name as well then of course the final name could be very long with deep directories, like

cs-rm:anya:niti-gantha-sangaho:caturarakkhadipani:anya:niti-gantha-sangaho.caturarakkhadipani.kayapaccavekkhana_any , or even only the filename without namespace niti-gantha-sangaho:caturarakkhadipani:anya:niti-gantha-sangaho.caturarakkhadipani.kayapaccavekkhana_any could look very strange on the sitemap as well.

Or maybe just leave out such intermediary TOCs/indexes like cs-rm:anya:niti-gantha-sangaho:caturarakkhadipani which are already completely included in another bigger index file in the same directory.

So that there would be no "caturarakkhadipani" appearing in the final path, "caturarakkhadipani" being simply part of the one big "index".

Not sure if I understand this correctly:
Quote
Maybe this is something my person should change, since it is difficult to put them into the right folders without such a sort/search possibility in an explorer.
Why is it helpful to have the complete path name also in the file name (separated with '.')? From my perspective it just produces unnecessarily long file names. Or is this a problem with the tablet explorer software? (Don't really know what Bhante is using now.)

I could rename all files to file names that do not include the path and simply put them in their "right" deeper directories if this seems helpful (making a deep hierarchy everywhere, but with a short file name at the end), but I have no time before next week.


(Not necessary to answer all in detail now. May Bhante find enough rest in between. I have no time to come back to this before next week.)

_/\_
Posted by: Johann
« on: July 13, 2018, 03:23:22 AM »

Nyom Moritz

All namespaces in Latin script, yes; otherwise there would only be trouble, and the whole translation tools and lang-namespaces would be of no use at all.

The naming of folders and files may not be perfect now, e.g. double names, but such things would become clear when starting to rename.

Other, ideological renderings of single names can be made later by hand, step by step, online.

For easy rendering on your side, it is possibly good to put the files into the tree folders beforehand, since this needs to be done by hand, and if it is not done with the root files it might be more work, though probably the same for each lang. (At this point: anya file names have no _any at the end, and files other than those with tipitaka codes do not include the path in their name when it comes to tika and anya, but simply have new names. Maybe this is something my person should change, since it is difficult to put them into the right folders without such a sort/search possibility in an explorer.)
But if these things do not trouble too much, let it be like that for now. As Moritz feels inspired to organize.
Posted by: Moritz
« on: July 12, 2018, 08:19:36 PM »

Vandami, Bhante _/\_

Sadhu! So just to make sure about plans to proceed:

As I understand, "readable" Latin script names/codes were used for the pages and namespaces, including also names from commentaries like: "cs-rm:anya:visuddhimagga:11._samadhiniddeso" whenever there are areas in the commentaries which do not correspond 1-to-1 to certain Tipitaka books/vaggas/pages.

If this structure is now clearly defined, it seems it would be good to use the same structure and pagenames/namespaces for all other scripts (Khmer, Thai, ...) as well.

So: Roman codes/pagenames for all scripts, in order to be able to use the language switch between different scripts.

I assume this is what Bhante had in mind as well?

It could take some time (a week or more) till I can get to it, but if the names and structure of tables of content are all clearly defined in this way, I think I could import the tables of content with the same structure for all remaining scripts without much trouble.

(And for later at some point maybe: as mentioned before, it might be possible to convert namespaces to use other scripts with .htaccess rules or some other tricks. Maybe it is even possible to switch between very different looking names with the language switch, with the help of some JavaScript.)

_/\_
Posted by: Johann
« on: July 12, 2018, 11:44:54 AM »

Nyom Moritz,

Attached is a list of the whole index of the cs-rm namespace, as far as my person has thought it through.

The tree structure is built the same as in the other lang-namespaces.

It includes the names of the indexes, title, page name, path and cscd file name.

The thought so far was to rename the xml files to the listed page names, converting them beforehand according to implementation_cscd with regard to wiki/html code (wrap) and anchors.

The h1 title might best be the same as the title.

Then there have been strong considerations to make the structure totally flat, which would require renaming the "index" files to proper code page names, and afterwards of course the same for all other lang-namespaces; e.g. the ati "index files" would best get the same codes with "_ati" attached, and single files likewise with "_{....}" (translator) attached.

No, on a dynamic page it does not make sense and is a horror for maintaining indexes and name-code systems. The ATI tree as modified now is fine. Maybe just look at keeping the name index free for automatic folder indexes via a plugin. At least the sitemap is wonderful for quick finding and teaches/trains sati; flat is only well accessible by search engine.

So far the thoughts and state of progress. Atma thought to go on with the restyling of the tags and files from ati.

If thinking of different ways: it's just an idea of mine and there might be better ones, so don't feel limited by it.

(It might be that there are some double-name conflicts in the index list for the flat structure and renaming, and it has not been checked whether all files are matched. One or two are not listed, as they contained only the name of a group.)