Topic: [ATI.eu] Indexing and search engine issues

Online Johann

  • Samanera
  • Very Engaged Member
  • *
  • Sadhu! or +375/-0
  • Gender: Male
  • Date of ordination/Datum der Ordination.: 20140527
Re: from: [ATI.eu] CSCD xml to ati.eu format: converting, editing
« Reply #30 on: March 28, 2019, 02:32:56 PM »
Currently not using search or batchedit, so however Nyom thinks best.

(There is a built-in indexer.php which reportedly can be executed directly on the server to rebuild the index. Maybe that helps: https://www.dokuwiki.org/cli#indexerphp )
This post and Content has come to be by Dhamma-Dana and so is given as it       Dhamma-Dana: Johann

Offline Moritz

  • Chief housekeeper / Chefhausmeister
  • Very Engaged Member
  • *
  • Sadhu! or +268/-0
  • Gender: Male
« Reply #31 on: March 28, 2019, 02:48:11 PM »
Quote
Currently not using search or batchedit, so however Nyom thinks best.

(There is a built-in indexer.php which reportedly can be executed directly on the server to rebuild the index. Maybe that helps: https://www.dokuwiki.org/cli#indexerphp )

Rebuilding of the index has started.

The helper scripts listed on https://www.dokuwiki.org/cli can only be used with shell access to the server, which is not available on the Greensta server here. (They may still be worth looking into and adapting when there is more time for it.) So for now I am just using the previous approach.

_/\_

Online Johann

« Reply #32 on: March 28, 2019, 02:51:04 PM »
Sadhu

Offline Moritz

« Reply #33 on: March 29, 2019, 02:25:48 PM »
I accidentally restarted the index rebuild from scratch, so progress is back at about 5000 of 20000 pages.

I wrote a new script, adapting methods from the CLI script, so that the whole process runs on the server. It no longer needs an open connection and browser window the whole time to send a command for every single page to be indexed, one by one.
This should be at least a little faster without commands and responses going back and forth, but the speed difference is not really noticeable. So it should, again, be finished in about one day.

Current progress can be seen by opening http://accesstoinsight.eu/indexer.success.log (listing pages that were indexed successfully) and http://accesstoinsight.eu/indexer.error.log (listing pages that could not be indexed for some reason, currently empty).
Each page name in the lists is preceded by a running number, so one can see how many pages have already been processed.
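
A minimal sketch of how such a server-side batch run with numbered log files could look. This is not the script from the post: the function name indexAllPages and the callable $indexPage are hypothetical, and $indexPage merely stands in for whatever call actually indexes one page. Only the idea of a success log, an error log and a running number before each page name is taken from the description above.

```php
<?php
// Sketch only: $indexPage is a hypothetical stand-in for the real indexing call.

/**
 * Index every page ID in $pages and append numbered lines to two log files.
 * Returns [number of successes, number of failures].
 */
function indexAllPages(array $pages, callable $indexPage, string $successLog, string $errorLog): array
{
    $ok = 0;
    $failed = 0;
    foreach (array_values($pages) as $i => $page) {
        // Running number before each page name, starting at 1.
        $line = ($i + 1) . ' ' . $page . "\n";
        try {
            $success = (bool) $indexPage($page);
        } catch (Throwable $e) {
            $success = false;
        }
        if ($success) {
            file_put_contents($successLog, $line, FILE_APPEND);
            $ok++;
        } else {
            file_put_contents($errorLog, $line, FILE_APPEND);
            $failed++;
        }
    }
    return [$ok, $failed];
}
```

Note that try/catch only covers exceptions. A fatal error such as an exhausted memory limit cannot be caught this way and would still end the whole run.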

_/\_

Online Johann

« Reply #34 on: March 29, 2019, 03:44:28 PM »
Sadhu

Offline Moritz

« Reply #35 on: March 30, 2019, 02:48:15 PM »
The indexing script I had started on the server (which should do just the same as the CLI indexer script) stopped at some point because it ran out of memory (working memory, not disk space). It seems that certain pages simply cannot be indexed because the indexer would need too much memory for them.
For example, http://accesstoinsight.eu/cs-th:tika:sut.dn.0_tik and the following pages always fail with
Code:
Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 67108872 bytes) in /var/www/clients/client2157/web5417/web/inc/indexer.php on line 612
or similar.

Line 612 is here:
Code:
$wordlist = explode(' ', $text);
which splits the whole text of a page into single words at the spaces.

But I really do not understand why this would take so much memory. When I replicate the same operation on my computer, splitting the same page text into single words with the same methods and storing the result in a PHP variable, it does not need nearly as much memory.
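
For orientation: 134217728 bytes is exactly 128 MiB (the memory limit), and the failed request of 67108872 bytes is 64 MiB plus 8 bytes. A single allocation of that size looks more like a very large array being grown than like a string copy, though that is only a guess. The sketch below uses synthetic text and the hypothetical helper names buildText and measureExplode to show one way of checking how much more memory an array of very short words can take than the string it came from. The numbers depend on the PHP version and build, which might be part of why the same operation behaved differently on another computer.

```php
<?php
// Sketch only: synthetic text, not the real page content.

/**
 * Build a string of $count copies of $word separated by single spaces.
 */
function buildText(int $count, string $word): string
{
    return implode(' ', array_fill(0, $count, $word));
}

/**
 * Return [bytes in the string, extra bytes reported after explode(), word count].
 */
function measureExplode(string $text): array
{
    $before = memory_get_usage();
    $wordlist = explode(' ', $text);
    $after = memory_get_usage();
    // $wordlist is still alive here, so $after includes the whole array.
    return [strlen($text), $after - $before, count($wordlist)];
}

// 200000 "words" of one Thai character each (3 bytes in UTF-8).
[$bytes, $extra, $words] = measureExplode(buildText(200000, "\u{0E01}"));
printf("%d words, %d bytes of text, %d extra bytes for the array\n", $words, $bytes, $extra);
```

Running this with a text of a few long words and then with a text of many one-character words makes the per-element overhead of the array visible.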

After trying to find a way to work around it, I have given up for now.

I continued indexing with the other method (which runs locally on my computer, sends a command over the network for every single page to be indexed, and does not stop if a page fails). About 11000 pages are indexed so far, with many "holes" where pages just cannot be indexed on the current server.
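
The page-by-page method can be sketched roughly as follows. The function name indexOneByOne and the callable $requestIndexing are hypothetical: $requestIndexing stands in for the network request that asks the server to index one page, and the retry passes are an assumption added for illustration. What is taken from the post is only that one failing page does not stop the run and that the failed pages remain as "holes".

```php
<?php
// Sketch only: $requestIndexing is a hypothetical stand-in for the network call.

/**
 * Ask for each page to be indexed, one request per page, never aborting.
 * Pages that fail are retried in later passes, up to $maxAttempts passes.
 * Returns the pages that still could not be indexed (the "holes").
 */
function indexOneByOne(array $pages, callable $requestIndexing, int $maxAttempts = 3): array
{
    $pending = array_values($pages);
    for ($attempt = 1; $attempt <= $maxAttempts && $pending !== []; $attempt++) {
        $stillFailing = [];
        foreach ($pending as $page) {
            try {
                $ok = (bool) $requestIndexing($page);
            } catch (Throwable $e) {
                $ok = false; // e.g. a timeout or an error response
            }
            if (!$ok) {
                $stillFailing[] = $page;
            }
        }
        $pending = $stillFailing;
    }
    return $pending;
}
```

Because each page is handled in its own request, a page that exhausts the server's memory only fails that one request, and the loop simply moves on to the next page.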

It should be finished in some 16 hours if just left to run. But with the current server infrastructure it seems the search index will always be incomplete.

_/\_

Online Johann

« Reply #36 on: March 30, 2019, 04:07:07 PM »
Sadhu for the effort and care. May Nyom always give himself the time he needs.

(The "big pages", which Atma thinks make up about 10 %, will not change in content later on, like the rest of the CSCD Tipitaka. Atma remembers that back when there was still a search engine on ZzE, it was likewise never possible to index all Pali Tipitaka pages of the original ATI; there were always errors.

On the other hand, both on ZzE in the past and now on ati.eu, there have been times when the index was obviously complete.)

« Last Edit: March 30, 2019, 04:29:30 PM by Johann »

Offline Moritz

« Reply #37 on: April 01, 2019, 12:23:59 AM »
Quote
May Nyom always give himself the time he needs.
_/\_

Indexing finished some time this morning.

Quote
On the other hand, both on ZzE in the past and now on ati.eu, there have been times when the index was obviously complete.)
Obviously (offensichtlich)? Or apparently (offenbar, scheinbar, anscheinend)?

I think maybe the latter, because these errors never appear in the Searchindex Manager plugin. When a page cannot be indexed, it just says "page already up to date" or something similar.

After several retries on the failed files, the only ones that still cannot be indexed are 474 pages in Thai script (listed below). I think the reason is the way DokuWiki handles some Asian scripts, including Thai: it treats every character as a single word, which makes the indexer need a lot of memory. Quote from the inc/indexer.php file, line 18 and following:
Code:
// Asian characters are handled as words. The following regexp defines the
// Unicode-Ranges for Asian characters
// Ranges taken from http://en.wikipedia.org/wiki/Unicode_block
// I'm no language expert. If you think some ranges are wrongly chosen or
// a range is missing, please contact me
define('IDX_ASIAN1','[\x{0E00}-\x{0E7F}]'); // Thai
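
To illustrate what "every character as a single word" means for the word count, here is a simplified imitation. It is not DokuWiki's actual tokenizer: the constant ASIAN_CHARS and the function roughTokens are hypothetical names, and only the Thai range is copied from the quoted define().

```php
<?php
// Sketch only: a simplified imitation, not DokuWiki's actual tokenizer.

const ASIAN_CHARS = '[\x{0E00}-\x{0E7F}]'; // Thai range, as in the quoted define()

/**
 * Put spaces around every character in the range, then split on whitespace.
 */
function roughTokens(string $text): array
{
    $spaced = preg_replace('/(' . ASIAN_CHARS . ')/u', ' $1 ', $text) ?? '';
    return preg_split('/\s+/u', $spaced, -1, PREG_SPLIT_NO_EMPTY) ?: [];
}

$latin = roughTokens('evam me sutam');
// Seven Thai code points, written as escapes to stay encoding-safe.
$thai = roughTokens("\u{0E40}\u{0E2D}\u{0E27}\u{0E21}\u{0E3A}\u{0E40}\u{0E21}");
printf("Latin: %d tokens, Thai: %d tokens\n", count($latin), count($thai));
// prints: Latin: 3 tokens, Thai: 7 tokens
```

Under this scheme a long page in Thai script produces roughly one list entry per character, so the word list grows with the character count rather than with the number of actual words.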

I have deleted all files in en:s and de:s that contained only examples of how to integrate Google Site Search and comments about other search engines tested by Mr. Bullitt for accesstoinsight.org in the past.

List of unindexed Thai script files:
Code:
cs-th:atthakatha:sut.kn.jat.v01_att
cs-th:atthakatha:sut.kn.jat.v02_att
cs-th:atthakatha:sut.kn.jat.v03_att
cs-th:atthakatha:sut.kn.jat.v04_att
cs-th:atthakatha:sut.kn.jat.v05_att
cs-th:atthakatha:sut.kn.jat.v06_att
cs-th:atthakatha:sut.kn.jat.v07_att
cs-th:atthakatha:sut.kn.jat.v08_att
cs-th:atthakatha:sut.kn.jat.v09_att
cs-th:atthakatha:sut.kn.jat.v10_att
cs-th:atthakatha:sut.kn.jat.v11_att
cs-th:atthakatha:sut.kn.jat.v12_att
cs-th:atthakatha:sut.kn.jat.v13_att
cs-th:atthakatha:sut.kn.jat.v14_att
cs-th:atthakatha:sut.kn.jat.v15_att
cs-th:atthakatha:sut.kn.jat.v16_att
cs-th:atthakatha:sut.kn.jat.v17_att
cs-th:atthakatha:sut.kn.jat.v18_att
cs-th:atthakatha:sut.kn.jat.v19_att
cs-th:atthakatha:sut.kn.jat.v20_att
cs-th:atthakatha:sut.kn.jat.v21_att
cs-th:atthakatha:sut.kn.jat.v22_att
cs-th:atthakatha:sut.kn.jat.v23_att
cs-th:atthakatha:sut.kn.khp.0_att
cs-th:atthakatha:sut.kn.khp.1_att
cs-th:atthakatha:sut.kn.khp.2_att
cs-th:atthakatha:sut.kn.khp.3_att
cs-th:atthakatha:sut.kn.khp.4_att
cs-th:atthakatha:sut.kn.khp.5_att
cs-th:atthakatha:sut.kn.khp.6_att
cs-th:atthakatha:sut.kn.khp.7_att
cs-th:atthakatha:sut.kn.khp.8_att
cs-th:atthakatha:sut.kn.khp.9_att
cs-th:atthakatha:sut.kn.man.00_att
cs-th:atthakatha:sut.kn.man.01_att
cs-th:atthakatha:sut.kn.man.02_att
cs-th:atthakatha:sut.kn.man.03_att
cs-th:atthakatha:sut.kn.man.04_att
cs-th:atthakatha:sut.kn.man.05_att
cs-th:atthakatha:sut.kn.man.06_att
cs-th:atthakatha:sut.kn.man.07_att
cs-th:atthakatha:sut.kn.man.08_att
cs-th:atthakatha:sut.kn.man.09_att
cs-th:atthakatha:sut.kn.man.10_att
cs-th:atthakatha:sut.kn.man.11_att
cs-th:atthakatha:sut.kn.man.12_att
cs-th:atthakatha:sut.kn.man.13_att
cs-th:atthakatha:sut.kn.man.14_att
cs-th:atthakatha:sut.kn.man.15_att
cs-th:atthakatha:sut.kn.man.16_att
cs-th:atthakatha:sut.kn.net.0_att
cs-th:atthakatha:sut.kn.net.1_att
cs-th:atthakatha:sut.kn.net.2_att
cs-th:atthakatha:sut.kn.net.3_att
cs-th:atthakatha:sut.kn.net.4_att
cs-th:atthakatha:sut.kn.net.5_att
cs-th:atthakatha:sut.kn.net.6_att
cs-th:atthakatha:sut.kn.pat.v0_att
cs-th:atthakatha:sut.kn.pat.v1.01_att
cs-th:atthakatha:sut.kn.pat.v1.02_att
cs-th:atthakatha:sut.kn.pat.v1.03_att
cs-th:atthakatha:sut.kn.pat.v1.04_att
cs-th:atthakatha:sut.kn.pat.v1.05_att
cs-th:atthakatha:sut.kn.pat.v1.06_att
cs-th:atthakatha:sut.kn.pat.v1.07_att
cs-th:atthakatha:sut.kn.pat.v1.08_att
cs-th:atthakatha:sut.kn.pat.v1.09_att
cs-th:atthakatha:sut.kn.pat.v1.10_att
cs-th:atthakatha:sut.kn.pat.v1_att
cs-th:atthakatha:sut.kn.pat.v2_att
cs-th:atthakatha:sut.kn.pat.v3.01_att
cs-th:atthakatha:sut.kn.pat.v3.02_att
cs-th:atthakatha:sut.kn.pat.v3.03_att
cs-th:atthakatha:sut.kn.pat.v3.04_att
cs-th:atthakatha:sut.kn.pat.v3.05_att
cs-th:atthakatha:sut.kn.pat.v3.06_att
cs-th:atthakatha:sut.kn.pat.v3.07_att
cs-th:atthakatha:sut.kn.pat.v3.08_att
cs-th:atthakatha:sut.kn.pat.v3.09_att
cs-th:atthakatha:sut.kn.pat.v3.10_att
cs-th:atthakatha:sut.kn.pat.v3_att
cs-th:atthakatha:sut.kn.pev.0_att
cs-th:atthakatha:sut.kn.pev.1_att
cs-th:atthakatha:sut.kn.pev.2_att
cs-th:atthakatha:sut.kn.pev.3_att
cs-th:atthakatha:sut.kn.pev.4_att
cs-th:atthakatha:sut.kn.snp.1_att
cs-th:atthakatha:sut.kn.snp.2_att
cs-th:atthakatha:sut.kn.snp.3_att
cs-th:atthakatha:sut.kn.snp.4_att
cs-th:atthakatha:sut.kn.snp.5_att
cs-th:atthakatha:sut.kn.tha.00_att
cs-th:atthakatha:sut.kn.tha.01_att
cs-th:atthakatha:sut.kn.tha.02_att
cs-th:atthakatha:sut.kn.tha.03_att
cs-th:atthakatha:sut.kn.tha.04_att
cs-th:atthakatha:sut.kn.tha.05_att
cs-th:atthakatha:sut.kn.tha.06_att
cs-th:atthakatha:sut.kn.tha.07_att
cs-th:atthakatha:sut.kn.tha.08_att
cs-th:atthakatha:sut.kn.tha.09_att
cs-th:atthakatha:sut.kn.tha.10_att
cs-th:atthakatha:sut.kn.tha.11_att
cs-th:atthakatha:sut.kn.tha.12_att
cs-th:atthakatha:sut.kn.tha.13_att
cs-th:atthakatha:sut.kn.tha.14_att
cs-th:atthakatha:sut.kn.tha.15_att
cs-th:atthakatha:sut.kn.tha.16_att
cs-th:atthakatha:sut.kn.tha.17_att
cs-th:atthakatha:sut.kn.tha.18_att
cs-th:atthakatha:sut.kn.tha.19_att
cs-th:atthakatha:sut.kn.tha.20_att
cs-th:atthakatha:sut.kn.tha.21_att
cs-th:atthakatha:sut.kn.thi.01_att
cs-th:atthakatha:sut.kn.thi.02_att
cs-th:atthakatha:sut.kn.thi.03_att
cs-th:atthakatha:sut.kn.thi.04_att
cs-th:atthakatha:sut.kn.thi.05_att
cs-th:atthakatha:sut.kn.thi.06_att
cs-th:atthakatha:sut.kn.thi.07_att
cs-th:atthakatha:sut.kn.thi.08_att
cs-th:atthakatha:sut.kn.thi.09_att
cs-th:atthakatha:sut.kn.thi.10_att
cs-th:atthakatha:sut.kn.thi.11_att
cs-th:atthakatha:sut.kn.thi.12_att
cs-th:atthakatha:sut.kn.thi.13_att
cs-th:atthakatha:sut.kn.thi.14_att
cs-th:atthakatha:sut.kn.thi.15_att
cs-th:atthakatha:sut.kn.thi.16_att
cs-th:atthakatha:sut.kn.uda.0_att
cs-th:atthakatha:sut.kn.uda.1_att
cs-th:atthakatha:sut.kn.uda.2_att
cs-th:atthakatha:sut.kn.uda.3_att
cs-th:atthakatha:sut.kn.uda.4_att
cs-th:atthakatha:sut.kn.uda.5_att
cs-th:atthakatha:sut.kn.uda.6_att
cs-th:atthakatha:sut.kn.uda.7_att
cs-th:atthakatha:sut.kn.uda.8_att
cs-th:atthakatha:sut.kn.viv.v0_att
cs-th:atthakatha:sut.kn.viv.v1_att
cs-th:atthakatha:sut.kn.viv.v2_att
cs-th:atthakatha:sut.mn.v00_att
cs-th:atthakatha:sut.mn.v01_att
cs-th:atthakatha:sut.mn.v02_att
cs-th:atthakatha:sut.mn.v03_att
cs-th:atthakatha:sut.mn.v04_att
cs-th:atthakatha:sut.mn.v05_att
cs-th:atthakatha:sut.mn.v06_att
cs-th:atthakatha:sut.mn.v07_att
cs-th:atthakatha:sut.mn.v08_att
cs-th:atthakatha:sut.mn.v09_att
cs-th:atthakatha:sut.mn.v10_att
cs-th:atthakatha:sut.mn.v11_att
cs-th:atthakatha:sut.mn.v12_att
cs-th:atthakatha:sut.mn.v13_att
cs-th:atthakatha:sut.mn.v14_att
cs-th:atthakatha:sut.mn.v15_att
cs-th:atthakatha:sut.sn.00_att
cs-th:atthakatha:sut.sn.01_att
cs-th:atthakatha:sut.sn.02_att
cs-th:atthakatha:sut.sn.03_att
cs-th:atthakatha:sut.sn.04_att
cs-th:atthakatha:sut.sn.05_att
cs-th:atthakatha:sut.sn.06_att
cs-th:atthakatha:sut.sn.07_att
cs-th:atthakatha:sut.sn.08_att
cs-th:atthakatha:sut.sn.09_att
cs-th:atthakatha:sut.sn.10_att
cs-th:atthakatha:sut.sn.11_att
cs-th:atthakatha:sut.sn.12_att
cs-th:atthakatha:sut.sn.13_att
cs-th:atthakatha:sut.sn.14_att
cs-th:atthakatha:sut.sn.15_att
cs-th:atthakatha:sut.sn.16_att
cs-th:atthakatha:sut.sn.17_att
cs-th:atthakatha:sut.sn.18_att
cs-th:atthakatha:sut.sn.19_att
cs-th:atthakatha:sut.sn.20_att
cs-th:atthakatha:sut.sn.21_att
cs-th:atthakatha:sut.sn.22_att
cs-th:atthakatha:sut.sn.23_att
cs-th:atthakatha:sut.sn.24_att
cs-th:atthakatha:sut.sn.25_att
cs-th:atthakatha:sut.sn.26_att
cs-th:atthakatha:sut.sn.27_att
cs-th:atthakatha:sut.sn.28_att
cs-th:atthakatha:sut.sn.29_att
cs-th:atthakatha:sut.sn.30_att
cs-th:atthakatha:sut.sn.31_att
cs-th:atthakatha:sut.sn.32_att
cs-th:atthakatha:sut.sn.33_att
cs-th:atthakatha:sut.sn.34_att
cs-th:atthakatha:sut.sn.35_att
cs-th:atthakatha:sut.sn.36_att
cs-th:atthakatha:sut.sn.37_att
cs-th:atthakatha:sut.sn.38_att
cs-th:atthakatha:sut.sn.39_att
cs-th:atthakatha:sut.sn.40_att
cs-th:atthakatha:sut.sn.41_att
cs-th:atthakatha:sut.sn.42_att
cs-th:atthakatha:sut.sn.43_att
cs-th:atthakatha:sut.sn.44_att
cs-th:atthakatha:sut.sn.45_att
cs-th:atthakatha:sut.sn.46_att
cs-th:atthakatha:sut.sn.47_att
cs-th:atthakatha:sut.sn.48_att
cs-th:atthakatha:sut.sn.49_att
cs-th:atthakatha:sut.sn.50_att
cs-th:atthakatha:sut.sn.51_att
cs-th:atthakatha:sut.sn.52_att
cs-th:atthakatha:sut.sn.53_att
cs-th:atthakatha:sut.sn.54_att
cs-th:atthakatha:sut.sn.55_att
cs-th:atthakatha:sut.sn.56_att
cs-th:atthakatha:vin.cv.01_att
cs-th:atthakatha:vin.cv.02_att
cs-th:atthakatha:vin.cv.03_att
cs-th:atthakatha:vin.cv.04_att
cs-th:atthakatha:vin.cv.05_att
cs-th:atthakatha:vin.cv.06_att
cs-th:atthakatha:vin.cv.07_att
cs-th:atthakatha:vin.cv.08_att
cs-th:atthakatha:vin.cv.09_att
cs-th:atthakatha:vin.cv.10_att
cs-th:atthakatha:vin.cv.11_att
cs-th:atthakatha:vin.cv.12_att
cs-th:atthakatha:vin.mv.01_att
cs-th:atthakatha:vin.mv.02_att
cs-th:atthakatha:vin.mv.03_att
cs-th:atthakatha:vin.mv.04_att
cs-th:atthakatha:vin.mv.05_att
cs-th:atthakatha:vin.mv.06_att
cs-th:atthakatha:vin.mv.07_att
cs-th:atthakatha:vin.mv.08_att
cs-th:atthakatha:vin.mv.09_att
cs-th:atthakatha:vin.mv.10_att
cs-th:atthakatha:vin.pac.ak_att
cs-th:atthakatha:vin.pac.nii_att
cs-th:atthakatha:vin.pac.pc_att
cs-th:atthakatha:vin.pac.pci_att
cs-th:atthakatha:vin.pac.pd_att
cs-th:atthakatha:vin.pac.pdi_att
cs-th:atthakatha:vin.pac.pri_att
cs-th:atthakatha:vin.pac.sgi_att
cs-th:atthakatha:vin.pac.sk_att
cs-th:atthakatha:vin.par.ay_att
cs-th:atthakatha:vin.par.ga_att
cs-th:atthakatha:vin.par.ni_att
cs-th:atthakatha:vin.par.pr_att
cs-th:atthakatha:vin.par.sg_att
cs-th:atthakatha:vin.par.ve_att
cs-th:atthakatha:vin.pv.01_att
cs-th:atthakatha:vin.pv.02_att
cs-th:atthakatha:vin.pv.03_att
cs-th:atthakatha:vin.pv.04_att
cs-th:atthakatha:vin.pv.05_att
cs-th:atthakatha:vin.pv.06_att
cs-th:atthakatha:vin.pv.07_att
cs-th:atthakatha:vin.pv.08_att
cs-th:atthakatha:vin.pv.09_att
cs-th:atthakatha:vin.pv.10_att
cs-th:atthakatha:vin.pv.11_att
cs-th:atthakatha:vin.pv.12_att
cs-th:atthakatha:vin.pv.13_att
cs-th:atthakatha:vin.pv.14_att
cs-th:atthakatha:vin.pv.15_att
cs-th:atthakatha:vin.pv.16_att
cs-th:atthakatha:vin.pv.17_att
cs-th:atthakatha:vin.pv.18_att
cs-th:tika:abh.ava-pura.01_tik
cs-th:tika:abh.ava-pura.02_tik
cs-th:tika:abh.ava-pura.03_tik
cs-th:tika:abh.ava-pura.04_tik
cs-th:tika:abh.ava-pura.05_tik
cs-th:tika:abh.ava-pura.06_tik
cs-th:tika:abh.ava-pura.07_tik
cs-th:tika:abh.ava-pura.08_tik
cs-th:tika:abh.ava-pura.09_tik
cs-th:tika:abh.ava-pura.10_tik
cs-th:tika:abh.ava-pura.11_tik
cs-th:tika:sut.dn.01_abh_tik
cs-th:tika:sut.dn.01_tik
cs-th:tika:sut.dn.02_abh_tik
cs-th:tika:sut.dn.02_tik
cs-th:tika:sut.dn.03_abh_tik
cs-th:tika:sut.dn.03_tik
cs-th:tika:sut.dn.04_abh_tik
cs-th:tika:sut.dn.04_tik
cs-th:tika:sut.dn.05_abh_tik
cs-th:tika:sut.dn.05_tik
cs-th:tika:sut.dn.06_abh_tik
cs-th:tika:sut.dn.06_tik
cs-th:tika:sut.dn.07_abh_tik
cs-th:tika:sut.dn.07_tik
cs-th:tika:sut.dn.08_abh_tik
cs-th:tika:sut.dn.08_tik
cs-th:tika:sut.dn.09_abh_tik
cs-th:tika:sut.dn.09_tik
cs-th:tika:sut.dn.0_tik
cs-th:tika:sut.dn.10_abh_tik
cs-th:tika:sut.dn.10_tik
cs-th:tika:sut.dn.11_abh_tik
cs-th:tika:sut.dn.11_tik
cs-th:tika:sut.dn.12_abh_tik
cs-th:tika:sut.dn.12_tik
cs-th:tika:sut.dn.13_abh_tik
cs-th:tika:sut.dn.13_tik
cs-th:tika:sut.dn.14_tik
cs-th:tika:sut.dn.15_tik
cs-th:tika:sut.dn.16_tik
cs-th:tika:sut.dn.17_tik
cs-th:tika:sut.dn.18_tik
cs-th:tika:sut.dn.19_tik
cs-th:tika:sut.dn.20_tik
cs-th:tika:sut.dn.21_tik
cs-th:tika:sut.dn.22_tik
cs-th:tika:sut.dn.23_tik
cs-th:tika:sut.dn.24_tik
cs-th:tika:sut.dn.25_tik
cs-th:tika:sut.dn.26_tik
cs-th:tika:sut.dn.27_tik
cs-th:tika:sut.dn.28_tik
cs-th:tika:sut.dn.29_tik
cs-th:tika:sut.dn.30_tik
cs-th:tika:sut.dn.31_tik
cs-th:tika:sut.dn.32_tik
cs-th:tika:sut.dn.33_tik
cs-th:tika:sut.dn.34_tik
cs-th:tika:sut.kn.paka00_tik
cs-th:tika:sut.kn.paka01_tik
cs-th:tika:sut.kn.paka02_tik
cs-th:tika:sut.kn.paka03_tik
cs-th:tika:sut.kn.paka04_tik
cs-th:tika:sut.kn.paka05_tik
cs-th:tika:sut.kn.paka06_tik
cs-th:tika:sut.kn.vibh01_tik
cs-th:tika:sut.kn.vibh02_tik
cs-th:tika:sut.kn.vibh03_tik
cs-th:tika:sut.kn.vibh04_tik
cs-th:tika:sut.kn.vibh05_tik
cs-th:tika:sut.kn.vibh06_tik
cs-th:tika:sut.mn.0_tik
cs-th:tika:sut.mn.v01_tik
cs-th:tika:sut.mn.v02_tik
cs-th:tika:sut.mn.v03_tik
cs-th:tika:sut.mn.v04_tik
cs-th:tika:sut.mn.v05_tik
cs-th:tika:sut.mn.v06_tik
cs-th:tika:sut.mn.v07_tik
cs-th:tika:sut.mn.v08_tik
cs-th:tika:sut.mn.v09_tik
cs-th:tika:sut.mn.v10_tik
cs-th:tika:sut.mn.v11_tik
cs-th:tika:sut.mn.v12_tik
cs-th:tika:sut.mn.v13_tik
cs-th:tika:sut.mn.v14_tik
cs-th:tika:sut.mn.v15_tik
cs-th:tika:sut.sn.01_tik
cs-th:tika:sut.sn.02_tik
cs-th:tika:sut.sn.03_tik
cs-th:tika:sut.sn.04_tik
cs-th:tika:sut.sn.05_tik
cs-th:tika:sut.sn.06_tik
cs-th:tika:sut.sn.07_tik
cs-th:tika:sut.sn.08_tik
cs-th:tika:sut.sn.09_tik
cs-th:tika:sut.sn.0_tik
cs-th:tika:sut.sn.10_tik
cs-th:tika:sut.sn.11_tik
cs-th:tika:sut.sn.12_tik
cs-th:tika:sut.sn.13_tik
cs-th:tika:sut.sn.14_tik
cs-th:tika:sut.sn.15_tik
cs-th:tika:sut.sn.16_tik
cs-th:tika:sut.sn.17_tik
cs-th:tika:sut.sn.18_tik
cs-th:tika:sut.sn.19_tik
cs-th:tika:sut.sn.20_tik
cs-th:tika:sut.sn.21_tik
cs-th:tika:sut.sn.22_tik
cs-th:tika:sut.sn.23_tik
cs-th:tika:sut.sn.24_tik
cs-th:tika:sut.sn.25_tik
cs-th:tika:sut.sn.26_tik
cs-th:tika:sut.sn.27_tik
cs-th:tika:sut.sn.28_tik
cs-th:tika:sut.sn.29_tik
cs-th:tika:sut.sn.30_tik
cs-th:tika:sut.sn.31_tik
cs-th:tika:sut.sn.32_tik
cs-th:tika:sut.sn.33_tik
cs-th:tika:sut.sn.34_tik
cs-th:tika:sut.sn.35_tik
cs-th:tika:sut.sn.36_tik
cs-th:tika:sut.sn.37_tik
cs-th:tika:sut.sn.38_tik
cs-th:tika:sut.sn.39_tik
cs-th:tika:sut.sn.40_tik
cs-th:tika:sut.sn.41_tik
cs-th:tika:sut.sn.42_tik
cs-th:tika:sut.sn.43_tik
cs-th:tika:sut.sn.44_tik
cs-th:tika:sut.sn.45_tik
cs-th:tika:sut.sn.46_tik
cs-th:tika:sut.sn.47_tik
cs-th:tika:sut.sn.48_tik
cs-th:tika:sut.sn.49_tik
cs-th:tika:sut.sn.50_tik
cs-th:tika:sut.sn.51_tik
cs-th:tika:sut.sn.52_tik
cs-th:tika:sut.sn.53_tik
cs-th:tika:sut.sn.54_tik
cs-th:tika:sut.sn.55_tik
cs-th:tika:sut.sn.56_tik
cs-th:tika:vin.bhi.0_dvem_tik
cs-th:tika:vin.bhi.0_kank_tik
cs-th:tika:vin.bhi.0_vima_tik
cs-th:tika:vin.bhi.v_dvem_tik
cs-th:tika:vin.bhu.0_dvem_tik
cs-th:tika:vin.bhu.0_kank_tik
cs-th:tika:vin.bhu.ni_kank_tik
cs-th:tika:vin.bhu.pc_kank_tik
cs-th:tika:vin.bhu.pr_kank_tik
cs-th:tika:vin.bhu.sg_kank_tik
cs-th:tika:vin.cv.01_sara_tik
cs-th:tika:vin.cv.01_vima_tik
cs-th:tika:vin.cv.02_sara_tik
cs-th:tika:vin.cv.02_vima_tik
cs-th:tika:vin.cv.03_sara_tik
cs-th:tika:vin.cv.03_vima_tik
cs-th:tika:vin.cv.04_sara_tik
cs-th:tika:vin.cv.04_vima_tik
cs-th:tika:vin.cv.05_sara_tik
cs-th:tika:vin.cv.05_vima_tik
cs-th:tika:vin.cv.06_sara_tik
cs-th:tika:vin.cv.06_vima_tik
cs-th:tika:vin.cv.07_sara_tik
cs-th:tika:vin.cv.07_vima_tik
cs-th:tika:vin.cv.08_sara_tik
cs-th:tika:vin.cv.08_vima_tik
cs-th:tika:vin.cv.09_sara_tik
cs-th:tika:vin.cv.09_vima_tik
cs-th:tika:vin.cv.0_paci_tik
cs-th:tika:vin.cv.0_sara_tik
cs-th:tika:vin.cv.0_vaji_tik
cs-th:tika:vin.cv.0_vima_tik
cs-th:tika:vin.cv.10_sara_tik
cs-th:tika:vin.cv.10_vima_tik
cs-th:tika:vin.cv.11_sara_tik
cs-th:tika:vin.cv.11_vima_tik
cs-th:tika:vin.cv.12_sara_tik
cs-th:tika:vin.cv.12_vima_tik
cs-th:tika:vin.kank.0_kank_tik
cs-th:tika:vin.kankha.0_dvem_tik
cs-th:tika:vin.khud.01_khud_tik
cs-th:tika:vin.khud.02_khud_tik
cs-th:tika:vin.pac.pci_vima_tik
cs-th:tika:vin.vila.08_vila_tik
cs-th:tika:vin.vila.09_vila_tik
cs-th:tika:vin.vila.10_vila_tik
cs-th:tika:vin.vila.11_vila_tik
cs-th:tika:vin.vila.12_vila_tik
cs-th:tika:vin.vila.13_vila_tik
cs-th:tika:vin.vila.14_vila_tik
cs-th:tika:vin.vila.15_vila_tik
cs-th:tika:vin.vila.16_vila_tik
cs-th:tika:vin.vila.17_vila_tik
cs-th:tika:vin.vila.18_vila_tik
cs-th:tika:vin.vila.19_vila_tik
cs-th:tika:vin.vila.20_vila_tik
cs-th:tika:vin.vila.21_vila_tik
cs-th:tika:vin.vila.22_vila_tik
cs-th:tika:vin.vila.23_vila_tik
cs-th:tika:vin.vila.24_vila_tik

_/\_

Online Johann

  • Samanera
  • Very Engaged Member
  • *
  • Sadhu! or +375/-0
  • Gender: Male
  • Date of ordination/Datum der Ordination.: 20140527
Re: from: [ATI.eu] CSCD xml to ati.eu format: converting, editing
« Reply #38 on: April 01, 2019, 06:17:50 AM »
Sadhu

Atma noticed that the search works less well for scripts other than Latin, but does not yet understand why certain character ranges are split character by character.
This post and Content has come to be by Dhamma-Dana and so is given as it       Dhamma-Dana: Johann

Online Johann

Re: [ATI.eu] Indexing and search engine issues
« Reply #39 on: April 01, 2019, 06:34:00 PM »
Atma has attached the Khmer and Thai Unicode tables so that special characters such as stops and punctuation can possibly be excluded ... breaking the text into single characters seems to be meaningless.

Not sure if Upasaka Vorapol may like to assist here for Thai. Atma will try to list special characters in Khmer but not sure now how the indexer or better the search engine works with compunctions and so on generally (simply cut them all away?)

Online Johann

Re: [ATI.eu] Indexing and search engine issues
« Reply #40 on: April 01, 2019, 07:19:51 PM »
17D4 KHMER SIGN KHAN • functions as a full stop, period
(→ 0E2F   thai character paiyannoi
→ 104A   myanmar sign little section)

17D5 KHMER SIGN BARIYOOSAN • indicates the end of a section or a text
(→ 0E5A   thai character angkhankhu
→ 104B   myanmar sign section)

17D6 KHMER SIGN CAMNUC PII KUUH • functions as colon
(• the preferred transliteration is camnoc pii kuuh
→ 00F7 ÷  division sign → 0F14   tibetan mark gter tsheg )

17D7 KHMER SIGN LEK TOO • repetition sign
(→ 0E46   thai character maiyamok )

17D8 KHMER SIGN BEYYAL • et cetera
• use of this character is discouraged; other abbreviations for et cetera also exist • preferred spelling: ។ល។

17D9 KHMER SIGN PHNAEK MUAN • indicates the beginning of a book or a treatise • the preferred transliteration is phnek moan
(→ 0E4F   thai character fongman )

17DA KHMER SIGN KOOMUUT • indicates the end of a book or treatise • this forms a pair with 17D9   • the preferred transliteration is koomoot
(→ 0E5B   thai character khomut )

17DB KHMER CURRENCY SYMBOL RIEL

17E0 KHMER DIGIT ZERO
17E1 KHMER DIGIT ONE
17E2 KHMER DIGIT TWO
17E3 KHMER DIGIT THREE
17E4 KHMER DIGIT FOUR
17E5 KHMER DIGIT FIVE
17E6 KHMER DIGIT SIX
17E7 KHMER DIGIT SEVEN
17E8 KHMER DIGIT EIGHT
17E9 KHMER DIGIT NINE

0E50 THAI DIGIT ZERO
0E51 THAI DIGIT ONE
0E52 THAI DIGIT TWO
0E53 THAI DIGIT THREE
0E54 THAI DIGIT FOUR
0E55 THAI DIGIT FIVE
0E56 THAI DIGIT SIX
0E57 THAI DIGIT SEVEN
0E58 THAI DIGIT EIGHT
0E59 THAI DIGIT NINE

Word breaks are either white spaces or zero width spaces in both scripts, Khmer and Thai.
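
The separators listed above could be applied roughly as in the following minimal sketch. It is not the site's indexer code: the names `SEPARATORS` and `split_words` are hypothetical, and treating every listed sign (including the repetition and abbreviation signs) as a word separator is an assumption. The zero-width space is U+200B, which the `\s` class in Python does not cover, so it is added explicitly.

```python
import re

# Khmer signs U+17D4..U+17DA from the list above (assumed to act as separators).
KHMER_PUNCT = "\u17d4\u17d5\u17d6\u17d7\u17d8\u17d9\u17da"
# Thai counterparts cross-referenced in the list above (same assumption).
THAI_PUNCT = "\u0e2f\u0e46\u0e4f\u0e5a\u0e5b"
# Zero-width space; not matched by \s, so it must be listed explicitly.
ZERO_WIDTH_SPACE = "\u200b"

# One or more separator characters in a row count as a single word break.
SEPARATORS = re.compile("[\\s" + ZERO_WIDTH_SPACE + KHMER_PUNCT + THAI_PUNCT + "]+")


def split_words(text):
    """Split text at whitespace, zero-width spaces and the listed signs."""
    return [piece for piece in SEPARATORS.split(text) if piece]
```

The digits in the list are left alone here, so a number written in Khmer or Thai digits stays one searchable word.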

Offline Moritz

  • Chief housekeeper / Chefhausmeister
  • Very Engaged Member
  • *
  • Sadhu! or +268/-0
  • Gender: Male
Re: [ATI.eu] Indexing and search engine issues
« Reply #41 on: April 01, 2019, 09:58:43 PM »
Sadhu, I was just reading the search code, trying to understand a bit how it works.

The idea of breaking text into single characters probably came from Chinese or Japanese, where I think a character usually represents a complete word. Maybe Thai was included by mistake, on the assumption that all Asian scripts use such logographic "full word" characters.

I think it is best to just remove the Thai block from the "Asian" set and treat it "normally", like Roman etc. The Khmer block (1780–17FF) is not included there either, and search is working well for it. The most important change needed would probably be to treat zero-width spaces as separators.
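
As an illustration only, here is a minimal sketch of what such an "Asian" set does and what changes when Thai is left out of it. The ranges in `ASIAN_SINGLE_CHAR` (CJK ideographs and Japanese kana) and the function name `tokenize` are assumptions chosen for the example, not taken from the actual search code. Thai (0E00–0E7F) and Khmer (1780–17FF) are simply not in the set, so their words stay whole.

```python
import re

# Hypothetical "Asian" set: only these ranges are split character by character.
# U+4E00..U+9FFF are CJK ideographs, U+3040..U+30FF are hiragana and katakana.
ASIAN_SINGLE_CHAR = re.compile("[\u4e00-\u9fff\u3040-\u30ff]")


def tokenize(word):
    """Split characters of the "Asian" set into single tokens; keep all else whole."""
    tokens = []
    buffer = ""
    for ch in word:
        if ASIAN_SINGLE_CHAR.match(ch):
            # Flush any pending non-"Asian" run, then emit the single character.
            if buffer:
                tokens.append(buffer)
                buffer = ""
            tokens.append(ch)
        else:
            buffer += ch
    if buffer:
        tokens.append(buffer)
    return tokens
```

With this set, a Thai or Khmer word passed to `tokenize` comes back as one token, while a run of Chinese characters comes back as one token per character.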

Quote from Johann: "not sure now how the indexer or better the search engine works with compunctions and so on generally (simply cut them all away?)"
* Moritz ("punctuation" - not "compunction" = "Gewissenhaftigkeit", as Bhante Thanissaro translates "otappa")
Punctuation marks are removed during indexing, and the resulting pieces are stored as "words" in the index, along with some reference tables to store which page has how many occurrences of each word.
When searching, the tables are first consulted to find all pages that contain every single word of the search phrase. Then, if the search phrase (or parts of it) has been put into quotes "", the whole phrase (or the quoted parts) is also matched against the text to find the exact occurrence, including punctuation marks and so on.
For example, searching for "ist den Drei Juwelen, dem Buddha, dem Dhamma, der Sangha, gewidmet" will find one result on the page http://www.accesstoinsight.eu/km/index now.
* Moritz (Strange: It should also find the same on http://www.accesstoinsight.eu/de/index, but it does not. Seems like the index is somehow incomplete again.)
If one comma or one word is left out, as in "ist den Drei Juwelen Buddha, dem Dhamma, der Sangha, gewidmet", still in quotes, no result is found, because there is no exact match for the quotation.
However, if the same text is searched for without quotation marks around it, results are found again, because the search then looks for every word rather than the whole phrase and ignores all punctuation.
* Moritz (Strange: Searching in this way also gives http://www.accesstoinsight.eu/de/index as a result, as it should be. But it did not find the match when searching for the whole quoted phrase.)
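
The two-step behaviour described above can be sketched as follows. This is a simplified illustration under assumptions, not the site's search code: `build_index` and `search` are hypothetical names, pages are held in a plain dictionary, words are whatever `\w+` matches, and matching is case-insensitive.

```python
import re
from collections import Counter, defaultdict

# A "word" is any run of letters/digits; punctuation is dropped while indexing.
WORD = re.compile(r"\w+")


def build_index(pages):
    """Map each lowercased word to {page_id: number of occurrences}."""
    index = defaultdict(dict)
    for page_id, text in pages.items():
        for word, count in Counter(WORD.findall(text.lower())).items():
            index[word][page_id] = count
    return index


def search(query, pages, index):
    """Step 1: pages containing every word. Step 2: exact match of quoted parts."""
    phrases = re.findall(r'"([^"]+)"', query)
    words = WORD.findall(query.lower())
    if not words:
        return []
    candidates = set(pages)
    for word in words:
        candidates &= set(index.get(word, {}))
    # Quoted parts must occur literally, punctuation included.
    return sorted(
        page_id
        for page_id in candidates
        if all(phrase.lower() in pages[page_id].lower() for phrase in phrases)
    )
```

In this sketch a quoted query with a missing comma passes the first step (all words are present) but fails the second, which is the behaviour described for the quoted German phrase.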

So it should still be possible then to find exact text passages including punctuation marks, if quoted. Although, as just seen, sometimes the search engine might not work as it should. ^-^

I think I now know how to make the necessary changes to include the Khmer and Thai punctuation marks and zero-width spaces as separators. After adding these separators, the pages will have to be re-indexed.

I will try to do it later this week. Not today anymore.

_/\_

Online Johann

Re: [ATI.eu] Indexing and search engine issues
« Reply #42 on: April 01, 2019, 10:11:35 PM »
Sadhu
