Virtual Dhamma-Vinaya Vihara

Topic started by: Johann on June 16, 2018, 02:53:02 PM

Title: [ATI.eu] Replacement, regex issues (Content styling)
Post by: Johann on June 16, 2018, 02:53:02 PM
Atma has just seen that the nice Wrap plugin (https://www.dokuwiki.org/plugin:wrap) has been installed, which might make it much easier to bring the HTML content of the ZzE pages into the CMS/wiki.

The drawback is that the markup is then no longer as "light" as with simple wiki syntax.

However, this topic is dedicated to the styling of content: ideas, standards, whatever.

A list of replacements can be found here: http://accesstoinsight.eu/doku.php?id=de:import_zze (development in progress)
Title: Re: [dokuwiki] ATI/ZzE Content-style
Post by: Johann on June 16, 2018, 03:06:26 PM
Existing styles on ZzE for the content part:

Chapter
Code: [Select]
<div class="chapter">
replace with
Code: [Select]
<div chapter>

Editor note

Citation excerpt
Code: [Select]
<div class='excerpt'>
replace with
Code: [Select]
<div excerpt>

Cite (text source)
Code: [Select]
<p class='cite'>Text</p>
replace with
Code: [Select]
<span cite>Text</span>

Free verse
Code: [Select]
<div class="freeverse">

Verse
Code: [Select]
<div class="verse">

Tagline
Code: [Select]
<p class="tagline">
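
The replacements listed above could be sketched outside the wiki roughly like this. It is only a minimal sketch in Python, not the way the replace plugin does it: the helper name `to_wrap_syntax` is made up for illustration, and it assumes that the other div classes (freeverse, verse) follow the same pattern as chapter and excerpt, which is not stated above.

```python
import re

# <div class="name"> or <div class='name'> with exactly one class name.
DIV_CLASS = re.compile(r"""<div\s+class=(["'])([\w-]+)\1\s*>""")
# <p class='cite'>...</p>, non-greedy so each paragraph is handled on its own.
P_CITE = re.compile(r"""<p\s+class=(["'])cite\1\s*>(.*?)</p>""", re.DOTALL)


def to_wrap_syntax(html: str) -> str:
    """Hypothetical helper: rewrite ZzE class attributes into Wrap-style tags."""
    html = DIV_CLASS.sub(r"<div \2>", html)
    html = P_CITE.sub(r"<span cite>\2</span>", html)
    return html


sample = "<div class=\"chapter\">\n<p class='cite'>Text</p>\n</div>"
print(to_wrap_syntax(sample))
```

Closing </div> tags are left untouched, since only the opening tags carry the class attribute.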
Title: Re: [ATI.eu] ATI/ZzE Content-style
Post by: Johann on June 17, 2018, 01:34:28 PM
Atma has started here with replacements in the file markup:

http://www.accesstoinsight.eu/doku.php?id=de:lib:authors:thanissaro:beyond1

This might be "no" problem with something like Notepad++, access to multiple files, and a regex replace function available. By "hand", with the given tools and skill, it would take approximately 4000 hours for all pages.

Atma will now try to get the ZzE styling into the ati.eu CSS.
Title: Re: [ATI.eu] ATI/ZzE Content-style
Post by: Johann on June 25, 2018, 05:24:34 PM
A list of replacements can be found here: http://accesstoinsight.eu/doku.php?id=de:import_zze (development in progress)
Title: Re: [ATI.eu] ATI/ZzE Content-style
Post by: Johann on June 26, 2018, 05:12:36 AM
First action with replace plugin:

Code: [Select]
^ find ^ replace with ^ matches ^
| /<!DOCTYPE(.*?)<body>/s | <body> | 6508 |

Quote
Display: Warning: Unknown: Input variables exceeded 1000. To increase the limit change max_input_vars in php.ini. in Unknown on line 0

It seems that no action was taken, and building the search result page (maybe 10-50 MB in size) took time.

Trying to reduce this to a smaller amount by selecting namespaces.

Code: "ns de:lib:" [Select]
^ find ^ replace with ^ matches ^
| /<!DOCTYPE(.*?)<body>/s | <body> | 663 |


This seems to have executed! Atma will proceed further step by step in this way, according to the replace list (http://accesstoinsight.eu/doku.php?id=de:import_zze).

The replace plugin (see the php.ini issue above) can, as it seems, execute "only" 1000 replace requests at once for now, but it works and is a useful way.
Title: Re: [ATI.eu] ATI/ZzE Content-style
Post by: Johann on June 26, 2018, 07:28:43 AM
header-deletion
find: /<!DOCTYPE(.*?)<body>/s replace with <body>
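
For testing the header deletion on a single file before running it over thousands of pages, the same expression can be tried in Python. This is only a sketch under assumptions: the sample page is invented, and `re.DOTALL` stands in for the /s modifier so that the dot also matches line breaks.

```python
import re

# Same idea as /<!DOCTYPE(.*?)<body>/s: non-greedy, dot matches newlines.
HEADER = re.compile(r"<!DOCTYPE(.*?)<body>", re.DOTALL)

# Invented sample page, only for illustration.
page = (
    "<!DOCTYPE html>\n"
    "<html>\n<head><title>Sample</title></head>\n"
    "<body>\n<p>text</p>\n</body></html>"
)

# count=1 because a page has only one header.
cleaned = HEADER.sub("<body>", page, count=1)
print(cleaned)
```

Because the match is non-greedy, it stops at the first <body> and leaves the content after it alone.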


de:lib: 663, de:tipitaka:sut:an: 483, de:tipitaka:sut:sn: 471, de:tipitaka:sut:kn:j: 614, de: (rest) 797, en:lib: 654, en:tipitaka:sut:an: 364, en:tipitaka:sut:sn: 470, en:tipitaka:sut:kn:j: 611, en:tipitaka:vin: 616, en:tipitaka:sut:kn: (rest) 400, (rest over all) 365. A search for *<body> still gives matches.


6458 matches (if the quick mental arithmetic was right) of 6508.
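
The mental arithmetic can be checked by adding up the per-namespace counts as they are listed above. A small sketch, with the labels only shortened for readability; with the figures as listed, the total comes to 6508, the same as the first preview.

```python
# Match counts per namespace, copied from the list above.
counts = {
    "de:lib": 663,
    "de:tipitaka:sut:an": 483,
    "de:tipitaka:sut:sn": 471,
    "de:tipitaka:sut:kn:j": 614,
    "de (rest)": 797,
    "en:lib": 654,
    "en:tipitaka:sut:an": 364,
    "en:tipitaka:sut:sn": 470,
    "en:tipitaka:sut:kn:j": 611,
    "en:tipitaka:vin": 616,
    "en:tipitaka:sut:kn (rest)": 400,
    "rest over all": 365,
}

total = sum(counts.values())
print(total)  # 6508
```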

Forgot the redirect pages (apart from 22, which can be found with "<!DOCTYPE html PUBLIC"); the other redirects seem to be lost so far.

Bug? After a more complex search request, the result page's input fields are broken. See attached. The " character seems to be the issue.
Title: Re: [ATI.eu] ATI/ZzE Content-style
Post by: Johann on June 28, 2018, 08:02:15 AM
The replace plugin works fine so far; of course there is much time to spend to get into regex and use it most efficiently.

Two things, Nyom Moritz. The limit of 1000 matches is very low for large clean-ups. Could Atma change that? Is it accessible via FTP? And would it be no problem to increase it to maybe 20,000 or even much higher?

Quote
Warning: Unknown: Input variables exceeded 1000. To increase the limit change max_input_vars in php.ini. in Unknown on line 0

Replace seems to have no variables like \L to replace a string with its lowercase form, or is it just a lack of syntax knowledge on my person's side? (For example, to get a word like "halLo_Was_iSt" replaced by "hallo_was_ist".)
Title: Re: [ATI.eu] ATI/ZzE Content-style
Post by: Moritz on June 28, 2018, 02:31:34 PM
Quote from: Johann
The limit of 1000 matches is very low for large clean-ups. Could Atma change that? Is it accessible via FTP? And would it be no problem to increase it to maybe 20,000 or even much higher?

This file is not accessible to us, I think.
But I changed the code to work around that.
It should work now for larger numbers, but it could be very slow. The browser might show a message like "script is not responding" and ask whether to continue the script; simply answer "yes" and wait.

Quote from: Johann
Replace seems to have no variables like \L to replace a string with its lowercase form, or is it just a lack of syntax knowledge on my person's side? (For example, to get a word like "halLo_Was_iSt" replaced by "hallo_was_ist".)

I don't know. Maybe there is another syntax for it. I found this: https://stackoverflow.com/questions/34592160/regex-string-substitution-upper-and-lower-case
But I don't know at the moment whether that would provide a solution.

_/\_
Title: Re: [ATI.eu] ATI/ZzE Content-style
Post by: Johann on June 28, 2018, 04:22:06 PM
Sadhu!
Title: Re: [ATI.eu] ATI/ZzE Content-style
Post by: Johann on June 29, 2018, 07:26:59 AM
The replacement works well for large amounts. The only thing that takes time is building the result page, which is of course huge. Sadhu
Title: Re: [ATI.eu] ATI/ZzE Content-style
Post by: Johann on June 30, 2018, 03:38:23 PM
Firefox copes quite well with loading the result pages.

Today Atma uploaded all the ZzE txt files again, re-indexed them, and started over from the beginning.

After a day of regexing around, everything works quite nicely. Sometimes, with larger requests, the following appears after an abort (in the middle of overwriting):

Quote
Fatal error: Maximum execution time of 60 seconds exceeded in /var/www/clients/clientxxx/webxxx/web/inc/io.php on line 235

This may be related to the connection, though, since no particular pattern was recognized. -> It seems to occur when more than 4000 pages are affected (regardless of the number of matches on each).

Piece by piece, in all directions, the codes and layout... changing links, data tables... will probably still take a good week or two of full-time work until there is a first acceptable appearance.

Title: Re: [ATI.eu] ATI/ZzE Content-style
Post by: Moritz on July 01, 2018, 04:14:48 AM
Sadhu!

Quote from: Johann
Sometimes, with larger requests, the following appears after an abort (in the middle of overwriting):

Quote
Fatal error: Maximum execution time of 60 seconds exceeded in /var/www/clients/clientxxx/webxxx/web/inc/io.php on line 235

This may be related to the connection, though, since no particular pattern was recognized. -> It seems to occur when more than 4000 pages are affected (regardless of the number of matches on each).

That has nothing to do with the connection, but with the fact that the server has a time limit for running a single script. Replacing in 4000 pages apparently takes too long and is then aborted.

Apparently, however, a script can set and raise its own time limit again and again while it is running. I have now built this in accordingly, so that it reserves another 120 seconds for each file. That should easily be enough, and such aborts should no longer occur.

(I have also now incorporated the latest changes by the original author, who has given my hacked-together "select all" a clean graphical interface that fits DokuWiki.)

_/\_
Title: Re: [ATI.eu] ATI/ZzE Content-style
Post by: Moritz on July 01, 2018, 04:32:13 AM
Quote from: Moritz
Quote from: Johann
Replace seems to have no variables like \L to replace a string with its lowercase form, or is it just a lack of syntax knowledge on my person's side? (For example, to get a word like "halLo_Was_iSt" replaced by "hallo_was_ist".)

I don't know. Maybe there is another syntax for it. I found this: https://stackoverflow.com/questions/34592160/regex-string-substitution-upper-and-lower-case
But I don't know at the moment whether that would provide a solution.

_/\_

It seems that things like \L and \U for lowercase and uppercase replacement are not "standard" regex features, but only extras in some programs.
It would surely be possible to build something like this into the BatchEdit plugin as well. I am not really knowledgeable about regular expressions, but have just found a really good manual (https://www.regular-expressions.info) with clear explanations. So maybe I will try to add things like that when I understand more and find time for it.
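
Where no \L is available, the usual way around it is to let a small function compute the replacement for each match. A minimal sketch in Python, only to show the idea; it is not how BatchEdit works, and the helper name `lowercase_matches` and the pattern for "words containing an underscore" are assumptions for this example.

```python
import re


def lowercase_matches(pattern: str, text: str) -> str:
    """Hypothetical helper: replace every match by its lowercase form."""
    return re.sub(pattern, lambda m: m.group(0).lower(), text)


# Only words that contain an underscore are lowercased here.
result = lowercase_matches(r"\b\w*_\w*\b", "See halLo_Was_iSt here")
print(result)  # See hallo_was_ist here
```

In PHP, which the plugin is written in, the comparable tool would be preg_replace_callback rather than a special replacement token.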

And I also still don't know how the indexing works and how fast or slow it is. Have there ever been problems with the BatchEdit plugin showing old results that had not been updated, even when a newer version should exist?
When looking at the code it seems to me that the BatchEdit plugin will always load the latest version and show matches accordingly. So a slow index might perhaps only be a problem sometimes for new pages that have not even been included in the index for the first time so that no results for it would be found...
Title: Re: [ATI.eu] ATI/ZzE Content-style
Post by: Johann on July 01, 2018, 09:47:24 AM
So far, Nyom Moritz, everything works fine for progressing step by step. No real problems seen regarding the index, as it is used only for selecting the available files, while the search is done directly in the files (so there is no need for refreshing or re-indexing unless new files are added).

Quote from: https://www.dokuwiki.org/plugin:batchedit#page_lookup
Page lookup

BatchEdit uses DokuWiki page index to get the list of existing pages instead of going through the data directories. If the index is incomplete the plugin will not see some pages. This also applies to the “special” pages, for example, namespace templates.

So BatchEdit uses the index only for the list of files.

The index itself: not sure for now, but it seems that refreshing the index also picks up new files. It is not too slow.

Regarding regex, yes, the back-references seem to be special. Here $ seems to work better; for inserting a string, \1 = $1, so maybe \L = $L works as well (not tried so far).

Since Atma does not intend to learn or invest much in these skills, whenever a need arises he would look at, and maybe adopt and investigate, the samples given around here.

It would be good, then, to make explanatory pages on ATI.eu step by step in all these regards, to make future work and handover easier.

E.g. http://accesstoinsight.eu/doku.php?id=de:tech:regex_use
Title: Re: [ATI.eu] ATI/ZzE Content-style
Post by: Johann on July 01, 2018, 11:39:08 AM
Speaking of problems... Nyom Moritz (attached)

Note that after copying the new files into cs-rm (Cattasanghayana - Roman), Atma only did a refresh of the index, so maybe this caused an appearance never seen before; it was also the first time running a regex in cs-rm.

Oh... maybe it (this error) has to do with the uploaded image/media files (maybe wrong uppercase in the names); Atma has to look at them and rename them...

Done so far, but not re-indexed for now. It seems that the regex also addresses media files; it is not clear how far (or whether only when collecting the possible files available). If it also executes on them likewise, this could probably become a mess.

Now the replacement task there gave:

Code: [Select]
Fatal error: Uncaught Error: Call to undefined function setTimeLimit() in /var/www/clients/client2157/web5417/web/lib/plugins/batchedit/admin.php:412
Stack trace:
#0 /var/www/clients/client2157/web5417/web/lib/plugins/batchedit/admin.php(385): admin_plugin_batchedit->applyMatches()
#1 /var/www/clients/client2157/web5417/web/lib/plugins/batchedit/admin.php(102): admin_plugin_batchedit->apply()
#2 /var/www/clients/client2157/web5417/web/inc/Action/Admin.php(47): admin_plugin_batchedit->handle()
#3 /var/www/clients/client2157/web5417/web/inc/ActionRouter.php(83): dokuwiki\Action\Admin->preProcess()
#4 /var/www/clients/client2157/web5417/web/inc/ActionRouter.php(48): dokuwiki\ActionRouter->setupAction('admin')
#5 /var/www/clients/client2157/web5417/web/inc/ActionRouter.php(60): dokuwiki\ActionRouter->__construct()
#6 /var/www/clients/client2157/web5417/web/inc/actions.php(16): dokuwiki\ActionRouter::getInstance(true)
#7 /var/www/clients/client2157/web5417/web/doku.php(120): act_dispatch()
#8 {main}
  thrown in /var/www/clients/client2157/web5417/web/lib/plugins/batchedit/admin.php on line 412

Atma will do a re-index, since the files have different names now and the index still holds the old ones. Maybe that solves it.
Title: Re: [ATI.eu] ATI/ZzE Content-style
Post by: Moritz on July 01, 2018, 04:37:11 PM
Quote from: Johann
Speaking of problems... Nyom Moritz (attached)

/.../

Code: [Select]
Fatal error: Uncaught Error: Call to undefined function setTimeLimit() in ...

Atma will do a re-index, since the files have different names now and the index still holds the old ones. Maybe that solves it.

Oh, that is my error. I wrote 'setTimeLimit' instead of 'set_time_limit' in the program, without testing it. So it should have nothing to do with indexing any new files. I will change that. One moment...

Okay, I changed it. Now it should work correctly, I hope. But I have not tested it.

_/\_
Title: Re: [ATI.eu] ATI/ZzE Content-style
Post by: Johann on July 01, 2018, 04:52:01 PM
Indexing is still running slowly; it might be retested tomorrow, Nyom Moritz, so as not to interrupt the progress.
Title: Re: [ATI.eu] ATI/ZzE Content-style
Post by: Johann on July 02, 2018, 02:07:15 AM
The media error is still there (attachment: messy layout, no "select all", and the result page looks the same as the search page). Indexing might not have completed, since the battery ran empty overnight.

Execution now gives no fatal error.

The "$" command seems to no longer work in the replace line. Maybe a change to the plugin code has been made. Haven't tested "\".

Title: Re: [ATI.eu] ATI/ZzE Content-style
Post by: Johann on July 02, 2018, 10:44:52 AM
After the index update finished, the "layout error" still exists, as in the picture above. The text says:

Code: [Select]
Warning: file_get_contents(/var/www/clients/client2157/web5417/web/lib/plugins/batchedit/images/file-document.svg): failed to open stream: No such file or directory in /var/www/clients/client2157/web5417/web/lib/plugins/batchedit/admin.php on line 833

Warning: file_get_contents(/var/www/clients/client2157/web5417/web/lib/plugins/batchedit/images/pencil.svg): failed to open stream: No such file or directory in /var/www/clients/client2157/web5417/web/lib/plugins/batchedit/admin.php on line 833

Warning: file_get_contents(/var/www/clients/client2157/web5417/web/lib/plugins/batchedit/images/arrow-down.svg): failed to open stream: No such file or directory in /var/www/clients/client2157/web5417/web/lib/plugins/batchedit/admin.php on

Thinking it over, my person added the images to the directory, having taken them from the GitHub download (trusting that this would be welcome), and now it seems to work fine with regard to the layout.

The download on DokuWiki is missing those images. My person reported it via the forum (https://forum.dokuwiki.org/post/61558).

However, the result page now lacks the number of pages matched and the sum of matches, which are a useful control and an estimate of success.
Title: Re: [ATI.eu] ATI/ZzE Content-style
Post by: Moritz on July 02, 2018, 02:41:48 PM
Oh, I forgot to include the images that were added by the original author in his most recent updates. The DokuWiki download still has the version from February. The original author, Mykola Ostrovskyy (https://github.com/dwp-forge/batchedit) has not yet created a new release version (https://github.com/dwp-forge/batchedit/releases) since February. It seems he is still working on some major changes he would like to add before the next "official" version.

So all errors here are just because I forgot to upload certain new files. But now I think it should be okay?

Quote from: Johann
However, the result page now lacks the number of pages matched and the sum of matches, which are a useful control and an estimate of success.

I'm not sure how this could be. Testing from here, I get info like this:

After "Preview":
Quote
Search results: 9808 matches on 1019 pages

After "Apply":
Quote
Edit results: 9808 matches on 1019 pages, 2 replacements applied

Quote from: Johann
The "$" command seems to no longer work in the replace line. Maybe a change to the plugin code has been made. Haven't tested "\".

The replacement syntax has not been changed. Testing here, both "\" and "$" work for inserting match back-references.

For example:

regex: "/(mindfulness)/"

replacement: "$1 test \1"

will replace "mindfulness" with "mindfulness test mindfulness".
Seems to be working without problem here.
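
For comparison when testing expressions outside the wiki: the same example in Python, as a small sketch. Note the difference, which is a property of Python and not of the plugin: Python accepts \1 (or \g<1>) in the replacement, but treats $1 as plain text.

```python
import re

text = "right mindfulness"

# \1 and \g<1> both refer to the first group in parentheses.
result = re.sub(r"(mindfulness)", r"\1 test \g<1>", text)
print(result)  # right mindfulness test mindfulness

# In Python "$1" is not a back-reference; it is inserted literally.
literal = re.sub(r"(mindfulness)", "$1", text)
print(literal)  # right $1
```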

_/\_
Title: Re: [ATI.eu] ATI/ZzE Content-style
Post by: Moritz on July 02, 2018, 02:55:00 PM
Just came across another error that could happen when the amount of matches is really huge (for example, searching for "/is/" - must have millions of matches probably):

Quote
Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 69632 bytes) in /var/www/clients/client2157/web5417/web/lib/plugins/batchedit/admin.php on line 336

So just to inform, if coming across that, that is because there are too many results to keep in memory.

I had some discussion with the author who is currently in the process of making some major changes, also thinking about how to deal with huge result sets (https://github.com/dwp-forge/batchedit/issues/16#issuecomment-401596844). So I think I should mention that to him as well and maybe help and try to find a solution. But at the moment don't have much time for this.
Title: Re: [ATI.eu] ATI/ZzE Content-style
Post by: Moritz on July 02, 2018, 03:13:41 PM
Quote from: Johann
The "$" command seems to no longer work in the replace line. Maybe a change to the plugin code has been made. Haven't tested "\".

The replacement syntax has not been changed. Testing here, both "\" and "$" work for inserting match back-references.

Maybe another hint, not to forget the parentheses ().

\1, \2, \3 or $1, $2, $3 ... etc. are references to the groups inside parentheses. With no parentheses, there is no input for \1, \2, $1, $2 etc.

example:
regex: "/(\s[a-zA-Z]*) something in between (mindfulness)/"

replacement: "\1 something different \2"

would replace like this:

"satipatthana something in between mindfulness"
=> "satipatthana something different mindfulness"

"ariyasacca something in between mindfulness"
=> "ariyasacca something different mindfulness"

_/\_
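
The two-group example above can be tried out in Python as a small sketch. One detail shows up when running it: because the first group starts with \s, the word has to be preceded by a whitespace character (a space or a line break) for the expression to match at all. The sample strings are only for illustration.

```python
import re

PATTERN = re.compile(r"(\s[a-zA-Z]*) something in between (mindfulness)")
REPLACEMENT = r"\1 something different \2"

# With whitespace in front of the word, group 1 is " ariyasacca".
with_space = PATTERN.sub(REPLACEMENT, " ariyasacca something in between mindfulness")
print(repr(with_space))

# Without whitespace in front, \s finds nothing to match and the text stays as it is.
without_space = PATTERN.sub(REPLACEMENT, "ariyasacca something in between mindfulness")
print(repr(without_space))
```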
Title: Re: [ATI.eu] ATI/ZzE Content-style
Post by: Johann on July 02, 2018, 04:18:11 PM
Quote from: Moritz
Oh, I forgot to include the images that were added by the original author in his most recent updates. The DokuWiki download still has the version from February. The original author, Mykola Ostrovskyy (https://github.com/dwp-forge/batchedit) has not yet created a new release version (https://github.com/dwp-forge/batchedit/releases) since February. It seems he is still working on some major changes he would like to add before the next "official" version.

So all errors here are just because I forgot to upload certain new files. But now I think it should be okay?

Quote from: Johann
However, the result page now lacks the number of pages matched and the sum of matches, which are a useful control and an estimate of success.

I'm not sure how this could be. Testing from here, I get infos like this:

After "Preview":
Quote
Search results: 9808 matches on 1019 pages

After "Apply":
Quote
Edit results: 9808 matches on 1019 pages, 2 replacements applied

Maybe it's a matter of display, caused by the responsive layout for mobile devices.
But what was just white before now contains the matches.

/me : There seems to be a lot to understand regarding the cache ("Zwischenspeicher"). There are also troubles with the favicon, even though it has been placed in all locations, and a big issue in cs-rm, where the site takes the old version as the newer one, meaning all the "drafts" have to be recovered one by one.

Quote from: Moritz
Quote from: Johann
The "$" command seems to no longer work in the replace line. Maybe a change to the plugin code has been made. Haven't tested "\".

The replacement syntax has not been changed. Testing here, both "\" and "$" work for inserting match back-references.

For example:

regex: "/(mindfulness)/"

replacement: "$1 test \1"

will replace "mindfulness" with "mindfulness test mindfulness".
Seems to be working without problem here.

_/\_

That's great. It might again have been a certain momentary personal handicap, here and there.
Title: Re: [ATI.eu] ATI/ZzE Content-style
Post by: Johann on July 02, 2018, 04:27:22 PM
Quote from: Moritz
Quote from: Johann
The "$" command seems to no longer work in the replace line. Maybe a change to the plugin code has been made. Haven't tested "\".

The replacement syntax has not been changed. Testing here, both "\" and "$" work for inserting match back-references.

Maybe another hint, not to forget the parentheses ().

\1, \2, \3 or $1, $2, $3 ... etc. are references to the groups inside parentheses. With no parentheses, there is no input for \1, \2, $1, $2 etc.

...
_/\_
Sadhu for being so obliging.
Title: Re: [ATI.eu] Replacement, regex issues (Content styling)
Post by: Johann on July 26, 2018, 12:08:30 PM
Lookahead: no idea why there is a match but no replacement, for example with this regex:

Code: "find" [Select]
/\[\[([^\w]*)\/(?=lib\/|tipitaka\/|cdrom\/|extras\/|news\/|noncanon\/|ousources\/|pdf\/|s\/|tech\/)/


Code: "replace with" [Select]
[[de:


A given [[../../../lib gets the match [[../../../, and the replacement looks the same: [[../../../lib ?

This chess-like thinking is certainly no longer within my person's sphere at all...
Title: Re: [ATI.eu] Replacement, regex issues (Content styling)
Post by: Johann on August 12, 2018, 03:35:25 PM
(...a lookahead needs something behind it. Guess the issue is solved for my person.)
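
A small sketch in Python of how this lookahead behaves, assuming the same expression as above (the folder list is copied from the post, the sample links are invented). The lookahead only checks that a folder name followed by "/" comes next, without making it part of the match, so the folder name survives the replacement. A link that ends right after "lib", without the following slash, is not matched at all.

```python
import re

# Folder names as given in the expression above.
DIRS = ["lib", "tipitaka", "cdrom", "extras", "news", "noncanon",
        "ousources", "pdf", "s", "tech"]
LOOKAHEAD = "|".join(re.escape(d) + "/" for d in DIRS)
LINK = re.compile(r"\[\[([^\w]*)/(?=" + LOOKAHEAD + ")")


def to_de_link(text: str) -> str:
    """Hypothetical helper: replace the relative prefix by the de: namespace."""
    return LINK.sub("[[de:", text)


print(to_de_link("[[../../../lib/authors/index]]"))  # [[de:lib/authors/index]]
print(to_de_link("[[../../../lib"))                  # unchanged, no "lib/" follows
```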

The replacement tool has a problem with the " character. Put into the search or replace field, it breaks the search or replace string after execution. It can be worked around by using [^\w] instead, at least for the search.
Title: strange error
Post by: Johann on August 15, 2018, 11:16:51 AM
A strange error appeared while doing one replacement after another. But it seems to be fine after just logging in again.

Screenshot attached.

Info on a "bug": if matches are selected but preview is pushed again, it returns the replacements in green although they were only previewed.
Title: Re: [ATI.eu] Replacement, regex issues (Content styling)
Post by: Moritz on August 17, 2018, 12:22:57 AM
The first looks like an error on Greensta's side. The database was unavailable for a moment it seems.  :-| But if not happening more often, hopefully not a big problem.

The second bug was introduced by me. I just wanted to have different colors for match and replacement.
So instead of having yellow for both, I wanted to have yellow and green in the preview.
And red and green after the replacement.

But it seems I have not changed it for the first preview, where both are still yellow.

The original author was also wondering why I did this change. Now I see it's different for the first preview. Okay.

A lot of new work has been done (https://github.com/dwp-forge/batchedit/commits) in the meantime by the original author (https://github.com/dwp-forge) and others, including some really helpful new features like a progress bar, so that one can estimate how much more time a replacement will take for large updates. And much cleaner solutions to the small changes that I made.

I think I should update to the new version soon.
Title: Re: [ATI.eu] Replacement, regex issues (Content styling)
Post by: Johann on August 17, 2018, 02:15:37 AM
Good to hear. So far, Atma is used to it and knows its capacity and ways well.
Title: Re: [ATI.eu] Replacement, regex issues (Content styling)
Post by: Moritz on September 05, 2018, 11:39:30 PM
I installed the new version of BatchEdit plugin.
There are some new useful options next to the search input:



There is also a cog wheel (Zahnrad) symbol in the top right corner next to these options, which brings up some additional options:

Also there is now a time limit on how long a search or replacement can take (can be changed in Admin settings). I have set this to 10 hours now. Should be enough usually.

Very helpful: there is now a progress bar for the search and replacement progress, which helps to estimate how much longer it will take (very light grey, difficult to distinguish from white).

Not tested much, hopefully not any new errors.


Edit: Just tested searching for "dhamma" with no limit of results; returns an empty result page. Probably too many results so that something gets broken.
Searching for "dhamma" with limit of 16000 results works, and takes a few minutes to complete.
Searching for "Johann" without limit works and gives 2076 results.
Title: Re: [ATI.eu] Replacement, regex issues (Content styling)
Post by: Johann on September 06, 2018, 02:43:58 AM
Sadhu!

The options "multiline", upper/lower case... replace the use of modifiers after the delimiter.
Usually one would put the search between delimiters as /{string}/x, where x defines things like case-insensitivity, matching across more than one line...
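
What the delimiter form carries can be sketched in Python: the text between the slashes is the pattern, and the letters after the last slash are modifiers. This is only an illustration with a made-up helper `compile_delimited`; the mapping covers just the modifiers i, m, s and x, and Python's regex flavour differs from PHP's in details.

```python
import re

# Modifier letters and their closest Python flags (assumed mapping).
FLAG_MAP = {
    "i": re.IGNORECASE,  # upper/lower case does not matter
    "m": re.MULTILINE,   # ^ and $ match at every line
    "s": re.DOTALL,      # dot also matches line breaks
    "x": re.VERBOSE,     # whitespace in the pattern is ignored
}


def compile_delimited(expr: str) -> "re.Pattern[str]":
    """Hypothetical helper: turn '/pattern/modifiers' into a compiled regex."""
    if not expr.startswith("/"):
        raise ValueError("expression must start with '/'")
    body, sep, modifiers = expr[1:].rpartition("/")
    if not sep:
        raise ValueError("closing '/' is missing")
    flags = 0
    for letter in modifiers:
        if letter not in FLAG_MAP:
            raise ValueError("unknown modifier: " + letter)
        flags |= FLAG_MAP[letter]
    return re.compile(body, flags)


print(compile_delimited("/dhamma/i").findall("Dhamma and dhamma"))
```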

So it is more user-friendly. Let's see if it can keep up with the previous "hack" with regard to 10,000 and more matches.

Seems to work fine, briefly tested.
Title: Re: [ATI.eu] Replacement, regex issues (Content styling)
Post by: Johann on September 17, 2018, 09:40:43 AM


1. Spaces at the beginning of lines and line breaks before tags, using find: \n[\s]+< and replace with \n\n. This is difficult, on the one hand because the replaced text would still give a match, and on the other because doing it folder by folder would take long and has its end at the root of the language namespace. (Just lib:thai: could be managed so far; the namespace thanissaro would require 300 MB+, and everything in one language namespace probably some 10 GB.) A possible way, if nothing else is found, is maybe a two-step approach: first replacing with some special character, and later replacing this with two line breaks. In this way the matches can be reduced slowly, step by step (about 20-50 h).

2. Replacing p-tags with two line breaks, using something like find <\/p>[\s]*<p> and replace with \n\n.

3. The many spaces and tabs between tags, without touching or destroying unformatted text pages (not thought about in detail, but it would be a mass problem as well).

4. Later on, things like em, i, b, br, u and s tags, although these matches can of course be reduced step by step.

5. Of course there will be other mass replacements that are harder to manage, but they can all be done with beggar-"tricks", effort and patience, as always.
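
Points 1 and 2 can be sketched in Python to see why the direct replacement keeps matching and how the two-step way avoids it. This is a sketch under assumptions: the sample text is invented, the replacement is written as \n\n< so that the tag bracket is kept, and the marker character \u0001 is an arbitrary choice that must not occur in the pages.

```python
import re

LOOSE = re.compile(r"\n\s+<")
text = "<p>one</p>\n   \t<p>two</p>\n\n\n  <div>x</div>"

# Direct way: the result still matches, because "\n\n<" is itself
# a line break followed by whitespace and "<".
once = LOOSE.sub("\n\n<", text)
still_matches = LOOSE.search(once) is not None

# Two-step way: first a marker character, then the marker becomes two line breaks.
MARK = "\u0001"
marked = LOOSE.sub(MARK + "<", text)
no_match_left = LOOSE.search(marked) is None
final = marked.replace(MARK, "\n\n")

# Point 2: adjacent paragraph tags become two line breaks.
paragraphs = re.sub(r"</p>\s*<p>", "\n\n", "<p>a</p>\n  <p>b</p>")

print(still_matches, no_match_left)
print(repr(final))
print(repr(paragraphs))
```

After the first step no match is left, so the number of remaining matches really goes down with each run.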

/me : switching back to the huge amount of PTS dictionary -> accessibility replacements for "dummies" and those not wishing to become scholars or x.y.z., ax4 language speakers, Brahmans, or dependent on them, before or rather than gaining awakening.
Title: Re: [ATI.eu] Replacement, regex issues (Content styling)
Post by: Johann on September 21, 2018, 01:46:31 PM
The result page no longer displays the number of matched pages and matches. Only the number of replacements is displayed, after execution.