Good morning
I need to "purify" sentences to be able to use them in an app.
Thanks a lot for helping me build the right code.
I think the pattern is:
- 'any sentence' (0 or 1 times), followed by {text here} (0 to N times)
Examples:
AB CD {xxxxx}
or AB CD {xxxxx,yyyy}
or {xxxxx}
or AB CD
or AB CD {xxxxx} AB CD {xxxxx}
or AB CD {xxxxx} {xxxxx} {xxxxx}
etc
- the {text here} block looks like {digit|some text}.
For example:
{1|xxxxxxxxx}
- the 'some text' block may optionally contain 'default=xxx' anywhere in the text
e.g. {3|abc=d,default=my value} or {2|a b c=d,default=my value,another=valueThatIDontNeed} or {1|default=my value}
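To make the block structure described above concrete, here is a minimal Python sketch that splits one {digit|some text} block into its index and its optional default value. The helper name parse_block and the regular expressions are hypothetical. The sketch also assumes that a default value ends at the next ',' or '|' (the short examples above use commas, the real data further down uses '|') and that the block body contains no braces.

```python
import re

# Hypothetical pattern: '{', digits, '|', a body without braces, '}'.
BLOCK_RE = re.compile(r"\{(\d+)\|([^{}]*)\}")


def parse_block(block, separators="|,"):
    """Return (index, default) for one block; default is None when absent."""
    match = BLOCK_RE.fullmatch(block)
    if match is None:
        raise ValueError(f"not a placeholder block: {block!r}")
    index, body = int(match.group(1)), match.group(2)
    seps = re.escape(separators)
    # Assumption: 'default=' starts the body or follows a separator,
    # and its value runs up to the next separator or the end of the body.
    found = re.search(r"(?:^|[" + seps + r"])default=([^" + seps + r"]*)", body)
    return index, (found.group(1) if found else None)


print(parse_block("{3|abc=d,default=my value}"))
print(parse_block("{1|xxxxxxxxx}"))
```

If a real default value can itself contain a comma, passing separators="|" would be the safer choice.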
I need to isolate the following parts and return them as a single string:
- the 'any sentence' text (if it exists)
- the xxx value of the 'default=xxx' pattern, as explained above
This does not need to be done in one pass; I can script it in loops in Python, for example.
Here are a few examples:
Example 1
Store bulk masses greater than {0|message=<specify mass="" value="">|filter=^(_)?MASS_VALUE.+|add space after=false.+}{1|message=<specify mass="" unit="">|filter=^(_)?P413_MASS_UNIT.+} at temperatures not exceeding {2|message=<specify temperature="" value="">|filter=^(_)?TEMP_VALUE_.+|add space after=false.+}{3|message=<specify temperature="" unit="">|filter=^(_)?P413_TEMP_UNIT_.+}
Should give
Store bulk masses greater than at temperatures not exceeding
Example 2
Inhoud onder {0|message=<geschikt(e) vloeistof="" of="" gas="" specificeren="">|default=inert gas|filter=^(_)?P231_STORAGE_.+} gebruiken en bewaren. Tegen vocht beschermen.
Should give
Inhoud onder inert gas gebruiken en bewaren. Tegen vocht beschermen.
Example 3
EN CAS DE CONTACT AVEC LA PEAU: Laver abondamment{0|message=<préciser un="" produit="" de="" nettoyage="">|default=à l’eau|filter=^(_)?P352_WASH_.+}. Appeler immédiatement {1|message=<préciser qui="" pourra="" émettre="" comme="" il="" convient="" n="" avis="" médical="" en="" cas="" d’urgence="">|default=un CENTRE ANTIPOISON ou un médecin|filter=^(_)?P310_EMERGENCY_.+}.
Should give
EN CAS DE CONTACT AVEC LA PEAU: Laver abondamment à l’eau . Appeler immédiatement un CENTRE ANTIPOISON ou un médecin
Example 4
{0|message=<specificeren of="" dumpingvoorschriften="" van="" toepassing="" zijn="" op="" inhoud,="" container="">|default=Inhoud/verpakking|filter=^(_)?P501_REQUIREMENT_.+} afvoeren naar {1|message=<specificeer welke="" lokale="" regionale="" nationale="" internationale="" wetgeving="">|default=…|filter=^(_)?P501_DISPOSAL_.+}.
Should give
Inhoud/verpakking afvoeren naar … .
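One possible way to get the outputs shown in the examples above is a single re.sub pass with a replacement function, followed by whitespace cleanup. This is only a sketch under some assumptions: the name purify is hypothetical, fields inside a block are separated by '|' as in the real data, block bodies contain no braces, and each default is padded with spaces before runs of whitespace are collapsed (which is what gives the "à l’eau ." spacing of Example 3).

```python
import re

# Assumption: a block is '{digits|body}' and the body has no braces.
PLACEHOLDER_RE = re.compile(r"\{\d+\|([^{}]*)\}")
# Assumption: fields are '|'-separated, so a default ends at the next '|'.
DEFAULT_RE = re.compile(r"(?:^|\|)default=([^|]*)")


def purify(sentence):
    """Replace every block by its default value, or drop it if it has none."""

    def replace(match):
        found = DEFAULT_RE.search(match.group(1))
        # Pad with spaces so the default never fuses with neighbouring words.
        return " " + found.group(1) + " " if found else " "

    text = PLACEHOLDER_RE.sub(replace, sentence)
    # Collapse the whitespace runs left behind by removed blocks.
    return re.sub(r"\s+", " ", text).strip()


print(purify("Inhoud onder {0|message=x|default=inert gas|filter=y} gebruiken en bewaren."))
# prints: Inhoud onder inert gas gebruiken en bewaren.
```

With these assumptions the sketch gives the expected strings for Examples 1, 2 and 4. For Example 3 it keeps the final " ." that follows the last block, which the expected output above leaves out.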
Thanks!