Module talk:DecodeEncode: Difference between revisions

From Wikipedia, the free encyclopedia
Template-protected edit request on 21 March 2023: answered=no activate: the /sandbox checked ok then
 
Line 154: Line 154:
== Template-protected edit request on 21 March 2023 ==
== Template-protected edit request on 21 March 2023 ==


{{edit template-protected|Module:DecodeEncode|answered=no}}
{{edit template-protected|Module:DecodeEncode|answered=yes}}
Please replace all code [[:Module:DecodeEncode]] with [[:module:DecodeEncode/sandbox]]. ({{DiffPages|page1=Module:DecodeEncode|page2=Module:DecodeEncode/sandbox}})
Please replace all code [[:Module:DecodeEncode]] with [[:module:DecodeEncode/sandbox]]. ({{DiffPages|page1=Module:DecodeEncode|page2=Module:DecodeEncode/sandbox}})


Line 165: Line 165:
::::thx. As said, please someone with trust perform ER because me editing/commenting in between does not help. [[User:DePiep|DePiep]] ([[User talk:DePiep|talk]]) 08:18, 22 March 2023 (UTC)
::::thx. As said, please someone with trust perform ER because me editing/commenting in between does not help. [[User:DePiep|DePiep]] ([[User talk:DePiep|talk]]) 08:18, 22 March 2023 (UTC)
* Set {{para|answered|no}} after two positive critiques. Also, I met no error while developing with this sandbox. -[[User:DePiep|DePiep]] ([[User talk:DePiep|talk]]) 09:00, 22 March 2023 (UTC)
* Set {{para|answered|no}} after two positive critiques. Also, I met no error while developing with this sandbox. -[[User:DePiep|DePiep]] ([[User talk:DePiep|talk]]) 09:00, 22 March 2023 (UTC)
:{{done}}<!-- Template:ETp --> &mdash;&nbsp;Martin <small>([[User:MSGJ|MSGJ]]&nbsp;·&nbsp;[[User talk:MSGJ|talk]])</small> 18:35, 22 March 2023 (UTC)

Latest revision as of 18:35, 22 March 2023

Bug report: bad decoding of U+03B5 ε (epsilon)

About U+03B5 ε GREEK SMALL LETTER EPSILON (&epsi; &epsilon;)

  • Issue: after resolving HTML entity &epsilon; by mw.text.decode(), the plain character is not found by mw.ustring.gsub(). No issue with alternative HTML entity &epsi;. &epsi; good, &epsilon; bad.
Report limitations: The original report and bug reproduction are at enwiki Module talk:DecodeEncode, where en:module:DecodeEncode and en:module:String are used live. At Phabricator, pseudocode may be used and some "results" may be hardcoded. In running text the escape &amp; is used; it is not used inside function calls. Lua patterns are not used ("no %").
  • To reproduce:
1. Create research string:
X&epsi;1X&epsilon;2X (shows live and unedited as: Xε1Xε2X)
2. Decode the string with decode() (as inner function).
3. Then, on the decoded result, use gsub() to replace the plain character ε with E (as outer function):
mw.ustring.gsub( s=(mw.text.decode( s=X&epsi;1X&epsilon;2X, decodeNamedEntities=true ) ), pattern=ε, repl=E ) [is pseudo-code, see note. 21:10, 7 February 2023 (UTC)]
4. Result 3 (search-and-replace pattern uses the ε copied from Xε1X):
XE1XE2X
5. Result 4 (search-and-replace pattern uses the ε copied from Xε2X):
XE1XE2X
  • Expected: XE1XE2X (only one character ε exists)
{{#invoke:String|replace|source={{#invoke:DecodeEncode|decode|s=X&epsi;1X&epsilon;2X}}|pattern=ε|replace=E|plain=true}}
→ XE1XE2X
-DePiep (talk) 21:10, 7 February 2023 (UTC)
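
When reproducing a report like the one above, it can help to look at the raw bytes of the decoded string rather than at its rendering, because a browser shows an undecoded entity and the real character identically. The following is a minimal sketch in plain Lua; `hexBytes` is a hypothetical helper name, not part of Module:DecodeEncode or of the mw libraries. In a Scribunto console one would presumably pass it the return value of mw.text.decode(); here it only runs on literal strings.

```lua
-- Hypothetical debugging helper (an assumption, not existing module code).
-- Returns the bytes of s as space-separated uppercase hex pairs.
local function hexBytes(s)
    local out = {}
    for i = 1, #s do
        out[#out + 1] = string.format('%02X', s:byte(i))
    end
    return table.concat(out, ' ')
end

-- U+03B5 (ε) encoded as UTF-8 is the two bytes CE B5.
local epsilon = '\206\181'
print(hexBytes('X' .. epsilon .. '1'))  -- 58 CE B5 31

-- An entity that was left undecoded shows up as plain ASCII bytes instead.
print(hexBytes('&epsi;'))               -- 26 65 70 73 69 3B
```

If the dump of a decoded string still contains the bytes 26 ... 3B, the entity was never resolved, which would explain why a search for the plain ε finds nothing.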

Workaround A, ad hoc

Workaround A, ad hoc: add an innermost function that first replaces &epsilon; with &epsi; in the research string:

A1: {{#invoke:String|replace|source={{#invoke:DecodeEncode|decode|s={{#invoke:String|replace|source=X&epsi;1X&epsilon;2X|pattern=&epsilon;|replace=&epsi;|plain=true}}}}|pattern=ε|replace=E|plain=true}}
XE1XE2X

Workaround B, in module (THIN SPACE example)

Workaround B: early in :en:module:DecodeEncode, replace &epsilon; with &epsi;.

About THIN SPACE: it looks like character U+2009 THIN SPACE (&thinsp; &ThinSpace;) has a similar issue. &ThinSpace; good, &thinsp; bad.

Currently in code:

function p._decode( s, subset_only )
	local ret = nil;
	s = mw.ustring.gsub( s, '&thinsp;', '&ThinSpace;' ) -- Workaround for bug: &ThinSpace; gets properly decoded in decode, but &thinsp; doesn't.
	ret = mw.text.decode( s, not subset_only )
	return ret
end

In en:module:DecodeEncode/sandbox, I have coded a similar handling of EPSILON:

module:DecodeEncode, module:DecodeEncode/sandbox diff
function p._decode( s, subset_only )
	local ret = nil;
	-- U+2009 THIN SPACE: workaround for bug: HTML entity &thinsp; is decoded incorrect. Entity &ThinSpace; gets decoded properly
	s = mw.ustring.gsub( s, '&thinsp;', '&ThinSpace;' )
	-- U+03B5 ε GREEK SMALL LETTER EPSILON: workaround for bug (phab:T328840): HTML entity &epsilon; is decoded incorrect for gsub(). Entity &epsi; gets decoded properly
	s = mw.ustring.gsub( s, '&epsilon;', '&epsi;' )
	ret = mw.text.decode( s, not subset_only )
	return ret
end
  • /sandbox tests:
B. {{#invoke:String|replace|source={{#invoke:DecodeEncode/sandbox|decode|s=X&epsi;1X&epsilon;2X}}|pattern=ε|replace=E|plain=true}}
B1. Result B1 (search-and-replace pattern uses the ε copied from Xε1X): XE1XE2X
B2. Result B2 (search-and-replace pattern uses the ε copied from Xε2X): XE1XE2X

I propose to edit the module along this way.
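
If more entities turn out to need the same treatment, the per-entity gsub() calls could be collected into one lookup table. The sketch below is only an illustration of that idea in plain Lua: `ENTITY_ALIASES` and `normalizeEntities` are hypothetical names, and the table holds just the two pairs discussed on this page. It uses string.gsub rather than mw.ustring.gsub, on the assumption that entity names are pure ASCII.

```lua
-- Hypothetical alias table (an assumption, not existing module code):
-- entity that is not decoded properly -> equivalent entity that is.
local ENTITY_ALIASES = {
    ['&thinsp;']  = '&ThinSpace;',  -- U+2009 THIN SPACE
    ['&epsilon;'] = '&epsi;',       -- U+03B5 GREEK SMALL LETTER EPSILON (phab:T328840)
}

local function normalizeEntities(s)
    -- With a table as replacement, gsub looks up each match as a key and
    -- keeps the original match when the lookup yields nil.
    -- The pattern only matches entity names made of letters.
    return (s:gsub('&%a+;', ENTITY_ALIASES))
end

print(normalizeEntities('X&epsi;1X&epsilon;2X'))  -- X&epsi;1X&epsi;2X
```

In the module, the normalized string would then be handed to mw.text.decode() as in the sandbox code above. Entities that are not in the table, such as &amp;, pass through unchanged.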

Workaround C (mw, Lua)

Changes in mw, Lua: I have no idea.

testcases EPSILON

  • Original failure, now solved (no longer showing):
(hardcoded explanation here): in the cell marked Red XN, the result showed as "XE1Xε2X". That is: wikitext input "&epsilon;" was not recognised and replaced. -DePiep (talk) 07:49, 19 February 2023 (UTC)
EPSILON ε &epsi; error & fix proposal (16 Feb 2023)
Columns 5 and 6: replace(decode(..)) with E, pattern=hardcoded ⟨ε⟩ from plain; first value is for (s=&entity;), second value is for (s=checkstring).
1 id | 2 entity code | 3 plain | 4 mod:.. decode(&entity;) | 5 replace(decode(..)) with E | 6 mod:..decode/sandbox
checkstring | X&epsi;1X&epsilon;2X | >Xε1Xε2X< | >Xε1Xε2X< | |
EPSI | &epsi; | >ε< | >ε< | E, XE1XE2X | E, XE1XE2X
EPSILON | &epsilon; | >ε< | >ε< | E, XE1XE2X (Red XN) | E, XE1XE2X
The fix is similar to the one already in place for U+2009 THIN SPACE (&thinsp;, &ThinSpace;), though the underlying bug may be different for THIN SPACE.
  • Phabricator T328840 did not gain traction. A real fix would be at mw level, not in this module.
-DePiep (talk) 06:22, 16 February 2023 (UTC)

Template-protected edit request on 16 February 2023

Issue: bad decoding of HTML entity &epsilon; Red XN
re U+03B5 ε GREEK SMALL LETTER EPSILON (&epsi;, &epsilon;)
Change: fix by replacing it with entity &epsi; Green tickY before applying decode(). See § Workaround B for the code diff and background; also a minor comment change
Discussion: (1) reported at T328840, no responses (mw-level); (2) bug report here not challenged
Testcases: See § testcases EPSILON.
DePiep (talk) 06:49, 16 February 2023 (UTC)
 Done * Pppery * it has begun... 03:11, 19 February 2023 (UTC)

NBSP behaviour

Leaving this note here.

About NBSP, U+00A0 NO-BREAK SPACE (&nbsp;, &NonBreakingSpace;). With input &nbsp; I am experiencing problems reminiscent of § epsilon (T328840, now resolved).

When nested like (replace|s=(decode|s=AB&nbsp;YZ)|replace=AB_YZ), it returns breaking code (breaking when used in or with HTML/CSS code like span, sup, class).

I have no time to build the reproduction/test, so I have to leave it for now. Not reported on phab. DePiep (talk) 07:27, 20 February 2023 (UTC)

Template-protected edit request on 21 March 2023

Please replace all code Module:DecodeEncode with module:DecodeEncode/sandbox. (compare )

Change: apply require('strict'), and declare functions explicitly local. DePiep (talk) 14:34, 21 March 2023 (UTC)

Invitation is out. -DePiep (talk) 14:49, 21 March 2023 (UTC)
Upd: Gonnym has made large improvements, so the sandbox diff is large. I do not see strict-related changes. DePiep (talk) 21:31, 21 March 2023 (UTC)
The changes are good and no globals remain. The two mw.ustring calls could use string instead. Johnuniq (talk) 06:40, 22 March 2023 (UTC)
thx. As said, would someone with trust please perform the ER, because me editing/commenting in between does not help. DePiep (talk) 08:18, 22 March 2023 (UTC)
 Done — Martin (MSGJ · talk) 18:35, 22 March 2023 (UTC)[reply]
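
For readers unfamiliar with what the strict change in this request guards against, the following plain-Lua sketch approximates the idea: reading or assigning an undeclared name in a guarded table raises an error instead of silently creating a global. It is an emulation under stated assumptions, not Scribunto's actual strict library; `makeStrictEnv` is a hypothetical name, and the guard is applied to a separate table rather than to the real global environment.

```lua
-- Approximation of a strict mode (an assumption; Scribunto's own
-- 'strict' library is not reproduced here).
local function makeStrictEnv()
    local env = {}
    return setmetatable(env, {
        __index = function(_, name)
            error('read of undeclared global "' .. tostring(name) .. '"', 2)
        end,
        __newindex = function(_, name)
            error('assignment to undeclared global "' .. tostring(name) .. '"', 2)
        end,
    })
end

local env = makeStrictEnv()

-- An accidental assignment (for example a forgotten 'local') now fails.
local ok = pcall(function() env.ret = 'leaked' end)
print(ok)            -- false

-- rawset bypasses the guard and declares the name explicitly.
rawset(env, 'declared', 1)
print(env.declared)  -- 1
```

Declaring functions and variables with `local`, as the request describes, keeps a module from ever reaching such a guard in the first place.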