Module:DecodeEncode/doc: Difference between revisions
High-use template and update /doc |
|||
Line 1: | Line 1: | ||
{{Module rating |general}} |
{{Module rating |general}} |
||
<!-- Please place categories where indicated at the bottom of this page and interwikis at Wikidata (see [[Wikipedia:Wikidata]]) --> |
<!-- Please place categories where indicated at the bottom of this page and interwikis at Wikidata (see [[Wikipedia:Wikidata]]) --> |
||
{{High-use}} |
|||
Implements Lua functions [[:mw:Extension:Scribunto/Lua_reference_manual#mw.text.decode|mw.text.decode]], [[:mw:Extension:Scribunto/Lua_reference_manual#mw.text.encode|mw.text.encode]] in a module. |
Implements Lua functions [[:mw:Extension:Scribunto/Lua_reference_manual#mw.text.decode|mw.text.decode]], [[:mw:Extension:Scribunto/Lua_reference_manual#mw.text.encode|mw.text.encode]] in a module. |
||
:<code><nowiki>{{#invoke:decodeEncode|decode|s=Source&nbsp;text&copy;}}</nowiki></code> → <code><nowiki>Source text©</nowiki></code> |
:<code><nowiki>{{#invoke:decodeEncode|decode|s=Source&nbsp;text&copy;}}</nowiki></code> → <code><nowiki>Source text©</nowiki></code> |
||
Line 73: | Line 75: | ||
* [[:mw:Extension:Scribunto/Lua_reference_manual#mw.text.decode|mw.text.decode]] |
* [[:mw:Extension:Scribunto/Lua_reference_manual#mw.text.decode|mw.text.decode]] |
||
* [[:mw:Extension:Scribunto/Lua_reference_manual#mw.text.encode|mw.text.encode]] |
* [[:mw:Extension:Scribunto/Lua_reference_manual#mw.text.encode|mw.text.encode]] |
||
* [[:Module:Urldecode]] |
* [[:Module:Urldecode]] |
||
<includeonly>{{sandbox other|| |
<includeonly>{{sandbox other|| |
||
<!-- Categories below this line, please; interwikis at Wikidata --> |
<!-- Categories below this line, please; interwikis at Wikidata --> |
Revision as of 08:38, 17 February 2023
This Lua module is used on approximately 134,000 pages. To avoid major disruption and server load, any changes should be tested in the module's /sandbox or /testcases subpages, or in your own module sandbox. The tested changes can be added to this page in a single edit. Consider discussing changes on the talk page before implementing them.
Implements Lua functions mw.text.decode, mw.text.encode in a module.
{{#invoke:decodeEncode|decode|s=Source&nbsp;text&copy;}} → Source text©
See List of XML and HTML character entity references.
Decode (&copy; → ©)
- Decodes named entities from the entity name into a regular (Unicode) character:
&copy; → ©
&gt; → >
All well-defined named entities are decoded (HTML named character references; formally, those defined in the PHP table).
- A regular, rendered sentence:
- "At 100 °F, & with a "burning" sun above, we ⁄walked⁄."
- In code (wikitext):
- "At 100 &deg;F, &amp; with a &quot;burning&quot; sun above, we &frasl;walked&frasl;."
- Processing:
{{#invoke:decodeEncode|decode|s=At 100 &deg;F, &amp; with a &quot;burning&quot; sun above, we &frasl;walked&frasl;.}}
→ At 100 °F, & with a "burning" sun above, we ⁄walked⁄.
-- In code: straight characters, no named entities.
- Renders, again:
- "At 100 °F, & with a "burning" sun above, we ⁄walked⁄."
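The decoding step can be pictured as a table lookup over `&name;` sequences. The following is a minimal plain-Lua sketch under stated assumptions: the table `ENTITIES` and the function `decodeNamed` are hypothetical names chosen for illustration, the table holds only a handful of entries, and numeric entities are not handled. It is not the module's or Scribunto's actual code.

```lua
-- Illustrative subset of a named-entity table (assumption: the real table
-- used by mw.text.decode is far larger).
local ENTITIES = {
    amp = "&", quot = '"', lt = "<", gt = ">",
    deg = "\194\176",        -- U+00B0 DEGREE SIGN as UTF-8 bytes
    copy = "\194\169",       -- U+00A9 COPYRIGHT SIGN
    frasl = "\226\129\132",  -- U+2044 FRACTION SLASH
}

-- Replace each "&name;" whose name is in the table; leave everything else alone.
local function decodeNamed(s)
    -- When the callback returns nil, gsub keeps the original match unchanged.
    return (s:gsub("&(%w+);", function(name) return ENTITIES[name] end))
end

local wikitext = 'At 100 &deg;F, &amp; with a &quot;burning&quot; sun above, we &frasl;walked&frasl;.'
print(decodeNamed(wikitext))
```

Unknown names such as `&foo;` pass through untouched in this sketch, because the lookup returns nil for them.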
Decode a reduced set only
By setting |subset_only=true, only these five entity names are decoded: '&lt;', '&gt;', '&amp;', '&quot;', '&nbsp;' (that is, into '<', '>', '&', '"', and a non-breaking space).
- Note: This differs from the corresponding Lua parameter. (It only concerns you if you also work directly with the Lua mw.text.decode function.) The Lua documentation defines the parameter decodeNamedEntities with this effect: when omitted or false, only the reduced set of entities is recognized and decoded. The sense of 'false' is inverted in |subset_only=: decodeNamedEntities=false is equivalent to |subset_only=true.
- Also, this module ignores the "omitted" logic: |subset_only= must be set explicitly to 'true' to take effect.
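The inversion between |subset_only= and the Lua argument decodeNamedEntities can be sketched as a one-line mapping. The helper name `decodeNamedEntitiesFlag` is hypothetical, and the assumption that only the exact string "true" counts is taken from the note above, not from the module's source.

```lua
-- Hypothetical helper, not taken from the module's source.
-- Maps the template parameter |subset_only= to the boolean that
-- mw.text.decode takes as its second argument (decodeNamedEntities).
local function decodeNamedEntitiesFlag(subsetOnly)
    -- Assumption: only the explicit string "true" selects the reduced set;
    -- nil, "" and any other value fall through to full decoding.
    local reducedSetOnly = (subsetOnly == "true")
    return not reducedSetOnly
end

print(decodeNamedEntitiesFlag("true"))  -- false: reduced set of five entities
print(decodeNamedEntitiesFlag(nil))     -- true: all named entities
print(decodeNamedEntitiesFlag(""))      -- true: an empty value is not "true"
```

Inside Scribunto, the result would presumably be passed on as `mw.text.decode(s, decodeNamedEntitiesFlag(args.subset_only))`, where `args` stands for the invocation's argument table.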
Encode (© → &#169;)
- The encode function replaces certain characters with their entity name (for example: & → &amp;).
Regular sentence:
- "At >100 °F, & with a "burning" sun above, we walked. ©"
In code:
- "At >100 °F, & with a "burning" sun above, we walked. ©"
Encode:
{{#invoke:decodeEncode|encode|s=At >100 °F, & with a "burning" sun above, we walked. ©|charset=&<>{{!}}°"'&©}}
- → At &gt;100 &#176;F, &amp; with a &quot;burning&quot; sun above, we walked. &#169;
- Renders as:
- "At >100 °F, & with a "burning" sun above, we walked. ©"
Character set to encode
Per the Lua documentation, only a small set of characters is encoded by default. The character set can be changed (expanded) with |charset=.
- Example: |charset=<>" \'& (the default), |charset=<>°"'&©{{!}}; characters not in the default set are replaced by their decimal entity: © → &#169; (a decimal number, not hexadecimal and not the named &copy;).
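The effect of |charset= can be imitated in plain Lua. This is a minimal sketch under stated assumptions: the names `NAMED`, `UTF8_CHAR`, `codepoint` and `encode` are hypothetical, the charset is treated as a plain list of characters (no ranges), and the input is assumed to be valid UTF-8. It follows the behaviour described above (named entity for a few characters, decimal entity otherwise) and is not Scribunto's implementation.

```lua
-- Characters that get a named entity; everything else gets a decimal entity.
local NAMED = {
    ["<"] = "&lt;", [">"] = "&gt;", ["&"] = "&amp;",
    ['"'] = "&quot;", ["\194\160"] = "&nbsp;",  -- \194\160 = non-breaking space
}

-- Matches one UTF-8 encoded character (lead byte plus continuation bytes).
local UTF8_CHAR = "[\1-\127\194-\244][\128-\191]*"

-- Code point of a single UTF-8 encoded character.
local function codepoint(ch)
    local b1, b2, b3, b4 = ch:byte(1, -1)
    if b1 < 0x80 then return b1 end
    if b1 < 0xE0 then return (b1 - 0xC0) * 0x40 + (b2 - 0x80) end
    if b1 < 0xF0 then
        return ((b1 - 0xE0) * 0x40 + (b2 - 0x80)) * 0x40 + (b3 - 0x80)
    end
    return (((b1 - 0xF0) * 0x40 + (b2 - 0x80)) * 0x40 + (b3 - 0x80)) * 0x40 + (b4 - 0x80)
end

-- Encode every character of s that is listed in charset.
local function encode(s, charset)
    local wanted = {}
    for ch in charset:gmatch(UTF8_CHAR) do wanted[ch] = true end
    return (s:gsub(UTF8_CHAR, function(ch)
        if not wanted[ch] then return nil end  -- nil keeps the character as-is
        return NAMED[ch] or string.format("&#%d;", codepoint(ch))
    end))
end

print(encode("10 > 5 © 2023", '<>&"'))   -- 10 &gt; 5 © 2023
print(encode("10 > 5 © 2023", '<>&"©'))  -- 10 &gt; 5 &#169; 2023
```

Adding © to the character set is what turns it into &#169;; with the smaller set it is left untouched.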
Known issues
- 13 Sep 2021: NOTE: The encode function with a user-supplied charset is now used in production in {{R/superscript}} and {{R/ref}}. Before implementing breaking changes here, these templates must be adjusted accordingly!
- 26 Sep 2021: U+2009 THIN SPACE ( ,  )
- Note: Possible bug: Decoding
 
works, but 
doesn't. - Resolved in code.
- 4 Feb 2023: U+03B5 ε GREEK SMALL LETTER EPSILON (ε, ε)
See also