Module:Buffer/doc
This is a documentation subpage for Module:Buffer. It may contain usage information, categories and other content that is not part of the original module page.
Do not "beautify" the source of this metamodule. Its unconventional syntax was developed using the scientific method for performance. For example, v2~=true and v2~=false, though longer than type(v2)~='boolean', is about 8 times faster per op. Though the type op is normally trivial, the way this is used could cause any inefficiency to be multiplied by over a billion.[note 1]
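The equivalence behind that example can be checked in plain Lua. The sketch below is not taken from the module source; `is_not_boolean_fast` and `is_not_boolean_type` are hypothetical names used only for illustration, and it makes no timing claim of its own.

```lua
-- Two ways to test "v is not a boolean"; both names are illustrative only.
local function is_not_boolean_fast(v)
  return v ~= true and v ~= false   -- two raw comparisons, no function call
end

local function is_not_boolean_type(v)
  return type(v) ~= 'boolean'       -- one call into the type() builtin
end

-- The two predicates agree for representative values of every common type.
local samples = { true, false, 0, '', 'text', {}, print }
for _, v in ipairs(samples) do
  assert(is_not_boolean_fast(v) == is_not_boolean_type(v))
end
assert(is_not_boolean_fast(nil) == is_not_boolean_type(nil))
```

Only the logical equivalence is shown here; the speed difference quoted above would have to be measured in the target environment.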
This was originally developed to optimize string concatenation as a helper method within Module:Asbox, but has been generalized for all modules.
The interface for Module:Buffer objects is similar to that of mw.html objects in that you may build complex strings with independent child nodes. In most cases, you may use Buffer objects like a normal string, including using the .. operator (though Buffer:_ has the same role and is potentially over 10 times faster than ..). See also: #Calling string, mw.ustring, and mw.text libraries
Additionally, there are four specialized forms, described further in their respective sections: Buffer:stream, Buffer-HTML, Element-Buffer and Buffer-variable.
Last but not least, this module has an ordered __pairs more thorough than ipairs and pairs. (Even reads nil keys!) The logical uniqueness of this iterator may be reason enough to assimilate Module:Buffer.
Basic usage
require'Module:Buffer'
require'Module:Buffer'( _G, name, save, ... )
Creates a new Module:Buffer object when it (the module) is called as a function; i.e., there is no 'main'.
Pass the module (or Buffer:_in) your global variable _G to enable global functions. If passed _G, then the next two varargs will pass to Buffer:_G and any extra will pass to Buffer:_. If initialized without _G, then all varargs will pass to Buffer:_( ... ).
You may also use most Buffer object functions directly on the module, i.e., require'Module:Buffer':function{...} is equivalent to require'Module:Buffer'():function{...}.[note 2]
Buffer
See also Buffer:_str for advanced string conversion.
Get Buffer as type string by performing a function call on the Buffer object (as opposed to a call on the Module).
Calling a Buffer is basically shorthand for table.concat( Buffer, ... ), or, with no args, tostring( Buffer ).
However, if your Buffer contains raw objects or out-of-sequence values, then the returned string would be the result of empty-buffer:_all( Buffer )( ... ) instead.[note 3]
Unconventionally, any i or j of type string would be treated as relative to length; that is, Buffer( '-', -1, '-3' ) is equivalent to Buffer( '-', -1, #Buffer - 3 ) (obviating the need to set a local Buffer to use the length operator).
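The relative-position rule can be pictured with a small plain-Lua helper. This is a sketch under assumptions, not the module's code: `resolve_pos` is a hypothetical name, and it only illustrates how a string such as '-3' can be read as an offset from the length.

```lua
-- Hypothetical helper: a string position is an offset from the length.
local function resolve_pos(pos, len)
  if type(pos) == 'string' then
    return len + tonumber(pos)      -- '-3' with len 5 gives 2
  end
  return pos                        -- numbers are used unchanged
end

local parts = { 'a', 'b', 'c', 'd', 'e' }
-- Join items 1 through #parts - 3 without spelling out the length arithmetic.
local joined = table.concat(parts, '-', 1, resolve_pos('-3', #parts))
print(joined)  -- a-b
```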
Note you do not need to string a Buffer object to append it to an mw.html object via mw.html:node (though not mw.html:wikitext because of type checking).
Buffer.last_concat
When strung without a separator, the result is cached and may be retrieved via Buffer.last_concat. Future tostring operations on the Buffer will return Buffer.last_concat until it is modified.
You may purge the cache by setting this key to nil, by appending a valid value and/or removing a value (e.g. Buffer:_(0):_nil()), as well as by passing nothing to Buffer:_cc().
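The caching behaviour described for Buffer.last_concat follows a common memoize-and-invalidate pattern. A minimal plain-Lua sketch is given below; `new_cached_list`, `append` and `concat` are hypothetical names, and the module's own bookkeeping may differ.

```lua
-- Minimal cache sketch; all function names here are illustrative only.
local function new_cached_list()
  return { items = {}, last_concat = nil }
end

local function append(list, value)
  list.items[#list.items + 1] = value
  list.last_concat = nil            -- any modification purges the cache
  return list
end

local function concat(list)
  if list.last_concat == nil then
    list.last_concat = table.concat(list.items)  -- cached only without a separator
  end
  return list.last_concat
end

local list = append(append(new_cached_list(), 'foo'), 'bar')
local first = concat(list)          -- computed and cached
local second = concat(list)         -- served from the cache
append(list, '!')                   -- cache purged here
```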
Buffer:_
See also Buffer:stream for a faster, simpler version of this op.
Appends a value to the Buffer. In rough terms, Buffer:_'string1':_'string2' is the same as Buffer = Buffer..'string1'..'string2'.
(It may help to imagine :_ as a .. that has stood up and is now casting a shadow.)
If passed an invalid value, listed below, this is a no-op:
- boolean
- nil
- empty string
- table without a __tostring metamethod and for which table[1] is nil or false.
A table with no __tostring will pass through table.concat before insertion. An error may be thrown if the table would cause table.concat to error. (Use Buffer:_all instead for such tables.) For all other values, the result of tostring( value ) would be inserted so long as it is not an empty string.
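The validity rules above can be restated as a short plain-Lua function. This is a sketch under assumptions rather than the module's implementation: `to_insertable` and `append` are hypothetical names, and the treatment of a table whose concatenation is empty follows the `Buffer:_{''}` tip given later in this page.

```lua
-- Hypothetical re-statement of the validity rules; not the module's code.
local function to_insertable(value)
  if value == nil or value == true or value == false or value == '' then
    return nil                                   -- invalid: no-op
  end
  if type(value) == 'table' then
    local mt = getmetatable(value)
    if not (mt and mt.__tostring) then
      if not value[1] then return nil end        -- invalid: [1] is nil or false
      return table.concat(value)                 -- may error; kept even if ''
    end
  end
  local str = tostring(value)
  if str == '' then return nil end               -- empty results are dropped
  return str
end

local function append(list, value)
  local str = to_insertable(value)
  if str then list[#list + 1] = str end
  return list
end
```

For instance, `append(list, {'b', 'c'})` would store 'bc' as a single item, while `append(list, true)` would leave the list untouched.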
When passed pos of type number, the argument is identical to pos for table.insert( table, pos, value ). In fact, assuming a valid value, Buffer:_( 'string', 1 ) is exactly the same as table.insert( Buffer, 1, 'string' ).
Just like with the position arguments of Buffer(), any pos of type string would be treated as relative to length.
Set raw to true to force-append a value without tostring coercion, including invalid values.[note 3]
If given only two (non-self) arguments with the second being a boolean, then the second is read as raw instead.
Buffer:_nil
Buffer:_nil( pos, replacement )
Removes the value buffered at pos. As with Buffer:_, a string pos is treated as #Buffer + pos.
If replacement is provided, then it will replace the value at the pos index, unless replacement is a boolean, in which case this is a no-op. When replacement is nil, the op is simply table.remove( Buffer, pos ) with string pos relative to length.
Note, however, that there is no further type checking on replacement, so, if it is neither nil nor boolean, the Buffer will be set to raw mode.
Pos cannot be omitted if replacement is passed, though a pos that is nil will be treated as #Buffer.
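The branching described for Buffer:_nil can be sketched in plain Lua as follows. `remove_or_replace` is a hypothetical stand-in that works on an ordinary table; it does not model raw mode or caching.

```lua
-- Hypothetical stand-in for the removal rules; not the module's code.
local function remove_or_replace(list, pos, replacement)
  if type(pos) == 'string' then
    pos = #list + tonumber(pos)       -- string pos is relative to length
  elseif pos == nil then
    pos = #list                       -- nil pos means the last item
  end
  if replacement == nil then
    table.remove(list, pos)           -- plain removal
  elseif replacement ~= true and replacement ~= false then
    list[pos] = replacement           -- no further type checking
  end                                 -- boolean replacement: no-op
  return list
end

local letters = { 'a', 'b', 'c', 'd' }
remove_or_replace(letters)               -- removes 'd'
remove_or_replace(letters, '-1', 'X')    -- replaces item #letters - 1
remove_or_replace(letters, 1, false)     -- no-op
print(table.concat(letters, ','))  -- a,X,c
```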
Buffer:_all
Buffer:_all( { ..., value = pos, functionName = args, ... }, nanKeys )
Takes a table value, iterates through all number keys in order, appending each valid value to the end of the Buffer. In contrast to ipairs, this starts at the most negative key (down to -inf), continues through any nil keys until it reaches the most positive index, and includes non-integer number keys.
(Note: despite Module:Buffer.__pairs having a more thorough iteration than ipairs, the difference in their runtimes is almost statistically insignificant. Details at #Performance and #Using the iterator outside of buffer.)
A table value that has no metatable will have its contents iterated by this function before moving on to the next value. All other data types are processed by Buffer:_.
By default, this ignores non-number keys unless nanKeys evaluates true. If so, non-number keys are processed after number keys. Keep in mind such keys are iterated in no particular order, though an order may be imposed by wrapping each pair in a table indexed at a number key.
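To visualize the iteration order described above, the sketch below visits every number key of a table in ascending order, including negative and non-integer keys and keys beyond a nil gap. It uses a swapped-in technique (collecting and sorting keys) rather than the module's own iterator, and `ordered_number_pairs` is a hypothetical name.

```lua
-- Swapped-in technique: collect and sort the number keys (not the module's iterator).
local function ordered_number_pairs(t)
  local keys = {}
  for k in pairs(t) do
    if type(k) == 'number' then keys[#keys + 1] = k end
  end
  table.sort(keys)
  local i = 0
  return function()
    i = i + 1
    local k = keys[i]
    if k ~= nil then return k, t[k] end
  end
end

local sample = { 'one', 'two', nil, 'four',
  [-2] = 'minus two', [2.5] = 'two and a half', name = 'skipped' }
local seen = {}
for _, v in ordered_number_pairs(sample) do seen[#seen + 1] = v end
print(table.concat(seen, ', '))
-- minus two, one, two, two and a half, four
```

A plain ipairs loop over the same table would stop after 'two', since index 3 is nil.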
If given a value = pos pair, defined as a number or number string indexed at a non-number key, then they will be passed as the value and pos arguments for Buffer:_.
Thus,
Buffer:_all({1,2,3,'... done',[3.5]=variable and 4 or {four='1',zero=1}},true)
is the same as:
Buffer:_(1):_(2):_(3)
if variable then
  Buffer:_(4)
else
  Buffer:_'four':_('zero',1)
end
Buffer:_'... done'
If a non-number key points to a value that cannot be coerced into a number, then the pair may be treated as functionName = args when functionName matches a Buffer object function and args is not a boolean. If args is such that value[1] evaluates true, then this will pass the return of unpack( value, 1, table.maxn(value) ) to the named function; otherwise, the value is passed as is.[note 4]
For example:
p(_G,'arg', true):_all({'arg',arg==true and {'==true: ' ,_in={_G, 't', nil, ' awesome'}}}, true):_(t and {t(), t..'r', t..'st'})
produces: 'arg==true: awesome awesomer awesomest'
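The functionName = args rule (spread the table when its first item is set, otherwise pass it whole) can be sketched in plain Lua. Everything named here is hypothetical: `dispatch`, `maxn` and the toy `recorder` object exist only for illustration, and `maxn` stands in for table.maxn, which is not available in every Lua version.

```lua
local unpack = table.unpack or unpack   -- Lua 5.2+ or Lua 5.1

-- Largest positive number key, standing in for table.maxn.
local function maxn(t)
  local n = 0
  for k in pairs(t) do
    if type(k) == 'number' and k > n then n = k end
  end
  return n
end

-- Hypothetical dispatcher: call object[name] with args spread or passed whole.
local function dispatch(object, name, args)
  local method = object[name]
  if type(method) ~= 'function' or args == true or args == false then
    return nil                          -- not a functionName = args pair
  end
  if type(args) == 'table' and args[1] then
    return method(object, unpack(args, 1, maxn(args)))
  end
  return method(object, args)           -- strings and tables without [1] pass as is
end

-- Toy object used only to observe how many arguments arrive.
local recorder = {}
function recorder:count(...) return select('#', ...) end

print(dispatch(recorder, 'count', { 'a', nil, 'c' }))  -- 3
print(dispatch(recorder, 'count', 'text'))             -- 1
```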
Buffer:_in
Buffer:_in( _G, name, save, ... )
Passes any arguments to Module:Buffer to create a new Buffer object, then sets an external reference to the parent Buffer and returns the child.[note 5]
This does not append the child to the parent. (See Buffer:_out.)
Also, be aware that Buffer parent references are weak. Thus, if you were to (re-)set a local variable that is currently set to the parent, such could trigger immediate garbage collection on the parent.
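The consequence of a weak parent reference can be demonstrated in plain Lua. The sketch assumes a weak-valued lookup table named `parent_of`; how the module actually stores the link is not shown on this page, so treat the structure and all names as hypothetical.

```lua
-- Parent links live in a weak-valued table, so the link alone keeps no parent alive.
local parent_of = setmetatable({}, { __mode = 'v' })

local function new_child(parent)
  local child = {}
  parent_of[child] = parent
  return child
end

local function make_child()
  local parent = { 'parent text' }     -- the only strong reference lives in this scope
  local child = new_child(parent)
  return child, parent_of[child] == parent
end

local child, linked_before = make_child()  -- linked while make_child was running
collectgarbage()                           -- force a full collection cycle
local linked_after = parent_of[child]      -- nil once the parent has been collected
```

Keeping the parent in a variable that stays in scope for as long as the child needs it avoids this situation.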
Buffer:_out
Buffer:_out( outs, sep–list, { default–sep, ..., [out] = sep } )
Joins Buffer with sep and appends the result to the parent Buffer. Returns the parent. If no parent is found, this is a no-op and returns the same Buffer.
If given more than one (non-self) argument, then the first is read as outs, the number of :_out() operations to perform.[note 6] Each additional argument in sep-list is applied as sep for that :_out operation. That is, the first sep applies to the current Buffer, the second to its parent Buffer, the third to its grandparent, and so on.
If the last vararg is a table, then table[1] will be applied as the default separator for all nil varargs in sep-list. This table may directly follow outs (i.e. sep-list may be omitted). If it contains other keys, then the sep at key N would apply for the Nth :_out() instead of default-sep. Thus, these two snippets are synonymous:
Buffer:_out( 4, nil, nil, nil, ' and ', {', '} )
and
Buffer:_out( 4, {', ', [4] = ' and '} )
Include false in sep-list to indicate no separator applies for that out.[note 7]
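The way the separator arguments combine can be sketched as a plain-Lua function that expands them into one separator per out. `resolve_seps` is a hypothetical helper written from the description above, not the module's code.

```lua
-- Hypothetical helper: expand `outs, sep-list, { default-sep, [out] = sep }`
-- into a list holding one separator per out.
local function resolve_seps(outs, ...)
  local n = select('#', ...)
  local args = { ... }
  local options = {}
  if n > 0 and type(args[n]) == 'table' then
    options = args[n]                 -- trailing table holds default and overrides
    n = n - 1
  end
  local seps = {}
  for k = 1, outs do
    local sep = nil
    if k <= n then sep = args[k] end
    if sep == nil then sep = options[k] end   -- override for the kth out
    if sep == nil then sep = options[1] end   -- default separator
    seps[k] = sep or ''               -- false (or no default) means no separator
  end
  return seps
end

local a = resolve_seps(4, nil, nil, nil, ' and ', { ', ' })
local b = resolve_seps(4, { ', ', [4] = ' and ' })
print(table.concat(a, '|') == table.concat(b, '|'))  -- true
```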
Buffer:_str
Buffer:_str( generations, sep–list, { default–sep, ..., [gen] = sep } )
Joins a Buffer with sep and returns the string. Varargs are handled by the same function as Buffer:_out. If any are provided, this will create a new temporary Buffer and backtrack the number of generations specified, inserting each ancestor in front of its descendants in the temporary Buffer. The sep indexed at generations + 1 will be used as the joiner for the temporary Buffer (unless the first ancestor is reached before the specified number of generations, in which case it is the index following that of the original generation).
Unlike :_out, this does not append the child into the parent. As such, even with the same arguments, it may return a different result than would be obtained from stringing the return of :_out since each parent's sep is not used to join parent and child. Furthermore, the number of generations counted includes the current Buffer, whereas the number of "outs" in Buffer:_out does not.
Buffer:_cc
Buffer:_cc( clear, copy, meta )
Nils all keys of the table referenced by clear and unsets its metatable. If clear evaluates false, this simply purges the cache at Buffer.last_concat.
If given a table to copy, this will duplicate all key-value pairs of copy into clear, cloning any table value recursively via Buffer:_cc(0, value). This returns the Buffer unless passed the number 0 as clear, which causes this to create a new table and return that instead. Passing true as copy is equivalent to passing the Buffer itself. If copy is not a table, then it will be set as the first item in clear as long as it is not false.
While this may resemble mw.clone, there are several differences, namely that this:
- Gives clear the same metatable as copy (or sets meta, if given) as opposed to a "clone" of the metatable.
- Conserves the length attribute (though empty strings may replace some nil keys[note 8])
- Rawsets values and iterates without invoking any __pairs metamethod.
- Includes Buffer parent and raw attributes (stored externally)
To obtain the table of key-value pairs left as empty strings in the previous copy op, simply call this again with any same value passed to both clear and copy (as long as they do not evaluate false).
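Some of the listed differences from mw.clone (a shared metatable, raw access, recursive cloning of nested tables) can be sketched in plain Lua. `copy_into` is a hypothetical routine; it does not reproduce the length-preserving trick or the externally stored attributes, and it assumes the source has no reference cycles.

```lua
-- Hypothetical copy routine: raw iteration, recursive table cloning, shared metatable.
local function copy_into(target, source, meta)
  for k in next, target do rawset(target, k, nil) end     -- clear existing keys
  for k, v in next, source do                             -- `next` bypasses __pairs
    if type(v) == 'table' then v = copy_into({}, v) end   -- clone nested tables
    rawset(target, k, v)
  end
  -- The metatable is shared with the source (or replaced by `meta`), not cloned.
  return setmetatable(target, meta or getmetatable(source))
end

local mt = { __index = function() return 'fallback' end }
local source = setmetatable({ 'a', nested = { 'b' } }, mt)
local target = copy_into({ stale = true }, source)
print(target[1], target.nested[1], getmetatable(target) == mt)  -- a  b  true
```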
Buffer:_parent
Buffer:_parent( outs, sep–list, { default–sep, [out] = sep, ...} )
- To skip generations without breaking the Buffer chain, see #global functions.
Similar to Buffer:_out except, instead of appending the Buffer to its parent, this calls Buffer:_str on the parent and appends the ancestor(s).
The parent is unaffected by this operation and may still be retrieved via Buffer:_out or re-appended with this function.
Buffer:getParent
- Note that there is no 'getChild' method[note 5]
Returns parent Buffer, or, if none exists, sets a newly created Buffer as the 'parent' and returns the adopted parent.
In accordance with the "waste no ()" philosophy of this Module, arguments are passed to the parent. If passed only one value, this is equivalent to Buffer:getParent():_( value ).
If additional varargs are given, functionName must be a string naming a Buffer object function (or #library) to be called on the parent using the varargs.
Buffer:killParent
Unsets the parent reference, allowing garbage collection unless there are non-weak references to the parent.
If passed any args, they will be passed to the current parent via Buffer:getParent as a "parting gift". In either case, returns the current Buffer.
Stream mode
Buffer:stream
Switches the Buffer to stream mode, in which the __call metamethod, instead of returning a string, acts as a streamlined version of Buffer:_. Though it uses the same helper method as :_ to validate values, the stream call op performs 50 percent faster.
Note that any args given are passed to Stream-Buffer:each rather than to __call, for a reason that should be evident after you read that section.
Stream-Buffer
When streaming, you can append string (and table) literals with nothing between them (or only ASCII space chars if desired). For example, both A and B will produce identical strings:
local A = require'Module:Buffer':stream'A string of text may flow''with nothing between each string' 'or perhaps only a space'
'or even tab and line-break characters''and continue to append individually''for use with a joiner'
local B = require'Module:Buffer':_'A string of text may flow':_'with nothing between each string' :_ 'or perhaps only a space'
:_'or even tab and line-break characters':_'and continue to append individually':_'for use with a joiner'
mw.log(A==B, A:_str' ')
true A string of text may flow with nothing between each string or perhaps only a space or even tab and line-break characters and continue to append individually for use with a joiner
Keep in mind that Lua numbers[note 9] and named variables are too shy to skinny dip in a Buffer stream and must wear parentheses () as with any function call.
No special action is needed to exit this mode. The normal call-to-string op is restored upon the use of any regular Buffer function or any operation which coerces the Buffer into a string.[note 10]
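The chained-literal syntax of stream mode relies on two Lua features: a call may take a bare string literal as its argument, and a __call metamethod may return the object itself. The toy object below illustrates only that mechanism; `Stream` and `new_stream` are hypothetical names and none of the module's validation is reproduced.

```lua
-- Toy stream object: calling it appends the argument and returns the object itself.
local Stream = {}
Stream.__index = Stream
Stream.__call = function(self, value)
  if type(value) == 'string' and value ~= '' then
    self[#self + 1] = value
  end
  return self                      -- returning self is what lets the calls chain
end

local function new_stream()
  return setmetatable({}, Stream)
end

-- String literals may follow one another directly or after whitespace.
local s = new_stream() 'first' 'second'
  'third'
print(table.concat(s, ' '))  -- first second third
```

A number or a named variable in the same position needs parentheses, e.g. `s(42)`, because only string literals and table constructors may be used as a bare call argument.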
Stream-Buffer:each
Stream-Buffer:each( ... )
Appends an undetermined number of valid values.
While analogous to mw.html:wikitext, one distinguishing point (other than being twice as fast and able to handle tables and booleans) is that this does not stop at the first nil value. In short, something like :wikitext('string1', varName, 'string2') can be replaced with :each('string1', {varName, 'string2'}) when varName is either a string or nil.
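Not stopping at the first nil is something a variadic Lua function can do by counting its arguments with select('#', ...). The sketch below shows that technique on an ordinary table; `each` here is a hypothetical stand-alone function that only handles strings, not the module's method.

```lua
-- Hypothetical variadic append that keeps going past nil arguments.
local function each(list, ...)
  for i = 1, select('#', ...) do     -- select('#', ...) counts nils too
    local value = (select(i, ...))   -- parentheses keep only the ith argument
    if type(value) == 'string' and value ~= '' then
      list[#list + 1] = value
    end
  end
  return list
end

local out = each({}, 'a', nil, 'b')
print(table.concat(out))  -- ab
```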
HTML extension
Buffer:_inHTML
Buffer:_inHTML( tagName, args )
Creates and returns a modified mw.html object. Accepts the same parameters as mw.html.create.
Modifications are summarized below:
- The .. operator may be used on Buffer-mw.html objects directly (no tostring needed).
- If initialized, will store tags and wikitext in an Element-Buffer, with which you may use Module:Buffer object functions to append (and remove, etc.) values.
- Element-Buffer objects may use Element-Buffer:_add, which greatly reduces the code size needed to build an equivalent mw.html object.
Unlike mw.html.create, if args has keys other than args.parent and args.selfClosing, it will pass through Element-Buffer:_add for further processing. Moreover, if passed a table where mw.html.create expects tagName, this treats it as args instead.
Most mw.html functions are unchanged, except that :tag, :done, and :allDone are embedded in a wrapper function that checks whether they return a normal mw.html object. If so, the wrapper converts it to a Buffer-HTML object and sets a parent reference.[note 11]
Note that other functions in section #HTML extension are only available after Buffer:_inHTML is used for the first time.
Buffer-HTML
Buffer-HTML objects may be used like any mw.html object. (In fact, if the only change were to substitute mw.html.create with require'Module:Buffer':_inHTML in an existing Module, its output should remain the same.)
Call the object as a function to return its Element-Buffer, which is the table found at mw.html-object.nodes converted into a Module:Buffer object.
Strings are passed to the Element-Buffer via Buffer:_, which basically has the same effect as though :wikitext were between Buffer-HTML and 'string', the only difference being the object returned. Tables are passed to Element-Buffer:_add.
Most Buffer object functions are either unavailable for use directly on the Buffer-HTML object. Those listed below have been modified so that
Element-Buffer
Element-Buffers have the same metatable as normal Buffer objects, so calling one will string it in the same manner.
The string returned is analogous to that returned by the JavaScript DOM property "innerHTML". In other words, when strung, it is the contents of the Buffer-HTML object without the "outerHTML" or tag (though it will include the outer Buffer-HTML when appended via mw.html:node).
You may use most Buffer object functions normally; however, those which have a Buffer-HTML version (such as Buffer-HTML:_out) will instead behave as though used on the outer HTML object.[note 12] Also, pre-Element:_inHTML has been modified as described in that section.
Additionally, you may chain any mw.html object function directly on an Element-Buffer. With the exception of Element-Buffer:tag and Element-Buffer:done, the mw.html function has been placed in a wrapper function that merely redirects the self-action to the outside Buffer-HTML.[note 13]
Concatenate an Element-Buffer to another value with the .. operator to return the result inside the tag, such that:
local Buff = require'Module:Buffer':_inHTML'div'{'Section ',color='red'}
return {Buff..1,Buff..2,Buff..3}
can be a rapid way of generating:
local section = {}
for k = 1, 3 do
table.insert(section, tostring(mw.html.create'div':css{color='red'}:wikitext('Section ', k)))
end
return section
Element-Buffer:done
When called without arguments, this behaves just like mw.html:done as called on the outer HTML object. However, it has been modified to accept dones, the number of :done() operations to perform. Thus, Element-Buffer:done(4) is equivalent to Buffer-HTML:done():done():done():done().
Pass zero (0) as dones to return to the Element-Buffer's direct HTML container. (Using an mw.html function to no-op is another way to return to the Buffer-HTML object, e.g. Element-Buffer:node().)
ipairs with HTML-Buffer
- See also #Using _pairs outside of buffer for more details about Module:Buffer's custom iterator.
BufferHTML = p:_inHTML'td'{1,2,nil, '', true, 3,4,tag='br'}:done(0)
mw.log(BufferHTML)
for k, v in ipairs(BufferHTML) do mw.log(k,v) end
for k, v in ipairs(BufferHTML) do if v=='3' then BufferHTML():_nil(k) end end
mw.log(BufferHTML)
<td>1234<br /></td>
1 1
2 2
3 3
4 4
5 <br />
<td>124<br /></td>
Global functions
Buffer-variable objects
Buffer:_var
Buffer:_var( initial–value, name )
Appends a Buffer-variable object. This also disables future caching at Buffer.last_concat for all Buffer objects in your module (and any module which may require it).
If passed a number as initial-value, the number will increase by one each time the Buffer-variable is strung. Similarly, if it is a string, then it will reappear as the next ASCII character each time. When passed a table, the first item will be used, then the second, and so on. This loops back to the first after reaching the last item.
You may also pass your own custom function, though how to code one is beyond the scope of this manual.
For example:
require'Module:Buffer'
:_inHTML'div'
{{tag='br'},'Heading ',color='blue',['text-decoration']='underline'}
:_var'A':_' - ':_var{'odd', 'even'}
:_out()
:_in(_G, 'bod'):_'Body ':_var(1):_'\n':_out():_html():_(bod):_html():_(bod)
--[[ Produces:
<div style="color:blue;text-decoration:underline"><br />Heading A - odd</div>Body 1
<div style="color:blue;text-decoration:underline"><br />Heading B - even</div>Body 2
<div style="color:blue;text-decoration:underline"><br />Heading C - odd</div>Body 3 --]]
Pass a boolean to re-append the last Buffer-variable object created. Anything that is not false or nil will append the object #raw. False appends the previous value as a string without incrementing the object. If passed nil, this is a no-op.
If given name, then, if global functions are enabled, the Buffer-variable will be saved via Buffer:_G.
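The idea of a value that changes each time it is strung can be illustrated in plain Lua with the __tostring metamethod. This is only a sketch of the general mechanism: `counter` and `cycle` are hypothetical constructors, not part of Module:Buffer, and they cover just the number and table cases described above.

```lua
-- Hypothetical self-advancing values built on the __tostring metamethod.
local function counter(start)
  local n = start - 1
  return setmetatable({}, { __tostring = function()
    n = n + 1                        -- advance on every string conversion
    return tostring(n)
  end })
end

local function cycle(items)
  local i = 0
  return setmetatable({}, { __tostring = function()
    i = i % #items + 1               -- wraps back to the first item
    return tostring(items[i])
  end })
end

local num, parity = counter(1), cycle{ 'odd', 'even' }
local lines = {}
for _ = 1, 3 do
  lines[#lines + 1] = 'Body ' .. tostring(num) .. ' - ' .. tostring(parity)
end
print(table.concat(lines, '; '))  -- Body 1 - odd; Body 2 - even; Body 3 - odd
```

Because the result differs on every conversion, caching a previous string would give stale output, which is in line with the note above that Buffer-variables disable caching.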
Modified .. operator
Buffer .. value
Buffer-HTML .. value
This is akin to new-buffer:_all{ Buffer, value } or tostring( Buffer ) .. value. HTML objects created by a Buffer may also be concatenated in this manner.
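In plain Lua, this kind of operator support comes from the __concat metamethod, which is consulted when either operand of .. is not a string or number. The toy type below sketches the tostring( Buffer ) .. value reading of the rule; `List` is a hypothetical name and the sketch returns a string rather than a new Buffer.

```lua
-- Toy list type whose .. operator joins its contents with the other operand.
local List = {}
List.__index = List
List.__tostring = function(self) return table.concat(self) end
List.__concat = function(a, b)
  return tostring(a) .. tostring(b)  -- either side may be the List
end

local list = setmetatable({ 'Section ' }, List)
local left = list .. 1               -- List on the left
local right = 'Intro: ' .. list      -- List on the right
print(left)   -- Section 1
print(right)  -- Intro: Section 
```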
Buffer-HTML .. value
value .. Element-Buffer
require'Module:Buffer'.__pairs
Calling string, mw.ustring, and mw.text libraries
Tips and style recommendations
- If joining Buffer with a string immediately after :_'text', place a space between 'string' and the separator and use contrasting double/single quote marks to set them apart (i.e. :_'text' " " instead of :_'text'' ' or :_'text'(' '))
- Saving Module:Buffer locally, e.g. local Buffer = require'Module:Buffer', though fine, is often unnecessary since all Buffer objects can create new buffers via
For Buffer:_
- Treat :_ as though it were a .. op. Wrapping strings with unnecessary () is akin to ( 'string1' ) .. ( 'string2' ) .. ( 'string3' ).
- Most uses of raw can be avoided through careful planning with the pos argument. That said, the performance decrease from raw is unlikely to be significant for modules transcluded on fewer than 100,000 pages. In short, the reduction in server load from avoiding raw may not be worth it if it makes the code harder to maintain.
- To insert an empty string as a placeholder for a separator without setting raw, pass a table containing only an empty string, like so: Buffer:_{''}
For Buffer:_all
- Appending values in multiple locations is one of the primary reasons why the nanKeys argument exists. While passing a boolean directly will cause an error, you can do something like...
- this: Buffer:_all({condition and {_nil={'0', 'replacement'},Front=1,getParent='from child'}}, true)
- versus: Buffer:_nil('0', condition and 'replacement' or false):_(condition and 'Front', 1):getParent(condition and 'from child'):_B(child)
For Buffer:_cc
- If the table reference passed as clear was appended raw in multiple positions, this is akin to performing Buffer:_nil at all positions simultaneously. (May be easier than trying to come up with a string.gsub pattern.)
- Inserting a named empty table raw as a placeholder to be populated later via this function may be easier than calculating the pos argument of
For Buffer:_inHTML
Buffer:_( mw.html.create'br' ) is roughly 6 times more efficient than Buffer:_inHTML'br':_out(), at least in terms of server CPU usage. (Though Buffer:_'<br />' is 25 and 4 times more efficient, respectively. Also note that Buffer:_inHTML is slower on the first run due to initialization. After the first run, the efficiency ratio of using mw.html.create directly over Buffer:_inHTML drops to 2.)
Performance
Notes
- ^ For instance, Module:Asbox is transcluded on about 2 million pages, which each have Asbox using Buffer functions on 10-30 variables, some of which may be strings generated by other Modules that may eventually use Module:Buffer several times. Finally, throw in the fact that many pages transclude Asbox multiple times, and you can see how a few microseconds per op could translate to hours for the job queue.
- ^ For your convenience, the self operator : and . are interchangeable when used on the Module directly, though the self-op is required for nearly all interactions with objects created by the Module.
- ^ a b Setting a Buffer to raw incurs a performance penalty for all future tostring ops, as it must re-validate each indexed value through Buffer:_all to a new table before passing that to table.concat (vs. passing itself directly). That said, re-stringing a raw Buffer is still usually several times faster than using the .. op to join an equivalent number of strings. (See #Tips for ways to avoid using raw.)
- ^ In other words, if args is a string or a table without [1] set, it will be passed as the only argument. Further note it is not possible to pass a functionName = args pair where args is numerical, since such would be read as value = pos. Finally, passing a function type as args will throw an error message.
- ^ a b There is no 'getChild' method. If a child is needed after returning to the parent, set it locally or use Buffer:_G prior to returning. (No, Codehydro did not get lazy. Rather, this allows garbage collection on children with no further purpose.)
- ^ The first argument is not type checked. For #performance, it is read as outs only when there are multiple varargs. In other words, Buffer:_out(2) will use 2 as the separator. To append N generations to their parent with no separator, use Buffer:_out(N, nil).
- ^ An empty string would produce the same output as false, however, Lua strings, even empty ones, take up memory until garbage collected.
- ^ For example, given {nil, 'string'} as copy, Buffer:_cc(clear, copy) makes #clear equal 2, whereas #mw.clone{nil, 'string'} equals 0 (as of March 2015). This replicates length by filling clear halfway to the length of copy (the minimum needed to 'trick' Lua) and then setting to nil every key that would not trigger recalculation. As a result, keys that would resize clear when set to nil are left as empty strings. Such should be fairly rare; given tables representing every possible way to position a single nil key for all lengths between 2 and 32 (inclusive), only 8.39 percent of such tables would have their nil copied as an empty string instead. Also note that tables returned from Buffer:_cc(0, copy) have length declared on creation instead, and thus won't have extra strings attached.
The odds can be estimated using , where is the upper limit that an arbitrary nil key from copy of length ranging from 1 to is imaged as an empty string.
- ^ It is best practice to pass number strings instead of number literals (i.e. Buffer:stream'1' instead of Buffer:stream(1)). Such improves performance (and is perhaps more aesthetically pleasing in this mode).
- ^ No explicit trigger to exit stream mode has been programmed for Element-Buffer functions (including Buffer-HTML redirects). Stringing an Element-Buffer without its outer HTML-Buffer was deemed uncommon enough that the #performance penalty from modifying those specialized functions would outweigh the inconvenience of having to exit via typing :_( ... ) around the last item.
- ^ Buffer(-HTML) objects reference their parent differently from mw.html objects. Passing a normal mw.html object to Buffer:_inHTML as args.parent and then calling :done on the object created, followed by Buffer:getParent on the adopted parent, may return the "child." This is a feature rather than a bug.
- ^ While Buffer-HTML objects may use #global functions, there is no separate Buffer-HTML version. In other words, the self-action of a global function on an Element-Buffer is not redirected to the outer Buffer-HTML object.
- ^ mw.html:allDone is doubly wrapped for Element-Buffers. The inner wrapper sets a Buffer parent reference as described at Buffer:_inHTML.