encode

Synthesised documentation from type/Str

From type/Str

multi method encode(Str:D $encoding = 'utf8', :$replacement, Bool() :$translate-nl = False, :$strict)

Returns a Blob which represents the original string in the given encoding and normal form. The actual return type is as specific as possible, so $str.encode('UTF-8') returns a utf8 object, $str.encode('ISO-8859-1') a buf8. If :translate-nl is set to True, it will translate newlines from \n to \r\n, but only in Windows. $replacement indicates how characters are going to be replaced in the case they are not available in the current encoding, while $strict indicates whether unmapped codepoints will still decode; for instance, codepoint 129 which does not exist in windows-1252.

my $str = "Þor is mighty";
say $str.encode("ascii", :replacement( 'Th') ).decode("ascii");
# OUTPUT: «Thor is mighty␤»

In this case, any unknown character is going to be substituted by Th. We know in advance that the character that is not known in the ascii encoding is Þ, so we substitute it by its latin equivalent, Th. In the absence of any replacement set of characters, :replacement is understood as a Bool:

say $str.encode("ascii", :replacement).decode("ascii"); # OUTPUT: «?or is mighty␤»

If :replacement is not set or assigned a value, the error Error encoding ASCII string: could not encode codepoint 222 will be issued (in this case, since þ is codepoint 222).

Since the Blob returned by encode is the original string in normal form, and every element of a Blob is a byte, you can obtain the length in bytes of a string by calling a method that returns the size of the Blob on it:

say "þor".encode.bytes; # OUTPUT: «4␤» 
say "þor".encode.elems; # OUTPUT: «4␤»