Data.String.Utils
- Package
- purescript-stringutils
- Repository
- menelaos/purescript-stringutils
#NormalizationForm Source
#charAt Source
charAt :: Int -> String -> Maybe String
Return the character at the given index, if the index is within bounds.
Note that this function indexes by Unicode code point rather than by UTF-16 code unit.
If you want a simple wrapper around JavaScript's String.prototype.charAt
method, you should use the Data.String.CodeUnits.charAt function from
purescript-strings.
This function returns a String instead of a Char because PureScript Chars
must be UTF-16 code units and hence cannot represent all Unicode
code points.
Example:
-- Data.String.Utils.charAt
charAt 2 "ℙ∪𝕣ⅇႽ𝚌𝕣ⅈ𝚙†" == Just "𝕣"
-- Data.String.CodeUnits.charAt
charAt 2 "ℙ∪𝕣ⅇႽ𝚌𝕣ⅈ𝚙†" == Just '�'
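Because the result is a Maybe, an index past the last code point yields Nothing. A minimal sketch of handling both cases; charOr is a hypothetical helper introduced for illustration, not part of this module:

```purescript
module Example.CharAt where

import Data.Maybe (fromMaybe)
import Data.String.Utils (charAt)

-- | Hypothetical helper: the code point at index `i`,
-- | or `fallback` when `i` is out of bounds.
charOr :: String -> Int -> String -> String
charOr fallback i str = fromMaybe fallback (charAt i str)

-- Index 2 is within bounds, so the astral character is returned whole.
third :: String
third = charOr "?" 2 "ℙ∪𝕣ⅇႽ𝚌𝕣ⅈ𝚙†"

-- Index 10 is past the last code point (index 9), so the fallback is used.
missing :: String
missing = charOr "?" 10 "ℙ∪𝕣ⅇႽ𝚌𝕣ⅈ𝚙†"
```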
#codePointAt Source
codePointAt :: Warn (Text "DEPRECATED: `Data.String.Utils.codePointAt`") => Int -> String -> Maybe Int
DEPRECATED: This function is now available in purescript-strings.
Return the Unicode code point value of the character at the given index,
if the index is within bounds.
Note that this function indexes by Unicode code point rather than by UTF-16 code unit.
If you want a simple wrapper around JavaScript's String.prototype.codePointAt
method, you should use codePointAt'.
Example:
codePointAt 0 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == Just 120792
codePointAt 1 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == Just 120793
codePointAt 2 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == Just 120794
codePointAt 19 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == Nothing
codePointAt' 0 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == Just 120792
codePointAt' 1 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == Just 57304 -- Surrogate code point
codePointAt' 2 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == Just 120793
codePointAt' 19 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == Just 57313 -- Surrogate code point
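Since this function is deprecated, callers may want to migrate. The sketch below assumes the replacement is Data.String.CodePoints.codePointAt from purescript-strings, which returns a CodePoint rather than an Int; codePointValueAt is a hypothetical name for a wrapper that restores the Int result:

```purescript
module Example.CodePointAt where

import Prelude

import Data.Enum (fromEnum)
import Data.Maybe (Maybe)
import Data.String.CodePoints (codePointAt)

-- | Hypothetical drop-in replacement for the deprecated function:
-- | look up the code point at index `i` and convert it to an `Int`.
codePointValueAt :: Int -> String -> Maybe Int
codePointValueAt i str = map fromEnum (codePointAt i str)
```

The index still counts code points, so codePointValueAt 1 on the digit string above refers to the second digit, not to a surrogate half.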
#codePointAt' Source
codePointAt' :: Int -> String -> Maybe Int
Return the Unicode code point value of the character at the given index,
if the index is within bounds.
This function is a simple wrapper around JavaScript's
String.prototype.codePointAt method. This means that if the index does
not point to the beginning of a valid surrogate pair, the code unit at
the index (i.e. the Unicode code point of the surrogate pair half) is
returned instead.
If you want to treat a string as an array of Unicode code points, use
codePointAt from purescript-strings instead.
Example:
codePointAt' 0 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == Just 120792
codePointAt' 1 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == Just 57304 -- Surrogate code point
codePointAt' 2 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == Just 120793
codePointAt' 19 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == Just 57313 -- Surrogate code point
codePointAt 0 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == Just 120792
codePointAt 1 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == Just 120793
codePointAt 2 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == Just 120794
codePointAt 19 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == Nothing
#escapeRegex Source
escapeRegex :: String -> String
Escape a string so that it can be used as a literal string within a regular expression.
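A typical use is matching user-supplied text verbatim. The following is a sketch only: containsLiteral is a hypothetical helper, and it assumes the regex and test functions from Data.String.Regex in purescript-strings:

```purescript
module Example.EscapeRegex where

import Data.Either (Either(..))
import Data.String.Regex (regex, test)
import Data.String.Regex.Flags (noFlags)
import Data.String.Utils (escapeRegex)

-- | Hypothetical helper: does `haystack` contain `needle` verbatim?
-- | Metacharacters in `needle` (such as `+` or `.`) are escaped first,
-- | so they match themselves instead of acting as regex operators.
containsLiteral :: String -> String -> Boolean
containsLiteral needle haystack =
  case regex (escapeRegex needle) noFlags of
    Right re -> test re haystack
    -- A pattern that fails to compile is treated as "no match".
    Left _ -> false
```

Without the escaping step, the pattern "1+1" would mean "one or more 1s followed by 1" and would not match the text "1+1=2".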
#fromCharArray Source
fromCharArray :: Array String -> String
Convert an array of characters into a String
.
This function uses String
instead of Char
because PureScript
Char
s must be UTF-16 code units and hence cannot represent all Unicode
code points.
Example:
fromCharArray ["ℙ", "∪", "𝕣", "ⅇ", "Ⴝ", "𝚌", "𝕣", "ⅈ", "𝚙", "†"]
== "ℙ∪𝕣ⅇႽ𝚌𝕣ⅈ𝚙†"
#includes' Source
includes' :: String -> Int -> String -> Boolean
Determine whether the second string argument contains the first one,
beginning the search at the given position.
Note that this function counts the position in Unicode code points rather than in UTF-16 code units.
Negative position values result in a search from the beginning of the
string.
Example:
includes' "𝟙" 1 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == true
includes' "𝟙" 2 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == false
includes' "𝟡" 10 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == false
-- This behaviour is different from `String.prototype.includes`:
-- "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡".includes("𝟡", 10) == true
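The rule for negative positions can be illustrated as follows. This is a small sketch based only on the description above; the names fromNegative and fromOne are made up for the example:

```purescript
module Example.Includes where

import Prelude

import Data.String.Utils (includes')

-- A negative position is treated as a search from the beginning,
-- so the first code point is still found.
fromNegative :: Boolean
fromNegative = includes' "𝟘" (-5) "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡"

-- Position 1 skips the first code point, so "𝟘" is no longer found.
fromOne :: Boolean
fromOne = includes' "𝟘" 1 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡"
```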
#length Source
length :: Warn (Text "DEPRECATED: `Data.String.Utils.length`") => String -> Int
DEPRECATED: This function is now available in purescript-strings.
Return the number of Unicode code points in a string.
Note that this function correctly accounts for Unicode symbols that
are made up of surrogate pairs. If you want a simple wrapper around
JavaScript's string.length property, you should use the
Data.String.CodeUnits.length function from purescript-strings.
Example:
length "PureScript" == 10
length "ℙ∪𝕣ⅇႽ𝚌𝕣ⅈ𝚙†" == 10 -- 14 with `Data.String.CodeUnits.length`
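For code that migrates away from this deprecated function, the two counting modes can be compared side by side. The sketch assumes the length functions exported by Data.String.CodePoints and Data.String.CodeUnits in purescript-strings:

```purescript
module Example.Length where

import Data.String.CodePoints as CodePoints
import Data.String.CodeUnits as CodeUnits

-- Counts Unicode code points: each astral character counts once.
codePoints :: Int
codePoints = CodePoints.length "ℙ∪𝕣ⅇႽ𝚌𝕣ⅈ𝚙†" -- 10

-- Counts UTF-16 code units: each astral character counts twice.
codeUnits :: Int
codeUnits = CodeUnits.length "ℙ∪𝕣ⅇႽ𝚌𝕣ⅈ𝚙†" -- 14
```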
#mapChars Source
mapChars :: (String -> String) -> String -> String
Return the string obtained by applying the mapping function to each
character (i.e. Unicode code point) of the input string.
Note that this is probably not what you want as Unicode code points are
not necessarily the same as user-perceived characters (grapheme clusters).
Only use this function if you know what you are doing.
This function uses Strings instead of Chars because PureScript Chars
must be UTF-16 code units and hence cannot represent all Unicode
code points.
Example:
-- Mapping over what appears to be six characters...
mapChars (const "x") "Åström" == "xxxxxxxx" -- See? Don't use this!
#normalize' Source
normalize' :: NormalizationForm -> String -> String
Return a given Unicode Normalization Form of a string.
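The constructors of NormalizationForm are not listed in this section; the sketch below assumes they are named after the standard forms (NFC, NFD, NFKC, NFKD) and are exported with the type. It compares the precomposed and decomposed spellings of "Å" by counting code points with Data.String.CodePoints.length:

```purescript
module Example.Normalize where

import Data.String.CodePoints (length)
import Data.String.Utils (NormalizationForm(..), normalize')

-- "Å" written as a single precomposed code point (U+00C5).
precomposed :: String
precomposed = "\x00C5"

-- "Å" written as "A" followed by a combining ring above (U+030A).
decomposed :: String
decomposed = "A\x030A"

-- NFC composes: the two-code-point spelling becomes one code point.
composedLength :: Int
composedLength = length (normalize' NFC decomposed)

-- NFD decomposes: the single code point becomes two.
decomposedLength :: Int
decomposedLength = length (normalize' NFD precomposed)
```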
#padEnd Source
padEnd :: Int -> String -> String
Pad the given string with spaces at the end until the resulting string
reaches the given length.
Note that this function measures length in Unicode code points rather than in UTF-16 code units.
If you want a simple wrapper around JavaScript's String.prototype.padEnd
method, you should use padEnd'.
Example:
-- Treats strings as a sequence of Unicode code points
padEnd 1 "0123456789" == "0123456789"
padEnd 1 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡"
padEnd 11 "0123456789" == "0123456789 "
padEnd 11 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡 "
padEnd 21 "0123456789" == "0123456789           "
padEnd 21 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡           "
-- Treats strings as a sequence of Unicode code units
padEnd' 1 "0123456789" == "0123456789"
padEnd' 1 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡"
padEnd' 11 "0123456789" == "0123456789 "
padEnd' 11 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡"
padEnd' 21 "0123456789" == "0123456789           "
padEnd' 21 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡 "
#padEnd' Source
padEnd' :: Int -> String -> String
Wrapper around JavaScript's String.prototype.padEnd method.
Note that this function treats strings as a sequence of Unicode
code units.
You will probably want to use padEnd instead.
Example:
-- Treats strings as a sequence of Unicode code points
padEnd 1 "0123456789" == "0123456789"
padEnd 1 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡"
padEnd 11 "0123456789" == "0123456789 "
padEnd 11 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡 "
padEnd 21 "0123456789" == "0123456789           "
padEnd 21 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡           "
-- Treats strings as a sequence of Unicode code units
padEnd' 1 "0123456789" == "0123456789"
padEnd' 1 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡"
padEnd' 11 "0123456789" == "0123456789 "
padEnd' 11 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡"
padEnd' 21 "0123456789" == "0123456789           "
padEnd' 21 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡 "
#padStart Source
padStart :: Int -> String -> String
Pad the given string with spaces at the start until the resulting string
reaches the given length.
Note that this function measures length in Unicode code points rather than in UTF-16 code units.
If you want a simple wrapper around JavaScript's String.prototype.padStart
method, you should use padStart'.
Example:
-- Treats strings as a sequence of Unicode code points
padStart 1 "0123456789" == "0123456789"
padStart 1 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡"
padStart 11 "0123456789" == " 0123456789"
padStart 11 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == " 𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡"
padStart 21 "0123456789" == "           0123456789"
padStart 21 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == "           𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡"
-- Treats strings as a sequence of Unicode code units
padStart' 1 "0123456789" == "0123456789"
padStart' 1 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡"
padStart' 11 "0123456789" == " 0123456789"
padStart' 11 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡"
padStart' 21 "0123456789" == "           0123456789"
padStart' 21 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == " 𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡"
#padStart' Source
padStart' :: Int -> String -> String
Wrapper around JavaScript's String.prototype.padStart method.
Note that this function treats strings as a sequence of Unicode
code units.
You will probably want to use padStart instead.
Example:
-- Treats strings as a sequence of Unicode code points
padStart 1 "0123456789" == "0123456789"
padStart 1 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡"
padStart 11 "0123456789" == " 0123456789"
padStart 11 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == " 𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡"
padStart 21 "0123456789" == "           0123456789"
padStart 21 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == "           𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡"
-- Treats strings as a sequence of Unicode code units
padStart' 1 "0123456789" == "0123456789"
padStart' 1 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡"
padStart' 11 "0123456789" == " 0123456789"
padStart' 11 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡"
padStart' 21 "0123456789" == "           0123456789"
padStart' 21 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == " 𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡"
#repeat Source
repeat :: Int -> String -> Maybe String
Return a string that contains the specified number of copies of the input
string concatenated together. Return Nothing if the repeat count is
negative or if the resulting string would overflow the maximum string size.
Example:
repeat 3 "𝟞" == Just "𝟞𝟞𝟞"
repeat (-1) "PureScript" == Nothing
repeat 2147483647 "PureScript" == Nothing
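Callers usually need to decide what Nothing should mean for them. As one possible approach, the sketch below builds indentation and falls back to no indentation at all; indent is a hypothetical helper, not part of this module:

```purescript
module Example.Repeat where

import Prelude

import Data.Maybe (fromMaybe)
import Data.String.Utils (repeat)

-- | Hypothetical helper: indent a line by `n` spaces.
-- | A negative `n` makes `repeat` return `Nothing`; in that case
-- | the line is returned without any indentation.
indent :: Int -> String -> String
indent n line = fromMaybe "" (repeat n " ") <> line
```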
#startsWith Source
startsWith :: String -> String -> Boolean
Determine whether the second argument starts with the first one.
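Because the prefix comes first, the function can be partially applied and used as a predicate. A minimal sketch, where commentLines is a hypothetical helper and filter comes from Data.Array in purescript-arrays:

```purescript
module Example.StartsWith where

import Data.Array (filter)
import Data.String.Utils (startsWith)

-- | Hypothetical helper: keep only the lines that begin with "--".
commentLines :: Array String -> Array String
commentLines = filter (startsWith "--")
```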
#startsWith' Source
startsWith' :: String -> Int -> String -> Boolean
Determine whether a string starts with a certain substring at a given position.
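The argument order is not spelled out here; the sketch below assumes it mirrors includes' (search string, then position, then the string to inspect). It uses ASCII input only, so the result does not depend on whether the position counts code points or code units:

```purescript
module Example.StartsWithAt where

import Data.String.Utils (startsWith')

-- "Script" begins at position 4 of "PureScript".
atFour :: Boolean
atFour = startsWith' "Script" 4 "PureScript"

-- At position 0 the string begins with "Pure", not "Script".
atZero :: Boolean
atZero = startsWith' "Script" 0 "PureScript"
```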
#stripChars Source
stripChars :: String -> String -> String
Strip a set of characters from a string. This function is case-sensitive.
Example:
stripChars "aeiou" "PureScript" == "PrScrpt"
stripChars "AEIOU" "PureScript" == "PureScript"
#stripDiacritics Source
stripDiacritics :: String -> String
Strip diacritics from a string.
Example:
stripDiacritics "Ångström" == "Angstrom"
stripDiacritics "Crème Brulée" == "Creme Brulee"
stripDiacritics "Götterdämmerung" == "Gotterdammerung"
stripDiacritics "ℙ∪𝕣ⅇႽ𝚌𝕣ⅈ𝚙†" == "ℙ∪𝕣ⅇႽ𝚌𝕣ⅈ𝚙†"
stripDiacritics "Raison d'être" == "Raison d'etre"
stripDiacritics "Týr" == "Tyr"
stripDiacritics "Zürich" == "Zurich"
#stripMargin Source
stripMargin :: String -> String
Remove leading whitespace and the pipe character from each line. Useful for
dedenting strings enclosed in triple double quotation marks.
Inspired by Scala's stripMargin method.
Does not preserve original line endings.
Example:
stripMargin
  """
  |Line 1
  |Line 2
  |Line 3
  """
== "Line 1\nLine 2\nLine 3"
#stripMarginWith Source
stripMarginWith :: String -> String -> String
Same as stripMargin except with the option to use any given string
to delimit the margin.
Does not preserve original line endings.
Example:
stripMarginWith ">> "
  """
  >> Line 1
  >> Line 2
  >> Line 3
  """
== "Line 1\nLine 2\nLine 3"
#toCharArray Source
toCharArray :: String -> Array String
Convert a string to an array of Unicode code points.
Note that this function is different from
Data.String.CodeUnits.toCharArray in purescript-strings, which
converts a string to an array of 16-bit code units.
The difference becomes apparent when converting strings
that contain characters which are internally represented
as surrogate pairs.
This function uses Strings instead of Chars because PureScript Chars
must be UTF-16 code units and hence cannot represent all Unicode
code points.
Example:
-- Data.String.Utils
toCharArray "ℙ∪𝕣ⅇႽ𝚌𝕣ⅈ𝚙†"
== ["ℙ", "∪", "𝕣", "ⅇ", "Ⴝ", "𝚌", "𝕣", "ⅈ", "𝚙", "†"]
-- Data.String.CodeUnits
toCharArray "ℙ∪𝕣ⅇႽ𝚌𝕣ⅈ𝚙†" ==
['ℙ', '∪', '�', '�', 'ⅇ', 'Ⴝ', '�', '�', '�', '�', 'ⅈ', '�', '�', '†']
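Working on code points makes array operations safe for astral characters. The sketch below combines toCharArray with fromCharArray to reverse a string without splitting surrogate pairs; reverseCodePoints is a hypothetical helper, and reverse comes from Data.Array in purescript-arrays:

```purescript
module Example.Reverse where

import Prelude

import Data.Array (reverse)
import Data.String.Utils (fromCharArray, toCharArray)

-- | Hypothetical helper: reverse a string code point by code point,
-- | so surrogate pairs are never split.
reverseCodePoints :: String -> String
reverseCodePoints = fromCharArray <<< reverse <<< toCharArray
```

As with mapChars, code points are not grapheme clusters: a combining mark would end up in front of its base character after reversal.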
#unsafeCodePointAt Source
unsafeCodePointAt :: Int -> String -> Int
Return the Unicode code point value of the character at the given index,
if the index is within bounds.
Note that this function indexes by Unicode code point rather than by UTF-16 code unit.
If you want a simple (unsafe) wrapper around JavaScript's
String.prototype.codePointAt method, you should use unsafeCodePointAt'.
Unsafe: Throws runtime exception if the index is not within bounds.
Example:
unsafeCodePointAt 0 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == 120792
unsafeCodePointAt 1 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == 120793
unsafeCodePointAt 2 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == 120794
unsafeCodePointAt 19 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" -- Error
unsafeCodePointAt' 0 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == 120792
unsafeCodePointAt' 1 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == 57304 -- Surrogate code point
unsafeCodePointAt' 2 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == 120793
unsafeCodePointAt' 19 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == 57313 -- Surrogate code point
#unsafeCodePointAt' Source
unsafeCodePointAt' :: Int -> String -> Int
Return the Unicode code point value of the character at the given index,
if the index is within bounds.
This function is a simple (unsafe) wrapper around JavaScript's
String.prototype.codePointAt method. This means that if the index does
not point to the beginning of a valid surrogate pair, the code unit at
the index (i.e. the Unicode code point of the surrogate pair half) is
returned instead.
If you want to treat a string as an array of Unicode code points, use
unsafeCodePointAt instead.
Example:
unsafeCodePointAt' 0 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == 120792
unsafeCodePointAt' 1 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == 57304 -- Surrogate code point
unsafeCodePointAt' 2 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == 120793
unsafeCodePointAt' 19 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == 57313 -- Surrogate code point
unsafeCodePointAt 0 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == 120792
unsafeCodePointAt 1 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == 120793
unsafeCodePointAt 2 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" == 120794
unsafeCodePointAt 19 "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" -- Error