Parsing.String.Replace  
- Package
- purescript-parsing
- Repository
- purescript-contrib/purescript-parsing
This module is for finding patterns in a String, and also
replacing or splitting on the found patterns.
This activity is traditionally done with
Regex,
but this module uses parsers instead for the pattern matching.
Functions in this module are ways to run a parser on an input String,
like runParser or runParserT.
Why would we want to do pattern matching and substitution with parsers instead of regular expressions?
- Monadic parsers have a nicer syntax than regular expressions, which are notoriously difficult to read. With monadic parsers we can perform textual pattern-matching in plain PureScript rather than using a special regex domain-specific programming language. 
- Regular expressions can do “group capture” on sections of the matched pattern, but they can only return stringy lists of the capture groups. Parsers can construct typed data structures based on the capture groups, guaranteeing no disagreement between the pattern rules and the rules that we're using to build data structures based on the pattern matches. - For example, consider scanning a string for numbers. A lot of different things can look like a number, and can have leading plus or minus signs, or be in scientific notation, or have commas, or whatever. If we try to parse all of the numbers out of a string using regular expressions, then we have to make sure that the regular expression and the string-to-number conversion function agree about exactly what is and what isn't a numeric string. We can get into an awkward situation in which the regular expression says it has found a numeric string but the string-to-number conversion function fails. A typed parser will perform both the pattern match and the conversion, so it will never be in that situation. Parse, don't validate. 
- Regular expressions are only able to pattern-match regular grammars. Monadic parsers are able pattern-match context-free (by recursion) or context-sensitive (by monad transformer) grammars. 
- The replacement expression for a traditional regular expression-based substitution command is usually just a string template in which the Nth “capture group” can be inserted with the syntax - \N. With this library, instead of a template, we can perform any replacement computation, including- Effects.
Implementation Notes
All of the functions in this module work by calling runParserT
with the anyTill combinator.
We can expect the speed of parser-based pattern matching to be
about 10× worse than regex-based pattern matching in a JavaScript
runtime environment.
This module is based on the Haskell packages
replace-megaparsec
and
replace-attoparsec.
#breakCap Source
breakCap :: forall a. String -> Parser String a -> Maybe (T3 String a String)Break on and capture one pattern
Find the first occurence of a pattern in the input String, capture the found
pattern, and break the input String on the found pattern.
This function can be used instead of Data.String.indexOf or Data.String.Regex.search or Data.String.Regex.replace and it allows using a parser for the pattern search.
This function can be used instead of
Data.String.takeWhile
or
Data.String.dropWhile
and it is predicated beyond more than just the next single CodePoint.
Output
- Nothingwhen no pattern match was found.
- Just (prefix /\ parse_result /\ suffix)for the result of parsing the pattern match, and the- prefixstring before and the- suffixstring after the pattern match.- prefixand- suffixmay be zero-length strings.
Access the matched section of text
To capture the matched string combine the pattern
parser sep with the match combinator.
With the matched string, we can reconstruct the input string.
For all input, sep, if
let (Just (prefix /\ (infix /\ _) /\ suffix)) =
      breakCap input (match sep)
then
input == prefix <> infix <> suffix
Example
Find the first pattern match and break the input string on the pattern.
breakCap "hay needle hay" (string "needle")
Result:
Just ("hay " /\ "needle" /\ " hay")
Example
Find the first pattern match, capture the matched text and the parsed result.
breakCap "abc 123 def" (match intDecimal)
Result:
Just ("abc " /\ ("123" /\ 123) /\ " def")
#splitCap Source
splitCap :: forall a. String -> Parser String a -> NonEmptyList (Either String a)Split on and capture all patterns
Find all occurences of the pattern parser sep, split the
input String, capture all the matched patterns and the splits.
This function can be used instead of Data.String.Common.split or Data.String.Regex.split or Data.String.Regex.match or Data.String.Regex.search.
The input string will be split on every leftmost non-overlapping occurence
of the pattern sep. The output list will contain
the parsed result of input string sections which match the sep pattern
in Right a, and non-matching sections in Left String.
Access the matched section of text
To capture the matched strings combine the pattern
parser sep with the match combinator.
With the matched strings, we can reconstruct the input string.
For all input, sep, if
let output = splitCap input (match sep)
then
input == fold (either identity fst <$> output)
Example
Split the input string on all Int pattern matches.
splitCap "hay 1 straw 2 hay" intDecimal
Result:
[Left "hay ", Right 1, Left " straw ", Right 2, Left " hay"]
Example
Find the beginning positions of all pattern matches in the input.
catMaybes $ hush <$> splitCap ".𝝺...\n...𝝺." (position <* string "𝝺")
Result:
[ Position {index: 1, line: 1, column: 2 }
, Position { index: 9, line: 2, column: 4 }
]
Example
Find groups of balanced nested parentheses. This pattern is an example of a “context-free” grammar, a pattern that can't be expressed by a regular expression. We can express the pattern with a recursive parser.
balancedParens :: Parser String Unit
balancedParens = do
  void $ char '('
  void $ manyTill (balancedParens <|> void anyCodePoint) (char ')')
rmap fst <$> splitCap "((🌼)) (()())" (match balancedParens)
Result:
[Right "((🌼))", Left " ", Right "(()())"]
#splitCapT Source
splitCapT :: forall m a. Monad m => MonadRec m => String -> ParserT String m a -> m (NonEmptyList (Either String a))Monad transformer version of splitCap. The sep parser will run in the
monad context.
Example
Count the pattern matches.
Parse in a State monad to remember state in the parser. This
stateful letterCount parser counts
the number of pattern matches which occur in the input, and also
tags each match with its index.
letterCount :: ParserT String (State Int) (Tuple Char Int)
letterCount = do
  x <- letter
  i <- modify (_+1)
  pure (x /\ i)
flip runState 0 $ splitCapT "A B" letterCount
Result:
[Right ('A' /\ 1), Left " ", Right ('B' /\ 2)] /\ 2
#replace Source
replace :: String -> Parser String String -> StringFind-and-replace
Also called “match-and-substitute”. Find all
of the leftmost non-overlapping sections of the input string which match
the pattern parser sep, and
replace them with the result of the parser.
The sep parser must return a result of type String for the replacement.
This function can be used instead of Data.String.replaceAll or Data.String.Regex.replace'.
Access the matched section of the input string
To get access to the matched string for calculating the replacement,
combine the pattern parser sep
with the match combinator.
This allows us to write a sep parser which can choose to not
replace the match and just leave it as it is.
So, for all sep:
replace input (fst <$> match sep) == input
Example
Find and uppercase the "needle" pattern.
replace "hay needle hay" (toUpper <$> string "needle")
Result:
"hay NEEDLE hay"
Example
Find integers and double them.
replace "1 6 21 107" (show <$> (_*2) <$> intDecimal)
Result:
"2 12 42 214"
#replaceT Source
replaceT :: forall m. Monad m => MonadRec m => String -> ParserT String m String -> m StringMonad transformer version of replace.
Example
Find an environment variable in curly braces and replace it with its value
from the environment.
We can read from the environment with lookupEnv because replaceT is
running the sep parser in Effect.
replaceT "◀ {HOME} ▶" do
  _ <- string "{"
  Tuple variable _ <- anyTill (string "}")
  lift (lookupEnv variable) >>= maybe empty pure
Result:
"◀ /home/jbrock ▶"
