# A Parser Combinator in PureScript (part 2/2)

### A Parser Combinator in PureScript (Series)

A Parser Combinator in PureScript (part 1/2) A Parser Combinator in PureScript (part 2/2)In the previous post we’ve created the basic building blocks of a parser combinator library. Now it’s time to add some more fancy stuff.

## Choice and Failure

Up until now, each parser either parsed successfully or failed. We’d like to have a parser that can try different parsers before giving up and failing.

Luckily, math already did the dirty job for us. In fact, we can use a couple of typeclasses: `Alt`

and `Plus`

.

`Alt`

captures the intuition of a choice between alternatives. It has only one member `alt`

or infix `<|>`

:

```
instance altParser :: (Functor Parser) => Alt Parser where
-- alt :: forall a. f a -> f a -> f a
= Parser (\s -> case runParser p1 s of
alt p1 p2 Just (Tuple v s') -> Just $ Tuple v s'
Nothing -> runParser p2 s)
```

In other words, it tries `p1`

and falls back to `p2`

in case of failure. For example

```
main :: Effect Unit
= do
main $ runParser (char 'Z' <|> char 's') "string"
logShow -- (Just (Tuple 's' "tring"))
```

`Plus`

represents failure:

```
instance plusParser :: (Alt Parser) => Plus Parser where
-- empty :: forall a. f a
= Parser (\_ -> Nothing) empty
```

In the previous post, we represented failure in the `fail`

parser. Turns out we could have used `empty`

in its place:

```
main :: Effect Unit
= do
main -- `:: Parser Unit` is needed to make the compiler happy
$ runParser (empty :: Parser Unit) "string"
logShow -- Nothing
```

Putting it together

```
main :: Effect Unit
= do
main $ runParser (empty <|> anyChar) "string"
logShow -- (Just (Tuple 's' "tring"))
$ runParser (anyChar <|> empty) "string"
logShow -- (Just (Tuple 's' "tring"))
$ runParser (char 'Z' <|> char 's') "string"
logShow -- (Just (Tuple 's' "tring"))
```

## Attempt Parsing Multiple Times

It would be awesome to be able to use one parser as many times as it can succeed. Let’s say we wanted to match any number of spaces. Then we could write

```
spaces :: Parser Unit
= do
spaces <- anyChar
c if c == ' '
then spaces
else Parser (\s -> Just (Tuple unit $ fromChars [c] <> s))
```

In other words, we parse any character, if it’s a space we recursively call `spaces`

. Otherwise, we noop by putting back the character in the string.

Or similarly

```
spaces' :: Parser Unit
= (char ' ' >>= \_ -> spaces') <|> pure unit spaces'
```

we either succeed at parsing a space and recursively call `spaces'`

or we get out successfully.

Those work

```
main :: Effect Unit
= do
main $ runParser spaces " string"
logShow -- (Just (Tuple Unit "string"))
$ runParser spaces "string"
logShow -- (Just (Tuple Unit "string"))
$ runParser spaces' " string"
logShow -- (Just (Tuple Unit "string"))
$ runParser spaces' "string"
logShow -- (Just (Tuple Unit "string"))
```

However, they look ugly. But guess what? Math has our back once again. As a matter of fact, we can use `Data.List.many`

.

To use `many`

, we need to satisfy a couple of constraints. In fact, the signature is

```
many :: forall f a. Alternative f
=> Lazy (f (List a))
=> f a
-> f (List a)
```

If we substitute `Parser`

in place of `f`

then we get

```
many :: forall a. Alternative Parser
=> Lazy (Parser (List a))
=> Parser a
-> Parser (List a)
```

That means we need the following instances: `Alternative Parser`

and `Lazy (Parser (List a))`

.

`Alternative`

is easy pease because it has no members to implement. In particular, for a type to be an `Alternative`

, it only needs to be both `Applicative`

and `Plus`

. Since `Parser`

implements those, we can just write

`instance alternativeParser :: (Applicative Parser, Plus Parser) => Alternative Parser`

Attempting to parse an indefinite amount of times requires laziness for the same reason we need laziness to have an infinite list (check the mutual recursive definition of `many`

and `some`

to dig deeper).

In Haskell that is not a problem because the language is lazy by default. In PureScript we need to use `Lazy`

:

```
instance lazyParser :: Lazy (Parser (List a)) where
-- defer :: (Unit -> l) -> l
= Parser (\s -> runParser (g unit) s) defer g
```

That means now we can write

```
spaces0 :: Parser Unit
= do
spaces0 <- many $ char ' '
_ pure unit
```

We discard the spaces (`_ <- many $ char ' '`

) because we are not interested in a list of spaces, we just want to consume them:

```
main :: Effect Unit
= do
main $ runParser spaces0 " string"
logShow -- (Just (Tuple Unit "string"))
$ runParser spaces0 "string"
logShow -- (Just (Tuple Unit "string"))
```

The best part is that now we can also use `some`

to have a parser that needs at least one match to succeed:

```
spaces1 :: Parser Unit
= do
spaces1 <- some $ char ' '
_ pure unit
main :: Effect Unit
= do
main $ runParser spaces0 " string"
logShow -- (Just (Tuple Unit "string"))
$ runParser spaces0 "string"
logShow -- Nothing
```

## The `int`

Parser

Just for fun, let’s put everything together into a parser that parses `Int`

s.

At the top we define `int`

as

```
int :: Parser Int
=
int <|> nat neg
```

In other words, an int is either a negative or a natural. `nat`

is implemented as follows

```
nat :: Parser Int
= do
nat <- some digit
xs case fromString $ fromChars xs of
Just n -> pure n
Nothing -> empty
```

And `neg`

as follows

```
neg :: Parser Int
= do
neg <- char '-'
_ negate <$> nat
```

Et voilà

```
main :: Effect Unit
= do
main $ runParser int "123 string"
logShow -- (Just (Tuple 123 " string"))
$ runParser int "-123 string"
logShow -- (Just (Tuple -123 " string"))
$ runParser int "string"
logShow -- Nothing
```

## The Whole Code

```
module Main where
import Prelude
import Effect (Effect)
import Effect.Console (logShow)
import Data.Maybe
import Data.Tuple
import Data.String.Yarn hiding (fromString)
import Data.List
import Data.Char.Unicode
import Data.String.CodeUnits
import Control.Alt
import Control.Plus
import Control.Alternative
import Control.Lazy (class Lazy)
import Data.Int
newtype Parser a = Parser (String -> Maybe (Tuple a String))
runParser :: forall a. Parser a -> String -> Maybe (Tuple a String)
Parser g) s = g s
runParser (
instance functorParser :: Functor Parser where
-- map :: forall a b. (a -> b) -> f a -> f b
map g f = Parser (\s -> case runParser f s of
Just (Tuple v s') -> Just $ Tuple (g v) s'
Nothing -> Nothing)
instance applyParser :: (Functor Parser) => Apply Parser where
-- apply :: forall a b. f (a -> b) -> f a -> f b
= Parser (\s -> case runParser fg s of
apply fg f Just (Tuple g s') ->
case runParser f s' of
Just (Tuple v s'') -> Just $ Tuple (g v) s''
Nothing -> Nothing
Nothing -> Nothing)
instance applicativeParser :: (Apply Parser) => Applicative Parser where
-- pure :: forall a. a -> f a
pure x = Parser (\s -> Just $ Tuple x s)
instance bindParser :: (Apply Parser) => Bind Parser where
-- bind :: forall a b. m a -> (a -> m b) -> m b
= Parser (\s -> case runParser m s of
bind m g Just (Tuple v s') -> runParser (g v) s'
Nothing -> Nothing)
instance monadParser :: (Bind Parser) => Monad Parser
instance altParser :: (Functor Parser) => Alt Parser where
-- alt :: forall a. f a -> f a -> f a
= Parser (\s -> case runParser p1 s of
alt p1 p2 Just (Tuple v s') -> Just $ Tuple v s'
Nothing -> runParser p2 s)
instance plusParser :: (Alt Parser) => Plus Parser where
-- empty :: forall a. f a
= Parser (\_ -> Nothing)
empty
instance alternativeParser :: (Applicative Parser, Plus Parser) => Alternative Parser
instance lazyParser :: Lazy (Parser (List a)) where
-- defer :: (Unit -> l) -> l
= Parser (\s -> runParser (g unit) s)
defer g
fail :: forall a. Parser a
fail = Parser (\_ -> Nothing)
anyChar :: Parser Char
= Parser (\s -> case toChars s :: List Char of
anyChar Nil -> Nothing
Cons x xs -> Just $ Tuple x $ fromChars xs)
sat :: (Char -> Boolean) -> Parser Char
pred = do
sat <- anyChar
c if pred c then pure c else fail
digit :: Parser Char
= sat isDigit
digit
lower :: Parser Char
= sat isLower
lower
upper :: Parser Char
= sat isUpper
upper
letter :: Parser Char
= sat isAlpha
letter
alphanum :: Parser Char
= sat isAlphaNum
alphanum
char :: Char -> Parser Char
= sat ((==) c)
char c
string :: String -> Parser String
=
string s map fromChars $ case toChars s :: List Char of
Nil -> pure Nil
Cons x xs -> do
<- char x
_ <- string $ fromChars xs
_ pure $ Cons x xs
spaces :: Parser Unit
= do
spaces <- anyChar
c if c == ' '
then spaces
else Parser (\s -> Just (Tuple unit $ fromChars [c] <> s))
spaces' :: Parser Unit
= (char ' ' >>= \_ -> spaces') <|> pure unit
spaces'
spaces0 :: Parser Unit
= do
spaces0 <- many $ char ' '
_ pure unit
spaces1 :: Parser Unit
= do
spaces1 <- some $ char ' '
_ pure unit
int :: Parser Int
=
int <|> nat
neg
nat :: Parser Int
= do
nat <- some digit
xs case fromString $ fromChars xs of
Just n -> pure n
Nothing -> empty
neg :: Parser Int
= do
neg <- char '-'
_ negate <$> nat
main :: Effect Unit
= do
main $ runParser anyChar "string"
logShow -- (Just (Tuple 's' "tring"))
$ runParser (char 's') "string"
logShow -- (Just (Tuple 's' "tring"))
$ runParser (char 'Z') "string"
logShow -- Nothing
$ runParser digit "3tring"
logShow -- (Just (Tuple '3' "tring"))
$ runParser (string "stri") "string"
logShow -- (Just (Tuple "stri" "ng"))
$ runParser (string "ZZZ") "string"
logShow -- Nothing
$ runParser (char 'Z' <|> char 's') "string"
logShow -- (Just (Tuple 's' "tring"))
$ runParser (empty :: Parser Unit) "string"
logShow -- Nothing
$ runParser (empty <|> anyChar) "string"
logShow -- (Just (Tuple 's' "tring"))
$ runParser (anyChar <|> empty) "string"
logShow -- (Just (Tuple 's' "tring"))
$ runParser spaces " string"
logShow -- (Just (Tuple Unit "string"))
$ runParser spaces "string"
logShow -- (Just (Tuple Unit "string"))
$ runParser spaces' " string"
logShow -- (Just (Tuple Unit "string"))
$ runParser spaces' "string"
logShow -- (Just (Tuple Unit "string"))
$ runParser spaces0 " string"
logShow -- (Just (Tuple Unit "string"))
$ runParser spaces0 "string"
logShow -- (Just (Tuple Unit "string"))
$ runParser spaces1 " string"
logShow -- (Just (Tuple Unit "string"))
$ runParser spaces1 "string"
logShow -- Nothing
$ runParser int "123 string"
logShow -- (Just (Tuple 123 " string"))
$ runParser int "-123 string"
logShow -- (Just (Tuple -123 " string"))
$ runParser int "string"
logShow -- Nothing
```

## Outro

Special thanks to Tom for writing “Fantas, Eel, and Specification 10: Alt, Plus, and Alternative” and to Typeclassopedia. Those two provide clear explanations of the intuitions behind `Alternative`

. Also, thanks a bunch to @monoidmusician and @kadblas from fpchat who helped me understand `Lazy`

.