Yes. I’m divided into “hum… 100 lines is larger than I expected” and “what did he mean ‘from scratch’? did he write the parser combinators? if so, 100 lines is crazy small!”
But I’m settling in believing 80 of those lines are verbose type declarations.
I decided to write it myself for fun. I decided that “From Scratch” means:
No parser libraries (parsec/happy/etc)
No using read from Prelude
No hacky meta-parsing
Here is what I came up with (using my favourite parsing method: parser combinators):
import Control.Monad ((>=>), replicateM)
import Control.Applicative (Alternative (..), asum, optional)
import Data.Maybe (fromMaybe)
import Data.Functor (($>))
import Data.List (singleton)
import Data.Map (Map, fromList)
import Data.Bifunctor (first, second)
import Data.Char (toLower, chr)
newtype Parser i o = Parser { parse :: i -> Maybe (i, o) } deriving (Functor)
instance Applicative (Parser i) where
pure a = Parser $ \i -> Just (i, a)
a <*> b = Parser $ parse a >=> \(i, f) -> second f <$> parse b i
instance Alternative (Parser i) where
empty = Parser $ const Nothing
a <|> b = Parser $ \i -> parse a i <|> parse b i
instance Monad (Parser i) where
a >>= f = Parser $ parse a >=> \(i, b) -> parse (f b) i
instance Semigroup o => Semigroup (Parser i o) where
a <> b = (<>) <$> a <*> b
instance Monoid o => Monoid (Parser i o) where
mempty = pure mempty
type SParser = Parser String
charIf :: (a -> Bool) -> Parser [a] a
charIf cond = Parser $ \i -> case i of
(x:xs) | cond x -> Just (xs, x)
_ -> Nothing
char :: Eq a => a -> Parser [a] a
char c = charIf (== c)
one :: Parser i a -> Parser i [a]
one = fmap singleton
str :: Eq a => [a] -> Parser [a] [a]
str = mapM char
sepBy :: Parser i a -> Parser i b -> Parser i [a]
sepBy a b = (one a <> many (b *> a)) <|> mempty
data Decimal = Decimal { mantissa :: Integer, exponent :: Int } deriving Show
data JSON = Object (Map String JSON) | Array [JSON] | Bool Bool | Number Decimal | String String | Null deriving Show
whitespace :: SParser String
whitespace = many $ asum $ map char [' ', '\t', '\r', '\n']
digit :: Int -> SParser Int
digit base = asum $ take base [asum [char c, char (toLower c)] $> n | (c, n) <- zip (['0'..'9'] <> ['A'..'Z']) [0..]]
collectDigits :: Int -> [Int] -> Integer
collectDigits base = foldl (\acc x -> acc * fromIntegral base + fromIntegral x) 0
unsignedInteger :: SParser Integer
unsignedInteger = collectDigits 10 <$> some (digit 10)
integer :: SParser Integer
integer = asum [char '-' $> (-1), char '+' $> 1, str "" $> 1] >>= \sign -> (sign *) <$> unsignedInteger
-- This is the ceil of the log10 and also very inefficient
log10 :: Integer -> Int
log10 n
| n < 1 = 0
| otherwise = 1 + log10 (n `div` 10)
jsonNumber :: SParser Decimal
jsonNumber = do
whole <- integer
fraction <- fromMaybe 0 <$> optional (str "." *> unsignedInteger)
e <- fromIntegral . fromMaybe 0 <$> optional ((str "E" <|> str "e") *> integer)
pure $ Decimal (whole * 10^log10 fraction + signum whole * fraction) (e - log10 fraction)
escapeChar :: SParser Char
escapeChar = char '\\'
*> asum [
str "'" $> '\'',
str "\"" $> '"',
str "\\" $> '\\',
str "n" $> '\n',
str "r" $> '\r',
str "t" $> '\t',
str "b" $> '\b',
str "f" $> '\f',
str "u" *> (chr . fromIntegral . collectDigits 16 <$> replicateM 4 (digit 16))
]
jsonString :: SParser String
jsonString =
char '"'
*> many (asum [charIf (\c -> c /= '"' && c /= '\\'), escapeChar])
<* char '"'
jsonObjectPair :: SParser (String, JSON)
jsonObjectPair = (,) <$> (whitespace *> jsonString <* whitespace <* char ':') <*> json
json :: SParser JSON
json =
whitespace *>
asum [
Object <$> fromList <$> (char '{' *> jsonObjectPair `sepBy` char ',' <* char '}'),
Array <$> (char '[' *> json `sepBy` char ',' <* char ']'),
Bool <$> asum [str "true" $> True, str "false" $> False],
Number <$> jsonNumber,
String <$> jsonString,
Null <$ str "null"
]
<* whitespace
main :: IO ()
main = interact $ show . parse json
This parses numbers as my own weird Decimal type, in order to preserve all information (converting to Double is lossy). I didn’t bother implementing any methods on the Decimal, because there are other libraries that do that and we’re just writing a parser.
It’s also slow as hell but hey, that’s naive implementations for you!
It ended up being 113 lines. I think I could reduce it a bit more if I was willing to sacrifice readability and/or just inline things instead of implementing stdlib typeclasses.
Just looking at the image, yeah he’s a little parser combinator library entirely from scratch.
Not sure what you mean by verbose type declarations. It looks to be 2 type declarations in a few lines of code (a newtype for the parser and a sum type to represent the different types of JSON values). It’s really not much at all.
You gotta admit though, Haskell is crazy good for parsing and marshaling data
Yes. I’m divided into “hum… 100 lines is larger than I expected” and “what did he mean ‘from scratch’? did he write the parser combinators? if so, 100 lines is crazy small!”
But I’m settling in believing 80 of those lines are verbose type declarations.
I decided to write it myself for fun. I decided that “From Scratch” means:
readfrom PreludeHere is what I came up with (using my favourite parsing method: parser combinators):
This parses numbers as my own weird
Decimaltype, in order to preserve all information (converting toDoubleis lossy). I didn’t bother implementing any methods on theDecimal, because there are other libraries that do that and we’re just writing a parser.It’s also slow as hell but hey, that’s naive implementations for you!
It ended up being 113 lines. I think I could reduce it a bit more if I was willing to sacrifice readability and/or just inline things instead of implementing stdlib typeclasses.
So, ARE you bringing a girl?
I’m not coming to my parents for this new year’s because I might get arrested and/or sent to die in a war. But once Putin dies, yes, I am
Didn’t know where you were talking about til you said Putin.
So that’s two things to look forward to!
Just looking at the image, yeah he’s a little parser combinator library entirely from scratch.
Not sure what you mean by verbose type declarations. It looks to be 2 type declarations in a few lines of code (a newtype for the parser and a sum type to represent the different types of JSON values). It’s really not much at all.
Haskell is succinct.
You could probably write a very basic parser combinator library, enough to parse JSON, in 100 lines of Haskell
Judging by the Parser newtype, he did.
With recursive list comprehensions you can cram quite some complexity into one line of code.
serde has entered the chat
Personally I’m more partial to nom. Serde is quite verbose and complex for a parser.
From Scratch (as much as I like Rust, it’s very likely more verbose from scratch). Haskell is perfect for these kinds of things.
I will concede that implementing the first version in Haskell would be better.
Mostly so that we can then fulfil the meme of reimplementing it Rust!