Prelude
This is part 2 in a polyonymous series that seeks to be a companion guide for those learning Haskell through reading Real World Haskell. In case you missed it, here’s the full intro and part 1.
When we last finished, we ended up with a swell type for representing JSON data and some functions for converting from existing types like String to our new type and back again. We also thought about a couple more functions we’d need to complete our library, including one for converting our Haskell representation of a JSON type to real JSON. We tentatively gave this function the name toJson, with the logical type JsonType -> String. After all, the goal is to get from our custom type to something we can print to the screen or to a file, and we’d like our data in String form to do that.
There are a couple of advantages to keeping functions small and focused. Sure, we could write a function that takes a JsonType and prints it right away, instead of returning a String. But what if we later decide we want to compress the string before printing? That way it’d be smaller and thus take up less bandwidth if we need to send it over a network (a common practice with JSON-serialized data). What if we also want to provide an option for legibly-formatted, human-readable output?
If we try to account for these (and more) possibilities, our function soon becomes bloated. We’ll likely end up tearing pieces of it out to reuse anyway, so it’s better to keep in mind ahead of time that functions should do one thing. This is sound advice in any language, but there are a couple more especially enticing reasons to do it in Haskell:
- The type signature of a function can tell you a lot about it. We’ll see in this article how that gives you an advantage that even something like Visual Studio’s Intellisense is hard-pressed to match. If your function becomes bloated, trying to do too much, then you sacrifice this nicety.
- Haskell makes a bigger deal about side effects than most programming languages. A side effect occurs when a function affects state outside itself, that is, when it interacts with something it didn’t create. Such code can be tough to reason about, because the function cannot tell what else might be interacting with that same state. Successful Haskellers have found that properly controlling and segregating code that has side effects (like I/O) is a boon to maintainability. The general principle: code that doesn’t have to have side effects shouldn’t. That includes our example of getting parsed JSON data into a string suitable for printing. Eventually we do need to print it out, which is I/O, but constructing the string itself doesn’t involve I/O and can be done separately. We’ll continue to focus on this concept.
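The split described in the second point can be sketched in a few lines. The names greeting and printGreeting are hypothetical, chosen only for illustration; the thing to notice is that the type signatures alone reveal which function can perform I/O.

```haskell
-- Pure: the type String -> String leaves no room for I/O.
greeting :: String -> String
greeting name = "Hello, " ++ name ++ "!"

-- Impure: the IO in the result type marks where the side effect lives.
-- All the string building is delegated to the pure function above.
printGreeting :: String -> IO ()
printGreeting name = putStrLn (greeting name)
```

Because greeting never touches the outside world, it can be reused unchanged if the text later needs to go to a file or over a network instead of to the screen.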
How to say toString() in Haskell
Many mainstream object-oriented languages have a hierarchy of types such that all types are subtypes of a base “Object” class. Very frequently, this Object class (and thus all classes) has a method that is used to print a string representation of itself. In such a language, our JsonType might override this method so that when printed, JsonType values would come out looking like real chunks of JSON.
Haskell has similar functionality, but for a different purpose. Whereas the typical toString() method mentioned above is generally not of much use until overridden, and the format of its output unspecified, Haskell has a function called show that provides a more consistent approach. Specifically, calling show on a value will return a string that, if read back into Haskell, would yield the value passed. That is, its output is designed to be not just human-readable, but compiler-readable. For example, passing a String value to show will return it surrounded in double quotes and containing properly escaped special characters.
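A small sketch of that round trip, using only show and read from the Prelude. The binding names (quoted, roundTrip, numberText) are made up for this example.

```haskell
-- show renders a value as Haskell source text: the result includes the
-- surrounding double quotes and backslash escapes for the inner quotes
-- and the newline.
quoted :: String
quoted = show "say \"hi\"\n"

-- read parses that text back into the original value.
roundTrip :: Bool
roundTrip = read quoted == "say \"hi\"\n"

-- Numbers are rendered as they would be written in source code.
numberText :: String
numberText = show (3.5 :: Double)
```

Evaluating putStrLn quoted in GHCi prints the escaped form, which is exactly what you would type to write that string literal in a Haskell source file.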
We get a default implementation for show when we put “deriving Show” in the declaration of a new type. If you recall, we indeed did this for JsonType. That’s why GHCi is able to print out our JsonType values–GHCi always calls show on the result of an expression!
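To see what the derived show produces, here is a sketch. The JsonType declaration below is reconstructed from how the constructors are used in this article, so treat it as an assumption; the definition from part 1 may differ in detail.

```haskell
-- Assumed shape of the type from part 1 (reconstructed, not authoritative).
data JsonType
  = JsonString String
  | JsonNumber Double
  | JsonBool Bool
  | JsonNull
  | JsonObject [(String, JsonType)]
  | JsonArray [JsonType]
  deriving Show

-- The derived show spells out constructors just as we would type them.
derivedText :: String
derivedText = show (JsonArray [JsonBool True, JsonNull])
```

Here derivedText is the text JsonArray [JsonBool True,JsonNull], which is valid Haskell but clearly not valid JSON.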
If for some reason we aren’t satisfied with the show that Haskell gives a type automatically, we can define it ourselves. It’s recommended, though, that if we do so, it should behave as we described above: the output should be valid Haskell. Since we want our toJson function to produce valid JSON, not valid Haskell, we aren’t going to redefine JsonType’s show.
What the heck, man? Then why did you just tell me about show?
Fear not, show will still help us out! A couple of our JsonTypes are really Strings and Doubles, so once we extract them from the JsonString and JsonNumber constructors respectively, we can call show on the values, and Haskell will format them as strings of valid Haskell syntax. For everyday strings and numbers, JSON’s syntax and Haskell’s coincide (they diverge on edge cases such as non-ASCII characters, which show escapes in a way JSON doesn’t recognize), so we get these cases taken care of nearly for free:
toJson :: JsonType -> String
toJson (JsonString string) = show string
toJson (JsonNumber number) = show number
However, if we tried to do the same thing with JsonBool, we’d be in trouble: if you try show True in GHCi, you’ll see it returns “True”, and JSON wants lowercase booleans. Thus:
toJson (JsonBool True) = "true"
toJson (JsonBool False) = "false"
Likewise, trying show JsonNull in GHCi gives “JsonNull”, and that’s not what we want.
toJson JsonNull = "null"
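The following sketch lets you check for yourself where show's output happens to be valid JSON and where it is not. All binding names are hypothetical. The last two cases are edge cases that the toJson in this article does not handle.

```haskell
-- Cases where show already yields valid JSON text.
plainString :: String
plainString = show "tab\there"        -- quoted, with the tab escaped as \t

plainNumber :: String
plainNumber = show (42.5 :: Double)   -- the text 42.5

-- Cases where show yields Haskell syntax that JSON rejects.
haskellBool :: String
haskellBool = show True               -- capitalised; JSON wants true

nonAscii :: String
nonAscii = show "caf\233"             -- keeps the decimal escape \233

notANumber :: String
notANumber = show (0 / 0 :: Double)   -- the text NaN, not a JSON number
```

For ASCII text and ordinary finite numbers, leaning on show is a reasonable shortcut. A production-quality encoder would need its own escaping rules.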
Printing a JSON “object” is also a bit more complicated. In JSON, an object looks like this:
{ "some key": value, "another key": anotherValue }
In other words, curly braces surround the object, pairs are delimited by commas, and a pair consists of a string followed by a colon followed by a value (which can be of any JSON type). Another wrinkle is that there can be any number of pairs, including zero. This means we’re going to use a common functional programming pattern:
- Our primary function does the stuff that will happen for sure, and passes the processing that can vary to a helper function.
- The helper function needs to handle the cases of what to do when there’s useful input, and what to do when there isn’t. Commonly, input of this sort is a list, so we at least need to handle the empty list vs. a list with elements.
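As a generic illustration of the pattern above, here is a minimal sketch unrelated to JSON. The names shoppingList and listItems are invented for the example.

```haskell
-- The primary function does the part that always happens (the prefix)
-- and hands the variable part to a helper.
shoppingList :: [String] -> String
shoppingList items = "To buy: " ++ listItems items
  where
    listItems [] = "nothing"    -- the empty-list case
    listItems xs = unwords xs   -- the list-with-elements case
```

With this definition, shoppingList [] gives "To buy: nothing" and shoppingList ["eggs", "milk"] gives "To buy: eggs milk".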
In our case:
-- Remember, we defined JsonObject like this: JsonObject [(String, JsonType)]
toJson (JsonObject object) = "{" ++ toKeyValuePairs object ++ "}"
  where
    toKeyValuePairs [] = ""
    toKeyValuePairs pairs = -- ??
We’re off to a good start. Our top-level function is doing the part that will always be done: surrounding the object with curly braces. Our helper function to handle the rest is called toKeyValuePairs, and we define it using the common Haskellism of putting it in a where clause. When a function is supplementary like this, it doesn’t need to be available to the whole module, and it automatically has access to values introduced in the function it’s attached to. Perhaps most importantly, it allows our functions to read quite nicely, allowing a reader to digest a function a chunk at a time. It might look strange at first, but it’s a handy feature you’ll soon wish was in more languages.
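The claim that a where-bound helper can see the values of its enclosing function is easy to check with a small sketch. The function wrapWith is hypothetical and exists only to show the scoping.

```haskell
-- padded is local to wrapWith. It uses body directly, even though body
-- is an argument of the enclosing function and is never passed to it.
wrapWith :: String -> String -> String -> String
wrapWith open close body = open ++ padded ++ close
  where
    padded = " " ++ body ++ " "
```

For example, wrapWith "{" "}" "x" evaluates to "{ x }". One detail worth knowing: a where clause belongs to a single equation, so when a function is defined by several pattern-matching equations, each one needs its own where clause.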
Our helper function is primed and ready to handle any or no input. In fact, no input was easy: just return the empty string and we’re done. The other case takes a bit more work. Thinking of our algorithm, we need to turn our list of pairs into strings with the format “key”: value. Then we need to join those strings with commas.
-- ...
  where
    toKeyValuePairs pairs = joinWithCommas (map pairToString pairs)
    pairToString (key, value) = show key ++ ": " ++ toJson value

joinWithCommas :: [String] -> String
joinWithCommas stringList = -- ??
There’s a bit going on here, but it’s hopefully not too hard to follow. The part in parentheses comes first–the “turn list of pairs into list of properly formatted strings” part. For this, we use map, which you’ve probably come across before–it’s a functional programming building block. It applies a function to everything in a list, spitting out a new list containing the results of each application. We want each of our pairs turned into a string, so we’ll pass a function–let’s call it pairToString–and our list of pairs to map.
We then write pairToString. Remember, map will be passing it one element of the list of pairs at a time. We use pattern matching to break up the passed-in pair into its key and value parts. The key part needs to be printed as a properly escaped string in double quotes, so we’ll use the same trick we did earlier and use the show function. We then append the colon. Now we need to print out the value. The problem is, value can be any valid JSON type, and different types get printed out in different ways. If only we had a function that handled the printing of every JSON type! Oh wait, we’re writing it, and it’s called toJson :)
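To watch map and pairToString work together in isolation, here is a simplified stand-in. It assumes the values are plain Ints rather than JsonType, so show can replace the recursive call to toJson.

```haskell
-- Simplified: values are Ints, so show formats them directly.
pairToString :: (String, Int) -> String
pairToString (key, value) = show key ++ ": " ++ show value

-- map applies pairToString to each pair, giving one string per pair.
formatted :: [String]
formatted = map pairToString [("a", 1), ("b", 2)]
```

Printing each element of formatted with putStrLn shows "a": 1 and "b": 2, each key wrapped in double quotes courtesy of show.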
Now we need to join our resulting list of formatted strings with commas. It’s reasonable to assume that joining a list, delimiting elements with a certain thing, is already in the standard library (as it is in many languages, for strings at least). Problem is, we may not know exactly what that function is called. We could just write it ourselves, but there’s no sense duplicating effort when the standard library already does the job.
Earlier I mentioned how Haskell’s cool type system can help us know what a function does merely by knowing its type. In fact, an entire method of searching the Haskell API is built upon this concept. One website that uses this method to allow us to find functions is Hoogle.
So let’s go to Hoogle, and search for the type of the function we’re looking for. We want a function that takes a list of Strings to be joined, a String to join them with, and returns a single joined String as the result. In Haskell, we write that type as:
[String] -> String -> String
So type that into the search box in Hoogle.
The first result of the search is the function we want! It’s called intercalate, and the description says it “inserts the list xs in between the lists in xss and concatenates the result.” If you think about it, that’s exactly what we’re doing. Note that Hoogle found this function for us even though we didn’t type in its type exactly. Its actual type is [a] -> [[a]] -> [a]: we had the arguments swapped, and we specified String even though intercalate can really use any type of list, not just [Char] (which is all a String is). Cool, eh? Use Hoogle a lot.
Before we go using the function we found in our code, we should take note of where this function is defined. Unless it’s in the Prelude (the name of the Haskell module that’s imported by default), we have to explicitly import the module that defines it so that the function name can be found. This practice is common in many languages, as the standard library tends to be divided up into namespaces/modules/whatever. (An apology to PHP programmers, who are used to having every standard library function ever written smooshed into the default namespace.)
Hoogle notes in soothing green text that intercalate resides in the module Data.List, so add the following to the top of your source file:
import Data.List (intercalate)
Now we can finish up our joinWithCommas function:
joinWithCommas stringList = intercalate ", " stringList
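A quick sketch of intercalate in action, using only Data.List from the standard library. The bindings threeItems, noItems and numbers are example names.

```haskell
import Data.List (intercalate)

joinWithCommas :: [String] -> String
joinWithCommas stringList = intercalate ", " stringList

threeItems :: String
threeItems = joinWithCommas ["1", "2", "3"]      -- "1, 2, 3"

-- An empty list simply yields the empty string.
noItems :: String
noItems = joinWithCommas []                      -- ""

-- intercalate works on any list type, not only String.
numbers :: [Int]
numbers = intercalate [0] [[1, 2], [3], [4, 5]]  -- [1,2,0,3,0,4,5]
```

Since joinWithCommas already returns "" for an empty list, the explicit empty-list equations in our helpers are arguably optional. Keeping them makes the intent obvious to a reader, which is a fair trade.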
Now that printing JSON objects is taken care of, we’re just left with JSON arrays. The process will be almost exactly the same as for objects, except we don’t have to print keys, only values. Oh, and we use square brackets instead of curly braces.
toJson (JsonArray array) = "[" ++ toValues array ++ "]"
  where
    toValues [] = ""
    toValues values = joinWithCommas (map toJson values)
Again we use map, this time to apply our toJson function over all the values in the underlying list we’re using to represent our JSON array. Then we join the resulting properly formatted strings with the joinWithCommas function we just fleshed out.
Here, then, is the full toJson function:
toJson :: JsonType -> String
toJson (JsonString string) = show string
toJson (JsonNumber number) = show number
toJson (JsonBool True) = "true"
toJson (JsonBool False) = "false"
toJson JsonNull = "null"
toJson (JsonObject object) = "{" ++ toKeyValuePairs object ++ "}"
  where
    toKeyValuePairs [] = ""
    toKeyValuePairs pairs = joinWithCommas (map pairToString pairs)
    pairToString (key, value) = show key ++ ": " ++ toJson value
toJson (JsonArray array) = "[" ++ toValues array ++ "]"
  where
    toValues [] = ""
    toValues values = joinWithCommas (map toJson values)
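If you would like to load everything into GHCi in one go, the sketch below packages the function with the pieces it depends on. The JsonType declaration is an assumption reconstructed from the constructors used in this article, and sample is a made-up value for experimenting.

```haskell
import Data.List (intercalate)

-- Assumed shape of the type from part 1 (reconstructed, not authoritative).
data JsonType
  = JsonString String
  | JsonNumber Double
  | JsonBool Bool
  | JsonNull
  | JsonObject [(String, JsonType)]
  | JsonArray [JsonType]
  deriving Show

joinWithCommas :: [String] -> String
joinWithCommas stringList = intercalate ", " stringList

toJson :: JsonType -> String
toJson (JsonString string) = show string
toJson (JsonNumber number) = show number
toJson (JsonBool True) = "true"
toJson (JsonBool False) = "false"
toJson JsonNull = "null"
toJson (JsonObject object) = "{" ++ toKeyValuePairs object ++ "}"
  where
    toKeyValuePairs [] = ""
    toKeyValuePairs pairs = joinWithCommas (map pairToString pairs)
    pairToString (key, value) = show key ++ ": " ++ toJson value
toJson (JsonArray array) = "[" ++ toValues array ++ "]"
  where
    toValues [] = ""
    toValues values = joinWithCommas (map toJson values)

-- A hypothetical nested value that exercises every constructor.
sample :: JsonType
sample =
  JsonObject
    [ ("name", JsonString "Haskell")
    , ("tags", JsonArray [JsonBool True, JsonNull])
    , ("year", JsonNumber 1990)
    ]
```

Running putStrLn (toJson sample) in GHCi prints {"name": "Haskell", "tags": [true, null], "year": 1990.0}. The number comes out as 1990.0 because show on a Double always includes the fractional part.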
Not bad! Now that we’ve taken care of formatting the JSON data, we just need to actually print it. For the time being, we’ll write a quick function that will print to the screen:
printJson :: JsonType -> IO ()
printJson jsonValue = putStrLn (toJson jsonValue)
In the next article, we’ll play with different ways of rendering our JSON data.