
Fake World Haskell, Part 1

Introduction

Real World Haskell (RWH) is an amazing book. Along with Learn You a Haskell, it’s one of the main resources available for Haskell newbies/intermediates who are looking for a comprehensive guide. It especially focuses on using “everyday life” coding examples to show how things are done. Many concepts are taught by example in this way.

RWH is a big book. And it’s not just fluff: it’s big and concentrated–you can tell that it could easily be twice as big. I was unable to digest it on one reading, and have read many of the chapters many times. I’m not the sharpest knife in the drawer, but I’m an experienced programmer in other languages, and I’m sure there are others in my boat: programmers who are intrigued by Haskell and functional programming, but are looking for more resources to make the journey easier.

I’m writing this, therefore, to be a sort of “companion” to RWH. From it, I’m going to steal much of the organization and flow. The authors of RWH didn’t have the luxury of being able to go back and explain things they’d already explained, and as such I found that following the examples could be difficult. The goal of this text is to provide a more gentle guide through the examples presented in RWH.

Who is this for?

Hopefully, any Haskell newbie will benefit from reading this, even if they haven’t had any issues following along with RWH. But it’s especially geared toward those who have found RWH’s pace to be a little too rigorous, and wouldn’t mind some reinforcement about concepts already touched upon. I’m going to assume that syntax largely isn’t a problem; the first couple chapters of RWH cover most of what I think you’d need to know to follow along (and do so quite well). I won’t, however, assume total familiarity with the concepts the syntax represents (in other words, a datatype definition shouldn’t look weird, but it’s okay if you don’t know what a “data constructor” is).

I’m also going to break a longstanding tradition as far as variable names. As RWH points out, since function definitions in Haskell are short and succinct, short variable names often don’t hamper readability much once you are used to them. For me however, this was (and is) quite an adjustment. I believe the new reader will have plenty of opportunity to acclimate to this style without needing to do so while also trying to figure out what’s going on conceptually. I’m going to take care to avoid one-letter variables and abbreviations where possible.

Part 1: Companion to Chapter 5

I’m going to start with the first “real world” example in RWH, taking some time to re-explain stuff that was brought up earlier in the book as I see fit. This article will cover the definition of a datatype and implementation of accessor functions.

Onward to JSON

JSON (JavaScript Object Notation) is a mini-language designed to represent data, usually to store or transmit said data. It’s an alternative to column-based formats like CSV, and has a simple syntax. We’d like to be able to use Haskell to store data in this format, and then read it back again.

JSON has some notion of datatypes, based (unsurprisingly) on JavaScript’s types: strings, numbers, booleans, arrays, objects, and null.

It would make sense to have representations of those types in Haskell. Of course, most of them are trivial to implement, as they already exist in Haskell. We’ll make a new algebraic data type to represent any JSON type; let’s call it JsonType:

data JsonType =

Remember, “JsonType” is the type constructor, which will be how we refer to the type in the type signatures of functions. On the other side of the = we’ll define our data constructors, which are how we create JsonType data.

data JsonType = JsonString String
    | JsonNumber Double
    | JsonBool Bool
    | JsonNull

We can use these data constructors like functions in order to create JsonType values. We can see that the first 3 constructors are just wrappers for the existing Haskell types String, Double, and Bool. This means we can write something like JsonBool True to represent the true value that would appear in a JSON file.
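Because data constructors behave like ordinary functions, they can also be handed to other functions such as map. Here is a minimal sketch of that idea, assuming the four-constructor JsonType defined so far. The names prices and wrappedPrices are made up for this illustration, and the deriving clause (covered later in this article) is added only so the values can be printed:

```haskell
-- The type as defined so far. "deriving (Show)" is an addition for this
-- sketch; it only exists so that print can display the values.
data JsonType = JsonString String
    | JsonNumber Double
    | JsonBool Bool
    | JsonNull
    deriving (Show)

-- Hypothetical sample data for this sketch.
prices :: [Double]
prices = [1.5, 2.0]

-- JsonNumber has type Double -> JsonType, so it can be mapped over a list
-- just like any other one-argument function.
wrappedPrices :: [JsonType]
wrappedPrices = map JsonNumber prices

main :: IO ()
main = do
    print (JsonBool True)   -- prints: JsonBool True
    print wrappedPrices     -- prints: [JsonNumber 1.5,JsonNumber 2.0]
```

In GHCi, :type JsonNumber should report Double -> JsonType, which is exactly the signature an ordinary wrapping function would have.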

Why not just use the types we already have?

Types exist so that we can more exactly specify our intent, both for programmers (including ourselves), and for programming tools that can help prove the correctness of a given program. In C, for example, it is common to use a #define or an enum to give a name to an otherwise ambiguous value.

#define TRUE 1

Now we can more clearly express our intent in situations like flag = TRUE. This is a step in the right direction, but there is nothing stopping you from doing things like taking the square root of flag, because the name TRUE is really just veneer for an integer, and the compiler sees it as such.

We made a JsonType so we can distinctly operate on values associated with our task. The JSON language has strings, but does not care about the things Haskell strings/chars care about (length, case, etc.). What is significant to JSON is how they will be parsed and represented in comparison to numbers. From our standpoint, JSON’s strings and numbers are both JsonTypes that we will write functions to work on, sharing the fact that they represent a JSON value, and notwithstanding the fact that they bear resemblance to existing Haskell types.

On the other hand, it would be silly to not take advantage of similar functionality that would work just as well on a JsonString as a Haskell String, if we need to. For that reason, we’ll see it is painless to “extract” the Haskell value from its JsonString wrapper and play with it as needed.

Compound JSON Types

We haven’t yet talked about a couple of important JSON types: objects and arrays. These let us describe structured data (“dictionary style,” with key-value pairs, and laundry list style, respectively). We can represent compound types in our Haskell library quite simply:

-- ...
    | JsonArray [JsonType]

Whoa, let’s stop there already. On the surface, we’re doing the same thing as we did for JsonNumber and JsonBool: we make a data constructor, and make it take an existing Haskell type as a parameter (in this case, a list). But there’s an additional wrinkle: we need to state what kind of values this “JSONy list” should be able to hold. Well, in JSON, you can put any JSON type inside an array: a string, a boolean, even another array if you want. We need a way to express “this list can hold all JSON types.” And it just so happens we have a type that represents all JSON types, including JSON arrays themselves: JsonType! This is an example of a recursive type definition, or a type that refers to itself in its own definition. Don’t worry about the fact that we aren’t finished defining it yet!

And finally:

-- ...
    | JsonObject [(String, JsonType)]

This is also a recursive definition, as it’s a JsonType that contains a JsonType. If you recall, comma-delimited values inside parentheses represent a tuple (a fixed-sized collection of values). Thus we’re saying a JsonObject is a list of pairs with a String key and a JsonType value.
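To make the recursion concrete, here is a sketch of a nested value built from these constructors. The order data and the names sampleOrder and topLevelKeys are invented for illustration, and the type is repeated in full so the block compiles on its own:

```haskell
data JsonType = JsonString String
    | JsonNumber Double
    | JsonBool Bool
    | JsonNull
    | JsonArray [JsonType]
    | JsonObject [(String, JsonType)]

-- Roughly the JSON text:
--   {"customer": "Ann", "paid": false, "items": [9.5, 20, null]}
-- An object holds pairs, and one of the values is an array that in turn
-- holds more JsonType values.
sampleOrder :: JsonType
sampleOrder = JsonObject
    [ ("customer", JsonString "Ann")
    , ("paid",     JsonBool False)
    , ("items",    JsonArray [JsonNumber 9.5, JsonNumber 20, JsonNull])
    ]

-- Hypothetical helper: count the key-value pairs at the top level of an
-- object. Any value that is not a JsonObject counts as having zero keys.
topLevelKeys :: JsonType -> Int
topLevelKeys (JsonObject pairs) = length pairs
topLevelKeys _                  = 0

main :: IO ()
main = print (topLevelKeys sampleOrder)   -- prints: 3
```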

Let’s add default instances to our new type–Haskell can automatically give our new type the ability to be compared, sorted, and printed. We just have to say the words!

data JsonType = JsonString String
    | JsonNumber Double
    | JsonBool Bool
    | JsonNull
    | JsonArray [JsonType]
    | JsonObject [(String, JsonType)]
    deriving (Eq, Ord, Show)

The above complete type definition is suitable for some playin’ around in GHCi. Save it to a file (for example, SimpleJSON.hs) and open GHCi in the directory where you saved it. Use the :load command to tell it about your new definition.

ghci> :load SimpleJSON
[1 of 1] Compiling Main ( SimpleJSON.hs, interpreted )
Ok, modules loaded: Main.
ghci> :type JsonString "hello"
JsonString "hello" :: JsonType
ghci> 3.1 + JsonNumber 2.2
…error…

The important thing to note is that using a value constructor like JsonString creates a value of type JsonType. As the programmer who wrote the type, you know that it is a thin veneer over existing Haskell types, but other programmers who use the type don’t (and shouldn’t). For example, just because you know that a JsonNumber is basically a Double, trying to treat it like one (as in the last example above) is illegal.
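As a rough sketch of what the deriving clause buys us, the program below exercises the three derived abilities on a few made-up values. Nothing here goes beyond the type definition above; the sample values are only for illustration:

```haskell
data JsonType = JsonString String
    | JsonNumber Double
    | JsonBool Bool
    | JsonNull
    | JsonArray [JsonType]
    | JsonObject [(String, JsonType)]
    deriving (Eq, Ord, Show)

main :: IO ()
main = do
    -- Show: turn a value into text that mirrors how it was constructed.
    putStrLn (show (JsonArray [JsonNumber 1, JsonNull]))
    -- Eq: two values are equal when they use the same constructor and
    -- the wrapped contents are equal.
    print (JsonString "a" == JsonString "a")   -- True
    print (JsonBool True == JsonNull)          -- False
    -- Ord: values built with the same constructor compare by contents...
    print (JsonNumber 2 < JsonNumber 10)       -- True
    -- ...and otherwise a constructor listed earlier in the definition
    -- counts as smaller than one listed later.
    print (JsonString "zebra" < JsonNumber 0)  -- True
```

The derived ordering is mechanical rather than meaningful for JSON, but it is enough to let functions like sort work on a list of JsonType values.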

But um, so?

In its current form, our new type isn’t very useful; we can really only use JsonType values for making JsonArrays or JsonObjects (or with the automatic stuff Haskell did thanks to our deriving statement). Creating a library in Haskell (and in many languages) is a matter of creating the necessary types, and the operations that work on those types. That means we need to define some functions!

What functions do we need? Now’s a good time for a look at the Big Picture:

What is our goal?

I think it’s important to not lose focus of why we created JsonType in the midst of learning how to do it. So keeping in mind that our end goal is translating data to JSON and back again, it makes sense that we need some functions to help with this conversion.

Our JsonType serves as a good “programmy” representation of JSON; we can now write functions that read JSON-formatted files or streams (for example, customer order data from a website) and convert that text into Haskell by outputting JsonType values. Likewise, we can imagine writing functions that then extract the useful data out of JsonTypes, for actual use (for example, in order to total up the line items of a customer’s order, we need to get the “number part” out of JsonNumber so we can add them). Finally, we’ll probably want a way to spit out JSON to the outside world, and we can write functions that can output JSON when given data in JsonType form.

Let’s put that in words. Programmy words.

A great way to write programs in a language with an expressive type system is to express our goal via “function blueprints,” if you will–just writing the type signatures without worrying about implementation yet. Let’s take a shot at it, writing types for what we outlined as desired functionality:

-- Going from real JSON to our Haskell JsonType
fromJson :: String -> JsonType
-- Extracting actual data for manipulation
extractData :: JsonType -> ??
-- Saving data as JSON
toJson :: JsonType -> String

Not too bad so far. We can imagine how fromJson and toJson will work: given a string containing JSON-formatted text, we will figure out a way to parse the text and end up with a JsonType. Similarly, if we have a JsonType, we can figure out how to print it with JSON syntax. The tricky part is choosing a type for extracting values from our new type; after all, the result could be any of several different types (a Double, a String, a list of stuff, etc.). So we’ll need to break that down more:

extractString :: JsonType -> String
extractDouble :: JsonType -> Double
extractBool   :: JsonType -> Bool
extractObject :: JsonType -> [(String, JsonType)]  -- Remember, "object" in JSON terms is a collection of key-value pairs
extractArray  :: JsonType -> [JsonType]  -- Remember, we are representing a JSON array as a list

Now we’re getting somewhere! Oh wait, we forgot one of our JsonTypes: JsonNull. Looking at our type definition, we didn’t wrap an existing Haskell type to make a JsonNull; “null” can really only be one thing (the absence of a value), and therefore it’s not necessary to “extract” data out of it (we’d always get the same thing!). Of course, we do need a way of telling if we are dealing with a null value:

isNull :: JsonType -> Bool

Why boilerplate?

Users of dynamically-typed languages are probably not excited about having to write a function for every different type that can be extracted from a JsonType. Haskell recognizes that dealing with types sometimes requires more typing (no pun intended), and provides a shortcut or two to help out. The primary one is via record syntax, which I won’t talk about here. Suffice to say it’s quite easy to have Haskell automatically create “accessor functions” like our extract functions above for a type we define. So don’t worry too much!

Implementation

Now that we’ve sketched out some functions, we can start implementing them. During this process, we might think of other functions that would be nice to have. I personally like to pretend such methods existed, then go back and implement them once I’m done with whatever I was doing.

Stubbing out functions

It often happens that we want to compile our source file in order to test a particular function we’ve written. It’s probably the case that we want to do this before implementing every function we’ve dreamed up. We shouldn’t let unimplemented functions prevent us from a quick compile to test what we’ve already done!

It’s therefore useful to be able to “stub out” a function that we haven’t written yet. This is done very simply:

someFunction = undefined

You might say the special value undefined counts as any type, so code using it will compile even if you’ve already written the type signature for the stubbed function.
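Here is a small sketch of stubbing in practice, assuming the JsonType definition and the blueprint signatures from earlier. isNull is given a simple implementation purely so there is something to test, while fromJson and toJson are left as stubs:

```haskell
data JsonType = JsonString String
    | JsonNumber Double
    | JsonBool Bool
    | JsonNull
    | JsonArray [JsonType]
    | JsonObject [(String, JsonType)]
    deriving (Eq, Ord, Show)

-- Already implemented, so we can test it right away.
isNull :: JsonType -> Bool
isNull jsonData = jsonData == JsonNull

-- Not written yet. The type signatures are in place, and undefined
-- stands in for the bodies so the file still compiles.
fromJson :: String -> JsonType
fromJson = undefined

toJson :: JsonType -> String
toJson = undefined

main :: IO ()
main = do
    print (isNull JsonNull)            -- prints: True
    print (isNull (JsonNumber 23))     -- prints: False
```

The stubs only satisfy the compiler. If running code actually needs the result of fromJson or toJson, the program stops with an exception, so a stub is a placeholder and not a default implementation.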

Let’s start off by implementing our extractor functions, as they are straightforward. The key here is going to be pattern matching.

extractString :: JsonType -> String
extractString (JsonString innerString) = innerString

Remember, when we define functions in Haskell, we can not only specify parameters, but how those parameters were created. Our type signature says we accept a JsonType parameter, and in our definition we specify a pattern that says specifically we want a JsonType that was created via the JsonString data constructor. That’s already cool, but furthermore it allows us to give the underlying data the type was created with a name. That is, in this case, we can now name the Haskell String inside the JsonString and use it! And use it we do–by returning it as the value of our extractString function.

Error handling

We just made our first accessor function by matching a JsonType value that was created with the JsonString constructor. This works fine for situations like:

ghci> let valueParsed = JsonString "hello"
ghci> extractString valueParsed
"hello"

But try this:

ghci> let valueParsed = JsonNumber 23
ghci> extractString valueParsed

Whoops–our extractString function can take any sort of JsonType, so the above code compiles, but then it blows up when it realizes its parameter wasn’t created via the JsonString constructor.

Since our ultimate goal is to parse arbitrary JSON data, this won’t do. If extractString can’t get a string out of its argument, it doesn’t mean our whole program should crash; it just means we need to try a different extractor function. So clearly we need a way of indicating failure in a less drastic way, without throwing an exception. Many languages leave how to achieve this up to you: perhaps a special value is returned, like -1, or false, or null. Then future code has to know that special value and check for it explicitly.

Haskell has a nice solution to “returning null” that leverages the type system so there’s no guesswork in future code: we can write a function that maybe returns a useful value, and there’s an actual type for that! This means that code that uses the return value knows in advance that it’s only maybe going to get a useful value; it can’t blindly march forward without checking first. Hence, as long as you stick to this style, you sidestep the “null reference exception” problem that we’ve all dealt with in other languages.

So, how do we do this? Well, we need to find out how the JsonType that is passed in was created, and if it matches the JsonString constructor, then we return a useful value. If not, we return a failure value. Future code can then check if the failure value was returned, and if so, try a different extractor or whatever.

Ugh, does that mean boilerplate again?

We all hate writing “tests for null.” They are ubiquitous and as such lead to our code containing a lot of boilerplate if-statements. Fortunately, Haskell provides a neat way of hiding this particular breed of inelegance, and we will learn about it in the future. To avoid introducing too much at once, however, we’ll explicitly check for the time being.

Our extractor implementations, version 2

The type that represents “maybe a value” is called, interestingly enough, Maybe. Maybe is the name of the type constructor, just like JsonType is the name of our type’s type constructor. So we write our type signatures something like this:

extractString :: JsonType -> Maybe String

That reads pretty well, doesn’t it? Notice how we can still specify the type of the value that might be returned (String in this case).

Now we need to know how to create a Maybe value. Just like JsonType has a number of value constructors (JsonString, JsonNumber, etc.), Maybe has a couple as well. The constructor for making a value that represents “no useful value” is called Nothing. It’s kinda like our JsonNull constructor: it doesn’t wrap any other value, because there’s no value to wrap. The constructor for representing a useful value is called Just. Just takes a parameter–the value to wrap. Here is Maybe in action:

extractString :: JsonType -> Maybe String
extractString (JsonString innerString) = Just innerString
extractString somethingElse            = Nothing

So now it looks like we have two versions of our extractString function, and indeed we do: one clause for JsonStrings, and one for anything else. Haskell tries the clauses in order, from top to bottom. So if the function receives a JsonString, we create the “useful” Maybe value, wrapping the String we got. Otherwise, we just return Nothing.

As an (important) aside, note that we can use whatever name we want for “somethingElse.” This is because we aren’t trying to match an argument’s specific constructor; we’re basically just saying “match anything and give it the name somethingElse.” It is pointless (and can be misleading) to give a name to a value we aren’t going to use. In our extractor function, we don’t care about the value of somethingElse; we are simply interested in catching all values that weren’t created via JsonString. Therefore, Haskell programmers use the name _ (the underscore) in cases like this. It still matches everything, but makes it clear you aren’t interested in what was matched. We’ll use the underscore from now on for this purpose.

The rest of our extractors are similar:

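extractDouble :: JsonType -> Maybe Double
extractDouble (JsonNumber number) = Just number
extractDouble _                   = Nothing

extractBool :: JsonType -> Maybe Bool
extractBool (JsonBool bool)       = Just bool
extractBool _                     = Nothing

extractObject :: JsonType -> Maybe [(String, JsonType)]
extractObject (JsonObject object) = Just object
extractObject _                   = Nothing

extractArray :: JsonType -> Maybe [JsonType]
extractArray (JsonArray array)    = Just array
extractArray _                    = Nothing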
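To see how calling code might do the explicit checking mentioned earlier, here is a sketch that uses case to inspect each Maybe result. The helpers describe and totalNumbers are hypothetical names invented for this illustration; only extractString and extractDouble come from the article:

```haskell
data JsonType = JsonString String
    | JsonNumber Double
    | JsonBool Bool
    | JsonNull
    | JsonArray [JsonType]
    | JsonObject [(String, JsonType)]

extractString :: JsonType -> Maybe String
extractString (JsonString innerString) = Just innerString
extractString _                        = Nothing

extractDouble :: JsonType -> Maybe Double
extractDouble (JsonNumber number) = Just number
extractDouble _                   = Nothing

-- Hypothetical helper: try one extractor, and if it reports Nothing,
-- fall back to the next one.
describe :: JsonType -> String
describe jsonData =
    case extractString jsonData of
        Just text -> "string: " ++ text
        Nothing   ->
            case extractDouble jsonData of
                Just number -> "number: " ++ show number
                Nothing     -> "something else"

-- Hypothetical helper: add up the numbers in a list of JSON values,
-- skipping anything that is not a JsonNumber.
totalNumbers :: [JsonType] -> Double
totalNumbers []               = 0
totalNumbers (first : others) =
    case extractDouble first of
        Just number -> number + totalNumbers others
        Nothing     -> totalNumbers others

main :: IO ()
main = do
    putStrLn (describe (JsonString "hello"))   -- string: hello
    putStrLn (describe (JsonNumber 23))        -- number: 23.0
    putStrLn (describe JsonNull)               -- something else
    print (totalNumbers [JsonNumber 9.5, JsonString "n/a", JsonNumber 20])  -- 29.5
```

The nested case expressions are exactly the kind of boilerplate checking the article promises to tidy up later; they are spelled out here so nothing is hidden.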

We can also write our isNull function quite easily:

isNull jsonData = jsonData == JsonNull

That was easy since Haskell figured out how to compare one JsonType with another automatically, because in our type declaration we told it to derive Eq. We’ll talk about what Eq is later.
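For comparison, here is an alternative sketch of isNull that does not rely on Eq at all. It uses the same pattern-matching approach as the extractors, so it would work even for a type with no deriving clause:

```haskell
-- Deliberately no deriving clause: this version needs no Eq instance.
data JsonType = JsonString String
    | JsonNumber Double
    | JsonBool Bool
    | JsonNull
    | JsonArray [JsonType]
    | JsonObject [(String, JsonType)]

-- Match the JsonNull constructor directly; everything else is not null.
isNull :: JsonType -> Bool
isNull JsonNull = True
isNull _        = False

main :: IO ()
main = do
    print (isNull JsonNull)           -- prints: True
    print (isNull (JsonBool False))   -- prints: False
```

Both versions answer the same question. The == version is shorter, while the pattern-matching version makes no demands on the type beyond its constructors.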

Wrapping up part 1

To review, we’ve talked about making our own types, and what a type constructor and data constructor are. We’ve talked about the value of having distinct representations of distinct entities. You should feel comfortable defining accessor functions to get at data contained in a type, and using the Maybe type to represent a possible lack of value. We’ll continue next time with more from RWH, chapter 5.

Hacking Tutor, Part 2

Find part one here.

Let’s continue with familiarizing ourselves with how the shell works.

In the previous article, we talked about the ls command and using it to list files in a directory. One question I received about this was: what happens to that list after it’s printed?

An important thing to keep in mind is that most of the commands we are dealing with at this point are not interactive. This means that they do their job and then quit. In the case of ls, the program’s job is to spit out a list of the current files in the directory, and then it’s done: we move on to other commands. I’d liken it to requesting information by mail: you send a letter with your request, and some information is sent back to you. If that information changes in the future, you won’t know that unless you receive another letter informing you of it.

Let’s do an experiment to test this principle. In so doing, we will also learn an important Unix concept: redirection.

Open up your terminal. We’re going to create a directory to play around in.

We make a directory with the mkdir program. This name, although abbreviated, should make sense. The mkdir program needs to know what we want to call our new directory, so we pass it an argument to give it this information. (If you don’t remember what “passing an argument” means, read the previous article again.) You can call your new directory almost anything: however, let’s try to stick to letters and numbers and dashes (including spaces or special characters has some special rules involved that we don’t want to get into just yet).

mkdir testing-ground

The above example would make a directory called “testing-ground”. Where does it make it? Well remember, in the world of the shell, you are always located somewhere. As we mentioned in the last article, most shells will tell you your current location in the prompt (see this picture for a reminder of where your location is displayed). If we run the mkdir command as shown above, the “testing-ground” directory will be created inside our current location. If we are in the location “~/Documents”, the location (or path) of our new directory will be “~/Documents/testing-ground”.

What location do I start in? Upon opening a new terminal, most shells start you in your home directory. This is a special place that belongs to your user: you own it, and you can create and delete things inside it at will. Though Microsoft Windows actually does now have the concept of a home directory, many people either don’t use it, or don’t know they are using it. This is because on a standard Windows system, users and programs have traditionally not had to think much about permissions. Many users run with enough privileges that little stops them from deleting something important.

On the other hand, on a Unix system like Linux or OS X, if you try to delete a file that is important to the entire system, you will be denied with a stern message. Serves you right.

The “home directory” is so common in Unix systems, that shells provide a handy abbreviation for it, so that you don’t have to type out its full name: ~. If you look at that picture again, you’ll see my location is “~/Documents”–in other words, I’m located in the Documents directory inside my home folder.

Alright, so we issued the mkdir command and now have a new directory. Now we need to go inside of it! To do so, we will use the main method of changing our location: cd. This stands for “change directory,” and indeed that is exactly what it does. The cd program also usually needs to know where we want to go: in this case, we want to go to our newly-made directory, so we pass it an argument with the directory’s name:

cd testing-ground

We are now in the new directory. Let’s see what’s inside it, by doing our familiar ls command:

ls -l

It tells us that the directory is empty; this should make sense, since we only just created it!

Let’s put something inside of it. How are we going to do this? Well, we could copy a file from somewhere else, or we could move a file from somewhere else, or we could create a new file altogether.

Let’s make a new, empty file by doing the following:

>jordanrules

Using a > to make a new file might seem weird. Don’t worry, we’ll explain it in a moment.

Let’s ask our friendly friend ls to list the files in the directory again, and see if it sees our new file:

ls -l

Sure enough, jordanrules is now showing up!

The echo command. Let’s take a moment to play with the echo program. It’s one of the simplest programs: you tell it something, and it tells you that same thing back! Indeed, it’s aptly named, since it’s like talking when there’s an echo: your echo will repeat exactly what you said. Let’s try it:

echo hello
echo echo echo echo
echo RENEW!

Useless? Not at all! This is the most important part of this article. If you understand nothing else from this rambling, understand the following thing:

Commands can be chained together.

Let’s say that one more time, for emphasis:

Commands can be chained together.

We are going to do our first “cool thing” in the Unix shell now, by chaining the echo command, which we just played with.

echo hello >jordanrules

So now we come to what that > sign really means: it’s one way of chaining, called redirection! What we just did was take the result, or output, of the echo command and put it into the “jordanrules” file.

Don’t believe me? Let’s open the file and check!

There are lots of ways to open a file. Let’s use the program less to do it. Type:

less jordanrules

Sure enough, we see that the file now contains the word “hello”. (Press q to get out of the less program)

Let’s do this again, with another phrase:

echo blah blah blah >jordanrules

Now check the file again:

less jordanrules

That’s interesting! Our old word, “hello,” is gone entirely, and the file now only contains “blah blah blah”. What if we didn’t want to replace the file’s contents, but just add to it?

We can use a different chaining operator for that. (The term “operator” comes from math, where operators are things like + and -.) Here’s an example:

echo cheerio >>jordanrules

Now open the file with less again:

less jordanrules

Sure enough, the file contains what it did before, plus our new word. Neat, eh?

I know what you’re thinking. “What if I wanted to put the phrase ‘jordan > you’ in the file?” Obviously, we can’t just type echo jordan > you >jordanrules because the > sign is special! The shell would think we are trying to chain the echo command into a file called “you”. But don’t worry, the solution is easy:

echo 'jordan > you' >jordanrules

See what we did there? We put the phrase in single-quotes. In the shell, anything you enclose in single quotes is treated “literally”–this means you can use special symbols etc. in your phrase, and they won’t be treated specially.

For this reason, it’s a good habit to always put phrases that you give to echo inside of single-quotes, at least for what we are using it for now.

Anyway, feel free to play around with creating different files, putting stuff in them, making new directories, and moving around with cd. We’ll talk more about other commands (and of course, more about chaining) in the next article!