Pattern matching in Haskell

These notes discuss the Haskell syntax for function definitions. Given the central role that functions play in Haskell, these aspects of Haskell syntax are fundamental.

Pattern matching consists of specifying patterns to which some data should conform, then checking to see if it does, and de-constructing the data according to those patterns.

When defining functions, you can define separate function bodies for different patterns. You can pattern match on any data type – numbers, characters, lists, tuples, etc. This leads to a very expressive code that is also simple and readable.

Let’s make a really trivial function that checks if the number we supplied to it is a 7 or not

ghci 1> let {lucky :: (Integral a) => a -> String;
		lucky 7 = "LUCKY NUMBER SEVEN!";
		lucky x = "Sorry, you’re out of luck, pal!"}

When you call lucky, the patterns will be checked from top to bottom and when the argument of the function conforms to a pattern, the corresponding function body will be used.

The only way a number can conform to the first pattern here is if it is 7. If it’s not, it falls through to the second pattern, which matches anything and binds it to x.

ghci 2> lucky 7
ghci 3> lucky 19
		"Sorry, you’re out of luck, pal!"

This function could have also been implemented by using an if statement.

But what if we wanted a function that says the numbers from 1 to 5 and says “Not between 1 and 5�? for any other number? Without pattern matching, we’d have to make a pretty convoluted if then else tree. However, with pattern matching:

ghci 4> let {sayMe :: (Integral a) => a -> String;
		sayMe 1 = "One!";
		sayMe 2 = "Two!";
		sayMe 3 = "Three!";
		sayMe 4 = "Four!";
		sayMe 5 = "Five!";
		sayMe x = "Not between 1 and 5"}
ghci 5> sayMe 1
ghci 6> sayMe 5
ghci 7> sayMe 6
"Not between 1 and 5"

Note that if we moved the last pattern (the catch-all one) to the top, it would always say “Not between 1 and 5�? because it would catch all the numbers and they wouldn’t have a chance to fall through and be checked for any other patterns.

The Haskell syntax for functions is much cleaner when we’re not in ghci. For example, you could create a file sayMe.hs in your working ghci directory (run the command :!pwd in ghci to determine that directory if you’re using a Linux or Mac machine) and type the following code in:

sayMe':: (Integral a) => a -> String
sayMe' 1 = "One!"
sayMe' 2 = "Two!"
sayMe' 3 = "Three!"
sayMe' 4 = "Four!"
sayMe' 5 = "Five!"
sayMe' x = "Not between 1 and 5"

Note that we do not use a let keyword, semicolons or curly braces; white space (block indentation) is however significant. To load the function in ghci, just run the following command:

ghci 8> : l sayMe
ghci 9> sayMe0 5
ghci 10> sayMe0 6
"Not between 1 and 5"

Implementing factorial again

Remember the factorial function we implemented previously? We defined the factorial of a number n as a product [1 . . n].

ghci 11> let factorial0 n = product [1 . . n]
ghci 12> factorial0 3
ghci 13> factorial0 6

We can also define a factorial function recursively, the way it is usually defined in mathematics. We start by saying that the factorial of 0 is 1. Then we state that the factorial of any positive integer is that integer multiplied by the factorial of its predecessor

Here’s what that looks like translated in Haskell terms.

ghci 14> let {factorial :: (Integral a) => a -> a;
factorial 0 = 1;
factorial n = n ∗ factorial (n − 1)}
ghci 15> factorial 0
ghci 16> factorial 3
ghci 17> factorial 53

This is the first time we’ve defined a function recursively. Recursion is important in Haskell and we’ll take a closer look at it later.

But in a nutshell, this is what happens if we try to get the factorial of, say, 3:

  • ghci tries to compute 3 * factorial 2
  • factorial 2 is 2 * factorial 1, so for now we have 3 * (2 * factorial 1)
  • factorial 1 is 1 * factorial 0, so we have 3 * (2 * (1 * factorial 0))
  • nowhere comes the trick: we’ve defined factorial 0 to be just 1 and because it encounters that pattern before the catch-all one, it just returns 1
  • so the final result is equivalent to 3 * (2 * (1 * 1))

Had we written the second pattern on top of the first one, it would catch all numbers, including 0 and our calculation would never terminate.

That's why the order is important when specifying patterns and it’s always best to specify the most specific ones first and then the more general ones later.

Pattern matching failures

Pattern matching can also fail. If we define a function like this:

ghci 18> let {charName :: Char -> String;
		charName ’a’ = "AntMan";
		charName ’b’ = "Black Widow";
		charName ’c’ = "Captain America"}

And then try to call it with an input that we didn’t expect, this is what happens:

ghci 19> charName ’a’
ghci 20> charName ’b’
"Black Widow"
ghci 21> charName ’h’

ghci complains that we have non-exhaustive patterns, and rightfully so. When making patterns, we should always include a catch-all pattern so that our program doesn’t crash if we get some unexpected input.

ghci 22> let {charName' :: Char -> String;
		charName' ’a’ = "AntMan";
		charName' ’b’ = "Black Widow";
		charName' ’c’ = "Captain America";
		charName' = "I am not sure of this Avenger, he must be from DC""}
ghci 23> charName0 ’a’
ghci 24> charName' ’b’
"Black Widow"
ghci 25> charName0 ’h’
"I am not sure of this Avenger, he must be from DC"

Pattern matching on tuples

What if we wanted to make a function that takes two vectors in a 2D space (that are in the form of pairs) and adds them together?

To add two vectors together, we add their x components and their y components separately

Here’s how we would have done it if we didn’t know about pattern matching:

ghci 26> let {addVectors' :: (Num a) => (a, a) -> (a, a) -> (a, a);
addVectors' a b = (fst a + fst b,snd a + snd b)}
ghci 27> addVectors' (1, 2.3) (1, 1)
		(2.0, 3.3)

Well, that works, but there’s a better way to do it. Let’s modify the function so that it uses pattern matching.

ghci 28> let {addVectors :: (Num a) => (a, a) -> (a, a) -> (a, a);
		addVectors (x1, y1) (x2, y2) = (x1 + x2, y1 + y2)}

Note that this is already a catch-all pattern. The type of addVectors (in both cases) is addVectors :: (Num a) => (a, a) -> (a, a) -> (a, a), so we are guaranteed to get two pairs as parameters.

ghci 29> : t addVectors
		addVectors :: Num a => (a, a) -> (a, a) -> (a, a)
ghci 30> addVectors (1, 2) (3, 4.2)
		(4.0, 6.2)

Generalizing fst and snd

fst and snd extract the components of pairs. But what about triples? There are no provided functions that do that but we can make our own.

ghci 31> let {first :: (a, b, c) -> a;
		first (x, , ) = x;
		second :: (a, b, c) -> b;
		second ( , y, ) = y;
		third :: (a, b, c) -> c;
		third ( , , z) = z}

The means the same thing as it does in list comprehensions: we don’t care what that part is.

ghci 32> first (1, 2, 3)
ghci 33> second (1, 2, 3)
ghci 34> third (1, 2, 3)

Note the error if we apply these functions to pairs:

ghci 35> first (1, 2)

Pattern matching in list comprehensions

ghci 36> let xs = [ (1, 3),(4, 3),(2, 4),(5, 3),(5, 6),(3, 1)]
ghci 37> [a + b | (a, b) �? xs]
[4, 7, 6, 8, 11, 4]

Should a pattern match fail, we just move on to the next element in the input list. Consider this example:

ghci 38> let xs = [ ("abc", "def"),("ghi", ""),("ABC", "DEF"),("", "GHI"),("the", "end")]
ghci 39> [[y] ++ ". " ++ [z] ++ "." | ((y : ys),(z : zs)) �? xs]
["a. d.", "A. D.", "t. e."]

Lists in pattern matching

Lists themselves can be used in pattern matching. We can match with the empty list [ ] or any pattern that involves: and the empty list, but since [1, 2, 3] is just syntactic sugar for 1: 2 : 3 : [ ], we can also use this pattern.

A pattern like x: xs will bind the head of the list to x and the rest of it to xs, even if there’s only one element so xs ends up being an empty list.

The x: xs pattern is used a lot, especially with recursive functions. But patterns that have: in the only match against lists of length 1 or more.

If you want to bind, say, the first three elements to variables and the rest of the list to another variable, you can use something like x: y: z: zs. It will only match against lists that have three elements or more.

We can’t use ++ in pattern matches (ambiguous patterns).

Implementing the head function :

Now that we know how to pattern-match against lists, let’s make our own implementation of the head function.

ghci 40> let {head' :: [a] -> a;
		head' [ ] = error "Can’t call head on an empty list, dummy!";
		head' (x: _) = x}
ghci 41> head' [4, 5, 6]
ghci 42> head' "Hello"
ghci 43> head' [ ]
* * * Exception : Can't call head on an empty list, dummy!

Note that if you want to bind to several variables (even if one of them is just and doesn’t actually bind at all), we have to put them in parentheses.

Also, note the error function that we used: it takes a string and generates a runtime error, using that string as information about what kind of error occurred.

ghci 44> : t error
error :: [Char] -> a
ghci 45> : i error
error :: [Char] -> a -- Defined in ‘GHC.Err’
ghci 46> :! hoogle -- info error
Prelude error :: [Char] -> a
error stops execution and displays an error message.
From package base error :: [Char] -> a

Note that error causes the program to crash, so use it sparingly

Another example: pattern matching with lists of different lengths :

Let’s make a trivial function that tells us some of the first elements of the list in (in)convenient English form.

ghci 47> let {tell :: (Show a) => [a] -> String;
		tell [ ] = "The list is empty";
		tell (x : [ ]) = "The list has one element: " ++ show x;
		tell (x : y : [ ]) = "The list has two elements: " ++ show x ++ " and " ++ show y;
		tell (x : y: _) = "This list is long. The first two elements are: "
		++ show x ++ " and " ++ show y}
ghci 48> tell [ ]
"The list is empty"
ghci 49> tell "h"
"The list has one element: ’h’"
ghci 50> tell "he"
"The list has two elements: ’h’ and ’e’"
ghci 51> tell "hello"
"This list is long. The first two elements are: ’h’ and ’e’"

This function is safe because it takes care of the empty list, a singleton list, a list with two elements, and a list with more than two elements.

Note that (x : [ ]) and (x : y : [ ]) could be rewritten as [x] and [x, y] – we don’t need parentheses in this case.

We can’t rewrite (x : y: _) with square brackets because it matches any list of length 2 or more.

Implementing length again :

We already implemented our own length function using list comprehension. Now we’ll do it by using pattern matching and a little recursion:

ghci 52> let {length' :: (Num b) => [a] -> b;
		length' [ ] = 0;
		length' ( : xs) = 1 + length0 xs}

This is similar to the factorial function we wrote earlier. First, we defined the result of a known input – the empty list. This is also known as the edge condition (a.k.a. the base condition).

Then in the second pattern, we take the list apart by splitting it into a head and a tail. We say that the length is equal to 1 plus the length of the tail. We use to match the head because we don’t actually care what it is.

Also note that we’ve taken care of all possible patterns of a list: the first pattern matches an empty list and the second one matches anything that isn’t an empty list.

ghci 53> length' [ ]
ghci 54> length' "hello"
ghci 55> length' "hello world"

Let’s see what happens if we call length' on "ham".

ghci 56> length' "ham"
  • first, it will check if it’s an empty list; because it isn’t, it falls through to the second pattern
  • it matches on the second pattern and there it says that the length is 1 + length0 "am", because we broke it into a head and a tail and discarded the head
  • length0 "am" is similarly 1 + length0 "m"; so right now we have 1 + (1 + length0 "m")
  • length' "m" is 1 + length' "" (equivalently, 1 + length' [ ]), and we’ve defined length' [ ] to be 0
  • so in the end, we have 1 + (1 + (1 + 0))

Implementing sum :

Assume that the sum of an empty list is 0. We write that down as a pattern. And we also know that the sum of a list is the head plus the sum of the rest of the list.

ghci 57> let {sum' :: (Num a) => [a] -> a;
		sum' [ ] = 0;
		sum' (x : xs) = x + sum' xs}
ghci 58> sum' [ ]
ghci 59> sum' [1 .. 10]
ghci 60> sum' [1 . . 100]

As-patterns in Haskell

As-patterns are a useful way of breaking something up according to a pattern and binding it to names while still keeping a reference to the whole thing. We do that by putting a name and an @ in front of a pattern.

For instance, the pattern [email protected](x : y : ys) will match exactly the same thing as x : y : ys, but you can easily get the whole list via xs instead of repeating yourself by typing out x : y : ys in the function body again.

ghci 61> let {firstLetter :: String -> String;
		firstLetter "" = "Empty string, whoops!";
		firstLetter [email protected](x : xs) = "The first letter of " ++ all ++ " is " ++ [x]}
ghci 62> firstLetter "Dracula"
		"The first letter of Dracula is D"
ghci 63> firstLetter "Tonsil"
		"The first letter of Tonsil is T"
ghci 64> firstLetter "tonsil"
		"The first letter of tonsil is t"
ghci 65> firstLetter ""
		"Empty string, whoops!"
ghci 66> firstLetter [ ]
		"Empty string, whoops!"
About Author :

I am Pavankumar, Having 8.5 years of experience currently working in Video/Live Analytics project.

Comment / Suggestion Section
Point our Mistakes and Post Your Suggestions