forked from learnyouahaskell/learnyouahaskell.github.io
-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy pathmodules.html
More file actions
1330 lines (1314 loc) · 72.3 KB
/
modules.html
File metadata and controls
1330 lines (1314 loc) · 72.3 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
"https://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title>Modules - Learn You a Haskell for Great Good!</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<base href="">
<style type="text/css">
@import url('reset.css');
@import url('style.css');
</style>
<link rel="shortcut icon" href="assets/images/favicon.png" type="image/png">
<link rel="prev" href="higher-order-functions.html">
<link rel="next" href="making-our-own-types-and-typeclasses.html">
<link type="text/css" rel="stylesheet" href="sh/Styles/SyntaxHighlighter.css">
<link href="rss.php" rel="alternate" type="application/rss+xml" title="Learn You a Haskell for Great Good! feed">
</head>
<body class="introcontent">
<div class="bgwrapper">
<div id="content">
<div class="footdiv" style="margin-bottom:25px;">
<ul>
<li style="text-align:left">
<a href="higher-order-functions.html" class="prevlink">Higher Order Functions</a>
</li>
<li style="text-align:center">
<a href="chapters.html">Table of contents</a>
</li>
<li style="text-align:right">
<a href="making-our-own-types-and-typeclasses.html" class="nxtlink">Making Our Own Types and Typeclasses</a>
</li>
</ul>
</div>
<h1 id="modules">Modules</h1>
<h2 id="loading-modules">Loading modules</h2>
<p><img src="assets/images/modules/modules.png" class="right"
width="230" height="162" alt="modules" /></p>
<p>A Haskell module is a collection of related functions, types and
typeclasses. A Haskell program is a collection of modules where the main
module loads up the other modules and then uses the functions defined in
them to do something. Having code split up into several modules has
quite a lot of advantages. If a module is generic enough, the functions
it exports can be used in a multitude of different programs. If your own
code is separated into self-contained modules which don’t rely on each
other too much (we also say they are loosely coupled), you can reuse
them later on. It makes the whole deal of writing code more manageable
by having it split into several parts, each of which has some sort of
purpose.</p>
<p>The Haskell standard library is split into modules, each of them
contains functions and types that are somehow related and serve some
common purpose. There’s a module for manipulating lists, a module for
concurrent programming, a module for dealing with complex numbers, etc.
All the functions, types and typeclasses that we’ve dealt with so far
were part of the <code>Prelude</code> module, which is imported by
default. In this chapter, we’re going to examine a few useful modules
and the functions that they have. But first, we’re going to see how to
import modules.</p>
<p>The syntax for importing modules in a Haskell script is
<code>import <module name></code>. This must be done before
defining any functions, so imports are usually done at the top of the
file. One script can, of course, import several modules. Just put each
import statement into a separate line. Let’s import the
<code>Data.List</code> module, which has a bunch of useful functions for
working with lists and use a function that it exports to create a
function that tells us how many unique elements a list has.</p>
<pre class="haskell:hs"><code>import Data.List
numUniques :: (Eq a) => [a] -> Int
numUniques = length . nub</code></pre>
<p>When you do <code>import Data.List</code>, all the functions that
<code>Data.List</code> exports become available in the global namespace,
meaning that you can call them from wherever in the script.
<code>nub</code> is a function defined in <code>Data.List</code> that
takes a list and weeds out duplicate elements. Composing
<code>length</code> and <code>nub</code> by doing
<code>length . nub</code> produces a function that’s the equivalent of
<code>\xs -> length (nub xs)</code>.</p>
<p>You can also put the functions of modules into the global namespace
when using GHCI. If you’re in GHCI and you want to be able to call the
functions exported by <code>Data.List</code>, do this:</p>
<pre class="haskell:ghci"><code>ghci> :m + Data.List</code></pre>
<p>If we want to load up the names from several modules inside GHCI, we
don’t have to do <code>:m +</code> several times, we can just load up
several modules at once.</p>
<pre class="haskell:ghci"><code>ghci> :m + Data.List Data.Map Data.Set</code></pre>
<p>However, if you’ve loaded a script that already imports a module, you
don’t need to use <code>:m +</code> to get access to it.</p>
<p>If you just need a couple of functions from a module, you can
selectively import just those functions. If we wanted to import only the
<code>nub</code> and <code>sort</code> functions from
<code>Data.List</code>, we’d do this:</p>
<pre class="haskell:hs"><code>import Data.List (nub, sort)</code></pre>
<p>You can also choose to import all of the functions of a module except
a few select ones. That’s often useful when several modules export
functions with the same name and you want to get rid of the offending
ones. Say we already have our own function that’s called
<code>nub</code> and we want to import all the functions from
<code>Data.List</code> except the <code>nub</code> function:</p>
<pre class="haskell:hs"><code>import Data.List hiding (nub)</code></pre>
<p>Another way of dealing with name clashes is to do qualified imports.
The <code>Data.Map</code> module, which offers a data structure for
looking up values by key, exports a bunch of functions with the same
name as <code>Prelude</code> functions, like <code>filter</code> or
<code>null</code>. So when we import <code>Data.Map</code> and then call
<code>filter</code>, Haskell won’t know which function to use. Here’s
how we solve this:</p>
<pre class="haskell:hs"><code>import qualified Data.Map</code></pre>
<p>This makes it so that if we want to reference <code>Data.Map</code>’s
<code>filter</code> function, we have to do
<code>Data.Map.filter</code>, whereas just <code>filter</code> still
refers to the normal <code>filter</code> we all know and love. But
typing out <code>Data.Map</code> in front of every function from that
module is kind of tedious. That’s why we can rename the qualified import
to something shorter:</p>
<pre class="haskell:hs"><code>import qualified Data.Map as M</code></pre>
<p>Now, to reference <code>Data.Map</code>’s <code>filter</code>
function, we just use <code>M.filter</code>.</p>
<p>Use <a
href="https://downloads.haskell.org/~ghc/latest/docs/html/libraries/">this
handy reference</a> to see which modules are in the standard library. A
great way to pick up new Haskell knowledge is to just click through the
standard library reference and explore the modules and their functions.
You can also view the Haskell source code for each module. Reading the
source code of some modules is a really good way to learn Haskell and
get a solid feel for it.</p>
<p>To search for functions or to find out where they’re located, use <a
href="https://hoogle.haskell.org/">Hoogle</a>. It’s a really awesome
Haskell search engine, you can search by name, module name or even type
signature.</p>
<h2 id="data-list">Data.List</h2>
<p>The <code>Data.List</code> module is all about lists, obviously. It
provides some very useful functions for dealing with them. We’ve already
met some of its functions (like <code>map</code> and
<code>filter</code>) because the <code>Prelude</code> module exports
some functions from <code>Data.List</code> for convenience. You don’t
have to import <code>Data.List</code> via a qualified import because it
doesn’t clash with any <code>Prelude</code> names except for those that
<code>Prelude</code> already steals from <code>Data.List</code>. Let’s
take a look at some of the functions that we haven’t met before.</p>
<p><code class="label function">intersperse</code> takes an element and
a list and then puts that element in between each pair of elements in
the list. Here’s a demonstration:</p>
<pre class="haskell:ghci"><code>ghci> intersperse '.' "MONKEY"
"M.O.N.K.E.Y"
ghci> intersperse 0 [1,2,3,4,5,6]
[1,0,2,0,3,0,4,0,5,0,6]</code></pre>
<p><code class="label function">intercalate</code> takes a list and a
list of lists. It then inserts that list in between all those lists and
then flattens the result.</p>
<pre class="haskell:ghci"><code>ghci> intercalate " " ["hey","there","folks"]
"hey there folks"
ghci> intercalate [0,0,0] [[1,2,3],[4,5,6],[7,8,9]]
[1,2,3,0,0,0,4,5,6,0,0,0,7,8,9]</code></pre>
<p><code class="label function">transpose</code> transposes a list of
lists. If you look at a list of lists as a 2D matrix, the columns become
the rows and vice versa.</p>
<pre class="haskell:ghci"><code>ghci> transpose [[1,2,3],[4,5,6],[7,8,9]]
[[1,4,7],[2,5,8],[3,6,9]]
ghci> transpose ["hey","there","folks"]
["htf","eho","yel","rk","es"]</code></pre>
<p>Say we have the polynomials <em>3x<sup>2</sup> + 5x + 9</em>,
<em>10x<sup>3</sup> + 9</em> and <em>8x<sup>3</sup> + 5x<sup>2</sup> + x
- 1</em> and we want to add them together. We can use the lists
<code>[0,3,5,9]</code>, <code>[10,0,0,9]</code> and
<code>[8,5,1,-1]</code> to represent them in Haskell. Now, to add them,
all we have to do is this:</p>
<pre class="haskell:ghci"><code>ghci> map sum $ transpose [[0,3,5,9],[10,0,0,9],[8,5,1,-1]]
[18,8,6,17]</code></pre>
<p>When we transpose these three lists, the third powers are then in the
first row, the second powers in the second one and so on. Mapping
<code>sum</code> to that produces our desired result.</p>
<p><img src="assets/images/modules/legolists.png" class="left"
width="230" height="212" alt="shopping lists" /></p>
<p><code class="label function">foldl'</code> and <code
class="label function">foldl1'</code> are stricter versions of their
respective lazy incarnations. When using lazy folds on really big lists,
you might often get a stack overflow error. The culprit for that is that
due to the lazy nature of the folds, the accumulator value isn’t
actually updated as the folding happens. What actually happens is that
the accumulator kind of makes a promise that it will compute its value
when asked to actually produce the result (also called a thunk). That
happens for every intermediate accumulator and all those thunks overflow
your stack. The strict folds aren’t lazy buggers and actually compute
the intermediate values as they go along instead of filling up your
stack with thunks. So if you ever get stack overflow errors when doing
lazy folds, try switching to their strict versions.</p>
<p><code class="label function">concat</code> flattens a list of lists
into just a list of elements.</p>
<pre class="haskell:ghci"><code>ghci> concat ["foo","bar","car"]
"foobarcar"
ghci> concat [[3,4,5],[2,3,4],[2,1,1]]
[3,4,5,2,3,4,2,1,1]</code></pre>
<p>It will just remove one level of nesting. So if you want to
completely flatten <code>[[[2,3],[3,4,5],[2]],[[2,3],[3,4]]]</code>,
which is a list of lists of lists, you have to concatenate it twice.</p>
<p>Doing <code class="label function">concatMap</code> is the same as
first mapping a function to a list and then concatenating the list with
<code>concat</code>.</p>
<pre class="haskell:ghci"><code>ghci> concatMap (replicate 4) [1..3]
[1,1,1,1,2,2,2,2,3,3,3,3]</code></pre>
<p><code class="label function">and</code> takes a list of boolean
values and returns <code>True</code> only if all the values in the list
are <code>True</code>.</p>
<pre class="haskell:ghci"><code>ghci> and $ map (>4) [5,6,7,8]
True
ghci> and $ map (==4) [4,4,4,3,4]
False</code></pre>
<p><code class="label function">or</code> is like <code>and</code>, only
it returns <code>True</code> if any of the boolean values in a list is
<code>True</code>.</p>
<pre class="haskell:ghci"><code>ghci> or $ map (==4) [2,3,4,5,6,1]
True
ghci> or $ map (>4) [1,2,3]
False</code></pre>
<p><code class="label function">any</code> and <code
class="label function">all</code> take a predicate and then check if any
or all the elements in a list satisfy the predicate, respectively.
Usually we use these two functions instead of mapping over a list and
then doing <code>and</code> or <code>or</code>.</p>
<pre class="haskell:ghci"><code>ghci> any (==4) [2,3,5,6,1,4]
True
ghci> all (>4) [6,9,10]
True
ghci> all (`elem` ['A'..'Z']) "HEYGUYSwhatsup"
False
ghci> any (`elem` ['A'..'Z']) "HEYGUYSwhatsup"
True</code></pre>
<p><code class="label function">iterate</code> takes a function and a
starting value. It applies the function to the starting value, then it
applies that function to the result, then it applies the function to
that result again, etc. It returns all the results in the form of an
infinite list.</p>
<pre class="haskell:ghci"><code>ghci> take 10 $ iterate (*2) 1
[1,2,4,8,16,32,64,128,256,512]
ghci> take 3 $ iterate (++ "haha") "haha"
["haha","hahahaha","hahahahahaha"]</code></pre>
<p><code class="label function">splitAt</code> takes a number and a
list. It then splits the list at that many elements, returning the
resulting two lists in a tuple.</p>
<pre class="haskell:ghci"><code>ghci> splitAt 3 "heyman"
("hey","man")
ghci> splitAt 100 "heyman"
("heyman","")
ghci> splitAt (-3) "heyman"
("","heyman")
ghci> let (a,b) = splitAt 3 "foobar" in b ++ a
"barfoo"</code></pre>
<p><code class="label function">takeWhile</code> is a really useful
little function. It takes elements from a list while the predicate holds
and then when an element is encountered that doesn’t satisfy the
predicate, it’s cut off. It turns out this is very useful.</p>
<pre class="haskell:ghci"><code>ghci> takeWhile (>3) [6,5,4,3,2,1,2,3,4,5,4,3,2,1]
[6,5,4]
ghci> takeWhile (/=' ') "This is a sentence"
"This"</code></pre>
<p>Say we wanted to know the sum of all third powers that are under
10,000. We can’t map <code>(^3)</code> to <code>[1..]</code>, apply a
filter and then try to sum that up because filtering an infinite list
never finishes. You may know that all the elements here are ascending
but Haskell doesn’t. That’s why we can do this:</p>
<pre class="haskell:ghci"><code>ghci> sum $ takeWhile (<10000) $ map (^3) [1..]
53361</code></pre>
<p>We apply <code>(^3)</code> to an infinite list and then once an
element that’s over 10,000 is encountered, the list is cut off. Now we
can sum it up easily.</p>
<p><code class="label function">dropWhile</code> is similar, only it
drops all the elements while the predicate is true. Once predicate
equates to <code>False</code>, it returns the rest of the list. An
extremely useful and lovely function!</p>
<pre class="haskell:ghci"><code>ghci> dropWhile (/=' ') "This is a sentence"
" is a sentence"
ghci> dropWhile (<3) [1,2,2,2,3,4,5,4,3,2,1]
[3,4,5,4,3,2,1]</code></pre>
<p>We’re given a list that represents the value of a stock by date. The
list is made of tuples whose first component is the stock value, the
second is the year, the third is the month and the fourth is the date.
We want to know when the stock value first exceeded one thousand
dollars!</p>
<pre class="haskell:ghci"><code>ghci> let stock = [(994.4,2008,9,1),(995.2,2008,9,2),(999.2,2008,9,3),(1001.4,2008,9,4),(998.3,2008,9,5)]
ghci> head (dropWhile (\(val,y,m,d) -> val < 1000) stock)
(1001.4,2008,9,4)</code></pre>
<p><code class="label function">span</code> is kind of like
<code>takeWhile</code>, only it returns a pair of lists. The first list
contains everything the resulting list from <code>takeWhile</code> would
contain if it were called with the same predicate and the same list. The
second list contains the part of the list that would have been
dropped.</p>
<pre class="haskell:ghci"><code>ghci> let (fw, rest) = span (/=' ') "This is a sentence" in "First word:" ++ fw ++ ", the rest:" ++ rest
"First word: This, the rest: is a sentence"</code></pre>
<p>Whereas <code>span</code> spans the list while the predicate is true,
<code class="label function">break</code> breaks it when the predicate
is first true. Doing <code>break p</code> is the equivalent of doing
<code>span (not . p)</code>.</p>
<pre class="haskell:ghci"><code>ghci> break (==4) [1,2,3,4,5,6,7]
([1,2,3],[4,5,6,7])
ghci> span (/=4) [1,2,3,4,5,6,7]
([1,2,3],[4,5,6,7])</code></pre>
<p>When using <code>break</code>, the second list in the result will
start with the first element that satisfies the predicate.</p>
<p><code class="label function">sort</code> simply sorts a list. The
type of the elements in the list has to be part of the <code>Ord</code>
typeclass, because if the elements of a list can’t be put in some kind
of order, then the list can’t be sorted.</p>
<pre class="haskell:ghci"><code>ghci> sort [8,5,3,2,1,6,4,2]
[1,2,2,3,4,5,6,8]
ghci> sort "This will be sorted soon"
" Tbdeehiillnooorssstw"</code></pre>
<p><code class="label function">group</code> takes a list and groups
adjacent elements into sublists if they are equal.</p>
<pre class="haskell:ghci"><code>ghci> group [1,1,1,1,2,2,2,2,3,3,2,2,2,5,6,7]
[[1,1,1,1],[2,2,2,2],[3,3],[2,2,2],[5],[6],[7]]</code></pre>
<p>If we sort a list before grouping it, we can find out how many times
each element appears in the list.</p>
<pre class="haskell:ghci"><code>ghci> map (\l@(x:xs) -> (x,length l)) . group . sort $ [1,1,1,1,2,2,2,2,3,3,2,2,2,5,6,7]
[(1,4),(2,7),(3,2),(5,1),(6,1),(7,1)]</code></pre>
<p><code class="label function">inits</code> and <code
class="label function">tails</code> are like <code>init</code> and
<code>tail</code>, only they recursively apply that to a list until
there’s nothing left. Observe.</p>
<pre class="haskell:ghci"><code>ghci> inits "w00t"
["","w","w0","w00","w00t"]
ghci> tails "w00t"
["w00t","00t","0t","t",""]
ghci> let w = "w00t" in zip (inits w) (tails w)
[("","w00t"),("w","00t"),("w0","0t"),("w00","t"),("w00t","")]</code></pre>
<p>Let’s use a fold to implement searching a list for a sublist.</p>
<pre class="haskell:hs"><code>search :: (Eq a) => [a] -> [a] -> Bool
search needle haystack =
let nlen = length needle
in foldl (\acc x -> if take nlen x == needle then True else acc) False (tails haystack)</code></pre>
<p>First we call <code>tails</code> with the list in which we’re
searching. Then we go over each tail and see if it starts with what
we’re looking for.</p>
<p>With that, we actually just made a function that behaves like <code
class="label function">isInfixOf</code>. <code>isInfixOf</code> searches
for a sublist within a list and returns <code>True</code> if the sublist
we’re looking for is somewhere inside the target list.</p>
<pre class="haskell:ghci"><code>ghci> "cat" `isInfixOf` "im a cat burglar"
True
ghci> "Cat" `isInfixOf` "im a cat burglar"
False
ghci> "cats" `isInfixOf` "im a cat burglar"
False</code></pre>
<p><code class="label function">isPrefixOf</code> and <code
class="label function">isSuffixOf</code> search for a sublist at the
beginning and at the end of a list, respectively.</p>
<pre class="haskell:ghci"><code>ghci> "hey" `isPrefixOf` "hey there!"
True
ghci> "hey" `isPrefixOf` "oh hey there!"
False
ghci> "there!" `isSuffixOf` "oh hey there!"
True
ghci> "there!" `isSuffixOf` "oh hey there"
False</code></pre>
<p><code class="label function">elem</code> and <code
class="label function">notElem</code> check if an element is or isn’t
inside a list.</p>
<p><code class="label function">partition</code> takes a list and a
predicate and returns a pair of lists. The first list in the result
contains all the elements that satisfy the predicate, the second
contains all the ones that don’t.</p>
<pre class="haskell:ghci"><code>ghci> partition (`elem` ['A'..'Z']) "BOBsidneyMORGANeddy"
("BOBMORGAN","sidneyeddy")
ghci> partition (>3) [1,3,5,6,3,2,1,0,3,7]
([5,6,7],[1,3,3,2,1,0,3])</code></pre>
<p>It’s important to understand how this is different from
<code>span</code> and <code>break</code>:</p>
<pre class="haskell:ghci"><code>ghci> span (`elem` ['A'..'Z']) "BOBsidneyMORGANeddy"
("BOB","sidneyMORGANeddy")</code></pre>
<p>While <code>span</code> and <code>break</code> are done once they
encounter the first element that doesn’t and does satisfy the predicate,
<code>partition</code> goes through the whole list and splits it up
according to the predicate.</p>
<p><code class="label function">find</code> takes a list and a predicate
and returns the first element that satisfies the predicate. But it
returns that element wrapped in a <code>Maybe</code> value. We’ll be
covering algebraic data types more in depth in the next chapter but for
now, this is what you need to know: a <code>Maybe</code> value can
either be <code>Just something</code> or <code>Nothing</code>. Much like
a list can be either an empty list or a list with some elements, a
<code>Maybe</code> value can be either no elements or a single element.
And like the type of a list of, say, integers is <code>[Int]</code>, the
type of maybe having an integer is <code>Maybe Int</code>. Anyway, let’s
take our <code>find</code> function for a spin.</p>
<pre class="haskell:ghci"><code>ghci> find (>4) [1,2,3,4,5,6]
Just 5
ghci> find (>9) [1,2,3,4,5,6]
Nothing
ghci> :t find
find :: (a -> Bool) -> [a] -> Maybe a</code></pre>
<p>Notice the type of <code>find</code>. Its result is
<code>Maybe a</code>. That’s kind of like having the type of
<code>[a]</code>, only a value of the type <code>Maybe</code> can
contain either no elements or one element, whereas a list can contain no
elements, one element or several elements.</p>
<p>Remember when we were searching for the first time our stock went
over $1000. We did
<code>head (dropWhile (\(val,y,m,d) -> val < 1000) stock)</code>.
Remember that <code>head</code> is not really safe. What would happen if
our stock never went over $1000? Our application of
<code>dropWhile</code> would return an empty list and getting the head
of an empty list would result in an error. However, if we rewrote that
as <code>find (\(val,y,m,d) -> val > 1000) stock</code>, we’d be
much safer. If our stock never went over $1000 (so if no element
satisfied the predicate), we’d get back a <code>Nothing</code>. But if
there was a valid answer in that list, we’d get, say,
<code>Just (1001.4,2008,9,4)</code>.</p>
<p><code class="label function">elemIndex</code> is kind of like
<code>elem</code>, only it doesn’t return a boolean value. It maybe
returns the index of the element we’re looking for. If that element
isn’t in our list, it returns a <code>Nothing</code>.</p>
<pre class="haskell:ghci"><code>ghci> :t elemIndex
elemIndex :: (Eq a) => a -> [a] -> Maybe Int
ghci> 4 `elemIndex` [1,2,3,4,5,6]
Just 3
ghci> 10 `elemIndex` [1,2,3,4,5,6]
Nothing</code></pre>
<p><code class="label function">elemIndices</code> is like
<code>elemIndex</code>, only it returns a list of indices, in case the
element we’re looking for crops up in our list several times. Because
we’re using a list to represent the indices, we don’t need a
<code>Maybe</code> type, because failure can be represented as the empty
list, which is very much synonymous to <code>Nothing</code>.</p>
<pre class="haskell:ghci"><code>ghci> ' ' `elemIndices` "Where are the spaces?"
[5,9,13]</code></pre>
<p><code class="label function">findIndex</code> is like find, but it
maybe returns the index of the first element that satisfies the
predicate. <code class="label function">findIndices</code> returns the
indices of all elements that satisfy the predicate in the form of a
list.</p>
<pre class="haskell:ghci"><code>ghci> findIndex (==4) [5,3,2,1,6,4]
Just 5
ghci> findIndex (==7) [5,3,2,1,6,4]
Nothing
ghci> findIndices (`elem` ['A'..'Z']) "Where Are The Caps?"
[0,6,10,14]</code></pre>
<p>We already covered <code>zip</code> and <code>zipWith</code>. We
noted that they zip together two lists, either in a tuple or with a
binary function (meaning such a function that takes two parameters). But
what if we want to zip together three lists? Or zip three lists with a
function that takes three parameters? Well, for that, we have <code
class="label function">zip3</code>, <code
class="label function">zip4</code>, etc. and <code
class="label function">zipWith3</code>, <code
class="label function">zipWith4</code>, etc. These variants go up to 7.
While this may look like a hack, it works out pretty fine, because there
aren’t many times when you want to zip 8 lists together. There’s also a
very clever way for zipping infinite numbers of lists, but we’re not
advanced enough to cover that just yet.</p>
<pre class="haskell:ghci"><code>ghci> zipWith3 (\x y z -> x + y + z) [1,2,3] [4,5,2,2] [2,2,3]
[7,9,8]
ghci> zip4 [2,3,3] [2,2,2] [5,5,3] [2,2,2]
[(2,2,5,2),(3,2,5,2),(3,2,3,2)]</code></pre>
<p>Just like with normal zipping, lists that are longer than the
shortest list that’s being zipped are cut down to size.</p>
<p><code class="label function">lines</code> is a useful function when
dealing with files or input from somewhere. It takes a string and
returns every line of that string as separate element of a list.</p>
<pre class="haskell:ghci"><code>ghci> lines "first line\nsecond line\nthird line"
["first line","second line","third line"]</code></pre>
<p><code>'\n'</code> is the character for a unix newline. Backslashes
have special meaning in Haskell strings and characters.</p>
<p><code class="label function">unlines</code> is the inverse function
of <code>lines</code>. It takes a list of strings and joins them
together using a <code>'\n'</code>.</p>
<pre class="haskell:ghci"><code>ghci> unlines ["first line", "second line", "third line"]
"first line\nsecond line\nthird line\n"</code></pre>
<p><code class="label function">words</code> and <code
class="label function">unwords</code> are for splitting a line of text
into words or joining a list of words into a text. Very useful.</p>
<pre class="haskell:ghci"><code>ghci> words "hey these are the words in this sentence"
["hey","these","are","the","words","in","this","sentence"]
ghci> words "hey these are the words in this\nsentence"
["hey","these","are","the","words","in","this","sentence"]
ghci> unwords ["hey","there","mate"]
"hey there mate"</code></pre>
<p>We’ve already mentioned <code class="label function">nub</code>. It
takes a list and weeds out the duplicate elements, returning a list
whose every element is a unique snowflake! The function does have a kind
of strange name. It turns out that “nub” means a small lump or essential
part of something. In my opinion, they should use real words for
function names instead of old-people words.</p>
<pre class="haskell:ghci"><code>ghci> nub [1,2,3,4,3,2,1,2,3,4,3,2,1]
[1,2,3,4]
ghci> nub "Lots of words and stuff"
"Lots fwrdanu"</code></pre>
<p><code class="label function">delete</code> takes an element and a
list and deletes the first occurrence of that element in the list.</p>
<pre class="haskell:ghci"><code>ghci> delete 'h' "hey there ghang!"
"ey there ghang!"
ghci> delete 'h' . delete 'h' $ "hey there ghang!"
"ey tere ghang!"
ghci> delete 'h' . delete 'h' . delete 'h' $ "hey there ghang!"
"ey tere gang!"</code></pre>
<p><code class="label function">\\</code> is the list difference
function. It acts like a set difference, basically. For every element in
the right-hand list, it removes a matching element in the left one.</p>
<pre class="haskell:ghci"><code>ghci> [1..10] \\ [2,5,9]
[1,3,4,6,7,8,10]
ghci> "Im a big baby" \\ "big"
"Im a baby"</code></pre>
<p>Doing <code>[1..10] \\ [2,5,9]</code> is like doing
<code>delete 2 . delete 5 . delete 9 $ [1..10]</code>.</p>
<p><code class="label function">union</code> also acts like a function
on sets. It returns the union of two lists. It pretty much goes over
every element in the second list and appends it to the first one if it
isn’t already in yet. Watch out though, duplicates are removed from the
second list!</p>
<pre class="haskell:ghci"><code>ghci> "hey man" `union` "man what's up"
"hey manwt'sup"
ghci> [1..7] `union` [5..10]
[1,2,3,4,5,6,7,8,9,10]</code></pre>
<p><code class="label function">intersect</code> works like set
intersection. It returns only the elements that are found in both
lists.</p>
<pre class="haskell:ghci"><code>ghci> [1..7] `intersect` [5..10]
[5,6,7]</code></pre>
<p><code class="label function">insert</code> takes an element and a
list of elements that can be sorted and inserts it into the last
position where it’s still less than or equal to the next element. In
other words, <code>insert</code> will start at the beginning of the list
and then keep going until it finds an element that’s equal to or greater
than the element that we’re inserting and it will insert it just before
the element.</p>
<pre class="haskell:ghci"><code>ghci> insert 4 [3,5,1,2,8,2]
[3,4,5,1,2,8,2]
ghci> insert 4 [1,3,4,4,1]
[1,3,4,4,4,1]</code></pre>
<p>The <code>4</code> is inserted right after the <code>3</code> and
before the <code>5</code> in the first example and in between the
<code>3</code> and <code>4</code> in the second example.</p>
<p>If we use <code>insert</code> to insert into a sorted list, the
resulting list will be kept sorted.</p>
<pre class="haskell:ghci"><code>ghci> insert 4 [1,2,3,5,6,7]
[1,2,3,4,5,6,7]
ghci> insert 'g' $ ['a'..'f'] ++ ['h'..'z']
"abcdefghijklmnopqrstuvwxyz"
ghci> insert 3 [1,2,4,3,2,1]
[1,2,3,4,3,2,1]</code></pre>
<p>What <code>length</code>, <code>take</code>, <code>drop</code>,
<code>splitAt</code>, <code>!!</code> and <code>replicate</code> have in
common is that they take an <code>Int</code> as one of their parameters
(or return an <code>Int</code>), even though they could be more generic
and usable if they just took any type that’s part of the
<code>Integral</code> or <code>Num</code> typeclasses (depending on the
functions). They do that for historical reasons. However, fixing that
would probably break a lot of existing code. That’s why
<code>Data.List</code> has their more generic equivalents, named <code
class="label function">genericLength</code>, <code
class="label function">genericTake</code>, <code
class="label function">genericDrop</code>, <code
class="label function">genericSplitAt</code>, <code
class="label function">genericIndex</code> and <code
class="label function">genericReplicate</code>. For instance,
<code>length</code> has a type signature of
<code>length :: [a] -> Int</code>. If we try to get the average of a
list of numbers by doing
<code>let xs = [1..6] in sum xs / length xs</code>, we get a type error,
because you can’t use <code>/</code> with an <code>Int</code>.
<code>genericLength</code>, on the other hand, has a type signature of
<code>genericLength :: (Num a) => [b] -> a</code>. Because a
<code>Num</code> can act like a floating point number, getting the
average by doing
<code>let xs = [1..6] in sum xs / genericLength xs</code> works out just
fine.</p>
<p>The <code>nub</code>, <code>delete</code>, <code>union</code>,
<code>intersect</code> and <code>group</code> functions all have their
more general counterparts called <code
class="label function">nubBy</code>, <code
class="label function">deleteBy</code>, <code
class="label function">unionBy</code>, <code
class="label function">intersectBy</code> and <code
class="label function">groupBy</code>. The difference between them is
that the first set of functions use <code>==</code> to test for
equality, whereas the <em>By</em> ones also take an equality function
and then compare them by using that equality function.
<code>group</code> is the same as <code>groupBy (==)</code>.</p>
<p>For instance, say we have a list that describes the value of a
function for every second. We want to segment it into sublists based on
when the value was below zero and when it went above. If we just did a
normal <code>group</code>, it would just group the equal adjacent values
together. But what we want is to group them by whether they are negative
or not. That’s where <code>groupBy</code> comes in! The equality
function supplied to the <em>By</em> functions should take two elements
of the same type and return <code>True</code> if it considers them equal
by its standards.</p>
<pre class="haskell:ghci"><code>ghci> let values = [-4.3, -2.4, -1.2, 0.4, 2.3, 5.9, 10.5, 29.1, 5.3, -2.4, -14.5, 2.9, 2.3]
ghci> groupBy (\x y -> (x > 0) == (y > 0)) values
[[-4.3,-2.4,-1.2],[0.4,2.3,5.9,10.5,29.1,5.3],[-2.4,-14.5],[2.9,2.3]]</code></pre>
<p>From this, we clearly see which sections are positive and which are
negative. The equality function supplied takes two elements and then
returns <code>True</code> only if they’re both negative or if they’re
both positive. This equality function can also be written as
<code>\x y -> (x > 0) && (y > 0) || (x <= 0) && (y <= 0)</code>,
although I think the first way is more readable. An even clearer way to
write equality functions for the <em>By</em> functions is if you import
the <code class="label function">on</code> function from
<code>Data.Function</code>. <code>on</code> is defined like this:</p>
<pre class="haskell:ghci"><code>on :: (b -> b -> c) -> (a -> b) -> a -> a -> c
f `on` g = \x y -> f (g x) (g y)</code></pre>
<p>So doing <code>(==) `on` (> 0)</code> returns an equality function
that looks like <code>\x y -> (x > 0) == (y > 0)</code>.
<code>on</code> is used a lot with the <em>By</em> functions because
with it, we can do:</p>
<pre class="haskell:ghci"><code>ghci> groupBy ((==) `on` (> 0)) values
[[-4.3,-2.4,-1.2],[0.4,2.3,5.9,10.5,29.1,5.3],[-2.4,-14.5],[2.9,2.3]]</code></pre>
<p>Very readable indeed! You can read it out loud: Group this by
equality on whether the elements are greater than zero.</p>
<p>Similarly, the <code>sort</code>, <code>insert</code>,
<code>maximum</code> and <code>minimum</code> also have their more
general equivalents. Functions like <code>groupBy</code> take a function
that determines when two elements are equal. <code
class="label function">sortBy</code>, <code
class="label function">insertBy</code>, <code
class="label function">maximumBy</code> and <code
class="label function">minimumBy</code> take a function that determine
if one element is greater, smaller or equal to the other. The type
signature of <code>sortBy</code> is
<code>sortBy :: (a -> a -> Ordering) -> [a] -> [a]</code>.
If you remember from before, the <code>Ordering</code> type can have a
value of <code>LT</code>, <code>EQ</code> or <code>GT</code>.
<code>sort</code> is the equivalent of <code>sortBy compare</code>,
because compare just takes two elements whose type is in the
<code>Ord</code> typeclass and returns their ordering relationship.</p>
<p>Lists can be compared, but when they are, they are compared
lexicographically. What if we have a list of lists and we want to sort
it not based on the inner lists’ contents but on their lengths? Well, as
you’ve probably guessed, we’ll use the <code>sortBy</code> function.</p>
<pre class="haskell:ghci"><code>ghci> let xs = [[5,4,5,4,4],[1,2,3],[3,5,4,3],[],[2],[2,2]]
ghci> sortBy (compare `on` length) xs
[[],[2],[2,2],[1,2,3],[3,5,4,3],[5,4,5,4,4]]</code></pre>
<p>Awesome! <code>compare `on` length</code> … man, that reads almost
like real English! If you’re not sure how exactly the <code>on</code>
works here, <code>compare `on` length</code> is the equivalent of
<code>\x y -> length x `compare` length y</code>. When you’re dealing
with <em>By</em> functions that take an equality function, you usually
do <code>(==) `on` something</code> and when you’re dealing with
<em>By</em> functions that take an ordering function, you usually do
<code>compare `on` something</code>.</p>
<h2 id="data-char">Data.Char</h2>
<p><img src="assets/images/modules/legochar.png" class="right"
width="230" height="323" alt="lego char" /></p>
<p>The <code>Data.Char</code> module does what its name suggests. It
exports functions that deal with characters. It’s also helpful when
filtering and mapping over strings because they’re just lists of
characters.</p>
<p><code>Data.Char</code> exports a bunch of predicates over characters.
That is, functions that take a character and tell us whether some
assumption about it is true or false. Here’s what they are:</p>
<p><code class="label function">isControl</code> checks whether a
character is a control character.</p>
<p><code class="label function">isSpace</code> checks whether a
character is a white-space characters. That includes spaces, tab
characters, newlines, etc.</p>
<p><code class="label function">isLower</code> checks whether a
character is lower-cased.</p>
<p><code class="label function">isUpper</code> checks whether a
character is upper-cased.</p>
<p><code class="label function">isAlpha</code> checks whether a
character is a letter.</p>
<p><code class="label function">isAlphaNum</code> checks whether a
character is a letter or a number.</p>
<p><code class="label function">isPrint</code> checks whether a
character is printable. Control characters, for instance, are not
printable.</p>
<p><code class="label function">isDigit</code> checks whether a
character is a digit.</p>
<p><code class="label function">isOctDigit</code> checks whether a
character is an octal digit.</p>
<p><code class="label function">isHexDigit</code> checks whether a
character is a hex digit.</p>
<p><code class="label function">isLetter</code> checks whether a
character is a letter.</p>
<p><code class="label function">isMark</code> checks for Unicode mark
characters. Those are characters that combine with preceding letters to
form letters with accents. Use this if you are French.</p>
<p><code class="label function">isNumber</code> checks whether a
character is numeric.</p>
<p><code class="label function">isPunctuation</code> checks whether a
character is punctuation.</p>
<p><code class="label function">isSymbol</code> checks whether a
character is a fancy mathematical or currency symbol.</p>
<p><code class="label function">isSeparator</code> checks for Unicode
spaces and separators.</p>
<p><code class="label function">isAscii</code> checks whether a
character falls into the first 128 characters of the Unicode character
set.</p>
<p><code class="label function">isLatin1</code> checks whether a
character falls into the first 256 characters of Unicode.</p>
<p><code class="label function">isAsciiUpper</code> checks whether a
character is ASCII and upper-case.</p>
<p><code class="label function">isAsciiLower</code> checks whether a
character is ASCII and lower-case.</p>
<p>All these predicates have a type signature of
<code>Char -> Bool</code>. Most of the time you’ll use this to filter
out strings or something like that. For instance, let’s say we’re making
a program that takes a username and the username can only be comprised
of alphanumeric characters. We can use the <code>Data.List</code>
function <code>all</code> in combination with the <code>Data.Char</code>
predicates to determine if the username is alright.</p>
<pre class="haskell:ghci"><code>ghci> all isAlphaNum "bobby283"
True
ghci> all isAlphaNum "eddy the fish!"
False</code></pre>
<p>Kewl. In case you don’t remember, <code>all</code> takes a predicate
and a list and returns <code>True</code> only if that predicate holds
for every element in the list.</p>
<p>We can also use <code>isSpace</code> to simulate the
<code>Data.List</code> function <code>words</code>.</p>
<pre class="haskell:ghci"><code>ghci> words "hey folks its me"
["hey","folks","its","me"]
ghci> groupBy ((==) `on` isSpace) "hey folks its me"
["hey"," ","folks"," ","its"," ","me"]
ghci></code></pre>
<p>Hmmm, well, it kind of does what <code>words</code> does but we’re
left with elements of only spaces. Hmm, whatever shall we do? I know,
let’s filter that sucker.</p>
<pre class="haskell:ghci"><code>ghci> filter (not . any isSpace) . groupBy ((==) `on` isSpace) $ "hey folks its me"
["hey","folks","its","me"]</code></pre>
<p>Ah.</p>
<p>The <code>Data.Char</code> also exports a datatype that’s kind of
like <code>Ordering</code>. The <code>Ordering</code> type can have a
value of <code>LT</code>, <code>EQ</code> or <code>GT</code>. It’s a
sort of enumeration. It describes a few possible results that can arise
from comparing two elements. The <code>GeneralCategory</code> type is
also an enumeration. It presents us with a few possible categories that
a character can fall into. The main function for getting the general
category of a character is <code>generalCategory</code>. It has a type
of <code>generalCategory :: Char -> GeneralCategory</code>. There are
about 31 categories so we won’t list them all here, but let’s play
around with the function.</p>
<pre class="haskell:ghci"><code>ghci> generalCategory ' '
Space
ghci> generalCategory 'A'
UppercaseLetter
ghci> generalCategory 'a'
LowercaseLetter
ghci> generalCategory '.'
OtherPunctuation
ghci> generalCategory '9'
DecimalNumber
ghci> map generalCategory " \t\nA9?|"
[Space,Control,Control,UppercaseLetter,DecimalNumber,OtherPunctuation,MathSymbol]</code></pre>
<p>Since the <code>GeneralCategory</code> type is part of the
<code>Eq</code> typeclass, we can also test for stuff like
<code>generalCategory c == Space</code>.</p>
<p><code class="label function">toUpper</code> converts a character to
upper-case. Spaces, numbers, and the like remain unchanged.</p>
<p><code class="label function">toLower</code> converts a character to
lower-case.</p>
<p><code class="label function">toTitle</code> converts a character to
title-case. For most characters, title-case is the same as
upper-case.</p>
<p><code class="label function">digitToInt</code> converts a character
to an <code>Int</code>. To succeed, the character must be in the ranges
<code>'0'..'9'</code>, <code>'a'..'f'</code> or
<code>'A'..'F'</code>.</p>
<pre class="haskell:ghci"><code>ghci> map digitToInt "34538"
[3,4,5,3,8]
ghci> map digitToInt "FF85AB"
[15,15,8,5,10,11]</code></pre>
<p><code class="label function">intToDigit</code> is the inverse
function of <code>digitToInt</code>. It takes an <code>Int</code> in the
range of <code>0..15</code> and converts it to a lower-case
character.</p>
<pre class="haskell:ghci"><code>ghci> intToDigit 15
'f'
ghci> intToDigit 5
'5'</code></pre>
<p>The <code class="label function">ord</code> and <code>chr</code>
functions convert characters to their corresponding numbers and vice
versa:</p>
<pre class="haskell:ghci"><code>ghci> ord 'a'
97
ghci> chr 97
'a'
ghci> map ord "abcdefgh"
[97,98,99,100,101,102,103,104]</code></pre>
<p>The difference between the <code>ord</code> values of two characters
is equal to how far apart they are in the Unicode table.</p>
<p>The Caesar cipher is a primitive method of encoding messages by
shifting each character in them by a fixed number of positions in the
alphabet. We can easily create a sort of Caesar cipher of our own, only
we won’t constrict ourselves to the alphabet.</p>
<pre class="haskell:hs"><code>encode :: Int -> String -> String
encode shift msg =
let ords = map ord msg
shifted = map (+ shift) ords
in map chr shifted</code></pre>
<p>Here, we first convert the string to a list of numbers. Then we add
the shift amount to each number before converting the list of numbers
back to characters. If you’re a composition cowboy, you could write the
body of this function as <code>map (chr . (+ shift) . ord) msg</code>.
Let’s try encoding a few messages.</p>
<pre class="haskell:ghci"><code>ghci> encode 3 "Heeeeey"
"Khhhhh|"
ghci> encode 4 "Heeeeey"
"Liiiii}"
ghci> encode 1 "abcd"
"bcde"
ghci> encode 5 "Marry Christmas! Ho ho ho!"
"Rfww~%Hmwnxyrfx&%Mt%mt%mt&"</code></pre>
<p>That’s encoded alright. Decoding a message is basically just shifting
it back by the number of places it was shifted by in the first
place.</p>
<pre class="haskell:hs"><code>decode :: Int -> String -> String
decode shift msg = encode (negate shift) msg</code></pre>
<pre class="haskell:ghci"><code>ghci> encode 3 "Im a little teapot"
"Lp#d#olwwoh#whdsrw"
ghci> decode 3 "Lp#d#olwwoh#whdsrw"
"Im a little teapot"
ghci> decode 5 . encode 5 $ "This is a sentence"
"This is a sentence"</code></pre>
<h2 id="data-map">Data.Map</h2>
<p>Association lists (also called dictionaries) are lists that are used
to store key-value pairs where ordering doesn’t matter. For instance, we
might use an association list to store phone numbers, where phone
numbers would be the values and people’s names would be the keys. We
don’t care in which order they’re stored, we just want to get the right
phone number for the right person.</p>
<p>The most obvious way to represent association lists in Haskell would
be by having a list of pairs. The first component in the pair would be
the key, the second component the value. Here’s an example of an
association list with phone numbers:</p>
<pre class="haskell:hs"><code>phoneBook =
[("amelia","555-2938")
,("freya","452-2928")
,("isabella","493-2928")
,("neil","205-2928")
,("roald","939-8282")
,("tenzing","853-2492")
]</code></pre>
<p>Despite this seemingly odd indentation, this is just a list of pairs
of strings. The most common task when dealing with association lists is
looking up some value by key. Let’s make a function that looks up some
value given a key.</p>
<pre class="haskell:hs"><code>findKey :: (Eq k) => k -> [(k,v)] -> v
findKey key xs = snd . head . filter (\(k,v) -> key == k) $ xs</code></pre>
<p>Pretty simple. The function that takes a key and a list, filters the
list so that only matching keys remain, gets the first key-value that
matches and returns the value. But what happens if the key we’re looking
for isn’t in the association list? Hmm. Here, if a key isn’t in the
association list, we’ll end up trying to get the head of an empty list,
which throws a runtime error. However, we should avoid making our
programs so easy to crash, so let’s use the <code>Maybe</code> data
type. If we don’t find the key, we’ll return a <code>Nothing</code>. If
we find it, we’ll return <code>Just something</code>, where something is
the value corresponding to that key.</p>
<pre class="haskell:hs"><code>findKey :: (Eq k) => k -> [(k,v)] -> Maybe v
findKey key [] = Nothing
findKey key ((k,v):xs) = if key == k
then Just v
else findKey key xs</code></pre>
<p>Look at the type declaration. It takes a key that can be equated, an
association list and then it maybe produces a value. Sounds about
right.</p>
<p>This is a textbook recursive function that operates on a list. Edge
case, splitting a list into a head and a tail, recursive calls, they’re
all there. This is the classic fold pattern, so let’s see how this would
be implemented as a fold.</p>
<pre class="haskell:hs"><code>findKey :: (Eq k) => k -> [(k,v)] -> Maybe v
findKey key = foldr (\(k,v) acc -> if key == k then Just v else acc) Nothing</code></pre>
<div class="hintbox">
<p><strong>Note:</strong> It’s usually better to use folds for this
standard list recursion pattern instead of explicitly writing the
recursion because they’re easier to read and identify. Everyone knows
it’s a fold when they see the <code>foldr</code> call, but it takes some
more thinking to read explicit recursion.</p>
</div>
<pre class="haskell:ghci"><code>ghci> findKey "tenzing" phoneBook
Just "853-2492"
ghci> findKey "amelia" phoneBook
Just "555-2938"
ghci> findKey "christopher" phoneBook
Nothing</code></pre>
<p><img src="assets/images/modules/legomap.png" class="left" width="214"
height="240" alt="legomap" /></p>
<p>Works like a charm! If we have the friend’s phone number, we
<code>Just</code> get the number, otherwise we get
<code>Nothing</code>.</p>
<p>We just implemented the <code>lookup</code> function from
<code>Data.List</code>. If we want to find the corresponding value to a
key, we have to traverse all the elements of the list until we find it.
The <code>Data.Map</code> module offers association lists that are much
faster (because they’re internally implemented with trees) and also it
provides a lot of utility functions. From now on, we’ll say we’re
working with maps instead of association lists.</p>
<p>Because <code>Data.Map</code> exports functions that clash with the
<code>Prelude</code> and <code>Data.List</code> ones, we’ll do a
qualified import.</p>
<pre class="haskell:hs"><code>import qualified Data.Map as Map</code></pre>
<p>Put this import statement into a script and then load the script via
GHCI.</p>
<p>Let’s go ahead and see what <code>Data.Map</code> has in store for
us! Here’s the basic rundown of its functions.</p>
<p>The <code class="label function">fromList</code> function takes an
association list (in the form of a list) and returns a map with the same
associations.</p>
<pre class="haskell:ghci"><code>ghci> Map.fromList [("amelia","555-2938"),("freya","452-2928"),("neil","205-2928")]
fromList [("amelia","555-2938"),("freya","452-2928"),("neil","205-2928")]
ghci> Map.fromList [(1,2),(3,4),(3,2),(5,5)]
fromList [(1,2),(3,2),(5,5)]</code></pre>
<p>If there are duplicate keys in the original association list, the
duplicates are just discarded. This is the type signature of
<code>fromList</code></p>
<pre class="haskell:hs"><code>Map.fromList :: (Ord k) => [(k, v)] -> Map.Map k v</code></pre>
<p>It says that it takes a list of pairs of type <code>k</code> and
<code>v</code> and returns a map that maps from keys of type
<code>k</code> to type <code>v</code>. Notice that when we were doing
association lists with normal lists, the keys only had to be equatable
(their type belonging to the <code>Eq</code> typeclass) but now they
have to be orderable. That’s an essential constraint in the
<code>Data.Map</code> module. It needs the keys to be orderable so it
can arrange them in a tree.</p>
<p>You should always use <code>Data.Map</code> for key-value
associations unless you have keys that aren’t part of the
<code>Ord</code> typeclass.</p>
<p><code class="label function">empty</code> represents an empty map. It
takes no arguments, it just returns an empty map.</p>
<pre class="haskell:ghci"><code>ghci> Map.empty
fromList []</code></pre>
<p><code class="label function">insert</code> takes a key, a value and a
map and returns a new map that’s just like the old one, only with the
key and value inserted.</p>
<pre class="haskell:ghci"><code>ghci> Map.empty
fromList []
ghci> Map.insert 3 100 Map.empty
fromList [(3,100)]
ghci> Map.insert 5 600 (Map.insert 4 200 ( Map.insert 3 100 Map.empty))
fromList [(3,100),(4,200),(5,600)]
ghci> Map.insert 5 600 . Map.insert 4 200 . Map.insert 3 100 $ Map.empty
fromList [(3,100),(4,200),(5,600)]</code></pre>
<p>We can implement our own <code>fromList</code> by using the empty
map, <code>insert</code> and a fold. Watch:</p>
<pre class="haskell:ghci"><code>fromList' :: (Ord k) => [(k,v)] -> Map.Map k v
fromList' = foldr (\(k,v) acc -> Map.insert k v acc) Map.empty</code></pre>
<p>It’s a pretty straightforward fold. We start of with an empty map and
we fold it up from the right, inserting the key value pairs into the
accumulator as we go along.</p>
<p><code class="label function">null</code> checks if a map is
empty.</p>
<pre class="haskell:ghci"><code>ghci> Map.null Map.empty
True
ghci> Map.null $ Map.fromList [(2,3),(5,5)]
False</code></pre>
<p><code class="label function">size</code> reports the size of a
map.</p>
<pre class="haskell:ghci"><code>ghci> Map.size Map.empty
0
ghci> Map.size $ Map.fromList [(2,4),(3,3),(4,2),(5,4),(6,4)]
5</code></pre>
<p><code class="label function">singleton</code> takes a key and a value
and creates a map that has exactly one mapping.</p>
<pre class="haskell:ghci"><code>ghci> Map.singleton 3 9
fromList [(3,9)]
ghci> Map.insert 5 9 $ Map.singleton 3 9
fromList [(3,9),(5,9)]</code></pre>
<p><code class="label function">lookup</code> works like the
<code>Data.List</code> <code>lookup</code>, only it operates on maps. It
returns <code>Just something</code> if it finds something for the key
and <code>Nothing</code> if it doesn’t.</p>
<p><code class="label function">member</code> is a predicate that takes
a key and a map and reports whether the key is in the map or not.</p>
<pre class="haskell:ghci"><code>ghci> Map.member 3 $ Map.fromList [(3,6),(4,3),(6,9)]
True
ghci> Map.member 3 $ Map.fromList [(2,5),(4,5)]
False</code></pre>
<p><code class="label function">map</code> and <code
class="label function">filter</code> work much like their list
equivalents.</p>
<pre class="haskell:ghci"><code>ghci> Map.map (*100) $ Map.fromList [(1,1),(2,4),(3,9)]
fromList [(1,100),(2,400),(3,900)]
ghci> Map.filter isUpper $ Map.fromList [(1,'a'),(2,'A'),(3,'b'),(4,'B')]
fromList [(2,'A'),(4,'B')]</code></pre>
<p><code class="label function">toList</code> is the inverse of
<code>fromList</code>.</p>
<pre class="haskell:ghci"><code>ghci> Map.toList . Map.insert 9 2 $ Map.singleton 4 3
[(4,3),(9,2)]</code></pre>
<p><code class="label function">keys</code> and <code
class="label function">elems</code> return lists of keys and values
respectively. <code>keys</code> is the equivalent of
<code>map fst . Map.toList</code> and <code>elems</code> is the
equivalent of <code>map snd . Map.toList</code>.</p>
<p><code class="label function">fromListWith</code> is a cool little
function. It acts like <code>fromList</code>, only it doesn’t discard
duplicate keys but it uses a function supplied to it to decide what to
do with them. Let’s say that a friend can have several numbers and we
have an association list set up like this.</p>
<pre class="haskell:hs"><code>phoneBook =
[("amelia","555-2938")
,("amelia","342-2492")
,("freya","452-2928")
,("isabella","493-2928")
,("isabella","943-2929")
,("isabella","827-9162")
,("neil","205-2928")
,("roald","939-8282")
,("tenzing","853-2492")
,("tenzing","555-2111")
]</code></pre>