Using the Burrows Wheeler block sorting text compression algorithm on: S = "good, jolly good". 0 1 2 3 4 5 6 7 8 9 A B C D E F 0 1 2 3 4 5 6 7 8 9 A B C D E F ------------------------------- ------------------------------- 0 | g o o d , j o l l y g o o d 0 | g o o d g o o d , j o l l y 1 | o o d , j o l l y g o o d g 1 | j o l l y g o o d g o o d , 2 | o d , j o l l y g o o d g o 2 | , j o l l y g o o d g o o d 3 | d , j o l l y g o o d g o o 3 | d , j o l l y g o o d g o o 4 | , j o l l y g o o d g o o d 4 | d g o o d , j o l l y g o o 5 | j o l l y g o o d g o o d , 5 | g o o d , j o l l y g o o d 6 | j o l l y g o o d g o o d , 6 | g o o d g o o d , j o l l y 7 | o l l y g o o d g o o d , j 7 | j o l l y g o o d g o o d , 8 | l l y g o o d g o o d , j o 8 | l l y g o o d g o o d , j o 9 | l y g o o d g o o d , j o l 9 | l y g o o d g o o d , j o l A | y g o o d g o o d , j o l l A | o d , j o l l y g o o d g o B | g o o d g o o d , j o l l y B | o d g o o d , j o l l y g o C | g o o d g o o d , j o l l y C | o l l y g o o d g o o d , j D | o o d g o o d , j o l l y g D | o o d , j o l l y g o o d g E | o d g o o d , j o l l y g o E | o o d g o o d , j o l l y g F | d g o o d , j o l l y g o o F | y g o o d g o o d , j o l l 1. The shifts of string S 2. Matrix M: Shifts in lexicographic order 3. The output from the block sort: L and I The Last Column: L = "y,dood oloojggl"; and index of S in M: I = 5. Note that Y is initialized to the ASCCI code in this example. 4. After move-to-front-coding, we obtain R: 121,45,101,112,0,1,36,0,2,110,1,0,109,107,0,3,5 Explanation: ------------ R[0]=121, since the ASCII code for y is 121, and now we move y to the the front of Y (and consequently all characters of Y, till Y[120], move one position to the right). Now: Y[0]=y ... R[1]=45, since the ASCII code for , is 44; but now y is in the first position of Y and so the code for , is 44 + 1. The , now moves to the first entry of Y (again move characters of Y till Y[44], one postion to the right). Now: Y[0]=, Y[1]=y ... R[2]=101, since the ASCII code for d is 100 and has moved once to the right. Move d to the front of Y. Now: Y[0]=d Y[1]=, Y[2]=y ... R[3]=112, since the ASCII code for o is 111 and has moved once to the right. Move o to the front of Y. Now: Y[0]=o Y[1]=d Y[2]=, Y[3]=y ... R[4]=0, since we have just seen character o. It is in the first location of Y. We still have: Y[0]=o Y[1]=d Y[2]=, Y[3]=y ... R[5]=1, since Y[1]=d, and move d to the front of Y. R[5]=1 means that only one character (the o) has been processed since the last time we encountered a d. Now: Y[0]=d Y[1]=o Y[2]=, Y[3]=y ... R[6]=36, since the ASCII code of SP (space) is 32 and the space has moved 4 positions to the right. Move SP to the front of Y. Now: Y[0]=SP Y[1]=d Y[2]=o Y[3]=, Y[4]=y ... R[7]=0, since we have just seen a space. It is in the first location of Y. We still have: Y[0]=SP Y[1]=d Y[2]=o Y[3]=, Y[4]=y ... R[8]=2, since Y[2]=o, and move o to the front of Y. R[8]=2 means that only two characters (SP and d) have been processed since the last time we encountered an o. Now: Y[0]=o Y[1]=SP Y[2]=d Y[3]=, Y[4]=y ... R[9]=110, since the ASCII code of l is 108 and the letter l has moved 2 positions to the right. Move l to the front of Y. Now: Y[0]=l Y[1]=o Y[2]=SP Y[3]=d Y[4]=, Y[5]=y ... R[10]=1, since Y[1]=o and move o to the front of Y. R[10]=1 means that one character (the l) has been processed since the last time we encountered an o. Now: Y[0]=o Y[1]=l Y[2]=SP Y[3]=d Y[4]=, Y[5]=y ... R[11]=0, since we have just seen an o. It is in the first location of Y. We still have: Y[0]=o Y[1]=l Y[2]=SP Y[3]=d Y[4]=, Y[5]=y ... R[12]=109, since the ASCII code of j is 106 and j has moved 3 positions to the right. Move j to the front of Y. Now: Y[0]=j Y[1]=o Y[2]=l Y[3]=SP Y[4]=d Y[5]=, Y[6]=y ... R[13]=107, since the ASCII code of g is 103 and g has moved 4 positions to the right. Move g to the front of Y. Now: Y[0]=g Y[1]=j Y[2]=o Y[3]=l Y[4]=SP Y[5]=d Y[6]=, Y[7]=y ... R[14]=0, since we have just seen a g. It is in the first location of Y. We have: Y[0]=g Y[1]=j Y[2]=o Y[3]=l Y[4]=SP Y[5]=d Y[6]=, Y[7]=y ... R[15]=3, since Y[3]=l, and move l to the front of Y. R[15]=3 means that three characters (g j and o) have been processed since the last time we encountered an l. Now: Y[0]=l Y[1]=g Y[2]=j Y[3]=o Y[4]=SP Y[5]=d Y[6]=, Y[7]=y ... The last 5 is the value of I (the index of S in M). 5. The values of Y in hexadecimal: 79 2D 65 70 00 01 24 00 02 6E 01 00 6D 6B 00 03 05 6. The statistics 00: 4 01: 2 02: 1 03: 1 05: 1 24: 1 2D: 1 65: 1 6B: 1 6D: 1 6E: 1 70: 1 79: 1