ASCII Table

ASCII tables are usually given in an 8 by 16 format. That makes it easy to read the hexadecimal codes off the rows and columns of the table, but I think it is also instructive to see it in my 4 by 32 format. I have also highlighted the last 5 bits, which are the same across the 4 entries in a row; the entries in a row differ only in their 2 most significant bits. Conversely, all 32 entries in a column share the same 2 most significant bits.

Binary    Char.  Binary    Char.  Binary    Char.  Binary    Char.
000 0000  NUL    010 0000  SPC    100 0000  @      110 0000  `
000 0001  SOH    010 0001  !      100 0001  A      110 0001  a
000 0010  STX    010 0010  "      100 0010  B      110 0010  b
000 0011  ETX    010 0011  #      100 0011  C      110 0011  c
000 0100  EOT    010 0100  $      100 0100  D      110 0100  d
000 0101  ENQ    010 0101  %      100 0101  E      110 0101  e
000 0110  ACK    010 0110  &      100 0110  F      110 0110  f
000 0111  BEL    010 0111  '      100 0111  G      110 0111  g
000 1000  BS     010 1000  (      100 1000  H      110 1000  h
000 1001  TAB    010 1001  )      100 1001  I      110 1001  i
000 1010  LF     010 1010  *      100 1010  J      110 1010  j
000 1011  VT     010 1011  +      100 1011  K      110 1011  k
000 1100  FF     010 1100  ,      100 1100  L      110 1100  l
000 1101  CR     010 1101  -      100 1101  M      110 1101  m
000 1110  SO     010 1110  .      100 1110  N      110 1110  n
000 1111  SI     010 1111  /      100 1111  O      110 1111  o
001 0000  DLE    011 0000  0      101 0000  P      111 0000  p
001 0001  DC1    011 0001  1      101 0001  Q      111 0001  q
001 0010  DC2    011 0010  2      101 0010  R      111 0010  r
001 0011  DC3    011 0011  3      101 0011  S      111 0011  s
001 0100  DC4    011 0100  4      101 0100  T      111 0100  t
001 0101  NAK    011 0101  5      101 0101  U      111 0101  u
001 0110  SYN    011 0110  6      101 0110  V      111 0110  v
001 0111  ETB    011 0111  7      101 0111  W      111 0111  w
001 1000  CAN    011 1000  8      101 1000  X      111 1000  x
001 1001  EM     011 1001  9      101 1001  Y      111 1001  y
001 1010  SUB    011 1010  :      101 1010  Z      111 1010  z
001 1011  ESC    011 1011  ;      101 1011  [      111 1011  {
001 1100  FS     011 1100  <      101 1100  \      111 1100  |
001 1101  GS     011 1101  =      101 1101  ]      111 1101  }
001 1110  RS     011 1110  >      101 1110  ^      111 1110  ~
001 1111  US     011 1111  ?      101 1111  _      111 1111  DEL
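
For anyone who wants to regenerate the 4 by 32 layout, here is a minimal Python sketch. The names CONTROL_NAMES, char_name, and table_rows are my own inventions for illustration; the control-character abbreviations are simply the ones used in the table. The idea is that the row index supplies the low 5 bits and the column index supplies the 2 most significant bits.

```python
# Abbreviations for codes 0-31, in order, spelled as in the table above.
CONTROL_NAMES = (
    "NUL SOH STX ETX EOT ENQ ACK BEL BS TAB LF VT FF CR SO SI "
    "DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM SUB ESC FS GS RS US"
).split()


def char_name(code):
    """Return the table label for a 7-bit ASCII code (hypothetical helper)."""
    if code < 32:
        return CONTROL_NAMES[code]
    if code == 32:
        return "SPC"
    if code == 127:
        return "DEL"
    return chr(code)


def table_rows():
    """Build the 32 rows; each row holds the 4 codes sharing the same low 5 bits."""
    rows = []
    for low in range(32):          # low 5 bits: constant across a row
        cells = []
        for high in range(4):      # 2 most significant bits: constant down a column
            code = (high << 5) | low
            bits = format(code, "07b")
            cells.append(f"{bits[:3]} {bits[3:]} {char_name(code)}")
        rows.append(" ".join(cells))
    return rows
```

Calling print("\n".join(table_rows())) writes out the 32 rows, one per line, with single spaces between the fields.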

One thing that is immediately obvious is that each uppercase letter and its lowercase counterpart differ by only 1 bit (the bit with value 32). This made it easy to write program code that changes the case of a letter, back when ASCII had not been extended. Although I imagine that a few programmers wanted the letters to start at a lower part of 0 0000 instead of 0 0001, I also imagine that the long-established tradition of numbering the letters, with A mapping to 1 and Z mapping to 26, was what overrode the programmers' inclination to count from 0.
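
The one-bit difference can be sketched in a few lines of Python. This is only an illustration under the assumption that the input is a plain 7-bit ASCII character; CASE_BIT, to_upper, to_lower, and letter_number are names I made up for the example, and real programs should use the language's own case functions for anything beyond ASCII.

```python
CASE_BIT = 0x20  # 010 0000: the single bit separating 'A' (100 0001) from 'a' (110 0001)


def to_upper(ch):
    """Clear the case bit of an ASCII lowercase letter; leave anything else alone."""
    if "a" <= ch <= "z":
        return chr(ord(ch) & ~CASE_BIT)
    return ch


def to_lower(ch):
    """Set the case bit of an ASCII uppercase letter; leave anything else alone."""
    if "A" <= ch <= "Z":
        return chr(ord(ch) | CASE_BIT)
    return ch


def letter_number(ch):
    """Return the position in the alphabet (A=1 .. Z=26) from the low 5 bits."""
    if not ("A" <= ch <= "Z" or "a" <= ch <= "z"):
        raise ValueError("not an ASCII letter")
    return ord(ch) & 0x1F
```

The range checks matter: the same bit also separates @ from ` and [ from {, so flipping it blindly would mangle the punctuation in the letter columns.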

The mapping of Control-letter combinations to control characters is a case where certain ideas suggest themselves, but some hard choices also had to be made. The characters with code points 1 to 26 are frequently represented with the string ^X, where X is a capital letter between A and Z (inclusive), namely the capital letter that sits in the same row of the table. For instance, BS used to be frequently represented as ^H, and even now if I open a file with DOS line endings in Emacs, I'll see ^M's for the CRs. (Note that by default Emacs automatically recognizes the line-ending convention used in each file, but I have changed this.) The table also makes it clear why one sometimes sees the null character, NUL, represented as ^@, and ESC represented as ^[. The control characters FS, GS, RS, and US are not often used today, which is why one rarely sees ^\, ^], ^^, or ^_. Why not take the symbols from the lowercase column? Because the last code in that column is assigned not to a printable character but to DEL, which would make representing US a problem. In a kind of symmetry, though, ^? was sometimes used to represent the DEL character.
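
The caret convention described above amounts to flipping the most significant of the 7 bits, which moves a code between the control column and the uppercase column of the same row. Here is a rough Python sketch of that reading; caret_notation and lowercase_partner are hypothetical names of mine, not functions from any library.

```python
def caret_notation(code):
    """Return the ^X form for a control code (0-31) or DEL (127)."""
    if not (0 <= code < 32 or code == 127):
        raise ValueError("not a control character")
    # Flipping bit 100 0000 maps 0-31 onto '@'..'_' and maps DEL (127) onto '?'.
    return "^" + chr(code ^ 0x40)


def lowercase_partner(code):
    """Return the code in the lowercase column that shares a row with a control code."""
    if not 0 <= code < 32:
        raise ValueError("not in the control column")
    return code | 0x60  # set both of the 2 most significant bits
```

With these, caret_notation(13) gives ^M for CR and caret_notation(127) gives ^? for DEL, while lowercase_partner(31) comes out as 127, which is DEL rather than a printable symbol. That is the problem with the lowercase column mentioned above.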