Each class is a set of strings:
- Identifier
  - A string of letters or digits, starting with a letter
- Integer
  - A non-empty string of digits
- Keyword
  - A reserved word such as else or if
- Whitespace
  - A non-empty sequence of blanks
These sets classify program substrings according to the role each substring plays.
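
As a rough illustration, the four sets above can be written as regular expressions and used to classify a single substring. This is only a sketch under assumptions: the names `KEYWORDS`, `PATTERNS`, and `classify` are hypothetical, the keyword set contains just the two words named above, "letter" is taken to mean an ASCII letter, and "blank" is taken to mean a space character.

```python
import re

# Assumption: only the two keywords named in the notes are listed here.
KEYWORDS = {"if", "else"}

# Assumption: letters are ASCII letters and a blank is a single space.
PATTERNS = [
    ("Whitespace", re.compile(r" +")),                   # non-empty run of blanks
    ("Integer", re.compile(r"[0-9]+")),                  # non-empty string of digits
    ("Identifier", re.compile(r"[A-Za-z][A-Za-z0-9]*")), # letter, then letters or digits
]


def classify(substring):
    """Return the name of the set containing substring, or None if no set does."""
    # Keywords are checked first, because "if" and "else" would
    # otherwise also match the Identifier pattern.
    if substring in KEYWORDS:
        return "Keyword"
    for name, pattern in PATTERNS:
        # fullmatch requires the whole substring to belong to the set.
        if pattern.fullmatch(substring):
            return name
    return None


if __name__ == "__main__":
    for s in ["x1", "42", "else", "   ", "1x"]:
        print(repr(s), "->", classify(s))
```

Checking keywords before identifiers is a design choice in this sketch: every keyword here is also a string of letters starting with a letter, so the two sets overlap and one of them has to take priority.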