Lexers typically operate by using simple regular expressions to recognise tokens in the input stream. For example, a NAME might be any alphabetical character or the underscore followed by any number of any alphanumeric character or the underscore. This could be represented compactly by the string [a-zA-Z_][a-zA-Z_0-9]+. This means "any character a-z, A-Z or _, then 0 or more of a-z, A-Z, _ or 0-9." Regular expressions are a simple way of representing complex problems. What they cannot do is handle recursively defined patterns, such as "n opening parentheses, followed by a statement, followed by n closing parentheses." It takes a full-fledged parser? (there exist tools which accept Context free grammars as input and emit parsers as output) to recognise such patterns.