Thursday, March 21, 2013

Pattern Matching - cheat sheet


Pattern Matching

Shell globbing
Pattern matching in the shell against filenames defines its metacharacters differently from the rest of the Unix pattern matching programs. * matches any string of characters, including none; ? matches exactly one character. Neither one matches a slash or a leading dot, but both will happily match whitespace. So *.c matches any filename ending with the two characters .c (ls *.c will list all the C source files in the directory, assuming the directory's owner is sane).
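A small sketch of the globbing rules above, run in a throwaway directory. The file names and variable names are made up for illustration, and the default glob settings of sh or bash are assumed.

```shell
# Work in a scratch directory so the globs have known files to match.
start_dir=$PWD
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch main.c util.c a.c notes.txt "my file.c" .hidden.c

# *.c expands to every non-hidden name ending in .c,
# including the one that contains a space.
c_files=$(printf '%s\n' *.c)
printf '%s\n' "$c_files"

# ?.c needs exactly one character before .c, so only a.c matches.
one_char=$(printf '%s\n' ?.c)
printf '%s\n' "$one_char"

# Clean up the scratch directory.
cd "$start_dir" || exit 1
rm -rf "$demo_dir"
```

Note that .hidden.c is not picked up by *.c; a leading dot has to be matched explicitly, as in .*.c.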
grep, sed
Table of metacharacters:
  1. ^ (caret) match beginning of line. Anchors match.
  2. $ (dollar sign) match end of line. Anchors match.
  3. . (dot) match any character. Beware, command line globbing uses ? instead.
  4. * (star) matches zero or more of the preceding character. Beware, the command line's * behaves like .* here.
  5. [] (square braces) set of characters inside braces, match any one of.
  6. [^ ] (caret as first character inside braces), match any one character except those inside braces.
  7. [a-z] (use of dash inside braces) match a range. If - itself is to be matched, it must be the first (or last) character, to avoid misinterpretation as the range operator.
  8. () (parentheses, which must be escaped with a backslash), save match for later use with \n, where n is a digit from 1 to 9.
  9. {m}, {m,} and {m,n} (braces, which must be escaped with a backslash), match exactly m, m or more, or between m and n repetitions of the preceding character.
  10. & (ampersand) expands to the matched string, used in the replacement part of sed's s command.
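To see a few of the table entries in action, here is a short sketch that runs grep over a made-up word list. The list and the variable names are only for illustration.

```shell
# A tiny sample word list, one word per line.
words='cat
cot
cut
cart
ct
caat
scatter
a-b'

# ^ and $ anchor the match: only the line that is exactly "cat".
exact=$(printf '%s\n' "$words" | grep '^cat$')

# . is any one character and * repeats the preceding item,
# so this is "starts with c, ends with t, anything between".
dot_star=$(printf '%s\n' "$words" | grep '^c.*t$')

# A set matches any one of its members; a leading ^ negates it.
in_set=$(printf '%s\n' "$words" | grep '^c[ao]t$')      # cat, cot
not_in_set=$(printf '%s\n' "$words" | grep '^c[^ao]t$') # cut

# A dash placed first in the set is a literal dash, not a range.
dash=$(printf '%s\n' "$words" | grep 'a[-_]b')          # a-b

# \{2\} means exactly two of the preceding character.
twice=$(printf '%s\n' "$words" | grep '^ca\{2\}t$')     # caat

# \(.\) saves one character and \1 demands the same one again,
# which finds doubled letters.
doubled=$(printf '%s\n' "$words" | grep '\(.\)\1')      # caat, scatter

printf '%s\n' "$dot_star"
```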
Grep, sed
Flags for grep of note:
  • -i, case insensitive
  • -v, invert, select non-matching lines
  • -c, give count of matching lines.
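The three grep flags can be tried on a few invented log lines. The sample text and variable names below are assumptions for the sake of the demo.

```shell
# Made-up log lines with mixed case.
log='Error: disk full
warning: low memory
error: timeout
info: ok'

# -i ignores case, so both "Error" and "error" match.
insensitive=$(printf '%s\n' "$log" | grep -i 'error')

# -v inverts the selection: every line that does NOT match.
inverted=$(printf '%s\n' "$log" | grep -i -v 'error')

# -c prints only the number of matching lines.
count=$(printf '%s\n' "$log" | grep -i -c 'error')

printf 'matching lines: %s\n' "$count"
```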
Flags for sed of note:
  • -n, suppress the automatic printing of each line; a line is printed only if a command (such as p) forces it
  • -f, read the sed commands from a file
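A quick sketch of what the two sed flags change, using three invented input lines and a temporary script file. All names here are illustrative.

```shell
text='alpha
beta
gamma'

# Without -n every line is printed anyway, and p prints the
# matching line a second time.
twice=$(printf '%s\n' "$text" | sed '/beta/p')

# With -n only the lines that p forces out are printed.
only=$(printf '%s\n' "$text" | sed -n '/beta/p')

# -f takes the commands from a file, one command per line.
script=$(mktemp)
printf '%s\n' 's/alpha/ALPHA/' '/gamma/d' > "$script"
from_file=$(printf '%s\n' "$text" | sed -f "$script")
rm -f "$script"

printf '%s\n' "$from_file"
```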
Sed commands:
  • form is [address][,address][!]command [arguments]. You tend to have to enclose this in single quotes or the shell will demolish it. Or double quotes if you want shell variables expanded inside the mess.
  • No address: all lines; one address: lines matching the address are processed; two addresses: the first address starts processing, the second address ends processing (both ends included). A ! after the address inverts the selection.
  • Addresses can be line numbers, the dollar sign (meaning the last line) or a reg. exp enclosed in //.
  • example: s/a/b/g, substitute b for a, globally. Drop the g and you only substitute the first occurrence of a on a line. Add p with the g to print out the line, especially if you are using sed -n.
  • example: /but/d, delete any line that contains "but", no buts allowed!
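The address and command forms above can be sketched on four made-up lines. The text and variable names are assumptions chosen so each result is easy to check by eye.

```shell
text='aaa
but why
banana
data'

# No g flag: only the first a on each line is replaced.
first_only=$(printf '%s\n' "$text" | sed 's/a/b/')

# With g: every a on each line is replaced.
global=$(printf '%s\n' "$text" | sed 's/a/b/g')

# A regular expression address: delete lines containing "but".
no_buts=$(printf '%s\n' "$text" | sed '/but/d')

# Two line-number addresses: delete lines 2 through 3.
range=$(printf '%s\n' "$text" | sed '2,3d')

# $ is the last line and ! negates, so this deletes all but the last.
last=$(printf '%s\n' "$text" | sed '$!d')

# -n plus the p flag: print only lines where a substitution happened.
changed=$(printf '%s\n' "$text" | sed -n 's/an/AN/gp')

# Double quotes let the shell expand variables inside the command.
old=a
new=b
from_vars=$(printf '%s\n' "$text" | sed "s/$old/$new/g")

printf '%s\n' "$global"
```

The '$!d' command is a good reason to prefer single quotes: inside double quotes the shell would try to expand $! before sed ever saw it.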
Examples
Match three-letter reversal patterns (any three characters followed at once by the same three in reverse order):
grep '\(.\)\(.\)\(.\)\3\2\1' web2
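If no web2 word list is at hand, the same pattern can be tried on a short made-up list. The words below are only an assumption for the demo.

```shell
# A few sample words standing in for the dictionary file.
sample='redder
banana
letter
pullup'

# \1, \2 and \3 recall the three saved characters; asking for
# them as \3\2\1 requires the reverse order right after.
reversals=$(printf '%s\n' "$sample" | grep '\(.\)\(.\)\(.\)\3\2\1')

printf '%s\n' "$reversals"
```

Here "redder" matches as red + der and "pullup" as pul + lup, while "banana" and "letter" contain no such six-character run.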
Substitution using sed:
sed 's/^.*:\*:\([^:]*\).*$/\1/' /etc/passwd
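To see what this substitution does without depending on the local /etc/passwd, here is a sketch on three invented passwd-style lines. The accounts and variable names are hypothetical.

```shell
# Two entries with * in the password field, one with x.
passwd_sample='root:*:0:0:System Administrator:/var/root:/bin/sh
daemon:*:1:1:System Services:/var/root:/usr/bin/false
alice:x:501:20:Alice:/home/alice:/bin/bash'

# Lines containing :*: are replaced by the field that follows it
# (the user ID); lines without it pass through untouched.
ids=$(printf '%s\n' "$passwd_sample" |
    sed 's/^.*:\*:\([^:]*\).*$/\1/')

# With -n and the p flag only the substituted lines are printed.
ids_only=$(printf '%s\n' "$passwd_sample" |
    sed -n 's/^.*:\*:\([^:]*\).*$/\1/p')

printf '%s\n' "$ids_only"
```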
Try to save old files in a subdirectory.
