awk and sed


I found awk helpful for going through log files, since each line is naturally separated into fields. Good general tutorial here.

Useful predefined variables. Good write-up here.

  1. FILENAME – name of the file currently being read
  2. NR – number of the current line (record), counted across all input files
  3. FNR – number of the current line (record) within the current file
  4. NF – number of “fields” on the current line, i.e. the pieces separated by the field separator
  5. FS – input field separator (the default is a single space, which awk treats as any run of spaces and tabs)
  6. OFS – output field separator (a single space by default)
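
To see what these variables hold, here is a minimal sketch run over two throwaway files. The file names one.csv and two.csv and their contents are made up for illustration; -F, sets FS to a comma, and OFS is set by hand in the BEGIN block.

```shell
# Create two small sample files in a temporary directory (hypothetical data).
tmpdir=$(mktemp -d)
printf 'a,b,c\nd,e\n' > "$tmpdir/one.csv"
printf 'f,g,h,i\n'    > "$tmpdir/two.csv"

# -F, sets FS to a comma; OFS is placed between the arguments of print.
vars_out=$(cd "$tmpdir" &&
  awk -F, 'BEGIN {OFS=" | "} {print FILENAME, NR, FNR, NF, $NF}' one.csv two.csv)
echo "$vars_out"
# one.csv | 1 | 1 | 3 | c
# one.csv | 2 | 2 | 2 | e
# two.csv | 3 | 1 | 4 | i

rm -rf "$tmpdir"
```

Note how NR keeps counting into the second file while FNR starts again at 1, and how $NF picks out the last field whatever the number of fields on the line.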

Useful Code Snippets

select lines where the third column is equal to 2 and compute the mean, over those lines, of the sum of squares of the eighth, ninth and tenth columns (prints nothing if no line matches, instead of dividing by zero)
awk 'BEGIN {r=0; num=0} {if ($3==2) {r += $8^2 + $9^2 + $10^2; num++}} END {if (num > 0) print r/num}' filename
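
A quick way to check a one-liner like this is to feed it a few lines where the answer can be worked out by hand. The sketch below uses three made-up 10-column lines and writes the condition as an awk pattern rather than an if, which should behave the same way; the guard in the END block is my addition to avoid dividing by zero when nothing matches.

```shell
# Hypothetical 10-column data: only the first and third lines have $3 == 2.
mean_out=$(printf '%s\n' \
  '1 1 2 0 0 0 0 1 2 2' \
  '1 1 3 0 0 0 0 5 5 5' \
  '1 1 2 0 0 0 0 3 0 0' |
  awk '$3 == 2 {r += $8^2 + $9^2 + $10^2; num++} END {if (num > 0) print r/num}')
echo "$mean_out"   # prints 9, i.e. ((1+4+4) + (9+0+0)) / 2
```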

select lines that contain a 0 anywhere and have more than 20 columns, and print the first and last columns
awk '/0/ {if (NF > 20) print $1,$NF}' log.spparks.4 > hyd_diff_temp_2

select lines where the second column is larger than 20 and print the filename, ten times the first column, and the second column
awk ' {if($2 > 20) print FILENAME,$1*10,$2}' size_time_*_1


sed is generally used to replace text, but it can do considerably more. Useful tutorial here.
replace the first “size_time_” on each line with nothing
sed 's/size_time_//'

replace “size_time_someNumber_1” with “someNumber”
sed 's/size_time_\([0-9]*\)_1/\1/'
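
The two tools combine naturally: awk can tag each output line with FILENAME, and sed can then reduce that name to the number embedded in it. This is only a sketch under assumptions; the two files and their contents are invented, following the size_time_*_1 naming used above.

```shell
# Create two hypothetical data files named after the size_time_*_1 pattern.
tmpdir=$(mktemp -d)
printf '1 25\n2 10\n' > "$tmpdir/size_time_300_1"
printf '3 40\n'       > "$tmpdir/size_time_1200_1"

# awk keeps lines whose second column exceeds 20 and prefixes the file name;
# sed then replaces the whole name with the captured number (\1).
combined_out=$(cd "$tmpdir" &&
  awk '{if ($2 > 20) print FILENAME, $1*10, $2}' size_time_*_1 |
  sed 's/size_time_\([0-9]*\)_1/\1/')
echo "$combined_out"
# 1200 30 40
# 300 10 25

rm -rf "$tmpdir"
```

The shell expands the glob in sorted order, so size_time_1200_1 is read before size_time_300_1.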