awk
Found this helpful for going through log files, since they are naturally separated into fields. Good general tutorial here.
Useful predefined variables. Good write-up here.
- FILENAME – name of the file you’re in
- NR – number of the line you’re on, counted across all input files (global)
- FNR – number of the line you’re on, relative to the current file
- NF – number of “fields” (words separated by the given delimiter) on the line
- FS – field separator (a single space is the default, which splits on any run of spaces or tabs)
- OFS – output field separator (space is the default)
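A quick way to see these variables in action is to print them for every line of a couple of small files. This is only a sketch: the file names one.txt and two.txt and their contents are made up for the illustration.

```shell
# Create two small sample files (hypothetical names and contents).
tmpdir=$(mktemp -d)
printf 'a b c\nd e\n' > "$tmpdir/one.txt"
printf 'f g h i\n' > "$tmpdir/two.txt"

# Print FILENAME, NR, FNR, NF and the last field for every line.
# OFS is set to a tab, so the commas in print emit tabs instead of spaces.
vars_out=$(cd "$tmpdir" &&
  awk 'BEGIN {OFS="\t"} {print FILENAME, NR, FNR, NF, $NF}' one.txt two.txt)
printf '%s\n' "$vars_out"

rm -rf "$tmpdir"
```

NR keeps counting when awk moves on to two.txt (it reaches 3), while FNR starts again at 1.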
Useful Code Snippets
select lines where the third column is equal to 2 and compute the mean sum of squares of the eighth, ninth and tenth columns
awk 'BEGIN {r=0; num=0} {if ($3==2) {r += $8^2 + $9^2 + $10^2; num++}} END {if (num > 0) print r/num}' filename
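To check the arithmetic, here is the same calculation on three made-up lines fed in through a pipe. The data is invented for the example, and the END block only prints when at least one line matched, which avoids a division by zero.

```shell
# Three hypothetical 10-column lines; only the first and third have $3 == 2.
# Line 1: 1^2 + 2^2 + 2^2 = 9, line 3: 2^2 + 3^2 + 6^2 = 49, mean = 58/2.
mean_out=$(printf '%s\n' \
  '1 1 2 0 0 0 0 1 2 2' \
  '1 1 3 0 0 0 0 5 5 5' \
  '1 1 2 0 0 0 0 2 3 6' |
  awk '$3 == 2 {r += $8^2 + $9^2 + $10^2; num++} END {if (num > 0) print r/num}')
echo "$mean_out"   # 29
```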
select lines that contain a 0 and have more than 20 columns, and print the first and last columns
awk '/0/ {if (NF > 20) print $1,$NF}' log.spparks.4 > hyd_diff_temp_2
select lines from a set of files where the second column is larger than 20 and print the filename, ten times the first column, and the second column
awk ' {if($2 > 20) print FILENAME,$1*10,$2}' size_time_*_1
sed
Generally used to replace text, but it can do much more. Useful tutorial here.
replace the first “size_time_” on each line with blank
sed 's/size_time_//'
replace “size_time_someNumber_1” with “someNumber”
sed 's/size_time_\([0-9]*\)_1/\1/'
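The two substitutions can be tried on a sample string, and the second one combines naturally with the FILENAME snippet from the awk section. This is a rough sketch: the name size_time_300_1, the second file size_time_450_1 and the file contents are all invented for the illustration.

```shell
# Apply both substitutions to a hypothetical file name.
name="size_time_300_1"
stripped=$(printf '%s\n' "$name" | sed 's/size_time_//')
number=$(printf '%s\n' "$name" | sed 's/size_time_\([0-9]*\)_1/\1/')
echo "$stripped"   # 300_1
echo "$number"     # 300

# Pipe awk output through sed so the first column shows only the number.
tmpdir=$(mktemp -d)
printf '1 25\n2 10\n' > "$tmpdir/size_time_300_1"
printf '3 40\n' > "$tmpdir/size_time_450_1"
pipeline_out=$(cd "$tmpdir" &&
  awk '$2 > 20 {print FILENAME, $1*10, $2}' size_time_*_1 |
  sed 's/size_time_\([0-9]*\)_1/\1/')
printf '%s\n' "$pipeline_out"
rm -rf "$tmpdir"
```

With this sample data the pipeline prints "300 10 25" and "450 30 40": the line with a second column of 10 is dropped by awk, and sed reduces each file name to the number captured by the \( \) group.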