1. Introduction to sed (Stream Editor)
sed is a non-interactive stream editor: it reads input line by line, applies the editing commands you give it, and writes the result to standard output. It's particularly useful for batch editing files.
Basic sed Usage:
sed 's/old_text/new_text/' file.txt # Replace first occurrence in each line
sed 's/old_text/new_text/g' file.txt # Replace all occurrences
sed -i 's/old_text/new_text/g' file.txt # Edit file in-place (GNU sed; BSD/macOS sed needs -i '')
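To make the difference between these forms concrete, here is a minimal sketch on made-up input. The sample text and the temporary file are illustrative assumptions. The in-place step uses the -i.bak form, which keeps a backup copy and is accepted by both GNU and BSD sed.

```shell
# Sample input: one line with two occurrences of "cat" (illustrative data).
line='cat and cat'

# Without the g flag, only the first match on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/cat/dog/')

# With the g flag, every match on the line is replaced.
all_matches=$(printf '%s\n' "$line" | sed 's/cat/dog/g')

# In-place edit on a throwaway file, keeping the original as <file>.bak.
tmpfile=$(mktemp)
printf '%s\n' "$line" > "$tmpfile"
sed -i.bak 's/cat/dog/g' "$tmpfile"
edited=$(cat "$tmpfile")
backup=$(cat "$tmpfile.bak")
rm -f "$tmpfile" "$tmpfile.bak"

printf '%s\n' "$first_only"   # dog and cat
printf '%s\n' "$all_matches"  # dog and dog
printf '%s\n' "$edited"       # dog and dog
printf '%s\n' "$backup"       # cat and cat
```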
Advanced sed Techniques:
- Using address ranges:
sed '2,5d' file.txt # Delete lines 2 through 5
- Multiple editing commands:
sed -e 's/foo/bar/g' -e 's/baz/qux/g' file.txt # Apply both substitutions, in order, to each line
- Using regular expressions:
sed 's/[0-9]\{3\}-[0-9]\{2\}-[0-9]\{4\}/XXX-XX-XXXX/g' file.txt # Mask SSN-shaped numbers (NNN-NN-NNNN)
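The three techniques above can be tried without any files by piping small test inputs into sed. This is only a sketch: the input strings, including the SSN-shaped number, are invented for illustration.

```shell
# Address range: delete lines 2 through 5 of a six-line input.
kept=$(printf 'one\ntwo\nthree\nfour\nfive\nsix\n' | sed '2,5d')

# Two -e commands run in order on every line.
swapped=$(printf 'foo baz foo\n' | sed -e 's/foo/bar/g' -e 's/baz/qux/g')

# Regular expression: mask anything shaped like NNN-NN-NNNN (the number is fake).
masked=$(printf 'SSN: 123-45-6789\n' | sed 's/[0-9]\{3\}-[0-9]\{2\}-[0-9]\{4\}/XXX-XX-XXXX/g')

printf '%s\n' "$kept"     # one, then six
printf '%s\n' "$swapped"  # bar qux bar
printf '%s\n' "$masked"   # SSN: XXX-XX-XXXX
```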
2. Deep Dive into awk
awk is a versatile programming language designed for text processing and typically used as a data extraction and reporting tool.
Basic awk Usage:
awk '{print $1}' file.txt # Print first field of each line
awk -F: '{print $1}' /etc/passwd # Use colon as field separator
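As a quick check of how field splitting behaves, the sketch below feeds awk a couple of invented records instead of reading real files. The passwd-style lines are illustrative, not taken from an actual /etc/passwd.

```shell
# With -F:, fields are split on colons, so $1 is the user name.
users=$(printf 'root:x:0:0:root:/root:/bin/sh\ndaemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\n' | awk -F: '{print $1}')

# With the default separator, fields are split on runs of blanks
# and leading whitespace is ignored.
first_words=$(printf 'alpha   beta\n  gamma delta\n' | awk '{print $1}')

printf '%s\n' "$users"        # root, then daemon
printf '%s\n' "$first_words"  # alpha, then gamma
```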
Advanced awk Techniques:
- Using conditions:
awk '$3 > 100 {print $1, $3}' data.txt # Print fields 1 and 3 if field 3 > 100
- Built-in variables:
awk '{sum += $1} END {if (NR > 0) print "Average:", sum/NR}' numbers.txt # Average of field 1; NR is the number of records read
- Using functions:
awk 'function max(a,b){return (a>b)?a:b} {current = max($1, $2); print $0, current}' data.txt # Append the larger of fields 1 and 2 to each line
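A small sketch of these three techniques on made-up data follows. The names and numbers are assumptions chosen so the results are easy to check by hand.

```shell
# Illustrative records: name, quantity, score.
data='alice 10 150
bob 20 50
carol 30 300'

# Condition: only rows whose third field exceeds 100.
high=$(printf '%s\n' "$data" | awk '$3 > 100 {print $1, $3}')

# Built-in variable NR holds the number of records read so far.
average=$(printf '10\n20\n30\n' | awk '{sum += $1} END {print "Average:", sum/NR}')

# User-defined function: append the larger of the first two fields.
with_max=$(printf '3 7\n9 2\n' | awk 'function max(a,b){return (a>b)?a:b} {current = max($1, $2); print $0, current}')

printf '%s\n' "$high"      # alice 150, then carol 300
printf '%s\n' "$average"   # Average: 20
printf '%s\n' "$with_max"  # 3 7 7, then 9 2 9
```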
3. Combining sed and awk
sed and awk can be combined in powerful ways using pipes:
sed 's/,/ /g' data.csv | awk '{sum += $2} END {print "Total:", sum}'
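This command replaces commas with spaces in a CSV file, then sums the values in the second column. It assumes that no field contains spaces or quoted commas; for such input, setting the field separator directly with awk -F, is more robust.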
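The sketch below runs the pipeline on invented CSV data and compares it with a single awk call that uses -F, as the field separator. The two agree on simple input but give different totals once a field contains a space, because the sed step turns that field into two.

```shell
# Simple case: no spaces inside fields, so both approaches agree.
simple='apple,3
pear,4
plum,5'
piped_simple=$(printf '%s\n' "$simple" | sed 's/,/ /g' | awk '{sum += $2} END {print "Total:", sum}')
direct_simple=$(printf '%s\n' "$simple" | awk -F, '{sum += $2} END {print "Total:", sum}')

# A field with a space: after sed, "red apple 3" has "apple" as field 2,
# which awk treats as 0 in arithmetic.
spaced='red apple,3
pear,4'
piped_spaced=$(printf '%s\n' "$spaced" | sed 's/,/ /g' | awk '{sum += $2} END {print "Total:", sum}')
direct_spaced=$(printf '%s\n' "$spaced" | awk -F, '{sum += $2} END {print "Total:", sum}')

printf '%s\n' "$piped_simple"   # Total: 12
printf '%s\n' "$direct_simple"  # Total: 12
printf '%s\n' "$piped_spaced"   # Total: 4
printf '%s\n' "$direct_spaced"  # Total: 7
```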
4. Real-world Examples
Example 1: Log File Analysis
awk '$9 == 404 {print $7}' access.log | sort | uniq -c | sort -rn | head -10
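This command finds the 10 URLs that most often return 404 errors in an Apache access log. It assumes the common or combined log format, where field 7 is the request path and field 9 is the status code.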
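To try the pipeline without a real server log, the sketch below writes four invented entries in combined-log style to a temporary file. The addresses, timestamps, and paths are all made up for illustration.

```shell
# Throwaway log with two 404s for /missing.html, one for /old-page, and one 200.
log=$(mktemp)
printf '%s\n' \
  '10.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET /missing.html HTTP/1.1" 404 209' \
  '10.0.0.2 - - [10/Oct/2023:13:55:37 +0000] "GET /index.html HTTP/1.1" 200 1043' \
  '10.0.0.3 - - [10/Oct/2023:13:55:38 +0000] "GET /missing.html HTTP/1.1" 404 209' \
  '10.0.0.4 - - [10/Oct/2023:13:55:39 +0000] "GET /old-page HTTP/1.1" 404 209' \
  > "$log"

# Keep the path of each 404, count identical paths, sort by count descending.
top_404=$(awk '$9 == 404 {print $7}' "$log" | sort | uniq -c | sort -rn | head -10)
rm -f "$log"

# Prints a count followed by a path on each line, /missing.html (2) first,
# then /old-page (1). The exact padding before the count varies by platform.
printf '%s\n' "$top_404"
```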
Example 2: Data Cleaning
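sed -e 's/^[[:blank:]]*//' -e 's/[[:blank:]]*$//' -e '/^$/d' data.txt | awk '!seen[$0]++'
This pipeline trims leading and trailing spaces and tabs, removes empty lines, and drops duplicate lines, keeping the first occurrence of each in its original order. The [[:blank:]] class is used instead of [ \t] because not every sed treats \t inside a bracket expression as a tab.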
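Here is a sketch of the cleaning steps on a small invented input that has padding, blank lines, a whitespace-only line, and repeats. It uses the POSIX [[:blank:]] class for spaces and tabs, which is an assumption made for portability across sed implementations.

```shell
# Messy input: padded values, an empty line, a whitespace-only line, duplicates.
cleaned=$(printf '  apple  \n\n\tbanana\napple\n   \nbanana  \ncherry\n' |
  sed -e 's/^[[:blank:]]*//' -e 's/[[:blank:]]*$//' -e '/^$/d' |
  awk '!seen[$0]++')

# The awk idiom prints a line only the first time it is seen,
# so the original order of first occurrences is preserved.
printf '%s\n' "$cleaned"  # apple, then banana, then cherry
```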
Example 3: CSV to JSON Conversion
awk -F, 'BEGIN {print "["} {printf "%s  {\"name\": \"%s\", \"age\": %d, \"city\": \"%s\"}", (NR > 1 ? ",\n" : ""), $1, $2, $3} END {print "\n]"}' data.csv
This awk script converts a simple three-column CSV file (name, age, city, with no header row and no quoted fields) into a JSON array of objects.
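Getting the commas between objects right is the tricky part, since awk does not know which record is the last one until it reaches the end of the input. One common way around this, sketched below on two invented records, is to print the separator before every record except the first.

```shell
# Two illustrative records: name, age, city.
json=$(printf 'Alice,30,Paris\nBob,25,Berlin\n' |
  awk -F, '
    BEGIN { print "[" }
    {
      # Emit the separator before every record except the first.
      printf "%s  {\"name\": \"%s\", \"age\": %d, \"city\": \"%s\"}", (NR > 1 ? ",\n" : ""), $1, $2, $3
    }
    END { print "\n]" }
  ')

printf '%s\n' "$json"
# [
#   {"name": "Alice", "age": 30, "city": "Paris"},
#   {"name": "Bob", "age": 25, "city": "Berlin"}
# ]
```

This sketch does not escape quotes or backslashes inside field values, so it is only suitable for plain data.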
Tip:
When working with complex sed or awk commands, it's often helpful to build them up incrementally and test each part separately. Use echo or cat to pipe test data into your commands for quick iterations.
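As an illustration of that workflow, the sketch below grows a pipeline one stage at a time against the same invented two-line sample, checking the output after each addition.

```shell
# Test data with stray padding (illustrative).
sample='  alpha,1
  beta,2  '

# Stage 1: strip leading blanks only.
step1=$(printf '%s\n' "$sample" | sed 's/^[[:blank:]]*//')

# Stage 2: also strip trailing blanks.
step2=$(printf '%s\n' "$sample" | sed -e 's/^[[:blank:]]*//' -e 's/[[:blank:]]*$//')

# Stage 3: add the awk step that extracts the second CSV field.
step3=$(printf '%s\n' "$sample" | sed -e 's/^[[:blank:]]*//' -e 's/[[:blank:]]*$//' | awk -F, '{print $2}')

printf '%s\n' "$step2"  # alpha,1 then beta,2
printf '%s\n' "$step3"  # 1 then 2
```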