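Using multidimensional arrays in awk
awk joins the parts of a multidimensional subscript with SUBSEP, which defaults to \034. To get readable output, either set SUBSEP yourself, or split the key on SUBSEP and print the parts with OFS.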
awk -F',' 'BEGIN{SUBSEP=",";OFS=","}{sum[$1,$2]+=$3}END{for(i in sum) print i,sum[i]}' file.csv
awk -F',' 'BEGIN{OFS=","}{sum[$1,$2]+=$3}END{for(i in sum){split(i,key,SUBSEP); print key[1],key[2],sum[i]}}' file.csv
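A minimal worked sketch of the split approach, using made-up inline rows in place of file.csv (the sample data and the result variable are assumptions for illustration). The output is piped through sort only because for-in traversal order is unspecified.

```shell
# Sample rows are hypothetical: region,product,amount
result=$(printf '%s\n' 'east,apple,3' 'east,apple,4' 'west,pear,5' |
  awk -F',' 'BEGIN { OFS = "," }
       { sum[$1,$2] += $3 }            # key is $1 SUBSEP $2
       END {
         for (k in sum) {
           split(k, key, SUBSEP)       # recover the two key parts
           print key[1], key[2], sum[k]
         }
       }' | sort)
echo "$result"
```

The two east,apple rows collapse into one summed row, and west,pear passes through unchanged.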
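for-in traverses an array in unspecified order. In gawk, asort sorts by value and asorti sorts by subscript. Called with a single argument, asort replaces the original subscripts with the integers 1 through the array length, so the original keys are lost; asorti with a second array leaves the source array intact and stores the sorted subscripts in the destination.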
awk -F',' 'BEGIN{SUBSEP=",";OFS=","}{sum[$1,$2]+=$3}END{len=asort(sum); for(i=1;i<=len;i++) print i,sum[i]}' file.csv
awk -F',' 'BEGIN{SUBSEP=",";OFS=","}{sum[$1,$2]+=$3}END{len=asorti(sum, sumindex); for(i=1;i<=len;i++) print sumindex[i],sum[i]}' file.csv
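asort and asorti are gawk extensions. Where only a POSIX awk is available, one possible sketch is to print the rows unsorted and let sort(1) order them, which also keeps the keys next to their values. The sample rows and the by_value variable below are assumptions for illustration.

```shell
# Hypothetical sample rows: region,product,amount
by_value=$(printf '%s\n' 'east,apple,3' 'west,pear,9' 'east,apple,4' 'north,fig,1' |
  awk -F',' 'BEGIN { SUBSEP = ","; OFS = "," }
       { sum[$1,$2] += $3 }
       END { for (k in sum) print k, sum[k] }' |
  sort -t',' -k3,3n)                   # numeric sort on the third field
echo "$by_value"
```

Changing the sort key to -k1,2 would order by the two key fields instead of by the summed value.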
Finding where the column count changes with awk
awk '{print NR,NF}' yourfile | awk '!a[$2]++'
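The same check can be sketched as a single awk process that reports the first line number on which each distinct field count appears. The inline input and the changes variable are assumptions for illustration.

```shell
# Hypothetical input: line 3 has 2 fields, the others have 3
changes=$(printf '%s\n' 'a b c' 'd e f' 'g h' 'i j k' |
  awk '!seen[NF]++ { print NR, NF }')  # first line number for each distinct NF
echo "$changes"
```

If the output has more than one row, the file mixes field counts, and each row gives a line number to start looking at.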
Removing duplicate lines with awk
awk '!a[$0]++' yourfile
awk '!($0 in a){a[$0];print}' yourfile
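A small sketch of the first form on made-up input (the unique variable is an assumption for illustration). Unlike sort -u, it keeps the first occurrence of each line and preserves the original order.

```shell
# seen[$0]++ is 0 (false) the first time a line appears, so !seen[$0]++
# is true only for first occurrences, and the default action prints them
unique=$(printf '%s\n' 'b' 'a' 'b' 'c' 'a' | awk '!seen[$0]++')
echo "$unique"
```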
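Passing external variables to awk
Close the single quotes around the shell variable, written as "'"$var"'", so the shell expands it inside the awk program: awk '($1=="'"$var"'"){print "'"$var"'","'"$var2"'",$1}' yourfile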
awk '{print var1, var2}' var1=111 var2=222 yourfile
awk -v var1=111 -v var2=222 '{print var1,var2}' yourfile
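The two assignment forms differ in timing, which the following sketch tries to show: a -v assignment takes effect before BEGIN runs, while an assignment placed among the file arguments is applied only when awk reaches it in the argument list. The variable names name, s, with_v and trailing are assumptions for illustration.

```shell
name='hello world'                       # hypothetical shell variable

# -v assigns before BEGIN runs, so BEGIN can read the value
with_v=$(awk -v s="$name" 'BEGIN { print s }')

# A trailing assignment is applied only when awk reaches it in the
# argument list, so BEGIN sees an empty value but main rules see it
trailing=$(printf 'x\n' | awk 'BEGIN { print "[" s "]" } { print s }' s="$name" -)
echo "$with_v"
echo "$trailing"
```

Quoting the value as s="$name" keeps values that contain spaces intact, which the quote-splicing form does not guarantee unless the variable is also double-quoted.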
Comparing files with awk
grep -vxFf yourfile1 yourfile2 (v: invert the match, x: match whole lines only, F: treat patterns as fixed strings, f: read one pattern per line from yourfile1)
awk '{if(FILENAME=="yourfile1")a[$0]=1;else{if(a[$0]!=1) print $0}}' yourfile1 yourfile2
awk 'NR==FNR{a[$0]=1} NR>FNR&&(!($0 in a)){print}' yourfile1 yourfile2
awk 'NR==FNR{a[$0]++} NR>FNR&&!a[$0]++' yourfile1 yourfile2
awk 'ARGIND==1{a[$0]} ARGIND>1&&!($0 in a){print}' yourfile1 yourfile2  # ARGIND is gawk-only
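A minimal sketch of the NR==FNR idiom on two throwaway files, printing the lines of the second file that do not occur in the first. The temporary paths and the only_in_2 variable are assumptions for illustration, standing in for yourfile1 and yourfile2.

```shell
# Hypothetical temporary files standing in for yourfile1 and yourfile2
tmp=$(mktemp -d)
printf '%s\n' 'a' 'b' 'c' > "$tmp/f1"
printf '%s\n' 'b' 'd' 'a' 'e' > "$tmp/f2"

# While reading the first file NR equals FNR: remember each line and skip on.
# For later files, print only lines that were not remembered.
only_in_2=$(awk 'NR==FNR { seen[$0]; next } !($0 in seen)' "$tmp/f1" "$tmp/f2")
echo "$only_in_2"
rm -rf "$tmp"
```

One caveat with this idiom: if the first file is empty, NR==FNR also holds while the second file is read, so nothing is printed. The FILENAME and ARGIND variants do not have that problem.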