truc Advocate
Joined: 25 Jul 2005 Posts: 3199
|
Posted: Tue Jun 21, 2011 10:25 am Post subject: AWK & services log files: Two functions for IPv6 address |
|
|
I regularly use awk to generate quick statistics for the firewall, the nameserver and the proxy (e.g. kern.log, daemon.log, squid3/access.log).
Recently I've started to play with IPv6, and realized that IPv6 addresses are logged in their extended format by ip6tables but in their compressed format by other services. Switching between the two formats by hand is quite annoying, and the compressed format is much more readable than the extended one.
The two awk functions below do the conversion for you. You can include them (as shown below) in your awk scripts.
utils.awk: | function extended2compressedIP (ip,    lastPos, lastLength, maxLength, maxLength_start, maxLength_stop, nb, i, elt) {
    # IPv4: no compressed format
    if (ip !~ /:/)
        return ip
    # else IPv6
    lastPos = -1
    maxLength = 0
    maxLength_start = -1
    maxLength_stop = -1
    nb = split(ip, elt, ":")
    # find the longest run of 0 groups (if any); the first one wins on a tie
    for (i=1; i<=nb; i++) {
        # strip leading zeros, keeping a single "0" for an all-zero group
        sub(/^0+/, "", elt[i])
        sub(/^$/, "0", elt[i])
        if ("0" == elt[i]) {
            if (-1 == lastPos)
                lastPos = i
            # measure the run on every 0, so that a run reaching the last
            # group (AAAA:BBB:..:DDD:0:0 or 0:0:0:0:0:0:0:0) is seen too
            lastLength = i - lastPos + 1
            if (maxLength < lastLength) {
                maxLength = lastLength
                maxLength_start = lastPos
                maxLength_stop = i
            }
        } else {
            lastPos = -1
        }
    }
    # rebuild the address, replacing the selected run with "::"
    ip = ""
    for (i=1; i<=nb; i++) {
        if (maxLength_start == i) {
            # leading 0
            if (1 == i)
                ip = ":"
            i = maxLength_stop
        } else {
            ip = ip elt[i]
        }
        if (nb != i || maxLength_stop == i)
            ip = ip ":"
    }
    return ip
}
function compressed2extendedIP (ip,    i, j, len, elt, nb, missing) {
    # IPv4: no compressed format
    if (ip !~ /:/)
        return ip
    # else IPv6
    missing = -1
    nb = split(ip, elt, ":")
    ip = ""
    for (i=1; i<=nb; i++) {
        if (0 == length(elt[i])) {
            # there should be 8 groups (of 16 bits each, 2 bytes, 4 characters)
            if (0 != missing)
                missing = 8 - nb
            elt[i] = "0000"
            for (j=1; j<=missing; j++)
                elt[i] = elt[i] ":0000"
            # this has to be done only once (if at all)
            missing = 0
        } else {
            # 4 characters per group
            len = length(elt[i])
            for (j=len; j<4; j++)
                elt[i] = "0" elt[i]
        }
        ip = sprintf("%s%s%s", ip, (i>1 ? ":" : ""), elt[i])
    }
    return ip
} |
For example: Code: | < ~/ipv6 awk -f utils.awk --source '{ print $0, "->", extended2compressedIP($0) }'
2001:03ee:bd04:0054:04b9:0000:0000:d47a -> 2001:3ee:bd04:54:4b9::d47a
2a01:c916:0000:0004:0000:0000:0000:0036 -> 2a01:c916:0:4::36
192.168.54.122 -> 192.168.54.122
2001:03ee:bd04:0054:021b:fcff:feec:5a3c -> 2001:3ee:bd04:54:21b:fcff:feec:5a3c
ff02:0000:0000:0000:0000:0001:ff0a:e435 -> ff02::1:ff0a:e435
0000:0000:0000:0000:0000:0000:0000:0000 -> ::
fe80:0000:0000:0000:adad:7a8c:dad7:999b -> fe80::adad:7a8c:dad7:999b |
and now the other way around:
Code: | < ~/ipv6 awk -f utils.awk --source '{ print $0, "->", extended2compressedIP($0) }' | awk -f utils.awk --source '{ print $0, "->", compressed2extendedIP($NF) }'
2001:03ee:bd04:0054:04b9:0000:0000:d47a -> 2001:3ee:bd04:54:4b9::d47a -> 2001:03ee:bd04:0054:04b9:0000:0000:d47a
2a01:c916:0000:0004:0000:0000:0000:0036 -> 2a01:c916:0:4::36 -> 2a01:c916:0000:0004:0000:0000:0000:0036
192.168.54.122 -> 192.168.54.122 -> 192.168.54.122
2001:03ee:bd04:0054:021b:fcff:feec:5a3c -> 2001:3ee:bd04:54:21b:fcff:feec:5a3c -> 2001:03ee:bd04:0054:021b:fcff:feec:5a3c
ff02:0000:0000:0000:0000:0001:ff0a:e435 -> ff02::1:ff0a:e435 -> ff02:0000:0000:0000:0000:0001:ff0a:e435
0000:0000:0000:0000:0000:0000:0000:0000 -> :: -> 0000:0000:0000:0000:0000:0000:0000:0000
|
Please let me know if there are corner cases where these functions fail to do their job! _________________ The End of the Internet!
Last edited by truc on Tue Jun 21, 2011 2:10 pm; edited 2 times in total |
|
truc Advocate
Joined: 25 Jul 2005 Posts: 3199
|
Posted: Tue Jun 21, 2011 10:45 am Post subject: |
|
|
Now consider this really simple awk script, used to selectively print the requests for a given IP address in the squid3 default log file:
selIP.awk: | BEGIN {
    ip = extended2compressedIP(ip)
}
($3 == ip) {
    print
} |
Here is how you can use it:
Code: | </var/log/squid3/access.log awk -f utils.awk -f selIP.awk -v ip=2001:03ee:bd04:0054:021b:fcff:feec:5a3c |
But it also accepts a compressed IPv6 address
Code: | </var/log/squid3/access.log awk -f utils.awk -f selIP.awk -v ip=2001:3ee:bd04:54:21b:fcff:feec:5a3c |
And of course, it also works with an IPv4 address
Code: | </var/log/squid3/access.log awk -f utils.awk -f selIP.awk -v ip=192.168.54.21 |
It's up to you to convert the input address to the right format (the one used in the log file) once, in BEGIN, so that your awk script does not have to convert every log line and stays as efficient as possible.
Feel free to comment! |
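For logs that use the other format, the same idea can be turned around. A minimal sketch, assuming the firewall log carries the source as a `SRC=` field holding the extended, lowercase address (the first post says ip6tables logs the extended format); `sel_src` and `expand6` are hypothetical names, and `expand6` is only a compact stand-in for compressed2extendedIP so that the block is self-contained:

```sh
# Hypothetical helper: sel_src ADDRESS < logfile
# Prints the lines whose SRC= field is ADDRESS, typed in either format.
sel_src() {
    awk -v ip="$1" '
    # Compact expander (sketch): assumes a valid address, at most one "::"
    function expand6(ip,    side, nside, head, tail, nh, nt, i, out) {
        if (ip !~ /:/)
            return ip                      # IPv4: nothing to expand
        nside = split(tolower(ip), side, "::")
        nh = (side[1] == "") ? 0 : split(side[1], head, ":")
        nt = (nside < 2 || side[2] == "") ? 0 : split(side[2], tail, ":")
        out = ""
        for (i = 1; i <= nh; i++)          # groups before "::", zero padded
            out = out ":" substr("0000" head[i], length(head[i]) + 1)
        if (nside == 2)                    # groups hidden behind "::"
            for (i = nh + nt; i < 8; i++)
                out = out ":0000"
        for (i = 1; i <= nt; i++)          # groups after "::"
            out = out ":" substr("0000" tail[i], length(tail[i]) + 1)
        return substr(out, 2)              # drop the leading ":"
    }
    BEGIN { want = "SRC=" expand6(ip) }    # convert once, not per line
    {
        for (i = 1; i <= NF; i++)
            if ($i == want) { print; break }
    }'
}
```

The address is expanded once in BEGIN, so each log line only costs a few string comparisons, e.g. `sel_src fe80::adad:7a8c:dad7:999b < /var/log/kern.log`.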
|
khayyam Watchman
Joined: 07 Jun 2012 Posts: 6227 Location: Room 101
|
Posted: Thu Jun 07, 2012 11:29 pm Post subject: |
|
|
Thanks truc ...
I'm using awk more and more myself, having got hooked sometime last year. There should be a banner someplace that says "awk does more than '{print $1}'", as I often see the likes of "* |grep | sed 's/foo/ba/g' | awk '{print $1}'" when in most cases this could have been handled entirely by awk '/foo/{gsub(/foo/,"ba");print}' <(input).
If you're parsing really large log files, and know a little about which fields contain what, then awk can make parsing out specific data a snip, e.g.:
Code: | awk '$23=="DPT=25"' /var/log/iptables.log |
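A fixed field number such as `$23` only holds while every line has the same layout. A minimal sketch that looks for the `DPT=` field wherever it sits and counts lines per destination port; `count_dpt` is a hypothetical name, and the log layout (space-separated `KEY=value` fields) is an assumption:

```sh
# Hypothetical helper: count_dpt < logfile
# Prints "port count" pairs, one per destination port, sorted by port.
count_dpt() {
    awk '
    {
        # look at every field instead of trusting a fixed column number
        for (i = 1; i <= NF; i++)
            if ($i ~ /^DPT=/) {
                count[substr($i, 5)]++     # text after "DPT="
                break
            }
    }
    END {
        for (port in count)
            print port, count[port]
    }' | sort -n
}
```

For example: `count_dpt < /var/log/iptables.log`.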
One point though, these are not "functions", but "program files" ... not to be facetious.
I [:heart:] awk.
best ... khayyam |
|
truc Advocate
Joined: 25 Jul 2005 Posts: 3199
|
Posted: Sun Jun 10, 2012 2:40 pm Post subject: |
|
|
khayyam wrote: | Thanks truc ...
I'm using awk more and more myself, having got hooked sometime last year. There should be a banner someplace that says "awk does more than '{print $1}'", as I often see the likes of "* |grep | sed 's/foo/ba/g' | awk '{print $1}'" when in most cases this could have been handled entirely by awk '/foo/{gsub(/foo/,"ba");print}' <(input). |
True, but watch out when using some of the nice _gnu_ awk features, these are not posix and thus not everywhere!
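A concrete case from this thread: the `--source` option used in the first post is a gawk extension. A minimal sketch of a portable replacement, assuming the library file is small enough to be passed as program text on the command line; `awk_with_lib` is a hypothetical wrapper name:

```sh
# Hypothetical wrapper: awk_with_lib LIB 'PROGRAM' [awk arguments...]
# Same effect as gawk's  awk -f LIB --source 'PROGRAM', without gawk options.
awk_with_lib() {
    lib=$1 prog=$2
    shift 2
    # program text = the library functions, a newline, then the one-liner
    awk "$(cat "$lib")
$prog" "$@"
}
```

With it, the first example becomes `awk_with_lib utils.awk '{ print $0, "->", extended2compressedIP($0) }' < ~/ipv6`. Putting the one-liner in its own file and passing several `-f` options works as well.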
Quote: | If you're parsing really large log files, and know a little about which fields contain what, then awk can make parsing out specific data a snip, e.g.:
Code: | awk '$23=="DPT=25"' /var/log/iptables.log |
|
I've written many awk programs to generate statistics (iptables/ip6tables, dnsmasq, squid3 and a few others), and I'm even quite proud of the result! But honestly, now that I'm learning Perl, I realize that what's written and said everywhere is true: Perl is good at (among a great deal of other things) parsing text files.
Awk is often installed by default, but so is perl. I'm now using perl in my one-liners when I used to use awk. I still think it's important to know how to use awk (e.g. to avoid those grep | sed where a single awk call would have done it). But if you have some time to kill, you know what you can do!
Quote: | One point though, these are not "functions", but "program files" ... not to be facetious. |
Well, I actually share these two functions to be used within _your_ program files
BTW, the extended2compressedIP function is not correct on one point: http://tools.ietf.org/html/draft-ietf-6man-text-addr-representation-07#section-4.2.2 (the draft has since been published as RFC 5952)
Code: | 4.2.2. Handling One 16 Bit 0 Field
The symbol "::" MUST NOT be used to shorten just one 16 bit 0 field.
For example, the representation 2001:db8:0:1:1:1:1:1 is correct, but
2001:db8::1:1:1:1:1 is not correct. |
I did not take the time to correct this, since it did not look like it gathered a lot of interest out there
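One possible way to handle this, as a sketch only and not a drop-in change to utils.awk: expand the address to exactly 8 groups first, then compress only runs of two or more 0 groups, taking the first longest run. The names `compress6` and `rfc5952` are hypothetical, and addresses with an embedded dotted quad or invalid syntax are assumed not to occur:

```sh
# Hypothetical helper: compress6 ADDRESS...
# Prints each address in the short form described by the draft quoted above.
compress6() {
    printf '%s\n' "$@" | awk '
    # Sketch: assumes a valid address with at most one "::" and no dotted quad
    function rfc5952(ip,    side, nside, head, tail, nh, nt, g, n, i, run, best, start, out) {
        if (ip !~ /:/)
            return ip                       # IPv4: no compressed format
        # 1. expand to exactly 8 groups, so that "::" is never counted as one
        nside = split(tolower(ip), side, "::")
        nh = (side[1] == "") ? 0 : split(side[1], head, ":")
        nt = (nside < 2 || side[2] == "") ? 0 : split(side[2], tail, ":")
        n = 0
        for (i = 1; i <= nh; i++) g[++n] = head[i]
        if (nside == 2)
            for (i = nh + nt; i < 8; i++) g[++n] = "0"
        for (i = 1; i <= nt; i++) g[++n] = tail[i]
        # 2. strip leading zeros and find the first longest run of 0 groups;
        #    best starts at 1 so that a single 0 group never qualifies
        best = 1; start = 0; run = 0
        for (i = 1; i <= n; i++) {
            sub(/^0+/, "", g[i])
            if (g[i] == "") g[i] = "0"
            run = (g[i] == "0") ? run + 1 : 0
            if (run > best) { best = run; start = i - run + 1 }
        }
        # 3. rebuild, writing "::" in place of the selected run
        out = ""
        for (i = 1; i <= n; i++) {
            if (i == start) {
                out = out "::"
                i += best - 1
            } else {
                out = out ((out == "" || out ~ /::$/) ? "" : ":") g[i]
            }
        }
        return out
    }
    { print rfc5952($0) }'
}
```

With this, `compress6 2001:db8::1:1:1:1:1` prints 2001:db8:0:1:1:1:1:1, the form the draft calls correct. Expanding first also means that an already compressed input is handled the same way as an extended one.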
khayyam wrote: | I [:heart:] awk. |
That'd be sed for me! I love how twisted sed programs can be! |
|
khayyam Watchman
Joined: 07 Jun 2012 Posts: 6227 Location: Room 101
|
Posted: Mon Jun 11, 2012 1:40 pm Post subject: |
|
|
hey ...
truc wrote: | [...]Awk is often installed by default, but so is perl. I'm now using perl in my one-liners when I used to use awk. I still think it's important to know how to use awk (e.g. to avoid those grep | sed where a single awk call would have done it). But if you have some time to kill, you know what you can do! |
I've been put off by perl somewhat, and though I'd agree it outdoes awk, sed, ed and sh for text handling, I often get the feeling that it lacks something like transparency (for want of a better word). I was attempting to learn it some years back, but I quickly became frustrated when approaching other people's code; it was like staring into a dark recess. This could be seen as an advantage, and no doubt perl is "flexible" in terms of how it can be wielded, but I never got the sense that I was making any headway. This was no doubt exacerbated by the fact that I was under a heavy workload at that time. I should probably revisit it and see if I fare better without 12hr workdays.
truc wrote: | Quote: | One point though, these are not "functions", but "program files" ... not to be facetious. |
Well, I actually share these two functions to be used within _your_ program files |
yes, OK, but they are more like programs, and I was merely pointing out that in awk parlance the term "program file" is used. Anyhow, it's semantics: "module", "function", "library" ... take your pick.
best ... and thanks again for the {functions,*}
khay |
|