Gentoo Forums
How do I join pair of lines beginning with the same pattern?
VinzC
Watchman

Joined: 17 Apr 2004
Posts: 5098
Location: Dark side of the mood

Posted: Tue Jan 22, 2013 4:44 pm    Post subject: How do I join pair of lines beginning with the same pattern?

Hi all.

What I actually want is to merge two NFS exports files. My basic idea was to cat both files, sort them, and merge the lines that begin with the same pattern. After sorting I end up with something like this:
Code:
SHARE1 settings_1a settings_1b
SHARE1 settings_1c settings_1d
SHARE2 settings_2a
SHARE3 settings_3a settings_3b

So I'd like
Code:
SHARE1 settings_1a settings_1b settings_1c settings_1d
SHARE2 settings_2a
SHARE3 settings_3a settings_3b

of course. I just haven't been able to figure out how to do it with either sed or awk. [EDIT: because I don't have join on that system.]

So thanks in advance for any hint/suggestion.
_________________
Gentoo addict: tomorrow I quit, I promise!... Just one more emerge...
1739!


Last edited by VinzC on Tue Jan 22, 2013 10:54 pm; edited 1 time in total
py-ro
Veteran

Joined: 24 Sep 2002
Posts: 1734
Location: Velbert

Posted: Tue Jan 22, 2013 5:56 pm

Code:
man join


:wink:

Bye
Py
VinzC
Watchman

Joined: 17 Apr 2004
Posts: 5098
Location: Dark side of the mood

Posted: Tue Jan 22, 2013 10:53 pm

py-ro wrote:
Code:
man join


:wink:

Bye
Py

Argh... I should have added: I *don't* have join at hand!
Hu
Moderator

Joined: 06 Mar 2007
Posts: 21431

Posted: Tue Jan 22, 2013 11:41 pm

Are you sure? /usr/bin/join is part of sys-apps/coreutils, so a system with NFS and without join is an unusual system indeed. Is this a non-Linux system?
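
For systems that do have coreutils, here is a minimal sketch of what the join route could look like. The sample files f1 and f2 are made up for illustration (they are not from the thread), and the sketch assumes space-separated columns with the key in the first one.

```shell
# Sketch only: f1 and f2 are hypothetical sample files, not from the thread.
tmp=$(mktemp -d)
printf '%s\n' 'SHARE2 settings_2a' 'SHARE1 settings_1a settings_1b' > "$tmp/f1"
printf '%s\n' 'SHARE3 settings_3a settings_3b' 'SHARE1 settings_1c settings_1d' > "$tmp/f2"

# join expects both inputs sorted on the join field, in the same collation.
LC_ALL=C sort -k 1,1 "$tmp/f1" > "$tmp/f1.sorted"
LC_ALL=C sort -k 1,1 "$tmp/f2" > "$tmp/f2.sorted"

# -a 1 and -a 2 also print the lines that have no partner in the other file.
joined=$(LC_ALL=C join -a 1 -a 2 "$tmp/f1.sorted" "$tmp/f2.sorted")
printf '%s\n' "$joined"
rm -rf "$tmp"
```

Note that join pairs lines across two files; it does not merge duplicate keys that sit inside a single file.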
VinzC
Watchman

Joined: 17 Apr 2004
Posts: 5098
Location: Dark side of the mood

Posted: Wed Jan 23, 2013 7:54 am

Hu wrote:
Are you sure? /usr/bin/join is part of sys-apps/coreutils, so a system with NFS and without join is an unusual system indeed. Is this a non-Linux system?


Well, after all these years in the IT business I'd better know how to double-check and run a command :lol: . So yes, I'm sure. It's a QNAP network-attached storage device running BusyBox. No join.
cwr
Veteran

Joined: 17 Dec 2005
Posts: 1969

Posted: Wed Jan 23, 2013 2:22 pm

Are you trying to join two lines, or multiple lines, with the same header?

Will
John R. Graham
Administrator

Joined: 08 Mar 2005
Posts: 10587
Location: Somewhere over Atlanta, Georgia

Posted: Wed Jan 23, 2013 2:47 pm

Try
join.awk:
{
    if ($1 == LastKey) {
        sub("^" $1, "");
        Accumulator = Accumulator $0;
    } else {
        print Accumulator;
        Accumulator = $0;
        LastKey = $1;
    }
}

END {
    print Accumulator;
}
Then, per your suggestion, using sort, the following command:
Code:
sort file1 file2 file3 | awk -f join.awk
should work. No need to cat 'em first; that would be a "useless use of cat". :wink:

- John
_________________
I can confirm that I have received between 0 and 499 National Security Letters.
VinzC
Watchman

Joined: 17 Apr 2004
Posts: 5098
Location: Dark side of the mood

Posted: Wed Jan 23, 2013 3:14 pm

cwr wrote:
Are you trying to join two lines, or multiple lines, with the same header?

Will

Only two adjacent lines to join, based on the first column as the key. The results are sorted on the first column. There won't be more than two adjacent lines to join.
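
Under exactly those constraints (sorted input, at most two adjacent lines per key, single spaces between columns), a sed-only version is also possible. The following is a sketch, not something posted in the thread; the function name join_pairs is made up for illustration.

```shell
# Hypothetical helper: joins two ADJACENT lines that share the same first column.
# Assumes sorted input, single-space separators, and at most two lines per key.
join_pairs() {
    # $!N  : append the next input line to the pattern space (except on the last line)
    # s/// : if both lines start with the same first field, drop the newline and the second copy of the key
    # P; D : print the first line of the pattern space, delete it, and restart with what is left
    sed -e '$!N' -e 's/^\([^ ]*\)\( .*\)\n\1 /\1\2 /' -e 'P' -e 'D'
}

result=$(printf '%s\n' \
    'SHARE1 settings_1a settings_1b' \
    'SHARE1 settings_1c settings_1d' \
    'SHARE2 settings_2a' \
    'SHARE3 settings_3a settings_3b' | join_pairs)
printf '%s\n' "$result"
```

The P;D pair matters here: when two lines do not match, only the first one is printed and the second stays in the pattern space, so it can still pair up with the line after it.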

John R. Graham wrote:
Try
join.awk:
{
    if ($1 == LastKey) {
        sub("^" $1, "");
        Accumulator = Accumulator $0;
    } else {
        print Accumulator;
        Accumulator = $0;
        LastKey = $1;
    }
}

END {
    print Accumulator;
}
Then, per your suggestion, using sort, the following command:
Code:
sort file1 file2 file3 | awk -f join.awk
should work. No need to cat 'em first; that would be a "useless use of cat". :wink:

- John


Thank you very much John. Will try this and report the results. Just note that input files aren't "exactly" cat'ed. They're the result of some pre-processing done by sed and the output of some other QNAP config-file-builder command.

EDIT: Works! Thanks again!
cwr
Veteran

Joined: 17 Dec 2005
Posts: 1969

Posted: Wed Jan 23, 2013 5:30 pm

FWIW, here's a rough cut using AWK. It gets most of the way there, but needs to be edited to fit your circumstances.

Code:

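#!/usr/bin/awk -f
BEGIN {
  REGEXP="^SHARE[[:digit:]]+"
  OLD=""
  NEW=""
}

{
  if ($1 ~ REGEXP) {
    NEW=$1
    # Compare keys as plain strings; a regex match would treat SHARE1 and SHARE10 as equal.
    if (OLD != NEW) {
        print "zzz"
    }
    printf "%s", $0
  }
  # Remember the current key (OLD=$NEW would look up a field, not copy the variable).
  OLD=NEW
}

END {
}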


The "zzz" is a placemarker for debugging, and the duplicated $1 fields still need to be stripped.

Will
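
One way the streaming idea could be finished, sketched here with the key stripped by looping over the remaining fields instead of editing $0. This is an illustration rather than the script from the post; the function name merge_stream is made up, and runs of whitespace collapse to single spaces as a side effect.

```shell
# Hypothetical helper: prints the key once, then appends fields 2..NF of every
# following line that carries the same key. Expects input sorted on column 1.
merge_stream() {
    awk '
    NF == 0 { next }                      # ignore blank lines
    {
        key = $1 ""                       # force a string comparison below
        if (!started || key != last) {
            if (started) printf "\n"      # new key: close the previous output line
            printf "%s", key
        }
        for (i = 2; i <= NF; i++)         # append the settings without the key
            printf " %s", $i
        last = key
        started = 1
    }
    END { if (started) printf "\n" }
    '
}

result=$(printf '%s\n' \
    'SHARE1 settings_1a settings_1b' \
    'SHARE1 settings_1c settings_1d' \
    'SHARE2 settings_2a' \
    'SHARE3 settings_3a settings_3b' | merge_stream)
printf '%s\n' "$result"
```

Nothing is buffered beyond the previous key, so this also copes with more than two adjacent lines per key.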
John R. Graham
Administrator

Joined: 08 Mar 2005
Posts: 10587
Location: Somewhere over Atlanta, Georgia

Posted: Wed Jan 23, 2013 9:55 pm

@VinzC,

Minor correction in case you're going to be using this script over again. The previous version emits a spurious blank line at the beginning. :oops: One extra if clause fixes it:
join.awk v2:
{
    if ($1 == LastKey) {
        sub("^" $1, "");
        Accumulator = Accumulator $0;
    } else {
        if (Accumulator)
            print Accumulator;
        Accumulator = $0;
        LastKey = $1;
    }
}

END {
    print Accumulator;
}
- John
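
One caveat worth sketching: sub("^" $1, "") uses the key as a regular expression. For keys like SHARE1 that is harmless, but an export path such as /share/a+b contains a regex operator and would not be stripped. A possible variant that removes the key by position instead is shown below; the function name join_literal and the sample path are made up for illustration.

```shell
# Hypothetical variant of the accumulator script that treats the key as literal text.
# Expects input sorted on column 1.
join_literal() {
    awk '
    NF == 0 { next }                      # ignore blank lines
    {
        key = $1 ""                       # force a string comparison below
        if (have && key == last) {
            # Keep everything after the key; index() and substr() do no regex matching.
            acc = acc substr($0, index($0, $1) + length($1))
        } else {
            if (have) print acc
            acc = $0
            last = key
            have = 1
        }
    }
    END { if (have) print acc }
    '
}

result=$(printf '%s\n' '/share/a+b host1(rw)' '/share/a+b host2(ro)' | join_literal)
printf '%s\n' "$result"
```

The have flag also avoids printing an empty line when the input is empty.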


Last edited by John R. Graham on Wed Feb 06, 2013 12:36 pm; edited 1 time in total
VinzC
Watchman

Joined: 17 Apr 2004
Posts: 5098
Location: Dark side of the mood

Posted: Thu Jan 24, 2013 8:16 am

Thank you guys for your help. I haven't mastered awk, nor do I use it to its full power. So thanks for enhancing my knowledge.

@John

The initial blank line doesn't really hurt: the output is used to create the exports file for the QNAP NFS service each time the service is started, and blank lines are ignored there. Thanks a lot again for making it neat and clean.
dataking
Apprentice

Joined: 20 Apr 2005
Posts: 251

Posted: Thu Jan 24, 2013 5:14 pm

It would help to know what the input looked like.
_________________
-= the D@7@k|n& =-
VinzC
Watchman

Joined: 17 Apr 2004
Posts: 5098
Location: Dark side of the mood

Posted: Fri Jan 25, 2013 11:08 am

dataking wrote:
It would help to know what the input looked like.

VinzC wrote:
I'd like in fact to merge two NFS export files.

Any syntactically valid NFS exports definition file will do. And in a generalized way: any text file containing lines of data organized in space-separated columns, with the first column serving as a primary-foreign key.
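
For that generalized case, a sketch that needs neither sort nor join: collect the columns per key in an awk array and print the keys in the order they were first seen. The function name merge_by_key is made up for illustration; it assumes whitespace-separated columns and that the whole input fits in memory.

```shell
# Hypothetical helper: merges all lines sharing a first column, sorted or not.
# Takes file names as arguments, or reads standard input when none are given.
merge_by_key() {
    awk '
    NF == 0 { next }                          # ignore blank lines
    {
        key = $1
        if (!(key in merged)) {
            order[++n] = key                  # remember first-seen order
            merged[key] = key
        }
        for (i = 2; i <= NF; i++)             # append the remaining columns
            merged[key] = merged[key] " " $i
    }
    END { for (j = 1; j <= n; j++) print merged[order[j]] }
    ' "$@"
}

result=$(printf '%s\n' \
    'SHARE2 settings_2a' \
    'SHARE1 settings_1a settings_1b' \
    'SHARE3 settings_3a settings_3b' \
    'SHARE1 settings_1c settings_1d' | merge_by_key)
printf '%s\n' "$result"
```

Pipe the result through sort if the output has to be ordered by key.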