Gentoo Forums
InfiniBand - a cheap way to _fast_ network, PC to PC?
szatox
Veteran

Joined: 27 Aug 2013
Posts: 1684

Posted: Sun Mar 19, 2017 7:35 pm

Ok, so there's one more thing to test:
I've done some reading on InfiniBand and it seems to have built-in distributed switching ability.
It should be possible to use dual-port IB cards to simply chain all the machines together without any switches in between and have them all start talking to each other without any extra configuration. And if all your cards are at least dual-port, creating loops should provide self-healing capabilities rather than flood the network with everlasting packets.

I also found some brief information about RDMA suggesting that it's possible to use a remote drive with local performance (sure, it will introduce a few ns of delay, you're not gonna notice, but no extra CPU load at all). This feature should work along the chain too, as it's infiniband fabric that does the switching rather than ethernet or ip layers.
Zucca
Veteran

Joined: 14 Jun 2007
Posts: 1312
Location: KUUSANKOSKI, Finland

Posted: Mon Mar 20, 2017 2:25 pm

Dang...
I just got the ETA for the parts I've ordered. I'll be very lucky if those arrive this week. More probably next week.

I'll start spamming here when I start setting up the IB connection between the two PCs here...
_________________
..: Zucca :..

Code:
ERROR: '--failure' is not an option. Aborting...
steveL
Watchman

Joined: 13 Sep 2006
Posts: 5101
Location: The Peanut Gallery

Posted: Mon Mar 20, 2017 5:54 pm

Hmm TIPC (CONFIG_TIPC) would be a really nice protocol to run over IB for a clustered setup.

Oh wow, there's a CONFIG_TIPC_MEDIA_IB available too :-)
Code:
% zgrep -F TIPC /proc/config.gz
CONFIG_TIPC=m
# CONFIG_TIPC_MEDIA_IB is not set
CONFIG_TIPC_MEDIA_UDP=y
(This is on sysresccd.)
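A minimal sketch for checking options like these on any kernel that exposes its config. The function name `kconfig_state` is made up for illustration; it just reads config text on stdin and reports what it finds for one option.

```shell
# Hypothetical helper: report the state of one kernel config option.
# Reads kernel config text on stdin (for example from `zcat /proc/config.gz`).
# Prints the value (y, m, ...), "unset" for "# ... is not set" lines,
# or "missing" if the option does not appear at all.
kconfig_state() {
    opt=$1
    awk -v opt="$opt" '
        $0 == ("# " opt " is not set") { state = "unset" }
        index($0, opt "=") == 1        { state = substr($0, length(opt) + 2) }
        END { print (state == "" ? "missing" : state) }
    '
}
```

Usage would be something like `zcat /proc/config.gz | kconfig_state CONFIG_TIPC_MEDIA_IB`, which should print `unset` for the config shown above.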
Zucca
Veteran

Joined: 14 Jun 2007
Posts: 1312
Location: KUUSANKOSKI, Finland

Posted: Tue Mar 21, 2017 10:11 am

Unfortunately I went with single port cards... ;(
The good thing is that those cards aren't too expensive. That's 2x 10Gbps ports.

However, at the moment I have only two desktop PCs. And I don't see InfiniBand coming to laptops any time soon. :P
I think the next step from IB is Thunderbolt. Afaik Thunderbolt too has direct memory access (for easy hacking :D).
Zucca
Veteran

Joined: 14 Jun 2007
Posts: 1312
Location: KUUSANKOSKI, Finland

Posted: Fri Mar 31, 2017 1:38 pm

So I finally got mail that the parts are on their way here. That means that I'll be rebuilding my setup next week. Probably on Tuesday.
I'll probably compile the kernel and install the other IB card in my server this weekend.

It's like Christmas and I'm the N64 kid. :twisted:
Zucca
Veteran

Joined: 14 Jun 2007
Posts: 1312
Location: KUUSANKOSKI, Finland

Posted: Fri Apr 07, 2017 8:52 am

Just an update here.
I have been working the last 4 days. Yes, four days.
I have plugged one card into my desktop and the kernel recognizes it. Today I'll install the other one in my server and compile the newest stable kernel.
Meanwhile I need to investigate a problem on the desktop PC... One of the SATA drives isn't supplying data but it's still connected. I've encountered this before and the problem was the SATA cable. I do hope it's the reason for the error messages again... If not, then I'm once again saved by btrfs.
szatox
Veteran

Joined: 27 Aug 2013
Posts: 1684

Posted: Tue Apr 11, 2017 6:27 am

Zucca, can you post an image of the IB port and plug you have and say which type it is?
I've noticed there are several connector types available, but very little documentation on this topic. Perhaps IB is so enterprisy everyone assumes you simply call some vendor and they deliver you everything in one, ready to use box.
I got lucky with second-hand adapters (or at least I think so right now, since I'm waiting for delivery), but the cables are going to be much more expensive and intercontinental delivery fee on top of that makes it much more than I'm willing to pay just to have a look at it before trashing the whole thing.
Zucca
Veteran

Joined: 14 Jun 2007
Posts: 1312
Location: KUUSANKOSKI, Finland

Posted: Tue Apr 11, 2017 8:31 am

szatox wrote:
Zucca, can you post an image of the IB port and plug you have and say which type it is?
I've noticed there are several connector types available, but very little documentation on this topic. Perhaps IB is so enterprisy everyone assumes you simply call some vendor and they deliver you everything in one, ready to use box.
Mine has one 4x connector. The non-optical one, i.e. copper. I quickly took a few pictures (yay! macro lens).
szatox wrote:
I got lucky with second-hand adapters (or at least I think so right now, since I'm waiting for delivery), but the cables are going to be much more expensive and intercontinental delivery fee on top of that makes it much more than I'm willing to pay just to have a look at it before trashing the whole thing.
Yeah. I paid the same price for the cable as for one card.

As for my progress: I'm about to start compiling the kernel. I had some problems with the new hardware, but I managed to pull off a workaround.

Oh, and if anyone has any tips on what kind of file sharing I should use between my server and desktop when using that InfiniBand...
I'd like to remove the ethernet cable. If so, then I'd need to set up ip-over-ib. I wonder if that causes much overhead..?
szatox
Veteran

Joined: 27 Aug 2013
Posts: 1684

Posted: Tue Apr 11, 2017 5:45 pm

Thanks, I think I'm about to get the same connector. Is it the one called microGiGaCN?
IB should actually reduce overhead when compared to ethernet.
Block access looks promising, stuff like LIO should be able to take advantage of RDMA. Ceph would certainly work too (hey, it's a _fast_ network), but when I searched for ceph and infiniband, nothing relevant showed up on the first page.
Zucca
Veteran

Joined: 14 Jun 2007
Posts: 1312
Location: KUUSANKOSKI, Finland

Posted: Tue Apr 11, 2017 10:11 pm

I've now managed to get the ip-over-ib to show up as ib0 network iface on both machines.
ifconfig ib0:
ib0: flags=4098<BROADCAST,MULTICAST>  mtu 2044
Infiniband hardware address can be incorrect! Please read BUGS section in ifconfig(8).
        infiniband 80:00:04:04:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00  txqueuelen 256  (InfiniBand)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
... Looking good so far. Although I had to manually modprobe ib_ipoib on my server...
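A rough sketch of the manual bring-up, for reference. The helper names `run` and `ipoib_up`, the `DRY_RUN` switch, and the example address with its /24 prefix are assumptions for illustration; only `modprobe ib_ipoib` and the `ib0` name come from the above.

```shell
# Hypothetical helpers, not part of any package. With DRY_RUN set to a
# non-empty value the commands are only printed; otherwise they are
# executed (which needs root).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# ipoib_up IFACE ADDRESS/PREFIX
ipoib_up() {
    iface=$1
    addr=$2
    run modprobe ib_ipoib                 # IPoIB driver that provides ib0
    run ip link set dev "$iface" up       # flags above show no UP yet
    run ip addr add "$addr" dev "$iface"  # static address, no DHCP involved
}
```

For example, `DRY_RUN=1; ipoib_up ib0 10.0.10.1/24` prints the three commands without touching the system. The interface will most likely stay without carrier until a subnet manager is running somewhere on the fabric.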

szatox wrote:
Is it the one called microGiGaCN?
I haven't heard that. I have only seen "4x connector" being mentioned... or something along those lines. The 12x is the much wider, but otherwise similar connector.

Now... I don't really know how I should set up the file sharing... I use sshfs normally. But it will cause overhead even when I have hardware AES crypto acceleration on both machines. I really don't need encrypted file sharing between these two machines. At first I was thinking about NFS (...over-ip-over-ib)... But I'm not actually sure what to use now. The more I dig into InfiniBand "features" the more I see possibilities. InfiniBand seems to be able to carry all kinds of things...
I guess I wish there was some InfiniBand for beginners/noobs guide. :D


Last edited by Zucca on Wed Apr 12, 2017 8:12 am; edited 1 time in total
John R. Graham
Administrator

Joined: 08 Mar 2005
Posts: 10045
Location: Somewhere over Atlanta, Georgia

Posted: Tue Apr 11, 2017 10:58 pm

Bah. What can it carry that Ethernet can't? Use standard mechanisms!

To elaborate just a little bit, I'm not surprised that there are mechanisms to move data in specialized ways to avoid typical network overhead for special purposes. But if you use those, then everything is special, and you probably have to write special code as well. Why not at least see what the performance is with standard mechanisms and standard solutions, for example NFS and TCP/IP.

- John
_________________
I can confirm that I have received between 0 and 499 National Security Letters.
Zucca
Veteran

Joined: 14 Jun 2007
Posts: 1312
Location: KUUSANKOSKI, Finland

Posted: Wed Apr 12, 2017 9:38 am

I simply tried to set an ip on ib0 on my server and start dhcpd there, but on my desktop it does not get the carrier. I think I need to "combine" all the four lanes before I can actually use ip-over-ib.
This needs RTFM.

Also I found out the name for the cable: SFF-8470 aka CX4.
szatox
Veteran

Joined: 27 Aug 2013
Posts: 1684

Posted: Wed Apr 12, 2017 10:18 pm

Thanks for that link, there's that little bit of so much needed information. Also, that SFF-8470 tag is more common in sales than any other name I tried so far.
At this point I opt for SAS cables (any reasons why it's a bad idea?). We will see in a few weeks, some box is gonna travel a few thousand miles.
Zucca
Veteran

Joined: 14 Jun 2007
Posts: 1312
Location: KUUSANKOSKI, Finland

Posted: Wed Apr 12, 2017 10:49 pm

szatox wrote:
At this point I opt for SAS cables (any reasons why it's a bad idea?)
Just make sure that you have compatible physical "connector securing" on the card and the cable.
I've seen thumb screw securing (like on VGA, D25 parallel, serial...) AND this "jaw hook" securing connectors. I think this "jaw hook" is what InfiniBand uses. The thumb screw may be more common in SAS use.
Re-check now?

I got better results by searching CX4 cable/connector. (It has also been used in 10Gbit ethernet.)

EDIT:
Oh.
The page I linked earlier actually mentions proper names:
https://www.cs-electronics.com/sff-8470/ wrote:
The SFF-8470 connector is a very robust connector, utilizing a die-cast outer shell, and is available in 3 ‘attachment’ styles:
● ‘Squeeze-Latch’
● ‘Pull-Latch’
● ‘Thumbscrew’
I have those pull-latch connectors... Those look much like squeeze-latch. The way those connectors are released may be the only difference, making them possibly interchangeable between cards. But those thumb screw ones definitely aren't. The connector may plug in and connect electrically, but I bet it isn't physically secure.
szatox
Veteran

Joined: 27 Aug 2013
Posts: 1684

Posted: Fri Apr 14, 2017 10:10 pm

Yes, I found some papers stating that plugs with screws are keyed differently than plugs with latches just to ensure they wouldn't be mixed. Type of latches is only briefly mentioned, so it seems to be irrelevant.
I've just received my cards with latched ports. I'm building the basic stuff right now, wanna at least power on the new hardware.

Have you managed to get your IB to work by now?
Zucca
Veteran

Joined: 14 Jun 2007
Posts: 1312
Location: KUUSANKOSKI, Finland

Posted: Sun Apr 16, 2017 1:31 pm

szatox wrote:
Have you managed to get your IB to work by now?
Easter got in my way. I didn't have time to do much. But I gathered some links.
I'll be setting up the server side today... I think I'm missing a few features from the kernel.
Zucca
Veteran

Joined: 14 Jun 2007
Posts: 1312
Location: KUUSANKOSKI, Finland

Posted: Mon Apr 17, 2017 10:40 am

So.
Current status on this side: More RTFM needed. Trying things out with trial-and-error if something comes to mind.
I've set proper USE flags for sys-fabrics/opensm and am now investigating all the new packages I have.

So far I can set an ip for the ib0 interface manually (ifconfig). But setting it up by configuring /etc/conf.d/net and starting the net.ib0 service does not set any ips on the interface. Even if set manually, the connection does not work. You see? More RTFM needed.

I needed to modprobe ib_mad to be able to start opensm service.
My opensm.log gets a whole lot of errors:
/var/log/opensm.log:
Apr 17 14:02:31 000587 [7ED8F700] 0x02 -> SUBNET UP
Apr 17 14:02:41 000140 [7DD8D700] 0x01 -> log_send_error: ERR 5411: DR SMP Send completed with error (IB_TIMEOUT) -- dropping
                        Method 0x1, Attr 0x11, TID 0x12c1
Apr 17 14:02:41 000178 [7DD8D700] 0x01 -> Received SMP on a 1 hop path: Initial path = 0,1, Return path  = 0,0
Apr 17 14:02:41 000192 [7DD8D700] 0x01 -> sm_mad_ctrl_send_err_cb: ERR 3113: MAD completed in error (IB_TIMEOUT): SubnGet(NodeInfo), attr_mod 0x0, TID 0x12c1
Apr 17 14:02:41 000215 [7DD8D700] 0x01 -> sm_mad_ctrl_send_err_cb: ERR 3120 Timeout while getting attribute 0x11 (NodeInfo); Possible mis-set mkey?
Apr 17 14:02:41 000589 [7ED8F700] 0x02 -> SUBNET UP


So. I'll start searching for the answers.
NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 41112
Location: 56N 3W

Posted: Mon Apr 17, 2017 10:53 am

Zucca,

A whole physical network with only a single interface on it is going to be a very lonely place :)
There won't be any dhcp server for you so you need to set the IP statically.

Maybe I'm teaching my granny to suck eggs because I'm misreading your post?
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
Zucca
Veteran

Joined: 14 Jun 2007
Posts: 1312
Location: KUUSANKOSKI, Finland

Posted: Mon Apr 17, 2017 11:37 am

I actually just managed to make them connect.
I started opensm on both machines. And I got "SUBNET UP" messages showing up on my server.

NeddySeagoon wrote:
A whole physical network with only a single interface on it is going to be a very lonely place :)
There won't be any dhcp server for you so you need to set the IP statically.
I have Mellanox cards in my desktop and server. The server sets static ip settings on the IB interface (apparently I need to get opensm working properly first). After the interfaces (I have set up one wired and one wireless ethernet too) are up and configured, dnsmasq starts a dhcpd that listens on each of them. For now the others work as planned. The ib0 just needs more care and love. :)

My desktop is another case... opensm hangs badly. It works somewhat, but trying to send INT just does nothing and it reacts slowly to KILL also.
Then I saw that ibacm failed to compile with message
snippet from ibacm build log:
libtool: link: x86_64-pc-linux-gnu-gcc -g -Wall -D_GNU_SOURCE -DSYSCONFDIR=\"/etc\" -DBINDIR=\"/usr/bin\" -DRDMADIR=\"rdma\" -O2 -pipe -march=bdver2 -msse -msse2 -msse3 -mssse3 -msse4a -msse4.2 -msse4.1 -mvzeroupper -Wl,-O1 -Wl,--as-needed -o svc/ibacm src/svc_ibacm-acm.o  /usr/lib64/libibumad.so -libverbs
/usr/lib/gcc/x86_64-pc-linux-gnu/5.4.0/../../../../x86_64-pc-linux-gnu/bin/ld: src/svc_ibacm-acm.o: undefined reference to symbol 'pthread_create@@GLIBC_2.2.5'
/lib64/libpthread.so.0: error adding symbols: DSO missing from command line
... so I'll try to solve the problems on my desktop PC now, before trying to tinker any more on the server side.
Zucca
Veteran

Joined: 14 Jun 2007
Posts: 1312
Location: KUUSANKOSKI, Finland

Posted: Mon Apr 17, 2017 5:33 pm

I read a few bits here and there.
EDIT: I also remembered that at least Mellanox HCAs don't particularly like suspending or hibernating. To circumvent the problem, I think you need to compile some of the InfiniBand stuff as modules. Then, before suspend/hibernate, take down everything InfiniBand related and unload the modules. /EDIT
I've now managed to actually connect the two interfaces... Or HCAs should I say.
Running IBPATH="/usr/sbin/" ibnodes lists both nodes on both computers.
However, when trying to ping using ibping -G <port GUID>, nothing passes through. 100% packet loss.

Anyway now starting the net.ib0 service on my server actually configures the interface. My desktop side is the problem, I guess...
R0b0t1
Apprentice

Joined: 05 Jun 2008
Posts: 255

Posted: Mon Apr 17, 2017 7:26 pm

I'm somewhat confused - how different is IB from an IP/Ethernet interface? I am pretty sure that e.g. in the case of Ethernet, which is physical layer, you could theoretically use something other than IP over it though this is "never" done in practice. Is this distinction more pronounced with IB? What is the talk about needing to tunnel IP about?
Zucca
Veteran

Joined: 14 Jun 2007
Posts: 1312
Location: KUUSANKOSKI, Finland

Posted: Tue Apr 18, 2017 9:35 am

R0b0t1 wrote:
I'm somewhat confused - how different is IB from an IP/Ethernet interface?
As I'm not an expert (but learning every day) I'll just leave this here.

I'll also add that to my IB resource page.
1clue
Advocate

Joined: 05 Feb 2006
Posts: 2348

Posted: Tue Apr 18, 2017 2:34 pm

Zucca wrote:
As I'm not an expert (but learning every day) I'll just leave this here...


This page reminds me yet again how much overhead there is on the term "reliable delivery." Back when I was at IBM they had a concept, AnyNet I think it was. It guaranteed delivery across any network hardware. The problem was, if one link went down the entire network came to a halt. Not talking PCs here, minis and mainframes. We had a project which required lots of data to arrive in another state (in the USA) and AnyNet kept breaking us. On a lark we tried TCP/IP and it went much faster. We started looking at app requirements and realized we could manage out-of-order packets easily enough in the app, and we could afford to drop a few packets. We went to UDP and our throughput for the entire data set was incredibly faster than AnyNet.

Since then, with every project I do my own networking on, I'm extremely prejudiced against TCP, and against any sort of system which guarantees packets to arrive, and to arrive in order.

Especially if you have any real-time data going through. With real-time packets chances are the data being sent is significantly smaller than a packet. It's easy to backfill old data with timestamps into the packet and in doing so give redundancy. On the receiving end, keep tabs on missing packets and when you get a packet be sure to back-fill anything still in the window.
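The back-filling idea can be sketched with a toy receiver. Everything about the format here is an assumption for illustration, not something from a real project: each datagram is one text line of space-separated `timestamp=value` pairs (the newest sample plus a few older ones), timestamps are integers, and the function name `backfill_merge` is made up.

```shell
# Toy receiver: merge redundant samples from whatever datagrams arrived.
# Input (stdin): one datagram per line, "ts=value" pairs in any order.
# Output: one "ts value" line per distinct timestamp, sorted by timestamp.
backfill_merge() {
    awk '
        {
            for (i = 1; i <= NF; i++) {
                n = split($i, kv, "=")
                if (n == 2 && !(kv[1] in seen)) {
                    seen[kv[1]] = kv[2]   # first copy wins, later duplicates are ignored
                }
            }
        }
        END { for (ts in seen) print ts, seen[ts] }
    ' | sort -n
}
```

If the sender emits `1=a`, `2=b 1=a`, `3=c 2=b 1=a`, `4=d 3=c 2=b`, `5=e 4=d 3=c` and the third datagram is lost, feeding the remaining four lines to `backfill_merge` still yields samples 1 through 5, because sample 3 rides along in the later datagrams.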

With lossless streaming and guaranteed delivery, the larger the data being sent the more likely you have to resend a packet. As mentioned in the above URL that resending is not a lightweight task, and if the network has a lot of errors or is heavily congested it becomes a monumental task. Same thing if the data is being sent through a lot of routers, the latency builds up and timeouts become a real possibility.
Zucca
Veteran

Joined: 14 Jun 2007
Posts: 1312
Location: KUUSANKOSKI, Finland

Posted: Tue Apr 18, 2017 10:38 pm

So far I've gotten IPoIB to work. However, I cannot get an ip from my server using dhcp. Setting the ip manually seems to work, since pings drop to about 50% via ib0. I tested pings using "normal" ping and ibping. I read that tuning mtu might affect this. Although then the mode needs to be changed from "datagram" to "connected".
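A minimal sketch of the mode switch, assuming the usual sysfs layout where each IPoIB interface has a `mode` file under /sys/class/net. The function name `ipoib_set_mode` and its optional third argument (an alternative sysfs root, useful for trying it against a scratch directory) are made up for illustration.

```shell
# Hypothetical helper: switch an IPoIB interface between "datagram" and
# "connected" mode by writing to its sysfs "mode" file.
# ipoib_set_mode IFACE MODE [SYSFS_ROOT]   (real use needs root)
ipoib_set_mode() {
    iface=$1
    mode=$2
    root=${3:-/sys/class/net}
    case $mode in
        datagram|connected) ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    printf '%s\n' "$mode" > "$root/$iface/mode"
}
```

Something like `ipoib_set_mode ib0 connected && ip link set dev ib0 mtu 65520` would then raise the MTU from the 2044 seen in datagram mode; 65520 is the usual connected-mode maximum. Both ends should use the same mode.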

I'll continue testing... I think I'll test raw ip speeds using netcat...
Zucca
Veteran

Joined: 14 Jun 2007
Posts: 1312
Location: KUUSANKOSKI, Finland

Posted: Wed Apr 19, 2017 3:23 pm    Post subject: Speed tests 1

So. I quickly did some testing.
listen on server:
netcat -v -v -l -n -p 2002 > /dev/null
send from client/desktop:
yes | pv | nc -v -v -n 10.0.10.1 2002
... results about 245 Mbytes/s.
Things to note:
  • ip layer on top of InfiniBand
  • tcp on ip
  • content sent
  • client side has the HCA card on 4x PCIe slot, while it could utilize 8x.
...all affect speed.
Interestingly if I pass "$RANDOM" as an argument to 'yes', the speed grows to around 270Mbytes/s.

Still there's a lot of room for improvement.
EDIT: But as a comparison I get around 110 Mbytes/s via the 1Gigabit ethernet by passing "$RANDOM" to 'yes'.

EDIT2: Using qperf:
qperf <server> tcp_bw tcp_lat:
tcp_bw:                           
    bw  =  392 MB/sec             
tcp_lat:                           
    latency  =  76 us             
qperf <server> udp_bw udp_lat:

udp_bw:                           
    send_bw  =  516 MB/sec         
    recv_bw  =  309 MB/sec         
udp_lat:                           
    latency  =  2 sec             
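To put these numbers next to the link speed, a small conversion sketch. It assumes the reported figures are 10^6 bytes per second and that the cards are 4x SDR (10 Gbit/s signalling, about 8 Gbit/s of data after 8b/10b encoding); the function name `mbytes_to_gbits` is made up.

```shell
# Hypothetical helper: convert MB/s (10^6 bytes per second) to Gbit/s.
mbytes_to_gbits() {
    awk -v mb="$1" 'BEGIN { printf "%.2f\n", mb * 8 / 1000 }'
}

# Figures quoted in this post: GbE, netcat, netcat with $RANDOM, qperf tcp_bw
for rate in 110 245 270 392; do
    printf '%s MB/s = %s Gbit/s\n' "$rate" "$(mbytes_to_gbits "$rate")"
done
```

Under those assumptions even the qperf TCP result is a bit over 3 Gbit/s, well short of what a 4x SDR link should be able to carry, so the bottleneck is probably in the IP/TCP path or the PCIe slot rather than the fabric.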
All times are GMT
Page 2 of 4

 