Gentoo Forums
OpenSSL and AMD Cryptographic CoProcessor (CCP)

Gatak
Apprentice

Joined: 04 Jan 2004
Posts: 174

Posted: Thu Jul 23, 2020 11:43 am    Post subject: OpenSSL and AMD Cryptographic CoProcessor (CCP)

Hi!

I have an AMD CPU with a Cryptographic Coprocessor (CCP).

Is it possible to use the hardware crypto for things like OpenSSL/OpenSSH or TCP checksum calculations?

# openssl engine
Code:
(rdrand) Intel RDRAND engine
(dynamic) Dynamic engine loading support


# Kernel config
Code:

CONFIG_CRYPTO_DEV_CCP=y
CONFIG_CRYPTO_DEV_CCP_DD=y
CONFIG_CRYPTO_DEV_SP_CCP=y
CONFIG_CRYPTO_DEV_CCP_CRYPTO=y
CONFIG_CRYPTO_DEV_CCP_DEBUGFS=y
 | ┌────────────────────────────────────────────────────────────────────┐ │
 │ │    --- Hardware crypto devices                                     │ │
 │ │    < >   Support for VIA PadLock ACE                               │ │
 │ │    < >   Support for Microchip / Atmel ECC hw accelerator          │ │
 │ │    < >   Support for Microchip / Atmel SHA accelerator and RNG     │ │
 │ │    [*]   Support for AMD Secure Processor                          │ │
 │ │    <*>     Secure Processor device driver                          │ │
 │ │    [*]       Cryptographic Coprocessor device                      │ │
 │ │    <*>         Encryption and hashing offload support              │ │
 │ │    [*]       Platform Security Processor (PSP) device              │ │
 │ │    [*]     Enable CCP Internals in DebugFS                         │ │
 │ │    < >   Support for Intel(R) DH895xCC                             │ │


# dmesg | grep -i ccp
Code:

[    3.201014] ccp 0000:07:00.2: ccp enabled
[    3.211267] ccp 0000:07:00.2: tee enabled
[    3.211451] ccp 0000:07:00.2: psp enabled


# grep -i ccp /proc/crypto
Code:

# grep -i ccp /proc/crypto
driver       : rsa-ccp
driver       : hmac-sha512-ccp
driver       : sha512-ccp
driver       : hmac-sha384-ccp
driver       : sha384-ccp
driver       : hmac-sha256-ccp
driver       : sha256-ccp
driver       : hmac-sha224-ccp
driver       : sha224-ccp
driver       : hmac-sha1-ccp
driver       : sha1-ccp
driver       : cbc-des3-ccp
driver       : ecb-des3-ccp
driver       : gcm-aes-ccp
driver       : xts-aes-ccp
driver       : cmac-aes-ccp
driver       : rfc3686-ctr-aes-ccp
driver       : ctr-aes-ccp
driver       : ofb-aes-ccp
driver       : cfb-aes-ccp
driver       : cbc-aes-ccp
driver       : ecb-aes-ccp


# ls /dev/crypto
Code:
ls: cannot access '/dev/crypto': No such file or directory


# grep . /sys/kernel/debug/ccp/ -R
Code:

/sys/kernel/debug/ccp/ccp-1/q4/stats:  Total Queue Operations: 0
/sys/kernel/debug/ccp/ccp-1/q4/stats:                     AES: 0
/sys/kernel/debug/ccp/ccp-1/q4/stats:                 XTS AES: 0
/sys/kernel/debug/ccp/ccp-1/q4/stats:                     SHA: 0
/sys/kernel/debug/ccp/ccp-1/q4/stats:                     SHA: 0
/sys/kernel/debug/ccp/ccp-1/q4/stats:                     RSA: 0
/sys/kernel/debug/ccp/ccp-1/q4/stats:               Pass-Thru: 0
/sys/kernel/debug/ccp/ccp-1/q4/stats:                     ECC: 0
/sys/kernel/debug/ccp/ccp-1/q4/stats:      Enabled Interrupts: ERROR COMPLETION
/sys/kernel/debug/ccp/ccp-1/q3/stats:  Total Queue Operations: 0
/sys/kernel/debug/ccp/ccp-1/q3/stats:                     AES: 0
/sys/kernel/debug/ccp/ccp-1/q3/stats:                 XTS AES: 0
/sys/kernel/debug/ccp/ccp-1/q3/stats:                     SHA: 0
/sys/kernel/debug/ccp/ccp-1/q3/stats:                     SHA: 0
/sys/kernel/debug/ccp/ccp-1/q3/stats:                     RSA: 0
/sys/kernel/debug/ccp/ccp-1/q3/stats:               Pass-Thru: 0
/sys/kernel/debug/ccp/ccp-1/q3/stats:                     ECC: 0
/sys/kernel/debug/ccp/ccp-1/q3/stats:      Enabled Interrupts: ERROR COMPLETION
/sys/kernel/debug/ccp/ccp-1/q2/stats:  Total Queue Operations: 0
/sys/kernel/debug/ccp/ccp-1/q2/stats:                     AES: 0
/sys/kernel/debug/ccp/ccp-1/q2/stats:                 XTS AES: 0
/sys/kernel/debug/ccp/ccp-1/q2/stats:                     SHA: 0
/sys/kernel/debug/ccp/ccp-1/q2/stats:                     SHA: 0
/sys/kernel/debug/ccp/ccp-1/q2/stats:                     RSA: 0
/sys/kernel/debug/ccp/ccp-1/q2/stats:               Pass-Thru: 0
/sys/kernel/debug/ccp/ccp-1/q2/stats:                     ECC: 0
/sys/kernel/debug/ccp/ccp-1/q2/stats:      Enabled Interrupts: ERROR COMPLETION
/sys/kernel/debug/ccp/ccp-1/stats:Total Interrupts Handled: 0
/sys/kernel/debug/ccp/ccp-1/stats:        Total Operations: 0
/sys/kernel/debug/ccp/ccp-1/stats:                     AES: 0
/sys/kernel/debug/ccp/ccp-1/stats:                 XTS AES: 0
/sys/kernel/debug/ccp/ccp-1/stats:                     SHA: 0
/sys/kernel/debug/ccp/ccp-1/stats:                     SHA: 0
/sys/kernel/debug/ccp/ccp-1/stats:                     RSA: 0
/sys/kernel/debug/ccp/ccp-1/stats:               Pass-Thru: 0
/sys/kernel/debug/ccp/ccp-1/stats:                     ECC: 0
/sys/kernel/debug/ccp/ccp-1/info:Device name: ccp-1
/sys/kernel/debug/ccp/ccp-1/info:   RNG name: ccp-1-rng
/sys/kernel/debug/ccp/ccp-1/info:   # Queues: 3
/sys/kernel/debug/ccp/ccp-1/info:     # Cmds: 0
/sys/kernel/debug/ccp/ccp-1/info:    Version: 5
/sys/kernel/debug/ccp/ccp-1/info:    Engines: AES 3DES SHA RSA ECC ZDE TRNG
/sys/kernel/debug/ccp/ccp-1/info:     Queues: 5
/sys/kernel/debug/ccp/ccp-1/info:LSB Entries: 128


# lscpu
Code:

Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   43 bits physical, 48 bits virtual
CPU(s):                          4
On-line CPU(s) list:             0-3
Thread(s) per core:              2
Core(s) per socket:              2
Socket(s):                       1
NUMA node(s):                    1
Vendor ID:                       AuthenticAMD
CPU family:                      23
Model:                           24
Model name:                      AMD Athlon 3000G with Radeon Vega Graphics
Stepping:                        1
Frequency boost:                 enabled
CPU MHz:                         3673.187
CPU max MHz:                     3900.0000
CPU min MHz:                     1600.0000
BogoMIPS:                        7785.76
Virtualization:                  AMD-V
L1d cache:                       64 KiB
L1i cache:                       128 KiB
L2 cache:                        1 MiB
L3 cache:                        4 MiB
NUMA node0 CPU(s):               0-3
Vulnerability Itlb multihit:     Not affected
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Not affected
Vulnerability Meltdown:          Not affected
Vulnerability Spec store bypass: Vulnerable
Vulnerability Spectre v1:        Vulnerable: __user pointer sanitization and use
                                 rcopy barriers only; no swapgs barriers
Vulnerability Spectre v2:        Vulnerable, IBPB: disabled, STIBP: disabled
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Not affected
Flags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtr
                                 r pge mca cmov pat pse36 clflush mmx fxsr sse s
                                 se2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtsc
                                 p lm constant_tsc rep_good nopl nonstop_tsc cpu
                                 id extd_apicid aperfmperf pni pclmulqdq monitor
                                  ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes
                                 xsave avx f16c rdrand lahf_lm cmp_legacy svm ex
                                 tapic cr8_legacy abm sse4a misalignsse 3dnowpre
                                 fetch osvw skinit wdt tce topoext perfctr_core
                                 perfctr_nb bpext perfctr_llc mwaitx cpb hw_psta
                                 te sme ssbd sev ibpb vmmcall fsgsbase bmi1 avx2
                                  smep bmi2 rdseed adx smap clflushopt sha_ni xs
                                 aveopt xsavec xgetbv1 xsaves clzero irperf xsav
                                 eerptr arat npt lbrv svm_lock nrip_save tsc_sca
                                 le vmcb_clean flushbyasid decodeassists pausefi
                                 lter pfthreshold avic v_vmsave_vmload vgif over
                                 flow_recov succor smca

Gatak
Apprentice

Joined: 04 Jan 2004
Posts: 174

Posted: Thu Jul 23, 2020 1:43 pm    Post subject: Use openssl -engine afalg

So it seems it is possible to use the AMD CCP with openssl -engine afalg.

The speed improvement for AES is huge when you use larger block sizes! :D

# openssl speed -evp aes-192-cbc -engine afalg
Code:
engine "afalg" set.
Doing aes-192-cbc for 3s on 16 size blocks: 1685326 aes-192-cbc's in 0.45s
Doing aes-192-cbc for 3s on 64 size blocks: 1722473 aes-192-cbc's in 0.41s
Doing aes-192-cbc for 3s on 256 size blocks: 1543359 aes-192-cbc's in 0.40s
Doing aes-192-cbc for 3s on 1024 size blocks: 1127194 aes-192-cbc's in 0.33s
Doing aes-192-cbc for 3s on 8192 size blocks: 335502 aes-192-cbc's in 0.09s
Doing aes-192-cbc for 3s on 16384 size blocks: 180981 aes-192-cbc's in 0.06s
OpenSSL 1.1.1g  21 Apr 2020
built on: Thu Jul 23 11:19:52 2020 UTC
options:bn(64,64) rc4(8x,int) des(int) aes(partial) idea(int) blowfish(ptr)
compiler: x86_64-pc-linux-gnu-gcc -fPIC -pthread -m64 -Wa,--noexecstack -O2 -march=native -pipe -fno-strict-aliasing -Wa,--noexecstack -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAESNI_ASM -DVPAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPOLY1305_ASM -DZLIB -DNDEBUG  -DOPENSSL_NO_BUF_FREELISTS
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-192-cbc      59922.70k   268873.83k   987749.76k  3497717.14k 30538137.60k 49419878.40k


# openssl speed -evp aes-192-cbc
Code:
Doing aes-192-cbc for 3s on 16 size blocks: 139159126 aes-192-cbc's in 2.99s
Doing aes-192-cbc for 3s on 64 size blocks: 51864313 aes-192-cbc's in 2.99s
Doing aes-192-cbc for 3s on 256 size blocks: 13886330 aes-192-cbc's in 2.99s
Doing aes-192-cbc for 3s on 1024 size blocks: 3540324 aes-192-cbc's in 3.00s
Doing aes-192-cbc for 3s on 8192 size blocks: 444244 aes-192-cbc's in 2.99s
Doing aes-192-cbc for 3s on 16384 size blocks: 222334 aes-192-cbc's in 2.99s
OpenSSL 1.1.1g  21 Apr 2020
built on: Thu Jul 23 11:19:52 2020 UTC
options:bn(64,64) rc4(8x,int) des(int) aes(partial) idea(int) blowfish(ptr)
compiler: x86_64-pc-linux-gnu-gcc -fPIC -pthread -m64 -Wa,--noexecstack -O2 -march=native -pipe -fno-strict-aliasing -Wa,--noexecstack -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAESNI_ASM -DVPAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPOLY1305_ASM -DZLIB -DNDEBUG  -DOPENSSL_NO_BUF_FREELISTS
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-192-cbc     744664.22k  1110139.14k  1188929.93k  1208430.59k  1217139.41k  1218301.09k
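
A note on reading these tables, with a minimal Python sketch (not part of the original post). Unless -elapsed is given, openssl speed divides the bytes processed by the CPU time printed at the end of each "Doing ..." line, not by the 3 s window. The sketch recomputes throughput both ways from two lines quoted above. Treating the nominal 3 s as the real duration is an assumption, and the function and variable names are made up for the example.

```python
import re

# Matches the progress lines printed by "openssl speed", e.g.
# "Doing aes-192-cbc for 3s on 16384 size blocks: 180981 aes-192-cbc's in 0.06s"
LINE = re.compile(
    r"Doing \S+ for (?P<window>\d+)s on (?P<size>\d+) size blocks: "
    r"(?P<ops>\d+) \S+ in (?P<secs>[\d.]+)s"
)

def throughput(line):
    """Return (block size, bytes/s by reported time, bytes/s by nominal window)."""
    m = LINE.search(line)
    if m is None:
        raise ValueError(f"not an 'openssl speed' progress line: {line!r}")
    size, ops = int(m["size"]), int(m["ops"])
    secs, window = float(m["secs"]), float(m["window"])
    total_bytes = size * ops
    by_reported = total_bytes / secs if secs > 0 else float("inf")
    by_window = total_bytes / window  # assumption: the run lasted about `window` seconds
    return size, by_reported, by_window

# Copied from the two benchmark runs above (16384-byte blocks).
AFALG_16K = "Doing aes-192-cbc for 3s on 16384 size blocks: 180981 aes-192-cbc's in 0.06s"
PLAIN_16K = "Doing aes-192-cbc for 3s on 16384 size blocks: 222334 aes-192-cbc's in 2.99s"

if __name__ == "__main__":
    for label, line in (("afalg", AFALG_16K), ("no engine", PLAIN_16K)):
        size, by_reported, by_window = throughput(line)
        print(f"{label:10} {size} bytes: by reported time {by_reported / 1e6:.1f} MB/s, "
              f"over the 3 s window {by_window / 1e6:.1f} MB/s")
```

Over the 3 s window the afalg run comes to roughly 988 MB/s against roughly 1214 MB/s without the engine, so the very large table figures may mostly reflect how little CPU time the offloaded run consumed. Re-running both commands with openssl speed -elapsed would be one way to check this.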

Ant P.
Watchman

Joined: 18 Apr 2009
Posts: 6920

Posted: Thu Jul 23, 2020 9:56 pm

That's some interesting numbers. My CPU's CCP isn't accessible because Gigabyte doesn't know how to write a BIOS. I don't care that much about it, because nothing on my computer can do sustained 50GB/s IO anyway, but it is annoying having dmesg remind me every boot…

Anon-E-moose
Watchman

Joined: 23 May 2008
Posts: 6103
Location: Dallas area

Posted: Thu Jul 23, 2020 10:22 pm

Ant P. wrote:
That's some interesting numbers. My CPU's CCP isn't accessible because Gigabyte doesn't know how to write a BIOS. I don't care that much about it, because nothing on my computer can do sustained 50GB/s IO anyway, but it is annoying having dmesg remind me every boot…


Mine doesn't work either, so I unticked "AMD Secure Encrypted Virtualization (SEV) support" for kvm and blacklisted the ccp module.
_________________
PRIME x570-pro, 3700x, 6.1 zen kernel
gcc 13, profile 17.0 (custom bare multilib), openrc, wayland

Gatak
Apprentice

Joined: 04 Jan 2004
Posts: 174

Posted: Thu Jul 23, 2020 10:31 pm

Ant P. wrote:
That's some interesting numbers. My CPU's CCP isn't accessible because Gigabyte doesn't know how to write a BIOS. I don't care that much about it, because nothing on my computer can do sustained 50GB/s IO anyway, but it is annoying having dmesg remind me every boot…


I have a Gigabyte MB. https://www.gigabyte.com/Motherboard/B450M-DS3H-rev-10/

It was their last BIOS released in July that fixed this. I filed a support ticket earlier about another issue with iGPU RAM under Linux and they sent me a fix. Perhaps if you do the same they will start looking at Linux more?

cyrius
n00b

Joined: 27 Jan 2007
Posts: 70
Location: France

Posted: Thu Nov 19, 2020 5:51 pm

Hi,

I have the same on my ASUS motherboard.
Searching the web, I discovered that this crypto hardware is supported by the DPDK project (dpdk.org).
But DPDK is not in Gentoo (I wonder why?). You could use the overlay: earshark.
Unfortunately, it is not the latest version.

I'll dig further into this.

Ant P.
Watchman

Joined: 18 Apr 2009
Posts: 6920

Posted: Thu Nov 19, 2020 11:12 pm

Gatak wrote:
I have a Gigabyte MB. https://www.gigabyte.com/Motherboard/B450M-DS3H-rev-10/

It was their last BIOS released in July that fixed this. I filed a support ticket earlier about another issue with iGPU RAM under Linux and they sent me a fix. Perhaps if you do the same they will start looking at Linux more?

Mine is https://www.gigabyte.com/Motherboard/X570-UD-rev-10/

It hasn't been fixed so far, but maybe it'll randomly show up someday. They still seem to be pushing out regular BIOS updates.

cyrius
n00b

Joined: 27 Jan 2007
Posts: 70
Location: France

Posted: Fri Nov 20, 2020 12:57 pm

Finally, it works 8O

I had a lot of trouble getting it to work...
In the end, I succeeded with sys-kernel/cryptodev-1.11 and openssl-1.1.1h.
I'm still not sure I have understood everything.

First of all, cryptodev-1.11 doesn't compile against kernel >=5.9 or with the examples USE flag.
So make sure you are not running such a kernel and that this USE flag is not set.

With the kernel compiled with all CCP options enabled, only the ccp module is loaded at boot.
You have to modprobe ccp_crypto, which provides the offloading, or add it to /etc/conf.d/modules.
Same issue with cryptodev: you have to modprobe it.

To check that it works, I git cloned https://github.com/cryptodev-linux/cryptodev-linux, went into the examples directory and ran "gcc ./aes.c".

The result:
Code:

/home/user/GiT/cryptodev-linux/examples # ./a.out
Got cbc(aes) with driver cbc-aes-ccp
Got cbc(aes) with driver cbc-aes-ccp
AES Test passed


Repeat until you get "with driver cbc-aes-ccp".


Second, in the openssl ebuild, add "enable-devcryptoeng \" to the configure call:
Code:

 ./${config} \
                ${sslout} \
                $(use cpu_flags_x86_sse2 || echo "no-sse2") \
                enable-camellia \
                enable-ec \
                $(use_ssl !bindist ec2m) \
                enable-srp \
                $(use elibc_musl && echo "no-async") \
                ${ec_nistp_64_gcc_128} \
                enable-idea \
                enable-mdc2 \
                enable-rc5 \
                enable-devcryptoeng \
                $(use_ssl sslv3 ssl3) \
                $(use_ssl sslv3 ssl3-method) \
                $(use_ssl asm) \
                $(use_ssl rfc3779) \
                $(use_ssl sctp) \
                $(use_ssl tls-heartbeat heartbeats) \
                $(use_ssl zlib) \
                --prefix="${EPREFIX}"/usr \
                --openssldir="${EPREFIX}"${SSL_CNF_DIR} \
                --libdir=$(get_libdir) \
                shared threads \
                || die
                #-DOPENSSL_DEVCRYPTOENG \


Then run the magic commands in the portage dev-libs/openssl directory:
ebuild ./openssl-1.1.1h.ebuild digest
emerge openssl

And... here we go.

Speed benchmark for aes-256-cbc without the devcrypto engine (openssl speed -evp aes-256-cbc):
Quote:

Doing aes-256-cbc for 3s on 16 size blocks: 131213421 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 64 size blocks: 49485612 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 256 size blocks: 12732616 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 1024 size blocks: 3205491 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 401374 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 16384 size blocks: 200842 aes-256-cbc's in 3.00s
OpenSSL 1.1.1h 22 Sep 2020
built on: Fri Nov 20 13:03:40 2020 UTC
options:bn(64,64) rc4(8x,int) des(int) aes(partial) idea(int) blowfish(ptr)
compiler: x86_64-pc-linux-gnu-gcc -fPIC -pthread -m64 -Wa,--noexecstack -march=znver1 -pipe -O2 -fomit-frame-pointer -maes -msse -msse2 -mmmx -mfpmath=sse -mavx -mavx2 -fno-strict-aliasing -Wa,--noexecstack -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAESNI_ASM -DVPAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPOLY1305_ASM -DZLIB -DNDEBUG -DOPENSSL_NO_BUF_FREELISTS
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-256-cbc     699804.91k  1055693.06k  1086516.57k  1094140.93k  1096018.60k  1096865.11k



Then I loaded the cryptodev module:

Code:

Module                  Size  Used by
cryptodev              57344  0
amdgpu               5046272  1
mfd_core               16384  1 amdgpu
gpu_sched              28672  1 amdgpu
ttm                    94208  1 amdgpu
backlight              16384  1 amdgpu
ccp_crypto             40960  0
ccp                    77824  1 ccp_crypto
sha1_generic           16384  1 ccp



The same benchmark with the devcrypto engine (same openssl command, but with the cryptodev module loaded):
Quote:

Doing aes-256-cbc for 3s on 16 size blocks: 82329 aes-256-cbc's in 0.00s
Doing aes-256-cbc for 3s on 64 size blocks: 81648 aes-256-cbc's in 0.02s
Doing aes-256-cbc for 3s on 256 size blocks: 75341 aes-256-cbc's in 0.02s
Doing aes-256-cbc for 3s on 1024 size blocks: 66766 aes-256-cbc's in 0.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 37154 aes-256-cbc's in 0.01s
Doing aes-256-cbc for 3s on 16384 size blocks: 27273 aes-256-cbc's in 0.00s
OpenSSL 1.1.1h 22 Sep 2020
built on: Fri Nov 20 08:08:21 2020 UTC
options:bn(64,64) rc4(8x,int) des(int) aes(partial) idea(int) blowfish(ptr)
compiler: x86_64-pc-linux-gnu-gcc -fPIC -pthread -m64 -Wa,--noexecstack -march=znver1 -pipe -O2 -fomit-frame-pointer -maes -msse -msse2 -mmmx -mfpmath=sse -mavx -mavx2 -fno-strict-aliasing -Wa,--noexecstack -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAESNI_ASM -DVPAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPOLY1305_ASM -DZLIB -DNDEBUG -DOPENSSL_NO_BUF_FREELISTS
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-256-cbc           infk   261273.60k   964364.80k         infk 30436556.80k         infk



Wow...
Without: for 8K blocks, we get 1 096 018.60k, which is excellent because I enabled the openssl asm USE flag (it uses AES-NI assembly).
With: for 8K blocks, we get 30 436 556.80k. That is an amazing factor of about 28! Impressive.

My conclusion is that it is really worth using.

Now the issues I encountered:

OpenSSL uses the engine exclusively when its APIs are called from another program. As a result, OpenSSH no longer worked. I had to switch it to libressl:
Quote:

These are the packages that would be merged, in order:

Calculating dependencies... done!
[ebuild R ] net-misc/openssh-8.4_p1-r2::gentoo USE="X libressl pam pie scp -X509 -audit -bindist -debug -hpn -kerberos -ldns -libedit -livecd -sctp -security-key (-selinux) -ssl -static -test -xmss" 0 KiB

Total: 1 package (1 reinstall), Size of downloads: 0 KiB

* IMPORTANT: 7 news items need reading for repository 'gentoo'.
* Use eselect news read to view new items.


The error at sshd start was:
"accumulate_host_timing_secret : ssh_digest_start error"
It seems the OpenSSL API tried to compute all the digests with the engine... which makes no sense.

From the openssl command line, I was not able to call the engine, because the devcrypto engine does not exist as a library in the engines directory:
Quote:

~ # ls /usr/lib64/engines-1.1/
afalg.so capi.so padlock.so


Then, in the openssl interactive command line:
Code:

OpenSSL> engine 
(devcrypto) /dev/crypto engine
(rdrand) Intel RDRAND engine
(dynamic) Dynamic engine loading support
OpenSSL> engine devcrypto speed aes
(devcrypto) /dev/crypto engine
139809185572672:error:25066067:DSO support routines:dlfcn_load:could not load the shared library:crypto/dso/dso_dlfcn.c:118:filename(/usr/lib64/engines-1.1/speed.so): /usr/lib64/engines-1.1/speed.so: cannot open shared object file: No such file or directory
139809185572672:error:25070067:DSO support routines:DSO_load:could not load the shared library:crypto/dso/dso_lib.c:162:
139809185572672:error:260B6084:engine routines:dynamic_load:dso not found:crypto/engine/eng_dyn.c:414:
139809185572672:error:2606A074:engine routines:ENGINE_by_id:no such engine:crypto/engine/eng_list.c:334:id=speed
139809185572672:error:25066067:DSO support routines:dlfcn_load:could not load the shared library:crypto/dso/dso_dlfcn.c:118:filename(/usr/lib64/engines-1.1/aes.so): /usr/lib64/engines-1.1/aes.so: cannot open shared object file: No such file or directory
139809185572672:error:25070067:DSO support routines:DSO_load:could not load the shared library:crypto/dso/dso_lib.c:162:
139809185572672:error:260B6084:engine routines:dynamic_load:dso not found:crypto/engine/eng_dyn.c:414:
139809185572672:error:2606A074:engine routines:ENGINE_by_id:no such engine:crypto/engine/eng_list.c:334:id=aes
error in engine
OpenSSL> speed aes engine devcrypto
speed: Unknown algorithm engine
error in speed
OpenSSL> quit


I first tried to solve this by following this documentation:
https://wiki.openssl.org/index.php/Library_Initialization
file:///usr/share/doc/openssl-1.1.1h/html/man5/config.html#NAME
without success:
- I modified openssl.cnf
- I tried to make openssh call OPENSSL_init_crypto...

I think I have reached the limit of my knowledge here and need help to understand these errors.

ANYWAY: the very good news is the performance, and the possibility of using devcrypto to integrate it into our own code. We could also use the openssl APIs if we replaced it with libressl everywhere :-)
I may try to submit the openssl ebuild change in a Gentoo bug.
We don't have the right configuration. :roll:

Enjoy it :D

cyrius
n00b

Joined: 27 Jan 2007
Posts: 70
Location: France

Posted: Fri Nov 20, 2020 4:26 pm

Another potential issue...:

The DPDK website, https://doc.dpdk.org/guides-19.08/cryptodevs/ccp.html , gives the list of hashes and ciphers:

Quote:


6.1. Features

CCP crypto PMD has support for:

Cipher algorithms:

RTE_CRYPTO_CIPHER_AES_CBC
RTE_CRYPTO_CIPHER_AES_ECB
RTE_CRYPTO_CIPHER_AES_CTR
RTE_CRYPTO_CIPHER_3DES_CBC

Hash algorithms:

RTE_CRYPTO_AUTH_SHA1
RTE_CRYPTO_AUTH_SHA1_HMAC
RTE_CRYPTO_AUTH_SHA224
RTE_CRYPTO_AUTH_SHA224_HMAC
RTE_CRYPTO_AUTH_SHA256
RTE_CRYPTO_AUTH_SHA256_HMAC
RTE_CRYPTO_AUTH_SHA384
RTE_CRYPTO_AUTH_SHA384_HMAC
RTE_CRYPTO_AUTH_SHA512
RTE_CRYPTO_AUTH_SHA512_HMAC
RTE_CRYPTO_AUTH_MD5_HMAC
RTE_CRYPTO_AUTH_AES_CMAC
RTE_CRYPTO_AUTH_SHA3_224
RTE_CRYPTO_AUTH_SHA3_224_HMAC
RTE_CRYPTO_AUTH_SHA3_256
RTE_CRYPTO_AUTH_SHA3_256_HMAC
RTE_CRYPTO_AUTH_SHA3_384
RTE_CRYPTO_AUTH_SHA3_384_HMAC
RTE_CRYPTO_AUTH_SHA3_512
RTE_CRYPTO_AUTH_SHA3_512_HMAC

AEAD algorithms:

RTE_CRYPTO_AEAD_AES_GCM



Running openssl engine -t -c shows no SHA3.

I looked into why, and it seems to be because of cryptodev.
Here is the content of cryptodev.h:
Code:



/* All the supported algorithms
 */
enum cryptodev_crypto_op_t {
        CRYPTO_DES_CBC = 1,
        CRYPTO_3DES_CBC = 2,
        CRYPTO_BLF_CBC = 3,
        CRYPTO_CAST_CBC = 4,
        CRYPTO_SKIPJACK_CBC = 5,
        CRYPTO_MD5_HMAC = 6,
        CRYPTO_SHA1_HMAC = 7,
        CRYPTO_RIPEMD160_HMAC = 8,
        CRYPTO_MD5_KPDK = 9,
        CRYPTO_SHA1_KPDK = 10,
        CRYPTO_RIJNDAEL128_CBC = 11,
        CRYPTO_AES_CBC = CRYPTO_RIJNDAEL128_CBC,
        CRYPTO_ARC4 = 12,   
        CRYPTO_MD5 = 13, 
        CRYPTO_SHA1 = 14,
        CRYPTO_DEFLATE_COMP = 15,
        CRYPTO_NULL = 16,
        CRYPTO_LZS_COMP = 17,
        CRYPTO_SHA2_256_HMAC = 18,
        CRYPTO_SHA2_384_HMAC = 19,
        CRYPTO_SHA2_512_HMAC = 20,
        CRYPTO_AES_CTR = 21,
        CRYPTO_AES_XTS = 22,
        CRYPTO_AES_ECB = 23,
        CRYPTO_AES_GCM = 50,

        CRYPTO_CAMELLIA_CBC = 101,
        CRYPTO_RIPEMD160,
        CRYPTO_SHA2_224,
        CRYPTO_SHA2_256,
        CRYPTO_SHA2_384,
        CRYPTO_SHA2_512,
        CRYPTO_SHA2_224_HMAC,
        CRYPTO_TLS11_AES_CBC_HMAC_SHA1,
        CRYPTO_TLS12_AES_CBC_HMAC_SHA256,
        CRYPTO_ALGORITHM_ALL, /* Keep updated - see below */
};



I thought it was more generic, but that is not the case... so we are really limited with cryptodev. :evil:
This could also explain why sshd had issues: the algorithm lists of cryptodev and the CCP differ.

Well..well...well...

cyrius
n00b

Joined: 27 Jan 2007
Posts: 70
Location: France

Posted: Fri Nov 20, 2020 8:09 pm

Another try... I saw a previous post saying it is also supported by the AFALG engine.

Well, it was a pain to configure, because the user-space interface modules of the kernel Cryptographic API must also be loaded:

Quote:


<M> Userspace cryptographic algorithm configuration
.
.
<M> User-space interface for hash algorithms
<M> User-space interface for symmetric key cipher algorithms
<M> User-space interface for random number generator algorithms
<M> User-space interface for AEAD cipher algorithms


Then we load these modules manually: af_alg, algif_hash, algif_skcipher, algif_aead and algif_rng.
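
Once these modules are loaded, the interface can be exercised without OpenSSL. Below is a minimal Python sketch (not from the original post) using the standard socket module's AF_ALG support (Linux only, Python 3.6 or later). The function name is made up for the example, and the key, IV and plaintext are the first AES-128-CBC block of the NIST SP 800-38A test vectors. The socket here is only the kernel's addressing mechanism: nothing goes over the network. The call also does not reveal which driver served the request, since the kernel picks one for the name "cbc(aes)".

```python
import socket

def afalg_aes_cbc_encrypt(key, iv, plaintext):
    """Encrypt with the kernel's cbc(aes) through AF_ALG; return None if unavailable."""
    if not hasattr(socket, "AF_ALG"):
        return None  # not Linux, or Python built without AF_ALG support
    try:
        with socket.socket(socket.AF_ALG, socket.SOCK_SEQPACKET, 0) as alg:
            # "skcipher" is the algorithm type, "cbc(aes)" the kernel algorithm name.
            alg.bind(("skcipher", "cbc(aes)"))
            alg.setsockopt(socket.SOL_ALG, socket.ALG_SET_KEY, key)
            op, _ = alg.accept()
            with op:
                op.sendmsg_afalg([plaintext], op=socket.ALG_OP_ENCRYPT, iv=iv)
                return op.recv(len(plaintext))
    except OSError:
        return None  # e.g. algif_skcipher not loaded, or AF_ALG blocked in a sandbox

# NIST SP 800-38A, CBC-AES128 encryption, first block.
KEY = bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c")
IV = bytes.fromhex("000102030405060708090a0b0c0d0e0f")
PLAINTEXT = bytes.fromhex("6bc1bee22e409f96e93d7e117393172a")
EXPECTED = "7649abac8119b246cee98e9b12e9197d"

if __name__ == "__main__":
    ciphertext = afalg_aes_cbc_encrypt(KEY, IV, PLAINTEXT)
    if ciphertext is None:
        print("AF_ALG cbc(aes) is not available on this system")
    else:
        print("ciphertext:", ciphertext.hex(), "matches:", ciphertext.hex() == EXPECTED)
```

The plaintext length must be a multiple of the 16-byte AES block size, because the kernel cipher does no padding.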

As a result, openssl will recognize it, once recompiled, when the modules are loaded... seriously, what a pain...
But it won't show you the engine and its capabilities:

Quote:

Machina /# openssl engine
(rdrand) Intel RDRAND engine
(dynamic) Dynamic engine loading support
Machina /#



So we just have to test it and pray that it exists:
Quote:

Machina /# openssl speed -evp aes-256-cbc -engine afalg
engine "afalg" set.
Doing aes-256-cbc for 3s on 16 size blocks: 1829135 aes-256-cbc's in 0.38s
Doing aes-256-cbc for 3s on 64 size blocks: 1777844 aes-256-cbc's in 0.36s
Doing aes-256-cbc for 3s on 256 size blocks: 1611631 aes-256-cbc's in 0.35s
Doing aes-256-cbc for 3s on 1024 size blocks: 1168347 aes-256-cbc's in 0.23s
Doing aes-256-cbc for 3s on 8192 size blocks: 323154 aes-256-cbc's in 0.05s
Doing aes-256-cbc for 3s on 16384 size blocks: 173748 aes-256-cbc's in 0.04s
OpenSSL 1.1.1h 22 Sep 2020
built on: Fri Nov 20 18:59:52 2020 UTC
options:bn(64,64) rc4(8x,int) des(int) aes(partial) idea(int) blowfish(ptr)
compiler: x86_64-pc-linux-gnu-gcc -fPIC -pthread -m64 -Wa,--noexecstack -march=znver1 -pipe -O2 -fomit-frame-pointer -maes -msse -msse2 -mmmx -mfpmath=sse -mavx -mavx2 -fno-strict-aliasing -Wa,--noexecstack -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAESNI_ASM -DVPAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPOLY1305_ASM -DZLIB -DNDEBUG -DOPENSSL_NO_BUF_FREELISTS
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-256-cbc      77016.21k   316061.16k  1178792.96k  5201684.03k 52945551.36k 71167180.80k
Machina /#


The first try is very impressive: for 8K blocks, 52 945 551.36k... a factor of about 48.

Second try :
Quote:

Machina /# openssl speed -evp aes-256-cbc -engine afalg
engine "afalg" set.
Doing aes-256-cbc for 3s on 16 size blocks: 1833337 aes-256-cbc's in 0.36s
Doing aes-256-cbc for 3s on 64 size blocks: 1793291 aes-256-cbc's in 0.40s
Doing aes-256-cbc for 3s on 256 size blocks: 1615598 aes-256-cbc's in 0.31s
Doing aes-256-cbc for 3s on 1024 size blocks: 1177088 aes-256-cbc's in 0.24s
Doing aes-256-cbc for 3s on 8192 size blocks: 324051 aes-256-cbc's in 0.06s
Doing aes-256-cbc for 3s on 16384 size blocks: 175176 aes-256-cbc's in 0.02s
OpenSSL 1.1.1h 22 Sep 2020
built on: Fri Nov 20 18:59:52 2020 UTC
options:bn(64,64) rc4(8x,int) des(int) aes(partial) idea(int) blowfish(ptr)
compiler: x86_64-pc-linux-gnu-gcc -fPIC -pthread -m64 -Wa,--noexecstack -march=znver1 -pipe -O2 -fomit-frame-pointer -maes -msse -msse2 -mmmx -mfpmath=sse -mavx -mavx2 -fno-strict-aliasing -Wa,--noexecstack -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAESNI_ASM -DVPAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPOLY1305_ASM -DZLIB -DNDEBUG -DOPENSSL_NO_BUF_FREELISTS
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-256-cbc      81481.64k   286926.56k  1334171.25k  5022242.13k 44243763.20k 143504179.20k
Machina / #


The second try is surprising: for 8K blocks, 44 243 763.20k... a factor of about 40.
On the third one, I got: for 8K blocks, 33 152 921.60k... a factor of about 30.

Compared to cryptodev, I find this strange because the reported performance is never the same. It seems less stable performance-wise.
Another point is that libkcapi uses sockets (network) to talk to the module... what a strange idea for a device... I should try setting gcc -socket instead of gcc -pipe in my make.conf CFLAGS :P
What I mean is that, if I remember correctly, Linux is not the best at network performance (NetBSD is made for that). So why?
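
One possible explanation for the fluctuating figures, sketched in Python (not from the original post, and not confirmed in this thread): the table values can be reproduced as operations times block size divided by the seconds printed on each "Doing ..." line, and those seconds are tiny and shown to only two decimals. The helper name is made up for the example, and the numbers are the 16384-byte lines of the two runs quoted above.

```python
def reported_kbytes_per_s(ops, block_size, seconds):
    """Table figure as printed by openssl speed: bytes processed / seconds / 1000."""
    return ops * block_size / seconds / 1000.0

# 16384-byte lines of the two afalg runs above: (operations, reported seconds)
RUNS = {"first try": (173748, 0.04), "second try": (175176, 0.02)}

if __name__ == "__main__":
    for label, (ops, seconds) in RUNS.items():
        figure = reported_kbytes_per_s(ops, 16384, seconds)
        print(f"{label}: {ops} operations in {seconds:.2f}s -> {figure:.2f}k")
    ratio = RUNS["second try"][0] / RUNS["first try"][0]
    print(f"operation counts differ by a factor of {ratio:.3f}")
```

The operation counts of the two runs differ by less than 1%, while the table figures differ by a factor of two, purely because the divisor went from 0.04s to 0.02s. This suggests comparing the operation counts, or re-running with openssl speed -elapsed, before concluding anything about stability.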

At this stage I am only assuming that it uses the CCP cryptographic processor; I have no evidence of it.
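
One way to look for evidence, sketched in Python (not from the original post): with CONFIG_CRYPTO_DEV_CCP_DEBUGFS enabled, the counters in /sys/kernel/debug/ccp/ccp-1/stats shown in the first post could be read before and after a benchmark and compared. The function names are made up, and the "after" sample is hypothetical, not a measurement. Duplicate labels such as the two SHA lines are kept in order, as they appear in the real output.

```python
import re

def parse_ccp_stats(text):
    """Return [(label, count), ...] from CCP debugfs 'stats' text, keeping duplicates."""
    pairs = []
    for line in text.splitlines():
        m = re.match(r"\s*(.+?):\s*(\d+)\s*$", line)
        if m:
            pairs.append((m.group(1).strip(), int(m.group(2))))
    return pairs

def counters_changed(before, after):
    """List (label, old, new) for every counter whose value differs, by position."""
    return [(label, old, new)
            for (label, old), (_, new) in zip(before, after) if old != new]

# Content of the stats file as shown in the first post (all zero).
BEFORE = """\
Total Interrupts Handled: 0
        Total Operations: 0
                     AES: 0
                 XTS AES: 0
                     SHA: 0
                     SHA: 0
                     RSA: 0
               Pass-Thru: 0
                     ECC: 0
"""

# Hypothetical values, only to show what a CCP-served AES run might look like.
AFTER_HYPOTHETICAL = BEFORE.replace("Handled: 0", "Handled: 12") \
                           .replace("Operations: 0", "Operations: 12") \
                           .replace("   AES: 0\n  ", "   AES: 12\n  ", 1)

if __name__ == "__main__":
    changes = counters_changed(parse_ccp_stats(BEFORE),
                               parse_ccp_stats(AFTER_HYPOTHETICAL))
    for label, old, new in changes:
        print(f"{label}: {old} -> {new}")
    if not changes:
        print("no counter moved: the CCP was probably not used")
```

On a real system both snapshots would be read from the stats file as root, once before and once after the openssl speed run. If every counter stays at zero, the work was most likely done by another driver.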

So I looked for the libkcapi package in the portage tree: it doesn't exist.
Finally, I got it from the montjoiegentooportage GitHub repository... again, an unofficial one:
https://github.com/montjoie/montjoiegentooportage/blob/master/dev-libs/libkcapi/libkcapi-9999.ebuild

It installs these files:
Quote:


/usr/lib64/libkcapi.so.1.2.1
/usr/libexec/libkcapi/kcapi
/usr/libexec/libkcapi/kcapi-enc-test-large
/usr/lib64/libkcapi.a
/usr/bin/kcapi-enc-test-large
/usr/libexec/libkcapi/kcapi-convenience
/usr/bin/kcapi-convenience
/usr/bin/kcapi
/usr/bin/kcapi-speed


So I ran kcapi-speed -l to list the available kernel hash modules, and I saw no engine in the list.
At this point, this program does not use the AMD CCP.
I wouldn't bet on the openssl benchmark either, as it doesn't tell us which kernel module is used.
I'll keep looking for a solution, as I would like to use this hardware crypto engine from my own code.
If someone could help, that would be fantastic.
cyrius
n00b
n00b


Joined: 27 Jan 2007
Posts: 70
Location: France

PostPosted: Sat Nov 21, 2020 12:02 pm    Post subject: Reply with quote

As I said in my previous posts, I am lost, and I don't like being lost.
So I am sharing my investigations with you, hoping a guru will help or confirm.


1 - AF_ALG or devcrypto: which is faster in raw performance?

Purpose: see which one is faster out of the box, with no extra load.
I thought it important to know which interface costs more in I/O before testing the relevance of the AMD CCP.
If the AMD CCP itself is fast but the interface to it is expensive in I/O, the resulting performance figures are not meaningful.

Test conditions:

Only the generic ciphers and hashes are enabled in the kernel.
The AMD CCP modules are disabled in the kernel.
Only the tools shipped with each of the two technologies are considered (see my previous post for the web links).
OpenSSL is shown for information only, as we don't really know what it is doing.

htop runs in a separate session to show how the processor is used:
- green means executed at user level
- red means executed at kernel level.


1.A - AF_ALG results

Quote:
kcapi-speed -a
SHA-1(G) |d| 256 bytes| 242.80 MB/s|994360 ops/s
SHA-224(G) |d| 256 bytes| 135.64 MB/s|555471 ops/s
SHA-256(G) |d| 256 bytes| 135.89 MB/s|556461 ops/s
SHA-384(G) |d| 512 bytes| 214.47 MB/s|439200 ops/s
SHA-512(G) |d| 512 bytes| 214.25 MB/s|438777 ops/s
HMAC SHA-1(G) |d| 256 bytes| 219.73 MB/s|899900 ops/s
HMAC SHA-224(G) |d| 256 bytes| 114.55 MB/s|469109 ops/s
HMAC SHA-256(G) |d| 256 bytes| 113.80 MB/s|465986 ops/s
HMAC SHA-384(G) |d| 512 bytes| 179.68 MB/s|367931 ops/s
HMAC SHA-512(G) |d| 512 bytes| 182.37 MB/s|373462 ops/s

MD5(G) |d| 256 bytes| 233.87 MB/s|957803 ops/s
AES(G) CBC(G) 128 |d| 64 bytes| 58.42 MB/s|956850 ops/s
AES(G) CBC(G) 128 |e| 64 bytes| 58.77 MB/s|962379 ops/s
AES(G) CBC(G) 192 |d| 64 bytes| 56.60 MB/s|926899 ops/s
AES(G) CBC(G) 192 |e| 64 bytes| 54.87 MB/s|898334 ops/s
AES(G) CBC(G) 256 |d| 64 bytes| 55.63 MB/s|911028 ops/s
AES(G) CBC(G) 256 |e| 64 bytes| 54.39 MB/s|890934 ops/s
AES(G) CTR(G) 128 |d| 4 bytes| 4.28 MB/s|1118977 ops/s
AES(G) CTR(G) 128 |e| 4 bytes| 4.64 MB/s|1209172 ops/s
AES(G) CTR(G) 192 |d| 4 bytes| 4.26 MB/s|1114523 ops/s
AES(G) CTR(G) 192 |e| 4 bytes| 4.26 MB/s|1114280 ops/s
AES(G) CTR(G) 256 |d| 4 bytes| 4.22 MB/s|1105257 ops/s
AES(G) CTR(G) 256 |e| 4 bytes| 4.23 MB/s|1107521 ops/s
Blowfish(G) CBC(G) 128 |d| 32 bytes| 31.77 MB/s|1040134 ops/s
Blowfish(G) CBC(G) 128 |e| 32 bytes| 28.72 MB/s|940059 ops/s
Blowfish(G) CBC(G) 192 |d| 32 bytes| 29.75 MB/s|973888 ops/s
Blowfish(G) CBC(G) 192 |e| 32 bytes| 28.82 MB/s|943165 ops/s
Blowfish(G) CBC(G) 256 |d| 32 bytes| 29.102 MB/s|982191 ops/s
Blowfish(G) CBC(G) 256 |e| 32 bytes| 28.65 MB/s|938008 ops/s
Blowfish(G) CTR(G) 128 |d| 4 bytes| 4.67 MB/s|1216810 ops/s
Blowfish(G) CTR(G) 128 |e| 4 bytes| 4.35 MB/s|1136234 ops/s
Blowfish(G) CTR(G) 192 |d| 4 bytes| 4.32 MB/s|1130387 ops/s
Blowfish(G) CTR(G) 192 |e| 4 bytes| 4.29 MB/s|1123203 ops/s
Blowfish(G) CTR(G) 256 |d| 4 bytes| 4.28 MB/s|1118607 ops/s
Blowfish(G) CTR(G) 256 |e| 4 bytes| 4.18 MB/s|1094092 ops/s
HMAC SHA-1 DRBG NOPR |d| 80 bytes| 30.35 MB/s|397612 ops/s
HMAC SHA-256 DRBG NOPR |d| 128 bytes| 25.28 MB/s|207032 ops/s
HMAC SHA-384 DRBG NOPR |d| 192 bytes| 18.58 MB/s|101360 ops/s
HMAC SHA-512 DRBG NOPR |d| 256 bytes| 26.94 MB/s|110173 ops/s
HMAC SHA-1 DRBG PR |d| 80 bytes| 69.68 kB/s|891 ops/s
HMAC SHA-256 DRBG PR |d| 128 bytes| 55.35 kB/s|442 ops/s
HMAC SHA-384 DRBG PR |d| 192 bytes| 41.54 kB/s|221 ops/s
HMAC SHA-512 DRBG PR |d| 256 bytes| 55.43 kB/s|221 ops/s

Information seen in htop: one CPU used at 100%, with 10% in userland.


1.B - cryptodev result

Quote:

./speed

Testing AES-128-CBC cipher:
Encrypting in chunks of 512 bytes: done. 1.38 GB in 5.00 secs: 0.28 GB/sec
Encrypting in chunks of 1024 bytes: done. 1.54 GB in 5.00 secs: 0.31 GB/sec
Encrypting in chunks of 2048 bytes: done. 1.61 GB in 5.00 secs: 0.32 GB/sec
Encrypting in chunks of 4096 bytes: done. 1.68 GB in 5.00 secs: 0.34 GB/sec
Encrypting in chunks of 8192 bytes: done. 1.71 GB in 5.00 secs: 0.34 GB/sec
Encrypting in chunks of 16384 bytes: done. 1.72 GB in 5.00 secs: 0.34 GB/sec
Encrypting in chunks of 32768 bytes: done. 1.72 GB in 5.00 secs: 0.34 GB/sec
Encrypting in chunks of 65536 bytes: done. 1.73 GB in 5.00 secs: 0.35 GB/sec

./sha_speed
Testing SHA1 Hash:
requested hash CRYPTO_SHA1, got sha1 with driver sha1-generic
Encrypting in chunks of 256 bytes: done. 1.55 GB in 5.00 secs: 0.31 GB/sec
Encrypting in chunks of 1024 bytes: done. 2.79 GB in 5.00 secs: 0.56 GB/sec
Encrypting in chunks of 4096 bytes: done. 3.41 GB in 5.00 secs: 0.68 GB/sec
Encrypting in chunks of 16384 bytes: done. 3.63 GB in 5.00 secs: 0.73 GB/sec
Encrypting in chunks of 65536 bytes: done. 3.69 GB in 5.00 secs: 0.74 GB/sec

Testing SHA256 Hash:
requested hash CRYPTO_SHA2_256, got sha256 with driver sha256-generic
Encrypting in chunks of 256 bytes: done. 796.02 MB in 5.00 secs: 159.20 MB/sec
Encrypting in chunks of 1024 bytes: done. 1.15 GB in 5.00 secs: 0.23 GB/sec
Encrypting in chunks of 4096 bytes: done. 1.29 GB in 5.00 secs: 0.26 GB/sec
Encrypting in chunks of 16384 bytes: done. 1.34 GB in 5.00 secs: 0.27 GB/sec
Encrypting in chunks of 65536 bytes: done. 1.35 GB in 5.00 secs: 0.27 GB/sec

Information seen in htop: one CPU used at 100%, with 15% in userland.

1.C - Basic speed comparison:

Quote:

Hash/cipher     AF_ALG          Cryptodev       Relevance
AES-128-CBC     58.77 MB/s      280 MB/s        No, as AF_ALG encrypts only 64 bytes per operation
SHA1            242.80 MB/s     310 MB/s        Yes, as both do the same work
SHA256          135.89 MB/s     159.20 MB/s     Yes, as both do the same work


The AES comparison is not relevant: AF_ALG needs more I/O than cryptodev because of the difference in the length of the data sent to the crypto module (64 bytes per operation versus 512 bytes and more).
So on generic SHA, out of the box, in an unstressed environment, we can say that cryptodev is the winner.
Cryptodev is 27.68% (SHA1) and 17.15% (SHA256) faster: about 22% on average.
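For anyone who wants to recheck those percentages, here is a small Python sketch that recomputes them from the figures in the comparison table above. It assumes the cryptodev "0.31 GB/sec" for SHA1 is read as 310 MB/s, as in the table; the function name speedup_percent is only illustrative.

```python
# Throughput figures from the comparison table above, in MB/s.
results = {
    "SHA1": {"af_alg": 242.80, "cryptodev": 310.0},
    "SHA256": {"af_alg": 135.89, "cryptodev": 159.20},
}


def speedup_percent(slow: float, fast: float) -> float:
    """How much faster 'fast' is than 'slow', as a percentage."""
    return (fast / slow - 1.0) * 100.0


gains = {
    name: speedup_percent(r["af_alg"], r["cryptodev"])
    for name, r in results.items()
}
average = sum(gains.values()) / len(gains)

for name, gain in gains.items():
    print(f"{name}: cryptodev is {gain:.2f}% faster")
print(f"average: {average:.1f}%")
```

With only two data points and a single run each, the average is an indication at best.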

This matches what the cryptodev project said many years ago:
http://cryptodev-linux.org/comparison.html

2 - Ensure the AMD CCP is used by both.

To make sure the AMD CCP is used by both technologies, my aim was to deactivate all generic and platform-specific hashes and ciphers in the kernel
and to activate only the AMD CCP.
Only OpenSSL is used from here on (it is the most recent implementation of both technologies).
The kernel didn't let me deactivate everything. So only my eyes (and yours, if you repeat the test) can see the differences, with the help of htop (green/red processor execution).
I'll try later to see whether the kernel modules have debug parameters, to really understand all this.

2.1 openssl AFALG engine

SHA256

Quote:

openssl speed -evp sha256 -engine afalg
engine "afalg" set.
Doing sha256 for 3s on 16 size blocks: 83288 sha256's in 0.06s
Doing sha256 for 3s on 64 size blocks: 79802 sha256's in 0.07s
Doing sha256 for 3s on 256 size blocks: 38587 sha256's in 0.03s
Doing sha256 for 3s on 1024 size blocks: 40753 sha256's in 0.03s
Doing sha256 for 3s on 8192 size blocks: 28128 sha256's in 0.04s
Doing sha256 for 3s on 16384 size blocks: 20987 sha256's in 0.01s
OpenSSL 1.1.1h 22 Sep 2020
built on: Sat Nov 21 09:40:19 2020 UTC
options:bn(64,64) rc4(8x,int) des(int) aes(partial) idea(int) blowfish(ptr)
compiler: x86_64-pc-linux-gnu-gcc -fPIC -pthread -m64 -Wa,--noexecstack -march=znver1 -pipe -O2 -fomit-frame-pointer -maes -msse -msse2 -mmmx -mfpmath=sse -mavx -mavx2 -fno-strict-aliasing -Wa,--noexecstack -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAESNI_ASM -DVPAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPOLY1305_ASM -DZLIB -DNDEBUG -DOPENSSL_NO_BUF_FREELISTS
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
sha256 22210.13k 72961.83k 329275.73k 1391035.73k 5760614.40k 34385100.80k

The good news is that it's using the AMD CCP crypto device.
Processor consumption is ridiculously low: between 10% and 26%.
No large red part in htop.
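A word of caution on the throughput line itself. Without -elapsed, openssl speed divides by user CPU time, and with a kernel-backed engine almost no time is spent in userland (0.06s, 0.07s, ...). The sketch below, in Python, recomputes the reported figures from the counts in the run above and contrasts them with what the same counts give over the 3 second window. It assumes each measurement really lasted about 3 s of wall time; kbytes_per_second is a hypothetical helper name.

```python
# (block size in bytes, operations counted, user CPU seconds reported),
# copied from the "openssl speed -evp sha256 -engine afalg" run above.
runs = [
    (16, 83288, 0.06),
    (64, 79802, 0.07),
    (16384, 20987, 0.01),
]
WALL_SECONDS = 3.0  # assumption: each measurement ran for the full 3 s window


def kbytes_per_second(block_size: int, count: int, seconds: float) -> float:
    """Throughput in 1000s of bytes per second, as openssl speed prints it."""
    return block_size * count / seconds / 1000.0


for size, count, cpu_seconds in runs:
    reported = kbytes_per_second(size, count, cpu_seconds)
    wall = kbytes_per_second(size, count, WALL_SECONDS)
    print(f"{size:>6} bytes: reported {reported:.2f}k, over 3 s {wall:.2f}k")
```

So the huge numbers say that userland was idle, not that the hardware is that fast. Running the benchmark with -elapsed gives figures that can be compared between engines.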

SHA512
Quote:

openssl speed -evp sha512 -engine afalg
engine "afalg" set.
Doing sha512 for 3s on 16 size blocks: 109547 sha512's in 0.07s
Doing sha512 for 3s on 64 size blocks: 117658 sha512's in 0.06s
Doing sha512 for 3s on 256 size blocks: 62121 sha512's in 0.06s
Doing sha512 for 3s on 1024 size blocks: 60600 sha512's in 0.07s
Doing sha512 for 3s on 8192 size blocks: 34629 sha512's in 0.03s
Doing sha512 for 3s on 16384 size blocks: 23903 sha512's in 0.02s
OpenSSL 1.1.1h 22 Sep 2020
built on: Sat Nov 21 09:40:19 2020 UTC
options:bn(64,64) rc4(8x,int) des(int) aes(partial) idea(int) blowfish(ptr)
compiler: x86_64-pc-linux-gnu-gcc -fPIC -pthread -m64 -Wa,--noexecstack -march=znver1 -pipe -O2 -fomit-frame-pointer -maes -msse -msse2 -mmmx -mfpmath=sse -mavx -mavx2 -fno-strict-aliasing -Wa,--noexecstack -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAESNI_ASM -DVPAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPOLY1305_ASM -DZLIB -DNDEBUG -DOPENSSL_NO_BUF_FREELISTS
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
sha512 25039.31k 125501.87k 265049.60k 886491.43k 9456025.60k 19581337.60k

Here I saw 2 processors in use at 30%, but the color was purple.
Does that mean it's the two coprocessors? I do not know.

2.2 openssl DEVCRYPTO engine

SHA256
Quote:

openssl speed -evp sha256 -engine devcrypto
engine "devcrypto" set.
Doing sha256 for 3s on 16 size blocks: 90462 sha256's in 0.05s
Doing sha256 for 3s on 64 size blocks: 87205 sha256's in 0.06s
Doing sha256 for 3s on 256 size blocks: 43697 sha256's in 0.04s
Doing sha256 for 3s on 1024 size blocks: 42552 sha256's in 0.03s
Doing sha256 for 3s on 8192 size blocks: 29687 sha256's in 0.02s
Doing sha256 for 3s on 16384 size blocks: 22159 sha256's in 0.02s
OpenSSL 1.1.1h 22 Sep 2020
built on: Sat Nov 21 09:40:19 2020 UTC
options:bn(64,64) rc4(8x,int) des(int) aes(partial) idea(int) blowfish(ptr)
compiler: x86_64-pc-linux-gnu-gcc -fPIC -pthread -m64 -Wa,--noexecstack -march=znver1 -pipe -O2 -fomit-frame-pointer -maes -msse -msse2 -mmmx -mfpmath=sse -mavx -mavx2 -fno-strict-aliasing -Wa,--noexecstack -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAESNI_ASM -DVPAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPOLY1305_ASM -DZLIB -DNDEBUG -DOPENSSL_NO_BUF_FREELISTS
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
sha256 28947.84k 93018.67k 279660.80k 1452441.60k 12159795.20k 18152652.80k

I saw two processors in use (purple color) between 25% and 32%


SHA512
Quote:

openssl speed -evp sha512 -engine devcrypto
engine "devcrypto" set.
Doing sha512 for 3s on 16 size blocks: 114123 sha512's in 0.08s
Doing sha512 for 3s on 64 size blocks: 114397 sha512's in 0.09s
Doing sha512 for 3s on 256 size blocks: 55254 sha512's in 0.06s
Doing sha512 for 3s on 1024 size blocks: 57416 sha512's in 0.06s
Doing sha512 for 3s on 8192 size blocks: 34039 sha512's in 0.04s
Doing sha512 for 3s on 16384 size blocks: 23638 sha512's in 0.04s
OpenSSL 1.1.1h 22 Sep 2020
built on: Sat Nov 21 09:40:19 2020 UTC
options:bn(64,64) rc4(8x,int) des(int) aes(partial) idea(int) blowfish(ptr)
compiler: x86_64-pc-linux-gnu-gcc -fPIC -pthread -m64 -Wa,--noexecstack -march=znver1 -pipe -O2 -fomit-frame-pointer -maes -msse -msse2 -mmmx -mfpmath=sse -mavx -mavx2 -fno-strict-aliasing -Wa,--noexecstack -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAESNI_ASM -DVPAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPOLY1305_ASM -DZLIB -DNDEBUG -DOPENSSL_NO_BUF_FREELISTS
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
sha512 22824.60k 81348.98k 235750.40k 979899.73k 6971187.20k 9682124.80k

I saw two processors in use (purple color) between 22% and 47%
and one more in red : average 17%


2.3 Conclusion: is it used by both?

YES, it is used by both engines. That's really good news.
It means our PCs could perform better by using this coprocessor.
As for the performance figures, I don't think they are relevant, because we are testing OpenSSL first.
We are not testing the technologies or the kernel, as both stay the same.

3 - What about basic use of the AMD CCP with the OpenSSL AFALG engine?

Wi-Fi encryption, common Linux hashes, IPsec, SSH through OpenSSL?

I re-enabled ssl for openssh (see my previous post) and recompiled openssl with only the AFALG engine, to see whether the AFALG engine is at the same level as DEVCRYPTO,
and... no more errors with openssh.
It means that the AFALG engine is more generic and up to date than devcrypto, which was written many years ago for a specific purpose.
openssh uses AES as its encryption algorithm, so I used scp to copy a 2.6G file.
I saw a purple part in htop. It seems the AMD CCP is used by it.

4 - My understanding of how it interacts with the kernel.

As I understand it at first glance, the AMD CCP Linux team has developed CCP interfaces that mirror the generic crypto modules of the
kernel, as the CCP is an alternative to them (a guru is welcome to confirm this or not).
It would then make no sense to enable the platform-specific crypto modules (AES-NI, SSE, AVX), as our aim is to use the AMD CCP, not the processor cores.
It also means that the AMD CCP driver exposes the same API as the generic kernel modules.
I saw that the SHA modules allow calling the three steps: init, update and final.
But I saw that the AES one has just an "encrypt" API.

So it seems the kernel doesn't let us take full advantage of the AMD CCP, through either technology, for specific cryptographic work with AES such as cryptocurrency mining, brute forcing (Pentoo) or Wi-Fi cracking (Pentoo).
If needed, deeper investigation, the creation or modification of generic crypto modules and CCP modules, or a better understanding of the CCP will be required.
Or, more directly, assembly language is needed.
This is my first impression, as Linux is not a bare-metal cryptographic OS, and I'm really not sure. I'll continue to study it.


5 - My final kernel config for use with the AFALG engine:

I'm not proud of it. It's not perfect, but it seems to work.
Keep two things in mind:
- You must build AF_ALG as kernel modules (not built into the kernel) to ensure openssl recognizes it.
- It depends on the final application and how it uses openssl.
So we have no guarantee that the AMD CCP is used in all cases.

Quote:

CONFIG_CRYPTO=y

#
# Crypto core or helper
#
CONFIG_CRYPTO_ALGAPI=y
CONFIG_CRYPTO_ALGAPI2=y
CONFIG_CRYPTO_AEAD=y
CONFIG_CRYPTO_AEAD2=y
CONFIG_CRYPTO_SKCIPHER=y
CONFIG_CRYPTO_SKCIPHER2=y
CONFIG_CRYPTO_HASH=y
CONFIG_CRYPTO_HASH2=y
CONFIG_CRYPTO_RNG=y
CONFIG_CRYPTO_RNG2=y
CONFIG_CRYPTO_RNG_DEFAULT=y
CONFIG_CRYPTO_AKCIPHER2=y
CONFIG_CRYPTO_AKCIPHER=y
CONFIG_CRYPTO_KPP2=y
CONFIG_CRYPTO_KPP=y
CONFIG_CRYPTO_ACOMP2=y
CONFIG_CRYPTO_MANAGER=y
CONFIG_CRYPTO_MANAGER2=y
CONFIG_CRYPTO_USER=m
CONFIG_CRYPTO_MANAGER_DISABLE_TESTS=y
CONFIG_CRYPTO_GF128MUL=y
CONFIG_CRYPTO_NULL=y
CONFIG_CRYPTO_NULL2=y
# CONFIG_CRYPTO_PCRYPT is not set
# CONFIG_CRYPTO_CRYPTD is not set
CONFIG_CRYPTO_AUTHENC=y
# CONFIG_CRYPTO_TEST is not set

#
# Public-key cryptography
#
CONFIG_CRYPTO_RSA=y
CONFIG_CRYPTO_DH=y
CONFIG_CRYPTO_ECC=y
CONFIG_CRYPTO_ECDH=y
# CONFIG_CRYPTO_ECRDSA is not set
# CONFIG_CRYPTO_CURVE25519 is not set
# CONFIG_CRYPTO_CURVE25519_X86 is not set

#
# Authenticated Encryption with Associated Data
#
CONFIG_CRYPTO_CCM=y
CONFIG_CRYPTO_GCM=y
# CONFIG_CRYPTO_CHACHA20POLY1305 is not set
# CONFIG_CRYPTO_AEGIS128 is not set
# CONFIG_CRYPTO_AEGIS128_AESNI_SSE2 is not set
CONFIG_CRYPTO_SEQIV=y
CONFIG_CRYPTO_ECHAINIV=y

#
# Block modes
#
CONFIG_CRYPTO_CBC=y
# CONFIG_CRYPTO_CFB is not set
CONFIG_CRYPTO_CTR=y
# CONFIG_CRYPTO_CTS is not set
# CONFIG_CRYPTO_ECB is not set
# CONFIG_CRYPTO_LRW is not set
# CONFIG_CRYPTO_OFB is not set
# CONFIG_CRYPTO_PCBC is not set
# CONFIG_CRYPTO_XTS is not set
# CONFIG_CRYPTO_KEYWRAP is not set
CONFIG_CRYPTO_NHPOLY1305=y
# CONFIG_CRYPTO_NHPOLY1305_SSE2 is not set
# CONFIG_CRYPTO_NHPOLY1305_AVX2 is not set
CONFIG_CRYPTO_ADIANTUM=y
# CONFIG_CRYPTO_ESSIV is not set

#
# Hash modes
#
CONFIG_CRYPTO_CMAC=y
CONFIG_CRYPTO_HMAC=y
CONFIG_CRYPTO_XCBC=y
CONFIG_CRYPTO_VMAC=y

#
# Digest
#
CONFIG_CRYPTO_CRC32C=y
# CONFIG_CRYPTO_CRC32C_INTEL is not set
# CONFIG_CRYPTO_CRC32 is not set
# CONFIG_CRYPTO_CRC32_PCLMUL is not set
# CONFIG_CRYPTO_XXHASH is not set
# CONFIG_CRYPTO_BLAKE2B is not set
# CONFIG_CRYPTO_BLAKE2S is not set
# CONFIG_CRYPTO_BLAKE2S_X86 is not set
CONFIG_CRYPTO_CRCT10DIF=y
# CONFIG_CRYPTO_CRCT10DIF_PCLMUL is not set
CONFIG_CRYPTO_GHASH=y
CONFIG_CRYPTO_POLY1305=y
# CONFIG_CRYPTO_POLY1305_X86_64 is not set
CONFIG_CRYPTO_MD4=y
CONFIG_CRYPTO_MD5=y
# CONFIG_CRYPTO_MICHAEL_MIC is not set
CONFIG_CRYPTO_RMD128=y
CONFIG_CRYPTO_RMD160=y
CONFIG_CRYPTO_RMD256=y
CONFIG_CRYPTO_RMD320=y
CONFIG_CRYPTO_SHA1=y
# CONFIG_CRYPTO_SHA1_SSSE3 is not set
# CONFIG_CRYPTO_SHA256_SSSE3 is not set
# CONFIG_CRYPTO_SHA512_SSSE3 is not set
CONFIG_CRYPTO_SHA256=y
# CONFIG_CRYPTO_SHA512 is not set
CONFIG_CRYPTO_SHA3=y
CONFIG_CRYPTO_SM3=y
# CONFIG_CRYPTO_STREEBOG is not set
# CONFIG_CRYPTO_TGR192 is not set
# CONFIG_CRYPTO_WP512 is not set
# CONFIG_CRYPTO_GHASH_CLMUL_NI_INTEL is not set

#
# Ciphers
#
CONFIG_CRYPTO_AES=y
# CONFIG_CRYPTO_AES_TI is not set
# CONFIG_CRYPTO_AES_NI_INTEL is not set
# CONFIG_CRYPTO_ANUBIS is not set
# CONFIG_CRYPTO_ARC4 is not set
# CONFIG_CRYPTO_BLOWFISH is not set
# CONFIG_CRYPTO_BLOWFISH_X86_64 is not set
# CONFIG_CRYPTO_CAMELLIA is not set
# CONFIG_CRYPTO_CAMELLIA_X86_64 is not set
# CONFIG_CRYPTO_CAMELLIA_AESNI_AVX_X86_64 is not set
# CONFIG_CRYPTO_CAMELLIA_AESNI_AVX2_X86_64 is not set
# CONFIG_CRYPTO_CAST5 is not set
# CONFIG_CRYPTO_CAST5_AVX_X86_64 is not set
# CONFIG_CRYPTO_CAST6 is not set
# CONFIG_CRYPTO_CAST6_AVX_X86_64 is not set
# CONFIG_CRYPTO_DES is not set
# CONFIG_CRYPTO_DES3_EDE_X86_64 is not set
# CONFIG_CRYPTO_FCRYPT is not set
# CONFIG_CRYPTO_KHAZAD is not set
# CONFIG_CRYPTO_SALSA20 is not set
CONFIG_CRYPTO_CHACHA20=y
# CONFIG_CRYPTO_CHACHA20_X86_64 is not set
# CONFIG_CRYPTO_SEED is not set
# CONFIG_CRYPTO_SERPENT is not set
# CONFIG_CRYPTO_SERPENT_SSE2_X86_64 is not set
# CONFIG_CRYPTO_SERPENT_AVX_X86_64 is not set
# CONFIG_CRYPTO_SERPENT_AVX2_X86_64 is not set
# CONFIG_CRYPTO_SM4 is not set
# CONFIG_CRYPTO_TEA is not set
# CONFIG_CRYPTO_TWOFISH is not set
# CONFIG_CRYPTO_TWOFISH_X86_64 is not set
# CONFIG_CRYPTO_TWOFISH_X86_64_3WAY is not set
# CONFIG_CRYPTO_TWOFISH_AVX_X86_64 is not set

#
# Compression
#
# CONFIG_CRYPTO_DEFLATE is not set
# CONFIG_CRYPTO_LZO is not set
# CONFIG_CRYPTO_842 is not set
# CONFIG_CRYPTO_LZ4 is not set
# CONFIG_CRYPTO_LZ4HC is not set
# CONFIG_CRYPTO_ZSTD is not set

#
# Random Number Generation
#
# CONFIG_CRYPTO_ANSI_CPRNG is not set
CONFIG_CRYPTO_DRBG_MENU=y
CONFIG_CRYPTO_DRBG_HMAC=y
# CONFIG_CRYPTO_DRBG_HASH is not set
# CONFIG_CRYPTO_DRBG_CTR is not set
CONFIG_CRYPTO_DRBG=y
CONFIG_CRYPTO_JITTERENTROPY=y
CONFIG_CRYPTO_USER_API=m
CONFIG_CRYPTO_USER_API_HASH=m
CONFIG_CRYPTO_USER_API_SKCIPHER=m
CONFIG_CRYPTO_USER_API_RNG=m
CONFIG_CRYPTO_USER_API_AEAD=m
# CONFIG_CRYPTO_STATS is not set
CONFIG_CRYPTO_HASH_INFO=y

#
# Crypto library routines
#
CONFIG_CRYPTO_LIB_AES=y
CONFIG_CRYPTO_LIB_ARC4=y
# CONFIG_CRYPTO_LIB_BLAKE2S is not set
CONFIG_CRYPTO_LIB_CHACHA_GENERIC=y
# CONFIG_CRYPTO_LIB_CHACHA is not set
# CONFIG_CRYPTO_LIB_CURVE25519 is not set
CONFIG_CRYPTO_LIB_POLY1305_RSIZE=11
CONFIG_CRYPTO_LIB_POLY1305_GENERIC=y
# CONFIG_CRYPTO_LIB_POLY1305 is not set
# CONFIG_CRYPTO_LIB_CHACHA20POLY1305 is not set
CONFIG_CRYPTO_LIB_SHA256=y
CONFIG_CRYPTO_HW=y
# CONFIG_CRYPTO_DEV_PADLOCK is not set
# CONFIG_CRYPTO_DEV_ATMEL_ECC is not set
# CONFIG_CRYPTO_DEV_ATMEL_SHA204A is not set
CONFIG_CRYPTO_DEV_CCP=y
CONFIG_CRYPTO_DEV_CCP_DD=m
CONFIG_CRYPTO_DEV_SP_CCP=y
CONFIG_CRYPTO_DEV_CCP_CRYPTO=m
CONFIG_CRYPTO_DEV_SP_PSP=y
# CONFIG_CRYPTO_DEV_CCP_DEBUGFS is not set
# CONFIG_CRYPTO_DEV_QAT_DH895xCC is not set
# CONFIG_CRYPTO_DEV_QAT_C3XXX is not set
# CONFIG_CRYPTO_DEV_QAT_C62X is not set
# CONFIG_CRYPTO_DEV_QAT_DH895xCCVF is not set
# CONFIG_CRYPTO_DEV_QAT_C3XXXVF is not set
# CONFIG_CRYPTO_DEV_QAT_C62XVF is not set
# CONFIG_CRYPTO_DEV_NITROX_CNN55XX is not set
# CONFIG_CRYPTO_DEV_SAFEXCEL is not set
# CONFIG_CRYPTO_DEV_AMLOGIC_GXL is not set
CONFIG_ASYMMETRIC_KEY_TYPE=y
CONFIG_ASYMMETRIC_PUBLIC_KEY_SUBTYPE=y
CONFIG_X509_CERTIFICATE_PARSER=y
# CONFIG_PKCS8_PRIVATE_KEY_PARSER is not set
CONFIG_PKCS7_MESSAGE_PARSER=y
# CONFIG_PKCS7_TEST_KEY is not set
# CONFIG_SIGNED_PE_FILE_VERIFICATION is not set


#
# Certificates for signature checking
#
CONFIG_SYSTEM_TRUSTED_KEYRING=y
CONFIG_SYSTEM_TRUSTED_KEYS=""
# CONFIG_SYSTEM_EXTRA_CERTIFICATE is not set
# CONFIG_SECONDARY_TRUSTED_KEYRING is not set
# CONFIG_SYSTEM_BLACKLIST_KEYRING is not set
# end of Certificates for signature checking


Enjoy it, use it and good luck :wink:
jpsollie
Apprentice
Apprentice


Joined: 17 Aug 2013
Posts: 291

PostPosted: Wed Dec 02, 2020 7:57 am    Post subject: Reply with quote

I have a few issues with the devcrypto engine of OpenSSL.
I installed everything as requested and applied the following patch to cryptodev to make it work under 5.9 kernels:
Code:

From 2f5e08aebf9229599aae7f25db752f74221cd71d Mon Sep 17 00:00:00 2001
From: Joan Bruguera <joanbrugueram@gmail.com>
Date: Fri, 14 Aug 2020 00:13:38 +0200
Subject: [PATCH] Fix build for Linux 5.9-rc1

See also: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=64019a2e467a288a16b65ab55ddcbf58c1b00187
          https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=bce617edecada007aee8610fbe2c14d10b8de2f6
          https://lore.kernel.org/lkml/CAHk-=wj_V2Tps2QrMn20_W0OJF9xqNh52XSGA42s-ZJ8Y+GyKw@mail.gmail.com/

Signed-off-by: Joan Bruguera <joanbrugueram@gmail.com>
---
 zc.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/zc.c b/zc.c
index a560db5..fdf7da1 100644
--- a/zc.c
+++ b/zc.c
@@ -76,10 +76,14 @@ int __get_userbuf(uint8_t __user *addr, uint32_t len, int write,
        ret = get_user_pages_remote(task, mm,
                        (unsigned long)addr, pgcount, write ? FOLL_WRITE : 0,
                        pg, NULL);
-#else
+#elif (LINUX_VERSION_CODE < KERNEL_VERSION(5, 9, 0))
        ret = get_user_pages_remote(task, mm,
                        (unsigned long)addr, pgcount, write ? FOLL_WRITE : 0,
                        pg, NULL, NULL);
+#else
+       ret = get_user_pages_remote(mm,
+                       (unsigned long)addr, pgcount, write ? FOLL_WRITE : 0,
+                       pg, NULL, NULL);
 #endif
 #if (LINUX_VERSION_CODE < KERNEL_VERSION(5, 8, 0))
        up_read(&mm->mmap_sem);
--
2.26.2


But while trying to establish an SSH session, the devcrypto engine complained that algorithms 4 and 12 were not found.
Unfortunately, I couldn't find anything in my sshd_config mentioning CAST or RC4, so I created a new patch that adds these capabilities to the kernel module:

Code:

--- a/ioctl.c        2020-07-28 10:03:59.000000000 +0200
+++ b/ioctl.c       2020-11-30 15:21:42.254425766 +0100
@@ -141,6 +141,9 @@
        case CRYPTO_BLF_CBC:
                alg_name = "cbc(blowfish)";
                break;
+       case CRYPTO_CAST_CBC:
+               alg_name = "cbc(cast5)";
+               break;
        case CRYPTO_AES_CBC:
                alg_name = "cbc(aes)";
                break;
@@ -154,6 +157,10 @@
                alg_name = "ctr(aes)";
                stream = 1;
                break;
+       case CRYPTO_ARC4:
+               alg_name = "ecb(arc4)";
+               stream = 1;
+               break;
        case CRYPTO_AES_GCM:
                alg_name = "gcm(aes)";
                stream = 1;


But no success, unfortunately: when I load the cryptodev module and try to ssh to another machine, I get the error message "main: mux digest failed" (see below).
This is clearly a bug. However, I do not know where to start: is this a cryptodev issue or an openssl issue?
_________________
The power of Gentoo optimization (not overclocked): https://www.passmark.com/baselines/V10/images/503714802842.png
Zucca
Moderator
Moderator


Joined: 14 Jun 2007
Posts: 3411
Location: Rasi, Finland

PostPosted: Mon Dec 14, 2020 5:31 pm    Post subject: Reply with quote

Interesting...
Can someone give me a TL;DR of which processes this speeds up? ssh probably. Can it help with https?
_________________
..: Zucca :..
Gentoo IRC channels reside on Libera.Chat.
--
Quote:
I am NaN! I am a man!
Ant P.
Watchman
Watchman


Joined: 18 Apr 2009
Posts: 6920

PostPosted: Tue Dec 15, 2020 12:53 am    Post subject: Reply with quote

It'll help if the software is using legacy ciphers (AES) and you're CPU-bound by them (more than 200MB/s of IO on a single thread).

Useful for disk encryption; for anything else you'll never win back the time spent getting it to work.
Zucca
Moderator
Moderator


Joined: 14 Jun 2007
Posts: 3411
Location: Rasi, Finland

PostPosted: Tue Dec 15, 2020 11:03 am    Post subject: Reply with quote

Oh well...
The only thing I might need it for is my NFS shares, but I don't have encryption on them.
Until I set up encryption on my storage... maybe.
_________________
..: Zucca :..
Gentoo IRC channels reside on Libera.Chat.
--
Quote:
I am NaN! I am a man!
jpsollie
Apprentice
Apprentice


Joined: 17 Aug 2013
Posts: 291

PostPosted: Thu Feb 18, 2021 6:00 am    Post subject: Reply with quote

I'd like to report back here: I got it working globally with the afalg interface!
Cryptodev was utterly broken when compiling openssh: there were libcrypt errors in the ssh output, so not a good idea.
Does it work? Yes.
Does it compete with a Threadripper 1950X with AES-NI? No.
Why do it? I assume it's more power-efficient to wake up a CCP than to let a huge CPU core handle SSH traffic.

So, step 1:
All possible algorithms in your kernel need to be compiled as modules, even sha1_generic and sha256_generic!
You may need to force some other kernel functionality into modules as well... sorry, no way around that.

Step 2: enable afalg autoload (in /etc/conf.d/modules):
modules="ccp_crypto crc32-pclmul algif_rng algif_aead algif_skcipher algif_hash"
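To check after a reboot that the autoload list took effect, something like the following Python sketch can compare the wanted names with /proc/modules. Two assumptions to keep in mind: /proc/modules lists names with underscores (so crc32-pclmul shows up as crc32_pclmul), and anything built into the kernel does not appear there at all. The function names are made up for this example.

```python
def normalize(name: str) -> str:
    # modprobe treats '-' and '_' as equivalent; /proc/modules uses '_'.
    return name.replace("-", "_")


def missing_modules(wanted, proc_modules_text):
    """Return the wanted modules that are not listed in the /proc/modules text."""
    loaded = {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }
    return [m for m in wanted if normalize(m) not in loaded]


# Same list as in /etc/conf.d/modules above.
WANTED = "ccp_crypto crc32-pclmul algif_rng algif_aead algif_skcipher algif_hash".split()

try:
    with open("/proc/modules") as fh:
        proc_text = fh.read()
except OSError:
    proc_text = ""  # not on Linux, or /proc not mounted

print("not loaded:", missing_modules(WANTED, proc_text) or "none")
```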

Step 3: inspect /proc/crypto to see which algorithms the CCP supports:
Code:

linuxserver ~ # cat /proc/crypto | grep ccp
driver       : rsa-ccp
module       : ccp_crypto
driver       : hmac-sha512-ccp
module       : ccp_crypto
driver       : sha512-ccp
module       : ccp_crypto
driver       : hmac-sha384-ccp
module       : ccp_crypto
driver       : sha384-ccp
module       : ccp_crypto
driver       : hmac-sha256-ccp
module       : ccp_crypto
driver       : sha256-ccp
module       : ccp_crypto
driver       : hmac-sha224-ccp
module       : ccp_crypto
driver       : sha224-ccp
module       : ccp_crypto
driver       : hmac-sha1-ccp
module       : ccp_crypto
driver       : sha1-ccp
module       : ccp_crypto
driver       : cbc-des3-ccp
module       : ccp_crypto
driver       : ecb-des3-ccp
module       : ccp_crypto
driver       : gcm-aes-ccp
module       : ccp_crypto
driver       : xts-aes-ccp
module       : ccp_crypto
driver       : cmac-aes-ccp
module       : ccp_crypto
driver       : rfc3686-ctr-aes-ccp
module       : ccp_crypto
driver       : ctr-aes-ccp
module       : ccp_crypto
driver       : ofb-aes-ccp
module       : ccp_crypto
driver       : cfb-aes-ccp
module       : ccp_crypto
driver       : cbc-aes-ccp
module       : ccp_crypto
driver       : ecb-aes-ccp
module       : ccp_crypto


Step 4:
Take a second look at /proc/crypto and search for one of the algorithms handled by the CCP driver. Note the driver priority (on my PC it was 300).
Search through the list for drivers of the same algorithm with a higher priority and unload them. On my PC, the aes-ni driver had a priority of 400, so whenever the afalg functionality was accessed, it would use the aes-ni module instead of the CCP. So remove that one. sha256-ssse3 also needed to be removed.
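The priority hunt in step 4 can be scripted. Here is a rough Python sketch that parses /proc/crypto-style text and lists, for every algorithm the CCP provides, the drivers that outrank it. The SAMPLE text is an illustrative excerpt, not a copy of a real file (real entries carry more fields); the 300 and 400 priorities are the ones mentioned above, and both function names are hypothetical.

```python
import re

# Illustrative excerpt in /proc/crypto layout (real entries have more fields).
SAMPLE = """\
name         : cbc(aes)
driver       : cbc-aes-ccp
module       : ccp_crypto
priority     : 300

name         : cbc(aes)
driver       : cbc-aes-aesni
module       : aesni_intel
priority     : 400

name         : sha256
driver       : sha256-ccp
module       : ccp_crypto
priority     : 300
"""


def parse_proc_crypto(text):
    """Split the text into one dict of fields per blank-line separated entry."""
    entries = []
    for block in re.split(r"\n\s*\n", text.strip()):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if fields:
            entries.append(fields)
    return entries


def drivers_above_ccp(entries):
    """Map algorithm name -> [(driver, module, priority)] that outrank the CCP."""
    by_name = {}
    for entry in entries:
        by_name.setdefault(entry.get("name"), []).append(entry)
    blockers = {}
    for name, group in by_name.items():
        ccp = [e for e in group if e.get("driver", "").endswith("-ccp")]
        if not ccp:
            continue  # the CCP does not provide this algorithm
        best_ccp = max(int(e.get("priority", 0)) for e in ccp)
        higher = [
            (e.get("driver"), e.get("module"), int(e.get("priority", 0)))
            for e in group
            if int(e.get("priority", 0)) > best_ccp
        ]
        if higher:
            blockers[name] = higher
    return blockers


print(drivers_above_ccp(parse_proc_crypto(SAMPLE)))
```

On a real system, pass the contents of /proc/crypto instead of SAMPLE; the module column then tells you what to unload.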

Step 5:
Replace the openssl afalg engine with a more "forced" one: https://github.com/cotequeiroz/afalg_engine
I modified CMakeLists.txt to turn on zero copy and digest functionality, and turned off fallback.
Build the afalg .so file (cmake . && make) and copy it to /usr/lib64/engines-1.1/afalg.so.

step 6: enable afalg engine in /etc/ssl/openssl.cnf:
Code:

HOME                    = .
openssl_conf = default_conf

[default_conf]
engines = afalg_sect
oid_section             = new_oids


[afalg_sect]
afalg = afalg_engine_on

[afalg_engine_on]
default_algorithms = ALL
CIPHERS = ALL
DIGESTS = ALL
init = 1


Step 7: verify everything is working correctly:
Code:

openssl engine -pre DUMP_INFO afalg
(afalg) AF_ALG engine
Information about ciphers supported by the AF_ALG engine:
Cipher DES-CBC, NID=31, AF_ALG info: name=cbc(des). AF_ALG socket bind failed.
Cipher DES-EDE3-CBC, NID=44, AF_ALG info: name=cbc(des3_ede), driver=cbc-des3-ccp (hw accelerated)
Cipher BF-CBC, NID=91, AF_ALG info: name=cbc(blowfish). AF_ALG socket bind failed.
Cipher CAST5-CBC, NID=108, AF_ALG info: name=cbc(cast5), driver=cbc-cast5-avx (software)
Cipher AES-128-CBC, NID=419, AF_ALG info: name=cbc(aes), driver=cbc-aes-ccp (hw accelerated)
Cipher AES-192-CBC, NID=423, AF_ALG info: name=cbc(aes), driver=cbc-aes-ccp (hw accelerated)
Cipher AES-256-CBC, NID=427, AF_ALG info: name=cbc(aes), driver=cbc-aes-ccp (hw accelerated)
Cipher AES-128-CTR, NID=904, AF_ALG info: name=ctr(aes), driver=ctr-aes-ccp (hw accelerated)
Cipher AES-192-CTR, NID=905, AF_ALG info: name=ctr(aes), driver=ctr-aes-ccp (hw accelerated)
Cipher AES-256-CTR, NID=906, AF_ALG info: name=ctr(aes), driver=ctr-aes-ccp (hw accelerated)
Cipher AES-128-ECB, NID=418, AF_ALG info: name=ecb(aes), driver=ecb-aes-ccp (hw accelerated)
Cipher AES-192-ECB, NID=422, AF_ALG info: name=ecb(aes), driver=ecb-aes-ccp (hw accelerated)
Cipher AES-256-ECB, NID=426, AF_ALG info: name=ecb(aes), driver=ecb-aes-ccp (hw accelerated)

Information about digests supported by the AF_ALG engine:
Digest MD5, NID=4, AF_ALG info: name=md5. AF_ALG socket bind failed.
Digest SHA1, NID=64, AF_ALG info: name=sha1, driver=sha1-ccp (hw accelerated)
Digest RIPEMD160, NID=117, AF_ALG info: name=rmd160, driver=rmd160-generic (software)
Digest SHA224, NID=675, AF_ALG info: name=sha224, driver=sha224-ccp (hw accelerated)
Digest SHA256, NID=672, AF_ALG info: name=sha256, driver=sha256-ccp (hw accelerated)
Digest SHA384, NID=673, AF_ALG info: name=sha384, driver=sha384-ccp (hw accelerated)
Digest SHA512, NID=674, AF_ALG info: name=sha512, driver=sha512-ccp (hw accelerated)

[Success]: DUMP_INFO
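
If you want to check this from a script rather than by eye, the DUMP_INFO listing can be parsed. The following is only a rough sketch: it assumes the line format is exactly as printed above, and the names `LINE_RE`, `classify` and `SAMPLE` are made up for illustration (they are not part of OpenSSL or the engine). The sample lines are copied from the output above.

```python
import re

# A few lines copied from the DUMP_INFO output above.
SAMPLE = """\
Cipher DES-CBC, NID=31, AF_ALG info: name=cbc(des). AF_ALG socket bind failed.
Cipher CAST5-CBC, NID=108, AF_ALG info: name=cbc(cast5), driver=cbc-cast5-avx (software)
Cipher AES-256-CBC, NID=427, AF_ALG info: name=cbc(aes), driver=cbc-aes-ccp (hw accelerated)
Digest MD5, NID=4, AF_ALG info: name=md5. AF_ALG socket bind failed.
Digest SHA256, NID=672, AF_ALG info: name=sha256, driver=sha256-ccp (hw accelerated)
"""

# Assumed line format (taken from the listing above, not from any spec):
#   <Cipher|Digest> <ALG>, NID=<n>, AF_ALG info: name=<x>[, driver=<d> (<kind>)]
LINE_RE = re.compile(
    r"^(?:Cipher|Digest) (?P<alg>[^,]+), NID=\d+, AF_ALG info: name=[^,.]+"
    r"(?:, driver=(?P<driver>\S+) \((?P<kind>[^)]+)\))?"
)


def classify(dump_text):
    """Group algorithm names by what the engine reports for them."""
    groups = {"hw accelerated": [], "software": [], "unavailable": []}
    for line in dump_text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # headers, blank lines, "[Success]: DUMP_INFO", ...
        # Lines without a driver are the "socket bind failed" ones.
        kind = m.group("kind") or "unavailable"
        groups.setdefault(kind, []).append(m.group("alg"))
    return groups


groups = classify(SAMPLE)
for kind, algs in groups.items():
    print(f"{kind}: {', '.join(algs) or '-'}")
```

Anything that ends up under "hw accelerated" is served by a -ccp driver in the listing above; the "unavailable" entries are the ones where the AF_ALG socket bind failed.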


From now on, you're on the CCP! To disable it again, the only thing you need to do is comment out the afalg line (engines = afalg_sect) in the [default_conf] section of openssl.cnf.
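
Switching the engine on and off can also be scripted. This is a minimal sketch, assuming the config contains the `engines = afalg_sect` line exactly as in step 6; the helper name `set_afalg_enabled` is hypothetical. It only transforms text, so reading and writing /etc/ssl/openssl.cnf (as root, ideally after making a backup) is left to you.

```python
import re

# Matches the "engines = afalg_sect" line, whether or not it is already
# commented out with a leading '#'.
ENGINES_RE = re.compile(
    r"^[ \t]*#?[ \t]*(engines[ \t]*=[ \t]*afalg_sect[ \t]*)$", re.M
)


def set_afalg_enabled(conf_text: str, enabled: bool) -> str:
    """Return conf_text with the afalg engines line enabled or commented out."""
    prefix = "" if enabled else "#"
    return ENGINES_RE.sub(lambda m: prefix + m.group(1), conf_text)


# Fragment of the [default_conf] section from step 6.
conf = "[default_conf]\nengines = afalg_sect\noid_section             = new_oids\n"

disabled = set_afalg_enabled(conf, False)
print(disabled)
```

Calling it again with `True` restores the original line, and calling it twice with `False` does not stack up extra `#` characters.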
Please note that it is much slower than AES-NI!
In software (with AES-NI):
Code:

openssl speed -evp aes256 -elapsed
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-cbc for 3s on 16 size blocks: 142495202 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 64 size blocks: 43470464 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 256 size blocks: 11501836 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 1024 size blocks: 2918562 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 366432 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 16384 size blocks: 183275 aes-256-cbc's in 3.00s
OpenSSL 1.1.1i  8 Dec 2020
built on: Wed Feb 17 12:23:43 2021 UTC
options:bn(64,64) rc4(8x,int) des(int) aes(partial) idea(int) blowfish(ptr)
compiler: x86_64-pc-linux-gnu-gcc -fPIC -pthread -m64 -Wa,--noexecstack -march=znver1 -O3 -flto -pipe -fno-strict-aliasing -Wa,--noexecstack -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAESNI_ASM -DVPAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPOLY1305_ASM -DZLIB -DNDEBUG  -DOPENSSL_NO_BUF_FREELISTS
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-256-cbc     759974.41k   927369.90k   981490.01k   996202.50k  1000603.65k  1000925.87k

And when running on CCP:
Code:

linuxserver ~ # openssl speed -evp aes256 -elapsed
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-cbc for 3s on 16 size blocks: 68404 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 64 size blocks: 66817 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 256 size blocks: 69127 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 1024 size blocks: 63035 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 51287 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 16384 size blocks: 38434 aes-256-cbc's in 3.00s
OpenSSL 1.1.1i  8 Dec 2020
built on: Wed Feb 17 12:23:43 2021 UTC
options:bn(64,64) rc4(8x,int) des(int) aes(partial) idea(int) blowfish(ptr)
compiler: x86_64-pc-linux-gnu-gcc -fPIC -pthread -m64 -Wa,--noexecstack -march=znver1 -O3 -flto -pipe -fno-strict-aliasing -Wa,--noexecstack -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAESNI_ASM -DVPAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPOLY1305_ASM -DZLIB -DNDEBUG  -DOPENSSL_NO_BUF_FREELISTS
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-256-cbc        364.82k     1425.43k     5898.84k    21515.95k   140047.70k   209900.89k
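
To put the two runs side by side, here is a small sketch that only does arithmetic on the numbers printed above (nothing is measured here, and the variable and function names are just for illustration). The CCP comes out roughly 2000x slower at 16-byte blocks and a bit under 5x slower at 16384-byte blocks. The time per call on the CCP changes much less than the block size does, which would fit a large fixed cost per request, but that is an interpretation of these numbers, not something the benchmark proves.

```python
# Throughput rows copied from the two `openssl speed` runs above,
# in units of 1000 bytes per second.
BLOCK_SIZES = [16, 64, 256, 1024, 8192, 16384]
SOFTWARE = [759974.41, 927369.90, 981490.01, 996202.50, 1000603.65, 1000925.87]
CCP = [364.82, 1425.43, 5898.84, 21515.95, 140047.70, 209900.89]

# Number of aes-256-cbc operations the CCP run completed in 3.00 s.
CCP_OPS = [68404, 66817, 69127, 63035, 51287, 38434]


def slowdown(software, ccp):
    """How many times slower the CCP run is, per block size."""
    return [s / c for s, c in zip(software, ccp)]


def latency_us(ops, seconds=3.0):
    """Average wall-clock time per operation, in microseconds."""
    return [seconds / n * 1e6 for n in ops]


ratios = slowdown(SOFTWARE, CCP)
latencies = latency_us(CCP_OPS)

for size, ratio, lat in zip(BLOCK_SIZES, ratios, latencies):
    print(f"{size:>6} B: CCP is {ratio:8.1f}x slower, ~{lat:5.1f} us per call")
```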


Finally, modify your OpenSSH server configuration (sshd_config) to use the AES / RSA / SHA-256 algorithms instead of the elliptic-curve ones (ECDSA, Ed25519):
Code:

Ciphers aes128-cbc,aes192-cbc,aes256-cbc,aes128-ctr,aes192-ctr,aes256-ctr
HostKeyAlgorithms rsa-sha2-256,rsa-sha2-512,rsa-sha2-256-cert-v01@openssh.com,rsa-sha2-512-cert-v01@openssh.com,ssh-rsa
KexAlgorithms diffie-hellman-group14-sha256,diffie-hellman-group16-sha512,diffie-hellman-group18-sha512,diffie-hellman-group-exchange-sha256

HostKey /etc/ssh/ssh_host_rsa_key


So why use it?
- Having a dedicated CCP may protect you against DDoS attacks (the CCP will get overloaded, but the CPU won't care)
- The CCP may be more power efficient (not sure; I'll contact the AMD CCP developers and ask)
- Because we can! Having a chip in your PC but not using it is a waste of silicon

I'll try to modify the forced OpenSSL afalg engine so that it accepts all features of the afalg-implemented CCP drivers, but I'm not sure whether that will work.
_________________
The power of Gentoo optimization (not overclocked): [img]https://www.passmark.com/baselines/V10/images/503714802842.png[/img]
Zucca
Moderator

Joined: 14 Jun 2007
Posts: 3411
Location: Rasi, Finland

Posted: Fri Feb 19, 2021 7:04 pm

I was hoping the CCP was implemented in earlier CPUs.
My home server has a rather old Opteron 3380. Looks like all it has is some AES instructions up its sleeve. Shame.
_________________
..: Zucca :..
Gentoo IRC channels reside on Libera.Chat.
--
Quote:
I am NaN! I am a man!
jpsollie
Apprentice

Joined: 17 Aug 2013
Posts: 291

Posted: Sat Jul 30, 2022 9:26 am

There seems to be light at the end of the tunnel:
OpenSSL 3.0 allows CBC offloading from OpenSSH to the CCP via afalg.
CTR mode doesn't work, unfortunately.
devcrypto is still a no-go for OpenSSH.
Gatak
Apprentice

Joined: 04 Jan 2004
Posts: 174

Posted: Sat Jul 30, 2022 10:09 am

jpsollie wrote:
There seems to be light at the end of the tunnel:
OpenSSL 3.0 allows CBC offloading from OpenSSH to the CCP via afalg.
CTR mode doesn't work, unfortunately.
devcrypto is still a no-go for OpenSSH.


OpenSSL 3 is hard masked. Guessing it will take some time before we can upgrade. But it is indeed good news.
jpsollie
Apprentice

Joined: 17 Aug 2013
Posts: 291

Posted: Mon Aug 01, 2022 4:41 am

Hard masked sounds worse than it actually is:
if you put it in /etc/portage/package.unmask, it's fine.
So far, I've found efitools to be incompatible with the new OpenSSL; other packages didn't care.