Huawei OceanStor A Series Storage and NVIDIA GDS Interoperability Test Report


Axians Global

All Rights Reserved

Executive Summary

Axians Global (“Axians”) assessed the interoperability of Huawei OceanStor A Series Storage (hereinafter also referred to as “the storage”) and NVIDIA GPUDirect Storage (GDS) through NFS over RDMA.

In the assessment, Axians determined that NVIDIA GDS functions with Huawei OceanStor A Series Storage in the following scenarios:

| Test Scenario | Storage Involved | Result |
|---|---|---|
| Basic Function – NAS | OceanStor A Series Storage | Passed |
| Reliability – NAS | OceanStor A Series Storage | Passed |

This document contains detailed test cases and the corresponding output. The test procedures are referenced from standard online documentation from NVIDIA and Huawei.

1. Environment Configuration

1.1 Networking diagram

Figure 1 Huawei OceanStor A Series storage and NVIDIA GDS Test Networking

Networking description:

  • Two physical servers were deployed as NVIDIA Nodes, with MLNX_OFED, GDS, and CUDA software installed.
  • The management network and service network are configured to connect the NVIDIA Nodes and Huawei storage through switches.
  • The management network connects the NAS client and Huawei storage via a 10G TCP/IP network. The service network connects the NAS client and Huawei storage via a 200G RoCE network.

1.2 Hardware and Software Configuration

1.2.1 Storage Configuration

Table 1-1 Huawei storage configuration table

| Name | Model | Version | Quantity |
|---|---|---|---|
| Storage | OceanStor A Series | V700 | 1 |

1.2.2 Matching Hardware Configuration

Table 1-2 Hardware Configuration

| Name | Description | Usage | Quantity |
|---|---|---|---|
| Server | x86 server (GPU: Tesla V100; NIC: CX-7) | NVIDIA node | 1 |
| Ethernet switch | CE9860 | Service Network Switch | 1 |

1.2.3 Test Software and Tools

Table 1-3 Test Software and Tool List

| Software Name | Description | Version |
|---|---|---|
| Operating system | Ubuntu | 22.04.4 |
| MOFED | MLNX-OFED | 24.10-2.1.8.0 |
| CUDA | NVIDIA CUDA toolkit | 12.8 |
| NVIDIA Driver | NVIDIA GPU Driver | 570.133.07 |
| GDSIO | IO test tool | 12.8 |
| Vdbench | IO test tool | 5.0.4.7 |

2. Test Cases and Records

2.1 Functional Test

2.1.1 Verify the NVIDIA GDS configuration

Test Purpose

To verify that the NVIDIA GDS configuration is available.

Test Networking

Huawei OceanStor A Series storage and NVIDIA GDS Test Networking.

Prerequisites

1. Huawei OceanStor A Series storage system is running normally.

2. The NVIDIA nodes are running normally.

Test Procedure


  1. Run gdscheck.py to check the driver configuration of NFS.
  2. Mount the NFS filesystem of the storage on the NVIDIA node using the RDMA protocol.
  3. Run the gdsio tool on the NVIDIA node to generate a workload against the storage, and check the XferType.
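
The mount and XferType check in the procedure above can be sketched in Python as follows. This is only an illustrative sketch, not the exact commands used in the test: the server address, export path, mount point, and NFS version are hypothetical placeholders, and the parser assumes that gdsio prints a summary line containing an "XferType:" field.

```python
import re
import shlex


def rdma_mount_command(server, export, mountpoint, nfs_version="3", port=20049):
    """Build an NFS-over-RDMA mount command as an argument list.

    The NFS version is an assumption; the report does not state it.
    Port 20049 is the registered NFS/RDMA port.
    """
    options = f"vers={nfs_version},proto=rdma,port={port}"
    return ["mount", "-t", "nfs", "-o", options, f"{server}:{export}", mountpoint]


def xfer_types(gdsio_output):
    """Return every XferType value found in gdsio output text."""
    return re.findall(r"XferType:\s*(\S+)", gdsio_output)


if __name__ == "__main__":
    # Hypothetical address and paths, for illustration only.
    print(shlex.join(rdma_mount_command("192.0.2.10", "/gds_fs", "/mnt/gds")))
    # Illustrative summary line, not captured from the test.
    print(xfer_types("IoType: WRITE XferType: GPUD Threads: 4"))
```

A result of GPUD from the parser corresponds to the expected result of step 3.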

Expected Result


  1. In step 1, the driver configuration of NFS is supported.
  2. In step 2, the NFS filesystem of the storage is mounted on the NVIDIA node over RDMA successfully.
  3. In step 3, XferType is GPUD.

Test Results

1. In step 1, the driver configuration of NFS is supported.

2. In step 2, the NFS filesystem of the storage is mounted on the NVIDIA node over RDMA successfully.

3. In step 3, XferType is GPUD.

Test Conclusion

Passed

2.1.2 Modify file and directory permissions using the chmod command

Test Purpose

To verify that file and directory permissions can be successfully modified as required.

Test Networking

Huawei OceanStor A Series storage and NVIDIA GDS Test Networking.

Prerequisites

1. Huawei OceanStor A Series storage system is running normally.

2. The NVIDIA nodes are running normally.

3. The NFS filesystem of the storage is mounted on the NVIDIA node using the RDMA protocol.

Test Procedure

1. Use the [chmod 000] command on the NVIDIA node to modify the permissions of a file and a directory in the shared directory, then check the permissions of the file and the directory.

2. Use the [chmod 777] command on the NVIDIA node to modify the permissions of a file and a directory in the shared directory, then check the permissions of the file and the directory.
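
As a rough illustration of the two steps above, the following Python sketch applies both modes to a file and a directory and reports the resulting ls-style permission strings. The file and directory names are hypothetical; in the test, the base directory would be the mounted shared directory.

```python
import os
import stat
import tempfile


def mode_string(path):
    """Return the ls-style permission string for path, e.g. '-rwxrwxrwx'."""
    return stat.filemode(os.stat(path).st_mode)


def run_check(base_dir):
    """Apply modes 000 and 777 to a file and a directory; return the strings seen."""
    file_path = os.path.join(base_dir, "perm_test_file")  # hypothetical name
    dir_path = os.path.join(base_dir, "perm_test_dir")    # hypothetical name
    with open(file_path, "w") as handle:
        handle.write("gds\n")
    os.mkdir(dir_path)
    results = {}
    # 0o777 is applied last so that the test objects can be cleaned up afterwards.
    for mode in (0o000, 0o777):
        os.chmod(file_path, mode)
        os.chmod(dir_path, mode)
        results[mode] = (mode_string(file_path), mode_string(dir_path))
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for mode, strings in run_check(tmp).items():
            print(oct(mode), strings)
```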

Expected Result


1. In step 1, the file permissions are: ----------.

(No read, write, or execute permissions for Owner, Group, and Others.)

The directory permissions are: d---------.

(No read, write, or execute permissions for Owner, Group, and Others; d indicates a directory.)

2. In step 2, the file permissions are: -rwxrwxrwx.

(Owner, Group, and Others have read, write, and execute permissions.)

The directory permissions are: drwxrwxrwx.

(Owner, Group, and Others have read, write, and execute permissions; d indicates a directory.)

Test Results


1. In step 1, the file permissions are: ----------, and the directory permissions are: d---------.

2. In step 2, the file permissions are: -rwxrwxrwx, and the directory permissions are: drwxrwxrwx.

Test Conclusion

Passed

2.1.3 Create files with filenames in various languages

Test Purpose

To verify that files with names in various languages can be created and renamed.

Test Networking

Huawei OceanStor A Series storage and NVIDIA GDS Test Networking.

Prerequisites

1. Huawei OceanStor A Series storage system is running normally.

2. The NVIDIA nodes are running normally.

3. The NFS filesystem of the storage is mounted on the NVIDIA node using the RDMA protocol.

Test Procedure


1. Use the [echo] command to create files with filenames in various languages in the shared directory. Use the [stat] command to verify the files’ properties.

2. Rename the files using English characters. Use the [stat] command to verify the files’ properties.

Languages include:

Simplified Chinese, Traditional Chinese (寧鈴禮劉), English (english), Japanese (にほんご), Korean (조선말), Spanish (Habláis), Arabic (مع سّلامة), French (Désolé), Russian (Поучиться), Numbers (123).
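
The create, stat, and rename sequence above could be automated along the following lines. This is a sketch under stated assumptions: the sample names are copied from the language list, the target names ("renamed_0", "renamed_1", and so on) are hypothetical, and "properties remain consistent" is interpreted here as unchanged inode number, size, and mode.

```python
import os
import tempfile

# Sample names copied from the language list in this test case.
SAMPLE_NAMES = ["寧鈴禮劉", "english", "にほんご", "조선말", "Habláis",
                "مع سّلامة", "Désolé", "Поучиться", "123"]


def create_and_rename(base_dir, names):
    """Create one file per name, rename it to ASCII, and compare stat fields."""
    report = []
    for index, name in enumerate(names):
        original = os.path.join(base_dir, name)
        with open(original, "w", encoding="utf-8") as handle:
            handle.write("hello\n")
        before = os.stat(original)
        renamed = os.path.join(base_dir, f"renamed_{index}")  # hypothetical target name
        os.rename(original, renamed)
        after = os.stat(renamed)
        unchanged = ((before.st_ino, before.st_size, before.st_mode)
                     == (after.st_ino, after.st_size, after.st_mode))
        report.append((name, unchanged))
    return report


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for name, unchanged in create_and_rename(tmp, SAMPLE_NAMES):
            print(name, "properties unchanged:", unchanged)
```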

Expected Result


1. In step 1, the files were successfully created, and their properties were successfully verified.


2. In step 2, the files are renamed successfully, and their properties remain consistent with those before the renaming operation.

Test Results


1. In step 1, the files are created successfully, and their properties are viewed successfully.










2. In step 2, the files are renamed successfully, and their properties remain consistent with those before the renaming operation.

Test Conclusion

Passed

2.1.4 Create a large file

Test Purpose

To verify that the NVIDIA node can create a large file.

Test Networking

Huawei OceanStor A Series storage and NVIDIA GDS Test Networking.

Prerequisites

1. Huawei OceanStor A Series storage system is running normally.

2. The NVIDIA nodes are running normally.

3. The NFS filesystem of the storage is mounted on the NVIDIA node using the RDMA protocol.

Test Procedure

1. Create a 2TB file with gdsio in the shared directory.

2. Check the file with the [ls] command.

3. Delete this file.
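
The size check and deletion in steps 2 and 3 can also be scripted. The sketch below assumes that "2TB" means 2 TiB (2 × 1024^4 bytes); the report does not say whether binary or decimal units were used, and the helper name is hypothetical.

```python
import os
import tempfile

TIB = 1024 ** 4
EXPECTED_BYTES = 2 * TIB  # assumption: "2TB" is read as 2 TiB


def check_and_delete(path, expected_bytes):
    """Return (size matches, file removed) for the file at path."""
    size_ok = os.stat(path).st_size == expected_bytes
    os.remove(path)
    return size_ok, not os.path.exists(path)


if __name__ == "__main__":
    # Demonstration with a small sparse file instead of a real 2 TiB file.
    with tempfile.TemporaryDirectory() as tmp:
        demo = os.path.join(tmp, "demo_file")
        with open(demo, "wb") as handle:
            handle.truncate(4 * 1024 * 1024)
        print(check_and_delete(demo, 4 * 1024 * 1024))
```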

Expected Result


1. In step 1, the file is created successfully.


2. In step 2, the size of the created file is 2TB.


3. In step 3, the file is deleted successfully.

Test Results


1. In step 1, the file is created successfully.

2. In step 2, the size of the created file is 2TB.

3. In step 3, the file is deleted successfully.

Test Conclusion

Passed

2.1.5 Copy files

Test Purpose

To verify that the NVIDIA node can copy files successfully.

Test Networking

Huawei OceanStor A Series storage and NVIDIA GDS Test Networking.

Prerequisites

1. Huawei OceanStor A Series storage system is running normally.

2. The NVIDIA nodes are running normally.

3. The NFS filesystem of the storage is mounted on the NVIDIA node using the RDMA protocol.

Test Procedure


1. Create two directories, A and B, in the shared directory, and create five files in directory A (including files with a name length of 255, files with names in various languages, and files whose names differ only in case).

2. Copy all files from directory A to directory B.

3. Copy all files from directory A to the shared root directory.

4. Clear the shared directory, then create directories A and B, along with multiple files, in the shared directory.

5. Copy all files from the shared root directory to directory A.

6. Copy all files from the shared root directory to a local directory.

7. Copy all files from the local directory to shared directory B.
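
One possible way to check that "the content of the target file is consistent with the source file" after each copy step is to compare sizes and SHA-256 digests. The sketch below is an assumption about how such a check could be done, not a record of the method used in the test; the function names are hypothetical.

```python
import hashlib
import os
import shutil
import tempfile


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(src_dir, dst_dir):
    """Copy every regular file from src_dir to dst_dir.

    Returns the names of files whose size or content differs after the copy.
    """
    mismatches = []
    for name in sorted(os.listdir(src_dir)):
        src = os.path.join(src_dir, name)
        if not os.path.isfile(src):
            continue
        dst = os.path.join(dst_dir, name)
        shutil.copy2(src, dst)  # copy2 also preserves timestamps and mode
        if (os.path.getsize(src) != os.path.getsize(dst)
                or sha256_of(src) != sha256_of(dst)):
            mismatches.append(name)
    return mismatches
```

An empty list returned for each copy step corresponds to the expected results of steps 2, 3, 5, 6, and 7.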

Expected Result


1. In step 1, the creation was successful.


2. In step 2, the copy operation was successful; all attributes of the target file are normal; the content of the target file is consistent with the source file.


3. In step 3, the copy operation was successful; all attributes of the target file are normal; the content of the target file is consistent with the source file.


4. In step 5, the copy operation was successful; all attributes of the target file are normal; the content of the target file is consistent with the source file.


5. In step 6, the copy operation was successful; all attributes of the target file are normal; the content of the target file is consistent with the source file.


6. In step 7, the copy operation was successful; all attributes of the target file are normal; the content of the target file is consistent with the source file.

Test Results


1. In step 1, the creation was successful.

2. In step 2, the copy operation was successful; all attributes of the target file are normal; the content of the target file is consistent with the source file.

3. In step 3, the copy operation was successful; all attributes of the target file are normal; the content of the target file is consistent with the source file.

4. In step 5, the copy operation was successful; all attributes of the target file are normal; the content of the target file is consistent with the source file.

5. In step 6, the copy operation was successful; all attributes of the target file are normal; the content of the target file is consistent with the source file.

6. In step 7, the copy operation was successful; all attributes of the target file are normal; the content of the target file is consistent with the source file.

Test Conclusion

Passed

2.1.6 Create directories with directory names in various languages

Test Purpose

To verify that directories with names in various languages can be created and renamed.

Test Networking

Huawei OceanStor A Series storage and NVIDIA GDS Test Networking.

Prerequisites

1. Huawei OceanStor A Series storage system is running normally.

2. The NVIDIA nodes are running normally.

3. The NFS filesystem of the storage is mounted on the NVIDIA node using the RDMA protocol.

Test Procedure

1. Use the [mkdir] command to create directories with directory names in various languages in the shared directory.

2. Rename the directories using English characters.

Languages include:

Simplified Chinese, Traditional Chinese (寧鈴禮劉), English (english), Japanese (にほんご), Korean (조선말), Spanish (Habláis), Arabic (مع سّلامة), French (Désolé), Russian (Поучиться), Numbers (123).

Expected Result

  1. In step 1, the directories are created successfully.
  2. In step 2, the directories are renamed successfully.

Test Results

  1. In step 1, the directories are created successfully.

2. In step 2, the directories are renamed successfully.


Test Conclusion

Passed

2.1.7 Mount the NFS filesystem of the storage on the NVIDIA node using the nconnect parameter

Test Purpose

To verify that the NFS filesystem can be successfully mounted on the NVIDIA node using the nconnect parameter.

Test Networking

Huawei OceanStor A Series storage and NVIDIA GDS Test Networking.

Prerequisites

1. Huawei OceanStor A Series storage system is running normally.

2. The NVIDIA nodes are running normally.

Test Procedure


1. Mount the NFS filesystem of the storage on the NVIDIA node using the RDMA protocol with the nconnect parameter set to 16.

2. Run the gdsio tool on the NVIDIA node to generate a workload against the storage, and check the XferType.

3. Mount the NFS filesystem of the storage on the NVIDIA node using the RDMA protocol with the nconnect parameter set to 17.
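
The values 16 and 17 in the steps above sit on either side of the upper bound that the Linux NFS client accepts for nconnect. The following Python sketch mirrors that bound when building the mount options; the NFS version and port shown are assumptions, since the report does not list the full option string.

```python
NFS_MAX_CONNECTIONS = 16  # upper bound accepted by the Linux NFS client for nconnect


def nconnect_mount_options(nconnect, nfs_version="3", port=20049):
    """Build an RDMA mount option string, rejecting out-of-range nconnect values."""
    if not 1 <= nconnect <= NFS_MAX_CONNECTIONS:
        raise ValueError(
            f"nconnect must be between 1 and {NFS_MAX_CONNECTIONS}, got {nconnect}"
        )
    return f"vers={nfs_version},proto=rdma,port={port},nconnect={nconnect}"


if __name__ == "__main__":
    print(nconnect_mount_options(16))
    try:
        nconnect_mount_options(17)
    except ValueError as error:
        print("rejected:", error)
```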

Expected Result

  1. In step 1, the mount succeeds.
  2. In step 2, XferType is GPUD.
  3. In step 3, the mount fails because the nconnect value is invalid.

Test Results

1. In step 1, the mount succeeds.

2. In step 2, XferType is GPUD.

3. In step 3, the mount fails.


Test Conclusion

Passed

2.2 Reliability Test

2.2.1 NVIDIA node reboot

Test Purpose

To verify that the I/O stress can be resumed after the NVIDIA node reboots.

Test Networking

Huawei OceanStor A Series storage and NVIDIA GDS Test Networking.

Prerequisites

1. Huawei OceanStor A Series storage system is running normally.

2. The NVIDIA nodes are running normally.

3. The file system from the storage is mounted normally on the NVIDIA node.

Test Procedure


1. Run I/O stress through gdsio and vdbench to the mounted file system.


2. Reboot the NVIDIA node.


3. Remount the file system after rebooting.


4. Re-run I/O stress through gdsio and vdbench to the mounted file system.
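
After the remount in step 3, one way to confirm that the file system is mounted again over RDMA is to inspect the mount table. The sketch below parses text in the /proc/mounts format; the sample address and mount point in the usage example are hypothetical, and the check is an illustration rather than the method recorded in this report.

```python
def find_mount(mounts_text, mountpoint):
    """Return (fstype, option list) for mountpoint, or None if it is not mounted.

    Expects text in /proc/mounts format: device mountpoint fstype options dump pass.
    If the mount point appears more than once, the last entry wins.
    """
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mountpoint:
            found = (fields[2], fields[3].split(","))
    return found


def is_rdma_nfs_mount(mounts_text, mountpoint):
    """True if mountpoint is an NFS mount whose options include proto=rdma."""
    entry = find_mount(mounts_text, mountpoint)
    if entry is None:
        return False
    fstype, options = entry
    return fstype.startswith("nfs") and "proto=rdma" in options


if __name__ == "__main__":
    with open("/proc/mounts") as handle:
        print(is_rdma_nfs_mount(handle.read(), "/mnt/gds"))  # hypothetical mount point
```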

Expected Result


1. In step 1, the I/O stress is running normally.


2. In step 2, the node reboots successfully.

3. In step 3, the file system is remounted successfully.


4. In step 4, the I/O stress is running normally.

Test Results


1. In step 1, the I/O stress is running normally.

2. In step 2, the node reboots successfully.

3. In step 3, the file system is remounted successfully.

4. In step 4, the I/O stress is running normally.

Test Conclusion

Passed

2.2.2 Network interface of NVIDIA node failure

Test Purpose

To verify that a network interface failure on the NVIDIA node does not affect the I/O stress.

Test Networking

Huawei OceanStor A Series storage and NVIDIA GDS Test Networking.

Prerequisites

1. Huawei OceanStor A Series storage system is running normally.

2. The NVIDIA nodes are running normally.

3. The file system from the storage is mounted normally on the NVIDIA node.

Test Procedure


1. Run I/O stress through gdsio and vdbench to the mounted file system.

2. Shut down one of the network interfaces on the NVIDIA node.

3. Check the status of the I/O stress.

Expected Result

1. In step 1, the I/O stress is running normally.

2. In step 2, the network interface is shut down successfully.

3. In step 3, the I/O stress is not affected, and the reset time is within the specified range.

Test Results

1. In step 1, the I/O stress is running normally.

2. In step 2, the network interface is shut down successfully.

3. In step 3, the I/O stress is not affected, and the reset time is within the specified range.

Test Conclusion

Passed

2.2.3 Unmount and mount the file system repeatedly

Test Purpose

To verify that repeatedly mounting and unmounting a file system does not affect other mount points.

Test Networking

Huawei OceanStor A Series storage and NVIDIA GDS Test Networking.

Prerequisites

1. Huawei OceanStor A Series storage system is running normally.

2. The NVIDIA nodes are running normally.

3. The file system from the storage is mounted normally on the NVIDIA node.

Test Procedure


1. Run I/O stress through gdsio and vdbench to the mounted file system.


2. Mount and unmount a new NFS file system repeatedly.


3. Check the status of I/O stress.
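
The repeated mount and unmount in step 2 could be driven by a small loop such as the one below. This is a sketch only: the report does not state the number of iterations or the exact commands, so both are passed in as parameters, and the command runner is injectable so that the loop can be exercised without root privileges.

```python
import subprocess


def cycle_mount(mount_cmd, umount_cmd, iterations, run=subprocess.run):
    """Mount and unmount repeatedly; return the number of fully completed cycles.

    mount_cmd and umount_cmd are argument lists. The loop stops at the first
    command that returns a non-zero exit status.
    """
    completed = 0
    for _ in range(iterations):
        if run(mount_cmd).returncode != 0:
            break
        if run(umount_cmd).returncode != 0:
            break
        completed += 1
    return completed
```

If the returned count equals the requested number of iterations, every cycle succeeded; the I/O stress on the other mount point is then checked separately, as in step 3.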

Expected Result


1. In step 1, the I/O stress is running normally.


2. In step 2, the repeated mount and unmount operations on the new NFS file system succeed.

3. In step 3, the I/O stress is not affected.

Test Results


1. In step 1, the I/O stress is running normally.

2. In step 2, the repeated mount and unmount operations on the new NFS file system succeed.

3. In step 3, the I/O stress is not affected.



Test Conclusion

Passed

2.2.4 Rpcbind process failure

Test Purpose

To verify that an NFS process failure does not affect the I/O stress.

Test Networking

Huawei OceanStor A Series storage and NVIDIA GDS Test Networking.

Prerequisites

1. Huawei OceanStor A Series storage system is running normally.

2. The NVIDIA nodes are running normally.

3. The file system from the storage is mounted normally on the NVIDIA node.

Test Procedure


1. Run I/O stress through gdsio and vdbench to the mounted file system.


2. Kill the NFS process on both storage controllers.


3. Check the status of I/O stress.

Expected Result


1. In step 1, the I/O stress is running normally.


2. In step 2, the NFS process is killed successfully.

3. In step 3, the I/O stress is not affected.

Test Results


1. In step 1, the I/O stress is running normally.

2. In step 2, the NFS process is killed successfully.

3. In step 3, the I/O stress is not affected.

Test Conclusion

Passed

2.2.5 Network interface of storage failure

Test Purpose

To verify that a storage interface failure does not affect the I/O stress.

Test Networking

Huawei OceanStor A Series storage and NVIDIA GDS Test Networking.

Prerequisites

1. Huawei OceanStor A Series storage system is running normally.

2. The NVIDIA nodes are running normally.

3. The file system from the storage is mounted normally on the NVIDIA node.

Test Procedure


1. Run I/O stress through gdsio and vdbench to the mounted file system.

2. Shut down the storage interface to which the NVIDIA node is connected.

3. Check the status of I/O stress.
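
The status check in step 3 can be quantified from per-interval throughput samples, for example those reported by the I/O tool at a fixed interval. The sketch below assumes that the "zeroing count" refers to intervals in which throughput drops to zero; this reading, the sample values, and the function name are assumptions, and the acceptance threshold is not stated in this report.

```python
def zero_runs(samples):
    """Summarize zero-throughput periods in a list of per-interval samples.

    Returns (number of separate zero periods, length of the longest period),
    both measured in sampling intervals.
    """
    runs = 0
    longest = 0
    current = 0
    for value in samples:
        if value == 0:
            current += 1
            if current == 1:
                runs += 1  # a new zero period starts here
            longest = max(longest, current)
        else:
            current = 0
    return runs, longest


if __name__ == "__main__":
    # Illustrative samples, not measured data.
    print(zero_runs([5, 4, 0, 0, 3, 0, 6]))
```

With a one-second sampling interval, the longest period gives an estimate of the I/O interruption time in seconds.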

Expected Result


1. In step 1, the I/O stress is running normally.

2. In step 2, the storage interface is shut down successfully.

3. In step 3, the zeroing count of the I/O stress is within the specified range.

Test Results


1. In step 1, the I/O stress is running normally.

2. In step 2, the storage interface is shut down successfully.

3. In step 3, the zeroing count of the I/O stress is within the specified range.




Test Conclusion

Passed