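COAWST Performance Analysis and Optimization v1
System background
A large-scale coupled COAWST (WRF, ROMS, SWAN) case running on a cluster
Problems
- The component programs have too many build options and runtime parameters, and these are complex
- When an I/O option misbehaves, there is no way to tell whether the feature was never compiled in because the required environment variable was not set, or was compiled in but never enabled because the runtime parameter was not set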
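One way to start separating the two cases is to rule out the runtime side first by listing what the namelist actually sets. The following is a minimal sketch, not part of the original workflow: it assumes the WRF settings are available as `namelist.input`-style text, the `read_io_settings` helper and the sample excerpt are hypothetical, and only the WRF I/O keys discussed in this document are checked. A key reported as not set is left at its default.

```python
import re

# I/O keys named in this document; extend as needed.
IO_KEYS = ("io_form_history", "io_form_restart", "nio_tasks_per_group", "nio_groups")


def read_io_settings(namelist_text):
    """Return {key: value string, or None if the key is not set} for IO_KEYS."""
    settings = {}
    for key in IO_KEYS:
        # Match "key = value" at the start of a line; stop at a "!" comment.
        m = re.search(rf"^[ \t]*{key}[ \t]*=[ \t]*([^!\n]*)", namelist_text,
                      re.MULTILINE | re.IGNORECASE)
        settings[key] = m.group(1).strip().rstrip(",").strip() if m else None
    return settings


# Hypothetical namelist excerpt used only to exercise the parser.
SAMPLE = """\
&time_control
 io_form_history = 102,
 io_form_restart = 2,
/
&namelist_quilt
 nio_tasks_per_group = 0,
/
"""

for key, value in read_io_settings(SAMPLE).items():
    print(f"{key:22s} {value if value is not None else '(not set, default applies)'}")
```

If the runtime keys are set as intended and the feature is still inactive, the build-time environment variables become the more likely cause.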
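I/O-related parameters
WRF
# I/O format: 2, 11, 13, 102 ...
io_form_history
io_form_restart
# Quilt I/O Server
nio_tasks_per_group
nio_groups
Build settings (environment variables):
PNETCDF_QUILT # enables the PnetCDF-based Quilt I/O Server implementation
# PNETCDF_QUILT is worth enabling, provided the PnetCDF build chain is complete and stable
# It is used together with nio_tasks_per_group = n
# Without it, the Quilt Server can still be used (configure nio_tasks_per_group and nio_groups)
ROMS
PIO-related build settings: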
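Analysis
Output files
The size and number of output files give a first indication of where the bottleneck is.
# Number of files
find . -maxdepth 1 -type f -name 'ocean_his_*' | wc -l
# Total file size
du -ch ocean_his_* | tail -1
The three programs produce the following outputs:
SWAN: swaninit*
ROMS: ocean_his_* ocean_avg_*
WRF: wrfout_*
Output volume is concentrated in ROMS and WRF.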
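The same count and size check can be done for all three models at once. This is a minimal Python sketch, assuming the output files sit directly in the run directory and follow the name patterns listed above. `summarize_outputs` is a hypothetical helper. It sums apparent file sizes (`st_size`), which can differ from the disk usage that `du` reports.

```python
from pathlib import Path

# Output-file patterns per model, taken from the list above.
PATTERNS = {
    "SWAN": ["swaninit*"],
    "ROMS": ["ocean_his_*", "ocean_avg_*"],
    "WRF": ["wrfout_*"],
}


def summarize_outputs(run_dir="."):
    """Return {model: (file_count, total_bytes)} for files directly in run_dir."""
    root = Path(run_dir)
    summary = {}
    for model, patterns in PATTERNS.items():
        # A set avoids double counting if two patterns match the same file.
        files = {p for pat in patterns for p in root.glob(pat) if p.is_file()}
        summary[model] = (len(files), sum(p.stat().st_size for p in files))
    return summary


if __name__ == "__main__":
    for model, (count, size) in summarize_outputs(".").items():
        print(f"{model:5s} files={count:6d} size={size / 2**30:8.2f} GiB")
```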
Log analysis
- The Slurm log provides a lot of information
- Darshan can also be used for evaluation
Parallel I/O
Basic information
MPI nodes allocated to each program
Model Coupling:
Ocean Model MPI nodes: 00000 - 00255
Waves Model MPI nodes: 00256 - 00511
Atmos Model MPI nodes: 00512 - 01535
ROMS grid tiling
Resolution, Grid 01: 970x1198x50, Parallel Nodes: 256, Tiling: 16x16
WRF grid decomposition
Ntasks in X 31 , ntasks in Y 32
From these, the number of nodes assigned to WRF I/O can also be computed: wrf-nodes - NtasksX*NtasksY
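As a worked example of that subtraction, the sketch below pulls the numbers out of the two log lines quoted above. The helper name `wrf_io_tasks` is hypothetical, and the regular expressions assume the log lines keep exactly this wording.

```python
import re


def wrf_io_tasks(atmos_range_line, ntasks_line):
    """Return (wrf_nodes, compute_tasks, io_tasks) from the two log lines."""
    first, last = map(int, re.search(r"(\d+)\s*-\s*(\d+)", atmos_range_line).groups())
    nx, ny = map(int, re.search(r"in X\s+(\d+)\s*,\s*ntasks in Y\s+(\d+)",
                                ntasks_line, re.IGNORECASE).groups())
    wrf_nodes = last - first + 1   # the rank range is inclusive
    compute_tasks = nx * ny        # tasks that own a patch of the grid
    return wrf_nodes, compute_tasks, wrf_nodes - compute_tasks


nodes, compute, io = wrf_io_tasks("Atmos Model MPI nodes: 00512 - 01535",
                                  "Ntasks in X 31 , ntasks in Y 32")
print(f"WRF nodes={nodes} compute={compute} I/O={io}")
```

For the allocation shown here this gives 1024 WRF nodes, 31*32 = 992 compute tasks, and 32 left for I/O.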
Timing metrics
SWAN takes the least time and is not examined here.
WRF
- Timing for Writing (time spent writing a file)
- Timing for main (time spent on a compute step)
Timing for main: time 2026-04-03_23:50:00 on domain 1: 3.69922 elapsed seconds
Timing for Writing wrfout_d01_2026-04-04_00:00:00 for domain 1: 8.47534 elapsed seconds
ROMS
Program run time ≈ Total / number of processes allocated to ROMS (the Average value in the log)
Average: 578.752
Minimum: 577.427
Maximum: 578.860
Nonlinear model elapsed CPU time profile, Grid: 01
Allocation and array initialization .............. 20.930 ( 0.0188 %)
Ocean state initialization ....................... 51.577 ( 0.0464 %)
Reading of input data ............................ 400.877 ( 0.3608 %)
Processing of input data ......................... 29.871 ( 0.0269 %)
Processing of output time averaged data .......... 67.112 ( 0.0604 %)
Computation of vertical boundary conditions ...... 21.002 ( 0.0189 %)
Computation of global information integrals ...... 17.770 ( 0.0160 %)
Writing of output data ........................... 14673.920 (13.2054 %)
Model 2D kernel .................................. 5109.128 ( 4.5978 %)
2D/3D coupling, vertical metrics ................. 405.235 ( 0.3647 %)
Omega vertical velocity .......................... 285.565 ( 0.2570 %)
Equation of state for seawater ................... 961.461 ( 0.8652 %)
Atmosphere-Ocean bulk flux parameterization ...... 62.334 ( 0.0561 %)
KPP vertical mixing parameterization ............. 2940.584 ( 2.6463 %)
3D equations right-side terms .................... 856.016 ( 0.7704 %)
3D equations predictor step ...................... 1448.507 ( 1.3035 %)
Pressure gradient ................................ 444.194 ( 0.3997 %)
Harmonic mixing of tracers, geopotentials ........ 816.935 ( 0.7352 %)
Biharmonic mixing of tracers, geopotentials ...... 393.041 ( 0.3537 %)
Harmonic stress tensor, geopotentials ............ 2031.193 ( 1.8279 %)
Corrector time-step for 3D momentum .............. 921.772 ( 0.8295 %)
Corrector time-step for tracers .................. 937.017 ( 0.8432 %)
Unused 07 ........................................ 41156.609 (37.0379 %)
Total: 74052.650 66.6419 %
Unique kernel(s) regions profiled ................ 74052.650 66.6419 %
Residual, non-profiled code ...................... 37067.639 33.3581 %
All percentages are with respect to total time = 111120.289
MPI communications profile, Grid: 01
Message Passage: 2D halo exchanges ............... 3098.897 ( 2.7888 %)
Message Passage: 3D halo exchanges ............... 1388.548 ( 1.2496 %)
Message Passage: 4D halo exchanges ............... 539.617 ( 0.4856 %)
Message Passage: data broadcast .................. 13393.653 (12.0533 %)
Message Passage: data reduction .................. 253.303 ( 0.2280 %)
Message Passage: data gathering .................. 3194.237 ( 2.8746 %)
Message Passage: data scattering.................. 2440.529 ( 2.1963 %)
Message Passage: point data gathering ............ 0.254 ( 0.0002 %)
Message Passage: synchronization barrier ......... 7.261 ( 0.0065 %)
Total: 24316.300 21.8829 %
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
The entries to watch are:
- Writing of output data
- Unused 07 (idle time)
- MPI data broadcast
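To pick out such entries without reading the whole profile, the lines can be parsed and sorted by elapsed time. This is a minimal sketch under the assumption that each profile line has the form `name .... seconds (percent %)` as in the excerpt above. `parse_profile` and `top_entries` are hypothetical helper names, and the sample text is copied from that excerpt.

```python
import re

# name, a run of dots, seconds, then "(percent %)"
LINE_RE = re.compile(r"^\s*(.+?)\s*\.{2,}\s*([\d.]+)\s*\(\s*([\d.]+)\s*%\)")


def parse_profile(text):
    """Return a list of (name, seconds, percent) for every profile line found."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            rows.append((m.group(1), float(m.group(2)), float(m.group(3))))
    return rows


def top_entries(rows, n=3):
    """Return the n rows with the largest elapsed time."""
    return sorted(rows, key=lambda r: r[1], reverse=True)[:n]


SAMPLE = """\
Writing of output data ........................... 14673.920 (13.2054 %)
Model 2D kernel .................................. 5109.128 ( 4.5978 %)
Unused 07 ........................................ 41156.609 (37.0379 %)
Total: 74052.650 66.6419 %
Message Passage: data broadcast .................. 13393.653 (12.0533 %)
Message Passage: data scattering.................. 2440.529 ( 2.1963 %)
"""

for name, seconds, percent in top_entries(parse_profile(SAMPLE)):
    print(f"{name:35s} {seconds:12.3f} s {percent:8.4f} %")
```

Summary lines such as `Total:` carry no dotted leader and are skipped, so they do not crowd out the individual entries.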
Log analysis and optimization approach
Optimization ideas
- Linear scaling
- WRF history I/O mode: 102
- PnetCDF parallel I/O
A complete run
ROMS:
Computation of global information integrals ...... 8438.950 ( 0.0370 %)
Writing of output data ........................... 2007518.132 ( 8.8010 %)
Model 2D kernel .................................. 2813099.155 (12.3327 %)
All percentages are with respect to total time = 22810024.664
MPI communications profile, Grid: 01
Message Passage: 2D halo exchanges ............... 2572137.150 (11.2763 %)
Message Passage: data broadcast .................. 1064456.268 ( 4.6666 %)
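Total: 5866359.029 25.7183 %
- Unused (ROMS idle time) ≈ 60%: ROMS is waiting for results from WRF, so the allocation parameters still have room for optimization
WRF:
(metrics collected with a script)
| Timing for Writing, total (s) | Timing for Writing, avg (s) | Timing for main, total (s) | Timing for main, avg (s) |
| 831.46 | 20.28 | 21017.38 | 525.43 |
- Timing for main includes computation, waiting for I/O processes, and waiting for coupling
- Here the wait averages about 20 s per step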
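The script used for these numbers is not shown. As a hedged illustration of how such totals and averages could be collected, the sketch below scans log lines of the form quoted in the timing metrics section. `timing_stats` is a hypothetical helper, and reading the lines from a WRF `rsl.out.*` file is an assumption about where they appear in a given run.

```python
import re
from collections import defaultdict

# Greedy ".*" backtracks to the last ": <number> elapsed seconds" on the line.
TIMING_RE = re.compile(r"Timing for (main|Writing)\b.*:\s*([\d.]+) elapsed seconds")


def timing_stats(lines):
    """Return {kind: (count, total_seconds, average_seconds)} for main and Writing."""
    samples = defaultdict(list)
    for line in lines:
        m = TIMING_RE.search(line)
        if m:
            samples[m.group(1)].append(float(m.group(2)))
    return {k: (len(v), sum(v), sum(v) / len(v)) for k, v in samples.items()}


# The two sample lines are copied from the log excerpt in this document.
SAMPLE = [
    "Timing for main: time 2026-04-03_23:50:00 on domain 1: 3.69922 elapsed seconds",
    "Timing for Writing wrfout_d01_2026-04-04_00:00:00 for domain 1: 8.47534 elapsed seconds",
]

for kind, (count, total, avg) in timing_stats(SAMPLE).items():
    print(f"Timing for {kind:8s} n={count} total={total:.2f} avg={avg:.2f}")
```

On a real run the list would come from iterating over the open log file instead of `SAMPLE`.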
Designing a quick test with 30-minute iterations
Linear scaling
Darshan log
# darshan log version: 3.41
# compression method: ZLIB
# exe: ./coawstM.backup coupling.in
# uid: 66756
# jobid: 1471163
# start_time: 1779956254
# start_time_asci: Thu May 28 16:17:34 2026
# end_time: 1779957375
# end_time_asci: Thu May 28 16:36:15 2026
# nprocs: 128
# run time: 1120.9576
# metadata: lib_ver = 3.5.0
# metadata: h = romio_no_indep_rw=true;cb_nodes=4
# log file regions
# -------------------------------------------------------
# header: 1328 bytes (uncompressed)
# job data: 243 bytes (compressed)
# record table: 24007 bytes (compressed)
# POSIX module: 33758 bytes (compressed), ver=4
# LUSTRE module: 6917 bytes (compressed), ver=2
# STDIO module: 773 bytes (compressed), ver=2
# HEATMAP module: 9885 bytes (compressed), ver=1
# mounted file systems (mount point and fs type)
# -------------------------------------------------------
# mount entry: /proc/sys/fs/binfmt_misc autofs
# mount entry: /sys/fs/cgroup/unified cgroup2
# mount entry: /sys/kernel/tracing tracefs
# mount entry: /sys/kernel/config configfs
# mount entry: /sys/fs/bpf bpf
# mount entry: /dev/mqueue mqueue
# mount entry: /vol8 lustre
# mount entry: /dev devtmpfs
# *******************************************************
# POSIX module data
# *******************************************************
# description of POSIX counters:
# POSIX_*: posix operation counts.
# READS,WRITES,OPENS,SEEKS,STATS,MMAPS,SYNCS,FILENOS,DUPS are types of operations.
# POSIX_RENAME_SOURCES/TARGETS: total count file was source or target of a rename operation
# POSIX_RENAMED_FROM: Darshan record ID of the first rename source, if file was a rename target
# POSIX_MODE: mode that file was opened in.
# POSIX_BYTES_*: total bytes read and written.
# POSIX_MAX_BYTE_*: highest offset byte read and written.
# POSIX_CONSEC_*: number of exactly adjacent reads and writes.
# POSIX_SEQ_*: number of reads and writes from increasing offsets.
# POSIX_RW_SWITCHES: number of times access alternated between read and write.
# POSIX_*_ALIGNMENT: memory and file alignment.
# POSIX_*_NOT_ALIGNED: number of reads and writes that were not aligned.
# POSIX_MAX_*_TIME_SIZE: size of the slowest read and write operations.
# POSIX_SIZE_*_*: histogram of read and write access sizes.
# POSIX_STRIDE*_STRIDE: the four most common strides detected.
# POSIX_STRIDE*_COUNT: count of the four most common strides.
# POSIX_ACCESS*_ACCESS: the four most common access sizes.
# POSIX_ACCESS*_COUNT: count of the four most common access sizes.
# POSIX_*_RANK: rank of the processes that were the fastest and slowest at I/O (for shared files).
# POSIX_*_RANK_BYTES: bytes transferred by the fastest and slowest ranks (for shared files).
# POSIX_F_*_START_TIMESTAMP: timestamp of first open/read/write/close.
# POSIX_F_*_END_TIMESTAMP: timestamp of last open/read/write/close.
# POSIX_F_READ/WRITE/META_TIME: cumulative time spent in read, write, or metadata operations.
# POSIX_F_MAX_*_TIME: duration of the slowest read and write operations.
# POSIX_F_*_RANK_TIME: fastest and slowest I/O time for a single rank (for shared files).
# POSIX_F_VARIANCE_RANK_*: variance of total I/O time and bytes moved for all ranks (for shared files).
# WARNING: POSIX_OPENS counter includes both POSIX_FILENOS and POSIX_DUPS counts
# WARNING: POSIX counters related to file offsets may be incorrect if a file is simultaneously accessed by both POSIX and STDIO (e.g., using fileno())
# - Affected counters include: MAX_BYTE_{READ|WRITTEN}, CONSEC_{READS|WRITES}, SEQ_{READS|WRITES}, {MEM|FILE}_NOT_ALIGNED, STRIDE*_STRIDE
total_POSIX_OPENS: 1167
total_POSIX_FILENOS: 13
total_POSIX_DUPS: 0
total_POSIX_READS: 107443
total_POSIX_WRITES: 759720
total_POSIX_SEEKS: 5894
total_POSIX_STATS: 1477
total_POSIX_MMAPS: -1
total_POSIX_FSYNCS: 0
total_POSIX_FDSYNCS: 0
total_POSIX_RENAME_SOURCES: 0
total_POSIX_RENAME_TARGETS: 0
total_POSIX_RENAMED_FROM: 0
total_POSIX_MODE: 0
total_POSIX_BYTES_READ: 4642184152
total_POSIX_BYTES_WRITTEN: 12596970403
total_POSIX_MAX_BYTE_READ: 6624238151
total_POSIX_MAX_BYTE_WRITTEN: 5010776075
total_POSIX_CONSEC_READS: 103785
total_POSIX_CONSEC_WRITES: 753463
total_POSIX_SEQ_READS: 105843
total_POSIX_SEQ_WRITES: 758835
total_POSIX_RW_SWITCHES: 2439
total_POSIX_MEM_NOT_ALIGNED: 0
total_POSIX_MEM_ALIGNMENT: 8
total_POSIX_FILE_NOT_ALIGNED: 862089
total_POSIX_FILE_ALIGNMENT: 1048576
total_POSIX_MAX_READ_TIME_SIZE: 28834614
total_POSIX_MAX_WRITE_TIME_SIZE: 28996093
total_POSIX_SIZE_READ_0_100: 3548
total_POSIX_SIZE_READ_100_1K: 915
total_POSIX_SIZE_READ_1K_10K: 101545
total_POSIX_SIZE_READ_10K_100K: 144
total_POSIX_SIZE_READ_100K_1M: 572
total_POSIX_SIZE_READ_1M_4M: 667
total_POSIX_SIZE_READ_4M_10M: 11
total_POSIX_SIZE_READ_10M_100M: 41
total_POSIX_SIZE_READ_100M_1G: 0
total_POSIX_SIZE_READ_1G_PLUS: 0
total_POSIX_SIZE_WRITE_0_100: 2234
total_POSIX_SIZE_WRITE_100_1K: 2425
total_POSIX_SIZE_WRITE_1K_10K: 751992
total_POSIX_SIZE_WRITE_10K_100K: 969
total_POSIX_SIZE_WRITE_100K_1M: 643
total_POSIX_SIZE_WRITE_1M_4M: 1276
total_POSIX_SIZE_WRITE_4M_10M: 29
total_POSIX_SIZE_WRITE_10M_100M: 152
total_POSIX_SIZE_WRITE_100M_1G: 0
total_POSIX_SIZE_WRITE_1G_PLUS: 0
total_POSIX_STRIDE1_STRIDE: 4194304
total_POSIX_STRIDE2_STRIDE: 4
total_POSIX_STRIDE3_STRIDE: 4096
total_POSIX_STRIDE4_STRIDE: 1
total_POSIX_STRIDE1_COUNT: 1218
total_POSIX_STRIDE2_COUNT: 288
total_POSIX_STRIDE3_COUNT: 124
total_POSIX_STRIDE4_COUNT: 24
total_POSIX_ACCESS1_ACCESS: 4189
total_POSIX_ACCESS2_ACCESS: 4163
total_POSIX_ACCESS3_ACCESS: 4194304
total_POSIX_ACCESS4_ACCESS: 8192
total_POSIX_ACCESS1_COUNT: 619252
total_POSIX_ACCESS2_COUNT: 109913
total_POSIX_ACCESS3_COUNT: 1849
total_POSIX_ACCESS4_COUNT: 99909
total_POSIX_FASTEST_RANK: -1
total_POSIX_FASTEST_RANK_BYTES: -1
total_POSIX_SLOWEST_RANK: -1
total_POSIX_SLOWEST_RANK_BYTES: -1
total_POSIX_F_OPEN_START_TIMESTAMP: 0.011120
total_POSIX_F_READ_START_TIMESTAMP: 0.014384
total_POSIX_F_WRITE_START_TIMESTAMP: 1.191473
total_POSIX_F_CLOSE_START_TIMESTAMP: 0.021221
total_POSIX_F_OPEN_END_TIMESTAMP: 1035.282613
total_POSIX_F_READ_END_TIMESTAMP: 1035.839145
total_POSIX_F_WRITE_END_TIMESTAMP: 1122.177334
total_POSIX_F_CLOSE_END_TIMESTAMP: 1122.201477
total_POSIX_F_READ_TIME: 7.116889
total_POSIX_F_WRITE_TIME: 56.739679
total_POSIX_F_META_TIME: 6.663915
total_POSIX_F_MAX_READ_TIME: 0.119348
total_POSIX_F_MAX_WRITE_TIME: 0.111346
total_POSIX_F_FASTEST_RANK_TIME: 0.000000
total_POSIX_F_SLOWEST_RANK_TIME: 0.000000
total_POSIX_F_VARIANCE_RANK_TIME: 0.000000
total_POSIX_F_VARIANCE_RANK_BYTES: 0.000000
# *******************************************************
# STDIO module data
# *******************************************************
# description of STDIO counters:
# STDIO_{OPENS|FDOPENS|WRITES|READS|SEEKS|FLUSHES} are types of operations.
# STDIO_BYTES_*: total bytes read and written.
# STDIO_MAX_BYTE_*: highest offset byte read and written.
# STDIO_*_RANK: rank of the processes that were the fastest and slowest at I/O (for shared files).
# STDIO_*_RANK_BYTES: bytes transferred by the fastest and slowest ranks (for shared files).
# STDIO_F_*_START_TIMESTAMP: timestamp of the first call to that type of function.
# STDIO_F_*_END_TIMESTAMP: timestamp of the completion of the last call to that type of function.
# STDIO_F_*_TIME: cumulative time spent in different types of functions.
# STDIO_F_*_RANK_TIME: fastest and slowest I/O time for a single rank (for shared files).
# STDIO_F_VARIANCE_RANK_*: variance of total I/O time and bytes moved for all ranks (for shared files).
# WARNING: STDIO_OPENS counter includes STDIO_FDOPENS count
# WARNING: STDIO counters related to file offsets may be incorrect if a file is simultaneously accessed by both STDIO and POSIX (e.g., using fdopen())
# - Affected counters include: MAX_BYTE_{READ|WRITTEN}
total_STDIO_OPENS: 19
total_STDIO_FDOPENS: 0
total_STDIO_READS: 16
total_STDIO_WRITES: 0
total_STDIO_SEEKS: 26
total_STDIO_FLUSHES: 0
total_STDIO_BYTES_WRITTEN: 0
total_STDIO_BYTES_READ: 54526051
total_STDIO_MAX_BYTE_READ: 4194303
total_STDIO_MAX_BYTE_WRITTEN: 0
total_STDIO_FASTEST_RANK: -1
total_STDIO_FASTEST_RANK_BYTES: -1
total_STDIO_SLOWEST_RANK: -1
total_STDIO_SLOWEST_RANK_BYTES: -1
total_STDIO_F_META_TIME: 0.025318
total_STDIO_F_WRITE_TIME: 0.000000
total_STDIO_F_READ_TIME: 0.107740
total_STDIO_F_OPEN_START_TIMESTAMP: 2.131395
total_STDIO_F_CLOSE_START_TIMESTAMP: 2.132253
total_STDIO_F_WRITE_START_TIMESTAMP: 0.000000
total_STDIO_F_READ_START_TIMESTAMP: 2.136211
total_STDIO_F_OPEN_END_TIMESTAMP: 1035.230478
total_STDIO_F_CLOSE_END_TIMESTAMP: 1035.235342
total_STDIO_F_WRITE_END_TIMESTAMP: 0.000000
total_STDIO_F_READ_END_TIMESTAMP: 1035.233100
total_STDIO_F_FASTEST_RANK_TIME: 0.000000
total_STDIO_F_SLOWEST_RANK_TIME: 0.000000
total_STDIO_F_VARIANCE_RANK_TIME: 0.000000
total_STDIO_F_VARIANCE_RANK_BYTES: 0.000000
# darshan log version: 3.41
# compression method: ZLIB
# exe: ./coawstM.backup coupling.in
# uid: 66756
# jobid: 1134026
# start_time: 1779960614
# start_time_asci: Thu May 28 17:30:14 2026
# end_time: 1779961290
# end_time_asci: Thu May 28 17:41:30 2026
# nprocs: 1024
# run time: 675.4815
# metadata: lib_ver = 3.5.0
# metadata: h = romio_no_indep_rw=true;cb_nodes=4
# log file regions
# -------------------------------------------------------
# header: 1328 bytes (uncompressed)
# job data: 243 bytes (compressed)
# record table: 192120 bytes (compressed)
# POSIX module: 252901 bytes (compressed), ver=4
# LUSTRE module: 54221 bytes (compressed), ver=2
# STDIO module: 758 bytes (compressed), ver=2
# HEATMAP module: 68184 bytes (compressed), ver=1
# mounted file systems (mount point and fs type)
# -------------------------------------------------------
# mount entry: /proc/sys/fs/binfmt_misc autofs
# mount entry: /sys/fs/cgroup/unified cgroup2
# mount entry: /sys/kernel/tracing tracefs
# mount entry: /sys/kernel/config configfs
# mount entry: /sys/fs/bpf bpf
# mount entry: /dev/mqueue mqueue
# mount entry: /vol8 lustre
# mount entry: /dev devtmpfs
# *******************************************************
# POSIX module data
# *******************************************************
# description of POSIX counters:
# POSIX_*: posix operation counts.
# READS,WRITES,OPENS,SEEKS,STATS,MMAPS,SYNCS,FILENOS,DUPS are types of operations.
# POSIX_RENAME_SOURCES/TARGETS: total count file was source or target of a rename operation
# POSIX_RENAMED_FROM: Darshan record ID of the first rename source, if file was a rename target
# POSIX_MODE: mode that file was opened in.
# POSIX_BYTES_*: total bytes read and written.
# POSIX_MAX_BYTE_*: highest offset byte read and written.
# POSIX_CONSEC_*: number of exactly adjacent reads and writes.
# POSIX_SEQ_*: number of reads and writes from increasing offsets.
# POSIX_RW_SWITCHES: number of times access alternated between read and write.
# POSIX_*_ALIGNMENT: memory and file alignment.
# POSIX_*_NOT_ALIGNED: number of reads and writes that were not aligned.
# POSIX_MAX_*_TIME_SIZE: size of the slowest read and write operations.
# POSIX_SIZE_*_*: histogram of read and write access sizes.
# POSIX_STRIDE*_STRIDE: the four most common strides detected.
# POSIX_STRIDE*_COUNT: count of the four most common strides.
# POSIX_ACCESS*_ACCESS: the four most common access sizes.
# POSIX_ACCESS*_COUNT: count of the four most common access sizes.
# POSIX_*_RANK: rank of the processes that were the fastest and slowest at I/O (for shared files).
# POSIX_*_RANK_BYTES: bytes transferred by the fastest and slowest ranks (for shared files).
# POSIX_F_*_START_TIMESTAMP: timestamp of first open/read/write/close.
# POSIX_F_*_END_TIMESTAMP: timestamp of last open/read/write/close.
# POSIX_F_READ/WRITE/META_TIME: cumulative time spent in read, write, or metadata operations.
# POSIX_F_MAX_*_TIME: duration of the slowest read and write operations.
# POSIX_F_*_RANK_TIME: fastest and slowest I/O time for a single rank (for shared files).
# POSIX_F_VARIANCE_RANK_*: variance of total I/O time and bytes moved for all ranks (for shared files).
# WARNING: POSIX_OPENS counter includes both POSIX_FILENOS and POSIX_DUPS counts
# WARNING: POSIX counters related to file offsets may be incorrect if a file is simultaneously accessed by both POSIX and STDIO (e.g., using fileno())
# - Affected counters include: MAX_BYTE_{READ|WRITTEN}, CONSEC_{READS|WRITES}, SEQ_{READS|WRITES}, {MEM|FILE}_NOT_ALIGNED, STRIDE*_STRIDE
total_POSIX_OPENS: 8943
total_POSIX_FILENOS: 13
total_POSIX_DUPS: 0
total_POSIX_READS: 830835
total_POSIX_WRITES: 899851
total_POSIX_SEEKS: 5606
total_POSIX_STATS: 11461
total_POSIX_MMAPS: -1
total_POSIX_FSYNCS: 0
total_POSIX_FDSYNCS: 0
total_POSIX_RENAME_SOURCES: 0
total_POSIX_RENAME_TARGETS: 0
total_POSIX_RENAMED_FROM: 0
total_POSIX_MODE: 0
total_POSIX_BYTES_READ: 10420613385
total_POSIX_BYTES_WRITTEN: 13183610642
total_POSIX_MAX_BYTE_READ: 6624238151
total_POSIX_MAX_BYTE_WRITTEN: 5010776075
total_POSIX_CONSEC_READS: 820249
total_POSIX_CONSEC_WRITES: 893034
total_POSIX_SEQ_READS: 822019
total_POSIX_SEQ_WRITES: 898406
total_POSIX_RW_SWITCHES: 2439
total_POSIX_MEM_NOT_ALIGNED: 0
total_POSIX_MEM_ALIGNMENT: 8
total_POSIX_FILE_NOT_ALIGNED: 1712797
total_POSIX_FILE_ALIGNMENT: 1048576
total_POSIX_MAX_READ_TIME_SIZE: 8192
total_POSIX_MAX_WRITE_TIME_SIZE: 28834614
total_POSIX_SIZE_READ_0_100: 17852
total_POSIX_SIZE_READ_100_1K: 915
total_POSIX_SIZE_READ_1K_10K: 810585
total_POSIX_SIZE_READ_10K_100K: 99
total_POSIX_SIZE_READ_100K_1M: 665
total_POSIX_SIZE_READ_1M_4M: 667
total_POSIX_SIZE_READ_4M_10M: 11
total_POSIX_SIZE_READ_10M_100M: 41
total_POSIX_SIZE_READ_100M_1G: 0
total_POSIX_SIZE_READ_1G_PLUS: 0
total_POSIX_SIZE_WRITE_0_100: 2221
total_POSIX_SIZE_WRITE_100_1K: 2448
total_POSIX_SIZE_WRITE_1K_10K: 892017
total_POSIX_SIZE_WRITE_10K_100K: 1065
total_POSIX_SIZE_WRITE_100K_1M: 643
total_POSIX_SIZE_WRITE_1M_4M: 1276
total_POSIX_SIZE_WRITE_4M_10M: 29
total_POSIX_SIZE_WRITE_10M_100M: 152
total_POSIX_SIZE_WRITE_100M_1G: 0
total_POSIX_SIZE_WRITE_1G_PLUS: 0
total_POSIX_STRIDE1_STRIDE: 4194304
total_POSIX_STRIDE2_STRIDE: 4096
total_POSIX_STRIDE3_STRIDE: 1
total_POSIX_STRIDE4_STRIDE: 26
total_POSIX_STRIDE1_COUNT: 1218
total_POSIX_STRIDE2_COUNT: 124
total_POSIX_STRIDE3_COUNT: 24
total_POSIX_STRIDE4_COUNT: 20
total_POSIX_ACCESS1_ACCESS: 4189
total_POSIX_ACCESS2_ACCESS: 8192
total_POSIX_ACCESS3_ACCESS: 4530
total_POSIX_ACCESS4_ACCESS: 2334
total_POSIX_ACCESS1_COUNT: 676442
total_POSIX_ACCESS2_COUNT: 761680
total_POSIX_ACCESS3_COUNT: 2193
total_POSIX_ACCESS4_COUNT: 2176
total_POSIX_FASTEST_RANK: -1
total_POSIX_FASTEST_RANK_BYTES: -1
total_POSIX_SLOWEST_RANK: -1
total_POSIX_SLOWEST_RANK_BYTES: -1
total_POSIX_F_OPEN_START_TIMESTAMP: 0.090209
total_POSIX_F_READ_START_TIMESTAMP: 0.097283
total_POSIX_F_WRITE_START_TIMESTAMP: 0.853514
total_POSIX_F_CLOSE_START_TIMESTAMP: 0.101990
total_POSIX_F_OPEN_END_TIMESTAMP: 565.882341
total_POSIX_F_READ_END_TIMESTAMP: 570.081440
total_POSIX_F_WRITE_END_TIMESTAMP: 675.602195
total_POSIX_F_CLOSE_END_TIMESTAMP: 675.626367
total_POSIX_F_READ_TIME: 29.532315
total_POSIX_F_WRITE_TIME: 64.455771
total_POSIX_F_META_TIME: 286.078577
total_POSIX_F_MAX_READ_TIME: 0.102825
total_POSIX_F_MAX_WRITE_TIME: 0.078899
total_POSIX_F_FASTEST_RANK_TIME: 0.000000
total_POSIX_F_SLOWEST_RANK_TIME: 0.000000
total_POSIX_F_VARIANCE_RANK_TIME: 0.000000
total_POSIX_F_VARIANCE_RANK_BYTES: 0.000000
# *******************************************************
# STDIO module data
# *******************************************************
# description of STDIO counters:
# STDIO_{OPENS|FDOPENS|WRITES|READS|SEEKS|FLUSHES} are types of operations.
# STDIO_BYTES_*: total bytes read and written.
# STDIO_MAX_BYTE_*: highest offset byte read and written.
# STDIO_*_RANK: rank of the processes that were the fastest and slowest at I/O (for shared files).
# STDIO_*_RANK_BYTES: bytes transferred by the fastest and slowest ranks (for shared files).
# STDIO_F_*_START_TIMESTAMP: timestamp of the first call to that type of function.
# STDIO_F_*_END_TIMESTAMP: timestamp of the completion of the last call to that type of function.
# STDIO_F_*_TIME: cumulative time spent in different types of functions.
# STDIO_F_*_RANK_TIME: fastest and slowest I/O time for a single rank (for shared files).
# STDIO_F_VARIANCE_RANK_*: variance of total I/O time and bytes moved for all ranks (for shared files).
# WARNING: STDIO_OPENS counter includes STDIO_FDOPENS count
# WARNING: STDIO counters related to file offsets may be incorrect if a file is simultaneously accessed by both STDIO and POSIX (e.g., using fdopen())
# - Affected counters include: MAX_BYTE_{READ|WRITTEN}
total_STDIO_OPENS: 19
total_STDIO_FDOPENS: 0
total_STDIO_READS: 16
total_STDIO_WRITES: 0
total_STDIO_SEEKS: 26
total_STDIO_FLUSHES: 0
total_STDIO_BYTES_WRITTEN: 0
total_STDIO_BYTES_READ: 54526051
total_STDIO_MAX_BYTE_READ: 4194303
total_STDIO_MAX_BYTE_WRITTEN: 0
total_STDIO_FASTEST_RANK: -1
total_STDIO_FASTEST_RANK_BYTES: -1
total_STDIO_SLOWEST_RANK: -1
total_STDIO_SLOWEST_RANK_BYTES: -1
total_STDIO_F_META_TIME: 0.019924
total_STDIO_F_WRITE_TIME: 0.000000
total_STDIO_F_READ_TIME: 0.106629
total_STDIO_F_OPEN_START_TIMESTAMP: 1.386794
total_STDIO_F_CLOSE_START_TIMESTAMP: 1.387560
total_STDIO_F_WRITE_START_TIMESTAMP: 0.000000
total_STDIO_F_READ_START_TIMESTAMP: 1.391299
total_STDIO_F_OPEN_END_TIMESTAMP: 565.481624
total_STDIO_F_CLOSE_END_TIMESTAMP: 565.487266
total_STDIO_F_WRITE_END_TIMESTAMP: 0.000000
total_STDIO_F_READ_END_TIMESTAMP: 565.484678
total_STDIO_F_FASTEST_RANK_TIME: 0.000000
total_STDIO_F_SLOWEST_RANK_TIME: 0.000000
total_STDIO_F_VARIANCE_RANK_TIME: 0.000000
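total_STDIO_F_VARIANCE_RANK_BYTES: 0.000000
Side-by-side comparison of key metrics
- Under the serial I/O model, increasing the core count congests the Lustre metadata server (MDS): cumulative POSIX_F_META_TIME grows from 6.66 s at 128 processes to 286.08 s at 1024 processes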
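The comparison can be reproduced from the `total_*` lines of the two logs. This is a minimal sketch, assuming the text is in the format shown above. `parse_totals` and `compare` are hypothetical helpers, the choice of metrics is illustrative rather than the original table, and the sample values are copied from the 128-process and 1024-process logs. The time counters are cumulative over all ranks, so they should be read alongside the process count.

```python
import re

FIELD_RE = re.compile(r"^#?\s*(nprocs|run time|total_\w+):\s*(-?[\d.]+)\s*$")


def parse_totals(text):
    """Return {field: float} for the header fields and total_* counters."""
    values = {}
    for line in text.splitlines():
        m = FIELD_RE.match(line)
        if m:
            values[m.group(1)] = float(m.group(2))
    return values


def compare(small, large, metrics):
    """Return (metric, small value, large value, large/small ratio) rows."""
    return [(k, small[k], large[k], large[k] / small[k]) for k in metrics]


# Values copied from the two logs above (128 and 1024 processes).
RUN_128 = """\
# nprocs: 128
# run time: 1120.9576
total_POSIX_OPENS: 1167
total_POSIX_STATS: 1477
total_POSIX_F_READ_TIME: 7.116889
total_POSIX_F_WRITE_TIME: 56.739679
total_POSIX_F_META_TIME: 6.663915
"""
RUN_1024 = """\
# nprocs: 1024
# run time: 675.4815
total_POSIX_OPENS: 8943
total_POSIX_STATS: 11461
total_POSIX_F_READ_TIME: 29.532315
total_POSIX_F_WRITE_TIME: 64.455771
total_POSIX_F_META_TIME: 286.078577
"""

a, b = parse_totals(RUN_128), parse_totals(RUN_1024)
for metric, va, vb, ratio in compare(a, b, list(a)):
    print(f"{metric:28s} {va:12.2f} {vb:12.2f} x{ratio:6.2f}")

# Speedup relative to the 128-process run, and efficiency against ideal scaling.
speedup = a["run time"] / b["run time"]
efficiency = speedup / (b["nprocs"] / a["nprocs"])
print(f"speedup={speedup:.2f} efficiency={efficiency:.1%}")
```

Read and write time change far less than metadata time between the two runs, which is consistent with the MDS being the part that does not scale.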
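WRF I/O mode 102
In this mode each WRF process is responsible for its own file, which avoids the congestion of more than a thousand processes writing output; it is also the approach recommended on the WRF forum for grids with more than 500 points per side.
ROMS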
Average: 257.028
Minimum: 253.560
Maximum: 257.158
Writing of output data ........................... 14544.184 (29.4719 %)
Model 2D kernel .................................. 4990.394 (10.1124 %)
Unused 07 ........................................ 330.811 ( 0.6703 %)
Total: 33027.832 66.9266 %
Unique kernel(s) regions profiled ................ 33027.832 66.9266 %
Residual, non-profiled code ...................... 16321.481 33.0734 %
All percentages are with respect to total time = 49349.313
MPI communications profile, Grid: 01
Message Passage: 2D halo exchanges ............... 2917.842 ( 5.9126 %)
Message Passage: 3D halo exchanges ............... 1340.847 ( 2.7171 %)
Message Passage: 4D halo exchanges ............... 515.030 ( 1.0436 %)
Message Passage: data broadcast .................. 13709.517 (27.7806 %)
Message Passage: data reduction .................. 286.763 ( 0.5811 %)
Message Passage: data gathering .................. 2557.551 ( 5.1825 %)
Message Passage: data scattering.................. 2209.861 ( 4.4780 %)
Message Passage: point data gathering ............ 0.976 ( 0.0020 %)
Message Passage: synchronization barrier ......... 3.564 ( 0.0072 %)
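Total: 23541.949 47.7047 %
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
WRF
| I/O mode | Timing for Writing, total (s) | Timing for Writing, avg (s) | Timing for main, total (s) | Timing for main, avg (s) |
| io_mode 2 | 158.38 | 39.59 | 472.45 | - |
| io_mode 102 | 4.03 | 1.01 | 189.08 | - |
Both the Unused 07 (idle time) entry in ROMS and the writing time in WRF drop sharply; total WRF writing time falls from 158.38 s to 4.03 s.
- Overall speed also improves significantly
- This method produces a large number of fragment files, which need post-processing (a separate program can merge them while new data is still being generated)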
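If merging runs alongside the model, the merger needs to know which output times already have a fragment from every process. The sketch below only does that bookkeeping and leaves the merge itself to whatever tool is used. It assumes fragments are named `<base>_<4-digit rank>`, for example `wrfout_d01_2026-04-04_00:00:00_0000`. This naming and the `complete_groups` helper are assumptions for illustration, not taken from the runs above.

```python
import re
from collections import defaultdict

# Assumed fragment naming: <base>_<4-digit rank>
FRAGMENT_RE = re.compile(r"^(?P<base>wrfout_d\d{2}_.+)_(?P<rank>\d{4})$")


def complete_groups(filenames, n_tasks):
    """Return base names whose fragments cover every rank 0..n_tasks-1."""
    ranks = defaultdict(set)
    for name in filenames:
        m = FRAGMENT_RE.match(name)
        if m:
            ranks[m.group("base")].add(int(m.group("rank")))
    expected = set(range(n_tasks))
    return sorted(base for base, seen in ranks.items() if seen == expected)


# Hypothetical file names for a 2-task run.
names = [
    "wrfout_d01_2026-04-04_00:00:00_0000",
    "wrfout_d01_2026-04-04_00:00:00_0001",
    "wrfout_d01_2026-04-04_01:00:00_0000",  # rank 0001 not written yet
]
print(complete_groups(names, n_tasks=2))
```

A fragment that exists may still be open for writing, so a real merger would also need some completion check before touching a group.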