A Summary of Errors Encountered When Running WPS, WRF, and WRF-Hydro (Part 1)

Lessons learned from errors when running WRF/WRF-Hydro

1. Running mpirun -np 1 ./real.exe produces the following error in rsl.error.0000.

 metgrid input_wrf.F first_date_input = 2024-03-12_00:00:00
metgrid input_wrf.F first_date_nml = 2024-03-12_00:00:00
i_parent_start from namelist.input file = 53
i_parent_start from gridded input file = 10
j_parent_start from namelist.input file = 25
j_parent_start from gridded input file = 20
d02 2024-03-12_00:00:00 ---- ERROR: Nest start locations do not match: namelist.input vs gridded input file
NOTE: 1 namelist vs input data inconsistencies found.
-------------- FATAL CALLED ---------------
FATAL CALLED FROM FILE: <stdin> LINE: 1298

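Solution:

Make i_parent_start and j_parent_start identical in namelist.wps and namelist.input. The gridded input files carry the values from namelist.wps, so namelist.input has to match them.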
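The comparison can be automated before launching real.exe. The following is a minimal sketch, not part of WRF: the helper names are hypothetical, the parser only handles simple `key = v1, v2,` lines, and the sample values are taken from the log above.

```python
import re

def read_namelist_ints(text, key):
    """Return the comma-separated integers assigned to `key`, or None if absent."""
    pattern = rf"^\s*{re.escape(key)}\s*=\s*([^!\n]*)"
    match = re.search(pattern, text, flags=re.MULTILINE | re.IGNORECASE)
    if match is None:
        return None
    return [int(v) for v in match.group(1).replace(",", " ").split()]

def nest_start_mismatches(wps_text, input_text):
    """Map each differing key to (namelist.wps values, namelist.input values)."""
    mismatches = {}
    for key in ("i_parent_start", "j_parent_start"):
        wps_vals = read_namelist_ints(wps_text, key)
        input_vals = read_namelist_ints(input_text, key)
        if wps_vals != input_vals:
            mismatches[key] = (wps_vals, input_vals)
    return mismatches

# Illustrative namelist fragments using the values reported in the log.
wps = """&geogrid
 i_parent_start = 1, 10,
 j_parent_start = 1, 20,
/"""
inp = """&domains
 i_parent_start = 1, 53,
 j_parent_start = 1, 25,
/"""
print(nest_start_mismatches(wps, inp))
```

In practice the two strings would be read from the real namelist.wps and namelist.input files; an empty dictionary means the nest start locations agree.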

2. Action: an error occurred while running the xxx command in WRF-Hydro.

Error:

netcdf_layer.f90:77:19:            
& comm = object%mpi_communicator, info = object%default_info)
1 Error: Keyword argument ‘comm’ at (1) is not in the procedure

Solution: set up the netCDF environment again, following the tutorial.

3. Action: installing netcdf-fortran. The error appears even though my netcdf-c version is newer than 4.7.4.

Error: configuring netcdf-fortran fails; the details are as follows.

......
checking for nc_def_var_szip... no
configure: error: netcdf-c version 4.7.4 or greater is required.

Solution: add the following environment variables to ~/.bashrc; the important ones are LDFLAGS and CPPFLAGS.

## netcdf 4.9.2

export WRFHYDRO_DIR=~/coding/WRF_Hydro/LIBRARIES
export NETCDF=$WRFHYDRO_DIR/netcdf
export LDFLAGS=-L$NETCDF/lib
export CPPFLAGS=-I$NETCDF/include

Thanks to WardF for the answer; I also consulted another of his replies.
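Because configure only sees what LDFLAGS and CPPFLAGS point at, a mistyped path produces the same version error. A small sketch for checking that the directories named by the flags exist is shown below; the function name is hypothetical and the check says nothing about which netCDF version is installed there.

```python
import os

def missing_flag_dirs(ldflags, cppflags):
    """Return directories named by -L (LDFLAGS) or -I (CPPFLAGS) that do not exist."""
    missing = []
    for flags, prefix in ((ldflags, "-L"), (cppflags, "-I")):
        for token in flags.split():
            if token.startswith(prefix):
                path = os.path.expanduser(token[len(prefix):])
                if not os.path.isdir(path):
                    missing.append(path)
    return missing

# Check the values currently exported in the shell (empty if unset).
print(missing_flag_dirs(os.environ.get("LDFLAGS", ""),
                        os.environ.get("CPPFLAGS", "")))
```

An empty list means every -L and -I directory exists; it is worth running this in a fresh shell after editing ~/.bashrc.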

4. After fixing the error with the solution from item 3, a different error appeared, roughly as follows.

configure error: netcdf could not link to netcdfc library. Please set LDFLAGS;
for static builds set LIBS to the results of nc-config --libs ....

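Solution: I could not solve this problem directly. I tried what other blogs suggest, adding or changing LDFLAGS and LIBS in the environment, without success; passing the same two variables to configure did not work either. In the end it was only resolved by building a new netCDF environment under /usr/local/netcdf4; for the installation steps and environment settings, see xxx. I never identified the root cause, but it was most likely related to the install locations and install order of the netcdf-c, netcdf-fortran, zlib, hdf5, and curl libraries at the time netcdf-fortran was installed. Following tutorial xxx, I installed another netCDF version with all of these libraries under the same prefix and reset the environment variables, which resolved the problem.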

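Taken together, items 3 and 4 were most likely caused by the netCDF installed from the tutorial being v4.1.3, which does not meet WRF-Hydro's requirements: WRF-Hydro needs netCDF v4.4 or later.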

5. Action: running ./compile_offline_Noah.sh setEnvar.sh in WRF-Hydro.

Error:

f951: Warning: Nonexistent include directory ‘/home/root123/coding/WRF/Build_WRF/LIBRARIES/netcdf/include’ [-Wmissing-include-dirs]
netcdf_layer.f90:2:6:

use netcdf
1
Fatal Error: Can't open module file ‘netcdf.mod’ for reading at (1): 没有那个文件或目录
compilation terminated.

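Solution: the netCDF version is too old; upgrade it by following the installation tutorial. The warning also shows that the netCDF include directory used by the build does not exist, so the path must point at the new installation, where netcdf.mod is located.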

6. [Error] Running mpirun -np 4 ./wrf_hydro.exe > log fails.

 Calling config noahlsm_offline
Calling config noahlsm_offline
Calling config noahlsm_offline
Calling config noahlsm_offline
WARNING: LDASIN file has a perverse version identifier
ldasin_version = 0
WARNING: LDASIN file has a perverse version identifier
ldasin_version = 0
WARNING: LDASIN file has a perverse version identifier
ldasin_version = 0
WARNING: LDASIN file has a perverse version identifier
ldasin_version = 0








SNOW HEIGHT NOT FOUND - VALUE DEFINED IN LSMINIT
SNOW HEIGHT NOT FOUND - VALUE DEFINED IN LSMINIT
SNOW HEIGHT NOT FOUND - VALUE DEFINED IN LSMINIT
SNOW HEIGHT NOT FOUND - VALUE DEFINED IN LSMINIT
The job is stopped due to the fatal error. hydro.namelist ERROR: Please specify a udmap_file file.
---
FATAL ERROR! Program stopped. Recompile with environment variable HYDRO_D set to 1 for enhanced debug information.

The job is stopped due to the fatal error. hydro.namelist ERROR: Please specify a udmap_file file.
---
FATAL ERROR! Program stopped. Recompile with environment variable HYDRO_D set to 1 for enhanced debug information.

The job is stopped due to the fatal error. hydro.namelist ERROR: Please specify a udmap_file file.
---
FATAL ERROR! Program stopped. Recompile with environment variable HYDRO_D set to 1 for enhanced debug information.

The job is stopped due to the fatal error. hydro.namelist ERROR: Please specify a udmap_file file.
---
FATAL ERROR! Program stopped. Recompile with environment variable HYDRO_D set to 1 for enhanced debug information.


===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= EXIT CODE: 1
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================

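Solution: recompile WRF-Hydro with enhanced debug output, as the message suggests, by setting HYDRO_D in setEnvar.sh as follows. This only makes the output more detailed; the udmap_file error itself is handled in item 7.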

export HYDRO_D=1

Then recompile:

./configure
./compile_offline_NoahMP.sh setEnvar.sh

7. [Error] Running mpirun -np 4 ./wrf_hydro.exe > log fails.

 The job is stopped due to the fatal error. hydro.namelist ERROR: Please specify a udmap_file file.
The job is stopped due to the fatal error. hydro.namelist ERROR: Please specify a udmap_file file.
The job is stopped due to the fatal error. hydro.namelist ERROR: Please specify a udmap_file file.
The job is stopped due to the fatal error. hydro.namelist ERROR: Please specify a udmap_file file.

===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= EXIT CODE: 1
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================

Solution: edit the UDMP_OPT and udmap_file parameters in hydro.namelist:

UDMP_OPT = 0
udmap_file = ""

8. [Error] Running mpirun -np 4 ./wrf_hydro.exe fails.

*** Error in `./wrf_hydro.exe': free(): invalid next size (fast): 0x000000001dafcb80 ***
*** Error in `./wrf_hydro.exe': munmap_chunk(): invalid pointer: 0x000000001bd0e9f0 ***
======= Backtrace: =========
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777f5)[0x7f480ffc37f5]
/lib/x86_64-linux-gnu/libc.so.6(+0x777f5)[0x7f7f0b95a7f5]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x1a8)[0x7f480ffd06e8]
......
......
......
#13 0x46E9B9 in __module_noahmp_hrldas_driver_MOD_land_driver_ini
#9 0x4A1404 in __module_hydro_io_MOD_read_routelink
#10 0x55D48A in __module_routing_MOD_landrt_ini
#11 0x52A6FA in __module_hydro_drv_MOD_hydro_ini
#12 0x52C889 in __module_hrldas_hydro_MOD_hrldas_cpl_hydro_ini
#13 0x52FB65 in hrldas_drv_hydro_ini_
#14 0x46E9B9 in __module_noahmp_hrldas_driver_MOD_land_driver_ini

Running ./wrf_hydro.exe directly, without mpirun, gives:

WARNING: KDAY is deprecated and may be removed in a future version, please use KHOUR.
WARNING: In land_driver_ini() - KHOUR < 0. DEFINED USING KDAY.
WARNING: LDASIN file has a perverse version identifier
ldasin_version = 0


SNOW HEIGHT NOT FOUND - VALUE DEFINED IN LSMINIT
getting dimension from file: ./DOMAIN/Route_Link.nc
reading from hydrotbl_f(HYDRO.TBL.nc) file ....
WARNING: get2d_real: failed to find the variables: CHAN_DEPTH and CHAN_DEPTH
compound_channel is FALSE in hydro.namelist.
read gwbasmskfil as nc format: ./DOMAIN/GWBASINS.nc
read GWBUCKPARM file as nc format: ./DOMAIN/GWBUCKPARM.nc
Resetting RESTART Accumulation Variables to 0... 1
application called MPI_Abort(comm=0x84000000, 1) - process 0

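Redirecting the output with ./wrf_hydro.exe > log shows an error in the log file that means roughly:

The forcing data for wrfout_d01_2024-02-01_00:00:00 cannot be found.

Solution

Files in the FORCING folder must be named wrfout_d01_*, not wrfout_d02_*.
To use d02 data, the files must be renamed to d01.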
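The renaming can be scripted so that the original d02 output stays untouched. This is a minimal sketch under the assumption that copying (rather than moving or linking) is acceptable; the function name and the directory names in the usage comment are hypothetical.

```python
from pathlib import Path
import shutil

def stage_d02_as_d01(src_dir, forcing_dir):
    """Copy wrfout_d02_* files into forcing_dir under wrfout_d01_* names.

    Returns the list of new file names, sorted by source name.
    """
    forcing = Path(forcing_dir)
    forcing.mkdir(parents=True, exist_ok=True)
    staged = []
    for src in sorted(Path(src_dir).glob("wrfout_d02_*")):
        # Replace only the domain tag at the start of the file name.
        dst = forcing / src.name.replace("wrfout_d02_", "wrfout_d01_", 1)
        shutil.copy2(src, dst)
        staged.append(dst.name)
    return staged

# Example call (paths are placeholders):
# stage_d02_as_d01("path/to/wrf/run", "FORCING")
```

Copying keeps the WRF run directory intact, at the cost of disk space; for large runs a symbolic link per file would be a reasonable alternative.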

9. [Error] Running mpirun -np 4 ./wrf_hydro.exe fails.

 starting wrf task            0  of            4
starting wrf task 1 of 4
starting wrf task 2 of 4
starting wrf task 3 of 4

===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= EXIT CODE: 139
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions

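Solution

Reduce the time step, and choose a value that divides the output interval evenly: history_interval (given in minutes) must be an integer multiple of the integration step time_step (given in seconds).

&domains
time_step

Other possible causes: the resolution is too high, the domain is too large, or there is not enough memory. Switching a parameterization scheme (the microphysics scheme) may also help.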
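Both conditions are easy to check before a run. The sketch below uses a hypothetical helper; the 6 × dx (in km) upper bound is the general rule of thumb from WRF guidance rather than something stated in this post, and the sample values are illustrative.

```python
def check_time_step(time_step_s, history_interval_min, dx_m):
    """Return a list of problems with a proposed time_step (empty if none).

    time_step_s          : time_step in seconds (integer)
    history_interval_min : history_interval in minutes (integer)
    dx_m                 : grid spacing of the domain in metres
    """
    problems = []
    if (history_interval_min * 60) % time_step_s != 0:
        problems.append("history_interval is not an integer multiple of time_step")
    if time_step_s > 6 * dx_m / 1000:
        problems.append("time_step exceeds the 6 x dx (km) rule of thumb")
    return problems

print(check_time_step(150, 60, 27000))  # 27 km grid, 150 s step
print(check_time_step(180, 60, 27000))  # 180 s is above 6 x 27 = 162 s
```

When a run still crashes with a time step that passes both checks, a smaller step such as 4 × dx is a common next attempt.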

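10. Choosing the WRF domain with i_parent_start and e_we

i_parent_start and e_we go together: along west-east, the nest starts at parent index i_parent_start and spans (e_we - 1) / parent_grid_ratio parent cells, so the two must be chosen so that the nest stays inside the parent domain.

j_parent_start and e_sn are related in the same way along south-north.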
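The relationship can be written down as a quick check for one axis. This is a sketch with hypothetical function names and made-up domain sizes; it only tests that the nest ends inside the parent and does not enforce any recommended margin from the parent boundary.

```python
def nest_end_in_parent(parent_start, e_nest, parent_grid_ratio):
    """Return the last parent index covered by the nest along one axis."""
    if (e_nest - 1) % parent_grid_ratio != 0:
        raise ValueError("e_we/e_sn minus 1 must be a multiple of parent_grid_ratio")
    return parent_start + (e_nest - 1) // parent_grid_ratio

def nest_fits(parent_start, e_nest, parent_grid_ratio, e_parent):
    """True if the nest starts and ends inside the parent along this axis."""
    end = nest_end_in_parent(parent_start, e_nest, parent_grid_ratio)
    return parent_start >= 1 and end <= e_parent

# Illustrative numbers: parent e_we = 100, nest e_we = 91, ratio 3.
print(nest_fits(53, 91, 3, 100))  # nest ends at parent index 83
print(nest_fits(80, 91, 3, 100))  # nest would end at parent index 110
```

The same functions apply to the south-north direction by passing j_parent_start and e_sn.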

11. Running ./metgrid.exe fails.

Processing domain 1 of 3
Processing 2024-02-01_00
FILE
WARNING: Field PRES has missing values at level 200100 at (i,j)=(1,1)
WARNING: Field PMSL has missing values at level 200100 at (i,j)=(1,1)
WARNING: Field PSFC has missing values at level 200100 at (i,j)=(1,1)
ERROR: Missing values encountered in interpolated fields. Stopping.
application called MPI_Abort(MPI_COMM_WORLD, 0) - process 0

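Cause and fix: the ERA5 data cover a smaller area than the domain. Download the data again for an area that contains the whole domain.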

12. Running ./wrf_hydro.exe fails.

......
......
(Near) match for destination layer: Taking destination layer at 0.7000 from source layer at 0.7000
(Near) match for destination layer: Taking destination layer at 1.5000 from source layer at 1.5000
xstart,xend, ystart, yend 1 159 1 159
xstart,xend, ystart, yend 1 159 1 159
xstart,xend, ystart, yend 1 159 1 159
xstart,xend, ystart, yend 1 159 1 159
SOIL TEXTURE CLASSIFICATION = STAS FOUND 19 CATEGORIES
SNOW HEIGHT NOT FOUND - VALUE DEFINED IN LSMINIT
NTIME = 216 KHOUR= 216 dtbl = 3600.00000
zsoil/soil_thick_input = 0.100000001 0.300000012 0.600000024 1.00000000
The job is stopped due to the fatal error. HYDRO_nlst namelist error in read_rt_nlst
application called MPI_Abort(comm=0x84000000, 1) - process 0

Solution

A variable in hydro.namelist was malformed. In my case it looked like this (I do not know when it happened):

GWBUCKPARM_file = "./DOMAIN/
GWBUCKPARM.nc"
................................................................

It must be changed to:

GWBUCKPARM_file = "./DOMAIN/GWBUCKPARM.nc"
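A string value broken across two lines is hard to spot by eye, but each of the two lines ends up with an odd number of quotes. The following sketch flags such lines; the function name is hypothetical, and the check is deliberately simple (it assumes no `!` or apostrophe appears inside a quoted value).

```python
def lines_with_unclosed_quote(namelist_text):
    """Return (line number, stripped line) for lines with an odd number of quotes."""
    bad = []
    for number, line in enumerate(namelist_text.splitlines(), start=1):
        code = line.split("!", 1)[0]  # ignore trailing namelist comments
        if code.count('"') % 2 == 1 or code.count("'") % 2 == 1:
            bad.append((number, line.strip()))
    return bad

# The broken entry from this item, followed by a well-formed one.
sample = 'GWBUCKPARM_file = "./DOMAIN/\nGWBUCKPARM.nc"\nudmap_file = ""\n'
for number, text in lines_with_unclosed_quote(sample):
    print(number, text)
```

Running it on the contents of hydro.namelist before launching the model points directly at the offending line numbers.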

13. Running ./wrf_hydro.exe fails.

PrPt/LkIn   0 *** ********** ********* 1588 1590
PrPt/LkIn 0 *** ********** ********* 1589 1590
PrPt/LkIn 0 *** ********** ********* 1590 1590
found type 0 nodes 291
total number of channel elements 292
total number of NLINKS 2528100
Apparent error in network topology 292 2528100
ixrt = 1590 jxrt = 1590
The job is stopped due to the fatal error. READ_ROUTEDIM
application called MPI_Abort(comm=0x84000000, 1) - process 0

The point in the terminal output where the errors begin, with the surrounding messages, is as follows:

(Near) match for destination layer:  Taking destination layer at  1.5000 from source layer at  1.5000
xstart,xend, ystart, yend 1 159 1 159
xstart,xend, ystart, yend 1 159 1 159
xstart,xend, ystart, yend 1 159 1 159
xstart,xend, ystart, yend 1 159 1 159
SOIL TEXTURE CLASSIFICATION = STAS FOUND 19 CATEGORIES
SNOW HEIGHT NOT FOUND - VALUE DEFINED IN LSMINIT
NTIME = 216 KHOUR= 216 dtbl = 3600.00000
zsoil/soil_thick_input = 0.100000001 0.300000012 0.600000024 1.00000000
rt_domain(did)%g_IXRT, rt_domain(did)%g_JXRT, rt_domain(did)%ixrt, rt_domain(did)%jxrt
1431 1431 1431 1431
rt_domain(did)%ix, rt_domain(did)%jx
159 159
global_nx, global_ny, local_nx, local_ny
159 159 159 159
Channel Option in Routedim is 3
get2d_int: failed to read the variable: CHANNELGRID in ./DOMAIN/Fulldom_hires.nc
read LINKID for CH_LNKRT from ./DOMAIN/Fulldom_hires.nc
get2d_int: failed to read the variable: FLOWDIRECTION in ./DOMAIN/Fulldom_hires.nc
get2d_int: failed to read the variable: LAKEGRID in ./DOMAIN/Fulldom_hires.nc
WARNING: get2d_real: failed to read the variable: LATITUDE or LATITUDE
WARNING: get2d_real: failed to read the variable: LONGITUDE or LONGITUDE
NLINKS IS 2047761
PrPt/LkIn 0 *** ********** ********* 1 1
PrPt/LkIn 0 *** ********** ********* 2 1
PrPt/LkIn 0 *** ********** ********* 3 1
PrPt/LkIn 0 *** ********** ********* 4 1
PrPt/LkIn 0 *** ********** ********* 5 1
PrPt/LkIn 0 *** ********** ********* 6 1
...................................................................
...................................................................
...................................................................
...................................................................

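Analysis

This is related to AGGFACTRT, although the exact cause is unknown. The only approaches I can think of are raising the resolution of geo_em.d01, or using as small an AGGFACTRT and as large a DXRT as possible. DXRT is generally kept below 300 m for flood prediction, however, so the only real option is to raise the resolution of geo_em.d01.

A very likely cause: judging from where the errors begin, the messages involve the Fulldom_hires file, so the variables in Fulldom_hires may conflict with the resolution of geo_em.d01 and FORCING. The reasoning is as follows. The WRF-Hydro GIS Pre-processing toolbox has a parameter called Regridding Factor, which gives Routing Resolution = GEOGRID Resolution / Regridding Factor. Every variable generated by the toolbox, including those in Fulldom_hires, is at the Routing Resolution, while FORCING has the same resolution as geo_em.d01. If AGGFACTRT and DXRT are set inappropriately, the sub-grid derived from geo_em.d01 and FORCING does not line up with the resolution of Fulldom_hires, and the error above occurs.

It is unrelated to whether the DOMAIN contains ocean.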

Solution

Formulas: Routing Resolution = GEOGRID Resolution / Regridding Factor

            GEOGRID Resolution = AGGFACTRT × DXRT

Flood prediction requires a sub-grid size DXRT < 300 m.

Suppose DXRT is 300 m and the resolution of geo_em.d01 is 3 km. The Routing Resolution is then 300 m, so the Regridding Factor is 3000 / 300 = 10 and AGGFACTRT = 10.
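The arithmetic above can be wrapped in a small helper that refuses combinations where the geogrid resolution is not a whole multiple of DXRT. This is a sketch with a hypothetical function name; resolutions are passed as whole metres.

```python
def aggfactrt_for(geogrid_res_m, dxrt_m):
    """Return AGGFACTRT such that AGGFACTRT * DXRT equals the geogrid resolution.

    Both arguments are integers in metres. A ValueError is raised when the
    geogrid resolution is not a whole multiple of DXRT.
    """
    factor, remainder = divmod(geogrid_res_m, dxrt_m)
    if remainder != 0:
        raise ValueError(
            f"{geogrid_res_m} m is not a whole multiple of DXRT = {dxrt_m} m"
        )
    return factor

print(aggfactrt_for(3000, 300))  # the worked example: 3 km grid, 300 m routing grid
```

The value returned is also the Regridding Factor to enter in the GIS pre-processing step, since both express the same ratio.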

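14. Running ./wrf.exe or mpirun -np 2 ./wrf.exe causes a segmentation fault.

The physics options are the defaults. Wrong parameters? Too many grid points? Resolution too low or too high? Not enough memory? The run fails when I use two domains (27 and 9 km) or three domains (27, 9, and 3 km), but not with a single 3 km domain, which suggests the problem may be memory-related.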

15. With regridded NLDAS data (NaN values already replaced) as FORCING, running mpirun -np 6 ./wrf_hydro.exe fails. The diag_hydro.00000 file shows:

t0OutputFlag:            1
read forcing data at 2024-02-01_00:00:00./FORCING/2024020101.LDASIN_DOMAIN1
name = "T2D"
The job is stopped due to the fatal error. In get_2d_netcdf() - nf90_get_var problem

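The coverage of the LDASIN_DOMAIN1 files in FORCING does not match the coverage of geo_em.d01 in DOMAIN. At first I assumed the FORCING data were global and simply copied them over; visualizing them later showed that the coverage did not line up. The NLDAS data must be regridded against geo_em.d01, so that the LDASIN_DOMAIN1 files are generated from the same geo_em.d01 that is in DOMAIN.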

16. Running mpirun -np 4 ./wrf_hydro.exe fails.

 get2d_int: failed to read the variable: LAKEGRID in ./DOMAIN/Fulldom_hires.nc
WARNING: get2d_real: failed to read the variable: LATITUDE or LATITUDE
WARNING: get2d_real: failed to read the variable: LONGITUDE or LONGITUDE
Apparent error in network topology 4 13967
ixrt = 150 jxrt = 160
The job is stopped due to the fatal error. READ_ROUTEDIM
---
FATAL ERROR! Program stopped. Recompile with environment variable HYDRO_D set to 1 for enhanced debug information.

application called MPI_Abort(comm=0x84000004, 1) - process 0

===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= EXIT CODE: 1
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================

Solution: adjust DXRT and AGGFACTRT so that DXRT × AGGFACTRT equals the resolution of geo_em.d0*.
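A related sanity check is that the routing grid printed as ixrt and jxrt should be the land-surface grid size multiplied by AGGFACTRT. The sketch below uses a hypothetical function name and the dimensions printed in the logs of item 13 (a 159 × 159 grid with routing grids of 1590 and 1431 cells per side).

```python
def routing_dims_match(ix, jx, ixrt, jxrt, aggfactrt):
    """True if the routing grid is exactly aggfactrt times the land-surface grid."""
    return ixrt == ix * aggfactrt and jxrt == jx * aggfactrt

print(routing_dims_match(159, 159, 1590, 1590, 10))  # True
print(routing_dims_match(159, 159, 1431, 1431, 10))  # False: 1431 = 159 x 9
```

A False result means Fulldom_hires.nc was built with a different regridding factor than the AGGFACTRT set in hydro.namelist, so one of the two has to change.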

17. Running mpirun -np 4 ./ungrib.exe fails; the error is recorded in ungrib.log as follows.

...
...
2024-07-21 16:04:32.899 --- 200 X - X X X X - - - - - - - - - - - - - - - - - - - - -
2024-07-21 16:04:32.902 --- 100 X - X X X X - - - - - - - - - - - - - - - - - - - - -
2024-07-21 16:04:32.906 --- -------------------------------------------------
2024-07-21 16:04:36.462 --- INFORM: First pass done, doing a reprocess
2024-07-21 16:04:36.485 --- ERROR: unknown out_format, ifv = 1176463528

Solution: ungrib.exe cannot be run with mpirun; run it directly as ./ungrib.exe.

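Other references for troubleshooting

CSDN: WRF error log

Zhihu: problems encountered in WRF model simulations and their solutions

Common WRF bugs and their solutions

CSDN: WRF-Hydro run procedure and error summary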


https://singyutang.github.io/2024/07/22/运行WPS、WRF、WRF-Hydro的一些错误总结一/
Author: SingyuTang
Published: July 22, 2024