Guides

 

The standard user scenario is described below in terms of the portal's main workflows and their tasks.

 

The workflows described below provide the functionality for efficient and easy WRF-ARW execution on the national grid infrastructure (CRO-NGI, http://www.cro-ngi.hr). The reason for the close binding with CRO-NGI lies in the fact that the model requires a high-bandwidth computing resource (Infiniband interconnection between nodes) with support for MPI. Furthermore, CRO-NGI provides a very fast network-shared file system called Gfarm (http://datafarm.apgrid.org/). Gfarm has proven to deliver high performance and easy usage, and thus all workflows described here rely on the Gfarm file system. The Gfarm functionality is implemented through bash scripts that are provided as inputs to the workflows' tasks. In order to use the CRO-NGI infrastructure, users must have a valid user account and access to the CRO-NGI infrastructure, which can be procured from the University Computing Centre (SRCE, http://www.srce.unizg.hr/homepage/). A valid certificate has to be stored on the MyProxy server. Prior to running the workflows, users have to download their user certificate from the MyProxy server and associate it with the CRO-NGI virtual organization.
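
For reference, the equivalent certificate handling outside the gateway portlets would look roughly like the sketch below. The MyProxy server name and the VO name are illustrative assumptions, not values taken from this guide:

    # Store a proxy credential on the MyProxy server (done once, from a host
    # holding the user certificate; server and user names are placeholders)
    myproxy-init -s myproxy.example.org -l user_name

    # Before running workflows: retrieve the proxy from the MyProxy server...
    myproxy-logon -s myproxy.example.org -l user_name

    # ...and associate it with the CRO-NGI virtual organization
    # (the VO name "cro-ngi" is an assumption)
    voms-proxy-init -voms cro-ngi -noregen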

 

The two main workflows are the workflow that downloads and prepares the static geographical data and the workflow that performs the pre-processing and runs the core WRF-ARW application.

 

Workflow "Download geogrid data"

This workflow downloads all the static geographical data from www.mmm.ucar.edu and stores them in the Gfarm file structure. The workflow can be imported from the SHIWA repository or from the gateway's local repository. Once imported, the workflow is accessible through the Workflow->Concrete portlet. "Download geogrid data" is the first workflow a new user has to run before starting any WRF-ARW job. In general, this workflow should be run only once, at the first login to the gateway. Once the workflow has finished, the data are permanently stored in the pre-defined Gfarm structure and can be accessed by the user in any other workflow by simply providing the absolute Gfarm path to the folder where the static geographical data are stored.

 

The workflow consists of only one task and is run as a sequential job. The input parameter for the task is the relative path to the folder in the Gfarm structure where the downloaded data will be stored. The main user structure of Gfarm is "/home/user_name/", and a new folder with the user-defined name "dir_name" will be created (if it does not already exist) at the Gfarm path "/home/user_name/dir_name". All the data are permanently stored. The output of the workflow is the file "downloadOut.txt" containing the absolute path of the output folder.
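
A minimal sketch of what the task's bash script does is shown below. The archive URL and file name, as well as the exact Gfarm commands, are assumptions for illustration; the actual script shipped with the workflow may differ:

    #!/bin/bash
    # Sketch of the "Download geogrid data" task; $1 is the relative Gfarm path.
    DIR_NAME="$1"
    OUT_DIR="/home/${USER}/${DIR_NAME}"        # main user structure of Gfarm

    gfmkdir -p "${OUT_DIR}" 2>/dev/null        # create the folder if it does not exist

    # Download the static geographical data (URL and archive name are assumptions)
    wget -q http://www.mmm.ucar.edu/wrf/src/wps_files/geog_complete.tar.gz
    tar xzf geog_complete.tar.gz

    # Register every extracted file under the Gfarm output folder
    cd geog
    find . -type f | while read -r f; do
        f="${f#./}"                            # strip the leading "./"
        gfmkdir -p "${OUT_DIR}/$(dirname "${f}")" 2>/dev/null
        gfreg "${f}" "${OUT_DIR}/${f}"
    done
    cd ..

    echo "${OUT_DIR}" > downloadOut.txt        # absolute path of the output folder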

 

Workflow "WRF"

The "WRF" workflow is the main workflow and can be applied to almost any WRF user scenario or simulation. The workflow can be imported by the end user from the SHIWA repository or from the gateway's local repository. Once imported, the workflow can be accessed through the Workflow->Concrete portlet. The WRF workflow describes the most commonly used scenario for the WRF-ARW prognostic model and consists of 5 tasks (Figure 1).

Figure 1: The WRF workflow

 

The WRF workflow is divided into the following tasks:

 

getGFS

This task is responsible for downloading the initial and boundary condition GFS data from NCEP (nomads.ncep.noaa.gov). The task accepts 4 parameters, "{output folder} {start date} {start hour} {number of hours}", and can be run in two modes. The first mode is to provide the relative path to a folder in the Gfarm structure that already contains initial and boundary condition data. This is done by setting the parameters to "/relative_path/folder_name -1", where "-1" is given as the "start date", indicating that no data are downloaded. The second mode of execution is to download the data from the NCEP server. This is done by providing the start date, the start hour and the period, in hours, for which the simulation is to be run. The task then downloads the data from the NCEP servers, starting from the given date and time, for the specified number of hours. The output of the getGFS task is an internal file with the absolute path to the folder containing the initial and boundary condition data, which is passed as an input to the ungrib task. For the execution, the user can select any of the 5 CRO-NGI resources.
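
The two modes can be sketched in bash as follows. The NOMADS directory layout and the GFS file naming below are assumptions for illustration (the layout changes over time), as is the name of the internal output file:

    #!/bin/bash
    # Sketch of the getGFS task.
    # Usage: getGFS.sh {output folder} {start date YYYYMMDD} {start hour} {number of hours}
    OUT_DIR="/home/${USER}/$1"
    START_DATE="$2"; START_HOUR="$3"; NUM_HOURS="$4"

    if [ "${START_DATE}" = "-1" ]; then
        # Mode 1: reuse data already stored in Gfarm; nothing is downloaded
        echo "${OUT_DIR}" > getGFSOut.txt
        exit 0
    fi

    # Mode 2: download one GFS file per 3-hour forecast step from NOMADS
    gfmkdir -p "${OUT_DIR}" 2>/dev/null
    for (( h = 0; h <= NUM_HOURS; h += 3 )); do
        FF=$(printf "%02d" "${h}")
        FILE="gfs.t${START_HOUR}z.pgrb2f${FF}"     # file name pattern is an assumption
        wget -q "http://nomads.ncep.noaa.gov/pub/data/nccf/com/gfs/prod/gfs.${START_DATE}${START_HOUR}/${FILE}"
        gfreg "${FILE}" "${OUT_DIR}/${FILE}"
    done

    echo "${OUT_DIR}" > getGFSOut.txt              # internal file read by ungrib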

 

ungrib

This task runs the ungrib pre-processor as a sequential job. The user can select on which of the 5 CRO-NGI resources the task will run. The input parameters for the ungrib task are "out_folder Vtable", where "out_folder" is, as before, the relative Gfarm path to the folder where the output data of the ungrib pre-processor will be stored. If the folder does not exist, it will be created at runtime. In the current test case, where only GFS data are downloaded, the parameter "Vtable" should be set to "GFS". Furthermore, the namelist.wps file has to be provided before the task execution: users are required to prepare their namelist.wps file and upload it in the ungrib task's configuration window->Job I/O. The output of the task is generated automatically and sent as an input to the metgrid task. The getGFS and ungrib tasks run in parallel with the geogrid task.
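
A sketch of the ungrib task script, under the assumption that the standard WPS helpers (link_grib.csh, the Variable_Tables folder) are unpacked from the input archive and that the Gfarm command-line tools are available on the resource:

    #!/bin/bash
    # Sketch of the ungrib task. Parameters: "out_folder Vtable" (e.g. "run1 GFS")
    OUT_DIR="/home/${USER}/$1"; VTABLE="$2"
    GFS_DIR=$(cat getGFSOut.txt)               # internal file produced by getGFS

    tar xzf ungrib.tar.gz                      # binaries and auxiliary files (input file)
    ln -sf "Variable_Tables/Vtable.${VTABLE}" Vtable   # archive layout is an assumption

    # Stage the GRIB files from Gfarm and create the GRIBFILE.* links
    for f in $(gfls "${GFS_DIR}"); do gfexport "${GFS_DIR}/${f}" > "${f}"; done
    ./link_grib.csh gfs.*

    ./ungrib.exe                               # reads the user-supplied namelist.wps

    gfmkdir -p "${OUT_DIR}" 2>/dev/null        # created at runtime if missing
    for f in FILE:*; do gfreg "${f}" "${OUT_DIR}/${f}"; done
    echo "${OUT_DIR}" > ungribOut.txt          # sent on to metgrid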

 

geogrid

The geogrid task runs the geogrid pre-processor. The task is also performed on only one processor, so no additional set-up of the task is required. The task has two input parameters: "geo_folder out_folder". The "geo_folder" is the relative path in the Gfarm structure to the folder where the static geographical data are stored. The data can be stored by the user manually or can be downloaded and stored automatically by running the "Download geogrid data" workflow beforehand. In the latter case, the value of "geo_folder" should be the value returned in the "Download geogrid data" workflow's output file. The "out_folder" is the relative path to the folder where the geogrid output data will be stored. The task also requires the namelist.wps file as an input, which has to be uploaded in the configuration window->Job I/O. The output of the task is an internal file with the absolute path to the output folder defined by "out_folder", passed as an input to the metgrid task.
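
A sketch of the geogrid task, assuming the static data are made visible locally through a gfarm2fs FUSE mount (the mount approach and the archive layout are assumptions):

    #!/bin/bash
    # Sketch of the geogrid task. Parameters: "geo_folder out_folder"
    GEO_DIR="/home/${USER}/$1"; OUT_DIR="/home/${USER}/$2"

    tar xzf geogrid.tar.gz                     # geogrid.exe and auxiliary files

    # Make the static geographical data visible locally; geog_data_path in the
    # user-supplied namelist.wps must then point at "./gfarm${GEO_DIR}"
    mkdir -p gfarm && gfarm2fs gfarm

    ./geogrid.exe                              # reads namelist.wps

    gfmkdir -p "${OUT_DIR}" 2>/dev/null
    for f in geo_em.d0*.nc; do gfreg "${f}" "${OUT_DIR}/${f}"; done
    echo "${OUT_DIR}" > geogridOut.txt         # passed on to metgrid

    fusermount -u gfarm                        # release the mount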

 

metgrid

The metgrid task implements the metgrid pre-processor. The input files for the metgrid task are geogridOut and ungribOut. The end user does not have to handle these files, as they are passed automatically. The only input required from the end user is the metgrid output folder in the Gfarm structure. As in the previous two pre-processing tasks, the namelist.wps file is required and has to be uploaded by the user under the configuration window->Job I/O. The output of the metgrid task is an internal file passed as an input to the WRF task.
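
Following the same pattern, a metgrid sketch (staging commands and file name patterns are assumptions):

    #!/bin/bash
    # Sketch of the metgrid task. Parameter: the metgrid output folder.
    OUT_DIR="/home/${USER}/$1"
    GEO_DIR=$(cat geogridOut.txt)              # passed automatically from geogrid
    UNGRIB_DIR=$(cat ungribOut.txt)            # passed automatically from ungrib

    tar xzf metgrid.tar.gz                     # metgrid.exe and auxiliary files

    # Stage the outputs of both pre-processing branches from Gfarm
    for f in $(gfls "${GEO_DIR}");    do gfexport "${GEO_DIR}/${f}"    > "${f}"; done
    for f in $(gfls "${UNGRIB_DIR}"); do gfexport "${UNGRIB_DIR}/${f}" > "${f}"; done

    ./metgrid.exe                              # reads the user-supplied namelist.wps

    gfmkdir -p "${OUT_DIR}" 2>/dev/null
    for f in met_em.d0*.nc; do gfreg "${f}" "${OUT_DIR}/${f}"; done
    echo "${OUT_DIR}" > metgridOut.txt         # internal file read by the WRF task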

 

WRF

This task is the main task of the workflow and runs real.exe and wrf.exe, the core executables of the WRF-ARW model. This is the parallel part of the code, and the user has to mark it as an MPI type of job and define the number of processors. The user can choose any of the 5 CRO-NGI resources, but for the best performance the ce2.cro-ngi.hr resource is recommended: it is the only resource with an Infiniband interconnection, which greatly affects the model's execution time. The input parameter for the task is the relative path to the output folder. If the output folder does not exist, it will be created in Gfarm. Furthermore, during the task configuration, the user has to upload the namelist.input file. The file can be uploaded in the configuration window->Job I/O, Port name: NamelistWRF.
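
A sketch of the WRF task body follows. Because the task is marked as an MPI job, the actual launcher invocation is handled by the site's middleware, so the explicit mpirun calls below are an assumption:

    #!/bin/bash
    # Sketch of the WRF task. Parameter: relative path to the output folder.
    OUT_DIR="/home/${USER}/$1"
    MET_DIR=$(cat metgridOut.txt)              # passed automatically from metgrid

    tar xzf wrf.tar.gz                         # real.exe, wrf.exe and auxiliary files
    for f in $(gfls "${MET_DIR}"); do gfexport "${MET_DIR}/${f}" > "${f}"; done

    # Both steps read the user-supplied namelist.input (port NamelistWRF)
    mpirun ./real.exe                          # generates initial/boundary files
    mpirun ./wrf.exe                           # the core WRF-ARW integration

    gfmkdir -p "${OUT_DIR}" 2>/dev/null        # created if it does not exist
    for f in wrfout_d0*; do gfreg "${f}" "${OUT_DIR}/${f}"; done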

 

In the ungrib, geogrid, metgrid and WRF tasks, the executables are bash scripts, while the binaries and the auxiliary files are provided as input files (compressed as tar.gz archives).