

The NAF offers different access methods for different kinds of data, so users can choose the most suitable storage for each kind of data (scripts, histograms, big data sets, ...). In return, users need to think about the access pattern of each kind of data they have.

In the overview below and the more extensive descriptions (see links) you can find advice on how best to use the different storage systems.

In general, keep in mind that many small files are bad on some file systems and even worse on others, so please try to avoid them: for example, make tarballs of log files (stdout, stderr, .aux files, etc.).
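The advice to pack log files into tarballs can be sketched as follows. This is a minimal illustration, not an official NAF tool: the function name `bundle_small_files`, the default archive name `logs.tar.gz` and the file patterns are assumptions chosen for the example and should be adapted to your own job output.

```python
import tarfile
from pathlib import Path


def bundle_small_files(directory, archive_name="logs.tar.gz",
                       patterns=("*.out", "*.err", "*.log", "*.aux"),
                       remove_originals=False):
    """Pack files in `directory` matching `patterns` into one gzip tarball.

    Returns the sorted list of file names that were packed.
    All names and patterns here are illustrative assumptions.
    """
    directory = Path(directory)
    archive_path = directory / archive_name
    # Collect the matches first; a set avoids packing a file twice
    # if it happens to match more than one pattern.
    files = sorted({p for pattern in patterns
                    for p in directory.glob(pattern) if p.is_file()})
    if not files:
        return []
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in files:
            # Store only the file name so the archive unpacks anywhere.
            tar.add(path, arcname=path.name)
    if remove_originals:
        # Delete the small files only after the archive was closed cleanly.
        for path in files:
            path.unlink()
    return [p.name for p in files]
```

Calling `bundle_small_files("my_job_output", remove_originals=True)` would leave a single archive in place of the individual log files, which is much friendlier to the file systems described on this page.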

Brief storage overview and usage patterns

AFS
AFS is the well-known wide-area file system. It is best suited to small login scripts, possibly some code, and small ntuples. AFS is backed up.
DO keep scripts and small important files here so that they are backed up.
DO NOT have more than 65k entries in a directory.

dCache
dCache is the main (and largest) storage system. It is the entry point for all data and the exchange point with the Grid world. dCache can be accessed through the pNFS mount (/pnfs/…) or with the dCap, XROOTD or GridFTP protocols.
DO store analysis output and large files that should stay and possibly be visible from the Grid.
DO NOT put small files here.

DUST
DUST is a scratch "playground" area and a cluster file system. There is no backup.
DO read and write large files with high bandwidth.
DO NOT store too many small files (you might want to tar directories that contain many small files).

Other
You also have local disks, mainly as a temporary local scratch area. There are also experiment-specific storage systems, like the TAG database for ATLAS.
DO copy data that you need to run over quite often to the local disks of the worker node (WN).
DO NOT leave data there, as it will vanish at the end of your job.
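The local-disk advice (copy frequently read data to the worker node, but do not rely on it surviving the job) can be sketched as below. This is only an illustration under stated assumptions: the helper name `run_with_local_copy` is hypothetical, and using `$TMPDIR` (or the system temporary directory) as the local scratch area is an assumption, not a documented NAF setting.

```python
import os
import shutil
import tempfile
from pathlib import Path


def run_with_local_copy(source, process, scratch_root=None):
    """Copy `source` to local scratch, call `process(local_path)`, clean up.

    `scratch_root` defaults to $TMPDIR or the system temp directory; which
    directory is the local scratch area on a worker node is an assumption.
    """
    source = Path(source)
    scratch_root = (scratch_root or os.environ.get("TMPDIR")
                    or tempfile.gettempdir())
    # TemporaryDirectory deletes the local copy when the block exits,
    # mirroring the fact that local data vanishes at the end of the job.
    with tempfile.TemporaryDirectory(dir=scratch_root) as workdir:
        local_copy = Path(workdir) / source.name
        shutil.copy2(source, local_copy)
        # `process` may read the local copy as often as it likes.
        return process(local_copy)
```

Anything `process` writes inside the scratch directory is gone once the function returns, so results that should be kept need to be written to permanent storage such as dCache or DUST before then.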