Too much space is used by your output files

From ScientificComputing
Revision as of 09:51, 23 February 2017 by Sfux

Introduction

On our clusters, data written to stdout/stderr are buffered in a shadow file system with a small quota of 2 GB per user. If this quota were reached, all jobs would crash. We have therefore recently modified the batch system to detect this condition and preemptively reject new jobs until the data that the existing jobs have stored in the shadow file system have been removed.

Error message

Affected users then receive the following error message:

 Too much space is used by your output files
 in the LSF batch system's temporary directory.

You cannot clean up your files in the shadow file system yourself. If you receive this error message, then please contact Cluster Support.

How to solve this problem

Writing so much data to stdout or stderr not only fills up the shadow file system but also slows down your jobs. You should therefore:

  1. Kill all your jobs to prevent further problems
  2. Modify your program so that it does NOT write all this data to stdout or stderr
  3. Resubmit all jobs
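
One way to approach step 2 is sketched below in Python, as an illustration only. It assumes that the detailed output can go to a regular file of your choice (here a hypothetical path passed as log_path) on a file system where you have enough quota, and that a single summary line on stdout is sufficient. The function name run and the summation loop are placeholders for your own program.

```python
def run(steps, log_path):
    """Do some work, writing per-step details to a file instead of stdout."""
    total = 0
    # Detailed output goes to a regular file chosen by the user (log_path is
    # a hypothetical path), so it is not buffered via stdout/stderr.
    with open(log_path, "w") as log:
        for i in range(steps):
            total += i  # placeholder for the real computation
            log.write(f"step {i} partial_sum {total}\n")
    # Only one short summary line is printed to stdout.
    print(f"finished {steps} steps, sum={total}")
    return total
```

Calling run(1000, "detail.log") would then write 1000 lines to detail.log and print a single line to stdout. If the detailed output is not needed at all, the simplest fix is to remove those write statements entirely.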