Frequently Asked Questions
SSH Connections
Back up your working SSH key
- Many problems connecting to the ESPRI computing center over SSH can be solved this way: keep a copy of your working SSH keys on another host or a private cloud
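As an illustration of such a backup, here is a minimal sketch that copies the key files to a directory of your choice (for example a folder synchronised with your private cloud). The helper name `backup_ssh_keys` and the destination path are hypothetical; the sketch only assumes that your keys use the default OpenSSH file names starting with `id_`.

```shell
# Hypothetical helper: copy SSH key files (id_*) to a backup directory.
# $1: destination directory (created if missing)
# $2: source directory (defaults to ~/.ssh)
backup_ssh_keys() {
    dest="$1"
    src="${2:-$HOME/.ssh}"
    # Keep the backup directory private to your user.
    mkdir -p "$dest" && chmod 700 "$dest" || return 1
    for key in "$src"/id_*; do
        # Skip the unexpanded pattern when no key file exists.
        [ -f "$key" ] || continue
        # -p preserves permissions, so private keys stay readable by you only.
        cp -p "$key" "$dest"/
    done
}
```

Example call, with a made-up destination: `backup_ssh_keys "$HOME/private-cloud/ssh-backup"`.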
SSH error "permission denied"
- To send us debug information, see: https://documentations.ipsl.fr/spirit/ssh/about_ssh_key.html#permission-denied-issue
- If you are not on your usual computer, or your computer has been reinstalled, make sure you have restored your SSH keys on it. See: https://documentations.ipsl.fr/spirit/ssh/about_ssh_key.html#keys-replication
- If you change the permissions of your home directory on the cluster (write access for group or everybody), your SSH keys can no longer be used until you remove the wrong permission
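The permission problem in the last point can be checked and repaired from a shell on the cluster. This is a minimal sketch: the helper name `fix_home_perms` is made up for illustration, and it only removes the group/other write bits mentioned above.

```shell
# Hypothetical helper: remove write access for group and others on a directory.
# $1: directory to fix (defaults to your home directory)
fix_home_perms() {
    dir="${1:-$HOME}"
    # Show the permissions before the change, for reference.
    ls -ld "$dir"
    # Drop the write bit for group (g) and others (o); owner bits are untouched.
    chmod go-w "$dir"
    # Show the result: there should be no "w" left for group or others.
    ls -ld "$dir"
}
```

Called without an argument, `fix_home_perms` acts on `$HOME`.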
SSH public keys are removed from my $HOME/.ssh/authorized_keys shortly after I add them
- This is not a problem but a feature of our clusters, and there are several reasons for it:
- It automatically corrects any mistake you might make in your SSH authorized_keys
- It guarantees that you can log in with your SSH keys on the various ESPRI clusters, which have different homes (spirit/spiritx/ciclad/climserv/hal)
- Security: any key added by someone else does not stay
I want to connect from another computer
You do not need a new SSH key. See: https://documentations.ipsl.fr/spirit/ssh/about_ssh_key.html#keys-replication
SSH message "sign_and_send_pubkey: signing failed: agent refused operation"
- This can happen after you have copied your key to a new computer
- Solution: run
ssh-add
on your computer
SSH sessions die many times a day
- This problem does not come from our side but from your location
- The cause is a faulty or misconfigured network device (Internet box, router or firewall) that does not keep idle TCP sessions open
- We have seen this in many places around the world
- For Linux, macOS and any other OpenSSH client:
- Try:
ssh -o ServerAliveInterval=90s user@host
and see here to make this change permanent: https://documentations.ipsl.fr/spirit/ssh/about_ssh_key.html#ssh-client-configuration
- For MobaXterm:
- Disable (uncheck) SSH keepalive and close MobaXterm
- Open MobaXterm, check SSH keepalive and close MobaXterm again
- Open MobaXterm once more: it should now work
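For OpenSSH clients, the keepalive option above can be stored in the client configuration file so that it applies to every connection. The sketch below is one possible way to do it, not the procedure from the linked page: the helper name `add_keepalive` is hypothetical, and applying the option to all hosts with `Host *` is an assumption you may want to narrow to the cluster host names.

```shell
# Hypothetical helper: append a keepalive default to an OpenSSH client config file.
# $1: config file path (defaults to ~/.ssh/config; its directory must already exist)
add_keepalive() {
    cfg="${1:-$HOME/.ssh/config}"
    # Do nothing if the option is already set somewhere in the file.
    if [ -f "$cfg" ] && grep -qi '^[[:space:]]*ServerAliveInterval' "$cfg"; then
        echo "ServerAliveInterval already set in $cfg"
        return 0
    fi
    # A "Host *" block at the end of the file acts as a default for every host
    # (assumption: you want the keepalive everywhere, not only for the clusters).
    printf '\nHost *\n    ServerAliveInterval 90\n' >> "$cfg"
    # The config file should not be writable by anyone but you.
    chmod 600 "$cfg"
}
```

Running `add_keepalive` twice is harmless: the second call detects the existing line and leaves the file unchanged.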
SSH graphical session not working or no longer working
- Check the quota of your IPSL Mesocenter home with the command
quotas
Graphical sessions stop working when you are over quota on /home
- On Mac, have you installed XQuartz?
- On Linux and Mac, try starting ssh with the -X option
- On Windows, use MobaXterm
- If none of the above resolves your problem, the problem is on your laptop/workstation and you need to contact your local support team
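A quick way to see whether X11 forwarding was actually set up is to look at the DISPLAY variable once you are logged in on the cluster: sshd sets it when forwarding has been granted. The sketch below only wraps that check; the function name `x11_forwarding_active` is made up for illustration.

```shell
# Hypothetical helper: succeed if DISPLAY is set in the current session.
x11_forwarding_active() {
    # With "ssh -X", sshd normally sets DISPLAY to a value such as localhost:10.0.
    [ -n "${DISPLAY:-}" ]
}

if x11_forwarding_active; then
    echo "DISPLAY is set to $DISPLAY"
else
    echo "DISPLAY is empty: reconnect with ssh -X and check the points above"
fi
```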
SSH graphical session stops working after 15 to 20 minutes
If remote graphics stop working after a while (especially on macOS), try:
ssh -o ForwardX11Timeout=168h -X user@host
and see here to make this change permanent: https://documentations.ipsl.fr/MESO_User/SSH/About_SSH_key.html#using-config-file-on-linux-or-macos
scp to the IPSL mesocenter no longer works
Do not put any echo command in your .bashrc: scp stops working as soon as .bashrc prints something
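If you still want a message when you log in yourself, a common shell pattern is to print only when the shell is interactive, so that non-interactive sessions such as scp stay silent. This is a sketch of that general pattern, not a site recommendation; the function name `is_interactive` and the message are illustrative.

```shell
# Sketch for ~/.bashrc: print messages only in interactive shells.
# "$-" holds the current shell flags; it contains "i" when the shell is interactive.
is_interactive() {
    case $- in
        *i*) return 0 ;;
        *)   return 1 ;;
    esac
}

if is_interactive; then
    echo "Welcome"
fi
```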
SSH message "Unable to negotiate"
- "Unable to negotiate with xxx.xxx.xxx.xxx port 22: no matching host key type found. Their offer: ssh-rsa,ssh-dss"
- Recent Linux distributions, macOS and MobaXterm ship OpenSSH version >= 8.7, in which the RSA algorithm (ssh-rsa) is disabled in the default configuration. To find the ssh version on your laptop/workstation/server, type
ssh -V
- If your version is >= 8.7, apply the following solution: on your host, create a
$HOME/.ssh/config
file, or modify it if it already exists, so that it contains:
Host forge*
    HostKeyAlgorithms +ssh-rsa
    PubkeyAcceptedAlgorithms +ssh-rsa
Host spirit*
    PubkeyAcceptedAlgorithms +ssh-rsa
Host hal*
    PubkeyAcceptedAlgorithms +ssh-rsa
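The version test described above can also be scripted. The following sketch parses the banner printed by `ssh -V` and compares it with 8.7; the helper name `openssh_at_least_8_7` is hypothetical, and the parsing assumes the usual `OpenSSH_<major>.<minor>` banner format.

```shell
# Hypothetical helper: succeed if an `ssh -V` banner reports OpenSSH >= 8.7.
# $1: banner text, e.g. "OpenSSH_8.9p1 Ubuntu-3ubuntu0.1, OpenSSL 3.0.2 15 Mar 2022"
# Returns 0 if >= 8.7, 1 if older, 2 if the banner is not recognized.
openssh_at_least_8_7() {
    # Extract "major minor" from the banner; prints nothing if it does not match.
    ver=$(printf '%s\n' "$1" | sed -n 's/^OpenSSH_[^0-9]*\([0-9][0-9]*\)\.\([0-9][0-9]*\).*/\1 \2/p')
    [ -n "$ver" ] || return 2
    major=${ver% *}
    minor=${ver#* }
    [ "$major" -gt 8 ] || { [ "$major" -eq 8 ] && [ "$minor" -ge 7 ]; }
}

# `ssh -V` writes its banner to standard error, hence the 2>&1.
if openssh_at_least_8_7 "$(ssh -V 2>&1)"; then
    echo "OpenSSH >= 8.7: the ssh-rsa settings above are needed"
else
    echo "Older or unrecognized client: the settings above should not be needed"
fi
```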
Head node
- Error message: -bash: fork: Resource temporarily unavailable
You have too many processes on the head node concerned (the maximum is 512 per user). The only solution is to ask support to kill all your processes on this node. This is often caused by accumulated VS Code server sessions.
Filesystem access and backup
- On spirit, I deleted files in /xxx by mistake
If the files were in /home, there is a mirror in /backupfs/home/ (started every day at 5 AM), and we also keep an incremental backup of home files from which we can retrieve files deleted up to six months ago.
To retrieve files from the incremental backup, you have to ask support. For any other filesystem: SORRY, NO BACKUP.
- On CMIP6 data, I get an HDF5 error when reading some files
This problem seems to occur only on IPSL CMIP6 data (hosted at TGCC). It is not a problem with the HDF5 library but with the filesystem.
It can happen on one node and not on others.
We do not know why this happens, but we know how to fix it: just contact support and tell us on which node it is happening.
"cmip6 hdf error on node xxxx" would be a good subject ;-)
- I try to access /xxxx and it says
permission denied
Some datasets are protected and you need to be in a special group to access them. Ask support: we will check whether we can give you access to the dataset.
- I try to write a file on /xxxx and it says
read-only filesystem
You are trying to write to a remote filesystem. All remote filesystems are READ-ONLY.
Example: trying to write to /scratchx or /homedata on the spirit cluster, or trying to write to /scratchu or /data on the spiritX cluster.
Solution: work from the right cluster or write to the right filesystem. To learn more: https://documentations.ipsl.fr/spirit/spirit_clusters/user_spaces.html
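For the read-only case, a script can test whether a directory is writable before starting to produce output. This is a minimal sketch: the helper name `first_writable_dir` is hypothetical, and the per-user subdirectories in the usage line are an assumed layout, not a documented one.

```shell
# Hypothetical helper: print the first directory in the argument list
# that exists and is writable by the current user.
first_writable_dir() {
    for dir in "$@"; do
        # -d: the directory exists; -w: the current user may write to it
        # (normally false on a filesystem mounted read-only).
        if [ -d "$dir" ] && [ -w "$dir" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    echo "no writable directory among: $*" >&2
    return 1
}
```

Possible use in a script, with assumed paths: `outdir=$(first_writable_dir "/scratchu/$USER" "/scratchx/$USER") || exit 1`.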
Jobs
- I submitted a batch job and it stays in the queue
A Slurm job can be blocked by a limit. In this case,
slqueue -b
can help you find out why your job is blocked. See also https://documentations.ipsl.fr/spirit/spirit_clusters/slurm.html#slurm-user-limits
- The same job sometimes works and sometimes does not
Look in the output files to see on which node the job ran when it worked and on which node when it did not (there could be a problem on one node: hardware, full filesystem or missing library).
In the job output, the first and last lines give you the "Running Host: host name".
When reporting the problem to meso-support@ipsl.fr, please give us the job number, the location of the script you launched and the location of your job output (without these, we cannot do anything).
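To compare the nodes used by working and failing runs, the "Running Host" line can be collected from several output files at once. The sketch below assumes the line contains the literal text `Running Host:` as quoted above; the helper name `show_running_hosts` and the file names in the usage line are illustrative.

```shell
# Hypothetical helper: print the "Running Host" line of each job output file.
# Arguments: one or more job output files
show_running_hosts() {
    for f in "$@"; do
        # Print the first matching line only; the last line repeats the same host.
        host_line=$(sed -n '/Running Host:/{p;q;}' "$f")
        [ -n "$host_line" ] || host_line="(no Running Host line found)"
        printf '%s: %s\n' "$f" "$host_line"
    done
}
```

Example call, with made-up file names: `show_running_hosts job_ok.out job_failed.out`.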