clusterssh is a package that allows you to control multiple ssh xterm sessions at the same time.

When the monkeysphere-ssh-proxycommand is enabled and I launch 5 or more cssh sessions, intermittently one or two out of every five will fail with: Connection timed out during banner exchange.

I tried to debug by running:

MONKEYSPHERE_LOG_LEVEL=debug cssh -D -d <server1> <server2> etc.

However, while it produced some private data, it didn't give me any insight into what was going wrong. Also, it didn't output any Monkeysphere debugging info.

I had no luck with google and the error message being output.

This isn't a huge priority (it's not hard to disable the monkeysphere-ssh-proxycommand before running cssh), however, it would be nice to figure out why it's not working.

What do you mean by "produced some private data" when you set the log level to DEBUG? Monkeysphere does not output any "private" data in the sense of private keys or passwords or anything like that. Maybe you mean the cssh debug mode outputs private data? or do you just mean "info that you don't want to post here"? It might be useful to see some output, so maybe you could just block out the nasty bits? But I'm not sure it will help.

The problem may be due to the locking of the known_hosts file while the proxycommand is running. At the moment, the monkeysphere-ssh-proxycommand can only be run serially, since each invocation will lock the known_hosts file while it updates it. I think this is required, since we obviously can't have two invocations modifying the file at the same time. However, it's probably possible to decrease the amount of time it takes to update the file. It's not done very efficiently at the moment. The file is locked basically at the very begining, and is locked while all gpg interactions are done, which are slow. I think it should be possible to take the gpg interactions out of the loop.

I just tried cssh and it doesn't seem to work very well with my ssh setup at all. For instance, the simultaneous ssh connections cause simultaneous calls to the agent to get my permission to use the key, which don't interact very well with each other. This of course is not a monkeysphere problem but a general problem with trying to make simultaneous ssh connections with an agent that want key use confirmation.

-- jrollins

I can get cssh to work fine with a confirmation-required agent if i turn off the monkeysphere proxycommand:

cssh -l username -o '-oProxyCommand=none' $(cat hostlist.txt)

with the proxycommand, i definitely get the "Connection timed out during banner exchange" message.

However, i'm also able to get the cssh connection to work if i assert that a longer connection timeout is acceptable:

cssh -l username -o '-oConnectTimeout=30' $(cat hostlist.txt)

Perhaps this is an acceptable workaround?

-- dkg

Sorry for not being more clear. Monkeysphere debug does not reveal personal information - but cluster cssh -d -D exposes the hosts in my ssh config file.

dkg's approach seems to work. His suggestion, as written, didn't work for me. But that's because I ran cssh -u > $HOME/.csshrc, which generates a default cssh config file (that specifies ConnectTimeout=10). That file seems to override the command linke (perhaps a cssh bug?). I changed the ConnectTimeout to 30 in my ~/.csshrc file and now everything works.

I think jrollins' assessment is probably right - the extra delay due to locking causes the timeout. I think tweaking ConnectTimeout parameter via the .csshrc file or via the command line is a reasonable fix for this bug, so I'm closing as done.

-- Sir Jam Jam

I just posted the cssh bug in debian.