Description
Pre-intro
Apologies for ticket length; the issue at hand is not simple and has many overlapping factors/considerations. Consider skipping down to the bottom of the description, where there is a concise summary that should function as a tl;dr.
Intro
This ticket used to be partly about prompt detection. We're now of the opinion that detecting prompts beforehand (in order to know when to present users with a Python-level prompt) will always be painful and will never cover 100% of possible use cases. Instead, we feel that actual live interaction with the remote end (i.e. sending local stdin to the other side) will not only sidestep this problem, but be more useful and more in line with user expectations. See #177 for more on the "expect" approach.
The "live" approach itself has shortcomings, but none significantly worse than invoking ssh by hand, and anything in this space is certainly better than the "nothing" we have now.
Investigation into SSH and terminal behavior
We investigated this mostly because we can't really hope to offer "better" behavior than vanilla ssh does. Plus, this presents a learning opportunity -- all of the behaviors below are reflected in Paramiko itself, as one might expect.
There are basically two issues at stake when performing fully interactive command line calls remotely: the mixing of stdout and stderr, and how stdin is echoed.
Stdout/stderr
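Stdout and stderr mixing were tested with the following program (which prints 0 through 9, alternating between stdout and stderr and flushing after each write):
#!/usr/bin/env python
import sys
from itertools import cycle

# Alternate between the two streams, flushing after every write.
# (Built-in zip is used instead of izip so this runs on Python 2 and 3.)
for pipe, num in zip(cycle([sys.stdout, sys.stderr]), range(10)):
    pipe.write("%s\n" % num)
    pipe.flush()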
No pty
When invoked normally (without -t), ssh appears to separate stdout and stderr on at least a line-by-line basis, if not more so, insofar as we see all of stdout first, and then stderr. Printed normally:
$ ssh localhost "~/test.py"
0
2
4
6
8
1
3
5
7
9
With streams separated for examination:
$ ssh localhost "~/test.py" >out 2>err
$ cat out
0
2
4
6
8
$ cat err
1
3
5
7
9
Thus, output from pty-less SSH is going to look a bit different from that of the same program run locally.
With pty
When invoked with a pty, we get the expected result of the numbers being in
order, but the streams are now combined together before we get to them (since
all we get is the output from the pseudo-terminal device on the remote end,
just as if we were reading a real terminal window). Printed normally:
$ ssh localhost -t "~/test.py"
0
1
2
3
4
5
6
7
8
9
Connection to localhost closed.
Examining the streams:
$ ssh localhost -t "~/test.py" >out 2>err
$ cat out
0
1
2
3
4
5
6
7
8
9
$ cat err
Connection to localhost closed.
Thus, the tradeoff here is "correct"-looking output versus the ability to get a
distinct stdout and stderr.
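The same tradeoff can be reproduced locally, without ssh. The following is a minimal sketch (Python 3, Unix-only) that runs an equivalent of the test program twice: once with two separate pipes, and once with both streams attached to a pseudo-terminal. The names run_with_pipes and run_with_pty are illustrative only; they are not part of Fabric or Paramiko, and a local pty is only a stand-in for the one ssh allocates on the remote end.

```python
import os
import pty
import subprocess
import sys

# Python 3 equivalent of the test program: 0-9, alternating stdout/stderr.
CHILD = (
    "import sys\n"
    "for n in range(10):\n"
    "    pipe = sys.stdout if n % 2 == 0 else sys.stderr\n"
    "    pipe.write('%s\\n' % n)\n"
    "    pipe.flush()\n"
)


def run_with_pipes():
    """No pty: each stream arrives on its own pipe, relative order is lost."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD], capture_output=True, text=True
    )
    return proc.stdout.split(), proc.stderr.split()


def run_with_pty():
    """With a pty: both streams are merged by the terminal device, in order."""
    master, slave = pty.openpty()
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdin=subprocess.DEVNULL,
        stdout=slave,
        stderr=slave,
    )
    os.close(slave)  # keep only the child's copy of the slave end open
    chunks = []
    while True:
        try:
            data = os.read(master, 1024)
        except OSError:  # Linux reports EIO once the slave end is closed
            break
        if not data:  # other platforms report a plain EOF
            break
        chunks.append(data)
    proc.wait()
    os.close(master)
    # split() also swallows the "\r\n" line endings the terminal produces.
    return b"".join(chunks).decode().split()


if __name__ == "__main__":
    out, err = run_with_pipes()
    print("pipes: stdout=%s stderr=%s" % (out, err))
    print("pty:   combined=%s" % run_with_pty())
```

With pipes the caller can tell the streams apart but has no record of how the lines interleaved; with a pty the interleaving is preserved but the distinction is gone.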
Echoing of stdin
No pty
Without a pty, ssh must echo the user's stdin wholesale (or hide it entirely,
though there do not appear to be options for this) and this means that password
prompts become unsafe. Sudo without a pty:
$ ssh localhost "sudo ls /"
Password:mypassword
.DS_Store
.Spotlight-V100
.Trashes
.com.apple.timemachine.supported
Applications
Developer
[...]
Note that the user's password, typed to stdin, shows up in the output. For
thoroughness, let's examine what went to which stream:
$ ssh localhost "sudo ls /" >out 2>err
mypassword
$ cat out
.DS_Store
.Spotlight-V100
.Trashes
.com.apple.timemachine.supported
Applications
Developer
[...]
$ cat err
Password:
As expected, the user's stdin didn't end up in the streams from the remote end
(ergo it is the local terminal echoing stdin, and not the remote end) and the
password prompt showed up in stderr.
With pty
Here's the same sequence but with -t enabled, forcing a pty:
$ ssh -t localhost "sudo ls /"
Password:
.DS_Store Applications
.Spotlight-V100 .Trashes
Developer [...]
Connection to localhost closed.
Note that in addition to not echoing the user's password, ls picked up on the terminal being present and altered its behavior. This is orthogonal to our research but is still a useful thing to keep in mind.
As before, use of pty means that all output now goes into stdout, leaving stderr empty save for local output from the ssh program itself:
$ ssh -t localhost "sudo ls /" >out 2>err
$ cat out
Password:
.DS_Store Applications
.Spotlight-V100 .Trashes
Developer [...]
$ cat err
Connection to localhost closed.
And as with the previous invocation, our password never shows up, even on our
local terminal.
Non-hidden output
Finally, as a sanity test to ensure that non-password stdin is echoed by the remote pty when appropriate, we remove a (previously created) test file with rm's "are you sure" option enabled:
$ ssh -t localhost "rm -i /tmp/testfile"
remove /tmp/testfile? y
Connection to localhost closed.
And proof that it is the remote end doing the echoing -- our stdin shows up in
the stdout from the remote end:
$ ssh -t localhost "rm -i /tmp/testfile" >out 2>err
$ cat out
remove /tmp/testfile? y
$ cat err
Connection to localhost closed.
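The reason a pty can both echo ordinary input and hide passwords is that echoing is a property of the terminal device itself (the ECHO flag), which programs such as sudo switch off while reading a password. Below is a minimal local sketch of that mechanism (Python 3, Unix-only). The helper name echoed_back is hypothetical, and a locally opened pty stands in for the remote one.

```python
import os
import pty
import select
import termios


def echoed_back(echo_enabled):
    """Type b"secret\\n" into a pty and return whatever the pty echoes back."""
    master, slave = pty.openpty()
    attrs = termios.tcgetattr(slave)
    if echo_enabled:
        attrs[3] |= termios.ECHO   # index 3 holds the local-mode flags
    else:
        attrs[3] &= ~termios.ECHO  # what password prompts do
    termios.tcsetattr(slave, termios.TCSANOW, attrs)

    # Writing to the master end simulates the user typing at the terminal.
    os.write(master, b"secret\n")

    # Anything readable on the master end now is the terminal's own echo,
    # since no program is attached to the slave end to produce output.
    ready, _, _ = select.select([master], [], [], 0.5)
    data = os.read(master, 1024) if ready else b""
    os.close(master)
    os.close(slave)
    return data


if __name__ == "__main__":
    print("echo on: ", echoed_back(True))
    print("echo off:", echoed_back(False))
```

This matches the observations above: with a pty, the typed "y" comes back in the remote output, while the sudo password does not.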
Conclusion
As seen above, there are a number of different behaviors one may encounter when using, or not using, a pty. The tradeoff is, essentially, access to distinct stdout and stderr streams (but garbled output and blanket echo of stdin) versus more shell-like behavior (but without the ability to tell the remote stderr from stdout).
In our experience, the ssh program defaults to not using a pty, but the average Fabric user is probably best served by enforcing one. New users are more likely to expect "shell-like" behavior (such as proper multiplexing of stdout and stderr, and hiding of password prompt stdin) and Fabric already defaults to a "shell-like" behavior insofar as it wraps commands in a login shell.
Summation of early comments
A summary of findings so far (contains up through comment 16):
- Python's default I/O buffering is typically line-by-line (linewise). I/O is
not typically printed to the destination until a line ending is encountered.
This applies both to input and output. (It's also why fabric.utils.fastprint was created -- one must manually flush output to e.g. stdout to get things like progress bars to show up reliably.)
- Fabric's current mode of I/O is also linewise, partly because of the previous point, and partly to allow printing of stdout and stderr streams independently. As a side effect, partial line output such as prompts will not be displayed to the Fabric user's console.
- As seen above, SSH's default buffering mode is mostly linewise, insofar as the
default non-pty behavior mixes the two streams up but on a line by line
basis, but it is still capable of presenting partial lines (prompts) when
necessary.
- Because we cannot discern a reliable way of printing less-than-a-line output without moving to bytewise buffering, we'll need to switch to printing every byte as we receive it, in order for the user to see things such as prompts (or more complicated output, e.g. curses apps or things like top).
  - If/when the secret of ssh's print buffering is found, use that algorithm instead.
- Forcing Python's stdin to be bytewise requires the use of the Unix-only termios and tty libraries, but I believe there may be Windows alternatives. For now, we plan to focus on the best Unix-oriented approach and will implement Windows compatibility later if possible. (Sorry, Windows folks.)
- Obtaining remote data bytewise is a bit easier insofar as data from the client isn't linewise. However, shortening the size of the buffer throws a wrench in Fabric's current method of detecting whether there is no more output to be had, so we are currently experimenting with other approaches, specifically select.select (which, yes, is another Windows compatibility pain point).
- Any new solution should also hopefully obviate all the annoying, painful, error-prone issues with the current output_thread I/O loop, insofar as line remainders and such are concerned.
  - Ideally, as with select, this should also remove the need for threads entirely, which will make it easier to fully parallelize Fabric in the future, and kill another entire class of occasional problems.
- With bytewise output, we run into problems where the remote stdout and stderr get mixed up character-by-character (e.g. the last line of regular output can become garbled up with a "following" line containing a prompt, since many prompts print to stderr). Until/unless we can figure out how the regular SSH client accomplishes its "linewise but not really" buffering, the only way to avoid this problem is to set set_combine_stderr to True.
  - We could, and probably should, offer this as a setting in case users have need for it.
- And without using a pty, we are forced to manually echo all stdin, just as vanilla SSH does (see previous major section). This then presents issues with password prompts becoming insecure.
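As an illustration of the select-based, thread-free approach mentioned in the list above, here is a minimal sketch (Python 3, Unix-only) that reads two streams one byte at a time and uses end-of-file, rather than chunk size, to decide that there is no more output. Local subprocess pipes stand in for the remote channel, and the function name pump is hypothetical; this is not Fabric's actual I/O loop.

```python
import os
import select
import subprocess
import sys


def pump(cmd):
    """Run cmd; return (stream_name, byte) pairs in the order they arrived."""
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, bufsize=0
    )
    # Map each file descriptor still open to a label for its stream.
    open_fds = {
        proc.stdout.fileno(): "stdout",
        proc.stderr.fileno(): "stderr",
    }
    received = []
    while open_fds:
        # Block until at least one stream has data (or has hit EOF).
        readable, _, _ = select.select(list(open_fds), [], [])
        for fd in readable:
            byte = os.read(fd, 1)  # bytewise: partial lines show up at once
            if byte:
                received.append((open_fds[fd], byte))
            else:
                # An empty read means EOF: this stream is finished.
                del open_fds[fd]
    proc.stdout.close()
    proc.stderr.close()
    proc.wait()
    return received


if __name__ == "__main__":
    child = "import sys; sys.stdout.write('ab'); sys.stderr.write('cd')"
    for name, byte in pump([sys.executable, "-c", child]):
        print(name, byte)
```

Because the two streams are read independently, their bytes can interleave arbitrarily in the result, which is the character-by-character garbling described above and the motivation for combining the streams.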
Putting it all together
So, here's the planned TODO for this issue, given all of the above and the current state of the feature branch (namely, hardcoded bytewise stdin, skipping out on the output threads in favor of select, and printing prefixes after each newline):
- Abstract out the currently-implemented stdin manipulation; it essentially requires a try/finally and I think it'd be handy to have as a context manager or similar.
  - Possibly also make it configurable, since bytewise stdin is not absolutely required much of the time. Still feel it should be enabled by default, though.
- Offer an option to allow suppression of stdin echoing, just because.
- Expose set_combine_stderr as a user-facing option. Default should be on -- not too many people need the distinct stderr access, and with it off, output is very likely to be garbled unexpectedly. It's an advanced user sort of thing.
- Change the pty option to default to True (currently False). This will provide the smoothest user experience, and since we're combining the streams by default anyway, it's a no-brainer.
- Decide what to do with output_thread's password detection and response. This may become more difficult with bytewise buffering, and was originally implemented to get around the lack of stdin.
  - Drop the feature entirely, since users can now enter prompts interactively. Dropping features isn't great, though.
  - Repackage it as a "password memory" feature (it needs an overhaul anyways). Maybe as part of #177.
  - Keep it entirely as-is, and just use the output capturing as the read buffer in place of the current approach (checking the as-big-as-possible chunk from the remote end). Possibly quickest. We won't be able to hide the prompt itself from user eyes anymore (that's the biggest reason #80 can't work) but that's not required, just nice.
- Figure out if it's possible to omit printing the output prefix in lines where the user's input is being echoed by the remote end. Currently this results in said prefix showing up mid-line in some prompt situations (usually where the echoed stdin is the first data to show up in the stdout buffer, though it could also be a problem once the user hits Enter to submit the prompt too).
  - Might be able to conditionally hide prefix in cases where the byte coming in to stdout is the same as the last byte seen on stdin, but that is messy (e.g. output coming in long after the user is done typing -- do we add time memory? how much of one? etc)
  - Depending on exactly how it shakes out, this may not even be an issue for anything but the case where the typed input's echo is the first stdout. Will have to see.
- Add an interact_with that makes use of invoke_shell, assuming it can work seamlessly with the final exec_command based solution without code duplication.
- Come up with Windows-compatible solutions, if possible, for all Unix-isms used in this effort.
- Note in the parallel-related ticket(s) that this solution will make it more difficult for a parallel execution setup to function, insofar as bytewise-vs-linewise output is concerned. A truly parallel execution would be incredibly confusing even on a line-by-line basis, however, so a better solution is likely to be needed anyways.
- Reorganize operations.py and network.py -- nuke old outdated code, shuffle around new code; it should ideally live in another module that is neither network nor operations (?)
- Document all of the above changes thoroughly, and attend to related tickets re: tutorial etc.
  - Update changelog (the pty default is now backwards incompatible!)
    - Make sure users know they need to deactivate both pty and combine-streams options in order to get distinct streams.
  - Update skeleton usage docs re: interactivity
  - Search for mentions of use of the stderr attribute and update them since it's not populated by default anymore
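For the first TODO item (wrapping the stdin manipulation in a try/finally), a context manager along the following lines is one possible shape. This is a minimal sketch under stated assumptions, not the code on the feature branch: it targets Python 3 on Unix, and the names char_buffered and is_canonical are illustrative. The demo uses a locally opened pty in place of the user's real terminal so that it can run unattended.

```python
import contextlib
import os
import pty
import termios
import tty


@contextlib.contextmanager
def char_buffered(stream):
    """Put stream's terminal into cbreak (bytewise) mode within the block."""
    if not stream.isatty():
        # Not a terminal (pipe, file, ...): nothing to change or restore.
        yield
        return
    fd = stream.fileno()
    saved = termios.tcgetattr(fd)
    try:
        # cbreak mode delivers each keystroke immediately. It also turns
        # off local echo, so echoing is left to the remote pty or the caller.
        tty.setcbreak(fd)
        yield
    finally:
        # Always restore the original settings, even if the block raised.
        termios.tcsetattr(fd, termios.TCSADRAIN, saved)


def is_canonical(fd):
    """True if the terminal behind fd is in linewise (canonical) mode."""
    return bool(termios.tcgetattr(fd)[3] & termios.ICANON)


if __name__ == "__main__":
    master, slave = pty.openpty()
    with os.fdopen(slave, "rb", buffering=0) as term:
        print("before:", is_canonical(term.fileno()))
        with char_buffered(term):
            print("inside:", is_canonical(term.fileno()))
        print("after: ", is_canonical(term.fileno()))
    os.close(master)
```

In real use the argument would be sys.stdin. Making the non-tty case a no-op keeps the wrapper safe when input is piped in, which also touches on the "running without any controlling tty" concern tracked in #197.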
Originally submitted by Jeff Forcier (bitprophet) on 2009-07-20 at 05:24pm EDT
Relations
- Related to #73: Once Git can be used, update tutorial to use it.
- Duplicated by #49: Fabric does not prompt for input when the host does.
- Related to #80: See whether paramiko.SSHClient.invoke_shell + paramiko.Channel.send is feasible
- Duplicated by #153: Hangs When Encountering an Invalid Security Certificate
- Related to #177: Investigate pexpect/expect integration
- Related to #20: Rework output mechanisms
- Related to #182: New I/O mechanisms print "extra" blank lines on \r
- Related to #183: Prompts appear to kill capturing (now with bonus test server!)
- Related to #190: Sudo prompt mixed up a bit
- Related to #192: Per-user/host password memory (was: Possible issue in password memory)
- Related to #193: Terminal resizing support/detection
- Related to #196: open_shell() doesn't do readline too well
- Related to #163: Formattable output prefix.
- Related to #197: Handle running without any controlling tty
- Related to #204: Better in-thread exception handling
- Related to #209: Some password prompts no longer specify the user
- Related to #212: Hitting Ctrl-C during I/O still requires shell reset
- Related to #219: Blank lines after silent commands
- Related to #223: Full stack tests choking on passphrase-vs-password issue
Closed as Done on 2010-08-06 at 11:22pm EDT
Wart Feature Network