Chapter 0 — The Problem
On a fine Monday morning:
I tried to SSH to my EC2 instance, but it kept bailing out on me.
Chapter 1 — The Puzzle
Why am I unable to SSH to an instance that worked fine until Friday?
I can ping the instance though. 100% success rate. No packets dropped.
I listed what had changed on my end:
EC2 instance type: unchanged
Host machine: unchanged, same RHEL instance
Chapter 2 — The Chase
- nc -zv <ec2_instance_ip> 22
- The -z option is “zero-I/O mode” specifically for scanning.
- The -v option means “verbose” and actually causes the output to be generated; without this option only the exit status will indicate whether the port is open or not (0 = yes, 1 = no). This makes it easy to use this in scripting.
# nc -zv <ec2_instance_ip> 22
Connection to <ec2_instance_ip> port 22 [tcp/ssh] succeeded!
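Because nc -z reports its result through the exit status, the check drops straight into a script. Here is a minimal sketch, assuming an nc build that supports -z and -w (a connect timeout in seconds). port_open and EC2_IP are hypothetical names used only for illustration; they are not part of nc.

```shell
#!/bin/sh
# port_open HOST PORT
# Returns 0 if a TCP connection to HOST:PORT succeeds, non-zero otherwise.
# Assumes an nc that supports -z (scan only) and -w (timeout in seconds).
port_open() {
    nc -z -w 3 "$1" "$2" >/dev/null 2>&1
}

# Example use in a script (commented out so nothing runs on load):
# if port_open "$EC2_IP" 22; then
#     echo "port 22 reachable"
# else
#     echo "port 22 NOT reachable"
# fi
```

A reachable port only proves that the TCP handshake works; as the rest of this story shows, SSH can still fail later, during its own negotiation.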
Nping — Network packet generation tool / ping utility
- nping -c 5 --tcp-connect -p 22 <ec2_instance_ip>
- -c, --count <n> : Stop after <n> rounds.
- PROBE MODES:
--tcp-connect : Unprivileged TCP connect probe mode.
--tcp : TCP probe mode.
--udp : UDP probe mode.
--icmp : ICMP probe mode.
--arp : ARP/RARP probe mode.
--tr, --traceroute : Traceroute mode (can only be used with TCP/UDP/ICMP modes).
Max rtt: 84.699ms | Min rtt: 80.338ms | Avg rtt: 82.519ms
TCP connection attempts: 5 | Successful connections: 5 | Failed: 0 (0.00%)
Nping done: 1 IP address pinged in 4.10 seconds
Reference for Nping:
nping(1) - Linux manual page
Oh wait, you said SSH. I know how to debug this.
- Instead of plain ssh, add -vvv for more debug output.
- ssh -v will tell you what is happening, mostly on your end.
- ssh -vv will tell you the low-level details on both ends.
- ssh -vvv will tell you almost everything from both ends.
Contacted AWS support.
A very patient support rep helped me debug the issue further
Step 0: SSH with “-vvv” flag for verbosity
I did that, didn’t help. Still lost connection.
Step 1: Create packet capture using tcpdump (command-line packet analyzer)
$ sudo -i
# tcpdump -i any -w /tmp/$(hostname)_capturefile.cap -s 256 port 22 &
# ssh -vvv user@<elastic_ip_ec2_instance>
# killall tcpdump ; pkill tcpdump
# zip -9 /tmp/$(hostname)_capturefile.cap.zip /tmp/$(hostname)_capturefile.cap
- -i interface, --interface=interface
- -w file
Write the raw packets to file rather than parsing and printing them out. They can later be printed with the -r option.
- -s snaplen, --snapshot-length=snaplen
Snarf snaplen bytes of data from each packet rather than the default of 262144 bytes. Packets truncated because of a limited snapshot are indicated in the output.
To install tcpdump:
# apt install tcpdump (Ubuntu)
# yum install tcpdump (Redhat/Centos)
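A capture written with -w can be read back with -r. The sketch below prints only the packets that carry a RST or FIN flag, which is often where an abruptly closed session shows up. show_resets is a hypothetical helper name, and the filter expression is just one reasonable choice, not something AWS support prescribed.

```shell
#!/bin/sh
# show_resets CAPTURE_FILE
# Reads a saved capture (-r) without resolving names or ports (-nn) and
# prints only TCP packets that carry the RST or FIN flag.
show_resets() {
    tcpdump -nn -r "$1" 'tcp[tcpflags] & (tcp-rst|tcp-fin) != 0'
}

# Example (commented out so nothing runs on load):
# show_resets "/tmp/$(hostname)_capturefile.cap"
```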
Step 2: Perform a TCP traceroute (helps diagnose network issues) over different ports, such as 22 and 443. Also add mtr (my traceroute, which combines much of the functionality of ping and traceroute into one interface) to the mix.
To install tcptraceroute:
# yum -y install --enablerepo='*' tcptraceroute telnet (Redhat/Centos)
# apt install tcptraceroute (Ubuntu)
$ mtr -c 50 --no-dns --show-ips --report-wide --report --tcp --port 443 <elastic_Ip>
$ mtr -c 50 --no-dns --show-ips --report-wide --report --tcp --port 22 <elastic_Ip>
- --no-dns
Use this option to force mtr to display numeric IP numbers and not try to resolve the host names.
- --report-wide
This option puts mtr into wide report mode. When in this mode, mtr will not cut hostnames in the report.
- --show-ips
Use this option to tell mtr to display both the host names and numeric IP numbers.
- --report
This option puts mtr into report mode. When in this mode, mtr will run for the number of cycles specified by the -c option, and then print statistics and exit.
$ tcptraceroute <elastic_Ip> 22
$ traceroute -T -p 22 -n <elastic_Ip>
- -T
Use TCP SYN for probes.
- -n
Do not try to map IP addresses to host names when displaying them.
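Running the same TCP report against several ports makes it easier to compare them side by side. This is a small sketch that loops the mtr invocation over a list of ports. trace_ports and ELASTIC_IP are hypothetical names, and the flags are simply the ones used in this step.

```shell
#!/bin/sh
# trace_ports HOST PORT...
# Runs one mtr TCP report per port so the per-hop results can be compared.
trace_ports() {
    host=$1
    shift
    for port in "$@"; do
        echo "== TCP port $port =="
        mtr -c 50 --no-dns --show-ips --report-wide --report --tcp --port "$port" "$host"
    done
}

# Example (commented out so nothing runs on load):
# trace_ports "$ELASTIC_IP" 22 443
```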
And that didn’t help either.
Okay, let’s take a step back and check my email.
“IT has upgraded your VM from RHEL6 to RHEL8 over the weekend. Please open a support case with us in case you are facing issues”.
Check the host machine:
LSB Version: :core-4.1-amd64:core-4.1-noarch
Distributor ID: RedHatEnterprise
Description: Red Hat Enterprise Linux release 8.1 (Ootpa)
“ootpa” was the IRC nick of Larry Troan, a Red Hat engineer who died in 2016.
The RHEL 8 codename “Ootpa” was chosen as a tribute to him.
Great, something has changed on the host machine, but what exactly?
<Few days pass by>
How does SSH work behind the scenes?
Information Security: Principles and Practice, Mark Stamp
<Search google for Red Hat documentation>
- Two versions of SSH currently exist: version 1, and the newer version 2.
- The OpenSSH suite in Red Hat Enterprise Linux 8 supports only SSH version 2, which has an enhanced key-exchange algorithm not vulnerable to known exploits in version 1.
- OpenSSH depends on the OpenSSL library; specifically, it uses the libcrypto part of OpenSSL.
man ssh_config: (on RHEL8)
The supported ciphers are:
<deep google search for redhat issues>
- “GCM ciphers are not available in SSH on RHEL 7.4 in FIPS mode” https://github.com/ComplianceAsCode/content/issues/1613
- GCM ciphers used to be allowed in FIPS mode, but it seems that was a bug.
- FIPS guide (Federal Information Processing Standards)
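Since that issue only applies in FIPS mode, one quick check is whether a host is running in FIPS mode at all. This is a minimal sketch, assuming a Linux kernel that exposes /proc/sys/crypto/fips_enabled (present on RHEL, but not guaranteed on every kernel build). fips_enabled is a hypothetical helper name.

```shell
#!/bin/sh
# fips_enabled [FLAG_FILE]
# Returns 0 when the kernel reports FIPS mode as enabled.
# FLAG_FILE defaults to the Linux kernel's /proc/sys/crypto/fips_enabled;
# the optional argument exists only to make the function easy to test.
fips_enabled() {
    flag_file=${1:-/proc/sys/crypto/fips_enabled}
    [ -r "$flag_file" ] && [ "$(cat "$flag_file")" = "1" ]
}

# Example (commented out so nothing runs on load):
# if fips_enabled; then echo "FIPS mode: on"; else echo "FIPS mode: off"; fi
```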
Go back to my host machine and look at logs:
- debug1: SSH2_MSG_KEXINIT sent
- debug1: SSH2_MSG_KEXINIT received
- debug1: kex: algorithm: ecdh-sha2-nistp256
- debug1: kex: host key algorithm: ecdsa-sha2-nistp256
- debug1: kex: server->client cipher: email@example.com MAC: <implicit> compression: none
debug1: kex: client->server cipher: firstname.lastname@example.org MAC: <implicit> compression: none
- debug1: sending SSH2_MSG_KEX_ECDH_INIT
- debug1: expecting SSH2_MSG_KEX_ECDH_REPLY
- Connection closed by <elastic-ip> port 22
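The lines that matter in a log like this are the "kex: ... cipher:" ones, because they show which cipher each direction settled on just before the connection dropped. The sketch below pulls them out of saved debug output. negotiated_ciphers and ssh_debug.log are hypothetical names, and the parsing assumes the debug1 line layout shown above.

```shell
#!/bin/sh
# negotiated_ciphers
# Reads ssh debug output on stdin and prints "<direction> <cipher>" for
# each "kex: ... cipher:" line.
negotiated_ciphers() {
    awk '/kex: (server->client|client->server) cipher:/ {
        for (i = 2; i < NF; i++)
            if ($i == "cipher:") print $(i - 1), $(i + 1)
    }'
}

# Example (commented out so nothing runs on load).
# ssh writes its debug output to stderr, so save that stream first:
# ssh -vvv user@host 2> ssh_debug.log
# negotiated_ciphers < ssh_debug.log
```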
AES-GCM authenticated encryption
AES with Galois/Counter Mode (AES-GCM) provides both authenticated encryption (confidentiality and authentication) and…
Go to my EC2 instance and take a look:
ec2:/etc/ssh# cat ssh_config
# Cipher 3des
# Port 22
# Protocol 2
# Cipher 3des
- Issue in the EC2 instance code, where it was defaulting to GCM ciphers.
Real bug, filed and fixed.
- Genuine Red Hat bug that accidentally blocks GCM ciphers, which is what kept me hanging (not fixed yet).
- Simple workaround:
Look for a cipher that both the host and the EC2 instance support.
For example, “aes256-ctr” is available in both places.
Use it to SSH to the instance:
Example usage: ssh -c aes256-ctr user@<elastic_ip_ec2_instance>
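Finding a shared cipher by eye gets tedious, so here is a small sketch that intersects two comma-separated cipher lists. common_ciphers is a hypothetical helper name. The commands in the usage comments (ssh -Q cipher on the client, sshd -T on the server) are one way to collect the lists and need to be run on the respective machines. Note that ssh -Q lists what the client binary supports, which may be broader than what a system-wide policy allows.

```shell
#!/bin/sh
# common_ciphers CLIENT_LIST SERVER_LIST
# Both arguments are comma-separated cipher names. Prints, one per line and
# in client order, every cipher that appears in both lists.
common_ciphers() {
    printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r c; do
        [ -n "$c" ] || continue
        case ",$2," in
            *",$c,"*) printf '%s\n' "$c" ;;
        esac
    done
}

# Example (commented out so nothing runs on load):
# client=$(ssh -Q cipher | paste -sd, -)                        # on the host
# server=$(sudo sshd -T | awk '$1 == "ciphers" { print $2 }')   # on the EC2 instance
# common_ciphers "$client" "$server"
```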
Happy Ending after all.