Linux Interview Questions for DevOps Engineers (2026)
Linux knowledge separates strong DevOps engineers from average ones. At any company running servers, you're expected to diagnose issues in a running system under pressure. Here are the questions asked most often.
The Scenario Question Every Interview Has
"A production server is responding slowly. You SSH in. Walk me through what you check."
This is the most common Linux scenario question. A strong answer is systematic:
# Step 1: System overview
top # CPU, memory, load average, running processes
htop # better visualization
# Step 2: Load average
uptime
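# 3 numbers: 1min, 5min, 15min averages
# Rule of thumb: sustained load above the CPU core count (see nproc) means work is queuing
# On Linux the load figure also counts tasks blocked in uninterruptible I/O, not only CPU demand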
# Step 3: CPU breakdown
vmstat 1 5 # CPU: us/sy/wa/id — is it user, system, or wait?
# wa (wait) high = I/O bottleneck
# sy (system) high = kernel issue or too many context switches
# Step 4: Memory
free -h # total, used, free, buff/cache
# If available is low and swap is being used → memory pressure
# Step 5: Disk I/O
iostat -x 1 # utilization per disk, await time
iotop # which process is causing I/O
# Step 6: Network
netstat -tuln # listening TCP/UDP ports (use -tan to see established connections)
ss -s # connection summary; a very large timewait count can point to ephemeral port exhaustion
# Step 7: Logs
journalctl -n 100 -f # systemd logs
tail -f /var/log/syslog # Debian/Ubuntu; RHEL-family systems use /var/log/messages
dmesg | tail -20 # kernel messages: OOM killer, disk errors
Process Management
1. What is the difference between a process and a thread?
A process is an independent program with its own memory space, file descriptors, and system resources.
A thread is an execution unit within a process. Threads share the process's memory space and resources but have their own stack and registers.
In Linux: ps aux shows processes. ps -eLf shows threads. Each thread appears as a separate line with the same PID but different LWP (Light Weight Process) ID.
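To see the one-PID, many-LWP relationship directly, the sketch below counts a process's threads by listing /proc/<pid>/task, where the kernel exposes one entry per thread. The helper name thread_count is hypothetical, and the approach assumes a Linux /proc filesystem.

```bash
# thread_count PID: print how many threads (LWPs) the process has.
# Assumes Linux, where /proc/PID/task holds one directory per thread.
thread_count() {
    ls "/proc/$1/task" | wc -l
}

thread_count "$$"   # thread count of the current shell
```

On systems with procps, ps -o nlwp= -p <pid> should report the same number.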
2. Explain signals in Linux. What is SIGTERM vs SIGKILL?
Signals are software interrupts sent to processes.
| Signal | Number | Description |
|---|---|---|
| SIGTERM | 15 | Graceful termination request. Process can handle it |
| SIGKILL | 9 | Immediate termination. Cannot be caught or ignored |
| SIGHUP | 1 | Hangup — often used to reload config |
| SIGINT | 2 | Interrupt (Ctrl+C) |
Always use SIGTERM first — give the process time to close connections, flush buffers, and clean up. Only use SIGKILL if SIGTERM doesn't work after a reasonable timeout.
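That escalation can be scripted. The following is a minimal sketch, not a production tool: the function name stop_gracefully and the 10-second default timeout are assumptions chosen for illustration.

```bash
# stop_gracefully PID [TIMEOUT_SECONDS]
# Sends SIGTERM, polls once per second for up to TIMEOUT_SECONDS (default 10),
# and falls back to SIGKILL only if the process is still present.
stop_gracefully() {
    target_pid=$1
    remaining=${2:-10}
    kill -TERM "$target_pid" 2>/dev/null || return 0      # nothing to stop
    while [ "$remaining" -gt 0 ]; do
        kill -0 "$target_pid" 2>/dev/null || return 0     # process has exited
        sleep 1
        remaining=$((remaining - 1))
    done
    kill -KILL "$target_pid" 2>/dev/null                  # last resort
}
```

kill -0 sends no signal; it only checks whether the PID still exists, which makes it a cheap way to poll.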
kill -15 <pid> # SIGTERM
kill -9 <pid> # SIGKILL, last resort
3. What does `ps aux` output mean?
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1234 2.5 0.8 123456 8192 ? Ssl 09:00 0:05 nginx
- VSZ: Virtual memory size (includes all mapped memory, including shared libs)
- RSS: Resident Set Size — actual physical memory in use. This is what you should watch
- STAT: Process state (S=sleeping, R=running, Z=zombie, D=uninterruptible sleep/I/O wait)
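Zombie processes: A process that has finished but whose parent hasn't called wait() to collect its exit status. Shows as Z in STAT. A zombie uses no CPU or memory, but it still occupies a process table entry, so a growing number of them points to a bug in the parent process.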
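A quick way to find zombies and the parents responsible is to filter on the STAT column. This is a sketch assuming procps-style ps options; the function names filter_zombies and list_zombies are made up for the example.

```bash
# filter_zombies: read "PID PPID STAT COMMAND" rows on stdin and
# report the ones whose state starts with Z.
filter_zombies() {
    awk '$3 ~ /^Z/ { print "zombie pid=" $1 " parent=" $2 " cmd=" $4 }'
}

# list_zombies: feed live process data into the filter.
# Separate -o options with empty headers suppress the header line.
list_zombies() {
    ps -e -o pid= -o ppid= -o stat= -o comm= | filter_zombies
}
```

Run list_zombies on a live host. Sending signals to the zombie itself does nothing, since it has already exited; the parent has to reap it, or be restarted so that init adopts and reaps the child.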
Filesystem & Disk
4. df shows a disk as 100% full, but du reports far less usage. Why?
This is a classic gotcha. The usual cause is deleted files still held open by a running process.
When a file is deleted while a process still has it open, the kernel removes the directory entry (so du no longer counts it) but keeps the data blocks allocated until the process closes its file descriptor (so df still counts them).
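The effect is easy to reproduce. The sketch below assumes Linux /proc plus the mktemp and readlink utilities; the function name and the choice of descriptor 9 are arbitrary.

```bash
# show_deleted_open_file: open a temp file on fd 9, delete it, then ask the
# kernel what fd 9 still points to.
show_deleted_open_file() {
    tmp=$(mktemp)
    exec 9>"$tmp"              # keep the file open on descriptor 9
    rm -f "$tmp"               # remove the name; the data stays allocated
    readlink /proc/self/fd/9   # readlink inherits fd 9; the target ends in " (deleted)"
    exec 9>&-                  # closing the descriptor releases the space
}

show_deleted_open_file
```

The " (deleted)" suffix is the same marker that lsof reports for such files.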
# Find open deleted files
lsof | grep deleted
# Fix: restart the process holding the file
# or send it a signal to reopen its log file
kill -HUP <pid> # for logging daemons that reopen their log files on SIGHUP
A related cause of "No space left on device" errors: inodes are exhausted even though free blocks remain.
df -i # show inode usage
5. Explain file permissions in Linux
-rwxr-xr-- 1 owner group size date filename
Permission bits: owner (rwx), group (r-x), others (r--)
| Symbol | Octal | Meaning |
|---|---|---|
| r | 4 | Read |
| w | 2 | Write |
| x | 1 | Execute |
chmod 755 = rwxr-xr-x (owner can read, write, and execute; group and others can read and execute)
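Each octal digit is the sum of the values in the table: 7 = 4+2+1 (rwx) and 5 = 4+1 (r-x). A short check of the mapping, assuming GNU coreutils stat (the -c option is not available on BSD or macOS); mode_of is a hypothetical helper name.

```bash
# mode_of FILE: print the octal and symbolic permissions side by side.
mode_of() {
    stat -c '%a %A' "$1"
}

f=$(mktemp)
chmod 755 "$f"
mode_of "$f"      # prints: 755 -rwxr-xr-x
rm -f "$f"
```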
Special bits:
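- setuid (4xxx): An executable runs with the file owner's privileges (e.g., /usr/bin/passwd runs as root)
- setgid (2xxx): On a directory, newly created files inherit the directory's group; on an executable, it runs with the file's group
- sticky bit (1xxx): In a shared directory, only a file's owner (or root) can delete or rename it (e.g., /tmp)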
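The special bits show up in the symbolic mode as s or t in place of x. The sketch below sets each one on a scratch file or directory and prints the result; it assumes GNU coreutils stat and a temporary directory that does not already carry a setgid bit, and the function name is invented for the example.

```bash
# special_bits_demo: set setuid, setgid and sticky on scratch entries
# and print the octal and symbolic mode of each.
special_bits_demo() {
    work=$(mktemp -d)
    touch "$work/prog"
    mkdir "$work/shared" "$work/scratch"

    chmod 4755 "$work/prog"      # setuid on a file
    chmod 2775 "$work/shared"    # setgid on a directory
    chmod 1777 "$work/scratch"   # sticky bit on a directory

    stat -c '%a %A' "$work/prog" "$work/shared" "$work/scratch"
    rm -rf "$work"
}

special_bits_demo
```

Separate directories are used on purpose: GNU chmod keeps a directory's existing setgid bit when given a four-digit numeric mode, so reusing one directory would blur the result.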
Networking
6. What happens at the OS level when you run `curl https://example.com`?
1. DNS resolution: lookup order from /etc/nsswitch.conf (typically /etc/hosts first) → nameservers in /etc/resolv.conf → recursive resolver → DNS response with IP
2. TCP handshake: SYN → SYN-ACK → ACK to port 443
3. TLS handshake: Certificate exchange, cipher negotiation, session key establishment
4. HTTP request sent over encrypted connection
5. Response received, connection closed (or kept alive for reuse)
Common follow-up: What tool do you use to debug each step?
- DNS: dig example.com or nslookup
- TCP connectivity: telnet example.com 443 or nc -zv example.com 443
- TLS: openssl s_client -connect example.com:443
- Full HTTP: curl -v https://example.com
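curl can also report how long each of the phases above took, which helps narrow down where a slow request spends its time. This is a sketch using curl's --write-out timing variables; the names CURL_PHASES and curl_phases are hypothetical.

```bash
# Format string for curl --write-out. Each value is the number of seconds
# from the start of the request until that phase completed.
CURL_PHASES='dns=%{time_namelookup} tcp=%{time_connect} tls=%{time_appconnect} first_byte=%{time_starttransfer} total=%{time_total}\n'

# curl_phases URL: fetch the URL, discard the body, print the phase timings.
curl_phases() {
    curl -sS -o /dev/null -w "$CURL_PHASES" "$1"
}
```

Usage: curl_phases https://example.com. A large gap between tcp and tls, for example, suggests the TLS handshake rather than DNS or the network path.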
7. What is the difference between ss and netstat?
Both show network connections. ss is the modern replacement for netstat: it is faster because it queries the kernel directly over netlink, and it exposes more detail. netstat parses /proc/net/tcp, which is slower on hosts with many connections.
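One common use of ss during triage is tallying connections by state, for example to see whether TIME-WAIT sockets dominate. A minimal sketch, with count_tcp_states as an assumed helper name; it reads ss -tan output on stdin so it can be tried on saved output as well.

```bash
# count_tcp_states: tally the State column of `ss -tan` output.
# The first line is the header, so it is skipped.
count_tcp_states() {
    awk 'NR > 1 { n[$1]++ } END { for (s in n) print s, n[s] }' | sort
}
```

Usage on a live host: ss -tan | count_tcp_states. Note that ss spells the state TIME-WAIT, while netstat prints TIME_WAIT.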
ss -tuln # TCP/UDP listening ports
ss -s # summary statistics
ss -tp # TCP connections with process names
netstat -tuln # same but slower, may not be installed
Shell Scripting
8. How do you make a shell script robust?
#!/bin/bash
set -euo pipefail
# -e: exit when a command fails (commands tested by if, while, &&, or || are exempt)
# -u: treat undefined variables as errors
# -o pipefail: pipe fails if any command in it fails
# Trap for cleanup on exit
cleanup() {
echo "Cleaning up..."
rm -f /tmp/tempfile
}
trap cleanup EXIT
# Check dependencies
command -v aws >/dev/null 2>&1 || { echo "aws CLI required" >&2; exit 1; }
# Quote variable expansions. With -u, FILENAME must be set first; here it is taken from $1 as an illustration
FILENAME="${1:?usage: $0 <filename>}"
FILE="/path/to/${FILENAME}"
# Check if file exists before using it
[[ -f "$FILE" ]] || { echo "File not found: $FILE" >&2; exit 1; }
The three most common shell script bugs: unquoted variables whose values contain spaces, unhandled errors, and reading $? after a later command has already overwritten it.
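All three bugs can be reproduced in a few lines. The snippet below is an illustrative sketch; count_args is a throwaway helper, the file name is made up, and the pipeline cases are run through bash -c so the pipefail setting does not leak into the surrounding shell.

```bash
count_args() { echo "$#"; }

# 1. Unquoted variables: a value containing a space splits into two words.
name="my report.txt"
count_args $name      # prints 2
count_args "$name"    # prints 1

# 2. Unhandled errors: without pipefail, a pipeline reports only the
#    exit status of its last command.
bash -c 'false | true; echo "without pipefail: $?"'
bash -c 'set -o pipefail; false | true; echo "with pipefail: $?"'

# 3. $? holds the status of the most recent command only, so save it at once.
false
status=$?
echo "some logging in between"
echo "saved status: $status, current status: $?"
```

The last line shows the saved value as 1 while the live $? has already been reset to 0 by the intervening echo.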
Practice Linux Scenarios Out Loud
InterviewDrill.io has a Linux & Shell track with real debugging scenarios. First session free → interviewdrill.io
