Essential Linux Commands for System Troubleshooting & Monitoring
This guide collects the Linux commands that system administrators and DevOps engineers rely on most when diagnosing issues, monitoring performance, and keeping systems healthy.
Whether you’re troubleshooting performance issues, managing services, or diagnosing network problems, these commands will help you quickly identify and resolve system issues.
Table of Contents
- Quick Reference Cheat Sheet
- Safety Guidelines
- System Information & Hardware
- Performance Monitoring & Resource Usage
- Process Management
- SystemD Service Management
- Memory Analysis
- Network Configuration & Troubleshooting
- Port Management & Network Services
- File System Operations
- Permissions & Ownership
- System Control
- Development Environment
- Cloud & Container Tools
- Common Troubleshooting Scenarios
- Log Management & Analysis
- Security & Intrusion Detection
- Distribution-Specific Commands
- Troubleshooting Decision Tree
- Conclusion
- Quick Command Reference
- Related Articles
- Additional Learning Resources
Quick Reference Cheat Sheet
Emergency Commands (⚠️ Use with caution)
Medium Risk:
- kill -9 <pid> - Force kill process (cannot be ignored)
- pkill -9 <process_name> - Force kill all matching processes
High Risk:
- sudo reboot - Restart system immediately
- sudo iptables -F - Flush all firewall rules
- sudo systemctl stop <critical_service> - Stop critical system service
NEVER USE:
- rm -rf / - Delete everything (system destruction)
- dd if=/dev/zero of=/dev/sda - Wipe disk completely
Most Used Commands
# System status
htop # Interactive process viewer
free -h # Memory usage
df -h # Disk usage
systemctl status # Service status
# Quick troubleshooting
journalctl -f # Follow system logs
netstat -tulpn # List open ports
ps aux --sort -%cpu | head -10 # Top processes by CPU
lsof -i # Network connections
Safety Guidelines
⚠️ Warning: Commands marked with this symbol can cause system damage or data loss.
💡 Tip: Always test commands in a non-production environment first.
🔒 Security: Never run unknown scripts with sudo privileges.
Before Running Destructive Commands:
- Backup critical data
- Test in staging environment
- Have rollback plan ready
- Inform team members
System Information & Hardware
Operating System Information
Get detailed information about your Linux distribution and version:
# OS version and distribution info
cat /etc/os-release
cat /etc/redhat-release # RHEL/CentOS specific
lsb_release -a # Ubuntu/Debian specific
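Scripts that need to branch on the distribution can read the ID field from the same os-release file rather than probing for release files. A minimal sketch, assuming a POSIX shell; the function name os_id is illustrative, and the optional file argument exists only so the function can be pointed at a copy of the file:

```shell
# Print the distribution ID (for example "ubuntu" or "centos") from an
# os-release style file. The file is sourced in a subshell so that its
# variables do not leak into the calling shell.
os_id() {
  ( . "${1:-/etc/os-release}" && printf '%s\n' "$ID" )
}
```

Called with no argument, os_id reads /etc/os-release on the current host.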
CPU Architecture & Core Information
Analyze CPU specifications and architecture:
# Number of processing units (logical cores with hyperthreading)
nproc
# Detailed CPU architecture information
lscpu
# Extract CPU count from lscpu output
lscpu | grep "^CPU(s):" | awk '{print $2}'
# System architecture
uname -m
arch
# Count logical processors (one "processor" entry per hardware thread)
grep -c '^processor' /proc/cpuinfo
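Counting "processor" entries in /proc/cpuinfo yields logical processors, so hyperthreaded systems report twice their physical core count. One way to count physical cores is to collect the unique socket and core ID pairs. A minimal sketch, assuming the "physical id" and "core id" fields are present (typical on x86; the function name is illustrative):

```shell
# Count physical cores by collecting unique "physical id,core id" pairs.
# Assumption: the cpuinfo text contains both fields. Many ARM systems omit
# them, in which case this prints 0.
count_physical_cores() {
  awk -F': *' '
    /^physical id/ { socket = $2 }
    /^core id/     { seen[socket "," $2] = 1 }
    END { n = 0; for (k in seen) n++; print n }
  ' "${1:-/proc/cpuinfo}"
}
```

With no argument the function reads /proc/cpuinfo; the optional file argument makes it easy to try against a saved copy.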
Performance Monitoring & Resource Usage
Top Resource Consumers
Identify processes consuming the most CPU and memory:
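# Top 10 CPU consumers (head -11 keeps the header row plus 10 processes)
ps aux --sort -%cpu | head -11
top -b -n 1 -o %CPU | head -n 17 # batch mode, one iteration; 7 summary/header lines + 10 processes
# Top 10 memory consumers
ps aux --sort -%mem | head -11
top -b -n 1 -o %MEM | head -n 17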
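For scripted checks it can be more convenient to filter by a threshold than to read a sorted list. A minimal sketch, assuming input rows of the form "PID %CPU COMMAND"; the function name high_cpu and the default of 80 are illustrative:

```shell
# Read "PID %CPU COMMAND" rows on stdin and print the PID and command of
# every process at or above the given CPU percentage (default 80).
# Only the first word of the command is printed.
high_cpu() {
  awk -v limit="${1:-80}" '$2 + 0 >= limit + 0 { print $1, $3 }'
}

# Example (not run here): ps -eo pid,pcpu,comm --no-headers | high_cpu 80
```

Keeping the filter separate from the ps call means the same function works on saved ps output collected during an incident.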
Interactive Process Monitoring
Real-time system monitoring tools:
# Enhanced process viewer (colored, sortable)
htop
# Standard process viewer
top -i # hide idle processes
top # show all processes
Process Management
Finding Processes
Locate specific processes by name or pattern:
# Find process by name (shows PID, CPU%, MEM%)
ps aux | grep <process_name>
ps aux | grep java
ps aux | grep nginx
# Alternative process search methods
pgrep <process_name> # returns PIDs only
pidof <process_name> # returns PIDs only
Terminating Processes
Safely and forcefully terminate processes:
# Graceful termination (SIGTERM)
kill <pid>
# Force kill process (SIGKILL - cannot be ignored)
kill -9 <pid>
# Kill all processes matching name
pkill <process_name>
pkill -9 <process_name> # force kill
# Kill processes by name via PID lookup (pgrep does not match itself, unlike ps | grep)
pgrep -i firefox | xargs -r kill -9
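The usual escalation is to try SIGTERM first and reach for SIGKILL only if the process does not exit. A minimal sketch of that sequence as a shell function; the name stop_gracefully and the 10-second default are illustrative choices, not part of any standard tool:

```shell
# Send SIGTERM, wait up to N seconds for the process to exit, and only
# then fall back to SIGKILL.
# Returns 0 if the process exited after SIGTERM, 1 if SIGKILL was needed,
# 2 if the PID does not exist or cannot be signalled.
stop_gracefully() {
  pid=$1
  timeout=${2:-10}
  if ! kill -TERM "$pid" 2>/dev/null; then
    return 2
  fi
  waited=0
  while kill -0 "$pid" 2>/dev/null; do   # kill -0 only tests for existence
    if [ "$waited" -ge "$timeout" ]; then
      kill -KILL "$pid" 2>/dev/null
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}
```

For example, stop_gracefully 1234 5 gives PID 1234 five seconds to shut down cleanly before it is force-killed.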
SystemD Service Management
Listing Services
View and filter system services:
# All installed unit files
systemctl list-unit-files
systemctl list-unit-files --all
# List services by type
systemctl list-units --type=service # loaded services
systemctl list-units --type=service --all # all services
# List services by state
sudo systemctl list-unit-files --type=service --state=enabled
sudo systemctl list-unit-files --type=service --state=disabled
sudo systemctl list-units --type=service --state=active
sudo systemctl list-units --type=service --state=running
sudo systemctl list-units --type=service --state=failed
sudo systemctl list-units --type=service --state=exited
# Filter services
systemctl list-unit-files | grep enabled
systemctl list-unit-files | grep disabled
Service Control Operations
Manage service lifecycle and configuration:
# Reload systemd configuration
sudo systemctl daemon-reload
# Enable/disable services (auto-start on boot)
sudo systemctl enable <service-name>
sudo systemctl disable <service-name>
# Service lifecycle management (systemctl)
sudo systemctl start <service-name>
sudo systemctl stop <service-name>
sudo systemctl restart <service-name>
sudo systemctl reload <service-name> # reload config without restart
sudo systemctl status <service-name>
# Alternative service command (legacy)
sudo service <service-name> start
sudo service <service-name> stop
sudo service <service-name> restart
sudo service <service-name> status
Creating Custom Services
Create and manage custom systemd services:
Service File Locations
- Units created by the administrator (take precedence over packaged units):
/etc/systemd/system/
- Units installed by packages:
/usr/lib/systemd/system/
Example: Loki Service
sudo tee /etc/systemd/system/loki.service<<EOF
[Unit]
Description=Loki service
After=network.target
[Service]
Type=simple
User=root
ExecStart=/usr/local/bin/loki -config.file=/opt/loki/config/loki-config.yaml
Restart=always
[Install]
WantedBy=multi-user.target
EOF
Example: Fluent-bit Service
sudo tee /etc/systemd/system/fluent-bit.service<<EOF
[Unit]
Description=Fluent Bit
Documentation=https://docs.fluentbit.io/manual/
Requires=network.target
After=network.target
[Service]
Type=simple
EnvironmentFile=-/etc/sysconfig/fluent-bit
EnvironmentFile=-/etc/default/fluent-bit
ExecStart=/opt/fluent-bit/bin/fluent-bit -c /etc/fluent-bit/fluent-bit.conf
Restart=always
[Install]
WantedBy=multi-user.target
EOF
Activating Custom Services
sudo systemctl daemon-reload
sudo systemctl enable <service-name>
sudo systemctl start <service-name>
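The Loki example above runs as root, which is rarely necessary for a log aggregator. A minimal sketch of a helper that renders a similar unit for a dedicated account; the function name render_unit and the loki user are assumptions for illustration (the user must already exist), and Restart=on-failure is a deliberate change from Restart=always so that a clean stop is not undone:

```shell
# Print a minimal service unit to stdout.
# Arguments: description, ExecStart command line, user to run as.
render_unit() {
  cat <<EOF
[Unit]
Description=$1
After=network.target

[Service]
Type=simple
User=$3
ExecStart=$2
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
}

# Example (not run here):
# render_unit "Loki service" \
#   "/usr/local/bin/loki -config.file=/opt/loki/config/loki-config.yaml" loki |
#   sudo tee /etc/systemd/system/loki.service
```

Rendering to stdout first makes it easy to review the unit before writing it into /etc/systemd/system/.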
Service Log Analysis (journalctl)
Analyze service logs for troubleshooting:
Basic Log Viewing
# View all logs for specific service
journalctl -u <service-name>
journalctl -u <service-name> | head -50 # first 50 lines
journalctl -u <service-name> | tail -50 # last 50 lines
journalctl -u <service-name> -n 10 # last 10 lines
Time-based Log Filtering
# View logs by time range
journalctl --since=yesterday -u <service-name>
journalctl --since "1 hour ago" -u <service-name>
journalctl --since "2025-01-01 10:00:00" --until "2025-01-01 11:00:00" -u <service-name>
Log Content Filtering
# Filter logs by content
journalctl -u <service-name> | grep "error"
journalctl -u <service-name> | grep "started"
# Real-time log monitoring
journalctl -u <service-name> -f
Log Priority Levels
# View logs by severity
journalctl -u <service-name> -p err # error and above
journalctl -u <service-name> -p warning # warning and above
journalctl -u <service-name> -p info # info and above
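A failing service often logs the same error many times, so collapsing duplicates shows the dominant problem faster than scrolling. A minimal sketch; the function name top_messages is illustrative, and the usage line assumes journalctl's cat output format (message text only, no timestamps):

```shell
# Collapse identical lines on stdin and print the N most frequent ones
# (default 5) as "count message", most frequent first.
top_messages() {
  sort | uniq -c | sort -rn | head -n "${1:-5}" | sed 's/^ *//'
}

# Example (not run here):
# journalctl -u <service-name> -p err -o cat --since "1 hour ago" | top_messages 5
```

Messages that embed changing values such as request IDs will not collapse; stripping those with sed before the call may help.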
Advanced Log Formats
# Detailed log output formats
journalctl -u <service-name> -o verbose # detailed output
journalctl -u <service-name> -o json # JSON format
journalctl -u <service-name> -o json-pretty # formatted JSON
journalctl -u <service-name> -x # with help texts
Boot-specific Logs
# View logs by boot session
journalctl -u <service-name> -b # current boot
journalctl -u <service-name> -b -1 # previous boot
journalctl --list-boots # list available boots
System-wide Log Analysis
# System-wide log commands
journalctl # all system logs
journalctl -k # kernel messages only
journalctl -b # current boot logs
journalctl --disk-usage # journal disk usage
Memory Analysis
Memory Usage Overview
Monitor system memory consumption and availability:
# Memory usage in human-readable format
free -h # human-readable (KB, MB, GB)
free -g # memory usage in GB
free -m # memory usage in MB
# Virtual memory statistics
vmstat # swap, I/O, CPU statistics
vmstat 1 5 # update every 1 second, 5 times
# Detailed memory information
cat /proc/meminfo
Memory Performance Monitoring
# Memory usage by process
ps aux --sort=-%mem | head -10
# System memory pressure
sar -r 1 10 # memory utilization every 1 second
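For alerting, the "available" figure matters more than "free", because it accounts for reclaimable cache. A minimal sketch that turns the MemTotal and MemAvailable fields of /proc/meminfo into a single percentage; the function name and the 10 percent threshold in the example are illustrative:

```shell
# Print available memory as a whole-number percentage of total memory.
# Reads /proc/meminfo by default; a different file can be passed as $1.
mem_available_pct() {
  awk '
    /^MemTotal:/     { total = $2 }
    /^MemAvailable:/ { avail = $2 }
    END { if (total > 0) printf "%d\n", avail * 100 / total }
  ' "${1:-/proc/meminfo}"
}

# Example (not run here):
# [ "$(mem_available_pct)" -lt 10 ] && echo "Low memory"
```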
Network Configuration & Troubleshooting
Network Interface Management
Configure and manage network interfaces:
# View network interfaces
ip addr show
ifconfig
# View routing table
ip route show
route -n
# Network interface statistics
ip -s link show
Static IP Configuration
Configure static IP addresses:
Method 1: Network Interfaces File (Debian/Ubuntu)
# Bring the interface down before editing (this drops any SSH session that uses eth0)
sudo ifdown eth0
# Edit network interfaces file
sudo nano /etc/network/interfaces
# Add static IP configuration:
auto eth0
iface eth0 inet static
address <static_IP_address>
netmask <netmask>
gateway <default_gateway>
pre-up sleep 2
# Activate new configuration
sudo ifup eth0
Method 2: Netplan (Ubuntu 18.04+)
# Edit netplan configuration
sudo nano /etc/netplan/01-network-manager-all.yaml
# Apply configuration
sudo netplan apply
Network Connectivity Testing
# Test connectivity
ping -c 4 google.com
ping -c 4 8.8.8.8
# DNS resolution testing
nslookup google.com
dig google.com
# Network path tracing
traceroute google.com
mtr google.com # continuous traceroute
Port Management & Network Services
Listing Open Ports
Identify active network services and listening ports:
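# List all listening ports (-l already limits output to listening sockets;
# piping through "grep LISTEN" would hide UDP sockets, which have no LISTEN state)
sudo netstat -tulpn
sudo ss -tulpn # modern alternative
# List all network connections (listening and established)
sudo netstat -tunap
sudo ss -tunap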
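The ss output is wide and awkward to scan when all that is needed is which process owns which port. A minimal sketch that reduces it to three columns, assuming the default column layout of ss -tulpn (local address in column 5, process info in column 7); the function name listening_ports is illustrative:

```shell
# Read `ss -tulpn` output on stdin and print "protocol port process"
# for each socket.
listening_ports() {
  awk '
    $1 == "Netid" { next }              # skip the header row
    {
      n = split($5, addr, ":")          # port is the last ":"-separated piece
      proc = $7
      sub(/^users:\(\("/, "", proc)     # drop the leading users:(( and quote
      sub(/".*$/, "", proc)             # drop everything after the name
      if (proc == "") proc = "-"        # process column is empty without root
      print $1, addr[n], proc
    }
  '
}

# Example (not run here): sudo ss -tulpn | listening_ports
```

Splitting on the last colon keeps the function working for IPv6 addresses such as [::]:80.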
Port-to-Service Mapping
Identify services associated with port numbers:
# View system service-to-port mappings
cat /etc/services
# Search for specific ports
grep -w '80/tcp' /etc/services
grep -w '443/tcp' /etc/services
grep -E -w '22/(tcp|udp)' /etc/services
grep -E -w '2020/(tcp|udp)' /etc/services
grep -w 8080 /etc/services
Process-to-Port Analysis
Identify which processes are using specific ports:
# List programs using listening ports
sudo lsof -nP -iTCP -sTCP:LISTEN
# Check specific port usage
sudo lsof -nP -i :8080 # TCP or UDP port 8080
sudo lsof -nP -iUDP:53 # UDP port 53
sudo fuser 9092/tcp # process using TCP port 9092
sudo fuser -n tcp 9092 # alternative syntax
# Network connections with process info
sudo netstat -tunpl
sudo netstat -ant | grep :2181
sudo netstat -peanut | grep ":5140"
Terminating Port-specific Processes
Kill processes using specific ports:
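# Kill the process listening on a specific port (fuser -k sends SIGKILL by default)
sudo fuser -k 8080/tcp
sudo kill -9 $(sudo lsof -t -iTCP:8080 -sTCP:LISTEN) # -sTCP:LISTEN skips clients connected to the port
# Same pattern for another port
sudo kill -9 $(sudo lsof -t -iTCP:9092 -sTCP:LISTEN)
sudo fuser -k 9092/tcp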
Port Connectivity Testing
Test network service availability:
# Install telnet client
sudo yum install telnet # RHEL/CentOS
sudo apt install telnet # Ubuntu/Debian
# Test port connectivity
telnet localhost 8080
telnet example.com 80
# Alternative connectivity tests
nc -zv localhost 8080 # netcat port scan
curl -I http://localhost:8080 # HTTP service test
Port Scanning & Service Discovery
Scan for open ports and running services:
# Install nmap
sudo yum install nmap # RHEL/CentOS
sudo apt install nmap # Ubuntu/Debian
# Port scanning examples
sudo nmap -sT -O localhost # TCP scan localhost
sudo nmap -sT -O 127.0.0.1 # TCP ports
sudo nmap -sU -O 192.168.2.254 # UDP ports
sudo nmap -sTU -O 192.168.2.24 # both TCP and UDP
# Service version detection
nmap -sV localhost
nmap -A localhost # aggressive scan with OS detection
Firewall Configuration
Manage firewall rules for port access:
# View current iptables rules
sudo iptables -S # rules by specification
sudo iptables -L # rules in table format
# Allow loopback traffic
sudo iptables -A INPUT -i lo -j ACCEPT
sudo iptables -A OUTPUT -o lo -j ACCEPT
# Open specific ports
sudo iptables -A INPUT -p tcp --dport 2020 -m conntrack --ctstate NEW,ESTABLISHED -j ACCEPT
sudo iptables -A OUTPUT -p tcp --sport 2020 -m conntrack --ctstate ESTABLISHED -j ACCEPT
# Save iptables rules (varies by distribution)
sudo sh -c 'iptables-save > /etc/iptables/rules.v4' # Debian/Ubuntu (the redirect must also run as root)
sudo service iptables save # RHEL/CentOS
File System Operations
File Size Analysis
Analyze file and directory sizes:
# Human-readable file size
du -h yourfile.txt # disk usage format (K, M, G, T)
ls -lh yourfile.txt # file details with readable size
stat -c%s yourfile.txt # actual file size in bytes
# Specific size units
du -b yourfile.txt # bytes
du -k yourfile.txt # kilobytes
du -m yourfile.txt # megabytes
du -BG yourfile.txt # gigabytes
du -BT yourfile.txt # terabytes
File Content Statistics
Analyze file content metrics:
# File content analysis
wc -l yourfile.txt # number of lines
wc -w yourfile.txt # number of words
wc -c yourfile.txt # number of bytes (use wc -m for characters)
Longest Line Analysis
Find and analyze the longest lines in files:
# Length of longest line (number only)
awk '{ print length }' yourfile.txt | sort -nr | head -n 1
# Display the longest line itself
awk '{ if ( length > max ) { max = length; longest = $0 } } END { print longest }' yourfile.txt
# Line number and length of longest line
awk '{ if ( length > max ) { max = length; line = NR } } END { print "Line", line, "Length:", max }' yourfile.txt
Viewing Specific Lines
Extract specific lines from files:
# Show single line (line 42)
sed -n '42p' yourfile.txt
awk 'NR==42' yourfile.txt
head -n 42 yourfile.txt | tail -n 1
# Show range of lines (42 to 45)
sed -n '42,45p' yourfile.txt
awk 'NR>=42 && NR<=45' yourfile.txt
Directory Size Analysis
Analyze directory and folder sizes:
# Show all folders in current directory
du -sh */ # human-readable, unsorted
du -sh */ | sort -hr # sorted largest first
du -s */ # KB, unsorted
du -s */ | sort -nr # KB, sorted largest first
# Specific folder size in different units
du -sh /path/to/folder # human-readable (K, M, G, T)
du -s /path/to/folder # KB
du -sm /path/to/folder # MB
du -s -BG /path/to/folder # GB (GNU du has no -g option)
# Files by size (descending)
ls -lhS # human-readable
ls -lS # bytes
du -ah . | sort -hr # all files/folders, human-readable
du -ah . | sort -hr | head -10 # top 10 largest
# Disk usage summary
df -h # filesystem usage
du -sh . # current directory total
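When the goal is to spot filesystems that are nearly full, filtering df output by a usage threshold saves reading the whole table. A minimal sketch, assuming the POSIX output format of df -P (usage percentage in column 5, mount point in column 6); the function name and the default of 90 are illustrative:

```shell
# Read `df -P` output on stdin and print "mountpoint usage%" for every
# filesystem at or above the given usage percentage (default 90).
# Mount points containing spaces are not handled.
full_filesystems() {
  awk -v limit="${1:-90}" '
    NR > 1 {
      use = $5
      sub(/%/, "", use)
      if (use + 0 >= limit + 0) print $6, $5
    }
  '
}

# Example (not run here): df -P | full_filesystems 90
```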
Permissions & Ownership
Changing File Ownership
Modify file and directory ownership:
# Change ownership recursively
sudo chown -R username:groupname ./directory/
sudo chown -R snikam:snikam ./Prod/
# Change ownership for single file
sudo chown username:groupname filename
# Change only user ownership
sudo chown username filename
# Change only group ownership
sudo chgrp groupname filename
File Permissions
Manage file and directory permissions:
# View permissions
ls -la filename
stat filename
# Change permissions (numeric)
chmod 755 filename # rwxr-xr-x
chmod 644 filename # rw-r--r--
chmod 600 filename # rw-------
# Change permissions (symbolic)
chmod u+x filename # add execute for user
chmod g-w filename # remove write for group
chmod o+r filename # add read for others
# Recursive permission changes
chmod -R 755 directory/
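A recursive chmod 755 also marks every regular file as executable, which is usually not intended. A common alternative is to treat directories and files separately. A minimal sketch; the function name fix_perms and the 755/644 pair are illustrative defaults:

```shell
# Set directories to 755 and regular files to 644 under the given path.
fix_perms() {
  find "$1" -type d -exec chmod 755 {} +
  find "$1" -type f -exec chmod 644 {} +
}

# Example (not run here): fix_perms ./directory
```

Files that genuinely need the execute bit, such as scripts, have to be restored afterwards with chmod u+x.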
System Control
System Shutdown & Restart
Safely shutdown and restart the system:
# Restart system
sudo shutdown -r now # restart immediately
sudo reboot # alternative restart command
sudo systemctl reboot # systemd restart
# Shutdown system
sudo shutdown -h now # shutdown immediately
sudo poweroff # alternative shutdown
sudo systemctl poweroff # systemd shutdown
# Scheduled shutdown/restart
sudo shutdown -r +10 # restart in 10 minutes
sudo shutdown -h 20:30 # shutdown at 8:30 PM
sudo shutdown -c # cancel scheduled shutdown
Development Environment
Python Development
Resolve common Python development issues:
# Fix ModuleNotFoundError
export PYTHONPATH=${PYTHONPATH}:${HOME}/your_python_module
export PYTHONPATH=${PYTHONPATH}:$(pwd)
# Add to Python script
import sys
import os
sys.path.append(os.getcwd())
# Virtual environment management
python3 -m venv myenv
source myenv/bin/activate # Linux/Mac
deactivate # exit virtual environment
# Package management
pip install package_name
pip freeze > requirements.txt
pip install -r requirements.txt
Code Editor Tips
Useful shortcuts and configurations:
# VS Code column selection (Mac)
# Shift + Option + Command + Right/Left
# VS Code command line integration
code filename.txt # open file in VS Code
code . # open current directory
Cloud & Container Tools
Minikube Management
Manage local Kubernetes development environment:
# View Minikube configuration
minikube config view
# Check actual allocated resources (CPU count, memory, root disk)
minikube ssh -- 'nproc; free -h; df -h /'
# Inspect VM resources directly
minikube ssh
nproc # CPU count
free -h # Memory usage
df -h / # Disk usage
# Kubernetes resource view
kubectl describe node
# Shows: Capacity, Allocatable, and Allocated resources
# Start Minikube with custom resources
minikube start --cpus=4 --nodes=2 --memory=8192 --disk-size=32g
minikube start --kubernetes-version=v1.32.0 --cpus=4 --memory=8192 --disk-size=30g --driver=hyperkit # hyperkit is macOS only
AWS CLI Operations
Monitor AWS resources from command line:
# CloudWatch metrics - CPU utilization
aws cloudwatch get-metric-statistics \
--namespace AWS/EC2 \
--metric-name CPUUtilization \
--period 3600 \
--statistics Maximum \
--dimensions Name=InstanceId,Value=i-0b100699f2321e6U1 \
--region us-west-1 \
--profile your_profile \
--start-time 2024-08-24T01:39:06 \
--end-time 2024-08-24T10:39:06
# List EC2 instances
aws ec2 describe-instances --region us-west-1
# S3 bucket operations
aws s3 ls
aws s3 sync ./local-folder s3://bucket-name/
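The hard-coded start and end times in the CloudWatch example go stale quickly. A minimal sketch that computes a rolling window instead, assuming a date command that accepts -d "@<epoch>" (GNU coreutils and BusyBox do; BSD/macOS date does not); the function name metric_window and the 9-hour default are illustrative:

```shell
# Print two ISO 8601 UTC timestamps separated by a space:
# "N hours ago" and "now" (default N = 9).
metric_window() {
  hours=${1:-9}
  now=$(date -u +%s)
  start=$((now - hours * 3600))
  printf '%s %s\n' \
    "$(date -u -d "@$start" +%Y-%m-%dT%H:%M:%S)" \
    "$(date -u -d "@$now" +%Y-%m-%dT%H:%M:%S)"
}

# Example (not run here):
# set -- $(metric_window 9)
# aws cloudwatch get-metric-statistics ... --start-time "$1" --end-time "$2"
```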
Common Troubleshooting Scenarios
Scenario 1: High CPU Usage
Problem: Server running slow, high load average
Solution Steps:
- htop - Identify CPU-intensive processes
- ps aux --sort -%cpu | head -10 - List top CPU consumers
- kill -15 <pid> - Gracefully terminate problematic process
- systemctl restart <service> - Restart if it’s a service
Example Output:
$ htop
# Look for processes with high CPU% (>80%)
# Note the PID of problematic processes
$ ps aux --sort -%cpu | head -5
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1234 95.2 2.1 123456 8192 ? R 10:30 5:23 problematic_app
Scenario 2: Out of Memory
Problem: Applications crashing, system unresponsive
Solution Steps:
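- free -h - Check available memory
- ps aux --sort -%mem | head -10 - Find memory hogs
- sudo swapoff -a && sudo swapon -a - Move swapped pages back into RAM and reset swap (only safe when free RAM exceeds the swap in use; otherwise it can trigger the OOM killer)
- systemctl restart <memory-intensive-service> - Restart the service that is consuming memory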
Example Output:
$ free -h
total used free shared buff/cache available
Mem: 7.7G 7.5G 100M 180M 200M 50M
Swap: 2.0G 1.8G 200M
Interpretation: Critical - only 50M available memory, swap heavily used
Scenario 3: Network Connectivity Issues
Problem: Cannot reach external services
Solution Steps:
- ping 8.8.8.8 - Test internet connectivity
- ip route show - Check routing table
- systemctl status NetworkManager - Check network service
- sudo systemctl restart NetworkManager - Restart networking
Scenario 4: Service Won’t Start
Problem: Critical service fails to start
Solution Steps:
- systemctl status <service> - Check service status
- journalctl -u <service> -n 50 - View recent logs
- Check configuration files for syntax errors
- systemctl daemon-reload - Reload systemd if a unit file changed
Log Management & Analysis
Critical Log Locations
/var/log/syslog # System messages (Ubuntu/Debian)
/var/log/messages # System messages (CentOS/RHEL)
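/var/log/auth.log # Authentication logs (Debian/Ubuntu; /var/log/secure on CentOS/RHEL)
/var/log/kern.log # Kernel messages (Debian/Ubuntu)
/var/log/cron.log # Cron job logs (if enabled in rsyslog; /var/log/cron on CentOS/RHEL)
/var/log/nginx/ # Nginx logs
/var/log/apache2/ # Apache logs (Debian/Ubuntu; /var/log/httpd/ on CentOS/RHEL)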
Log Analysis Commands
# Find errors in logs
grep -i error /var/log/syslog
grep -i "failed\|error\|critical" /var/log/messages
# Monitor logs in real-time
tail -f /var/log/syslog
multitail /var/log/syslog /var/log/auth.log
# Search logs by date
journalctl --since "2024-01-01" --until "2024-01-02"
journalctl --since "1 hour ago"
# Count error occurrences
grep -c "error" /var/log/syslog
grep "error" /var/log/syslog | wc -l
# Find large log files
find /var/log -type f -size +100M -exec ls -lh {} \;
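Counting several keywords in one pass gives a quick severity profile of a log file. A minimal sketch built on the same case-insensitive grep used above; the function name count_log_levels and the keyword list are illustrative:

```shell
# Print one "keyword count" line per keyword, counting the lines of the
# given log file that contain the keyword (case-insensitive).
count_log_levels() {
  for word in error failed critical; do
    printf '%s %s\n' "$word" "$(grep -ci -- "$word" "$1")"
  done
}

# Example (not run here): count_log_levels /var/log/syslog
```

A line that contains two keywords is counted once under each of them.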
Log Rotation Management
# Check logrotate configuration
cat /etc/logrotate.conf
ls /etc/logrotate.d/
# Manually rotate logs
sudo logrotate -f /etc/logrotate.conf
# Dry run: show what would be rotated without changing anything
sudo logrotate -d /etc/logrotate.conf
Security & Intrusion Detection
Failed Login Attempts
# Check failed SSH attempts
grep "Failed password" /var/log/auth.log
grep "Invalid user" /var/log/auth.log
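# Count failed attempts by IP (take the field after "from"; its position differs for "invalid user" lines)
grep "Failed password" /var/log/auth.log | awk '{for (i = 1; i < NF; i++) if ($i == "from") print $(i + 1)}' | sort | uniq -c | sort -nr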
# Check successful logins
grep "Accepted password" /var/log/auth.log
last -n 20 # Last 20 logins
who # Currently logged in users
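To turn the failed-login counts into a shortlist of addresses worth blocking, a minimum-count filter helps. A minimal sketch, assuming the usual sshd message format in which the source address follows the word "from"; the function name failed_by_ip and its threshold argument are illustrative:

```shell
# Read auth log lines on stdin and print "count ip" for every source
# address with at least MIN failed password attempts (default 1),
# highest count first.
failed_by_ip() {
  awk -v min="${1:-1}" '
    /Failed password/ {
      for (i = 1; i < NF; i++)
        if ($i == "from") { count[$(i + 1)]++; break }
    }
    END { for (ip in count) if (count[ip] >= min + 0) print count[ip], ip }
  ' | sort -rn
}

# Example (not run here): sudo cat /var/log/auth.log | failed_by_ip 10
```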
Process Security Analysis
# Check for suspicious processes (exact name match avoids hits such as rsync or the grep itself)
pgrep -a -x 'nc|netcat|ncat'
ps aux | grep -v "\[.*\]" | awk '{print $11}' | sort | uniq -c | sort -nr
# Check listening ports and associated processes
netstat -tulpn | grep LISTEN
lsof -i -P -n | grep LISTEN
# Check for unusual network connections
netstat -an | grep ESTABLISHED
ss -tun state established
File System Security
# Find files with unusual permissions
find / -perm -4000 -type f 2>/dev/null # SUID files
find / -perm -2000 -type f 2>/dev/null # SGID files
find / -perm -0002 -type f 2>/dev/null # World-writable files
# Check for recently modified files
find /etc -mtime -1 -type f # Modified in last 24 hours
find /home -name ".*" -type f # Hidden files in home directories
Distribution-Specific Commands
Package Management
Ubuntu/Debian:
apt update && apt upgrade # Update packages
apt install <package> # Install package
apt search <package> # Search package
apt remove <package> # Remove package
apt list --installed # List installed packages
CentOS/RHEL:
yum update # Update packages (CentOS 7)
dnf update # Update packages (CentOS 8+)
yum install <package> # Install package
yum search <package> # Search package
yum remove <package> # Remove package
Arch Linux:
pacman -Syu # Update packages
pacman -S <package> # Install package
pacman -Ss <package> # Search package
pacman -R <package> # Remove package
pacman -Q # List installed packages
openSUSE:
zypper update # Update packages
zypper install <package> # Install package
zypper search <package> # Search package
zypper remove <package> # Remove package
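A script that has to run across these distributions can probe for whichever package manager is installed rather than hard-coding one. A minimal sketch; the function name detect_pkg_manager and the probe order are illustrative choices:

```shell
# Print the name of the first package manager found on PATH.
# Prints "unknown" and returns 1 if none of the listed tools is installed.
detect_pkg_manager() {
  for pm in apt dnf yum pacman zypper; do
    if command -v "$pm" >/dev/null 2>&1; then
      printf '%s\n' "$pm"
      return 0
    fi
  done
  printf 'unknown\n'
  return 1
}
```

dnf is probed before yum because newer CentOS/RHEL releases ship both names, with yum acting as a compatibility alias.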
Service Management
Modern Systems (systemd):
systemctl start <service> # Start service
systemctl stop <service> # Stop service
systemctl status <service> # Service status
systemctl enable <service> # Enable on boot
systemctl disable <service> # Disable on boot
Legacy Systems (SysV):
service <service> start # Start service
service <service> stop # Stop service
service <service> status # Service status
chkconfig <service> on # Enable on boot
chkconfig <service> off # Disable on boot
Network Configuration
Ubuntu 18.04+ (Netplan):
sudo nano /etc/netplan/01-network-manager-all.yaml
sudo netplan apply # Apply configuration
netplan status # Check status
CentOS/RHEL (NetworkManager):
nmcli connection show # Show connections
nmcli device status # Device status
sudo firewall-cmd --list-all # Firewall status
Arch Linux (systemd-networkd):
sudo systemctl enable systemd-networkd
sudo systemctl enable systemd-resolved
sudo iptables -L # Firewall rules
Troubleshooting Decision Tree
System Issue?
├── Performance Problem?
│ ├── High CPU → Check processes (htop, ps aux --sort -%cpu)
│ │ └── Kill problematic process (kill -15 <pid>)
│ ├── High Memory → Check memory usage (free -h, ps aux --sort -%mem)
│ │ └── Restart memory-intensive services
│ └── High I/O → Check disk usage (iotop, df -h)
│ └── Clean up disk space or optimize I/O
├── Network Problem?
│ ├── No connectivity → Check interfaces (ip addr show)
│ │ └── Restart NetworkManager (systemctl restart NetworkManager)
│ ├── Slow connection → Check routing (traceroute, mtr)
│ │ └── Investigate network path issues
│ └── Port issues → Check listening ports (netstat -tulpn)
│ └── Configure firewall or restart services
├── Service Problem?
│ ├── Won't start → Check logs (journalctl -u <service>)
│ │ └── Fix configuration or dependencies
│ ├── Crashes → Check system logs (journalctl -p err)
│ │ └── Investigate error messages
│ └── Config issues → Validate config files
│ └── Test configuration syntax
└── Security Issue?
├── Unauthorized access → Check auth logs (/var/log/auth.log)
│ └── Block suspicious IPs, change passwords
├── Suspicious processes → Check running processes (ps aux)
│ └── Investigate and terminate if malicious
└── File changes → Check file integrity (find, checksums)
└── Restore from backup if compromised
Conclusion
This comprehensive guide covers essential Linux troubleshooting commands for system administrators and DevOps engineers. Regular practice with these commands will improve your ability to quickly diagnose and resolve system issues.
Key Takeaways:
- System monitoring: Use htop, top, and ps for process analysis
- Service management: Master systemctl and journalctl for service control
- Network troubleshooting: Leverage netstat, lsof, and nmap for network issues
- Performance analysis: Utilize free, vmstat, and du for resource monitoring
- Log analysis: Use journalctl with various filters for effective troubleshooting
Bookmark this guide for quick reference during system troubleshooting sessions. Regular use of these commands will make you more efficient at maintaining Linux systems.
Quick Command Reference
# Emergency troubleshooting one-liners
htop # System overview
journalctl -f # Live system logs
netstat -tulpn | grep LISTEN # Open ports
ps aux --sort -%cpu | head -5 # Top CPU users
free -h && df -h # Memory and disk
systemctl --failed # Failed services
Related Articles
Development Workflow Integration
- Complete Git Workflows Guide - Master Git commands and workflows for version control in your Linux development environment
- Ubuntu Fresh Install Setup Guide - Complete Ubuntu development environment setup and system configuration
System Administration
These Linux troubleshooting commands are essential for managing development servers, CI/CD pipelines, and production environments. Master both system setup and troubleshooting for complete Linux administration expertise.
Additional Learning Resources
- Linux System Administration Best Practices
- Advanced SystemD Service Management
- Network Security with iptables
- Performance Monitoring Tools Guide
- Linux Security Hardening Guide
- Container Troubleshooting with Docker
- Kubernetes Debugging Commands
- Red Hat System Administration Guide
- Ubuntu Server Documentation
- Arch Linux Wiki
Happy troubleshooting! 🚀