Compare commits

36 Commits

Author SHA1 Message Date
RadinPirouz
f2c0cebd8a added jitsi replication doc 2026-05-30 20:35:11 +03:30
RadinPirouz
33399a8019 added jitsi plugin doc 2026-05-30 20:30:38 +03:30
a276e44338 Jitsi Introduction 2026-05-29 23:47:49 +03:30
4a421526c9 Added Innodb Docs 2026-04-29 00:34:46 +03:30
ff7e1fd246 Added MySQL Benchmark Doc 2026-04-26 00:27:42 +03:30
d9d59f570e Added MySQL Configuration Docs 2026-04-26 00:12:18 +03:30
30bae64e51 Added bind9 docs 2026-04-21 23:15:52 +03:30
457faf1989 Added bind9 docs 2026-04-21 23:09:34 +03:30
fa6bb1557d added jq documents 2026-04-15 00:45:23 +03:30
edea1fe9e8 Added Zombie Ps Docs 2026-04-14 18:02:10 +03:30
ded4f55fb8 removed space from dir names 2026-04-10 23:52:56 +03:30
9c419f72c4 removed space from dir names 2026-04-10 23:46:40 +03:30
d14e844a38 Added tcpdump doc 2026-04-09 01:59:48 +03:30
2182412ade Added hping3 to documents 2026-04-05 23:27:19 +03:30
bd21f7c0df Rewrited Git Doc 2026-03-16 15:46:58 +03:30
06eef16b93 Merge pull request 'Update From Dev To Main' (#1) from dev into main
Reviewed-on: #1
2026-03-13 10:35:05 +00:00
RadinPirouz
1b25ca1081 date: added to doc 2026-01-19 14:27:36 +03:30
RadinPirouz
006ea9a79f diff: cleaned doc 2026-01-19 14:27:21 +03:30
RadinPirouz
4093ac50ad openssl: update and cleaned 2026-01-19 14:23:50 +03:30
RadinPirouz
9cbf903552 git doc: updated and cleaned 2026-01-19 14:21:44 +03:30
3c5fde8b6d web-servers: added openssl doc 2026-01-17 01:35:28 +03:30
ac454ce347 linux doc: cleaned date command 2026-01-17 01:34:02 +03:30
27f18f5b4a date command 2026-01-17 01:27:33 +03:30
e210a6e44e git doc : added some git remote command 2026-01-13 01:03:54 +03:30
0dbb061575 git doc: update information for some command ( not cleaned ) 2026-01-13 00:12:30 +03:30
398fa38b1a git doc : update git docuemnt (not cleaned) 2026-01-13 00:09:57 +03:30
e8c0e6f7a5 update kuber information doc 2025-12-27 22:39:31 +03:30
d05232dd4b update kuber information doc 2025-12-27 22:13:18 +03:30
d3f932a896 elk node type doc 2025-12-26 16:57:23 +03:30
RadinPirouz
fa0601df04 AWS Information doc 2025-12-25 17:02:47 +03:30
c02d683d6c ELK Doc 2025-12-13 14:35:10 +03:30
c3917e3471 Network Vagrant File 2025-11-27 00:04:38 +03:30
8e6d4cac6b Added Vagrant doc 2025-11-26 14:29:12 +03:30
eda8d5204b Remove Test Dir 2025-11-26 10:27:39 +03:30
a1f46ec042 Merge branch 'main' into dev 2025-11-26 10:26:42 +03:30
Radin Pirouz
747800cf58 test commit 2025-09-16 23:56:54 +03:30
87 changed files with 6566 additions and 216 deletions

209
AWS/1-Information.md Normal file

@@ -0,0 +1,209 @@
# AWS Core Services Overview
## Compute & Container Services
**EC2 (Elastic Compute Cloud)**
* Infrastructure as a Service (IaaS)
* Provides virtual machines (instances)
* Storage options:
  * **EBS** (Elastic Block Store): High-performance block storage attached to a single instance
  * **EFS** (Elastic File System): Network file system that can be mounted by multiple instances
* Requires user management of OS, patching, and scaling
**ECS (Elastic Container Service)**
* AWS-managed container orchestration service
* Supports Docker containers
* Deployment options:
  1. **EC2 Launch Type**: you manage the EC2 instances
  2. **Fargate Launch Type**: serverless, AWS manages the infrastructure
**ECR (Elastic Container Registry)**
* Fully managed Docker container image registry
* Used to store, manage, and deploy container images for ECS and EKS
**EKS (Elastic Kubernetes Service)**
* Managed Kubernetes service
* AWS manages the Kubernetes control plane
* Worker nodes can run on EC2 or Fargate
**AWS Lambda**
* Serverless compute service
* Event-driven execution
* Maximum execution time: **15 minutes**
* No server management required
* Common use cases: APIs, background jobs, automation
---
## Messaging & Integration
**SQS (Simple Queue Service)**
* Fully managed message queue service
* Used for decoupling and scaling distributed systems
* Supports Standard and FIFO queues
---
## Databases
**RDS (Relational Database Service)**
* Managed relational databases (MySQL, PostgreSQL, Oracle, SQL Server, MariaDB, Aurora)
* Typically deployed in **private subnets**
* High availability using Multi-AZ
* Automated backups, patching, and scaling
**DynamoDB**
* Fully managed NoSQL key-value and document database
* Serverless, auto-scaling, and highly available
* Low latency and global replication support
---
## Networking & Traffic Management
**VPC (Virtual Private Cloud)**
* Isolated virtual network in AWS
* Uses CIDR ranges for IP addressing
**Subnets**
* **Public Subnet**: Has a route to the Internet Gateway
* **Private Subnet**: No direct internet access
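
As a rough illustration of how a CIDR prefix translates into usable addresses, the sketch below computes the host count for a subnet. The function name `usable_ips` is hypothetical and not part of the original notes; the subtraction of 5 reflects the addresses AWS reserves in every subnet (the first four and the last one).

```bash
# usable_ips PREFIX -> usable host addresses in an AWS subnet with that prefix length
# (hypothetical helper, for illustration only)
usable_ips() {
  local prefix=$1
  local total=$(( 1 << (32 - prefix) ))   # addresses covered by the CIDR block
  echo $(( total - 5 ))                   # AWS reserves 5 addresses per subnet
}

usable_ips 24   # prints 251
usable_ips 28   # prints 11 (/28 is the smallest subnet AWS allows)
```

The same arithmetic helps when sizing public and private subnets inside one VPC range.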
**Internet Gateway (IGW)**
* Enables inbound and outbound internet access for public subnets
**NAT Gateway**
* Placed in a public subnet
* Allows **outbound-only** internet access for private subnet resources
* Cannot receive inbound connections
**Route 53**
* Managed DNS service
* Supports domain registration, routing policies, and health checks
---
## Load Balancing
**ELB (Elastic Load Balancing)**
* Distributes traffic across multiple targets
**ALB (Application Load Balancer)**
* Layer 7 (Application layer)
* Supports HTTP/HTTPS routing rules
* Can route traffic to:
  * EC2
  * ECS
  * Lambda
  * IP addresses
---
## Security & Identity
**IAM (Identity and Access Management)**
* Manages users, groups, roles, and permissions
* Global AWS service
**IAM Roles**
* Used by AWS services to access other AWS resources securely
**IAM Reports**
* **Credential Report**: Shows credential status for all users
* **Access Advisor**: Shows last-used service permissions
**Security Groups**
* Stateful virtual firewalls for AWS resources
* Control inbound and outbound traffic
* Attached to EC2, ALB, RDS, ECS, etc.
---
## Monitoring & Logging
**CloudWatch**
* Monitoring and observability service
* Collects metrics, logs, and events
* Used for alarms, dashboards, and automation
---
## AWS Global Infrastructure
**Region**
* Geographic area containing multiple Availability Zones
**Availability Zone (AZ)**
* One or more isolated data centers within a region
**Global Services**
* IAM
* Route 53
* CloudFront
* AWS WAF (global only when attached to CloudFront; regional otherwise)
**Regional Services**
* EC2
* ECS
* EKS
* RDS
* Lambda
---
## IP Addressing
**Private IP**
* Assigned from VPC CIDR range
* Used for internal communication
**Public IP**
* Assigned automatically at launch when the subnet (or the launch settings) enables auto-assign public IPv4
* Released when instance is stopped
**Elastic IP (EIP)**
* Static public IPv4 address
* Remains allocated even if the instance stops
* Used for failover and stable endpoints
---
## Database Networking Best Practices
* RDS instances should run in **private subnets**
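* Access options:
  * EC2 in the same VPC
  * Bastion host
  * VPN or Direct Connect
* A NAT Gateway gives other private-subnet resources (for example, application servers) outbound access for updates and patches; patching of managed RDS instances is handled by AWS and does not depend on it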


@@ -1,161 +1,468 @@
# Git Commands Guide for DevOps Engineers
**Professional Reference Document**
*Comprehensive Git workflow for development, CI/CD pipelines, and team collaboration*

---
## Table of Contents
1. [Installation and Setup](#1-installation-and-setup)
2. [SSH Key Configuration](#2-ssh-key-configuration)
3. [Repository Initialization](#3-repository-initialization)
4. [Basic Workflow](#4-basic-workflow)
5. [Status and History](#5-status-and-history)
6. [File Operations](#6-file-operations)
7. [Branch Management](#7-branch-management)
8. [Merging and Rebasing](#8-merging-and-rebasing)
9. [Remote Operations](#9-remote-operations)
10. [Commit Management](#10-commit-management)
11. [Removing Commits](#11-removing-commits)
12. [Stash Operations](#12-stash-operations)
13. [Tags and Releases](#13-tags-and-releases)
14. [.gitignore Management](#14-gitignore-management)
15. [Configuration and Aliases](#15-configuration-and-aliases)
16. [Troubleshooting and Recovery](#16-troubleshooting-and-recovery)
17. [Repository Cloning](#17-repository-cloning)
---
## 1. Installation and Setup
### **Install Git**
Download from official source: [git-scm.com](https://git-scm.com/)
**Linux Distributions:**
```bash
# Debian/Ubuntu
sudo apt update && sudo apt install git -y
# RHEL/CentOS/Fedora
sudo yum install git -y # or dnf install git -y
```
**macOS:**
```bash
brew install git
```
### **Verify Installation**
```bash
git --version
```
*Displays installed Git version*
### **Configure User Identity**
Git requires author information for every commit:
```bash
git config --global user.name "Your Full Name"
git config --global user.email "your.email@company.com"
```
**Configuration Scopes:**

| Scope | Command Flag | Applies To | Persistence |
|-------|--------------|------------|-------------|
| System | `--system` | All users on machine | System-wide |
| Global | `--global` | Current user | User account |
| Local | `--local` | Specific repository | Repository only |

**Verify Configuration:**
```bash
git config --list
```

---
## 2. SSH Key Configuration
### **Generate SSH Key Pair**
```bash
ssh-keygen -t ed25519 -C "your.email@company.com"
```
- **`-t ed25519`**: Modern, secure key algorithm
- **`-C`**: Comment for key identification
### **SSH Agent Management**
```bash
# Start SSH agent
eval "$(ssh-agent -s)"
# Add private key to agent
ssh-add ~/.ssh/id_ed25519
```
### **Per-Repository SSH Key**
```bash
# Set custom key for specific repo
git config --local core.sshCommand "ssh -i /path/to/custom_key"
# Clone with specific key (one-time)
git -c core.sshCommand="ssh -i /path/to/key" clone git@host:repo.git
```
---
## 3. Repository Initialization
### **Create New Repository**
```bash
# Initialize with main branch
git init -b main
# Initialize with the configured default branch (init.defaultBranch)
git init
```
**Key Concepts:**
- **Working Directory**: The files on disk that you edit, both tracked and untracked
- **Staging Area (Index)**: Files prepared for commit
- **Repository**: Committed history and metadata

---
## 4. Basic Workflow
### **Stage Changes**
```bash
# Stage all changes (new, modified, deleted)
git add -A
# Stage specific files
git add <file1> <file2>
# Stage tracked files only (modified and deleted, not new files)
git add -u
# Stage everything under the current directory, including new files
git add .
```
### **Commit Changes**
```bash
git commit -m "Descriptive commit message"
```
### **Connect to Remote**
```bash
git remote add origin <repository-url>
git remote -v # Verify remote configuration
```
### **Push to Remote**
```bash
# First push (sets upstream tracking)
git push -u origin main
# Subsequent pushes
git push
```

---
## 5. Status and History
### **Repository Status**
```bash
git status
```
*Shows working directory and staging area state*
### **Commit History**
```bash
# One-line summary
git log --oneline
# Visual graph of all branches
git log --graph --oneline --all
# Last 3 commits with patch
git log -p -3
# Show specific commit details
git show <commit-hash>
```
### **Change Visualization**
```bash
# Unstaged changes (working directory)
git diff
# Staged changes (index vs HEAD)
git diff --staged
# Branch comparison
git diff main..develop
```

---
## 6. File Operations
| Operation | Command | Effect |
|-----------|---------|---------|
| Stage file | `git add <file>` | Moves file to staging area |
| Unstage | `git reset <file>` | Removes from staging, keeps changes |
| Discard changes | `git restore <file>` | Discards unstaged changes (restores the file from the index) |
| Rename | `git mv old new` | Stages rename operation |
| Remove (tracked) | `git rm <file>` | Stages file deletion |
| Untrack | `git rm --cached <file>` | Removes from Git, keeps locally |
---
## 7. Branch Management
### **Branch Operations**
```bash
# Create and switch
git switch -c feature/new-api
# List branches
git branch -v # Local branches with last commit
git branch -a # All branches (local + remote)
# Delete branch
git branch -d feature # Safe delete (merged)
git branch -D feature # Force delete
# Rename branch
git branch -m old-name new-name
```
**Branch States:**
- **Local Branch**: Exists only in your repository
- **Remote Branch**: Exists on remote server (`origin/main`)
- **Tracking Branch**: Local branch linked to remote (`main -> origin/main`)
---
## 8. Merging and Rebasing
### **Merge (Preserves History)**
```bash
git checkout main
git merge feature/xyz
```
**Merge Types:**
| Type | Condition | Result |
|------|-----------|---------|
| Fast-forward | Current branch has no new commits since the branches diverged | Linear history, no merge commit |
| Three-way | Both branches have new commits | Merge commit created |
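
The difference between the two merge types can be observed in a throwaway repository. The following is a minimal sketch rather than part of the original guide: the helper name `merge_parents`, the branch and file names, and the `demo` identity are all assumptions made for illustration.

```bash
# Throwaway identity so commits work without any global Git configuration
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# merge_parents MODE -> number of parents of HEAD after merging "feature"
#   MODE "linear":   the base branch has no new commits (fast-forward expected)
#   MODE "diverged": the base branch gets its own commit (three-way merge expected)
merge_parents() {
  local mode=$1 repo
  repo=$(mktemp -d)
  (
    cd "$repo" || exit 1
    git init -q
    echo base > a.txt; git add a.txt; git commit -qm "base"
    base=$(git symbolic-ref --short HEAD)   # works whatever the default branch is called
    git checkout -q -b feature
    echo feature > b.txt; git add b.txt; git commit -qm "feature work"
    git checkout -q "$base"
    if [ "$mode" = diverged ]; then
      echo more > c.txt; git add c.txt; git commit -qm "base work"
    fi
    git merge -q --no-edit feature >/dev/null
    # rev-list prints "<commit> <parent>..."; subtract 1 for the commit itself
    echo $(( $(git rev-list --parents -n 1 HEAD | wc -w) - 1 ))
  )
  rm -rf "$repo"
}

merge_parents linear     # 1: HEAD simply moved forward, no merge commit
merge_parents diverged   # 2: a merge commit with two parents was created
```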
### **Rebase (Linear History)**
```bash
git checkout feature/xyz
git rebase main
```
**Rebase Controls:**
```bash
git rebase --abort # Cancel rebase
git rebase --continue # Resolve conflicts and continue
```
---
## 9. Remote Operations
### **Remote Management**
```bash
git remote -v # List remotes
git remote show origin # Detailed remote info
git fetch --all # Fetch all remotes
```
### **Pull Strategies**
```bash
git pull # Fetch + merge
git pull --rebase # Fetch + rebase (cleaner history)
```
---
## 10. Commit Management
### **Modify Last Commit**
```bash
git commit --amend # Edit message/files
```
### **Safe Undo (Shared Branches)**
```bash
git revert <commit-hash> # Creates reversing commit
```
### **Reset Types**
```bash
git reset --soft HEAD~1 # Keeps staging area
git reset HEAD~1 # Unstages, keeps files
git reset --hard HEAD~1 # Discards everything
```
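
To make the three modes concrete, the sketch below applies each one to a fresh two-commit repository and prints the resulting `git status --short` line. It is an illustration under assumptions: the helper `state_after`, the file name `f.txt`, and the `demo` identity are hypothetical and not taken from the guide.

```bash
# Throwaway identity so commits work without any global Git configuration
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# state_after MODE -> `git status --short` output after `git reset MODE HEAD~1`
state_after() {
  local mode=$1 repo
  repo=$(mktemp -d)
  (
    cd "$repo" || exit 1
    git init -q
    echo one > f.txt; git add f.txt; git commit -qm "first"
    echo two > f.txt; git commit -qam "second"
    git reset -q "$mode" HEAD~1
    git status --short
  )
  rm -rf "$repo"
}

state_after --soft    # "M  f.txt": the change is still staged
state_after --mixed   # " M f.txt": the change is kept but unstaged (default mode)
state_after --hard    # no output: the change is discarded
```

In the short status format the first column describes the index and the second the working tree, which is why the position of `M` differs between `--soft` and `--mixed`.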
---
## 11. Removing Commits
### **Remove Local (Unpushed) Commit**
```bash
# Undo the last commit, keep its changes staged
git reset --soft HEAD~1
# Interactive rebase for multiple commits
git rebase -i HEAD~3
# Change 'pick' to 'drop' or delete line
```
### **Remove Pushed Commit from Remote**
**⚠️ DANGER: Rewrites shared history**
```bash
# 1. Reset locally
git reset --hard HEAD~1
# 2. Force push (collaborators must coordinate)
git push --force-with-lease origin main
# 3. Alternative: Safer revert
git revert HEAD # Creates undoing commit
git push
```
**Team Coordination Required:**
```
1. Notify team before force push
2. Team runs: git fetch && git reset --hard origin/main
3. Use revert for shared production branches
```
### **Remove Specific Pushed Commit**
```bash
# Interactive rebase: start at the target's parent, then mark the target 'drop'
git rebase -i <specific-commit-hash>~1
# Or create revert
git revert <specific-commit-hash>
```
---
## 12. Stash Operations
**Temporary Storage:**
```bash
git stash push -m "WIP: API changes"
git stash list
git stash apply stash@{0} # Keep stash
git stash pop # Apply and remove
```
---
## 13. Tags and Releases
### **Tag Management**
```bash
# Lightweight tag
git tag v1.2.3
# Annotated tag (recommended)
git tag -a v1.2.3 -m "Release v1.2.3"
# Push tags
git push origin --tags
```
---
## 14. .gitignore Management
**Create/Update:**
```bash
touch .gitignore
```
**Common Patterns:**
```
# Dependencies
node_modules/
vendor/
# Logs
*.log
logs/
# Environment
.env
*.env.local
# OS
.DS_Store
Thumbs.db
```
**Apply Existing .gitignore:**
```bash
git rm -r --cached .
git add . && git commit -m "Apply .gitignore"
```
---
## 15. Configuration and Aliases
### **Editor and Pager**
```bash
git config --global core.editor "code --wait"
```
### **Productivity Aliases**
```bash
git config --global alias.st "status"
git config --global alias.co "checkout"
git config --global alias.br "branch -v"
# Single quotes keep "$*" from being expanded by the shell when the alias is defined
git config --global alias.cm '!f() { git add -A && git commit -m "$*"; }; f'
```
---
## 16. Troubleshooting and Recovery
### **Common Recovery**
```bash
# View all history (including resets)
git reflog
# Recover deleted branch
git checkout -b recovery-branch <commit-hash>
# Fix detached HEAD
git checkout main
```
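
The recovery path through the reflog can be rehearsed safely in a scratch repository. The sketch below is an illustration under assumptions (the helper `recover_demo`, the file name, and the `demo` identity are hypothetical): it discards a commit with a hard reset and then restores it from `HEAD@{1}`, the position HEAD held before the reset.

```bash
# Throwaway identity so commits work without any global Git configuration
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

recover_demo() {
  local repo
  repo=$(mktemp -d)
  (
    cd "$repo" || exit 1
    git init -q
    echo one > f.txt; git add f.txt; git commit -qm "first"
    echo two > f.txt; git commit -qam "second"
    lost=$(git rev-parse HEAD)        # hash of the commit about to be "lost"
    git reset -q --hard HEAD~1        # "second" is no longer reachable from any branch
    # HEAD@{1} is the reflog entry for where HEAD pointed before the reset
    git checkout -q -b recovery-branch 'HEAD@{1}'
    [ "$(git rev-parse HEAD)" = "$lost" ] && echo recovered
  )
  rm -rf "$repo"
}

recover_demo   # prints "recovered"
```

In a real incident, `git reflog` shows the same entries with their hashes, so the commit can also be named by hash instead of `HEAD@{1}`.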
---
## 17. Repository Cloning
```bash
# Standard clone
git clone <url>
# Specific branch
git clone -b develop <url>
# Shallow clone (history limited)
git clone --depth 1 <url>
```

---
## Key Git Concepts Explained
| Concept | Definition | Importance |
|---------|------------|-----------|
| **HEAD** | Current commit/branch pointer | Always points to active commit |
| **Index/Staging** | Intermediate area between working dir and repo | Prepares exact commit content |
| **Fast-forward** | Linear merge without merge commit | Clean history |
| **Detached HEAD** | HEAD points directly to commit | Use for inspection, create branch to save work |
| **Reflog** | Local history of HEAD movements | Recovery lifeline |
| **Force Push** | Overwrites remote history | Use only with team coordination |

**Document Version: 2.0**
*Optimized for DevOps workflows, CI/CD integration, and team collaboration*


@@ -1,87 +1,357 @@
# Kubernetes (K8s) Technical Documentation
## 1. Overview
**Kubernetes (K8s)** is an open-source container orchestration platform that automates the deployment, scaling, networking, and lifecycle management of containerized applications. It provides declarative configuration and self-healing capabilities to maintain the desired state of workloads.

Kubernetes follows a **control plane / worker node** architecture and is designed to run reliably at scale.

---
## 2. Kubernetes Architecture
A Kubernetes cluster consists of:
* **Control Plane nodes**: manage cluster state
* **Worker nodes**: run application workloads

---
## 3. Control Plane
The **Control Plane** is responsible for managing the overall cluster state. It does not normally run application workloads.
### 3.1 Control Plane Components
#### kube-apiserver
* Entry point to the Kubernetes cluster
* Exposes the Kubernetes REST API
* Validates requests and persists state to etcd
* All components communicate through the API server
#### etcd
* Distributed, consistent key-value store
* Stores all cluster state and configuration
* Uses the **Raft consensus algorithm**
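* An **odd number of members (3, 5, …)** is recommended: quorum is a majority of members, so an even-sized cluster tolerates no more failures than the next smaller odd size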
* Minimum recommended production setup: **3 etcd members**
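
The majority rule behind these recommendations is easy to check numerically. The sketch below is illustrative only; the function names `etcd_quorum` and `etcd_tolerance` are hypothetical helpers, not etcd commands.

```bash
# etcd_quorum N -> votes needed for a majority in an N-member cluster
etcd_quorum() { echo $(( $1 / 2 + 1 )); }

# etcd_tolerance N -> members that can fail while a majority is still reachable
etcd_tolerance() { echo $(( ($1 - 1) / 2 )); }

for n in 1 3 4 5; do
  echo "members=$n quorum=$(etcd_quorum "$n") tolerated_failures=$(etcd_tolerance "$n")"
done
# members=1 quorum=1 tolerated_failures=0
# members=3 quorum=2 tolerated_failures=1
# members=4 quorum=3 tolerated_failures=1
# members=5 quorum=3 tolerated_failures=2
```

Four members tolerate the same single failure as three, which is why the extra even member adds cost without adding resilience.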
#### kube-scheduler
* Assigns Pods to nodes
* Makes scheduling decisions based on:
* Resource requests and limits
* Node affinity / anti-affinity
* Taints and tolerations
* Pod affinity rules
#### kube-controller-manager
* Runs multiple controllers, including:
* Node Controller
* ReplicaSet Controller
* Deployment Controller
* Job Controller
* Ensures the actual cluster state matches the desired state
---
## 4. Worker Nodes
Worker nodes run application containers and system workloads.
### 4.1 Worker Node Components
#### kubelet
* Node agent running on each worker
* Responsibilities:
* Register the node with the API server
* Create and manage Pods
* Monitor Pod and container health
* Report node and Pod status
* Manage DaemonSet Pods
* Communicates with the container runtime via CRI
#### kube-proxy
* Handles networking and service routing
* Maintains iptables or IPVS rules
* Enables Service abstraction and load balancing
* Usually runs as a **DaemonSet**
#### Container Runtime
* Responsible for running containers
* Must be **CRI-compliant**
* Common runtimes:
* containerd (recommended)
* CRI-O
---
## 5. Container Runtime Interface (CRI)
**CRI (Container Runtime Interface)** is a Kubernetes API that allows kubelet to communicate with container runtimes.

Important clarification:
* CRI is **not a registry**
* It is an interface between kubelet and the container runtime
---
## 6. Cluster Networking & DNS
### 6.1 CoreDNS
* Kubernetes internal DNS service
* Runs as a **Deployment** (not DaemonSet in modern clusters)
* Provides service discovery inside the cluster
#### Default cluster domain
```
cluster.local
```
#### DNS formats
* Service:
```
<service-name>.<namespace>.svc.cluster.local
```
* Pod:
```
<pod-ip-with-dashes>.<namespace>.pod.cluster.local
```
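
The two name formats can be assembled mechanically. The sketch below is illustrative: `service_fqdn` and `pod_fqdn` are hypothetical helper names, the sample service, namespace, and IP are made up, and the cluster domain defaults to `cluster.local` as above. For Pod records the dots of the IP address are written as dashes.

```bash
# service_fqdn NAME NAMESPACE [CLUSTER_DOMAIN]
service_fqdn() { printf '%s.%s.svc.%s\n' "$1" "$2" "${3:-cluster.local}"; }

# pod_fqdn POD_IP NAMESPACE [CLUSTER_DOMAIN]; dots in the IP become dashes
pod_fqdn() { printf '%s.%s.pod.%s\n' "${1//./-}" "$2" "${3:-cluster.local}"; }

service_fqdn web prod       # web.prod.svc.cluster.local
pod_fqdn 10.244.1.5 prod    # 10-244-1-5.prod.pod.cluster.local
```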
---
## 7. Administration Tools
### kubeadm
* Tool for bootstrapping Kubernetes clusters
* Used to initialize control plane and join worker nodes
### kubectl
* Command-line interface to interact with Kubernetes API
* Used for deployment, debugging, inspection, and administration
### Lens
* Client-side GUI for Kubernetes
* Requires kubeconfig access
### Kubernetes Dashboard
* Server-side web UI
* Runs inside the cluster
* Requires RBAC configuration for access
---
## 8. Kubernetes Version & Runtime Compatibility
| Kubernetes Version | Docker Support |
| ------------------ | -------------------------------------------- |
| ≤ 1.23 | Docker Engine supported through the built-in dockershim |
| 1.24+ | dockershim removed; Docker Engine works only through the external `cri-dockerd` adapter |
| 1.25+ | Same as 1.24; images built with Docker still run unchanged on containerd |
**Recommendation:** Use `containerd` directly.
---
## 9. Node Roles & High Availability
### Control Plane
* Use an **odd number of nodes** (1, 3, 5…) when etcd runs on the control plane nodes
* Necessary for etcd quorum and fault tolerance
### Worker Nodes
* Can scale horizontally without restrictions
* Do not participate in control decisions
---
## 10. Pod Lifecycle Hooks
### postStart
* Executed immediately after container creation
* Runs asynchronously with container startup
* Failure causes container restart
### preStop
* Executed before container termination
* Commonly used for graceful shutdown
* Kubernetes waits for completion (within termination grace period)
---
## 11. Static Pods
* Managed directly by kubelet
* Defined via local manifest files
* Do **not** require API server scheduling
* Commonly used for core components:
* kube-apiserver
* etcd
* kube-controller-manager
---
## 12. Workload Types
Common Kubernetes workloads:
* Deployment
* ReplicaSet
* StatefulSet
* DaemonSet
* Job
* CronJob
Examples:
[https://k8s-examples.container-solutions.com/](https://k8s-examples.container-solutions.com/)
---
## 13. Scheduling Behavior
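The normal node-selection step does not apply to:
* **DaemonSet Pods**: the DaemonSet controller creates one Pod per eligible node and pins it there with node affinity (in current Kubernetes versions the default scheduler still performs the binding)
* **Static Pods**: the kubelet starts them directly from local manifests, without the scheduler

In both cases the target node is fixed in advance.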
---
## 14. Scaling
### Horizontal Scaling
* Adjust replica count
* Manual or automatic
### Vertical Scaling
* Adjust CPU and memory resources
* Typically requires a Pod restart
---
## 15. Autoscaling Components
### Horizontal Pod Autoscaler (HPA)
* Scales replicas based on:
* CPU
* Memory
* Custom metrics
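
The replica count the HPA aims for follows a simple ratio: `desired = ceil(current_replicas * current_metric / target_metric)`. The sketch below works through that arithmetic with integers. It is a simplification under assumptions: `hpa_desired` is a hypothetical helper, and the real controller additionally applies a tolerance band, stabilization windows, and the configured minimum and maximum replica counts.

```bash
# hpa_desired CURRENT_REPLICAS CURRENT_METRIC TARGET_METRIC
# Integer ceiling of replicas * current / target
hpa_desired() {
  local replicas=$1 current=$2 target=$3
  echo $(( (replicas * current + target - 1) / target ))
}

hpa_desired 4 90 60   # 6: load is 1.5x the target, so scale out
hpa_desired 5 70 50   # 7: ceil(7.0)
hpa_desired 4 30 60   # 2: load is half the target, so scale in
```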
### Vertical Pod Autoscaler (VPA)
Components:
1. **Recommender**: calculates resource recommendations
2. **Updater**: evicts Pods if needed
3. **Admission Controller**: applies recommendations at Pod creation
### Cluster Autoscaler (CA)
* Scales worker nodes up/down
* Integrates with cloud providers or node groups
---
## 16. Resource Management
### ResourceQuota
* Limits total resource usage per namespace
* Controls CPU, memory, object count, etc.
### LimitRange
* Sets default and maximum limits per Pod or container
* Applies at namespace level
---
## 17. Finalizers
* Prevent resource deletion until cleanup is complete
* Common use cases:
* External resource cleanup
* Storage detachment
* Object remains in `Terminating` state until finalizer is removed
---
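## 18. Deployment Update Strategies
Only `Recreate` and `RollingUpdate` are built-in Deployment strategies (`spec.strategy.type`); the remaining entries are release patterns built on top of Deployments, Services, and traffic routing.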
### Recreate
* Terminates old Pods before creating new ones
* Causes downtime
### RollingUpdate
* Gradual replacement
* Zero or minimal downtime
* Default for Deployments
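
How gradual the replacement is depends on `maxSurge` and `maxUnavailable`. The sketch below computes the resulting Pod-count bounds for percentage values. It assumes the Kubernetes defaults of 25% for both settings, with the surge rounded up and the unavailable count rounded down; `rolling_bounds` is a hypothetical helper name.

```bash
# rolling_bounds REPLICAS [SURGE_PERCENT] [UNAVAILABLE_PERCENT]
rolling_bounds() {
  local replicas=$1 surge_pct=${2:-25} unavail_pct=${3:-25}
  local surge=$(( (replicas * surge_pct + 99) / 100 ))   # maxSurge rounds up
  local unavail=$(( replicas * unavail_pct / 100 ))      # maxUnavailable rounds down
  echo "max_pods=$(( replicas + surge )) min_available=$(( replicas - unavail ))"
}

rolling_bounds 10   # max_pods=13 min_available=8
rolling_bounds 4    # max_pods=5 min_available=3
```

During the rollout the Deployment never runs more than `max_pods` Pods in total and keeps at least `min_available` of them ready.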
### Blue-Green Deployment
* Two environments (blue and green)
* Traffic switched after validation
### Canary Deployment
* Gradual traffic increase to new version
* Used for risk reduction
### A/B Testing
* Traffic split between versions
* Used for experimentation
### Shadow Testing
* New version receives production traffic without user impact
* Used for performance and behavior analysis
---
## 19. Services
### Service
* Provides stable networking and load balancing
* Uses label selectors to target Pods
### Headless Service
* No virtual IP
* Direct Pod DNS resolution
* Commonly used with StatefulSets (e.g., databases)


@@ -0,0 +1,216 @@
# MySQL Performance and Administration Guide for DevOps
This document covers essential MySQL configuration parameters, monitoring practices, data integrity checks, slow query tuning, and useful command-line tools for database administration.
## Table of Contents
- [MySQL Performance and Administration Guide for DevOps](#mysql-performance-and-administration-guide-for-devops)
  - [Table of Contents](#table-of-contents)
  - [Configuration Parameters](#configuration-parameters)
    - [max\_allowed\_packet](#max_allowed_packet)
    - [Error and Slow Query Logs](#error-and-slow-query-logs)
    - [skip\_name\_resolve](#skip_name_resolve)
  - [Initial Root Password and Access Control](#initial-root-password-and-access-control)
  - [Monitoring](#monitoring)
    - [Performance Schema and Information Schema](#performance-schema-and-information-schema)
    - [Percona Monitoring and Management (PMM)](#percona-monitoring-and-management-pmm)
  - [Data Corruption Checking](#data-corruption-checking)
  - [Slow Query Configuration Details](#slow-query-configuration-details)
  - [Tools](#tools)
    - [pt-stalk](#pt-stalk)
    - [pt-diskstats](#pt-diskstats)
    - [pt-summary](#pt-summary)
    - [mysqlcheck](#mysqlcheck)
---
## Configuration Parameters
### max_allowed_packet
```ini
max_allowed_packet = 128M
```
- **Purpose**: Defines the maximum size of a single communication packet between the MySQL client and server.
- **Best Practice**: For large BLOB/TEXT fields or large dumps, raise it, up to the maximum of `1G`. Adjust according to workload and available memory.
### Error and Slow Query Logs
Place these directives under the `[mysqld]` section:
```ini
[mysqld]
log-error = /var/log/mysql/error.log
slow_query_log = 1
slow_query_log_file = /var/log/mysql/slow.log
```
- `log-error`: Location of the error log file.
- `slow_query_log`: Enables slow query logging.
- `slow_query_log_file`: Path to the slow query log file.
### skip_name_resolve
```ini
skip_name_resolve
```
- **Effect**: Disables DNS lookups for client connections; the server matches accounts by IP address only, never by host name.
- **Benefit**: Improves connection speed and reduces DNS overhead. Use it only when every account's host part is an IP address, an IP range, or `localhost`.
---
## Initial Root Password and Access Control
After the first initialization of MySQL, the temporary root password is written to the error log (`/var/log/mysqld.log` on RPM-based installations). Use it to log in and change the password.
**Change root password:**
```sql
ALTER USER 'root'@'%' IDENTIFIED BY '123';
```
**Restrict access to a specific IP or range** (e.g., 192.168.1.0/24):
```sql
ALTER USER 'root'@'192.168.1.0/24' IDENTIFIED BY '123';
```
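> Replace `'123'` with a strong password and adjust the subnet as needed. `ALTER USER` only modifies an account that already exists, so create the subnet-restricted account with `CREATE USER` first if it is missing. CIDR notation in the host part requires MySQL 8.0.23 or later; older versions use a netmask such as `192.168.1.0/255.255.255.0`.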
---
## Monitoring
### Performance Schema and Information Schema
MySQL provides two built-in schemas for monitoring:
- **performance_schema**: Tracks server execution details at a low level (waits, events, statements, etc.).
- **information_schema**: Provides metadata about database objects (tables, columns, privileges, etc.).
### Percona Monitoring and Management (PMM)
PMM is an open-source monitoring solution that integrates with **Grafana** for dashboards and visualization. It collects metrics from MySQL, PostgreSQL, MongoDB, and system hosts.
**Key features**:
- Query analytics and slow query tracking.
- Real-time performance dashboards.
- Historical data retention.
**How to use**:
1. Install PMM Server (Docker or package) on a dedicated host.
2. Install PMM Client on each MySQL host.
3. Connect the client to the server:
`pmm-admin config --server-url=https://<pmm-server-ip>:443`
4. Add MySQL service:
`pmm-admin add mysql --username=root --password=<pwd>`
---
## Data Corruption Checking
Use `mysqlcheck` to verify table integrity.
**Check all databases:**
```bash
mysqlcheck --check --all-databases -u root -p
```
**Check a specific database:**
```bash
mysqlcheck --check <database_name> -u root -p
```
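The command reports any corrupted tables. If a repair is needed, `--repair` works for engines such as MyISAM; InnoDB does not support it, so a corrupted InnoDB table is typically restored from a backup or rebuilt from a dump.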
---
## Slow Query Configuration Details
Extended slow query log configuration:
```ini
slow_query_log = 1
slow_query_log_file = /var/log/mysql/mysql-slow.log
long_query_time = 2
```
- `long_query_time`: Queries that take more than `2` seconds are logged. Fractional seconds allowed (e.g., `0.5`).
- Additional useful parameters:
- `log_queries_not_using_indexes = 1` logs queries that do not utilise indexes.
- `log_slow_admin_statements = 1` logs slow administrative statements (OPTIMIZE, ANALYZE, ALTER).
After changes, restart MySQL: `sudo systemctl restart mysql`
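Once the slow query log is enabled, a quick sanity check is to summarise its `# Query_time:` header lines. The following is a minimal sketch, assuming the standard slow log file format; the function name `slow_log_summary` is made up for illustration and is not part of MySQL.

```bash
# slow_log_summary: hypothetical helper, not part of MySQL.
# Reads a slow query log on stdin and prints the number of logged
# queries and the largest Query_time value (in seconds).
slow_log_summary() {
  awk '
    /^# Query_time:/ {
      count++
      if ($3 + 0 > max) max = $3 + 0
    }
    END { printf "queries=%d max_query_time=%.3f\n", count, max }
  '
}
```

A possible invocation is `slow_log_summary < /var/log/mysql/mysql-slow.log`. If the count stays at zero under load, `long_query_time` may be set too high for the workload.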
---
## Tools
This section covers command-line tools from the **Percona Toolkit** (commonly used by DBAs and DevOps). The names in the original notes (`py-stals`, `py-disktats`, `pt-summery`) likely refer to `pt-stalk`, `pt-diskstats`, and `pt-summary`.
### pt-stalk
**Description**: Watches for a MySQL problem (e.g., high load, long lock wait) and collects diagnostic data when the problem occurs.
**Installation** (Ubuntu/Debian):
```bash
sudo apt-get install percona-toolkit
```
**Basic usage**:
```bash
pt-stalk --user=root --password=<pwd> --dest=/var/log/pt-stalk -- --defaults-file=/etc/mysql/my.cnf
```
- `--user`, `--password`: MySQL credentials.
- `--dest`: Directory where collected data will be stored.
- The `--` separates pt-stalk options from MySQL options.
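- By default, pt-stalk stays in the foreground and keeps watching until it is stopped; add `--daemonize` to run it in the background. `--run-time` sets how long each collection lasts (30 seconds by default), and `--no-stalk` collects immediately without waiting for a trigger.
**How to use**:
1. Configure the trigger with `--function`, `--variable`, `--threshold`, and `--cycles` (for example, the `Threads_running` status value or the processlist size).
2. Review the timestamp-prefixed files collected in the `--dest` directory after an incident to diagnose root causes.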
### pt-diskstats
**Description**: Analyzes disk I/O performance interactively, similar to `iostat`, but with more detailed per-device statistics and latency histograms.
**Basic usage**:
```bash
pt-diskstats --interval=5 --iterations=10
```
- `--interval`: Seconds between samples.
- `--iterations`: Number of samples (omit for infinite).
- You can restrict the output to specific devices with a regular expression: `pt-diskstats --devices-regex='sd[ab]'`
**How to use**:
- Monitor disk latency and IOPS in real time to identify storage bottlenecks for MySQL.
- Redirect output to a file for later analysis: `pt-diskstats --interval=2 > /tmp/io.log`.
### pt-summary
**Description**: Collects and prints a system overview: CPU, memory, disk, network, and other host-level details.
**Basic usage**:
```bash
pt-summary
```
**How to use**:
- Run before and after changes to capture baseline system state.
- Combine with `pt-mysql-summary` for MySQL-specific detail.
- The output helps quickly understand the environment when debugging performance issues.
### mysqlcheck
Already covered in the [Data Corruption Checking](#data-corruption-checking) section.

@@ -0,0 +1,161 @@
```markdown
# Benchmarking MySQL Performance
## Introduction
As a DevOps engineer, understanding MySQL performance under various workloads is critical for capacity planning, query optimization, and infrastructure tuning. Benchmarking provides repeatable, measurable insights into how your database behaves under stress. This document outlines standard methodologies, tools, and metrics for benchmarking MySQL effectively.
## Key Performance Metrics
Before running benchmarks, focus on these core metrics:
- **Throughput** - Transactions per second (TPS) or queries per second (QPS)
- **Latency** - Average, 95th, and 99th percentile response times
- **Concurrency** - How performance scales with increasing connections
- **Resource Utilization** - CPU, memory, disk I/O, and network usage on database host
- **Transaction Consistency** - Ensure ACID properties hold under load
## Benchmarking Tools
### Sysbench
The most common and flexible tool. Supports OLTP workloads, point selects, random reads/writes, and more.
Installation:
```bash
# Ubuntu/Debian
sudo apt install sysbench
# RHEL/CentOS
sudo yum install sysbench
```
### mysqlslap
Built-in MySQL utility for simulating client load. Simple but less customizable.
```bash
mysqlslap --host=localhost --user=root --password=secret \
--auto-generate-sql --concurrency=50 --iterations=3
```
### Other Tools
- **HammerDB** - Graphical TPC-C style benchmarking
- **tcpdump + pt-query-digest** - Analyze real production traffic
- **dbt2** - Open source TPC-C implementation
## Benchmark Methodology
### Prerequisites
1. **Isolate the environment** - Use a dedicated database server or cloud instance. Disable OS background services (backups, cron, monitoring) that interfere.
2. **Configure MySQL** - Match production settings (buffer pool, log file sizes, innodb_flush_log_at_trx_commit, etc.).
3. **Prepare data** - Use realistic data volumes. For sysbench, typically 10-100 million rows per table.
4. **Warm up the buffer pool** - Run a trial workload before measuring.
### Phases
1. **Plan** - Define workload type (read-heavy, write-heavy, mixed), duration, and concurrency levels.
2. **Prepare** - Create test tables and data.
3. **Run** - Execute benchmark with monitoring tools active (e.g., `htop`, `iostat`, `mysqladmin status`).
4. **Cleanup** - Remove test databases.
5. **Analyze** - Compare results against baseline.
## Example: Sysbench OLTP Benchmark
### 1. Prepare Data
Create 4 tables with 1 million rows each:
```bash
sysbench oltp_read_write \
--mysql-host=127.0.0.1 \
--mysql-port=3306 \
--mysql-user=sysbench \
--mysql-password=bench123 \
--mysql-db=testdb \
--tables=4 \
--table-size=1000000 \
prepare
```
### 2. Run the Benchmark
Execute with varying concurrency (e.g., 1, 4, 8, 16, 32, 64 threads):
```bash
sysbench oltp_read_write \
--mysql-host=127.0.0.1 \
--mysql-port=3306 \
--mysql-user=sysbench \
--mysql-password=bench123 \
--mysql-db=testdb \
--tables=4 \
--table-size=1000000 \
--threads=32 \
--time=300 \
--report-interval=10 \
run
```
Parameters explained:
- `--threads` - Number of concurrent clients
- `--time` - Benchmark duration in seconds (300 = 5 minutes)
- `--report-interval` - Print intermediate stats every N seconds
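The run step is usually repeated once per concurrency level. A minimal sketch of such a sweep is shown here, reusing the connection settings from the example; the function name `sysbench_sweep`, the `DRY_RUN` switch, and the `result_<threads>.txt` file names are assumptions made for illustration.

```bash
# sysbench_sweep: hypothetical wrapper around the run command above.
# Arguments are the thread counts to test, e.g. 1 4 8 16 32 64.
# Set DRY_RUN=1 to print the commands instead of executing them.
sysbench_sweep() {
  local threads cmd
  for threads in "$@"; do
    cmd="sysbench oltp_read_write --mysql-host=127.0.0.1 --mysql-port=3306 --mysql-user=sysbench --mysql-password=bench123 --mysql-db=testdb --tables=4 --table-size=1000000 --threads=${threads} --time=300 --report-interval=10 run"
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "$cmd"
    else
      # One result file per concurrency level, e.g. result_32.txt
      $cmd > "result_${threads}.txt"
    fi
  done
}
```

Running `DRY_RUN=1 sysbench_sweep 1 4 8 16 32 64` first lets you check the generated commands before committing to a sweep that takes five minutes per level.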
### 3. Clean Up
```bash
sysbench oltp_read_write \
--mysql-host=127.0.0.1 \
--mysql-user=sysbench \
--mysql-password=bench123 \
--mysql-db=testdb \
--tables=4 \
cleanup
```
## Analyzing Results
### Key Output from Sysbench
After a run, sysbench outputs:
```
SQL statistics:
queries performed:
read: 1091424
write: 311836
other: 155918
total: 1559178
transactions: 77958 (259.83 per sec.)
queries: 1559178 (5196.67 per sec.)
ignored errors: 0 (0.00 per sec.)
reconnects: 0 (0.00 per sec.)
General statistics:
total time: 300.0050s
total number of events: 77958
Latency (ms):
min: 4.01
avg: 123.07
max: 1152.19
95th percentile: 210.56
sum: 9591283.90
```
Critical metrics:
- **Transactions per second** - Primary throughput indicator
- **95th percentile latency** - Important for SLOs
- **Avg latency** - General responsiveness
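When comparing many runs, it helps to pull the two headline numbers out of each report automatically. The sketch below assumes the report layout shown in the sample output; `sysbench_metrics` is a hypothetical helper name, not a sysbench feature.

```bash
# sysbench_metrics: hypothetical helper.
# Reads a sysbench report on stdin and prints the transactions per second
# and the 95th percentile latency (milliseconds) on one line.
sysbench_metrics() {
  awk '
    $1 == "transactions:"               { tps = $3; gsub(/[()]/, "", tps) }
    $1 == "95th" && $2 == "percentile:" { p95 = $3 }
    END { printf "tps=%s p95_ms=%s\n", tps, p95 }
  '
}
```

For example, `sysbench_metrics < result_32.txt` would summarise a report saved to a file of that (assumed) name.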
### Interpreting Results
| Observation | Potential Cause |
|-------------|----------------|
| TPS scales linearly with threads up to a point | Healthy system, then bottleneck may shift |
| Latency spikes after certain concurrency | Contention on locks, mutexes, or I/O queue saturation |
| Dropping TPS at high concurrency | Context switching overhead or connection limits |
| High 95th vs avg latency | Occasional stalls (checkpointing, swapping, network latency) |

@@ -0,0 +1,189 @@
# InnoDB Storage Engine
This document provides an in-depth explanation of the InnoDB storage engine, its on-disk structures, memory management mechanisms (buffer pool), and change buffering. The target audience is database administrators and DevOps engineers who need to understand and tune InnoDB for performance and reliability.
## Table of Contents
1. What is InnoDB?
2. MySQL Data Directories Related to InnoDB
3. Pages in InnoDB
4. Index Pages
5. Tablespaces
6. Buffer and Buffer Pool
- Buffer Pool Metrics
- Configuration Example
7. Change Buffering
- Configuration Parameters
---
## 1. What is InnoDB?
InnoDB is a storage engine for MySQL that provides:
- ACID compliance (Atomicity, Consistency, Isolation, Durability)
- Row-level locking
- Foreign key constraints
- Crash recovery
- Multi-version concurrency control (MVCC)
It is the default storage engine for MySQL since version 5.5. InnoDB stores data in tablespaces, which are composed of pages.
---
## 2. MySQL Data Directories Related to InnoDB
In a typical MySQL installation, several directories are used to store InnoDB-related files. Understanding their purpose helps with backup, recovery, and capacity planning.
| Directory | Description |
|---------------------|-------------------------------------------------------------------------------------------------|
| `#innodb_redo`      | Contains redo log files. Redo logs record changes made to InnoDB data to ensure durability.     |
| `#innodb_temp`      | Stores temporary tablespaces used for internal temporary tables and on-disk temporary objects.  |
| `mysql` | The system schema that holds metadata (database names, tables, privileges, etc.). |
Even though `mysql` is not exclusively InnoDB, many system tables now use InnoDB by default.
---
## 3. Pages in InnoDB
A page is the smallest unit of storage in InnoDB. All data (table rows, indexes, etc.) is stored in pages.
- **Default page size**: 16 KB (can be configured to 4 KB, 8 KB, 32 KB, or 64 KB via `innodb_page_size`).
- **Structure**: Each page contains a header, a trailer (checksum), and the actual data.
- When a page is full, InnoDB allocates a new page to hold more data.
Pages are read from disk into memory (the buffer pool) and written back to disk when modified.
---
## 4. Index Page in InnoDB
An **index page** is a special type of page that stores index entries. InnoDB uses a B-tree data structure for both primary and secondary indexes.
- **Primary key index (clustered index)**: The leaf pages contain the actual row data for the table. The entire table is organised as a B-tree based on the primary key.
- **Secondary index**: Leaf pages contain the indexed column value and the primary key value (which is used to look up the full row in the clustered index).
Index pages are also 16 KB by default. Each index page contains pointers to child pages (for non-leaf levels) or the index records themselves (for leaf levels).
---
## 5. Tablespace
A tablespace is a logical storage container that holds InnoDB data. There are several types of tablespaces:
| Tablespace Type | Description |
|--------------------------|------------------------------------------------------------------------------------|
| System tablespace | Contains the data dictionary, doublewrite buffer, change buffer, and undo logs. |
| File-per-table tablespace| Each table has its own `.ibd` file (controlled by `innodb_file_per_table`). |
| General tablespaces | User-created tablespaces that can hold multiple tables. |
| Undo tablespace | Stores undo logs for MVCC and transaction rollback. |
| Temporary tablespace | Stores temporary tables created during queries or sessions (non-persistent). |
Each tablespace is divided into pages. The system tablespace (usually `ibdata1`) starts at 12 MB and grows as needed.
---
## 6. Buffer and Buffer Pool
### What is a Buffer?
A buffer is a memory area that temporarily holds data read from disk to reduce the number of direct disk I/O operations. In InnoDB, the main buffer is called the **buffer pool**.
### Buffer Pool
When a query requests data, InnoDB first checks whether the required pages are already present in the buffer pool:
- **If yes (cache hit)**: The data is returned directly from memory (extremely fast).
- **If no (cache miss)**: InnoDB reads the relevant pages from disk into the buffer pool, then serves the data from memory.
#### Recommended Size
A common best practice is to set the buffer pool size to approximately 75% of the available system memory on a dedicated database server. For shared servers, reduce the percentage accordingly.
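As a rough illustration of the 75% guideline, the value can be derived from the host's total memory. This is a sketch under stated assumptions: it reads the Linux `MemTotal` line (reported in kB), and `suggest_buffer_pool_mb` is a hypothetical helper name.

```bash
# suggest_buffer_pool_mb: hypothetical helper.
# Prints roughly 75% of MemTotal, in megabytes.
# $1 is the meminfo file to read (defaults to /proc/meminfo on Linux).
suggest_buffer_pool_mb() {
  awk '/^MemTotal:/ { printf "%d\n", $2 * 0.75 / 1024 }' "${1:-/proc/meminfo}"
}
```

The result can be turned into a configuration line with `echo "innodb_buffer_pool_size = $(suggest_buffer_pool_mb)M"`. Treat the number as a starting point only: other processes on the host and MySQL's own per-connection memory also need room.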
### Configuration Example
In MySQL configuration file (`my.cnf` or `my.ini`):
```ini
[mysqld]
innodb_buffer_pool_size = 1G
```
Alternatively, change it dynamically at runtime (MySQL 8.0+):
```sql
SET PERSIST innodb_buffer_pool_size = 1073741824; -- value in bytes
```
### Buffer Pool Metrics
These status variables help monitor buffer pool efficiency. Query them with:
```sql
SHOW GLOBAL STATUS LIKE 'innodb_buffer_pool%';
```
| Metric | Description |
|-------------------------------------|------------------------------------------------------------------------------------------------------|
| `Innodb_buffer_pool_reads` | Number of times InnoDB had to read a page from disk because it was not available in the buffer pool. High values indicate a shortage of buffer pool memory. |
| `Innodb_buffer_pool_read_requests` | Total number of logical read requests (page accesses) made to the buffer pool. |
| `Innodb_buffer_pool_wait_free` | Count of times a thread had to wait for a clean page to become available. Nonzero values suggest the buffer pool is under pressure (e.g., dirty page flushing is slow). |
| `Innodb_buffer_pool_pages_free` | Number of free pages currently in the buffer pool. Low values mean the buffer pool is nearly full. |
#### Interpreting Metrics
- **Cache hit ratio** = `(Innodb_buffer_pool_read_requests - Innodb_buffer_pool_reads) / Innodb_buffer_pool_read_requests`. Aim for >99%.
- If `Innodb_buffer_pool_wait_free` keeps increasing, consider increasing the buffer pool size or tuning flushing behaviour (`innodb_io_capacity`, `innodb_max_dirty_pages_pct`).
- Low `Innodb_buffer_pool_pages_free` alone is not a problem; it just shows the buffer pool is actively used.
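The cache hit ratio formula above can be evaluated directly from the status counters. The sketch below assumes name/value pairs separated by whitespace, as produced by the `mysql` client in batch mode; `buffer_pool_hit_ratio` is a hypothetical helper name.

```bash
# buffer_pool_hit_ratio: hypothetical helper.
# Expects "name value" lines on stdin, for example from
#   mysql -N -B -e "SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%'"
# and prints the hit ratio as a percentage with two decimals.
buffer_pool_hit_ratio() {
  awk '
    $1 == "Innodb_buffer_pool_read_requests" { requests = $2 }
    $1 == "Innodb_buffer_pool_reads"         { reads = $2 }
    END {
      if (requests > 0) printf "%.2f\n", (requests - reads) * 100 / requests
      else print "n/a"
    }
  '
}
```

Because both counters accumulate since server start, the result is a long-run average; comparing two samples taken a few minutes apart gives a better view of current behaviour.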
---
## 7. Change Buffering
Change buffering is a feature that delays writing changes to secondary index pages. Instead of immediately updating the index pages on disk when a non-unique secondary index is modified, InnoDB records the change in a special area called the **change buffer** (which is part of the system tablespace). Later, when the index pages are read into the buffer pool by other queries, the buffered changes are merged (applied) to the pages.
This reduces random disk I/O and improves performance for workloads with many Data Manipulation Language (DML) operations (INSERT, UPDATE, DELETE) that affect secondary indexes.
### Configuration Parameters
Both parameters are set in the MySQL configuration file.
#### `innodb_change_buffering`
Controls which operations are buffered. Possible values:
| Value | Description |
|-----------|--------------------------------------------------------------------------|
| `none` | Do not buffer any changes. |
| `inserts` | Buffer only insert operations. |
| `deletes` | Buffer only delete-marking operations. |
| `changes` | Buffer inserts and delete-marking operations (but not actual purges). |
| `purges` | Buffer only the physical deletion of rows that occur during background purge. |
| `all` | Buffer inserts, delete-marking, and purges (default value in MySQL 8.0). |
Example configuration:
```ini
[mysqld]
innodb_change_buffering = all
```
#### `innodb_change_buffer_max_size`
Specifies the maximum size of the change buffer as a percentage of the total buffer pool size. The default is 25 (meaning 25% of the buffer pool). Valid range is 0 to 50.
Increasing this value allows more space for buffered changes, which can help workloads with heavy DML on secondary indexes, but it reduces the space available for cached data pages.
Example:
```ini
[mysqld]
innodb_change_buffer_max_size = 30
```
### When to Tune Change Buffering
- **Write-heavy OLTP**: Keep `innodb_change_buffering = all` and possibly increase `innodb_change_buffer_max_size` to between 30 and 40.
- **Read-only or mostly reads**: Set `innodb_change_buffering = none` to avoid wasting buffer pool memory.
- **Unique indexes**: Change buffering does not apply to unique secondary indexes because uniqueness checks require immediate disk access.

@@ -0,0 +1,46 @@
## Zombie Processes
### What is a Zombie Process?
In Linux/Unix operating systems, when a process ends, its execution is halted, but it leaves behind an entry in the process table. This entry contains the process's exit status, which needs to be read by its parent process.
A **zombie process** (or defunct process, indicated by the `Z` state in `ps` output) is a child process that has completed its execution, but its parent process has not yet called the `wait()` or `waitpid()` system calls to read its exit status. Because the parent hasn't acknowledged the death, the OS keeps the child's entry in the process table.
### The Effect of Zombie Processes
At first glance, a zombie process seems harmless:
* It consumes **$0$** CPU resources.
* It consumes **$0$** Memory (RAM).
**The Danger: PID Exhaustion**
The only resource a zombie consumes is an entry in the OS process table and a Process ID (PID). Operating systems have a maximum limit of PIDs available (often $32768$ by default, tunable through the `kernel.pid_max` sysctl on Linux). If a poorly written parent process continuously spawns children and never reaps them, the system will eventually run out of available PIDs.
When PID exhaustion occurs, the OS cannot create any new processes. You won't be able to SSH into the server, execute basic commands, or spawn new application threads, effectively bringing the system down.
### How to Identify Zombies
* **Using `top`:** The header will explicitly show a counter for zombie processes.
* **Using `ps`:** List the PIDs of all processes with a `Z` (Zombie) state:
```bash
# Match any state that starts with Z (for example Z, Z+ or Zs)
ps aux | awk '$8 ~ /^Z/ { print $8 " " $2 }'
```
### How to "Kill" a Zombie Process
**Important Rule:** You cannot kill a zombie process directly. Even `kill -9 <zombie_pid>` (SIGKILL) will not work because the process is already dead. To clear a zombie, you must deal with its **parent process**.
**Step 1: Find the Parent Process ID (PPID)**
Find out which process spawned the zombie:
```bash
ps -o ppid= -p <zombie_pid>
```
**Step 2: Ask the parent to reap the child**
Send a `SIGCHLD` signal to the parent process. This acts as a gentle reminder for the parent to execute the `wait()` system call and clean up its children.
```bash
kill -s SIGCHLD <parent_pid>
```
**Step 3: Kill the Parent Process (If Step 2 fails)**
If the parent process is poorly programmed, hung, or ignoring the `SIGCHLD` signal, your only operational choice is to kill the parent process:
```bash
kill -9 <parent_pid>
```
*Note on Step 3:* When the parent dies, the zombie process becomes an "orphan". The OS kernel automatically reassigns all orphan processes to the init system (usually `systemd` or `init`, which is PID $1$). PID $1$ is specifically designed to routinely execute `wait()` and will instantly reap the zombie, finally clearing it from the process table.
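Since the fix always targets the parent, it is often more useful to group zombies by parent PID than to list them one by one. The sketch below does that on `ps` output; `zombies_by_parent` is a hypothetical helper name, and the column layout is an assumption stated in the comments.

```bash
# zombies_by_parent: hypothetical helper.
# Expects "PID PPID STAT COMMAND" rows on stdin, for example from
#   ps -eo pid=,ppid=,stat=,comm=
# and prints one line per parent: "<ppid> <number of zombie children>".
zombies_by_parent() {
  awk '$3 ~ /^Z/ { n[$2]++ } END { for (p in n) print p, n[p] }' | sort -n
}
```

A possible invocation is `ps -eo pid=,ppid=,stat=,comm= | zombies_by_parent`. A parent with a steadily growing count is the process to investigate in Steps 2 and 3.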

@@ -0,0 +1,386 @@
## 1. Overview
jq is a lightweight and powerful command-line tool for parsing, filtering, transforming, and formatting JSON data.
In DevOps workflows, `jq` is commonly used to:
* Analyze logs (Docker, Kubernetes, application logs)
* Filter observability data (metrics/events in JSON format)
* Debug CI/CD pipelines
* Process API responses (AWS, GitHub, Terraform outputs)
* Transform JSON for automation scripts
It is essentially the “grep + awk + sed” equivalent for JSON.
---
## 2. Installation
### Linux (Debian/Ubuntu)
```bash
sudo apt-get update
sudo apt-get install jq
```
### RHEL/CentOS
```bash
sudo yum install jq
```
### macOS
```bash
brew install jq
```
### Verify installation
```bash
jq --version
```
---
## 3. Basic Syntax
```bash
jq '<filter>' file.json
```
Or pipe input:
```bash
cat file.json | jq '<filter>'
```
---
## 4. Core Concepts
### 4.1 Identity filter
Returns input as-is:
```bash
jq '.'
```
### 4.2 Access fields
```bash
jq '.name'
jq '.user.id'
```
### 4.3 Arrays
```bash
jq '.items[]'
```
### 4.4 Pretty print
```bash
jq '.'
```
---
## 5. Filtering Logs (DevOps Use Case)
### Example log entry
```json
{
"level": "error",
"service": "auth",
"message": "invalid credentials",
"status": 401,
"timestamp": "2026-04-15T10:00:00Z"
}
```
### Filter only errors
```bash
jq 'select(.level == "error")'
```
### Filter by service
```bash
jq 'select(.service == "auth")'
```
### Extract specific fields
```bash
jq '{time: .timestamp, msg: .message}'
```
---
## 6. Working with Arrays (Common in Logs)
### Example: multiple log entries
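A small sample array makes the commands in this section easier to try. The entries below are invented for illustration and reuse the field names from the single log entry in section 5; `sample_logs` is a hypothetical helper, not part of `jq`.

```bash
# sample_logs: prints a made-up JSON array of log entries (illustrative data only).
sample_logs() {
  cat <<'EOF'
[
  {"level": "error", "service": "auth", "message": "invalid credentials", "status": 401},
  {"level": "info",  "service": "api",  "message": "request completed",  "status": 200},
  {"level": "error", "service": "api",  "message": "upstream timeout",   "status": 504}
]
EOF
}

sample_logs | jq 'length'                               # number of entries in the array
sample_logs | jq -c '.[] | select(.status >= 500)'      # entries with a 5xx status
sample_logs | jq -c '.[] | {service, status, message}'  # keep only three fields
```

Note that these filters expect a single JSON array as input, which is not how most log streams are shaped; line-delimited logs need `jq -s` to be collected into an array first.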
### Count entries
```bash
jq 'length'
```
### Filter array elements
```bash
jq '.[] | select(.status >= 500)'
```
### Extract fields from array
```bash
jq '.[] | {service, status, message}'
```
---
## 7. Kubernetes Logs with jq
### Example:
```bash
kubectl logs pod-name -n default | jq
```
### Filter error logs
```bash
kubectl logs pod-name | jq 'select(.level=="error")'
```
### Extract container metadata logs
```bash
kubectl logs pod-name | jq '{time, container, message}'
```
---
## 8. Docker Logs with jq
### Streaming logs
```bash
docker logs container_name | jq
```
### Filter failures
```bash
docker logs container_name | jq 'select(.status != "success")'
```
---
## 9. AWS / Cloud Logs (JSON-based)
### Example CloudWatch JSON logs
```bash
aws logs filter-log-events --log-group-name my-app | jq
```
### Extract messages only
```bash
... | jq '.events[].message'
```
### Filter by keyword
```bash
... | jq '.events[] | select(.message | contains("ERROR"))'
```
---
## 10. Transforming JSON (Automation Use Cases)
### Rename fields
```bash
jq '{userId: .id, username: .name}'
```
### Add computed fields
```bash
jq '. + {isActive: true}'
```
### Build new structure
```bash
jq '{users: [.[] | {id, name}]}'
```
---
## 11. Advanced Filtering
### Logical conditions
```bash
jq 'select(.status == 200 and .service == "api")'
```
### Regex matching
```bash
jq 'select(.message | test("timeout|failed"))'
```
### Sorting
```bash
jq 'sort_by(.timestamp)'
```
### Unique values
```bash
jq 'unique_by(.service)'
```
---
## 12. Aggregations (DevOps Analytics)
### Count by status
```bash
jq 'group_by(.status) | map({status: .[0].status, count: length})'
```
### Error rate estimation
```bash
jq 'map(select(.status >= 400)) | length'
```
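The command above yields only the number of failing entries. To get an actual rate, divide by the total; and since most log streams are newline-delimited rather than a single array, the sketch below uses `-s` (slurp) to gather the entries first. `error_rate` is a hypothetical helper name, and the `status` field is assumed to be numeric.

```bash
# error_rate: hypothetical helper.
# Reads newline-delimited JSON log entries on stdin; -s (slurp) collects
# them into one array so that length can be applied to the whole set.
# Prints the fraction of entries whose status is 400 or higher.
error_rate() {
  jq -s 'if length == 0 then 0
         else (map(select(.status >= 400)) | length) / length end'
}
```

For example, `kubectl logs my-pod | error_rate` would report the share of failing requests, provided every log line is a JSON object with a `status` field.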
---
## 13. Formatting Output for Humans
### Compact JSON
```bash
jq -c '.'
```
### Raw output (no quotes)
```bash
jq -r '.message'
```
### Tabular-like output
```bash
jq -r '[.timestamp, .level, .message] | @tsv'
```
---
## 14. Debugging Pipelines
### Validate JSON
```bash
jq empty file.json
```
### Highlight structure
```bash
jq '. | type'
```
### Pretty inspect nested structures
```bash
jq 'paths'
```
---
## 15. DevOps Best Practices
### 1. Always validate JSON first
```bash
jq empty
```
### 2. Use `-c` in pipelines
Reduces log noise:
```bash
jq -c '.'
```
### 3. Use `-r` for scripting
```bash
jq -r '.field'
```
### 4. Combine with grep when needed
```bash
grep ERROR app.log | jq
```
### 5. Avoid unnecessary formatting in CI/CD
Keep output machine-readable.
---
## 16. Common Patterns Cheat Sheet
| Task | Command |
| --------------- | ------------------------------ |
| Pretty print | `jq '.'` |
| Filter by field | `jq 'select(.field=="value")'` |
| Extract field | `jq '.field'` |
| Array iteration | `jq '.[]'` |
| Count items | `jq 'length'` |
| Convert to text | `jq -r '.field'` |
| Compact output | `jq -c '.'` |
---
## 17. Real DevOps Example Pipeline
### Analyze application logs
```bash
cat app.log | jq -c 'select(.level=="error") | {time, service, message}'
```
### Kubernetes debugging
```bash
kubectl logs my-pod | jq -c 'select(.status>=500)'
```
### CI/CD artifact inspection
```bash
cat terraform-output.json | jq '.outputs'
```

@@ -1,18 +1,18 @@
-# ⚙️ PS Command
+# PS Command
 The `ps` (process status) command is used to **view running processes** on a Linux system. It's useful for monitoring and troubleshooting tasks.
 ---
-## 🧾 Basic Usage
+## Basic Usage
-### 🔍 Show tasks in the current shell
+### Show tasks in the current shell
 ```bash
 ps
 ```
-### 🔍 Show tasks in the current shell with **full info**
+### Show tasks in the current shell with **full info**
 ```bash
 ps -f
@@ -20,9 +20,9 @@ ps -f
 ---
-## 🌍 View System-Wide Processes
+## View System-Wide Processes
-### 📋 Show **all** processes
+### Show **all** processes
 ```bash
 ps -A
@@ -32,17 +32,17 @@ ps -e
 ---
-### 👤 Show tasks by **specific user**
+### Show tasks by **specific user**
 ```bash
 ps -u <username>
 ```
-📌 Replace `<username>` with the actual user name.
+Replace `<username>` with the actual user name.
 ---
-### 📊 Show **detailed info for all** tasks
+### Show **detailed info for all** tasks
 ```bash
 ps aux
@@ -50,7 +50,7 @@ ps aux
 ---
-## 📘 Output Fields Explained
+## Output Fields Explained
 | Column | Description |
 | --------- | -------------------------------------------------- |
@@ -58,22 +58,20 @@ ps aux
 | `PID` | Process ID |
 | `%CPU` | CPU usage percentage |
 | `%MEM` | Memory usage percentage |
-| `STAT` | Process state: `R` (running), `S` (sleeping), etc. |
+| `STAT` | Process state: `R` (running), `S` (sleeping), `Z` (zombie), etc. |
 | `START` | Time when the process started |
 | `TIME` | Total CPU time used |
 | `COMMAND` | Command that started the process |
-### 📑 Show List Jobs
+### Show List Jobs
 ```bash
 jobs
 ```
-### 🔄Move Process From Background To Forground
+### Move Process From Background To Foreground
 ```bash
 fg
 ```

@@ -0,0 +1,190 @@
# diff Command Reference
The `diff` command is a standard Unix/Linux utility used to compare files line by line. It is commonly used in development, DevOps, and system administration to identify changes between configuration files, source code, logs, or generated outputs.
---
## 1. Basic File Comparison
Compare two files line by line:
```bash
diff file1 file2
```
### Output Behavior
* Shows only the lines that differ
* Uses symbols to indicate changes:
* `<` line from `file1`
* `>` line from `file2`
* `c` change
* `a` addition
* `d` deletion
---
## 2. Side-by-Side Comparison
Display files next to each other:
```bash
diff -y file1 file2
```
### Notes
* Useful for human-readable comparison
* Differences are shown in two columns
* Change indicators appear in the middle
Limit output width:
```bash
diff -y --width=120 file1 file2
```
Suppress common lines:
```bash
diff -y --suppress-common-lines file1 file2
```
---
## 3. Unified Diff Format (Most Common)
Generate a unified diff:
```bash
diff -u file1 file2
```
### Why Unified Diff
* Standard format used by Git, patch, and code reviews
* Shows context before and after changes
* Easier to read and apply
Example output markers:
* `+` added lines
* `-` removed lines
* `@@` line numbers and context
---
## 4. Save Diff Output to a File
Redirect diff output to a file:
```bash
diff -u file1 file2 > different.diff
```
Common use cases:
* Code reviews
* Patch creation
* Change tracking
* CI/CD artifact storage
---
## 5. Recursive Directory Comparison
Compare directories:
```bash
diff -r dir1 dir2
```
Unified recursive diff:
```bash
diff -ru dir1 dir2
```
Report only which files differ, without printing the differences:
```bash
diff -rq dir1 dir2
```
---
## 6. Ignore Differences
Ignore whitespace:
```bash
diff -w file1 file2
```
Ignore blank lines:
```bash
diff -B file1 file2
```
Ignore case differences:
```bash
diff -i file1 file2
```
---
## 7. Apply a Diff as a Patch
Create a patch:
```bash
diff -u oldfile newfile > change.patch
```
Apply patch:
```bash
patch < change.patch
```
Dry-run patch:
```bash
patch --dry-run < change.patch
```
---
## 8. diff vs Git diff
| diff | git diff |
| -------------------- | ------------------------------- |
| Compares any files | Compares Git-tracked files |
| Works without Git | Requires Git repository |
| Produces patch files | Integrated with version control |
---
## 9. Best Practices
* Use `-u` format for readability and compatibility
* Store diff files with `.diff` or `.patch` extensions
* Avoid committing generated diff files unless required
* Use `diff` for system configuration audits
* Use `git diff` inside Git repositories
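For configuration audits in scripts, the exit status of `diff` is usually more convenient than its output: `0` means the files are identical, `1` means they differ, and `2` means something went wrong (for example, a missing file). A minimal sketch follows; `config_drift` is a hypothetical helper name.

```bash
# config_drift: hypothetical helper for a configuration audit.
# Compares a reference file ($1) with a live file ($2) and reports the
# result based on the exit status of diff.
config_drift() {
  diff -u "$1" "$2" > /dev/null 2>&1
  case $? in
    0) echo "in sync" ;;
    1) echo "drift detected" ;;
    *) echo "comparison failed" ;;
  esac
}
```

Keeping "different" and "failed" as separate outcomes matters in automation: treating an unreadable file as mere drift can hide a broken audit.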
---
## 10. Quick Reference
```bash
diff file1 file2
diff -y file1 file2
diff -u file1 file2
diff -ru dir1 dir2
diff -u file1 file2 > change.diff
```

@@ -0,0 +1,243 @@
# Essential Date and Time Manipulation Commands
This document covers common shell commands for working with dates and times. These commands are frequently used in scripting, logging, monitoring, automation, and system administration tasks.
---
## 1. The `date` Command Overview
The `date` command is a standard Unix/Linux utility used to:
* Display the current system date and time
* Format timestamps
* Convert between human-readable dates and Unix epoch time
* Perform date arithmetic
Check system time:
```bash
date
```
---
## 2. Creating Epoch Timestamps
### Current Epoch Time (Seconds)
```bash
date +%s
```
Output example:
```
1737154200
```
### Current Epoch Time (Milliseconds)
```bash
date +%s%3N
```
Notes:
* `%s` → seconds since Unix epoch
* `%3N` → milliseconds (GNU date only)
---
### Convert Specific Date to Epoch (Milliseconds)
Convert a human-readable date to epoch time:
```bash
date -d "2026-01-13 14:31:26" +%s%3N
```
Common use cases:
* Log correlation
* API timestamps
* CI/CD pipeline timing
* Monitoring and alerting
---
## 3. Formatting Current Date and Time
Custom output formats are widely used in scripts and logs.
### Common Date Formats
| Command | Description | Example Output |
| --------------------------- | --------------------- | ---------------------------- |
| `date` | Default system format | Sat Jan 17 10:30:00 UTC 2026 |
| `date +'%Y-%m-%d %H:%M:%S'` | ISO-like format | 2026-01-17 10:30:00 |
| `date +'%Y-%m-%d'` | Date only | 2026-01-17 |
| `date +'%H:%M:%S'` | Time only | 10:30:00 |
| `date +%s` | Epoch (seconds) | 1768645800 |
### Common Format Specifiers
| Specifier | Meaning |
| --------- | --------------- |
| `%Y` | Year (4 digits) |
| `%m` | Month (01-12) |
| `%d` | Day (01-31) |
| `%H` | Hour (00-23) |
| `%M` | Minute (00-59) |
| `%S` | Second (00-60) |
| `%N` | Nanoseconds |
---
## 4. Converting Epoch to Human-Readable Time
Convert epoch seconds to readable format:
```bash
date -d @1737154200
```
Using a variable:
```bash
EPOCH_TIME=1737154200
date -d @"$EPOCH_TIME"
```
Formatted output:
```bash
date -d @"$EPOCH_TIME" '+%Y-%m-%d %H:%M:%S'
```
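The two conversions can be combined into a round trip. The sketch below assumes GNU `date` and pins both directions to UTC with `-u`, so the result does not depend on the machine's local time zone; the sample timestamp is an arbitrary choice.

```bash
# Human-readable -> epoch seconds (parsed as UTC because of -u).
epoch=$(date -u -d "2026-01-17 10:30:00" +%s)
echo "$epoch"        # 1768645800

# Epoch seconds -> human-readable, again in UTC.
readable=$(date -u -d "@$epoch" '+%Y-%m-%d %H:%M:%S')
echo "$readable"     # 2026-01-17 10:30:00
```

Without `-u`, the input string would be interpreted in the local time zone and the epoch value would shift accordingly.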
---
## 5. Date Arithmetic
GNU `date` supports flexible date calculations.
### Relative Dates
| Command | Description |
| ----------------------- | ----------------------- |
| `date -d "yesterday"` | Previous day |
| `date -d "tomorrow"` | Next day |
| `date -d "2 days ago"` | Two days in the past |
| `date -d "+1 hour"` | One hour from now |
| `date -d "+30 minutes"` | Thirty minutes from now |
| `date -d "+1 week"` | One week from now |
| `date -d "2 weeks ago"` | Two weeks ago |
---
## 6. Date Arithmetic with Formatting
Example: date 7 days from now:
```bash
date -d "+7 days" '+%Y-%m-%d'
```
Example: timestamp 15 minutes ago (epoch):
```bash
date -d "-15 minutes" +%s
```
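Because epoch values are plain integers, the difference between two dates reduces to shell arithmetic. A minimal sketch, assuming GNU `date`; the two dates are arbitrary examples, and UTC is used so that daylight-saving shifts cannot skew the division.

```bash
# Convert both dates to epoch seconds, then divide the gap by one day.
start=$(date -u -d "2026-01-01" +%s)
end=$(date -u -d "2026-01-17" +%s)
days=$(( (end - start) / 86400 ))
echo "Days between: $days"
```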
---
## 7. Working with UTC and Time Zones
Display current UTC time:
```bash
date -u
```
Format UTC time:
```bash
date -u '+%Y-%m-%d %H:%M:%S'
```
Display the current time in a specific time zone:
```bash
TZ=UTC date
TZ=America/New_York date
```
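The same `TZ` mechanism can render one fixed instant in several zones, which helps when correlating logs from different regions. A sketch assuming GNU `date` and an installed tz database; the epoch value is an arbitrary example.

```bash
instant=1768645800   # arbitrary example: 2026-01-17 10:30:00 UTC

# Render the same instant in two zones; %Z prints the zone abbreviation.
utc_time=$(TZ=UTC date -d "@$instant" '+%Y-%m-%d %H:%M:%S')
ny_time=$(TZ=America/New_York date -d "@$instant" '+%Y-%m-%d %H:%M:%S %Z')

echo "UTC:      $utc_time"
echo "New York: $ny_time"
```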
---
## 8. Script-Friendly Usage Examples
Add timestamp to a log entry:
```bash
echo "$(date '+%Y-%m-%d %H:%M:%S') Application started"
```
Generate a timestamped filename:
```bash
backup_file="backup_$(date +%Y%m%d_%H%M%S).tar.gz"
echo "$backup_file"
```
Measure execution time:
```bash
start=$(date +%s)
# command here
end=$(date +%s)
echo "Duration: $((end - start)) seconds"
```
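For scripts that log in many places, the timestamp logic can live in one helper. A minimal sketch; the function name `log` and the ISO 8601 UTC format are choices made for this example rather than conventions taken from this document.

```bash
# Print a message prefixed with an ISO 8601 UTC timestamp.
log() {
  printf '%s %s\n' "$(date -u '+%Y-%m-%dT%H:%M:%SZ')" "$*"
}

line=$(log "Application started")
echo "$line"
```

Keeping the format in one place makes it easy to change later without touching every `echo`.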
---
## 9. macOS Compatibility Notes
macOS uses BSD `date`, which differs from GNU `date`.
Example difference:
```bash
# GNU (Linux)
date -d "yesterday"
# BSD (macOS)
date -v -1d
```
Install GNU date on macOS:
```bash
brew install coreutils
gdate -d "yesterday"
```
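When a script has to run on both platforms without installing extra packages, one option is to detect the flavor at runtime. The sketch below assumes that only GNU `date` accepts `--version` and treats everything else as BSD; the function name `yesterday` is made up for this example.

```bash
# Print yesterday's date as YYYY-MM-DD on either GNU or BSD date.
yesterday() {
  if date --version >/dev/null 2>&1; then
    date -d "yesterday" '+%Y-%m-%d'   # GNU date accepts --version and -d
  else
    date -v-1d '+%Y-%m-%d'            # assume BSD date otherwise
  fi
}

yesterday
```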
---
## 10. Best Practices
* Use UTC for logs and distributed systems
* Store timestamps as epoch values when possible
* Format dates only at display time
* Avoid locale-dependent formats in scripts
* Be aware of GNU vs BSD `date` differences

@@ -0,0 +1,159 @@
# ELK Stack Overview (DevOps Notes)
## What is ELK?
**ELK** stands for:
* **Elasticsearch**
* **Logstash**
* **Kibana**
The ELK Stack is a powerful platform used for **log management, monitoring, data analysis, and observability**. It is widely used in DevOps for **centralized logging, troubleshooting, and performance monitoring**.
---
## Core Components
### 1. Elasticsearch
* Distributed, REST-based **search and analytics engine**
* Used for **storing, indexing, and searching logs and metrics**
* Built on Apache Lucene
* Highly scalable and fast for full-text search
**Key Responsibilities:**
* Store logs and events
* Index data for fast search
* Support aggregations and analytics
---
### 2. Logstash
* **Data processing pipeline**
* Ingests data from multiple sources
* Transforms, parses, enriches, and forwards data
**Pipeline Stages:**
```
Input → Filter → Output
```
**Examples of filters:**
* grok (parse logs)
* mutate (modify fields)
* date (timestamp handling)
* geoip (location enrichment)
---
### 3. Kibana
* **Visualization and analytics UI**
* Connects directly to Elasticsearch
* Used for:
* Dashboards
* Log exploration
* Metrics visualization
* Alerts and reporting
---
## Beats (Data Shippers)
**Beats** are lightweight agents installed on servers to collect and send data to Elasticsearch or Logstash.
Common Beats:
* **Filebeat**: collects log files
* **Metricbeat**: system and service metrics (CPU, memory, disk)
* **Heartbeat**: uptime and availability monitoring
* **Packetbeat**: network traffic analysis
* **Auditbeat**: security and audit data
**Role:**
* Data collection
* Minimal resource usage
* Sends data to Logstash or directly to Elasticsearch
---
## Fluentd
* **Cloud-native log aggregator and processor**
* Alternative to Logstash
* Common in Kubernetes environments
**Responsibilities:**
* Collect logs from multiple sources
* Enrich and transform data
* Route logs to multiple destinations (Elasticsearch, S3, Kafka)
---
## Typical ELK Architecture
```
Server / Application
  ↓
Filebeat
  ↓
Logstash
  ↓
Elasticsearch
  ↓
Kibana
```
> Note: In some setups, Beats can send data **directly to Elasticsearch** (Logstash optional).
---
## Database Concepts vs Elasticsearch Concepts
| Traditional Database | Elasticsearch |
| -------------------- | -------------------------- |
| Database | Index |
| Schema | Mapping |
| Table | Index (Type is deprecated) |
| Column | Field |
| Row | Document |
| Primary Key | Document ID |
> ⚠️ **Note:** `Type` is deprecated in modern Elasticsearch versions (7+).
---
## Elasticsearch Data Model
* **Index**: Logical namespace for documents
* **Document**: JSON object containing data
* **Field**: Key-value pair in a document
* **Mapping**: Defines field types and structure
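
To make these terms concrete, the sketch below builds a small payload in the NDJSON shape accepted by the Elasticsearch `_bulk` API. The index name `app-logs`, the document IDs, and the field names are hypothetical, and nothing is sent to a cluster.

```bash
index="app-logs"   # hypothetical index name

# Each document takes two lines: an action line naming the index and
# document ID, followed by the JSON document whose keys are the fields.
bulk_payload=$(printf '%s\n' \
  "{\"index\":{\"_index\":\"$index\",\"_id\":\"1\"}}" \
  '{"@timestamp":"2026-01-17T10:30:00Z","level":"ERROR","message":"disk full"}' \
  "{\"index\":{\"_index\":\"$index\",\"_id\":\"2\"}}" \
  '{"@timestamp":"2026-01-17T10:31:00Z","level":"INFO","message":"cleanup done"}')

echo "$bulk_payload"
```

If no explicit mapping exists for the index, Elasticsearch infers field types from the first documents it receives (dynamic mapping).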
---
## Why ELK in DevOps?
* Centralized logging
* Faster incident response
* Debugging distributed systems
* Monitoring infrastructure and applications
* Security analysis (SIEM use cases)
---
## Summary
* **Elasticsearch** → Storage & search engine
* **Logstash / Fluentd** → Data processing & enrichment
* **Beats** → Lightweight data collectors
* **Kibana** → Visualization & dashboards
The ELK Stack enables DevOps teams to **observe, analyze, and troubleshoot systems at scale**.


@@ -0,0 +1,220 @@
# ELK Node Types
## Overview
The ELK Stack (Elasticsearch, Logstash, Kibana) is commonly deployed using multiple node types (roles) to ensure scalability, performance, and resilience. This document outlines the main node types used in production-grade ELK deployments from a DevOps perspective.
---
## 1. Elasticsearch Node Types
Elasticsearch nodes can be assigned one or more roles. In production environments, roles are usually separated for stability and performance.
### 1.1 Master Node (Dedicated Master)
**Purpose:** Cluster coordination and management
Responsibilities:
* Manages cluster state
* Controls shard allocation
* Handles node joins and failures
Best Practices:
* Deploy 3 dedicated master nodes (odd number for quorum)
* Do not assign data or ingest roles
* Require minimal CPU and disk, but stable memory
Configuration:
```yaml
node.roles: [ master ]
```
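The preference for an odd number of master-eligible nodes follows from majority arithmetic: electing a master requires more than half of the voting nodes. The loop below is plain arithmetic that illustrates this, not an Elasticsearch API call.

```bash
# For n master-eligible nodes, a majority is floor(n/2) + 1.
for n in 1 2 3 4 5; do
  quorum=$(( n / 2 + 1 ))
  tolerated=$(( n - quorum ))
  echo "masters=$n quorum=$quorum tolerated_failures=$tolerated"
done
```

Three and four nodes both tolerate a single failure, so a fourth master adds cost without adding resilience.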
---
### 1.2 Data Nodes
**Purpose:** Store data and execute search and indexing operations
#### a. Hot Data Node
* Handles recent and high-traffic data
* Requires fast SSD storage
* Heavy indexing and querying workload
```yaml
node.roles: [ data_hot ]
```
#### b. Warm Data Node
* Stores less frequently accessed data
* Moderate CPU and disk requirements
```yaml
node.roles: [ data_warm ]
```
#### c. Cold Data Node
* Stores rarely accessed data
* Optimized for cost efficiency
```yaml
node.roles: [ data_cold ]
```
#### d. Frozen Data Node
* Archival data with searchable snapshots
* Minimal local storage requirements
```yaml
node.roles: [ data_frozen ]
```
---
### 1.3 Coordinating Node
**Purpose:** Query routing and result aggregation
Characteristics:
* No data storage
* No master role
* Acts as a load balancer for search requests
Use Case:
* Kibana and client applications connect to coordinating nodes
```yaml
node.roles: [ ]
```
---
### 1.4 Ingest Node
**Purpose:** Data preprocessing before indexing
Responsibilities:
* Executes ingest pipelines
* Performs grok parsing, enrichment, geoip, and transformations
* Reduces load on data nodes
```yaml
node.roles: [ ingest ]
```
---
### 1.5 Machine Learning Node
**Purpose:** Run machine learning jobs
Use Cases:
* Anomaly detection
* Advanced analytics
```yaml
node.roles: [ ml ]
```
---
### 1.6 Transform Node
**Purpose:** Data transformation and aggregation
Use Cases:
* Pivot and latest transforms
* Pre-aggregated indices
```yaml
node.roles: [ transform ]
```
---
## 2. Logstash Node Types
Logstash does not use formal roles but is deployed based on function.
### 2.1 Ingest / Collector Nodes
* Receive data from Beats, syslog, Kafka, etc.
* Minimal processing
### 2.2 Processing Nodes
* Perform heavy parsing and enrichment
* CPU-intensive workloads
### 2.3 Output Nodes
* Focused on reliable delivery to Elasticsearch
---
## 3. Kibana Node Types
### 3.1 Kibana Server Node
* Provides UI and REST API
* Stateless and horizontally scalable
### 3.2 Reporting / Task Manager Node
* Handles scheduled tasks and reporting
* Often separated in large deployments
---
## 4. Beats and Agents (Edge Nodes)
Although not part of the core ELK stack, Beats are critical for data collection.
Common Beats:
* Filebeat: Log collection
* Metricbeat: System and service metrics
* Auditbeat: Security events
* Heartbeat: Uptime and endpoint monitoring
---
## 5. Typical Production Architectures
### Small Cluster
* 3 nodes with combined roles (master, data, ingest)
### Medium to Large Cluster
* 3 Dedicated Master Nodes
* Hot, Warm, and Cold Data Nodes
* Dedicated Ingest Nodes
* Coordinating Nodes
* Optional ML and Transform Nodes
---
## 6. Node Role Summary
| Node Type | Purpose |
| --------------------------- | ---------------------------- |
| Master | Cluster coordination |
| Data (Hot/Warm/Cold/Frozen) | Data storage and querying |
| Coordinating | Query routing |
| Ingest | Data preprocessing |
| ML | Anomaly detection |
| Transform | Data aggregation |
| Logstash | Data pipeline |
| Kibana | Visualization and management |

188
README.md

@@ -1,57 +1,111 @@
# 🐧 DevOps Documents # 🐧 DevOps Knowledge Base
A curated collection of scripts, configuration files, and guides for managing and configuring Linux-based systems. This personal repository serves as a comprehensive knowledge base to simplify deployment, automation, monitoring, security, and much more. > 🚀 *Your centralized hub for Linux, DevOps, and Infrastructure mastery*
A structured and ever-growing collection of **scripts, configurations, and hands-on guides** designed to simplify:
* ⚙️ Automation
* 🐳 Containerization
* 📊 Monitoring
* 🔐 Security
* ☁️ Cloud & Infrastructure
--- ---
## 📂 Repository Structure ## 🧭 Quick Navigation
### ⚙️ Configuration Management & Automation ### ⚙️ Configuration & Automation
- [Ansible](./Configuration%20Management%20&%20Automation/Ansible)
- [CronJob](./Configuration%20Management%20&%20Automation/CronJob)
### 🐳 Containerization & Orchestration * 🔹 Ansible
- [Docker](./Containerization%20&%20Orchestration/Docker) * 🔹 CronJobs
- [Kubernetes (In Progress)](./Containerization%20&%20Orchestration/Kubernetes)
### 🐳 Containers & Orchestration
* 🔹 Docker
* 🔹 Kubernetes *(Work in Progress)*
* 🔹 Dozzle
### ☁️ Cloud
* 🔹 AWS
### 🗄️ Databases ### 🗄️ Databases
- [PostgreSQL](./Databases/Postgresql)
* 🔹 PostgreSQL
* 🔹 MariaDB
### ⚡ Caching ### ⚡ Caching
- [Redis](./Caching/redis)
* 🔹 Redis
### 💻 Code Management ### 💻 Code Management
- [Git](./Code%20Management/Git)
* 🔹 Git
* 🔹 GitLab (CI/CD, Cache, Baremetal Setup)
### 🔀 High Availability ### 🔀 High Availability
- [HAProxy](./High%20Availability/Ha-Proxy)
* 🔹 HAProxy
### 📊 Monitoring & Logging ### 📊 Monitoring & Logging
- [Grafana](./Monitoring%20&%20Logging/Grafana)
- [LibreNMS](./Monitoring%20&%20Logging/Librenms)
- [Netdata](./Monitoring%20&%20Logging/Netdata)
- [Zabbix](./Monitoring%20&%20Logging/Zabbix)
### 🔐 Networking & Security * 🔹 Grafana
- [iptables](./Security%20&%20Networking/Iptables) * 🔹 Zabbix
- [Nmap](./Security%20&%20Networking/Nmap) * 🔹 Netdata
- [Nginx](./Security%20&%20Networking/Nginx) * 🔹 LibreNMS
- [File Sharing](./Security%20&%20Networking/FileSharing) * 🔹 ELK Stack
### 📦 Storage ### 🔐 Security & Networking
- [NFS](./Storage/NFS)
### 🧠 System & Kernel Management * 🔹 iptables
- [Kernel](./System%20&%20Kernel%20Management/Kernel) * 🔹 Nmap
* 🔹 tcpdump
* 🔹 hping3
* 🔹 File Sharing (SMB)
### 📦 Storage & Object Systems
* 🔹 NFS
* 🔹 MinIO
* 🔹 S5CMD
### 🧠 Linux & System Administration
* 🔹 Bash Scripting
* 🔹 System Administration
* 🔹 File Synchronization (rsync)
* 🔹 Terminal Multiplexers (screen)
### 🔁 Web Servers & Reverse Proxies ### 🔁 Web Servers & Reverse Proxies
- [Nginx (Web)](./Web%20Servers%20&%20Reverse%20Proxies/Nginx)
### 🤖 Bots & Automation Tools * 🔹 Nginx
- [Telegram Bot](./Bots%20&%20Automation%20Tools/TelegramBot) * 🔹 Certbot
* 🔹 Nextcloud
### 📝 Miscellaneous ### 🔑 Password Management
- [Info](./Info)
* 🔹 Vaultwarden
### 🖥️ Virtualization & Dev Environments
* 🔹 Vagrant
### 🤖 Automation & Bots
* 🔹 Telegram Bot
---
## 🗂️ Documentation Structure
This repository is organized into **topic-based directories**, each containing:
* 📘 Step-by-step guides
* ⚡ Real-world configurations
* 🧪 Practical examples
* 🧾 Ready-to-use scripts
> 💡 Each section is self-contained—start anywhere based on your needs.
--- ---
@@ -60,47 +114,73 @@ A curated collection of scripts, configuration files, and guides for managing an
```bash ```bash
git clone https://github.com/RadinPirouz/linux-documents.git git clone https://github.com/RadinPirouz/linux-documents.git
cd linux-documents cd linux-documents
```` ```
* Explore each folder for setup guides, scripts, and configuration examples. 📌 Then:
* Follow individual README or documentation files inside each directory before running any scripts.
1. Navigate to the relevant category
2. Open the `.md` documentation files
3. Follow instructions step-by-step
--- ---
## 📌 Notes ## 🧪 Philosophy
* Tested on **Debian/Ubuntu** and **CentOS/RHEL**-based distributions. This knowledge base is built on:
* ⚠️ Always review and test configurations in a staging environment before applying to production.
* ✅ Practical, real-world usage
* ✅ Minimal theory, maximum application
* ✅ Copy-paste friendly configs
* ✅ Modular learning approach
---
## ⚠️ Important Notes
* 🐧 Tested on:
* Debian / Ubuntu
* CentOS / RHEL
* 🚨 Always:
* Review configs before running
* Test in staging environments
* Understand before deploying to production
--- ---
## 🤝 Contributing ## 🤝 Contributing
Contributions are welcome! 🛠️ Want to improve this knowledge base? You're welcome!
1. Fork the repository. ```bash
2. Create a new branch: # 1. Fork the repo
`git checkout -b feature/YourFeature` # 2. Create your feature branch
3. Commit your changes: git checkout -b feature/your-feature
`git commit -m "Add new config for X"`
4. Push to the branch:
`git push origin feature/YourFeature`
5. Open a Pull Request 🙌
Please ensure your code is tested and well-documented. # 3. Commit changes
git commit -m "Add: your feature"
# 4. Push to GitHub
git push origin feature/your-feature
```
Then open a Pull Request 🙌
--- ---
## 📬 Contact ## 📬 Contact & Support
Questions or feedback? Reach out: * 💬 Telegram: [https://t.me/RadinPirouz](https://t.me/RadinPirouz)
* 🐛 Issues: [https://github.com/RadinPirouz/linux-documents/issues](https://github.com/RadinPirouz/linux-documents/issues)
* 💬 Telegram: [@RadinPirouz](https://t.me/RadinPirouz)
* 🐛 GitHub Issues: [Open an Issue](https://github.com/RadinPirouz/linux-documents/issues)
--- ---
## ⭐ Support ## ⭐ Support the Project
If you find this repository useful, please give it a ⭐ and share it with others! If this helped you:
* ⭐ Star the repository
* 🔁 Share it with others
* 🧠 Use it, improve it, contribute back


@@ -0,0 +1,233 @@
# BIND9 DNS Forwarder Configuration Guide
## 1. Installing BIND9
```bash
sudo apt install bind9
```
### Explanation
BIND9 (Berkeley Internet Name Domain) is one of the most widely used DNS servers. In this setup, it will act as a **DNS forwarder**, meaning it forwards DNS queries to upstream servers instead of resolving them recursively from root servers.
---
## 2. Configuration Overview
The configuration snippet defines how BIND9 behaves as a DNS server. It is typically located in:
```
/etc/bind/named.conf.options
```
---
## 3. Detailed Configuration Breakdown
### Global Options Block
```conf
options {
directory "/var/cache/bind";
```
* `directory`: Sets BIND's working directory; relative file paths in the configuration are resolved against it.
* `/var/cache/bind`: The default on Debian/Ubuntu, used for cached DNS data and other files BIND writes at runtime.
---
### Forwarders
```conf
forwarders {
192.168.1.10;
8.8.8.8;
1.1.1.1;
};
```
* Defines upstream DNS servers to which queries are forwarded.
* `192.168.1.10`: Likely an internal DNS server (e.g., corporate or local network).
* `8.8.8.8`: Public DNS server provided by Google.
* `1.1.1.1`: Public DNS server provided by Cloudflare.
**Behavior:**
* Queries that BIND cannot answer from its cache or from locally hosted zones are sent to these servers.
---
### DNSSEC Validation
```conf
dnssec-validation no;
```
* Disables DNSSEC (DNS Security Extensions) validation.
* DNSSEC ensures DNS responses are authentic and not tampered with.
**Why disable it?**
* Simplicity in lab or internal environments.
* Avoid issues if upstream servers or zones are misconfigured.
**Production note:**
* It is generally recommended to enable DNSSEC in secure environments.
---
### Listening Interfaces
```conf
#listen-on { any; };
# listen-on-v6 { any; };
listen-on port 53 { 127.0.0.1; };
listen-on-v6 { none; };
```
* `listen-on port 53 { 127.0.0.1; };`
* BIND listens only on the loopback interface (localhost).
* This means only the local machine can query this DNS server.
* `listen-on-v6 { none; };`
* Disables IPv6 listening.
* Commented lines:
* `#listen-on { any; };` would allow all IPv4 interfaces.
* `#listen-on-v6 { any; };` would enable IPv6 support.
**Implication:**
* This configuration is suitable for a **local DNS resolver**, not a network-wide DNS server.
---
### Forwarding Mode
```conf
forward only;
```
* Forces BIND to **only use forwarders**.
* It will not attempt full recursive resolution if forwarders fail.
**Behavior:**
* If all forwarders fail → DNS resolution fails.
---
### Query Access Control
```conf
allow-query { any; };
```
* Allows any client to query the DNS server.
**Note:**
* Safe here because the server only listens on `127.0.0.1`.
---
### Recursion Settings
```conf
recursion yes;
allow-recursion { any; };
```
* `recursion yes;`
* Enables recursive DNS resolution (required for a caching resolver).
* `allow-recursion { any; };`
* Allows all clients to use recursion.
**Important:**
* In public-facing servers, unrestricted recursion can lead to abuse (e.g., DNS amplification attacks).
* In this case, it is safe due to localhost restriction.
---
## 4. Summary of Behavior
This configuration sets up BIND9 as:
* A **local DNS forwarder**
* Listening only on **localhost (127.0.0.1)**
* Forwarding queries to:
* Internal DNS: `192.168.1.10`
* Public DNS: `8.8.8.8`, `1.1.1.1`
* Performing recursion via forwarders only
* Not using DNSSEC validation
* Not exposed to external clients
---
## 5. Typical Use Cases
* Local development environments
* Caching DNS resolver for a single machine
* Forwarding DNS queries inside containers or VMs
* Acting as a DNS proxy for internal services
---
## 6. Recommendations for Production
* Enable DNSSEC validation:
```conf
dnssec-validation auto;
```
* Restrict recursion to a named ACL (`trusted_network` is a placeholder that must be defined with an `acl` statement):
```conf
allow-recursion { trusted_network; };
```
* Bind to specific internal interfaces instead of localhost if needed:
```conf
listen-on { 192.168.1.0/24; };
```
* Implement logging for observability
---
## 7. Restarting the Service
After making changes:
```bash
sudo systemctl restart bind9
```
To check status:
```bash
sudo systemctl status bind9
```
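A syntax error in `named.conf.options` will stop BIND from starting, so it can help to validate the configuration before restarting. A sketch assuming `named-checkconf` (shipped with the BIND utilities) is on the `PATH` and the service is named `bind9` as above; the function name is made up for this example.

```bash
# Restart BIND only if the configuration parses cleanly.
restart_bind_if_valid() {
  if sudo named-checkconf; then
    sudo systemctl restart bind9
  else
    echo "named-checkconf reported errors; service not restarted" >&2
    return 1
  fi
}
```

Call `restart_bind_if_valid` after editing the configuration; a non-zero exit status means the running service was left untouched.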
---
## 8. Testing DNS Resolution
```bash
dig google.com @127.0.0.1
```
* Confirms that the local BIND server is resolving queries correctly via forwarders.


@@ -0,0 +1,293 @@
# BIND9 Zone File and SOA Configuration Guide
## 1. What is a Zone File
A **zone file** defines DNS records for a specific domain. It maps domain names to IP addresses and other resources.
In this example, we are configuring a zone for:
```
test.com
```
---
## 2. SOA (Start of Authority) Record
### Example
```conf
$TTL 120
@ IN SOA test.com. admin.test.com. (
        1       ; serial
        86400   ; refresh
        7200    ; retry
        57600   ; expire
        3600 )  ; minimum (negative caching TTL)
```
### Explanation
#### `$TTL 120`
* Default Time To Live for all records in this zone.
* Value is in seconds (120 seconds = 2 minutes).
* Controls how long DNS responses are cached.
---
### SOA Record Structure
```
@ IN SOA <primary-ns> <admin-email> (
<serial>
<refresh>
<retry>
<expire>
<minimum>
)
```
#### Fields Breakdown
* `@`
* Refers to the root of the zone (`test.com`).
* `IN`
* Internet class (standard for DNS).
* `SOA`
* Start of Authority record. Defines the authoritative source for the zone.
---
### SOA Parameters
* **Primary Nameserver**
```
test.com.
```
* The authoritative DNS server for this zone.
* Must be a fully qualified domain name (FQDN).
* **Admin Email**
```
admin.test.com.
```
* Represents `admin@test.com`.
* The `@` is replaced with a dot in DNS format, and the name ends with a trailing dot because it is written as an FQDN.
---
### Timing Parameters
* **Serial**
```
1
```
* Version number of the zone.
* Must be incremented on every change.
* Secondary DNS servers use this to detect updates.
* **Refresh (86400 seconds = 24 hours)**
* How often secondary servers check for updates.
* **Retry (7200 seconds = 2 hours)**
* Retry interval if refresh fails.
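* **Expire (57600 seconds = 16 hours)**
* Time after which secondary servers discard the zone if they cannot reach the primary.
* This example value is shorter than the refresh interval (24 hours), which is only acceptable for a lab setup; production zones normally set expire well above refresh plus retry, and RFC 1912 suggests 2 to 4 weeks.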
* **Minimum TTL (3600 seconds = 1 hour)**
* Default negative caching time (NXDOMAIN responses).
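
A widely used convention for the serial (not a BIND requirement) is `YYYYMMDDNN`: the date of the change plus a two-digit revision counter. A minimal sketch; the `revision` value is a placeholder to bump for each additional change made on the same day.

```bash
revision=1   # nth change made today (placeholder)

# Date of the change followed by a zero-padded two-digit counter.
serial=$(printf '%s%02d' "$(date -u +%Y%m%d)" "$revision")
echo "$serial"
```

The result always has ten digits and stays within the 32-bit range of the serial field, and any date-based value is larger than a small hand-written serial such as `1`, so secondaries will pick up the change.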
---
## 3. DNS Records in the Zone
### Example Zone File
```conf
@ IN NS test.com.
@ IN A 10.10.30.1
www IN CNAME docs.test.com.
docs IN A 10.10.20.1
```
---
### NS Record
```conf
@ IN NS test.com.
```
* Defines the authoritative nameserver for the domain.
* `test.com.` must resolve to an IP (via an A record).
---
### A Record
```conf
@ IN A 10.10.30.1
```
* Maps `test.com` → `10.10.30.1`.
---
### CNAME Record
```conf
www IN CNAME docs.test.com.
```
* `www.test.com` becomes an alias of `docs.test.com`.
* DNS queries for `www` will resolve to the IP of `docs`.
---
### Additional A Record
```conf
docs IN A 10.10.20.1
```
* Maps `docs.test.com` → `10.10.20.1`.
---
## 4. The Trailing Dot in DNS
### Example
```
test.com.
```
### Explanation
* The trailing dot (`.`) indicates a **fully qualified domain name (FQDN)**.
* Without the dot, BIND appends the current zone name.
#### Example Behavior
* `docs.test.com` (no dot)
→ interpreted as `docs.test.com.test.com`
* `docs.test.com.` (with dot)
→ interpreted correctly as `docs.test.com`
**Rule:**
* Always use a trailing dot for absolute domain names in zone files.
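
The rule can be mimicked in a few lines of shell. This only illustrates the expansion logic and is not how BIND is implemented; `qualify` is a made-up helper, `test.com.` stands in for the zone origin, and the special `@` name is not handled.

```bash
# Expand a name the way a zone file does: names ending in "." are absolute,
# anything else gets the zone origin appended.
qualify() {
  name=$1
  origin=$2
  case "$name" in
    *.) printf '%s\n' "$name" ;;
    *)  printf '%s.%s\n' "$name" "$origin" ;;
  esac
}

qualify "docs.test.com"  "test.com."   # docs.test.com.test.com.
qualify "docs.test.com." "test.com."   # docs.test.com.
qualify "docs"           "test.com."   # docs.test.com.
```

The third call shows the intended use of relative names: short labels such as `docs` or `www` are safe to write without a dot.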
---
## 5. Zone Configuration in BIND
### File: `/etc/bind/named.conf.local`
```conf
zone "test.com" IN {
type master;
file "/etc/bind/zones/test.com.zone";
};
```
### Explanation
* `zone "test.com"`
* Declares the domain being managed.
* `type master`
* This server is the authoritative source for the zone.
* `file`
* Path to the zone file.
---
## 6. Validating the Zone File
```bash
named-checkzone test.com /etc/bind/zones/test.com.zone
```
### Purpose
* Validates syntax and logic of the zone file.
* Detects:
* Missing dots
* Invalid records
* Formatting errors
---
## 7. Applying Configuration Changes
### Reconfigure BIND
```bash
rndc reconfig
```
* Reloads the BIND configuration files.
* Loads newly added zones; existing zone files are not re-read, even if they have changed.
---
### Reload Specific Zone
```bash
rndc reload test.com
```
* Reloads only the `test.com` zone.
* Faster and more efficient than restarting the entire service.
---
## 8. Key Operational Notes
* Always increment the **serial number** after modifying the zone.
* Use `named-checkzone` before applying changes.
* Prefer `rndc reload` over full service restart for production systems.
* Ensure proper file permissions for `/etc/bind/zones/`.
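
These notes combine into a small deployment routine. A sketch assuming `named-checkzone` and `rndc` are installed and the caller has permission to use them; the function name `reload_zone` and its arguments are hypothetical.

```bash
# Validate a zone file and reload that zone only if validation succeeds.
# usage: reload_zone <zone-name> <zone-file>
reload_zone() {
  zone=$1
  file=$2
  if named-checkzone "$zone" "$file"; then
    rndc reload "$zone"
  else
    echo "Zone $zone failed validation; not reloaded" >&2
    return 1
  fi
}
```

For the zone in this guide the call would be `reload_zone test.com /etc/bind/zones/test.com.zone`, run after the serial has been incremented.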
---
## 9. Summary
This setup defines:
* A **master DNS zone** for `test.com`
* Authoritative records:
* Root domain (`test.com`)
* `docs.test.com`
* Alias `www.test.com`
* Proper SOA configuration for synchronization
* DNS validation and reload workflow using BIND tools


@@ -0,0 +1,248 @@
# 01. Information: What is `hping3`?
## Overview
`hping3` is a powerful network tool used primarily for:
- Crafting and sending custom TCP/IP packets
- Testing firewalls and intrusion detection systems (IDS/IPS)
- Network scanning, mapping, and discovery
- Performance and connectivity testing (latency, MTU, path issues)
From a DevOps/SRE perspective, `hping3` is like a “Swiss Army knife” for low-level network troubleshooting and security-oriented testing. It allows you to send packets with very precise control over headers and flags, which goes far beyond what tools like `ping` or `traceroute` can do.
> Note: `hping3` should be used only on networks and systems you are authorized to test. It can easily be mistaken for malicious traffic.
---
## Key Capabilities
### 1. Custom Packet Crafting
`hping3` lets you build packets with specific parameters:
- **IP layer**:
- Source/destination IP
- TTL, fragmentation, IP ID
- **TCP layer**:
- Source/destination port
- Flags (SYN, ACK, FIN, RST, PSH, URG)
- Sequence/ack numbers
- **UDP & ICMP**:
- Custom payloads
- Port selection (UDP)
- ICMP type and code
This is useful for:
- Reproducing odd traffic patterns seen in logs
- Simulating client behavior at the packet level
- Testing how devices and middleboxes handle specific combinations of flags
---
### 2. Stateful Firewall & IDS Testing
Because `hping3` can manipulate flags and headers, it is commonly used to test:
- Firewall rules (ingress/egress)
- NAT behavior
- IDS/IPS detection and blocking
Examples of what you can validate:
- Whether SYN packets to certain ports are correctly blocked or allowed
- How a firewall responds to fragmented packets
- Whether “stealth” scans are detected by security tooling
---
### 3. Port Scanning and Host Discovery
`hping3` can act as a flexible port scanner:
- TCP SYN scans on specific ports or ranges
- FIN/XMAS/NULL scans to observe firewall behavior
- Host discovery based on custom probes (TCP/UDP/ICMP)
While tools like `nmap` are more convenient for general scanning, `hping3` is useful when you need precise control over how probes are sent or you want to emulate specific traffic patterns.
---
### 4. Network Performance & Path Testing
`hping3` can be used to measure:
- Round-trip time (RTT) for various protocols and ports
- Packet loss and jitter under different conditions
- MTU/path issues with fragmentation control
Typical use cases:
- Measuring latency to a specific TCP port (e.g., 443) instead of relying on ICMP `ping`
- Determining whether ICMP is blocked and testing alternative paths with TCP/UDP
- Debugging connectivity problems through stateful devices that treat ICMP differently from TCP
---
### 5. Traceroute-like Functionality
`hping3` can perform traceroute-style path discovery, but using TCP or UDP instead of ICMP:
- Helps when ICMP is filtered or rate-limited
- Shows how TCP packets to specific ports traverse the network
This is useful when:
- ICMP-based `traceroute` doesn't give meaningful results
- You need path information for application ports (e.g., 80, 443, 5432)
---
## Why DevOps/SRE Engineers Care
In modern environments (cloud, containers, microservices), networking problems often involve:
- Security groups, NACLs, firewalls
- Load balancers and proxies
- Overlay networks (e.g., Kubernetes CNI)
- Complex routing or NAT
`hping3` helps you:
- Validate security rules (e.g., between Kubernetes nodes, across VPCs/VNETs)
- Troubleshoot unusual connectivity issues that don't show up with `ping`
- Investigate asymmetrical routing or stateful filtering
- Reproduce network conditions reported by applications or logs
It is especially valuable when standard utilities (`ping`, `curl`, `telnet`, `nc`) aren't enough to reveal how packets are handled in transit.
---
## TCP Flags & Special Packets (FIN, URG, RST, XMAS) and Flooding
`hping3` gives you direct control over TCP flags. Understanding these is crucial for using it correctly and interpreting responses.
### FIN (Finish) flag / FIN packet
- **What it is**:
The FIN flag indicates that the sender has finished sending data and wants to gracefully close the TCP connection.
- **Normal use**:
Used at the end of a TCP session as part of the connection teardown (FIN/ACK, ACK).
- **In scanning/testing**:
- A **FIN scan** sends packets with only the FIN flag set to a port.
- On a **closed port**, the target should respond with `RST`.
- On an **open port**, many TCP/IP stacks ignore the packet (no response).
This behavior is used to infer whether ports are open/filtered without sending SYN packets that might be logged more aggressively.
### URG (Urgent) flag / URG packet
- **What it is**:
URG marks that some of the data in the TCP segment is “urgent” and should be prioritized by the receiving host.
- **Normal use**:
Rarely used in modern applications. Historically used for things like interrupt signals.
- **In scanning/testing**:
Setting the URG flag along with other flags can:
- Stress or test how TCP stacks handle unusual or rarely seen combinations
- Help detect middleboxes that mishandle or log such packets
Tools like `hping3` can create URG packets to see how targets or firewalls react.
### RST (Reset) flag / RST packet
- **What it is**:
The RST flag instructs the receiver to immediately terminate the TCP connection.
- **Normal use**:
- Sent when a packet arrives for a port where no service is listening.
- Used to abort a connection abruptly (e.g., when a process crashes or refuses a connection).
- **In scanning/testing**:
- When you send a SYN to a **closed** port, a typical response is a `RST` packet.
- Tools use the presence or absence of RST to determine whether a port is open or closed.
- You can also send RST packets to tear down existing connections (for testing, in controlled environments).
### XMAS packet
- **What it is**:
A “XMAS” (Christmas tree) packet is a TCP packet with multiple flags set at once, commonly: **FIN, PSH, URG**.
- **Why the name**:
It's called a “Christmas tree” packet because many flags are “lit up” at the same time, like lights on a tree.
- **In scanning/testing**:
- Used for **XMAS scans**.
- Similar to FIN scans:
- On **closed** ports, the host often responds with `RST`.
- On **open** ports, many stacks send no reply.
- Some older or non-standard TCP/IP stacks respond differently, leaking information about OS type or configuration.
- **Firewall/IDS behavior**:
XMAS packets are unusual and often treated as suspicious, so many devices log or drop them, which can be useful for testing detection.
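
At the header level each flag is a single bit, so these special packets are just different bit combinations. The sketch below uses the standard TCP flag bit values to compute the flags byte for the packets described above; it is plain arithmetic and sends nothing.

```bash
# Standard TCP flag bits (low six bits of the flags field).
FIN=$((0x01)); SYN=$((0x02)); RST=$((0x04))
PSH=$((0x08)); ACK=$((0x10)); URG=$((0x20))

xmas=$(( FIN | PSH | URG ))   # XMAS scan: FIN + PSH + URG
null=0                        # NULL scan: no flags set

printf 'FIN  = 0x%02x\n' "$FIN"
printf 'XMAS = 0x%02x\n' "$xmas"
printf 'NULL = 0x%02x\n' "$null"
```

Seeing the values side by side makes it clear why such packets stand out in captures: a flags byte with FIN, PSH, and URG set together does not occur in an ordinary handshake or teardown.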
---
## What is a Flood?
In the context of `hping3` and network testing, a **flood** means sending a very high rate of packets to a target, typically as fast as possible.
- **Purpose in legitimate testing**:
- Stress-test network devices (firewalls, load balancers, routers).
- Identify bottlenecks or performance limits in network paths.
- Observe how systems behave under heavy packet load (Do they drop packets? Do they rate-limit?).
- **Types of floods (conceptually)**:
- **SYN flood**: flood of TCP SYN packets to a port.
- **ICMP flood**: flood of ICMP echo requests.
- **UDP flood**: flood of UDP packets.
- **Use in `hping3`**:
- `hping3` can send packets in “flood mode” (no delays between packets).
- This is powerful and potentially disruptive: packet floods can consume bandwidth and CPU, degrade service, or trigger protective mechanisms.
- **Operational considerations**:
- Only perform flood tests on infrastructure you control and where such testing is explicitly allowed.
- Coordinate with network and security teams.
- Monitor carefully (CPU, memory, bandwidth, and logs) during tests to avoid unintended outages.
---
## Typical Usage Contexts
- **On-prem / data center**:
Test firewalls, routers, and IDS; validate segmentation between environments (e.g., prod vs. non-prod).
- **Cloud environments (AWS/Azure/GCP/etc.)**:
- Verify security group/NACL behavior at the packet level.
- Test connectivity between VPCs/VNETs, on-prem VPNs, and cloud workloads.
- **Kubernetes & containerized apps**:
- Validate node-to-node or pod-to-pod connectivity.
- Test ingress/egress rules in CNIs and service meshes.
- Debug why a service is reachable via one path but not another.
---
## Limitations & Considerations
- Requires appropriate privileges (often root) to craft raw packets.
- Can generate traffic patterns similar to port scans or attacks, so:
- Always get proper authorization.
- Coordinate with security teams to avoid false alarms.
- Not designed as a full replacement for higher-level tools (e.g., `nmap`, `iperf`, `traceroute`), but as a complementary low-level tool.
- Behavior may differ slightly across OSes and network stacks.
---
## Installation (High-Level)
Availability varies by distribution, but generally:
- **Debian/Ubuntu**: via `apt` (package usually named `hping3`)
- **RHEL/CentOS/Fedora**: via `yum`/`dnf` (on RHEL/CentOS, usually from the EPEL repository)
- **macOS**: via Homebrew (if available) or compile from source
- **Others**: typically built from source from the official repository
(Installation instructions can be detailed in a separate document.)
---
## Summary
`hping3` is a low-level TCP/IP packet crafting and analysis tool used by DevOps/SRE and security engineers to:
- Test and validate firewall and network security policies
- Perform targeted port scans (including FIN/XMAS-style scans) and host discovery
- Troubleshoot complex connectivity and performance issues
- Generate controlled floods for stress tests (in authorized environments)

View File

@@ -0,0 +1,252 @@
# 02. Commands: Practical `hping3` Usage
This document explains common `hping3` commands and what they do at a packet/protocol level.
Replace `<target>` with an IP or hostname, and `<port>` with a TCP/UDP port number.
> Use these commands only on systems and networks you are authorized to test.
---
## 1. ICMP “Normal Ping”
```bash
hping3 -1 <target>
```
- `-1`: Use **ICMP mode** (type 8 echo request), similar to the standard `ping` command.
- Behavior:
- Sends ICMP echo request packets to `<target>`.
- Measures round-trip time (RTT) and indicates packet loss.
- Use case:
- Basic connectivity check when you want to use `hping3` instead of `ping`.
- Helpful if you later want to switch to more advanced testing without changing tools.
---
## 2. Send TCP ACK Packets
```bash
hping3 -A <target>
```
- `-A`: Set the **ACK** flag in TCP packets.
- Behavior:
- Sends TCP packets with the ACK flag set to the default port (0 unless `-p` is specified).
- Use case:
- Test firewall rules related to **established** connections (many firewalls allow ACK packets but block SYN).
- Map which hosts respond to unsolicited ACK packets and how (RST/no response).
To target a specific port (for example, 80):
```bash
hping3 -A <target> -p 80
```
---
## 3. Send TCP SYN Packets
```bash
hping3 -S <target>
```
- `-S`: Set the **SYN** flag in TCP packets.
- Behavior:
- Sends SYN packets to the default port (0 unless `-p` is specified).
- Use case:
- Test how the target responds to connection attempts.
- When combined with `-p`, this becomes a basic SYN scan for that port.
With a specific port:
```bash
hping3 -S <target> -p <port>
```
---
## 4. Send TCP FIN Packets
```bash
hping3 -F <target>
```
- `-F`: Set the **FIN** flag in TCP packets.
- Behavior:
- Sends packets that look like “finish” requests for a connection.
- Use case:
- Perform **FIN scans** (when combined with `-p`) to check firewall behavior:
- Closed ports typically respond with `RST`.
- Open ports often send no response.
- Useful for testing how devices treat non-SYN traffic.
Example with a port:
```bash
hping3 -F <target> -p 80
```
---
## 5. Send TCP RST (Reset) Packets
```bash
hping3 -R <target>
```
- `-R`: Set the **RST** flag in TCP packets.
- Behavior:
- Sends packets that instruct the receiver to immediately terminate a connection.
- Use case:
- Observe how the target or firewall handles unexpected RST packets.
- In controlled tests, can be used to tear down test connections.
With a specific port:
```bash
hping3 -R <target> -p 80
```
---
## 6. Send TCP URG (Urgent) Packets
```bash
hping3 -U <target>
```
- `-U`: Set the **URG** flag in TCP packets.
- Behavior:
- Marks data as “urgent” (though most modern applications rarely use it).
- Use case:
- Test how TCP stacks and firewalls handle **uncommon flags**.
- Validate logging/alerting for rare or suspicious traffic patterns.
Example with a port:
```bash
hping3 -U <target> -p 80
```
---
## 7. Send XMAS Packets
```bash
hping3 -X <target>
```
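- `-X` (`--xmas`): Set the Xmas flag. In `hping3` this sets a reserved TCP flag bit (0x40), not the classic FIN + PSH + URG combination; for a classic **XMAS** packet, combine `-F -P -U`.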
- Behavior:
- Creates “Christmas tree” packets with multiple flags lit.
- Use case:
- **XMAS scans**:
- Closed ports usually respond with `RST`.
- Open ports often do not respond.
- Test firewall/IDS handling of obviously suspicious packets.
Example with a port:
```bash
hping3 -X <target> -p 80
```
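
The flag options from the sections above can be combined freely, so a classic FIN + PSH + URG packet can be requested explicitly with `-F -P -U` instead of relying on how `-X` is interpreted. The following is a minimal dry-run sketch that maps a scan name to its flags and prints the resulting command. `scan_cmd` and the scan names are hypothetical, chosen only for illustration.

```bash
# Dry-run helper: prints the hping3 command for a named single-packet probe.
# scan_cmd and the scan names (syn, ack, fin, xmas) are illustrative only.
scan_cmd() {
    kind=$1 target=$2 port=$3
    case "$kind" in
        syn)  flags="-S" ;;
        ack)  flags="-A" ;;
        fin)  flags="-F" ;;
        xmas) flags="-F -P -U" ;;   # FIN + PSH + URG set explicitly
        *)    echo "unknown scan kind: $kind" >&2; return 1 ;;
    esac
    # -c 1 sends a single packet; -p selects the destination port.
    printf 'hping3 %s -c 1 -p %s %s\n' "$flags" "$port" "$target"
}

scan_cmd xmas 192.0.2.10 80
```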
---
## 8. Send SYN Packet to a Destination Port
```bash
hping3 -S <target> -p <port>
```
- `-S`: SYN flag.
- `-p <port>`: Destination port.
- Behavior:
- Sends a TCP SYN packet to the specified `<port>` on `<target>`.
- Use case:
- Simple port check:
- Open port: typically responds with SYN/ACK.
- Closed port: typically responds with RST.
- Validate firewall rules for a specific service port.
---
## 9. Send SYN Packets with Random Source Address
```bash
hping3 -S <target> --rand-source
```
- `-S`: SYN flag.
- `--rand-source`: Randomize the **source IP address** for each packet.
- Behavior:
- Target sees SYN packets as if they are coming from many different IPs.
- Use case (legitimate, controlled testing):
- Test how firewalls, load balancers, or DDoS protection handle **spoofed** or distributed-looking traffic.
- Validate rate-limiting or connection limiting across “different” clients.
Note: Because of IP spoofing, responses will not come back to you; this is for observing target-side behavior/logs.
---
## 10. SYN Flood with Random Source
```bash
hping3 -S <target> --rand-source --flood
```
- `-S`: SYN flag.
- `--rand-source`: Randomize source IP per packet.
- `--flood`: Send packets as fast as possible, no output per packet.
- Behavior:
- High-rate SYN traffic with spoofed source IPs.
- Use case:
- **Stress testing** and **capacity testing** of firewalls/load balancers/IPS in a lab or authorized environment.
- Warning:
- This can severely impact services and look like a SYN flood attack.
- Use only with explicit permission and monitoring in place.
---
## 11. ICMP Flood with Spoofed Source Address
```bash
hping3 -1 <target> -a <src-address> --flood
```
> Note: ICMP mode is selected with `-1` (the digit one), not `-i`, which sets the interval between packets.
- `-1`: ICMP mode (echo requests).
- `-a <src-address>`: Spoof **source IP** as `<src-address>`.
- `--flood`: Send packets as fast as possible.
- Behavior:
- Sends a high-rate ICMP echo request flood to `<target>` with a fake source IP.
- Use case:
- Test how devices handle **ICMP flood** conditions and spoofed traffic (in a controlled environment).
- Warning:
- Can consume bandwidth and trigger DDoS protections or rate limits.
- Only for authorized stress testing.
The `-i` option sets the interval between packets, so it changes the send rate rather than the protocol. It can be used instead of `--flood` for a controlled rate:
```bash
hping3 -1 <target> -a <src-address> --flood
# or with custom interval (e.g., 10 ms):
hping3 -1 <target> -a <src-address> -i u10000
```
---
## 12. Check If Port 22 (SSH) Is Open
```bash
hping3 -S <target> -p 22 -c 1
```
- `-S`: SYN flag (start of TCP handshake).
- `-p 22`: Destination port 22 (typically SSH).
- `-c 1`: Send only **one** packet.
- Behavior:
- Sends a single SYN to port 22 on `<target>`.
- How to interpret:
- If you see a **SYN/ACK** response, port 22 is likely open and reachable.
- If you see a **RST**, port 22 is closed or actively refused.
- If there is **no response**, the port may be filtered by a firewall or silently dropped.
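
The interpretation rules above can be sketched as a small classifier over the reply text. This assumes `hping3` prints reply lines containing a `flags=` field (for example `flags=SA` or `flags=RA`); the sample lines below are illustrative, not captured output, and `classify_reply` is a hypothetical helper name.

```bash
# Classify a port from the text of an hping3 reply (or an empty string).
# classify_reply is a hypothetical helper, not part of hping3.
classify_reply() {
    case "$1" in
        *flags=SA*) echo "open" ;;                  # SYN/ACK received
        *flags=R*)  echo "closed" ;;                # RST or RST/ACK received
        *)          echo "filtered-or-no-reply" ;;  # nothing recognizable came back
    esac
}

# Illustrative reply lines:
classify_reply "len=44 ip=192.0.2.10 ttl=64 DF id=0 sport=22 flags=SA seq=0 win=29200 rtt=0.4 ms"
classify_reply "len=40 ip=192.0.2.10 ttl=64 DF id=0 sport=22 flags=RA seq=0 win=0 rtt=0.3 ms"
classify_reply ""
```

In practice the argument would be the captured output of the `hping3 -S <target> -p 22 -c 1` command shown above.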
---
## Summary
- `-1`: ICMP mode (ping-like).
- `-S`, `-A`, `-F`, `-R`, `-U`, `-X`: Control which TCP flags are set (SYN, ACK, FIN, RST, URG, XMAS).
- `-p <port>`: Target a specific port.
- `--rand-source`: Spoof/randomize source IPs.
- `-a <src-address>`: Spoof a specific source IP.
- `--flood`: Send packets as fast as possible (for stress testing).
- `-c <count>`: Limit number of packets sent.

View File

@@ -0,0 +1,352 @@
# tcpdump
## Overview
`tcpdump` is a powerful command-line packet analyzer used to capture and inspect network traffic in real time. It is widely used by DevOps engineers, network administrators, and security professionals for troubleshooting, monitoring, and debugging network-related issues.
It works by intercepting packets flowing through a network interface and displaying them based on defined filters.
---
## How tcpdump Works
### Packet Capture Mechanism
`tcpdump` relies on the **libpcap** library to capture packets. The process involves:
1. **Network Interface Access**
- tcpdump attaches to a network interface (e.g., `eth0`, `ens33`, `wlan0`).
2. **Promiscuous Mode**
- By default, tcpdump puts the interface into promiscuous mode (disable with `-p`), so it can capture every packet that reaches the interface, not just those addressed to the host.
3. **Kernel-Level Filtering**
- Uses Berkeley Packet Filter (BPF) to filter packets efficiently in the kernel space before sending them to user space.
4. **Packet Decoding**
- Captured packets are decoded and printed in a human-readable format.
---
## Installation
### Linux (Debian/Ubuntu)
```bash
sudo apt update
sudo apt install tcpdump
```
### Linux (RHEL/CentOS)
```bash
sudo yum install tcpdump
```
### macOS
```bash
brew install tcpdump
```
---
## Basic Syntax
```bash
tcpdump [options] [filter expression]
```
---
## Common Options
| Option | Description |
| ------------------- | ------------------------------------- |
| `-i <interface>` | Specify network interface |
| `-c <count>` | Capture a specific number of packets |
| `-n` | Disable hostname resolution |
| `-nn` | Disable hostname and port resolution |
| `-v`, `-vv`, `-vvv` | Increase verbosity |
| `-X` | Show packet contents in hex and ASCII |
| `-A` | Display packet contents in ASCII |
| `-w <file>`         | Write raw packets to a pcap file      |
| `-r <file>`         | Read packets from file                |
| `-s <snaplen>`      | Set snapshot length in bytes          |
| `-D` | List available interfaces |
---
## Common Use Cases
### 1. Capture Packets on an Interface
```bash
tcpdump -i eth0
```
### 2. Capture a Limited Number of Packets
```bash
tcpdump -i eth0 -c 10
```
### 3. Disable Name Resolution (Faster Output)
```bash
tcpdump -nn -i eth0
```
### 4. Capture and Save to File
```bash
tcpdump -i eth0 -w capture.pcap
```
### 5. Read from a Capture File
```bash
tcpdump -r capture.pcap
```
---
## Filtering with BPF (Berkeley Packet Filter)
Filters are the most powerful feature of tcpdump.
### Basic Structure
```bash
tcpdump [options] 'filter expression'
```
### Filter Types
#### Host Filter
```bash
tcpdump host 192.168.1.1
```
#### Source/Destination Filter
```bash
tcpdump src 192.168.1.1
tcpdump dst 192.168.1.1
```
#### Port Filter
```bash
tcpdump port 80
tcpdump src port 443
tcpdump dst port 22
```
#### Protocol Filter
```bash
tcpdump tcp
tcpdump udp
tcpdump icmp
```
#### Network Filter
```bash
tcpdump net 192.168.1.0/24
```
---
## Combining Filters
### Logical Operators
| Operator | Meaning |
| -------- | -------------------------- |
| `and` | Both conditions must match |
| `or` | Either condition matches |
| `not` | Negates the condition |
### Examples
```bash
tcpdump tcp and port 80
tcpdump host 192.168.1.1 and port 22
tcpdump not port 22
tcpdump 'tcp and (port 80 or port 443)'
```
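
When the port list grows, writing the parenthesized `or` clause by hand becomes error-prone. The sketch below builds the filter expression from a list of ports. `ports_filter` is a hypothetical helper name, and the result must still be passed to `tcpdump` as a single quoted argument so the shell does not interpret the parentheses.

```bash
# Build a BPF expression matching TCP traffic on any of the given ports.
# ports_filter is a hypothetical helper, not part of tcpdump.
ports_filter() {
    [ "$#" -ge 1 ] || { echo "at least one port is required" >&2; return 1; }
    clause=""
    for p in "$@"; do
        if [ -z "$clause" ]; then
            clause="port $p"
        else
            clause="$clause or port $p"
        fi
    done
    printf 'tcp and (%s)\n' "$clause"
}

ports_filter 80 443 8080
# Possible use: tcpdump -nn -i eth0 "$(ports_filter 80 443 8080)"
```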
---
## Packet Output Interpretation
Example output:
```
14:32:10.123456 IP 192.168.1.10.54321 > 93.184.216.34.80: Flags [S], seq 123456, win 65535
```
### Breakdown
| Field | Description |
| ----------- | ------------------------------- |
| Timestamp | Packet capture time |
| Protocol | IP, ARP, etc. |
| Source | Source IP and port |
| Destination | Destination IP and port |
| Flags | TCP flags (SYN, ACK, FIN, etc.) |
| seq | Sequence number |
| win | Window size |
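
The field positions in the breakdown can be sketched with `awk`. This assumes the IPv4 TCP line layout shown in the example output (timestamp, `IP`, source, `>`, destination, `Flags`, flag list); other protocols and verbosity levels print different layouts. `parse_tcp_line` is a hypothetical helper name.

```bash
# Extract source, destination, and TCP flags from tcpdump text output.
# parse_tcp_line is a hypothetical helper; it ignores lines in other layouts.
parse_tcp_line() {
    awk '$2 == "IP" && $6 == "Flags" {
        src = $3
        dst = $5;   sub(/:$/, "", dst)               # drop the trailing colon
        flags = $7; gsub(/^\[|\],?$/, "", flags)     # strip the brackets and comma
        print src, dst, flags
    }'
}

printf '%s\n' '14:32:10.123456 IP 192.168.1.10.54321 > 93.184.216.34.80: Flags [S], seq 123456, win 65535' \
    | parse_tcp_line
```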
---
## TCP Flags
| Flag | Meaning |
| ---- | ---------------------- |
| SYN | Connection initiation |
| ACK | Acknowledgment |
| FIN | Connection termination |
| RST | Reset connection |
| PSH | Push data immediately |
| URG | Urgent data |
---
## Advanced Usage
### 1. Capture HTTP Traffic
```bash
tcpdump -i eth0 -A port 80
```
### 2. Capture HTTPS Traffic (Metadata Only)
```bash
tcpdump -i eth0 port 443
```
### 3. Capture DNS Queries
```bash
tcpdump -i eth0 port 53
```
### 4. Capture Traffic Between Two Hosts
```bash
tcpdump host 192.168.1.1 and 192.168.1.2
```
### 5. Capture Large Packets Fully
```bash
tcpdump -i eth0 -s 0
```
---
## Writing and Analyzing PCAP Files
### Capture to File
```bash
tcpdump -i eth0 -w traffic.pcap
```
### Analyze with tcpdump
```bash
tcpdump -r traffic.pcap
```
### Integration with Wireshark
* Export `.pcap` files and analyze using GUI tools like Wireshark.
---
## Performance Considerations
* Use `-n` or `-nn` to reduce DNS lookups.
* Apply filters to minimize captured data.
* Avoid capturing full packets unless necessary (`-s 0`).
* Use `-c` to limit capture size.
---
## Security and Permissions
* Requires root or sudo privileges:
```bash
sudo tcpdump -i eth0
```
* Be cautious when capturing sensitive data (credentials, tokens).
---
## Troubleshooting Scenarios
### 1. Debugging Connectivity Issues
```bash
tcpdump -i eth0 host <target-ip>
```
### 2. Checking Open Ports
```bash
tcpdump -i eth0 tcp port 22
```
### 3. Investigating Packet Loss
* Look for retransmissions and duplicate ACKs.
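
One rough way to spot retransmissions in text output is to look for the same sequence range appearing more than once on the same flow. The sketch below assumes the default tcpdump line layout with a `seq start:end` field on data segments; `find_retransmit_candidates` is a hypothetical helper, and its results are only candidates that deserve a closer look in a tool such as Wireshark.

```bash
# Flag data segments whose (source, destination, seq range) was already seen.
# find_retransmit_candidates is a hypothetical helper, not part of tcpdump.
find_retransmit_candidates() {
    awk '{
        for (i = 1; i < NF; i++) {
            if ($i == "seq" && $(i + 1) ~ /^[0-9]+:[0-9]+,?$/) {
                dst = $5;         sub(/:$/, "", dst)
                range = $(i + 1); sub(/,$/, "", range)
                key = $3 " " dst " " range
                if (seen[key]++) print "possible retransmission: " key
                break
            }
        }
    }'
}

# Possible use: tcpdump -nn -r traffic.pcap tcp | find_retransmit_candidates
```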
### 4. Diagnosing DNS Problems
```bash
tcpdump -i eth0 port 53
```
---
## Best Practices
* Always filter traffic to reduce noise.
* Capture only what is necessary.
* Store captures securely.
* Use rotation when capturing long sessions:
```bash
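# -G 3600 starts a new file every hour; the time pattern in -w is only expanded when -G is set
tcpdump -i eth0 -G 3600 -w file_%Y%m%d%H%M%S.pcap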
```
---
## Limitations
* Cannot decrypt encrypted traffic (e.g., HTTPS).
* High traffic environments may drop packets.
* Output can become overwhelming without filters.
---
## Alternatives and Complementary Tools
* `tshark` (CLI version of Wireshark)
* `wireshark` (GUI packet analyzer)
* `ngrep` (network grep tool)
* `iftop` / `nload` (bandwidth monitoring)
---
## Summary
`tcpdump` is an essential tool in a DevOps engineer's toolkit for low-level network inspection. Mastery of filtering, efficient capture strategies, and output interpretation enables effective debugging and monitoring of complex distributed systems.

View File

@@ -0,0 +1,839 @@
---
title: "Jitsi Production Component Guide"
subtitle: "Component-by-component explanation for production DevOps design"
author: "Prepared for production planning"
date: "2026-05-29"
---
# Jitsi Production Component Guide
## 1. Purpose of this document
This document explains the main Jitsi components, what each one does, how they communicate, what ports they use, how they scale, and how to operate them in a production environment.
The focus is a production Jitsi Meet deployment that can handle more than 1000 concurrent participants across many different meetings. This is not the same as one single 1000-person interactive room. A single huge room should normally be treated as a webinar or livestream design, while many simultaneous rooms are handled by horizontal scaling of the media layer.
## 2. Core idea: signaling and media are separate
A Jitsi system has two main traffic planes:
1. Signaling plane: users and backend components exchange control messages. This includes joining a room, creating a conference, presence, mute state, chat, permissions, lobby, and room metadata. Jitsi uses XMPP for this signaling layer.
2. Media plane: audio, video, screen share, RTP/RTCP, bandwidth estimation, packet routing, and WebRTC transport. This is handled mainly by Jitsi Videobridge.
A production deployment is successful when these two planes are treated separately:
- Prosody and Jicofo are the control/signaling brain.
- Jitsi Videobridge is the high-bandwidth media router.
- Nginx serves the web app and proxies WebSocket/BOSH traffic.
- TURN helps users behind restrictive networks.
- Jibri records or livestreams conferences.
- Jigasi connects SIP/PSTN-style audio systems.
## 3. High-level architecture
```text
Browser / Mobile App
|
| HTTPS 443, WebSocket, BOSH
v
Nginx / Jitsi Meet Web
|
| XMPP signaling
v
Prosody XMPP Server <----> Jicofo Conference Focus
^ |
| | COLIBRI / bridge control
| v
| Jitsi Videobridge Pool
| |
| | WebRTC media: UDP 10000, SRTP/RTP/RTCP
v v
Participants <---------------> JVB media routing
Optional components:
- Coturn: STUN/TURN relay for difficult NAT/firewall cases
- Jibri: recording and livestreaming worker
- Jigasi: SIP audio gateway
- Etherpad: collaborative document integration
- Monitoring: Prometheus, Grafana, logs, alerts
```
Official Jitsi documentation describes the main components as Jitsi Meet, Jitsi Videobridge, Jicofo, Jigasi, Jibri, and Prosody. It also defines Prosody as the XMPP server used for signaling, JVB as the WebRTC server that routes video streams, and Jicofo as the server-side focus component that manages media sessions and acts as a load-balancing controller between participants and videobridges. [1]
## 4. Component summary table
| Component | Main job | Traffic type | Scale method | Production note |
|---|---|---|---|---|
| Jitsi Meet Web | Browser UI and frontend application | HTTPS, WebSocket, BOSH | Horizontally with stateless web nodes or shards | Keep config consistent across nodes |
| Nginx | TLS termination, static files, reverse proxy | TCP 80/443 | Horizontal behind load balancer | Must correctly proxy WebSocket/BOSH paths |
| Prosody | XMPP signaling and authentication | XMPP, internal modules | Usually per shard; not the main media bottleneck | Protect internal XMPP ports |
| Jicofo | Conference focus, room orchestration, bridge selection | XMPP, COLIBRI control | Usually per shard; one active focus per deployment/shard | Critical control-plane component |
| Jitsi Videobridge | SFU media routing | WebRTC, UDP/RTP/SRTP | Add more JVB nodes | Main scaling point for 1000+ users |
| Coturn | STUN/TURN relay | UDP/TCP/TLS relay | Add more TURN nodes | Can consume large bandwidth |
| Jibri | Recording/livestream worker | Joins as special participant, encodes output | One worker per simultaneous recording | Heavy CPU/RAM/disk usage |
| Jigasi | SIP audio gateway | SIP/RTP/XMPP | Add instances if SIP demand grows | Audio-only SIP bridge |
| Etherpad | Shared notes/document editing | HTTP/WebSocket | Optional app scaling | Not required for video calls |
| Prometheus/Grafana/Loki | Metrics, dashboarding, logs | Metrics/log collection | Scale by observability need | Required for production operation |
## 5. Jitsi Meet Web
### What it is
Jitsi Meet Web is the user-facing web application. It is a WebRTC-compatible JavaScript application built with React, with React Native used for the mobile apps. In a browser deployment, users load this app from the Jitsi web server, usually through Nginx.
### What it does
Jitsi Meet Web handles:
- Room URL and initial page load.
- User interface for camera, microphone, screen share, chat, reactions, tiles, moderator controls, lobby, settings, and device selection.
- WebRTC client logic in the browser.
- Signaling connection to Prosody through BOSH or WebSocket.
- Interaction with the Jitsi Meet External API when embedded inside another application.
- Configuration from files such as `config.js` and `interface_config.js` in package-based deployments.
### How it works in a call
1. User opens `https://meet.example.com/room-name`.
2. Nginx serves the static Jitsi Meet web application.
3. The web app reads configuration such as domain, anonymous domain, BOSH/WebSocket URLs, video constraints, prejoin behavior, lobby, and authentication settings.
4. The browser connects to Prosody for signaling.
5. The browser starts WebRTC negotiation and exchanges transport/media information through the signaling layer.
6. Actual audio/video packets go to Jitsi Videobridge, not to the web app.
### Production handling
- Keep web configuration version-controlled.
- Use the same `config.js` values across all web nodes in a shard.
- Put web nodes behind a load balancer only if the signaling paths and domain/shard routing are designed correctly.
- Do not overload the web component with recording, media routing, or TURN duties.
- For application integration, prefer JWT authentication and controlled room creation rather than public anonymous room creation.
## 6. Nginx or reverse proxy
### What it is
Nginx is normally used to serve the Jitsi Meet frontend, terminate TLS, redirect HTTP to HTTPS, and proxy special routes for signaling and bridge communication.
### What it does
Nginx handles:
- Port 80 for HTTP redirects and Let's Encrypt validation.
- Port 443 for HTTPS access.
- Static web assets.
- Reverse proxying for XMPP over WebSocket or BOSH.
- Reverse proxying for Colibri WebSocket paths used by JVB.
- Optional TLS routing or stream multiplexing when TURN over 443 is used.
### Important routes
Common important routes include:
```text
/ Jitsi Meet web application
/http-bind BOSH fallback for XMPP signaling
/xmpp-websocket XMPP over WebSocket
/colibri-ws JVB Colibri WebSocket path
```
### Production handling
- Use a trusted TLS certificate.
- Enable HTTP to HTTPS redirect.
- Forward WebSocket upgrade headers correctly.
- Do not expose internal admin or metrics endpoints through public Nginx.
- If using Cloudflare or another proxy, ensure WebRTC and WebSocket behavior is compatible with the deployment.
- Keep Nginx logs integrated with centralized logging.
## 7. Prosody
### What it is
Prosody is the XMPP server in Jitsi. It is the signaling backbone. Jitsi documentation describes Prosody as the XMPP server used for signaling. [1]
### What it does
Prosody handles:
- User XMPP sessions.
- Presence inside rooms.
- Multi-user chat rooms for conferences.
- Authentication domains.
- Guest domains.
- Lobby rooms and waiting behavior.
- JWT token verification modules.
- Internal accounts used by Jicofo, JVB, Jibri, and Jigasi.
- XMPP service discovery.
- TURN credential advertisement through XMPP when configured.
### Important virtual hosts and components
A typical deployment can include these logical domains:
```text
meet.example.com Main user-facing XMPP virtual host
auth.meet.example.com Internal authenticated domain
conference.meet.example.com MUC component for conference rooms
internal.auth.meet.example.com Internal component/auth domain
focus.meet.example.com Jicofo focus identity
recorder.meet.example.com Jibri recorder domain, if recording is enabled
guest.meet.example.com Anonymous guest domain, if guest access is enabled
```
Exact names depend on package, Docker, and authentication design.
### How Prosody works in a call
1. A browser client connects to Prosody through Nginx using XMPP over WebSocket or BOSH.
2. Prosody authenticates the user or treats the user as a guest depending on configuration.
3. The user joins a MUC room such as `room-name@conference.meet.example.com`.
4. Jicofo observes room creation and coordinates the conference.
5. JVB and Jicofo also connect through XMPP service accounts.
6. Signaling messages continue through Prosody while media flows through JVB.
### Production handling
- Do not expose Prosody's internal ports publicly.
- Restrict XMPP component/client ports to internal networks or known JVB/Jicofo/Jibri hosts.
- Use JWT authentication for app-based deployments.
- Use a guest domain only when you want authenticated moderators and unauthenticated attendees.
- Monitor Prosody CPU, memory, connection count, and logs.
- Be aware that Prosody is usually not the media bottleneck, but it is a critical control-plane dependency.
## 8. Jicofo
### What it is
Jicofo means Jitsi Conference Focus. It is the conference coordinator. Official Jitsi documentation describes Jicofo as the server-side focus component used in Jitsi Meet conferences that manages media sessions and acts as a load balancer between participants and the videobridge. [1]
### What it does
Jicofo handles:
- Conference creation and lifecycle.
- Selecting a Jitsi Videobridge for a conference.
- Managing participants at the signaling level.
- Managing the relationship between conference rooms and JVBs.
- Controlling JVBs through the COLIBRI protocol.
- Coordinating Jibri sessions for recording or livestreaming.
- Moderator and feature coordination with Prosody modules.
- Bridge health and load-aware bridge selection.
### How Jicofo works in a call
1. A user joins a room through Prosody.
2. Jicofo detects or is assigned to manage the room.
3. Jicofo checks available JVBs.
4. Jicofo selects an appropriate JVB for the conference.
5. Jicofo instructs JVB to create or update the conference state.
6. Participants exchange WebRTC offers/answers and ICE data through signaling.
7. Jicofo continues coordinating participant joins/leaves, bridge state, and optional services.
### Scaling behavior
In the official scalable setup, the central Jitsi Meet instance includes Nginx, Prosody, and Jicofo, while multiple JVB nodes are attached separately. The documentation states that when a new conference starts, Jicofo picks a videobridge and schedules the conference on it. [3]
### Production handling
- Treat Jicofo as a critical control-plane service.
- Keep Jicofo logs centralized.
- Monitor bridge selection behavior and conference allocation.
- If using sharding, run a separate Jicofo/Prosody stack per shard rather than stretching one large control plane across all traffic without a deliberate design.
- Restart Jicofo carefully because existing conferences can be affected depending on deployment behavior and reconnect handling.
## 9. Jitsi Videobridge
### What it is
Jitsi Videobridge, usually called JVB, is the media router. It is a Selective Forwarding Unit, or SFU. Official Jitsi documentation describes it as a WebRTC-compatible server designed to route video streams among conference participants. [1]
### What it does
JVB handles:
- WebRTC media transport.
- ICE connectivity.
- DTLS-SRTP media security.
- RTP and RTCP packet routing.
- Audio/video forwarding.
- Screen-share forwarding.
- Simulcast and scalable video routing.
- Bandwidth estimation.
- Receiver constraints and LastN behavior.
- Packet loss recovery features such as retransmissions, depending on configuration and browser support.
- Colibri WebSocket communication.
- Media-related metrics.
### How JVB is different from an MCU
JVB normally does not mix or transcode every user's video into one combined stream. Instead, it selectively forwards streams. This is why Jitsi can scale better than a traditional MCU design, but it also means that bandwidth planning and client CPU are still important.
### How media flows
```text
User A camera/mic -> encrypted WebRTC stream -> JVB
JVB decides which participants should receive User A
JVB forwards selected encrypted media packets -> User B, User C, User D
```
JVB is not the same as TURN. TURN simply relays traffic when endpoints cannot connect normally. JVB understands the conference and makes routing decisions.
### Why JVB is the main scaling component
The official scalable setup says the first limiting factor in a single-server Jitsi installation is the Videobridge, because it handles the actual video and audio traffic, and that videobridges are easy to scale horizontally by adding as many as needed. [3]
### Production handling
- Put JVB nodes on servers with strong network capacity.
- Open UDP 10000 from the public internet to each JVB unless your deployment uses a different media design.
- Ensure the advertised public IP is correct, especially with Docker, NAT, or private cloud networks.
- Keep 25-35 percent spare capacity.
- Monitor network throughput, packet loss, CPU, memory, conferences, endpoints, and bridge stress.
- Add JVB nodes to scale concurrent meetings.
- Do not assume one JVB can safely carry a whole production deployment.
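
The spare-capacity guidance can be turned into a rough node count. The sketch below is arithmetic only: the per-node participant capacity is a placeholder that has to come from your own load tests, and `jvb_nodes_needed` is a hypothetical helper name.

```bash
# Estimate how many JVB nodes are needed while keeping a headroom percentage free.
# jvb_nodes_needed is a hypothetical helper; per_node must come from load testing.
jvb_nodes_needed() {
    participants=$1 per_node=$2 headroom_pct=$3
    # Capacity per node that may actually be used after reserving headroom.
    usable=$(( per_node * (100 - headroom_pct) / 100 ))
    [ "$usable" -ge 1 ] || { echo "usable capacity must be >= 1" >&2; return 1; }
    # Ceiling division without floating point.
    echo $(( (participants + usable - 1) / usable ))
}

# 1000 participants, an assumed 200 per node, 30 percent kept spare.
jvb_nodes_needed 1000 200 30
```

With these placeholder numbers each node is planned for 140 participants, so the sketch prints 8. Real sizing also depends on video resolution, simulcast settings, and room sizes.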
## 10. Coturn, STUN, and TURN
### What it is
Coturn is a TURN/STUN server commonly used with Jitsi. STUN helps clients discover how they are seen from the public internet. TURN relays media when direct UDP connectivity is blocked or impossible.
### What it does
TURN helps users behind:
- Symmetric NAT.
- Corporate firewalls.
- UDP-blocking networks.
- Mobile carrier restrictions.
- Networks that only allow TCP 443.
### How TURN works with Jitsi
In a simple case, users connect to JVB over UDP 10000. In a restrictive network, the browser may be unable to send media directly to the JVB. TURN then relays the traffic through an allowed port such as TCP/TLS 443 or TCP 5349.
Official Jitsi TURN documentation says peer-to-peer calls should avoid going through JVB when possible, but direct connection is not always possible, and in those cases a TURN server can relay traffic. It also notes that default TURN ports include UDP 3478 and TCP/TLS 5349, and that TCP 443 can be useful for corporate networks that allow only HTTPS-like traffic. [7]
### Production handling
- Run at least two TURN servers for production.
- Treat TURN as a bandwidth-heavy media service.
- Avoid static TURN credentials exposed to browsers for long periods; prefer ephemeral credentials when possible.
- Monitor TURN bandwidth separately from JVB bandwidth.
- Do not co-locate TURN with an already loaded Jitsi host unless traffic is very low.
- Use valid TLS certificates for TLS TURN services.
## 11. Jibri
### What it is
Jibri means Jitsi Broadcasting Infrastructure. It is the component used for server-side recording and livestreaming.
Official Jitsi architecture documentation describes Jibri as tools for recording and/or streaming a conference by launching a Chrome instance in a virtual framebuffer and capturing/encoding the output with ffmpeg. [1]
### What it does
Jibri handles:
- Joining a Jitsi room as a special recorder participant.
- Rendering the conference in Chrome.
- Capturing the rendered output.
- Encoding with ffmpeg.
- Saving a recording file or streaming to a service such as YouTube/RTMP.
- Reporting recording state back through XMPP.
### How Jibri works in a call
```text
Moderator clicks Record or Livestream
|
v
Jicofo requests an available Jibri
|
v
Jibri joins the room as a hidden/special participant
|
v
Chrome renders the conference layout
|
v
ffmpeg captures and encodes the output
|
v
Recording file or livestream output is created
```
### Capacity rule
Jibri does not scale like JVB. One Jibri instance supports one recording or livestream session at a time. Jitsi requirements documentation states that Jibri needs one system per recording: one Jibri instance equals one meeting, and five simultaneous recordings require five Jibri instances. [5]
The Jibri repository also states that only one recording at a time is supported on a single Jibri. [6]
### Production handling
- Do not run Jibri on the main Jitsi controller node for serious production.
- Do not run Jibri on JVB nodes.
- Use a separate Jibri pool.
- Size one worker per simultaneous recording or livestream.
- Monitor CPU, memory, disk throughput, disk usage, Chrome/Chromedriver health, and ffmpeg errors.
- Store recordings on durable storage, object storage, or a post-processing pipeline.
## 12. Jigasi
### What it is
Jigasi means Jitsi Gateway to SIP. It allows regular SIP clients to join Jitsi Meet conferences. Official Jitsi architecture documentation describes Jigasi as a server-side application that allows regular SIP clients to join Jitsi Meet conferences. [1]
### What it does
Jigasi handles:
- SIP call-in or call-out integration.
- Audio bridge between SIP endpoints and a Jitsi room.
- Connection to Prosody/XMPP as a component or service account.
- SIP registration to a SIP provider or PBX.
- Optional transcription-related workflows in some deployments.
### How Jigasi works
```text
SIP phone / PBX / provider
|
| SIP/RTP
v
Jigasi
|
| XMPP signaling + media bridge behavior
v
Jitsi room
```
### Production handling
- Deploy only if you need SIP/PSTN integration.
- Isolate SIP credentials and PBX connectivity.
- Plan port ranges for SIP media if enabled.
- Monitor call setup failures, SIP registration state, audio quality, and provider errors.
- Keep SIP access restricted to expected providers or internal PBX networks.
## 13. Etherpad
### What it is
Etherpad is an optional collaborative text editing service that can be integrated with Jitsi Meet for shared notes.
### What it does
Etherpad handles:
- Shared collaborative documents.
- Meeting notes.
- Text collaboration beside the video call.
### Production handling
- Do not deploy Etherpad unless users need shared notes.
- Put it behind authentication or controlled access if documents contain sensitive content.
- Back up its database if meeting notes matter.
- Monitor it separately from Jitsi media services.
## 14. Authentication components
### Available authentication models
Common production authentication models include:
- Internal Prosody users.
- JWT/token authentication.
- LDAP authentication.
- Guest domain with authenticated moderators.
- Application-controlled room creation.
Official token authentication documentation states that Jitsi can allow only users with a valid token to create new conference rooms, while others can join from an anonymous domain after the room exists. [8]
### Recommended production model
For a custom application or platform:
```text
Use JWT auth.
Only your backend creates valid meeting tokens.
Moderators receive tokens with room permissions.
Guests join through controlled room links or guest domain.
Lobby is enabled for sensitive rooms.
Anonymous public room creation is disabled.
```
### Why JWT is usually best for production
JWT makes Jitsi part of your application security model:
- Your app decides who can create rooms.
- Your app decides who is moderator.
- Your app can restrict access by room name.
- Your app can expire tokens.
- Your app can map users, avatars, display names, and roles.
### Production handling
- Store JWT secrets safely.
- Rotate secrets carefully.
- Use short token lifetimes where possible.
- Do not expose app secrets in frontend code.
- Disable anonymous room creation.
- Enable lobby and moderator controls.
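
As an illustration of backend-issued tokens, here is a minimal sketch that signs a short-lived HS256 JWT with the Python standard library only. The claim names (`iss`, `aud`, `sub`, `room`, `context`) follow the commonly documented Jitsi token layout, but treat them, the `jitsi` audience, the `meet.example.com` domain, and the way the `moderator` flag is honored as assumptions to verify against your own Prosody token configuration.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as the JWT compact form requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_meeting_token(app_id, app_secret, room, user_name, moderator,
                       ttl_seconds=300, now=None):
    """Build an HS256-signed JWT that is valid for one room and a short time."""
    issued_at = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {
        "iss": app_id,              # assumed to match the app_id configured in Prosody
        "aud": "jitsi",             # assumed audience value
        "sub": "meet.example.com",  # hypothetical deployment domain
        "room": room,               # restricts the token to a single room
        "nbf": issued_at,
        "exp": issued_at + ttl_seconds,  # short lifetime limits replay
        "context": {"user": {"name": user_name, "moderator": moderator}},
    }
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(app_secret.encode("utf-8"),
                         signing_input.encode("ascii"),
                         hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


# The secret stays on the backend; only the finished token reaches the browser.
token = make_meeting_token("my_app_id", "server-side-secret", "weekly-sync",
                           "Alice", moderator=True)
print(len(token.split(".")))  # header, payload, signature
```

In a real deployment a maintained JWT library is usually preferable to hand-rolled signing; the sketch only shows which decisions (room, role, expiry) the backend controls.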
## 15. Web clients and mobile clients
### What they are
Clients are the browsers and mobile apps used by participants.
### What they do
Clients handle:
- Camera and microphone capture.
- WebRTC encryption and transport.
- Encoding local media.
- Decoding remote media.
- UI interactions.
- Sending receiver constraints to request fewer or lower-quality remote streams.
### Production handling
Client performance is part of your infrastructure capacity. Even if JVB has enough bandwidth, weak phones or old laptops may fail in large rooms.
Recommended client-side measures:
- Start with audio muted for large rooms.
- Start with video muted for large rooms.
- Limit default resolution.
- Limit visible remote videos with LastN/receiver constraints.
- Recommend Chrome/Chromium-based browsers or tested clients.
- Monitor client-side error reports if you control the application.
## 16. Monitoring and logging components
### What they are
Production Jitsi needs observability. Without monitoring, you cannot know whether the problem is JVB bandwidth, TURN fallback, Prosody signaling, client CPU, or bad network conditions.
### Recommended stack
```text
Prometheus Metrics collection
Grafana Dashboards
Loki or ELK Logs
Node exporter Server metrics
Blackbox exporter External health checks
Alertmanager Alerts
```
### Metrics to watch
| Area | Important metrics |
|---|---|
| JVB | endpoints, conferences, packet loss, bitrate, CPU, memory, network in/out |
| Prosody | connections, auth failures, MUC behavior, module errors |
| Jicofo | bridge selection, conference allocation, errors |
| Nginx | 4xx/5xx, WebSocket upgrade failures, TLS expiry |
| TURN | relay bandwidth, allocation count, failed allocations |
| Jibri | active sessions, failed recordings, CPU, memory, disk, ffmpeg errors |
| System | CPU steal, disk usage, disk IO, network saturation, process restarts |
### Production handling
- Alert before saturation, not after users complain.
- Keep dashboards per shard and per JVB.
- Log conference IDs and bridge allocation events where possible.
- Track TURN usage percentage. A sudden increase usually means clients cannot reach the JVB directly over UDP.
- Track certificate expiry.
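
The TURN usage check can be sketched as a small calculation. The input counters and the 10-point threshold below are hypothetical; how you obtain relayed and total session counts depends on your metrics stack.

```python
def turn_usage_percent(relayed_sessions: int, total_sessions: int) -> float:
    """Share of media sessions that are relayed through TURN, in percent."""
    if total_sessions <= 0:
        return 0.0
    return 100.0 * relayed_sessions / total_sessions


def turn_spike(current_pct: float, baseline_pct: float,
               max_increase: float = 10.0) -> bool:
    """Flag a rise of more than `max_increase` percentage points over baseline.

    The threshold is a hypothetical starting value, not a Jitsi recommendation.
    """
    return current_pct - baseline_pct > max_increase


current = turn_usage_percent(relayed_sessions=30, total_sessions=100)
print(current, turn_spike(current, baseline_pct=12.0))
```

Comparing against a baseline rather than a fixed percentage accounts for deployments where a steady share of users is always behind restrictive networks.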
## 17. Ports and network paths
### Standard ports
| Port | Protocol | Component | Purpose |
|---|---|---|---|
| 80 | TCP | Nginx/Web | HTTP redirect and Let's Encrypt validation |
| 443 | TCP | Nginx/Web | HTTPS web app and WebSocket/BOSH proxy |
| 10000 | UDP | JVB | Main WebRTC media path |
| 5222 | TCP | Prosody | XMPP client/component communication, usually internal/restricted |
| 3478 | UDP | Coturn | STUN/TURN |
| 5349 | TCP/TLS | Coturn | TURN over TLS fallback |
| 20000-20050 | UDP | Jigasi | Optional SIP media range depending on deployment |
The official Docker deployment documentation lists external ports 80/tcp, 443/tcp, and 10000/udp, and also notes 20000-20050/udp for Jigasi SIP access if that component is deployed. [2]
The official Debian/Ubuntu self-hosting guide lists 80 TCP, 443 TCP, 10000 UDP, SSH, 3478 UDP, and 5349 TCP as relevant firewall ports for a typical server with coturn support. [4]
### Production rule
Open only what must be public. Keep the rest private.
```text
Public:
- 80/tcp
- 443/tcp
- 10000/udp on JVB nodes
- 3478/udp and 5349/tcp on TURN nodes, if used
Private/restricted:
- 5222/tcp Prosody
- metrics ports
- admin ports
- SSH
- Docker/Kubernetes APIs
```
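The rule above can be expressed as a small audit helper. This is a sketch under assumptions: the role names (`web`, `jvb`, `turn`) are hypothetical labels, and the list of open ports is expected to come from your own external scan.

```python
# Ports that may be publicly reachable, per node role (taken from the rule above).
PUBLIC_ALLOWED = {
    "web": {(80, "tcp"), (443, "tcp")},
    "jvb": {(10000, "udp")},
    "turn": {(3478, "udp"), (5349, "tcp")},
}


def unexpected_public_ports(role, open_ports):
    """Return publicly reachable (port, protocol) pairs that the rule does not allow."""
    allowed = PUBLIC_ALLOWED.get(role, set())
    return sorted(set(open_ports) - allowed)


# A web node that accidentally exposes Prosody to the Internet.
scan = [(80, "tcp"), (443, "tcp"), (5222, "tcp")]
print(unexpected_public_ports("web", scan))
```

Running such a check from outside the network, for example in CI after each firewall change, catches exposure that a local configuration review can miss.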
## 18. How a participant joins a meeting
```text
1. User opens room URL.
2. Browser downloads Jitsi Meet Web from Nginx.
3. Browser opens XMPP signaling over WebSocket or BOSH.
4. Prosody authenticates or accepts guest access.
5. Browser joins the MUC room.
6. Jicofo coordinates the conference.
7. Jicofo selects a JVB.
8. Browser and JVB exchange WebRTC transport information through signaling.
9. Browser sends encrypted media to JVB.
10. JVB forwards selected media streams to other participants.
11. Prosody/Jicofo continue managing room state while JVB handles media.
```
## 19. Production topologies
### Single-server deployment
```text
One server:
- Nginx
- Jitsi Meet Web
- Prosody
- Jicofo
- JVB
- optional coturn
```
Good for testing, small internal teams, and proof of concept.
Not recommended for 1000+ production users.
### Split JVB deployment
```text
Controller node:
- Nginx
- Jitsi Meet Web
- Prosody
- Jicofo
Media nodes:
- JVB 1
- JVB 2
- JVB 3
- JVB N
```
This is the first serious production architecture. Jitsi's scalable setup recommends splitting the central Jitsi Meet instance from videobridges as the first scaling step. [3]
### Sharded deployment
```text
Shard A:
- Web
- Prosody
- Jicofo
- JVB pool
Shard B:
- Web
- Prosody
- Jicofo
- JVB pool
Application router:
- routes rooms/users to a shard
```
Good for high availability and large scale.
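The application router must send every participant of a room to the same shard. One simple way to do that is deterministic hashing of the room name, sketched below. The shard hostnames are hypothetical, lowercasing the room name is an assumption about how your application normalizes names, and plain modulo hashing remaps rooms whenever the shard list changes, so it suits a fixed shard set.

```python
import hashlib


def shard_for_room(room_name: str, shards: list) -> str:
    """Map a room name to exactly one shard, deterministically."""
    if not shards:
        raise ValueError("at least one shard is required")
    # Normalize so "Weekly-Sync" and "weekly-sync" land on the same shard.
    key = room_name.strip().lower().encode("utf-8")
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


SHARDS = ["shard-a.meet.example.com", "shard-b.meet.example.com"]  # hypothetical
print(shard_for_room("weekly-sync", SHARDS))
```

Because the mapping depends only on the room name and the shard list, any stateless router instance reaches the same decision without shared storage.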
### Regional deployment
```text
EU region:
- EU web/control shard
- EU JVBs
US region:
- US web/control shard
- US JVBs
Asia region:
- Asia web/control shard
- Asia JVBs
```
Good when users are globally distributed and latency matters.
## 20. Scaling guide by component
| Component | Bottleneck | How to scale |
|---|---|---|
| Web/Nginx | TLS, static traffic, WebSocket proxying | Add web nodes or shards |
| Prosody | XMPP connections, modules, room state | Usually scale by shard, not by adding standalone replicas |
| Jicofo | Conference orchestration, bridge control | Usually scale by shard; design active focus carefully |
| JVB | Bandwidth, packet forwarding, CPU | Add more JVB nodes |
| TURN | Relay bandwidth | Add more TURN nodes, use geo-distribution |
| Jibri | Encoding CPU/RAM/disk | One Jibri per simultaneous recording |
| Jigasi | SIP calls and audio bridge load | Add more Jigasi instances if SIP use grows |
| Monitoring | Metrics/log volume | Scale storage and retention separately |
## 21. Production best practices
### Infrastructure
- Use separate controller and JVB nodes.
- Use separate TURN nodes for serious production.
- Use separate Jibri nodes for recording.
- Use configuration management such as Ansible, Terraform, or GitOps.
- Pin image and package versions; do not deploy floating `latest` tags in production.
- Keep staging and production separate.
### Security
- Use trusted TLS certificates.
- Disable anonymous room creation.
- Prefer JWT for application integration.
- Enable lobby for guest access.
- Restrict Prosody, metrics, SSH, and admin ports.
- Rotate secrets.
- Do not expose Docker socket or internal service ports.
### Performance
- Keep UDP 10000 working for JVB nodes.
- Use TURN only as fallback, not as the normal path for everyone.
- Limit default video quality.
- Limit LastN/visible remote video count in large rooms.
- Start audio/video muted for large public rooms.
- Keep enough JVB spare capacity.
### Operations
- Monitor before production launch.
- Load test with realistic room patterns.
- Keep rollback packages or images ready.
- Upgrade JVB nodes one by one.
- Drain traffic before restarting busy components when possible.
- Keep clear incident runbooks.
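
The "one by one" and "drain first" practices can be sketched as a loop that stops at the first failure. The `drain`, `upgrade`, and `is_healthy` callables are hypothetical hooks into your own automation; the sketch does not call any Jitsi API.

```python
def rolling_upgrade(bridges, drain, upgrade, is_healthy):
    """Upgrade bridges one at a time and stop at the first unhealthy result.

    `drain`, `upgrade`, and `is_healthy` are caller-supplied callables
    (hypothetical hooks). Returns the bridges that were upgraded successfully.
    """
    done = []
    for bridge in bridges:
        drain(bridge)    # stop new conferences and let existing ones finish
        upgrade(bridge)  # replace the package or container image
        if not is_healthy(bridge):
            break        # halt so the remaining bridges keep serving traffic
        done.append(bridge)
    return done


# Dry run with stand-in hooks that only print what would happen.
result = rolling_upgrade(
    ["jvb-node-01", "jvb-node-02"],
    drain=lambda b: print("drain", b),
    upgrade=lambda b: print("upgrade", b),
    is_healthy=lambda b: True,
)
print(result)
```

Stopping on the first unhealthy bridge limits a bad release to a single node while the rest of the pool still runs the previous version.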
## 22. Failure modes and where to look
| Symptom | Likely component | What to check |
|---|---|---|
| Page does not load | Nginx, DNS, TLS | DNS, certificate, Nginx logs, firewall 443 |
| Users can join but no audio/video | JVB, firewall, NAT | UDP 10000, JVB advertised IP, browser ICE candidates |
| Two users work but three or more fail | JVB path | JVB public IP, UDP 10000, NAT, Docker advertise IP |
| Users behind corporate networks fail | TURN | coturn health, 443/5349, credentials, certificates |
| Rooms are not created | Prosody/Jicofo | XMPP logs, auth config, Jicofo service account |
| JWT users cannot join | Prosody auth | app_id, app_secret, token claims, room claim, time skew |
| Recording fails | Jibri | Chrome, Chromedriver, ffmpeg, ALSA loopback, disk, Jibri account |
| SIP call-in fails | Jigasi | SIP credentials, PBX routing, firewall, media range |
| High packet loss | JVB/network | NIC saturation, cloud network, UDP drops, region distance |
| Random disconnects | Client/network/JVB | WebSocket stability, JVB stress, browser logs |
## 23. Recommended architecture for 1000+ concurrent users
For more than 1000 concurrent participants across many calls, a practical starting design is:
```text
2 Jitsi control shards
8-10 JVB nodes total
2 TURN nodes
1 monitoring/logging stack
Jibri pool only if recording/livestreaming is required
```
Each shard:
```text
1 controller node:
- Nginx
- Jitsi Meet Web
- Prosody
- Jicofo
4-5 JVB nodes:
- Jitsi Videobridge only
```
Shared or per-region:
```text
2 TURN nodes
Monitoring/logging
Optional Jibri worker pool
Optional Jigasi worker pool
```
This design allows:
- Horizontal media scaling by adding JVBs.
- Failure isolation by shard.
- Easier upgrades.
- Better observability.
- Controlled authentication and room routing.
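
The JVB count in this design can be sanity-checked with a rough calculation. The per-bridge figure of 125 participants used below is a hypothetical planning value chosen only to illustrate the arithmetic; the real number must come from your own load test with realistic room sizes and video settings.

```python
import math


def jvb_nodes_needed(concurrent_users: int, users_per_jvb: int,
                     spare_nodes: int = 0) -> int:
    """Estimate the JVB node count from a measured per-bridge capacity."""
    if users_per_jvb <= 0:
        raise ValueError("users_per_jvb must be positive")
    return math.ceil(concurrent_users / users_per_jvb) + spare_nodes


# Hypothetical capacity of 125 users per bridge, with and without two spare nodes.
print(jvb_nodes_needed(1000, users_per_jvb=125))
print(jvb_nodes_needed(1000, users_per_jvb=125, spare_nodes=2))
```

Keeping the spare nodes as a separate input makes the headroom an explicit decision instead of hiding it in an optimistic capacity figure.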
## 24. Component responsibility map
```text
User interface problem -> Jitsi Meet Web / browser
TLS or static file problem -> Nginx / reverse proxy
Login or room auth problem -> Prosody / JWT / LDAP
Room orchestration problem -> Jicofo
Audio/video routing problem -> JVB
Strict firewall problem -> TURN
Recording/livestream problem -> Jibri
SIP/PSTN problem -> Jigasi
Scale problem -> Usually JVB, then sharding
```
## 25. Final production checklist
Before launch:
- Domain and DNS are correct.
- TLS certificate is trusted and auto-renewing.
- UDP 10000 reaches every JVB.
- JVB advertised IPs are correct.
- Prosody internal ports are not publicly exposed.
- JWT or chosen authentication is working.
- Guests cannot create rooms unless intentionally allowed.
- TURN works for restricted networks.
- Monitoring and alerts are active.
- Load test has been run with realistic room distribution.
- Jibri is deployed only if recording/livestreaming is needed.
- Backups exist for configuration and secrets.
- Upgrade and rollback procedure exists.
## References
[1] Jitsi Meet Handbook, Architecture: https://jitsi.github.io/handbook/docs/architecture/
[2] Jitsi Meet Handbook, Self-Hosting Guide - Docker: https://jitsi.github.io/handbook/docs/devops-guide/devops-guide-docker/
[3] Jitsi Meet Handbook, DevOps Guide - Scalable setup: https://jitsi.github.io/handbook/docs/devops-guide/devops-guide-scalable/
[4] Jitsi Meet Handbook, Self-Hosting Guide - Debian/Ubuntu server: https://jitsi.github.io/handbook/docs/devops-guide/devops-guide-quickstart/
[5] Jitsi Meet Handbook, Requirements - Recording/Jibri: https://jitsi.github.io/handbook/docs/devops-guide/devops-guide-requirements/
[6] Jitsi Jibri GitHub repository: https://github.com/jitsi/jibri
[7] Jitsi Meet Handbook, Setting up TURN: https://jitsi.github.io/handbook/docs/devops-guide/turn/
[8] Jitsi Meet Handbook, Token Authentication: https://jitsi.github.io/handbook/docs/devops-guide/token-authentication/

# Jitsi Docker Plugins and Third-Party Software Catalog
This is a practical DevOps checklist for a self-hosted Jitsi Meet deployment running with Docker Compose. The official Docker stack is based around `web`, `prosody`, `jicofo`, and `jvb`, with optional Compose overlays for services like `jibri`, `jigasi`, `etherpad`, `whiteboard`, `transcriber`, `grafana`, `prometheus`, `rtcstats`, and log analysis. ([GitHub][1])
## 1. Core Jitsi Docker Components
| Component | Purpose | Docker Service |
| ----------------- | --------------------------------------------------- | -------------- |
| Jitsi Meet Web | Frontend web UI, Nginx, static assets, external API | `web` |
| Prosody | XMPP server used for signaling, auth, room control | `prosody` |
| Jicofo | Conference focus, room/session orchestration | `jicofo` |
| Jitsi Videobridge | SFU media bridge for audio/video routing | `jvb` |
| Jibri | Recording and live streaming worker | `jibri` |
| Jigasi | SIP gateway and dial-in/dial-out support | `jigasi` |
| Jitsi Transcriber | Speech-to-text transcription support | `transcriber` |
| JaaS Components | Hosted Jigasi-style components from 8x8/JaaS | optional |
## 2. Official Optional Docker Overlays
| Overlay File | Feature | Use Case |
| ------------------ | ----------------------- | --------------------------------------- |
| `jibri.yml` | Recording and streaming | Record meetings, stream to YouTube/RTMP |
| `jigasi.yml` | SIP gateway | Connect SIP PBX, PSTN, VoIP users |
| `etherpad.yml` | Shared documents | Collaborative meeting notes |
| `whiteboard.yml` | Excalidraw whiteboard | Collaborative drawing/whiteboard |
| `transcriber.yml` | Transcription | Meeting captions/transcripts |
| `grafana.yml` | Grafana dashboard | Metrics visualization |
| `prometheus.yml` | Metrics scraping | Monitoring Jitsi services |
| `rtcstats.yml` | WebRTC analytics | Client-side WebRTC quality data |
| `log-analyser.yml` | Log analysis | Loki/OpenTelemetry/Grafana log view |
The official Docker guide shows these overlays being started with commands like `docker compose -f docker-compose.yml -f jibri.yml up -d`, and similar combinations for Jigasi, Etherpad, whiteboard, transcriber, Grafana, and log analysis. ([Jitsi][2])
## 3. Reverse Proxy and TLS Software
| Software | Purpose | Docker-Friendly | Notes |
| ------------- | -------------------------------------------- | --------------- | --------------------------------------------------------------- |
| Nginx | Reverse proxy, TLS termination, HTTP routing | Yes | Common production choice |
| Traefik | Dynamic reverse proxy for Docker labels | Yes | Good for multi-service Docker hosts |
Jitsi Docker requires a real `PUBLIC_URL` for production deployments, and the official `.env` includes Let's Encrypt-related settings such as domain, email, staging mode, and ACME server selection. ([Jitsi][2])
## 4. NAT, STUN, and TURN
| Software | Purpose | When to Use |
| ------------------ | ---------------------- | ------------------------------------------------------- |
| coturn | TURN/STUN relay server | Required for reliable calls behind strict NAT/firewalls |
| Google STUN | Public STUN service | Basic NAT discovery, not enough for all networks |
| Custom STUN | Your own STUN endpoint | Controlled infrastructure |
| TURN over TCP 443 | Firewall bypass | Corporate networks that block UDP |
| TURN over TLS 5349 | Secure TURN relay | Better for enterprise deployments |
Jitsi can use a TURN server for cases where direct peer-to-peer connectivity fails; the official TURN guide discusses coturn, XMPP-delivered TURN credentials, UDP 3478, TCP/TLS 5349, and using port 443 for restrictive networks. ([Jitsi][3])
## 5. Authentication and SSO
| Tool | Integration Type | Notes |
| ------------------------------ | -------------------------------- | ------------------------------------------------- |
| Internal Prosody Auth | Username/password inside Prosody | Simple small deployment |
| JWT Auth | Token-based authentication | Best for custom apps and portals |
| LDAP | Directory authentication | Enterprise user directories |
| Active Directory | LDAP/SASL integration | Corporate auth |
| OpenLDAP | LDAP backend | Self-hosted directory |
| Keycloak | OIDC/SAML identity provider | Usually integrated through JWT adapters |
| authentik | OIDC/SAML identity provider | Good self-hosted SSO option |
| Authelia | SSO and access control | Usually used in front of apps |
| Dex | Lightweight OIDC provider | Kubernetes-friendly |
| OAuth2 Proxy | Auth gateway | Can protect Jitsi landing pages or custom portals |
| jitsi-OIDC-adapter | OIDC to Jitsi JWT bridge | Community integration |
| jitsi-OIDC-SAML-adapter | OIDC/SAML to Jitsi JWT bridge | Community integration |
| nordeck/jitsi-keycloak-adapter | Keycloak adapter | Dockerized Jitsi integration |
The official Docker `.env` supports `AUTH_TYPE=internal`, `jwt`, `ldap`, or `matrix`, and includes JWT and LDAP configuration fields. Jitsi's JWT auth plugin verifies client connections using JWT and supports shared-secret or public-key validation. ([GitHub][4])
## 6. SIP, VoIP, and Telephony
| Software | Purpose | Works With |
| --------------------------- | ---------------------- | ------------------------ |
| Jigasi | Jitsi SIP gateway | SIP providers, PBX, PSTN |
| Asterisk | PBX server | Jigasi |
| FreePBX | Asterisk management UI | Jigasi |
| FreeSWITCH | PBX/media server | Jigasi |
| Kamailio | SIP proxy | Large SIP routing |
| OpenSIPS | SIP proxy | Large SIP routing |
| SIP provider account | External calling | Jigasi |
| Twilio Elastic SIP Trunking | SIP trunk | Jigasi/Asterisk |
| Telnyx SIP | SIP trunk | Jigasi/Asterisk |
| VoIP.ms | SIP trunk | Jigasi/Asterisk |
| SignalWire | SIP/telephony | Jigasi/Asterisk |
Jitsi Docker's `.env` includes Jigasi SIP settings such as SIP URI, SIP password, SIP server, SIP port, and SIP transport. ([GitHub][4])
## 7. Recording, Streaming, and Storage
| Software | Purpose | Notes |
| ---------------------- | ---------------------------- | ------------------------------------- |
| Jibri | Recording and streaming | Official Jitsi recording component |
| FFmpeg | Media processing | Used in recording/streaming workflows |
| Google Chrome/Chromium | Headless capture for Jibri | Required by Jibri |
| ALSA/PulseAudio | Audio capture stack | Used by Jibri |
| YouTube Live | RTMP streaming target | Jibri can stream to RTMP |
| Twitch | RTMP streaming target | Possible with stream key |
| Facebook Live | RTMP streaming target | Possible with stream key |
| Nginx RTMP Module | Self-hosted RTMP endpoint | Internal streaming pipeline |
| Owncast | Self-hosted live streaming | RTMP target |
| Restream | Multi-platform streaming | RTMP target |
| MinIO | S3-compatible object storage | Store recordings |
| AWS S3 | Object storage | Store recordings |
| Wasabi | S3-compatible storage | Store recordings |
| Backblaze B2 | Object storage | Store recordings |
| rclone | Upload/sync recordings | Post-recording automation |
## 8. Collaboration Add-ons
| Software | Purpose | Integration Style |
| ---------------------- | ---------------------------- | ------------------------------ |
| Etherpad | Shared document editing | Official Docker overlay |
| Excalidraw | Whiteboard | Official whiteboard overlay |
| Nextcloud | Files, calendar, office docs | External integration |
| OnlyOffice | Document editing | With Nextcloud or standalone |
| Collabora Online | Document editing | With Nextcloud |
The official Docker setup has direct support for Etherpad document sharing and an Excalidraw-based virtual collaborative whiteboard. ([Jitsi][2])
## 9. Chat and Team Platform Integrations
| Platform | Integration Method | Notes |
| -------------------------- | ----------------------------------------- | ----------------------------------- |
| Matrix / Element | Matrix auth or meeting integration | Jitsi can be used from Matrix rooms |
| Mattermost | Jitsi plugin/integration | Team chat video calls |
| Rocket.Chat | Jitsi integration | Team chat video calls |
| Nextcloud Talk / Nextcloud | External meeting links or app integration | Good self-hosted suite |
| Moodle | Jitsi plugin | Education/LMS |
## 10. Web and App Embedding
| Tool | Purpose | Notes |
| ----------------- | ------------------------------- | ------------------------------ |
| Jitsi IFrame API  | Embed meetings in websites/apps | Officially supported method    |
| External API JS | Browser-side meeting control | Loaded from `/external_api.js` |
| lib-jitsi-meet | Low-level JS library | Build custom video apps |
The official IFrame API lets you embed Jitsi Meet into your own application, and the event API allows listening to meeting events through `JitsiMeetExternalAPI`. ([Jitsi][5])
## 11. Prosody Plugins and XMPP Modules
| Plugin / Module Type | Purpose |
| ---------------------------- | ------------------------------- |
| Custom Prosody modules | Add custom XMPP behavior |
| JWT auth module | Token authentication |
| LDAP/SASL auth module | Enterprise directory auth |
| MUC modules | Room behavior customization |
| Lobby modules | Guest waiting room behavior |
| MUC size module | Room participant metrics |
| MUC domain mapper | Multi-domain setups |
| Token moderation | Moderator control from JWT |
| Room metadata modules | Store extra room info |
| Reservation modules | Room booking or room validation |
| External services module | TURN credential delivery |
| Rate limiting modules | Abuse protection |
| Anti-spam modules | Public server protection |
| Webhook-style custom module | Send events to external backend |
| Custom access control module | Per-room or per-user policy |
For Docker deployments, custom Prosody plugins are usually mounted into the Prosody config/plugin path and enabled through Prosody/Jitsi configuration. The official Docker guide creates a `prosody/prosody-plugins-custom` directory for custom plugin use. ([Jitsi][2])
## 12. Monitoring and Observability
| Software | Purpose | Notes |
| ------------------- | ------------------------------- | -------------------------------------- |
| Prometheus | Metrics collection | Official Docker overlay exists |
| Grafana | Dashboards | Official Docker overlay exists |
| Jitsi Meet Exporter | Prometheus exporter | Exposes Jitsi metrics |
| Loki | Log aggregation | Used in log analyzer stack |
| OpenTelemetry | Telemetry/log pipeline | Used in log analyzer stack |
The Jitsi Docker repository includes `prometheus.yml`, `grafana.yml`, `rtcstats.yml`, and `log-analyser.yml`; the log analyser uses Grafana Loki and OpenTelemetry for log management and analysis. ([GitHub][1])

# Replicating Jitsi Videobridge in Docker
## 1. What JVB replication means
In Jitsi, the component that normally becomes the bottleneck is **Jitsi Videobridge**, also called **JVB**. JVB is the SFU/media router that handles RTP audio and video traffic. The official Jitsi scalable setup is based on **one Jitsi Meet core node** running web, Prosody, and Jicofo, plus **multiple JVB nodes** handling media traffic. Jitsi's own guide says the videobridge is usually the first limiting factor and that bridges can be scaled horizontally by adding more of them. ([Jitsi][1])
Important: this is not classic load balancing with HAProxy or Nginx in front of UDP media. JVBs register into the bridge pool, and **Jicofo selects a bridge when a new conference starts**. ([Jitsi][1])
## 2. Target architecture
```text
                        Users
                          |
                     80/443 TCP
                          |
                 +----------------+
                 |   Jitsi Core   |
                 |   web/nginx    |
                 |   prosody      |
                 |   jicofo       |
                 +----------------+
                          | 5222 TCP
         private / firewall-restricted XMPP
                          |
        +-----------------+-----------------+
        |                 |                 |
+---------------+ +---------------+ +---------------+
|  JVB node 1   | |  JVB node 2   | |  JVB node 3   |
|  Docker jvb   | |  Docker jvb   | |  Docker jvb   |
|  10000/udp    | |  10000/udp    | |  10000/udp    |
+---------------+ +---------------+ +---------------+
        |                 |                 |
        +------- media RTP to clients ------+
```
Jitsi's scalable guide shows this same pattern: one central Jitsi Meet server with nginx, Prosody, and Jicofo, plus multiple videobridges connected over XMPP. It also documents `80/tcp`, `443/tcp`, `5222/tcp`, and `10000/udp` as the key ports in this architecture. ([Jitsi][1])
## 3. What replication improves
JVB replication improves:
* Total concurrent meetings
* Total concurrent users
* Media CPU capacity
* Network bandwidth capacity
* Fault isolation between conferences
* Easier horizontal scaling by adding more bridge hosts
It does not automatically make one very large conference split across all bridges. By default, Jicofo schedules a conference onto a selected bridge. If you need one conference distributed across multiple bridges, that becomes an **Octo / cascading JVB** design and should be treated as a separate advanced architecture.
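The difference between spreading conferences and splitting one conference can be illustrated with a toy model. This is a simplified sketch, not Jicofo's actual selection strategy: it picks the least-loaded bridge for a new conference and then keeps every later participant of that conference on the same bridge. The class and bridge names are hypothetical.

```python
def pick_bridge(bridge_load: dict) -> str:
    """Pick the bridge with the fewest participants (ties broken by name)."""
    if not bridge_load:
        raise ValueError("no bridges registered")
    return min(sorted(bridge_load), key=lambda name: bridge_load[name])


class ConferencePlacer:
    """Toy model: a conference is pinned to one bridge when it starts."""

    def __init__(self, bridge_load: dict):
        self.bridge_load = dict(bridge_load)
        self.placement = {}

    def join(self, conference: str) -> str:
        """Add one participant and return the bridge that serves them."""
        if conference not in self.placement:
            # The choice happens once, when the conference is created.
            self.placement[conference] = pick_bridge(self.bridge_load)
        bridge = self.placement[conference]
        self.bridge_load[bridge] += 1
        return bridge


placer = ConferencePlacer({"jvb1": 0, "jvb2": 0})
for room in ["standup", "standup", "standup", "retro"]:
    print(room, "->", placer.join(room))
```

In this model a large conference keeps growing on one bridge while other bridges stay idle, which is why more bridges raise total capacity but not the size limit of a single room.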
## 4. Recommended deployment model
For production, use:
```text
1 core Jitsi node:
web
prosody
jicofo
optionally one local jvb
N remote JVB nodes:
jvb only
```
Running many JVB containers on the same host is possible for testing, but it is not the best production model because each JVB needs UDP media ports, CPU, memory, kernel UDP buffers, and public reachability. The official sizing guide also notes that videobridges carry more load than the main Jitsi Meet server and suggests larger CPU allocation for JVB hosts. ([Jitsi][1])
## 5. Required ports
### Core Jitsi node
| Port | Direction | Purpose |
| ---------- | -----------------------------: | ---------------------------------- |
| `80/tcp` | public inbound | HTTP redirect or ACME challenge |
| `443/tcp` | public inbound | Jitsi web UI and WebSocket traffic |
| `5222/tcp` | private inbound from JVB nodes | Prosody XMPP client connection |
| `5347/tcp` | internal only | XMPP component connections |
| `5280/tcp` | internal or reverse-proxied | BOSH/WebSocket depending on setup |
The Docker self-hosting guide lists `80/tcp`, `443/tcp`, and `10000/udp` as the main external ports, and the scalable guide says `5222/tcp` should be open only to videobridges. ([Jitsi][2])
### Each JVB node
| Port | Direction | Purpose |
| ------------------------------- | ---------------------: | --------------------------------------- |
| `10000/udp` | public inbound | WebRTC RTP media |
| `8080/tcp` | localhost/private only | Colibri REST API |
| `443/tcp` or reverse proxy path | optional | Colibri WebSocket if exposed separately |
The Docker guide defines `JVB_PORT` as the UDP media port, defaulting to `10000`, and `JVB_COLIBRI_PORT` as the local Colibri API port, defaulting to `8080`. ([Jitsi][2])
## 6. Core node Docker configuration
Start with the normal `docker-jitsi-meet` stack.
```bash
git clone https://github.com/jitsi/docker-jitsi-meet
cd docker-jitsi-meet
cp env.example .env
./gen-passwords.sh
mkdir -p ~/.jitsi-meet-cfg/{web,transcripts,prosody/config,prosody/prosody-plugins-custom,jicofo,jvb,jigasi,jibri}
```
The Docker guide recommends copying `env.example`, generating strong internal passwords with `./gen-passwords.sh`, and creating the required config directories before starting the stack. ([Jitsi][2])
Example core `.env`:
```env
CONFIG=~/.jitsi-meet-cfg
TZ=UTC
PUBLIC_URL=https://meet.example.com
HTTP_PORT=80
HTTPS_PORT=443
ENABLE_LETSENCRYPT=1
LETSENCRYPT_DOMAIN=meet.example.com
LETSENCRYPT_EMAIL=admin@example.com
ENABLE_HTTP_REDIRECT=1
JVB_AUTH_USER=jvb
JVB_AUTH_PASSWORD=use_the_generated_password_from_core_env
JVB_BREWERY_MUC=jvbbrewery
XMPP_DOMAIN=meet.jitsi
XMPP_AUTH_DOMAIN=auth.meet.jitsi
XMPP_INTERNAL_MUC_DOMAIN=internal-muc.meet.jitsi
XMPP_MUC_DOMAIN=muc.meet.jitsi
XMPP_SERVER=xmpp.meet.jitsi
XMPP_PORT=5222
```
Expose Prosody `5222/tcp` from the core node to the JVB nodes. Do not expose it to the entire Internet.
Example `docker-compose.override.yml` on the core node:
```yaml
services:
  prosody:
    ports:
      # Publish XMPP only on the private interface of the core node.
      - "10.0.0.10:5222:5222"
```
Use a private network address if possible. If your JVBs are on separate public servers, restrict this port with firewall rules.
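Example firewall logic. Note that ports published by Docker are handled by Docker's own iptables rules and can bypass ufw, so binding Prosody to a private address as shown above remains the primary control: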
```bash
ufw allow 80/tcp
ufw allow 443/tcp
ufw allow from JVB1_PUBLIC_OR_PRIVATE_IP to any port 5222 proto tcp
ufw allow from JVB2_PUBLIC_OR_PRIVATE_IP to any port 5222 proto tcp
ufw deny 5222/tcp
```
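With more than a few bridges, the per-bridge allow rules are easier to generate than to type. A minimal sketch that only builds the command strings shown above and validates the addresses; the IP addresses in the usage example are hypothetical, and the function does not execute anything.

```python
import ipaddress


def prosody_ufw_rules(jvb_ips, port=5222):
    """Build ufw commands that allow the given JVB hosts and deny everyone else."""
    rules = []
    for ip in jvb_ips:
        ipaddress.ip_address(ip)  # raises ValueError on a malformed address
        rules.append(f"ufw allow from {ip} to any port {port} proto tcp")
    # The deny rule goes last so the allow rules above are matched first.
    rules.append(f"ufw deny {port}/tcp")
    return rules


for rule in prosody_ufw_rules(["10.0.0.21", "10.0.0.22"]):
    print(rule)
```

Generating the list from the same inventory that provisions the bridges keeps the firewall in step when JVB nodes are added or removed.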
Start the core stack:
```bash
docker compose up -d
```
## 7. Remote JVB node Docker Compose
On every remote JVB server, run only the `jvb` container.
Directory layout:
```bash
mkdir -p /opt/jitsi-jvb
cd /opt/jitsi-jvb
mkdir -p ~/.jitsi-meet-cfg/jvb
```
Create `.env`:
```env
CONFIG=~/.jitsi-meet-cfg
TZ=UTC
JITSI_IMAGE_VERSION=stable
PUBLIC_URL=https://meet.example.com
XMPP_SERVER=10.0.0.10
XMPP_PORT=5222
XMPP_DOMAIN=meet.jitsi
XMPP_AUTH_DOMAIN=auth.meet.jitsi
XMPP_INTERNAL_MUC_DOMAIN=internal-muc.meet.jitsi
JVB_AUTH_USER=jvb
JVB_AUTH_PASSWORD=the_same_JVB_AUTH_PASSWORD_from_core
JVB_BREWERY_MUC=jvbbrewery
JVB_PORT=10000
JVB_ADVERTISE_IPS=JVB_PUBLIC_IP
JVB_MUC_NICKNAME=jvb-node-01
JVB_INSTANCE_ID=jvb-node-01
COLIBRI_REST_ENABLED=true
SHUTDOWN_REST_ENABLED=true
VIDEOBRIDGE_MAX_MEMORY=3072m
```
`JVB_ADVERTISE_IPS` is critical. The Docker guide says it controls which IPs and ports the bridge advertises for WebRTC media, and it must be set correctly when behind NAT or on the public Internet. If it is wrong, calls can fail when more than two users join. ([Jitsi][2])
Create `docker-compose.yml`:
```yaml
services:
  jvb:
    image: jitsi/jvb:${JITSI_IMAGE_VERSION:-stable}
    restart: unless-stopped
    ports:
      # Public WebRTC media port.
      - "${JVB_PORT:-10000}:${JVB_PORT:-10000}/udp"
      # Colibri REST API, reachable from localhost only.
      - "127.0.0.1:8080:8080"
    volumes:
      - ${CONFIG}/jvb:/config
    environment:
      # Values are taken from the .env file in the same directory.
      - TZ
      - PUBLIC_URL
      - XMPP_SERVER
      - XMPP_PORT
      - XMPP_DOMAIN
      - XMPP_AUTH_DOMAIN
      - XMPP_INTERNAL_MUC_DOMAIN
      - JVB_AUTH_USER
      - JVB_AUTH_PASSWORD
      - JVB_BREWERY_MUC
      - JVB_PORT
      - JVB_ADVERTISE_IPS
      - JVB_MUC_NICKNAME
      - JVB_INSTANCE_ID
      - COLIBRI_REST_ENABLED
      - SHUTDOWN_REST_ENABLED
      - VIDEOBRIDGE_MAX_MEMORY
```
Start the remote JVB:
```bash
docker compose up -d
```
Check logs:
```bash
docker compose logs -f jvb
```
On the core node:
```bash
docker compose logs -f prosody
docker compose logs -f jicofo
```
You should see the new bridge join the brewery MUC, and Jicofo should detect it. The scalable setup guide says you can verify bridge connection in Prosody and Jicofo logs, and that Jicofo picks a videobridge when a new conference starts. ([Jitsi][1])
## 8. Same-host multi-JVB setup
Use this only for testing or small deployments.
Problem: two containers cannot both bind host port `10000/udp`.
Example:
```yaml
services:
  jvb1:
    image: jitsi/jvb:${JITSI_IMAGE_VERSION:-stable}
    restart: unless-stopped
    ports:
      # Host port 10000 maps to the container's media port.
      - "10000:10000/udp"
      - "127.0.0.1:8081:8080"
    environment:
      - JVB_PORT=10000
      - JVB_ADVERTISE_IPS=PUBLIC_IP#10000
      - JVB_MUC_NICKNAME=jvb1
      - JVB_INSTANCE_ID=jvb1
      - JVB_AUTH_USER=jvb
      - JVB_AUTH_PASSWORD=${JVB_AUTH_PASSWORD}
      - JVB_BREWERY_MUC=jvbbrewery
      - XMPP_SERVER=xmpp.meet.jitsi
      - XMPP_DOMAIN=meet.jitsi
      - XMPP_AUTH_DOMAIN=auth.meet.jitsi
      - XMPP_INTERNAL_MUC_DOMAIN=internal-muc.meet.jitsi
  jvb2:
    image: jitsi/jvb:${JITSI_IMAGE_VERSION:-stable}
    restart: unless-stopped
    ports:
      # A different host port avoids the bind conflict; the container port stays 10000.
      - "10001:10000/udp"
      - "127.0.0.1:8082:8080"
    environment:
      - JVB_PORT=10000
      # Advertise the external host port, not the internal one.
      - JVB_ADVERTISE_IPS=PUBLIC_IP#10001
      - JVB_MUC_NICKNAME=jvb2
      - JVB_INSTANCE_ID=jvb2
      - JVB_AUTH_USER=jvb
      - JVB_AUTH_PASSWORD=${JVB_AUTH_PASSWORD}
      - JVB_BREWERY_MUC=jvbbrewery
      - XMPP_SERVER=xmpp.meet.jitsi
      - XMPP_DOMAIN=meet.jitsi
      - XMPP_AUTH_DOMAIN=auth.meet.jitsi
      - XMPP_INTERNAL_MUC_DOMAIN=internal-muc.meet.jitsi
```
The `#port` syntax is used when the advertised external port differs from the internal JVB port. The Docker guide documents this pattern for `JVB_ADVERTISE_IPS`. ([Jitsi][2])
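As a rough illustration of the `#port` rule, the sketch below builds one `JVB_ADVERTISE_IPS` entry from an IP and an external port. The function name `advertise_entry` and the example address are hypothetical, and the default internal port of `10000` is an assumption taken from the compose examples in this document.

```bash
# Hypothetical helper: build one JVB_ADVERTISE_IPS entry.
# $1 = advertised IP, $2 = external (host) UDP port,
# $3 = internal JVB port (assumed to default to 10000).
advertise_entry() {
  local ip="$1"
  local external_port="$2"
  local internal_port="${3:-10000}"

  if [ "$external_port" = "$internal_port" ]; then
    # Ports match, so the bare IP is enough.
    printf '%s\n' "$ip"
  else
    # Ports differ, so append the external port after '#'.
    printf '%s#%s\n' "$ip" "$external_port"
  fi
}
```

For example, `advertise_entry 203.0.113.10 10001` yields the value that `jvb2` would need in the compose file above, with `203.0.113.10` standing in for `PUBLIC_IP`.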
## 9. Colibri WebSocket considerations
Modern Jitsi deployments commonly use Colibri WebSockets for bridge-channel communication. Jitsi Videobridge documents that WebSocket URLs include a `server-id` path such as:
```text
/colibri-ws/server-id/conf-id/endpoint-id
```
When multiple bridges are behind one HTTP proxy, the proxy must route each `server-id` to the correct JVB. The Jitsi Videobridge WebSocket documentation explicitly shows separate proxy routes for `jvb1` and `jvb2`. ([GitHub][3])
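To make the routing key concrete, here is a minimal sketch that extracts the `server-id` segment from a Colibri WebSocket path. The function name `colibri_server_id` is hypothetical; a real reverse proxy would do the same match with its own location or regex rules.

```bash
# Hypothetical helper: print the server-id segment of a path shaped like
# /colibri-ws/<server-id>/<conf-id>/<endpoint-id>.
# Returns 1 if the path does not have that shape.
colibri_server_id() {
  local path="$1"
  local rest id

  case "$path" in
    /colibri-ws/*/*) ;;      # looks like a Colibri WebSocket path
    *) return 1 ;;
  esac

  rest="${path#/colibri-ws/}"  # strip the fixed prefix
  id="${rest%%/*}"             # keep everything before the next '/'
  [ -n "$id" ] || return 1
  printf '%s\n' "$id"
}
```

A proxy rule keyed on this segment would send `/colibri-ws/jvb1/...` to the first bridge and `/colibri-ws/jvb2/...` to the second.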
For a simple Docker deployment, the easiest options are:
1. Keep JVBs directly reachable by UDP and avoid custom WebSocket routing unless needed.
2. If using Colibri WebSocket through the main domain, assign unique bridge IDs and configure reverse-proxy routing.
3. If using remote JVBs with their own public hostnames, make each JVB advertise the correct public WebSocket domain.
For production behind a reverse proxy, review these variables:
```env
ENABLE_COLIBRI_WEBSOCKET=1
COLIBRI_WEBSOCKET_REGEX=
COLIBRI_WEBSOCKET_JVB_LOOKUP_NAME=
DISABLE_COLIBRI_WEBSOCKET_JVB_LOOKUP=
JVB_WS_DOMAIN=
JVB_WS_SERVER_ID=
JVB_WS_TLS=
```
The Docker guide states that `COLIBRI_WEBSOCKET_REGEX` controls proxy matching to JVBs and recommends overriding it in production with values matching the possible JVB IP ranges. ([Jitsi][2])
## 10. Health checks
### Check JVB container
```bash
docker compose ps
docker compose logs --tail=200 jvb
```
### Check UDP listening
```bash
ss -lunp | grep 10000
```
### Check Colibri REST locally
```bash
curl -s http://127.0.0.1:8080/colibri/stats | jq
```
Useful fields:
```text
conferences
participants
endpoints
bit_rate_download
bit_rate_upload
packet_rate_download
packet_rate_upload
stress_level
version
```
### Check Jicofo sees bridges
On the core node:
```bash
docker compose logs jicofo | grep -i bridge
```
Expected idea:
```text
Added new videobridge
Bridge selected for conference
```
### Check Prosody connection
```bash
docker compose logs prosody | grep -i jvb
```
## 11. Monitoring
Recommended stack:
```text
Prometheus
Grafana
Loki or centralized Docker logs
Node Exporter
cAdvisor
Blackbox Exporter
```
Monitor at least:
| Metric | Why it matters |
| ------------------------- | -------------------------------- |
| CPU usage per JVB | SFU forwarding is CPU-sensitive |
| NIC bandwidth | Media traffic is bandwidth-heavy |
| UDP packet drops | Causes audio/video instability |
| JVB stress level | Used for bridge load decisions |
| Conferences per JVB | Confirms distribution |
| Participants per JVB | Capacity planning |
| Jicofo bridge count | Detects missing bridges |
| Prosody 5222 availability | Remote JVB registration |
| Packet loss / jitter | User quality indicator |
## 12. Autoscaling approach
Basic autoscaling logic:
```text
if average JVB stress_level > 0.75 for 5 minutes:
    add one JVB node
if average JVB stress_level < 0.25 for 30 minutes:
    drain one JVB node
    wait until conferences = 0
    remove node
```
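The threshold part of that logic can be sketched as a small shell function. This is only an illustration: the name `scale_decision` and its output strings are hypothetical, and it evaluates a single averaged sample, so the "for 5 minutes" and "for 30 minutes" windows would have to be enforced by whatever scheduler calls it.

```bash
# Hypothetical helper: map an average stress_level to a scaling action.
# $1 = average stress_level as a decimal (for example 0.80).
# Thresholds 0.75 and 0.25 come from the pseudocode above.
scale_decision() {
  awk -v s="$1" 'BEGIN {
    if (s > 0.75)      print "scale-up"   # add one JVB node
    else if (s < 0.25) print "drain"      # drain, wait for 0 conferences, remove
    else               print "hold"       # stay as is
  }'
}
```

`awk` is used because plain shell arithmetic does not handle decimal values.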
Safe scale-down process:
```bash
curl -X POST -H "Content-Type: application/json" -d '{"graceful-shutdown": "true"}' http://127.0.0.1:8080/colibri/shutdown
```
Then wait until:
```bash
curl -s http://127.0.0.1:8080/colibri/stats | jq '.conferences'
```
returns:
```text
0
```
Do not kill a busy JVB unless you accept dropping active conferences.
## 13. Common failure modes
### Calls work with two users but fail with three or more
Most likely cause:
```text
JVB_ADVERTISE_IPS is wrong
UDP 10000 is blocked
NAT is not forwarding UDP correctly
```
The Docker guide specifically warns that incorrect IP advertisement can cause calls to crash when more than two users join. ([Jitsi][2])
### Remote JVB never appears in Jicofo
Check:
```text
JVB_AUTH_PASSWORD mismatch
Prosody 5222 blocked
Wrong XMPP_SERVER
Wrong XMPP_AUTH_DOMAIN
Wrong XMPP_INTERNAL_MUC_DOMAIN
Wrong JVB_BREWERY_MUC
Firewall allows only public interface but JVB uses private route
```
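Several items in this checklist come down to a variable that is empty or missing in `.env`. The sketch below is one possible preflight check, assuming the variables have been exported into the current shell; the function name `missing_jvb_vars` is hypothetical, and the list of names is simply the set mentioned in this document, not an official list of required settings.

```bash
# Hypothetical preflight check: print the names of JVB-related variables
# that are unset or empty in the current shell environment.
missing_jvb_vars() {
  local name value missing=""

  for name in JVB_AUTH_PASSWORD XMPP_SERVER XMPP_AUTH_DOMAIN \
              XMPP_INTERNAL_MUC_DOMAIN JVB_BREWERY_MUC JVB_ADVERTISE_IPS; do
    # Portable indirect lookup of the variable called $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done

  # Strip the leading space; prints an empty line when nothing is missing.
  printf '%s\n' "${missing# }"
}
```

An empty result only shows that the values are set. It does not prove that they match the values used on the core node.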
### Multiple JVBs appear, but traffic only goes to one
Possible causes:
```text
Very few conferences
Bridge stress threshold not reached
Jicofo bridge selection strategy
One bridge has better region/locality
Other bridges are unhealthy
```
Remember: distribution is usually per conference, not per packet.
### Browser console shows Colibri WebSocket errors
Check:
```text
ENABLE_COLIBRI_WEBSOCKET
COLIBRI_WEBSOCKET_REGEX
JVB_WS_SERVER_ID
JVB_WS_DOMAIN
Reverse proxy websocket Upgrade headers
Routing /colibri-ws/<server-id>/ to the correct JVB
```
Jitsi Videobridge's WebSocket documentation notes that the proxy must support WebSocket and route the `server-id` path to the correct bridge. ([GitHub][3])

159
Vagrant/Commands.md Normal file

@@ -0,0 +1,159 @@
# Vagrant Installation and Operations Guide (Debian/Ubuntu)
## 1. Install Vagrant on Debian/Ubuntu
Add the HashiCorp GPG key and repository, then install Vagrant:
```bash
wget -O - https://apt.releases.hashicorp.com/gpg | sudo gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(grep -oP '(?<=UBUNTU_CODENAME=).*' /etc/os-release || lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/hashicorp.list
sudo apt update && sudo apt install vagrant
```
---
## 2. Project Setup
### Initialize a Vagrant Environment
```bash
vagrant init
```
### Initialize with a Specific Base Box
```bash
vagrant init ubuntu/focal64
```
---
## 3. Lifecycle Management
### Start and Provision the VM
```bash
vagrant up
```
### Stop the VM (Graceful Shutdown)
```bash
vagrant halt
```
### Force Stop (Power Off)
```bash
vagrant halt -f
```
### Reboot the VM
```bash
vagrant reload
```
### Reload and Re-Provision
```bash
vagrant reload --provision
```
---
## 4. Provisioning
### Run Provisioners Without Restarting
```bash
vagrant provision
```
---
## 5. Box Operations
### List Installed Boxes
```bash
vagrant box list
```
### Add a Box
```bash
vagrant box add ubuntu/focal64
```
### Remove a Box
```bash
vagrant box remove ubuntu/focal64
```
---
## 6. VM Management and Access
### SSH into the Machine
```bash
vagrant ssh
```
### Get SSH Configuration
```bash
vagrant ssh-config
```
### Check VM Status
```bash
vagrant status
```
### Show Forwarded Port Mappings
```bash
vagrant port
```
---
## 7. Cleanup
### Destroy a VM
```bash
vagrant destroy
```
### Destroy Without Confirmation
```bash
vagrant destroy -f
```
---
## 8. Global Status and Debugging
### View Global VM List
```bash
vagrant global-status
```
### Prune Stale Global Entries
```bash
vagrant global-status --prune
```
### Enable Debug Output
```bash
vagrant up --debug
```

110
Vagrant/Network.md Normal file

@@ -0,0 +1,110 @@
# Vagrant Networking Configuration Guide (DevOps Oriented)
This document provides a structured reference for configuring Vagrant virtual machine networking, focusing on private, public, and forwarded port configurations commonly used in DevOps workflows.
---
## 1. Types of Networks
Vagrant supports multiple networking modes. This guide focuses on two primary types:
### 1.1 Private Network
A private network lets the VM communicate with the host and with other VMs on the same private network, without exposing it to the wider LAN. It is often used for internal service communication in multi-VM setups.
**DHCP-assigned private network:**
```ruby
config.vm.network "private_network", type: "dhcp"
```
**Static IP private network:**
```ruby
config.vm.network "private_network", ip: "192.168.50.4"
```
**Static IP with manual configuration (no auto-config):**
```ruby
config.vm.network "private_network", ip: "192.168.50.4", auto_config: false
```
**IPv6 private network:**
```ruby
config.vm.network "private_network", ip: "fde4:8dba:82e1::c4"
```
---
### 1.2 Public Network
A public network bridges the VM directly to the physical network, making it appear as a full-fledged device on the LAN, similar to the host.
**Basic public network:**
```ruby
config.vm.network "public_network"
```
**Use DHCP-assigned default route:**
```ruby
config.vm.network "public_network", use_dhcp_assigned_default_route: true
```
**Static IP assignment:**
```ruby
config.vm.network "public_network", ip: "192.168.0.17"
```
**Specify network bridge interface:**
```ruby
config.vm.network "public_network", bridge: "en1: Wi-Fi (AirPort)"
```
**Multiple bridge options:**
```ruby
config.vm.network "public_network", bridge: [
  "en1: Wi-Fi (AirPort)",
  "en6: Broadcom NetXtreme Gigabit Ethernet Controller",
]
```
---
## 2. Port Forwarding
Port forwarding maps ports from the host machine to the guest VM, allowing external access to services running inside the VM.
### 2.1 Basic Port Forwarding
```ruby
config.vm.network "forwarded_port", guest: 80, host: 8080
```
### 2.2 Port Forwarding with Protocol Selection
```ruby
config.vm.network "forwarded_port", guest: 2003, host: 12003, protocol: "tcp"
config.vm.network "forwarded_port", guest: 2003, host: 12003, protocol: "udp"
```
### 2.3 Auto Correcting Port Conflicts
```ruby
config.vm.network "forwarded_port", guest: 80, host: 8080, auto_correct: true
```
### 2.4 Define a Usable Host Port Range
Useful when Vagrant must auto-correct ports within a controlled range.
```ruby
config.vm.usable_port_range = 8000..8999
```

245
Vagrant/Vagrantfile.md Normal file

@@ -0,0 +1,245 @@
# **Vagrant Configuration Reference for DevOps Engineers**
This document provides a practical collection of Vagrant configurations commonly used in DevOps workflows. Each example includes explanations and recommended usage patterns.
---
# **1. Basic Ubuntu VM**
### **Vagrantfile**
```ruby
Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/focal64"
  config.vm.network "private_network", ip: "192.168.56.10"
  config.vm.provider "virtualbox" do |vb|
    vb.memory = "1024"
    vb.cpus = 1
  end
end
```
### **Summary**
* Defines a simple Ubuntu VM.
* Adds private network access.
* Sets CPU and memory limits for VirtualBox.
---
# **2. VM with Shell Provisioning**
### **Vagrantfile**
```ruby
Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/jammy64"
  config.vm.provision "shell", inline: <<-SHELL
    apt update
    apt install -y nginx
    systemctl enable nginx
  SHELL
end
```
### **Summary**
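* Runs an inline shell script during provisioning, which happens on the first `vagrant up` and afterwards only when requested with `--provision` or `vagrant provision`.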
* Installs and enables Nginx.
---
# **3. Multi-Machine Environment (Web + Database)**
### **Vagrantfile**
```ruby
Vagrant.configure("2") do |config|
  config.vm.define "web" do |web|
    web.vm.box = "ubuntu/focal64"
    web.vm.hostname = "web"
    web.vm.network "private_network", ip: "192.168.56.11"
    web.vm.provision "shell", inline: <<-SHELL
      apt update
      apt install -y apache2
    SHELL
  end
  config.vm.define "db" do |db|
    db.vm.box = "ubuntu/focal64"
    db.vm.hostname = "db"
    db.vm.network "private_network", ip: "192.168.56.12"
    db.vm.provision "shell", inline: <<-SHELL
      apt update
      apt install -y mysql-server
    SHELL
  end
end
```
### **Summary**
* Creates two VMs.
* Web VM runs Apache.
* DB VM runs MySQL.
* Local network enables app-to-database communication.
---
# **4. Using Ansible for Provisioning**
### **Vagrantfile**
```ruby
Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/focal64"
  config.vm.provision "ansible" do |ansible|
    ansible.playbook = "playbook.yml"
    ansible.inventory_path = "hosts"
  end
end
```
### **Summary**
* Leverages Ansible for idempotent provisioning.
* Requires a playbook and inventory file.
---
# **5. Docker Provider Example**
### **Vagrantfile**
```ruby
Vagrant.configure("2") do |config|
  config.vm.provider "docker" do |d|
    d.image = "nginx:latest"
    d.remains_running = true
    d.ports = ["8080:80"]
  end
end
```
### **Summary**
* Uses Docker engine instead of a VM.
* Runs Nginx container accessible on host port 8080.
---
# **6. Synced Folder Example**
### **Vagrantfile**
```ruby
Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/focal64"
  config.vm.synced_folder "./app", "/var/www/app", type: "virtualbox"
  config.vm.provision "shell", inline: <<-SHELL
    apt update
    apt install -y nodejs npm
  SHELL
end
```
### **Summary**
* Syncs host source code into VM.
* Suitable for development environments where code updates must sync immediately.
---
# **7. Kubernetes Single-Node Cluster (kubeadm)**
### **Vagrantfile**
```ruby
Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/focal64"
  config.vm.hostname = "k8s-master"
  config.vm.provider "virtualbox" do |vb|
    vb.cpus = 2
    vb.memory = 4096
  end
  config.vm.provision "shell", path: "install-k8s.sh"
end
```
### **Summary**
* Boots a VM with enough resources for Kubernetes.
* Runs external script containing Kubernetes installation steps.
---
# **8. Port Forwarding with Multiple Shell Provisioners and VirtualBox Limits**
### **Vagrantfile**
```ruby
Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/focal64"
  config.vm.hostname = "webserver"
  # Port forwarding (host 8080 → guest 80)
  config.vm.network "forwarded_port", guest: 80, host: 8080
  # VirtualBox hardware limits
  config.vm.provider "virtualbox" do |vb|
    vb.memory = "2048"
    vb.cpus = 2
  end
  # Provisioner 1: package install
  config.vm.provision "shell", inline: <<-SHELL
    apt update
    apt install -y nginx curl
  SHELL
  # Provisioner 2: configuration and service restart
  config.vm.provision "shell", inline: <<-SHELL
    echo "Hello from Vagrant" > /var/www/html/index.html
    systemctl restart nginx
  SHELL
end
```
### **Summary**
* Demonstrates forwarded ports for local development.
* Runs two shell provisioners in sequence.
* Applies VirtualBox memory and CPU constraints.
---
# **Additional Recommendations**
### **General Best Practices**
* Use `config.ssh.insert_key = false` for shared team environments.
* Install recommended plugins:
* vagrant-vbguest
* vagrant-hostmanager
* Pin specific box versions for reproducibility.
* Store provisioning scripts in separate files for maintainability.
### **Resource Allocation Guidelines**
* Web servers: 1-2 GB RAM, 1-2 CPUs
* Kubernetes nodes: 4 GB+ RAM, 2 CPUs+
* Database nodes: 2-4 GB RAM minimum


@@ -0,0 +1,239 @@
# OpenSSL Command Reference for Self-Signed Certificate Generation
This document explains the OpenSSL command-line tool and provides a structured, DevOps-friendly guide for generating self-signed SSL/TLS certificates. These certificates are commonly used for internal services, development environments, testing, or private infrastructure components such as Gitea, Jenkins, internal APIs, or Kubernetes ingress controllers.
---
## 1. What is OpenSSL?
**OpenSSL** is a widely used, open-source cryptographic toolkit that implements the SSL and TLS protocols. It provides utilities for:
* Generating private and public key pairs
* Creating Certificate Signing Requests (CSRs)
* Issuing and managing X.509 certificates
* Encrypting and decrypting data
* Inspecting and troubleshooting TLS connections
OpenSSL is available on most Linux distributions by default and is commonly used in DevOps and SRE workflows.
Check installed version:
```bash
openssl version
```
---
## 2. When to Use Self-Signed Certificates
Self-signed certificates are suitable for:
* Internal services
* Development and staging environments
* Private networks
* Lab and proof-of-concept setups
They are **not recommended for public-facing production systems**, because they are not trusted by browsers or external clients without manual trust configuration.
---
## 3. Generating a Self-Signed Certificate (Example: Gitea)
This process consists of three logical steps:
1. Generate a private key
2. Create a Certificate Signing Request (CSR)
3. Sign the CSR using the same key to produce a self-signed certificate
---
## 4. Step 1: Generate a Private Key
Generate a 2048-bit RSA private key:
```bash
openssl genrsa -out gitea.key 2048
```
### Explanation
| Component | Description |
| ---------------- | -------------------------------------------------- |
| `genrsa` | Generates an RSA private key |
| `-out gitea.key` | Output file for the private key |
| `2048` | Key size in bits (2048 is the recommended minimum) |
Security note:
* The private key must be kept secret
* Restrict file permissions:
```bash
chmod 600 gitea.key
```
---
## 5. Step 2: Create a Certificate Signing Request (CSR)
Generate a CSR using the private key:
```bash
openssl req -new -key gitea.key -out gitea.csr
```
### Explanation
| Component | Description |
| ---------------- | ---------------------------------------- |
| `req` | Certificate request management command |
| `-new` | Creates a new CSR |
| `-key gitea.key` | Private key used to generate the request |
| `-out gitea.csr` | Output file for the CSR |
### Interactive Fields
During execution, OpenSSL prompts for certificate metadata:
* Country Name (C)
* State or Province (ST)
* Locality (L)
* Organization (O)
* Organizational Unit (OU)
* **Common Name (CN)**
Important:
* The **Common Name (CN)** should match the service hostname or domain name
Example:
```
gitea.example.com
```
For automation or CI/CD pipelines, this step can be made non-interactive (see section 8).
---
## 6. Step 3: Create the Self-Signed Certificate
Sign the CSR using the same private key:
```bash
openssl x509 -req -in gitea.csr -signkey gitea.key -out gitea.crt
```
### Explanation
| Component | Description |
| -------------------- | ------------------------------------------ |
| `x509` | X.509 certificate management |
| `-req` | Indicates the input is a CSR |
| `-in gitea.csr` | Input CSR file |
| `-signkey gitea.key` | Signs the certificate with the private key |
| `-out gitea.crt` | Output certificate file |
Default behavior:
* Certificate is valid immediately
* Validity is typically 30 days unless specified
To set validity explicitly (example: 365 days):
```bash
openssl x509 -req -days 365 -in gitea.csr -signkey gitea.key -out gitea.crt
```
---
## 7. Verify the Certificate
Inspect certificate details:
```bash
openssl x509 -in gitea.crt -text -noout
```
Verify key and certificate match:
```bash
openssl rsa -noout -modulus -in gitea.key | openssl md5
openssl x509 -noout -modulus -in gitea.crt | openssl md5
```
Matching hashes confirm correctness.
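For scripted use, the same comparison can be wrapped in a function that returns a status code instead of printing hashes. This is a sketch under assumptions: the name `cert_matches_key` is hypothetical, it compares the raw modulus strings rather than their MD5 digests, and like the commands above it only applies to RSA keys.

```bash
# Hypothetical helper: succeed (exit 0) only if an RSA private key and a
# certificate share the same modulus.
# $1 = private key file, $2 = certificate file.
# Returns 2 if either file cannot be read by openssl.
cert_matches_key() {
  local key="$1"
  local crt="$2"
  local key_mod crt_mod

  # Capture each modulus separately so a read failure is not masked.
  key_mod="$(openssl rsa -noout -modulus -in "$key" 2>/dev/null)" || return 2
  crt_mod="$(openssl x509 -noout -modulus -in "$crt" 2>/dev/null)" || return 2

  [ -n "$key_mod" ] && [ "$key_mod" = "$crt_mod" ]
}
```

Usage: `cert_matches_key gitea.key gitea.crt && echo "key and certificate match"`.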
---
## 8. Non-Interactive Certificate Generation (Recommended for Automation)
Generate key and certificate in one command:
```bash
openssl req -x509 -newkey rsa:2048 \
-keyout gitea.key \
-out gitea.crt \
-days 365 \
-nodes \
-subj "/C=US/ST=State/L=City/O=Company/OU=DevOps/CN=gitea.example.com"
```
Options explained:
* `-x509`: Generate a self-signed certificate
* `-newkey`: Create a new private key
* `-nodes`: Do not encrypt the private key, so a service can load it without a passphrase prompt (OpenSSL 3.0 and later also accept `-noenc`)
* `-subj`: Certificate subject (non-interactive)
This approach is ideal for:
* CI/CD pipelines
* Docker images
* Infrastructure as Code workflows
---
## 9. Subject Alternative Name (SAN) Consideration
Modern TLS clients validate the hostname against the **Subject Alternative Name (SAN)** extension rather than the CN.
Example OpenSSL config snippet:
```
subjectAltName = DNS:gitea.example.com,DNS:gitea.internal
```
Without SAN:
* Browsers may reject the certificate
* TLS warnings may appear even if CN matches
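One way to include a SAN without a separate config file is the `-addext` option of `openssl req`. The sketch below assumes OpenSSL 1.1.1 or newer, where that option is available; the function name `make_san_cert` and its output file naming are hypothetical, and it sets only the CN and a single DNS entry.

```bash
# Hypothetical helper: create <host>.key and <host>.crt with a SAN entry.
# $1 = hostname, $2 = output directory (defaults to the current directory).
# Assumes OpenSSL 1.1.1+ for the -addext option.
make_san_cert() {
  local host="$1"
  local out_dir="${2:-.}"

  openssl req -x509 -newkey rsa:2048 \
    -keyout "$out_dir/$host.key" \
    -out "$out_dir/$host.crt" \
    -days 365 \
    -nodes \
    -subj "/CN=$host" \
    -addext "subjectAltName=DNS:$host"
}
```

After running `make_san_cert gitea.example.com`, the SAN can be confirmed with `openssl x509 -in gitea.example.com.crt -noout -text`, which lists it under the X509v3 extensions.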
---
## 10. Summary of Generated Files
| File | Description |
| ----------- | ---------------------------------------------------- |
| `gitea.key` | Private key (keep secure, server-side only) |
| `gitea.csr` | Certificate Signing Request (optional after signing) |
| `gitea.crt` | Self-signed certificate presented to clients |
---
## 11. Best Practices
* Never commit private keys to Git
* Store secrets securely (Vault, SSM, Kubernetes secrets)
* Rotate certificates regularly
* Use Let's Encrypt or an internal CA for production
* Prefer SAN over CN
* Automate certificate generation where possible