Requirement: We wanted to send logs to Splunk (a LOT of logs) and to do it with a Docker container running Fluent Bit.
*For a more detailed discussion of why to use Fluent Bit vs. Fluentd, you can check out https://logz.io/blog/fluentd-vs-fluent-bit/.
However, for this scenario Fluent Bit was the better choice.
Deliverable: Each VM has 5-20 Docker containers on it, and one of them will be the fluent-bit container. It will send the logs from all the required containers, not exceeding 500 GB a day, a quota shared among the containers. Example: if there are 10 containers, each can send 50 GB a day; if there are 15 containers, each can send 33.3 GB a day.
If a container is stopped, we need to know at what point the stop happened (this is handled through the Fluent Bit configuration).
Architecture of the System:
The Dockerfile builds a container that monitors the logs on a VM running between 1 and 20 containers (there is no hard limit; it can be 100 containers or more).
Since a Docker container terminates as soon as its main process (PID 1) is killed by a signal such as SIGHUP (or most of signals 1-15), we had to run supervisord as PID 1 and encapsulate the fluent-bit process within it; the scripts can then signal fluent-bit without killing the container.
Supervisord:
https://gdevillele.github.io/engine/admin/using_supervisord/
http://supervisord.org/
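The original supervisord.conf is not reproduced here; below is a minimal sketch of the idea. The fluent-bit binary and config paths are assumptions, not taken from the original build:

    [supervisord]
    nodaemon=true                 ; stay in the foreground so supervisord is PID 1
    logfile=/dev/null             ; send supervisord's own log to the container stdout
    logfile_maxbytes=0

    [program:fluent-bit]
    ; binary and config paths below are assumed locations for this sketch
    command=/opt/fluent-bit/bin/fluent-bit -c /etc/fluent-bit/fluent-bit.conf
    autostart=true
    autorestart=true              ; if fluent-bit exits (e.g. after a HUP), bring it back
    stdout_logfile=/dev/fd/1
    stdout_logfile_maxbytes=0

With autorestart=true, the killall -HUP fluent-bit calls in the scripts below either make fluent-bit reload its configuration or make supervisord restart it with the updated file; either way the container itself stays up.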
This is the structure of the container build:
• Dockerfile – the file from which the container is built
• config.sh – main configuration file for the monitor-log.sh script
• fluent-bit.conf – initial Fluent Bit configuration (see the sketch after this list)
• inputs.conf – the file which defines the [INPUT] sections for all the containers
• metadata.lua – Lua script used for filtering
• monitor-log.sh – main script, which runs every 10 minutes
• oneopsscript.sh – script that strips the logs down to what is needed
• supervisord.conf – the supervisord daemon config
• uncomment-fluent-config.sh – script that runs at 00:01 every day and resets the count for the next day
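A sketch of what the initial fluent-bit.conf can look like. The [OUTPUT] section uses Fluent Bit's standard splunk output plugin; the HEC host, port, and token below are placeholders, not values from the original setup:

    [SERVICE]
        Flush     5
        Log_Level info

    # Ship everything tagged docker.<container_id> (the tags produced by the
    # [INPUT] sections that monitor-log.sh appends) to the Splunk HEC endpoint.
    [OUTPUT]
        name         splunk
        match        docker.*
        host         splunk-hec.example.com
        port         8088
        splunk_token 00000000-0000-0000-0000-000000000000
        tls          on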
Running the container:
To run the monitoring container you need to do the following:
1. # docker build -t fbit .
The parts of this command:
The docker build command builds a Docker image from a Dockerfile and a “context”. A build’s context is the set of files located in the specified PATH (here “.”, the current directory).
-t: name and optionally a tag, in the name:tag format (here the image is named fbit).
2. # docker run -v /data/weiotadm/docker/lib/containers/:/data/we/docker/lib/containers/ -v /data/:/home/prom/ --name=splunk_fbit122 fbit
This runs the container you just built in step 1, mounting the host directories into it.
-v or --volume: Consists of three fields, separated by colon characters (:). The fields must be in the correct order, and the meaning of each field is not immediately obvious.
• In the case of bind mounts, the first field is the path to the file or directory on the host machine.
• The second field is the path where the file or directory is mounted in the container
• The third field is optional, and is a comma-separated list of options, such as ro, z, and Z
Note: If you use -v or --volume to bind-mount a file or directory that does not yet exist on the Docker host, -v creates the endpoint for you. It is always created as a directory.
If you use --mount to bind-mount a file or directory that does not yet exist on the Docker host, Docker does not automatically create it for you, but generates an error.
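For comparison, the same run expressed with --mount instead of -v (equivalent bind mounts, same paths):

    docker run \
      --mount type=bind,source=/data/weiotadm/docker/lib/containers/,target=/data/we/docker/lib/containers/ \
      --mount type=bind,source=/data/,target=/home/prom/ \
      --name=splunk_fbit122 fbit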
The Dockerfile explained:
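The original Dockerfile is not reproduced here; the sketch below captures the idea, assuming a Debian-based image (the fluent-bit package installation itself is elided):

    # Sketch only - base image and paths are assumptions, not the original build.
    FROM debian:bullseye-slim

    # supervisor supervises fluent-bit, cron drives the two scheduled scripts,
    # psmisc provides the killall they use. (fluent-bit installation elided;
    # see docs.fluentbit.io for the official package repositories.)
    RUN apt-get update && apt-get install -y --no-install-recommends \
            supervisor cron psmisc && rm -rf /var/lib/apt/lists/*

    # Copy in the build files listed above.
    COPY fluent-bit.conf inputs.conf metadata.lua /etc/fluent-bit/
    COPY config.sh monitor-log.sh oneopsscript.sh uncomment-fluent-config.sh /opt/monitor/
    COPY supervisord.conf /etc/supervisor/conf.d/supervisord.conf

    # supervisord runs in the foreground as PID 1 (see supervisord.conf above).
    CMD ["/usr/bin/supervisord", "-n"]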
config.sh explained:
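The original config.sh is not shown, but from how monitor-log.sh uses it, it must define at least the variables below. The paths and values here are illustrative (log_dir matches the mount target from the docker run command above):

    #!/bin/bash
    # Directory holding one folder per container (Docker's containers dir,
    # as mounted into this container by the docker run command above).
    log_dir=/data/we/docker/lib/containers
    # Fluent-bit config file that monitor-log.sh appends [INPUT] sections to
    # (illustrative path).
    fluent_config_file=/etc/fluent-bit/inputs.conf
    # Default per-container daily quota, in MB.
    default_max_file_size=50000
    # egrep pattern selecting which log lines count toward the quota.
    log_type_pattern="stdout|stderr"
    # Optional per-container override (in MB): a variable named after the
    # container, looked up by monitor-log.sh via ${!container_name}.
    # Hypothetical example:
    # myapp=100000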
This is the meat and potatoes of the system, the monitor-log.sh script:
#!/bin/bash
# Runs every 10 minutes to monitor per-container log size.
## cat config.v2.json | grep -o '"Name":"[^"]*' | grep -o '[^"]*$'

# Load configuration (log_dir, fluent_config_file, default_max_file_size, log_type_pattern).
source ./config.sh

echo "Monitoring Log Size"
count=0
for entry in "$log_dir"/*; do
    if [ -d "$entry" ]; then
        count=$((count+1))
        container_folder_name=$(basename "$entry")
        main_log_file_name=$container_folder_name-json.log
        main_log_file_path=$entry/$main_log_file_name
        if [ ! -f "$main_log_file_path" ]; then
            continue
        fi
        # Does an [INPUT] section (commented out or not) already reference this container?
        check_config="$(grep -wn "\(^\s.*\|^\|^\#.*\|^\s.*\#.*\)Path.*$container_folder_name" "$fluent_config_file" | cut -d: -f1)"
        echo "$check_config"
        if [ -z "$check_config" ]; then
            ## Add an [INPUT] section if one does not exist yet.
            echo "
[INPUT]
    name              tail
    Path              $entry/*.log
    Parser            json
    Skip_Empty_Lines  true
    Tag_Regex         (.*\/(?<container_id>.*)-json\.log)
    Tag               docker.<container_id>
    Docker_Mode       true
    Read_from_Head    true
    Mem_Buf_Limit     5000MB
    Buffer_Chunk_Size 250k
    Buffer_Max_Size   500k
    Refresh_Interval  10" >> "$fluent_config_file"
        fi
        ## If today's log volume exceeds the quota, comment out this container's config.
        # Line numbers of every section header ([INPUT], [FILTER], ... or @INCLUDE).
        tag_lines="$(grep -wn "^\s*.*\[\([A-Z]\)*\]\|^\s*\@[A-Z].*" "$fluent_config_file" | cut -d: -f1)"
        # Line number of the (active) Path entry for this container.
        log_config_line="$(grep -wn "\(^\s.*\|^\)Path.*$container_folder_name" "$fluent_config_file" | cut -d: -f1)"
        echo "tag_lines=" $tag_lines
        if [ -n "$log_config_line" ]; then
            echo "Log config line:" "$log_config_line"
            today=$(date +"%Y-%m-%d")
            ## Get the container name and its max size.
            config_json_file=$entry/config.v2.json
            container_name=$(grep -o '"Name":"[^"]*' "$config_json_file" | grep -o '[^"/]*$')
            # config.sh may define a per-container quota in a variable named after the container.
            if [ -n "${!container_name}" ]; then
                max_file_size_byte=$((${!container_name}*1024*1024))
            else
                ## Fall back to the default max size.
                max_file_size_byte=$((default_max_file_size*1024*1024))
            fi
            echo "max_file_size_byte=" $max_file_size_byte
            ## Calculate today's log size (grep/cat variants kept for reference).
            ##file_size=$(grep "${today}" "$main_log_file_path" | egrep "${log_type_pattern}" | wc -c)
            file_size=$(sed -n "/${today}/p" "$main_log_file_path" | egrep "${log_type_pattern}" | wc -c)
            ##file_size=$(cat "$main_log_file_path" | grep "${today}" | egrep "${log_type_pattern}" | wc -c)
            # Optionally persist today's size into a file.
            date_format=$(date +"%Y%m%d")
            #echo ${file_size} > "$entry/size${date_format}.txt"
            echo "log size of container $container_name = $file_size bytes, max = $max_file_size_byte"
            if [ "$file_size" -lt "$max_file_size_byte" ]; then
                continue
            fi
            # Find start_line & end_line of the [INPUT] section to comment out.
            start_line=0
            end_line=0
            for input_line in $tag_lines; do
                if [ "$log_config_line" -gt "$input_line" ]; then
                    echo "[INPUT] start at line:" $input_line
                    start_line=$input_line
                    continue
                else
                    # First header after the Path line: the section ends just before it.
                    end_line=$((input_line-1))
                    break
                fi
            done
            if [[ $start_line -gt 0 && $end_line -eq 0 ]]; then
                # The section is the last one in the file.
                end_line=$(wc -l "$fluent_config_file" | cut -d' ' -f1)
            fi
            if [[ $start_line -gt 0 && $end_line -gt 0 ]]; then
                echo "Comment from: $start_line to $end_line"
                # Prefix each line of the section with '#', then signal fluent-bit.
                sed -i "${start_line},${end_line}s/^/#/" "$fluent_config_file"
                killall -HUP fluent-bit
            fi
        fi
    fi
done
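For illustration, once a container exceeds its quota, its section in the config file ends up commented out like this:

    #[INPUT]
    #    name              tail
    #    Path              /data/we/docker/lib/containers/<container_id>/*.log
    #    Parser            json
    #    ...

The HUP sent afterwards makes fluent-bit pick up the change, so that container stops shipping logs until the reset script re-enables it after midnight.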
This is the uncomment-fluent-config.sh script:
#!/bin/bash
# This script should run at 00:01 every day.
# It uncomments the fluent-bit [INPUT] sections that monitor-log.sh
# commented out, resetting every container's quota for the new day.

# Load configuration.
source ./config.sh

# Line numbers of every section header ([INPUT], [FILTER], ... or @INCLUDE),
# including commented-out ones.
tag_lines="$(grep -wn "^\s*.*\[\([A-Z]\)*\]\|^\s*\@[A-Z].*" "$fluent_config_file" | cut -d: -f1)"
echo "tag_lines=" $tag_lines
count=0
for entry in "$log_dir"/*; do
    if [ -d "$entry" ]; then
        count=$((count+1))
        container_folder_name=$(basename "$entry")
        main_log_file_name=$container_folder_name-json.log
        main_log_file_path=$entry/$main_log_file_name
        # Line number of the commented-out Path entry for this container.
        log_config_line="$(grep -wn "\(^\#.*\|^\s.*\#.*\)Path.*$container_folder_name" "$fluent_config_file" | cut -d: -f1)"
        # start_line & end_line of the section to uncomment.
        start_line=0
        end_line=0
        if [ -z "$log_config_line" ]; then
            continue
        fi
        echo "log_config_line=" "$log_config_line"
        for tag_line in $tag_lines; do
            if [ "$log_config_line" -gt "$tag_line" ]; then
                echo "[INPUT] start at line:" $tag_line
                start_line=$tag_line
                continue
            else
                end_line=$((tag_line-1))
                break
            fi
        done
        if [[ $start_line -gt 0 && $end_line -eq 0 ]]; then
            end_line=$(wc -l "$fluent_config_file" | cut -d' ' -f1)
        fi
        if [[ $start_line -gt 0 && $end_line -gt 0 ]]; then
            echo "uncomment from: $start_line to $end_line"
            # Strip the leading '#' added by monitor-log.sh, then signal fluent-bit.
            sed -i "${start_line},${end_line}s/^#//" "$fluent_config_file"
            killall -HUP fluent-bit
        fi
    fi
done
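Finally, a sketch of the crontab that ties the two scripts together (the /opt/monitor location is an assumption carried over from the Dockerfile sketch above):

    # Check per-container log volume every 10 minutes.
    */10 * * * * cd /opt/monitor && ./monitor-log.sh >> /var/log/monitor-log.log 2>&1
    # Reset the quotas for the new day at 00:01.
    1 0 * * * cd /opt/monitor && ./uncomment-fluent-config.sh >> /var/log/monitor-log.log 2>&1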