Empowering DevOps Excellence: Research and Assessment for Continuous Improvement


DevOps has transformed the software development landscape, enabling organizations to deliver high-quality applications at a faster pace. To achieve DevOps excellence, it is essential to continuously assess and improve your DevOps practices. Research and assessment play a crucial role in understanding your current state, identifying areas for improvement, and implementing effective strategies. In this blog post, we’ll explore the importance of DevOps research and assessment and how it can empower organizations to drive continuous improvement in their DevOps journey.

The Need for DevOps Research

DevOps research provides valuable insights into industry trends, best practices, and success stories. It helps organizations understand the principles, methodologies, and tools that are driving successful DevOps implementations. By staying up-to-date with the latest research, you can learn from others’ experiences, avoid common pitfalls, and adopt proven practices that align with your organizational goals.

Assessing Current DevOps Practices

Conducting a thorough assessment of your current DevOps practices is a crucial step in understanding your strengths, weaknesses, and areas for improvement. An assessment involves evaluating various aspects such as culture, collaboration, automation, release management, monitoring, and feedback loops. It helps identify bottlenecks, inefficiencies, and gaps in your DevOps processes and enables you to set clear improvement goals.

Several assessment frameworks exist to help organizations evaluate their DevOps maturity and identify improvement areas. Some popular frameworks include the DevOps Capability Assessment (DOCA), DevOps Maturity Model (DMM), and the DevOps Assessment Toolkit (DATK). These frameworks provide a structured approach to assess different dimensions of DevOps, measure performance, and benchmark against industry standards.

Key Assessment Areas

During a DevOps assessment, it is important to focus on key areas that contribute to successful DevOps implementation. These may include:

  1. Culture and Collaboration: Assessing the cultural aspects of collaboration, trust, and shared responsibilities within teams.
  2. Automation and Infrastructure: Evaluating the degree of automation in build, test, deployment, and infrastructure provisioning processes.
  3. Continuous Integration and Delivery: Assessing the maturity of CI/CD pipelines and their effectiveness in achieving fast, reliable, and repeatable deployments.
  4. Monitoring and Feedback: Evaluating the monitoring and feedback mechanisms in place to enable proactive issue detection and rapid feedback loops.
  5. Security and Compliance: Assessing the integration of security and compliance practices throughout the DevOps lifecycle.

DevOps assessment relies on collecting accurate data and feedback from various stakeholders, including development teams, operations teams, and business units. This can be done through surveys, interviews, and observations. It is important to encourage open and honest feedback to gain a comprehensive understanding of the current state and identify improvement opportunities.

Implementing Improvement Strategies

Based on the assessment findings, organizations can develop a roadmap for improving their DevOps practices. This may involve implementing changes in processes, tools, and cultural aspects. Prioritize improvement areas based on their impact and feasibility. DevOps research can guide you in selecting best practices and proven strategies to address identified gaps and drive continuous improvement.

DevOps research and assessment are not one-time activities but ongoing processes. Continuously monitor and measure the effectiveness of implemented improvements. Gather feedback from teams and stakeholders to understand the impact of changes and make necessary adjustments. DevOps is a journey of continuous learning and refinement, and research and assessment play a vital role in this iterative process.

DevOps research and assessment provide organizations with valuable insights into industry trends, best practices, and improvement opportunities. By conducting thorough assessments, organizations can identify their strengths and weaknesses, set improvement goals, and implement effective strategies. Continuous monitoring and learning ensure that DevOps practices evolve and adapt to meet the changing needs of the organization. Embrace the power of research and assessment to empower your DevOps journey and achieve continuous improvement and excellence.

Check out our other DevOps articles here.

A Comprehensive Guide to Demystifying Kubernetes Networking Configuration


Kubernetes has become the de facto standard for container orchestration, enabling the seamless deployment and scaling of applications. However, understanding and configuring networking in a Kubernetes cluster can be complex, especially for newcomers. We’ll delve into the intricacies of Kubernetes networking and provide a comprehensive guide to help you navigate through the various options and configurations.

In a K8s cluster, networking plays a vital role in facilitating communication between pods, services, and external clients. Each pod in Kubernetes gets its own IP address, allowing containers within the pod to communicate with each other over the loopback interface. However, pods are ephemeral, and their IP addresses change as pods are rescheduled. This is where Kubernetes networking abstractions, such as Services, come into play.

Pod-to-Pod Communication

To enable communication between pods in the cluster, Kubernetes implements a flat networking model. Pods can communicate directly with each other using their IP addresses, regardless of the node they are running on. The Container Network Interface (CNI) plugin is responsible for managing pod networking and assigning IP addresses to pods. Popular CNI plugins include Calico, Flannel, and Weave.
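You can see the addresses the CNI plugin has assigned with a quick kubectl query. This is a hedged sketch; the pod names, IPs, and nodes shown here are illustrative and will differ in your cluster:

kubectl get pods -o wide
NAME    READY   STATUS    RESTARTS   AGE   IP           NODE
web-1   1/1     Running   0          2m    10.42.0.12   node-a
web-2   1/1     Running   0          2m    10.42.1.7    node-b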

Kubernetes Services provide a stable endpoint for accessing pods. Services abstract the underlying pod IP addresses, allowing clients to access pods through a consistent DNS name or IP address. Services support different types of load balancing, such as round-robin or session affinity, to distribute traffic among the pods behind the service. Kubernetes automatically manages the load balancing configuration based on the service type and endpoints.
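As an illustration, here is a minimal Service manifest; it assumes a hypothetical set of pods labeled app: web listening on port 8080:

apiVersion: v1
kind: Service
metadata:
  name: web-svc
spec:
  selector:
    app: web            # route traffic to pods carrying this label
  ports:
    - port: 80          # port exposed by the Service
      targetPort: 8080  # port the containers actually listen on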

Ingress and External Connectivity

Ingress is a Kubernetes resource that provides external connectivity to services within the cluster. It acts as an entry point for incoming traffic and allows for the routing and load balancing of requests to different services based on specific rules. To enable Ingress functionality, an Ingress Controller is required, which can be implemented using various solutions such as Nginx Ingress Controller, Traefik, or Istio.
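For example, a minimal Ingress might route all traffic for a hypothetical hostname to the web-svc Service sketched above; this assumes an Ingress Controller is already installed in the cluster:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
spec:
  rules:
    - host: app.example.com        # hypothetical hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web-svc      # Service receiving the traffic
                port:
                  number: 80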

Network Policies allow you to define fine-grained rules to control traffic flow within the cluster. They act as a firewall for your Kubernetes network, allowing or denying traffic based on specific criteria such as pod labels, namespaces, or IP ranges. By leveraging policies, you can enforce security and isolation between different components of your application and ensure that only authorized communication is allowed.
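As a sketch, the following policy allows only pods labeled app: web to reach a hypothetical database pod on its PostgreSQL port; all other ingress to those pods is denied:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-web-to-db
spec:
  podSelector:
    matchLabels:
      app: db            # the pods this policy protects
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: web   # only these pods may connect
      ports:
        - protocol: TCP
          port: 5432     # PostgreSQL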

Networking Plugins and Configuration

Kubernetes network plugins, such as Calico, Flannel, or Weave, provide the underlying infrastructure for pod communication. These plugins integrate with the CNI interface and handle IP address management, routing, and network policy enforcement. Choosing the right plugin depends on factors such as scalability requirements, performance, and compatibility with your cloud provider or on-premises infrastructure.

In some cases, you may require custom networking configurations to meet specific requirements. Kubernetes allows for advanced networking features, such as network overlays, multi-cluster networking, or integrating with external services. These custom configurations often involve working with additional tools and technologies like Virtual Extensible LAN (VXLAN), Border Gateway Protocol (BGP), or Service Mesh solutions like Istio.

Understanding and configuring networking in Kubernetes is crucial for building scalable, resilient, and secure applications. By grasping the basics of pod-to-pod communication, service discovery, load balancing, ingress, network policies, and networking plugins, you can effectively design and manage your Kubernetes networking infrastructure. As you gain expertise, exploring custom networking configurations can provide additional flexibility and enable advanced networking capabilities. With this comprehensive guide, you’re equipped to navigate the intricacies of Kubernetes networking and create robust, production-quality networking solutions for your applications.

Take a look at the other articles here.

Streamline Application Deployment with Helm Charts on Kubernetes


Managing and deploying complex applications on Kubernetes can be a challenging task. Fortunately, Helm charts come to the rescue. Helm is a package manager for Kubernetes that allows you to define, install, and manage applications as reusable packages called charts. In this blog post, we’ll explore the concept of Helm charts, their benefits, and how they simplify the deployment and management of applications on Kubernetes.

Understanding Helm

Helm is a tool that streamlines the installation and management of applications on Kubernetes. It introduces the concept of charts, which are packages containing all the resources required to run an application on Kubernetes. A Helm chart typically includes Kubernetes manifests, such as deployments, services, and config maps, along with customizable templates and optional values files.

One of the key advantages of Helm charts is their reusability and modularity. Charts allow you to package applications and their dependencies into a single, versioned unit. This makes it easy to share and distribute applications across different environments and teams. Charts can also be extended or customized using values files, enabling you to adapt the application configuration to specific deployment scenarios.

Using Helm, the deployment process becomes straightforward and repeatable. You can install a chart with a single command, specifying the chart name and values file, if needed. Helm takes care of creating all the required Kubernetes resources, such as pods, services, and ingresses, based on the chart’s configuration. This simplifies the deployment process and reduces the chances of configuration errors.
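For instance, installing a chart from the community Bitnami repository takes two commands; the release name and values file here are hypothetical:

helm repo add bitnami https://charts.bitnami.com/bitnami
helm install my-web bitnami/nginx -f custom-values.yaml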

Ability to Version and Rollback

Helm provides versioning and rollback capabilities, allowing you to manage application releases effectively. Each installed chart version is tracked, enabling you to roll back to a previous version if issues arise. This ensures that you can easily manage updates and deployments, maintaining the stability and reliability of your applications.
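A brief sketch of what that looks like in practice, assuming a release named my-web:

helm history my-web       # list the tracked revisions of the release
helm rollback my-web 1    # roll back to revision 1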

Helm benefits from a vibrant and active community, which has contributed a wide range of pre-built charts for popular applications. An internet search or a search on GitHub will turn up charts for various applications, services, and tools, or you can browse Artifact Hub. Leveraging these charts saves time and effort, as they are thoroughly tested and provide best-practice configurations.

Helm Charts Templating

Helm introduces a powerful templating engine that allows you to generate Kubernetes manifests dynamically. It uses Go templates, enabling you to define reusable templates for Kubernetes resources. Templates can include conditional logic, loops, and variable substitution, providing flexibility and configurability for your deployments. This templating mechanism makes Helm charts highly customizable and adaptable to different deployment scenarios.
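As an illustration, here is a typical excerpt from a chart’s templates/deployment.yaml; the value names follow common chart conventions and are assumptions here, not part of any specific chart:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-web
spec:
  replicas: {{ .Values.replicaCount }}    # taken from values.yaml
  template:
    spec:
      containers:
        - name: web
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"

A value can also be overridden at install time with a flag such as --set replicaCount=3.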

With Helm, managing updates for deployed applications becomes seamless. Helm charts can be easily updated by running a single command, specifying the new chart version or values file. Helm automatically handles the upgrade process, ensuring that only the necessary changes are applied to the Kubernetes resources. This simplifies the management of application updates and reduces downtime.
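For example, upgrading the hypothetical my-web release and then checking its state:

helm upgrade my-web bitnami/nginx -f custom-values.yaml
helm status my-web        # verify the new revision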

Helm charts provide a powerful mechanism for packaging, deploying, and managing applications on Kubernetes. With their reusability, modularity, simplified deployment process, versioning, and templating capabilities, Helm charts streamline the application lifecycle and promote best practices in Kubernetes deployments. By leveraging the Helm community’s chart repository and actively contributing to the Helm ecosystem, you can unlock the full potential of Helm and accelerate your application deployments on Kubernetes.

See our other articles here.

A Step-by-step Guide for Creating an EKS Cluster


AWS Elastic Kubernetes Service (EKS) simplifies the management and operation of Kubernetes clusters on the Amazon Web Services (AWS) platform. With EKS, you can leverage the power of container orchestration while benefiting from the scalability, availability, and security features offered by AWS. In this blog post, we will walk you through the step-by-step process of creating an EKS cluster, allowing you to harness the full potential of Kubernetes on AWS.

Prerequisites and Setup

Before creating a K8s cluster, ensure you have the necessary prerequisites in place. These include an AWS account, the AWS CLI installed and configured, and kubectl installed. Additionally, make sure you have the appropriate IAM permissions to create EKS clusters.

Create an Amazon VPC

To provide networking capabilities for your EKS cluster, you need to create an Amazon Virtual Private Cloud (VPC). The VPC acts as an isolated virtual network where your cluster will reside. Use the AWS Management Console or the AWS CLI to create a VPC, ensuring it meets your specific requirements, such as IP address range and subnets.
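As a minimal AWS CLI sketch (the CIDR block is an example; a production EKS VPC also needs subnets, route tables, and tags, which is why many users simply let eksctl create the VPC for them):

aws ec2 create-vpc --cidr-block 10.0.0.0/16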

Set up the IAM Role and Policies

EKS requires an IAM role to manage the cluster resources and interact with other AWS services. Create an IAM role with the necessary policies to grant permissions for EKS cluster creation and management. The role should include policies for EKS, EC2, and any other AWS services your applications will interact with. Attach the role to the EC2 instances that will serve as worker nodes in your cluster.
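For example, attaching the managed EKS cluster policy to a role; this assumes a role named eksClusterRole already exists with the proper trust relationship:

aws iam attach-role-policy \
    --role-name eksClusterRole \
    --policy-arn arn:aws:iam::aws:policy/AmazonEKSClusterPolicy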

Install and Configure eksctl

eksctl is a command-line tool that simplifies the creation and management of EKS clusters. Install eksctl on your local machine by following this link: https://github.com/weaveworks/eksctl/blob/main/README.md#installation. Before running eksctl, you will need to run aws configure and provide your AWS credentials, region, and other relevant information. This creates two files, ~/.aws/config and ~/.aws/credentials, which are required for eksctl and for any operations using the aws CLI.
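A typical aws configure session looks like this; the key values shown are placeholders:

aws configure
AWS Access Key ID [None]: AKIAXXXXXXXXXXXXXXXX
AWS Secret Access Key [None]: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Default region name [None]: us-east-1
Default output format [None]: json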

Create the EKS Cluster

With eksctl installed, you can now create your cluster. Use the eksctl create cluster command, specifying the desired cluster name, region, VPC, and worker node configuration. You can customize various aspects of your cluster, such as the Kubernetes version, instance types, and autoscaling options. The cluster creation process may take up to 10 minutes as EKS provisions the necessary resources and sets up the control plane.

eksctl will handle the cluster creation process, making it straightforward and efficient. The following simple example will create an EKS cluster and update the ~/.kube/config file required by kubectl. This is the simplest form of the command; eksctl has many options depending on what you need when setting up or tearing down a cluster.

eksctl create cluster --name app1_dev --region us-east-1 --fargate
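If you want managed EC2 worker nodes instead of Fargate, a more explicit variant looks like the following sketch; the node group name and instance sizes are illustrative choices, not requirements:

eksctl create cluster --name app1_dev --region us-east-1 \
    --version 1.25 --nodegroup-name ng-1 --node-type t3.medium \
    --nodes 2 --nodes-min 1 --nodes-max 4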

Managing EKS Clusters

eksctl automatically configures ~/.kube/config which contains the necessary credentials and cluster information. Once the cluster creation is complete, verify its status using kubectl. Run kubectl get nodes to ensure that your worker nodes are registered and ready. You should see the list of worker nodes and their status. This confirms that your EKS cluster is up and running.

kubectl get nodes
NAME                                                   STATUS   ROLES    AGE   VERSION
fargate-ip-192-168-71-111.us-east-1.compute.internal   Ready    <none>   1d    v1.25.8-eks-f4dc2c0
fargate-ip-192-168-21-91.us-east-1.compute.internal    Ready    <none>   1d    v1.25.8-eks-f4dc2c0

Deploy and Manage Applications

With your EKS cluster ready, you can start deploying and managing applications on Kubernetes. Utilize kubectl to create deployments, services, and other K8s resources as you would on any other K8s cluster. Use Helm charts to simplify YAML configuration, or plain YAML files for simple deployments. Leverage the scalability, load balancing, and self-healing capabilities of Kubernetes to ensure the optimal performance and availability of your applications.
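As a quick, hedged example of a first deployment once the cluster is up (the image and names are illustrative):

kubectl create deployment web --image=nginx                    # create the deployment
kubectl expose deployment web --port=80 --type=LoadBalancer    # front it with a load balancer
kubectl scale deployment web --replicas=3                      # scale out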

Creating an EKS cluster empowers you to harness the power of Kubernetes on the AWS platform while benefiting from the managed services and robust infrastructure provided by AWS. By following this step-by-step guide, you can seamlessly create or destroy EKS clusters within minutes.

Please check out the other articles on orchestration here.

Unlocking the Power of Orchestration with AWS Elastic Kubernetes Service


Containerization has revolutionized the way we develop, deploy, and scale applications. Kubernetes, an open-source container orchestration platform, has emerged as the de facto standard for managing containerized workloads efficiently. However, setting up and managing a Kubernetes (K8s) cluster can be a complex and time-consuming task. This is where AWS Elastic Kubernetes Service (EKS) comes to the rescue. In this blog post, we’ll explore the key features and benefits of EKS and how it simplifies the deployment and management of Kubernetes clusters on AWS.

Understanding Elastic Kubernetes Service

EKS is a fully managed service that makes it easier to run Kubernetes on AWS without the need to install and manage the K8s control plane. It takes care of the underlying infrastructure, including server provisioning, scaling, and patching, allowing developers and operations teams to focus on deploying and managing applications.

One of the significant advantages of AWS EKS is its seamless integration with other AWS services. EKS leverages Elastic Load Balancers (ELB), Amazon RDS for database management, and Amazon VPC for networking, enabling you to build highly scalable and resilient applications on the AWS platform. Additionally, EKS integrates with AWS Identity and Access Management (IAM) for secure authentication and authorization.

Scalability and Security

AWS EKS provides a highly available and scalable K8s control plane. It runs across multiple Availability Zones (AZs), ensuring redundancy and minimizing downtime. EKS automatically detects and replaces unhealthy control plane nodes, ensuring the stability of your cluster. Moreover, EKS enables you to scale your cluster horizontally by adding or removing worker nodes to meet the changing demands of your applications.
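With eksctl, scaling a node group is a one-liner; the cluster and node group names here are hypothetical:

eksctl scale nodegroup --cluster=app1-dev --name=ng-1 --nodes=4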

Security is a critical aspect of any cloud service, and AWS EKS offers robust security features. EKS integrates with AWS Identity and Access Management (IAM), allowing you to define granular access controls for your Kubernetes cluster. It also supports encryption of data at rest and in transit, using AWS Key Management Service (KMS) and Transport Layer Security (TLS) respectively. With EKS, you can meet various compliance requirements, such as HIPAA, GDPR, and PCI DSS.

Monitoring and Logging

AWS EKS provides comprehensive monitoring and logging capabilities. You can leverage Amazon CloudWatch to collect and analyze logs, metrics, and events from your EKS cluster. CloudWatch enables you to set up alarms and notifications to proactively monitor the health and performance of your applications. Additionally, EKS integrates with AWS X-Ray, a service for tracing and debugging distributed applications, allowing you to gain insights into the behavior of your microservices.

Cost Optimization

AWS EKS offers cost optimization features to help you manage your infrastructure efficiently. With EKS, you only pay for the resources you use, and you can scale your worker nodes based on demand. EKS integrates with AWS Auto Scaling, which automatically adjusts the number of worker nodes in your cluster based on predefined rules and metrics. This ensures optimal resource utilization and cost savings.

Elastic Kubernetes Service is a powerful service that simplifies management of Kubernetes clusters on the AWS platform. By leveraging the seamless integration with other AWS services, high availability, scalability, robust security, monitoring, and cost optimization features, AWS EKS empowers developers and operations teams to focus on building and scaling their applications without worrying about the underlying infrastructure. If you’re considering Kubernetes for your next project on AWS, EKS should be at the top of your list.

Check out our other articles on Containers here.

Harnessing the Power of Python: Converting Images to Text


In today’s digital era, images play a crucial role in communication and information sharing. However, extracting meaningful information from images can be a challenging task. That’s where the power of Python and its libraries, such as pytesseract and OpenCV, comes into play. In this blog post, we’ll explore the fascinating world of converting images to text using Python, uncovering the possibilities and applications of this remarkable technique.

Understanding Optical Character Recognition (OCR)

Optical Character Recognition (OCR) is the technology that enables computers to extract text from images or scanned documents. By leveraging OCR, we can convert images into editable and searchable text, providing a wealth of opportunities for various applications, including data entry automation, document analysis, and content extraction.

Python Image Libraries

Python offers several powerful libraries that make it relatively easy to perform image to text conversion. The two most widely used libraries are:

  1. Tesseract OCR: Tesseract is an open-source OCR engine developed by Google. It supports over 100 languages and provides robust text recognition capabilities. Python provides an interface to Tesseract through the pytesseract library, enabling seamless integration of OCR functionality into Python applications.
  2. OpenCV: OpenCV is a popular computer vision library that includes various image processing functions. While not primarily an OCR library, OpenCV provides a strong foundation for preprocessing images before passing them to an OCR engine. It can be used for tasks such as noise removal, image enhancement, and text localization, improving the accuracy of OCR results (see the preprocessing sketch after this list).
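As a brief sketch of the preprocessing mentioned in item 2, converting an image to grayscale and applying Otsu thresholding before OCR often improves accuracy; the file name is an example:

import cv2
import pytesseract

image = cv2.imread('image.jpg')
# Convert to grayscale, then binarize with Otsu's method to sharpen
# the contrast between text and background.
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
_, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
text = pytesseract.image_to_string(thresh)
print(text)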

Converting Images to Text with Python

To get started with image to text conversion in Python, you’ll need to install the necessary libraries. Note that pytesseract is a wrapper, so the Tesseract OCR engine itself must also be installed on your system. Use the following commands in your terminal or command prompt:

pip install pytesseract
pip install opencv-python

Once the libraries are installed, you can utilize the power of OCR in Python with the following steps:

  1. Import the required libraries:
import cv2
import pytesseract
  2. Load the image:
image = cv2.imread('image.jpg')
  3. Perform OCR using pytesseract:
text = pytesseract.image_to_string(image)
print(text)
  4. If the image isn’t clear or if the text is surrounded by pictures, add config options to image_to_string. This is especially true if you see garbage in the text or if text isn’t aligning correctly. You may need to adjust the --psm setting; sometimes 2, 4, or 8 will work best. This Stack Overflow conversation describes the psm option in detail: https://stackoverflow.com/questions/44619077/pytesseract-ocr-multiple-config-options
config_opts = ("--oem 1 --psm 4")
text = pytesseract.image_to_string(image, config=config_opts)
print(text)
  5. Analyze and utilize the extracted text. At this stage, the text has been extracted, so you can operate on it as you would any other text in Python, or insert it directly into a database.

Applications and Use Cases

The ability to convert images to text opens up numerous possibilities across various domains. Here are a few use cases where Python’s image to text conversion capabilities can be invaluable:

  1. Data Entry Automation: Automatically extracting data from forms, invoices, or receipts and converting them into machine-readable text can significantly streamline data entry processes.
  2. Document Analysis: Converting scanned documents or handwritten notes into editable text allows for efficient content analysis, searchability, and text mining.
  3. Accessibility: Converting text from images can improve accessibility for visually impaired individuals by enabling text-to-speech applications or screen readers to interpret the content.
  4. Content Extraction: Extracting text from images can aid in content curation, social media monitoring, and sentiment analysis, allowing businesses to gain valuable insights from visual content.

Python provides an extensive range of tools and libraries for converting images to text, thanks to its versatility and powerful third-party packages. With the help of OCR libraries like Tesseract and the image processing capabilities offered by OpenCV, developers can effortlessly extract text from images and unlock a multitude of applications. Whether you’re automating data entry, analyzing documents, or extracting content, Python makes image to text conversion fairly easy.

Be sure to check out the other Python articles here: https://sim10tech.com/category/python/

Creating Containerized Applications


Once you’ve mastered running containers, the next step is to deploy containerized applications. In this article, we will walk through creating and distributing Docker images. If you haven’t read the previous Docker articles, please see Part 1 and Part 2.

Getting Started with Images – Prep Work

As a first step, create a working directory for all of the files used in creating the image. This is where the Dockerfile and any libraries or other applications needed in the image can be stored.

mkdir wc
cd wc

The Dockerfile is the configuration file for creating Docker images. For all of the options, the reference is located in Docker’s documentation. Next, create a small executable script called run_http.py with these contents (the built-in module is http.server, serving on port 80 to match the Dockerfile below):

#!/bin/sh
exec python3 -m http.server 80

Dockerfile Options

Now that the working directory is created, let’s start with some Dockerfile commands. There are many more options, with tons of blogs and YouTube videos discussing how to create images. The Dockerfile below simply runs a Python web server. When creating an image, using ENV and COPY in the Dockerfile is extremely useful and, for the most part, required. These directives set up the image environment (i.e. CLASSPATH, PATH or anything else the app needs to run) and ensure dependent libraries are in the correct path.

  • FROM
    • Declares a parent image to use as a base.
  • ENV
    • Sets up environment variables if the application requires them. For example, the postgres image has several environment variables for container startup. Providing environment variables so users can change behavior or supply different startup options adds flexibility.
  • COPY / ADD
    • Copy or add a file from the host to the image. Most commonly this is an initialization script to prepare the container’s environment. Alternatively, you can specify a persistent volume when running the container if the files are installed on the host.
    • Important note: Wherever the scripts, libraries or other dependencies are copied, make sure the application is configured to search the destination path.
  • EXPOSE
    • Expose a port to the running container. This is typically used when running a container with -p. For example, docker run -p 5448:5432 tells Docker to map port 5448 on the host to port 5432 in the container. When connecting to the service, you would connect to the hostname on port 5448.
  • ENTRYPOINT
    • The container runs an exec of ENTRYPOINT, which serves as the main application process. If the application dies or is killed, the container stops as well.

Here is the example Dockerfile:

FROM centos
LABEL  maintainer="Your Friendly Maintainer"

# Port the web server listens on
EXPOSE 80

# Copy the startup script into the image and make sure it is executable
COPY run_http.py /
RUN chmod +x /run_http.py

RUN yum install -y epel-release
RUN yum install -y python3 python3-pip
RUN python3 -m pip install pip --upgrade

# Illustrates ENV usage; set PATH or anything else the app needs here
ENV PATH=${PATH}
ENTRYPOINT /run_http.py

Distributing Images

With this file saved in the wc directory as Dockerfile, run the build command. Note that -t names the image and is typically used to tag the version of the application. Make use of this, since tags will make life much easier for upgrades and lifecycle management.

docker build -t py-web .
docker image tag py-web:latest py-web:1.0 # tag a specific version alongside latest
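Before distributing the image, it’s worth a quick smoke test; mapping to host port 8080 is an arbitrary choice:

docker run -d -p 8080:80 py-web:latest   # map host port 8080 to the container's port 80
curl http://localhost:8080/              # should return the web server's directory listing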

At this point, the developer has two choices: either push the image to a registry (local or Docker Hub), or save the image to a file for distribution.

docker push <username>/py-web:latest # Pushing to Docker Hub (tag the image with your Hub username first)
docker push 1.1.1.1:5000/py-web:latest # Pushing the image to a local registry

If the image is saved instead, users can run docker image load to load it locally. The drawback of this approach is usability. While the docker image load command is straightforward, users will sometimes react negatively, since most companies use Docker Hub or their own registry for distribution, which makes running containers very easy. The other issue is that with every release, the user has to go back, download the tar file, and load it onto their system. Whenever possible, use Docker Hub or another registry.

Here is an example of saving an image:

docker save py-web:latest > py-web.tar
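On the receiving end, loading the saved image back is a single command:

docker image load < py-web.tar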

Regular Expressions in Python


I designed a tool a while back in Python that used sar and Solaris explorer data for capacity analysis. One of the issues I faced was needing to find data between two regular expressions. Fortunately, Python has a powerful regular expression module called re. Working with regexes can be daunting if you haven’t worked with them before. If you’re unfamiliar with regular-expression pattern matching, please read this: RegEx Primer

Using the Regular Expressions (RE) Module

Three main methods of the re module are compile(), match(), and search(). The compile() method creates a regex object, which makes searching through data much faster. match() will return a match object only if the beginning of the string matches the pattern. search() will find any occurrence of the pattern within the string. The example below is fairly simple in that it matches a literal string. Typically, the pattern will be an actual pattern rather than a simple string. As an example, something like ^fd.ss$ is more common in pattern matching. This pattern says:

  1. ^fd – find “fd” at the beginning of the line. ^ means to match at the beginning of the line.
  2. .ss – after finding “fd”, match any character followed by “ss”. The . matches any one character.
  3. ss$ – “ss” is the last two characters at the end of the line. $ means end of line, not including newline characters. (A short demo of this pattern follows the list.)
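Here is a quick demo of that pattern:

import re

pat = re.compile(r'^fd.ss$')
print(bool(pat.search('fdoss')))    # True: 'fd' + any one character + 'ss'
print(bool(pat.search('xfdoss')))   # False: 'fd' is not at the beginning
print(bool(pat.search('fdossy')))   # False: 'ss' is not at the end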
import re
data_str = 'this is my search string'
srch_recomp = re.compile('string')
# Match won't find anything since 'string' is not at the beginning of data_str
regex_found = re.match(srch_recomp, data_str)
type(regex_found)

regex_found = re.search(srch_recomp, data_str) # Search will find the pattern in data_str
regex_found
<re.Match object; span=(18, 24), match='string'>
In this example, we change the variable srch_recomp so re.match() will find the pattern:

data_str = 'this is my search string'
srch_recomp = re.compile('this')
regex_found = re.match(srch_recomp, data_str)
regex_found
<re.Match object; span=(0, 4), match='this'>

Python Forward Search

The algorithm for finding data between two patterns is fairly simple: using the regular expressions module, re, search for a begin pattern, then append every following line to a list until the end pattern is found. This example class uses a file, but the file object can easily be replaced with another object type. See the comments in the code if you don’t need the end_re line in the final output.

Regular expressions are a complex subject at first, mostly because the pattern matching syntax is so different. Start by reading and trying simple expressions. For the most part, re follows standard matching syntax, so knowing grep in Linux/UNIX transfers easily into Python. Refer to the re documentation here: Python RE module, or check Stack Overflow for examples.

import re

class LookForward():
    """
        begin_re: beginning search pattern
        end_re: end search pattern
        file_name: file to search for the begin_re and end_re patterns
        Return: a list of the lines found between the two patterns
    """
    def __init__(self, begin_re, end_re, file_name):
        self.begin_re = begin_re
        self.end_re = end_re
        self.file_name = file_name

    def look_forward(self):
        """
           Method that returns a list containing lines between
           begin and end regular expressions.
        """
        return_val = []
        begin_pattern = re.compile(self.begin_re)
        final_pattern = re.compile(self.end_re)
        try:
            with open(self.file_name) as file_ctx:
                f_data = file_ctx.readlines()
        except OSError as err:
            print(f"Encountered an error while opening {self.file_name}:"
                  f" {err}")
            raise
        lines = iter(f_data)
        for line in lines:
            # If the beginning pattern matches, start collecting
            # lines until end_re is found.
            if begin_pattern.search(line):
                # The shared iterator continues from the line after
                # the begin match instead of restarting at the top.
                for next_line in lines:
                    # strip() removes the trailing newline character.
                    return_val.append(next_line.strip())
                    # Check if next_line matches end_re.
                    if final_pattern.search(next_line):
                        # Uncomment the line below if the end_re line
                        # should not be included in the results:
                        # return_val = return_val[:-1]
                        # Break the inner loop since end_re was found.
                        break
        return return_val

To implement this class, initialize LookForward by passing begin and end regular expressions and a filename to search. In the example below, “this” is the begin search string, “that” is the end search string, and “test.txt” is the file that is searched for these strings.

lf_data = LookForward("this", "that", "test.txt")
output = lf_data.look_forward()
for lines in output:
    print(lines)

Check out the other Python-related articles here.

Docker Containers Part 2 – Working with Images


If you haven’t installed Docker, please read Part 1 of this Docker series.

Managing container lifecycles involves more than starting and stopping. In this second part of Docker Containers, we show how to administer images locally and on remote repositories. Image maintenance is done through the image subcommand. We will cover list/ls, inspect, search, pull, and rm/prune in this article.

Working with Images

List Images

The first part of managing images is knowing which images are being used, what their disk utilization is, and which versions are present. Listing images is done in one of two ways: either the long docker image list or the more Linux/UNIX-friendly docker image ls.

docker image ls
REPOSITORY   TAG      IMAGE ID       CREATED       SIZE
python       latest   e285995a3494   10 days ago   921MB
postgres     latest   75993dd36176   10 days ago   376MB

Inspecting images provides information such as environment variables, parent image, commands used during initialization, network information, volumes and much more. This data is vital when troubleshooting issues with container startup or creating new images. The following is only an excerpt – the actual command has about two pages of data.

docker inspect postgres
[
    {
        "RepoTags": [
            "postgres:latest"
        ],
        "Hostname": "81312c458473",
        "ExposedPorts": {
            "5432/tcp": {}
        },
        "Env": [
            "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/lib/postgresql/14/bin",
            "PG_MAJOR=14",
            "PG_VERSION=14.5-1.pgdg110+1",
            "PGDATA=/var/lib/postgresql/data"
        ],
        "Cmd": [
            "/bin/sh",
            "-c",
            "#(nop) ",
            "CMD [\"postgres\"]"
        ],
        .......

Search Repositories

Searching repositories can be accomplished by either going to Docker Hub, or searching by command line, docker search <string>, so you never have to leave the shell. Here’s an example:

docker search postgres
NAME                        DESCRIPTION                                     STARS   OFFICIAL   AUTOMATED
postgres                    The PostgreSQL object-relational database sy…   11486   [OK]
bitnami/postgresql          Bitnami PostgreSQL Docker Image                 154                [OK]
circleci/postgres           The PostgreSQL object-relational database sy…   30
ubuntu/postgres             PostgreSQL is an open source object-relation…   19
bitnami/postgresql-repmgr                                                   18
rapidfort/postgresql        RapidFort optimized, hardened image for Post…   15

Pulling Images

In order to run containers, you will need to pull the image from a repository. This can be accomplished either with docker pull <image name> or docker run <image name>, which will automatically pull the image if it doesn’t exist locally. By default, pull will get the latest version. Alternatively, you can specify a version by adding a colon, :, after the image name, like this: docker pull <image>:4.2.0

docker pull postgres
Using default tag: latest
latest: Pulling from library/postgres
31b3f1ad4ce1: Pull complete
1d3679a4a1a1: Pull complete
667bd4154fa2: Pull complete
87267fb600a9: Pull complete
Digest: sha256:b0ee049a2e347f5ec8c64ad225c7edbc88510a9e34450f23c4079a489ce16268
Status: Downloaded newer image for postgres:latest
docker.io/library/postgres:latest

Removing and Pruning Images

Unfortunately, Docker doesn’t automatically remove images, so disk utilization tends to grow fairly quickly if not managed. Docker has three commands to clean up: prune, rm, and rmi. As part of normal maintenance, prune should run from cron every few weeks or once a month, depending on how active the system is (a sample crontab entry appears at the end of this section).

docker image prune – Deletes unused (dangling) images.
docker rm <container IDs> – Removes the given containers from the system.
docker rmi <image ID> – Removes the image.

docker image prune
WARNING! This will remove all dangling images.
Are you sure you want to continue? [y/N] y
<none>                               <none>      48e3a3f31a48   10 months ago   999MB
<none>                               <none>      89108dc97df7   10 months ago   1.37GB
<none>                               <none>      26e43fa5dd7c   11 months ago   998MB
<none>                               <none>      b98d351f790b   11 months ago   1.37GB
<none>                               <none>      334a4df3c05a   11 months ago   998MB
<none>                               <none>      17c5a57654e4   11 months ago   1.37GB
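As a sketch of the cron idea mentioned above, a monthly crontab entry might look like this; the schedule is an example, and -f skips the confirmation prompt:

# min hour day-of-month month day-of-week command
0 3 1 * * docker image prune -f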

Please check out the other container articles here.


Passwords with Non-standard Characters in JSON Using Python


Python and JSON

I had a requirement for passwords to contain a backslash \ in an API call with JSON. However, when parsing the credentials with json.loads, Python would throw this exception:

Expecting value: line 1 column 34 (char 33)

Not surprisingly, a solution wasn’t found on Stack Overflow or in any internet searches. I’m guessing the reason is that \ is a reserved character in JSON, similar to Python. Unfortunately, that didn’t matter, as the requirements were already set and accepted, so I needed to find a fix. I attempted the following:

  1. Escape the \ with two backslashes like this: \\.
  2. Different quotes: ' and ".
  3. Encapsulating the quotes like '" and "'.
  4. Using strict=False for json.loads.
    1. Example: json.loads(json_creds, strict=False)
    2. This was the most cited workaround I found, but it never worked with the slash. json.loads would throw the Expecting value exception every time.

However, none of that worked, mostly because, in all honesty, it shouldn’t. The reserved characters are there, just like in Python, for the language to function correctly. We wouldn’t add an @ in a method or function definition for the same reason we shouldn’t add \ in passwords for JSON. I’m digressing a bit – back to how to work around this.

I found that if the password is encoded using json.dumps first and then inserted into the JSON payload, it worked perfectly.

password = "This.\Sample"
encoded_pw = json.dumps(password)
JSON_DATA = "{\"username\": \"" + username + "\", \"password\":" + encoded_pw + "}"
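A simpler variant of the same idea, assuming the same username and password variables, is to let json.dumps encode the whole structure, which escapes the backslash in one step:

JSON_DATA = json.dumps({"username": username, "password": password})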

For other Python-related articles, please checkout other Python articles here.