<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Tips, demos and solutions on important topics and practical issues in Python development, data analytics, data science and engineering]]></title><description><![CDATA[Tips, demos and solutions on essential topics and practical issues in Python development, data analytics, data science and engineering.]]></description><link>https://henryeleonu.com</link><image><url>https://cdn.hashnode.com/res/hashnode/image/upload/v1671815184332/dWIw9LNJz.png</url><title>Tips, demos and solutions on important topics and practical issues in Python development, data analytics, data science and engineering</title><link>https://henryeleonu.com</link></image><generator>RSS for Node</generator><lastBuildDate>Mon, 20 Apr 2026 12:22:46 GMT</lastBuildDate><atom:link href="https://henryeleonu.com/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Importance of Writing Only A Single Task in a Function]]></title><description><![CDATA[I will be talking about the importance of writing only a single task in a function based on my recent experience in a project I was involved with. I was engaged in making some updates to simulation software for transport and logistics, and I noticed ...]]></description><link>https://henryeleonu.com/importance-of-writing-only-a-single-task-in-a-function</link><guid isPermaLink="true">https://henryeleonu.com/importance-of-writing-only-a-single-task-in-a-function</guid><category><![CDATA[Python]]></category><category><![CDATA[Software Engineering]]></category><category><![CDATA[Code Quality]]></category><category><![CDATA[software development]]></category><category><![CDATA[data-engineering]]></category><dc:creator><![CDATA[Henry Eleonu]]></dc:creator><pubDate>Sun, 07 May 2023 17:36:51 GMT</pubDate><content:encoded><![CDATA[<p>I will be talking about the importance of writing only a single task in a function based on my recent experience in a project I was involved with. I was engaged in making some updates to simulation software for transport and logistics, and I noticed that the transport and logistics department, which was supposed to use the software, was not using it because they couldn’t update it to meet their current needs and developers engaged previously had a hard time at maintaining the project. It was immediately apparent that the software was difficult to maintain because many tasks were lumped into each function, making the functions monolithic. It was surprising to learn that the company that developed the project is a well-known software consulting company, and as such, I did not expect them to deliver such poorly written code.</p>
<p>Apart from software projects meeting the functional and non-functional requirements of their deliverables, organisations should also set minimum standards for code quality. This is very important because poor code quality will ultimately have a negative impact on the maintainability of a software project. In this case, the solution to their problems was writing only a single task in a function.</p>
<p>Some of the benefits of doing this are:</p>
<ol>
<li><p>Reduces the code complexity of functions.</p>
</li>
<li><p>Reduces code coupling.</p>
</li>
<li><p>Makes code readable and easier to understand and maintain.</p>
</li>
<li><p>Improves code quality.</p>
</li>
<li><p>Encourages the adoption of test-driven software development.</p>
</li>
</ol>
<h2 id="heading-reduction-of-code-complexity-of-functions">Reduction of Code Complexity of Functions</h2>
<p>Writing only a single task in a function reduces the function's code complexity, most directly by reducing its number of lines of code. It also reduces complexity by limiting the nesting of conditional statements and loops. </p>
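<p>As a minimal, hypothetical sketch (the function and field names here are invented for illustration), compare a monolithic function with its single-task refactoring:</p>
<pre><code class="lang-python"># Monolithic: loads, validates and reports in one function.
def process_orders(path):
    with open(path) as f:
        rows = [line.strip().split(",") for line in f]
    valid = [r for r in rows if len(r) == 3 and r[2].isdigit()]
    total = sum(int(r[2]) for r in valid)
    print(f"{len(valid)} orders, {total} items")

# Refactored: one task per function.
def load_rows(path):
    with open(path) as f:
        return [line.strip().split(",") for line in f]

def keep_valid(rows):
    return [r for r in rows if len(r) == 3 and r[2].isdigit()]

def count_items(rows):
    return sum(int(r[2]) for r in rows)

def report(rows):
    print(f"{len(rows)} orders, {count_items(rows)} items")
</code></pre>
<p>Each refactored function can now be read, reused and tested on its own.</p>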
<h2 id="heading-reduce-code-coupling">Reduce Code Coupling</h2>
<p>It decouples tasks that would otherwise be coupled in one function into separate functions, which improves the reusability of those functions.</p>
<h2 id="heading-make-code-readable-easier-to-understand-and-maintain">Make Code Readable, Easier to Understand and Maintain</h2>
<p>Reducing code complexity and coupling are the major factors in improving code readability and understandability, and consequently code maintainability. </p>
<h2 id="heading-improve-code-quality">Improve Code Quality</h2>
<p>Improved readability, understandability and maintainability of code are key to improved code quality. </p>
<h2 id="heading-encourage-the-adoption-of-test-driven-software-development">Encourage The Adoption of Test-Driven Software Development</h2>
<p>When each function performs only a single task, it is easier to write a test for each task represented by a function. This makes it practical to adopt a test-driven software development approach, as in the sketch below.</p>
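<p>For example, a minimal pytest sketch against the hypothetical single-task functions from earlier (the module name orders is an assumption):</p>
<pre><code class="lang-python"># test_orders.py: each single-task function gets a small, focused test.
from orders import keep_valid, count_items  # hypothetical module name

def test_keep_valid_drops_malformed_rows():
    rows = [["1", "widget", "3"], ["bad row"]]
    assert keep_valid(rows) == [["1", "widget", "3"]]

def test_count_items_sums_quantities():
    assert count_items([["1", "widget", "3"], ["2", "bolt", "4"]]) == 7
</code></pre>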
]]></content:encoded></item><item><title><![CDATA[Some Best Practices for Improving Code Maintainability and Quality]]></title><description><![CDATA[I will discuss my experience while maintaining simulation software written in Python. I was engaged in making some updates to simulation software for transport and logistics, and I noticed that the transport and logistics department, which was suppos...]]></description><link>https://henryeleonu.com/some-best-practices-for-improving-code-maintainability-and-quality</link><guid isPermaLink="true">https://henryeleonu.com/some-best-practices-for-improving-code-maintainability-and-quality</guid><category><![CDATA[Code Quality]]></category><category><![CDATA[best practices]]></category><category><![CDATA[Software Engineering]]></category><category><![CDATA[Python]]></category><category><![CDATA[code maintainability]]></category><dc:creator><![CDATA[Henry Eleonu]]></dc:creator><pubDate>Sun, 07 May 2023 12:57:48 GMT</pubDate><content:encoded><![CDATA[<p>I will discuss my experience while maintaining simulation software written in Python. I was engaged in making some updates to simulation software for transport and logistics, and I noticed that the transport and logistics department, which was supposed to use the software, was not using it because they couldn’t update it to meet their current needs, and the developers engaged previously had a hard time maintaining the project. It was immediately apparent that the software was difficult to maintain because many tasks were lumped into each function, making the functions monolithic. It was surprising to learn that the company that developed the project is a well-known software consulting company, and as such, I did not expect them to deliver such poorly written code.</p>
<p>Apart from software projects meeting the functional and non-functional requirements of their deliverables, organisations should also set minimum standards for code quality. This is very important because poor code quality will ultimately negatively impact the maintainability of a software project. </p>
<p>When the project was no longer being used because it was difficult to update to meet current needs, it was regarded as a failed project by those in charge of transport and logistics. This shows the impact of not following software engineering best practices in software projects. The developers of the project failed to make sure, as much as possible, that each function in the software performed only one task. The functions had many lines of code, making the code very difficult to read and understand. The monolithic functions had higher complexity, which made them difficult to maintain. Also, because of the many tasks in each function, there was high code coupling, often resulting in spaghetti code that is hard to understand. Reusing the code for the many tasks embedded within each function was also impossible. And because many tasks were lumped into each function, it was difficult to write tests: any test for a function would have to exercise the functionality of all the tasks embedded in it simultaneously. This made it practically impossible to adopt a test-driven software development approach.</p>
]]></content:encoded></item><item><title><![CDATA[How To Encourage Best Practices in Python Programming By Complying With PEP8 Style Guide]]></title><description><![CDATA[As part of this blog post, I have added a YouTube demo on how to enable PEP8 compliance in Visual Studio Code.
Enabling PEP8 Compatibility of Python Code in Visual Studio Code - YouTube
https://www.youtube.com/watch?v=ZkwHwQ6l4wI
 
Some of the best p...]]></description><link>https://henryeleonu.com/how-to-encourage-best-practices-in-python-programming-by-complying-with-pep8-style-guide</link><guid isPermaLink="true">https://henryeleonu.com/how-to-encourage-best-practices-in-python-programming-by-complying-with-pep8-style-guide</guid><category><![CDATA[pep8]]></category><category><![CDATA[type checking]]></category><category><![CDATA[Python]]></category><category><![CDATA[vscode extensions]]></category><category><![CDATA[Visual Studio Code]]></category><dc:creator><![CDATA[Henry Eleonu]]></dc:creator><pubDate>Mon, 17 Apr 2023 14:28:10 GMT</pubDate><content:encoded><![CDATA[<p>As part of this blog post, I have added a YouTube demo on how to enable PEP8 compliance in Visual Studio Code.</p>
<p><a target="_blank" href="https://www.youtube.com/watch?v=ZkwHwQ6l4wI">Enabling PEP8 Compatibility of Python Code in Visual Studio Code - YouTube</a></p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=ZkwHwQ6l4wI">https://www.youtube.com/watch?v=ZkwHwQ6l4wI</a></div>
<p> </p>
<p>Some of the best practices in Python programming are:</p>
<ol>
<li><p>Comply with PEP8 conventions.</p>
</li>
<li><p>Enforce data type checking.</p>
</li>
<li><p>Autogenerate docstrings.</p>
</li>
<li><p>Install autopep8 for code formatting.</p>
</li>
</ol>
<h1 id="heading-complying-with-pep8-conventions">Complying With PEP8 Conventions</h1>
<p>The key benefit of following best practices and conventions in Python programming is improved readability and consistency of code. Readability is very important because code is read much more than it is written, and readable code is easier to maintain. Following the standards in PEP8 will go a long way towards improving the quality of the code we write. PEP8 provides a style guide for writing consistent and readable Python code, and I will show you how to enforce or encourage PEP8 conventions in Visual Studio Code.</p>
<p>The first step is to install pep8, a checker for Python coding standards covering issues such as variable naming style, module and function docstrings, and inconsistent indentation. Install it by running the following command:</p>
<p><code>$ pip install pep8</code></p>
<p>The next step is to enforce the PEP8 standard by installing Pylint in Visual Studio Code. Pylint is the tool that checks whether our code complies with the PEP8 standard and returns errors where we fail to comply. The easiest way to install Pylint is to go to the Extensions tab in Visual Studio Code, search for Pylint and then install it.<br />Another way to install Pylint is to run the following command on the terminal:</p>
<p><code>$ pip install pylint</code></p>
<p>Then Pylint needs to be enabled on VSCode by following these steps:</p>
<ol>
<li><p>Press "Ctrl + Shift + P" to get Command Palette</p>
</li>
<li><p>Type "Lint"</p>
</li>
</ol>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1681740639833/286d5a8a-dbf9-47b0-b62a-12c280337783.png" alt class="image--center mx-auto" /></p>
<ol start="3">
<li><p>Select "Python : Enable/Disable Linting", and click on "Enable"</p>
</li>
<li><p>Repeat Steps 1 &amp; 2, then select "Python : Select Linter" and select pylint from the options</p>
</li>
</ol>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1681740958155/b26ca261-5da8-4072-95cd-3cdd6de3b484.png" alt class="image--center mx-auto" /></p>
<p>Note: apart from highlighting stylistic problems, Pylint also highlights syntax errors in your code.</p>
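<p>As a hypothetical sketch of the kind of code Pylint complains about (the codes in the comments are typical Pylint message IDs, though the exact output varies by version):</p>
<pre><code class="lang-python"># No module docstring here: Pylint reports C0114 (missing-module-docstring).

def ProcessData(InputValue):  # C0103: invalid-name, should be snake_case
    """Double the input."""
    Result = InputValue * 2  # C0103: invalid-name
    return Result
</code></pre>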
<h1 id="heading-enable-type-checking">Enable Type Checking</h1>
<p>You might also enable type checking in VS Code by going to Settings, typing “type checking” and changing the type checking mode to either basic or strict. This will highlight, for example, function parameters without a data type specified, as in the sketch after the screenshot below.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1681741183839/a9118f47-24d2-488d-8303-1adf3634c05e.png" alt class="image--center mx-auto" /></p>
<h1 id="heading-autogenerate-docstring">Autogenerate docstring</h1>
<p>For a large project, writing docstrings can be cumbersome, so I use autoDocstring to automatically generate docstrings for modules and functions. To install autoDocstring, go to the Extensions tab in VS Code, search for autoDocstring and install it. To generate a docstring, the cursor must be on the line where you want your docstring to start; right-click and then click “Generate Docstring”. You must write your function first before you can generate a docstring for it.</p>
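<p>A sketch of the kind of template autoDocstring produces from a function signature (this assumes the default docstring style; the placeholders and layout vary with your settings):</p>
<pre><code class="lang-python">def load_orders(path: str, limit: int = 100) -&gt; list:
    """_summary_

    Args:
        path (str): _description_
        limit (int, optional): _description_. Defaults to 100.

    Returns:
        list: _description_
    """
</code></pre>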
<h1 id="heading-install-autopep8-formatter-extension-for-vscode">Install autopep8 Formatter Extension for VScode</h1>
<p>autopep8 automatically formats Python code to conform to the PEP8 style guide. Install the autopep8 formatter from the Extensions tab in VS Code.</p>
]]></content:encoded></item><item><title><![CDATA[Deploy Jupyter Notebook and Spark on AWS Elastic Kubernetes Service (EKS)]]></title><description><![CDATA[In this article, I am going to show the steps to follow to enable you to run Apache Spark on a cluster managed by Kubernetes. But before this, you have to first create the EKS cluster. I have another article on how to create an EKS cluster in AWS. Sp...]]></description><link>https://henryeleonu.com/deploy-jupyter-notebook-and-spark-on-aws-elastic-kubernetes-service-eks</link><guid isPermaLink="true">https://henryeleonu.com/deploy-jupyter-notebook-and-spark-on-aws-elastic-kubernetes-service-eks</guid><category><![CDATA[Jupyter Notebook ]]></category><category><![CDATA[#apache-spark]]></category><category><![CDATA[spark]]></category><category><![CDATA[AWS]]></category><category><![CDATA[EKS]]></category><dc:creator><![CDATA[Henry Eleonu]]></dc:creator><pubDate>Tue, 20 Dec 2022 17:28:20 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1671815627745/593527a0-9b1b-4546-8d72-33a7cb2e6483.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In this article, I am going to show the steps to follow to run Apache Spark on a cluster managed by Kubernetes. But before this, you have to first create the EKS cluster; I have another article on <a target="_blank" href="https://henryeleonu.hashnode.dev/how-to-create-an-aws-elastic-kubernetes-service-eks-cluster">how to create an EKS cluster in AWS</a>. Spark is a framework for big data processing that enables in-memory processing of large amounts of data by partitioning the data and distributing the partitions across the nodes that make up the cluster. The Dockerfile and YAML files used for the deployment of Spark on EKS can be found in my GitHub repository:</p>
<p><a target="_blank" href="https://github.com/henryeleonu/spark-kubernetes/tree/jupyter-spark-kube"><strong>https://github.com/henryeleonu/spark-kubernetes/tree/jupyter-spark-kube</strong></a></p>
<p>I also have a Youtube demo on the deployment of Spark on EKS.</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=XGvdlSmNMvc">https://www.youtube.com/watch?v=XGvdlSmNMvc</a></div>
<p> </p>
<p><strong>Docker Image</strong></p>
<p>You will build a Spark Docker image from a Dockerfile. The Docker image is required to run the Spark containers in the Kubernetes cluster. Docker is a container runtime environment that is frequently used with Kubernetes. Spark ships with a Dockerfile that can be used to build the Spark image, and this official Dockerfile can be customized to meet an individual application’s needs. The first step is to download Apache Spark from <a target="_blank" href="https://spark.apache.org/downloads.html">https://spark.apache.org/downloads.html</a>. Choose the package type "Pre-built for Apache Hadoop 3.3 and later", then download the tar archive file to your local directory and extract it.</p>
<p>The next step is to build the Spark image. The Spark download has a <code>bin/docker-image-tool.sh</code> script that can be used to build the Spark images from the Dockerfiles found in the <code>kubernetes/dockerfiles/</code> directory. We will be building an additional PySpark image with the Dockerfile at <code>kubernetes/dockerfiles/spark/bindings/python/Dockerfile</code>.</p>
<p>Example</p>
<pre><code class="lang-bash"><span class="hljs-comment"># To build additional PySpark docker image</span>
$ ./bin/docker-image-tool.sh -r &lt;repo&gt; -t my-tag -p ./kubernetes/dockerfiles/spark/bindings/python/Dockerfile build
</code></pre>
<p>Replace &lt;repo&gt; with the name of your Docker Hub repository and my-tag with the tag for the image.</p>
<pre><code class="lang-bash"><span class="hljs-comment"># To build additional PySpark docker image</span>
$ ./bin/docker-image-tool.sh -r heleonu/spark-py -t 1.1 -p ./kubernetes/dockerfiles/spark/bindings/python/Dockerfile build
</code></pre>
<p>After building, push the image to Docker Hub using the command:</p>
<pre><code class="lang-bash">docker push heleonu/spark-py:1.1
</code></pre>
<p>You will then create a Dockerfile based on the Spark image we built earlier. The image built from it will add the packages and libraries we need, such as Jupyter Notebook. Build an image from this Dockerfile and push it to Docker Hub:</p>
<pre><code class="lang-bash">docker build -t heleonu/spark-py-kube:1.2 .
docker push heleonu/spark-py-kube:1.2
</code></pre>
<div class="gist-block embed-wrapper" data-gist-show-loading="false" data-id="64b247a5e6471203cdd7a83f9841775d"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a href="https://gist.github.com/henryeleonu/64b247a5e6471203cdd7a83f9841775d" class="embed-card">https://gist.github.com/henryeleonu/64b247a5e6471203cdd7a83f9841775d</a></div><p> </p>
<p><strong>Create a Service Account and Permissions</strong></p>
<p>A service account provides an identity for processes that run in a Pod, and maps to a ServiceAccount object. A ClusterRole contains rules that represent a set of permissions. A ClusterRoleBinding grants the permissions defined in a ClusterRole to users, groups, or service accounts. The YAML configuration below has the ServiceAccount, ClusterRole and ClusterRoleBinding we used. Apply it with:</p>
<pre><code class="lang-bash">kubectl apply -f service-account.yaml
</code></pre>
<div class="gist-block embed-wrapper" data-gist-show-loading="false" data-id="46e8a2505e82ab03d7aef98d50088baa"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a href="https://gist.github.com/henryeleonu/46e8a2505e82ab03d7aef98d50088baa" class="embed-card">https://gist.github.com/henryeleonu/46e8a2505e82ab03d7aef98d50088baa</a></div><p> </p>
<p><strong>Create Secret</strong></p>
<p>Kubernetes Secrets can be used to provide credentials for a Spark application to access secured services. The secret YAML file has the login credentials Spark requires to communicate with the PostgreSQL database; the credentials are encoded with base64 encoding. I have another article on <a target="_blank" href="https://henryeleonu.hashnode.dev/how-to-deploy-postgresql-on-aws-elastic-kubernetes-service-eks">how to deploy PostgreSQL on EKS</a>.</p>
<div class="gist-block embed-wrapper" data-gist-show-loading="false" data-id="268516c3e68be13d77936bb459e0f6ff"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a href="https://gist.github.com/henryeleonu/268516c3e68be13d77936bb459e0f6ff" class="embed-card">https://gist.github.com/henryeleonu/268516c3e68be13d77936bb459e0f6ff</a></div><p> </p>
<p>Run the command to deploy the secret:</p>
<pre><code class="lang-bash">kubectl apply -f postgres-login-secret.yaml
</code></pre>
<p><strong>Creating Spark Pod</strong></p>
<p>You will create a spark pod with the YAML configuration below.</p>
<div class="gist-block embed-wrapper" data-gist-show-loading="false" data-id="157c513a560d384cf05339549a9afb09"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a href="https://gist.github.com/henryeleonu/157c513a560d384cf05339549a9afb09" class="embed-card">https://gist.github.com/henryeleonu/157c513a560d384cf05339549a9afb09</a></div><p> </p>
<p>In the YAML file, the service account we created earlier is bound to the spark pod, and we use the Spark image, heleonu/spark-py-kube:1.2, which we built earlier. This command,</p>
<pre><code class="lang-yaml"><span class="hljs-attr">command:</span> [<span class="hljs-string">"jupyter"</span>, <span class="hljs-string">"notebook"</span>, <span class="hljs-string">"--ip"</span>, <span class="hljs-string">"0.0.0.0"</span>, <span class="hljs-string">"--allow-root"</span>]
</code></pre>
<p>starts Jupyter Notebook when the Spark container starts running. By default, Jupyter Notebook runs on port 8888. The secret is attached to the spark pod using these lines:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">envFrom:</span>
          <span class="hljs-bullet">-</span> <span class="hljs-attr">secretRef:</span>
              <span class="hljs-attr">name:</span> <span class="hljs-string">mysecret</span>
</code></pre>
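<p>Inside the container, the secret's keys then appear as environment variables. A small sketch of reading them from the notebook, using the POSTGRES_DB, POSTGRES_USER and POSTGRES_PASSWORD keys the secret defines:</p>
<pre><code class="lang-python">import os

# These keys come from the secret attached via envFrom.
pg_db = os.environ["POSTGRES_DB"]
pg_user = os.environ["POSTGRES_USER"]
pg_password = os.environ["POSTGRES_PASSWORD"]
</code></pre>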
<p>Run the command to deploy the spark pod:</p>
<pre><code class="lang-bash">kubectl apply -f spark-pod.yaml
</code></pre>
<p><strong>Create Headless Service</strong></p>
<p>This headless service enables the Spark executors to communicate with the Spark driver.</p>
<div class="gist-block embed-wrapper" data-gist-show-loading="false" data-id="e3c10027c1306aabf9f201c69b3e6e91"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a href="https://gist.github.com/henryeleonu/e3c10027c1306aabf9f201c69b3e6e91" class="embed-card">https://gist.github.com/henryeleonu/e3c10027c1306aabf9f201c69b3e6e91</a></div><p> </p>
<p>Run the command to deploy the headless service:</p>
<pre><code class="lang-bash">kubectl apply -f spark-headless-service.yaml
</code></pre>
<p><strong>Start Jupyter Notebook</strong></p>
<p>We may also want to start Jupyter Notebook manually by using kubectl exec to get a shell in the spark pod:</p>
<pre><code class="lang-bash">kubectl exec -it spark-pod -- bash
</code></pre>
<p>And then start Jupyter Notebook in the spark pod by running:</p>
<pre><code class="lang-bash">jupyter notebook --ip 0.0.0.0 --allow-root
</code></pre>
<p>When Jupyter Notebook starts in the spark pod, it displays the notebook URL on the terminal. This URL includes the port on which Jupyter Notebook is running in the pod. You need to forward this port to run Jupyter Notebook in the browser on your local machine. Open another terminal and run:</p>
<pre><code class="lang-bash">kubectl port-forward pod/spark-pod 8889:8889
</code></pre>
<p>This command assumes that Jupyter Notebook is running on port 8889 in the spark pod and forwards that port to port 8889 on your local machine. You can then copy the Jupyter Notebook URL and paste it into the browser on your local machine. This way, you can submit Spark jobs to AWS EKS from your local browser. To better understand how this works, watch this YouTube demo.</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=XGvdlSmNMvc">https://www.youtube.com/watch?v=XGvdlSmNMvc</a></div>
<p> </p>
<p>This is example code to use in your notebook to set the Spark configuration needed to submit Spark jobs on EKS.</p>
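<p>A minimal sketch of such a configuration (the API endpoint, service account name and driver host below are placeholders and assumptions to be replaced with your own cluster's values; the full version is in the gist that follows):</p>
<pre><code class="lang-python">from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    # Kubernetes master URL: your EKS API server endpoint (placeholder)
    .master("k8s://https://&lt;your-eks-api-endpoint&gt;:443")
    .appName("jupyter-on-eks")
    # The image we built and pushed earlier
    .config("spark.kubernetes.container.image", "heleonu/spark-py-kube:1.2")
    # The service account created earlier (name assumed)
    .config("spark.kubernetes.authenticate.driver.serviceAccountName", "spark")
    .config("spark.executor.instances", "2")
    # Driver host: the headless service name (assumed)
    .config("spark.driver.host", "spark-headless-service")
    .getOrCreate()
)
</code></pre>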
<div class="gist-block embed-wrapper" data-gist-show-loading="false" data-id="2d2116f732c525159240c974001d7a01"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a href="https://gist.github.com/henryeleonu/2d2116f732c525159240c974001d7a01" class="embed-card">https://gist.github.com/henryeleonu/2d2116f732c525159240c974001d7a01</a></div>]]></content:encoded></item><item><title><![CDATA[How To Deploy PostgreSQL on AWS Elastic Kubernetes Service (EKS)]]></title><description><![CDATA[In this post, I am going to explain the step I followed to deploy PostgreSQL running in docker containers to AWS EKS. The YAML files I used for this deployment can be found in my GitHub repository:
https://github.com/henryeleonu/spark-kubernetes/tree...]]></description><link>https://henryeleonu.com/how-to-deploy-postgresql-on-aws-elastic-kubernetes-service-eks</link><guid isPermaLink="true">https://henryeleonu.com/how-to-deploy-postgresql-on-aws-elastic-kubernetes-service-eks</guid><category><![CDATA[AWS Elastic Kubernetes Service]]></category><category><![CDATA[PostgreSQL]]></category><category><![CDATA[AWS]]></category><category><![CDATA[EKS]]></category><category><![CDATA[Kubernetes]]></category><dc:creator><![CDATA[Henry Eleonu]]></dc:creator><pubDate>Mon, 19 Dec 2022 11:26:02 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1671815866692/8b07d2ab-4d7a-457a-81ff-c09d7e5fd236.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In this post, I am going to explain the steps I followed to deploy PostgreSQL running in Docker containers to AWS EKS. The YAML files I used for this deployment can be found in my GitHub repository:</p>
<p><a target="_blank" href="https://github.com/henryeleonu/spark-kubernetes/tree/jupyter-spark-kube">https://github.com/henryeleonu/spark-kubernetes/tree/jupyter-spark-kube</a></p>
<p>In this YouTube video, I demoed the deployment of Apache Spark and PostgreSQL on EKS:</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=XGvdlSmNMvc">https://www.youtube.com/watch?v=XGvdlSmNMvc</a></div>
<p> </p>
<p>Before you deploy PostgreSQL, you must deploy the EKS cluster on AWS. Click this link to see <a target="_blank" href="https://henryeleonu.hashnode.dev/how-to-create-an-aws-elastic-kubernetes-service-eks-cluster">how we deploy an EKS cluster</a>.</p>
<p><strong>Create a Storage Volume</strong></p>
<p>Docker containers are ephemeral: all the data generated by or in a container will be lost after the termination of the container instance. To preserve the data, we will be using persistent volume and persistent volume claim resources within Kubernetes to store the data on persistent storage. We use the manifest below to specify the persistent volume and persistent volume claim resources. Note that since we are deploying this database container to AWS EKS, the instance storage of the EC2 node will be used. This means that on termination of the EC2 instance, your data will be lost. For a production database, the best approach is to use storage external to the EC2 instance on which the database pod is running, such as an Elastic Block Store (EBS) volume or Elastic File System (EFS).</p>
<div class="gist-block embed-wrapper" data-gist-show-loading="false" data-id="235c91bfd299767a369650b58103b44f"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a href="https://gist.github.com/henryeleonu/235c91bfd299767a369650b58103b44f" class="embed-card">https://gist.github.com/henryeleonu/235c91bfd299767a369650b58103b44f</a></div><p> </p>
<p>To create the persistent volume and persistent volume claim, run one of these commands on the terminal:</p>
<pre><code class="lang-bash">kubectl create -f postgres-storage.yaml
</code></pre>
<p>OR</p>
<pre><code class="lang-bash">kubectl apply -f postgres-storage.yaml
</code></pre>
<p>To check the Persistent Volume, run:</p>
<pre><code class="lang-bash">kubectl get pv
</code></pre>
<p>To check the Persistent Volume Claim, run:</p>
<pre><code class="lang-bash">kubectl get pvc
</code></pre>
<p><strong>Create Secret</strong></p>
<p>A Secret is an object that contains a small amount of sensitive data such as a password, a token, or a key. By using a Secret, you don't need to include confidential data in your application code. Secrets are created independently of the Pods that use them, so there is less risk of confidential data being exposed during the workflow of creating, viewing and editing Pods. Kubernetes, and applications that run in your cluster, can also take additional precautions with Secrets, such as avoiding writing secret data to nonvolatile storage. I used base64 encoding to encode the values for POSTGRES_DB, POSTGRES_USER and POSTGRES_PASSWORD in the secret manifest. Below is the secret manifest we used.</p>
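<p>A quick sketch of producing such base64-encoded values in Python (the plaintext values here are placeholders, not the real credentials):</p>
<pre><code class="lang-python">import base64

for value in ("postgres", "admin", "s3cret"):  # placeholder values
    print(base64.b64encode(value.encode()).decode())
</code></pre>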
<div class="gist-block embed-wrapper" data-gist-show-loading="false" data-id="268516c3e68be13d77936bb459e0f6ff"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a href="https://gist.github.com/henryeleonu/268516c3e68be13d77936bb459e0f6ff" class="embed-card">https://gist.github.com/henryeleonu/268516c3e68be13d77936bb459e0f6ff</a></div><p> </p>
<p>To create the secret, run the command on your terminal:</p>
<pre><code class="lang-bash">kubectl apply -f postgres-login-secret.yaml
</code></pre>
<p><strong>Creating PostgreSQL Deployment</strong></p>
<p>In the deployment manifest below, the pod is specified under template and uses the postgres:14-alpine image. It gets its configuration from the secret associated with the pod and mounts the volume created from the persistent volume and claim. The replica count of the deployment is set to 1. If you need to run multiple pods of the database in the Kubernetes cluster, do not use a Deployment for it, because it will not synchronize the data across the multiple database pods, which would compromise the integrity of the data. If you need multiple databases, like one master and one or more read replicas, use a StatefulSet; a manifest of kind: StatefulSet can keep the read replicas synchronized with the master pod.</p>
<div class="gist-block embed-wrapper" data-gist-show-loading="false" data-id="87c278384453eaebcdbd60e285d707a6"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a href="https://gist.github.com/henryeleonu/87c278384453eaebcdbd60e285d707a6" class="embed-card">https://gist.github.com/henryeleonu/87c278384453eaebcdbd60e285d707a6</a></div><p> </p>
<p>To create the Postgres deployment, run the command on your terminal:</p>
<pre><code class="lang-bash">kubectl create -f postgres-deployment.yaml
</code></pre>
<p>To get the list of Kubernetes deployments, run:</p>
<pre><code class="lang-bash">kubectl get deployments
</code></pre>
<p><strong>Create PostgreSQL Service</strong></p>
<p>To access the deployment or container, we need to expose the PostgreSQL service. Kubernetes provides different types of services, like ClusterIP, NodePort and LoadBalancer. With ClusterIP, we can access the PostgreSQL service within the Kubernetes cluster. NodePort exposes a service endpoint on the Kubernetes node, which is the EC2 instance in the case of AWS. For accessing PostgreSQL externally, you need to use a LoadBalancer service type, which exposes the service externally. Apart from a service, you can also use an Ingress to expose a service externally.</p>
<p>In our case, we don't need to expose the database externally, and it is good practice never to do this, because the database should not be accessed directly but through a front end. Therefore, the manifest below specifies a service of type ClusterIP.</p>
<div class="gist-block embed-wrapper" data-gist-show-loading="false" data-id="3f5b100c327d5abf3cc970024803622c"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a href="https://gist.github.com/henryeleonu/3f5b100c327d5abf3cc970024803622c" class="embed-card">https://gist.github.com/henryeleonu/3f5b100c327d5abf3cc970024803622c</a></div><p> </p>
<p>To create the Postgres service, run the command in the terminal:</p>
<pre><code class="lang-bash">kubectl create -f postgres-service.yaml
</code></pre>
<p>To verify the Kubernetes service, run:</p>
<pre><code class="lang-bash">kubectl get svc
</code></pre>
<p><strong>Connecting to PostgreSQL via kubectl Command</strong></p>
<p>To get into the PostgreSQL pod and open a psql session, run the kubectl exec command below. Be sure to change the pod name, the username and the database name:</p>
<pre><code class="lang-bash">kubectl exec -it &lt;name of postgres pod&gt; -- psql -h localhost -U &lt;database username&gt; --password -p &lt;postgres port number&gt; &lt;database name&gt;
</code></pre>
<p>For example:</p>
<pre><code class="lang-bash">kubectl exec -it postgres-574d8d5f-2488v -- psql -h localhost -U postgres --password -p 5432 postgres
</code></pre>
<p><strong>Delete PostgreSQL Deployments</strong></p>
<p>To delete the PostgreSQL resources, run the commands below:</p>
<pre><code class="lang-bash">kubectl delete -f postgres-deployment.yaml
kubectl delete -f postgres-login-secret.yaml
kubectl delete -f postgres-service.yaml
kubectl delete -f postgres-storage.yaml
</code></pre>
]]></content:encoded></item><item><title><![CDATA[How to Create an AWS Elastic Kubernetes Service (EKS) Cluster]]></title><description><![CDATA[We will be explaining the steps to follow to create an AWS EKS cluster.
Set Up The IDE or Command Line Interface
The first step to start from in creating an EKS cluster on AWS is to set up the interfaces and Integrated Development Environments ...]]></description><link>https://henryeleonu.com/how-to-create-an-aws-elastic-kubernetes-service-eks-cluster</link><guid isPermaLink="true">https://henryeleonu.com/how-to-create-an-aws-elastic-kubernetes-service-eks-cluster</guid><category><![CDATA[EKS cluster]]></category><category><![CDATA[Elastic Kubernetes Service]]></category><category><![CDATA[Kubernetes]]></category><category><![CDATA[EKS]]></category><category><![CDATA[AWS]]></category><dc:creator><![CDATA[Henry Eleonu]]></dc:creator><pubDate>Sun, 18 Dec 2022 22:30:54 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1671816086662/ba9ad733-47da-49be-9147-dc11ea49542d.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>We will be explaining the steps to follow to create an AWS EKS cluster.</p>
<p><strong>Set Up The IDE or Command Line Interface</strong></p>
<p>The first step in creating an EKS cluster on AWS is to set up the interfaces and Integrated Development Environments (IDE) that enable communication with the AWS APIs. You can set up the AWS Command Line Interface (AWS CLI) on your local machine or set up AWS Cloud9 (an IDE) on AWS. AWS CLI is preinstalled on Cloud9, unlike your local machine, which requires setting up AWS CLI. If you want to know how to set up Cloud9, I have another blog post on how to do this: <a target="_blank" href="https://henryeleonu.hashnode.dev/how-to-set-up-aws-cloud9-environment">click here</a>. Follow the steps <a target="_blank" href="https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html">here</a> to download and install AWS CLI, and the steps <a target="_blank" href="https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-quickstart.html">here</a> to configure it.</p>
<p><strong>Install or Update kubectl</strong></p>
<p>kubectl is a command line tool that you use to communicate with the Kubernetes API server. kubectl is available in many package managers, and installation via a package manager is often easier than a manual download and install process. Steps for installing kubectl can be found <a target="_blank" href="https://docs.aws.amazon.com/eks/latest/userguide/install-kubectl.html">here</a>.</p>
<p><strong>Install or Update eksctl</strong></p>
<p>eksctl is a simple CLI tool for creating and managing clusters on EKS. It is written in Go, uses CloudFormation, and was created by Weaveworks. eksctl provides the fastest and easiest way to create a new cluster with nodes for Amazon EKS. As a prerequisite, kubectl must be installed before eksctl. The short steps to install eksctl are <a target="_blank" href="https://docs.aws.amazon.com/eks/latest/userguide/eksctl.html">here</a>.</p>
<p><strong>Create an AWS EKS Role</strong></p>
<p>To enable your Kubernetes clusters managed by Amazon EKS to make calls to other AWS services and manage the resources on AWS, you must create an IAM role with the following policy: <a target="_blank" href="https://us-east-1.console.aws.amazon.com/iam/home?region=us-east-1&amp;skipRegion=true#/policies/arn:aws:iam::aws:policy/AmazonEKSClusterPolicy$jsonEditor">AmazonEKSClusterPolicy</a>. <a target="_blank" href="https://docs.aws.amazon.com/eks/latest/userguide/service_IAM_role.html">Click here</a> to find the steps to create this role.</p>
<p><strong>Creating a VPC for your Amazon EKS cluster</strong></p>
<p>You may decide to create a VPC beforehand or create it during cluster creation. Follow the steps <a target="_blank" href="https://docs.aws.amazon.com/eks/latest/userguide/creating-a-vpc.html">here</a> to create a VPC beforehand.</p>
<p><strong>Create an EKS cluster</strong></p>
<p>To create the Kubernetes cluster, we will first write a manifest in a YAML file with the file name eksctl-cluster.yaml. This is the manifest I used to create the Kubernetes cluster on AWS. The YAML files used to create the EKS cluster can be found in my GitHub repository:</p>
<p><a target="_blank" href="https://github.com/henryeleonu/spark-kubernetes/tree/jupyter-spark-kube">https://github.com/henryeleonu/spark-kubernetes/tree/jupyter-spark-kube</a></p>
<div class="gist-block embed-wrapper" data-gist-show-loading="false" data-id="e395c72d413a2b7c8001f1d2cb839b32"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a href="https://gist.github.com/henryeleonu/e395c72d413a2b7c8001f1d2cb839b32" class="embed-card">https://gist.github.com/henryeleonu/e395c72d413a2b7c8001f1d2cb839b32</a></div><p> </p>
<p>I ran the following command on my terminal to create the cluster:</p>
<pre><code class="lang-bash">eksctl create cluster -f eksctl-cluster.yaml
</code></pre>
<p>After the creation of the cluster, I ran the following commands:</p>
<p>To get all contexts:</p>
<pre><code class="lang-bash">kubectl config get-contexts
</code></pre>
<p>To get the current context:</p>
<pre><code class="lang-bash">kubectl config current-context
</code></pre>
<p>To set the context to the EKS cluster on AWS:</p>
<pre><code class="lang-bash">kubectl config use-context henry@spark-nodes.eu-west-2.eksctl.io
</code></pre>
<p><strong>Creating an IAM OIDC provider for your cluster</strong></p>
<p>A Kubernetes service account provides an identity for processes that run in a pod. If a pod needs access to AWS services, a service account is mapped to an AWS Identity and Access Management identity to grant that access. Your cluster has an OpenID Connect (OIDC) issuer URL associated with it. To use AWS Identity and Access Management (IAM) roles for service accounts, an IAM OIDC provider must exist for your cluster. Follow the steps <a target="_blank" href="https://docs.aws.amazon.com/eks/latest/userguide/enable-iam-roles-for-service-accounts.html">here</a> to create an IAM OIDC provider for your cluster.</p>
<p><strong>Deploy The Cluster Autoscaler</strong></p>
<p>Autoscaling is a function that enables automatic horizontal scaling of your resources, that is, scaling resources up or down to meet changing demands. This is a crucial Kubernetes function that would otherwise be difficult to achieve if performed manually.</p>
<p>Amazon EKS supports two autoscaling products: the Kubernetes Cluster Autoscaler and the Karpenter open-source autoscaling project. The Cluster Autoscaler uses AWS scaling groups, while Karpenter works directly with the Amazon EC2 fleet. We will be using the Cluster Autoscaler.</p>
<p>The Cluster Autoscaler requires the following tags on your Auto Scaling groups so that they can be auto-discovered (replace my-cluster with your cluster name). If you used eksctl to create your node groups, these tags are automatically applied.</p>
<table><thead><tr><th>Key</th><th>Value</th></tr></thead><tbody><tr><td><code>k8s.io/cluster-autoscaler/my-cluster</code></td><td><code>owned</code></td></tr><tr><td><code>k8s.io/cluster-autoscaler/enabled</code></td><td><code>true</code></td></tr></tbody></table>
<p>Create an IAM policy that grants the permissions the Cluster Autoscaler requires, and attach it to an IAM role. Follow the steps <a target="_blank" href="https://docs.aws.amazon.com/eks/latest/userguide/autoscaling.html">here</a> to create the role and policy.</p>
<p>To deploy the Cluster Autoscaler, follow the steps <a target="_blank" href="https://docs.aws.amazon.com/eks/latest/userguide/autoscaling.html">here</a>.</p>
<p>Download the Cluster Autoscaler YAML file by running the following command:</p>
<pre><code class="lang-bash">curl -o cluster-autoscaler-autodiscover.yaml https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml
</code></pre>
<p>Modify the YAML file and replace <code>&lt;YOUR CLUSTER NAME&gt;</code> with your cluster name. Also consider adjusting the cpu and memory values as determined by your environment.</p>
<p>Run the command on the terminal to deploy the Cluster Autoscaler:</p>
<pre><code class="lang-bash">kubectl apply -f cluster-autoscaler-autodiscover.yaml
</code></pre>
]]></content:encoded></item><item><title><![CDATA[How to Set Up AWS Cloud9 Environment]]></title><description><![CDATA[Cloud9 is a web-based IDE that runs on an AWS EC2 instance. This means we do not need to install any IDE on our local machine to be able to develop on AWS.
Create Cloud9 Environment
Navigate to the AWS console to create the environment. Click on Crea...]]></description><link>https://henryeleonu.com/how-to-set-up-aws-cloud9-environment</link><guid isPermaLink="true">https://henryeleonu.com/how-to-set-up-aws-cloud9-environment</guid><category><![CDATA[cloud9]]></category><category><![CDATA[Elastic Kubernetes Service]]></category><category><![CDATA[AWS]]></category><category><![CDATA[EKS]]></category><category><![CDATA[IDEs]]></category><dc:creator><![CDATA[Henry Eleonu]]></dc:creator><pubDate>Sun, 18 Dec 2022 13:11:54 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1671833582716/1871110b-bc8d-4c43-887b-5a301e6358f2.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Cloud9 is a web-based IDE that runs on an AWS EC2 instance. This means we do not need to install any IDE on our local machine to be able to develop on AWS.</p>
<p><strong>Create Cloud9 Environment</strong></p>
<p>Navigate to the AWS console to create the environment. Click on Create environment and fill in the details, which include selecting the instance type.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1671357667448/AUFTVZ0t4.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1671357905300/99Y08TjYP.png" alt class="image--center mx-auto" /></p>
<p>Choose the default VPC under VPC settings</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1671361317003/z4MVHghLN.png" alt class="image--center mx-auto" /></p>
<p>Then click on Create. After the Cloud9 environment is created, click on its name.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1671363172363/hq_lJzG68.png" alt class="image--center mx-auto" /></p>
<p>Click on Open in Cloud9</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1671363375107/rJjx7zSaU.png" alt class="image--center mx-auto" /></p>
<p>The Cloud9 IDE opens in another tab on your browser</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1671363699114/_ScjZ7LgH.png" alt class="image--center mx-auto" /></p>
<p><strong>Create An IAM Role for EC2</strong></p>
<p>We will then create an IAM role for the EC2 instance on which Cloud9 runs to assume. This role will allow the EC2 instance to make API calls to other AWS services. Navigate to Identity and Access Management (IAM); under Access Management, click Roles and then click Create role. Under Trusted entity type, select AWS service, because the EC2 instance is an AWS service that will assume the role being created. Under Use case, select EC2.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1671365583039/lOls38tAP.png" alt class="image--center mx-auto" /></p>
<p>Click next. To attach a policy to the role, search for and select AdministratorAccess from permissions policies and click next.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1671366094655/kqCXy9iJi.png" alt class="image--center mx-auto" /></p>
<p>Click next, name the role and create the role.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1671366429521/hyPUOHLUA.png" alt class="image--center mx-auto" /></p>
<p>Navigate to the EC2 console and click on Instances to see the EC2 instance on which Cloud9 runs.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1671366973710/PUbfmzcmD.png" alt class="image--center mx-auto" /></p>
<p>Select the EC2 instance, click the Actions dropdown menu, select Security and then select Modify IAM role. From the menu, select the role we created earlier and then click Update IAM role.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1671367512607/dEOpzRzsc.png" alt class="image--center mx-auto" /></p>
<p><strong>Remove The Temporary IAM Credentials for AWS Cloud9</strong></p>
<p>Go to the AWS Cloud9 IDE and choose Settings via the gear icon in the top right corner. Under Preferences, choose AWS Settings and then choose Credentials. Turn off AWS managed temporary credentials and close the tab.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1671368567162/SZ82s0sRH.png" alt class="image--center mx-auto" /></p>
]]></content:encoded></item><item><title><![CDATA[Code Generation Using ChatGPT: Optimizing Language Models for Dialogue]]></title><description><![CDATA[OpenAI has released a trained model called ChatGPT that can interact conversationally. I am particularly interested in how it generates code. In this post, I will be writing about my interactions with ChatGPT and some of my observations.
My Interacti...]]></description><link>https://henryeleonu.com/code-generation-using-chatgpt-optimizing-language-models-for-dialogue</link><guid isPermaLink="true">https://henryeleonu.com/code-generation-using-chatgpt-optimizing-language-models-for-dialogue</guid><category><![CDATA[Automated code generation]]></category><category><![CDATA[chatgpt]]></category><category><![CDATA[chatbot]]></category><category><![CDATA[Machine Learning]]></category><category><![CDATA[natural language processing]]></category><dc:creator><![CDATA[Henry Eleonu]]></dc:creator><pubDate>Fri, 16 Dec 2022 12:29:30 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1671833181440/94ac891e-2792-4316-b70e-9eda52c9642a.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>OpenAI has released a trained model called ChatGPT that can interact conversationally. I am particularly interested in how it generates code. In this post, I will be writing about my interactions with ChatGPT and some of my observations.</p>
<h1 id="heading-my-interactions-with-chatgpt">My Interactions with ChatGPT</h1>
<p><strong>I asked this question:</strong></p>
<p>How can I scrape a table from a website in python?</p>
<p><strong>I got this response from ChatGPT:</strong></p>
<p>To scrape a table from a website in Python, you will need to use a library called Beautiful Soup. Beautiful Soup is a popular Python library for extracting data from HTML and XML files. It provides simple, idiomatic ways of navigating, searching, and modifying the parse tree.</p>
<p>Here is an example of how you can use Beautiful Soup to scrape a table from a website:</p>
<div class="gist-block embed-wrapper" data-gist-show-loading="false" data-id="c398d95b3eabb64b7f28d891a4d44952"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a href="https://gist.github.com/henryeleonu/c398d95b3eabb64b7f28d891a4d44952" class="embed-card">https://gist.github.com/henryeleonu/c398d95b3eabb64b7f28d891a4d44952</a></div><p> </p>
<p>This code will make a request to the specified URL, parse the HTML content of the page, find the table element in the HTML, and then loop through the rows and cells of the table, extracting the text from each cell and printing it to the console.</p>
<p>You can also use other libraries such as Selenium or Pandas to scrape tables from websites. Selenium is a tool for controlling a web browser through code, which can be useful if the table you want to scrape is generated dynamically by JavaScript. Pandas has a function called <code>read_html()</code> which can be used to automatically parse tables from an HTML page and return them as a list of DataFrame objects.</p>
<p>I hope this helps! Let me know if you have any questions.</p>
<p><strong>I asked the same question a second time and got a different response; this time it gave the steps without writing any code.</strong></p>
<p><strong>This is the response I got:</strong></p>
<p>There are a few ways to scrape a table from a website in Python:</p>
<ol>
<li>Using the BeautifulSoup library:</li>
</ol>
<ul>
<li><p>First, install the BeautifulSoup library using pip install beautifulsoup4</p>
</li>
<li><p>Then, use the requests library to send a GET request to the website and retrieve the HTML content</p>
</li>
<li><p>Parse the HTML content using BeautifulSoup and find the table element</p>
</li>
<li><p>Extract the rows and columns from the table and store the data in a list or a dictionary</p>
</li>
</ul>
<ol>
<li>Using the Selenium library:</li>
</ol>
<ul>
<li><p>First, install the Selenium library using pip install selenium</p>
</li>
<li><p>Then, use Selenium to open the website and retrieve the HTML content</p>
</li>
<li><p>Parse the HTML content and find the table element</p>
</li>
<li><p>Extract the rows and columns from the table and store the data in a list or a dictionary</p>
</li>
</ul>
<ol>
<li>Using the pandas library:</li>
</ol>
<ul>
<li><p>First, install the pandas library using pip install pandas</p>
</li>
<li><p>Then, use the <a target="_blank" href="http://pandas.read">pandas.read</a>_html() function to extract the tables from the website</p>
</li>
<li><p>The function will return a list of DataFrame objects, which can be accessed and manipulated as needed.</p>
</li>
</ul>
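<p>As a minimal sketch of the pandas approach described above (my illustration, not ChatGPT's output; it assumes pandas and its lxml dependency are installed):</p>
<pre><code class="lang-python">import pandas as pd

# read_html returns a list of DataFrames, one per table found on the page
tables = pd.read_html("https://en.wikipedia.org/wiki/List_of_airlines_of_the_United_Kingdom")
print(len(tables))
print(tables[0].head())
</code></pre>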
<p><strong>I then asked this more specific question:</strong></p>
<p>How can I write a python code to scrape a table from the website, <a target="_blank" href="https://en.wikipedia.org/wiki/List_of_airlines_of_the_United_Kingdom">https://en.wikipedia.org/wiki/List_of_airlines_of_the_United_Kingdom</a>?</p>
<p><strong>This is the response from ChatGPT:</strong></p>
<p>Here is a sample code that demonstrates how you can scrape the table from the website using Python's Beautiful Soup library:</p>
<div class="gist-block embed-wrapper" data-gist-show-loading="false" data-id="7d3f86733328e27cc6a56123d97459b4"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a href="https://gist.github.com/henryeleonu/7d3f86733328e27cc6a56123d97459b4" class="embed-card">https://gist.github.com/henryeleonu/7d3f86733328e27cc6a56123d97459b4</a></div><p> </p>
<p>This code will print out the contents of the first three columns of the table, which contain the name, IATA code, and ICAO code of the airlines. You can modify the code to extract other information from the table, such as the type of airline or the country of origin.</p>
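<p>For reference, a sketch along the same lines (my own reconstruction, not ChatGPT's exact output; the "wikitable" class selector is an assumption about the page's structure):</p>
<pre><code class="lang-python">import requests
from bs4 import BeautifulSoup

url = "https://en.wikipedia.org/wiki/List_of_airlines_of_the_United_Kingdom"
soup = BeautifulSoup(requests.get(url).content, "html.parser")

# Wikipedia list articles typically mark their tables with the "wikitable" class
table = soup.find("table", {"class": "wikitable"})
for row in table.find_all("tr"):
    cells = row.find_all(["th", "td"])
    print([cell.get_text(strip=True) for cell in cells[:3]])
</code></pre>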
<h1 id="heading-my-observations">My Observations</h1>
<p>From my interactions, I can say that ChatGPT is impressive, especially in the way it makes the interaction feel natural and in how detailed its responses are. But in terms of code generation, I would say that the code is more of a template that needs to be customized to meet your needs. The generated code is not guaranteed to run without errors; the code above all produced errors when I ran it. Generally speaking, the code generation aspect might be beneficial at the initial stage of a project, especially for a beginner programmer wanting to know the steps to take to solve a problem.</p>
]]></content:encoded></item><item><title><![CDATA[How to Install Multiple Versions of Python Using Virtualenv]]></title><description><![CDATA[Click this line to watch: how to install multiple versions of python using virtualenv - YouTube
There are situations when we need to have multiple versions of python, for instance, when we need to install dependencies that are not compatible with the...]]></description><link>https://henryeleonu.com/how-to-install-multiple-versions-of-python-using-virtualenv</link><guid isPermaLink="true">https://henryeleonu.com/how-to-install-multiple-versions-of-python-using-virtualenv</guid><category><![CDATA[Python]]></category><category><![CDATA[virtual environment]]></category><category><![CDATA[virtualenv]]></category><dc:creator><![CDATA[Henry Eleonu]]></dc:creator><pubDate>Thu, 15 Dec 2022 09:11:12 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1671833648682/fb2891d6-569a-4765-a840-2146cd21092c.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>Click this line to watch:</strong> <a target="_blank" href="https://www.youtube.com/watch?v=sk-ikK90AyQ">how to install multiple versions of python using virtualenv - YouTube</a></p>
<p>There are situations when we need to have multiple versions of Python, for instance, when we need to install dependencies that are not compatible with the Python version we are running. I have run into a situation where a dependency I needed to install was only compatible with an earlier version of Python. To solve this problem, I had to run the earlier version of Python in a virtual environment. I used the virtualenv utility, which enabled me to run multiple versions of Python. Below are the steps I followed to get things running.</p>
<ol>
<li><p>Install virtualenv:<br /><code>pip install virtualenv</code></p>
</li>
<li><p>Download the desired version of Python.<br />I already had my main Python installed in this path: <code>C:\Python\Python311\</code><br />I installed the earlier version of Python here: <code>C:\Python\Python310\</code></p>
</li>
<li><p>Create a project directory.<br />I created a project directory here: <code>C:\Python_Workspace\my_project\</code></p>
</li>
<li><p>Create a virtual environment in your project directory.<br />Open the terminal and change directory to the project directory:<br /><code>cd C:\Python_Workspace\my_project\</code><br />Create your virtual environment:<br /><code>python -m virtualenv -p C:\Python\Python310\python.exe my-virtual-env</code></p>
</li>
<li><p>Activate the virtual environment (a quick verification sketch follows after this list):<br /><code>.\my-virtual-env\Scripts\activate</code></p>
</li>
<li><p>To deactivate the virtual environment, run:<br /><code>deactivate</code></p>
</li>
<li><p>We can now go ahead and install all the dependencies we need.</p>
</li>
<li><p>To create a requirements.txt file of all dependencies in the virtual environment, run:<br /><code>pip freeze &gt; requirements.txt</code></p>
</li>
</ol>
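<p>A quick sketch to confirm, from inside Python, that the activated environment uses the intended interpreter:</p>
<pre><code class="lang-python">import sys

print(sys.executable)  # should point inside my-virtual-env when activated
print(sys.version)     # should report the 3.10.x interpreter in this example
</code></pre>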
]]></content:encoded></item><item><title><![CDATA[Pros and Cons of Stock APIs]]></title><description><![CDATA[Real-time and historic stock or other financial datasets are essential for developing financial applications. Developers and Data Engineers typically want to extract data from Application Programming Interfaces (API) from within their code. Ther...]]></description><link>https://henryeleonu.com/pros-and-cons-of-stock-apis</link><guid isPermaLink="true">https://henryeleonu.com/pros-and-cons-of-stock-apis</guid><category><![CDATA[Stocks API]]></category><category><![CDATA[Python Stock Library]]></category><category><![CDATA[financial api]]></category><dc:creator><![CDATA[Henry Eleonu]]></dc:creator><pubDate>Thu, 15 Dec 2022 08:46:58 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1671833049761/97081e64-f4ef-407b-b354-e9e478e23266.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Real-time and historic stock or other financial datasets are essential for developing financial applications. Developers and Data Engineers typically want to extract data from Application Programming Interfaces (API) from within their code. There are many stock APIs out there, so it could be a bit difficult to decide which API to use.</p>
<p>We will be looking at some of the popular ones out there and comparing them based on cost, how well-supported the API is, ease of use, limitations on the size of the dataset, and support for real-time data.</p>
<p>In terms of cost, some are free, others give a free trial period, while some are paid and give no free trial.</p>
<p>We can measure how well supported an API is by its last update: for example, in the case of an API's Python library, the date of the latest release on PyPI. We can also measure this by the frequency of updates to its GitHub repository.</p>
<h1 id="heading-yahoo-finance"><strong>Yahoo Finance</strong></h1>
<p>Yahoo Finance publishes free stock data from the major stock markets around the world.</p>
<p><strong>Pros:</strong></p>
<ul>
<li><p>It is free.</p>
</li>
<li><p>We can get a huge amount of data from it.</p>
</li>
<li><p>It has two well-supported Python libraries, the pandas-datareader library as well as the yfinance library (a usage sketch follows after this list). The libraries simplify the extraction and use of the data by reducing the amount of code. The latest release of yfinance was on Nov 16, 2022.</p>
</li>
</ul>
<p><strong>Cons:</strong></p>
<ul>
<li><p>The API is not an official Yahoo Finance API.</p>
</li>
<li><p>Only basic datasets can be retrieved.</p>
</li>
<li><p>Because it is unofficial, the rate of API calls could be limited.</p>
</li>
</ul>
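<p>A minimal yfinance sketch (assuming the library is installed with pip install yfinance; the ticker is an arbitrary example):</p>
<pre><code class="lang-python">import yfinance as yf

# One year of daily bars for an example ticker
data = yf.download("AAPL", period="1y", interval="1d")
print(data.head())
</code></pre>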
<h1 id="heading-alpha-vantage"><strong>Alpha Vantage</strong></h1>
<p>Alpha Vantage provides enterprise-grade real-time and historical financial market data through a set of powerful and developer-friendly data APIs and spreadsheets, delivered through REST stock APIs, Excel, and Google Sheets.</p>
<p>Pros:</p>
<ul>
<li><p>It has a Python library, though it is not popular.</p>
</li>
<li><p>It is suitable for commercial applications.</p>
</li>
<li><p>Supports real-time and historical financial market data.</p>
</li>
</ul>
<p>Cons:</p>
<ul>
<li><p>It is not free.</p>
</li>
<li><p>Its unofficial Python library is not well supported; the latest release was on July 4, 2021.</p>
</li>
</ul>
<h1 id="heading-bloomberg-api"><strong>Bloomberg API</strong></h1>
<p>Pros:</p>
<ul>
<li><p>It has an official Python library.</p>
</li>
<li><p>Suitable for commercial applications.</p>
</li>
<li><p>Supports real-time and historical financial market data.</p>
</li>
</ul>
<p>Cons:</p>
<ul>
<li><p>Not free; plans start at $2,000 per month.</p>
</li>
<li><p>Its Python library is not well supported; the latest version was released Jan 1, 2019.</p>
</li>
</ul>
<h1 id="heading-stock-news"><strong>Stock News</strong></h1>
<p>Stock News is focused on stock news and summary reports.</p>
<p>Pros:</p>
<ul>
<li><p>Good at stock news.</p>
</li>
</ul>
<p>Cons:</p>
<ul>
<li><p>No custom Python library; you make use of the requests library to get data in JSON format.</p>
</li>
<li><p>Gives a 14-day free trial, with plans starting at $19.99 per month.</p>
</li>
<li><p>Requires details like the card number and the name on the card to make payment.</p>
</li>
</ul>
<h1 id="heading-iex-cloud"><strong>IEX Cloud</strong></h1>
<p>Pros:</p>
<ul>
<li><p>Has a free trial.</p>
</li>
<li><p>Supports real-time and historical financial market data.</p>
</li>
</ul>
<p>Cons:</p>
<ul>
<li><p>No custom Python library; you make use of the requests library to get data in JSON format.</p>
</li>
<li><p>Not free; plans start at $49.</p>
</li>
</ul>
<h1 id="heading-morning-star"><strong>Morning star</strong></h1>
<p>Pros:</p>
<ul>
<li><p>Has a Python library.</p>
</li>
</ul>
<p>Cons:</p>
<ul>
<li><p>Not free.</p>
</li>
<li><p>Its Python library is not well supported; the latest release was on Jun 16, 2020.</p>
</li>
</ul>
]]></content:encoded></item></channel></rss>